Levenberg-Marquardt Method

See Also: Unconstrained Optimization, Nonlinear Least-Squares Problems

The Levenberg-Marquardt algorithm can be thought of as a trust-region modification of the Gauss-Newton algorithm. Levenberg-Marquardt steps are obtained by solving subproblems of the form
\[ \min \left\{ \frac{1}{2} \|f^\prime (x_k)s + f(x_k) \|_2^2 : \| D_k s\|_2 \leq \Delta_k \right\}, \quad\quad\quad\quad(1.2)\] for some \(\Delta_k > 0\) and scaling matrix \(D_k\). The trust-region radius \(\Delta_k\) is adjusted between iterations according to the agreement between the predicted and actual reduction in the objective function \(r(x) = \frac{1}{2}\|f(x)\|_2^2\). For a step to be accepted, the ratio \(\rho_k = \frac{r(x_k) - r(x_k + s_k)}{r(x_k) - \frac{1}{2} \|f^\prime (x_k)s_k + f(x_k) \|_2^2}\) of actual to predicted reduction must exceed a small positive number (typically \(10^{-4}\)). If this test fails, the trust region is decreased and the step is recalculated. When the ratio is close to one, the trust region for the next iteration is expanded.
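The accept/shrink/expand logic above can be sketched as follows. The acceptance tolerance \(10^{-4}\) comes from the text; the shrink/expand thresholds (0.25, 0.75) and factors are common illustrative choices, not prescribed here:

```python
def trust_region_update(rho, delta, step_norm,
                        eta=1e-4, shrink=0.25, expand=2.0):
    """One trust-region radius update (a sketch).

    rho       -- ratio of actual to predicted reduction
    delta     -- current trust-region radius Delta_k
    step_norm -- ||D_k s_k||_2 for the trial step
    Returns (accepted, new_delta).
    """
    accepted = rho > eta                  # accept only if rho exceeds the tolerance
    if rho < 0.25:                        # poor agreement: shrink the region
        delta = shrink * step_norm
    elif rho > 0.75 and abs(step_norm - delta) < 1e-12 * delta:
        delta = expand * delta            # good agreement on the boundary: expand
    return accepted, delta
```

A step that fails the test leaves the iterate unchanged; only the radius is reduced before the step is recomputed.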

Levenberg-Marquardt codes usually determine the step by noting that the solution of (1.2) also satisfies the equation
\[(f^\prime (x_k) ^ T f^\prime (x_k) + \lambda_k D_k^T D_k) s_k = -f^\prime (x_k)^T f(x_k), \quad\quad\quad(1.3)\] for some \(\lambda_k \geq 0\). The Lagrange multiplier \(\lambda_k\) is zero if the minimum-norm Gauss-Newton step is smaller than \(\Delta_k\); otherwise \(\lambda_k\) is chosen so that \(\|D_k s_k \|_2 = \Delta_k\).
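As a sketch of this step computation, the code below solves (1.3) directly and locates \(\lambda_k\) by plain bisection on \(\|D_k s_k(\lambda)\|_2 = \Delta_k\). Production codes such as MINPACK use a more refined safeguarded iteration on \(\lambda\) and avoid forming the normal equations; this is illustrative only:

```python
import numpy as np

def lm_step(J, f, D, delta, lam_max=1e8, iters=60):
    """Levenberg-Marquardt trial step: solve
    (J^T J + lam D^T D) s = -J^T f with lam >= 0 chosen so ||D s||_2 <= delta.
    A sketch using bisection; real codes use a safeguarded Newton iteration.
    """
    def solve(lam):
        A = J.T @ J + lam * (D.T @ D)
        return np.linalg.solve(A, -J.T @ f)

    s = solve(0.0)                        # Gauss-Newton step (lam = 0)
    if np.linalg.norm(D @ s) <= delta:    # step already inside the region
        return s, 0.0
    lo, hi = 0.0, lam_max
    for _ in range(iters):                # bisection: ||D s(lam)|| is
        lam = 0.5 * (lo + hi)             # decreasing in lam
        s = solve(lam)
        if np.linalg.norm(D @ s) > delta:
            lo = lam
        else:
            hi = lam
    return s, lam
```

When the Gauss-Newton step fits inside the trust region, the multiplier is zero and the method reduces to Gauss-Newton, matching the statement above.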

Equations (1.3) are simply the normal equations for the least squares problem
\[\min \left\{ {\left\| \left[ \begin{array}{c} f^{\prime} (x_k) \\ \lambda_k^\frac{1}{2} D_k \end{array} \right] s  +  \left [ \begin{array}{c} f (x_k) \\ 0 \end{array} \right] \right\|_2^2} : s \in  \mathbb{R}^n \right\}. \quad\quad\quad\quad(1.4)\]

Efficient factorization of the coefficient matrix in (1.4) can be performed by a combination of Householder and Givens transformations.
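The equivalence between (1.3) and (1.4) can be checked numerically. The sketch below uses a generic dense least-squares solve rather than the Householder/Givens factorization mentioned above, with random data standing in for \(f^\prime(x_k)\) and \(f(x_k)\):

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.standard_normal((6, 3))   # stand-in for the Jacobian f'(x_k)
f = rng.standard_normal(6)        # stand-in for the residual f(x_k)
D = np.eye(3)                     # scaling matrix D_k
lam = 0.5                         # multiplier lambda_k (assumed given)

# Augmented least-squares system (1.4): stack J over sqrt(lam) D, f over 0
A = np.vstack([J, np.sqrt(lam) * D])
b = np.concatenate([f, np.zeros(3)])
s_ls, *_ = np.linalg.lstsq(A, -b, rcond=None)

# Normal equations (1.3) yield the same step
s_ne = np.linalg.solve(J.T @ J + lam * D.T @ D, -J.T @ f)
assert np.allclose(s_ls, s_ne)
```

Solving the augmented system by orthogonal factorization is preferred in practice because it avoids the squared condition number incurred by forming \(J^T J\).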

The Levenberg-Marquardt algorithm has proved to be an effective and popular way to solve nonlinear least squares problems. MINPACK-1 contains Levenberg-Marquardt codes in which the Jacobian matrix may be either supplied by the user or calculated by using finite differences. IMSL, MATLAB, ODRPACK, and PROC NLP also contain Levenberg-Marquardt routines.

The algorithms in ODRPACK solve unconstrained nonlinear least squares problems and orthogonal distance regression problems, including those with implicit models and multiresponse data. The simplest case of an orthogonal distance regression problem occurs when we are given observations \(y_i\) at times \(t_i\) and a model \(\phi(t_i, \cdot)\) for each observation. In this case, we aim to find parameters \(x \in \mathbb{R}^n\) and corrections \(\delta_i \in \mathbb{R}\) that solve the minimization problem
\[\min \left\{ \sum_{i=1}^m \left( [ \phi ( t_i + \delta_i ;x) - y_i]^2 + \delta_i^2 \right) : x \in \mathbb{R}^n, \ \delta_i \in \mathbb{R} \right\}.\]

The ODRPACK algorithms use a trust-region Levenberg-Marquardt method that exploits the structure of this problem, so that there is little difference between the cost per iteration for this problem and the standard least squares problem in which the \(\delta_i\) are fixed at zero.
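To make the objective concrete, it can be written as an ordinary least-squares residual over the combined unknowns \((x, \delta)\): the corrections \(\delta_i\) simply enter as extra residual components. The linear model \(\phi\) and the data below are hypothetical illustrations, not part of ODRPACK:

```python
import numpy as np

def odr_residuals(x, deltas, t, y, phi):
    """Residual vector whose squared 2-norm is the orthogonal distance
    regression objective: data residuals phi(t_i + delta_i; x) - y_i
    stacked over the corrections delta_i.  A sketch; ODRPACK exploits
    the block structure instead of forming the full stacked system.
    """
    model = phi(t + deltas, x) - y          # data residuals
    return np.concatenate([model, deltas])  # corrections as extra residuals

# Hypothetical linear model phi(t; x) = x0 + x1 * t and toy data
phi = lambda t, x: x[0] + x[1] * t
t = np.array([0.0, 1.0, 2.0])
y = np.array([0.1, 1.1, 1.9])

r = odr_residuals(np.array([0.0, 1.0]), np.zeros(3), t, y, phi)
cost = r @ r          # with delta = 0 this is the ordinary least-squares cost
```

Fixing the \(\delta_i\) at zero recovers the standard least squares problem, which is why a structure-exploiting method can handle both cases at essentially the same cost per iteration.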