Gauss-Newton Method

See Also: Unconstrained Optimization, Nonlinear Least-Squares Problems

An algorithm that is particularly suited to the small-residual case is the Gauss-Newton algorithm, in which the Hessian of \(r\) is approximated by its first term. In a line-search version of the Gauss-Newton algorithm, the search direction \(d_k\) from the current iterate \(x_k\) satisfies the linear system
\[(f^\prime (x_k)^T f^\prime (x_k))\, d_k = -f^\prime (x_k)^T f(x_k) \tag{1.1}\]
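To see where (1.1) comes from, write the objective in the standard nonlinear least-squares form \(r(x) = \tfrac{1}{2}\|f(x)\|_2^2\) (a convention assumed here; it is consistent with the descent-direction computation below). Then
\[\nabla r(x) = f^\prime (x)^T f(x), \qquad \nabla^2 r(x) = f^\prime (x)^T f^\prime (x) + \sum_i f_i(x)\, \nabla^2 f_i(x),\]
and (1.1) is the Newton system \(\nabla^2 r(x_k) d_k = -\nabla r(x_k)\) with the second term of the Hessian dropped. In the small-residual case the components \(f_i(x_k)\) are small near the solution, so the discarded term is small, which is why the approximation works well there.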

Any solution of (1.1) is a descent direction for \(r\), since \(d_k^T \nabla r(x_k) = -\| f^\prime (x_k)d_k \|_2^2 \le 0\), unless \(\nabla r (x_k) = 0\).
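As a concrete illustration, and not code drawn from any of the packages named below, the direction can be computed by solving (1.1) in its equivalent linear least-squares form \(\min_d \|f^\prime (x_k) d + f(x_k)\|_2\), which avoids forming the normal equations explicitly; the residual and Jacobian here are hypothetical stand-ins.

```python
import numpy as np

def gauss_newton_direction(J, f):
    """Solve the Gauss-Newton system (1.1) for the search direction d_k.

    The linear least-squares problem min_d ||J d + f||_2 has the same
    solutions as (J^T J) d = -J^T f but is numerically better behaved,
    since it avoids squaring the condition number of J.
    """
    d, *_ = np.linalg.lstsq(J, -f, rcond=None)
    return d

# Hypothetical test problem: residuals f_i(x) = x[0]*exp(x[1]*t_i) - y_i.
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.0 * t)

def residual(x):
    return x[0] * np.exp(x[1] * t) - y

def jacobian(x):
    # Columns are the partials of each residual w.r.t. x[0] and x[1].
    return np.column_stack([np.exp(x[1] * t),
                            x[0] * t * np.exp(x[1] * t)])

x_k = np.array([1.0, 0.0])
d_k = gauss_newton_direction(jacobian(x_k), residual(x_k))
```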

Gauss-Newton codes perform a line search along the direction \(d_k\) to obtain the new iterate. The suitability of a candidate step length can be determined, as in the case of unconstrained minimization, by enforcing the sufficient decrease condition and the curvature condition.
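A minimal backtracking sketch, continuing the Python example above, enforces only the sufficient decrease (Armijo) condition; practical codes typically also check a curvature condition (as in the Wolfe conditions) rather than relying on plain backtracking. The constants c1, shrink, and max_iter are illustrative choices, not values from any particular code.

```python
def backtracking_line_search(r, grad_r, x, d, alpha=1.0, c1=1e-4,
                             shrink=0.5, max_iter=50):
    """Shrink alpha until the sufficient decrease (Armijo) condition holds:
        r(x + alpha*d) <= r(x) + c1 * alpha * grad_r(x) @ d.
    """
    r0 = r(x)
    slope = grad_r(x) @ d  # negative whenever d is a descent direction
    for _ in range(max_iter):
        if r(x + alpha * d) <= r0 + c1 * alpha * slope:
            break
        alpha *= shrink
    return x + alpha * d

# Continuing the example above, with r(x) = 0.5 * ||f(x)||_2^2.
r = lambda x: 0.5 * residual(x) @ residual(x)
grad_r = lambda x: jacobian(x).T @ residual(x)
x_next = backtracking_line_search(r, grad_r, x_k, d_k)
```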

When the Jacobian of \(f\) has rank less than \(n\), the system (1.1) has a multiplicity of solutions. In these circumstances, Gauss-Newton algorithms choose a particular solution. For example, the choice
\[d_k = -f^\prime (x_k)^+ f(x_k),\]
where, in general, the matrix \(A^+\) denotes the Moore-Penrose pseudo-inverse of the matrix \(A\), corresponds to the solution of least 2-norm. Since this choice is expensive to compute and requires the determination of the rank of the Jacobian, codes tend to obtain a direction that satisfies (1.1) by using less expensive techniques.
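For illustration, in a small dense setting the least 2-norm solution can be computed through an SVD-based pseudo-inverse; this is exactly the rank-revealing, relatively expensive computation referred to above. The matrix below is a made-up rank-deficient example.

```python
# Made-up rank-deficient Jacobian: the second column is twice the first.
J = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
f = np.array([1.0, 0.0, -1.0])

# Moore-Penrose pseudo-inverse via the SVD; the rank is determined numerically.
d_min_norm = -np.linalg.pinv(J) @ f

# lstsq likewise returns the minimum 2-norm solution when J is rank deficient.
d_lstsq, _, rank, _ = np.linalg.lstsq(J, -f, rcond=None)
assert rank == 1
```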

The Gauss-Newton algorithm is used, usually with enhancements, in much of the software for nonlinear least squares. It is a component of the algorithms used by DFNLP, MATLAB, NAG Library, OPTIMA, and TENSOLVE. In many of these codes, the Gauss-Newton model is augmented by a matrix \(S_k\); as a consequence the search direction satisfies
\[(f^\prime (x_k)^T f^\prime (x_k) + S_k)\, d_k = -f^\prime (x_k)^T f(x_k)\]

The purpose of \(S_k \) is to guarantee fast local convergence while retaining the global convergence properties of the Gauss-Newton method.
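A classical instance of this augmentation is the Levenberg-Marquardt choice \(S_k = \lambda_k I\) with \(\lambda_k \ge 0\); the sketch below shows that instance only and is not the specific strategy of any code named here.

```python
def augmented_direction(J, f, lam):
    """Solve (J^T J + lam*I) d = -J^T f, the Levenberg-Marquardt instance
    of the augmented Gauss-Newton system (lam >= 0 assumed).

    lam = 0 recovers the Gauss-Newton direction; as lam grows, d turns
    toward the steepest-descent direction -J^T f, which is how the
    augmentation can retain global convergence.
    """
    n = J.shape[1]
    A = J.T @ J + lam * np.eye(n)
    return np.linalg.solve(A, -J.T @ f)
```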

The NAG routines use a Gauss-Newton search direction whenever a sufficiently large decrease in \(r\) is obtained at the previous iteration. Otherwise, second-derivative information is obtained from user-supplied function evaluation routines, quasi-Newton approximations, or difference approximations. Using this information, the software attempts to find a more accurate approximation to the Newton direction than the Gauss-Newton direction is able to provide.

The TENSOLVE software augments the Gauss-Newton model with a low-rank tensor approximation to the second-order term. It has been observed to converge faster than standard Gauss-Newton on many problems, particularly when the Jacobian matrix is rank deficient at the solution.