Trust-Region and Line Search Methods for Nonlinear Equations

See Also: Unconstrained Optimization, Nonlinear Optimization

Trust-Region Methods

A trust-region version of Newton’s method for nonlinear equations takes the view that the linear model of the function is valid only when the step is not too large, so it places a restriction on the size of the step. That is, if we let \(J(x_k)\) denote the Jacobian matrix and \(p_k\) be the step, then a trust-region method limits \(p_k\) to ensure that the linear model
\[f(x_k) + J(x_k) p_k\] is a valid approximation to \(f(x_k + p_k)\).
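This restriction reflects the Taylor-series origin of the model: for smooth \(f\),
\[f(x_k + p_k) = f(x_k) + J(x_k) p_k + O(\|p_k\|^2),\] so the error in the linear model grows quadratically with the length of the step.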

In a general trust-region method, the Jacobian matrix is replaced by an approximation \(B_k\), and the step is obtained as an approximate solution of the subproblem
\[\min \{ \|f(x_k) + B_k s\| : \|D_k s\|_2 \leq \Delta_k \}, \quad\quad(1)\] where \(D_k\) is a scaling matrix and \(\Delta_k\) is the trust-region radius. The step is accepted if the ratio
\[\rho_k = \frac{\|f(x_k)\| - \|f(x_k + s_k)\|}{\|f(x_k)\| - \|f(x_k) + B_k s_k\|}\] of the actual-to-predicted decrease in \(\|f(x)\|\) is greater than some constant \(\sigma_0\), e.g., 0.0001. If the step is not accepted, the trust-region radius is decreased and the ratio is recomputed. The trust-region radius may also be updated between iterations according to how close the ratio \(\rho_k\) is to its ideal value of 1.
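The acceptance test and radius update can be sketched in Python. This is a minimal illustration, not a production implementation: it takes the scaling matrix \(D_k = I\), the helper solve_subproblem is a hypothetical stand-in for any approximate solver of (1) (a dogleg or Levenberg-Marquardt routine, say), and the 0.25/0.75 cutoffs for the radius update are conventional choices rather than part of the text.

    import numpy as np

    def trust_region_update(f, x, B, delta, solve_subproblem,
                            sigma0=1e-4, shrink=0.5, grow=2.0):
        """One trust-region iteration for reducing ||f(x)||."""
        fx = f(x)
        for _ in range(30):                   # cap the number of rejected steps
            s = solve_subproblem(fx, B, delta)
            actual = np.linalg.norm(fx) - np.linalg.norm(f(x + s))
            predicted = np.linalg.norm(fx) - np.linalg.norm(fx + B @ s)
            rho = actual / predicted          # actual-to-predicted decrease
            if rho > sigma0:                  # accept the step
                break
            delta *= shrink                   # model not trusted: shrink radius
        # between iterations, adjust the radius by how close rho is to 1
        if rho > 0.75:
            delta *= grow
        elif rho < 0.25:
            delta *= shrink
        return x + s, delta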

Line-Search Methods

Given an approximation \(B_k\) to the Jacobian matrix, a line-search method obtains a search direction \(d_k\) by solving the system of linear equations
\[B_k d_k = -f(x_k)\] and then defining the next iterate as \(x_{k+1} = x_k + \alpha_k d_k\), where the line-search parameter \(\alpha_k > 0\) is chosen by the line-search procedure so that \(\|f(x_{k+1})\| < \|f(x_k)\|\).
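One such iteration can be sketched in Python, using simple backtracking (repeated halving of \(\alpha\)) as the line-search procedure; the names here are illustrative.

    import numpy as np

    def line_search_step(f, x, B):
        """One line-search iteration with backtracking."""
        fx = f(x)
        d = np.linalg.solve(B, -fx)       # search direction: B_k d_k = -f(x_k)
        alpha = 1.0
        while np.linalg.norm(f(x + alpha * d)) >= np.linalg.norm(fx):
            alpha *= 0.5                  # backtrack until ||f|| decreases
            if alpha < 1e-12:             # d may fail to be a descent direction
                raise RuntimeError("no decrease found along d")
        return x + alpha * d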

When the “approximate” Jacobian is “exact”, as in Newton’s method, \(d_k\) is a descent direction for \(\|f(x)\|_2\), so there is certain to be an \(\alpha_k > 0\) such that \(\|f(x_{k+1})\|_2 < \|f(x_k)\|_2\). This descent property does not necessarily hold for other choices of the approximate Jacobian, so line-search methods are used only when \(B_k\) is either the exact Jacobian or a close approximation to it.
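To see why \(d_k\) is a descent direction when \(B_k = J(x_k)\), note that the gradient of \(\tfrac{1}{2}\|f(x)\|_2^2\) is \(J(x)^T f(x)\), so the directional derivative along \(d_k\) is
\[f(x_k)^T J(x_k) d_k = -\|f(x_k)\|_2^2 < 0\] whenever \(f(x_k) \neq 0\).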

In an ideal line-search Newton method, we would compute the search direction by solving
\[f^\prime(x_k) d_k = -f(x_k)\] and choose the line-search parameter \(\alpha_k\) to minimize the scalar function \(\phi(\alpha) = \| f(x_k + \alpha d_k) \|_2^2\).
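In code, the ideal method would look something like the following sketch, where SciPy's minimize_scalar stands in for the exact one-dimensional minimization and the search interval \([0, 2]\) is an arbitrary choice for illustration.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def ideal_newton_step(f, J, x):
        """Exact Newton direction plus (numerically) exact line search."""
        d = np.linalg.solve(J(x), -f(x))                 # f'(x_k) d_k = -f(x_k)
        phi = lambda a: np.linalg.norm(f(x + a * d))**2  # phi(alpha)
        res = minimize_scalar(phi, bounds=(0.0, 2.0), method="bounded")
        return x + res.x * d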

However, since it is usually too time-consuming to find the \(\alpha\) that exactly minimizes \(\phi\), we settle instead for an approximate solution that satisfies the conditions
\[\phi(\alpha_k) \leq \phi(0) + \mu \alpha_k \phi^\prime(0), \quad\mbox{and} \quad |\phi^\prime(\alpha_k)| \leq \eta |\phi^\prime(0)| \] where \( \mu\) and \(\eta\) are two constants. Typical values are \(\mu = 0.001\) and \(\eta=0.9\).

The first of these conditions ensures that \(\|f(x)\|_2^2\) decreases by a significant amount, while the second ensures that we move far enough along the search direction by insisting on a significant reduction in the magnitude of the derivative \(\phi^\prime\).
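A checker for the two conditions can be sketched as follows; \(\phi^\prime\) is approximated here by central differences, though in practice one would use \(\phi^\prime(\alpha) = 2 f(x_k + \alpha d_k)^T J(x_k + \alpha d_k) d_k\) when the Jacobian is available. All names are illustrative.

    import numpy as np

    def satisfies_conditions(f, x, d, alpha, mu=0.001, eta=0.9, h=1e-6):
        """Check the sufficient-decrease and curvature conditions
        for phi(a) = ||f(x + a*d)||_2^2."""
        phi = lambda a: np.linalg.norm(f(x + a * d))**2
        dphi = lambda a: (phi(a + h) - phi(a - h)) / (2.0 * h)
        decrease = phi(alpha) <= phi(0.0) + mu * alpha * dphi(0.0)
        curvature = abs(dphi(alpha)) <= eta * abs(dphi(0.0))
        return decrease and curvature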