See Also: Unconstrained Optimization, Nonlinear Optimization

## Trust-Region Methods

A *trust-region version* of Newton’s method for nonlinear equations takes the view that the linear model of the function is valid only when the step is not too large, so it places a restriction on the size of the step. That is, if we let \(J(x_k)\) denote the Jacobian matrix and \(p_k\) be the step, then a trust-region method limits \(p_k\) to ensure that the linear model

\[f(x_k) + J(x_k) p_k\]
is a valid approximation to \(f(x_k + p_k)\).

In a general trust-region method, the Jacobian matrix is replaced by an approximation \(B_k\), and the step is obtained as an approximate solution of the subproblem

\[\min \{ \|f(x_k) + B_k s\| : \|D_k s\|_2 \leq \Delta_k \}, \quad\quad(1)\]
where \(D_k\) is a scaling matrix and \(\Delta_k\) is the trust-region radius. The step is accepted if the ratio

\[\rho_k = \frac{\|f(x_k)\| - \|f(x_k + s_k)\|}{\|f(x_k)\| - \|f(x_k) + B_k s_k \|} \,\]
of the actual-to-predicted decrease in \(\|f(x) \|\) is greater than some constant \(\sigma_0\), e.g., 0.0001. If the step is not accepted, the trust-region radius is decreased and the ratio is recomputed. The trust-region radius may also be updated between iterations according to how close the ratio \(\rho_k\) is to its ideal value of 1.
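One iteration of this scheme can be sketched in Python. This is a minimal illustration, not a production solver: it assumes \(D_k = I\), uses the exact Jacobian for \(B_k\), and "solves" subproblem (1) crudely by taking the Gauss-Newton step and scaling it back onto the trust-region boundary when it is too long; the threshold 0.75 for enlarging the radius is one common choice, not prescribed by the text.

```python
import numpy as np

def trust_region_step(f, J, x, delta, sigma0=1e-4):
    """One trust-region iteration for f(x) = 0 (illustrative sketch, D_k = I).

    The subproblem min ||f(x) + J s|| s.t. ||s|| <= delta is solved
    approximately: take the Gauss-Newton step and, if it exceeds the
    radius, scale it back onto the trust-region boundary.
    """
    fx = f(x)
    Jx = J(x)
    # Gauss-Newton step: minimizes the linear model, ignoring the constraint.
    s, *_ = np.linalg.lstsq(Jx, -fx, rcond=None)
    if np.linalg.norm(s) > delta:
        s *= delta / np.linalg.norm(s)      # project onto the trust region
    # Ratio of actual to predicted decrease in ||f||.
    pred = np.linalg.norm(fx) - np.linalg.norm(fx + Jx @ s)
    actual = np.linalg.norm(fx) - np.linalg.norm(f(x + s))
    rho = actual / pred if pred > 0 else -1.0
    if rho > sigma0:        # accept the step
        x = x + s
        if rho > 0.75:      # model agrees well: allow a longer step next time
            delta *= 2.0
    else:                   # reject the step and shrink the trust region
        delta *= 0.5
    return x, delta
```

Repeating this step drives \(\|f(x_k)\|\) toward zero while the radius adapts to how well the linear model predicts the actual decrease.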

## Line-Search Methods

Given an approximation \(B_k\) to the Jacobian matrix, a *line-search method* obtains a search direction \(d_k\) by solving the system of linear equations

\[B_k d_k =-f(x_k)\]
and then defining the next iterate as \(x_{k+1} = x_k + \alpha_k d_k \,\), where the line search parameter \(\alpha_k > 0\) is chosen by the line-search procedure so that \( \|f(x_{k+1})\| < \|f(x_k)\| \).

When the “approximate” Jacobian is exact, as in Newton’s method, \(d_k\) is a descent direction for \(\|f(x)\|_2\), so there is certain to be an \(\alpha_k > 0\) such that \(\|f(x_{k+1}) \|_2 < \|f(x_k) \|_2\). This descent property does not necessarily hold for other choices of the approximate Jacobian, so line-search methods are used only when \(B_k\) is either the exact Jacobian or a close approximation to it.
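A minimal sketch of this scheme, assuming the exact Jacobian and the simplest possible line search (repeated halving of \(\alpha\) until \(\|f\|\) decreases, rather than the conditions discussed below):

```python
import numpy as np

def line_search_newton(f, J, x, tol=1e-8, max_iter=50):
    """Newton's method with a backtracking line search (illustrative sketch).

    The direction solves B_k d_k = -f(x_k) with B_k the exact Jacobian;
    alpha is halved until ||f(x_k + alpha d_k)|| < ||f(x_k)||, which the
    descent property guarantees is possible for small enough alpha.
    """
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        d = np.linalg.solve(J(x), -fx)   # B_k d_k = -f(x_k)
        alpha = 1.0
        while np.linalg.norm(f(x + alpha * d)) >= np.linalg.norm(fx):
            alpha *= 0.5
            if alpha < 1e-12:            # safeguard against stagnation
                break
        x = x + alpha * d
    return x
```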

In an ideal line-search Newton method, we would compute the search direction by solving

\[f^\prime(x_k) d_k = -f(x_k)\]
and choose the line-search parameter \(\alpha_k\) to minimize the scalar function \(\phi(\alpha) = \| f(x_k + \alpha d_k) \|_2^2.\)

However, since it is usually too time-consuming to find the \(\alpha\) that exactly minimizes \(\phi\), we usually settle for an approximate solution that satisfies the conditions

\[\phi(\alpha_k) \leq \phi(0) + \mu \alpha_k \phi^\prime(0), \quad\mbox{and} \quad |\phi^\prime(\alpha_k)| \leq \eta |\phi^\prime(0)| \]
where \( \mu\) and \(\eta\) are two constants. Typical values are \(\mu = 0.001\) and \(\eta=0.9\).

The first of these conditions ensures that \(\|f(x) \|_2^2\) decreases by a significant amount, while the second condition ensures that we move far enough along the search direction by insisting on a significant reduction in the size of the derivative \(\phi^\prime\).
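These two conditions can be tested directly, since by the chain rule \(\phi^\prime(\alpha) = 2\, f(x_k + \alpha d_k)^T f^\prime(x_k + \alpha d_k)\, d_k\). The helper below is an illustrative sketch (the function name is ours, not a standard API), using the typical values \(\mu = 0.001\) and \(\eta = 0.9\) quoted in the text:

```python
import numpy as np

def line_search_conditions_hold(f, J, x, d, alpha, mu=1e-3, eta=0.9):
    """Check the two conditions for phi(a) = ||f(x + a d)||_2^2 (sketch).

    phi'(a) = 2 f(x + a d)^T J(x + a d) d by the chain rule, so both the
    sufficient-decrease and the derivative-reduction condition can be
    evaluated from f and its Jacobian J.
    """
    def phi(a):
        r = f(x + a * d)
        return float(r @ r)

    def dphi(a):
        r = f(x + a * d)
        return float(2.0 * r @ (J(x + a * d) @ d))

    sufficient_decrease = phi(alpha) <= phi(0) + mu * alpha * dphi(0)
    derivative_reduced = abs(dphi(alpha)) <= eta * abs(dphi(0))
    return sufficient_decrease and derivative_reduced
```

In practice a line-search routine would try candidate values of \(\alpha\) (e.g., by interpolation) until such a check succeeds.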