Line Search Methods

Line search methods generate the iterates by setting $$x_{k+1} = x_k + \alpha_k d_k \,$$ where $$d_k$$ is a search direction and $$\alpha_k > 0$$ is chosen so that $$f({x+1}) < f(x_k).$$

Most line search versions of the basic Newton method generate the direction $$d_k$$ by modifying the Hessian matrix $$\nabla^2f(x_k)$$ to ensure that the quadratic model $$q_k$$ of the function has a unique minimizer. The modified Cholesky decomposition approach adds positive quantities to the diagonal of $$\nabla^2f(x_k)$$ during the Cholesky factorization. As a result, a diagonal matrix, $$E_k$$, with nonnegative diagonal entries is generated such that $$\nabla^2f(x_k) + E_k$$ is positive definite. Given this decomposition, the search direction $$d_k$$ is obtained by solving $$(\nabla^2f(x_k) + E_k)d_k = -\nabla f(x_k)$$

After $$d_k$$ is found, a line search procedure is used to choose an $$\alpha_k > 0$$ that approximately minimizes $$f$$ along the ray $${x_k + \alpha d_k : \alpha > 0}$$. The following Newton codes use line-search methods: GAUSS, NAG Fortran Library, NAG C Library, and OPTIMA.

The algorithms used in these codes for determining $$\alpha_k$$ rely on quadratic or cubic interpolation of the univariate function $$\phi_\alpha = f(x_k + \alpha d_k)$$ in their search for a suitable $$\alpha_k$$. An elegant and practical criterion for a suitable $$\alpha_k$$ is to require $$\alpha_k$$ to satisfy the sufficient decrease condition:
$$f(x_k + \alpha d_k) \leq f(x_k) + \mu \alpha_k \nabla f(x_k)^T d_k$$ and the curvature condition:
$$| \nabla f(x_k + \alpha d_k)^T d_k| \leq \eta | \nabla f(x_k)^T d_k|$$

where $$\mu$$ and $$\eta$$ are two constants with $$0 < \mu < \eta < 1$$. The sufficient decrease condition guarantees, in particular, that $$f(x_{k+1}) < f(x_k)$$ while the curvature condition requires that $$\alpha_k$$ be not too far from a minimizer of $$\phi$$. Requiring an accurate minimizer is generally wasteful of function and gradient evaluations, so codes typically use $$\mu=0.001$$ and $$\eta=0.9$$ in these conditions.