Nonlinear conjugate gradient methods make up another popular class of algorithms for large-scale optimization. These algorithms can be derived as extensions of the conjugate gradient algorithm or as specializations of limited-memory quasi-Newton methods. Given an iterate \(x_k\) and a search direction \(d_k\), a line search is performed along \(d_k\) to produce \(x_{k+1} = x_k + \alpha_k d_k\). The Fletcher-Reeves variant of the nonlinear conjugate gradient algorithm generates \(d_{k+1}\) from the simple recursion
\[d_{k+1} = -\nabla f(x_{k+1}) + \beta_k d_k, \qquad \beta_k = \frac{\nabla f(x_{k+1})^T \nabla f(x_{k+1})}{\nabla f(x_k)^T \nabla f(x_k)}\]
The method's performance is sometimes enhanced by restarting, that is, by periodically setting \(\beta_k\) to zero. The Polak-Ribiere variant of the conjugate gradient method instead defines \(\beta_k\) as
\[\beta_k = \frac{[ \nabla f(x_{k+1}) - \nabla f(x_k)]^T \nabla f(x_{k+1}) } { \nabla f(x_k)^T \nabla f(x_k)}\]
These two definitions of \(\beta_k\) coincide when \(f\) is quadratic and exact line searches are used, but differ otherwise. Numerical testing suggests that the Polak-Ribiere method tends to be more efficient than the Fletcher-Reeves method. IMSL, NAG(FORTRAN), NAG(C), OPTIMA Library, OPTPACK, and PROC NLP contain nonlinear conjugate gradient codes.
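To make the recursion concrete, the following is a minimal sketch in Python (using NumPy) of a nonlinear conjugate gradient iteration supporting both choices of \(\beta_k\) and the restart strategy mentioned above. The backtracking Armijo line search, the restart interval, and all function and parameter names are illustrative assumptions; this sketch is unrelated to the library codes listed above.

```python
import numpy as np

def backtracking_line_search(f, grad_f, x, d, alpha=1.0, rho=0.5, c=1e-4, max_shrink=60):
    """Shrink alpha until the Armijo sufficient-decrease condition holds (assumed strategy)."""
    fx, slope = f(x), grad_f(x).dot(d)
    for _ in range(max_shrink):
        if f(x + alpha * d) <= fx + c * alpha * slope:
            break
        alpha *= rho
    return alpha

def nonlinear_cg(f, grad_f, x0, variant="PR", restart=50, tol=1e-8, max_iter=1000):
    """Nonlinear conjugate gradient with Fletcher-Reeves ("FR") or Polak-Ribiere ("PR") beta."""
    x = np.asarray(x0, dtype=float)
    g = grad_f(x)
    d = -g                          # first direction is steepest descent
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = backtracking_line_search(f, grad_f, x, d)
        x_new = x + alpha * d
        g_new = grad_f(x_new)
        if variant == "FR":         # Fletcher-Reeves: ||g_{k+1}||^2 / ||g_k||^2
            beta = g_new.dot(g_new) / g.dot(g)
        else:                       # Polak-Ribiere: (g_{k+1} - g_k)^T g_{k+1} / ||g_k||^2
            beta = (g_new - g).dot(g_new) / g.dot(g)
        # Restart: periodically (or when beta turns negative) fall back to
        # steepest descent by setting beta to zero.
        if (k + 1) % restart == 0 or beta < 0:
            beta = 0.0
        d = -g_new + beta * d       # the nonlinear CG recursion for d_{k+1}
        x, g = x_new, g_new
    return x

# Example usage: minimize the Rosenbrock function from a standard starting point.
if __name__ == "__main__":
    f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
    grad_f = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                                 200 * (x[1] - x[0]**2)])
    print(nonlinear_cg(f, grad_f, [-1.2, 1.0], variant="PR"))
```

Setting \(\beta_k = 0\) whenever it would be negative is one common safeguard for the Polak-Ribiere variant; it reduces that step to steepest descent, which is exactly the effect of the periodic restarts described above.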