Homotopy Methods

See Also: Unconstrained Optimization, Nonlinear Optimization

Algorithms for systems of nonlinear equations that use trust-region and line-search strategies require the condition
\[ \| f(x_{k+1}) \| \leq \| f(x_k) \| \] to hold at every iteration. Consequently, they can become trapped in a region in which the merit function \(\| f(x) \|\) has a local minimizer \(z^*\) for which \(f(z^*) \neq 0\), and they can therefore fail to converge to a solution \(x^*\) of the original nonlinear system
\[f(x) = 0.\]

To be more certain of convergence to \(x^*\) from arbitrary starting points, we need to turn to a more complex class of algorithms known as homotopy, or continuation methods. These methods are usually slower than line-search and trust-region methods, but they are useful on difficult problems for which a good starting point is hard to find.

Continuation methods define an easy problem for which the solution is known and a path between the easy problem and the original (hard) problem. The solution to the easy problem is gradually transformed to the solution of the hard problem by tracing this path. The path may be defined by introducing an additional scalar parameter \(\lambda\) into the problem and defining a function \(h(x, \lambda)\):
\[h(x, \lambda) = \lambda \, f(x) + (1 - \lambda) \, (x - x_0), \quad (1)\] where \(x_0\) is the starting point; the equation \(h(x, \lambda) = 0\) is then solved for values of \(\lambda\) between 0 and 1.

  • When \(\lambda = 0\), the equation \(h(x, 0) = 0\) reduces to \(x - x_0 = 0\), so its solution is clearly \(x = x_0\).
  • When \(\lambda = 1\), \(h(x, 1) = f(x)\), so the solution of \(h(x, 1) = 0\) coincides with the solution of the original problem \(f(x) = 0\).
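To make the scheme concrete, the following minimal sketch (Python with NumPy; the names naive_continuation, f, and jac are illustrative and not taken from the text) traces \(h(x, \lambda) = 0\) for a sequence of increasing values of \(\lambda\), applying Newton's method at each value warm-started from the previous solution. As discussed next, this naive strategy can fail when the zero path has turning points.

```python
import numpy as np

def naive_continuation(f, jac, x0, n_steps=50, newton_iters=20, tol=1e-10):
    """Trace h(x, lam) = lam*f(x) + (1 - lam)*(x - x0) = 0 for increasing lam,
    warm-starting Newton's method at each lam from the previous solution."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    for lam in np.linspace(0.0, 1.0, n_steps + 1)[1:]:
        for _ in range(newton_iters):
            h = lam * f(x) + (1.0 - lam) * (x - x0)
            if np.linalg.norm(h) < tol:
                break
            # Jacobian of h with respect to x: lam * J_f(x) + (1 - lam) * I
            J = lam * jac(x) + (1.0 - lam) * np.eye(x.size)
            x = x - np.linalg.solve(J, h)
    return x

# Illustrative example: f(x) = (x1^2 + x2^2 - 4, x1 - x2), root near (1.414, 1.414)
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])
jac = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])
print(naive_continuation(f, jac, [1.0, 0.5]))
```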

The path that takes \((x, \lambda)\) from \((x_0, 0)\) to \((x^*, 1)\) may have turning points, where it is not possible to follow the path smoothly by insisting on an increasing \(\lambda\) at every step. Instead, practical methods allow \(\lambda\) to decrease where necessary.

Some implementations of the method use the more robust approach of expressing both \(x\) and \(\lambda\) in terms of a third parameter, \(s\), which represents arc length along the solution path. Differentiating the identity \(h(x(s), \lambda(s)) = 0\) with respect to \(s\) yields the ordinary differential equation
\[ \partial_x h(x, \lambda) \, x^\prime (s) + \partial_\lambda h(x, \lambda) \, \lambda^\prime (s) = 0. \]

Sophisticated ODE solvers may then be applied to this problem with initial condition
\[ (x(0), \lambda(0)) = (x_0, 0) \] and the side condition
\[\|x^\prime (s)\|^2_2 + \lambda^\prime (s)^2 = 1,\] which ensures that \(s\) indeed measures arc length along the path.
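Concretely, the tangent \((x^\prime(s), \lambda^\prime(s))\) spans the one-dimensional null space of the \(n \times (n+1)\) Jacobian \([\partial_x h \;\; \partial_\lambda h]\). The sketch below (Python/NumPy; path_tangent and euler_path_follow are hypothetical names) computes this tangent with an SVD, orients it consistently with the previous step, and follows the path with crude explicit Euler steps; a practical code would use an adaptive ODE solver or the predictor-corrector scheme described next.

```python
import numpy as np

def path_tangent(f, jac, x0, x, lam, prev_tangent=None):
    """Unit tangent (x'(s), lam'(s)) to the zero path of
    h(x, lam) = lam*f(x) + (1 - lam)*(x - x0): a null vector of the
    n x (n+1) Jacobian [d_x h | d_lam h], normalized to unit length."""
    n = x.size
    dh_dx = lam * jac(x) + (1.0 - lam) * np.eye(n)    # d h / d x
    dh_dlam = (f(x) - (x - x0)).reshape(n, 1)         # d h / d lam
    A = np.hstack([dh_dx, dh_dlam])                   # n x (n+1)
    _, _, vt = np.linalg.svd(A)                       # null space via SVD
    t = vt[-1] / np.linalg.norm(vt[-1])               # enforce the side condition
    # Keep the orientation consistent with the previous tangent
    # (initially: in the direction of increasing lam).
    ref = prev_tangent if prev_tangent is not None else np.eye(n + 1)[-1]
    return t if np.dot(t, ref) >= 0 else -t

def euler_path_follow(f, jac, x0, step=0.01, max_steps=10000):
    """Crude explicit-Euler integration of the path ODE from (x0, 0) until lam >= 1."""
    x, lam, t = np.asarray(x0, dtype=float), 0.0, None
    for _ in range(max_steps):
        t = path_tangent(f, jac, x0, x, lam, t)
        x, lam = x + step * t[:-1], lam + step * t[-1]
        if lam >= 1.0:
            break
    return x, lam
```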

Another practical method obtains a new iterate \((x_{k+1}, \lambda_{k+1})\) from the current iterate \((x_k, \lambda_k)\) by solving an augmented system of the form
\[\left [ \begin{array}{c} h(x, \lambda) \\ w^T_k x + \mu_k \lambda - t_k  \end{array} \right] = 0.\] The additional (linear) equation is usually chosen so that \((w_k, \mu_k)\) is one of the unit vectors in \(R^{n+1}\) and \(t_k\) is a target value for the corresponding component of \((x_{k+1}, \lambda_{k+1})\), whose value has been fixed by a predictor step.