
Nonlinear Equations

There are many applications in which the goal is to find values for the variables that satisfy a set of given constraints without the need to optimize a particular objective function. When there are n variables and n equality constraints, the problem is one of solving a system of nonlinear equations. Mathematically, the problem is
f(x) = 0, where f: \mathbb{R}^n \to \mathbb{R}^n is a vector function,
f(x) = \begin{bmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_n(x) \end{bmatrix}, where each f_i(x): \mathbb{R}^n \to \mathbb{R}, i = 1, 2, \ldots, n, is smooth. A vector x^* satisfying f(x^*) = 0 is called a solution or a root of the nonlinear equations. In general, a system of nonlinear equations may have no solution, a unique solution, or many solutions.
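For example, with n = 2 the system f_1(x) = x_1^2 + x_2^2 - 1 = 0 and f_2(x) = x_1 - x_2 = 0 (a circle intersected with a line) has exactly two solutions, x^* = \pm (1/\sqrt{2}, 1/\sqrt{2}). This small system is reused in the sketches below.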

Many algorithms for nonlinear equations are related to algorithms for unconstrained optimization and nonlinear least-squares. There are close connections to the nonlinear least-squares problem since a number of algorithms for nonlinear equations proceed by minimizing the sum of squares of the equations:
\min_x \sum_{i=1}^{n} f_i(x)^2.

Despite the similarities, there are important differences between algorithms for the two problems. In nonlinear equations, the number of equations is equal to the number of variables and all of the equations must be satisfied at a solution point.
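To illustrate the least-squares connection, here is a minimal sketch in Python (NumPy and SciPy assumed) that minimizes the sum of squares of the residuals of the example system above using SciPy's least_squares; the starting point is an arbitrary choice:

    import numpy as np
    from scipy.optimize import least_squares

    # Residual vector f(x) for the example system:
    # f1(x) = x1^2 + x2^2 - 1,  f2(x) = x1 - x2.
    def f(x):
        return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

    # least_squares minimizes 0.5 * sum(f_i(x)^2); at a root of f
    # the minimum value is zero.
    result = least_squares(f, x0=np.array([2.0, 0.5]))
    print(result.x)     # approximately [0.7071, 0.7071]
    print(result.cost)  # approximately 0 at a solution

Because the example system has a root, the minimized cost is (numerically) zero; for an inconsistent system, the minimizer would instead balance the residuals.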

Algorithms

Newton’s method forms the basis for many algorithms for solving systems of nonlinear equations. We give a brief overview of Newton’s method and outline some related algorithms. For more details, see the references, for example Chapter 11 of Nocedal and Wright (1999).

Newton’s method for nonlinear equations uses a linear approximation of the function f around the current iterate. Given an initial point x_0,
f(x) \approx f(x_0) + J(x_0)(x - x_0), where J(x_0) is the Jacobian matrix, the matrix of all first-order partial derivatives of the components of f. To find the vector x that makes f(x) = 0, we choose the next iterate x_1 so that
f(x_0) + J(x_0)(x_1 - x_0) = 0. Let p_0 = x_1 - x_0. To find p_0, we solve the system of linear equations
J(x_0)\, p_0 = -f(x_0) and then set x_1 = x_0 + p_0.
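Concretely, a single Newton step for the example system can be computed as follows (a minimal Python/NumPy sketch; the starting point x_0 = (2, 0.5) is an arbitrary choice):

    import numpy as np

    def f(x):
        # Residuals of the example system: a circle and a line.
        return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

    def J(x):
        # Jacobian of f: row i holds the partial derivatives of f_i.
        return np.array([[2.0 * x[0], 2.0 * x[1]],
                         [1.0, -1.0]])

    x0 = np.array([2.0, 0.5])
    p0 = np.linalg.solve(J(x0), -f(x0))  # solve J(x0) p0 = -f(x0)
    x1 = x0 + p0                         # next iterate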

Newton’s Method for Nonlinear Equations

Given an iterate x_k, Newton’s method computes f(x_k) and the Jacobian matrix J(x_k), and finds a step p_k by solving the system of linear equations
J(x_k) \, p_k = -f(x_k) \quad (1) Then the new iterate is x_{k+1} = x_k + p_k.
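Putting the pieces together, here is a minimal sketch of the full iteration in Python/NumPy (the tolerance, iteration cap, and stopping test on the norm of f(x_k) are assumptions, not part of the algorithm statement above):

    import numpy as np

    def newton(f, J, x0, tol=1e-10, max_iter=50):
        """Newton's method for f(x) = 0, given the Jacobian J."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            fx = f(x)
            if np.linalg.norm(fx) < tol:    # stop once f(x) is (nearly) zero
                return x
            p = np.linalg.solve(J(x), -fx)  # Newton step from equation (1)
            x = x + p
        return x  # may not have converged within max_iter

For the example system with starting point (2, 0.5), this converges to approximately (0.7071, 0.7071) in a handful of iterations.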

Most of the computational cost of Newton’s method is associated with two operations: evaluation of the function f and its Jacobian matrix, and solution of the linear system of equations (1). Computing the i^{th} column of the Jacobian requires the partial derivative of each component of f with respect to x_i. Solving the linear system (1) requires on the order of n^3 operations when the Jacobian is dense.
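When analytic derivatives are unavailable, the Jacobian is often approximated column by column with finite differences, at the cost of n extra evaluations of f per iteration; this is a standard alternative not detailed in the text above. A minimal forward-difference sketch (the step size h is an assumption):

    import numpy as np

    def jacobian_fd(f, x, h=1e-7):
        """Approximate the Jacobian of f at x by forward differences.

        Column i is (f(x + h*e_i) - f(x)) / h, so building the full
        Jacobian costs n additional evaluations of f.
        """
        x = np.asarray(x, dtype=float)
        fx = f(x)
        n = x.size
        Jfd = np.empty((fx.size, n))
        for i in range(n):
            xh = x.copy()
            xh[i] += h
            Jfd[:, i] = (f(xh) - fx) / h
        return Jfd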

Newton’s method is guaranteed to converge if the starting point is sufficiently close to the solution and the Jacobian is nonsingular at the solution. Under these conditions, the rate of convergence is quadratic:
\| x_{k+1} - x^* \| \leq \beta \| x_k - x^* \|^2, for some positive constant \beta. This rapid local convergence is the main advantage of Newton’s method. The disadvantages include the calculation of the Jacobian matrix and the potentially erratic behavior when the starting point is not close to a solution (no guaranteed global convergence). Modifications and enhancements to Newton’s method have been developed to deal with these two difficulties.
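The quadratic rate is easy to observe numerically: near x^*, the error is roughly squared at each iteration, so the number of correct digits roughly doubles. A small sketch using the one-variable equation x^2 - 2 = 0 (an arbitrary choice, with root \sqrt{2}):

    import math

    # Newton's method for f(x) = x^2 - 2, whose positive root is sqrt(2).
    x = 1.5
    for k in range(5):
        x = x - (x**2 - 2.0) / (2.0 * x)  # Newton step: x - f(x)/f'(x)
        err = abs(x - math.sqrt(2.0))
        print(f"iteration {k + 1}: error = {err:.2e}")
    # The error drops from about 2e-3 to 2e-6 to 2e-12,
    # consistent with the quadratic bound above.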

References
  • Nocedal, J. and Wright, S. J. 1999. Numerical Optimization. Springer-Verlag, New York.