Interior-Point Methods

See Also: Constrained Optimization, Linear Programming

The announcement by Karmarkar in 1984 that he had developed a fast algorithm that generated iterates that lie in the interior of the feasible set (rather than on the boundary, as simplex methods do) opened up exciting new avenues for research in both the computational complexity and mathematical programming communities. Since then, there has been intense research into a variety of methods that maintain strict feasibility of all iterates, at least with respect to the inequality constraints. Although dwarfed in volume by simplex-based packages, interior-point products have emerged and have proven to be competitive with, and often superior to, the best simplex packages, especially on large problems.

The NEOS Server offers several solvers that implement interior-point methods, including bpmpd, MOSEK, and OOQP.

Here, we discuss only primal-dual interior-point algorithms, which are effective from a computational perspective as well as the most amenable to theoretical analysis. To begin, recall the standard form linear programming problem
\[\min\left\{c^T x \; : \; A x = b, \; x \geq 0 \right\},\] whose dual can be written as
\[\max\left\{b^T y \; : \; s = c - A^T y \geq 0 \right\}.\] Then, the optimality conditions for \((x,y,s)\) to be a primal-dual solution triplet are that
\[Ax = b, \quad A^T y + s = c, \quad S X e = 0, \quad x \geq 0, \quad s \geq 0,\] where
\[S = \mbox{diag}(s_1, s_2, \ldots, s_n), \quad X = \mbox{diag}(x_1, x_2, \ldots, x_n)\] and \(e\) is the vector of all ones.
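As a concrete check, these conditions can be evaluated directly for any candidate triple \((x,y,s)\). The following is a minimal sketch, assuming NumPy; the data \(A\), \(b\), \(c\) are illustrative and not taken from the text.

```python
import numpy as np

def optimality_residuals(A, b, c, x, y, s):
    """Evaluate the three optimality conditions for a candidate (x, y, s)."""
    r_primal = A @ x - b        # primal feasibility:  A x = b
    r_dual   = A.T @ y + s - c  # dual feasibility:    A^T y + s = c
    comp     = x * s            # complementarity:     S X e = 0
    return r_primal, r_dual, comp

# Illustrative data (not from the text): a 2 x 3 standard form problem.
A = np.array([[1.0, 1.0, 1.0],
              [2.0, 0.5, 1.0]])
b = np.array([4.0, 5.0])
c = np.array([1.0, 2.0, 3.0])
print(optimality_residuals(A, b, c,
                           x=np.ones(3), y=np.zeros(2), s=c.copy()))
```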

Interior-point algorithms generate iterates \((x_k,y_k,s_k)\) such that \(x_k > 0\) and \(s_k > 0\). As \(k \to \infty\), the equality-constraint violations \(\|A x_k - b \|\) and \(\| A^T y_k + s_k - c\|\) and the duality gap \(x_k^T s_k\) are driven to zero, yielding a limiting point that solves the primal and dual linear programs.

Primal-dual methods can be thought of as a variant of Newton's method applied to the system of equations formed by the first three optimality conditions. Given the current iterate \((x_k,y_k,s_k)\) and the damping parameter \(\sigma_k \in [0,1]\), the search direction \((w_k,z_k,t_k)\) is generated by solving the linear system
\[A w = b - A x_k, \; A^T z + t = c - A^T y_k - s_k, \; S_k w + X_k t = - S_k X_k e + \sigma_k \mu_k e,\] where
\[\mu_k = x_k^T s_k / n.\]
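For small problems, this system can be assembled and solved directly. The sketch below assumes NumPy and forms the full block system as one dense matrix; production codes instead use the condensed normal-equations form discussed later.

```python
import numpy as np

def newton_direction(A, b, c, x, y, s, sigma):
    """Solve the primal-dual Newton system for (w, z, t) as one dense solve."""
    m, n = A.shape
    mu = x @ s / n
    # Block matrix for the unknowns ordered as (w, z, t).
    K = np.block([
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.zeros((n, n)), A.T,              np.eye(n)       ],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)      ],
    ])
    rhs = np.concatenate([
        b - A @ x,                          # primal residual
        c - A.T @ y - s,                    # dual residual
        -x * s + sigma * mu * np.ones(n),   # centered complementarity
    ])
    d = np.linalg.solve(K, rhs)
    return d[:n], d[n:n + m], d[n + m:]     # w, z, t
```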

The new point is then obtained by setting
\[(x_{k+1},y_{k+1},s_{k+1}) \leftarrow (x_k,y_k,s_k) + (\alpha_k^P w_k , \alpha_k^D z_k, \alpha_k^D t_k)\] where \(\alpha_k^D\) and \(\alpha_k^P\) are chosen to ensure that \(x_{k+1} > 0\) and \(s_{k+1} > 0\).
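A standard way to choose such step lengths is a ratio test along each direction. The sketch below assumes NumPy; the backoff factor \(\eta\) (a "fraction-to-boundary" parameter) is a common implementation heuristic and not part of the text's description.

```python
import numpy as np

def max_step(v, dv, eta=0.9995):
    """Largest step in (0, 1] keeping v + alpha * dv strictly positive."""
    neg = dv < 0
    if not np.any(neg):
        return 1.0
    # Step to the boundary is min over negative components of -v_i / dv_i;
    # back off by eta to stay strictly inside the nonnegative orthant.
    return min(1.0, eta * float(np.min(-v[neg] / dv[neg])))

# alpha_P = max_step(x, w);  alpha_D = max_step(s, t)
```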

When \(\sigma_k=0\), the search direction is the pure Newton search direction for the nonlinear system
\[A x = b, \quad A^T y + s = c, \quad S X e = 0,\]
and the resulting method is an "affine-scaling" algorithm. The effect of choosing positive values of \(\sigma_k\) is to orient the step away from the boundary of the nonnegative orthant defined by \(x \geq 0, \; s \geq 0\), thus allowing longer step lengths \(\alpha_k^P, \; \alpha_k^D\) to be taken. "Path-following" methods require \(\alpha_k^P, \; \alpha_k^D\) to be chosen so that \(x_k\) and \(s_k\) are not merely positive but also satisfy the centrality condition \((x_k,s_k) \in C_k\), where
\[C_k = \left\{ (x,s): x_i s_i \geq \gamma \mu_k, \; i=1,\ldots,n \right\}\] for some \(\gamma \in (0,1)\). The other requirement on \(\alpha_k^P\) and \(\alpha_k^D\) is that the decrease in \(\mu_k\) should not outpace the improvement in feasibility (that is, the decrease in \(\| A x_k - b \|\) and \(\| A^T y_k + s_k - c \|\)). Greater priority is placed on attaining feasibility than on closing the duality gap. It is possible to design path-following algorithms satisfying these requirements for which the sequence \(\{ \mu_k \}\) converges to zero at a linear rate. Further, the number of iterates required for convergence is a polynomial function of the size of the problem (typically, order \(n\) or order \(n^{3/2}\)). By allowing the damping parameter \(\sigma_k\) to become small as the solution is approached, the method behaves more and more like a pure Newton method, and superlinear convergence can be achieved.
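The centrality condition is straightforward to test; the following sketch assumes NumPy, with \(\gamma = 0.1\) as a purely illustrative choice.

```python
import numpy as np

def in_neighborhood(x, s, gamma=0.1):
    """Check the centrality condition x_i * s_i >= gamma * mu for all i."""
    mu = x @ s / x.size
    return bool(np.all(x * s >= gamma * mu))
```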

The current generation of interior-point software is quite sophisticated. Most current codes are based on the predictor-corrector algorithm proposed by Mehrotra in 1992. Mehrotra's algorithm consists of the basic infeasible-interior-point approach described above, with a corrector component added to each step to increase the order of approximation to the nonlinear (complementarity) term in the KKT conditions. Implementations of the algorithm also incorporate other heuristics that are essential for robust behavior on many practical problems, including the choice of step length and centering parameter, handling of free variables (i.e., variables without explicit bounds), preprocessing to remove redundant information in the problem specification, determination of a starting point, and so on.
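The following is a compressed sketch of the predictor-corrector idea only, not the full heuristic set of a production code. The helper solve_kkt is hypothetical: it stands for a routine that solves the Newton system above with the usual feasibility residuals and a supplied complementarity right-hand side; max_step is the step-length routine sketched earlier.

```python
import numpy as np

def mehrotra_direction(x, s, solve_kkt):
    """One predictor-corrector direction (sketch)."""
    n = x.size
    mu = x @ s / n
    # Predictor: pure affine-scaling direction (sigma = 0).
    w_aff, z_aff, t_aff = solve_kkt(-x * s)
    # Duality measure after the longest feasible affine step
    # (max_step is the fraction-to-boundary ratio test sketched earlier).
    mu_aff = ((x + max_step(x, w_aff) * w_aff)
              @ (s + max_step(s, t_aff) * t_aff)) / n
    sigma = (mu_aff / mu) ** 3          # adaptive centering heuristic
    # Corrector: second-order correction to the complementarity term.
    return solve_kkt(-x * s - w_aff * t_aff + sigma * mu * np.ones(n))
```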

Another useful option in practical software is the ability to cross over to a simplex-like method once a good approximation to the solution has been found. This feature is present in the CPLEX Barrier code.

Most of the computational cost of an interior-point method is associated with the solution of the linear system that defines the search direction. Typically, block elimination is used to obtain a single linear system in \(z\) alone, specifically,

\[A X_k S_k^{-1} A^T z = A X_k S_k^{-1}\left[ c - A^T y_k - s_k \right] + b - \sigma_k \mu_k A S_k^{-1} e.\]

The coefficient matrix is formed explicitly, except that dense columns of \(A\) (which would cause \(A X_k S_k^{-1} A^T\) to have many more nonzeros than \(A\) itself) are handled separately, and columns that correspond to apparently nonbasic primal variables are eventually dropped.
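In dense form, the block elimination and back-substitution look roughly as follows. This is a sketch assuming NumPy; a practical code would use a sparse Cholesky factorization of the coefficient matrix together with the dense-column handling just described.

```python
import numpy as np

def normal_equations_direction(A, b, c, x, y, s, sigma):
    """Recover (w, z, t) from the condensed system in z alone (dense sketch)."""
    n = x.size
    mu = x @ s / n
    d = x / s                            # diagonal of X S^{-1}
    M = (A * d) @ A.T                    # A X S^{-1} A^T, formed explicitly
    r_dual = c - A.T @ y - s
    rhs = A @ (d * r_dual) + b - sigma * mu * (A @ (1.0 / s))
    z = np.linalg.solve(M, rhs)          # sparse Cholesky in real codes
    t = r_dual - A.T @ z                 # back-substitute for t ...
    w = -x + sigma * mu / s - d * t      # ... and then for w
    return w, z, t
```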

References

N. Karmarkar, A new polynomial-time algorithm for linear programming, Combinatorica, 4 (1984), pp. 373-395.
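
S. Mehrotra, On the implementation of a primal-dual interior point method, SIAM Journal on Optimization, 2 (1992), pp. 575-601.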

Please see the home page for the book Primal-Dual Interior-Point Methods for more information.