Trust-Region Methods

The trust-region approach can be motivated by noting that the quadratic model $$q_k$$ is a useful model of the function $$f$$ only near $$x_k$$. When the Hessian matrix is indefinite, the quadratic function $$q_k$$ is unbounded below, so $$q_k$$ is obviously a poor model of $$f(x_k+s)$$ when $$s$$ is large. Therefore, it is reasonable to select the step $$s_k$$ by solving the subproblem
$\min \{ q_k(s) : \| D_ks \|_2 \leq \Delta_k \}$ for some $$\Delta_k > 0$$ and scaling matrix $$D_k$$. The trust-region parameter, $$\Delta_k$$, is adjusted between iterations according to the agreement between predicted and actual reduction in the function $$f$$ as measured by the ratio
$\rho_k = \frac{f(x_k) – f(x_k + s_k)}{f(x_k) – q_k(s_k)}.$
If there is good agreement, that is, $$p \approx 1$$ then $$\Delta_k$$ is increased. If the agreement is poor, i.e., $$\rho_k$$ is small or negative, then $$\Delta_k$$ is decreased. The decision to accept the step $$s_k$$ is also based on $$\rho_k$$. Usually,
$x_{k+1} = x_k + s_k,$ if $$\rho_k \geq \sigma_0$$ where $$\sigma_0$$ is small (typically, $$10^{-4}$$); otherwise $$x_{k+1} = x_k$$.
Different algorithms may be used to choose the step $$s_k$$. Most rely on the observation that there is a $$\lambda_k$$ such that
$(\nabla ^2 f(x_k) + \lambda_k D_k^T D_k ) s_k = – \nabla f(x_k), \quad\quad\quad(1.1)$ where either $$\lambda_k = 0$$ and $$\| D_ks_k \|_2 \leq \Delta_k$$, or $$\lambda_k > 0$$ and $$\| D_ks_k \|_2 = \Delta_k$$. An appropriate $$\lambda_k$$ is determined by an iterative process in which (1.1) is solved for each trial value of $$\lambda_k$$.