Consider the following problem, featuring only inequality constraints:

min  f(x)
s.t. gk(x) ≤ 0,   k = 1, …, q
In order to characterize a (locally) optimal solution, we may follow the same reasoning as in the equality-constrained case; if x* is a locally optimal solution, then we cannot find a feasible descent direction at x*. A fundamental observation is that an inequality constraint can be either active at x*, gk(x*) = 0, or inactive, gk(x*) < 0. Let A denote the set of active constraints at x*.
- An active inequality constraint is similar to an equality constraint, with one significant difference. Unlike the case of equality constraints, here we have more degrees of freedom to move around x* by a displacement δ, as we should just make sure that gk(x* + δ) ≤ 0; we are free to move along the boundary of the feasible set, but we may also move toward its interior. Formally, under regularity conditions on the constraints, we may characterize a feasible direction as follows. On the basis of a first-order Taylor expansion, a direction δ is feasible at x* if

  gk(x* + δ) ≈ gk(x*) + ∇gk(x*)Tδ ≤ 0

  Since the constraint is active at x* and gk(x*) = 0, a feasible direction is characterized by the condition

  ∇gk(x*)Tδ ≤ 0

  This condition should be compared with Eq. (12.67).
- If a constraint is inactive, a point will stay feasible for any displacement, provided that it is small enough. An inactive constraint does not contribute to defining the locally optimal solution x* and can be disregarded.
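The first-order feasibility test can be sketched in a few lines of code. The snippet below is illustrative only; the active-constraint gradient [−2, −2]T is the value implied by Example 12.18 (where −∇g(x*) = [2, 2]T), and the helper names are made up:

```python
# First-order feasibility test at a point x* with active inequality
# constraints g_k(x) <= 0: a direction d is feasible (to first order)
# if grad g_k(x*) . d <= 0 for every active constraint k.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_feasible_direction(active_grads, d, tol=1e-12):
    """True if grad g_k(x*)^T d <= tol for all active-constraint gradients."""
    return all(dot(g, d) <= tol for g in active_grads)

# Single active constraint with gradient [-2, -2] (cf. Example 12.18):
active = [[-2.0, -2.0]]
print(is_feasible_direction(active, [1.0, 1.0]))    # True: toward the interior
print(is_feasible_direction(active, [1.0, -1.0]))   # True: along the boundary
print(is_feasible_direction(active, [-1.0, -1.0]))  # False: leaves the feasible set
```

Note that a direction tangent to the boundary (zero inner product) also passes the test, consistent with moving along the boundary of the feasible set.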
The following example illustrates these considerations.
Example 12.18 Consider the nonlinear programming problem
The feasible region is shaded in Fig. 12.17. Taking advantage of this picture, and noting the obvious symmetry of the problem, it is easy to see that the nonnegativity constraints are inactive at the optimal solution x*, whereas the nonlinear inequality constraint is active. The gradient of the objective function is

∇f = [1, 1]T

In order to improve the objective, we should move along the direction −∇f = [−1, −1]T (southwest), as illustrated in Fig. 12.17. The gradient of the constraint at x* is

∇g(x*) = [−2, −2]T
Fig. 12.17 Optimality conditions with an inequality constraint.
Looking again at Fig. 12.17, we observe that at point x* we may not only move along the tangent line to the constraint, but also toward the interior of the feasible set, along the direction −∇g(x*) = [2, 2]T. We notice once again that the gradients ∇f(x*) and ∇g(x*) are parallel, as in Example 12.16, and there exists a number μ such that

∇f(x*) + μ∇g(x*) = 0
However, this is not sufficient. We observe that μ = 0.5 > 0: the Lagrange multiplier of an inequality constraint must be nonnegative. In our example, this means that the two gradients point in opposite directions; otherwise, we could improve the objective by moving toward the interior of the feasible region. To see this, imagine that the gradient −∇f in Fig. 12.17 pointed northeast, rather than southwest.
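This parallelism is easy to check numerically. In the sketch below, the gradient values are read off from the example (∇f = [1, 1]T and ∇g(x*) = [−2, −2]T, i.e., the opposites of the directions −∇f and −∇g given above):

```python
# Check the multiplier condition grad f(x*) + mu * grad g(x*) = 0 for the
# gradient values of Example 12.18.

grad_f = [1.0, 1.0]
grad_g = [-2.0, -2.0]

# Solve the first component equation for mu, then verify the remaining ones.
mu = -grad_f[0] / grad_g[0]
assert all(abs(f + mu * g) < 1e-12 for f, g in zip(grad_f, grad_g))

print(mu)      # 0.5
print(mu > 0)  # True: the gradients point in opposite directions
```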
The example suggests that there is no feasible descent direction at x* if we can express the gradient of the objective as a linear combination of the gradients of the active constraints:

∇f(x*) + ∑k∈A μk∇gk(x*) = 0

This is not unlike the case of equality constraints, but now we have a nonnegativity restriction on the multipliers associated with inequality constraints: μk ≥ 0. This condition may be extended to include the inactive constraints, for which gk(x*) < 0; however, we must then enforce a further condition: μk = 0 for inactive constraints. We may express both requirements compactly by requiring

μk gk(x*) = 0

for all inequality constraints. Everything is wrapped up in the following theorem, which allows us to cope with the general problem, including both equality and inequality constraints.
THEOREM 12.11 (Karush–Kuhn–Tucker conditions) Assume that the functions f, hj, gk in problem (12.59) are continuously differentiable, and that x* is feasible and satisfies a constraint qualification condition. Then a necessary condition for the local optimality of x* is that there exist numbers λj, j = 1, …, m, and μk ≥ 0, k = 1, …, q, such that

∇f(x*) + ∑j λj∇hj(x*) + ∑k μk∇gk(x*) = 0     (12.71)

μk gk(x*) = 0,   k = 1, …, q     (12.72)
These conditions are called Karush–Kuhn–Tucker (KKT) conditions.
The condition (12.71) is the familiar stationarity requirement for the Lagrangian function:

L(x, λ, μ) = f(x) + ∑j λj hj(x) + ∑k μk gk(x)
Note that all of the constraints, equalities and inequalities, are included in the Lagrangian function, multiplied by the respective multipliers. The conditions (12.72) are known as complementary slackness conditions. They may be interpreted by noting that if a constraint is inactive at x*, i.e., if gk(x*) < 0, the corresponding multiplier must be zero; by the same token, if the multiplier is strictly positive, the corresponding constraint must be active (which roughly means that it could be substituted by an equality constraint without changing the optimal solution).
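A minimal sketch of the complementary slackness test, with made-up constraint values and multipliers (the function name and the sample numbers are illustrative, not from the text):

```python
# Complementary slackness: for each inequality g_k(x*) <= 0 with multiplier
# mu_k >= 0, require mu_k * g_k(x*) = 0 (illustrative sketch).

def complementary_slackness(g_vals, mus, tol=1e-9):
    """True if mu_k * g_k = 0 for every inequality constraint."""
    return all(abs(mu * g) <= tol for g, mu in zip(g_vals, mus))

# An inactive constraint (g < 0) must carry a zero multiplier:
print(complementary_slackness([-1.5, 0.0], [0.0, 4.0]))  # True
# A positive multiplier on an inactive constraint violates the condition:
print(complementary_slackness([-1.5, 0.0], [2.0, 4.0]))  # False
```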
Note again that the multipliers for inequality constraints are restricted in sign. Moreover, the KKT conditions are, like Theorem 12.10, rather weak, as they are only necessary conditions for local optimality, and they further require differentiability properties and the additional constraint qualification condition (see Example 12.17). They are, however, necessary and sufficient for global optimality in the convex and differentiable case. Finding a solution, at least in principle, requires the solution of a system of equations and inequalities, as illustrated by the following example.
Example 12.19 Consider the convex problem

min  x1² + x2²
s.t. x1 + x2 = 4
     x1 ≥ 0
     x2 ≥ 3

First write the Lagrangian function

L(x, λ, μ) = x1² + x2² + λ(x1 + x2 − 4) − μ1x1 + μ2(3 − x2)

A set of numbers satisfying the KKT conditions can be found by solving the following system of equations and inequalities:

2x1 + λ − μ1 = 0
2x2 + λ − μ2 = 0
μ1x1 = 0
μ2(3 − x2) = 0
x1 + x2 = 4
x1 ≥ 0,  x2 ≥ 3,  μ1 ≥ 0,  μ2 ≥ 0
We may proceed with a case-by-case analysis exploiting the complementary slackness conditions. If a multiplier is strictly positive, the corresponding inequality is active, which helps us in finding the value of a decision variable.
Case 1 (μ1 = μ2 = 0) In this case, the inequality constraints drop from the Lagrangian function. From the stationarity conditions we obtain the system

2x1 + λ = 0
2x2 + λ = 0
x1 + x2 = 4

This yields the solution x1 = x2 = 2, which violates the second inequality constraint.
Case 2 (μ1, μ2 ≠ 0) The complementary slackness conditions immediately yield x1 = 0 and x2 = 3, violating the equality constraint.
Case 3 (μ1 ≠ 0, μ2 = 0) We obtain

x1 = 0,  x2 = 4,  λ = −8,  μ1 = −8

The KKT conditions are not satisfied, since the value of μ1 is negative.
Case 4 (μ1 = 0, μ2 ≠ 0) We obtain

x1 = 1,  x2 = 3,  λ = −2,  μ2 = 4

This solution satisfies all of the necessary KKT conditions.
Since this is a convex problem, we have obtained the global optimum. Note how nonzero multipliers correspond to the active constraints, whereas the inactive constraint x1 ≥ 0 is associated with a multiplier μ1 = 0.
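The case-by-case search can be automated by enumerating which inequality constraints are forced to be active. The sketch below does this for a convex quadratic with the structure of Example 12.19, taken here as min x1² + x2² subject to x1 + x2 = 4, x1 ≥ 0, x2 ≥ 3; the function names are made up, and the instance should be treated as illustrative:

```python
# Enumerate the four complementary-slackness cases for the convex quadratic
#     min  x1^2 + x2^2   s.t.  x1 + x2 = 4,  x1 >= 0,  x2 >= 3
# Stationarity: 2*x1 + lam - mu1 = 0  and  2*x2 + lam - mu2 = 0.

TOL = 1e-9

def kkt_case(active1, active2):
    """Solve one case: active1 forces x1 = 0; active2 forces x2 = 3.
    Returns (x1, x2, mu1, mu2, ok), where ok means the case yields a KKT point."""
    if active1 and active2:
        x1, x2 = 0.0, 3.0            # both bounds active (equality then fails)
    elif active1:
        x1 = 0.0; x2 = 4.0 - x1      # use the equality constraint for x2
    elif active2:
        x2 = 3.0; x1 = 4.0 - x2      # use the equality constraint for x1
    else:
        x1 = x2 = 2.0                # stationarity => x1 = x2; equality => 2
    # Recover lam from a stationarity equation whose multiplier is zero.
    lam = -2.0 * (x2 if active1 else x1)
    mu1 = 2.0 * x1 + lam if active1 else 0.0
    mu2 = 2.0 * x2 + lam if active2 else 0.0
    feasible = abs(x1 + x2 - 4.0) <= TOL and x1 >= -TOL and x2 >= 3.0 - TOL
    ok = feasible and mu1 >= -TOL and mu2 >= -TOL
    return x1, x2, mu1, mu2, ok

for a1 in (False, True):
    for a2 in (False, True):
        print((a1, a2), kkt_case(a1, a2))
```

Only the case with μ1 = 0 and the second constraint active passes every check, reproducing the outcome of the hand analysis: the point (1, 3) with μ2 = 4.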
The procedure above is obviously cumbersome and cannot be applied to large-scale problems. In practice, the KKT conditions provide the theoretical background for many numerical methods for nonlinear programming. Furthermore, the analysis can be extended to include the Hessian matrix of second-order derivatives; by doing so, it is also possible to find sufficient conditions for local optimality in the nonconvex case.