Convex programming: difficult vs. easy problems

Let us consider an abstract mathematical programming problem:

images

Intuition would suggest that an unconstrained problem, where images, is much easier to solve than a constrained one. Moreover, the same intuition would suggest that the larger the problem, in terms of the number of decision variables and constraints, the more difficult is solving it. In fact, this intuition is wrong. Some unconstrained problems with a few tens of variables are much more difficult to solve than some constrained problems involving thousands of variables. The key issue is convexity, which we introduced earlier.10 In this section we extend those basic concepts a bit, and we summarize their impact on the difficulty of solving an optimization problem.

images

Fig. 12.5 Global and local optima for a polynomial.

Solving problem (12.16) means finding a feasible global optimizer of function f. For readers’ convenience, let us review the concepts of local and global optimality.

DEFINITION 12.1 Given the optimization problem (12.16), a point x* ∈ S is said to be a global minimizer if f(x*) ≤ f(x)for all x ∈ S. We have a local minimizer if the condition holds only in the intersection between S and a neighborhood of x*.

When dealing with maximization problems, we speak of maximizers; in general, we look for a global optimizer of the objective function. When one speaks of an “optimum,” a little ambiguity arises, because it is not quite clear whether we mean the optimizer x* or the optimal value f(x*); usually, the context clarifies what we really mean. We may also use the notation x* = arg minx∈S f(x) to indicate the optimizer.

Example 12.7 Consider the following unconstrained problem:

images

The objective function is a polynomial, whose graph is depicted in Fig. 12.5. We observe that there are two local minimizers, one of which is also the global one, and one local maximizer. The function goes to infinity when x is large in absolute value, so there is no global maximizer.

With a high-degree polynomial, there are potentially many local maxima and minima. Yet, polynomials in one variable are a relatively easy case; their derivative is very easy to find and is just a polynomial of lower degree. Therefore, application of the stationarity condition requires the solution of a nonlinear polynomial equation. While finding all of the roots of a generic nonlinear equation is not easy, efficient numerical methods are available to find all of the roots of a polynomial. Hence, we are able to find the whole set of stationary points of a polynomial, and it is an easy task to enumerate them to find the global optimizer, if any exists. However, this does not hold if we consider a function of several variables, with the possible complication of constraints. We already know that if a function of a single variable is convex and differentiable, then we do find a global minimizer by looking for a stationary point. Indeed, convexity is the basic feature that makes a mathematical problem relatively easy to solve.

DEFINITION 12.2 (Convex and concave problems) Problem (12.16) is said to be a convex problem if f is a convex function and S is a convex set. Problem (12.16) is said to be a concave problem if f is a concave function and S is a convex set.

Note that we are referring to a minimization problem. If we maximize a concave function over a convex set, we are actually dealing with a convex problem. Assuming that the optimization problem has a finite solution, the following properties can be proved.

PROPERTY 12.3 In a convex problem a local minimizer is also a global minimizer.

PROPERTY 12.4 In a concave problem the global minimizer lies on the boundary of the feasible set.

The first property essentially says that if we are able to find a local optimizer, we are done. The second property may look a bit harder to grasp, but an example will illustrate it.

Example 12.8 Consider the following one-dimensional problem:

images

This is a concave problem, since the leading term in the quadratic polynomial defining the objective is negative, and the second-order derivative is negative everywhere. In Fig. 12.6 we show the objective function and the feasible set. The stationarity point x = 2 is of no use to us, since it is a maximizer. We see that local minimizers are located at the boundary of the feasible set. A local minimizer lies at the left boundary, x = 1, and the global minimizer is located at the right boundary, x = 4.

Concave problems are not easy to solve, even though Property 12.4 helps a lot. What we wish for is convexity. We gave a general definition of convex sets and functions, but how can we check convexity operationally? In the case of a convex and differentiate function of one variable, we know that convexity is linked to nonnegativity of the second-order derivative. If the second-order derivative is strictly positive, then the function is strictly convex. The following theorem generalizes the result to a function of multiple variables.

images

Fig. 12.6 A concave problem may have local optima, but they lie on the boundary of the feasible set.

THEOREM 12.5 Consider a function f(x) defined on , and assume that its Hessian matrix exists at any point x.11 The function is convex if and only if its Hessian matrix is positive semidefinite for all x.

Referring back to Sections 3.8 and 3.9, we do understand what is the basic message of the theorem. A quadratic form is convex if its matrix is positive semidefinite. A generic function, provided that it is suitably differentiable, can be approximated by a second-order Taylor’s expansion. If the Hessian matrix for all Taylor’s expansions, taken at different points x, is always positive semi-definite, then both the approximation and the original function are convex. From Section 3.8 we also know that this condition could be checked in terms of eigenvalues; more practical checks have been developed, which are a bit too technical for our purposes. There are practically relevant cases, in which we know for sure that an objective function is convex.

Example 12.9 In Example 12.5 we considered the constrained minimization of variance of portfolio return, which is expressed as wTΣw. We know that the covariance matrix Σ is positive semi-definite for sure, otherwise variance could be negative for some setting of weights w. Hence, the minimization of variance, subject to linear constraints, is a convex problem.

images

Fig. 12.7 A tangent plane underestimates a convex function.

We also know that a differentiate and convex function of a single variable has the property that, if we draw the tangent line to its graph at any point, this line lies below the function graph everywhere.12 Using Taylor’s expansion for functions of multiple variables, this property can be generalized as follows.

THEOREM 12.6 Consider a convex function f(x) defined on , and assume that its gradient vector f(x) exists at any point xThen the following inequality holds for all x and x0:

images

The theorem can be easily visualized for n = 2, and it states that the function graph is always above the tangent plane at any point x0Figure 12.7 was shown and is repeated here for readers’ convenience; we observe that the tangent plane lies below the function graph.

Visualization in higher dimensions is impossible, but this theorem allows to prove very easily that stationarity is a necessary and sufficient condition for global optimality in an unconstrained problem involving a differentiate convex function.

THEOREM 12.7 If function f is convex and differentiate on , the point x* is a global minimizer if and only if it satisfies the stationarity condition

images

PROOF If f is convex and differentiable, the application of Theorem 12.6 at point x* yields

images

for any x. Since x* is a stationarity point, for any x we have

images

i.e., x* is a global minimizer.

Necessary and sufficient conditions like the one above are, regrettably, a scarce commodity in mathematical programming, unless we require convexity. To check convexity of the feasible region S, the following properties are useful:

  • The set of points described by the inequality g(x) ≤ 0 is convex if function g is convex.13
  • The intersection of convex sets is a convex set.14

We see immediately that if we have a set of inequality constraints

images

the resulting feasible set is convex if all functions gi are convex. The case of equality constraints

images

is not as easy. To see this, let us rewrite the equality constraint as two inequalities:

images

We have a convex set only if the function hi is both convex and concave. This will be the case only if function hi is affine, i.e., the constraint is of the form

images

Furthermore, any integrality requirement will make the feasible set nonconvex as well.15

The best way to wrap up the contents of this section is to list possible structures of mathematical programming problems and the corresponding solution difficulty.

  • continuous linear programming problem is both a convex and a concave problem, since the objective function is linear. This implies that a global optimizer can be found by exploring local optimizers on the boundary of the feasible set. The simplex algorithm, described later in Section 12.6.1, takes advantage of these features. More recently, an alternative class of methods, called interior-point methods, has been developed and offered in commercial packages. The bottom line is that we are able to solve rather large continuous LPs quite efficiently.
  • Mixed-integer linear programming problems are nonconvex. Still, the relaxation of integrality requirements yields a continuous LP that can be used to find a bound (optimistic estimate) on the value of the objective function. This is exploited in branch-and-bound methods, described in Section 12.6.2, which are also widely available. Unfortunately, they are based on a tree search approach that calls for the solution of a possibly large number of LPs. Hence, solving a large-scale MILP to optimality may be rather difficult. Nevertheless, impressive improvement in state-of-the-art packages allow one to find a pretty good, if not optimal, solution in many practical cases.
  • Convex, continuous nonlinear programming problems are relatively easy to solve, if all of the functions involved are differentiate. We will not consider methods for nonlinear programming problems in any detail; however, we describe some theoretical concepts in Section 12.5, since they have a very nice economic interpretation. There is no standard method to deal with nonlinear programming problems, and performance may depend on the exact kind of nonlinearity. Commercial tools offer a set of algorithms, from which the user should select the most appropriate one. Unlike linear programming codes, nonlinear ones require an initial solution from which a search process is started; however, if the problem is convex, the solution does not depend on this starting point. As a general rule, we are not able to solve very large-scale nonlinear problems as efficiently as the linear case. An exception is quadratic programming. If the functions involved are nondifferentiable, additional complications arise. Luckily, most practical management problems may be formulated as LPs or MILPs, at least approximately.
  • Nonconvex, continuous nonlinear programming problems are difficult, as a general rule. We may apply standard nonlinear programming methods, but the solution can be just a local optimizer, and it may depend on the initial point that we provide to start the search process. There is a set of global optimization methods, which may be rather time-consuming and are not quite general, as they may apply only to specific classes of problems. Still, for moderate-scale problems, we are able to find good solutions, even though commercial tools are not as standard and robust as in the linear programming case.
  • Nonlinear mixed-integer programming problems are the hardest ones. Until recently, no commercial software tool was available. Now, some advanced packages are available to solve this class of problems; however, the problem size we are able to tackle efficiently is limited, and robustness may be an issue.

From this discussion, it is clear that there are pretty good reasons to represent a decision problem as a linear, possibly mixed-integer programming problem. Sometimes, we have to approximate the problem a bit to obtain a more tractable form. Luckily, there are many modeling tricks that we may use to this aim, and in the next sections many of them are illustrated.


Comments

Leave a Reply

Your email address will not be published. Required fields are marked *