A mathematical programming problem is called a linear programming (LP) problem when all the constraints and the objective function are expressed by linear affine functions, as in the following case:
An LP model can involve inequality and equality constraints, as well as simple bounds. The general form of a linear programming problem is
where we denote the set of equality constraints by ε and the set of inequality constraints by I; any of these sets may be empty. If variable xj is unbounded from below, we may set lj = −∞; by the same token, we may set uj = +∞ if the corresponding variable is unbounded from above. An LP model can be conveniently written in matrix form as follows:
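The displayed equations are not reproduced in this excerpt; a standard statement consistent with the surrounding notation (cost coefficients cj, constraint sets E and I, bounds lj and uj) is:

```latex
% general form of an LP problem
\min \; \sum_{j=1}^{n} c_j x_j
\quad \text{s.t.} \quad
\sum_{j=1}^{n} a_{ij} x_j = b_i, \;\; i \in \mathcal{E}, \qquad
\sum_{j=1}^{n} a_{ij} x_j \le b_i, \;\; i \in \mathcal{I}, \qquad
l_j \le x_j \le u_j,
% and, collecting the less-than constraints in matrix form (cf. (12.11)):
\min_{\mathbf{x}} \; \mathbf{c}^{\top} \mathbf{x}
\quad \text{s.t.} \quad
A\mathbf{x} \le \mathbf{b}, \qquad
\mathbf{l} \le \mathbf{x} \le \mathbf{u}.
```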
Of course, an inequality involving vectors is interpreted componentwise, i.e., we should think of it as the collection of several inequalities, one per component of the vector.
Example 12.2 Let us consider the LP problem:
It may be cast in the matrix form (12.11) by defining the following matrices and vectors:
Please note the changes in sign needed to recast the second inequality in less-than form. The components of u can all be set to +∞ or, more simply, the vector of upper bounds can just be disregarded.
Using the transformations illustrated before, a generic LP model can also be cast into either of the following forms, involving one type of constraint:
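The two forms referred to here, the standard form (12.12) and the canonical form (12.13), are missing from this excerpt; they are conventionally written as follows (the direction of the inequality in the canonical form depends on the convention adopted for minimization):

```latex
% standard form (12.12): equality constraints and nonnegative variables
\min \; \mathbf{c}^{\top}\mathbf{x} \quad \text{s.t.} \quad A\mathbf{x} = \mathbf{b}, \quad \mathbf{x} \ge \mathbf{0};
% canonical form (12.13): inequality constraints and nonnegative variables
\min \; \mathbf{c}^{\top}\mathbf{x} \quad \text{s.t.} \quad A\mathbf{x} \ge \mathbf{b}, \quad \mathbf{x} \ge \mathbf{0}.
```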
Note that all decision variables are restricted in sign. If a variable is free to take whatever sign, we may transform it into the difference of two nonnegative variables:
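The transformation alluded to here is the usual split of a free variable into two nonnegative ones:

```latex
x_j = x_j^{+} - x_j^{-}, \qquad x_j^{+} \ge 0, \quad x_j^{-} \ge 0.
```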
The two variables should correspond to the positive and the negative part of xj. However, this transformation introduces an ambiguity, at least in principle. For instance, if xj = 5 we may write 5 = 5 − 0, but also 5 = 8 − 3. As a result, we have an infinite set of equivalent solutions, characterized by the same value of the objective function. Conceptually, this is not a relevant issue, even though it may matter from a computational point of view. In practice, careful implementations of solution methods handle free variables efficiently, but we need not be concerned with such technicalities.
The standard form (12.12) is more convenient computationally, as we shall see in Section 12.6.1. The canonical form (12.13), however, comes in handy to illustrate the geometry of an LP model. In fact, the feasible set for an LP problem in canonical form is expressed by a set of linear inequalities. Since each linear inequality is associated with a half-space, the feasible set is just the intersection of half-spaces, i.e., a polyhedron.6 This is easy to visualize for a simple problem in two dimensions, where half-spaces are just half-planes. This case is illustrated in Fig. 12.2, together with the level curves of two possible objective functions, fa and fb. The feasible set is the shaded polyhedron, which is the intersection of six half-planes. Since the objective function is linear, its level curves are parallel lines. Depending on the coefficients in the objective function, these lines feature different slopes. If we assume a maximization problem, the figure illustrates that, depending on the slope, different optimal solutions are obtained, za or zb. If the level curves are parallel to the edge connecting za and zb, any solution on that edge is optimal. This shows that an LP problem need not have a unique solution. In fact, the following cases may apply:
- There is a unique solution.
- There are multiple equivalent solutions (an infinite number of them, actually).
- There is no feasible solution; this may occur, e.g., when the intersection of the half-spaces is an empty set.

Fig. 12.2 Feasible set and optimal solutions for an LP problem.
- The solution is unbounded, i.e., the optimal value of the objective goes to plus or minus infinity, for a maximization and a minimization problem, respectively.
It is easy to see that these cases are not limited to linear problems, as the same situations may arise in nonlinear problems as well.
Example 12.3 It is easy to build optimization problems which are, respectively:
- Unbounded:
- Infeasible:
- Characterized by an infinite set of optima:
The reader is urged to check all of this by drawing the feasible set and the level curves of the objective function, for the three problems above.
It is also useful to notice that the derivatives of a linear objective function f(x) = cTx never vanish; its gradient is the vector of coefficients, ∇f(x) = c, and it cannot be set to zero (for a meaningful problem). Hence, we need something more than stationarity conditions to solve a constrained optimization problem, and a linear one in particular. However, Fig. 12.2 suggests that, when looking for the solution of an LP problem, we may just consider the extreme points of the feasible set, i.e., the vertices of the polyhedron. Even though the feasible set in the figure includes an infinite number of points, only six of them are candidates for optimality. Even if there is an edge of equivalent optimal solutions, there is no loss in restricting our attention to one of the two extreme points of the edge. Indeed, this is a fundamental feature of LP problems that is exploited in the standard solution method: the simplex method, which is described later and is based on the idea of exploring a typically small subset of vertices to find the optimal solution. You may not really need to understand how this method works in depth. What is important is that this algorithm is widely available in commercial software tools, and it is normally able to solve rather large-scale problems, possibly involving several thousand decision variables, in a few seconds.
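To make this concrete, here is a minimal sketch of solving a small LP with SciPy's linprog routine; the two-variable problem below is hypothetical, not one of the examples in the text:

```python
from scipy.optimize import linprog

# hypothetical LP: maximize 3*x1 + 2*x2
# subject to x1 + x2 <= 4, x1 + 3*x2 <= 6, x1, x2 >= 0
c = [-3.0, -2.0]                  # linprog minimizes, so we negate the objective
A_ub = [[1.0, 1.0], [1.0, 3.0]]   # coefficient matrix of the <= constraints
b_ub = [4.0, 6.0]                 # right-hand sides
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
# the optimum lies at the vertex x = (4, 0), with objective value 12
```

As the theory predicts, the solver returns an extreme point of the feasible polyhedron.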
In the LP models above, decision variables are real numbers, subject to joint restrictions represented by equality and inequality constraints, as well as lower and upper bounds. There are situations in which we must enforce some more restrictions on a subset of variables:
- We might require variables to take only integer values. For instance, we could specify that the number of items to be produced in the optimal mix problem be integer. In virtually all practically relevant cases, we need nonnegative integer variables, i.e., we require xj ∈ Z+ = {0, 1, 2, …}, for some j.
- Quite often, we need to express logical decisions, like “we place an order” or “we do not place any order,” or “we start an activity” or “we don’t.” Such logical decisions are of all-or-nothing type and are typically represented by binary decision variables like xj ∈ {0, 1}.
If the integrality condition is enforced on all of the decision variables, we have a pure-integer linear programming problem (ILP), such as
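The displayed model is missing here; a generic pure ILP, consistent with the surrounding discussion, reads:

```latex
\min \; \mathbf{c}^{\top}\mathbf{x}
\quad \text{s.t.} \quad
A\mathbf{x} \le \mathbf{b}, \qquad
\mathbf{x} \in \mathbb{Z}_{+}^{n}.
```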
Normally, in a pure ILP, we have only inequality constraints and bounds; equality constraints, i.e., equations involving integer numbers, are mathematically interesting and challenging, but they are uncommon in management. If the restriction is x ∈ {0, 1}^n, i.e., we consider only binary variables, we have a pure binary LP problem. The most general case is a mixed-integer linear programming (MILP) problem, whereby integrality restrictions are enforced only on a subset L of variables:
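The displayed MILP model is not reproduced here; based on the surrounding notation, it can be written as:

```latex
\min \; \mathbf{c}^{\top}\mathbf{x}
\quad \text{s.t.} \quad
A\mathbf{x} \le \mathbf{b}, \qquad
\mathbf{l} \le \mathbf{x} \le \mathbf{u}, \qquad
x_j \in \mathbb{Z} \;\; \forall j \in L.
```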
Fig. 12.3 The set of feasible solutions of a pure integer LP problem.
In order to specify a binary decision variable, we can enforce the bounds lj = 0 and uj = 1 on an integer variable in L.
It turns out that, as a general rule, integer programming problems are a much harder nut to crack than their continuous counterparts. To get a feeling for the reason behind this unpleasing state of affairs, let us have a look at Fig. 12.3. There, we see the feasible region of a pure ILP problem in two dimensions. If we drop the integrality requirements, we obtain the continuous relaxation of the ILP, whose feasible region is the shaded polyhedron depicted in the figure. The feasible region for the ILP problem consists of those points within the polyhedron that have integer coordinates. This immediately shows that we cannot apply familiar concepts from calculus: The derivative makes no sense, as we have a discrete set of points and we cannot take ordinary limits. However, we know that there is a way to solve a continuous LP. If we drop integrality requirements, we obtain an optimal solution corresponding to one of the extreme points (vertices) of the polyhedron in Fig. 12.3. In this case, we might expect that, although this vertex has noninteger coordinates in general, we could find the optimal solution by rounding the fractional solution of the continuous relaxation. For instance, in the optimal mix problem (12.1), the continuous LP has the optimal solution
whereas the corresponding ILP has optimal solution
It seems that by trying a few combinations of roundings, up and down, we should be able to find the optimal solution; some roundings might produce infeasible solutions, but when we get a feasible one, we just evaluate its objective function; by comparing all of the integer solutions found in this way, we spot the best rounding. Maybe it is not trivial to prove that what we get is an optimal solution, but at least we should be able to find a pretty good one. Unfortunately, this approach does yield a good heuristic in some cases, but it does not work in general. If we have a large number n of integer decision variables, a rough calculation shows that we should try 2^n roundings. To see this, observe that we should round up or down each of the n variables, and this results in an exponential number of combinations. Even worse, it is not even true that we may find an optimal solution, or even a feasible one, by just rounding up and down. The following nice example shows that things may be difficult even in two dimensions.7
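The rounding enumeration described above can be sketched as follows; the small ILP used here is hypothetical, chosen so that the best rounding of the relaxed optimum is not the true integer optimum:

```python
import math
from itertools import product

# hypothetical ILP: maximize 5*x1 + 4*x2
# s.t. 6*x1 + 4*x2 <= 24, x1 + 2*x2 <= 6, x1, x2 >= 0 and integer
# the continuous (LP) relaxation has optimum x = (3, 1.5) with value 21

def feasible(x):
    return 6*x[0] + 4*x[1] <= 24 and x[0] + 2*x[1] <= 6 and min(x) >= 0

x_lp = (3.0, 1.5)
best_value, best_x = -math.inf, None
# enumerate the 2^n candidate roundings: floor or ceil of each coordinate
for x in product(*[(math.floor(v), math.ceil(v)) for v in x_lp]):
    if feasible(x):
        value = 5*x[0] + 4*x[1]
        if value > best_value:
            best_value, best_x = value, x

# best rounding: (3, 1) with value 19 -- yet the true integer
# optimum of this problem is (4, 0), with value 20
```

The loop over floor/ceil pairs makes the exponential growth explicit: n variables produce 2^n candidates, and even exhaustive rounding can miss the integer optimum.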
Example 12.4 Consider the following pure ILP problem:
If we relax the integrality requirement, i.e., we just require x1, x2 ≥ 0, we can apply the simplex method and find
with an optimal objective value 8.5. The reader is invited to try rounding the above solution up and down; unfortunately, the trivially rounded solutions are not feasible. In fact, the integer optimal solution is
Fig. 12.4 Feasible solutions in Example 12.4.
with optimal value 3. We see that the continuous solution is quite far from the true integer optimizer; in this case it is even difficult to find a feasible solution by rounding, let alone the optimal one. The reason behind the trouble is illustrated in Fig. 12.4. The integer feasible set consists of only two points, (1, 0) and (2, 1). The feasible set of the continuous relaxation is the very narrow shaded triangle in Fig. 12.4, and it includes many noninteger points quite far from the two integer solutions.
Another interesting observation that we can draw from the example is that the continuous relaxation yields a bound on the true optimum value of the integer solution. In the example, we find an upper bound, 8.5, since we are maximizing. If we consider a minimization problem, the optimal value of the continuous relaxation yields a lower bound. To understand why, consider Fig. 12.3 again. The feasible set of the ILP is a subset of the feasible set of the continuous relaxation. Hence, when we drop the integrality requirements, we enlarge the feasible set. By doing so, we cannot increase cost, in a minimization problem, or decrease profit, in a maximization problem, since all of the integer points are still feasible and we are adding opportunities for optimization. Incidentally, in the last example, the upper bound we get, 8.5, is rather weak with respect to the true optimal value, which is 3. In other cases, the bounds yield better estimates of the optimal value. Indeed, this bounding approach is the starting point of the branch and bound method, which is the standard, commercially available approach to solving ILPs and MILPs. For the sake of the interested reader, we outline this algorithm in Section 12.6.2, which may be safely skipped at first reading. The branch and bound method is based on a search tree, in which each node corresponds to a continuous relaxation of the original MILP. The important message is that care must be taken when formulating MILPs, as tackling them may call for the solution of thousands of continuous relaxations to find the optimal integer solution.
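As a rough illustration of the idea (not the full algorithm of Section 12.6.2), here is a compact LP-based branch and bound sketch, using SciPy's linprog to solve the continuous relaxation at each node of the search tree; the sample problem is hypothetical:

```python
import math
from scipy.optimize import linprog

def branch_and_bound(c, A_ub, b_ub):
    """Maximize c@x s.t. A_ub@x <= b_ub, x >= 0 integer, by LP-based branch and bound."""
    best = {"value": -math.inf, "x": None}

    def solve(bounds):
        # each node solves the continuous relaxation under the current bounds
        res = linprog([-ci for ci in c], A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        if not res.success:
            return                          # infeasible node: prune
        value = -res.fun
        if value <= best["value"] + 1e-9:
            return                          # bound: node cannot beat the incumbent
        for j, xj in enumerate(res.x):
            if abs(xj - round(xj)) > 1e-6:  # branch on a fractional variable
                lo, hi = bounds[j]
                solve(bounds[:j] + [(lo, math.floor(xj))] + bounds[j+1:])
                solve(bounds[:j] + [(math.ceil(xj), hi)] + bounds[j+1:])
                return
        # all-integer relaxation solution: new incumbent
        best["value"], best["x"] = value, [round(xj) for xj in res.x]

    solve([(0, None)] * len(c))
    return best

# hypothetical ILP: maximize 5*x1 + 4*x2
# s.t. 6*x1 + 4*x2 <= 24, x1 + 2*x2 <= 6, x integer
result = branch_and_bound([5, 4], [[6, 4], [1, 2]], [24, 6])
```

Note how the relaxation value at each node serves exactly as the upper bound discussed above: any node whose relaxation cannot beat the incumbent is pruned without further branching.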
12.1.2 Nonlinear programming problems
In an LP model, all the functions defining the objective and the constraints must be linear (affine). If even one function fails to meet this requirement, we face a nonlinear programming problem. For instance, the problem
is nonlinear because the first constraint involves a squared decision variable. The problem
is nonlinear because the objective function involves the product of two decision variables; the second constraint is nonlinear, too. Of course, we may have arbitrary nonlinearities in both the objective and the constraints. Now consider the nonlinear problem
We see that the feasible set is defined by linear functions, whereas the objective function is nonlinear. A more careful look at it, though, should ring a bell. The objective function involves quadratic terms and is an example of a quadratic form.8 We know that, by introducing a suitable symmetric matrix, a quadratic form can be expressed in matrix form. A generic quadratic programming problem can be represented as follows:
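The displayed model is missing from this excerpt; a generic quadratic programming problem, consistent with the discussion that follows, can be written as:

```latex
\min \; \tfrac{1}{2}\, \mathbf{x}^{\top} H \mathbf{x} + \mathbf{c}^{\top} \mathbf{x}
\quad \text{s.t.} \quad
A\mathbf{x} \le \mathbf{b},
```

possibly with equality constraints and bounds as well; here H is a symmetric matrix.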
We see that, in general, the objective function may also involve a linear term (a constant term is irrelevant, as usual). The leading term includes a factor 1/2, which is typically included just for convenience: in so doing, the matrix H is actually the Hessian matrix of the objective function. We have already investigated convexity and concavity of quadratic forms. As it turns out, a quadratic programming problem may be very easy to solve or not, depending on these features. We recall convexity in the next section, but before that it is useful to consider two practical examples of nonlinear programming problems.
Example 12.5 In Example 8.5 we considered a simple portfolio optimization model. We must allocate our wealth among n assets, whose return is a random variable Ri, i = 1, …, n. The decision variables are the fractions wi of wealth that are allocated to each asset; these are also called portfolio weights. If we rule out short sales, natural constraints on the decision variables are as follows:
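The displayed constraints are not shown in this excerpt; from the description that follows, they are the budget and no-short-sale conditions:

```latex
\sum_{i=1}^{n} w_i = 1, \qquad w_i \ge 0, \quad i = 1, \ldots, n.
```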
The first constraint ensures that we invest exactly our wealth, no more, no less. Nonnegativity constraints on portfolio weights ensure that no asset is sold short; we could also require that weights not exceed 100%, but this is redundant, given the two constraints above.
We also know from Example 8.5 that the portfolio return is the random variable
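That random variable is the weighted sum of the individual asset returns:

```latex
R = \sum_{i=1}^{n} w_i R_i.
```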
We cannot really maximize (nor minimize) a random variable, and a thorough treatment of this decision problem under uncertainty calls for some tools. Nevertheless, we might pursue a rather simple and intuitive strategy. Most of us are risk-averse decision makers and do not want an overly uncertain return, such as the one that would result from choosing a very risky portfolio. What we certainly would like is a portfolio with a large expected return, but this desire must be tempered by a consideration of the risk involved. Hence, a possible way of framing the problem consists of
- Choosing a minimum target expected return μt
- Finding the portfolio with minimum variance, such that its expected return is not below the target μt
Taking advantage of what we learned about linear combinations of random variables, we recall that the variance of the portfolio return is
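The displayed formula is missing in this excerpt; from the definitions given in the next sentence, it is:

```latex
\operatorname{Var}(R) \;=\; \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij}\, w_i w_j \;=\; \mathbf{w}^{\top} \Sigma\, \mathbf{w},
```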
where σij = Cov (Ri, Rj) is the covariance of returns Ri and Rj, the symmetric covariance matrix Σ collects all such covariances, and w is the vector collecting portfolio weights. Let μi be the expected return of asset i. Then, we may formulate the following optimization problem:
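The displayed optimization model is not reproduced here; from the strategy outlined above, it is the minimum-variance problem subject to a target expected return:

```latex
\min_{\mathbf{w}} \; \mathbf{w}^{\top} \Sigma\, \mathbf{w}
\quad \text{s.t.} \quad
\sum_{i=1}^{n} \mu_i w_i \ge \mu_t, \qquad
\sum_{i=1}^{n} w_i = 1, \qquad
w_i \ge 0, \;\; i = 1, \ldots, n.
```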
It is easy to see that this is a quadratic programming problem, since the objective function is a quadratic form and the constraints are linear.
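A numerical sketch of this mean-variance model, with entirely hypothetical data for three assets, might look as follows (using SciPy's general-purpose SLSQP solver rather than a dedicated QP code):

```python
import numpy as np
from scipy.optimize import minimize

# hypothetical data for three assets
mu = np.array([0.05, 0.10, 0.15])          # expected returns mu_i
Sigma = np.array([[0.040, 0.006, 0.000],   # covariance matrix of returns
                  [0.006, 0.090, 0.012],
                  [0.000, 0.012, 0.160]])
mu_t = 0.10                                # minimum target expected return

def variance(w):
    return w @ Sigma @ w                   # portfolio variance w' Sigma w

cons = [
    {"type": "eq", "fun": lambda w: w.sum() - 1.0},    # weights sum to 1
    {"type": "ineq", "fun": lambda w: mu @ w - mu_t},  # expected return >= target
]
res = minimize(variance, x0=np.full(3, 1.0 / 3.0),
               bounds=[(0.0, 1.0)] * 3, constraints=cons, method="SLSQP")
w = res.x  # minimum-variance weights meeting the target return
```

Since the covariance matrix is positive definite, the problem is convex and the local solver finds the global minimum-variance portfolio.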
Example 12.6 The EOQ model has plenty of limiting assumptions. One of them is that the model considers only one item, neglecting possible interactions with other ones. Furthermore, some assumptions concerning the involved costs are questionable as well. For instance, consider the fixed ordering charge. Some logistic channels are so efficiently organized that the need for including this charge should not be taken for granted. Imagine a retail store that receives a shipment each and every day from a large supplier. All of the item types are included in that shipment, so there is really no need to consider a fixed ordering charge per item; in fact such a charge is shared among all of the item types and is incurred anyway because shipments are arranged daily. Apparently, this would encourage ordering small lots quite often. However, there may be a limit on the total number of orders that are issued on the average. One possible reason is that orders must be inspected and tracked; if the number of persons in charge of purchasing activities is limited, an upper bound on the number of orders will result.
In the age of information systems and certified-quality suppliers, limiting the number of replenishment orders that we can issue may not seem warranted. However, building a model in this vein will turn out to be quite instructive. On the basis of these considerations, let us extend the EOQ model to jointly order n items, minimizing the total average inventory holding cost, subject to an upper bound on the average number of orders issued over time. The information we need consists of
- The inventory holding cost hi and the demand rate di for each item, i = 1, …, n; as in the EOQ model, we assume constant demand rates.
- The upper bound U on the average number of orders issued per unit time.
The decision variables are the order quantities Qi. The dynamics of inventory levels is the same as in the EOQ model. Hence, the average inventory cost for item i is hiQi/2, and the average number of orders issued per unit time is di/Qi. Summing over items, we obtain the following model:
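The displayed model is missing from this excerpt; from the cost and order-frequency expressions just given, it reads:

```latex
\min_{Q_1, \ldots, Q_n} \; \sum_{i=1}^{n} \frac{h_i Q_i}{2}
\quad \text{s.t.} \quad
\sum_{i=1}^{n} \frac{d_i}{Q_i} \le U, \qquad
Q_i > 0, \;\; i = 1, \ldots, n.
```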
In this model, the objective function is linear, but the inequality constraint is not; hence, this is a nonlinear programming model. Later, in Example 12.22, we will show how such a model can be tackled, along with a quite instructive economic interpretation.
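For the curious reader, a numerical sketch of this model with hypothetical data for two items could look like the following (the objective is linear, but the order-frequency constraint makes the problem nonlinear, so a general nonlinear solver is used):

```python
from scipy.optimize import minimize

# hypothetical data: two items
h = [1.0, 2.0]       # inventory holding costs h_i
d = [100.0, 400.0]   # demand rates d_i
U = 20.0             # upper bound on average number of orders per unit time

def holding_cost(Q):
    # total average inventory holding cost, sum of h_i * Q_i / 2
    return sum(hi * Qi / 2.0 for hi, Qi in zip(h, Q))

# nonlinear inequality constraint: average number of orders must not exceed U
cons = [{"type": "ineq",
         "fun": lambda Q: U - sum(di / Qi for di, Qi in zip(d, Q))}]

res = minimize(holding_cost, x0=[50.0, 50.0],
               bounds=[(1e-3, None)] * 2, constraints=cons, method="SLSQP")
# since holding cost decreases with smaller lots, the order-frequency
# constraint is binding at the optimum
```

The constraint set is convex (each di/Qi is convex for Qi > 0), so the local solver recovers the global optimum; the Lagrangian analysis of Example 12.22 gives the same solution in closed form.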