Sometimes, there are nonlinear relationships between variables, which cannot be disregarded, as forcing the problem into a linear framework would result in a blatantly inadequate model. For instance, we may have to do with economies or diseconomies of scale:
- The total transportation cost may depend in a nonlinear way on the volume of shipped goods on a timespan of interest.
- The cost of purchasing items may be affected by volume discount opportunities.
- The transaction cost of a trade on financial markets may depend on the amount that we buy or sell.
One possibility would be to resort to nonlinear programming solvers to cope with a nonlinear formulation. Unfortunately, there are two possible complications:
- The model may involve integer variables, and solvers for nonlinear integer programming are nowhere as fast and robust as the linear ones.
- We may need to minimize a concave cost function, which makes the overall problem nonconvex; the net result is that commercial solvers may get trapped in a locally optimal solution.
An alternative strategy is to approximate the nonlinear cost function, or the nonlinear link between decision variables, by a piecewise linear function. In the following, we will consider the case in which a nonlinear function y = f(x) describes the cost of an activity carried out at level x. We may fix a few selected points x(i), evaluate the corresponding values y(i) = f(x(i)), and interpolate linearly between points (x(i), y(i)). Such points are usually referred to as knots. A few examples are shown in Fig. 12.14. We see that the points x(i) are the breakpoints separating subintervals on which the function is linear. We also see that we may get a convex function, or a concave one, or a function that is neither convex nor concave. There are several strategies to choose the number of knots and placing them in order to find a satisfactory approximation of the function. In other cases, a piecewise linear function arises naturally in the application, as in the case of a supplier offering discount opportunities based on volume; then, the breakpoints specify a price schedule depending on the number of purchased items. As far as we are concerned, we assume that we are given a list of knots describing the function or, alternatively, a list of breakpoints and slopes.
Fig. 12.14 Piecewise linear functions: (a) convex, (b) concave, (c) neither convex nor concave.
Consider the function in Fig. 12.14(a). Rather than giving the list of knots, we may list breakpoints (x(1), x(2), x(3)) and slopes (c1, c2, c3, c4). Note that, without loss of generality, we assume that the function graph starts at the origin. The function could be described as follows:
Of course, this is correct if the function is continuous. If c1 < c2 < c3 < c4 (increasing marginal costs), then f(x) is convex [Fig. 12.14(a)]; if c1 > c2 > c3 > c4 (decreasing marginal costs), the function is concave [Fig. 12.14(b)]; for arbitrary slopes ci, the function is neither convex nor concave [Fig. 12.14(c)].
The convex case is easy and it can be coped with by continuous LP models. The function f(x) can be converted to a linear form by introducing four auxiliary variables z1, z2, z3, z4, one for each subinterval. Then, we can express x as the sum of each single piece:
Finally, the function f(x) is rewritten as
The approach is correct if the auxiliary variables in Eq. (12.57) are activated in the right sequence. Clearly, z2 should be positive only if z1 is at its upper bound; z3 should be positive only if both z1 and z2 are at their upper bounds, etc. Since the slopes are increasing, we are sure that this is the case at the optimal solution. There is no incentive to use the more expensive variable z2 rather than z1, unless the latter is saturated. Hence, if we have a piecewise convex function, the whole model can be reformulated as an LP problem.
Unfortunately, if the function is not convex this is not guaranteed at all. If c2 < c1, then the solver will use the cheaper variable z2 before z1. This is no surprise, after all; if the problem is nonconvex, there is no way to express it as a convex LP problem. However, we may trade one nonconvexity for another one, i.e., we may devise a modeling trick based on binary decision variables. To get a clue on how a general piecewise linear function may be modeled, it is more convenient to describe the function by the list of its knots (x(i), y(i)), where y(i) = f(x(i)). An example is shown in Fig. 12.15. There, we have four knots for i = 0, 1, 2, 3. Now consider the line segment from knot (x(i), y(i)) to knot (x(i+1), y(i+1)). From our knowledge of convex sets, we know that this line segment can be expressed by taking a convex combination of its extreme points:
where 0 ≤ λ ≤ 1. Now, what about forming a convex combination of the four knots? If we take a linear combination of four points, with nonnegative weights adding up to 1, we have
Fig. 12.15 Modeling a piecewise linear function.
However, this is not really what we want, since by doing so we obtain the convex hull of the four knots, i.e., the shaded area in Fig. 12.15. However, we are close to accomplish our aim. What is wrong with the convex hull? The issue is that we take a convex combination of four points, but we should only take a combination involving two consecutive knots, in order to describe each linear piece. So, we should allow only pairs of “adjacent” coefficients λi to be strictly positive. This is accomplished by introducing a binary decision variable si, i = 1, 2, 3, for each line segment (i − 1, i). Then, we may describe the nonlinear link between x and y = f(x) by the following set of constraints:
We see that if s1 = 1, we can only combine knots for i = 0 and i = 1, i.e., we describe the first linear piece. If s2 = 1, we can only combine knots for i = 1 and i = 2, i.e., we are on the second linear piece, and so on.
The overall model is a MILP problem and we see that the nonconvexity of the cost function has been replaced by a nonconvexity in the feasible set, as we have introduced binary decision variables. Branch and bound codes are able to cope with the model above, and there is a clear tradeoff between the accuracy of our approximation of the nonlinear function f(x), which would require as many knots as possible, and the need of keeping the number of decision variables as low as possible. The transformations above may look a bit tricky, but luckily many commercial tools allow the user to describe a piecewise linear function in readable terms, involving breakpoints and slopes; then the software is able to build the corresponding model, invoking the LP solver or the MILP one depending on the convex or nonconvex nature of the function involved.
Leave a Reply