When dealing with simple linear regression, we typically use least squares to fit the coefficients of a simple linear model y = a + bx. Given a set of joint observations (x_i, y_i), i = 1, …, N, we define residuals

    e_i = y_i − (a + b x_i),    i = 1, …, N,

and minimize the sum of squared residuals:

    min_{a,b}  Σ_{i=1}^N e_i^2 = Σ_{i=1}^N [y_i − (a + b x_i)]^2
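As a quick illustration, the least-squares coefficients can be computed in closed form; the data below are made up for the example:

```python
import numpy as np

# Illustrative data (made up for this example)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form least-squares estimates for y = a + b*x:
#   b = cov(x, y) / var(x),   a = mean(y) - b * mean(x)
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
```

The same estimates can be checked against `np.polyfit(x, y, 1)`, which solves the identical least-squares problem.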
This is actually a quadratic program; since it involves no constraints, we know that a straightforward analytical solution is available, lending itself to statistical analysis. However, we also pointed out that alternative fitting procedures could be considered:
- We may take the absolute value of residuals, rather than squaring them:

      min_{a,b}  Σ_{i=1}^N | y_i − a − b x_i |    (12.38)
- We may also consider the minimization of the worst-case residual:

      min_{a,b}  max_{i=1,…,N} | y_i − a − b x_i |    (12.39)
Neither problem looks linear at all. In the first case, we are dealing with an absolute value, which is nonlinear and nondifferentiable. In the second case, we have a min–max problem which does look a bit awkward. In fact, both are easily converted into LP models by quite useful modeling tricks.
As a first step, consider the representation of the absolute value |x|. The number x could be either positive or negative. If we denote its positive and negative parts by x+ ≡ max{0, x} and x− ≡ max{0, −x}, respectively, we see that x can be written as

    x = x+ − x−    (12.40)
Then, the absolute value can be written as

    |x| = x+ + x−
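In code, this split is straightforward; `split` below is just an illustrative helper, not a standard library function:

```python
def split(x):
    """Return the positive and negative parts (x+, x-) of a number x."""
    return max(0.0, x), max(0.0, -x)

# For any x: (x+) - (x-) == x and (x+) + (x-) == |x|
xp, xn = split(-3.5)   # xp = 0.0, xn = 3.5
assert xp - xn == -3.5 and xp + xn == abs(-3.5)
```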
As we already remarked, there is a potential ambiguity in Eq. (12.40), since there is an infinite number of ways to express a number x as the difference of two nonnegative numbers x+ and x−; for instance, 5 = 5 − 0 = 8 − 3. So, it seems that we should write a nonlinear constraint like x+ · x− = 0, to make sure that at most one of the two auxiliary variables is nonzero. However, since the absolute value enters the objective function (12.38), this condition will hold in the optimal solution; if we write 5 = 8 − 3 rather than 5 = 5 − 0, we increase the objective function by 6 without changing anything substantial. Hence, to cast (12.38) in linear form, we introduce auxiliary variables e_i^+ and e_i^- for each observation, and transform (12.38) as follows:

    min   Σ_{i=1}^N (e_i^+ + e_i^-)
    s.t.  e_i^+ − e_i^- = y_i − a − b x_i,    i = 1, …, N,
          e_i^+, e_i^- ≥ 0.
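The resulting model can be handed to any LP solver. The sketch below uses SciPy's `linprog` on made-up data; the variable ordering [a, b, e_1^+, …, e_N^+, e_1^-, …, e_N^-] is a choice of this example:

```python
import numpy as np
from scipy.optimize import linprog

# Made-up observations for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.0, 10.2])
N = x.size

# Objective: minimize the sum of all e_i^+ and e_i^- (a and b have zero cost)
c = np.concatenate(([0.0, 0.0], np.ones(2 * N)))

# Equality constraints: a + b*x_i + e_i^+ - e_i^- = y_i
A_eq = np.zeros((N, 2 + 2 * N))
A_eq[:, 0] = 1.0                                  # coefficient of a
A_eq[:, 1] = x                                    # coefficient of b
A_eq[np.arange(N), 2 + np.arange(N)] = 1.0        # e_i^+
A_eq[np.arange(N), 2 + N + np.arange(N)] = -1.0   # e_i^-

# a and b are free; the auxiliary variables are nonnegative
bounds = [(None, None)] * 2 + [(0, None)] * (2 * N)
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds)
a_hat, b_hat = res.x[:2]
```

Note that at the optimum at most one of e_i^+ and e_i^- is nonzero for each i, exactly as argued above.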
It is important to realize that the decision variables in this model are the auxiliary variables e_i^+ and e_i^-, and the coefficients a and b; x_i and y_i are observed data.
By the same token, problem (12.39) may be easily transformed into LP form by a standard modeling trick as well. To generalize a bit, let us consider the following min–max problem:

    min_{x ∈ S}  max_{i=1,…,m}  g_i(x)
The idea is that, for a given x in the feasible set S, we should evaluate each function g_i(x) and pick the largest value; this defines the objective function, which should be minimized over the feasible set. To transform the model into a more familiar form, let us introduce the auxiliary variable z, and rewrite it as:

    min   z
    s.t.  z ≥ g_i(x),    i = 1, …, m,
          x ∈ S.
Depending on the nature of the set S and the functions g_i(x), this may be a very tough or an easy problem. Since the functions involved in (12.39) can be cast in linear form, we obtain an easy LP:

    min   z
    s.t.  z ≥ y_i − a − b x_i,       i = 1, …, N,
          z ≥ −(y_i − a − b x_i),    i = 1, …, N.