Logistic regression introduces a nonlinear transformation to account for the qualitative nature of the response variable. But even with a quantitative response, we may be forced to deal with nonlinearity. Figure 16.3 shows two examples.
- In Fig. 16.3(a) we observe two effects: a threshold and a saturation effect. It may help to interpret this case in concrete terms: let us assume that we are regressing sales against advertisement expenditure. A linear model would be unable to account for the fact that, unless some threshold is exceeded, no effect will be discernible; consider the effect of a couple of spots at 3 a.m. on some unknown local TV station. This is why the picture shows an initial portion where the sales level is constant. On the other hand, if you swamp all of the major networks with spots, sales cannot grow to infinity; additional expenditure will be less and less effective, possibly even counterproductive. In such a case, we could imagine a more general regression model based on a nonlinear functional form such as
  $$Y = h(\mathbf{x}, \boldsymbol{\beta}) + \epsilon \tag{16.14}$$
  or
  $$Y = g\bigl(h(\mathbf{x}, \boldsymbol{\beta}) + \epsilon\bigr)$$
  for suitable functions h(·, ·) and g(·). Note that we should not take for granted that the error component must be additive.
- The case of Fig. 16.3(b) is a bit different. Here we observe the interaction between the quantitative variable x1 and the qualitative (categorical) variable x2. Within a linear regression framework, we may consider a specification such as
  $$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon,$$
  where x2 ∈ {0, 1}. However, in this case the effect of x2 would be a plain shift of a linear function. In Fig. 16.3(b) we observe that when x2 = 1, we have a change in slope as well. Hence, we have a more complex interaction between the two variables, which cannot be captured by a linear form. We have to consider a model with an interaction term, such as
  $$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon,$$
  where the coefficient β3 of the product x1x2 allows the slope to change with x2 (see the sketch below).
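To make the interaction model concrete, here is a minimal sketch in Python (not taken from the book): it simulates data in which the dummy variable x2 changes both the intercept and the slope, and recovers the coefficients by ordinary least squares. The sample size, noise level, and "true" coefficients are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
x1 = rng.uniform(0.0, 10.0, n)               # quantitative regressor
x2 = rng.integers(0, 2, n).astype(float)     # dummy variable in {0, 1}
eps = rng.normal(0.0, 1.0, n)

# Assumed "true" model: x2 shifts the intercept (beta2) AND the slope (beta3)
y = 2.0 + 0.5 * x1 + 1.5 * x2 + 1.2 * x1 * x2 + eps

# Design matrix with the interaction column x1*x2; the model is still
# linear in the parameters, so ordinary least squares applies.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated [b0, b1, b2, b3]:", np.round(beta_hat, 3))
```

Note that, although the fitted function is nonlinear in (x1, x2) through the product term, the model remains linear in β, which is why plain least squares still works here.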
Given these introductory examples, we should be convinced that sometimes a nonlinear regression model is warranted. Unfortunately, this may be a sort of quantum leap in terms of complexity:
- We must choose a sensible functional form h(x, β), depending on a limited set of parameters.
- We must find a suitable parameter estimation procedure. In principle, this should not be that difficult. On the basis of Eq. (16.14), and armed with n observations (Yi, xi), i = 1,…,n, we may think of solving a nonlinear least-squares optimization problem:
  $$\min_{\boldsymbol{\beta}} \; \sum_{i=1}^{n} \bigl[ Y_i - h(\mathbf{x}_i, \boldsymbol{\beta}) \bigr]^2$$
  Indeed, this is quite common practice, often referred to as model calibration (a sketch is given after this list). Powerful software is available to tackle such a problem numerically. Unfortunately, there are two major difficulties here. First, the resulting optimization model, unlike ordinary least squares, need not be convex; second, ensuring good properties of the estimators is nontrivial.
- We must choose a proper statistical framework to tackle issues such as bias and uncertainty in estimates, hypothesis testing, diagnostics, etc. This last point is definitely beyond our scope, and the reader is referred to the listed references.
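As an illustration of such a calibration, the following sketch fits a hypothetical saturating response, h(x, β) = β0 / (1 + e^(−β1(x − β2))), which mimics the threshold and saturation effects of Fig. 16.3(a). The functional form, the synthetic data, and the starting point are assumptions made for illustration; the solver is SciPy's curve_fit, one of many tools for this task.

```python
import numpy as np
from scipy.optimize import curve_fit

def h(x, b0, b1, b2):
    """Saturating response: near zero below the threshold b2, flat for large x."""
    return b0 / (1.0 + np.exp(-b1 * (x - b2)))

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 100)
y = h(x, 5.0, 1.5, 4.0) + rng.normal(0.0, 0.2, x.size)  # synthetic observations

# Because the least-squares problem is not convex, the initial guess p0
# matters: a poor starting point may lead the solver to a local minimum.
beta_hat, cov = curve_fit(h, x, y, p0=[1.0, 1.0, 1.0])
print("estimated (b0, b1, b2):", np.round(beta_hat, 3))
```

Rerunning the fit from several starting points and comparing the resulting residual sums of squares is a simple practical safeguard against the non-convexity mentioned above.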
Fortunately, there are cases in which all of the above is not as awful as it may look. In the remainder of the section we digress a bit on different forms of nonlinear regression and their implications. In fact, from a user's perspective, the ability to see the advantages and disadvantages of different model formulations is by far the most important one; other tasks may be left to suitable statistical software.