A good starting point is polynomial regression. When facing a clearly nonlinear data pattern, like the one in Fig. 16.3(a), we may try to come up with a suitable approximation of the nonlinear function relating the data. In principle, polynomials provide us with an arbitrary degree of flexibility. Let us take a closer look at a polynomial regression model in the case of one regressor:

$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_m x^m + \epsilon$$
In this case, a single regressor variable is raised to powers ranging from 0 to m. The function is clearly nonlinear in x. However, there is a very important point: the model is nonlinear in the variables, but it is linear in the parameters. To see this, imagine plugging in observations $x_i$, $i = 1, \ldots, n$. We obtain a function that is linear in $\beta_j$, $j = 0, 1, \ldots, m$. We could introduce variables $z_j = x^j$, and a multiple linear regression model would result. The same applies to the model with interactions, represented by Eq. (16.16). So, we see that sometimes nonlinear regression can be tackled by the machinery of classical linear regression. There are, however, a few traps:
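To make the linearization concrete, here is a minimal sketch (the data and the degree m = 3 are made up for the example, not taken from the text) that fits a polynomial by building the transformed variables $z_j = x^j$ and solving an ordinary least-squares problem:

```python
import numpy as np

# Synthetic data for illustration only (not from the text).
rng = np.random.default_rng(42)
x = np.linspace(-2, 2, 30)
y = 1.0 - 0.5 * x + 0.8 * x**3 + rng.normal(scale=0.5, size=x.size)

m = 3  # polynomial degree (an arbitrary choice for this sketch)

# Build the design matrix with columns z_j = x^j, j = 0, ..., m.
Z = np.vander(x, N=m + 1, increasing=True)

# The model is linear in the parameters beta, so ordinary
# least squares applies directly.
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(beta)  # estimates of beta_0, ..., beta_m
```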
- Polynomial regression may be quite dangerous, as it is tempting to introduce a high-order polynomial to improve the fit. The result can be an overfitted model that performs very poorly out of sample. Furthermore, given the oscillatory nature of polynomials, extrapolation outside the range of the dataset may result in meaningless predictions (see the sketch after this list).
- While polynomial regression does not result in additional difficulties if we treat the regressors as given numbers, establishing results for stochastic regressors is not trivial.
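A quick way to see the first trap is to compare a low-order and a high-order fit on the same small sample. The following sketch (synthetic data, with degrees chosen arbitrarily for illustration) evaluates both fits slightly outside the observed range, where the high-order polynomial typically produces wild predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 12)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

# Fit a modest and a high-order polynomial to the same 12 points.
low = np.polyfit(x, y, deg=3)
high = np.polyfit(x, y, deg=11)  # one coefficient per point: interpolation

# Evaluate just outside the observed range [0, 1].
x_new = np.array([1.05, 1.1, 1.2])
print(np.polyval(low, x_new))   # stays in a plausible range
print(np.polyval(high, x_new))  # typically explodes
```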
16.4.2 Data transformations
If we plug values for the regressor variables into Eq. (16.17), we obtain a linear function of the parameters $\beta_j$. Now, let us consider an exponential functional form such as

$$Y = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}$$
If we plug in values for $x_1$ and $x_2$, we do not obtain a linear function of the parameters $\beta_j$, $j = 0, 1, 2$. However, it is easy to see that a proper transformation leads us back to the linear case. Taking the logarithm of both sides of the equality yields

$$\log Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$
This looks a bit like logistic regression, but the underlying model is of a completely different nature. The logarithm is just one of the possible transformations that can be used to linearize a relationship. Another common example is a regression model of the form

$$Y = \beta_0 x^{\beta_1}$$
which can be transformed to

$$\log Y = \log \beta_0 + \beta_1 \log x$$
In this case, we take the logarithm of both the regressed and the regressor variables. In other cases, we may also adopt a model such as

$$Y = \beta_0 + \beta_1 \log x$$

in which only the regressor is transformed.
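As an illustration of the log–log transformation just described, the power-law model can be estimated by running simple linear regression on the transformed variables. This is a minimal sketch with made-up data; the "true" parameter values below are arbitrary:

```python
import numpy as np

# Synthetic data from a power-law model, for illustration only.
rng = np.random.default_rng(1)
x = np.linspace(1, 10, 50)
beta0_true, beta1_true = 2.0, 1.5  # arbitrary "true" parameters
y = beta0_true * x**beta1_true * np.exp(rng.normal(scale=0.1, size=x.size))

# Log-log transformation: log Y = log beta0 + beta1 * log x,
# which is linear in (log beta0, beta1).
slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
print(np.exp(intercept), slope)  # recovered beta0 and beta1
```

Note that the multiplicative lognormal noise used above keeps the transformed model exactly linear; this choice anticipates the discussion of error terms below.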
We see that with a little ingenuity, many tricks can be used to our advantage. However, we stress again that we must be careful and not forget statistical issues. If we want to introduce errors, should we use a form such as

$$Y = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2} + \epsilon$$
or the following equation?

$$Y = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon}$$
An answer can be given only if we choose an appropriate distribution for $\epsilon$ and, more importantly, if we know the underlying phenomenon well enough to judge the sensibility of each assumption. To illustrate, imagine that $\epsilon$ is assumed normal. The first model may make no sense if Y is restricted to nonnegative values, since, in principle, a negative realization of the normal error might result in a negative value of Y. The second model looks more sensible from this perspective, since by taking the exponential of a normal variable we obtain a lognormal variable. Clearly, each choice must be carefully pondered, and it has an impact on the statistical side of regression, in terms of testing the model and assessing uncertainty in both parameter estimates and model predictions.
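The following sketch (simulated data with arbitrary parameters, purely for illustration) contrasts the two error placements: with an additive normal error, a strictly positive response can occasionally go negative, whereas placing the error inside the exponent yields lognormal, hence nonnegative, values:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
# Arbitrary parameter values and a single regressor, for illustration.
beta0, beta1 = -1.0, 0.5
x = rng.uniform(0, 1, size=n)
eps = rng.normal(scale=0.8, size=n)

y_additive = np.exp(beta0 + beta1 * x) + eps        # error added outside
y_multiplicative = np.exp(beta0 + beta1 * x + eps)  # error inside the exponent

print((y_additive < 0).mean())        # positive fraction: negative Y can occur
print((y_multiplicative < 0).mean())  # exactly 0: lognormal values are positive
```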
Problems
16.1 Apply the formulas of multiple regression to the case of a single regressor, and verify that the familiar formulas for simple regression are obtained.