A GLANCE AT ADVANCED TIME SERIES MODELING

Exponential smoothing methods were born out of heuristic intuition, even though methodological frameworks were later developed to give them a somewhat more solid justification. Despite these efforts, exponential smoothing methods suffer from at least a couple of drawbacks:

  • They are not able to cope with correlations between observations over time; for instance, we might have a demand process with positive or negative autocorrelation (formally defined below), which should be accounted for by the forecasting procedure.
  • They do not offer a clear way to quantify uncertainty; using RMSE as an estimate of standard deviation, as we did in Example 11.6, may be a sensible heuristic, but again it may be inadequate when autocorrelation is an issue.

It is also worth noting that simple linear regression models share some of the limitations above, as standard OLS estimates assume uncorrelated errors. If necessary, we may resort to an alternative body of statistical theory that deals with formal time series models. In this section we give a flavor of the theory by outlining the two basic classes of such models: moving-average and autoregressive processes. They can be integrated into the more general class of ARIMA (autoregressive integrated moving average) processes, also known as Box-Jenkins models. As we shall see, time series modeling offers many degrees of freedom, maybe too many; when observing a time series, it may be difficult to figure out which type of model is best suited to capture the essence of the underlying process. Applying time series models requires the following steps:

  1. Model identification. We should first select a model structure, i.e., its type and its order.
  2. Parameter estimation. Given the qualitative model structure, we must fit numerical values for its parameters. We will not delve into the technicalities of parameter estimation for time series models, but this step relies on the statistical tools that we have developed in Section 9.9.
  3. Forecasting and decision making. In the last step, we must make good use of the model to come up with a forecast and a quantification of its uncertainty, in order to find a suitably robust decision.
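
To make the three steps concrete, here is a minimal sketch in Python; the statsmodels library is assumed to be available, and the simulated series and the tentative ARIMA order are purely illustrative choices, not prescriptions.

```python
# A minimal sketch of the three-step workflow, assuming the statsmodels
# library; the series and the ARIMA order are illustrative choices only.
import numpy as np
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(42)
y = np.cumsum(rng.normal(size=200))      # a hypothetical nonstationary series

# 1. Model identification: inspect the autocorrelogram of the differenced
#    series to guess a suitable model type and order.
plot_acf(np.diff(y), lags=20)

# 2. Parameter estimation: fit the tentatively chosen ARIMA(p, d, q) model.
fit = ARIMA(y, order=(1, 1, 1)).fit()

# 3. Forecasting and decision making: point forecasts plus prediction
#    intervals to quantify uncertainty.
pred = fit.get_forecast(steps=5)
print(pred.predicted_mean)
print(pred.conf_int(alpha=0.05))         # 95% prediction intervals
```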

All of the above complexity must be justified by the application at hand: time series models are definitely needed in quantitative finance, but perhaps not in demand forecasting. We should also mention that, quite often, a quantitative forecast must be integrated with qualitative insights and pieces of information; if so, a simpler and more intuitive model might be easier to adjust as needed.

In its basic form, time series theory deals with weakly stationary processes, i.e., time series $Y_t$ with the following properties, related to first- and second-order moments:

  1. The expected value of $Y_t$ does not change in time: $\mathrm{E}[Y_t] = \mu$.
  2. The covariance between $Y_t$ and $Y_{t+k}$ depends only on the time lag k.

The second condition deserves some elaboration.

DEFINITION 11.1 (Autocovariance and autocorrelation)

Given a weakly stationary stochastic process $Y_t$, the function

$$\gamma_k = \mathrm{Cov}(Y_t, Y_{t+k}) = \mathrm{E}\big[(Y_t - \mu)(Y_{t+k} - \mu)\big]$$

is called the autocovariance of the process at time lag k. The function

$$\rho_k = \frac{\mathrm{Cov}(Y_t, Y_{t+k})}{\sqrt{\mathrm{Var}(Y_t)}\,\sqrt{\mathrm{Var}(Y_{t+k})}}$$

is called the autocorrelation function (ACF).

The definition of autocorrelation relies on the fact that the variance is constant as well:

$$\mathrm{Var}(Y_t) = \mathrm{Cov}(Y_t, Y_t) = \gamma_0, \qquad \text{so that} \qquad \rho_k = \frac{\gamma_k}{\gamma_0}.$$

In practice, autocorrelation may be estimated by the sample autocorrelation function (SACF), given a sample path $Y_t$, $t = 1, \ldots, T$:

$$R_k = \frac{\sum_{t=k+1}^{T} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{T} (Y_t - \bar{Y})^2} \qquad (11.28)$$

where $\bar{Y}$ is the sample mean of $Y_t$. The expression in Eq. (11.28) may not look quite convincing, since the numerator and the denominator are sums involving a different number of terms. In particular, the number of terms in the numerator decreases with the time lag k; thus, the estimator looks biased and, for a large value of k, $R_k$ will vanish. However, this is what one expects in real life, since for most stationary processes autocorrelation fades away at large time lags. Furthermore, although we could account for the true number of terms involved in the numerator, for large k the sum involves very few terms and is not reliable anyway. Indeed, the form of sample autocorrelation in Eq. (11.28) is what is commonly used in statistical software packages, even though alternatives have been proposed.[13] If T is large enough then, under the null hypothesis that the true autocorrelations $\rho_k$ are zero for $k \ge 1$, the statistic $\sqrt{T}\,R_k$ is approximately standard normal. Since $z_{0.975} = 1.96 \approx 2$, a commonly used approximate rule states that if

$$|R_k| > \frac{2}{\sqrt{T}} \qquad (11.29)$$

the sample autocorrelation at lag k is statistically significant. For instance, if T = 100, autocorrelations outside the interval [−0.2, 0.2] are significant. We should keep in mind that this is an approximate result, holding for a large number T of observations. We may plot the sample autocorrelation function at different lags, obtaining an autocorrelogram that can be most useful in pointing out hidden patterns in data.
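
As an illustration, Eq. (11.28) and the band of Eq. (11.29) are easy to compute directly; the following Python sketch uses an arbitrary white-noise input, an assumption made only so that (usually) no lag should be flagged as significant.

```python
# Sample autocorrelation as in Eq. (11.28), with the approximate
# significance band of Eq. (11.29); the input series is an arbitrary
# white-noise example chosen for illustration.
import numpy as np

def sacf(y, max_lag):
    """Sample autocorrelations R_k, k = 1, ..., max_lag, per Eq. (11.28)."""
    dev = np.asarray(y, dtype=float) - np.mean(y)   # deviations from the mean
    denom = np.sum(dev ** 2)                        # sum over all T terms
    return np.array([np.sum(dev[k:] * dev[:-k]) / denom
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
y = rng.normal(size=100)                 # T = 100 observations
band = 2.0 / np.sqrt(len(y))             # here, 2/sqrt(100) = 0.2
for k, r in enumerate(sacf(y, 10), start=1):
    mark = "  <-- significant" if abs(r) > band else ""
    print(f"lag {k:2d}: R_k = {r:+.3f}{mark}")
```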

Fig. 11.8 A seasonal time series (left) and its autocorrelogram (right).

Example 11.8 (Detecting seasonality with autocorrelograms) Consider the time series displayed in Fig. 11.8. A cursory look at the plot may not suggest much structure in the data, but the autocorrelogram does. The autocorrelogram displays two horizontal lines defining a band outside which autocorrelation is statistically significant; since T = 100, the two horizontal lines are set at −0.2 and +0.2, consistent with Eq. (11.29). We notice that the sample autocorrelation is stronger at time lags 5, 10, 15, and 20, which suggests that there is some pattern in the data. In fact, the time series was generated by sampling the following process:

$$Y_t = L \cdot S_{\mathrm{mod}(t,5)} \cdot e^{\epsilon_t}$$

where:

  • $\epsilon_t$ is a sequence of independent, standard normal variables; taking the exponential of $\epsilon_t$ makes sure that we have positive values.
  • mod(t, 5) denotes the remainder of the integer division of t by 5; hence, this is just a process with a seasonal cycle of length 5, where L plays the role of the level and $S_0, \ldots, S_4$ play the role of the seasonal factors.

We see that an autocorrelogram is a useful tool to spot seasonality and other hidden patterns in data.
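
The point is easy to replicate by simulation; in the following sketch, the level, the seasonal factors, and the noise scale are all hypothetical choices made for illustration, with the noise kept modest so that the seasonal pattern emerges clearly.

```python
# Simulating a multiplicative seasonal process of the same form and
# checking its sample autocorrelations; the level, seasonal factors,
# and noise scale below are hypothetical choices for illustration.
import numpy as np

rng = np.random.default_rng(1)
T = 100
level = 10.0
factors = np.array([0.6, 0.9, 1.0, 1.2, 1.3])     # seasonal cycle of length 5
t = np.arange(1, T + 1)
y = level * factors[t % 5] * np.exp(rng.normal(scale=0.2, size=T))

dev = y - y.mean()
denom = np.sum(dev ** 2)
for k in (1, 2, 3, 4, 5, 10, 15, 20):
    r_k = np.sum(dev[k:] * dev[:-k]) / denom       # Eq. (11.28)
    print(f"lag {k:2d}: R_k = {r_k:+.3f}")         # peaks at multiples of 5
```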

Another important building block in time series modeling is white noise, denoted by $\epsilon_t$ in the following. This is just a sequence of i.i.d. random variables; if they are normal, we have Gaussian white noise. As we have mentioned, the first step in modeling a time series is the identification of the model structure; in the following we outline a few basic ideas that are used to this aim, pointing out the role of autocorrelation in the analysis.
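
For instance, the ACF of white noise is trivial: since the variables $\epsilon_t$ are independent and share a constant variance $\sigma^2$, we have

$$\gamma_0 = \sigma^2, \qquad \gamma_k = 0 \quad \text{and} \quad \rho_k = 0 \quad \text{for } k \ge 1,$$

so that sample autocorrelations significantly different from zero signal some structure beyond white noise.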

