MOVING AVERAGE

Moving average is a very simple algorithm, which serves well to illustrate some tradeoffs that we will face later. As a forecasting tool, it can be used when we assume that the underlying data generating process is simply

$$Y_t = B_t + \epsilon_t$$

This is the model we obtain from (11.13) if we do not consider trend and seasonality. In plain words, the idea is that demand is stationary, with average $B_t$. In principle, the average should be constant over time. If so, we should just forecast demand by a plain average of all available observations. Averaging filters out noise and reveals the underlying “signal.” In practice, there are slow variations in the level $B_t$. Therefore, if we take the sample mean of all available data

$$\frac{1}{T}\sum_{t=1}^{T} Y_t$$

we may suffer from two drawbacks:

  1. We might be considering data that do not carry any useful information, as they pertain to market conditions that no longer apply.
  2. We assign the same weight 1/T to all demand observations, whereas more recent data should have larger weights; note that, in any case, weights must add up to 1.
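
The first drawback can be illustrated with a minimal sketch. The series below is hypothetical (a noiseless level that jumps from 10 to 20) and the function name `running_mean` is an assumption made for illustration only. It shows how slowly the plain sample mean reacts when old observations no longer reflect current conditions.

```python
def running_mean(y):
    """Sample mean of all observations available up to each time bucket."""
    means = []
    total = 0.0
    for t, value in enumerate(y, start=1):
        total += value
        means.append(total / t)  # every observation gets weight 1/t
    return means


# Hypothetical noiseless demand: level 10 for six buckets, then level 20.
demand = [10] * 6 + [20] * 6
means = running_mean(demand)

print(means[5])   # 10.0, the mean just before the level shift
print(means[-1])  # 15.0, still far from 20 after six buckets at the new level
```

After six observations at the new level, the estimate has moved only halfway, because the six outdated observations carry the same weight as the recent ones.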

A moving average includes only the most recent k observations:

$$\hat{B}_t = \frac{1}{k}\sum_{\tau=t-k+1}^{t} Y_\tau$$

The parameter k is the length of the time window and characterizes the moving average. To get a grip on the sum, in particular on the +1 term in the lower limit, imagine that k = 2; then, at time t, after observing $Y_t$, we would take the average

$$\hat{B}_t = \frac{Y_{t-1} + Y_t}{2}$$

We see that the sum should start with time bucket t − 1, not t − 2. In a moving average with time window k, each observation within the last k ones has weight 1/k in the average. This is illustrated in Fig. 11.3. The estimate of the level is used to build a forecast. Since demand is assumed stationary, the horizon h plays no role at all, and we set

$$F_{t,h} = \hat{B}_t$$

Fig. 11.3 Time window in a moving-average scheme.
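
The moving-average estimate and the resulting forecast can be sketched in a few lines of Python. This is only an illustration under the stationary-demand assumption stated above; the function names `moving_average_level` and `forecast` and the sample history are hypothetical, not taken from the text.

```python
def moving_average_level(history, k):
    """Estimate the level as the mean of the k most recent observations."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if len(history) < k:
        raise ValueError("need at least k observations")
    window = history[-k:]  # Y_{t-k+1}, ..., Y_t
    return sum(window) / k  # each observation in the window has weight 1/k


def forecast(history, k, h=1):
    """Forecast made after the last observation, for horizon h.

    Under the stationary model the horizon plays no role, so h is
    accepted only to make that independence explicit.
    """
    if h < 1:
        raise ValueError("h must be a positive integer")
    return moving_average_level(history, k)


history = [10, 12, 11, 13]  # hypothetical demand observations
print(moving_average_level(history, 2))  # 12.0, the mean of 11 and 13
print(forecast(history, 2, h=1) == forecast(history, 2, h=5))  # True
```

With k = 2 the window is `history[-2:]`, that is, the observations at t − 1 and t, which matches the lower limit t − k + 1 of the sum.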

Example 11.5 Let us apply a moving average with time window k = 3 for the dataset

images

and compute MAD, assuming a forecast horizon of h = 1. We can make a first forecast only at the end of time bucket t = 3, after observing $Y_3 = 14$:

$$F_{3,1} = \hat{B}_3 = \frac{Y_1 + Y_2 + Y_3}{3}$$

Here, $\hat{B}_3$ is the estimate of the level parameter $B_t$ at the end of time bucket t = 3. Then, stepping forward, we drop $Y_1 = 12$ from the information set and include $Y_4 = 15$. Proceeding this way, we obtain the following sequence of estimates and forecasts:

images

As we noticed, forecasts do not depend on the horizon; since demand is stationary, any forecast $F_{t,h}$ based on the information set up to and including time bucket t will be the same for h = 1, 2, …. For instance, say that at the end of t = 5 we want to forecast demand during time bucket t = 10; the forecast would be simply

$$F_{5,5} = \hat{B}_5$$

To compute MAD, we must match forecasts and observations properly. The first forecast error that we may compute is

$$e_4 = Y_4 - F_{3,1}$$

By averaging absolute errors over the sample, we obtain the following MAD:

$$\mathrm{MAD} = \frac{1}{5}\sum_{t=4}^{8} \left| Y_t - F_{t-1,1} \right|$$

Note that we have a history of 8 time buckets, but errors should be averaged only over the 5 periods on which we may calculate an error. The last forecast $F_{8,1}$ is not used to evaluate MAD, as the observation $Y_9$ is not available.
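
The matching of forecasts and observations can be made concrete with a short sketch. The series used here is hypothetical and is not the dataset of Example 11.5; the function name `moving_average_mad` is likewise an assumption. The point is only the index bookkeeping: with 8 observations and k = 3, exactly 5 errors can be computed.

```python
def moving_average_mad(y, k):
    """MAD of one-step-ahead (h = 1) moving-average forecasts.

    Returns the MAD and the number of errors it was averaged over.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if len(y) <= k:
        raise ValueError("need more than k observations to compute an error")
    abs_errors = []
    # y[0] holds Y_1. For index t >= k, the slice y[t-k:t] holds the k
    # observations preceding y[t]; their mean is the forecast for y[t].
    for t in range(k, len(y)):
        f = sum(y[t - k:t]) / k
        abs_errors.append(abs(y[t] - f))
    return sum(abs_errors) / len(abs_errors), len(abs_errors)


# Hypothetical history of 8 time buckets (not the data of Example 11.5).
demand = [10, 12, 14, 12, 10, 12, 14, 12]
mad, n_errors = moving_average_mad(demand, k=3)
print(n_errors)  # 5: errors exist only for time buckets 4 through 8
print(mad)
```

The forecast built from the last three observations is never scored, since the observation it targets is not yet available.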

A standard question asked by students after seeing an example like this is:

Should we round forecasts to integer values?

Since we are observing a demand process taking integer values, it is tempting to say that indeed we should round demand forecasts. Actually, there would be two mistakes in doing so:

  • The point forecast is an estimate of the expected value of demand. The expected value of a discrete random variable may well be noninteger.
  • We are confusing forecasts and decisions. True, we cannot purchase 17.33 items to meet demand; it must be either 17 or 18. However, what if items are purchased in boxes containing 5 items? What about making a robust decision hedging against demand uncertainty? What about existing inventory on hand? The final decision will depend on a lot of factors, and the forecast is just one of the many inputs needed.
