In choosing the time window k, we have to consider a tradeoff between
- The ability to smooth (filter) noise associated with occasionally large or small demand values
- The ability to adapt to changes in market conditions that shift average demand
If k is large, the method has a lot of inertia and is not significantly influenced by occasional variability; however, it will be slow to adapt to systematic changes. Conversely, if k is small, the algorithm will be very responsive, but also nervous and prone to chasing false signals. The difference in the behavior of a moving average as a function of the time window length is illustrated in Fig. 11.4. The demand history is depicted by empty circles, whereas the squares show the corresponding forecasts (calculated one time bucket before, since we assume h = 1). The demand history shows an abrupt jump, possibly due to the opening of another distribution channel. Two simulations are illustrated, with time windows k = 2 and k = 6, respectively. Note that, in plot (a), we start forecasting at the end of time bucket t = 2 for t = 3, since we assume h = 1, whereas in plot (b) we must wait until t = 6. We may also notice that, when k = 2, each forecast is just the midpoint of the last two observations. We see that, with a shorter time window, the forecast tends to chase occasional swings; on the other hand, with a longer time window, the adaptation to the new regime is slower.
Fig. 11.4 Moving average with k = 2 and k = 6.
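To make the tradeoff concrete, here is a minimal Python sketch (not taken from the text; the demand series and the function name moving_average_forecast are illustrative assumptions) that computes one-step-ahead (h = 1) moving-average forecasts for two window lengths on a series with an abrupt jump in level, mimicking the situation of Fig. 11.4.

```python
import numpy as np

def moving_average_forecast(demand, k):
    """One-step-ahead moving-average forecast: the forecast for bucket t
    is the mean of the k most recent observations (buckets t-k, ..., t-1).
    The first forecast is available only for bucket k (0-based index)."""
    demand = np.asarray(demand, dtype=float)
    forecasts = np.full(len(demand), np.nan)
    for t in range(k, len(demand)):
        forecasts[t] = demand[t - k:t].mean()
    return forecasts

# Illustrative demand history with an abrupt jump in the average level
demand = [20, 22, 19, 21, 20, 40, 41, 39, 42, 40]

for k in (2, 6):
    print(f"k = {k}:", np.round(moving_average_forecast(demand, k), 1))
# With k = 2 the forecast reacts quickly to the jump (but also to noise);
# with k = 6 it is smoother but lags behind the new demand level.
```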
The moving average is a very simple approach, with plenty of applications beyond demand forecasting.9 However, this kind of average can be criticized because of its “all or nothing” character. As we see in Fig. 11.3, the k most recent observations all have the same weight, 1/k, which then drops abruptly to 0 for older ones. We could try a more gradual scheme in which
- Larger weights are attributed to more recent observations
- A small but non-zero weight is associated with older observations
In other words, weights should decrease gradually to zero for older observations. This is exactly what the family of exponential smoothing algorithms accomplishes.
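As a preview, the following sketch assumes the simplest member of that family, simple exponential smoothing with coefficient alpha (the details are developed later in the text). It shows how the weight attached to the observation lagged j buckets, alpha(1 − alpha)^j, decreases gradually toward zero instead of dropping abruptly as in the moving average.

```python
import numpy as np

def exponential_smoothing_forecast(demand, alpha):
    """Simple exponential smoothing: the smoothed level is
    alpha * (last observation) + (1 - alpha) * (previous level),
    and the current level is used as the one-step-ahead forecast."""
    demand = np.asarray(demand, dtype=float)
    level = demand[0]                      # initialize the level with the first observation
    forecasts = np.full(len(demand), np.nan)
    for t in range(1, len(demand)):
        level = alpha * demand[t - 1] + (1 - alpha) * level
        forecasts[t] = level               # forecast for bucket t, made at the end of bucket t-1
    return forecasts

alpha = 0.3
weights = alpha * (1 - alpha) ** np.arange(6)   # weight on the observation lagged j buckets
print(np.round(weights, 3))   # approximately [0.3, 0.21, 0.147, 0.103, 0.072, 0.05]: decays smoothly, never exactly 0
```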