[Figure: two panels. Panel (a), "Unscaled series," plots the original and differenced series as price value against time index. Panel (b), "Logarithmic scaling," plots the original and differenced series of logarithm(price value) against time index.]
Figure 14.3: Impact of different operations on stationary and non-stationary series
Time series can be either stationary or nonstationary. A stationary stochastic process is one whose parameters, such as the mean and variance, do not change with time. A nonstationary process is one whose parameters change with time. Some kinds of time series, such as white noise, are stationary. White noise is the strongest form of stationarity, with zero mean, constant variance, and zero covariance between series values separated by a fixed lag. On the other hand, consider the case where the behavioral attribute corresponds to the price level of an industrial commodity such as crude oil. This is typically nonstationary because the average price level may increase over time as a result of inflation. In fact, most time series in real applications are nonstationary. A stationary series will usually be characterized as a noisy series with a level trend, constant variance, and zero covariance between different series values. For example, in Fig. 14.3a, both series are nonstationary because the average values increase with time. On the other hand, in Fig. 14.3b, the dashed curve is stationary because the trends do not change significantly with time. A strictly stationary time series is defined as follows:
Definition 14.3.1 (Strictly Stationary Time Series) A strictly stationary time series is one in which the probabilistic distribution of the values in any time interval [a, b] is identical to that in the shifted interval [a + h, b + h] for any value of the time shift h.
In other words, all multivariate distributions of subsets of variables must match with their shifted counterparts. The window-based statistical parameters of a stationary time series can be estimated in a meaningful way because the parameters do not vary over different time windows. In such cases, the estimated statistical parameters are good predictors of future behavior. On the other hand, the current mean, variances, and statistical correlations of the series are not necessarily good predictors of future behavior in regression-based forecasting models for nonstationary series. Therefore, it is often advantageous to convert nonstationary series to stationary ones before forecasting analysis. After the forecasting has been performed on the stationary series, the predicted values are transformed back to the original representation, using the inverse transformation. The strict stationarity concept of Definition 14.3.1 is, however, too restrictive to be meaningfully used in real applications. For example, it is difficult even to determine whether or not a time series is strictly stationary from a single instance because one must comprehensively characterize all multivariate distributions of subsets of variables.
A key observation is that it is much easier to either obtain or convert to series that exhibit weak stationarity properties. In such cases, unlike white noise, the mean of the series and the covariance between approximately adjacent time series values may be non-zero but constant over time. This is referred to as covariance stationarity. This kind of weak stationarity can be assessed relatively easily and is also useful for forecasting models that depend on specific parameters such as the mean and covariance. In other nonstationary series, the average value of the series can be described by a trend line that is not necessarily horizontal, as required by a stationary series. Periodically, the series will deviate from the trend line, possibly because of some changes in the generative process, and then return to the trend line. This is referred to as a trend stationary series. Such weak forms of stationarity are also very useful for creating effective forecasting models; a minimal sketch of how weak stationarity might be probed in practice is shown below. In the following, some practical methods that are commonly used to convert nonstationary series to stationary series will be discussed.
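Before turning to those methods, covariance stationarity can be probed informally by splitting the series into consecutive windows and comparing their means and variances. The sketch below is illustrative only; the window length, the toy data, and the use of numpy are assumptions rather than prescriptions from the text.

```python
import numpy as np

def window_stats(y, window=50):
    """Split the series into consecutive windows and report the mean
    and variance of each; roughly constant values across windows are
    consistent with weak (covariance) stationarity."""
    y = np.asarray(y, dtype=float)
    n_windows = len(y) // window
    chunks = y[:n_windows * window].reshape(n_windows, window)
    return chunks.mean(axis=1), chunks.var(axis=1)

# Example: white noise (stationary) vs. a random walk (nonstationary).
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, 1000)   # white noise
walk = np.cumsum(noise)              # random walk built from the noise
print(window_stats(noise))           # means/variances stay near 0 and 1
print(window_stats(walk))            # means drift from window to window
```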
Differencing
A common approach used for converting time series to stationary forms is differencing. In differencing, the time series value y_i is replaced by the difference between it and the previous value. Therefore, the new value y_i' is as follows:

y_i' = y_i - y_{i-1}                                          (14.8)
If the series is stationary after differencing, then an appropriate model for the data is:

y_{i+1} = y_i + e_{i+1}                                       (14.9)
Here, e_{i+1} corresponds to white noise with zero mean. A differenced time series would have t - 1 values for a series of length t because it is not possible for the first value to be reflected in the transformed series.
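As a concrete sketch, first differencing is a one-line operation in most numerical libraries; the toy price series below is hypothetical, and numpy is assumed purely for illustration. Note that the output has one fewer value than the input, matching the discussion above.

```python
import numpy as np

y = np.array([50.0, 52.0, 51.0, 55.0, 58.0])  # hypothetical price series, length t = 5
y_diff = np.diff(y)                           # y_i' = y_i - y_{i-1}
print(y_diff)                                 # [ 2. -1.  4.  3.] -- length t - 1 = 4
```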
Higher-order differencing can be used to achieve stationarity in second-order changes. Therefore, the higher-order differenced value y_i'' is defined as follows:
y_i'' = y_i' - y_{i-1}'                                       (14.10)
      = y_i - 2 · y_{i-1} + y_{i-2}                           (14.11)
This model allows the series to drift over time, since the noise has non-zero mean. The corresponding model is as follows:
y_{i+1} = y_i + c + e_{i+1}                                   (14.12)
Here, c is a non-zero constant that accounts for the drift. Generally, it is rare to use differences beyond the second order.
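A quick sketch can verify that differencing twice is equivalent to the expanded form of Eq. 14.11; the toy values and the use of numpy are illustrative assumptions.

```python
import numpy as np

y = np.array([50.0, 52.0, 51.0, 55.0, 58.0])
second = np.diff(y, n=2)                    # difference the differenced series (Eq. 14.10)
expanded = y[2:] - 2.0 * y[1:-1] + y[:-2]   # y_i - 2*y_{i-1} + y_{i-2}   (Eq. 14.11)
assert np.allclose(second, expanded)        # the two forms agree
print(second)                               # [-3.  5. -1.]
```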
A different approach is to use seasonal differences when it is known that the series is stationary after seasonal differencing. The seasonal differences are defined as follows:

y_i' = y_i - y_{i-m}                                          (14.13)

Here, m is an integer greater than 1.
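For instance, with quarterly data one might use m = 4, so that each value is compared with the value from the same quarter of the previous year. The sketch below is illustrative; the helper name and toy data are assumptions.

```python
import numpy as np

def seasonal_difference(y, m):
    """Return y_i - y_{i-m}; the result has m fewer values than the input."""
    y = np.asarray(y, dtype=float)
    return y[m:] - y[:-m]

# A toy series with a repeating pattern of period 4 plus mild growth.
quarterly = np.array([10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 33.0, 41.0])
print(seasonal_difference(quarterly, m=4))   # [2. 2. 3. 1.]
```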
In some cases, such as geometrically increasing series, the logarithm function is applied to the values in the series before the differencing operation. For example, consider a time series of prices that increase at an approximately constant inflation factor. In such cases,
it may be useful to apply the logarithm function to the time series values before the differencing operation. An example is provided in Fig. 14.3a, where the variation in inflation is illustrated with time. It is evident that the differencing operation does not help in making the series stationary. In Fig. 14.3b, the logarithm function is applied to the series before the differencing operation. In this case, the series becomes stationary after the differencing operation.
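The effect can be reproduced with a small sketch: for a series growing at a roughly constant inflation factor, plain differences keep growing, while differences of the logarithm settle near the constant log growth rate. The 5 percent growth rate, the noise level, and the numpy usage are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(200)
# Geometric growth at ~5% per step, with small multiplicative noise.
prices = 100.0 * (1.05 ** t) * np.exp(rng.normal(0.0, 0.01, t.size))

raw_diff = np.diff(prices)            # magnitudes still grow geometrically: not stationary
log_diff = np.diff(np.log(prices))    # hovers around log(1.05), roughly 0.0488
print(raw_diff[:3], raw_diff[-3:])    # early vs. late differences differ by orders of magnitude
print(log_diff.mean())                # close to 0.0488
```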
In the following, a number of univariate time series forecasting models will be discussed. These models work effectively under different assumptions on the time series patterns. Some of these models assume a stationary time series, whereas others do not.
14.3.1 Autoregressive Models
Univariate time series contain a single variable that is predicted using autocorrelations. Autocorrelations represent the correlations between adjacently located timestamps in a series. Typically, the behavioral attribute values at adjacently located timestamps are positively correlated. The autocorrelations in a time series are defined with respect to a particular value of the lag L. Thus, for a time series y_1, ..., y_n, the autocorrelation at lag L is defined as the Pearson coefficient of correlation between y_t and y_{t+L}.
Autocorrelation(L) = Covariance_t(y_t, y_{t+L}) / Variance_t(y_t)        (14.14)
The autocorrelation always lies in the range [−1, 1], although the value is almost always positive for very small values of L, and gradually drops off with increasing lag L. The positive correlation is a result of the fact that adjacent values of most time series are very similar, though the similarity drops off with increasing distance. High (absolute) values of the autocorrelation imply that the value at a given position in the series can be predicted as a function of the values in the immediately preceding window. This is, in fact, the key property that enables the use of the autoregressive model. For example, the variation in autocorrelation with lag for the IBM stock example (Fig. 14.1) is illustrated in Fig. 14.4a. Such a figure is referred to as the autocorrelation plot and is used commonly in AR models. While the autocorrelation is usually positive and falls off with lag, the precise behavior is highly application-specific. For periodic series, the autocorrelation may be periodic and negative at certain lag intervals. An example of the autocorrelations for a periodic sine wave is illustrated in Fig. 14.4b.
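The lag-wise autocorrelation of Eq. 14.14 can be estimated directly as the Pearson correlation between the series and its shifted copy. The sketch below (the function name and the sine-wave example are illustrative) reproduces the periodic sign changes mentioned above.

```python
import numpy as np

def autocorrelation(y, lag):
    """Pearson correlation between y_t and y_{t+lag}."""
    y = np.asarray(y, dtype=float)
    return np.corrcoef(y[:-lag], y[lag:])[0, 1]

# A sine wave of period 50: positively correlated at small lags,
# negatively correlated at half a period.
t = np.arange(500)
wave = np.sin(2 * np.pi * t / 50)
print(autocorrelation(wave, 1))    # close to +1
print(autocorrelation(wave, 25))   # close to -1
```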
In the autoregressive model, the value of y_t at time t is defined as a linear combination of the values in the immediately preceding window of length p.
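As a hedged illustration of this idea (the formal model statement is not visible in this excerpt), the window coefficients can be estimated with ordinary least squares over lagged copies of the series. The function name, the synthetic AR(2) coefficients, and the numpy-based fitting are all assumptions for the sketch.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares fit of y_t as a linear combination of the p
    preceding values plus an intercept; returns (coefficients, intercept)."""
    y = np.asarray(y, dtype=float)
    # Row t holds the p values immediately preceding y[t], newest first.
    X = np.column_stack([y[p - i - 1 : len(y) - i - 1] for i in range(p)])
    X = np.column_stack([X, np.ones(len(y) - p)])       # intercept column
    beta, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return beta[:-1], beta[-1]

# Recover the coefficients of a synthetic AR(2) process.
rng = np.random.default_rng(2)
y = np.zeros(2000)
for i in range(2, len(y)):
    y[i] = 0.6 * y[i - 1] + 0.3 * y[i - 2] + rng.normal()
coeffs, intercept = fit_ar(y, p=2)
print(coeffs, intercept)    # coefficients near (0.6, 0.3), intercept near 0
```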