Stationary Process

A time series is a set of data ordered in time, usually recorded at regular time intervals. In probability theory, a time series {xt} is a collection of random variables indexed by time. In the social sciences, examples of time series include the quarterly level of the gross domestic product, the monthly inflation rate, the annual level of crime, and the annual population growth. One of the main features of time series is the interdependency of observations over time. This interdependency needs to be accounted for in the modeling of time series data, in order to improve understanding of their temporal behavior and forecasting of their future movements.

The class of all models for time-dependent data is far too large for methods of analysis to be designed that would suit every model. Without some restriction, modeling the interdependence becomes an impossible task, and a certain degree of stability or invariance in the statistical properties of the time series becomes essential for its modeling.

Stationarity broadly refers to some form of statistical stability or equilibrium. Stationarity is a key concept in the analysis of time series data, as it allows powerful techniques for modeling and forecasting to be developed. There are different forms of stationarity, depending on which statistical properties of the time series are restricted. The two most widely used forms are weak stationarity and strict stationarity.


A time series {xt} is said to be strictly stationary when the joint probability density function of the collection of random variables xt1, …, xtk is the same as that of the random variables xt1+h, …, xtk+h, for all integers t1, …, tk and positive integers k and h. The strict stationarity property restricts the probabilistic properties of any collection of random variables to be invariant to time shifts, which implies that the probabilistic behavior of each random variable is the same across time.

A time series {xt} is said to be weakly stationary, sometimes referred to as second-order or covariance stationary, when

E(xt) = μ, Var(xt) = σ2 < ∞, and Cov(xt, xt+h) = γ(h), for all integers t and h.

The weak stationarity property restricts the mean and variance of the time series to be finite and invariant in time, and takes the linear dependence between two distinct observations (as measured by the covariance) to be a function of the time distance between them. The function γ(h), for integers h, is called the autocovariance function. The function ρ(h) = γ(h)/σ2, for integers h, is the autocorrelation function.
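Under weak stationarity, the autocovariance and autocorrelation functions can be estimated from a single observed series. A minimal sketch, assuming NumPy is available; the function names are illustrative, not a library API:

```python
import numpy as np

def sample_autocovariance(x, h):
    """Sample autocovariance of the series x at lag h >= 0."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = x.mean()
    return np.sum((x[h:] - mu) * (x[:n - h] - mu)) / n

def sample_autocorrelation(x, h):
    """Sample autocorrelation: autocovariance at lag h over the variance."""
    return sample_autocovariance(x, h) / sample_autocovariance(x, 0)

# White noise is weakly stationary; its lag-1 autocorrelation should be near 0
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
rho1 = sample_autocorrelation(x, 1)
```

By construction the lag-0 autocorrelation equals 1, and for white noise the sample value at any nonzero lag fluctuates around zero with standard deviation of roughly 1/√n.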

Strict stationarity implies weak stationarity when the mean and variance of the time series exist. The two forms of stationarity are equivalent if the time series follows the normal distribution. Overall, weak stationarity is less restrictive than strict stationarity. Moreover, it turns out that the weak stationarity condition is sufficient for most of the statistical results derived for cross-sectional data to hold. For these reasons, weak stationarity is employed more often than strict stationarity. Various authors use the term stationarity to refer to weak stationarity. Figure 1 shows one hundred simulated data from a weakly stationary time series.

A simple example of a weakly stationary time series is the white noise sequence, a sequence of zero-mean, constant-variance, and uncorrelated random variables. Probably the best-known model for weakly stationary time series is the autoregressive moving average of orders p and q, or ARMA(p,q), model with appropriate restrictions on its coefficients, where the variable under consideration is written as a linear combination of its own p past values, an error term (usually taken to be a white noise sequence), and q past values of the error term. The ARMA(p,q) model was popularized by George Box and Gwilym Jenkins (1970).
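As an illustration, a weakly stationary ARMA(1,1) series can be simulated directly from its defining recursion. A sketch assuming NumPy, with illustrative parameter values; the function name is made up for this example:

```python
import numpy as np

def simulate_arma11(n, phi, theta, sigma=1.0, seed=0):
    """Simulate x_t = phi*x_{t-1} + e_t + theta*e_{t-1}, e_t white noise.
    Weak stationarity of the AR part requires |phi| < 1."""
    rng = np.random.default_rng(seed)
    e = rng.normal(scale=sigma, size=n + 1)
    x = np.zeros(n + 1)
    for t in range(1, n + 1):
        x[t] = phi * x[t - 1] + e[t] + theta * e[t - 1]
    return x[1:]

x = simulate_arma11(500, phi=0.5, theta=0.3)
```

The restriction |phi| < 1 is the ARMA(1,1) case of the "appropriate restrictions on its coefficients" mentioned above; with |phi| ≥ 1 the simulated path would not settle into a stationary fluctuation.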

Stationarity plays an important role in the forecasting of time series data. Suppose that at time t we are given a set of data xt-n, …, xt and wish to forecast the future value xt+h. In predicting xt+h, we would make use of the given data, forecasting the value by some function f of them. To determine which function f gives the best forecast, a measure of accuracy, or loss function, is needed to evaluate how accurate a forecast is. The most commonly used measure in prediction theory is the mean squared error (MSE) of the forecast, and the function f is chosen so that the MSE is minimized. It turns out that the function f that produces the smallest MSE is the conditional mean of xt+h given the available information, E(xt+h | xt-n, …, xt). If the time series were normally distributed, then the conditional mean E(xt+h | xt-n, …, xt) would be a linear combination of the data xt-n, …, xt. In general, the conditional mean is unknown, and it is common in prediction theory to consider only predictors x̂t+h for xt+h that are linear combinations of the available data xt-n, …, xt,

x̂t+h = a0xt + a1xt-1 + … + anxt-n,

where a0, a1, …, an are parameters that need to be estimated. The assumption of weak stationarity of the time series {xt} guarantees that the parameters a0, a1, …, an are invariant in time and therefore can be easily estimated from the data.
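The coefficients of such a linear predictor can be estimated by ordinary least squares on lagged copies of the series. A minimal sketch, assuming NumPy; fit_linear_predictor is an illustrative name, not a library routine:

```python
import numpy as np

def fit_linear_predictor(x, p):
    """Least-squares estimates of a0, ..., a_{p-1} in
    x_{t+1} ~ a0*x_t + a1*x_{t-1} + ... + a_{p-1}*x_{t-p+1}."""
    x = np.asarray(x, dtype=float)
    # Row i of the design matrix holds x_t, x_{t-1}, ..., x_{t-p+1}
    X = np.column_stack([x[p - 1 - j : len(x) - 1 - j] for j in range(p)])
    y = x[p:]  # targets: the next observation
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

# Sanity check on a simulated AR(1): x_t = 0.6*x_{t-1} + e_t
rng = np.random.default_rng(0)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.6 * x[t - 1] + rng.normal()
a = fit_linear_predictor(x, 1)  # a[0] should be close to 0.6
```

Because the AR(1) series is weakly stationary, the same coefficient applies at every t, which is exactly why a single least-squares fit over the whole sample is justified.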

A detailed discussion of the properties, modeling, and forecasting of stationary time series can be found in Peter Brockwell and Richard Davis (2002).


The conditions for stationarity, weak or strong, can be violated in many different ways. Time series that do not have the stationarity property are called nonstationary. Examples of nonstationary time series data include the levels of the gross domestic product and population, which do not fluctuate around a constant level, but show overall an upward time trend. For these series, the average behavior for the beginning and the end of the sample differs, which rules out the possibility that the time series is stationary. A trend component in the data is one of the most common cases of nonstationarity in economic time series. Popular models for capturing trend behavior are the linear trend and random walk models.

The linear trend model assumes that the random variable xt of the time series can be written as the sum of a deterministic linear trend α + βt and a white noise random variable εt,

xt = α + βt + εt, (1)

where α and β are constant parameters. Under the linear trend model (1), xt is regarded as being scattered randomly around the trend line α + βt, with the fluctuations around the trend line having no obvious tendency to increase or decrease. It follows that the mean of xt depends on time t, E(xt) = α + βt, so that the time series {xt} is nonstationary. Deterministic nonlinear functions of time t could also be considered to describe the trend of {xt}.

The random walk model assumes that the random variable xt of the time series can be written as the sum of its previous value xt-1 and a white noise random variable εt,

xt = xt-1 + εt. (2)

Under the random walk model (2), from one period to the next the current observation of the time series takes a random step away from its last recorded value. The random walk model (2) can be extended to include an additive constant δ, becoming the random walk with drift δ model,

xt = δ + xt-1 + εt. (3)

If it is assumed that the time series {xt} starts from some initial value x0, then it can be shown for the random walk with drift model (3) that the mean and variance of xt depend on time t: E(xt) = x0 + δt and Var(xt) = t Var(εt). Therefore, a time series {xt} following the random walk model without or with drift, (2) or (3), is nonstationary.
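The growing mean and variance of the random walk with drift can be checked by simulation across many independent paths. A minimal sketch, assuming NumPy; the function name and parameter values are illustrative:

```python
import numpy as np

def random_walk_with_drift(n, delta, x0=0.0, sigma=1.0, rng=None):
    """One path of x_t = delta + x_{t-1} + e_t, t = 1..n, starting at x0."""
    rng = rng if rng is not None else np.random.default_rng(0)
    e = rng.normal(scale=sigma, size=n)
    return x0 + delta * np.arange(1, n + 1) + np.cumsum(e)

# Across many independent paths, the mean of x_t grows like x0 + delta*t
# and the variance like t * Var(e_t), as stated in the text.
rng = np.random.default_rng(1)
paths = np.array([random_walk_with_drift(100, delta=1.0, rng=rng)
                  for _ in range(2000)])
final_mean = paths[:, -1].mean()  # close to x0 + delta*100 = 100
final_var = paths[:, -1].var()    # close to 100 * Var(e_t) = 100
```

Both sample moments at t = 100 should be near 100, confirming that neither the mean nor the variance is constant in time.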

The trend component in the linear trend model (1) falls in the category of deterministic trends, as it is assumed that the trend of the time series is a deterministic function of time. On the other hand, the trend component in the random walk models without or with drift, (2) and (3), belongs in the category of stochastic trends, since for these models the trend of the time series is taken to be driven by the lagged time series, which is stochastic.

The difference between deterministic and stochastic trends can be seen from the models (1) and (3). The mean of xt is a linear function of time t for both the linear trend and random walk with drift models (1) and (3), but the variance of xt is increasing in time t for the random walk with drift model (3), and is constant in time t for the linear trend model (1). Another difference between the two models is the nature of the effect of shocks on future values of the time series. For the random walk with drift model (3) the effect of a shock is permanent, while for the linear trend model (1) it wears off. Figure 2 shows five hundred simulated data from the linear trend and random walk with drift models (1) and (3) with parameters α = x0 = 0 and β = δ = 1.

A detailed discussion on nonstationary time series and the various models for the trend can be found in James D. Hamilton (1994).


The analysis of nonstationary time series is more complicated than that of weakly stationary ones. Moreover, statistical inference involving nonstationary data is usually nonstandard, in contrast to inference for weakly stationary data. For these reasons, if nonstationarity is evident in the data, it is common practice to apply some transformation that makes the data weakly stationary and then carry out statistical analysis on the transformed data.

For example, if a time series xt follows the linear trend model (1), then subtracting the trend α + βt from xt leaves a white noise sequence, which is weakly stationary. Nonstationary time series that become weakly stationary after extraction of a deterministic trend are described as trend-stationary.
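In practice α and β are unknown and are themselves estimated, typically by least squares, before the trend is subtracted. A short sketch, assuming NumPy; the parameter values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
t = np.arange(n)
x = 2.0 + 0.5 * t + rng.normal(size=n)  # model (1) with alpha = 2, beta = 0.5

# Estimate alpha + beta*t by least squares, then subtract the fitted trend
A = np.column_stack([np.ones(n), t])
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(A, x, rcond=None)
residual = x - (alpha_hat + beta_hat * t)  # detrended, weakly stationary
```

The residual series fluctuates around zero with no remaining trend, which is what the trend-stationary label asserts.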

Another important transformation that can make nonstationary data weakly stationary is the difference operator. The difference operator, usually denoted by Δ, is such that when applied to xt the result is the change xt − xt-1,

Δxt = xt − xt-1.

If it is assumed that the time series xt follows the random walk model without or with drift, (2) or (3), then the difference operator transforms the data into a white noise sequence, without or with an additive constant (the drift), which is weakly stationary. In some cases, the difference operator may need to be applied more than once to achieve weak stationarity. Nonstationary time series that become weakly stationary after applying the difference operator are referred to as difference-stationary.
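Differencing a simulated random walk with drift illustrates this directly. A sketch assuming NumPy; the drift value is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
delta = 0.2
# Random walk with drift, x0 = 0: cumulative sum of (delta + white noise)
x = np.cumsum(delta + rng.normal(size=1000))
dx = np.diff(x)  # equals delta + white noise: weakly stationary
```

The differenced series has a constant mean near the drift δ and no trend, whereas the level series x wanders upward without settling.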

Difference-stationary time series are often described as being integrated of order d, denoted I(d). The parameter d is the order of integration and is the number of times the difference operator has to be applied to the data in order to achieve weak stationarity. If, after taking d differences, the resulting time series follows a weakly stationary ARMA(p,q) model, then the original time series follows the autoregressive integrated moving average model of orders p, d, and q, or ARIMA(p,d,q), of Box and Jenkins (1970).

For nonstationary economic time series, it is common that the order of integration is d = 1, and in a few cases d = 2. Time series that are integrated of order 1 are referred to as having a unit root. Examples include the random walk with or without drift and ARIMA(p,d,q) models with d = 1. Time series having a unit root have attracted particular attention in the literature of theoretical and applied econometrics.

More information on the topic of unit root time series can be found in William Greene (2003), including testing procedures to discriminate between weakly stationary and unit root time series.

SEE ALSO General Equilibrium; Nash Equilibrium; Partial Equilibrium; Stability in Economics; Steady State; Unit Root and Cointegration Regression


BIBLIOGRAPHY

Box, George E. P., and Gwilym M. Jenkins. 1970. Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day.

Brockwell, Peter J., and Richard A. Davis. 2002. Introduction to Time Series and Forecasting. 2nd ed. New York: Springer.

Greene, William H. 2003. Econometric Analysis. 5th ed. Upper Saddle River, NJ: Prentice Hall.

Hamilton, James D. 1994. Time Series Analysis. Princeton, NJ: Princeton University Press.

Violetta Dalla