## Time Series


I. GENERAL *Gerhard Tintner*

II. ADVANCED PROBLEMS *P. Whittle*

III. CYCLES *Herman Wold*

IV. SEASONAL ADJUSTMENT *Julius Shiskin*

## I. GENERAL

A time series is a set of data ordered in time, typically with observations made at regular intervals, for example, each census year, annually, quarterly, or monthly. This article focuses on the analysis of time series of economic data. The analytical methods used are also applied in other fields, for example, in psychological encephalography, in the analysis of sociological series from panel studies, and in the analysis of voting series in political science. Although data from sampling surveys have been used increasingly in recent years, most of the observations utilized in econometric studies still come from economic time series. This is especially true of econometric studies dealing with problems that are important from the point of view of economic policy, such as the nature of business cycles and the determinants of economic development. An area of growing theoretical and empirical interest is the pooling of time series and cross-section data [*see* Cross-section analysis].

Time series have provided an indispensable source of information for econometric analyses, but peculiar features of such data warrant special caution in their use. The difficulties that beset the use of time series in econometric studies have four different causes.

The first is the simultaneity of economic relations. Observed values of economic variables are usually generated by a system of economic relations. It is now well known that it may be impossible to estimate some of these relations; that is, some relations may be underidentified. Classical least squares estimation of a single relation, ignoring the others, is in general inconsistent. Since the pioneering efforts of Trygve Haavelmo, a substantial portion of the work of econometricians has been devoted to the development of consistent estimation methods for simultaneous economic relations [*see* Simultaneous equation estimation].

The second cause of difficulty is errors in variables or errors in observations. Reported values of some economic variables are often subject to error resulting from techniques of collection or are only approximations to unobservable variables specified in the theoretical relation to be estimated. Methods have been advanced to deal with errors in variables in particular situations (see Johnston 1963, pp. 148-176), but there remains much to be done in this important area.

Multicollinearity is the third cause of difficulty. Many economic time series are highly correlated with one another (are multicollinear), owing to common factors that influence all economic activity. Prior to estimating a particular economic relation, we are often led to specify a regression model containing a number of explanatory variables, and we are interested in isolating the independent influence of each of these variables on the variable to be explained. If the explanatory variables are multicollinear, it will be impossible, using the usual least squares regression procedure, to determine with any confidence the influence of each of them. Thus, multicollinearity of economic time series often impedes empirical testing of alternative hypotheses.

The fourth cause of difficulty is autocorrelation. It is evident from our general knowledge of the economic system that the consecutive items of an economic time series will seldom be independent. The simplest case of this lack of independence is termed autocorrelation; it creates great problems in econometric studies, since the modern theory of statistical estimation and inference, as developed by R. A. Fisher, J. Neyman, A. Wald, and their disciples, is frequently designed for analysis of data that constitute random samples, i.e., where the assumption that items in the sample are statistically independent is justified or where the dependence has a simple character. Economic data, like the data of astronomy and meteorology, do not come from carefully designed experiments but are the result of complex and evolving empirical relationships. Autocorrelation typically increases with the decrease of the time interval of observations. Hence, if we could obtain daily instead of yearly observations (which is frequently impossible with economic data), we would perhaps not increase the available number of degrees of freedom very substantially. Much effort has been expended in the development of methods for dealing with autocorrelation (more generally, lack of independence), but because of the great mathematical difficulties involved and the lack of simple, realistic theoretical models, the results have been less than spectacular.

In practice, all four problems occur simultaneously. However, since it is so difficult to deal with all of them at once, much work focuses on only one problem and assumes that the others do not exist. Since this approach may be unrealistic when working with actual data, interpretations of empirical analyses are often most difficult, thus diminishing the value of such analyses for practical applications to economic policy.

The traditional analysis of time series has dealt primarily with isolating the trend, the irregular-cycle, the fairly regular periodic seasonal, and the random components of the series (Kuznets 1934). It is assumed that each component is independent of the others and can be analyzed separately, that each component is generated by a particular underlying process or model, and that the series is simply the sum of the components. This procedure is now considered somewhat obsolete, although it still provides a useful point of departure. It would be preferable to avoid these perhaps unrealistic assumptions, instead employing a procedure based upon the premise that a series is generated by a single model capable of producing trends and fluctuations and incorporating random elements.

The mathematical model of a time series is a stochastic process (Bartlett 1955). A family of jointly distributed random variables linearly arranged according to a numerical index (often corresponding to time) is called a stochastic process. As it stands, this characterization is so inclusive as to be of little value; useful results can be obtained only by starting with restrictive assumptions on the random variables of the family. The theory of stochastic processes has been studied by many excellent mathematicians, and great progress has been made in recent years. Unfortunately, the study of statistical methods for observable stochastic processes is not nearly so well developed as studies of a more purely probabilistic nature. In part this stems from mathematical difficulties, but another factor is also important. Many of the outstanding and important applications of stochastic processes arise in physics, chemistry, and communications theory, where the samples involved are so large that purely statistical problems, involving statistical inference from the sample to the population, are perhaps not very important.

Even where statistical results exist, as in the statistical treatment of stationary time series (Grenander & Rosenblatt 1957), i.e., time series that lack a trend and whose variances and higher moments are independent of time, the results are usually valid only for large samples. (Of course, the same is true for other branches of statistics as well.) Although development of a small-sample theory of stochastic processes faces formidable mathematical difficulties, it is just such a theory that is needed for the analysis of the relatively short economic time series commonly available. The mathematical problems can be made more tractable if we make highly unrealistic assumptions, for example, the so-called circular assumption, under which samples are taken from the repetitive population $X_1, X_2, \dots, X_N, X_1, X_2, \dots$. But lacking a small-sample theory, we are often forced to use large-sample methods as crude approximations to valid methods. It should also be remarked that many results in this field are still confined to tests of hypotheses; very few deal with the perhaps more important problems of estimation.

### Trend

The trend is one of the most intractable features of economic time series. By “trend” we mean the long-term secular movement of a series, that is, the mathematical expectation of $X_t$, $E X_t$, regarded as a function of time. It is, of course, difficult to distinguish the trend from cycles of long duration.

**Parametric tests for trend.** One general procedure for estimating and testing for trend in a time series involves assuming a specific form for the trend function, estimating the function, and testing the significance of the estimated relation. Among the most commonly used functional forms are the polynomial and the logistic.

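*Orthogonal polynomials.* Suppose we have a set of observations, $X_1, X_2, \dots, X_t, \dots, X_N$, on the variable $X$ for the equidistant points in time $1, 2, \dots, N$. We might assume that this series was generated by the polynomial function

$$X_t = a_0 + a_1 t + a_2 t^2 + \dots + a_p t^p + u_t, \qquad t = 1, 2, \dots, N,$$

where the $a$'s are parameters to be estimated and $u_t$ is a random error term that is normally distributed with constant variance. Also, the $u_t$'s are independent. Rather than fit this ordinary polynomial, it is simpler to estimate the orthogonal polynomial

$$X_t = b_0 \xi_{0t} + b_1 \xi_{1t} + \dots + b_p \xi_{pt} + u_t,$$

where $\xi_{it}$ is an $i$th-degree polynomial in $t$ such that for all $i$ and $j$, $i \neq j$,

$$\sum_{t=1}^{N} \xi_{it}\,\xi_{jt} = 0.$$

Since $\xi_{0t}$ is a constant, this condition includes $\sum_{t} \xi_{it} = 0$ for every $i \geq 1$. Making use of orthogonal polynomials, we can easily estimate a polynomial of degree $p$, having once estimated a polynomial of degree $p - 1$. Application of the method of least squares yields

$$\hat{b}_i = \frac{\sum_{t=1}^{N} X_t\,\xi_{it}}{\sum_{t=1}^{N} \xi_{it}^{2}}, \qquad i = 0, 1, \dots, p,$$

so that each estimated coefficient is unaffected by the degree of the polynomial being fitted.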

One method for determining the degree of the polynomial that best fits the data has been developed by R. A. Fisher (1925). Let $S_p$ denote the sum of the squared residuals from an estimated polynomial of degree $p$. We fit polynomials of degree $1, 2, \dots$, until the reduction in the residual sum of squares, $S_p - S_{p+1}$, is not statistically significant. This may be determined by an F-test, assuming the $u$'s are normally distributed and independent. A recent method put forward by T. W. Anderson (1962) may turn out to have major advantages over Fisher’s. However, application of Anderson’s method requires that we decide a priori the highest degree of the polynomial that might possibly be used. The tests proposed by Anderson then enable us to decide whether we might adequately fit a polynomial of lower order.

Although polynomial trends fitted by the method of orthogonal polynomials are often accurate representations of the past history of many time series, it is very dangerous to employ them for extrapolating these series, since polynomials tend to infinity with advancing time.
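The fitting procedure can be illustrated with a short numerical sketch. It is not Fisher's own computational scheme: the orthogonal polynomials are obtained here by a QR decomposition of the powers of $t$, and the function names (`orthogonal_basis`, `fit_by_degree`, `f_statistic`) and the simulated series are assumptions made for illustration only.

```python
import numpy as np

def orthogonal_basis(n_obs, max_degree):
    """Columns 0..max_degree are mutually orthogonal polynomials in t = 1..n_obs."""
    t = np.arange(1, n_obs + 1, dtype=float)
    t = (t - t.mean()) / t.std()              # rescaling only improves conditioning
    powers = np.vander(t, max_degree + 1, increasing=True)   # 1, t, t^2, ...
    basis, _ = np.linalg.qr(powers)           # column i is a polynomial of degree i
    return basis

def fit_by_degree(x, max_degree):
    """Return coefficients b_i and residual sums of squares S_0, ..., S_max."""
    x = np.asarray(x, dtype=float)
    xi = orthogonal_basis(len(x), max_degree)
    b = xi.T @ x / np.sum(xi ** 2, axis=0)    # b_i = sum X_t xi_it / sum xi_it^2
    s = np.array([np.sum((x - xi[:, :p + 1] @ b[:p + 1]) ** 2)
                  for p in range(max_degree + 1)])
    return b, s

def f_statistic(s, p, n_obs):
    """F ratio for adding the term of degree p + 1 to a fit of degree p."""
    return (s[p] - s[p + 1]) / (s[p + 1] / (n_obs - p - 2))

# Illustrative data: a quadratic trend plus independent normal errors.
rng = np.random.default_rng(0)
t = np.arange(1, 41)
x = 5.0 + 0.8 * t - 0.02 * t ** 2 + rng.normal(scale=1.0, size=t.size)
b, s = fit_by_degree(x, 4)
for p in range(3):
    print("degree", p + 1, "F =", f_statistic(s, p, len(x)))
```

Because the columns are orthogonal, the coefficients of the lower-degree terms do not change as the degree is raised, which is what makes the successive comparison of $S_p$ and $S_{p+1}$ inexpensive.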

*The logistic function.* The study of economic development focuses particular attention on long economic time series that can be used to characterize the nature of economic development. While the true form of long-term secular change is uncertain, the logistic function (Davis 1941, p. 247) recommends itself by its success, albeit limited, in animal and human population studies. Also, the logistic function, unlike the polynomial, has an upper asymptote, which is a desirable property.

If we suppose once again that we have a time series $X_t$, $t = 1, \dots, N$, the stochastic form of the logistic function that we assume generated these observations can be written as

$$X_t = \frac{k}{1 + b e^{-at}} + u_t,$$

where $k$, $a$, and $b$ are parameters to be estimated and $u_t$ is a random error term. Estimation of the parameters is difficult, since they enter the function in a nonlinear fashion. Hotelling (1927) proposed a method for fitting the logistic that surmounts this difficulty. Working with the logistic in its nonstochastic form, he utilizes the time derivative of $\log_e X_t$ and forms the differential equation

$$\frac{d \log_e X_t}{dt} = \frac{1}{X_t}\frac{dX_t}{dt} = a - \frac{a}{k} X_t.$$

We can now obtain least squares or maximum likelihood estimators of $a$ and $a/k$, and from these an estimator of $k$; $b$ can then be estimated, using a method suggested by Rhodes (1940).

Hotelling’s method has the disadvantage of requiring a discrete approximation to the derivative $dX_t/dt$, since economic data are typically available only for discrete time intervals. Hence, it seems preferable to utilize an estimation procedure that relies on a difference equation rather than on a differential equation. Such a procedure has been developed by Tintner (1960, p. 273). Utilizing the transformation $z_t = 1/X_t$, a difference equation for $z_t$ can be derived from the logistic, namely,

$$z_{t+1} = \frac{1 - e^{-a}}{k} + e^{-a} z_t.$$

Application of least squares estimation methods, which are also maximum likelihood methods if the errors are normally distributed, will yield estimators of $(1 - e^{-a})/k$ and $e^{-a}$, from which estimators of $a$ and $k$ can be obtained; $b$ may again be estimated by Rhodes’s formula.
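A minimal sketch of the difference-equation approach follows, assuming that an ordinary least squares line fitted to the pairs $(z_t, z_{t+1})$ is an acceptable stand-in for the estimation step described above. The function name `fit_logistic_difference` and the noise-free test series are hypothetical, and the parameter $b$ is deliberately left unestimated because Rhodes's formula is not reproduced in the text.

```python
import numpy as np

def fit_logistic_difference(x):
    """Estimate a and k of X_t = k / (1 + b exp(-a t)) from z_t = 1 / X_t."""
    x = np.asarray(x, dtype=float)
    z = 1.0 / x
    # Regress z_{t+1} on z_t: intercept = (1 - e^{-a}) / k, slope = e^{-a}.
    slope, intercept = np.polyfit(z[:-1], z[1:], 1)
    a = -np.log(slope)
    k = (1.0 - slope) / intercept
    return a, k

# Noise-free logistic series with illustrative parameter values.
t = np.arange(1, 31)
series = 100.0 / (1.0 + 20.0 * np.exp(-0.3 * t))
a_hat, k_hat = fit_logistic_difference(series)
print(a_hat, k_hat)
```

With noisy data the slope estimate can fall outside the interval $(0, 1)$, in which case the logarithm or the implied $k$ is not meaningful; a practical implementation would need to check for this.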

*Moving averages.* In fitting polynomials or logistics to an economic time series in order to isolate trend, it is assumed that a simple function will accurately capture the trend throughout the series. An alternative method, which does not involve such a strong assumption, is the method of moving averages. Moving averages can most effectively be applied to time series containing seasonal or cyclical components of relatively constant period. Given a time series $X_t$, $t = 1, \dots, N$, the trend value of the series at $t = m + 1$ is found by taking an average of the first $2m + 1$ elements of the series, where $m$ is chosen so that $2m + 1$ corresponds to the period of the seasonal or cyclical component. (The fact that the period of business cycles is irregular creates difficulties in the application of this method.) Similarly, the trend value at $t = m + 2$ is found by averaging the observations $X_2, \dots, X_{2m+2}$. In general, the trend value at $t$ is an average of $X_{t-m}, \dots, X_{t+m}$. An immediately evident shortcoming of this procedure is that no trend values are obtained for the first and last $m$ time periods covered by the original series unless special methods are adopted.

The weights attached to the elements to be averaged in each set of $2m + 1$ consecutive observations may be chosen in a variety of ways. Weighting each element by the constant $1/(2m + 1)$ (taking a simple mean of the elements) is equivalent to fitting a linear relation to each set; more complex weighting schemes can be used, so that the procedure is equivalent to fitting a polynomial of any given degree to each set. The degree of the polynomial to be fitted may be found by the variate difference method proposed by Tintner (1940, p. 100). The method of moving averages is thus seen to be similar to the fitting of polynomial functions to time series, but rather than fitting one polynomial to the entire series, a polynomial is fitted to subsets of $2m + 1$ consecutive observations.

Caution must be exercised in the use of moving averages, since the application of such averages will introduce autocorrelation into even a pure random series and modify any existing autocorrelation. (Of course, the fitting of linear or polynomial trends may have similar effects.) Also, the use of moving averages will diminish the amplitude of existing periodic movements (Tintner 1952, pp. 203-207; see also Slutsky 1927). For example, if we use moving averages to eliminate the seasonal component, this will diminish the amplitude of other cyclical fluctuations in the series.
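The induced autocorrelation can be seen in a small simulation. The sketch below applies a simple (equally weighted) moving average to a purely random series and compares the first-order autocorrelation before and after smoothing; the helper names and the simulated data are assumptions for illustration. For equal weights and window length $w = 2m + 1$, the smoothed random series has a theoretical first-order autocorrelation of $(w - 1)/w$.

```python
import numpy as np

def centered_moving_average(x, m):
    """Simple mean of X_{t-m}, ..., X_{t+m}; the first and last m values are lost."""
    x = np.asarray(x, dtype=float)
    weights = np.full(2 * m + 1, 1.0 / (2 * m + 1))
    return np.convolve(x, weights, mode="valid")

def lag1_autocorrelation(x):
    """First-order sample autocorrelation about the sample mean."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.sum(d[:-1] * d[1:]) / np.sum(d ** 2)

rng = np.random.default_rng(0)
noise = rng.normal(size=5000)                 # purely random series
smoothed = centered_moving_average(noise, 2)  # window of 2m + 1 = 5 terms
r_raw = lag1_autocorrelation(noise)
r_smooth = lag1_autocorrelation(smoothed)
print(r_raw, r_smooth)                        # the second is close to 4/5
```

The same helper shows the loss of end values: a series of length $N$ yields only $N - 2m$ trend values.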

**Nonparametric tests for trend.** The methods for estimating the trend component of a time series discussed thus far all rely on our having some a priori knowledge of the form of the trend function. As is frequently the case in economics, we may feel very uncertain about the form of the function and would like a statistical test for trend that does not depend on such knowledge, i.e., a nonparametric test. Such a test has been devised by Mann (1945) on the basis of earlier work by Kendall (1938). Given the series $X_t$, $t = 1, \dots, N$, we assign ranks $p_1, \dots, p_N$ to the observations, ranking them in increasing order of magnitude. For example, if $X_1$ is the fourth smallest item in the series, then $p_1 = 4$. We then compute a coefficient of disarray,

$$\tau = \frac{S}{\tfrac{1}{2}N(N-1)},$$

where $S$, called the total score, is defined as

$$S = 2P - \tfrac{1}{2}N(N-1).$$

Here $P$, the positive score, is the sum $\sum_{i=1}^{N-1} n_i$, where $n_i$ is the number of elements $X_{i+1}, \dots, X_N$ with ranks larger than $p_i$. The coefficient $\tau$ may take on values from $-1$ to $+1$, large negative values indicating a downward trend in the series and large positive values an upward trend. If there is no trend, $\tau$ should be in the neighborhood of zero. The significance of $\tau$ may be determined by a significance test for $S$, for which tables have been provided by Kendall (1948) for small $N$. For larger values of $N$, $S$ can be considered normally distributed, with mean zero and variance equal to $N(N-1)(2N+5)/18$. [*Further discussions of nonparametric methods in trend analysis may be found in* Foster & Stuart (1954) *and* Hemelrijk (1958). *See also* Nonparametric statistics, article on ranking methods.]
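The computation of the total score, the coefficient $\tau$, and the large-sample normal deviate can be sketched as follows. The function name `mann_kendall` is hypothetical, and the sketch assumes a series without tied values, since the treatment of ties is not discussed in the text.

```python
import math
import numpy as np

def mann_kendall(x):
    """Return (S, tau, z) for a series without ties."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # P: for each i, count later observations that exceed X_i (larger rank).
    p = sum(int(np.sum(x[i + 1:] > x[i])) for i in range(n - 1))
    pairs = n * (n - 1) // 2
    s = 2 * p - pairs                          # total score S = 2P - N(N-1)/2
    tau = s / pairs
    z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18)   # large-sample deviate
    return s, tau, z

print(mann_kendall([1.0, 3.0, 2.0, 4.0]))
```

For the four-item example, five of the six pairs are in increasing order, so $P = 5$, $S = 4$, and $\tau = 4/6$; the normal approximation is of course not appropriate for so short a series and is computed only to show the formula.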

### Oscillatory and periodic movements

The study of oscillatory and periodic movements in economic time series deals with seasonal fluctuations, the business cycle, and related matters. The procedures to be described have been developed to analyze series without trends or series from which trends have been eliminated. The primary concerns are discovering the statistical models that appear to have generated the oscillatory time series frequently observed and estimating the parameters of these models. (Useful methods for such analysis are to be found in Bruckmann & Pfanzagl 1960 and Hannan 1963.)

Practically all economic time series display fluctuations of one sort or another. Before we embark on analysis of a particular series, it is desirable to determine whether the observed fluctuations represent anything beyond a purely random process. A nonparametric test for this purpose has been constructed by Wallis and Moore (1941). The test involves determining whether the number of completed runs of length 1, 2, and greater than 2 differs significantly from the expected number of such runs in a series of independent random observations. (A run is defined as the occurrence of consecutive first differences having the same sign.) The statistic comparing the actual with the expected number of such runs is used in an approximate $\chi^2$ test, for which tables are provided by Wallis and Moore. They have also established a test employing the total number of completed runs as the test statistic, a statistic that is normally distributed in large samples. [*See* Nonparametric statistics, *article on* RUNS.]
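The second of these tests, based on the total number of completed runs, lends itself to a brief sketch. The mean $(2N - 7)/3$ and variance $(16N - 29)/90$ used below are the usual large-sample values for the number of completed runs of signs of first differences in a random series; they are stated here as an assumption and are not taken from the text. The function name is hypothetical, and ties between consecutive observations are not handled.

```python
import math
import numpy as np

def completed_runs_test(x):
    """Count completed runs of like-signed first differences; return (count, z)."""
    x = np.asarray(x, dtype=float)
    signs = np.sign(np.diff(x))
    if np.any(signs == 0):
        raise ValueError("ties between consecutive observations are not handled")
    total_runs = 1 + int(np.sum(signs[1:] != signs[:-1]))
    completed = max(total_runs - 2, 0)     # first and last runs are incomplete
    n = len(x)
    mean = (2 * n - 7) / 3                 # assumed expected value under randomness
    variance = (16 * n - 29) / 90          # assumed variance under randomness
    return completed, (completed - mean) / math.sqrt(variance)

count, z = completed_runs_test([1.0, 3.0, 2.0, 4.0, 3.0, 5.0])
print(count, z)
```

A series that reverses direction at every step, as in the example, has more completed runs than a random series would be expected to show; a smoothly cyclical series has fewer.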

**Periodogram analysis.** Periodogram analysis is based on the idea that a strictly periodic time series can be expressed as the sum of a number of harmonic waves, each represented by a sine or cosine term. Suppose the variable $Y$ is a function of time, say, $Y_t = f(t)$. If $f(t + T) = f(t)$ for all values of $t$, then $Y_t$ can be expressed as a Fourier series, namely,

$$Y_t = A_0 + \sum_{j=1}^{\infty}\left(A_j \cos\frac{2\pi j t}{T} + B_j \sin\frac{2\pi j t}{T}\right),$$

where $T$ is referred to as the period of oscillation and $A_0$, $A_j$, and $B_j$ are constants. It can be shown that

$$R_j^2 = A_j^2 + B_j^2,$$

where $R_j$ is the amplitude (the maximum value attained) of the harmonic term with period of length $T/j$ (Schuster 1906).

In practical work we wish to determine the principal harmonic components of the time series $X_t$, $t = 1, \dots, N$. We assume that $X_t$ is generated by a Fourier series plus a normally distributed nonautocorrelated random error term with zero mean and constant variance. We attempt to find the harmonic components corresponding to particular periods that are significant in explaining $X_t$. To accomplish this we compute the Fourier coefficients,

$$A_n = \frac{2}{N}\sum_{t=1}^{N} X_t \cos\frac{2\pi n t}{N}, \qquad B_n = \frac{2}{N}\sum_{t=1}^{N} X_t \sin\frac{2\pi n t}{N},$$

corresponding to the harmonic component with period of length $N/n$. We can then compute the squared amplitude,

$$R_n^2 = A_n^2 + B_n^2.$$

The graph of $R_n$ against $N/n$ for different values of $n$ is called the periodogram of $X_t$.

To determine whether the harmonic component with period $N/n$ is significant, we test the significance of $R_n^2$. Three tests are available: Schuster’s test, Walker’s test, and Fisher’s test. A complete discussion of these tests and tables for the test statistics are provided by Davis (1941).
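The squared amplitudes can be computed directly from the definition of the Fourier coefficients. The sketch below does so for the harmonics $n = 1, \dots, \lfloor (N-1)/2 \rfloor$; the function name `periodogram`, the removal of the sample mean, and the artificial series with a single harmonic of period 12 are assumptions for illustration. None of the three significance tests is implemented.

```python
import numpy as np

def periodogram(x):
    """Return periods N/n and squared amplitudes R_n^2 for n = 1..(N-1)//2."""
    x = np.asarray(x, dtype=float)
    n_obs = len(x)
    t = np.arange(1, n_obs + 1)
    x = x - x.mean()                               # remove the constant term
    harmonics = np.arange(1, (n_obs - 1) // 2 + 1)
    angle = 2.0 * np.pi * np.outer(harmonics, t) / n_obs
    a = (2.0 / n_obs) * (np.cos(angle) @ x)        # cosine coefficients
    b = (2.0 / n_obs) * (np.sin(angle) @ x)        # sine coefficients
    return n_obs / harmonics, a ** 2 + b ** 2

# Artificial series: one harmonic with period 12 observed over N = 48 periods.
t = np.arange(1, 49)
x = 10.0 + 3.0 * np.cos(2 * np.pi * t / 12) + 2.0 * np.sin(2 * np.pi * t / 12)
periods, r_squared = periodogram(x)
print(periods[np.argmax(r_squared)], r_squared.max())
```

In this strictly periodic example all of the variation is concentrated at the single period 12, with squared amplitude $3^2 + 2^2 = 13$; real economic series rarely show so clean a picture, which is the point of the criticism that follows.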

Although periodogram analysis has been used in the past, the results of empirical applications hardly justify much hope for its success in the future. The model on which it is based contains two unrealistic assumptions. First, it is assumed that apart from random disturbances, peaks and troughs occur with strict regularity in economic time series. Second, a random disturbance at a given date is assumed to have no effect on the future course of the time series.

**Spectral analysis.** Many of the shortcomings of periodogram analysis can be overcome by employing a more general method known as spectral analysis. Spectral analysis has wider applicability, since it allows for random disturbances in the amplitudes, periods, and phase angles of the harmonic terms in the Fourier series representation of an economic time series.

Both the periodogram and the spectrum of a time series may be thought of as decompositions of the variance of the series at different frequencies (the frequency of a harmonic term is the reciprocal of the length of the period), but the periodogram is based on the assumption of strict periodicity. The periodogram of a series that is not strictly periodic is not a consistent estimator of the spectrum at given frequencies, although it is an unbiased estimator. Since most economic time series are not strictly periodic, statistical estimation of the spectrum is generally preferable to periodogram analysis. Rather than decompose the variance of the series into components at given frequencies (as in periodogram analysis), we decompose the variance into components in the neighborhood of various frequencies, in order to obtain statistically consistent results.

(Various methods have been proposed to estimate the theoretical spectral function, discussions of which may be found in Bartlett 1955; Blackman & Tukey 1958; Grenander & Rosenblatt 1957; Wilks 1962. For additional information on spectral analysis, see Hannan 1960; 1963; Jenkins & Priestley 1957; Nerlove 1964; Parzen 1961; Symposium on Time Series Analysis 1963; Takacs 1960; Whittle 1963.)

### Interdependence of observations

The interdependence of successive observations, or autocorrelation, is the one outstanding feature that fundamentally distinguishes the methods appropriate for econometric analysis from statistical methods useful in fields where observations are independent.

**Tests for autocorrelation.** Autocorrelation (sometimes called serial correlation) refers to correlation between $X_s$ and $X_t$ in a time series, where $s \neq t$. Correlation between $X_t$ and $X_{t+1}$ is often called first-order autocorrelation. Autocorrelation may arise through different mechanisms and in different circumstances; hence, a variety of statistical tests for autocorrelation have been developed.

The simpler procedures test for autocorrelation between $X_t$ and $X_{t-j}$ for a given value of $j$. A very general test of this type is the nonparametric test suggested by Wald and Wolfowitz (1943). It is meant to test for correlation between $X_t$ and $X_{t+1}$, using the circular test statistic

$$R = \sum_{t=1}^{N-1} X_t X_{t+1} + X_N X_1,$$

which for large $N$ is normally distributed, with mean and variance as given by Wald and Wolfowitz. The null hypothesis tested is that the items of the series are independent. Unfortunately, the test may be misleading for time series with strong trends, since $R$ will then be too strongly influenced by the economically meaningless term $X_N X_1$.

A widely used parametric test for correlation between $X_t$ and $X_{t+1}$ involves the so-called von Neumann ratio as the test statistic and assumes joint normality of the observations of the sample. The ratio is defined as the mean square of first differences divided by the variance of the observations. Its distribution was established by von Neumann (1941), and tables for significance tests in small samples are given by Hart (1942).
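A minimal sketch of the ratio follows. The divisors used ($N - 1$ for the mean square of first differences and $N$ for the variance) are one common convention and are an assumption here, as are the function name and the simulated series; published tables should be consulted with the convention they were computed for.

```python
import numpy as np

def von_neumann_ratio(x):
    """Mean square successive difference divided by the variance."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean_square_diff = np.sum(np.diff(x) ** 2) / (n - 1)
    variance = np.sum((x - x.mean()) ** 2) / n
    return mean_square_diff / variance

rng = np.random.default_rng(0)
independent = rng.normal(size=5000)          # no autocorrelation
random_walk = np.cumsum(independent)         # strong positive autocorrelation
ratio_independent = von_neumann_ratio(independent)
ratio_walk = von_neumann_ratio(random_walk)
print(ratio_independent, ratio_walk)
```

For independent observations the ratio is close to 2; positive first-order autocorrelation pulls it toward 0, and negative autocorrelation pushes it toward 4.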

A parametric test, again based on a normal population, to determine whether $X_t$ is significantly correlated with $X_{t-j}$ for any $j$ is also available. The test statistic for $j = L$ is the $L$th noncircular autocorrelation coefficient computed from the sample, denoted by $r_L$, that is, the simple correlation coefficient between the series $X_1, X_2, \dots, X_{N-L}$ and the series $X_{L+1}, X_{L+2}, \dots, X_N$. Here $r_L$ is noncircular, since we do not identify $X_{N+1}$ with $X_1$, $X_{N+2}$ with $X_2$, etc. The $L$th circular autocorrelation coefficient computed from the sample, $r'_L$, is the simple correlation coefficient between the series $X_1, X_2, \dots, X_N$ and the series $X_{L+1}, X_{L+2}, \dots, X_N, X_1, \dots, X_L$. The distribution of $r_L$ is not known, but the exact small-sample distribution of $r'_L$ has been derived by R. L. Anderson (1942) and can be used for an approximate test of the null hypothesis that $X_t$ and $X_{t-L}$ are not correlated in the population. As a justification for approximating the distribution of $r_L$ with that of $r'_L$, it can be argued that for large samples the influence of the circular terms becomes negligible. The relationship between the first autocorrelation coefficient and the von Neumann ratio (in large samples the von Neumann ratio is approximately equal to $2(1 - r_1)$) has been investigated by T. W. Anderson (1948).

More refined investigation of autocorrelation requires more refined statistical models. It was noted earlier that the general model of an economic time series is a stochastic process. An example of a stochastic process that is particularly useful in analyzing autocorrelation is the stochastic difference equation, or linear autoregressive process. The observed time series $X_t$, $t = 1, \dots, N$, may have been generated by the $p$th-order linear autoregressive process:

$$X_t = a_0 + a_1 X_{t-1} + a_2 X_{t-2} + \dots + a_p X_{t-p} + e_t, \tag{1}$$

the $a_i$ being constants and the $e_t$ being nonautocorrelated random disturbances with zero mean and constant variance. Here $X_t$ is seen to depend, in a complicated way, on its values for the previous $p$ periods. Estimates of the parameters of (1) provide information on the form of autocorrelation in the population from which the sample was obtained. Since (1) does define a stochastic process, the joint probability distribution of $X_t, \dots, X_M$ for any $M$ can be found. If the joint probability distributions of $X_t, \dots, X_M$ and $X_{t+j}, \dots, X_{M+j}$ are identical for all $j$ and for all choices of $t$ and $M$, the process is said to be *stationary.* The values of the constants, $a_i$, $i = 0, \dots, p$, will determine whether or not (1) defines a stationary stochastic process. Should the process be stationary, maximum likelihood estimation of the parameters of (1) is equivalent to direct application of least squares to the equation. The estimators of the coefficients are consistent and normally distributed for large samples, but they are biased for small samples (Hurwicz 1950).

The concept of *weak stationarity* is often useful when one is dealing primarily with first and second moments. If the variance of $X_t$ exists for all $t$, the process $X_t$ is said to be weakly stationary when *(i)* $E X_t$ does not depend on $t$ and *(ii)* the covariance of $X_t$ and $X_s$ depends only on $t - s$. It follows from *(ii)* that for a weakly stationary process the variance of $X_t$ does not depend on $t$.
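The direct application of least squares to a linear autoregressive process can be sketched as follows. The function name `fit_autoregression`, the simulated second-order process, and its coefficient values are assumptions chosen for illustration; the sketch returns point estimates only and does not address the small-sample bias noted above.

```python
import numpy as np

def fit_autoregression(x, p):
    """Least squares estimates (a_0, a_1, ..., a_p) of a pth-order autoregression."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Column i holds X_{t-i} aligned with the dependent values X_t, t = p+1..N.
    lags = np.column_stack([x[p - i:n - i] for i in range(1, p + 1)])
    design = np.column_stack([np.ones(n - p), lags])
    coefficients, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    return coefficients

# Simulate a stationary second-order process with illustrative coefficients.
rng = np.random.default_rng(1)
true_coefficients = np.array([1.0, 0.5, -0.3])
x = np.zeros(20000)
for t in range(2, len(x)):
    x[t] = (true_coefficients[0] + true_coefficients[1] * x[t - 1]
            + true_coefficients[2] * x[t - 2] + rng.normal())
estimates = fit_autoregression(x, 2)
print(estimates)
```

A long simulated series is used so that the large-sample properties apply; with the few dozen observations typical of annual economic data, the estimates would be noticeably less reliable.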

In practice the order of the stochastic difference equation to be fitted is unknown. However, Quenouille (1947) has provided a large-sample test for the goodness of fit of a linear autoregressive scheme. After the fitting of a pth-order difference equation, the Quenouille test determines whether a difference equation of higher order will better fit the sample.

**Correlation between autocorrelated series.** Testing for correlation between two variables is often desired, but well-known tests for this purpose are generally based on the assumption that neither variable is autocorrelated. Since economic time series typically are autocorrelated, it is very important to have a test for correlation between two autocorrelated series.

Orcutt and James (1948) have suggested a test based upon the idea that a set of $N$ observations of two autocorrelated series is, in a certain sense, equivalent to a smaller number, $n'$, of independent observations. Drawing upon earlier work by Bartlett (1935), they compute an approximation, valid in large samples, to the variance, $V$, of the sample correlation coefficient, $r$, between two autocorrelated series. They show that $n'$ is approximately equal to $(V + 1)/V$ and that $r$ approximately follows the t-distribution with $n' - 2$ degrees of freedom. The significance of $r$ can then be tested, using this distribution.

(For further discussion of and references to tests for correlation between autocorrelated series, see Tintner 1952, pp. 247-250.)

**Autocorrelated residuals in regression.** A special problem that frequently arises in regression studies of economic time series is the possibility of autocorrelation in the error term of a regression model. Once the regression function has been estimated by methods appropriate for a nonautocorrelated error term, it is desirable to test the residuals from the regression to determine whether or not the null hypothesis that there is in fact no autocorrelation in the disturbances can be accepted.

The autocorrelation of residuals from fitted regression equations has been considered asymptotically by Moran (1948). Suppose the regression model relating variable $X_1$ to variable $X_2$ is

$$X_{1t} = a_0 + a_2 X_{2t} + e_t,$$

where $a_0$ and $a_2$ are constants and $e_t$ represents the random disturbance at time $t$. We obtain estimates $\hat{a}_0$ and $\hat{a}_2$ of $a_0$ and $a_2$, and we wish to use the residuals,

$$\hat{e}_t = X_{1t} - \hat{a}_0 - \hat{a}_2 X_{2t},$$

to test for autocorrelation of the $e_t$. We compute the first circular autocorrelation coefficient of the residuals:

$$R_1 = \frac{\sum_{t=1}^{N-1} \hat{e}_t\,\hat{e}_{t+1} + \hat{e}_N\,\hat{e}_1}{\sum_{t=1}^{N} \hat{e}_t^{2}}.$$

The significance of $R_1$ may be determined for large samples by a test involving the first two circular autocorrelation coefficients of $X_2$. In this case using the circular definition of $R_1$ is probably not too damaging, since the residuals may well be homogeneous in time. But the autocorrelation coefficients for $X_2$ are likely to be much influenced by the circular terms.

The Durbin-Watson ratio (Durbin & Watson 1950-1951),

$$d = \frac{\sum_{t=2}^{N} (z_t - z_{t-1})^2}{\sum_{t=1}^{N} z_t^{2}},$$

where $z_t$ is the residual at time $t$ from a least squares regression with any number of independent variables, has been suggested as a test statistic for positive autocorrelation between $z_t$ and $z_{t-1}$. This ratio is related to the von Neumann ratio. Durbin and Watson have not established exact significance levels for $d$, but they do provide lower and upper bounds for various sample sizes and numbers of estimated parameters in the regression equation. Small values of $d$ point to positive autocorrelation. If $d$ is less than the lower bound, the null hypothesis of no autocorrelation is rejected in favor of positive autocorrelation in the disturbances, whereas if $d$ exceeds the upper bound, the null hypothesis is not rejected. Should $d$ lie between the bounds, the test is inconclusive. An alternative approximate test for $d$ has been suggested by Theil and Nagar (1961). R. L. Anderson and T. W. Anderson (1950) have established a test for autocorrelation of residuals from fitted Fourier series.
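The ratio itself is simple to compute once the least squares residuals are available. In the sketch below the helper names, the linear-trend regressor, and the simulated first-order autocorrelated disturbances are assumptions for illustration; the bounds tabulated by Durbin and Watson are not reproduced, so the code yields only the statistic, not a decision.

```python
import numpy as np

def ols_residuals(y, regressors):
    """Residuals from a least squares regression of y on a constant and regressors."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), regressors])
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coefficients

def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    z = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(z) ** 2) / np.sum(z ** 2)

# Regression on a time trend with positively autocorrelated disturbances.
rng = np.random.default_rng(2)
n_obs = 500
disturbance = np.zeros(n_obs)
for t in range(1, n_obs):
    disturbance[t] = 0.8 * disturbance[t - 1] + rng.normal()
trend = np.arange(n_obs, dtype=float)
y = 1.0 + 2.0 * trend + disturbance
d = durbin_watson(ols_residuals(y, trend))
print(d)
```

Values of the ratio near 2 are what independent disturbances would produce; values well below 2, as in this simulation, are characteristic of positive autocorrelation, and values well above 2 of negative autocorrelation.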

[*For additional information on autocorrelation, see* Markov chains; *see also* T. W. Anderson 1948; Cochrane & Orcutt 1949; Durbin 1960; Parzen 1961; Quenouille 1947; Sargan 1961.]

### Transformation of observations

Since methods to deal with the class of stochastic processes that arise in economic time series are frequently not available, we are often forced to try to transform our observations in such a way that classical statistical procedures can legitimately be used. (Analogous transformations were mentioned above in the discussion of the logistic function.) Unfortunately, choosing the correct transformation requires knowledge of the underlying stochastic process. However, it is possible to indicate the transformations that are appropriate for various possible underlying stochastic processes, leaving it to the researcher to decide which type of stochastic process most likely generated his observations.

**Variate difference method.** The variate difference method is a procedure for eliminating the systematic component from a time series composed of a systematic component and a random element. It does this by differencing the observations. The method has not been very popular in recent econometric applications, primarily because of its rather limited applicability, to be noted below.

The method utilizes a somewhat primitive fundamental model of an economic time series. The series X_{t}, *t* = 1, …, *N*, is assumed to be composed of two additive parts, namely, a “smooth” systematic part, M_{t}, and a random element, e_{t}. It is also assumed, perhaps unrealistically, that the mean value of e_{t} is zero, its variance is constant, and the e_{t} are not autocorrelated. If M_{t} is a polynomial function of time, then it can be eliminated from X_{t} by taking differences of the X_{t} of a sufficiently high order. If M_{t} is a sufficiently smooth function of time, it might be approximated by a polynomial in a finite interval, and taking differences of the X_{t} might reduce M_{t} sufficiently.

The transformation of differencing the observations will not eliminate M_{t} to any marked degree if M_{t} is not sufficiently smooth. For example, the method is not applicable to certain monthly time series with very pronounced seasonal fluctuations. In addition, it is evident that this transformation is not appropriate for time series generated by stochastic difference-equation processes.

The main objects of the variate difference method are to determine the difference series that best eliminates M_{t} and to estimate the variance of e_{t}. To accomplish the former, we compute the variance of the difference series of order *k*, scaled by the binomial coefficient (2*k* choose *k*) so that a purely random series yields the same expected value at every order. Denote this variance as V_{k}. If M_{t} is nearly eliminated in the difference series of order k_{0}, it can be shown that the following equalities should hold approximately:

$$V_{k_{0}} = V_{k_{0}+1} = V_{k_{0}+2} = \cdots$$

Thus, to determine whether the *k*th-order difference series eliminates M_{t}, we require a test for the equality of V_{k} and V_{k+1}.

Several tests have been suggested. O. Anderson (1929) and R. Zaycoff (1937) have given expressions for the standard error of V_{k+1} - V_{k}. Thus, we might use the ratio of this difference to its standard error as the test statistic, a statistic that is normally distributed for large samples. Tintner (1940) indicated a quite inefficient small-sample test based upon systematic selection of items in the difference series in such a way that the items selected are independent. The exact distribution of V_{k} based upon a circular definition of the population has been established by Tintner (1955), providing an exact test (Rao & Tintner 1962). In practical applications we might use the exact distribution of the difference between the circularly defined variances as an approximation to the distribution of the difference between the noncircular variances, which in large samples will not differ significantly from that between the circular variances.

Suppose we find that the k_{0}th-order difference series does eliminate M_{t}; then we may use as an estimator of the variance of e_{t} the variance of this series, namely, V_{k_{0}}. (For an alternative procedure, see Rao & Tintner 1963. For more information on the variate difference method, consult Durbin 1962; Grenander & Rosenblatt 1957; Wilks 1962, p. 526.)
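The behavior of the difference-series variances can be checked on simulated data. The sketch below is a minimal illustration, not the formal tests cited above. It assumes the conventional scaling of the *k*th difference by the binomial coefficient C(2*k*, *k*), and the quadratic trend, the unit-variance random element, and all function names are hypothetical choices made for the example.

```python
import math
import random


def difference(series, order):
    """Apply the first-difference operator `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def variate_difference_variance(series, order):
    """Mean square of the order-k differences, divided by C(2k, k).

    For a purely random series with variance s2 the k-th difference has
    variance C(2k, k) * s2, so the scaled value estimates s2 once the
    systematic part has been differenced away.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    diffs = difference(series, order)
    return sum(v * v for v in diffs) / (len(diffs) * math.comb(2 * order, order))


# Hypothetical series: quadratic systematic part plus a random element with variance 1.
rng = random.Random(0)
n = 2000
series = [5.0 + 0.3 * t + 0.002 * t * t + rng.gauss(0.0, 1.0) for t in range(n)]

v = {k: variate_difference_variance(series, k) for k in range(1, 6)}
for k in sorted(v):
    print(k, round(v[k], 3))
```

In this setup the first-order value is inflated by the trend, while the values from the third order onward settle near the variance of the random element, which is the pattern of approximate equalities the method looks for.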

**Transformations in multiple regression.** One of the worst difficulties that plague empirical econometric research is autocorrelation of the residuals in regression studies of economic time series. Although the regression coefficients may be estimated without bias even if the residuals are autocorrelated, the standard errors of the coefficients obtained using classical least squares methods will not be valid.

Suppose we estimate by least squares the regression function

X_{1t} = k_{0} + k_{2}X_{2t} + e_{t}

and find significant autocorrelation in the residuals,

ê_{t} = X_{1t} - X̂_{1t},

where X̂_{1t} = k̂_{0} + k̂_{2}X_{2t}. Then it can be shown that a valid estimator of the standard error of k̂_{2} is given by the classical estimator multiplied by the adjustment factor [1 + 2(r_{1}R_{1} + r_{2}R_{2} + …)]^{1/2}. In this formulation r_{i} is the autocorrelation coefficient of e_{t} and e_{t-i}, and R_{i} is the autocorrelation coefficient of X_{2t} and X_{2,t-i}. Thus, this procedure requires knowledge of the correlograms of e_{t} and X_{2t}. If we were to assume that both of these variables were generated by a first-order linear autoregressive process (a simple Markov scheme), so that r_{i} = r_{1}^{i} and R_{i} = R_{1}^{i}, then we could use as estimators of the correlograms the first autocorrelation coefficients computed from the sample, say, r̂_{1} and R̂_{1}. Summing the resulting geometric series would lead to the adjustment factor

$$\left[\frac{1 + \hat{r}_{1}\hat{R}_{1}}{1 - \hat{r}_{1}\hat{R}_{1}}\right]^{1/2}.$$

An alternative to adjusting the standard errors in this manner is to transform the observations in such a way that autocorrelation in the residuals is removed, so that classical least squares estimation can legitimately be applied.
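The standard-error adjustment described above is simple to compute once the two first-order autocorrelation coefficients are available. The following sketch assumes the simple Markov scheme for both the residuals and the regressor, in which case the series 1 + 2(r_{1}R_{1} + r_{2}R_{2} + …) is geometric and sums to (1 + r_{1}R_{1})/(1 - r_{1}R_{1}). The function names are hypothetical.

```python
import math


def lag1_autocorrelation(series):
    """First-order sample autocorrelation coefficient about the sample mean."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    return sum(a * b for a, b in zip(dev[1:], dev[:-1])) / sum(d * d for d in dev)


def markov_adjustment_factor(r1, big_r1):
    """Multiplier for the classical standard error when both the residuals
    and the regressor follow first-order autoregressive schemes."""
    product = r1 * big_r1
    if not -1.0 < product < 1.0:
        raise ValueError("the product of the autocorrelations must lie in (-1, 1)")
    return math.sqrt((1.0 + product) / (1.0 - product))


# Hypothetical values: residuals and regressor both positively autocorrelated.
factor = markov_adjustment_factor(0.5, 0.5)
print(round(factor, 4))
```

When either coefficient is zero the factor equals one and the classical standard error stands; when both are positive the classical standard error understates the true one.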

*Elimination of trends.* Autocorrelation of residuals is frequently due to trends in the variables included in time series regression studies. Hence, econometricians often transform their observations by taking deviations from trends. This would generally necessitate fitting trend functions to each of the variables, although that can be avoided in some instances. For example, the variate difference method might be used to find a difference series for each variable that sufficiently eliminates the systematic trend components. First differences would be called for if each variable had a linear trend as its systematic component (an exponential trend, if we are working with logarithms of the variables).

An important theorem, due to Frisch and Waugh (1933), also establishes a rather simple procedure for handling trends in variables. Suppose that all variables in the regression follow linear trends. Then it can be shown that the regression results obtained after transforming the variables into deviations from trends are the same as those obtained by using the original variables but including time itself as an independent variable in the regression. This theorem has been generalized by Tintner (1952, p. 301) to cases in which all variables have trends that are orthogonal polynomials of time or generally orthogonal functions of time, thus extending the applicability of the method.
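The equivalence stated by the theorem can be verified numerically. The sketch below uses only the Python standard library; the data are invented for illustration, and the helper names `ols` and `detrend` are hypothetical. It compares the slope obtained from deviations from fitted linear trends with the slope obtained from the original variables when time is added as a regressor.

```python
def ols(columns, y):
    """Least squares coefficients of y on the given regressor columns.

    Solves the normal equations by Gauss-Jordan elimination with pivoting.
    """
    k = len(columns)
    # Augmented normal-equation matrix [X'X | X'y].
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * p for x, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def detrend(series, time):
    """Deviations of `series` from its fitted linear trend."""
    ones = [1.0] * len(series)
    b0, b1 = ols([ones, time], series)
    return [v - b0 - b1 * t for v, t in zip(series, time)]


# Hypothetical data: both variables carry a linear trend.
time = [float(t) for t in range(1, 21)]
wiggle = [0.8, -0.3, 0.5, -1.1, 0.2, 0.9, -0.6, 0.1, -0.4, 0.7,
          -0.9, 0.3, 0.6, -0.2, -0.7, 1.0, -0.5, 0.4, -0.1, 0.2]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.15, 0.1, -0.05,
         0.05, -0.1, 0.2, -0.2, 0.1, 0.0, -0.05, 0.15, -0.1, 0.05]
x2 = [2.0 + 0.5 * t + w for t, w in zip(time, wiggle)]
x1 = [1.0 + 0.3 * t + 1.5 * w + e for t, w, e in zip(time, wiggle, noise)]

# Route 1: regress deviations from trend on deviations from trend.
slope_detrended = ols([detrend(x2, time)], detrend(x1, time))[0]
# Route 2: keep the original variables and add time as a regressor.
slope_with_time = ols([[1.0] * len(time), x2, time], x1)[1]
print(slope_detrended, slope_with_time)
```

The two slopes agree up to rounding error, which is the content of the theorem for the linear-trend case.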

However, it should be noted that eliminating trends in variables or introducing time as a separate variable in regression studies is really a confession of ignorance that is sometimes unavoidable. The relation that is being estimated must somehow have shifted over time, but rather than seek an explanation for this shifting, we simply eliminate it and estimate a relation that is stable over the period being considered.

*Autoregressive transformations.* Cochrane and Orcutt (1949) have shown that autocorrelation in residuals can be eliminated by a simple transformation of variables if the residuals follow a simple Markov scheme. Assume that we want to fit, by the method of least squares, the relation

The u_{t} are not independent but are known to follow the simple Markov scheme

u_{t} = Au_{t-1} + v_{t},

where A is a constant, |A| < 1, and the v_{t} have mean value zero, have finite and constant variance, and are not autocorrelated. If we make the transformations Y_{it} = X_{it} - AX_{i,t-1}, then least squares can legitimately be applied to the linear relationship for the transformed variables:

It should be noted that this transformation reduces to taking first differences of the observations when A is very close to unity, thus establishing a link between the autoregressive transformation and the variate difference method.
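A short derivation shows why the transformation removes the autocorrelation. It is a sketch under the assumption that the relation to be fitted has a single regressor; the coefficient names b_{0} and b_{2} are introduced here only for illustration. Subtracting A times the lagged relation from the current one leaves the nonautocorrelated v_{t} as the disturbance:

$$
\begin{aligned}
X_{1t} &= b_{0} + b_{2}X_{2t} + u_{t}, \qquad u_{t} = Au_{t-1} + v_{t},\\
AX_{1,t-1} &= Ab_{0} + Ab_{2}X_{2,t-1} + Au_{t-1},\\
X_{1t} - AX_{1,t-1} &= b_{0}(1 - A) + b_{2}(X_{2t} - AX_{2,t-1}) + (u_{t} - Au_{t-1}),\\
Y_{1t} &= b_{0}(1 - A) + b_{2}Y_{2t} + v_{t}.
\end{aligned}
$$

The slope is unchanged by the transformation, while the intercept of the transformed relation is the original intercept multiplied by (1 - A). Setting A = 1 in the third line gives the first-difference form.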

A difficulty in applying this method is that it requires knowledge of the constant A, which characterizes the stochastic process generating u_{t}. This difficulty can be surmounted in the following manner (Johnston 1963, pp. 192-195): We first fit (2) by the method of least squares and compute the residuals û_{t} = X_{1t} - X̂_{1t}. We then estimate A by applying least squares to

û_{t} = Aû_{t-1} + v_{t}.

Using Â, the estimator of A, to transform the observations, we fit (3) by least squares and test the residuals v̂_{t} = Y_{1t} - Ŷ_{1t} for autocorrelation. If the autocorrelation of these residuals is not significant, we have found a reasonably good estimator of A, and our task is completed. If the autocorrelation is significant, we use the residuals v̂_{t} to obtain a new estimator of A by fitting

v̂_{t} = Av̂_{t-1} + v_{t}

and repeat the above steps. We continue this process until the residuals computed from our estimator of (3) are not significantly autocorrelated.
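An iterative scheme of this kind can be sketched in a few lines of Python. The version below is an illustration under stated assumptions, not a transcription of the procedure above: it handles a single regressor, it re-estimates A at each pass from the residuals of the untransformed relation (the usual textbook variant), and it stops when the estimate of A stabilizes instead of applying a formal significance test. All function names and the simulated data are hypothetical.

```python
import random


def simple_ols(x, y):
    """Intercept and slope of the least squares line y = b0 + b2 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def ar1_coefficient(residuals):
    """Least squares estimate of A in res_t = A * res_{t-1} + v_t."""
    num = sum(a * b for a, b in zip(residuals[1:], residuals[:-1]))
    den = sum(b * b for b in residuals[:-1])
    return num / den


def cochrane_orcutt(x1, x2, tol=1e-8, max_iter=100):
    """Iterate between estimating A and refitting the transformed relation."""
    b0, b2 = simple_ols(x2, x1)
    a_hat = 0.0
    for _ in range(max_iter):
        residuals = [y - b0 - b2 * x for x, y in zip(x2, x1)]
        a_new = ar1_coefficient(residuals)
        # Transformed variables Y_it = X_it - A * X_i,t-1.
        y1 = [x1[t] - a_new * x1[t - 1] for t in range(1, len(x1))]
        y2 = [x2[t] - a_new * x2[t - 1] for t in range(1, len(x2))]
        c0, b2 = simple_ols(y2, y1)
        b0 = c0 / (1.0 - a_new)  # transformed intercept is b0 * (1 - A)
        converged = abs(a_new - a_hat) < tol
        a_hat = a_new
        if converged:
            break
    return a_hat, b0, b2


# Simulated example: intercept 1, slope 2, disturbances with A = 0.7.
rng = random.Random(1)
n, a_true = 500, 0.7
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
u, prev = [], 0.0
for _ in range(n):
    prev = a_true * prev + rng.gauss(0.0, 1.0)
    u.append(prev)
x1 = [1.0 + 2.0 * x + e for x, e in zip(x2, u)]

a_hat, b0_hat, b2_hat = cochrane_orcutt(x1, x2)
print(round(a_hat, 3), round(b2_hat, 3))
```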

### Stochastic processes in econometrics

The most promising approach to the satisfactory analysis of economic time series is the explicit use of stochastic processes in econometric research. Although some headway has been made along these lines, the results achieved thus far are not very satisfactory. In statistical analysis of time series, we wish to use our observations to estimate the parameters of the underlying stochastic process that generated the time series. But in order to use a small sample to make statistical inferences concerning the values of population parameters, we must know the small-sample distributions of the estimators employed. Unfortunately, considerable mathematical difficulty is encountered in determining the small-sample distributions of parameter estimators of many types of stochastic processes.

Stochastic processes have found wide and diverse applications in economic time series analysis. Dynamic economic problems are frequently analyzed with the aid of dynamic models that can be characterized as stochastic processes (see, for example, Adelman & Adelman 1959; de Cani 1961; F. M. Fisher 1962; Granger & Hatanaka 1964; Morgenstern 1961; Tinbergen 1939). Stochastic processes have been employed to analyze the formation of commodity and stock prices (Cootner 1964; Kendall 1953; Quenouille 1957). Much effort has been directed toward finding an explanation of the income distribution in terms of stochastic processes, as it is a distribution that is very skew and fits badly into the traditional analysis of statistical income distributions [*see* Size distributions in economics]. Labor mobility problems have been investigated with the aid of Markov processes (Blumen et al. 1955). Prais and Houthakker (1955) employed stochastic processes involving the lognormal distribution in analyzing household budgets. Stochastic processes have also been used quite successfully in operations research, and it is hoped that the same or related methods will be useful in analyses of economic time series (Bharucha-Reid 1960; Takács 1960).

**Estimation of parameters.** To illustrate the application of maximum likelihood estimation to stochastic processes, consider the Poisson process [Fisz (1954) 1963, p. 488; *see also* Queues]. Let *N(t)* be the number of events occurring during the time period from 0 to *t*, where *N(t)* can take on values 0, 1, 2, …. We assume that *N(t)* has independent and homogeneous increments, i.e., that *N*(t_{2}) - *N*(t_{1}) is independent of, and has the same distribution as, *N*(t_{2} + *h*) - *N*(t_{1} + *h*) for all choices of t_{1} and t_{2}, t_{2} > t_{1}, and for all choices of *h* greater than zero. The probability of the occurrence of more than one event in the time interval (*t*, *t* + Δ*t*) tends sufficiently fast to zero as Δ*t* → 0. The transition probability, p_{ji}(t_{1}; t_{2}), is the probability that *N*(t_{2}) = *i* if *N*(t_{1}) = *j* and can be written

$$p_{ji}(t_{1}; t_{2}) = \frac{[L(t_{2} - t_{1})]^{i-j}}{(i-j)!}\, e^{-L(t_{2} - t_{1})}, \qquad i \ge j,$$

where L, the parameter to be estimated, is the average number of events occurring per unit of time. Suppose we have observations on *N(t)* for *t* = t_{1}, t_{2}, …, t_{n} and denote *N*(t_{k}) = j_{k}, *k* = 1, …, *n*. Writing t_{0} = 0 and j_{0} = 0, the likelihood function is then

$$P = \prod_{k=1}^{n} \frac{[L(t_{k} - t_{k-1})]^{j_{k} - j_{k-1}}}{(j_{k} - j_{k-1})!}\, e^{-L(t_{k} - t_{k-1})}.$$

Maximizing P with respect to L yields the maximum likelihood estimator, L̂ = j_{n}/t_{n}.
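The estimator can be checked numerically against the likelihood built from independent Poisson increments. In the sketch below the observation times and cumulative counts are invented for illustration, the process is assumed to start from *N*(0) = 0, and the function names are hypothetical.

```python
import math


def poisson_log_likelihood(rate, times, counts):
    """Log likelihood of cumulative counts N(t_k) = j_k, assuming N(0) = 0."""
    log_p = 0.0
    prev_t, prev_j = 0.0, 0
    for t, j in zip(times, counts):
        dt, dj = t - prev_t, j - prev_j
        # Log of the Poisson probability of dj events in an interval of length dt.
        log_p += -rate * dt + dj * math.log(rate * dt) - math.lgamma(dj + 1)
        prev_t, prev_j = t, j
    return log_p


def poisson_rate_mle(times, counts):
    """Maximum likelihood estimate of the rate: last count over last time."""
    return counts[-1] / times[-1]


# Hypothetical record: observation times and cumulative event counts.
times = [1.0, 2.5, 4.0, 7.0, 10.0]
counts = [2, 3, 7, 12, 18]

rate_hat = poisson_rate_mle(times, counts)
print(rate_hat, poisson_log_likelihood(rate_hat, times, counts))
```

Only the final count and the final observation time enter the estimator; the intermediate observations affect the value of the likelihood but not the location of its maximum.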

Large-sample or asymptotic distributions of maximum likelihood estimators have been established for simple Markov processes (T. W. Anderson 1959). Consider a first-order stochastic difference equation,

X_{t} = aX_{t-1} + u_{t},

where *a* is a constant to be estimated from the sample X_{1}, …, X_{N}. Assume that the initial value, X_{0}, is a constant and that u_{t} is a nonautocorrelated normally distributed random variable with zero mean and constant and finite variance. The maximum likelihood estimator of *a* is

$$\hat{a} = \frac{\sum_{t=1}^{N} X_{t}X_{t-1}}{\sum_{t=1}^{N} X_{t-1}^{2}}.$$

If |*a*| < 1 (in which case X_{t} is a stationary time series or a time series with no trend), the expression (â - *a*)√*N* is, under wide conditions, asymptotically normally distributed with mean zero. Using this result, approximate confidence interval estimates of *a* could be made for very large samples.
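A small simulation gives a feel for this estimator in the stationary case. The sketch below computes the ratio of the sum of products X_{t}X_{t-1} to the sum of squares of X_{t-1}, treating X_{0} as a known constant. The parameter value, sample size, and function name are assumptions chosen for the illustration.

```python
import random


def ar1_estimate(series, x0=0.0):
    """Estimate a in X_t = a * X_{t-1} + u_t from X_1, ..., X_N, with X_0 = x0."""
    lagged = [x0] + list(series[:-1])
    numerator = sum(x * xl for x, xl in zip(series, lagged))
    denominator = sum(xl * xl for xl in lagged)
    return numerator / denominator


# Simulated stationary series with a = 0.5 and unit-variance normal disturbances.
rng = random.Random(42)
a_true, n = 0.5, 2000
series, prev = [], 0.0
for _ in range(n):
    prev = a_true * prev + rng.gauss(0.0, 1.0)
    series.append(prev)

a_hat = ar1_estimate(series)
print(round(a_hat, 3))
```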

More interesting is the case in which *a* > 1. Then X_{t} is an evolutionary time series (the stochastic equivalent of an economic time series with exponential trend), and the expression

has a Cauchy distribution, a distribution whose mean and higher moments do not exist. However, if in addition to the above assumptions it is assumed that X_{0} = 0 and u_{t} is normally distributed, then â is the maximum likelihood estimator, and the expression

is asymptotically normal and provides a basis for approximate confidence interval estimates for very large samples.

Asymptotic distributions such as these could be used as rough approximations to small-sample distributions of this estimator, but it is to be hoped that exact small-sample distributions will be established, so that they would be immediately applicable to the analysis of short economic time series.

*Multiple stochastic processes.* Multiple stochastic processes arise frequently in econometric research. For example, in dynamic business cycle models we often attempt to explain the cyclical behavior of consumption, investment, and income by a system of equations in which the current value of each of these endogenous variables is related to its own lagged values, to lagged values of other endogenous variables, to current and lagged exogenous variables, and to a random term. The interdependence of economic variables leads us to analyze a particular time series in the context of a system of equations that constitutes a multiple stochastic process. [*See* Econometric models, aggregate.]

Not enough is known about the estimation of multiple stochastic processes. In econometric work, lagged endogenous variables are frequently handled like given constants, a treatment which is plainly inadequate. Also, small-sample distributions have not yet been sufficiently explored. Quenouille (1957, p. 70) has investigated the problems of estimating a multiple Markov scheme. This is a relatively simple multiple stochastic process, composed of a system of first-order stochastic linear difference equations. As an example, we might wish to estimate the constants a_{ij} and b_{ij} in the system

a_{11}X_{1t} + a_{12}X_{2t} + b_{11}X_{1,t-1} + b_{12}X_{2,t-1} = e_{1t},

a_{21}X_{1t} + a_{22}X_{2t} + b_{21}X_{1,t-1} + b_{22}X_{2,t-1} = e_{2t},

with the sets of observations X_{11}, …, X_{1N} and X_{21}, …, X_{2N}. We assume that e_{1t} and e_{2t} are random variables with zero means and constant and finite variances and that they are independent over time. Under certain conditions, we can solve for X_{1t} and X_{2t}, rewriting the system as follows:

X_{1t} = u_{11}X_{1,t-1} + u_{12}X_{2,t-1} + v_{11}e_{1t} + v_{12}e_{2t},

X_{2t} = u_{21}X_{1,t-1} + u_{22}X_{2,t-1} + v_{21}e_{1t} + v_{22}e_{2t}.

Defining the simple covariances as

and the lagged covariances as

the maximum likelihood estimators of the u_{ij} are found by solving the system of equations

c_{11}û_{11} + c_{12}û_{21} = c'_{11},

c_{21}û_{11} + c_{22}û_{21} = c'_{21},

c_{11}û_{12} + c_{12}û_{22} = c'_{12},

c_{21}û_{12} + c_{22}û_{22} = c'_{22}.

It is also possible to obtain standard errors of the estimated u_{ij}, but it is not possible to obtain estimators of the a_{ij} and b_{ij}. In order to accomplish this, further assumptions about the coefficients have to be made. (See, for example, F. M. Fisher 1965. For a further discussion of multiple stochastic processes, see Bartlett 1955. Works containing additional information on the statistical treatment of stochastic processes as well as useful references to the vast literature on this topic are Moran 1951; 1953; Rosenblatt 1962.)
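Estimation of the coefficients of the solved two-variable system can be sketched as a pair of least squares problems. The code below is an illustration under explicit assumptions: each current value X_{it} is regressed on both lagged values, the covariance terms are taken as plain sums of products over time, and the lagged term pairs the current X_{i} with the lagged X_{j}. This index convention, the function name, and the simulated coefficients are all choices made for the example.

```python
import random


def estimate_transition(x1, x2):
    """Least squares estimates of u_ij in X_it = u_i1 X_1,t-1 + u_i2 X_2,t-1 + noise."""
    cur = list(zip(x1[1:], x2[1:]))
    lag = list(zip(x1[:-1], x2[:-1]))
    # c[i][j]: sums of products of lagged values.
    c = [[sum(p[i] * p[j] for p in lag) for j in range(2)] for i in range(2)]
    # cl[i][j]: sums of products of current X_i with lagged X_j.
    cl = [[sum(q[i] * p[j] for q, p in zip(cur, lag)) for j in range(2)]
          for i in range(2)]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    u = []
    for i in range(2):
        # Normal equations for row i, solved by Cramer's rule:
        #   u_i1 c_11 + u_i2 c_21 = c'_i1,   u_i1 c_12 + u_i2 c_22 = c'_i2.
        ui1 = (cl[i][0] * c[1][1] - cl[i][1] * c[1][0]) / det
        ui2 = (c[0][0] * cl[i][1] - c[0][1] * cl[i][0]) / det
        u.append([ui1, ui2])
    return u


# Simulated two-variable Markov scheme with hypothetical coefficients.
rng = random.Random(7)
u_true = [[0.5, 0.2], [0.1, 0.3]]
x1, x2 = [0.0], [0.0]
for _ in range(3000):
    p, q = x1[-1], x2[-1]
    x1.append(u_true[0][0] * p + u_true[0][1] * q + rng.gauss(0.0, 1.0))
    x2.append(u_true[1][0] * p + u_true[1][1] * q + rng.gauss(0.0, 1.0))

u_hat = estimate_transition(x1, x2)
print([[round(v, 3) for v in row] for row in u_hat])
```

The sketch recovers only the u_{ij} of the solved system. As noted above, the original a_{ij} and b_{ij} cannot be recovered from these estimates without further assumptions.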

Gerhard Tintner

[*See also* Linear hypotheses, *article on* regression.]

## BIBLIOGRAPHY

Adelman, Irma; and Adelman, Frank L. (1959) 1965 The Dynamic Properties of the Klein-Goldberger Model. Pages 278-306 in American Economic Association, *Readings in Business Cycles.* Homewood, Ill.: Irwin. → First published in Volume 27 of *Econometrica.*

Anderson, Oskar N. 1929 *Die Korrelationsrechnung in der Konjunkturforschung: Ein Beitrag zur Analyse von Zeitreihen.* Bonn: Schroeder.

Anderson, R. L. 1942 Distribution of the Serial Correlation Coefficient. *Annals of Mathematical Statistics* 13:1-13.

Anderson, R. L. 1954 The Problems of Autocorrelation in Regression Analysis. *Journal of the American Statistical Association* 49:113-129. → A correction was published in Volume 50, page 1331.

Anderson, R. L.; and Anderson, T. W. 1950 Distribution of the Circular Serial Correlation Coefficient for Residuals From a Fitted Fourier Series. *Annals of Mathematical Statistics* 21:59-81.

Anderson, T. W. 1948 On the Theory of Testing Serial Correlation. *Skandinavisk aktuarietidskrift* 31:88-116.

Anderson, T. W. 1959 On Asymptotic Distributions of Estimates of Parameters of Stochastic Difference Equations. *Annals of Mathematical Statistics* 30:676-687.

Anderson, T. W. 1962 The Choice of the Degree of a Polynomial Regression as a Multiple Decision Problem. *Annals of Mathematical Statistics* 33:255-265.

Anderson, T. W. 1964 Some Approaches to the Statistical Analysis of Time Series. *Australian Journal of Statistics* 6, no. 1:1-11.

Bartlett, M. S. 1935 Some Aspects of the Time-correlation Problem in Regard to Tests of Significance. *Journal of the Royal Statistical Society* 98:536-543.

Bartlett, M. S. (1955) 1962 *An Introduction to Stochastic Processes, With Special Reference to Methods and Applications.* Cambridge Univ. Press.

Bharucha-Reid, Albert T. 1960 *Elements of the Theory of Markov Processes and Their Applications.* New York: McGraw-Hill.

Blackman, R. B.; and Tukey, J. W. (1958) 1959 *The Measurement of Power Spectra, From the Point of View of Communications Engineering.* New York: Dover. → First published in the *Bell System Technical Journal.*

Blumen, Isadore; Kogan, Marvin; and McCarthy, Philip 1955 *The Industrial Mobility of Labor as a Probability Process.* Cornell Studies in Industrial and Labor Relations, Vol. 6. Ithaca, N.Y.: Cornell Univ. Press.

Bruckmann, G.; and Pfanzagl, J. 1960 Literaturbericht über die Zerlegung saisonabhängiger Zeitreihen. Deutsche Gesellschaft für Versicherungsmathematik, *Blätter* 4:301-309.

Cochrane, Donald; and Orcutt, G. H. 1949 Application of Least Squares Regression to Relationships Containing Auto-correlated Error Terms. *Journal of the American Statistical Association* 44:32-61.

Cootner, Paul H. (editor) 1964 *The Random Character of Stock Market Prices.* Cambridge, Mass.: M.I.T. Press.

Cowles Commission for Research in Economics 1953 *Studies in Econometric Method.* Edited by William C. Hood and Tjalling C. Koopmans. New York: Wiley.

Davis, Harold T. 1941 *The Analysis of Economic Time Series.* Cowles Commission for Research in Economics, Monograph No. 6. Bloomington, Ind.: Principia.

de Cani, John S. 1961 On the Construction of Stochastic Models of Population Growth and Migration. *Journal of Regional Science* 3, no. 2:1-13.

Durbin, J. R. 1960 Estimation of Parameters in Time-series Regression Models. *Journal of the Royal Statistical Society* Series B 22:139-153.

Durbin, J. R. 1962 Trend Elimination by Moving-average and Variate-difference Filters. Institut International de Statistique, *Bulletin* 39:131-141.

Durbin, J. R.; and Watson, G. S. 1950-1951 Testing for Serial Correlation in Least Squares Regression. Parts 1-2. *Biometrika* 37:409-428; 38:159-178.

Fisher, Franklin M. 1962 *A Priori Information and Time Series Analysis: Essays in Economic Theory and Measurement.* Amsterdam: North-Holland Publishing.

Fisher, Franklin M. 1965 Dynamic Structure and Estimation in Economy-wide Econometric Models. Pages 589-635 in James S. Duesenberry et al., *The Brookings Quarterly Econometric Model of the United States.* Chicago: Rand McNally.

Fisher, R. A. (1925) 1958 *Statistical Methods for Research Workers.* 13th ed. New York: Hafner. → Previous editions were also published by Oliver and Boyd.

Fisz, Marek (1954) 1963 *Probability Theory and Mathematical Statistics.* 3d ed. New York: Wiley. → First published as *Rachunek prawdopodobienstwa i statystyka matematyczna.*

Foster, F. G.; and Stuart, Alan 1954 Distribution-free Tests in Time-series Based on the Breaking of Records. *Journal of the Royal Statistical Society* Series B 16:1-22.

Frisch, Ragnar; and Waugh, Frederick V. 1933 Partial Time Regressions as Compared With Individual Trends. *Econometrica* 1:387-401.

Granger, Clive W. J.; and Hatanaka, M. 1964 *Spectral Analysis of Economic Time Series.* Princeton Univ. Press.

Grenander, Ulf 1957 Modern Trends in Time Series Analysis. *Sankhyā* 18:149-158.

Grenander, Ulf; and Rosenblatt, Murray 1957 *Statistical Analysis of Stationary Time Series.* New York: Wiley.

Hannan, Edward J. (1960) 1962 *Time Series Analysis.* New York: Wiley.

Hannan, Edward J. 1963 The Estimation of Seasonal Variations in Economic Time Series. *Journal of the American Statistical Association* 58:31-44.

Hart, B. I. 1942 Tabulation of the Probabilities for the Ratio of the Mean Square Successive Difference to the Variance. *Annals of Mathematical Statistics* 13:207-214.

Hemelrijk, Jan 1958 Distribution-free Tests Against Trend and Maximum Likelihood Estimates of Ordered Parameters. Institut International de Statistique, *Bulletin* 36:15-25.

Hotelling, Harold 1927 Differential Equations Subject to Error, and Population Estimates. *Journal of the American Statistical Association* 22:283-314.

Hurwicz, Leonard 1950 Least-squares Bias in Time Series. Pages 365-383 in Tjalling C. Koopmans (editor), *Statistical Inference in Dynamic Economic Models.* New York: Wiley.

Jenkins, G. M.; and Priestley, M. B. 1957 The Spectral Analysis of Time-series. *Journal of the Royal Statistical Society* Series B 19:1-12.

Johnston, John 1963 *Econometric Methods.* New York: McGraw-Hill.

Kendall, M. G. 1938 A New Measure of Rank Correlation. *Biometrika* 30:81-93. → This introduces the τ coefficient. Its use in testing for trend is discussed in Kendall 1948.

Kendall, M. G. (1948) 1963 *Rank Correlation Methods.* 3d ed., rev. & enl. New York: Hafner.

Kendall, M. G. 1953 The Analysis of Economic Time-series. Part 1: Prices. *Journal of the Royal Statistical Society* Series A 116:11-25.

Kendall, M. G.; and Stuart, Alan (1946) 1961 *The Advanced Theory of Statistics.* Volume 2: Inference and Relationship. New ed. New York: Hafner; London: Griffin. → The first edition was written by Kendall alone.

Koopmans, Tjalling C. (editor) 1950 *Statistical Inference in Dynamic Economic Models.* New York: Wiley.

Kuznets, Simon 1934 Time Series. Volume 14, pages 629-636 in *Encyclopaedia of the Social Sciences.* New York: Macmillan.

Malinvaud, Edmond (1964) 1966 *Statistical Methods in Econometrics.* Chicago: Rand McNally. → First published in French.

Mann, Henry B. 1945 Nonparametric Tests Against Trend. *Econometrica* 13:245-259.

Mann, Henry B.; and Wald, A. 1943 On the Statistical Treatment of Linear Stochastic Difference Equations. *Econometrica* 11:173-220.

Moran, P. A. P. 1948 Some Theorems on Time Series. Part 2: The Significance of the Serial Correlation Coefficient. *Biometrika* 35:255-260.

Moran, P. A. P. 1951 Estimation Methods for Evolutive Processes. *Journal of the Royal Statistical Society* Series B 13:141-146.

Moran, P. A. P. 1953 The Estimation of the Parameters of a Birth-and-death Process. *Journal of the Royal Statistical Society* Series B 15:241-245.

Morgenstern, Oskar 1961 A New Look at Economic Time Series Analysis. Pages 261-272 in Hugo Hegeland (editor), *Money, Growth, and Methodology, and Other Essays in Economics.* Lund (Sweden): Gleerup.

Nerlove, Marc 1964 Spectral Analysis of Seasonal Adjustment Procedures. *Econometrica* 32:241-286.

Orcutt, G. H.; and James, S. F. 1948 Testing the Significance of Correlation Between Time Series. *Biometrika* 35:397-413.

Parzen, Emanuel (1961) 1965 *Stochastic Processes.* San Francisco: Holden-Day.

Phillips, A. W. 1959 The Estimation of Parameters in Systems of Stochastic Differential Equations. *Biometrika* 46:67-76.

Prais, S. J.; and Houthakker, H. S. 1955 *The Analysis of Family Budgets With an Application to Two British Surveys Conducted in 1937-1939 and Their Detailed Results.* Cambridge Univ. Press.

Quenouille, M. H. 1947 A Large Sample Test for the Goodness of Fit of Autoregressive Schemes. *Journal of the Royal Statistical Society* Series A 110:123-129.

Quenouille, M. H. 1957 *The Analysis of Multiple Time-series.* New York: Hafner.

Rao, J. N. K.; and Tintner, Gerhard 1962 The Distribution of the Ratio of the Variances of Variate Differences in the Circular Case. *Sankhyā* 24A: 385-394.

Rao, J. N. K.; and Tintner, Gerhard 1963 On the Variate Difference Method. *Australian Journal of Statistics* 5:106-116.

Rhodes, E. C. 1940 Population Mathematics III. *Journal of the Royal Statistical Society* Series A 103:362-387.

Rosenblatt, Murray 1962 *Random Processes.* New York: Oxford Univ. Press.

Sargan, J. D. 1961 The Maximum Likelihood Estimation of Economic Relationships With Autoregressive Residuals. *Econometrica* 29:414-426.

Schuster, Arthur 1906 On the Periodicities of Sun-spots. Royal Society of London, *Philosophical Transactions* Series A 206:69-100.

Slutsky, Eugen E. (1927) 1937 The Summation of Random Causes as the Source of Cyclic Processes. *Econometrica* 5:105-146. → First published in Russian. Reprinted in 1960 in Slutsky’s *Izbrannye trudy.*

Symposium on Spectral Approach to Time Series. 1957 *Journal of the Royal Statistical Society* Series B 19: 1-63.

Symposium on Stochastic Processes. 1949 *Journal of the Royal Statistical Society* Series B 11:150-282.

Symposium on Time Series Analysis, Brown University, 1962 1963 *Proceedings.* Edited by Murray Rosenblatt. New York: Wiley.

Takács, Lajos 1960 *Stochastic Processes: Problems and Solutions.* New York: Wiley.

Theil, H.; and Nagar, A. L. 1961 Testing the Independence of Regression Disturbances. *Journal of the American Statistical Association* 56:793-806.

Tinbergen, Jan 1939 *Statistical Testing of Business-cycle Theories.* Volume 2: Business Cycles in the United States of America: 1919-1932. Geneva: League of Nations, Economic Intelligence Service.

Tintner, Gerhard 1940 *The Variate Difference Method.* Bloomington, Ind.: Principia.

Tintner, Gerhard 1952 *Econometrics.* New York: Wiley.

Tintner, Gerhard 1955 The Distribution of the Variances of Variate Differences in the Circular Case. *Metron* 17, no. 3/4:43-52.

Tintner, Gerhard 1960 *Handbuch der Ökonometrie.* Berlin: Springer.

von Neumann, John 1941 Distribution of the Ratio of the Mean Square Successive Difference to the Variance. *Annals of Mathematical Statistics* 12:367-395.

Wald, A.; and Wolfowitz, J. 1943 An Exact Test for Randomness in the Non-parametric Case Based on Serial Correlation. *Annals of Mathematical Statistics* 14:378-388.

Wallis, W. Allen; and Moore, Geoffrey H. 1941 A Significance Test for Time Series Analysis. *Journal of the American Statistical Association* 36:401-409.

Whittle, P. 1951 *Hypothesis Testing in Time Series Analysis.* Uppsala (Sweden): Almqvist & Wiksell.

Whittle, P. 1963 *Prediction and Regulation by Linear Least-square Methods.* London: English Universities Press.

Wilks, S. S. 1962 *Mathematical Statistics.* New York: Wiley.

Wold, Herman (1938) 1954 *A Study in the Analysis of Stationary Time Series.* 2d ed. Stockholm: Almqvist & Wiksell.

Zaycoff, Rashko 1936 V’rkhu razlaganeto na statisticheski redove po vreme na tri slagaemi (Über die Zerlegung statistischer Zeitreihen in drei Komponenten). Sofia, Universitet, Statisticheski Institut za Stopanski Prouchvaniia *Trudove* [1936] no. 4:1-22.

Zaycoff, Rashko 1937 V’rkhu izkliuchvaneto na sluchainata slagaema posr”dstvom’ metoda “Variate-difference” (Über die Ausschaltung der zufälligen Komponente nach der “Variate-difference”-Methode). Sofia, Universitet, Statisticheski Institut za Stopanski Prouchvaniia *Trudove* [1937] no. 1:1-46.

## II. ADVANCED PROBLEMS

Numerical data often occur in the form of a *time series,* that is, a sequence of observations on a variable taken either continuously or at regular intervals of time. As examples consider records of economic variables (prices, interest rates, sales, unemployment), meteorological records, electroencephalograms, and population and public health records. In contrast to much experimental data, the consecutive observations of a time series are not in general independent, and the fascination of time series analysis lies in the utilization of observed dependence to deduce the way the value of a variable (say, steel production) can be shown to be in part determined by the past values of the same variable, or of other variables (say, demand for automobiles).

In the so-called discrete time case, observations are taken at regular intervals of time. The common interval of time can be taken as the unit, so that the value of a variable *x* at time *t* can be denoted by *x_{t}*, where *t* takes the values …, -2, -1, 0, 1, 2, …. Of course, in practice one observes only a finite set of values, say *x_{1}*, *x_{2}*, …, *x_{n}*, but it is useful to imagine that the series can in principle extend indefinitely far back and forward in time. For this reason *t* is allowed to run to infinity in both directions. (Of course, sometimes one observes a phenomenon continuously, so that *x* is measured for all *t* rather than just at intervals; such a process is referred to as continuous. However, since the discrete case is by far the more frequent in the social sciences, this discussion will be limited to that case.)

Suppose that one has a model which explains how the *x_{t}* series should develop. The model is termed a *process* and is denoted by {*x_{t}*}; if some of the rules it specifies are probabilistic ones, it is called a *stochastic process.*

### Definition of a stationary process

One class of stochastic processes is of particular importance, both in practice and in theory: this is the class of *stationary processes.* A stationary process is one that is in a state of statistical equilibrium, so that its statistical pattern of behavior does not change with time. Formally, the requirement is that for any set of instants of time, t_{1}, t_{2}, …, t_{n}, and any time lag, *s*, the joint distribution of *x_{t1}*, *x_{t2}*, …, *x_{tn}* must be the same as that of *x_{t1+s}*, *x_{t2+s}*, …, *x_{tn+s}*. Thus, *x_{1}* and *x_{3}* must have the same univariate distribution, (*x_{1}*, *x_{2}*) and (*x_{5}*, *x_{6}*) must have the same bivariate distribution, and so on.

The assumption of stationarity is a strong one, but when it can be made it greatly simplifies understanding and analysis of a process. An intuitive reason for the simplification is that a stationary process provides a kind of hidden replication, a structure that does not deviate too far from the still more special assumptions of independence and identical distribution, assumptions that are ubiquitous in statistical theory. Whether the stationarity assumption is realistic for a particular process depends on how near the process is indeed to statistical equilibrium. For example, because most economies are evolving, economic series can seldom be regarded as stationary, but sometimes a transformation of the variable produces a more nearly stationary series (see the section on “smoothing” a series, below).

Stationarity implies that if *x*_{t} has an expectation, then this expectation must be independent of *t*, so that

$$E(x_t) = \mu, \tag{1}$$

say, for all *t.* Furthermore, if *x*_{t} and *x*_{t−s} have a covariance, then this covariance can depend only on the relative time lag, *s*, so that

$$\operatorname{cov}(x_t, x_{t-s}) = \Gamma_s. \tag{2}$$

The important function Γ_{s} is known as the *autocovariance function.*

Processes subject only to the restrictions (1) and (2), and not to any other of the restrictions that stationarity implies, are known as *wide-sense stationary processes.* They are important theoretically, but the idea of wide-sense stationarity is important also because in practice one is often content to work with first-order and second-order moments alone, if for no other reason than to keep computation manageable. This survey will be restricted to stationary processes in the strict sense, unless otherwise indicated.

Note that *t* need not necessarily mean time. One might, for example, be considering variations in thickness along a thread or in vehicle density along a highway; then *t* would be a spatial coordinate.

**Some particular processes.** One of the simplest processes of all is a sequence of independent random variables, {∊_{t}}. If the ∊_{t} have a common distribution, then the process is strictly stationary; this is the kind of sequence often postulated for the “residuals” of a regression or of a model in experimental design. If one requires of {∊_{t}} merely that its elements have constant mean and variance, *m* and σ^{2}, and be uncorrelated, then the process is a wide-sense stationary process. From now on {∊_{t}} will denote a process of just this latter type. Often such a process of “residuals” is presumed to have zero mean (that is, *m* = 0); however, this will not always be assumed here.

What is of interest in most series is just that the observations are *not* independent or even uncorrelated. A model such as

$$x_t = \alpha x_{t-1} + \epsilon_t \tag{3}$$

(a *first-order autoregression)* takes one by a very natural first step from an uncorrelated sequence, {∊_{t}}, to an *autocorrelated* sequence, {*x*_{t}}. Here α is a numerical constant whose value may or may not be known, and the term α*x*_{t−1} introduces a dependence between observations. Such a model is physically plausible in many situations; it might, for example, crudely represent the level of a lake year by year, α*x*_{t−1} representing the amount of water retained from the previous year and ∊_{t} a random inflow. A common type of econometric model is a vector version of (3), in which *x*_{t}, ∊_{t} are vectors and α is a matrix.

If observations begin at time *T,* then the series starts with *x*_{T}; *x*_{T+1} is α*x*_{T} + ∊_{T+1}, *x*_{T+2} is α^{2}*x*_{T} + α∊_{T+1} + ∊_{T+2}, and in general, for *t* ≥ *T,*

$$x_t = \alpha^{t-T} x_T + \sum_{k=0}^{t-T-1} \alpha^k \epsilon_{t-k}. \tag{4}$$

If |α| < 1 and the model has been operative from the indefinitely distant past, then one can let *T* tend to −∞ in (4) and obtain a solution for *x*_{t} in terms of the “disturbing variables” ∊_{t}:

$$x_t = \sum_{k=0}^{\infty} \alpha^k \epsilon_{t-k}. \tag{5}$$

The condition |α| < 1 is a necessary one if the infinite sum (5) is not to diverge and if model (3) is to be stable. (By “divergence” one understands in this case that the random variable

$$\xi_T = \sum_{k=0}^{t-T-1} \alpha^k \epsilon_{t-k}$$

does not converge in mean square as *T* tends to −∞; that is, there does not exist a random variable ξ such that *E*(ξ_{T} − ξ)^{2} → 0.)

The series {*x*_{t}} generated by (5) is stationary, and one verifies that

$$E(x_t) = \frac{m}{1-\alpha}, \tag{6}$$

$$\Gamma_s = \frac{\sigma^2 \alpha^{|s|}}{1-\alpha^2}, \tag{7}$$

where *m* and σ^{2} are, respectively, the mean and the variance of ∊_{t}. Note from (7) the exponential decay of autocorrelation with lag.
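As a rough numerical check of the passage from the first-order recursion to its one-sided sum, the following Python sketch iterates the recursion, compares the result with the finite closed form, and evaluates the lag-*s* autocovariance implied by geometrically decaying weights. The function names, the Gaussian disturbances, and the truncation at 200 terms are illustrative assumptions, not part of the original treatment.

```python
import random


def iterate_ar1(alpha, x_start, eps):
    """Apply x_t = alpha * x_{t-1} + eps_t successively; return the final value."""
    x = x_start
    for e in eps:
        x = alpha * x + e
    return x


def closed_form(alpha, x_start, eps):
    """alpha^(t-T) * x_T + sum_k alpha^k * eps_{t-k}, with eps listed oldest first."""
    n = len(eps)
    return alpha ** n * x_start + sum(alpha ** k * eps[n - 1 - k] for k in range(n))


def autocov_from_weights(alpha, sigma2, s, terms=200):
    """sigma^2 * sum_k alpha^k * alpha^(k+s), truncated after `terms` terms."""
    return sigma2 * sum(alpha ** k * alpha ** (k + s) for k in range(terms))


def autocov_formula(alpha, sigma2, s):
    """Closed-form autocovariance sigma^2 * alpha^|s| / (1 - alpha^2), |alpha| < 1."""
    if abs(alpha) >= 1:
        raise ValueError("a stable model requires |alpha| < 1")
    return sigma2 * alpha ** abs(s) / (1 - alpha ** 2)


rng = random.Random(1)                      # fixed seed, illustration only
eps = [rng.gauss(0.3, 1.0) for _ in range(25)]
print(iterate_ar1(0.6, 2.0, eps), closed_form(0.6, 2.0, eps))
for s in range(4):
    print(s, autocov_from_weights(0.6, 1.0, s), autocov_formula(0.6, 1.0, s))
```

The two columns printed for each lag should agree to many decimal places, which illustrates the geometric decay of the autocovariance with lag.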

A useful generalization of (3) is the *pth-order autoregression,*

$$\sum_{k=0}^{p} a_k x_{t-k} = \epsilon_t, \tag{8}$$

expressing *x*_{t} in terms of its own immediate past and a stationary residual, ∊_{t}. When *p* = 1, (8) and (3) are the same except for trivia of notation: *a*_{0} and *a*_{1} in (8) correspond to 1 and −α in (3). When *p* > 1, a process of type (8) can generate the quasi-periodic variations so often seen in time series. Of course, this is not the only model that can generate such quasi-periodic oscillations (one might, for example, consider a nonlinear process or a Markov chain), but it is probably the simplest type of model that does so.

Corresponding to the passage from (3) to (5), process (8) can be given the moving-average representation

$$x_t = \sum_{k=0}^{\infty} b_k \epsilon_{t-k}, \tag{9}$$

which represents *x*_{t} as a linear superposition of past disturbances. The sequence *b*_{k} is the *transient response* of the system to a single unit disturbance.

The relation between the coefficients *a*_{k} and *b*_{k} can be expressed neatly in generating function form:

$$B(z) = \sum_{k=0}^{\infty} b_k z^k = \frac{1}{A(z)} = \Bigl[\sum_{k=0}^{p} a_k z^k\Bigr]^{-1}. \tag{10}$$

For example, if *z* is set equal to 0, then *b*_{0} = 1/*a*_{0}; if the functions are differentiated once and *z* is again set equal to 0, then *b*_{1} = −*a*_{1}/*a*_{0}^{2}. (Discussions of time series sooner or later require some knowledge of complex variables. An introduction to the subject is given in MacRobert 1917.)
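The generating function relation can be turned into a simple recursion: equating coefficients of powers of *z* in *A*(*z*)*B*(*z*) = 1 gives each *b*_{k} from the earlier ones. The sketch below is one way to do this in Python; the function name `transient_response` and the example coefficients are hypothetical.

```python
def transient_response(a, n_terms):
    """First n_terms coefficients b_k of B(z) = 1/A(z), where A(z) = sum_k a[k] z^k.

    Equating coefficients in A(z) * B(z) = 1 gives
        a[0] * b_0 = 1,
        sum_{j=0}^{min(k, p)} a[j] * b_{k-j} = 0   for k >= 1.
    """
    if a[0] == 0:
        raise ValueError("a[0] must be nonzero")
    b = [1.0 / a[0]]
    p = len(a) - 1
    for k in range(1, n_terms):
        acc = sum(a[j] * b[k - j] for j in range(1, min(k, p) + 1))
        b.append(-acc / a[0])
    return b


# First-order case with a = (1, -0.5): the response is the sequence of powers of 0.5.
print(transient_response([1.0, -0.5], 5))
# A case with a[0] != 1, to compare with b_0 = 1/a_0 and b_1 = -a_1/a_0^2.
print(transient_response([2.0, 1.0], 2))
```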

The necessary and sufficient condition for series (9) to converge to a proper random variable and for the resulting {*x*_{t}} series to be stationary is that *A*(*z*), considered as a function of a complex variable *z*, have all its zeros outside the unit circle; this is again a stability condition on the relation (8). If, however, the stability condition is not fulfilled, relations such as (8) can still provide realistic models for some of the nonstationary series encountered in practice.
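For low orders the stability condition can be checked directly by locating the zeros of *A*(*z*). The following sketch handles only *p* = 1 and *p* = 2 (a deliberate simplification); the function names are illustrative.

```python
import cmath
import math


def zeros_of_A(a):
    """Zeros of A(z) = a[0] + a[1] z (+ a[2] z^2); only orders 1 and 2 are handled."""
    if len(a) == 2:
        return [-a[0] / a[1]]
    if len(a) == 3:
        a0, a1, a2 = a
        disc = cmath.sqrt(a1 * a1 - 4.0 * a2 * a0)
        return [(-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2)]
    raise ValueError("this sketch handles p = 1 or p = 2 only")


def is_stable(a):
    """True when every zero of A(z) lies strictly outside the unit circle."""
    return all(abs(z) > 1.0 for z in zeros_of_A(a))


rho, theta = 0.9, math.pi / 6
a = [1.0, -2.0 * rho * math.cos(theta), rho ** 2]   # zeros at exp(+-i*theta) / rho
print(is_stable(a), [abs(z) for z in zeros_of_A(a)])
print(is_stable([1.0, -1.5, 0.5]))                   # has a zero on the unit circle
```

For higher orders a general polynomial root finder would be needed; none is assumed here.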

A relation such as (9) is said to express {*x*_{t}} as a *moving average* of {∊_{t}}. There are, of course, many other important types of process, particularly the general Markov process [*see* Markov chains] and the point processes (see Bartlett 1963), but the simple linear processes described in this section are typical of those that are useful for many time series analyses.

### Autocovariance function

The autocovariance function, Γ_{s}, defined in (2) gives a qualitative idea of the decay of statistical dependence in the process with increasing time lag; a more detailed examination of it can tell a good deal about the structure of the process.

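A key result is the following: Suppose that {*x*_{t}} is a moving average of a process {*y*_{t}},

$$x_t = \sum_{k} b_k y_{t-k}, \tag{11}$$

where the summation is not necessarily restricted to nonnegative values of *k,* although in most physical applications it will be. Denote the autocovariances of the two processes by Γ_{s}^{(x)} and Γ_{s}^{(y)}. Then

$$\Gamma_s^{(x)} = \sum_j \sum_k b_j b_k \Gamma_{s-j+k}^{(y)}, \tag{12}$$

so that

$$\sum_s \Gamma_s^{(x)} z^s = \Bigl(\sum_j b_j z^j\Bigr)\Bigl(\sum_k b_k z^{-k}\Bigr)\Bigl(\sum_s \Gamma_s^{(y)} z^s\Bigr), \tag{13}$$

a generating function relation that will be written in the form

$$g_x(z) = B(z)\,B(z^{-1})\,g_y(z). \tag{14}$$

If relation (11) defines a stationary process of finite variance, then (14) is valid for |*z*| = 1 at least.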

A deduction from (14) and (10) is that for the autoregression (8) the autocovariance generating function is

$$g(z) = \frac{\sigma^2}{A(z)\,A(z^{-1})}. \tag{15}$$

By calculating the coefficient of *z*^{s} in the expansion of this function on the circle |*z*| = 1 one obtains Γ_{s}; for process (3) one obtains the result (7) as before; for the second-order process with *a*_{0} = 1 one obtains, for *s* ≥ 0,

$$\Gamma_s = \frac{\sigma^2}{(\alpha-\beta)(1-\alpha\beta)}\left[\frac{\alpha^{s+1}}{1-\alpha^2} - \frac{\beta^{s+1}}{1-\beta^2}\right], \tag{16}$$

where α^{−1}, β^{−1} are the zeros of *A*(*z*). If these zeros are complex, say, α, β = ρ exp(±*i*θ), then (16) has the oscillatory form

$$\Gamma_s = \frac{\sigma^2 \rho^{s}\,\bigl[\sin\bigl((s+1)\theta\bigr) - \rho^2 \sin\bigl((s-1)\theta\bigr)\bigr]}{(1-\rho^2)\,\sin\theta\,\bigl(1 - 2\rho^2\cos 2\theta + \rho^4\bigr)}. \tag{17}$$

The autocovariance, Γ_{s}, has a peak near a lag, *s*, of approximately 2π/θ, indicating strong positive correlation between values of *x*_{t} and *x*_{t−s} for this value of lag. The nearer the damping factor, ρ, lies to unity, the stronger the correlation. This is an indication of what one might call a quasi-periodicity in the *x*_{t} series, of “period” 2π/θ, the kind of irregular periodicity that produces the “trade cycles” of economic series. Such disturbed periodicities are no less real for not being strict.

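Either from (15) or from the fact that

$$\operatorname{cov}(\epsilon_t, x_{t-s}) = \begin{cases}\sigma^2/a_0, & s = 0,\\ 0, & s = 1, 2, \ldots,\end{cases}$$

it can be shown that for the autoregression (8)

$$\sum_{k=0}^{p} a_k \Gamma_{s-k} = \begin{cases}\sigma^2/a_0, & s = 0,\\ 0, & s = 1, 2, \ldots\end{cases} \tag{18}$$

These are the Yule-Walker relations, which provide a convenient way of calculating the Γ_{s} from the coefficients, *a*_{k}. This procedure will be reversed below, and (18) will be used to estimate the *a*_{k} from estimates of the autocovariances.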
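To illustrate the reversed use of the Yule-Walker relations, the sketch below solves them for the autoregressive coefficients given a set of autocovariances. It assumes the normalization *a*_{0} = 1 and uses a small hand-written Gaussian elimination so that it is self-contained; the function names are hypothetical.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system A x = b."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def yule_walker(gamma, p):
    """Return ([a_0, ..., a_p], sigma2) with a_0 = 1 from autocovariances gamma[0..p].

    Solves sum_{k=1}^{p} a_k * gamma[|s-k|] = -gamma[s] for s = 1..p, then uses
    the s = 0 relation sigma2 = sum_{k=0}^{p} a_k * gamma[k].
    """
    A = [[gamma[abs(s - k)] for k in range(1, p + 1)] for s in range(1, p + 1)]
    rhs = [-gamma[s] for s in range(1, p + 1)]
    a = [1.0] + solve_linear(A, rhs)
    sigma2 = sum(a[k] * gamma[k] for k in range(p + 1))
    return a, sigma2


# Autocovariances of a first-order scheme with coefficient 0.5 and unit residual variance.
gamma = [0.5 ** s / (1 - 0.25) for s in range(3)]
print(yule_walker(gamma, 1))
print(yule_walker(gamma, 2))
```

Fitting order 2 to autocovariances generated by a first-order scheme should return a second coefficient near zero, which is a quick sanity check on the solver.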

### Spectral theory

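Some of the first attempts at time series analysis concerned the prediction of tidal variation of coastal waters, a problem for which it was natural to consider a model of the type

$$x_t = \sum_j A_j \cos(\omega_j t + \alpha_j) + \epsilon_t = \sum_j \bigl[B_j \cos(\omega_j t) + C_j \sin(\omega_j t)\bigr] + \epsilon_t. \tag{19}$$

That is, the series is represented as the sum of a number of harmonic components and an uncorrelated residual. If the frequencies, ω_{j} (corresponding to lunar and diurnal variations and so forth), are known, so that the *A*_{j} and α_{j} are to be estimated, then on the basis of an observed series *x*_{1}, *x*_{2}, …, *x*_{n} the least square estimators of the coefficients *B*_{j} and *C*_{j} are approximately

$$\hat B_j = \frac{2}{n}\sum_{t=1}^{n} x_t \cos(\omega_j t), \qquad \hat C_j = \frac{2}{n}\sum_{t=1}^{n} x_t \sin(\omega_j t).$$

The approximation lies in the use of

$$\sum_{t=1}^{n} P_t Q_t \approx \begin{cases} n/2, & P = Q,\\ 0, & P \neq Q,\end{cases}$$

where *P*_{t} and *Q*_{t} are any two of the functions of time cos(ω_{j}*t*), sin(ω_{j}*t*) (*j* = 1, 2, …). In this approximation, terms of relative order *n*^{−1} are neglected. The squared amplitude, *A*_{j}^{2} = *B*_{j}^{2} + *C*_{j}^{2}, is thus estimated approximately by

$$\hat A_j^2 = \frac{4}{n^2}\Bigl[\Bigl(\sum_{t} x_t \cos \omega_j t\Bigr)^2 + \Bigl(\sum_{t} x_t \sin \omega_j t\Bigr)^2\Bigr]. \tag{20}$$

This can also be written in the form

$$\hat A_j^2 = \frac{4}{n^2}\sum_{s=1}^{n}\sum_{t=1}^{n} x_s x_t \cos\bigl[\omega_j (s-t)\bigr],$$

which is mathematically (although not computationally) convenient.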

The importance of *Â*_{j}^{2} is that it measures the decrease in residual sum of squares (that is, the improvement in fit of model (19)) obtained by fitting terms in cos(ω_{j}*t*) and sin(ω_{j}*t*). The larger this quantity, the greater the contribution that the harmonic component of frequency ω_{j} makes to the variation of *x*_{t}. For this reason, if the ω_{j} are unknown, one can search for periodicities (see below) by calculating a quantity analogous to (20) for variable ω: the periodogram

An unusually large value of f_{n}(ω) at a particular frequency suggests the presence of a harmonic component at that frequency.
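A search for a periodicity of this kind can be sketched as follows. The scaling used here, |Σ *x*_{t} exp(*i*ω*t*)|² divided by *n*, is an assumption of the sketch, since the normalization of the periodogram formula is not recoverable from the text; the location of the peak does not depend on it. The series is a noise-free cosine chosen purely for illustration.

```python
import cmath
import math


def periodogram(x, omega):
    """|sum_{t=1}^{n} x_t exp(i*omega*t)|^2 / n  (the 1/n scaling is an assumption)."""
    n = len(x)
    total = sum(x[t - 1] * cmath.exp(1j * omega * t) for t in range(1, n + 1))
    return abs(total) ** 2 / n


n = 200
omega0 = 2 * math.pi * 20 / n                      # true frequency, on the search grid
x = [3.0 * math.cos(omega0 * t + 0.4) for t in range(1, n + 1)]

grid = [2 * math.pi * j / n for j in range(1, n // 2)]
peak = max(grid, key=lambda w: periodogram(x, w))

# Amplitude estimate from the squared-amplitude formula (4/n^2) * |sum|^2.
amplitude = math.sqrt(4.0 * periodogram(x, peak) / n)
print(peak, omega0, amplitude)
```

With real data the residual term would make the ordinates irregular, so a single large ordinate should be judged against its neighbours rather than taken at face value.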

It is an empirical fact that few series are of the type (19): in general, one achieves much greater success by fitting structural models such as an autoregressive one. Even for the autoregressive model, however, or, indeed, for any stationary process, an analogue of representation (19) called the spectral representation holds. Here the sum is replaced by an integral. This integral gives an analysis of *x _{t}* into different frequency components; for stationary series the amplitudes of different components are uncorrelated. In recent work the spectral representation turns out to be of central importance: the amplitudes of frequency components have simple statistical properties and transform in a particularly simple fashion if the process is subjected to a moving-average transformation (see eq. (26)); the frequency components themselves are often of physical significance.

So even in the general case the periodogram f_{n}(ω) provides an empirical measure of the amount of variation in the series around frequency ω. Its expected value for large *n,* the spectral density function *(s.d.f.),*

provides the corresponding theoretical measure for a given process.

If the x_{t} have been reduced to zero mean (which, in fact, affects ϕ(ω) only for ω= 0), then the spectral density function becomes

and, as can be seen from (13) and (14), this is only a trivial modification of the autocovariance generating function, *g(z),* already encountered. In fact,

There is a relation reciprocal to (23), the spectral representation of the autocovariance,

In more general cases ϕ(ω) may fail to exist for certain values, and ϕ(ω)dω must be replaced by dF(ω)in (25), where F(ω) is the nondecreasing spectral distribution function.

An important property of spectral representations is the simplicity of their transformation under moving-average transformations of the process; relation (14) can be rewritten

$$\phi_x(\omega) = B[\exp(i\omega)]\,B[\exp(-i\omega)]\,\phi_y(\omega) = \bigl|B[\exp(i\omega)]\bigr|^2\,\phi_y(\omega), \tag{26}$$

showing that the effect of the moving-average operation (11) is to scale each frequency component up or down individually, by a factor |*B*[exp(*i*ω)]|^{2}.

So, if for the autoregression with spectral density function determined by (15) the polynomial *A*(*z*) has zeros at ρ^{−1} exp(±*i*θ), and ρ is near unity, then ϕ_{x}(ω) will have peaks near ω = ±θ, indicating a quasi-periodicity of “period” 2π/θ.

Note that for an uncorrelated series ϕ(ω) is constant: all frequencies are equally represented on the average. For an autoregressive series ϕ(ω) is variable but finite; this is an example of a process with continuous spectrum. A process of type (19) has a constant background term in ϕ owing to the “noise,” ∊_{t}, but also has infinite peaks at the values ω = ±ω_{j}, these constituting a line spectrum component.

For a discrete series one need only consider frequencies in the range −π < ω ≤ π, since with observations at unit intervals of time the frequencies 2π*s* + ω (*s* integral) cannot be distinguished one from the other. This is the aliasing effect, which can occasionally confuse an analysis. If, however, the series has little variation of frequency greater than π (that is, of period less than two time units), then the effect is not serious, for the higher frequencies that could cause confusion hardly occur.
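The aliasing effect is easy to verify numerically: at integer times, a cosine of frequency ω and one of frequency ω + 2π*s* take the same values. The helper names below are illustrative.

```python
import math


def principal_alias(omega):
    """Map a frequency to its representative in the range (-pi, pi]."""
    w = math.fmod(omega, 2.0 * math.pi)
    if w > math.pi:
        w -= 2.0 * math.pi
    elif w <= -math.pi:
        w += 2.0 * math.pi
    return w


def max_gap(omega_a, omega_b, n=50):
    """Largest |cos(omega_a*t) - cos(omega_b*t)| over the integer times t = 0..n-1."""
    return max(abs(math.cos(omega_a * t) - math.cos(omega_b * t)) for t in range(n))


high = 1.2 * math.pi                  # period shorter than two time units
low = principal_alias(high)           # its alias inside (-pi, pi]
print(low / math.pi, max_gap(high, low))
```

The printed gap is zero up to rounding error, so the two frequencies cannot be told apart from observations at unit intervals.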

**Effect of “smoothing” a series: a caution.** In order to isolate the “trend” in a series, {*x*_{t}}, it has sometimes been common to derive a smoothed series, {*x̄*_{t}}, by an averaging operation such as

$$\bar x_t = \frac{1}{2m+1}\sum_{k=-m}^{m} x_{t+k}, \tag{27}$$

although more elaborate and more desirable types of average are often used.

In terms of frequency, the effect of the operation (27) is to multiply the spectral density function by a factor *B*[exp(*i*ω)]*B*[exp(−*i*ω)], where

$$B[\exp(i\omega)] = \frac{\sin\bigl[(m+\tfrac12)\omega\bigr]}{(2m+1)\sin(\tfrac12\omega)}. \tag{28}$$

This function is graphed as the dotted curve (1) in Figure 1 and is known as the gain function of the transformation (27).

**Figure 1.**

**a. Curve (2) represents the gain function for the ideal filter passing periods greater than five time units (frequencies less than 2π/5).**

**b. Curve (1) represents the gain function for a five-term uniform average (formula (27) with *m* = 2).**

**c. Curve (3) represents the gain function for a finite moving average approximating the ideal filter (the average with weights given by (30) using ω_{0} = 2π/5, but truncated at *k* = ±10).**

Now, if the purpose of “trend extraction” is to eliminate the high frequencies from a series, that is, to act as a “low-pass filter,” then the ideal gain factor would correspond to the square-shouldered solid curve (2) in Figure 1. (A gain factor is sometimes referred to as a “window.”) The function (28) obviously departs considerably from this ideal.
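How far a uniform average departs from the ideal can be seen by evaluating the factor *B*[exp(*i*ω)]*B*[exp(−*i*ω)] by which it multiplies the spectral density. The sketch below does this for a (2*m* + 1)-term equally weighted average; representing the weights as a dictionary keyed by lag is a choice made here for convenience, and the function names are hypothetical.

```python
import cmath
import math


def gain(weights, omega):
    """|B(e^{i*omega})|^2 for B(z) = sum_k b_k z^k, with weights given as {k: b_k}."""
    B = sum(b * cmath.exp(1j * omega * k) for k, b in weights.items())
    return abs(B) ** 2


def uniform_weights(m):
    """Equal weights 1/(2m+1) on the lags -m, ..., m."""
    return {k: 1.0 / (2 * m + 1) for k in range(-m, m + 1)}


w = uniform_weights(2)                       # five-term average
for j in range(0, 11):
    omega = math.pi * j / 10
    print(round(omega, 3), round(gain(w, omega), 4))
```

The printed values fall to zero at ω = 2π/5 and then rise again into side lobes at higher frequencies, so the five-term average does not cleanly remove the high-frequency variation.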

To obtain a moving-average transformation

$$\bar x_t = \sum_{k=-\infty}^{\infty} b_k x_{t-k} \tag{29}$$

which acts as a perfect low-pass filter for the range −ω_{0} < ω < ω_{0} one must choose

$$b_k = \frac{\sin(k\omega_0)}{\pi k} \quad (k \neq 0), \qquad b_0 = \frac{\omega_0}{\pi}. \tag{30}$$

The fact that these coefficients decrease rather slowly means that appreciable truncation of the sum will be necessary (probably at a *k*-value equal to a multiple of 2π/ω_{0}), but the resultant operation will still be a considerable improvement over (27). The gain function of a truncated smoothing operator is illustrated as the dashed curve (3) in Figure 1.
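A truncated version of such a low-pass average can be sketched as below. The weights sin(*k*ω_{0})/(π*k*), with *b*_{0} = ω_{0}/π, are the standard Fourier coefficients of a rectangular pass band and are taken here as an assumption about the intended formula; the cutoff 2π/5 and the truncation point 10 follow the figure notes, and the function names are hypothetical.

```python
import math


def ideal_lowpass_weights(omega0, K):
    """Weights b_k = sin(k*omega0)/(pi*k), b_0 = omega0/pi, truncated at |k| <= K."""
    w = {0: omega0 / math.pi}
    for k in range(1, K + 1):
        w[k] = w[-k] = math.sin(k * omega0) / (math.pi * k)
    return w


def transfer(weights, omega):
    """B(e^{i*omega}); real-valued here because the weights are symmetric in k."""
    return sum(b * math.cos(omega * k) for k, b in weights.items())


w = ideal_lowpass_weights(2 * math.pi / 5, 10)
for j in range(0, 11):
    omega = math.pi * j / 10
    print(round(omega, 3), round(transfer(w, omega) ** 2, 4))
```

The squared transfer function is close to one well inside the pass band and close to zero well outside it, with a ripple near the cutoff caused by the truncation.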

As was first pointed out by Slutsky (1927), injudicious smoothing procedures can actually have the effect of introducing periodicities, just because the gain function of the averaging operation has a peak at a frequency where it should not. One should always be quite clear about the effect of one’s “smoothing” operations, and the way to do this is to graph the corresponding gain factor as a function of frequency.

Attempts are sometimes made to “eliminate” trend in a series by the method of *variate difference*, that is, by calculating series such as

$$\Delta x_t = x_t - x_{t-1}, \qquad \Delta^2 x_t = x_t - 2x_{t-1} + x_{t-2}.$$

This measure can have a rough success, in that it largely eliminates deviations from stationarity, although a more fundamental approach would be to fit a model which would actually generate the observed nonstationarity, such as an unstable autoregression. In any case, in evaluating the series obtained after differencing, one must remember that the application of a *p*-fold difference, Δ^{p}, has the effect of multiplying the spectral density function of a stationary series by (2 sin ½ω)^{2p}.
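The factor quoted for a *p*-fold difference can be checked directly, since differencing is a moving average with weights 1 and −1. The sketch below applies repeated differencing to a short list and compares |1 − exp(−*i*ω)|^{2p} with (2 sin ½ω)^{2p}; the function names are illustrative.

```python
import cmath
import math


def difference(x, p=1):
    """Apply the first difference x_t - x_{t-1} p times (each pass shortens the list)."""
    for _ in range(p):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


def difference_gain(omega, p):
    """|1 - e^{-i*omega}|^(2p): the factor applied to the spectral density."""
    return abs(1 - cmath.exp(-1j * omega)) ** (2 * p)


print(difference([1, 4, 9, 16, 25], 2))       # second differences of a quadratic trend
for omega in (0.1, 1.0, math.pi):
    print(omega, difference_gain(omega, 2), (2 * math.sin(omega / 2)) ** 4)
```

The factor is small near ω = 0 and large near ω = π, so differencing suppresses slow movements but amplifies rapid ones.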

### Sample analogues

Consider now the problem of inference from a sample of *n* consecutive observations, *x*_{1}, *x*_{2}, …, *x*_{n}.

**Autocovariance function.** Define the uncorrected lagged product-sum,

If *E*(*x _{t}*) = 0, then

certainly provides an unbiased estimate of Γ_{s}, and, under wide conditions, also a consistent one. However, in general the mean will be nonzero and unknown. The sample autocovariance is in such cases naturally measured by

and *Ex* is estimated by

Minor modifications of (33) will be found in the literature. Expression (33) will in general provide a biased but consistent estimate of Γ_{s} with a sampling variance of the order of (*n* − *s*)^{−1}. For a given *n* the variability of *C*_{s} thus increases with *s*; fortunately, the earlier autocovariances generally contain most of the information.

In order to eliminate problems of scale, investigators sometimes work with the autocorrelation coefficient

$$r_s = \frac{C_s}{C_0} \tag{35}$$

rather than with *C*_{s}, but this is not essential.
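A minimal sketch of these sample quantities follows. It uses one common convention, the mean-corrected lagged product-sum divided by *n* − *s*; this is an assumption, since the exact form of the sample autocovariance in the text is not recoverable and, as noted, minor variants exist. The function names are hypothetical.

```python
def sample_autocovariance(x, s):
    """Mean-corrected lagged product-sum divided by (n - s); one of several conventions."""
    n = len(x)
    if not 0 <= s < n:
        raise ValueError("lag must satisfy 0 <= s < n")
    xbar = sum(x) / n
    return sum((x[t] - xbar) * (x[t - s] - xbar) for t in range(s, n)) / (n - s)


def sample_autocorrelation(x, s):
    """Autocovariance at lag s scaled by the lag-zero autocovariance."""
    return sample_autocovariance(x, s) / sample_autocovariance(x, 0)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
print([sample_autocovariance(x, s) for s in range(3)])
print([sample_autocorrelation(x, s) for s in range(3)])
```

Because only *n* − *s* products contribute at lag *s*, estimates at large lags rest on few terms and should be read with corresponding caution.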

**Spectral density function.** The sample analogue of the spectral density function (the periodogram, formula (21)) was introduced before the spectral density function itself. Note from (21) that one can write

If the series has already been corrected for the mean (so that one works with *x*_{t} − *x̄* rather than *x*_{t}), then (36) will become

Whether one uses formula (36) or formula (37) is not of great consequence. A constant nonzero mean can be regarded as a harmonic component of zero frequency, so the two functions (36) and (37) will differ only near the origin.

The sampling variability of *f*_{n}(ω) does not decrease with increasing *n,* and *f*_{n}(ω) is not a consistent estimator of φ(ω). The problem of finding a consistent estimator will be discussed below.

### Fitting and testing autoregressive models

The autoregressive model (8) is a useful trial model, since it usually explains much of the variation and often has some physical foundation. Furthermore, its test theory is typical of a much more general case. The first problem is that of the actual fitting, the estimation of the parameters *a*_{k} and σ^{2}; the second problem is that of testing the fit of the model.

If the ∊_{t} and *x*_{t} have means 0 and μ, respectively, then the model (8) must be modified slightly to

$$\sum_{k=0}^{p} a_k (x_{t-k} - \mu) = \epsilon_t. \tag{38}$$

One usually assumes the ∊_{t} normally distributed, not such a restrictive assumption as it appears. To a first approximation the means and variances of autocorrelations are unaffected by nonnormality (see Whittle 1954, p. 210), and estimates of parameters such as the autoregressive coefficients, *a*_{k}, should be similarly robust. For normal processes the log-likelihood of the sample *x*_{1}, *x*_{2}, …, *x*_{n} is, for large *n,*

Maximizing this expression with respect to *μ,* one obtains the estimator

The second approximate equality follows if one neglects the difference between the various averages, that is, if, as is often done, an end effect is neglected. Thus, the maximum likelihood estimator of μ is approximately the usual sample arithmetic mean, despite the dependence between observations. Inserting this estimator in (39), one finds

so that the maximum likelihood estimators of the remaining parameters are determined approximately by the relations

Note the analogy between (42) and (43) and the Yule-Walker relations (18).

To test prescribed values of the *a*_{k} one can use the fact that the estimators *â*_{k} are asymptotically normally distributed with means *a*_{k} (respectively) and a covariance matrix

(Here [α_{jk}] denotes a *p* × *p* matrix with typical element α_{jk}; *j*, *k* = 1, 2, …, *p.*) This result holds if the ∊_{t} are independently and identically distributed, with a finite fourth moment. (See Whittle 1954, p. 214.)

However, a more satisfactory and more versatile approach to the testing problem is provided by use of the Wilks λ-ratio. This will be described in a more general setting below; for the present, note the following uses.

To test whether a given set of coefficients *a*_{1}, *a*_{2}, …, *a*_{p} are zero, treat

as a χ^{2} variable with *p* degrees of freedom *(df).* Here σ̂_{p}^{2} has been used to denote the estimator (43), emphasizing the order *p* assumed for the autoregression.

To test whether an autoregression of order *p* gives essentially as good a fit as one of order *p* + *q,* treat

as a χ^{2} variable with *q df.* In both cases large values of the test statistic are critical.
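One way to carry out such an order comparison is sketched below. It computes the residual variances of successive autoregressive fits by the Levinson-Durbin recursion (a technique swapped in here for convenience, not taken from the text) and forms *n* log(σ̂_{p}^{2}/σ̂_{p+q}^{2}). The multiplier *n* is an assumed large-sample form; the exact multiplier in the original test statistic is not recoverable. The function names are hypothetical.

```python
import math


def levinson_durbin(c, max_order):
    """Successive autoregressive fits to autocovariances c[0..max_order].

    Returns (a, v): a = [1, a_1, ..., a_max_order] for the highest order, in the
    convention sum_k a_k x_{t-k} = eps_t, and v[p] = residual variance at order p.
    """
    a = [1.0]
    v = [c[0]]
    for p in range(1, max_order + 1):
        k = -sum(a[j] * c[p - j] for j in range(p)) / v[-1]
        a = [1.0] + [a[j] + k * a[p - j] for j in range(1, p)] + [k]
        v.append(v[-1] * (1.0 - k * k))
    return a, v


def order_test_statistic(c, n, p, q):
    """n * log(v_p / v_{p+q}); to be referred to chi-squared with q degrees of freedom."""
    _, v = levinson_durbin(c, p + q)
    return n * math.log(v[p] / v[p + q])


# Autocovariances of a first-order scheme with coefficient 0.5 and unit residual variance.
c = [0.5 ** s / 0.75 for s in range(3)]
print(levinson_durbin(c, 2))
print(order_test_statistic(c, 100, 0, 1))   # order 0 against order 1: large
print(order_test_statistic(c, 100, 1, 1))   # order 1 against order 2: near zero
```

In practice `c` would hold sample autocovariances, and the statistic would be compared with a tabulated χ² point (for example 3.841 at the 5 per cent level when *q* = 1).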

### Fitting and testing more general models

The approximate expression (41) for the log-likelihood (maximized with respect to the mean) can be generalized to any process for which the reciprocal of the spectral density function can be expanded in a Fourier series,

The generalized expression is

where σ^{2} is the “prediction variance,” the conditional variance of *x*_{t} given the values of *x*_{t−1}, *x*_{t−2}, …. (See Whittle 1954.)

The sum in (48) cannot really be taken as infinite; in most practical cases the coefficients γ_{s} converge reasonably fast to zero as *s* increases, and the sum can be truncated.

Another way of writing (48) is

In general it will be easier to calculate the sum over autocovariances in (48) than to calculate the integral over the periodogram in (49), but sometimes the second approach is taken.

If the model depends on a number of parameters, θ_{1}, θ_{2}, …, θ_{p} (of which σ^{2} will usually be one), then φ(ω) will also depend on these, and the maximum likelihood estimators, θ̂_{j}, are obtained approximately by maximizing either of the expressions (48) and (49). The covariance matrix of the estimators is given asymptotically by

(See Whittle 1954.) Thus, for the moving-average process

with |β| < 1, one finds that

The maximum likelihood estimator of β is obtained by minimizing (52), and expression (52) with β̂ substituted for β provides the maximum likelihood estimator of σ^{2} = var(∊). One finds from (50) that the two estimators are asymptotically uncorrelated, with

Practical techniques for the calculation of the maximum likelihood estimators in more general cases have been worked out by Durbin (1959) and Walker (1962).

Tests of fit can be based upon the λ-ratio criterion. Let σ̂_{p}^{2} denote the maximum likelihood estimator of σ^{2} when parameters θ_{1}, θ_{2}, …, θ_{p} (one of these being σ^{2} itself) are fitted and the values of parameters θ_{p+1}, θ_{p+2}, …, θ_{p+q} are prescribed. Thus, σ̂_{p+q}^{2} will be the maximum likelihood estimator of σ^{2} when all *p* + *q* parameters are fitted. A test of the prescribed values of θ_{p+1}, …, θ_{p+q} is obtained by treating

as a χ^{2} variable with *q df.*

### Multivariate processes

In few realistic analyses is one concerned with a single variable; in general, one has several, so that **x**_{t} must be considered a vector of *m* jointly stationary variables, (*x*_{1t}, *x*_{2t}, …, *x*_{mt}).

There is a generalization (Whittle 1954) of expression (49) for the log-likelihood in such cases, but the only case considered here is that of a multivariate autoregression,

where the **a**_{k} are *m* × *m* matrices with **a**_{0} = **I**, and

This last assumption states that the vector residuals are mutually uncorrelated but that the covariance matrix of a single vector residual is **V.** As before, the maximum likelihood estimator of the mean vector, **μ,** is approximately **x̄**, and with this inserted the following generalization of (41) results:

Here **C**_{s} is the *m* × *m* matrix whose (*jk*)th element is the sample covariance of *x*_{jt} and *x*_{k,t−s}, that is, the (*jk*)th element of **C**_{s} is

and tr **A** denotes the sum of the diagonal elements of a matrix **A.** The maximum likelihood estimators are given by

For the important case *p* = 1, which can be written

these become

Estimator (62) will, of course, be modified if certain elements of α are known and need not be estimated. Tests of fit can be based on the λ-ratio criterion as before, with expression (57) used for the log-likelihood.

In econometric work, models of type (61) are particularly important. One minor complication is that exogenous variables may also occur in the right-hand side (see (73), below); exogenous variables are variables which are regarded as external to the system and which need not be explained. For example, in a model of a national economy, variables such as overseas prices and technical progress might be regarded as exogenous. A much more severe complication is that of simultaneity: that **x**_{t} may be represented as a regression upon some of its own elements as well as upon **x**_{t−1} and exogenous variables. This latter difficulty has led to an extensive literature, which will not be discussed here (for a general reference, see Johnston 1963).

**Regression.** An important special type of multivariate process is the regression

where *x*_{t} (now assumed to be scalar) is regarded as linearly dependent upon a number of variables, *u*_{jt}, with a stationary residual, η_{t} (in general autocorrelated). The processes {*u*_{jt}} may be other stationary processes, even lagged versions of the {*x*_{t}} process itself, or deterministic sequences such as *t*^{r} or sin(ω*t*).

Simple and unbiased estimators of the β_{j} are the least square estimators, *b*_{j}, obtained by minimizing the residual sum of squares and determined by the linear equations

where the notation

has been used for the simple product-sum. The covariance matrix of these estimators is

where [*a*_{jk}] indicates an *r* × *r* matrix with typical element *a*_{jk}, and Γ_{s} is the autocovariance function of {η_{t}}. If the processes {*u*_{jt}} are jointly stationary, then one can write

and (67) can be rewritten

where **M**(ω) = [*M*_{jk}(ω)], **M** = **M**(π) − **M**(−π), and φ(ω) is the spectral density function of {η_{t}}.

This principle can be extended even to mildly nonstationary processes, such as polynomials in *t* (Grenander & Rosenblatt 1957).

The maximum likelihood estimators, β̂_{j}, will in general have smaller variances; these estimators are obtained by maximizing the log-likelihood,

They obey the equation system

Their covariance matrix is

If φ(ω) has the same value at all ω’s for which any of the *M*_{jk}(ω) change value, then the variances of the least square estimators will be asymptotically equal to those of the maximum likelihood estimators, and, indeed, the two sets of estimators may themselves be asymptotically equal. For example, (40) above shows that the maximum likelihood estimator of a mean, μ, reduces asymptotically to the least square estimator *x̄,* a great simplification.

If the residual spectral density function, φ(ω), involves unknown parameters, then expression (70) must be maximized with respect to these as well as to the β_{j}. This maximization can be complicated, and a simpler way of allowing for some autocorrelation in {η_{t}} is to fit, instead of (64), a model with some autoregression and with uncorrelated residuals,

The β^{*}_{j} of (73) cannot be identified with the β_{j} of (64), but they do indicate to what extent the *u*_{jt} can explain *x*_{t} variation.

### Spectral (periodogram) analysis

In fitting a parametric model one obtains an estimate of the spectral density function, ϕ(ω), but one often wishes to obtain a direct estimate without the assumptions implied in a model, just as one uses *C*_{s} to estimate the autocovariance function, Γ_{s}.

The periodogram ordinate, *f*_{n}(ω), cannot be used to this end, for although

one has also, for normal processes,

This variance does not tend to zero with increasing *n,* and *f*_{n}(ω) is not a consistent estimator of ϕ(ω). That explains the very irregular appearance of the periodogram when it is graphed and the reason one conceives the idea of estimating ϕ(ω) by smoothing the periodogram.

What one can say concerning distributions is that the quantities

are approximately independent standard exponential variables for *j* = 1, 2, …, *N,* where *N* < *n*/2. That is, the *I*_{j} have approximately a joint probability density function exp(−Σ_{j} *I*_{j}).

Suppose now that one attempts to estimate *ϕ(ω)* at ω = λ by an estimator of the form

where *K(ω)* is a symmetric function of period 2π represented by a Fourier series,

From (76) it follows that

The Fourier series *K(ω)* will be chosen as a function with a peak at the origin; as this peak grows sharper, the bias of ϕ̂(λ) (determined from (79)) decreases, but the variance (determined by (80)) increases. The best choice will be a compromise between these two considerations. The question of the optimal choice of weight function has been much studied. The choice is partly a matter of convenience, depending upon whether or not one works from the periodogram, that is, upon which of the two formulas (77) is used. If one is calculating digitally, then a simple and useful smoothing formula is Bartlett’s, for which

To test whether a strong peak in the periodogram indicates the presence of a harmonic component (when a delta-function would be superimposed on the spectral density function), one can use the statistic

for which

The sum in (84) is taken for all values of *j* not greater than 1/*u.* In constructing the *I _{j},* one can use a formula of type (77) to estimate ϕ(ω) although this leads to some underestimate of the relative size of a peak.
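Returning to the smoothed estimate of the spectral density discussed earlier in this section, a lag-weighted sum of sample autocovariances can be sketched as follows. The triangular weights 1 − |*s*|/*M* are the form usually associated with Bartlett's name, and the convention ϕ(ω) = Σ Γ_{s} exp(−*i*ω*s*) is assumed; neither can be confirmed from the text, whose formulas are not recoverable here. The function name and the truncation point `M` are hypothetical.

```python
import math


def bartlett_estimate(c, lam, M):
    """Triangularly weighted sum of autocovariances c[0], c[1], ... at frequency lam.

    Assumes phi(omega) = sum_s Gamma_s * exp(-i*omega*s) and lag weights
    1 - |s|/M for |s| < M, zero beyond; both are assumptions of this sketch.
    """
    total = c[0]
    for s in range(1, min(M, len(c))):
        total += 2.0 * (1.0 - s / M) * c[s] * math.cos(lam * s)
    return total


# Exact autocovariances of a first-order scheme (coefficient 0.5, unit residual
# variance) stand in for sample values, so that only the weighting is visible.
c = [0.5 ** s / 0.75 for s in range(200)]
for M in (5, 20, 50):
    print(M, bartlett_estimate(c, 0.0, M))
```

For this process the unweighted sum at ω = 0 is 4; the weighted values approach it from below as `M` grows, which illustrates the bias side of the compromise between bias and variance.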

A spectral analysis of a multivariate process can lead to some interesting types of investigation, which can only be mentioned briefly here. If

then for a bivariate stationary process {*x _{t}, y_{t}*} the idea of a spectral density function is replaced by that of a spectral density matrix

Suppose one wishes to investigate the dependence of {*y*_{t}} upon {*x*_{t}}. As an alternative to low-lag linear models such as

one can apply multivariate techniques to the Fourier components of the processes, at individual values of ω. Thus, ϕ_{yy}(ω) is the variance of the Fourier component of {*y*_{t}} at frequency ω; one can take a linear regression of this component onto the corresponding Fourier component of {*x*_{t}} and find a “residual variance,”

The quantity *C*_{xy}(ω) is the type of correlation coefficient known as the coherency; if it approaches unity in magnitude, then the frequency components at ω of the two series are closely related. It may well happen, for example, that low frequency components of two series keep well in step, although the short-term variation corresponding to high frequency components may be almost uncorrelated between the two series.

Note that (88) is the spectral density function of the process

where the *b’s* are chosen to minimize var(η).

In practice the elements of the spectral density matrix must be estimated by formulas analogous to (77).

A sometimes illuminating alternative to a periodogram analysis is to decompose a series into components corresponding to different frequency bands by using a number of band-pass operators of type (29) and to examine these components individually.

P. Whittle

[*See also* Linear hypotheses, *article on* regression; Markov Chains.]

## BIBLIOGRAPHY

*There are no completely satisfactory texts, principally because virtually nobody has a satisfactorily broad grasp of both the theory and the application of time series analysis. A person with a mathematical background may do best to read the first four chapters of* Grenander & Rosenblatt 1957 *and to follow this with a study of* Hannan 1960. *For someone with less mathematics, chapters 29 and 30 of* Kendall & Stuart 1966 *provide an introduction which, although not deep, is nevertheless sound and well illustrated.* Jenkins 1965 *surveys some of the more recent work on spectral analysis.*

Bartlett, M. S. 1963 The Spectral Analysis of Point Processes. *Journal of the Royal Statistical Society* Series B 25:264-296.

Durbin, J. 1959 Efficient Estimation of Parameters in Moving-average Models. *Biometrika* 46:306-316.

Granger, Clive W. J.; and Hatanaka, M. 1964 *Spectral Analysis of Economic Time Series.* Princeton Univ. Press.

Grenander, Ulf; and Rosenblatt, Murray 1957 *Statistical Analysis of Stationary Time Series.* New York: Wiley.

Hannan, Edward J. (1960)1962 *Time Series Analysis.* London: Methuen.

International Statistical Institute 1965 *Bibliography on Time Series and Stochastic Processes.* Edited by Herman Wold. Edinburgh: Oliver & Boyd.

Jenkins, G. M. 1965 A Survey of Spectral Analysis. *Applied Statistics* 14:2-32.

Johnston, John 1963 *Econometric Methods.* New York: McGraw-Hill.

Kendall, Maurice G.; and Stuart, Alan 1966 *The Advanced Theory of Statistics.* Volume 3: Design and Analysis, and Time Series. New York: Hafner; London: Griffin.

MacRobert, Thomas M. (1917) 1947 *Functions of a Complex Variable.* 3d ed. New York: Macmillan.

Slutsky, Eugen E. (1927) 1937 The Summation of Random Causes as the Source of Cyclic Processes. *Econometrica* 5:105-146. → First published in Russian. Reprinted in 1960 in Slutsky’s *Izbrannye trudy.*

Symposium on Time Series Analysis, Brown University, 1962 1963 *Proceedings.* Edited by Murray Rosenblatt. New York: Wiley.

Walker, A. M. 1962 Large-sample Estimation of Parameters for Autoregressive Processes With Moving-average Residuals. *Biometrika* 49:117-131.

Whittle, P. 1953 The Analysis of Multiple Stationary Time Series. *Journal of the Royal Statistical Society* Series B 15:125-139.

Whittle, P. 1954 Appendix. In Herman Wold, *A Study in the Analysis of Stationary Time Series.* 2d ed. Stockholm: Almqvist & Wiksell.

Whittle, P. 1963 *Prediction and Regulation by Linear Least-square Methods.* London: English Universities Press.

Wold, Herman (1938) 1954 *A Study in the Analysis of Stationary Time Series.* 2d ed. With an Appendix by P. Whittle. Stockholm: Almqvist & Wiksell.

## III. CYCLES

Cycles, waves, pulsations, rhythmic phenomena, regularity in return, periodicity—these notions reflect a broad category of natural, human, and social phenomena where cycles are the dominating feature. The daily and yearly cycles in sunlight, temperature, and other geophysical phenomena are among the simplest and most obvious instances. Regular periodicity provides a basis for prediction and for extracting other useful information about the observed phenomena. Nautical almanacs with their tidal forecasts are a typical example. Medical examples are pulse rate as an indicator of cardiovascular status and the electrocardiograph as a basis for analysis of the condition of the heart.

The study of cyclic phenomena dates from prehistoric times, and so does the experience that the area has dangerous pitfalls. From the dawn of Chinese history comes the story that the astronomers Hi and Ho lost their heads because they failed to forecast a solar eclipse (perhaps 2137 b.c.). In 1929, after some twelve years of promising existence, the Harvard Business Barometer (or Business Index) disappeared because it failed to predict the precipitous drop in the New York stock market.

Cyclic phenomena are recorded in terms of time series. A key aspect of cycles is the *degree of predictability* they give to the time series generated. Three basic situations should be distinguished:

(a) The cycles are fixed, so that the series is predictable over the indefinite future.

(b) The cycles are partly random, so that the series is predictable only over a limited future.

(c) The cycles are spurious—that is, there are no real cycles—and the series is not predictable.

For the purposes of this article the term “cycle” is used in a somewhat broader sense than the strict cyclic periodicity of case (a).

### Limited and unlimited predictability

The fundamental difference between situations (*a*) and (*b*) can be illustrated by two simple cases.

**The scheme of “hidden periodicities.”** Suppose that an observed time series is generated by two components. The first is strictly periodic, with period length *p,* so that its value at time *t* + *p* is equal to its value at time *t.* The second component, superimposed upon the first, is a sequence of random (independent, identically distributed) elements. Thus, each term of the observed series can be represented as the sum of a periodic term and a random one.

Tidal water is a cyclic phenomenon where this model applies quite well (see Figure 1). Here the observed series is the measured water level at Dover, the strictly periodic component represents the lunar cycle, 12 hours and 50 minutes in length (two maxima in one lunar day), and the random elements are the irregular deviations caused by storms, random variations in air pressure, earthquakes, etc.

The periodic component provides a prediction (an unbiased predicted value for a future time) with expectation equal to that future value of the periodic component, and with prediction error equal to the random element. The difficulty is that the periodic component is not known and must be estimated empirically. A simple and obvious method is that of Buys Ballot’s table; each point on the periodic component is estimated by the average of several points on the observed series, separated in time by the length of the period, *p*, where *p* either is known or is assessed by trial and error. The larger the residual is as compared to the cyclic component, the longer is the series needed to estimate the cyclic component with confidence.

[Figure 1 is based on hypothetical data.]
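The averaging behind Buys Ballot’s table can be sketched in a few lines of code. The following is only a minimal illustration, assuming an integer period that is known in advance; the function name `buys_ballot_estimate`, the period of 12, and the synthetic data are all hypothetical and are not taken from the article.

```python
import numpy as np


def buys_ballot_estimate(y, p):
    """Estimate one cycle of a periodic component with integer period p.

    Each row of the table is one complete cycle; each column collects
    observations that are p time units apart, and the column mean is the
    estimate of the periodic component at that phase.
    """
    y = np.asarray(y, dtype=float)
    n_cycles = len(y) // p                  # an incomplete final cycle is dropped
    table = y[: n_cycles * p].reshape(n_cycles, p)
    return table.mean(axis=0)


# Synthetic series (hypothetical): a sinusoid of period 12 plus independent noise.
rng = np.random.default_rng(0)
p, n_cycles = 12, 400
t = np.arange(p * n_cycles)
periodic = 3.0 * np.cos(2 * np.pi * t / p)
y = periodic + rng.normal(scale=1.0, size=t.size)

estimate = buys_ballot_estimate(y, p)
max_error = np.max(np.abs(estimate - periodic[:p]))
print(max_error)
```

With fewer cycles, or with a larger noise scale, the column means scatter more widely around the true component, which is the point made above about the length of series needed.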

The approach of hidden periodicities may be extended, with two or more periodic components being considered. Tidal water again provides a typical illustration. In addition to the dominating lunar component, a closer fit to the data is obtained by considering a solar component with period 183 days.

In view of its simplicity and its many important applications, it is only natural that the approach involving strictly periodic components is of long standing. A distinction must be made, however, between formal representation of a series (which is always possible), on the one hand, and prediction, on the other. Under general conditions, any series, even a completely random one, can be represented by a sum of periodic components plus a residual, and if the number of periodic components is increased indefinitely, the residual can be made as small as desired. In particular, if each of the periodic components is a sine or a cosine curve (a sinusoid), then the representation of the observed series is called a spectral representation. Such a representation, it is well to note, may be of only limited use for prediction outside the observed range, because if the observed range is widened, the terms of the representation may change appreciably. In the extreme case when the observations are all stochastically independent, the spectral representation of the series is an infinite sum of sinusoids; in this case neither the spectral representation nor alternative forecasting devices provide any predictive information.

**Irregular cycles.** Until rather recently (about 1930), the analysis of oscillatory time series was almost equivalent to the assessment of periodicities. For a long time, however, it had been clear that important phenomena existed that refused to adhere to the forecasts based on the scheme of hidden periodicities. The most obvious and challenging of these was the sequence of some twenty business cycles, each of duration five to ten years, between 1800 and 1914. Phenomena with irregular cycles require radically different methods of analysis.

**The scheme of “disturbed periodicity.”** The breakthrough in the area of limited predictability came with Yule’s model (1927) for the irregular 11-year cycle of sunspot intensity (see Figure 2). Yule interpreted the sunspot cycle as similar to the movement of a damped pendulum that is kept in motion by an unending stream of random shocks. [*See the biography of* Yule.]

The sharp contrast between the scheme of hidden periodicities and the scheme of disturbed periodicity can now be seen. In the hidden periodicities model the random elements are superimposed upon the cyclic component(s) without affecting or disturbing their strict periodicity. In Yule’s model the series may be regarded as generated by the random elements, and there is no room for strict periodicity. (Of course, the two types can be combined, as will be seen.)

The deep difference between the two types of model is reflected in their forecasting properties (see Figure 3). The time scales for the two forecasts have here been adjusted so as to give the same period. In the hidden-periodicities model the forecast over the future time span has the form of an undamped sinusoid, thus permitting an effective forecast over indefinitely long spans when the model is correct. In Yule’s model the forecast is a damped sinusoid, which provides effective information over limited spans, but beyond that it gives only the trivial forecast that the value of the series is expected to equal the unconditional over-all mean of the series.
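The damped forecast can be illustrated numerically. The sketch below assumes that Yule’s model has the second-order form $y_t = 2\rho y_{t-1} - \gamma^2 y_{t-2} + \epsilon_t$ with mean zero, so that the forecast is obtained by running the recursion forward with all future disturbances set to zero. The parameter values and the two starting observations are hypothetical.

```python
import numpy as np

# Hypothetical parameters of the assumed second-order model.
rho, gamma = 0.6, 0.9
omega = np.arccos(rho / gamma)          # frequency, from cos(omega) = rho / gamma

# Hypothetical last two observations (deviations from the mean).
y_t, y_tm1 = 1.0, 0.3
max_span = 40

# Forecast by recursion with future disturbances set to zero.
path = [y_tm1, y_t]
for _ in range(max_span):
    path.append(2 * rho * path[-1] - gamma**2 * path[-2])
recursion = np.array(path[1:])          # spans k = 0, 1, ..., max_span

# Closed form: a damped sinusoid gamma**k * (lam*cos(omega*k) + mu*sin(omega*k)),
# with lam and mu fixed by the two starting observations.
lam = y_t
mu = (lam * np.cos(omega) - gamma * y_tm1) / np.sin(omega)
k = np.arange(max_span + 1)
closed_form = gamma**k * (lam * np.cos(omega * k) + mu * np.sin(omega * k))

print(np.max(np.abs(recursion - closed_form)), recursion[-1])
```

The two computations agree, and the forecast shrinks toward zero (the mean of the series) as the span grows. Setting `gamma = 1.0` would give the undamped case.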

**Generalizations.** The distinction between limited and unlimited predictability of an observed time series goes to the core of the probability structure of the series.

In the modern development of time series analysis on the basis of the theory of stochastic processes, the notions of predictability are brought to full significance. It can be shown that the series $y_t$ under very general conditions allows a unique representation,

$$y_t = \Phi_t + \Psi_t, \tag{1}$$

known as predictive decomposition, where (a) the two components are uncorrelated, (b) $\Phi_t$ is deterministic and $\Psi_t$ is nondeterministic, and (c) the nondeterministic component allows a representation of the Yule type. In Yule’s model no $\Phi_t$ component is present. In the hidden-periodicities model $\Phi_t$ is a sum of sinusoids, while $\Psi_t$ is the random residual. Generally, however, $\Phi_t$, although deterministic in the prediction sense, is random.

The statistical treatment of mixed models like (1) involves a variety of important and challenging problems. Speaking broadly, the valid assessment of the structure requires observations that extend over a substantial number of cycles, and even then the task is difficult. A basic problem is to test for and estimate a periodic component on the supplementary hypothesis that the ensuing residual allows a nondeterministic representation, or, more generally, to perform a simultaneous estimation of the two components. A general method for dealing with these problems has been provided by Whittle (1954); for a related approach, see Allais (1962).

Other problems with a background in this decomposition occur in the analysis of seasonal variation [*see* Time series, *article on* Seasonal adjustment].

*Other stochastic models.* Since a tendency to cyclic variation is a conspicuous feature of many phenomena, stochastic models for their analysis have used a variety of mechanisms for generating apparent or genuine cyclicity. Brief reference will be made to the dynamic models for (*a*) predator-prey populations and (*b*) epidemic diseases. In both cases the pioneering approaches were deterministic, the models having the form of differential equation systems. The stochastic models developed at a later stage are more general, and they cover features of irregularity that cannot be explained by deterministic methods. What is of special interest in the present context is that the cycles produced in the simplest deterministic models are strictly periodic, whereas the stochastic models produce irregular cycles that allow prediction only over a limited future.

Figure 4 refers to a stochastic model given by M. S. Bartlett (1957) for the dynamic balance between the populations of a predator (for example, the lynx) and its prey (for example, the hare). The data of the graph are artificial, being constructed from the model by a Monte Carlo experiment. The classic models of A. J. Lotka and V. Volterra are deterministic, and the ensuing cycles take the form of sinusoids. The cyclic tendency is quite pronounced in Figure 4, but at the same time the development is affected by random features. After three peaks in both populations, the prey remains at a rather low level that turns out to be critical for the predator, and the predator population dies out.

The peaks in Figure 5 mark the severe spells of poliomyelitis in Sweden from 1905 onward. The cyclic tendency is explained, on the one hand, by the contagious nature of the disease and, on the other, by the fact that slight infections provide immunity, so that after a nationwide epidemic it takes some time before a new group of susceptibles emerges. The foundations for a mathematical theory of the dynamics of epidemic diseases were laid by Kermack and McKendrick (1927), who used a deterministic approach in terms of differential equations. Their famous threshold theorem states that only if the infection rate, $\rho$, is above a certain critical value, $\rho_0$, will the disease flare up in epidemics. Bartlett (1957) and others have developed the theory in terms of stochastic models; a stochastic counterpart to the threshold theorem has been provided by Whittle (1955).

Bartlett’s predator-prey model provides an example of how a cyclic deterministic model may become evolutive (nonstationary) when stochasticized, while Whittle’s epidemic model shows how an evolutive deterministic model may become stationary. Both of the stochastic models are completely nondeterministic; note that the predictive decomposition (1) extends to nonstationary processes.

The above examples have been selected so as to emphasize that there is no sharp demarcation between cycles with limited predictability and the spurious periodicity of phenomena ruled by randomness, where by pure chance the variation may take wavelike forms, but which provides no basis even for limited predictions. Thus, if a recurrent phenomenon has a low rate of incidence, say λ per year, and the incidences are mutually independent (perhaps a rare epidemic disease that has no aftereffect of immunity), the record of observations might evoke the idea that the recurrences have some degree of periodicity. It is true that in such cases there is an average period of length 1/λ between the recurrences, but the distance from one recurrence to the next is a random variable that cannot be forecast, since it is independent of past observations.
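A small simulation may make the point about independent recurrences concrete. The sketch below assumes that "mutually independent incidences at rate λ" can be modeled as a Poisson process, with event times scattered uniformly over the record. The rate of 0.2 per year and the record length are hypothetical values chosen only so that the averages are stable.

```python
import numpy as np

rng = np.random.default_rng(42)
rate = 0.2            # hypothetical incidence rate (events per year)
horizon = 50_000      # hypothetical length of the record, in years

# Poisson process: a Poisson number of events placed uniformly in time.
n_events = rng.poisson(rate * horizon)
times = np.sort(rng.uniform(0.0, horizon, size=n_events))
gaps = np.diff(times)             # distances between successive recurrences

mean_gap = gaps.mean()
# Correlation between each gap and the next one.
lag1_corr = np.corrcoef(gaps[:-1], gaps[1:])[0, 1]
print(mean_gap, lag1_corr)
```

The mean gap comes out close to 1/λ = 5 years, yet the correlation between successive gaps is close to zero, so the length of one gap carries no information about the next.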

A related situation occurs in the summation of mutually independent variables. Figure 6 shows a case in point as observed in a Monte Carlo experiment with summation of independent variables (Wold 1965). The similarity between the three waves, each representing the consecutive additions of some 100,000 variables, is rather striking. Is it really due to pure chance? Or is the computer simulation of the “randomness” marred by some slip that has opened the door to a cyclic tendency in the ensuing sums? (For an amusing discussion of related cases, see Cole’s “Biological Clock in the Unicorn,” 1957.)

Figure 6 also gives, in the series of wholesale prices in Great Britain, an example of “Kondratieff waves,” the much discussed interpretation of economic phenomena as moving slowly up and down in spells of some fifty years. Do the waves embody genuine tendencies to long cycles, or are they of a spurious nature? The question is easy to pose but difficult or impossible to answer on the basis of available data. The argument that the “Kondratieff waves” are to a large extent parallel in the main industrialized countries carries little weight, in view of international economic connections. The two graphs have been combined in Figure 6 in order to emphasize that with regard to waves of long duration it is always difficult to sift the wheat of genuine cycles from the chaff of spurious periodicity. [*See the biography of* Kondratieff.]

### Genuine versus spurious cycles

**Hypothesis testing.** Cycles are a specific feature in many scientific models, and their statistical assessment usually includes (*a*) parameter estimation for purposes of quantitative specification of the model, and (*b*) hypothesis testing for purposes of establishing the validity of the model and thereby of the cycles. In modern statistics it is often (sometimes tacitly) specified that any method under (*a*) should be supplemented by an appropriate device under (*b*). Now, this principle is easy to state, but it is sometimes difficult to fulfill, particularly with regard to cycles and related problems of time series analysis. The argument behind this view may be summed up as follows, although not everyone would take the same position:

(i) Most of the available methods for hypothesis testing are designed for use in controlled experiments (the supreme tool of scientific model building), whereas the assessment of cycles typically refers to nonexperimental situations.

(ii) The standard methods for both estimation and hypothesis testing are based on the assumption of independent replications. Independence is on the whole a realistic and appropriate assumption in experimental situations, but usually not for nonexperimental data.

(iii) Problems of point estimation often require less stringent assumptions than those of interval estimation and hypothesis testing. This is frequently overlooked by the methods designed for experimental applications, because the assumption of independence is usually introduced jointly for point estimation, where it is not always needed, and for hypothesis testing, where it is always consequential.

(iv) It is therefore a frequent situation in the analysis of nonexperimental data that adequate methods are available for estimation, but further assumptions must be introduced to conduct tests of hypotheses. It is even a question whether such tests can be performed at all in a manner corresponding to the standard methods in experimental analysis, because of the danger of specification errors that mar the analysis of nonexperimental data.

(v) Standard methods of hypothesis testing in controlled experiments are thus of limited scope in nonexperimental situations. Here other approaches come to the fore. It will be sufficient to mention predictive testing: the model at issue is taken as a basis for forecasts, and in due course the forecasts are compared with the actual developments. Reporting of nonexperimental models should always include a predictive test.

The following example is on the cranky side, but it does illustrate that the builder of a nonexperimental model should have *le courage de son modèle* to report a predictive test, albeit in this case the quality of the model does not come up to the model builder’s courage. The paper (Lewin 1958) refers to two remarkable events (the first permanent American settlement at Jamestown, Virginia, in 1607 and the Declaration of Independence in 1776) and takes the 169 years in between as the basic “cycle.” After another 84½ years (½ of the basic cycle) there is the remarkable event of the Civil War, in 1861; after 56 more years (⅓ of the cycle) there is the beginning of the era of world wars in 1917; after 28 more years (1/6 of the cycle) there is the atomic era with the first bomb exploded in 1945. The paper, published in 1958, ends with the following predictive statement: “The above relation to the basic 169 year cycle of 1/1, ½, ⅓, 1/6 is a definite decreasing arithmetic progression where the sum of all previous denominators becomes the denominator of the next fraction. To continue this pattern and project, we have the 6th cycle—1959, next U.S. Epochal Event—14 year lapse—1/12 of 169 years” (Lewin 1958, pp. 11-12). The 1959 event should have been some major catastrophe like an atomic war, if I have correctly understood what the author intimates between the lines in the first part of his article.

It is well to note that this paper, singled out here as an example, is far from unique. Cycles have an intrinsic fascination for the human mind. A cursory scanning of the literature, particularly *Cycles*, the journal of the Foundation for the Study of Cycles, will suffice to show that in addition to the strictly scientific contributions, there is a colorful subvegetation where in quality and motivation the papers and books display all shades of quasi-scientific and pseudoscientific method, down to number mysticism and other forms of dilettantism and crankiness, and where the search for truth is sometimes superseded by drives of self-realization and self-suggestion, not to speak of unscrupulous money-making. The crucial distinction here is not between professional scientists and amateurs. It is all to the good if the search for truth is strengthened by many modes of motivation. The sole valid criterion is given by the general standards of scientific method. Professionals are not immune to self-suggestion and other human weaknesses, and the devoted work of amateurs guided by an uncompromising search for truth is as valuable here as in any other scientific area.

### Further remarks

Cycles are of key relevance in the theory and application of time series analysis; their difficulty is clear from the fact that it is only recently that scientific tools appropriate for dealing with cycles and their problems have been developed. The fundamental distinction between the hidden-periodicity model, with its strict periodicity and unlimited predictability, and Yule’s model, with its disturbed periodicity and limited predictability, could be brought to full significance only after 1933, by the powerful methods of the modern theory of stochastic processes. On the applied side, the difficulty of the problems has been revealed in significant shifts in the very way of viewing and posing the problems. Thus, up to the failure of the Harvard Business Barometer the analysis of business cycles was essentially a unirelational approach, the cycle being interpreted as generated by a leading series by way of a system of lagged relationships with other series. The pioneering works of Jan Tinbergen in the late 1930s broke away from the unirelational approach. The models of Tinbergen and his followers are multirelational, the business cycles being seen as the resultant of a complex system of economic relationships. [*See* Business cycles; Distributed lags.]

The term “cycle,” when used without further specification, primarily refers to periodicities in time series, and that is how the term is taken in this article. The notion of “life cycle” as the path from birth to death of living organisms is outside the scope of this presentation. So are the historical theories of Spengler and Toynbee that make a grandiose combination of time series and life cycle concepts, seeing human history as a succession of cultures that are born, flourish, and die. Even the shortest treatment of these broad issues would carry us far beyond the realm of time series analysis; this omission, however, must not be construed as a criticism. [*For a discussion of these issues, see* Periodization.]

**Cycles vs. innovations.** The history of human knowledge suggests that belief in cycles has been a stumbling block in the evolution of science. The philosophy of the cosmic cycle was part of Stoic and Epicurean philosophy: every occurrence is a recurrence; history repeats itself in cycles, cosmic cycles; all things, persons, and phenomena return exactly as before in cycle after cycle. What is it in this strange theory that is of such appeal that it should have been incorporated into the foundations of leading philosophical schools and should occur in less extreme forms again and again in philosophical thinking through the centuries, at least up to Herbert Spencer, although it later lost its vogue? Part of the answer seems to be that philosophy has had difficulties with the notion of innovation, having, as it were, a *horror innovationum.* If our philosophy leaves no room for innovations, we must conclude that every occurrence is a recurrence, and from there it is psychologically a short step to the cosmic cycle. This argument being a blind alley, the way out has led to the notions of innovation and limited predictability and to other key concepts in modern theories of cyclic phenomena. Thus, in Yule’s model (Figure 2) the random shocks are innovations that reduce the regularity of the sunspot cycles so as to make them predictable only over a limited future. More generally, in the predictive decomposition (1) the nondeterministic component is generated by random elements, innovations, and the component is therefore only of limited predictability. Here there is a close affinity to certain aspects of the general theory of knowledge. We note that prediction always has its cognitive basis in regularities observed in the past, cyclic or not, and that innovations set a ceiling to prediction by scientific methods. [*See* Time series, *article on* Advanced problems.]

### Mathematical analysis

The verbal exposition will now, in all brevity, be linked up with the theory of stochastic processes. The focus will be on (a) the comparison between the schemes of “hidden periodicities” and “disturbed harmonics” and (b) spectral representation versus predictive decomposition.

Write the observed series

$$y_1, y_2, \ldots, y_n, \tag{2}$$

taking the observations as deviations from the mean and letting the distance between two consecutive observations serve as time unit. Unless otherwise specified, the series (2) is assumed to be of finite length, ranging from $t = 1$ to $t = n$.

**Hidden periodicities.** With reference to Figure 1, consider first the case of one hidden periodicity. The observed series $y_t$ is assumed to be generated by the model

$$y_t = x_t + \epsilon_t, \tag{3}$$

where $x_t$, the “hidden periodicity,” is a sinusoid,

$$x_t = \lambda \cos \omega t + \mu \sin \omega t, \tag{4}$$

while

$$\ldots, \epsilon_{t-1}, \epsilon_t, \epsilon_{t+1}, \ldots \tag{5}$$

is a sequence of random variables, independent of one another and of $x_t$, and identically distributed with zero mean, $E(\epsilon) = 0$, and standard deviation $\sigma(\epsilon)$. For any $\lambda$ and $\mu$ the sinusoid (4) is periodic, $x_{t+p} = x_t$, with period $p = 2\pi/\omega$, and satisfies the difference equation

$$x_t - 2\rho x_{t-1} + x_{t-2} = 0, \tag{6}$$

where $\rho = \cos \omega$.
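The difference equation satisfied by a sinusoid is easy to check numerically. The sketch below assumes the sinusoid has the form $x_t = \lambda \cos \omega t + \mu \sin \omega t$ and that the difference equation reads $x_t - 2\rho x_{t-1} + x_{t-2} = 0$ with $\rho = \cos \omega$. The parameter values are arbitrary and hypothetical.

```python
import numpy as np

# Hypothetical frequency and coefficients of the sinusoid.
omega, lam, mu = 0.7, 1.5, -0.4
rho = np.cos(omega)

t = np.arange(50)
x = lam * np.cos(omega * t) + mu * np.sin(omega * t)

# Left-hand side of the assumed difference equation for t = 2, ..., 49.
residual = x[2:] - 2 * rho * x[1:-1] + x[:-2]
print(np.max(np.abs(residual)))
```

The residual vanishes up to rounding error for any choice of `lam` and `mu`, since the identity $\cos \omega t + \cos \omega(t-2) = 2 \cos \omega \cos \omega(t-1)$ (and its sine counterpart) does not involve the coefficients.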

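The sinusoid $x_t$ makes a forecast of $y_{t+k}$ over any prediction span $k$ (of course, for real prediction the values of $\lambda$, $\mu$, and $\rho$ must be known or assumed),

$$\operatorname{pred} y_{t+k} = x_{t+k} = \lambda \cos \omega(t+k) + \mu \sin \omega(t+k), \tag{7}$$

giving the prediction error

$$\Delta(t, k) = y_{t+k} - \operatorname{pred} y_{t+k} = \epsilon_{t+k}.$$

Hence the forecast (7) is unbiased and has the same mean-square deviation for all $t$ and $k$,

$$E[\Delta(t, k)] = 0, \tag{8a}$$

$$E[\Delta^2(t, k)] = \sigma^2(\epsilon). \tag{8b}$$

Further light can be cast on the rationale of the forecast (7) by considering the coefficients $\lambda$, $\mu$ as limiting regression coefficients of $y_t$ on $\cos \omega t$ and $\sin \omega t$.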
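The regression interpretation of the coefficients can be tried out on simulated data. The sketch below assumes the model $y_t = \lambda \cos \omega t + \mu \sin \omega t + \epsilon_t$ with a known frequency, and fits $\lambda$ and $\mu$ by ordinary least squares. The true values, the sample size, and the normal disturbances are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical true values; the frequency omega is treated as known.
omega, lam, mu = 0.5, 2.0, -1.0
n = 20_000
t = np.arange(1, n + 1)

y = lam * np.cos(omega * t) + mu * np.sin(omega * t) + rng.normal(size=n)

# Least-squares regression of y_t on cos(omega*t) and sin(omega*t).
X = np.column_stack([np.cos(omega * t), np.sin(omega * t)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
lam_hat, mu_hat = coef
print(lam_hat, mu_hat)
```

As the series grows longer, the fitted coefficients settle ever closer to the values that generated the sinusoid, which is the sense in which they are limiting regression coefficients.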

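**Disturbed periodicity.** Yule’s model as illustrated in Figure 2 is

$$y_t - 2\rho y_{t-1} + \gamma^2 y_{t-2} = \epsilon_t, \tag{9}$$

where the notation makes for easy comparison with model (3):

(a) In (3) the disturbances, $\epsilon_t$, are superimposed on the periodic component, $x_t$, while $y_t$ in (9) is entirely generated by current and past disturbances,

$$y_t = \epsilon_t + \alpha_1 \epsilon_{t-1} + \alpha_2 \epsilon_{t-2} + \cdots, \tag{10a}$$

where $\alpha_1 = 2\rho$ ($\alpha_0 = 1$) and

$$\alpha_k - 2\rho \alpha_{k-1} + \gamma^2 \alpha_{k-2} = 0, \qquad k = 2, 3, \ldots \tag{10b}$$

Hence in (3) the correlation coefficient

$$r(\epsilon_{t+k}, y_t) = 0 \tag{11}$$

for all $k \neq 0$ and $t$, while (9) gives (11) for all $k > 0$ and $t$.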

(b) (See Figure 3.) If the future disturbances, $\epsilon_{t+k}$, were absent, $y_{t+k}$ in (3) would reduce to $x_{t+k}$ and thus make an undamped sinusoid (4), while $y_{t+k}$ in (9), say, $y^*_{t+k}$, would satisfy the difference equation

$$y^*_{t+k} - 2\rho y^*_{t+k-1} + \gamma^2 y^*_{t+k-2} = 0, \qquad k = 1, 2, \ldots, \tag{12}$$

with initial values $y^*_t = y_t$, $y^*_{t-1} = y_{t-1}$, giving

$$y^*_{t+k} = \alpha_k \epsilon_t + \alpha_{k+1} \epsilon_{t-1} + \cdots \tag{13a}$$

$$= \gamma^k (\lambda \cos \omega k + \mu \sin \omega k). \tag{13b}$$

Hence, $y^*_{t+k}$ would make a damped sinusoid with damping factor $\gamma$, frequency $\omega$ given by $\cos \omega = \rho/\gamma$, and period $2\pi/\omega$, and where the two initial values $y_t$, $y_{t-1}$ determine the parameters $\lambda$, $\mu$. Since the difference equations in (10b) and (12) are the same except for the initial values, the form (13b) of a damped sinusoid extends to $\alpha_k$, except that $\lambda$, $\mu$ will be different.

(c) In (3) the undamped sinusoid, $x_t$, provides a forecast of $y_{t+k}$ that is unbiased in the sense of (8a). In (9) the damped sinusoid, $y^*_{t+k}$, provides a forecast of $y_{t+k}$,

$$\operatorname{pred} y_{t+k} = y^*_{t+k}, \tag{14}$$

where

showing that the forecast (14) is unbiased in the sense of the conditional expectation of $y_{t+k}$ as conditioned by the current and past observations $y_t, y_{t-1}, \ldots$

(d) Figure 7 illustrates that in model (3) the prediction error has constant mean-square deviation for all spans $k$. In (9) it has mean-square deviation

$$E[\Delta^2(t, k)] = (1 + \alpha_1^2 + \cdots + \alpha_{k-1}^2)\, \sigma^2(\epsilon), \tag{16}$$

and thus is increasing with $k$. Formulas (8b) and (16) show that in (3) the disturbances, $\epsilon_t$, do not interfere with the sinusoid component, $x_t$, while in (9) they build up the entire process, $y_t$.

(e) The fundamental difference between models (3) and (9) is further reflected in the correlogram of $y_t$,

$$r_k = r(y_t, y_{t+k}), \qquad k = 0, 1, 2, \ldots$$

In (3) the correlogram is an undamped sinusoid (4), in (9) a damped sinusoid (13b). Hence, the two correlograms are curves of the same types as those shown in Figure 3. The graph actually shows the two correlograms, not any two forecasts.
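The contrast between the two correlograms can be reproduced by simulation. The sketch below assumes model (3) is a sinusoid plus independent disturbances and model (9) has the form $y_t = 2\rho y_{t-1} - \gamma^2 y_{t-2} + \epsilon_t$. The helper `correlogram`, the period of 12 time units, the damping factor, and the sample sizes are hypothetical choices.

```python
import numpy as np


def correlogram(y, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a series."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    denom = np.dot(y, y)
    return np.array([np.dot(y[: len(y) - k], y[k:]) / denom
                     for k in range(max_lag + 1)])


rng = np.random.default_rng(7)
n, burn = 6000, 500
omega = 2 * np.pi / 12            # hypothetical period of 12 time units
t = np.arange(n)

# Model (3): hidden periodicity plus independent disturbances.
y_hidden = 2.0 * np.cos(omega * t) + rng.normal(size=n)

# Model (9), assumed form, tuned to the same frequency via rho = gamma*cos(omega).
gamma = 0.9
rho = gamma * np.cos(omega)
eps = rng.normal(size=n + burn)
y = np.zeros(n + burn)
for s in range(2, n + burn):
    y[s] = 2 * rho * y[s - 1] - gamma**2 * y[s - 2] + eps[s]
y_yule = y[burn:]                 # discard the start-up stretch

r_hidden = correlogram(y_hidden, 60)
r_yule = correlogram(y_yule, 60)
print(r_hidden[[12, 60]], r_yule[[12, 60]])
```

At lags that are whole multiples of the period, the correlogram of the hidden-periodicity series stays at roughly the same height, while that of the autoregressive series has faded to the level of sampling noise by lag 60.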

**Generalizations.** The scheme (3) extends to several hidden periodicities,

$$y_t = \sum_{i=1}^{h} (\lambda_i \cos \omega_i t + \mu_i \sin \omega_i t) + \epsilon_t, \tag{17}$$

giving the same prediction formulas (7)-(8). Yule’s model (9) extends to the general scheme of autoregression,

giving expansions of type (10a) and (13a) and a prediction like (15a). Note that formula (17) is a composite undamped swinging. The difference equations (6) and (12) extend from order 2 to order $2h$. The extension of (13b) gives $y^*_{t+k}$ as a composite damped swinging.
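For an autoregression of any order, the weights of the expansion in current and past disturbances, and the mean-square prediction error built from them, can be computed recursively. The sketch below is an illustration under stated assumptions: the autoregression is written as $y_t = \phi_1 y_{t-1} + \cdots + \phi_m y_{t-m} + \epsilon_t$ (a notation introduced here, not the article’s), and the prediction error over span $k$ is taken to have mean square $\sigma^2(\epsilon)(\alpha_0^2 + \cdots + \alpha_{k-1}^2)$. The function names and parameter values are hypothetical.

```python
import numpy as np


def ma_weights(phi, n_weights):
    """Weights alpha_0, ..., alpha_{n_weights-1} of y_t = sum_j alpha_j*eps_{t-j}
    for the autoregression y_t = phi[0]*y_{t-1} + ... + phi[m-1]*y_{t-m} + eps_t."""
    phi = np.asarray(phi, dtype=float)
    alpha = np.zeros(n_weights)
    alpha[0] = 1.0
    for k in range(1, n_weights):
        m = min(k, len(phi))
        # alpha_k = phi_1*alpha_{k-1} + ... + phi_m*alpha_{k-m}
        alpha[k] = np.dot(phi[:m], alpha[k - m:k][::-1])
    return alpha


def prediction_mse(phi, max_span, sigma2=1.0):
    """Mean-square prediction error for spans k = 1, ..., max_span."""
    alpha = ma_weights(phi, max_span)
    return sigma2 * np.cumsum(alpha**2)


# Yule's model as the second-order special case (hypothetical values).
rho, gamma = 0.6, 0.9
phi = [2 * rho, -gamma**2]
alpha = ma_weights(phi, 200)
mse = prediction_mse(phi, 200)

# Cross-check: with zero pre-sample values, the autoregressive recursion and
# the moving-sum expansion generate the same series from the same disturbances.
rng = np.random.default_rng(3)
eps = rng.normal(size=200)
y_rec = np.zeros(200)
for t in range(200):
    y_rec[t] = eps[t]
    if t >= 1:
        y_rec[t] += phi[0] * y_rec[t - 1]
    if t >= 2:
        y_rec[t] += phi[1] * y_rec[t - 2]
y_ma = np.convolve(eps, alpha)[:200]
print(mse[:5], mse[-1])
```

The mean-square error starts at the disturbance variance for span 1 and rises toward the variance of the process itself, beyond which the forecast is no better than the over-all mean.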

**Stationary stochastic processes.** The above models are fundamental cases of stationary stochastic processes. The observed series (2) is seen as a *realization* of the process. Stationarity means that for any fixed $n$ the random variables $\eta_{t+1}, \ldots, \eta_{t+n}$ that generate the observed values $y_{t+1}, \ldots, y_{t+n}$ have a joint probability distribution that is independent of $t$. In this interpretation a realization corresponds to a sampling point in an $n$-dimensional distribution, and the parameters $\lambda$, $\mu$ in model (3) are random variables that vary from one realization (2) of the process to another.

Two general representation theorems for stationary stochastic processes will be quoted briefly.

*Spectral representation.* The basic reference for spectral representation is Cramér (1940). Any real-valued stationary process $\eta_t$ allows the representation

$$\eta_t = \int_0^{\pi} [\cos \omega t \, d\lambda(\omega) + \sin \omega t \, d\mu(\omega)], \tag{19}$$

where $\lambda(\omega)$, $\mu(\omega)$ are real processes with zero means and zero intercorrelations, with increments $d\lambda(\omega)$, $d\mu(\omega)$ which have zero means and zero intercorrelations, and with variances

$$E\{[d\lambda(\omega)]^2\} = E\{[d\mu(\omega)]^2\} = dV(\omega), \qquad 0 \le \omega \le \pi,$$

where $V(\omega)$ is the cumulative spectrum of the process.

Conversely, $\lambda(\omega)$ and $\mu(\omega)$ can be represented in terms of $\eta_t$,

and correspondingly for $\mu(\omega)$. (Here l.i.m. signifies limit in the mean; l.i.m. may exist even if the ordinary limit does not.)

Applying the representation (19) to model (17), the spectrum $V(\omega)$ has discontinuities at the points $\omega = \omega_i$, while the component $\epsilon_t$ corresponds to the continuous part of the spectrum. As applied to models (9) and (18), the representation (19) gives a spectrum $V(\omega)$ that is everywhere continuous. Broadly speaking, the spectral representation (19) is useful for analyzing the cyclical properties of the series (2) inside the range of observations, while it is of operative use for prediction outside the observation range only in the case when $V(\omega)$ presents one or more discontinuities.
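In a finite sample, a discontinuity of the cumulative spectrum shows up as a sharp peak in the periodogram. The sketch below assumes a single hidden periodicity plus independent disturbances and uses the raw periodogram (squared modulus of the discrete Fourier transform divided by the series length) as a rough empirical counterpart of the spectrum’s increments. The frequency, amplitude, and sample size are hypothetical, with the sample size chosen so that the hidden frequency falls exactly on a Fourier frequency.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 1200
omega0 = 2 * np.pi / 12               # hypothetical hidden frequency (period 12)
t = np.arange(n)

y = 2.0 * np.cos(omega0 * t) + rng.normal(size=n)
y = y - y.mean()                      # deviations from the mean

# Periodogram ordinates at the Fourier frequencies 2*pi*j/n, j = 0, ..., n/2.
freqs = 2 * np.pi * np.fft.rfftfreq(n)
periodogram = np.abs(np.fft.rfft(y)) ** 2 / n

peak = np.argmax(periodogram[1:]) + 1  # skip the zero frequency
print(freqs[peak], omega0, periodogram[peak])
```

The ordinate at the hidden frequency grows in proportion to the series length, whereas the ordinates belonging to the continuous part of the spectrum stay of the same order however long the series, which is why only the periodic component supports long-range prediction.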

*Predictive decomposition.* The basic reference for predictive decomposition is Wold (1938). Any stationary process $\eta_t$ with finite variance allows the decomposition (1). The deterministic component, $\Phi_t$, can be linearly predicted to any prescribed accuracy over any given span $k$ on the basis of the past observations $y_{t-1}, y_{t-2}, \ldots$. The nondeterministic component, $\Psi_t$, allows a representation of type (10a) with correlation properties in accordance with (11) for all $k > 0$ and $t$, and hence a prediction of type (13a). The ensuing prediction for $\eta_{t+k}$,

$$\operatorname{pred} \eta_{t+k} = \Phi_{t+k} + \operatorname{pred} \Psi_{t+k},$$

has least square properties in accordance with (16), and if all joint probability distributions are normal (or have linear regressions), the prediction will be unbiased in the sense of (15b).

In models (3) and (17), $x_t$ is the deterministic component and $\epsilon_t$ the nondeterministic component. Models (9) and (18) are completely nondeterministic. Levels D, M, N in Figure 7 refer to models that are completely deterministic, mixed, and completely nondeterministic, respectively, each level indicating the standard deviation of the prediction error for indefinitely large spans $k$. Making use of the analytical methods of spectral analysis, Kolmogorov (1941) and Wiener (1942) have developed the theory of the decomposition (1) and the nondeterministic expansion (10).

*This article aims at a brief orientation to the portrayal of cycles as a broad topic in transition. Up to the 1930s the cyclical aspects of time series were dealt with by a variety of approaches, in which nonscientific and prescientific views were interspersed with the sound methods of some few forerunners and pioneers. The mathematical foundations of probability theory as laid by Kolmogorov in 1933 gave rise to forceful developments in time series analysis and stochastic processes, bringing the problems about cycles within the reach of rigorous treatment. In the course of the transition, interest in cycles has been superseded by other aspects of time series analysis, notably prediction and hypothesis testing. For that reason, and also because cyclical features appear in time series of very different probability structures, it is only natural that cycles have not (or not as yet) been taken as a subject for a monograph.*

*Herman Wold*

*[See also Business cycles and Prediction and forecasting, Economic.]*

*BIBLIOGRAPHY*

*Allais, Maurice 1962 Test de périodicité: Généralisation du test de Schuster au cas de séries temporelles autocorrélées dans l’hypothèse d’un processus de perturbations aléatoires d’un système stable. Institut International de Statistique, Bulletin 39, no. 2:143-193.*

*Bartlett, M. S. 1957 On Theoretical Models for Competitive and Predatory Biological Systems. Biometrika 44:27-42.*

*Burkhardt, H. 1904 Trigonometrische Interpolation: Mathematische Behandlung periodischer Naturerscheinungen mit Einschluss ihrer Anwendungen. Volume 2, pages 643-693 in Enzyklopädie der mathematischen Wissenschaften. Leipzig: Teubner. → The encyclopedia was also published in French.*

*Buys Ballot, Christopher H. D. 1847 Les changemens périodiques de température dépendants de la nature du soleil et de la lune mis en rapport avec le prognostic du temps déduits d’observations Neerlandaises de 1729 à 1846. Utrecht (Netherlands): Kemink.*

*Cole, Lamont C. 1957 Biological Clock in the Unicorn. Science 125:874-876.*

*Cramér, Harald 1940 On the Theory of Stationary Random Processes. Annals of Mathematics 2d Series 41:215-230.*

*Cycles.* → Published since 1950 by the Foundation for the Study of Cycles. See especially Volume 15.

*Kermack, W. O.; and McKendrick, A. G. 1927 A Contribution to the Mathematical Theory of Epidemics. Royal Society of London, Proceedings Series A 113:700-721.*

*Kermack, W. O.; and McKendrick, A. G. 1932 Contributions to the Mathematical Theory of Epidemics. Part 2: The Problem of Endemicity. Royal Society of London, Proceedings Series A 138:55-83.*

*Kermack, W. O.; and McKendrick, A. G. 1933 Contributions to the Mathematical Theory of Epidemics. Part 3: Further Studies of the Problem of Endemicity. Royal Society of London, Proceedings Series A 141:94-122.*

*Keyser, Cassius J. (1922) 1956 The Group Concept. Volume 3, pages 1538-1557 in James R. Newman, The World of Mathematics: A Small Library of the Literature of Mathematics From A’h-Mosé the Scribe to Albert Einstein. New York: Simon & Schuster. → A paperback edition was published in 1962.*

*Kolmogorov, A. N. (1941) 1953 Sucesiones estacionarias en espacios de Hilbert (Stationary Sequences in Hilbert Space). Trabajos de estadística 4:55-73, 243-270. → First published in Russian in Volume 2 of the Biulleten Moskovskogo Universiteta.*

*Lewin, Edward A. 1958 1959 and a Cyclical Theory of History. Cycles 9:11-12.*

*Mitchell, Wesley C. 1913 Business Cycles. Berkeley: Univ. of California Press. → Part 3 was reprinted by University of California Press in 1959 as Business Cycles and Their Causes.*

*Piatier, André 1961 Statistique et observation économique. Volume 2. Paris: Presses Universitaires de France.*

*Schumpeter, Joseph A. 1939 Business Cycles: A Theoretical, Historical, and Statistical Analysis of the Capitalist Process. 2 vols. New York and London: McGraw-Hill. → An abridged version was published in 1964.*

*Schuster, Arthur 1898 On the Investigation of Hidden Periodicities With Application to a Supposed 26 Day Period of Meteorological Phenomena. Terrestrial Magnetism 3:13-41.*

*Tinbergen, J. 1940 Econometric Business Cycle Research. Review of Economic Studies 7:73-90.*

*Whittaker, E. T.; and Robinson, G. (1924) 1944 The Calculus of Observations: A Treatise on Numerical Mathematics. 4th ed. Princeton, N.J.: Van Nostrand.*

*Whittle, P. 1954 The Simultaneous Estimation of a Time Series: Harmonic Components and Covariance Structure. Trabajos de estadística 3:43-57.*

*Whittle, P. 1955 The Outcome of a Stochastic Epidemic—A Note on Bailey’s Paper. Biometrika 42: 116-122.*

*Wiener, Norbert (1942) 1964 Extrapolation, Interpolation and Smoothing of a Stationary Time Series, With Engineering Applications. Cambridge, Mass.: Technology Press of M.I.T. → First published during World War II as a classified report to Section D2, National Defense Research Committee. A paperback edition was published in 1964.*

*Wold, Herman (1938) 1954 A Study in the Analysis of Stationary Time Series. 2d ed. Stockholm: Almqvist & Wiksell.*

*Wold, Herman 1965 A Graphic Introduction to Stochastic Processes. Pages 7-76 in International Statistical Institute, Bibliography on Time Series and Stochastic Processes. Edited by Herman Wold. Edinburgh: Oliver & Boyd.*

*Wold, Herman 1967 Time as the Realm of Forecasting. Pages 525-560 in New York Academy of Sciences, Interdisciplinary Perspectives on Time. New York: The Academy.*

*Yule, G. Udny 1927 On a Method of Investigating Periodicities in Disturbed Series, With Special Reference to Wolfer’s Sunspot Numbers. Royal Society, Philosophical Transactions Series A 226:267-298.*

## IV. SEASONAL ADJUSTMENT

*The objective of economic time series analysis is to separate underlying systematic movements in such series from irregular fluctuations. The systematic movements in the economy (the signals) reveal seasonal patterns, cyclical movements, and long-term trends. The irregular fluctuations (the noise) are a composite of erratic real world occurrences and measurement errors. There are definite advantages in breaking down these two major factors into their respective components. Separation of the systematic components provides a better basis for studying causal factors and forecasting changes in economic activity. Separation of the irregular components provides a basis for balancing the costs of reducing statistical errors against the resultant gains in accuracy. This article is concerned with one of the systematic components, seasonal variation, and especially with how to measure it and eliminate it from economic time series. The relationships between seasonal variations and the other components are also described, with special reference to economic time series in the United States. Characteristics of seasonal variations in different countries, regions, and industries are not discussed.*

**The seasonal factor.** The seasonal factor is the composite effect of climatic and institutional factors and captures fluctuations that are repeated more or less regularly each year. For example, the aggregate income of farmers in the United States displays a definite seasonal pattern, rising steadily each year from early spring until fall, then dropping sharply. Most economic series contain significant seasonal fluctuations, but some (stock prices, for example) contain virtually none.

*Changing weather conditions from one season to another significantly affect activities in such industries as construction and agriculture. Movements in a series resulting from this factor are referred to as climatic variations. Differences from year to year in the intensity of weather conditions during each season introduce an irregular element in the pattern of these movements. For example, a very cold winter will have a greater effect on some industries than a winter with average temperature and precipitation.*

*Intermingled with the effects of variations in climatic conditions are the effects of institutional factors. Thus, the scheduling of the school year from September to June influences the seasonal pattern of industries associated with education, and the designation of tax dates by federal and state authorities affects retail sales and interest rates.*

*Holidays also help to shape the pattern of activities over 12-month periods. The effects of Christmas and Easter upon the volume of business are widespread, but most direct and largest upon retail sales. Other holidays, such as July 4, Memorial Day, and Labor Day, have a like but generally lesser effect. The number of shopping days between Thanksgiving and Christmas may have some effect upon the volume of Christmas shopping. The effects of certain of these holidays (Easter, Labor Day, and Thanksgiving Day) upon the activities of certain months are uneven, because they do not fall on the same day of the month each year; the dates upon which they fall affect the distribution of activity between two months. The movements resulting from this factor are referred to as holiday variations (and are illustrated by curve 2 of Figure 1, below).*

*The use of the Gregorian calendar, which provides for months of different lengths and calendar composition, has a special effect upon monthly fluctuations. This effect is due mainly to differences in the character and volume of business activity on Saturdays and Sundays and the variations in the number of these days in the same month in different years. From this point of view there are more than 12 types of months; for example, there are seven different types of months with 31 days, one starting with each different day of the week. The movements resulting from this factor are referred to as calendar, or trading day, variations (and are illustrated by curve 3 of Figure 1, below).*
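The calendar composition described above can be checked directly. The following is a minimal sketch using Python's standard `calendar` module; the function names `weekday_counts` and `month_type` are illustrative and do not come from any seasonal adjustment program.

```python
import calendar

def weekday_counts(year, month):
    """Count how often each weekday (Monday=0 ... Sunday=6) occurs in a month."""
    first_weekday, n_days = calendar.monthrange(year, month)
    counts = [0] * 7
    for day in range(n_days):
        counts[(first_weekday + day) % 7] += 1
    return counts

def month_type(year, month):
    """Describe a month by its length and the weekday on which it starts."""
    first_weekday, n_days = calendar.monthrange(year, month)
    return (n_days, first_weekday)

# Distinct types of 31-day months found in the years 1953 through 1965.
types_31 = {
    month_type(y, m)
    for y in range(1953, 1966)
    for m in range(1, 13)
    if month_type(y, m)[0] == 31
}

# March 1961 begins on a Wednesday, so it has five Wednesdays,
# Thursdays, and Fridays, and four of every other weekday.
march_1961 = weekday_counts(1961, 3)
```

Counting Saturdays and Sundays in this way gives the raw material for a trading day adjustment; how the counts are turned into adjustment weights is a separate modeling choice not shown here.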

*Another type of variation that occurs regularly each year arises from the introduction of new models, particularly in the automobile industry. Although new models are introduced at about the same time each year, the exact date is not predetermined and is separately decided upon by the various companies in the industry. To some extent, these decisions are based upon economic conditions rather than climatic and institutional factors. The movements resulting from this factor are referred to as model year variations.*

*This complex of factors yields an annual cycle in many economic series, a cycle that is recurrent and periodic. The pattern varies over time, partly because calendar variations are not the same from year to year, but mainly because of changes in the relative importance of firms, industries, and geographic areas. Thus, the seasonal pattern of construction in the United States has been changing as a result of the increasing importance of the south as compared with the north. The annual cycle is not divisible into shorter periods, because any period less than a year will not contain all the factors that determine the annual cycle; for example, holidays are spaced unevenly over the full 12 months, the school schedule spans most of the year, and the tax collection program has a different impact in the various quarters. Unlike business cycle fluctuations, the timing and pattern of seasonal movements in various economic processes, such as production, investment, and financial markets, are not highly correlated.*

**The role of the seasonal factor.** The pattern and amplitude of the seasonal factor are of considerable interest to economists and businessmen. Reducing the waste of resources that are left idle during seasonal low months is one of the targets of economists concerned with accelerating economic growth. Knowledge of the seasonal pattern in the sales of their products (as well as in the materials they purchase) is helpful to companies in determining the level of production that is most efficient in the light of storage facilities, insurance costs, and the risks of forced selling. It can be used to reduce overordering, overproduction, and overstocking.

*Some companies forecast only their annual total sales. Then, on the basis of this single forecast, they plan their production schedules, determine their inventory and price policies, and establish quotas for their salesmen. For the companies in this group that also experience large seasonal fluctuations, a good first approximation of the monthly pattern of sales can be obtained by prorating the estimated annual total over the months according to the pattern shown by the seasonal factors. A more refined method involves forecasting the cyclical and trend movements for each of the 12 months ahead and applying the seasonal factors to the forecasts. The seasonal factors can be of further value in making shorter-term forecasts as the year progresses. To keep the forecasts current, the original estimates of the cyclical and trend movements can be revised each month in the light of experience to date, and the seasonal factors can be applied to the revised forecasts.*
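The first approximation described above, prorating an annual total over the months, can be sketched as follows. The seasonal indexes and the annual total are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def prorate_annual_forecast(annual_total, seasonal_factors):
    """Spread an annual total over 12 months in proportion to the seasonal factors."""
    if len(seasonal_factors) != 12:
        raise ValueError("expected 12 monthly seasonal factors")
    total = sum(seasonal_factors)
    return [annual_total * f / total for f in seasonal_factors]

# Hypothetical seasonal indexes on a base of 100 (they average 100).
factors = [85, 82, 95, 98, 102, 100, 96, 99, 101, 105, 110, 127]

# Hypothetical annual sales forecast of 1,200 units.
monthly = prorate_annual_forecast(1200.0, factors)
```

Because the allocation is proportional, the monthly figures always add back to the annual forecast, whatever base the seasonal factors are expressed on.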

*But the principal interest in economic time series is usually the longer-term cyclical and trend movements. The cycle consists of cumulative and reversible movements characterized by alternating periods of expansion and contraction. It lasts three to four years, on the average. The trend reflects the still longer-run movements, lasting many years.*

*The nature of the interest in these longer-term movements can be illustrated by the situation in the spring of 1961. About a year earlier a recession had begun in the United States. Although the March and April 1961 data for most economic time series were below the levels reached in March and April 1960, they were higher than in the immediately preceding months. The question was whether the recent improvements were larger or smaller than normal seasonal changes. In forecasting the pattern in the months ahead, it was crucial to know whether the economy had entered a new cyclical phase, that is, whether the economy had been rising or declining to the levels of March and April 1961.*

*An accurate answer to this question was required to determine the economic programs appropriate at the time. If the underlying movements of the economy were continuing downward, anti-recession measures were in order. But if a reversal had taken place, a different policy was needed. A mistaken reading of statistical trends at such a critical juncture could be costly to the economy; one kind of error could lead to increased unemployment; the other, to eventual inflation.*

*Cyclical movements are shown more accurately and stand out more clearly in data that are seasonally adjusted. Seasonally adjusted data not only avoid some of the biases to which the widely used same-month-year-ago comparisons are subject but also reveal cyclical changes several months earlier than such comparisons do. Seasonally adjusted series, therefore, help the economic statistician to make more accurate and more prompt diagnoses of current cyclical trends.*

*Figure 1, computed by the ratio-to-moving-average method (discussed below) by the Bureau of the Census computer program, illustrates various fluctuations discussed above and the resulting seasonally adjusted series. In addition, the figure shows the months for cyclical dominance (MCD) curve. This MCD measure provides an estimate of the appropriate time span over which to observe cyclical movements in a monthly series. In deriving MCD, the average percentage changes (without regard to sign) in the irregular component and in the cyclical component are computed for one-month spans (January-February, February-March, etc.), two-month spans (January-March, February-April, etc.), and so on up to five-month spans. MCD is then the shortest span for which the average change (without regard to sign) in the cyclical component is larger than the average change (without regard to sign) in the irregular component. That is, it indicates the point at which fluctuations begin to be more attributable to cyclical than to irregular movements, and the MCD curve is a moving average of this many months. (This procedure is explained in full detail in Shiskin 1957a.)*
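The MCD rule can be written down compactly. The sketch below assumes that the cyclical and irregular components have already been estimated and are supplied as lists of positive values; the function names, the toy data, and the choice to return `None` when no span up to five months qualifies are assumptions made for illustration, not features of the Census program.

```python
def mean_abs_pct_change(series, span):
    """Average percentage change, without regard to sign, over spans of `span` months."""
    changes = [
        abs(series[i + span] / series[i] - 1.0) * 100.0
        for i in range(len(series) - span)
    ]
    return sum(changes) / len(changes)

def months_for_cyclical_dominance(cycle, irregular, max_span=5):
    """Shortest span at which the cyclical change exceeds the irregular change.

    Returns None if no span up to max_span qualifies (an assumed convention).
    """
    for span in range(1, max_span + 1):
        if mean_abs_pct_change(cycle, span) > mean_abs_pct_change(irregular, span):
            return span
    return None

# Toy components: a cycle growing 0.5 per cent a month and an irregular
# index that repeats every three months.
cycle = [100.0 * 1.005 ** t for t in range(36)]
irregular = [[101.0, 100.0, 99.0][t % 3] for t in range(36)]

mcd = months_for_cyclical_dominance(cycle, irregular)
```

In this toy case the irregular changes outweigh the cyclical changes over one-month and two-month spans, so the rule selects a span of three months.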

**Seasonal adjustment methods.** There are many different methods of adjusting time series for seasonal variations. All are, however, based on the fundamental idea that seasonal fluctuations can be measured and separated from the trend, cyclical, and irregular fluctuations. The task is to estimate the seasonal factor and to eliminate it from the original observations by either subtraction or division, or some combination of the two.

*All familiar methods of seasonal adjustment, including the well-known link-relative and ratio-to-moving-average methods, follow this simple logic. The link-relative method was introduced in 1919 by Warren M. Persons (1919a; 1919b) of Harvard University. The ratio-to-moving-average method was developed in 1922 by Frederick R. Macaulay (1931) of the National Bureau of Economic Research in a study done at the request of the Federal Reserve Board. The ratio-to-moving-average method has the advantages of more precise measurement of the components and greater flexibility. In addition, it permits analysis of each of the successive stages in the seasonal adjustment process. For these reasons, it was adopted by almost all groups engaged in large-scale seasonal adjustment work, despite the fact that it is relatively laborious.*

*The ratio-to-moving-average method.* The first step in the ratio-to-moving-average method is to obtain an estimate of the trend and cyclical factors by the use of a simple moving average that combines 12 successive monthly figures, thereby eliminating the seasonal fluctuations. Such a moving average is known as a “trend-cycle curve” (see curve 6 of Figure 1), since it contains virtually all the trend and cycle movements and few or none of the seasonal and irregular movements in the data. Division of the raw data by the moving average yields a series of “seasonal-irregular” ratios. An estimate of the seasonal adjustment factor for a given month is then secured by averaging the seasonal-irregular ratios for that month over a number of years (see curve 4 of Figure 1). It is assumed that the irregular factor will be canceled out in the averaging process. Finally, the original observations are seasonally adjusted by dividing each monthly observation by the seasonal adjustment factor for the corresponding month (see curve 8 of Figure 1). This method yields a multiplicative seasonal adjustment; an additive adjustment can be made by an analogous procedure. At present there is no way of making a simultaneous additive and multiplicative adjustment by this method.
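The steps just described can be sketched in a few lines. This is a bare-bones illustration, not the Bureau of the Census program: it assumes a monthly series that starts in January and covers at least two years, it centers the 12-month average on each month by giving half weight to the two end months (a common convention assumed here), and it rescales the factors to average 1. All function names are illustrative.

```python
def centered_ma12(series):
    """Centered 12-month moving average; None where the window is incomplete."""
    n = len(series)
    out = [None] * n
    for t in range(6, n - 6):
        window = series[t - 6:t + 7]  # 13 months, end months get half weight
        out[t] = (0.5 * window[0] + sum(window[1:12]) + 0.5 * window[12]) / 12.0
    return out

def seasonal_factors(series):
    """Average the seasonal-irregular ratios month by month."""
    trend_cycle = centered_ma12(series)
    ratios = {m: [] for m in range(12)}
    for t, tc in enumerate(trend_cycle):
        if tc is not None:
            ratios[t % 12].append(series[t] / tc)
    raw = [sum(ratios[m]) / len(ratios[m]) for m in range(12)]
    mean = sum(raw) / 12.0
    return [f / mean for f in raw]  # assumed normalization: factors average 1

def seasonally_adjust(series):
    """Divide each observation by the factor for its calendar month."""
    factors = seasonal_factors(series)
    return [x / factors[t % 12] for t, x in enumerate(series)]

# Toy series: a level of 100 multiplied by a fixed seasonal pattern.
pattern = [0.85, 0.82, 0.95, 0.98, 1.02, 1.00, 0.96, 0.99, 1.01, 1.05, 1.10, 1.27]
toy = [100.0 * pattern[t % 12] for t in range(48)]
adjusted = seasonally_adjust(toy)
```

On this toy series, which has no trend and no irregular component, the estimated factors reproduce the built-in pattern and the adjusted series is flat. Real series need the refinements discussed in the text, such as weighted moving averages and changing seasonal patterns.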

*The ratio-to-moving-average method has been programmed for electronic computers and is in widespread use throughout the world. The first seasonal adjustment computer program was developed at the U.S. Bureau of the Census in the summer of 1954. Shortly thereafter, it was used extensively for national series for the United States, Canada, the Organization for Economic Cooperation and Development countries, Japan, and other countries. It has also been utilized by many private concerns to adjust their own data. The U.S. Bureau of Labor Statistics adopted a similar method in 1960, and other adaptations were introduced at about the same time in several other countries.*

*These programs take advantage of the electronic computer’s high-speed, low-cost computations by utilizing more powerful and refined techniques than clerical methods had used in the past. Thus, weighted moving averages are used to represent the trend-cycle factor and to measure changing seasonal patterns. As a result, the computer programs are likely to produce satisfactory results more frequently. They also produce more information about each series, for example, estimates of the trend-cycle and irregular components and of the relations between them. This information can be used for checking the adequacy of the results, for forecasting seasonal and other movements, and for studying the relations among different types of economic fluctuations. For example, for the data graphed in Figure 1 the Bureau of the Census program also gives an indication of the relative importance of the components of the retail sales series by calculating the per cent of the total variation that is contributed by these components over one-month and longer spans. These data are shown in Table 1.*

Table 1 - Per cent of total variation contributed by components of total retail sales, United States, 1953-1965

| COMPONENT | PER CENT OF VARIATION: Month-to-month | PER CENT OF VARIATION: Twelve-month spans |
|---|---|---|
| Holiday | 0.4 | 0.5 |
| Trading day | 33.1 | 7.1 |
| Seasonal | 64.8 | 0.0 |
| Irregular | 1.2 | 2.4 |
| Trend-cycle | 0.5 | 90.0 |
| Total | 100.0 | 100.0 |

Source: U.S. Bureau of the Census 1966, p. 33.

*The Bureau of the Census method was designed to analyze a large variety of series equally well. To this end, alternative routines to handle different kinds of series were built into the program, along with techniques for automatically selecting the most appropriate routine for each series. The completeness, versatility, and economy of this method have stimulated broad interest in economic time series analysis in recent years.*

*This program adjusts for changes in average climatic conditions and institutional arrangements during the year. Adjustments for variations in the number of trading days are also made for some series—for example, new building permits. Further adjustments for variable holidays, such as Easter, are made for certain series, such as retail sales of apparel. Similar adjustments for Labor Day and Thanksgiving Day help bring out the underlying trends. Studies of the effects of unusual weather upon some series have also been started. It is important to note, however, that conventional methods adjust for average weather conditions, and not for the dispersion about this average. For this reason many seasonally adjusted series, such as housing starts, will tend to be low in months when the weather is unusually bad and high in months when the weather is unusually good.*

*The variants of the ratio-to-moving-average method all give about the same results, and there is considerable evidence that this method adjusts a large proportion of historical series very well. There are, however, some series that cannot be satisfactorily adjusted in this way—for example, those with abrupt changes in seasonal patterns or with constant patterns of varying amplitudes, or those which are highly irregular. Another problem concerns the appropriate seasonal adjustment of an aggregate that can be broken down into different sets of components, each with a different seasonal pattern. However, the principal problem remaining now appears to be obtaining satisfactory seasonal adjustment factors for the current year and the year ahead. These are less accurate than those for previous years, but they play a more important role in the analysis of current economic trends and prospects.*

*Regression methods.* Attempts to use regression methods to analyze time series have been intensified since electronic computers have become available. The basic principle is to represent each of the systematic components by explicit mathematical expressions, usually in the functional form of a linear model. This can be accomplished in a simple form, for example, by regressing the difference between the unadjusted series and the trend-cycle component for each month on the trend-cycle values for that month. The constant term in the regression equation is the additive part of the seasonal component, and the product of the regression coefficient and the trend-cycle value is the multiplicative part of the seasonal component. Thus, this approach has the advantage over the ratio-to-moving-average method that it is not committed to a single type of relationship (e.g., additive or multiplicative) among the seasonal, cyclical, and irregular components of the series.
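The simple regression form mentioned above can be illustrated as follows. The sketch assumes that a trend-cycle estimate is already available (here it is simply given) and fits, for each calendar month separately, an ordinary least squares line to the difference between the series and the trend-cycle. The function names and the toy coefficients are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def monthly_seasonal_regressions(series, trend_cycle):
    """Regress (series - trend_cycle) on trend_cycle for each calendar month.

    Returns 12 pairs (a, b): a is the additive part of the seasonal
    component, and b times the trend-cycle value is the multiplicative part.
    """
    coefs = []
    for m in range(12):
        x = [trend_cycle[t] for t in range(m, len(series), 12)]
        y = [series[t] - trend_cycle[t] for t in range(m, len(series), 12)]
        coefs.append(fit_line(x, y))
    return coefs

# Toy data: a rising trend-cycle with known seasonal coefficients per month.
trend = [100.0 + 2.0 * t for t in range(48)]
true_a = [m - 5.5 for m in range(12)]            # hypothetical additive parts
true_b = [0.01 * (m - 5.5) for m in range(12)]   # hypothetical multiplicative parts
observed = [
    trend[t] + true_a[t % 12] + true_b[t % 12] * trend[t] for t in range(48)
]
coefs = monthly_seasonal_regressions(observed, trend)
```

With noise-free toy data the fitted constants and slopes reproduce the built-in values; with real data the residuals from these regressions would form the irregular component.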

*Another advantage is that the different types of fluctuations can be related to the forces causing them by representing these forces as appropriate variables in the mathematical expressions. Thus, in measuring the seasonal factor, direct allowance can be made, say, for the level of the series or for temperature and precipitation. In certain series, special factors could be taken into account; for example, in measuring the seasonal factor in unemployment, allowance could be made for the number of students in the labor force, or in the case of automobile sales, the level of automobile dealers’ inventories could be taken into account. Finally, the mathematical expressions for the estimates of the systematic components provide the basis for deriving measures of variance and significance tests to evaluate the reliability of the estimates; this applies, for example, to estimates of the seasonally adjusted series and to the seasonal component, or to the differences in either series over time.*

*The principal doubt about the regression approach is whether fairly simple functional forms can adequately measure the implicit economic patterns. Or, to consider the matter from another point of view, do the complex mathematical forms required to represent the systematic movements of historical series constitute a plausible theory of economic fluctuations? A related question is whether either fairly simple functional forms, which only crudely measure historical patterns, or the more complex forms, which fit the past more closely, can provide the basis for accurate forecasts of future patterns.*

*Thus far, regression methods have been applied to only a small number of series, and their powers to decompose series into the various systematic components and to forecast seasonal factors for future years have not yet been fully tested. While the ratio-to-moving-average method does not have the advantages provided by the mathematical properties of the regression method, extensive tests have demonstrated that it gives good results in practice. Tests completed at the Bureau of the Census show that regression methods yield historical seasonal factors very similar to those yielded by the ratio-to-moving-average method, but that the regression “year-ahead” factors and the trend-cycle curves are less accurate.*

**Criteria for judging a seasonal adjustment.** Although it is not now possible to draw a set of hard-and-fast rules for judging the success of a seasonal adjustment, five guidelines have proved useful.

*(a) Any repetitive intrayear pattern present in a series before seasonal adjustment should be eliminated and thus should not appear in the seasonally adjusted series, in the trend-cycle component, or in the irregular component. This implies that the seasonal factors are not correlated with the seasonally adjusted series, or with the trend-cycle or irregular components. (The correlations should be computed year by year, because residual seasonality sometimes shows up with inverse patterns in different years.)*

*(b) The underlying cyclical movements should not be distorted. Seasonally adjusted series that in unadjusted form had a large seasonal factor should be consistent in terms of cyclical amplitude, pattern, and timing with other related economic series that either had no seasonal factor at all or had a small seasonal factor compared with the cyclical factor. Similarly, changes in a seasonally adjusted series such as new orders for machinery and equipment should be followed by like changes in a corresponding series such as sales.*

*(c) The irregular fluctuations should behave like a random series when autocorrelations of lags of about 12 months are considered. Autocorrelations of smaller lags need not necessarily behave like the similar autocorrelations of a random series because some irregular influences, such as a long strike, spread their effects over several months. A seasonally adjusted artificial series containing a random component should produce a random series as the irregular component.*

*(d) The sum of the seasonally adjusted series should be equal to the sum of the unadjusted series. For most series, sums are meaningful in economic terms, and the preservation of sums meets the common-sense requirement that the number of units produced, traded, or exported in a year should not be altered by the seasonal adjustment.*

*(e) Revisions in the seasonal factors that take place when data for additional years become available should be relatively small.*
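Guidelines (c) and (d) lend themselves to simple numerical checks. The sketch below computes a sample autocorrelation at a chosen lag and the per cent discrepancy between annual totals of the adjusted and unadjusted data. The function names are illustrative, and the thresholds at which a result should be judged unsatisfactory are left open, since the text does not specify them.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return num / denom

def annual_sum_discrepancies(original, adjusted):
    """Per cent difference between adjusted and original totals, year by year.

    Only complete 12-month years, counted from the first observation, are used.
    """
    out = []
    for start in range(0, len(original) - 11, 12):
        o = sum(original[start:start + 12])
        a = sum(adjusted[start:start + 12])
        out.append(100.0 * (a - o) / o)
    return out

# A series that still contains a fixed seasonal pattern shows a large
# autocorrelation at lag 12, which is what guideline (c) rules out.
pattern = [0.85, 0.82, 0.95, 0.98, 1.02, 1.00, 0.96, 0.99, 1.01, 1.05, 1.10, 1.27]
residual_seasonal = [pattern[t % 12] for t in range(48)]
r12 = autocorrelation(residual_seasonal, 12)
```

For an irregular component that behaves like a random series, the lag-12 autocorrelation should be close to zero; a value such as the one obtained here would point to residual seasonality.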

**Tests of seasonal adjustments.** With the massive increase in the number of series seasonally adjusted in recent years, due largely to the increasing use of electronic computers for this purpose, the need for routine objective tests of the quality of the adjustments has grown.

*A general type of test involves examining the results of applying a seasonal adjustment procedure to artificial series. One method of constructing suitable artificial series is to combine the irregular, cyclical, and seasonal factors from different real economic series into artificial aggregates; that is, the seasonal factor from one economic series, the trend-cycle factor from another, and the irregular factor from a third are multiplied together to form a new series. A test of the Bureau of the Census method, using 15 different types of such artificial series, revealed that in most instances the “estimated” components trace a course similar to that of the “true” components (Shiskin 1958). Although some limitations were evident, this test showed that the Census method has considerable power to rediscover the different types of fluctuations that were built into the series and does not generate arbitrary fluctuations that have no relationship to the original observations.*

*A statistical test for the presence of a stable seasonal adjustment component may be made by using the analysis of variance and the associated F-test. This is a test of the null hypothesis that monthly means are equal. Here, the variance estimated from the sum of squares of the differences between the average for each month and the average for all months (between-months variance) is compared with the variance estimated from the sum of squares over all months of the differences between the values for each month and the average for that month (within-months variance). If the between-months variance of the “seasonal-irregular” ratios (computed by dividing the original observations by an estimate of the trend-cycle component) is significantly greater than the within-months variance, it can usually be assumed that there is a true seasonal factor in the series. If the between-months variance is not significantly greater than the within-months variance of the irregular series (computed by dividing the seasonally adjusted series by an estimate of the trend-cycle component), then it can usually be assumed that a complete seasonal adjustment has been made. This test must, however, be used cautiously because differences between months can also appear as a result of differences in the behavior of the irregular component from month to month, because differences between months may be hidden when changes in seasonality in one month are offset by changes in another month, and because the assumptions of the test may not be well satisfied. Nevertheless, the F-test has proved to be a useful test of stable seasonality in practice. [See Linear hypotheses, article on analysis of variance.]*
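The F statistic described above is the ratio of the between-months to the within-months mean square in a one-way analysis of variance. The sketch below computes it for a list of monthly ratios, assuming the list starts in January and covers whole years; the function name and toy data are illustrative. Looking up the significance level in an F table with 11 and n - 12 degrees of freedom is left to the reader.

```python
def stable_seasonality_f(ratios):
    """One-way analysis of variance F statistic with calendar months as groups.

    ratios[t] is assumed to belong to calendar month t % 12. The within-months
    sum of squares must be positive, otherwise the ratio is undefined.
    """
    n = len(ratios)
    groups = [ratios[m::12] for m in range(12)]
    grand_mean = sum(ratios) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(
        len(g) * (mu - grand_mean) ** 2 for g, mu in zip(groups, means)
    )
    ss_within = sum(
        (x - mu) ** 2 for g, mu in zip(groups, means) for x in g
    )
    df_between = 12 - 1
    df_within = n - 12
    return (ss_between / df_between) / (ss_within / df_within)

# Toy seasonal-irregular ratios: a fixed pattern plus a small disturbance
# that is +0.01 in the first year and -0.01 in the second.
pattern = [0.85, 0.82, 0.95, 0.98, 1.02, 1.00, 0.96, 0.99, 1.01, 1.05, 1.10, 1.27]
si_ratios = [p + 0.01 for p in pattern] + [p - 0.01 for p in pattern]
f_stat = stable_seasonality_f(si_ratios)
```

In this toy case the between-months variance is far larger than the within-months variance, so the F statistic is large and stable seasonality would be indicated.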

*Experience in applying spectral analysis to physical science data has encouraged researchers to explore its use in economics, and this technique is now being used to test for seasonality. Spectral analysis distributes the total variance of a series according to the proportion that is accounted for by each of the cycles of all possible periodicities, in intervals for 2-month and longer cycles. If there is a seasonal pattern in a series, a large proportion of the variance will be accounted for by the 12-month cycle and its harmonics (cycles of 6, 4, 3, 2.4, and 2 months). A significant proportion of the total variance of a seasonally adjusted economic series should be accounted for by a cycle of 45 to 50 months, the average duration of the business cycle, but not by the 12-month cycle or its harmonics. A random series would not be expected to show a significant cycle at any periodicity. While quantitative statistical methods based on suitable assumptions for economic time series have not yet been developed for determining from a spectrum whether seasonality exists, such judgments can often be made from inspection of charts of the spectra.*
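A rough numerical counterpart to inspecting a chart of the spectrum is to compute the share of the variance that falls at the 12-month cycle and its harmonics. The sketch below uses a plain discrete Fourier sum rather than a smoothed spectral estimate, and it assumes a series whose length is a multiple of 12 so that these periods fall on exact Fourier frequencies. The function names are illustrative.

```python
import math

SEASONAL_PERIODS = [12, 6, 4, 3, 2.4, 2]  # the 12-month cycle and its harmonics

def variance_share(series, period):
    """Share of the total sum of squares at the cycle of the given period.

    Assumes len(series) is a multiple of 12 so the period is a Fourier period.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    w = 2.0 * math.pi / period
    c = sum(d * math.cos(w * t) for t, d in enumerate(dev))
    s = sum(d * math.sin(w * t) for t, d in enumerate(dev))
    # The 2-month cycle has no distinct mirror frequency, so it is counted once.
    weight = 1.0 if period == 2 else 2.0
    return weight * (c * c + s * s) / n / sum(d * d for d in dev)

def seasonal_share(series):
    """Combined variance share of the seasonal periods."""
    return sum(variance_share(series, p) for p in SEASONAL_PERIODS)

# Toy series: a 12-month cosine of amplitude 10 plus a 6-month sine of amplitude 5.
wave = [
    10.0 * math.cos(2.0 * math.pi * t / 12.0)
    + 5.0 * math.sin(2.0 * math.pi * t / 6.0)
    for t in range(48)
]
share_12 = variance_share(wave, 12)
share_6 = variance_share(wave, 6)
```

For the toy series the 12-month cycle carries four fifths of the variance and the 6-month harmonic the remaining fifth. A well adjusted series would be expected to show only a small combined share at these periods.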

*A question sometimes raised about spectral analysis is whether it is appropriate to consider an economic time series from the viewpoint of the frequency domain, as spectral analysis does, rather than the time domain, as most other methods do. This question comes up mainly because economic series are available for relatively short periods and economic cycles, other than the seasonal, are irregular in length and amplitude. However, the prospect that mathematical representation of a time series in this way may reveal relationships not otherwise apparent would appear to make this alternative view worth further exploration.*

*These tests do not provide enough information to determine whether all the criteria listed above are satisfied. To this end, comparisons of the sums of seasonally adjusted and unadjusted data are also made, often for all fiscal years in addition to calendar years. The magnitude of revisions resulting from different methods of seasonal adjustment is usually appraised by seasonally adjusting series which cover periods successively longer by one year (e.g., 1948-1954, 1948-1955, 1948-1956, and so forth) and comparing the seasonal factors for the terminal years with the “ultimate” seasonal factors.*
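The comparison of annual sums mentioned above can be sketched as follows. The function name `annual_discrepancies` and its `start` offset (used to shift from calendar years to fiscal years) are assumptions for illustration only; the source does not specify how the comparison is computed.

```python
def annual_discrepancies(original, adjusted, start=0, period=12):
    """Percent difference between adjusted and unadjusted sums for each full year.

    start: index of the first month of the first year to compare
           (0 for calendar years if the data begin in January; a different
           offset gives fiscal years).
    """
    if len(original) != len(adjusted):
        raise ValueError("series must have the same length")
    result = []
    for i in range(start, len(original) - period + 1, period):
        unadjusted_sum = sum(original[i:i + period])
        adjusted_sum = sum(adjusted[i:i + period])
        result.append(100.0 * (adjusted_sum - unadjusted_sum) / unadjusted_sum)
    return result


# Two years of made-up monthly data; the first adjusted year runs 1 per cent high.
original = [100] * 24
adjusted = [101] * 12 + [100] * 12
print(annual_discrepancies(original, adjusted))           # → [1.0, 0.0]
print(annual_discrepancies(original, adjusted, start=6))  # → [0.5]
```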

*Relations of seasonal to other fluctuations. An analysis has been made of the cyclical, seasonal, and irregular amplitudes of a sample of about 150 series considered broadly representative of the different activities of the U.S. economy. This study revealed that, for the post-World War II period, seasonal movements dominate other kinds of month-to-month movements in most current economic series. Seasonal movements are almost always larger than either the irregular or the cyclical movements, and they are often larger than both of the other types combined. More specifically, the average monthly amplitude of the seasonal fluctuations exceeds that of the cyclical factor in 78 per cent of the series, exceeds the irregular factor in 65 per cent of the series, and exceeds the cycle-trend and irregular factors in combination in 45 per cent of the series. Furthermore, where the seasonal factor is larger, it is often much larger. The seasonal factor is three or more times as large as the cyclical factor in 45 per cent of the series, three or more times as large as the irregular factor in 16 per cent of the series, and three or more times as large as the cyclical and irregular fluctuations together in 11 per cent of the series. (See Shiskin 1958.) These results apply to observations of change over intervals of one month; over longer spans the relative importance of the several components would, of course, be different. Table 1 shows how seasonal and trading day fluctuations, which dominate the short-term movements, give way in relative importance to the trend-cycle factor when comparisons are made over longer periods.*

*These findings emphasize the advantages of seasonally adjusted series over those not so adjusted for studying cyclical movements. Where the seasonal fluctuations are large, a difference in the unadjusted data for two months may be due largely or solely to normal seasonal fluctuations; if the data are seasonally adjusted, the difference can be assumed to be caused chiefly by cyclical or irregular factors.*

*Julius Shiskin*

*BIBLIOGRAPHY*

*Much of the material in this article is discussed in greater detail in* Shiskin 1957a; 1958; *and the Shiskin paper in* Organization for European Economic Cooperation 1961. *References that deal with the problem of seasonal adjustment as it relates to current economic conditions are* Organization for European Economic Cooperation 1961 *and* Shiskin 1957a; 1957b. *Early works on seasonality and seasonal adjustment methods are* Barton 1941; Burns & Mitchell 1946; Hotelling & Hotelling 1931; Kuznets 1933; Macaulay 1931; *and* Persons 1919a; 1919b. *Works dealing with the history of the Bureau of the Census method and its variants are* Organization for European Economic Cooperation 1961; Shiskin & Eisenpress 1957; Shiskin et al. 1965; *and* Young 1965. *Alternative methods are described in* Hannan 1963; Rosenblatt 1963; *and* U.S. Bureau of Labor Statistics 1964. *Tests of seasonal adjustment methods are discussed in* Burns & Mitchell 1946; Granger & Hatanaka 1964; Hannan 1963; Kuznets 1933; Rosenblatt 1963; *and* Shiskin 1957b.

*Barton, H. C. Jr. 1941 Adjustment for Seasonal Variation. Federal Reserve Bulletin 27:518-528.*

*Burns, Arthur F.; and Mitchell, Wesley C. 1946 Measuring Business Cycles. National Bureau of Economic Research, Studies in Business Cycles, No. 2. New York: The Bureau. → See especially pages 43-55, “Treatment of Seasonal Variations” and “Notes on the Elimination of Seasonal Variations.”*

*Granger, Clive W. J.; and Hatanaka, M. 1964 Spectral Analysis of Economic Time Series. Princeton Univ. Press. → See also “Review” by H. O. Wold in Annals of Mathematical Statistics, February 1967, pages 288-293.*

*Hannan, E. J. 1963 The Estimation of Seasonal Variations in Economic Time Series. Journal of the American Statistical Association 58:31-44.*

*Hotelling, Harold; and Hotelling, Floy 1931 Causes of Birth Rate Fluctuations. Journal of the American Statistical Association 26:135-149.*

*Kuznets, Simon 1933 Seasonal Variations in Industry and Trade. New York: National Bureau of Economic Research.*

*Macaulay, Frederick R. 1931 The Smoothing of Time Series. New York: National Bureau of Economic Research.*

*Organization for European Economic Cooperation 1961 Seasonal Adjustment on Electronic Computers. Report and proceedings of an international conference held in November 1960. Paris: Organization for Economic Cooperation and Development.*

*Persons, Warren M. 1919a Indices of Business Conditions. Review of Economics and Statistics 1:5-107.*

*Persons, Warren M. 1919b An Index of General Business Conditions. Review of Economics and Statistics 1:111-205.*

*Rosenblatt, Harry M. (1963) 1965 Spectral Analysis and Parametric Methods for Seasonal Adjustment of Economic Time Series. U.S. Bureau of the Census, Working Paper No. 23. Washington: Government Printing Office. → First published in American Statistical Association, Business and Economics Section, Proceedings, pages 94-133.*

*Shiskin, Julius 1957a Electronic Computers and Business Indicators. National Bureau of Economic Research, Occasional Paper No. 57. New York: The Bureau. → First published in Volume 30 of the Journal of Business.*

*Shiskin, Julius 1957b Seasonal Adjustments of Economic Indicators: A Progress Report. Pages 39-63 in American Statistical Association, Business and Economics Section, Proceedings. Washington: The Association.*

*Shiskin, Julius 1958 Decomposition of Economic Time Series. Science 128:1539-1546.*

*Shiskin, Julius; and Eisenpress, Harry (1957) 1958 Seasonal Adjustments by Electronic Computer Methods. National Bureau of Economic Research, Technical Paper No. 12. New York: The Bureau. → First published in Volume 52 of the Journal of the American Statistical Association.*

*Shiskin, Julius et al. 1965 The X-11 Variant of the Census Method II Seasonal Adjustment Program. U.S. Bureau of the Census, Technical Paper No. 15. Washington: The Bureau.*

*U.S. Bureau of Labor Statistics 1964 The BLS Seasonal Factor Method (1964). Washington: The Bureau.*

*U.S. Bureau of the Census Monthly Retail Trade Report [1966]: August.*

*Young, Allan H. 1965 Estimating Trading-day Variation in Monthly Economic Time Series. U.S. Bureau of the Census, Technical Paper No. 12. Washington: The Bureau.*


## time series

**time series** A set of observations ordered in time and usually equally spaced; each observation may be related in some way to its predecessors. Time-series problems arise in economics, commerce, industry, meteorology, demography, or any field in which the same measurements are regularly recorded. *Time-series analysis* is based on models of the variability of observations in a time series, by postulating trends, cyclic effects, and short-term relationships, with a view to understanding the causes of variation and to improving forecasting (see also periodogram). *Autoregression* is the use of regression analysis to relate observations to their predecessors. *Moving-average methods* use the means of neighboring observations to reveal underlying trends. Autoregression and moving averages are combined in *ARMA* (or *Box-Jenkins*) forecasting techniques.
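The two building blocks named in this entry can be illustrated with a short Python sketch. It shows a centered moving average and a first-order autoregression fitted by ordinary least squares; the function names `moving_average` and `fit_ar1` are assumptions for the example, and a full ARMA (Box-Jenkins) procedure involves considerably more than is shown here.

```python
def moving_average(series, window):
    """Means of each run of `window` neighboring observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and the series length")
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + slope * x[t-1]."""
    prev, curr = series[:-1], series[1:]          # predecessors and successors
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        raise ValueError("predecessors are constant; slope is undefined")
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    slope = sxy / sxx
    intercept = mean_curr - slope * mean_prev
    return intercept, slope


# A noise-free series generated by x[t] = 2 + 0.5 * x[t-1], starting at 0.
x = [0.0]
for _ in range(10):
    x.append(2 + 0.5 * x[-1])

print(moving_average([1, 2, 3, 4, 5], 3))  # → [2.0, 3.0, 4.0]
print(fit_ar1(x))                          # intercept near 2, slope near 0.5
```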

Cyclic influences may be of known period (months in a year or days in a week) and data may be seasonally adjusted on the basis of long-term means. Cyclic influences of unknown period may be studied by *spectral analysis*.
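Adjustment on the basis of long-term means can be sketched as below, under the assumption of an additive seasonal effect and a series that starts at the first position of a cycle. The function name `adjust_by_long_term_means` is hypothetical, and real adjustment programs also allow for trend and for changing seasonal patterns, which this sketch ignores.

```python
def adjust_by_long_term_means(series, period=12):
    """Subtract each position's long-term deviation from the overall mean."""
    if len(series) < period:
        raise ValueError("need at least one full cycle of observations")
    overall_mean = sum(series) / len(series)
    # Long-term mean for each position in the cycle (e.g. each calendar month).
    position_means = []
    for p in range(period):
        values = series[p::period]
        position_means.append(sum(values) / len(values))
    seasonal_effect = [m - overall_mean for m in position_means]
    return [x - seasonal_effect[t % period] for t, x in enumerate(series)]


# Three "years" of a purely seasonal quarterly pattern.
quarterly = [10, 20, 30, 40] * 3
print(adjust_by_long_term_means(quarterly, period=4))  # every value becomes 25.0
```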

Analogous techniques may be used for data regularly ordered in space rather than time.

## Time Series

# TIME SERIES

A "time series" is an epidemiological research design in which a single population group of defined size is studied over a period during which preventive or therapeutic interventions take place, with measurements of factors and variables of interest at specified time intervals. The aim is to detect trends such as variations in incidence rates of disease or other health-related phenomena in response to particular interventions. It may be a simple pre-test/post-test design, or an interrupted time series, in which several measurements are made both before and after an intervention; the latter is regarded as the more valid of these methods.

John M. Last

(see also: *Cohort Study; Epidemiology; Observational Studies* )

## time series

**time series** A data set in which the observations are separated by equal intervals of time and arranged in order of occurrence. The series may be for individual or averaged values which can be analysed by statistical techniques, including spectrum or harmonic analyses.
