Specification Tests


The term specification tests is used in economics to denote tests for departures from the premises of inference in empirical modeling. This use of the term, however, is confusing because it conflates statistical with substantive specification errors (misspecifications), rendering discussions of specification tests problematic (see Theil 1957).

In the context of a statistical model, the relevant misspecifications concern departures from the probabilistic assumptions constituting the statistical premises, such as [1]-[5] in Table 1.

In contrast, specification errors in relation to a structural model concern the model's inadequacies in relation to: (a) the demarcation of the segment of reality to be modeled; (b) the crucial aspects of the phenomenon to be quantified; and (c) the extent to which the inferences based on the structural model are germane to the phenomenon of interest. In addition, reliable probing for such specification errors can only take place in the context of a statistically adequate model, that is, an estimated statistical model whose assumptions are (approximately) true for the data in question. Statistical inadequacy renders inference procedures (estimation, testing, prediction) unreliable because the nominal error probabilities differ from the actual ones. A statistically adequate model provides reliable inference procedures to probe for substantive inadequacies.

Table 1
Normal/linear regression model
Statistical Generating Mechanism (GM): yt = β0 + β1xt + ut, t ∈ N,
[1] Normality: (yt ǀ Xt = xt) ~ N(·,·),
[2] Linearity: E(yt ǀ Xt = xt) = β0 + β1xt,
[3] Homoskedasticity: Var(yt ǀ Xt = xt) = σ²,
[4] Independence: {(yt ǀ Xt = xt), t ∈ N} is an independent process,
[5] t-invariance: θ := (β0, β1, σ²) does not change with t.
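The statistical GM and assumptions [1]-[5] of Table 1 can be made concrete with a small simulation. The following Python sketch is an illustration, not part of the original entry; all parameter values are arbitrary. It generates data for which [1]-[5] hold by construction and recovers θ by ordinary least squares:

```python
import numpy as np

# Illustrative simulation (all parameter values are arbitrary assumptions).
# Data are generated so that assumptions [1]-[5] of Table 1 hold by
# construction: Normal, independent, homoskedastic errors with a linear,
# t-invariant conditional mean.
rng = np.random.default_rng(0)
n = 500
beta0, beta1, sigma = 1.0, 2.0, 0.5

x = rng.normal(size=n)                    # regressor x_t
u = rng.normal(scale=sigma, size=n)       # NIID errors: [1], [3], [4], [5]
y = beta0 + beta1 * x + u                 # linear conditional mean: [2]

# OLS estimates of theta = (beta0, beta1)
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                           # close to (1.0, 2.0)
```

Because the assumptions hold by construction here, M-S tests applied to such data should, up to their error probabilities, detect no departures.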

Specification tests in the context of a structural model refer to assessing the adequacy of the three choices described above, including incongruous measurement and external invalidity (see Spanos 2006a). The quintessential form of a substantive specification error, however, is the omitted-variables problem. The issue is one of substantive inadequacy insofar as subject-matter information raises the possibility that certain influential factors Wt might have been omitted from the relevant factors Xt explaining the behavior of yt (Leamer 1990). That is, omitting certain potentially important factors Wt may confound the influence of Xt on yt, leading to misleading inferences. Stated in a statistically coherent fashion, the problem is one of comparing the following two models:

M0: yt = β0 + β′xt + ut,   (1)
M1: yt = α0 + α′xt + γ′wt + εt,   (2)

using the hypotheses:

H0: γ = 0   vs.   H1: γ ≠ 0.

Models (M0, M1) are based on the same statistical information Z := (y, X, W), but M0 is a special case of M1 subject to the substantive restrictions γ = 0; M0 can be viewed as a structural model embedded into the statistical model M1. Assuming that the statistical model M1 is statistically adequate, the statistical parameterizations of these models can be used to assess the relationship between Wt, Xt, and yt by evaluating the broader issues of confounding and spuriousness using hypothesis testing (see Spanos 2006b).

The statistical parameterizations associated with the two models, β and θ = (α, γ), take the form:

β = Σ22⁻¹σ21,
α = (Σ22 − Σ23Σ33⁻¹Σ32)⁻¹(σ21 − Σ23Σ33⁻¹σ31),
γ = (Σ33 − Σ32Σ22⁻¹Σ23)⁻¹(σ31 − Σ32Σ22⁻¹σ21),

where σ11 = Var(yt), σ21 = Cov(Xt, yt), σ31 = Cov(Wt, yt), Σ22 = Cov(Xt), Σ23 = Cov(Xt, Wt), Σ32 = Σ23′, Σ33 = Cov(Wt). The textbook omitted-variables argument attempts to assess the seriousness of this unreliability using the sensitivity of the estimator β̂ = (X′X)⁻¹X′y to the inclusion/exclusion of Wt, by tracing that effect to the potential bias/inconsistency of β̂. Spanos (2006b) argues that the sensitivity of point estimates provides a poor basis for addressing the confounding problem. Although the confounding and spuriousness issues are directly or indirectly related to the parameters α and β, their appraisal depends crucially on the values of all three covariances (σ21, σ31, Σ23), and it can only be addressed adequately in a hypothesis-testing setup supplemented with a post-data evaluation of inference based on severe testing (see Spanos 2006b for details).
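The role of the covariances (σ21, σ31, Σ23) in confounding can be seen in a small simulation. The following Python sketch is an illustration with arbitrary numbers, not from the article: it generates a case where Wt is correlated with Xt, so the short-regression slope in M0 picks up part of γ's effect, while the F-test of H0: γ = 0 in M1 addresses the question directly.

```python
import numpy as np
from scipy import stats

# Illustrative sketch: the omitted factor w_t is correlated with x_t, so the
# short-regression slope is confounded (true slope 2.0 plus 1.5 * 0.8 = 3.2),
# while testing H0: gamma = 0 in the long regression poses the question
# directly. All numbers are assumptions for the simulation.
rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)
w = 0.8 * x + rng.normal(scale=0.6, size=n)    # w_t correlated with x_t
y = 1.0 + 2.0 * x + 1.5 * w + rng.normal(scale=0.5, size=n)

X0 = np.column_stack([np.ones(n), x])           # M0: omits w_t
X1 = np.column_stack([np.ones(n), x, w])        # M1: includes w_t

b0, *_ = np.linalg.lstsq(X0, y, rcond=None)     # confounded slope, near 3.2
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)     # alpha near 2.0, gamma near 1.5

# F-test of the substantive restriction H0: gamma = 0
rss0 = np.sum((y - X0 @ b0) ** 2)
rss1 = np.sum((y - X1 @ b1) ** 2)
F = ((rss0 - rss1) / 1) / (rss1 / (n - 3))
p = stats.f.sf(F, 1, n - 3)
print(b0[1], b1[1], b1[2], p)   # confounded slope, alpha, gamma, tiny p-value
```

The point estimate b0[1] alone cannot distinguish confounding from a genuinely larger effect of Xt; the hypothesis-testing setup can.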

Specification (misspecification) tests in the context of a statistical model refer to assessing the validity of the probabilistic assumptions constituting the statistical model in question. Taking the normal/linear regression as an example (Table 1), specification error denotes any form of departure from assumptions [1]-[5] arising from viewing the observed data Z := (y, X) as a truly typical realization of the stochastic process {(yt ǀ Xt = xt), t ∈ N}. This probing takes the form of Mis-Specification (M-S) tests for departures from assumptions [1]-[5] (Spanos 1986, 1999).

To get some idea as to how these M-S tests can be viewed as probing for departures from the model assumptions, consider the linearity assumption [2]:

Linearity: E(yt ǀ Xt = xt) = β0 + β′xt.

When this assumption is invalid, E(yt ǀ Xt = xt) = h(xt) for some non-linear function h(·). For the comparison to be operational, we need to specify a particular form for h(·), say:

Nonlinearity: E(yt ǀ Xt = xt) = β0 + β′xt + δ′ψt(xt),

where ψt(xt) includes, say, the second-order terms (xit xjt), i, j = 2, …, k; note that x1t = 1. This comparison creates a situation of two competing models:

Model 1: yt = β0 + β′xt + ut,
Model 2: yt = β0 + β′xt + δ′ψt(xt) + vt,

which, superficially, resembles the initial setup of models (1) and (2) but is fundamentally different in the sense that ψt(xt) does not comprise different variables but functions of xt. Subtracting Model 1 from Model 2 gives rise to the following auxiliary regression:

ût = c0 + c1′xt + c2′ψt(xt) + εt,

where ût denotes the residuals from Model 1, which can provide a basis for testing the linearity assumption using the hypotheses:

H0: c2 = 0   vs.   H1: c2 ≠ 0.

The test of choice is the F-test (see Spanos 1986). Using the same type of reasoning, one can argue that the regression function will be affected by departures from other assumptions, such as [4] independence, leading to the auxiliary regression:

ût = c0 + c1′xt + c2′Zt−1 + εt,

where Zt−1 := (yt−1, xt−1) (see Spanos 1986).

Pursuing the same reasoning further, one can derive an auxiliary regression to provide the basis for a joint test of how certain departures from assumptions [2] and [4] might affect the assumed regression function. This constitutes a combination of the previous two auxiliary regressions, leading to:

ût = c0 + c1′xt + c2′ψt(xt) + c3′Zt−1 + εt,

expressed in terms of the hypotheses:

H0: c2 = 0 and c3 = 0   vs.   H1: c2 ≠ 0 or c3 ≠ 0.

All the M-S tests introduced above are F-type tests based on the joint significance of the coefficients of the omitted factors. The only reliable conclusion one can draw on the basis of each of these M-S tests is whether there is evidence that Model 1 is misspecified in the direction being probed (see Spanos 1999; Mayo and Spanos 2004).
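A minimal Python sketch of such an F-type M-S test, assuming a single regressor so that ψt(xt) reduces to the squared term (all simulated values are illustrative and not from the article): the Model 1 residuals are regressed on xt and ψt(xt), and the joint significance of the ψt coefficients is assessed with an F-test.

```python
import numpy as np
from scipy import stats

# Linearity M-S test via the auxiliary regression. The data are simulated
# under a quadratic departure from linearity, so the test should reject.
# All parameter values are illustrative assumptions.
rng = np.random.default_rng(2)
n = 400
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.7 * x**2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x])         # Model 1 (assumes linearity)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ b                            # residuals from Model 1

# Auxiliary regression: u_hat on (1, x) and psi_t(x) = x^2.
Xa = np.column_stack([np.ones(n), x, x**2])
c, *_ = np.linalg.lstsq(Xa, u_hat, rcond=None)

# Restricted RSS: u_hat is orthogonal to (1, x), so it equals sum(u_hat^2).
rss_r = np.sum(u_hat ** 2)
rss_u = np.sum((u_hat - Xa @ c) ** 2)
F = ((rss_r - rss_u) / 1) / (rss_u / (n - 3))   # 1 restriction, 3 parameters
p = stats.f.sf(F, 1, n - 3)
print(F, p)   # large F, tiny p: linearity rejected in the direction probed
```

A rejection here warrants only the conclusion that Model 1 is misspecified in the direction probed (here, second-order nonlinearity), not that the particular alternative is the correct model.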

SEE ALSO Hausman Tests; Specification; Specification Error

BIBLIOGRAPHY

Leamer, Edward E. 1990. Specification Problems in Econometrics. In Econometrics: The New Palgrave, ed. John Eatwell, Murray Milgate, and Peter Newman, 238–245. New York: Norton.

Mayo, Deborah G., and Aris Spanos. 2004. Methodology in Practice: Statistical Misspecification Testing. Philosophy of Science 71: 1007–1025.

Spanos, Aris. 1986. Statistical Foundations of Econometric Modelling. Cambridge, U.K.: Cambridge University Press.

Spanos, Aris. 1999. Probability Theory and Statistical Inference: Econometric Modeling with Observational Data. Cambridge, U.K.: Cambridge University Press.

Spanos, Aris. 2006a. Econometrics in Retrospect and Prospect. In Palgrave Handbook of Econometrics, ed. Terence C. Mills and Kerry Patterson. Vol. 1: Econometric Theory, 3–58. London: Macmillan.

Spanos, Aris. 2006b. Revisiting the Omitted Variables Argument: Substantive vs. Statistical Adequacy. Journal of Economic Methodology 13: 179–218.

Theil, H. 1957. Specification Errors and the Estimation of Economic Relationships. Review of the International Statistical Institute 25 (1/3): 41–51.

Aris Spanos