Residuals

To define the notion of residuals, let us introduce a linear model describing the relationship between K independent variables xj, with j = 1, 2, …, K, and a dependent variable y:

y = β0 + β1x1 + β2x2 + … + βKxK + u

with u being the stochastic error, representing all factors affecting y that are not included in the xj's.

Assuming E(u|x) = E(u) = 0, the population regression function is:

E(y|x) = β0 + β1x1 + β2x2 + … + βKxK

This is estimated by the sample regression function:

ŷ = β̂0 + β̂1x1 + β̂2x2 + … + β̂KxK

where β̂0, β̂1, …, β̂K are the estimated parameters obtained from some estimation rule β̂.

Assuming a sample of N observations, the residual for observation i, with i = 1, 2, …, N, is the difference between the actual value yi and the fitted value from the estimated regression, that is:

ûi = yi - ŷi = yi - β̂0 - β̂1xi1 - β̂2xi2 - … - β̂KxiK
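The definition above can be sketched numerically. The following is a minimal NumPy example; the data-generating process, coefficient values, and seed are invented for illustration:

```python
import numpy as np

# Hypothetical sample: y generated from two regressors plus noise
rng = np.random.default_rng(0)
N, K = 100, 2
x = rng.normal(size=(N, K))
u = rng.normal(size=N)                       # unobservable error u
y = 1.0 + 2.0 * x[:, 0] - 0.5 * x[:, 1] + u

# Design matrix with a leading column of ones for the intercept β̂0
X = np.column_stack([np.ones(N), x])

# OLS estimates: β̂ minimizes the sum of squared residuals
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_fitted = X @ beta_hat      # fitted values ŷ_i
residuals = y - y_fitted     # residuals û_i = y_i - ŷ_i
```

A positive entry of `residuals` means the estimated regression underpredicts that observation, and each entry serves as an estimate of the corresponding unobservable error.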

If the residual is positive, the estimated regression underpredicts yi; if it is negative, the regression overpredicts yi. Each residual ûi can be interpreted as an estimate of the unobservable error ui and, as such, can be employed in constructing a number of statistical tests assessing the properties of the estimated model, as well as a set of indicators of goodness of fit.

Ideally, a good model is characterized by small residuals, that is, by fitted values ŷi that are close to the actual values yi. The ordinary least squares (OLS) estimator of the parameters βj is calculated by fitting the regression line that best approximates the data through the minimization of the sum of squared residuals Σi ûi². The resulting first-order conditions for the OLS estimator are Σi ûi = 0 and Σi xij ûi = 0, with j = 1, 2, …, K.
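The first-order conditions of the least-squares minimization can be verified numerically: with an intercept included, the residuals sum to zero and are orthogonal to each regressor. A sketch with invented data:

```python
import numpy as np

# Hypothetical data; X includes a constant column for the intercept
rng = np.random.default_rng(1)
N = 200
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=N)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat

# First-order conditions of min Σ û_i²:
#   Σ_i û_i = 0        (from the intercept)
#   Σ_i x_ij û_i = 0   for each regressor j
intercept_condition = u_hat.sum()        # ≈ 0
slope_conditions = X[:, 1:].T @ u_hat    # ≈ [0, 0]
```

These orthogonality conditions are what make the residuals usable as stand-ins for the unobservable errors in later diagnostics.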

As regards the goodness of fit, define the total sum of squares SST = Σi (yi - ȳ)², the explained sum of squares SSE = Σi (ŷi - ȳ)², and the residual sum of squares SSR = Σi ûi². The part of the variation in y that is explained by the regressors is then given by the coefficient R² = 1 - SSR/SST. R² lies between 0 and 1, with the fit of the regression improving as R² approaches 1.

A consistent estimator of the variance of the stochastic error is given by σ̂² = SSR/(N - K - 1). Its square root σ̂ is then an estimator of the standard deviation of the unobservable factors affecting the dependent variable y. It indicates how well the model predicts y given the information set represented by the observables xj. The estimate σ̂ is used to construct the standard error of the OLS estimator, equal to se(β̂j) = σ̂/[SSTj(1 - Rj²)]½, where SSTj is the total sum of squares of xij and Rj² is the R² of the regression of xj on all other regressors. The standard error se(β̂j) is then employed to test the statistical significance of each estimated parameter through the t-statistic.
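These goodness-of-fit quantities and standard errors can be computed directly from the residuals. A sketch, again with an invented data-generating process (the standard errors are obtained here via the equivalent diagonal of σ̂²(X'X)⁻¹):

```python
import numpy as np

# Hypothetical data set with K = 2 regressors plus an intercept
rng = np.random.default_rng(2)
N, K = 150, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, K))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=N)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat

SST = np.sum((y - y.mean()) ** 2)   # total sum of squares
SSR = np.sum(u_hat ** 2)            # residual sum of squares
R2 = 1 - SSR / SST                  # coefficient of determination

sigma2_hat = SSR / (N - K - 1)      # σ̂², degrees-of-freedom adjusted

# Standard errors from the diagonal of σ̂² (X'X)⁻¹
cov = sigma2_hat * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov))

t_stats = beta_hat / se             # t-statistics for significance
```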

If the errors were homoskedastic, we would have Var(u|x1, x2, …, xK) = E(u²|x1, x2, …, xK) = E(u²) = σ². In the presence of heteroskedasticity, by contrast, the expected value of u² can be assumed to depend on some function of the explanatory variables. Following this logic, Breusch and Pagan (1979) suggest a test for heteroskedasticity that consists of an F-test on the regression of the squared residuals on the explanatory variables. The White (1980) test for heteroskedasticity simply consists of a similar regression that includes nonlinearities, that is, the squares and cross-products of the explanatory variables.
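The Breusch-Pagan regression can be sketched as follows; the heteroskedastic data-generating process is invented for illustration, and the F-statistic would be compared with F(K, N - K - 1) critical values:

```python
import numpy as np

# Hypothetical data with heteroskedastic errors: Var(u|x) grows with x1
rng = np.random.default_rng(3)
N, K = 300, 2
x = rng.normal(size=(N, K))
u = rng.normal(size=N) * np.exp(0.5 * x[:, 0])
y = 1.0 + 2.0 * x[:, 0] - 0.5 * x[:, 1] + u

X = np.column_stack([np.ones(N), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat_sq = (y - X @ beta_hat) ** 2          # squared residuals û_i²

# Breusch-Pagan auxiliary regression: û² on the explanatory variables
gamma_hat, *_ = np.linalg.lstsq(X, u_hat_sq, rcond=None)
e = u_hat_sq - X @ gamma_hat
R2_aux = 1 - np.sum(e ** 2) / np.sum((u_hat_sq - u_hat_sq.mean()) ** 2)

# F-statistic of the auxiliary regression; large values reject
# homoskedasticity
F = (R2_aux / K) / ((1 - R2_aux) / (N - K - 1))
```

The White test follows the same pattern with the squares and cross-products of the regressors appended to the auxiliary design matrix.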

White (1980) also suggests using the residuals ûi to calculate a heteroskedasticity-robust standard error for β̂j. This is given by se(β̂j) = [Σi r̂ij² ûi²]½/SSRj, where r̂ij is the i-th residual of the regression of xj on all other regressors and SSRj is the residual sum of squares of this regression.
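This partialling-out formula can be sketched directly for a single coefficient. The data below are invented; the robust standard error for the coefficient on x1 is built from the residuals of regressing x1 on the remaining regressors:

```python
import numpy as np

# Hypothetical heteroskedastic sample
rng = np.random.default_rng(4)
N = 300
x1 = rng.normal(size=N)
x2 = rng.normal(size=N)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=N) * np.exp(0.5 * x1)

X = np.column_stack([np.ones(N), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat

# Partial x1 out of the other regressors (constant and x2)
# to obtain the residuals r̂_i1 and their sum of squares SSR_1
Z = np.column_stack([np.ones(N), x2])
delta_hat, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r_hat = x1 - Z @ delta_hat
SSR_1 = np.sum(r_hat ** 2)

# Robust se(β̂_1) = sqrt( Σ_i r̂_i1² û_i² ) / SSR_1
robust_se = np.sqrt(np.sum(r_hat ** 2 * u_hat ** 2)) / SSR_1
```

Algebraically this coincides with the corresponding diagonal element of the usual sandwich estimator (X'X)⁻¹ X'diag(û²)X (X'X)⁻¹.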

SEE ALSO Hausman Tests; Least Squares, Ordinary; Least Squares, Three-Stage; Least Squares, Two-Stage; Ordinary Least Squares Regression; Regression; Test Statistics; White Noise

BIBLIOGRAPHY

Breusch, T. S., and A. R. Pagan. 1979. A Simple Test for Heteroscedasticity and Random Coefficient Variation. Econometrica 47: 1287–1294.

Greene, William H. 1997. Econometric Analysis. 3rd ed. Upper Saddle River, NJ: Prentice Hall.

White, H. 1980. A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica 48: 817–838.

Wooldridge, J. M. 2003. Introductory Econometrics: A Modern Approach. 2nd ed. Cincinnati, OH: South-Western.

Luca Nunziata