## Seemingly Unrelated Regressions

The seemingly unrelated regressions (SUR) model explains the variation of not just one dependent variable, as in the univariate multiple regression model, but of a set of *m* dependent variables, for example the monthly consumption expenditures of *m* consumers or the annual voting behavior of *m* voters. It does so in terms of the variation of general and specific input (independent) variables and of error terms specific to each individual. Problems of this kind are frequently encountered in many sciences. Indeed, John Geweke has written, “The seemingly unrelated regressions (SUR) model developed in Zellner (1962) is perhaps the most widely used econometric model after linear regressions. The reason is that it provides a simple and useful representation of systems of demand equations that arise in neoclassical static theories of producer and consumer behavior” (2003, p. 162).

A SUR model is a collection of two or more regression relations that can be analyzed with data on the dependent and independent variables. For many years, the individual regression relations were fitted one by one, usually by least squares, and this was justified by an appeal to single-equation optimality properties: least squares estimators are best linear unbiased estimators according to the well-known Gauss-Markov theorem, and they are maximum likelihood estimators when single-equation normal likelihood functions are employed. Also, in traditional multivariate regression models with the same independent variables in each equation and normal error terms with zero means, different variances, and nonzero covariances across equations, it had been shown that applying least squares equation by equation yields fully efficient maximum likelihood estimators of the regression coefficients. What was overlooked in this pre-1962 literature is that when the error terms in the different regression equations are correlated and different independent variables appear in the equations, the regression equations are related, not unrelated as many incorrectly assumed (hence the term “seemingly unrelated”), and the sample information in the other regressions can be employed to improve the precision of estimation of the parameters in any given regression equation under a wide range of conditions. That is, new, operational SUR best linear unbiased estimators for the parameters of a set of, say, *m* regression equations were put forward that uniformly dominate the single-equation least squares estimators under a broad range of conditions.
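The contrast between identical and different independent variables can be checked numerically. The following is a minimal sketch, assuming NumPy is available and, for simplicity, that the cross-equation error covariance matrix is known. The function names `ols_by_equation` and `stacked_gls`, the helper `simulate`, and the simulated data are hypothetical and serve only as an illustration.

```python
import numpy as np


def ols_by_equation(X_list, y_list):
    """Fit each equation separately by least squares and stack the coefficients."""
    return np.concatenate(
        [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(X_list, y_list)]
    )


def stacked_gls(X_list, y_list, sigma):
    """Joint GLS for the stacked system.

    Assumes the m x m error covariance matrix `sigma` is known, so that the
    stacked error vector has covariance kron(sigma, I_n).
    """
    n = y_list[0].shape[0]
    ks = [X.shape[1] for X in X_list]
    # Block-diagonal regressor matrix of the stacked system.
    X = np.zeros((len(X_list) * n, sum(ks)))
    col = 0
    for i, Xi in enumerate(X_list):
        X[i * n:(i + 1) * n, col:col + ks[i]] = Xi
        col += ks[i]
    y = np.concatenate(y_list)
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(n))
    return np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ y)


# Simulated data (illustrative values only): two equations, correlated errors.
rng = np.random.default_rng(0)
n = 60
sigma = np.array([[1.0, 0.8], [0.8, 2.0]])
errors = rng.multivariate_normal(np.zeros(2), sigma, size=n)
X_common = np.column_stack([np.ones(n), rng.normal(size=n)])
X_other = np.column_stack([np.ones(n), rng.normal(size=n)])


def simulate(X1, X2):
    """Build the two dependent variables from fixed illustrative coefficients."""
    y1 = X1 @ np.array([1.0, 2.0]) + errors[:, 0]
    y2 = X2 @ np.array([-1.0, 0.5]) + errors[:, 1]
    return [X1, X2], [y1, y2]


# Case 1: the same independent variables appear in both equations.
Xs, ys = simulate(X_common, X_common)
same_regressors_match = np.allclose(
    stacked_gls(Xs, ys, sigma), ols_by_equation(Xs, ys)
)

# Case 2: different independent variables appear in the two equations.
Xs, ys = simulate(X_common, X_other)
different_regressors_match = np.allclose(
    stacked_gls(Xs, ys, sigma), ols_by_equation(Xs, ys)
)

print(same_regressors_match, different_regressors_match)
```

In this sketch the joint and equation-by-equation estimates should agree up to rounding in the first case, whereas in the second case they generally differ, which is where the information in the other equation comes into play.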

It was shown that these SUR or generalized least squares (GLS) estimators are best linear unbiased, maximum likelihood, and Bayesian estimators under frequently encountered conditions. In addition, joint analysis of the set of regression equations, rather than equation-by-equation analysis, yields more precise estimates and predictions that lead to better solutions to many applied problems. One example is the portfolio formation procedures in the 2003 work by Jose M. Quintana and colleagues, in which dynamic regression equations with time-varying parameters and various input variables were employed to explain the variation of monthly stock prices. By taking account of the fact that the regression equations were related and not unrelated, SUR estimation, prediction, and portfolio formation procedures were utilized to yield improved analyses of the variation of stock prices and to form optimal portfolios with very good rates of return. In Arnold Zellner and Henri Theil’s 1962 work, similar techniques were applied to simultaneous equations models to yield a new joint estimator, the three-stage least squares estimator, which dominates single-equation estimators by taking account of the correlation of error terms across the equations of the system through joint estimation of the coefficients of the structural equations.

The simplest version of a linear, constant parameter SUR system is one that contains $m \geq 2$ linear regression equations, $y_i = X_i \beta_i + u_i$, $i = 1, 2, \ldots, m$, where $y_i$ is an $n \times 1$ vector of observations on the $i$th dependent variable, $X_i$ is an $n \times k_i$ matrix with full column rank of observations on the $k_i$ independent variables in the $i$th regression equation, $\beta_i$ is a $k_i \times 1$ vector of regression parameters, and $u_i$ is an $n \times 1$ vector of zero mean error terms. The usual method of estimating the regression coefficients was to estimate the equations individually by least squares to obtain $\hat{\beta}_i = (X_i' X_i)^{-1} X_i' y_i$, $i = 1, 2, \ldots, m$. However, in Zellner’s 1962 work it was shown that when the error terms are correlated across the equations, the equations are related and joint estimation, rather than equation-by-equation estimation, leads to more precise estimates of the regression coefficients and predictions of future values of the dependent variables. Indeed, as explained in the articles and texts cited in this entry’s bibliography, these joint SUR estimators are generalized best linear unbiased estimators and, with a normality assumption for the error terms, maximum likelihood and “diffuse prior” Bayesian estimators. Further, they reduce to single equation least squares estimators when error terms in the different equations are mutually uncorrelated; that is, the equations are unrelated. In addition, use of SUR techniques leads to improved tests of hypotheses regarding regression coefficients’ and other parameters’ values.
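When the cross-equation error covariance matrix is unknown, one common way to make joint estimation operational is a two-step, feasible GLS procedure: fit each equation by least squares, estimate the error covariance matrix from the residuals, and then apply GLS to the stacked system. The following is a minimal sketch of that idea under stated assumptions, not a reproduction of the procedures in the cited works. It assumes NumPy; the function name `sur_fgls`, the divisor *n* in the covariance estimate, and the simulated data are hypothetical choices made for illustration.

```python
import numpy as np


def sur_fgls(X_list, y_list):
    """Two-step feasible GLS for a SUR system (illustrative sketch).

    Returns the stacked coefficient estimates, their estimated covariance
    matrix, and the estimated m x m error covariance matrix.
    """
    m, n = len(X_list), y_list[0].shape[0]
    # Step 1: equation-by-equation least squares and residuals (n x m).
    b_ols = [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(X_list, y_list)]
    resid = np.column_stack([y - X @ b for X, y, b in zip(X_list, y_list, b_ols)])
    # Step 2: estimate the cross-equation error covariance (divisor n assumed).
    sigma_hat = resid.T @ resid / n
    sigma_inv = np.linalg.inv(sigma_hat)
    # Step 3: GLS on the stacked system, built block by block:
    #   A[i, j] = s^{ij} X_i' X_j,   c[i] = sum_j s^{ij} X_i' y_j,
    # where s^{ij} is the (i, j) element of the inverse of sigma_hat.
    off = np.concatenate([[0], np.cumsum([X.shape[1] for X in X_list])])
    A = np.zeros((off[-1], off[-1]))
    c = np.zeros(off[-1])
    for i in range(m):
        for j in range(m):
            A[off[i]:off[i + 1], off[j]:off[j + 1]] = (
                sigma_inv[i, j] * X_list[i].T @ X_list[j]
            )
            c[off[i]:off[i + 1]] += sigma_inv[i, j] * X_list[i].T @ y_list[j]
    return np.linalg.solve(A, c), np.linalg.inv(A), sigma_hat


# Simulated two-equation system with correlated errors (illustrative values).
rng = np.random.default_rng(1)
n = 500
errors = rng.multivariate_normal(
    np.zeros(2), np.array([[1.0, 0.8], [0.8, 2.0]]), size=n
)
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y1 = X1 @ beta_true[:2] + errors[:, 0]
y2 = X2 @ beta_true[2:] + errors[:, 1]

beta_sur, cov_sur, sigma_hat = sur_fgls([X1, X2], [y1, y2])

# Nominal standard errors: joint estimates versus single-equation least
# squares, both evaluated at the same estimated error covariance matrix.
se_sur = np.sqrt(np.diag(cov_sur))
se_ols = np.concatenate([
    np.sqrt(sigma_hat[i, i] * np.diag(np.linalg.inv(X.T @ X)))
    for i, X in enumerate([X1, X2])
])
print(beta_sur)
print(se_sur, se_ols)
```

If the off-diagonal elements of the estimated covariance matrix were set to zero, the blocks of `A` would decouple and the sketch would return the single-equation least squares estimates, in line with the reduction described above.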

Similarly, taking account of the error terms’ correlations across equations leads to better predictions of future values of the dependent variables, as shown in the 2005 work of Arnold Zellner and Guillermo Israilevich, who use SUR techniques in forecasting U.S. economic sectors’ output growth rates and aggregate output growth rates. The works of Sid Chib and Edward Greenberg (1995), John Geweke (2003), and Peter E. Rossi and colleagues (2005) describe modern Bayesian methods that yield optimal finite sample estimation, testing, and prediction techniques for many variants of the SUR model, for example SUR models with time-varying parameters and autocorrelated error terms. Likewise, when the dependent variables are discrete random variables, as in multivariate logit or probit models with correlated error terms, the SUR joint estimation, testing, and prediction techniques have been found to be useful, as shown in the 1977 work of T. C. Lee and colleagues.

**SEE ALSO** *General Linear Model; Simultaneous Equation Bias*

## BIBLIOGRAPHY

Chib, Sid, and Edward Greenberg. 1995. Hierarchical Analysis of SUR Models with Extensions to Correlated Serial Errors and Time-Varying Parameter Models. *Journal of Econometrics* 68: 339–360.

Geweke, John. 2003. *Contemporary Bayesian Econometrics and Statistics*. Hoboken, NJ: Wiley.

Greene, William H. 2003. *Econometric Analysis*. 5th ed. Upper Saddle River, NJ: Prentice-Hall.

Judge, George G., William E. Griffiths, Robin C. Hill, et al. 1985. *The Theory and Practice of Econometrics*. New York: Wiley.

Lee, T. C., George G. Judge, and Arnold Zellner. 1977. *Estimating the Parameters of the Markov Probability Model from Aggregate Time Series Data*. 2nd ed. Amsterdam: North-Holland.

Meng, Xiao-Li, and Donald B. Rubin. 1996. Efficient Methods for Estimation and Testing with Seemingly Unrelated Regressions in the Presence of Latent Variables and Missing Observations. In *Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner*, ed. Donald A. Berry, Katheryn M. Chaloner, and John Geweke, 215–227. New York: Wiley.

Percy, David F. 1996. Zellner’s Influence on Multivariate Linear Models. In *Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner*, ed. Donald A. Berry, Katheryn M. Chaloner, and John Geweke, 203–213. New York: Wiley.

Quintana, Jose M., Bluford H. Putnam, and David S. Wilford. 1996. *Mutual and Pension Funds Management: Beating the Markets Using a Global Bayesian Investment Strategy*. In 1996 Joint Section on Bayesian Statistical Science, American Statistical Association and International Society for Bayesian Analysis (ISBA). Proceedings volume for papers presented at ISBA Meeting in Istanbul, Turkey, 1995.

Rossi, Peter E., Greg M. Allenby, and Robert McCulloch. 2005. *Bayesian Statistics and Marketing*. Hoboken, NJ: Wiley.

Srivastava, V. K., and David E. A. Giles. 1987. *Seemingly Unrelated Regression Equations Models*. New York: Dekker.

Theil, Henri. 1971. *Principles of Econometrics*. New York: Wiley.

Zellner, Arnold. 1962. An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias. *Journal of the American Statistical Association* 57: 348–368.

Zellner, Arnold. 1963. Estimators for Seemingly Unrelated Regressions: Some Exact Finite Sample Results. *Journal of the American Statistical Association* 58: 977–992; corrigendum, 1972, 67: 255.

Zellner, Arnold, and David S. Huang. 1962. Further Properties of Efficient Estimators for Seemingly Unrelated Regression Equations. *International Economic Review* 3: 300–313.

Zellner, Arnold, and Guillermo Israilevich. 2005. The Marshallian Macroeconomic Model: A Progress Report. *Macroeconomic Dynamics* 9: 220–243 and *International Journal of Forecasting* 21: 627–645.

Zellner, Arnold, and Henri Theil. 1962. Three-Stage Least Squares: Simultaneous Estimation of Simultaneous Equations. *Econometrica* 30: 54–78.

*Arnold Zellner*