# STATISTICAL METHODS

In the 1960s, the introduction, acceptance, and application of multivariate statistical methods transformed quantitative sociological research. Regression methods from biometrics and economics; factor analysis from psychology; stochastic modeling from engineering, biometrics, and statistics; and methods for contingency table analysis from sociology and statistics were developed and combined to provide a rich variety of statistical methods. Along with the introduction of these techniques came the institutionalization of quantitative methods. In 1961, the American Sociological Association (ASA) approved the Section on Methodology as a result of efforts organized by Robert McGinnis and Albert Reiss. The ASA's yearbook, *Sociological Methodology*, first appeared in 1969 under the editorship of Edgar F. Borgatta and George W. Bohrnstedt. Those editors went on to establish the quarterly journal *Sociological Methods and Research* in 1972. During this period, the National Institute of Mental Health began funding training programs that included rigorous training in quantitative methods.

This article traces the development of statistical methods in sociology since 1960. Regression, factor analysis, stochastic modeling, and contingency table analysis are discussed as the core methods that were available or were introduced by the early 1960s. The development of additional methods through the enhancement and combination of these methods is then considered. The discussion emphasizes statistical methods for causal modeling; consequently, methods for data reduction (e.g., cluster analysis, smallest space analysis), formal modeling, and network analysis are not considered.

## THE BROADER CONTEXT

By the end of the 1950s, the central ideas of mathematical statistics that emerged from the work of R. A. Fisher and Karl Pearson were firmly established. Works such as Fisher's *Statistical Methods for Research Workers* (1925), Kendall's *Advanced Theory of Statistics* (1943, 1946), Cramér's *Mathematical Methods of Statistics* (1946), Wilks's *Mathematical Statistics* (1944), Lehmann's *Testing Statistical Hypotheses* (1959), Scheffé's *The Analysis of Variance* (1959), and Doob's *Stochastic Processes* (1953) systematized the key results of mathematical statistics and provided the foundation for developments in applied statistics for decades to come. By the start of the 1960s, multivariate methods were applied routinely in psychology, economics, and the biological sciences. Applied treatments were available in works such as Snedecor's *Statistical Methods* (1937), Wold's *Demand Analysis* (Wold and Juréen 1953), Anderson's *An Introduction to Multivariate Statistical Analysis* (1958), Simon's *Models of Man* (1957), Thurstone's *Multiple-Factor Analysis* (1947), and Finney's *Probit Analysis* (1952).

These methods are computationally intensive, and their routine application depended on developments in computing. BMD (Biomedical Computing Programs) was perhaps the first widely available statistical package, appearing in 1961 (Dixon et al. 1981). SPSS (Statistical Package for the Social Sciences) appeared in 1970 as a result of efforts by a group of political scientists at Stanford to develop a general statistical package specifically for social scientists (Nie et al. 1975). In addition to these general-purpose programs, many specialized programs appeared that were essential for the methods discussed below. At the same time, continuing advances in computer hardware increased the availability of computing by orders of magnitude, facilitating the adoption of new statistical methods.

## DEVELOPMENTS IN SOCIOLOGY

It is within the context of developments in mathematical statistics, sophisticated applications in other fields, and rapid advances in computing that major changes occurred in quantitative sociological research. Four major methods serve as the cornerstones for later developments: regression, factor analysis, stochastic processes, and contingency table analysis.

**Regression Analysis and Structural Equation Models.** Regression analysis is used to estimate the effects of a set of independent variables on one or more dependent variables. It is arguably the most commonly applied statistical method in the social sciences. Before 1960, this method was relatively unknown to sociologists. It was not treated in standard texts and was rarely seen in the leading sociological journals. The key notions of multiple regression were introduced to sociologists in Blalock's *Social Statistics* (1960). The generalization of regression to systems of equations and the accompanying notion of causal analysis began with Blalock's *Causal Inferences in Nonexperimental Research* (1964) and Duncan's "Path Analysis: Sociological Examples" (1966). Blalock's work was heavily influenced by the economist Simon's work on correlation and causality (Simon 1957) and the economist Wold's work on simultaneous equation systems (Wold and Juréen 1953). Duncan's work added the influence of the geneticist Wright's work in path analysis (Wright 1934). The acceptance of these methods by sociologists required a substantive application that demonstrated how regression could contribute to the understanding of fundamental sociological questions. In this case, the question was the determination of occupational standing and the specific work was the substantively and methodologically influential *The American Occupational Structure* by Blau and Duncan (1967), a work unsurpassed in its integration of method and substance. Numerous applications of regression and path analysis soon followed. The diversity of influences, problems, and approaches that resulted from Blalock and Duncan's work is shown in Blalock's reader *Causal Models in the Social Sciences* (1971), which became the handbook of quantitative methods in the 1970s.

Regression models have been extended in many ways. Bielby and Hauser (1977) have reviewed developments involving systems of equations. Regression methods for time series analysis and forecasting (often called Box-Jenkins models) were given their classic treatment in Box and Jenkins's *Time Series Analysis* (1970). Regression diagnostics have provided tools for exploring characteristics of the data set that is to be analyzed. Methods for identifying outlying and influential observations have been developed (Belsley et al. 1980), along with major advances in classic problems such as heteroscedasticity (White 1980) and specification (Hausman 1978). All these extensions have been finding their way into sociological practice.

**Factor Analysis.** Factor analysis, a technique developed by psychometricians, was the second major influence on quantitative sociological methods. Factor analysis is based on the idea that the covariation among a larger set of *observed* variables can be reduced to the covariation among a smaller set of *unobserved* or latent variables. By 1960, this method was well known and applications appeared in most major sociology journals. Statistical and computational advances in applying maximum-likelihood estimation to the factor model (Jöreskog 1969) were essential for the development of the covariance structure model discussed below.
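The idea of reducing observed covariation to latent covariation can be stated compactly. The following is a sketch in notation that is conventional in this literature; the symbols do not appear in the article and are assumptions made for illustration. With observed variables $x$, latent factors $\xi$, loadings $\Lambda$, and unique factors $\delta$ assumed uncorrelated with $\xi$:

$$
x = \Lambda \xi + \delta, \qquad \Sigma = \Lambda \Phi \Lambda' + \Theta_{\delta}
$$

Here $\Sigma$ is the covariance matrix of the observed variables implied by the model, $\Phi$ is the covariance matrix of the factors, and $\Theta_{\delta}$ is the covariance matrix of the unique factors, usually taken to be diagonal. Because $\xi$ has fewer elements than $x$, the many covariances in $\Sigma$ are accounted for by the smaller number of parameters in $\Lambda$, $\Phi$, and $\Theta_{\delta}$.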

**Stochastic Processes.** Stochastic models were the third influence on the development of quantitative sociological methods. Stochastic processes model the change in a variable over time in cases where a chance process governs the change. Examples of stochastic processes include change in occupational status over a career (Blumen et al. 1955), friendship patterns, preference for job locations (Coleman 1964), and the distribution of racial disturbances (Spilerman 1971). While the mathematical and statistical details for many stochastic models had been worked out by 1960, they were relatively unknown to sociologists until the publication of Coleman's *Introduction to Mathematical Sociology* (1964) and Bartholomew's *Stochastic Models for Social Processes* (1967). These books presented an array of models that were customized for specific social phenomena. While these models had great potential, applications were rare because of the great mathematical sophistication of the models and the lack of general-purpose software for estimating the models. Nonetheless, the influence of these methods on the development of other techniques was great. For example, Markov chain models for social mobility had an important influence on the development of loglinear models.
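As a rough illustration of the kind of Markov chain model of mobility mentioned above, the sketch below advances a distribution over three occupational classes through a transition matrix. The matrix `P`, the number of classes, and the function names are hypothetical and invented for this example; they are not taken from any of the cited studies.

```python
# Hypothetical transition matrix for three occupational classes.
# Rows are the class of origin, columns the class of destination;
# each row sums to 1. The values are invented for illustration.
P = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]


def step(dist, P):
    """Advance a distribution over states by one period (dist times P)."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def distribution_after(dist, P, periods):
    """Apply the one-period transition repeatedly."""
    for _ in range(periods):
        dist = step(dist, P)
    return dist


start = [1.0, 0.0, 0.0]            # everyone begins in the first class
after_one = distribution_after(start, P, 1)
after_many = distribution_after(start, P, 200)

print("after one period:  ", [round(p, 3) for p in after_one])
print("after many periods:", [round(p, 3) for p in after_many])
```

Under this simple chain the long-run distribution does not depend on the starting class, which is one reason such models were attractive as a baseline against which observed mobility tables could be compared.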

**Contingency Table Analysis and Loglinear Models.** Methods for categorical data were the fourth influence on quantitative methods. The analysis of contingency tables has a long tradition in sociology. Lazarsfeld's work on elaboration analysis and panel analysis had a major influence on the way research was done at the start of the 1960s (Lazarsfeld and Rosenberg 1955). While these methods provided useful tools for analyzing categorical data and especially survey data, they were nonstatistical in the sense that issues of estimation and hypothesis testing generally were ignored. Important statistical advances for measures of association in two-way tables were made in a series of papers by Goodman and Kruskal that appeared during the 1950s and 1960s (Goodman and Kruskal 1979). In the 1960s, nonstatistical methods for analyzing contingency tables were replaced by the loglinear model. This model made the statistical analysis of multiway tables possible. Early developments are found in papers by Birch (1963) and Goodman (1964). The development of the general model was completed largely through the efforts of Frederick Mosteller, Stephen E. Fienberg, Yvonne M. M. Bishop, Shelby Haberman, and Leo A. Goodman, which were summarized in Bishop et al.'s *Discrete Multivariate Analysis* (1975). Applications in sociology appeared shortly after Goodman's (1972) didactic presentation and the introduction of ECTA (Fay and Goodman 1974), a program for loglinear analysis. Since that time, the model has been extended to specific types of variables (e.g., ordinal), more complex structures (e.g., association models), and particular substantive problems (e.g., networks) (see Agresti [1990] for a treatment of recent developments). As with regression models, many early applications appeared in the area of stratification research. Indeed, many developments in loglinear analysis were motivated by substantive problems encountered in sociology and related fields.
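To make the contrast between descriptive and statistical analysis of tables concrete, the sketch below fits the simplest loglinear model, independence in a two-way table, and computes the likelihood-ratio statistic used to test it. The counts and function names are hypothetical; programs such as ECTA handled multiway tables and models without closed-form fitted values, which this example does not attempt.

```python
import math

# Hypothetical 2 x 3 cross-classification; the counts are invented.
table = [
    [30, 20, 10],
    [10, 20, 30],
]


def independence_fit(table):
    """Fitted counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def g_squared(table, fitted):
    """Likelihood-ratio statistic: 2 * sum of obs * log(obs / fitted)."""
    total = 0.0
    for obs_row, fit_row in zip(table, fitted):
        for obs, fit in zip(obs_row, fit_row):
            if obs > 0:  # empty cells contribute nothing to the sum
                total += obs * math.log(obs / fit)
    return 2.0 * total


fitted = independence_fit(table)
g2 = g_squared(table, fitted)
df = (len(table) - 1) * (len(table[0]) - 1)

print("fitted counts:", fitted)
print("G^2 = %.2f on %d degrees of freedom" % (g2, df))
```

A large value of the statistic relative to its degrees of freedom indicates that the independence model fits poorly, which is the kind of formal test that the earlier elaboration methods did not provide.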

## ADDITIONAL METHODS

From these roots in regression, factor analysis, stochastic processes, and contingency table analysis, a wide variety of methods emerged that are now applied frequently by sociologists. Notions from these four areas were combined and extended to produce new methods. The remainder of this article considers the major methods that resulted.

**Covariance Structure Models.** The covariance structure model is a combination of the factor and regression models. While the factor model allowed imperfect multiple indicators to be used to extract a more accurately measured latent variable, it did not allow the modeling of causal relations among the factors. The regression model, conversely, did not allow imperfect measurement and multiple indicators. The covariance structure model resulted from the merger of the structural or causal component of the regression model with the measurement component of the factor model. With this model, it is possible to specify that each latent variable has one or more imperfectly measured observed indicators and that a causal relationship exists among the latent variables. Applications of such a model became practical after the computational breakthroughs made by Jöreskog, who published LISREL (*li*near *s*tructural *rel*ations) in 1972 (Jöreskog and van Thillo 1972). The importance of this program is reflected in the use of the phrase "LISREL models" to refer to this area.
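The merger described above is often written as three equations. The notation below follows a common convention for LISREL-type models and is an assumption of this sketch, since the article itself introduces no symbols:

$$
\begin{aligned}
\eta &= B\eta + \Gamma\xi + \zeta && \text{(structural component)} \\
y &= \Lambda_y \eta + \varepsilon && \text{(measurement of the dependent latent variables)} \\
x &= \Lambda_x \xi + \delta && \text{(measurement of the independent latent variables)}
\end{aligned}
$$

The first equation carries the causal relations among the latent variables $\eta$ and $\xi$, as in a regression or path model, while the second and third link each latent variable to its imperfectly measured indicators, as in a factor model.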

Initially, the model was based on analyzing the covariances among observed variables, and this gave rise to the name "covariance structure analysis." Extensions of the model since 1973 have made use of additional types of information as the model has been enhanced to deal with multiple groups, noninterval observed variables, and estimation with less restrictive assumptions. These extensions have led to alternative names for these methods, such as "mean and covariance structure models" and, more recently, "structural equation modeling" (see Bollen [1989] and Browne and Arminger [1995] for a discussion of these and other extensions).

**Event History Analysis.** Many sociological problems deal with the occurrence of an event. For example, does a divorce occur? When is one job given up for another? In such problems, the outcome to be explained is the time when the event occurred. While it is possible to analyze such data with regression, that method is flawed in two basic respects. First, event data often are censored. That is, for some members of the sample the event being predicted may not have occurred by the end of the study, and consequently a specific time for the event is missing. Assigning a large number as the censored time, to reflect the fact that the event has not occurred, misrepresents cases in which the event occurred shortly after the end of the study. If one instead assigns a number equal to the time when the data collection ends or excludes those for whom the event has not occurred, the time of the event will be underestimated. Standard regression cannot deal adequately with censoring problems. Second, the regression model generally assumes that the errors in predicting the outcome are normally distributed, which is generally unrealistic for event data. Statistical methods for dealing with these problems began to appear in the 1950s and were introduced to sociologists in substantive papers examining social mobility (Spilerman 1972; Sorensen 1975; Tuma 1976). Applications of these methods were encouraged by the publication in 1976 of Tuma's program RATE for event history analysis (Tuma and Crockford 1976). Since that time, event history analysis has become a major form of analysis and an area in which sociologists have made substantial contributions (see Allison [1995] and Petersen [1995] for reviews of these methods).
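The censoring problem can be seen in a small example. The sketch below compares the two naive treatments described above with the Kaplan-Meier product-limit estimator, a standard technique for right-censored data that the article does not name and that is used here only as an illustration. The durations and all function names are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct event time.

    durations: time to the event, or to the end of observation if censored
    observed:  1 if the event was seen, 0 if the case was censored
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Collect every case that leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical study that ends at time 5: four events are observed and
# four cases are still waiting for the event when data collection stops.
durations = [1, 2, 3, 4, 5, 5, 5, 5]
observed = [1, 1, 1, 1, 0, 0, 0, 0]

# Naive treatment 1: drop the censored cases.
seen = [d for d, o in zip(durations, observed) if o == 1]
mean_dropping_censored = sum(seen) / len(seen)

# Naive treatment 2: treat the end of the study as the event time.
mean_assigning_end = sum(durations) / len(durations)

curve = kaplan_meier(durations, observed)
print("mean, censored cases dropped:    ", mean_dropping_censored)
print("mean, censored cases set to end: ", mean_assigning_end)
print("Kaplan-Meier curve:", [(t, round(s, 3)) for t, s in curve])
```

Both naive means fall below 5 even though half of the sample had not experienced the event by that time, whereas the survival curve keeps the censored cases in the risk set for as long as they were observed.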

**Categorical and Limited Dependent Variables.** If the dependent variable is binary, nominal, ordinal, count, or censored, the usual assumptions of the regression model are violated and estimates are biased. Some of these cases can be handled by the methods discussed above. Event history analysis deals with certain types of censored variables; loglinear analysis deals with binary, nominal, count, and ordinal variables when the independent variables are all nominal. Many other cases exist that require additional methods. These methods are called quantal response models or models for categorical, limited, or qualitative dependent variables. Since the types of dependent variables analyzed by these methods occur frequently in the social sciences, they have received a great deal of attention by econometricians and sociologists (see Maddala [1983] and Long [1997] for reviews of these models and Cameron and Trivedi [1998] on count models).

Perhaps the simplest of these methods is logit analysis, in which the dependent variable is binary or nominal with a combination of interval and nominal independent variables. Logit analysis was introduced to sociologists by Theil (1970). Probit analysis is a related technique that is based on slightly different assumptions. McKelvey and Zavoina (1975) extended the logit and probit models to ordinal outcomes. A particularly important type of limited dependent variable occurs when the sample is selected nonrandomly. For example, in panel studies, cases that do not respond to each wave may be dropped from the analysis. If those who do not respond to each wave differ nonrandomly from those who do respond (e.g., those who are lost because of moving may differ from those who do not move), the resulting sample is not representative. To use an example from a review article by Berk (1983), in cases of domestic violence, police may write a report only if the violence exceeds some minimum level, and the resulting sample is biased to exclude cases with lower levels of violence. Regression estimates based on this sample will be biased. Heckman's (1979) influential paper stimulated the development of sample selection models, which were introduced to sociologists by Berk (1983). These and many other models for limited dependent variables are extremely well suited to sociological problems. With the increasing availability of software for these models, their use is becoming more common than even that of the standard regression model.
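For a binary outcome, the "slightly different assumptions" separating the logit and probit models mentioned above concern the function that links the predictors to the probability of the outcome. In standard notation, which is an assumption of this sketch rather than the article's own:

$$
\Pr(y = 1 \mid x) = \frac{\exp(x'\beta)}{1 + \exp(x'\beta)} \quad \text{(logit)}, \qquad
\Pr(y = 1 \mid x) = \Phi(x'\beta) \quad \text{(probit)}
$$

where $\Phi$ is the standard normal cumulative distribution function. Both functions keep predicted probabilities between 0 and 1, which a linear regression of a binary outcome does not guarantee.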

**Latent Structure Analysis.** The objective of latent structure analysis is the same as that of factor analysis: to explain covariation among a larger number of observed variables in terms of a smaller number of latent variables. The difference is that factor analysis applies to interval-level observed and latent variables, whereas latent structure analysis applies to observed data that are noninterval. As part of the American soldier study, Paul F. Lazarsfeld, Sam Stouffer, Louis Guttman, and others developed techniques for "factor analyzing" nominal data. While many methods were developed, latent structure analysis has emerged as the most popular. Lazarsfeld coined the term "latent structure analysis" to refer to techniques for extracting latent variables from observed variables obtained from survey research. The specific techniques depend on the characteristics of the observed and latent variables. If both are continuous, the method is called factor analysis, as was discussed above. If both are discrete, the method is called latent class analysis. If the factors are continuous but the observed data are discrete, the method is termed latent trait analysis. If the factors are discrete but the data are continuous, the method is termed latent profile analysis. The classic presentation of these methods is found in Lazarsfeld and Henry's *Latent Structure Analysis* (1968). Although these developments were important and their methodological concerns were clearly sociological, these ideas had few applications during the next twenty years. While the programs ECTA, RATE, and LISREL stimulated applications of the loglinear, event history, and covariance structure models, respectively, the lack of software for latent structure analysis inhibited its use. This changed with Goodman's (1974) algorithms for estimation and Clogg's (1977) program MLLSA for estimating the models. Substantive applications began appearing in the 1980s, and the entire area of latent structure analysis has become a major focus of statistical work.
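For the case in which both the observed and the latent variables are discrete, the latent class model can be sketched as follows. The notation is conventional and is assumed here for illustration. With $J$ observed categorical items and a latent variable with $C$ classes:

$$
\Pr(y_1, \ldots, y_J) = \sum_{c=1}^{C} \pi_c \prod_{j=1}^{J} \Pr(y_j \mid c)
$$

where $\pi_c$ is the proportion of the population in class $c$. The product reflects the assumption of local independence: within a class the items are unrelated, so all of the observed association among items is attributed to class membership.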

**Multilevel and Panel Models.** In most of the models discussed here, observations are assumed to be independent. This assumption can be violated for many reasons. For example, in panel data, the same individual is measured at multiple time points, and in studies of schools, all the children in each classroom may be included in the sample. Observations in a single classroom or for the same person over time tend to be more similar than are independent observations. The problems caused by the lack of independence are addressed by a variety of related methods that gained rapid acceptance beginning in the 1980s, when practical issues of estimation were solved. When the focus is on clustering within social groups (such as schools), the methods are known variously as hierarchical models, random coefficient models, and multilevel methods. When the focus is on clustering within panel data, the methods are referred to as models for cross-section and time series data, or simply panel analysis. The terms "fixed and random effects models" and "covariance component models" also are used. (See Hsiao [1995] for a review of panel models for continuous outcomes and Hamerle and Ronning [1995] for panel models for categorical outcomes. Bryk and Raudenbush [1992] review hierarchical linear models.)
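One simple member of this family, the random intercept model, shows how the similarity of observations within a cluster is represented. The notation is assumed for illustration and is not the article's. For observation $i$ in cluster $j$ (a child in a classroom, or a time point for a person):

$$
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
$$

Because every observation in cluster $j$ shares the same $u_j$, two observations from the same cluster have correlation $\rho = \sigma_u^2 / (\sigma_u^2 + \sigma_e^2)$ after the predictors are taken into account. Ordinary regression in effect assumes $\rho = 0$.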

**Computer-Intensive Methods.** The availability of cheap computing has led to the rapid development and application of computer-intensive methods that will change the way data are analyzed over the next decade. Methods of resampling, such as the bootstrap and the jackknife, allow practical solutions to previously intractable problems of statistical inference (Efron and Tibshirani 1993). This is done by recomputing a test statistic perhaps 1,000 times, using artificially constructed data sets. Computational algorithms for Bayesian analysis replace difficult or impossible algebraic derivations with computer-intensive simulation based on Markov chain Monte Carlo methods, such as the Gibbs sampler and the Metropolis algorithm (Gelman et al. 1995). Related developments have occurred in the treatment of missing data, with applications of the EM algorithm and Markov chain Monte Carlo techniques (Schafer 1997).
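The resampling idea can be sketched in a few lines. The example below estimates a bootstrap standard error for a sample median, a statistic with no simple standard error formula, by recomputing it on 1,000 data sets drawn with replacement from the observed sample. The data, the function name `bootstrap_se`, and the fixed seed are hypothetical choices made for illustration.

```python
import random
import statistics


def bootstrap_se(sample, statistic, replicates=1000, seed=0):
    """Bootstrap standard error of `statistic` for `sample`.

    Each replicate draws len(sample) observations with replacement
    from the observed data and recomputes the statistic.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = []
    for _ in range(replicates):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates), estimates


# Hypothetical observed sample.
sample = [12, 15, 9, 22, 17, 30, 11, 14, 19, 25]

se_median, estimates = bootstrap_se(sample, statistics.median)
print("sample median:", statistics.median(sample))
print("bootstrap standard error of the median: %.2f" % se_median)
```

The spread of the recomputed medians stands in for the sampling variability that would otherwise have to be derived analytically.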

**Other Developments.** The methods discussed above represent the major developments in statistical methods in sociology since the 1960s. With the rapid development of mathematical statistics and advances in computing, new methods have continued to appear. Major advances have been made in the treatment of missing data (Little and Rubin 1987). Developments in statistical graphics (Cleveland 1985) are reflected in the increasing number of graphics appearing in sociological journals. Methods that require less restrictive distributional assumptions and are less sensitive to errors in the data being analyzed are now computationally feasible. Robust methods have been developed that are insensitive to small departures from the underlying assumptions (Rousseeuw and Leroy 1987). Resampling methods (e.g., bootstrap methods) let the observed data assume the role of the underlying population, which allows estimation of standard errors and confidence intervals when the underlying distributional assumptions (e.g., normality) are unrealistic or the formulas for computing standard errors are intractable (Stine 1990). Recent work by Muthén (1998) and others combines the structural component of the regression model, latent variables from factor and latent structure models, hierarchical modeling, and characteristics of limited variables into a single model. The development of Mplus (Muthén and Muthén 1998) makes routine application of this general model feasible.

## CONCLUSIONS

The introduction of structural equation models in the 1960s changed the way sociologists viewed data and viewed the social world. Statistical developments in areas such as econometrics, biometrics, and psychometrics were imported directly into sociology. At the same time, other methods were developed by sociologists to deal with substantive problems of concern to sociology. Necessary conditions for these changes were the steady decline in the cost of computing, the development of efficient numerical algorithms, and the availability of specialized software. Without developments in computing, these methods would be of little use to substantive researchers. As the power of desktop computers grows and the ease and flexibility of statistical packages increase, the application of sophisticated statistical methods has become more accessible to the average researcher than the card sorter was for constructing contingency tables in the 1950s and 1960s. As computing power continues to develop, new and promising methods are appearing with each issue of the journals in this area.

Acceptance of these methods has not been universal or without costs. Critiques of the application of quantitative methods have been written by both sympathetic (Lieberson 1985; Duncan 1984) and unsympathetic (Coser 1975) sociologists as well as statisticians (Freedman 1987) and econometricians (Leamer 1983). While these critiques have made practitioners rethink their approaches, the developments in quantitative methods that took shape in the 1960s will continue to influence sociological practice for decades to come.

## REFERENCES

Agresti, Alan 1990 *Categorical Data Analysis*. New York: Wiley.

Allison, Paul D. 1995 *Survival Analysis Using the SAS® System: A Practical Guide*. Cary, NC: SAS Institute.

Anderson, T. W. 1958 *An Introduction to Multivariate Statistical Analysis*. New York: Wiley.

Bartholomew, D. J. 1967 *Stochastic Models for Social Processes*. New York: Wiley.

Belsley, David A., Edwin Kuh, and Roy E. Welsch 1980 *Regression Diagnostics: Identifying Influential Data and Sources of Collinearity*. New York: Wiley.

Berk, R. A. 1983 "An Introduction to Sample Selection Bias in Sociological Data." *American Sociological Review* 48:386–398.

Bielby, William T., and Robert M. Hauser 1977 "Structural Equation Models." *Annual Review of Sociology* 3:137–161.

Birch, M. W. 1963 "Maximum Likelihood in Three-Way Contingency Tables." *Journal of the Royal Statistical Society Series B* 25:220–233.

Bishop, Y. M. M., S. E. Fienberg, and P. W. Holland 1975 *Discrete Multivariate Analysis: Theory and Practice*. Cambridge, Mass.: MIT Press.

Blalock, Hubert M., Jr. 1960 *Social Statistics*. New York: McGraw-Hill.

—— 1964 *Causal Inferences in Nonexperimental Research*. Chapel Hill: University of North Carolina Press.

——, ed. 1971 *Causal Models in the Social Sciences*. Chicago: Aldine.

Blau, Peter M., and Otis Dudley Duncan 1967 *The American Occupational Structure*. New York: Wiley.

Blumen, I., M. Kogan, and P. J. McCarthy 1955 *Industrial Mobility of Labor as a Probability Process*. Cornell Studies of Industrial and Labor Relations, vol. 6. Ithaca, N.Y.: Cornell University Press.

Bollen, Kenneth A. 1989 *Structural Equations with Latent Variables*. New York: Wiley.

Borgatta, Edgar F., and George W. Bohrnstedt, eds. 1969 *Sociological Methodology*. San Francisco: Jossey-Bass.

——, eds. 1972 *Sociological Methods and Research*. Beverly Hills, Calif.: Sage.

Box, George E. P., and Gwilym M. Jenkins 1970 *Time Series Analysis*. San Francisco: Holden-Day.

Browne, Michael W., and Gerhard Arminger 1995 "Specification and Estimation of Mean- and Covariance-Structure Models." In Gerhard Arminger, Clifford C. Clogg, and Michael E. Sobel, eds., *Handbook of Statistical Modeling for the Social and Behavioral Sciences*. New York: Plenum.

Bryk, Anthony S., and Stephen W. Raudenbush 1992 *Hierarchical Linear Models: Applications and Data Analysis Methods*. Newbury Park, Calif.: Sage.

Cameron, A. Colin, and Pravin K. Trivedi 1998 *Regression Analysis of Count Data*. New York: Cambridge University Press.

Cleveland, William S. 1985 *The Elements of Graphing Data*. Monterey, Calif.: Wadsworth.

Clogg, Clifford C. 1977 *MLLSA: Maximum Likelihood Latent Structure Analysis*. State College: Pennsylvania State University.

Coleman, James S. 1964 *Introduction to Mathematical Sociology*. Glencoe, Ill.: Free Press.

Coser, Lewis A. 1975 "Presidential Address: Two Methods in Search of Substance." *American Sociological Review* 40:691–700.

Cramér, Harald 1946 *Mathematical Methods of Statistics*. Princeton, N.J.: Princeton University Press.

Dixon, W. J., chief ed. 1981 *BMD Statistical Software*. Berkeley: University of California Press.

Doob, J. L. 1953 *Stochastic Processes*. New York: Wiley.

Duncan, Otis Dudley 1966 "Path Analysis: Sociological Examples." *American Journal of Sociology* 72:1–16.

—— 1984 *Notes on Social Measurement: Historical and Critical*. New York: Russell Sage Foundation.

Efron, Bradley, and Robert J. Tibshirani 1993 *An Introduction to the Bootstrap*. New York: Chapman and Hall.

Fay, Robert, and Leo A. Goodman 1974 *ECTA: Everyman's Contingency Table Analysis*.

Finney, D. J. 1952 *Probit Analysis, 2nd ed.* Cambridge, UK: Cambridge University Press.

Fisher, R. A. 1925 *Statistical Methods for Research Workers*. Edinburgh: Oliver and Boyd.

Freedman, David A. 1987 "As Others See Us: A Case Study in Path Analysis." *Journal of Educational Statistics* 12:101–128.

Gelman, Andrew, John B. Carlin, Hal S. Stern, and Donald B. Rubin 1995 *Bayesian Data Analysis*. New York: Chapman and Hall.

Goodman, Leo A. 1964 "Simple Methods of Analyzing Three-Factor Interaction in Contingency Tables." *Journal of the American Statistical Association* 59:319–352.

—— 1972 "A Modified Multiple Regression Approach to the Analysis of Dichotomous Variables." *American Sociological Review* 37:28–46.

—— 1974 "The Analysis of Systems of Qualitative Variables When Some of the Variables Are Unobservable. Part I: A Modified Latent Structure Approach." *American Journal of Sociology* 79:1179–1259.

——, and William H. Kruskal 1979 *Measures of Association for Cross Classification*. New York: Springer-Verlag.

Hamerle, Alfred, and Gerd Ronning 1995 "Panel Analysis for Qualitative Variables." In Gerhard Arminger, Clifford C. Clogg, and Michael E. Sobel, eds., *Handbook of Statistical Modeling for the Social and Behavioral Sciences*. New York: Plenum.

Hausman, J. A. 1978 "Specification Tests in Econometrics." *Econometrica* 46:1251–1272.

Heckman, James J. 1979 "Sample Selection Bias as a Specification Error." *Econometrica* 47:153–161.

Hsiao, Cheng 1995 "Panel Analysis for Metric Data." In Gerhard Arminger, Clifford C. Clogg, and Michael E. Sobel, eds., *Handbook of Statistical Modeling for the Social and Behavioral Sciences*. New York: Plenum.

Jöreskog, Karl G. 1969 "A General Approach to Confirmatory Maximum Likelihood Factor Analysis." *Psychometrika* 34:183–202.

——, and Marielle van Thillo 1972 *LISREL: A General Computer Program for Estimating a Linear Structural Equation System Involving Multiple Indicators of Unmeasured Variables*. Princeton, N.J.: Educational Testing Service.

Kendall, Maurice G. 1943 *Advanced Theory of Statistics*, vol. 1. London: Griffin.

—— 1946 *Advanced Theory of Statistics*, vol. 2. London: Griffin.

Lazarsfeld, Paul F., and Neil W. Henry 1968 *Latent Structure Analysis*. New York: Houghton Mifflin.

——, and Morris Rosenberg, eds. 1955 *The Language of Social Research*. New York: Free Press.

Leamer, Edward E. 1983 "Let's Take the Con Out of Econometrics." *American Economic Review* 73:31–43.

Lehmann, E. L. 1959 *Testing Statistical Hypotheses*. New York: Wiley.

Lieberson, Stanley 1985 *Making It Count: The Improvement of Social Research and Theory*. Berkeley: University of California Press.

Little, Roderick J. A., and Donald B. Rubin 1987 *Statistical Analysis with Missing Data*. New York: Wiley.

Long, J. Scott 1997 *Regression Models for Categorical and Limited Dependent Variables*. Newbury Park, Calif.: Sage.

Maddala, G. S. 1983 *Limited-Dependent and Qualitative Variables in Econometrics*. Cambridge, UK: Cambridge University Press.

McKelvey, Richard D., and William Zavoina 1975 "A Statistical Model for the Analysis of Ordinal Level Dependent Variables." *Journal of Mathematical Sociology* 4:103–120.

Muthén, Bengt O. 1998 "Second-Generation Structural Equation Modeling with a Combination of Categorical and Continuous Latent Variables: New Opportunities for Latent Class/Latent Growth Modeling." In A. Sayer and L. Collins, eds., *New Methods for the Analysis of Change*. Washington, D.C.: APA.

Muthén, Linda K., and Bengt O. Muthén 1998 *Mplus: The Comprehensive Modeling Program for Applied Researchers*. Los Angeles: Muthén & Muthén.

Nie, Norman H., C. Hadlai Hull, Jean G. Jenkins, Karin Steinbrenner, and Dale H. Bent 1975 *Statistical Package for the Social Sciences, 2nd ed*. New York: McGraw-Hill.

Petersen, Trond 1995 "Analysis of Event Histories." In Gerhard Arminger, Clifford C. Clogg, and Michael E. Sobel, eds., *Handbook of Statistical Modeling for the Social and Behavioral Sciences*. New York: Plenum.

Rousseeuw, Peter J., and Annick M. Leroy 1987 *Robust Regression and Outlier Detection*. New York: Wiley.

Schafer, J. L. 1997 *Analysis of Incomplete Multivariate Data*. New York: Chapman and Hall.

Scheffé, H. 1959 *The Analysis of Variance*. New York: Wiley.

Simon, Herbert 1957 *Models of Man*. New York: Wiley.

Snedecor, George W. 1937 *Statistical Methods*. Ames: Iowa State University Press.

Sorensen, Aage 1975 "The Structure of Intragenerational Mobility." *American Sociological Review* 40:456–471.

Spilerman, Seymour 1971 "The Causes of Racial Disturbances: Tests of an Explanation." *American Sociological Review* 36:427–442.

—— 1972 "The Analysis of Mobility Processes by the Introduction of Independent Variables Into a Markov Chain." *American Sociological Review* 37:277–294.

Stine, Robert 1990 "An Introduction to Bootstrap Methods." In John Fox and J. Scott Long, eds., *Modern Methods of Data Analysis*. Newbury Park, Calif.: Sage.

Theil, H. 1970 "On the Estimation of Relationships Involving Qualitative Variables." *American Journal of Sociology* 76:103–154.

Thurstone, L. L. 1947 *Multiple-Factor Analysis*. Chicago: University of Chicago Press.

Tuma, Nancy B. 1976 "Rewards, Resources, and the Rate of Mobility." *American Sociological Review* 41:338–360.

——, and D. Crockford 1976 *Invoking RATE*. Center for the Study of Welfare Policy. Menlo Park, Calif.: Stanford Research Institute.

White, Halbert 1980 "A Heteroskedasticity-Consistent Covariance Matrix and a Direct Test for Heteroskedasticity." *Econometrica* 48:817–838.

Wilks, S. S. 1944 *Mathematical Statistics*. Princeton, N.J.: Princeton University Press.

Wold, Herman, and Lars Juréen 1953 *Demand Analysis*. New York: Wiley.

Wright, Sewall 1934 "The Method of Path Coefficients." *Annals of Mathematical Statistics* 5:161–215.

J. Scott Long