## Survey Analysis

I. Methods of Survey Analysis *Hanan C. Selvin*

II. The Analysis of Attribute Data *Paul F. Lazarsfeld*

III. Applications in Economics *James N. Morgan*

## I. METHODS OF SURVEY ANALYSIS

The ways in which causal inferences are drawn from quantitative data depend on the design of the study that produced the data. In experimental studies the investigator, by using one or another kind of experimental control, can remove the effects of the major extraneous causal factors on the dependent variable. The remaining extraneous causal factors can be turned into a chance variable if subjects are assigned randomly to the experimental “treatments.” In principle, then, there should be only two sources of variation in the dependent variable: (1) the effects of the independent variables being studied; (2) the effects of the random assignment and of other random phenomena, especially measurement error. By using the procedures of statistical inference, it is possible to arrive at relatively clear statements about the effects of the independent variables. But in survey research (or “observational research,” as it is usually called by statisticians) neither experimental control nor random assignment is available to any significant degree. The task of survey analysis is therefore to manipulate such observational data after they have been gathered, in order to separate the effects of the independent variables from the effects of the extraneous causal factors associated with them.

In the survey the association of the independent and extraneous variables occurs naturally; in the field experiment, or quasi-experimental design, the extraneous variables usually result from the experimenter’s deliberate introduction of a stimulus or his modification of some condition, both of which result in a set of problems different from those considered here *[see* Experimental design, article on quasi-experimental design].

Among the classics of survey analysis are Émile Durkheim’s attempt to explain variations in suicide rates by differences in social structure (1897); the studies of soldiers’ attitudes conducted by the Research Branch of the U.S. Army during World War II and reanalyzed afterward in *The American Soldier* (Stouffer et al. 1949); and the series of voting studies that began with *The People’s Choice* (Lazarsfeld et al. 1944).

As these examples suggest, survey analysis differs from other nonexperimental procedures for analyzing and presenting quantitative data, notably from probability sampling procedures and demographic analysis. In contrast with the statistical analysis of sample surveys, survey analysis often deals with total populations; even when the data of survey analysis come from a probability sample, the conventional statistical problems of estimating parameters and testing hypotheses are secondary concerns (Tukey 1962). And although survey analysis has historical roots that go back to the earliest work in demography, it differs from demography in the source of its data and, therefore, in the operations it performs on these data. Until recently, demographic analysis had largely relied on reworking the published tables of censuses and registers of vital statistics, while survey analysts usually constructed their own tables from individual questionnaires or interviews. Although these differences are still important, survey analysts have begun to use some demographic techniques, and demographers have resorted to survey analysis of specially gathered interview data in such areas as labor mobility and family planning. Perhaps the most striking evidence of the convergence of these two lines of inquiry is in the widespread use of the one-in-a-thousand and one-in-ten-thousand samples of the 1960 *U.S. Census of Population.* These samples allow the analyst of census data to prepare whatever tables he may wish. As other national censuses make their data available in this form, demographic analysis will more closely resemble survey analysis.

The causal emphasis of survey analysis also serves to distinguish it from more narrowly descriptive procedures. It differs from the “social survey,” which, at least in Great Britain, has usually been a statistical account of urban life, especially among the poor. And although it shares with census reports, market research, and opinion polling a reliance on tabular presentation, survey analysis is unlike these fields in seeking to link its data to some body of theory. The theory may be as simple as the proposition that a set of communications has changed certain attitudes *[see* Communication, Mass, article on effects]. Or it may involve an explicit structure of variables, as in analyses of the reasons that people give for having done one thing rather than another *[see* Reason analysis].

Survey analysis has a role to play in the construction of formal theories, whether they use explicit mathematical relations or are only implicitly mathematical, as in computer simulation. Some mathematical models are indeed useful in the analysis of survey data (Coleman 1964). But survey analysis as it is defined here usually limits itself to identifying variables important enough to be included in the formal theories.

### The background of survey analysis

Two basic elements in survey analysis are the use of rates as dependent variables and the explanation of differences in rates by means of their statistical associations with other social phenomena. Both of these features first appeared in Graunt’s *Natural and Political Observations Made Upon the Bills of Mortality* (1662), which included the first data on urban and rural death rates. This one small book thus makes Graunt a major figure in the history not only of survey analysis but also of statistics and demography. With the exception of the life table, which Graunt invented but which was improved significantly a generation later by the astronomer Edmund Halley, Graunt’s methods set the pattern for statistical analysis until the middle of the nineteenth century *[see* Vital statistics].

Although Graunt had already noted the approximate constancy of certain rates over time (e.g., the excess of male births over female), the German pastor Johann Peter Süssmilch was the first to notice that suicide, a voluntary act, also showed the same constancy. He thus initiated the field of “moral statistics,” which was to make much of nineteenth-century statistics, especially in France, resemble modern sociology. But the major figure in the development of moral statistics was the Belgian astronomer Adolphe Quetelet, who made three important contributions to survey analysis: (1) he used multivariate tables to explore the relations between the rates of crime or marriage and such demographic factors as age and sex; (2) he applied the calculus of probability to explaining the constancy of social rates over time; and (3) he helped to establish organized bodies of statisticians, including the Statistical Society of London (later the Royal Statistical Society), and he organized several international statistical congresses.

Whether or not Quetelet was right in trying to explain the stability of rates by drawing on the theory of errors of observation is still a subject of controversy (Hogben 1957; Lazarsfeld 1961). There can be no question, however, that the organization of the statistical societies in England during the 1830s and 1840s was followed by increased application of statistical data to social problems, notably in the statistical demonstrations by John Snow and William Farr of the relation between polluted water and cholera, and in several studies on the differences in the rates of mortality in large and small hospitals.

By the last decade of the nineteenth century the use of tables for causal analysis had reached a high stage of development, both in England and on the continent of Europe. This was also the period when Charles Booth, disturbed by a socialist claim that a third of the people of London were living in poverty, was conducting his monumental study of the London poor, a study initially intended to uncover the cause of poverty. In France, at about the same time, Émile Durkheim drew on the accumulated work in moral statistics to produce the first truly sociological explanation of differences in suicide rates. The two men and their studies could hardly have been more different. Booth, the successful businessman and dedicated conservative, primarily sought accurate data on the poor of London; his original hope for causal analysis was never realized. Durkheim, the brilliant and ascetic university professor, saw in his analysis of official statistics the opportunity to make sociology a truly autonomous discipline. And yet the two men were alike in one important error of omission: both failed to recognize their need for the statistical tools being developed at the same time in the work of Francis Galton, Karl Pearson, and G. Udny Yule.

By 1888 Galton’s research on heredity and his acquaintance with Quetelet’s use of the normal distribution had led him to the basic ideas of correlation and regression, which were taken up and developed further by Pearson, beginning in 1892 (the year in which Booth was president of the Royal Statistical Society). Three years later, in his first paper on statistics, Yule (1895) called attention to Booth’s misinterpretation of some tabular data. Where Booth had claimed to find no association between two sets of variables, Yule computed correlation coefficients that ranged from 0.28 to 0.40; a similar table in which Durkheim ([1897] 1951, p. 87) saw “not the least trace of a relation” yields an even higher correlation. According to the then-current theory, the coefficient of correlation had meaning only as a parameter of a bivariate normal distribution; Yule had no “right” to make such a computation, but he made it anyway, apparently believing that an illegitimate computation was better than no computation at all.

Two further papers of Yule’s showed the wisdom of this judgment and laid the foundations for much of modern survey analysis. He proved that the use of the correlation coefficient to measure association does not depend on the form of the underlying distribution; in particular, it need not be normal. In the same paper he also gave the formulas of multiple and partial regression and correlation (1897). Two years later he applied these ideas to a survey analysis of “panel” data on poverty, in the form of a multiple regression of changes in poverty rates on three independent variables (1899). In these four years Yule showed how the statistical part of survey analysis could be made truly quantitative. But although some economists and psychologists were early users of multiple regression, it was not until the 1960s that other survey analysts, notably sociologists and students of public opinion, came to see its importance *[see* Linear hypotheses, article on regression].

Not content to deal only with continuous variables, Yule also took in hand the analysis of tabular data and set forth its algebraic basis (1900). Although this material appeared in every edition of his *Introduction to the Theory of Statistics* (Yule & Kendall [1911] 1958, chapters 1-3), it had even less effect on survey analysis than did his work on continuous variables. It was finally brought to the attention of social scientists by Lazarsfeld in 1948 (see Lazarsfeld & Rosenberg 1955, pp. 115-125), as the basis for his codification of survey analysis.

A new area of application for survey analysis began to appear in the 1920s, first in the form of market research and later in opinion polling, communications research, and election research. Among the factors that promoted these new developments were the change in American social psychology and sociology from speculation to empirical research, the wide availability of punched-card machines, and a new interest in the use of formal statistical procedures, notably at Chicago, where William F. Ogburn and Samuel A. Stouffer taught both correlational and tabular techniques in the 1930s. By the time of World War II, survey analysis had advanced to the point where Stouffer was able, as already mentioned, to organize a group of experienced survey analysts to conduct hundreds of attitude surveys among American soldiers.

Three major developments have shaped survey analysis since the 1940s. The emphasis on closer relations between theory and research has led to greater concern with conceptualization and index formation, as well as with the causal interpretation of statistical relations. The rise of university research bureaus has increased both the quantity and the quality of survey analysis. And the advent of the large computer has brought survey analysts to contemplate once again the vision that Yule had placed before them in 1899—the possibility of replacing the crude assessment of percentaged tables with the more powerful methods of multiple regression and other multivariate procedures.

### The structure of survey analysis

Analysis is the study of variation. Beginning with the variation in a dependent variable, the analyst seeks to account for this variation by examining the covariation of the dependent variable with one or more independent variables. For instance, in a sample where 57 per cent prefer the conservative party and 43 per cent the liberal party, the analytic question is why people divide between the two parties. If everyone preferred the same party, there would be no variation and, *within this sample,* no problem for analysis. The answer to this question comes from examining how the distribution of preferences is affected by a set of independent variables, such as the sex of the individual, the social class of his family, and the size of his community. This combination of a single dependent variable and a set of independent variables is the most common structure examined in survey analysis; it also serves as the building block for other, more complex structures—for example, a study with several dependent variables.

The sequence of steps in the analysis of an ideal experiment is determined largely by the design of the study. In real life, of course, an experimenter almost always confronts new problems in his analysis. The survey analyst, however, has so many more options open to him at each step that he cannot specify in advance all of the steps through which an analysis should go. Nevertheless, it is useful to conceive of analysis as a series of cycles, in which the analyst goes through the same formal sequence of steps in each cycle, each time changing some essential aspect of his problem. A typical cycle can be described as follows.

(1) *Measuring the parameters of some distribution.* The concrete form of this step may be as simple as computing percentages in a two-variable table or as complicated as fitting a regression plane to a large set of points. Indeed, the parameters may not even be expressed numerically; in conventional survey analysis (i.e., analysis using percentaged contingency tables) two-variable relations may be classified simply as “large” or “small.”

(2) *Assessing the criteria for an adequate analysis.* The reasons survey analysts give for stopping one line of investigation and starting another often appear superficial: they have run out of time, cases, or interest. On further investigation, however, it usually appears that they have stopped for one or more of the following reasons: (a) statistical completeness, that is, a sufficiently high proportion of the variation in the dependent variable has been accounted for by the variation in the independent variables; (b) theoretical clarity, that is, the meanings of the relations already found and the nature of the causal structure are sufficiently clear not to need further analysis; (c) unimportance of error, that is, there is good reason to believe that the apparent findings are genuine, that they are not the result of one or another kind of error. These three reasons, then, can be regarded as criteria for an adequate analysis.

(3) *Changing the analytic model.* With these criteria in mind, the analyst decides whether to stop the analysis or to continue it, either by adding more variables or by changing the basic form of the analysis (for example, from linear to curvilinear regression).

In practice, there are two major sets of procedures by means of which the steps involved in each cycle are taken. One rests on the construction of percentaged contingency tables, the other on the different kinds of multivariate statistical analysis. Despite the extensive use of correlational techniques by psychologists, survey analysis has on the whole been dominated by percentaged contingency tables. One reason for this dominance is economic: with punched-card machines, running several tables takes less time than computing a single correlation coefficient on a desk calculator. The advent of the large electronic computer has brought a revolutionary change in this situation, and statistical computations that might have taken months, if they were done at all, can now be done in minutes. This change has led to new interest in multivariate statistical techniques and to some questioning of the place of tabular methods in survey analysis. Since many textbooks explain the basic ideas of multivariate statistical procedures, it will be necessary to consider in detail only the logic of tabular analysis.

### Percentaged contingency tables

Let us recall the illustration of a sample in which 57 per cent prefer the conservative party and 43 per cent the liberal party. Further, let us call the dependent variable (party preference) *A* and the three independent variables (sex, social class, and size of community) B, C, and D. For simplicity, let each variable take only two values: A_{1} (conservative) and A_{2} (liberal), B_{1} and B_{2}, and so on. Tabular analysis then begins by considering the distribution of the dependent variable, as in Table 1.

Table 1. Percentage distribution of A*

| | |
|---|---|
| A_{1} | 57% |
| A_{2} | 43 |
| Total | 100% |
| | (n = 1,000) |

\* Hypothetical data.

In this simple distribution, 57 per cent of the sample of 1,000 cases have the value A_{1} and 43 per cent have the value A_{2}. Analysis proper begins when a second variable is introduced and the association between the two is examined by means of a two-variable table. However, instead of looking at a two-variable table, it is intuitively more appealing to consider once again the distribution of A or, rather, *two* distributions of A, one for those people classified as B_{1} and the other for those who are B_{2}, as in Table 2.

Table 2. Percentage distribution of A for those who are B_{1} and B_{2}*

| | B_{1} | B_{2} |
|---|---|---|
| A_{1} | 45% | 66% |
| A_{2} | 55 | 34 |
| Total | 100% | 100% |
| | (n = 420) | (n = 580) |

\* Hypothetical data.

Perhaps the simplest measurement of association is to compare these two univariate distributions, that is, to note the 21 percentage-point difference in the proportion of B_{1}’s and B_{2}’s who respond A_{1}. These two distributions are, of course, the two columns of a 2 × 2 table. The point of separating them is to stress the concept of two-variable association as the comparison of univariate distributions.
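The comparison just described can be sketched in a few lines of Python. The names `table2` and `percentage_difference` are illustrative assumptions, not part of the original analysis; the figures are the hypothetical ones given in Table 2.

```python
# Hypothetical data from Table 2: per cent responding A1 or A2 within each B group.
table2 = {
    "B1": {"A1": 45, "A2": 55, "n": 420},
    "B2": {"A1": 66, "A2": 34, "n": 580},
}

def percentage_difference(table, row, col_a, col_b):
    """Return the percentage-point difference in `row` between two columns."""
    return table[col_b][row] - table[col_a][row]

diff = percentage_difference(table2, "A1", "B1", "B2")
print(diff)  # 21

# Weighting the two columns by their sizes recovers the marginal share of A1.
total_n = sum(col["n"] for col in table2.values())
marginal_a1 = sum(col["A1"] * col["n"] for col in table2.values()) / total_n
print(round(marginal_a1))  # 57
```

The second computation is only a consistency check: the two conditional distributions, weighted by their group sizes, reproduce the 57 per cent of Table 1.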

The same sort of link between different levels appears when a third variable is introduced. This three-variable “elaboration,” or multivariate analysis, as Lazarsfeld has called it (not to be confused with the statistical concept of the same name, which is here called multivariate statistical analysis), involves the re-examination of a two-variable association for the two separate subgroups of people classified according to the values of the third variable, C, as in Table 3.

Table 3. Associations of A and B for those who are C_{1} and C_{2}*

| | C_{1} | | C_{2} | |
|---|---|---|---|---|
| | B_{1} | B_{2} | B_{1} | B_{2} |
| A_{1} | 25% | 38% | 72% | 76% |
| A_{2} | 75 | 62 | 28 | 24 |
| Total | 100% | 100% | 100% | 100% |
| | (n = 240) | (n = 160) | (n = 180) | (n = 420) |

\* Hypothetical data.

It is necessary to give a numerical measure of the association in each “partial” table, but the above reasoning applies no matter what measure is chosen. Elaboration, then, is simply the comparison of these two measures of association. That is, when the third variable, C, is introduced, the association between A and B becomes a composite dependent variable; elaboration is the relation between C and some measure of the association between A and B.

This approach to elaboration is another way of describing *statistical interaction,* or the extent to which the association between A and B depends on the value of C. In the usual discussion of interaction, however, it also appears as something new in the treatment of association. The simple formalization presented here emphasizes the common thread that runs through the treatment of one, two, three, or more variables: the measure of the degree of relation for a given number of variables becomes the dependent variable when a new independent variable is introduced.
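As a minimal sketch of this idea, the following Python fragment treats the percentage-point difference as the measure of association (one choice among many, assumed here for simplicity) and compares that measure across the two values of C. The names `table3` and `partial_difference` are hypothetical; the figures are those of Table 3.

```python
# Hypothetical data from Table 3: per cent A1 among B1 and B2, within C1 and C2.
table3 = {
    "C1": {"B1": {"A1": 25, "n": 240}, "B2": {"A1": 38, "n": 160}},
    "C2": {"B1": {"A1": 72, "n": 180}, "B2": {"A1": 76, "n": 420}},
}

def partial_difference(partial):
    """Percentage-point difference (B2 minus B1) in the share responding A1."""
    return partial["B2"]["A1"] - partial["B1"]["A1"]

# One measure of the A-B association for each value of C.
partials = {c: partial_difference(t) for c, t in table3.items()}
print(partials)  # {'C1': 13, 'C2': 4}

# Elaboration compares these measures: how the A-B association varies with C.
interaction = partials["C1"] - partials["C2"]
print(interaction)  # 9
```

With these hypothetical figures both partial differences (13 and 4 points) are smaller than the 21 points of the original two-variable comparison, and they differ from each other, so the association between A and B depends to some degree on C.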

Lazarsfeld has distinguished three ideal types of configurations that result when the relation between A and B is examined separately for C_{1} and C_{2}. The disappearance of the original relation is called “explanation” when C is causally prior to both A and B, and “interpretation” when C intervenes between A and B. The third type of elaboration, “specification,” is essentially an extreme form of interaction, in which at least one partial relation is larger than the original relation or is of opposite sign. A full discussion of elaboration and many examples are given in Hyman (1955, chapters 6, 7).

Lazarsfeld’s several discussions of elaboration have clarified much of what a survey analyst does in practice; they have also led Simon, Blalock, Boudon, and Duncan (whose work is discussed below, in the section on causal structures) to mathematize the idea of a causal structure and to extend it beyond the three-variable level. The fundamental ideas of elaboration have thus stood the test of time. However, the percentaged contingency table—the tool that most survey analysts have used to carry out the ideas of elaboration—now appears less satisfactory than it formerly did.

### Alternatives to tabular analysis

The value of a technique should be judged against the available alternatives to it. The best alternatives to tabular analysis now seem to be multiple regression (for quantitative dependent variables) and multiple discriminant analysis (for qualitative dependent variables). Two procedures are necessary because tabular analysis, of course, treats quantitative and qualitative variables in essentially the same way. Compared with these techniques, tabular analysis appears to have three principal shortcomings, as follows.

(1) *Lack of a measure of statistical completeness.* The square of the multiple correlation coefficient is the proportion of the variation in the dependent variable that is accounted for by the regression on the independent variables. In regression the analyst always knows how far he has gone toward the goal of complete explanation (usually linear). By the late 1960s no comparable statistic adequate for a large number of independent variables had yet become available in tabular analysis; the analyst therefore does not know whether he has gone far enough or, indeed, whether the introduction of additional variables makes an appreciable addition to the explanatory power of those already included.

(2) *Ambiguity of causal inferences.* Even with very large samples, the number of independent variables that can be considered jointly is usually no more than four or five; percentage comparisons involving many variables are usually based on too few cases to be statistically stable. It is often possible, however, to find many more variables that have appreciable effects on the dependent variable. This inability to examine the joint effects of *all* of the apparently important independent variables makes the interpretation of *any* relation between the independent and dependent variables inherently ambiguous. Suppose, for example, that one can examine the effects of only three variables at once; that variables B, C, D, *E,* and F all have appreciable associations with the dependent variable, *A;* and that, as is usually the case, all of these independent variables are intercorrelated. Then it is impossible to draw clear conclusions from any three-variable table, since what appears to be the effect of, say, B, C, and D is also the effect of *E* and F, but in some unknown and inherently unknowable degree. Maurice Halbwachs had this kind of argument in mind when he said that Durkheim’s attempt to discern the effects of religion by areal comparisons was fundamentally impossible; any comparison between Catholic areas and Protestant areas involves differences in income, social norms, industrialization, and many other factors *[see* Suicide, article on social aspects]. In contrast, the procedures of multivariate statistics can handle dozens or even hundreds of variables at the same time, so that it is relatively easy to ascertain the meaning of an observed relation between independent and dependent variables.

There is another reason why the percentaged table is an unsatisfactory tool of causal analysis. As Campbell and Clayton (1961) have shown, unwanted regression effects produced by the less than perfect association between the independent variables may be confounded with the causal effects that the analyst wants to examine. These regression effects are particularly dangerous in panel studies.

(3) *Lack of a systematic search procedure.* At the beginning of an analysis the main task is to find the independent variables that are the best predictors of the dependent variables, and the dependent variables that are most predictable from the independent variables. The complex intercorrelations among the independent variables make this a slow task in tabular analysis. In contrast, “stepwise” regression and discriminant programs rapidly arrange the independent variables in the order of their predictive power, and modern programs allow such analyses to be repeated for other dependent variables in a few seconds. Sonquist and Morgan (1964) have devised a computer program that simulates some aspects of the search behavior of a tabular analyst (see also Sterling et al. 1966).
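The measure of statistical completeness mentioned under point (1) can be written out explicitly. In the sketch below, the symbols $y_i$ (observed values of the dependent variable), $\hat{y}_i$ (values predicted by the regression), and $\bar{y}$ (the mean of the observed values) are notation assumed here for illustration; they do not appear in the original text.

$$
R^{2} = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^{2}}{\sum_{i} (y_i - \bar{y})^{2}}
$$

For a least-squares regression that includes a constant term, this quantity equals the square of the multiple correlation coefficient. A value near 1 means that little variation remains to be explained, while a value near 0 means that the independent variables included so far account for very little.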

**Problems of multivariate statistics.** In emphasizing the defects of tabular analysis and the virtues of multivariate statistics, the above section presented a somewhat one-sided picture. Such procedures as regression and discriminant analysis have serious problems of their own. For example, to treat a nominally scaled variable such as race or geographical region as an independent variable, one must first transform it into a set of “dummy variables” (for instances of this procedure, see Draper & Smith 1966, section 5.3).
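The transformation into dummy variables can be illustrated with a short Python sketch. The function `dummy_code`, the `is_` prefix, and the region labels are all hypothetical choices made for this example; they are not taken from Draper and Smith.

```python
def dummy_code(values, reference):
    """Turn a nominal variable into 0/1 dummy variables, omitting `reference`."""
    # One dummy per category except the reference (baseline) category.
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

# Hypothetical region codes for four respondents; "north" is the omitted baseline.
regions = ["north", "south", "west", "south"]
coded = dummy_code(regions, reference="north")
for row in coded:
    print(row)
```

A variable with k categories yields k − 1 dummy variables in this scheme. The omitted category serves as the baseline against which the others are compared, and leaving it out avoids an exact linear dependence with the constant term of the regression.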

Another problem arises in detecting and representing statistical interaction. In their standard forms, multivariate statistical procedures assume that there is no interaction. However, several ancillary procedures for detecting interaction are available—for example, analysis of variance, the examination of residuals from the regression, the Sonquist-Morgan “Automatic Interaction Detector” (1964), and the stratification of the sample by one or more of the interacting variables, with separate regressions in each part. Similarly, it is possible to represent interaction by appropriate modifications of the standard equations (see Draper & Smith 1966, chapter 5).
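One common modification of this kind adds a product term to the regression equation. The form below is a generic sketch in notation assumed for illustration (the coefficients $\beta_0, \dots, \beta_3$ and the error term $\varepsilon$ are not symbols used in the original text), applied to the variables A, B, and C of the earlier tables:

$$
A = \beta_0 + \beta_1 B + \beta_2 C + \beta_3 (B \times C) + \varepsilon
$$

In this form the effect of B on A is $\beta_1 + \beta_3 C$, so it varies with C whenever $\beta_3 \neq 0$. The standard equation without interaction is the special case $\beta_3 = 0$.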

On balance, the method of choice appears to be multivariate statistical procedures for the early and middle phases of a survey analysis, and tables for presenting the final results. An examination of current journals suggests that this judgment is increasingly shared by social scientists who engage in survey analysis.

### The meanings of statistical results

In every cycle of analysis there are problems of imputing or verifying the meanings of variables and relations; that is, there are problems of conceptualization and validation. Much of the meaning that one imputes to variables and relations comes from sources other than the statistical data: from the precipitate of past research and theory, from the history of the phenomena studied, and from a wide range of qualitative procedures. Indeed, it is the skillful interweaving of survey and ancillary data that often distinguishes insightful survey analysis from routine manipulation.

Although conceptualization and validation are partly matters of judgment, the wide range of questions in the typical survey provides an objective basis for imputing meanings to observed variables. At one extreme there are such simple procedures as examining the association between a variable of uncertain meaning and one or more additional variables whose meaning is less in question. For example, a self-estimate of “interest in politics” may be validated by seeing how strongly this interest is associated with reading political news, discussing politics with friends, and voting in elections. At the other extreme, the common meanings of large numbers of variables may be extracted by some “rational” scaling procedure, such as Guttman scaling, latent structure analysis, or factor analysis. Again, computer programs have made such procedures much less expensive than they once were, and therefore more desirable than arbitrarily constructed scales.

### Causal structures

Although survey analysts have always aimed at uncovering causes, the idea that the independent and dependent variables should all be located in a determinate structure of relations—usually represented by a set of boxes and arrows—has not yet been generally accepted. Few survey analyses have gone beyond Lazarsfeld’s three types of elaboration. The first significant methodological advance on Lazarsfeld’s formalization was made by Simon (1954); drawing on a large body of work in econometrics, Simon showed how Lazarsfeld’s idea of time order, which had been treated separately from the statistical configurations of his three variables, could be combined with them in a system of simultaneous equations. Blalock (1964) took up Simon’s suggestions and showed how they can be applied to empirical data. Boudon (1965) has developed a theory of “dependence analysis,” which, along with extending the models of Simon and Blalock, shows the relations between these models, ordinary least-squares regression, and the work of Sewall Wright on “path coefficients” (a line of inquiry independently pursued in biology since 1918). Finally, Duncan (1966) has applied the ideas of path analysis to a set of sociological examples and has shown how path analysis can make loose verbal formulations of the causal structure explicit and consistent.

### Statistical inference in survey analysis

The statistical theory of sample surveys has dealt almost entirely with descriptive studies, in which the usual problem is to estimate a few pre-designated parameters. Where survey analysts have tried to apply this theory, they have often ignored or argued away two important assumptions: (1) that nonrandom errors, such as those produced in sampling, interviewing, and coding, are negligible, and (2) that all of the hypotheses tested were stated before examining the data.

Few survey investigators know the direction and magnitude of the nonrandom errors in their work with any accuracy, for measurement of these errors requires a specially designed study. Stephan and McCarthy (1958) reported the results of several empirical studies of sampling procedures. Such studies provide a rough guide to the survey investigator; precise data on the actual operation of survey procedures are probably available only in large research organizations that frequently repeat the same kind of survey. Without such knowledge, the interpretation of statistical tests and estimates in precise probabilistic terms may be misleading. An apparently “significant” relation may result from nonrandom error rather than from the independent variable, and a relation that apparently is not significant may stem from a nonrandom error opposite in sign and approximately equal in magnitude to the effect of the independent variable.

Lack of information about the nonrandom errors casts doubt on *any* inference from data, not simply on the inferences of formal statistics. However, the practices of many survey analysts justify emphasizing the effects of this lack of knowledge on the procedures of statistical inference, especially on tests of significance. All too often the words “significant at the .05 level” are thought to provide information on the random variables without there being any knowledge of the nonrandom variables, as if this phrase were a certificate of overall methodological quality.

Even when there is no problem of nonrandom error, the history of the hypothesis being tested affects the validity of probability computations. The survey analyst seldom begins a study with a single specific hypothesis in mind. What hypotheses he has are usually diffuse and ill-formulated, and the data of the typical survey are so rich and suggestive that he almost always formulates many more hypotheses after looking at the results. Indeed, the original analyst usually examines such a small proportion of the hypotheses that can be studied with his data that libraries of survey data have been established to facilitate “secondary analysis,” or the restudy of survey data for purposes that may not have been intended by the original investigator.

In the typical survey conducted for scientific purposes the analyst alternates between examining the data and formulating hypotheses that he explores further in the same body of data. This kind of “data dredging” is a necessary and desirable part of survey analysis; to limit survey analysis to hypotheses precisely stated in advance of seeing the data would be to make uneconomical use of a powerful tool. However, the analyst pays a price for this flexibility: although it is legitimate to explore the implications of hypotheses on the same body of data that suggested them, it is *not* legitimate to attach probability statements to these hypotheses (Selvin & Stuart 1966). The situation is analogous to testing a table of random numbers for randomness: the a priori probability of finding six consecutive fives in the first six digits examined is $(0.1)^{6}$, a quantity small enough to cast doubt on the randomness of the process by which the table was generated. If, however, one hunts through thousands of digits to find such a sequence (and the longer one looks, the greater the likelihood of finding it), this probability computation becomes meaningless. Similarly, the survey analyst who leafs through a pile of tables until he finds an “interesting” result that he then “tests for significance” is deceiving himself; the procedure designed to guard against the acceptance of chance results actually promotes their acceptance. The use of computers has exacerbated this problem. Many programs routinely test every relation for significance, usually with procedures that were intended for a single relation tested alone, and many analysts seem unable to resist dressing their dredged-up hypotheses in ill-fitting probabilistic clothes.
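The arithmetic behind the random-number analogy can be checked directly. The following Python sketch is an illustration only: the function name `prob_run_of_fives` and the digit counts tried are assumptions, and the digits are assumed to be independent and uniformly distributed. It computes exactly the chance that six consecutive fives appear somewhere among the first `n_digits` digits examined.

```python
def prob_run_of_fives(n_digits: int, run_length: int = 6, p: float = 0.1) -> float:
    """Probability that n_digits random digits contain run_length consecutive fives."""
    # state[k]: probability that no full run has occurred yet and the
    # digits seen so far end in exactly k consecutive fives.
    state = [1.0] + [0.0] * (run_length - 1)
    found = 0.0
    for _ in range(n_digits):
        new_state = [0.0] * run_length
        new_state[0] = (1.0 - p) * sum(state)      # any other digit resets the run
        for k in range(run_length - 1):
            new_state[k + 1] = p * state[k]        # a five extends the run
        found += p * state[run_length - 1]         # a five completes the run
        state = new_state
    return found


# Hypothetical search lengths, chosen only for illustration.
for n in (6, 1_000, 100_000):
    print(n, prob_run_of_fives(n))
```

With exactly six digits the function reproduces the a priori figure $(0.1)^{6}$; as the number of digits searched grows, the probability rises steadily, which is why a probability statement attached after the hunt no longer means what it appears to mean.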

The analyst who wants to perform a statistical test of a dredged-up hypothesis does not have to wait for a new study. If he has foreseen this problem, he can reserve a random subsample of his data at the outset of the analysis, dredge the remainder of his data for hypotheses, and then test a small number of these hypotheses on the reserved subsample. The analyst who was not so foresighted or who has dredged up too many hypotheses to test on one subsample may be able to use a large data library to provide an approximate test. For example, an analyst who wants to test a dredged-up relation involving variables *A*, *B*, and *C* would look through such a library until he found another, comparable study with these same variables. He would then divide the sample of this study into a number of random subsamples and see how his dredged-up relation fares in these independent replications. If all or most of the subsamples yield relations in the same direction as his dredged-up relation, he can be reasonably confident that it was not the result of chance.
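The two procedures just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure taken from the survey literature: the function names, the 30 per cent reserve, the five subsamples, and the toy records (pairs of 0/1 attributes) are all hypothetical, and the relation is summarized by a simple cross product (observed joint proportion minus the product of the marginals).

```python
import random


def split_reserved(records, reserve_fraction=0.3, seed=0):
    """Shuffle once and return (exploration part, reserved part)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_reserved = round(len(shuffled) * reserve_fraction)
    return shuffled[n_reserved:], shuffled[:n_reserved]


def random_subsamples(records, k, seed=0):
    """Divide a comparable study into k disjoint random subsamples."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def cross_product(pairs):
    """Joint proportion of (1, 1) minus the product of the two marginals."""
    n = len(pairs)
    p_xy = sum(1 for x, y in pairs if x and y) / n
    p_x = sum(1 for x, _ in pairs if x) / n
    p_y = sum(1 for _, y in pairs if y) / n
    return p_xy - p_x * p_y


def share_same_direction(subsamples, expected_sign):
    """Fraction of subsamples whose relation has the dredged-up sign."""
    hits = sum(1 for s in subsamples if cross_product(s) * expected_sign > 0)
    return hits / len(subsamples)


# Hypothetical data: 1,000 records of two dichotomous attributes.
records = [(1, 1)] * 350 + [(1, 0)] * 150 + [(0, 1)] * 300 + [(0, 0)] * 200

exploration, reserved = split_reserved(records)        # dredge only in `exploration`
print(cross_product(reserved))                         # then test on `reserved`
print(share_same_direction(random_subsamples(records, 5), expected_sign=1))
```

The essential design choice is that the reserved part is set aside before any tables are examined; a split made after dredging would not restore the meaning of the probability statement.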

This procedure is simple, and it will become even simpler when the questions in the data libraries are put on magnetic tape, as the responses are now. However, this “backward replication” does raise some methodological problems, especially concerning the comparability of studies and questions. Neither this procedure nor the use of reserved subsamples has yet been studied in any detail by survey methodologists or statisticians.

Hanan C. Selvin

*[See also* Evaluation Research; Experimental Design; Linear Hypotheses; Multivariate Analysis; Panel Studies; Sample Surveys; Science, *article on* the philosophy of science; Sociology; Statistics, Descriptive; Tabular Presentation; *and the biographies of* Booth; Durkheim; Galton; Graunt; Ogburn; Pearson; Quetelet; Stouffer; Süssmilch; Yule.]

## BIBLIOGRAPHY

Blalock, Hubert M. Jr. 1964 *Causal Inferences in Nonexperimental Research.* Chapel Hill: Univ. of North Carolina Press.

Boudon, Raymond 1965 A Method of Linear Causal Analysis: Dependence Analysis. *American Sociological Review* 30:365-374.

Boudon, Raymond 1967 *L’analyse mathématique des faits sociaux.* Paris: Plon.

Boudon, Raymond; and Lazarsfeld, Paul F. (editors) 1966 *L’analyse empirique de la causalité.* Paris and The Hague: Mouton.

Campbell, Donald T.; and Clayton, K. N. 1961 Avoiding Regression Effects in Panel Studies of Communication Impact. *Studies in Public Communication* 3: 99-118.

Coleman, James S. 1964 *Introduction to Mathematical Sociology.* New York: Free Press.

Draper, N. R.; and Smith, H. 1966 *Applied Regression Analysis.* New York: Wiley.

Duncan, Otis Dudley 1966 Path Analysis: Sociological Examples. *American Journal of Sociology* 72:1-16.

Durkheim, Émile (1897) 1951 *Suicide: A Study in Sociology.* Glencoe, Ill.: Free Press. → First published in French.

Graunt, John (1662) 1939 *Natural and Political Observations Made Upon the Bills of Mortality.* Edited and with an introduction by Walter F. Willcox. Baltimore: Johns Hopkins Press.

Hogben, Lancelot T. 1957 *Statistical Theory: The Relationship of Probability, Credibility and Error; an Examination of the Contemporary Crisis in Statistical Theory From a Behaviourist Viewpoint.* London: Allen & Unwin.

Hyman, Herbert H. 1955 *Survey Design and Analysis: Principles, Cases, and Procedures.* Glencoe, Ill.: Free Press.

Lazarsfeld, Paul F. 1961 Notes on the History of Quantification in Sociology—Trends, Sources and Problems. Pages 147-203 in Harry Woolf (editor), *Quantification: A History of the Meaning of Measurement in the Natural and Social Sciences.* Indianapolis, Ind.: Bobbs-Merrill.

Lazarsfeld, Paul F.; Berelson, Bernard; and Gaudet, Hazel (1944) 1960 *The People’s Choice: How the Voter Makes Up His Mind in a Presidential Campaign.* 2d ed. New York: Columbia Univ. Press.

Lazarsfeld, Paul F.; and Rosenberg, Morris (editors) 1955 *The Language of Social Research: A Reader in the Methodology of Social Research.* New York: Free Press. → A revised edition is scheduled for publication in 1968.

Moser, Claus A. 1958 *Survey Methods in Social Investigation.* New York: Macmillan.

Selvin, Hanan C. 1965 Durkheim’s *Suicide:* Further Thoughts on a Methodological Classic. Pages 113-136 in Robert A. Nisbet (editor), *Émile Durkheim.* Englewood Cliffs, N.J.: Prentice-Hall.

Selvin, Hanan C.; and Stuart, Alan 1966 Data-dredging Procedures in Survey Analysis. *American Statistician* 20, no. 3:20-23.

Simon, Herbert A. 1954 Spurious Correlation: A Causal Interpretation. *Journal of the American Statistical Association* 49:467-479.

Sonquist, John A.; and Morgan, James N. 1964 *The Detection of Interaction Effects: A Report on a Computer Program for the Selection of Optimal Combinations of Explanatory Variables.* University of Michigan, Institute for Social Research, Survey Research Center, Monograph No. 35. Ann Arbor: The Center.

Stephan, Frederick F.; and McCarthy, Philip J. 1958 *Sampling Opinions: An Analysis of Survey Procedure.* New York: Wiley.

Sterling, T. et al. 1966 Robot Data Screening: A Solution to Multivariate Type Problems in the Biological and Social Sciences. *Communications of the Association for Computing Machinery* 9:529-532.

Stouffer, Samuel A. et al. 1949 *The American Soldier.* 2 vols. Studies in Social Psychology in World War II, Vols. 1-2. Princeton Univ. Press. → Volume 1: *Adjustment During Army Life.* Volume 2: *Combat and Its Aftermath.*

Tukey, John W. 1962 The Future of Data Analysis. *Annals of Mathematical Statistics* 33:1-67.

Woolf, Harry (editor) 1961 *Quantification: A History of the Meaning of Measurement in the Natural and Social Sciences.* Indianapolis, Ind.: Bobbs-Merrill.

Yule, G. Udny 1895 On the Correlation of Total Pauperism With Proportion of Out-relief. *Economic Journal* 5:603-611.

Yule, G. Udny 1897 On the Significance of Bravais’ Formulae for Regression, &c., in the Case of Skew Correlation. Royal Society of London, *Proceedings* 60:477-489.

Yule, G. Udny 1899 An Investigation Into the Causes of Changes in Pauperism in England, Chiefly During the Last Two Intercensal Decades. Part I. *Journal of the Royal Statistical Society* 62:249-286.

Yule, G. Udny 1900 On the Association of Attributes in Statistics: With Illustrations From the Material of the Childhood Society, etc. Royal Society of London, *Philosophical Transactions* Series A 194:257-319.

Yule, G. Udny; and Kendall, M. G. (1911) 1958 *An Introduction to the Theory of Statistics.* 14th ed., rev. & enl. London: Griffin. → M. G. Kendall has been a joint author since the eleventh edition (1937). The 1958 edition was revised by Kendall.

## II. THE ANALYSIS OF ATTRIBUTE DATA

Modern social research requires the study of the interrelations between characteristics that are themselves not quantified. Has someone completed high school? Is he native born? Male or female? Is there a relation between all such characteristics? Do they in turn affect people’s attitudes and behavior? How complex are these connections: do men carry out their intentions more persistently than women? If so, is this sex difference related to level of education?

Some answers to some of these questions have been made possible by two developments in modern social research: improved techniques of collecting data through questionnaires and observations, and sampling techniques that make such collecting less costly. Empirical generalizations in answer to questions like those above proceed essentially in two steps. First, people or collectives have to be measured on the characteristics of interest. Such characteristics are now often called *variates,* to include simple dichotomies as well as “natural” quantitative variables, like age, or “artificial” indices, like a measure of anxiety. The second step consists in establishing *connections* between such variates. The connections may be purely descriptive, or they may be causal chains based on established theories or intuitive guesswork.

In order to take the second step and study the connection between variates, certain procedures have been developed in *survey analysis,* although the procedures apply, of course, to any kind of data, for instance, census data. Attitude surveys have greatly increased the number of variates that may be connected, and thus the problems of studying connections have become especially visible. The term “survey” ordinarily excludes studies in which there are observations repeated over time on the same individuals or collectives; such studies, usually called panel studies, are not discussed in this article *[see* Panel Studies].

It makes a difference whether one deals with quantitative variables or with classifications allowing only a few categories, which often may not even be ordered. Connections between quantitative variables have long been studied in the form of correlation analysis and its many derivatives *[see* Multivariate Analysis, *articles on* correlation]. Correlation techniques can sometimes also be applied to qualitative characteristics, by assigning arbitrary numbers to their subdivisions. But some of the most interesting ideas in survey analysis emerge if one concentrates on data where the characterization of objects cannot be quantified and where only the frequencies with which the objects fall into different categories are known. As a matter of fact, the main ideas can be developed by considering only dichotomies, and that will be done here. The two terms “dichotomy” and “attribute” will be used interchangeably.

### Dichotomous algebra

Early statisticians (see Yule & Kendall 1911) showed some interest in attribute statistics, but such interest was long submerged by the study of quantitative variables with which the economists and psychologists were concerned. Attribute statistics is best introduced by an example which a few decades ago one might have found as easily in a textbook of logic as in a text on descriptive statistics. The example will also permit introduction of the symbolism needed for this exposition.

Suppose there is a set of 1,000 people who are characterized according to sex (attribute 1), according to whether they did or did not vote in the last election (attribute 2), and according to whether they are of high or low socioeconomic status (SES, attribute 3). There are 200 high-status men who voted and 150 low-status women who did not vote. The set consists of an equal number of men and women. One hundred high-status people did not vote, and 250 low-status people did vote. There are a total of 650 voters in the whole set; 100 low-status women did vote. How many low-status men voted? How is the whole set divided according to socioeconomic status?

There are obviously three attributes involved in this problem. It is convenient to give them arbitrary numbers and, for each attribute, to assign arbitrarily a positive sign to one possibility and a negative sign to the other. The classification of Table 1 is then obtained.

Table 1

| Number | Attribute | Sign assignment (+) | Sign assignment (-) |
|---|---|---|---|
| 1 | Sex | Men | Women |
| 2 | Vote | Yes | No |
| 3 | SES | High | Low |

The problem assigns to some of the combinations of attribute possibilities a numerical value, the proportion of people who belong to the “cell.” The information supplied in the statement of the problem can then be summarized in the following way:

$$p_{123} = .20,\quad p_{\bar{1}\bar{2}\bar{3}} = .15,\quad p_{1} = p_{\bar{1}} = .50,\quad p_{\bar{2}3} = .10,\quad p_{2\bar{3}} = .25,\quad p_{2} = .65,\quad p_{\bar{1}2\bar{3}} = .10.$$

Here the original raw figures are presented as proportions of the whole set; a bar over an index indicates that in this special subset the category of that attribute to which a negative sign has been assigned applies. For example, $p_{\bar{2}3}$ is the proportion of people who did not vote and are of high status; this proportion has the numerical value 100/1,000 = .10. The number of indices that are attached to a proportion is called its *stratification level.* The problem under discussion ends with two questions. Their answers require the computation of two proportions, $p_{12\bar{3}}$ and $p_{3}$.

The derivation of the missing frequencies can be done rather simply in algebraic form. For the present purpose it is more useful to derive them by introduction of so-called fourfold tables. The point of departure is, arbitrarily, the fourfold table between attributes 1 and 2. This table is then stratified by introducing attribute 3, giving the scheme of Table 2, in which the information presented in the original problem is underscored. The principal findings are starred.

The tabular form makes it quite easy to fill in the missing cells by addition and subtraction, and this also permits answers to the two questions of the original problem in symbolic, as well as in numerical, form:

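The proportion of low-status men voting is given by

$$p_{12\bar{3}} = p_{2\bar{3}} - p_{\bar{1}2\bar{3}} = .25 - .10 = .15,$$

and the proportion of high-status people is

$$p_{3} = p_{23} + p_{\bar{2}3} = (p_{2} - p_{2\bar{3}}) + p_{\bar{2}3} = .40 + .10 = .50.$$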
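The filling-in by addition and subtraction can be retraced mechanically. The Python sketch below is only an illustration: the variable names are invented here, and the order of the subtractions is one of several that work. It starts from the seven figures given in the problem and recovers all eight cells, including the two proportions asked for.

```python
from fractions import Fraction as F

N = 1000
# Figures stated in the problem, as proportions of the whole set.
hi_men_voted = F(200, N)      # high-status men who voted
lo_women_novote = F(150, N)   # low-status women who did not vote
men = F(1, 2)                 # equal numbers of men and women
hi_novote = F(100, N)         # high-status people who did not vote
lo_voted = F(250, N)          # low-status people who voted
voted = F(650, N)             # all voters
lo_women_voted = F(100, N)    # low-status women who voted

# Missing entries, obtained by addition and subtraction only.
hi_voted = voted - lo_voted
high = hi_voted + hi_novote                      # second question
lo_men_voted = lo_voted - lo_women_voted         # first question
hi_women_voted = hi_voted - hi_men_voted
lo_novote = (1 - voted) - hi_novote
lo_men_novote = lo_novote - lo_women_novote
hi_men_novote = men - hi_men_voted - lo_men_voted - lo_men_novote
hi_women_novote = hi_novote - hi_men_novote

# Eight cells keyed by (sex, vote, SES); 1 marks the "+" category of Table 1.
cells = {
    (1, 1, 1): hi_men_voted,   (1, 1, 0): lo_men_voted,
    (1, 0, 1): hi_men_novote,  (1, 0, 0): lo_men_novote,
    (0, 1, 1): hi_women_voted, (0, 1, 0): lo_women_voted,
    (0, 0, 1): hi_women_novote, (0, 0, 0): lo_women_novote,
}

print("low-status men who voted:", lo_men_voted * N)
print("high-status people:", high * N)
print("cells sum to", sum(cells.values()))
```

Exact fractions are used so that the check that the eight cells sum to one is not blurred by rounding.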

If it were only a task of dividing and recombining sets, the problem would now be solved. But in survey analysis a new element enters which goes beyond the tradition of a calculus of classes. One is interested in the *relation* between attributes, and to express that relation an additional notion and an additional symbol are required. Taking the left side of Table 2, it is reasonable to ask how many men would have voted if sex had no influence on political behavior. The proportion of joint occurrence of two independent events is the product of the two separate proportions ($p_{1} \times p_{2}$), and therefore, under independence, .65 × .5 × 1,000 = 325 male voters would have been expected. Actually there are 350, indicating that men have a slightly higher tendency to vote than women. If the difference between the empirical and the “independent” figures for all four cells of the fourfold table had been computed, the same result in absolute value (25 cases, or a difference of 2.5 per cent of the total set) would have been obtained. (Problems of statistical significance are not relevant for the present discussion.)

Here it is useful to introduce the symbolic abbreviation $|12|$,

$$|12| = p_{12} - p_{1}p_{2} = p_{12}\,p_{\bar{1}\bar{2}} - p_{1\bar{2}}\,p_{\bar{1}2},$$

which may be called the *cross product* or the *symmetric parameter of second level.* The quantity $|12|$ is the basis of many measures of association in 2 × 2 tables *[see* Statistics, Descriptive, *article on* association]. Note that $|1\bar{2}| = -|12|$ and, in general, $|i\bar{j}| = -|ij|$.

Cross products can also be computed for the stratified fourfold tables on the right side of Table 2, and by mere inspection new factual information emerges. In the high-status stratum, men and women have the same tendency to vote, and the stratified cross product vanishes. In the low-status stratum, the relation between sex and voting is very marked. The basic question of survey analysis is whether the relation between such cross products of different stratification levels can be put into a general algorithm that can be used to draw substantive inferences. The rest of this presentation is given to the development of the relevant procedures.
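A short Python sketch makes the comparison of stratification levels concrete. It assumes the eight cell proportions obtained by completing the introductory example by addition and subtraction; the helper names `prop`, `cross`, and `stratified_cross` are illustrative. The stratified cross product is computed on the base of the total set, as the joint third-level proportion times the stratum proportion minus the product of the two second-level marginals.

```python
from fractions import Fraction as F

# Cell proportions keyed by (sex, vote, SES); 1 marks the "+" category.
CELLS = {
    (1, 1, 1): F(20, 100), (1, 1, 0): F(15, 100),
    (1, 0, 1): F(5, 100),  (1, 0, 0): F(10, 100),
    (0, 1, 1): F(20, 100), (0, 1, 0): F(10, 100),
    (0, 0, 1): F(5, 100),  (0, 0, 0): F(15, 100),
}


def prop(cells, fixed):
    """Sum the cells matching `fixed`, e.g. {1: 1, 3: 0} for men of low status."""
    return sum(v for key, v in cells.items()
               if all(key[a - 1] == s for a, s in fixed.items()))


def cross(cells, i, j):
    """Second-level cross product: p_ij - p_i * p_j."""
    return prop(cells, {i: 1, j: 1}) - prop(cells, {i: 1}) * prop(cells, {j: 1})


def stratified_cross(cells, i, j, k, k_sign=1):
    """Cross product of i and j within the stratum where attribute k has sign k_sign."""
    return (prop(cells, {i: 1, j: 1, k: k_sign}) * prop(cells, {k: k_sign})
            - prop(cells, {i: 1, k: k_sign}) * prop(cells, {j: 1, k: k_sign}))


print("whole set:  ", cross(CELLS, 1, 2))
print("high status:", stratified_cross(CELLS, 1, 2, 3, k_sign=1))
print("low status: ", stratified_cross(CELLS, 1, 2, 3, k_sign=0))
```

The vanishing value in the high-status stratum and the positive value in the low-status stratum are the two facts that the text reads off by inspection.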

In the context of this presentation, with the exception of one case to be mentioned presently, interest is confined to whether the cross product is positive, is negative, or vanishes. This last case occurs when the two attributes are independent. If men and women furnish the same proportion of voters, the cross product will be zero. The same fact can, of course, also be expressed in a different way: the proportion of men is the same among voters and among nonvoters. In empirical studies the cross product will rarely vanish perfectly; it will just be very small. It is a problem in the theory of statistical hypothesis testing to decide when an empirical cross product differs enough from a hypothesized value (often zero) so that the difference is statistically significant. Concern here, however, is only with general logical relations between cross products, and discussion of sampling and measurement error is excluded.

The eight third-level class proportions of a three-attribute dichotomous system can be arranged in the form of a cube. Such a dichotomous cube consists of eight smaller cubes, each corresponding to one of the third-level proportions. The dichotomous cube and the relative position of the third-level proportions are shown in Figure 1.

A second-level proportion can be *expanded* in terms of its third-level components. Thus, for example, $p_{12} = p_{123} + p_{12\bar{3}}$. No proportion can be negative. Therefore, *if a second-level proportion vanishes, so do its components;* for example, if $p_{12} = 0$, then it follows that $p_{123} = p_{12\bar{3}} = 0$.

Consider now the proportions that lie in the front sheet of the dichotomous cube. Keeping them in their same relative position, they are as shown in Table 3, where the entries are the proportions corresponding to the frequencies in the middle part of Table 2. Table 3 is a fourfold table that summarizes the relation between attributes 1 and 2 within only a part of the complete set of individuals: those who possess attribute 3. Such a table is called a *stratified fourfold table.* The stratified table is bordered by marginal entries of the second level which are the sums of the respective rows and columns, as indicated in the margins of Table 3, as well as by the first-level proportion $p_{3}$, the sum of all the entries. The question of dependence or independence of two attributes can also be raised for a stratified table. A conditional cross product could be constructed from the stratified table by first dividing each entry by $p_{3}$. But because it makes the whole discussion more consistent and avoids repeated computation of proportions, it is preferable to remain with the original proportions, computed on the base of the total set.

Table 3. Front sheet of dichotomous cube

| | $1$ | $\bar{1}$ | Total |
|---|---|---|---|
| $2$ | $p_{123}$ | $p_{\bar{1}23}$ | $p_{23}$ |
| $\bar{2}$ | $p_{1\bar{2}3}$ | $p_{\bar{1}\bar{2}3}$ | $p_{\bar{2}3}$ |
| Total | $p_{13}$ | $p_{\bar{1}3}$ | $p_{3}$ |

An obvious question, then, would concern the proportion of the high-status men who would have voted if sex had no influence on political behavior. Under this hypothesized independence the proportion would be given by $p_{13}p_{23}/p_{3}^{2}$. The actual proportion of the high-status men who voted is given by $p_{123}/p_{3}$. An obvious measure of the relationship between sex and voting within the high-status group is supplied by the difference between this actual proportion and the theoretical one. As an alternative development, if within the subset of individuals who possess attribute 3 there is no relation between attributes 1 and 2, then one would expect to find that there is the same proportion of individuals with attribute 1 in the entire subset as there is in that part of the subset which also possesses attribute 2. That is,

$$\frac{p_{13}}{p_{3}} = \frac{p_{123}}{p_{23}}, \qquad\text{or}\qquad \begin{vmatrix} p_{123} & p_{13} \\ p_{23} & p_{3} \end{vmatrix} = 0.$$

This determinant will be taken as the definition of the *stratified cross product* between attributes 1 and 2 among the subset possessing attribute 3. The symbol $|12;3|$ is used for this cross product. In general, then, $|ij;k|$ is defined by

$$|ij;k| = \begin{vmatrix} p_{ijk} & p_{ik} \\ p_{jk} & p_{k} \end{vmatrix} = p_{ijk}\,p_{k} - p_{ik}\,p_{jk}. \tag{2}$$

Note that $|ij|$ can be represented as

$$|ij| = \begin{vmatrix} p_{ij} & p_{i} \\ p_{j} & 1 \end{vmatrix}.$$

The elements in the back sheet of the dichotomous cube make up the stratified fourfold table shown in Table 4, which summarizes the relation between attributes 1 and 2 within that subset of individuals who *lack* attribute 3. The cross product of such a stratified fourfold table is defined by

$$|ij;\bar{k}| = \begin{vmatrix} p_{ij\bar{k}} & p_{i\bar{k}} \\ p_{j\bar{k}} & p_{\bar{k}} \end{vmatrix} = p_{ij\bar{k}}\,p_{\bar{k}} - p_{i\bar{k}}\,p_{j\bar{k}}.$$

It should be noted that eq. (2) suffices to define both $|ij;k|$ and $|ij;\bar{k}|$ if the index *k* is permitted to range through both barred and unbarred integers designating particular attributes.

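Table 4. Back sheet of dichotomous cube

| | $1$ | $\bar{1}$ | Total |
|---|---|---|---|
| $2$ | $p_{12\bar{3}}$ | $p_{\bar{1}2\bar{3}}$ | $p_{2\bar{3}}$ |
| $\bar{2}$ | $p_{1\bar{2}\bar{3}}$ | $p_{\bar{1}\bar{2}\bar{3}}$ | $p_{\bar{2}\bar{3}}$ |
| Total | $p_{1\bar{3}}$ | $p_{\bar{1}\bar{3}}$ | $p_{\bar{3}}$ |

Six fourfold tables can be formed from the elements of the dichotomous cube, one for each of the six sheets. Each of these stratified tables can be characterized by its cross product. Thus, six conditional cross products, $|12;3|$, $|12;\bar{3}|$, $|23;1|$, $|23;\bar{1}|$, $|13;2|$, and $|13;\bar{2}|$, can be formed from the elements of a dichotomous cube.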

As can be seen from Figure 1, a dichotomous cube is completely known if the absolute frequency for each of its eight “cells” is known. These eight ultimate frequencies form what is called a *fundamental set;* if they are known, one can compute any of the many other combinations which may be of interest, some of which were used in the introductory example. The number of these possible combinations is large if one thinks of all stratification levels and of all combinations of absence and presence of the three attributes. The search for other fundamental sets, therefore, has always been of interest. Yule (see Yule & Kendall 1911) investigated one which consists of the so-called positive joint frequencies, that is, the frequencies on all stratification levels which have no barred indices, together with the size of the sample (obviously, if the sample size is known, the remaining seven terms in a fundamental set can also be given in the form of proportions: $p_{1}, p_{2}, p_{3}, p_{12}, p_{13}, p_{23}, p_{123}$).

If one is looking for something like a calculus of associations, the question arises whether a fundamental set whose terms include cross products could be developed. One might then start with the following elements: *N,* the total number of individuals; $p_{1}, p_{2}, p_{3}$, the first-level proportions; and $|12|$, $|23|$, $|13|$, the three possible second-level cross products. But these are only seven elements, and so far no third-level data have been utilized. The eighth element must be of the third level, somehow characterizing the dichotomous cube as a whole (that is, depending explicitly on the third-level proportions).

One might choose as the eighth element one of the six stratified cross products, but there is no good reason for choosing one in preference to another. Also, any one of these would lack the symmetry which one can reasonably require of a parameter representing the whole cube.

The choice of the eighth parameter can be determined by three criteria: (a) The parameter should be *symmetric;* that is, its value should not be affected if the numbering of the attributes is changed around. For example, *ǀijǀ* is symmetric, since *ǀijǀ = ǀjiǀ.* (b) The parameter should be *homogeneous,* in the sense that each of its terms should involve the same number of subscripts. (c) The parameter should be such that it can be used, together with lower-level class proportions, to *evaluate any third-level proportion.*

The second-level cross products $|ij| = p_{ij} - p_{i}p_{j}$ obviously satisfy the first two conditions. That condition (a), symmetry ($|12| = |21|$), and condition (b), homogeneity, are satisfied can be seen by inspection. That a condition analogous (for second-level proportions) to the third one is satisfied can be demonstrated by equations like

$$p_{i\bar{j}} = p_{i}\,p_{\bar{j}} - |ij|,$$

which can be verified for any combination of bars and indices.

A homogeneous, symmetric parameter of third level can be built up from lower-level proportions and parameters as follows:

$$p_{ijk} = p_{i}p_{j}p_{k} + p_{i}\,|jk| + p_{j}\,|ik| + p_{k}\,|ij| + |ijk|. \tag{3}$$

The introduction of a new mathematical quantity often, at first sight, seems arbitrary, and indeed it is, in the sense that any combination of old symbols can be used to define a new one. The utility of an innovation and its special meaning can only evolve slowly, in the course of its use in the derivation of theorems and their applications. It will be seen that most of the subsequent material in this presentation centers on symmetric parameters of level higher than that of the cross product.

The quantity *ǀijkǀ,* implicitly defined by eq. (3), quite evidently satisfies the criteria of homogeneity and symmetry; that it can be used together with lower-level class proportions to compute any third-level class proportion will now be shown.

If the indices *i, j,* and *k* are allowed to range through any three different numbers, barred or unbarred, it is easily shown that the quantities $|\bar{i}jk|$, $|\bar{i}\bar{j}k|$, and so forth thus defined are not independent but are related to one another and to $|ijk|$ as follows:

$$|\bar{i}jk| = |i\bar{j}k| = |ij\bar{k}| = -|ijk|, \qquad |\bar{i}\bar{j}k| = |\bar{i}j\bar{k}| = |i\bar{j}\bar{k}| = |ijk|, \qquad |\bar{i}\bar{j}\bar{k}| = -|ijk|.$$

As a mnemonic, note that if in $|ijk|$ an odd number of indices is barred, the symmetric parameter changes its sign; if an even number is barred, the value of $|ijk|$ remains unchanged. This is easily proved by showing that

$$|ijk| + |ij\bar{k}| = 0.$$

Once the third-level symmetric parameter is defined and computed by eq. (3), the computation of any desired third-level class proportion is possible. For example,

$$p_{ij\bar{k}} = p_{i}p_{j}p_{\bar{k}} - p_{i}\,|jk| - p_{j}\,|ik| + p_{\bar{k}}\,|ij| - |ijk|.$$

With the introduction of the third-level symmetric parameter, a three-attribute dichotomous system can now be completely summarized by *a new fundamental set* of eight data: $N$, $p_{1}$, $p_{2}$, $p_{3}$, $|12|$, $|13|$, $|23|$, $|123|$.
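That the new set is indeed fundamental can be illustrated numerically. The Python sketch below assumes the eight cell proportions of the introductory sex, vote, and SES example, and it assumes the usual form of eq. (3), in which a third-level proportion is the product of the three first-level proportions, plus each first-level proportion times the cross product of the other two attributes, plus the third-level parameter. The function names are illustrative. The sketch first extracts $|123|$ and then rebuilds every cell from the first-level proportions and the four symmetric parameters, using the rule that each barred index reverses the sign of a symmetric parameter.

```python
from fractions import Fraction as F
from itertools import product

# Cell proportions keyed by (sex, vote, SES); 1 marks the "+" category.
CELLS = {
    (1, 1, 1): F(20, 100), (1, 1, 0): F(15, 100),
    (1, 0, 1): F(5, 100),  (1, 0, 0): F(10, 100),
    (0, 1, 1): F(20, 100), (0, 1, 0): F(10, 100),
    (0, 0, 1): F(5, 100),  (0, 0, 0): F(15, 100),
}


def prop(cells, fixed):
    """Sum the cells matching `fixed`, e.g. {1: 1, 2: 1} for p_12."""
    return sum(v for key, v in cells.items()
               if all(key[a - 1] == s for a, s in fixed.items()))


def cross(cells, i, j):
    """Second-level cross product: p_ij - p_i * p_j."""
    return prop(cells, {i: 1, j: 1}) - prop(cells, {i: 1}) * prop(cells, {j: 1})


def third_level(cells):
    """Solve the assumed form of eq. (3) for |123|."""
    p1, p2, p3 = (prop(cells, {a: 1}) for a in (1, 2, 3))
    return (cells[(1, 1, 1)] - p1 * p2 * p3
            - p1 * cross(cells, 2, 3) - p2 * cross(cells, 1, 3) - p3 * cross(cells, 1, 2))


def rebuild(p, c12, c13, c23, c123):
    """Recover all eight cells from (p1, p2, p3) and the four symmetric parameters."""
    out = {}
    for signs in product((1, 0), repeat=3):
        q = [pa if s else 1 - pa for pa, s in zip(p, signs)]   # p_i or p_i-bar
        sg = [1 if s else -1 for s in signs]                   # each bar flips a sign
        out[signs] = (q[0] * q[1] * q[2]
                      + q[0] * sg[1] * sg[2] * c23
                      + q[1] * sg[0] * sg[2] * c13
                      + q[2] * sg[0] * sg[1] * c12
                      + sg[0] * sg[1] * sg[2] * c123)
    return out


p = tuple(prop(CELLS, {a: 1}) for a in (1, 2, 3))
params = (cross(CELLS, 1, 2), cross(CELLS, 1, 3), cross(CELLS, 2, 3), third_level(CELLS))
print("p1, p2, p3:", p)
print("|12|, |13|, |23|, |123|:", params)
print("cells recovered:", rebuild(p, *params) == CELLS)
```

Together with $N$, the seven numbers printed are sufficient to reproduce the whole cube, which is what makes them a fundamental set.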

Through the symmetric parameter, cross products of all levels can be connected. To keep a concrete example in mind, refer to Table 2, where the relation between sex and voting is reported for the whole sample and for two SES strata. It will be seen presently that the following formulas form the core of survey analysis.

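Symmetric parameters are substituted into the form

$$|ij;k| = \begin{vmatrix} p_{ijk} & p_{ik} \\ p_{jk} & p_{k} \end{vmatrix},$$

yielding

$$|ij;k| = \begin{vmatrix} p_{i}p_{j}p_{k} + p_{i}|jk| + p_{j}|ik| + p_{k}|ij| + |ijk| & p_{i}p_{k} + |ik| \\ p_{j}p_{k} + |jk| & p_{k} \end{vmatrix}. \tag{4}$$

In the last determinant the second row multiplied by $p_{i}$ is subtracted from the first row, and then the second column multiplied by $p_{j}$ is subtracted from the first column. This leaves the right side of (4) as

$$\begin{vmatrix} p_{k}|ij| + |ijk| & |ik| \\ |jk| & p_{k} \end{vmatrix}.$$

Thus,

$$|ij;k| = p_{k}\bigl(p_{k}\,|ij| + |ijk|\bigr) - |ik|\,|jk|. \tag{5}$$

By a similar computation,

$$|ij;\bar{k}| = p_{\bar{k}}\bigl(p_{\bar{k}}\,|ij| - |ijk|\bigr) - |ik|\,|jk|. \tag{6}$$

Eqs. (5) and (6) are, of course, related to each other by the general rule of barred indices expressed above.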

It is worthwhile to give intuitive meaning to eqs. (5) and (6). Suppose the relation between sex (*i*) and voting (*j*) is studied separately among people of high and low SES (*k*). Then $|ij;k|$ is the cross product for sex and voting as it prevails in the high SES group. Eq. (5) says that this stratified interrelation is essentially the cross product $|ij|$ as it prevails in the total population corrected for the relation that SES has with both voting and sex, given by the product $|ik|\,|jk|$. But an additional correction has to be considered: the “triple interaction” between all three attributes, $|ijk|$.

Subtracting eq. (6) from eq. (5), using the fact that $p_{k} + p_{\bar{k}} = 1$, and rearranging terms, one obtains

$$|ijk| = |ij;k| - |ij;\bar{k}| - (p_{k} - p_{\bar{k}})\,|ij|.$$

The symmetric parameter thus, in a sense, “measures” the *difference in the association* between *i* and *j* under the condition of $k$ as compared with the condition of $\bar{k}$. This is especially true if $p_{k} = p_{\bar{k}}$, that is, if the two conditions are represented equally often; and often it is possible to manipulate marginals to produce such an equal cut (either by choosing appropriate sample sizes or by dichotomizing at the median).

By dividing eq. (5) and eq. (6) by $p_{k}$ and $p_{\bar{k}}$, respectively, and adding the two, one obtains

$$\frac{|ij;k|}{p_{k}} + \frac{|ij;\bar{k}|}{p_{\bar{k}}} = |ij| - \frac{|ik|\,|jk|}{p_{k}\,p_{\bar{k}}}. \tag{7}$$

The formula on the left side is analogous to the traditional notion of partial correlation: a weighted average of the two stratified cross products. It may be called the partial association between *i* and *j* with *k* partialed out.

**Relation to measures of association.** It is very important to keep in mind the difference between a partial and a stratified association. It can happen, for instance, that a partial association is zero, while the two stratified ones have nonzero values, one positive and one negative.
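The distinction can be made concrete with a small constructed case. The cell proportions in the Python sketch below are hypothetical, chosen only so that the two strata show associations of equal size and opposite sign; the function names are likewise illustrative. The partial association is taken to be the sum of the two stratified cross products, each divided by the proportion of its stratum.

```python
from fractions import Fraction as F

# Hypothetical cell proportions keyed by (i, j, k); 1 = attribute present.
CELLS = {
    (1, 1, 1): F(15, 100), (1, 0, 1): F(10, 100),
    (0, 1, 1): F(10, 100), (0, 0, 1): F(15, 100),
    (1, 1, 0): F(10, 100), (1, 0, 0): F(15, 100),
    (0, 1, 0): F(15, 100), (0, 0, 0): F(10, 100),
}


def stratified_cross(cells, k_sign):
    """Return (|ij;k| computed on the total base, proportion of the stratum)."""
    p_k = sum(v for key, v in cells.items() if key[2] == k_sign)
    p_ik = sum(v for key, v in cells.items() if key[0] == 1 and key[2] == k_sign)
    p_jk = sum(v for key, v in cells.items() if key[1] == 1 and key[2] == k_sign)
    p_ijk = cells[(1, 1, k_sign)]
    return p_ijk * p_k - p_ik * p_jk, p_k


def partial_association(cells):
    """Sum of the stratified cross products, each divided by its stratum proportion."""
    s_pos, p_k = stratified_cross(cells, 1)
    s_neg, p_k_bar = stratified_cross(cells, 0)
    return s_pos / p_k + s_neg / p_k_bar


print("stratum k:    ", stratified_cross(CELLS, 1)[0])
print("stratum not-k:", stratified_cross(CELLS, 0)[0])
print("partial:      ", partial_association(CELLS))
```

A report of the partial association alone would therefore conceal the fact that the attributes are related in both strata.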

It was mentioned before that the cross products are not “measures of association.” They form, however, the core of most of the measures which have been proposed for fourfold tables (Goodman & Kruskal 1954). A typical case is the “per cent difference,” which may be exemplified by a well-known paradox *[see* Statistics, Descriptive, *article on* association]. Suppose that the three attributes are *i,* physical prowess; *j,* intelligence; and *k,* SES. Designate $f_{ij}$ as the difference between the percentage of intelligent people in the physically strong group and in the physically weak group. It can easily be verified that $f_{ij} = |ij|/p_{i}p_{\bar{i}}$. The corresponding relation for the subset of high SES people would be

$$f_{ij;k} = \frac{|ij;k|}{p_{ik}\,p_{\bar{i}k}}.$$

Substituting this expression into eq. (7), one obtains

$$f_{ij}\,p_{i}p_{\bar{i}} = f_{ij;k}\,\frac{p_{ik}\,p_{\bar{i}k}}{p_{k}} + f_{ij;\bar{k}}\,\frac{p_{i\bar{k}}\,p_{\bar{i}\bar{k}}}{p_{\bar{k}}} + f_{ki}\,f_{kj}\,p_{k}p_{\bar{k}}. \tag{8}$$

The f-coefficients are asymmetric. The first subscript corresponds to the item which forms the basis for the percenting; thus, $f_{kj}$ is the difference between the per cents of intelligent people in the high SES subset and in the low SES subset. Now eq. (8) permits the following interpretation: Suppose that the high SES subset has a higher percentage of intelligent people but a lower percentage of physically strong people than the low SES subset ($f_{kj} > 0$ and $f_{ki} < 0$). Then it can happen that *in each SES subset* the physically stronger people *are more often relatively intelligent, while in the total group* (high and low SES combined) *the opposite is true:*

$$f_{ij;k} > 0, \qquad f_{ij;\bar{k}} > 0, \qquad\text{but}\qquad f_{ij} < 0.$$
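A constructed case shows that the paradox is not merely a formal possibility. The figures in the Python sketch below are hypothetical and were chosen only to produce the reversal: half the population is of high SES, the strong are rare and the intelligent common in the high SES subset, and the reverse holds in the low SES subset. The helper `pct_diff` is an illustrative name; it computes a per cent difference directly from the cells rather than from cross products.

```python
from fractions import Fraction as F

# Hypothetical figures: for each SES level k, the share of strong people and the
# share of intelligent people among the strong (key 1) and the weak (key 0).
SPEC = {
    1: (F(1, 5), {1: F(9, 10), 0: F(8, 10)}),   # high SES
    0: (F(4, 5), {1: F(3, 10), 0: F(2, 10)}),   # low SES
}

# Cells keyed by (i, j, k) = (strong, intelligent, high SES); each SES level is half.
cells = {}
for k, (p_strong, p_smart) in SPEC.items():
    for i in (1, 0):
        for j in (1, 0):
            pi = p_strong if i else 1 - p_strong
            pj = p_smart[i] if j else 1 - p_smart[i]
            cells[(i, j, k)] = F(1, 2) * pi * pj


def pct_diff(cells, base, target, within=None):
    """Share having `target` among those with `base`, minus the share among those without.

    `base` and `target` are positions in the key (0 = i, 1 = j, 2 = k);
    `within=(position, sign)` restricts the comparison to one stratum.
    """
    def share(base_sign):
        rows = {key: v for key, v in cells.items()
                if key[base] == base_sign
                and (within is None or key[within[0]] == within[1])}
        return sum(v for key, v in rows.items() if key[target] == 1) / sum(rows.values())
    return share(1) - share(0)


print("f_ij within high SES:", pct_diff(cells, 0, 1, within=(2, 1)))
print("f_ij within low SES: ", pct_diff(cells, 0, 1, within=(2, 0)))
print("f_ij in total group: ", pct_diff(cells, 0, 1))
print("f_kj, f_ki:          ", pct_diff(cells, 2, 1), pct_diff(cells, 2, 0))
```

In both subsets the strong are ahead by ten percentage points, yet in the combined group they trail, because the strong are concentrated in the subset where intelligence is less frequent.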

Consider one more example of introducing a traditional measure into eq. (7). Suppose someone wants to use the well-known phi-coefficient, which is defined by

$$\phi_{ij} = \frac{|ij|}{\sqrt{p_{i}\,p_{\bar{i}}\,p_{j}\,p_{\bar{j}}}}.$$

With the use of the obvious symbol, the phi-coefficient applied to a stratified fourfold table would be

$$\phi_{ij;k} = \frac{|ij;k|}{\sqrt{p_{ik}\,p_{\bar{i}k}\,p_{jk}\,p_{\bar{j}k}}}.$$

By introducing the last two expressions into eq. (7), one would obtain a relation between stratified and unstratified phi-coefficients.

This is a good place to say a word about the relation between the Yule tradition and the tenor of this presentation. In a rather late edition Yule included one page on “relations between partial associations.” He attached little importance to that approach: “In practice the existence of these relations is of little or no value. They are so complex that lengthy algebraic manipulation is necessary to express those which are not known in terms of those which are. It is usually better to evaluate the class frequencies and calculate the desired results directly from them” (Yule & Kendall 1911, p. 59 in the 1940 edition). The few computations Yule presented were indeed rather clumsy. It is easy to see what brought about improvement: the use of determinants, an index notation, and, most of all, the symmetric parameters. Still, it has to be acknowledged that Yule drew attention to the approach which was later developed. Incidentally, Yule reported the theorem on the weighted sum of stratified cross products (eq. 7). It appeared in earlier editions only as an exercise, but in later editions it was called “the one result which has important theoretical consequences.” The consequences he had in mind were studies of spurious factors in causal analysis, which he discussed under the title “illusory associations.” It will presently be seen what he had in mind.

### Modes of explanation in survey analysis

Eq. (7) is undoubtedly the most important result for survey analysis. To bring out its implications, it is preferable to change the notation. Assume two original attributes, x and y. Their content makes it obvious that x precedes y in time sequence. If *ǀxyǀ* > 0, then this relation requires explanation. The explanation is sought by the introduction of a test factor, t, as a stratification variable. The possible combinations form a dichotomous cube, and eq. (7) now takes the form

This *elaboration* leads to two extreme forms, which are, as will be seen, of major interest. In the first case, the two stratified relations vanish; then eq. (9) reduces to

(10)

In the second, the test factor, *t,* is unrelated to x (so that *ǀxtǀ* = 0), and then

which also results if *ǀtyǀ* = 0. This form will turn out to be of interest only if one of the two stratified relations is markedly stronger than the other. Call eq. (11) the S form (emphasis on the stratified cross products) and eq. (10) the M form (emphasis on the “marginals”).
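For orientation, the three forms can be written out explicitly. What follows is a sketch under stated assumptions: $p_t$ and $p_{\bar t}$ are taken to be the proportions of cases with and without the test factor, $|xy| = p_{xy} - p_x p_y$ is the unstratified cross product, and $|xy;t|$ is the determinant of the joint proportions within the stratum where *t* is present. Under those definitions the decomposition is an algebraic identity; with a differently normalized stratified cross product the weights would differ.

```latex
% Eq. (9): general elaboration formula (two stratified terms, then the marginal term)
|xy| = \frac{|xy;t|}{p_t} + \frac{|xy;\bar{t}|}{p_{\bar{t}}}
       + \frac{|xt|\,|ty|}{p_t\,p_{\bar{t}}}

% Eq. (10), the M form: both stratified cross products vanish
|xy| = \frac{|xt|\,|ty|}{p_t\,p_{\bar{t}}}

% Eq. (11), the S form: |xt| = 0 or |ty| = 0
|xy| = \frac{|xy;t|}{p_t} + \frac{|xy;\bar{t}|}{p_{\bar{t}}}
```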

To this formal distinction a substantive one, the time order of the three attributes, must be added. If x is prior to y in time, then *t* either can be located *between x* and *y* in time or can *precede* both. In the former case one speaks of an *intervening* test variable, in the latter of an *antecedent* one. Thus, there are four major possibilities. It is, of course, possible that *t* is subsequent in time to both x and *y.* But this is a case which very rarely occurs in survey analysis and is therefore omitted in this presentation.

Given the two forms of eq. (9) and the two relevant time positions of *t,* essentially four modes arise with two original variables and one test variable, as shown in Table 5. If a relation between two variables is analyzed in the light of a third, either with real data or theoretically, only these four modes or combinations thereof are sought, irrespective of whether they are called interpretation, understanding, theory, or anything else.

Table 5

Position of *t* | Statistical form S | Statistical form M
---|---|---
Antecedent | SA | MA
Intervening | SI | MI

Before this whole scheme is applied to concrete examples, the restriction put on the paradigm of Table 5 should be re-emphasized. Only one test variable is assumed. In actual survey practice it is highly unlikely that in eq. (9) the stratified cross products would disappear after one step. Instead of reaching eq. (10), one is likely to notice just a lowering of the first two terms on the right side of eq. (9). As a result, additional test variables must be introduced. But these further steps do not introduce new ideas. One stops the analysis at the point where one is satisfied with a combination of the four modes summarized in Table 5.

The notion of sequence in time also can be more complicated than appears in this schematic presentation. Sometimes there is not enough information available to establish a time sequence. Thus, when a positive association between owning a product and viewing a television program that advertises it is found, it is not necessarily known whether ownership preceded viewing or whether the time sequence is the other way around. Additional information is then needed, if one is to proceed with the present kind of analysis. In other cases a time sequence might be of no interest, because the problem is of a type for which latent structure analysis or, in the case of quantitative variables, factor analysis is more appropriate *[see* Latent Structure; Factor Analysis]. The problem at hand is to “explain” an empirically found association between x and *y.* But “explain” is a vague term. The procedures which lead to the main paradigm show that there exist four basic modes of explanation, the combination of which forms the basis for survey analysis. It is also reasonable to relate each type to a terminology which might most frequently be found in pertinent literature. But although the basic types or modes of analysis are precisely defined, the allocation of a name to each of them is somewhat arbitrary and could be changed without affecting the main distinctions. Now each of the four types in the paradigm will be taken up and exemplified.

**Specification.** In cases of the type SA, the test variable, *t,* is usually called a condition. General examples easily come to mind, although in practice they are fairly rare and are a great joy to the investigator when they are found. For example: the propaganda effect of a film is greater among less-educated than among highly educated people; the depression of the 1930s had worse effects on authoritarian families than on other types.

Three general remarks can be made about this type of finding or reasoning: First, it corresponds to the usual stimulus-disposition-response sequence, with *x* as the stimulus and the antecedent *t* as the disposition. Second, the whole type might best be called one of *specification.* One of the two stratified associations will necessarily be larger than the original relationship. The investigator specifies, so to speak, the circumstances under which the original relationship holds true more strongly. Third, usually one will go on from the specification and ask why the relationship is stronger on one side of the test dichotomy than it is in the total group. This might then lead to one of the other types of analysis. Durkheim (1897) used type SA in discussing why relatively fewer married people commit suicide than unmarried people. He introduced as a test variable “a nervous tendency to suicide, which the family, by its influence, neutralizes or keeps from development.” This is type SA exactly. It does not appear to be a convincing explanation, because the introduction of the hypothetical test variable (tendency to suicide) sounds rather tautological. A more important question is why the family keeps this tendency from development, which leads to type *MI,* as will be seen later.

**Contingency.** The type *SI* is also easily exemplified. In a study of the relationship between job success *(y)* and whether children did or did not go to progressive schools *(x),* it is found that if the progressively educated children come into an authoritarian situation *(t),* they do less well in their work than others; on the other hand, if they come into a democratic atmosphere, their job success is greater.

The relation between type of education and job success is elaborated by an intervening test factor, the work atmosphere. This is a “contingency.” In many prediction studies the predicted value depends upon subsequent circumstances that are not related to the predictor. An example is the relation between occupational status and participation in community activities. White-collar people participate more if they are dissatisfied with their jobs, whereas manual workers participate more if they are satisfied.

**Correcting spurious relationships.** Type MA is used mainly in rectifying what is usually called a *spurious relationship.* It has been found that the more fire engines that come to a fire *(x),* the larger the damage *(y).* Because fire engines are used to reduce damage, the relationship is startling and requires elaboration. As a test factor, the size of the fire *(t)* is introduced. It might then be found that fire engines are not very successful; in large, as well as small, fires the stratified relation between x and *y* vanishes. But at least the original positive relation now appears as the product of two marginal relations: the larger the fire, the more engines called out, on the one hand, and the greater the damage, on the other hand.
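The fire-engine case can be checked numerically. The sketch below uses invented probabilities (none of the figures come from the article) and builds a joint distribution in which the number of engines *(x)* and the damage *(y)* are independent once the size of the fire *(t)* is fixed. The names `prob`, `stratified`, and `marginal_term` are assumptions, and the stratified cross product is taken as the determinant of joint proportions within a stratum.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical probabilities, invented for illustration.
p_large = F(1, 2)                            # P(t = 1): the fire is large
p_many_engines = {1: F(8, 10), 0: F(2, 10)}  # P(x = 1 | t)
p_heavy_damage = {1: F(9, 10), 0: F(1, 10)}  # P(y = 1 | t)


def bernoulli(p, value):
    return p if value == 1 else 1 - p


# Joint proportions p[(x, y, t)], built so that x and y are independent given t.
p = {
    (x, y, t): bernoulli(p_large, t)
    * bernoulli(p_many_engines[t], x)
    * bernoulli(p_heavy_damage[t], y)
    for x, y, t in product((0, 1), repeat=3)
}


def prob(x=None, y=None, t=None):
    """Marginal proportion of the cells matching the given attribute values."""
    return sum(
        value
        for (cx, cy, ct), value in p.items()
        if x in (None, cx) and y in (None, cy) and t in (None, ct)
    )


def stratified(t):
    """Cross product of x and y inside one stratum of t (joint proportions)."""
    return p[(1, 1, t)] * p[(0, 0, t)] - p[(1, 0, t)] * p[(0, 1, t)]


xy = prob(x=1, y=1) - prob(x=1) * prob(y=1)
xt = prob(x=1, t=1) - prob(x=1) * prob(t=1)
ty = prob(y=1, t=1) - prob(y=1) * prob(t=1)
marginal_term = xt * ty / (prob(t=1) * prob(t=0))

print(stratified(1), stratified(0))  # 0 0
print(xy, marginal_term)             # 3/25 3/25
```

Both stratified cross products are zero, and the whole of the original association is reproduced by the product of the two marginal relations, which is the pattern the text calls type MA.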

**Interpretation.** Type MI corresponds to what is usually called *interpretation.* The difference between the discovery of a spurious relationship and interpretation in this context is related to the time sequence between x and *t.* In an interpretation, *t* is an intervening variable situated between x and *y* in the time sequence.

Examples of type MI are numerous. Living in a rural community rather than a city *(x)* is related to a lower suicide rate *(y).* The greater intimacy of rural life *(t)* is introduced as an intervening variable. If there were a good test of cohesion, it would undoubtedly be found that a community’s being a rural rather than an urban one *(x)* is positively correlated with its degree of cohesion *(t)* and that greater cohesion *(t)* is correlated with lower suicide rates *(y).* But obviously some rural communities will have less cohesion than some urban communities. If cohesion is kept constant as a statistical device, then the partial relationship between the rural-urban variable and the suicide rate would have to disappear.

**Differences between the modes.** It might be useful to illustrate the difference between type *MA* and type *MI* in one more example. During the war married women working in factories had a higher rate of absence from work than single women. There are a number of possible elaborations, including the following:

The married women have more responsibilities at home. This is an intervening variable. If it is introduced and the two stratified relations between marital status and absenteeism disappear, the elaboration is of type MI. The relation is interpreted by showing what intervening variable connects the original two variables.

The married women are prone to physical infirmity, as crudely measured by age. The older women are more likely to be married and to have less physical strength, both of these as a result of their age. Age is an antecedent variable. If it turns out that when age is kept constant the relation between marital status and absenteeism disappears, a spurious effect of type MA is the explanation. Older people are more likely to be married and more likely to need rest at home.

The latter case suggests, again, an important point. After the original relationship is explained, attention might shift to *ǀtyǀ,* the fact that older people show a higher absentee rate. This, in turn, might lead to new elaborations: Is it really the case that older women have less physical resistance, be they married or single? Or is it that older women were born in a time when work was not as yet important for women and therefore have a lower work morale? In other words, after one elaboration is completed, the conscientious investigator will immediately turn to a new one; the basic analytical processes, however, will always be the same.

**Causal relations.** One final point can be cleared up, at least to a degree, by this analysis. It suggests a clear-cut definition of the *causal* relation between two attributes. If there is a relation between x and *y,* and if for every conceivable *antecedent* test factor the partial relations between x and *y* do *not* disappear, then the original relation should be called a causal one. It makes no difference here whether the necessary operations are actually carried through or made plausible by general reasoning. In a controlled experiment there may be two matched groups: the experimental exposure corresponds to the variable *x,* the observed effect to *y.* Randomization of treatments between groups makes sure that *ǀxtǀ* = 0 for any antecedent *t.* Then if *ǀxyǀ* ≠ 0 and there have been no slip-ups in the experimental procedure, the preceding analysis always guarantees that there is a causal relation between exposure, *x,* and effect, *y.* There are other concepts of causal relations, differing from the one suggested here *[see* Causation].

This has special bearing on the following kinds of discussion. It is found that the crime rate is higher in densely populated areas than in sparsely populated areas. Some authors state that this could not be considered a true causal relation, but such a remark is often intended in two very different ways. Assume an intervening variable—for instance, the increased irritation which is the result of crowded conditions. Such an interpretation does not detract from the causal character of the original relationship. On the other hand, the argument might go this way: crowded areas have cheaper rents and therefore attract poorer, partly demoralized people, who are also more likely to *be* criminals to begin with. Here the character of the inhabitants is antecedent to the characteristics of the area. In this case the original relationship is indeed explained as a spurious one and should not be called causal.

**Variables not ordered in time.** Explanation consists of the formal aspect of elaboration and some substantive ordering of variables. Ordering by time sequence has been the focus here, but not all variables can be ordered this way. One can distinguish orders of complexity, for example, variables characterizing persons, collectives, and sets of collectives. Other ordering principles could be introduced, for instance, degree of generality, exemplified by the instance of a specific opinion, a broader attitude, and a basic value system. What is needed is to combine the formalism of elaboration with a classification of variables according to different ordering principles. This covers a great part of what needs to be known about the logic of explanation and inference in contemporary survey analysis.

### Higher-level parameters

This presentation has been restricted to the case of three attributes, but symmetric parameters can be developed for any level of stratification. Their structure becomes obvious if the parameter of fourth level is spelled out as an example:

It is possible for lower-level symmetric parameters to vanish while some higher-level ones do not, and the other way around. The addition of ǀ1234ǀ to the fundamental set would permit the analysis of questions like these: We already know that economic status affects the relation between sex and voting; is this contextual effect greater for whites or for Negroes? If an antecedent attribute (3) only lowers ǀ12; 3ǀ, would a fourth intervening attribute explain the residual relation by making ǀ12; 34ǀ = 0? The theorems needed to cope with a larger number of attributes in a survey analysis are often interesting by themselves but are too complex for this summary.

A substantive procedure can always be put into a variety of mathematical forms. Thus, for example, attributes can be treated as random variables, *x_i*, that can assume the values zero and one. Then the symmetric parameters play the role of covariances, and the difference between two stratified cross products corresponds to what is called interaction. Such translations, however, obscure rather than clarify the points essential for survey analysis. Starting from the notion of spurious correlation, Simon (1954) has translated the dichotomous cube into a system of linear equations that also permits the formalization of the distinction between the *MA* and the *MI* types. In such terminology, however, specifications (SA and SI) cannot be handled. In spite of this restriction, Blalock (1964) has productively applied this variation of survey analysis to problems for which the symmetric parameters of higher than second level may be negligible.
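The correspondence between cross products and covariances is easy to verify for a single fourfold table. The sketch below codes each attribute as 0 or 1 and compares the covariance with the determinant of the table of proportions; the counts and the names `mean`, `covariance`, and `cross_product` are invented for illustration.

```python
from fractions import Fraction as F

# Hypothetical fourfold table of counts, invented for illustration:
# keys are (x, y) with 1 = attribute present, 0 = absent.
counts = {(1, 1): 30, (1, 0): 20, (0, 1): 20, (0, 0): 30}
n = sum(counts.values())


def mean(f):
    """Average of f(x, y) over all cases in the table."""
    return sum(F(c, n) * f(x, y) for (x, y), c in counts.items())


# Covariance of the two zero-one variables.
covariance = mean(lambda x, y: x * y) - mean(lambda x, y: x) * mean(lambda x, y: y)

# Cross product (determinant) of the table of proportions.
p = {cell: F(c, n) for cell, c in counts.items()}
cross_product = p[(1, 1)] * p[(0, 0)] - p[(1, 0)] * p[(0, 1)]

print(covariance, cross_product)  # 1/20 1/20
```

The two quantities agree because the proportions in a fourfold table sum to one, so the determinant reduces to the joint proportion minus the product of the two marginal proportions.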

Polytomous systems have also been analyzed from a purely statistical point of view. A fourfold table is the simplest case of a contingency table that cross-tabulates two polytomous variates against each other. Thus, some of its statistical properties fall under the general heading of nonparametric statistics (Lindley 1964). An additional possibility is to start out with another way to characterize a fourfold table. Instead of a difference that is identical with the cross product, one can build formulas on the so-called odds ratio (Goodman 1965). This leads to interesting comparisons between two stratified tables but has not been generalized to the more complex systems that come up in actual survey analysis. Eq. (12) forms the basis of this extension (Lazarsfeld 1961).
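The two ways of characterizing a fourfold table can be set side by side. For cells a, b, c, d the cross product is the difference ad − bc, while the odds ratio is the quotient ad / bc; the first is positive exactly when the second exceeds one. The tables and function names in this sketch are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def cross_product(a, b, c, d):
    """Difference form: a*d - b*c for the table [[a, b], [c, d]]."""
    return a * d - b * c


def odds_ratio(a, b, c, d):
    """Ratio form: (a*d) / (b*c); undefined when b or c is zero."""
    if b * c == 0:
        raise ZeroDivisionError("odds ratio is undefined when b*c is zero")
    return F(a * d, b * c)


# Hypothetical tables: positive, absent, and negative association.
tables = {
    "positive": (30, 20, 20, 30),
    "none": (20, 30, 20, 30),
    "negative": (10, 40, 40, 10),
}

for label, cells in tables.items():
    print(label, cross_product(*cells), odds_ratio(*cells))
```

Unlike the cross product, the odds ratio is unchanged when a row or a column of the table is multiplied by a constant, which is one reason it lends itself to comparing two stratified tables with different marginals.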

Paul F. Lazarsfeld

*[See also* Counted Data.]

## BIBLIOGRAPHY

Blalock, Hubert M. Jr. 1964 *Causal Inferences in Nonexperimental Research.* Chapel Hill: Univ. of North Carolina Press.

Boudon, Raymond 1965 Méthodes d’analyse causale. *Revue française de sociologie* 6:24-43.

Capecchi, Vittorio 1967 Linear Causal Models and Typologies. *Quality and Quantity*—*European Journal of Methodology* 1:116-152.

Durkheim, Émile (1897) 1951 *Suicide: A Study in Sociology.* Glencoe, Ill.: Free Press. → First published in French.

Goodman, Leo A. 1965 On the Multivariate Analysis of Three Dichotomous Variables. *American Journal of Sociology* 71:290-301.

Goodman, Leo A.; and Kruskal, William H. 1954 Measures of Association for Cross Classifications. Part 1. *Journal of the American Statistical Association* 49:732-764. → Parts 2 and 3 appear in volumes 54 and 58 of the *Journal of the American Statistical Association.*

Hyman, Herbert H. 1955 *Survey Design and Analysis: Principles, Cases and Procedures.* Glencoe, Ill.: Free Press. → See especially Chapter 7.

Lazarsfeld, Paul F. 1961 The Algebra of Dichotomous Systems. Pages 111-157 in Herbert Solomon (editor), *Item Analysis and Prediction.* Stanford Univ. Press.

Lindley, Dennis V. 1964 The Bayesian Analysis of Contingency Tables. *Annals of Mathematical Statistics* 35:1622-1643.

Nowak, Stefan 1967 Causal Interpretation of Statistical Relationships in Social Research. *Quality and Quantity*—*European Journal of Methodology* 1:53-89.

Selvin, Hanan C. 1958 Durkheim’s *Suicide* and Problems of Empirical Research. *American Journal of Sociology* 63:607-619.

Simon, Herbert A. (1954) 1957 Spurious Correlation: A Causal Interpretation. Pages 37-49 in Herbert A. Simon, *Models of Man: Social and Rational.* New York: Wiley. → First published in Volume 49 of the *Journal of the American Statistical Association.*

Yule, G. Udny; and Kendall, Maurice G. (1911) 1950 *An Introduction to the Theory of Statistics.* 14th ed., rev. & enl. London: Griffin. → See especially chapters 1-4, dealing with attribute data. M. G. Kendall has been a joint author since the eleventh edition (1937).

## III. APPLICATIONS IN ECONOMICS

The development of economic theory and analysis requires an understanding of the forces that shape the decisions of firms, households, and governments. These forces can sometimes be deduced from general principles, such as profit maximization, or inferred from observations of economic behavior under different or changing prices, income, interest rates, etc. But the range of analysis, interpretation, and understanding is vastly enlarged by surveys, which elicit information directly from decision makers. Personal interviews can ascertain far more than simply the facts of the decision maker’s situation and behavior; they can also uncover the amount of information available to him, his insight and understanding of the situation, his purposes and expectations, and the various constraints on his behavior. It is important to know, for example, not only what interest rates people are paying but also whether they know what the rate is, whether they see any alternative to borrowing, and what purposes their borrowing serves (in addition to the obvious one).

Interviews frequently throw light in a negative way on forces affecting decisions, by showing that substantial numbers of people do not have sufficient information or insight (understanding of the meaning of the information possessed) to be affected by particular factors. For example, interviews commonly show that many people believe—mistakenly—that during an inflationary period one should buy *more* bonds (not shift to stock) or take out *more* life insurance. In some cases individuals may, however, have adequate substitutes for detailed information and understanding, as when they know, without being able to compute an interest rate, that banks provide cheaper credit.

More important, detailed interviews frequently reveal that decision makers have purposes other than profit maximization and expense minimization, which are practically the sole objectives assumed in many theories of economic behavior. A man whose main desire is to be an excellent doctor or to run a successful business may be less concerned with avoiding income taxes than with getting his job done. Indeed, surveys offer the most direct and promising approach to interdisciplinary research, in which the variables to be explained or predicted pertain to behavior of various sorts and the explanatory variables incorporate theoretical constructs from economics, sociology, psychology, anthropology, etc.

Surveys vary from simple collections of limited facts—the call reports of member banks to the Federal Reserve System, the survey of manufacturers, the decennial census, etc.—to detailed personal interviews of heads of families or business firms. Survey data are used to estimate simple over-all statistics—such as the proportion of households with two cars, inside plumbing, their own home, or debt payments of more than 20 per cent of their income—and to study complex relations based on differences between subgroups or between representative individuals in society. It is the search for functional relations that brings such an endeavor within the sphere of science. The goal is not to explain differences in individual behavior, many of which cancel out even in small-group averages, but to find the forces that affect large parts of the population in the same way. Data that were originally collected for other purposes can often be used to analyze subgroup differences, but survey data have the advantage of being tailor-made for the analytical purposes at hand. Repeated surveys can provide data on changes over time in the characteristics and behavior of a population or of subgroups within a population. Panel studies are particularly useful in the study of subgroups over time, since they place little reliance on the memories of individuals.

**History and development.** The few surveys of economic behavior or conditions of families conducted before World War II used nonprobability samples and were largely descriptive, rather than analytical. They were concerned mostly with the plight of the workingman or the composition of the budget of a typical workingman’s family (see Williams & Zimmerman 1935).

Since World War II there has been rapid improvement in techniques for the sampling of human populations and for eliciting information from them. At the same time, economists have focused increasingly on policy issues and on explanation and prediction of behavior, and this has led to an accelerated demand for information. Surveys have been expanded and improved to estimate the growth of population, the extent of unemployment, patterns of migration, the quality of housing, the extent of poverty, and the plight of special groups, such as the aged. Furthermore, the emphasis in survey analysis has shifted from measurement to explanation of behavior. Survey data can be and are used to test hypotheses regarding behavior, although this has commonly meant using data from a survey designed with other purposes in mind. When important problems are involved (such as the effects of medical insurance on the utilization of medical services, of taxes on incentives, or of local taxes on business location), special studies have been designed.

**Major sources of survey data.** Since surveys employing probability samples are relatively new and expensive, they have been conducted on a large scale in only a few countries, primarily the economically advanced ones. In the United States the major surveys have been carried on by the Bureau of Labor Statistics, the Bureau of the Census, and the Survey Research Center of the University of Michigan.

The surveys by the Bureau of Labor Statistics include the 1935-1936 expenditure survey (conducted jointly with the United States Department of Agriculture), the 1941-1942 survey of family spending and saving in wartime, the 1950 expenditure survey, and the 1960-1961 expenditure survey. The primary purpose of these surveys was to devise weights for a cost-of-living index, but other uses have been found for the data. The surveys increased over the years in quality, coverage, and availability for secondary analyses. The 1950 study appeared in 18 volumes and also led to a two-volume conference work (Conference on Consumption and Savings 1960).

The Bureau of the Census, which pioneered in probability sampling, has been responsible not only for the samples connected with the decennial censuses of population and housing but also for current population surveys (quarterly, with rotating panel samples for measuring population, labor force, and unemployment), housing surveys, special surveys of the aged, special studies embedded within the current population surveys (including such things as buying plans, home additions and repairs, ownership of major durables, migration, and income), and surveys of manufacturers (replacing the census of manufacturers).

The Survey Research Center of the University of Michigan has conducted annual surveys of consumer finances since 1947, initially with the support and cooperation of the Federal Reserve Board and since 1960 with a variety of sponsors. The results appeared in the *Federal Reserve Bulletin* through 1959; since then they have appeared in annual volumes issued by the Survey Research Center. These volumes also provide summaries of surveys of consumer attitudes, expectations, and major past purchases, plus such special topics as attitudes toward a tax cut or toward federal expenditure programs. In addition, the Survey Research Center has conducted studies on a variety of special economic topics: the utilization of medical services in Michigan, attitudes toward innovations in household goods, the determinants of family income, the effects of private pension plans on saving, the economic impact of auto accidents in Michigan, and attitudes toward public expenditure programs—to cite just a few. (A complete bibliography can be obtained from the Survey Research Center.)

National studies on economic topics have also been conducted in the United States by the Market Surveys Section of the Department of Agriculture’s Agricultural Marketing Service, by the National Opinion Research Center of the University of Chicago, and by National Analysts and other private survey organizations. Survey centers with a state or local focus have been established at Columbia University (Bureau of Applied Social Research), the University of Wisconsin, the University of California (Berkeley), and the University of Illinois.

Major expenditure or saving surveys have been conducted, although not regularly, in Great Britain, Puerto Rico, Ceylon, India, Japan, Sweden, Israel, and Mexico. In addition, there have been numerous small-scale studies of restricted areas.

Surveys on economic topics are carried on continually by the Government Social Survey in Great Britain and the Danish National Institute of Social Research and, less regularly, by the Institut National de la Statistique et des Etudes Economiques (INSEE) and the newly formed Centre de Recherche Économique sur l’Épargne in France, the Swedish Central Bureau of Statistics, the Polish Radio and Television Service’s Public Opinion Research Center, the Forschungsstelle für Empirische Sozialökonomie in Germany (Cologne), the National Council of Applied Economic Research in India (New Delhi), and the Indian government’s National Sample Survey. New survey centers are being established in Peru and at the Royal College in Kenya (Nairobi). Centers exist in Chile and Argentina as well, but these focus mainly on sociological or political questions. Of course, single surveys are sometimes done without the establishment of a permanent organization.

**Sampling methods.** Scientific surveys require probability samples of the population being studied. Since complete lists of people or families are rare, some other sampling frame is required. The one generally adopted is geographic. Everyone is assumed to have some place of residence. A probability sample of people can be developed by sampling the map and taking those who live in the sampled areas. Lists of people, from various sources, have also been used, of course, even rice-ration coupon book lists (in Ceylon); and many special studies use samples of special lists. *[See* Sample Surveys.]

A simple probability sample would, however, be expensive and relatively inefficient. Hence, most samples are multistage clustered stratified samples, frequently with interlaced controls. The primary sampling units, roughly counties, are ordered on the basis of a number of criteria (such as region, rate of population growth, and per cent of labor force employed in manufacturing) and are sampled with a population interval that assures proper stratification. Interviews are clustered (in selected counties, in selected areas within counties, and in selected clusters of addresses within those areas) to minimize travel costs or, rather, to maximize the information per dollar spent on traveling and interviewing. Suburbs of the largest metropolitan areas are pooled, stratified by type, and sampled, so that the particular suburban localities of one area in a given sample may represent not all the suburbs of that area, but rather suburbs of that type in several areas.
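One common way to sample an ordered list "with a population interval" is systematic selection with probability proportional to size. The sketch below assumes that this is the kind of procedure meant; the function name `systematic_pps`, the unit names, and the population figures are all hypothetical. Because the units are ordered by the stratifying criteria before selection, the evenly spaced selection points spread the sample across the ordering.

```python
from bisect import bisect_right
from itertools import accumulate


def systematic_pps(units, n_select, start):
    """Select units at a fixed population interval along an ordered list.

    units    -- list of (name, population) pairs, already ordered by the
                stratifying criteria (region, growth rate, and so on)
    n_select -- number of selection points
    start    -- starting point, 0 <= start < interval (random in practice)
    """
    names = [name for name, _ in units]
    cumulative = list(accumulate(population for _, population in units))
    interval = cumulative[-1] / n_select
    if not 0 <= start < interval:
        raise ValueError("start must lie in [0, interval)")
    points = [start + k * interval for k in range(n_select)]
    # Each point falls in exactly one unit's cumulative population range.
    return [names[bisect_right(cumulative, point)] for point in points]


# Hypothetical primary sampling units with invented populations.
counties = [("A", 100), ("B", 300), ("C", 200), ("D", 400)]
print(systematic_pps(counties, n_select=2, start=250))  # ['B', 'D']
```

A unit whose population exceeds the interval can be hit by more than one selection point; in practice such units are usually handled separately, a refinement this sketch leaves out.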

In general, however, a subsection of a probability sample, selected without using the sampling units (geographic location) as a criterion, is a probability sample of the subgroup it deals with, for instance, a given age or occupation group. In most recent samples, even the regional subsamples are probability samples.

Sometimes, however, a probability sample would have to be unduly large to provide sufficient precision in the information about some subgroup, such as the aged, or those with hospital experience, or women of childbearing age. In this case, unless complete lists are available, some sort of screening is generally required. Either the group is selected from previous interview studies and revisited or a large number of addresses are visited, with interviews being taken only if the right kind of person lives at the address.

A probability sample can be inefficient or even misleading if clustering is overdone, and can be biased if serious attempts are not made to secure information from a large proportion of the sample selected. Those difficult to interview often vary widely—the very young, the very old, the very rich, families in which all members work, individuals suffering financial reverses, etc.

The precision of a sample is almost independent of the size of the population it represents and from which it is selected. Hence, it does not take a substantially larger sample to represent the whole United States than to represent California or Chicago. Once a national sample is developed and interviewers are hired and trained, there are substantial economies in doing a continuing series of national sample surveys; only the last stage of sampling needs to be repeated (selection of addresses). Furthermore, there are no problems in generalizing the results to a broad population and in relating survey data to national aggregates from other sources.
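The near independence of precision from population size can be seen in the textbook standard error of a proportion under simple random sampling without replacement, where population size enters only through the finite population correction. The sketch below rests on that simple-random-sampling assumption (the clustered designs described in this article have larger errors), and the sample size and population figures are illustrative round numbers, not data from the article.

```python
import math


def standard_error(p, n, population):
    """Standard error of a sample proportion under simple random sampling
    without replacement, including the finite population correction."""
    fpc = (population - n) / (population - 1)
    return math.sqrt(p * (1 - p) / n * fpc)


n = 2000                                               # hypothetical sample size
se_nation = standard_error(0.5, n, 200_000_000)        # a national population
se_state = standard_error(0.5, n, 20_000_000)          # a single large state
se_infinite = math.sqrt(0.5 * 0.5 / n)                 # limit for a very large population

print(se_nation, se_state, se_infinite)
```

All three values are close to 0.0112, so a sample of the same size gives almost the same precision for the state as for the nation.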

On the other hand, special samples sometimes become themselves an analytical device, since they isolate those at extremes of a behavioral continuum or uncover what is behind the decisions that have had the greatest economic effect.

**Methods of eliciting information.** The reliability of information is a far more important problem, and more in need of further work, than sampling techniques. It is not always true that facts are more reliably reported than attitudes. Memories are fallible, and people misunderstand questions. A growing body of literature and unpublished studies indicates that it is difficult to make general statements regarding reliability, which seems to depend very much on the content of the questions, as well as on the interviewer’s training and procedures. The most dramatic illustration of the nature of the problem arose when the U.S. Census Bureau changed the sample it used for the current population surveys. The first time the new sample was used, estimates of unemployment jumped markedly. It turned out that the cause of the jump was not the sample change but the better training of the new interviewers, who followed their instructions more carefully, asked all the questions, and found more people in the labor force, hence more unemployed (Hansen 1954; U.S. Department of Commerce 1954).

It has proved difficult to get people to look up records, and their memories for financial data are often poor. Hence, when *changes* in financial magnitudes are required, the common practice is to rely on reinterviews. The difficulties with reinterviews are that people move and that the probability of securing two interviews even for those who do not move is somewhat lower than the chance of a single interview. Three different panels have been embedded into the surveys of consumer finances of the Survey Research Center, which also carried on a special panel study interviewing people five times over a three-year period. The Center is currently reinterviewing a panel several times in a study of the effects of the tax cut of 1964-1965 in the United States. The Bureau of the Census’ current population surveys use a rotating panel to increase precision in estimating changes in the labor force, and the bureau conducted a reinterview study for the Federal Reserve Board in 1963-1964 to secure data on assets and savings.

For collecting information on attitudes, expectations, buying plans, and motives, a technique known as fixed-question-open-answer has been used, in which the verbatim replies to a uniform question are analyzed and converted into categories at a central office. This is done under close supervision and with intensive reworking, to assure uniformity, which is crucial when comparisons between surveys are needed. On the other hand, elaborate scaling techniques have not commonly been used in economic surveys, partly because these surveys usually have not contained numerous questions that attempt to measure the same attitude or purpose.

While prime reliance has been placed on personal interviews, some use has been made of telephone and mail questionnaires. Telephone inquiries are particularly useful for brief follow-up questions or for locating scattered people sampled from lists. Answers to open-ended questions asked by telephone tend to be briefer and more noncommittal than those given in person, and there are more serious limits to what can be asked. Response rates to mail questionnaires have varied from very low to very high, depending on the content, the sample, the amount of work required, and (to a small extent) the procedures. Answers again tend to be more noncommittal, and there is the added difficulty of knowing who actually replied.

**Treatment of missing information.** However unbiased the original sample may be, the survey results can be biased because of nonresponse, that is, cases where there is no interview at all. There will also be missing items of information in otherwise satisfactory interviews. Some of the resulting bias can be reduced by weighting subgroups for nonresponse and by assigning values to some of the more crucial bits of missing information on the basis of other information known about the respondent from previous surveys. But the problem cannot be avoided, and any estimates from a sample imply some assumption about the missing pieces. If it is known that most of the nonresponse came, say, from families with two wage earners and no children, then similar families who were interviewed can be “weighted up” to represent the missing ones. The more information available about the nonresponse cases, the more likely it is that such weighting can be designed to reduce bias. Sometimes interviews are simply duplicated to make up for missing interviews, but this is a relatively crude procedure. It is better to weight up fifty similar cases by 2 per cent than to duplicate one case. [See Errors, *article on* Nonsampling Errors.]
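
A minimal sketch of "weighting up" for nonresponse follows. It assumes that each selected case can be placed in a class (here, family type) whether or not it was interviewed; the counts and the function name `nonresponse_weights` are hypothetical.

```python
def nonresponse_weights(selected, interviewed):
    """Weight for each class = cases selected / cases actually interviewed."""
    return {cls: selected[cls] / interviewed[cls] for cls in selected}

# Hypothetical counts: nonresponse is concentrated among families
# with two wage earners and no children.
selected = {"two earners, no children": 120, "all other families": 880}
interviewed = {"two earners, no children": 80, "all other families": 800}

weights = nonresponse_weights(selected, interviewed)

# After weighting, the interviewed cases stand in for the full selected sample.
weighted_counts = {cls: interviewed[cls] * weights[cls] for cls in interviewed}

for cls in selected:
    print(f"{cls}: weight {weights[cls]:.2f}, weighted count {weighted_counts[cls]:.0f}")
```

The same arithmetic explains the preference for spreading the adjustment: fifty cases each weighted 1.02 carry a total weight of 51, the same as fifty cases with one of them duplicated, but no single respondent counts double.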

When the sampling is not done everywhere at uniform rates—for example, when the rate in lower-income areas is half that in higher-income areas—weights are used to make the final results representative, the weights being the inverse of the sampling ratios. This was commonly done with the early surveys of consumer finances, in which higher-rent dwellings were oversampled and upper-income people were oversampled for reinterviewing.
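
The inverse-of-sampling-ratio rule can be illustrated as follows. This is a sketch with hypothetical incomes and sampling rates (1 in 500 in the higher-income area, 1 in 1,000 in the lower-income area); `weighted_mean` is an illustrative helper, not a procedure taken from the surveys cited.

```python
def weighted_mean(records):
    """records: iterable of (value, sampling_rate) pairs.
    Each case is weighted by the inverse of its sampling rate."""
    total_weight = sum(1 / rate for _, rate in records)
    return sum(value / rate for value, rate in records) / total_weight

# Hypothetical incomes; the lower-income area is sampled at half the rate.
sample = [
    (9000, 1 / 500), (11000, 1 / 500),   # higher-income area
    (4000, 1 / 1000), (6000, 1 / 1000),  # lower-income area
]

unweighted = sum(value for value, _ in sample) / len(sample)
weighted = weighted_mean(sample)

print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

In this example the unweighted mean overstates income because the oversampled higher-income cases count as much as the lower-income cases, each of which represents twice as many households.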

**Analysis of survey data.** Rapid development has taken place in the techniques of analysis of survey data, which present a challenge because of their very richness. In economic studies some measure of economic capacity (such as income, total consumption, or wealth) is usually an important classifying characteristic, but the effects of economic capacity on any particular thing, such as home ownership or expenditure on durables, may well vary between rural areas and cities or between old and young. While much survey data is produced only in the form of tables, usually by age or income or family size, increasingly the analysis moves to more complex procedures. Many problems arise, and some theoretical or statistical assumptions must be imposed to make the analysis manageable.

A much used procedure is multiple regression, which assumes that each explanatory factor can be converted into some numerical scale and that these variables have a linear additive relationship with the variable to be explained. Since many of the explanatory factors (occupation, region, etc.) are not themselves scaled, it has become common to introduce “dummy variables” to represent each subclass (except one) of each such factor (Hill 1959; Suits 1957). A dummy variable takes on the value 1 if the individual belongs to the subclass, 0 if he does not.
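
As an illustration of dummy-variable regression, the sketch below fits an additive model by ordinary least squares using only the standard library. The data are hypothetical and constructed to be exactly additive (expenditure = 5 + 1 × income + 3 for South, − 3 for West), and the helper names `solve`, `ols`, and `dummies` are assumptions of this example.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def dummies(value, categories):
    """One 0/1 variable per subclass except the first (the omitted subclass)."""
    return [1.0 if value == c else 0.0 for c in categories[1:]]

regions = ["North", "South", "West"]  # "North" is the omitted subclass

# Hypothetical cases: (region, income, expenditure)
data = [("North", 4, 9), ("North", 6, 11), ("South", 4, 12),
        ("South", 8, 16), ("West", 5, 7), ("West", 7, 9)]

rows = [[1.0, float(income)] + dummies(region, regions)
        for region, income, _ in data]
y = [float(spend) for _, _, spend in data]

# Order of coefficients: constant, income, South dummy, West dummy.
coefficients = ols(rows, y)
print([round(c, 6) for c in coefficients])
```

Each dummy coefficient measures the difference between its subclass and the omitted one at a given income, which is why one subclass of each factor must be left out.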

The assumption of additive effects has remained troublesome. By comparison, the cross-product of two dummy variables introduces only a limited form of interaction effect. It isolates only one of four possible corners of a two-way table; the corner isolated depends on the definitions of the dummy variables. Most recently the search for interaction effects has been formalized and programmed for machine computation (Sonquist & Morgan 1964). This flexible approach produces findings that would otherwise be uncovered only by accident or by many repeated regressions for various subgroups. For example, it has been found that the effects of insurance on the utilization of medical care are important only for adult females; that nonwhite wives are more likely to work than white wives, but the difference is great only at the stage in the family life cycle when there are children in school; that a measure of achievement motivation is related to hourly earnings, but only for middle-aged male college graduates; and that having a private pension is associated with higher (not lower) savings for those persons who have at least $500 in assets—an asset level that seems to indicate that the individual has learned how he can save.
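
The idea of searching for interaction effects can be sketched as a sequence of two-way splits, each chosen to explain as much variation as possible. The code below is a simplified illustration in that spirit, not the program reported by Sonquist and Morgan; the data are hypothetical, and `between_ss`, `split_gain`, and `best_split` are names invented for this example.

```python
def between_ss(groups):
    """Between-group sum of squares for a list of groups of values."""
    values = [v for g in groups for v in g]
    grand = sum(values) / len(values)
    return sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)

def split_gain(cases, name):
    """Variation in y explained by splitting the cases on a 0/1 predictor."""
    low = [c["y"] for c in cases if c[name] == 0]
    high = [c["y"] for c in cases if c[name] == 1]
    return between_ss([low, high]) if low and high else 0.0

def best_split(cases, predictors):
    """The predictor whose two-way split explains the most variation."""
    return max(predictors, key=lambda name: split_gain(cases, name))

# Hypothetical data: insurance raises utilization only among women.
cases = [
    {"female": f, "insured": i, "y": y}
    for f, i, y in [(0, 0, 10), (0, 1, 10), (1, 0, 12), (1, 1, 20)]
    for _ in range(5)
]

# The cross-product dummy isolates a single corner of the two-way table.
corner = sum(c["female"] * c["insured"] for c in cases)

first = best_split(cases, ["female", "insured"])
women = [c for c in cases if c["female"] == 1]
men = [c for c in cases if c["female"] == 0]
gain_women = split_gain(women, "insured")
gain_men = split_gain(men, "insured")

print(first, gain_women, gain_men)
```

In this constructed example the first split is on sex, and a further split on insurance explains variation among women but none among men, which is the kind of conditional effect an additive regression would average away.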

**Limitations.** Sample surveys are not appropriate vehicles for securing *all* kinds of information. It would be useful to know the extent of illegal gambling, or the number of abortions in the United States, or how many people cheat on their income taxes, but it is doubtful that such information would be freely given. It *is* possible to study popular attitudes about such things, how attitudes of different groups in the population differ, and how attitudes change over time.

In the financial area, individuals easily remember large, irregular, salient matters, such as the price or payment arrangements for a new car or the size of a hospital bill, and regular payments, such as the rent or the mortgage. Other details of family expenditure, however, are generally difficult to recall, and it has never proved easy to get people to keep records. Indeed, only a few keep any records of their expenditures on any regular basis, and these individuals may be atypical in other ways as well.

Some matters are likely to be regarded as private information. In the United States, where most people are employed by others on a regular basis, income is not a particularly sensitive item, nor is installment credit; but assets (particularly with older people) and small loan debts do seem to be sensitive.

It should be kept in mind, however, that where memories are poor or people are sensitive, it may well be possible to secure approximate information, sufficient for many purposes. It is not necessary to know a man’s income to the dollar to classify him for analytical purposes.

The paucity of published studies of response errors belies both the amount of research that has been done and the growing interest in methodology. It is, however, quite impossible to determine the accuracy of each of the hundred bits of information secured in a survey, and it will remain necessary to rely on general impressions as to what respondents can be expected to be able and willing to recall in an interview. As long as errors are distributed randomly, or at least independently of the explanatory factors being investigated, they do not lead to spurious positive findings, although they reduce the possibility of finding real relationships.
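
The effect of random response error can be illustrated by simulation. The sketch below assumes normally distributed variables and errors that are independent of everything else; the variable names, the sample size, and the error variance are all hypothetical choices made for the example.

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

true_x = [rng.gauss(0, 1) for _ in range(n)]      # a real explanatory factor
outcome = [x + rng.gauss(0, 1) for x in true_x]   # depends on true_x
unrelated = [rng.gauss(0, 1) for _ in range(n)]   # no relation to outcome

def report(values):
    """Add independent random response error to each true value."""
    return [v + rng.gauss(0, 1) for v in values]

r_true = correlation(true_x, outcome)
r_reported = correlation(report(true_x), outcome)
r_spurious = correlation(report(unrelated), outcome)

print(f"true factor, no error:      {r_true:.3f}")
print(f"true factor, with error:    {r_reported:.3f}")
print(f"unrelated factor, with error: {r_spurious:.3f}")
```

Under these assumptions the error weakens the observed correlation for the real factor but leaves the unrelated factor's correlation near zero, so random error hides relationships rather than creating them.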

**Forecasting short-run changes.** Surveys are used both to forecast short-run changes in the behavior of consumers or businessmen and to develop a behavioral theory of what it is that brings about these changes. A continuing series of surveys asking about expectations, attitudes, buying plans, and recent major purchases provides evidence on short-run changes in the propensity to consume (or invest). Analyses of such data provide evidence as to the causes of changes in attitudes and how such changes are related to subsequent economic behavior. Once consumers become affluent enough to have some real freedom of action—particularly when the use of cars and appliances and other forms of consumer investment allow postponement or speeding up of investments—changes in “consumption” can become more volatile than changes in business investment.

Since many of the consumer’s decisions involve commitment to a pattern of payments (and depreciation) into the future, his views of the future can be expected to affect his decisions. Thus, it is argued, the consumer’s confidence and optimism about his own and his country’s economic future can and will affect his propensity to make expenditures. The original and continuing work in this area is that of George Katona (1960; 1964) and Eva Mueller (1963). Also, data on buying plans and on a few consumer attitudes are collected by the U.S. Bureau of the Census in its current population surveys.

In the United States, surveys of business investment intentions and other expectations, based on mail questionnaires, have been carried on both by the government (the Securities and Exchange Commission and the Department of Commerce) and by private groups (market research firms and magazines like *Business Week*). In Europe expectational data are collected much more frequently and from larger samples, but the questions relate only to the direction of change. Such questions are easier to answer. The pooled results, industry by industry, are sent to the firms, to induce them to continue cooperating.

**Further uses of survey data.** Survey analysis provides information on such points as the shape of the consumption function, the role of interest rates, asset preferences and the demand for money, and factors affecting investment decisions. The relationships discovered can be incorporated into systems of equations describing large sectors of an economy, or the economy as a whole. At a simpler level, information relating initial conditions to subsequent actions can be used to construct a Markov process describing the behavior of variables over time. For example, the relation of a respondent’s father’s education to his own provides evidence on intergenerational change in education levels and can be used to estimate the distribution of education in the population several generations hence (Morgan et al. 1962). It is, of course, necessary to use a sample of sons, not fathers, since not only may a father have more than one son, but an individual son’s education may be affected by the number of brothers and sisters he has. Survey data pertaining to brand loyalty have also been treated and analyzed in terms of Markov processes. *[See* Markov Chains.]
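
A Markov projection of the education distribution might be sketched as follows. The transition probabilities and the starting distribution are hypothetical, not figures from Morgan et al.; each row is read as the distribution of sons' education for a given level of father's education, and the sketch assumes those probabilities stay constant from one generation to the next.

```python
levels = ["grade school", "high school", "college"]

# transition[i][j]: hypothetical share of sons reaching level j
# among sons whose fathers had level i. Each row sums to 1.
transition = [
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
]

def next_generation(distribution, transition):
    """Distribution of education one generation later."""
    k = len(transition)
    return [sum(distribution[i] * transition[i][j] for i in range(k))
            for j in range(k)]

def project(distribution, transition, generations):
    """Apply the transition matrix repeatedly."""
    for _ in range(generations):
        distribution = next_generation(distribution, transition)
    return distribution

fathers = [0.5, 0.4, 0.1]  # hypothetical current distribution
sons = project(fathers, transition, 1)
great_grandsons = project(fathers, transition, 3)

for label, share in zip(levels, great_grandsons):
    print(f"{label}: {share:.3f}")
```

The projection is only as good as the constancy assumption; if the relation between a father's and a son's education is itself changing, the rows would need to be re-estimated for each generation.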

Surveys repeated over time can also be used to study processes of social change, not only by indicating over-all changes but also by identifying the subgroups of society in which changes are occurring most rapidly. There are interpretive problems, of course, when age groups are involved, since the age of an individual, the year he was born (his generation, or “cohort”), and the year of the survey are related perfectly to one another. Hence, it is never possible to hold two of them constant and vary the third. However, if it can be assumed that chronological age does not matter, it is possible to separate time trends from differences between generations. Or if the year of a man’s birth does not matter, it is possible to separate age effects from time trends, or to find separate time trends for each age group. Or if it can be assumed that there are no major trends, repeated surveys can separate the effects of age from those of cohort (year of birth).
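
The perfect relation among age, cohort, and survey year can be verified directly: because cohort equals survey year minus age, the three variables are linearly dependent, and their cross-product matrix is singular. The sketch below uses hypothetical survey years and ages; `gram` and `det` are helper names introduced for the example.

```python
def gram(columns):
    """Matrix of sums of cross-products between the given columns."""
    return [[sum(a * b for a, b in zip(u, v)) for v in columns] for u in columns]

def det(m):
    """Determinant of a 2x2 or 3x3 matrix by cofactor expansion."""
    if len(m) == 2:
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

# Hypothetical respondents: three surveys, three ages in each.
respondents = [(age, year) for year in (1950, 1955, 1960) for age in (25, 35, 45)]

age = [a for a, _ in respondents]
year = [y for _, y in respondents]
cohort = [y - a for a, y in respondents]  # year of birth

det_all_three = det(gram([age, year, cohort]))
det_age_year = det(gram([age, year]))

print("age, year, cohort:", det_all_three)
print("age, year only:   ", det_age_year)
```

A zero determinant for all three variables means their separate linear effects cannot be estimated together; dropping any one of them, which corresponds to assuming it does not matter, restores a nonzero determinant.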

Repeated surveys also allow one to investigate whether the changes in the over-all aggregate of a dependent variable are due to changes in the population structure, changes in the way certain factors affect behavior, or changes in trends unrelated to the variables that have been measured.
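
One simple way to separate a change in population structure from a change in behavior within groups is to split the change in an aggregate rate into two parts, each evaluated at the average of the two surveys. This is a sketch of one standard decomposition, not necessarily the procedure the author has in mind; the groups, shares, and rates are hypothetical, and it does not by itself distinguish changed relationships from unmeasured trends.

```python
def aggregate(shares, rates):
    """Over-all rate = sum over groups of population share times group rate."""
    return sum(shares[g] * rates[g] for g in shares)

def decompose_change(share0, rate0, share1, rate1):
    """Split the change in the aggregate into a structure part (shifting
    group shares) and a behavior part (shifting rates within groups)."""
    structure = sum((share1[g] - share0[g]) * (rate0[g] + rate1[g]) / 2
                    for g in share0)
    behavior = sum((rate1[g] - rate0[g]) * (share0[g] + share1[g]) / 2
                   for g in share0)
    return structure, behavior

# Hypothetical first and second surveys.
share0 = {"young": 0.6, "old": 0.4}
rate0 = {"young": 0.20, "old": 0.40}
share1 = {"young": 0.5, "old": 0.5}
rate1 = {"young": 0.25, "old": 0.40}

total_change = aggregate(share1, rate1) - aggregate(share0, rate0)
structure, behavior = decompose_change(share0, rate0, share1, rate1)

print(f"total change: {total_change:.4f}")
print(f"due to structure: {structure:.4f}")
print(f"due to behavior:  {behavior:.4f}")
```

The two parts add up exactly to the total change, so the split is an accounting identity; the interpretation of each part still depends on which grouping variables were measured.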

James N. Morgan

*[Directly related are the entries* Consumers, *articles on* Consumer Assets *and* Consumer Behavior; Consumption Function; Cross-Section Analysis; Index Numbers, *articles on* Practical Applications *and* Sampling; Panel Studies; Prediction and Forecasting, Economic; Sample Surveys.]

## BIBLIOGRAPHY

Anderson, Odin; and Feldman, Jacob (1954) 1956 *Family Medical Costs and Voluntary Health Insurance: A Nationwide Survey.* New York: McGraw-Hill. → First published as *National Family Survey of Medical Costs and Voluntary Health Insurance.*

Barlow, Robin; Morgan, James N.; and Wirick, Grover 1960 A Study of Validity in Reporting Medical Care in Michigan. American Statistical Association, Social Statistics Section, *Proceedings* [1960]: 54-65.

Break, George F. 1957 Income Taxes and Incentives to Work: An Empirical Study. *American Economic Review* 47:529-549.

Caplovitz, David 1963 *The Poor Pay More: Consumer Practices in Low-income Families.* New York: Free Press.

Conard, Alfred et al. 1964 *Automobile Accident Costs and Payments: Studies in the Economics of Injury Reparation.* Ann Arbor: Univ. of Michigan Press.

Conference on Consumption and Savings, University of Pennsylvania, 1959 1960 *Proceedings.* 2 vols. Edited by Irwin Friend and Robert Jones. Philadelphia: Univ. of Pennsylvania Press.

Eisner, Robert 1957 Interview and Other Survey Techniques and the Study of Investment. Pages 513-601 in Conference on Research in Income and Wealth, *Problems of Capital Formation: Concepts, Measurement, and Controlling Factors.* Princeton Univ. Press.

Ferber, Robert 1959 *Collecting Financial Data by Consumer Panel Techniques.* Urbana: Univ. of Illinois, Bureau of Economic and Business Research.

Ferber, Robert 1962 Research on Household Behavior. *American Economic Review* 52:19-63.

Friend, Irwin; and Bronfenbrenner, Jean 1955 Plant and Equipment Programs and Their Realization. Pages 53-111 in Conference on Research in Income and Wealth, *Short Term Economic Forecasting.* Princeton Univ. Press.

Hansen, Morris 1954 Questions and Answers. *American Statistician* 8, no. 4:33-34.

Heller, Walter W. 1951 The Anatomy of Investment Decisions. *Harvard Business Review* 29, no. 2:95-103.

Hill, T. P. 1959 An Analysis of the Distribution of Wages and Salaries in Great Britain. *Econometrica* 27:355-381.

International Labor Office 1961 *Family Living Studies: A Symposium.* Studies and Reports, New Series, No. 63. Geneva: The Office.

Juster, Francis T. 1964 *Anticipations and Purchases: An Analysis of Consumer Behavior.* Princeton Univ. Press.

Katona, George 1960 *The Powerful Consumer: Psychological Studies of the American Economy.* New York: McGraw-Hill.

Katona, George 1964 *The Mass Consumption Society.* New York: McGraw-Hill.

Katona, George; and Morgan, James N. 1952 The Quantitative Study of Factors Determining Business Decisions. *Quarterly Journal of Economics* 66:67-90.

Kish, Leslie; and Lansing, John B. 1954 Response Errors in Estimating the Value of Homes. *Journal of the American Statistical Association* 49:520-538.

Klein, Lawrence R.; and Lansing, John B. 1955 Decisions to Purchase Consumer Durable Goods. *Journal of Marketing* 20, October: 109-132.

Lamale, Helen 1959 *Methodology of the Survey of Consumer Expenditures in 1950.* Philadelphia: Univ. of Pennsylvania Press.

Lansing, John; Ginsburg, Gerald; and Braaten, Kaisa 1961 *An Investigation of Response Error.* Urbana: Univ. of Illinois, Bureau of Economic and Business Research.

Lansing, John B.; and Withey, Stephen B. 1955 Consumer Anticipations: Their Use in Forecasting Consumer Behavior. Pages 381-453 in Conference on Research in Income and Wealth, *Short Term Economic Forecasting.* Princeton Univ. Press.

Liviatan, Nissan 1963 Tests of the Permanent-income Hypothesis Based on a Reinterview Savings Survey. Pages 29-59 in *Measurement in Economics: Studies in Mathematical Economics and Econometrics in Memory of Yehuda Grunfeld.* Stanford Univ. Press. → A “Note” by Milton Friedman and a reply by Liviatan appear on pages 59-66.

Lydall, Harold F. 1957 The Impact of the Credit Squeeze on Small and Medium Sized Manufacturing Firms. *Economic Journal* 67:415-431.

McNerney, Walter J. et al. 1962 *Hospital and Medical Economics.* 2 vols. Chicago: Hospital Research and Educational Trust. → See especially Volume 1, pages 61-357.

Marquardt, Wilhelm; and Strigel, Werner 1959 *Der Konjunkturtest: Eine neue Methode der Wirtschaftsbeobachtung.* Berlin: Duncker & Humblot.

Morgan, James N. 1958 A Review of Recent Research on Consumer Behavior. Volume 3, pages 93-219 in Lincoln Clark (editor), *Consumer Behavior: Research on Consumer Reactions.* New York: Harper.

Morgan, James N.; Barlow, Robin; and Brazer, Harvey 1965 A Survey of Investment Management and Working Behavior Among High-income Individuals. *American Economic Review* 55, no. 2: 252-264.

Morgan, James N.; and Sonquist, John A. 1963 Problems in the Analysis of Survey Data, and a Proposal. *Journal of the American Statistical Association* 58: 415-434.

Morgan, James N. et al. 1962 *Income and Welfare in the United States.* New York: McGraw-Hill.

Morrissett, Irving 1957 Psychological Surveys in Business Forecasting. Pages 258-315 in Rensis Likert and Samuel P. Hayes (editors), *Some Applications of Behavioral Research.* Paris: UNESCO.

Mueller, Eva 1957 Effects of Consumer Attitudes on Purchases. *American Economic Review* 47:946-965.

Mueller, Eva 1960 Consumer Attitudes: Their Influence and Forecasting Value. Pages 149-179 in Universities-National Bureau Committee for Economic Research, *The Quality and Economic Significance of Anticipations Data.* Princeton Univ. Press.

Mueller, Eva 1963 Ten Years of Consumer Attitude Surveys: Their Forecasting Record. *Journal of the American Statistical Association* 58:899-917.

Mueller, Eva; and Morgan, James N. 1962 Location Decisions of Manufacturers. *American Economic Review* 52, no. 2: 204-217.

Mueller, Eva; Wilken, Arnold; and Wood, Margaret 1961 *Location Decisions and Industrial Mobility in Michigan.* Ann Arbor: Univ. of Michigan, Institute for Social Research.

Neter, John; and Waksberg, Joseph 1964a Conditioning Effects From Repeated Household Interviews. *Journal of Marketing* 28, April: 51-56.

Neter, John; and Waksberg, Joseph 1964b A Study of Response Errors in Expenditures Data From Household Interviews. *Journal of the American Statistical Association* 59:18-55.

Orcutt, Guy H. et al. 1961 *Microanalysis of Socioeconomic Systems: A Simulation Study.* New York: Harper.

Sirken, Monroe G.; Maynes, E. Scott; and Frechtling, John A. 1958 The Survey of Consumer Finances and the Census Quality Check. Pages 127-168 in Conference on Research in Income and Wealth, *An Appraisal of the 1950 Census Income Data.* Princeton Univ. Press.

Sonquist, John A.; and Morgan, James N. 1964 *The Detection of Interaction Effects: A Report on a Computer Program for the Selection of Optimal Combinations of Explanatory Variables.* Monograph No. 35. Ann Arbor: Univ. of Michigan, Institute for Social Research, Survey Research Center.

Stone, Richard 1963 Consumers’ Wants and Expenditures: A Survey of British Studies Since 1945. In Colloque International sur les Besoins de Biens de Consommation, Grenoble, 11-15 September, 1961, *Actes.* Paris: Editions du Centre National de la Recherche Scientifique. → Reprinted in mimeographed form by the Department of Applied Economics, Cambridge University.

Suits, Daniel B. 1957 Use of Dummy Variables in Regression Equations. *Journal of the American Statistical Association* 52:548-551.

Theil, H. 1955 Recent Experiences With the Munich Business Test. *Econometrica* 23:184-192.

Tobin, James 1959 On the Predictive Value of Consumer Intentions and Attitudes. *Review of Economics and Statistics* 41:1-11. → See the comments by George Katona on page 317.

U.S. Department of Commerce, Special Advisory Committee on Employment Statistics 1954 *Measurement of Employment and Unemployment by the Bureau of the Census in Its Current Population Survey.* Washington: Government Printing Office.

Williams, Faith M.; and Zimmerman, Carle C. 1935 *Studies of Family Living in the United States and Other Countries: An Analysis of Material and Method*. U.S. Department of Agriculture, Miscellaneous Publication No. 223. Washington: Government Printing Office.

Wirick, Grover; and Barlow, Robin 1964 The Economic and Social Determinants of the Demand for Health Services. Pages 95-125 in Conference on the Economics of Health and Medical Care, University of Michigan, 1962, *The Economics of Health and Medical Care.* Ann Arbor: Univ. of Michigan.