causal modelling

A causal model is an abstract quantitative representation of real-world dynamics. It attempts to describe the causal and other relationships among a set of variables. The best-known form of causal modelling is path analysis, which was originally developed in genetics, but was adopted as a technique in the 1960s by American sociologists such as Otis Dudley Duncan. Most causal modelling is associated with survey research (see the classic text by H. M. Blalock, Causal Inferences in Nonexperimental Research, 1964).

Essentially, causal models are based on structural equations of the form z = b1x + b2y, and are analysed using regression techniques. However, a simpler way to understand the principle of causal models is to think of them as hypotheses about the presence, sign, and direction of influence for the relations of all pairs of variables in a set. Usually these relations are mapped in diagrams or flow graphs as in the simple example shown below.
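As a rough illustration of how a structural equation of the form z = b1x + b2y is estimated by regression, the following minimal sketch fits b1 and b2 by ordinary least squares on simulated data. The coefficient values (0.5 and -0.3), the sample size, and the function name fit_two_predictors are assumptions chosen for illustration only; they do not come from any particular study.

```python
import random


def fit_two_predictors(x, y, z):
    """Least-squares estimates of b1, b2 in z = b1*x + b2*y (no intercept).

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxz = sum(a * c for a, c in zip(x, z))
    syz = sum(b * c for b, c in zip(y, z))
    det = sxx * syy - sxy * sxy
    b1 = (sxz * syy - syz * sxy) / det
    b2 = (syz * sxx - sxz * sxy) / det
    return b1, b2


# Simulated data: the "true" coefficients 0.5 and -0.3 are hypothetical.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
z = [0.5 * a - 0.3 * b + rng.gauss(0, 1) for a, b in zip(x, y)]

b1, b2 = fit_two_predictors(x, y, z)
print(f"estimated b1 = {b1:.2f}, b2 = {b2:.2f}")
```

The signs of the estimates correspond to the hypothesised direction of influence: here x is assumed to raise z and y to lower it.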

Even when there are only three variables under examination many different models of their relationship are possible. Thus, investigating all the different possible models is an important step in the analysis of data, and in linking sociological theory to empirical research.

Causal models incorporate the idea of multiple causality, that is, there can be more than one cause for any particular effect. For example, how a person votes may be related to social class, age, sex, ethnicity, and so on. Moreover, some of the independent or explanatory variables could be related to one another. For example, ethnicity and class may be related, so that the effect of ethnicity on vote is both direct and indirect (through class), as shown in Figure 3.

This example also serves to show the importance of thinking about such models before data are collected. Theorizing in this way tells us what data we need to collect to test our model.

However, we are interested not only in what affects voting, but also in how different variables affect it. For example, does sex have a positive or negative effect on vote? Put this way the question seems a strange one, but if (say) we were to ask whether being female makes it more or less likely that a person votes Republican, the relationship would be given a positive sign if being female makes it more likely and a negative sign if it makes it less likely. We could equally ask whether age, class, or sex was the most important in its effect on voting. Causal analysis would also be able to show the combined impact of age, class, and sex on vote. That is, we could say how much of the variance in vote is accounted for by the other three variables.
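The distinction between direct and indirect effects in the ethnicity, class, and vote example can be sketched numerically. In the minimal illustration below, all three variables are treated as numeric scores purely for convenience, and the path values (0.6, 0.2, 0.4) are hypothetical. Under these assumptions, the indirect effect is the product of the two paths through class, and the total effect is the sum of the direct and indirect effects.

```python
import random


def centre(v):
    """Subtract the mean so that regressions need no intercept."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def slope(x, y):
    """Least-squares slope of y on a single centred predictor x."""
    return dot(x, y) / dot(x, x)


def slopes2(x1, x2, y):
    """Least-squares slopes of y on two centred predictors x1 and x2."""
    det = dot(x1, x1) * dot(x2, x2) - dot(x1, x2) ** 2
    b1 = (dot(x1, y) * dot(x2, x2) - dot(x2, y) * dot(x1, x2)) / det
    b2 = (dot(x2, y) * dot(x1, x1) - dot(x1, y) * dot(x1, x2)) / det
    return b1, b2


# Simulated scores; every coefficient here is a hypothetical assumption.
rng = random.Random(1)
n = 5000
ethnicity = [rng.gauss(0, 1) for _ in range(n)]
social_class = [0.6 * e + rng.gauss(0, 1) for e in ethnicity]
vote = [0.2 * e + 0.4 * c + rng.gauss(0, 1)
        for e, c in zip(ethnicity, social_class)]

e, c, v = centre(ethnicity), centre(social_class), centre(vote)

a = slope(e, c)                # path: ethnicity -> class
direct, b = slopes2(e, c, v)   # paths: ethnicity -> vote, class -> vote
indirect = a * b               # ethnicity -> class -> vote
total = slope(e, v)            # vote regressed on ethnicity alone

print(f"direct = {direct:.2f}, indirect = {indirect:.2f}, total = {total:.2f}")
```

For linear models estimated by least squares, the total effect equals the direct effect plus the indirect effect, which is why omitting class from the model would overstate the direct influence of ethnicity.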

At best causal models usually account for only a proportion (usually no more than 20 or 30 per cent) of the variance in a dependent variable. For this reason causal models include a residual or error term to account for the variance left unexplained. There are, after all, many other social characteristics which affect how people vote, apart from those of age, sex, and class. It is also important to note that the causal model assumes a hierarchy—age, sex, and class cause vote, but vote does not cause age, sex, and class. Finally, it should be noted that causal models do not prove that one variable is caused by the effect of others. All the model can do is to indicate whether it is compatible with the data; and, if so, what the strengths of the causal effects are, given the model being used.
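The role of the residual term can be illustrated with a small simulation. The sketch below, which assumes two hypothetical numeric predictors (labelled age and social class) and made-up coefficients, computes the proportion of variance in the dependent variable accounted for by the model; the remainder is what the error term absorbs.

```python
import random


def centre(v):
    """Subtract the mean so that the regression needs no intercept."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def slopes2(x1, x2, y):
    """Least-squares slopes of y on two centred predictors x1 and x2."""
    det = dot(x1, x1) * dot(x2, x2) - dot(x1, x2) ** 2
    b1 = (dot(x1, y) * dot(x2, x2) - dot(x2, y) * dot(x1, x2)) / det
    b2 = (dot(x2, y) * dot(x1, x1) - dot(x1, y) * dot(x1, x2)) / det
    return b1, b2


# Simulated scores; the coefficients 0.5 and -0.3 are hypothetical, and the
# noise term stands in for all the unmeasured influences on vote.
rng = random.Random(2)
n = 2000
age = centre([rng.gauss(0, 1) for _ in range(n)])
social_class = centre([rng.gauss(0, 1) for _ in range(n)])
vote = centre([0.5 * a - 0.3 * c + rng.gauss(0, 1)
               for a, c in zip(age, social_class)])

b_age, b_class = slopes2(age, social_class, vote)

# Residual: the part of vote the model leaves unexplained.
residuals = [v - (b_age * a + b_class * c)
             for v, a, c in zip(vote, age, social_class)]

r_squared = 1 - dot(residuals, residuals) / dot(vote, vote)
print(f"proportion of variance explained = {r_squared:.2f}")
```

With these assumed values most of the variance is left in the residual, which is in line with the modest explanatory power described above.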

Herbert Asher's Causal Modelling (2nd edn., 1990) gives a short—but highly technical—introduction to the logic and tools of causal models. See also MULTI-LEVEL MODELS; MULTIVARIATE ANALYSIS.