Confounding, Confounding Factors


The word confounding has been used to refer to at least three distinct concepts. In the oldest and most widespread usage, confounding is a source of bias in estimating causal effects. This bias is sometimes informally described as a mixing of effects of extraneous factors (called confounders) with the effect of interest. This usage predominates in nonexperimental research, especially in epidemiology and sociology. In a second and more recent usage originating in statistics, confounding is a synonym for a change in an effect measure upon stratification or adjustment for extraneous factors (a phenomenon called noncollapsibility or Simpson's paradox). In a third usage, originating in the experimental-design literature, confounding refers to inseparability of main effects and interactions under a particular design. The three concepts are closely related and are not always distinguished from one another. In particular, the concepts of confounding as a bias in effect estimation and as noncollapsibility are often treated as equivalent, even though they are not. Only the former concept will be described here.


A classic discussion of confounding in which explicit reference is made to "confounded effects" is found in John Stuart Mill's A System of Logic; although Mill lays out the primary issues, he acknowledges Francis Bacon as a forerunner in dealing with them. Mill lists a requirement for an experiment intended to determine causal relations: "none of the circumstances [of the experiment] that we do know shall have effects susceptible of being confounded with those of the agents whose properties we wish to study [emphasis added]."

In Mill's time, the word experiment referred to an observation in which some circumstances were under the control of the observer, as it still is used in ordinary English, rather than to the notion of a comparative trial. Nonetheless, Mill's requirement suggests that a comparison is to be made between the outcome of one's "experiment" (which is, essentially, an uncontrolled trial) and what one would expect the outcome to be if the agents one wished to study had been absent. If the outcome is not as one would expect in the absence of the study agents, then Mill's requirement ensures that the unexpected outcome was not brought about by extraneous "circumstances" (factors). If, however, those circumstances do bring about the unexpected outcome, and that outcome is mistakenly attributed to effects of the study agents, then the mistake is one of confounding (or confusion) of the extraneous effects with the agent effects.

Much of the modern literature follows the same informal conceptualization given by Mill. Terminology is now more specific, with "treatment" used to refer to an agent administered by the investigator and "exposure" often used to denote an unmanipulated agent. The chief development beyond Mill is that the expectation for the outcome in the absence of the study exposure is now almost always explicitly derived from observation of a control group that is untreated or unexposed. For example, D. Clayton and M. Hills (1993) state of observational studies:

there is always the possibility that an important influence on the outcome differs systematically between the comparison [exposed and unexposed] groups. It is then possible [that] part of the apparent effect of exposure is due to these differences, [in which case] the comparison of the exposure groups is said to be confounded [emphasis in the original].

In fact, confounding is also possible in randomized experiments owing to systematic improprieties in treatment allocation, administration, and compliance. A further and somewhat controversial point is that confounding (as per Mill's original definition) can also occur in perfect randomized trials due to random differences between comparison groups.


Various mathematical formalizations of confounding have been proposed for use in statistical analyses. Perhaps the one closest to Mill's concept is based on the counterfactual model for causal effects. Suppose one wishes to consider how a health-status (outcome) measure of a population would change in response to an intervention (population treatment). More precisely, suppose one's objective is to determine the effect that applying a treatment x1 had or would have on an outcome measure µ relative to applying treatment x0 to a specific target population A. For example, A could be a cohort of breast-cancer patients, treatment x1 could be a new hormone therapy, x0 could be a placebo therapy, and the measure µ could be a five-year survival probability. The treatment x1 is sometimes called the index treatment, and x0 is sometimes called the control or reference treatment (which is often a standard or placebo treatment).

The counterfactual model posits that, in population A, µ will equal µA1 if x1 is applied and µA0 if x0 is applied; the causal effect of x1 relative to x0 is defined as the change from µA0 to µA1, which might be measured as µA1 − µA0 or µA1/µA0. If A is given treatment x1, then µ will equal µA1 and µA1 will be observable, but µA0 will be unobserved. Suppose, however, we expect µA0 to equal µB0, where µB0 is the value of the outcome µ observed or estimated for a population B that was administered treatment x0. The latter population is sometimes called the control or reference population. Confounding is said to be present if in fact µA0 ≠ µB0, for then there must be some difference between populations A and B (other than treatment) that is affecting µ.
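The definitions in this paragraph can be illustrated with a minimal sketch. The survival probabilities below are invented for illustration, and the function names are hypothetical; in a real study µA0 is unobserved, so the check in is_confounded could not be carried out directly.

```python
from fractions import Fraction

# Hypothetical five-year survival probabilities (illustrative values, not data).
mu_A1 = Fraction(7, 10)  # population A under index treatment x1 (observed)
mu_A0 = Fraction(1, 2)   # population A under reference treatment x0 (counterfactual)
mu_B0 = Fraction(3, 5)   # population B under reference treatment x0 (observed)


def effect_difference(mu_1, mu_0):
    """Contrast on the difference scale: mu_1 - mu_0."""
    return mu_1 - mu_0


def effect_ratio(mu_1, mu_0):
    """Contrast on the ratio scale: mu_1 / mu_0."""
    return mu_1 / mu_0


def is_confounded(mu_A0, mu_B0):
    """Confounding is present exactly when mu_A0 differs from mu_B0."""
    return mu_A0 != mu_B0


# The causal effect on A needs the unobserved mu_A0.
true_effect = effect_difference(mu_A1, mu_A0)
# The crude association uses population B as a stand-in for A under x0.
crude_association = effect_difference(mu_A1, mu_B0)

print("effect of x1 on A:", true_effect)
print("crude association:", crude_association)
print("confounded:", is_confounded(mu_A0, mu_B0))
```

With these assumed values the crude association understates the effect on A, because B would have fared better than A under the reference treatment.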

If confounding is present, a naïve (crude) association measure obtained by substituting µB0 for µA0 in an effect measure will not equal the effect measure, and the association measure is said to be confounded. For example, if µB0 ≠ µA0, then µA1 − µB0, which measures the association of treatments with outcomes across the populations, is confounded for µA1 − µA0, which measures the effect of treatment x1 on population A. Thus, saying an association measure such as µA1 − µB0 is confounded for an effect measure such as µA1 − µA0 is synonymous with saying the two measures are not equal.
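The relation between the two measures can be written out as an identity. This is a worked restatement of the definitions above rather than a formula taken from the text; the labels under the braces are informal descriptions.

```latex
% Difference scale: add and subtract the unobserved \mu_{A0}
\[
\underbrace{\mu_{A1} - \mu_{B0}}_{\text{association measure}}
  = \underbrace{(\mu_{A1} - \mu_{A0})}_{\text{effect of } x_1 \text{ on } A}
  + \underbrace{(\mu_{A0} - \mu_{B0})}_{\text{discrepancy due to confounding}}
\]

% Ratio scale: multiply and divide by the unobserved \mu_{A0}
\[
\frac{\mu_{A1}}{\mu_{B0}}
  = \frac{\mu_{A1}}{\mu_{A0}} \times \frac{\mu_{A0}}{\mu_{B0}}
\]
```

On either scale the association measure equals the effect measure exactly when the last term vanishes (equals 0 on the difference scale, 1 on the ratio scale), that is, when µA0 = µB0.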

The preceding counterfactual approach to confounding gradually emerged through attempts to separate effect measures into a component due to the effect of interest and a component due to extraneous effects. One noteworthy aspect of this approach is that confounding depends on the outcome measure. For example, suppose populations A and B have a different five-year survival probability µ under placebo treatment x0; that is, suppose µB0 ≠ µA0, so that µA1 − µB0 is confounded for the actual effect µA1 − µA0 of treatment on five-year survival. It is then still possible that ten-year survival, µ, under the placebo would be identical in both populations; that is, µA0 could equal µB0, so that µA1 − µB0 is not confounded for the actual effect of treatment on ten-year survival. (We should generally expect no confounding for 200-year survival, since no treatment is likely to raise the 200-year survival probability of human patients above zero.)

A second noteworthy point is that confounding depends on the target population of inference. The preceding example, with A as the target, had different five-year survivals µA0 and µB0 for A and B under placebo therapy, and hence µA1 − µB0 was confounded for the effect µA1 − µA0 of treatment on population A. A lawyer or ethicist may also be interested in what effect the hormone treatment would have had on population B. Writing µB1 for the (unobserved) outcome of B under treatment, this effect on B may be measured by µB1 − µB0. Substituting µA1 for the unobserved µB1 yields µA1 − µB0. This measure of association is confounded for µB1 − µB0 (the effect of treatment x1 on five-year survival in population B) if and only if µA1 ≠ µB1. Thus, the same measure of association, µA1 − µB0, may be confounded for the effect of treatment on neither, one, or both of populations A and B, and may or may not be confounded for the effect of treatment on other targets.
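The dependence on the target population can be seen in a small sketch with invented numbers. All four potential outcomes are written down here only for illustration; in practice two of them (A under x0 and B under x1) are never observed. The helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical five-year survival probabilities (illustrative values only).
mu = {
    ("A", 1): Fraction(7, 10),  # A under x1 (observed)
    ("A", 0): Fraction(1, 2),   # A under x0 (counterfactual)
    ("B", 1): Fraction(7, 10),  # B under x1 (counterfactual)
    ("B", 0): Fraction(3, 5),   # B under x0 (observed)
}

# The only contrast that can be computed from observed quantities.
association = mu[("A", 1)] - mu[("B", 0)]


def effect_on(target):
    """Effect of x1 relative to x0 in the named target population."""
    return mu[(target, 1)] - mu[(target, 0)]


def confounded_for(target):
    """The association is confounded for a target when it differs from that target's effect."""
    return association != effect_on(target)


for target in ("A", "B"):
    print(target, "effect:", effect_on(target), "confounded:", confounded_for(target))
```

In this assumed setup µA0 ≠ µB0 but µA1 = µB1, so the same association measure is confounded for the effect on A and unconfounded for the effect on B.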


A third noteworthy aspect of the counterfactual formalization of confounding is that it invokes no explicit differences (imbalances) between populations A and B with respect to circumstances or covariates that might influence µ. Clearly, if µA0 and µB0 differ, then A and B must differ with respect to factors that influence µ. This observation has led some authors to define confounding as the presence of such covariate differences between the compared populations. Nonetheless, confounding is only a consequence of these covariate differences. In fact, A and B may differ profoundly with respect to covariates that influence µ, and yet confounding may be absent. In other words, a covariate difference between A and B is a necessary but not sufficient condition for confounding. This is because the impacts of covariate differences may balance each other out, leaving no confounding.
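A minimal sketch of such balancing, under assumptions that are not in the text: two independent binary covariates z1 and z2, an invented rule for the outcome under x0 in which z1 raises µ and z2 lowers it by the same amount, and the same stratum-specific outcomes in both populations.

```python
from fractions import Fraction
from itertools import product


def mu0_in_stratum(z1, z2):
    """Hypothetical outcome under x0 within a covariate stratum."""
    return Fraction(1, 2) + Fraction(1, 5) * z1 - Fraction(1, 5) * z2


def population_mu0(p_z1, p_z2):
    """Average the stratum outcomes over a population's covariate distribution.

    p_z1 and p_z2 are the prevalences of z1 = 1 and z2 = 1, assumed independent.
    """
    total = Fraction(0)
    for z1, z2 in product((0, 1), repeat=2):
        weight_1 = p_z1 if z1 else 1 - p_z1
        weight_2 = p_z2 if z2 else 1 - p_z2
        total += weight_1 * weight_2 * mu0_in_stratum(z1, z2)
    return total


# A and B differ on both covariates, yet the differences cancel.
mu_A0 = population_mu0(Fraction(1, 2), Fraction(1, 2))
mu_B0 = population_mu0(Fraction(4, 5), Fraction(4, 5))

# A population that differs from A on z1 only does not cancel.
mu_C0 = population_mu0(Fraction(4, 5), Fraction(1, 2))

print("mu_A0 =", mu_A0, "mu_B0 =", mu_B0, "mu_C0 =", mu_C0)
```

Here B is imbalanced relative to A on both covariates but µA0 = µB0, so there is no confounding, whereas the hypothetical population C, imbalanced on one covariate only, would yield a confounded comparison.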

Suppose now that populations A and B differ with respect to certain covariates, and that these differences have led to confounding of an association measure for the effect measure of interest. The responsible covariates are then termed confounders of the association measure. In the above example, with µA1 − µB0 confounded for the effect µA1 − µA0, the factors responsible for the confounding (i.e., the factors that led to µA0 ≠ µB0) are the confounders. It can be deduced that a variable cannot be a confounder unless it can affect the outcome parameter µ within treatment groups and it is distributed differently among the compared populations. These two necessary conditions are sometimes offered together as a definition of a confounder. Nonetheless, counterexamples show that the two conditions are not sufficient for a variable with more than two levels to be a confounder.


Perhaps the most obvious way to avoid confounding in estimating µA1 − µA0 is to obtain a reference population B for which µB0 is known to equal µA0. Among epidemiologists, such a population is sometimes said to be comparable to or exchangeable with A with respect to the outcome under the reference treatment. In practice, such a population may be difficult or impossible to find. Thus, an investigator may attempt to construct such a population, or to construct exchangeable index and reference populations. These constructions may be viewed as design-based methods for the control of confounding.

Perhaps no approach is more effective for preventing confounding by a known factor than restriction. For example, gender imbalances cannot confound a study restricted to women. However, there are several drawbacks: Restriction on enough factors can reduce the number of available subjects to unacceptably low levels and may greatly reduce the generalizability of results as well. Matching the treatment populations on confounders overcomes these drawbacks, and, if successful, can be as effective as restriction. For example, gender imbalances cannot confound a study in which the compared groups have identical proportions of women. Unfortunately, differential losses to observation may undo the initial covariate balances produced by matching.

Neither restriction nor matching prevents (although it may diminish) imbalances on unrestricted, unmatched, or unmeasured covariates. In contrast, randomization offers a means of dealing with confounding by covariates not accounted for by the design. It must be emphasized, however, that this solution is only probabilistic and subject to severe constraints in practice. Randomization is not always feasible or ethical, and (as mentioned earlier) many practical problems, such as differential loss and noncompliance, can lead to confounding in comparisons of the groups actually receiving treatments x1 and x0. One somewhat controversial solution to noncompliance problems is intent-to-treat analysis, which defines the comparison groups A and B by treatment assigned rather than treatment received. Confounding may, however, affect even intent-to-treat analyses, and (contrary to widespread misperceptions) the bias in those analyses can be away from the null (exaggerating an effect). For example, the assignments may not always be random, as when blinding is insufficient to prevent the treatment providers from protocol violations. And, purely by bad luck, randomization may itself produce allocations with severe covariate imbalances between the groups (and consequent confounding), especially if the study size is small. Blocked (matched) randomization can help ensure that random imbalances on the blocking factors will not occur, but it does not guarantee balance of unblocked factors.
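The point about bad luck in small trials can be made concrete with an exact calculation. The sketch below assumes simple randomization of an even number of subjects into two equal arms and a single binary covariate (for instance, gender); the function name and the example sizes are hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_imbalance(n_per_arm, n_with_trait, min_gap):
    """Exact probability that the trait proportions in the two arms differ
    by at least min_gap when 2 * n_per_arm subjects, n_with_trait of whom
    have the trait, are split at random into two equal arms."""
    total = 2 * n_per_arm
    allocations = comb(total, n_per_arm)
    prob = Fraction(0)
    for k in range(n_per_arm + 1):      # k = trait carriers in the x1 arm
        rest = n_with_trait - k         # trait carriers in the x0 arm
        if rest < 0 or rest > n_per_arm:
            continue                    # impossible split
        gap = Fraction(abs(k - rest), n_per_arm)
        if gap >= min_gap:
            ways = comb(n_with_trait, k) * comb(total - n_with_trait, n_per_arm - k)
            prob += Fraction(ways, allocations)
    return prob


# Chance that the arms differ by 50 percentage points or more on the trait.
small_trial = prob_imbalance(4, 4, Fraction(1, 2))    # 8 subjects, 4 with the trait
large_trial = prob_imbalance(40, 40, Fraction(1, 2))  # 80 subjects, 40 with the trait

print("small trial:", small_trial)
print("large trial:", float(large_trial))
```

Under these assumptions a gap of that size arises in roughly half of all allocations of the eight-subject trial but is very unlikely in the eighty-subject trial, which is the sense in which the protection offered by randomization is probabilistic.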


Design-based methods are often infeasible or insufficient to prevent confounding. Thus, there has been an enormous amount of work devoted to analytic adjustments for confounding. With a few exceptions, these methods are based on observed covariate distributions in the compared populations. Such methods can successfully control confounding only to the extent that enough confounders are adequately measured. Then, too, many methods employ parametric models at some stage, and their success may thus depend on the faithfulness of the model to reality. These issues cannot be covered in depth here, but a few basic points are worth noting.

The simplest and most widely trusted methods of adjustment begin with stratification on confounders. A covariate cannot be responsible for confounding within internally homogeneous strata of the covariate. For example, gender imbalances cannot confound observations within a stratum composed solely of women. More generally, comparisons within strata cannot be confounded by a covariate that is unassociated with treatment within strata. This is so regardless of whether the covariate was used to define the strata. Thus, one need not stratify on all confounders in order to control confounding. Furthermore, if one has accurate background information on relations among the confounders, one may use this information to identify sets of covariates sufficient for control of confounding.
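One stratification-based adjustment is standardization to the covariate distribution of the target population A, sketched below. The numbers are invented, the outcome is taken to be a risk, and the sketch assumes that gender is the only confounder, so that within each gender stratum the outcome of B under x0 can stand in for that of A. Standardization is only one of several adjustment methods and is not singled out by the text.

```python
from fractions import Fraction

# Hypothetical strata: shares of each population and stratum-specific risks.
strata = {
    "women": {
        "share_A": Fraction(1, 5), "share_B": Fraction(4, 5),
        "risk_A_x1": Fraction(3, 10), "risk_B_x0": Fraction(1, 5),
    },
    "men": {
        "share_A": Fraction(4, 5), "share_B": Fraction(1, 5),
        "risk_A_x1": Fraction(3, 5), "risk_B_x0": Fraction(1, 2),
    },
}


def crude_difference(strata):
    """Unadjusted contrast: overall risk in A under x1 minus overall risk in B under x0."""
    mu_A1 = sum(s["share_A"] * s["risk_A_x1"] for s in strata.values())
    mu_B0 = sum(s["share_B"] * s["risk_B_x0"] for s in strata.values())
    return mu_A1 - mu_B0


def standardized_difference(strata):
    """Stratum-specific differences averaged over A's covariate distribution."""
    return sum(
        s["share_A"] * (s["risk_A_x1"] - s["risk_B_x0"]) for s in strata.values()
    )


print("crude difference:", crude_difference(strata))
print("standardized to A:", standardized_difference(strata))
```

In this assumed example the risk difference is 1/10 within each gender stratum, while the crude difference is much larger because A contains proportionally more men, who have higher risk under either treatment.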

Some controversy has occurred about adjustment for covariates in randomized trials. Although Fisher asserted that randomized comparisons were "unbiased," he also pointed out that they could be confounded in the sense used here. Resolution comes from noting that Fisher's use of the word unbiased referred to the design and was not meant to guide analysis of a given trial. Once the trial is underway and the actual treatment allocation is completed, the unadjusted treatment-effect estimate will be biased if a covariate that affects the outcome is associated with treatment, and this bias can be removed by adjustment for that covariate.

Sander Greenland

(see also: Bias)


Bross, I. D. J. (1967). "Pertinency of an Extraneous Variable." Journal of Chronic Diseases 20:487–495.

Clayton, D., and Hills, M. (1993). Statistical Models in Epidemiology. New York: Oxford University Press.

Fisher, R. A. (1935). The Design of Experiments. Edinburgh: Oliver & Boyd.

Greenland, S., and Robins, J. M. (1986). "Identifiability, Exchangeability, and Epidemiological Confounding." International Journal of Epidemiology 15:413–419.

Greenland, S.; Robins, J. M.; and Pearl, J. (1999). "Confounding and Collapsibility in Causal Inference." Statistical Science 14:29–46.

Greenland, S., and Rothman, K. J. (1998). "Measures of Effect and Measures of Association." Modern Epidemiology, 2nd edition, eds. K. J. Rothman and S. Greenland. Philadelphia: Lippincott.

Groves, E. R., and Ogburn, W. F. (1928). American Marriage and Family Relationships. New York: Henry Holt.

Kitagawa, E. M. (1955). "Components of a Difference between Two Rates." Journal of the American Statistical Association 50:1168–1194.

Miettinen, O. S. (1972). "Components of the Crude Risk Ratio." American Journal of Epidemiology 96:168–172.

Mill, J. S. (1843). A System of Logic, Ratiocinative and Inductive. London: Longmans Green.

Pearl, J. (2000). Causality. New York: Cambridge University Press.

Robins, J. M. (1998). "Correction for Non-Compliance in Equivalence Trials." Statistics in Medicine 17:269–302.

Rothman, K. J. (1977). "Epidemiologic Methods in Clinical Trials." Cancer 39:1771–1775.

Yule, G. U. (1903). "Notes on the Theory of Association of Attributes in Statistics." Biometrika 2:121–134.
