The American educational psychologist Gene V. Glass (1976) coined the term meta-analysis for a method of statistically combining the results of multiple studies to arrive at a quantitative conclusion about a body of literature. The practice predates the name: the English statistician Karl Pearson (1857–1936) conducted what is believed to be one of the first statistical syntheses of results from a collection of studies when he gathered data from eleven studies on the effect of a vaccine against typhoid fever (1904). For each study, Pearson calculated a then-new statistic, the correlation coefficient. He then averaged the correlations across the studies and concluded that other vaccines were more effective than the new one.
Early work on quantitative procedures for integrating the results of independent studies was ignored for decades. Instead, scholars typically carried out narrative reviews. These reviews often involved short summaries of studies considered “relevant,” with the operational definition of that term usually based on arbitrary and unspecified criteria. Scholars would then arrive at an impressionistic conclusion about the overall findings of those studies, based largely on the statistical significance of the findings in the individual studies. This latter aspect of narrative reviews is especially problematic. Because the findings of individual studies are based on samples, they are expected to vary from one another even if the studies are estimating the same underlying population parameter. Often, however, scholars misinterpreted this expected sampling variability as evidence that the results of studies were “mixed” and therefore inconclusive. In addition, scholars generally did not account for the power of the statistical tests in the studies they reviewed. In the social sciences, statistical power is often low, resulting in an unacceptably high rate of Type II errors (i.e., failures to reject a false null hypothesis). In a collection of studies with typical statistical power, ignoring power can make it appear that an intervention has no effect even when it does.
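The point about statistical power can be made concrete with a rough calculation. The sketch below is only an illustration under stated assumptions: it uses a normal approximation to a two-sided, two-group test, and the effect size (0.3 standard deviations) and sample size (50 per group) are hypothetical values chosen for the example, not figures from any study discussed here.

```python
from statistics import NormalDist

def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    d is the true standardized mean difference (hypothetical input);
    the normal approximation is an assumption of this sketch.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is d
    shift = d * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Hypothetical literature: every study has 50 participants per group
# and the intervention truly works (d = 0.3).
power = two_sample_power(d=0.3, n_per_group=50)
print(f"Share of studies expected to reach significance: {power:.2f}")
```

Under these assumptions only about a third of the studies would be expected to reject the null hypothesis, so a reviewer who simply counts significant results could wrongly conclude that the evidence is "mixed" even though the effect is real.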
Further, narrative reviews were usually not conducted with the same level of explicitness as is required in primary studies. For example, scholars were rarely explicit about either the decision rules invoked in a review or the pattern of results across studies that would be used to accept or reject a hypothesis. In addition, narrative literature reviews typically could not impart a defensible sense of the magnitude of the overall relations they investigated, nor did they adequately account for potential factors that might influence the direction or magnitude of the relations.
The explosion of research in the social and medical sciences that occurred in the 1960s and 1970s created conditions that highlighted another of the difficulties associated with narrative reviews, specifically that it is virtually impossible for humans to make sense out of a body of literature when it is large. Robert Rosenthal and Donald Rubin (1978), for example, were interested in research on the effects of interpersonal expectancies on behavior and found over three hundred studies. Glass and Mary L. Smith (1979) found over seven hundred estimates of the relation between class size and academic achievement. Jack Hunter and his colleagues (1979) uncovered more than eight hundred comparisons of the differential validity of employment tests. It is not reasonable to expect that these scholars could have examined all of the evidence they uncovered and made decisions about what that evidence said in an unbiased and efficient manner.
Meta-analysis addresses the problem of how to combine and weight evidence from independent studies by relying on effect sizes and on statistical procedures for weighting research evidence. Effect sizes are statistics that express how much impact an intervention had (e.g., the average increase on an achievement test), and as such, they give reviewers a way to quantify both the findings from individual studies and the findings aggregated across a body of studies. Effect sizes can also be standardized to allow for comparisons between measures of the same outcome with different scaling properties (e.g., two different math achievement tests).
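As an illustration of standardization, the following sketch computes a standardized mean difference (the difference in group means divided by the pooled standard deviation) for two hypothetical math tests scored on different scales. The function name and all numbers are assumptions made up for this example.

```python
from math import sqrt

def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Difference in means divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / sqrt(pooled_var)

# Hypothetical test A: scores on a scale with a standard deviation of 100
d_a = standardized_mean_difference(520, 500, 100, 100, 40, 40)

# Hypothetical test B: scores on a scale with a standard deviation of 15
d_b = standardized_mean_difference(53, 50, 15, 15, 40, 40)

print(d_a, d_b)
```

The raw gains (20 points and 3 points) are not directly comparable, but both correspond to the same standardized effect of 0.2 standard deviations.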
Effect sizes with known distribution properties can be weighted to arrive at a better estimate of the population parameter when they are combined. The most common method involves weighting each effect size by the inverse of its squared standard error. Using this method, larger studies contribute more to the analysis than do smaller studies. For example, a study with one hundred participants contributes more to the overall analysis than a study with ten participants. The rationale for this procedure is that the study with one hundred participants estimates the population effect more precisely (i.e., with less random variability) and therefore provides a better estimate of that population effect.
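The weighting rule can be sketched for the two studies mentioned above. The standard errors here are an assumption: they come from a rough large-sample approximation for a small standardized mean difference with equal group sizes, in which the squared standard error is about 4 divided by the total sample size.

```python
from math import sqrt

def inverse_variance_weights(standard_errors):
    """Weight each study by the inverse of its squared standard error."""
    return [1.0 / se**2 for se in standard_errors]

# Assumed approximation: SE is roughly sqrt(4 / N) for total sample size N
se_large = sqrt(4 / 100)   # study with one hundred participants
se_small = sqrt(4 / 10)    # study with ten participants

weights = inverse_variance_weights([se_large, se_small])
shares = [w / sum(weights) for w in weights]
print(weights, shares)
```

Under this approximation the larger study receives ten times the weight of the smaller one, or roughly 91 percent of the total weight.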
To carry out a meta-analysis, the reviewer computes a weighted average of the effect sizes. A confidence interval can be placed around the weighted average effect size; if the confidence interval does not include zero, then the null hypothesis that the population effect size is zero can be rejected at the given level of confidence. In addition, scholars usually conduct a statistical test to assess whether the observed effect sizes could plausibly have been drawn from the same population. If the null hypothesis of effect size homogeneity is rejected, then the reviewer has reasonable cause to conduct follow-up analyses that attempt to localize the sources of variation. These analyses are generally correlational and attempt to relate variability in study outcomes to characteristics of interventions (e.g., intensity), as well as to study design, sampling, and measurement characteristics.
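A minimal sketch of these steps follows, assuming a fixed-effect model with inverse-variance weights and using a Q statistic (the weighted sum of squared deviations from the pooled estimate) as the homogeneity test. The text does not name a specific test, so Q is an assumption here, and the four effect sizes and standard errors are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def fixed_effect_summary(effects, standard_errors, confidence=0.95):
    """Return the weighted mean, its confidence interval, Q, and Q's degrees of freedom."""
    weights = [1.0 / se**2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    se_pooled = sqrt(1.0 / total_weight)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    ci = (pooled - z_crit * se_pooled, pooled + z_crit * se_pooled)
    # Q: weighted squared deviations of each study from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, ci, q, len(effects) - 1

# Hypothetical effect sizes and standard errors from four studies
effects = [0.10, 0.30, 0.35, 0.25]
standard_errors = [0.10, 0.15, 0.20, 0.10]

pooled, ci, q, df = fixed_effect_summary(effects, standard_errors)
print(f"pooled = {pooled:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f}), Q = {q:.2f}, df = {df}")
```

Under the null hypothesis of homogeneity, Q is compared with a chi-square distribution with df degrees of freedom. With three degrees of freedom the 5 percent critical value is about 7.81, so for these made-up data homogeneity would not be rejected, while the confidence interval excludes zero.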
Finally, representing study results as weighted effect sizes allows scholars to test whether publication bias has plausibly affected the results of a review. Publication bias is the tendency for studies lacking a statistically significant effect not to appear in the published literature. Such studies are therefore more difficult to uncover during a literature search. All else being equal, studies that do not result in a rejection of the null hypothesis have smaller effects than those that do reject the null hypothesis. As such, failing to locate these studies at a rate similar to that of published studies means that the overall estimate arising from a body of studies might be positively biased. Several statistical methods (e.g., the trim-and-fill analysis) are available to help scholars assess the potential impact of publication bias on their conclusions.
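The mechanism behind this bias can be illustrated with a small sketch. This is not the trim-and-fill method; it simply compares the weighted average of a hypothetical set of studies with the weighted average of the subset that reached significance at the 5 percent level. All effect sizes and standard errors below are invented for the example.

```python
def weighted_mean(studies):
    """Inverse-variance weighted mean of (effect, standard_error) pairs."""
    weights = [1.0 / se**2 for _, se in studies]
    return sum(w * y for w, (y, _) in zip(weights, studies)) / sum(weights)

# Hypothetical (effect size, standard error) pairs for seven studies
all_studies = [
    (0.05, 0.20), (0.45, 0.20), (0.10, 0.10), (0.25, 0.10),
    (-0.05, 0.25), (0.60, 0.25), (0.15, 0.08),
]

# Assume only studies significant at the two-sided 5 percent level get published
published = [(y, se) for y, se in all_studies if abs(y / se) > 1.96]

all_mean = weighted_mean(all_studies)
published_mean = weighted_mean(published)
print(f"all studies: {all_mean:.2f}, published only: {published_mean:.2f}")
```

In this made-up example, a reviewer who found only the significant studies would estimate a noticeably larger effect than one who found all seven.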
SEE ALSO Methods, Quantitative
Cooper, Harris, and Larry V. Hedges, eds. 1994. Handbook of Research Synthesis. New York: Russell Sage Foundation.
Glass, Gene V. 1976. Primary, Secondary, and Meta-analysis of Research. Educational Researcher 5: 3–8.
Glass, Gene V., and Mary L. Smith. 1979. Meta-analysis of Research on Class Size and Achievement. Educational Evaluation and Policy Analysis 1: 2–16.
Hunter, Jack E., Frank L. Schmidt, and R. Hunter. 1979. Differential Validity of Employment Tests by Race: A Comprehensive Review and Analysis. Psychological Bulletin 86: 721–735.
Lipsey, Mark W., and David B. Wilson. 2001. Practical Meta-analysis. Thousand Oaks, CA: Sage.
Pearson, Karl. 1904. Report on Certain Enteric Fever Inoculation Statistics. British Medical Journal 3: 1243–1246.
Rosenthal, Robert, and Donald Rubin. 1978. Interpersonal Expectancy Effects: The First 345 Studies. Behavioral and Brain Sciences 3: 377–415.
Jeffrey C. Valentine
"Meta-Analysis." International Encyclopedia of the Social Sciences. Encyclopedia.com. (August 18, 2017). http://www.encyclopedia.com/social-sciences/applied-and-social-sciences-magazines/meta-analysis
Meta-analysis is the statistical synthesis of the data from a set of comparable studies of a problem, and it yields a quantitative summary of the pooled results. It is the process of aggregating the data and results of a set of studies, preferably as many as possible that have used the same or similar methods and procedures; reanalyzing the data from all these combined studies; and thereby generating larger numbers and more stable rates and proportions for statistical analysis and significance testing than can be achieved by any single study. The process is widely used in the biomedical sciences, especially in epidemiology and in clinical trials. In these applications, meta-analysis is defined as the systematic, organized, and structured evaluation of a problem of interest. The essence of the process is the use of statistical tables or similar data from previously published peer-reviewed and independently conducted studies of a particular problem. It is most commonly used to assemble the findings from a series of randomized controlled trials, none of which on its own would necessarily have sufficient statistical power to demonstrate statistically significant findings. The aggregated results, however, are capable of generating meaningful and statistically significant results.
There are some essential prerequisites for meta-analysis to be valid. Qualitatively, all studies included in a meta-analysis must fulfill predetermined criteria. All must have used essentially the same or closely comparable methods and procedures; the populations studied must be comparable; and the data must be complete and free of biases, such as those due to selection or exclusion criteria. Quantitatively, the raw data from all studies are usually reanalyzed, partly to verify the original findings from these studies, and partly to provide a database for summative analysis of the entire set of data. All eligible studies must be included in the meta-analysis. If a conscious decision is made to exclude some, there is always a suspicion that this was done in order to achieve a desired result. If a pharmaceutical or other commercial organization conducts a meta-analysis of studies aimed at showing its product in a favorable light, then the results will be suspect unless evidence is provided of unbiased selection. One criterion for selection is prior publication in a peer-reviewed medical journal, but there are good arguments in favor of including well-conducted unpublished studies under some circumstances.
A variation of the concept is a systematic review, defined as the application of strategies that limit bias in the assembly, critical appraisal, and synthesis of all relevant studies of a specific topic. Meta-analysis may be, but is not necessarily, used as part of this process. Systematic reviews are conducted on peer-reviewed publications dealing with a particular health problem and use rigorous, standardized methods for the selection and assessment of these publications. A systematic review can be conducted on observational (case-control or cohort) studies as well as on randomized controlled trials.
John M. Last
(see also: Epidemiology; Observational Studies; Statistics for Public Health)
Dickersin, K., and Berlin, J. A. (1992). "Meta-Analysis: State of the Science." Epidemiologic Reviews 14:154–176.
Petitti, D. B. (2000). Meta-Analysis, Decision Analysis and Cost Effectiveness Analysis in Medicine, 2nd edition. New York: Oxford University Press.
"Meta-Analysis." Encyclopedia of Public Health. Encyclopedia.com. (August 18, 2017). http://www.encyclopedia.com/education/encyclopedias-almanacs-transcripts-and-maps/meta-analysis