Bayesian statistics is concerned with the relationships among conditional and unconditional probabilities. Suppose the sampling space is a bag filled with twenty black and eighty white balls. The probability of a white ball being drawn at random is .8, as defined by the relative frequency of such balls. If three more bags with seventy black and thirty white balls each are in play and a ball is drawn at random from one of the four bags, the probability of it being white is .8 · .25 + .3 · .75 = .425. Once a white ball is in evidence, the probability that it was drawn from the bag containing mostly white balls is larger than .25, and the probability that it was drawn from a bag containing mostly black balls is less than .75. The estimation of these inverse probabilities is the object of Bayes’s theorem.
Let the idea that the obtained white ball came from the bag containing mostly white balls be H, for “hypothesis,” and the idea that the ball came from a bag containing mostly black balls be ~H; let the drawing of a white ball be E, for “evidence.” Bayes’s theorem states that p(HǀE) = p(H) · p(EǀH) / p(E), here .25 · .8 / .425 = .471. The ratio of p(HǀE) to p(H) expresses the degree to which the probability of H changes in light of the evidence. This degree of probability change can be seen before the posterior probability of H is calculated because p(HǀE) / p(H) is equal to the ratio p(EǀH) / p(E), here .8 / .425 = 1.882.
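The numbers in the bag example can be checked with a few lines of arithmetic. The following is a minimal Python sketch; the variable names are illustrative and not part of the original text.

```python
# Bag example: one bag is mostly white (H), three bags are mostly black (~H).
p_h = 0.25              # prior p(H): 1 of 4 bags
p_not_h = 0.75          # prior p(~H): 3 of 4 bags
p_e_given_h = 0.8       # p(E|H): 80 of 100 balls are white
p_e_given_not_h = 0.3   # p(E|~H): 30 of 100 balls are white

# Total probability of drawing a white ball from a randomly chosen bag.
p_e = p_e_given_h * p_h + p_e_given_not_h * p_not_h

# Bayes's theorem: posterior probability of H once a white ball is in evidence.
p_h_given_e = p_h * p_e_given_h / p_e

# Degree of revision, computed two equivalent ways.
revision_from_posterior = p_h_given_e / p_h
revision_from_likelihood = p_e_given_h / p_e

print(round(p_e, 3))                       # 0.425
print(round(p_h_given_e, 3))               # 0.471
print(round(revision_from_likelihood, 3))  # 1.882
```

The two revision values agree, which is why the change in p(H) can be read off from p(EǀH) and p(E) alone.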
The prior probability of a hypothesis constrains the degree to which it can be changed by evidence. If the evidence supports the hypothesis, the magnitude of the Bayesian revision decreases as the prior probability becomes larger. Consider the odds version of Bayes’s theorem, which is p(HǀE) / p(~HǀE) = [p(EǀH) / p(Eǀ~H)] · [p(H) / p(~H)]. Now, p(HǀE) / p(H) = [p(EǀH) / p(Eǀ~H)] · [p(~HǀE) / p(~H)], which is equal to [p(EǀH) / p(Eǀ~H)] · [p(Eǀ~H) / p(E)], here 2.667 · .706 = 1.882. Note that a larger p(H) reduces the second ratio, and thus reduces the product of the two ratios (where the first ratio > 1 if the evidence supports H). Analogously, a large p(H) leads to a stronger updating if the evidence is contrary to H.
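The dependence of the revision on the prior can be illustrated by holding the likelihoods of the bag example fixed (.8 and .3) and varying p(H). The sketch below is only an illustration: the function name and the grid of priors are assumptions made here, and the posterior is obtained from the standard odds form (posterior odds = likelihood ratio · prior odds).

```python
def revision_ratio(prior, p_e_given_h=0.8, p_e_given_not_h=0.3):
    """Return p(H|E) / p(H), computed through the odds form of Bayes's theorem."""
    likelihood_ratio = p_e_given_h / p_e_given_not_h
    prior_odds = prior / (1 - prior)
    posterior_odds = likelihood_ratio * prior_odds
    posterior = posterior_odds / (1 + posterior_odds)  # convert odds to probability
    return posterior / prior

# Hypothetical grid of priors; .25 is the value used in the bag example.
priors = [0.1, 0.25, 0.5, 0.75, 0.9]
ratios = [revision_ratio(p) for p in priors]

for p, r in zip(priors, ratios):
    print(f"p(H) = {p:.2f}  ->  p(H|E) / p(H) = {r:.3f}")
```

With supportive evidence, every ratio exceeds 1, and the ratios shrink toward 1 as the prior grows.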
The Reverend Thomas Bayes (1702–1761) worked out his eponymous theorem, but his solution was only published two years posthumously (see Stigler 1999). The validity of the theorem is given by its mathematical coherence. Any of its constituent probabilities can be recovered if the others are known. As a model of scientific and of everyday inference, the theorem formalizes inductive reasoning. Scientists seek to corroborate or discredit certain hypotheses, and laypeople (and animals) need to mold their beliefs at least in part with reference to the observations they make. For inductive reasoning, not only the probable truth of certain beliefs is of interest, but also the probability that certain events will recur. In the previous example, the sampling of one white ball not only alters the probability that any particular bag was sampled, it also increases the probability that a white ball will be sampled again (assuming the next draw will be made from the same bag).
In the short run, the revision of the probability of the evidence may not reduce uncertainty. In the present example, where p(E) rises from .425 to .535, one would be slightly less confident in betting on any particular color for the next draw. Over repeated sampling, however, p(E) converges on either p(EǀH) or on p(Eǀ~H), and p(H) converges on 0 or 1. Prior uncertainty is greatest when there are many equiprobable hypotheses. If there were 101 bags, each with a different proportion of white balls, p(E) = .5. Pierre-Simon Laplace’s (1749–1827) rule of succession states that once a sample is drawn, the probability that the next draw (after replacement) will replicate the result is (k + 1) / (n + 2), where k is the number of successes and n is the sample size. For an infinite number of hypotheses, this rule is obtained with integral calculus.
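Both the revised p(E) of the four-bag example and the 101-bag case can be checked numerically. The sketch below assumes that the 101 bags hold 0, 1, ..., 100 white balls out of 100 and are equally likely to be chosen; the helper names are illustrative. For a single white ball (k = 1, n = 1), the discrete result lies close to the 2/3 given by the rule of succession, which strictly holds for a continuum of hypotheses.

```python
def update(priors, likelihoods):
    """One Bayesian update: return the posteriors and the predictive p(E)."""
    p_e = sum(p * l for p, l in zip(priors, likelihoods))
    posteriors = [p * l / p_e for p, l in zip(priors, likelihoods)]
    return posteriors, p_e

# Four-bag example: hypotheses are H (mostly white) and ~H (mostly black).
priors = [0.25, 0.75]
p_white = [0.8, 0.3]
posteriors, p_e_before = update(priors, p_white)  # first white ball observed
_, p_e_after = update(posteriors, p_white)        # p(white) on the next draw
print(round(p_e_before, 3), round(p_e_after, 3))  # 0.425 0.535

# Assumed setup: 101 equiprobable bags with 0, 1, ..., 100 white balls of 100.
bag_priors = [1 / 101] * 101
bag_p_white = [i / 100 for i in range(101)]
bag_posteriors, bag_p_e = update(bag_priors, bag_p_white)
_, bag_p_e_next = update(bag_posteriors, bag_p_white)

def rule_of_succession(k, n):
    """Laplace's rule: probability that the next draw is a success."""
    return (k + 1) / (n + 2)

print(round(bag_p_e, 3))                   # 0.5
print(round(bag_p_e_next, 3))              # 0.67
print(round(rule_of_succession(1, 1), 3))  # 0.667
```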
Bayesian alternatives to conventional hypothesis testing, confidence-interval estimation, meta-analysis, and regression are available, though computationally cumbersome. In practice, most researchers remain committed to orthodox methods that exclude prior knowledge. Fisherian null hypothesis significance testing, for example, yields the probability of the evidence under the null hypothesis, p(EǀH). What the researcher really wants, namely p(HǀE), cannot be estimated because, in the absence of p(Eǀ~H), the likelihood ratio remains undefined. If, however, the researcher specifies ~H (as in the Neyman-Pearson approach) and assigns a probability to it, p(HǀE) can be quantified. Indeed, prior probabilities can be represented as a distribution over possible outcomes. The mean of the posterior distribution is given by the weighted average of the prior mean and the empirical mean of the data, where the weights depend on the relative precisions (i.e., the reciprocals of the variances) of the prior mean and the mean of the data. Likewise, the standard deviation of the posterior distribution becomes smaller as the precision of the prior distribution or the distribution of the data increases (see Howard et al. 2000 for formulas and a numerical example).
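The precision-weighted average can be sketched as follows. This is a minimal illustration that assumes a normal prior and a normally distributed sample mean with known variance; the function name and all numbers are hypothetical and are not taken from Howard et al.

```python
def posterior_normal(prior_mean, prior_var, data_mean, data_mean_var):
    """Combine a normal prior with a normal sample mean of known variance.

    Precisions are reciprocals of variances. The posterior mean is the
    precision-weighted average of the two means, and the posterior
    precision is the sum of the two precisions.
    """
    prior_precision = 1 / prior_var
    data_precision = 1 / data_mean_var
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * data_mean) / post_precision
    post_sd = (1 / post_precision) ** 0.5
    return post_mean, post_sd

# Hypothetical values: prior mean 100 (variance 25), sample mean 110.
# Equal precisions: the posterior mean falls halfway between the two means.
mean_equal, sd_equal = posterior_normal(100, 25, 110, 25)

# More precise data (variance of the mean 6.25): the posterior mean moves
# toward the data, and the posterior standard deviation shrinks.
mean_sharp, sd_sharp = posterior_normal(100, 25, 110, 6.25)

print(round(mean_equal, 3), round(sd_equal, 3))
print(round(mean_sharp, 3), round(sd_sharp, 3))
```

In both cases the posterior standard deviation is smaller than the prior standard deviation of 5, as the paragraph above describes.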
Despite their reluctance to use Bayesian statistics for data analysis, many social and cognitive psychologists model the reasoning processes of their research participants along Bayesian lines (Krueger and Funder 2004). Any reasoning activities involving decisions, categorizations, or choices are natural candidates. Given some probative evidence, people need to decide, for example, if a person is male or female, guilty or innocent, healthy or sick. Likewise, they need to decide whether they should attribute a person’s behavior to dispositional or situational causes, and how much they should yield to a persuasive message. Even strategic choices between cooperation and defection in social dilemmas depend on what people assume others will do, given their own presumed choices.
The question of whether everyday reasoning satisfies Bayesian coherence remains controversial. In some contexts, such as jury deliberations, people appear to form their beliefs on the basis of narrative, not probabilistic, coherence. In other contexts, such as the Monty Hall problem, they fail to see how Bayes’s theorem can be readily applied. These difficulties can partly be overcome by altering the presentation of the problem. For example, diagnostic decisions in medicine are improved when the data are presented as frequencies instead of probabilities.
Many orthodox significance testers, who disavow the estimation of inverse probabilities, reveal implicit Bayesianism in their research practice. After a series of successful experiments, the probability of the null hypothesis being true becomes very small, and reasonable researchers desist from wasting further resources. The evidence of the past becomes the theory of the present, thus blurring the distinction between the two. Other hypotheses, such as the idea that a concerted mental concentration of a collective of people can alter the earth’s magnetic field, are so improbable a priori that even devout Fisherians would not consider testing them.
SEE ALSO Prediction; Probability; Psychometrics; Regression
Howard, George S., Scott E. Maxwell, and Kevin J. Fleming. 2000. The Proof of the Pudding: An Illustration of the Relative Strengths of Null Hypothesis, Meta-analysis, and Bayesian Analysis. Psychological Methods 5: 315–332.
Krueger, Joachim I., and David C. Funder. 2004. Towards a Balanced Social Psychology: Causes, Consequences, and Cures for the Problem-seeking Approach to Social Behavior and Cognition. Behavioral and Brain Sciences 27: 313–376.
Stigler, Stephen M. 1999. Statistics on the Table: The History of Statistical Concepts and Methods. Cambridge, MA: Harvard University Press.
Joachim I. Krueger