Ronald Aylmer Fisher
(b. London, England, 17 February 1890; d. Adelaide, Australia, 29 July 1962),
statistics, evolutionary genetics. For the original article on Fisher see DSB, vol. 5.
Fisher’s monumental influence on mathematical statistics is no greater or lesser than his influence on evolutionary genetics. Indeed, while he was at Rothamsted Experimental Station at Harpenden in Hertfordshire, England, between 1919 and 1933, Fisher not only revolutionized statistics, he helped usher in modern evolutionary theory and the historical period of evolutionary biology denoted the “evolutionary synthesis.” This update deals with Fisher’s contribution to evolutionary genetics, as it has only been since the mid-1980s that precisely what that contribution is has been fully understood.
The Evolutionary Synthesis. With J. B. S. Haldane and Sewall Wright, Fisher originated the field of theoretical population genetics, which synthesized the recently (1900) rediscovered principles of Mendelian heredity with Darwinian natural selection. Among the three, however, Fisher made the greatest contribution to the origins of population genetics. Fisher, of course, published much on that topic, but the three works that establish him as the dominant theorist among his contemporaries are his “The Correlation of Relatives on the Supposition of Mendelian Inheritance” in 1918; his “On the Dominance Ratio” in 1922; and the locus classicus of evolutionary genetics, The Genetical Theory of Natural Selection, in 1930. These three works form one long argument that defends the reconciliation of Mendelian heredity and Darwinian natural selection from the then-pervasive critics by eliminating speculative evolutionary causes or, to use William Provine’s term, constricting, the causes correctly attributable to evolution.
In “On the Dominance Ratio,” Fisher discusses, as he says, “the distribution of the frequency ratio of the allelomorphs of dimorphic factors, and the conditions under which the variance of the population may be maintained” (1922, p. 322). He saw this paper as linked to the earlier “Correlation of Relatives on the Supposition of Mendelian Inheritance.” In broad brush strokes, this means that where the 1918 paper defended the principles of Mendelian heredity against the criticisms of the biometricians (and in fact showed the two schemes to be compatible), the 1922 paper continues by carrying through its mathematical methods and concepts as well as defending Darwinism using the principles of Mendelian heredity. Specific to “On the Dominance Ratio,” Fisher’s aim was to respond to a set of criticisms to the effect that Darwinian natural selection cannot be the correct explanation of the modulation of genetic variation in populations because the genetics of populations are such that there is not enough variation available for selection to act upon. In his response, Fisher considered the interaction of natural selection, random survival (genetic drift), assortative mating, and dominance. During the course of the paper, Fisher eliminated from consideration what he took to be insignificant evolutionary factors, such as epistatic gene interaction and genetic drift, and argued that natural selection acted very slowly on mutations of small effect and in the context of large populations maintaining a large amount of genetic variation.
Analysis of Random Drift. Consider drift, or what Fisher referred to variously as random survival, steady decay, or the Hagedoorn effect. The phrase random drift comes from Wright’s landmark paper of 1931, “Evolution in Mendelian Populations.” Notwithstanding Wright’s obvious contributions to the development of the concept and mathematical modeling of drift, it was Fisher who, in his 1922 paper, was the first among the architects of population genetics to explore mathematically the evolutionary consequences of drift in a Mendelian population.
In finite populations, the variation in the number of offspring between individuals may result in random fluctuations in allele frequencies. These random fluctuations affect the chances of survival of a mutant allele in a population. Fisher argued that the survival of a rare mutant depended upon chance and not selection. Indeed, he argued that such a mutation would be more likely to remain at low frequencies in a large rather than in a small population, since in a large population the mutant would have a greater probability of survival. Random fluctuations in allele frequencies also reduce a population’s genetic variation. In The Relative Value of the Processes Causing Evolution (1921), Arend L. and Anna C. Hagedoorn argued that random survival is an important cause of the reduction of genetic variation in natural populations. Fisher argued that the Hagedoorns were mistaken. Fisher determined two key quantities for the situation in which a population is under the influence only of the steady decay of genetic variation, that is, the Hagedoorn effect: the first quantity describes the time course in generations of the Hagedoorn effect; the second describes the “half-life” in generations of the effect. Fisher determined the time course to be 4N (where N is population size) and the half-life to be 2.8N. This means that the Hagedoorn effect requires 4N generations to reduce the genetic variation in the population to the point that all alleles are identical by descent. The “half-way” point is reached in 2.8N generations. (Wright demonstrated in a 1929 letter to Fisher that Fisher’s calculations were too high by a factor of two: the time course in generations is 2N and the half-life of the Hagedoorn effect is 1.4N. In his paper “The Distribution of Gene Ratios for Rare Mutations,” Fisher showed that the correction had only a minor effect on his argument.)
Fisher used these quantities to weight the significance of the effect of steady decay; the longer the time course, the weaker the effect. Given that the time course of the Hagedoorn effect depends on the population size, the larger the population, the weaker, or less significant, the effect. It is evident that as population size increases beyond 10⁴, the time course becomes considerable. Indeed, Fisher says, “As few groups contain less than 10,000 individuals between whom interbreeding takes place, the period required for the action of the Hagedoorn effect, in the entire absence of mutation, is immense” (1922, p. 330). According to Fisher, then, the Hagedoorn effect is evolutionarily insignificant and populations are large.
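The relation between the time course and the half-life can be checked with a few lines of arithmetic. The sketch below assumes that the steady decay is exponential and that the stated time course (4N or 2N) is its time constant; that reading is an assumption, though it reproduces the half-lives quoted above. The function name and the choice N = 10,000 (Fisher's threshold) are illustrative.

```python
import math


def half_life(time_constant: float) -> float:
    """Generations needed for an exponentially decaying quantity to halve."""
    return time_constant * math.log(2)


N = 10_000  # breeding population size; the threshold Fisher mentions

fisher_1922 = half_life(4 * N)  # Fisher's original time course, 4N
wright_1929 = half_life(2 * N)  # Wright's corrected time course, 2N

print(f"Fisher (1922): {fisher_1922 / N:.2f}N = {fisher_1922:,.0f} generations")
print(f"Wright (1929): {wright_1929 / N:.2f}N = {wright_1929:,.0f} generations")
```

Under these assumptions the half-lives come to roughly 2.8N and 1.4N, so even the corrected figure amounts to many thousands of generations for a population of 10,000.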
Fisher’s insights regarding the evolutionary effects of genetic drift reflect his strong Darwinian assumptions, as he (Fisher) says: “a numerous species, with the same frequency of mutation, will maintain a higher variability than will a less numerous species: in connection with this fact we cannot fail to remember the dictum of Charles Darwin, that ‘wide-ranging, much diffused and common species vary most’” (1922, p. 324).
Gene Interaction. In his 1918 paper, Fisher considered the statistical consequences of dominance, epistatic gene interaction, assortative mating, multiple alleles, and linkage on the correlations between relatives. Fisher argued that the effects of dominance and gene interaction would confuse the actual genetic similarity between relatives. He also knew that the environment could confuse such similarity. Fisher here introduced the concept of variance and the analysis of variance to the scientific literature:
When there are two independent causes of variability capable of producing in an otherwise uniform population distributions with standard deviations $\sigma_1$ and $\sigma_2$, it is found that the distribution, when both causes act together, has a standard deviation $\sqrt{\sigma_1^2 + \sigma_2^2}$. It is therefore desirable in analyzing the causes of variability to deal with the square of the standard deviation as the measure of variability. We shall term this quantity the Variance of the normal population to which it refers, and we may now ascribe to the constituent causes fractions or percentages of the total variance which they together produce. (1918, p. 399)
Fisher then used this tool to partition the total variance into its component parts.
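The additivity that makes variance the natural measure can be illustrated numerically. The following is a minimal sketch, not Fisher's own calculation: the two standard deviations are hypothetical values chosen so that the arithmetic is easy to follow, and the function names are illustrative.

```python
import math


def combined_sd(sd1: float, sd2: float) -> float:
    """Standard deviation when two independent causes act together."""
    return math.sqrt(sd1**2 + sd2**2)


def variance_shares(sd1: float, sd2: float) -> tuple[float, float]:
    """Fractions of the total variance ascribed to each independent cause."""
    total = sd1**2 + sd2**2
    return sd1**2 / total, sd2**2 / total


# Hypothetical standard deviations for two independent causes.
sd_a, sd_b = 3.0, 4.0

print(combined_sd(sd_a, sd_b))      # standard deviations do not add: 5.0, not 7.0
print(variance_shares(sd_a, sd_b))  # variances do: shares of 9/25 and 16/25
```

Because the variances add while the standard deviations do not, only the variances can be split into percentages that sum to the whole.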
Fisher labeled that portion of the total variance that accurately described the correlation between relatives the “additive” genetic component of variance. The “nonadditive” genetic component included dominance, gene interaction, and linkage. Environmental effects, such as random changes in environment, comprised a third component of the total variance. In 1922, on the basis of his 1918 work, Fisher argued that the additive component of variance was important for evolution by natural selection. Indeed, he argued that, particularly in large populations (>10⁴), nonadditive and environmental components of the total variance are negligible. He further claimed that selection would remove any factor for which the additive contribution to the total genetic variance is very high and leave those for which the contribution is low. Indeed, Fisher says, “It is therefore to be expected that the large and easily recognized factors in natural organisms will be of little adaptive importance, and that the factors affecting important adaptations will be individually of very slight effect” (1922, p. 334). Ultimately, for Fisher, evolution proceeds very slowly, with low levels of selection acting on mutations of small effect and in large populations holding considerable genetic variation.
The Genetical Theory of Natural Selection. Fisher’s work discussed above and other work on, for example, the evolution of dominance and mimicry, would culminate in his book, The Genetical Theory of Natural Selection (1930), one of the principal texts, along with Haldane’s The Causes of Evolution (1932) and Wright’s “Evolution in Mendelian Populations” (1931) and “The Roles of Mutation” (1932), completing the reconciliation of Darwinism and Mendelism and establishing the field of theoretical population genetics (and, for Fisher, its application to eugenics). The Genetical Theory is celebrated as the locus classicus for the reconciliation of Darwinian natural selection and Mendelian heredity. Remarkably, the book manuscript was produced by Fisher’s dictating to his wife, Ruth, during the evenings. It was revised and reissued in 1958 and most recently in a variorum edition issued in 1999 (edited by J. H. Bennett).
The first seven (of twelve) chapters of The Genetical Theory set out Fisher’s synthesis of Darwin’s mechanism of natural selection and Mendelian genetics. Fisher considered the first two chapters, on the nature of inheritance and the “fundamental theorem of natural selection,” the most important of the book. Indeed, these two chapters accomplish the key piece of the reconciliation. Moreover, the general argument strategy Fisher used in 1918 and 1922, of defending the principles of Mendelian heredity and defending Darwinism under the rubric of Mendelian heredity, is carried through. Fisher’s aim in The Genetical Theory was to establish particulate inheritance against the blending theory and then demonstrate how plausibly Darwinian natural selection may be the principal cause of evolution in Mendelian populations.
Fisher’s first chapter considers implications of a synthesis of natural selection with, alternatively, blending and Mendelian inheritance. He demonstrates that on the Mendelian theory, natural selection may be the main mechanism of a population’s variability. The demonstration importantly resolved a persistent problem for Darwin’s theory of descent with modification, one that had led biologists to abandon natural selection as an evolutionary mechanism: Darwin’s acceptance of blending inheritance required him to imagine special mechanisms controlling mutation because of enormous mutation rates demanded by the blending theory. Because Mendelian heredity did not demand such enormous mutation rates, Fisher was able to eliminate these controlling mechanisms and, so, revive natural selection as an important evolutionary mechanism.
Fisher’s second chapter develops, mathematically, his genetical theory of natural selection. The arguments are drawn largely from his “On the Dominance Ratio” of 1922 and “The Distribution of Gene Ratios for Rare Mutations” of 1930, the response to Wright’s aforementioned correction of Fisher’s 1922 paper. Three key elements may be distilled from Fisher’s “heavy” mathematics in the second chapter of The Genetical Theory. The first is a measure of average population fitness, Fisher’s Malthusian parameter—that is, the reproductive value of all genotypes at all stages of their life histories. The second is a measure of variation in fitness, which Fisher partitions into genetic and environmental components (based on his distinctions from his 1918 and 1922 papers). The third is a measure of the rate of increase in fitness, that is, the change in fitness due to natural selection. For Fisher, “the rate of increase of fitness of any organism at any time is equal to its genetic variance in fitness at that time” (The Genetical Theory of Natural Selection, 1930, p. 37; emphasis in original). This last element is Fisher’s fundamental theorem of natural selection and is the centerpiece of his natural selection theory.
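The verbal statement of the theorem can be illustrated with a deliberately simple case. The sketch below is not Fisher's derivation: it assumes discrete generations and asexual types with fixed fitnesses (so that all variance in fitness is heritable), and the frequencies and fitness values are hypothetical. Under those assumptions the gain in mean fitness over one generation of selection equals the variance in fitness divided by the mean fitness.

```python
def weighted_mean(values, freqs):
    """Mean of `values` weighted by the type frequencies `freqs`."""
    return sum(f * v for f, v in zip(freqs, values))


def select(freqs, fitnesses):
    """One generation of selection among asexual types: p_i' = p_i * w_i / w_bar."""
    w_bar = weighted_mean(fitnesses, freqs)
    return [f * w / w_bar for f, w in zip(freqs, fitnesses)]


# Hypothetical frequencies and fitnesses for three types.
freqs = [0.5, 0.3, 0.2]
fitnesses = [1.0, 1.1, 1.2]

w_bar = weighted_mean(fitnesses, freqs)
variance = weighted_mean([(w - w_bar) ** 2 for w in fitnesses], freqs)

new_freqs = select(freqs, fitnesses)
delta_w_bar = weighted_mean(fitnesses, new_freqs) - w_bar

print(f"change in mean fitness:       {delta_w_bar:.6f}")
print(f"variance / mean fitness:      {variance / w_bar:.6f}")
```

Since a variance cannot be negative, the change attributable to selection in this toy model is never negative, which mirrors the feature of the theorem taken up in the discussion that follows.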
Understanding the Fundamental Theorem. Interestingly, inasmuch as Fisher considered his fundamental theorem the centerpiece of his evolutionary theory, it happens that the theorem is also the most obscure element of it. The theorem was thoroughly misunderstood until 1989, when Warren Ewens, in “An Interpretation and Proof of the Fundamental Theorem of Natural Selection,” rediscovered George Price’s 1972 clarification and proof of it in “Fisher’s ‘Fundamental Theorem’ Made Clear.” Fisher’s original statement of the theorem in 1930 suggests that mean fitness can never decrease because variances cannot be negative. Price showed that in fact the theorem does not describe the total rate of change in fitness but rather only one component of it. That part is the portion of the rate of increase that can be ascribed to natural selection. And, actually, in Fisher’s ensuing discussion of the theorem, he makes this clear. The total rate of change in mean fitness is due to a variety of forces, including natural selection, environmental changes, epistatic gene interaction, dominance, and so forth. The theorem isolates the changes due to natural selection from the rest, a move suggested in Fisher’s 1922 paper. The key change that Price and Ewens make in the statement of the theorem, the change that clarifies it, is to write “additive genetic variance” for “genetic variance” (since “genetic variance” includes both an additive and nonadditive part). With the theorem clarified and proven, Price and later Ewens argue that it is not so fundamental. Given that it is a statement about only a portion of the rate of increase in fitness, it is incomplete. The Price-Ewens interpretation of the theorem is the standard one. However, Anthony Edwards, in his 1994 paper “The Fundamental Theorem of Natural Selection,” argues that Fisher’s isolation of change in the genetic variance due to selection is biologically deep, that is, fundamental, and so the charge of incompleteness is impotent.
Fisher compared both his 1922 and 1930 explorations of the balance of evolutionary factors and the “laws” that describe them to the theory of gases and the second law of thermodynamics, respectively. The received view of these comparisons is that Fisher’s interests in physics and mathematics led him to look for biological analogues. No doubt this is part of the story. However, a more plausible interpretation of the comparison comes from treating Fisher’s major works of 1918, 1922, and 1930 as one long argument. If this is done, one finds that Fisher’s strategy in synthesizing Darwinian natural selection with the principles of Mendelian heredity was to defend, against its critics, selection as an evolutionary mechanism under Mendelian principles. Following this argument strategy, Fisher built his genetical theory of natural selection piecemeal, or from the bottom up. That is, Fisher worked to justify the claim of his fundamental theorem by constructing plausible arguments about the precise balance of evolutionary factors. Thus, his piecemeal consideration of the interaction between dominance, gene interaction, genetic drift, mutation, selection, and so on led to his theorem. It is not, at least not primarily, the search for biological analogues to physical models and laws that underwrites the theorem.
Eugenics. The last five chapters of The Genetical Theory explore natural selection in human populations, particularly social selection in human fertility. According to Fisher, the decline of human civilizations is due in part to the point in economic development where an inversion in fertility of the upper classes to that of the lower classes is reached. Fisher’s central observation, based upon England’s 1911 census data, was that the development of economies in human societies structures the birth-rate so that it is inverted with respect to social class—low birth-rates for the upper class and high birth-rates for the lower class. Families who, for whatever reason, were not capable of producing many children rose in class because of the financial advantage of having few children. In the final chapter of The Genetical Theory, Fisher offers strategies for countering this effect. He proposed the abolishment of the economic advantage of small families by instituting what he called “allowances” to families with larger numbers of children, with the allowances proportional to the earnings of the father. In spite of Fisher’s espousal of eugenics in this part of the book, he means the discussion to be taken as an inseparable extension of the preceding part.
No one has thought that Fisher’s contribution to evolutionary genetics was less than groundbreaking. Rather, precisely what Fisher established, its nature and scope, and exactly how he did so has been less than clear. With Fisher’s work on variance in 1918, his work on the balance of factors in evolution in 1922, and his fundamental theorem of natural selection in 1930, we have a unified argument setting aside pervasive anti-Darwinism, originating a new mathematical approach to the evolution of populations and establishing the very essence of natural selection. All of which are good reasons for the universal approbation of Fisher’s work in evolutionary genetics.
There is a wealth of material on Fisher’s contributions both to statistics and to evolutionary genetics, including unpublished correspondence. See J. H. Bennett's Natural Selection, Heredity, and Eugenics: Including Selected Correspondence of R. A. Fisher with Leonard Darwin and Others (Oxford: Clarendon Press, 1983) and Statistical Inference and Analysis: Selected Correspondence of R. A. Fisher (Oxford: Clarendon Press, 1990). These works contain a bibliography of all of Fisher’s publications. See also The University of Adelaide Digital Library: R. A. Fisher Digital Archive; available from http://digital.library.adelaide.edu.au/coll/special/fisher/.
WORKS BY FISHER
“The Correlation of Relatives on the Supposition of Mendelian Inheritance.” Transactions of the Royal Society of Edinburgh 52 (1918): 399–433.
“On the Dominance Ratio.” Proceedings of the Royal Society of Edinburgh 42 (1922): 321–341.
“The Distribution of Gene Ratios for Rare Mutations.” Proceedings of the Royal Society of Edinburgh 50 (1930): 205–220.
The Genetical Theory of Natural Selection. Oxford: Clarendon Press, 1930. Released in a revised, second edition in 1958 from Dover Publications. Released in a variorum edition by Oxford University Press in 1999, edited by J. H. Bennett.
OTHER SOURCES
Edwards, Anthony W. F. “The Fundamental Theorem of Natural Selection.” Biological Reviews of the Cambridge Philosophical Society 69 (1994): 443–474.
Ewens, Warren J. “An Interpretation and Proof of the Fundamental Theorem of Natural Selection.” Theoretical Population Biology 36 (1989): 167–180.
Hagedoorn, Arend L., and Anna C. Hagedoorn. The Relative Value of the Processes Causing Evolution. The Hague, Netherlands: M. Nijhoff, 1921.
Haldane, John B. S. The Causes of Evolution. London: Longmans, 1932.
Hodge, M. Jonathon S. “Biology and Philosophy (Including Ideology): A Study of Fisher and Wright.” In The Founders of Evolutionary Genetics: A Centenary Reappraisal, edited by Sahotra Sarkar. Dordrecht, Netherlands, and Boston: Kluwer Academic Publishers, 1992.
Price, George R. “Fisher’s ‘Fundamental Theorem’ Made Clear.” Annals of Human Genetics 36 (1972): 129–140.
Provine, William. “Founder Effects and Genetic Revolutions in Microevolution and Speciation: A Historical Perspective.” In Genetics, Speciation, and the Founder Principle, edited by Luther Val Giddings, Kenneth Y. Kaneshiro, and Wyatt W. Anderson. New York: Oxford University Press, 1989.
Wright, Sewall. “Evolution in Mendelian Populations.” Genetics 16 (1931): 97–159.
———. “The Roles of Mutation, Inbreeding, Crossbreeding, and Selection in Evolution.” Proceedings of the Sixth International Congress of Genetics 1 (1932): 356–366.
Fisher, Ronald Aylmer
(b. London, England, 17 February 1890; d. Adelaide, Australia, 29 July 1962)
statistics, biometry, genetics.
Fisher’s father was a prominent auctioneer and the head of a large family; Fisher was a surviving twin. Chronic myopia probably helped channel his youthful mathematical gifts into the high order of conceptualization and intuitiveness that distinguished his mature work. At Cambridge, which he entered in 1909 and from which he graduated in 1912, Fisher studied mathematics and theoretical physics. His early postgraduate life was varied, including working for an investment house, doing farm chores in Canada, and teaching high school. Fisher soon became interested in the biometric problems of the day and in 1919 joined the staff of Rothamsted Experimental Station as a one-man statistics department charged, primarily, with sorting and reassessing a sixty-six-year accumulation of data on manurial field trials and weather records. In the following decade and a half his work there established him as the leading statistician of his era, and early in his tenure he published the epochal Statistical Methods for Research Workers.
Meantime, avocationally, Fisher was building a reputation as a top-ranking geneticist. He left Rothamsted in 1933 to become Galton professor of eugenics at University College, London. Ten years later he moved to Cambridge as Balfour professor of genetics. In 1959, ostensibly retired, Fisher emigrated to Australia and spent the last three years of his life working steadily and productively in the Division of Mathematical Statistics of the Commonwealth Scientific and Industrial Research Organization. Innumerable honors came to him, including election to fellowship of the Royal Society in 1929 and knighthood in 1952.
In 1917 Fisher married Ruth Eileen Guinness and, like his own parents, they had eight children—a circumstance that, according to a friend, was “a personal expression of his genetic and evolutionary convictions.” Later in life he and Lady Fisher separated. Slight, bearded, eloquent, reactionary, and quirkish, Fisher made a strong impact on all who met him. The geniality and generosity with which he treated his disciples was complemented by the hostility he aimed at his dissenters. His mastery of the elegantly barbed phrase did not help dissolve feuds, and he left a legacy of unnecessary confusion in some areas of statistical theory. Nevertheless, Fisher was an authentic genius, with a splendid talent for intertwining theory and practice. He had a real feel for quantitative experimental data whose interpretation is unobvious, and throughout his career he was a happy and skillful performer on the desk calculator.
Fisher’s debut in the world of mathematical statistics was occasioned by his discovery, as a young man, that all efforts to establish the exact sampling distribution of the well-known correlation coefficient had foundered. He tackled the problem in the context of points in an n-dimensional Euclidean space (n being the size of the sample), an original and, as it turned out, highly successful approach. In the following years he applied similar methods to obtain the distributions of many other functions, such as the regression coefficient, the partial and multiple correlation coefficients, the discriminant function, and a logarithmic function of the ratio of two comparable variances. Fisher also tidied up the mathematics and application of two important functions already in the literature: the Helmert-Pearson χ² (the sum of squares of a given number of independent standard normal variates, whose distribution is used to test the “goodness of fit” of numerical observations to expectations) and Gosset’s z (the ratio of a normal sample mean, measured from a given point, in terms of its sample standard deviation). The latter was modulated by Fisher to the now familiar t, whose frequency distribution provides the simplest of all significance tests. He seized on Gosset’s work to centralize the problem of making inferences from small samples, and he went on to erect a comprehensive theory of hypothesis testing.
The idea here was that many biological experiments are tests of a defined hypothesis, with the experimentalist wanting to know whether his results bolster or undermine his theorizing. If we assume, said Fisher in effect, that the scatter of results is a sample from a normal (Gaussian) distribution whose mean (suitably expressed or transformed) is the “null” hypothesis, we can, using the t distribution, compute the “tail” probability that the observed mean is a random normal sample, in much the same way that we can compute the probability of getting, say, seven or more heads in ten tosses of a fair coin. It might be thought that this probability was all an experimentalist would need, since he would subsequently make his own scientific judgment of its significance in the context of test and hypothesis. However, Fisher advocated the use of arbitrary “cutoff” points; specifically, he suggested that if the probability were < 1/20 but > 1/100, a weak judgment against the null hypothesis be made, and if the probability were < 1/100 a strongly unfavorable judgment be made. This convention, helped by the publication of special tables, became popular, and it has often been used blindly. The real value of the discipline lay in its sound probabilistic structure rather than in its decision-making rules. It is noteworthy that other statistical theorists extended Fisher’s ideas by introducing the power of a test, that is, its intrinsic ability, in terms of the parameters and distribution functions, to detect a given difference (between the null hypothesis and the observational estimate) at a prearranged probability level; but, for reasons never made plain, Fisher would not sanction this development.
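The coin-tossing comparison can be worked through directly. The following sketch computes the tail probability of seven or more heads in ten tosses of a fair coin and then applies the 1/20 and 1/100 cutoffs described above; the function names and the wording of the verdicts are illustrative, not Fisher's.

```python
from math import comb


def upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def verdict(prob: float) -> str:
    """Apply the conventional cutoffs of 1/20 and 1/100 to a tail probability."""
    if prob < 1 / 100:
        return "strong judgment against the null hypothesis"
    if prob < 1 / 20:
        return "weak judgment against the null hypothesis"
    return "no judgment against the null hypothesis"


tail = upper_tail(10, 7)
print(tail)           # 176/1024 = 0.171875
print(verdict(tail))
```

A tail probability of about 0.17 lies well above 1/20, so seven heads in ten tosses would not count against the hypothesis of a fair coin under this convention.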
Fisher devised his own extension of significance testing, the remarkable analysis of variance. This was bound in with his novel ideas on the wide subject of the theory of experimental design. He emphasized not merely the desirability but the logical necessity, in the design of an experiment whose results could not be freed from error, of maximizing efficiency (by devices such as blocks and confounding), and of introducing randomization in such a way as to furnish a valid estimate of the residual error. Whereas the time-honored practice in handling several factors had been to vary only one at a time during the experiment, Fisher pointed out that simultaneous variation was essential to the detection of possible interaction between factors, and that this change could be made without extra cost (in terms of experimental size and operational effort). Entrained with these innovations was the use of randomized blocks and Latin squares for the actual disposition of the test units.
Many would subscribe to the thesis that Fisher’s contributions to experimental design were the most praiseworthy of all his accomplishments in statistics. It was to facilitate the interpretation of multifactor experiments carried out in the light of his ideas on design that Fisher introduced the analysis of variance (which he had originally devised to deal with hierarchical classification). In this scheme the variances due to different factors in the experiment are screened out and tested separately for statistical significance (that is, for incompatibility with the null hypothesis of “no effect”). The appeal of the analysis has been great, and here again misuses have not been uncommon among uncritical researchers; for example, some have contented themselves with, and even published, analyses of variance unaccompanied by tabulations of the group means to which the significance tests refer. This itself is a tribute, of a sort, to Fisher.
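The screening-out of variances can be sketched for the simplest case, a single factor with several treatment groups. The example below is a minimal illustration rather than Fisher's own layout: the yields are hypothetical, the function and key names are illustrative, and the group means are reported alongside the variance ratio, in keeping with the caution above.

```python
def one_way_anova(groups):
    """Partition the total sum of squares into between- and within-group parts."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in values)

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    variance_ratio = (ss_between / df_between) / (ss_within / df_within)

    return {
        "group_means": means,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "variance_ratio": variance_ratio,
    }


# Hypothetical yields under three treatments, three plots each.
yields = [[20, 22, 24], [25, 27, 29], [30, 32, 34]]
result = one_way_anova(yields)

for name, value in result.items():
    print(f"{name}: {value}")
```

The between-group and within-group sums of squares add up to the total, which is the partition on which the separate significance tests rest; the variance ratio would then be referred to the appropriate table.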
Arising out of his work on the analysis of variance was the analysis of covariance, a scheme in which the regression effect of concomitant nuisance factors could be screened out so as to purify further the significance tests. (For example, in a multifactor physiological experiment the weight of the animals might be a nuisance factor affecting response and calling for elimination or allowance by analysis of covariance.)
An early landmark in the Fisherian revolution was his long paper “On the Mathematical Foundations of Theoretical Statistics” (1922). Convinced that the subject had progressed too haphazardly and lacked a solid mathematical base, he set out to repair the situation. He drew attention to the shortcomings of the method of moments in curve fitting, stressed the importance of exact sampling distributions, and moved toward a realistic view of estimation. Discussing estimation of a parameter, he urged that a “satisfactory statistic” should be consistent (which means, roughly, unbiased), efficient (having the greatest precision), and sufficient (embracing all relevant information in the observations). His making “information” into a technical term, equatable to reciprocal variance, was a useful step (the reuse of the same word, much later, in C. E. Shannon’s information theory is allied but not directly comparable).
To appreciate Fisher’s handling of estimation theory, we must look upon it as a subdivision of the general problem of induction that has worried theoreticians since Hume’s day. In its simplest and oldest form it is the question of how to arrive at a “best” value of a set of observations. Today we unthinkingly take the arithmetic mean, but in fact the conditions under which this procedure can be justified need careful definition. Broadly speaking, the arithmetic mean is usually the maximum likelihood estimate. This term expressed not a wholly new principle but one that Fisher transformed and named. He urged the need to recognize two kinds of uncertainty and proposed that the probability of an event, given a parameter, should be supplemented by the likelihood of a parameter, given an event. Likelihood has similarities to probability, but important differences exist (for instance, its curve cannot be integrated).
At this point, by way of illustration, we may bring in a modified form of one of Fisher’s own examples. It concerns a discrete (discontinuous) distribution. We are given four counts, n1 through n4, of corn seedlings (each in a different descriptive category) that are hypothesized to arise from a parameter p, as shown below.
| Descriptive category (i) | 1 | 2 | 3 | 4 | Σ |
| Fractions expected (f_i) | (2 + p)/4 | (1 − p)/4 | (1 − p)/4 | p/4 | 1 |
| Numbers observed (n_i) | 1,997 | 906 | 904 | 32 | 3,839 |
The problem is to estimate p. Now, by definition, the likelihood, L, will be

$$L = \left(\frac{2+p}{4}\right)^{n_1}\left(\frac{1-p}{4}\right)^{n_2+n_3}\left(\frac{p}{4}\right)^{n_4}. \tag{1}$$

Instead of seeking the maximum of this, we shall find it more convenient to handle the logarithm, which is

$$\log L = n_1\log(2+p) + (n_2+n_3)\log(1-p) + n_4\log p - \Big(\sum_i n_i\Big)\log 4. \tag{2}$$

Differentiating this expression with respect to p, equating the result to zero, and replacing the ni with the actual observations yields the quadratic

$$3839\,p^2 + 1655\,p - 64 = 0, \tag{3}$$

from which we find p̂ = 0.03571 (the caret is widely employed nowadays to denote “an estimate of”). Therefore, among all possible values of p this is the one most likely to have given rise to the particular tetrad of observed numbers. Fisher now went further and showed that the variance of p̂ could be obtained from the second derivative of (2): in large samples it is the reciprocal of the negative of the expected value of that derivative. This gives us

$$\sigma_{\hat p}^{2} = \frac{2p(1-p)(2+p)}{(1+2p)\sum_i n_i}, \tag{4}$$

which, incidentally, is the minimum variance, indicating that the estimate of p is efficient in Fisher’s special sense of that term. Substitution of p̂ for p in (4) and insertion of the actual ni, followed by extraction of the square root, gives us the best estimate of the standard error of p̂. Thus we can submit p̂ = 0.03571 ± 0.00584 as our final result.
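The arithmetic of this example can be checked with a short script. What follows is only a sketch under stated assumptions: it takes the expected fractions and counts from the table, clears the fractions in the derivative of the log-likelihood to obtain a quadratic in p, and uses the large-sample variance 2p(1 − p)(2 + p)/[N(1 + 2p)], with N the total count, which follows from the expected second derivative under that model. The function names are illustrative and do not come from Fisher or from the original article.

```python
import math

# Observed seedling counts for categories 1..4, as given in the table.
counts = [1997, 906, 904, 32]


def mle_p(n):
    """Maximum likelihood estimate of p for fractions
    (2+p)/4, (1-p)/4, (1-p)/4, p/4 (model assumed from the table)."""
    n1, n2, n3, n4 = n
    total = sum(n)
    # Score equation: n1/(2+p) - (n2+n3)/(1-p) + n4/p = 0.
    # Multiplying through by p(1-p)(2+p) gives the quadratic
    #   total*p^2 - (n1 - 2*n2 - 2*n3 - n4)*p - 2*n4 = 0.
    a = total
    b = -(n1 - 2 * n2 - 2 * n3 - n4)
    c = -2 * n4
    disc = b * b - 4 * a * c
    # c < 0, so exactly one root is positive; take that one.
    return (-b + math.sqrt(disc)) / (2 * a)


def std_error(p, total):
    """Large-sample standard error from the expected information
    (an assumption of this sketch, derived from the same model)."""
    variance = 2 * p * (1 - p) * (2 + p) / (total * (1 + 2 * p))
    return math.sqrt(variance)


p_hat = mle_p(counts)
se = std_error(p_hat, sum(counts))
print(f"p_hat = {p_hat:.5f} +/- {se:.5f}")
```

Run against the tabulated counts, the script reproduces, to the precision quoted, the estimate and standard error given in the text.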
Attachment of a standard error to an estimate, as above, is quite old in statistics, and this is yet another matter to which Fisher brought change. He was a pioneer in interval estimation, that is, the specification of numerical probability limits bracketing the point estimate. Fisher’s approach is best described in the context of sampling the normal distribution. Imagine such a distribution, with mean μ and variance (second central moment) σ², both unknown and from which we intend to draw a sample of ten items. Now, in advance of the sampling, various statements can be made about the (as yet) unknown sample mean, m. One such is

$$P\left(\mu - 1.96\,\frac{\sigma}{\sqrt{10}} < m < \mu + 1.96\,\frac{\sigma}{\sqrt{10}}\right) = 0.95, \tag{5}$$

the factor 1.96 being taken from a table of the partial integrals of the normal function. The statement is formal, although without practical value. We draw the sample, finding, say, m = 8.41 with a standard deviation of s = 6.325; then, according to Fisher, a probability statement analogous to (5) can be cast into this form:

$$P\left(m - 2.23\,\frac{s}{\sqrt{10}} < \mu < m + 2.23\,\frac{s}{\sqrt{10}}\right) = 0.95, \tag{6}$$

the factor 2.23 being taken from the t table in the Gosset-Fisher theory of small samples.
Strictly speaking, this is dubious in that it involves a probable location of a parameter, which, by definition, is fixed (and unknown). But, said Fisher, the observed values of m and s have changed the “logical status” of μ, transforming it into a random variable with a “well-defined distribution.” Another way of stating (6) is that the values 3.95–12.87 are the “fiducial limits” within which μ can be assigned with probability 0.95. There is no doubt that this is a credibility statement with which an experimentalist untroubled by niceties of mathematical logic should be satisfied. And indeed it might be thought that the notion of fiducialism could be rationalized in terms of a different definition of probability (leaning on credibility rather than orthodox limiting frequency). But Fisher always insisted that probability “obtained by way of the fiducial argument” was orthodox. The doctrine landed Fisher and the adherents of fiducialism in a logical morass, and the situation was worsened by the historical accident that an allied concept, the theory of confidence limits, was introduced by Jerzy Neyman about the same time (ca. 1930). Although it, too, had weaknesses, Neyman’s theory, as the more mathematically rigorous and widely applicable, was espoused by many statisticians in preference to its rival. Battle lines were drawn, and the next few decades witnessed an extraordinarily acrimonious and indecisive fight between the schools. Fiducialism is still being explored by mathematical statisticians.
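The fiducial limits quoted for the sample of ten can be verified with a few lines of arithmetic. This is a minimal sketch, assuming the limits have the form m ± t·s/√n. The factor 2.23 is taken as quoted in the text rather than recomputed from a t table, and the variable names are illustrative only.

```python
import math

n = 10           # sample size stated in the text
m = 8.41         # observed sample mean
s = 6.325        # observed sample standard deviation
t_factor = 2.23  # t-table factor exactly as quoted in the text

# Half-width of the interval: t * s / sqrt(n).
half_width = t_factor * s / math.sqrt(n)
lower = m - half_width
upper = m + half_width

print(f"limits: {lower:.2f} to {upper:.2f}")
```

Rounded to two decimals, the script returns the limits 3.95 and 12.87 given above.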
This by no means exhausts Fisher’s contributions to statistics (topics such as multivariate analysis, bioassay, time series, contingency tables, and the logarithmic distribution are others that come to mind), and in fact it would be hard to do so, if only because he sometimes gave seminal ideas to colleagues to work out under their own names. We must end with some reference to another subject on which Fisher left a deep mark: genetics. In his young manhood natural selection and heredity were in a state of renewed ferment. The rediscovery in 1900 of Mendel’s splendid work on particulate inheritance threw discredit not only on Karl Pearson’s elaborate researches into blending inheritance but also on Darwinism itself, believed by some to be incompatible with Mendelism. Fisher thought otherwise, and in 1918 he published a paper, “The Correlation Between Relatives on the Supposition of Mendelian Inheritance,” that brought powerful new mathematical tools to bear on the issue and that eventually swung informed opinion over to his views, which were, in brief, that blending inheritance is the cumulative effect of a large number of Mendelian factors that are individually insignificant. (This was to blossom into the modern discipline of biometric genetics, although Fisher himself never made any important contributions thereto.)
Fisher came to regard natural selection as a study in its own right, with evolution as but one of several sequelae. His work on the phenomenon of dominance was outstanding. He early pointed out that correlations between relatives could be made to furnish information on the dominance of the relevant genes. He demonstrated that the Mendelian selection process invariably favors the dominance of beneficial genes, and that the greater the benefit, the faster the process. Dominance, then, must play a major role in evolution by natural selection. It may here be added that Fisher’s work in this area, as elsewhere, was a careful blending of theory and practice. He carried out breeding experiments with various animals (mice, poultry, and snails were some), often under trying circumstances (for example, much mouse breeding was done in his own home in Harpenden). One of his best experiments concerned the inheritance of whiteness of plumage in white leghorns: by “breeding back” into wild jungle fowl he showed that the responsible dominant gene is a result of artificial selection, and its dominance is quickly lessened when it is introduced into the ancestral wild species.
Fisher also enunciated a “fundamental theorem of natural selection” (for an idealized population) in this form: “The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.” He was keenly interested in human genetics and, like most eugenicists, held alarmist views about the future of Homo sapiens. An important consequence of this special interest was his realization that a study of human blood groups could be instrumental in advancing both theory and practice, and in 1935 he set up a blood-grouping department in the Galton Laboratory. Many good things came out of this enterprise, including a clarification of the inheritance of Rhesus groups.
I. Original Works. Over a period of half a century Fisher turned out an average of one paper every two months. The best listing of these is M. J. R. Healy’s in Journal of the Royal Statistical Society, 126A (1963), 170–178. The earliest item, in Messenger of Mathematics, 41 (1912), 155–160, is a brief advocacy of the method of maximum likelihood for curve fitting; and the last, in Journal of Theoretical Biology, 3 (1962), 509–513, concerns the use of double heterozygotes for seeking a statistically significant difference in recombination values. Key theoretical papers are “On the Mathematical Foundations of Theoretical Statistics,” in Philosophical Transactions of the Royal Society, 222A (1922), 309–368; “The Theory of Statistical Estimation,” in Proceedings of the Cambridge Philosophical Society, 22 (1925), 700–725; and “The Statistical Theory of Estimation,” in Calcutta University Readership Lectures (Calcutta, 1938). A publication of unusual interest is “Has Mendel’s Work Been Rediscovered?” in Annals of Science, 1 (1936), 115–137, in which Fisher finds that, probabilistically, some of Mendel’s celebrated pea results were a little too good to be true, but in which he also shows that Mendel’s insight and experimental skill must have been outstandingly fine. Forty-three of Fisher’s most important earlier papers are reproduced in facsimile, with the author’s notes and corrections, in Contributions to Mathematical Statistics (New York, 1950).
Fisher published five holograph books: Statistical Methods for Research Workers (London, 1925; 13th ed., 1958), translated into five other languages; The Genetical Theory of Natural Selection (London, 1930; 2nd ed., New York, 1958); The Design of Experiments (London, 1935; 8th ed., 1966); The Theory of Inbreeding (London, 1949; 2nd ed., 1965); and Statistical Methods and Scientific Inference (London, 1956; 2nd ed., 1959). The statistical tables in his first book were subsequently expanded and published separately, with F. Yates, as Statistical Tables for Biological, Agricultural, and Medical Research (London, 1938; 6th ed., 1963).
II. Secondary Literature. Journal of the American Statistical Association, 46 (1951), contains four informative papers by F. Yates, H. Hotelling, W. J. Youden, and K. Mather written on the occasion of the twenty-fifth anniversary of the publication of Statistical Methods. The principal commemorative articles published after Fisher’s death will be found in Biometrics, 18 (Dec. 1962) and 20 (June 1964); Biographical Memoirs of Fellows of the Royal Society of London, 9 (1963), 92–129; Journal of the Royal Statistical Society, 126A (1963), 159–170; and Science, 156 (1967), 1456–1462. The last contains “An Appreciation” by Jerzy Neyman, who differed with Fisher on several issues, and is therefore of special interest. The writings of Neyman and E. S. Pearson should be consulted for further information on controversial matters. Fisher’s contributions to discussions at various meetings of the Royal Statistical Society are also illuminating in this regard (they are indexed in the Healy bibliography). A good philosophical study of statistical reasoning, with particular reference to Fisher’s ideas, is Ian Hacking’s Logic of Statistical Inference (London, 1965).
Norman T. Gridgeman