Wilks, S. S.

The various professional roles of the statistician Samuel Stanley Wilks (1906-1964) so parallel the development of mathematics in the mid-twentieth century that he seems ready-made for the hero in a sociological novel entitled “The Professional Mathematician.” Like most important mathematicians, Wilks early made strong research contributions, and his innovations opened new lines of research for others. But not very many mathematicians succeeded, as he did, in attracting fine students who, in their turn, used their training to specialize in such diverse areas as statistics, mathematical statistics, probability, sociology, governmental service, and defense research. Still fewer mathematicians have advanced their fields by major editorial commitments as Wilks did, both through the Annals of Mathematical Statistics and through the Wiley Publications in Statistics. What is rare among mathematicians is the belief, which Wilks held, in the value of organization, distinct from individual achievement, for the future of mathematics. Acting on this belief, he deliberately devoted much of his career to scientific societies, to committees of governmental agencies, and to public and private foundations; his contributions ranged far beyond the limits of even the broadest interpretations of mathematics. The combination of these activities produced a dedicated scholar, a major educator, and a public servant.

Wilks was born to Chance C. and Bertha May Gammon Wilks in Little Elm, Texas. He and his two younger brothers were raised on a small (for Texas), 250-acre farm near Little Elm. During his formal education, he had a notable set of teachers. W. M. Whyburn, later chairman of the department of mathematics at the University of North Carolina, taught Wilks in the seventh grade. In high school, Wilks used to sneak off to take a college mathematics course during study hour. In 1926 he took a bachelor’s degree in architecture at North Texas State Teachers College. While earning his M.A. in mathematics at the University of Texas (he received it in 1928), he studied topology with R. L. Moore and statistics with E. L. Dodd, who encouraged him to join Henry L. Rietz at the University of Iowa, then the leading center in the United States for the study of mathematical statistics.

At Iowa, E. F. Lindquist introduced him to a problem which led to Wilks’s thesis, entitled “On the Distributions of Statistics in Samples From a Normal Population of Two Variables With Matched Sampling of One Variable” (1932a). This began Wilks’s long series of contributions to multivariate analysis, his interest in the applications of statistical methods, and his lifelong relation with the fields of education, testing, and the social sciences. He received a Ph.D. in mathematical statistics in 1931.

After the National Research Council awarded him a National Research Fellowship, Wilks and his bride, Gena Orr of Denton, Texas, went to Columbia University so that he could work with Harold Hotelling. There Wilks also listened to C. E. Spearman and met Walter Shewhart of Bell Telephone Laboratories, who then and later introduced him to many research problems arising from industrial applications of statistics.

The next year his fellowship was renewed, and Wilks went to work in Karl Pearson’s department of applied statistics at University College, London. Wilks’s only child, Stanley Neal, was born there in October 1932. In London Wilks met and worked with Egon Pearson, who remained a lifelong friend, and, of course, he met R. A. Fisher and Jerzy Neyman. At midyear he moved to Cambridge, where he worked with John Wishart and got to know M. S. Bartlett and W. G. Cochran. By the end of his two-year fellowship he had published six papers, two of which grew out of his doctoral thesis; another was entitled “Moments and Distributions of Estimates of Population Parameters From Fragmentary Samples” (1932b).

In this important paper Wilks dealt with the problem of missing values in multivariate data. In some kinds of investigations two or more characteristics are to be measured for each member of the sample, but occasionally the value of one variable or another may be missing, as when only part of a skeleton is found in an archeological study. Wilks found for bivariate normal distributions the maximum likelihood equations for estimating the parameters and suggested some alternative estimators. He also suggested the determinant of the inverse of the asymptotic covariance matrix of a set of estimators as the appropriate measure of the information that the estimators jointly contain.

At this period, mathematical statisticians were developing exact and approximate distributions of statistics of more and more complex quantities under idealized assumptions, making it possible to assess evidence offered by bodies of data about more and more complicated questions. Wilks’s paper “Certain Generalizations in the Analysis of Variance” (1932c) was a major contribution to this development.

Among the multivariate criteria proposed by Wilks, the one most used today is likely the one denoted by W in his 1932 Biometrika paper. In 1947 Bartlett used the notation Λ for this statistic as applied in a wider variety of contexts than originally proposed by Wilks, and in 1948 C. R. Rao introduced the frequently heard term “Wilks’s Λ criterion.” This criterion provided a multivariate generalization of what is now called the analysis of variance F test. If the “among” sum of squares in analysis of variance is generalized to a p × p matrix A of sums of squares and products and the “within” sum of squares is similarly generalized to a p × p matrix B, then the Wilks criterion is a ratio of determinants, namely, det(B)/det(A + B). In 1932 it was a considerable feat to determine, as Wilks did, the null distributions of this and many similar statistics.
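The ratio of determinants described above can be illustrated with a small computation. The following is a minimal sketch rather than Wilks’s own procedure: the function names and the sample data are hypothetical, and it relies on the standard identity that the total matrix of sums of squares and products about the grand mean equals A + B. With a single variable (p = 1) the ratio reduces to within/(within + among), a decreasing function of the usual F statistic, so small values of the criterion count against the hypothesis of equal means.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def sscp(rows, center):
    """p x p matrix of sums of squares and products of rows about `center`."""
    p = len(center)
    m = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - center[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                m[i][j] += d[i] * d[j]
    return m


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[pivot][k] == 0.0:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            result = -result  # a row swap flips the sign
        result *= a[k][k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
    return result


def wilks_criterion(groups):
    """det(B) / det(A + B), with B the within-group matrix and A + B the total."""
    all_rows = [r for g in groups for r in g]
    p = len(all_rows[0])
    within = [[0.0] * p for _ in range(p)]
    for g in groups:
        s = sscp(g, mean_vector(g))
        for i in range(p):
            for j in range(p):
                within[i][j] += s[i][j]
    total = sscp(all_rows, mean_vector(all_rows))  # equals A + B
    return det(within) / det(total)


# Hypothetical bivariate data: two groups with clearly different means.
separated = [[(1, 2), (2, 3), (3, 5)], [(6, 7), (7, 9), (8, 8)]]
print(wilks_criterion(separated))  # a value between 0 and 1
```

When every group has the same mean vector, A is the zero matrix and the criterion equals 1; the further apart the group means lie relative to the within-group scatter, the closer it falls to 0.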

In this and other papers through the years, Wilks found the likelihood ratio criterion for testing many hypotheses in multivariate problems. The criteria repeatedly turn out to be powers of ratios of products of determinants of sample covariance matrices essentially like the formula above, and their moments are products of beta functions. He suggested the determinant of the covariance matrix as the generalized variance of a sample of points in a multidimensional space, and he found the distribution of the multiple correlation coefficient.

Wilks also found the likelihood ratio criterion and its moments for testing the null hypothesis that several multivariate populations have equal covariance matrices. Much later, in 1946, he found the likelihood ratio criteria for testing whether in one multivariate population the variances are equal and the covariances are equal, for testing whether all the means are equal if the variances are equal and the covariances are equal, and for testing all these hypotheses simultaneously. These problems arise from studying whether several forms of a test are nearly parallel. Elsewhere he showed that for a test consisting of many items, scorings based upon two different sets of weights for the items would produce pairs of total scores that were highly correlated across the individuals taking the test. The implication is that modest changes in the weights of the items on a long test make only small changes in the relative evaluations of individuals. He also suggested the likelihood test for independence in contingency tables, and he found the large-sample distribution of the likelihood ratio for testing composite hypotheses.

Wilks encouraged his students to develop nonparametric or distribution-free methods and made major contributions to this area himself. In particular, he invented the statistical idea of tolerance limits, by analogy with the term as used in industry in connection with piece parts. Shewhart had asked for ways to make guarantees about mass-produced lots of parts. Wilks found that by using order statistics, for example, the largest and smallest measurements in a sample, one could make confidence statements about the fraction of the true population contained between the sample order statistics. To illustrate, for a sample of size 10 from a continuous population, the probability is 0.62 that at least 80 per cent of the population is contained between the smallest and largest sample values. Wilks and others have extended this idea in many directions.
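The figure of 0.62 in this illustration can be checked directly. The sketch below assumes the standard result that, for n independent observations from a continuous population, the fraction of the population lying between the smallest and largest observations follows a Beta(n − 1, 2) distribution, so that the probability of covering at least a fraction β is 1 − nβ^(n−1) + (n − 1)β^n. The function names are hypothetical and are not taken from Wilks’s papers.

```python
def coverage_confidence(n: int, fraction: float) -> float:
    """Probability that at least `fraction` of a continuous population
    lies between the smallest and largest of n sample values."""
    if n < 2:
        raise ValueError("need at least two observations")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    # Tail probability of a Beta(n - 1, 2) distribution above `fraction`.
    return 1.0 - n * fraction ** (n - 1) + (n - 1) * fraction ** n


def smallest_sample_size(fraction: float, confidence: float) -> int:
    """Smallest n whose min-to-max range covers `fraction` of the
    population with probability at least `confidence`."""
    if not (0.0 < fraction < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("fraction and confidence must lie strictly in (0, 1)")
    n = 2
    while coverage_confidence(n, fraction) < confidence:
        n += 1
    return n


print(round(coverage_confidence(10, 0.80), 2))  # 0.62
```

The second function turns the question around, asking how large a sample is needed for a stated guarantee, which is the kind of question Shewhart had raised about mass-produced lots.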

In 1933 Wilks joined the department of mathematics at Princeton University, which was his base of operations for the rest of his life. At Princeton he gradually developed both undergraduate and graduate courses in mathematical statistics, repeatedly writing up course notes, producing his long-awaited hard-cover Mathematical Statistics in 1962, and publishing several soft-cover books that had a substantial influence on the teaching of mathematical statistics. Wilks wrote or coauthored six books and about forty research papers.

Wilks’s first doctoral student, Joseph Daly, received his degree in 1939, and thereafter Princeton had a small but strong graduate program in statistics.

Wilks contributed to education in many ways. From the time of his arrival in Princeton, he participated in the work of the College Entrance Examination Board (and the Educational Testing Service), and some of his research problems arose from this source. He served on the Board’s commission on mathematics and was a coauthor of their experimental text in probability and statistics for secondary school students. Later he became a member of the Advisory Board of the School Mathematics Study Group (SMSG) and a visiting lecturer for various groups.

Wilks was one of a small group of statisticians who organized the Institute of Mathematical Statistics in 1935 and later negotiated the arrangements transferring the Annals of Mathematical Statistics from the private ownership of its founder and first editor, Harry C. Carver, to the Institute. Thereupon Wilks took over the editorship of the Annals (serving from 1938 through 1949) and with it, in effect, the shaping of the long-run future of the Institute. To quote John Tukey, “A marginal journal with a small subscription list became an unqualified first-rank journal in its field; a once marginal society grew to adulthood in size and responsibility and contribution. There is no doubt that the wisdom and judgment of Wilks was crucial; some of us suspect it was irreplaceable” (1965, p. 150).

Starting about 1941, Wilks began research for the National Defense Research Committee. During World War II, he became director of the Princeton Statistical Research Group, which had branches both in Princeton and in New York City. And he took an active part in the development and operation of the short courses that introduced statistical quality control to American industry. That such organizational developments persist is illustrated by the American Society for Quality Control, which was 20,000 strong in 1966. In 1947 Wilks was awarded the Presidential Certificate of Merit for his contributions to antisubmarine warfare and to the solution of convoy problems.

After World War II, Wilks devoted his time more and more to national affairs and services to his profession, and less to research.

He served the Social Science Research Council in many capacities, including successively the chairmanship of each of its three major subdivisions, over a period of 18 years. And from 1953 until his death he served the Russell Sage Foundation as a member of the Board of Directors and of the Executive Committee. He served the National Science Foundation both on its Divisional Committee for the Mathematical, Physical, and Engineering Sciences and on that for the Social Sciences.

Over the years he worked on uncounted committees for the Institute of Mathematical Statistics, the American Statistical Association, and the federal government. He served the Division of Mathematics of the National Research Council in many capacities, chairing the division from 1958 to 1960. He helped organize the Conference Board of the Mathematical Sciences and served as its chairman in 1960. He was a member of the U.S. National Commission for UNESCO from 1960 to 1962 (see Anderson 1965a, pp. 3-7).

He was president of the Institute of Mathematical Statistics in 1940, and of the American Statistical Association in 1950. His other honors include election to the International Statistical Institute, the American Philosophical Society, and the American Academy of Arts and Sciences. In 1947 the University of Iowa honored him with a Centennial Alumni Award. After his sudden death on March 7, 1964, several memorials were established, including an S. S. Wilks Memorial Fund at Princeton University and an American Statistical Association Samuel S. Wilks Memorial Award.

Frederick Mosteller

[For the historical context of Wilks’s work, see the biographies of Fisher, R. A.; Pearson; Spearman; for discussion of the subsequent development of his ideas, see Multivariate analysis; Nonparametric statistics, article on order statistics.]

WORKS BY WILKS

1932a On the Distributions of Statistics in Samples From a Normal Population of Two Variables With Matched Sampling of One Variable. Metron , no. 3/4:87-126.

1932b Moments and Distributions of Estimates of Population Parameters From Fragmentary Samples. Annals of Mathematical Statistics 3:163-195.

1932c Certain Generalizations in the Analysis of Variance. Biometrika 24:471-494.

1932d On the Sampling Distribution of the Multiple Correlation Coefficient. Annals of Mathematical Statistics 3:196-203.

1935 The Likelihood Test of Independence in Contingency Tables. Annals of Mathematical Statistics 6:190-196.

1938a The Large-sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses. Annals of Mathematical Statistics 9:60-62.

1938b Weighting Systems for Linear Functions of Correlated Variables When There Is No Dependent Variable. Psychometrika 3:23-40.

1941 Determination of Sample Sizes for Setting Tolerance Limits. Annals of Mathematical Statistics 12:91-96.

1946 Sample Criteria for Testing Equality of Means, Equality of Variances, and Equality of Covariances in a Normal Multivariate Distribution. Annals of Mathematical Statistics 17:257-281.

(1957) 1959 College Entrance Examination Board, Commission on Mathematics. Introductory Probability and Statistical Inference: An Experimental Course, by E. C. Douglas, F. Mosteller …, and S. S. Wilks. Princeton, N.J.: The Board.

1962 Mathematical Statistics. New York: Wiley.

1965 Statistical Aspects of Experiments in Telepathy. Parts 1-2. New York Statistician 16, no. 6:1-3; no. 7:4-6. Published posthumously.

1965 Guttman, Irwin; and Wilks, Samuel S. Introductory Engineering Statistics. New York: Wiley. Published posthumously.

SUPPLEMENTARY BIBLIOGRAPHY

Anderson, T. W. 1965a Samuel Stanley Wilks: 1906-1964. Annals of Mathematical Statistics 36:1-23.

Anderson, T. W. 1965b The Publications of S. S. Wilks. Annals of Mathematical Statistics 36:24-27.

Cochran, W. G. 1964 S. S. Wilks. International Statistical Institute, Review 32:189-191.

Craig, Cecil C. 1964 Professor Samuel S. Wilks. Industrial Quality Control 20, no. 12:41 only.

Gulliksen, Harold 1964 Samuel Stanley Wilks: 1906-1964. Psychometrika 29:103-104.

Mosteller, Frederick 1964 Samuel S. Wilks: Statesman of Statistics. American Statistician 18, no. 2:11-17.

Pearson, E. S. 1964 Samuel Stanley Wilks: 1906-1964. Journal of the Royal Statistical Society Series A 127:597-599.

Stephan, Frederick F. et al. 1965 Samuel S. Wilks. Journal of the American Statistical Association 60:939-966. Other contributors were J. W. Tukey, F. Mosteller, A. M. Mood, M. H. Hansen, L. E. Simon, and W. J. Dixon.

Tukey, John W. 1965 Samuel Stanley Wilks: 1906-1964. Pages 147-154 in American Philosophical Society, Yearbook, 1964. Philadelphia: The Society.

International Encyclopedia of the Social Sciences