Chi-Square Test
Studies often collect data on categorical variables that can be summarized as a series of counts. These counts are commonly arranged in a tabular format known as a contingency table. For example, a study designed to determine whether or not there is an association between cigarette smoking and asthma might collect data that could be assembled into a 2×2 table. In this case, the two columns could be defined by whether the subject smoked or not, while the rows could represent whether or not the subject experienced symptoms of asthma. The cells of the table would contain the number of observations or patients as defined by these two variables.
The chi-square test statistic can be used to evaluate whether there is an association between the rows and columns in a contingency table. More specifically, this statistic can be used to determine whether there is any difference between the study groups in the proportions of the risk factor of interest. Returning to our example, the chi-square statistic could be used to test whether the proportion of individuals who smoke differs by asthmatic status.
The chi-square test statistic is designed to test the null hypothesis that there is no association between the rows and columns of a contingency table. This statistic is calculated by first obtaining, for each cell in the table, the expected number of events that will occur if the null hypothesis is true. When the observed number of events deviates significantly from the expected counts, it is unlikely that the null hypothesis is true, and it is likely that there is a row-column association. Conversely, a small chi-square value indicates that the observed values are similar to the expected values, leading us to conclude that the null hypothesis is plausible. The general formula used to calculate the chi-square (χ²) test statistic is as follows:

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}, \qquad df = (r - 1)(c - 1) \tag{1}$$

where O_i = observed count in category i; E_i = expected count in category i under the null hypothesis; df = degrees of freedom; and c and r represent the number of columns and rows in the contingency table.

Table 1
Observed values for data presented in a two-by-two table
Source: Courtesy of author.

| Variable 2 | Variable 1: Yes | Variable 1: No | Total |
|---|---|---|---|
| Yes | a | b | a+b |
| No | c | d | c+d |
| Total | a+c | b+d | n |
Table 2
Expected values for data presented in a two-by-two table
Source: Courtesy of author.

| Variable 2 | Variable 1: Yes | Variable 1: No | Total |
|---|---|---|---|
| Yes | (a+b)(a+c)/n | (a+b)(b+d)/n | a+b |
| No | (c+d)(a+c)/n | (c+d)(b+d)/n | c+d |
| Total | a+c | b+d | n |

The value of the chi-square statistic cannot be negative and can assume values from zero to infinity. The p-value for this test statistic is based on the chi-square probability distribution and is generally extracted from published tables or estimated using computer software programs. The p-value represents the probability of obtaining a chi-square test statistic as extreme as or more extreme than the one observed if the null hypothesis were true. As with the t and F distributions, there is a different chi-square distribution for each possible value of degrees of freedom. Chi-square distributions with a small number of degrees of freedom are highly skewed; however, this skewness is attenuated as the number of degrees of freedom increases. In general, the degrees of freedom for tests of hypothesis that involve an r×c contingency table is equal to (r−1)×(c−1); thus for any 2×2 table, the degrees of freedom is equal to one. A chi-square distribution with one degree of freedom is the distribution of the square of a standard normal variable, and, consequently, either the chi-square or standard normal table can be used to determine the corresponding p-value.
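The one-degree-of-freedom case can be checked numerically. The sketch below assumes that a chi-square variable with one degree of freedom is the square of a standard normal variable, so the upper-tail chi-square probability of a value x should equal the two-sided standard normal probability of √x. The function names are illustrative only, and nothing beyond the Python standard library is used.

```python
import math
from statistics import NormalDist


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with 1 df."""
    # For 1 df the upper-tail probability has the closed form erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def two_sided_normal_p(z: float) -> float:
    """Two-sided standard normal probability P(|Z| >= |z|)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


x = 3.84  # close to the familiar 5% critical value for 1 df
p_chi2 = chi2_sf_1df(x)
p_norm = two_sided_normal_p(math.sqrt(x))
print(f"chi-square p = {p_chi2:.4f}, normal p = {p_norm:.4f}")
```

Both routes should print the same probability, which is why either table can be consulted for a 2×2 analysis.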
The chi-square test is most widely used to conduct tests of hypothesis that involve data that can be presented in a 2×2 table. Indeed, this tabular format is a feature of the case-control study design that is commonly used in public health research. Within this contingency table, we could denote the observed counts as shown in Table 1. Under the null hypothesis of no association between the two variables, the expected number in each cell is calculated from the observed marginal totals using the formulas outlined in Table 2.
The use of the chi-square test can be illustrated by using hypothetical data from a study investigating the association between smoking and asthma among adults observed in a community health clinic. The results obtained from classifying 150 individuals are shown in Table 3. As Table 3 shows, among asthmatics the proportion of smokers was 40 percent (20/50), while the corresponding proportion among asymptomatic individuals was 22 percent (22/100). By applying the formulas presented in Table 2, for the observed cell counts of 20, 30, 22, and 78 (Table 3) the corresponding expected counts are 14, 36, 28, and 72. The observed and expected counts can then be used to calculate the chi-square test statistic as outlined in Equation 1. The resulting value of the chi-square test statistic is approximately 5.36, and the associated p-value for the chi-square distribution with one degree of freedom is 0.02. Therefore, if there were truly no association between smoking and asthma, there would be a 2 in 100 probability of observing a difference in proportions at least as large as 18 percentage points (40% − 22%) by chance alone. We would therefore conclude that the observed difference in the proportions is unlikely to be explained by chance alone, and consider this result statistically significant.

Table 3
Hypothetical data showing chi-square test
Source: Courtesy of author.

| Symptoms of asthma | Ever smoked cigarettes: Yes | Ever smoked cigarettes: No | Total |
|---|---|---|---|
| Yes | 20 | 30 | 50 |
| No | 22 | 78 | 100 |
| Total | 42 | 108 | 150 |
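The arithmetic of this worked example can be reproduced with a short sketch. It assumes the observed counts 20, 30, 22, and 78 quoted in the text, computes each expected count as row total × column total / n, and uses a closed-form tail probability that is valid only for one degree of freedom. The function names are illustrative.

```python
import math


def expected_counts(table):
    """Expected cell counts under no association: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: asthma symptoms (yes, no); columns: ever smoked (yes, no).
observed = [[20, 30], [22, 78]]

expected = expected_counts(observed)
statistic = chi_square_statistic(observed)
# Closed-form upper-tail probability, valid only for one degree of freedom.
p_value = math.erfc(math.sqrt(statistic / 2.0))

print(expected)  # [[14.0, 36.0], [28.0, 72.0]]
print(f"statistic = {statistic:.2f}, p = {p_value:.3f}")
```

The printed statistic and p-value should agree with the values of about 5.36 and 0.02 reported in the text.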
Because the construction of the chi-square test uses discrete data to approximate a continuous distribution, some authors apply a continuity correction when calculating this statistic. Specifically,

$$\chi^2 = \sum_{i} \frac{\left(|O_i - E_i| - 0.5\right)^2}{E_i}$$

where |O_i − E_i| is the absolute value of the difference between O_i and E_i, and the term 0.5 in the numerator is often referred to as Yates' correction factor. This correction factor serves to reduce the chi-square value and, therefore, increases the resulting p-value. It has been suggested that this correction yields an overly conservative test that may fail to reject a false null hypothesis. However, as long as the sample size is large, the effect of the correction factor is negligible.
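To see the size of the correction, the sketch below applies the continuity-corrected form, assumed here to be the sum of (|O − E| − 0.5)² / E over the four cells, to the smoking and asthma counts from the worked example. The function name is illustrative, and the tail probability again relies on the one-degree-of-freedom closed form.

```python
import math


def yates_chi_square(table):
    """Chi-square statistic for a 2x2 table with the 0.5 continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for this cell
            total += (abs(o - e) - 0.5) ** 2 / e
    return total


observed = [[20, 30], [22, 78]]  # counts from the smoking and asthma example
corrected = yates_chi_square(observed)
p_corrected = math.erfc(math.sqrt(corrected / 2.0))  # one degree of freedom only
print(f"corrected statistic = {corrected:.2f}, p = {p_corrected:.3f}")
```

For these data the corrected statistic is smaller than the uncorrected value of about 5.36, so the p-value is larger, which is the conservative behavior described above.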
When there is a small number of counts in the table, the use of the chi-square test statistic may not be appropriate. Specifically, it has been recommended that this test not be used if any cell in the table has an expected count of less than one, or if more than 20 percent of the cells have an expected count of less than five. Under this scenario, Fisher's exact test is recommended for conducting tests of hypothesis.
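This small-sample guideline can be checked mechanically before running the test. The sketch below assumes the commonly cited form of the rule: no expected count below 1, and no more than 20 percent of cells with an expected count below 5. The function name and the second table are hypothetical.

```python
def chi_square_is_appropriate(table):
    """Apply the small-sample rule of thumb to a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for every cell under the null hypothesis.
    expected = [r * c / n for r in row_totals for c in col_totals]
    any_below_one = any(e < 1 for e in expected)
    share_below_five = sum(e < 5 for e in expected) / len(expected)
    return not any_below_one and share_below_five <= 0.20


print(chi_square_is_appropriate([[20, 30], [22, 78]]))  # True
print(chi_square_is_appropriate([[3, 1], [2, 4]]))      # False
```

When the check returns False, an exact method such as Fisher's exact test is the usual alternative.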
Paul J. Villeneuve
(see also: Normal Distributions; Probability Model; Sampling; Statistics for Public Health; T-Test)
Bibliography
Cochran, W. G. (1954). "Some Methods for Strengthening the Common χ² Test." Biometrics 10:417–451.
Grizzle, J. E. (1967). "Continuity Correction in the χ² Test for 2×2 Tables." The American Statistician 21:28–32.
Pagano, M., and Gauvreau, K. (2000). Principles of Biostatistics, 2nd edition. Pacific Grove, CA: Duxbury Press.
Rosner, B. (2000). Fundamentals of Biostatistics, 5th edition. Pacific Grove, CA: Duxbury Press.
Chi-Square
The term chi-square (χ²) refers to a distribution, a variable that is χ² distributed, or a statistical test employing the χ² distribution. A χ² distribution with k degrees of freedom (df) has mean k, variance 2k, and mode k − 2 (if k > 2), and is denoted χ²(k). Much of its usefulness in statistical inference derives from the fact that, for a sample of size N from a normally distributed variable with variance σ², the scaled sample variance (N − 1)s²/σ² is χ² distributed with df = N − 1. All χ² distributions are asymmetrical, right-skewed, and nonnegative. Owing to the broad utility of the χ² distribution, tabled χ² probability values can be found in virtually every introductory statistics text.
χ² TEST FOR POPULATION VARIANCES
A test of the null hypothesis that σ² = σ₀² (e.g., H₀: σ² = 1.8) is conducted by obtaining the sample variance s², computing the test statistic

$$G = \frac{(N - 1)\,s^2}{\sigma_0^2} \tag{1}$$

and consulting values of the χ²(N − 1) distribution. For a two-tailed test, G is compared to the critical values associated with the lower and upper (50 × α)% of the distribution. Rejection implies, with confidence 1 − α, that the sample is not drawn from a normally distributed population with variance σ₀².
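A minimal sketch of this variance test follows, assuming the statistic takes the usual form G = (N − 1)s² / σ₀² with N − 1 degrees of freedom. The sample is hypothetical, the hypothesized variance 1.8 is taken from the example in the text, and the two critical values are the standard tabled values of the χ² distribution with 4 degrees of freedom for a two-tailed test at α = 0.05.

```python
from statistics import variance


def variance_test_statistic(sample, sigma0_sq):
    """Return G = (N - 1) * s^2 / sigma0^2 and its degrees of freedom."""
    n = len(sample)
    s_sq = variance(sample)  # unbiased sample variance (divides by N - 1)
    return (n - 1) * s_sq / sigma0_sq, n - 1


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical data
g, df = variance_test_statistic(sample, sigma0_sq=1.8)

# Tabled critical values of the chi-square distribution with 4 df
# (lower and upper 2.5% points for a two-tailed test at alpha = 0.05).
lower, upper = 0.484, 11.143
reject = g < lower or g > upper
print(f"G = {g:.3f}, df = {df}, reject H0: {reject}")
```

Here G falls between the two critical values, so the null hypothesis is not rejected for this hypothetical sample.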
χ² TESTS OF GOODNESS OF FIT AND INDEPENDENCE
The χ² goodness of fit test compares two finite frequency distributions: one a set of observed frequency counts in C categories, the other a set of counts expected on the basis of theory or chance. The statistic

$$G = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i} \tag{2}$$

is computed, where O_i and E_i are, respectively, the observed and expected frequencies for category i given a fixed total sample size N. G is approximately χ² distributed with df = C − 1. If the null hypothesis of equality is rejected, the test implies a statistically significant departure from expectations.
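The goodness of fit computation can be sketched as follows, assuming the statistic is the sum of (O_i − E_i)² / E_i over the C categories with C − 1 degrees of freedom. The die-roll counts are hypothetical, and the function name is illustrative.

```python
def goodness_of_fit(observed, expected):
    """Return G = sum((O_i - E_i)^2 / E_i) and df = C - 1."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    g = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return g, len(observed) - 1


# Hypothetical counts from 60 rolls of a die, compared with a fair die.
observed = [8, 12, 10, 9, 11, 10]
expected = [sum(observed) / len(observed)] * len(observed)
g, df = goodness_of_fit(observed, expected)
print(f"G = {g:.2f}, df = {df}")
```

The resulting G would then be compared with a tabled critical value of the χ² distribution with 5 degrees of freedom.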
This test can be extended to test the null hypothesis that several frequency distributions are independent. For example, given a 3 × 4 contingency table of frequencies, where R = 3 rows (conditions) and C = 4 columns (categories), G may be computed as

$$G = \sum_{i=1}^{C} \sum_{j=1}^{R} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \tag{3}$$

and compared against a χ² distribution with (R − 1)(C − 1) = 6 degrees of freedom. Expected frequencies are computed as the product of the marginal totals for column i and row j divided by N. Rejection of the null hypothesis implies that not all rows (or columns) were sampled from independent populations. This test may be extended to any number of dimensions.
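For a table of any size, the same calculation can be sketched as below. It assumes each expected frequency is the product of the column and row marginal totals divided by N and that the degrees of freedom are (R − 1)(C − 1). The 3 × 4 table of counts is hypothetical.

```python
def independence_statistic(table):
    """Return G and df = (R - 1)(C - 1) for an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for j, row in enumerate(table):
        for i, o in enumerate(row):
            e = col_totals[i] * row_totals[j] / n  # marginal product over N
            g += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return g, df


# Hypothetical 3 x 4 table: 3 conditions (rows) by 4 categories (columns).
counts = [
    [10, 20, 30, 40],
    [20, 20, 30, 30],
    [30, 20, 30, 20],
]
g, df = independence_statistic(counts)
print(f"G = {g:.2f}, df = {df}")
```

With three rows and four columns the function reports 6 degrees of freedom, matching the example in the text.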
These χ² tests have been found to work well with average expected frequencies as low as 2. However, these tests are inappropriate if the assumption of independent observations is violated.
COMPARISON OF DISTRIBUTIONS
A common application of χ² is to test the hypothesis that a sample’s parent population follows a particular continuous probability density function. The test is conducted by first dividing the hypothetical distribution into C “bins” of equal width. The frequencies expected for each bin (E_i) are approximated by computing the probability of randomly selecting a case from that bin and multiplying by N. Observed frequencies (O_i) are obtained by using the same bin limits in the observed distribution. The one-tailed test is conducted by using equation 2 and comparing the result to the critical value drawn from a χ² distribution with C − 1 degrees of freedom. Note that the number of bins, and points of division between bins, must be chosen arbitrarily, yet these decisions can have a large impact on conclusions.
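The binning procedure can be sketched as follows, under several assumptions not stated in the text: the hypothesized distribution is a standard normal, the equal-width bins span a chosen interval, the first and last bins absorb the tails so that the probabilities sum to one, and no parameters are estimated from the sample (so df = C − 1). The sample and the function name are hypothetical, and the sample is far too small for a real analysis.

```python
from statistics import NormalDist


def binned_fit_statistic(sample, dist, low, high, n_bins):
    """Chi-square fit statistic using n_bins equal-width bins on [low, high].

    Values outside [low, high] are counted in the first or last bin, and those
    bins also absorb the tails of the hypothesized distribution.
    """
    width = (high - low) / n_bins
    edges = [low + k * width for k in range(1, n_bins)]  # interior cut points
    # Expected counts: probability mass of each bin times N.
    cdf = [0.0] + [dist.cdf(edge) for edge in edges] + [1.0]
    n = len(sample)
    expected = [(cdf[k + 1] - cdf[k]) * n for k in range(n_bins)]
    # Observed counts using the same cut points.
    observed = [0] * n_bins
    for x in sample:
        observed[sum(x > edge for edge in edges)] += 1
    g = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return g, n_bins - 1


# Hypothetical sample tested against a standard normal distribution.
sample = [-1.8, -1.1, -0.7, -0.4, -0.2, 0.0, 0.1, 0.3, 0.6, 0.9, 1.2, 1.7]
g, df = binned_fit_statistic(sample, NormalDist(0, 1), low=-2.0, high=2.0, n_bins=4)
print(f"G = {g:.3f}, df = {df}")
```

Rerunning the sketch with a different `n_bins` or interval changes G, which illustrates how much the arbitrary binning choices can matter.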
The χ² distribution has many other applications in the social sciences, including Bartlett’s test of homogeneity of variance, Friedman’s test for median differences, tests for heteroscedasticity, nonparametric measures of association, and likelihood ratios. In addition, χ² statistics form the basis for many model fit and selection indices used in latent variable analyses, item response theory, logistic regression, and other advanced techniques. All of these methods involve the evaluation of the discrepancy between a model’s implications and observed data.
SEE ALSO Distribution, Normal
BIBLIOGRAPHY
Howell, David C. 2006. Statistical Methods for Psychology. 6th ed. Belmont, CA: Wadsworth Publishing.
Pearson, Karl. 1900. On the Criterion That a Given System of Deviations From the Probable in the Case of a Correlated System of Variables Is Such That It Can Be Reasonably Supposed To Have Arisen From Random Sampling. Philosophical Magazine 50: 157–175.
Kristopher J. Preacher
chi-squared test
chi-squared test (χ²) In statistics, a hypothesis test used to determine the goodness of fit of a particular data set with that expected from a theoretical distribution. The test statistic is a function of the difference between observed and expected values, which is compared to the chi-squared distribution. The chi-squared distribution is a distribution of sample variance based on a single parameter, the degrees of freedom.
chi-square test
chi-square test (ky-skwair) n. (in statistics) a test to determine if the difference between two groups of observations is statistically significant (see significance), used in controlled trials and other studies. It measures the differences between theoretical and observed frequencies (see frequency distribution) and identifies whether or not variables are dependent (see variable).
chi-squared test
chi-squared test (χ² test) A statistical test that is used to determine whether data obtained by sampling agree with those predicted hypothetically, and thus to test the validity of the hypothesis.
chi-square test
chi-square test: see statistics.
chi-square
chi-square See SIGNIFICANCE TESTS.