## Degrees of Freedom

There are several ways to talk about degrees of freedom, usually shortened to *df* in text. A one-sentence definition for a set of data is: *The number of degrees of freedom of a set of data is the dimension of the smallest space in which the data lie*. From this (perhaps rather puzzling) sentence, all other descriptions of *df* emerge after suitable connections are established.

Consider the following set of data: $(y_1, y_2, y_3, y_4, y_5, y_6, y_7) = (26, 32, 35, 41, 43, 51, 52)$. These are 7 values of the output variable of a simple process that has been set to 7 different sets of conditions; they were not obtained in the indicated order but have been rearranged for convenience. If all these responses were separately and independently generated, we would say that these numbers have (or possess, or carry) seven degrees of freedom. If plotted in a seven-dimensional space, they would define a single, unambiguous point. (If the entire experiment were repeated, another single point, elsewhere in this space, would be defined, and so on.)

Suppose we want to estimate the true mean $\mu$ of the distribution of possible $y$-values using this sample of values. The mean (or average) of the data is $\bar{y} = 40$. The deviations from this average are of the form $y_i - \bar{y} = -14, -8, -5, 1, 3, 11, 12$, respectively. These seven numbers also define the coordinates of a point in the seven-dimensional space, but *the point cannot lie anywhere in the space*. It is constrained by the fact that these seven numbers are now *deviations from the mean* and so must sum to zero. Thus the point defined by deviations from this estimated mean lies in a subspace of 6 dimensions defined by $\sum (y_i - \bar{y}) = 0$ and contained within the original 7-dimensional space. We can talk about this in the following manner: We have used the original seven data values to estimate one parameter, the mean $\mu$. This estimated mean “carries” or “has” or “uses up” or “takes up” one degree of freedom. Consequently the 7 deviations *from* that mean, the so-called residuals, $y_i - \bar{y}$ for $i = 1, 2, \ldots, 7$, must carry the remaining 6 *df*.

Suppose now that obtaining data from our simple process above required the setting and resetting of an input variable $x$ and that the $x$s associated with the $y$-data above were, in the same respective order, $(x_1, x_2, \ldots, x_7) = (8, 9, 10, 11, 12, 13, 14)$. We might wish to check if there were a linear relationship between the $y$s and the $x$s. This could be done by fitting a straight line, namely a *first order linear regression equation* of the form

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \tag{1}$$

where $\beta_0$ and $\beta_1$ are parameters (to be estimated by the method of least squares) and the $\varepsilon_i$, $i = 1, 2, \ldots, n$ (where $n = 7$ for our example), represent random errors. The *least squares estimates*, $b_0$ and $b_1$, of $\beta_0$ and $\beta_1$ respectively, are given by $b_0 = -8.714$, $b_1 = 4.429$, so that the *fitted equation* (or *fitted model*) is

$$\hat{y} = -8.714 + 4.429x \tag{2}$$

The residuals from this regression calculation, namely the $y_i - \hat{y}_i$ values using predictions obtained by substituting the 7 values of $x$ individually into (2), are, in order, $-0.714, 0.857, -0.571, 1.000, -1.429, 2.143, -1.286$. These residuals, like the ones above, also sum to zero. However, because of the mathematical calculations involved in the least squares process, the sum of the cross products $\sum x_i (y_i - \hat{y}_i) = 8(-0.714) + 9(0.857) + \cdots + 14(-1.286) = 0$ also. So the estimation of the 2 parameters $\beta_0$ and $\beta_1$ has now “taken up” 2 of the original 7 *df*, leaving the 7 residuals carrying only 7 – 2 = 5 *df*.
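The constraints described above can be checked numerically. The following is a minimal sketch in plain Python, assuming the usual closed-form least squares formulas for a straight line; the helper name `fit_line` is introduced here for illustration and does not come from the source.

```python
# Data from the worked example in the text.
x = [8, 9, 10, 11, 12, 13, 14]
y = [26, 32, 35, 41, 43, 51, 52]


def fit_line(x, y):
    """Return (b0, b1), the least squares intercept and slope (illustrative helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Deviations from the mean obey one linear constraint: 7 - 1 = 6 df.
y_bar = sum(y) / len(y)
deviations = [yi - y_bar for yi in y]

# Residuals from the fitted line obey two linear constraints: 7 - 2 = 5 df.
b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(round(b0, 3), round(b1, 3))
print([round(r, 3) for r in residuals])
print("sum of deviations is zero:", abs(sum(deviations)) < 1e-9)
print("sum of residuals is zero:", abs(sum(residuals)) < 1e-9)
print(
    "sum of x * residual is zero:",
    abs(sum(xi * ri for xi, ri in zip(x, residuals))) < 1e-9,
)
```

The comparisons use a small tolerance because floating-point sums are only zero up to rounding error.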

Suppose at this stage we wish to estimate the variance $\sigma^2$ of the $y$-distribution. The appropriate estimate would be $s^2 = \sum (y_i - \hat{y}_i)^2/(n - 2)$, using the $\hat{y}_i$ values of equation (2), where $n$ is the original total number of *df*. The numerical value of this estimate is $\{(-0.714)^2 + (0.857)^2 + (-0.571)^2 + (1.000)^2 + (-1.429)^2 + (2.143)^2 + (-1.286)^2\}/(7 - 2) = 10.8586/5 = 2.172$. Notice that the divisor 7 – 2 = 5 is the appropriate residual *df* left over after fitting the 2-parameter straight line model. If additional parameters were added to the model, the reduction in *df* would have to be adjusted suitably. For more examples and details, see Norman R. Draper and Harry Smith’s *Applied Regression Analysis* (1998).
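As a rough check on this arithmetic, the sketch below recomputes the variance estimate from the raw data in plain Python. The function name `straight_line_residual_variance` is an illustrative choice, not something taken from the source.

```python
def straight_line_residual_variance(x, y):
    """Return (rss, residual_df, s2) for a least squares straight-line fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # Residual sum of squares about the fitted line.
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # Two parameters (b0 and b1) were estimated, so two df are used up.
    residual_df = n - 2
    return rss, residual_df, rss / residual_df


x = [8, 9, 10, 11, 12, 13, 14]
y = [26, 32, 35, 41, 43, 51, 52]
rss, residual_df, s2 = straight_line_residual_variance(x, y)
print(round(rss, 4), residual_df, round(s2, 3))
```

Working from unrounded residuals gives a residual sum of squares of about 10.857 rather than the 10.8586 obtained from residuals rounded to three decimals, so the estimate comes out near 2.171 instead of 2.172. The divisor is 5 either way.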

More generally, all sums of squares involving response values in statistical work can be written in the form $y'My$, where $y' = (y_1, y_2, \ldots, y_n)$ and its transpose $y$ are, respectively, a row and a column vector of observations, and where $M$ is a specific $n \times n$ symmetric matrix whose diagonal elements are the coefficients of the $y_i^2$ and whose off-diagonal entries are one-half the coefficients of the $y_i y_j$ in the sum of squares. The *df* attached to such a sum of squares is always the rank of the matrix $M$. This rank can best be discovered numerically by asking a computer to produce the eigenvalues of $M$ and counting how many of these eigenvalues are nonzero.
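To make the rank statement concrete, the sketch below builds $M$ for the two sums of squares in the worked example and finds each rank. Two assumptions are made that are not in the text. First, the rank is computed by exact Gaussian elimination over fractions instead of by counting nonzero eigenvalues, so that only the Python standard library is needed. Second, the matrix for the straight-line residuals uses the standard closed form $M = I - H$ with $h_{ij} = 1/n + (x_i - \bar{x})(x_j - \bar{x})/S_{xx}$. The helper names `rank` and `quadratic_form` are illustrative.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def quadratic_form(y, m):
    """Evaluate y'My."""
    n = len(y)
    return sum(y[i] * m[i][j] * y[j] for i in range(n) for j in range(n))


x = [8, 9, 10, 11, 12, 13, 14]
y = [26, 32, 35, 41, 43, 51, 52]
n = len(y)
identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

# Sum of squared deviations from the mean: M = I - J/n, J the matrix of ones.
m_mean = [[identity[i][j] - Fraction(1, n) for j in range(n)] for i in range(n)]

# Residual sum of squares about the fitted straight line: M = I - H.
x_bar = Fraction(sum(x), n)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
m_line = [
    [m_mean[i][j] - (x[i] - x_bar) * (x[j] - x_bar) / s_xx for j in range(n)]
    for i in range(n)
]

print(quadratic_form(y, m_mean), rank(m_mean))  # sum of squares and its df
print(quadratic_form(y, m_line), rank(m_line))  # sum of squares and its df
```

With a numerical linear algebra library available, one would instead compute the eigenvalues of each symmetric matrix and count those that differ from zero by more than rounding error, as the text suggests.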

We can carry out various statistical tests on data like the above. These tests are not discussed here. However, all such tests involve a test statistic, which is compared to a selected percentage point of an appropriate statistical distribution. When the $n$ statistical errors $\varepsilon_i$ can be assumed to be normally distributed, tests on regression parameters typically involve either a $t(\nu)$ distribution or an $F(\nu_1, \nu_2)$ distribution, where the $\nu$s represent appropriate numbers of degrees of freedom determined by the *df* of quantities occurring in the tests. Tables of the percentage points of $t$ and $F$ distributions appear in most statistical textbooks, tabulated by their degrees of freedom.

Another area where degrees of freedom need to be calculated is *contingency tables*, in which discrete (noncontinuous) data arise in categories such as the number of piston-ring failures in four compressors, each of which has three sections, the North, the Center, and the South. This particular example leads to a 4 × 3 two-way table with 12 entries. The test statistic in this type of problem is distributed approximately as a $\chi^2(\nu)$ variable where, for our example, $\nu = (4 - 1)(3 - 1) = 6$. Again, percentage points of $\chi^2$ distributions appear in most statistical textbooks. A lucid account of this and similar examples is in Owen L. Davies and Peter L. Goldsmith’s *Statistical Methods in Research and Production* (1986, pp. 317-334).
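The count of 6 *df* can be illustrated with a short sketch. It assumes the usual Pearson statistic with expected counts formed from the row and column totals. The counts in `table` are made up for illustration and are not the Davies and Goldsmith data, and the function names are illustrative.

```python
def contingency_df(n_rows, n_cols):
    """Degrees of freedom for a two-way contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for row_o, row_e in zip(table, expected)
        for o, e in zip(row_o, row_e)
    )


# HYPOTHETICAL failure counts: 4 compressors (rows) by 3 sections (columns).
table = [
    [10, 12, 8],
    [9, 7, 14],
    [11, 6, 13],
    [12, 9, 19],
]

expected = expected_counts(table)
differences = [
    [o - e for o, e in zip(row_o, row_e)] for row_o, row_e in zip(table, expected)
]

print("df:", contingency_df(len(table), len(table[0])))
print("statistic:", round(chi_square_statistic(table), 3))
# Every row and every column of (observed - expected) sums to zero.
print([round(sum(row), 9) for row in differences])
print([round(sum(col), 9) for col in zip(*differences)])
```

Because the 12 differences between observed and expected counts sum to zero along every row and every column, only (4 – 1)(3 – 1) = 6 of them are free to vary, which matches the *df* of the reference distribution.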

There exists, in the area of nonparametric regression, the similar concept of *equivalent degrees of freedom*. For this, see P. J. Green and B. W. Silverman’s *Nonparametric Regression and Generalized Linear Models* (1994, p. 37).

**SEE ALSO** *Econometric Decomposition; Eigen-Values and Eigen-Vectors, Perron-Frobenius Theorem: Economic Applications; Frequency Distributions; Hypothesis and Hypothesis Testing; Least Squares, Ordinary; Nonparametric Estimation; Nonparametric Regression; Regression; Regression Analysis; Statistics*

## BIBLIOGRAPHY

Davies, Owen L., and Peter L. Goldsmith, eds. 1986. *Statistical Methods in Research and Production*. 4th ed. New York: Longman.

Draper, Norman R., and Harry Smith. 1998. *Applied Regression Analysis*. 3rd ed. New York: Wiley.

Green, P. J., and B. W. Silverman. 1994. *Nonparametric Regression and Generalized Linear Models*. New York: Chapman and Hall.

*Norman R. Draper*

## degrees of freedom

**degrees of freedom** **1.** When a substance is heated its kinetic energy increases. Kinetic energy is made up from the translation and rotation of particles, and the vibration of atoms which constitute the molecules of a substance. A substance may, therefore, absorb heat energy supplied to it in several ways, and is said to possess a number of degrees of freedom. In general, a molecule consisting of *N* atoms will have 3*N* degrees of freedom; thus for a diatomic molecule there will be six degrees of freedom: three will be translational, two rotational, and one vibrational.

In a phase diagram, describing, for example, a three-phase system (such as ice-water-vapour), pressure and/or temperature can be altered independently, in an area where only one phase exists, without altering the one-phase condition. Along the line separating two areas, if temperature is altered then pressure must alter accordingly, or vice versa, to maintain the two-phase equilibrium. At a point where three phases are in equilibrium, alteration of either temperature or pressure will cause one phase to disappear. The system thus possesses (*a*) two degrees of freedom in the area; (*b*) one degree of freedom along the line; and (*c*) no degrees of freedom at the point.

**2.** In statistics, the number of independent variables involved in calculating a statistic. This value is equal to the difference between the total number of data points under consideration and the number of restrictions. The number of restrictions is equal to the number of parameters which are the same in both the observed data set and the theoretical data set, e.g. total cumulative values, means.

## degrees of freedom

**degrees of freedom** In statistical analysis, the number of independent observations associated with an estimate of variance (see measures of variation) or of a component of an analysis of variance. The simplest example is in estimating the variance of a sample of *n* observations, where the degrees of freedom is *n* – 1, the divisor of the sum of squared deviations from the sample mean. More generally, the degrees of freedom, *f*, is the difference between the number of parameters in a given model and the number of parameters in a special case of that model with fewer parameters. The number of degrees of freedom is required when selecting a particular instance of the chi-squared distribution, Student's t distribution, or F distribution.
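The *n* – 1 divisor can be seen in a short sketch using Python's standard `statistics` module. The seven values are the *y*-values from the worked example in the first entry of this document; the variable names are illustrative.

```python
import statistics

y = [26, 32, 35, 41, 43, 51, 52]
n = len(y)

mean = sum(y) / n
sum_sq_dev = sum((v - mean) ** 2 for v in y)

# One parameter (the mean) was estimated, leaving n - 1 degrees of freedom.
df = n - 1
sample_variance = sum_sq_dev / df

print(sum_sq_dev, df, round(sample_variance, 3))
# statistics.variance uses the same n - 1 divisor.
print(round(statistics.variance(y), 3))
```

Dividing by *n* instead would give the population form, `statistics.pvariance`, which treats the mean as known and not estimated from the sample.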
