Factor Analysis
I STATISTICAL ASPECTS (A. E. Maxwell)
II PSYCHOLOGICAL APPLICATIONS (Lloyd G. Humphreys)
I STATISTICAL ASPECTS
In many fields of research–for example, agriculture (Banks 1954), psychology (Burt 1947), economics (Geary 1948), medicine (Hammond 1944; 1955), and the study of accidents (Herdan 1943), but notably in psychology and the other social sciences–an experimenter frequently has scores for each member of a sample of individuals, animals, or other experimental units on each of a number of variates, such as cognitive tests, personality inventories, sociometric and socioeconomic ratings, and physical or physiological measures. If the number of variates is large, or even moderately so, the experimenter may wish to seek some reduction or simplification of his data. One approach to this problem is to search for some hypothetical variates that are weighted sums of the observed variates and that, although fewer in number than the latter, can be used to replace them. The statistical techniques by which such a reduction of data is achieved are known collectively as factor analysis, although it is well to note here that the principal component method of analysis discussed below (see also Kendall & Lawley 1956) has certain special features. The derived variates are generally viewed merely as convenient descriptive summarizations of the observed data. But occasionally their composition is such that they appear to represent some general basic aspects of everyday life, performance or achievement, and in such cases they are often suitably labeled and are referred to as factors. Typical examples from psychology are such factors as “numerical ability,” “originality,” “neuroticism,” and “toughmindedness.” This article describes the statistical procedures in general use for arriving at these hypothetical variates or factors.
Preliminary concepts. Suppose that for a random sample of size N from some population, scores exist on each of p jointly normally distributed variates x_{i} (i = 1, 2, ..., p). If the scores on each variate are expressed as deviations from the sample mean of that variate, then an unbiased estimator of the variance of x_{i} is given by the expression
a_{ii} = \sum x_{i}^{2} / (N - 1),
summation being over the sample of size N. Similarly, an unbiased estimator of the covariance between variates x_{i} and x_{j} is given by
a_{ij} = \sum x_{i} x_{j} / (N - 1).
Note that this is conventional condensed notation. A fuller, but clumsier, notation would use x_{iv} for the deviation (v = 1, ..., N), so that \sum x_{i} x_{j} really means \sum_{v=1}^{N} x_{iv} x_{jv}.
In practice, factor analysis is often used even in cases in which its usual assumptions are known to be appreciably in error. Such uses make the tacit presumption that the effect of the erroneous assumptions will be small or negligible. Unfortunately, almost nothing is known about the circumstances under which this robustness, or insensitivity to errors in assumptions, is justified. Of course, the formal manipulations may always be carried out; the assumptions enter crucially into the distribution theory and the optimality of the estimators.
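The estimated variances and covariances between the p variates can conveniently be written in square matrix form as follows:
A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pp} \end{pmatrix}.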
Since a_{ij} = a_{ji}, the matrix A is symmetric about its main diagonal.
From the terms of A, the sample correlations, r_{ij}, between the pairs of variates may be obtained from
r_{ij} = a_{ij} / \sqrt{a_{ii} a_{jj}},
with r_{ii} = 1. The corresponding matrix is the correlation matrix.
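As a minimal sketch of these definitions, the following computes the matrix A of variances and covariances (divisor N - 1) and the correlation matrix from raw scores. The function name and the small data set are hypothetical, chosen only for illustration.

```python
import numpy as np

def covariance_and_correlation(scores):
    """scores: N x p array of raw scores, one row per individual."""
    scores = np.asarray(scores, dtype=float)
    N = scores.shape[0]
    x = scores - scores.mean(axis=0)   # deviations from the sample means
    A = x.T @ x / (N - 1)              # a_ij = sum(x_i * x_j) / (N - 1)
    sd = np.sqrt(np.diag(A))
    R = A / np.outer(sd, sd)           # r_ij = a_ij / sqrt(a_ii * a_jj)
    return A, R

# Hypothetical scores: 5 individuals on 3 variates.
scores = np.array([[2.0, 1.0, 4.0],
                   [3.0, 3.0, 5.0],
                   [5.0, 4.0, 4.0],
                   [6.0, 6.0, 8.0],
                   [9.0, 6.0, 9.0]])
A, R = covariance_and_correlation(scores)
```

Both matrices are symmetric, and the diagonal of R is unity by construction.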
The partial correlation concept is helpful here. If, to take the simplest case, estimates of the correlations between three variates are available, then the estimated correlation between any two, say x_{i} and x_{j}, for a given constant value of the third, x_{k}, can be found from the expression
r_{ij.k} = (r_{ij} - r_{ik} r_{jk}) / \sqrt{(1 - r_{ik}^{2})(1 - r_{jk}^{2})},
and is denoted by r_{ij.k}.
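The first-order partial correlation can be sketched in a few lines, using the standard formula r_{ij.k} = (r_{ij} - r_{ik} r_{jk}) / sqrt((1 - r_{ik}^2)(1 - r_{jk}^2)). The numerical values are hypothetical. The example is arranged so that the correlation between x_{i} and x_{j} is exactly the product of their correlations with x_{k}, in which case holding x_{k} constant removes the correlation entirely.

```python
import math

def partial_correlation(r_ij, r_ik, r_jk):
    """Correlation between x_i and x_j with x_k held constant."""
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

# Hypothetical values: x_i and x_j correlate only through x_k.
r_ik, r_jk = 0.8, 0.6
r_removed = partial_correlation(r_ik * r_jk, r_ik, r_jk)

# If x_k is unrelated to both variates, the correlation is unchanged.
r_unchanged = partial_correlation(0.5, 0.0, 0.0)
```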
In terms of a correlation matrix, the aim of factor analysis can be simply stated in terms of partial correlations (see Howe 1955). The first question asked is whether a hypothetical random variate f_{1} exists such that the partial correlations r_{ij.f_{1}}, for all i and j (i ≠ j), are zero, within the limits of sampling error, after the effect of f_{1} has been removed. (If this is so, it is customary to say that the correlation matrix, apart from its diagonal cells, is of rank one, but details will not be given here.) If the partial correlations are not zero, then the question is asked whether two hypothetical random variates, f_{1} and f_{2}, exist such that the partial correlations between the variates are zero after the effects of both f_{1} and f_{2} have been removed from the original matrix, and so on. (If f_{1} and f_{2} reduce the partial correlations to zero, then the matrix, apart from its diagonal cells, is said to be of rank two, and so on.) The aim of the procedure is to replace the observed variates with a set of derived variates that, although fewer in number than the former, are still adequate to account for the correlations between them. In other words, the derived variates, or factors, account for the variance common to the observed variates.
Historical note. Factor analysis is generally taken to date from 1904, when C. E. Spearman published an article entitled “‘General Intelligence’ Objectively Determined and Measured.” Spearman postulated that a single hypothetical variate would in general account for the intercorrelations of a set of cognitive tests, and this variate was his famous factor “g.” For the sets of tests that Spearman was considering, this hypothesis seemed reasonable. As further matrices of correlations became available, however, it soon became obvious that Spearman’s hypothesis was an oversimplification of the facts, and multiple factor concepts were developed. L. L. Thurstone, in America, and C. Burt and G. H. Thomson, in Britain, were the most active pioneers in this movement. Details of their contributions and references to early journal articles can be found in their textbooks (Thurstone 1935; 1947; Burt 1940; Thomson 1939). These writers were psychologists, and the statistical methods they developed for estimating factors were more or less approximate in nature. The first rigorous attempt by a mathematical statistician to treat the problem of factor estimation (as distinct from principal components) came with the publication in 1940 of a paper by D. N. Lawley entitled “The Estimation of Factor Loadings by the Method of Maximum Likelihood.” Since 1940, Lawley has published other articles dealing with various factor problems, and further contributions have been made by Howe (1955), by Anderson and Rubin (1956), and by Rao (1955), to mention just a few. Modern textbooks on factor analysis are those of Harman (1960) and Lawley and Maxwell (1963).
While methods of factor analysis, based on the above model, were being developed, Hotelling in 1933 published his principal components model, which, although it bears certain formal resemblances to the factor model proper, has rather different aims. It is widely used today and is described below.
The basic factor equations. The factor model described in general correlational terms above can be expressed more explicitly by the equations
x_{i} = \sum_{s=1}^{k} l_{is} f_{s} + e_{i}    (i = 1, 2, ..., p).    (1)
In these equations k (the number of factors) is specified; f_{s} stands for the factors (generally referred to as common factors, since they usually enter into the composition of more than one variate). The factors are taken to be normally distributed and, without loss of generality, to have zero means and unit variances; to begin with, they will be assumed to be independent. The term e_{i} refers to a residual random variate affecting only the variate x_{i}. There are p of these e_{i}, and they are assumed to be normally distributed with zero means and to be independent of each other and of the f_{s}. Their variances will be denoted by v_{i}; the diagonal matrix of the v_{i} is called V. The l-values are called loadings (weights), l_{is} being the loading of the ith variate on the sth factor. The quantities l_{is} and v_{i} are taken to be unknown parameters that have to be estimated. If a subscript for individual were introduced, it would be added to x_{i} and f_{s}, but not to l_{is} or v_{i}.
If the population variance-covariance matrix corresponding to the sample matrix A is denoted by C, with elements c_{ij}, then it follows from the model that
c_{ii} = \sum_{s=1}^{k} l_{is}^{2} + v_{i}    (2)
and
c_{ij} = \sum_{s=1}^{k} l_{is} l_{js}    (i ≠ j).    (3)
If the loadings for p variates on k factors are denoted by the p × k matrix L, with transpose L′, eqs. (2) and (3) can be combined in the single matrix equation
C = LL' + V.    (4)
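A short numerical sketch of the covariance structure implied by the model: with independent, unit-variance factors, each variance is the sum of the variate's squared loadings plus its residual variance, each covariance is the sum of products of the two variates' loadings, and in matrix form C = LL′ + V. The loadings and residual variances below are hypothetical.

```python
import numpy as np

# Hypothetical loadings for p = 4 variates on k = 2 factors,
# and hypothetical residual variances v_i.
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.6],
              [0.1, 0.9]])
v = np.array([0.35, 0.47, 0.60, 0.18])

# Matrix form: C = L L' + V, with V the diagonal matrix of the v_i.
C = L @ L.T + np.diag(v)

# Element-wise forms for the first variate and the first pair of variates.
c_11 = np.sum(L[0] ** 2) + v[0]   # variance: squared loadings plus residual
c_12 = np.sum(L[0] * L[1])        # covariance: products of loadings only
```

The residual variances enter only the diagonal of C, which is why the factors alone must account for every off-diagonal element.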
Estimating the parameters in the model. Since the introduction of multiple factor analysis, various approximate methods for estimating the parameters l_{is} and v_{i} have been proposed. Of these, the best known is the centroid, or simple summation, method. It is well described in the textbooks mentioned above, but since the arithmetic details are unwieldy, they will not be given here. The method works fairly well in practice, but there is an arbitrariness in its procedure that makes statistical treatment of it almost impossible (see Lawley & Maxwell 1963, chapter 3). For a rigorous approach to the estimation of the factor parameters, I turn to the method of maximum likelihood, although this decision requires some justification. The maximum likelihood method of factor estimation has not been widely used in the past for two reasons. First, it involves very onerous calculations which were well-nigh prohibitive before the development of electronic computers. Second, the arithmetic procedures available, which were iterative, frequently did not lead to convergent estimates of the loadings. But recently, largely because of the work of the Swedish statistician K. G. Jöreskog, quick and efficient estimation procedures have been found. These methods are still being perfected, but a preliminary account of them is contained in a recent paper (Jöreskog 1966). When they become better known, it is likely that the maximum likelihood method of factor analysis will become the accepted method. An earlier monograph by Jöreskog (1963) is also of interest. In it he links up work by Guttman (1953) on image theory with classical factor analytic concepts (see also Kaiser, in Harris 1963). (The image of a variate is defined as that part of its variance which can be estimated from the other variates in a matrix.)
The first point to note about eqs. (1) is that since the p observed variates x_{i} are expressed in terms of p + k other variates, namely, the k common factors and the p residual variates, which are not observable, these equations are not capable of direct verification. But eq. (4) implies a testable hypothesis, H_{0}, regarding the covariance matrix C: that it can be expressed as the sum of a diagonal matrix with positive diagonal elements and a symmetric positive semidefinite matrix with at most k nonzero latent roots; these matrices are, respectively, V and LL′. The value postulated for k must not be too large; otherwise, the hypothesis would be trivially true. If the v_{i} were known, it would only be necessary to require k < p, but in the more usual case, where they are unknown, the condition can be shown to be (p + k) < (p - k)^{2}. Since the x_{i} are assumed to be distributed in a multivariate normal way, the log-likelihood function, omitting a function of the observations, is given by
\mathcal{L} = -\tfrac{1}{2} n \log_{e} |C| - \tfrac{1}{2} n \sum_{i,j} a_{ij} c^{ij},    (5)
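The restriction on k can be explored with a small sketch. It uses the condition (p + k) < (p - k)^2 stated above, which is the same as requiring that the quantity ½{(p - k)^2 - (p + k)} be positive. The function names are illustrative only.

```python
def degrees_of_freedom(p, k):
    """Half of (p - k)^2 - (p + k); the difference is always an even integer."""
    return ((p - k) ** 2 - (p + k)) // 2

def max_testable_factors(p):
    """Largest k satisfying (p + k) < (p - k)^2, or 0 if no k >= 1 does."""
    k = 0
    while degrees_of_freedom(p, k + 1) > 0:
        k += 1
    return k

# For p = 6 variates: k = 1 leaves 9, k = 2 leaves 4, and k = 3 leaves 0,
# so at most two factors give a nontrivial hypothesis.
limit_for_six = max_testable_factors(6)
```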
where n = N - 1, |C| is the determinant of the matrix C, and c^{ij} is the element in the ith row and jth column of its inverse, C^{-1}. To find maximum likelihood estimators of l_{is} and v_{i}, (5) is differentiated with respect to them and the results are equated to zero. A difficulty arises, however, when k > 1, for there are then too many parameters in the model for them to be specified uniquely. This can be seen by an examination of eq. (4), for if L is postmultiplied by an orthogonal matrix M, the value of LL′, which is now given by LMM′L′, is unaltered since MM′ = I, the identity matrix. This means that the maximum likelihood method, although it provides a unique set of estimates of the c_{ij}, leads to equations for estimating the l_{is} which are satisfied by an infinity of solutions, all equally good from a statistical point of view. In this situation all the statistician can do is to select a particular solution, one that is convenient to find, and leave the experimenter to apply whatever rotation he thinks desirable. Thus the custom is to choose L in such a way that the k × k matrix J = L′V^{-1}L is diagonal. It can be shown that the successive elements of J are the latent roots, in order of magnitude, of the matrix V^{-1/2}(A - V)V^{-1/2}, so that for a given value of V, the determination of the factors in the factor model resembles the determination of the principal components in the component model.
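The resemblance to a principal component calculation can be sketched for the idealized case in which V is treated as known. The sketch below takes the k largest latent roots and vectors of V^{-1/2}(A - V)V^{-1/2} and rescales the vectors to obtain loadings for which J = L′V^{-1}L is diagonal. This is only one step of a full estimation procedure, not the complete maximum likelihood method; the function name and data are hypothetical.

```python
import numpy as np

def loadings_given_v(A, v, k):
    """Loadings (p x k) for known residual variances v, with J diagonal."""
    A = np.asarray(A, dtype=float)
    v = np.asarray(v, dtype=float)
    s = np.sqrt(v)
    S = (A - np.diag(v)) / np.outer(s, s)    # V^{-1/2} (A - V) V^{-1/2}
    roots, vectors = np.linalg.eigh(S)       # eigh returns ascending roots
    keep = np.argsort(roots)[::-1][:k]       # indices of the k largest roots
    roots, vectors = roots[keep], vectors[:, keep]
    # L = V^{1/2} * (latent vectors) * sqrt(latent roots)
    L = s[:, None] * vectors * np.sqrt(np.clip(roots, 0.0, None))
    return L, roots

# Hypothetical check: build A exactly from a two-factor model, then recover
# a loading matrix that reproduces A - V.
L_true = np.array([[0.8, 0.1], [0.7, 0.2], [0.2, 0.6], [0.1, 0.9]])
v = np.array([0.35, 0.47, 0.60, 0.18])
A = L_true @ L_true.T + np.diag(v)
L, roots = loadings_given_v(A, v, k=2)
J = L.T @ (L / v[:, None])                   # L' V^{-1} L
```

The recovered L generally differs from L_true by an orthogonal rotation, which illustrates the indeterminacy discussed above: both reproduce LL′ exactly.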
The maximization of eq. (5) with the above diagonalization side condition leads to the equations
\hat{V} = \mathrm{diag}(A - \hat{L}\hat{L}')    (6)
and
\hat{L}' = \hat{J}^{-1} \hat{L}' \hat{V}^{-1} (A - \hat{V}),    (7)
where circumflex accents denote estimates of the parameters in question. Eq. (7) can usually be solved by iterative methods and details of those in current use can be found in Lawley and Maxwell (1963), Howe (1955), and Jöreskog (1963; 1966). The calculations involved are onerous, and when p is fairly large, say 12 or more, an electronic computer is essential.
A satisfactory property of the above method of estimation, which does not hold for the centroid and principal component methods, is that it can be shown to be independent of the metric used. A change of scale of any variate x_{i} merely introduces proportional changes in its loadings.
Testing hypotheses on number of factors. In the factor analysis of a set of data the value of k is seldom known in advance and has to be estimated. To begin with, some value of it is assumed and a matrix of loadings L for this value is estimated. The effects of the factors concerned are now eliminated from the observed covariance (or correlation) matrix, and the residual matrix, A - LL′, is tested for significance. If it is found to be statistically significant, the value of k is increased by one and the estimation process is repeated. The test employed is of the large sample chi-square type, based on the likelihood ratio method of Neyman and Pearson, and is given by
n \{ \log_{e} |\hat{C}| - \log_{e} |A| + \mathrm{tr}(A \hat{C}^{-1}) - p \},    (8)
where \hat{C} = \hat{L}\hat{L}' + \hat{V}, with ½{(p - k)^{2} - (p + k)} degrees of freedom. A good approximation to expression (8), and one easier to calculate, is
n \sum_{i<j} (a_{ij} - \hat{c}_{ij})^{2} / (\hat{v}_{i} \hat{v}_{j}).    (9)
There is also some evidence to suggest that the test can be improved by replacing n by
n' = n - \tfrac{1}{6}(2p + 5) - \tfrac{2}{3} k.
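A hedged sketch of the test computation follows. It assumes the usual likelihood-ratio form of the criterion, n{log|Ĉ| - log|A| + tr(AĈ^{-1}) - p} with Ĉ = L̂L̂′ + V̂, together with the degrees of freedom ½{(p - k)^2 - (p + k)} quoted in the text. The function name, the one-factor loadings, and the sample size are hypothetical, and the estimates are held fixed for illustration rather than re-estimated for each matrix.

```python
import numpy as np

def likelihood_ratio_statistic(A, L_hat, v_hat, n):
    """Return (statistic, degrees of freedom) for a fitted k-factor model."""
    p, k = L_hat.shape
    C_hat = L_hat @ L_hat.T + np.diag(v_hat)
    logdet_C = np.linalg.slogdet(C_hat)[1]
    logdet_A = np.linalg.slogdet(A)[1]
    stat = n * (logdet_C - logdet_A + np.trace(A @ np.linalg.inv(C_hat)) - p)
    df = ((p - k) ** 2 - (p + k)) // 2
    return stat, df

# Hypothetical one-factor fit to four standardized variates.
L_hat = np.array([[0.8], [0.7], [0.6], [0.5]])
v_hat = 1.0 - L_hat[:, 0] ** 2
C_hat = L_hat @ L_hat.T + np.diag(v_hat)

# A matrix reproduced exactly by the model gives a statistic of zero.
stat_fit, df = likelihood_ratio_statistic(C_hat, L_hat, v_hat, n=199)

# Adding covariance between variates 1 and 2 that one factor cannot absorb.
A = C_hat.copy()
A[0, 1] += 0.10
A[1, 0] += 0.10
stat_misfit, _ = likelihood_ratio_statistic(A, L_hat, v_hat, n=199)
```

In use, the statistic would be referred to a chi-square table with df degrees of freedom; a significant value suggests that k should be increased by one.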
Factor interpretation. As already mentioned, the matrix of loadings, L, given by a factor analysis is not unique and can be replaced by an equivalent set LM where M is an orthogonal matrix. This fact is frequently used by experimenters when interpreting their results, a matrix M being chosen that will in some way simplify the pattern of loadings or make it more intuitively meaningful. For example, M may be chosen so as to reduce to zero, or nearly zero, as many loadings as possible in order to reduce the number of parameters necessary for describing the data. Again, M may be chosen so as to concentrate the loadings of variates of similar content, say verbal tests, on a single factor so that this factor may be labeled appropriately. Occasionally, too, the factors are allowed to become correlated if this seems to lead to more meaningful results.
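The non-uniqueness of L can be checked numerically. In this sketch a hypothetical loading matrix is postmultiplied by an orthogonal matrix M (a plane rotation through an arbitrary angle); the individual loadings change, but LL′, and hence the fit to the covariances, does not.

```python
import numpy as np

# Hypothetical loadings for four variates on two factors.
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.6],
              [0.1, 0.9]])

theta = np.deg2rad(30.0)                       # arbitrary rotation angle
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

L_rotated = L @ M                              # an equivalent set of loadings

# Sum of squared loadings per variate (its common-factor variance).
common_before = np.sum(L ** 2, axis=1)
common_after = np.sum(L_rotated ** 2, axis=1)
```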
It is now clear that given a matrix of loadings from some analysis, different experimenters might choose different rotation matrices in their interpretation of the data. This subjective element in factor analysis has led to a great deal of controversy. To avoid subjectivity, various empirical methods of rotation have been proposed which, while tending to simplify the pattern of loadings, also lead to unique solutions. The best known of these are the varimax and the promax methods (for details see Kaiser 1958; Hendrickson & White 1964). But another approach to the problem, proposed independently by Howe (1955), Anderson and Rubin (1956), and Lawley (1958), seems promising. From prior knowledge the experimenter is asked to postulate in advance (a) how many factors he expects from his data and (b) which variates will have zero loadings on the several factors. In other words, he is asked to formulate a specific hypothesis about the factor composition of his variates. The statistician then estimates the nonzero loadings and makes a test of the “goodness of fit” of the factor structure. In this approach the factors may be correlated or uncorrelated, and in the former case estimates of the correlations between them are obtained. The equations of estimation and illustrative examples of their application can be found in Howe (1955) and in Lawley and Maxwell (1963; 1964); the latter gives a quick method of finding approximate estimates of the nonzero loadings.
Estimating factor scores. As the statistical theory of factor analysis now stands, estimation is a twofold process. First, the factor structure, as described above, of a set of data is determined. In practice, however, it is often desirable to find, in addition, equations for estimating the scores of individuals on the factors themselves. One method of doing this, developed by Thomson, is known as the “regression method.” In it the l_{is} are taken to be the covariances between the f_{s} and the x_{i}, and then for uncorrelated factors the estimation equation is
\hat{f} = L' C^{-1} x,    (10)
or, more simply from the computational viewpoint,
\hat{f} = (I + J)^{-1} L' V^{-1} x,    (11)
where x = (x_{1}, x_{2}, ..., x_{p})′, \hat{f} = (\hat{f}_{1}, \hat{f}_{2}, ..., \hat{f}_{k})′, and, as before, J = L′V^{-1}L, and I is the identity matrix. If sampling errors in L and V are neglected, the covariance matrix for the errors of estimates of the factor scores is given by (I + J)^{-1}.
If the factors are correlated and their estimated correlation matrix is denoted by P (so that C is now LPL′ + V), then eqs. (10) and (11) become, respectively,
\hat{f} = P L' C^{-1} x    (12)
and
\hat{f} = (P^{-1} + J)^{-1} L' V^{-1} x,    (13)
while the errors of estimates are given by (P^{-1} + J)^{-1}. An alternative method of estimating factor scores is that of Bartlett (1938). Here, the principle adopted is the minimization, for a given set of observations, of \sum_{i} e_{i}^{2} / v_{i}, which is the sum of squares of standardized residuals. The estimation equation now is
\hat{f}^{*} = J^{-1} L' V^{-1} x.    (14)
It is of interest to note that although the sets of estimates obtained by the two methods have been reached by entirely different approaches, a comparison shows that they are simply related. For uncorrelated factors the relationship is
\hat{f}^{*} = (I + J^{-1}) \hat{f};
for correlated factors it is
\hat{f}^{*} = (I + J^{-1} P^{-1}) \hat{f}.
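The two kinds of factor-score estimates can be compared in a short sketch for uncorrelated factors. It assumes the standard forms of the estimators: the regression estimate (I + J)^{-1}L′V^{-1}x, which equals L′C^{-1}x, and Bartlett's estimate J^{-1}L′V^{-1}x, with J = L′V^{-1}L. Loadings, residual variances, and the observation vector are hypothetical, and L and V are treated as known.

```python
import numpy as np

def regression_scores(L, v, x):
    """Regression-method estimate: (I + J)^{-1} L' V^{-1} x."""
    LtVinv = (L / v[:, None]).T               # L' V^{-1}, a k x p matrix
    J = LtVinv @ L
    return np.linalg.solve(np.eye(L.shape[1]) + J, LtVinv @ x)

def bartlett_scores(L, v, x):
    """Bartlett's estimate: J^{-1} L' V^{-1} x."""
    LtVinv = (L / v[:, None]).T
    J = LtVinv @ L
    return np.linalg.solve(J, LtVinv @ x)

# Hypothetical model and one individual's deviation scores.
L = np.array([[0.8, 0.1], [0.7, 0.2], [0.2, 0.6], [0.1, 0.9]])
v = np.array([0.35, 0.47, 0.60, 0.18])
x = np.array([1.2, 0.4, -0.3, -0.8])

f_reg = regression_scores(L, v, x)
f_bart = bartlett_scores(L, v, x)
J = L.T @ (L / v[:, None])
C = L @ L.T + np.diag(v)
```

Under these assumptions the Bartlett estimates equal (I + J^{-1}) times the regression estimates, so the regression estimates are shrunk toward zero relative to Bartlett's.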
Comparing factors across populations. If factors can be viewed as representing “permanent” aspects of behavior or performance, ways of identifying them from one population to another are required. In the past, identification has generally been based on the comparison of matrices of loadings. In the case of two matrices, a common approach, developed by Ahmavaara (1954) and Cattell and Hurley (1962), is to rotate one into maximum conformity in the least-squares sense with the other. For example, the matrix required for rotating L_{1} into maximum conformity with L_{2}, when they both involve the same variates, is obtained by calculating the expression (L_{1}′L_{1})^{-1}L_{1}′L_{2} and normalizing it by columns. The factors represented by L_{1} in its transformed state are likely to be more or less correlated, but estimates of the correlations between them are given by (M′M)^{-1}, standardized so that its diagonal cells are unity, where M = (L_{1}′L_{1})^{-1}L_{1}′L_{2}, normalized by columns, and M′ is its transpose. This procedure is fairly satisfactory when the sample covariance matrices involved do not differ significantly. When they do, the problem of identifying factors is more complicated.
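The rotation into maximum conformity can be sketched as an ordinary least-squares problem. The sketch assumes the transformation is (L_{1}′L_{1})^{-1}L_{1}′L_{2} with columns then scaled to unit length, and that the correlations between the transformed factors are obtained by standardizing (M′M)^{-1}. Function and variable names are illustrative. In the hypothetical example L_{2} is an exact orthogonal rotation of L_{1}, so the procedure should recover that rotation and report uncorrelated factors.

```python
import numpy as np

def conformity_transformation(L1, L2):
    """Least-squares transformation of L1 toward L2, normalized by columns."""
    M = np.linalg.solve(L1.T @ L1, L1.T @ L2)   # (L1'L1)^{-1} L1'L2
    M = M / np.linalg.norm(M, axis=0)           # unit-length columns
    G = np.linalg.inv(M.T @ M)
    d = np.sqrt(np.diag(G))
    factor_corr = G / np.outer(d, d)            # diagonal cells set to unity
    return M, factor_corr

# Hypothetical loadings; L2 is L1 rotated through 25 degrees.
L1 = np.array([[0.8, 0.1], [0.7, 0.2], [0.2, 0.6], [0.1, 0.9]])
theta = np.deg2rad(25.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
L2 = L1 @ rotation

M, factor_corr = conformity_transformation(L1, L2)
```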
A possible approach to it has been suggested by Lawley and Maxwell (1963, chapter 8), who make the assumption that although two covariance matrices, C_{1} and C_{2}, involving the same variates may be different, they may still have the same L-matrix. This could occur if the two k × k covariance matrices Γ_{1} and Γ_{2} between the factors themselves were different. To keep the model fairly simple, they assume that the residual variances in the populations are in each case V and then set up the equations
C_{1} = L \Gamma_{1} L' + V,    C_{2} = L \Gamma_{2} L' + V.    (15)
For this model Lawley and Maxwell show how estimates of L, V, Γ_{1}, and Γ_{2} may be obtained from two sample covariance matrices A_{1} and A_{2}. They also supply a test for assessing the significance of the difference between the estimates of Γ_{1} and Γ_{2}, and also for testing the “goodness of fit” of the model.
The method of principal components
The principal component method of analyzing a matrix of covariances or correlations is also widely used in the social sciences. The components correspond to the latent roots of the matrix, and the weights defining them are proportional to the corresponding latent vectors.
The model can also be stated in terms of the observed variates and the derived components. An orthogonal transformation is applied to the x_{i} (i = 1, 2, ..., p) to produce a new set of uncorrelated variates y_{1}, y_{2}, ..., y_{p}. These are chosen such that y_{1} has maximum variance, y_{2} has maximum variance subject to being uncorrelated with y_{1}, and so on. This is equivalent to a rotation of the coordinate system so that the new coordinate axes lie along the principal axes of an ellipsoid closely related to the covariance structure of the x_{i}. The transformed variates are then standardized to give a new set, which will be denoted z_{s}. When this method is used, no hypothesis need be made about the nature or distribution of the x_{i}. The model is by definition linear and additive, and the basic equations are
x_{i} = \sum_{s=1}^{p} w_{is} z_{s}    (i = 1, 2, ..., p),    (16)
where z_{s} stands for the sth component, and w_{is} is the weight of the sth component in the ith variate. In matrix notation eqs. (16) become
x = Wz
where x = (x_{1}, x_{2}, ..., x_{p})′, z = (z_{1}, z_{2}, ..., z_{p})′, and W is a square matrix of order p with elements w_{is}.
Comparison of eqs. (16) with eqs. (1) shows that in the principal component model residual variates do not appear, and that if all p components are obtained, the sample covariances can be reproduced exactly, that is, A = WW′. Indeed, there is a simple reciprocal relationship between the observed variates and the derived components.
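A minimal sketch of the component weights, using a direct latent-root decomposition rather than Hotelling's iterative method: each column of W is a latent vector of A scaled by the square root of its latent root. With x = Wz and uncorrelated, unit-variance components z, the covariances are then reproduced as WW′. The covariance matrix is hypothetical.

```python
import numpy as np

def principal_component_weights(A):
    """Return (W, roots) with roots in descending order and A = W W'."""
    roots, vectors = np.linalg.eigh(A)       # ascending latent roots
    order = np.argsort(roots)[::-1]
    roots, vectors = roots[order], vectors[:, order]
    W = vectors * np.sqrt(roots)             # scale each latent vector
    return W, roots

# Hypothetical covariance matrix for three variates.
A = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])
W, roots = principal_component_weights(A)

# Proportion of total variance accounted for by each component.
proportion = roots / roots.sum()
```

Retaining only the first few columns of W gives the usual reduced summary, at the cost of reproducing A only approximately.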
A straightforward iterative method for obtaining the weights w_{is} is given by Hotelling in his original papers; the details are also given in most textbooks on factor analysis. In practice, all p components are seldom found, for a small number generally accounts for a large percentage of the variance of the variates and can be used to summarize the data. There is also a criterion, developed by Bartlett (1950; 1954), for testing the equality of the remaining latent roots of a matrix after the first k have been extracted; this is sometimes used to help in deciding when to stop the analysis.
The principal component method is most useful when the variates x_{i} are all measured in the same units. Otherwise, it is more difficult to justify. A change in the scales of measurement of some or all of the variates results in the covariance matrix being multiplied on both sides by a diagonal matrix. The effect of this on the latent roots and vectors is very complicated, and unfortunately the components are not invariant under such changes of scale. Because of this, the principal component approach is at a disadvantage in comparison with the proper factor analysis approach.
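The lack of invariance is easy to see numerically. In this sketch a hypothetical covariance matrix is rescaled by expressing the third variate in units ten times smaller, which multiplies A on both sides by a diagonal matrix D. The first component, previously dominated by the first two variates, becomes almost entirely the third variate.

```python
import numpy as np

def first_component(A):
    """Return (share of total variance, latent vector) for the largest root."""
    roots, vectors = np.linalg.eigh(A)       # ascending latent roots
    return roots[-1] / roots.sum(), vectors[:, -1]

# Hypothetical covariance matrix for three variates.
A = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])

D = np.diag([1.0, 1.0, 10.0])                # change of units, third variate
A_scaled = D @ A @ D

share, vector = first_component(A)
share_scaled, vector_scaled = first_component(A_scaled)
```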
A. E. MAXWELL
[see also CLUSTERING; DISTRIBUTIONS, STATISTICAL, article on MIXTURES OF DISTRIBUTIONS; LATENT STRUCTURE; STATISTICAL IDENTIFIABILITY.]
BIBLIOGRAPHY
AHMAVAARA, Y. 1954 Transformational Analysis of Factorial Data. Suomalainen Tiedeakatemia, Helsinki, Toimituksia: Annales Series B 88, no. 2.
ANDERSON, T. W.; and RUBIN, HERMAN 1956 Statistical Inference in Factor Analysis. Volume 5, pages 111–150 in Berkeley Symposium on Mathematical Statistics and Probability, Third, Proceedings. Edited by Jerzy Neyman. Berkeley: Univ. of California Press.
BANKS, CHARLOTTE 1954 The Factorial Analysis of Crop Productivity: A Reexamination of Professor Kendall’s Data. Journal of the Royal Statistical Society Series B 16:100–111.
BARTLETT, M. S. 1938 Methods of Estimating Mental Factors. Nature 141:609–610.
BARTLETT, M. S. 1950 Tests of Significance in Factor Analysis. British Journal of Psychology (Statistical Section) 3:77–85.
BARTLETT, M. S. 1954 A Note on the Multiplying Factor for Various χ^{2} Approximations. Journal of the Royal Statistical Society Series B 16:296–298.
BURT, CYRIL 1940 The Factors of the Mind: An Introduction to Factor-analysis in Psychology. Univ. of London Press.
BURT, CYRIL 1947 Factor Analysis and Physical Types. Psychometrika 12:171–188.
CATTELL, RAYMOND B.; and HURLEY, JOHN R. 1962 The Procrustes Program: Producing Direct Rotation to Test a Hypothesized Factor Structure. Behavioral Science 7:258–262.
GEARY, R. C. 1948 Studies in Relationships Between Economic Time Series. Journal of the Royal Statistical Society Series B 10:140–158.
GIBSON, W. A. 1960 Nonlinear Factors in Two Dimensions. Psychometrika 25:381–392.
HAMMOND, W. H. 1944 Factor Analysis as an Aid to Nutritional Assessment. Journal of Hygiene 43:395–399.
HAMMOND, W. H. 1955 Measurement and Interpretation of Subcutaneous Fats, With Norms for Children and Young Adult Males. British Journal of Preventive and Social Medicine 9:201–211.
HARMAN, HARRY H. 1960 Modern Factor Analysis. Univ. of Chicago Press. → A new edition was scheduled for publication in 1967.
HARRIS, CHESTER W. (editor) 1963 Problems in Measuring Change: Proceedings of a Conference. Madison: Univ. of Wisconsin Press. → See especially “Image Analysis” by Henry F. Kaiser.
HENDRICKSON, ALAN E.; and WHITE, PAUL O. 1964 Promax: A Quick Method for Rotation to Oblique Simple Structure. British Journal of Statistical Psychology 17:65–70.
HERDAN, G. 1943 The Logical and Analytical Relationship Between the Theory of Accidents and Factor Analysis. Journal of the Royal Statistical Society Series A 106:125–142.
HORST, PAUL 1965 Factor Analysis of Data Matrices. New York: Holt.
HOTELLING, HAROLD 1933 Analysis of a Complex of Statistical Variables Into Principal Components. Journal of Educational Psychology 24:417–441, 498–520.
HOWE, W. G. 1955 Some Contributions to Factor Analysis. Report No. ORNL-1919, U.S. National Laboratory, Oak Ridge, Tenn. Unpublished manuscript.
JÖRESKOG, K. G. 1963 Statistical Estimation in Factor Analysis: A New Technique and Its Foundation. Stockholm: Almqvist & Wiksell.
JÖRESKOG, K. G. 1966 Testing a Simple Hypothesis in Factor Analysis. Psychometrika 31:165–178.
KAISER, HENRY F. 1958 The Varimax Criterion for Analytic Rotation in Factor Analysis. Psychometrika 23: 187–200.
KENDALL, M. G.; and LAWLEY, D. N. 1956 The Principles of Factor Analysis. Journal of the Royal Statistical Society Series A 119:83–84.
LAWLEY, D. N. 1940 The Estimation of Factor Loadings by the Method of Maximum Likelihood. Royal Society of Edinburgh, Proceedings 60:64–82.
LAWLEY, D. N. 1953 A Modified Method of Estimation in Factor Analysis and Some Large Sample Results. Pages 35–42 in Uppsala Symposium on Psychological Factor Analysis, March 17–19, 1953. Nordisk Psykologi, Monograph Series, No. 3. Uppsala (Sweden): Almqvist & Wiksell.
LAWLEY, D. N. 1958 Estimation in Factor Analysis Under Various Initial Assumptions. British Journal of Statistical Psychology 11:1–12.
LAWLEY, D. N.; and MAXWELL, ALBERT E. 1963 Factor Analysis as a Statistical Method. London: Butterworth.
LAWLEY, D. N.; and MAXWELL, A. E. 1964 Factor Transformation Methods. British Journal of Statistical Psychology 17:97–103.
MAXWELL, A. E. 1964 Calculating Maximum-likelihood Factor Loadings. Journal of the Royal Statistical Society Series A 127:238–241.
RAO, C. R. 1955 Estimation and Tests of Significance in Factor Analysis. Psychometrika 20:93–111.
SPEARMAN, C. E. 1904 “General Intelligence” Objectively Determined and Measured. American Journal of Psychology 15:201–293.
THOMSON, GODFREY H. (1939) 1951 The Factorial Analysis of Human Ability. 5th ed. Boston: Houghton Mifflin.
THURSTONE, LOUIS L. 1935 The Vectors of Mind: Multiple-factor Analysis for the Isolation of Primary Traits. Univ. of Chicago Press.
THURSTONE, LOUIS L. 1947 Multiple-factor Analysis. Univ. of Chicago Press. → A development and expansion of Thurstone’s The Vectors of Mind, 1935.
II PSYCHOLOGICAL APPLICATIONS
The essential statistical problem of factor analysis involves reduction or simplification of a large number of variates so that some hypothetical variates, fewer in number, which are weighted sums of the observed variates, can be used to replace them. If psychological experimenters were satisfied with this sole, statistical objective, there would be no problem of psychological interpretation and of meaning of factors. They would simply be convenient abstractions. However, psychologists and psychometricians, starting with Charles Spearman (1904), the pioneer factor analyst, have wanted to go beyond this objective and have thereby created the very large psychological literature in this field. The goal of factor analysts following in the Spearman tradition has been to find not only convenient statistical abstractions but the elements or the basic building blocks, the primary mental abilities and personality traits in human behavior. Such theorists have explicitly accepted chemical elements (sometimes even the periodic table) as their model and factor analysis as the method of choice in reaching their goal.
Factor interpretation and methodology
Factor extraction methods. There are several variations of factor methods, certain of which are more amenable to psychological interpretation than others. For example, the experimenter can start his analysis from a variance-covariance matrix or from a correlation matrix with estimated communalities (discussed below) in the principal diagonal. If he is interested in psychological interpretations of factors, he almost uniformly selects the latter, since use of the variance-covariance procedure results in obtaining factors that contain unknown amounts of common-factor, nonerror-specific, and error components. For purposes of psychological interpretation, including generalizing to new samples of psychological measures, the inclusion of nonerror-specific and error variance in the factors is undesirable. The intercorrelations and communalities, on the other hand, are determined only by the common factors.
The experimenter also has a choice among several methods of factor extraction, including the centroid, principal components (sometimes called principal axes), and maximum likelihood methods. Choice among these is based largely on feasibility criteria. The first was used almost exclusively before the advent of high-speed digital computers. The third is generally acknowledged to be superior statistically to the second, but it is generally too expensive in time and computer resources for routine use. The second is at present the method most frequently used by psychologists, since it extracts a maximum amount of variance with each successive factor. The centroid method only approximates this criterion, although frequently it is a close approximation. There is thus no pressing need to redo all previous work involving the centroid method now that computational facilities are available. The maximum likelihood method can and should be used, as a check on conclusions reached with the more economical principal components, when size of matrix and computer availability make it feasible.
The communality problem. When the experimenter elects to analyze correlations and communalities, he must estimate the latter. These communalities represent the proportion of common factor variance in the total variance of a variable: the amount that a variable has in common with other variables in a particular study. Unfortunately, from the methodological viewpoint, there is no way to obtain an unbiased estimate of the communality. Several rule-of-thumb methods are available, and there are theoretically sound upper and lower bounds for the communality estimate.
An unbiased reliability estimate can be used as an upper bound for the estimated communality. Reliability and communality differ to the extent that reliability includes specific nonerror variance. A lower-bound estimate in the population of persons is the squared multiple correlation between each variable and all of the others (Guttman 1954). The reader should note, however, that while this procedure provides a lower-bound estimate for the population, a sample value can be seriously inflated. The multiple correlation coefficient capitalizes on chance very effectively. For example, when the number of variables equals the number of observations, the multiple correlation in the sample is necessarily unity, although the population value may in fact be zero. The investigator who wants a lower-bound estimate may still utilize the Guttman theorem if he estimates the population values from sample values that are corrected for their capitalization on chance.
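The lower bound can be sketched with the standard identity that the squared multiple correlation of variable i with all the others equals 1 - 1/r^{ii}, where r^{ii} is the ith diagonal element of the inverse correlation matrix. The example uses a hypothetical population correlation matrix generated by a single common factor, so the true communalities are known and can be compared with the bound. No correction for capitalization on chance is attempted here.

```python
import numpy as np

def squared_multiple_correlations(R):
    """Squared multiple correlation of each variable with all the others."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Hypothetical one-factor population: off-diagonal r_ij = l_i * l_j.
loadings = np.array([0.9, 0.8, 0.7, 0.6])
communalities = loadings ** 2
R = np.outer(loadings, loadings)
np.fill_diagonal(R, 1.0)

smc = squared_multiple_correlations(R)

# With only two variables the squared multiple correlation is simply r^2.
smc_pair = squared_multiple_correlations(np.array([[1.0, 0.5], [0.5, 1.0]]))
```

In this population example every squared multiple correlation falls below the corresponding communality, as the Guttman result requires; a sample matrix would not carry the same guarantee.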
Number of factors. If he is interested in the psychological meaning of his factors, the experimenter has a further choice among criteria for determining the number of factors to retain and interpret. When estimated communalities are employed, no one of the possible criteria is more than a rule of thumb. The various criteria lead to radically different decisions concerning the number of factors to be retained; and different investigators, in applying one, several, or all of these criteria, will reach different conclusions about the number of factors.
One class of criteria for determining the number of factors has been characterized as emphasizing psychological importance without regard to sampling stability. Some investigators use some absolute value of the factor loadings, e.g., .30, either rotated or unrotated or both, without regard to the number of observations on which the correlations are based. A more recent suggestion has been to retain factors whose principal roots are greater than unity (Kaiser & Caffrey 1965). Such criteria appear to make an assumption that the number of observations is very large, so that the factors and loadings that are large enough psychologically are at the same time not the result of sampling error.
A second class of criteria has been characterized as emphasizing the number of observations, even though there are no known sampling distributions for factors or factor loadings. In several related criteria, factor loadings and/or residuals are compared in one way or another with the standard error of correlation coefficients of zero magnitude for the sample size involved. Factoring of the intercorrelations of random normal deviates as a method of obtaining empirical sampling errors has also been used.
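Factoring the intercorrelations of random normal deviates to obtain empirical sampling errors might look like the following sketch. The study dimensions (`n_obs`, `n_vars`), the number of replications, and the choice of the 95th percentile as a benchmark are hypothetical choices for illustration, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_vars, n_reps = 100, 12, 200   # hypothetical study dimensions

eigs = np.empty((n_reps, n_vars))
for r in range(n_reps):
    X = rng.standard_normal((n_obs, n_vars))          # uncorrelated in the population
    R = np.corrcoef(X, rowvar=False)
    eigs[r] = np.sort(np.linalg.eigvalsh(R))[::-1]    # principal roots, descending

mean_roots = eigs.mean(axis=0)                 # average root at each position
upper_roots = np.percentile(eigs, 95, axis=0)  # an empirical benchmark for each root

# Standard error of a zero correlation for this sample size, for comparison.
se_zero_r = 1.0 / np.sqrt(n_obs - 1)

print("mean roots from random data:", np.round(mean_roots, 2))
print("SE of a zero correlation:   ", round(se_zero_r, 3))
```

Although every population root equals 1, the largest sample roots exceed unity in every replication, so a roots-greater-than-unity rule applied to these data would retain factors that reflect nothing but sampling error.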
A third criterion involves the “psychological meaning” of the rotated factors: the investigator merely states in effect that he is satisfied with the results of his analysis. Since any behavioral scientist of any modest degree of ingenuity can rationalize the random grouping of any set of variables, this does not appear to be a useful criterion scientifically. Without agreement on an objective criterion, however, psychological meaning of the factors tends to be the principal criterion used in deciding upon the number of factors to interpret.
Even in situations where probabilities of alpha and beta errors can be estimated, different investigators, depending on their temperaments or on social consequences, may set quite different standards for such errors. In determining the number of factors, however, there are no objective methods of error estimation, and the range of probabilities of alpha and beta errors resulting from differences among investigators or differences in social consequences is increased severalfold. For example, for one matrix of personality variables two investigators differ by a ratio of four to one in their assessment of the proper number of factors to retain and interpret. The difference between 12 and 3 factors is far from trivial. Such discrepancies reduce factor analysis to a hypothesis formation technique. As a method of discovery of psychological principles, or of hypothesis testing generally, ambiguities of this magnitude cannot be tolerated. The lack of a suitable test for number of factors has opened the door for a great deal of poor research.
Factor rotations. After the factors are extracted, the experimenter has to decide whether to rotate or not. Rotation of axes to psychologically meaningful positions follows inevitably from an interest in finding the psychological elements.
The rotation problem is seen most clearly in the two-factor case. First, the two factors are conceptualized as orthogonal (perpendicular) dimensions extending from values of -1.00 to +1.00. Then the points representing the loadings of the tests on these factors are fixed in the space defined by these dimensions. Imagine now that a pin is inserted at the origin of the two dimensions and that these are now rotated about the pin. Wherever they stop, new coordinates can be determined for the test points. It must be noted that the test points are located as accurately by the new dimensions as by the original ones, and that the intercorrelations of the tests are described with equal accuracy. There are, in point of fact, an infinite number of positions of the coordinates and thus an infinite number of mathematical solutions to the factor problem. The investigator interested in psychological meaning rotates the dimensions into some psychologically unique position. It is important, where possible, that factor descriptions of measures remain stable from sample to sample of either persons or measures, or both. This can be achieved, apparently in the great majority of cases, with an adequate rotational solution.
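The point that rotated dimensions describe the intercorrelations with equal accuracy can be verified directly. In the sketch below, the loading matrix `A` and the rotation angle of 35 degrees are hypothetical; any angle would serve.

```python
import numpy as np

# Hypothetical loadings of five tests on two orthogonal factors.
A = np.array([[0.7, 0.3],
              [0.6, 0.4],
              [0.5, 0.5],
              [0.3, 0.7],
              [0.2, 0.6]])

def rotate(loadings, degrees):
    """Rigidly rotate a two-column loading matrix about the origin."""
    t = np.deg2rad(degrees)
    T = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return loadings @ T

B = rotate(A, 35.0)

# Correlations reproduced from the common factors, before and after rotation.
reproduced_before = A @ A.T
reproduced_after = B @ B.T

# Communalities (row sums of squared loadings), before and after rotation.
communalities_before = (A ** 2).sum(axis=1)
communalities_after = (B ** 2).sum(axis=1)

print(np.allclose(reproduced_before, reproduced_after))
print(np.allclose(communalities_before, communalities_after))
```

The individual loadings change, but the reproduced correlations and the communalities do not, which is why the data alone cannot select one position of the axes over another.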
Rotation is almost uniformly performed when factors are obtained from a correlation matrix having communality estimates in the diagonal. Factors obtained from the variance-covariance matrix, on the other hand, are generally not rotated and are preferred by the experimenter interested in description alone rather than in explanation. The experimenter also has a choice among several different rotational methods, based upon different criteria and leading to either orthogonal or oblique factors.
Orthogonal versus oblique rotation. Orthogonal rotations offer the simplicity of uncorrelated dimensions in exchange for a poorer fit of the test points. Oblique rotations offer a better fit for the test points in exchange for a complexity of correlated dimensions. If oblique rotations are used, the investigator can also elect to factor in the second and perhaps higher orders; i.e., he can factor the intercorrelations among his first-order factors, among his second-order factors, and so on. After factoring in several orders, the investigator also has the option of presenting and interpreting his results in the several orders, or, by means of a simple transformation, he can convert the oblique factors in several orders to orthogonal, hierarchical factors in a single order.
Until the advent of high-speed digital computers, basically the only method for achieving a given rotational result was hand rotation. There are now several computer programs for rotation to either orthogonal or oblique structure.
If the investigator elects an orthogonal solution to his problem, he has a number of programs among which to choose. One of the earlier programs is the quartimax of Neuhaus and Wrigley (1954). This was followed by Kaiser’s varimax program (1958). An important difference between the two is that quartimax typically produces a general factor in ability data which is a function of the sampling of test variables, i.e., the general factor may reflect verbal, perceptual, or other specific emphasis, depending upon the nature of the tests sampled. Varimax provides results that are more stable from one test battery to another. This is achieved by a more even distribution of variance among the rotated factors. In the opinion of many investigators, varimax rotations have achieved a near-ultimate status for the orthogonal case, but Schonemann (1964) has now developed a program that he calls varisim, which spreads existing variance more evenly among the several factors than varimax does. Results from the two programs are not completely parallel even for well-defined factors. There is as much rationale for varisim as for varimax. In consequence, the ultimate status of varimax has been dislodged, and we are again faced with a somewhat arbitrary choice among orthogonal rotational methods.
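For readers who want to see what an analytic orthogonal rotation does, here is a compact sketch of the orthomax family, of which quartimax and varimax are members. It uses a widely circulated iterative scheme based on the singular value decomposition; it is not the original Neuhaus and Wrigley or Kaiser program, it omits Kaiser's row normalization, and the function name `orthomax` and the loading matrix `A` are hypothetical. Varisim is not covered.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal rotation: gamma=1.0 gives varimax, gamma=0.0 gives quartimax."""
    p, k = loadings.shape
    T = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        # Target matrix whose orthogonal polar factor gives the next rotation.
        G = loadings.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0)))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        if previous != 0.0 and s.sum() < previous * (1.0 + tol):
            break
        previous = s.sum()
    return loadings @ T, T

# Hypothetical unrotated loadings: a general factor plus a bipolar factor.
A = np.array([[0.70,  0.40],
              [0.60,  0.50],
              [0.65,  0.35],
              [0.60, -0.40],
              [0.70, -0.45],
              [0.55, -0.50]])

varimax_loadings, T_v = orthomax(A, gamma=1.0)
quartimax_loadings, T_q = orthomax(A, gamma=0.0)

print(np.round(varimax_loadings, 2))
print(np.round(quartimax_loadings, 2))
```

Both results are rigid rotations of the same configuration, so communalities are unchanged; the two criteria differ only in how they distribute the variance across the rotated columns.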
Oblique rotational programs are now fairly numerous and exhibit variability in results comparable to that among orthogonal ones. There is one important difference: no oblique program has as yet achieved the status that varimax once had. Because of the various sources of dissatisfaction with existing programs, there is much more research activity in the area of oblique rotation than in orthogonal rotation. There is still frequent resort to visually guided rotations if the investigator is striving for an oblique structure.
Methodological summary. The investigator who wishes to find psychological meaning in his data, the one who is trying to discover the basic building blocks or causal entities in human behavior, has a difficult task. Important decisions for which there are no sound foundations must be made at several steps in the procedure. Communalities must be estimated; the estimate of the number of factors to be extracted and retained for rotations must be based upon inadequate criteria; and although subjective bias possibly resulting from hand rotations has been eliminated by rotations obtained on high-speed computers, the choice of rotational program among either oblique or orthogonal solutions may lead to quite different results.
In the absence of sound estimation methods, the criterion of replicability is typically offered as a substitute. Replicability is a very important criterion in science generally. When applied to factor analysis, however, one must be aware that seemingly parallel results may have been forced on the data, typically without intention on the part of the experimentalist to do so. For example, considerable congruence of factor patterns can be obtained from the intercorrelations of two independent sets of random normal deviates by extracting as many factors as variables and by rotating to oblique simple structure. The result will be one-to-one correspondence of the factors. The intercorrelations of the factors will differ, but even these differences will not be large, since they are randomly distributed about zero.
Methods of assessing the congruence of factor patterns also leave something to be desired. The most common method by far is that of visual inspection and unaided judgment. The most precise, the correlation between two estimated factor scores in the same sample, is rarely seen. Claimed replication of a factor is frequently without adequate foundation.
Early general factor interpretations
Mental energy. Spearman did not have available any of the above-described techniques for the factor analysis of relationships among variables. Neither did he have access to the multitude of tests now available. He hypothesized that one general factor was sufficient to account for the intercorrelations among his variables, and he developed relatively simple methods to test this hypothesis. (Present methods of multiple factor analysis include Spearman’s single factor as a special case.) In psychological interpretation Spearman is of interest, however, because he interpreted his single factor as “mental energy.” This was considered the sole basis or building block of mental ability or intelligence. [see SPEARMAN.]
Multiple bonds. Spearman’s interpretation was challenged by Godfrey Thomson (1919) and by Edward Thorndike (Thorndike et al. 1926). Thomson proved that correlational matrices having the form required to satisfy Spearman’s one-factor interpretation could also be “explained” by the presence of many overlapping elements. Thorndike discussed connections (bonds) between stimuli and responses as an alternative to Spearman’s mental energy concept. Considering that there are many thousands of stimuli to which a person will respond differentially, and that tests sample these, the extent to which there is overlap in the elements sampled by two measures determines the degree to which they are correlated. If the intercorrelations of several measures have the formal properties necessary for Spearman’s unitary mental energy explanation (one factor), they also can be explained by multiple bonds or overlapping elements (multiplicity of factors). [see THORNDIKE.]
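The equivalence of the two explanations can be made concrete with a deliberately simplified simulation of overlapping elements. It is a sketch in the spirit of Thomson's argument and not his proof: the pool size, the sampling rates, and the assumption that bonds are independent and equally weighted are all choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_bonds = 20000                                        # hypothetical pool of bonds
sampling_rates = np.array([0.8, 0.6, 0.5, 0.3, 0.2])   # share of bonds each test samples

# Each row marks which bonds a given test happens to sample.
masks = rng.random((sampling_rates.size, n_bonds)) < sampling_rates[:, None]
m = masks.astype(float)

# With independent, equally weighted bonds, a test score is the sum of its bonds, so the
# population correlation of two tests is overlap / sqrt(size_i * size_j).
overlap = m @ m.T
sizes = np.diag(overlap)
R_bonds = overlap / np.sqrt(np.outer(sizes, sizes))

# A single general factor with loadings sqrt(sampling rate) predicts nearly the same matrix.
g_loadings = np.sqrt(sampling_rates)
R_one_factor = np.outer(g_loadings, g_loadings)
np.fill_diagonal(R_one_factor, 1.0)

max_gap = np.abs(R_bonds - R_one_factor).max()
print("largest discrepancy between the two accounts:", round(max_gap, 4))
```

No general factor was built into the simulation, yet the resulting correlations are almost exactly those a one-factor model would produce, so the form of the matrix cannot decide between the two interpretations.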
Unitary mental energy is a basic building block, a general influence or “cause”; multiple bonds are a complex of stimulus-response connections that are acquired in a dynamic, complex physical and social environment. Multiple bonds that underlie the behavior under observation cannot be said to cause that behavior in the same sense that mental energy is said to cause intellectual performance.
Recourse to parsimony in this instance is not an acceptable solution, since the two explanations are so different. It should come as no surprise, for example, to learn that Spearman and his followers have stressed genetic bases for intelligence, while the multiple bonds notion lends itself most readily to a stress on environmental forces and learning.
Multiple factors
Thurstone’s primary mental abilities. Although Thurstone (1938) is considered to have broken with Spearman, the break was related only to the number of factors required to account for intelligence. Thurstone considered that some seven to nine factors were sufficient to account for the intercorrelations of the more than fifty tests he used. However, Spearman himself had come to doubt the single-factor explanation; the break was more apparent than real. On the issue of what lies behind factors there was no break. Careful reading of Thurstone’s writings makes it quite clear that to him factors were much more than descriptive devices. Factors were functional unities; their ubiquity strongly suggested genetic determiners; after all, they were called primary mental abilities. [see THURSTONE.]
Ferguson’s learning emphasis. However, just as a single factor can be replaced by multiple overlapping bonds, so also can multiple group factors be replaced by sets of overlapping bonds. One need only assume that environmental pressures and learning come in somewhat separate “chunks.” Demographic differences, e.g., parental occupation, region of the country, rural–urban differences, etc., could account for some of the “chunking” required. Ferguson (1956) has produced a very satisfactory explanation along these lines in which learning and transfer are important variables. Various kinds of learning are facilitated or inhibited by the variety of environments in which children develop. Learning transfers, both positively and negatively, to novel situations. The amount and direction of the transfer are determined by stimulus and environmental similarities. Learning and transfer, along with environmental differences, produce the clustering of measures on which the factors depend. [see LEARNING, article on TRANSFER.]
Physical analogies. Thurstone (1947), in order to convince himself and others that factors were “real,” constructed a factor problem that has attracted a good deal of attention. He showed that if dimensions of boxes were factored, the result was a three-factor solution which could be rotated into a position such that the factors represented the dimensions of length, breadth, and depth, the three basic dimensions of Euclidean space. He also showed that these factors were correlated, i.e., an oblique solution gave a better fit to the data than did an orthogonal one. The obliquity reflects, of course, the fact that the dimensions of man-made boxes tend to be correlated, i.e., long boxes tend to be big boxes.
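A toy version of the box problem shows why three factors emerge and why they are correlated. Everything specific here is hypothetical: the simulated boxes, the shared "size" component, and the nine measurements, which are simple linear combinations chosen for illustration and are not Thurstone's actual variables.

```python
import numpy as np

rng = np.random.default_rng(3)
n_boxes = 300

# Hypothetical boxes: a shared "size" component makes length, breadth, and depth correlate.
size = rng.normal(0.0, 1.0, n_boxes)
dims = np.column_stack([
    10.0 + 2.0 * size + rng.normal(0.0, 1.5, n_boxes),   # length
     6.0 + 1.5 * size + rng.normal(0.0, 1.0, n_boxes),   # breadth
     4.0 + 1.0 * size + rng.normal(0.0, 0.8, n_boxes),   # depth
])

# Nine illustrative measurements, each a linear combination of the three dimensions.
W = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [2, 2, 0], [2, 0, 2], [0, 2, 2],
              [1, 1, 1], [3, 1, 0], [0, 1, 3]], dtype=float)
measures = dims @ W.T

R = np.corrcoef(measures, rowvar=False)
roots = np.sort(np.linalg.eigvalsh(R))[::-1]   # principal roots, descending
dim_corr = np.corrcoef(dims, rowvar=False)     # correlations among the true dimensions

print("principal roots:", np.round(roots, 3))
print("correlations among length, breadth, depth:")
print(np.round(dim_corr, 2))
```

Only three roots are nonzero because nine measurements were generated from three dimensions, and the positive correlations among those dimensions are what an oblique solution is able to represent and an orthogonal one is not.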
In a situation more relevant to behavior Cattell and Dickman (1962) have demonstrated that the intercorrelations of the performance of balls in several “tests” yield four factors that can be identified as size, weight, elasticity, and string length. It is clear from this and the preceding example that factor analysis can sometimes identify known physical factors in data.
One question about these examples is the certitude with which the factors can be identified after rotation, granting that the correct number of factors can be obtained by present methods. Thurstone suggested the criteria of simple structure for the adequacy of rotations. Generally speaking, simple structure is achieved when the number of zero loadings in a factor table has been maximized while increasing the magnitude of loadings on a small number of variables. The application of these criteria to the examples described resulted in clear-cut identification of the three and four factors. It has been shown by Overall (1964), however, that if Thurstone had started with a different set of measurements, the criteria of simple structure for rotations would have led to differently defined factors, i.e., they would not have been the “pure” physical dimensions but would have represented complex combinations of those dimensions.
A more basic question is whether psychological data are similar to physical data, i.e., whether psychological dimensions obtained by factoring are similar to physical dimensions. The demonstration that three or four physical factors, as the case may be, can be recovered from correlational data does not prove that factors in psychological data have a similar functional unity. Not only is Thomson’s alternative explanation theoretically acceptable for multiple factors, but it makes good psychological sense as well. Psychological tests measure performance on each of a series of items. These performances make up the total score. Although Thomson would not have suggested a one-to-one correspondence between item and element or stimulus-response bond, one can conclude that there are at least as many elements represented in a test as there are items. Thus the multiple bonds approach fits the actual measurement situation so well that the adherents of the other point of view must bear the burden of proof–and for psychological, not physical, data.
Guilford’s structure of intellect. The work of J. P. Guilford has been most influential in the factor analysis of human abilities (e.g., 1956). It has increased by ten times the small number of primary mental abilities proposed by Thurstone, but the approach to their interpretation remains much the same. Guilford’s thinking about the nature of factors is modeled very closely after the periodic table of the chemical elements; he has in fact proposed a structure which points out missing factors and has proceeded in his own empirical work to “discover” many of these.
In spite of similarities in thinking about the nature of factors, the discrepancy in numbers between Guilford and Thurstone is highly significant, and it illustrates a basic difficulty with psychological tests and the attempts to find causal entities from the analysis of their intercorrelations. Not only do psychological tests measure performance on a relatively large number of pass-fail items, but there is at present no necessary or sufficient methodological or theoretical basis for deciding which items should be added together to make up a single test score (Humphreys 1962). The number of factors has proliferated in Guilford’s work because he has produced large numbers of homogeneous experimental tests. By additional test construction, making each test more and more homogeneous, the number of factors could be increased still further. As a matter of fact, there is no agreed-upon stopping place short of the individual test item, i.e., a single item represents the maximum amount of homogeneity. This logic results in the same number of primary mental abilities as there are ability test items.
The progression from Thurstone to Guilford can be interpreted as further evidence for the multiple bonds theoretical approach. On the other hand, positing a functional unity inside the organism for each item represents a scientific dead end.
Cattell’s structure of personality. The work of R. B. Cattell has been most influential in the factor analysis of the domain of personality (e.g., 1957). Cattell’s thinking about the character of factors does not differ materially from that of Spearman, Thurstone, and Guilford in that for Cattell, factors are real influences.
The number of identified personality factors has increased considerably under Cattell’s direction. Although measurement problems differ, Cattell’s work parallels that of Guilford with human abilities. Self-report questionnaires present the multiple items problem with yes-no scoring of items. Personality investigators also have the problem of deciding which items should be added together in any given score. A great deal of additional work, however, has been done with rating scales and with so-called objective tests of personality. “Density” of sampling of the test or rating domain, a concept introduced by Cattell, is still involved in the proliferation of factors, even though the mechanism is not that of item selection. Thus, in obtaining ratings, one must decide on the number and overlap in meaning of traits to be rated. One must decide whether to include both extroversion and sociability or, even closer, both ascendance and dominance. While there is no rigorous method to depend on in the sampling of measures, decisions about what will be tested still affect the number of factors and their importance.
Furthermore, it is also typical of many experimental designs that large numbers of variables relative to the number of observations are analyzed; that many of these variables have low reliability and thus low communality; that many factors are retained for rotational purposes; and that rotations are made to an oblique structure. All of these elements contribute to possible capitalization on chance.
It is of interest that Cattell uses as a primary rotational criterion a count of the number of variables in the hyperplane, i.e., the multidimensional plane defined by all factors other than the one in question. (More simply, a measure having a zero loading on a factor is located geometrically someplace in the factor’s hyperplane.) This criterion places a premium on the extraction of a large number of factors relative to the number of measures, on the use of variables of low reliability, and on the use of variables unrelated to the major purpose of the analysis. In the opinion of many critics Cattell has increased the probability of making Type I errors beyond tolerable bounds, although neither he nor his critics can assign a value to alpha in this situation.
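A hyperplane count is simple to compute, and doing so shows why the criterion rewards weak or irrelevant variables. The sketch below is illustrative only: the loading matrix is hypothetical, and the hyperplane width of plus or minus .10 is an assumed convention that is not stated in the text.

```python
import numpy as np

# Hypothetical rotated loadings for six variables on three factors.
# The last variable has negligible loadings everywhere (low communality).
loadings = np.array([[ 0.72,  0.05, -0.03],
                     [ 0.65, -0.08,  0.12],
                     [ 0.04,  0.70,  0.02],
                     [-0.06,  0.61,  0.31],
                     [ 0.09,  0.03,  0.58],
                     [ 0.02, -0.01,  0.07]])

def hyperplane_count(L, width=0.10):
    """Number of variables whose loading on each factor lies within +/- width of zero."""
    return (np.abs(L) <= width).sum(axis=0)

counts = hyperplane_count(loadings)                 # all six variables
counts_without_last = hyperplane_count(loadings[:5])  # dropping the weak variable

print("hyperplane counts:", counts, "total:", counts.sum())
print("without the weak variable:", counts_without_last)
```

The weak sixth variable falls in every hyperplane and so raises the count for each factor, even though it contributes nothing to the definition of any of them.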
A dramatic example of the difficulties that may be involved in typical factor analytic research is given by some data described by Horn (1967). He obtained a good fit to an oblique factor pattern derived from an analysis of ability and personality variables by factoring the intercorrelations of the same number of random normal deviates, based upon the same number of observations, as the psychological variables. This finding highlights the principle that replication of findings may be of little import in factor analytic investigations.
It is also apparent that the essential reason for factor analyzing intercorrelations, to seek some reduction or simplification of data, has not been realized. The number of variables and the number of factors have grown astronomically, and the end is not yet in sight. It is highly possible that the search for psychological meaning, the search for the basic building blocks or elements, has been responsible. If psychological data are different from physical data in important respects, and if the multiple bonds are a more accurate representation of the data than the chemical elements point of view, researchers would profit from taking another look at the reasons why they factor analyze. An economical description of complex data is itself an important scientific goal.
LLOYD G. HUMPHREYS
[Directly related are the entries CLUSTERING; MULTIVARIATE ANALYSIS; TRAITS. Other relevant material may be found in INTELLIGENCE AND INTELLIGENCE TESTING; PSYCHOLOGY, article on CONSTITUTIONAL PSYCHOLOGY; and in the biographies of SPEARMAN and THORNDIKE.]
BIBLIOGRAPHY
CATTELL, RAYMOND B. 1957 Personality and Motivation Structure and Measurement. New York: World.
CATTELL, RAYMOND B.; and DICKMAN, KERN 1962 A Dynamic Model of Physical Influences Demonstrating the Necessity of Oblique Simple Structure. Psychological Bulletin 59:389–400.
FERGUSON, GEORGE A. 1956 On Transfer and the Abilities of Man. Canadian Journal of Psychology 10:121–131.
GUILFORD, J. P. 1956 The Structure of Intellect. Psychological Bulletin 53:267–293.
GUTTMAN, LOUIS 1954 Some Necessary Conditions for Common Factor Analysis. Psychometrika 19:149–161.
HORN, JOHN 1967 On Subjectivity in Factor Analysis. Unpublished manuscript.
HUMPHREYS, LLOYD G. 1962 The Organization of Human Abilities. American Psychologist 17:475–483.
KAISER, HENRY F. 1958 The Varimax Criterion for Analytic Rotation in Factor Analysis. Psychometrika 23:187–200.
KAISER, HENRY F.; and CAFFREY, JOHN 1965 Alpha Factor Analysis. Psychometrika 30:1–14.
NEUHAUS, JACK O.; and WRIGLEY, CHARLES 1954 The Quartimax Method: An Analytical Approach to Orthogonal Simple Structure. British Journal of Statistical Psychology 7:81–91.
OVERALL, JOHN E. 1964 Note on the Scientific Status of Factors. Psychological Bulletin 61:270–276.
SCHONEMANN, P. H. 1964 A Solution of the Orthogonal Procrustes Problem With Applications to Orthogonal and Oblique Rotation. Ph.D. dissertation, Univ. of Illinois.
SPEARMAN, CHARLES 1904 “General Intelligence” Objectively Determined and Measured. American Journal of Psychology 15:201–293.
THOMSON, GODFREY H. 1919 The Proof or Disproof of the Existence of General Ability. British Journal of Psychology 9:321–336.
THORNDIKE, EDWARD L. et al. 1926 The Measurement of Intelligence. New York: Columbia Univ., Teachers College.
THURSTONE, LOUIS L. 1938 Primary Mental Abilities. Univ. of Chicago Press.
THURSTONE, LOUIS L. 1947 Multiple-factor Analysis. Univ. of Chicago Press. → A development and expansion of Thurstone’s The Vectors of Mind, 1935.
Cite this article
"Factor Analysis." International Encyclopedia of the Social Sciences. . Encyclopedia.com. 10 Dec. 2017 <http://www.encyclopedia.com>.
"Factor Analysis." International Encyclopedia of the Social Sciences. . Encyclopedia.com. (December 10, 2017). http://www.encyclopedia.com/socialsciences/appliedandsocialsciencesmagazines/factoranalysis
"Factor Analysis." International Encyclopedia of the Social Sciences. . Retrieved December 10, 2017 from Encyclopedia.com: http://www.encyclopedia.com/socialsciences/appliedandsocialsciencesmagazines/factoranalysis
Factor Analysis
Factor analysis is usually adopted in social scientific studies for the purposes of: (1) reducing the number of variables; and (2) detecting structure in the relationships between variables. The first is often referred to as common factor analysis, whereas the second is known as component analysis, when both are applied as statistical techniques. While factor analysis expresses the underlying common factors for an entire group of variables, it also helps researchers differentiate these factors by grouping variables into different dimensions or factors, each of which is ideally uncorrelated with the others.
A major breakthrough in attitude measurement came with the development of factor analysis (1931) by psychologist L. L. Thurstone (1887–1955). Thurstone introduced multiple-factors theory, which identified seven distinct and primary mental abilities: verbal comprehension, word fluency, number facility, spatial visualization, associative memory, perceptual speed, and reasoning. This theory differed from the more general, less separated theories of intelligence that were prevalent at the time and was among the first to show that human beings can be intelligent in different areas. The concept of multiple factors slowly received validation from empirical studies and gradually replaced the unidimensional factor in social research.
In social science studies, researchers often face a large number of variables. Although it is a good idea for scientists to exhaust all the relevant variables in their research to provide thorough responses to research questions, such an approach makes a theory too complex to generalize to empirical applications. For example, a researcher may want to explain delinquent behaviors by exploring all relevant independent variables, such as illegal drug use, harsh parenting, school dropout, school failure, single-parent household, gang affiliation, parent-child bonding, smoking, alcohol use, and many other variables. With so many independent variables, it is difficult to provide a simplified model for a parsimonious explanation of delinquent behavior. A good theoretical explanation should balance completeness and parsimony in its coverage of variables. Factor analysis reduces the number of variables to a smaller set of factors that facilitates our understanding of the social problem. It does so by determining the “common factors” of these independent variables. Each of these common factors should be the best representative of certain independent variables, and every factor should be, theoretically, independent from the other factors. Researchers substitute these factors for the variables because the factors can explain a similar degree of variance in the dependent variable but are simpler in terms of the number of independent variables. In most cases, factors found in an analysis may not provide a complete description of the relevant independent variables, but these factors should be the most important factors, the best way of summarizing a body of data.
Factor analysis may use either correlations or covariances. The covariance cov_{ab} between two variables, a and b, is their correlation times their two standard deviations: cov_{ab} = r_{ab} s_{a} s_{b}, where r_{ab} is their correlation and s_{a} and s_{b} are their standard deviations. Any variable’s covariance with itself is its variance—the square of its standard deviation. A correlation matrix can be thought of as a matrix of variances and covariances of a set of variables that have already been adjusted to a standard deviation of 1. Since a correlation or covariance matrix can be translated to one another easily, in many statistical books, authors may use either a correlation or covariance matrix or both to illustrate how factor scores are obtained.
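The translation between a correlation matrix and a covariance matrix is a one-line computation in each direction. The standard deviations and correlations below are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical standard deviations and correlation matrix for three variables.
s = np.array([2.0, 5.0, 0.5])
R = np.array([[1.0, 0.3, 0.6],
              [0.3, 1.0, 0.2],
              [0.6, 0.2, 1.0]])

# cov_ab = r_ab * s_a * s_b, applied to every pair at once.
cov = R * np.outer(s, s)

# Going back: divide each covariance by the product of the two standard deviations.
s_recovered = np.sqrt(np.diag(cov))
R_recovered = cov / np.outer(s_recovered, s_recovered)

print(cov)
print(R_recovered)
```

For the first two variables the covariance is 0.3 × 2.0 × 5.0 = 3.0, and each diagonal entry of `cov` is the corresponding variance, the square of the standard deviation.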
The central theorem of factor analysis, in mathematical terms, is that we can partition a covariance matrix M into a common portion C that is explained by a set of factors, and a unique portion R that is unexplained by those factors. In matrix language, M = C + R, which means that each entry in matrix M is the sum of the corresponding entries in matrices C and R. The explained C can be further broken down into component matrices C_{1}, C_{2}, C_{3}, …, C_{x}, each explained by an individual factor. Each of these one-factor components C_{x} equals the “outer product” of a column of factor loadings with itself. A statistical program may rank several matrices C_{x} if it finds that there is more than one matrix with eigenvalues greater than 1. An eigenvalue is defined as the amount of variance explained by one more factor. Since a component analysis is adopted to summarize a set of data, it would not be meaningful to find another factor that explains less variance than is contained in one variable (eigenvalues of less than 1). Therefore, statistical programs often default to this rule when selecting factors.
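The partition M = C + R and the eigenvalue-greater-than-1 rule can be traced through a small example. This is a sketch using a principal component decomposition of a correlation matrix; the matrix `M` is hypothetical, built so that two pairs of variables clearly cluster.

```python
import numpy as np

# Hypothetical correlation matrix: variables 1-2 cluster, and variables 3-4 cluster.
M = np.array([[1.0, 0.6, 0.1, 0.1],
              [0.6, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.5],
              [0.1, 0.1, 0.5, 1.0]])

eigvals, eigvecs = np.linalg.eigh(M)
order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Column k of the loading matrix is eigenvector k scaled by sqrt(eigenvalue k).
loadings = eigvecs * np.sqrt(eigvals)

# One-factor components: C_k is the outer product of loading column k with itself.
components = [np.outer(loadings[:, k], loadings[:, k]) for k in range(M.shape[0])]

n_retained = int((eigvals > 1.0).sum())           # eigenvalue-greater-than-1 rule
C = sum(components[:n_retained])                  # explained (common) portion
R = M - C                                         # unexplained (unique) portion

print("eigenvalues:", np.round(eigvals, 3))
print("factors retained:", n_retained)
```

The eigenvalues sum to the number of variables, so an eigenvalue below 1 marks a factor that accounts for less variance than a single standardized variable; here two factors pass the rule and the remaining two are left in R.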
Principal component analysis is commonly used in statistics for factor analysis and was introduced to achieve representation or summarization. It attempts to reduce p variables to a set of m linear functions of those variables that best describe and summarize the original p. Some conditions need to be satisfied to have a set of m factors for the purpose of factor analysis. First, the m factors must be mutually uncorrelated. Second, any set of m factors should include the functions of a smaller set. Third, the squared weights defining each linear function must sum to 1, denoting the total variance explained. By using all p, we get a perfect reconstruction of the original X scores, while by using the first m (with the greatest eigenvalues), we get the best reconstruction possible for that value of m and the most simplified model for interpretation.
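The three conditions and the reconstruction property can be checked on simulated data. In this sketch the data-generating setup (two hidden sources, the mixing weights, the noise level) is entirely hypothetical; only the component computations correspond to the description above.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, m = 200, 5, 2

# Hypothetical data: two underlying sources plus noise, then centered.
sources = rng.standard_normal((n, 2))
mixing = rng.standard_normal((2, p))
X = sources @ mixing + 0.3 * rng.standard_normal((n, p))
X = X - X.mean(axis=0)

# Principal component weights are the eigenvectors of the covariance matrix.
cov = np.cov(X, rowvar=False)
eigvals, W = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]                # greatest eigenvalue first
eigvals, W = eigvals[order], W[:, order]

scores = X @ W                                   # component scores, mutually uncorrelated
X_full = scores @ W.T                            # all p components: exact reconstruction
X_m = scores[:, :m] @ W[:, :m].T                 # first m components: best rank-m fit

squared_error_m = np.sum((X - X_m) ** 2)
squared_error_last = np.sum((X - scores[:, -m:] @ W[:, -m:].T) ** 2)

print("squared weights sum to:", np.round((W ** 2).sum(axis=0), 6))
print("error with first m components:", round(squared_error_m, 2))
print("error with last m components: ", round(squared_error_last, 2))
```

The first m components form a subset of the first m + 1, the squared weights of each component sum to 1, and the reconstruction error left by the first m components equals the variance carried by the discarded ones.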
Statistical programs allow researchers to select how many factors will be chosen. Ideally, we want to identify a certain number of factors that would explain or represent all the relevant variables. However, the use of factor analysis is not just to find all the statistically “significant” factors; rather, those factors identified should be meaningful to the researchers and interpreted subjectively by them. If the factors generated are meaningless in terms of the compositions of variables, such a factor analysis is not useful. In general, researchers may use exploratory factor analysis to find statistically significant factors (eigenvalues > 1) if they do not have prior knowledge of what factors may be generated from a number of variables. Therefore, it is very common that two different researchers would have two sets of factors even though they used an identical dataset. It is not about who is right or wrong, but whether researchers can adopt a group of factors that lead to better interpretation of the data. If researchers have prior knowledge (e.g., theories) of those factors, they can limit the number of factors to be generated in statistical programs rather than allowing statistical programs to generate them. In other words, researchers determine if the proposed variables are grouped into factors as suggested by the theory.
Researchers may rotate the factor-loading matrix to simplify the structure of a factor solution. Consider a set of p multiple regressions, one for each of the p observed variables, wherein each regression predicts one of the variables from all m factors. The standardized coefficients in this set of regressions form a p × m matrix called the factor-loading matrix. We may replace the original factors with a set of linear functions of those factors that yields the same predictions as before, but with a different factor-loading matrix. In practice, the rotation is chosen so that the rotated matrix has a simpler structure that better serves researchers’ subjective interpretations.
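The claim that a rotation changes the loadings without changing the predictions can be illustrated with a small sketch, assuming NumPy, an orthogonal rotation, and a made-up loading matrix. The numbers and the 40-degree angle are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical p x m loading matrix (p = 4 variables, m = 2 factors)
L = np.array([[0.7,  0.5],
              [0.6,  0.6],
              [0.6, -0.4],
              [0.7, -0.5]])

theta = np.deg2rad(40.0)                          # arbitrary angle (assumption)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal matrix: T @ T.T = I

L_rot = L @ T                                     # loadings on the rotated factors

# The part of the correlations reproduced by the factors is unchanged,
# and so is the share of each variable's variance explained (communality).
reproduced, reproduced_rot = L @ L.T, L_rot @ L_rot.T
communality, communality_rot = (L ** 2).sum(axis=1), (L_rot ** 2).sum(axis=1)

print("rotated loadings:\n", L_rot.round(3))
print("same reproduced correlations:", np.allclose(reproduced, reproduced_rot))
print("same communalities:", np.allclose(communality, communality_rot))
```

With these particular numbers, every variable loads on both unrotated factors, whereas after rotation the first two variables load mainly on the first factor and the last two mainly on the second, which is the kind of simpler structure a rotation aims for.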
SEE ALSO Covariance; Eigen-Values and Eigen-Vectors, Perron-Frobenius Theorem: Economic Applications; Methods, Quantitative; Models and Modeling; Regression Analysis; Statistics
BIBLIOGRAPHY
Thurstone, L. L. 1931. Measurement of Social Attitudes. Journal of Abnormal and Social Psychology 26 (3): 249–269.
Cheng-Hsien Lin
"Factor Analysis." International Encyclopedia of the Social Sciences. . Encyclopedia.com. 10 Dec. 2017 <http://www.encyclopedia.com>.
factor analysis
factor analysis A family of statistical techniques for exploring data, generally used to simplify the procedures of analysis, mainly by examining the internal structure of a set of variables in order to identify any underlying constructs. The most common version is so-called principal component factor analysis.
In survey data, it is often the case that attitudinal, cognitive, or evaluative characteristics go together. For example, respondents who are in favour of capital punishment may also be opposed to equality of opportunity for racial minorities, opposed to abortion, and may favour the outlawing of trade unions and the right to strike, so that these items are all intercorrelated. Similarly, we might expect that those who endorse these (in the British context) right-wing political values may also support right-wing economic values, such as the privatization of all state-owned utilities, reduction of welfare state benefits, and suspension of minimum-wage legislation. Where these characteristics do go together, they are said either to be a factor, or to load on to an underlying factor, in this case what one might call the factor ‘authoritarian conservatism’.
Factor analysis techniques are available in a variety of statistical packages and can be used for a number of different purposes. For example, one common use is to assess the ‘factorial validity’ of the various questions comprising a scale, by establishing whether or not the items are measuring the same concept or variable. Confronted by data from a battery of questions all asking about different aspects of (say) satisfaction with the government, it may be that individual items dealing with particular economic, political, and social policies, the government's degree of trustworthiness, and the respondent's satisfaction with the President are not related, which suggests that these different aspects are seen as conceptually distinct by interviewees. Similarly, for any given set of variables, factor analysis can determine the extent to which these can be reduced to a smaller set in order to simplify the analysis, without losing any of the underlying concepts or variables being measured. Alternatively, researchers may ask respondents to describe the characteristics of a social attribute or person (such as ‘class consciousness’ or ‘mugger’), and factor-analyse the adjectives applied to see how the various characteristics are grouped.
All these uses are ‘exploratory’, in the sense that they attempt to determine which variables are related to which, without in any sense testing or fitting a particular model. Consequently, as is often the case in this kind of analysis, researchers may have difficulty interpreting the underlying factors on to which the different groups of variables load. Some marvellously imaginative labels have been devised by sociologists who have detected apparent underlying factors but have no clear idea of what these higher-order abstractions might be. Less frequently, however, a ‘confirmatory’ factor analysis is undertaken. Here, the researcher anticipates that a number of items measuring (say) ‘job satisfaction’ all form one factor, and this proposition is then tested by comparing the actual results with a solution in which the factor loading is perfect.
Alternative criteria exist for determining the best method for doing the analysis, the number of factors to be retained, and the extent to which the computer should ‘rotate’ factors to make them easier to interpret. An ‘orthogonal’ rotation yields factors which are unrelated to each other whereas an ‘oblique’ rotation allows the factors themselves to be correlated; and, as might be expected, there is some controversy about which procedure is the more plausible in any analysis. Although there are conventions about the extent to which variables should correlate before any are omitted from a factor, and the amount of variance (see VARIATION) to be explained by a factor before it may be ignored as insignificant, these too are matters of some debate. The general rule of thumb is that there should be at least three variables per factor, for meaningful interpretation, and that factors with an ‘eigenvalue’ of less than one should be discarded. (An eigenvalue expresses the variance accounted for by a factor in units of the variance of a single standardized variable, and is thus a standardized measure which allows researchers to eliminate those factors that account for less of the variance than the average variable.) However, even when a factor has an eigenvalue greater than 1, there is little to be gained by retaining it unless it can be interpreted and is substantively meaningful. At that point, statistical analysis ceases, and sociological theory and imagination take over. Moreover, the correlation matrix which is produced for the variables in any set, and which yields the data from which factors are extracted, requires for its calculation variables which have been measured at the interval level and have a normal distribution. The use of the technique is therefore often accompanied by disputes as to whether or not these conditions have been met. For a useful introduction by a sociologist see Duane F. Alwin, ‘Factor Analysis’, in E. F. Borgatta and M. L. Borgatta (eds.), Encyclopedia of Sociology (1992). See also MEASUREMENT; PERSONALITY; SCREE TEST.
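The reasoning behind the eigenvalue rule of thumb can be written out briefly. The following is a sketch under the assumption that the factors are extracted as principal components of the full correlation matrix R of p standardized variables, with eigenvalues denoted λ_j (notation introduced here for illustration). When factors are extracted from a reduced correlation matrix with estimated communalities on the diagonal, the sum is no longer exactly p.

```latex
% Each standardized variable has variance 1, so the diagonal of R is all ones.
\operatorname{tr}(R) = \sum_{i=1}^{p} r_{ii} = p
% The trace of a symmetric matrix equals the sum of its eigenvalues.
\sum_{j=1}^{p} \lambda_j = \operatorname{tr}(R) = p
\quad\Longrightarrow\quad
\bar{\lambda} = \frac{1}{p}\sum_{j=1}^{p} \lambda_j = 1
% Share of the total variance accounted for by factor j:
\text{proportion}_j = \frac{\lambda_j}{p}
```

Under these assumptions a factor with an eigenvalue below 1 accounts for less variance than a single average variable, which is why it is conventionally discarded.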
"factor analysis." A Dictionary of Sociology. . Encyclopedia.com. 10 Dec. 2017 <http://www.encyclopedia.com>.
factor analysis
factor analysis See multivariate analysis.
"factor analysis." A Dictionary of Computing. . Encyclopedia.com. 10 Dec. 2017 <http://www.encyclopedia.com>.