IQ Debate


In 1905 two Frenchmen, Alfred Binet (1857–1911) and Théodore Simon (1873–1961), invented the IQ (Intelligence Quotient) test to distinguish between mentally retarded and normal school children. They set tasks that normal children could do; for example, five-year-olds were asked to compare two weights, copy a square, repeat a sentence of ten syllables, count four pennies, and unite the halves of a divided rectangle.

By 2005 there were thousands of tests but two have special significance. The first, Raven's Progressive Matrices, measures on-the-spot problem solving where no previously learned method is applicable. It presents a pattern of shapes from which one piece is missing, offers six alternative missing pieces, and then asks the examinee to choose the correct one (Raven 2000). The second, the Wechsler Intelligence Scale for Children (WISC), supplements Raven's by using ten to twelve subtests to measure a variety of cognitive skills. These tests constitute technologies that raise significant ethical issues.

What IQ Tests Measure

Various cognitive skills go into problem solving. One such skill is mental acuity, which involves both solving problems without a previously learned method and the active creation of alternative solutions. The WISC subtest called Similarities measures mental acuity: The subject must decide what certain things, such as dawn and dusk, have in common. Similar subtests include Block Design, Picture Concepts, and of course Matrices. Another set of subtests is quite different. Clearly, a wide range of basic knowledge and a large vocabulary enhance problem-solving ability. These are measured by the Vocabulary and Verbal Comprehension subtests and, until recently, by the Information and Arithmetic subtests that were dropped in the fourth edition of the WISC. Although there is learned content in these subtests, it is the kind of learning that intelligent people will master more easily and more thoroughly. A third kind of relevant skill is speed of information processing, which is measured by the Coding and Symbol Search subtests. Finally, that ability called memory, which allows individuals to access accumulated knowledge, is tested by the Digit Span (the number of digits a person can repeat after they are read out, and the ability to repeat them in reverse order) and the Letter-Number Sequences subtests.

Given that the WISC tests cover the cognitive skills that go into problem solving, it may seem surprising that there is so much debate about whether IQ tests measure intelligence. There are several reasons why the controversy endures.

Attitudes affect cognitive skills because people invest mental energy into problems only if they feel they are significant. Attitude shifts over time have enhanced performance on some subtests more than on others (Flynn 2003). Members of a street gang may see little point in problems that appear to lack practical significance. Many noncognitive skills, such as empathy, tact, setting people at ease, and being a good listener, also contribute to problem solving. In addition, IQ tests do not measure a host of attributes regarded as important, such as artistic and musical ability, honesty, and generosity.

Most debate about what IQ tests measure consists of endlessly repeating these points and inventing a host of intelligences, such as emotional intelligence, social intelligence, surviving-in-a-wilderness intelligence, and musical intelligence, among others (Jensen 1998). This sterile debate can perhaps be circumvented by a modest claim: IQ tests measure cognitive skills relevant to problems encountered in the mainstream of industrial societies, and they test the basic knowledge needed to function in those societies. However, there is a caveat: IQ tests cannot determine when a person scores better than others because of attitudes friendlier toward the kind of problems that are to be solved.

Uses of IQ Tests

IQ tests perform three main roles: comparing individuals for cognitive skills; comparing groups; and measuring cognitive skill trends over time, this last being a special case of comparing groups because it entails comparing one generation with another.

IQ scores give each person a percentile rank using Standard Deviations (SDs) as the link. An IQ of 100 is average for any particular age and is at the 50th percentile. An IQ of 130 is two SDs above the mean (an SD = 15) and is at the 98th percentile (only 2.3% of the subject's peers have a higher score); an IQ of 110 is 0.67 SDs above the mean and is at the 75th percentile; an IQ of 70 is two SDs below the mean and equals the 2nd percentile (only 2.3% of the subject's peers have a lower score). Certain IQ scores set the threshold for performing certain social roles. Few people with IQs below 130 will receive a Ph.D. from an academically superior university; few with IQs below 110 will enter the elite professions, that is, medicine, law, accounting, natural science, and engineering; and few with IQs below 100 will hold a professional, managerial, or technical post of any kind. Those with IQs below 70 are often regarded as being unable to cope with normal life and are labeled mentally retarded.
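The link between IQ scores, SDs, and percentile ranks can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming that scores follow a normal distribution with a mean of 100 and an SD of 15; the function name `iq_to_percentile` is hypothetical and not part of any testing manual.

```python
import math

MEAN_IQ = 100.0  # average score for any age group, as stated in the text
SD_IQ = 15.0     # one standard deviation, as stated in the text


def iq_to_percentile(iq, mean=MEAN_IQ, sd=SD_IQ):
    """Percent of peers scoring at or below `iq`.

    Assumption: scores are normally distributed around `mean` with spread `sd`.
    """
    z = (iq - mean) / sd  # distance from the mean in SD units
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for score in (70, 100, 110, 130):
        z = (score - MEAN_IQ) / SD_IQ
        pct = iq_to_percentile(score)
        print(f"IQ {score}: {z:+.2f} SD, percentile {pct:.1f}, "
              f"share scoring higher {100.0 - pct:.1f}%")
```

Under these assumptions the results should fall close to the rounded figures quoted above, for example roughly the 75th percentile for an IQ of 110.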

Race Differences

The existence of IQ thresholds for occupations generates group comparisons unfavorable to blacks. The mean IQ of white Americans is 100, while black Americans have a mean IQ of 85 or one SD below whites. The pool of potential professionals, managers, and technicians has a threshold of 100. Therefore, 50 percent of whites would qualify but only the highest scoring 16 percent of blacks (a score of 100 is at their 84th percentile). The Berkeley psychologist Arthur Jensen suggests that even if environments were equalized, blacks would still have a mean IQ of only 90 (Jensen 1973, p. 363). If he is correct, even then, only 25 percent of blacks would qualify.

Some believe scholars should not debate whether ethnic groups show genetic differences for intelligence. This moral advice will fail and should fail. Those who read Jensen will quickly find that he has an argument that must be answered, high professional standards, and no trace of racial bias. Thus the only reason not to test his hypothesis is that it would be unpleasant if it were true. In addition, if those who have offered evidence in favor of genetic equality were to opt out of the debate, Jensen's hypothesis would remain undisputed, a sort of unilateral disarmament. The debate should proceed and be conducted purely along evidential lines. The strongest evidence supporting a genetic hypothesis is the underperformance, both on IQ tests and academically, of children of the black middle and upper classes, who do fall at least 10 IQ points short of their white counterparts (Herrnstein and Murray 1994, p. 288). The strongest evidence in favor of an environmental hypothesis was obtained as the result of a historical event: the U.S. military occupation of Germany after World War II, which removed thousands of black males from the American environment. The U.S. army left behind many illegitimate children. The mean IQs of those with black fathers and those with white fathers were the same (Flynn 1999).

Whatever the causes of the IQ gap between black and white Americans, it exists. When standardized tests are used as screening devices, the lesser representation of blacks leaves the realm of theory and becomes fact. The debate as to whether affirmative action should be used to redress the balance is complex. Opponents point to cases of underprivileged whites who are rejected in favor of the child of a black professional, lower performance in key areas such as police protection, and the fact that blacks may actually suffer harm, for example, by being admitted to universities where they are doomed to fail (Herrnstein and Murray 1994).

Proponents argue that black Americans suffer from their group membership in many ways, ranging from police behavior toward them and higher consumer prices in the ghetto to discrimination in housing and employment and an unfavorable marriage market. White men very rarely marry black women. Therefore, black women are restricted to marrying black men and many are unlikely to find permanent partners, because too many black men die young, are imprisoned, or are not regularly employed. As a result, more than one-half of black children are raised in solo-mother homes, often below the poverty line (Flynn 2000, pp. 148–149). Supporters of affirmative action also contend that most efficiency gains would accrue if standardized tests were only used to disqualify those without essential skills and if job-related criteria were substituted to rank applicants above that level. They cite data showing that when blacks admitted to elite universities (for which they would not normally qualify) are matched with blacks who went to other universities, the graduation rates are similar, and that the former profit by earning higher incomes (Kane 1998).

Genes and Environment

Studies of identical twins separated at birth and raised apart show that, at adulthood, twin and co-twin are far more alike in IQ than randomly selected individuals. This appears to be because of their identical genes—and does that not mean that genes are far more potent than environment? Jensen calculated that if environment were in fact this weak, no plausible environmental difference within a society such as America could account for a one SD IQ gap—which is the gap between the IQs of blacks and whites (Jensen 1973, pp. 166–169).

In 1987 James R. Flynn, a moral philosopher at the University of Otago, challenged this reasoning with evidence showing the existence of massive IQ gains over time. For example, the Dutch gained fully 20 IQ points on Raven's Matrices from one generation to the next, that is, from 1952 to 1982, a result replicated in several nations. Since there can be little genetic upgrading in a single generation, Flynn contended that these huge gains must have been due to environment (Flynn 2003). Thus, a paradox arose that baffled the discipline for many years: How can twin studies show environment to be so weak, while IQ gains over time show environment to be so enormously potent?

In 2001 William T. Dickens, an economist at the Brookings Institution, and Flynn offered reciprocal causation as a possible solution. Imagine identical twins who were separated at birth and raised apart in a basketball-mad state such as Indiana. Their identical genes dictate that they are born both a bit taller and quicker than average. Thus, although raised in different cities, both tend to be picked for informal basketball games at school. The extra play upgrades their skill advantage and they both get picked for the school team. They then play a rigorous schedule and get professional coaching, which upgrades their skill advantage further. At adulthood, they end up with basketball skills that are remarkably similar and well above average—and their identical genes get all the credit. But that assumption is a mistake. It overlooks the fact that these identical twins also had atypically similar basketball environments—their genes are getting credit for shared factors like more practice, playing on a team, and professional coaching. The kinship studies mask the potency of environment.

Skill gains over time show the true strength of environment. In 1950 TV brought basketball into American homes and basketball put baseball into the shadows—those close-ups look so good even on the small screen. Suddenly everyone was playing basketball and skills escalated. At first, to be better than average, a player needed merely to pass and shoot well. However, the rising quality of the average performance became a powerful factor in its own right. To excel, a few people learned to shoot with both hands. Then everyone who wanted to compete had to try to do the same, which pushed the mean up further. Soon a few people learned to pass with both hands and then, everyone had to try to do that. Every rise in the average performance encouraged a further rise.

This resolves the gene-environment paradox: The key is reciprocal causation as a potent multiplier of skill differences. Within a generation, genes drive the feedback process and get credit for the environmental input, which gives the illusion of environmental weakness. Between generations, a persistent environmental factor (the rising popularity of basketball) drives the feedback process and shows how environment can produce huge skill differences between groups separated by only a few years of time.
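The multiplier idea can be illustrated with a toy feedback loop. The sketch below is not the model Dickens and Flynn published; it simply assumes that a small starting edge (genetic or environmental) is fed back through a better environment at a fixed rate, and all names and values are hypothetical.

```python
def skill_after_feedback(initial_edge, feedback, rounds=50):
    """Repeatedly apply: skill = initial_edge + feedback * previous skill.

    initial_edge: small starting advantage, in arbitrary units (hypothetical)
    feedback: share of current skill returned via a richer environment,
              assumed to lie in the range 0 <= feedback < 1
    rounds: number of times the loop is applied
    """
    skill = initial_edge
    for _ in range(rounds):
        # Better skill attracts more practice, coaching, and competition,
        # which in turn raises skill again.
        skill = initial_edge + feedback * skill
    return skill


if __name__ == "__main__":
    edge = 1.0
    for feedback in (0.0, 0.5, 0.8):
        final = skill_after_feedback(edge, feedback)
        print(f"feedback {feedback}: final skill {final:.2f}, "
              f"multiplier {final / edge:.2f}")
```

In this toy setting the final skill approaches the starting edge divided by (1 minus the feedback rate), so a modest edge with strong feedback ends up several times larger, whichever factor supplied the edge in the first place.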

New Spectacles

The concept of reciprocal causation provides spectacles that improve our perception of what may cause group IQ differences. Do blacks start with what may be a modest but significant genetic disadvantage, one that gets multiplied into a 15-point IQ deficit? Or are there persistent environmental factors that divide black and white, analogous to belonging to the pre-TV and post-TV generations? Some have attempted to identify the kind of factors that might inhibit black academic achievement and IQ test performance: that they feel threatened by intellectual competition with whites; that black males are ambivalent about intellectual success and may even strive to fall below the class mean (so blacks would have negative multipliers!); and, as has been seen, that the problems of black males affect black children, so that a majority of them are raised by solo-mothers struggling to avoid poverty.

The brute fact that average IQ scores increase over time adds a new dimension to another debate: whether IQ tests should be used to classify people as mentally retarded. IQ gains mean that subjects will get higher IQs on an out-of-date test. If someone was average when compared to the test performance of their peers today (and therefore gets an IQ of 100), they would automatically be better than average compared to their peers of 20 years ago (and therefore get an IQ well above 100). After all, the fact that the average performance was worse in the past is what constitutes IQ gains over time. There is no doubt that people have been denied special education or have been executed on death row because taking obsolete tests inflated their IQs above 70, the usual cut-off point for mental retardation (Kanaya et al. 2003). These facts strengthen the argument of those who believe in purely behavioral criteria for mental retardation: School children should be classified as such if they cannot understand the rules of games they play frequently; prisoners should be executed only if their life histories show they can cope with the usual activities of everyday life, for example, by qualifying for a driver's license.
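The arithmetic behind inflated scores on obsolete tests can be sketched as follows. The rate of gain is an assumption borrowed, purely for illustration, from the Dutch figure cited earlier (20 points in 30 years); actual rates differ by test and country, and the function name is hypothetical.

```python
def score_on_old_norms(iq_on_current_norms, years_out_of_date, gain_per_year):
    """IQ a person would receive if scored against norms set some years ago.

    Assumption: average performance rose linearly at `gain_per_year` points,
    so older norms are easier to beat by that amount for each year of age.
    """
    return iq_on_current_norms + gain_per_year * years_out_of_date


# Assumed rate: 20 points over 30 years (illustrative only).
GAIN_PER_YEAR = 20 / 30
CUTOFF = 70  # usual cut-off point mentioned in the text

if __name__ == "__main__":
    current_iq = 65  # hypothetical person scored against up-to-date norms
    for years in (0, 5, 10, 20):
        observed = score_on_old_norms(current_iq, years, GAIN_PER_YEAR)
        status = "above cut-off" if observed > CUTOFF else "at or below cut-off"
        print(f"norms {years:2d} years old: IQ {observed:.1f} ({status})")
```

Under this assumed rate, a person who scores 65 against current norms would cross the cut-off of 70 on a test normed about a decade earlier, which is the mechanism the paragraph above describes.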

Are IQ Gains Real?

The United States and other nations have been making massive IQ gains for at least as far back as the 1930s. Are these really intelligence gains? The answer is that they are piecemeal cognitive skill gains that affect the real world—but they are not gains in terms of the kind of general intelligence IQ tests are designed to measure.

When an IQ test measures individuals competing with one another, certain people tend to do better than average on all or most of the WISC subtests—which is to say part of what is being measured is a better functioning brain that gives someone an advantage for most cognitive skills. Society does not upgrade average brain quality from one generation to another because it does not run radical experiments in selective breeding. What it does do is manipulate environmental factors that have a differential effect on various cognitive skills. If Americans fill more leisure time with cognitively demanding games, and fill more professional positions in which they must make decisions rather than simply following rules, scores on the Similarities subtest should rise—and they have enormously. If efforts to improve reading in the United States have not made people love books, and if visual entertainment of a largely escapist sort tempts people away from books, one would not expect better ability to read serious literature, or bigger non-specialized vocabularies, or the command of more general information—and the relevant WISC subtests show that this is indeed the case (Flynn 2003).

In sum, IQ tests are good tools for comparing the cognitive skills of individuals and alerting researchers to group differences. However, finding causes and solutions for those differences involves the totality of social science. The general intelligence factor that IQ tests are designed to measure may indicate which mind competes best with other minds at a certain time and place. But it is a crude measure of what society is doing to a wide variety of cognitive skills over time. We must free our minds of it and look at trends on the various WISC subtests. They reveal the intellectual history of these times.


SEE ALSO Emotional Intelligence; Eugenics; Race.


Deary, Ian J. (2001). Intelligence: A Very Short Introduction. New York: Oxford University Press. A good introduction to IQ tests and their significance.

Dickens, William T., and James R. Flynn. (2001). "Great Leap Forward." New Scientist 170(2287): 44–47. Spells out the concept of multipliers and how they clarify the roles of genes and environment.

Flynn, James R. (1999). "Searching for Justice: The Discovery of IQ Gains Over Time." American Psychologist 54: 5–20. Discusses the social issues that IQ tests raise.

Flynn, James R. (2000). How to Defend Humane Ideals: Substitutes for Objectivity. Lincoln: University of Nebraska Press. Makes a case for affirmative action as necessary even in the absence of racism.

Flynn, James R. (2003). "Movies about Intelligence: The Limitations of g." Current Directions in Psychological Science 12: 95–99. Argues that IQ gains represent real gains in certain cognitive skills even if they are not general intelligence gains.

Herrnstein, Richard J., and Charles Murray. (1994). The Bell Curve: Intelligence and Class in American Life. New York: Free Press. A case that social progress is producing an underclass without the genetic potential to contribute to society; for a rebuttal, see Flynn (2000).

Jensen, Arthur R. (1973). Educability and Group Differences. New York: Harper and Row. A case that blacks on average have a lower potential for intelligence than whites; for a rebuttal, see Flynn (1999).

Jensen, Arthur R. (1998). The g Factor: The Science of Mental Ability. Westport, CT: Praeger. A case that the concept of general intelligence, g, is central to the study of human cognition—for reservations, see Flynn (2003).

Kanaya, Tomoe; Matthew H. Scullin; and Stephen J. Ceci. (2003). "The Flynn Effect and U.S. Policies: The Impact of Rising IQ Scores on American Society via Mental Retardation Diagnoses." American Psychologist 58: 778–790. Spells out how IQ gains over time have made a lottery out of classifying people as mentally retarded.

Kane, Thomas J. (1998). "Racial and Ethnic Preferences in College Admissions." In The Black-White Test Score Gap, eds. Christopher Jencks and Meredith Phillips. Washington, DC: Brookings Institution Press. Presents data that show that blacks profit from affirmative action at the university level.

Raven, John. (2000). "The Raven's Progressive Matrices: Change and Stability over Culture and Time." Cognitive Psychology 41: 1–48. The "architect" of Raven's Progressive Matrices describes its role in measuring cognitive trends throughout the world.