Treatments of modern measures of intelligence often begin with a discussion of the French psychologist Alfred Binet (1857–1911). In 1905, Binet initiated the applied mental measurement movement when he introduced the first intelligence test. In response to a turn-of-the-century law in France requiring that children of subnormal mental ability be placed in special programs (rather than be expelled from school), Binet was called upon to design a test that could identify these children. Binet's first test consisted of thirty items, most of which required some degree of comprehension and reasoning. For example, one task required children to take sentences in which words were missing and supply the missing words that made sense in context (such sentence-completion tasks are still used widely). Binet grouped his test items such that the typical child of a given age group was able to answer fifty percent of the questions correctly. Individuals of similar chronological age (CA) varied widely in their scale scores, or mental age (MA). The ratio of MA to CA determined one's level of mental development; this ratio was later multiplied by 100 to calculate what is now known as the intelligence quotient (IQ).
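The ratio described above is simple arithmetic, and a minimal Python sketch can make it concrete. The function name `ratio_iq` and the sample ages are hypothetical, chosen only for illustration; they are not drawn from Binet's data.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age (MA) divided by chronological age (CA), times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * mental_age / chronological_age


# Hypothetical children, all aged 10, with different mental ages.
for ma in (8, 10, 12):
    print(f"MA={ma}, CA=10 -> IQ={ratio_iq(ma, 10):.0f}")
```

A child whose mental age equals his or her chronological age scores exactly 100 under this scheme, whatever the age.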
Binet's approach was successful: children's scores on his test forecasted teacher ratings and school performance. While Binet was developing this first test of general intellectual functioning, the English psychologist Charles Spearman (1863–1945) was conducting research to identify the dominant dimension responsible for the validity of the test's predictions.
The Hierarchical Organization of Mental Abilities
Spearman was the first to propose and offer tangible support for the idea that a psychologically cohesive dimension of general intelligence, g, underlies performance on any set of items demanding mental effort. Spearman showed that g appears to run through all heterogeneous collections of intellectual tasks and test items. He demonstrated that when heterogeneous items are all lightly positively correlated and then summed, the signal carried by each is successively amplified and the noise carried by each is successively attenuated.
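The aggregation argument can be illustrated with the Spearman-Brown formula for a sum of k items. This is a textbook simplification swapped in for illustration, not Spearman's original derivation: it assumes equal item variances and the same correlation r between every pair of items, and the value r = 0.10 is hypothetical.

```python
def composite_reliability(k: int, r: float) -> float:
    """Spearman-Brown: share of the variance in a k-item sum that reflects
    what the items have in common, assuming every pair correlates at r."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r / (1.0 + (k - 1) * r)


# Weakly correlated items (r = 0.10, hypothetical) summed in growing numbers.
for k in (1, 5, 30, 100):
    print(f"{k:>3} items -> common-signal share {composite_reliability(k, 0.10):.3f}")
```

Under these assumptions a single item is mostly noise, while a sum of thirty such items is mostly shared signal.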
Modern versions of intelligence tests index essentially the same construct that was uncovered at the turn of the twentieth century by Spearman, but with much more efficiency. For example, g is a statistical distillate that represents approximately half of what is common among the thirteen subtests comprising the Wechsler Adult Intelligence Scale. As noted by intelligence researcher Ian J. Deary, the attribute g represents the research finding that "there is something shared by all the tests in terms of people's tendencies to do well, modestly, or poorly on all of them." This "tendency" is quite stable over time. In 2001, Deary's team published a study that was the longest temporal stability assessment of general intelligence, testing subjects at the age of eleven and a second time at the age of seventy-seven. They observed a correlation of 0.62, which rose to over 0.70 when statistical artifacts were controlled.
Psychometricians have come to a consensus that mental abilities follow a hierarchical structure, with g at the top of the hierarchy and other broad groups of mental abilities offering psychological import beyond g. Specifically, mathematical, spatial-mechanical, and verbal reasoning abilities all have demonstrated incremental (additional) validity beyond g in forecasting educational and vocational outcomes.
g and the Prediction of Life Outcomes
Research on general intelligence has confirmed the validity of g for forecasting educational and occupational achievement. Empirical research has also documented general intelligence's network of relationships with other socially important outcomes, such as aggression, crime, and poverty. General intellectual ability covaries 0.70–0.80 with academic achievement measures, 0.40–0.70 with military training assignments, 0.20–0.60 with work performance (higher correlations reflect greater job complexity), 0.30–0.40 with income, and around 0.20 with obedience to the law. Measures of g also correlate positively with altruism, sense of humor, practical knowledge, social skills, and supermarket shopping ability, and correlate negatively with impulsivity, accident-proneness, delinquency, smoking, and racial prejudice. This diverse family of correlates reveals how individual differences in general intelligence influence other personal characteristics.
Experts' definitions of general intelligence fit with g's nexus of empirical relationships. Most measurement experts agree that measures of general intelligence assess individual differences pertaining to abstract thinking or reasoning, the capacity to acquire knowledge, and problem-solving ability. Traditional measures of general intelligence and standard academic achievement tests both assess these general information-processing capacities. In 1976, educational psychologist Lee Cronbach noted: "In public controversies about tests, disputants have failed to recognize that virtually every bit of evidence obtained with IQs would be approximately duplicated if the same study were carried out with a comprehensive measure of achievement" (1976, p. 211, emphasis in original).
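One common way to read correlations of this size is to square them, which gives the proportion of variance two measures share under a linear model. The sketch below applies that conversion to the endpoints quoted above; the dictionary layout and function name are illustrative assumptions.

```python
# Correlation ranges with g as quoted in the text (low end, high end).
ranges = {
    "academic achievement": (0.70, 0.80),
    "military training": (0.40, 0.70),
    "work performance": (0.20, 0.60),
    "income": (0.30, 0.40),
}


def shared_variance(r: float) -> float:
    """Proportion of variance shared by two linearly related measures."""
    return r * r


for outcome, (low, high) in ranges.items():
    print(f"{outcome}: {shared_variance(low):.0%} to {shared_variance(high):.0%}")
```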
The Causes of Individual Differences in Intelligence
Both genetic and environmental factors contribute to the individual differences observed in intelligence. The degree to which individual differences in intelligence are genetically influenced is represented by an estimate of heritability, the proportion of observed variation in intelligence among individuals that is attributable to genetic differences among the individuals. By pooling various family studies of g (e.g., identical and fraternal twins reared together or apart), the heritability of general intelligence in industrialized nations has been estimated to be approximately 40 percent in childhood and between 60 and 80 percent in adulthood. This pattern is thought to reflect the tendency of individuals, as they grow older and more autonomous, to increasingly self-select into environments congruent with their unique abilities and interests.
Environmental contributions to individual differences in intelligence are broadly defined as all non-genetic influences. Shared environmental factors, such as socioeconomic status and neighborhood context, are those that are shared by individuals within a given family but differ across families; non-shared environmental factors, such as the mentoring of a special teacher or one's peer group, are those that are generally unique to each individual within a family. The majority of environmental influences on intelligence can be attributed to non-shared factors for which the specifics, thus far, are not well known. Family studies of intelligence have consistently documented that the modest importance of shared environmental influences in early childhood, approximately 30 percent, decreases to essentially zero by adulthood.
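Twin comparisons of the kind mentioned above are often summarized with Falconer's formulas, a classical approximation that splits variance into additive genetic (A), shared environmental (C), and non-shared environmental (E) parts. The sketch below is a simplification, not the pooling procedure used in the family studies the text refers to, and the twin correlations are hypothetical values chosen so that the results land near the childhood and adult figures quoted above.

```python
from typing import NamedTuple


class ACE(NamedTuple):
    a2: float  # heritability (additive genetic share)
    c2: float  # shared-environment share
    e2: float  # non-shared environment share (includes measurement error)


def falconer(r_mz: float, r_dz: float) -> ACE:
    """Falconer's approximation from identical (MZ) and fraternal (DZ)
    twin correlations, both measured on twins reared together."""
    a2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return ACE(a2, c2, e2)


# Hypothetical correlations, not data from any cited study.
for label, r_mz, r_dz in (("childhood", 0.70, 0.50), ("adulthood", 0.70, 0.35)):
    est = falconer(r_mz, r_dz)
    print(f"{label}: h2={est.a2:.2f}, shared={est.c2:.2f}, non-shared={est.e2:.2f}")
```

With these inputs the shared-environment share falls from 0.30 to zero while heritability rises from 0.40 to 0.70, mirroring the developmental pattern the text describes.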
The Debate over Research on Intelligence
The empirical findings above are widely accepted among experts in the fields of measurement and individual differences. Yet research pertaining to general intelligence invariably generates controversy. Because psychological assessments are frequently used for allocating educational and vocational opportunities, and because different demographic groups (such as those based on socioeconomic status or race) differ in test scores and criterion performance, social concerns have accompanied intellectual assessment since its beginning. Because of these social concerns, alternative conceptualizations of intelligence, such as Howard Gardner's theory of multiple intelligences and Robert Sternberg's triarchic theory of intelligence, have generally been received positively by the public. Measures of these alternative formulations of intelligence, however, have not demonstrated incremental validity beyond what is already gained by conventional measures of intelligence. That is, they have not been shown to account for any more variance in important life outcomes (such as academic achievement and job performance) than that already accounted for by conventional intelligence tests.
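Incremental validity, as used above, is the gain in explained variance when a new predictor is added to an existing one. For two predictors the gain follows from the three pairwise correlations. The sketch below uses the standard two-predictor formula for R²; the function names and all correlation values are hypothetical and are not results from any study cited here.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 when outcome y is regressed on predictors 1 and 2, computed
    from the pairwise correlations alone."""
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1**2 + r_y2**2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12**2)


def incremental_validity(r_y1: float, r_y2: float, r_12: float) -> float:
    """Variance explained by predictor 2 beyond predictor 1."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1**2


# Hypothetical case A: the new measure overlaps heavily with the old one.
print(f"overlapping: {incremental_validity(0.50, 0.40, 0.80):.3f}")
# Hypothetical case B: the new measure is unrelated to the old one.
print(f"independent: {incremental_validity(0.50, 0.40, 0.00):.3f}")
```

In case A the new measure correlates 0.40 with the outcome on its own yet adds essentially nothing once the conventional measure is in the equation, which is the situation the paragraph above describes.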
See also: Age and Development; IQ; Retardation; Special Education.
Bouchard, T. J., Jr. 1997. "IQ Similarity in Twins Reared Apart: Findings and Responses to Critics." In Intelligence: Heredity and Environment, ed. R. J. Sternberg and E. L. Grigorenko. New York: Cambridge University Press.
Brand, Christopher. 1987. "The Importance of General Intelligence." In Arthur Jensen: Consensus and Controversy, ed. S. Modgil and C. Modgil. New York: Falmer Press.
Brody, N. 1992. Intelligence, 2nd ed. San Diego, CA: Academic Press.
Carroll, John B. 1993. Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge, UK: Cambridge University Press.
Cronbach, L. J. 1975. "Five Decades of Public Controversy over Mental Testing." American Psychologist 30: 1–14.
Cronbach, L. J. 1976. "Measured Mental Abilities: Lingering Questions and Loose Ends." In Human Diversity: Its Causes and Social Significance, ed. B. D. Davis and P. Flaherty. Cambridge, MA: Ballinger.
Deary, Ian J. 2001. Intelligence: A Very Short Introduction. New York: Oxford University Press.
Gottfredson, Linda S. 1997. "Intelligence and Social Policy." Intelligence 24 (special issue).
Jensen, Arthur R. 1998. The g Factor: The Science of Mental Ability. Westport, CT: Praeger.
Lubinski, David. 2000. "Assessing Individual Differences in Human Behavior: Sinking Shafts at a Few Critical Points." Annual Review of Psychology 51: 405–444.
Messick, S. 1992. "Multiple Intelligences or Multilevel Intelligence? Selective Emphasis on Distinctive Properties of Hierarchy: On Gardner's Frames of Mind and Sternberg's Beyond IQ in the Context of Theory and Research on the Structure of Human Abilities." Psychological Inquiry 3: 365–384.
Murray, Charles. 1998. Income, Inequality, and IQ. Washington, DC: American Enterprise Institute.
Neisser, U., G. Boodoo, T. J. Bouchard, Jr., et al. 1996. "Intelligence: Knowns and Unknowns." American Psychologist 51: 77–101.
Snyderman, Mark, and Stanley Rothman. 1987. "Survey of Expert Opinion on Intelligence and Aptitude Testing." American Psychologist 42: 137–144.
Spearman, Charles. 1904. "General Intelligence Objectively Determined and Measured." American Journal of Psychology 15: 201–292.
INTELLIGENCE TESTS. Although the tests created specifically to gauge intelligence were introduced to the United States in the early twentieth century, their roots go back much farther, even to exams in ancient China. The American tests, however, emerged directly from the work of nineteenth-century English scientists who were laying the foundation for the field of psychometrics: the scientific approach to measurement of psychological characteristics.
Early European Testing and the Stanford-Binet Test
Sir Francis Galton produced the first systematic investigations of the concept of intelligence. Galton seemed uniquely qualified for this task, as he was known for collecting and quantifying massive amounts of data. Galton's statistical analyses included seemingly random and subjective assessments. Nonetheless, his groundbreaking pronouncement endures: that intelligence is a trait normally distributed among populations. A normal distribution means that most people were of average intelligence, while a minority fell above or below this middle range. Plotting this distribution resulted in the formation of the now familiar bell curve.
Reflecting popular nineteenth-century theories of evolution, including those of his cousin, Charles Darwin, Galton viewed intelligence as a single, inherited trait. His landmark 1869 publication, Hereditary Genius, established the parameters of the scientific investigation of mental processes for years to come; his understanding of intelligence as a fixed and predetermined entity would remain largely unchallenged for nearly a century.
Eager to further explore Galton's ideas, psychologist James McKeen Cattell returned from his studies in Europe to the University of Pennsylvania in the late 1880s and began his own work. Cattell's "mental tests," a term he introduced, reflected his skills at statistical analysis. Similar to Galton's, however, his tests ultimately failed to show any real correlation between scores and demonstrated achievement. Still, Cattell's work earned growing recognition and respect for the emerging field of psychology.
The earliest intelligence tests to move beyond the theoretical and into the practical realm were the work of the French researcher Alfred Binet. The passage of a 1904 law requiring that all children attend school prompted the French government to decide what to do with children who could not keep up with classroom work. Binet and his colleague, Théodore Simon, set out to devise a test as a means of identifying these students, who would then receive tutoring or be placed in alternative classes.
Binet's first test was published in 1905. Like its subsequent revisions, this early version asked students to demonstrate proficiency at a variety of skills. Starting with the most basic and increasing in difficulty, the tasks were designed to measure children's vocabulary and their ability to understand simple concepts and identify relationships between words. An age level or "norm" was assigned to each task, based on the age at which approximately 70 percent of children could successfully complete that task. Totaling the individual scores would yield a child's "mental age." This would be subtracted from his or her chronological age; a difference of two or more years indicated that a child was mentally retarded.
Binet's research differed from that of previous investigators in several important ways: test scores were meant to measure classroom performance, not innate intelligence, and they were intended to target students who could benefit by receiving extra help. Binet was one of the few who challenged popular perceptions of intelligence as an inherent and unchangeable entity.
American professor of psychology Lewis Terman set out to refine what became widely known as the Binet-Simon Scale. Named for Stanford University, where Terman spent his long and distinguished career, the Stanford-Binet Intelligence Test emerged as the one to which all future tests would be compared. First published in 1916, the Stanford-Binet asked students to demonstrate competency in a variety of areas, including language comprehension, eye-hand coordination, mathematical reasoning, and memory. Terman advanced the idea proposed in 1912 by German psychologist Wilhelm Stern that intelligence could more accurately be expressed as a ratio, dividing mental age by chronological age. This would be multiplied by one hundred (to avoid the use of decimals) to arrive at what Stern labeled the "mental quotient." This quickly became known as an intelligence quotient, or IQ.
This formula ultimately yielded to new methods of calculation. Still predicated on Galton's assumption that intelligence is normally distributed, tables of raw data are statistically adjusted so that the mean scores are set at 100, with the middle two-thirds of the distribution set between 85 and 115 to form the "normal" range. This scale defines those who score below 70 as mentally retarded; those with 130 or above are often labeled gifted.
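The scale just described can be checked against a normal distribution. The sketch below assumes a mean of 100 and a standard deviation of 15; the latter is inferred from the 85-to-115 band quoted above rather than stated in the text, and the function name is illustrative.

```python
from statistics import NormalDist

# Assumed scale: mean 100, standard deviation 15 (inferred from the 85-115 band).
IQ_SCALE = NormalDist(mu=100, sigma=15)


def share_between(low: float, high: float) -> float:
    """Proportion of a normally distributed population scoring in [low, high]."""
    return IQ_SCALE.cdf(high) - IQ_SCALE.cdf(low)


print(f"85 to 115:    {share_between(85, 115):.1%}")
print(f"below 70:     {IQ_SCALE.cdf(70):.1%}")
print(f"130 or above: {1 - IQ_SCALE.cdf(130):.1%}")
```

Under these assumptions roughly 68 percent of scores fall in the 85-to-115 band, and a little over 2 percent fall in each of the two tails beyond 70 and 130.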
Testing the Masses
The United States' entry into World War I in 1917 prompted an immediate and unprecedented demand for standardized tests. The federal government sought a way to gauge the abilities of large numbers of military recruits quickly and efficiently so that duties could be assigned appropriately. Robert Yerkes of Harvard and other prominent psychologists created a committee in response to this need. Adopting the work of Arthur Otis, whose research in this field already was underway, they quickly produced two versions of a workable test. The Army Alpha was a written exam and the Army Beta was a nonverbal, picture-based assessment for the considerable number of men who were unable to read. The tests resulted in grades ranging from A to E. Within weeks a group of four thousand recruits completed the first trial run.
By the end of the war over 1.7 million men had taken either the Army Alpha or Beta. Based on their scores, tens of thousands of men were promoted or assigned a lower-level duty. An additional 8,000 men received discharges as a result of their poor performance. The impact of the Army testing program reached far beyond the military service. Its success convinced the nation of the usefulness of wide-scale standardized testing. The popularity of the Alpha, in particular, launched a rapidly expanding intelligence test industry. In the years immediately following the war, schoolchildren across the country began taking its numerous revisions; by 1930 over seven million American students had taken the test.
As the popularity of mass testing continued to grow, the need for individual tests as diagnostic tools remained. The Wechsler-Bellevue Intelligence Scale supplemented the Stanford-Binet in 1939. Devised by David Wechsler of Bellevue Hospital in New York City, the test yielded both verbal and nonverbal scores. It was renamed the Wechsler Adult Intelligence Scale (WAIS) in 1955 and later revised as the WAIS-R. The expanded group of tests, including the Wechsler Intelligence Scale for Children, Revised (WISC-R), and the Wechsler Preschool and Primary Scale of Intelligence (WPPSI), forms a battery that continues to be widely used. While schools no longer routinely offer individual tests specifically designed to measure intelligence, their use continues, usually as a follow-up to demonstrated academic difficulty or to determine eligibility for special programs, such as those for gifted children. Educators continue to rely on the relative ease and efficiency of administering group tests.
Although they date back to the 1917 prototype designed for military use, standardized tests at the start of the twenty-first century offer the promise of a more reliable and sophisticated means to predict future success. There are additional advantages as well: no special training is required to administer them, they can be given to large groups at once, and computers quickly and accurately generate results. The Cognitive Abilities Test (CAT) and the School and College Ability Test (SCAT) are among the more popular. Developers of these tests compare them favorably to both the Stanford-Binet and Wechsler series. Many high school students take the Scholastic Assessment Test (SAT) as part of the college application process. Its earliest version going back to 1926, the SAT is calculated to measure both verbal and mathematical ability. Proponents point to its usefulness as one indicator of future success, and claim that it counters inevitable disparities in grading practices nationwide.
Defining Intelligence: The Debate Continues
Alfred Binet rejected the idea of tests as providing a fixed label; he believed that children could indeed grow smarter. Binet's optimism notwithstanding, the history of intelligence testing in the United States reveals that early tests reflected the prejudices of the society in which they were produced. Not surprisingly, few questioned the idea that intelligence is innate and inherited. Tests made no accommodations for the disparate social and cultural backgrounds of test takers, and indeed, helped to fuel popularly held assumptions of the need to rank entire groups based on their racial or ethnic origins. They were hailed by some as a "measuring stick to organize society." Early-twentieth-century concerns about "feeblemindedness" validated the need for testing. Amidst growing concerns over an influx of immigration, tests were proposed to reduce the flow of "mental defectives" into the country. Congress, aided by the findings of prominent psychologists, passed the 1924 Immigration Act, which restricted admission for those believed to be of inferior intellect; especially targeted were Russians, Italians, Jews, and others primarily from southern and eastern Europe. Entry examinations given at Ellis Island seemingly ignored the numerous language and cultural barriers that would be readily apparent today.
While standardized tests continue to play a dominant role in American society, many critics argue that subtle inequities remain, producing results that reflect the social and economic background of the test taker rather than providing a true measure of one's capabilities. The SAT and other tests, meanwhile, retain their foothold in the academic arena. The ability to "coach" students to produce higher scores has launched a multi-million-dollar mass tutoring industry. This has prompted many to further renounce the tests' use as an "objective" means of assessment, arguing that they are more accurate indicators of students' social and economic backgrounds.
Meanwhile, biological interpretations of intelligence endure. Examining the degree to which race or ethnicity is a determining factor, The Bell Curve: Intelligence and Class Structure in American Life, published in 1994, pushed the debate to new heights. While authors Richard Herrnstein and Charles Murray suggested the merits of acknowledging genetic differences, some critics immediately decried a racist agenda and uncovered studies they believed to be scientifically unsound.
Experts continue to voice disagreement over methods of measuring intelligence. At the core of the debate lie questions regarding the very concept of intelligence itself. Some embrace interpretations that echo the theories of turn-of-the-twentieth-century psychologist Charles Spearman of England, who pointed to a single, overarching general intelligence, or "g" factor. At the other extreme is the more recent twentieth-century model created by J. P. Guilford of the University of Southern California, who identified no fewer than 150 components of intelligence. Arguably the most detailed model, it has had limited impact on the field of testing; many have adopted his claim, however, that intelligence is composed of multiple parts.
The psychologist Robert Sternberg believes that the logical or analytical reasoning that most intelligence tests measure is only one of several factors. He has added to this two other areas of assessment (practical intelligence, or the ability to cope amidst one's environment, and experiential intelligence, or propensity for insight and creativity) to form his triarchic theory of intelligence. Sternberg's theory has advanced the notion that psychological assessments should move beyond the written test toward those that seek measures of the practical knowledge that guides our day-to-day experiences. Also believing that traditional IQ tests ignore critical components of intelligence, Howard Gardner has introduced what he calls "multiple intelligences," which range from musical ability to self-awareness. Not surprisingly, Gardner is among those who advocate more expansive interpretations of intelligence, suggesting decreased reliance on the standardized tests of the past and more emphasis on real-life performance.
Experts continue to explore the concept of intelligence. New lines of inquiry widen the scope of investigation and questions abound. Should traits of character and morality be examined? Should the ability to form emotional bonds and display musical talent be considered? Will more comprehensive approaches replace short-answer tests? And does the ability to determine one's IQ necessarily define how this score should be used? Studies are moving beyond the realm of psychological inquiry. Increasingly sophisticated ways of measuring brain activity suggest new modes of interpretation while technological advances have produced an "artificial intelligence" that previous generations of researchers could barely imagine. While we may be no closer to finding a universally accepted definition of intelligence, clearly the quest to do so remains.
Chapman, Paul Davis. Schools as Sorters: Lewis Terman, Applied Psychology, and the Intelligence Testing Movement, 1890–1930. New York: New York University Press, 1988.
Eysenck, H. J., and Leon Kamin. The Intelligence Controversy. New York: Wiley, 1981.
Fancher, Raymond E., ed. The Intelligence Men: Makers of the IQ Controversy. New York: Norton, 1985.
Gardner, Howard. Frames of Mind: The Theory of Multiple Intelligences. New York: Basic Books, 1983.
——. "Who Owns Intelligence?" Atlantic Monthly (February 1999).
Gould, Stephen Jay. The Mismeasure of Man. New York: Norton, 1983.
Herrnstein, Richard J., and Charles Murray. The Bell Curve: Intelligence and Class Structure in American Life. New York: Free Press, 1994.
Sokal, Michael M., ed. Psychological Testing and American Society, 1890–1930. New Brunswick, N.J.: Rutgers University Press, 1987.
Sternberg, Robert J. Beyond IQ: A Triarchic Theory of Human Intelligence. Cambridge, U.K.: Cambridge University Press, 1985.
Yam, Philip, ed. "Exploring Intelligence." Spec. issue of Scientific American (Winter 1998).
Zenderland, Leila. Measuring Minds: Henry Herbert Goddard and the Origins of American Intelligence Testing. Cambridge, U.K.: Cambridge University Press, 1998.
See also: Education; Racial Science.
A measurement of intelligence based on standardized test scores.
Although intelligence quotient (IQ) tests are still widely used in the United States, there has been increasing doubt voiced about their ability to measure the mental capacities that determine success in life. IQ testing has also been criticized for being biased with regard to race and gender. In modern times, the first scientist to test mental ability was Alfred Binet, a French psychologist who devised an intelligence test for children in 1905, based on the idea that intelligence could be expressed in terms of age. Binet created the concept of "mental age," according to which the test performance of a child of average intelligence would match his or her age, while a gifted child's performance would be on par with that of an older child, and a slow learner's abilities would be equal to those of a younger child. Binet's test was introduced to the United States in a modified form in 1916 by Lewis Terman. The scoring system of the new test, devised by German psychologist William Stern, consisted of dividing a child's mental age by his or her chronological age and multiplying the quotient by 100 to arrive at an "intelligence quotient" (which would equal 100 in a person of average ability).
The Wechsler Intelligence Scales, developed in 1949 by David Wechsler, addressed an issue that still provokes criticism of IQ tests today: the fact that there are different types of intelligence. The Wechsler scales replaced the single mental-age score with a verbal scale and a performance scale for nonverbal skills to address each test taker's individual combination of strengths and weaknesses. The Stanford-Binet and Wechsler tests (in updated versions) remain the most widely administered IQ tests in the United States. Average performance at each age level is still assigned a score of 100, but today's scores are calculated solely by comparison with the performance of others in the same age group rather than test takers of various ages. Among the general population, scores cluster around 100 and gradually decrease in either direction, in a pattern known as the normal distribution (or "bell") curve.
Although IQ scores are good predictors of academic achievement in elementary and secondary school, the correspondence between IQ and academic performance is less consistent at higher levels of education, and many have questioned the ability of IQ tests to predict success later in life. The tests don't measure many of the qualities necessary for achievement in the world of work, such as persistence, self-confidence, motivation, and interpersonal skills, or the ability to set priorities and to allocate one's time and effort efficiently. In addition, the creativity and intuition responsible for great achievements in both science and the arts are not reflected by IQ tests. For example, creativity often involves the ability to envision multiple solutions to a problem (a trait educators call divergent thinking); in contrast, IQ tests require the choice of a single answer or solution to a problem, a type of task that could penalize highly creative people.
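Scoring by comparison with same-age peers amounts to standardizing a raw score within its age group and rescaling it. The sketch below is a minimal illustration, not the procedure of any published test: the norm sample is invented, and the standard deviation of 15 is an assumed convention that the text above does not state.

```python
from statistics import mean, stdev


def deviation_iq(raw_score: float, age_group_scores: list[float],
                 scale_mean: float = 100.0, scale_sd: float = 15.0) -> float:
    """Rescale a raw score relative to the scores of same-age test takers."""
    group_mean = mean(age_group_scores)
    group_sd = stdev(age_group_scores)
    if group_sd == 0:
        raise ValueError("age group scores must vary")
    z = (raw_score - group_mean) / group_sd  # distance from the group average
    return scale_mean + scale_sd * z


# Invented raw scores for one age group (group average is 50).
norm_sample = [38, 42, 45, 50, 55, 58, 62]
for raw in (42, 50, 58):
    print(f"raw {raw} -> IQ {deviation_iq(raw, norm_sample):.0f}")
```

A raw score equal to the age-group average maps to 100 regardless of the test taker's age, which is the property the paragraph above describes.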
GENDER DIFFERENCES IN MATH
In the late 1970s, political scientist Sheila Tobias and others called attention to the tendency of girls to avoid and feel anxiety about math, which Tobias attributed to social conditioning. Girls historically were discouraged from pursuing mathematics by teachers, peers, and parents.
In the early 1990s, two studies suggested that there might be differences in how boys and girls approach mathematics problems. One study, conducted by researchers at Johns Hopkins University, examined differences in mathematical reasoning using the School and College Ability Test (SCAT). The SCAT includes 50 pairs of quantities to compare, and the test-takers must decide whether one is larger than the other or whether the two are equal, or whether there is not enough information. Groups of students in second through sixth grade who had been identified as "high ability" (97th percentile or above on either the verbal or quantitative sections of the California Achievement Test) participated in the study. The boys scored higher than the girls overall, and the average difference between male and female scores was the same for all grade levels included in the study. Another study by Australian researchers at the University of New South Wales and La Trobe University gave 10th-graders 36 algebraic word problems and asked them to group the problems according to the following criteria: whether there was sufficient information to solve the problem; insufficient information; or irrelevant information along with sufficient information. (There were 12 problems in each category.) Students were grouped into ability groups according to prior test scores. Boys and girls performed equally well in identifying problems containing sufficient information, but boys were more able than girls to detect problems that had irrelevant information, or those that had missing information. Next, the researchers asked the students to solve the problems. Girls performed as well as boys in solving problems that had sufficient information, but no irrelevant information. On the problems that contained irrelevant information, girls did not perform as well as boys. 
The researchers offered tentative conclusions that perhaps girls are less able to differentiate between relevant and irrelevant information, and thus allow irrelevant information to confuse their problem-solving process. The researchers hypothesized that this tendency to consider all information relevant may reflect girls' assumption that test designers would not give facts that were unnecessary to reaching a solution.
Some researchers have argued that offering all-girl math classes is an effective way to improve girls' achievement by allowing them to develop their problem-solving skills in an environment that fosters concentration. Others feel this deprives girls of the opportunity to learn from and compete with boys, who are often among the strongest math students.
The value of IQ tests has also been called into question by recent theories that define intelligence in ways that transcend the boundaries of tests chiefly designed to measure abstract reasoning and verbal comprehension. For example, Robert Steinberg's triarchical model addresses not only internal thought processes but also how they operate in relation to past experience and to the external environment. Harvard University psychologist Howard Gardner has posited a theory of multiple intelligences that includes seven different types of intelligence: linguistic and logicalmathematical (the types measured by IQ tests); spatial; interpersonal (ability to deal with other people); intrapersonal (insight into oneself); musical; and bodilykinesthetic (athletic ability).
Critics have also questioned whether IQ tests are a fair or valid way of assessing intelligence in members of ethnic and cultural minorities. Early in the 20th century, IQ tests were used to screen foreign immigrants to the United States; roughly 80% of Eastern European immigrants tested during the World War I era were declared "feeble-minded," even though the tests discriminated against them in terms of language skills and cultural knowledge of the United States. The relationship between IQ and race became an inflammatory issue with the publication of the article "How Much Can We Boost IQ and Scholastic Achievement?" by educational psychologist Arthur Jensen in the Harvard Educational Review in 1969. Flying in the face of prevailing belief in the effects of environmental factors on intelligence, Jensen argued that the effectiveness of the government social programs of the 1960s War on Poverty had been limited because the children they had been intended to help had relatively low IQs, a situation that could not be remedied by government intervention. Jensen was widely censured for his views, and standardized testing underwent a period of criticism within the educational establishment, as the National Education Association called for a moratorium on testing and major school systems attempted to limit or even abandon publicly administered standardized tests. Another milestone in the public controversy over testing was the 1981 publication of Stephen Jay Gould's best-selling The Mismeasure of Man, which critiqued IQ tests as well as the entire concept of measurable intelligence.
Many still claim that IQ tests are unfair to members of minority groups because they are based on the vocabulary, customs, and values of the mainstream, or dominant, culture. Some observers have cited cultural bias in testing to explain the fact that, on average, African-Americans and Hispanic-Americans score 12 to 15 points lower than European-Americans on IQ tests. (Asian-Americans, however, score an average of four to six points higher than European-Americans.) A new round of controversy was ignited with the 1994 publication of The Bell Curve by Richard Herrnstein and Charles Murray, who explore the relationship between IQ, race, and pervasive social problems such as unemployment, crime, and illegitimacy. Given the proliferation of recent theories about the nature of intelligence, many psychologists have disagreed with Herrnstein and Murray's central assumptions that intelligence is measurable by IQ tests, that it is genetically based, and that a person's IQ essentially remains unchanged over time. From a sociopolitical viewpoint, the book's critics have taken issue with The Bell Curve's use of arguments about the genetic nature of intelligence to cast doubt on the power of government to remedy many of the nation's most pressing social problems.
Yet another topic for debate has arisen with the discovery that IQ scores in the world's developed countries, especially scores related to mazes and puzzles, have risen dramatically since the introduction of IQ tests early in the century. Scores in the United States have risen an average of 24 points since 1918, scores in Britain have climbed 27 points since 1942, and comparable figures have been reported throughout Western Europe, as well as in Canada, Japan, Israel, Australia, and other parts of the developed world. This phenomenon, named the Flynn effect for the New Zealand researcher who first noticed it, raises important questions about intelligence testing. It has implications for the debate over the relative importance of heredity and environment in determining IQ, since experts agree that such a large difference in test scores in so short a time cannot be explained by genetic changes.
A variety of environmental factors have been cited as possible explanations for the Flynn effect, including expanded opportunities for formal education that have given children throughout the world more and earlier exposure to some types of questions they are likely to encounter on an IQ test (although IQ gains in areas such as mathematics and vocabulary, which are most directly linked to formal schooling, have been more modest than those in nonverbal areas). For children in the United States in the 1970s and 1980s, exposure to printed texts and electronic technology—from cereal boxes to video games—has been cited as an explanation for improved familiarity with the types of maze and puzzle questions that have generated the greatest score changes. Improved mastery of spatial relations has also been linked to video games. Other environmental factors mentioned in connection with the Flynn effect include improved nutrition and changes in parenting styles.
Bridge, R. Gary. The Determinants of Educational Outcomes: The Impact of Families, Peers, Teachers, and Schools. Cambridge, MA: Ballinger Publishing Co., 1979.
Eysenck, H. J. The Intelligence Controversy. New York: Wiley, 1981.
Fraser, Steven. The Bell Curve Wars: Race, Intelligence, and the Future of America. New York: Basic Books, 1995.
Herrnstein, Richard J., and Charles Murray. The Bell Curve: Intelligence and Class Structure in American Life. New York: Free Press, 1994.
Kline, Paul. Intelligence: The Psychometric View. London: Routledge, 1991.
Sternberg, R. J. Beyond IQ: A Triarchic Theory of Human Intelligence. Cambridge, Eng.: Cambridge University Press, 1985.
IQ, or intelligence quotient, is a measure of intelligence that schools, children's homes, and other child-saving institutions have used since the 1910s to assess the intelligence of children for various diagnostic purposes. Welcomed and reviled in different social and political contexts in the twentieth century, especially in the United States, because its deployment has influenced the life chances of millions of children, the IQ and the tests that produce it had modest beginnings. French psychologist Alfred Binet devised the first test of intelligence for school children in 1908 and 1911. He understood that intellectual capacity increased as children matured; his age scale, which he obtained as a norm of right over wrong answers about everyday artifacts and information for each year of childhood, was based on the Gaussian bell-shaped curve. The result gave the child's "mental age." If the child was three years old and her or his mental age was normal for a three-year-old, then the child was normal because his or her chronological and mental ages were the same. If the child's mental age was "higher" than her or his chronological age, then the child was advanced, or had a higher than normal IQ. If the situation were reversed, then the child was behind or retarded, with a lower than normal IQ for his or her age. There were several tests for each age, and Binet expressed scores as mental ages. His great insight was that mental age existed apart from, but was related to, chronological age. William Stern, of Hamburg University, devised the notion of the intelligence quotient, soon dubbed the IQ, by dividing the child's mental age by her or his chronological age; the ratio was then multiplied by 100. Thus a child of ten with a mental age of twelve would have an IQ of 120. One with a chronological age of five and a mental age of four would have an IQ of only 80, and so on.
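The ratio calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `ratio_iq` is hypothetical, and the multiplication by 100 follows the conventional definition of the quotient.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100.

    Hypothetical helper for illustration; it is not taken from any test manual.
    """
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age / chronological_age


# The two examples given in the text.
print(ratio_iq(12, 10))  # child of ten with a mental age of twelve -> 120.0
print(ratio_iq(4, 5))    # child of five with a mental age of four -> 80.0
```

A child whose mental and chronological ages match scores exactly 100 under this definition, which is why 100 serves as the reference point.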
The American psychologist Lewis M. Terman, of Stanford University, "Americanized" the Binet test, and Stern's notion of the IQ, in the 1910s. He standardized the Binet test on many small-town, middle-class California school children of northwestern European, Protestant extraction, so that the norms for each age were synchronized with cultural knowledge best understood by such children and by their relatives, peers, and neighbors. In transforming Binet's test into the Stanford-Binet measuring scale of intelligence, or, more simply, the Stanford-Binet, Terman insisted that the test measured innate intelligence in individuals and in groups, and this assumption was not widely or seriously questioned by mainstream academic psychologists until the 1960s. The Stanford-Binet became the model for subsequent IQ tests and tests of intelligence in the United States for the next generation, thus influencing the lives of many children in America and abroad. From the 1920s to the 1960s, the IQ reigned supreme in education and social welfare institutions. Although it is true that in the 1920s there was a furious, if short-lived, controversy among social scientists over whether such tests constituted legitimate scientific measures of the "average IQ" of specific ethnic and racial groups in the population, only an ignored handful of researchers questioned whether an individual's IQ was innate at birth and stable thereafter.
After the 1960s, various constituencies and interest groups raised critical questions about IQ testing. Champions of civil rights and feminism claimed that defenders of segregation and institutionalized racism had used so-called average racial IQ scores to keep minorities and females from good schools, jobs, and neighborhoods. Some psychologists claimed that intelligence was too complex a phenomenon to be reduced to a simple ratio; most post–World War II tests, based on a model developed by the psychologist David Wechsler, treated intelligence as the consequence of multiple factors and processes. Researchers in early childhood education insisted in the 1960s that IQs of preschool age children could and did respond to environmental stimuli and pressures by at least as much as the gap between many racial minorities and the white majority. As in the 1920s, a nature versus nurture debate took place over the next several decades without a definite conclusion. After World War II, most institutions, such as schools and child-saving organizations, tended to interpret IQ scores as mere indicators, to be used with many other indices to understand a child and her or his potentiality.
See also: Child Development, History of the Concept of; Intelligence Testing.
Boring, E. G. 1950. A History of Experimental Psychology, 2nd ed. New York: Century Co.
Cravens, Hamilton. 1988. The Triumph of Evolution: The Heredity-Environment Controversy, 1900–1941. Baltimore, MD: The Johns Hopkins University Press.
Cravens, Hamilton. 2002. Before Head Start: The Iowa Station and America's Children. Chapel Hill: The University of North Carolina Press.
Cremin, Lawrence A. 1961. The Transformation of the School: Progressivism and American Education, 1876–1955. New York: Knopf.
Curti, Merle. 1980. Human Nature in American Thought. Madison: University of Wisconsin Press.
Hunt, J. McVicker. 1960. Intelligence and Experience. New York: The Ronald Press.
Stoddard, George D. 1943. The Meaning of Intelligence. New York: Macmillan.
Terman, Lewis M., et al. 1917. The Stanford Revision and Extension of the Binet-Simon Scale for Measuring Intelligence. Baltimore: Warwick and York.
Intelligence tests are psychological tests that are designed to measure a variety of mental functions, such as reasoning, comprehension, and judgment.
The goal of intelligence tests is to obtain an idea of the person’s intellectual potential. The tests center around a set of stimuli designed to yield a score based on the test maker’s model of what makes up intelligence. Intelligence tests are often given as a part of a battery of tests.
There are many different types of intelligence tests, and they do not all measure the same abilities. Although the tests often have aspects that are related to each other, one should not expect that scores on an intelligence test that measures a single factor will be similar to scores on another intelligence test that measures a variety of factors. Also, when determining whether or not to use an intelligence test, a person should make sure that the test has been adequately developed and has solid research to show its reliability and validity. Additionally, psychometric testing requires a clinically trained examiner. Therefore, the test should only be administered and interpreted by a trained professional.
A central criticism of intelligence tests is that psychologists and educators use these tests to distribute the limited resources of our society. These test results are used to provide rewards such as special classes for gifted students, admission to college, and employment. Those who do not qualify for these resources based on intelligence test scores may feel angry and as if the tests are denying them opportunities for success. Unfortunately, intelligence test scores have become associated not only with a person’s ability to perform certain tasks but also with self-worth.
Many people are under the false assumption that intelligence tests measure a person’s inborn or biological intelligence. Intelligence tests are based on an individual’s interaction with the environment and never exclusively measure inborn intelligence. Intelligence tests have been associated with categorizing and stereotyping people. Additionally, knowledge of one’s performance on an intelligence test may affect a person’s aspirations and motivation to obtain goals. Intelligence tests can be culturally biased against certain minority groups.
When taking an intelligence test, a person can expect to do a variety of tasks. These tasks may include answering questions that are asked verbally, doing mathematical problems, and doing a variety of tasks that require eye-hand coordination.
Some tasks may be timed and require the person to work as quickly as possible. Typically, most questions and tasks start out easy and progressively get more difficult. It is unusual for anyone to know the answer to all of the questions or be able to complete all of the tasks. If a person is unsure of an answer, guessing is usually allowed.
The four most commonly used intelligence tests are:
- Stanford-Binet Intelligence Scales
- Wechsler Adult Intelligence Scale
- Wechsler Intelligence Scale for Children
- Wechsler Preschool and Primary Scale of Intelligence
In general, intelligence tests measure a wide variety of human behaviors better than any other measure that has been developed. They allow professionals to have a uniform way of comparing a person’s performance with that of other people who are similar in age. These tests also provide information on cultural and biological differences among people.
Intelligence tests are excellent predictors of academic achievement and provide an outline of a person’s mental strengths and weaknesses. In many cases the scores have revealed talents that led to an improvement in a person’s educational opportunities. Teachers, parents, and psychologists are able to devise an individual curriculum that matches a person’s level of development and expectations.
Some researchers argue that intelligence tests have serious shortcomings. For example, many intelligence tests produce a single intelligence score. This single score is often inadequate in explaining the multidimensional aspects of intelligence. Another problem with a single score is the fact that individuals with similar intelligence test scores can vary greatly in their expression of these talents. It is important to know the person’s performance on the various subtests that make up the overall intelligence test score. Knowing the performance on these various scales can influence the understanding of a person’s abilities and how these abilities are expressed. For example, two people may have identical scores on an intelligence test. Although both people have the same test score, one person may have obtained the score because of strong verbal skills while the other may have obtained the score because of strong skills in perceiving and organizing various tasks.
Furthermore, intelligence tests only measure a sample of behaviors or situations in which intelligent behavior is revealed. For instance, some intelligence tests do not measure a person’s everyday functioning, social knowledge, mechanical skills, and/or creativity. Along with this, the formats of many intelligence tests do not capture the complexity and immediacy of real-life situations. Therefore, intelligence tests have been criticized for their limited ability to predict non-test or nonacademic intellectual abilities. Since intelligence test scores can be influenced by a variety of different experiences and behaviors, they should not be considered a perfect indicator of a person’s intellectual potential.
The person’s raw scores on an intelligence test are typically converted to standard scores. The standard scores allow the examiner to compare the individual’s score to the scores of other people who have taken the test. Additionally, by converting raw scores to standard scores the examiner has uniform scores and can more easily compare an individual’s performance on one test with the individual’s performance on another test. Depending on the intelligence test that is used, a variety of scores can be obtained. Most intelligence tests generate an overall intelligence quotient or IQ. As previously noted, it is valuable to know how a person performs on the various tasks that make up the test. This can influence the interpretation of the test and what the IQ means. The average score for most intelligence tests is 100.
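As a rough illustration of converting a raw score to a standard score, the sketch below rescales a raw score onto a scale with a mean of 100. The function name, the norm-group mean and standard deviation, and the scale standard deviation of 15 are all assumptions made for this example; actual tests rely on published norm tables for each age group.

```python
def standard_score(raw: float, norm_mean: float, norm_sd: float,
                   scale_mean: float = 100.0, scale_sd: float = 15.0) -> float:
    """Rescale a raw score to a standard-score scale.

    norm_mean and norm_sd describe a hypothetical comparison group;
    scale_sd=15 is an assumed convention, not a value given in the text.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    # Distance from the group mean, in standard-deviation units.
    z = (raw - norm_mean) / norm_sd
    return scale_mean + scale_sd * z


# A raw score at the group mean maps to the scale mean of 100.
print(standard_score(50, norm_mean=50, norm_sd=10))  # 100.0
# A raw score one standard deviation above the mean.
print(standard_score(60, norm_mean=50, norm_sd=10))  # 115.0
```

Because scores from different tests land on the same scale after conversion, an examiner can compare them directly, which is the point made in the paragraph above.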
See also: Stanford-Binet Intelligence Scales; Wechsler Adult Intelligence Scale; Wechsler Intelligence Scale for Children.
Kaufman, Alan S., and Elizabeth O. Lichtenberger. Assessing Adolescent and Adult Intelligence. Boston: Allyn and Bacon, 2001.
Matarazzo, J. D. Wechsler’s Measurement and Appraisal of Adult Intelligence. 5th ed. New York: Oxford University Press, 1972.
Sattler, Jerome M. “Issues Related to the Measurement and Change of Intelligence.” In Assessment of Children: Cognitive Applications. 4th ed. San Diego: Jerome M. Sattler, Publisher, Inc., 2001.
Sattler, Jerome M. and Lisa Weyandt. “Specific Learning Disabilities.” In Assessment of Children: Behavioral and Clinical Applications. 4th ed. Written by Jerome M. Sattler. San Diego: Jerome M. Sattler, Publisher, Inc., 2002.
Keith Beard, Psy.D.
in·tel·li·gence quo·tient (abbr.: IQ) • n. a number representing a person's reasoning ability (measured using problem-solving tests) as compared to the statistical norm or average for their age, taken as 100.