Standardized Testing

Standardized testing is so much a part of American culture that almost everyone can recognize its multiple-choice format, even young children. It is no wonder; for most Americans, the testing starts in kindergarten. This testing culture seems to be uniquely American. In fact, Europeans refer to these tests as "American tests." A standardized test is so called because everyone takes the same test with the same questions, so that each person's performance can be compared to everyone else's and a relative score obtained. Schools across the country vary greatly, so an A at one school might not be equal to an A from another; standardized tests thus serve to offer an equitable measure of aptitude. Because the tests are designed to screen applicants and are nearly impossible to finish, many people freeze when taking them. Perhaps their fear would change to anger if they knew the checkered history of these tests.

In 1912, Henry Goddard, who coined the word "moron," ran his version of an intelligence test at Ellis Island, and "proved scientifically" that the majority of Jews, Hungarians, Italians, and Russians were what he considered "feebleminded." A few years later, the president of Columbia College, disappointed by all the recent Jewish immigrants enrolling after World War I, made Columbia the first school to use an "intelligence" test for admissions; he hoped these tests would limit the number of Jewish students without instituting an overt policy to do so.

World War I provided a perfect opportunity to test large numbers of people. The Army Mental Tests were created in 1917, with the help of Goddard and Carl Campbell Brigham. Results of these tests were to be used in assigning recruits to jobs in the army. Questions had the familiar multiple-choice format, but rather odd subject matter. For example, "Crisco is a (A) patent medicine (B) disinfectant (C) toothpaste (D) food product" and "The forward pass is used in (A) tennis (B) hockey (C) football (D) golf." Few would dispute that these questions test not intelligence but rather an awareness of consumer and leisure culture, subjects about which impoverished immigrants without American hobbies and with little or no English probably knew very little.

Brigham used questions like these to prove his thesis. In analyzing the "data" from the Army tests, Brigham concluded that there are four racial strains in America. In order of their supposed "intelligence," and using his terminology, they are: Nordic; Alpine; Mediterranean; and last, American Negroes. The data gathered from Brigham's army tests was continually invoked in Congress and was instrumental in fostering congressional debates that led to the Immigration Restriction Act of 1924. The act imposed quotas on immigrants entering the United States.

The College Board had been established in 1900 to standardize the entrance exams that Harvard, Yale, Princeton, and a few other elite colleges had all been administering separately; today it includes more than 2,500 colleges, schools, school systems, and educational associations. In 1925, it hired Brigham to develop an intelligence test for use in college admissions. That test was called the Scholastic Aptitude Test (SAT). The first SAT, published in 1926 and administered to 8,040 people, included questions about brand names, chicken breeds, and cuts of beef. It also had a section on artificial language that presented made-up vocabulary and grammar rules and asked testers to make sentences. There was also an analogy section that gave testers only six minutes to answer 40 questions. The test was, and in dramatically modified form still is, used to "predict" one's success in college.

Because so many people were going to college after World War II, there was a huge demand for the SAT. In 1947 the College Board created the non-profit Educational Testing Service (ETS) to take care of the testing demand. The next year, 75,000 students took the SAT. In 1954 the College Board instituted a "test use" requirement, which forced all members of the Board to use at least one of its exams; a monopoly was thus established. Until 1957 neither the testers nor their high schools were even told their scores. Annual administrations of the SAT passed one million in 1963, and were at 1.8 million by 1994.

Henry Chauncey, ETS' first president, thought of the SAT as an IQ test. He admired such tests and believed that testing could help define vocational goals of students, especially those whose "talents" lend themselves to stopping education after high school. He suggested that students get tested at the end of the eighth or ninth grade, and then every one or two years thereafter. One does have to pay a fee each time one takes a standardized test. Some estimates of the amount Americans spend on testing in general are as high as $500 million annually.

Standardized tests are being used more than ever to aid overworked and understaffed admissions departments. The tests come by their nickname "the Gatekeepers" honestly; the score is the first, and often the only, thing an admissions officer looks at on an application. If the prospective student has too low a number, they do not get in, no matter how impressive the rest of the application. And schools sometimes jack up the "average" incoming test scores of students in their admissions material so the school will seem more "selective." Yet these gatekeepers are not crafted by academicians; writing test questions requires no special academic training. College professors do not write the SAT; lawyers do not write the LSAT, much less the Bar exam.

Most of ETS' tests contain some combination of math and verbal questions (analogies, sentence completions, and critical reading passages); some throw in "analytic" questions such as logic games. The MCAT (Medical College Admissions Test) contains curriculum-based questions on biology, chemistry, and physics; in this respect it is more like the non-ETS (and less expensive) ACT (American College Testing) Assessment Test, a curriculum-based test that is also used for college admissions. The GRE (Graduate Record Examination) and GMAT (Graduate Management Admissions Test) were by the 1990s given only on computer, as opposed to pencil and paper, in a format called "computer adaptive" or CAT. The CAT format redefines "standardized," since each tester gets questions from a pool and the computer "adapts" each time she gets a question right or wrong, giving her a harder one if her last answer was correct and an easier one if her last answer was incorrect. It is therefore possible that no two test takers will take the exact same test.
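The adaptive rule described above can be sketched in a few lines of Python. This is only an illustration of the "harder after a right answer, easier after a wrong one" idea, not a description of how ETS actually selects questions; operational adaptive tests generally rely on more elaborate statistical models. The function names, the five difficulty levels, and the question labels are all hypothetical.

```python
def next_difficulty(current, was_correct, lowest=1, highest=5):
    """Move up one level after a correct answer, down one after a miss,
    staying within the (assumed) range of difficulty levels."""
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_adaptive_test(pool, answers_correctly, start=3, length=5):
    """Return the (difficulty, question) pairs shown to one test taker.

    pool: dict mapping a difficulty level to a list of questions
    answers_correctly: function(question) -> bool, standing in for the tester
    """
    # Work on a copy so the shared pool is unchanged for the next tester.
    remaining = {level: list(items) for level, items in pool.items()}
    difficulty = start
    asked = []
    for _ in range(length):
        if not remaining[difficulty]:
            break  # no unused questions left at this level
        question = remaining[difficulty].pop(0)
        asked.append((difficulty, question))
        difficulty = next_difficulty(difficulty, answers_correctly(question))
    return asked


if __name__ == "__main__":
    # Hypothetical pool: three questions at each of five difficulty levels.
    pool = {level: [f"L{level}-Q{i}" for i in range(1, 4)]
            for level in range(1, 6)}
    print(run_adaptive_test(pool, lambda q: True))   # tester who never misses
    print(run_adaptive_test(pool, lambda q: False))  # tester who always misses
```

Both hypothetical testers start with the same mid-level question, but every question after it differs, which is the sense in which no two test takers need see the same test.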

ETS does seem to be distancing itself from the idea of "aptitude" testing. Until 1982, the GRE General Test was called the GRE Aptitude Test. And in 1994, the A in SAT began to stand for "Assessment." ETS now says that native intelligence is not what their tests are designed to detect; the SAT measures "developed ability, not innate intelligence; a test of abilities that are developed slowly over time both through in-school and out-of-school experience."

The world of standardized testing starts long before high school. Estimates report that 127 million tests a year are being given at the K-12 level alone (including the Iowa Test of Basic Skills, a widely used achievement test for grade schools). Also at that level, teachers have complained that because their peers are "teaching to the test," the resulting scores are meaningless. ETS has always maintained that coaching is not effective on its tests; it did not even publish its own practice questions until 1978. But the million-dollar cottage industry of coaching businesses, whose existence depends on these tests, disagrees. Some such coaching programs claim that a student can raise his or her SAT score by up to 300 points after taking a coaching course.

Testing opponents insist not only that high scores reveal little more than a talent for taking tests and understanding the testing "mentality," but also that the scores seem to correlate with the income and education of the tester's parents. College Board data shows that someone taking the SAT can expect to score about 30 points higher for every $10,000 in her parents' yearly income. Unfortunately, these tests still seem to do what Brigham meant for them to do. For many years now, the median score for African Americans on the SAT has been 200 points below that for whites; females have been scoring 35 points lower on the math sections than males. ETS has made some changes to answer these charges. The 1996 "recentering" of SAT scores, done technically to create a better distribution of scores around the test's numerical midpoint, boosted the average scores for groups like African Americans and Hispanics. Additionally, the Preliminary Scholastic Assessment Test (PSAT), a test used to determine who gets the National Merit Scholarship, replaced its old scoring formula, which assigned equal weight to the math and verbal sections, with a formula that doubles the verbal score, usually the higher one for female testers, in the hope that more women might get the scholarship.

School entry might be the most common way people are introduced to the entrenched testing culture, but it is not the only way. A partial list of organizations that buy tests from ETS includes the CIA (Central Intelligence Agency), the government of Trinidad and Tobago, The American Society of Plumbing Engineers, and The Institute for Nuclear Power Operations. One might also be required to take an ETS exam to become a golf pro, and in some states, a travel agent, a real estate salesman, or a beautician.

—Karen Lurie