Standardization is a technique used in comparing indicators from two or more populations. The goal of the standardization procedure is to control for compositional differences between these groups that may influence the indicator that is being examined. This method allows a researcher to determine the extent to which differences in the rates of events between populations are due to differences in population characteristics. Sociologists often ask questions that require comparisons between groups of people: Which city has a higher crime rate? Which country has lower mortality? Which ethnic group is more likely to coreside with elderly family members? In making these comparisons, one usually calculates a summary measure: crimes per capita, crude death rate, or the proportion of elders living with family members. However, any two groups of people are likely to differ along several dimensions, such as age, educational level, race, and income. These dimensions, or factors, also may be related to the event being explored. As a result, the summary measure to some extent reflects the compositional differences in the groups being studied.
Standardization historically has been a central aspect of demographic methods (Bogue 1969; Hinde 1998; Murdock and Ellis 1991; Shryock and Siegel 1980), but its importance extends beyond that use to a way of thinking about summary or aggregate measures. While offering the advantage of conciseness, aggregate measures mask underlying compositional differences, and the use of standardization represents an acknowledgment that population characteristics influence the rate at which events occur in a population. Summary indicators are very useful; they provide a single number for comparison rather than a whole series of numbers, and they are easily calculated. However, comparisons among population groups or among subgroups in a population should account for the differing compositional makeup of those groups. Demographers have been led to standardization for several reasons. First, there is a natural desire to make comparisons between groups along demographic indicators: crude death rates, crude birthrates, marriage rates, and employment, among others. Standardization allows these comparisons to reflect differences in the underlying processes, rather than being confounded by the effects of composition. Standardization procedures can accommodate the effects of a single factor or many factors, leaving the technique bounded only by the available data. Standardization also allows the estimation of indicators for groups for which data are incomplete or of poor quality.
Many demographic measures are affected by the composition of the population, particularly the age distribution. Age composition is especially critical in considering crude death rates, since mortality rates have a very distinctive age-specific pattern: high at very young and very old ages. Populations with a large proportion of persons in those age groups experience a large number of deaths, regardless of age-specific rates of mortality. Two populations with identical sets of age-specific rates of mortality but different age distributions will have different crude death rates. The removal of the "interference" of age distribution from the summary measure, the crude death rate, is the goal of the standardization procedure. In the rest of this article, the standardization procedure will be explained using mortality rates, and then several other examples of standardization will be presented.
The first step in a comparison is to calculate a crude rate or proportion. Crude rates or proportions are calculated by the formula

$$\text{Crude rate} = \frac{E}{P} \qquad (1)$$
where E refers to the number of events of interest in the population during the time period and P refers to the population during that period. If the population is measured at the middle of the year and the events occur throughout the year, this proportion can be interpreted as a rate. In cases where this proportion is small, for instance, mortality rates, the crude rate commonly is multiplied by 1,000 and reported as the number of events per 1,000 people.
Crude rates or proportions are used to represent a variety of characteristics of a population. These rates have an advantage over a comparison of absolute numbers, since they account for differences in size between two populations. Obviously, in a comparison of the annual number of homicides in Chicago versus that in Seattle, one must account for the fact that the population of Chicago is 2.8 million people compared to about one-half million in Seattle. Similarly, comparing the number of deaths in the United States (over 2 million) to those in Sweden (about 90,000) in 1994 would be unreasonable without knowing that the population of the United States is roughly thirty times that of Sweden.
Despite the advantage of crude rates over absolute numbers, crude rates are influenced by the composition of the populations being compared. If the event of interest varies by some factor and the two populations have varying levels of that factor, the crude rates will partly reflect this compositional variation rather than only a difference in the rate at which the event is occurring. If the populations being compared are standardized with respect to the factor, any remaining difference between the crude rates can be attributed to a true difference in rates of occurrence. If the difference in the crude rate disappears, one can conclude that the compositional variation, rather than a difference in the underlying rates of occurrence, led to the difference in the crude rate of events.
To understand the rationale of standardization, it is necessary to recognize that in essence, the crude rate is a weighted average of a set of factor-specific rates, where the weights are the distribution of the factor in the population. Thinking in this manner, one can rewrite the crude rate as

$$\text{Crude rate} = \sum_{a} \frac{e_a}{p_a} \cdot \frac{p_a}{P} \qquad (2)$$
where pa is the population in group a and ea is the number of events occurring in group a. The sum of all ea equals the total number of events, E, and the sum of all pa equals the total population, P. Note that this equation has two components. The first, ea/pa, represents the group-specific rate of events or the group-specific proportion, which sometimes is expressed as ma. The second component of the rate calculation, pa/P, represents the proportion of the population in each of the a groups. These are the two series of elements needed to apply the direct standardization technique. Using this notation, the crude rate can be rewritten as

$$\text{Crude rate} = \sum_{a} m_a \cdot \frac{p_a}{P} \qquad (3)$$
When the formula for the crude rate is written in this manner, it is easy to see how the composition of the population, that is, its distribution among the a groups, affects the crude rate. If the group-specific rate ma is high when the proportion of the population in that group, pa/P, is large, more events will be observed in the total population than will be observed if pa/P is small. Similarly, if ma is small when pa/P is small, few events will occur.
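The weighted-average reading of the crude rate can be checked numerically. The sketch below uses a made-up three-group population (the group labels and all counts are illustrative assumptions, not data from this article) and confirms that E/P equals the sum of group-specific rates weighted by population shares.

```python
# Hypothetical three-group population; every count here is invented for illustration.
events = {"young": 20, "middle": 30, "old": 150}                 # e_a
population = {"young": 40_000, "middle": 50_000, "old": 10_000}  # p_a

E = sum(events.values())      # total events
P = sum(population.values())  # total population

# Crude rate computed directly as E / P.
crude_rate = E / P

# The same quantity as a weighted average: group-specific rate m_a = e_a / p_a,
# weighted by the group's share of the population, p_a / P.
weighted = sum((events[a] / population[a]) * (population[a] / P) for a in events)

print(f"crude rate:       {1000 * crude_rate:.2f} per 1,000")
print(f"weighted average: {1000 * weighted:.2f} per 1,000")
```

In this invented population the "old" group holds only 10 percent of the people but contributes most of the events, which is the kind of compositional effect the text describes.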
A comparison of the crude death rates in Sweden and the United States provides an example of the use of standardization. Sweden has one of the world's highest life expectancies at birth, approximately 76 years for men and 81.4 years for women in 1994. The crude death rate of Sweden, however, was about 10.4 deaths per 1,000 in that year. In contrast, life expectancy at birth in the United States was 72.2 years for men and 78.8 years for women in 1993, and the crude death rate was about 8.6 deaths per 1,000 in that year (United Nations 1997). It seems natural to expect that the country with the longer life expectancy would also have the lower crude death rate, so what accounts for this discrepancy? To understand the reason for this difference in the crude rates, it is necessary to observe the differing age distributions of the two populations. In the United States about 13 percent of the population is over the age of 65, while in Sweden over 17 percent of people are over that age. Since death rates are highest in this age range, the larger proportion of the Swedish population in old age creates more deaths, even with lower age-specific death rates. Standardization demonstrates the extent to which these differences in age distribution account for the difference in the crude death rate.
As was mentioned above, direct standardization requires a standard population distribution and a set of factor-specific rates for the populations being studied. Direct standardization uses this standard population to calculate new standardized crude rates for the populations of interest. In this case, the population distribution of the standard population replaces the observed population distribution. Since each population's crude rate will be calculated with the same distribution, each population will in effect have the same composition, and the effect of the compositional differences will be eliminated. To apply direct standardization, the formula

$$\text{Directly standardized rate}_j = \sum_{a} \frac{e_{ja}}{p_{ja}} \cdot \frac{p_{sa}}{P_s} \qquad (4)$$
is used, where eja represents the number of events occurring in group a in population j, pja represents the population size of group a in population j, psa represents the number of people in group a in the standard population s, and Ps represents the standard population. Comparing equations (2) and (4) shows the similarities. The second term in equation (2), the compositional distribution of the population of interest, pa/P, has been replaced with the compositional distribution of the standard population, psa/Ps. The first term in the crude rate calculation remains the factor-specific rate in the population of interest, population j.
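As a minimal sketch of equation (4), the function below applies one population's group-specific rates to a standard composition. The function name, age bands, and all counts are hypothetical, chosen only so that the standard population is younger than population j.

```python
def direct_standardized_rate(events, population, standard_population):
    """Sum over groups of (e_ja / p_ja) * (p_sa / P_s)."""
    P_s = sum(standard_population.values())
    return sum(
        (events[a] / population[a]) * (standard_population[a] / P_s)
        for a in events
    )

# Hypothetical population j (an old age structure) and a younger standard.
events_j = {"0-14": 30, "15-64": 200, "65+": 900}
pop_j = {"0-14": 20_000, "15-64": 60_000, "65+": 20_000}
standard = {"0-14": 25_000, "15-64": 65_000, "65+": 10_000}

crude_j = sum(events_j.values()) / sum(pop_j.values())
dsr_j = direct_standardized_rate(events_j, pop_j, standard)

print(f"crude rate:        {1000 * crude_j:.2f} per 1,000")
print(f"standardized rate: {1000 * dsr_j:.2f} per 1,000")
```

Passing a population as its own standard returns its crude rate, which is why a study population chosen as the standard needs no adjustment.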
Returning to the example of the United States and Sweden, using the age distribution of the United States as the standard distribution and computing a standardized crude death rate for Sweden by applying the age-specific death rates of Sweden yields a standardized crude death rate of 7.6 deaths per 1,000 for Sweden. Instead of being higher than the crude death rate in the United States, Sweden's standardized crude death rate falls below that of the United States (8.6 per 1,000). At least part of the difference in the crude rates therefore is due to Sweden's older population rather than to a difference in age-specific death rates. In general, populations with a relatively old age distribution tend to have higher crude death rates than do younger populations with similar age-specific mortality patterns, since death rates are higher at older ages.
The data demands for direct standardization, while not overwhelming, can be difficult to meet if there is limited information on factor-specific rates in one of the populations of interest. For example, in many studies of mortality in less developed countries or in a historical perspective, information on age-specific death rates may be missing or unreliable. In these cases, an alternative method referred to as indirect standardization can be used. Indirect standardization requires knowledge only of the composition of the population and the total number of events of interest. Direct standardization involves the application of population-specific sets of rates to a standard population; conversely, indirect standardization involves the application of a standard set of rates to individual population distributions. In indirect standardization, a set of standard rates is applied to the population and the expected number of events is compared to the actual number. This standardizing ratio is estimated by the formula

$$\text{Standardizing ratio}_j = \frac{E_j}{\sum_{a} m_{sa}\, p_{ja}} \qquad (5)$$
where Ej is the actual number of events in the population j, msa is the factor-specific rate in the standard population s, and pja is the number of people in population j who are in group a. The denominator of the ratio calculates the number of events that would be expected in population j if the factor-specific rates of the standard population were applied to the population. When the event of interest is death, this ratio often is referred to as the standardized mortality ratio. To obtain the new indirectly standardized crude rate (ISR), this standardizing ratio is multiplied by the crude rate for the standard population:

$$\text{ISR}_j = \text{Standardizing ratio}_j \times \text{CR}_s \qquad (6)$$
where CRs is the crude rate in the standard population. These indirectly standardized crude rates then can be compared to each other. Obviously, when the standardizing ratio is greater than 1.0, the ISR will be larger than the crude rate for the standard population, and when the standardizing ratio is less than 1.0, the ISR will be smaller than the standard population's crude rate.
Indirect standardization does not control for composition as well as the direct standardization method does but should yield similar results in terms of direction and magnitude. Returning to the example of Sweden and the United States, the expected number of deaths in Sweden, obtained by applying U.S. age-specific death rates to the Swedish population's age distribution, would be greater than the number actually observed. The resulting standardized mortality ratio would be 0.912, and when that is multiplied by the crude rate for the United States, the ISR for Sweden would be 7.8, very similar to the result obtained through direct standardization.
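The indirect procedure can be sketched in a few lines. The function name, age bands, standard rates, and counts below are illustrative assumptions. Only the last two lines use figures quoted in the text (a ratio of 0.912 and a U.S. crude rate of 8.6 per 1,000).

```python
def indirect_standardization(observed_events, standard_rates, population,
                             standard_crude_rate):
    """Return (expected events, standardizing ratio, indirectly standardized rate)."""
    # Expected events if the standard's group-specific rates applied to population j.
    expected_events = sum(standard_rates[a] * population[a] for a in population)
    standardizing_ratio = observed_events / expected_events
    isr = standardizing_ratio * standard_crude_rate
    return expected_events, standardizing_ratio, isr

# Hypothetical inputs: only the total number of events is needed for population j.
standard_rates = {"0-14": 0.001, "15-64": 0.004, "65+": 0.050}  # m_sa
pop_j = {"0-14": 20_000, "15-64": 60_000, "65+": 20_000}        # p_ja
observed = 1_130                                                 # E_j
standard_crude = 0.0086                                          # CR_s

expected, ratio, isr = indirect_standardization(
    observed, standard_rates, pop_j, standard_crude
)
print(f"expected events: {expected:.0f}, ratio: {ratio:.3f}, "
      f"ISR: {1000 * isr:.2f} per 1,000")

# Arithmetic check on the figures quoted in the text for Sweden.
isr_sweden = 0.912 * 8.6
print(f"ISR for Sweden: {isr_sweden:.1f} per 1,000")
```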
When indirect standardization is employed, there is no choice to be made about the standard population; this method is used when only one population distribution is available. The choice of the standard population for direct standardization should be considered carefully, but within reasonable bounds the choice of standard should not alter the conclusions radically. Researchers generally are interested in the direction and approximate size of differences between the groups, and these values are preserved with the choice of any of a number of reasonable standard populations. There are three general choices for the standard: use one of the populations being studied, use an average of the populations, or use a population outside those being studied. Each of these choices has advantages and disadvantages. Theoretically, the choice of standard should be made to minimize the effects of that choice on the results.
Using one of the populations being studied eliminates the need to standardize that population and often makes the explication of comparisons easier. For instance, in comparing crime rates across several cities, choosing one city as the basis for comparison may be appropriate. When comparisons are made of a population over time, it is standard procedure to choose a distribution that is representative of the middle of the time period. For instance, in a study of mortality change between 1950 and 1990 in the United States, it would be appropriate to use the 1970 census for the standard age distribution. A drawback to using one of the study populations as the standard, however, can be that the population chosen has an unusual distribution of factors. This unusual distribution may skew the summary measures in a way that is inconsistent or difficult to interpret. Also, choosing one of the populations as a standard can carry implications that this distribution is the "ideal" or "correct" distribution and may place interpretational burdens on the results.
Using an average of the populations eliminates the problem of setting one population as the ideal and ameliorates the problem of unusual distributions. A comparison of racial differences in mortality in the United States, for example, might use the age distribution of the total U.S. population, which is an average of the distributions of the racial groups weighted by group size, as the standard. This choice eliminates the assumption that any one population has a preferred distribution and allows for meaningful comparisons among groups. The use of an aggregate population as the standard is encountered frequently in comparisons of subgroups within a national population.
A third choice is to pick a population completely exogenous to the study as a standard. This choice most often involves an artificial population that is representative of a standard pattern of factor distributions. Several sources of standard populations exist. In the case of age, Coale and Demeny's (1983) set of regional model life tables contains sets of age distributions typical of a variety of mortality levels and patterns. The use of an external standard eliminates any value judgments associated with the choice of standard. An external standard also can be chosen to minimize or eliminate extreme distributions of factors. The external standard also provides a way of comparing very diverse populations. Again, the choice of standard should match the populations being studied as closely as possible to minimize the effect of that choice on the results.
An exogenous standard also might be employed as a way to simulate the effects of a variety of changes in population composition on the crude rate. This use of the standardization technique highlights the underlying logic of the procedure by using the method to investigate the extent to which compositional changes influence aggregate comparisons. Here the technique is used as a methodological device to explore the effects of changes. For instance, a researcher might be interested in the effects on average wages of changing occupational structures among men and women. A testable hypothesis could be that as women approach men in terms of occupational distribution, the gender gap in wages will disappear. If a variety of simulated occupational structures are applied to a set of gender- and occupation-specific wage rates, the effect of occupational structure on the wage gap can be examined.
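The wage-gap simulation described above can be illustrated with a small sketch. The occupations, wages, and occupational shares are entirely hypothetical. The point is only the mechanics: hold the gender- and occupation-specific wages fixed and swap in a different occupational distribution.

```python
# Hypothetical occupation-specific mean wages by gender (illustrative values only).
wages_women = {"clerical": 30_000, "professional": 55_000, "manual": 28_000}
wages_men = {"clerical": 32_000, "professional": 60_000, "manual": 35_000}

# Hypothetical occupational distributions (shares sum to 1 within each gender).
share_women = {"clerical": 0.5, "professional": 0.3, "manual": 0.2}
share_men = {"clerical": 0.2, "professional": 0.4, "manual": 0.4}

def mean_wage(wages, shares):
    """Average wage as a weighted average of occupation-specific wages."""
    return sum(wages[occ] * shares[occ] for occ in wages)

observed_women = mean_wage(wages_women, share_women)
observed_men = mean_wage(wages_men, share_men)

# Simulation: women keep their own wages but take on men's occupational structure.
simulated_women = mean_wage(wages_women, share_men)

observed_gap = observed_men - observed_women
simulated_gap = observed_men - simulated_women
print(f"observed gap:  {observed_gap:,.0f}")
print(f"simulated gap: {simulated_gap:,.0f}")
```

In this invented example the gap shrinks but does not vanish, because the occupation-specific wages themselves differ by gender. Under these assumptions the hypothesis in the text would be only partly supported.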
Since standardization developed in the field of demography, most applications involve the study of demographic phenomena. The example of the United States and Sweden involved comparisons of mortality rates. However, standardization is used widely in other areas as well. For example, the U.S. Census Bureau routinely reports the distribution of the American population aged 15 and older among marital states, and historical comparisons of this distribution are used to examine changes in marital behavior over time. However, the age composition of the population can greatly influence the distribution among marital states, particularly when the proportion of the population in the age range of 15 to 25 years is very large. In 1960, 65.6 percent of women aged 15 and older were married compared to 60.4 percent of similarly aged women in 1975 (United States Bureau of the Census 1976). At first glance, these comparisons seem to signal a retreat from marriage: A smaller proportion of women was married in 1975 than in 1960. However, when the age distribution of the population is standardized to the 1960 population, the proportion married in 1975 increases to 63.5 percent. While this is still a decline compared to 1960, the magnitude of the change is much smaller. The difference in the proportion married is due largely to a difference between 1960 and 1975 in the proportion of women just over the age of 15: the baby boomers, who were teenage women who had not yet married.
Standardization can be used to control for characteristics other than age. Suppose, for instance, one is comparing the health status of two different groups: elderly white Americans and elderly African-Americans. If we compare the proportion of each group in poor health, we find that 34 percent of elderly whites and 50 percent of elderly African-Americans report their health as fair or poor. However, we know that health status varies by education and that the educational distributions of these two groups differ. Among elderly whites, about 12 percent have fewer than eight years of school, compared to 39 percent of elderly African-Americans. Clearly, since lower levels of education are associated with poorer health and elderly African-Americans have lower levels of educational attainment, some of the difference in observed health status between the groups can be expected to result from the different educational compositions.
It is desirable to compare these two groups without the influence of education. Using the educational distribution of the elderly white population as a standard and applying the observed education-specific rates of poor health among elderly African-Americans, one obtains an overall proportion of 42 percent in poor health, compared to the unstandardized proportion of 50 percent. Thus, if the African-American older population had an educational distribution similar to that of the more highly educated white elderly population, the expected health status of older African-Americans would improve.
Lichter and Eggebeen (1994) used standardization techniques to examine the effects of parental employment on rates of child poverty. In their work these researchers use direct standardization techniques in two different ways. In the first, they simulate the effects of a variety of assumptions about parental employment patterns on children's poverty rates. This is an illustration of using an "exogenous" or artificial population distribution as a standard population. By changing the employment distribution of the parents of children in poverty, they determine that only modest declines in child poverty would result from increasing those levels of employment. Their second application of standardization compares the poverty rates of black children obtained by using the employment distribution of white parents as the standard to the rates directly observed. In this case, they have chosen one of the study populations as the standard and are interested in the extent to which differences in child poverty between blacks and whites are determined by factors other than parental employment distributions. They find in fact that parental employment differences among female-headed families account for a substantial portion of the observed differences in child poverty.
Standardization can control for more than one factor at a time and can be applied to more than two groups. Himes et al. (1996) standardize for age, sex, and marital status in an examination of the living arrangements of minority elderly in the United States. Living arrangements are known to be different for men and women, for married and unmarried, and for younger and older elderly. These factors (age, sex, and marital status) also are known to vary across racial and ethnic subgroups. Therefore, the observed differences in living arrangements are likely to be due in part to these underlying characteristics rather than being a reflection of differences in attitudes or beliefs. Standardization allows a comparison among groups without the influence of these compositional differences. In this research, the compositional distribution of the entire United States with respect to age, sex, and marital status was chosen as the standard. In this analysis, the standardization procedure had the greatest effect on comparisons of the African-American population and much smaller effects on the non-Hispanic white, Hispanic, Asian, and Native American populations.
Standardization is widely used in a variety of sociological inquiries. While it originated in demographic analyses, it can be applied to a variety of questions in which a researcher wants to determine the extent to which compositional differences in population groups account for observed differences in summary measures. Standardization is also useful as a simulation technique, allowing researchers to explore the effects of a variety of compositional changes on a summary indicator. Researchers should bear in mind, however, that the results of standardization are merely artificially constructed indicators; they do not represent a real population or circumstance.
Bogue, Donald J. 1969 Principles of Demography. New York: Wiley.
Coale, Ansley J., and Paul Demeny 1983 Regional Model Life Tables and Stable Populations, 2nd ed. New York: Academic Press.
Himes, Christine L., Dennis P. Hogan, and David J. Eggebeen 1996 "Living Arrangements of Minority Elders." Journal of Gerontology: Social Sciences 51B:S42–S48.
Hinde, Andrew 1998 Demographic Methods. New York: Oxford University Press.
Lichter, Daniel T., and David J. Eggebeen 1994 "The Effect of Parental Employment on Child Poverty." Journal of Marriage and the Family 56:633–645.
Murdock, Steve H., and David R. Ellis 1991 Applied Demography: An Introduction to Basic Concepts, Methods, and Data. Boulder, Colo.: Westview Press.
Shryock, Henry S., and Jacob S. Siegel 1980 The Methods and Materials of Demography, 4th printing (rev.). Washington, D.C.: U.S. Government Printing Office.
United Nations 1997 Demographic Yearbook 1995. New York: United Nations.
United States Bureau of the Census 1976 Social Indicators 1976. Washington, D.C.: U.S. Government Printing Office.
Christine L. Himes
Standardization (of Rates)
Standardization (or adjustment) of rates is used to enable the valid comparison of groups (e.g., those studied in different places or times) that differ regarding an important health determinant (most commonly age). Although often presented in epidemiologic textbooks as a separate technique, it is in fact a specific application of the general methods to control for confounding factors. As such, many of the issues related to confounding and methods used to adjust for confounding can be applied to standardization. Historically, the need for age standardization was recognized well before the general concept of confounding was formalized. It has its roots in the earliest epidemiological studies; the first known reference to age standardization appeared in a publication by F. G. P. Neison in 1844. The most familiar application is in the presentation of age-standardized mortality or cancer incidence rates to explore temporal trends.
Two major approaches to standardization have been used: direct and indirect. Direct standardization is used when the study population is large enough that age-specific rates within the population are stable. When the population is small (or the outcome is rare), the number of events observed can be small, making age-specific rates unstable. In that circumstance, indirect standardization methods can be used to produce a standardized mortality ratio (SMR) or a standardized incidence ratio (SIR).
Direct standardization is commonly used in reports of vital statistics (e.g., mortality) or disease incidence trends (e.g., cancer incidence). Indirect standardization has played a major role in studies of occupational disease or studies of place- and time-limited environmental catastrophes. Indirect standardization was introduced as a tool before direct standardization (1844 vs. 1899).
The standard approach to explaining standardization involves the concepts of "expected" and "observed" counts. In direct standardization, one estimates the rate that would have been observed if the study population had had the same age structure as the reference group (e.g., the number of cases of disease that would be expected if the disease rates in the study population were applied to the reference population). In indirect standardization, one computes the number of cases of disease that would have been expected if the disease rates from the reference population had applied in the study population. Dividing the observed case count by the expected count yields the SMR. A more modern approach to standardization recognizes that these methods are computing weighted averages of the age-specific rates.

[Table 1. Direct Age Standardization. Columns: Age group; Number of cases; Number of months; Mortality rate; Reference population; Expected number of deaths. Source: Courtesy of author.]
To perform a direct age standardization, one first has to select a reference population. This population is arbitrary, although conventionally one uses either the World Standard Population produced by the World Health Organization, or a census population count for the country in which the work is being conducted. Next, one computes the age-specific rates within the study group. Then, one multiplies these rates by the number of people in that age group in the reference population. These expected counts are summed and divided by the total population size of the reference population to yield the directly standardized rate. This is illustrated in the example shown in Table 1. The crude mortality rate is 63/1,000. Standardizing to the reference population gives an age-adjusted mortality rate of 79,540/1,800,000 = 44/1,000. The adjusted rate is lower than the crude rate because the proportion of the reference population in the oldest age group (11%), which has the highest age-specific mortality rate, is only 50 percent of that found in the study population (22%). This adjusted rate can be directly compared to adjusted rates from other years to detect trends in mortality.

[Table 2. Indirect Age Standardization. Columns: Age group; Number of cases; Number of deaths; Mortality rate in reference population; Expected number of deaths. Source: Courtesy of author.]
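The direct age standardization steps described above (age-specific rates, expected counts in the reference population, sum, divide) can be sketched as follows. The age bands and counts are hypothetical stand-ins, since the table's data rows are not reproduced here. Only the final check uses the totals quoted in the text.

```python
# Hypothetical study group: deaths and persons at risk by age band (invented values).
study = {
    "<40":   {"deaths": 10, "persons": 1_000},
    "40-64": {"deaths": 30, "persons": 1_000},
    "65+":   {"deaths": 60, "persons": 500},
}
# Hypothetical reference population counts by the same age bands.
reference = {"<40": 500_000, "40-64": 400_000, "65+": 100_000}

# Steps 1 and 2: age-specific rate in the study group times the reference count.
expected = {
    age: (row["deaths"] / row["persons"]) * reference[age]
    for age, row in study.items()
}

# Step 3: sum the expected counts and divide by the total reference population.
adjusted_rate = sum(expected.values()) / sum(reference.values())

crude_rate = (sum(row["deaths"] for row in study.values())
              / sum(row["persons"] for row in study.values()))

print(f"crude:    {1000 * crude_rate:.0f} per 1,000")
print(f"adjusted: {1000 * adjusted_rate:.0f} per 1,000")

# Check on the totals quoted in the text: 79,540 expected deaths in 1,800,000 people.
quoted_adjusted = 79_540 / 1_800_000
print(f"quoted example: {1000 * quoted_adjusted:.0f} per 1,000")
```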
Indirect standardization uses the reference population to provide age-specific rates. Within each age stratum, one multiplies the reference rate by the number of people in the study population to determine the number of cases that would have been expected if that were the rate in the study group. These expected numbers are added up across all age groups and divided into the observed number to yield the SMR. Values greater than 1 (or 100, as the SMR is commonly expressed multiplied by 100) indicate a higher mortality than expected. It is possible to compute an indirectly standardized rate, but this is much less common than reporting SMRs or SIRs. Unlike directly standardized rates, SMRs cannot be compared across time or place, which is a significant limitation on their use. One can, however, compare SMRs for different outcomes within the same study population. In the example given in Table 2, the researcher observed 458 deaths. However, based on the age-specific rates in the reference population, only 221 deaths would have been expected, yielding an SMR of 2.07 (or 207), suggesting a higher mortality rate in the study population than in the reference population.
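The SMR arithmetic in this example is small enough to verify directly. The snippet below uses only the observed and expected totals quoted in the text. The variable names are illustrative.

```python
observed_deaths = 458  # deaths observed in the study population (from the text)
expected_deaths = 221  # deaths expected under reference age-specific rates (from the text)

# SMR as a plain ratio and in the common "multiplied by 100" form.
smr = observed_deaths / expected_deaths
smr_percent = 100 * smr

print(f"SMR = {smr:.2f} (or {smr_percent:.0f})")
```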
The use of standardized rates is controversial. Any summary measure can hide patterns that might have important public health implications. For example, with age standardization, one might fail to detect age-specific differences in risk across time or place. This might arise if a disease is displaying an increasing incidence due to a birth cohort effect (people at younger ages might have a higher risk in recent years compared to previous years, while older people could have the opposite pattern). An age-standardized rate could hide these trends. Despite this risk, standardized rates have been found to provide useful summary measures, especially when outcomes are rare and specific rates display wide random variability.
One of the biggest potential abuses of standardized rates is by health care planners who use the standardized rates to estimate demand for services. This is incorrect practice. The standardized rate reflects the number of new cases that would arise in a hypothetical population. The actual number of cases expected is given by the crude rate, which should always be employed in health care planning analyses.
(See also: Rates; Rates: Adjusted; Rates: Age-Adjusted; Rates: Age-Specific)
stand·ard·ize / ˈstandərˌdīz/ • v. [tr.] cause (something) to conform to a standard: Jones's effort to standardize oriental spelling. ∎ [intr.] (standardize on) adopt (something) as one's standard: we could standardize on U.S. equipment. ∎ determine the properties of by comparison with a standard. DERIVATIVES: stand·ard·iz·a·ble adj. stand·ard·i·za·tion / ˌstandərdiˈzāshən/ n. stand·ard·iz·er n.
1. The establishment of an international, national, or industrial agreement concerning the specification or production of components – electrical, electronic, or software – or equipment in general, or of procedures for the use or testing of equipment or software.
2. The act of committing an organization to use specific standards to meet particular needs whenever they arise within the organization. Typically an organization might standardize upon use of a specific compiler for some language, some specific application package, or a particular database management system.