Small-Area Analysis


Demand for demographic data and analysis referring to localities and similar small areas has grown rapidly in recent decades, spurred by new business applications and government programs. To meet that demand, analysts have drawn on an expanding set of data sources, statistical techniques, and computer applications. The result has been improved data quality across a broad spectrum of variables and geographic areas, enhancing both the usefulness and the importance of small-area analyses. Analysts using small-area data include demographers, sociologists, geographers, economists, marketers, epidemiologists, planners, and others.

Analysts define small areas in several different ways. Under one definition, they are states and other subnational areas for which samples from national surveys are too small to provide meaningful estimates. More typically, small areas refer to counties and subcounty areas like cities, census tracts, postal code areas, and individual blocks. Small areas may range from less than an acre to thousands of square miles, and from no inhabitants to many millions.

This article reviews commonly used data sources and application techniques and discusses several distinctive features of small-area demographic analysis. Although reference is primarily to the United States, many of the issues discussed transcend national boundaries.

Data Sources

Censuses, administrative records, and sample surveys are the major sources of data for small-area demographic analyses. In most countries, censuses constitute the most comprehensive source of small-area data, typically at five- or ten-year intervals. They cover a variety of population characteristics (such as age, sex, marital status, and education) and housing information (such as number of dwelling units, occupancy rates, household size, and housing value or monthly rent).

Administrative records kept by national, state, and local governments often provide small-area data for years between censuses. These records contain information on variables such as births, deaths, school enrollments, social insurance, building permits, drivers' licenses, and voter registration–each reflecting a facet of population structure and change that may be useful for constructing estimates and projections or for tracking demographic trends. Most industrialized countries maintain relatively accurate records of this kind; some European countries even produce census-type statistics based solely on administrative records. In many developing countries, however, administrative records are seriously incomplete.

Sample surveys are another potential source of demographic and socioeconomic data, provided that the samples are large enough to yield reliable estimates for small areas. A notable example is the American Community Survey, which is expected to cover some three million U.S. households annually by 2003. This survey will eventually generate estimates down to the block group level for the entire nation.

Estimates and Projections

Small-area estimates of total population generally rely on housing unit, component, or regression methods. The housing unit method derives population estimates from calculations of occupied housing units (i.e., households) and average household size, plus the number of persons living in group quarters facilities such as college dormitories, military barracks, and prisons. Component methods derive population estimates from birth, death, and migration data (births and deaths from vital statistics records, migration from changes in school enrollments or other indicators of population mobility). Regression methods derive population estimates from symptomatic indicators of population change–such as births, school enrollments, electric utility customers, registered voters, drivers' licenses, and tax returns–in a multivariate model. All three methods produce useful estimates, but the housing unit method is the most commonly used for small-area estimates because it is relatively easy to apply and the requisite data are widely available.
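The arithmetic behind the housing unit and component methods can be sketched in a few lines. The example below is only an illustration: the function names, parameter names, and input figures are hypothetical, and an applied estimate would take each input from building permits, utility records, vital statistics, or similar sources.

```python
def housing_unit_estimate(housing_units, occupancy_rate,
                          persons_per_household, group_quarters_pop):
    """Housing unit method: households times average household size,
    plus the population living in group quarters."""
    households = housing_units * occupancy_rate  # occupied housing units
    return households * persons_per_household + group_quarters_pop


def component_estimate(base_pop, births, deaths, net_migration):
    """Component method: base population plus births, minus deaths,
    plus net migration over the estimation period."""
    return base_pop + births - deaths + net_migration


# Hypothetical inputs for a single small area.
hu_estimate = housing_unit_estimate(
    housing_units=12_000,
    occupancy_rate=0.92,
    persons_per_household=2.5,
    group_quarters_pop=800,
)
comp_estimate = component_estimate(
    base_pop=10_000, births=150, deaths=90, net_migration=40
)
print(round(hu_estimate), comp_estimate)
```

In this sketch the two methods share no inputs, which is one reason analysts can compare their results as a rough check on each other.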

Estimates of demographic characteristics such as age, sex, and ethnicity are typically based on the cohort-component method. Here, birth, death, and migration rates are applied separately to each age, sex, and ethnic subgroup in the population. Estimates of socioeconomic characteristics such as income, employment, and education are often based on imputation techniques, whereby known proportions of the population exhibiting a characteristic in a larger area (e.g., a state) are applied to population estimates for smaller areas (e.g., cities, counties). Typically, these proportions are calculated separately for different subgroups of the population. Estimates of demographic and socioeconomic characteristics can also be based on administrative records (e.g., Medicare data).
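The imputation approach described above amounts to applying larger-area proportions, subgroup by subgroup, to small-area population counts. The following minimal sketch assumes age groups as the subgroups; the group labels, rates, and counts are invented for illustration and the function name is hypothetical.

```python
def imputed_count(small_area_pop_by_group, large_area_rate_by_group):
    """Apply larger-area proportions to small-area subgroup populations
    and sum the results across subgroups."""
    return sum(
        pop * large_area_rate_by_group[group]
        for group, pop in small_area_pop_by_group.items()
    )


# Hypothetical county population by age group.
county_pop = {"18-34": 4_000, "35-64": 5_000, "65+": 1_000}

# Hypothetical statewide share of each age group holding a college degree.
state_rate = {"18-34": 0.30, "35-64": 0.35, "65+": 0.20}

college_graduates = imputed_count(county_pop, state_rate)
print(round(college_graduates))
```

Calculating the proportions separately by subgroup lets the estimate reflect the small area's own age structure, even though the rates themselves come from the larger area.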

Population projection methods used for small areas are mainly of three kinds. Trend extrapolation methods extend observed historical trends. These methods may be simple, such as projecting past growth rates to remain unchanged, or complex, as in ARIMA time series models. These methods are frequently used for small-area projections because the data requirements are small, they are easy to apply, and their forecasts have often proven to be reasonably accurate.
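Two of the simplest trend extrapolation techniques, linear and geometric, can be illustrated as follows. This is a sketch under stated assumptions: the function names and population figures are hypothetical, and both functions assume that the change observed between two past dates continues unchanged.

```python
def linear_projection(pop_start, pop_end, years_between, years_ahead):
    """Extend the average annual numerical change observed in the base period."""
    annual_change = (pop_end - pop_start) / years_between
    return pop_end + annual_change * years_ahead


def geometric_projection(pop_start, pop_end, years_between, years_ahead):
    """Extend the average annual growth rate observed in the base period."""
    annual_rate = (pop_end / pop_start) ** (1 / years_between) - 1
    return pop_end * (1 + annual_rate) ** years_ahead


# Hypothetical counts ten years apart, projected ten years forward.
linear = linear_projection(10_000, 12_100, years_between=10, years_ahead=10)
geometric = geometric_projection(10_000, 12_100, years_between=10, years_ahead=10)
print(round(linear), round(geometric))
```

The geometric version compounds growth and therefore yields the higher figure for a growing area; over long horizons the gap between the two can become large.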

The cohort-component method accounts separately for the three components of population change–births, deaths, and migration. Projections of each component can be based on the extrapolation of past trends, projected trends in other areas, structural models, or professional judgment. Simplified versions of the method such as those described by Hamilton and Perry (1962) can also be applied. The cohort-component method is the most frequently used projection method because it can accommodate a broad range of data sources, assumptions, and application techniques, and can provide projections of demographic characteristics as well as total population.
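A simplified cohort approach of the kind associated with Hamilton and Perry can be sketched using cohort change ratios computed from two counts taken ten years apart. The example below is an illustration rather than a reproduction of the published procedure: it assumes ten-year age groups with an open-ended final group, ignores sex, and projects the youngest group with a ratio of children to adults aged 20 to 39, which stands in here for the child-woman ratio. All names and figures are hypothetical.

```python
def cohort_change_projection(base, launch):
    """Project population by ten-year age group ten years past `launch`.

    `base` and `launch` are lists of counts by age group (0-9, 10-19, ...,
    open-ended last group) taken ten years apart.
    """
    n = len(launch)
    if n != len(base) or n < 5:
        raise ValueError("need matching lists with at least five age groups")

    projected = [0.0] * n

    # Closed age groups: apply each cohort's observed ten-year change ratio.
    for i in range(1, n - 1):
        ratio = launch[i] / base[i - 1]
        projected[i] = launch[i - 1] * ratio

    # Open-ended group: survivors of the two oldest groups combined.
    open_ratio = launch[-1] / (base[-2] + base[-1])
    projected[-1] = (launch[-2] + launch[-1]) * open_ratio

    # Youngest group: hold the ratio of children to adults aged 20-39 constant
    # (an assumption standing in for a child-woman ratio).
    child_ratio = launch[0] / (launch[2] + launch[3])
    projected[0] = child_ratio * (projected[2] + projected[3])

    return projected


# Hypothetical counts for ages 0-9, 10-19, 20-29, 30-39, 40-49, and 50+.
base_counts = [1_000, 900, 800, 700, 600, 500]
launch_counts = [1_100, 1_050, 850, 780, 650, 700]
print([round(x) for x in cohort_change_projection(base_counts, launch_counts)])
```

Because each ratio combines mortality and migration, the approach needs only two age distributions, which is why it suits small areas where separate component data are unavailable.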

Structural models are based on an entirely different logic from the other projection methods. They relate the projected population to variables known to drive population change, like comparative wages, employment, and land use. Some structural models involve only a single equation and a few variables; others contain many equations, variables, and parameters. They are often used in combination with the cohort-component method. Although they require highly detailed data and substantial investments of effort and modeling skill, structural models provide a broader range of projections than the other methods.

Uses of Small-Area Analysis

To increase knowledge. Small-area data shed light on socioeconomic and demographic variations across states, counties, cities, census tracts, and other geographic areas, enlarging the knowledge available to scholars, policy makers, and other analysts. Empirical researchers have applied small-area analysis to many areas of inquiry. For example, Kathleen M. Day (1992) investigated how differences in government tax and expenditure policies affected inter-provincial migration in Canada; Rene J. Borroto and Ramon Martinez-Piedra (2000) studied how poverty, urbanization, and geographic location affected the incidence of cholera among regions in Mexico; and Patricia E. Beeson, David N. DeJong, and Werner Troesken (2001) studied how differences in industrial, educational, geographic, and demographic characteristics affected county population growth rates in the United States.

To inform public policy. Small-area analysis supports decision-making by national, state, and local government agencies. Small-area data are indispensable for drawing administrative and electoral boundaries, allocating government funds, siting public facilities, developing program budgets, determining eligibility for public programs, and monitoring program effectiveness. For example, Tayman, Parrott, and Carnevale (1997) used block-level population and household projections to choose sites for fire stations; Gould and colleagues (1998) used birth data by postal code area to identify areas in need of adolescent pregnancy prevention programs; and Hashimoto, Murakami, Taniguchi, and Nagai (2000) developed techniques for monitoring infectious diseases by health district.

To support business decision making. Small-area data figure prominently in many types of business decisions–including site selection, sales forecasting, consumer profiles, litigation support, target marketing, and labor force analysis–analogous to their use in public policy. For example, small-area data were used by Morrison and Abrahamse (1996) to select locations for supermarkets; by Thomas (1997) to project the demand for a hospital's obstetrical services; and by Murdock and Hamm (1997) to produce population estimates and projections to support a company's bank loan application.

Problems of Small-Area Analysis

Several distinctive problems of small-area analysis require attention. First, unlike those of most larger administrative units, the geographic boundaries of many small areas change over time. Cities annex adjoining areas, census tracts get subdivided, postal code areas are reconfigured, service areas are redefined, and new statistical areas are established. Such changes undermine the consistency of historical data series.

Second, many types of data are not tabulated for some small areas of interest, such as census tracts, school districts, market sales territories, and traffic analysis zones. Consequently, analyses routinely performed for larger areas may be impossible for small areas or feasible only using proxy variables.

Third, even the best censuses and administrative record systems contain errors. The effects of data errors are typically greater for small areas than large areas, where errors are often mutually offsetting. In addition, survey data are generally less reliable for small areas than large areas because sample sizes are smaller and survey responses more variable.
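The link between sample size and reliability can be made concrete with the standard error of an estimated proportion. The sketch below assumes simple random sampling, which real surveys rarely use in pure form, and the sample sizes and proportion are hypothetical.

```python
import math


def standard_error(proportion, sample_size):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(proportion * (1 - proportion) / sample_size)


# The same estimated proportion (20 percent) from two hypothetical samples.
se_tract = standard_error(0.20, 100)      # a census tract with 100 respondents
se_state = standard_error(0.20, 10_000)   # a state with 10,000 respondents
print(se_tract, se_state)
```

Under these assumptions, a sample one hundred times smaller produces a standard error ten times larger, because the error shrinks only with the square root of the sample size.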

Finally, trends at the small-area level are more likely to be disrupted by idiosyncratic factors–such as the opening or closing of a prison or military base, the construction of a large housing development, the opening of a new road or railway, and the addition or loss of a major employer–than trends at the larger-area level. The effect of growth constraints like zoning restrictions and seasonal populations such as migrant workers is also likely to be greater for small areas than large areas. Factors like these often distort small-area trends.


Several developments have broadened the scope and improved the quality of small-area analyses around the turn of the twenty-first century. A wider variety of small-area data has become available in many countries, primarily through administrative records and sample surveys. The Internet has greatly enhanced access to these data, and the rapid growth of computing power, data storage capacity, and software applications has expanded their potential usefulness. Geographic information systems (GIS) technology has facilitated the collection, organization, manipulation, analysis, and presentation of geographically referenced data. These developments have prompted many new uses of small-area data, at ever finer levels of detail.

Concerns about privacy and confidentiality, however, pose a formidable barrier to the continued advancement of small-area analysis. To many citizens, the collection of personal information, whether by businesses or government, is an invasion of privacy. Confidentiality is potentially at risk when personal data are shared among public and private agencies. Such concerns have caused many government statistical offices to curtail the release of demographic data, and the use of administrative records has been restricted in the United States, Germany, the United Kingdom, and elsewhere. Devising acceptable ways to utilize information while preserving privacy and confidentiality is a major challenge for small-area analysts.

See also: Business Demography; Census; Geographic Information Systems; Projections and Forecasts, Population; State and Local Government Demography.


Beeson, Patricia E., David N. DeJong, and Werner Troesken. 2001. "Population Growth in U.S. Counties, 1840–1990." Regional Science and Urban Economics 31: 669–699.

Borroto, Rene J., and Ramon Martinez-Piedra. 2000. "Geographical Patterns of Cholera in Mexico, 1991–1996." International Journal of Epidemiology 29: 764–772.

Cleland, John. 1996. "Demographic Data Collection in Less Developed Countries 1946–1996." Population Studies 50: 433–450.

Day, Kathleen M. 1992. "Interprovincial Migration and Local Public Goods." Canadian Journal of Economics 25: 123–144.

Gould, Jeffrey B., Beate Herrchen, Tanya Pham, Stephan Bera, and Claire Brindis. 1998. "Small-Area Analysis: Targeting High-Risk Areas for Adolescent Pregnancy Prevention Programs." Family Planning Perspectives 30: 173–176.

Hamilton, C. Horace, and Josef Perry. 1962. "A Short Method for Projecting Population by Age from One Decennial Census to Another." Social Forces 41: 163–170.

Hashimoto, Shuji, Yoshitaka Murakami, Kiyosu Taniguchi, and Masaki Nagai. 2000. "Detection of Epidemics in their Early Stage through Infectious Disease Surveillance." International Journal of Epidemiology 29: 905–910.

Morrison, Peter A., and Allan F. Abrahamse. 1996. "Applying Demographic Analysis to Store Site Selection." Population Research and Policy Review 15: 479–489.

Murdock, Steven H., and Rita R. Hamm. 1997. "A Demographic Analysis of the Market for a Long-Term Care Facility: A Case Study in Applied Demography." In Demographics: A Casebook for Business and Government, ed. Hallie J. Kintner, Thomas W. Merrick, Peter A. Morrison, and Paul R. Voss. Santa Monica, CA: Rand.

Siegel, Jacob S. 2002. Applied Demography: Applications to Business, Government, Law and Public Policy. San Diego, CA: Academic Press.

Smith, Stanley K., Jeff Tayman, and David A. Swanson. 2001. State and Local Population Projections: Methodology and Analysis. New York: Kluwer Academic/Plenum Publishers.

Tayman, Jeff, Bob Parrott, and Sue Carnevale. 1997. "Locating Fire Station Sites: The Response Time Component." In Demographics: A Casebook for Business and Government, ed. Hallie J. Kintner, Thomas W. Merrick, Peter A. Morrison, and Paul R. Voss. Santa Monica, CA: Rand.

Thomas, Richard K. 1997. "Using Demographic Analysis in Health Services Planning: A Case Study in Obstetrical Services." In Demographics: A Casebook for Business and Government, ed. Hallie J. Kintner, Thomas W. Merrick, Peter A. Morrison, and Paul R. Voss. Santa Monica, CA: Rand.

Stanley K. Smith