Demographic Surveys, History and Methodology of
Demographic surveys are surveys that wholly or primarily collect information on population characteristics and on the causes and consequences of population change. The term is also applied to surveys that contain mostly demographic information along with some information of a non-demographic nature.
Historical Overview of Population Surveys
Population censuses attempt to measure characteristics of the total population of a country or territory through the full enumeration of all persons and relevant events. Surveys have emerged as alternatives to census taking with the development of statistical sampling techniques that permit interviewing only a part of the population of interest to obtain estimates that are valid for the population as a whole.
Population surveys have a long history, including the 1086 Domesday survey in England. This survey, as well as most other early surveys, was a social survey dealing with living conditions and poverty. Many of these studies were carried out in the eighteenth and nineteenth centuries, but none was based on true probability sampling methods. The first study that employed probabilistic sampling was a 1913 study by A. L. Bowley on the living conditions of the working classes in five English cities. Survey research in the demographic field only came into wide usage in the mid-1900s.
Demographic surveys are often taken in conjunction with a census. This was done for the first time in 1940, in the United States. The items covered in the census were significantly increased for 5 percent of the census population, making it possible to collect extensive additional information without increasing the burden on all census respondents and at relatively small additional cost.
One of the first demographic surveys was conducted by Raymond Pearl in 1939, covering 31,000 women in American hospitals. Other early U.S. demographic surveys include the Current Population Survey (CPS) carried out monthly by the Bureau of the Census since 1940; the 1941 Indianapolis study by Pascal Whelpton and Clyde Kiser; the 1960 Growth of American Families Study by Whelpton, Arthur Campbell, and John Patterson; and the 1965 and 1970 National Fertility Surveys carried out by Charles F. Westoff and Norman B. Ryder of Princeton University. The National Center for Health Statistics (NCHS) carried out six rounds of the National Survey of Family Growth (NSFG) between 1973 and 2002.
The CPS focuses on employment, unemployment, and economic activity, but additional questions are added from time to time to obtain information on other population characteristics. One of its advantages is its large sample size: 50,000 households. The data from the CPS serve to update information on the U.S. population between the decennial censuses. Annual demographic data files are available from this source. The other early surveys mentioned above were designed to provide information specifically related to fertility, family planning, and family formation. They sampled women in the fertile age group, with sample sizes below 10,000.
The NCHS undertakes a number of health-related survey activities that provide significant demographic information, such as the National Health and Nutrition Examination Survey, which has been carried out eight times since 1960. The round that began in 1999 has been converted into a continuous survey in which 5,000 people are surveyed annually in 15 locations in the United States.
Most developed countries have survey activities similar to those in the United States. Periodic labor force surveys are a major source of demographic information. Special demographic surveys have been rarer. The 1946 survey on fertility in Britain by David Glass and Eugene Grebenik was a forerunner of the fertility surveys that were carried out in the 1960s in Belgium, Canada, Greece, Hungary, The Netherlands, the United Kingdom, and the Soviet Union. In the 1970s similar surveys were conducted in 15 European countries as an offshoot of the World Fertility Survey (WFS) program, which operated from 1973 to 1984 but was mainly focused on developing countries. A further round of fertility surveys, the Fertility and Family Surveys in Countries of the Economic Commission for Europe Region, was carried out in the 1990s in about twenty countries under the sponsorship of the United Nations Population Fund (UNFPA).
In developing countries, the main sources of demographic information, aside from population censuses, are labor force and economic surveys, and surveys on population and health. Among the latter, the Puerto Rico studies on family planning by Paul K. Hatt in 1947 and Reuben J. Hill, Mayone Stycos, and Kurt W. Back in 1959 were some of the earliest. In India, the 1952 Mysore study was groundbreaking. In the 1960s more than 125 fertility and related surveys were carried out in the developing world, a majority in Africa. Special demographic surveys have most often been achieved through participation in international survey programs like the WFS. The ongoing Demographic and Health Surveys (DHS) program funded by the United States Agency for International Development (USAID) sponsored over 150 surveys in the period from 1984 to 2001. Among other international programs that have contributed significantly to the availability of demographic survey data in developing countries are the World Bank–sponsored Living Standards Measurement Surveys (LSMS) program, which has carried out over 30 complex surveys since 1985; the UNICEF-sponsored Multiple Indicator Cluster Survey (MICS) program, with over 120 surveys since 1995; the USAID-sponsored surveys of the Centers for Disease Control and Prevention (CDC), in operation since 1985, with over 40 surveys; the Contraceptive Prevalence Surveys (CPS), also sponsored by USAID, which carried out 39 surveys over 1976–1984; and numerous smaller survey efforts.
Survey Design
There is a basic distinction between surveys that are planned to provide a snapshot of the population under study at the time of the survey and those planned to provide repeated information on the same sample populations. The former are usually called single-round surveys; the latter are called panel or longitudinal surveys. A longitudinal survey can measure changes in the population with greater precision than could be achieved by drawing on retrospective information collected in single-round surveys (given the likelihood of recall error by respondents) or by comparing the results from two surveys that are based on independent samples. The effect of programmatic interventions in the period between surveys can also be measured more easily.
These advantages of the longitudinal design are balanced by a number of important disadvantages. Longitudinal surveys are generally more costly; the sample population is affected by death and migration; and the respondents may suffer from respondent fatigue if interviewed too frequently. In developing countries an added problem is locating the exact households to be revisited, given the absence of good addresses and the inaccessibility of some sample areas. A particular example of a longitudinal survey is the demographic surveillance system (DSS). These systems reinterview the residents of a small and specific geographic area on a regular schedule. Interviews can happen as often as once every two weeks, as in the Matlab area of Bangladesh. The DSS design is ideal for studying change in a population. The major drawback is that the survey area is typically not representative of the population of the country.
Some of the problems of longitudinal surveys can be overcome in a hybrid design that combines a single-round and a longitudinal survey. In this design the sample clusters are the same in each successive survey, but the individual respondents need not be the same. The characteristics of people within a sample cluster are more homogeneous than those of people in different clusters, so successive samples drawn from the same clusters resemble each other more than totally independent samples would. This provides greater precision in the estimates of change.
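The gain in precision from revisiting the same clusters can be illustrated with the standard formula for the variance of a difference between two estimates. The following is a minimal sketch, not a procedure taken from any particular survey program; the function name, the standard errors, and the correlation of 0.5 are illustrative assumptions.

```python
import math

def se_of_change(se_first, se_second, correlation):
    """Standard error of the difference between two survey estimates.

    Var(y2 - y1) = Var(y1) + Var(y2) - 2 * rho * SE(y1) * SE(y2),
    where rho is the correlation between the two estimates.
    """
    variance = (se_first ** 2 + se_second ** 2
                - 2 * correlation * se_first * se_second)
    return math.sqrt(variance)

# Assumed values: each survey estimates a percentage with a standard error of 2.0.
independent = se_of_change(2.0, 2.0, 0.0)    # two independent samples
same_clusters = se_of_change(2.0, 2.0, 0.5)  # same clusters, assumed rho = 0.5

print(round(independent, 3))    # 2.828
print(round(same_clusters, 3))  # 2.0
```

With independent samples the correlation is zero and the two variances simply add. Any positive correlation induced by reusing the same clusters reduces the standard error of the estimated change.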
Sampling
Sampling is a difficult task even when the necessary baseline data about the population to be sampled are readily available. In the United States, most research institutions obtain their basic data from the U.S. Bureau of the Census and other government agencies that collect basic statistical information or from commercial firms that sell samples and sampling frames. Most developing countries lack updated census and other information that can serve as secure sampling frames. More often than not, special field operations are necessary to develop an appropriate sample frame by creating up-to-date listings of households or dwellings.
Probability sampling consists of randomly selecting the desired number of subjects from a complete list of all similar subjects in the sample universe. It depends on mechanical random selection and ensures that every element in the population of interest has a known, positive probability of selection. The way samples are actually drawn will depend on what the samples are expected to represent. For instance, if a sample is expected to provide information for a country as a whole and also for each of four of its provinces, each of those provinces needs to be allocated a large enough sample to permit calculation of the required indicators with the desired level of precision.
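The requirement that each province receive a large enough sample can be made concrete with a small calculation. The sketch below assumes simple random sampling within each province, a 95 percent confidence level, and a desired margin of error of 5 percentage points on a proportion; the frame of 10,000 listed households and all names in the code are hypothetical.

```python
import math
import random

def sample_size_for_proportion(p, margin, z=1.96):
    """Sample size needed to estimate a proportion p within +/- margin,
    assuming simple random sampling from a large population."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# p = 0.5 is the most conservative assumption (largest variance).
n_per_province = sample_size_for_proportion(p=0.5, margin=0.05)
n_national = 4 * n_per_province  # four provinces, each needing its own estimate

# Hypothetical sampling frame: a complete list of 10,000 household IDs.
frame = list(range(10_000))
rng = random.Random(42)  # fixed seed so the draw can be reproduced
sample = rng.sample(frame, n_per_province)  # selection without replacement

# Every household has the same known, positive probability of selection.
inclusion_probability = n_per_province / len(frame)

print(n_per_province)         # 385
print(n_national)             # 1540
print(inclusion_probability)  # 0.0385
```

Because precision is required for each province separately, the national sample is roughly four times what a single national estimate would need. A clustered design would typically require a larger sample still.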
One factor that helps determine the type of sample to be drawn is whether the sampled individuals will be interviewed through a personal or a phone interview. For personal interview samples, it is typically too costly to interview people who are chosen individually from a list of all individuals in the sample universe. Kish calls this element sampling. For this and other reasons, most personal interview samples are drawn under cluster sampling. Cluster sampling selects groups of elements, with each group or cluster containing contiguous sampling elements (e.g., an urban block). Using cluster sampling implies that all the elements of the population are represented and identifiable in one of the clusters. The size of the clusters and the number of elements to be selected in each selected cluster will be determined by the objectives of the study and the field costs of the survey. The major advantage of cluster sampling is cost savings in the fieldwork; the major drawback is that, because elements within a cluster tend to be homogeneous, the sampling variance of the estimates is greater than for an element sample of the same size.
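The loss of precision under cluster sampling is commonly summarized by the design effect, approximated as 1 + (m - 1) × rho, where m is the number of elements interviewed per cluster and rho is the intraclass correlation. The sketch below applies that approximation; the sample size, cluster take, and rho are assumed for illustration and do not come from any survey discussed here.

```python
def design_effect(elements_per_cluster, intraclass_correlation):
    """Approximate design effect for equal-sized clusters:
    deff = 1 + (m - 1) * rho."""
    return 1 + (elements_per_cluster - 1) * intraclass_correlation

def effective_sample_size(total_sample, deff):
    """Size of the simple random sample that would give the same precision."""
    return total_sample / deff

# Assumed design: 6,000 interviews, 20 per cluster, rho = 0.05.
deff = design_effect(elements_per_cluster=20, intraclass_correlation=0.05)
n_eff = effective_sample_size(6000, deff)

print(round(deff, 2))  # 1.95
print(round(n_eff))    # 3077
```

Under these assumptions the 6,000 clustered interviews carry about as much information as 3,077 interviews drawn by element sampling. This is the price paid for the fieldwork savings, and it grows with both the cluster take and the homogeneity within clusters.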
Developments in Data Processing
Some of the main bottlenecks in getting survey data published shortly after data collection have traditionally been the hardware, software, and manpower available for processing the information collected. In the 1960s and early 1970s most surveys were still processed by coding the information on special coding sheets and entering that information on punch cards that were then used in computer analysis of the data. Survey researchers typically had to operate through intermediaries at computer centers to have the data tabulated. With the advent of microcomputers in the late 1970s and the creation of appropriate software, it became possible to do most data processing in-house. Until the mid-1980s, the speed of the available processors and software limitations still made the processing of large surveys a difficult enterprise.
Large data collection efforts such as censuses were most often processed using optical readers. This avoided the onerous task of entering the data by hand and speeded up their availability for analysis. Due to special requirements of page layout and the necessarily limited length of the questionnaires, few comprehensive surveys were processed through the optical reader process.
One of the major problems in survey data processing is how to create a file that is free of structural or consistency errors in the variables. Such a file is created through detailed editing of the data and, where possible, imputation of missing data. This editing eliminates the errors introduced during the interview, in the coding process, and in data entry. The availability of microcomputers for data entry made it possible to build structural, range, and some consistency checks into the data entry program and resulted in fewer errors in initial data files. Further consistency checking can eliminate these types of errors altogether. The development of appropriate software for these stages of processing has been a major factor in the earlier availability of survey data. The Demographic and Health Surveys program developed its Integrated System for Survey Analysis (ISSA), which can handle all data entry, editing, and tabulation. Statistics Netherlands developed a similar program called Blaise, while the CDC developed the widely used program Epi Info. Statistical analysis packages such as SPSS and SAS also contributed much to the speedier publication of survey data.
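The three kinds of checks mentioned above (structural, range, and consistency) can be illustrated with a short sketch. It does not reproduce ISSA, Blaise, or Epi Info; the field names, the 15 to 49 age range, and the rules themselves are hypothetical examples of what a data entry program for a fertility questionnaire might enforce.

```python
# Hypothetical fields of a woman's questionnaire record.
REQUIRED_FIELDS = ("age", "children_ever_born", "children_surviving")

def check_record(record):
    """Return a list of error messages for one interview record."""
    errors = []

    # Structural check: every expected field must be present.
    missing = [field for field in REQUIRED_FIELDS if record.get(field) is None]
    if missing:
        errors.append("missing fields: " + ", ".join(missing))
        return errors  # later checks need all fields

    # Range checks: values must fall within plausible limits.
    if not 15 <= record["age"] <= 49:
        errors.append("age out of range 15-49")
    if record["children_ever_born"] < 0 or record["children_surviving"] < 0:
        errors.append("negative child count")

    # Consistency check: answers must agree with one another.
    if record["children_surviving"] > record["children_ever_born"]:
        errors.append("more surviving children than children ever born")

    return errors

records = [
    {"age": 27, "children_ever_born": 2, "children_surviving": 2},
    {"age": 52, "children_ever_born": 3, "children_surviving": 4},
    {"age": 31, "children_ever_born": 1},
]
for number, record in enumerate(records, start=1):
    print(number, check_record(record))
```

Running such checks at the moment of data entry, rather than afterward, lets the operator correct a problem while the questionnaire is still at hand.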
The continued development of personal computers and the availability of laptops and handheld computers are further facilitating survey processing. Frequently, data are entered on a handheld computer or laptop during the interview, thus obviating the need for further data entry. In addition, checks incorporated during the interview can ensure that the resulting files are largely free of error, which minimizes the need for extensive cleaning of the data. There are already instances where survey data are instantly transmitted from the interviewer's computer to a central computer for tabulations.
The proliferation of software and equipment has also had its drawbacks, especially in developing countries. Too many different systems are in use, making it more difficult to build the capacity of organizations to process their own surveys.
Telephone Surveys
Telephone surveys are the most common and cheapest way to collect information for marketing and other purposes. For obtaining demographic survey data, they can only be used where all the sample population is reachable by phone. This excludes developing countries. In the United States, the proportion of households with a phone rose above 90 percent in the 1970s, making it possible to sample nearly as well in a telephone survey as through personal interviews. This has generated a fast-growing telephone interviewing industry.
A major advantage of telephone surveys is that the sample design has no impact on the speed of data collection. Distance between sample subjects is not a problem. Another major advantage is quality control, particularly where the telephone interviews are conducted by means of a Computer Assisted Telephone Interviewing (CATI) system. This system can control the sample selection, the flow of the interview, and the quality of data entry. A further advantage is that the use of a CATI system ensures instant availability of the data. Telephone surveys are generally considered to be unsuitable for interviews of longer than 20 minutes, particularly if the subject matter of the interview requires a high degree of cooperation. Due to their cost-effectiveness, telephone surveys are also used in combination with other methods of data collection. Short screening interviews are often done by phone to determine which respondents should receive a more comprehensive personal interview. Sampling for telephone interviews poses its own challenges, however, due to the existence of unlisted phone numbers. A technique called "list assisted random digit dial" is used to decide how many telephone numbers to select from telephone lists with different occurrences of unlisted numbers.
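One common form of list-assisted random digit dialing groups telephone numbers into "banks" of 100 consecutive numbers, keeps only the banks that contain at least one listed number, and then generates the last two digits at random so that unlisted numbers in those banks can also be reached. The sketch below follows that general idea under simplifying assumptions; the function names and the fictitious ten-digit numbers are hypothetical, and operational designs differ in their details.

```python
import random

def working_banks(listed_numbers, min_listed=1):
    """Return the 100-number banks (all digits but the last two) that
    contain at least `min_listed` listed numbers."""
    counts = {}
    for number in listed_numbers:
        bank = number[:-2]
        counts[bank] = counts.get(bank, 0) + 1
    return sorted(bank for bank, count in counts.items() if count >= min_listed)

def draw_rdd_sample(listed_numbers, n, seed=0):
    """Draw n distinct numbers by picking a retained bank at random and
    appending two random digits, so unlisted numbers can be selected."""
    banks = working_banks(listed_numbers)
    if n > 100 * len(banks):
        raise ValueError("requested sample exceeds numbers available in banks")
    rng = random.Random(seed)
    sample = set()
    while len(sample) < n:
        bank = rng.choice(banks)
        sample.add(bank + f"{rng.randrange(100):02d}")
    return sorted(sample)

# Fictitious directory listings (the 555 exchange is used for illustration).
listed = ["2025550123", "2025550187", "2025550911", "3015550142"]
print(working_banks(listed))  # ['20255501', '20255509', '30155501']
print(draw_rdd_sample(listed, n=5))
```

Restricting the draw to banks with listed numbers avoids dialing large blocks of unassigned numbers, while the random final digits give unlisted households in those banks a chance of selection.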
Bibliography
Adlakha, Arjun, Jeremiah M. Sullivan, and James R. Abernathy. 1980. "Recent Trends in the Methodology of Demographic Surveys in Developing Countries." Scientific Report Series, No. 33.
Baum, Samuel, Kathleen Dopkowski, William G. Duncan, and Peter Gardiner. 1974. "The World Fertility Survey Inventory." World Fertility Survey Occasional Papers, nos. 3–6.
Cleland, John, and Christopher Scott. 1987. The World Fertility Survey: An Assessment. Oxford: Oxford University Press.
Dekker, Arie. 1997. "Data Processing for Demographic Censuses and Surveys with Special Emphasis on Methods Applicable to Developing Country Environments." Netherlands Interdisciplinary Demographic Institute (NIDI) Report No. 51. The Hague, Netherlands: NIDI.
International Development Research Center. 2002. Population and Health in Developing Countries, Vol. 1, Population, Health and Survival at INDEPTH Sites. Ottawa, Canada: International Development Research Center.
Kalton, Graham. 1983. Introduction to Survey Sampling. Beverly Hills, CA: Sage.
Lavrakas, Paul J. 1993. Telephone Survey Methods, Sampling, Selection and Supervision. Applied Social Research Methods Series, Vol. 7. London: Sage.
Lloyd, Cynthia B., and Catherine M. Marquette. 1992. Directory of Surveys in Developing Countries: Data on Families and Households, 1975–92. New York: Population Council.
Macro International Inc. 1996. Sampling Manual. DHS-III Basic Documentation Number 6. Calverton, MD: Macro International.
Moser, Claus A., and Graham Kalton. 1971. Survey Methods in Social Investigation, 2nd edition. London: Heinemann. (New York: Basic Books, 1972.)
Pearl, Raymond. 1939. The Natural History of Population. London: Oxford University Press.
Population Council. 1970. A Manual for Surveys of Fertility and Family Planning: Knowledge, Attitudes, and Practice. New York: Population Council.
Ryder, Norman B., and Charles F. Westoff. 1971. Reproduction in the United States, 1965. Princeton, NJ: Princeton University Press.
Shryock, Henry S., Jr., Jacob S. Siegel, et al. 1971 (3rd rev. printing, 1975). The Methods and Materials of Demography. 2 vols. Prepared for the U.S. Bureau of the Census. Washington, D.C.: U.S. Government Printing Office.
Tabutin, Dominique. 1984. "Comparison of Single and Multi-round Surveys for Measuring Mortality in Developing Countries." In Methodologies for the Collection and Analysis of Mortality Data, eds. Jacques Vallin, John Pollard, and Larry Heligman. Liège, Belgium: International Union for the Scientific Study of Population, Ordina Editions.
United Nations. 1961. The Mysore Population Study. (ST/SOA/Series A/34). New York, NY: United Nations.
United Nations. 1984. Handbook of Household Surveys. (ST/ESA/STA/SER.F/31). New York: United Nations.
Whelpton, Pascal K., and Clyde V. Kiser, eds. 1946 (rev. 1950, 1952, 1954, 1958). Social and Psychological Factors Affecting Fertility, 5 vols. New York: Milbank Memorial Fund.