At the most basic level, social science is the process of developing and testing scientific explanations for various aspects of behavior and experience. Of the possible types of explanations, causal explanations are generally preferred in science because they specify why some phenomenon occurs rather than simply indicating that or when it occurred. This focus on causation allows scientists to predict and, in some cases, control phenomena. Other types of explanations, although important, are far less adept in this regard. A preference for causal explanations has led researchers to develop special procedures for testing putative cause–and–effect relationships, collectively known as experimental methodology.
Experimental methodology affords an inference of causality through a collection of specific procedures. First, the researcher creates two or more groups of research participants that are equivalent at the outset of the study. Second, the researcher introduces a different level (e.g., dose, version, intensity) of an independent variable to each group of participants. An independent variable is the entity, experience, or situation that the researcher proposes as the “cause” in a cause–and–effect relationship. It is referred to as “independent” because the experimenter has independent control over which group of participants receives which level, and as a “variable” because it has more than one level (e.g., more than one dose, version, or intensity). After introducing a different level of the independent variable to each group, the researcher measures or records at least one dependent variable, that is, the behavior, experience, or event that the researcher proposes as the “effect” in a cause–and–effect relationship. It is referred to as “dependent” because its value depends upon the independent variable, and as a “variable” because it can take more than one value.
The logic of this arrangement and the ability to infer a causal relationship from an experiment are straightforward. If the groups of research participants were equivalent at the beginning of the study and the level of the independent variable was the only thing that differed systematically among the groups, then any difference found on the dependent variable must be attributed to the causal influence of the independent variable on the dependent variable. Beyond the logic of experimental methodology, however, a number of issues that arise when constructing experiments complicate matters considerably.
Creating Equivalent Groups
An important aspect of conducting an experiment is creating groups of participants that are equivalent prior to the introduction of the independent variable. There are two general strategies for accomplishing this equivalence: random assignment and blocking. Random assignment refers to the placement of each participant into one group or another on the basis of chance alone. This can be done by flipping a coin, consulting a random number table, or by using some other randomization technique to determine the specific group to which a participant is assigned. Random assignment has been called the “great equalizer” because any differences among participants prior to the introduction of the independent variable tend to be distributed equally across experimental groups.
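Random assignment can be sketched in a few lines of code. The example below is a minimal illustration, not a prescribed procedure: the function name `randomly_assign` and its parameters are hypothetical, and it assumes the researcher wants groups of nearly equal size, so it shuffles the participant list and deals participants out in turn rather than flipping an independent coin for each person.

```python
import random


def randomly_assign(participants, n_groups=2, seed=None):
    """Place each participant into a group on the basis of chance alone.

    Hypothetical helper: shuffles the participants, then deals them out
    in turn so that group sizes differ by at most one.
    """
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_groups)]
    for position, participant in enumerate(shuffled):
        groups[position % n_groups].append(participant)
    return groups


if __name__ == "__main__":
    # Twenty hypothetical participant ids split into two groups.
    for index, group in enumerate(randomly_assign(range(1, 21), seed=42)):
        print(f"group {index}: {sorted(group)}")
```

Because chance alone determines membership, any pre-existing participant characteristic is expected to be spread across the groups rather than concentrated in one of them.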
The second strategy for creating equivalent groups is blocking, which refers to the practice of creating equivalent groups of participants by controlling for participants’ characteristics that could potentially have a systematic effect on the dependent variable. Such control can be accomplished in a number of ways, but the two most common strategies are matching and statistical elimination. With matching, the researcher assesses participant characteristics that may influence the dependent variable, purposely creates pairs of participants that share those characteristics, and then assigns one member of the pair to one group and the other member to another group. With statistical elimination, the researcher measures participant characteristics that may influence the dependent variable, and uses statistical techniques to mathematically remove the effects of those characteristics from the dependent variable. Of the two techniques, matching is better suited to research designs with only two groups, whereas statistical elimination can be used in research designs with two or more groups.
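The two blocking strategies can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names `matched_pairs` and `remove_covariate` are hypothetical, matching is done on a single measured characteristic, and statistical elimination is reduced to removing the linear effect of one covariate by ordinary least squares (in practice researchers typically use techniques such as analysis of covariance).

```python
import random


def matched_pairs(scores, seed=None):
    """Pair participants with similar covariate scores, then split each pair.

    `scores` maps a participant id to a measured characteristic
    (hypothetical data). An unpaired leftover participant is dropped.
    """
    rng = random.Random(seed)
    # Sorting by the covariate places the most similar participants side by side.
    ordered = sorted(scores, key=scores.get)
    group_a, group_b = [], []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # chance decides which member joins which group
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b


def remove_covariate(covariate, outcome):
    """Return outcome scores with the linear effect of the covariate removed.

    Fits outcome = a + b * covariate by least squares and returns residuals.
    """
    n = len(covariate)
    mean_x = sum(covariate) / n
    mean_y = sum(outcome) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, outcome))
    slope = sxy / sxx
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(covariate, outcome)]
```

With matching, equivalence is built into the groups before the independent variable is introduced; with statistical elimination, the adjustment is made afterward on the recorded scores.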
Operational Definitions
The independent and dependent variables are hypothetical constructs that carry with them conceptual definitions that researchers use to communicate information about the variables to other researchers. In order to conduct an experiment, however, the hypothetical construct must be made “real” by developing a strategy that will allow the independent variable to be manipulated and the dependent variable to be measured or recorded within the experimental context. The process of making hypothetical constructs real is called operationalization. To operationalize a hypothetical construct, a researcher must select or create a concrete, real–world manifestation of the construct and define it with enough precision so that other researchers can identify it within the context of the experiment and, should they so choose, use the same operationalizations in other research projects.
A researcher can select from a number of options in order to operationalize the independent variable. Among these options are manipulations of participants’ physiological states (influencing the natural biological states of participants), environmental context (altering the situation or context in which participants are engaged), and blocking (dividing participants into groups or categories on the basis of certain characteristics such as age, gender, income, education level, etc.). Blocking is not a manipulation in the strict sense of the term, but naturally occurring differences among individuals can be used as a quasi–independent variable. Although many options for operationalizing independent variables are available, some may be more appropriate than others to the phenomenon at issue, and the final choice of operational definition is typically determined by the phenomenon in question, resources available to the researcher, and ethical considerations.
The researcher also has many options from which to choose when operationalizing the dependent variable. The researcher may measure the presence, frequency or intensity, or latency or duration of participants’ thoughts, beliefs, attitudes, behaviors, or behavioral intentions. Again, although many operational definitions are available, the final choice is typically determined by the phenomenon in question, resources available to the researcher, and ethical considerations.
Replication
The operational definitions used in an experiment are important not only for communicating how variables are manipulated and measured in the context of a study, but also for the purpose of replication. Replication refers to the ability of other researchers to repeat a study and produce results similar to those found in the original. There are two types of replication: direct and conceptual. Direct replication occurs when a researcher repeats an existing study exactly as it was conducted originally. Conceptual replication occurs when a researcher repeats an existing study using different operationalizations of the same independent and/or dependent variables, or other modifications of the original procedures. Both forms of replication are important, but conceptual replication provides stronger support for a hypothetical cause–and–effect relationship among variables because consistent results across many conceptual replications reduce the likelihood that the results are simply a consequence of a single operational definition.
Research Design
The strategy used to create equivalent groups and the ability of the researcher to manipulate an independent variable restrict the type of research design that can be employed. Specifically, experimental methodology can be divided into two categories: true experiments and quasi experiments. True experiments are characterized by (1) random assignment and (2) manipulation of the independent variable. In contrast, quasi experiments possess only one of these characteristics, which in most cases is manipulation of the independent variable. Three distinct true experimental designs and at least ten quasi–experimental designs have been identified. These basic designs can be combined or modified to produce a large number of possible research designs that can be used to test a broad range of cause–and–effect explanations.
Research Setting
In addition to design issues, researchers must also select a research setting. There are two settings for research: laboratory and field. A laboratory setting refers to research conducted in a dedicated scientific laboratory; a field setting refers to any location other than the laboratory. Field settings are typically less controlled than laboratory settings, but they make up for this deficit by affording researchers an opportunity to study phenomena in a context that is similar to, if not the same as, the environments in which the phenomena occur naturally.
Realism and Impact
Another consideration when designing and conducting an experiment is realism. Realism is the extent to which an experiment simulates or reflects various aspects of the phenomenon under investigation. There are two types of realism: mundane and experimental. Mundane realism is the extent to which a phenomenon is studied in an environment similar to the one in which it occurs naturally. In contrast, experimental realism is the extent to which real psychological processes occur within the context of the study, regardless of the artificiality of the experimental setting or procedures. As a general rule, researchers tend to maximize experimental realism even if doing so minimizes mundane realism, because the presence of extraneous variables that may accompany mundane realism reduces the capacity of researchers to derive a causal inference from the study. Although the distinction between mundane and experimental realism does map onto the distinction between laboratory and field settings, these issues are distinct and should be treated as such.
Related to the issue of realism is impact. Impact refers to the intensity of the process or phenomenon elicited in a study. High–impact studies typically involve situations designed to maximize the experience or expression of the phenomenon. Low–impact studies, which are sometimes called judgment studies, typically involve assessing or recording simulations of a process or phenomenon in an “as if” manner. Low–impact studies are typically less desirable than high–impact studies because participants asked to simulate a process may respond in a manner that is unrepresentative of how they would respond if the process or phenomenon in question were actually occurring. The choice of impact level is often determined by the phenomenon under investigation, resources available to the researcher, and ethical considerations.
Validity
Of major concern to scientists is the validity of different research designs. In this context, validity refers to the extent to which a design is free of confounds (i.e., flaws) that serve as alternative explanations for the proposed causal effects of the independent variable on the dependent variable. Confounds can be divided into two types: threats to internal validity and threats to external validity. Internal validity refers to the absence of alternative explanations for the causal relationship between the independent and dependent variables. External validity refers to the extent to which the results of an experiment can be generalized to other operationalizations of the independent or dependent variable, other populations, or other settings. Both types of validity are important, but researchers tend to focus on internal validity because a study with low internal validity is not interpretable. When research results are not interpretable, the generalizability of the results is irrelevant.
Eight threats to internal validity and four threats to external validity have been identified. The eight threats to internal validity are:
(1) history (changes on the dependent variable due to events other than the independent variable),
(2) maturation (changes on the dependent variable due to time–dependent changes in characteristics of the participants during a study),
(3) testing (changes on the dependent variable due to the effects of measuring the dependent variable at an earlier point in time),
(4) instrumentation (changes on the dependent variable due to changes in the calibration or accuracy of measurement instruments),
(5) statistical regression (changes on the dependent variable due to the assignment of participants to different groups based on extreme characteristics or abilities),
(6) selection (changes on the dependent variable due to differential selection of participants for possessing certain characteristics or abilities),
(7) mortality (changes on the dependent variable due to differential loss of participants from different groups in a study), and
(8) interactions between two or more threats to internal validity.
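Of the threats listed above, statistical regression is perhaps the least intuitive, and a small simulation can make it concrete. The sketch below rests entirely on illustrative assumptions: the function name `regression_to_mean_demo`, the score distributions, and the selection rule (the top 10 percent on a pretest) are all hypothetical. No treatment is applied, yet the selected group's average still moves toward the overall mean on retest.

```python
import random
from statistics import mean


def regression_to_mean_demo(n=10_000, top_fraction=0.10, seed=0):
    """Simulate pretest and posttest means for participants chosen as extreme.

    Hypothetical model: each observed score is a stable true score plus
    independent measurement noise on each testing occasion.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    pretest = [t + rng.gauss(0, 10) for t in true_scores]
    posttest = [t + rng.gauss(0, 10) for t in true_scores]
    # Select participants because of their extreme pretest scores.
    cutoff = sorted(pretest)[int(n * (1 - top_fraction))]
    selected = [i for i in range(n) if pretest[i] >= cutoff]
    pre_mean = mean(pretest[i] for i in selected)
    post_mean = mean(posttest[i] for i in selected)
    return pre_mean, post_mean


if __name__ == "__main__":
    pre, post = regression_to_mean_demo()
    print(f"selected group pretest mean:  {pre:.1f}")
    print(f"selected group posttest mean: {post:.1f}")
```

A researcher who assigned only these extreme scorers to a treatment could mistake the drop in their scores for an effect of the independent variable, which is why this pattern counts as a threat to internal validity.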
The four threats to external validity are: (1) interaction of testing and the independent variable (pretesting can alter participants’ sensitivity or responsiveness to the independent variable, which can render responses from pretested participants unrepresentative of non–pretested participants); (2) interaction of selection and the independent variable (participants selected for a study because they possess certain characteristics may respond to different levels of the independent variable in a manner that is unrepresentative of participants who do not possess those characteristics); (3) reactive arrangement (the arrangement of the experimental context may produce effects that do not generalize to nonexperimental situations, particularly when participants’ responses have been altered during a study because they know they are being observed); and (4) multiple–treatment interference (multiple levels of the independent variable applied to the same participant may produce interference among the different levels and produce effects that would not occur if each level were used independently).
Following a series of unethical studies in the first half of the twentieth century, the scientific community along with the Nuremberg war tribunal developed a code of ethics for scientific research. Succinctly, this code requires that research participants freely consent to participate, be fully informed of the purpose and potential risks associated with participation, and be afforded the right to discontinue participation for any reason at any time. Similarly, the researcher must strive to minimize risks to participants, protect them from harm insofar as possible, and be fully qualified to conduct the research with honesty and integrity. This code and other more stringent ethical principles developed by individual branches of social science are enforced by Institutional Review Boards (IRBs). IRBs are institutional panels of experts and community volunteers that review the potential risks to participants of studies proposed by researchers. IRBs carefully examine the procedure and materials that are to be used in a study to ensure that the researcher is doing everything possible to protect participants from undue harm. Consequently, the goal of IRBs is to balance the benefits of acquiring scientific knowledge with protecting the rights of research participants.
One point of concern for IRBs is informed consent, which typically entails a document that outlines the rights of research participants as well as the potential risks and benefits associated with the study for which they have volunteered. It is necessary for participants to freely consent and fully understand these benefits and risks in order for a study to meet the ethical requirements set forth by the scientific community. However, informed consent is somewhat problematic in research that involves some form of deception. In such cases, it may be unclear whether enough information has been supplied at the outset for participants to fully understand the risks of participation. Similarly, informed consent is problematic when working with vulnerable populations such as young children or individuals with poor mental health who cannot understand or communicate informed consent, prisoners who may feel compelled to participate in research, individuals who do not speak the language in which the informed consent document is written, and people with poor physical health. Also, research on nonhuman animals poses problems with consent and raises additional ethical responsibilities. In such research, it is important to provide humane and ethical treatment, and to take steps to minimize discomfort, illness, stress, and privation.
The culmination of a study, and the resolution of a researcher’s ethical obligations, is the debriefing: the full disclosure to participants by the researcher of the purpose and procedures of a study, including any deceptive elements. In addition to disclosure, another purpose of a debriefing is to assess and rectify any adverse effects that may have occurred during the study. Adverse effects include stress, pain, threats to self–esteem or identity, and the like. It is the researcher’s responsibility to provide affected participants with appropriate remedies or treatment options, or to otherwise undo any harm done.
SEE ALSO Ethics in Experimentation; Experiments, Human; Experiments, Shock; Regression; Scientific Method
Aronson, Elliot, Timothy D. Wilson, and Marilynn B. Brewer. 1998. Experimentation in Social Psychology. In The Handbook of Social Psychology, 4th ed., ed. Daniel T. Gilbert, Susan T. Fiske, and Gardner Lindzey, 441–486. Boston: McGraw–Hill.
Campbell, Donald T., and Julian C. Stanley. 1966. Experimental and Quasi–Experimental Designs for Research. Boston: Houghton Mifflin.
Cook, Thomas D., and Donald T. Campbell. 1979. Quasi–Experimentation: Design and Analysis Issues for Field Studies. Chicago: Rand McNally.
E. L. Stocks
David A. Lishner