Variables, Random


A random variable is a real-valued function that maps a sample space into the real line. The sample space, denoted by Ω = {ω}, is the set of possible outcomes of some chance phenomenon (e.g., acts of individuals, an experiment). To illustrate, consider the familiar example of tossing a coin. There are only two possible outcomes; hence Ω = {H, T}. Here, the symbols H and T denote the outcomes “heads” and “tails.” A random variable Y can be defined by setting Y = 1 if H occurs, or Y = 0 if T occurs. The use of the word if here is important; if a coin is actually tossed, “heads” is observed, and Y = 1 is recorded, then this value is a realization of the random variable Y. The number 1 is not random but is simply a number. The variable Y is considered random until it is observed. If the coin is “fair,” then the probability that Y = 1 (on a toss that has not been observed yet) equals the probability that Y = 0; both probabilities equal 0.5. Alternatively, one could define a random variable X, equal to –1 if heads occurs and equal to 1 if tails occurs; in fact, any pair of distinct values could be used to define a random variable describing the outcome of a coin toss.
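The coin-toss example can be sketched in a few lines of Python. This is only an illustration: the names OMEGA, Y, X, and toss are chosen here for convenience, the coin is assumed to be fair, and the seed is arbitrary. The point is that Y and X are ordinary functions on the sample space, while the number obtained after a toss is a realization.

```python
import random

# Sample space for a single coin toss.
OMEGA = ("H", "T")


def Y(outcome):
    """Random variable Y: 1 for heads, 0 for tails."""
    return 1 if outcome == "H" else 0


def X(outcome):
    """Alternative random variable X: -1 for heads, 1 for tails."""
    return -1 if outcome == "H" else 1


def toss(rng):
    """Draw one outcome from OMEGA (assumption: a fair coin)."""
    return rng.choice(OMEGA)


rng = random.Random(0)  # arbitrary seed, used only for reproducibility
outcome = toss(rng)     # the chance phenomenon is observed here
# Y(outcome) and X(outcome) are now realizations: plain numbers.
print(outcome, Y(outcome), X(outcome))
```

Both functions describe the same toss; they differ only in the pair of real numbers assigned to the two outcomes.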

The concept of random variables was introduced by Pafnuty Chebyshev (1821–1894), who in the mid-nineteenth century defined a random variable as “a real variable which can assume different values with different probabilities” (Spanos 1999, p. 35). The concept is closely tied to the theory of probability, which has been studied since the seventeenth century. However, the modern understanding of random variables and their relation to probability arrived more recently, dating to the work by Andrey Kolmogorov (1933).

Random variables may be either discrete or continuous. In the discrete case, the elements of Ω are countable, although perhaps infinite in number. In the continuous case, elements of Ω are not countable, implying that there are infinitely many elements. The elements ω_j of Ω are called elementary events; collections of the elementary events are called simply events.

To formalize the definition of random variables, first consider the case where Ω contains a finite number of elements. An event A is a subset of Ω, that is, A ⊆ Ω. The complement of A (with respect to Ω) is defined by Ā = Ω – A. Then Ω is the certain event, while Ø, the complement of Ω, is the impossible, or null, event. Let ℑ be the set of all events (including Ω and Ø) defined on the sample space Ω = {ω₁, …, ω_m}; ℑ is a field in the sense that it is closed under the formation of unions and complements (i.e., if A, B ∈ ℑ then A ∪ B ∈ ℑ; if A ∈ ℑ then Ā ∈ ℑ). Then the probability of the event A ∈ ℑ, denoted P(A), is a set function mapping ℑ into the closed interval [0, 1] and satisfying

i. 0 ≤ P(A) ≤ 1 for all A ∈ ℑ;

ii. P(Ω) = 1; and

iii. P(A ∪ B) = P(A) + P(B) if A ∩ B = Ø.

The triple (Ω, ℑ, P ) is called a probability space.
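For a finite sample space, the field and the three conditions on P can be checked mechanically. The following Python sketch does so for the coin-toss space, taking ℑ to be the set of all subsets of Ω and assuming a fair coin. The helper names (power_set, is_field, satisfies_conditions) and the WEIGHTS table are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction
from itertools import chain, combinations

OMEGA = frozenset({"H", "T"})


def power_set(omega):
    """Return all subsets of omega as a set of frozensets."""
    items = sorted(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


FIELD = power_set(OMEGA)

# Assumed probabilities of the elementary events (fair coin).
WEIGHTS = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def P(event):
    """Probability of an event: sum over its elementary events."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))


def is_field(field, omega):
    """Check closure under complements and pairwise unions."""
    closed_complement = all(omega - a in field for a in field)
    closed_union = all(a | b in field for a in field for b in field)
    return omega in field and closed_complement and closed_union


def satisfies_conditions(field, omega):
    """Check the three conditions on P for every event in the field."""
    bounded = all(0 <= P(a) <= 1 for a in field)
    normalized = P(omega) == 1
    additive = all(
        P(a | b) == P(a) + P(b)
        for a in field for b in field
        if not (a & b)  # only disjoint pairs are constrained
    )
    return bounded and normalized and additive


print(is_field(FIELD, OMEGA), satisfies_conditions(FIELD, OMEGA))
```

Exact fractions are used instead of floating-point numbers so that the additivity check compares equal values exactly.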

If Ω contains infinitely many elements (either countable or non-countable), ℑ is required to be a σ-field, meaning that ℑ is closed under the formation of complements and unions of countably many events. In addition, the set function P must be countably additive so that condition (iii) above becomes

iii. If A₁, A₂, … are disjoint members of the σ-field ℑ, then P(A₁ ∪ A₂ ∪ …) = P(A₁) + P(A₂) + ….
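A short worked example may help to show what countable additivity delivers. Suppose, as an assumption made only for this illustration, that a fair coin is tossed repeatedly and independently, and let A_j be the event that the first head appears on toss j. The events A₁, A₂, … are disjoint and P(A_j) = 2^(–j), so that

```latex
P\Bigl(\bigcup_{j=1}^{\infty} A_j\Bigr)
  = \sum_{j=1}^{\infty} P(A_j)
  = \sum_{j=1}^{\infty} 2^{-j}
  = \frac{1/2}{1 - 1/2}
  = 1 .
```

Under these assumptions a head eventually appears with probability 1, a conclusion that requires summing over countably many events rather than finitely many.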

With the preceding concepts, a formal definition is possible. Suppose (Ω, ℑ, P) is a probability space in which Ω is not necessarily countable. Then a random variable Y defined on this space is a function mapping Ω into the real line such that the set {ω | Y(ω) ≤ y} ∈ ℑ for every real y. Hence, for each ω ∈ Ω, Y(ω) is a real number.
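The requirement that every set of outcomes with Y(ω) ≤ y belong to ℑ can be checked directly when Ω is finite. The Python sketch below uses a hypothetical four-point sample space and a deliberately coarse field that cannot separate outcome 1 from 2, or 3 from 4; the names is_random_variable, coarse, and fine are illustrative only.

```python
OMEGA = frozenset({1, 2, 3, 4})

# A hypothetical field that cannot distinguish 1 from 2, or 3 from 4.
FIELD = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), OMEGA}


def is_random_variable(func, omega, field):
    """Check that {w : func(w) <= y} lies in the field for every real y.

    On a finite omega this set changes only at the values func takes,
    so testing those values as thresholds is enough. For y below the
    smallest value the set is empty, and the empty set is in any field.
    """
    thresholds = {func(w) for w in omega}
    return all(
        frozenset(w for w in omega if func(w) <= y) in field
        for y in thresholds
    )


def coarse(w):
    """Constant on {1, 2} and on {3, 4}."""
    return 0.0 if w in (1, 2) else 1.0


def fine(w):
    """Takes a different value at every outcome."""
    return float(w)


print(is_random_variable(coarse, OMEGA, FIELD))
print(is_random_variable(fine, OMEGA, FIELD))
```

The function coarse passes the check, whereas fine does not: the set of outcomes with fine(ω) ≤ 1 is {1}, which is not an event in this field, so no probability is assigned to it.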

Many examples of random variables appear in the social sciences. Linear regressions involve an attempt to explain the mean of a continuous random variable; the error term in such equations is a random variable with mean 0, reflecting statistical noise. Schmidt and Witte (1989) considered the (random, continuous) time that elapses between a criminal’s release from prison and his subsequent conviction for another crime. Nakosteen and Zimmer (1980) examined not only workers’ incomes (continuous) but also their decisions to move to a new location or to stay at their current location (discrete). Others have considered counts of durable goods purchased by households; decisions of graduating high-school students to continue their education in college, enlist in the military, or enter the labor force; and other issues.

SEE ALSO Probability; Statistical Noise

BIBLIOGRAPHY

Kolmogorov, Andrey N. 1933. Grundbegriffe der Wahrscheinlichkeitsrechnung. Ergebnisse der Mathematik 2 (3). Translated as Foundations of the Theory of Probability by N. Morrison. New York: Chelsea, 1956.

Nakosteen, Robert, and Michael Zimmer. 1980. Migration and Income: The Question of Self-selection. Southern Economic Journal 46: 840–851.

Schmidt, Peter, and Ann Dryden Witte. 1989. Predicting Criminal Recidivism Using Split Population Survival Time Models. Journal of Econometrics 40: 141–159.

Spanos, Aris. 1999. Probability Theory and Statistical Inference: Econometric Modeling with Observational Data. Cambridge and New York: Cambridge University Press.

Paul W. Wilson

International Encyclopedia of the Social Sciences