Probability, Limits In


PROBABILITY LIMITS IN BAYESIAN STATISTICS

PROBABILITY LIMITS IN ASYMPTOTIC THEORY

LEGITIMATE CRITICISMS

BIBLIOGRAPHY

The terms limits in probability and probability limits are encountered in the fields of Bayesian inference and asymptotic theory; however, the term carries a different meaning in each of these two areas. In terms of popularity, probability limits in the Bayesian sense appear more frequently in the current literature, especially in the applied statistical sciences.

PROBABILITY LIMITS IN BAYESIAN STATISTICS

In Bayesian inference, or Bayesian statistics, probability limits are also referred to as credibility limits. Probability limits are the upper and lower endpoints of the probability (or credible) interval that has a specified (posterior) probability (e.g., 95 percent or 99 percent) of containing the true value of a population parameter. Probability limits are used when the parameter is treated as the realization of a random variable with a given prior distribution. Because this distribution is assessed before the sample evidence is seen, it is called a prior distribution. In classical inference, by contrast, the parameter is considered to be an unknown constant, and confidence limits are applied. Both the upper and lower probability limits reflect not only prior information but also sample information; therefore they are realizations of random statistics.

Confidence limits are the upper and lower endpoints of an interval around a parameter estimate such that, if an experiment were repeated an infinite number of times, in the specified percentage (usually 95 percent or 99 percent) of trials the interval generated would contain the true value of the parameter. Confidence limits may be calculated using asymptotic (normal-approximation) or exact methods. Both the upper and lower confidence limits are obtained purely from the sample data, so they too are realizations of random statistics.

Bayesian inference, or Bayesian statistics, is based on the theory of subjective probability. A formal Bayesian analysis leads to probabilistic assessments of the object of uncertainty. For instance, a Bayesian inference might be the statement "The probability is 0.95 that the mean μ of a normal distribution lies between 5.6 and 11.3." The 95 percent probability limits for the parameter μ are then 5.6 and 11.3. The number 0.95 here represents a degree of belief, in the sense of either coherent or rational subjective probability; it need not correspond to any objective long-run relative frequency. By contrast, a classical inference based on sampling theory might lead to the statement "A 0.95 confidence interval for the mean of a normal distribution is from 5.6 to 11.3." The number 0.95 in this case represents a long-run relative frequency.

Finding Bayesian Probability Limits There is some flexibility in choosing the credible limits from a given probability distribution for the parameter. The possibilities include: choosing the narrowest probability interval, which, for a unimodal distribution, means choosing the values of highest probability density (highest posterior density credible limits); choosing the probability interval for which the probability of falling below the interval equals the probability of falling above it (this interval contains the median); and choosing the probability interval that has the mean as its central point. A numerical sketch contrasting the first two choices appears directly below; after it, an analytical illustration of finding probability limits using the second method follows.
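To make the first two choices concrete, the following sketch (not part of the original illustration) computes an equal-tailed interval and a highest posterior density (HPD) interval for a hypothetical unimodal Gamma posterior. The distribution, its parameters, and the helper function width are illustrative assumptions only.

```python
# A minimal sketch, assuming a hypothetical Gamma(3, scale=2) posterior.
from scipy.stats import gamma
from scipy.optimize import minimize_scalar

posterior = gamma(a=3, scale=2)       # illustrative unimodal posterior
cred = 0.95
a = 1 - cred

# Second method: equal tail areas (the interval contains the median).
equal_tailed = (posterior.ppf(a / 2), posterior.ppf(1 - a / 2))

# First method: HPD interval, i.e., the narrowest interval carrying 95%
# posterior mass; for a unimodal density, slide the lower tail mass p.
def width(p):
    return posterior.ppf(p + cred) - posterior.ppf(p)

res = minimize_scalar(width, bounds=(1e-9, a - 1e-9), method="bounded")
hpd = (posterior.ppf(res.x), posterior.ppf(res.x + cred))

print("equal-tailed:", equal_tailed)
print("HPD:         ", hpd)           # narrower for this skewed density
```

For a symmetric unimodal posterior the two intervals coincide; for a right-skewed posterior such as this one, the HPD interval is shifted toward the mode and is strictly narrower.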

In a problem of making inferences about the mean λ of a Poisson distribution, the prior distribution is given as a Γ(α, β) distribution with known α and β. In Bayesian statistics the parameter is considered to be a random variable, so a bold $\boldsymbol{\lambda}$ is used below for the random variable and a plain λ for its realization. Now suppose that a random sample of size n has been drawn from the underlying Poisson distribution. Let $X=(X_1,X_2,\dots,X_n)$; then the joint conditional probability density function (pdf) of $X$, given $\boldsymbol{\lambda}=\lambda$, is

$$f(x_1,\dots,x_n\mid\lambda)=\prod_{i=1}^{n}\frac{\lambda^{x_i}e^{-\lambda}}{x_i!}=\frac{\lambda^{\sum_{i=1}^{n}x_i}\,e^{-n\lambda}}{\prod_{i=1}^{n}x_i!}.$$

The prior pdf is

$$h(\lambda)=\frac{1}{\Gamma(\alpha)\beta^{\alpha}}\,\lambda^{\alpha-1}e^{-\lambda/\beta},\qquad 0<\lambda<\infty.$$

Hence, the joint mixed continuous and discrete probability function is given by

$$g(x_1,\dots,x_n,\lambda)=f(x_1,\dots,x_n\mid\lambda)\,h(\lambda)=\frac{\lambda^{\sum_{i=1}^{n}x_i+\alpha-1}\,e^{-\lambda(n+1/\beta)}}{\Gamma(\alpha)\beta^{\alpha}\prod_{i=1}^{n}x_i!},$$

provided that $x_i=0,1,2,\dots$, $i=1,2,\dots,n$, with $0<\lambda<\infty$, and is equal to zero elsewhere. Then the marginal distribution of the sample is

$$k_1(x_1,\dots,x_n)=\int_{0}^{\infty}g(x_1,\dots,x_n,\lambda)\,d\lambda=\frac{\Gamma\!\bigl(\sum_{i=1}^{n}x_i+\alpha\bigr)}{\Gamma(\alpha)\beta^{\alpha}\prod_{i=1}^{n}x_i!}\left(\frac{\beta}{n\beta+1}\right)^{\sum_{i=1}^{n}x_i+\alpha}.$$

Suppose that the observations are $x=(x_1,x_2,\dots,x_n)$.

Finally, the posterior pdf of $\boldsymbol{\lambda}$, given $X=x$, is

$$k(\lambda\mid x)=\frac{g(x_1,\dots,x_n,\lambda)}{k_1(x_1,\dots,x_n)}=\frac{\lambda^{\sum_{i=1}^{n}x_i+\alpha-1}\,e^{-\lambda(n\beta+1)/\beta}}{\Gamma\!\bigl(\sum_{i=1}^{n}x_i+\alpha\bigr)\bigl(\beta/(n\beta+1)\bigr)^{\sum_{i=1}^{n}x_i+\alpha}},$$

provided that $x_i=0,1,2,\dots$, $i=1,2,\dots,n$, with $0<\lambda<\infty$, and is equal to zero elsewhere. This posterior pdf is also a gamma distribution, with parameters

$$\alpha^{*}=\sum_{i=1}^{n}x_i+\alpha\qquad\text{and}\qquad\beta^{*}=\frac{\beta}{n\beta+1}.$$

Notice that the posterior pdf reflects both the prior information, carried by (α, β), and the sample information, carried by $\sum_{i=1}^{n}x_i$.

To obtain a credible interval, note that the posterior distribution of $2(n\beta+1)\boldsymbol{\lambda}/\beta$ is $\chi^{2}(r)$ with $r=2\bigl(\sum_{i=1}^{n}x_i+\alpha\bigr)$ degrees of freedom. Based on this, the following interval is a (1 − a)·100 percent credible interval for λ:

$$\left(\frac{\beta}{2(n\beta+1)}\,\chi^{2}_{a/2}(r),\;\frac{\beta}{2(n\beta+1)}\,\chi^{2}_{1-a/2}(r)\right),\tag{1}$$

where $\chi^{2}_{a/2}(r)$ and $\chi^{2}_{1-a/2}(r)$ are the lower and upper $a/2$ quantiles of a $\chi^{2}$ distribution with $r$ degrees of freedom. Then $\frac{\beta}{2(n\beta+1)}\chi^{2}_{a/2}(r)$ and $\frac{\beta}{2(n\beta+1)}\chi^{2}_{1-a/2}(r)$ are the (1 − a)·100 percent probability limits for λ.
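As a numerical illustration, the following sketch evaluates the probability limits in (1); the prior parameters and the data are hypothetical values chosen only for illustration.

```python
# A minimal sketch of interval (1), under assumed values of the prior
# parameters and the Poisson data (none of these come from the text).
from scipy.stats import chi2, gamma

alpha_prior, beta_prior = 2.0, 1.5         # hypothetical Gamma prior
x = [3, 1, 4, 2, 5, 3]                     # hypothetical Poisson sample
n, s = len(x), sum(x)
a = 0.05                                   # 1 - a = 0.95 credibility

shape = s + alpha_prior                    # posterior shape: sum(x) + alpha
scale = beta_prior / (n * beta_prior + 1)  # posterior scale: beta/(n*beta + 1)

# Via the chi-square pivot 2*(n*beta + 1)*lambda/beta ~ chi2(2*shape):
df = 2 * shape
lo = scale / 2 * chi2.ppf(a / 2, df)
hi = scale / 2 * chi2.ppf(1 - a / 2, df)

# Equivalently, straight from the Gamma posterior quantiles:
assert abs(lo - gamma.ppf(a / 2, shape, scale=scale)) < 1e-9
print(f"95% probability limits for lambda: ({lo:.3f}, {hi:.3f})")
```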

Relationship to Confidence Limits For the example above, the confidence limits can be obtained from the distribution of the sample sum $Y=\sum_{i=1}^{n}X_i$. Note that the distribution of $Y$ is also Poisson, with mean $\lambda^{*}=n\lambda$. The lower limit $\lambda_L$ of a (1 − a)·100 percent confidence interval for λ can be selected so that

$$\sum_{k=y}^{\infty}\frac{(n\lambda_L)^{k}e^{-n\lambda_L}}{k!}=\frac{a}{2},\tag{2}$$

where $y=\sum_{i=1}^{n}x_i$ is the observed sample sum,

and the upper limit $\lambda_U$ of a (1 − a)·100 percent confidence interval for λ can be solved from

$$\sum_{k=0}^{y}\frac{(n\lambda_U)^{k}e^{-n\lambda_U}}{k!}=\frac{a}{2}.\tag{3}$$

In general, equations such as (2) and (3) above cannot be solved algebraically; the solutions are obtained using a search procedure on a computer.
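The sketch below shows one such search procedure, applying a standard root finder to (2) and (3) for the same hypothetical data as above; the bracketing interval passed to the root finder is an illustrative assumption.

```python
# A minimal sketch of a numerical solution of (2) and (3); the data and
# the search bracket are assumptions for illustration.
from scipy.stats import poisson
from scipy.optimize import brentq

x = [3, 1, 4, 2, 5, 3]      # hypothetical Poisson observations
n, y = len(x), sum(x)       # y = observed sample sum (here y >= 1)
a = 0.05                    # 1 - a = 0.95 confidence level

# (2): the lower limit solves P(Y >= y) = a/2 with Y ~ Poisson(n*lam).
lam_lo = brentq(lambda lam: poisson.sf(y - 1, n * lam) - a / 2, 1e-9, 100.0)

# (3): the upper limit solves P(Y <= y) = a/2 with Y ~ Poisson(n*lam).
lam_hi = brentq(lambda lam: poisson.cdf(y, n * lam) - a / 2, 1e-9, 100.0)

print(f"exact 95% confidence limits for lambda: ({lam_lo:.3f}, {lam_hi:.3f})")
```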

The asymptotic confidence limits for λ may be obtained from the normal approximation

$$\frac{\bar{x}-\lambda}{\sqrt{\lambda/n}}\;\approx\;N(0,1),$$

that is, by solving $|\bar{x}-\lambda|\le z_{a/2}\sqrt{\lambda/n}$ for λ, where $z_{a/2}$ is the upper $a/2$ quantile of the standard normal distribution.

So the (1 − a)·100 percent confidence limits for λ are

$$\bar{x}+\frac{z_{a/2}^{2}}{2n}\pm z_{a/2}\sqrt{\frac{\bar{x}}{n}+\frac{z_{a/2}^{2}}{4n^{2}}}.\tag{4}$$

In addition, a further approximation of the confidence limits for λ, using the standard error $\sqrt{\bar{x}/n}$ of $\hat{\lambda}=\bar{x}$, is

$$\bar{x}\pm z_{a/2}\sqrt{\frac{\bar{x}}{n}}.\tag{5}$$
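For comparison, this sketch evaluates the approximate limits (4) and (5) on the same hypothetical data:

```python
# A minimal sketch of the asymptotic limits (4) and the cruder
# approximation (5); the data are the same illustrative values as above.
from math import sqrt
from scipy.stats import norm

x = [3, 1, 4, 2, 5, 3]
n = len(x)
xbar = sum(x) / n
z = norm.ppf(1 - 0.05 / 2)               # z_{a/2} for a 95% level

# (4): solve |xbar - lam| <= z*sqrt(lam/n) for lam (a quadratic in lam).
center = xbar + z**2 / (2 * n)
half = z * sqrt(xbar / n + z**2 / (4 * n**2))
print("limits (4):", (center - half, center + half))

# (5): plug the estimated standard error sqrt(xbar/n) into xbar +/- z*SE.
half5 = z * sqrt(xbar / n)
print("limits (5):", (xbar - half5, xbar + half5))
```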

Comparing (1) with (4) or (5) shows that they are quite different: probability limits combine subjective knowledge of λ with the objective information from the sample data, whereas confidence limits contain purely objective sample information.

There are situations in which the confidence limits and credible limits are numerically identical. For example, if μ is a location parameter of a normal distribution, and the prior for μ is uniform, then the 95 percent central confidence limits are identical to the 95 percent highest density credible limits.
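A minimal numeric check of this coincidence, assuming a known standard deviation σ and hypothetical summary statistics (under the uniform prior, the posterior for μ is $N(\bar{x},\sigma^{2}/n)$):

```python
# Sketch: with a flat prior on a normal location parameter mu and known
# sigma, central confidence limits equal highest-density credible limits.
# The summary statistics below are illustrative assumptions.
from math import sqrt
from scipy.stats import norm

xbar, sigma, n = 8.2, 3.0, 25
z = norm.ppf(0.975)

# Classical 95% central confidence limits for mu:
conf = (xbar - z * sigma / sqrt(n), xbar + z * sigma / sqrt(n))

# Bayesian 95% highest-density credible limits from N(xbar, sigma^2/n):
post = norm(loc=xbar, scale=sigma / sqrt(n))
cred = (post.ppf(0.025), post.ppf(0.975))

print(conf)
print(cred)   # numerically identical to the confidence limits
```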

PROBABILITY LIMITS IN ASYMPTOTIC THEORY

For a sequence of random variables $\{X_n\}$, if there exists a real number c such that for every small positive number ε the probability that the absolute difference between $X_n$ and c is less than ε has the limit 1 as n → ∞, namely,

$$\lim_{n\to\infty}P\bigl(|X_n-c|<\varepsilon\bigr)=1,$$

then we say that $\{X_n\}$ converges in probability to the constant c, and c is called the probability limit of $\{X_n\}$ (White 1984). See Simons (1971) for how to identify probability limits.

For example, if the average of n independent, identically distributed random variables $Y_i$, $i=1,\dots,n$, with common mean μ is given by $\bar{Y}_n=\frac{1}{n}\sum_{i=1}^{n}Y_i$, then, as n → ∞, $\bar{Y}_n$ converges to μ in probability. The common mean μ of the random variables $Y_i$ is the probability limit of the sequence of random variables $\{\bar{Y}_n\}$. This result is known as the weak law of large numbers. In the context of estimation, we also call $\{\bar{Y}_n\}$ a (weakly) consistent sequence of estimators for μ.
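A short simulation (with an assumed exponential population and illustrative constants) shows this convergence empirically: the proportion of sample paths with $|\bar{Y}_n-\mu|<\varepsilon$ approaches 1 as n grows.

```python
# A minimal sketch of the weak law of large numbers; the exponential
# population, mu, eps, and the replication count are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
mu, eps, reps = 2.0, 0.1, 1000

for n in (10, 100, 1000, 10000):
    # reps independent sample paths, each of length n, with mean mu
    ybar = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)
    # estimate of P(|Ybar_n - mu| < eps), which tends to 1
    print(n, np.mean(np.abs(ybar - mu) < eps))
```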

More generally, convergence in probability means that a sequence of random variables $\{X_n\}$ converges to a random variable X, which is not necessarily a constant. The term probability limit, however, does not appear to be used for the limiting random variable in this general case.

In probability theory there exist several different notions of convergence of random variables. The convergence of sequences of random variables to some limiting random variable (illustrated below) is an important concept in probability theory, and in its applications to statistics and stochastic processes. The following section presents a brief overview of the various modes of convergence of a random variable sequence, and the relationships between them (for greater detail, see Parzen 1960).

Modes of Convergence of a Random Variable Sequence Suppose that $\{F_n\}$ is a sequence of cumulative distribution functions corresponding to random variables $\{X_n\}$, and that F is a distribution function corresponding to a random variable X. We say that the sequence $\{X_n\}$ converges to X in distribution if

$$\lim_{n\to\infty}F_n(a)=F(a)$$

for every real number a at which F is continuous. This is the notion of convergence used in the central limit theorem.

A sequence of random variables $\{X_n\}$ is called convergent in probability to a random variable X if, for any ε > 0,

$$\lim_{n\to\infty}P\bigl(|X_n-X|\ge\varepsilon\bigr)=0.$$

Convergence in probability is the notion of convergence used in the weak law of large numbers.

A sequence of random variables $\{X_n\}$ is called convergent in mean square to a random variable X if

$$\lim_{n\to\infty}E\bigl[(X_n-X)^{2}\bigr]=0.$$

More generally, for any p > 0, a sequence of random variables $\{X_n\}$ is called convergent in p-th mean to a random variable X if

$$\lim_{n\to\infty}E\bigl[|X_n-X|^{p}\bigr]=0.$$

A sequence of random variables $\{X_n\}$ is called convergent almost surely to a random variable X if

$$P\Bigl(\lim_{n\to\infty}X_n=X\Bigr)=1.$$

A sequence of random variables $\{X_n\}$ is called convergent surely to a random variable X if

$$\lim_{n\to\infty}X_n(\omega)=X(\omega)\quad\text{for every }\omega\in\Omega,$$

where Ω denotes the underlying sample space.

Relationships between Various Modes of Convergence There are a few important connections between these modes of convergence. Convergence in distribution is the weakest form of convergence, and is in fact sometimes called weak convergence. It does not, in general, imply any other mode of convergence; however, convergence in distribution is implied by all other modes of convergence mentioned herein. Convergence in probability is the second-weakest form of convergence. It implies convergence in distribution, but generally does not imply any other mode of convergence. However, convergence in probability is implied by all other modes of convergence mentioned herein, except convergence in distribution. If $\{X_n\}$ converges in distribution to a constant c, then $\{X_n\}$ converges in probability to c; that is, when the limit is a constant, convergence in probability and convergence in distribution are equivalent. Both convergence in mean square and almost sure convergence imply convergence in both distribution and probability. However, the converse is not generally true: convergence in mean square does not imply almost sure convergence, or vice versa.
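The gap between these modes can be seen in a standard textbook example (not from the original article): independent $X_n\sim\text{Bernoulli}(1/n)$. Since $P(X_n=1)=1/n\to 0$, the sequence converges in probability to 0; yet because $\sum 1/n$ diverges, the Borel-Cantelli lemma implies that ones occur infinitely often along almost every path, so there is no almost sure convergence. The simulation below is only suggestive of this behavior.

```python
# A minimal sketch: one sample path of independent X_n ~ Bernoulli(1/n).
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
n = np.arange(1, N + 1)
x = rng.random(N) < 1.0 / n      # X_n = 1 with probability 1/n

# Convergence in probability: P(X_n = 1) = 1/n -> 0.
print("P(X_N = 1) =", 1.0 / N)

# No almost sure convergence: ones keep appearing arbitrarily late
# (the index of the last observed one is typically of order N).
print("last index with X_n = 1:", n[x][-1])
```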

LEGITIMATE CRITICISMS

In order to avoid possible confusion in using the term probability limits in the two different contexts discussed above, we suggest using credible limits for Bayesian probability limits, and referring to the limit of a sequence of random variables that is convergent in probability as the limit in probability when discussing asymptotic theory.

SEE ALSO Bayesian Econometrics; Bayesian Statistics; Econometrics; Probability Distributions; Probability Theory; Probability, Subjective; Statistics

BIBLIOGRAPHY

Bernardo, José M., and Adrian F. M. Smith. 1994. Bayesian Theory. New York: Wiley.

Hogg, Robert V., Joseph W. McKean, and Allen T. Craig. 2005. Introduction to Mathematical Statistics. 6th ed. Toronto: Pearson Prentice Hall.

Parzen, Emanuel. 1960. Modern Probability Theory and Its Applications. New York: Wiley.

Simons, Gordon. 1971. Identifying Probability Limits. Annals of Mathematical Statistics 42 (4): 1429-1433.

White, Halbert. 1984. Asymptotic Theory for Econometricians. New York: Academic Press.

Xiaojian Xu