An important remark about confidence levels

A further point concerns the correct interpretation of the confidence level. Consider the 95% confidence interval we calculated in Example 9.8. We cannot say that the confidence interval (34.0893, 84.5107) contains the unknown expected value with probability 95%. What we can say is that if we repeat the sampling procedure many times and compute a confidence interval for each sample, about 95% of those intervals will contain the true unknown expected value. But there is no probabilistic statement we can make about a specific confidence interval.

To clarify the point, let us consider the simpler case of a normal random variable X with expected value 0. We can say that P(X ≤ 0) = 0.5. But if we observe a realization x = −0.55, we certainly cannot say that P(−0.55 ≤ 0) = 0.5. More generally, if μ is the expected value of a random variable X and x is a realization of that variable, the expression P(x ≤ μ) is meaningless, since it involves two numbers and no random variable. The condition x ≤ μ is either true or false, and since we do not know μ, we cannot tell which.

By the same token, the statement about the confidence level 1 − α of a confidence interval applies a priori, i.e., to the pair of random variables that provide us with the lower and upper bounds of the interval, before the sample is observed. It would be wrong to claim that a computed confidence interval provides us with probabilistic information about the expected value. Strictly speaking, this holds within the framework of orthodox statistics, whereby the expected value μ is a number. Within a Bayesian framework, we do associate a probability distribution with an unknown parameter; this distribution can be the result of merging a priori beliefs with empirical evidence from sampled observations.
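The repeated-sampling interpretation can be checked with a small simulation. The sketch below is only illustrative and rests on assumptions that are not part of Example 9.8: the population is normal with a known standard deviation, so the interval is built with a standard normal quantile rather than a t quantile, and the values of mu, sigma, n, and the number of replications are hypothetical. It counts how often the computed interval contains the true expected value.

```python
import random
from statistics import NormalDist, fmean


def coverage_fraction(mu=10.0, sigma=2.0, n=20, alpha=0.05,
                      n_rep=10_000, seed=42):
    """Fraction of simulated confidence intervals that contain mu.

    Assumption for illustration: normal population with KNOWN sigma,
    so the interval is xbar +/- z * sigma / sqrt(n).
    All default parameter values are hypothetical.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    half_width = z * sigma / n ** 0.5        # same for every sample
    hits = 0
    for _ in range(n_rep):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        lower, upper = xbar - half_width, xbar + half_width
        # For a realized interval, the condition is simply true or false.
        hits += lower <= mu <= upper
    return hits / n_rep


coverage = coverage_fraction()
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The printed fraction should be close to 0.95, which is a statement about the procedure over many samples. Each individual interval inside the loop either contains mu or it does not.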

To reinforce a clear view of what a confidence interval is, we may consider the general definition of interval estimators and estimates.

DEFINITION 9.5 (Interval estimate and estimator) An interval estimate of a real-valued parameter θ is a pair of functions L(x) and U(x), where x = (x₁, …, xₙ) and n is the size of the sample, such that L(x) ≤ U(x) for any x in the range of interest. If the sample X = x is observed, the inference L(x) ≤ θ ≤ U(x) is made. The random interval [L(X), U(X)] is called an interval estimator.

This definition includes standard confidence intervals as a specific case, and it clearly points out that the interval estimator is a pair of random variables. The interval estimate consists of the realization of the two random variables. Any probabilistic statement must refer to estimators, and not to estimates.
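The distinction between estimator and estimate can be made concrete by writing L and U as ordinary functions of the sample. The following is a minimal sketch under assumptions of our own: a normal population with known standard deviation SIGMA, a 95% confidence level, and hypothetical helper names lower, upper, and draw_sample. Applying the two functions to one observed sample yields an interval estimate, i.e., two plain numbers; applying them to fresh samples shows that the bounds themselves vary from sample to sample, which is what makes the estimator random.

```python
import random
from statistics import NormalDist, fmean

SIGMA = 2.0                      # assumed known standard deviation (hypothetical)
Z = NormalDist().inv_cdf(0.975)  # standard normal quantile for a 95% level


def lower(x):
    """L(x): lower bound computed from a sample x."""
    return fmean(x) - Z * SIGMA / len(x) ** 0.5


def upper(x):
    """U(x): upper bound computed from a sample x."""
    return fmean(x) + Z * SIGMA / len(x) ** 0.5


rng = random.Random(0)


def draw_sample(n=20, mu=10.0):
    """Draw one sample of size n; mu would be unknown in practice."""
    return [rng.gauss(mu, SIGMA) for _ in range(n)]


# Interval estimate: L and U evaluated at one observed sample.
x = draw_sample()
estimate = (lower(x), upper(x))
print("one interval estimate:", estimate)

# Interval estimator: the same functions applied to new samples
# give different bounds each time.
estimates = [(lower(s), upper(s)) for s in (draw_sample() for _ in range(5))]
for lo, up in estimates:
    print(f"[{lo:.4f}, {up:.4f}]")
```

Once estimate has been computed, it contains no randomness at all; any probability statement refers to the mechanism that generates intervals such as those collected in estimates.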

