A recurring task in applications is summing random variables. If we have n random variables Xi, i = 1,…, n, we may build another random variable

Y = X1 + X2 + ⋯ + Xn.
What can we say about the distribution of Y? The answer depends on two important features of the terms in the sum:
- Is the distribution of all of the Xi the same?
- Are the involved variables independent?
We will clarify what we mean by “independent random variables” formally later, but what we know about independent events and conditional probabilities is enough to get the overall idea: two variables are independent if knowing the realization of one of them does not help us predict the realization of the other.
DEFINITION 7.5 (i.i.d. variables) We say that the variables Xi, i = 1,…, n, are i.i.d. if they are independent and identically distributed.
Arguably, the case of i.i.d. variables is the easiest we may think of. Unfortunately, even in this case, characterizing the distribution of the sum of random variables is no trivial task. It might be tempting to think that the distribution of Y should be, at least qualitatively, similar to the distribution of the Xi, but a simple counterexample shows that this is not the case.
Fig. 7.17 Sampling the sum of two i.i.d. uniform variables.
Example 7.11 Consider two independent random variables, uniformly distributed between 0 and 10: U1, U2 ∼ U(0, 10). The support of their sum, Y = U1 + U2, is clearly the interval [0, 20], but what about the distribution? The analytical answer would require a particular form of integral (a convolution), but we may guess the answer by sampling this distribution with the help of statistical software. Figure 7.17 shows the histogram obtained by sampling 10,000 observations of the sum. A look at the plot suggests a triangular distribution. In fact, it can be shown that the distribution of Y is triangular, with support on the interval [0, 20] and mode m = 10.
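For readers who wish to reproduce the experiment, the following Python sketch samples 10,000 observations of Y = U1 + U2 and plots the histogram; the text does not specify which statistical software was used, so NumPy and Matplotlib are assumed here as one possible choice. The plot should display the triangular shape of Figure 7.17.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# Sample 10,000 observations of Y = U1 + U2, with U1, U2 ~ U(0, 10) independent
u1 = rng.uniform(0.0, 10.0, size=10_000)
u2 = rng.uniform(0.0, 10.0, size=10_000)
y = u1 + u2

# The histogram should look triangular on [0, 20], peaking near 10
plt.hist(y, bins=50, density=True, edgecolor="black")
plt.xlabel("y")
plt.ylabel("relative frequency")
plt.title("Sum of two i.i.d. U(0, 10) variables")
plt.show()
```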
By the same token, if we sum two i.i.d. exponential variables, we do not get an exponential variable. There is, however, an important case in which the distribution is preserved by summing.
PROPERTY 7.6 The sum of jointly normal random variables is a normal random variable.
It is important to note that this property does not assume independence: it applies even to normal variables that are not independent and that have different parameters. The term “jointly” may be puzzling, however. The point is that characterizing the joint distribution of random variables is not as simple as it may seem. It is not enough to specify the distribution of each single variable, as this provides us with no clue about their joint behavior. The term above essentially says that we are dealing with a multivariate normal distribution, which we define later, in Section 8.4.
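As a quick numerical illustration of Property 7.6, the sketch below samples a pair of correlated jointly normal variables and inspects the histogram of their sum; the means, variances, and covariance used here are arbitrary values chosen for illustration, not values taken from the text.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Arbitrary jointly normal pair: different parameters and nonzero correlation
mean = [1.0, -2.0]
cov = [[4.0, 1.5],    # Var(X1) = 4, Cov(X1, X2) = 1.5
       [1.5, 9.0]]    # Var(X2) = 9
samples = rng.multivariate_normal(mean, cov, size=100_000)
y = samples[:, 0] + samples[:, 1]

# The sum of the two (correlated, non-identically distributed) components
# is still bell-shaped, i.e., normal
plt.hist(y, bins=60, density=True)
plt.title("Sum of two correlated jointly normal variables")
plt.show()

print(y.mean())  # close to 1 + (-2) = -1
```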
So, the results concerning the general distribution of the sum of random variables are somewhat discouraging, but we recall that something more can be said if we settle for the basic features of a random variable, i.e., expected value and variance. We stated a couple of properties when dealing with discrete random variables, and they carry over to the continuous case.
PROPERTY 7.7 (Expected value of a sum of random variables) The expected value of the sum of random variables is the sum of their expected values, assuming that they exist:

E[X1 + X2 + ⋯ + Xn] = E[X1] + E[X2] + ⋯ + E[Xn].
PROPERTY 7.8 (Variance of a sum of independent random variables) The variance of the sum of independent random variables is the sum of their variances, assuming that they exist:

Var(X1 + X2 + ⋯ + Xn) = Var(X1) + Var(X2) + ⋯ + Var(Xn).
It is important to notice that the two properties above do not require variables to be identically distributed. The property about variance does require independence, however.
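A small simulation can make Properties 7.7 and 7.8 concrete. The following sketch (a hypothetical check in Python, not part of the text) sums a uniform and an exponential variable, which are independent but certainly not identically distributed, and compares the sample mean and variance of the sum with the sums of the theoretical means and variances.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Independent, but NOT identically distributed, variables:
# X1 ~ U(0, 10): mean 5, variance 100/12; X2 ~ Exp with mean 2: variance 4
x1 = rng.uniform(0.0, 10.0, size=n)
x2 = rng.exponential(scale=2.0, size=n)
y = x1 + x2

print(y.mean())              # close to 5 + 2 = 7
print(y.var())               # close to 100/12 + 4 ≈ 12.33
print(x1.var() + x2.var())   # essentially the same value
```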
Example 7.12 Consider two independent normal variables, X1 ∼ N(10, 25) and X2 ∼ N(−8, 16). Then, the sum Y = X1 + X2 is a normal random variable, with expected value

E[Y] = 10 + (−8) = 2

and standard deviation

σY = √(25 + 16) = √41 ≈ 6.40.
Note that we cannot add standard deviations; doing so would lead to a wrong result (5 + 4 = 9).
The example illustrates the fact that, when variables are independent, we may sum variances, but not standard deviations:

√(Var(X) + Var(Y)) ≠ √Var(X) + √Var(Y).
Another important remark concerns the sum and the difference of independent random variables. The property implies Var(X + Y) = Var(X) + Var(Y), but what about their difference? We must apply the property carefully:

Var(X − Y) = Var(X + (−Y)) = Var(X) + Var(−Y) = Var(X) + (−1)² Var(Y) = Var(X) + Var(Y).
We see that the variance of a difference is not the difference of the variances; it is also worth noting that taking differences of variances would easily lead to nonsense, since we could obtain a negative variance.
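To check both points numerically, here is a short Python verification using the parameters of Example 7.12: the sample variances of both the sum and the difference should be close to 41, whereas summing standard deviations (5 + 4 = 9) or subtracting variances (25 − 16 = 9) gives the wrong answer.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Independent normals from Example 7.12: X1 ~ N(10, 25), X2 ~ N(-8, 16)
x1 = rng.normal(loc=10.0, scale=5.0, size=n)   # std = sqrt(25) = 5
x2 = rng.normal(loc=-8.0, scale=4.0, size=n)   # std = sqrt(16) = 4

print(np.var(x1 + x2))   # close to 25 + 16 = 41
print(np.std(x1 + x2))   # close to sqrt(41) ≈ 6.40, not 5 + 4 = 9
print(np.var(x1 - x2))   # also close to 41, not 25 - 16 = 9
```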