In the previous section we formally introduced the concept of the joint cumulative distribution function (CDF). In the case of two random variables, X and Y, this is a function F_{X,Y}(x, y) of two arguments, giving the probability of the joint event {X ≤ x, Y ≤ y}:

F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y).
The joint CDF tells the whole story about how the two random variables are linked. Then, on the basis of the joint CDF, we may also define a joint probability mass function (PMF) p_{X,Y}(x, y) for discrete random variables, and a joint probability density function (PDF) f_{X,Y}(x, y) for continuous random variables. These concepts translate directly to the more general case of n jointly distributed random variables.
If we have the joint PMF or PDF, we may compute the usual things, such as expected values, variances, and expected values of functions of the random variables. The most general statement, when two random variables X and Y are involved, concerns the expected value of a function g(X, Y) of the two random variables. It can be computed as

E[g(X, Y)] = Σ_x Σ_y g(x, y) · p_{X,Y}(x, y)

in the discrete case, and as

E[g(X, Y)] = ∫∫ g(x, y) · f_{X,Y}(x, y) dx dy

in the continuous case.
To find the expected value E[X], all we have to do is plug g(x, y) = x into the formulas above; if we are interested in the variance Var(Y), then we plug in g(x, y) = (y − μ_Y)², where μ_Y = E[Y]. We will not really have to compute such things in the remainder, but the reader may appreciate the potential difficulty of computing multiple sums or integrals.
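As a concrete illustration of the discrete case, the following Python sketch (using NumPy) evaluates the double sum numerically; the support points and the joint PMF are made-up values chosen only for the example.

```python
import numpy as np

# A small, made-up joint PMF: X takes values in {0, 1, 2} (rows),
# Y takes values in {10, 20} (columns).
x_vals = np.array([0.0, 1.0, 2.0])
y_vals = np.array([10.0, 20.0])
p_xy = np.array([
    [0.10, 0.15],
    [0.20, 0.25],
    [0.10, 0.20],
])
assert np.isclose(p_xy.sum(), 1.0)  # a PMF must sum to one

def expect(g):
    """E[g(X, Y)] = sum_x sum_y g(x, y) * p_{X,Y}(x, y)."""
    gx, gy = np.meshgrid(x_vals, y_vals, indexing="ij")
    return np.sum(g(gx, gy) * p_xy)

mu_x = expect(lambda x, y: x)                  # E[X]
mu_y = expect(lambda x, y: y)                  # E[Y]
var_y = expect(lambda x, y: (y - mu_y) ** 2)   # Var(Y)
print(mu_x, mu_y, var_y)
```

The same function expect handles every choice of g(x, y), which mirrors how the single general formula above covers expected values, variances, and other moments.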
Apart from this computational difficulty, the very task of characterizing the full joint distribution of random variables may be difficult. It is even more difficult to infer this information from empirical data. This is why one often settles for a limited characterization of dependence in terms of correlation. Before doing so, it is quite useful to understand how independence may simplify the analysis significantly. We will state, without any proof, a few fundamental properties that hold under the assumption of independence. All of these results are a consequence of a well-known fact: if events are independent, the probability of a joint event is just the product of the probabilities of all of the individual events. In the case of the joint CDF, this implies:

F_{X,Y}(x, y) = F_X(x) · F_Y(y).
In other words, the joint CDF can be factored into the product of the two marginal CDFs, F_X(x) and F_Y(y). The same factorization applies to the joint PMF and PDF:

p_{X,Y}(x, y) = p_X(x) · p_Y(y),     f_{X,Y}(x, y) = f_X(x) · f_Y(y).
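As a small numerical check, the sketch below builds a joint PMF for independent discrete variables as the outer product of two made-up marginals, and contrasts it with a dependent joint PMF that has the same marginals but does not factor.

```python
import numpy as np

# Marginal PMFs of two independent discrete random variables (made-up values).
p_x = np.array([0.2, 0.5, 0.3])   # PMF of X over three support points
p_y = np.array([0.6, 0.4])        # PMF of Y over two support points

# Under independence the joint PMF is the outer product of the marginals.
p_xy_indep = np.outer(p_x, p_y)
marg_x = p_xy_indep.sum(axis=1)   # recover the marginal of X from the joint
marg_y = p_xy_indep.sum(axis=0)   # recover the marginal of Y from the joint
print(np.allclose(p_xy_indep, np.outer(marg_x, marg_y)))  # True

# A dependent joint PMF with the same marginals does not factor.
p_xy_dep = np.array([
    [0.20, 0.00],
    [0.10, 0.40],
    [0.30, 0.00],
])
mx, my = p_xy_dep.sum(axis=1), p_xy_dep.sum(axis=0)
print(np.allclose(p_xy_dep, np.outer(mx, my)))  # False
```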
It is by using this factorization that we can prove, e.g., the property concerning the variance of a linear combination of independent random variables, which we have already used a few times and which is recalled here for the reader's convenience:

Var(λ_1 X_1 + λ_2 X_2 + ⋯ + λ_n X_n) = λ_1² Var(X_1) + λ_2² Var(X_2) + ⋯ + λ_n² Var(X_n),
for random variables X_i and coefficients λ_i, i = 1, …, n. Later, in Section 8.3.2, we discuss the general case where independence is not assumed. In passing, we may also state the following theorem.
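A quick Monte Carlo check of this property is sketched below; the three distributions, the coefficients, and the sample size are arbitrary choices made only for illustration, and the agreement is approximate rather than exact.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 1_000_000

# Three independent random variables with arbitrary (made-up) distributions.
x1 = rng.normal(loc=1.0, scale=2.0, size=n_samples)   # Var(X1) = 4
x2 = rng.exponential(scale=3.0, size=n_samples)       # Var(X2) = 9
x3 = rng.uniform(low=0.0, high=6.0, size=n_samples)   # Var(X3) = 3
lam = np.array([2.0, -1.0, 0.5])

# Sample variance of the linear combination vs. the sum of scaled variances.
combo = lam[0] * x1 + lam[1] * x2 + lam[2] * x3
lhs = combo.var()
rhs = lam[0]**2 * 4.0 + lam[1]**2 * 9.0 + lam[2]**2 * 3.0
print(lhs, rhs)   # the two values should be close (here, about 25.75)
```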
THEOREM 8.2 Consider a function of random variables X and Y and assume that it can be factorized as the product of two terms g(X)h(Y). If the two random variables are independent, then E[g(X)h(Y)] = E[g(X)]·E[h(Y)].
In particular, it is important to notice that the expected value of a sum is always the sum of the expected values, but this interchange cannot be applied to products in general:

E[XY] ≠ E[X] · E[Y]   (in general).
However, equality does hold if X and Y are independent, as we see by taking g(X) = X and h(Y) = Y in Theorem 8.2.
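The following Monte Carlo sketch makes the point numerically; the choice of a standard normal X and the two constructions of Y are arbitrary and serve only to contrast a dependent case with an independent one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)

# Dependent case: Y = X, so E[XY] = E[X^2] = 1 while E[X] * E[Y] is about 0.
y_dep = x
print(np.mean(x * y_dep), np.mean(x) * np.mean(y_dep))

# Independent case: Y is drawn separately, and E[XY] is close to E[X] * E[Y].
y_ind = rng.normal(size=n)
print(np.mean(x * y_ind), np.mean(x) * np.mean(y_ind))
```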