We are already familiar with the concept of conditional probability when events are involved. When dealing with random variables X and Y, we might wonder whether knowing something about Y, possibly even its realized value, can help us in predicting the value of X. To introduce the concepts in the simplest way, it is a good idea to work with a pair of discrete random variables with finite support. So, let us consider a variable X that can take values x_i, i = 1, …, k, and a variable Y that can take values y_j, j = 1, …, l. Given the joint PMF, we know all of the relevant probabilities

$$p_{XY}(x_i, y_j) = P(X = x_i, Y = y_j), \qquad i = 1, \ldots, k, \quad j = 1, \ldots, l,$$

and we may also consider conditional probabilities, such as

$$P(X = x_i \mid Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)},$$

assuming, of course, that P(Y = y_j) ≠ 0. Generalizing a bit, let us define the conditional PMF:

$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{p_{XY}(x, y)}{p_Y(y)}.$$

It is essential to note that, if the two random variables are independent, then

$$p_{X|Y}(x \mid y) = \frac{p_X(x)\, p_Y(y)}{p_Y(y)} = p_X(x),$$

i.e., knowledge of Y is of no use in predicting X. If the two variables are not independent, one natural question concerns the expected value of X if we know that Y = y_j. Such a conditional expectation is obtained as follows:

$$\mathrm{E}[X \mid Y = y_j] = \sum_{i=1}^{k} x_i \, p_{X|Y}(x_i \mid y_j).$$
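
As a minimal sketch of these definitions, the following Python fragment computes a conditional PMF and a conditional expectation from a joint PMF stored as a table. The function names, the table layout (rows indexed by the values of X, columns by the values of Y), and the numbers themselves are assumptions made only for illustration; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

def marginal_y(joint):
    """Marginal PMF of Y: sum the joint PMF over the rows (the values of X)."""
    n_cols = len(joint[0])
    return [sum(row[j] for row in joint) for j in range(n_cols)]

def conditional_pmf(joint, j):
    """PMF of X given Y = y_j; it is defined only if P(Y = y_j) != 0."""
    p_yj = marginal_y(joint)[j]
    if p_yj == 0:
        raise ValueError("conditioning event has zero probability")
    return [row[j] / p_yj for row in joint]

def conditional_expectation(x_values, joint, j):
    """E[X | Y = y_j] as the sum over i of x_i * p_{X|Y}(x_i | y_j)."""
    return sum(x * p for x, p in zip(x_values, conditional_pmf(joint, j)))

# Hypothetical data: X takes the values 1, 2, 3 (rows); Y takes two values (columns).
x_values = [1, 2, 3]
joint = [
    [Fraction(1, 10), Fraction(2, 10)],
    [Fraction(2, 10), Fraction(1, 10)],
    [Fraction(3, 10), Fraction(1, 10)],
]

cond_first = conditional_pmf(joint, 0)                    # PMF of X given the first value of Y
e_first = conditional_expectation(x_values, joint, 0)     # E[X | Y = first value]
e_second = conditional_expectation(x_values, joint, 1)    # E[X | Y = second value]
```

With this illustrative table, the two conditional expectations are 7/3 and 7/4, so the value of Y does carry information about X.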

Example 8.6 Let X and Y be two binary random variables whose joint distribution is characterized by the PMF:

images

Let us find the distribution of X conditional on Y = 0 or Y = 1. The first step is computing the marginal PMF of Y:

images

Then we first find p_{X|Y}(x | 0):

images

By the same token:

images

Now we may compute the conditional expected values:

images

Incidentally, the unconditional expected value of X is

images

We see that knowledge of Y does change our expectation about X. The two random variables are not independent.
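
The same sequence of steps (marginal PMF of Y, conditional PMFs of X, conditional and unconditional expected values, independence check) can be sketched in Python. The joint PMF below is a hypothetical one chosen for illustration, not the one of Example 8.6, and the variable names are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical joint PMF of two binary variables: joint[(x, y)] = P(X = x, Y = y).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(4, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(2, 10),
}
values = (0, 1)

# Step 1: marginal PMFs, obtained by summing out the other variable.
p_x = {x: sum(joint[(x, y)] for y in values) for x in values}
p_y = {y: sum(joint[(x, y)] for x in values) for y in values}

# Step 2: conditional PMF of X for each value of Y.
cond = {y: {x: joint[(x, y)] / p_y[y] for x in values} for y in values}

# Step 3: conditional expected values and the unconditional one.
cond_mean = {y: sum(x * cond[y][x] for x in values) for y in values}
mean_x = sum(x * p_x[x] for x in values)

# Independence holds only if the joint PMF factors into the marginals everywhere.
independent = all(
    joint[(x, y)] == p_x[x] * p_y[y] for x in values for y in values
)
```

For these made-up numbers, E[X | Y = 0] = 3/4 and E[X | Y = 1] = 1/3, whereas E[X] = 1/2, and the factorization test fails, which is the same qualitative conclusion reached in the example.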

The case of two jointly continuous random variables is conceptually similar, and it relies on the definition of the following conditional PDF:

$$f_{X|Y}(x \mid y) = \frac{f_{XY}(x, y)}{f_Y(y)}$$

for y such that f_Y(y) ≠ 0. It is no surprise that we cannot divide by a probability P(Y = y), as this is zero for every y when Y is continuous; we divide by the marginal density instead, but the concept is quite similar to the discrete case.
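
To make the continuous case concrete, here is a small numerical sketch. It assumes a hypothetical joint density f_XY(x, y) = x + y on the unit square (not taken from the text), for which the marginal density is f_Y(y) = y + 1/2 and the conditional density is (x + y)/(y + 1/2). The helper names and the midpoint integration rule are illustrative choices as well.

```python
def joint_pdf(x, y):
    """Hypothetical joint density: x + y on the unit square, zero elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(func, a, b, n=400):
    """Midpoint rule on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def marginal_pdf_y(y):
    """f_Y(y), obtained by integrating the joint density over x."""
    return integrate(lambda x: joint_pdf(x, y), 0.0, 1.0)

def conditional_pdf(x, y):
    """f_{X|Y}(x | y) = f_XY(x, y) / f_Y(y), defined only where f_Y(y) != 0."""
    f_y = marginal_pdf_y(y)
    if f_y == 0.0:
        raise ValueError("f_Y(y) is zero; the conditional density is undefined")
    return joint_pdf(x, y) / f_y

def conditional_mean(y):
    """E[X | Y = y]: integral of x * f_XY(x, y) over x, divided by f_Y(y)."""
    f_y = marginal_pdf_y(y)
    if f_y == 0.0:
        raise ValueError("f_Y(y) is zero; the conditional mean is undefined")
    return integrate(lambda x: x * joint_pdf(x, y), 0.0, 1.0) / f_y

mean_given_half = conditional_mean(0.5)
```

For y = 0.5 the marginal density equals 1, the conditional density integrates to 1 over x, and the conditional mean is close to the exact value 7/12.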

Conditioning is a useful concept that can be exploited for several purposes, among them:

  1. To simplify the calculations of expectations
  2. To characterize properties of some probability distributions
  3. To characterize properties of certain stochastic processes

We illustrate these points in the following sections.

