TOTAL PROBABILITY AND BAYES’ THEOREMS

Conditional probabilities are a very important and powerful concept. In this section we see how we may tackle problems like the one in Example 5.2, which we use as a guideline. To frame the problem clearly, let us define the following events:

  • isM: the child is a male.
  • isF: the child is a female.
  • TM: the test predicts a male.
  • TF: the test predicts a female.

Now the first question is: What do we know and what would we like to know? The problem statement provides us with the following conditional probabilities:

images

from which we may also infer

images

Using conditional probabilities, we also see that what we need is, in a sense, to invert the conditioning in the probabilities above, since we need to compare the following two conditional probabilities:

$$P(\mathrm{isM} \mid \mathrm{TM}) \qquad \text{and} \qquad P(\mathrm{isF} \mid \mathrm{TF}).$$

To see how we can accomplish our task, let us abstract a little and consider two events E and F. Intersection is a commutative operation:

$$E \cap F = F \cap E.$$

Using the definition of conditional probability, we may write

$$\begin{aligned} P(E \cap F) &= P(E \mid F)\,P(F), \\ P(F \cap E) &= P(F \mid E)\,P(E), \end{aligned}$$

but since the two left-hand sides are the same, we may also conclude that

$$P(E \mid F)\,P(F) = P(F \mid E)\,P(E).$$

We have proved the following theorem.

THEOREM 5.6 (Bayes’ theorem) Given two events E and F, we have

$$P(F \mid E) = \frac{P(E \mid F)\,P(F)}{P(E)},$$

provided that P(E) ≠ 0.
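As a minimal sketch, the theorem can be written as a one-line function. The name `bayes` and the numbers in the usage line are hypothetical and chosen only for illustration; they are not taken from Example 5.2.

```python
from fractions import Fraction


def bayes(p_e_given_f, p_f, p_e):
    """Return P(F | E) = P(E | F) P(F) / P(E); requires P(E) != 0."""
    if p_e == 0:
        raise ValueError("P(E) must be nonzero")
    return p_e_given_f * p_f / p_e


# Hypothetical inputs: P(E | F) = 3/4, P(F) = 1/5, P(E) = 3/10.
print(bayes(Fraction(3, 4), Fraction(1, 5), Fraction(3, 10)))  # 1/2
```

Exact fractions are used here so that the result can be checked by hand; ordinary floats work as well.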

Immediate application to our problem yields

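$$P(\mathrm{isM} \mid \mathrm{TM}) = \frac{P(\mathrm{TM} \mid \mathrm{isM})\,P(\mathrm{isM})}{P(\mathrm{TM})}, \qquad P(\mathrm{isF} \mid \mathrm{TF}) = \frac{P(\mathrm{TF} \mid \mathrm{isF})\,P(\mathrm{isF})}{P(\mathrm{TF})}.$$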

Let us assume for simplicity that P(isM) = P(isF) = 0.5. In the relationship above, we need the probabilities of events TM and TF. Let us focus on the first one: In how many ways can the result of the test predict a male? Well, there are two cases:

  1. The child is indeed a male, and the test predicts the correct result; this happens with probability P(TM | isM)P(isM).
  2. The child is in fact a female, but the test is wrong; this happens with probability P(TM | isF)P(isF).

Since the two events are mutually exclusive, we may add the two probabilities to obtain

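$$P(\mathrm{TM}) = P(\mathrm{TM} \mid \mathrm{isM})\,P(\mathrm{isM}) + P(\mathrm{TM} \mid \mathrm{isF})\,P(\mathrm{isF}).$$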

A similar result holds for event TF. Let us pause a moment and generalize the result.

THEOREM 5.7 (Total probability theorem) Given a sample space Ω and a family of mutually exclusive and collectively exhaustive events H1, H2, …, Hn, the probability of event E can be expressed as

$$P(E) = \sum_{i=1}^{n} P(E \mid H_i)\,P(H_i).$$

The events H1, H2, …, Hn form a partition of the sample space, as illustrated in Fig. 5.6. Mutually exclusive means that they are pairwise disjoint:

$$H_i \cap H_j = \emptyset, \qquad i \neq j.$$

Fig. 5.6 An illustration of the total probability theorem.

Collectively exhaustive means that their union yields the whole sample space:

$$H_1 \cup H_2 \cup \cdots \cup H_n = \Omega.$$

Given such a partition, we see that we may cut the event E into a collection of n mutually exclusive “slices” E ∩ Hi that, when patched together, yield back event E. The total probability theorem is a very convenient way to decompose the calculation of a probability when we may slice the relevant event into disjoint pieces, as suggested in Fig. 5.6, and the conditional probabilities are easy to compute.
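The decomposition can be sketched in a few lines of code. The function name `total_probability` is hypothetical, and the test accuracies in the usage line are made-up values, not the ones given in Example 5.2; only the priors of 0.5 follow the assumption made in the text.

```python
from fractions import Fraction


def total_probability(likelihoods, priors):
    """Return P(E) = sum_i P(E | H_i) P(H_i) for a partition H_1, ..., H_n."""
    if len(likelihoods) != len(priors):
        raise ValueError("one likelihood is needed for each event H_i")
    # Collectively exhaustive and mutually exclusive events have priors summing to 1.
    if abs(sum(priors) - 1) > 1e-12:
        raise ValueError("the priors P(H_i) must sum to 1")
    return sum(l * p for l, p in zip(likelihoods, priors))


# Hypothetical accuracies: P(TM | isM) = 9/10 and P(TM | isF) = 1/5,
# with P(isM) = P(isF) = 1/2 as assumed in the text.
p_tm = total_probability(
    [Fraction(9, 10), Fraction(1, 5)],
    [Fraction(1, 2), Fraction(1, 2)],
)
print(p_tm)  # 11/20
```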

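If we put Bayes’ and total probability theorems together, we see that if H1, H2, …, Hn is a partition of the sample space, then for an event E with P(E) ≠ 0 we have the following equation:

$$P(H_i \mid E) = \frac{P(E \mid H_i)\,P(H_i)}{\sum_{k=1}^{n} P(E \mid H_k)\,P(H_k)}, \qquad i = 1, \ldots, n.$$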
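A possible sketch of the combined formula follows: it computes every posterior P(Hi | E) at once, using the total probability theorem for the denominator. The function name `posteriors` and the three-event numbers are hypothetical.

```python
from fractions import Fraction


def posteriors(likelihoods, priors):
    """Return the list of P(H_i | E), given P(E | H_i) and P(H_i) for a partition."""
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(E | H_i) P(H_i)
    p_e = sum(joint)  # total probability theorem
    if p_e == 0:
        raise ValueError("P(E) must be nonzero")
    return [j / p_e for j in joint]


# Hypothetical partition with three events.
post = posteriors(
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)],  # P(E | H_i)
    [Fraction(1, 5), Fraction(2, 5), Fraction(2, 5)],  # P(H_i)
)
print([str(p) for p in post])  # ['2/5', '2/5', '1/5']
```

Since the denominator is the same for every i, the posteriors always add up to 1, which is a quick sanity check on any hand calculation.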

Let us apply what we came up with to the gender prediction problem. The probability that Mary’s child is indeed a male is

images

By the same token, the probability that Frances’ child is indeed a female is

images

So, we see that Frances is the one who should be more confident about the gender of her child. We urge the reader to apply Bayes’ theorem to the illness problem of Section 1.2.2 and recover the result that we obtained there by informal reasoning.

Bayes’ theorem is fundamental in working with information, and it is the starting point of a whole branch of statistics, which we touch on elsewhere. To conclude the section, we consider a rather well-known puzzle.

Example 5.9 Consider a dumb but quite popular TV program, in which the participant sits in front of three boxes A, B, and C. One of the boxes contains a prize and the guy, who has no clue where the prize is, has to choose one. Say that he chooses A. The presenter knows where the prize is; he opens box C, showing that it does not contain the prize; then, he offers the participant the possibility of giving up the previous choice and switching to box B. Should the participant accept the offer?

When handed this question, the class typically divides into two camps:

  1. One school of thought maintains that there is no point in switching from box A to box B. A priori, the probability of finding the prize in each box was 1/3; now, with two boxes remaining, the two probabilities are just 1/2. Others go as far as to suggest that the presenter is cheating and trying to lure the participant into switching, in order to save the prize.
  2. Another school of thought maintains that indeed the probabilities were symmetric a priori, but now the probability “mass” associated with box C should shift to box B; then, the probability that the prize is in box B is now 2/3 and the participant would double the odds of winning by accepting the offer.

Students hinting at the possibility that the presenter is cheating do have a point. We must state clearly the assumptions behind his behavior. In real games like this one, there are in fact many boxes with different prizes, and one would think that there is an incentive to try stealing the big one from the lucky participant. However, perhaps a bigger incentive is to create suspense to keep the audience and make the game take more time, so that they can slip a few more juicy spots into the program. Therefore, let us assume that the presenter has no malicious intent and that his aim is just to stretch the game a little bit. Of course, whatever we conclude is only as valid as this assumption, but this is a good feature of a formal analysis: any assumption is stated clearly and we may assess its impact on our conclusions.

The first step in tackling the problem is finding a sensible formalization. We are dealing with the following events:

  • A, the prize is in box A.
  • B, the prize is in box B.
  • C, the prize is in box C.
  • opC, the presenter opens box C after the participant’s choice.

What we need to do is evaluate the conditional probability P(A | opC); note that, since an opened box C has been shown to be empty,

$$P(B \mid \mathrm{opC}) = 1 - P(A \mid \mathrm{opC}),$$

so calculating one of the two probabilities is quite enough.

The next step is to clearly state what we know, or assume we know:

  • A priori, the participant has no reason to believe that one box is more likely to contain the prize than the other ones: P(A) = P(B) = P(C) = 1/3.
  • The presenter is not cheating and knows where the prize is. Then, we can evaluate the following conditional probabilities:
    • P(opC | A) = 1/2, because in such a case he could either open box B or C and nothing would change. So, let us assume that he chooses one of the two possibilities purely at random.
    • P(opC | B) = 1, because this is the only available option to him. He cannot open box A, because it is the selected one; he cannot open box B, because it would spoil the game.
    • P(opC | C) = 0, because he would necessarily open box B in this case, to avoid spoiling the game.

Now we are ready to apply Bayes’ theorem:

$$P(A \mid \mathrm{opC}) = \frac{P(\mathrm{opC} \mid A)\,P(A)}{P(\mathrm{opC})}.$$

What we miss in this expression is just P(opC), which can be found by the total probability theorem:

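$$P(\mathrm{opC}) = P(\mathrm{opC} \mid A)\,P(A) + P(\mathrm{opC} \mid B)\,P(B) + P(\mathrm{opC} \mid C)\,P(C) = \frac{1}{2} \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 0 \cdot \frac{1}{3} = \frac{1}{2}.$$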

If we put everything together, we obtain

$$P(A \mid \mathrm{opC}) = \frac{\frac{1}{2} \cdot \frac{1}{3}}{\frac{1}{2}} = \frac{1}{3}.$$

Hence, the participant should switch to box B, since the probability of winning the prize would be 2/3, rather than just 1/3.
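The calculation can be double-checked with exact arithmetic. The snippet below is only a sketch of the steps above; the dictionary names `prior` and `p_opC_given` are hypothetical, while the numbers are the assumptions listed in the example.

```python
from fractions import Fraction

# A priori, each box is equally likely to contain the prize.
prior = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# P(opC | prize location), given that the participant has chosen box A
# and the presenter never opens the chosen box or the prize box.
p_opC_given = {"A": Fraction(1, 2), "B": Fraction(1), "C": Fraction(0)}

# Total probability theorem.
p_opC = sum(p_opC_given[box] * prior[box] for box in prior)

# Bayes' theorem for each box.
posterior = {box: p_opC_given[box] * prior[box] / p_opC for box in prior}

print(p_opC)  # 1/2
print(posterior["A"], posterior["B"], posterior["C"])  # 1/3 2/3 0
```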

We should note that the conclusion of the example depends on all of the assumptions we made. This is a strength of a formal analysis, not a limitation: by stating a problem clearly, we point out which assumptions are critical, as well as if and how our conclusion depends on them. Being uncertain about the assumptions is no good reason to avoid considering their role explicitly.
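Readers who distrust the formula may find a simulation more convincing. The following is a rough Monte Carlo sketch under exactly the assumptions of Example 5.9 (uniform prize location, participant choosing box A, an informed presenter who picks at random when he has a choice). The function name `simulate`, the number of games, and the seed are arbitrary choices.

```python
import random


def simulate(n_games=100_000, seed=0):
    """Estimate P(B | opC) by simulating games in which the participant picks A."""
    rng = random.Random(seed)
    opened_c = 0    # games in which the presenter opens box C
    prize_in_b = 0  # among those, games in which the prize is in box B
    for _ in range(n_games):
        prize = rng.choice("ABC")
        if prize == "A":
            opened = rng.choice("BC")  # free choice, made purely at random
        elif prize == "B":
            opened = "C"               # the only box he can open
        else:
            opened = "B"               # he cannot reveal the prize in C
        if opened == "C":
            opened_c += 1
            if prize == "B":
                prize_in_b += 1
    return prize_in_b / opened_c


estimate = simulate()
print(round(estimate, 2))  # should be close to 2/3
```

Changing the rule that governs the presenter's behavior changes the branch that sets `opened`, which makes the role of each assumption easy to explore.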

Problems

5.1 Consider two events E and G, such that E ⊆ G. Then prove that P(E) ≤ P(G).

5.2 Assume that P(A) = P(B) for two events A and B. Then prove that, given another event E,

images

Find an interpretation of the result as a probability inversion formula.

5.3 In Example 5.9 we assumed that the presenter opens box C knowing where the prize is. Now, let us assume that he has no information on where the prize is. Does this change our conclusions?

