DIFFERENT CONCEPTS OF PROBABILITY

We have met relative frequencies, a fundamental concept in descriptive statistics. Intuitively, relative frequencies can be interpreted as “probabilities” in some sense, as they should tell us something about the likelihood of events. While this is legitimate and quite sensible in many settings, we should wonder whether this frequentist interpretation is the only meaning that we may possibly attach to the more or less intuitive notion of probability. In fact, when doing so we implicitly take for granted that

  1. We have a suitable number of observations to estimate relative frequencies in a reliable way.
  2. Past outcomes help us in making decisions for the future.

As you may imagine, neither of the above should be taken for granted. There might be very rare, yet potentially relevant events whose likelihood is hard to evaluate precisely for the very reason that they are indeed rare. How many times did we observe a financial crisis due to subprime mortgages? Furthermore, market conditions do change over time, and past knowledge need not be 100% helpful in predicting the future. Whenever people are involved, rather than mechanical devices, repeatability of an experiment is not ensured.

Indeed, sometimes probability is more akin to the idea of “belief”; asking what is the probability that a war will erupt in some place under certain sociopolitical conditions is very different from asking what is the probability of some outcome in a game of chance based on dice throwing. Certainly, we should not like the idea of running many experiments to identify relative frequencies in the first case. Hence, we should pause a little and wonder whether there are different concepts of probabilities.

Consider a prototypical random experiment, dice throwing, and the following questions:

Q1. If we throw a die, what is the probability that the outcome is 5?

Q2. If we throw a die, what is the probability that the outcome is 5 or 2?

Q3. If we throw a die, what is the probability that the outcome is an even number?

Q4. If we throw two dice, what is the probability that the sum of the outcomes is 7?

The answers are rather easy to find, but we should reflect on the underlying principles that are used to come up with each answer.

A1. Assuming that the die is fair and no one is cheating, most of us would say that the answer is $1/6$, i.e., 1 in 6. Are we using relative frequencies in finding this answer? Not really, unless we want to throw that die a huge number of times to check the result empirically. The relative frequency that we would obtain will likely get close to $1/6$, but not exactly. Since the number of possible outcomes is 6 (ruling out the remote possibility that the die lands on an edge or a vertex), the intuitive justification for our answer is symmetry. We do not see a strong rationale for saying that the likelihoods of the possible results are different. This symmetry is the foundation of the classical concept of probability. Actually, the die is not perfectly symmetric, as someone punched little holes on its faces, which are not perfectly equal. However, we do not know how to measure the impact of this lack of symmetry, if any. Of course, we could throw the die a huge number of times to see if there is a bias in favor of some outcome, but then the same procedure should be repeated for any kind of die, as it could depend on size, weight, and material. This does not sound too practical, but, since we are interested in management and decision making, there is a more important point. Say that there is indeed a small experimental discrepancy in the relative frequency of each outcome. Should we rely on that information in order to make a decision? Would it really make a difference?
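The empirical check mentioned in A1 can be imitated on a computer. The following is only a minimal sketch: it assumes that Python's pseudorandom generator is an acceptable stand-in for a fair die, and the function name `relative_frequencies`, the seed, and the number of throws are illustrative choices, not part of the original discussion.

```python
import random
from collections import Counter

def relative_frequencies(n_throws, seed=0):
    """Simulate n_throws of a (pseudo)fair die and return the relative
    frequency of each face as a dictionary {face: frequency}."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_throws))
    return {face: counts[face] / n_throws for face in range(1, 7)}

freqs = relative_frequencies(60_000)
for face, freq in freqs.items():
    # Each frequency should be near 1/6 = 0.1667, but not exactly equal to it.
    print(face, round(freq, 4))
```

With a different seed or number of throws the frequencies change slightly, which is the point made in A1: they hover around $1/6$ without matching it exactly.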

A2. Since the two outcomes have the same probability, and they cannot occur at the same time, the intuitive answer is

$$P(5 \text{ or } 2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}.$$

Hence, we are just summing probabilities of elementary outcomes, which seems rather plausible in this case. Maybe, in more involved experiments, where we have to deal with complex events, we cannot just add probabilities like that. Nevertheless, the idea of adding probabilities looks sensible when outcomes are mutually exclusive, and it can be considered a basic rule of the game.

A3. Using more or less the same reasoning as before, the answer should be $1/2$. We may think of that as the sum of the probabilities of getting either 2, or 4, or 6. Alternatively, we may consider two mutually exclusive events, “even” and “odd,” with the same probability. Whatever the choice, we see that events need not be restricted to elementary outcomes. We might deal with events consisting of several elementary outcomes for at least a couple of reasons. First, maybe all we can observe is just “even” or “odd,” because we are not able to see the exact result. In many practical problems, we are not allowed to observe everything and we must settle for some partial information. Moreover, we might be interested only in those two outcomes, because we are betting on them, and more detailed information is irrelevant for our purposes. Whatever the reason, we realize that events may consist of multiple outcomes, and we must find a sensible and consistent way to work with them.
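The reasoning in A2 and A3 treats events as sets of elementary outcomes of a fair die. As a small sketch of that idea, events can be represented as Python sets and their probabilities computed by counting; the helper `prob` and the variable names are hypothetical, introduced only for illustration.

```python
from fractions import Fraction

SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}  # elementary outcomes of one fair die

def prob(event):
    """Classical probability: number of favorable outcomes over the
    total number of (equally likely) outcomes."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))

five_or_two = {5, 2}
even = {2, 4, 6}
odd = SAMPLE_SPACE - even

# Q2: adding the probabilities of two mutually exclusive outcomes
# gives the same result as counting the outcomes in the event.
print(prob({5}) + prob({2}), prob(five_or_two))

# Q3: "even" and "odd" are mutually exclusive and equally probable.
print(prob(even), prob(odd))
```

Exact fractions are used rather than floating-point numbers so that the sums can be compared without rounding error.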

A4. To begin with, it is reasonable to assume that the two dice do not influence each other. Then, there are 6 × 6 = 36 possible outcomes of the form $(D_1, D_2)$, where both $D_1$ and $D_2$ can take any integer value between 1 and 6. Hence, we should just count the number of outcomes such that $D_1 + D_2 = 7$. There are six such outcomes:

$$(1,6), \quad (2,5), \quad (3,4), \quad (4,3), \quad (5,2), \quad (6,1)$$

out of the 36 possible cases. Hence, the required probability is $6/36 = 1/6$. What we see in action here is a classical approach with historical roots in gambling; we have many equally likely outcomes, and we just take the ratio of the number of “favorable” ones over their total number. In arriving at the total number of outcomes, and in assessing their likelihood, we assumed that the two dice are independent. As dice have no memory, we could even throw the same die twice. We have seen something similar in coin flipping (see Section 1.2.3); there, the probability of getting a particular result, say “head–head,” when flipping a coin twice, is just the product of elementary probabilities:

$$P(\text{head–head}) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}.$$

It seems that when considering independent events this is a plausible rule of the game. Yet, that discussion pointed out that sometimes events are not independent at all, and we may take advantage of this. So, we must make this concept a bit more precise.
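The counting argument in A4 and the product rule for two coin flips can be checked by brute-force enumeration. This is a sketch under the same assumptions as the text (fair dice, fair coin, independent throws); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (d1, d2) for two dice: 6 x 6 = 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# "Favorable" outcomes: those where the two faces sum to 7.
favorable = [(d1, d2) for d1, d2 in outcomes if d1 + d2 == 7]

p_sum_7 = Fraction(len(favorable), len(outcomes))
print(favorable)
print(p_sum_7)

# Product rule for two independent flips of a fair coin.
p_head = Fraction(1, 2)
p_head_head = p_head * p_head
print(p_head_head)
```

Enumerating ordered pairs matters here: (1, 6) and (6, 1) are distinct outcomes, which is why six cases, not three, are counted as favorable.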

The discussion of these four questions points out a few basic requirements on how we should work with probabilities. Moreover, we see that there are at least two different ways to regard probabilities: Descriptive statistics suggests the idea of probability as relative frequencies, whereas the classical approach relies on symmetry and counting arguments. Yet, these two concepts do not cover all of the possibilities. To see why, consider the following:

  • Elementary counting does not work when there are an infinite number of outcomes, as in the case of real numbers on an interval.
  • We can work with relative frequencies if past data are available and relevant, but this is not the case when forecasting sales for a brand new product, possibly representing a real technological breakthrough. If past data are not helpful at all, we might be forced to work with probabilities as beliefs, i.e., subjective assessments of likelihood.

Subjective probability does not sound like a rigorous and scientific concept. Yet, this is what we have to work with in many situations, and subjective does not imply free from any rule, as shown by the following experiment (described, e.g., in the text by Kahneman et al. [5]).

Example 5.1 Consider the following description of a person:

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and she participated in antinuclear demonstrations.

On the basis of the information above, rank the following statements in decreasing order of likelihood, i.e., from the most probable to the least probable.

(a) Linda is a teacher in an elementary school.

(b) Linda works in a bookstore and takes yoga classes.

(c) Linda is active in a feminist movement.

(d) Linda is a psychiatric social worker.

(e) Linda is a member of the League of Women Voters.

(f) Linda is a bank teller.

(g) Linda is an insurance salesperson.

(h) Linda is a bank teller who is active in a feminist movement.

Please, do rank the statements before reading further!

Given the limited evidence we have, there are some statements that may be ranked in any order, depending on your subjective opinion. For instance, it is difficult to say if (f) is more likely than (g) or vice versa. Many people, when handed this question, rank (c) higher than (f) and (g) because the description suggests a rather precise kind of person. Again, this is consistent with the concept of subjective assessment of probability.

Yet, this does not mean that any ordering makes sense. A rather surprising experimental fact, reported in Ref. [5], is that many respondents consider (h) more likely than (f). It is easy to see that this makes no sense. If we consider the issue in terms of relative frequencies, the set of persons meeting the condition in (h) is clearly a subset of the set of persons meeting the condition in (f). Or, if we take a logical viewpoint, (h) implies (f), but not vice versa. Indeed, there is no way (f) can be less plausible than (h).

Typically, those who rank (f) and (h) in the wrong way do not make the same mistake with (c) and (h). Arguably, the psychological trap is that Linda does not seem like the prototypical bank teller, but adding the second feature (she is an active feminist) to statement (h) tricks many into believing that this is more plausible than (f).
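The subset argument used for statements (f) and (h) can be illustrated with relative frequencies on a made-up population. Everything in the sketch below is hypothetical: the population is randomly generated, and the proportions 0.05 and 0.30, as well as the independence of the two attributes, are arbitrary assumptions with no empirical basis. The inequality being checked holds whatever values are chosen.

```python
import random

rng = random.Random(1)  # fixed seed so that the run is repeatable

# Hypothetical population: each person has two yes/no attributes.
# The proportions below are arbitrary and purely illustrative.
population = [
    {"bank_teller": rng.random() < 0.05, "feminist": rng.random() < 0.30}
    for _ in range(10_000)
]

def rel_freq(condition):
    """Relative frequency of persons satisfying the given condition."""
    return sum(1 for person in population if condition(person)) / len(population)

freq_f = rel_freq(lambda p: p["bank_teller"])                    # statement (f)
freq_c = rel_freq(lambda p: p["feminist"])                       # statement (c)
freq_h = rel_freq(lambda p: p["bank_teller"] and p["feminist"])  # statement (h)

# Persons satisfying (h) are a subset of those satisfying (f), and of
# those satisfying (c), so the frequency of (h) cannot exceed either one.
print(freq_h <= freq_f, freq_h <= freq_c)
```

Changing the seed or the proportions changes the three frequencies, but never the ordering between (h) and each of (f) and (c).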

The example above shows that even if we deal with subjective probability, there must be some logical and consistent structure in the way we think. Indeed, in Section 1.2.2 we have seen that intuition may lead us to wrong conclusions. Let us consider a similar example here.

Example 5.2 A quick and easy test is able to predict the gender of a baby very early during childbearing. Unfortunately, the test is not 100% reliable:

  • If the unborn child is a male, the result of the test is “male” with a probability of 90%.
  • If the unborn child is a female, the result of the test is “female” with a probability of 70%.

Frances tries the test, and the result is “female.” Mary tries the test, and the result is “male.” Between Frances and Mary, which one should be more confident (or less uncertain) about the gender of her child?

In this case, too, many are tricked by wrong intuition and believe that the correct answer is Mary. If you see some similarity with the example of Section 1.2.2, please try the same line of reasoning to prove that the correct answer is Frances. (Hint: Say that we consider 200 unborn babies, and that exactly half of them are males; how many tests will predict “male”?)
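The counting suggested in the hint can be spelled out in a few lines. This is a sketch that follows the hint's assumption of 200 unborn babies, exactly half of them males, and treats the stated reliabilities as exact proportions; the variable names are illustrative. Readers who want to solve the puzzle on their own should do so before running it.

```python
from fractions import Fraction

n_babies = 200
n_male = n_babies // 2          # assumption from the hint: half are males
n_female = n_babies - n_male

p_says_male_given_male = Fraction(90, 100)      # test reliability for males
p_says_female_given_female = Fraction(70, 100)  # test reliability for females

# Split each group according to the test result.
male_says_male = n_male * p_says_male_given_male
male_says_female = n_male - male_says_male
female_says_female = n_female * p_says_female_given_female
female_says_male = n_female - female_says_female

# Total number of tests giving each result.
tests_male = male_says_male + female_says_male
tests_female = female_says_female + male_says_female

# Fraction of correct predictions among the tests giving each result.
confidence_mary = male_says_male / tests_male            # result was "male"
confidence_frances = female_says_female / tests_female   # result was "female"

print(tests_male, tests_female)
print(confidence_mary, confidence_frances)
```

The key observation is that a "male" result is produced both by correctly classified males and by misclassified females, so the reliability of the test for males alone does not tell us how much a "male" result can be trusted.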

In Section 5.4 we illustrate a systematic way to solve such puzzles. For now, we might have more than enough evidence that some discipline should be involved in dealing with probabilities. To this aim, we will consider the so-called axiomatic approach to the theory of probabilities. Even if it is not free from some criticism, this is the most common approach and is a starting point to educate our way of reasoning with probabilities.

