Probability measures

The final step is associating each event E ∈ F with a probability measure P(E), in some sensible way. As a starting point, it stands to reason that, for an event E ⊆ Ω, its probability P(E) should be a number satisfying the following condition:

0 ≤ P(E) ≤ 1

This is certainly true if we think of probabilities in terms of relative frequencies, but it also applies to whatever likelihood concept we wish to consider, including subjective belief.3 Furthermore, since the sample space Ω is, in a sense, the largest event including all of the other ones, we should also assume

P(Ω) = 1

Loosely speaking, this condition says that something has to happen. Finally, let us consider the union of disjoint events. Again, intuition suggests that, in this case

P(E1 ∪ E2) = P(E1) + P(E2)

Hence probabilities are additive for disjoint events. A simple example is question Q2 above, where the probability that a die yields 2 or 5 is just the sum of the two respective probabilities. The idea can be generalized to an arbitrary number of disjoint events but, when events have an intersection, additivity need not hold.
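
As a quick numerical illustration of additivity, the following sketch (assuming the fair six-sided die of question Q2, where each face has probability 1/6, and with a made-up helper `prob`) checks that the probability of the disjoint union {2} ∪ {5} equals the sum of the individual probabilities.

```python
from fractions import Fraction

# Fair six-sided die: each face is assigned probability 1/6.
p = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Probability of an event, i.e., a subset of the sample space."""
    return sum(p[face] for face in event)

# {2} and {5} are disjoint, so additivity applies.
print(prob({2, 5}))           # 1/3
print(prob({2}) + prob({5}))  # 1/3 as well
```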

Example 5.3 Consider a deck of 52 poker cards. If we draw a card at random, the probability that it is a king is 4/52 = 1/13. Similarly, there is a probability 13/52 = 1/4 that it is a spade. But what is the probability that it is a king or a spade? If we just add probabilities, we get

4/52 + 13/52 = 17/52

but there is something wrong: We are counting the king of spades, which lies in the intersection of the set of kings and the set of spades, twice. We get the correct result 16/52 = 4/13 if we subtract the probability of the intersection, so that common elements are correctly counted once.
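
The double counting is easy to see by enumerating the deck explicitly. The sketch below is only an illustration (the card names and the `prob` helper are made up for the example); it reproduces the wrong total 17/52 and the correct value 16/52 = 4/13.

```python
from fractions import Fraction
from itertools import product

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['spades', 'hearts', 'diamonds', 'clubs']
deck = set(product(ranks, suits))  # 52 equally likely cards

def prob(event):
    """Probability of drawing a card belonging to the given subset of the deck."""
    return Fraction(len(event), len(deck))

kings = {card for card in deck if card[0] == 'K'}
spades = {card for card in deck if card[1] == 'spades'}

print(prob(kings) + prob(spades))                         # 17/52: the king of spades is counted twice
print(prob(kings) + prob(spades) - prob(kings & spades))  # 4/13, i.e., 16/52
print(prob(kings | spades))                               # 4/13, the same correct value
```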

The example suggests that we need sensible rules to deal with complicated events, which could otherwise require a huge list of special cases. It turns out that all the rules we need can be obtained as consequences of the three conditions above, which we take as the following axioms defining a probability measure.

DEFINITION 5.2 A probability measure P(·) is a mapping from events E within a sample space Ω to real numbers such that4

  1. 0 ≤ P(E) ≤ 1, for all E ∈ F
  2. P(Ω) = 1
  3. For each sequence E1, E2, E3, … of mutually exclusive (disjoint) events, i.e., such that Ei ∩ Ej = ø for i ≠ j, we have P(E1 ∪ E2 ∪ E3 ∪ ⋯) = P(E1) + P(E2) + P(E3) + ⋯

The last axiom may look a bit awkward, but it is just the generalization of additivity for probabilities of disjoint events to a possibly infinite (countable) number of events. From these axioms about events and probabilities, we can derive some properties that are intuitive, as well as some that are not.
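
On a finite sample space, a standard way to obtain a mapping satisfying these axioms is to assign a nonnegative weight to each outcome, with the weights summing to one, and to let P(E) be the total weight of the outcomes in E. The sketch below is a minimal illustration of that construction (the helper `measure_from_pmf` is hypothetical, not a library function), spot-checking the axioms on the fair-die example.

```python
from fractions import Fraction

def measure_from_pmf(pmf):
    """Build P(E) = total weight of the outcomes in E, given nonnegative weights summing to 1."""
    assert all(w >= 0 for w in pmf.values()) and sum(pmf.values()) == 1
    return lambda event: sum(pmf[outcome] for outcome in event)

pmf = {face: Fraction(1, 6) for face in range(1, 7)}  # fair die
P = measure_from_pmf(pmf)
omega = set(pmf)

assert P(omega) == 1                    # axiom 2
assert 0 <= P({1, 3, 5}) <= 1           # axiom 1
assert P({2} | {5}) == P({2}) + P({5})  # axiom 3, for two disjoint events
```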

Example 5.4 Given the probability P(E) of event E, what is the probability of its complement Ē? Since these two events are obviously disjoint and E ∪ Ē = Ω, using the axioms we obtain

1 = P(Ω) = P(E ∪ Ē) = P(E) + P(Ē)

Hence, the probability that an event does not happen is P(Ē) = 1 − P(E). Using this result with E = Ω, whose complement is the empty event, we may also see that P(ø) = 1 − P(Ω) = 0.
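
As a small numerical check of the complement rule (again only a sketch, using the fair die and taking E to be the even faces), we can verify that P(Ē) = 1 − P(E) and that the empty event has probability zero.

```python
from fractions import Fraction

p = {face: Fraction(1, 6) for face in range(1, 7)}  # fair die

def prob(event):
    return sum(p[face] for face in event)

E = {2, 4, 6}                      # "the die shows an even number"
E_bar = set(p) - E                 # complement of E within the sample space

assert prob(E_bar) == 1 - prob(E)  # P(not E) = 1 - P(E) = 1/2
assert prob(set()) == 0            # P(empty event) = 0
```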

Example 5.5 Example 5.3 above suggests that, if two events E1 and E2 are not disjoint, the following should hold:

P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2)

To prove this, we may note that the union of sets E1 and E2 can be expressed in terms of disjoint sets:

E1 ∪ E2 = E1 ∪ (E2 ∩ Ē1)

In plain English, this amounts to saying that the union of E1 and E2 can be rewritten as the union of two sets: E1 and the part of E2 that is disjoint from E1 (it may help to check this by a simple drawing). Hence, we may use the third axiom:

P(E1 ∪ E2) = P(E1) + P(E2 ∩ Ē1)    (5.1)

Furthermore, we may express E2, too, as the union of disjoint sets:

E2 = (E2 ∩ Ē1) ∪ (E1 ∩ E2)

In plain English, this amounts to saying that E2 consists of the union of two subsets: the part of E2 that is disjoint from E1 and the intersection of E1 and E2. Then

P(E2) = P(E2 ∩ Ē1) + P(E1 ∩ E2)

Rearranging, P(E2 ∩ Ē1) = P(E2) − P(E1 ∩ E2), which can be plugged into Eq. (5.1) to obtain the result immediately.
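
As a sanity check of the identity just derived, the following sketch (purely illustrative, using a uniform measure on a small finite sample space) draws many random pairs of events and verifies the inclusion-exclusion formula for each pair.

```python
import random
from fractions import Fraction

omega = set(range(20))  # a small finite sample space

def prob(event):
    """Uniform probability measure: |event| / |omega|."""
    return Fraction(len(event), len(omega))

random.seed(0)
for _ in range(1000):
    E1 = {x for x in omega if random.random() < 0.5}
    E2 = {x for x in omega if random.random() < 0.5}
    assert prob(E1 | E2) == prob(E1) + prob(E2) - prob(E1 & E2)
```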

