Successful investing in stock shares is typically deemed a risky and complex endeavor. However, the following piece of advice seems to offer a viable solution:
Buy a stock. If its price goes up, sell it. If it goes down, don’t buy it.
In this section we dig a little deeper into concepts related to the measurability of random variables, their relationship to the flow of information, and the impact of that information on decisions. Despite their more theoretical character, the concepts we consider here are frequently encountered in quantitative finance, where it is common to read about filtrations and adapted processes. We will not attempt a full and rigorous treatment, which would require quite sophisticated machinery; still, we will be able to understand what is wrong with the above suggestion from a probabilistic perspective, and the concepts that we illustrate should look less intimidating after getting an intuitive feel for them.
We pointed out that a random variable is actually a mapping
from a set Ω of outcomes of a random experiment to the set of real numbers ℝ. Indeed, random variables are often denoted by X(ω) to emphasize this point. However, not all conceivable mappings are legitimate random variables. To see this, we need to clarify the concept of a probability space.
DEFINITION 7.11 (Probability space) A probability space is a triple, usually denoted by (Ω, F, P), where Ω is the sample space, consisting of the outcomes of a random experiment; F is a family of subsets of Ω, the events, with suitable closure properties; and P is a probability measure mapping events into the interval [0, 1].
To fully get the message behind this definition, we should observe that for a given sample space we may define different probability measures, which is not surprising, but we may also define different families of events. If we roll a die, the obvious sample space is Ω = {1, 2, 3, 4, 5, 6}, and we may consider a family of events obtained by arbitrary combinations of set operations like union, complement, and intersection. This makes for a rather large family, containing all subsets of size 1, 2, etc., as well as Ω itself and its complement, the empty set ∅:

F1 = {∅, {1}, {2}, …, {6}, {1, 2}, {1, 3}, …, Ω}.    (7.22)
However, we may constrain events a bit in order to reflect information or lack thereof. For instance, we might consider the following family of events:

F2 = {∅, {1, 3, 5}, {2, 4, 6}, Ω}.    (7.23)
It is easy to check that if we take complements and unions of elements of F2, we still get an element of F2. Since intersection can be expressed as a combination of these two operations, we see that F2 is closed under the elementary set operations. Compared with F1, this family of events is definitely less rich, and this reflects a lack of information: it is the set of events we would deal with if the only information available about the roll of the die were “even” or “odd.”
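To make these families of events concrete, here is a minimal Python sketch (not from the text) that represents F1 and F2 as collections of frozensets; the variable names simply mirror the notation above.

```python
from itertools import chain, combinations

omega = frozenset({1, 2, 3, 4, 5, 6})

# F1: every subset of Omega is an event (the power set, 2^6 = 64 events)
F1 = {frozenset(s) for s in chain.from_iterable(
    combinations(omega, k) for k in range(len(omega) + 1))}

# F2: the coarser family corresponding to knowing only whether the roll is even or odd
F2 = {frozenset(), frozenset({1, 3, 5}), frozenset({2, 4, 6}), omega}

print(len(F1), len(F2))            # 64 4
print(all(A in F1 for A in F2))    # True: every event of F2 is also an event of F1
```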
When assigning a probability measure to subsets of Ω, we need to make sure that we are able to do the same for any event that we may obtain by elementary set operations. In other words, we should not get a subset that is not an event. This requirement may be expressed by requiring that F be a field.
DEFINITION 7.12 (Field) A family F of subsets of Ω is called a field if the following conditions hold:

1. Ω ∈ F;
2. if A ∈ F, then its complement Ω \ A ∈ F;
3. if A ∈ F and B ∈ F, then A ∪ B ∈ F.
A field is also called an algebra of sets. The conditions in the definition state that F is closed under elementary set operations. Note that the first and second conditions imply that the empty set ∅ belongs to the field F.
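For a finite family of subsets, the conditions of Definition 7.12 can be checked by brute force. The following sketch is our own illustration; the helper name is_field is not from the text.

```python
def is_field(family, omega):
    """Check Definition 7.12: Omega belongs to the family, which is closed
    under complement and pairwise union."""
    family = {frozenset(A) for A in family}
    omega = frozenset(omega)
    if omega not in family:
        return False
    if any(omega - A not in family for A in family):
        return False
    if any(A | B not in family for A in family for B in family):
        return False
    return True

die = {1, 2, 3, 4, 5, 6}
# The even/odd family F2 from the die example is a field:
print(is_field([set(), {1, 3, 5}, {2, 4, 6}, die], die))   # True
# Dropping the complement of {1, 3, 5} breaks the closure property:
print(is_field([set(), {1, 3, 5}, die], die))               # False
```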
Example 7.17 Given Ω = {1, 2, 3, 4}, consider the family of subsets
This is not a field since, for instance
Actually, to cope with continuous random variables and, more generally, with probability distributions with infinite support, a stronger concept is needed: a σ-field, also called a σ-algebra. To define this stronger concept, the third condition is extended to a countable union of events:

if Ai ∈ F for i = 1, 2, 3, …, then ∪i Ai ∈ F.
Indeed, whenever the concept of infinity comes into play, pathological cases can occur. The very concept of a σ-field is necessary to avoid weird cases in which it is impossible to assign a probability measure to an event. We will not be concerned with these anomalies, since we limit our treatment to a finite sample space.
Even describing a finite field by enumerating all of its subsets may be a daunting task. However, we may describe it implicitly by considering a finite partition P of Ω, i.e., a finite family of subsets Ei, i = 1, …, n, such that

Ei ∩ Ej = ∅ for i ≠ j,    E1 ∪ E2 ∪ ⋯ ∪ En = Ω.
Given a partition P, we may consider the σ-field σ(P) generated by combining subsets in the partition in any possible way. In the case of (7.22), the partition consists of all singleton sets, whereas in the case of (7.23) we have the two subsets of even and odd outcomes.
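The construction of σ(P) from a finite partition can be made explicit with a short sketch: every event of σ(P) is a union of blocks of P, so we simply enumerate all such unions (including the empty union). The helper field_from_partition is our own name, not from the text.

```python
from itertools import combinations

def field_from_partition(partition):
    """Generate the field sigma(P): all unions of blocks of the partition."""
    blocks = [frozenset(B) for B in partition]
    events = set()
    for k in range(len(blocks) + 1):
        for subset in combinations(blocks, k):
            events.add(frozenset().union(*subset))
    return events

# The even/odd partition of the die outcomes generates the four events of F2
print(sorted(map(sorted, field_from_partition([{1, 3, 5}, {2, 4, 6}]))))
# [[], [1, 2, 3, 4, 5, 6], [1, 3, 5], [2, 4, 6]]
```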
Let us now turn to random variables. Given a probability space, we may define random variables as mappings of outcomes into real numbers, but we should clarify how we associate a probability measure with a random variable. Actually, the probabilities that we assign to the values of a random variable must be associated with the underlying events in the field F. Consider a discrete random variable X and a numeric value a. How can we define the probability P(X = a)? We should consider the subset of outcomes ω ∈ Ω such that X(ω) = a, which essentially amounts to inverting the function X(ω):

X⁻¹(a) = {ω ∈ Ω : X(ω) = a}.
Then, the probability we seek is just the probability measure of the subset X⁻¹(a). However, this is only possible if any such subset is an event in the σ-field F.
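The following sketch illustrates the preimage mechanism in a simple discrete case; the fair-die probabilities and the particular mapping X used here are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]                  # outcomes of a fair die roll
prob = {w: Fraction(1, 6) for w in omega}   # probability measure on singletons
X = lambda w: w % 2                         # X maps outcomes to 1 (odd) or 0 (even)

def preimage(X, a, omega):
    """The subset of outcomes mapped to the value a."""
    return {w for w in omega if X(w) == a}

def prob_X_equals(a):
    """P(X = a) is the measure of the preimage X^{-1}(a)."""
    return sum(prob[w] for w in preimage(X, a, omega))

print(preimage(X, 1, omega))   # {1, 3, 5}
print(prob_X_equals(1))        # 1/2
```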
Example 7.18 Consider the sample space Ω = {1, 2, 3, 4} and the partition

P = {{1}, {2, 3, 4}}.
Let F = σ(P) be the field generated by this partition and define the mapping X(ω):
This mapping is not a random variable with respect to the field F. In fact, we cannot assign the probability P(X = 3), since
To be a random variable, a mapping Y(ω) should be constant for the three outcomes in {2, 3, 4}. For instance
is a legitimate random variable.
What is wrong with Example 7.18 is not the mapping X(ω) per se; it is its association with the field F. If we had a richer field, generated by the partition of Ω into its singletons, there would be no issue. Technically speaking, we say that X is not F-measurable.
DEFINITION 7.13 (Measurable random variable) We say that a random variable X(ω) is F-measurable if

X⁻¹(x) = {ω ∈ Ω : X(ω) = x} ∈ F
for all values of x.
Fig. 7.26 An event tree.
In other words, the inverse image of any value x must be an event in the field F, so that we may associate a probability measure with it. Whether this is possible depends on both the random variable and the richness of the field of events F. If we go back to dice throwing, it is clear that if our field is given by (7.23), we can assign only one value to all even outcomes and another value to all odd outcomes. The field F2 is, in a sense, smaller than F1: all the events in F2 are events in F1, but the converse is not true, and this represents a limitation in the available information.
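For a field generated by a finite partition, measurability boils down to the mapping being constant on each block of the partition, as the discussion above suggests. The following sketch (the function name and the example mappings are our own) checks this condition for the even/odd partition behind (7.23).

```python
def is_measurable(X, partition):
    """X is measurable w.r.t. sigma(P) iff X is constant on each block of P."""
    return all(len({X(w) for w in block}) == 1 for block in partition)

even_odd = [{1, 3, 5}, {2, 4, 6}]   # partition generating the field in (7.23)

# Constant on the evens and on the odds: measurable with respect to F2
print(is_measurable(lambda w: w % 2, even_odd))   # True
# Distinguishes outcomes within {1, 3, 5}: not F2-measurable
print(is_measurable(lambda w: w, even_odd))       # False
```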
The link between event fields, measurability, and information can be further clarified if we consider a dynamic problem. Let us consider a stochastic process in the form of the event tree depicted in Fig. 7.26. To be concrete, let us interpret it as a stochastic process describing the price of a stock share. At time t = 0, the stock price is X0 = 10. Then, the price may go up or down, resulting in a stochastic process Xt, t = 0, 1, 2, 3. In this case the sample space consists of outcomes ωi, i = 1, 2, …, 8, and each outcome corresponds to a scenario, i.e., a possible path of stock prices. For instance, outcome ω3 is associated with the scenario (10, 12, 11, 13). If we are at any terminal node of the scenario tree, we know which scenario has occurred, since we can observe the whole history of stock prices. However, if we are, e.g., at node n4, we do not know whether we are observing scenario ω3 or ω4, since they cannot be distinguished yet. Nevertheless, we do have some information, since by observing the past history of stock prices we can rule out every other scenario. At the root node n0 we have the least information, since any scenario is still possible.
All of this is reflected in the event fields with which the random variables Xt, t = 0, 1, 2, 3, are associated. We can capture information by suitable partitions of the sample space Ω = {ω1, ω2, …, ω8}.
At time t = 0 we cannot say anything, and our field of events is the trivial one,

F0 = {∅, Ω}.
At time t = 1 we can at least rule out half of the scenarios. This is reflected by the more refined partition

P1 = {{ω1, ω2, ω3, ω4}, {ω5, ω6, ω7, ω8}},
which generates the field

F1 = {∅, {ω1, ω2, ω3, ω4}, {ω5, ω6, ω7, ω8}, Ω}.
At time t = 2, there is a further branching, refining the partition to

P2 = {{ω1, ω2}, {ω3, ω4}, {ω5, ω6}, {ω7, ω8}},
which generates an even richer field, F2 = σ(P2), consisting of the 2⁴ = 16 events that can be obtained as unions of these four subsets.
Finally, at time t = 3, we have the finest partition, consisting of singletons,

P3 = {{ω1}, {ω2}, …, {ω8}},
which generates the richest field F3, consisting of all possible subsets of Ω. We note that, as time goes by, we get larger and larger fields.
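The refinement of the partitions over time can be reproduced with a short sketch: scenarios sharing the same price history up to time t fall into the same block of Pt. Only the path of ω3, (10, 12, 11, 13), is given in the text; the remaining price paths below are illustrative assumptions consistent with a binary event tree.

```python
# Assumed scenario paths (w3 is from the text; the others are illustrative)
scenarios = {
    "w1": (10, 12, 13, 14), "w2": (10, 12, 13, 12),
    "w3": (10, 12, 11, 13), "w4": (10, 12, 11, 10),
    "w5": (10,  9, 11, 12), "w6": (10,  9, 11, 10),
    "w7": (10,  9,  8,  9), "w8": (10,  9,  8,  7),
}

def partition_at(t):
    """Group scenarios that share the same observed price history up to time t."""
    blocks = {}
    for name, path in scenarios.items():
        blocks.setdefault(path[: t + 1], set()).add(name)
    return list(blocks.values())

for t in range(4):
    print(t, partition_at(t))
# t = 0: one block with all 8 scenarios; t = 1: 2 blocks; t = 2: 4 blocks;
# t = 3: eight singletons, i.e., the finest partition
```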
DEFINITION 7.14 (Filtration) An increasing sequence of σ-fields

F0 ⊆ F1 ⊆ F2 ⊆ ⋯
defined on a common sample space Ω is called a filtration.
A filtration describes precisely how information is collected by observing a stochastic process. The concept can also be defined for continuous-time processes with a continuous state space, in which case it requires sophisticated mathematical machinery. However, the essential message is quite simple:
The sequence of decisions we make by observing the stochastic process at times t = 0, 1, 2,… must reflect the available information and cannot be anticipative.
The piece of advice with which we opened this section is clearly not implementable: it would require knowledge of the future. From a technical perspective, consider decision variables representing the numbers of stock shares that we buy and sell, respectively, at time t. At time t = 0 we have a single, deterministic decision, since we can just buy or sell here and now. Seen from time t = 0, the decision variables for the later time instants are random variables, as they depend on the decisions that we will make on the basis of the observed path of the stochastic process and on our expectations for the future, which are represented by the scenario tree. The decision variables at time t must be Ft-measurable. If we are at node n4 in the event tree of Fig. 7.26, we cannot say “buy if the scenario is ω3” and “do not buy if the scenario is ω4.” The decision, whatever it is, must be the same for the two scenarios. Otherwise, the random variable corresponding to the decision at that node would not be constant on the event {ω3, ω4}, hence it would not be measurable, and we would be in trouble just as in Example 7.18. Technically speaking, we say that decisions must be adapted to the filtration Ft. The piece of financial advice we have considered, unfortunately, is not adapted to the filtration and would require clairvoyance to be implemented in practice.
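The non-anticipativity requirement can be stated operationally: a decision taken at time t must take the same value on every scenario belonging to the same block of Pt, i.e., it must be Ft-measurable. The following sketch (the function name and the illustrative decision rules are our own) checks this for the time t = 2 partition of Fig. 7.26 and shows why the rule “buy in ω3, do not buy in ω4” is ruled out.

```python
def is_adapted(decision, partition):
    """decision maps scenario name -> action; adapted iff constant on each block."""
    return all(len({decision[w] for w in block}) == 1 for block in partition)

# Partition P2 of the scenario tree: blocks of scenarios indistinguishable at t = 2
partition_t2 = [{"w1", "w2"}, {"w3", "w4"}, {"w5", "w6"}, {"w7", "w8"}]

# "Buy in w3, do not buy in w4": not adapted, since w3 and w4 cannot be told apart
# at node n4 (time t = 2)
clairvoyant = {"w1": 1, "w2": 1, "w3": 1, "w4": 0, "w5": 0, "w6": 0, "w7": 0, "w8": 0}
print(is_adapted(clairvoyant, partition_t2))   # False

# The same action on the whole block {w3, w4}: adapted
honest = {"w1": 1, "w2": 1, "w3": 1, "w4": 1, "w5": 0, "w6": 0, "w7": 0, "w8": 0}
print(is_adapted(honest, partition_t2))        # True
```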
Problems
7.1 A random variable X has normal distribution with μ = 250 and σ = 40. Find the probability that X is larger than 200.
7.2 Consider a normal random variable X with μ = 250 and σ = 20, and find the probability that X falls in the interval between 230 and 260.
7.3 We should set the reorder point R for an item, whose demand during lead time is uncertain. We have a very rough model of uncertainty – the lead time demand is uniformly distributed between 5000 and 20000 pieces. Set the reorder point in such a way that the service level is 95%.
7.4 You are working in your office, and you would like to take a very short nap, say, 10 minutes. However, every now and then, your colleagues come to your office to ask you for some information; the interarrival time of your colleagues is exponentially distributed with expected value 15 minutes. What is the probability that you will not be caught asleep and reported to your boss?
7.5 A friend of yours is an analyst and is considering a probability model to capture uncertainty in the monthly demand of an item featuring high-volume sales. He argues that the central limit theorem applies and, after a thorough check of the data, proposes a normal distribution with expected value 12,000 and standard deviation 7000 items. Is this a reasonable model?
7.6 Let X be a normal random variable with expected value μ = 6 and standard deviation σ = 1. Consider random variable W = 3X2. Find the expected value E[W] and the probability P(W > 120).
7.7 You have just issued a replenishment order to your supplier, which is not quite reliable. You have ordered 400 items, but what you will receive is a normal random variable with that expected value, and standard deviation 40 (let us assume that using a continuous random variable is a sensible approximation of the discrete random variable modeling the integer number of received items). After receiving the shipment, you will have to serve a number of customer requests. The amount that your customers ask for is a random variable with expected value 3 and standard deviation 0.3. How many customer requests should you receive in order to have a probability of stockout larger than 10%?
7.8 Let X be a chi-square random variable with n degrees of freedom. Prove that E[X] = n.
7.9 You work for a manufacturing firm producing items with a limited time window for sale. Items are sold by a distributor facing uncertain demand over the time window, which we model by a normal distribution with expected value 10,000 and standard deviation 2500. The distributor decides how many items to order using a newsvendor model. From the distributor's perspective, each item costs $10 and is sold at the recommended price of $14. Unsold items are bought back by the manufacturer. Assume that the manufacturer would like to see at least 15,000 items on the shelves (in order to promote her brand name); at what price should she be willing to buy unsold items back?
7.10 You are in charge of deciding the purchased amount of an item with limited time window for sales and uncertain demand. The unit purchase cost is $10 per item, the selling price is $16, and unsold items at the end of the sales window have a salvage value of $7 per item. Demand is also influenced by the level of competition. If there is none, demand is uniformly distributed between 1200 and 2200 items. However, if a strong competitor enters the game, demand is uniformly distributed between 100 and 1100 items.
- If the probability that the competitor enters the market is assumed to be 50%, how many items should you order to maximize expected profit? (Let us assume that selling prices are the same in both scenarios.)
- What if this probability is 20%? Does the purchased quantity increase or decrease?
7.11 In some applications we are interested in the distribution of the maximum among a set of realizations of random variables. Let us consider a set of n i.i.d. variables Ui, i = 1, …, n, with uniform distribution on the unit interval [0, 1]. Let X be their maximum: X = max{U1, U2, …, Un}. Prove that the CDF of X is FX(x) = xⁿ for x ∈ [0, 1].