Distributions obtained from the normal

As we pointed out, if we sum i.i.d. random variables, we may end up with a completely different distribution, with the normal as a notable exception. However, there are ways to combine independent normal random variables that lead to new distributions with remarkable applications, among other things, in inferential statistics. In fact, statistical tables are available for the random variables we describe below, providing us with the quantiles we need to carry out statistical tests.

The chi-square distribution Consider a set of independent standard normal variables $Z_i$, $i = 1, \ldots, n$, and the random variable $X$ defined as

$$X = \sum_{i=1}^{n} Z_i^2$$

Obviously, X cannot have a normal distribution, as it cannot take negative values. This distribution is called chi-square, with n degrees of freedom; this is often denoted as $X \sim \chi^2_n$. The following results can be proved:

$$E[X] = n, \qquad \operatorname{Var}(X) = 2n$$

Figure 7.18 shows the PDF for chi-square variables with 4 and 8 degrees of freedom. The second one corresponds to the PDF with the lower peak, and the larger expected value and variance.
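As a quick numerical check of these results, here is a minimal Python sketch (assuming NumPy, which is not part of the text) that builds a chi-square sample by summing squared standard normals and compares the sample moments with E[X] = n and Var(X) = 2n:

```python
import numpy as np

rng = np.random.default_rng(42)

# Sum n squared standard normals, replicated many times
n, reps = 8, 100_000
X = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)

print(X.mean())  # close to E[X] = n = 8
print(X.var())   # close to Var(X) = 2n = 16
```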

Fig. 7.18 PDF of two chi-square variables, $\chi^2_4$ and $\chi^2_8$

Student’s t distribution Consider a standard normal variable Z and a chi-square variable $\chi^2_n$ with n degrees of freedom, and assume that they are independent. Then, the random variable

$$T_n = \frac{Z}{\sqrt{\chi^2_n / n}}$$

has Student’s t distribution with n degrees of freedom. We can show that

$$E[T_n] = 0 \quad (n > 1), \qquad \operatorname{Var}(T_n) = \frac{n}{n-2} \quad (n > 2)$$

Incidentally, we see that the variance need not always be defined, as it may go to infinity. Figure 7.19 shows the PDFs of T1 and T5 variables, along with the PDF of a standard normal. The PDF with the highest mode, drawn with a continuous line, corresponds to the standard normal; T1, represented with a dash-dotted line, features the lowest mode and the fattest tails. Indeed, the t distribution does look much like a standard normal, but it has fatter tails. When the number of degrees of freedom increases, the t distribution gets closer and closer to the standard normal. To see this quantitatively, we observe that the kurtosis of the t distribution is

$$\kappa = \frac{3(n-2)}{n-4}, \qquad n > 4$$

Fig. 7.19 Comparing the PDFs of two t distributions, T1 and T5, against the standard normal.

When n goes to infinity, kurtosis tends to 3, which is the kurtosis of a normal variable. Indeed, traditional statistical tables display quantiles for t variables up to n = 30, suggesting the use of quantiles for the standard normal for larger values of n. This approximation is not needed anymore, but it is useful to keep in mind that the t distribution does tend to the standard normal for large values of n.
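To see the convergence concretely, here is a minimal sketch (assuming Python with SciPy's scipy.stats, which is not part of the text) comparing upper quantiles of t variables with the corresponding standard normal quantile:

```python
from scipy.stats import norm, t

# The 97.5% quantile of a t variable approaches the normal quantile as n grows
for n in (1, 5, 30, 100, 1000):
    print(n, t.ppf(0.975, df=n))

print("normal:", norm.ppf(0.975))  # about 1.96
```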

The F distribution Consider two independent random variables with chi-square distributions $\chi^2_{n_1}$ and $\chi^2_{n_2}$, respectively. The random variable

$$Y = \frac{\chi^2_{n_1} / n_1}{\chi^2_{n_2} / n_2}$$

is said to have F distribution with $n_1$ and $n_2$ degrees of freedom, which is denoted by $Y \sim F(n_1, n_2)$. Note that the degrees of freedom cannot be interchanged, as the former refers to the numerator of the ratio, the latter to its denominator.

There is a relationship between F and t distributions, which can be grasped by considering an F(1, n) variable. This involves a $\chi^2_1$ variable, with 1 degree of freedom, which is just a standard normal squared. Hence, what we have is

$$\frac{Z^2 / 1}{\chi^2_n / n} = \left( \frac{Z}{\sqrt{\chi^2_n / n}} \right)^2 = T_n^2,$$

i.e., the square of a t variable with n degrees of freedom. Furthermore, when $n_2$ is large enough, we get an $F(n, \infty)$ variable. By the law of large numbers, which we will state precisely later, when $n_2$ goes to infinity the ratio $\chi^2_{n_2} / n_2$ converges to the numerical value 1. Hence, an $F(n, \infty)$ variable is just a $\chi^2_n$ variable divided by n:

$$F(n, \infty) = \frac{\chi^2_n}{n}$$
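Both relationships can be checked numerically; the sketch below (again assuming scipy.stats) compares the relevant quantiles:

```python
from scipy.stats import chi2, f, t

n = 10
# The 95% quantile of F(1, n) equals the squared 97.5% quantile of t with n dof
print(f.ppf(0.95, dfn=1, dfd=n))
print(t.ppf(0.975, df=n) ** 2)       # same value

# For a very large second parameter, F(n, m) quantiles approach those of
# a chi-square variable with n degrees of freedom, divided by n
m = 10**6
print(f.ppf(0.95, dfn=n, dfd=m))
print(chi2.ppf(0.95, df=n) / n)
```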

Fig. 7.20 PDF of an F(5, 10) random variable.


Figure 7.20 shows the PDF of an F random variable with 5 and 10 degrees of freedom.

After this list of weird distributions obtained from the standard normal, the reader might well wonder why one should bother. The answer will be given when dealing with inferential statistics and linear regression, but we can offer at least some intuition:

  • We know from descriptive statistics that the sample variance involves squaring observations; when the population is normally distributed, this entails essentially squaring normal random variables, and the distribution of sample variance is linked to chi-square variables.
  • Furthermore, when we standardize a normal random variable, we subtract the expected value and divide by the standard deviation; when the standard deviation must be estimated from a sample, rather than being known, this leads to the t distribution.
  • Finally, a common task in inferential statistics is comparison of two variances. A typical way to do this is to take their ratio and check whether it is small or large. When variances come from sampling normal distributions, we are led to consider the ratio of two chi-square variables.

Comparing the PDFs of the three distributions, we see that the t distribution is symmetric, whereas the other two have nonnegative support. This should be kept in mind when working with quantiles from these distributions.
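In practical terms (a sketch assuming scipy.stats): for the symmetric t distribution, one tail quantile determines the other, whereas for the chi-square the two tails must be looked up separately:

```python
from scipy.stats import chi2, t

n = 5
# Symmetry of t: the 5% quantile is minus the 95% quantile
print(t.ppf(0.05, df=n), -t.ppf(0.95, df=n))    # identical values

# No such shortcut for the chi-square, whose support is nonnegative
print(chi2.ppf(0.05, df=n), chi2.ppf(0.95, df=n))
```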

The lognormal distribution Unlike the previous distributions, the lognormal does not stem from statistical needs, but it is worth mentioning anyway because of its role in financial applications, among others. A random variable Y is lognormally distributed if log Y is normally distributed; put another way, if X is normal, then $e^X$ is lognormal. Since the exponential is a positive function, a lognormal random variable cannot take negative values. In fact, it has often been used (and misused) as a model of random stock prices since, unlike the normal, it cannot yield negative prices. Another noteworthy feature is that a product of independent lognormal variables is itself lognormal; this is a consequence of the similar property of sums of normal variables and the properties of logarithms.

The following formulas illustrate the relationships between the parameters of a normal and a lognormal distribution. If $X \sim N(\mu, \sigma^2)$ and $Y = e^X$, then

$$E[Y] = e^{\mu + \sigma^2/2}, \qquad \operatorname{Var}(Y) = \left( e^{\sigma^2} - 1 \right) e^{2\mu + \sigma^2}$$

In particular, we see that

$$E[Y] = e^{\mu + \sigma^2/2} > e^{\mu} = e^{E[X]}$$

Since the exponential is a convex function, this is a consequence of Jensen’s inequality. Figure 7.21 shows the PDF of a lognormal variable with parameters μ = 0 and σ = 1.
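These formulas are easy to check by simulation; the following sketch (assuming NumPy) samples $Y = e^X$ and compares sample moments against the expressions above, illustrating Jensen's inequality along the way:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0

# Sample Y = e^X, with X normal
Y = np.exp(rng.normal(mu, sigma, size=1_000_000))

print(Y.mean(), np.exp(mu + sigma**2 / 2))  # E[Y] = e^(mu + sigma^2/2)
print(np.exp(mu))                           # e^(E[X]) is smaller, as Jensen predicts
print(Y.var(), (np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2))
```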

