Covariance is a generalization of variance, so it is not surprising that it shares a relevant shortcoming: its value depends on the units of measurement of the underlying quantities. We recall that it is impossible to say whether a variance of 10,000 is large or small; a similar consideration applies to standard deviation which, however, is at least measured in the same units as the expected value, so that we may consider the coefficient of variation C_X = σ_X/μ_X. In the same spirit, we may normalize covariance in order to define a dimensionless version of it.
DEFINITION 8.7 (Correlation coefficient) The correlation coefficient between random variables X and Y is defined as

\[
\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.
\]
The correlation coefficient is dimensionless, and it can be easily interpreted on the basis of the following theorem.
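As a quick sanity check of the definition, the following sketch (in Python, assuming NumPy is available; the simulated data and seed are made up for illustration) estimates ρXY from a sample and compares it with NumPy's built-in estimator:

```python
import numpy as np

# Hypothetical sample: Y depends linearly on X, plus independent noise.
rng = np.random.default_rng(42)
x = rng.normal(10, 5, size=1000)
y = 0.5 * x + rng.normal(0, 3, size=1000)

# Definition 8.7: rho = Cov(X, Y) / (sigma_X * sigma_Y).
cov_xy = np.cov(x, y)[0, 1]                      # sample covariance
rho = cov_xy / (x.std(ddof=1) * y.std(ddof=1))   # normalize by the standard deviations

print(rho)                          # a dimensionless number in [-1, 1]
print(np.corrcoef(x, y)[0, 1])      # NumPy's estimator gives the same value
```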
THEOREM 8.8 The correlation coefficient ρXY takes values in the range [−1, 1]. If ρXY = ±1, then X and Y are related by Y = a + bX, where the sign of b is the sign of the correlation coefficient.
PROOF Consider the following linear combination of X and Y:

\[
Z = \frac{X}{\sigma_X} + \frac{Y}{\sigma_Y}.
\]
We know that variance cannot be negative; hence

\[
\mathrm{Var}(Z) = \frac{\mathrm{Var}(X)}{\sigma_X^2} + \frac{\mathrm{Var}(Y)}{\sigma_Y^2} + 2\,\frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = 2\,(1 + \rho_{XY}) \geq 0,
\]
where σX and σY are the standard deviations of X and Y, respectively. This inequality immediately yields ρXY ≥ −1. By the same token, consider a slightly different linear combination:

\[
W = \frac{X}{\sigma_X} - \frac{Y}{\sigma_Y},
\]

whose variance is Var(W) = 2(1 − ρXY) ≥ 0; this yields ρXY ≤ 1.
We also know that if Var(Z) = 0, then Z must be a constant. For the first linear combination, the variance is zero when ρXY = −1, and we may write

\[
\frac{X}{\sigma_X} + \frac{Y}{\sigma_Y} = \alpha
\]
for some constant α. Rearranging the equality, we have

\[
Y = \alpha\,\sigma_Y - \frac{\sigma_Y}{\sigma_X}\,X.
\]
This has the form Y = a + bX, with a = ασY and b = −σY/σX; since standard deviations are positive, the slope is negative. Considering the second linear combination yields a similar relationship for the case ρXY = 1, in which the slope b is positive.
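The variance identities used in the proof are easy to check numerically. A minimal sketch (again assuming NumPy; the negatively correlated pair below is hypothetical) verifies that the sample analogues of Var(X/σX ± Y/σY) equal 2(1 ± ρXY), and are therefore nonnegative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = -0.8 * x + 0.6 * rng.normal(size=100_000)    # negatively correlated with x

sx, sy = x.std(ddof=1), y.std(ddof=1)
rho = np.corrcoef(x, y)[0, 1]

z = x / sx + y / sy      # the combination Z from the proof
w = x / sx - y / sy      # the second combination W
print(z.var(ddof=1), 2 * (1 + rho))   # equal up to rounding, and >= 0
print(w.var(ddof=1), 2 * (1 - rho))   # equal up to rounding, and >= 0
```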
Given the theorem, it is fairly easy to interpret a specific value of correlation:
- A value close to 1 shows a strong degree of positive correlation.
- A value close to −1 shows a strong degree of negative correlation.
- If correlation is zero, we speak of uncorrelated variables.

Fig. 8.3 Samples of jointly normal variables for different values of the correlation coefficient ρ.
A visual illustration of correlation is given in Fig. 8.3. Each scatterplot shows a sample of 100 joint observations from a jointly normal distribution with μ1 = μ2 = 10, σ1 = σ2 = 5, and different values of the correlation coefficient. The effect of correlation is quite evident if we imagine drawing a line through each cloud of points: the slope of the line corresponds to the sign of the correlation. In the limit cases ρ = ±1, the observations would lie exactly on a line. We stress again that uncorrelated variables need not be independent. A notable case, in which lack of correlation does imply independence, is the multivariate normal distribution.
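The experiment behind Fig. 8.3 is easy to replicate. The sketch below (assuming NumPy; seed and sample sizes are arbitrary) draws jointly normal pairs with the stated parameters for a few values of ρ, and then illustrates the final caveat with a pair that is uncorrelated yet completely dependent:

```python
import numpy as np

rng = np.random.default_rng(7)
mu, sigma = 10.0, 5.0

# Jointly normal samples, as in Fig. 8.3: the sample correlation tracks rho.
for rho in (-0.9, 0.0, 0.9):
    cov = [[sigma**2, rho * sigma**2],
           [rho * sigma**2, sigma**2]]
    sample = rng.multivariate_normal([mu, mu], cov, size=100)
    print(rho, np.corrcoef(sample[:, 0], sample[:, 1])[0, 1])

# Uncorrelated but not independent: Y = X^2 is a deterministic function
# of X, yet Cov(X, X^2) = 0 when X is symmetric around zero.
x = rng.normal(size=100_000)
y = x**2
print(np.corrcoef(x, y)[0, 1])    # approximately 0
```

There is no contradiction with the remark about the multivariate normal case: the pair (X, X²) is not jointly normal, so zero correlation carries no independence guarantee here.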