The covariance is a generalization of variance; hence, it is not surprising that it shares a relevant shortcoming: its value depends on the units of measurement of the underlying quantities. We recall that it is impossible to say whether a variance of 10,000 is large or small; a similar consideration applies to the standard deviation which, at least, is measured in the same units as the expected value, so that we may consider the coefficient of variation CX = σX/μX. We may try a similar normalization in order to define an adimensional version of covariance.
DEFINITION 8.7 (Correlation coefficient) The correlation coefficient between random variables X and Y is defined as

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$
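As a quick numerical illustration of the definition (a minimal sketch added here; the sample construction is arbitrary), the following code estimates ρXY by plugging sample statistics into the formula above and cross-checks the result against NumPy's np.corrcoef:

```python
import numpy as np

rng = np.random.default_rng(42)

# Arbitrary sample: Y is X plus independent noise, so a positive
# correlation (about 0.71 in theory) is expected.
x = rng.normal(loc=10.0, scale=5.0, size=10_000)
y = x + rng.normal(loc=0.0, scale=5.0, size=10_000)

# Plug sample statistics into the definition:
# rho = Cov(X, Y) / (sigma_X * sigma_Y).
cov_xy = np.cov(x, y)[0, 1]
rho_hat = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Cross-check against NumPy's correlation matrix; the divisor
# choices cancel in the ratio, so the two estimates coincide.
print(rho_hat, np.corrcoef(x, y)[0, 1])
```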
The coefficient of correlation is adimensional, and it can be easily interpreted on the basis of the following theorem.
THEOREM 8.8 The correlation coefficient ρXY takes values in the range [−1, 1]. If ρXY = ±1, then X and Y are related by Y = a + bX, where the sign of b is the sign of the correlation coefficient.
PROOF Consider the following linear combination of X and Y:

$$Z = \frac{X}{\sigma_X} + \frac{Y}{\sigma_Y}.$$

We know that variance cannot be negative; hence

$$\mathrm{Var}(Z) = \frac{\mathrm{Var}(X)}{\sigma_X^2} + \frac{\mathrm{Var}(Y)}{\sigma_Y^2} + \frac{2\,\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = 2\,(1 + \rho_{XY}) \ge 0,$$

where σX and σY are the standard deviations of X and Y, respectively. This inequality immediately yields ρXY ≥ −1. By the same token, consider a slightly different linear combination:

$$Z' = \frac{X}{\sigma_X} - \frac{Y}{\sigma_Y},$$

whose variance is Var(Z′) = 2(1 − ρXY) ≥ 0; this yields ρXY ≤ 1.

We also know that if Var(Z) = 0, then Z must be a constant. In the first case, the variance is zero if ρXY = −1. Then, we may write

$$\frac{X}{\sigma_X} + \frac{Y}{\sigma_Y} = \alpha$$

for some constant α. Rearranging the equality, we have

$$Y = \alpha\,\sigma_Y - \frac{\sigma_Y}{\sigma_X}\,X.$$

This can be rewritten as Y = a + bX, and since standard deviations are positive, we see that the slope b = −σY/σX is negative. Considering the second linear combination yields a similar relationship for the case ρXY = 1, in which the slope b is positive.
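Although the theorem needs no numerical support, the key identities in the proof, Var(Z) = 2(1 + ρXY) and Var(Z′) = 2(1 − ρXY), are easy to check by simulation. The following sketch (an added illustration; the target correlation −0.6 is an arbitrary choice) draws jointly normal pairs and compares the sample variances of the two standardized combinations with their theoretical values:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = -0.6  # target correlation; an arbitrary choice

# Draw jointly normal pairs with unit variances and correlation rho.
cov = [[1.0, rho], [rho, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=100_000).T

# Standardize by the sample standard deviations, as in the proof.
zx = x / np.std(x, ddof=1)
zy = y / np.std(y, ddof=1)

# Var(Z) = 2(1 + rho) and Var(Z') = 2(1 - rho), up to sampling error.
print(np.var(zx + zy, ddof=1), 2 * (1 + rho))  # both ~0.8
print(np.var(zx - zy, ddof=1), 2 * (1 - rho))  # both ~3.2
```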
Given the theorem, it is fairly easy to interpret a specific value of correlation:
- A value close to 1 shows a strong degree of positive correlation.
- A value close to −1 shows a strong degree of negative correlation.
- If correlation is zero, we speak of uncorrelated variables.
Fig. 8.3 Samples of jointly normal variables for different values of the correlation coefficient ρ.
A visual illustration of correlation is given in Fig. 8.3. Each scatterplot shows a sample of 100 joint observations from a jointly normal distribution with μ1 = μ2 = 10, σ1 = σ2 = 5, and different values of the correlation coefficient. The effect of correlation is quite evident if we imagine drawing a line through each cloud of points: the slope of the line corresponds to the sign of the correlation. In the limiting case ρ = ±1, the observations would lie exactly on a line. We stress again that uncorrelated variables need not be independent. A notable case in which lack of correlation does imply independence is the multivariate normal distribution.
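To make the last point concrete, here is a minimal sketch (an illustration added here, assuming X standard normal): setting Y = X² makes Y a deterministic function of X, yet Cov(X, Y) = E[X³] = 0 by symmetry, so the correlation coefficient vanishes even though the variables are strongly dependent.

```python
import numpy as np

rng = np.random.default_rng(7)

# X is standard normal; Y = X^2 is completely determined by X,
# so the two variables are as dependent as can be.
x = rng.standard_normal(1_000_000)
y = x**2

# Yet Cov(X, X^2) = E[X^3] = 0 for a symmetric zero-mean
# distribution, so the sample correlation is close to zero.
print(np.corrcoef(x, y)[0, 1])  # ~0.0 up to sampling error
```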