The covariance is a generalization of variance; hence, it is not surprising that it shares a relevant shortcoming: its value depends on the units of measurement of the underlying quantities. We recall that it is impossible to say whether a variance of 10,000 is large or small; a similar consideration applies to the standard deviation which, at least, is measured in the same units as the expected value, so that we may consider the coefficient of variation CX = σX/μX. We may try a similar normalization in order to define an adimensional version of covariance.
DEFINITION 8.7 (Correlation coefficient) The correlation coefficient between random variables X and Y is defined as

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$
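As a quick numerical illustration of the definition (a minimal sketch added here; the sample construction is arbitrary), the following code estimates ρXY by plugging sample statistics into the formula above and cross-checks the result against NumPy's np.corrcoef:

```python
import numpy as np

rng = np.random.default_rng(42)

# Arbitrary sample: Y is X plus independent noise, so a positive
# correlation (about 0.71 in theory) is expected.
x = rng.normal(loc=10.0, scale=5.0, size=10_000)
y = x + rng.normal(loc=0.0, scale=5.0, size=10_000)

# Plug sample statistics into the definition:
# rho = Cov(X, Y) / (sigma_X * sigma_Y).
cov_xy = np.cov(x, y)[0, 1]
rho_hat = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Cross-check against NumPy's correlation matrix; the divisor
# choices cancel in the ratio, so the two estimates coincide.
print(rho_hat, np.corrcoef(x, y)[0, 1])
```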
The coefficient of correlation is adimensional, and it can be easily interpreted on the basis of the following theorem.
THEOREM 8.8 The correlation coefficient ρXY takes values in the range [−1, 1]. If ρXY = ±1, then X and Y are related by Y = a + bX, where the sign of b is the sign of the correlation coefficient.
PROOF Consider the following linear combination of X and Y:

$$Z = \frac{X}{\sigma_X} + \frac{Y}{\sigma_Y}.$$

We know that variance cannot be negative; hence

$$\mathrm{Var}(Z) = \frac{\mathrm{Var}(X)}{\sigma_X^2} + \frac{\mathrm{Var}(Y)}{\sigma_Y^2} + \frac{2\,\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = 2\,(1 + \rho_{XY}) \ge 0,$$

where σX and σY are the standard deviations of X and Y, respectively. This inequality immediately yields ρXY ≥ −1. By the same token, consider a slightly different linear combination:

$$Z' = \frac{X}{\sigma_X} - \frac{Y}{\sigma_Y},$$

whose variance is Var(Z′) = 2(1 − ρXY) ≥ 0; this yields ρXY ≤ 1.

We also know that if Var(Z) = 0, then Z must be a constant. In the first case, the variance is zero if ρXY = −1. Then, we may write

$$\frac{X}{\sigma_X} + \frac{Y}{\sigma_Y} = \alpha$$

for some constant α. Rearranging the equality, we have

$$Y = \alpha\,\sigma_Y - \frac{\sigma_Y}{\sigma_X}\,X.$$

This can be rewritten as Y = a + bX, and since standard deviations are positive, we see that the slope b = −σY/σX is negative. Considering the second linear combination yields a similar relationship for the case ρXY = 1, in which the slope b is positive.
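Although the theorem needs no numerical support, the key identities in the proof, Var(Z) = 2(1 + ρXY) and Var(Z′) = 2(1 − ρXY), are easy to check by simulation. The following sketch (an added illustration; the target correlation −0.6 is an arbitrary choice) draws jointly normal pairs and compares the sample variances of the two standardized combinations with their theoretical values:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = -0.6  # target correlation; an arbitrary choice

# Draw jointly normal pairs with unit variances and correlation rho.
cov = [[1.0, rho], [rho, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=100_000).T

# Standardize by the sample standard deviations, as in the proof.
zx = x / np.std(x, ddof=1)
zy = y / np.std(y, ddof=1)

# Var(Z) = 2(1 + rho) and Var(Z') = 2(1 - rho), up to sampling error.
print(np.var(zx + zy, ddof=1), 2 * (1 + rho))  # both ~0.8
print(np.var(zx - zy, ddof=1), 2 * (1 - rho))  # both ~3.2
```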
Given the theorem, it is fairly easy to interpret a specific value of correlation:
- A value close to 1 shows a strong degree of positive correlation.
- A value close to −1 shows a strong degree of negative correlation.
- If correlation is zero, we speak of uncorrelated variables.
Fig. 8.3 Samples of jointly normal variables for different values of the correlation coefficient ρ.
A visual illustration of correlation is given in Fig. 8.3. Each scatterplot shows a sample of 100 joint observations from a jointly normal distribution with μ1 = μ2 = 10, σ1 = σ2 = 5, and different values of the correlation coefficient. The effect of correlation is quite evident if we imagine drawing a line through each cloud of points: the slope of the line corresponds to the sign of the correlation. In the limiting case ρ = ±1, the observations would lie exactly on a line. We stress again that uncorrelated variables need not be independent. A notable case in which lack of correlation does imply independence is the multivariate normal distribution.
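To make the last point concrete, here is a minimal sketch (an illustration added here, assuming X standard normal): setting Y = X² makes Y a deterministic function of X, yet Cov(X, Y) = E[X³] = 0 by symmetry, so the correlation coefficient vanishes even though the variables are strongly dependent.

```python
import numpy as np

rng = np.random.default_rng(7)

# X is standard normal; Y = X^2 is completely determined by X,
# so the two variables are as dependent as can be.
x = rng.standard_normal(1_000_000)
y = x**2

# Yet Cov(X, X^2) = E[X^3] = 0 for a symmetric zero-mean
# distribution, so the sample correlation is close to zero.
print(np.corrcoef(x, y)[0, 1])  # ~0.0 up to sampling error
```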