The linear data transformation, including centering, can be written as
$$\mathbf{Z} = \mathbf{A}\bigl(\mathbf{X} - \overline{\mathbf{X}}\bigr),$$
where $\overline{\mathbf{X}}$ is the vector of sample means. We assume that the data have already been centered, in order to ease notation. Hence
$$\mathbf{Z} = \mathbf{A}\mathbf{X}.$$
The $Z_i$ variables are called principal components: $Z_1$ is the first principal component. We recall that the matrix $\mathbf{A}$ rotating the axes is orthogonal:
$$\mathbf{A}\mathbf{A}^T = \mathbf{A}^T\mathbf{A} = \mathbf{I}.$$
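As a concrete illustration, here is a minimal NumPy sketch; the dataset and the rotation matrix are arbitrary examples chosen for this illustration, not taken from the text. It centers a small dataset and applies an orthogonal transformation:

```python
import numpy as np

# Small synthetic dataset: n = 5 observations of p = 2 variables.
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])

# Center the data (subtract the vector of sample means).
Xc = X - X.mean(axis=0)

# An example orthogonal matrix A (rotation by 45 degrees).
theta = np.pi / 4
A = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# Orthogonality check: A A^T should be the identity matrix.
assert np.allclose(A @ A.T, np.eye(2))

# Transformed variables: each row of Z is A applied to a centered observation.
Z = Xc @ A.T
```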
Now, let us consider the sample covariance matrix of $\mathbf{X}$, i.e., $\mathbf{S}_X$. Since we assume centered data, we recall from Section 15.3.1 that this matrix is given as follows:
$$\mathbf{S}_X = \frac{1}{n-1}\sum_{k=1}^{n}\mathbf{x}_k\mathbf{x}_k^T,$$
where $\mathbf{x}_k$, $k = 1,\ldots,n$, are the centered observations.
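Continuing the sketch above (reusing `X` and `Xc`), the sample covariance matrix can be computed directly and cross-checked against NumPy's built-in estimator:

```python
n = Xc.shape[0]

# Sample covariance of centered data: equivalent to the sum of outer
# products x_k x_k^T divided by (n - 1).
S_X = Xc.T @ Xc / (n - 1)

# Cross-check: np.cov with rowvar=False treats columns as variables.
assert np.allclose(S_X, np.cov(X, rowvar=False))
```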
Now, we may also find the corresponding sample covariance matrix for $\mathbf{Z}$, $\mathbf{S}_Z$, taking advantage of the results of Section 15.3.1. However, we would like to find a matrix $\mathbf{A}$ such that the resulting principal components are uncorrelated; in other words, $\mathbf{S}_Z$ should be diagonal:
$$\mathbf{S}_Z = \mathbf{A}\mathbf{S}_X\mathbf{A}^T = \operatorname{diag}\bigl(s_{Z_1}^2,\ldots,s_{Z_p}^2\bigr),$$
where $s_{Z_i}^2$ is the sample variance of each principal component. The matrix $\mathbf{A}$ should diagonalize the sample covariance matrix $\mathbf{S}_X$, and we have already seen such a diagonalization in Eq. (3.16). To diagonalize $\mathbf{S}_X$, we should consider the product
$$\mathbf{P}^T\mathbf{S}_X\mathbf{P} = \boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1,\ldots,\lambda_p),$$
where matrix $\mathbf{P}$ is orthogonal and its columns consist of the normalized eigenvectors of the sample covariance matrix; since this matrix is symmetric, its eigenvectors are indeed orthogonal. The diagonalized matrix $\boldsymbol{\Lambda}$ consists of the eigenvalues $\lambda_i$, $i = 1,\ldots,p$, of the sample covariance matrix $\mathbf{S}_X$. Putting everything together, we see that the rows $\mathbf{a}_i^T$, $i = 1,\ldots,p$, of matrix $\mathbf{A}$ should be the normalized eigenvectors of the sample covariance matrix:
$$\mathbf{A} = \mathbf{P}^T.$$
We also see that the sample variances of the principal components $Z_i$ are the eigenvalues of $\mathbf{S}_X$:
$$s_{Z_i}^2 = \lambda_i, \qquad i = 1,\ldots,p.$$
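The diagonalization is straightforward to carry out numerically. A sketch continuing the example (`np.linalg.eigh` is the appropriate routine here because $\mathbf{S}_X$ is symmetric):

```python
# Eigendecomposition of the symmetric matrix S_X.
# eigh returns eigenvalues in ascending order; reorder to decreasing.
eigvals, P = np.linalg.eigh(S_X)
order = np.argsort(eigvals)[::-1]
eigvals, P = eigvals[order], P[:, order]

# Rows of A are the normalized eigenvectors: A = P^T
# (this A replaces the arbitrary rotation used earlier).
A = P.T

# S_Z = A S_X A^T is diagonal, with the eigenvalues on the diagonal.
S_Z = A @ S_X @ A.T
assert np.allclose(S_Z, np.diag(eigvals))

# Principal components; their sample variances equal the eigenvalues.
Z = Xc @ A.T
assert np.allclose(Z.var(axis=0, ddof=1), eigvals)
```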
If we sort the eigenvalues in decreasing order, we see that $Z_1$ is indeed the first principal component, accounting for the largest share of variability. Then, the second principal component $Z_2$ is orthogonal to $Z_1$ and ranks second in explained variance. The fraction of variance explained by the first $q$ components is
$$\frac{\sum_{i=1}^{q}\lambda_i}{\sum_{i=1}^{p}\lambda_i}.$$
By taking the first few components, we can account for most of the variability and reduce the problem dimension by replacing the original variables with the principal components.
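Continuing the sketch, the explained-variance fraction follows directly from the sorted eigenvalues; the 90% cutoff below is an assumed threshold chosen purely for illustration, not a rule from the text:

```python
# Fraction of variance explained by the first q components, for each q.
explained = np.cumsum(eigvals) / eigvals.sum()

# Smallest q whose cumulative fraction reaches the chosen threshold.
q = int(np.searchsorted(explained, 0.90) + 1)

# Reduced-dimension representation: first q principal components.
Z_reduced = Xc @ A[:q].T
print(f"q = {q}, explained fraction = {explained[q - 1]:.3f}")
```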