The linear data transformation, including centering, can be written as

$$Z = A(X - \bar{X}),$$

where $\bar{X}$ is the vector of sample means. We assume that data have already been centered, in order to ease notation. Hence

$$Z = AX.$$

The variables $Z_i$ are called principal components; $Z_1$ is the first principal component. We recall that the matrix $A$ rotating axes is orthogonal:

$$A^T A = A A^T = I.$$

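Now, let us consider the sample covariance matrix of $X$, i.e., $S_X$. Since we assume centered data, we recall from Section 15.3.1 that this matrix is given as follows:

$$S_X = \frac{1}{n-1}\,\mathcal{X}^T \mathcal{X},$$

where $\mathcal{X}$ denotes the $n \times p$ data matrix whose rows are the centered observations.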
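As a quick numerical check of the covariance formula for centered data, the following sketch builds a small synthetic data set and compares the matrix product with NumPy's own estimator. The data, the seed, and the names `X_raw`, `X`, and `S_X` are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Illustrative synthetic data: n observations of p correlated variables.
rng = np.random.default_rng(0)
n, p = 200, 3
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X_raw = rng.normal(size=(n, p)) @ mixing

# Center each column by subtracting its sample mean.
X = X_raw - X_raw.mean(axis=0)

# Sample covariance matrix of centered data (divisor n - 1).
S_X = X.T @ X / (n - 1)

# np.cov centers internally and also uses n - 1 by default.
matches = np.allclose(S_X, np.cov(X_raw, rowvar=False))
print(matches)
```

Here observations are stored as rows, which is why `rowvar=False` is passed to `np.cov`.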

Now, we may also find the corresponding sample covariance matrix for $Z$, $S_Z$, taking advantage of the results of Section 15.3.1. However, we would like to find a matrix $A$ such that the resulting principal components are uncorrelated; in other words, $S_Z$ should be diagonal:

$$S_Z = A S_X A^T = \begin{bmatrix} S_{Z_1}^2 & & \\ & \ddots & \\ & & S_{Z_p}^2 \end{bmatrix},$$

where $S_{Z_i}^2$ is the sample variance of each principal component. The matrix $A$ should diagonalize the sample covariance matrix $S_X$, and we have already seen such a diagonalization in Eq. (3.16). To diagonalize $S_X$, we should consider the product

$$P^T S_X P = \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_p \end{bmatrix},$$

where matrix $P$ is orthogonal and its columns consist of the normalized eigenvectors of the sample covariance matrix; since this is symmetric, its eigenvectors are indeed orthogonal. The diagonalized matrix consists of the eigenvalues $\lambda_i$, $i = 1,\dots,p$, of the sample covariance matrix $S_X$. Putting everything together, we see that the rows $u_i^T$, $i = 1,\dots,p$, of matrix $A$ should be the normalized eigenvectors of the sample covariance matrix:

$$A = \begin{bmatrix} u_1^T \\ \vdots \\ u_p^T \end{bmatrix} = P^T.$$

We also see that the sample variances of the principal components $Z_i$ are the eigenvalues of $S_X$:

$$S_{Z_i}^2 = \lambda_i, \qquad i = 1,\dots,p.$$
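The construction can be checked numerically. The sketch below, on assumed synthetic data, takes the rows of $A$ to be the normalized eigenvectors of the sample covariance matrix and verifies that the transformed variables have a diagonal covariance matrix whose entries are the eigenvalues. It uses `np.linalg.eigh`, which is meant for symmetric matrices and returns eigenvalues in ascending order, so they are reordered explicitly; all variable names are illustrative.

```python
import numpy as np

# Illustrative synthetic data, centered column by column.
rng = np.random.default_rng(1)
n, p = 500, 3
mixing = np.array([[3.0, 1.0, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(n, p)) @ mixing
X = X - X.mean(axis=0)

S_X = np.cov(X, rowvar=False)

# Eigen-decomposition of the symmetric matrix S_X.
# Columns of P are normalized eigenvectors; eigh sorts eigenvalues ascending.
eigvals, P = np.linalg.eigh(S_X)
order = np.argsort(eigvals)[::-1]        # reorder to decreasing eigenvalues
eigvals, P = eigvals[order], P[:, order]

A = P.T                                  # rows of A are the eigenvectors
Z = X @ A.T                              # each row of Z is z = A x

S_Z = np.cov(Z, rowvar=False)

print(np.allclose(A @ A.T, np.eye(p)))       # A is orthogonal
print(np.allclose(S_Z, np.diag(eigvals)))    # S_Z is diagonal with the eigenvalues
```

Because observations are stored as rows, the transformation $Z = AX$ for a single observation becomes `X @ A.T` for the whole data matrix.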

If we sort the eigenvalues in decreasing order, we see that $Z_1$ is indeed the first principal component, accounting for the largest share of variability. The second principal component $Z_2$ is uncorrelated with $Z_1$, its direction is orthogonal to that of $Z_1$, and it is second in rank. The fraction of variance explained by the first $q$ components is

$$\frac{\sum_{i=1}^{q} \lambda_i}{\sum_{i=1}^{p} \lambda_i}.$$

Taking the first few components, we can account for most variability and reduce the problem dimension by replacing the original variables by the principal components.
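To make the explained-variance ratio concrete, here is a minimal sketch that computes it for a made-up set of eigenvalues and finds the smallest number of components reaching a chosen threshold. The helper `explained_fraction`, the eigenvalues, and the 90% threshold are hypothetical choices for illustration only.

```python
import numpy as np

def explained_fraction(eigvals, q):
    """Fraction of total variance explained by the q largest eigenvalues."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    return lam[:q].sum() / lam.sum()

# Hypothetical eigenvalues of a sample covariance matrix (p = 4).
eigvals = [4.0, 1.5, 0.4, 0.1]

fractions = [explained_fraction(eigvals, q) for q in range(1, len(eigvals) + 1)]

# Smallest q whose cumulative fraction reaches the (assumed) 90% threshold.
threshold = 0.9
q_min = int(np.searchsorted(fractions, threshold) + 1)

print(fractions)
print(q_min)
```

With these numbers the first two components already explain 5.5/6 of the total variance, so two components would pass a 90% threshold.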

