In Section 3.4.3 we observed that a square matrix is a way to represent a linear mapping from the space of n-dimensional vectors to itself.
Such a transformation, in general, entails both a rotation and a change of vector length. If the matrix is orthogonal, then the mapping is just a rotation. It may happen, for a specific vector v and a scalar λ, that
Av = λv
Then, if λ > 0 we have just a change in the length of v; if λ < 0 we also have a reflection of the vector. In such a case we say
- That λ is an eigenvalue of the matrix A
- That v is an eigenvector of the matrix A
It is easy to see that if v is an eigenvector, then any vector obtained by multiplication by a scalar α ≠ 0, i.e., any vector of the form αv, is an eigenvector, too. If we divide a generic eigenvector by its norm, we get a unit eigenvector:
v / ‖v‖
For the matrix A given in Eq. (3.12), it can be checked that the vector v = [1, 1]^T satisfies Av = 2v. Hence, λ = 2 is an eigenvalue of A, and v is a corresponding eigenvector. Actually, any vector of the form [α, α]^T is an eigenvector of A corresponding to the eigenvalue λ = 2. The eigenvector
[1/√2, 1/√2]^T
is a unit eigenvector of A.
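This kind of check is easy to reproduce numerically. The following Python sketch uses NumPy on a small illustrative matrix (an assumption of the sketch, not necessarily the matrix of Eq. (3.12)) that has [1, 1]^T as an eigenvector with eigenvalue λ = 2:

```python
import numpy as np

# Illustrative 2x2 matrix (an assumption of this sketch, not necessarily Eq. (3.12)).
A = np.array([[1.0, 1.0],
              [4.0, -2.0]])

v = np.array([1.0, 1.0])          # candidate eigenvector
print(A @ v)                      # [2. 2.], i.e., A v = 2 v, so lambda = 2
print(np.allclose(A @ v, 2 * v))  # True: v is an eigenvector with eigenvalue 2

u = v / np.linalg.norm(v)         # unit eigenvector [1/sqrt(2), 1/sqrt(2)]
print(u)
```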
Matrix eigenvalues are an extremely useful tool with applications in mathematics, statistics, and physics that go well beyond the scope of this book. As far as we are concerned, the following points will be illustrated in this and later chapters:
- Eigenvalues may be used to investigate convexity and concavity of a function.
- They are relevant in optimization applications.
- They are useful in multivariate statistical methods, like principal component analysis, which have a lot of applications, e.g., in marketing and quantitative finance.
Now we should wonder whether there is a way to compute eigenvalues systematically. In practice, there are powerful numerical methods to do so, but we will stick to the most natural idea, which illustrates a lot of points about eigenvalues. Note that if λ is an eigenvalue and v an eigenvector, we have
Av − λv = (A − λI)v = 0     (3.13)
This means that we may express the zero vector as a linear combination of the columns of the matrix A − λI. A trivial solution of this system is v = 0, but a nontrivial solution can be found if and only if the columns of that matrix are not linearly independent. This, in turn, is equivalent to saying that the determinant is zero. Hence, to find the eigenvalues of a matrix, we should solve the following equation:
det(A − λI) = 0     (3.14)
This equation is called the characteristic equation of the matrix, as the eigenvalues capture the essential nature of a matrix.
Example 3.14 Let us apply Eq. (3.14) to the matrix A in Eq. (3.12):
det(A − λI) = λ^2 + λ − 6 = 0
This second-order equation has solutions λ1 = 2 and λ2 = −3. These numbers are the eigenvalues of matrix A. To find the eigenvectors corresponding to an eigenvalue, just plug an eigenvalue (say, 2) into the equation (A − λI)v = 0:
These two equations are redundant (the matrix is singular), and by taking either one, we can show that any vector v such that v1 = v2 is a solution of the system, i.e., an eigenvector associated with the eigenvalue λ = 2.
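In practice, we would not solve the characteristic equation by hand. The following minimal NumPy sketch (again using the illustrative matrix assumed above) builds the coefficients of the characteristic polynomial with np.poly, finds its roots with np.roots, and compares them with the output of np.linalg.eig:

```python
import numpy as np

# Same illustrative matrix as above (an assumption, not necessarily Eq. (3.12)).
A = np.array([[1.0, 1.0],
              [4.0, -2.0]])

coeffs = np.poly(A)       # coefficients of the characteristic polynomial, highest degree first
print(coeffs)             # approximately [ 1.  1. -6.], i.e., lambda^2 + lambda - 6
print(np.roots(coeffs))   # its roots: 2 and -3 (order may vary)

eigvals, eigvecs = np.linalg.eig(A)   # eigenvalues and eigenvectors (as columns)
print(eigvals)                        # 2 and -3 again (order may vary)
i = np.argmax(eigvals)                # index of the larger eigenvalue, 2
print(eigvecs[:, i])                  # proportional to [1, 1] (up to sign), as found by hand
```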
Can we say something about the number of eigenvalues of a matrix? This is not an easy question, but we can say that an n × n matrix can have up to n distinct eigenvalues. To see why, observe that the characteristic equation involves a polynomial of degree n, which may have up to n distinct roots. In general, we can state the following:
- Eigenvalues may be complex conjugates, rather than real numbers; for instance, consider the following matrix:
  B = [ 0  −1
        1   0 ]
  The characteristic polynomial is
  det(B − λI) = λ^2 + 1 = 0
  Hence, the two eigenvalues are λ = ±i, where i is the unit imaginary number defined by i^2 = −1. We do not find real eigenvalues, but this is not surprising, as matrix B rotates a vector by π/2, i.e., 90°, on the plane. (Please check this graphically! A quick numerical check is also sketched right after this list.)
- Eigenvalues may be multiple roots of the characteristic polynomial, as in the case λ^2 + 2λ + 1 = (λ + 1)^2 = 0. When there are multiple eigenvalues, finding eigenvectors may be a bit tricky, but we can do without these technicalities.
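The complex eigenvalues mentioned in the first point are easy to confirm numerically, since np.linalg.eig handles complex eigenvalues transparently. A minimal sketch, assuming the standard counterclockwise rotation by π/2:

```python
import numpy as np

# Counterclockwise rotation by pi/2 (90 degrees) on the plane.
B = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigvals, _ = np.linalg.eig(B)
print(eigvals)   # [0.+1.j 0.-1.j]: the complex conjugate pair +i and -i
```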
The following properties of the eigenvalues of a matrix A are worth mentioning here:
- The determinant of the matrix is the product of the eigenvalues: det(A) = λ1 λ2 ⋯ λn.
- The trace of the matrix, which is just the sum of its entries on the diagonal, is the sum of the eigenvalues: tr(A) = a11 + a22 + ⋯ + ann = λ1 + λ2 + ⋯ + λn.
The first property has an interesting consequence: The matrix A is singular (hence, not invertible) when one of its eigenvalues is zero.
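Both properties are easy to verify numerically; the sketch below reuses the illustrative 2 × 2 matrix assumed in the earlier snippets:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [4.0, -2.0]])   # illustrative matrix (an assumption of these sketches)
eigvals = np.linalg.eigvals(A)

print(np.prod(eigvals), np.linalg.det(A))   # both are -6 (up to rounding): det = product of eigenvalues
print(np.sum(eigvals), np.trace(A))         # both are -1: trace = sum of eigenvalues
```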
3.7.1 Eigenvalues and eigenvectors of a symmetric matrix
In applications, it is often the case that the matrix A is symmetric. A symmetric matrix has an important property.
THEOREM 3.10 Let A be a symmetric n × n matrix. Then
- The matrix has only real eigenvalues.
- Its eigenvectors are mutually orthogonal (when an eigenvalue is multiple, the corresponding eigenvectors can be chosen to be mutually orthogonal).
The second point in the theorem implies that if we form a matrix P whose columns are the normalized (unit) eigenvectors,
P = [v1 v2 ⋯ vn],
this matrix is orthogonal, i.e., P^T P = I and P^(-1) = P^T. This allows us to build a very useful factorization of a symmetric matrix A.
If we denote by D the diagonal matrix collecting the n eigenvalues on its diagonal, we see that we may factor matrix A as follows:
A = P D P^T
Going the other way around, we may “diagonalize” matrix A as follows:
D = P^T A P
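For symmetric matrices, NumPy provides np.linalg.eigh, which returns real eigenvalues and an orthogonal matrix of unit eigenvectors, so both the factorization and the diagonalization can be checked in a few lines. The 2 × 2 symmetric matrix below is just an illustrative choice for this sketch:

```python
import numpy as np

# Illustrative symmetric matrix (assumed for this sketch).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, P = np.linalg.eigh(A)   # eigenvalues (ascending) and orthonormal eigenvectors as columns of P
D = np.diag(lam)

print(np.allclose(P.T @ P, np.eye(2)))   # True: P is orthogonal
print(np.allclose(P @ D @ P.T, A))       # True: A = P D P^T
print(np.allclose(P.T @ A @ P, D))       # True: D = P^T A P
```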
Example 3.15 Consider the following matrix:
Its characteristic polynomial can be obtained by developing the determinant along the last row:
We immediately see that its roots are λ1 = λ2 = 3 and λ3 = 1.
They are three real eigenvalues, as expected, but one is a multiple eigenvalue. To find the eigenvectors, let us plug λ = 3 into Eq. (3.13):
From the last row of the matrix, we see that v3 can be chosen freely. Furthermore, we see that the first and the second equation are not linearly independent. In fact, the rank of the matrix A − 3I is 1, and we may find two linearly independent eigenvectors, whose free coefficients may be chosen in order to have two orthogonal unit eigenvectors. We leave it as an exercise for the reader to verify that, if we plug in the eigenvalue λ = 1, we obtain a third unit eigenvector, orthogonal to the previous two.
Now, we may check the factorization A = P D P^T, where P is the orthogonal matrix whose columns are the three unit eigenvectors and D is the diagonal matrix collecting the eigenvalues 3, 3, and 1.
Luckily, there are efficient and robust numerical methods for finding eigenvalues and eigenvectors, as well as for diagonalizing a matrix. These methods are implemented by widely available software tools. What is more relevant to us is the remarkable number of ways in which we may take advantage of the results above. As an example, the computation of powers of a symmetric matrix, A^k, is considerably simplified:
A^k = P D^k P^T,
where D^k is the diagonal matrix collecting the kth powers of the eigenvalues, λ1^k, λ2^k, …, λn^k.
By the same token, we may compute the square root of a matrix, i.e., a matrix A^(1/2) such that A = A^(1/2) A^(1/2):
A^(1/2) = P D^(1/2) P^T,
where D^(1/2) is the diagonal matrix collecting the square roots of the eigenvalues.
Matrix inversion is easy, too:
A^(-1) = P D^(-1) P^T,
where D^(-1) = diag(1/λ1, 1/λ2, …, 1/λn). Here we have used a shorthand notation to denote a diagonal matrix by just giving its diagonal elements.
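A minimal sketch of these three computations, again assuming the illustrative symmetric matrix used above (its eigenvalues, 1 and 3, are positive, so both the square root and the inverse are well defined):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # illustrative symmetric matrix (assumed), eigenvalues 1 and 3
lam, P = np.linalg.eigh(A)   # spectral factorization A = P diag(lam) P^T

A_cubed = P @ np.diag(lam**3) @ P.T        # A^3 via the factorization
print(np.allclose(A_cubed, A @ A @ A))     # True

A_sqrt = P @ np.diag(np.sqrt(lam)) @ P.T   # square root (requires nonnegative eigenvalues)
print(np.allclose(A_sqrt @ A_sqrt, A))     # True

A_inv = P @ np.diag(1.0 / lam) @ P.T       # inverse (requires nonzero eigenvalues)
print(np.allclose(A_inv, np.linalg.inv(A)))  # True
```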