Matrix inversion is an operation that has no counterpart in the vector case, and it deserves its own section. In the scalar case, when we consider standard multiplication, we observe that there is a "neutral" element for that operation, the number 1. This is a neutral element in the sense that for any number x, we have x · 1 = 1 · x = x.
Can we define a neutral element in the case of matrix multiplication? The answer is yes, provided that we confine ourselves to square matrices. If so, it is easy to see that the following identity matrix will do the trick:

$$
I = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
$$
The identity matrix consists of a diagonal of "ones," with zeros everywhere else, and it is easy to see that, for any square matrix A, we have AI = IA = A. By the same token, for any vector v, we have Iv = v (but in this case we cannot commute the two factors in the multiplication).
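A quick numerical sketch, assuming NumPy and arbitrary illustrative values, checks these identities directly:

```python
import numpy as np

# A 3x3 matrix with arbitrary entries (illustrative values only)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
I = np.eye(3)                  # the 3x3 identity matrix
v = np.array([1.0, 2.0, 3.0])

print(np.allclose(A @ I, A))   # True: AI = A
print(np.allclose(I @ A, A))   # True: IA = A
print(np.allclose(I @ v, v))   # True: Iv = v
```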
In the scalar case, given a number x ≠ 0, we define its inverse $x^{-1}$, such that $x \cdot x^{-1} = x^{-1} \cdot x = 1$. Can we do the same with (square) matrices? Indeed, we can sometimes find the inverse of a square matrix A, denoted by $A^{-1}$, which is a matrix such that

$$
A A^{-1} = A^{-1} A = I
$$
However, the existence of the inverse matrix is not guaranteed. We will learn in the following text that matrix inversion is strongly related to the possibility of solving a system of linear equations. Indeed, if a matrix A is invertible, solving a system of linear equations like Ax = b is easy. To see this, just premultiply the system by the inverse of A:

$$
A^{-1} A x = A^{-1} b \quad\Longrightarrow\quad x = A^{-1} b
$$
Note the analogy with the solution of the linear equation ax = b. We know that x = b/a, but we are in trouble if a = 0. By the same token, a matrix might not be invertible, in which case this recipe fails and we cannot find a solution to the system of equations this way. To really understand the issues involved, we need a few theoretical concepts from linear algebra, which we introduce in Section 3.5.
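A short computational sketch of both cases, again assuming NumPy (the matrices below are arbitrary examples; in practice np.linalg.solve is preferred over forming the inverse explicitly):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])             # invertible: det(A) = 5
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)              # solves Ax = b directly
x_via_inverse = np.linalg.inv(A) @ b   # x = A^{-1} b, same result
print(np.allclose(x, x_via_inverse))   # True

S = np.array([[1.0, 2.0],
              [2.0, 4.0]])             # singular: second row is twice the first
try:
    np.linalg.inv(S)
except np.linalg.LinAlgError as err:
    print("not invertible:", err)      # analogous to dividing by a = 0
```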
For now, it is useful to reinterpret the solution of a system of linear equations under a different perspective. Imagine "slicing" a matrix A, i.e., think of it as made of its column vectors $a_j$:

$$
A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}
$$
Then, we may see that solving a system of linear equations amounts to expressing the right-hand side b as a linear combination of the columns of A:

$$
Ax = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = \sum_{j=1}^{n} x_j a_j = b
$$
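To illustrate this column-oriented reading numerically, here is a small sketch, again assuming NumPy and arbitrary values:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 2.0])

b = A @ x                                             # b = Ax
b_as_combination = x[0] * A[:, 0] + x[1] * A[:, 1]    # x1*a1 + x2*a2
print(b, b_as_combination)                            # [4. 7.] [4. 7.]
print(np.allclose(b, b_as_combination))               # True
```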
There is no guarantee that we can always express b as a linear combination of the columns $a_j$, $j = 1, \ldots, n$, with coefficients $x_j$. This is even more evident if we consider a rectangular matrix. For instance, if we have a system of linear equations associated with a matrix in $\mathbb{R}^{m \times n}$, with m > n, then we have many equations and just a few unknown variables. It stands to reason that in such a circumstance we may fail to solve the system, which means that we may well fail to express a vector with many components using just a few vectors as building blocks. This kind of interpretation will prove quite helpful later.
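As a final sketch of this failure mode, the overdetermined system below (m = 3 equations, n = 2 unknowns, arbitrary values) has no exact solution; least squares, which goes beyond this section, is used here only as a convenient way to expose the nonzero residual:

```python
import numpy as np

# Three equations, two unknowns: b generally lies outside the span of the columns
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])   # not a linear combination of the two columns

# Least squares finds the best approximation; a nonzero residual signals
# that b cannot be written exactly as x1*a1 + x2*a2
x, residual, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(x)          # best-fit coefficients
print(residual)   # nonzero: the system has no exact solution
```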