Matrix diagonalization is one of my favourite topics in linear algebra. I still remember learning it for the first time and being quite awed by it. What is diagonalization? If $A$ is a square matrix, diagonalizing $A$ means finding an invertible matrix $P$ such that $P^{-1}AP = D$ where $D$ is a diagonal matrix. It’s important to note that diagonalizability depends on the field or ring we are working over. In this post we will work only over fields; diagonalization of course also makes sense over rings, but the procedure there is more complicated.

Not all matrices can be diagonalized. For example, the upper triangular matrix

$$\begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}$$ cannot be diagonalized over any field. Though I guess it is already a diagonal matrix over the zero ring, which is quite perverse if you ask me. Over a field, the procedure for diagonalizing a square $n\times n$ matrix $A$ is as follows:

- Compute the characteristic polynomial, defined as the determinant $|A-\lambda I_n|$ where $I_n$ is the $n\times n$ identity matrix and $\lambda$ is a variable. This gives you a polynomial in $\lambda$.
- Find all the roots of the characteristic polynomial. The roots are called eigenvalues. The characteristic polynomial should split into a product of linear factors $(\lambda-\lambda_i)$. If it does not, stop: the matrix is not diagonalizable over the field. Alternatively, pass to an algebraically closed field containing your field and keep going.
- For each root $\lambda_i$ of the characteristic polynomial, compute a basis of the null space of $A-\lambda_i I_n$. These vectors are called eigenvectors. The dimension of each null space should be the same as the multiplicity of the corresponding eigenvalue in the characteristic polynomial. If not, stop as the matrix is not diagonalizable.
- The matrix $P$ is the matrix whose columns are the basis vectors for the null spaces computed in the previous step, grouped together by eigenvalues $\lambda_1,\dots,\lambda_k$. The diagonal matrix is the diagonal matrix whose diagonal entries are the eigenvalues, each repeated according to their multiplicity in the characteristic polynomial.
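For the $2\times 2$ case treated below, the first two steps reduce to a quadratic: the characteristic polynomial is $\lambda^2 - (\mathrm{tr}\,A)\lambda + \det A$. A minimal pure-Python sketch over the reals (the function names are my own, not from any library):

```python
import math

def char_poly_2x2(a, b, c, d):
    """Coefficients of |A - lambda*I_2| = lambda^2 - (trace)*lambda + det
    for A = [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    return trace, det  # p(lambda) = lambda**2 - trace*lambda + det

def eigenvalues_2x2(a, b, c, d):
    """Roots of the characteristic polynomial via the quadratic formula.
    Works over the reals; raises if the polynomial does not split there."""
    trace, det = char_poly_2x2(a, b, c, d)
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("characteristic polynomial does not split over the reals")
    r = math.sqrt(disc)
    return (trace - r) / 2, (trace + r) / 2

# The example matrix from later in the post: eigenvalues 2 and 3.
print(eigenvalues_2x2(1, 1, -2, 4))  # (2.0, 3.0)
```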

Here we will look at the special case of $2\times 2$ matrices, where we can take some shortcuts.

## Example

It will be easiest if we just look at an example:

$$A = \begin{pmatrix}1 & 1\\-2 & 4\end{pmatrix}.$$ We start by computing the characteristic polynomial, which is the determinant of

$$\begin{pmatrix}1 - \lambda & 1\\-2 & 4 -\lambda\end{pmatrix}.$$

This determinant is

$$p(\lambda) = \lambda^2 -5\lambda + 6.$$ We can factor this polynomial as

$$p(\lambda) = (\lambda - 2)(\lambda - 3).$$ With a $2\times 2$ matrix, we can tell immediately that the matrix $A$ is diagonalizable. That is because each eigenvalue has at least one eigenvector, eigenvectors belonging to distinct eigenvalues are linearly independent, and we have two distinct eigenvalues, so we get a basis of eigenvectors. On the other hand, if we have only a single eigenvalue, then the matrix is not diagonalizable unless of course it already *is* diagonal: if $P^{-1}AP = \lambda I_2$, then $A = P(\lambda I_2)P^{-1} = \lambda I_2$.

Therefore, we can make this general statement: *a $2\times 2$ matrix $A$ over a field $F$ that is not already diagonal is diagonalizable if and only if its characteristic polynomial splits into a product of two distinct linear factors over $F$.*
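Over the reals, the criterion is easy to test mechanically: the quadratic $\lambda^2 - (a+d)\lambda + (ad-bc)$ has two distinct real roots exactly when its discriminant is positive. A hedged sketch (my own function, specialized to $F = \mathbb{R}$; over other fields, such as $\mathbb{Q}$, you would also need the discriminant to be a square in $F$):

```python
def is_diagonalizable_2x2_real(a, b, c, d):
    """Test the criterion over the reals for A = [[a, b], [c, d]]:
    either A is already diagonal, or its characteristic polynomial
    lambda^2 - (a+d)*lambda + (a*d - b*c) has two distinct real roots,
    i.e. positive discriminant."""
    if b == 0 and c == 0:  # already diagonal
        return True
    disc = (a + d) ** 2 - 4 * (a * d - b * c)
    return disc > 0

print(is_diagonalizable_2x2_real(1, 1, -2, 4))  # True: eigenvalues 2 and 3
print(is_diagonalizable_2x2_real(1, 1, 0, 1))   # False: repeated eigenvalue 1
```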

Next, we need to find the matrix $P$. Now, we have to put the eigenvalues back into the matrix $A-\lambda I_2$. However, in the case of a $2\times 2$ matrix, we know that for any eigenvalue $\lambda$, the matrix $A-\lambda I_2$ is singular, so its two rows are linearly dependent. So we don’t need to row-reduce it at all; we can just consider the first row. Moreover, if the first row is $[a, b]$ then the vector $[b, -a]$ spans the null space (it is the eigenvector), since $[a, b]\cdot[b, -a] = ab - ba = 0$. This works whenever the first row is nonzero; in the rare case that the first row vanishes, use the second row instead.
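The eigenvector shortcut can be sketched in a few lines of Python (the function name is my own; it falls back to the second row when the first row of $A-\lambda I_2$ vanishes):

```python
def eigenvector_2x2(a, b, c, d, lam):
    """Eigenvector of A = [[a, b], [c, d]] for eigenvalue lam, using the
    shortcut: if a nonzero row of A - lam*I_2 is [r, s], then [s, -r]
    spans the null space, since [r, s] . [s, -r] = r*s - s*r = 0."""
    r, s = a - lam, b
    if r == 0 and s == 0:  # first row vanished; use the second row
        r, s = c, d - lam
    if r == 0 and s == 0:  # A == lam*I_2: any nonzero vector works
        return (1, 0)
    return (s, -r)

print(eigenvector_2x2(1, 1, -2, 4, 2))  # (1, 1)
print(eigenvector_2x2(1, 1, -2, 4, 3))  # (1, 2)
```

These two outputs are exactly the columns of $P$ computed below.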

For example, our first eigenvalue is $\lambda = 2$ and the first row of $A - \lambda I_2$ is $[-1, 1]$. Therefore, the first column of our matrix $P$ is $[1, 1]^t$. The second eigenvalue $\lambda = 3$ gives a first row of $A - \lambda I_2$ equal to $[-2, 1]$, so the second column of $P$ is $[1, 2]^t$. Thus

$$P = \begin{pmatrix}1 & 1\\1 & 2\end{pmatrix}.$$ Incidentally, the inverse of $P$ is

$$P^{-1} = \begin{pmatrix}2 & -1\\-1 & 1\end{pmatrix}.$$ Therefore, we have

$$\begin{pmatrix}2 & -1\\-1 & 1\end{pmatrix}\begin{pmatrix}1 & 1\\-2 & 4\end{pmatrix}\begin{pmatrix}1 & 1\\1 & 2\end{pmatrix} = \begin{pmatrix}2 & 0\\0 & 3\end{pmatrix}.$$
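As a sanity check, the whole worked example can be verified with a few lines of plain Python (a throwaway `mat_mul` helper of my own, not a library call):

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [-2, 4]]
P = [[1, 1], [1, 2]]
P_inv = [[2, -1], [-1, 1]]  # det(P) = 1, so the adjugate is the inverse

D = mat_mul(P_inv, mat_mul(A, P))
print(D)  # [[2, 0], [0, 3]]
```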

And that’s all there is to it!
