Eigenvalues and Eigenvectors (BR Chapter 11)

  • Let $\mathbf{A} \in \mathbb{R}^{n \times n}$ and $$ \mathbf{A} \mathbf{x} = \lambda \mathbf{x}, \quad \mathbf{x} \ne \mathbf{0}. $$ Then $\lambda$ is an eigenvalue of $\mathbf{A}$ with corresponding eigenvector $\mathbf{x}$.

Computing eigenvalues (by hand)

  • From the eigen-equation $\mathbf{A} \mathbf{x} = \lambda \mathbf{x}$, we have $$ (\mathbf{A} - \lambda \mathbf{I}) \mathbf{x} = \mathbf{0}. $$ That is, the matrix $\mathbf{A} - \lambda \mathbf{I}$ must be singular, so $$ \det(\mathbf{A} - \lambda \mathbf{I}) = 0. $$

  • The degree-$n$ polynomial $$ p_{\mathbf{A}}(\lambda) = \det(\lambda \mathbf{I} - \mathbf{A}) $$ is called the characteristic polynomial of $\mathbf{A}$. The eigenvalues are the roots of $p_{\mathbf{A}}(\lambda)$.

  • Example: For $$ \mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, $$ the characteristic polynomial is $$ p_{\mathbf{A}}(\lambda) = \det \begin{pmatrix} \lambda - 2 & -1 \\ -1 & \lambda - 2 \end{pmatrix} = \lambda^2 - 4 \lambda + 3 = (\lambda - 1)(\lambda - 3). $$ Therefore $\mathbf{A}$'s eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = 3$. Solving the linear system $$ \begin{pmatrix} \lambda - 2 & -1 \\ -1 & \lambda - 2 \end{pmatrix} \mathbf{x} = \mathbf{0} $$ at $\lambda = 1$ and $\lambda = 3$ gives the corresponding eigenvectors $$ \mathbf{x}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad \mathbf{x}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$ We observe that (1) $\text{tr}(\mathbf{A}) = \lambda_1 + \lambda_2$, (2) $\det(\mathbf{A}) = \lambda_1 \lambda_2$, and (3) the two eigenvectors are orthogonal to each other.
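
As a quick numerical check of this example (a sketch; only the standard LinearAlgebra library is used), we can confirm that $\det(\mathbf{A} - \lambda \mathbf{I})$ vanishes at the hand-computed eigenvalues and that the trace and determinant identities hold, before calling eigen below.

using LinearAlgebra

A = [2.0 1.0; 1.0 2.0]
# det(A - λI) should vanish at the hand-computed eigenvalues λ = 1, 3
@show det(A - 1.0 * I)
@show det(A - 3.0 * I)
# property (1): tr(A) = λ₁ + λ₂; property (2): det(A) = λ₁λ₂
@show tr(A) ≈ 1.0 + 3.0
@show det(A) ≈ 1.0 * 3.0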

In [2]:
using LinearAlgebra

A = [2.0 1.0; 1.0 2.0]
Out[2]:
2×2 Array{Float64,2}:
 2.0  1.0
 1.0  2.0
In [2]:
eigen(A)
Out[2]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
eigenvalues:
2-element Array{Float64,1}:
 1.0
 3.0
eigenvectors:
2×2 Array{Float64,2}:
 -0.707107  0.707107
  0.707107  0.707107
  • Example: For the rotation matrix $$ \mathbf{Q} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, $$ the same procedure gives the eigen-pairs $$ \mathbf{Q} \begin{pmatrix} 1 \\ -i \end{pmatrix} = i \begin{pmatrix} 1 \\ -i \end{pmatrix}, \quad \mathbf{Q} \begin{pmatrix} 1 \\ i \end{pmatrix} = (-i) \begin{pmatrix} 1 \\ i \end{pmatrix}. $$ The three properties (1)-(3) still hold, with orthogonality now taken with respect to the complex inner product $\mathbf{x}^* \mathbf{y}$.
In [5]:
Q = [0.0 -1.0; 1.0 0.0]
Out[5]:
2×2 Array{Float64,2}:
 0.0  -1.0
 1.0   0.0
In [6]:
eigen(Q)
Out[6]:
Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
eigenvalues:
2-element Array{Complex{Float64},1}:
 0.0 - 1.0im
 0.0 + 1.0im
eigenvectors:
2×2 Array{Complex{Float64},2}:
 0.707107-0.0im       0.707107+0.0im     
      0.0+0.707107im       0.0-0.707107im

Similar matrices

  • If $\mathbf{A} \mathbf{x} = \lambda \mathbf{x}$, then $$ (\mathbf{B} \mathbf{A} \mathbf{B}^{-1}) (\mathbf{B} \mathbf{x}) = \mathbf{B} \mathbf{A} \mathbf{x} = \lambda (\mathbf{B} \mathbf{x}). $$ That is, $\mathbf{B} \mathbf{x}$ is an eigenvector of the matrix $\mathbf{B} \mathbf{A} \mathbf{B}^{-1}$, with the same eigenvalue $\lambda$.

    We say the matrix $\mathbf{B} \mathbf{A} \mathbf{B}^{-1}$ is similar to $\mathbf{A}$; the calculation above shows that similar matrices share the same eigenvalues.
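
A quick numerical illustration of similarity (a sketch; the invertible matrix B below is an arbitrary choice for illustration, not from the text):

using LinearAlgebra

A = [2.0 1.0; 1.0 2.0]
B = [1.0 2.0; 0.0 1.0]      # any invertible matrix will do
C = B * A * inv(B)          # C is similar to A
@show eigvals(A)            # 1.0 and 3.0
@show eigvals(C)            # the same eigenvalues, up to rounding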

Diagonalizing a matrix

  • Collecting the $n$ eigen-equations $$ \mathbf{A} \mathbf{x}_i = \lambda_i \mathbf{x}_i, \quad i = 1,\ldots,n, $$ into matrix multiplication format gives $$ \mathbf{A} \mathbf{X} = \mathbf{X} \boldsymbol{\Lambda}, \quad \text{where } \boldsymbol{\Lambda} = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}. $$ If the $n$ eigenvectors are linearly independent (true for most matrices, but not all; see the next section), then $\mathbf{X}$ is invertible and $$ \mathbf{X}^{-1} \mathbf{A} \mathbf{X} = \boldsymbol{\Lambda} \quad \quad \text{diagonalizing a matrix} $$ or $$ \mathbf{A} = \mathbf{X} \boldsymbol{\Lambda} \mathbf{X}^{-1}. \quad \quad \text{eigen-decomposition} $$
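
In Julia (a sketch using the 2×2 example above), eigen returns the pieces of this decomposition, and we can verify both identities:

using LinearAlgebra

A = [2.0 1.0; 1.0 2.0]
λ, X = eigen(A)             # eigenvalues and matrix of eigenvectors
Λ = Diagonal(λ)
@show inv(X) * A * X ≈ Λ    # diagonalizing a matrix
@show X * Λ * inv(X) ≈ A    # eigen-decomposition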

Non-diagonalizable matrices

  • Geometric multiplicity (GM) of an eigenvalue $\lambda$: the number of linearly independent eigenvectors for $\lambda$, i.e., $\dim \mathcal{N}(\mathbf{A} - \lambda \mathbf{I})$.

  • Algebraic multiplicity (AM) of an eigenvalue $\lambda$: the multiplicity of $\lambda$ as a root of the characteristic polynomial $\det(\lambda \mathbf{I} - \mathbf{A})$.

  • Always $\text{GM} \le \text{AM}$.

  • The shortage of eigenvectors when $\text{GM} < \text{AM}$ means that $\mathbf{A}$ is not diagonalizable: there is no invertible matrix $\mathbf{X}$ such that $\mathbf{A} = \mathbf{X} \boldsymbol{\Lambda} \mathbf{X}^{-1}$.

  • Classical example: $$ \mathbf{A} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. $$ Here AM $= 2$ and GM $= 1$: the eigenvalue $0$ is repeated twice, but there is only one independent eigenvector, $(1, 0)'$.

  • More examples: all three matrices $$ \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix}, \begin{pmatrix} 6 & -1 \\ 1 & 4 \end{pmatrix}, \begin{pmatrix} 7 & 2 \\ -2 & 3 \end{pmatrix} $$ have AM $= 2$ and GM $= 1$ for the repeated eigenvalue $5$; the sketch below verifies GM numerically.
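
We can compute GM directly (a sketch; nullspace from LinearAlgebra returns an orthonormal basis of the null space):

using LinearAlgebra

# GM = dim N(A - 5I); each of the three matrices yields GM = 1
for A in ([5.0 1.0; 0.0 5.0], [6.0 -1.0; 1.0 4.0], [7.0 2.0; -2.0 3.0])
    @show size(nullspace(A - 5.0 * I), 2)
end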

In [3]:
eigen([0 1; 0 0])
Out[3]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
eigenvalues:
2-element Array{Float64,1}:
 0.0
 0.0
eigenvectors:
2×2 Array{Float64,2}:
 1.0  -1.0         
 0.0   2.00417e-292
In [4]:
eigen([5 1; 0 5])
Out[4]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
eigenvalues:
2-element Array{Float64,1}:
 5.0
 5.0
eigenvectors:
2×2 Array{Float64,2}:
 1.0  -1.0        
 0.0   1.11022e-15
In [5]:
eigen([6 -1; 1 4])
Out[5]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
eigenvalues:
2-element Array{Float64,1}:
 5.0
 5.0
eigenvectors:
2×2 Array{Float64,2}:
 0.707107  0.707107
 0.707107  0.707107
In [6]:
eigen([7 2; -2 3])
Out[6]:
Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
eigenvalues:
2-element Array{Complex{Float64},1}:
 5.0 - 4.2146848510894035e-8im
 5.0 + 4.2146848510894035e-8im
eigenvectors:
2×2 Array{Complex{Float64},2}:
  0.707107-0.0im          0.707107+0.0im       
 -0.707107-1.49012e-8im  -0.707107+1.49012e-8im
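
Note how the numerical routine signals the shortage of eigenvectors in these outputs: for the first two matrices the reported second eigenvector is nearly parallel to the first (its second entry is about $10^{-292}$ or $10^{-15}$), for the third the two columns coincide, and for the last the repeated eigenvalue 5 is even split into a conjugate pair $5 \pm 4.2 \times 10^{-8} i$. Defective matrices are numerically delicate.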

Some properties

  • Multiplying both sides of the eigen-equation $\mathbf{A} \mathbf{x} = \lambda \mathbf{x}$ by $\mathbf{A}$ gives $$ \mathbf{A}^2 \mathbf{x} = \lambda \mathbf{A} \mathbf{x} = \lambda^2 \mathbf{x}, $$ showing that $\lambda^2$ is an eigenvalue of $\mathbf{A}^2$ with eigenvector $\mathbf{x}$.

    Similarly, $\lambda^k$ is an eigenvalue of $\mathbf{A}^k$ with eigenvector $\mathbf{x}$.

  • For a diagonalizable matrix $\mathbf{A} = \mathbf{X} \boldsymbol{\Lambda} \mathbf{X}^{-1}$, we have $$ \mathbf{A}^k = \mathbf{X} \boldsymbol{\Lambda}^k \mathbf{X}^{-1}. $$
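
A sketch checking the power formula on the diagonalizable example from earlier:

using LinearAlgebra

A = [2.0 1.0; 1.0 2.0]
λ, X = eigen(A)
k = 5
@show X * Diagonal(λ .^ k) * inv(X) ≈ A^k   # A^k = XΛ^kX⁻¹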

  • Shifting $\mathbf{A}$ by $s \mathbf{I}$ shifts every eigenvalue by $s$, with eigenvectors unchanged: $$ (\mathbf{A} + s \mathbf{I}) \mathbf{x} = \lambda \mathbf{x} + s \mathbf{x} = (\lambda + s) \mathbf{x}. $$
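
A one-line check of the shift property (a sketch):

using LinearAlgebra

A = [2.0 1.0; 1.0 2.0]
@show eigvals(A + 10.0 * I)     # eigenvalues 1 and 3 shifted to 11 and 13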

  • $\mathbf{A}$ is singular if and only if it has at least one 0 eigenvalue.

  • Eigenvectors associated with distinct eigenvalues are linearly independent.

    Proof: Let $$ \mathbf{A} \mathbf{x}_1 = \lambda_1 \mathbf{x}_1, \quad \mathbf{A} \mathbf{x}_2 = \lambda_2 \mathbf{x}_2, $$ and $\lambda_1 \ne \lambda_2$. Suppose $\mathbf{x}_1$ and $\mathbf{x}_2$ are linearly dependent. Then there is $\alpha \ne 0$ such that $\mathbf{x}_2 = \alpha \mathbf{x}_1$. Hence $$ \alpha \lambda_1 \mathbf{x}_1 = \alpha \mathbf{A} \mathbf{x}_1 = \mathbf{A} \mathbf{x}_2 = \lambda_2 \mathbf{x}_2 = \alpha \lambda_2 \mathbf{x}_1, $$ or $\alpha (\lambda_1 - \lambda_2) \mathbf{x}_1 = \mathbf{0}$. Since $\alpha \ne 0$ and $\lambda_1 \ne \lambda_2$, this forces $\mathbf{x}_1 = \mathbf{0}$, a contradiction.

  • The eigenvalues of a triangular matrix are its diagonal entries.

    Proof: The determinant of a triangular matrix is the product of its diagonal entries, so $$ p_{\mathbf{A}}(\lambda) = \det(\lambda \mathbf{I} - \mathbf{A}) = (\lambda - a_{11}) \cdots (\lambda - a_{nn}). $$

  • Eigenvalues of an idempotent matrix are either 0 or 1.

    Proof: $$ \lambda \mathbf{x} = \mathbf{A} \mathbf{x} = \mathbf{A} \mathbf{A} \mathbf{x} = \lambda^2 \mathbf{x}. $$ So $\lambda^2 = \lambda$, that is, $\lambda = 0$ or $\lambda = 1$.
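
As an illustration (a sketch; the projection matrix below is an example of my choosing, not from the text), the projection $\mathbf{H} = \mathbf{X} (\mathbf{X}' \mathbf{X})^{-1} \mathbf{X}'$ is idempotent, and its computed eigenvalues are 0s and 1s:

using LinearAlgebra

X = [1.0 0.0; 1.0 1.0; 1.0 2.0]     # any full-column-rank matrix
H = X * inv(X' * X) * X'            # projection onto C(X)
@show H * H ≈ H                     # idempotent
@show eigvals(Symmetric(H))         # ≈ [0, 1, 1]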

  • Eigenvalues of an orthogonal matrix have complex modulus 1.

    Proof: Since $\mathbf{A}' \mathbf{A} = \mathbf{I}$, $$ \mathbf{x}^* \mathbf{x} = \mathbf{x}^* \mathbf{A}' \mathbf{A} \mathbf{x} = (\mathbf{A} \mathbf{x})^* (\mathbf{A} \mathbf{x}) = \lambda^* \lambda \, \mathbf{x}^* \mathbf{x}. $$ Since $\mathbf{x}^* \mathbf{x} \ne 0$, we have $\lambda^* \lambda = |\lambda|^2 = 1$, i.e., $|\lambda| = 1$.
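
A sketch with a random orthogonal matrix (obtained here from a QR factorization of a random matrix):

using LinearAlgebra

Q, _ = qr(randn(4, 4))          # Q is orthogonal
@show abs.(eigvals(Matrix(Q)))  # all complex moduli ≈ 1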

  • Let $\mathbf{A} \in \mathbb{R}^{n \times n}$ (not required to be diagonalizable). Then $\text{tr}(\mathbf{A}) = \sum_i \lambda_i$ and $\det(\mathbf{A}) = \prod_i \lambda_i$. The general version can be proved via the Jordan canonical form, a generalization of the eigen-decomposition.
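
A sketch verifying the trace and determinant identities on a random matrix:

using LinearAlgebra

A = randn(4, 4)
λ = eigvals(A)              # possibly complex
@show tr(A) ≈ sum(λ)        # trace = sum of eigenvalues
@show det(A) ≈ prod(λ)      # determinant = product of eigenvalues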

Spectral decomposition for symmetric matrices

  • For a symmetric matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$,

    1. all eigenvalues of $\mathbf{A}$ are real numbers, and
    2. the eigenvectors corresponding to distinct eigenvalues are orthogonal to each other.

      Proof of 1: Pre-multiplying the eigen-equation $\mathbf{A} \mathbf{x} = \lambda \mathbf{x}$ by $\mathbf{x}^*$ (conjugate transpose) gives $$ \mathbf{x}^* \mathbf{A} \mathbf{x} = \lambda \mathbf{x}^* \mathbf{x}. $$ Now both $$ \mathbf{x}^* \mathbf{x} = x_1^* x_1 + \cdots + x_n^* x_n $$ and (using symmetry $a_{ij} = a_{ji}$) $$ \mathbf{x}^* \mathbf{A} \mathbf{x} = \sum_{i,j} a_{ij} x_i^* x_j = \sum_i a_{ii} x_i^* x_i + \sum_{i < j} a_{ij} (x_i^* x_j + x_i x_j^*) $$ are real numbers, since each summand is either a nonnegative number $x_i^* x_i$ or a number plus its own conjugate. Since $\mathbf{x} \ne \mathbf{0}$ implies $\mathbf{x}^* \mathbf{x} > 0$, $\lambda$ is a real number.

      Proof of 2: Suppose $$ \mathbf{A} \mathbf{x}_1 = \lambda_1 \mathbf{x}_1, \quad \mathbf{A} \mathbf{x}_2 = \lambda_2 \mathbf{x}_2, $$ and $\lambda_1 \ne \lambda_2$. Then \begin{eqnarray*} (\mathbf{A} - \lambda_2 \mathbf{I}) \mathbf{x}_1 &=& (\lambda_1 - \lambda_2) \mathbf{x}_1 \\ (\mathbf{A} - \lambda_2 \mathbf{I}) \mathbf{x}_2 &=& \mathbf{0}. \end{eqnarray*} Thus $\mathbf{x}_1 \in \mathcal{C}(\mathbf{A} - \lambda_2 \mathbf{I})$ and $\mathbf{x}_2 \in \mathcal{N}(\mathbf{A} - \lambda_2 \mathbf{I})$. Since $\mathbf{A} - \lambda_2 \mathbf{I}$ is symmetric, its column space and null space are orthogonal complements by the fundamental theorem of linear algebra, so $\mathbf{x}_1 \perp \mathbf{x}_2$.

  • For an eigenvalue with multiplicity greater than one, we can choose its eigenvectors to be orthogonal to each other. We also normalize each eigenvector to have unit $\ell_2$ norm. Thus we obtain the extremely useful spectral decomposition $$ \mathbf{A} = \mathbf{Q} \boldsymbol{\Lambda} \mathbf{Q}' = \begin{pmatrix} \mid & & \mid \\ \mathbf{q}_1 & \cdots & \mathbf{q}_n \\ \mid & & \mid \end{pmatrix} \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \begin{pmatrix} - & \mathbf{q}_1' & - \\ & \vdots & \\ - & \mathbf{q}_n' & - \end{pmatrix} = \sum_{i=1}^n \lambda_i \mathbf{q}_i \mathbf{q}_i', $$ where $\mathbf{Q}$ is orthogonal (columns are eigenvectors) and $\boldsymbol{\Lambda} = \text{diag}(\lambda_1, \ldots, \lambda_n)$ (diagonal entries are eigenvalues).
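
A sketch of the spectral decomposition in Julia: for a real symmetric input, eigen returns real eigenvalues and orthonormal eigenvectors.

using LinearAlgebra

A = Symmetric([2.0 1.0; 1.0 2.0])
λ, Q = eigen(A)
@show Q' * Q ≈ I                                        # Q is orthogonal
@show Q * Diagonal(λ) * Q' ≈ A                          # A = QΛQ'
@show sum(λ[i] * Q[:, i] * Q[:, i]' for i in 1:2) ≈ A   # rank-one expansion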