MATH661

Eigenvalue-Revealing Factorizations

1.The eigenvalue problem

Linear endomorphisms $𝒇 : ℂ^{m} \to ℂ^{m}$ , represented by $𝑨 \in ℂ^{m \times m}$ , can exhibit invariant directions $𝒙 \neq 𝟎$ for which

𝒇 (𝒙) = 𝑨 𝒙 = λ 𝒙,

known as eigenvectors, with associated eigenvalue $λ \in ℂ$ . Eigenvectors are non-zero elements of the null space of $𝑨 - λ 𝑰$ ,

(𝑨 - λ 𝑰) 𝒙 = 𝟎,

and the null-space is referred to as the eigenspace of $𝑨$ for eigenvalue $λ$ , $ℰ_{𝑨} (λ) = N (𝑨 - λ 𝑰)$ .

Non-zero solutions are obtained if $𝑨 - λ 𝑰$ is rank-deficient (singular), or has linearly dependent columns in which case

\det (𝑨 - λ 𝑰) = | \begin{array}{cccc} a_{11} - λ & a_{12} & \dots & a_{1 m} \\ a_{21} & a_{22} - λ & \dots & a_{2 m} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ a_{m 1} & a_{m 2} & \dots & a_{m m} - λ \end{array} | = 0 .

From the determinant definition as “sum of all products choosing an element from row/column”, it results that

\det (λ 𝑰 - 𝑨) = λ^{m} + c_{1} λ^{m - 1} + \dots + c_{m - 1} λ + c_{m} = p_{A} (λ),

known as the characteristic polynomial associated with the matrix 𝑨, and of degree $m$ . The characteristic polynomial is monic, meaning that the coefficient of the highest power $λ^{m}$ is equal to one. The fundamental theorem of algebra states that $p_{A} (λ)$ of degree $m$ has $m$ roots, hence $𝑨 \in ℂ^{m \times m}$ has $m$ eigenvalues (not necessarily distinct), and $m$ associated eigenvectors. This can be stated in matrix form as

𝑨 𝑿 = 𝑿 𝚲,

with

𝑿 = [\begin{array}{lll} 𝒙_{1} & \dots & 𝒙_{m} \end{array}], 𝚲 = diag (λ_{1}, \dots, λ_{m}),

the eigenvector matrix and eigenvalue matrix, respectively.

1.1.Coordinate transformations

The statement $𝑨 𝒙 = λ 𝒙$ , that eigenvector $𝒙$ is an invariant direction of the operator $𝑨$ along which the effect of operator is scaling by $λ$ , suggests that similar behavior would be obtained under a coordinate transformation $𝑻 𝒚 = 𝑰 𝒙 = 𝒙$ . Assuming $𝑻$ is of full rank and introducing $𝑩 = 𝑻^{- 1} 𝑨 𝑻$ , this leads to

𝑨 𝒙 = 𝑨 𝑻 𝒚 = λ 𝒙 = λ 𝑻 𝒚 \Rightarrow 𝑻^{- 1} 𝑨 𝑻 𝒚 = λ 𝒚 .

Upon coordinate transformation, the eigenvalues (scaling factors along the invariant directions) stay the same. Metric-preserving coordinate transformations are of particular interest, in which case the transformation matrix is unitary $𝑩 = 𝑸^{*} 𝑨 𝑸$ .

Definition. Matrices $𝑨, 𝑩 \in ℂ^{m \times m}$ are said to be similar, $𝑩 \sim 𝑨$ , if there exists some full rank matrix $𝑻 \in ℂ^{m \times m}$ such that $𝑩 = 𝑻^{- 1} 𝑨 𝑻$ .

Proposition. Similar matrices $𝑨, 𝑩 \in ℂ^{m \times m}$ , $𝑩 = 𝑻^{- 1} 𝑨 𝑻$ , have the same eigenvalues, and eigenvectors $𝒙$ of $𝑨$ , $𝒚$ of $𝑩$ are related through $𝒙 = 𝑻 𝒚$ .

Since the eigenvalues of $𝑩 \sim 𝑨$ are the same, and a polynomial is completely specified by its roots and coefficient of highest power, the characteristic polynomials of $𝑨, 𝑩$ must be the same

p_{𝑨} (λ) = \prod_{k = 1}^{m} (λ - λ_{k}) = p_{𝑩} (λ) .

This can also be verified through the determinant definition

p_{𝑩} (t) = \det (λ 𝑰 - 𝑩) = \det (λ 𝑻^{- 1} 𝑻 - 𝑻^{- 1} 𝑨 𝑻) = \det (𝑻^{- 1} (λ 𝑰 - 𝑨) 𝑻) = \det (𝑻^{- 1}) \det (λ 𝑰 - 𝑨) \det (𝑻) = p_{𝑨} (λ),

since $\det (𝑻^{- 1}) = 1 / \det (𝑻)$ .

1.2.Paradigmatic eigenvalue problem solutions

$\circ$ Reflection matrix. The matrix

𝑯 = 𝑰 - 2 𝒒 𝒒^{T} \in ℝ^{2 \times 2}, || 𝒒 || = 1,

is the two-dimensional Householder reflector across $N (𝒒^{T})$ . Vectors colinear with $𝒒$ change direction along the same orientation upon reflection, while vectors orthogonal to $𝒒$ (i.e., in the null space $𝑵 (𝒒^{T})$ ) are unchanged. It is therefore to be expected that $λ_{1} = - 1$ , $𝒙_{1} = 𝒒$ , and $λ_{2} = 1$ , $𝒒^{T} 𝒙_{2} = 0$ . This is readily verified

𝑯 𝒒 = (𝑰 - 2 𝒒 𝒒^{T}) 𝒒 = 𝒒 - 2 𝒒 = - 𝒒,

𝑯 𝒙_{2} = (𝑰 - 2 𝒒 𝒒^{T}) 𝒙_{2} = 𝒙_{2} .

Figure 1. Reflector in two dimensions

∴	v=[-1; 1]; q=v/norm(v); H=I-2qq';

∴	eigvals(H)

$[\begin{array}{c} - 0.9999999999999996 \\ 1.0 \end{array}]$ (1)

∴	eigvecs(H)

$[\begin{array}{cc} - 0.7071067811865475 & 0.7071067811865475 \\ 0.7071067811865475 & 0.7071067811865475 \end{array}]$ (2)

∴

$\circ$ Rotation matrix. The matrix

𝑹 (θ) = [\begin{array}{ll} \cos θ & - \sin θ \\ \sin θ & \cos θ \end{array}],

represents the isometric rotation of two-dimensional vectors. If $θ = 0$ , $𝑹 = 𝑰$ with eigenvalues $λ_{1} = λ_{2} = 1$ , and eigenvector matrix $𝑿 = 𝑰$ . For $θ = π$ , the eigenvalues are $λ_{1} = λ_{2} = - 1$ , again with eigenvector matrix $𝑿 = 𝑰$ . If $\sin θ \neq 0$ , the orientation of any non-zero $𝒙 \in ℝ^{2}$ changes upon rotation by $θ$ . The characteristic polynomial has complex roots

p (λ) = {(λ - \cos θ)}^{2} + \sin^{2} θ \Rightarrow λ_{1, 2} = \cos θ \pm i \sin θ = e^{\pm i θ}

and the directions of invariant orientation have complex components (are outside the real plane $ℝ^{2}$ )

𝑿 = [\begin{array}{ll} 1 & - 1 \\ i & i \end{array}], 𝑹 𝑿 = [\begin{array}{ll} e^{- i θ} & - e^{i θ} \\ i e^{- i θ} & i e^{i θ} \end{array}] = [\begin{array}{ll} 1 & - 1 \\ i & i \end{array}] [\begin{array}{ll} e^{- i θ} & 0 \\ 0 & e^{i θ} \end{array}] .

∴	R=[1 -1; 1 1]/sqrt(2); eigvals(R)

$[\begin{array}{c} 0.7071067811865475 - 0.7071067811865475 i \\ 0.7071067811865475 + 0.7071067811865475 i \end{array}]$ (3)

∴	eigvecs(R)

$[\begin{array}{cc} 0.7071067811865475 + 0.0 i & 0.7071067811865475 + 0.0 i \\ 0.0 + 0.7071067811865475 i & 0.0 - 0.7071067811865475 i \end{array}]$ (4)

∴

$\circ$ Second-order differentiation matrix. Eigenvalues of matrices arising from discretization of continuum operators can be obtained from the operator eigenproblem. The second-order differentiation operator $\partial_{x}^{2}$ has eigenvalues $- ξ^{2}$ associated with eigenfunctions $\sin (ξ x)$

\partial_{x}^{2} \sin (ξ x) = - ξ^{2} \sin (ξ x) .

Sampling of $\sin (ξ x)$ at $x_{k} = k h$ , $k = 1, \dots, m$ , $h = π / (m + 1)$ leads to the vector $𝒖 \in ℝ^{m}$ with components $u_{k} = \sin (ξ k h)$ . The boundary conditions at the sampling interval end-points affect the eigenvalues. Imposing $\sin (ξ x) = 0$ , at $x = 0$ and $x = π$ leads to $ξ \in ℤ$ . The derivative can be approximated at the sample points through

u_{k}^{''} ≅ \frac{\sin [ξ (x_{k} + h)] - 2 \sin [ξ x_{k}] + \sin [ξ (x_{k} - h)]}{h^{2}} = \frac{2}{h^{2}} (\cos (ξ h) - 1) \sin (ξ k h) = - \frac{4}{h^{2}} \sin^{2} (\frac{ξ h}{2}) \sin (ξ k h) .

The derivative approximation vector $𝒖^{''} = {[u_{k}^{''}]}_{k = 1, \dots m}$ results from a linear mapping $𝒖^{''} = 𝑫 𝒖,$ and the matrix

𝑫 = \frac{1}{h^{2}} diag ([\begin{array}{lll} 1 & - 2 & 1 \end{array}]),

has eigenvectors $𝒖$ and eigenvalues $- (4 / h^{2}) \sin^{2} (ξ h / 2)$ , $ξ = 1, 2, \dots, m$ . In the limit of an infinite number of sampling points the continuum eigenvalues are obtained, exemplifying again the correspondence principle between discrete and continuum representations

{lim}_{h \to 0} - \frac{4}{h^{2}} \sin^{2} (\frac{ξ h}{2}) = - ξ^{2} .

∴	m=4; o=ones(m-1,1)[:,1]; D=diagm(-1 => o, 1 => o)-2*I

$[\begin{array}{cccc} - 2.0 & 1.0 & 0.0 & 0.0 \\ 1.0 & - 2.0 & 1.0 & 0.0 \\ 0.0 & 1.0 & - 2.0 & 1.0 \\ 0.0 & 0.0 & 1.0 & - 2.0 \end{array}]$ (5)

∴	function ErrorλD(m) h=π/(m+1); ξ=1:m; o=ones(m-1); D=(diagm(-1=>o, 1=>o)-2I)/h^2; λD = sort(eigvals(D),rev=true) λ = -(4/h^2)sin.(ξ*h/2).^2 ε=norm(λD .- λ)/norm(λ) end;

∴	ε=ErrorλD.(20:20:100)';

∴	print(round.(ε; digits=3))

[0.0 0.0 0.0 0.0 0.0]

∴	function eigD(m) h=π/m; o=ones(m-1,1)[:,1]; D=diagm(-1 => o, 1 => o)-2I; λD=sort(eigvals(D),rev=true) λ = -4sin.(collect(1:m).*h/2).^2 return [λD λ] end;

∴

clf();

∴	for m=10:10:100 local Λ=eigD(m) plot(1:m,Λ[:,1],"ob",markerfacecolor="none") plot(1:m,Λ[:,2],"xr") end

∴	xlabel("k"); ylabel("λ[k]");

∴	pre=homedir()*"/courses/MATH661/images/";

∴	savefig(pre*"L14D2eigs.eps");

∴

1.3.Matrix eigendecomposition

A solution $𝑿, 𝚲$ to the eigenvalue problem $𝑨 𝑿 = 𝑿 𝚲$ always exists, but the eigenvectors of $𝑨$ do not always form a basis set, i.e., $𝑿$ is not always of full rank. The factorized form of the characteristic polynomial of $𝑨 \in ℂ^{m \times m}$ is

p_{𝑨} (λ) = \det (λ 𝑰 - 𝑨) = \prod_{k = 1}^{K} {(λ - λ_{k})}_{}^{m_{k}},

with $K ⩽ m$ denoting the number of distinct roots of $p_{𝑨} (λ)$ , and $m_{k}$ is the algebraic multiplicity of eigenvalue $λ_{k}$ , defined as the number of times the root $λ_{k}$ is repeated. Let $ℰ_{k}$ denote the associated eigenspace $ℰ_{k} = ℰ_{𝑨} (λ_{k}) = N (𝑨 - λ_{k} 𝑰)$ . The dimension of $ℰ_{k}$ denoted by $n_{k}$ is the geometric multiplicity of eigenvalue $λ_{k}$ . The eigenvector matrix is of full rank when the vector sum of the eigenspaces covers $ℂ^{m}$ , as established by the following results.

Proposition. If $λ_{i} \neq λ_{j}$ then $ℰ_{i} \cap ℰ_{j} = {𝟎}$ (the eigenspaces of distinct eigenvalues are disjoint)

Proof. Let $𝒙 \in$ $ℰ_{i}$ , hence $𝑨 𝒙 = λ_{i} 𝒙$ and $𝒙 \in$ $ℰ_{j}$ , hence $𝑨 𝒙 = λ_{j} 𝒙$ . Subtraction gives

𝑨 𝒙 - 𝑨 𝒙 = 𝟎 = (λ_{i} - λ_{j}) 𝒙 .

Since $λ_{i} \neq λ_{j}$ it results that $𝒙 = 𝟎$ . $□$

Proposition. The geometric multiplicity of an eigenvalue is less or equal to its algebraic multiplicity,

$n_{k} = \dim (N (𝑨 - λ_{k} 𝑰)) ⩽ m_{k} .$

Proof. Let $𝑽 \in ℂ^{m \times n_{k}}$ be an orthogonal basis for $N (𝑨 - λ_{k} 𝑰)$ . $□$

Definition 1. An eigenvalue for which the geometric multiplicity is less than the algebraic multiplicity is said to be defective

Proposition 2. A matrix is diagonalizable is the geometric multiplicity of each eigenvalue is equal to the algebraic multiplicity of that eigenvalue.

If the column vectors of $X$ are linearly independent, then $X$ is invertible and $A$ can be reduced to diagonal form

A = X Λ X^{- 1}, A = U Σ V^{T}

Diagonal forms are useful in solving linear ODE systems
$y^{'} = Ay \Leftrightarrow (X^{- 1} y) = Λ (X^{- 1} y)$
Also useful in repeatedly applying $A$
$u_{k} = A^{k} u_{0} = A A \dots A u_{0} = (X Λ X^{- 1}) (X Λ X^{- 1}) \dots (X Λ X^{- 1}) u_{0} = X Λ^{k} X^{- 1} u_{0}$

When can a matrix be reduced to diagonal form? When eigenvectors are linearly independent such that the inverse of $X$ exists
Matrices with distinct eigenvalues are diagonalizable. Consider $A \in ℝ^{m \times m}$ with eigenvalues $λ_{j} \neq λ_{k}$ for $j \neq k$ , $j, k \in {1, \dots, m}$

Proof. By contradiction. Take any two eigenvalues $λ_{j} \neq λ_{k}$ and assume that $x_{j}$ would depend linearly on $x_{k}$ , $x_{k} = c x_{j}$ for some $c \neq 0$ . Then
$\begin{array}{rcl} A x_{1} = λ_{1} x_{1} & \Rightarrow & A x_{1} = λ_{1} x_{1} \\ A x_{2} = λ_{2} x_{2} & \Rightarrow & A c x_{1} = λ_{2} c x_{1} \end{array}$
and subtracting would give $0 = (λ_{1} - λ_{2}) x_{1}$ . Since $x_{1}$ is an eigenvector, hence $x_{1} \neq 0$ we obtain a contradiction $λ_{1} = λ_{2}$ .
The characteristic polynomial might have repeated roots. Establishing diagonalizability in that case requires additional concepts

Finding eigenvalues as roots of the characteristic polynomial $p (λ) = \det (A - λ I)$ is suitable for small matrices $𝑨 \in ℝ^{m \times m}$ .
- analytical root-finding formulas are available only for $m ⩽ 4$
- small errors in characteristic polynomial coefficients can lead to large errors in roots
Octave/Matlab procedures to find characteristic polynomial
- poly(A) function returns the coefficients
- roots(p) function computes roots of the polynomial

octave]

A=[5 -4 2; 5 -4 1; -2 2 -3]; disp(A);

5 -4 2 5 -4 1 -2 2 -3

octave]

p=poly(A); disp(p);

1.00000 2.00000 -1.00000 -2.00000

octave]

r=roots(p); disp(r');

1.0000 -2.0000 -1.0000

octave]

Find eigenvectors as non-trivial solutions of system $(𝑨 - λ 𝑰) 𝒙 = 0$
$λ_{1} = 1 \Rightarrow 𝑨 - λ_{1} 𝑰 = (\begin{array}{ccc} 4 & - 4 & 2 \\ 5 & - 5 & 1 \\ - 2 & 2 & - 4 \end{array}) \sim (\begin{array}{ccc} - 2 & 2 & - 4 \\ 0 & 0 & - 6 \\ 5 & - 5 & 1 \end{array}) \sim (\begin{array}{ccc} - 2 & 2 & - 4 \\ 0 & 0 & - 6 \\ 0 & 0 & 0 \end{array})$
Note convenient choice of row operations to reduce amount of arithmetic, and use of knowledge that $𝑨 - λ_{1} 𝑰$ is singular to deduce that last row must be null
In traditional form the above row-echelon reduced system corresponds to
${\begin{cases} - 2 x_{1} + 2 x_{2} - 4 x_{3} & = & 0 \\ 0 x_{1} + 0 x_{2} - 6 x_{3} & = & 0 \\ 0 x_{1} + 0 x_{2} + 0 x_{3} & = & 0 \end{cases} \Rightarrow 𝒙 = α (\begin{array}{c} 1 \\ 1 \\ 0 \end{array}), . || 𝒙 || = 1 \Rightarrow α = 1 / \sqrt{2}$
In Octave/Matlab the computations are carried out by the null function
octave]

null(A+5*eye(3))'
ans = [](0x3)
octave]

The eigenvalues of $𝑰 \in ℝ^{3 \times 3}$ are $λ_{1, 2, 3} = 1$ , but small errors in numerical computation can give roots of the characteristic polynomial with imaginary parts

octave>

lambda=roots(poly(eye(3))); disp(lambda')

1.00001 - 0.00001i 1.00001 + 0.00001i 0.99999 - 0.00000i

octave>

In the following example notice that if we slightly perturb $𝑨$ (by a quantity less than 0.0005=0.05%), the eigenvalues get perturb by a larger amount, e.g. 0.13%.

octave]

A=[-2 1 -1; 5 -3 6; 5 -1 4]; disp([eig(A) eig(A+0.001*(rand(3,3)-0.5))])

3.0000 + 0.0000i 3.0005 + 0.0000i -2.0000 + 0.0000i -2.0000 + 0.0161i -2.0000 + 0.0000i -2.0000 - 0.0161i

octave]

Extracting eigenvalues and eigenvectors is a commonly encountered operation, and specialized functions exist to carry this out, including the eig function

octave>

 [X,L]=eig(A); disp([L X]);

-2.00000 0.00000 0.00000 -0.57735 -0.00000 0.57735 0.00000 3.00000 0.00000 0.57735 0.70711 -0.57735 0.00000 0.00000 -2.00000 0.57735 0.70711 -0.57735

octave>

disp(null(A-3*eye(3)))

0.00000 0.70711 0.70711

octave>

disp(null(A+2*eye(3)))

0.57735 -0.57735 -0.57735

octave>

Recall definitions of eigenvalue algebraic $𝒎_{λ}$ and geometric multiplicities $𝒏_{λ}$ .

Definition. A matrix which has $n_{λ} < m_{λ}$ for any of its eigenvalues is said to be defective.

octave>

A=[-2 1 -1; 5 -3 6; 5 -1 4]; [X,L]=eig(A); disp(L);

Diagonal Matrix -2.0000 0 0 0 3.0000 0 0 0 -2.0000

octave>

disp(X);

-5.7735e-01 -1.9153e-17 5.7735e-01 5.7735e-01 7.0711e-01 -5.7735e-01 5.7735e-01 7.0711e-01 -5.7735e-01

octave>

disp(null(A+2*eye(3)));

0.57735 -0.57735 -0.57735

octave>

disp(rank(X))

octave>

2.Computation of the SVD

The SVD is determined by eigendecomposition of $A^{T} A$ , and $A A^{T}$
- $A^{T} A = {(U Σ V^{T})}^{T} (U Σ V^{T}) = V (Σ^{T} Σ) V^{T}$ , an eigendecomposition of $A^{T} A$ . The columns of $V$ are eigenvectors of $A^{T} A$ and called right singular vectors of $A$
  $B = A^{T} A = V Σ^{T} Σ V^{T} = V Λ V^{T}$
- $A A^{T} = (U Σ V^{T}) {(U Σ^{T} V^{T})}^{T} = U (Σ Σ^{T}) U^{T}$ , an eigendecomposition of $A^{T} A$ . The columns of $U$ are eigenvectors of $A A^{T}$ and called left singular vectors of $A$
- The matrix $Σ$ has form
  $Σ = (\begin{array}{cccccc} σ_{1} \\ σ_{2} \\ ⋱ \\ σ_{r} \\ 0 \\ ⋱ \end{array}) \in ℝ_{+}^{m . \times n .}$
  and $σ_{i}$ are the singular values of $A$ .
The singular value decomposition (SVD) furnishes complete information about $A$
- $rank (A) = r$ (the number of non-zero singular values)
- $U, V$ are orthogonal basis for the domain and codomain of $A$