Eigenvalue-Revealing Factorizations

1. The eigenvalue problem

Linear endomorphisms 𝒇: ℝ^m → ℝ^m, represented by 𝑨 ∈ ℝ^{m×m}, can exhibit invariant directions 𝒙 ≠ 𝟎 for which

𝒇(𝒙)=𝑨𝒙=λ𝒙,

known as eigenvectors, with associated eigenvalue λ. Eigenvectors are non-zero elements of the null space of 𝑨-λ𝑰,

(𝑨-λ𝑰)𝒙=𝟎,

and the null space is referred to as the eigenspace of 𝑨 for eigenvalue λ, ℰ_𝑨(λ) = N(𝑨 - λ𝑰).

Non-zero solutions are obtained only if 𝑨 - λ𝑰 is rank-deficient (singular), i.e., has linearly dependent columns, in which case

\[ \det(A - \lambda I) = \begin{vmatrix} a_{11}-\lambda & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22}-\lambda & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mm}-\lambda \end{vmatrix} = 0. \]

From the definition of the determinant as a sum over all products formed by choosing one element from each row and each column, it results that

det(λ𝑰 - 𝑨) = λ^m + c_1 λ^{m-1} + ⋯ + c_{m-1} λ + c_m = p_𝑨(λ),

known as the characteristic polynomial associated with the matrix 𝑨, of degree m. The characteristic polynomial is monic, meaning that the coefficient of the highest power λ^m is equal to one. The fundamental theorem of algebra states that p_𝑨(λ) of degree m has m roots (counted with multiplicity), hence 𝑨 ∈ ℝ^{m×m} has m eigenvalues (not necessarily distinct), and m associated eigenvectors. This can be stated in matrix form as

𝑨𝑿=𝑿𝚲,

with

𝑿 = [ 𝒙_1 ⋯ 𝒙_m ],  𝚲 = diag(λ_1, …, λ_m),

the eigenvector matrix and eigenvalue matrix, respectively.
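
The matrix form 𝑨𝑿 = 𝑿𝚲 can be checked numerically; a minimal sketch in Python/NumPy (the sessions later in these notes use Octave; the matrix A here is an arbitrary illustrative choice, not from the text):

```python
# Verify that a computed eigendecomposition satisfies A X = X Lambda,
# i.e., A x_k = lambda_k x_k column by column.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric example, eigenvalues 1 and 3
lam, X = np.linalg.eig(A)           # lam: eigenvalues, X: eigenvectors as columns
Lam = np.diag(lam)                  # eigenvalue matrix

print(np.allclose(A @ X, X @ Lam))  # True
print(np.sort(lam))                 # [1. 3.]
```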

1.1. Coordinate transformations

The statement 𝑨𝒙 = λ𝒙, that eigenvector 𝒙 is an invariant direction of the operator 𝑨 along which the effect of the operator is scaling by λ, suggests that similar behavior would be obtained under a coordinate transformation 𝑻𝒚 = 𝑰𝒙 = 𝒙. Assuming 𝑻 is of full rank and introducing 𝑩 = 𝑻^{-1}𝑨𝑻, this leads to

𝑨𝒙 = 𝑨𝑻𝒚 = λ𝒙 = λ𝑻𝒚 ⟹ 𝑻^{-1}𝑨𝑻𝒚 = 𝑩𝒚 = λ𝒚.

Upon coordinate transformation, the eigenvalues (the scaling factors along the invariant directions) stay the same. Metric-preserving coordinate transformations are of particular interest, in which case the transformation matrix 𝑸 is unitary and 𝑩 = 𝑸^*𝑨𝑸.

Definition. Matrices 𝑨, 𝑩 ∈ ℂ^{m×m} are said to be similar, 𝑩 ∼ 𝑨, if there exists some full-rank matrix 𝑻 ∈ ℂ^{m×m} such that 𝑩 = 𝑻^{-1}𝑨𝑻.

Proposition. Similar matrices 𝑨, 𝑩 ∈ ℂ^{m×m}, 𝑩 = 𝑻^{-1}𝑨𝑻, have the same eigenvalues, and eigenvectors 𝒙 of 𝑨, 𝒚 of 𝑩 are related through 𝒙 = 𝑻𝒚.

Since the eigenvalues of 𝑩 ∼ 𝑨 are the same, and a polynomial is completely specified by its roots and the coefficient of its highest power, the characteristic polynomials of 𝑨, 𝑩 must be the same:

p_𝑨(λ) = ∏_{k=1}^{m} (λ - λ_k) = p_𝑩(λ).

This can also be verified through the determinant definition

p_𝑩(λ) = det(λ𝑰 - 𝑩) = det(λ𝑻^{-1}𝑻 - 𝑻^{-1}𝑨𝑻) = det(𝑻^{-1}(λ𝑰 - 𝑨)𝑻) = det(𝑻^{-1}) det(λ𝑰 - 𝑨) det(𝑻) = p_𝑨(λ),

since det(𝑻^{-1}) = 1/det(𝑻).
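
The proposition can also be verified numerically; a sketch in Python/NumPy (the matrices A and T below are arbitrary illustrative choices, assumed generically full rank):

```python
# Check that B = T^{-1} A T has the same eigenvalues as A, and that
# x = T y maps eigenvectors of B to eigenvectors of A.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
T = rng.standard_normal((4, 4))          # generically full rank
B = np.linalg.solve(T, A @ T)            # T^{-1} A T without forming the inverse

lamA = np.sort_complex(np.linalg.eigvals(A))
lamB = np.sort_complex(np.linalg.eigvals(B))
print(np.allclose(lamA, lamB))           # True: same spectrum

lam, Y = np.linalg.eig(B)
x = T @ Y[:, 0]                          # eigenvector of B mapped through T
print(np.allclose(A @ x, lam[0] * x))    # True: x is an eigenvector of A
```

Using `solve` instead of explicitly inverting 𝑻 is the standard numerically preferable route.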

1.2. Paradigmatic eigenvalue problem solutions

Reflection matrix. The matrix

𝑯 = 𝑰 - 2𝒒𝒒^T ∈ ℝ^{2×2},  ‖𝒒‖ = 1,

is the two-dimensional Householder reflector across N(𝒒^T). Vectors collinear with 𝒒 are reversed upon reflection, while vectors orthogonal to 𝒒 (i.e., in the null space N(𝒒^T)) are unchanged. It is therefore to be expected that λ_1 = -1, 𝒙_1 = 𝒒, and λ_2 = 1, 𝒒^T𝒙_2 = 0. This is readily verified:

𝑯𝒒 = (𝑰 - 2𝒒𝒒^T)𝒒 = 𝒒 - 2𝒒 = -𝒒,
𝑯𝒙_2 = (𝑰 - 2𝒒𝒒^T)𝒙_2 = 𝒙_2.

Figure 1. Reflector in two dimensions
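
The reflector eigenpairs can be confirmed numerically; a Python/NumPy sketch (the unit vector q = (3/5, 4/5) is an arbitrary choice):

```python
# Check the eigenpairs of the Householder reflector H = I - 2 q q^T:
# H q = -q (eigenvalue -1) and H x = x for x orthogonal to q (eigenvalue +1).
import numpy as np

q = np.array([3.0, 4.0]) / 5.0            # unit vector
H = np.eye(2) - 2.0 * np.outer(q, q)      # Householder reflector

x2 = np.array([-4.0, 3.0]) / 5.0          # orthogonal to q
print(np.allclose(H @ q, -q))             # True: eigenvalue -1
print(np.allclose(H @ x2, x2))            # True: eigenvalue +1
print(np.sort(np.linalg.eigvals(H)))      # [-1.  1.]
```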

Rotation matrix. The matrix

\[ R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \]

represents the isometric rotation of two-dimensional vectors. If θ = 0, 𝑹 = 𝑰 with eigenvalues λ_1 = λ_2 = 1 and eigenvector matrix 𝑿 = 𝑰. For θ = π, the eigenvalues are λ_1 = λ_2 = -1, again with eigenvector matrix 𝑿 = 𝑰. If sin θ ≠ 0, the orientation of any non-zero 𝒙 ∈ ℝ^2 changes upon rotation by θ. The characteristic polynomial has complex roots

p(λ) = (λ - cos θ)^2 + sin^2 θ ⟹ λ_{1,2} = cos θ ± i sin θ = e^{±iθ},

and the directions of invariant orientation have complex components (they lie outside the real plane ℝ^2):

\[ X = \begin{bmatrix} 1 & -1 \\ i & i \end{bmatrix}, \qquad R X = \begin{bmatrix} e^{-i\theta} & -e^{i\theta} \\ i e^{-i\theta} & i e^{i\theta} \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ i & i \end{bmatrix} \begin{bmatrix} e^{-i\theta} & 0 \\ 0 & e^{i\theta} \end{bmatrix}. \]
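
The complex eigenpairs of the rotation can also be checked numerically; a Python/NumPy sketch (θ = π/3 is an arbitrary illustrative angle):

```python
# Check that R(theta) has eigenvalues e^{+-i theta}, with complex
# eigenvector (1, i) associated with e^{-i theta}.
import numpy as np

theta = np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

lam = np.linalg.eigvals(R)
lam = lam[np.argsort(lam.imag)]       # order as e^{-i theta}, e^{+i theta}
print(np.allclose(lam, [np.exp(-1j * theta), np.exp(1j * theta)]))  # True

x = np.array([1.0, 1.0j])             # invariant direction outside the real plane
print(np.allclose(R @ x, np.exp(-1j * theta) * x))                  # True
```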

Second-order differentiation matrix. Eigenvalues of matrices arising from discretization of continuum operators can be obtained from the operator eigenproblem. The second-order differentiation operator ∂_x^2 has eigenvalues -ξ^2 associated with eigenfunctions sin(ξx):

∂_x^2 sin(ξx) = -ξ^2 sin(ξx).

Sampling of sin(ξx) at x_k = kh, k = 1, …, m, h = π/(m+1), leads to the vector 𝒖 ∈ ℝ^m with components u_k = sin(ξkh). The boundary conditions at the sampling interval end-points affect the eigenvalues. Imposing sin(ξx) = 0 at x = 0 and x = π leads to integer ξ ∈ ℕ. The derivative can be approximated at the sample points through

u_k'' ≅ (sin[ξ(x_k+h)] - 2 sin[ξx_k] + sin[ξ(x_k-h)])/h^2 = (2/h^2)(cos(ξh) - 1) sin(ξkh) = -(4/h^2) sin^2(ξh/2) sin(ξkh).

The derivative approximation vector 𝒖'' = [u_k'']_{k=1,…,m} results from a linear mapping 𝒖'' = 𝑫𝒖, and the tridiagonal matrix

\[ D = \frac{1}{h^2} \begin{bmatrix} -2 & 1 & & \\ 1 & -2 & 1 & \\ & \ddots & \ddots & \ddots \\ & & 1 & -2 \end{bmatrix}, \]

has eigenvectors 𝒖 and eigenvalues -(4/h^2) sin^2(ξh/2), ξ = 1, 2, …, m. In the limit of an infinite number of sampling points the continuum eigenvalues are obtained, exemplifying again the correspondence principle between discrete and continuum representations:

lim_{h→0} -(4/h^2) sin^2(ξh/2) = -ξ^2.
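
The discrete eigenvalues can be verified against the analytic formula; a Python/NumPy sketch (m = 8 is an arbitrary illustrative grid size, with h = π/(m+1) as in the sampling above):

```python
# Check that D = (1/h^2) tridiag(1, -2, 1) has eigenvalues
# -(4/h^2) sin^2(xi h / 2), xi = 1..m, with eigenvectors u_k = sin(xi k h).
import numpy as np

m = 8
h = np.pi / (m + 1)
D = (np.diag(-2.0 * np.ones(m)) +
     np.diag(np.ones(m - 1), 1) +
     np.diag(np.ones(m - 1), -1)) / h**2

xi = np.arange(1, m + 1)
analytic = -(4.0 / h**2) * np.sin(xi * h / 2.0)**2
print(np.allclose(np.sort(np.linalg.eigvalsh(D)), np.sort(analytic)))  # True

u = np.sin(xi[0] * np.arange(1, m + 1) * h)   # sampled sine for xi = 1
print(np.allclose(D @ u, analytic[0] * u))    # True: D u = lambda_1 u
```

Note that the sampled sines satisfy the Dirichlet boundary conditions exactly, since sin(0) = sin(ξπ) = 0, which is why no boundary correction terms appear.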

1.3. Matrix eigendecomposition

A solution 𝑿, 𝚲 to the eigenvalue problem 𝑨𝑿 = 𝑿𝚲 always exists, but the eigenvectors of 𝑨 do not always form a basis set, i.e., 𝑿 is not always of full rank. The factorized form of the characteristic polynomial of 𝑨 ∈ ℂ^{m×m} is

p_𝑨(λ) = det(λ𝑰 - 𝑨) = ∏_{k=1}^{K} (λ - λ_k)^{m_k},

with K ≤ m denoting the number of distinct roots of p_𝑨(λ), and m_k the algebraic multiplicity of eigenvalue λ_k, defined as the number of times the root λ_k is repeated. Let ℰ_k denote the associated eigenspace, ℰ_k = ℰ_𝑨(λ_k) = N(𝑨 - λ_k𝑰). The dimension of ℰ_k, denoted by n_k, is the geometric multiplicity of eigenvalue λ_k. The eigenvector matrix is of full rank when the vector sum of the eigenspaces covers ℂ^m, as established by the following results.

Proposition. If λ_i ≠ λ_j then ℰ_i ∩ ℰ_j = {𝟎} (the eigenspaces of distinct eigenvalues intersect only in the zero vector).

Proof. Let 𝒙 ∈ ℰ_i, hence 𝑨𝒙 = λ_i𝒙, and 𝒙 ∈ ℰ_j, hence 𝑨𝒙 = λ_j𝒙. Subtraction gives

𝑨𝒙 - 𝑨𝒙 = 𝟎 = (λ_i - λ_j)𝒙.

Since λ_i ≠ λ_j it results that 𝒙 = 𝟎.

Proposition. The geometric multiplicity of an eigenvalue is less than or equal to its algebraic multiplicity,

n_k = dim(N(𝑨 - λ_k𝑰)) ≤ m_k.

Proof. Let 𝑽 ∈ ℂ^{m×n_k} be an orthonormal basis for N(𝑨 - λ_k𝑰), extended to a unitary matrix 𝑼 = [𝑽 𝑾] ∈ ℂ^{m×m}. Since 𝑨𝑽 = λ_k𝑽, the similar matrix 𝑼^*𝑨𝑼 is block upper triangular with leading diagonal block λ_k𝑰 of size n_k, hence (λ - λ_k)^{n_k} divides p_𝑨(λ) and n_k ≤ m_k.

Definition 1. An eigenvalue for which the geometric multiplicity is less than the algebraic multiplicity is said to be defective.
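
A Python/NumPy illustration of a defective eigenvalue (the Jordan block J below is a standard example, not a matrix from these notes):

```python
# The Jordan block J = [[2, 1], [0, 2]] has lambda = 2 with algebraic
# multiplicity 2, but N(J - 2I) is one-dimensional: geometric multiplicity 1.
import numpy as np

J = np.array([[2.0, 1.0],
              [0.0, 2.0]])
lam = np.linalg.eigvals(J)
print(lam)                            # [2. 2.] : algebraic multiplicity 2

# geometric multiplicity = dim N(J - 2I) = m - rank(J - 2I)
n = J.shape[0] - np.linalg.matrix_rank(J - 2.0 * np.eye(2))
print(n)                              # 1 < 2, so lambda = 2 is defective
```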

Proposition 2. A matrix is diagonalizable if and only if the geometric multiplicity of each eigenvalue is equal to the algebraic multiplicity of that eigenvalue.

octave] A=[5 -4 2; 5 -4 1; -2 2 -3]; disp(A);

   5  -4   2
   5  -4   1
  -2   2  -3

octave] p=poly(A); disp(p);

   1.00000   2.00000  -1.00000  -2.00000

octave] r=roots(p); disp(r');

   1.0000  -2.0000  -1.0000

Computing eigenvalues as roots of the characteristic polynomial is numerically ill-conditioned: eye(3) has the exact triple eigenvalue λ = 1, yet the computed roots are visibly perturbed.

octave] lambda=roots(poly(eye(3))); disp(lambda')

   1.00001 - 0.00001i   1.00001 + 0.00001i   0.99999 - 0.00000i

Repeated eigenvalues are sensitive to perturbations of the matrix: a perturbation of size 10^{-3} splits the double eigenvalue λ = -2 into a complex-conjugate pair.

octave] A=[-2 1 -1; 5 -3 6; 5 -1 4]; disp([eig(A) eig(A+0.001*(rand(3,3)-0.5))])

   3.0000 + 0.0000i   3.0005 + 0.0000i
  -2.0000 + 0.0000i  -2.0000 + 0.0161i
  -2.0000 + 0.0000i  -2.0000 - 0.0161i

octave] [X,L]=eig(A); disp([L X]);

  -2.00000   0.00000   0.00000  -0.57735  -0.00000   0.57735
   0.00000   3.00000   0.00000   0.57735   0.70711  -0.57735
   0.00000   0.00000  -2.00000   0.57735   0.70711  -0.57735

octave] disp(null(A-3*eye(3)))

   0.00000
   0.70711
   0.70711

octave] disp(null(A+2*eye(3)))

   0.57735
  -0.57735
  -0.57735

Definition. A matrix which has n_λ < m_λ for any of its eigenvalues is said to be defective.

octave] A=[-2 1 -1; 5 -3 6; 5 -1 4]; [X,L]=eig(A); disp(L);

Diagonal Matrix

  -2.0000        0        0
        0   3.0000        0
        0        0  -2.0000

octave] disp(X);

  -5.7735e-01  -1.9153e-17   5.7735e-01
   5.7735e-01   7.0711e-01  -5.7735e-01
   5.7735e-01   7.0711e-01  -5.7735e-01

octave] disp(null(A+2*eye(3)));

   0.57735
  -0.57735
  -0.57735

octave] disp(rank(X))

2

The eigenvector matrix has rank 2 < 3: the first and third columns of 𝑿 are collinear, the double eigenvalue λ = -2 having only a one-dimensional eigenspace, so this matrix is defective and not diagonalizable.

2. Computation of the SVD