Lecture 12: The Eigenvalue Problem

1. Definitions

Linear endomorphisms 𝒇: ℝ^m → ℝ^m, represented by 𝑨 ∈ ℝ^{m×m}, can exhibit invariant directions 𝒙 ≠ 𝟎 for which

𝒇(𝒙)=𝑨𝒙=λ𝒙,

known as eigenvectors, with associated eigenvalue λ. Eigenvectors are non-zero elements of the null space of 𝑨-λ𝑰,

(𝑨-λ𝑰)𝒙=𝟎,

and the null space is referred to as the eigenspace of 𝑨 for eigenvalue λ, ℰ_𝑨(λ) = N(𝑨 - λ𝑰).

Non-zero solutions are obtained if 𝑨 - λ𝑰 is rank-deficient (singular), i.e., has linearly dependent columns, in which case

det(𝑨 - λ𝑰) = 0  ⇔  det(λ𝑰 - 𝑨) = | λ-a_{11}   -a_{12}    ⋯    -a_{1m}  |
                                   |  -a_{21}   λ-a_{22}   ⋯    -a_{2m}  |
                                   |     ⋮          ⋮      ⋱       ⋮     |
                                   |  -a_{m1}    -a_{m2}   ⋯   λ-a_{mm} | = 0.

From the definition of the determinant as a sum over all products formed by choosing one element from each row and column, it results that

det(λ𝑰 - 𝑨) = λ^m + c_1 λ^{m-1} + ⋯ + c_{m-1} λ + c_m = p_𝑨(λ),

known as the characteristic polynomial associated with the matrix 𝑨, and of degree m. The characteristic polynomial is monic, meaning that the coefficient of the highest power λ^m is equal to one. The fundamental theorem of algebra states that p_𝑨(λ) of degree m has m roots, hence 𝑨 ∈ ℝ^{m×m} has m eigenvalues (not necessarily distinct), and m associated eigenvectors. This can be stated in matrix form as

𝑨𝑿=𝑿𝚲,

with

𝑿 = [ 𝒙_1  ⋯  𝒙_m ],   𝚲 = diag(λ_1, …, λ_m),

the eigenvector matrix and eigenvalue matrix, respectively. By definition, the matrix 𝑨 is diagonalizable if 𝑿 is of full rank, in which case the eigendecomposition of 𝑨 is

𝑨=𝑿𝚲𝑿-1.
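As an illustrative aside (not part of the lecture derivation), the definitions above can be checked numerically. The following is a minimal numpy sketch with an arbitrarily chosen matrix; np.linalg.eig returns the eigenvalues and the eigenvector matrix.

import numpy as np

# Arbitrary illustrative matrix; any diagonalizable matrix works here
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

lam, X = np.linalg.eig(A)      # eigenvalues and eigenvector matrix
Lam = np.diag(lam)

print(np.allclose(A @ X, X @ Lam))                  # True: A X = X Lambda
print(np.allclose(A, X @ Lam @ np.linalg.inv(X)))   # True: A = X Lambda X^{-1}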

1.1. Coordinate transformations

The statement 𝑨𝒙 = λ𝒙, that eigenvector 𝒙 is an invariant direction of the operator 𝑨 along which the effect of the operator is scaling by λ, suggests that similar behavior would be obtained under a coordinate transformation 𝑻𝒚 = 𝑰𝒙 = 𝒙. Assuming 𝑻 is of full rank and introducing 𝑩 = 𝑻^{-1}𝑨𝑻, this leads to

𝑨𝒙 = 𝑨𝑻𝒚 = λ𝒙 = λ𝑻𝒚  ⇒  𝑻^{-1}𝑨𝑻𝒚 = λ𝒚.

Upon coordinate transformation, the eigenvalues (the scaling factors along the invariant directions) stay the same. Metric-preserving coordinate transformations are of particular interest, in which case the transformation matrix is unitary and 𝑩 = 𝑸^*𝑨𝑸.

Definition. Matrices 𝑨, 𝑩 ∈ ℂ^{m×m} are said to be similar, 𝑩 ∼ 𝑨, if there exists some full-rank matrix 𝑻 ∈ ℂ^{m×m} such that 𝑩 = 𝑻^{-1}𝑨𝑻.

Proposition. Similar matrices 𝑨, 𝑩 ∈ ℂ^{m×m}, 𝑩 = 𝑻^{-1}𝑨𝑻, have the same eigenvalues, and the eigenvectors 𝒙 of 𝑨 and 𝒚 of 𝑩 are related through 𝒙 = 𝑻𝒚.

Since the eigenvalues of 𝑩 ∼ 𝑨 are the same, and a polynomial is completely specified by its roots and the coefficient of its highest power, the characteristic polynomials of 𝑨, 𝑩 must be the same

p_𝑨(λ) = ∏_{k=1}^{m} (λ - λ_k) = p_𝑩(λ).

This can also be verified through the determinant definition

p_𝑩(λ) = det(λ𝑰 - 𝑩) = det(λ𝑻^{-1}𝑻 - 𝑻^{-1}𝑨𝑻) = det(𝑻^{-1}(λ𝑰 - 𝑨)𝑻) = det(𝑻^{-1}) det(λ𝑰 - 𝑨) det(𝑻) = p_𝑨(λ),

since det(𝑻^{-1}) = 1/det(𝑻).
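The invariance of the spectrum under similarity can also be observed numerically. The following is a hedged numpy sketch using random matrices purely for illustration; a random 𝑻 is full rank with probability one.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
T = rng.standard_normal((4, 4))        # generic T, full rank with probability 1
B = np.linalg.inv(T) @ A @ T           # B = T^{-1} A T is similar to A

# Same spectrum (sorted for comparison)
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(B))))   # True

# An eigenvector y of B gives an eigenvector x = T y of A
lamB, Y = np.linalg.eig(B)
x = T @ Y[:, 0]
print(np.allclose(A @ x, lamB[0] * x))                      # True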

1.2. Paradigmatic eigenvalue problem solutions

Reflection matrix. The matrix

𝑯 = 𝑰 - 2𝒒𝒒^T ∈ ℝ^{2×2},   ||𝒒|| = 1,

is the two-dimensional Householder reflector across N(𝒒^T). Vectors collinear with 𝒒 remain on the same line but reverse orientation upon reflection, while vectors orthogonal to 𝒒 (i.e., in the null space N(𝒒^T)) are unchanged. It is therefore to be expected that λ_1 = -1, 𝒙_1 = 𝒒, and λ_2 = 1, 𝒒^T𝒙_2 = 0. This is readily verified

𝑯𝒒 = (𝑰 - 2𝒒𝒒^T)𝒒 = 𝒒 - 2𝒒 = -𝒒,
𝑯𝒙_2 = (𝑰 - 2𝒒𝒒^T)𝒙_2 = 𝒙_2.

Figure 1. Reflector in two dimensions
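The expected spectrum {-1, +1} is easy to confirm numerically; a minimal numpy sketch with an arbitrarily chosen unit vector 𝒒:

import numpy as np

q = np.array([1.0, 2.0])
q = q / np.linalg.norm(q)              # unit vector defining the reflector
H = np.eye(2) - 2.0 * np.outer(q, q)   # Householder reflector

lam, X = np.linalg.eig(H)
print(np.sort(lam))                    # [-1.  1.]
print(np.allclose(H @ q, -q))          # True: q is mapped to -q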

Rotation matrix. The matrix

𝑹(θ) = [ cos θ   -sin θ
         sin θ    cos θ ],

represents the isometric rotation of two-dimensional vectors. If θ = 0, 𝑹 = 𝑰 with eigenvalues λ_1 = λ_2 = 1, and eigenvector matrix 𝑿 = 𝑰. For θ = π, the eigenvalues are λ_1 = λ_2 = -1, again with eigenvector matrix 𝑿 = 𝑰. If sin θ ≠ 0, the orientation of any non-zero 𝒙 ∈ ℝ^2 changes upon rotation by θ. The characteristic polynomial has complex roots

p(λ) = (λ - cos θ)^2 + sin^2 θ  ⇒  λ_{1,2} = cos θ ± i sin θ = e^{±iθ},

and the directions of invariant orientation have complex components (they lie outside the real plane ℝ^2)

𝑿 = [ 1   -1
      i    i ],   𝑹𝑿 = [ e^{-iθ}    -e^{iθ}
                         ie^{-iθ}    ie^{iθ} ] = 𝑿 diag(e^{-iθ}, e^{iθ}).
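A short numerical check (a sketch with an arbitrarily chosen angle θ) that the eigenvalues of 𝑹(θ) are indeed e^{±iθ}:

import numpy as np

theta = 0.7                                        # arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

lam = np.linalg.eigvals(R)                         # complex conjugate pair
expected = np.array([np.exp(1j * theta), np.exp(-1j * theta)])
print(np.allclose(np.sort_complex(lam), np.sort_complex(expected)))   # True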

Second-order differentiation matrix. Eigenvalues of matrices arising from discretization of continuum operators can be obtained from the operator eigenproblem. The second-order differentiation operator ∂_x^2 has eigenvalues -ξ^2 associated with eigenfunctions sin(ξx)

∂_x^2 sin(ξx) = -ξ^2 sin(ξx).

Sampling of sin(ξx) at x_k = kh, k = 1, …, m, h = π/(m+1), leads to the vector 𝒖 ∈ ℝ^m with components u_k = sin(ξkh). The boundary conditions at the sampling interval end-points affect the eigenvalues. Imposing sin(ξx) = 0 at x = 0 and x = π leads to integer ξ (ξ ∈ ℕ). The derivative can be approximated at the sample points through

u_k'' ≈ ( sin[ξ(x_k+h)] - 2 sin[ξx_k] + sin[ξ(x_k-h)] ) / h^2 = (2/h^2)(cos(ξh) - 1) sin(ξkh) = -(4/h^2) sin^2(ξh/2) sin(ξkh).

The derivative approximation vector 𝒖'' = [u_k'']_{k=1,…,m} results from a linear mapping 𝒖'' = 𝑫𝒖, and the matrix

𝑫 = (1/h^2) diag([ 1  -2  1 ]),

i.e., the m×m tridiagonal matrix with -2/h^2 on the main diagonal and 1/h^2 on the adjacent diagonals, has eigenvectors 𝒖 and eigenvalues -(4/h^2) sin^2(ξh/2), ξ = 1, 2, …, m. In the limit of an infinite number of sampling points the continuum eigenvalues are obtained, exemplifying again the correspondence principle between discrete and continuum representations

lim_{h→0} -(4/h^2) sin^2(ξh/2) = -ξ^2.
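This correspondence can be observed numerically; a minimal numpy sketch (m = 50 chosen arbitrarily) builds the second-difference matrix and compares its computed eigenvalues with the formula above.

import numpy as np

m = 50
h = np.pi / (m + 1)

# Tridiagonal second-difference matrix: -2 on the diagonal, 1 on the adjacent diagonals
D = (np.diag(-2.0 * np.ones(m)) +
     np.diag(np.ones(m - 1), 1) +
     np.diag(np.ones(m - 1), -1)) / h**2

xi = np.arange(1, m + 1)
analytic = -4.0 / h**2 * np.sin(xi * h / 2)**2     # predicted eigenvalues
numeric = np.linalg.eigvalsh(D)                    # D is symmetric; ascending order

print(np.allclose(numeric, np.sort(analytic)))     # True
print(analytic[:3])                                # close to -1, -4, -9 = -xi^2 for small xi*h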

1.3. Matrix eigendecomposition

A solution 𝑿, 𝚲 to the eigenvalue problem 𝑨𝑿 = 𝑿𝚲 always exists, but the eigenvectors of 𝑨 do not always form a basis set, i.e., 𝑿 is not always of full rank. The factorized form of the characteristic polynomial of 𝑨 ∈ ℝ^{m×m} is

p_𝑨(λ) = det(λ𝑰 - 𝑨) = ∏_{k=1}^{K} (λ - λ_k)^{m_k},

with K ≤ m denoting the number of distinct roots of p_𝑨(λ), and m_k the algebraic multiplicity of eigenvalue λ_k, defined as the number of times the root λ_k is repeated. Let ℰ_k denote the associated eigenspace, ℰ_k = ℰ_𝑨(λ_k) = N(𝑨 - λ_k𝑰). The dimension of ℰ_k, denoted by n_k, is the geometric multiplicity of eigenvalue λ_k. The eigenvector matrix is of full rank when the vector sum of the eigenspaces covers ℝ^m, as established by the following results.

Proposition. The geometric multiplicity is at least 1, n_k ≥ 1.

Proof. By contradiction: if n_k = dim ℰ_k = 0, then ℰ_k = {𝟎}, but eigenvectors cannot be null, contradicting the fact that λ_k is an eigenvalue.

Proposition. If λ_i ≠ λ_j then ℰ_i ∩ ℰ_j = {𝟎} (the eigenspaces of distinct eigenvalues are disjoint).

Proof. Let 𝒙 ∈ ℰ_i, hence 𝑨𝒙 = λ_i𝒙, and 𝒙 ∈ ℰ_j, hence 𝑨𝒙 = λ_j𝒙. Subtraction gives

𝑨𝒙 - 𝑨𝒙 = 𝟎 = (λ_i - λ_j)𝒙.

Since λ_i ≠ λ_j, it follows that 𝒙 = 𝟎.

Proposition. The geometric multiplicity of an eigenvalue is less than or equal to its algebraic multiplicity,

0 < n_k = dim(N(𝑨 - λ_k𝑰)) ≤ m_k.

Proof. Let 𝑽̂ ∈ ℂ^{m×n_k} be a matrix whose columns form an orthonormal basis for N(𝑨 - λ_k𝑰). By the definition of a null space, ∀ 𝒚 ∈ N(𝑨 - λ_k𝑰),

(𝑨 - λ_k𝑰)𝒚 = 𝟎  ⇒  𝑨𝒚 = λ_k𝒚,

i.e., every vector of the eigenspace is an eigenvector with eigenvalue λ_k. Then

𝑨𝑽̂ = 𝑨[ 𝒗_1  𝒗_2  ⋯  𝒗_{n_k} ] = [ 𝑨𝒗_1  𝑨𝒗_2  ⋯  𝑨𝒗_{n_k} ] = λ_k [ 𝒗_1  𝒗_2  ⋯  𝒗_{n_k} ].

Form the unitary matrix 𝑽 = [ 𝑽̂  𝒁 ] ∈ ℂ^{m×m}, and compute

𝑽^*𝑨𝑽 = [ 𝑽̂  𝒁 ]^* 𝑨 [ 𝑽̂  𝒁 ] = [ 𝑽̂  𝒁 ]^* [ 𝑨𝑽̂  𝑨𝒁 ] = [ 𝑽̂^*𝑨𝑽̂   𝑽̂^*𝑨𝒁
                                                          𝒁^*𝑨𝑽̂   𝒁^*𝑨𝒁 ].

Since 𝑽 is unitary, obtain

𝑽̂^*𝑨𝑽̂ = λ_k [ 𝒗_1  𝒗_2  ⋯  𝒗_{n_k} ]^* [ 𝒗_1  𝒗_2  ⋯  𝒗_{n_k} ] = λ_k 𝑰_{n_k},   𝒁^*𝑨𝑽̂ = λ_k [ 𝒛_1  𝒛_2  ⋯  𝒛_{m-n_k} ]^* [ 𝒗_1  𝒗_2  ⋯  𝒗_{n_k} ] = 𝟎,

where 𝑰_{n_k} is the n_k×n_k identity matrix, and in the above 𝟎 denotes an (m-n_k)×n_k matrix of zeros. The matrix

𝑩 = 𝑽^*𝑨𝑽 = [ λ_k𝑰   𝑪
              𝟎     𝑫 ]

is similar to 𝑨 and has the same eigenvalues. Since det(z𝑰 - 𝑩) = det((z - λ_k)𝑰_{n_k}) det(z𝑰 - 𝑫) = (z - λ_k)^{n_k} det(z𝑰 - 𝑫), the algebraic multiplicity of λ_k must be at least n_k, i.e., n_k ≤ m_k.

Definition 1. An eigenvalue for which the geometric multiplicity is less than the algebraic multiplicity is said to be defective.

Example. Non-defective matrices exist, for example

𝑨 = [ 1  0  0
      0  2  0
      0  0  3 ],   𝑿 = 𝑰,   𝚲 = diag([ 1  2  3 ]).

Example. Non-defective matrices with repeated eigenvalues exist, for example

𝑨 = [ 1  0  0
      0  1  0
      0  0  1 ],   𝑿 = 𝑰,   𝚲 = diag([ 1  1  1 ]).

Example. Defective matrices exist, for example

𝑨 = [ 3  1  1
      0  3  1
      0  0  3 ],

has eigenvalue λ = 3 with algebraic multiplicity m_1 = 3. Reduction to row-echelon form of 𝑨 - λ𝑰 leads to

𝑨 - λ𝑰 = [ 0  1  1
           0  0  1
           0  0  0 ],

and N(𝑨 - λ𝑰) = span{𝒆_1}, i.e., the geometric multiplicity is equal to 1. A matrix of this type is associated with a Jordan block.
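Numerically, the defect shows up as follows (a small numpy sketch; note that np.linalg.eig still returns three eigenvector columns, but they are nearly parallel):

import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 3.0]])

print(np.linalg.eigvals(A))                              # [3. 3. 3.]: algebraic multiplicity 3

# Geometric multiplicity = dim N(A - 3I) = 3 - rank(A - 3I)
print(3 - np.linalg.matrix_rank(A - 3.0 * np.eye(3)))    # 1 < 3, so A is defective

# The computed eigenvector matrix is numerically close to singular
lam, X = np.linalg.eig(A)
print(np.linalg.cond(X))                                 # very large: X is not a usable basis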

Proposition 2. A matrix is diagonalizable if the geometric multiplicity of each eigenvalue is equal to the algebraic multiplicity of that eigenvalue.

Proof. Recall that 𝑨 is diagonalizable if the eigenvector matrix 𝑿 is of full rank. Since the eigenspaces ℰ_j of the K distinct eigenvalues are disjoint, the column space of 𝑿 is the direct vector sum of the eigenspaces

C(𝑿) = ℰ_1 ⊕ ⋯ ⊕ ℰ_K.

The dimension of C(𝑿) is therefore given by the sum of the eigenspace dimensions

dim C(𝑿) = ∑_{k=1}^{K} n_k ≤ ∑_{k=1}^{K} m_k = m.

Since n_k ≤ m_k, the only possibility for 𝑿 to be of full rank, dim C(𝑿) = m, is for n_k = m_k for all k.

1.4. Matrix properties from eigenvalues

Eigenvalues as roots of the characteristic polynomial

p_𝑨(λ) = det(λ𝑰 - 𝑨) = λ^m + c_1 λ^{m-1} + ⋯ + c_{m-1} λ + c_m = ∏_{k=1}^{m} (λ - λ_k)

reveal properties of a matrix 𝑨 ∈ ℝ^{m×m}. The evaluation of p_𝑨(0) leads to

p_𝑨(0) = det(-𝑨) = (-1)^m det(𝑨) = ∏_{k=1}^{m} (0 - λ_k) = (-1)^m ∏_{k=1}^{m} λ_k,

hence the determinant of a matrix is given by the product of its eigenvalues

det(𝑨) = ∏_{k=1}^{m} λ_k.

The trace of a matrix, the sum of its diagonal elements, is equal to the sum of its eigenvalues

tr(𝑨) = ∑_{k=1}^{m} a_{kk} = ∑_{k=1}^{m} λ_k,

a relationship established by Vieta's formulas: the coefficient c_1 of λ^{m-1} equals -∑_{k=1}^{m} λ_k by Vieta, and equals -∑_{k=1}^{m} a_{kk} from the determinant expansion, in which only the product of the diagonal entries contributes terms of degree m-1.
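Both identities are easy to check numerically; a minimal numpy sketch with a random matrix (complex eigenvalues occur in conjugate pairs, so the product and sum are real up to rounding):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
lam = np.linalg.eigvals(A)

print(np.allclose(np.linalg.det(A), np.prod(lam).real))  # det(A) = product of eigenvalues
print(np.allclose(np.trace(A), np.sum(lam).real))        # tr(A)  = sum of eigenvalues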

1.5. Matrix eigendecomposition applications

Whereas the SVD, QR, LU decompositions can be applied to general matrices 𝑨 ∈ ℝ^{m×n} with m not necessarily equal to n, the eigendecomposition requires 𝑨 ∈ ℝ^{m×m}, and hence is especially relevant in the characterization of endomorphisms. A generic time evolution problem is stated as

∂_t 𝒖 = 𝒖_t = 𝒇(𝒖),   𝒖(0) = 𝒖_0,   𝒖 : ℝ_+ → ℝ^m,

stating that the rate of change in the state variables 𝒖 characterizing some system is a function of the current state through the function 𝒇 : ℝ^m → ℝ^m, an endomorphism. An approximation of 𝒇 is furnished by the Maclaurin series

𝒇(𝒖) = 𝒗 + 𝑨𝒖 + 𝒪(||𝒖||^2),   𝒗 = 𝒇(𝟎),   𝑨 = ∂𝒇/∂𝒖(𝟎).

Truncation at first order gives a linear ODE system 𝒖_t = 𝒗 + 𝑨𝒖, which can be formally integrated to give

𝒖(t) = e^{t𝑨}𝒖_0 + ∫_0^t e^{(t-s)𝑨}𝒗 ds,

reducing to 𝒖(t) = e^{t𝑨}𝒖_0 when 𝒗 = 𝟎.

The matrix exponential e^{t𝑨} is defined as

e^{t𝑨} = 𝑰 + (1/1!) t𝑨 + (1/2!) (t𝑨)^2 + (1/3!) (t𝑨)^3 + ⋯ .

Evaluation of 𝑨^n requires n-1 matrix multiplications, on the order of (n-1)m^3 floating point operations. However, if the eigendecomposition 𝑨 = 𝑿𝚲𝑿^{-1} is available, the matrix exponential can be evaluated in only about 2m^3 operations since

𝑨^k = (𝑿𝚲𝑿^{-1})(𝑿𝚲𝑿^{-1}) ⋯ (𝑿𝚲𝑿^{-1}) = 𝑿𝚲^k𝑿^{-1},

leads to

e^{t𝑨} = 𝑿 e^{t𝚲} 𝑿^{-1}.
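A minimal numpy sketch (random 𝑨 and an arbitrary t, purely illustrative) comparing e^{t𝑨} evaluated through the eigendecomposition with a truncated power series:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
t = 0.5

# Via the eigendecomposition: X e^{t Lambda} X^{-1}; tiny imaginary round-off is discarded
lam, X = np.linalg.eig(A)
expA_eig = (X @ np.diag(np.exp(t * lam)) @ np.linalg.inv(X)).real

# Via the defining power series, truncated after enough terms to converge here
expA_series = np.eye(4)
term = np.eye(4)
for k in range(1, 30):
    term = term @ (t * A) / k
    expA_series = expA_series + term

print(np.allclose(expA_eig, expA_series))   # True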

2. Computation of the SVD

The existence of the SVD 𝑨 = 𝑼𝚺𝑽^* was established by a constructive procedure using complete induction. However, the proof depends on determining the singular values, e.g., σ_1 = ||𝑨||_2. The existence of the singular values was established by an argument from analysis, namely that the norm function on a compact domain must attain its extrema. This however leaves open the problem of effectively determining the singular values. In practice the singular values and vectors are determined by solving the eigenvalue problems for 𝑨^*𝑨 and 𝑨𝑨^*

𝑨^*𝑨 = (𝑼𝚺𝑽^*)^*(𝑼𝚺𝑽^*) = 𝑽𝚺^T𝑼^*𝑼𝚺𝑽^* = 𝑽𝚺^T𝚺𝑽^*  ⇒  (𝑨^*𝑨)𝑽 = 𝑽𝚺^T𝚺,
𝑨𝑨^* = (𝑼𝚺𝑽^*)(𝑼𝚺𝑽^*)^* = 𝑼𝚺𝑽^*𝑽𝚺^T𝑼^* = 𝑼𝚺𝚺^T𝑼^*  ⇒  (𝑨𝑨^*)𝑼 = 𝑼𝚺𝚺^T.

From the above, the left singular vectors 𝑼 are eigenvectors of 𝑨𝑨^*, and the right singular vectors 𝑽 are eigenvectors of 𝑨^*𝑨. Both 𝑨^*𝑨 and 𝑨𝑨^* have the same non-zero eigenvalues, which are the squared singular values.
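A minimal numpy sketch, assuming a real matrix so that 𝑨^* = 𝑨^T and using random data purely for illustration: the squared singular values appear as the eigenvalues of 𝑨^T𝑨, and the corresponding eigenvectors match the right singular vectors up to a sign per column.

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))

# Eigenvalues of A^T A are the squared singular values
lam, V = np.linalg.eigh(A.T @ A)            # symmetric eigenproblem, ascending order
sigma_from_eig = np.sqrt(lam[::-1])         # reorder to descending

U, sigma, Vt = np.linalg.svd(A)
print(np.allclose(sigma_from_eig, sigma))                 # True

# Right singular vectors = eigenvectors of A^T A, up to a sign per column
print(np.allclose(np.abs(V[:, ::-1]), np.abs(Vt.T)))      # True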