Lecture 13: Power Iterations

1.Reduction to triangular form

The relevance of eigendecompositions 𝑨=𝑿𝚲𝑿-1 to repeated application of the linear operator 𝑨m×m as in

𝒆t𝑨=𝑰+11!t𝑨+12!t2𝑨2+=𝑿et𝚲𝑿-1,

suggests that algorithms that construct powers of 𝑨 might reveal eigenvalues. This is indeed the case and leads to a class of algorithms of wide applicability in scientific computation. First, observe that taking condition numbers gives

μ(𝑨)=μ(𝑿𝚲𝑿-1)μ2(𝑿)μ(𝚲)=(|λ|max/|λ|min),

where |λ|max, |λ|min are the eigenvalues of maximum and minimum absolute values. While these express an intrinsic property of the operator 𝑨, the factor μ2(𝑿) is associated with the conditioning of a change of coordinates, and a natural question is whether it is possible to avoid any ill-conditioning associated with a basis set 𝑿 that is close to linear dependence. The answer to this line of inquiry is given by the following result.

Schur Theorem. For any 𝑨m×m there exists 𝑸 unitary and 𝑻 upper triangular such that 𝑨=𝑸𝑻𝑸.

Proof. Proceed by induction, starting from an arbitrary eigenvalue λ and eigenvector 𝒙. Let 𝒖1=𝒙/||𝒙||, the first column vector of a unitary matrix 𝑼=[ 𝒖1 𝑽 ]. Then

𝑼𝑨𝑼=[ 𝒖1 𝑽 ]𝑨[ 𝒖1 𝑽 ]=[ 𝒖1 𝑽 ][ 𝑨𝒖1 𝑨𝑽 ]=[ 𝒖1 𝑽 ][ λ𝒖1 𝑨𝑽 ]=[ λ1 𝒃 𝟎 𝑪 ],

with 𝑪(m-1)×(m-1) that by the inductive hypothesis can be written as 𝑪=𝑾𝑺𝑾, with 𝑾 unitary, 𝑺 upper triangular. The matrix

𝑸=𝑼[ 1 𝟎 𝟎 𝑾 ]

is a product of unitary matrices, hence itself unitary. The computation

𝑸𝑨𝑸=(𝑼[ 1 𝟎 𝟎 𝑾 ])𝑨𝑼[ 1 𝟎 𝟎 𝑾 ]=[ 1 𝟎 𝟎 𝑾 ]𝑼𝑨𝑼[ 1 𝟎 𝟎 𝑾 ]=[ 1 𝟎 𝟎 𝑾 ][ λ1 𝒃 𝟎 𝑪 ][ 1 𝟎 𝟎 𝑾 ]=[ λ1 𝒃 𝟎 𝑺 ]=𝑻,

then shows that 𝑻 is indeed triangular.

The eigenvalues of an upper triangular matrix are simply its diagonal elements, so the Schur factorization is an eigenvalue-revealing factorization.

2.Power iteration for real symmetric matrices

When the operator 𝑨 expresses some physical phenomenon, the principle of action and reaction implies that 𝑨m×m is symmetric, 𝑨=𝑨T and has real eigenvalues. Componentwise, symmetry of 𝑨=[aij] implies aij=aji. Consider 𝑨𝒙=λ𝒙, and take the adjoint to obtain 𝒙T𝑨T=λ𝒙T, or 𝒙T𝑨=λ𝒙T since 𝑨 is symmetric. Form scalar products 𝒙T𝑨𝒙=λ𝒙T𝒙, 𝒙T𝑨T𝒙=λ𝒙T𝒙, and subtract to obtain

0=(λ-λ)𝒙T𝒙λ=λλ,

since 𝒙𝟎, an eigenvector.

Example. Consider a linear array of identical mass-springs. The ith point mass obeys the dynamics

mx¨i=k(xi+1-xi)-k(xi-xi-1)=k(xi+1-2xi+xi-1),

expressed in matrix form as 𝒙¨=𝑨𝒙, with 𝑨 symmetric.

For a real symmetric matrix the Schur theorem states that

𝑨=𝑨T(𝑸𝑻𝑸T)=𝑸𝑻T𝑸T𝑻=𝑻T,

and since a symmetric triangular matrix is diagonal, the Schur factorization is also an eigendecomposition, and the eigenvector matrix 𝑸 is a basis, C(𝑸)=m.

2.1.The power iteration idea

Assume initially that the eigenvalues are distinct and ordered |λ1|>|λ2|>>|λm|. Repeated application of 𝑨 on an arbitrary vector 𝒗=𝑸𝒄m=C(𝑸) is expressed as

𝑨n𝒗=(𝑸𝚲𝑸T)n𝑸𝒄=(𝑸𝚲𝑸T)(𝑸𝚲𝑸T)(𝑸𝚲𝑸T)𝑸𝒄=𝑸𝚲n𝒄,

a linear combination of the columns of 𝑸 (eigenvectors of 𝑨) with coefficients 𝚲n𝒄=[ λ1nc1 λ2nc2 λmncm ]T.

For large enough n, |λ1|>|λk|, k=2,,n, leads to a dominant contribution along the direcion of eigenvector 𝒒1

𝑨n𝒗=𝑸𝚲n𝒄=λ1nc1𝒒1++λmncm𝒒mλ1nc1𝒒1.

This gives a procedure for finding one eigenvector of a matrix, and the Schur theorem proof suggests a recursive algorithm to find all eigenvalues can be defined.

The sequence of normalized eigenvector approximants 𝒗n=𝑨n𝒗/||𝑨n𝒗|| is linearly convergent at rate r=|λ2/λ1|.

2.2.Rayleigh quotient

To estimate the eigenvalue revealed by power iteration, formulate the least squares problem

minc||𝑨𝒗-𝒗c||,

that seeks the best approximation of one power iteration 𝑨𝒗 as a linear combination of the initial vector 𝒗. Of course, if 𝒗=𝒒 is an eigenvector, then the solution would be c=λ, the associated eigenvalue. The projector onto C(𝒗) is

𝑷=𝒗𝒗T𝒗T𝒗,

that when applied to 𝑨𝒗 gives the equation

𝑷𝑨𝒗=𝒗𝒗T𝒗T𝒗𝑨𝒗=𝒗T𝑨𝒗𝒗T𝒗𝒗=c𝒗c=𝒗T𝑨𝒗𝒗T𝒗.

The function r:m,

r(𝒗)=𝒗T𝑨𝒗𝒗T𝒗,

is known as the Rayleigh quotient which, evaluated for an eigenvector, gives r(𝒒)=λ. To determine how well the eigenvalue is approximated, carry out a Taylor series in the vicinity of an eigenvector 𝒒

r(𝒗)=r(𝒒)+11![𝒗r(𝒒)]T(𝒗-𝒒)+𝒪(||𝒗-𝒒||2),

where 𝒗r is the gradient of r(𝒗)

𝒗r=[ rv1 rvm ].

Compute the gradient through differentiation of the Rayleigh quotient

𝒗r(𝒗)=𝒗(𝒗T𝑨𝒗)𝒗T𝒗-(𝒗T𝑨𝒗)(𝒗T𝒗)2𝒗(𝒗T𝒗).

Noting that 𝒗vi=𝒆i, the ith column of 𝑰, the gradient of 𝒗T𝒗 is

𝒗(𝒗T𝒗)=𝒗i=1mvi2=i=1m𝒗vi2=i=1m2vi𝒗vi=2i=1mvi𝒆i=2𝒗.

To compute 𝒗(𝒗T𝑨𝒗), let 𝒖=𝑨𝒗, and since 𝑨 is symmetric 𝒖T=𝒗T𝑨T=𝒗T𝑨, leading to

𝒗(𝒗T𝑨𝒗)=𝒗(𝒖T𝒗)=i=1m𝒗(uivi)=i=1mui𝒗vi+i=1mvi𝒗ui.

Use ui=j=1maijvj also expressed as uj=i=1majivi by swapping indices to obtain

𝒗ui=j=1maij𝒗vj=j=1maij𝒆j

and therefore

i=1mvi𝒗ui=i=1mvij=1maij𝒆j=j=1mi=1maijvi𝒆j=j=1mi=1maijvi𝒆j.

Use symmetry of 𝑨 to write

i=1maijvi=i=1majivi=uj,

and substitute above to obtain

i=1mvi𝒗ui=j=1muj𝒆j=𝒖=𝑨𝒗.

Gathering the above results

𝒗(𝒗T𝒗)=2𝒗,𝒗(𝒗T𝑨𝒗)=2𝑨𝒗,

gives the following gradient of the Rayleigh quotient

𝒗r(𝒗)=2𝒗T𝒗(𝑨𝒗-r(𝒗)𝒗).

When evaluated at 𝒗=𝒒, obtain 𝒗r(𝒒)=𝟎, implying that near an eigenvector the Rayleigh quotient approximation of an eigenvalue is of quadratic accuracy,

r(𝒗)-λ=𝒪(||𝒗-𝒒||2).

2.3.Refining the power iteration idea

Power iteration furnishes the largest eigenvalue. Further eigenvalues can be found by use of the following properties:

If μ is a known initial approximation of the eigenvalue then the inverse power iteration 𝒗n=(𝑨-μ𝑰)-1𝒗n-1, actually implemented as successive solution of linear systems

(𝑨-μ𝑰)𝒗n=𝒗n-1,

leads to a sequence of Rayleigh quotients 𝒓(𝒗n) that converges quadratically to an eigenvalue close to μ. An important refinement of the idea is to change the shift at each iteration which leads to cubic order of convergence

Algorithm (Rayleigh quotient iteration)

Given 𝒗,𝑨

μ=𝒗T𝑨𝒗/𝒗T𝒗

for i=1 to nmax

𝒘=(𝑨-μ𝑰)\𝒗 (solve linear system)

𝒗=𝒘/||𝒘||

λ=𝒗T𝑨𝒗

if |λ-μ|<ε exit

μ=λ

end

return λ,𝒗

Power iteration can be applied simultaneously to multiple directions at once

Algorithm (Simultaneous iteration)

Given 𝑨

𝑸=𝑰; 𝝁=diag(𝑨)

for i=1 to nmax

𝑽=𝑨𝑸(power iteration applied to multiple directions)

𝑸𝑹=𝑽(orthogonalize new directions)

𝝀=diag(𝑸T𝑨𝑸)

if ||𝝀-𝝁||<ε exit

end

return 𝝀,𝑸