MATH661

1.Reduction to triangular form

The relevance of eigendecompositions $𝑨 = 𝑿 𝚲 𝑿^{- 1}$ to repeated application of the linear operator $𝑨 \in ℂ^{m \times m}$, as in

$$𝑨^{n} = 𝑿 𝚲^{n} 𝑿^{- 1},$$

suggests that algorithms that construct powers of $𝑨$ might reveal eigenvalues. This is indeed the case and leads to a class of algorithms of wide applicability in scientific computation. First, observe that taking condition numbers (in the 2-norm) gives

$$μ (𝑨) = μ (𝑿 𝚲 𝑿^{- 1}) \leq μ (𝑿) μ (𝚲) μ (𝑿^{- 1}) = μ^{2} (𝑿) \frac{{| λ |}_{max}}{{| λ |}_{min}},$$

where ${| λ |}_{max}$, ${| λ |}_{min}$ are the maximum and minimum absolute values of the eigenvalues. While these express an intrinsic property of the operator $𝑨$, the factor $μ^{2} (𝑿)$ is associated with the conditioning of a change of coordinates, and a natural question is whether it is possible to avoid any ill-conditioning associated with a basis set $𝑿$ that is close to linear dependence. The answer to this line of inquiry is given by the following result.
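The role of the factor $μ^{2} (𝑿)$ can be observed numerically. The following is a minimal Julia sketch, assuming a small nearly defective test matrix chosen only for illustration; the helper name `cond_bound` is hypothetical and not part of the notes.

```julia
using LinearAlgebra

# Hypothetical helper: returns μ(A) and the bound μ(X)^2 |λ|max / |λ|min
# obtained from the eigendecomposition A = X Λ X^{-1}.
function cond_bound(A)
    λ, X = eigen(A)                       # eigenvalues and eigenvector matrix
    bound = cond(X)^2 * maximum(abs, λ) / minimum(abs, λ)
    return cond(A), bound
end

# Assumed test matrix: two close eigenvalues, nearly parallel eigenvectors
A = [1.0 2.0; 0.0 1.001]
κA, bound = cond_bound(A)
println("μ(A) = ", κA, ", bound = ", bound)
```

For this matrix the eigenvectors are almost linearly dependent, so the bound is far larger than $μ (𝑨)$ itself, which is the ill-conditioning of the change of coordinates mentioned above.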

Schur Theorem. For any $𝑨 \in ℂ^{m \times m}$ there exists $𝑸$ unitary and $𝑻$ upper triangular such that $𝑨 = 𝑸 𝑻 𝑸^{*}$ .

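Proof. Proceed by induction on $m$, the case $m = 1$ being trivial. Start from an arbitrary eigenvalue $λ$ and eigenvector $𝒙$ . Let $𝒖_{1} = 𝒙 / || 𝒙 ||$ be the first column vector of a unitary matrix $𝑼 = [\begin{array}{ll} 𝒖_{1} & 𝑽 \end{array}]$ . Then

$$𝑼^{*} 𝑨 𝑼 = [\begin{array}{ll} λ & 𝒃^{*} \\ 𝟎 & 𝑪 \end{array}],$$

with $𝒃^{*} = 𝒖_{1}^{*} 𝑨 𝑽$ and $𝑪 = 𝑽^{*} 𝑨 𝑽 \in ℂ^{(m - 1) \times (m - 1)}$ that by the inductive hypothesis can be written as $𝑪 = 𝑾 𝑺 𝑾^{*}$ , with $𝑾$ unitary, $𝑺$ upper triangular. The matrix

$$𝑸 = 𝑼 [\begin{array}{ll} 1 & 𝟎 \\ 𝟎 & 𝑾 \end{array}]$$

is unitary as a product of unitary matrices, and

$$𝑸^{*} 𝑨 𝑸 = [\begin{array}{ll} λ & 𝒃^{*} 𝑾 \\ 𝟎 & 𝑺 \end{array}] = 𝑻$$

is upper triangular, hence $𝑨 = 𝑸 𝑻 𝑸^{*}$ .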

The eigenvalues of an upper triangular matrix are simply its diagonal elements, so the Schur factorization is an eigenvalue-revealing factorization.
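The theorem can be checked numerically with the `schur` function of the Julia standard library `LinearAlgebra`. The sketch below assumes a small test matrix chosen only for illustration, and the helper name `schur_check` is hypothetical. Complex arithmetic is requested so that $𝑻$ is truly triangular even when eigenvalues are complex.

```julia
using LinearAlgebra

# Hypothetical helper: Schur factorization A = Q T Q*, computed in complex
# arithmetic; returns Q, T and the diagonal of T.
function schur_check(A)
    F = schur(complex.(A))
    Q, T = F.Z, F.T
    return Q, T, diag(T)
end

# Assumed test matrix (nonsymmetric)
A = [4.0 1.0 2.0; 0.5 3.0 1.0; 2.0 1.0 1.0]
Q, T, λ = schur_check(A)
println(norm(Q*T*Q' - A))       # reconstruction error
println(norm(Q'*Q - I))         # departure of Q from unitarity
println(norm(tril(T, -1)))      # size of the strictly lower triangular part of T
```

The diagonal `λ` of `T` can be compared against `eigvals(A)`; the two contain the same values, possibly in a different order.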

2.Power iteration for real symmetric matrices

When the operator $𝑨$ expresses some physical phenomenon, the principle of action and reaction implies that $𝑨 \in ℝ^{m \times m}$ is symmetric, $𝑨 = 𝑨^{T}$, and a real symmetric matrix has real eigenvalues. Componentwise, symmetry of $𝑨 = [a_{i j}]$ implies $a_{i j} = a_{j i}$. Consider $𝑨 𝒙 = λ 𝒙$ with $𝒙 \neq 𝟎$, and take the adjoint to obtain $𝒙^{*} 𝑨^{T} = \overline{λ} 𝒙^{*}$ (the entries of $𝑨$ are real), or $𝒙^{*} 𝑨 = \overline{λ} 𝒙^{*}$ since $𝑨$ is symmetric. Form scalar products $𝒙^{*} 𝑨 𝒙 = λ 𝒙^{*} 𝒙$, $𝒙^{*} 𝑨^{T} 𝒙 = \overline{λ} 𝒙^{*} 𝒙$, and subtract to obtain

$$0 = (λ - \overline{λ}) 𝒙^{*} 𝒙 \Rightarrow λ = \overline{λ} \in ℝ,$$

since $𝒙^{*} 𝒙 > 0$.

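Example. Consider a linear array of identical mass-springs. The $i^{th}$ point mass obeys the dynamics

$$M \ddot{y}_{i} = κ (y_{i - 1} - 2 y_{i} + y_{i + 1}),$$

where $y_{i}$ denotes the displacement of mass $i$, $M$ the mass, and $κ$ the spring stiffness. The force exerted by mass $i$ on mass $i + 1$ is equal and opposite to that exerted by mass $i + 1$ on mass $i$, so the coefficient matrix of the system is symmetric.

For a real symmetric matrix the Schur factorization $𝑨 = 𝑸 𝑻 𝑸^{T}$ gives $𝑻 = 𝑸^{T} 𝑨 𝑸 = 𝑻^{T}$, and since a symmetric triangular matrix is diagonal, the Schur factorization is also an eigendecomposition, and the columns of the eigenvector matrix $𝑸$ form a basis, $C (𝑸) = ℝ^{m}$.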
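For the mass-spring example, real eigenvalues and an orthonormal eigenvector basis can be checked numerically. The Julia sketch below assumes unit masses, unit spring stiffness and fixed ends, which gives a tridiagonal matrix with 2 on the diagonal and -1 next to it; the helper name `spring_chain` is hypothetical.

```julia
using LinearAlgebra

# Hypothetical helper: assumed stiffness matrix of m identical masses joined
# by identical springs, fixed ends, unit mass and unit stiffness.
function spring_chain(m)
    return SymTridiagonal(fill(2.0, m), fill(-1.0, m - 1))
end

m = 6
A = spring_chain(m)
λ, Q = eigen(A)                             # eigenvalues (ascending), eigenvectors
println(eltype(λ))                          # real element type
println(norm(Q'*Q - I))                     # columns of Q are orthonormal
println(norm(Q*Diagonal(λ)*Q' - Matrix(A))) # A = Q Λ Q^T
```

For this particular matrix the eigenvalues are known in closed form, $λ_{k} = 2 - 2 \cos (k π / (m + 1))$, $k = 1, \dots, m$, which gives an independent check of the computed values.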

2.1.The power iteration idea

Assume initially that the eigenvalues are distinct and ordered $| λ_{1} | > | λ_{2} | > \dots > | λ_{m} |$. Repeated application of $𝑨$ on an arbitrary vector $𝒗 = 𝑸 𝒄 \in ℝ^{m} = C (𝑸)$ is expressed as

$$𝑨^{n} 𝒗 = 𝑸 𝚲^{n} 𝑸^{T} 𝑸 𝒄 = 𝑸 𝚲^{n} 𝒄,$$

a linear combination of the columns of $𝑸$ (eigenvectors of $𝑨$) with coefficients

$$𝚲^{n} 𝒄 = {[\begin{array}{llll} λ_{1}^{n} c_{1} & λ_{2}^{n} c_{2} & \dots & λ_{m}^{n} c_{m} \end{array}]}^{T} .$$

$\circ$ For large enough $n$ , $| λ_{1} | > | λ_{k} |$ , $k = 2, \dots, m$ , leads to a dominant contribution along the direction of eigenvector $𝒒_{1}$ , provided $c_{1} \neq 0$ ,

$$𝑨^{n} 𝒗 = 𝑸 𝚲^{n} 𝒄 = λ_{1}^{n} c_{1} 𝒒_{1} + \dots + λ_{m}^{n} c_{m} 𝒒_{m} ≅ λ_{1}^{n} c_{1} 𝒒_{1} .$$

This gives a procedure for finding one eigenvector of a matrix, and the proof of the Schur theorem suggests that the procedure can be applied recursively to find all eigenvalues.
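The procedure can be sketched in a few lines of Julia. This is a minimal version under stated assumptions: the iterate is renormalized at each step to avoid overflow or underflow, a fixed number of steps is taken, and the function name `power_iteration` and the test matrix are chosen here only for illustration.

```julia
using LinearAlgebra

# Minimal power iteration sketch: apply A repeatedly, renormalizing each time.
function power_iteration(A, v0; n=50)
    v = v0 / norm(v0)
    for _ in 1:n
        v = A*v            # one application of the operator
        v /= norm(v)       # keep unit length
    end
    return v
end

# Assumed symmetric test matrix with eigenvalues (5 ± √5)/2
A = [2.0 1.0; 1.0 3.0]
v = power_iteration(A, [1.0, 0.0])
println(norm(A*v - (v'*A*v)*v))   # eigenvector residual
```

The residual printed at the end measures how closely `v` satisfies $𝑨 𝒗 = λ 𝒗$ with $λ$ estimated as $𝒗^{T} 𝑨 𝒗$.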

Example. Construct an $m \times m$ matrix with eigenvalues $λ_{j} = 2^{1 - j}$ , $j = 1, 2, \dots, m$ , and arbitrary orthonormalized eigenvectors $𝑸$ .

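∴	using LinearAlgebra

∴	# columns of Q: orthonormal eigenvectors; diagonal of Λ: eigenvalues 2^(1-j)

∴	function genAQΛ(m) X=rand(m,m); Q=Matrix(qr(X).Q); Λ=diagm(2.0 .^ (0:-1:1-m)); A=Q*Λ*Q'; return A,Q,Λ end;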

Carry out $n$ power iterations starting from some arbitrary initial vector $𝒗_{0}$ , constructing

𝑽 = [\begin{array}{llll} 𝒗_{0} & 𝑨 𝒗_{0} & \dots & 𝑨^{n} 𝒗_{0} \end{array}]

∴	m=6; A,Q,Λ=genAQΛ(m);

∴	n=20; V=rand(m,n+1);

∴	for k=1:n; V[:,k+1] = A*V[:,k]; end   # column k+1 holds A^k v0

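Compute components of the vectors in the $𝑸$ basis, $𝑪 = 𝑸^{T} 𝑽$ . Row $i$ of $𝑪$ contains the coefficients of $𝑨^{j} 𝒗_{0}$ , $j = 0, \dots, n$ , along eigenvector $𝒒_{i}$ . Notice that for $i > 1$ the coefficients decrease as $2^{(1 - i) j}$ , while the coefficient along $𝒒_{1}$ stays constant since $λ_{1} = 1$ .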

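∴	using PyPlot   # plotting package assumed to be PyPlot (provides clf, plot, xlabel, ylabel)

∴	C=Q'*V;

∴	clf(); plot(log10.(abs.(C')));   # one curve per eigenvector, plotted against iteration

∴	xlabel("Iteration"); ylabel("lg|c_i|");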

The sequence of normalized eigenvector approximants $𝒗_{n} = 𝑨^{n} 𝒗 / || 𝑨^{n} 𝒗 ||$ is linearly convergent at rate $r = | λ_{2} / λ_{1} |$ .
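The linear rate can be observed by measuring the error of successive normalized iterates against the known eigenvector. The Julia sketch below assumes the same construction as the example above (eigenvalues $2^{1 - j}$, random orthonormal eigenvectors) and a starting vector with equal components along all eigenvectors; the name `power_errors` is hypothetical.

```julia
using LinearAlgebra

# Hypothetical helper: errors of normalized power iterates with respect to a
# known unit eigenvector q1.
function power_errors(A, q1, v0, n)
    v = v0 / norm(v0)
    err = zeros(n)
    for k in 1:n
        v = A*v
        v /= norm(v)
        err[k] = min(norm(v - q1), norm(v + q1))   # the sign of v is arbitrary
    end
    return err
end

m = 6
Q = Matrix(qr(rand(m, m)).Q)        # random orthonormal eigenvectors
Λ = Diagonal(2.0 .^ (0:-1:1-m))     # eigenvalues 1, 1/2, ..., 2^(1-m)
A = Q*Λ*Q'
v0 = Q*ones(m)                      # assumed start: c_i = 1 for all i
err = power_errors(A, Q[:, 1], v0, 20)
println(err[20] / err[19])          # ratio of successive errors
```

The printed ratio of successive errors should be close to $| λ_{2} / λ_{1} | = 1 / 2$ for this matrix.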

2.2.Rayleigh quotient

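To estimate the eigenvalue revealed by power iteration, formulate the least squares problem

$$\min_{c \in ℝ} || 𝒗 c - 𝑨 𝒗 ||_{2},$$

that seeks the best approximation of one power iteration $𝑨 𝒗$ as a multiple of the initial vector $𝒗$. Of course, if $𝒗 = 𝒒$ is an eigenvector, then the solution would be $c = λ$, the associated eigenvalue. The projector onto $C (𝒗)$ is

$$𝑷_{𝒗} = \frac{𝒗 𝒗^{T}}{𝒗^{T} 𝒗},$$

and the least squares solution satisfies $𝒗 c = 𝑷_{𝒗} 𝑨 𝒗$, hence

$$c = r (𝒗) = \frac{𝒗^{T} 𝑨 𝒗}{𝒗^{T} 𝒗},$$

known as the Rayleigh quotient which, evaluated for an eigenvector, gives $r (𝒒) = λ$. To determine how well the eigenvalue is approximated, carry out a Taylor series in the vicinity of an eigenvector $𝒒$

$$r (𝒗) = r (𝒒) + {[\nabla_{𝒗} r (𝒒)]}^{T} (𝒗 - 𝒒) + \mathcal{O} (|| 𝒗 - 𝒒 ||^{2}) .$$

By the quotient rule

$$\nabla_{𝒗} r (𝒗) = \frac{1}{𝒗^{T} 𝒗} [\nabla_{𝒗} (𝒗^{T} 𝑨 𝒗) - r (𝒗) \nabla_{𝒗} (𝒗^{T} 𝒗)] .$$

Noting that $\nabla_{𝒗} v_{i} = 𝒆_{i}$, the $i^{th}$ column of $𝑰$, the gradient of $𝒗^{T} 𝒗$ is

$$\nabla_{𝒗} (𝒗^{T} 𝒗) = \nabla_{𝒗} \sum_{i = 1}^{m} v_{i}^{2} = \sum_{i = 1}^{m} 2 v_{i} 𝒆_{i} = 2 𝒗 .$$

To compute $\nabla_{𝒗} (𝒗^{T} 𝑨 𝒗)$, let $𝒖 = 𝑨 𝒗$, and since $𝑨$ is symmetric $𝒖^{T} = 𝒗^{T} 𝑨^{T} = 𝒗^{T} 𝑨$, leading to

$$\nabla_{𝒗} (𝒗^{T} 𝑨 𝒗) = \nabla_{𝒗} (𝒖^{T} 𝒗) = \nabla_{𝒗} \sum_{i = 1}^{m} u_{i} v_{i} = \sum_{i = 1}^{m} u_{i} 𝒆_{i} + \sum_{i = 1}^{m} v_{i} \nabla_{𝒗} u_{i} = 𝒖 + \sum_{i = 1}^{m} v_{i} \nabla_{𝒗} u_{i} .$$

Use $u_{i} = \sum_{j = 1}^{m} a_{i j} v_{j}$, also expressed as $u_{j} = \sum_{i = 1}^{m} a_{j i} v_{i}$ by swapping indices, and the symmetry $a_{i j} = a_{j i}$ to obtain

$$\sum_{i = 1}^{m} v_{i} \nabla_{𝒗} u_{i} = \sum_{i = 1}^{m} v_{i} \sum_{j = 1}^{m} a_{i j} 𝒆_{j} = \sum_{j = 1}^{m} (\sum_{i = 1}^{m} a_{j i} v_{i}) 𝒆_{j} = \sum_{j = 1}^{m} u_{j} 𝒆_{j} = 𝒖,$$

such that $\nabla_{𝒗} (𝒗^{T} 𝑨 𝒗) = 2 𝑨 𝒗$ and

$$\nabla_{𝒗} r (𝒗) = \frac{2}{𝒗^{T} 𝒗} [𝑨 𝒗 - r (𝒗) 𝒗] .$$

When evaluated at $𝒗 = 𝒒$, obtain $\nabla_{𝒗} r (𝒒) = 𝟎$, implying that near an eigenvector the Rayleigh quotient approximation of an eigenvalue is of quadratic accuracy,

$$r (𝒗) - λ = \mathcal{O} (|| 𝒗 - 𝒒 ||^{2}) .$$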
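The quadratic accuracy of the Rayleigh quotient can be observed by perturbing an eigenvector by a small amount $ε$ and comparing the eigenvalue errors for two values of $ε$. The Julia sketch below assumes a small symmetric test matrix and a perturbation direction chosen only for illustration; the names `rayleigh` and `rayleigh_errors` are hypothetical.

```julia
using LinearAlgebra

# Rayleigh quotient r(v) = v'Av / v'v
rayleigh(A, v) = (v'*A*v) / (v'*v)

# Hypothetical helper: eigenvalue errors |r(q + ε d) - λ| for each ε in εs
function rayleigh_errors(A, q, λ, d, εs)
    return [abs(rayleigh(A, q + ε*d) - λ) for ε in εs]
end

# Assumed symmetric test matrix
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
E = eigen(Symmetric(A))
λ, q = E.values[end], E.vectors[:, end]   # largest eigenvalue and its eigenvector
d0 = [1.0, -2.0, 0.5]                     # assumed perturbation direction
d = normalize(d0 - (q'*d0)*q)             # made orthogonal to q, unit length
errs = rayleigh_errors(A, q, λ, d, [1e-2, 1e-3])
println(errs[1] / errs[2])                # error ratio for a tenfold smaller ε
```

Reducing the eigenvector perturbation by a factor of 10 should reduce the eigenvalue error by a factor of roughly 100.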

2.3.Refining the power iteration idea

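Power iteration furnishes the largest eigenvalue. Further eigenvalues can be found by use of the following properties:

$\circ$ If $(λ, 𝒙)$ is an eigenpair of $𝑨$ , then $(λ - μ, 𝒙)$ is an eigenpair of $𝑨 - μ 𝑰$ ;

$\circ$ If $(λ, 𝒙)$ is an eigenpair of an invertible $𝑨$ , then $(1 / λ, 𝒙)$ is an eigenpair of $𝑨^{- 1}$ .

Hence the eigenvalue of $𝑨$ closest to $μ$ corresponds to the eigenvalue of ${(𝑨 - μ 𝑰)}^{- 1}$ of largest absolute value. If $μ$ is a known initial approximation of the eigenvalue then the inverse power iteration $𝒗_{n} = {(𝑨 - μ 𝑰)}^{- 1} 𝒗_{n - 1}$, actually implemented as successive solution of linear systems

$$(𝑨 - μ 𝑰) 𝒗_{n} = 𝒗_{n - 1},$$

leads to a sequence of Rayleigh quotients $r (𝒗_{n})$ that converges to the eigenvalue closest to $μ$, with an error that is quadratic in the error of the eigenvector approximation. An important refinement of the idea is to change the shift at each iteration, $μ_{n} = r (𝒗_{n - 1})$, which leads to cubic order of convergence.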
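Inverse iteration with a shift updated at every step from the Rayleigh quotient can be sketched as follows in Julia. This is a minimal version under stated assumptions: a symmetric test matrix, starting vector and starting shift chosen only for illustration, a dense solve by `\` at each step, and a residual test that stops the iteration before the shifted matrix becomes numerically singular. The function name `rayleigh_quotient_iteration` is hypothetical.

```julia
using LinearAlgebra

# Inverse iteration with the shift μ replaced by the Rayleigh quotient at each step.
function rayleigh_quotient_iteration(A, v0, μ0; n=10, tol=1e-12)
    v = v0 / norm(v0)
    μ = μ0
    for _ in 1:n
        norm(A*v - μ*v) < tol && break   # converged: avoid a singular solve
        w = (A - μ*I) \ v                # solve (A - μ I) w = v
        v = w / norm(w)
        μ = v'*A*v                       # Rayleigh quotient of a unit vector
    end
    return μ, v
end

# Assumed symmetric test matrix, start vector and initial shift
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
μ, v = rayleigh_quotient_iteration(A, ones(3), 4.5)
println(μ)
println(norm(A*v - μ*v))     # eigenpair residual
```

Only a handful of iterations are typically needed, consistent with the cubic order of convergence; keeping the shift fixed at `μ0` instead recovers plain inverse iteration.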

Lecture 13: Power Iterations
