Lecture 11: LU Algorithm Variants

The basic procedures within LU factorization can be adapted to account for special structure of the system matrix 𝑨 or to obtain properties associated with 𝑨.

1.Determinants

Definition. The determinant of a square matrix 𝑨=( 𝒂1 𝒂m )m×m is a real number

det(A)=| a11 a12 a1m a21 a22 a2m am1 am2 amm |

giving the (oriented) volume of the parallelepiped spanned by matrix column vectors.

1.1.Cross product

2.Structured Matrices

The special structure of a matrix can be exploited to obtain more efficient factorizations. Evaluation of the linear combination 𝑨𝒙=x1𝒂1++xn𝒂n requires mn floating point operations (flops) for 𝑨m×n. Evaluation of p linear combinations 𝑨𝑿, 𝑿n×p requires mnp flops. If it is possible to evaluate 𝑨𝒙 with fewer operations, the matrix is said to be structured. Examples include:

3.Cholesky factorization of positive definite hermitian matrices

3.1.Symmetric matrices, hermitian matrices

Special structure of a matrix is typically associated with underlying symmetries of a particular phenomenon. For example, the law of action and reaction in dynamics (Newton's third law) leads to real symmetric matrices, 𝑨m×m, 𝑨T=𝑨. Consider a system of m point masses with nearest-neighbor interactions on the real line where the interaction force depends on relative position. Assume that the force exerted by particle i+1 on particle i is linear

fi+1,i=f(ui+1-ui)=k(ui+1-ui),

with ui denoting displacement from an equilibrium position. The law of action and reaction then states that

fi,i+1=-fi+1,i=k(ui-ui+1).

If the same force law holds at all positions, then

fi-1,i=k(ui-1-ui).

The force on particle i is given by the sum of forces from neighboring particles i-1,i+1

fi=fi-1,i+fi+1,i=k(ui-1-ui)+k(ui+1-ui)=k(ui+1-2ui+ui-1).

Introducing 𝒇,𝒖m, and assuming u0=um+1=0, the above is stated as

𝒇=𝑲𝒖,

with 𝑲=kdiag([ 1 -2 1 ]) is a symmetric matrix, 𝑲=𝑲T, a direct consequence of the law of action and reaction. The matrix 𝑲 is in this case tridiagonal as a consequence of the assumption of nearest-neighbor interactions. Recall that matrices represent linear mappings, hence

𝑲=[ 𝒇(𝒆1) 𝒇(𝒆2) 𝒇(𝒆m) ],

with 𝒇(𝒖) the force-displacement linear mapping, Fig. 2, obtaining the same symmetric, tri-diagonal matrix.

Figure 2. Image of 𝒆i through mapping representing a linear force is 𝒇(𝒆i)=k[ 1 -2 1 ]T.

This concept can be extended to complex matrices 𝑨m×m through 𝑨=𝑨, in which case 𝑨 is said to be self-adjoint or hermitian. Again, this property is often associated with desired physical properties, such as the requirement of real observable quantitites in quantum mechanics. Diagonal elements of a hermitian matrix must be real, and for any 𝒙,𝒚m, the computation

𝒙𝑨𝒚=(𝒙𝑨𝒚)=𝒚𝑨𝒙=𝒚𝑨𝒙,

implies for 𝒙=𝒚 that

𝒙𝑨𝒙=𝒙𝑨𝒙,

hence 𝒙𝑨𝒙 is real.

3.2.Positive-definite matrices

The work (i.e., energy stored in the system) done by all the forces in the above point mass system is

𝒲=12𝒖T𝑲𝒖,

and physical considerations state that 𝒲0. This leads the following definitions.

Definition. A hermitian matrix 𝑨m×m is positive definite if for any non-zero 𝒙m, 𝒙𝑨𝒙>0.

Definition. A hermitian matrix 𝑨m×m is positive semi-definite if for any non-zero 𝒙m, 𝒙𝑨𝒙0.

If 𝑨 is hermitian positive definite, then so is 𝑿𝑨𝑿 for any 𝑿m×n. Choosing

𝑿=[ 𝒆1 𝒆n ]m×n

gives 𝑨n=𝑿𝑨𝑿, the nth principal submatrix of 𝑨, itself a hermitian positive definite matrix. Choosing 𝑿=𝒆jshows that the jth diagonal element of 𝑨 is positive, ajj=𝒆jT𝑨𝒆j>0

3.3.Symmetric factorization of positive-definite hermitian matrices

The structure of a hermitian positive definite matrix 𝑨m×m, can be preserved by modification of LU-factorization. The resulting algorithm is known as Cholesky factorization, and its first stage is stated as

𝑨=[ a11 𝒘 𝒘 𝑩 ]=[ α 𝟎 𝒘/α 𝑰 ][ 1 𝟎 𝟎 𝑪 ][ α 𝒘/α 𝟎 𝑰 ]=[ α 𝟎 𝒘/α 𝑰 ][ α 𝒘/α 𝟎 𝑪 ]=[ a11 𝒘 𝒘 𝑪+𝒘𝒘/a11 ],

whence 𝑪=𝑩-𝒘𝒘/a11. Repeating the stage-1 step

𝑨=𝑳1𝑨1𝑳1,

leads to

𝑨=𝑳1𝑳2𝑨2𝑳2𝑳1==𝑳𝑳,𝑳=𝑳1𝑳2𝑳m.

The resulting Cholesky algorithm is half as expensive as standard LU-factorization.

Algorithm (Cholesky factorization, A=LL)

𝑳=𝑨

for i=1:m

for j=i+1:m

L[j:m,j]=L[j:m,j]-L[j:m,i]L[j,i]/L[i,i]

L[i:m,i]=L[i:m,i]/L[i,i]