Linear System Solution

Synopsis. The traditional problem of linear algebra is to find the scaling coefficients of a linear combination that exactly represents a given vector. Methods with a long history of hand computation have been developed for this purpose, and they still offer insight into the properties of linear mappings and their associated matrices.

1. Orthogonal projectors and linear systems

Consider the linear system 𝑨𝒙=𝒃 with 𝑨∈ℝ^{m×n}, 𝒃∈ℝ^m given. The scaling coefficients 𝒙∈ℝ^n are sought, and are said to be a solution of the linear system when the equation 𝑨𝒙=𝒃 is satisfied. Orthogonal projectors and knowledge of the four fundamental matrix subspaces allow us to succinctly state whether there exist no solutions, a single solution, or an infinite number of solutions: if 𝒃∉C(𝑨) there is no solution; if 𝒃∈C(𝑨) and N(𝑨)={𝟎} there is a single solution; if 𝒃∈C(𝑨) and N(𝑨)≠{𝟎} there is an infinite number of solutions.

If a solution exists, it can be found from the QR factorization 𝑨=𝑸𝑹 by back-substitution solution of 𝑹𝒙=𝑸^T𝒃. If multiple solutions exist, an orthonormal basis 𝒁 is found for the null space N(𝑨), and the family of solutions is 𝒙+𝒁𝒚, with 𝒙 a particular solution and 𝒚 arbitrary.
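A minimal Julia sketch of this procedure, assuming the standard-library LinearAlgebra package and an illustrative full-rank system (the values of A and b are assumptions):

using LinearAlgebra
A = [1. 2 -1; 2 -1 1; 3 -1 -1]; b = [2., 2, 1]  # illustrative system
F = qr(A)                              # A = QR, Q orthogonal, R upper triangular
x = UpperTriangular(F.R) \ (F.Q'*b)    # back substitution in Rx = Qᵀb
Z = nullspace(A)                       # 3×0 here since N(A) = {0}; otherwise the
                                       # family of solutions is x .+ Z*y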

2. Gaussian elimination and row echelon reduction

Suppose now that 𝑨𝒙=𝒃 admits a unique solution. The QR factorization approach of reducing the problem to 𝑹𝒙=𝑸^T𝒃 is one procedure to compute it, and has the benefit of working with the orthonormal matrix 𝑸. Finding 𝑸 is, however, a significant computational expense. Recall that orthogonality implies linear independence; this suggests approaches that only impose linear independence, without orthogonality. Gaussian elimination is the principal such approach. Consider the system

\begin{cases}
x_1+2x_2-x_3 = 2\\
2x_1-x_2+x_3 = 2\\
3x_1-x_2-x_3 = 1
\end{cases}.

The idea is to combine equations so that successive equations contain one fewer unknown. Ask: by what number should the first equation be multiplied in order to eliminate x_1 from the sum of the first and second equations? This number is called a Gaussian multiplier, and is in this case -2. Repeat the question for eliminating x_1 from the third equation, with multiplier -3.

\begin{cases}
x_1+2x_2-x_3 = 2\\
2x_1-x_2+x_3 = 2\\
3x_1-x_2-x_3 = 1
\end{cases}
\Rightarrow
\begin{cases}
x_1+2x_2-x_3 = 2\\
-5x_2+3x_3 = -2\\
-7x_2+2x_3 = -5
\end{cases}.

Now ask: by what number should the second equation be multiplied to eliminate x_2 from the sum of the second and third equations? The multiplier is in this case -7/5.

\begin{cases}
x_1+2x_2-x_3 = 2\\
-5x_2+3x_3 = -2\\
-7x_2+2x_3 = -5
\end{cases}
\Rightarrow
\begin{cases}
x_1+2x_2-x_3 = 2\\
-5x_2+3x_3 = -2\\
-\tfrac{11}{5}x_3 = -\tfrac{11}{5}
\end{cases}.

Starting from the last equation we find x_3=1; substituting into the second gives -5x_2=-5, hence x_2=1; and substituting into the first equation gives x_1=1.

The above operations only involve the coefficients. A more compact notation is therefore to collect the coefficients into what is known as the bordered matrix, and carry out the same operations on its rows

[\boldsymbol{A}\ \boldsymbol{b}]=\left[\begin{array}{rrrr}
1 & 2 & -1 & 2\\
2 & -1 & 1 & 2\\
3 & -1 & -1 & 1
\end{array}\right]\Rightarrow
[\boldsymbol{A}_1\ \boldsymbol{b}_1]=\left[\begin{array}{rrrr}
1 & 2 & -1 & 2\\
0 & -5 & 3 & -2\\
0 & -7 & 2 & -5
\end{array}\right]\Rightarrow
[\boldsymbol{A}_2\ \boldsymbol{b}_2]=\left[\begin{array}{rrrr}
1 & 2 & -1 & 2\\
0 & -5 & 3 & -2\\
0 & 0 & -\tfrac{11}{5} & -\tfrac{11}{5}
\end{array}\right].

In Julia the above operations would be carried out as

A = [1. 2 -1 2; 2 -1 1 2; 3 -1 -1 1]  # bordered matrix [A b]
A[2,:] = A[2,:] - 2*A[1,:]            # multiplier -2 eliminates x1 from row 2
A[3,:] = A[3,:] - 3*A[1,:]            # multiplier -3 eliminates x1 from row 3
A

[ 1.0   2.0  -1.0   2.0
  0.0  -5.0   3.0  -2.0
  0.0  -7.0   2.0  -5.0 ]    (1)

A[3,:] = A[3,:] - (7/5)*A[2,:]        # multiplier -7/5 eliminates x2 from row 3
A

[ 1.0   2.0  -1.0                 2.0
  0.0  -5.0   3.0                -2.0
  0.0   0.0  -2.1999999999999993 -2.2 ]    (2)

Note the floating-point round-off in the (3,3) entry, whose exact value is -11/5 = -2.2.

Once the above triangular form has been obtained, the solution is found by back substitution: the identity matrix is formed in the first 3 columns, after which the solution appears in the last column.

\left[\begin{array}{rrrr}
1 & 2 & -1 & 2\\
0 & -5 & 3 & -2\\
0 & 0 & -\tfrac{11}{5} & -\tfrac{11}{5}
\end{array}\right]\Rightarrow
\left[\begin{array}{rrrr}
1 & 2 & -1 & 2\\
0 & -5 & 3 & -2\\
0 & 0 & 1 & 1
\end{array}\right]\Rightarrow
\left[\begin{array}{rrrr}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 1\\
0 & 0 & 1 & 1
\end{array}\right].
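Continuing the Julia session above, the same sweep applied to the rows of the bordered matrix reads (the comments show the expected row after each operation):

A[3,:] = A[3,:]/A[3,3]                 # normalize third pivot: [0 0 1 1]
A[2,:] = A[2,:] - A[2,3]*A[3,:]        # eliminate x3 from row 2: [0 -5 0 -5]
A[2,:] = A[2,:]/A[2,2]                 # normalize second pivot: [0 1 0 1]
A[1,:] = A[1,:] - A[1,3]*A[3,:] - A[1,2]*A[2,:]  # eliminate x3, x2: [1 0 0 1]
x = A[:,4]                             # solution in the last column: [1, 1, 1]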

The operations arising in Gaussian elimination are successive linear combinations of rows that maintain the solution of the linear system. This idea is useful in identifying the fundamental subspaces associated with a matrix. The matrices arising at successive stages of the procedure are said to be row equivalent to one another,

\boldsymbol{A}\sim\boldsymbol{A}_1\sim\boldsymbol{A}_2\sim\cdots,

and since 𝑨_k is obtained by linear combinations of the rows of 𝑨_{k-1}, the row space is not changed:

C(\boldsymbol{A}^T)=C(\boldsymbol{A}_1^T)=C(\boldsymbol{A}_2^T)=\cdots.

During the procedure a pivot element is identified in the diagonal position. If a zero value is encountered, rows are permuted to bring a non-zero element to the pivot position. If a non-zero pivot cannot be found by row permutations, one is sought by column permutations as well. If a non-zero pivot cannot be found by either row or column permutations, the matrix is rank-deficient, r=rank(𝑨)<min(m,n), and has a non-trivial null space, as in the following examples

\boldsymbol{A}=\left[\begin{array}{rrr}
1 & 2 & 3\\ 0 & 1 & 1\\ 1 & 2 & 3
\end{array}\right]\in\mathbb{R}^{3\times 3},\quad
\boldsymbol{b}=\left[\begin{array}{r} 3\\ 1\\ 3 \end{array}\right]\in\mathbb{R}^{3},\quad
\boldsymbol{c}=\left[\begin{array}{r} 3\\ 1\\ 4 \end{array}\right]\in\mathbb{R}^{3}.

[\boldsymbol{A}\ \boldsymbol{b}]=\left[\begin{array}{rrrr}
1 & 2 & 3 & 3\\ 0 & 1 & 1 & 1\\ 1 & 2 & 3 & 3
\end{array}\right]\Rightarrow
[\boldsymbol{A}_1\ \boldsymbol{b}_1]=\left[\begin{array}{rrrr}
1 & 2 & 3 & 3\\ 0 & 1 & 1 & 1\\ 0 & 0 & 0 & 0
\end{array}\right]\Rightarrow
\begin{cases}
x_1+2x_2+3x_3 = 3\\ x_2+x_3 = 1\\ 0 = 0
\end{cases},

[\boldsymbol{A}\ \boldsymbol{c}]=\left[\begin{array}{rrrr}
1 & 2 & 3 & 3\\ 0 & 1 & 1 & 1\\ 1 & 2 & 3 & 4
\end{array}\right]\Rightarrow
[\boldsymbol{A}_1\ \boldsymbol{c}_1]=\left[\begin{array}{rrrr}
1 & 2 & 3 & 3\\ 0 & 1 & 1 & 1\\ 0 & 0 & 0 & 1
\end{array}\right]\Rightarrow
\begin{cases}
x_1+2x_2+3x_3 = 3\\ x_2+x_3 = 1\\ 0 = 1
\end{cases}.

The system 𝑨𝒙=𝒃 has an infinite number of solutions, while the system 𝑨𝒙=𝒄 has no solution. Note that 𝑨_1 has a row of zeros, hence the rows of 𝑨 must be linearly dependent and N(𝑨)≠{𝟎}. By the FTLA, when 𝒃∈C(𝑨) an infinite number of solutions is obtained, and for 𝒄∉C(𝑨) no solution is obtained.
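These conclusions can be checked in Julia by comparing the ranks of the bordered matrices with that of 𝑨, using the standard-library rank function:

using LinearAlgebra
A = [1. 2 3; 0 1 1; 1 2 3]; b = [3., 1, 3]; c = [3., 1, 4]
rank(A), rank([A b]), rank([A c])   # (2, 2, 3): b ∈ C(A) but c ∉ C(A)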

The rows with non-zero pivot elements are linearly independent, and reduction to the above row-echelon form is useful to identify the rank of a matrix. The first non-zero entry in a row is called either a pivot or a leading entry. A matrix is said to be brought to reduced row-echelon form when:

  1. all rows consisting entirely of zeros lie below the non-zero rows;

  2. the leading entry of each non-zero row is 1, and lies to the right of the leading entry of the row above it;

  3. each column containing a leading 1 has zeros in all its other entries.

In contrast to the Gram-Schmidt procedure, Gaussian elimination does not impose orthogonality between rows, nor that a row have unit norm. This leads to fewer computations, and is therefore well suited to hand computation on small systems.

The steps in Gaussian elimination can be precisely specified in a format suitable for direct computer coding.

Algorithm Gauss elimination without pivoting

for s = 1 to m-1
    for i = s+1 to m
        t = -a_{is} / a_{ss}
        for j = s+1 to m
            a_{ij} = a_{ij} + t a_{sj}
        b_i = b_i + t b_s
for s = m downto 1
    x_s = b_s / a_{ss}
    for i = 1 to s-1
        b_i = b_i - a_{is} x_s
return x
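A direct Julia transcription of the above might read as follows; the function name gauss_nopivot is an illustrative choice, and a square, full-rank system is assumed:

function gauss_nopivot(A, b)
    A = copy(A); b = copy(b); m = length(b)
    for s = 1:m-1                  # forward elimination
        for i = s+1:m
            t = -A[i,s]/A[s,s]     # Gaussian multiplier for row i
            for j = s+1:m
                A[i,j] += t*A[s,j]
            end
            b[i] += t*b[s]
        end
    end
    x = zeros(m)
    for s = m:-1:1                 # back substitution
        x[s] = b[s]/A[s,s]
        for i = 1:s-1
            b[i] -= A[i,s]*x[s]
        end
    end
    return x
end

gauss_nopivot([1. 2 -1; 2 -1 1; 3 -1 -1], [2., 2, 1])   # returns ≈ [1, 1, 1]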

The variant of the above algorithm that accounts for possible zeros arising in a diagonal position is known as Gauss elimination with pivoting.

Algorithm Gauss elimination with partial pivoting

p = 1:m (initialize row permutation vector)
for s = 1 to m-1
    piv = |a_{p(s),s}|
    for i = s+1 to m
        mag = |a_{p(i),s}|
        if mag > piv then
            piv = mag; k = p(s); p(s) = p(i); p(i) = k
    if piv < ε then stop ("Singular matrix")
    for i = s+1 to m
        t = -a_{p(i),s} / a_{p(s),s}
        for j = s+1 to m
            a_{p(i),j} = a_{p(i),j} + t a_{p(s),j}
        b_{p(i)} = b_{p(i)} + t b_{p(s)}
for s = m downto 1
    x_s = b_{p(s)} / a_{p(s),s}
    for i = 1 to s-1
        b_{p(i)} = b_{p(i)} - a_{p(i),s} x_s
return x
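A Julia sketch of the pivoted variant, tracking the permutation vector p rather than physically swapping rows (the function name gauss_pivot and the tolerance eps are illustrative choices):

function gauss_pivot(A, b; eps=1e-12)
    A = copy(A); b = copy(b); m = length(b)
    p = collect(1:m)                   # row permutation vector
    for s = 1:m-1
        piv = abs(A[p[s],s])           # search column s for the largest pivot
        for i = s+1:m
            mag = abs(A[p[i],s])
            if mag > piv
                piv = mag; p[s], p[i] = p[i], p[s]
            end
        end
        piv < eps && error("Singular matrix")
        for i = s+1:m                  # eliminate below the pivot row
            t = -A[p[i],s]/A[p[s],s]
            for j = s+1:m
                A[p[i],j] += t*A[p[s],j]
            end
            b[p[i]] += t*b[p[s]]
        end
    end
    x = zeros(m)
    for s = m:-1:1                     # back substitution in permuted order
        x[s] = b[p[s]]/A[p[s],s]
        for i = 1:s-1
            b[p[i]] -= A[p[i],s]*x[s]
        end
    end
    return x
end

gauss_pivot([0. 1; 1 1], [1., 2])   # zero pivot handled by permutation: [1, 1]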

3. LU-factorization

The operations arising in Gaussian elimination correspond to a matrix factorization, analogous to how the Gram-Schmidt procedure can be stated as the QR factorization. Revisiting the previous example

\begin{cases}
x_1+2x_2-x_3 = 2\\
2x_1-x_2+x_3 = 2\\
3x_1-x_2-x_3 = 1
\end{cases}
\Leftrightarrow
\boldsymbol{A}\boldsymbol{x}=\boldsymbol{b},\quad
\boldsymbol{A}=\left[\begin{array}{rrr}
1 & 2 & -1\\ 2 & -1 & 1\\ 3 & -1 & -1
\end{array}\right],\quad
\boldsymbol{b}=\left[\begin{array}{r} 2\\ 2\\ 1 \end{array}\right],

the idea is to express linear combinations of rows as a matrix multiplication. Recall that 𝑨𝒙 is a linear combination of the columns of 𝑨, and 𝑨𝑿 expresses multiple such column combinations. Linear combinations of columns are expressed as products in which the first factor contains the columns and the second contains the scaling coefficients. Analogously, linear combinations of rows are expressed by products 𝑳𝑨, where now the left factor contains the scaling coefficients entering into a linear combination of the rows of 𝑨. For example, the first stage of Gaussian elimination for the above system can be expressed as

\boldsymbol{L}_1\boldsymbol{A}=\left[\begin{array}{rrr}
1 & 0 & 0\\ -2 & 1 & 0\\ -3 & 0 & 1
\end{array}\right]\left[\begin{array}{rrr}
1 & 2 & -1\\ 2 & -1 & 1\\ 3 & -1 & -1
\end{array}\right]=\left[\begin{array}{rrr}
1 & 2 & -1\\ 0 & -5 & 3\\ 0 & -7 & 2
\end{array}\right].

The next stage is also expressed as a matrix multiplication, after which an upper triangular matrix 𝑼 is obtained

\boldsymbol{L}_2\boldsymbol{L}_1\boldsymbol{A}=\left[\begin{array}{rrr}
1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -7/5 & 1
\end{array}\right]\left[\begin{array}{rrr}
1 & 2 & -1\\ 0 & -5 & 3\\ 0 & -7 & 2
\end{array}\right]=\left[\begin{array}{rrr}
1 & 2 & -1\\ 0 & -5 & 3\\ 0 & 0 & -11/5
\end{array}\right]=\boldsymbol{U}.
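The two stages can be checked numerically in Julia:

A = [1. 2 -1; 2 -1 1; 3 -1 -1]
L1 = [1. 0 0; -2 1 0; -3 0 1]   # multipliers -2, -3 placed in column 1
L2 = [1. 0 0; 0 1 0; 0 -7/5 1]  # multiplier -7/5 placed in column 2
U = L2*L1*A                     # upper triangular, U[3,3] = -11/5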

For a general matrix 𝑨∈ℝ^{m×m} the sequence of operations is

\boldsymbol{L}_{m-1}\cdots\boldsymbol{L}_2\boldsymbol{L}_1\boldsymbol{A}=\boldsymbol{U}.

Definition. The matrix

\boldsymbol{L}_k=\begin{pmatrix}
1 & & & & &\\
& \ddots & & & &\\
& & 1 & & &\\
& & -l_{k+1,k} & 1 & &\\
& & \vdots & & \ddots &\\
& & -l_{m,k} & & & 1
\end{pmatrix},

with l_{i,k}=a_{i,k}^{(k)}/a_{k,k}^{(k)}, and 𝑨^{(k)}=(a_{i,j}^{(k)}) the matrix obtained after step k of row echelon reduction (or, equivalently, Gaussian elimination), is called a Gaussian multiplier matrix.

The inverse of a Gaussian multiplier matrix is

\boldsymbol{L}_k^{-1}=\begin{pmatrix}
1 & & & & &\\
& \ddots & & & &\\
& & 1 & & &\\
& & l_{k+1,k} & 1 & &\\
& & \vdots & & \ddots &\\
& & l_{m,k} & & & 1
\end{pmatrix}=\boldsymbol{I}-(\boldsymbol{L}_k-\boldsymbol{I}).

From (\boldsymbol{L}_{m-1}\boldsymbol{L}_{m-2}\cdots\boldsymbol{L}_2\boldsymbol{L}_1)\boldsymbol{A}=\boldsymbol{U} obtain

\boldsymbol{A}=(\boldsymbol{L}_{m-1}\boldsymbol{L}_{m-2}\cdots\boldsymbol{L}_2\boldsymbol{L}_1)^{-1}\boldsymbol{U}
=\boldsymbol{L}_1^{-1}\boldsymbol{L}_2^{-1}\cdots\boldsymbol{L}_{m-1}^{-1}\boldsymbol{U}
=\boldsymbol{L}\boldsymbol{U}.
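Continuing the Julia check above, the factor 𝑳 collects the multipliers with flipped signs, and the product recovers 𝑨:

using LinearAlgebra
L = inv(L1)*inv(L2)   # equals [1 0 0; 2 1 0; 3 7/5 1]
L*U ≈ A               # true: A = LU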

The above is known as an LU factorization, short for lower-upper factorization. Solving a linear system by LU-factorization consists of the steps:

  1. Find the factorization LU=A

  2. Insert the factorization into Ax=b to obtain (LU)x=L(Ux)=Ly=b, where the notation y=Ux has been introduced. The system

    Ly=b

    is easy to solve by forward substitution to find y for given b

  3. Finally find x by backward substitution solution of

    Ux=y,

    as illustrated in the Julia sketch below.
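These steps map directly onto Julia's built-in lu factorization, which also applies row pivoting recorded in the permutation F.p; a sketch for the example system:

using LinearAlgebra
A = [1. 2 -1; 2 -1 1; 3 -1 -1]; b = [2., 2, 1]
F = lu(A)                           # step 1: PA = LU
y = LowerTriangular(F.L) \ b[F.p]   # step 2: forward substitution, Ly = Pb
x = UpperTriangular(F.U) \ y        # step 3: back substitution, Ux = y

Equivalently, F\b (or simply A\b) carries out both triangular solves in one call.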

The various procedures encountered so far to solve a linear system are described in the following table.

| Given 𝑨∈ℝ^{m×n} | Singular value decomposition | Gram-Schmidt | Lower-upper |
| 𝑨𝒙=𝒃 | 𝑼𝚺𝑽^T=𝑨 | 𝑸𝑹=𝑨 | 𝑳𝑼=𝑨 |
| Transformation of coordinates | (𝑼𝚺𝑽^T)𝒙=𝒃 ⇒ 𝑼𝒚=𝒃 ⇒ 𝒚=𝑼^T𝒃 | (𝑸𝑹)𝒙=𝒃 ⇒ 𝑸𝒚=𝒃, 𝒚=𝑸^T𝒃 | (𝑳𝑼)𝒙=𝒃 ⇒ 𝑳𝒚=𝒃 (forward sub to find 𝒚) |
| | 𝚺𝒛=𝒚 ⇒ 𝒛=𝚺^+𝒚 | 𝑹𝒙=𝒚 (back sub to find 𝒙) | 𝑼𝒙=𝒚 (back sub to find 𝒙) |
| | 𝑽^T𝒙=𝒛 ⇒ 𝒙=𝑽𝒛 | | |
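The three routes of the table can be compared in Julia on the example system; all agree to round-off:

using LinearAlgebra
A = [1. 2 -1; 2 -1 1; 3 -1 -1]; b = [2., 2, 1]
S = svd(A); x_svd = S.V*((S.U'*b) ./ S.S)             # z = Σ⁺y, then x = Vz
Q = qr(A);  x_qr  = UpperTriangular(Q.R) \ (Q.Q'*b)   # back sub in Rx = Qᵀb
F = lu(A);  x_lu  = UpperTriangular(F.U) \
                    (LowerTriangular(F.L) \ b[F.p])   # forward, then back sub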

4. Matrix inverse

For 𝑨∈ℝ^{m×n} the pseudo-inverse 𝑨^+ has been introduced based on the SVD 𝑨=𝑼𝚺𝑽^T as

\boldsymbol{A}^+=\boldsymbol{V}\boldsymbol{\Sigma}^+\boldsymbol{U}^T.

When 𝑨∈ℝ^{m×m} is square and of full rank, the system 𝑨𝒙=𝒃 has a solution that can be stated as 𝒙=𝑨^{-1}𝒃, where 𝑨^{-1} is the inverse of 𝑨. The matrix 𝑨 is said to be invertible if there exists 𝑿∈ℝ^{m×m} such that

𝑨𝑿=𝑿𝑨=𝑰,

and in this case 𝑿=𝑨-1 is the inverse of 𝑨.

The inverse can be computed by extending Gauss elimination

[ 𝑨 | 𝑰 ] ⇒ [ 𝑰 | 𝑿 ],

a procedure known as the Gauss-Jordan algorithm.
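A minimal Julia sketch of this reduction of the bordered matrix [𝑨 | 𝑰], without pivoting (so non-zero pivots are assumed to arise in natural order; inv(A) is the library route):

using LinearAlgebra
A = [1. 2 -1; 2 -1 1; 3 -1 -1]
B = [A Matrix{Float64}(I, 3, 3)]        # bordered matrix [A | I]
for s = 1:3
    B[s,:] = B[s,:]/B[s,s]              # normalize pivot row
    for i = 1:3
        i == s && continue
        B[i,:] = B[i,:] - B[i,s]*B[s,:] # zero out column s in all other rows
    end
end
X = B[:,4:6]                            # X ≈ inv(A)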

A square matrix has an inverse only when it is of full rank. The following are equivalent statements:

  1. 𝑨 invertible

  2. 𝑨𝒙=𝒃 has a unique solution for all 𝒃∈ℝ^m

  3. 𝑨𝒙=𝟎 has the unique solution 𝒙=𝟎

  4. The reduced row echelon form of 𝑨 is 𝑰

  5. 𝑨 can be written as product of elementary matrices