Least squares, Gram-Schmidt solved problems

The examples below are of complexity similar to the problems in Homework 7. Please note that even in these small dimensional cases the arithmetic gets tiresome, hiding the simplicity of the solution by projection:

min𝒙n||𝒃-𝑨𝒙||2,𝑸𝑹=𝑨,𝑹𝒙=𝑸T𝒃.

The intent of Homework 7 is to:

  1. Gain familiarity with the Gram-Schmidt algorithm;

  2. Understand the steps in solving the least squares problem by projection;

  3. Understand the necessity of working with symbolic representations of vectors and matrices, leaving the arithmetic to software systems (Matlab/Octave).

Theory primer.
The least squares problem is stated as

min𝒙n||𝒃-𝑨𝒙||2

and seeks to find the scaling coefficients 𝒙n required to most closely approximate 𝒃m by linear combination of the columns of 𝑨=[ 𝒂1 𝒂2 𝒂n ]m×n.

The solution is found by orthogonal projection of 𝒃 onto the column space of 𝑨. It is efficient to define an orthonormal basis for the column space of 𝑨 through the column vectors of 𝑸

𝑸=[ 𝒒1 𝒒2 𝒒r ],𝑸𝑹=𝑨.

computed using the Gram-Schmidt algorithm (see below). Read 𝑸𝑹=𝑨 to state: “linear combinations of the columns of 𝑸 can be formed to obtain the column vectors of 𝑨”. Therefore projectiing onto the column space of 𝑨 is the same as projecting onto the column space of 𝑸 and the projection matrix is

𝑷=𝑸𝑸T

where 𝑸m×r is a matrix with r mutually orthogonal columns of unit norm, and r=rank(𝑨). The matrix 𝑸=[ 𝒒1 𝒒2 𝒒r ] is (see below).

The projection of 𝒃 onto C(𝑨) is a vector 𝒗 that (since it's in the column space of 𝑨) can also be expressed as 𝒗=𝑨𝒙

𝒗=𝑷𝒃=𝑨𝒙.

This leads to

𝑸𝑸T𝒃=𝑸𝑹𝒙𝑹𝒙=𝑸T𝒃=𝒚

Algorithm 1

Given n vectors 𝒂1,,𝒂n

Initialize 𝒒1=𝒂1,..,𝒒n=𝒂n, 𝑹=𝑰n

for i=1 to n

rii=(𝒒iT𝒒i)1/2; if |rii|<ε skip to next i

𝒒i=𝒒i/rii

for j=i+1 to n

rij=𝒒iT𝒂j; 𝒒j=𝒒j-rij𝒒i

end

end

return 𝑸,𝑹

Note that if 𝒃C(𝑨) (𝒃 is in the column space of 𝑨) then 𝒃=𝑸𝒚 and

𝑷𝒃=𝑸𝑸T𝑸𝒚=𝑸𝒚=𝒃,

stating that the projection of 𝒃 is 𝒃 itself. In this case the least squares problem has a solution such that 𝒃=𝑨𝒙 exactly.

  1. Matrix 𝑨 with linearly independent columns, 𝒃 not in the column space

    𝑨=[ 𝒂1 𝒂2 ]=[ 1 -1 0 0 1 1 ],𝒃=[ 1 1 1 ].

    Solution. Note that the column vectors are orthogonal

    𝒂1T𝒂2=[ 1 0 1 ][ -1 0 1 ]=0.

    Remember to first look at a problem before blindly carrying out calculations. In this case 𝑸 is found by scaling each column vector by its norm

    𝑸=[ 𝒒1 𝒒2 ]=[ 𝒂1||𝒂1|| 𝒂2||𝒂2|| ]=12[ 1 -1 0 0 1 1 ],

    and the matrix 𝑹 is a diagonal matrix containing the norms

    𝑹=[ 2 0 0 2 ].

    Verify:

    𝑸𝑹=12[ 1 -1 0 0 1 1 ][ 2 0 0 2 ]=[ 1 -1 0 0 1 1 ][ 1 0 0 1 ]=[ 1 -1 0 0 1 1 ]=𝑨

    Now that the QR factorization has been found, here are the steps to solve the least squares problem.

    1. Find the vector 𝒚=𝑸T𝒃

      𝒚=12[ 1 0 1 -1 0 1 ][ 1 1 1 ]=12[ 2 0 ]
    2. Solve the system 𝑹𝒙=𝒚

      𝑹𝒙=[ 2 0 0 2 ]𝒙=12[ 2 0 ]=𝒚

      Find

      𝒙=[ 1 0 ],𝒗=𝑨𝒙=[ 1 -1 0 0 1 1 ][ 1 0 ]=[ 1 0 1 ]

    Computer solution (useful as verification). The above calculations can also be carried out in Matlab/Octave

    >> 
    A=[1 -1; 0 0; 1 1]; [Q,R]=qr(A,0)
    >> 
    Q

    ( -0.7071 0.7071 0.0 0.0 -0.7071 -0.7071 )

    >> 
    R

    ( -1.4142 -3.3307e-16 0.0 -1.4142 )

    >> 

    Note that Matlab/Octave returned orthogonal vectors in the opposite orientation, highlighting that there are multiple ways to orthonormalize the columns of 𝑨.

    In Matlab/Octave the least squares solution is returned by the backslash operator

    >> 
    b=[1; 1; 1]; x=A\b; disp(x)

    1

    0

    >> 

  2. Matrix 𝑨 with linearly independent columns, 𝒃 in the column space

    𝑨=[ 𝒂1 𝒂2 ]=[ 1 -1 1 0 1 1 ],𝒃=[ 2 2 2 ].

    Solution. Note that 𝒃 is a simple scaling of the first column of 𝑨

    𝒃=2𝒂1=[ 𝒂1 𝒂2 ][ 2 0 ]=𝑨𝒙𝒙=[ 2 0 ].

    This example again highlights the need to first look at a problem before blindly carrying out calculations.

  3. Matrix 𝑨 with linearly independent columns, 𝒃 not in the column space

    𝑨=[ 𝒂1 𝒂2 ]=[ 1 -1 1 2 1 1 ],𝒃=[ 1 2 1 ].

    Solution. In this case the Gram-Schmidt algorithm is applied to find the 𝑸𝑹 factorization of 𝑨. Here are the GS steps:

    GS1. Find 𝒒1

    r11=||𝒂1||=3,𝒒1=𝒂1r11=13[ 1 1 1 ]

    GS2. Find 𝒒2. This is carried out in two sub-steps:

    1. Subtract from 𝒂2 its component along the previously determined direction 𝒒1

      𝒘=𝒂2-(𝒒1T𝒂2)𝒒1=𝒂2-r12𝒒1
      r12=13[ 1 1 1 ][ -1 2 1 ]=23
      𝒘=𝒂2-r12𝒒1=[ -1 2 1 ]-23[ 1 1 1 ]=13[ -5 4 1 ].

      This vector is orthogonal to 𝒒1. Verify

      𝒒1T𝒘=𝒒1T(𝒂2-r12𝒒1)=r12-r12𝒒1T𝒒1=r12-r121=0
      13[ 1 1 1 ]13[ -5 4 1 ]=133(-5+4+1)=0
    2. Divide the resulting vector by its norm

      r22=||𝒘||=423,𝒒2=𝒘r22=142[ -5 4 1 ]

    Verify:

    𝑸𝑹=[ 1/3 -5/42 1/3 4/42 1/3 1/42 ][ 3 2/3 0 42/3 ]=[ 1 -1 1 2 1 1 ]=𝑨

    Here are the steps to solve the least squares problem.

    1. Find the vector 𝒚=𝑸T𝒃

      𝒚=[ 1/3 1/3 1/3 -5/42 4/42 1/42 ][ 1 2 1 ]=[ 4/3 4/42 ]
    2. Solve the system 𝑹𝒙=𝒚

      𝑹𝒙=[ 3 2/3 0 42/3 ]𝒙=[ 4/3 4/42 ]=𝒚

      Find

      𝒙=[ 24/21 6/21 ],𝒗=𝑨𝒙=[ 1 -1 1 2 1 1 ][ 24/21 6/21 ]=[ 6/7 12/7 10/7 ]

    Computer solution (useful as verification). The above calculations can also be carried out in Matlab/Octave

    >> 
    A=[1 -1; 1 2; 1 1]; [Q,R]=qr(A,0)
    >> 
    Q

    ( -0.5774 0.7715 -0.5774 -0.6172 -0.5774 -0.1543 )

    >> 
    R

    ( -1.7321 -1.1547 0.0 -2.1602 )

    In Matlab/Octave the least squares solution is returned by the backslash operator

    >> 
    b=[1; 2; 1]; x=A\b; disp(x)

    1.1429

    0.2857

    >> 
    [24/21; 6/21]

    ( 1.1429 0.2857 )

    >>