Least squares, Gram-Schmidt solved problems

The examples below are of complexity similar to the problems in Homework 7. Please note that even in these small dimensional cases the arithmetic gets tiresome, hiding the simplicity of the solution by projection:

minπ’™βˆˆβ„n||𝒃-⁑𝑨⁑𝒙||2,𝑸⁑𝑹=𝑨,𝑹⁑𝒙=𝑸T𝒃.

The intent of Homework 7 is to:

  1. Gain familiarity with the Gram-Schmidt algorithm;

  2. Understand the steps in solving the least squares problem by projection;

  3. Understand the necessity of working with symbolic representations of vectors and matrices, leaving the arithmetic to software systems (Matlab/Octave).

Theory primer.
The least squares problem is stated as

minπ’™βˆˆβ„n||𝒃-⁑𝑨⁑𝒙||2

and seeks to find the scaling coefficients π’™βˆˆβ„n required to most closely approximate π’ƒβˆˆβ„m by linear combination of the columns of 𝑨=[ 𝒂1 𝒂2 … 𝒂n ]βˆˆβ„mΓ—n.

The solution is found by orthogonal projection of 𝒃 onto the column space of 𝑨. It is efficient to define an orthonormal basis for the column space of 𝑨 through the column vectors of 𝑸

𝑸=[ 𝒒1 𝒒2 … 𝒒r ],⁑𝑸⁑𝑹=𝑨.

computed using the Gram-Schmidt algorithm (see below). Read 𝑸⁑𝑹=𝑨 to state: β€œlinear combinations of the columns of 𝑸 can be formed to obtain the column vectors of 𝑨”. Therefore projectiing onto the column space of 𝑨 is the same as projecting onto the column space of 𝑸 and the projection matrix is

𝑷=𝑸⁑𝑸T

where π‘Έβˆˆβ„mΓ—r is a matrix with r mutually orthogonal columns of unit norm, and r=rank(𝑨). The matrix 𝑸=[ 𝒒1 𝒒2 … 𝒒r ] is (see below).

The projection of 𝒃 onto C(𝑨) is a vector 𝒗 that (since it's in the column space of 𝑨) can also be expressed as 𝒗=𝑨⁑𝒙

𝒗=𝑷⁑⁑𝒃=𝑨⁑𝒙.

This leads to

𝑸⁑𝑸T⁑𝒃=𝑸⁑𝑹⁑𝒙⇒𝑹⁑𝒙=𝑸T⁑𝒃=π’š

Algorithm 1

Given n vectors 𝒂1,…,𝒂n

Initialize 𝒒1=𝒂1,..,𝒒n=𝒂n, 𝑹=𝑰n

for i=1 to n

ri⁑i=(𝒒iT⁑𝒒i)1/2; if |ri⁑i|<Ξ΅ skip to next i

𝒒i=𝒒i/ri⁑i

for j=i+1 to n

ri⁑j=𝒒iT⁑𝒂j; 𝒒j=𝒒j-ri⁑j𝒒i

end

end

return 𝑸,𝑹

Note that if π’ƒβˆˆC(𝑨) (𝒃 is in the column space of 𝑨) then 𝒃=π‘Έβ‘π’š and

⁑𝑷⁑𝒃=𝑸⁑𝑸Tβ‘π‘Έβ‘π’š=π‘Έβ‘π’š=𝒃,

stating that the projection of 𝒃 is 𝒃 itself. In this case the least squares problem has a solution such that 𝒃=𝑨⁑𝒙 exactly.

  1. Matrix 𝑨 with linearly independent columns, 𝒃 not in the column space

    𝑨=[ 𝒂1 𝒂2 ]=[ 1 -1 0 0 1 1 ],𝒃=[ 1 1 1 ].

    Solution. Note that the column vectors are orthogonal

    𝒂1T⁑𝒂2=[ 1 0 1 ][ -1 0 1 ]=0.

    Remember to first look at a problem before blindly carrying out calculations. In this case 𝑸 is found by scaling each column vector by its norm

    𝑸=[ 𝒒1 𝒒2 ]=[ 𝒂1||𝒂1|| 𝒂2||𝒂2|| ]=12[ 1 -1 0 0 1 1 ],

    and the matrix 𝑹 is a diagonal matrix containing the norms

    𝑹=[ 2 0 0 2 ].

    Verify:

    𝑸⁑𝑹=12[ 1 -1 0 0 1 1 ][ 2 0 0 2 ]=[ 1 -1 0 0 1 1 ][ 1 0 0 1 ]=[ 1 -1 0 0 1 1 ]=𝑨

    Now that the Q⁑R factorization has been found, here are the steps to solve the least squares problem.

    1. Find the vector π’š=𝑸T⁑𝒃

      π’š=12[ 1 0 1 -1 0 1 ][ 1 1 1 ]=12[ 2 0 ]
    2. Solve the system 𝑹⁑𝒙=π’š

      𝑹⁑𝒙=[ 2 0 0 2 ]𝒙=12[ 2 0 ]=π’š

      Find

      𝒙=[ 1 0 ],𝒗=𝑨⁑𝒙=[ 1 -1 0 0 1 1 ][ 1 0 ]=[ 1 0 1 ]

    Computer solution (useful as verification). The above calculations can also be carried out in Matlab/Octave

    >> 
    A=[1 -1; 0 0; 1 1]; [Q,R]=qr(A,0)
    >> 
    Q

    ( -0.7071 0.7071 0.0 0.0 -0.7071 -0.7071 )

    >> 
    R

    ( -1.4142 -3.3307e-16 0.0 -1.4142 )

    >> 

    Note that Matlab/Octave returned orthogonal vectors in the opposite orientation, highlighting that there are multiple ways to orthonormalize the columns of 𝑨.

    In Matlab/Octave the least squares solution is returned by the backslash operator

    >> 
    b=[1; 1; 1]; x=A\b; disp(x)

    1

    0

    >> 

  2. Matrix 𝑨 with linearly independent columns, 𝒃 in the column space

    𝑨=[ 𝒂1 𝒂2 ]=[ 1 -1 1 0 1 1 ],𝒃=[ 2 2 2 ].

    Solution. Note that 𝒃 is a simple scaling of the first column of 𝑨

    𝒃=2𝒂1=[ 𝒂1 𝒂2 ][ 2 0 ]=𝑨⁑𝒙⇒𝒙=[ 2 0 ].

    This example again highlights the need to first look at a problem before blindly carrying out calculations.

  3. Matrix 𝑨 with linearly independent columns, 𝒃 not in the column space

    𝑨=[ 𝒂1 𝒂2 ]=[ 1 -1 1 2 1 1 ],𝒃=[ 1 2 1 ].

    Solution. In this case the Gram-Schmidt algorithm is applied to find the 𝑸⁑𝑹 factorization of 𝑨. Here are the GS steps:

    GS1. Find 𝒒1

    r11=||𝒂1||=3,⁑𝒒1=𝒂1r11=13[ 1 1 1 ]

    GS2. Find 𝒒2. This is carried out in two sub-steps:

    1. Subtract from 𝒂2 its component along the previously determined direction 𝒒1

      π’˜=𝒂2-(𝒒1T⁑𝒂2)⁑𝒒1=𝒂2-r12⁑𝒒1
      r12=13[ 1 1 1 ][ -1 2 1 ]=23
      π’˜=𝒂2-r12⁑𝒒1=[ -1 2 1 ]-23[ 1 1 1 ]=13[ -5 4 1 ].

      This vector is orthogonal to 𝒒1. Verify

      𝒒1Tβ‘π’˜=𝒒1T(𝒂2-r12⁑𝒒1)=r12-r12⁑𝒒1T⁑𝒒1=r12-r12β‹…1=0
      13[ 1 1 1 ]13[ -5 4 1 ]=133(-5+4+1)=0
    2. Divide the resulting vector by its norm

      r22=||π’˜||=423,𝒒2=π’˜r22=142[ -5 4 1 ]

    Verify:

    𝑸⁑𝑹=[ 1/3 -5/42 1/3 4/42 1/3 1/42 ][ 3 2/3 0 42/3 ]=[ 1 -1 1 2 1 1 ]=𝑨

    Here are the steps to solve the least squares problem.

    1. Find the vector π’š=𝑸T⁑𝒃

      π’š=[ 1/3 1/3 1/3 -5/42 4/42 1/42 ][ 1 2 1 ]=[ 4/3 4/42 ]
    2. Solve the system 𝑹⁑𝒙=π’š

      𝑹⁑𝒙=[ 3 2/3 0 42/3 ]𝒙=[ 4/3 4/42 ]=π’š

      Find

      𝒙=[ 24/21 6/21 ],𝒗=𝑨⁑𝒙=[ 1 -1 1 2 1 1 ][ 24/21 6/21 ]=[ 6/7 12/7 10/7 ]

    Computer solution (useful as verification). The above calculations can also be carried out in Matlab/Octave

    >> 
    A=[1 -1; 1 2; 1 1]; [Q,R]=qr(A,0)
    >> 
    Q

    ( -0.5774 0.7715 -0.5774 -0.6172 -0.5774 -0.1543 )

    >> 
    R

    ( -1.7321 -1.1547 0.0 -2.1602 )

    In Matlab/Octave the least squares solution is returned by the backslash operator

    >> 
    b=[1; 2; 1]; x=A\b; disp(x)

    1.1429

    0.2857

    >> 
    [24/21; 6/21]

    ( 1.1429 0.2857 )

    >>