Least squares, Gram-Schmidt solved problems

The examples below are similar in complexity to the problems in Homework 7. Note that even in these low-dimensional cases the arithmetic becomes tedious and obscures the simplicity of the solution by projection:

{min}_{𝒙 \in ℝ^{n}} {|| 𝒃 - 𝑨 𝒙 ||}_{2}, 𝑸 𝑹 = 𝑨, 𝑹 𝒙 = 𝑸^{T} 𝒃 .

The intent of Homework 7 is to:

Gain familiarity with the Gram-Schmidt algorithm;
Understand the steps in solving the least squares problem by projection;
Understand the necessity of working with symbolic representations of vectors and matrices, leaving the arithmetic to software systems (Matlab/Octave).

Theory primer.

The least squares problem is stated as

{min}_{𝒙 \in ℝ^{n}} {|| 𝒃 - 𝑨 𝒙 ||}_{2}

and seeks to find the scaling coefficients $𝒙 \in ℝ^{n}$ required to most closely approximate $𝒃 \in ℝ^{m}$ by linear combination of the columns of $𝑨 = [\begin{array}{llll} 𝒂_{1} & 𝒂_{2} & \dots & 𝒂_{n} \end{array}] \in ℝ^{m \times n}$ .

The solution is found by orthogonal projection of $𝒃$ onto the column space of $𝑨$ . It is efficient to define an orthonormal basis for the column space of $𝑨$ through the column vectors of $𝑸$

𝑸 = [\begin{array}{llll} 𝒒_{1} & 𝒒_{2} & \dots & 𝒒_{r} \end{array}], 𝑸 𝑹 = 𝑨,

computed using the Gram-Schmidt algorithm (see below). Read $𝑸 𝑹 = 𝑨$ to state: “linear combinations of the columns of $𝑸$ can be formed to obtain the column vectors of $𝑨$ ”. Therefore projecting onto the column space of $𝑨$ is the same as projecting onto the column space of $𝑸$ , and the projection matrix is

𝑷 = 𝑸 𝑸^{T}

where $𝑸 \in ℝ^{m \times r}$ is a matrix with $r$ mutually orthogonal columns of unit norm, and $r = rank (𝑨)$ . The matrix $𝑸 = [\begin{array}{llll} 𝒒_{1} & 𝒒_{2} & \dots & 𝒒_{r} \end{array}]$ is computed by the Gram-Schmidt algorithm (see below).

The projection of $𝒃$ onto the column space $C (𝑨)$ is a vector $𝒗$ that, since it lies in the column space of $𝑨$ , can also be expressed as a linear combination $𝑨 𝒙$ of the columns of $𝑨$ :

𝒗 = 𝑷 𝒃 = 𝑨 𝒙 .

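Substituting $𝑷 = 𝑸 𝑸^{T}$ and $𝑨 = 𝑸 𝑹$ , then multiplying on the left by $𝑸^{T}$ and using $𝑸^{T} 𝑸 = 𝑰$ , leads to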

𝑸 𝑸^{T} 𝒃 = 𝑸 𝑹 𝒙 \Rightarrow 𝑹 𝒙 = 𝑸^{T} 𝒃 = 𝒚
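
The relations $𝑸 𝑹 = 𝑨$ and $𝑹 𝒙 = 𝑸^{T} 𝒃$ can be sketched in a few lines of Matlab/Octave. The data below (a straight-line fit through four points) is an illustrative assumption and is not taken from the homework; `qr(A,0)` is the economy-size factorization also used in the computer solutions further down.

```octave
% Least squares by projection: Q R = A, then R x = Q' b.
% Illustrative data (assumed): fit b ~ x1 + x2*t at t = 0, 1, 2, 3.
A = [1 0; 1 1; 1 2; 1 3];
b = [1; 2; 2; 4];
[Q, R] = qr(A, 0);     % economy-size QR: Q is 4x2, R is 2x2 upper triangular
y = Q' * b;            % coordinates of the projection of b in the basis Q
x = R \ y;             % solve the triangular system R x = y
v = A * x;             % projection of b onto the column space of A
disp(x)
```

For this data the result should agree with `A\b` up to rounding, which is a convenient cross-check when working the homework problems by hand.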

Algorithm 1

Given $n$ vectors $𝒂_{1}, \dots, 𝒂_{n}$

Initialize $𝒒_{1} = 𝒂_{1}, \dots, 𝒒_{n} = 𝒂_{n}$ , $𝑹 = 𝑰_{n}$

for $i = 1$ to $n$

  $r_{i i} = {(𝒒_{i}^{T} 𝒒_{i})}^{1 / 2}$ ; if $| r_{i i} | < ε$ skip to next $i$

  $𝒒_{i} = 𝒒_{i} / r_{i i}$

  for $j = i + 1$ to $n$

    $r_{i j} = 𝒒_{i}^{T} 𝒂_{j}$ ; $𝒒_{j} = 𝒒_{j} - r_{i j} 𝒒_{i}$

  end

end

return $𝑸, 𝑹$
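
A minimal Matlab/Octave sketch of Algorithm 1 is given below, written as a plain script so that it runs unchanged in either system. The test matrix and the threshold `tol` (standing in for $ε$) are assumptions chosen only for illustration.

```octave
% Gram-Schmidt following Algorithm 1 (variable names are illustrative).
A = [1 1 0; 1 0 1; 0 1 1];   % assumed example with independent columns
n = size(A, 2);
tol = 1e-12;                 % assumed threshold playing the role of epsilon
Q = A;                       % initialize q_j = a_j
R = eye(n);                  % initialize R = I_n
for i = 1:n
  R(i,i) = sqrt(Q(:,i)' * Q(:,i));
  if abs(R(i,i)) < tol
    continue                 % skip a (numerically) zero vector
  end
  Q(:,i) = Q(:,i) / R(i,i);  % normalize q_i
  for j = i+1:n
    R(i,j) = Q(:,i)' * A(:,j);            % component of a_j along q_i
    Q(:,j) = Q(:,j) - R(i,j) * Q(:,i);    % remove that component from q_j
  end
end
disp(norm(Q*R - A))          % reconstruction error
disp(norm(Q'*Q - eye(n)))    % departure from orthonormality
```

Both displayed norms should be at rounding level for a well-conditioned matrix such as this one.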

Note that if $𝒃 \in C (𝑨)$ ( $𝒃$ is in the column space of $𝑨$ ) then $𝒃 = 𝑸 𝒚$ and

𝑷 𝒃 = 𝑸 𝑸^{T} 𝑸 𝒚 = 𝑸 𝒚 = 𝒃,

stating that the projection of $𝒃$ is $𝒃$ itself. In this case the least squares problem has a solution such that $𝒃 = 𝑨 𝒙$ exactly.
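
The statement $𝑷 𝒃 = 𝒃$ for $𝒃 \in C (𝑨)$ can be checked numerically. In the sketch below the matrix and the coefficients used to build `b` are assumptions; `b` is constructed as a combination of the columns of `A`, so it lies in the column space by design.

```octave
% Projection leaves vectors of C(A) unchanged (illustrative data, assumed).
A = [1 0; 1 1; 1 2];
b = A * [3; -1];        % b lies in C(A) by construction
[Q, ~] = qr(A, 0);      % orthonormal basis for C(A)
P = Q * Q';             % projector onto C(A)
disp(norm(P*b - b))     % zero up to rounding
disp(norm(P*P - P))     % projecting twice changes nothing
```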

Matrix $𝑨$ with linearly independent columns, $𝒃$ not in the column space
$𝑨 = [\begin{array}{ll} 𝒂_{1} & 𝒂_{2} \end{array}] = [\begin{array}{ll} 1 & - 1 \\ 0 & 0 \\ 1 & 1 \end{array}], 𝒃 = [\begin{array}{l} 1 \\ 1 \\ 1 \end{array}] .$
Solution. Note that the column vectors are orthogonal
$𝒂_{1}^{T} 𝒂_{2} = [\begin{array}{lll} 1 & 0 & 1 \end{array}] [\begin{array}{l} - 1 \\ 0 \\ 1 \end{array}] = 0 .$
Remember to first look at a problem before blindly carrying out calculations. In this case $𝑸$ is found by scaling each column vector by its norm
$𝑸 = [\begin{array}{ll} 𝒒_{1} & 𝒒_{2} \end{array}] = [\begin{array}{ll} \frac{𝒂_{1}}{|| 𝒂_{1} ||} & \frac{𝒂_{2}}{|| 𝒂_{2} ||} \end{array}] = \frac{1}{\sqrt{2}} [\begin{array}{ll} 1 & - 1 \\ 0 & 0 \\ 1 & 1 \end{array}],$
and the matrix $𝑹$ is a diagonal matrix containing the norms
$𝑹 = [\begin{array}{ll} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{array}] .$
Verify:
$𝑸 𝑹 = \frac{1}{\sqrt{2}} [\begin{array}{ll} 1 & - 1 \\ 0 & 0 \\ 1 & 1 \end{array}] [\begin{array}{ll} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{array}] = [\begin{array}{ll} 1 & - 1 \\ 0 & 0 \\ 1 & 1 \end{array}] [\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}] = [\begin{array}{ll} 1 & - 1 \\ 0 & 0 \\ 1 & 1 \end{array}] = 𝑨$
Now that the $Q R$ factorization has been found, here are the steps to solve the least squares problem.
1. Find the vector $𝒚 = 𝑸^{T} 𝒃$
  $𝒚 = \frac{1}{\sqrt{2}} [\begin{array}{lll} 1 & 0 & 1 \\ - 1 & 0 & 1 \end{array}] [\begin{array}{l} 1 \\ 1 \\ 1 \end{array}] = \frac{1}{\sqrt{2}} [\begin{array}{l} 2 \\ 0 \end{array}]$
2. Solve the system $𝑹 𝒙 = 𝒚$
  $𝑹 𝒙 = [\begin{array}{ll} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{array}] 𝒙 = \frac{1}{\sqrt{2}} [\begin{array}{l} 2 \\ 0 \end{array}] = 𝒚$
  Find
  $𝒙 = [\begin{array}{l} 1 \\ 0 \end{array}], 𝒗 = 𝑨 𝒙 = [\begin{array}{ll} 1 & - 1 \\ 0 & 0 \\ 1 & 1 \end{array}] [\begin{array}{l} 1 \\ 0 \end{array}] = [\begin{array}{l} 1 \\ 0 \\ 1 \end{array}]$
Computer solution (useful as verification). The above calculations can also be carried out in Matlab/Octave:
>> A=[1 -1; 0 0; 1 1]; [Q,R]=qr(A,0)
>> Q
$(\begin{array}{cc} -0.7071 & 0.7071 \\ 0.0 & 0.0 \\ -0.7071 & -0.7071 \end{array})$
>> R
$(\begin{array}{cc} -1.4142 & -3.3307e-16 \\ 0.0 & -1.4142 \end{array})$
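Note that Matlab/Octave returned both orthonormal vectors with the opposite sign (and, correspondingly, a negative diagonal in $𝑹$ ), highlighting that there are multiple ways to orthonormalize the columns of $𝑨$ : here the factorization is determined only up to the signs of the columns of $𝑸$ . The entry of order $10^{-16}$ in $𝑹$ is floating-point rounding of the exact value 0.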
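
One way to compare the software output with the hand calculation is to flip signs so that the diagonal of $𝑹$ is positive. The sketch below assumes the diagonal entries of `R` are nonzero (true here, since the columns of $𝑨$ are independent); the names `S`, `Qp`, `Rp` are introduced only for illustration.

```octave
% Normalize the signs of a QR factorization so that diag(R) > 0.
A = [1 -1; 0 0; 1 1];
[Q, R] = qr(A, 0);
S  = diag(sign(diag(R)));  % +1 or -1 for each diagonal entry of R
Qp = Q * S;                % flip the columns of Q whose r_ii is negative
Rp = S * R;                % flip the matching rows of R
disp(Qp); disp(Rp);
disp(norm(Qp*Rp - A))      % the product is unchanged because S*S = I
```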

In Matlab/Octave the least squares solution is returned by the backslash operator:
>> b=[1; 1; 1]; x=A\b; disp(x)
1
0
Matrix $𝑨$ with linearly independent columns, $𝒃$ in the column space
$𝑨 = [\begin{array}{ll} 𝒂_{1} & 𝒂_{2} \end{array}] = [\begin{array}{ll} 1 & - 1 \\ 1 & 0 \\ 1 & 1 \end{array}], 𝒃 = [\begin{array}{l} 2 \\ 2 \\ 2 \end{array}] .$
Solution. Note that $𝒃$ is a simple scaling of the first column of $𝑨$
$𝒃 = 2 𝒂_{1} = [\begin{array}{ll} 𝒂_{1} & 𝒂_{2} \end{array}] [\begin{array}{l} 2 \\ 0 \end{array}] = 𝑨 𝒙 \Rightarrow 𝒙 = [\begin{array}{l} 2 \\ 0 \end{array}] .$
This example again highlights the need to first look at a problem before blindly carrying out calculations.
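
The same "look first" check can be done numerically: $𝒃$ is in the column space exactly when appending it to $𝑨$ does not increase the rank. The sketch below is only a hedged illustration, since `rank` relies on a numerical tolerance; the variable name `in_col_space` is an assumption.

```octave
% Check whether b lies in C(A) before solving (data from this example).
A = [1 -1; 1 0; 1 1];
b = [2; 2; 2];
in_col_space = (rank([A b]) == rank(A));  % true when b adds no new direction
x = A \ b;
disp(in_col_space)
disp(norm(b - A*x))      % residual norm, zero up to rounding
```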
Matrix $𝑨$ with linearly independent (non-orthogonal) columns, $𝒃$ not in the column space
$𝑨 = [\begin{array}{ll} 𝒂_{1} & 𝒂_{2} \end{array}] = [\begin{array}{ll} 1 & - 1 \\ 1 & 2 \\ 1 & 1 \end{array}], 𝒃 = [\begin{array}{l} 1 \\ 2 \\ 1 \end{array}] .$
Solution. In this case the Gram-Schmidt algorithm is applied to find the $𝑸 𝑹$ factorization of $𝑨$ . Here are the GS steps:

GS1. Find $𝒒_{1}$
$r_{11} = || 𝒂_{1} || = \sqrt{3}, 𝒒_{1} = \frac{𝒂_{1}}{r_{11}} = \frac{1}{\sqrt{3}} [\begin{array}{l} 1 \\ 1 \\ 1 \end{array}]$

GS2. Find $𝒒_{2}$ . This is carried out in two sub-steps:
1. Subtract from $𝒂_{2}$ its component along the previously determined direction $𝒒_{1}$
  $𝒘 = 𝒂_{2} - (𝒒_{1}^{T} 𝒂_{2}) 𝒒_{1} = 𝒂_{2} - r_{12} 𝒒_{1}$
  $r_{12} = \frac{1}{\sqrt{3}} [\begin{array}{lll} 1 & 1 & 1 \end{array}] [\begin{array}{l} - 1 \\ 2 \\ 1 \end{array}] = \frac{2}{\sqrt{3}}$
  $𝒘 = 𝒂_{2} - r_{12} 𝒒_{1} = [\begin{array}{l} - 1 \\ 2 \\ 1 \end{array}] - \frac{2}{3} [\begin{array}{l} 1 \\ 1 \\ 1 \end{array}] = \frac{1}{3} [\begin{array}{l} - 5 \\ 4 \\ 1 \end{array}] .$
  This vector is orthogonal to $𝒒_{1}$ . Verify, first symbolically and then numerically:
  $𝒒_{1}^{T} 𝒘 = 𝒒_{1}^{T} (𝒂_{2} - r_{12} 𝒒_{1}) = r_{12} - r_{12} 𝒒_{1}^{T} 𝒒_{1} = r_{12} - r_{12} \cdot 1 = 0$
  $\frac{1}{\sqrt{3}} [\begin{array}{lll} 1 & 1 & 1 \end{array}] \frac{1}{3} [\begin{array}{l} - 5 \\ 4 \\ 1 \end{array}] = \frac{1}{3 \sqrt{3}} (- 5 + 4 + 1) = 0$
2. Divide the resulting vector by its norm
  $r_{22} = || 𝒘 || = \frac{\sqrt{42}}{3}, 𝒒_{2} = \frac{𝒘}{r_{22}} = \frac{1}{\sqrt{42}} [\begin{array}{l} - 5 \\ 4 \\ 1 \end{array}]$
Verify:
$𝑸 𝑹 = [\begin{array}{ll} 1 / \sqrt{3} & - 5 / \sqrt{42} \\ 1 / \sqrt{3} & 4 / \sqrt{42} \\ 1 / \sqrt{3} & 1 / \sqrt{42} \end{array}] [\begin{array}{ll} \sqrt{3} & 2 / \sqrt{3} \\ 0 & \sqrt{42} / 3 \end{array}] = [\begin{array}{ll} 1 & - 1 \\ 1 & 2 \\ 1 & 1 \end{array}] = 𝑨$
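
Steps GS1 and GS2 can be replayed one at a time in Matlab/Octave, which is a convenient way to check each intermediate quantity of the hand calculation. The variable names below mirror the symbols in the text and are otherwise arbitrary.

```octave
% Replay of steps GS1 and GS2 for this example.
a1 = [1; 1; 1];  a2 = [-1; 2; 1];
r11 = norm(a1);          q1 = a1 / r11;   % GS1
r12 = q1' * a2;                           % component of a2 along q1
w   = a2 - r12 * q1;                      % GS2, sub-step 1
r22 = norm(w);           q2 = w / r22;    % GS2, sub-step 2
Q = [q1 q2];  R = [r11 r12; 0 r22];
disp([r12, 2/sqrt(3)])    % the two values should agree
disp([r22, sqrt(42)/3])   % the two values should agree
disp(norm(Q*R - [a1 a2])) % reconstruction error
```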
Here are the steps to solve the least squares problem.
1. Find the vector $𝒚 = 𝑸^{T} 𝒃$
  $𝒚 = [\begin{array}{lll} 1 / \sqrt{3} & 1 / \sqrt{3} & 1 / \sqrt{3} \\ - 5 / \sqrt{42} & 4 / \sqrt{42} & 1 / \sqrt{42} \end{array}] [\begin{array}{l} 1 \\ 2 \\ 1 \end{array}] = [\begin{array}{l} 4 / \sqrt{3} \\ 4 / \sqrt{42} \end{array}]$
2. Solve the system $𝑹 𝒙 = 𝒚$
  $𝑹 𝒙 = [\begin{array}{ll} \sqrt{3} & 2 / \sqrt{3} \\ 0 & \sqrt{42} / 3 \end{array}] 𝒙 = [\begin{array}{l} 4 / \sqrt{3} \\ 4 / \sqrt{42} \end{array}] = 𝒚$
  Find
  $𝒙 = [\begin{array}{l} 24 / 21 \\ 6 / 21 \end{array}], 𝒗 = 𝑨 𝒙 = [\begin{array}{ll} 1 & - 1 \\ 1 & 2 \\ 1 & 1 \end{array}] [\begin{array}{l} 24 / 21 \\ 6 / 21 \end{array}] = [\begin{array}{l} 6 / 7 \\ 12 / 7 \\ 10 / 7 \end{array}]$
Computer solution (useful as verification). The above calculations can also be carried out in Matlab/Octave:
>> A=[1 -1; 1 2; 1 1]; [Q,R]=qr(A,0)
>> Q
$(\begin{array}{cc} -0.5774 & 0.7715 \\ -0.5774 & -0.6172 \\ -0.5774 & -0.1543 \end{array})$
>> R
$(\begin{array}{cc} -1.7321 & -1.1547 \\ 0.0 & -2.1602 \end{array})$

In Matlab/Octave the least squares solution is returned by the backslash operator:
>> b=[1; 2; 1]; x=A\b; disp(x)
1.1429
0.2857
>> [24/21; 6/21]
$(\begin{array}{c} 1.1429 \\ 0.2857 \end{array})$
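
A further check on the hand-computed solution is that the residual $𝒃 - 𝑨 𝒙$ must be orthogonal to every column of $𝑨$ , which is what makes $𝒗 = 𝑨 𝒙$ the orthogonal projection. The sketch below uses the values found above; the name `r` for the residual is introduced only for illustration.

```octave
% Residual of the hand-computed least squares solution for this example.
A = [1 -1; 1 2; 1 1];  b = [1; 2; 1];
x = [24/21; 6/21];       % hand-computed solution
v = A * x;               % projection, [6/7; 12/7; 10/7]
r = b - v;               % residual, [1/7; 2/7; -3/7]
disp(A' * r)             % both entries vanish up to rounding
```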