"Maymester MATH547 Linear Algebra for Applications in Data Science"

MATH547: Linear algebra for applications in data science, May 29, 2020

Final Examination

Solve the following problems (5 course points each). Present a brief motivation of your method of solution. Explicitly state any conditions that must be met for the solution procedure to be valid. Organize your computation and writing so the solution you present is readily legible. No credit is awarded for stating the final answer to a problem without presenting the solution procedure.

This is an open-book test, and you are free to consult the textbook or use software to check your solution. Your submission must however reflect your individual effort with no aid from any other person. If you studied the course material and understood solutions to the homework assignments, drafting examination question solutions in TeXmacs should take about 2 hours. The allotted time is 3:10 hours, thus also providing flexibility for internet connection interruption.

Draft your solution in TeXmacs in this file. Upload your answer into Sakai both as a TeXmacs file and as a PDF. Allow at least 10 minutes before the submission cut-off time to ensure you can upload your file.

Problem 1 (theory)

Two matrices $𝑨, 𝑩 \in ℝ^{m \times m}$ are said to be similar, denoted as $𝑨 \sim 𝑩$ if there exists some invertible matrix $𝑻$ such that $𝑩 = 𝑻^{- 1} 𝑨 𝑻$ . Prove that matrix similarity is an equivalence relation.

Solution. Verify equivalence relation properties for $𝑨, 𝑩, 𝑪 \in ℝ^{m \times m}$ :

Reflexivity

$𝑨 \sim 𝑨$ holds with $𝑻 = 𝑰$ , $𝑨 = 𝑰^{- 1} 𝑨 𝑰$ .

Symmetry

$𝑨 \sim 𝑩 \Rightarrow 𝑩 \sim 𝑨$ . If $𝑨 \sim 𝑩$ , then there exists an invertible $𝑻$ such that $𝑩 = 𝑻^{- 1} 𝑨 𝑻$ . Multiply on the left by $𝑻$ and on the right by $𝑻^{- 1}$ to obtain $𝑨 = 𝑻 𝑩 𝑻^{- 1} = {(𝑻^{- 1})}^{- 1} 𝑩 (𝑻^{- 1}) = 𝑺^{- 1} 𝑩 𝑺$ with $𝑺 = 𝑻^{- 1}$ invertible, hence $𝑩 \sim 𝑨$ .

Transitivity

$𝑨 \sim 𝑩 \land 𝑩 \sim 𝑪 \Rightarrow 𝑨 \sim 𝑪$ . If $𝑨 \sim 𝑩 \land 𝑩 \sim 𝑪$ , then $\exists 𝑻, 𝑼$ such that $𝑩 = 𝑻^{- 1} 𝑨 𝑻$ and $𝑪 = 𝑼^{- 1} 𝑩 𝑼 .$ Replace $𝑩$ to obtain

$𝑪 = 𝑼^{- 1} 𝑻^{- 1} 𝑨 𝑻 𝑼 = {(𝑻 𝑼)}^{- 1} 𝑨 (𝑻 𝑼) = 𝑺^{- 1} 𝑨 𝑺,$

with $𝑺 = 𝑻 𝑼$ invertible as a product of invertible matrices, hence $𝑨 \sim 𝑪$ .

Matrix similarity is verified to be an equivalence relation.
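
The symmetry and transitivity steps can be spot-checked numerically. The following is a minimal sketch, not a proof: the size `m` and the random matrices `A`, `T`, `U` are arbitrary illustrative choices that do not appear in the examination.

```octave
% Numerical sanity check of symmetry and transitivity (illustrative data only)
m = 4;
A = rand(m);
T = rand(m) + m*eye(m);    % strictly diagonally dominant, hence invertible
U = rand(m) + m*eye(m);
B = T\A*T;                 % B = T^{-1} A T, so A ~ B
C = U\B*U;                 % C = U^{-1} B U, so B ~ C
% Symmetry: A = S^{-1} B S with S = T^{-1}
S = inv(T);
errSym = norm(S\B*S - A);
% Transitivity: C = S^{-1} A S with S = T U
S = T*U;
errTrans = norm(S\A*S - C);
disp([errSym errTrans])    % both should be at roundoff level
```

Both displayed values should be close to machine precision, consistent with the choices $𝑺 = 𝑻^{- 1}$ and $𝑺 = 𝑻 𝑼$ used in the proof.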

Problem 2 (theory)

A matrix $𝑨 \in ℝ^{m \times m}$ is said to be skew-symmetric if $𝑨 = - 𝑨^{T}$ .

Are skew-symmetric matrices a vector subspace of the vector space of matrices $(ℝ^{m \times m}, ℝ, +, \cdot)$ ?
Compute $𝒙^{T} 𝑨 𝒙$ when $𝑨$ is skew-symmetric.

Solution.

Verify closure for $α 𝑨 + β 𝑩$ , with $α, β$ scalars and $𝑨, 𝑩 \in ℝ^{m \times m}$ skew-symmetric. Compute ${(α 𝑨 + β 𝑩)}^{T} = α 𝑨^{T} + β 𝑩^{T} = - α 𝑨 - β 𝑩 = - (α 𝑨 + β 𝑩)$ , so $α 𝑨 + β 𝑩$ is skew-symmetric, hence the skew-symmetric matrices form a subspace of $(ℝ^{m \times m}, ℝ, +, \cdot)$ .
Compute the transpose ${(𝒙^{T} 𝑨 𝒙)}^{T} = 𝒙^{T} 𝑨^{T} {(𝒙^{T})}^{T} = 𝒙^{T} 𝑨^{T} 𝒙 = - 𝒙^{T} 𝑨 𝒙$ . Since $α = 𝒙^{T} 𝑨 𝒙$ is a scalar, $α^{T} = α = - α$ , hence $α = 0$ .
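
Both results can be checked on a small example. This is a sketch under assumed data: the matrices `M`, `N`, the scalars `alpha`, `beta` and the vector `x` are illustrative choices, and the skew-symmetric matrices are built as $𝑴 - 𝑴^{T}$.

```octave
% Closure and quadratic form check for skew-symmetric matrices (illustrative data)
m = 5;
M = rand(m); N = rand(m);
A = M - M';  B = N - N';    % skew-symmetric by construction
alpha = 2; beta = -3;
K = alpha*A + beta*B;       % linear combination
errSkew = norm(K + K');     % zero if K is skew-symmetric
x = rand(m,1);
q = x'*A*x;                 % quadratic form, zero up to roundoff
disp([errSkew q])
```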

Problem 3 (theory)

Consider for $𝒗 \in ℝ^{m}$ , ${|| 𝒗 ||}_{2} = 1$ , the matrix $𝑯 = 𝑰 - 2 𝒗 𝒗^{T}$ .

Show that $𝑯$ is orthogonal.
Can $𝒗$ be chosen such that $𝑯$ is a projector?

Solution.

Compute
$𝑯 𝑯^{T} = (𝑰 - 2 𝒗 𝒗^{T}) {(𝑰 - 2 𝒗 𝒗^{T})}^{T} = (𝑰 - 2 𝒗 𝒗^{T}) (𝑰^{T} - {(2 𝒗 𝒗^{T})}^{T}) = (𝑰 - 2 𝒗 𝒗^{T}) (𝑰 - 2 𝒗 𝒗^{T}) .$
Since ${|| 𝒗 ||}_{2}^{2} = 𝒗^{T} 𝒗 = 1$ , obtain
$𝑯 𝑯^{T} = 𝑰 - 2 𝒗 𝒗^{T} - 2 𝒗 𝒗^{T} + 4 𝒗 (𝒗^{T} 𝒗) 𝒗^{T} = 𝑰 - 4 𝒗 𝒗^{T} + 4 𝒗 𝒗^{T} = 𝑰 .$
And since $𝑯^{T} = {(𝑰 - 2 𝒗 𝒗^{T})}^{T} = 𝑰 - 2 𝒗 𝒗^{T} = 𝑯$ , replacing in above gives $𝑯^{T} 𝑯 = 𝑰$ , hence $𝑯$ is orthogonal.
For $𝑯$ to be a projector, $𝑯^{2} = 𝑯$ must hold. Since $𝑯 = 𝑯^{T}$ , compute $𝑯^{2} = 𝑯 𝑯^{T} = 𝑰$ , so the condition becomes
$𝑰 = 𝑰 - 2 𝒗 𝒗^{T},$
implying $𝒗 𝒗^{T} = 𝟎$ , that is $𝒗 = 𝟎$ , contradicting ${|| 𝒗 ||}_{2} = 1,$ hence $𝑯$ cannot be a projector.
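
A quick numerical illustration of both conclusions, as a sketch with an arbitrarily chosen random unit vector (the size `m` and the vector `v` are assumptions made for the example):

```octave
% Check orthogonality and the projector condition for H = I - 2 v v'
m = 6;
v = rand(m,1); v = v/norm(v);    % unit vector
H = eye(m) - 2*(v*v');
errOrth = norm(H'*H - eye(m));   % roundoff level: H is orthogonal
errProj = norm(H*H - H);         % equals norm(2*v*v') = 2: H is not a projector
disp([errOrth errProj])
```

The second value is $|| 𝑰 - 𝑯 ||_{2} = || 2 𝒗 𝒗^{T} ||_{2} = 2$ for any unit vector $𝒗$ , so the projector condition fails by a fixed amount rather than by roundoff.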

Problem 4 (computation)

The matrix $𝑨$ contains surveying data (https://math.nist.gov/MatrixMarket/data/Harwell-Boeing/lsq/well1033.html), and its singular value decomposition is $𝑨 = 𝑼 𝚺 𝑽^{T}$ .

octave]

cd /home/student/courses/MATH547ML/data/lsq; load well1033

octave]

A=Expression1;

octave]

Construct a vector $𝒃 = 𝑼 𝒛$ , with $𝒛$ a unit vector of random numbers ( $|| 𝒛 || = 1$ ).
Solve the least squares problem ${min}_{𝒙} || 𝑨 𝒙 - 𝒃 ||$ .
What is the error $e = || 𝑨 𝒙 - 𝒃 ||$ ?

Remember to briefly comment the solution you present. Presentation of Octave commands without explanation of what you are doing does not receive full credit.

Solution.

Compute SVD and generate random numbers

octave]

[U,S,V]=svd(A); [m,n]=size(A); z=rand(m,1); z=z/norm(z); b=U*z;
disp([norm(z) norm(b)])

1.0000 1.0000

octave]

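The solution is the projection of $𝒃$ onto $C (𝑨)$ , $𝑨 𝒙 = 𝑷_{C (𝑨)} 𝒃 = 𝒚$ . When $𝑨$ has full column rank, the projector can be computed from the reduced $Q R$ -decomposition, $𝑨 = 𝑸 𝑹$ with $𝑸$ having $n$ orthonormal columns, as $𝑷_{C (𝑨)} = 𝑸 𝑸^{T}$ .
octave]

[Q,R]=qr(A,0); y=Q*(Q'*b); x=A\y;  % economy-size QR: the full Q is m by m and would give Q*Q'=I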
octave]
The projector can also be computed from the SVD, $𝑷_{C (𝑨)} = 𝑼_{r} 𝑼_{r}^{T}$ with $𝑼_{r}$ the first $r$ columns of $𝑼$ , and $r = rank (𝑨)$ .
octave]

r=rank(A); u=U(:,1:r)*U(:,1:r)'*b; w=A\u; disp(norm(x-w));
1.2416e-14
octave]
Compute error
octave]

err=norm(A*x-b)
err = 0.81930
octave]
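
Because $𝒃 = 𝑼 𝒛$ , the residual is $𝒃 - 𝑼_{r} 𝑼_{r}^{T} 𝒃 = 𝑼 𝒛 - 𝑼_{r} 𝒛_{1 : r}$ , so the error equals the norm of the last $m - r$ components of $𝒛$ . The sketch below checks this on a small random matrix used as a stand-in for the surveying data; the sizes `m`, `n` and the matrix itself are assumptions for illustration.

```octave
% Least squares error versus the trailing components of z (synthetic stand-in data)
m = 20; n = 5;
A = rand(m,n);                    % illustrative matrix, not well1033
[U,S,V] = svd(A);
z = rand(m,1); z = z/norm(z);     % unit vector of random numbers
b = U*z;
x = A\b;                          % least squares solution
r = rank(A);
err  = norm(A*x - b);             % least squares error
tail = norm(z(r+1:m));            % norm of components of z outside C(A)
disp([err tail])                  % the two values agree up to roundoff
```

For uniformly distributed random entries this suggests an error near $\sqrt{(m - r) / m}$ , which for $m = 1033$ and $r = 320$ is about 0.83, in line with the value computed above.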

Problem 5 (computation)

For the same data as in Problem 4:

Construct a reduced matrix $𝑹 = 𝑪^{T} 𝑨 𝑩$ , that contains the first $k$ singular modes of $𝑨$ , such that $σ_{k} / σ_{1} ⩽ 10^{- 1}$ , with $𝑩, 𝑪$ orthogonal matrices.
Find the projection $𝒗$ of $𝒃$ from problem 4 onto the column space of $𝑪$ , $𝒗 \in C (𝑪)$ .
Solve the problem $𝑹 𝒖 = 𝒗$ .
Compare $𝑩 𝒖$ to least squares solution from Problem 4.

Remember to briefly comment the solution you present. Presentation of Octave commands without explanation of what you are doing does not receive full credit.

Solution.

Extract the singular values, check condition $σ_{k} / σ_{1} ⩽ 10^{- 1}$ to find $k$ .
octave]

s=diag(S); k=310; s(k)/s(1)
ans = 0.073165
octave]
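
The index can also be located programmatically instead of by inspection. A minimal sketch, assuming the intended $k$ is the first index at which the ratio reaches the threshold; the vector `s` below holds made-up values and is not the spectrum of the surveying data.

```octave
% Locate the first index k with s(k)/s(1) <= 0.1 (made-up singular values)
s = [10; 5; 2; 0.9; 0.5; 0.01];   % illustrative, sorted in decreasing order
k = find(s/s(1) <= 1e-1, 1);      % first index meeting the threshold
disp([k s(k)/s(1)])
```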
Define $𝑩, 𝑪$ from singular vectors of $𝑨$
octave]

C=U(:,1:k); B=V(:,1:k); R=C'*A*B;
octave]
Since $𝑪$ has orthonormal columns, the projection is the solution of $𝑷_{C (𝑪)} 𝒃 = 𝑪 𝑪^{T} 𝒃 = 𝑪 𝒗$ , or $𝒗 = 𝑪^{T} 𝒃$ .
octave]

v=C'*b;
octave]
Find solution
octave]

u=R\v;
octave]

Compute the relative error $|| 𝑩 𝒖 - 𝒙 || / || 𝒙 ||$

octave]

norm(B*u-x)/norm(x)

ans = 0.99360

octave]

The above relative error is large. Define a function to carry out the above steps for a given $k$

octave]

function [rerr,Aerr]=ReducedModel(k,A,x,b,s,U,V)
  C=U(:,1:k); B=V(:,1:k); R=C'*A*B;
  v=C'*b; u=R\v;
  rerr = norm(B*u-x)/norm(x);
  Aerr=norm(A-U(:,1:k)*diag(s(1:k))*V(:,1:k)');
end

octave]

[rerr,Aerr]=ReducedModel(310,A,x,b,s,U,V)

rerr = 0.99360 Aerr = 0.099048

octave]

[rerr,Aerr]=ReducedModel(315,A,x,b,s,U,V)

rerr = 0.96736 Aerr = 0.028965

octave]

[rerr,Aerr]=ReducedModel(319,A,x,b,s,U,V)

rerr = 0.74059 Aerr = 0.010874

octave]

[rerr,Aerr]=ReducedModel(320,A,x,b,s,U,V)

rerr = 1.1217e-14 Aerr = 2.5645e-14

octave]

From the above, the relative error remains of order one until all 320 singular modes are retained, so there is no benefit from model reduction in this case.
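
One way to read this result: with $𝑪, 𝑩$ taken from the singular vectors, $𝑹 = 𝑪^{T} 𝑨 𝑩$ is the diagonal matrix of the leading $k$ singular values, so $𝑩 𝒖$ is the truncated sum $\sum_{i \leqslant k} (z_{i} / σ_{i}) 𝒗_{i}$ , while the discarded terms $\sum_{i > k} (z_{i} / σ_{i}) 𝒗_{i}$ are divided by the smallest singular values and can dominate $𝒙$ when $𝒛$ is random. The sketch below verifies the two identities on a small random matrix; the sizes and the data are assumptions for illustration, not the surveying matrix.

```octave
% Reduced model versus full least squares solution (synthetic stand-in data)
m = 20; n = 5; k = 3;
A = rand(m,n);                            % illustrative matrix, not well1033
[U,S,V] = svd(A); s = diag(S);
z = rand(m,1); z = z/norm(z); b = U*z;
x = A\b;                                  % full least squares solution
C = U(:,1:k); B = V(:,1:k); R = C'*A*B;   % reduced matrix
errR = norm(R - diag(s(1:k)));            % R is diagonal with leading singular values
u = R\(C'*b);                             % reduced solution
d = V(:,k+1:n)*(z(k+1:n)./s(k+1:n));      % discarded part of x
errD = norm((x - B*u) - d);               % roundoff level
disp([errR errD])
```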