MATH347: Linear algebra for applications in data science
June 4, 2021
Solve the following problems (5 course points each). Present a brief
motivation of your method of solution. This is an open-book test, and
you are free to consult the textbook. Your submission must however
reflect your individual effort with no aid from any other person.
Draft your solution in TeXmacs in this file. Spaces for your solution
have been provided in this file for text, formulas, figures, Octave
commands. If Octave does not work within TeXmacs, verify your commands
in the stand-alone Octave application and paste the commands into the
appropriate spaces in this file without executing them. Upload your
answer into Sakai both as a TeXmacs file and as a pdf. Allow at least 10
minutes before the submission cut-off time to ensure you can upload
your file.
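Data can be represented in multiple ways. The course described the least squares solution to representing the data as a polynomial, for instance a quadratic polynomial $a_0 + a_1 t + a_2 t^2$, the coefficients of which are found by solving
(1) $\min_{a} \| y - A a \|_2$, with $A = [\, 1 \;\; t \;\; t^2 \,]$ and $a = [\, a_0 \;\; a_1 \;\; a_2 \,]^T$.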
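Notice that $A a$ is a linear combination of $1, t, t^2$ with scaling coefficients $a_0, a_1, a_2$, and recall that powers of the vector $t$ are taken componentwise, $t^k = [\, t_1^k \;\; \dots \;\; t_m^k \,]^T$.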
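Consider another representation of the data as a trigonometric polynomial $a_0 + a_1 \sin t + a_2 \cos t$. State the least squares problem by modifying (1) to reflect the new representation.
Solution. The columns of $A$ change:
(2) $\min_{a} \| y - A a \|_2$, with $A = [\, 1 \;\; \sin(t) \;\; \cos(t) \,]$.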
For $m = 100$ and $t_i = 2\pi i/m$ for $i = 1, \dots, m$, arbitrarily choose some values for $a_0, a_1, a_2$, and construct the data vectors $t$, $x$ in Octave.
Solution.
octave] m=100; t=2*pi*(1:m)'/m;   % m equally spaced sample times in (0, 2*pi]
octave] a0=-0.5; a1=0.5; a2=1.0; aEX=[a0; a1; a2];   % chosen (exact) coefficients
octave] x=a0+a1*sin(t)+a2*cos(t);   % noise-free data
Solve in Octave the least squares problem you stated in point 1. Do you recover the coefficients you chose?
octave] A=[t.^0 sin(t) cos(t)];   % columns: ones, sin(t), cos(t)
octave] [Q,R]=qr(A,0); b=Q'*x; aLS=R\b; aLS'   % reduced QR, then solve R*a = Q'*x
ans =
-0.50000 0.50000 1.00000
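The chosen coefficients are recovered to all five displayed decimal places; with noise-free data the fit is exact up to floating-point rounding.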
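The recovery can also be measured directly rather than read off the displayed digits. The following is a minimal sketch, assuming the same setup as the session commands; the variable `aBS` (the solution from Octave's backslash operator) is introduced here only for comparison and is not part of the original solution.

```octave
% Sketch: measure how closely the chosen coefficients are recovered (noise-free data).
m = 100; t = 2*pi*(1:m)'/m;            % sample times, as in the session
aEX = [-0.5; 0.5; 1.0];                % chosen coefficients a0, a1, a2
A = [t.^0 sin(t) cos(t)];              % columns: ones, sin(t), cos(t)
x = A*aEX;                             % noise-free data, equal to a0+a1*sin(t)+a2*cos(t)

[Q, R] = qr(A, 0);                     % reduced QR factorization, A = Q*R
aLS = R \ (Q'*x);                      % solve the triangular system R*a = Q'*x
aBS = A \ x;                           % comparison only (assumption): backslash least squares

errQR = norm(aLS - aEX)                % expected to be near machine precision
errBS = norm(aBS - aEX)
```

Both error norms should be many orders of magnitude below the five digits shown in the session output.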
Perturb the data to mimic measurement noise, $y = x + s\, r$, where $r$ is a vector of random numbers in the interval $[-1, 1]$ scaled by $s$. Solve the least squares problem for the new, noisy data to obtain the perturbed coefficients.
octave] y = x + 2*(rand(m,1)-0.5);   % add noise uniform in [-1, 1]
octave] b=Q'*y; aLS=R\b; aLS'   % reuse the QR factors of A for the noisy data
ans =
-0.34031 0.50169 1.04731
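One property worth checking for the noisy fit is that the least squares residual is orthogonal to the columns of A, so only the part of the noise lying in the column space of A changes the coefficients. The sketch below illustrates this under the same setup; the fixed seed set by `rand('state', 1)` and the names `res`, `orth`, `da` are assumptions added for the illustration.

```octave
% Sketch: the residual of the noisy least squares fit is orthogonal to the columns of A.
rand('state', 1);                      % assumption: fixed seed for repeatability
m = 100; t = 2*pi*(1:m)'/m;
aEX = [-0.5; 0.5; 1.0];
A = [t.^0 sin(t) cos(t)];
x = A*aEX;                             % noise-free data
y = x + 2*(rand(m,1) - 0.5);           % noise uniform in [-1, 1]

[Q, R] = qr(A, 0);
aLS = R \ (Q'*y);                      % perturbed coefficients
res = y - A*aLS;                       % residual of the noisy fit
orth = norm(A'*res)                    % near zero: residual is orthogonal to the columns of A
da = norm(aLS - aEX)                   % norm of the change in the coefficients
```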
Write an Octave loop over the scaling coefficient values $s = 1, \dots, n$ with $n = 10$, and compute the norm of the change in the coefficients for each value. Construct a plot of this norm against $s$ and comment on the effect of the magnitude of the noise, as measured by $s$, upon recovery of the exact coefficients.
octave] n=10; err=zeros(n,1); for s=1:n, y = x + 2*s*(rand(m,1)-0.5); b=Q'*y; aLS=R\b; err(s) = norm(aLS-aEX); end
octave] plot(1:n,err,'o')
octave] cd ~/courses/MATH347DS; print -depsc 'errplot.eps'
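Because the least squares solution is linear in the data, the coefficient error scales in direct proportion to the noise amplitude when the same noise realization is reused. The sketch below illustrates this under stated assumptions: a single fixed noise vector `r` (a hypothetical name, not in the original commands) replaces the fresh random draw inside the loop, which removes the scatter that independent draws produce.

```octave
% Sketch: coefficient error grows linearly with the noise scale s for a fixed noise vector.
rand('state', 1);                      % assumption: fixed seed for repeatability
m = 100; t = 2*pi*(1:m)'/m;
aEX = [-0.5; 0.5; 1.0];
A = [t.^0 sin(t) cos(t)];
x = A*aEX;                             % noise-free data
[Q, R] = qr(A, 0);

r = 2*(rand(m,1) - 0.5);               % one noise realization, uniform in [-1, 1]
n = 10; err = zeros(n,1);
for s = 1:n
  y = x + s*r;                         % same noise vector, amplitude s
  aLS = R \ (Q'*y);
  err(s) = norm(aLS - aEX);
end
ratio = err ./ (1:n)'                  % constant up to rounding: err(s) = s*err(1)
```

With independent draws for each value of s, as in the session loop, the same linear trend is expected on average, with random scatter about it.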
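Continuing the above, suppose the measurement noise is modulated in time,
(3) $y = x + s\, r_1 \odot \sin(t) + 4 s\, r_2 \odot \sin(2t)$, where $r_1$, $r_2$ are vectors of random numbers in $[-1, 1]$ and $\odot$ denotes the componentwise product.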
Investigate now the utility of the singular value decomposition to gain insight into the data.
Construct a data matrix $Y$ whose columns are the modulated noise measurements specified in formula (3), one column for each noise scale.
octave] n=10; Y = zeros(m,n); err=zeros(n,1); for s=1:n, y = x + s*2*(rand(m,1)-0.5).*sin(t) + 4*s*2*(rand(m,1)-0.5).*sin(2*t); Y(:,s) = y; end
Compute the mean of the measurements, $\bar{y}$, and construct the centered data matrix $Y_0$ by subtracting $\bar{y}$ from each column of $Y$.
octave] yAV = mean(Y')'; size(yAV)
ans =
100 1
octave] Y0=Y-yAV;
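The centering step can be checked on a small example. The sketch below uses a hypothetical 4-by-3 matrix, not the exam data; it assumes an Octave version with automatic broadcasting, so that `Y - yAV` subtracts the column vector from every column, and it shows that `mean(Y, 2)` gives the same row-wise mean as `mean(Y')'`.

```octave
% Sketch: centering a data matrix whose columns are measurements.
Y = [1 2 3; 4 6 8; 0 0 3; 5 5 5];      % hypothetical data: 4 time samples, 3 measurements
yAV = mean(Y')';                       % mean over the measurements, as in the session
yAV2 = mean(Y, 2);                     % equivalent form: mean along dimension 2
Y0 = Y - yAV;                          % broadcasting subtracts yAV from each column
rowMeans = mean(Y0, 2)                 % all zeros after centering
```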
Compute the first 3 singular vectors of $Y_0$ using the svds Octave function.
octave] [U,S,V]=svds(Y0,3);
The largest 10 singular values are found through the instruction sigma=svds(Y,10,'L'). Display these values and comment on your observations.
octave] sigma=svds(Y,10,'L')'
sigma =
189.757 161.290 139.115 115.779 96.063 84.466 64.798 48.051 31.548 16.625
The singular values decay steadily, from 189.757 down to 16.625 (about a factor of 11 across the ten values).
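The rate of decay can be quantified from the displayed values. The sketch below is an illustration only: it copies the ten singular values from the session output and computes successive ratios and the cumulative fraction of the sum of squares; the names `ratios` and `frac` are introduced here and are not part of the original solution.

```octave
% Sketch: quantify the decay of the singular values listed in the session output.
sigma = [189.757 161.290 139.115 115.779 96.063 84.466 64.798 48.051 31.548 16.625];
ratios = sigma(2:end) ./ sigma(1:end-1)   % successive ratios sigma(k+1)/sigma(k)
frac = cumsum(sigma.^2) / sum(sigma.^2)   % cumulative fraction of the sum of squares
```

For these values the first three singular values account for a little over two thirds of the sum of squares, so a rank-3 approximation captures most, but not all, of the variation.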
Plot the first 3 singular vectors. What features of the data are revealed by the dominant singular vectors?
octave] plot(t,U(:,1),t,U(:,2),t,U(:,3))
octave] print -depsc 'uplot.eps'
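One possible way to probe what the dominant singular vectors reveal is to compare where their entries are large with the envelope of the stronger noise term, $|\sin(2t)|$. The sketch below is a hedged illustration, not the intended exam answer: it regenerates the data with a fixed seed, uses the dense `svd` with the `'econ'` option in place of `svds`, and introduces the hypothetical names `E`, `env`, `Ehigh`, `Elow`.

```octave
% Sketch: do the dominant singular vectors concentrate where |sin(2t)| is large?
rand('state', 1);                      % assumption: fixed seed for repeatability
m = 100; t = 2*pi*(1:m)'/m;
x = -0.5 + 0.5*sin(t) + 1.0*cos(t);    % noise-free data with the chosen coefficients
n = 10; Y = zeros(m,n);
for s = 1:n
  Y(:,s) = x + s*2*(rand(m,1)-0.5).*sin(t) + 4*s*2*(rand(m,1)-0.5).*sin(2*t);
end
Y0 = Y - mean(Y, 2);                   % centered data matrix
[U, S, V] = svd(Y0, 'econ');           % assumption: dense svd used instead of svds
E = sum(U(:,1:3).^2, 2);               % energy of the 3 dominant vectors at each time
env = abs(sin(2*t));                   % envelope of the larger noise term
Ehigh = mean(E(env > 0.7))             % mean energy where the envelope is large
Elow  = mean(E(env < 0.3))             % mean energy near the nodes of sin(2t)
```

If `Ehigh` is markedly larger than `Elow`, one reading is that the dominant singular vectors reflect the time modulation of the noise, with small entries near the nodes of $\sin(2t)$.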