A vector space can be formed from all linear mappings from a vector space $U$ to another vector space $V$,
$$ \mathcal{L} = \{ f \mid f: U \to V, \ f \ \text{linear} \}, $$
with addition and scaling of linear mappings defined by $(f+g)(u) = f(u) + g(u)$ and $(\alpha f)(u) = \alpha\, f(u)$. Let $\{u_1, u_2, \dots\}$ denote a basis for the domain $U$ of linear mappings within $\mathcal{L}$, such that the linear mapping $f \in \mathcal{L}$ is represented by the matrix
$$ A = [\, f(u_1) \ \ f(u_2) \ \ \dots \,]. $$
When the domain and codomain are the real vector spaces $U = \mathbb{R}^n$, $V = \mathbb{R}^m$, the above is a standard matrix of real numbers, $A \in \mathbb{R}^{m \times n}$. For linear mappings between infinite-dimensional vector spaces, the matrix is understood in a generalized sense to contain an infinite number of columns that are elements of the codomain $V$. For example, the indefinite integral $\int$ is a linear mapping from the vector space of functions that allow differentiation to any order,
$$ C^{\infty}(\mathbb{R}) = \{\, f : \mathbb{R} \to \mathbb{R} \mid f^{(k)} \ \text{exists for all} \ k \in \mathbb{N} \,\}, $$
to itself, and for the monomial basis $\{1, x, x^2, \dots\}$, $\int$ is represented by the generalized matrix
$$ A = \left[\; x \ \ \frac{x^2}{2} \ \ \frac{x^3}{3} \ \dots \;\right]. $$
Truncation of the Maclaurin series $f(x) = \sum_{j=1}^{\infty} c_j x^{j-1}$ to $n$ terms, and sampling of the monomials at points $x_1, x_2, \dots, x_m$, forms a standard matrix of real numbers
$$ A = \begin{pmatrix} 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_m & \dots & x_m^{n-1} \end{pmatrix} \in \mathbb{R}^{m \times n} . $$
Values of the function $f$ at the sample points are approximated by
$$ \begin{pmatrix} f(x_1) \\ \vdots \\ f(x_m) \end{pmatrix} \cong A c , $$
with $c = (c_1, c_2, \dots, c_n)^T$ denoting the coordinates of $f$ in the basis $\{1, x, \dots, x^{n-1}\}$. The above argument states that the coordinates of $F$, the primitive of $f$, in the basis $\{x, x^2, \dots, x^n\}$ are given by
$$ \left( c_1, \ \frac{c_2}{2}, \ \dots, \ \frac{c_n}{n} \right)^T , $$
as can indeed be verified through term-by-term integration of the Maclaurin series.
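A brief Julia sketch of this construction; the truncation order, the sample points, and the test function $e^x$ are illustrative choices, not taken from the text. The monomials and their primitives are sampled, and both the function and its primitive are recovered from the same Maclaurin coefficients.
∴ n=8; m=16; x=(0:m-1)/(m-1);
∴ A=[xi^(j-1) for xi in x, j in 1:n];            # samples of the monomials 1, x, ..., x^(n-1)
∴ B=[xi^j/j for xi in x, j in 1:n];              # samples of their primitives x, x^2/2, ..., x^n/n
∴ c=[1/factorial(j-1) for j in 1:n];             # Maclaurin coefficients of exp(x)
∴ println(maximum(abs.(A*c .- exp.(x))))         # A*c approximates exp at the sample points
∴ println(maximum(abs.(B*c .- (exp.(x).-1))))    # B*c approximates the primitive exp(x)-1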
As is to be expected, matrices can also be organized as a vector space $\mathbb{R}^{m \times n}$, which is essentially the representation of the associated vector space of linear mappings. The addition and scaling of matrices is given in terms of the matrix components by
$$ (A + B)_{ij} = a_{ij} + b_{ij}, \qquad (\alpha A)_{ij} = \alpha\, a_{ij} . $$
From the above it is apparent that linear mappings and matrices can also be considered as data, and a first step in the analysis of such data is the definition of functionals that attach a single scalar label to each linear mapping or matrix. Of particular interest is the definition of a norm functional that characterizes, in an appropriate sense, the size of a linear mapping.
Consider first the case of finite matrices with real components, $A \in \mathbb{R}^{m \times n}$, that represent linear mappings between the real vector spaces $\mathbb{R}^n$, $\mathbb{R}^m$. The columns of $A = [\, a_1 \ a_2 \ \dots \ a_n \,]$ could be placed into a single column vector
$$ c = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \in \mathbb{R}^{mn} $$
with $mn$ components. Subsequently, the norm of the matrix $A$ could be defined as the norm of the vector $c$. An example of this approach is the Frobenius norm
$$ \|A\|_F = \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{1/2} . $$
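As a quick numerical check (the 3×2 matrix is an arbitrary example), the Frobenius norm can be computed from the stacked column vector, from the explicit formula, or with Julia's norm, which for a matrix argument returns the Frobenius norm.
∴ using LinearAlgebra
∴ A=[1.0 2.0; 3.0 4.0; 5.0 6.0]       # arbitrary 3×2 example
∴ c=vec(A)                            # columns stacked into a single vector
∴ println(norm(c))                    # 2-norm of the stacked vector
∴ println(sqrt(sum(abs2, A)))         # explicit Frobenius formula
∴ println(norm(A))                    # norm of a matrix in Julia is the Frobenius norm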
A drawback of the above approach is that the structure of the matrix and its close relationship to a linear mapping is lost. A more useful characterization of the size of a mapping is to consider its amplification behavior. The motivation is readily understood starting from linear mappings between the reals, $f: \mathbb{R} \to \mathbb{R}$, that are of the form $f(x) = a x$. When given an argument of unit magnitude, $|x| = 1$, the mapping returns a real number with magnitude $|a|$. For mappings within the plane, $f: \mathbb{R}^2 \to \mathbb{R}^2$, $f(x) = A x$, arguments that satisfy $x_1^2 + x_2^2 = 1$ are on the unit circle, with components $x_1 = \cos\theta$, $x_2 = \sin\theta$, and have images through $f$ given analytically by
$$ f(x) = A x = \cos\theta \, a_1 + \sin\theta \, a_2, \qquad A = [\, a_1 \ \ a_2 \,], $$
and correspond to ellipses.
∴ using LinearAlgebra, PyPlot
∴ n=250; h=2.0*π/n; θ=h*(1:n); c=cos.(θ); s=sin.(θ);   # points on the unit circle
∴ a1=[2; 3]; a2=[-1; 1]; A=[a1 a2]                     # displays A = [2 -1; 3 1]
∴ fx = c.*a1[1]+s.*a2[1]; fy = c.*a1[2]+s.*a2[2];      # image of the circle through x ↦ A*x
∴ clf(); grid("on"); plot(c,s); axis("equal");
∴ plot(fx,fy,"r");
∴ F=svd(A); U=F.U; Σ=Diagonal(F.S); Vt=F.Vt; V=Vt';
∴ σ1=Σ[1,1]; σ2=Σ[2,2];
∴ z=[0; 0]; u1=σ1*[z U[:,1]]; u2=σ2*[z U[:,2]];        # left singular vectors scaled by σ
∴ v1=[z V[:,1]]; v2=[z V[:,2]];                        # right singular vectors (unit length)
∴ cd(homedir()*"/courses/MATH661/images");
∴ plot(u1[1,:],u1[2,:],"r");
∴ plot(u2[1,:],u2[2,:],"r");
∴ plot(v1[1,:],v1[2,:],"b");
∴ plot(v2[1,:],v2[2,:],"b");
∴ savefig("L08Fig01.eps")
From the above, the mapping associated with $A$ amplifies some directions more than others. This suggests defining the size of a matrix or mapping by the maximal amplification of unit-norm vectors within the domain,
$$ \|A\| = \max_{\|x\| = 1} \|A x\| . $$
In the above, any vector norm can be used within the domain and codomain.
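A minimal sketch of this definition: the maximal amplification is estimated by sampling random directions and compared with the induced 2-norm computed by opnorm. The sample size and the reuse of the matrix from the session above are illustrative choices.
∴ using LinearAlgebra, Random
∴ Random.seed!(0); A=[2.0 -1.0; 3.0 1.0]                # the matrix plotted above
∴ amp(x)=norm(A*x)/norm(x)                              # amplification of a nonzero vector
∴ println(maximum(amp(randn(2)) for _ in 1:100_000))    # sampled maximal amplification
∴ println(opnorm(A, 2))                                 # induced 2-norm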
The fundamental theorem of linear algebra partitions the domain and codomain of a linear mapping $f: U \to V$. For the real vector spaces $U = \mathbb{R}^n$, $V = \mathbb{R}^m$, the partition properties are stated in terms of the fundamental subspaces of the associated matrix $A \in \mathbb{R}^{m \times n}$ as
$$ \mathbb{R}^m = C(A) \oplus N(A^T), \quad C(A) \perp N(A^T), \qquad \mathbb{R}^n = C(A^T) \oplus N(A), \quad C(A^T) \perp N(A) . $$
The common dimension of the column and row spaces, $\dim C(A) = \dim C(A^T) = r$, is the rank of the matrix, $\dim N(A) = n - r$ is the nullity of $A$, and $\dim N(A^T) = m - r$ is the nullity of $A^T$. An infinite number of bases could be defined for the domain and codomain. It is of great theoretical and practical interest to define bases with properties that facilitate insight or computation.
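These dimensions are easy to check numerically; a brief sketch with an arbitrary rank-deficient 2×3 matrix:
∴ using LinearAlgebra
∴ A=[1.0 2.0 3.0; 2.0 4.0 6.0]        # rank-deficient example (second row is twice the first)
∴ m,n=size(A); r=rank(A)
∴ println((r, n-r, m-r))              # rank, nullity of A, nullity of A'
∴ println(size(nullspace(A),2))       # dim N(A)  = n - r
∴ println(size(nullspace(A'),2))      # dim N(A') = m - r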
The above partitions of the domain and codomain are orthogonal, and suggest searching for orthogonal bases within these subspaces. Introduce a matrix representation for the bases
$$ U = [\, u_1 \ u_2 \ \dots \ u_m \,] \in \mathbb{R}^{m \times m}, \qquad V = [\, v_1 \ v_2 \ \dots \ v_n \,] \in \mathbb{R}^{n \times n}, $$
with $u_i \in \mathbb{R}^m$ and $v_j \in \mathbb{R}^n$. Orthogonality between columns $u_i$, $u_j$ for $i \neq j$ is expressed as $u_i^T u_j = 0$. For $i = j$, the inner product is positive, $u_i^T u_i > 0$, and since scaling of the columns of $U$ preserves the spanning property, it is convenient to impose $u_i^T u_i = 1$. Such behavior is concisely expressed as a matrix product
$$ U^T U = I_m, $$
with $I_m$ the identity matrix in $\mathbb{R}^{m \times m}$ (and similarly $V^T V = I_n$). Expanded in terms of the column vectors of $U$, the first equality is
$$ U^T U = \begin{pmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_m^T \end{pmatrix} [\, u_1 \ u_2 \ \dots \ u_m \,] = \begin{pmatrix} u_1^T u_1 & \dots & u_1^T u_m \\ \vdots & & \vdots \\ u_m^T u_1 & \dots & u_m^T u_m \end{pmatrix} = I_m . $$
It is useful to determine if a matrix $X$ exists such that $U X = I_m$, or
$$ U X = U [\, x_1 \ x_2 \ \dots \ x_m \,] = [\, e_1 \ e_2 \ \dots \ e_m \,] = I_m . $$
The columns of $X$ are the coordinates of the column vectors of $I_m$ in the basis $U$, and can readily be determined,
$$ U x_j = e_j \;\Rightarrow\; U^T U x_j = U^T e_j \;\Rightarrow\; x_j = U^T e_j, $$
where $e_j$ is the $j$th column of $I_m$, hence $X = U^T$, leading to
$$ U U^T = I_m . $$
Note that the two equalities $U^T U = I_m$ and $U U^T = I_m$ state that the transpose of $U$ is both a left and a right inverse; together with the normalization condition on the columns, they characterize $U$, $V$ as orthogonal matrices.
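These identities can be verified numerically; a minimal sketch, where the orthogonal matrix is obtained (as an arbitrary choice) from the QR factorization of a random matrix:
∴ using LinearAlgebra, Random
∴ Random.seed!(0); Q=Matrix(qr(randn(4,4)).Q)   # an orthogonal 4×4 matrix
∴ println(norm(Q'*Q - I))                       # transpose is a left inverse
∴ println(norm(Q*Q' - I))                       # transpose is also a right inverse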
Given a linear mapping $f: U \to V$, expressed as $y = f(x) = A x$, the simplest description of the action of $f$ would be a simple scaling, as exemplified by $g(x) = \alpha x$ that has $\alpha I$ as its associated matrix. Recall that specification of a vector $x$ is typically done in terms of the identity matrix, $x = I x$, but may be more insightfully given in some other basis $V$, $x = V c$. This suggests that especially useful bases for the domain and codomain would reduce the action of a linear mapping to scaling along orthogonal directions, and evaluate $y = A x$ by first re-expressing $x$ in another basis $V$, $x = V c$, and re-expressing $y$ in another basis $U$, $y = U d$. The condition that the linear operator reduces to simple scaling in these new bases is expressed as $d_i = \sigma_i c_i$ for $i = 1, \dots, \min(m, n)$, with $\sigma_i$ the scaling coefficients along each direction, which can be expressed as a matrix-vector product $d = \Sigma c$, where $\Sigma$ is of the same dimensions as $A$ and given by
$$ \Sigma = \begin{pmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_p \end{pmatrix} \in \mathbb{R}^{m \times n}, \qquad p = \min(m, n), $$
with zero entries off the diagonal. Imposing the condition that $U$, $V$ are orthogonal leads to
$$ c = V^T x, \qquad d = U^T y = U^T A x, $$
which can be replaced into $d = \Sigma c$ to obtain
$$ U^T A x = \Sigma V^T x \quad \text{for all } x \in \mathbb{R}^n . $$
From the above, the orthogonal bases and scaling coefficients that are sought must satisfy $U^T A V = \Sigma$, or $A = U \Sigma V^T$.
Such orthogonal bases and scaling coefficients always exist: any matrix $A \in \mathbb{R}^{m \times n}$ has a singular value decomposition (SVD)
$$ A = U \Sigma V^T, $$
with properties:
$U \in \mathbb{R}^{m \times m}$ is an orthogonal matrix, $U^T U = U U^T = I_m$;
$V \in \mathbb{R}^{n \times n}$ is an orthogonal matrix, $V^T V = V V^T = I_n$;
$\Sigma \in \mathbb{R}^{m \times n}$ is diagonal, $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_p)$, $p = \min(m, n)$, and $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_p \geq 0$.
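These properties can be checked numerically; a brief sketch using Julia's built-in svd applied to the 2×2 matrix from the session above (the choice of matrix is purely illustrative):
∴ using LinearAlgebra
∴ A=[2.0 -1.0; 3.0 1.0]
∴ F=svd(A); U=F.U; σ=F.S; V=F.V
∴ println(norm(U*Diagonal(σ)*V' - A))             # reconstruction A = U Σ V'
∴ println(norm(U'*U - I), " ", norm(V'*V - I))    # U and V are orthogonal
∴ println(σ)                                      # singular values ordered σ1 ≥ σ2 ≥ 0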
Proof. The proof of the SVD makes use of properties of the norm, concepts from analysis, and complete induction. Adopting the 2-norm, set
$$ \sigma_1 = \|A\|_2 = \max_{\|x\|_2 = 1} \|A x\|_2 . $$
The domain $\{ x \in \mathbb{R}^n : \|x\|_2 = 1 \}$ is compact (closed and bounded), and the extreme value theorem implies that $\|A x\|_2$ attains its maxima and minima, hence there must exist some vectors of unit norm $u_1 \in \mathbb{R}^m$, $v_1 \in \mathbb{R}^n$ such that $A v_1 = \sigma_1 u_1$. Introduce orthogonal bases $U_1$, $V_1$ for $\mathbb{R}^m$, $\mathbb{R}^n$ whose first column vectors are $u_1$, $v_1$, and compute
$$ A_1 = U_1^T A V_1 = \begin{pmatrix} \sigma_1 & w^T \\ 0 & B \end{pmatrix} . $$
In the above, $w^T$ is a row vector with components $w_j$, $j = 1, \dots, n-1$, and must be zero for $v_1$ to be the direction along which the maximum norm is obtained. Introduce vectors
$$ y = \begin{pmatrix} \sigma_1 \\ w \end{pmatrix}, \qquad z = A_1 y = \begin{pmatrix} \sigma_1^2 + w^T w \\ B w \end{pmatrix}, $$
and note that $\|z\|_2 \geq \sigma_1^2 + w^T w = (\sigma_1^2 + w^T w)^{1/2} \, \|y\|_2$. From $\|A_1\|_2 = \|U_1^T A V_1\|_2 = \|A\|_2 = \sigma_1$ it results that $w = 0$. By induction, assume that $B$ has a singular value decomposition, $B = U_2 \Sigma_2 V_2^T$, such that
$$ U_1^T A V_1 = \begin{pmatrix} \sigma_1 & 0 \\ 0 & U_2 \Sigma_2 V_2^T \end{pmatrix}, $$
and the orthogonal matrices arising in the singular value decomposition of $A$ are
$$ U = U_1 \begin{pmatrix} 1 & 0 \\ 0 & U_2 \end{pmatrix}, \qquad V = V_1 \begin{pmatrix} 1 & 0 \\ 0 & V_2 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix} . $$
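The structure appearing in the proof can be illustrated numerically; in this sketch the SVD bases themselves are used as the orthogonal bases $U_1$, $V_1$ (an admissible choice, since their first columns are $u_1$, $v_1$):
∴ using LinearAlgebra
∴ A=[2.0 -1.0; 3.0 1.0]; F=svd(A)
∴ println(F.S[1] ≈ opnorm(A, 2))                    # σ1 is the 2-norm of A
∴ A1=F.U'*A*F.V                                     # U1' A V1 with first columns u1, v1
∴ println(norm(A1 - Diagonal(F.S)))                 # the block w is zero, as in the proof
∴ println(A1[2,2] ≈ opnorm(A1[2:end,2:end], 2))     # σ2 is the 2-norm of the submatrix B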
The scaling coefficients $\sigma_1, \dots, \sigma_p$ are called the singular values of $A$. The columns of $U$ are called the left singular vectors, and those of $V$ are called the right singular vectors.
The fact that the scaling coefficients are norms of $A$ and of submatrices arising in the proof, $\sigma_1 = \|A\|_2$, $\sigma_2 = \|B\|_2, \dots$, is of crucial importance in applications. Carrying out the computation of the matrix products
$$ A = U \Sigma V^T = [\, u_1 \ u_2 \ \dots \ u_m \,] \, \Sigma \begin{pmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_n^T \end{pmatrix} $$
leads to a representation of $A$ as a sum
$$ A = \sum_{i=1}^{r} \sigma_i \, u_i v_i^T , \qquad r = \operatorname{rank}(A) . $$
Each product $u_i v_i^T$ is a matrix of rank one, and $\sigma_i u_i v_i^T$ is called a rank-one update. Truncation of the above sum to $k \leq r$ terms leads to an approximation of $A$,
$$ A \cong A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^T . $$
In very many cases the singular values exhibit rapid, exponential decay, $\sigma_1 \gg \sigma_2 \gg \dots$, such that the approximation above is an accurate representation of the matrix $A$ even for $k \ll r$.
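A sketch of such a truncation; the test matrix, sampled from a smooth Gaussian kernel so that its singular values decay rapidly, is an illustrative choice:
∴ using LinearAlgebra
∴ m=100; t=(0:m-1)/(m-1);
∴ A=[exp(-(ti-sj)^2) for ti in t, sj in t];          # smooth kernel: rapidly decaying σ
∴ F=svd(A); k=10;
∴ Ak=F.U[:,1:k]*Diagonal(F.S[1:k])*F.Vt[1:k,:];      # sum of the first k rank-one updates
∴ println(opnorm(A-Ak, 2), " ", F.S[k+1])            # 2-norm error equals σ_(k+1)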
The SVD can be used to solve common problems within linear algebra. For example, the solution of the linear system $A x = b$ (in the least-squares sense when $A$ is not square) is obtained through the following steps:
1. Compute the SVD, $A = U \Sigma V^T$;
2. Find the coordinates of $b$ in the orthogonal basis $U$, $c = U^T b$;
3. Scale the coordinates by the inverse of the singular values, $d_i = c_i / \sigma_i$ for $\sigma_i \neq 0$, such that $\Sigma d = c$ is satisfied;
4. Find the coordinates of $x$ in the basis $V$, $x = V d$.
Combined, the steps give $x = V \Sigma^{+} U^T b$, where the matrix $\Sigma^{+} \in \mathbb{R}^{n \times m}$ (notice the inversion of dimensions) is defined as a diagonal matrix with elements $1/\sigma_i$ on the diagonal for $\sigma_i \neq 0$ (and zero otherwise), and is called the pseudo-inverse of $\Sigma$. Similarly, the matrix
$$ A^{+} = V \Sigma^{+} U^T \in \mathbb{R}^{n \times m} $$
that allows stating the solution of $A x = b$ simply as $x = A^{+} b$ is called the pseudo-inverse of $A$. Note that in practice $A^{+}$ is not explicitly formed. Rather, the notation $A^{+}$ is simply a concise reference to carrying out steps 1-4 above, as sketched in the example below.
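A sketch of steps 1-4 for an arbitrary overdetermined system, compared with Julia's built-in pseudo-inverse and least-squares solve (the system here is a random example, not from the text):
∴ using LinearAlgebra, Random
∴ Random.seed!(0); A=randn(5,3); b=randn(5);     # arbitrary overdetermined system
∴ F=svd(A)                                       # step 1: A = U Σ V' (thin SVD)
∴ c=F.U'*b                                       # step 2: coordinates of b in basis U
∴ d=c./F.S                                       # step 3: scale by inverse singular values
∴ x=F.V*d                                        # step 4: coordinates of x in basis V
∴ println(norm(x - pinv(A)*b))                   # agrees with the pseudo-inverse solution
∴ println(norm(x - A\b))                         # and with the least-squares solution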