MATH347DS

Matrix Vector Subspaces

Synopsis. Operations with vectors have been formally defined by the vector space algebraic structure, and the idea of obtaining new vectors by linear combination has been concisely stated as matrix-vector multiplication. Matrices also describe mappings between vector spaces that preserve linear combinations. Can all vectors of interest be obtained by linear combination of some set of vectors? This is a natural question to ask, and is answered through the concept of vector subspaces associated with a matrix.

1.Vector subspaces

1.1.Vectors reachable by linear combination

A central interest in data science is to seek simple descriptions of complex objects. A typical situation is that many instances of some object of interest are initially given as an $m$ -tuple $𝒃 \in ℝ^{m}$ with large $m$ . Assuming that addition and scaling of such objects can be cogently defined, a vector space is obtained, say over the field of reals with a Euclidean distance, $E_{m}$ . Examples include recordings of medical data (electroencephalograms, electrocardiograms), sound recordings, or images, for which $m$ can easily reach into the millions. A natural question to ask is whether all the $m$ real numbers are actually needed to describe the observed objects. Perhaps instead of specifying all the components of $𝒃 \in ℝ^{m}$ , it might be possible to state that $𝒃$ is a linear combination of $n < m$ vectors

𝒃 = 𝑨 𝒙 = [\begin{array}{cccc} 𝒂_{1} & 𝒂_{2} & \dots & 𝒂_{n} \end{array}] [\begin{array}{c} x_{1} \\ x_{2} \\ ⋮ \\ x_{n} \end{array}] = x_{1} 𝒂_{1} + \dots + x_{n} 𝒂_{n} .

It is then natural to formally define the set of all vectors $𝒃$ that could thus be expressed, meaning that they can be reached by linear combination of the columns of $𝑨$ .

Definition. In vector space $𝒱 = (V, S, +, \cdot)$ , the span of vectors $𝒂_{1}, 𝒂_{2}, \dots, 𝒂_{n} \in V,$ is the set of vectors reachable by linear combination

$span {𝒂_{1}, 𝒂_{2}, \dots, 𝒂_{n}} = {𝒃 \in V | \exists x_{1}, \dots, x_{n} \in S such that 𝒃 = x_{1} 𝒂_{1} + \dots + x_{n} 𝒂_{n}} .$
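Whether a given vector belongs to a span can be tested numerically. The sketch below is one possible check, not a procedure prescribed in these notes: it relies on the fact that appending $𝒃$ to the columns of $𝑨$ leaves the rank unchanged exactly when $𝒃$ is a linear combination of those columns. The helper name `inspan` and the sample data are assumptions made for illustration, and `rank` from the LinearAlgebra standard library decides with a floating-point tolerance.

```julia
using LinearAlgebra

# Hypothetical helper: true when b is (numerically) a linear combination
# of the columns of A, i.e., appending b does not increase the rank.
inspan(A, b) = rank([A b]) == rank(A)

A = [1 0; 0 1; 0 0]                # columns a1, a2 of A, vectors in R^3
x = [2, -3]                        # scaling coefficients x1, x2
b = A*x                            # b = x1*a1 + x2*a2, reachable by construction
b == x[1]*A[:,1] + x[2]*A[:,2]     # matrix-vector product as a linear combination
inspan(A, b)                       # b is in the span of the columns
inspan(A, [0, 0, 1])               # a nonzero third component cannot be produced
```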

What now is the relationship between this set of reachable vectors and the entire vector space? The mathematical transcription of this question leads to a consideration of another algebraic structure.

Definition. $𝒰 = (U, S, +, \cdot)$ , $U \neq \emptyset$ , is a vector subspace of vector space $𝒱 = (V, S, +, \cdot)$ over the same field of scalars S, denoted by $𝒰 \leq 𝒱$ , if $U \subseteq V$ and $\forall a, b \in S$ , $\forall 𝒖, 𝒗 \in U$ , the linear combination $a 𝒖 + b 𝒗 \in U$ .

The above states that a vector subspace must be closed under linear combination, and have the same vector addition and scaling operations as the enclosing vector space. The simplest vector subspace of a vector space is the null subspace that only contains the null element, $U = {𝟎}$ . In fact, any subspace must contain the null element $𝟎$ , or otherwise closure would not be verified for the particular linear combination $𝒖 + (- 𝒖) = 𝟎$ . One can think of $𝒵 = ({𝟎}, S, +, \cdot)$ as the smallest subspace of a vector space. By the above definition, $𝒱$ is also a subspace of itself, intuitively the largest subspace. If $U \subset V$ , then $𝒰$ is said to be a proper subspace of $𝒱$ , denoted by $𝒰 < 𝒱$ .

Setting $m - n$ components equal to zero in the real space $ℛ_{m}$ defines a proper subspace whose elements can be placed into a one-to-one correspondence with the vectors within $ℛ_{n}$ . For example, setting component $m$ of $𝒙 \in ℝ^{m}$ equal to zero gives $𝒙 = {[\begin{array}{ccccc} x_{1} & x_{2} & \dots & x_{m - 1} & 0 \end{array}]}^{T}$ that, while not a member of $ℝ^{m - 1}$ , is in a one-to-one relation with $𝒙^{'} = {[\begin{array}{cccc} x_{1} & x_{2} & \dots & x_{m - 1} \end{array}]}^{T} \in ℝ^{m - 1}$ . Dropping the last component of $𝒚 \in ℝ^{m}$ , $𝒚 = {[\begin{array}{ccccc} y_{1} & y_{2} & \dots & y_{m - 1} & y_{m} \end{array}]}^{T}$ gives vector $𝒚^{'} = {[\begin{array}{cccc} y_{1} & y_{2} & \dots & y_{m - 1} \end{array}]}^{T} \in ℝ^{m - 1}$ , but this is no longer a one-to-one correspondence since for some given $𝒚^{'}$ , the last component $y_{m}$ could take any value.

∴	m=3; x=[1; 2; 0]; xp=x[1:2]

$[\begin{array}{c} 1 \\ 2 \end{array}]$ (1)

∴	y=[1; 2; 3]; yp=y[1:2]

$[\begin{array}{c} 1 \\ 2 \end{array}]$ (2)

∴	[xp==yp x==y]

$[\begin{array}{cc} true & false \end{array}]$ (3)
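Closure under linear combination can be spot-checked numerically for the set of vectors with zero last component. The following is only an illustrative sketch: the predicates `inU`, `inS` and the sample vectors and scalars are assumptions made for the example, and a finite number of trials does not replace a proof.

```julia
# Hypothetical membership test for U = {x in R^3 : x_3 = 0}
inU(x) = x[end] == 0

u = [1, 2, 0]; v = [-4, 5, 0]     # two elements of U
a, b = 3, -2                      # arbitrary scalars
inU(a*u + b*v)                    # the linear combination stays in U
inU(u + (-1)*u)                   # the null vector belongs to U

# By contrast, S = {x in R^3 : x_3 = 1} is not closed under addition
inS(x) = x[end] == 1
p = [1, 2, 1]; q = [0, 3, 1]      # two elements of S
inS(p + q)                        # the sum has last component 2, so it leaves S
```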

1.2.Vector space composition

Vector subspaces arise from the decomposition of a vector space, the idea of breaking up a complex object into component parts. The converse, composition of vector spaces $𝒰 = (U, S, +, \cdot)$ , $𝒱 = (V, S, +, \cdot)$ , is also defined in terms of linear combination. A vector $𝒙 \in ℝ^{3}$ can be obtained as the linear combination

𝒙 = [\begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array}] = [\begin{array}{c} x_{1} \\ 0 \\ 0 \end{array}] + [\begin{array}{c} 0 \\ x_{2} \\ x_{3} \end{array}],

but also as

𝒙 = [\begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array}] = [\begin{array}{c} x_{1} \\ x_{2} - a \\ 0 \end{array}] + [\begin{array}{c} 0 \\ a \\ x_{3} \end{array}],

for some arbitrary $a \in ℝ$ . In the first case, $𝒙$ is obtained as a unique linear combination of a vector from the set $U = {{[\begin{array}{ccc} x_{1} & 0 & 0 \end{array}]}^{T} | x_{1} \in ℝ}$ with a vector from $V = {{[\begin{array}{ccc} 0 & x_{2} & x_{3} \end{array}]}^{T} | x_{2}, x_{3} \in ℝ}$ . In the second case, there is an infinity of linear combinations of a vector from $V$ with another from $W = {{[\begin{array}{ccc} x_{1} & x_{2} & 0 \end{array}]}^{T} | x_{1}, x_{2} \in ℝ}$ that yield the vector $𝒙$ . This is captured by a pair of definitions to describe the two types of vector space composition.

Definition. Given two vector subspaces $𝒰 = (U, S, +, \cdot)$ , $𝒱 = (V, S, +, \cdot)$ of the space $𝒲 = (W, S, +, \cdot)$ , the direct sum is the vector space $𝒰 \oplus 𝒱 = (U \oplus V, S, +, \cdot)$ , where the direct sum of the two sets of vectors $U, V$ is $U \oplus V = {𝒘 \in W : \exists! 𝒖 \in U, \exists! 𝒗 \in V such that 𝒘 = 𝒖 + 𝒗}$ (unique decomposition).

Definition. Given two vector subspaces $𝒰 = (U, S, +, \cdot)$ , $𝒱 = (V, S, +, \cdot)$ of the space $𝒲 = (W, S, +, \cdot)$ , the sum is the vector space $𝒰 + 𝒱 = (U + V, S, +, \cdot)$ , where the sum of the two sets of vectors $U, V$ is $U + V = {𝒖 + 𝒗 : 𝒖 \in U, 𝒗 \in V}$ .

Since the same scalar field, vector addition, and scaling are used, it is convenient to refer to vector space sums simply by the sum of the vector sets $U + V$ , or $U \oplus V$ , instead of specifying the full 4-tuple for each space. This shall be adopted henceforth to simplify the notation.

∴	u=[1; 0; 0]; v=[0; 2; 3]; vp=[0; 1; 3]; w=[1; 1; 0]; [u+v vp+w]

$[\begin{array}{cc} 1 & 1 \\ 2 & 2 \\ 3 & 3 \end{array}]$ (4)
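The non-uniqueness of the decomposition over $V + W$ can be made explicit by varying the parameter $a$ . A small sketch, in which the vector `x`, the helper `decomp`, and the sampled values of `a` are arbitrary choices made for illustration:

```julia
x = [1, 2, 3]

# Hypothetical helper returning the pair (w, v') for a given a,
# with w in W (zero third component) and v' in V (zero first component)
decomp(a) = ([x[1], x[2]-a, 0], [0, a, x[3]])

for a in (0, 1, 2.5)
    w, vp = decomp(a)
    println("a = ", a, ": w = ", w, ", v' = ", vp, ", w + v' == x: ", w + vp == x)
end
```

Each value of $a$ gives a different pair of vectors with the same sum, whereas the decomposition over $U + V$ leaves no such freedom.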

In the above computational example, the essential difference between the two ways to express $𝒙 \in ℝ^{3}$ is that $U \cap V = {𝟎}$ , but $V \cap W = {{[\begin{array}{ccc} 0 & a & 0 \end{array}]}^{T} : a \in ℝ} \neq {𝟎}$ . In general, if the zero vector is the only common element of two vector subspaces, then their sum is a direct sum. The set of common elements of two vector subspaces can also be formally defined.

Definition. Given two vector subspaces $(U, S, +, \cdot)$ , $(V, S, +, \cdot)$ of the space $(W, S, +, \cdot)$ , the intersection is the set $U \cap V = {𝒙 : 𝒙 \in U, 𝒙 \in V} .$
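When subspaces are given through spanning vectors, the size of the intersection can be read off from ranks, using the standard identity $\dim (U \cap V) = \dim U + \dim V - \dim (U + V)$ , which is quoted here without derivation. The sketch below applies it to the sets $U, V, W$ of the preceding example; the matrices `BU`, `BV`, `BW` holding spanning vectors as columns and the helper `dimcap` are names introduced only for this illustration.

```julia
using LinearAlgebra

BU = reshape([1, 0, 0], 3, 1)   # column spans U
BV = [0 0; 1 0; 0 1]            # columns span V
BW = [1 0; 0 1; 0 0]            # columns span W

# Hypothetical helper: dimension of the intersection of two column spans
dimcap(B1, B2) = rank(B1) + rank(B2) - rank([B1 B2])

dimcap(BU, BV)    # 0: only the zero vector is shared, so U + V is a direct sum
dimcap(BV, BW)    # 1: a line of common vectors, so V + W is not a direct sum
```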

In practice, the most important procedure to construct direct sums or to check when an intersection of two vector subspaces reduces to the zero vector is through an inner product.

Definition. Two vector subspaces $U, V$ of the real vector space $ℝ^{m}$ are orthogonal, denoted as $U ⊥ V$ if $𝒖^{T} 𝒗 = 0$ for any $𝒖 \in U, 𝒗 \in V$ .

Definition. Two vector subspaces $U, V$ of $U + V$ are orthogonal complements, denoted $U = V^{⊥}$ , $V = U^{⊥}$ if they are orthogonal subspaces, $U ⊥ V$ , and $U \cap V = {𝟎}$ , i.e., the null vector is the only common element of both subspaces.

Continuing the above computational example where the same vector was obtained through two different linear combinations, $𝒛 = 𝒖 + 𝒗 = 𝒗^{'} + 𝒘$ , the essential difference between the two is that $𝒖$ is orthogonal to $𝒗$ , whereas $𝒗^{'}$ is not orthogonal to $𝒘$ .

∴	[u'v vp'w]

$[\begin{array}{cc} 0 & 1 \end{array}]$ (5)
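The same check extends from individual vectors to entire subspaces. Since the inner product is linear in each argument, it suffices to test the spanning vectors against one another. A minimal sketch, in which the matrices of spanning vectors, the helper `isorth`, and the tolerance are assumptions of this example:

```julia
BU = reshape([1, 0, 0], 3, 1)   # column spans U
BV = [0 0; 1 0; 0 1]            # columns span V
BW = [1 0; 0 1; 0 0]            # columns span W

# The entries of B1'*B2 are all pairwise inner products of spanning vectors
isorth(B1, B2; tol=1e-12) = all(abs.(B1'*B2) .<= tol)

isorth(BU, BV)   # U is orthogonal to V
isorth(BV, BW)   # V is not orthogonal to W
```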

2.Vector subspaces of a linear mapping

2.1.Matrix column and left null spaces

The wide-ranging utility of linear algebra essentially results from a complete characterization of the behavior of a linear mapping between vector spaces $𝒇 : U \to V$ , $𝒇 (a 𝒖 + b 𝒗) = a 𝒇 (𝒖) + b 𝒇 (𝒗)$ . For some given linear mapping the questions that arise are:

Can any vector within $V$ be obtained by evaluation of $𝒇$ ?
Is there a single way that a vector within $V$ can be obtained by evaluation of $𝒇$ ?

Linear mappings between real vector spaces $𝒇 : ℝ^{n} \to ℝ^{m}$ have been seen to be completely specified by a matrix $𝑨 \in ℝ^{m \times n}$ . It is common to frame the above questions about the behavior of the linear mapping $𝒇 (𝒙) = 𝑨 𝒙$ through sets associated with the matrix $𝑨$ . To frame an answer to the first question, a set of reachable vectors is first defined.

Definition. The column space (or range) of matrix $𝑨 \in ℝ^{m \times n}$ is the set of vectors reachable by linear combination of the matrix column vectors

$C (𝑨) = range (𝑨) = {𝒃 \in ℝ^{m} : \exists 𝒙 \in ℝ^{n} such that 𝒃 = 𝑨 𝒙} .$

By definition, the column space is included in the co-domain of the function $𝒇 (𝒙) = 𝑨 𝒙$ , $C (𝑨) \subseteq ℝ^{m}$ , and is readily seen to be a vector subspace of $ℝ^{m}$ . Having defined the set of vectors reachable by linear combination, two questions arise:

Is the column space the entire co-domain, $C (𝑨) = ℝ^{m}$ ? This would signify that any vector can be reached by linear combination of columns of $𝑨$ .
What co-domain vectors are not reachable by linear combination of columns of $𝑨$ ?
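The first of these questions can be explored numerically. One hedged way to do so uses the `rank` function from Julia's LinearAlgebra standard library: the column space fills the co-domain when the rank equals the number of rows $m$ . The helper name `fills_codomain` and the sample matrices are hypothetical.

```julia
using LinearAlgebra

# Hypothetical helper: true when C(A) is all of R^m, with m the number of rows
fills_codomain(A) = rank(A) == size(A, 1)

fills_codomain([1 0; 0 1; 0 0])    # two columns cannot reach all of R^3
fills_codomain([1 0 2; 0 1 3])     # these columns reach all of R^2
```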

Consider the orthogonal complement of $C (𝑨)$ defined as the set of vectors $𝒚$ orthogonal to all of the column vectors of $𝑨 = [\begin{array}{cccc} 𝒂_{1} & 𝒂_{2} & \dots & 𝒂_{n} \end{array}]$ , expressed through inner products as

𝒂_{1}^{T} 𝒚 = 0, 𝒂_{2}^{T} 𝒚 = 0, \dots, 𝒂_{n}^{T} 𝒚 = 0 .

This can be expressed more concisely through the transpose operation

𝑨 = [\begin{array}{cccc} 𝒂_{1} & 𝒂_{2} & \dots & 𝒂_{n} \end{array}], 𝑨^{T} 𝒚 = [\begin{array}{c} 𝒂_{1}^{T} \\ 𝒂_{2}^{T} \\ ⋮ \\ 𝒂_{n}^{T} \end{array}] 𝒚 = [\begin{array}{c} 𝒂_{1}^{T} 𝒚 \\ 𝒂_{2}^{T} 𝒚 \\ ⋮ \\ 𝒂_{n}^{T} 𝒚 \end{array}],

and leads to the definition of a set of vectors for which $𝑨^{T} 𝒚 = 𝟎$ .

Definition. The left null space (or cokernel) of a matrix $𝑨 \in ℝ^{m \times n}$ is the set

$N (𝑨^{T}) = null (𝑨^{T}) = {𝒚 \in ℝ^{m} : 𝑨^{T} 𝒚 = 𝟎} .$

Note that the left null space is also a vector subspace of the co-domain of $𝒇 (𝒙) = 𝑨 𝒙$ , $N (𝑨^{T}) \subseteq ℝ^{m}$ . The above definitions suggest that both the matrix and its transpose play a role in characterizing the behavior of the linear mapping $𝒇 (𝒙) = 𝑨 𝒙$ , so analogous sets are defined for the transpose $𝑨^{T}$ .

Definition. The row space (or corange) of a matrix $𝑨 \in ℝ^{m \times n}$ is the set

$R (𝑨) = C (𝑨^{T}) = range (𝑨^{T}) = {𝒄 \in ℝ^{n} : \exists 𝒚 \in ℝ^{m} such that 𝒄 = 𝑨^{T} 𝒚} \subseteq ℝ^{n} .$

Definition. The null space of a matrix $𝑨 \in ℝ^{m \times n}$ is the set

$N (𝑨) = null (𝑨) = {𝒙 \in ℝ^{n} : 𝑨 𝒙 = 𝟎} \subseteq ℝ^{n}$
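The two null spaces can be computed for a concrete matrix with the `nullspace` function of Julia's LinearAlgebra standard library. The sketch below checks the defining conditions $𝑨 𝒙 = 𝟎$ and $𝑨^{T} 𝒚 = 𝟎$ on the returned spanning vectors; the matrix is chosen arbitrarily and the tolerance `1e-12` is an assumption.

```julia
using LinearAlgebra

A = [1 2; 2 4; 0 0]          # m = 3, n = 2, second column is twice the first

NA  = nullspace(A)           # columns span N(A), vectors with n = 2 components
NAt = nullspace(A')          # columns span N(A^T), vectors with m = 3 components

size(NA), size(NAt)          # number of components and of spanning vectors
norm(A*NA)   < 1e-12         # A x = 0 holds for the spanning vectors of N(A)
norm(A'*NAt) < 1e-12         # A^T y = 0: each y is orthogonal to all columns of A
```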

2.2.Geometric description of subspaces

The concepts of Euclidean geometry are widely used to characterize subspaces of a vector space. Consider the familiar example of $ℰ_{2} = (ℝ^{2}, ℝ, +, \cdot)$ the Euclidean 2-space or plane,

ℝ^{2} = {[\begin{array}{l} x_{1} \\ x_{2} \end{array}] : x_{1}, x_{2} \in ℝ} .

As is common the vector space representing the plane is referred to either by its full name $ℰ_{2}$ or in shorthand form as $ℝ^{2}$ . The trivial subspaces of $ℝ^{2}$ are the zero vector space $𝒵 = ({𝟎}, ℝ, +, \cdot)$ , and $ℝ^{2}$ itself. Any line passing through the origin is a non-trivial subspace $ℒ = (L, ℝ, +, \cdot)$ with

L_{p, q} = {[\begin{array}{l} a p \\ a q \end{array}] : a \in ℝ} .

In particular the $x_{1}, x_{2}$ axes are

L_{1, 0} = {[\begin{array}{l} a \\ 0 \end{array}] : a \in ℝ}, L_{0, 1} = {[\begin{array}{l} 0 \\ a \end{array}] : a \in ℝ},

respectively. In $ℝ^{3}$ , the non-trivial subspaces are lines passing through the origin

L_{p, q, r} = {[\begin{array}{l} a p \\ a q \\ a r \end{array}] : a \in ℝ},

and planes passing through the origin

P_{𝒏} = {[\begin{array}{l} x_{1} \\ x_{2} \\ x_{3} \end{array}] : 𝒏^{T} 𝒙 = n_{1} x_{1} + n_{2} x_{2} + n_{3} x_{3} = 0, x_{1}, x_{2}, x_{3} \in ℝ},

where $𝒏$ is the normal vector of the plane. An intuitive understanding of subspace geometry is essential and built up from instructive computational examples.
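Since the plane $P_{𝒏}$ is the set of vectors orthogonal to $𝒏$ , a spanning set can be obtained numerically as the null space of the one-row matrix $𝒏^{T}$ . A sketch with an arbitrarily chosen normal vector, using `nullspace` from Julia's LinearAlgebra standard library; the coefficients of the linear combination and the tolerance are likewise assumptions:

```julia
using LinearAlgebra

n = [1.0, 1.0, 1.0]          # normal vector, chosen for illustration
P = nullspace(n')            # columns of P span the plane n'x = 0

size(P)                      # two spanning vectors, each with three components
x = P*[2.0, -1.0]            # an arbitrary linear combination of the spanning vectors
abs(n'*x) < 1e-12            # x satisfies the plane equation
```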

Examples. Consider a linear mapping between real spaces $𝒇 : ℝ^{n} \to ℝ^{m}$ , defined by $𝒚 = 𝒇 (𝒙) = 𝑨 𝒙 = {[\begin{array}{ccc} y_{1} & \dots & y_{m} \end{array}]}^{T}$ , with $𝑨 \in ℝ^{m \times n}$ . Julia's LinearAlgebra standard library provides the nullspace function to return a set of vectors that span a null space. A function colspace to provide a set of vectors to span the column space is not yet in the general libraries, but can be readily defined, together with a function to display numerical results to a default precision of $p = 6$ digits.

∴	using LinearAlgebra

∴	function colspace(A,p=6) return round.(Matrix(qr(A).Q)[:,1:rank(A)],digits=p) end;

∴	short(x) = round(x,digits=6);

∴	short(pi)

$3.141593$
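The `colspace` definition above keeps the first `rank(A)` columns of the $𝑸$ factor of an unpivoted QR factorization. These span $C (𝑨)$ when the leading `rank(A)` columns of $𝑨$ are linearly independent, but can fail to do so otherwise, for instance when the first column is zero. One possible variant for such matrices, offered as a sketch under stated assumptions rather than as part of the course code, requests column pivoting. The name `colspace_pivoted` is hypothetical, and `ColumnNorm()` assumes Julia 1.7 or later.

```julia
using LinearAlgebra

# Hypothetical variant of colspace based on column-pivoted QR
function colspace_pivoted(A, p=6)
    F = qr(A, ColumnNorm())      # columns are reordered by remaining norm
    return round.(Matrix(F.Q)[:, 1:rank(A)], digits=p)
end

A = [0 0; 0 1]                   # first column is zero, C(A) is the y2-axis
CA = colspace_pivoted(A)
rank([CA A]) == rank(A)          # the returned vector lies within C(A)
```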

With these functions defined, the following examples provide insight into the significance of column and null spaces and their associated spanning sets. For these small-dimensional, simple examples geometric insight is sufficient to understand what column and null spaces represent. Computational procedures can be devised for a much higher number of components, and the geometric insights gained here carry over.

For $m = 3, n = 1$ ,

𝑨 = [\begin{array}{c} 1 \\ 0 \\ 0 \end{array}], 𝑨^{T} = [\begin{array}{ccc} 1 & 0 & 0 \end{array}],

the column space $C (𝑨)$ is the $y_{1}$ -axis, and the left null space $N (𝑨^{T})$ is the $y_{2} y_{3}$ -plane since the condition $𝑨^{T} 𝒚 = 0$ reduces to $y_{1} = 0$ . Spanning vector sets for $C (𝑨)$ and $N (𝑨^{T})$ can be computed as follows, confirming the previous geometric descriptions. Note that combining the two leads to the identity matrix, an observation whose significance will soon become apparent.

∴	A=[1; 0; 0]; colspace(A)

$[\begin{array}{c} 1.0 \\ 0.0 \\ 0.0 \end{array}]$ (6)

∴	nullspace(A')

$[\begin{array}{cc} 0.0 & 0.0 \\ 1.0 & 0.0 \\ 0.0 & 1.0 \end{array}]$ (7)

∴	[colspace(A) nullspace(A')]

$[\begin{array}{ccc} 1.0 & 0.0 & 0.0 \\ 0.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{array}]$ (8)

For $m = 3, n = 2$ ,

𝑨 = [\begin{array}{cc} 1 & - 1 \\ 0 & 0 \\ 0 & 0 \end{array}] = [\begin{array}{cc} 𝒂_{1} & 𝒂_{2} \end{array}], 𝑨^{T} = [\begin{array}{ccc} 1 & 0 & 0 \\ - 1 & 0 & 0 \end{array}],

the columns of $𝑨$ are collinear, $𝒂_{2} = - 𝒂_{1}$ , so the column space $C (𝑨)$ is the $y_{1}$ -axis and the left null space $N (𝑨^{T})$ is the $y_{2} y_{3}$ -plane, as before.

∴	A=[1 -1; 0 0; 0 0]; CA=colspace(A)

$[\begin{array}{c} 1.0 \\ 0.0 \\ 0.0 \end{array}]$ (9)

∴	NAt=short.(nullspace(A'))

$[\begin{array}{cc} 0.0 & 0.0 \\ 1.0 & 0.0 \\ 0.0 & 1.0 \end{array}]$ (10)

∴	[CA NAt]

$[\begin{array}{ccc} 1.0 & 0.0 & 0.0 \\ 0.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{array}]$ (11)

For $m = 3, n = 2$ ,

𝑨 = [\begin{array}{cc} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}], 𝑨^{T} = [\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \end{array}],

the column space $C (𝑨)$ is the $y_{1} y_{2}$ -plane, and the left null space $N (𝑨^{T})$ is the $y_{3}$ -axis.

∴	A=[1 0; 0 1; 0 0]; CA=colspace(A)

$[\begin{array}{cc} 1.0 & 0.0 \\ 0.0 & 1.0 \\ 0.0 & 0.0 \end{array}]$ (12)

∴	NAt=short.(nullspace(A'))

$[\begin{array}{c} 0.0 \\ 0.0 \\ 1.0 \end{array}]$ (13)

∴	[CA NAt]

$[\begin{array}{ccc} 1.0 & 0.0 & 0.0 \\ 0.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{array}]$ (14)

For $m = 3, n = 2$ ,

𝑨 = [\begin{array}{cc} 1 & 1 \\ 1 & - 1 \\ 0 & 0 \end{array}], 𝑨^{T} = [\begin{array}{ccc} 1 & 1 & 0 \\ 1 & - 1 & 0 \end{array}],

the same $C (𝑨)$ , $N (𝑨^{T})$ are obtained, albeit with a different set of spanning vectors returned by colspace.

∴	A=[1 1; 1 -1; 0 0]; CA=colspace(A)

$[\begin{array}{cc} - 0.707107 & - 0.707107 \\ - 0.707107 & 0.707107 \\ 0.0 & 0.0 \end{array}]$ (15)

∴	NAt=short.(nullspace(A'))

$[\begin{array}{c} 0.0 \\ 0.0 \\ 1.0 \end{array}]$ (16)

∴	[CA NAt]

$[\begin{array}{ccc} - 0.707107 & - 0.707107 & 0.0 \\ - 0.707107 & 0.707107 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{array}]$ (17)

For $m = 3, n = 3$ ,

𝑨 = [\begin{array}{ccc} 1 & 1 & 3 \\ 1 & - 1 & - 1 \\ 1 & 1 & 3 \end{array}] = [\begin{array}{ccc} 𝒂_{1} & 𝒂_{2} & 𝒂_{3} \end{array}],

𝑨^{T} = [\begin{array}{ccc} 1 & 1 & 1 \\ 1 & - 1 & 1 \\ 3 & - 1 & 3 \end{array}] = [\begin{array}{c} 𝒂_{1}^{T} \\ 𝒂_{2}^{T} \\ 𝒂_{3}^{T} \end{array}], 𝑨^{T} 𝒚 = [\begin{array}{c} 𝒂_{1}^{T} 𝒚 \\ 𝒂_{2}^{T} 𝒚 \\ 𝒂_{3}^{T} 𝒚 \end{array}] .

Since $𝒂_{3} = 𝒂_{1} + 2 𝒂_{2}$ , the orthogonality condition $𝑨^{T} 𝒚 = 𝟎$ is satisfied by vectors of the form $𝒚 = {[\begin{array}{ccc} a & 0 & - a \end{array}]}^{T}$ , $\forall a \in ℝ$ .

∴	A=[1 1 3; 1 -1 -1; 1 1 3]; CA=colspace(A)

$[\begin{array}{cc} - 0.5773502691896257 & 0.40824829046386313 \\ - 0.5773502691896257 & - 0.816496580927726 \\ - 0.5773502691896257 & 0.40824829046386313 \end{array}]$ (18)

∴	NAt=short.(nullspace(A'))

$[\begin{array}{c} - 0.707107 \\ 0.0 \\ 0.707107 \end{array}]$ (19)

∴	[CA NAt]

$[\begin{array}{ccc} - 0.5773502691896257 & 0.40824829046386313 & - 0.707107 \\ - 0.5773502691896257 & - 0.816496580927726 & 0.0 \\ - 0.5773502691896257 & 0.40824829046386313 & 0.707107 \end{array}]$ (20)
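The two statements made for this example, that the third column equals the first plus twice the second and that vectors with components $(a, 0, - a)$ satisfy $𝑨^{T} 𝒚 = 𝟎$ , can be verified directly. In this sketch the value of `a` is an arbitrary choice:

```julia
A = [1 1 3; 1 -1 -1; 1 1 3]
A[:,3] == A[:,1] + 2*A[:,2]      # third column is a combination of the first two

a = 2.5                          # arbitrary scalar
y = [a, 0, -a]
A'*y == zeros(3)                 # y is orthogonal to every column of A
```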
