Vector Space Dimension

Synopsis. The vector subspaces of a matrix characterize what vectors can or cannot be reached by linear combination of matrix columns or rows. How large are these subspaces? An answer is provided by the concept of minimal spanning sets, and the number of vectors in a minimal spanning set allows a precise definition of the intuitive concept of dimension. Upon completion of the framework for vector operations, a first example of the relevance of the theory is considered in the problem of classification of electrocardiograms as indicators of healthy or diseased states.

1. Linear dependence and independence

1.1. Zero factors

For the simple scalar mapping f:ℝ→ℝ, f(x)=ax, the condition f(x)=0 implies either that a=0 or x=0. The function f is a linear mapping, and a=0 can be understood as defining the zero mapping f(x)=0. When a≠0 the condition ax=0 necessarily implies that x=0. Scalar multiplication satisfies the zero product property: if a product is equal to zero, one of the factors must be zero.

Linear mappings between vector spaces, 𝒇:ℝ^n→ℝ^m, 𝒇(𝒙)=𝑨𝒙, 𝑨∈ℝ^(m×n), can exhibit remarkably different behavior. As in the scalar case, a zero mapping is defined by 𝑨=𝟎, in which case 𝒇(𝒙)=𝟎 for all 𝒙. In contrast to the scalar case, even when 𝑨≠𝟎, the equation 𝑨𝒙=𝟎 no longer necessarily implies that 𝒙=𝟎. For example,

𝑨=[ 𝒂1 𝒂2 ]=[ 1 1 ; 2 2 ],  𝑨𝒙=[ 1 1 ; 2 2 ][ 1 ; -1 ]=[ 0 ; 0 ]=𝟎. (1)

The linear combination above can be read as traveling x1=1 in the direction of 𝒂1, followed by traveling x2=-1 in the direction of 𝒂2. Since 𝒂1=𝒂2, the net result of the travel is arrival at the origin 𝑨𝒙=𝟎. Though 𝑨 has two column vectors, they both specify the same direction and are said to contain redundant data. A similar situation arises for

𝑩=[ 𝒃1 𝒃2 𝒃3 ]=[ 1 -1 1 ; 2 0 4 ; 3 1 7 ],  𝑩𝒙=[ 1 -1 1 ; 2 0 4 ; 3 1 7 ][ 2 ; 1 ; -1 ]=[ 0 ; 0 ; 0 ]=𝟎. (2)

The columns of 𝑩 specify three different directions, but the directions satisfy 𝒃3=2𝒃1+𝒃2. This implies that 𝒃3∈span{𝒃1,𝒃2}. Whereas the vectors 𝒂1,𝒂2 are colinear in ℝ², the vectors 𝒃1,𝒃2,𝒃3 are coplanar within ℝ³. In both cases matrix-vector multiplication is seen not to satisfy the zero product property, and from 𝑨𝒙=𝟎 one cannot deduce that either 𝑨=𝟎 or 𝒙=𝟎. This behavior arises from the definition of matrix-vector multiplication as a linear combination of the columns.

There are however cases when 𝑴𝒙=𝟎 implies that 𝒙=𝟎, most simply for the case 𝑴=𝑰. The need to distinguish between cases that satisfy or do not satisfy the zero product property leads to a general concept of linear dependence.
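These statements are easy to check numerically. A brief sketch in Julia, using the matrices 𝑨 and 𝑩 from the examples above:

```julia
# A has colinear columns: the nonzero vector x = [1, -1] is sent to zero
A = [1 1; 2 2]
println(A * [1; -1])          # [0, 0]

# B satisfies b3 = 2*b1 + b2, so x = [2, 1, -1] is also sent to zero
B = [1 -1 1; 2 0 4; 3 1 7]
println(B * [2; 1; -1])       # [0, 0, 0]
```

For 𝑴=𝑰, in contrast, 𝑴𝒙=𝒙, so 𝑴𝒙=𝟎 does force 𝒙=𝟎.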

Definition. The vectors 𝒂1,𝒂2,…,𝒂n∈V are linearly dependent if there exist n scalars x1,…,xn∈S, at least one of which is different from zero, such that

x1𝒂1+⋯+xn𝒂n=𝟎.

Introducing a matrix representation of the vectors

𝑨=[ 𝒂1 𝒂2 ⋯ 𝒂n ];  𝒙=[ x1 x2 ⋯ xn ]T

allows restating linear dependence as the existence of a non-zero vector, 𝒙≠𝟎, such that 𝑨𝒙=𝟎. Linear dependence can also be written as 𝑨𝒙=𝟎 ⇏ 𝒙=𝟎: one cannot deduce from the fact that the linear mapping 𝒇(𝒙)=𝑨𝒙 attains a zero value that the argument itself is zero. The converse of this statement would be that the only way to obtain 𝑨𝒙=𝟎 is for 𝒙=𝟎, or 𝑨𝒙=𝟎 ⇒ 𝒙=𝟎, leading to the concept of linear independence.

Definition. The vectors 𝒂1,𝒂2,…,𝒂n∈V are linearly independent if the only n scalars x1,…,xn∈S that satisfy

x1𝒂1+⋯+xn𝒂n=𝟎, (3)

are x1=0, x2=0,…,xn=0.

If 𝑨∈ℝ^(m×n) contains n linearly independent vectors 𝒂1,…,𝒂n∈ℝ^m, the null space of 𝑨 contains only one element, the zero vector:

N(𝑨)={𝒙:𝑨𝒙=𝟎}={𝟎} ⇔ 𝒂1,…,𝒂n linearly independent.
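Whether N(𝑨)={𝟎} can be checked with the nullspace function from Julia's LinearAlgebra standard library; a small sketch (the matrix M below is an arbitrary example with independent columns):

```julia
using LinearAlgebra

A = [1 1; 2 2]                  # dependent columns
println(size(nullspace(A), 2))  # 1: a one-dimensional null space

M = [1 0; 0 1; 1 1]             # independent columns
println(size(nullspace(M), 2))  # 0: N(M) = {0}
```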

1.2. Orthogonality

Establishing linear independence through the above definition requires algebraic calculations to deduce that x1𝒂1+⋯+xn𝒂n=𝟎 necessarily implies x1=0,…,xn=0. There is an important case, suggested by the behavior of the column vectors of the identity matrix 𝑰, that simplifies the calculations. Distinct column vectors 𝒆1,…,𝒆m within 𝑰 are orthogonal,

𝒆iT𝒆j=0, 1≤i,j≤m, i≠j,

and each individual vector is of unit 2-norm

𝒆iT𝒆i=1, 1≤i≤m.

In this case, multiplying the equation x1𝒆1+⋯+xm𝒆m=𝟎 by 𝒆jT immediately leads to xj=0, and the column vectors of 𝑰 are linearly independent. In general, any set of orthogonal non-zero vectors 𝒖1,…,𝒖n is linearly independent by the same argument. In matrix terms, the vectors are collected into a matrix 𝑼=[ 𝒖1 𝒖2 ⋯ 𝒖n ] and

𝑼T𝑼=[ 𝒖1T ; 𝒖2T ; ⋮ ; 𝒖nT ][ 𝒖1 𝒖2 ⋯ 𝒖n ]=[ 𝒖1T𝒖1 𝒖1T𝒖2 ⋯ 𝒖1T𝒖n ; 𝒖2T𝒖1 𝒖2T𝒖2 ⋯ 𝒖2T𝒖n ; ⋮ ; 𝒖nT𝒖1 𝒖nT𝒖2 ⋯ 𝒖nT𝒖n ]=[ ||𝒖1||² 0 ⋯ 0 ; 0 ||𝒖2||² ⋯ 0 ; ⋮ ; 0 0 ⋯ ||𝒖n||² ].

Definition. The column vectors 𝒖1,𝒖2,…,𝒖n∈ℝ^m of matrix 𝑼∈ℝ^(m×n) are orthogonal if

𝑼T𝑼=diag(||𝒖1||²,…,||𝒖n||²).

An especially simple and useful case is when the norms of all orthogonal vectors are equal to one.

Definition. The column vectors 𝒒1,𝒒2,…,𝒒n∈ℝ^m of matrix 𝑸∈ℝ^(m×n) are orthonormal if

𝑸T𝑸=𝑰.

Somewhat confusingly, a square matrix with orthonormal columns is said to be orthogonal.

Definition. The matrix 𝑸∈ℝ^(m×m) is orthogonal if

𝑸T𝑸=𝑸𝑸T=𝑰.

Example. The rotation matrix in ℝ² is orthogonal,

𝑹θ=[ cosθ -sinθ ; sinθ cosθ ],  𝑹θ𝑹θT=𝑹θT𝑹θ=[ cos²θ+sin²θ 0 ; 0 cos²θ+sin²θ ]=𝑰.

Example. A rotation matrix in ℝ³ is orthogonal,

𝑹θ=[ cosθ -sinθ 0 ; sinθ cosθ 0 ; 0 0 1 ],  𝑹θ𝑹θT=𝑹θT𝑹θ=[ cos²θ+sin²θ 0 0 ; 0 cos²θ+sin²θ 0 ; 0 0 1 ]=𝑰.
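Both rotation examples can be verified numerically; a sketch for an arbitrarily chosen angle:

```julia
using LinearAlgebra

θ = π/6                          # arbitrary angle
R2 = [cos(θ) -sin(θ); sin(θ) cos(θ)]
println(R2'*R2 ≈ Matrix(1.0I, 2, 2))   # true

R3 = [cos(θ) -sin(θ) 0; sin(θ) cos(θ) 0; 0 0 1]
println(R3'*R3 ≈ Matrix(1.0I, 3, 3))   # true
```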

Example. The reflection matrix across the direction 𝒒 of unit norm in ℝ^m,

𝑹𝒒=2𝒒𝒒T-𝑰,

is orthogonal

𝑹𝒒𝑹𝒒T=(2𝒒𝒒T-𝑰)(2𝒒𝒒T-𝑰)T=(2𝒒𝒒T-𝑰)(2𝒒𝒒T-𝑰)=4𝒒𝒒T𝒒𝒒T-4𝒒𝒒T+𝑰=4𝒒𝒒T-4𝒒𝒒T+𝑰=𝑰

since 𝒒𝒒T𝒒𝒒T=𝒒(𝒒T𝒒)𝒒T=𝒒(1)𝒒T=𝒒𝒒T.
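A numerical check of the reflector, with an arbitrarily chosen unit-norm direction 𝒒:

```julia
using LinearAlgebra

q = normalize([1.0, 2.0, 2.0])   # arbitrary direction scaled to unit norm
Rq = 2q*q' - I                   # reflection matrix across q
println(Rq*Rq' ≈ Matrix(1.0I, 3, 3))   # true: Rq is orthogonal
println(Rq*q ≈ q)                      # true: q itself is unchanged
```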

2. Basis and dimension

2.1. Minimal spanning sets

Vector spaces are closed under linear combination, and the span of a vector set ℬ={𝒂1,𝒂2,…} defines a vector subspace. If the entire set of vectors can be obtained from a spanning set, V=span ℬ, extending ℬ by an additional element, 𝒞=ℬ∪{𝒃} with 𝒃∈V, would be redundant since span ℬ=span 𝒞. Avoiding redundancy leads to consideration of a minimal spanning set. This is formalized by the concept of a basis, which also leads to a characterization of the size of a vector space by the cardinality of a basis set.

Definition. A set of vectors 𝒖1,…,𝒖n∈U is a basis for vector space 𝒰=(U,S,+,⋅) if

  1. 𝒖1,,𝒖n are linearly independent;

  2. span{𝒖1,,𝒖n}=U.

Definition. The number n of vectors 𝒖1,…,𝒖n∈U within a basis is the dimension of the vector space 𝒰=(U,S,+,⋅).

The above definitions underlie statements such as “ℝ³ represents three-dimensional space”. Since any 𝒗∈ℝ³ can be expressed as

𝒗=𝑰𝒗=v1𝒆1+v2𝒆2+v3𝒆3,

it results that {𝒆1,𝒆2,𝒆3} is a spanning set. The equation x1𝒆1+x2𝒆2+x3𝒆3=𝟎 leads to

[ x1 ; x2 ; x3 ]=[ 0 ; 0 ; 0 ]=𝟎,

hence {𝒆1,𝒆2,𝒆3} are independent, and therefore form a basis. The cardinality of the set {𝒆1,𝒆2,𝒆3} is equal to three, and, indeed, 3 represents three-dimensional space.
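The same argument is easily carried out numerically; a sketch verifying that the columns of 𝑰 form a basis of ℝ³:

```julia
using LinearAlgebra

E = Matrix(1.0I, 3, 3)      # columns e1, e2, e3
println(rank(E))            # 3: independent and spanning, dimension three

v = [2.0, -1.0, 3.0]        # arbitrary vector
println(E*v == v)           # true: v = v1*e1 + v2*e2 + v3*e3
```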

2.2. Dimensions of matrix spaces

The domain ℝ^n and co-domain ℝ^m of the linear mapping 𝒇:ℝ^n→ℝ^m, 𝒇(𝒙)=𝑨𝒙, are vector spaces. Within these vector spaces, subspaces associated with the linear mapping are defined by:

  1. the column space, C(𝑨)=span{𝒂1,…,𝒂n}⊆ℝ^m;

  2. the row space, C(𝑨T)⊆ℝ^n;

  3. the null space, N(𝑨)={𝒙:𝑨𝒙=𝟎}⊆ℝ^n;

  4. the left null space, N(𝑨T)={𝒚:𝑨T𝒚=𝟎}⊆ℝ^m.

The dimensions of these subspaces arise so often in applications as to warrant formal definition.

Definition. The rank of a matrix 𝑨∈ℝ^(m×n) is the dimension of its column space.

Definition. The nullity of a matrix 𝑨∈ℝ^(m×n) is the dimension of its null space.

The dimension of the row space is equal to that of the column space. This is a simple consequence of how scalars are organized into vectors. When the preferred organization is into column vectors, a linear combination is expressed as

𝒃=𝑨𝒙=x1𝒂1+⋯+xn𝒂n.

The same linear combination can also be expressed through rows by taking the transpose, an operation that swaps rows and columns,

𝒃T=(𝑨𝒙)T=𝒙T𝑨T=x1𝒂1T+⋯+xn𝒂nT.

Any statement about the linear combination of column vectors x1𝒂1+⋯+xn𝒂n also holds for the linear combination of row vectors x1𝒂1T+⋯+xn𝒂nT. In particular, the dimension of the column space equals that of the row space.
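These dimensions are directly computable. A sketch using the matrix 𝑩 from the opening examples, whose third column is a combination of the first two:

```julia
using LinearAlgebra

B = [1 -1 1; 2 0 4; 3 1 7]      # b3 = 2*b1 + b2
println((rank(B), rank(B')))    # (2, 2): column and row space dimensions agree
println(size(nullspace(B), 2))  # 1: nullity, and rank + nullity = n = 3
```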

3. Signal compression

3.1. Electrocardiograms

The first forays into the significance of matrix-associated subspaces through the geometry of lines or planes are useful, but belie the utility of these concepts in applications. Consider the problem of long-term care for cardiac patients. Electrocardiograms (ECGs) are recorded periodically, often over decades, and need to be stored for comparison and assessment of disease progression. Heartbeats arise from electrical signals occurring at frequencies of around f=250 Hz (cycles/second). A “single-lead” ECG to detect heart arrhythmia can require T=180 s recording time, so a vector representation would require m=fT=4.5×10⁴∼𝒪(10⁵) components, 𝒃∈ℝ^m,

𝒃=𝑰𝒃=b1𝒆1+b2𝒆2+⋯+bm𝒆m, (4)

where component bj is the voltage recorded at time tj=j(δt), and δt=1/f is the sampling time interval.

The linear combination (4) seems a natural way to organize recorded data, but is it the most insightful? Recall the inclined plane example in which forces seemed to be naturally described in a Cartesian system consisting of the horizontal and vertical directions, yet a more insightful description was seen to correspond to normal and tangential directions. Might there be an alternative linear combination, say

𝒃=x1𝒉1+x2𝒉2+⋯+xm𝒉m, (5)

that would represent the ECG in a more meaningful manner? The vectors entering into the linear combination are, as usual, organized into a matrix

𝑯=[ 𝒉1 𝒉2 ⋯ 𝒉m ]∈ℝ^(m×m),

such that the ECG is obtained through a matrix-vector product 𝒃=𝑯𝒙, 𝒙∈ℝ^m. In particular, in this alternative description, could a smaller number of components suffice to capture the essence of the ECG? Select the first n components by defining

𝑯n=[ 𝒉1 𝒉2 ⋯ 𝒉n ]∈ℝ^(m×n),  𝒄=𝑯n𝒚,  𝒄∈ℝ^m,  𝒚∈ℝ^n. (6)

The two-norm of the vectors' difference is the error ε=||𝒃-𝒄||2 incurred by keeping only n components. If 𝒄 is close to 𝒃 then ε should be small, and if n is much smaller than m, a compressed version of the ECG is obtained.
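A minimal sketch of this error measure, with an orthogonal 𝑯 generated here from a QR factorization purely for illustration (one natural choice of coefficients is 𝒚=𝑯nT𝒃):

```julia
using LinearAlgebra

m, n = 8, 4
H = Matrix(qr(randn(m, m)).Q)   # illustrative orthogonal matrix
b = randn(m)                    # stand-in for a recorded signal
y = H[:, 1:n]' * b              # coefficients of the first n columns
c = H[:, 1:n] * y               # compressed representation
ε = norm(b - c)                 # two-norm error of keeping n of m components
println(ε <= norm(b))           # true: projection cannot increase the norm
```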

At first sight, the representations (5,6) might seem more costly than (4), since not only do the scaling factors 𝒙∈ℝ^m or 𝒚∈ℝ^n have to be stored, but also the m² components of 𝑯. This difficulty is eliminated if 𝑯=[ 𝒉1 𝒉2 ⋯ 𝒉m ] is obtained by some simple rule, much as 𝑰=[ 𝒆1 𝒆2 ⋯ 𝒆m ] can be specified by any one of a number of procedures.

3.2. The identity matrix revisited

Within the algebraic structure of a vector space there is an identity element 1 with respect to the operation of scaling the vector 𝒃∈ℝ^m,

1𝒃=𝒃.

Analogously, the identity matrix 𝑰∈ℝ^(m×m) acts as an identity element with respect to matrix-vector multiplication

𝑰𝒃=𝒃.

Since matrix-matrix multiplication is simply successive matrix-vector multiplications,

𝑨𝑩=𝑨[ 𝒃1 𝒃2 ⋯ 𝒃p ]=[ 𝑨𝒃1 𝑨𝒃2 ⋯ 𝑨𝒃p ],

the identity matrix 𝑰 is an identity element for matrix multiplication

𝑰𝑩=𝑩.

Computer storage of the identity matrix 𝑰∈ℝ^(m×m) might naively seem to require m² locations, one for every component, but its remarkable structure implies specification of only a single number, the size m of 𝑰. The identity matrix can then be reconstructed as needed through a variety of procedures.

Predefined identity operator

In Julia, I is a predefined operator (exported by the LinearAlgebra standard library) from which the full identity matrix can be reconstructed if needed

using LinearAlgebra
m=3; Matrix(1.0I,m,m)

[ 1.0 0.0 0.0 ; 0.0 1.0 0.0 ; 0.0 0.0 1.0 ] (7)

Component definition

In terms of components

𝑰=[δij], 1≤i,j≤m,  δij={ 1 if i=j ; 0 otherwise },

where δij is known as the Kronecker delta and is readily transcribed into Julia.

δ(i,j) = (i == j) ? 1 : 0;
Id(m) = [δ(i,j) for i in 1:m, j in 1:m]; Id(3)

[ 1 0 0 ; 0 1 0 ; 0 0 1 ] (8)

Column vector definition

Another construction is by circular shifts of the column vector 𝒆1=[ 1 0 ⋯ 0 ]T∈ℝ^m.

e1T(m) = [1; zeros(m-1,1)]; e1T(3)

[ 1.0 ; 0.0 ; 0.0 ] (9)

ekT(m,k) = circshift(e1T(m),k); [ekT(3,0) ekT(3,1) ekT(3,2)]

[ 1.0 0.0 0.0 ; 0.0 1.0 0.0 ; 0.0 0.0 1.0 ] (10)

Diagonal matrix construction

The diagonal structure of 𝑰 also defines a reconstruction.

Diagonal(ones(3,3))

[ 1.0 0.0 0.0 ; 0.0 1.0 0.0 ; 0.0 0.0 1.0 ] (11)

3.3. Exterior product construction of 𝑰

A conceptually different reconstruction of 𝑰 when m=2^p uses block matrix operations and builds up larger matrices from smaller ones. This technique is quite instructive.

Since larger versions of the identity matrix will be obtained from smaller ones, a notation to specify the matrix size is needed. Let 𝑰k denote the identity matrix of size m×m with m=2^k. Start from k=0, m=1, in which case 𝑰0=[1]; also define

𝑰1=[ 1 0 ; 0 1 ].

The next matrix in this sequence would be 𝑰2 in which a block structure can be identified

𝑰2=[ 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ]=[ 𝑰1 𝟎 ; 𝟎 𝑰1 ]=[ 1⋅𝑰1 0⋅𝑰1 ; 0⋅𝑰1 1⋅𝑰1 ].

The scaling coefficients applied to each 𝑰1 block are recognized to be the entries of 𝑰1 itself. This suggests that the next matrix in the sequence could be obtained as

𝑰3=[ 1⋅𝑰2 0⋅𝑰2 ; 0⋅𝑰2 1⋅𝑰2 ].

It is useful to introduce a notation for these operations: 𝑰2=𝑰1⊗𝑰1, 𝑰3=𝑰1⊗𝑰2.

Definition. The exterior product of matrices 𝑨=[aij]∈ℝ^(m×n) and 𝑩∈ℝ^(p×q) is the matrix 𝑪∈ℝ^((mp)×(nq))

𝑪=𝑨⊗𝑩=[ a11𝑩 a12𝑩 ⋯ a1n𝑩 ; a21𝑩 a22𝑩 ⋯ a2n𝑩 ; ⋮ ; am1𝑩 am2𝑩 ⋯ amn𝑩 ].

Recall that the inner product of 𝒖,𝒗∈ℝ^m is the scalar 𝒖T𝒗, and one can think of an inner product as “reducing the dimensions of its factors”. In contrast, the exterior product “increases the dimensions of its factors”. An example of the exterior product has already been met, namely the projection matrix along the direction 𝒒 of unit norm (||𝒒||=1),

𝑷𝒒=𝒒𝒒T=𝒒⊗𝒒T=[ q1𝒒 q2𝒒 ⋯ qm𝒒 ].

Using the exterior product definition, the matrix 𝑰q is defined in terms of previous terms in the sequence as

𝑰q=𝑰1⊗𝑰q-1, q>1.

Such definitions based upon previous terms are said to be recursive.
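The recursion 𝑰q=𝑰1⊗𝑰q-1 is directly expressible in Julia, whose kron function implements exactly the block-scaling product defined above (the helper name Iq is chosen here for illustration):

```julia
using LinearAlgebra

# Recursive construction: I_0 = [1], I_q = I_1 ⊗ I_{q-1}
Iq(q) = q == 0 ? fill(1, 1, 1) : kron([1 0; 0 1], Iq(q - 1))

println(Iq(3) == Matrix(1I, 8, 8))   # true: the 2^3 × 2^3 identity
```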

3.4. Walsh-Hadamard matrices

An alternative to specifying the signal 𝒃∈ℝ^m as the linear combination 𝒃=𝑰𝒃 is now constructed by a different recursion procedure. As in the identity matrix case, let m=2^p and start from p=0, m=1 with 𝑯0=[1]. For the next term choose, however,

𝑯1=[ 1 1 ; 1 -1 ],

and define in general

𝑯q=𝑯1⊗𝑯q-1.

Julia can be extended through the Hadamard package to include definitions of the above Hadamard matrices.

using Hadamard
H0=hadamard(2^0)

[ 1 ] (12)

H1=hadamard(2^1)

[ 1 1 ; 1 -1 ] (13)

H2=hadamard(2^2)

[ 1 1 1 1 ; 1 -1 1 -1 ; 1 1 -1 -1 ; 1 -1 -1 1 ] (14)
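The recursion 𝑯q=𝑯1⊗𝑯q-1 can be reproduced with kron and checked against these matrices (the helper name Hq is chosen for illustration):

```julia
using LinearAlgebra

# Recursive construction: H_0 = [1], H_q = H_1 ⊗ H_{q-1}
Hq(q) = q == 0 ? fill(1, 1, 1) : kron([1 1; 1 -1], Hq(q - 1))

println(Hq(2) == [1 1 1 1; 1 -1 1 -1; 1 1 -1 -1; 1 -1 -1 1])   # true
```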

The column vectors of the identity matrix are mutually orthogonal as expressed by

𝑰T𝑰=[ 𝒆1T ; 𝒆2T ; ⋮ ; 𝒆mT ][ 𝒆1 𝒆2 ⋯ 𝒆m ]=[ 𝒆1T𝒆1 𝒆1T𝒆2 ⋯ 𝒆1T𝒆m ; 𝒆2T𝒆1 𝒆2T𝒆2 ⋯ 𝒆2T𝒆m ; ⋮ ; 𝒆mT𝒆1 𝒆mT𝒆2 ⋯ 𝒆mT𝒆m ]=𝑰.

The Hadamard matrices have similar behavior in that

𝑯qT𝑯q=2^q 𝑰m, m=2^q,

and thus are seen to have orthogonal columns.

H2'*H2

[ 4 0 0 0 ; 0 4 0 0 ; 0 0 4 0 ; 0 0 0 4 ] (15)

The structure of Hadamard matrices allows a reconstruction from small cases just as simple as that of the identity matrix, but the components of a Hadamard matrix are quite different. Rather than displaying the components, a graphical visualization is insightful. The spy(A) function displays a plot of the non-zero components of a matrix.

using LinearAlgebra, Hadamard, PyPlot
q=5; m=2^q; Iq=Matrix(1.0I,m,m); Hq=hadamard(m);
clf(); subplot(1,2,1); spy(Iq); subplot(1,2,2); spy(Hq.+1);
cd(homedir()*"/courses/MATH347DS/images"); savefig("L03Fig01.eps");

Figure 1. The structure of the identity and Hadamard matrices for m=25=32.

3.5. Sample sets and reconstructions

The behavior of the column vectors of 𝑰 and 𝑯 is instructive. Denote by b:[0,T]→ℝ the ECG recording that would be obtained if measurements were carried out for all t∈[0,T]. This is obviously impossible since it would require an infinite number of measurements. In practice, only the values bi=b(ti) are recorded for ti=i(δt), a sample of the infinite set of possible values. The sample set values are organized into the vector 𝒃. Function values b(t) for t≠ti can be reconstructed through some assumption, for instance that

b(t)=b(ti)=bi for t∈[ti,ti+1).

The above states that b(t) remains constant from ti to ti+δt. A simple example with b(t)=sin(t) illustrates the approach.

using PyPlot
T=2*pi; n=16; δt=T/n; t=(0:n-1)*δt; b=sin.(t);
m=n^2; δts=T/m; ts=(0:m-1)*δts; s=sin.(ts);
p=4*n; δtw=T/p; tw=(0:p-1)*δtw; w=reshape((b.*[1 1 1 1])',(1,p))';
figure(3); clf(); plot(t,b,"o",ts,s,tw,w);
title("Reconstruction of sine from samples");
savefig("L04Fig02.eps");

Figure 2. The reconstruction is a step function.

The reconstruction exhibits steps at which the function value changes. The column vectors of 𝑰 are well suited for such reconstructions.

using LinearAlgebra, Hadamard, PyPlot
n=8; In=Matrix(1.0I,n,n); Hn=hadamard(n);
figure(5); clf();
for j=1:n
   subplot(n,2,2*j-1); plot(In[:,j])   # column j of the identity matrix
   subplot(n,2,2*j);   plot(Hn[:,j])   # column j of the Hadamard matrix
end
subplot(n,2,1); title("Identity matrix");
subplot(n,2,2); title("Hadamard matrix");
savefig("L04Fig03.eps");

Figure 3. Comparison of column vectors of 𝑰,𝑯.

3.6. Naive ECG compression

What happens if the linear combinations expressing 𝒃∈ℝ^m

𝒃=𝑰𝒃=b1𝒆1+⋯+bm𝒆m=c1𝒉1+⋯+cm𝒉m=𝑯𝒄

are truncated? The scaling coefficients of the Hadamard linear combination are readily found

𝑯𝒄=𝒃 ⇒ 𝑯T(𝑯𝒄)=𝑯T𝒃 ⇒ (𝑯T𝑯)𝒄=𝑯T𝒃 ⇒ m𝑰𝒄=𝑯T𝒃 ⇒ 𝒄=(1/m)𝑯T𝒃.

Consider 𝒖 to be the truncation of 𝒃=𝑰𝒃 to n<m terms

𝒖=b1𝒆1+⋯+bn𝒆n.

Given the significance of the components bi, the above simply drops the ECG recording for times t>n(δt). The truncation of the Hadamard linear combination

𝒗=c1𝒉1+⋯+cn𝒉n

behaves quite differently.

using MAT
DataFileName = homedir()*"/courses/MATH347DS/data/ecg/ECGData.mat";
DataFile = matopen(DataFileName,"r");
dict = read(DataFile,"ECGData");
data = dict["Data"]';
size(data)

[ 65536 162 ] (16)

q=12; m=2^q; k=15; b=data[1:m,k];
Iq=Matrix(1.0I,m,m); Hq=hadamard(m); c=(1/m)*transpose(Hq)*b;
n=2^10; u=Iq[:,1:n]*b[1:n]; v=Hq[:,1:n]*c[1:n];
figure(1); clf(); subplot(3,1,1); plot(b);
subplot(3,1,2); plot(u);
subplot(3,1,3); plot(v);
cd(homedir()*"/courses/MATH347DS/images"); savefig("L03Fig02.eps");

Figure 4. Comparison of original ECG (top) containing m=212=4096 time samples with truncated linear combinations based upon n=210=1024 terms of the linear combinations using identity matrix 𝑰 (middle), and Hadamard matrix 𝑯 (bottom).

3.7.Basis subsets and compression

Neither of the truncated linear combinations seems particularly useful. Truncation of the 𝑰-based linear combination cuts off part of the signal, while that of the 𝑯-based linear combination introduces heart pulses not present in the original signal. The problem is the significance given to the ordering of the vectors in the linear combination. Consider first the truncation of 𝒃=𝑰𝒃 used to obtain 𝒖. Rather than cutting off part of the signal, an alternative data compression is to use a larger time sample, say 4δt instead of δt, a process known as down-sampling,

𝒅=𝒃[1:4:m]∈ℝ^(m/4).

The fast Walsh-Hadamard transform fwht from the Hadamard package offers another route to such coefficients; the plot compares 𝒄 with w=fwht(b).

w=fwht(b);
figure(6); clf(); plot(c,".");
plot(w,".");
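A minimal down-sampling sketch on sine samples (synthetic data standing in for an ECG), keeping every fourth value:

```julia
m = 16
t = (0:m-1) * (2π/m)
b = sin.(t)
d = b[1:4:m]                    # down-sample: keep every 4th component
println(length(d) == m ÷ 4)     # true: m/4 components retained
```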