Lecture 4: Linear Combinations

1.Finite-dimensional vector spaces

1.1.Overview

The definition from Table 1 of a vector space reflects everyday experience with vectors in Euclidean geometry, and it is common to refer to such vectors by descriptions in a Cartesian coordinate system. For example, a position vector 𝒓 within the plane can be referred to through the pair of coordinates (x,y). This intuitive understanding can be made precise through the definition of a vector space ℛ₂=(ℝ²,ℝ,+,⋅), called the real 2-space. Vectors within ℛ₂ are elements of ℝ²=ℝ×ℝ={(x,y) | x,y∈ℝ}, meaning that a vector is specified through two real numbers, 𝒓↔(x,y). Addition of two vectors, 𝒒↔(s,t), 𝒓↔(x,y), is defined by addition of coordinates, 𝒒+𝒓=(s+x,t+y). Scaling of 𝒓↔(x,y) by a scalar a is defined by a𝒓=(ax,ay). Similarly, consideration of position vectors in three-dimensional space leads to the definition of the real 3-space ℛ₃=(ℝ³,ℝ,+,⋅), or more generally a real m-space ℛₘ=(ℝᵐ,ℝ,+,⋅), m∈ℕ, m>0.
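
The coordinate rules above can be sketched directly in Julia; this is a minimal illustration assuming nothing beyond base Julia, with add2 and scale2 as hypothetical helper names rather than library functions.

```julia
# A sketch of the real 2-space operations on coordinate pairs;
# add2 and scale2 are illustrative names, not library functions.
add2(q, r) = (q[1] + r[1], q[2] + r[2])   # q + r = (s+x, t+y)
scale2(a, r) = (a*r[1], a*r[2])           # a r = (ax, ay)
q = (1.0, 2.0)
r = (3.0, -1.0)
println(add2(q, r))       # (4.0, 1.0)
println(scale2(2.0, r))   # (6.0, -2.0)
```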

Addition rules for 𝒂,𝒃,𝒄∈V
  𝒂+𝒃∈V                    Closure
  𝒂+(𝒃+𝒄)=(𝒂+𝒃)+𝒄          Associativity
  𝒂+𝒃=𝒃+𝒂                  Commutativity
  𝟎+𝒂=𝒂                    Zero vector
  𝒂+(-𝒂)=𝟎                 Additive inverse

Scaling rules for 𝒂,𝒃∈V, x,y∈S
  x𝒂∈V                     Closure
  x(𝒂+𝒃)=x𝒂+x𝒃             Distributivity
  (x+y)𝒂=x𝒂+y𝒂             Distributivity
  x(y𝒂)=(xy)𝒂              Composition
  1𝒂=𝒂                     Scalar identity

Table 1. Vector space 𝒱=(V,S,+,⋅) properties for arbitrary 𝒂,𝒃,𝒄∈V, x,y∈S.

Note, however, that there is no mention of coordinates in the definition of a vector space, as can be seen from the list of properties in Table 1. The intent of such a definition is to highlight that besides position vectors, many other mathematical objects follow the same rules. As an example, consider the set of all continuous functions C(ℝ)={f | f:ℝ→ℝ, f continuous}, with function addition defined by the sum at each argument t, (f+g)(t)=f(t)+g(t), and scaling by a∈ℝ defined as (af)(t)=af(t). Read this as: “given two continuous functions f and g, the function f+g is defined by stating that its value for argument t is the sum of the two real numbers f(t) and g(t)”. Similarly: “given a continuous function f, the function af is defined by stating that its value for argument t is the product of the real numbers a and f(t)”. Under such definitions 𝒞⁰=(C(ℝ),ℝ,+,⋅) is a vector space, but one quite different from ℛₘ. Nonetheless, the fact that both 𝒞⁰ and ℛₘ are vector spaces can be used to obtain insight into the behavior of continuous functions from Euclidean vectors, and vice versa. This correspondence principle between discrete and continuous formulations is a recurring theme in scientific computation.
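
The pointwise definitions of function addition and scaling can be transcribed almost verbatim into Julia; a minimal sketch, with fg and af as hypothetical names for the combined functions.

```julia
# Pointwise vector space operations on functions, a sketch of the
# C^0 definitions: (f+g)(t) = f(t) + g(t) and (a f)(t) = a f(t).
f(t) = sin(t)
g(t) = t^2
fg(t) = f(t) + g(t)    # function addition
af(t) = 3.0 * f(t)     # scaling by the scalar a = 3
println(fg(1.0) == f(1.0) + g(1.0))   # true
println(af(2.0) == 3.0 * f(2.0))      # true
```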

1.2.Real vector space ℛₘ

Column vectors.
Since the real spaces ℛₘ=(ℝᵐ,ℝ,+,⋅) play such an important role in themselves and as a guide to other vector spaces, familiarity with vector operations in ℝᵐ is necessary to fully appreciate the utility of linear algebra in a wide range of applications. Following the usage in geometry and physics, the m real numbers that specify a vector 𝒖∈ℝᵐ are called the components of 𝒖. The one-to-one correspondence between a vector and its components, 𝒖↔(u₁,…,uₘ), is by convention taken to define an equality relationship,

𝒖=[ u₁ ⋮ uₘ ], (1)

with the components arranged vertically and enclosed in square brackets. Given two vectors 𝒖,𝒗∈ℝᵐ and a scalar a∈ℝ, vector addition and scaling are defined in ℝᵐ by real number addition and multiplication of components,

𝒖+𝒗=[ u₁ ⋮ uₘ ]+[ v₁ ⋮ vₘ ]=[ u₁+v₁ ⋮ uₘ+vₘ ],  a𝒖=a[ u₁ ⋮ uₘ ]=[ au₁ ⋮ auₘ ]. (2)
u=[1; 2; 3]; v=[-1; -2; -3]; [u v]

[ 1 -1
  2 -2
  3 -3 ] (3)

a=2; b=5; a*u+b*v

[ -3
  -6
  -9 ] (4)

The vector space ℛₘ is defined using the real numbers as the set of scalars, and constructing vectors by grouping together m scalars, but this approach can be extended to any set of scalars S, leading to the definition of the vector spaces 𝒮ₙ=(Sⁿ,S,+,⋅). These will often be referred to as n-vector spaces of scalars, signifying that the set of vectors is V=Sⁿ.
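
The same componentwise rules carry over unchanged to other scalar sets S; a brief sketch with complex scalars, S=ℂ, giving a complex 2-vector space.

```julia
# The componentwise rules extend to other scalar sets S; here a
# sketch with complex scalars, S = C, giving the complex 2-space.
u = [1 + 2im; 3 - 1im]
v = [2 - 1im; 1 + 1im]
println(u + v)          # componentwise addition: [3+1im, 4+0im]
println((1 + 1im) * u)  # complex scaling: [-1+3im, 4+2im]
```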

To aid in visual recognition of vectors, the following notation conventions are introduced:

Row vectors.
Instead of the vertical placement of components into one column, the components of 𝒖 could have been placed horizontally in one row, [ u₁ … uₘ ], which contains the same data, differently organized. By convention, vertical placement of vector components is the preferred organization, and 𝒖 shall denote a column vector henceforth. A transpose operation, denoted by a T superscript, is introduced to relate the two representations,

𝒖ᵀ=[ u₁ … uₘ ],

u=[1; 2; 3]; u'

[ 1 2 3 ] (5)

and 𝒖T is the notation used to denote a row vector.

In Julia, horizontal placement of successive components in a row is denoted by a space.

u=[4 5 6]

[ 4 5 6 ] (6)

Compatible vectors.
Addition of real vectors 𝒖,𝒗∈ℝᵐ defines another vector 𝒘=𝒖+𝒗∈ℝᵐ. The components of 𝒘 are the sums of the corresponding components of 𝒖 and 𝒗, wᵢ=uᵢ+vᵢ, for i=1,2,…,m. Addition of vectors with different numbers of components is not defined, and attempting to add such vectors produces an error. Vectors with different numbers of components are called incompatible, while vectors with the same number of components are said to be compatible. Scaling of 𝒖 by a defines a vector 𝒛=a𝒖, whose components are zᵢ=auᵢ, for i=1,2,…,m.
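
Compatibility can be checked experimentally; a short sketch in which adding incompatible vectors is caught with a try/catch block (in Julia the error raised is a DimensionMismatch).

```julia
# Compatible vectors (same number of components) add componentwise;
# attempting to add incompatible vectors raises an error.
u = [1; 2; 3]
v = [1; 2]
w = u + u              # compatible: w_i = u_i + u_i
ok = try
    u + v              # incompatible: 3 components vs. 2
    true
catch
    false
end
println(w)             # [2, 4, 6]
println(ok)            # false
```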

1.3.Working with vectors

Ranges.
The vectors used in applications usually have a large number of components, m≫1, and it is important to become proficient in their manipulation. Previous examples defined vectors by explicit listing of their m components. This is impractical for large m, and support is provided for automated generation in often-encountered situations. First, observe that Table 1 mentions one distinguished vector, the zero element, that is a member of any vector space, 𝟎∈V. The zero vector of a real vector space ℛₘ is a column vector with m components, all of which are zero, and a mathematical convention for specifying this vector is 𝟎ᵀ=[ 0 0 … 0 ]∈ℝᵐ. This notation specifies that the transpose of the zero vector is the row vector with m zero components, also written through explicit indexing of each component as 𝟎ᵢ=0, for i=1,…,m. Keep in mind that the zero vector 𝟎 and the zero scalar 0 are different mathematical objects.
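
In Julia the zero vector is generated automatically by the zeros function; a small sketch, assuming m=4, verifying the zero-element property from Table 1.

```julia
# The zero vector of R^m generated by zeros; a sketch with m = 4.
m = 4
z = zeros(m)           # column vector with m zero components
u = [1.0; 2.0; 3.0; 4.0]
println(z + u == u)    # true: 0 + u = u, the zero element property
```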

The ellipsis symbol in the mathematical notation is transcribed in Julia by the notion of a range, with 1:m denoting all the integers from 1 to m. The notation is extended to allow for strides different from one, and the mathematical ellipsis i=m,m-1,…,1 is denoted as m:-1:1. In general, r:s:t denotes the set of numbers {r, r+s, …, r+ns} with r+ns≤t for positive stride s (and r+ns≥t for negative s), where r,s,t∈ℝ and n∈ℕ. If there is no natural number n satisfying these conditions, an empty vector with no components is returned.
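
A few range examples can make the correspondence with the ellipsis notation concrete; collect lists the components of a range explicitly.

```julia
# Ranges transcribe the mathematical ellipsis; collect lists the
# components explicitly.
println(collect(1:4))      # [1, 2, 3, 4]
println(collect(4:-1:1))   # [4, 3, 2, 1]
println(collect(1:2:8))    # [1, 3, 5, 7]: r+ns <= t with r=1, s=2, t=8
println(collect(3:-1:4))   # empty: no valid n exists
```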

2.Linear combinations

2.1.Linear combination as a matrix-vector product

The expression 𝒙=x₁𝒆₁+x₂𝒆₂+⋯+xₘ𝒆ₘ expresses the idea of scaling vectors within a set and subsequent addition to form a new vector 𝒙. The matrix 𝑰=[ 𝒆₁ 𝒆₂ … 𝒆ₘ ] groups these vectors together in a single entity, and the scaling factors are the components of the vector 𝒙. To bring all these concepts together it is natural to consider the notation

𝒙=𝑰𝒙,

I

UniformScaling{Bool}(true)

Matrix(1I,3,3)

[ 1 0 0
  0 1 0
  0 0 1 ] (7)

Matrix(1.0I,3,3)

[ 1.0 0.0 0.0
  0.0 1.0 0.0
  0.0 0.0 1.0 ] (8)

as a generalization of the scalar expression x=1⋅x. It is clear what the operation 𝑰𝒙 should signify: it should capture the vector scaling and subsequent vector addition x₁𝒆₁+x₂𝒆₂+⋯+xₘ𝒆ₘ. A specific meaning is now ascribed to 𝑰𝒙 by identifying the two definitions with one another.

Linear combination.
Repeatedly stating “vector scaling and subsequent vector addition” is unwieldy, so a special term is introduced for some given set of vectors {𝒂₁,…,𝒂ₙ}.

Definition (Linear Combination). The linear combination of vectors 𝒂₁,𝒂₂,…,𝒂ₙ∈V with scalars x₁,x₂,…,xₙ∈S in vector space (V,S,+,⋅) is the vector 𝒃=x₁𝒂₁+x₂𝒂₂+⋯+xₙ𝒂ₙ.

Matrix-vector product.
Similar to the grouping of unit vectors 𝒆₁,…,𝒆ₘ into the identity matrix 𝑰, a more concise way of referring to arbitrary vectors 𝒂₁,…,𝒂ₙ from the same vector space is the matrix 𝑨=[ 𝒂₁ 𝒂₂ … 𝒂ₙ ]. Combining these observations leads to the definition of a matrix-vector product.

Definition (Matrix-Vector Product). In the vector space (V,S,+,⋅), the product of the matrix 𝑨=[ 𝒂₁ 𝒂₂ … 𝒂ₙ ], composed of columns 𝒂₁,𝒂₂,…,𝒂ₙ∈V, with the vector 𝒙∈Sⁿ whose components are scalars x₁,x₂,…,xₙ∈S, is the linear combination 𝒃=x₁𝒂₁+x₂𝒂₂+⋯+xₙ𝒂ₙ=𝑨𝒙∈V.
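
The definition can be verified numerically: a sketch, with a1, a2 as arbitrarily chosen columns, confirming that A*x coincides with the explicit linear combination of the columns.

```julia
# Sketch: the matrix-vector product A*x equals the linear combination
# of the columns of A scaled by the components of x.
a1 = [1.0; 0.0; 2.0]
a2 = [0.0; 1.0; -1.0]
A = [a1 a2]                        # group columns into a matrix
x = [3.0; 2.0]
b = A * x
println(b == x[1]*a1 + x[2]*a2)    # true
println(b)                         # [3.0, 2.0, 4.0]
```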

2.2.Linear algebra problem examples

Linear combinations in E₂.
Consider a simple example that leads to a common linear algebra problem: decomposition of forces in the plane along two directions. Suppose a force is given in terms of components along the Cartesian x,y-axes, 𝒃=bx𝒆x+by𝒆y, as expressed by the matrix-vector multiplication 𝒃=𝑰𝒃. Note that the same force could be obtained by linear combination of other vectors, for instance the normal and tangential components of the force applied on an inclined plane with angle θ, 𝒃=xt𝒆t+xn𝒆n, as in Figure 1. This defines an alternate reference system for the problem. The unit vectors along these directions are

𝒕=[ cosθ sinθ ],𝒏=[ -sinθ cosθ ],

θ=π/6.; c=cos(θ); s=sin(θ); t=[c; s]; n=[-s; c];

and can be combined into a matrix 𝑨=[ 𝒕 𝒏 ]. The values of the components (xₜ,xₙ) are the scaling factors, and they can be combined into a vector 𝒙=[ xₜ xₙ ]ᵀ. The same force must result irrespective of whether its components are given along the Cartesian axes or the inclined plane directions, leading to the equality

𝑰𝒃=𝒃=𝑨𝒙. (9)
b=[0.2; 0.4]; I*b

[ 0.2
  0.4 ] (10)

Interpret equation (9) to state that the vector 𝒃 could be obtained either as a linear combination of the columns of 𝑰, 𝒃=𝑰𝒃, or as a linear combination of the columns of 𝑨, 𝒃=𝑨𝒙. Of course the simpler description seems to be 𝑰𝒃, for which the components are already known. But this is only due to an arbitrary choice made by a human observer to define the force in terms of horizontal and vertical components. The problem itself suggests that the tangential and normal components are more relevant; for instance, a friction force would be evaluated as a scaling of the normal force.

The components of 𝒃 in this more natural reference system are not known, but can be determined by solving the vector equality 𝑨𝒙=𝑰𝒃=𝒃, known as a linear system of equations, implemented in many programming environments (Julia, Matlab, Octave) through the backslash operator x=A\b.

A=[t n]

[ 0.8660254037844387  -0.49999999999999994
  0.49999999999999994  0.8660254037844387 ] (11)

x = A \ b

[ 0.37320508075688774
  0.2464101615137755 ] (12)

[I*b A*x]

[ 0.2 0.2
  0.4 0.4 ] (13)

Figure 1. Alternative decompositions of force on inclined plane.

Linear combinations in ℝᵐ and 𝒞⁰[0,2π).
Linear combinations in a real space can suggest properties or approximations of more complex objects such as continuous functions. Let 𝒞⁰[0,2π)=(C[0,2π),ℝ,+,⋅) denote the vector space of continuous functions that are periodic on the interval [0,2π), C[0,2π)={f | f:ℝ→ℝ, f(t)=f(t+2π)}. Recall that vector addition is defined by (f+g)(t)=f(t)+g(t), and scaling by (af)(t)=af(t), for f,g∈C[0,2π), a∈ℝ. Familiar functions within this vector space are sin(kt), cos(kt) with k∈ℕ, and these can be recognized to intrinsically represent periodicity on [0,2π), a role analogous to the normal and tangential directions in the inclined plane example. Define now another periodic function b(t+2π)=b(t) by repeating the values b(t)=t(π-t)(2π-t) from the interval [0,2π) on all intervals [2pπ,2(p+1)π), for p∈ℤ. The function b is not given in terms of the “naturally” periodic functions sin(kt), cos(kt), but can it be expressed as such? This can be stated as seeking a linear combination b(t)=∑_{k=1}^{∞} xₖ sin(kt), as studied in Fourier analysis. The coefficients xₖ could be determined from an analytical formula involving calculus operations, xₖ=(1/π)∫₀^{2π} b(t) sin(kt) dt, but we'll seek an approximation using a linear combination of n terms

b(t) ≈ ∑_{k=1}^{n} xₖ sin(kt),  A(t)=[ sin(t) sin(2t) … sin(nt) ],  A:ℝ→ℝⁿ.

Organize this as a matrix-vector product b(𝒕)≈A(𝒕)𝒙, with

A(𝒕)=[ sin(𝒕) sin(2𝒕) … sin(n𝒕) ],  𝒙=[ x₁ x₂ … xₙ ]ᵀ∈ℝⁿ.

The idea is to sample the column vectors of A(𝒕) at the components of the vector 𝒕=[ t₁ t₂ … tₘ ]ᵀ∈ℝᵐ, tⱼ=(j-1)h, j=1,2,…,m, h=2π/m. Let 𝒃=b(𝒕) and 𝑨=A(𝒕) denote the so-sampled b, A functions, leading to the definition of a vector 𝒃∈ℝᵐ and a matrix 𝑨∈ℝ^{m×n}. There are n coefficients available to scale the column vectors of 𝑨, and 𝒃 has m components. For m>n it is generally not possible to find 𝒙 such that 𝑨𝒙 would exactly equal 𝒃, but as seen later the condition that 𝑨𝒙 be as close as possible to 𝒃 leads to a well-defined solution procedure. This is known as a least squares problem, and it is automatically applied in the x=A\b instruction when the matrix A is not square. As seen in the following numerical experiment and Figure 2, the approximation is excellent, and the information conveyed by m=1000 samples of b(t) is now much more efficiently stored in the form chosen for the columns of 𝑨 and the n=3 scaling coefficients that are the components of 𝒙.

Figure 2. Comparison of least squares approximation (red line) with samples (black dots) of exact function b(t)=t(π-t)(2π-t)

using PyPlot                       # provides clf, plot, xlabel, ylabel, grid, title, savefig
m=1000; h=2*π/m; j=1:m;
t=((j.-1)*h);                      # sample points t_j=(j-1)h
n=3; A=sin.(t);                    # first column, sin(t)
for k=2:n
  global A
  A = [A sin.(k*t)]                # append column sin(kt)
end;
bt=t.*(π.-t).*(2*π.-t);            # samples of b(t)
x=A\bt; b=A*x;                     # least squares solution and its approximation of b
s=25; i=1:s:m; ts=t[i]; bs=bt[i];  # subsample for plotting
clf(); plot(ts,bs,"ok",t,b,"r");
xlabel("t"); ylabel("b(t)"); grid("on")
title("Fourier approximation of \$b(t)\$");
cd(homedir()*"/courses/MATH661/images");
savefig("L04Fig02.eps");

Summary.