A general procedure to relate input values from a set $X$ to output values from a set $Y$ is to first construct the set of all possible instances of $x \in X$ and $y \in Y$, which is the Cartesian product of $X$ with $Y$, denoted as $X \times Y = \{ (x,y) \mid x \in X,\ y \in Y \}$. Usually only some associations of inputs to outputs are of interest, leading to the following definition.
Associating an output to an input is also useful, leading to the definition of an inverse relation as $R^{-1} \subseteq Y \times X$, $R^{-1} = \{ (y,x) \mid (x,y) \in R \}$. Note that an inverse exists for any relation, and the inverse of an inverse is the original relation, $(R^{-1})^{-1} = R$.
Relation $R \subseteq X \times X$ is reflexive if $(x,x) \in R$ for any $x \in X$. The equality relation is reflexive, $x = x$; the less-than relation is not, $x \nless x$.
Relation $R$ is symmetric if $(x,y) \in R$ implies that $(y,x) \in R$. The equality relation is symmetric, $x = y \Rightarrow y = x$; the less-than relation is not, $x < y \nRightarrow y < x$.
Relation $R$ is anti-symmetric if $(x,y) \in R$ for $x \neq y$, then $(y,x) \notin R$. The less-than relation is anti-symmetric, $x < y \Rightarrow y \nless x$.
Relation $R$ is transitive if $(x,y) \in R$ and $(y,z) \in R$ implies $(x,z) \in R$, for any $x, y, z \in X$. The equality relation is transitive, $x = y$ and $y = z$ implies $x = z$, as is the less-than relation, $x < y$ and $y < z$ implies $x < z$.
Certain combinations of properties often arise. A homogeneous relation that is reflexive, symmetric, and transitive is said to be an equivalence relation. Equivalence relations include equality among the reals, or congruence among triangles. A homogeneous relation that is reflexive, anti-symmetric, and transitive is a partial order relation, such as the less-than-or-equal relation between reals. Finally, a homogeneous relation that is anti-symmetric and transitive is an order relation, such as the less-than relation between reals.
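As a quick illustration in Octave (a sketch, with the set size and relation chosen arbitrarily here), a homogeneous relation on a finite set $\{1,\dots,n\}$ can be stored as a logical matrix and the above properties checked directly.

∴
% Sketch: R(i,j)=true means (i,j) belongs to the relation on {1,...,n}
n = 4; R = triu(ones(n)) > 0;                    % the "less than or equal" relation
reflexive = all(diag(R));                        % (i,i) in R for every i
symmetric = isequal(R, R');                      % (i,j) in R implies (j,i) in R
antisymmetric = all(all(~(R & R') | logical(eye(n))));   % (i,j),(j,i) both in R only if i=j
transitive = all(all(~(double(R)*double(R) > 0) | R));   % composition stays within R
[reflexive symmetric antisymmetric transitive]   % expect [1 0 1 1] for "less than or equal"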
Functions between sets $X$ and $Y$ are a specific type of relation that often arises in science. For a given input $x \in X$, theories that predict a single possible output $y \in Y$ are of particular scientific interest.
The above intuitive definition can be transcribed in precise mathematical terms as: $f \subseteq X \times Y$ is a function if $(x,y) \in f$ and $(x,z) \in f$ implies $y = z$. Since it is a particular kind of relation, a function is a triplet of sets $(X, Y, f)$, but with a special, common notation to denote the triplet by $f : X \to Y$, with $f \subseteq X \times Y$ and the association written as $y = f(x)$. The set $X$ is the domain and the set $Y$ is the codomain of the function $f$. The value $x$ from the domain is the argument of the function associated with the function value $y = f(x)$. The function value $y$ is said to be returned by evaluation of $f$ at $x$.
As seen previously, a Euclidean space $\mathbb{R}^m$ can be used to suggest properties of more complex spaces such as the vector space of continuous functions $C^0(\mathbb{R})$. A construct that will often be used is to interpret a vector within $\mathbb{R}^m$ as a function, since $x \in \mathbb{R}^m$ with components $x_1, \dots, x_m$ also defines a function $f : \{1, 2, \dots, m\} \to \mathbb{R}$, with values $f(i) = x_i$. As the number of components grows, the function $f$ can provide better approximations of some continuous function $g \in C^0(\mathbb{R})$ through the function values $x_i = g(t_i)$ at distinct sample points $t_1, \dots, t_m$.
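As a brief sketch (the sampled function and sample points below are chosen only for illustration), sampling a continuous function at $m$ points produces a vector in $\mathbb{R}^m$ whose components approximate the function.

∴
% Sketch: sample g(t)=cos(t) at m equally spaced points of [0,2*pi]
m = 8; t = linspace(0, 2*pi, m);                 % sample points t_1,...,t_m
x = cos(t);                                      % components x_i = g(t_i)
[t' x']                                          % each row pairs a sample point with a component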
The above function examples are all defined on a domain of scalars or naturals and return scalar values. Within linear algebra the particular interest is in functions defined on sets of vectors from some vector space $\mathcal{V} = (V, S, +, \cdot)$ that return either scalars, $f : V \to S$, or vectors from some other vector space $\mathcal{W} = (W, S, +, \cdot)$, $f : V \to W$. The codomain of a vector-valued function might be the same set of vectors as its domain, $f : V \to V$. The fundamental operation within linear algebra is the linear combination $a u + b v$ with $a, b \in S$, $u, v \in V$. A key aspect is to characterize how a function behaves when given a linear combination as its argument, for instance whether $f(a u + b v)$ can be expressed in terms of $f(u)$ and $f(v)$.
Consider first the case of a function defined on a set of vectors that returns a scalar value, $f : V \to S$. These functions can be interpreted as labels attached to a vector, and are very often encountered in applications from natural phenomena or data analysis.
A function $f : V \to S$ is called a linear functional if for any $u, v \in V$ and any scalars $a, b \in S$
$$ f(a u + b v) = a f(u) + b f(v). \qquad (1) $$
Many different functionals may be defined on a vector space $\mathcal{V} = (V, S, +, \cdot)$, and an insightful alternative description is provided by considering the set of all linear functionals, which will be denoted as $V^{*} = \{ f \mid f : V \to S,\ f \text{ linear} \}$. These can be organized into another vector space $\mathcal{V}^{*} = (V^{*}, S, +, \cdot)$ with vector addition of linear functionals $f, g \in V^{*}$ and scaling by $a \in S$ defined by
$$ (f + g)(u) = f(u) + g(u), \qquad (a f)(u) = a\, f(u). \qquad (2) $$
As is often the case, the above abstract definition can better be understood by reference to the familiar case of Euclidean space. Consider $\mathbb{R}^2$, the set of vectors in the plane, with $x \in \mathbb{R}^2$ the position vector from the origin to the point $X$ in the plane with coordinates $(x_1, x_2)$. One functional from the dual space is $f_2(x) = x_2$, i.e., taking the second coordinate of the position vector. The linearity property is readily verified: for $u, v \in \mathbb{R}^2$, $a, b \in \mathbb{R}$,
$$ f_2(a u + b v) = a u_2 + b v_2 = a f_2(u) + b f_2(v). $$
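As a quick numerical sketch (the vectors and scalars below are chosen arbitrarily), the linearity of this functional can be checked directly in Octave.

∴
% Sketch: check linearity of the functional f2(x)=x(2)
f2 = @(x) x(2);                                  % returns the second coordinate
u = [1; 2]; v = [3; -1]; a = 2; b = -3;
[f2(a*u + b*v)   a*f2(u) + b*f2(v)]              % the two entries coincide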
Given some constant value $c \in \mathbb{R}$, the curves within the plane defined by $f_2(x) = c$ are called the contour lines or level sets of $f_2$. Several contour lines and position vectors are shown in Figure 1. The utility of functionals and dual spaces can be shown by considering a simple example from physics. Assume that $x_2$ is the height above ground level and a vector $x$ is the displacement of a body of mass $m$ in a gravitational field. The mechanical work done to lift the body from ground level to height $x_2$ is $W = m g\, f_2(x) = m g x_2$, with $g$ the gravitational acceleration. The mechanical work is the same for all displacements $x$ that satisfy the equation $f_2(x) = x_2$. The work expressed in units of $m g$ can be interpreted as the number of contour lines of $f_2$ intersected by the displacement vector $x$. This concept of duality between vectors and scalar-valued functionals arises throughout mathematics, the physical and social sciences, and in data science. The term “duality” itself comes from geometry. A point $X$ in $\mathbb{R}^2$ with coordinates $(x_1, x_2)$ can be defined either as the end-point of the position vector $x$, or as the intersection of the contour lines of two functionals, $f_1(x) = x_1$ and $f_2(x) = x_2$. Either geometric description works equally well in specifying the position of $X$, so it might seem redundant to have two such procedures. It turns out though that many quantities of interest in applications can be defined through use of both descriptions, as shown in the computation of mechanical work in a gravitational field.
Consider now functions $f : V \to W$ from a vector space $\mathcal{V} = (V, S, +, \cdot)$ to another vector space $\mathcal{W} = (W, S, +, \cdot)$. As before, the action of such functions on linear combinations is of special interest.
A function $f : V \to W$ is called a linear mapping if for any $u, v \in V$ and any scalars $a, b \in S$
$$ f(a u + b v) = a f(u) + b f(v). \qquad (3) $$
The image of a linear combination through a linear mapping is another linear combination, $f(a u + b v) = a f(u) + b f(v)$, and linear mappings are said to preserve the structure of a vector space; they are called homomorphisms in mathematics. The codomain of a linear mapping might be the same as the domain, $f : V \to V$, in which case the mapping is said to be an endomorphism.
Matrix-vector multiplication has been introduced as a concise way to specify a linear combination,
$$ f(x) = A x = x_1 a_1 + x_2 a_2 + \dots + x_n a_n, $$
with $a_1, \dots, a_n \in \mathbb{R}^m$ the columns of the matrix, $A = [\, a_1\ a_2\ \dots\ a_n \,] \in \mathbb{R}^{m \times n}$. This is a linear mapping between the real spaces $\mathbb{R}^n$ and $\mathbb{R}^m$, $f : \mathbb{R}^n \to \mathbb{R}^m$, and indeed any linear mapping between real spaces can be given as a matrix-vector product, $f(x) = A x$.
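As a brief sketch (the matrix and vectors below are chosen arbitrarily), both statements can be checked numerically: the product $A x$ is the linear combination of the columns of $A$, and the mapping $x \mapsto A x$ preserves linear combinations.

∴
% Sketch: A*x as a linear combination of columns, and linearity of x -> A*x
A = [1 3; 2 4; 0 1]; x = [2; -1]; y = [1; 5]; a = 3; b = -2;
[A*x   x(1)*A(:,1) + x(2)*A(:,2)]                % identical columns
[A*(a*x + b*y)   a*(A*x) + b*(A*y)]              % identical columns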
Vectors within the real space $\mathbb{R}^m$ can be completely specified by $m$ real numbers, even though $m$ is large in many realistic applications. A vector within $C^0(\mathbb{R})$, i.e., a continuous function defined on the reals, cannot be so specified since it would require an infinite, non-countable listing of function values. In either case, the task of describing the elements of a vector space by simpler means arises. Within data science this leads to classification problems in accordance with some relevant criteria.
Many classification criteria are scalars, defined as a scalar-valued function on a vector space, $f : V \to S$. The most common criteria are inspired by experience with Euclidean space. In a Euclidean-Cartesian model of the geometry of a plane $\Pi$, a point $O \in \Pi$ is arbitrarily chosen to correspond to the zero vector $\mathbf{0}$, along with two preferred vectors $e_1, e_2$ grouped together into the identity matrix $I = [\, e_1\ e_2 \,]$. The position of a point $X \in \Pi$ with respect to $O$ is given by the linear combination
$$ x = x_1 e_1 + x_2 e_2 = I x. $$
Several possible classifications of points in the plane are depicted in Figure 2: lines, squares, circles. Intuitively, each choice separates the plane into subsets, and a given point in the plane belongs to just one member of the chosen family of subsets. A more precise characterization is given by the concept of a partition of a set.
In precise mathematical terms, a partition of a set $S$ is a collection of subsets $P = \{ S_i \mid S_i \subseteq S,\ i \in I \}$ such that $\bigcup_{i \in I} S_i = S$ and, for any $x \in S$, $\exists!\ i \in I$ for which $x \in S_i$. Since there is only one set ($\exists!$ signifies “exists and is unique”) to which some given $x \in S$ belongs, the subsets of the partition are disjoint, $S_i \cap S_j = \emptyset$ for $i \neq j$. The subsets $S_i$ are labeled by $i$ within some index set $I$. The index set might be a subset of the naturals, $I \subseteq \mathbb{N}$, in which case the partition is countable, possibly finite. The partitions of the plane suggested by Figure 2 are however indexed by a real-valued label, $i \in I$ with $I \subseteq \mathbb{R}$.
A technique which is often used to generate a partition of a vector space $V$ is to define an equivalence relation between vectors, $\sim\ \subseteq V \times V$. For some element $u \in V$, the equivalence class of $u$ is defined as the set of all vectors equivalent to $u$, $[u] = \{ v \in V \mid v \sim u \}$. The set of equivalence classes of $\sim$ is called the quotient set and denoted as $V / \!\sim$, and the quotient set is a partition of $V$. Figure 2 depicts four different partitions of the plane. These can be interpreted geometrically, such as parallel lines or distance from the origin. With wider implications for linear algebra, the partitions can also be given in terms of classification criteria specified by functions.
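As a small sketch (the points below are chosen arbitrarily), the equivalence relation $u \sim v$ when $\|u\| = \|v\|$ groups vectors of the plane into classes labeled by their distance from the origin.

∴
% Sketch: vectors with equal norms belong to the same equivalence class (same circle)
u = [3; 4]; v = [0; 5]; w = [1; 1];
[norm(u) norm(v) norm(w)]                        % u and v share the class label 5; w does not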
The partition of $\mathbb{R}^2$ by circles from Figure 2 is familiar; the equivalence classes are sets of points whose position vector has the same size, $(x_1^2 + x_2^2)^{1/2} = c$, or is at the same distance from the origin. Note that familiarity with Euclidean geometry should not obscure the fact that some other concept of distance might be induced by the data. A simple example is the statement of walking distance in terms of city blocks, in which the distance from a starting point to an address $p$ blocks east and $q$ blocks north is $p + q$ city blocks, not the Euclidean distance $(p^2 + q^2)^{1/2}$, since one cannot walk through the buildings occupying a city block.
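As a quick numerical sketch (the block counts below are chosen for illustration), the walking distance is the 1-norm of the displacement, while the straight-line distance is the 2-norm.

∴
% Sketch: city-block (1-norm) distance versus Euclidean (2-norm) distance
p = 3; q = 4; x = [p; q];
[norm(x,1)  norm(x,2)]                           % 7 city blocks versus straight-line distance 5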
The above observations lead to the mathematical concept of a norm as a tool to evaluate vector magnitude. Recall that a vector space is specified by two sets and two operations, $\mathcal{V} = (V, S, +, \cdot)$, and the behavior of a norm with respect to each of these components must be defined. The desired behavior includes the following properties and formal definition.
The magnitude of a vector should be a unique scalar, requiring the definition of a function. The scalar could have irrational values and should allow ordering of vectors by size, so the function should be from $V$ to $\mathbb{R}_+$, $f : V \to \mathbb{R}_+$. On the real line the point at coordinate $x$ is at distance $|x|$ from the origin, and to mimic this usage the norm of $v \in V$ is denoted as $\|v\|$, leading to the definition of a function $\|\cdot\| : V \to \mathbb{R}_+$, $\mathbb{R}_+ = \{ a \in \mathbb{R} \mid a \geq 0 \}$.
Provision must be made for the only distinguished element of $V$, the null vector $\mathbf{0}$. It is natural to associate the null vector with the null scalar element, $\|\mathbf{0}\| = 0$. A crucial additional property is also imposed, namely that the null vector is the only vector whose norm is zero, $\|v\| = 0 \Rightarrow v = \mathbf{0}$. From knowledge of a single scalar value, $\|v\| = 0$, an entire vector is determined, $v = \mathbf{0}$. This property arises at key junctures in linear algebra, notably in providing a link to another branch of mathematics known as analysis, and is needed to establish results such as the fundamental theorem of linear algebra or the singular value decomposition encountered later.
Transfer of the scaling operation property leads to imposing $\|a v\| = |a|\,\|v\|$. This property ensures commensurability of vectors, meaning that the magnitude of a vector $u$ can be expressed as a multiple of some standard vector magnitude $\|v\|$.
Position vectors from the origin to coordinates $x_1, x_2 > 0$ on the real line can be added and $|x_1 + x_2| = |x_1| + |x_2|$. If however the position vectors point in different directions, $x_1 > 0$, $x_2 < 0$, then $|x_1 + x_2| < |x_1| + |x_2|$. For a general vector space the analogous property is known as the triangle inequality, $\|u + v\| \leq \|u\| + \|v\|$ for $u, v \in V$. Gathering the above, a norm on the vector space $\mathcal{V} = (V, S, +, \cdot)$ is a function $\|\cdot\| : V \to \mathbb{R}_+$ that for any $u, v \in V$ and any scalar $a \in S$ satisfies the following properties, illustrated numerically after the list.
$\|v\| \geq 0$, with $\|v\| = 0$ if and only if $v = \mathbf{0}$;
$\|a v\| = |a|\, \|v\|$;
$\|u + v\| \leq \|u\| + \|v\|$.
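As a quick numerical illustration of these properties (the vectors and scalar below are chosen arbitrarily), the Euclidean norm on $\mathbb{R}^3$ can be checked directly in Octave.

∴
% Sketch: numerical check of the norm properties for the Euclidean norm
u = [1; -2; 2]; v = [3; 0; -4]; a = -2;
[norm(zeros(3,1))  0]                            % the null vector is the only vector of zero norm
[norm(a*u)  abs(a)*norm(u)]                      % scaling property, both equal 6
[norm(u+v)  norm(u)+norm(v)]                     % triangle inequality, 4.899 <= 8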
Note that the norm is a functional, but the triangle inequality implies that it is not generally a linear functional. Returning to Figure 2, consider the functions $f_i : \mathbb{R}^2 \to \mathbb{R}$ defined for $x \in \mathbb{R}^2$ through the values
$$ f_1(x) = x_1, \qquad f_2(x) = x_2, \qquad f_3(x) = |x_1| + |x_2|, \qquad f_4(x) = (x_1^2 + x_2^2)^{1/2}. $$
Sets of constant value of the above functions are also equivalence classes induced by the equivalence relations
$u \sim_1 v$, if $f_1(u) = f_1(v)$;
$u \sim_2 v$, if $f_2(u) = f_2(v)$;
$u \sim_3 v$, if $f_3(u) = f_3(v)$;
$u \sim_4 v$, if $f_4(u) = f_4(v)$.
These equivalence classes correspond to the vertical lines, horizontal lines, squares, and circles of Figure 2. Not all of the functions are norms, since $f_1$ is zero for the non-null vector $x = (0, 1)$, and $f_2$ is zero for the non-null vector $x = (1, 0)$. The functions $f_3$ and $f_4$ are indeed norms, and specific cases of the following general norm.
For a vector $x \in \mathbb{R}^m$ and $p \geq 1$, the $p$-norm is
$$ \|x\|_p = \left( |x_1|^p + |x_2|^p + \dots + |x_m|^p \right)^{1/p}. \qquad (4) $$
Denote by $|x_{\max}|$ the largest component in absolute value of $x \in \mathbb{R}^m$. As $p$ increases, $|x_{\max}|^p$ becomes dominant with respect to all other terms in the sum, suggesting the definition of an inf-norm by
$$ \|x\|_\infty = \max_{i = 1, \dots, m} |x_i|. $$
This also works for vectors with equal components, since the fact that the number of components $m$ is finite while $p \to \infty$ can be used, as exemplified for $x = [\, a\ a\ \dots\ a \,]^T$, $a > 0$, by $\|x\|_p = (m\, a^p)^{1/p} = m^{1/p}\, a \to a = \|x\|_\infty$, with $\lim_{p \to \infty} m^{1/p} = 1$.
Note that the Euclidean norm corresponds to $p = 2$, and is often called the $2$-norm. The analogy between vectors and functions can be exploited to also define a $p$-norm for $f \in C^0([0, 1])$, the vector space of continuous functions defined on the interval $[0, 1]$,
$$ \|f\|_p = \left( \int_0^1 |f(x)|^p \, \mathrm{d}x \right)^{1/p}. \qquad (5) $$
The integration operation can be intuitively interpreted as the value of the sum from equation (4) for very large $m$ and very closely spaced evaluation points of the function, for instance $x_i = i/m$. An inf-norm can also be defined for continuous functions by
$$ \|f\|_\infty = \sup_{x \in [0,1]} |f(x)|, $$
where $\sup$, the supremum operation, can be intuitively understood as the generalization of the $\max$ operation over the countable set $\{ f(x_i) \mid i = 1, \dots, m \}$ to the uncountable set $\{ f(x) \mid x \in [0, 1] \}$.
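As a brief sketch (the function, interval, and number of sample points below are chosen for illustration), the integral in equation (5) can be approximated by a sum over closely spaced sample points, mirroring the vector $p$-norm.

∴
% Sketch: approximate the 2-norm of f(x)=sin(pi*x) on [0,1] by a sum over m points
p = 2; m = 1000; x = (1:m)/m; f = sin(pi*x);
[(sum(abs(f).^p)/m)^(1/p)   1/sqrt(2)]           % the exact 2-norm of sin(pi*x) on [0,1]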
Vector norms arise very often in applications since they can be used to classify data, and are implemented in most software systems as a function norm(x,p) to evaluate the $p$-norm of a vector x, with $p=2$ as the default.
∴ x=[1 1 1]; [norm(x) sqrt(3.0)]
$$ \begin{bmatrix} 1.7321 & 1.7321 \end{bmatrix} \qquad (6) $$
∴ m=9; x=ones(m,1); [norm(x) sqrt(m)]
$$ \begin{bmatrix} 3 & 3 \end{bmatrix} \qquad (7) $$
∴ m=4; x=ones(m,1); [norm(x,1) m]
$$ \begin{bmatrix} 4 & 4 \end{bmatrix} \qquad (8) $$
∴ [norm(x,1) norm(x,2) norm(x,4) norm(x,8) norm(x,16) norm(x,Inf)]
$$ \begin{bmatrix} 4.0000 & 2.0000 & 1.4142 & 1.1892 & 1.0905 & 1.0000 \end{bmatrix} \qquad (9) $$
∴
Norms are functionals that define what is meant by the size of a vector, but they are not linear. Even in the simplest case of the real line, the linearity relation $|x + y| = |x| + |y|$ is not verified for $x > 0$, $y < 0$. Nor do norms characterize the familiar geometric concept of orientation of a vector. A particularly important orientation from Euclidean geometry is orthogonality between two vectors. Another function is required, but before a formal definition some intuitive understanding is sought by considering vectors and functionals in the plane, as depicted in Figure 4. Consider a position vector $x \in \mathbb{R}^2$ and the previously encountered linear functionals
$$ f_1(x) = x_1, \qquad f_2(x) = x_2. $$
The component $x_1$ of the vector $x$ can be thought of as the number of level sets of $f_1$ that it crosses; similarly for the $x_2$ component. A convenient labeling of level sets is by their normal vectors. The level sets of $f_1$ have normal $e_1^T = [\, 1\ 0 \,]$, and those of $f_2$ have normal vector $e_2^T = [\, 0\ 1 \,]$. Both of these can be thought of as matrices with two columns, each containing a single component. The products of these matrices with the vector $x$ give the values of the functionals
$$ f_1(x) = e_1^T x = x_1, \qquad f_2(x) = e_2^T x = x_2. $$
In general, any linear functional $f$ defined on the real space $\mathbb{R}^m$ can be labeled by a vector
$$ a^T = [\, a_1\ a_2\ \dots\ a_m \,], $$
and evaluated through the matrix-vector product $f(x) = a^T x$. This suggests the definition of another function $s : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$,
$$ s(a, x) = a^T x = a_1 x_1 + a_2 x_2 + \dots + a_m x_m. $$
The function $s$ is called an inner product; it has two vector arguments from which a matrix-vector product is formed, and it returns a scalar value, hence it is also called a scalar product. The definition from a Euclidean space can be extended to general vector spaces. For now, consider the field of scalars to be the reals, $S = \mathbb{R}$, so that an inner product on $\mathcal{V} = (V, \mathbb{R}, +, \cdot)$ is a function $s : V \times V \to \mathbb{R}$ with the following properties, checked numerically in the sketch after the list.
For any $u, v \in V$, $s(u, v) = s(v, u)$ (symmetry).
For any $a, b \in \mathbb{R}$, $u, v, w \in V$, $s(a u + b v, w) = a\, s(u, w) + b\, s(v, w)$ (linearity in the first argument).
For any $u \in V$, $s(u, u) \geq 0$, with $s(u, u) = 0$ if and only if $u = \mathbf{0}$ (positive definiteness).
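As a brief numerical sketch (the vectors and scalars below are chosen arbitrarily), the inner product $s(u, v) = u^T v$ in $\mathbb{R}^3$ exhibits these properties.

∴
% Sketch: symmetry, linearity, and positivity of s(u,v)=u'*v
u = [1; -2; 2]; v = [3; 0; -4]; w = [1; 1; 1]; a = 2; b = -1;
[u'*v   v'*u]                                    % symmetry, both equal -5
[(a*u + b*v)'*w   a*(u'*w) + b*(v'*w)]           % linearity in the first argument
u'*u                                             % positive for the non-null vector u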
The inner product $s(a, x) = a^T x$ returns the number of level sets of the functional labeled by $a$ crossed by the vector $x$, and this interpretation underlies many applications in the sciences, as in the gravitational field example above. Inner products also provide a procedure to evaluate geometrical quantities and relationships.
In $\mathbb{R}^m$, the number of level sets of the functional labeled by $x$ crossed by $x$ itself is identical to the square of the 2-norm
$$ s(x, x) = x^T x = \|x\|_2^2. $$
In general, the square root of $s(u, u)$ satisfies the properties of a norm, and is called the norm induced by the inner product,
$$ \|u\| = \sqrt{s(u, u)}. $$
A real space $\mathbb{R}^m$ together with the scalar product $s(x, y) = x^T y$ and the induced norm $\|x\| = \sqrt{x^T x}$ defines a Euclidean vector space $\mathcal{E}_m$.
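As a one-line sketch (the vector below is chosen for illustration), the induced norm coincides with the 2-norm returned by norm.

∴
% Sketch: the norm induced by the scalar product equals the 2-norm
x = [3; -4; 12];
[sqrt(x'*x)   norm(x)]                           % both equal 13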
In $\mathbb{R}^2$, the point $X$ specified by polar coordinates $(r, \theta)$ has the Cartesian coordinates $x_1 = r \cos\theta$, $x_2 = r \sin\theta$, and position vector $x = [\, x_1\ x_2 \,]^T$. The inner product
$$ e_1^T x = x_1 = r \cos\theta $$
is seen to contain information on the relative orientation of $x$ with respect to $e_1$. In general, the angle $\theta$ between two vectors $u, v$ within any vector space with a scalar product can be defined by
$$ \cos\theta = \frac{s(u, v)}{\sqrt{s(u, u)}\, \sqrt{s(v, v)}}, $$
which becomes
$$ \cos\theta = \frac{u^T v}{\|u\|\, \|v\|} $$
in a Euclidean space, $u, v \in \mathbb{R}^m$.
In $\mathbb{R}^2$ two vectors are orthogonal if the angle between them is $\pi/2$, such that $\cos\theta = 0$, and this can be extended to an arbitrary vector space with a scalar product by stating that $u, v$ are orthogonal if $s(u, v) = 0$. In $\mathbb{R}^m$ vectors $u, v \in \mathbb{R}^m$ are orthogonal if $u^T v = 0$.
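As a short numerical sketch (the vectors below are chosen arbitrarily), the angle formula and the orthogonality condition can be evaluated directly.

∴
% Sketch: angle between vectors from the scalar product, and an orthogonality check
u = [1; 0]; v = [1; 1]; w = [-1; 1];
acos(u'*v / (norm(u)*norm(v)))                   % pi/4, approximately 0.7854
v'*w                                             % zero, hence v and w are orthogonal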
From two functions $f : A \to B$ and $g : B \to C$, a composite function, $h = g \circ f$, $h : A \to C$, is defined by
$$ h(x) = g(f(x)). $$
Consider linear mappings between Euclidean spaces, $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^m \to \mathbb{R}^p$. Recall that linear mappings between Euclidean spaces are expressed as matrix-vector multiplication,
$$ f(x) = A x, \quad g(y) = B y, \qquad A \in \mathbb{R}^{m \times n},\ B \in \mathbb{R}^{p \times m}. $$
The composite function is $h = g \circ f$, $h : \mathbb{R}^n \to \mathbb{R}^p$, defined by
$$ h(x) = g(f(x)) = B (A x). $$
Note that the intermediate vector $u = A x \in \mathbb{R}^m$ is subsequently multiplied by the matrix $B$. The composite function $h$ is itself a linear mapping,
$$ h(a x + b y) = B(A(a x + b y)) = B(a A x + b A y) = a\, B(A x) + b\, B(A y) = a\, h(x) + b\, h(y), $$
so it can also be expressed as a matrix-vector multiplication
$$ h(x) = C x, \qquad C \in \mathbb{R}^{p \times n}. \qquad (10) $$
Using the above, $C$ is defined as the product of matrix $B$ with matrix $A$,
$$ C = B A. $$
The columns of $C$ can be determined from those of $A$ by considering the action of $h$ on the column vectors of the identity matrix $I = [\, e_1\ e_2\ \dots\ e_n \,] \in \mathbb{R}^{n \times n}$. First, note that
$$ f(e_j) = A e_j = a_j, \quad j = 1, \dots, n. \qquad (11) $$
The above can be repeated for the matrix $C = [\, c_1\ c_2\ \dots\ c_n \,]$, giving
$$ h(e_j) = C e_j = c_j, \quad j = 1, \dots, n. \qquad (12) $$
Combining the above equations leads to $c_j = h(e_j) = g(f(e_j)) = B(A e_j) = B a_j$, or
$$ c_j = B a_j, \quad j = 1, \dots, n. $$
From the above, the matrix-matrix product $B A$ is seen to simply be a grouping of all the products of $B$ with the column vectors of $A$,
$$ C = B A = [\, B a_1\ B a_2\ \dots\ B a_n \,]. $$
∴ a1=[1; 2]; a2=[3; 4]; A=[a1 a2]
$$ A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \qquad (13) $$
∴ b1=[-1; 1; 3]; b2=[2; -2; 3]; B=[b1 b2]
$$ B = \begin{bmatrix} -1 & 2 \\ 1 & -2 \\ 3 & 3 \end{bmatrix} \qquad (14) $$
∴ C=B*A
$$ C = \begin{bmatrix} 3 & 5 \\ -3 & -5 \\ 9 & 21 \end{bmatrix} \qquad (15) $$
∴ c1=B*a1; c2=B*a2; [c1 c2]
$$ \begin{bmatrix} 3 & 5 \\ -3 & -5 \\ 9 & 21 \end{bmatrix} \qquad (16) $$
∴
Summary.
Linear functionals attach a scalar label to a vector, and preserve linear combinations
Functionals, not necessarily linear, arise when establishing vector magnitude (norms) and orientation (inner products)
Linear mappings establish correspondences between vector spaces and preserve linear combinations
Composition of linear mappings is represented through matrix multiplication