MATH661

Lecture 3: Problems and Algorithms

1.Mathematical problems

1.1.Formalism for defining a mathematical problem

In general, mathematical problems can be thought of as mappings from some set of inputs $X$ to some set of outputs $Y$. The mapping is often carried out through a function $f$, i.e., a procedure that associates a single $y \in Y$ with each input $x \in X$:

f : X \to Y, y = f (x), x \overset{f}{\to} y

Examples:

• Compute the square of a real:

X = ℝ, Y = ℝ, y = f (x) = x^{2} .

∴	f(x)=x^2

f

∴	f(2)

$4$

• Find the solution $x$ of $a x + b = c$ for given $a, b, c \in ℝ$, $a \neq 0$. The inputs to this problem are $a, b, c$ and the output is the solution $(c - b) / a$:

X = (ℝ \setminus \{0\}) \times ℝ \times ℝ, Y = ℝ, f (a, b, c) = (c - b) / a .

∴	function f(a,b,c) if (a!=0) return (c-b)/a end end

f

∴	f(1,2,3)

$1.0$

• Compute the inner product of two vectors $𝒖, 𝒗 \in ℝ^{n}$ :

X = ℝ^{n} \times ℝ^{n}, Y = ℝ, y = f (𝒖, 𝒗) = \sum_{i = 1}^{n} u_{i} v_{i}

with $u_{i}, v_{i}$ the components of $𝒖, 𝒗$ . Note that the input set is the Cartesian product of sets of vectors and the output set is the reals. Such functions defined from sets of vectors (more accurately vector spaces) to reals (more accurately scalars) are called functionals.

∴	function f(u,v) sum(u.*v) end

f

∴	f([1 2 3],[1 2 3])

$14$
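The sum in the definition can also be written as an explicit loop and compared against the `dot` routine of Julia's `LinearAlgebra` standard library. This is a minimal sketch; the name `inner` is introduced here for illustration and is not part of the session above.

```julia
using LinearAlgebra  # standard library; provides dot

# Inner product written out as the sum in the definition
function inner(u, v)
    length(u) == length(v) || throw(DimensionMismatch("u and v must have the same length"))
    s = zero(eltype(u)) * zero(eltype(v))
    for i in eachindex(u, v)
        s += u[i] * v[i]
    end
    return s
end

u = [1, 2, 3]
v = [1, 2, 3]
println(inner(u, v))   # explicit loop
println(dot(u, v))     # library routine
```

Both calls should agree with the value 14 obtained above.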

• Compute the definite integral

(u, v) = \int_{a}^{b} u (x) v (x) d x,

with $u, v$ arbitrary continuous functions, denoted by $u, v \in C^{(0)} ([a, b])$ :

X = C^{(0)} ([a, b]) \times C^{(0)} ([a, b]), Y = ℝ .

Again, this is an example of a functional.

The functional is now defined for function arguments, and evaluation can be carried out either symbolically or numerically. Symbolic evaluation in the open-source Maxima system for $u, v \in 𝒯 = \{1, \sin x, \cos x, \sin 2 x, \cos 2 x, \dots\}$ over the interval $[- π, π]$ demonstrates orthogonality of the trigonometric functions $𝒯$ .

(%i1)

scprod(f,g,x,a,b):=integrate(f*g,x,a,b)

$(%o1) scprod (f, g, x, a, b) ≔ integrate (f g, x, a, b)$

(%i5)

scprod(sin(x),sin(x),x,-%pi,%pi)

$(%o5) π$

(%i30)

s:makelist(sin(k*x),k,1,3)

$(%o30) [\sin (x), \sin (2 x), \sin (3 x)]$

(%i31)

c:makelist(cos(k*x),k,0,3)

$(%o31) [1, \cos (x), \cos (2 x), \cos (3 x)]$

(%i32)

sc:join(c,s)

$(%o32) [1, \sin (x), \cos (x), \sin (2 x), \cos (2 x), \sin (3 x)]$

(%i28)

trigsp(f,g):=integrate(f*g,x,-%pi,%pi)

$(%o28) trigsp (f, g) ≔ integrate (f g, x, - π, π)$

(%i34)

tsp:outermap(trigsp,sc,sc)

$(%o34) [[2 π, 0, 0, 0, 0, 0], [0, π, 0, 0, 0, 0], [0, 0, π, 0, 0, 0], [0, 0, 0, π, 0, 0], [0, 0, 0, 0, π, 0], [0, 0, 0, 0, 0, π]]$
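The same scalar products can be approximated numerically. The sketch below assumes a composite trapezoid rule with `n` equal subintervals; the name `scprod_num` and the default `n = 1000` are illustrative choices and do not come from the Maxima session.

```julia
# Composite trapezoid approximation of (u, v) = ∫_a^b u(x) v(x) dx
# with n equal subintervals (n is an illustrative choice).
function scprod_num(u, v, a, b; n = 1000)
    a, b = float(a), float(b)
    h = (b - a) / n
    s = (u(a) * v(a) + u(b) * v(b)) / 2
    for k in 1:n-1
        x = a + k * h
        s += u(x) * v(x)
    end
    return h * s
end

println(scprod_num(sin, sin, -π, π))             # close to π
println(scprod_num(sin, cos, -π, π))             # close to 0
println(scprod_num(x -> 1.0, x -> 1.0, -π, π))   # close to 2π
```

The numerical values approximate the diagonal and off-diagonal entries found symbolically, up to quadrature and rounding error.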

• Compute the derivative of a function $g \in C^{(1)} (ℝ)$, with $C^{(k)} (ℝ)$ the space of functions defined on $ℝ$ that are $k$ times continuously differentiable: $X = C^{(1)} (ℝ)$, $Y = C^{(0)} (ℝ)$, $f = d / d x$. Note that here $X, Y$ are sets of functions, and $f$ is then referred to as an operator.

As above, the operator can be evaluated symbolically, and is predefined in all symbolic computation packages including Maxima:

(%i43)

diff(sin(x),x)

(%o43) cos(x)

(%i44)

diff(sin(cos(x))+cos(sin(x)),x)

(%o44) (-cos(x)*sin(sin(x)))-sin(x)*cos(cos(x))
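The derivative operator can also be approximated numerically. The following sketch assumes a centered finite difference with step `h`; the names `dnum`, `g`, `dg` and the value `h = 1e-5` are illustrative choices, and `dg` transcribes the symbolic result above for comparison.

```julia
# Centered finite difference approximation of dg/dx at x.
# The step h is an illustrative choice, not a tuned value.
dnum(g, x; h = 1e-5) = (g(x + h) - g(x - h)) / (2 * h)

g(x) = sin(cos(x)) + cos(sin(x))
# Derivative reported by the symbolic computation
dg(x) = -cos(x) * sin(sin(x)) - sin(x) * cos(cos(x))

x0 = 0.7
println(dnum(g, x0))   # numerical approximation
println(dg(x0))        # value of the symbolic derivative
```

The two printed values should agree to several digits; the remaining difference combines truncation error of the difference formula and rounding error.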


• Find the roots of a polynomial $p_{n} (x) = a_{n} x^{n} + \dots + a_{1} x + a_{0}$, $a_{n} \neq 0$. The input is the polynomial specified by the vector of coefficients $𝒂 \in ℝ^{n + 1}$. The output is another vector $𝒙 \in ℂ^{n}$ whose components are the roots, $p_{n} (x_{i}) = 0$ (a polynomial with real coefficients can have complex roots)

X = ℝ^{n + 1}, Y = ℂ^{n} .

For $n ⩾ 5$ the function $f : X \to Y$ cannot be written explicitly in terms of radicals (corollary of the Abel-Ruffini theorem), but there are approximations $\tilde{f}$ of the root-finding function that can be implemented such that $\tilde{f} ≅ f$ .

Most software systems include facilities for polynomials, including root-finding.

∴	import Pkg; Pkg.add("Polynomials");

∴	using Polynomials

∴	p=Polynomial([-6,11,-6,1])

-6 + 11*x - 6*x^2 + x^3

∴	roots(p)

$[\begin{array}{c} 1.0000000000000002 \\ 1.999999999999998 \\ 3.0000000000000018 \end{array}]$
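One possible construction of an approximation $\tilde{f}$ of the root-finding function is to compute the eigenvalues of the companion matrix of the polynomial. The sketch below uses only the `LinearAlgebra` standard library; the name `companion_roots` is hypothetical, and this is an illustration rather than a description of what any particular package does internally.

```julia
using LinearAlgebra  # standard library; provides eigvals

# Approximate the roots of a_0 + a_1 x + ... + a_n x^n (a_n ≠ 0)
# as the eigenvalues of the companion matrix of the monic polynomial.
function companion_roots(a)
    n = length(a) - 1
    c = a[1:n] ./ a[n+1]          # coefficients of the monic polynomial
    C = zeros(eltype(c), n, n)
    for i in 2:n
        C[i, i-1] = 1             # ones on the subdiagonal
    end
    C[:, n] .= -c                 # last column holds -a_0/a_n, ..., -a_{n-1}/a_n
    return eigvals(C)
end

r = companion_roots([-6, 11, -6, 1])
println(r)                        # approximately 1, 2, 3
```

The coefficient ordering (constant term first) follows the convention of the `Polynomial([-6,11,-6,1])` call above.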

Note that the specification of a mathematical problem requires definition of the triplet $(X, Y, f)$ .

Once a problem is specified, the natural question is whether a solution exists. Establishing existence is typically the objective of some field of mathematics (e.g., analysis, functional analysis). From the point of view of science, the essential questions go beyond existence:

How does the output $y = f (x)$ change if $x$ changes?
What are the constructive methods to approximate $y$?

1.2.Vector space

The above general definition of a mathematical problem must be refined in order to assess the magnitude of changes in inputs or outputs. A first step is to introduce some structure in the input and output sets $X, Y$ . Using these sets, vector spaces $𝒱 = (V, S, +, \cdot)$ are constructed, consisting of a set of vectors $V$ , a set of scalars $S$ , an addition operation $+$ , and a scaling operation $\cdot$ . The vector space is often referred to simply by its set of vectors $V$ , when the set of scalars, addition operation, and scaling operation are self-evident in context.

Formally, a vector space $𝒱$ is defined by a set $V$ whose elements satisfy certain scaling and addition properties, denoted all together by the 4-tuple $𝒱 = (V, S, +, \cdot)$ . The first element of the 4-tuple is a set whose elements are called vectors. The second element is a set of scalars, and the third is the vector addition operation. The last is the scaling operation, seen as multiplication of a vector by a scalar. The vector addition and scaling operations must satisfy rules suggested by positions or forces in three-dimensional space, which are listed in Table 1. In particular, a vector space requires definition of two distinguished elements: the zero vector $𝟎 \in V$ , and the identity scalar element $1 \in S$ .

Addition rules for	$\forall 𝒂, 𝒃, 𝒄 \in V$
$𝒂 + 𝒃 \in V$	Closure
$𝒂 + (𝒃 + 𝒄) = (𝒂 + 𝒃) + 𝒄$	Associativity
$𝒂 + 𝒃 = 𝒃 + 𝒂$	Commutativity
$𝟎 + 𝒂 = 𝒂$	Zero vector
$𝒂 + (- 𝒂) = 𝟎$	Additive inverse
Scaling rules for	$\forall 𝒂, 𝒃 \in V$ , $\forall x, y \in S$
$x 𝒂 \in V$	Closure
$x (𝒂 + 𝒃) = x 𝒂 + x 𝒃$	Distributivity
$(x + y) 𝒂 = x 𝒂 + y 𝒂$	Distributivity
$x (y 𝒂) = (x y) 𝒂$	Composition
$1 𝒂 = 𝒂$	Scalar identity

Table 1. Vector space $𝒱 = (V, S, +, \cdot)$ properties for arbitrary $𝒂, 𝒃, 𝒄 \in V$
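The rules in Table 1 can be spot-checked for $ℝ^{3}$ with Julia arrays as vectors and floating-point numbers as scalars. This is only a sketch for a few sample vectors, not a proof; the sample values are arbitrary choices with small integer-valued entries so that every operation is exact in floating-point arithmetic (in general, floating-point addition is not associative).

```julia
# Vectors in ℝ^3 stored as Julia arrays; scalars are Float64.
# Small integer-valued entries keep every operation exact in floating point.
a = [1.0, 2.0, 3.0]
b = [4.0, -1.0, 0.0]
c = [2.0, 2.0, -5.0]
x, y = 2.0, -3.0
z = zeros(3)                             # zero vector

println(a + (b + c) == (a + b) + c)      # associativity
println(a + b == b + a)                  # commutativity
println(z + a == a)                      # zero vector
println(a + (-a) == z)                   # additive inverse
println(x * (a + b) == x * a + x * b)    # distributivity
println((x + y) * a == x * a + y * a)    # distributivity
println(x * (y * a) == (x * y) * a)      # composition
println(1.0 * a == a)                    # scalar identity
```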

1.3.Norm

A first step is quantification of the changes in input or output, assumed to have the structure of a vector space, $𝒳 = (X, ℝ, +, \cdot), 𝒴 = (Y, ℝ, +, \cdot)$ .

Definition 1. A norm on vector space $𝒳$ is a function $|| \cdot || : X \to ℝ_{+}$ that for any $x, y \in X$, $α \in ℝ$ satisfies the properties:

$|| x || = 0$ if and only if $x = 0$.

$|| α x || = | α | || x ||$

$|| x + y || ⩽ || x || + || y ||$
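As an illustration, the commonly used 1-, 2-, and $\infty$-norms on $ℝ^{n}$ are available through `norm(x, p)` in Julia's `LinearAlgebra` standard library. The sketch below evaluates them for sample vectors (arbitrary choices) and checks the second and third properties of Definition 1 for those samples only.

```julia
using LinearAlgebra  # standard library; provides norm

x = [3.0, -4.0]
y = [1.0, 2.0]
α = -2.0

for p in (1, 2, Inf)
    println("p = ", p, ": ||x|| = ", norm(x, p))
    # scaling property and triangle inequality for this sample
    println(norm(α * x, p) ≈ abs(α) * norm(x, p), " ",
            norm(x + y, p) <= norm(x, p) + norm(y, p))
end
```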

1.4.Condition number

The absolute condition number of a problem is the largest ratio of the change in output to the change in input, in the limit of vanishingly small input changes.

Definition 2. The problem $f : X \to Y$ has absolute condition number

$\hat{κ} = \lim_{ε \to 0} \sup_{|| δ x || ⩽ ε} \frac{|| f (x + δ x) - f (x) ||}{|| δ x ||}$
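As a worked instance of Definition 2, consider the problem of squaring a real, $f (x) = x^{2}$, $X = Y = ℝ$, with the absolute value taken as the norm (a choice made here for illustration):

$\frac{| f (x + δ x) - f (x) |}{| δ x |} = \frac{| 2 x δ x + (δ x)^{2} |}{| δ x |} = | 2 x + δ x |, \quad \sup_{| δ x | ⩽ ε} | 2 x + δ x | = 2 | x | + ε, \quad \hat{κ} = \lim_{ε \to 0} (2 | x | + ε) = 2 | x | .$

The result coincides with $| f' (x) |$, as expected for a differentiable scalar function.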

To avoid the influence of the choice of reference units, the relative condition number is also introduced.

Definition 3. The problem $f : X \to Y$ has relative condition number

$κ = \lim_{ε \to 0} \sup_{|| δ x || ⩽ ε} \frac{|| f (x + δ x) - f (x) ||}{|| f (x) ||} \cdot \frac{|| x ||}{|| δ x ||} .$
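For scalar problems, Definition 3 can be estimated numerically by sampling small perturbations on either side of $x$. The following is a rough sketch, not a rigorous evaluation of the limit; the name `relcond` and the perturbation size `δ = 1e-6` are assumptions made for illustration.

```julia
# Estimate the relative condition number of a scalar problem f at x by
# sampling perturbations of relative size δ (an illustrative value).
function relcond(f, x; δ = 1e-6)
    κ = 0.0
    for s in (-1.0, 1.0)
        dx = s * δ * abs(x)
        κ = max(κ, abs(f(x + dx) - f(x)) / abs(f(x)) * abs(x) / abs(dx))
    end
    return κ
end

println(relcond(x -> x^2, 3.0))        # close to 2
println(relcond(x -> x - 1, 1.001))    # close to 1001: subtraction of nearby numbers
println(relcond(sqrt, 2.0))            # close to 1/2
```

Squaring and the square root are well conditioned at these points, while subtracting 1 from a number close to 1 has a large relative condition number.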

2.Solution algorithm

2.1.Accuracy

In scientific computation, the mathematical problem $f : X \to Y$ is approximated by an algorithm $\tilde{f} : \tilde{X} \to \tilde{Y}$, in which $\tilde{f}$ is assumed to be computable, and $\tilde{X}, \tilde{Y}$ are vector spaces that approximate $X, Y$. As a first step in characterizing how well the algorithm $\tilde{f}$ approximates the problem $f$, consider that $\tilde{X} = X$ and $\tilde{Y} = Y$, i.e., there is no error in representation of the domain and codomain.

Definition 4. The absolute error of algorithm $\tilde{f} : X \to Y$ that approximates the problem $f : X \to Y$ is

$e = || \tilde{f} (x) - f (x) || .$

Definition 5. The relative error of algorithm $\tilde{f} : X \to Y$ that approximates the problem $f : X \to Y$ is

$ε = \frac{|| \tilde{f} (x) - f (x) ||}{|| f (x) ||} .$

Definition 6. An algorithm $\tilde{f} : X \to Y$ is accurate if there exists a finite $M \in ℝ_{+}$ such that

$ε = \frac{|| \tilde{f} (x) - f (x) ||}{|| f (x) ||} ⩽ M ϵ_{mach}$

The above condition is also denoted as $ε = 𝒪 (ϵ_{mach})$, with $ϵ_{mach}$ denoting machine epsilon.
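A small experiment illustrates Definitions 5 and 6 for the problem of squaring a real. In this sketch the algorithm $\tilde{f}$ is assumed to be a single multiplication in `Float32` arithmetic, the inputs are taken to be `Float32` numbers so that they carry no representation error, and the reference value is computed in `Float64`, where the product of two `Float32` numbers is exact. The names `f`, `ftilde`, and `relerr` are illustrative.

```julia
# Problem f(x) = x^2; the algorithm ftilde evaluates it in Float32 arithmetic.
# Inputs are Float32 numbers, so the input carries no representation error.
f(x::Float32) = Float64(x)^2      # exact: a product of two 24-bit significands fits in Float64
ftilde(x::Float32) = x * x        # one rounded Float32 multiplication

x = 1.1f0
relerr = abs(Float64(ftilde(x)) - f(x)) / abs(f(x))
println(relerr)                   # relative error ε at this x
println(relerr / eps(Float32))    # constant M needed at this x
```

With round-to-nearest arithmetic the printed ratio is at most 1/2, so at this input the bound of Definition 6 holds with $M = 1 / 2$ and $ϵ_{mach}$ taken as `eps(Float32)`.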

2.2.Stability

Algorithms should not catastrophically increase input errors. This is quantified in the concept of stability.

Definition 7. An algorithm $\tilde{f} : X \to Y$ is forward stable if

$|| \tilde{x} - x || / || x || = 𝒪 (ϵ_{mach}) \Rightarrow || \tilde{f} (x) - f (\tilde{x}) || / || f (\tilde{x}) || = 𝒪 (ϵ_{mach})$

The above states that the relative error in the output should be on the order of machine epsilon if the relative error in the input is of order machine epsilon. Note that the constants $M, N$ in the order statements are usually different from one another, $|| \tilde{x} - x || / || x || ⩽ M ϵ_{mach}$ , $|| \tilde{f} (x) - f (\tilde{x}) || / || f (\tilde{x}) || ⩽ N ϵ_{mach} .$

Definition 8. An algorithm $\tilde{f} : X \to Y$ is backward stable if for each $x \in X$ there exists some $\tilde{x}$ such that $\tilde{f} (x) = f (\tilde{x})$ and

$|| \tilde{x} - x || / || x || = 𝒪 (ϵ_{mach}) .$

Backward stability asserts that the result of the algorithm on exact input data is the same as the solution to the mathematical problem for nearby data (with distance on order of machine epsilon).
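The distinction between a well-conditioned problem and an unstable algorithm can be seen for $f (x) = e^{x} - 1$ at small $| x |$, where the relative condition number is close to 1. The sketch below compares two algorithms for this problem; the names `naive`, `stable`, `reference`, and `relerr` are illustrative, and a `BigFloat` evaluation is assumed to be an adequate stand-in for the exact $f (x)$.

```julia
# Problem: f(x) = exp(x) - 1, well conditioned for small |x|.
naive(x) = exp(x) - 1                     # subtracts two nearly equal numbers
stable(x) = expm1(x)                      # Base routine designed for small |x|
reference(x) = Float64(exp(big(x)) - 1)   # BigFloat evaluation used as the "exact" f(x)

x = 1e-10
relerr(y) = abs(y - reference(x)) / abs(reference(x))
println(relerr(naive(x)) / eps())    # many multiples of machine epsilon
println(relerr(stable(x)) / eps())   # of order one
```

Since the problem is well conditioned here, the large relative error of `naive` is attributable to the algorithm rather than to the problem.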

Summary.

Mathematical problems are stated as functions from a set of inputs $X$ to a set of outputs $Y$, $f : X \to Y$.
The difficulty of a mathematical problem is assessed by measuring the effect of changes in input on the output.
To quantify changes in inputs and outputs, the framework of a normed vector space is introduced.
The largest ratio of the norm of the output change to the norm of the input change, in the limit of small input changes, is the absolute condition number of a problem
$\hat{κ} = \lim_{ε \to 0} \sup_{|| δ x || ⩽ ε} \frac{|| f (x + δ x) - f (x) ||}{|| δ x ||}$
Algorithms are constructive approximations of mathematical problems $\tilde{f} : X \to Y$. The accuracy of an algorithm is assessed by comparison of the algorithm output to that of the mathematical problem through absolute error $e$ and relative error $ε$
$e = || \tilde{f} (x) - f (x) ||, ε = \frac{|| \tilde{f} (x) - f (x) ||}{|| f (x) ||}$
The tendency of an algorithm to amplify perturbations of input is assessed by the concept of stability.
Algorithms that do not amplify relative changes in input of the size of machine precision are forward stable.
Algorithms that compute the exact result of a mathematical problem for changes in input of the size of machine precision are backward stable.