Lecture 3: Problems and Algorithms

1.Mathematical problems

1.1.Formalism for defining a mathematical problem

In general, mathematical problems can be thought of as mappings from some set of inputs X to some set of outputs Y. The mapping is often carried out through a function f, i.e., a procedure that associates a single yY to some input xX

f:XY,y=f(x),xfy

Examples:

Compute the square of a real:

X=,Y=,y=f(x)=x2.

f(x)=x^2

f

f(2)

4

Find xsolution of ax+b=c for given a,b,c, a0. The inputs to this problem are a,b,c and the output is the solution (c-b)/a

X=\{0}××,Y=,f(a,b,c)=(c-b)/a.

function f(a,b,c)
  if (a!=0)
    return (c-b)/a
  end
end

f

f(1,2,3)

1.0

Compute the innner product of two vectors 𝒖,𝒗n:

X=n×n,Y=,y=f(𝒖,𝒗)=i=1nuivi

with ui,vi the components of 𝒖,𝒗. Note that the input set is the Cartesian product of sets of vectors and the output set is the reals. Such functions defined from sets of vectors (more accurately vector spaces) to reals (more accurately scalars) are called functionals.

function f(u,v)
  sum(u.*v)
end

f

f([1 2 3],[1 2 3])

14

Compute the definite integral

(u,v)=abu(x)v(x)dx,

with u,v arbitrary continuous functions, denoted by u,vC(0)([a,b]):

X=C(0)([a,b])×C(0)([a,b]),Y=.

Again, this an example of a functional.

The functional is now defined for function arguments, and evaluation can be carried out either symbolically or numerically. Symbolic evaluation in the public domain Maxima system for u,v𝒯={1,sinx,cosx,sin2x,cos2x,} over the interval [-π,π] demonstrates orthogonality of the trigonometric functions 𝒯.

(%i1) 
scprod(f,g,x,a,b):=integrate(f*g,x,a,b)

(%o1)  scprod(f,g,x,a,b)integrate(fg,x,a,b)

(%i5) 
scprod(sin(x),sin(x),x,-%pi,%pi)

(%o5)  π

(%i30) 
s:makelist(sin(k*x),k,1,3)

(%o30)  [sin(x),sin(2x),sin(3x)]

(%i31) 
c:makelist(cos(k*x),k,0,3)

(%o31)  [1,cos(x),cos(2x),cos(3x)]

(%i32) 
sc:join(c,s)

(%o32)  [1,sin(x),cos(x),sin(2x),cos(2x),sin(3x)]

(%i28) 
trigsp(f,g):=integrate(f*g,x,-%pi,%pi)

(%o28)  trigsp(f,g)integrate(fg,x,-π,π)

(%i34) 
tsp:outermap(trigsp,sc,sc)

(%o34)  [[2π,0,0,0,0,0],[0,π,0,0,0,0],[0,0,π,0,0,0],[0,0,0,π,0,0],[0,0,0,0,π,0],[0,0,0,0,0,π]]

Compute the derivative of a function gC(1)(), with C(k)() the space of functions defined on differentiable k times: X=C(1)(), Y=C(0)(), f=d/dx. Note that in this case X,Y are sets of functions, in which case f is referred to as an operator.

As above, the operator can be evaluated symbolically, and is predefined in all symbolic computation packages including Maxima

(%i43) 
diff(sin(x),x)

(%o43) cos(x)

(%i44) 
diff(sin(cos(x))+cos(sin(x)),x)

(%o44) (-cos(x)*sin(sin(x)))-sin(x)*cos(cos(x))

(%i45) 

Find the roots of a polynomial pn(x)=anxn++a1x+a0. The input is the polynomial specified by the vector of coefficients 𝒂n+1. The output is another vector 𝒙n whose components are roots, pn(xi)=0

X=n+1,Y=n.

The function f:XY cannot be written explicitly (corollary of Abel-Ruffini theorem), but there are approximations f of the root-finding function that can be implemented such ff.

Most software systems include facilities for polynomials, including root-finding.

import Pkg; Pkg.add("Polynomials");

using Polynomials
p=Polynomial([-6,11,-6,1])

-6 + 11*x - 6*x^2 + x^3

roots(p)

[ 1.0000000000000002 1.999999999999998 3.0000000000000018 ] (1)

Note that the specification of a mathematical problem requires definition of the triplet (X,Y,f).

Once a problem is specified, the natural question is to ascertain whether a solution is possible. Generally, simple affirmation of the existence of a solution is the objective of some field of mathematics (e.g., analysis, functional analysis). From the point of view of science, an essential question is not only existence but also:

  1. how does the output y=f(x) change if x changes?

  2. what are the constructive methods to approximate y?

1.2.Vector space

The above general definition of a mathematical problem must be refined in order to assess magnitude of changes in inputs or outputs. A first step is to introduce some structure in the input and output sets X,Y. Using these sets, vector spaces 𝒱=(V,S,+,) are constructed, consisting of a set of vectors V, a set of scalars S, an addition operation +, and a scaling operation . The vector space is often referred to simply by its set of vectors V, when the set of scalars, addition operation, and scaling operation are self-evident in context.

Formally,a vector space 𝒱 is defined by a set V whose elements satisfy certain scaling and addition properties, denoted all together by the 4-tuple 𝒱=(V,S,+,). The first element the 4-tuple is a set whose elements are called vectors. The second element is a set of scalars, and the third is the vector addition operation. The last is the scaling operation, seen as multiplication of a vector by a scalar. The vector addition and scaling operations must satisfy rules suggested by positions or forces in three-dimensional space, which are listed in Table 1. In particular, a vector space requires definition of two distinguished elements: the zero vector 𝟎V, and the identity scalar element 1S.

Addition rules for 𝒂,𝒃,𝒄V
𝒂+𝒃V Closure
𝒂+(𝒃+𝒄)=(𝒂+𝒃)+𝒄 Associativity
𝒂+𝒃=𝒃+𝒂 Commutativity
𝟎+𝒂=𝒂 Zero vector
𝒂+(-𝒂)=𝟎 Additive inverse
Scaling rules for 𝒂,𝒃V, x,yS
x𝒂V Closure
x(𝒂+𝒃)=x𝒂+x𝒃 Distributivity
(x+y)𝒂=x𝒂+y𝒂 Distributivity
x(y𝒂)=(xy)𝒂 Composition
1𝒂=𝒂 Scalar identity

Table 1. Vector space 𝒱=(V,S,+,) properties for arbitrary 𝒂,𝒃,𝒄V

1.3.Norm

A first step is quantification of the changes in input or output, assumed to have the structure of a vector space, 𝒳=(X,,+,),𝒴=(Y,,+,).

Definition 1. A norm on vector space 𝒳 is a function ||||:X+, that for any x,y,zX, α satisfies the properties:

  1. ||x||=0 if and only if x=0.

  2. ||ax||=|a|||x||

  3. ||x+y||||x||+||y||

1.4.Condition number

The ratio of changes in output to changes in input is the absolute condition number of a problem.

Definition 2. The problem f:XY has absolute condition number

κ^=limε0sup||δx||ε||f(x+δx)-f(x)||||δx||

To avoid influence of choice of reference unit, the relative condition number is also introduced.

Definition 3. The problem f:XY has relative condition number

κ=limε0sup||δx||ε||f(x+δx)-f(x)||||f(x)||||x||||δx||.

2.Solution algorithm

2.1.Accuracy

In scientific computation, the mathematical problem f:XY is approximated by an algorithm f:XY, in which is assumed to be computable, and X,Y are vector spaces that approximate X,Y. As a first step in characterizing how well the algorithm f approximates the problem f, consider that X=X and Y=Y, i.e., there is no error in representation of the domain and codomain.

Definition 4. The absolute error of algorithm f:XY that approximates the problem f:XY is

e=||f(x)-f(x)||.

Definition 5. The relative error of algorithm f:XY that approximates the problem f:XY is

ε=||f(x)-f(x)||||f(x)||.

Definition 6. An algorithm f:XY is accurate if there exists finite M+ such that

ε=||f(x)-f(x)||||f(x)||Mϵmach

The above condition is also denoted as ε=𝒪(ϵmach)

2.2.Stability

Algorithms should not catastrophically increase input errors. This is quantified in the concept of stability.

Definition 7. An algorithm f:XY is forward stable if

||x-x||/||x||=𝒪(ϵmach)||f(x)-f(x)||/||f(x)||=𝒪(ϵmach)

The above states that the relative error in the output should be on the order of machine epsilon if the relative in the input is of order machine epsilon. Note that the constants in the order statements M,N are usually different from one another, ||x-x||/||x||Mϵmach, ||f(x)-f(x)||/||f(x)||Nϵmach.

Definition 8. An algorithm f:XY is backward stable if from existence of some x such that f(x)=f(x), it results that

||x-x||/||x||=𝒪(ϵmach).

Backward stability asserts that the result of the algorithm on exact input data is the same as the solution to the mathematical problem for nearby data (with distance on order of machine epsilon).

Summary.