Lecture 2: Approximation techniques

1. Rate and order of convergence

The objective of scientific computation is to solve some problem $f(x)=0$ by constructing a sequence of approximations $\{x_n\}_{n\in\mathbb{N}}$. The condition suggested by mathematical analysis would be $x = \lim_{n\to\infty} x_n$, with $f(x)=0$. As already seen in the Leibniz series approximation of $\pi$, acceptable accuracy might only be obtained for large $n$. Since $f$ could be an arbitrarily complex mathematical object, such slowly converging approximating sequences are of little practical interest. Scientific computing seeks approximations of the solution with rapidly decreasing error. This change of viewpoint with respect to analysis is embodied in the concepts of rate and order of convergence.

Definition 1. $\{x_n\}_{n\in\mathbb{N}}$ converges to $x$ with rate $r \in (0,1)$ and order $p$ if

$$\lim_{n\to\infty} \frac{|x_{n+1}-x|}{|x_n-x|^p} = r. \qquad (1)$$
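
For example, $x_n = 2^{-n}$ converges to $x=0$ with order $p=1$ and rate $r=1/2$, while $x_n = 2^{1-2^n}$ converges to $0$ with order $p=2$ and rate $r=1/2$, since $|x_{n+1}|/|x_n|^2 = 2^{1-2^{n+1}}/2^{2-2^{n+1}} = 1/2$; in the latter case the number of accurate digits roughly doubles from one term to the next.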

As previously discussed, the above definition is of limited utility since:

  1. The solution $x$ is unknown;

  2. The limit $n \to \infty$ is impractical to attain.

Sequences converge faster for higher order $p$ or lower rate $r$. A more useful approach is to determine sufficiently accurate estimates of the rate and order of convergence over some range of iterations. Rewriting (1) as

$$\lim_{n\to\infty} \left( |x_{n+1}-x| - r\,|x_n-x|^p \right) = 0,$$

suggests introducing the distance between successive iterates $d_n = |x_n - x_{n-1}|$, and considering the condition

$$|d_{n+1} - s\, d_n^q| \ \text{small for large } n.$$

Definition 2. $\{x_n\}_{n\in\mathbb{N}}$ approximates $x$ with rate $s$ and order $q$ if there exist $s$, $q$ and $n_1$, $n_2$ such that

$$|d_{n+1} - s\, d_n^q| < \epsilon, \quad \text{for } n_1 \leq n \leq n_2, \qquad (2)$$

with $d_n = |x_n - x_{n-1}|$, $n \in \mathbb{N}$, and $\epsilon$ denotes the machine epsilon.

As an example, consider the derivative $g = f'$ of $f(x) = e^x - 1$ at $x_0 = 0$, as given by the calculus definition

$$g(x_0) = f'(x_0) = \lim_{h\to 0} \frac{f(x_0+h) - f(x_0)}{h},$$

and construct a sequence of approximations

$$g_n = \frac{f_n - f(0)}{h_n}, \quad f_n = f(h_n), \quad h_n = 2^{-n}.$$

Start with a numerical experiment, and compute the sequence $d_n = |g_n - g_{n-1}|$.

$n$:  $1$, $2$, $\ldots$, $N$
$h_n = 2^{-n}$:  $1/2$, $1/4$, $\ldots$, $1/2^N$
$f_n = f(h_n)$:  $f_1$, $f_2$, $\ldots$, $f_N$
$g_n = (f_n - f(0))/h_n$:  $g_1 = (f_1 - f(0))/h_1$, $g_2 = (f_2 - f(0))/h_2$, $\ldots$, $g_N = (f_N - f(0))/h_N$
$d_{n-1} = |g_n - g_{n-1}|$:  $-$, $d_1 = |g_2 - g_1|$, $\ldots$, $d_{N-1} = |g_N - g_{N-1}|$

Table 1. Table presentation of calculations to construct the derivative approximation sequence for $f(x) = e^x - 1$ at $x_0 = 0$.

N=24; n=1:N; h=2.0.^(-n);              # step sizes h_n = 2^(-n)
f(x) = exp(x)-1; x0=0; f0=f(x0);
g = (f.(h).-f0) ./ h;                  # forward-difference approximations g_n
d = abs.(g[2:N]-g[1:N-1]);             # distances between successive approximations
n1=2; n2=8; [h[n1:n2] g[n1:n2] d[n1:n2]]

[ 0.25        1.1361016667509656  0.070914042216355
  0.125       1.0651876245346106  0.033276281848861444
  0.0625      1.0319113426857491  0.0161223027144608
  0.03125     1.0157890399712883  0.007935690423401809
  0.015625    1.0078533495478865  0.0039369071225365815
  0.0078125   1.00391644242535    0.001960771808398931
  0.00390625  1.001955670616951   0.0009784720235188615 ]   (3)

Investigation of the numerical results indicates increasing accuracy in the estimate of $g(x) = (e^x - 1)' = e^x$, $g(0) = 1$, with decreasing step size $h$. The distance between successive approximation sequence terms $d_n = |g_n - g_{n-1}|$ also decreases. It is more intuitive to analyze convergence behavior through a plot rather than a numerical table.

using PyPlot;                          # provides clf, plot, xlabel, ylabel, savefig used below
clf(); plot(h[2:N],d,"-o"); xlabel("h"); ylabel("d");
cd(homedir()*"//courses//MATH661//images"); savefig("L02Fig01a.eps");

The intent of the rate and order of approximation definitions is to state that the distance between successive terms behaves as

$$d_{n+1} \approx s\, d_n^q,$$

in the hope that this is a Cauchy sequence, and successively closer terms actually indicate convergence. The convergence parameters $(s,q)$ can be isolated by taking logarithms, $c_n = \log d_n$, leading to the linear dependence

$$c_{n+1} \approx q\, c_n + \log s.$$

Subtraction of successive terms gives $c_n - c_{n-1} \approx q\,(c_{n-1} - c_{n-2})$, leading to the average slope estimate

$$q \approx \frac{1}{N-3} \sum_{n=3}^{N-1} \frac{c_n - c_{n-1}}{c_{n-1} - c_{n-2}}.$$
c=log.(2,d); lh=log.(2,h[2:N]);        # base-2 logarithms of d and h
clf(); plot(lh,c,"-o");
plot([-10,-20],[-10,-20],"k");         # guide line of slope 1 (black)
plot([-10,-20],[-10,-30],"g");         # guide line of slope 2 (green)
xlabel("log(h)"); ylabel("log(d)"); savefig("L02Fig01b.eps");
num=c[3:N-1]-c[2:N-2]; den=c[2:N-2]-c[1:N-3];
q = sum(num ./ den)/(N-3)              # average slope = order estimate

0.9920966582673338

The above computations indicate $q \approx 1$, known as linear convergence. Figure 1b shows the common practice of depicting guide lines of slope 1 (black) and slope 2 (green) to visually ascertain the order of convergence. Once the order of approximation $q$ is determined, the rate of approximation is estimated from

$$\log s \approx \frac{1}{N-2} \sum_{n=2}^{N-1} \left( c_n - q\, c_{n-1} \right).$$
s=exp(sum(c[2:N-1]-q*c[1:N-2])/(N-2))

0.3252477724180383

The above results suggest successive approximants become more closely spaced according to

$$d_{n+1} \approx s\, d_n^q, \quad q \approx 0.99, \; s \approx 0.33.$$

Figure 1. (a, left) Convergence plot; (b, right) convergence plot in logarithmic coordinates.

Repeat the above experiment at $x_0 = \ln 2$, where $g(\ln 2) = 2$, and using a different approximation of the derivative

$$g_n = \frac{f(\ln 2 + h_n) - f(\ln 2 - h_n)}{2 h_n}.$$

For this experiment, in addition to the rate and order of approximation $(s,q)$, also determine the rate and order of convergence $(r,p)$ using

$$b_n = |g_n - g(\ln 2)|, \quad b_{n+1} \approx r\, b_n^p, \quad a_n = \log b_n, \quad a_{n+1} \approx p\, a_n + \log r.$$
N=32; n=1:N; h=2.0.^(-n); f(x) = exp(x)-1; x0=log(2); f0=f(x0); g0=exp(x0);
g = (f.(x0 .+ h).-f.(x0 .- h)) ./ (2*h);   # centered-difference approximations
d=abs.(g[2:N]-g[1:N-1]);                   # distances between successive terms
c=log.(2,d); lh=log.(2,h[2:N]); b=abs.(g[2:N].-g0); a=log.(2,b);
clf(); plot(lh,c,"-o"); plot(lh,a,"-x");
plot([-10,-20],[-10,-20],"k"); plot([-10,-20],[-10,-30],"g");
xlabel("log(h)"); ylabel("c, a"); grid("on");
savefig("L02Fig02.eps");

Figure 2. Typical convergence behavior for approximants of a derivative. Blue line shows first-order or linear convergence of the approximation $f'(x_0) \approx (f(x_0+h)-f(x_0))/h$ for $f(x) = e^x - 1$ at $x_0 = 0$. The convergence curve is monotone, with decreasing error for all sample points due to the fortuitous $f(x_0) = 0$. Green and orange lines indicate that the orders of convergence and approximation are quadratic for $f'(x_0) \approx (f(x_0+h)-f(x_0-h))/(2h)$ for $f(x) = e^x - 1$ at $x_0 = \ln 2$. Now $f(x_0) \neq 0$, and small differences in the numerator are no longer resolved by the floating point system, leading to an increase in the error for $\log_2 h < -20$. The numerical experiment indicates that order of approximation can be used interchangeably with order of convergence, i.e., closer spacing of successive approximations is often an indication of convergence.

2. Convergence acceleration

Given some approximation sequence $\{x_n\}_{n\in\mathbb{N}}$, $x_n \to x$, with $x$ the solution of the problem $f(x) = 0$, it is of interest to construct a more rapidly convergent sequence $\{y_n\}_{n\in\mathbb{N}}$, $y_n \to x$. Knowledge of the order of convergence $p$ can be used to achieve this purpose by writing

$$x_n - x \approx r\,(x_{n-1} - x)^p, \quad x_{n-1} - x \approx r\,(x_{n-2} - x)^p, \qquad (4)$$

and taking the ratio to obtain

$$\frac{x_n - x}{x_{n-1} - x} = \left( \frac{x_{n-1} - x}{x_{n-2} - x} \right)^p. \qquad (5)$$

For $p \in \mathbb{N}$, the above is a polynomial equation of degree $p$ that can be solved to obtain $x$. The heuristic approximation (4) suggests a new approximation of the exact limit $x$ obtained by solving (5).

2.1. Aitken acceleration

One of the widely used acceleration techniques was published by Aitken (1926, though it had been in use since medieval times) for $p = 1$, in which case (5) gives

$$x_n x_{n-2} - (x_n + x_{n-2})\, x = x_{n-1}^2 - 2 x_{n-1} x \;\Rightarrow\; x = \frac{x_n x_{n-2} - x_{n-1}^2}{x_n - 2 x_{n-1} + x_{n-2}}.$$

The above suggests that starting from $\{x_n\}_{n\in\mathbb{N}}$, the sequence $\{a_n\}_{n\in\mathbb{N}}$ with

$$a_n = \frac{x_n x_{n-2} - x_{n-1}^2}{x_n - 2 x_{n-1} + x_{n-2}} = x_n - \frac{(x_n - x_{n-1})^2}{x_n - 2 x_{n-1} + x_{n-2}},$$

might converge faster towards the limit. Investigate by revisiting the numerical experiment on approximation of the derivative $g = f'$ of $f(x) = e^x - 1$ at $x_0 = 0$, using

$$g_n = \frac{f_n - f(0)}{h_n}, \quad f_n = f(h_n), \quad h_n = 2^{-n}.$$
N=24; n=1:N; h=2.0.^(-n); f(x) = exp(x)-1; x0=0; f0=f(x0);
g = (f.(h).-f0) ./ h; a = copy(g);     # original approximation sequence g_n
a[3:N] = g[3:N] - (g[3:N]-g[2:N-1]).^2 ./ (g[3:N]-2*g[2:N-1]+g[1:N-2]);  # Aitken transformation
lh=log.(2,h); d=log.(2,abs.(g.-1)); b=log.(2,abs.(a.-1));   # errors with respect to g(0)=1
clf(); plot(lh,d,"-o"); plot(lh,b,"-x"); plot([-10,-20],[-10,-20],"k"); plot([-10,-20],[-10,-30],"g"); xlabel("log(h)"); ylabel("g, a"); grid("on"); savefig("L02Fig03.eps");

Figure 3. Aitken acceleration of linearly convergent sequence (blue dots) yields a close-to-quadratic convergent sequence (orange x).

Analysis reinforces the above numerical experiment. First-order convergence implies the distance to the limit decreases during iteration as $x_n - x \approx r\,(x_{n-1} - x)$.

3. Approximation correction types

Several approaches may be used in construction of an approximating sequence $\{x_n\}_{n\in\mathbb{N}}$. The approaches exemplified below for real $x_n$ can be generalized when $x_n$ is some other type of mathematical object.

3.1. Additive corrections

Returning to the Leibniz series

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \cdots,$$

the sequence of approximations is $\{L_n\}_{n\in\mathbb{N}}$ with general term

$$L_n = \sum_{k=0}^{n} \frac{(-1)^k}{2k+1}.$$

Note that successive terms are obtained by an additive correction

$$L_n = L_{n-1} + \frac{(-1)^n}{2n+1}, \quad L_n \to \frac{\pi}{4}.$$
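
As a minimal Julia sketch (illustrative, not part of the notes' listings), the additive correction can be applied term by term:

# Leibniz series by additive correction: L_n = L_{n-1} + (-1)^n/(2n+1), L_n -> pi/4
N=20; L=zeros(N+1); L[1]=1.0;          # L[n+1] stores L_n, starting from L_0 = 1
for n=1:N; L[n+1] = L[n] + (-1)^n/(2*n+1); end
[4*L[N+1] pi]                          # slow convergence: only a rough approximation of pi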

Another example, again giving an approximation of $\pi$, is the Srinivasa Ramanujan series

$$R_n = \frac{2\sqrt{2}}{9801} \sum_{k=0}^{n} \frac{(4k)!\,(1103 + 26390\,k)}{(k!)^4\, 396^{4k}}, \quad \lim_{n\to\infty} R_n = \frac{1}{\pi},$$

that can be used to obtain many digits of accuracy with just a few terms.
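
A small Julia sketch (again illustrative; BigInt arithmetic is used since the factorials quickly overflow Int64) confirms the rapid convergence, each added term contributing roughly eight correct digits:

# Ramanujan series for 1/pi, summed with big-integer terms
R(n) = 2*sqrt(2)/9801 * sum(factorial(big(4*k))*(1103+26390*k) /
                            (factorial(big(k))^4 * big(396)^(4*k)) for k=0:n)
[1/R(0) 1/R(1) pi]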

An example of the generalization of this approach is the Taylor series of a function. The familiar sine power series

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots,$$

is analogous, but with rationals now replaced by monomials, and the limit is now a function $\sin : \mathbb{R} \to [-1,1]$. The general term is

$$T_n(x) = \sum_{k=0}^{n} \frac{(-1)^k\, x^{2k+1}}{(2k+1)!},$$

and the same type of additive correction appears, this time for functions,

$$T_n(x) = T_{n-1}(x) + \frac{(-1)^n\, x^{2n+1}}{(2n+1)!}, \quad T_n(x) \to \sin x.$$
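
The same additive correction, now acting on function values, can be sketched in Julia (the helper name sinapprox is chosen here for illustration):

# Partial sums of the sine series via T_n(x) = T_{n-1}(x) + (-1)^n x^(2n+1)/(2n+1)!
function sinapprox(x, N)
  T = x                                # T_0(x) = x
  for n=1:N
    T += (-1)^n * x^(2*n+1) / factorial(2*n+1)
  end
  return T
end
[sinapprox(0.5, 5) sin(0.5)]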

3.2. Multiplicative corrections

Approximating sequences need not be constructed by adding a correction. Consider the approximation of π/2 given by Wallis's product (1656)

$$S_n = \left( \frac{2}{1} \cdot \frac{2}{3} \right) \left( \frac{4}{3} \cdot \frac{4}{5} \right) \left( \frac{6}{5} \cdot \frac{6}{7} \right) \cdots, \quad S_n = \prod_{k=1}^{n} \frac{4k^2}{4k^2 - 1}, \quad S_n \to \frac{\pi}{2},$$

for which

$$S_n = S_{n-1} \left( \frac{4n^2}{4n^2 - 1} \right).$$
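
A compact Julia sketch (illustrative) of the multiplicative construction and its slow approach to $\pi/2$:

# Wallis product: multiplicative correction S_n = S_{n-1} * 4n^2/(4n^2-1) -> pi/2
S(n) = prod(4*k^2/(4*k^2-1) for k=1:n)
[2*S(10) 2*S(1000) pi]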

Another famous example is the Viète formula from 1593

$$\frac{2}{\pi} = \frac{\sqrt{2}}{2} \cdot \frac{\sqrt{2 + \sqrt{2}}}{2} \cdot \frac{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}{2} \cdots, \quad V_n = \prod_{k=1}^{n} \frac{1}{2} \mathop{N}_{j=1}^{k} \sqrt[2]{2},$$

in which the correction is multiplicative with numerators given by nested radicals. Similar to the $\sum$ symbol for addition and the $\prod$ symbol for multiplication, the $N$ symbol is used here to denote nested radicals

$$\mathop{N}_{j=1}^{k} \sqrt[b_j]{a_j} = \sqrt[b_1]{a_1 + \sqrt[b_2]{a_2 + \sqrt[b_3]{a_3 + \cdots + \sqrt[b_k]{a_k}}}}.$$

In the case of the Viète formula, $a_j = 2$, $b_j = 2$ for all $j$.
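
Since all entries equal 2, the nested radicals satisfy the recursion $N_k = \sqrt{2 + N_{k-1}}$ with $N_1 = \sqrt{2}$, which yields a short Julia sketch (illustrative; the helper name viete is not from the notes):

# Viete product: multiplicative corrections with nested-radical numerators
function viete(n)
  r = sqrt(2.0); V = r/2               # k = 1 factor
  for k=2:n
    r = sqrt(2 + r)                    # deepen the nested radical
    V *= r/2                           # multiplicative correction
  end
  return V                             # V_n -> 2/pi
end
[2/viete(20) pi]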

3.3. Continued fractions

Yet another alternative is that of continued fractions, with one possible approximation of π given by

$$\pi + 3 = 6 + \cfrac{1^2}{6 + \cfrac{3^2}{6 + \cfrac{5^2}{6 + \cdots}}} \qquad (6)$$

A notation is introduced for continued fractions using the K symbol

$$F_n = b_0 + \mathop{K}_{k=1}^{n} \frac{a_k}{b_k} = b_0 + \cfrac{a_1}{b_1 + \cfrac{a_2}{b_2 + \cdots + \cfrac{a_n}{b_n}}}.$$

Using this notation, the sequences arising in the continued fraction representation of $\pi$ are $\{a_n\}_{n\in\mathbb{N}}$, $\{b_n\}_{n\in\mathbb{N}}$ chosen as $a_k = (2k-1)^2$ for $k \in \mathbb{N}_+$, and $b_k = 6$ for $k \in \mathbb{N}$.

$$\pi + 3 = \lim_{n\to\infty} \left( 6 + \mathop{K}_{k=1}^{n} \frac{(2k-1)^2}{6} \right).$$
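
The truncations $F_n$ can be evaluated by a backward recurrence starting from the innermost denominator; a short Julia sketch (illustrative; the helper name cfpi is chosen here):

# Evaluate F_n = 6 + K_{k=1}^n (2k-1)^2/6 from the innermost term outward
function cfpi(n)
  F = 6.0                              # b_n = 6
  for k=n:-1:1
    F = 6 + (2*k-1)^2 / F              # F <- b_{k-1} + a_k/F
  end
  return F                             # F = F_n, and F_n - 3 -> pi
end
[cfpi(5)-3 cfpi(50)-3 pi]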

3.4. Composite corrections

The above correction techniques used arithmetic operations. The repeated radical coefficients in the Viète formula suggest consideration of repeated composition of arbitrary functions $t_0, t_1, \ldots, t_n$ to construct the approximant


$$T_n = t_0 \circ t_1 \circ \cdots \circ t_n = \mathop{\bigcirc}_{k=0}^{n} t_k.$$

This is now a general framework in which all of the preceding correction approaches can be expressed. For example, the continued fraction formula (6) is recovered through the functions

$$t_0(z) = 6 + z, \quad t_1(z) = \frac{1}{6 + z}, \quad \ldots, \quad t_k(z) = \frac{(2k-1)^2}{6 + z},$$

and evaluation of the composite function at $z = 0$,

$$F_n = T_n(0).$$
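
A Julia sketch of this composition (illustrative; the helpers t and T are chosen here for the example), recovering the continued fraction truncations of (6):

# t_0(z) = 6 + z, t_k(z) = (2k-1)^2/(6+z); T_n = t_0 ∘ t_1 ∘ ... ∘ t_n
t(k) = k == 0 ? (z -> 6 + z) : (z -> (2*k-1)^2/(6 + z))
function T(n)
  Tn = t(0)
  for k=1:n
    Tn = Tn ∘ t(k)                     # composite correction (∘ is Base function composition)
  end
  return Tn
end
[T(10)(0)-3 pi]                        # F_n = T_n(0), F_n - 3 -> pi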

This general framework is of significant current interest since such composition of nonlinear functions is the basis of deep neural network approximations.

Summary.