MATH661 Homework 1 - Number approximation

Posted: 08/24/22

Due: 09/01/22, 11:55PM

Track 1: 1,2,3,6. Track 2: 1-6.

$\circ$ Julia preamble

Construct a convergence plot in logarithmic coordinates for the following continued fraction approximation of $e$

e = 2 + \frac{1}{1 + \frac{1}{2 + \frac{2}{3 + \frac{3}{⋱}}}}

(1)

Identify the terms in the general expression of a continued fraction

F_{n} = b_{0} + \overset{n}{\underset{k = 1}{K}} \frac{a_{k}}{b_{k}} .

Compare with the additive approximation from a Maclaurin series

e^{x} = 1 + \frac{x}{1!} + \frac{x^{2}}{2!} + \dots

Estimate the rate and order of convergence for both approximations.

Solution. Rewrite (1) as

e = 2 + \frac{1}{f}, f = 1 + \frac{1}{2 + \frac{2}{3 + \frac{3}{⋱}}},

and introduce the continued fraction sequence ${F_{n}}_{n \in ℕ}$ ,

F_{n} = b_{0} + \overset{n}{\underset{k = 1}{K}} \frac{a_{k}}{b_{k}} = b_{0} + \frac{a_{1}}{b_{1} + \frac{a_{2}}{b_{2} + \frac{a_{3}}{⋱ + \frac{a_{n}}{b_{n}}}}},

F_{0} = b_{0}, F_{1} = b_{0} + \frac{a_{1}}{b_{1}}, F_{2} = b_{0} + \frac{a_{1}}{b_{1} + \frac{a_{2}}{b_{2}}}, F_{3} = b_{0} + \frac{a_{1}}{b_{1} + \frac{a_{2}}{b_{2} + \frac{a_{3}}{b_{3}}}}

with assumed limit ${lim}_{n \to \infty} F_{n} = f$ . Writing out the first few terms

F_{0} = 1, F_{1} = 1 + \frac{1}{2}, F_{2} = 1 + \frac{1}{2 + \frac{2}{3}}, F_{3} = 1 + \frac{1}{2 + \frac{2}{3 + \frac{3}{4}}}, \dots,

leads to identification of coefficients as

b_{k} = k + 1, a_{k} = k .
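The identification of coefficients can be checked with a generic evaluator for $F_{n}$ . The following is only a minimal sketch: the helper name `contfrac` and the convention of passing the coefficients as functions `a(k)`, `b(k)` are assumptions made for illustration, not part of the course code.

```julia
# Evaluate F_n = b(0) + K_{k=1}^{n} a(k)/b(k) by backward recurrence.
# `contfrac` is a hypothetical helper name; a and b are coefficient functions.
function contfrac(a, b, n)
    F = float(b(n))              # innermost denominator b_n
    for k in n:-1:1
        F = b(k-1) + a(k) / F    # wrap one more level: b_{k-1} + a_k / (...)
    end
    return F
end

# Coefficients identified above: a_k = k, b_k = k + 1
F5 = contfrac(k -> k, k -> k + 1, 5)
E5 = 2 + 1 / F5                  # approximation of e obtained from (1)
println(E5)
```

Starting from the innermost term and working outward avoids recomputing the whole fraction for each level; a different choice of `a`, `b` evaluates any other continued fraction of the same general form.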

$\circ$ Define a function $f$ to compute $F_{n}$ and return $E_{n} = 2 + 1 / F_{n} \to e$ .

∴	function f(n,dbg=false)
	  b=n+1; F=1.0*b; E=2+1/F
	  for k=n:-1:1
	    b=k; a=k; F = b + a/F; E=2+1/F
	    if dbg
	      @printf("k=%d b_k=%d a_k=%d F_k=%f E_k=%f\n",k,b,a,F,E)
	    end
	  end
	  return E
	end;

∴

Test the function

∴

f(5,true)

k=5 b_k=5 a_k=5 F_k=5.833333 E_k=2.171429
k=4 b_k=4 a_k=4 F_k=4.685714 E_k=2.213415
k=3 b_k=3 a_k=3 F_k=3.640244 E_k=2.274707
k=2 b_k=2 a_k=2 F_k=2.549414 E_k=2.392247
k=1 b_k=1 a_k=1 F_k=1.392247 E_k=2.718263

$2.718263331760264$

∴

exp(1.0)

$2.718281828459045$

Comparison with the builtin $e^{1}$ shows 5-digit accuracy for $n = 5$ .

∴

$\circ$ Define a function $g$ to compute

M_{n} = 1 + \frac{1}{1!} + \frac{1}{2!} + \dots + \frac{1}{n!} .

∴	function g(n)
	  M=1.0; fact=1.0
	  for k=1:n
	    fact=k*fact
	    M=M+1/fact
	  end
	  return M
	end

g

∴

Test the function

∴

g.(1:5)

$[\begin{array}{c} 2.0 \\ 2.5 \\ 2.6666666666666665 \\ 2.708333333333333 \\ 2.7166666666666663 \end{array}]$ (2)

Note that only 3 accurate digits are obtained for $n = 5$ .

∴

The convergence behavior of the two approximations $E_{n}, M_{n}$ is shown in Fig. 1.

$\circ$

Figure 1. Convergence of continued fraction and additive approximations of $e$ .

The definition of order $p$ and rate $r$ of convergence

{lim}_{n \to \infty} \frac{| x_{n + 1} - x |}{{| x_{n} - x |}^{p}} = r,

is based upon the assumption of power-law decrease of the error $e_{n} = | x_{n} - x |$ ,

e_{n + 1} \sim e_{n}^{p} \Leftrightarrow e_{n + 1} = r e_{n}^{p} .

In log-coordinates this assumption leads to a straight line representation

\log e_{n + 1} = p \log e_{n} + \log r .

The $(\log n, \log e_{n})$ representation in Fig. 1 does not allow direct identification of an order of convergence. However a plot of $(\log e_{n}, \log e_{n + 1})$ is easily constructed (Fig. 2), and shows $p ≅ 1$ , $\ln r ≅ - 2 \Rightarrow r = e^{- 2} ≅ 0.135$ .

$\circ$

Figure 2. Order-of-convergence plot for approximations of $e$ . Visual estimation indicates $p = 1$ , linear sequence convergence.
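The visual estimate of $p$ and $r$ can be complemented by a least-squares fit of the straight line $\log e_{n + 1} = p \log e_{n} + \log r$ . The sketch below is illustrative only: `fit_order` is a hypothetical helper name, and the check uses a synthetic error sequence with known $p$ and $r$ rather than the sequences $E_{n}, M_{n}$ .

```julia
using LinearAlgebra

# Least-squares fit of  log e_{n+1} = p*log e_n + log r  to a sequence of errors.
# `fit_order` is a hypothetical helper; err must contain positive error values.
function fit_order(err)
    X = log.(err[1:end-1])
    Y = log.(err[2:end])
    A = [X ones(length(X))]      # columns: log e_n and the constant 1
    p, lnr = A \ Y               # least-squares solution of A*[p, ln r] ≈ Y
    return p, exp(lnr)
end

# Synthetic check: e_n = 0.5^n satisfies e_{n+1} = 0.5*e_n, i.e. p = 1, r = 0.5
p, r = fit_order([0.5^n for n in 1:10])
println("p ≈ ", p, ", r ≈ ", r)
```

Applied to `abs.(f.(n) .- e)` or `abs.(g.(n) .- e)`, the fit should be restricted to the range of $n$ in which the errors are still well above machine precision.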

∴	N=15; n=1:N; e=exp(1.0); x=log.(n);

∴	errE=log.(abs.(f.(n) .- e)); errM=log.(abs.(g.(n) .- e));

∴

clf(); plot(errE[1:N-1],errE[2:N],"-o",errM[1:N-1],errM[2:N],"-x"); xlabel("log(e_n)"); ylabel("log(e_(n+1))"); title("e Sequence convergence"); grid("on"); plot([-25,-20],[-25,-20],"b-"); plot([-25,-20],[-25,-15],"g-"); legend(["errE","errM","1st","2nd"]);

∴	savefig(homedir()*"/courses/MATH661/images/H01Fig02.eps")

The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.

∴

exp(-2.0)

$0.1353352832366127$

∴

Apply convergence acceleration to both the above approximations of $e$ . Construct the convergence plot of the accelerated sequences, and estimate the new rate and order of convergence.

Solution. Since both sequences exhibit linear convergence, the Aitken formula

a_{n} = x_{n} - \frac{{(x_{n} - x_{n - 1})}^{2}}{x_{n} - 2 x_{n - 1} + x_{n - 2}}

is applicable. The resulting accelerated convergence plot is shown in Fig. 3. Convergence acceleration to second order is observed for a small range of errors, $\ln e \in [- 7, - 4]$ . For smaller errors, the floating point system cannot separate the small differences appearing in the Aitken correction.

$\circ$

Figure 3. Convergence of Aitken acceleration of $e$ approximation sequences
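A quick sanity check of the Aitken formula is to apply it to a sequence whose error decreases exactly geometrically, for which the extrapolation recovers the limit from three consecutive terms. The helper name `aitken3` and the test sequence $x_{n} = 1 + 2^{- n}$ are assumptions chosen for illustration.

```julia
# Aitken Δ² extrapolation from three consecutive terms x0, x1, x2
# (`aitken3` is a hypothetical helper name)
aitken3(x0, x1, x2) = x2 - (x2 - x1)^2 / (x2 - 2x1 + x0)

x = [1 + 0.5^n for n in 1:3]     # 1.5, 1.25, 1.125, with limit 1
a = aitken3(x[1], x[2], x[3])
println(a)
```

For sequences that are only approximately geometric, such as $E_{n}$ and $M_{n}$ , the denominator is a difference of nearly equal numbers, which is the source of the floating point limitation noted above.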

∴	function Aitken(x)
	  a=copy(x); N=length(x)
	  for n=3:N
	    Δx = x[n] - x[n-1]
	    Δ2x = x[n] - 2*x[n-1] + x[n-2]
	    a[n] = x[n] - Δx^2/Δ2x
	  end
	  return a
	end;

∴	N=6; n=1:N; E=f.(n); M=g.(n); aE=Aitken(E); aM=Aitken(M);

∴	erraE=log.(abs.(aE .- e)); erraM=log.(abs.(aM .- e));

∴

clf(); plot(errE[1:N-1],errE[2:N],"-ob"); plot(errM[1:N-1],errM[2:N],"-xb"); plot(erraE[1:N-1],erraE[2:N],"-og"); plot(erraM[1:N-1],erraM[2:N],"-xg"); xlabel("log(e_n)"); ylabel("log(e_(n+1))"); title("e Sequence convergence"); grid("on"); plot([-10,-5],[-10,-5],"b-"); plot([-10,-5],[-10,0],"g-"); legend(["errE","errM","erraE","erraM","1st","2nd"]);

∴	savefig(homedir()*"/courses/MATH661/images/H01Fig03.eps")

∴

Completely state the mathematical problem of taking the $n^{th}$ root of a positive real, $n \in ℕ$ . Find the absolute and relative condition numbers.

Solution. First, assume $n > 0$ fixed leading to the problem $f : ℝ_{+} \to ℝ$ , $f (x) = x^{1 / n}$ . The absolute condition number is

\hat{κ} = {lim}_{ε \to 0} {sup}_{| δ x | ⩽ ε} \frac{|| f (x + δ x) - f (x) ||}{|| δ x ||} .

The condition number furnishes a bound for the change in the solution upon a change in the input

|| f (x + δ x) - f (x) || ⩽ \hat{κ} || δ x || .

Using the absolute value norm $|| x || = | x |$ for $x \in ℝ_{+}$ gives

\hat{κ} = {lim}_{ε \to 0} {sup}_{| δ x | ⩽ ε} \frac{| f (x + δ x) - f (x) |}{| δ x |} = | \frac{d f (x)}{d x} | = \frac{1}{n} x^{\frac{1}{n} - 1} = \frac{1}{n} \frac{1}{x^{(n - 1) / n}}, for x > 0 .
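The analytical result can be compared against difference quotients for a general $n$ . This is a minimal sketch under stated assumptions: the names `nthroot` and `κabs`, the sample point $x = 2$ and the choice $n = 3$ are all illustrative.

```julia
nthroot(x, n) = x^(1/n)                 # f(x) = x^(1/n)
κabs(x, n) = x^(1/n - 1) / n            # analytical absolute condition number

x, n = 2.0, 3                           # arbitrary sample point and root index
for δx in (1e-2, 1e-4, 1e-6)
    fd = abs(nthroot(x + δx, n) - nthroot(x, n)) / δx   # difference quotient
    println("δx = ", δx, "  quotient = ", fd, "  κabs = ", κabs(x, n))
end
```

The difference quotients should approach the analytical value as $δ x$ decreases, until round-off in the subtraction becomes comparable to $δ x$ .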

Consider some simple cases:

$n = 1$ , $f (x) = x$ , $\hat{κ} = 1$ , hence perturbations in the input are not amplified

f (x) = x, f (x + δ x) = x + δ x, f (x + δ x) - f (x) = δ x .

$\circ$ The problem is well-conditioned.

∴	f(x)=x; δf(x,δx)=f(x+δx)-f(x);

∴	δx=0.1; δf.(0:0.2:1,δx)/δx

$[\begin{array}{c} 1.0 \\ 1.0000000000000002 \\ 0.9999999999999998 \\ 0.9999999999999998 \\ 0.9999999999999998 \\ 1.0000000000000009 \end{array}]$ (3)

$n = 2$ , $f (x) = \sqrt{x}$ ,

\hat{κ} = \frac{1}{2 \sqrt{x}} .

$\circ$ at $x = 1 / 4$ , $\hat{κ} = 1$ , indicating no amplification of input perturbation

∴	f(x)=sqrt(x); δf(x,δx)=f(x+δx)-f(x);

∴	δx=10.0 .^(-3:-1); δf.(0.25,δx)./δx

$[\begin{array}{c} 1.0000000000000009 \\ 1.0000000000000009 \\ 0.9999999999999998 \end{array}]$ (4)

∴

$\circ$ as $x \to \infty$ , $\hat{κ} \to 0$ , indicating input perturbations have negligible effect upon output. This indicates incorrect identification of the variables in a problem.

∴	f(x)=sqrt(x); δf(x,δx)=f(x+δx)-f(x); X=floatmax(Float64);

∴	δx=10.0 .^(-3:-1); δf.(X,δx)./δx

$[\begin{array}{c} 0.0 \\ 0.0 \\ 0.0 \end{array}]$ (5)

∴	δx=10.0 .^(-3:-1); δf.(X/10.0,δx)./δx

$[\begin{array}{c} 0.0 \\ 0.0 \\ 0.0 \end{array}]$ (6)

∴

$\circ$ as $x \to 0_{+}$ , $\hat{κ} \to \infty$ , indicating small input perturbations have arbitrarily large effects upon output. This indicates an ill-posed problem

∴	f(x)=sqrt(x); δf(x,δx)=f(x+δx)-f(x); x=floatmin(Float64);

∴	δx=10.0 .^(-3:-1); δf.(x,δx)./δx

$[\begin{array}{c} 31.62277660168379 \\ 10.0 \\ 3.162277660168379 \end{array}]$ (7)

∴	δx=10.0 .^(-15:-12); δf.(x,δx)./δx

$[\begin{array}{c} 3.1622776601683788 e 7 \\ 1.0 e 7 \\ 3.162277660168379 e 6 \\ 1.0 e 6 \end{array}]$ (8)

∴	δx=10.0 .^(-21:-18); δf.(x,δx)./δx

$[\begin{array}{c} 3.1622776601683792 e 10 \\ 1.0 e 10 \\ 3.16227766016838 e 9 \\ 1.0 e 9 \end{array}]$ (9)

∴

In general, the conditioning of the $n^{th}$ root operation

\hat{κ} = \frac{1}{n} \frac{1}{x^{(n - 1) / n}}, for x > 0

indicates ill-conditioning for $x \to 0$ , well conditioning for $x \sim 1$ , incorrect model for $x \to \infty$ .
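The problem statement also asks for the relative condition number. Assuming the usual definition $κ = \hat{κ} | x | / | f (x) |$ (an assumption here, since that definition is not written out in this text), a short derivation for fixed $n$ is:

```latex
κ = \hat{κ} \frac{| x |}{| f (x) |} = \frac{1}{n} x^{\frac{1}{n} - 1} \frac{x}{x^{1 / n}} = \frac{1}{n}, for x > 0 .
```

Under that definition the relative conditioning does not depend on $x$ : a relative input perturbation is reduced by the factor $1 / n$ .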

Consider now what happens when $n$ is also allowed to vary. The definition of the condition number cannot be directly applied since the limit process is not defined for $n \in ℕ$ . One can however consider the problem $g : ℝ_{+} \times ℝ_{+} \to ℝ$

g (x, y) = x^{1 / y},

and notice that $h : ℝ_{+} \times ℕ \to ℝ$ , $h (x, n) = x^{1 / n}$ is a restriction of $g$ . The condition number of $h$ can be inferred from that of $g$

\hat{κ} = {lim}_{ε \to 0} {sup}_{|| δ z || ⩽ ε} \frac{| g (x + δ x, y + δ y) - g (x, y) |}{|| δ z ||}, z = {[\begin{array}{ll} δ x & δ y \end{array}]}^{T} .

Use the $\infty$ -norm for $δ z \in ℝ_{+}^{2}$ , ${|| δ z ||}_{\infty} = max (δ x, δ y)$ . When approaching zero perturbation, $|| δ z || \to 0$ above the first bisector, the inequality

δ x < δ y ⩽ ε

implies

\hat{κ} = | \frac{\partial g}{\partial y} | .

Conversely, for $|| δ z || \to 0$ below the first bisector

\hat{κ} = | \frac{\partial g}{\partial x} | .

In general, for some arbitrary path to approach zero,

\hat{κ} = max (| \frac{\partial g}{\partial x} |, | \frac{\partial g}{\partial y} |) = max (\frac{1}{y} \frac{1}{x^{(y - 1) / y}}, \frac{x^{1 / y} | \ln x |}{y^{2}}) .

When restricted to $y = n \in ℕ$ ,

\hat{κ} = max (\frac{1}{n} \frac{1}{x^{(n - 1) / n}}, \frac{x^{1 / n} | \ln x |}{n^{2}}) .

$\circ$

Figure 4. Conditioning of $n^{th}$ -root $x^{1 / n}$ with perturbations allowed in $n$ .
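The two partial derivatives entering the condition number of $g (x, y) = x^{1 / y}$ can be verified by central differences. The sketch below assumes an arbitrary sample point with $x > 1$ ; the names `dgdx`, `dgdy` and the step `h` are illustrative.

```julia
g(x, y) = x^(1 / y)
dgdx(x, y) = x^(1 / y - 1) / y              # ∂g/∂x
dgdy(x, y) = -x^(1 / y) * log(x) / y^2      # ∂g/∂y

x, y, h = 3.0, 2.0, 1e-5                    # sample point with x > 1, step size
fdx = (g(x + h, y) - g(x - h, y)) / (2h)    # central differences
fdy = (g(x, y + h) - g(x, y - h)) / (2h)
println(abs(fdx - dgdx(x, y)), "  ", abs(fdy - dgdy(x, y)))
println("max(|∂g/∂x|, |∂g/∂y|) = ", max(abs(dgdx(x, y)), abs(dgdy(x, y))))
```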

∴	κ(x,n)=max(1/n/x^((n-1)/n),x^(1/n)*abs(log(x))/n^2); clf();

∴	x=0.001:0.2:11; κ2=log.(κ.(x,2)); κ3=log.(κ.(x,3));

∴	κ4=log.(κ.(x,4)); plot(x,κ2,x,κ3,x,κ4); grid("on");

∴	xlabel("x"); ylabel("ln κ"); title("Condition of nth root");

∴	legend(["2","3","4"]);

∴	savefig(homedir()*"/courses/MATH661/images/H01Fig04.eps")

∴

Completely state the mathematical problem of solving the initial value problem for an ordinary differential equation of first order. Find the absolute and relative condition numbers.

Solution. The IVP
$y^{'} = f (y), y (0) = y_{0}$
is formulated as the mathematical problem
$F : C^{0, 1} (ℝ) \times ℝ \to C (ℝ),$
that when evaluated for some slope function $f$ and initial condition $y_{0}$ gives the integral curve $y : [0, a) \to ℝ$ , $a > 0$ , i.e., $F (f, y_{0}) = y$ . In the above $C (ℝ)$ is the space of continuous functions and $C^{0, 1}$ is the space of Lipschitz continuous functions.

As in the $n^{th}$ -root problem there are two inputs to $F$ , and it is useful to start with the case in which the slope function $f$ is fixed and only $y_{0}$ varies. The Lyapunov exponent $L$ is defined as
$δ y (t) ≅ e^{L t} δ y_{0},$
to characterize this case and the condition number is simply
$\hat{κ} (t) = e^{L t},$
i.e., the condition number and the Lyapunov exponent express the same concept.
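The relation $δ y (t) ≅ e^{L t} δ y_{0}$ can be illustrated on the linear test problem $y^{'} = L y$ , for which the exponent is known exactly. This is a sketch under stated assumptions: the forward Euler integrator, the names `euler` and `slope`, and the parameter values are all chosen for illustration.

```julia
# Forward Euler for y' = f(y), y(0) = y0, advanced to time T in N steps.
# A deliberately simple integrator; `euler` and `slope` are hypothetical names.
function euler(f, y0, T, N)
    h = T / N
    y = y0
    for _ in 1:N
        y += h * f(y)
    end
    return y
end

L = -0.5                          # assumed exponent for the test problem
slope(y) = L * y                  # linear slope function, y(t) = y0*exp(L*t)
y0, δy0, T, N = 1.0, 1e-3, 2.0, 100_000

# Ratio of output perturbation to input perturbation at time T
amplification = (euler(slope, y0 + δy0, T, N) - euler(slope, y0, T, N)) / δy0
println("δy(T)/δy0 = ", amplification, "   exp(L*T) = ", exp(L * T))
```

For $L < 0$ perturbations of the initial condition decay, while for $L > 0$ the same experiment shows exponential growth of the condition number with $t$ .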

The condition number for the case of variation of the slope function requires a concept of taking a derivative with respect to a function $f$ , a generalization of the calculus concept of taking a derivative with respect to a variable. This is known as a functional derivative and can be defined in the Fréchet sense for normed spaces and in the Gâteaux sense for Banach spaces.

Completely state the mathematical problem of finding the roots of a cubic polynomial. Find the absolute and relative condition numbers.

Solution. Consider the cubic $x^{3} + a_{2} x^{2} + a_{1} x + a_{0} = 0$ with roots $x_{1}, x_{2}, x_{3}$ related to the polynomial coefficients by the Vieta relations
$x_{1} + x_{2} + x_{3} = - a_{2}, x_{1} x_{2} + x_{1} x_{3} + x_{2} x_{3} = a_{1}, x_{1} x_{2} x_{3} = - a_{0} .$
The mathematical problem of finding the roots of the cubic is $f : ℝ^{3} \to ℂ^{3}$
$f (𝒂) = 𝒙, 𝒂 = [\begin{array}{l} a_{0} \\ a_{1} \\ a_{2} \end{array}], 𝒙 = [\begin{array}{l} x_{1} \\ x_{2} \\ x_{3} \end{array}]$
Consider the effect of small changes $δ 𝒂$ upon the roots by taking differentials
$\begin{array}{rcl} δ x_{1} + δ x_{2} + δ x_{3} & = & - δ a_{2} \\ (x_{2} + x_{3}) δ x_{1} + (x_{3} + x_{1}) δ x_{2} + (x_{1} + x_{2}) δ x_{3} & = & δ a_{1} \\ x_{2} x_{3} δ x_{1} + x_{3} x_{1} δ x_{2} + x_{1} x_{2} δ x_{3} & = & - δ a_{0} \end{array}$
This is a linear system for $δ 𝒙$ with matrix
$𝑩 = [\begin{array}{lll} 1 & 1 & 1 \\ x_{2} + x_{3} & x_{3} + x_{1} & x_{1} + x_{2} \\ x_{2} x_{3} & x_{3} x_{1} & x_{1} x_{2} \end{array}] .$
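Solving for the root perturbations gives $δ 𝒙 = 𝑩^{- 1} {[\begin{array}{lll} - δ a_{2} & δ a_{1} & - δ a_{0} \end{array}]}^{T}$ , so the condition number for $f$ is the maximal amplification by the matrix $𝑩^{- 1}$ or
$\hat{κ} = || 𝑩^{- 1} || .$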
Consider some specific cases:
- For $x_{1} = x_{2} = x_{3} = ξ$ ,
  $𝑩 = [\begin{array}{lll} 𝒃 & 𝒃 & 𝒃 \end{array}], 𝒃 = [\begin{array}{l} 1 \\ 2 ξ \\ ξ^{2} \end{array}]$
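
The differential system relating $δ 𝒙$ and $δ 𝒂$ can be checked numerically by perturbing the roots, recomputing the coefficients from the Vieta relations, and comparing with $𝑩 δ 𝒙$ . The sketch below is illustrative only: the names `vieta` and `Bmat`, the sample roots, and the use of the $\infty$ -norm are assumptions. It also prints the norm of $𝑩^{- 1}$ , which measures how strongly coefficient perturbations are amplified into root perturbations, as the roots approach one another.

```julia
using LinearAlgebra

# Coefficients (a0, a1, a2) of the monic cubic with roots x, from the Vieta relations
vieta(x) = [-x[1]*x[2]*x[3],
            x[1]*x[2] + x[1]*x[3] + x[2]*x[3],
            -(x[1] + x[2] + x[3])]

# Matrix B of the differential system  B*δx = [-δa2, δa1, -δa0]
Bmat(x) = [1.0 1.0 1.0;
           x[2]+x[3] x[3]+x[1] x[1]+x[2];
           x[2]*x[3] x[3]*x[1] x[1]*x[2]]

x  = [1.0, 2.0, 3.0]                     # sample distinct roots
δx = 1e-6 * [1.0, -2.0, 0.5]             # small root perturbation
δa = vieta(x + δx) - vieta(x)            # resulting (δa0, δa1, δa2)
rhs = [-δa[3], δa[2], -δa[1]]
resid = norm(Bmat(x) * δx - rhs)         # remainder is second order in δx
println("residual of differential system: ", resid)

# Norm of the inverse as three roots coalesce at 1
for s in (1.0, 1e-1, 1e-2)
    xs = [1.0 - s, 1.0, 1.0 + s]
    println("separation ", s, ":  norm of inv(B) = ", opnorm(inv(Bmat(xs)), Inf))
end
```

In the limit of a triple root the columns of $𝑩$ coincide, so the matrix is singular and the inverse no longer exists.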

Numerically compare the approximation of $b (t) = t (π - t) (2 π - t)$ by linear combination of $𝒯 = {\sin t, \sin 2 t, \sin 3 t, \dots}$ with that of linear combinations of $ℰ = {1, e^{t}, e^{2 t}, e^{3 t}, \dots}$ . Present a study of the approximation error as the number of terms in the linear combination increases. Estimate the order of convergence in both cases.

Solution. The approximation is stated as

b (t) ≅ {\hat{b}}_{n} (t) = \sum_{k = 1}^{n} c_{k} a_{k} (t)

with the basis functions chosen either as $a_{k} (t) = \sin (k t)$ or $a_{k} (t) = e^{(k - 1) t}$ . The approximation error $e (t) = b (t) - {\hat{b}}_{n} (t)$ can be measured in various norms, e.g., the 2-norm

ε^{2} = {|| e (t) ||}_{2}^{2} = \int_{0}^{2 π} {(b (t) - {\hat{b}}_{n} (t))}^{2} d t ≅ \frac{2 π}{m} \sum_{i = 1}^{m} {(b (t_{i}) - {\hat{b}}_{n} (t_{i}))}^{2} .

$•$ Based upon the code from Fig.1 in L04, define a function that returns the approximation error for given basis set $a_{k} (t)$ , number of terms $n$ , and evaluation points $m$ .

∴	function err(m,n,a,dbg=false)
	  h=2.0π/m; j=1:m; t=h*j; A = a.(1,t)
	  for k=2:n
	    A = [A a.(k,t)]
	  end
	  if dbg
	    return A
	  end
	  bt=t.*(π.-t).*(2π.-t)
	  x=A\bt; b=A*x;
	  return norm(b-bt)*(2pi)/m
	end

err

∴

$•$ Define the two basis sets of interest

∴	s(k,t) = sin(k*t); e(k,t)=exp((k-1)*t);

∴

$\circ$ Verify the err function constructs the expected matrix. Note that the basis function is passed as an argument.

∴	err(8,1,s,true)

$[\begin{array}{c} 0.7071067811865475 \\ 1.0 \\ 0.7071067811865476 \\ 1.2246467991473532 e - 16 \\ - 0.7071067811865475 \\ - 1.0 \\ - 0.7071067811865477 \\ - 2.4492935982947064 e - 16 \end{array}]$ (10)

∴	err(8,1,e,true)

$[\begin{array}{c} 1.0 \\ 1.0 \\ 1.0 \\ 1.0 \\ 1.0 \\ 1.0 \\ 1.0 \\ 1.0 \end{array}]$ (11)

∴	err(8,2,s,true)

$[\begin{array}{cc} 0.7071067811865475 & 1.0 \\ 1.0 & 1.2246467991473532 e - 16 \\ 0.7071067811865476 & - 1.0 \\ 1.2246467991473532 e - 16 & - 2.4492935982947064 e - 16 \\ - 0.7071067811865475 & 1.0 \\ - 1.0 & 3.6739403974420594 e - 16 \\ - 0.7071067811865477 & - 1.0 \\ - 2.4492935982947064 e - 16 & - 4.898587196589413 e - 16 \end{array}]$ (12)

∴	err(8,2,e,true)

$[\begin{array}{cc} 1.0 & 2.1932800507380152 \\ 1.0 & 4.810477380965351 \\ 1.0 & 10.550724074197761 \\ 1.0 & 23.140692632779267 \\ 1.0 & 50.75401951173493 \\ 1.0 & 111.31777848985621 \\ 1.0 & 244.15106285427498 \\ 1.0 & 535.4916555247646 \end{array}]$ (13)

∴

The numerical values within the matrix $𝑨$ are hard to interpret for large $m$ , hence the columns are plotted in Fig. 5. The basis function plots already indicate that the exponential family is likely to lead to bad approximations since, relative to $e^{(n - 1) t}$ , all $e^{k t}$ for $k < n - 1$ are negligibly small, hence $𝑨$ is likely to have only one numerically independent column vector.

$\circ$

Figure 5. Left: Sine basis functions. Right: Exponential basis functions.
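One way to quantify the near linear dependence suggested by the plots is the 2-norm condition number of the sampled basis matrix. The following is a sketch assuming the same sampling $t_{j} = 2 π j / m$ as in `err`; the names `basis_matrix`, `sine_basis` and `exp_basis` are hypothetical and chosen to avoid clashing with the session definitions.

```julia
using LinearAlgebra

# Sampled basis matrix with columns a(k, t_j), t_j = 2πj/m, j = 1..m
basis_matrix(a, m, n) = [a(k, 2π * j / m) for j in 1:m, k in 1:n]

sine_basis(k, t) = sin(k * t)
exp_basis(k, t)  = exp((k - 1) * t)

m, n = 256, 5
As = basis_matrix(sine_basis, m, n)
Ae = basis_matrix(exp_basis, m, n)
println("cond(As) = ", cond(As))   # sine columns are mutually orthogonal on this grid
println("cond(Ae) = ", cond(Ae))   # exponential columns differ by orders of magnitude
```

A condition number close to one indicates columns that are well separated in direction and scale, whereas a very large value signals that the least-squares problem for the coefficients $c_{k}$ is poorly determined.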

∴	m=256; n=5; As = err(m,n,s,true); Ae = err(m,n,e,true);

∴	clf(); h=2.0π/m; j=1:m; t=h*j;

∴	for k=1:n plot(t,As[:,k]) end

∴	grid("on"); xlabel("t"); ylabel("sin(k*t)");

∴	title("Sine basis functions");

∴	cd(homedir()*"/courses/MATH661/images");

∴	savefig("H01Fig05a.eps");

∴	clf(); grid("on"); xlabel("t"); ylabel("exp(k*t)");

∴	title("Exp basis functions");

∴	for k=1:n plot(t,Ae[:,k]) end

∴	savefig("H01Fig05b.eps");

∴

$\circ$ Test the err function for larger $m$ values (more samples).

∴	err(1000,10,s)

$0.0020986874338193287$

∴	err(1000,10,e)

$1.4507633675453773$

∴	err(4,1,s,true)

$[\begin{array}{c} 0.0 \\ 1.0 \\ 1.2246467991473532 e - 16 \\ - 1.0 \end{array}]$ (14)

∴

The convergence behavior is shown in Fig. 6. As expected, as the number of terms in the linear combination increases, the sine approximation converges, while that for the exponential basis has a constant error (numerically, the rank of the matrix remains 1).

$\circ$

Figure 6. Convergence of sine, exp basis approximation of $b (t)$ with increasing number of terms.

∴	m=1000; n=10:10:100; xn = log.(1.0*n);

∴	errs = err.(m,n,s); lgerrs = log.(errs);

∴	erre = err.(m,n,e); lgerre = log.(erre);

∴	clf(); plot(xn,lgerrs,"ok",xn,lgerre,"rx");

∴	xlabel("log(n)"); ylabel("log(err)"); grid("on")

∴	title("Convergence behavior for sine, exp basis");

∴	legend(["sine","exp"]);

∴	cd(homedir()*"/courses/MATH661/images");

∴	savefig("H01Fig06.eps");

∴
