MATH661 Homework 1 - Number approximation

Posted: 08/24/22

Due: 09/01/22, 11:55PM

Track 1: 1,2,3,6. Track 2: 1-6.

Julia preamble

  1. Construct a convergence plot in logarithmic coordinates for the following continued fraction approximation of e

    e=2+11+12+23+3 (1)

    Identify the terms in the general expression of a continued fraction

    Fn=b0+Kk=1nakbk.

    Compare with the additive approximation from a McLaurin series

    ex=1+x1!+x22!+

    Estimate the rate and order of convergence for both approximations.

    Solution. Rewrite (1) as

    e=2+1f,f=1+12+23+3,

    and introduce the continued fraction sequence {Fn}n,

    Fn=b0+Kk=1nakbk=b0+a1b1+,
    F0=b0,F1=b0+a1b1,F2=b0+a1b1+,F3=b0+a1b1+

    with assumed limit limn=f. Writing out the first few terms

    F0=1,F1=1+12,F2=1+12+23,F3=1+12+23+34,,

    leads to identification of coefficients as

    bk=k+1,ak=k.

    Define a function f to compute Fn and return En=2+1/Fne.

    Define a function g to compute

    Mn=1+11!+12!++1n!.

    The convergence behavior of the two approximations En,Mn is shown in Fig. 1.

    Figure 1. Convergence of continued fraction and additive approximations of e.

    The definition of order p and rate r of convergence

    limn|xn+1-x||xn-x|p=r,

    is based upon the assumption of power-law decrease of the error en=|xn-x|,

    en+1enpen+1=renp.

    In log-coordinates this assumption leads to a straight line representation

    logen+1=plogen+logr.

    The (logn,logen) representation in Fig. 1 does not allow direct identification of an order of convergence. However a plot of (logen,logen+1) is easily constructed (Fig. 2), and shows p1, lnr-2r=e-20.135.

    Figure 2. Order-of-convergence plot for approximations of e. Visual estimation indicates p=1, linear sequence convergence.

  2. Apply convergence acceleration to both the above approximations of e. Construct the convergence plot of the accelerated sequences, and estimate the new rate and order of convergence.

    Solution. Since both sequences exhibit linear convergence the Aitken formula

    an=xn-(xn-xn-1)2xn-2xn-1+xn-2

    is applicable. The resulting accelerated convergence plot is shown in Fig. 3. Convergence acceleration to second order is observed for a small range of errors, lne[-7,-4]. For smaller errors, the floating point system cannot separate the small differences appearing in the Aitken correction.

    Figure 3. Convergence of Aitken acceleration of e approximation sequences

  3. Completely state the mathematical problem of taking the nth root of a positive real, n. Find the absolute and relative condition numbers.

    Solution. First, assume n>0 fixed leading to the problem f:+, f(x)=x1/n. The absolute condition number is

    κ^=limε0sup|δx|ε||f(x+δx)-f(x)||||δx||.

    The condition number furnishes a bound for the change in the solution upon a change in the input

    ||f(x+δx)-f(x)||κ^||δx||.

    Using the absolute value norm ||x||=|x| for x+ gives

    κ^=limε0sup|δx|ε|f(x+δx)-f(x)||δx|=|df(x)dx|=1nx1n-1=1n1x(n-1)/n,forx>0.

    Consider some simple cases:

    In general, the conditioning of the nth root operation

    κ^=1n1x(n-1)/n,forx>0

    indicates ill-conditioning for x0, well conditioning for x1, incorrect model for x.

    Consider now what happens when n is also allowed to vary. The definition of the condition number cannot be directly applied since the limit process is not defined for n. One can however consider the problem g:+×+

    g(x,y)=x1/y,

    and notice that h:+×, h(x,n)=x1/n is a restriction of g. The condition number of h can be inferred from that of g

    κ^=limε0sup||δz||ε|g(x+δx,y+δy)-g(x,y)|||δz||,z=[ δx δy ]T.

    Use the -norm for δz+2, ||δz||=max(δx,δy). When approaching zero perturbation, ||δz||0 above the first bisector, the inequality

    δx<δyε

    implies

    κ^=|gy|.

    Conversely, for ||δz||0 below the first bisector

    κ^=|gx|.

    In general, for some arbitrary path to approach zero,

    κ^=max(|gx|,|gy|)=max(1y1x(y-1)/y,x1/ylnxy2).

    When restricted to y=n,

    κ^=max(1n1x(n-1)/n,x1/nlnxn2).

    Figure 4. Conditioning of nth-root x1/n with perturbations allowed in n.

  4. Completely state the mathematical problem of solving the initial value problem for an ordinary differential equation of first order. Find the absolute and relative condition numbers.

    Solution. The IVP

    y'=f(y),y(0)=y0

    is formulated as the mathematical problem

    F:C0,1()×C(),

    that when evaluated for some slope function f and initial condition y0 gives the integral curve y:[0,a), a>0. F(f,y0)=y. In the above C() is the space of continuous functions and C0,1 is the space of Lipschitz continuous functions.

    As in the nth-root problem there are two inputs to F, and it is useful to start with the case in which the slope function f is fixed and only y0 varies. The Lyapunov exponent L is defined as

    δy(t)eLtδy0,

    to characterize this case and the condition number is simply

    κ^(t)=eLt,

    i.e., the condition number and the Lyapunov exponent express the same concept.

    The condition number for the case of variation of the slope function requires a concept of taking a derivative with respect to a function f, a generalization of the calculus concept of taking a derivative with respect to a variable. This is known as a functional derivative and can be defined in the Fréchet sense for normed spaces and in the Gateaux sense for Banach spaces.

  5. Completely state the mathematical problem of finding the roots of a cubic polynomial. Find the absolute and relative condition numbers.

    Solution. Consider the cubic x3+a2x2+a1x+a0=0 with roots x1,x2,x3 related to the polynomial coefficients by the Vieta relations

    x1+x2+x3=-a2,x1x2+x1x3+x2x3=a1,x1x2x3=-a0.

    The mathematical problem of finding the roots of the cubic is f:33

    f(𝒂)=𝒙,𝒂=[ a0 a1 a2 ],𝒙=[ x1 x2 x3 ]

    Consider the effect of small changes δ𝒂 upon the roots by taking differentials

    δx1+δx2+δx3 = -δa2 (x2+x3)δx1+(x3+x1)δx2+(x1+x2)δx3 = δa1 x2x3δx1+x3x1δx2+x1x2δx3 = -δa0

    This is a linear system for δ𝒙 with matrix

    𝑩=[ 1 1 1 x2+x3 x3+x1 x1+x2 x2x3 x3x1 x1x2 ].

    The condition number for f is the maximal amplification by the matrix 𝑩 or

    κ^=||𝑩||.

    Consider some specific cases:

  6. Numerically compare the approximation of b(t)=t(π-t)(2π-t) by linear combination of 𝒯={sint,sin2t,sin3t,} with that of linear combinations of ={1,et,e2t,e3t,}. Present a study of the aproximation error as the number of terms in the linear combination increases. Estimate the order of convergence in both cases.

    Solution. The approximation is stated as

    b(t)b^n(t)=k=1nckak(t)

    with the basis functions chosen either as ak(t)=sin(kt) or ak(t)=e(k-1)t. The approximation error e(t)=b(t)-b^n(t) can be measured in various norms, e.g., the 2-norm

    ε2=||e(t)||22=02π(b(t)-b^n(t))2dt2πmi=1m(b(ti)-b^n(ti))2.

    Based upon the code from Fig.1 in L04, define a function that returns the approximation error for given basis set ak(t), number of terms n, and evaluation points m.

    function err(m,n,a,dbg=false)
      h=2.0*π/m; j=1:m; t=h*j;
      A = a.(1,t)
      for k=2:n
        A = [A a.(k,t)]
      end
      if dbg
        return A
      end
      bt=t.*(π.-t).*(2*π.-t)
      x=A\bt; b=A*x;
      return norm(b-bt)*(2*pi)/m
    end

    err

    Define the two basis sets of interest

    s(k,t) = sin(k*t); e(k,t)=exp((k-1)*t);

    Verify the err function constructs the expected matrix. Note that the basis function is passed as an argument.

    The numerical values within the matrix 𝑨 are hard to interpret for large m, hence plot the columns in Fig. 5. The basis fubnctino plots already indicate that the exponential family is likely to lead to bad approximations since with respect to e(n-1)t, all ekt for k<n-1 are negligibly small, hence 𝑨 is likely to have only one independent column vector.

    Figure 5. Left: Sine basis functions. Right: Exponential basis functions.

    Test the err function for larger m values (more samples).

    The convergence behavior is shown in Fig. 6. As expected as the number of terms in the linear combination increases the sine approximation converges, while that for the exponential basis has a constant error (numerically, the rank of the matrix remains 1).

    Figure 6. Convergence of sine, exp basis approximation of b(t) with increasing number of terms.