MATH661

Lecture 19: Derivative Approximation

Having introduced approximations of elements of vector spaces, a natural next question is how to approximate transformations of such objects, i.e., operator approximation. An operator is understood here as a mapping from a domain vector space $𝒰 = (U, S, +, \cdot)$ to a co-domain vector space $𝒱 = (V, S, +, \cdot)$ , and the operator $ℒ : U \to V$ is said to be linear if for any scalars $c_{1}, c_{2} \in S$ and vectors $u_{1}, u_{2} \in U$ ,

ℒ (c_{1} u_{1} + c_{2} u_{2}) = c_{1} ℒ (u_{1}) + c_{2} ℒ (u_{2}),

i.e., the image of a linear combination is the linear combination of the images. Linear algebra considers the case of finite-dimensional vector spaces, such as $U = ℝ^{m}$ , $V = ℝ^{n}$ , in which case a linear operator is represented by a matrix $𝑳 \in ℝ^{n \times m}$ , and satisfies

𝑳 (c_{1} 𝒖_{1} + c_{2} 𝒖_{2}) = c_{1} 𝑳 𝒖_{1} + c_{2} 𝑳 𝒖_{2} .

In contrast, the focus here is on infinite-dimensional function spaces such as $C^{r} (ℝ)$ (cf. Tab. 1, L18), the space of functions with continuous derivatives up to order $r$ . Common linear operator examples include:

Differentiation: $ℒ f = \partial^{k} f / \partial t^{k}$ , $ℒ : C^{r} (ℝ) \to C^{r - k} (ℝ)$ .
Riemann integration: $ℒ f = \int_{a}^{b} ω (t) f (t) d t$ , $ℒ : C (ℝ \setminus Δ) \to ℝ$ , where $Δ$ is a set of measure zero.
Linear differential equation: $ℒ y = \sum_{j = 0}^{k} a_{j} (t) y^{(j)} = f (t)$ , $ℒ : C^{r} (ℝ) \to C^{r - k} (ℝ)$ .

1. Numerical differentiation based upon polynomial interpolation

A general approach to operator approximation is to simply introduce an approximation of the function the operator acts upon, $f ≅ p$ ,

ℒ f ≅ ℒ p .

Monomial basis.

As an example consider the polynomial interpolant of $f$ based upon data $𝒟 = \{(x_{i}, y_{i} = f (x_{i})), i = 0, \dots, n\}$ ,

p (t) = [\begin{array}{lllll} 1 & t & t^{2} & \dots & t^{n} \end{array}] 𝒄,

with coefficients $𝒄$ determined as the solution of the interpolation conditions

𝑴 𝒄 = 𝒚,

with notations

𝑴 = [\begin{array}{lllll} 𝟏 & 𝒙 & 𝒙^{2} & \dots & 𝒙^{n} \end{array}], 𝒙^{k} = {[\begin{array}{lll} x_{0}^{k} & \dots & x_{n}^{k} \end{array}]}^{T}, 𝒚 = {[\begin{array}{lll} y_{0} & \dots & y_{n} \end{array}]}^{T} .

Differentiation of $f$ ( $ℒ = d / d t$ ) can be approximated as

\frac{d}{d t} f ≅ \frac{d}{d t} p = [\begin{array}{lllll} 0 & 1 & 2 t & \dots & n t^{n - 1} \end{array}] 𝒄 .

It is often of interest to express the result of applying an operator directly in terms of known information on $f$ . Formally, in the case of differentiation,

\frac{d}{d t} f ≅ [\begin{array}{lllll} 0 & 1 & 2 t & \dots & n t^{n - 1} \end{array}] 𝑴^{- 1} 𝒚,

allowing the identification of a differentiation approximation operator $𝒟$

\frac{d}{d t} f ≅ 𝒟 (𝒚), 𝒟 = [\begin{array}{lllll} 0 & 1 & 2 t & \dots & n t^{n - 1} \end{array}] 𝑴^{- 1} .

This formulation explicitly includes the inversion of the sampled basis matrix $𝑴$ , and is hence not computationally efficient. Alternative formulations can be constructed that carry out some of the steps in computing $𝑴^{- 1}$ analytically.
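The row vector $[\begin{array}{lllll} 0 & 1 & 2 t & \dots & n t^{n - 1} \end{array}] 𝑴^{- 1}$ can be computed directly for a small example. The following is a minimal sketch in Python, using exact rational arithmetic and solving the transposed system $𝑴^{T} 𝒘^{T} = 𝒅^{T}$ rather than forming $𝑴^{- 1}$; the helper names `solve` and `monomial_diff_weights` are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    # Augmented matrix [A | b] with Fraction entries
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        # Normalize the pivot row, then eliminate the column in all other rows
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def monomial_diff_weights(x, t):
    """Weights w with p'(t) = sum_i w[i] * y[i], i.e. the row [0 1 2t ...] M^{-1}."""
    n = len(x) - 1
    # d = [0, 1, 2t, ..., n t^(n-1)]: derivative of the monomial basis at t
    d = [Fraction(0)] + [k * Fraction(t) ** (k - 1) for k in range(1, n + 1)]
    # w M = d is equivalent to M^T w^T = d^T, where (M^T)[k][i] = x_i^k
    MT = [[Fraction(xi) ** k for xi in x] for k in range(n + 1)]
    return solve(MT, d)

# Three nodes with unit spacing, derivative evaluated at the first node
weights = monomial_diff_weights([0, 1, 2], 0)
```

For the nodes $0, 1, 2$ and $t = 0$ the weights are $- 3 / 2, 2, - 1 / 2$ , the same coefficients that appear in the second-order forward difference formula derived later in this section.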

Newton basis (finite difference calculus).

An especially useful formulation for numerical differentiation arises from the Newton interpolant of data $𝒟 = \{(x_{i} = i h, y_{i} = f (x_{i})), i = 0, \dots, n\}$ sampled from $f : ℝ \to ℝ$ , $f \in C^{(n + 1)} (ℝ)$ ,

f (t) ≅ p (t) = [y_{0}] + [y_{1}, y_{0}] (t - x_{0}) + \dots + [y_{n}, y_{n - 1}, \dots, y_{0}] (t - x_{0}) \cdot (t - x_{1}) \cdot \dots \cdot (t - x_{n - 1}) .

For equidistant sample points $x_{i} = i h$ , the Newton interpolant can be expressed as an operator acting upon the data. Introduce the translation operator

E f (t) = f (t + h) .

Repeated application of the translation operator leads to

E^{k} f (t) = E (E^{k - 1} f (t)) = \dots = f (t + k h),

and the identity operator is given by

I f (t) = f (t) = E^{0} f (t) \Rightarrow I = E^{0} .

Finite differences of the function values are expressed through the forward, backward and central operators

Δ = E - I, \nabla = I - E^{- 1}, δ = E^{1 / 2} - E^{- 1 / 2},

leading to the formulas

Δ f (t) = f (t + h) - f (t), \nabla f (t) = f (t) - f (t - h), δ f (t) = f (t + h / 2) - f (t - h / 2) .
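Since these operators map functions to functions, they can be modeled as higher-order functions. The sketch below is one possible Python rendering (the names `E`, `forward`, `backward`, `central` are chosen here for illustration); it checks that composing the forward operator with itself reproduces the expanded second difference, and that $Δ = \nabla E$ .

```python
import math

def E(f, h, k=1):
    """Translation operator: (E^k f)(t) = f(t + k h); k may be negative or fractional."""
    return lambda t: f(t + k * h)

def forward(f, h):
    """Forward difference: (Delta f)(t) = f(t + h) - f(t)."""
    return lambda t: f(t + h) - f(t)

def backward(f, h):
    """Backward difference: (nabla f)(t) = f(t) - f(t - h)."""
    return lambda t: f(t) - f(t - h)

def central(f, h):
    """Central difference: (delta f)(t) = f(t + h/2) - f(t - h/2)."""
    return lambda t: f(t + h / 2) - f(t - h / 2)

h, t = 0.1, 0.3
f = math.sin

# Operators compose: Delta^2 f(t) = f(t + 2h) - 2 f(t + h) + f(t)
second_forward = forward(forward(f, h), h)(t)
expanded = f(t + 2 * h) - 2 * f(t + h) + f(t)

# A backward difference of the shifted function is a forward difference
shifted_backward = backward(E(f, h), h)(t)
```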

Applying the above to the data set $𝒟$ leads to

Δ y_{i} = y_{i + 1} - y_{i}, \nabla y_{i} = y_{i} - y_{i - 1}, δ y_{i} = y_{i + 1 / 2} - y_{i - 1 / 2} .

The divided differences arising in the Newton interpolant can be expressed in terms of finite difference operators,

[y_{1}, y_{0}] = \frac{y_{1} - y_{0}}{h} = \frac{1}{h} Δ y_{0}, [y_{2}, y_{1}, y_{0}] = \frac{[y_{2}, y_{1}] - [y_{1}, y_{0}]}{2 h} = \frac{Δ y_{1} - Δ y_{0}}{2 h^{2}} = \frac{Δ^{2} y_{0}}{2 h^{2}},

or in general

[y_{k}, \dots, y_{1}, y_{0}] = \frac{Δ^{k}}{k! h^{k}} y_{0} .
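The relation between divided differences and forward differences can be checked on equidistant data. The sketch below assumes samples of the cubic $f (t) = t^{3} - 2 t$ with $h = 1 / 2$ (an arbitrary choice made here) and uses exact rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def forward_differences(y):
    """Return [Delta^0 y_0, Delta^1 y_0, ..., Delta^n y_0] from samples y_0..y_n."""
    out, row = [], list(y)
    while row:
        out.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]  # next difference level
    return out

def divided_differences(x, y):
    """Return the Newton coefficients [y_0], [y_1, y_0], ..., [y_n, ..., y_0]."""
    coef = list(y)
    n = len(x)
    for k in range(1, n):
        # Update in place from the end so lower-order entries are still available
        for i in range(n - 1, k - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (x[i] - x[i - k])
    return coef

h = Fraction(1, 2)
x = [i * h for i in range(5)]
y = [xi ** 3 - 2 * xi for xi in x]  # exact samples of f(t) = t^3 - 2 t

dd = divided_differences(x, y)
fd = forward_differences(y)
# Right-hand side of the identity: Delta^k y_0 / (k! h^k)
scaled = [fd[k] / (factorial(k) * h ** k) for k in range(len(y))]
```

Both lists agree entry by entry; for a cubic the third divided difference equals the leading coefficient and the fourth vanishes.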

Using the above and rescaling the variable $t$ in the Newton basis $𝒩 = {1, t - x_{0}, (t - x_{0}) (t - x_{1}), \dots}$ in units of the step size $t = α h + x_{0}$ leads to

p (t (α)) = P (α) = (I + α \frac{Δ}{1!} + α (α - 1) \frac{Δ^{2}}{2!} + \dots + α (α - 1) \cdot \dots \cdot (α - n + 1) \frac{Δ^{n}}{n!}) y_{0} .

(1)
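Formula (1) can be evaluated by accumulating the forward differences $Δ^{k} y_{0}$ together with the products $α (α - 1) \dots (α - k + 1) / k!$ . The following Python sketch is one way to do so; the function name `newton_forward` and the test data (a cubic sampled at $x_{0} = 1$ , $h = 1 / 2$ ) are assumptions made for illustration.

```python
from fractions import Fraction

def newton_forward(y, alpha):
    """Evaluate P(alpha) = sum_k [alpha (alpha-1) ... (alpha-k+1) / k!] Delta^k y_0."""
    total, coef, row = 0, Fraction(1), list(y)
    for k in range(len(y)):
        total += coef * row[0]                       # row[0] holds Delta^k y_0
        coef = coef * (alpha - k) / (k + 1)          # coefficient for order k + 1
        row = [b - a for a, b in zip(row, row[1:])]  # next difference level
    return total

x0, h = Fraction(1), Fraction(1, 2)
y = [(x0 + i * h) ** 3 for i in range(4)]  # exact samples of f(t) = t^3

t = Fraction(7, 4)
alpha = (t - x0) / h          # rescaled variable, t = alpha h + x_0
value = newton_forward(y, alpha)
```

Since four samples determine a cubic uniquely, `value` coincides with $t^{3}$ at $t = 7 / 4$ ; integer values of `alpha` return the samples themselves.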

The generalized binomial series states

{(1 + x)}^{α} = \sum_{k = 0}^{\infty} (\begin{array}{l} α \\ k \end{array}) x^{k},

(2)

with

(\begin{array}{l} α \\ k \end{array}) = \frac{α (α - 1) \dots (α - k + 1)}{k!}

the generalized binomial coefficient. The operator acting upon $y_{0}$ in (1) can be interpreted as the truncation at order $n$

P (α) ≅ {(I + Δ)}^{α} y_{0} = ℱ_{α} y_{0},

of the operator ${(I + Δ)}^{α}$ defined through (2) by the substitutions $1 \to I$ , $x \to Δ$ . The operator $ℱ_{α} = {(I + Δ)}^{α}$ can be interpreted as the interpolation operator with equidistant sampling points, with $P (α)$ its truncation to order $n$ . Reversing the order of the sampling points leads to the Newton interpolant

p (t) = [y_{n}] + [y_{n - 1}, y_{n}] (t - x_{n}) + \dots + [y_{0}, y_{1}, \dots, y_{n}] (t - x_{n}) (t - x_{n - 1}) \cdot \dots \cdot (t - x_{1}) .

The divided differences can be expressed in terms of the backward operator as

[y_{n - 1}, y_{n}] = \frac{y_{n} - y_{n - 1}}{h} = \frac{1}{h} \nabla y_{n}, [y_{n - 2}, y_{n - 1}, y_{n}] = \frac{[y_{n - 1}, y_{n}] - [y_{n - 2}, y_{n - 1}]}{2 h} = \frac{\nabla y_{n} - \nabla y_{n - 1}}{2 h^{2}} = \frac{\nabla^{2} y_{n}}{2 h^{2}},

leading, with the rescaling $t = x_{n} - α h$ , to an analogous expression of the interpolation operator in terms of backward finite differences

p (t (α)) = P (α) = (I - α \frac{\nabla}{1!} + α (α - 1) \frac{\nabla^{2}}{2!} - \dots + {(- 1)}^{n} α (α - 1) \cdot \dots \cdot (α - n + 1) \frac{\nabla^{n}}{n!}) y_{n} ≅ {(I - \nabla)}^{α} y_{n} = ℬ_{α} y_{n} .

Differentiation of the interpolation expressed in terms of forward finite differences gives

f^{'} (t) ≅ \frac{d}{d t} P (α) = \frac{d α}{d t} P^{'} (α) ≅ \frac{1}{h} \frac{d}{d α} ℱ_{α} y_{0} = \frac{1}{h} [\ln (I + Δ)] {(I + Δ)}^{α} y_{0} ≅ \frac{1}{h} \ln (I + Δ) P (α) .

The particular interpolant $P (α)$ is irrelevant, leading to the operator identity

\frac{d}{d t} ≅ \frac{1}{h} \ln (I + Δ) .

For $| x | < 1$ , the power series expansions

\frac{d}{d x} \ln (1 + x) = \frac{1}{1 + x} = 1 - x + x^{2} - \dots \Rightarrow \ln (1 + x) = x - \frac{x^{2}}{2} + \frac{x^{3}}{3} - \dots + {(- 1)}^{k + 1} \frac{x^{k}}{k} + \dots,

converge (uniformly on any closed subinterval $| x | \leqslant r < 1$ ), leading to the expression

\frac{d}{d t} ≅ \frac{1}{h} (Δ - \frac{1}{2} Δ^{2} + \frac{1}{3} Δ^{3} - \dots + {(- 1)}^{k + 1} \frac{1}{k} Δ^{k} + \dots),

stating that the (continuum) differentiation operator can be approximated by an infinite series of finite difference operations, recovered exactly in the $h \to 0$ limit. Denote by $D_{k}^{+}$ the truncation at term $k$ of the above operator series such that

f^{'} (x_{0}) ≅ D_{k}^{+} (f) (x_{0}) = \frac{1}{h} (Δ - \frac{1}{2} Δ^{2} + \frac{1}{3} Δ^{3} - \dots + {(- 1)}^{k + 1} \frac{1}{k} Δ^{k}) y_{0} .

Truncation at $k = 1, 2, 3$ leads to the expressions

D_{1}^{+} (f) = \frac{f (h + t) - f (t)}{h}, D_{2}^{+} (f) = \frac{4 f (h + t) - f (2 h + t) - 3 f (t)}{2 h}, D_{3}^{+} (f) = \frac{18 f (h + t) - 9 f (2 h + t) + 2 f (3 h + t) - 11 f (t)}{6 h} .
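The coefficients in these formulas follow mechanically from the truncated series by expanding $Δ^{j} = {(E - I)}^{j}$ with the binomial theorem. A minimal Python sketch of that bookkeeping is given below; `forward_weights` is a name introduced here for illustration, and the alternating series coefficients $+ 1, - 1 / 2, + 1 / 3, \dots$ are taken from the expansion of $\ln (1 + x)$ .

```python
from fractions import Fraction
from math import comb

def forward_weights(k):
    """Coefficients w_0..w_k with h * D_k^+ f(t) = sum_i w_i f(t + i h)."""
    w = [Fraction(0)] * (k + 1)
    for j in range(1, k + 1):
        # Coefficient of Delta^j in the series of ln(I + Delta)
        series_coef = Fraction((-1) ** (j + 1), j)
        for i in range(j + 1):
            # Delta^j = (E - I)^j = sum_i (-1)^(j - i) C(j, i) E^i
            w[i] += series_coef * (-1) ** (j - i) * comb(j, i)
    return w

w1, w2, w3 = forward_weights(1), forward_weights(2), forward_weights(3)
```

The lists `w1`, `w2`, `w3` hold the multipliers of $f (t), f (t + h), \dots$ after division by the common denominators $1, 2, 6$ shown above.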

The $h \to 0$ limit of divided differences is given by

{lim}_{h \to 0} [y_{k}, y_{k - 1}, \dots, y_{0}] = {lim}_{h \to 0} (\frac{1}{k! h^{k}} Δ^{k} y_{0}) = \frac{1}{k!} f^{(k)} (x_{0}),

such that for small finite $h > 0$ ,

Δ^{k} y_{0} ≅ h^{k} f^{(k)} (x_{0}) .

The resulting derivative approximation error is of order $k$ ,

e_{k}^{+} (t) = D_{k}^{+} (f) (t) - f^{'} (t) = \frac{{(- 1)}^{k + 1} h^{k}}{k + 1} f^{(k + 1)} (t) = 𝒪 (h^{k}) .
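The predicted order can be observed numerically by comparing the errors at step sizes $h$ and $h / 2$ , whose ratio should approach $2^{k}$ . The sketch below assumes the test function $f = \exp$ at $t = 1$ and $h = 10^{- 2}$ , choices made here purely for illustration.

```python
import math

def d1(f, t, h):
    """First-order forward difference D_1^+."""
    return (f(t + h) - f(t)) / h

def d2(f, t, h):
    """Second-order forward difference D_2^+."""
    return (4 * f(t + h) - f(t + 2 * h) - 3 * f(t)) / (2 * h)

def d3(f, t, h):
    """Third-order forward difference D_3^+."""
    return (18 * f(t + h) - 9 * f(t + 2 * h) + 2 * f(t + 3 * h) - 11 * f(t)) / (6 * h)

def observed_order(rule, f, df, t, h):
    """Estimate k in error ~ C h^k from the errors at step sizes h and h/2."""
    e1 = abs(rule(f, t, h) - df(t))
    e2 = abs(rule(f, t, h / 2) - df(t))
    return math.log2(e1 / e2)

# exp is its own derivative, so it serves as both f and the exact f'
orders = [observed_order(rule, math.exp, math.exp, 1.0, 1e-2) for rule in (d1, d2, d3)]
```

The entries of `orders` should lie close to 1, 2 and 3. For much smaller $h$ floating-point cancellation in the numerators eventually dominates and the estimate degrades.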

The analogous expression for backward differences is

\frac{d}{d t} ≅ - \frac{1}{h} \ln (I - \nabla) = \frac{1}{h} (\nabla + \frac{1}{2} \nabla^{2} + \frac{1}{3} \nabla^{3} + \dots + \frac{1}{k} \nabla^{k} + \dots),

and the first few truncations are

D_{1}^{-} (f) = \frac{f (t) - f (t - h)}{h}, D_{2}^{-} (f) = \frac{3 f (t) - 4 f (t - h) + f (t - 2 h)}{2 h}, D_{3}^{-} (f) = \frac{11 f (t) - 18 f (t - h) + 9 f (t - 2 h) - 2 f (t - 3 h)}{6 h}

with errors

e_{k}^{-} (t) = D_{k}^{-} (f) (t) - f^{'} (t) = - \frac{h^{k}}{k + 1} f^{(k + 1)} (t) = 𝒪 (h^{k}) .

The above operator identities can be inverted to obtain

Δ = E - I = \exp (h \frac{d}{d t}) - I, \nabla = I - E^{- 1} = I - \exp (- h \frac{d}{d t}),

leading to

E = \exp (h \frac{d}{d t}) = I + h \frac{d}{d t} + \frac{1}{2} {(h \frac{d}{d t})}^{2} + \dots + \frac{1}{k!} {(h \frac{d}{d t})}^{k} + \dots,

this time expressing the finite translation operator as an infinite series of continuum differentiation operations. This allows expressing the central difference operator as

δ = E^{1 / 2} - E^{- 1 / 2} = \exp (\frac{h}{2} \frac{d}{d t}) - \exp (- \frac{h}{2} \frac{d}{d t}) = 2 \sinh (\frac{h}{2} \frac{d}{d t}),

and approximations of the derivative based on centered differencing are obtained from

\frac{d}{d t} ≅ \frac{2}{h} arcsinh (\frac{δ}{2}) = \frac{1}{h} (δ - \frac{δ^{3}}{24} + \frac{3 δ^{5}}{640} - \frac{5 δ^{7}}{7168} + \frac{35 δ^{9}}{294912} - \dots) .

An advantage of the centered finite differences (surmised from the odd power series) is a higher order of accuracy. With $D_{k}$ denoting the truncation of the above series after $k$ terms,

e_{k} (t) = D_{k} (f) (t) - f^{'} (t) = 𝒪 (h^{2 k}) .
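The doubling of the order per retained term can be observed numerically. The sketch below builds $δ^{k}$ from half-step translations and compares the one-term and two-term truncations of the arcsinh series; the test function $\exp$ at $t = 1$ with $h = 0.1$ and all function names are assumptions made for this illustration.

```python
import math

def central_power(f, t, h, k):
    """Apply delta^k to f at t, with delta = E^{1/2} - E^{-1/2}."""
    # delta^k = sum_i (-1)^i C(k, i) E^{(k - 2 i)/2}
    return sum((-1) ** i * math.comb(k, i) * f(t + (k / 2 - i) * h)
               for i in range(k + 1))

def c1(f, t, h):
    """One-term truncation: delta / h."""
    return central_power(f, t, h, 1) / h

def c2(f, t, h):
    """Two-term truncation: (delta - delta^3 / 24) / h."""
    return (central_power(f, t, h, 1) - central_power(f, t, h, 3) / 24) / h

def observed_order(rule, f, df, t, h):
    """Estimate q in error ~ C h^q from the errors at step sizes h and h/2."""
    e1 = abs(rule(f, t, h) - df(t))
    e2 = abs(rule(f, t, h / 2) - df(t))
    return math.log2(e1 / e2)

# exp is its own derivative, so it serves as both f and the exact f'
orders = [observed_order(rule, math.exp, math.exp, 1.0, 0.1) for rule in (c1, c2)]
```

The two estimates should be close to 2 and 4. Note that the two-term rule samples $f$ at $t \pm h / 2$ and $t \pm 3 h / 2$ , i.e., on a grid of half steps.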

Higher-order derivatives are obtained by repeated application of the operator series, e.g.,

\frac{d^{2}}{d t^{2}} = \frac{d}{d t} \cdot \frac{d}{d t} = \frac{1}{h^{2}} {(Δ - \frac{1}{2} Δ^{2} + \frac{1}{3} Δ^{3} - \dots)}^{2} = \frac{1}{h^{2}} (Δ^{2} - Δ^{3} + \frac{11}{12} Δ^{4} - \dots) .
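The coefficients of the squared series are the Cauchy product of the $\ln (1 + x)$ coefficients with themselves. A short Python sketch in exact arithmetic follows; the helper names and the truncation order $n = 5$ are chosen here for illustration.

```python
from fractions import Fraction

def log_series(n):
    """Coefficients a_0..a_n of ln(1 + x) = sum_j a_j x^j, with a_0 = 0."""
    return [Fraction(0)] + [Fraction((-1) ** (j + 1), j) for j in range(1, n + 1)]

def multiply_truncated(a, b, n):
    """Cauchy product of two power series, keeping powers up to x^n."""
    c = [Fraction(0)] * (n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= n:
                c[i + j] += ai * bj
    return c

n = 5
a = log_series(n)
# Entry k is the coefficient of Delta^k in h^2 d^2/dt^2
second = multiply_truncated(a, a, n)
```

The entries of `second` for $Δ^{2}, Δ^{3}, Δ^{4}$ are $1, - 1, 11 / 12$ ; multiplying by `a` once more would give the series for the third derivative in the same way.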

2. Taylor series methods

An alternative derivation of the above finite difference formulas is to construct a linear combination of function values

L_{m}^{n} f (t) = \sum_{k = - m}^{n} c_{k} f (t + k h) = (\sum_{k = - m}^{n} c_{k} E^{k}) f (t),

and determine the coefficients $c_{k}$ such that the $p^{th}$ derivative is approximated to order $q$

f^{(p)} (t) = L_{m}^{n} f (t) + 𝒪 (h^{q}) .

For example, for $m = 0$ , $n = 1$ , carrying out Taylor series expansions gives

\begin{array}{rcl} f (t + h) & = & f (t) + h f^{'} (t) + \frac{1}{2} h^{2} f^{''} (t) + \dots \\ f (t) & = & f (t) . \end{array}

Eliminating $f (t)$ by multiplying the first equation by $c_{1} = 1 / h$ and the second by $c_{0} = - 1 / h$ and adding recovers the forward finite difference formula

f^{'} (t) = \frac{f (t + h) - f (t)}{h} + 𝒪 (h) .
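For general $m, n, p$ the same elimination amounts to a small linear system: matching the Taylor coefficient of $h^{j} f^{(j)} (t)$ for $j = 0, \dots, m + n$ requires $\sum_{k} w_{k} k^{j} / j! = δ_{j p}$ , with $c_{k} = w_{k} / h^{p}$ . The Python sketch below solves this system exactly; the function names are hypothetical and the Gauss-Jordan solver is a generic choice, not a method prescribed by the lecture.

```python
from fractions import Fraction
from math import factorial

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def taylor_weights(m, n, p):
    """Weights w_k, k = -m..n, with f^(p)(t) ~ h^(-p) sum_k w_k f(t + k h)."""
    offsets = list(range(-m, n + 1))
    size = len(offsets)
    # Row j: sum_k w_k k^j = p! if j == p else 0 (the 1/j! moved to the right side)
    A = [[Fraction(k) ** j for k in offsets] for j in range(size)]
    b = [factorial(p) if j == p else 0 for j in range(size)]
    return solve(A, b)

forward_first = taylor_weights(0, 1, 1)    # forward difference for f'
centered_first = taylor_weights(1, 1, 1)   # centered difference for f'
centered_second = taylor_weights(1, 1, 2)  # centered difference for f''
```

The case $m = 0$ , $n = 1$ , $p = 1$ returns the weights $- 1, 1$ of the forward difference above. The call requires $p \leqslant m + n$ so that the target derivative appears among the matched conditions.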

3. Numerical differentiation based upon piecewise polynomial interpolation

$B$ -spline basis.

The above example used a truncation of the monomial basis $ℳ_{n} (t) = \{1, t, \dots, t^{n}\}$ . Analogous results are obtained when using a different basis. Consider the equidistant sample points $x_{i} = i h + x_{0}$ , data $𝒟 = \{(x_{i}, y_{i} = f (x_{i})), i = 0, 1, \dots, n\}$ and the first-degree $B$ -spline basis

ℬ_{n, 1} (t) = \{B_{0, 1} (t), B_{1, 1} (t), \dots, B_{n, 1} (t)\},

in which case the linear piecewise interpolant is expressed as

p (t) = \sum_{i = 0}^{n} y_{i} B_{i, 1} (t),

and over interval $[x_{i - 1}, x_{i}]$ reduces to

p_{i} (t) = y_{i - 1} B_{i - 1, 1} (t) + y_{i} B_{i, 1} (t) = y_{i - 1} \cdot \frac{x_{i} - t}{x_{i} - x_{i - 1}} + y_{i} \cdot \frac{t - x_{i - 1}}{x_{i} - x_{i - 1}} .

Differentiation recovers the familiar slope expression

p_{i}^{'} (t) = \frac{y_{i} - y_{i - 1}}{x_{i} - x_{i - 1}} = \frac{y_{i} - y_{i - 1}}{h} .

At the nodes, a piecewise linear spline is continuous but its derivative is discontinuous, hence the derivative is not defined there, though one could consider the one-sided limits. Evaluation of derivatives at midpoints $t_{i} = (x_{i - 1} + x_{i}) / 2 = (i - 1) h + h / 2 + x_{0}$ , $i = 1, 2, \dots, n$ , leads to

𝒚^{'} = [\begin{array}{l} y_{1}^{'} \\ y_{2}^{'} \\ ⋮ \\ y_{n}^{'} \end{array}] = p^{'} (𝒕) = 𝑫 𝒚 = \frac{1}{h} [\begin{array}{lllll} - 1 & 1 & 0 & \dots & 0 \\ 0 & - 1 & 1 & \dots & 0 \\ ⋮ & & ⋱ & ⋱ & ⋮ \\ 0 & \dots & 0 & - 1 & 1 \end{array}] [\begin{array}{l} y_{0} \\ y_{1} \\ ⋮ \\ y_{n} \end{array}],

with $𝑫 \in ℝ^{n \times (n + 1)}$ .
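A small numerical illustration of the midpoint differentiation matrix, applied to the vector of sampled function values, is sketched below in Python. The test function $\sin$ on $[0, 1]$ with $n = 20$ intervals and the helper names are assumptions made here; the matrix is stored densely for clarity even though only two diagonals are nonzero.

```python
import math

def midpoint_diff_matrix(n, h):
    """n x (n + 1) matrix D with (D y)_i = (y_i - y_{i-1}) / h for i = 1..n."""
    D = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        D[i][i], D[i][i + 1] = -1.0 / h, 1.0 / h
    return D

def matvec(D, y):
    """Dense matrix-vector product."""
    return [sum(d * v for d, v in zip(row, y)) for row in D]

n, x0, x_end = 20, 0.0, 1.0
h = (x_end - x0) / n
x = [x0 + i * h for i in range(n + 1)]             # nodes x_0..x_n
t = [x0 + (i - 0.5) * h for i in range(1, n + 1)]  # interval midpoints t_1..t_n
y = [math.sin(xi) for xi in x]

dy = matvec(midpoint_diff_matrix(n, h), y)
# Compare against the exact derivative cos(t) at the midpoints
max_err = max(abs(d - math.cos(ti)) for d, ti in zip(dy, t))
```

Viewed from a midpoint, each slope is a centered difference with half step $h / 2$ , so the error is expected to decrease like $h^{2}$ for smooth $f$ .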