Additive Nonlinear Operator Approximation
The linear algebra concepts arising from the study of linear mappings between vector spaces, $f : U \to V$, are widely applicable to nonlinear functions also. The study of nonlinear approximation starts with the simplest case of approximation of a function with scalar values and arguments, $f : \mathbb{R} \to \mathbb{R}$, through additive corrections.
1. Function spaces
An immediate application of the linear algebra framework is to construct vector spaces of real functions $f : \mathbb{R} \to \mathbb{R}$, with the addition and scaling operations induced pointwise from $\mathbb{R}$,
$$(f + g)(x) = f(x) + g(x), \quad (\alpha f)(x) = \alpha f(x).$$
Comparing with the real vector space $\mathbb{R}^m$, in which the analogous operations are defined componentwise,
$$(u + v)_i = u_i + v_i, \quad (\alpha u)_i = \alpha u_i,$$
the key difference that arises is the dimension of the set of vectors. Finite-dimensional vectors within $\mathbb{R}^m$ can be regarded as functions defined on a finite index set $\{1, 2, \dots, m\}$, with $u(i) = u_i$. The elements of the function space are however functions defined on $\mathbb{R}$, a set with cardinality of the continuum, $2^{\aleph_0} > \aleph_0$, with $\aleph_0$ the cardinality of the naturals $\mathbb{N}$. This leads to a review of the concept of a basis for this infinite-dimensional case.
1.1. Infinite-dimensional basis set
In the finite-dimensional case, the column vectors of $A = [\, a_1\ a_2\ \dots\ a_m \,]$ constituted a basis if any vector $b$ could be expressed uniquely as a linear combination of the column vectors of $A$,
$$b = x_1 a_1 + x_2 a_2 + \dots + x_m a_m = A x.$$
While the above finite sum is well defined, there is no consistent definition of an infinite sum of vectors. As a simple example, in the vector space of real numbers $\mathbb{R}$, any finite sum of reals is well defined, for instance
$$S_n = \sum_{k=0}^{n} (-1)^k \in \{0, 1\},$$
but the limit
$$\lim_{n \to \infty} S_n$$
cannot be determined, since the partial sums oscillate between $1$ and $0$. This leads to the necessity of seeking finite linear combinations to span a vector space $\mathcal{V} = (V, S, +, \cdot)$. First, define linear independence of an infinite (possibly uncountable) set of vectors $A = \{ v_\alpha \mid \alpha \in I \}$, where $I$ is some indexing set.
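The behavior of such non-convergent partial sums is easily observed numerically. A minimal sketch, using the alternating (Grandi) series as an illustrative example:

```julia
# Partial sums S_n = sum_{k=0}^{n} (-1)^k of the alternating series:
# each finite sum is well defined, but the values oscillate between
# 1 and 0, so no limit exists as n grows.
S(n) = sum((-1)^k for k in 0:n)
partials = [S(n) for n in 0:5]   # 1, 0, 1, 0, 1, 0
```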
Definition. The vector set $A = \{ v_\alpha \mid \alpha \in I \}$ is linearly independent if, for any finite subset $\{ v_{\alpha_1}, \dots, v_{\alpha_n} \} \subset A$, the only scalars $x_1, \dots, x_n$ that satisfy
$$x_1 v_{\alpha_1} + x_2 v_{\alpha_2} + \dots + x_n v_{\alpha_n} = 0 \tag{1}$$
are $x_1 = 0$, $x_2 = 0$, …, $x_n = 0$.
The important aspect of the above definition is that all finite vector
subsets are linearly independent. The same approach is applied in the
definition of a spanning set.
Definition. Vectors within the set $A$ span the vector space $\mathcal{V}$, stated as $V = \operatorname{span}(A)$, if for any $v \in V$ there exist a finite number of scalars $x_1, \dots, x_n$ and vectors $v_{\alpha_1}, \dots, v_{\alpha_n} \in A$ such that
$$v = x_1 v_{\alpha_1} + \dots + x_n v_{\alpha_n}. \tag{2}$$
This now allows a generally applicable definition of basis and dimension.
Definition. The vector set $B$ is a basis for vector space $\mathcal{V}$ if
-
$B$ is linearly independent;
-
$V = \operatorname{span}(B)$.
Definition. The dimension of a vector space $\mathcal{V}$ is the cardinality of a basis set $B$, $\dim \mathcal{V} = \operatorname{card}(B)$.
The use of finite sums to define linear independence and bases is not overly restrictive, since it can be proven that every vector space has a basis. The proof of this theorem is based on Zorn's lemma from set theory, and asserts the existence of a basis, but provides no constructive procedure. The difficulty of practical construction of bases for infinite-dimensional vector spaces is illustrated through basic examples.
Example. $\mathbb{R}^\infty$. As a generalization of $\mathbb{R}^m$, consider the vector space of real sequences
$$u = (u_1, u_2, u_3, \dots),$$
represented as vectors with a countably infinite number of components. Linear combinations are defined componentwise,
$$(\alpha u + \beta v)_i = \alpha u_i + \beta v_i.$$
Let $e_i$ denote the vector of all zeros except a one in the $i$th position. In $\mathbb{R}^m$, the identity matrix $I = [\, e_1\ \dots\ e_m \,]$ was a basis, but this does not generalize to $\mathbb{R}^\infty$; for example the vector $u = (1, 1, 1, \dots)$ cannot be obtained by finite linear combination of the $e_i$ vectors. In fact, there is no countable set of vectors that spans $\mathbb{R}^\infty$.
Example. $\mathbb{P}$. The vector space of polynomials on the real line has an easily constructed basis, namely the set of the monomials
$$\{ 1, t, t^2, t^3, \dots \},$$
an infinite set with the same cardinality as the naturals, $\aleph_0$.
1.2. Alternatives to the concept of a basis
The difficulty in ascribing significance to an infinite sum of vectors can be resolved by endowing the vector space with additional structure, in particular a way to define convergence of the partial sums
$$s_n = \sum_{k=1}^{n} x_k v_k$$
to a limit $v$, $\lim_{n \to \infty} s_n = v$.
Fourier series.
One approach is the introduction of an inner product $(f, g)$ and the associated norm $\| f \| = (f, f)^{1/2}$. A considerable advantage of this approach is that it not only allows infinite linear combinations, but also the definition of orthonormal spanning sets. An example is the vector space of continuous functions defined on $[-\pi, \pi]$ with the inner product
$$(f, g) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, g(t)\, \mathrm{d} t,$$
and norm $\| f \| = (f, f)^{1/2}$. An orthonormal spanning set for this space is given by
$$\left\{ \tfrac{1}{\sqrt{2}},\ \cos t,\ \sin t,\ \cos 2t,\ \sin 2t,\ \dots \right\}.$$
Complete vector spaces endowed with an inner product are known as Hilbert spaces.
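Orthonormality under such an inner product can be checked numerically. A minimal sketch, assuming the normalization $(f, g) = \frac{1}{\pi} \int_{-\pi}^{\pi} f g \, \mathrm{d}t$ (the specific normalization is an assumption of this sketch) and a simple midpoint quadrature:

```julia
# Numerical check of orthonormality of the trigonometric set under the
# assumed inner product (f,g) = (1/π) ∫_{-π}^{π} f(t) g(t) dt,
# approximated by a midpoint-rule quadrature.
function inner(f, g; N=10_000)
    h = 2π / N
    t = range(-π + h/2, π - h/2, length=N)   # midpoints of subintervals
    (1/π) * sum(f.(t) .* g.(t)) * h
end
inner(sin, sin)                  # ≈ 1 (unit norm)
inner(sin, cos)                  # ≈ 0 (orthogonality)
inner(t -> 1/√2, t -> 1/√2)      # ≈ 1 (unit norm of the constant)
```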
Taylor series.
Convergence of infinite sums can also be determined through a norm, without the need of an inner product. An example is the space of real-analytic functions with the $\infty$-norm (sup-norm)
$$\| f \|_\infty = \sup_t | f(t) |,$$
for which a spanning set is given by the monomials $\{1, t, t^2, \dots\}$, and the infinite expansion
$$f(t) = \sum_{k=0}^{\infty} c_k t^k$$
is convergent, with coefficients given by the Taylor series, $c_k = f^{(k)}(0)/k!$. Note that orthogonality of the spanning set cannot be established, absent an inner product.
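The convergence of Taylor partial sums in the sup-norm can be observed numerically. A minimal sketch for $f = \exp$ (an illustrative choice), where $c_k = 1/k!$:

```julia
# Taylor partial sums of exp about 0: c_k = f^{(k)}(0)/k! = 1/k!.
# The sup-norm error on [-1,1] decreases as terms are added.
taylor_exp(t, n) = sum(t^k / factorial(k) for k in 0:n)
err(n) = maximum(abs(taylor_exp(t, n) - exp(t))
                 for t in range(-1, 1, length=101))
err(4), err(8)    # error shrinks rapidly with the number of terms
```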
1.3. Common function spaces
Several function spaces find widespread application in scientific
computation. An overview is provided in Table 1.
$B(\mathbb{R})$ | bounded functions
$C(\mathbb{R})$ | continuous functions
$C^m(\mathbb{R})$ | functions with continuous derivatives up to order $m$
$C_c(\mathbb{R})$ | continuous functions with compact support
$C_c^m(\mathbb{R})$ | functions in $C^m(\mathbb{R})$ with compact support
$C_0(\mathbb{R})$ | continuous functions that vanish at infinity
$C^\infty(\mathbb{R})$ | smooth functions
$L^p(\mathbb{R})$ | functions with finite $p$-norm, $\| f \|_p = \left( \int | f |^p \right)^{1/p} < \infty$
$W^{k,p}(\mathbb{R})$ | Sobolev space, functions with derivatives up to order $k$ in $L^p$, with norm $\| f \|_{W^{k,p}}$

Table 1. Common vector spaces of functions
2. Interpolation
The interpolation problem seeks the representation of a function $f$ known only through a sample data set
$$D = \{ (x_i, y_i = f(x_i)),\ i = 0, 1, \dots, n \},$$
by an approximant $p(t)$, obtained through combination of elements from some family of computable functions, $B = \{ b_0(t), b_1(t), \dots \}$. The approximant $p$ is an interpolant of $D$ if
$$p(x_i) = y_i, \quad i = 0, 1, \dots, n,$$
i.e., it passes through the known sample points (nodes) of the function $f$. The objective is to use the thus determined $p(t)$ to approximate the function at other points. Assuming $x_0 < x_1 < \dots < x_n$, evaluation of $p(t)$ at $t \in (x_0, x_n)$ is an interpolation, while evaluation at $t < x_0$ or $t > x_n$ is an extrapolation. The basic problems arising in interpolation are:
-
choice of the family $B$ from which to build the approximant $p$;
-
choice of the combination technique;
-
estimation of the error of the approximation, given some knowledge of $f$.
Algorithms for interpolation of real functions can readily be extended to more complicated objects, e.g., interpolation of matrix representations of operators. Implementation is aided by programming-language polymorphism, as available in Julia.
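As an illustration of such polymorphism (a hypothetical sketch, separate from the package defined below), a generic nested evaluation written for scalars applies unchanged to square matrices through Julia's multiple dispatch:

```julia
using LinearAlgebra  # for the identity I

# Generic nested (Horner-style) polynomial evaluation: works for any
# argument type supporting + and *, so the same code evaluates a
# polynomial at a scalar or at a square matrix.
horner(t, a) = foldr((c, p) -> c + p * t, a)   # a = [a0, a1, ..., an]

horner(2.0, [1, -2, 3])                         # scalar: 1 - 2·2 + 3·4 = 9
horner([2 0; 0 2], [1.0I(2), -2.0I(2), 3.0I(2)])  # matrix argument
```

The same dispatch mechanism is what allows the interpolation routines below to be reused for matrix-valued data.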
Define a Julia package for function approximation, to be stored on GitHub. First, set the git user configuration.
Shell] |
git config --global user.name "SorinMitran" |
Shell] |
git config --global user.email "mitran@unc.edu" |
Next, define the package directory.
∴ |
home=unsafe_string(ccall(:getenv, Cstring, (Cstring,), "HOME")); |
∴ |
pkgdir = home * "/courses/MATH661/packages"; |
Define a package template, using the PkgTemplates package.
∴ |
using PkgTemplates; |
∴ |
tmpl=Template(;
user="SorinMitran", dir=pkgdir,
plugins = [ License(; name="MPL"),
Git(; manifest=true, ssh=true),
GitHubActions(; x86=true),
Codecov(),
Documenter{GitHubActions}(),
Develop(),
],
); |
∴ |
tmpl("Interpolation661"); |
[ Info: Running prehooks
[ Info: Running hooks
Activating environment at
‘~/courses/MATH661/packages/Interpolation661/Project.toml‘
Updating registry at
‘~/.julia/registries/General‘
No Changes to
‘~/web/courses/MATH661/packages/Interpolation661/Project.toml‘
No Changes to
‘~/web/courses/MATH661/packages/Interpolation661/Manifest.toml‘
Precompiling project…
✓ Interpolation661
1 dependency successfully precompiled in 1 seconds
Activating environment at
‘~/.julia/environments/v1.6/Project.toml‘
Activating new environment at
‘~/courses/MATH661/packages/Interpolation661/docs/Project.toml‘
Resolving package versions…
Installed IOCapture v0.2.2
Installed ANSIColoredPrinters v0.0.1
Installed Parsers v2.1.0
Installed DocStringExtensions v0.8.6
Installed Documenter v0.27.10
Updating
‘~/web/courses/MATH661/packages/Interpolation661/docs/Project.toml‘
[e30172f5] + Documenter v0.27.10
Updating
‘~/web/courses/MATH661/packages/Interpolation661/docs/Manifest.toml‘
[a4c015fc] + ANSIColoredPrinters v0.0.1
[ffbed154] + DocStringExtensions v0.8.6
[e30172f5] + Documenter v0.27.10
[b5f81e59] + IOCapture v0.2.2
[682c06a0] + JSON v0.21.2
[69de0a69] + Parsers v2.1.0
[2a0f44e3] + Base64
[ade2ca70] + Dates
[b77e0a4c] + InteractiveUtils
[76f85450] + LibGit2
[56ddb016] + Logging
[d6f4376e] + Markdown
[a63ad114] + Mmap
[ca575930] + NetworkOptions
[de0858da] + Printf
[3fa0cd96] + REPL
[9a3f8284] + Random
[ea8e919c] + SHA
[9e88b42a] + Serialization
[6462fe0b] + Sockets
[8dfed614] + Test
[4ec0a83e] + Unicode
Precompiling project…
✓ IOCapture
✓ ANSIColoredPrinters
✓ DocStringExtensions
✓ Parsers
✓ JSON
✓ Documenter
6 dependencies successfully precompiled in 5 seconds
2 dependencies precompiled but different versions are currently loaded. Restart julia to access the new versions
Resolving package versions…
Updating
‘~/web/courses/MATH661/packages/Interpolation661/docs/Project.toml‘
[c8cb1979] + Interpolation661 v0.1.0
‘..‘
Updating
‘~/web/courses/MATH661/packages/Interpolation661/docs/Manifest.toml‘
[c8cb1979] + Interpolation661 v0.1.0
‘..‘
Activating environment at
‘~/.julia/environments/v1.6/Project.toml‘
[ Info: Running posthooks
Resolving package versions…
Updating
‘~/.julia/environments/v1.6/Project.toml‘
[c8cb1979] + Interpolation661 v0.1.0
‘~/courses/MATH661/packages/Interpolation661‘
Updating
‘~/.julia/environments/v1.6/Manifest.toml‘
[c8cb1979] + Interpolation661 v0.1.0
‘~/courses/MATH661/packages/Interpolation661‘
[ Info: New package is at
/home/mitran/courses/MATH661/packages/Interpolation661
2.1. Additive corrections
As is to be expected, a widely used combination technique is linear combination,
$$p(t) = a_0 b_0(t) + a_1 b_1(t) + \dots + a_n b_n(t).$$
The idea is to capture the nonlinearity of $f$ through the functions $b_0(t), \dots, b_n(t)$, while maintaining the framework of linear combinations. Sampling of $p$ at the nodes $x_0, \dots, x_n$ of the data set $D$ constructs the vectors
$$b_j = [\, b_j(x_0)\ \ b_j(x_1)\ \dots\ b_j(x_n) \,]^T,$$
which gathered together into a matrix $B = [\, b_0\ b_1\ \dots\ b_n \,]$ leads to the formulation of the interpolation problem as
$$B a = y. \tag{3}$$
Before choosing some specific function set $B$, some general observations are useful.
-
The function values $y_i = f(x_i)$, $i = 0, \dots, n$, are directly incorporated into the interpolation problem (3). Any estimate of the error at other points requires additional information on $f$. Such information can be furnished by bounds on the function values, or by knowledge of its derivatives, for example.
-
A solution to (3) exists if $y$ lies in the column space of $B$. Economical interpolations would use a small number of functions in the set $B$.
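The additive-correction framework (3) can be made concrete: sample a chosen family $b_j$ at the nodes, assemble the matrix $B$, and solve for the coefficients. A minimal sketch, where the exponential family $b_j(t) = e^{jt}$ and the sample data are illustrative choices only:

```julia
# Sketch of the additive-correction framework: assemble B from samples
# of chosen basis functions and solve B*a = y for the coefficients.
b(j, t) = exp(j * t)                 # illustrative family b_j(t) = e^{jt}
x = [0.0, 0.5, 1.0]                  # sample nodes (illustrative)
y = [1.0, 2.0, 3.0]                  # sampled function values (illustrative)
B = [b(j, xi) for xi in x, j in 0:2] # B[i+1, j+1] = b_j(x_i)
a = B \ y                            # linear combination coefficients
p(t) = sum(a[j+1] * b(j, t) for j in 0:2)
maximum(abs(p(xi) - yi) for (xi, yi) in zip(x, y))   # ≈ 0 at the nodes
```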
2.2. Polynomial interpolation
Monomial form of interpolating polynomial.
As noted above, the vector space of polynomials has an easily constructed basis, that of the monomials $m_j(t) = t^j$, which shall be organized as a row vector of functions
$$M(t) = [\, 1\ \ t\ \ t^2\ \dots \,].$$
With $M_n(t) = [\, 1\ \ t\ \dots\ t^n \,]$ denoting the first $n + 1$ monomials, a polynomial of degree $n$ is the linear combination
$$p(t) = M_n(t)\, a = a_0 + a_1 t + \dots + a_n t^n.$$
Let
$$M = M_n(x) = \begin{bmatrix} 1 & x_0 & \dots & x_0^n \\ 1 & x_1 & \dots & x_1^n \\ \vdots & & & \vdots \\ 1 & x_n & \dots & x_n^n \end{bmatrix}$$
denote the matrix obtained from evaluation of the first $n + 1$ monomials at the sample points $x_0, \dots, x_n$. The above notation conveys that a finite-dimensional matrix is obtained from evaluation of the row vector of the monomial basis functions, $M_n(t)$, at the column vector of sample points $x = [\, x_0\ \dots\ x_n \,]^T$.
The interpolation conditions $p(x_i) = y_i$ lead to the linear system
$$M a = y. \tag{4}$$
For a solution to exist for arbitrary $y$, $M$ must be of full rank, hence square with $n + 1$ distinct sample points, in which case $M$ becomes the Vandermonde matrix, known to be ill-conditioned. Since $M$ is square and of full rank, (4) has a unique solution.
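System (4) can be solved directly, and the growth of the Vandermonde condition number observed. A minimal sketch, with sine samples chosen for illustration:

```julia
using LinearAlgebra  # for cond

# Assemble the Vandermonde matrix for the monomial interpolation
# system M*a = y and solve for the coefficients.
n = 6
x = range(-π, π, length=n+1)
y = sin.(x)
M = [xi^j for xi in x, j in 0:n]   # Vandermonde matrix M[i+1, j+1] = x_i^j
a = M \ y                          # monomial coefficients
cond(M)                            # grows rapidly with n (ill-conditioning)
```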
Finding the polynomial coefficients by solving the above linear system requires $O(n^3)$ operations. Evaluation of the monomial form is economically accomplished in $O(n)$ operations through Horner's scheme,
$$p(t) = a_0 + t \left( a_1 + t \left( a_2 + \dots + t \left( a_{n-1} + t\, a_n \right) \dots \right) \right). \tag{5}$$
 |
Figure 1. Monomial basis over the interval $[-\pi, \pi]$.
∴ |
using PyPlot  # plotting package assumed in the original session
n=6; x=range(-π,π,length=n+1); y=sin.(x); |
∴ |
t=range(-π,π,length=10*n); |
∴ |
M=ones(length(t),length(x)); |
∴ |
for j=2:n+1
M[:,j] = t.^(j-1)
end |
∴ |
for j=1:n+1
plot(t,M[:,j])
end |
∴ |
xlabel("t"); ylabel("m(j,t)"); grid("on"); title("Monomial basis"); |
∴ |
imdir=home*"/courses/MATH661/images/"; |
∴ |
savefig(imdir*"L18MonomialBasis.eps") |
Algorithm (Horner's scheme)
Input: $t$, coefficients $a = [\, a_0\ a_1\ \dots\ a_n \,]$
Output: $p = p(t)$
∴ |
function Horner(t,a)
n=length(a)-1; p=a[n+1]
for k=n:-1:1
p=a[k]+p*t
end
return p
end; |
∴ |
p2(t)=3*t^2-2*t+1; ap2=[1 -2 3]; |
∴ |
t=-3:3; [p2.(t) Horner.(t,Ref(ap2))] |
Lagrange form of interpolating polynomial.
It is possible to reduce the operation count to find the interpolating polynomial by carrying out an $LU$ decomposition of the monomial matrix, $M = L U$. Let $\mathcal{L}(t) = [\, \ell_0(t)\ \ell_1(t)\ \dots\ \ell_n(t) \,]$ denote another set of basis functions that evaluates to the identity matrix at the sample points $x$, such that $\ell_j(x_i) = \delta_{ij}$, $\mathcal{L}(x) = I$. For arbitrary $t$, the relationship
$$M_n(t) = \mathcal{L}(t)\, M$$
describes a linear mapping between the monomials and the $\ell_j$ functions, a mapping which is invertible since $M$ is of full rank,
$$\mathcal{L}(t) = M_n(t)\, M^{-1} = M_n(t)\, U^{-1} L^{-1}.$$
Note that organization of bases as row vectors of functions leads to linear mappings expressed through right factors. The $LU$ factorization of the Vandermonde matrix can be determined analytically, as exemplified for $n = 3$ by
In[37]:= |
V[n_]:=Table[Subscript[x, i]^j,{i, 0, n},{j, 0, n}];
n=3; n1=n+1; M=V[n] |
In[13]:= |
LU=Simplify[LUDecomposition[M]][[1]] |
In[14]:= |
L=Table[ If[i<j,LU[[j,i]], If[i==j, 1, 0]],{j,1,n1},{i,1,n1}] |
In[15]:= |
U=Table[ If[i>=j,LU[[j,i]], 0],{j,1,n1},{i,1,n1}] |
Both factors can be inverted analytically, e.g., for $n = 3$:
In[19]:= |
Linv=Simplify[Inverse[L]] |
In[20]:= |
Uinv=Simplify[Inverse[U]] |
The functions that result for $n = 3$ can be generalized as
$$\ell_i(t) = \prod_{j=0,\ j \ne i}^{n} \frac{t - x_j}{x_i - x_j},$$
known as the Lagrange basis set, where the prime traditionally placed on the product symbol, $\prod'$, skips the index $j = i$. Note that each member of the basis is a polynomial of degree $n$.
In[23]:= |
Simplify[{1,t,t^2,t^3}.Uinv.Linv] |
By construction, through the condition that the basis evaluates to the identity matrix at the sample points, a Lagrange basis function evaluated at a sample point is $\ell_i(x_j) = \delta_{ij}$. A polynomial of degree $n$ is expressed as a linear combination of the Lagrange basis functions by
$$p(t) = c_0 \ell_0(t) + c_1 \ell_1(t) + \dots + c_n \ell_n(t).$$
The interpolant of data $D$ is determined through the conditions $p(x_i) = c_i = y_i$, i.e., the linear combination coefficients are simply the sampled function values $y_i$,
$$p(t) = \sum_{i=0}^{n} y_i\, \ell_i(t). \tag{7}$$
Determining the linear combination coefficients may be without cost, but evaluation of the Lagrange form (7) of the interpolating polynomial requires $O(n^2)$ operations, significantly more costly than the $O(n)$ operations required by Horner's scheme (5).
Algorithm (Lagrange evaluation)
Input: $t$, nodes $x$, values $y$
Output: $p = \sum_{i=0}^{n} y_i\, \ell_i(t)$
∴ |
function Lagrange(t,x,y)
n=length(x)-1; p=0
for i=1:n+1
w=1
for j=1:n+1
if (i!=j) w=w*(t-x[j])/(x[i]-x[j]); end
end
p = p + w*y[i]
end
return p
end; |
∴ |
t=-3:3; x=[-2 0 2]; y=p2.(x); [p2.(t) Lagrange.(t,Ref(x),Ref(y))] |
 |
Figure 2. Lagrange basis for $n = 6$ sample points of $\sin t$ over the interval $[-\pi, \pi]$.
∴ |
n=6; x=range(-π,π,length=n+1); y=sin.(x); |
∴ |
t=range(-π,π,length=10*n); |
∴ |
M=ones(length(t),length(x)); |
∴ |
function lagrange(j,t,x,y)
n=length(x)-1; l=1
for i=1:n+1
if (i!=j) l = l*(t-x[i])/(x[j]-x[i]); end
end
return l
end; |
∴ |
for j=1:n+1
M[:,j] = lagrange.(j,t,Ref(x),Ref(y))
end |
∴ |
for j=1:n+1
plot(t,M[:,j])
end |
∴ |
xlabel("t"); ylabel("l(i,t)"); grid("on"); title("Lagrange basis"); |
∴ |
imdir=home*"/courses/MATH661/images/"; |
∴ |
savefig(imdir*"L18LagrangeBasis.eps") |
A reformulation of the Lagrange basis can however reduce the operation count. Let
$$\phi(t) = \prod_{j=0}^{n} (t - x_j),$$
and rewrite $\ell_i(t)$ as
$$\ell_i(t) = \frac{\phi(t)}{t - x_i}\, w_i,$$
with the weights
$$w_i = \prod_{j=0,\ j \ne i}^{n} \frac{1}{x_i - x_j}$$
depending only on the function sample arguments $x_i$, but not on the function values $y_i$. The interpolating polynomial is now
$$p(t) = \phi(t) \sum_{i=0}^{n} \frac{w_i}{t - x_i}\, y_i.$$
Interpolation of the constant function $g(t) = 1$ would give
$$1 = \phi(t) \sum_{i=0}^{n} \frac{w_i}{t - x_i},$$
and taking the ratio yields
$$p(t) = \left. \sum_{i=0}^{n} \frac{w_i}{t - x_i}\, y_i \right/ \sum_{i=0}^{n} \frac{w_i}{t - x_i},$$
known as the barycentric Lagrange formula (by analogy to computation of a center of mass). Evaluation of the weights $w_i$ costs $O(n^2)$ operations, but can be done once for any set of nodes $x_i$. The evaluation of $p(t)$ now becomes an $O(n)$ process, comparable in cost to Horner's scheme.
Algorithm (Barycentric Lagrange evaluation)
Input: $t$, nodes $x$, values $y$
Output: $p = q / r$, with $q = \sum_i \frac{w_i\, y_i}{t - x_i}$; $r = \sum_i \frac{w_i}{t - x_i}$; return $y_i$ directly if $t = x_i$
∴ |
function BaryLagrange(t,x,y)
n=length(x)-1; w=ones(size(x));
for i=1:n+1
w[i]=1
for j=1:n+1
if (i!=j) w[i]=w[i]/(x[i]-x[j]); end
end
end
q=r=0
for i=1:n+1
d=t-x[i]
if d≈0 return y[i]; end
s=w[i]/d; q=q+y[i]*s; r=r+s
end
return q/r
end; |
∴ |
t=-3:3; x=[-2 0 2]; y=p2.(x); [p2.(t) BaryLagrange.(t,Ref(x),Ref(y))] |
Newton form of interpolating polynomial.
Inverting only one factor of the $M = L U$ mapping yields yet another basis set,
$$N(t) = M_n(t)\, U^{-1}.$$
The first basis polynomials that result are, up to scaling factors, $1$, $t - x_0$, $(t - x_0)(t - x_1)$, $(t - x_0)(t - x_1)(t - x_2)$, and in general the $j$th basis member is a scaled product of the factors $(t - x_i)$, $i = 0, \dots, j - 1$.
In[22]:= |
Simplify[{1,x,x^2,x^3}.Uinv] |
Computation of the scaling factors would require $O(n^2)$ operations, but can be avoided by redefining the basis set as $N(t) = [\, n_0(t)\ n_1(t)\ \dots\ n_n(t) \,]$, with $n_0(t) = 1$, and
$$n_j(t) = \prod_{i=0}^{j-1} (t - x_i), \quad j = 1, \dots, n,$$
known as the Newton basis. As usual, the coefficients $d_0, \dots, d_n$ of the linear combination of Newton polynomials
$$p(t) = d_0 + d_1 (t - x_0) + \dots + d_n (t - x_0) \cdots (t - x_{n-1})$$
are determined from the interpolation conditions $p(x_i) = y_i$. The resulting linear system is of lower triangular form, and readily solved by forward substitution.
The first few coefficients are $d_0 = y_0$, $d_1 = (y_1 - y_0)/(x_1 - x_0)$, and for $d_2$:
In[24]:= |
d2=(y2-(x2-x0)(y1-y0)/(x1-x0)-y0)/(x2-x0)/(x2-x1) |
In[25]:= |
dd2=((y2-y1)/(x2-x1)-(y1-y0)/(x1-x0))/(x2-x0) |
In[26]:= |
Simplify[dd2-d2] |
The forward substitution is efficiently expressed through the definition of divided differences,
$$[y_i] = y_i, \quad [y_i, y_{i+1}] = \frac{y_{i+1} - y_i}{x_{i+1} - x_i},$$
or in general, the $k$th-order divided difference
$$[y_i, y_{i+1}, \dots, y_{i+k}] = \frac{[y_{i+1}, \dots, y_{i+k}] - [y_i, \dots, y_{i+k-1}]}{x_{i+k} - x_i},$$
given in terms of the lower-order divided differences. The forward substitution computations are conveniently organized in a table, useful both for hand computation and also for code implementation.
$x_0$  $y_0$
$x_1$  $y_1$  $[y_0, y_1]$
$x_2$  $y_2$  $[y_1, y_2]$  $[y_0, y_1, y_2]$
$x_3$  $y_3$  $[y_2, y_3]$  $[y_1, y_2, y_3]$  $[y_0, y_1, y_2, y_3]$

Table 2. Table of divided differences. The Newton basis coefficients $d_k = [y_0, \dots, y_k]$ are the diagonal terms.
Algorithm (Forward substitution, Newton
coefficients)
∴ |
function DivDif(x,y)
n=length(x)-1; d=copy(y)
for i=2:n+1
for j=n+1:-1:i
d[j] = (d[j]-d[j-1])/(x[j]-x[j-i+1])
end
end
return d
end; |
∴ |
p2(t)=3*t^2-2*t+1; x=[-2 0 2]; y=p2.(x); d=DivDif(x,y) |
The above algorithm requires only $O(n^2)$ operations, and the Newton form of the interpolating polynomial
$$p(t) = \sum_{k=0}^{n} d_k\, n_k(t)$$
can also be evaluated in $O(n)$ operations.
Algorithm (Newton polynomial evaluation)
Input: $t$, nodes $x$, divided-difference coefficients $d$
Output: $p = \sum_{k=0}^{n} d_k\, n_k(t)$
∴ |
function Newton(t,x,d)
n=length(x)-1; p=d[1]; r=1
for k=2:n+1
r = r*(t-x[k-1])
p = p + d[k]*r
end
return p
end; |
∴ |
p2(t)=3*t^2-2*t+1; x=[-2 0 2]; y=p2.(x); d=DivDif(x,y); |
∴ |
t=-3:3; [p2.(t) Newton.(t,Ref(x),Ref(d))] |
 |
Figure 3. Newton basis for $n = 6$ sample points over the interval $[-\pi, \pi]$.
∴ |
n=6; x=range(-π,π,length=n+1); y=sin.(x); |
∴ |
t=range(-π,π,length=10*n); |
∴ |
M=ones(length(t),length(x)); |
∴ |
function newton(j,t,x,y)
n=length(x)-1; nj=1
for i=1:j-1
nj = nj*(t-x[i])
end
return nj
end; |
∴ |
for j=1:n+1
M[:,j] = newton.(j,t,Ref(x),Ref(y))
end |
∴ |
for j=1:n+1
plot(t,M[:,j])
end |
∴ |
xlabel("t"); ylabel("n(j,t)"); grid("on"); title("Newton basis"); |
∴ |
imdir=home*"/courses/MATH661/images/"; |
∴ |
savefig(imdir*"L18NewtonBasis.eps") |