MATH661

Lecture 18: Best Approximant

1. Best approximants

Interpolation of data $𝒟 = \{(x_{i}, y_{i} = f (x_{i})), i = 0, \dots, n\}$ by an approximant $p (t)$ corresponds to the minimization problem

{min}_{p} || f - p ||,

in the discrete one-norm at the sample points $x_{i}$

|| f || = {|| 𝒇 ||}_{1} = \sum_{i = 0}^{n} | f (x_{i}) | .

Different approximants are obtained upon changing the norm.
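To illustrate how the choice of norm changes the approximant, the following minimal sketch restricts the approximant to a constant $c$ (an assumption made here for simplicity) and uses the known minimizers: the median for the discrete one-norm, the mean for the two-norm, and the midrange for the inf-norm. The sample values and the helper names `best_constant` and `error` are hypothetical.

```python
from statistics import mean, median

def best_constant(values, norm):
    """Best constant approximant c of the samples in the chosen discrete norm."""
    if norm == "1":
        return median(values)                    # minimizes sum_i |y_i - c|
    if norm == "2":
        return mean(values)                      # minimizes sum_i (y_i - c)^2
    if norm == "inf":
        return (min(values) + max(values)) / 2   # minimizes max_i |y_i - c|
    raise ValueError("norm must be '1', '2' or 'inf'")

def error(values, c, norm):
    """Norm of the residual y - c for the chosen discrete norm."""
    r = [abs(y - c) for y in values]
    if norm == "1":
        return sum(r)
    if norm == "2":
        return sum(e * e for e in r) ** 0.5
    return max(r)

# Hypothetical samples with one outlier.
y = [0.0, 1.0, 2.0, 3.0, 10.0]
for norm in ("1", "2", "inf"):
    c = best_constant(y, norm)
    print(norm, c, error(y, c, norm))
```

The outlier pulls the inf-norm constant furthest, the two-norm constant less, and leaves the one-norm constant at the middle sample.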

Theorem (Existence of best approximant). For any element $f \in F$ in a normed vector space $ℱ = (F, S, +, \cdot)$, there exists a best approximant $g \in G$ within a finite dimensional subspace $G \subset F$ that is a solution of

${min}_{g \in G} || f - g || .$

The argument underlying the above theorem is based upon constructing the closed and bounded subset of $G$

K = \{g \in G : || g - f || ⩽ || 0 - f || = || f ||\} \subset G .

Since $0 \in G$ already achieves the error $|| f ||$, any best approximant must lie in $K$. Since $G$ is finite dimensional, $K$ is compact, and the continuous mapping $g \to || g - f ||$ attains its extrema on $K$.

The two main classes of approximants $g$ of real functions $f : [a, b] \to ℝ$ that arise are:

Approximants based upon sampling

The vectors $𝒇 = f (𝒙), 𝒈 = g (𝒙)$ are constructed at sample points $𝒙 \in ℝ^{m}$ and the best approximant solves the problem

{min}_{g \in G} || 𝒇 - 𝒈 || .

Note that the minimization is carried out over the members of the subspace $G$, not over the vectors $𝒈$. The norm can include information on derivatives as in the norm

{|| f ||}_{H} = {|| 𝒇 ||}_{1} + {|| 𝒇^{'} ||}_{1},

arising in Hermite interpolation.

Approximants over the function domain

The norm is now expressed through an integral such as the $p$ -norms

{|| f ||}_{p} = {(\int_{a}^{b} {| f (t) |}^{p} d t)}^{1 / p} .

In general, the best approximant in a normed space is not unique. However, the best approximant is unique in a Hilbert space, and is further characterized by orthogonality of the residual to the approximation subspace.

Theorem (Best Approximant in Hilbert space). For any element $f \in F$ in a Hilbert space $ℱ = (F, S, +, \cdot)$ , there exists a unique approximant $g \in G$ within a finite dimensional subspace $G \subset F$ that is a solution of

${min}_{g \in G} || f - g ||,$

and the residual $f - g$ is orthogonal to $G$, i.e., for all $h \in G$

$(f - g, h) = 0 .$

Note that orthogonality of the residual $(f - g, h) = 0$ implies $(f, h) = (g, h)$ or that the best approximant is the projection of $f$ onto $G$ .

2. Two-norm approximants in Hilbert spaces

For Hilbert spaces, in which the norm is induced by the scalar product

|| f || = {(f, f)}^{1 / 2},

finding the best approximant reduces to a problem within $ℝ^{m}$ (or $ℂ^{m}$). Introduce a basis $ℬ = \{b_{1}, b_{2}, \dots\}$ for $ℱ$ such that any $f \in F$ has an expansion

f (t) = \sum_{j = 1}^{\infty} f_{j} b_{j} (t),

with coefficients $f_{j} = (f, b_{j})$ when the basis is orthonormal.

Since $G$ is finite dimensional, say $n = \dim (G)$ , an approximant has expansion

g (t) = \sum_{j = 1}^{n} g_{j} b_{s (j)} (t) .

Note that the approximation may lie in an arbitrary finite-dimensional subspace of $ℱ$ . Choosing the appropriate subset through the function $s : ℕ \to ℕ$ is an interesting problem in itself, leading to the goal of selecting those basis functions that capture the largest components of $f$ , i.e., the solution of

{max}_{𝒔 \in ℕ^{n}} \sum_{j = 1}^{n} | (f, b_{s (j)}) | .

Approximate solutions of the basis component selection are obtained by processes such as greedy approximation or clustering algorithms. The approach typically adopted is to exploit the Bessel inequality

\sum_{i = 1}^{n} f_{s (i)}^{2} ⩽ {|| f ||}^{2},

and select

s (1) = \arg {max}_{i \in S} f_{i}^{2},

eliminate $s (1)$ from $S$, and search again. The $k^{th}$ step is

s (k) = \arg {max}_{i \in S_{k}} f_{i}^{2},

with $S_{k} = S \setminus \{s (1), \dots, s (k - 1)\}$ the index set remaining after the first $k - 1$ selections.
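The greedy selection can be sketched as follows, assuming the coefficients $f_{j}$ with respect to an orthonormal basis are already available as a finite list. The function name `greedy_select` and the coefficient values are hypothetical, and indices are zero-based, unlike the one-based indices of the text.

```python
def greedy_select(coeffs, n):
    """Return the indices s(1), ..., s(n) of the largest squared coefficients."""
    remaining = set(range(len(coeffs)))      # index set S
    chosen = []
    for _ in range(min(n, len(coeffs))):
        # s(k) = argmax over the remaining index set S_k of f_i^2
        k = max(remaining, key=lambda i: coeffs[i] ** 2)
        chosen.append(k)
        remaining.remove(k)                  # eliminate s(k) and search again
    return chosen

# Hypothetical coefficients f_j = (f, b_j).
f_coeffs = [0.1, -2.0, 0.5, 1.5, -0.05]
s = greedy_select(f_coeffs, 3)
captured = sum(f_coeffs[i] ** 2 for i in s)  # left side of the Bessel inequality
total = sum(c ** 2 for c in f_coeffs)        # stands in for ||f||^2 in this finite example
print(s, captured, total)
```

For a fixed orthonormal basis this greedy search gives the same result as sorting the coefficients by magnitude. The step-by-step form matters when the coefficients are recomputed after each selection.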

Assuming $s (j) = j$ , the orthogonality relation $f - g ⊥ G$ leads to a linear system

(f - g, b_{i}) = 0 \Rightarrow (\sum_{j = 1}^{n} g_{j} b_{j}, b_{i}) = \sum_{j = 1}^{n} (b_{i}, b_{j}) g_{j} = (f, b_{i}) \Rightarrow 𝑩 𝒈 = 𝒇 .

If the basis is orthonormal, then $𝑩 = 𝑰$ , and the best approximant is simply given by the projection of $f$ onto the basis elements. Note that the scalar product need not be the Euclidean discrete or continuous versions

(f, g) = \sum_{i = 1}^{n} f_{i} g_{i}, (f, g) = \int_{a}^{b} f (t) g (t) d t .

A weighting function may be present as in

(f, g) = 𝒇^{T} 𝑾 𝒈, (f, g) = \int_{a}^{b} f (t) g (t) w (t) d t,

discrete and continuous versions, respectively. In essence, each specific problem comes with an appropriate measure $μ (t)$,

d μ (t) = w (t) d t,

which need not be the Euclidean measure $w (t) = 1$.
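The linear system $𝑩 𝒈 = 𝒇$ obtained from the orthogonality relation can be sketched for a weighted discrete scalar product with a diagonal $𝑾$. This is a minimal illustration under assumed data: the sample points, the weights, the function $f (t) = t^{3}$, and the monomial basis $1, t, t^{2}$ are all hypothetical choices. The helpers `dot` and `solve` are written out so that the block needs no external library.

```python
def dot(u, v, w):
    """Weighted discrete scalar product (u, v) = sum_i u_i v_i w_i."""
    return sum(ui * vi * wi for ui, vi, wi in zip(u, v, w))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]      # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Hypothetical data: samples of f(t) = t^3 with non-uniform weights.
t = [-1.0, -0.5, 0.0, 0.5, 1.0]
w = [0.5, 1.0, 1.0, 1.0, 0.5]
f = [ti ** 3 for ti in t]
basis = [[1.0] * len(t), t, [ti ** 2 for ti in t]]    # b_1 = 1, b_2 = t, b_3 = t^2

B = [[dot(bi, bj, w) for bj in basis] for bi in basis]  # Gram matrix (b_i, b_j)
rhs = [dot(f, bi, w) for bi in basis]                   # right-hand side (f, b_i)
g = solve(B, rhs)                                       # coefficients g_j

approx = [sum(gj * bj[i] for gj, bj in zip(g, basis)) for i in range(len(t))]
residual = [fi - ai for fi, ai in zip(f, approx)]
print(g, [dot(residual, bi, w) for bi in basis])
```

The second printed list holds the scalar products of the residual with each basis vector. They vanish up to round-off, which is the orthogonality condition $(f - g, b_{i}) = 0$.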

3. Inf-norm approximants

In the vector space of continuous functions defined on a topological space $X$ (e.g., a closed and bounded set in $ℝ^{n}$ ), a norm can be defined by

|| f || = {max}_{x \in X} | f (x) |,

and the best approximant is found by solving the problem

{inf}_{g \in G} || f - g || = {inf}_{g \in G} {max}_{x \in X} | f (x) - g (x) | .

The fact that $g$ is the best approximant of $f$ can be restated as $0$ being the best approximant of $f - g$, since for all $h \in G$

|| f - g - 0 || ⩽ || f - (g + h) || .

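A key role is played by the points where the residual $f (x) - g (x)$ attains its maximal absolute value $|| f - g ||$. This leads to the definition of the critical set of a function $f$ as

crit (f) = 𝒵 (f) = \{x \in X : | f (x) | = || f ||\},

which is then applied to the residual $f - g$.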

When $G = P_{n - 1}$, the space of polynomials of degree at most $n - 1$, with $\dim P_{n - 1} = n$, the best approximant can be characterized by the number of sign alternations of $f (x) - g (x)$ over its critical set.

Theorem (Chebyshev Alternation). The polynomial $p \in P_{n - 1}$ is the best approximant of $f : [a, b] \to ℝ$ in the inf-norm

${|| f - p ||}_{\infty} = {max}_{a ⩽ x ⩽ b} | f (x) - p (x) |$

if and only if there exist $n + 1$ points $a ⩽ x_{0} < x_{1} < \dots < x_{n} ⩽ b$ such that

$f (x_{i}) - p (x_{i}) = s \cdot {(- 1)}^{i} {|| f - p ||}_{\infty},$

where $| s | = 1$ .
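A small worked example may help fix the counting in the theorem. Assume, for concreteness, $f (x) = x^{2}$ on $[- 1, 1]$ and $n = 2$, so that $p \in P_{1}$. The constant $p (x) = 1 / 2$ gives the residual

f (x) - p (x) = x^{2} - 1 / 2 = T_{2} (x) / 2, T_{2} (x) = 2 x^{2} - 1,

with ${|| f - p ||}_{\infty} = 1 / 2$. At the $n + 1 = 3$ points $x_{0} = - 1 < x_{1} = 0 < x_{2} = 1$ the residual takes the values $+ 1 / 2, - 1 / 2, + 1 / 2$, that is

f (x_{i}) - p (x_{i}) = {(- 1)}^{i} / 2, i = 0, 1, 2,

which is the alternation condition with $s = + 1$. By the theorem, $p = 1 / 2$ is therefore the best inf-norm approximant of $x^{2}$ in $P_{1}$ on $[- 1, 1]$.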

Recall that choosing the $n$ nodes $x_{i} = \cos [(2 i - 1) π / (2 n)]$, $i = 1, \dots, n$, the roots of the Chebyshev polynomial $T_{n} (x) = \cos (n θ)$ (with $x = \cos θ$, $a = - 1$, $b = 1$), leads to the optimal error bound for the interpolant $p \in P_{n - 1}$

| f (t) - p (t) | ⩽ \frac{{|| f^{(n)} ||}_{\infty}}{n ! 2^{n - 1}} .

The error bound came about from consideration of the alternation of signs of $p (x_{i}) - q (x_{i})$ at the extrema of the Chebyshev polynomial $T_{n}$, $x_{i} = \cos (i π / n)$, $i = 0, 1, \dots, n$, with $p, q$ monic polynomials. The Chebyshev alternation theorem generalizes this observation and allows the formulation of a general approach to finding the best inf-norm approximant known as the Remez algorithm. The idea is that rather than seeking to satisfy the interpolation conditions

𝑴 𝒂 = 𝒚

in the monomial basis

𝑴 = ℳ_{n - 1} (x) = [\begin{array}{llll} 𝟏 & 𝒙 & \dots & 𝒙^{n - 1} \end{array}] \in ℝ^{n \times n},

attempt to find $n + 1$ alternating-sign extrema points $𝒙 \in ℝ^{n + 1}$ by considering the basis set

𝑹 = ℛ_{n} (𝒙) = [\begin{array}{lllll} 𝟏 & 𝒙 & \dots & 𝒙^{n - 1} & \pm 𝟏 \end{array}] \in ℝ^{(n + 1) \times (n + 1)}

with $\pm 𝟏 = {[\begin{array}{llll} + 1 & - 1 & + 1 & \dots \end{array}]}^{T} \in ℝ^{n + 1}$.

Algorithm (Remez)

Initialize $𝒙 \in ℝ^{n + 1}$ to the Chebyshev extrema on the interval $[a, b]$
Solve $𝑹 𝒄 = f (𝒙)$ with $𝑹 = ℛ_{n} (𝒙)$, $𝒄^{T} = [\begin{array}{ll} 𝒂^{T} & c_{n + 1} \end{array}]$, $𝒂 \in ℝ^{n}$
Find the extrema $𝒚$ of $p (t) - f (t)$ with $p (t) = a_{0} + a_{1} t + \dots + a_{n - 1} t^{n - 1}$
If $p (y_{i}) - f (y_{i})$ are approximately equal in absolute value and of alternating signs, return $𝒂$
Otherwise set $𝒙 = 𝒚$ and repeat from the solve step
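The steps above can be sketched in a few lines. This is a simplified illustration, not a production implementation: the extrema of $p (t) - f (t)$ are located by a search over a uniform grid (an assumption made here; the lecture does not specify how they are found), one extremum is kept per run of constant sign, and surplus extrema are dropped from the ends. The names `remez`, `solve`, `grid`, `tol` and `max_iter` are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def remez(f, a, b, n, grid=2001, tol=1e-9, max_iter=50):
    """Sketch of the Remez iteration for p in P_{n-1} on [a, b]."""
    mid, half = (a + b) / 2, (b - a) / 2
    # Chebyshev extrema mapped to [a, b], in increasing order
    x = [mid - half * math.cos(i * math.pi / n) for i in range(n + 1)]
    ts = [a + (b - a) * k / (grid - 1) for k in range(grid)]
    coef, E = [], 0.0
    for _ in range(max_iter):
        # Rows of R: [1, x_i, ..., x_i^(n-1), (-1)^i]
        R = [[xi ** j for j in range(n)] + [(-1.0) ** i] for i, xi in enumerate(x)]
        c = solve(R, [f(xi) for xi in x])
        coef, E = c[:n], c[n]

        def err(t):
            return sum(cj * t ** j for j, cj in enumerate(coef)) - f(t)

        # Keep the largest |p - f| within each run of constant sign on the grid
        runs = []
        for t in ts:
            e = err(t)
            if runs and (e >= 0) == (runs[-1][1] >= 0):
                if abs(e) > abs(runs[-1][1]):
                    runs[-1] = (t, e)
            else:
                runs.append((t, e))
        while len(runs) > n + 1:      # drop the weaker end, keeping alternation
            if abs(runs[0][1]) < abs(runs[-1][1]):
                runs.pop(0)
            else:
                runs.pop()
        if len(runs) < n + 1:
            break                     # too few alternations: stop in this sketch
        x = [t for t, _ in runs]
        mags = [abs(e) for _, e in runs]
        if max(mags) - min(mags) <= tol * max(mags):
            break                     # errors are levelled: accept the coefficients
    return coef, E, x

# Linear (n = 2) approximant of exp on [-1, 1].
coef, E, x = remez(math.exp, -1.0, 1.0, 2)
print(coef, E, x)
```

Because the extrema are restricted to grid points, the result is the best approximant on the grid rather than on the whole interval. A finer grid or a local root search on the derivative of the error would tighten this.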