For the steady-state heat equation with spatially-varying diffusivity, symmetric discretizations on uniform grids lead to systems with $A=A^T$ and a regular sparsity pattern. Irregular domain discretization leads to more complicated sparsity patterns that require different approaches to solving the linear system. It is important to link the changes in the structure of $A$ to specific aspects of the approximation procedure. Consider the difficulties of applying finite difference discretization on a domain $\Omega$ of arbitrary shape with boundary $\partial\Omega$ (Fig. 1). At a grid node $x_i$ closer to the boundary than the uniform spacing $h$, centered finite difference formulas would refer to undefined values outside the domain. One-sided finite difference formulas would fail to take into account boundary values for the problem. Taylor series expansions could be used, with $x_i - \alpha h \in \partial\Omega$, $0 < \alpha < 1$,

$$ u(x_i+h) = u(x_i) + h\, u'(x_i) + \frac{h^2}{2} u''(x_i) + O(h^3), \qquad u(x_i-\alpha h) = u(x_i) - \alpha h\, u'(x_i) + \frac{\alpha^2 h^2}{2} u''(x_i) + O(h^3), $$
from which elimination of the second derivative leads to an approximation of the first derivative as

$$ u'(x_i) \cong \frac{\alpha^2 u(x_i+h) - u(x_i-\alpha h) - (\alpha^2-1)\, u(x_i)}{\alpha(\alpha+1)\, h}. \qquad (1) $$
Note that setting $\alpha=1$ would place $x_i - \alpha h$ at a grid node, and from (1) the familiar centered finite difference approximation of the first derivative

$$ u'(x_i) \cong \frac{u(x_i+h) - u(x_i-h)}{2h} $$
is recovered. For an arbitrary domain the values of $\alpha$ would vary from node to node and the resulting linear system would no longer be symmetric. From a physical perspective this might be surprising at first since the operator $\nabla^2$ is isotropic, but isotropy is a property of the infinitesimal limit. Upon irregular discretization the problem $A\mathbf{u}=\mathbf{b}$ is only an approximation of the physical problem $-\nabla^2 u = f$, and can exhibit different behavior, in this case loss of isotropy near the boundaries.
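A quick numerical check of approximation (1) can be carried out in a few lines of Python. The sketch below is ours, not from the text: it evaluates the irregular-spacing formula for $u'(x)$ from samples $u(x+h)$ and $u(x-\alpha h)$, and confirms that $\alpha = 1$ behaves just like the centered difference.

```python
import math

# Check of the irregular-spacing first-derivative formula (1) (sketch, ours):
# u'(x) from u(x+h) and u(x - alpha*h), with alpha the fraction of h
# separating the node from the boundary; alpha = 1 is the centered difference.

def deriv_irregular(u, x, h, alpha):
    num = alpha**2 * u(x + h) - u(x - alpha * h) - (alpha**2 - 1.0) * u(x)
    return num / (alpha * (alpha + 1.0) * h)

x, h = 0.7, 1e-3
for alpha in (1.0, 0.3):
    err = abs(deriv_irregular(math.sin, x, h, alpha) - math.cos(x))
    print(f"alpha={alpha}: error={err:.2e}")  # second-order accurate in h
```

Both values of $\alpha$ give errors of order $h^2$, as predicted by the Taylor series elimination.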
Figure 1. Finite difference grid near an arbitrarily shaped boundary $\partial\Omega$.
Computing the appropriate mesh size fractions $\alpha$ for all grid points near a boundary is an onerous task, and suggests seeking a different approach. A fruitful idea is to separate the problem of geometric description from that of the physics expressed by some operator $L$. Domains within $\mathbb{R}^d$ of arbitrary complexity can be approximated to any desired precision by a simplicial covering. Simplices are the simplest geometric objects with non-zero measure in a space. For $\mathbb{R}^1$ these are line segments that can approximate arbitrary curves. The corresponding simplices for $\mathbb{R}^2$ and $\mathbb{R}^3$ are triangles and tetrahedra, respectively. Consider $d=2$ and specify a set of triangles $T_k$, $k=1,\dots,N_T$, with vertices $\mathbf{x}_{k,j}$, $j=1,2,3$, that form a partition of precision $\epsilon$ of the domain $\Omega$,

$$ \mu(T_j \cap T_k) = 0 \ \ \text{for } j \neq k, \qquad \left| \mu\!\left( \bigcup_{k=1}^{N_T} T_k \right) - \mu(\Omega) \right| \le \epsilon. $$
The above conditions state that intersections of triangles must have zero measure $\mu$ in $\mathbb{R}^2$, i.e., triangles can share edges or vertices but cannot overlap over a non-zero area. The area of the union of triangles approximates the area of the overall domain $\Omega$ to within $\epsilon$.
In a finite difference discretization the function $u(x)$ is approximated by a set of values $u_i \cong u(x_i)$, often referred to as a grid function. Similarly, a set of values $u_i$ can be defined at the triangle vertices $\mathbf{x}_i$. Denote the vertex coordinates of triangle $T_k$ by $(x_{k,j}, y_{k,j})$, $j=1,2,3$. Values of $u$ within the triangle are determined through piecewise linear interpolation, a generalization of one-dimensional $B$-splines, using the form functions

$$ N_{k,j}(x,y) = \frac{A_{k,j}(x,y)}{A_k}, \quad j = 1,2,3, $$

where $A_{k,j}(x,y)$ is the area of the triangle formed by the point $(x,y)$ and the two vertices of $T_k$ other than $\mathbf{x}_{k,j}$,
with the triangle area

$$ A_k = \frac{1}{2} \left| \det \begin{pmatrix} 1 & x_{k,1} & y_{k,1} \\ 1 & x_{k,2} & y_{k,2} \\ 1 & x_{k,3} & y_{k,3} \end{pmatrix} \right|. $$
Note that for $(x,y) \in T_k$ the form functions give the fractions of the overall area occupied by the three interior triangles, such that $N_{k,1} + N_{k,2} + N_{k,3} = 1$. The linear spline interpolation of $u$ based upon the vertex values $u_{k,j}$ is
$$ u(x,y) \cong \sum_{j=1}^{3} u_{k,j}\, N_{k,j}(x,y), \quad (x,y) \in T_k, \qquad (2) $$
the familiar form of a linear combination. It is customary to set $N_{k,j}(x,y) = 0$ if $(x,y) \notin T_k$, recovering the framework of $B$-splines. Since each form function thus extended is non-zero only over the single triangle $T_k$, a finite element of the domain, such an approach is commonly referred to as a finite element method (FEM).
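The form functions above are straightforward to evaluate from the area formula. The following sketch (function and variable names are ours, not from the text) computes $N_{k,1}, N_{k,2}, N_{k,3}$ at a point and verifies the partition-of-unity property inside the triangle.

```python
# Linear form functions on a triangle (sketch, ours): each N_j is the area of
# the sub-triangle formed by the evaluation point and the two opposite
# vertices, divided by the total triangle area A_k.

def area(p1, p2, p3):
    """Unsigned triangle area from the determinant formula."""
    return 0.5 * abs((p2[0]-p1[0])*(p3[1]-p1[1]) - (p3[0]-p1[0])*(p2[1]-p1[1]))

def form_functions(p, v1, v2, v3):
    A = area(v1, v2, v3)
    return (area(p, v2, v3) / A, area(v1, p, v3) / A, area(v1, v2, p) / A)

v1, v2, v3 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
N = form_functions((0.25, 0.25), v1, v2, v3)
print(N, sum(N))  # inside the triangle the form functions sum to 1
```

At a vertex the corresponding form function equals 1 and the other two vanish, so interpolation (2) reproduces the vertex values exactly.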
Various approaches can be applied to derive an algebraic system for the vertex values $u_i$ from the conservation law of interest. Consider the operator $L = -\nabla^2$ and the static equilibrium equation $-\nabla^2 u = f$ in $\Omega$ with Dirichlet boundary conditions $u = g$ on $\partial\Omega$. When $u$ denotes temperature, this is a statement of thermal equilibrium where heat fluxes balance external heating and imposed temperature values on the boundary. One commonly used approach closely resembles the least squares approximation of $\mathbf{b} \in \mathbb{R}^m$,

$$ \min_{\mathbf{x} \in \mathbb{R}^n} \| \mathbf{b} - A\mathbf{x} \|_2. $$
The approximant of $\mathbf{b}$ in this case is its projection onto the column space $C(A)$, $QQ^T\mathbf{b}$, with $A = QR$ the (incomplete) $QR$ decomposition of $A$. The error of this approximation is orthogonal to $C(A)$,

$$ Q^T \left( \mathbf{b} - A\mathbf{x} \right) = \mathbf{0}. \qquad (3) $$
The generalization of (3) in which the finite-dimensional residual vector $\mathbf{b} - A\mathbf{x}$ is replaced by the residual of the function $u$ that satisfies $-\nabla^2 u = f$ is

$$ \left( N_{k,j},\, f + \nabla^2 u \right) = 0, \quad j = 1,2,3, \qquad (4) $$
for each triangle $T_k$, with $(\,\cdot\,,\,\cdot\,)$ denoting the scalar product

$$ (v, w) = \int_{\Omega} v(x,y)\, w(x,y)\, \mathrm{d}x\, \mathrm{d}y. $$
The analogy can be understood by recognizing that finite element approximants lie within the span of the form functions $N_{k,j}$ over all triangles $T_k$ and their vertices $j$. This is known as a Galerkin method, with (4) expressing orthogonality of the error and all form functions $N_{k,j}$, leading upon substitution of (2) to

$$ \left( N_{k,i},\, f \right) + \left( N_{k,i},\, \nabla^2 \sum_{j=1}^{3} u_{k,j} N_{k,j} \right) = 0, \quad i = 1,2,3. $$
The null result of applying the second-order differential operator onto a linear form function is avoided through integration by parts (divergence theorem),

$$ \left( N_{k,i},\, \nabla^2 u \right) = \oint_{\partial\Omega} N_{k,i}\, \nabla u \cdot \mathbf{n}\, \mathrm{d}s - \int_{\Omega} \nabla N_{k,i} \cdot \nabla u\, \mathrm{d}x\, \mathrm{d}y. $$
Assembling contributions from all triangles results in a linear system $A\mathbf{u} = \mathbf{b}$, expressing an approximation of the steady-state heat equation $-\nabla^2 u = f$.
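Since the form functions are linear, their gradients are constant over each triangle and the integrals $\int_{T_k} \nabla N_i \cdot \nabla N_j$ have a closed form in the vertex coordinates. The sketch below (a standard result; notation is ours, not the text's) computes this element matrix for a single triangle.

```python
import numpy as np

# Element stiffness matrix K_ij = integral over a linear triangle of
# grad(N_i) . grad(N_j) (standard closed form; sketch, not from the text).

def element_stiffness(v1, v2, v3):
    (x1, y1), (x2, y2), (x3, y3) = v1, v2, v3
    b = np.array([y2 - y3, y3 - y1, y1 - y2])   # dN_j/dx up to a 1/(2A) factor
    c = np.array([x3 - x2, x1 - x3, x2 - x1])   # dN_j/dy up to a 1/(2A) factor
    A = 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    return (np.outer(b, b) + np.outer(c, c)) / (4.0 * A)

K = element_stiffness((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
print(K)  # symmetric; rows sum to zero since constants lie in the null space
```

Summing such element matrices over all triangles sharing each vertex performs the assembly into the global system; note that a degenerate triangle ($A_k \to 0$) inflates the entries, which is one mechanism behind the conditioning issues discussed in the text.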
It is illuminating to note that though the physical process itself is isotropic, the FEM approximation typically exhibits a loss of isotropy due to the different sizes of the triangulation elements. The fact that the approximation depends on the domain discretization is not surprising; this also occurred for finite difference approximations, as evidenced by the dependency of the system eigenvalues on the grid spacing $h$. The particularity of FEM discretization is that the single parameter $h$ has been replaced by the individual geometry of all triangles within the domain partition. It is to be expected that the resulting matrices will exhibit condition numbers that are monotonic with respect to $\max_k A_k / \min_k A_k$, the ratio of the largest triangle area to the smallest. This is readily understood: when $\min_k A_k \to 0$ the spanning set becomes close to linearly dependent since one of its members approaches the zero element. The same effect is obtained if the aspect ratio of a triangle becomes large (i.e., one of its angles is close to zero), since again the spanning set is close to linearly dependent. A finite element system matrix will still exhibit sparsity since each form function is non-zero on only one triangle. The sparsity pattern is however determined by the connectivity, i.e., the number of triangles sharing each vertex. A typical sparsity pattern is shown in Fig. 3. If the physical principle of action and reaction (Newton's third law) is respected by the discretization, the matrix will still be symmetric, a considerable advantage with respect to the use of Taylor series to extend finite difference methods to arbitrary domains.
From the above general observations it becomes apparent that the solution techniques considered up to now are inadequate. Factorization methods such as $LU$ or Cholesky would lead to fill-in and loss of sparsity. Additive splitting $A = M - N$ is no longer trivially implemented since connectivity has to be accounted for other than by simple loops. The already slow convergence rate of methods based upon additive splitting is likely to degrade further, or the iteration may even diverge, due to the influence the spatial discretization has upon the eigenvalues of the iteration matrix $M^{-1}N$. Similar considerations apply to gradient descent.
An alternative approach is to seek a suitable basis $\{\mathbf{q}_1, \mathbf{q}_2, \dots\}$ in which to iteratively construct improved approximations of the solution of the discretized system $A\mathbf{u} = \mathbf{b}$,

$$ \mathbf{u}^{(n)} = \sum_{j=1}^{n} c_j\, \mathbf{q}_j. $$
Vectors within the basis set should be economical to compute and also lead to fast convergence in the sense that the coefficient vector $\mathbf{c}$ should have components that rapidly decrease in absolute value. One idea is to recognize that for a sparse system matrix $A \in \mathbb{R}^{m \times m}$ with an average of $\nu$ nonzero elements per row, the cost to evaluate the matrix-vector product $A\mathbf{x}$ is only $O(\nu m)$ as opposed to $O(m^2)$ for a full system with $\nu = m$. This suggests considering the vector set

$$ \left\{ \mathbf{x},\, A\mathbf{x},\, A^2\mathbf{x},\, \dots,\, A^{n-1}\mathbf{x} \right\}, $$
starting from some arbitrary vector $\mathbf{x}$. The resulting sequence of vectors has been encountered already in the power iteration method for computing eigenvalues and eigenvectors of $A$: for large $n$, $A^n\mathbf{x}$ will tend to align with the eigenvector associated with the eigenvalue of largest absolute value, leading to the ill-conditioned matrices

$$ K_n = \begin{pmatrix} \mathbf{x} & A\mathbf{x} & A^2\mathbf{x} & \cdots & A^{n-1}\mathbf{x} \end{pmatrix}. $$
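This ill-conditioning is easy to observe numerically. The demonstration below (ours, with an arbitrarily chosen test matrix) builds the raw Krylov vectors for a diagonal matrix with well-separated eigenvalues and reports the condition number, which is already enormous after a handful of matrix-vector products.

```python
import numpy as np

# Demonstration (sketch, ours): the raw Krylov vectors x, Ax, A^2 x, ...
# become nearly linearly dependent, so cond(K_n) grows rapidly with n.

rng = np.random.default_rng(0)
m = 50
A = np.diag(np.arange(1.0, m + 1.0))   # test matrix with known eigenvalues
x = rng.standard_normal(m)

cols = [x]
for _ in range(7):
    cols.append(A @ cols[-1])          # one matvec per new basis candidate
K = np.column_stack(cols)
print(f"cond(K_8) = {np.linalg.cond(K):.2e}")  # huge even for n = 8
```

The later columns are dominated by the component along the eigenvector of the largest eigenvalue, which is precisely why orthogonalization is needed before the set can serve as a basis.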
As in the development of power iteration into more robust methods for eigenvalue approximation, the ill-conditioning of $K_n$ can be eliminated by orthogonalization of its columns. In fact, the procedure can be organized so as to iteratively add one more vector to the orthonormal vectors $\mathbf{q}_1, \dots, \mathbf{q}_n$ already obtained. Start in iteration $n=1$ from $\mathbf{q}_1 = \mathbf{x}/\|\mathbf{x}\|$. A new direction is obtained through multiplication by $A$, $\mathbf{v} = A\mathbf{q}_n$. Gram-Schmidt orthogonalization leads to

$$ \mathbf{q}_{n+1} = \frac{1}{h_{n+1,n}} \left( A\mathbf{q}_n - \sum_{j=1}^{n} h_{jn}\, \mathbf{q}_j \right), \quad h_{jn} = \mathbf{q}_j^T A\mathbf{q}_n, \quad h_{n+1,n} = \left\| A\mathbf{q}_n - \sum_{j=1}^{n} h_{jn}\, \mathbf{q}_j \right\|. $$
The above can be written as

$$ A\mathbf{q}_n = \sum_{j=1}^{n+1} h_{jn}\, \mathbf{q}_j. \qquad (5) $$
Note that

$$ \operatorname{span}\{\mathbf{q}_1, \dots, \mathbf{q}_n\} = \operatorname{span}\{\mathbf{x}, A\mathbf{x}, \dots, A^{n-1}\mathbf{x}\}, $$
thus constructing a sequence of vector spaces of increasing dimension when $\mathbf{x}$ is not an eigenvector of $A$. These are known as Krylov spaces, $\mathcal{K}_n(A, \mathbf{x})$. In matrix form, with $Q_n = \begin{pmatrix} \mathbf{q}_1 & \cdots & \mathbf{q}_n \end{pmatrix}$, the iteration (5) generalizes to
$$ A Q_n = Q_{n+1} \tilde{H}_n, \qquad (6) $$

with $\tilde{H}_n \in \mathbb{R}^{(n+1) \times n}$ the upper Hessenberg matrix of the coefficients $h_{jn}$.
The resulting algorithm is known as the Arnoldi iteration.
Algorithm (Arnoldi)
$\mathbf{q}_1 = \mathbf{x}/\|\mathbf{x}\|$
for $n = 1, 2, \dots$
  $\mathbf{v} = A\mathbf{q}_n$
  for $j = 1$ to $n$
    $h_{jn} = \mathbf{q}_j^T \mathbf{v}$;  $\mathbf{v} \leftarrow \mathbf{v} - h_{jn}\mathbf{q}_j$
  end
  $h_{n+1,n} = \|\mathbf{v}\|$;  $\mathbf{q}_{n+1} = \mathbf{v}/h_{n+1,n}$
end
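The Arnoldi iteration transcribes directly into NumPy. The sketch below (variable names are ours) builds the orthonormal basis and the Hessenberg matrix, then verifies relation (6) and the orthonormality of the basis.

```python
import numpy as np

# Arnoldi iteration (sketch, ours): build an orthonormal Krylov basis Q and
# the upper Hessenberg matrix H satisfying A Q_n = Q_{n+1} H (relation (6)).

def arnoldi(A, x, n):
    m = len(x)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = x / np.linalg.norm(x)
    for k in range(n):
        v = A @ Q[:, k]
        for j in range(k + 1):            # Gram-Schmidt against q_1, ..., q_k
            H[j, k] = Q[:, j] @ v
            v = v - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(v)   # no breakdown handling in this sketch
        Q[:, k + 1] = v / H[k + 1, k]
    return Q, H

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 20))
x = rng.standard_normal(20)
Q, H = arnoldi(A, x, 5)
print(np.allclose(A @ Q[:, :5], Q @ H))   # relation (6) holds
print(np.allclose(Q.T @ Q, np.eye(6)))    # columns are orthonormal
```

The inner loop uses the modified Gram-Schmidt ordering (each projection is removed before the next coefficient is computed), which is numerically more stable than projecting against the unmodified vector.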
Approximate solutions to the system $A\mathbf{u} = \mathbf{b}$ can now be obtained by choosing the starting vector of the embedded Krylov spaces as $\mathbf{x} = \mathbf{b}$ and solving the least squares problem

$$ \min_{\mathbf{u} \in \mathcal{K}_n(A, \mathbf{b})} \| \mathbf{b} - A\mathbf{u} \|_2. \qquad (7) $$
Problem (7) is reformulated using (6) as

$$ \min_{\mathbf{y} \in \mathbb{R}^n} \left\| \beta \mathbf{e}_1 - \tilde{H}_n \mathbf{y} \right\|_2, \quad \mathbf{u} = Q_n \mathbf{y}, $$

with $\beta = \|\mathbf{b}\|$, since $\mathbf{b} = \beta \mathbf{q}_1 = \beta Q_{n+1} \mathbf{e}_1$ and the columns of $Q_{n+1}$ are orthonormal, so that $\|\mathbf{b} - A Q_n \mathbf{y}\|_2 = \|Q_{n+1}(\beta \mathbf{e}_1 - \tilde{H}_n \mathbf{y})\|_2 = \|\beta \mathbf{e}_1 - \tilde{H}_n \mathbf{y}\|_2$. This is known as the generalized minimal residual algorithm (GMRES).
Algorithm (GMRES)
$\beta = \|\mathbf{b}\|$, $\mathbf{q}_1 = \mathbf{b}/\beta$, tolerance $\varepsilon$
for $n = 1, 2, \dots$
  $\mathbf{v} = A\mathbf{q}_n$
  for $j = 1$ to $n$
    $h_{jn} = \mathbf{q}_j^T \mathbf{v}$;  $\mathbf{v} \leftarrow \mathbf{v} - h_{jn}\mathbf{q}_j$
  end
  $h_{n+1,n} = \|\mathbf{v}\|$;  $\mathbf{q}_{n+1} = \mathbf{v}/h_{n+1,n}$
  # Least squares approximation
  $\mathbf{y}_n = \arg\min_{\mathbf{y}} \|\beta \mathbf{e}_1 - \tilde{H}_n \mathbf{y}\|_2$;  $r_n = \|\beta \mathbf{e}_1 - \tilde{H}_n \mathbf{y}_n\|_2$;  $\mathbf{u}_n = Q_n \mathbf{y}_n$
  if $r_n < \varepsilon$ return($\mathbf{u}_n$)
end
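The GMRES algorithm can likewise be sketched in NumPy. The implementation below is ours (no restarting or breakdown handling, and the test matrix is an arbitrary well-conditioned perturbation of the identity); `np.linalg.lstsq` solves the small $(n+1) \times n$ least squares problem at each step.

```python
import numpy as np

# GMRES sketch (ours): Arnoldi iteration plus, at each step, the small least
# squares problem min || beta*e1 - H*y || and the approximation u = Q*y.

def gmres(A, b, tol=1e-10, nmax=50):
    m = len(b)
    beta = np.linalg.norm(b)
    Q = np.zeros((m, nmax + 1)); H = np.zeros((nmax + 1, nmax))
    Q[:, 0] = b / beta
    for n in range(1, nmax + 1):
        v = A @ Q[:, n - 1]
        for j in range(n):                      # modified Gram-Schmidt
            H[j, n - 1] = Q[:, j] @ v
            v = v - H[j, n - 1] * Q[:, j]
        H[n, n - 1] = np.linalg.norm(v)
        Q[:, n] = v / H[n, n - 1]
        e1 = np.zeros(n + 1); e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:n + 1, :n], e1, rcond=None)
        res = np.linalg.norm(e1 - H[:n + 1, :n] @ y)
        if res < tol:
            return Q[:, :n] @ y, n
    return Q[:, :nmax] @ y, nmax

rng = np.random.default_rng(2)
m = 30
A = np.eye(m) + 0.05 * rng.standard_normal((m, m))  # clustered eigenvalues
b = rng.standard_normal(m)
u, n = gmres(A, b)
print(n, np.linalg.norm(A @ u - b))  # converges in far fewer than m steps
```

Because the eigenvalues of the test matrix are clustered around 1, the residual decreases geometrically and the iteration stops well before the Krylov space exhausts $\mathbb{R}^m$; production implementations add restarting and Givens-rotation updates of the least squares problem.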