Model Reduction
1. Projection of mappings
1.1. Reduced matrices
The least-squares problem
\[ \min_{x \in \mathbb{R}^n} \| b - A x \| \quad (1) \]
focuses on a simpler representation of a data vector $b \in \mathbb{R}^m$
as a linear combination of the column vectors of $A \in \mathbb{R}^{m \times n}$.
Consider some phenomenon modeled as a function between vector spaces
\[ f : X \to Y, \quad X = \mathbb{R}^n, \quad Y = \mathbb{R}^m, \]
such that for input parameters $x \in X$,
the state of the system is $y = f(x) \in Y$. For most
models $f$ is differentiable, a
transcription of the condition that the system should not exhibit jumps
in behavior when changing the input parameters. Then by appropriate
choice of units and origin, a linearized model
\[ y = A x, \quad A \in \mathbb{R}^{m \times n}, \]
is obtained if $f(0) = 0$,
expressed as (1) if the input $x$ best reproducing a given output $b$ is sought.
A simpler description is often sought, typically based on recognition
that the inputs and outputs of the model can themselves be obtained as
linear combinations
\[ x = B u, \quad y = C v, \]
involving a smaller set of parameters $u \in \mathbb{R}^p$,
$v \in \mathbb{R}^q$, $p \ll n$, $q \ll m$. The
column spaces of the matrices $B \in \mathbb{R}^{n \times p}$,
$C \in \mathbb{R}^{m \times q}$ are vector subspaces of the original sets of inputs and outputs,
$C(B) \leq \mathbb{R}^n$, $C(C) \leq \mathbb{R}^m$.
The sets of column vectors of $B, C$
each form a reduced basis for the system
inputs and outputs if they are chosen to be of full rank. The reduced
bases are assumed to have been orthonormalized through the Gram-Schmidt
procedure such that $B^T B = I_p$,
and $C^T C = I_q$.
Expressing the model inputs and outputs in terms of the reduced bases
leads to
\[ C v = A B u \Rightarrow v = C^T A B u . \]
The matrix
\[ R = C^T A B \in \mathbb{R}^{q \times p} \]
is called the reduced system matrix and is
associated with a mapping $g : \mathbb{R}^p \to \mathbb{R}^q$,
that is a restriction to the
vector subspaces $C(B), C(C)$ of the mapping $f$. When
$f$ is an endomorphism, $f : X \to X$,
$m = n$, the
same reduced basis is used for both inputs and outputs, $x = B u$,
$y = B v$,
and the reduced system is
\[ v = R u, \quad R = B^T A B . \]
Since $B$ is assumed to have orthonormal columns, the
projector onto $C(B)$ is $P = B B^T$.
Applying the projector on the initial model $y = A x$
leads to
\[ B B^T y = B B^T A x, \]
and since $y = B v$, $x = B u$, $B^T B = I_p$,
the relation
\[ B v = B B^T A B u \]
is obtained, and conveniently grouped as
\[ B v = B ( B^T A B ) u, \]
again leading to the reduced model $v = R u$, $R = B^T A B$.
The above calculation highlights that the reduced model is a projection
of the full model $y = A x$
on $C(B)$.
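As a check on the construction above, a minimal numerical sketch (with a randomly generated system matrix and a hypothetical basis orthonormalized by QR factorization, both illustrative choices) forms the reduced matrix $R = B^T A B$ and verifies the projector identity $B v = B B^T A B u$:

```python
import numpy as np

rng = np.random.default_rng(0)

m, p = 6, 3                       # full and reduced dimensions (m = n)
A = rng.standard_normal((m, m))   # full system matrix

# Hypothetical reduced basis: orthonormalize random vectors by QR
B, _ = np.linalg.qr(rng.standard_normal((m, p)))

R = B.T @ A @ B                   # reduced system matrix R = B^T A B

# Projection view: P = B B^T projects onto C(B); for x = B u the
# projected output B B^T A x equals B v with v = R u
u = rng.standard_normal(p)
x = B @ u
v = R @ u
P = B @ B.T
assert np.allclose(P @ (A @ x), B @ v)
```

Note that $R$ is $p \times p$, far smaller than the $m \times m$ matrix $A$ when $p \ll m$.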
1.2. Dynamical system model reduction
An often encountered situation is the reduction of a large-dimensional
dynamical system
\[ M \ddot{q} + C \dot{q} + K q = f, \quad (2) \]
a generalization to multiple degrees of freedom of the damped oscillator
equation
\[ m \ddot{x} + c \dot{x} + k x = f . \]
In (2), $q(t) \in \mathbb{R}^m$ are the time-dependent coordinates of
the system, $f(t) \in \mathbb{R}^m$ the forces acting on the system, and
$M, C, K \in \mathbb{R}^{m \times m}$ are the mass, drag, stiffness matrices, respectively.
When $m \gg 1$,
a reduced description is sought by linear combination of $p \ll m$
basis vectors,
\[ q \cong B r, \quad B \in \mathbb{R}^{m \times p} . \]
Choose $B$
to have orthonormal columns, and project (2) onto $C(B)$
by multiplication with the projector
\[ P = B B^T . \]
Since $B^T B = I_p$, deduce $B^T P = B^T$,
hence multiplication of (2) by $B^T$ and substitution of $q \cong B r$ give
\[ B^T M B \, \ddot{r} + B^T C B \, \dot{r} + B^T K B \, r = B^T f . \]
Introduce notations
\[ M_r = B^T M B, \quad C_r = B^T C B, \quad K_r = B^T K B, \]
for the reduced mass, drag, stiffness matrices, with $M_r, C_r, K_r \in \mathbb{R}^{p \times p}$
of smaller size. The reduced coordinates and forces are
\[ r = B^T q, \quad g = B^T f . \]
The resulting reduced dynamical system is
\[ M_r \ddot{r} + C_r \dot{r} + K_r r = g . \]
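The reduction of (2) can be sketched numerically. The mass-spring chain below is an illustrative choice (unit masses, tridiagonal stiffness, light drag proportional to stiffness), with the lowest-frequency stiffness eigenvectors serving as a hypothetical reduced basis $B$:

```python
import numpy as np

m, p = 20, 4                 # full and reduced dimensions
# Mass-spring chain (illustrative choice): unit masses, light drag
M = np.eye(m)
K = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # stiffness
C = 0.01 * K                                           # drag

# Reduced basis: p lowest-frequency eigenvectors of K (orthonormal)
w, V = np.linalg.eigh(K)
B = V[:, :p]

# Reduced matrices M_r = B^T M B, C_r = B^T C B, K_r = B^T K B
Mr, Cr, Kr = (B.T @ X @ B for X in (M, C, K))

f = np.ones(m)               # forces
g = B.T @ f                  # reduced forces
```

With this eigenvector basis the reduced matrices are diagonal, so the reduced system decouples into $p$ scalar damped oscillators.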
2. Reduced bases
One element is missing from the description of model reduction above:
how is the basis $B$ determined? Domain-specific
knowledge can often dictate an appropriate basis (e.g., a Fourier basis for
periodic phenomena). An alternative approach is to extract an
appropriate basis from observations of a phenomenon, known as
data-driven modeling.
2.1. Correlation matrices
Correlation coefficient.
Consider two functions $x : \mathbb{R} \to \mathbb{R}$, $y : \mathbb{R} \to \mathbb{R}$,
that represent data streams in time of inputs $x(t)$ and
outputs $y(t)$ of some
system. A basic question arising in modeling and data science is
whether the inputs and outputs are themselves in a functional
relationship. This usually is a consequence of incomplete knowledge of
the system, such that while $x, y$
might be assumed to be the most relevant input and output quantities,
this is not yet fully established. A typical approach is then to carry
out repeated measurements leading to a data set $\{ (x_i, y_i), i = 1, \dots, N \}$, thus
defining a relation. Let $x, y \in \mathbb{R}^N$
denote vectors containing the input and output values. The mean values
of the input and output are estimated by the statistics
\[ \bar{x} = E[x] = \frac{1}{N} \sum_{i=1}^{N} x_i, \quad \bar{y} = E[y] = \frac{1}{N} \sum_{i=1}^{N} y_i, \]
where $E$ is the expectation, seen to be a
linear mapping,
whose associated matrix is
\[ E = \frac{1}{N} \begin{bmatrix} 1 & 1 & \dots & 1 \end{bmatrix}, \]
and the means are also obtained by matrix-vector multiplication (linear
combination),
\[ \bar{x} = E x, \quad \bar{y} = E y . \]
Deviation from the mean is measured by the standard
deviation, defined for $x, y$
by
\[ \sigma_x = \left( \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 \right)^{1/2}, \quad \sigma_y = \left( \frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{y})^2 \right)^{1/2} . \]
Note that the standard deviations are no longer linear mappings of the
data.
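A small sketch of the expectation as a linear mapping (the sample values below are arbitrary illustrative choices):

```python
import numpy as np

N = 5
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # sample input values

# Expectation as a linear mapping: E = (1/N) [1 1 ... 1]
E = np.ones((1, N)) / N
mu = (E @ x).item()          # mean by matrix-vector multiplication
assert mu == np.mean(x)      # agrees with the usual statistic

# Standard deviation is NOT a linear mapping of the data
sigma = np.sqrt(np.mean((x - mu) ** 2))
```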
Assume that the origin is chosen such that $\bar{x} = \bar{y} = 0$.
One tool to establish whether the relation $\{ (x_i, y_i) \}$
is also a function is to compute the correlation
coefficient
\[ \rho_{x y} = \frac{E[x y]}{\sigma_x \sigma_y}, \]
that can be expressed in terms of a scalar product and 2-norm as
\[ \rho_{x y} = \frac{x^T y}{\| x \| \, \| y \|} . \]
Squaring each side of the norm property $\| x + y \| \leq \| x \| + \| y \|$, leads to
\[ x^T y \leq \| x \| \, \| y \|, \]
known as the Cauchy-Schwarz inequality, which (applied also with $y$
replaced by $-y$) implies $-1 \leq \rho_{x y} \leq 1$. Depending on
the value of $\rho_{x y}$, the variables $x(t), y(t)$ are said to
be:
- uncorrelated, if $\rho_{x y} \cong 0$;
- correlated, if $\rho_{x y} \cong 1$;
- anti-correlated, if $\rho_{x y} \cong -1$.
The numerator of the correlation coefficient is known as the covariance
of $x, y$,
\[ \operatorname{cov}(x, y) = E[x y] = \frac{1}{N} \sum_{i=1}^{N} x_i y_i . \]
The correlation coefficient can be interpreted as a normalization of the
covariance, and the relation
\[ \rho_{x y} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \]
is the two-variable version of a more general relationship encountered
when the system inputs and outputs become vectors.
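The correlation coefficient formulas can be exercised on synthetic data (the seed, sample size, and noise level below are arbitrary); the zero-mean expression $x^T y / (\| x \| \, \| y \|)$ agrees with NumPy's built-in `corrcoef`:

```python
import numpy as np

rng = np.random.default_rng(1)

N = 1000
x = rng.standard_normal(N)
x -= x.mean()                          # choose origin so the mean is zero
y = 2.0 * x + 0.1 * rng.standard_normal(N)
y -= y.mean()

# Zero-mean correlation coefficient: rho = x.y / (||x|| ||y||)
rho = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

assert abs(rho) <= 1.0                 # Cauchy-Schwarz bound
assert np.isclose(rho, np.corrcoef(x, y)[0, 1])
```

Here $y$ is nearly a multiple of $x$, so $\rho_{x y}$ comes out close to 1 (correlated).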
Patterns in data.
Consider now a related problem, whether the input and output
parameters $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$,
thought to characterize a system are actually well chosen, or whether
they are redundant in the sense that a more insightful description is
furnished by $u \in \mathbb{R}^p$, $v \in \mathbb{R}^q$
with fewer components, $p \ll n$, $q \ll m$.
Applying the same ideas as in the correlation coefficient, a sequence
of $N$ measurements is made, leading to data
sets
\[ X = \begin{bmatrix} x_1 & x_2 & \dots & x_N \end{bmatrix}^T \in \mathbb{R}^{N \times n}, \quad Y = \begin{bmatrix} y_1 & y_2 & \dots & y_N \end{bmatrix}^T \in \mathbb{R}^{N \times m} . \]
Again, by appropriate choice of the origin, the means of the above
measurements are assumed to be zero,
\[ E[X] = 0, \quad E[Y] = 0 . \]
Covariance matrices can be constructed by
\[ C_X = \frac{1}{N} X^T X, \quad C_Y = \frac{1}{N} Y^T Y . \]
Consider now the SVDs of $X, Y$,
\[ X = U \Sigma V^T, \quad Y = \tilde{U} \tilde{\Sigma} \tilde{V}^T, \]
and from
\[ C_X = \frac{1}{N} X^T X = \frac{1}{N} V \Sigma^T U^T U \Sigma V^T = V \left( \frac{1}{N} \Sigma^T \Sigma \right) V^T \]
identify $V$ as the matrix of eigenvectors of $C_X$,
with eigenvalues $\lambda_i = \sigma_i^2 / N$.
Recall that the SVD returns an ordered set of singular values $\sigma_1 \geq \sigma_2 \geq \dots \geq 0$,
and associated singular vectors. In many applications the singular
values decrease quickly, often exponentially fast. Taking the first $p$
singular modes then gives a basis set
\[ B = \begin{bmatrix} v_1 & v_2 & \dots & v_p \end{bmatrix}, \]
suitable for model reduction.
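A sketch of extracting a reduced basis by SVD, on synthetic data concentrated near a low-dimensional subspace (the sizes and noise level are arbitrary choices); the covariance eigendecomposition $C_X = V (\Sigma^T \Sigma / N) V^T$ is verified directly:

```python
import numpy as np

rng = np.random.default_rng(2)

N, n, p = 200, 10, 2
# Synthetic zero-mean data lying mostly in a 2-dimensional subspace
W = rng.standard_normal((n, p))
X = rng.standard_normal((N, p)) @ W.T + 1e-3 * rng.standard_normal((N, n))
X -= X.mean(axis=0)                    # zero the column means

U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Covariance eigendecomposition follows from the SVD:
#   C_X = X^T X / N has eigenvectors V, eigenvalues sigma_i^2 / N
CX = X.T @ X / N
lam, _ = np.linalg.eigh(CX)
assert np.allclose(np.sort(S**2 / N), np.sort(lam))

# Singular values drop sharply after mode p: the first p right
# singular vectors give a reduced basis B = [v_1 ... v_p]
B = Vt[:p].T
assert S[p] / S[0] < 1e-2
```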
3. Stochastic systems - Karhunen-Loève theorem
The data reduction inherent in SVD representations is a generic feature
of natural phenomena. A paradigm for physical systems is the evolution
of correlated behavior against a backdrop of thermal energy, typically
represented as a form of noise.
One mathematical technique to model such systems is the definition of a
stochastic process $\{ X_t \}_{t \in T}$,
where for each fixed $t \in T$,
$X_t$ is a random variable, i.e., a measurable function
$X_t : \Omega \to S$ from a set $\Omega$ of possible outcomes to a
measurable space $S$. The set $\Omega$
is the sample space of a probability triple $(\Omega, \mathcal{F}, P)$, where for
an event $F \in \mathcal{F}$, $P(F) \in [0, 1]$ is the probability of that event.
A measurable space is a set coupled with a procedure to determine
measurable subsets, known as a $\sigma$-algebra.
Theorem. Let $X_t$
be a zero-mean ($E[X_t] = 0$ for all $t$),
square-integrable stochastic process defined over a probability space
$(\Omega, \mathcal{F}, P)$, indexed by $t$ over a closed interval,
$t \in [a, b]$.
Then $X_t$ admits a
representation
\[ X_t = \sum_{k=1}^{\infty} Z_k e_k(t), \]
with
\[ Z_k = \int_a^b X_t \, e_k(t) \, \mathrm{d} t, \]
where $\{ e_k \}$ is an orthonormal basis of $L^2([a, b])$ and the
coefficients $Z_k$ are pairwise uncorrelated random variables.
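As a concrete illustration of such a representation, for the Wiener process on $[0, 1]$ the covariance is $K(s, t) = \min(s, t)$ and the Karhunen-Loève eigenpairs are known in closed form, $\lambda_k = 1 / ((k - \tfrac{1}{2})^2 \pi^2)$, $e_k(t) = \sqrt{2} \sin((k - \tfrac{1}{2}) \pi t)$; the truncated mode sum $\sum_k \lambda_k e_k(s) e_k(t)$ reconstructs the covariance (the grid size and truncation level below are arbitrary choices):

```python
import numpy as np

# Karhunen-Loeve modes of the Wiener process on [0, 1]:
# lam_k = 1/((k - 1/2)^2 pi^2), e_k(t) = sqrt(2) sin((k - 1/2) pi t),
# covariance K(s, t) = min(s, t)
t = np.linspace(0.0, 1.0, 101)
K_exact = np.minimum.outer(t, t)

K_sum = np.zeros_like(K_exact)
for k in range(1, 2001):
    lam = 1.0 / ((k - 0.5) ** 2 * np.pi ** 2)
    e = np.sqrt(2.0) * np.sin((k - 0.5) * np.pi * t)
    K_sum += lam * np.outer(e, e)

# Truncated mode sum reproduces the covariance kernel
assert np.max(np.abs(K_sum - K_exact)) < 1e-3
```

The rapid decay $\lambda_k = O(k^{-2})$ is the stochastic analogue of the quickly decreasing singular values exploited in the SVD-based reduction above.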