This document describes a non-sampling functional approximation method for linear and non-linear Bayesian updates. It begins by introducing the Lorenz-63 system as an example problem for applying linear and non-linear Bayesian updates. It then provides the mathematical framework for Bayesian updates using conditional probabilities and expectations. The document outlines an approach for approximating the Bayesian update using polynomial chaos expansions in a functional space without sampling. It concludes by presenting results of applying the linear and non-linear Bayesian update approximations to the Lorenz-63 system.
Activity 2-unit 2-update 2024. English translation
Non-sampling functional approximation of linear and non-linear Bayesian Update
1. Non-sampling functional approximation of linear and
non-linear Bayesian Update
Alexander Litvinenko
H. G. Matthies
Institute of Scientific Computing, TU Braunschweig
Brunswick, Germany
wire@tu-bs.de
http://www.wire.tu-bs.de
2. 2
Lorenz-63 Problem
Lorenz 1963 is a system of ODEs. Has chaotic solutions for certain
parameter values and initial conditions. [LBU is done by O. Pajonk].
˙x = σ(y − x)
˙y = x(ρ − z) − y
˙z = xy − βz
Initial state q0(ω) = (x0(ω), y0(ω), z0(ω)) are uncertain.
Solving in t0, t1, ..., t10, Noisy Measur. → UPDATE, solving in t11, t12, ..., t20, Noisy
Measur. → UPDATE,...
IDEA of the Bayesian Update (BU):
Take qf(ω) = q0(ω).
Linear BU: qa = qf + K · (z − y)
Non-Linear BU: qa = qf + H1
· (z − y) + (z − y)T
· H2
· (z − y).
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
3. 3
Setting for the identification process
General idea:
We observe / measure a system, whose structure we know in principle.
The system behaviour depends on some quantities (parameters),
which we do not know ⇒ uncertainty.
We model (uncertainty in) our knowledge in a Bayesian setting:
as a probability distribution on the parameters.
We start with what we know a priori, then perform a measurement.
This gives new information, to update our knowledge (identification).
Update in probabilistic setting works with conditional probabilities
⇒ Bayes’s theorem.
Repeated measurements lead to better identification.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
4. 4
Mathematical setup
Consider
A(u; q) = f ⇒ u = S(f; q),
where S is solution operator.
Operator depends on parameters q ∈ Q,
hence state u ∈ U is also function of q:
Measurement operator Y with values in Y:
y = Y (q; u) = Y (q, S(f; q)).
Examples of measurements:
(ODE) u(t) = (x(t), y(t), z(t))T
, y(t) = (x(t), y(t))T
(PDE) y(ω) = D0
u(ω, x)dx, y(ω) = D0
| grad u(ω, x)|2
dx, u in few points
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
5. 5
Inverse problem
For given f, measurement y is just a function of q.
This function is usually not invertible ⇒ ill-posed problem,
measurement y does not contain enough information.
In Bayesian framework state of knowledge modelled in a probabilistic way,
parameters q are uncertain, and assumed as random.
Bayesian setting allows updating / sharpening of information
about q when measurement is performed.
The problem of updating distribution—state of knowledge of q
becomes well-posed.
Can be applied successively, each new measurement y and
forcing f —may also be uncertain—will provide new information.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
6. 6
Conditional probability and expectation
With state u ∈ U = U ⊗ S a RV, the quantity to be measured
y(ω) = Y (q(ω), u(ω))) ∈ Y := Y ⊗ S
is also uncertain, a random variable.
A new measurement z is performed, composed from the
“true” value y ∈ Y and a random error : z(ω) = y + (ω) ∈ Y .
Classically, Bayes’s theorem gives conditional probability
P(Iq|Mz) =
P(Mz|Iq)
P(Mz)
P(Iq);
expectation with this posterior measure is conditional expectation. Kolmogorov starts
from conditional expectation E (·|Mz),
from this conditional probability via P(Iq|Mz) = E χIq|Mz .
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
7. 7
Conditional expectation
The conditional expectation is defined as
orthogonal projection onto the closed subspace L2(Ω, P, σ(z)):
E(q|σ(z)) := PQ∞q = argmin˜q∈L2(Ω,P,σ(z)) q − ˜q 2
L2
The subspace Q∞ := L2(Ω, P, σ(z)) represents the available information, estimate
minimises Φ(·) := q − (·) 2
over Q∞.
The update, also called the assimilated value qa(ω) := PQ∞q = E(q|σ(z)), is a
Q-valued RV
and represents new state of knowledge after the measurement.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
8. 8
Approximation
Minimisation equivalent to orthogonality: find φ ∈ L0(Y, Q)
∀v ∈ Q∞ : DqaΦ(qa(φ)), v L2 = q − qa, v L2 = 0,
⇔ DφΦ := DqaΦ ◦ Dφqa = 0 with qa(φ) := φ(z).
Approximation of Q∞: take Qn ⊂ Q∞
Qn := {ϕ ∈ Q : ϕ = ψn ◦ Y, ψn a nth
degree polynomial}
i.e. ϕ = H0
+ H1
Y + · · · + Hk
Y ⊗k
+ · · · + Hn
Y ⊗n
,
where Hk
∈ L k
s (Y, Q) is symmetric and k-linear.
With qa(φ) = qa(( H0
, . . . , Hk
, . . . , Hn
)) =
n
k=0 Hk
z⊗k
= PQnq,
orthogonality implies
∀ = 0, . . . , n : D( H)Φ(qa( H0
, . . . , Hk
, . . . , Hn
)) = 0
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
9. 9
Approximation
Φ(qa) = q − qa
2
= q − ( H0
+ H1
Y + · · · + Hn
Y ⊗n
) 2
=
q − ( H0
+ H1
Y + · · · + Hn
Y ⊗n
), q − ( H0
+ H1
Y + · · · + Hn
Y ⊗n
) .
D( H0 )Φ(qa(...)) : q − ( H0
+ H1
Y + · · · + Hn
Y ⊗n
), 1 = 0.
H0
+ H1
Y + · · · + Hn
Y ⊗n
= q .
D( H1 )Φ(qa(...)) : q − ( H0
+ H1
Y + · · · + Hn
Y ⊗n
), Y = 0.
H0
Y + H1
Y, Y + · · · Hn
Y ⊗n+1
= q, Y .
D( Hn )Φ(qa(...)) : q − ( H0
+ H1
Y + · · · + Hn
Y ⊗n
), Y ⊗n
= 0.
H0
Y ⊗n
+ H1
Y ⊗n+1
+ · · · Hn
Y ⊗2n
= q, Y ⊗n
.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
10. 10
Bayesian update in components
For finite dimensional spaces, or after discretisation,
in components:
let q = (qm
), Y = (Y
), and Hk
= ( Hk m
1...k
), then:
∀ = 0, . . . , n;
Y 1 · · · Y
( H0 m
) + · · · + Y 1 · · · Y +k ( Hk m
+1... +k
)+
· · · + Y 1 · · · Y +n ( Hn m
+1... +n
) = qm
Y 1 · · · Y
.
matrix does not depend on m—is identically block diagonal.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
11. 11
Ingredients of the quadratic Bayessian Update
H0
+ H1
Y + H2
Y ⊗2
= q ,
H0
Y + H1
Y ⊗2
+ H2
Y ⊗3
= q ⊗ Y
H0
Y ⊗2
+ H1
Y ⊗3
+ H2
Y ⊗4
= q ⊗ Y ⊗2
where Y = (x(ω), y(ω), z(ω))T
expanded in PCE, i.e. Y ∈ R3×|J |
,
Y ⊗2
− Y 2
is the covariance matrix,
Y ⊗4
is determ. tensor of size 3 × 3 × 3 × 3.
1 Y Y ⊗2
Y Y ⊗2
Y ⊗3
Y ⊗2
Y ⊗3
Y ⊗4
H0
H1
H2
=
q
q ⊗ Y
q ⊗ Y ⊗2
the linear system is symmetric positive definite.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
12. 12
Numerical solution of the system
Lorenz63: q = (x, y, z), Y = (x, y, z) + ε.
For x (for y, z the same) the system above is (1 + 4 + 9) × (1 + 4 + 9) = 13 × 13 large.
But there are linear dependences (e. g. H2
12 = H2
21)!
If remove linear dependent rows and columns, obtain 10 × 10 system.
Solution is ( H0 i
, H1 i
1, H1 i
2, H1 i
3, H2 i
11, ..., H2 i
33), i = 1, 2, 3.
If ε = 0, then q = Y and system can be solved analytically
1 q q⊗2
q q⊗2
q⊗3
q⊗2
q⊗3
q⊗4
H0
H1
H2
=
q
q ⊗ q
q ⊗ q⊗2
And H0
= 0, H1
= I, H2
= 0.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
13. 13
Polynomial Chaos Expansion
Let Y (x, θ), θ = (θ1, ..., θM, ...), is approximated:
Y (x, θ) =
β∈Jm,p
Hβ(θ)Yβ(x),
Yβ(x) =
1
β! Θ
Hβ(θ)Y (x, θ) P(dθ).
In Lorenz63 model matrix of PCE coefficients is Y ∈ R3×|Jm,p|
, |Jm,p| = (m+p)!
m!p! .
E.g. multiindex for Y ⊗ Y will be Jm,2p and for Y ⊗4
will be Jm,4p.
If (Y − Z) is needed, for uncorrelated Y (Jm1,p) and Z (Jm2,p), then must build
Jm1+m2,p.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
14. 14
MAIN RESULT 1: Special cases
For n = 0 (constant functions) ⇒ qa = H0
= q (= E (q)).
For n = 1 the approximation is with affine functions
H0
+ H1
z = q
H0
z + H1
z ⊗ z = q ⊗ z
=⇒ (remember that [covqz] = q ⊗ z − q ⊗ z )
H0
= q − H1
z
H1
( z ⊗ z − z ⊗ z ) = q ⊗ z − q ⊗ z
⇒ H1
= [covqz][covzz]−1
(Kalman’s solution),
H0
= q − [covqz][covzz]−1
z ,
and finally
qa = H0
+ H1
z = q + [covqz][covzz]−1
(z − z ).
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
15. 15
Measurement error
Let ε ∈ N(0, σ2
I) be the Gaussian noise.
Since RVs Y and ε are uncorrelated, obtain
Y + ε = Y + ε = Y
(Y + ε)⊗2
= Y ⊗2
+ 2 Y ε + ε⊗2
= Y ⊗2
+ σ2
I
(Y + ε)⊗3
= Y ⊗3
+ 3 Y ε⊗2
(Y + ε)⊗4
= Y ⊗4
+ 6 Y ⊗2
ε⊗2
+ ε⊗4
(since ε⊗(2k+1)
= 0).
One should be accurate in computing the RHS q ⊗ Y , q ⊗ Y ⊗2
, ... since RVs q and
Y belong to different overlapping spaces.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
16. 16
MAIN RESULT 2: Final Step of Non-linear BU
After H0
, H1
, H2
are computed, evaluate
qa = H0
+ H1
(Y − ˆY ) + H2
(Y − ˆY )⊗2
,
where ˆY are the noisy measurements.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
17. 17
Numerics
Abbildung 1: One update in the linearized Lorenz63.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
18. 18
Abbildung 2: Linear (left) and non-linear (right) Bayesian updates in
Lorenz63.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
19. 19
Conclusion about NLBU
• + Is functional, sampling free approximation.
• + Give a hope to be used for solving inverse problems.
• - Stochastic dimension grows up very fast.
• - Is very “expensive” in computing and requires efficient tensor arithmetics.
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing
20. 20
Literature
1. O. Pajonk, B. V. Rosic, A. Litvinenko, and H. G. Matthies, A Deterministic Filter
for Non-Gaussian Bayesian Estimation, Physica D: Nonlinear Phenomena, Vol.
241(7), pp. 775-788, 2012.
2. B. V. Rosic, A. Litvinenko, O. Pajonk and H. G. Matthies, Sampling Free Linear
Bayesian Update of Polynomial Chaos Representations, J. of Comput. Physics,
Vol. 231(17), 2012 , pp 5761-5787
TU Braunschweig Institute of Scientific Computing
CC
Scientifi omputing