1. Bayesian Networks
Unit 6 Exact Inference
in Bayesian Networks
Wang, Yuan-Kai, 王元凱
ykwang@mails.fju.edu.tw
http://www.ykwang.tw
Department of Electrical Engineering, Fu Jen Univ.
輔仁大學電機工程系
2006~2011
Reference this document as:
Wang, Yuan-Kai, “Exact Inference in Bayesian Networks,”
Lecture Notes of Wang, Yuan-Kai, Fu Jen University, Taiwan, 2011.
Fu Jen University Department of Electrical Engineering Wang, Yuan-Kai Copyright
2. Bayesian Networks Unit - Exact Inference in BN p. 2
Goal of This Unit
• Learn to efficiently compute the sum-product of the inference formula
$P(X \mid E=e) \propto \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
– Remember: enumeration and multiplication of all P(Xi|Pa(Xi)) are not efficient
– We will learn three other methods for exact inference
3. Bayesian Networks Unit - Exact Inference in BN p. 3
Related Units
• Background
– Probabilistic graphical model
• Next units
– Approximate inference algorithms
– Probabilistic inference over time
4. Bayesian Networks Unit - Exact Inference in BN p. 4
Self-Study References
• S. Russell & P. Norvig, Artificial Intelligence: A Modern Approach, 2nd ed., Chapter 14, Prentice Hall, 2003.
• S. M. Aji and R. J. McEliece, The generalized distributive law, IEEE Trans. on Information Theory, vol. 46, no. 2, 2000.
• B. D’Ambrosio, Inference in Bayesian networks, AI Magazine, 1999.
• M. I. Jordan & Y. Weiss, Probabilistic inference in graphical models.
5. Bayesian Networks Unit - Exact Inference in BN p. 5
Structure of Related Lecture Notes
[Figure: overview of the lecture notes, drawn around the example network B, E → A → J, M with CPTs P(B), P(E), P(A|B,E), P(J|A), P(M|A)]
• Problem → PGM representation
– Unit 5: BN
– Unit 9: Hybrid BN
– Units 10~15: Naïve Bayes, MRF, HMM, DBN, Kalman filter
• Data → Learning (structure learning, parameter learning)
– Units 16~: MLE, EM
• Query → Inference
– Unit 6: Exact inference
– Unit 7: Approximate inference
– Unit 8: Temporal inference
6. Bayesian Networks Unit - Exact Inference in BN p. 6
Contents
1. Basics of Graph (p. 11)
2. Sum-Product and Generalized Distributive Law (p. 20)
3. Variable Elimination (p. 29)
4. Belief Propagation (p. 96)
5. Junction Tree (p. 157)
6. Summary (p. 212)
7. Implementation (p. 214)
8. Reference (p. 215)
7. Bayesian Networks Unit - Exact Inference in BN p. 7
Four Steps of Inference P(X|e)
• Step 1: Bayes' theorem
$P(X \mid E=e) = \frac{P(X, E=e)}{P(E=e)} \propto P(X, E=e)$
• Step 2: Marginalization
$P(X, E=e) = \sum_{h \in H} P(X, E=e, H=h)$
• Step 3: Conditional independence
$P(X, E=e) = \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
• Step 4: Sum-product computation
– Exact inference
– Approximate inference
8. Bayesian Networks Unit - Exact Inference in BN p. 8
Five Types of Queries in Inference
• For a probabilistic graphical model G
• Given a set of evidence E=e
• Query the PGM with
– $P(e)$: Likelihood query
– $\arg\max P(e)$: Maximum likelihood query
– $P(X \mid e)$: Posterior belief query
– $\arg\max_x P(X=x \mid e)$: Maximum a posteriori (MAP) query (single query variable)
– $\arg\max_{x_1, \ldots, x_k} P(X_1=x_1, \ldots, X_k=x_k \mid e)$: Most probable explanation (MPE) query
9. Bayesian Networks Unit - Exact Inference in BN p. 9
Brute Force Enumeration
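• We can compute the sum-product of the inference formula by enumeration in $O(K^N)$ time, where $K = |X_i|$ is the number of values per variable and $N$ is the number of variables
[Figure: burglary network, B and E → A → J and M]
• By using a BN, we can represent the joint distribution in O(N) space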
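To make the cost of brute-force enumeration concrete, the following is a minimal Python sketch for the burglary network. The CPT numbers are the ones listed on the later "An Example with Evidence" slide; the helper names (`bernoulli`, `joint`, `posterior_burglary`) are illustrative assumptions, not part of the lecture notes.

```python
from itertools import product

# CPTs of the burglary network, copied from the "An Example with Evidence" slide.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.95,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                      # P(M=true | A)


def bernoulli(p_true, value):
    """Probability of `value` for a Boolean variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Full joint P(b, e, a, j, m) as the product of the five CPT entries."""
    return (P_B[b] * P_E[e] * bernoulli(P_A[(b, e)], a)
            * bernoulli(P_J[a], j) * bernoulli(P_M[a], m))


def posterior_burglary(j, m):
    """P(B | j, m): sum the joint over the hidden variables E and A, then normalize."""
    unnormalized = {
        b: sum(joint(b, e, a, j, m) for e, a in product([True, False], repeat=2))
        for b in (True, False)
    }
    z = sum(unnormalized.values())
    return {b: p / z for b, p in unnormalized.items()}


posterior = posterior_burglary(j=True, m=True)
```

Every hidden assignment is visited and every product is recomputed from scratch, which is exactly the $O(K^N)$ behaviour that the remaining sections try to avoid.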
10. Bayesian Networks Unit - Exact Inference in BN p. 10
Expression Tree of Enumeration :
Repeated Computations
• $P(b \mid j, m) \propto \sum_{E} \sum_{A} P(b)\, P(E)\, P(A \mid b, E)\, P(j \mid A)\, P(m \mid A)$
[Figure: expression tree of the enumeration. The root sum branches on E = e and E = ¬e, each branch sums over A = a and A = ¬a, and each path multiplies P(b), P(e), P(a|b,e), P(j|a), P(m|a); the same products are recomputed in several branches.]
11. Bayesian Networks Unit - Exact Inference in BN p. 11
1. Basics of Graph
• Polytree
• Multiply connected networks
• Clique
• Markov network
• Chordal graph
• Induced width
12. Bayesian Networks Unit - Exact Inference in BN p. 12
Two Kinds of PGMs
• There are two kinds of
probabilistic graphical models
(PGMs)
– Singly connected network
• Polytree
– Multiply connected network
13. Bayesian Networks Unit - Exact Inference in BN p. 13
Singly Connected Networks (Polytree)
• Any two nodes are connected by at most one undirected path
• Theorem: inference in a polytree is linear in the size of the network
– This assumes tabular CPT representation
[Figure: the burglary network (Burglary, Earthquake → Alarm → John Calls, Mary Calls) and a second polytree over nodes A to H]
14. Bayesian Networks Unit - Exact Inference in BN p. 14
Multiply Connected Networks
• At least two nodes are connected by more than one undirected path
[Figure: Cloudy → Sprinkler and Rain → Wet Grass]
15. Bayesian Networks Unit - Exact Inference in BN p. 15
Clique (1/2)
• A clique is a subgraph of an undirected graph that is complete and maximal
– Complete:
• Fully connected
• Every node connects to every other node
– Maximal:
• Not contained in any larger complete subgraph
16. Bayesian Networks Unit - Exact Inference in BN p. 16
Clique (2/2)
• Identify cliques
[Figure: undirected graph over nodes A, B, C, D, E, F, G, H]
• Cliques: ABD, ADE, ACE, CEG, DEF, EGH
17. Bayesian Networks Unit - Exact Inference in BN p. 17
Markov Network (1/2)
• An undirected graph with
– Hyper-nodes (multi-vertex nodes)
– Hyper-edges (multi-vertex edges)
[Figure: Markov network with hyper-nodes ABD, ADE, ACE, CEG, DEF, EGH]
18. Bayesian Networks Unit - Exact Inference in BN p. 18
Markov Network (2/2)
• Every hyper-edge $e = (x_1 \ldots x_k)$ has a potential function $f_e(x_1 \ldots x_k)$
• The probability distribution is
$P(X_1, \ldots, X_n) = Z \prod_{e \in E} f_e(x_{e1}, \ldots, x_{ek})$
$Z = 1 \Big/ \sum_{x_1} \cdots \sum_{x_n} \prod_{e \in E} f_e(x_{e1}, \ldots, x_{ek})$
• Example for the hyper-nodes EGH and CEG: $P(EGH, CEG) = Z \prod_{e \in E} f_e(E, G, H, C)$
19. Bayesian Networks Unit - Exact Inference in BN p. 19
Chordal Graphs
• Elimination ordering → undirected chordal graph
[Figure: the Asia network (V, S, T, L, A, B, X, D) and the chordal graph obtained from it]
Graph:
• Maximal cliques are factors in elimination
• Factors in elimination are cliques in the graph
• Complexity is exponential in the size of the largest clique in the graph
20. Bayesian Networks Unit - Exact Inference in BN p. 20
2. Sum-Product and
Generalized Distributive Law
$P(X \mid E=e) \propto \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
We obtain the formula from two rules of probability theory:
Sum rule: $P(x) = \sum_{y} P(x, y)$
Product rule: $P(x, y) = P(x \mid y)\, P(y)$
21. Bayesian Networks Unit - Exact Inference in BN p. 21
The Sum-Product with
Generalized Distributive Law
$P(X \mid E=e) \propto \sum_{h \in H} \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
$= \sum_{X_k} \cdots \sum_{X_1} \prod_{i=1}^{k} P(X_i \mid Pa(X_i))$
$= \sum_{X_k} \cdots \sum_{X_1} P(X_1 \mid Pa(X_1)) \cdots P(X_k \mid Pa(X_k))$
$= \sum_{X_k} P(X_k \mid Pa(X_k)) \sum_{X_{k-1}} P(X_t \mid X_k, \ldots) \cdots \sum_{X_1} P(X_1 \mid Pa(X_1))\, P(X_u \mid X_1, \ldots)$
22. Bayesian Networks Unit - Exact Inference in BN p. 22
Distributive Law for Sum-Product
(1/3)
• $a x_1 + a x_2 = a (x_1 + x_2)$, and in general $\sum_i a x_i = a \sum_i x_i$
• $\sum_i \sum_j x_i x_j = \left(\sum_i x_i\right)\left(\sum_j x_j\right)$
• $\sum_i \sum_j P(x_i)\, P(x_j) = \sum_i P(x_i) \sum_j P(x_j)$
– $\sum_i P(x_i \mid x_h) = \sum_i \frac{P(x_i, x_h)}{P(x_h)} = \frac{\sum_i P(x_i, x_h)}{P(x_h)}$: variable $i$ is eliminated
• $\sum_i \sum_j P(x_i \mid x_h)\, P(x_j \mid x_k) = \left(\sum_i P(x_i \mid x_h)\right)\left(\sum_j P(x_j \mid x_k)\right) = f_1(x_h)\, f_2(x_k)$
23. Bayesian Networks Unit - Exact Inference in BN p. 23
Distributive Law for Sum-Product
(2/3)
• $\sum_i \sum_j P(x_i \mid x_h)\, P(x_j \mid x_k) = \left(\sum_i P(x_i \mid x_h)\right)\left(\sum_j P(x_j \mid x_k)\right) = f_1(x_h)\, f_2(x_k)$
• $\sum_i \sum_j P(x_i \mid x_k)\, P(x_j \mid x_i) = \sum_i P(x_i \mid x_k) \left(\sum_j P(x_j \mid x_i)\right) = \sum_i P(x_i \mid x_k)\, f(x_i) = f(x_k)$
• $P(b \mid j, m) \propto \sum_e \sum_a P(b)\, P(e)\, P(a \mid b, e)\, P(j \mid a)\, P(m \mid a) = P(b) \sum_e P(e) \sum_a P(a \mid b, e)\, P(j \mid a)\, P(m \mid a)$
24. Bayesian Networks Unit - Exact Inference in BN p. 24
Distributive Law for Sum-Product
(3/3)
ab + ac = a(b+c)
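The saving behind ab + ac = a(b+c) can be checked numerically. This is a small Python sketch with made-up numbers (`xs`, `ys` and both function names are assumptions for illustration); it evaluates the same double sum term by term and in factored form and counts the multiplications.

```python
def naive_double_sum(xs, ys):
    """Evaluate sum_i sum_j x_i * y_j term by term; returns (value, #multiplications)."""
    total, mults = 0.0, 0
    for x in xs:
        for y in ys:
            total += x * y
            mults += 1
    return total, mults


def factored_double_sum(xs, ys):
    """Evaluate (sum_i x_i) * (sum_j y_j); a single multiplication suffices."""
    return sum(xs) * sum(ys), 1


xs = [0.2, 0.3, 0.5]
ys = [0.1, 0.4, 0.25, 0.25]
naive_value, naive_mults = naive_double_sum(xs, ys)
fast_value, fast_mults = factored_double_sum(xs, ys)
```

Both calls return the same value, but the naive form needs one multiplication per pair (i, j) while the factored form needs one in total.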
25. Bayesian Networks Unit - Exact Inference in BN p. 25
Distributive Law for Max-Product
• $\max(a x_1, a x_2) = a \max(x_1, x_2)$ and $\max_i a x_i = a \max_i x_i$ (for $a \ge 0$)
• $\max_i \max_j x_i x_j = \max_i x_i \, \max_j x_j$
• $\max_i \max_j P(x_i)\, P(x_j) = \max_i P(x_i) \, \max_j P(x_j)$
$\max_i \max_j P(x_i \mid x_k)\, P(x_j \mid x_k) = \max_i P(x_i \mid x_k) \, \max_j P(x_j \mid x_k)$
• $\arg\max_i P(x_i)$
26. Bayesian Networks Unit - Exact Inference in BN p. 26
Generalized Distributive Law (1/2)
Aji and McEliece, 2000
27. Bayesian Networks Unit - Exact Inference in BN p. 27
Generalized Distributive Law (2/2)
Aji and McEliece, 2000
Sum-product:
• a + 0 = 0 + a = a
• a * 1 = 1 * a = a
• a * b + a * c = a * (b + c)
Max-product (for non-negative a, b, c):
• max(a, 0) = max(0, a) = a
• a * 1 = 1 * a = a
• max(a * b, a * c) = a * max(b, c)
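Because both sets of identities have the same shape, one elimination routine can serve both query types by swapping the "addition" operator. The sketch below assumes a hypothetical two-node chain A → B with invented CPT numbers; `eliminate` and `message_to_b` are illustrative names only.

```python
from functools import reduce
import operator


def eliminate(values, combine):
    """Fold `combine` (the semiring's 'addition') over a list of values."""
    return reduce(combine, values)


# Hypothetical two-node chain A -> B with Boolean variables.
P_A = {True: 0.3, False: 0.7}
P_B_given_A = {(True, True): 0.9, (True, False): 0.2,    # key: (b, a)
               (False, True): 0.1, (False, False): 0.8}


def message_to_b(b, combine):
    """Combine P(a) * P(b | a) over a, using the chosen 'addition'."""
    terms = [P_A[a] * P_B_given_A[(b, a)] for a in (True, False)]
    return eliminate(terms, combine)


marginal_b = message_to_b(True, operator.add)   # sum-product: P(B = true)
best_joint_b = message_to_b(True, max)          # max-product: max_a P(a, B = true)
```

Only the operator passed as `combine` changes between the marginal and the maximization; the multiplication step is shared.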
28. Bayesian Networks Unit - Exact Inference in BN p. 28
Marginal to MAP : MAX Product
• Likelihood & posterior queries
[Figure: graph over x1, x2, x3, x4, x5]
• Maximum likelihood & MAP queries
29. Bayesian Networks Unit - Exact Inference in BN p. 29
3. Variable Elimination
• Variable elimination improves the
enumeration algorithm by
– Eliminating repeated calculations
• Carry out summations right-to-left
–Bottom-up in the evaluation tree
• Storing intermediate results (factors) to
avoid re-computation
– Dropping irrelevant variables
30. Bayesian Networks Unit - Exact Inference in BN p. 30
Basic Idea
• Write the query in the form
$P(X_n, e) = \sum_{x_k} \cdots \sum_{x_3} \sum_{x_2} \prod_i P(x_i \mid pa_i)$
• Iteratively
– Move all irrelevant terms (constants) outside the innermost summation:
$\sum_i a_i b c = b c \sum_i a_i$
– Perform the innermost sum, getting a new term (a factor)
– Insert the new term into the product
31. Bayesian Networks Unit - Exact Inference in BN p. 31
An Example without Evidence (1/2)
[Figure: Cloudy → Sprinkler and Rain → WetGrass]
P(C) = 0.5
C  P(S|C)
T  0.1
F  0.5
C  P(R|C)
T  0.8
F  0.2
S  R  P(W|S,R)
T  T  0.99
T  F  0.90
F  T  0.90
F  F  0.00
$P(w) = \sum_{r,s,c} P(w \mid r, s)\, P(r \mid c)\, P(s \mid c)\, P(c)$
$= \sum_{r,s} P(w \mid r, s) \sum_{c} P(r \mid c)\, P(s \mid c)\, P(c)$
$= \sum_{r,s} P(w \mid r, s)\, f_1(r, s)$, where $f_1(r, s)$ is a factor
32. Bayesian Networks Unit - Exact Inference in BN p. 32
An Example without Evidence (2/2)
R  S  C  P(R|C)  P(S|C)  P(C)  P(R|C) P(S|C) P(C)
T  T  T
T  T  F
T  F  T
T  F  F
F  T  T
F  T  F
F  F  T
F  F  F
Factor f1(r,s):
R  S  f1(R,S) = ∑c P(R|C) P(S|C) P(C)
T  T
T  F
F  T
F  F
A factor may be
• A function
• A value
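The empty cells of the two tables can be filled in mechanically. As a minimal sketch, the following Python snippet uses the CPT values of the sprinkler example from the previous slide; the dictionary layout and the helper `prob` are assumptions made for illustration.

```python
from itertools import product

BOOL = (True, False)
P_C = {True: 0.5, False: 0.5}
P_R = {True: 0.8, False: 0.2}          # P(R=true | C)
P_S = {True: 0.1, False: 0.5}          # P(S=true | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}   # P(W=true | S, R)


def prob(p_true, value):
    """Probability of `value` for a Boolean variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


# f1(r, s) = sum_c P(r | c) P(s | c) P(c): the factor left after eliminating C.
f1 = {(r, s): sum(prob(P_R[c], r) * prob(P_S[c], s) * P_C[c] for c in BOOL)
      for r, s in product(BOOL, repeat=2)}

# P(W=true) = sum_{r,s} P(w | s, r) f1(r, s)
p_wet = sum(P_W[(s, r)] * f1[(r, s)] for r, s in product(BOOL, repeat=2))
```

The factor `f1` is computed once and then reused for every (r, s) term of the outer sum, which is the whole point of storing intermediate factors.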
33. Bayesian Networks Unit - Exact Inference in BN p. 33
An Example with Evidence (1/2)
Factors
34. Bayesian Networks Unit - Exact Inference in BN p. 34
An Example with Evidence (2/2)
[Figure: burglary network, Burglary and Earthquake → Alarm → John Calls and Mary Calls]
P(B) = 0.001, P(E) = 0.002
B  E  P(A|B,E)
T  T  0.95
T  F  0.95
F  T  0.29
F  F  0.001
A  P(J|A)  P(M|A)
T  0.90    0.70
F  0.05    0.01
Factors for the evidence J = T, M = T:
• fM(a) = <0.70, 0.01>
• fJ(a) = <0.90, 0.05>
• fA(a,b,e)
• fĀJM(b,e) = ∑a fA(a,b,e) fJ(a) fM(a)
J  M  A  B  E  fM(a)  fJ(a)  fA(a,b,e)  fM(a) fJ(a) fA(a,b,e)
T  T  T  T  T  0.70   0.90   0.95       0.70*0.90*0.95
T  T  T  T  F  0.70   0.90   0.95       0.70*0.90*0.95
T  T  T  F  T  0.70   0.90   0.29       0.70*0.90*0.29
T  T  T  F  F  0.70   0.90   0.001      0.70*0.90*0.001
T  T  F  T  T  0.01   0.05   0.05       0.01*0.05*0.05
T  T  F  T  F  0.01   0.05   0.05       0.01*0.05*0.05
T  T  F  F  T  0.01   0.05   0.71       0.01*0.05*0.71
T  T  F  F  F  0.01   0.05   0.999      0.01*0.05*0.999
35. Bayesian Networks Unit - Exact Inference in BN p. 35
Basic Operations
• Summing out a variable from a
product of factors
– Move any irrelevant terms (constants)
outside the innermost summation
– Add up submatrices in pointwise
product of remaining factors
36. Bayesian Networks Unit - Exact Inference in BN p. 36
Variable Elimination Algorithm
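The body of this slide did not survive extraction, so the following is only a minimal Python sketch of how the two basic operations (pointwise product and summing out) might be combined into an elimination loop. It is not the author's pseudocode: the factor representation (a tuple of variable names plus a table keyed by Boolean assignments) and all function names are assumptions. The usage at the end reuses the sprinkler CPTs from the earlier example.

```python
from itertools import product

BOOL = (True, False)


def make_factor(variables, fn):
    """Tabulate fn(assignment dict) over all Boolean assignments of `variables`."""
    table = {vals: fn(dict(zip(variables, vals)))
             for vals in product(BOOL, repeat=len(variables))}
    return (tuple(variables), table)


def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    (fv, ft), (gv, gt) = f, g
    variables = fv + tuple(v for v in gv if v not in fv)

    def value(a):
        return ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return make_factor(variables, value)


def sum_out(var, f):
    """Sum the variable `var` out of factor f."""
    fv, ft = f
    rest = tuple(v for v in fv if v != var)

    def value(a):
        return sum(ft[tuple({**a, var: x}[v] for v in fv)] for x in BOOL)
    return make_factor(rest, value)


def variable_elimination(factors, order):
    """Eliminate the variables in `order`; return the product of what remains."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        factors = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors.append(sum_out(var, prod))   # store the new intermediate factor
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result


def p(p_true, value):
    """Probability of `value` for a Boolean variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}   # P(W=true | S, R)
factors = [
    make_factor(["C"], lambda a: 0.5),
    make_factor(["R", "C"], lambda a: p(0.8 if a["C"] else 0.2, a["R"])),
    make_factor(["S", "C"], lambda a: p(0.1 if a["C"] else 0.5, a["S"])),
    make_factor(["W", "S", "R"], lambda a: p(P_W[(a["S"], a["R"])], a["W"])),
]
variables, table = variable_elimination(factors, order=["C", "S", "R"])
```

The dictionary-based tables keep the sketch short; a practical implementation would more likely use multi-dimensional arrays and handle evidence and non-Boolean variables.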
37. Bayesian Networks Unit - Exact Inference in BN p. 37
Irrelevant Variables (1/2)
• Consider the query P(JohnCalls | Burglary = true)
– $P(J \mid b) = \alpha\, P(b) \sum_e P(e) \sum_a P(a \mid b, e)\, P(J \mid a) \sum_m P(m \mid a)$
– Sum over m is identically 1: $\sum_m P(m \mid a) = 1$
– M is irrelevant to the query
38. Bayesian Networks Unit - Exact Inference in BN p. 38
Irrelevant Variables (2/2)
• Theorem 1: for a query P(X|E), Y is irrelevant if $Y \notin \mathrm{Ancestors}(\{X\} \cup E)$
• In the example P(J|b)
– X = JohnCalls, E = {Burglary}
– $\mathrm{Ancestors}(\{X\} \cup E)$ = {Alarm, Earthquake}
– so MaryCalls is irrelevant
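Theorem 1 suggests a simple pruning step before elimination. The Python sketch below assumes the network is stored as a `PARENTS` dictionary (a representation chosen here for illustration); it collects the ancestors of the query and evidence variables and reports everything else as irrelevant.

```python
PARENTS = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}


def ancestors(nodes, parents):
    """Return every node reachable from `nodes` by following parent links."""
    seen, stack = set(), list(nodes)
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def irrelevant_variables(query, evidence, parents):
    """Variables that are neither query, evidence, nor an ancestor of either."""
    targets = set(query) | set(evidence)
    return set(parents) - targets - ancestors(targets, parents)


irrelevant = irrelevant_variables({"JohnCalls"}, {"Burglary"}, PARENTS)
```

For the query P(J|b) this leaves only MaryCalls in `irrelevant`, so its CPT can be dropped before any summation is carried out.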
39. Bayesian Networks Unit - Exact Inference in BN p. 39
Complexity
• Time and space cost of variable elimination are $O(d^k n)$
– n: number of random variables
– d: number of discrete values
– k: number of parent nodes (k is critical for the cost)
• Polytrees: k is small, linear complexity
– If k = 1, $O(dn)$
• Multiply connected networks:
– $O(d^k n)$, k is large
– Can reduce 3SAT to variable elimination
• NP-hard
– Equivalent to counting 3SAT models
• #P-complete, i.e. at least as hard as, and widely believed to be strictly harder than, NP-complete problems
40. Bayesian Networks Unit - Exact Inference in BN p. 40
Pros and Cons
• Variable elimination is simple and efficient for a single query P(Xi | e)
• But it is less efficient if all the variables are queried: P(X1 | e), …, P(Xk | e)
– In a polytree network, one would need to issue O(n) queries costing O(n) each: $O(n^2)$
• The junction tree algorithm extends variable elimination to compute posterior probabilities for all nodes simultaneously
41. Bayesian Networks Unit - Exact Inference in BN p. 41
3.1 An Example
• The Asia network
[Figure: Visit to Asia → Tuberculosis; Smoking → Lung Cancer and Bronchitis; Tuberculosis and Lung Cancer → Abnormality in Chest; Abnormality in Chest → X-Ray; Abnormality in Chest and Bronchitis → Dyspnea]
42. Bayesian Networks Unit - Exact Inference in BN p. 42
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We want to infer P(d)
• Need to eliminate: v, s, x, t, l, a, b
Initial factors
$P(v, s, t, l, a, b, x, d) = P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
“Brute force approach”
$P(d) = \sum_x \sum_b \sum_a \sum_l \sum_t \sum_s \sum_v P(v, s, t, l, a, b, x, d)$
Complexity is exponential: $O(K^N)$
• N: size of the graph, number of variables
• K: number of states for each variable
43. Bayesian Networks Unit - Exact Inference in BN p. 43
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We want to infer P(d)
• Need to eliminate: v, s, x, t, l, a, b
Initial factors
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
Eliminate: v
Compute: $f_v(t) = \sum_v P(v)\, P(t \mid v)$
$\Rightarrow f_v(t)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
t  fv(t)
T  0.70
F  0.01
Note: fv(t) = P(t). In general, the result of elimination is not necessarily a probability term.
44. Bayesian Networks Unit - Exact Inference in BN p. 44
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We want to infer P(d)
• Need to eliminate: s, x, t, l, a, b
• Initial factors
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
Eliminate: s
Compute: $f_s(b, l) = \sum_s P(s)\, P(b \mid s)\, P(l \mid s)$
$\Rightarrow f_v(t)\, f_s(b, l)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
b  l  fs(b,l)
T  T  0.95
T  F  0.95
F  T  0.29
F  F  0.001
• Summing on s results in fs(b,l), a factor with two arguments
• The result of elimination may be a function of several variables
45. Bayesian Networks Unit - Exact Inference in BN p. 45
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We want to infer P(d)
• Need to eliminate: x, t, l, a, b
• Initial factors
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, f_s(b, l)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
Eliminate: x
Compute: $f_x(a) = \sum_x P(x \mid a)$
$\Rightarrow f_v(t)\, f_s(b, l)\, f_x(a)\, P(a \mid t, l)\, P(d \mid a, b)$
Note: fx(a) = 1 for all values of a.
46. Bayesian Networks Unit - Exact Inference in BN p. 46
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We want to infer P(d)
• Need to eliminate: t, l, a, b
• Initial factors
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, f_s(b, l)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, f_s(b, l)\, f_x(a)\, P(a \mid t, l)\, P(d \mid a, b)$
Eliminate: t
Compute: $f_t(a, l) = \sum_t f_v(t)\, P(a \mid t, l)$
$\Rightarrow f_s(b, l)\, f_x(a)\, f_t(a, l)\, P(d \mid a, b)$
47. Bayesian Networks Unit - Exact Inference in BN p. 47
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We want to infer P(d)
• Need to eliminate: l, a, b
• Initial factors
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, f_s(b, l)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, f_s(b, l)\, f_x(a)\, P(a \mid t, l)\, P(d \mid a, b)$
$\Rightarrow f_s(b, l)\, f_x(a)\, f_t(a, l)\, P(d \mid a, b)$
Eliminate: l
Compute: $f_l(a, b) = \sum_l f_s(b, l)\, f_t(a, l)$
$\Rightarrow f_l(a, b)\, f_x(a)\, P(d \mid a, b)$
48. Bayesian Networks Unit - Exact Inference in BN p. 48
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We want to infer P(d)
• Need to eliminate: a, b
• Initial factors
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, f_s(b, l)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
$\Rightarrow f_v(t)\, f_s(b, l)\, f_x(a)\, P(a \mid t, l)\, P(d \mid a, b)$
$\Rightarrow f_s(b, l)\, f_x(a)\, f_t(a, l)\, P(d \mid a, b)$
$\Rightarrow f_l(a, b)\, f_x(a)\, P(d \mid a, b) \Rightarrow f_a(b, d) \Rightarrow f_b(d)$
Eliminate: a, b
Compute:
$f_a(b, d) = \sum_a f_l(a, b)\, f_x(a)\, P(d \mid a, b)$
$f_b(d) = \sum_b f_a(b, d)$
49. Bayesian Networks Unit - Exact Inference in BN p. 49
[Figure: Asia network over V, S, T, L, A, B, X, D]
• Different elimination ordering
• Need to eliminate: a, b, x, t, v, s, l
• Initial factors
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
Both orderings need n = 7 steps, but each step has a different computation size.
Intermediate factors (this ordering)   In previous order
g_a(l, t, d, b, x, s, v)                 f_v(v, s, x, t, l, a, b)
g_b(l, t, d, x, s, v)                    f_s(s, x, t, l, a, b)
g_x(l, t, d, s, v)                       f_x(x, t, l, a, b)
g_t(l, d, s, v)                          f_t(t, l, a, b)
g_v(l, d, s)                             f_l(l, a, b)
g_s(l, d)                                f_a(a, b)
g_l(d)                                   f_b(d)
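The effect of the ordering can be examined without any numbers by tracking only which variables each new factor depends on. The following Python sketch does this for the eight Asia factors; encoding each factor scope as a string of variable letters and the name `elimination_scopes` are assumptions made for illustration.

```python
def elimination_scopes(scopes, order):
    """Return, for each eliminated variable, the scope of the factor it produces."""
    scopes = [frozenset(s) for s in scopes]
    produced = []
    for var in order:
        touching = [s for s in scopes if var in s]
        scopes = [s for s in scopes if var not in s]
        new_scope = frozenset().union(*touching) - {var}
        produced.append((var, new_scope))
        scopes.append(new_scope)
    return produced


# One string per factor: P(v), P(s), P(t|v), P(l|s), P(b|s), P(a|t,l), P(x|a), P(d|a,b)
ASIA = ["v", "s", "tv", "ls", "bs", "atl", "xa", "dab"]

first = elimination_scopes(ASIA, "vsxtlab")    # ordering used in the worked example
second = elimination_scopes(ASIA, "abxtvsl")   # the alternative ordering
largest_first = max(len(scope) for _, scope in first)
largest_second = max(len(scope) for _, scope in second)
```

With the first ordering no intermediate factor has more than two arguments, while eliminating a first immediately creates a factor over five variables, so the second ordering is much more expensive even though both take seven steps.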
50. Bayesian Networks Unit - Exact Inference in BN p. 50
Short Summary
• Variable elimination is a sequence of
rewriting operations
• Computation depends on
– Number of variables n
• Each elimination step reduces one variable
• So we need n elimination steps
– Size of factors
• Affected by the order of elimination
• Discussed in sub-section 3.2
51. Bayesian Networks Unit - Exact Inference in BN p. 51
Dealing with Evidence (1/7)
[Figure: Asia network over V, S, T, L, A, B, X, D]
• How do we deal with evidence?
• Suppose we get evidence V = t, S = f, D = t
• We want to compute P(L, V = t, S = f, D = t)
52. Bayesian Networks Unit - Exact Inference in BN p. 52
Dealing with Evidence (2/7)
[Figure: Asia network over V, S, T, L, A, B, X, D]
• We start by writing the factors:
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
• Since we know that V = t, we don’t need to eliminate V
• Instead, we can replace the factors P(V) and P(T|V) with
$f_{P(V)} = P(V = t)$ and $f_{P(T|V)}(T) = P(T \mid V = t)$
• These “select” the appropriate parts of the original factors given the evidence
• Note that $f_{P(V)}$ is a constant, and thus does not appear in the elimination of other variables
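"Selecting" the part of a factor that agrees with the evidence amounts to keeping the matching table rows and dropping the observed variable. Below is a minimal Python sketch; the table layout, the function name `restrict` and the CPT numbers for P(T|V) are hypothetical and serve only as an illustration.

```python
def restrict(variables, table, var, value):
    """Fix `var` to `value`; return the reduced (variables, table) pair."""
    idx = variables.index(var)
    new_vars = variables[:idx] + variables[idx + 1:]
    new_table = {key[:idx] + key[idx + 1:]: p
                 for key, p in table.items() if key[idx] == value}
    return new_vars, new_table


# Hypothetical CPT P(T | V), stored as {(t, v): probability}.
cpt_vars = ("T", "V")
cpt = {(True, True): 0.05, (False, True): 0.95,
       (True, False): 0.01, (False, False): 0.99}

# Evidence V = t turns P(T | V) into a factor over T alone.
f_vars, f_table = restrict(cpt_vars, cpt, "V", True)
```

Applying the same restriction to P(V) leaves a table with no variables at all, which is the constant mentioned in the last bullet.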
53. Bayesian Networks Unit - Exact Inference in BN p. 53
Dealing with Evidence (3/7)
[Figure: Asia network over V, S, T, L, A, B, X, D]
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, P(x \mid a)\, f_{P(d|a,b)}(a, b)$
54. Bayesian Networks Unit - Exact Inference in BN p. 54
Dealing with Evidence (4/7)
[Figure: Asia network over V, S, T, L, A, B, X, D]
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, P(x \mid a)\, f_{P(d|a,b)}(a, b)$
• Eliminating x, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, f_x(a)\, f_{P(d|a,b)}(a, b)$
55. Bayesian Networks Unit - Exact Inference in BN p. 55
Dealing with Evidence (5/7)
[Figure: Asia network over V, S, T, L, A, B, X, D]
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, P(x \mid a)\, f_{P(d|a,b)}(a, b)$
• Eliminating x, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, f_x(a)\, f_{P(d|a,b)}(a, b)$
• Eliminating t, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, f_t(a, l)\, f_x(a)\, f_{P(d|a,b)}(a, b)$
56. Bayesian Networks Unit - Exact Inference in BN p. 56
Dealing with Evidence (6/7)
[Figure: Asia network over V, S, T, L, A, B, X, D]
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, P(x \mid a)\, f_{P(d|a,b)}(a, b)$
• Eliminating x, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, f_x(a)\, f_{P(d|a,b)}(a, b)$
• Eliminating t, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, f_t(a, l)\, f_x(a)\, f_{P(d|a,b)}(a, b)$
• Eliminating a, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, f_a(b, l)$
57. Bayesian Networks Unit - Exact Inference in BN p. 57
Dealing with Evidence (7/7)
[Figure: Asia network over V, S, T, L, A, B, X, D]
• Given evidence V = t, S = f, D = t
• Compute P(L, V = t, S = f, D = t)
• Initial factors, after setting evidence:
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, P(x \mid a)\, f_{P(d|a,b)}(a, b)$
• Eliminating x, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(t|v)}(t)\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, P(a \mid t, l)\, f_x(a)\, f_{P(d|a,b)}(a, b)$
• Eliminating t, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, f_t(a, l)\, f_x(a)\, f_{P(d|a,b)}(a, b)$
• Eliminating a, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_{P(b|s)}(b)\, f_a(b, l)$
• Eliminating b, we get
$f_{P(v)}\, f_{P(s)}\, f_{P(l|s)}(l)\, f_b(l)$
58. Bayesian Networks Unit - Exact Inference in BN p. 58
Complexity (1/2)
• Suppose in one elimination step we compute
$f_x(y_1, \ldots, y_k) = \sum_x f'_x(x, y_1, \ldots, y_k)$
$f'_x(x, y_1, \ldots, y_k) = \prod_{i=1}^{m} f_i(x, y_{i,1}, \ldots, y_{i,l_i})$
This requires (|X|: number of discrete values of X)
• $m \cdot |X| \cdot \prod_i |Y_i|$ multiplications
– For each value of x, y1, …, yk, we do m multiplications
• $|X| \cdot \prod_i |Y_i|$ additions
– For each value of y1, …, yk, we do |X| additions
59. Bayesian Networks Unit - Exact Inference in BN p. 59
Complexity (2/2)
• One elimination step requires
– $m \cdot |X| \cdot \prod_i |Y_i|$ multiplications
– $|X| \cdot \prod_i |Y_i|$ additions
– $O(|X| \cdot \prod_i |Y_i|)$, m is a constant (neglected)
– Or $O(d^k)$ if
• |X| = |Yi| = d,
• k: number of parent nodes
• Time and space cost are $O(d^k n)$: complexity is exponential in the number of variables k
– n: number of random variables
– d: number of discrete values
– k: number of parent nodes
60. Bayesian Networks Unit - Exact Inference in BN p. 60
3.2 Order of Elimination
• How to select “good” elimination
orderings in order to reduce complexity
1. Start by understanding variable
elimination via the graph we are working
with
2. Then reduce the problem of finding good
ordering to graph-theoretic operation that
is well-understood
61. Bayesian Networks Unit - Exact Inference in BN p. 61
Undirected Graph Conversion (1/2)
• At each stage of variable elimination, we have an algebraic term that we need to evaluate
• This term is of the form
$P(x_1, \ldots, x_k) = \sum_{y_1} \cdots \sum_{y_n} \prod_i f_i(Z_i)$
where Zi are sets of variables
62. Bayesian Networks Unit - Exact Inference in BN p. 62
Undirected Graph Conversion (2/2)
• Draw a graph in which
– There is an undirected edge X--Y whenever X and Y are arguments of some factor
• That is, whenever X and Y are both in some Zi
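The conversion rule can be applied directly to the list of factor scopes. The following Python sketch builds the undirected edge set for the eight initial Asia factors; writing each scope as a string of variable letters and the name `interaction_graph` are assumptions made for illustration.

```python
from itertools import combinations


def interaction_graph(scopes):
    """Undirected edges X--Y for every pair of variables that share a factor."""
    edges = set()
    for scope in scopes:
        for x, y in combinations(sorted(scope), 2):
            edges.add((x, y))
    return edges


# Scopes of P(v), P(s), P(t|v), P(l|s), P(b|s), P(a|t,l), P(x|a), P(d|a,b)
ASIA_SCOPES = ["V", "S", "TV", "LS", "BS", "ATL", "XA", "DAB"]
edges = interaction_graph(ASIA_SCOPES)
```

Besides the eight parent-child edges of the network, the result contains L--T and A--B, the edges between parents that share a child.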
63. Bayesian Networks Unit - Exact Inference in BN p. 63
Example
• Consider the “Asia” example
• The initial factors are
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
• The undirected graph is
[Figure: the Asia network and the corresponding undirected graph over V, S, T, L, A, B, X, D]
• In the first step this graph is just the moralized graph
64. Bayesian Networks Unit - Exact Inference in BN p. 64
Variable Elimination
Change of Graph
$P(v)\, P(s)\, P(t \mid v)\, P(l \mid s)\, P(b \mid s)\, P(a \mid t, l)\, P(x \mid a)\, P(d \mid a, b)$
• Now we eliminate t, getting
$P(v)\, P(s)\, P(l \mid s)\, P(b \mid s)\, P(x \mid a)\, P(d \mid a, b)\, f_t(v, a, l)$
• The corresponding change in the graph is that nodes V, L, A become a clique
[Figure: the undirected graph before and after eliminating T]
65. Bayesian Networks Unit - Exact Inference in BN p. 65
Example (1/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
[Figure: the Asia network and its moralized graph]
66. Bayesian Networks Unit - Exact Inference in BN p. 66
Example (2/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
[Figure: the Asia network and the undirected graph after setting evidence]
67. Bayesian Networks Unit - Exact Inference in BN p. 67
Example (3/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
– New factor fx(A)
[Figure: the Asia network and the undirected graph after eliminating X]
68. Bayesian Networks Unit - Exact Inference in BN p. 68
Example (4/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
• Eliminating a
– New factor fa(b,t,l): a clique in the reduced undirected graph
[Figure: the Asia network and the undirected graph after eliminating A]
69. Bayesian Networks Unit - Exact Inference in BN p. 69
Example (5/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
• Eliminating a
• Eliminating b
– New factor fb(t,l): a clique in the reduced undirected graph
[Figure: the Asia network and the undirected graph after eliminating B]
70. Bayesian Networks Unit - Exact Inference in BN p. 70
Example (6/6)
• Want to compute P(L, V=t, S=f, D=t)
• Moralizing
• Setting evidence
• Eliminating x
• Eliminating a
• Eliminating b
• Eliminating t
– New factor ft(l)
[Figure: the Asia network and the undirected graph after eliminating T]
71. Bayesian Networks Unit - Exact Inference in BN p. 71
Elimination and Clique (1/2)
• We can eliminate a variable X by
1. For all Y, Z such that Y--X and Z--X
• add an edge Y--Z
2. Remove X and all edges adjacent to it
• This procedure creates a clique that contains all the neighbors of X
• After step 1 we have a clique that corresponds to the intermediate factor (before marginalization)
• The cost of the step is exponential in the size of this clique: the $d^k$ in $O(n d^k)$
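The two-step procedure translates almost literally into code. This Python sketch assumes the graph is held as a dictionary of neighbour sets and uses the moralized Asia graph as input; the name `eliminate_node` and the returned pair (fill-in edges, clique size) are illustrative choices.

```python
def eliminate_node(adjacency, x):
    """Connect all neighbours of x pairwise, then remove x.

    Returns the set of added (fill-in) edges and the size of the clique
    formed by x together with its neighbours.
    """
    neighbours = adjacency[x]
    fill = set()
    for y in neighbours:
        for z in neighbours:
            if y < z and z not in adjacency[y]:
                fill.add((y, z))
    for y, z in fill:                 # step 1: add the missing edges
        adjacency[y].add(z)
        adjacency[z].add(y)
    for y in neighbours:              # step 2: remove x and its edges
        adjacency[y].discard(x)
    del adjacency[x]
    return fill, len(neighbours) + 1


# Moralized Asia graph (parent-child edges plus L--T and A--B).
EDGES = [("T", "V"), ("L", "S"), ("B", "S"), ("A", "T"), ("A", "L"),
         ("L", "T"), ("A", "X"), ("A", "D"), ("B", "D"), ("A", "B")]
adjacency = {}
for u, v in EDGES:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

fill_edges, clique_size = eliminate_node(adjacency, "T")
```

Eliminating T first joins V to both A and L, which matches the earlier observation that V, L, A become a clique once t is summed out.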
72. Bayesian Networks Unit - Exact Inference in BN p. 72
Elimination and Clique (2/2)
• The process of eliminating nodes from
an undirected graph gives us a clue to
the complexity of inference
• To see this, we will examine the graph
that contains all of the edges we added
during the elimination
• The resulting graph is always chordal
73. Bayesian Networks Unit - Exact Inference in BN p. 73
Example (1/7)
• Want to compute P(D)
• Moralizing
[Figure: the Asia network and its moralized graph]
74. Bayesian Networks Unit - Exact Inference in BN p. 74
Example (2/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
– Multiply to get f’v(v,t)
– Result fv(t)
[Figure: the Asia network and the undirected graph after eliminating V]
75. Bayesian Networks Unit - Exact Inference in BN p. 75
Example (3/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
– Multiply to get f’x(a,x)
– Result fx(a)
[Figure: the Asia network and the undirected graph after eliminating X]
76. Bayesian Networks Unit - Exact Inference in BN p. 76
Example (4/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
– Multiply to get f’s(l,b,s)
– Result fs(l,b)
[Figure: the Asia network and the undirected graph after eliminating S]
77. Bayesian Networks Unit - Exact Inference in BN p. 77
Example (5/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
• Eliminating t
– Multiply to get f’t(a,l,t)
– Result ft(a,l)
[Figure: network over nodes V, S, T, L, A, B, X, D, and the graph after this elimination step]
78. Bayesian Networks Unit - Exact Inference in BN p. 78
Example (6/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
• Eliminating t
• Eliminating l
– Multiply to get f’l(a,b,l)
– Result fl(a,b)
[Figure: network over nodes V, S, T, L, A, B, X, D, and the graph after this elimination step]
79. Bayesian Networks Unit - Exact Inference in BN p. 79
Example (7/7)
• Want to compute P(D)
• Moralizing
• Eliminating v
• Eliminating x
• Eliminating s
• Eliminating t
• Eliminating l
• Eliminating a, b
– Multiply to get f’a(a,b,d)
– Result f(d)
[Figure: network over nodes V, S, T, L, A, B, X, D, and the graph after this elimination step]
80. Bayesian Networks Unit - Exact Inference in BN p. 80
Induced Graphs
• The resulting graph is the induced
graph (for this particular ordering)
[Figure: induced graph over nodes V, S, T, L, A, B, X, D]
• Main property:
– Every maximal clique in the induced graph
corresponds to an intermediate factor in the
computation
– Every factor stored during the process is a subset of
some maximal clique in the graph
• These facts are true for any variable
elimination ordering on any network
81. Bayesian Networks Unit - Exact Inference in BN p. 81
Induced Width (Treewidth)
• The size k of the largest clique in the
induced graph is
– An indicator for the complexity of variable
elimination
• w = k-1 is called
– The induced width of the graph
– According to the specified ordering
• The treewidth of the graph is the minimal
induced width over all orderings
• Finding a good ordering for a graph is
equivalent to finding an ordering that attains
the minimal induced width of the graph
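The elimination procedure of the preceding example can be replayed on the graph alone to read off the induced width of an ordering. The sketch below is only illustrative: the helper name induced_width is hypothetical, and the edge list is an assumption (the standard "Asia" structure plus the moral edges T-L and A-B), chosen because it is consistent with the factor scopes shown in the example; the slides themselves do not list the edges.

```python
from itertools import combinations

def induced_width(edges, order):
    """Eliminate nodes in `order`; return (induced width, fill-in edges added)."""
    # Build an adjacency map for the undirected (moralized) graph.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    fill_in, largest_clique = set(), 0
    for x in order:
        nbrs = adj.pop(x)
        # x plus its remaining neighbours is the clique of the intermediate factor.
        largest_clique = max(largest_clique, len(nbrs) + 1)
        # Connect all remaining neighbours of x to each other (fill-in edges).
        for u, v in combinations(sorted(nbrs), 2):
            if v not in adj[u]:
                adj[u].add(v)
                adj[v].add(u)
                fill_in.add((u, v))
        for n in nbrs:
            adj[n].discard(x)
    return largest_clique - 1, fill_in

# ASSUMED moralized graph for the 8-node example (not given in the slides).
moral_edges = [("V", "T"), ("S", "L"), ("S", "B"), ("T", "A"), ("L", "A"),
               ("A", "X"), ("A", "D"), ("B", "D"), ("T", "L"), ("A", "B")]
width, fill = induced_width(moral_edges, ["V", "X", "S", "T", "L", "A", "B"])
print(width, fill)
```

Under these assumed edges, the ordering v, x, s, t, l, a, b produces cliques of at most three nodes, matching the three-variable factors f’s(l,b,s), f’t(a,l,t), f’l(a,b,l) and f’a(a,b,d) of the example.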
82. Bayesian Networks Unit - Exact Inference in BN p. 82
Treewidth
Low treewidth
– Chains: W = 1
– Trees (no loops): W = #parents
High treewidth
– N = n x n grid: W = O(n) = O(√N)
– Loopy graphs: W = NP-hard to find [Arnborg85]
[Figure: example graphs for each case, including a large loopy network with nodes such as MINVOLSET, INTUBATION, KINKEDTUBE, PULMEMBOLUS, VENTMACH, DISCONNECT]
83. Bayesian Networks Unit - Exact Inference in BN p. 83
Complexity
• Time and space cost of variable elimination
are O(n d^k)
– n: no. of random variables
– d: no. of discrete values
– k: no. of parent nodes = treewidth + 1 (W+1)
• Polytrees: k is small, linear
– If k=1, O(dn)
• Multiply connected networks:
– O(n d^k), k is large
– Can reduce 3SAT to variable elimination
• NP-hard
– Equivalent to counting 3SAT models
• #P-complete, i.e. at least as hard as NP-complete
problems (and believed to be strictly harder)
84. Bayesian Networks Unit - Exact Inference in BN p. 84
Elimination on Trees (1/3)
• Suppose we have a tree, that is,
– A network where each variable has at most
one parent
• Then all the factors involve at most two
variables: Treewidth=1
• The moralized graph is also a tree
[Figure: tree over nodes A, B, C, D, E, F, G and its moralized graph]
85. Bayesian Networks Unit - Exact Inference in BN p. 85
Elimination on Trees (2/3)
• We can maintain the tree structure by
eliminating extreme variables in the tree
[Figure: tree over nodes A, B, C, D, E, F, G at successive elimination steps]
86. Bayesian Networks Unit - Exact Inference in BN p. 86
Elimination on Trees (3/3)
• Formally, for any tree, there is an
elimination ordering with treewidth = 1
• Theorem: Inference on trees is linear in the number of
variables: O(dn)
87. Bayesian Networks Unit - Exact Inference in BN p. 87
Exercise: Variable Elimination
Query: What is the probability that a student studied, given
that they pass the exam?
[Figure: network with smart, study → prepared and smart, prepared, fair → pass]
p(smart)=.8   p(study)=.6   p(fair)=.9

p(prep|…)    smart    ¬smart
study         .9       .7
¬study        .5       .1

p(pass|…)    smart             ¬smart
             prep    ¬prep     prep    ¬prep
fair          .9      .7        .7      .2
¬fair         .1      .1        .1      .1
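A hand-computed answer to this exercise can be checked by brute-force enumeration. The sketch below assumes one reading of the tables: in each pair of repeated column or row labels, the second stands for the negated variable (the negation marks appear to have been lost), prepared depends on smart and study, and pass depends on smart, prepared and fair. The helper names bern and joint_pass are hypothetical.

```python
from itertools import product

# ASSUMED reading of the exercise tables (second label of each pair = negation).
p_smart, p_study, p_fair = 0.8, 0.6, 0.9
p_prep = {(True, True): 0.9, (False, True): 0.7,    # key: (smart, study)
          (True, False): 0.5, (False, False): 0.1}
p_pass = {(True, True, True): 0.9, (True, False, True): 0.7,   # key: (smart, prep, fair)
          (False, True, True): 0.7, (False, False, True): 0.2,
          (True, True, False): 0.1, (True, False, False): 0.1,
          (False, True, False): 0.1, (False, False, False): 0.1}

def bern(p, value):
    """Probability of a binary outcome given P(value is True) = p."""
    return p if value else 1.0 - p

def joint_pass(study):
    """P(study, pass=true), summing out smart, prepared and fair."""
    total = 0.0
    for smart, prep, fair in product([True, False], repeat=3):
        total += (bern(p_smart, smart) * bern(p_study, study) * bern(p_fair, fair)
                  * bern(p_prep[(smart, study)], prep)
                  * p_pass[(smart, prep, fair)])
    return total

# Normalize over the two values of study to get P(study | pass).
posterior_study = joint_pass(True) / (joint_pass(True) + joint_pass(False))
print(round(posterior_study, 4))
```

Under this reading the posterior is close to 0.64, slightly above the prior p(study)=.6. A variable elimination solution should give the same number while touching far fewer terms.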
88. Bayesian Networks Unit - Exact Inference in BN p. 88
Variable Elimination Algorithm
• Let X1,…, Xm be an ordering on the non-query
variables
$$\sum_{X_1} \sum_{X_2} \cdots \sum_{X_m} \prod_j P(X_j \mid Parents(X_j))$$
• For i = m, …, 1
– Leave in the summation for Xi only factors
mentioning Xi
– Multiply the factors, getting a factor that contains a
number for each value of the variables mentioned,
including Xi
– Sum out Xi, getting a factor f that contains a
number for each value of the variables mentioned,
not including Xi
– Replace the multiplied factor in the summation
with the new factor f
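The loop above can be sketched with factors stored as plain dictionaries. This is a minimal illustration, not a reference implementation: the factor representation (a tuple of variable names plus a table keyed by value tuples), the function names, and the CPT numbers for the three-node chain A → B → C are all assumptions made for the example.

```python
from itertools import product

def multiply(f, g, domain=(0, 1)):
    """Pointwise product of two factors; a factor is (variables, table)."""
    fv, ft = f
    gv, gt = g
    vars_ = tuple(dict.fromkeys(fv + gv))  # union of scopes, order preserved
    table = {}
    for vals in product(domain, repeat=len(vars_)):
        a = dict(zip(vars_, vals))
        table[vals] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return vars_, table

def sum_out(f, x):
    """Sum variable x out of factor f."""
    fv, ft = f
    keep = tuple(v for v in fv if v != x)
    table = {}
    for vals, p in ft.items():
        key = tuple(val for v, val in zip(fv, vals) if v != x)
        table[key] = table.get(key, 0.0) + p
    return keep, table

def eliminate(factors, order):
    """Eliminate the variables in `order`, then multiply what is left."""
    for x in order:
        touching = [f for f in factors if x in f[0]]
        if not touching:
            continue
        rest = [f for f in factors if x not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, x)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Illustrative CPTs for a binary chain A -> B -> C (numbers are made up).
f_a = (("A",), {(0,): 0.7, (1,): 0.3})
f_b = (("A", "B"), {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.9})
f_c = (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6})
scope, p_c = eliminate([f_a, f_b, f_c], ["A", "B"])
print(scope, p_c)
```

Each pass through the loop builds one intermediate factor over the eliminated variable and its neighbours, which is why the size of the largest such factor governs the cost.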
89. Bayesian Networks Unit - Exact Inference in BN p. 89
3.3 General Graphs
• If the graph is not a polytree
– More general networks
– Usually loopy networks
• Can we perform inference on loopy networks by
variable elimination?
– If the network has a cycle, the treewidth for
any ordering is greater than 1
– Its complexity is high
– VE becomes impractical
90. Bayesian Networks Unit - Exact Inference in BN p. 90
Example (1/2)
• Eliminating A, B, C, D, E, …
• Resulting graph is chordal with
treewidth 2
[Figure: network over nodes A, B, C, D, E, F, G, H at successive elimination steps]
91. Bayesian Networks Unit - Exact Inference in BN p. 91
Example (2/2)
• Eliminating H, G, E, C, F, D, E, A
• Resulting graph is chordal with
treewidth 3
[Figure: network over nodes A, B, C, D, E, F, G, H at successive elimination steps]
92. Bayesian Networks Unit - Exact Inference in BN p. 92
Finding a Good Elimination Order
in General Graphs
Theorem:
• Finding an ordering that minimizes the
treewidth is NP-hard
However,
• There are reasonable heuristics for finding
“relatively” good orderings
• There are provable approximations to the best
treewidth
• If the graph has a small treewidth, there are
algorithms that find it in polynomial time
93. Bayesian Networks Unit - Exact Inference in BN p. 93
Heuristics for
Finding an Elimination Order
• Since the elimination order is NP-hard to
optimize, it is common to apply greedy search
techniques [Kjaerulff90]
• At each iteration, eliminate the node that
would result in the smallest
– Number of fill-in edges [min-fill]
– Resulting clique weight [min-weight] (Weight of
clique = product of number of states per node in
clique)
• There are some approximation algorithms [Amir01]
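The min-fill rule can be sketched in a few lines. This is a minimal greedy version under stated assumptions: the function name min_fill_order, the alphabetical tie-breaking, and the small test graph (a 4-cycle with one pendant node) are all choices made for illustration and are not taken from the cited work.

```python
from itertools import combinations

def min_fill_order(edges):
    """Greedy min-fill: repeatedly eliminate the node needing the fewest fill-in edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def fill_count(x):
        # Number of neighbour pairs of x that are not yet connected.
        return sum(1 for u, v in combinations(adj[x], 2) if v not in adj[u])

    order = []
    while adj:
        # Ties are broken alphabetically so the result is deterministic.
        x = min(sorted(adj), key=fill_count)
        nbrs = adj.pop(x)
        for u, v in combinations(nbrs, 2):
            adj[u].add(v)
            adj[v].add(u)
        for n in nbrs:
            adj[n].discard(x)
        order.append(x)
    return order

# Hypothetical graph: 4-cycle A-B-C-D-A plus a pendant node E attached to A.
order = min_fill_order([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "E")])
print(order)
```

The pendant node E is removed first because it needs no fill-in edge; breaking the cycle then costs exactly one fill-in edge whichever cycle node is chosen. A min-weight variant would replace fill_count with the product of the state counts of the node and its neighbours.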
94. Bayesian Networks Unit - Exact Inference in BN p. 94
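Factorization in Loopy Networks
Probabilistic models with no loop are tractable (factorizable)
[Figure: graph over a, b, c, d without loops]
$$\sum_a \sum_b \sum_c \sum_d P(a,x) P(b,x) P(c,x) P(d,x) = \Big(\sum_a P(a,x)\Big) \Big(\sum_b P(b,x)\Big) \Big(\sum_c P(c,x)\Big) \Big(\sum_d P(d,x)\Big)$$
Probabilistic models with loop are not tractable (not factorizable)
[Figure: graph over a, b, c, d with a loop]
$$\sum_a \sum_b \sum_c \sum_d P(a,b,c,d,x)$$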
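The saving in the loop-free case is that a sum over all joint assignments of a, b, c, d of a product of separate terms equals a product of four small sums. The check below illustrates this numerically; the tables are random nonnegative numbers standing in for P(a,x), P(b,x), P(c,x), P(d,x), and the state count K and the fixed value of x are arbitrary assumptions.

```python
from itertools import product
import random

random.seed(0)
K = 3  # hypothetical number of states per variable
x = 1  # fixed value of the shared variable x
# Hypothetical nonnegative tables standing in for P(a,x), P(b,x), P(c,x), P(d,x).
tables = [[[random.random() for _ in range(K)] for _ in range(K)] for _ in range(4)]

# Unfactorized form: K**4 terms, one per joint assignment of (a, b, c, d).
lhs = sum(tables[0][a][x] * tables[1][b][x] * tables[2][c][x] * tables[3][d][x]
          for a, b, c, d in product(range(K), repeat=4))

# Factorized form: four independent sums of K terms each.
rhs = 1.0
for t in tables:
    rhs *= sum(t[v][x] for v in range(K))

print(abs(lhs - rhs) < 1e-12)
```

With a loop, the summand is a single table over (a, b, c, d, x) that does not split this way, so all K**4 terms have to be visited.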
95. Bayesian Networks Unit - Exact Inference in BN p. 95
Short Summary
• Variable elimination
– Actual computation is done in elimination
step
– Computation depends on order of
elimination
– Very sensitive to topology
– Space = time
• Complexity
– Polytrees: Linear time
– General graphs: NP-hard
96. Bayesian Networks Unit - Exact Inference in BN p. 96
4. Belief Propagation
• Also called
– Message passing
– Pearl’s algorithm
• Subsections
– 4.1 Message passing in simple chains
– 4.2 Message passing in trees
– 4.3 BP Algorithm
– 4.4 Message passing in general graphs
97. Bayesian Networks Unit - Exact Inference in BN p. 97
What’s Wrong with VarElim
• Often we want to query all hidden nodes
• Variable elimination takes O(N^2 d^k) time to
compute P(Xi|e) for all (hidden) nodes Xi
• Message passing algorithms can do
this in O(N d^k) time
98. Bayesian Networks Unit - Exact Inference in BN p. 98
Repeated Variable Elimination Leads to
Redundant Calculations
[Figure: chain X1 → X2 → X3 with observations Y1, Y2, Y3 attached to X1, X2, X3]
$$P(x_1 \mid y_{1:3}) \propto \sum_{x_2} \sum_{x_3} P(x_1) P(y_1 \mid x_1) P(x_2 \mid x_1) P(y_2 \mid x_2) P(x_3 \mid x_2) P(y_3 \mid x_3)$$
$$P(x_2 \mid y_{1:3}) \propto \sum_{x_1} \sum_{x_3} P(x_2 \mid x_1) P(y_2 \mid x_2) P(x_1) P(y_1 \mid x_1) P(x_3 \mid x_2) P(y_3 \mid x_3)$$
$$P(x_3 \mid y_{1:3}) \propto \sum_{x_1} \sum_{x_2} P(x_3 \mid x_2) P(y_3 \mid x_3) P(x_1) P(y_1 \mid x_1) P(x_2 \mid x_1) P(y_2 \mid x_2)$$
O(N^2 K^2) time to compute all N marginals
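The three sums share most of their inner terms, so they can be computed once and reused. The sketch below does this with one forward and one backward sweep over the chain and compares the result against brute-force enumeration. All probability tables and observed values are made-up illustrative numbers, and the variable names (fwd, bwd, trans, emit) are assumptions rather than notation from the slides.

```python
from itertools import product

K, N = 2, 3
prior = [0.6, 0.4]                 # P(x1), illustrative
trans = [[0.7, 0.3], [0.2, 0.8]]   # trans[i][j] = P(x_{t+1}=j | x_t=i)
emit = [[0.9, 0.1], [0.3, 0.7]]    # emit[i][y] = P(y_t=y | x_t=i)
ys = [0, 1, 1]                     # observed y1..y3

# Forward sweep: fwd[t][j] = P(x_t=j, y_1..y_t)
fwd = [[prior[j] * emit[j][ys[0]] for j in range(K)]]
for t in range(1, N):
    fwd.append([emit[j][ys[t]] * sum(fwd[-1][i] * trans[i][j] for i in range(K))
                for j in range(K)])

# Backward sweep: bwd[t][i] = P(y_{t+1}..y_N | x_t=i)
bwd = [[1.0] * K for _ in range(N)]
for t in range(N - 2, -1, -1):
    bwd[t] = [sum(trans[i][j] * emit[j][ys[t + 1]] * bwd[t + 1][j] for j in range(K))
              for i in range(K)]

def normalize(v):
    s = sum(v)
    return [p / s for p in v]

# Each marginal P(x_t | y_1..y_N) is the normalized product of the two sweeps.
marginals = [normalize([f * b for f, b in zip(fwd[t], bwd[t])]) for t in range(N)]

# Brute-force check: enumerate all K**N hidden sequences.
brute = [[0.0] * K for _ in range(N)]
for xs in product(range(K), repeat=N):
    p = prior[xs[0]] * emit[xs[0]][ys[0]]
    for t in range(1, N):
        p *= trans[xs[t - 1]][xs[t]] * emit[xs[t]][ys[t]]
    for t in range(N):
        brute[t][xs[t]] += p
brute = [normalize(row) for row in brute]
print(marginals)
```

The two sweeps cost on the order of N K^2 operations in total, compared with repeating a full elimination for each of the N query nodes.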
99. Bayesian Networks Unit - Exact Inference in BN p. 99
Belief Propagation
• Belief propagation (BP) operates by sending
beliefs/messages between nearby variables in
the graphical model
• It works like variable elimination
100. Bayesian Networks Unit - Exact Inference in BN p. 100
4.1 Message Passing
in Simple Chains
X1 ... Xk ... Xn
• Likelihood query (query without evidence)
– P(X1), P(Xn), P(Xk)
– P(Xj , Xk)
• Posterior query (query with evidence)
– P(X1|Xn), P(Xn|X1),
– P(Xk|X1), P(Xk|Xn),
– P(X1|Xk), P(Xn|Xk),
– P(Xk|Xj)
• Maximum a posteriori (MAP) query
– arg max P(Xk|Xj)
101. Bayesian Networks Unit - Exact Inference in BN p. 101
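Sum-Product of the Simple Chain
(1/2)
X1 → ... → Xk → ... → Xn
$$P(X_k) = \sum_{X_1, \ldots, X_{k-1}, X_{k+1}, \ldots, X_n} P(X_1, \ldots, X_k, \ldots, X_n)$$
$$= \sum_{X_1} \cdots \sum_{X_{k-1}} \sum_{X_{k+1}} \cdots \sum_{X_n} P(X_1, \ldots, X_k, \ldots, X_n)$$
$$= \sum_{X_1} \cdots \sum_{X_{k-1}} \sum_{X_{k+1}} \cdots \sum_{X_n} \prod_{X_i} P(X_i \mid Pa(X_i))$$
$$= \sum_{X_1} \cdots \sum_{X_{k-1}} \sum_{X_{k+1}} \cdots \sum_{X_n} P(X_n \mid X_{n-1}) \cdots P(X_k \mid X_{k-1}) \cdots P(X_2 \mid X_1) P(X_1)$$
$$= \sum_{X_1} \sum_{X_2} \cdots \sum_{X_{k-1}} P(X_1) P(X_2 \mid X_1) \cdots P(X_{k-1} \mid X_{k-2}) P(X_k \mid X_{k-1}) \sum_{X_{k+1}} P(X_{k+1} \mid X_k) \cdots \sum_{X_n} P(X_n \mid X_{n-1})$$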
102. Bayesian Networks Unit - Exact Inference in BN p. 102
Sum-Product of the Simple Chain
(2/2)
X1 → ... → Xk → ... → Xn
$$P(X_k \mid X_j) \propto \sum_{\{X_i \mid 1 \le i \le n,\ i \ne j,k\}} P(X_1, \ldots, X_n)$$
$$= \sum_{\{X_i \mid 1 \le i \le n,\ i \ne j,k\}} \prod_{X_i} P(X_i \mid Pa(X_i))$$
$$= \sum_{\{X_i \mid 1 \le i \le n,\ i \ne j,k\}} P(X_n \mid X_{n-1}) \cdots P(X_k \mid X_{k-1}) \cdots P(X_2 \mid X_1) P(X_1)$$
103. Bayesian Networks Unit - Exact Inference in BN p. 103
4.1.1 Likelihood Query
• P(Xn) or P(xn) : Forward passing
X1 X2 X3 ... Xn
• P(X1) or P(x1) : Backward passing
X1 X2 X3 ... Xn
• P(Xk) or P(xk) : Forward-Backward passing
X1 X2 ... Xk ... Xn
104. Bayesian Networks Unit - Exact Inference in BN p. 104
Forward Passing (1/6)
A → B → C → D → E
• P(e)
$$P(e) = \sum_d \sum_c \sum_b \sum_a P(a) P(b \mid a) P(c \mid b) P(d \mid c) P(e \mid d)$$
$$= \sum_d P(e \mid d) \sum_c P(d \mid c) \sum_b P(c \mid b) \sum_a P(a) P(b \mid a)$$
105. Bayesian Networks Unit - Exact Inference in BN p. 105
Forward Passing (2/6)
[Figure: chain A → B → C → D → E with A crossed out and the message m_AB(B) sent from A to B]
• Now we can perform the innermost summation
$$P(e) = \sum_d P(e \mid d) \sum_c P(d \mid c) \sum_b P(c \mid b) \sum_a P(a) P(b \mid a)$$
$$= \sum_d P(e \mid d) \sum_c P(d \mid c) \sum_b P(c \mid b) P(b)$$
• This summation is exactly
– A variable elimination step
– We call it: send a CPT P(b) to compute the next
innermost summation
– The sent CPT P(b) is called a belief, or message:
$$m_{AB}(b) = P(b) = \sum_a P(a) P(b \mid a) = \sum_a f(a, b)$$
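The message m_AB(b) is just a vector with one entry per value of b, obtained by summing a out of P(a)P(b|a). A minimal sketch, assuming binary variables and made-up CPT numbers (the helper name send is hypothetical, and the same CPT is reused for every link of the chain purely to keep the example short):

```python
def send(message, cpt):
    """One forward step: m(child) = sum over parent of message(parent) * P(child | parent)."""
    n_child = len(cpt[0])
    return [sum(message[p] * cpt[p][c] for p in range(len(message)))
            for c in range(n_child)]

# Illustrative numbers only.
p_a = [0.3, 0.7]                 # P(a)
cpt = [[0.9, 0.1], [0.4, 0.6]]   # cpt[parent][child], reused for b|a, c|b, d|c, e|d

m_ab = send(p_a, cpt)    # m_AB(b) = P(b)
m_bc = send(m_ab, cpt)   # P(c)
m_cd = send(m_bc, cpt)   # P(d)
p_e = send(m_cd, cpt)    # P(e)
print(m_ab, p_e)
```

Each call touches only one CPT, so the whole forward pass costs a number of operations proportional to the chain length instead of growing exponentially with it.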