Wang–Landau algorithm
Improvements
2D Ising model
Conclusion
Parallel Adaptive Wang–Landau Algorithm
Pierre E. Jacob
CEREMADE, Université Paris-Dauphine & CREST, funded by AXA Research
15 November 2011
joint work with Luke Bornn (UBC), Arnaud Doucet (Oxford), Pierre Del Moral (INRIA & Université de Bordeaux)
Pierre E. Jacob PAWL 1/ 18
Outline
1 Wang–Landau algorithm
2 Improvements
Automatic Binning
Parallel Interacting Chains
Adaptive proposals
3 2D Ising model
4 Conclusion
Wang–Landau
Context
an unnormalized target density π on a state space X.
A kind of adaptive MCMC algorithm:
it iteratively generates a sequence X_t;
the stationary distribution is not π itself;
at each iteration a different stationary distribution is targeted.
Wang–Landau
Partition the space
The state space X is cut into d bins:

    X = ∪_{i=1}^{d} X_i,  with X_i ∩ X_j = ∅ for all i ≠ j.
Goal
The generated sequence spends the same time in each bin X_i,
and within each bin X_i the sequence is asymptotically distributed
according to the restriction of π to X_i.
Wang–Landau
Stationary distribution
Define the mass of π over X_i by:

    ψ_i = ∫_{X_i} π(x) dx

The stationary distribution of the WL algorithm is:

    π_ψ(x) ∝ π(x) × 1/ψ_{J(x)}

where J(x) is the index such that x ∈ X_{J(x)}.
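For concreteness, π_ψ can be computed numerically. A minimal sketch; the bimodal target, the two-bin partition, and all names below are our own illustration, not from the slides:

```python
import math

# Toy unnormalized bimodal target (our illustration).
def pi(x):
    return math.exp(-0.5 * x ** 2) + 0.5 * math.exp(-0.5 * (x - 10.0) ** 2)

# Partition X = [-5, 15] into d = 2 bins, split at x = 5.
def bin_index(x):
    return 0 if x < 5.0 else 1

# psi_i: mass of pi over bin X_i, via a crude midpoint Riemann sum.
def mass(a, b, n=20000):
    h = (b - a) / n
    return h * sum(pi(a + (k + 0.5) * h) for k in range(n))

psi = [mass(-5.0, 5.0), mass(5.0, 15.0)]

# Biased density pi_psi(x) ∝ pi(x) / psi_{J(x)}: each bin carries equal mass.
def pi_psi(x):
    return pi(x) / psi[bin_index(x)]
```

Integrating pi_psi over either bin then gives the same mass, which is exactly the "same time in each bin" property above.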
Wang–Landau
Example with a bimodal, univariate target density: π and two π_ψ
corresponding to different partitions.
[Figure: log density vs. X on three panels: the original density with partition lines, the density biased by X, and the density biased by log density.]
Wang–Landau
Plugging estimates
In practice we cannot compute ψ_i analytically. Instead we plug in
estimates θ_t(i) of ψ_i at iteration t, and define the distribution π_{θ_t} by:

    π_{θ_t}(x) ∝ π(x) × 1/θ_t(J(x))
Metropolis–Hastings
The algorithm performs a Metropolis–Hastings step targeting π_{θ_t} at
iteration t, generating a new point X_t.
Wang–Landau
Estimate of the bias
The update of the estimated bias θ_t(i) is done according to:

    θ_t(i) ← θ_{t−1}(i) [1 + γ_t (𝟙(X_t ∈ X_i) − d^{−1})]

with γ_t a decreasing sequence of step sizes, e.g. γ_t = 1/t.
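The two steps (a Metropolis–Hastings move on π_{θ_t}, then the multiplicative update of θ_t) can be sketched as follows; the toy bimodal target and all names are our own illustration, not the slides':

```python
import math
import random

random.seed(1)

# Toy unnormalized bimodal target on the real line (our illustration).
def log_pi(x):
    return math.log(math.exp(-0.5 * x ** 2)
                    + 0.5 * math.exp(-0.5 * (x - 4.0) ** 2))

d = 2
def J(x):                      # bin index: split the line at x = 2
    return 0 if x < 2.0 else 1

theta = [1.0, 1.0]             # running estimates of the masses psi_i
counts = [0, 0]
x = 0.0
T = 20000
for t in range(1, T + 1):
    # Metropolis-Hastings step targeting pi_theta(x) ∝ pi(x) / theta[J(x)]
    y = x + random.gauss(0.0, 1.0)
    log_ratio = (log_pi(y) - math.log(theta[J(y)])) \
              - (log_pi(x) - math.log(theta[J(x)]))
    if math.log(random.random()) < log_ratio:
        x = y
    # stochastic-approximation update of theta, step size gamma_t = 1/t
    gamma = 1.0 / t
    for i in range(d):
        theta[i] *= 1.0 + gamma * ((J(x) == i) - 1.0 / d)
    counts[J(x)] += 1
```

With the bias active, the chain spends roughly equal time in the two bins even though their masses under π differ.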
Wang–Landau
Result
In the end we get:
a sequence X_t asymptotically following π_ψ,
as well as estimates θ_t(i) of ψ_i.
Automatic Binning
Easily move from one bin to another.
Maintain some kind of uniformity within bins: if the sampled values
within a bin are non-uniform, split the bin.
[Figure: histograms (frequency vs. log density) within a bin, (a) before the split and (b) after the split.]
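One simple way to realize this rule: monitor the histogram of recent samples inside a bin and split at the midpoint when it is too uneven. The criterion and all names below are our own illustration, not the slides':

```python
# Split rule sketch: flag a bin whose recent samples concentrate in a
# small part of it (criterion and names are ours).
def should_split(samples, low, high, n_sub=4, min_share=0.125):
    counts = [0] * n_sub
    width = (high - low) / n_sub
    for s in samples:
        k = min(int((s - low) / width), n_sub - 1)
        counts[k] += 1
    # non-uniform if some sub-interval holds too small a share of the samples
    return len(samples) > 0 and min(counts) / len(samples) < min_share

def split_bin(low, high):
    # split at the midpoint, yielding two new bins
    mid = 0.5 * (low + high)
    return (low, mid), (mid, high)
```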
Parallel Interacting Chains
N chains (X_t^{(1)}, . . . , X_t^{(N)}) instead of one,
targeting the same biased distribution π_{θ_t} at iteration t,
sharing the same estimated bias θ_t at iteration t.
The update of the estimated bias becomes:

    θ_t(i) ← θ_{t−1}(i) [1 + γ_t ( (1/N) Σ_{j=1}^{N} 𝟙(X_t^{(j)} ∈ X_i) − d^{−1} )]
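The shared update is a small modification of the single-chain loop: each chain moves, then the bin indicators are averaged before updating θ_t. The toy target and names are again our own illustration:

```python
import math
import random

random.seed(2)

# Same toy bimodal target as in the single-chain sketch (ours).
def log_pi(x):
    return math.log(math.exp(-0.5 * x ** 2)
                    + 0.5 * math.exp(-0.5 * (x - 4.0) ** 2))

N, d = 4, 2
def J(x):
    return 0 if x < 2.0 else 1

xs = [0.0] * N                 # N chains sharing one bias estimate theta
theta = [1.0] * d
for t in range(1, 5001):
    for j in range(N):         # each chain: one MH step targeting pi_theta
        y = xs[j] + random.gauss(0.0, 1.0)
        log_ratio = (log_pi(y) - math.log(theta[J(y)])) \
                  - (log_pi(xs[j]) - math.log(theta[J(xs[j])]))
        if math.log(random.random()) < log_ratio:
            xs[j] = y
    # shared update: average the bin indicators over the N chains
    gamma = 1.0 / t
    frac = [sum(J(xj) == i for xj in xs) / N for i in range(d)]
    for i in range(d):
        theta[i] *= 1.0 + gamma * (frac[i] - 1.0 / d)
```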
Adaptive proposals
For continuous state spaces
We can use an adaptive random-walk proposal, where the variance σ_t is
learned along the iterations to target a given acceptance rate.
Robbins–Monro stochastic approximation update:

    σ_{t+1} = σ_t + ρ_t (2 𝟙(A > 0.234) − 1)

with A the acceptance probability. Or alternatively:

    Σ_t = δ × Cov(X_1, . . . , X_t)
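A minimal sketch of the Robbins–Monro update on a plain Gaussian target, taking A to be the Metropolis–Hastings acceptance probability of the proposed move; the toy setup and names are our own, not from the slides:

```python
import math
import random

random.seed(3)

# Standard-normal log target (toy setup, ours).
def log_target(x):
    return -0.5 * x * x

sigma, x = 5.0, 0.0            # start with a deliberately wide proposal
T = 20000
for t in range(1, T + 1):
    y = x + random.gauss(0.0, sigma)
    accept_prob = min(1.0, math.exp(log_target(y) - log_target(x)))
    if random.random() < accept_prob:
        x = y
    # Robbins-Monro: widen sigma when the acceptance probability exceeds
    # 0.234, narrow it otherwise, with step size rho_t = 1/t
    rho = 1.0 / t
    sigma = max(1e-3, sigma + rho * (2.0 * (accept_prob > 0.234) - 1.0))
```

The step sizes ρ_t shrink over time, so σ_t settles near a value where the acceptance probability exceeds 0.234 about half the time.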
2D Ising model
Higdon (1998), JASA 93(442)
Target density
Consider a 2D Ising model, with posterior density

    π(x | y) ∝ exp( α Σ_i 𝟙[y_i = x_i] + β Σ_{i∼j} 𝟙[x_i = x_j] )

with α = 1, β = 0.7.
The first term (likelihood) encourages states x which are
similar to the original image y .
The second term (prior) favors states x for which
neighbouring pixels are equal, like a Potts model.
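The unnormalized log posterior above is straightforward to evaluate on 0/1 pixel grids; a minimal sketch, where the helper names are ours:

```python
# Unnormalized log posterior of the 2D Ising model above, for n-by-n
# grids of 0/1 pixels stored as lists of lists (helper names are ours).
def log_post(x, y, alpha=1.0, beta=0.7):
    n = len(x)
    # likelihood term: alpha * number of pixels agreeing with the image y
    like = sum(x[i][j] == y[i][j] for i in range(n) for j in range(n))
    # prior term: beta * number of equal neighbouring pairs (each edge once)
    prior = 0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                prior += x[i][j] == x[i + 1][j]
            if j + 1 < n:
                prior += x[i][j] == x[i][j + 1]
    return alpha * like + beta * prior
```

On a 2-by-2 all-ones grid matched to itself this gives 4·α + 4·β = 6.8.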
2D Ising models
[Figure: (a) original image; (b) focused region of the image.]
2D Ising models
[Figure: pixel states (on/off) over the grid (X1, X2) at iterations 300,000 to 500,000, for Metropolis–Hastings (top row) and Wang–Landau (bottom row).]
Figure: Spatial model example: states explored over 200,000 iterations
for Metropolis-Hastings (top) and proposed algorithm (bottom).
2D Ising models
[Figure: average pixel state (0.4 to 1.0) over the grid (X1, X2), for Metropolis–Hastings (left) and Wang–Landau (right).]
Figure: Spatial model example: average state explored with
Metropolis-Hastings (left) and Wang-Landau after importance sampling
(right).
Conclusion
Automatic binning
We still have to define a range.
Parallel Chains
In practice it is more efficient to use N chains for T iterations
instead of 1 chain for N × T iterations.
Adaptive Proposals
Convergence results with fixed proposals are already challenging,
and making the proposal adaptive might add a layer of complexity.
Bibliography
Article: An Adaptive Interacting Wang-Landau Algorithm for
Automatic Density Exploration, L. Bornn, P.E. Jacob, P. Del
Moral, A. Doucet, available on arXiv.
Software: PAWL, an R package, available on CRAN:
install.packages("PAWL")
References:
F. Wang, D. Landau, Physical Review E, 64(5):056101.
Y. Atchadé, J. Liu, Statistica Sinica, 20:209–233.