Monte Carlo Simulations, Sampling and Markov Chain Monte Carlo

Xin-She Yang

© 2010

Monte Carlo: Estimating π; Buffon's problem; Probability; Monte Carlo; Monte Carlo integration; Quality of Sampling; Quasi-Monte Carlo
Pseudorandom: Pseudorandom number generation; Other distributions; Limitations; Multivariate distributions
Markov Chains: Markov chains; Markov chains; A Famous Markov Chain

Estimating π

How to estimate π using only a ruler and some matchsticks?

Buffon’s Needle Problem

Buffon's needle problem (1733). Probability of crossing a line:

$$p = \frac{2}{\pi} \cdot \frac{L}{d},$$

where $L$ = length of the needles, and $d$ = spacing.
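The needle experiment can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `buffon_estimate_pi`, the default lengths and the seed are assumptions, and the formula is only used for L ≤ d. Note that the sketch draws the needle angle with `math.pi`, so it demonstrates the estimator rather than deriving π from nothing.

```python
import math
import random

def buffon_estimate_pi(num_throws, needle_len=5.0, spacing=6.0, seed=0):
    """Estimate pi by throwing needles onto parallel lines `spacing` apart."""
    if needle_len > spacing:
        raise ValueError("p = 2L/(pi d) only holds for L <= d")
    rng = random.Random(seed)
    crossings = 0
    for _ in range(num_throws):
        # Distance from the needle centre to the nearest line, in [0, d/2].
        centre = rng.uniform(0.0, spacing / 2.0)
        # Acute angle between the needle and the lines, in [0, pi/2].
        angle = rng.uniform(0.0, math.pi / 2.0)
        if centre <= (needle_len / 2.0) * math.sin(angle):
            crossings += 1
    # Invert p ~ n/N = 2L/(pi d):  pi ~ (2N/n) * (L/d).
    return 2.0 * num_throws * needle_len / (crossings * spacing)

estimate = buffon_estimate_pi(200_000)
print(estimate)
```

With a few hundred thousand throws the estimate is typically good to about two decimal places, which is consistent with an error that shrinks like 1/√N.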

Probability of Crossing a Line

Since $p \approx n/N \approx 2L/(\pi d)$, we have

$$\pi \approx \frac{2N}{n} \cdot \frac{L}{d}.$$

Lazzarini (1901): $L = 5d/6$, $N = 3408$, $n = 1808$, so

$$\pi \approx \frac{2 \times 3408}{1808} \cdot \frac{5}{6} \approx 3.1415929.$$

Too accurate?! Is this right? What happens when $n = 1809$?

Errors $\sim 1/\sqrt{N} \sim 2\%$.
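The question about n = 1809 can be checked with exact arithmetic. The short sketch below evaluates π ≈ (2N/n)(L/d) for Lazzarini's N = 3408 and L/d = 5/6 with one crossing fewer and one crossing more; the helper name `lazzarini_pi` is an assumption made for illustration.

```python
from fractions import Fraction

def lazzarini_pi(crossings, throws=3408, length_over_spacing=Fraction(5, 6)):
    """Evaluate pi ~ (2N/n) * (L/d) exactly as a fraction."""
    return Fraction(2 * throws, crossings) * length_over_spacing

for n in (1807, 1808, 1809):
    value = lazzarini_pi(n)
    print(n, value, float(value))
```

For n = 1808 the fraction reduces to 355/113, a well-known rational approximation of π, while a single extra or missing crossing already changes the third decimal place. That sensitivity is why the reported accuracy looks suspicious next to the expected error of roughly 2%.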

Monte Carlo Methods

Everyone has used Monte Carlo methods in some way ...

Measure temperatures, choose a product, ...

Taste soup, wine ...

Monte Carlo Integration

$$I = \int_{\Omega} f \, dV = V \, \frac{1}{N} \sum_{i=1}^{N} f_i + O(\epsilon),$$

$$\epsilon \sim \sqrt{\frac{\frac{1}{N} \sum_{i=1}^{N} f_i^2 - \mu^2}{N}} \sim O(1/\sqrt{N}).$$

Here $f_i$ is the integrand at the $i$-th sample point, $V$ is the volume of $\Omega$, and $\mu$ is the sample mean of the $f_i$.
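As a minimal sketch of the estimator and its error term, the following Python integrates a one-dimensional function over an interval. The function name `mc_integrate`, the test integrand sin(x) on [0, π] and the seed are assumptions chosen for illustration; the returned error scales the ε above by the volume V.

```python
import math
import random

def mc_integrate(f, a, b, num_samples, seed=0):
    """Return (estimate, error) for the integral of f over [a, b]."""
    rng = random.Random(seed)
    volume = b - a
    total = 0.0
    total_sq = 0.0
    for _ in range(num_samples):
        value = f(rng.uniform(a, b))
        total += value
        total_sq += value * value
    mean = total / num_samples                      # mu
    variance = max(total_sq / num_samples - mean * mean, 0.0)
    # Estimate V * mean, with error V * sqrt((<f^2> - mu^2) / N).
    return volume * mean, volume * math.sqrt(variance / num_samples)

estimate, error = mc_integrate(math.sin, 0.0, math.pi, 100_000)
print(estimate, error)  # the exact integral is 2
```

Quadrupling `num_samples` should roughly halve the reported error, which is the O(1/√N) behaviour in practice.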

Importance and Quality of the Samples

Higher dimensions – even more challenging!

$$I = \int \int \cdots \int f(u, v, \ldots, w) \, du \, dv \cdots dw.$$

Errors $\sim 1/\sqrt{N}$

Higher dimensional integrals

How to distribute these sampling points?

Regular grids: $E \sim O(N^{-2/d})$ in $d \ge 4$ dimensions (not enough!)

Strategies: importance sampling, Latin hypercube, ...

Any other ways?

Quasi-Monte Carlo Methods

In essence, the idea is to distribute (consecutive) sampling points as far apart as possible, using quasi-random or low-discrepancy numbers (not pseudo-random): Halton, Sobol, van der Corput, ...

For example, the van der Corput sequence expresses an integer $n$ in a prime base $b$:

$$n = \sum_{j=0}^{m} a_j(n) \, b^j, \qquad a_j \in \{0, 1, 2, \ldots, b - 1\}.$$

The digits are then reversed, or reflected, about the radix point:

$$\phi_b(n) = \sum_{j=0}^{m} a_j(n) \, \frac{1}{b^{j+1}}.$$

For example, in base 2: $0, 1, 2, \ldots, 15 \implies 0, \frac{1}{2}, \frac{1}{4}, \frac{3}{4}, \frac{1}{8}, \ldots, \frac{15}{16}$.

Errors $\sim O(1/N)$
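The digit reflection is easy to check by hand or in code. The sketch below peels off the base-b digits of n and places them after the radix point; the function name `van_der_corput` is an assumption, and exact fractions are used so the output can be compared directly with the example sequence.

```python
from fractions import Fraction

def van_der_corput(n, base=2):
    """Reflect the base-`base` digits of n about the radix point."""
    value = Fraction(0)
    denominator = base
    while n > 0:
        n, digit = divmod(n, base)      # least significant digit a_j first
        value += Fraction(digit, denominator)  # contributes a_j / b^(j+1)
        denominator *= base
    return value

sequence = [van_der_corput(n) for n in range(16)]
print(", ".join(str(x) for x in sequence))
```

Each new point falls into the largest gap left by the previous ones, which is the "as far apart as possible" property of low-discrepancy sequences.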

Pseudorandom numbers – by deterministic sequences

Uniform distributions:

$$d_i = (a \, d_{i-1} + c) \bmod m.$$

Classic IBM generator:

$$a = 65539, \quad c = 0, \quad m = 2^{31} \quad \text{(strong correlation!)}$$

In fact, the correlation coefficient is 1!

Better choice (old Matlab):

$$a = 7^5 = 16807, \quad c = 0, \quad m = 2^{31} - 1 = 2{,}147{,}483{,}647.$$

If scaled by $m$, all numbers are in $[1/m, (m-1)/m]$.

New Matlab: $[\epsilon, 1 - \epsilon]$, $\epsilon = 2^{-53} \approx 1.1 \times 10^{-16}$.

IEEE: 64-bit system = 53 bits for a signed fraction in base 2 and 11 bits for a signed exponent.
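A linear congruential generator takes only a few lines, which makes the two parameter sets easy to compare. In this sketch the helper names `lcg` and `take` and the seed value 1 are assumptions. One concrete way to see the defect of the IBM parameters is that, because 65539² ≡ 6·65539 − 9 (mod 2³¹), every three consecutive outputs satisfy an exact linear relation.

```python
def lcg(seed, a, c, m):
    """Yield d_i = (a * d_{i-1} + c) mod m indefinitely."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

def take(generator, count):
    """Collect the next `count` values from a generator."""
    return [next(generator) for _ in range(count)]

randu = take(lcg(seed=1, a=65539, c=0, m=2**31), 1000)
minstd = take(lcg(seed=1, a=16807, c=0, m=2**31 - 1), 1000)

# Every triple from the IBM parameters obeys
# d_{i+2} = 6 d_{i+1} - 9 d_i (mod 2^31), so this maximum is zero.
randu_defect = max(
    (randu[i + 2] - 6 * randu[i + 1] + 9 * randu[i]) % 2**31
    for i in range(len(randu) - 2)
)

# Scaling by m maps the second generator into [1/m, (m - 1)/m].
uniforms = [d / (2**31 - 1) for d in minstd]
print(randu_defect, min(uniforms), max(uniforms))
```

In three dimensions this relation confines consecutive triples to a small number of planes, so the points do not fill the unit cube.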

Other Distributions

Inverse transform method, rejection method, Mersenne twister, ..., Markov chain Monte Carlo.

Standard normal distribution:

$$p(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2},$$

CDF:

$$\Phi(v) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{v} e^{-u^2/2} \, du = \frac{1}{2} \left[ 1 + \operatorname{erf}\left( \frac{v}{\sqrt{2}} \right) \right],$$

$$v = \Phi^{-1}(u) = \sqrt{2} \, \operatorname{erf}^{-1}(2u - 1).$$

[Figure: two histograms, one with horizontal axis from 0 to 1 and one from −6 to 6.]
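The inverse transform method can be sketched with the Python standard library, whose `statistics.NormalDist.inv_cdf` evaluates Φ⁻¹ directly. The function name `normal_by_inversion`, the sample size and the seed are assumptions; the library quantile function stands in for the erf⁻¹ expression above.

```python
import random
from statistics import NormalDist, fmean, pstdev

def normal_by_inversion(num_samples, seed=0):
    """Map uniform draws u to v = Phi^{-1}(u)."""
    rng = random.Random(seed)
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    samples = []
    while len(samples) < num_samples:
        u = rng.random()
        if u > 0.0:  # inv_cdf requires 0 < u < 1
            samples.append(standard_normal.inv_cdf(u))
    return samples

samples = normal_by_inversion(50_000)
print(fmean(samples), pstdev(samples))
```

The sample mean and standard deviation should be close to 0 and 1, and a histogram of `samples` should take the familiar bell shape.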

Transform method: Limitations

$$v = \Phi^{-1}(u) = \sqrt{2} \, \operatorname{erf}^{-1}(2u - 1),$$

$$\operatorname{erf}^{-1}(x) = \frac{\sqrt{\pi}}{2} \left( x + \frac{\pi x^3}{12} + \frac{7 \pi^2 x^5}{480} + \frac{127 \pi^3 x^7}{40320} + \cdots \right).$$

Not so easy to calculate!

Sometimes, the inverse may not be possible.
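To see why the series is awkward in practice, the sketch below compares the four terms shown above with a reference value obtained from the standard library's normal quantile function. The names `erfinv_series` and `erfinv_reference` are assumptions; the reference uses the identity v = √2 erf⁻¹(2u − 1) rearranged for erf⁻¹.

```python
import math
from statistics import NormalDist

def erfinv_series(x):
    """Four-term Maclaurin series for the inverse error function."""
    pi = math.pi
    series = (x + pi * x**3 / 12 + 7 * pi**2 * x**5 / 480
              + 127 * pi**3 * x**7 / 40320)
    return math.sqrt(pi) / 2 * series

def erfinv_reference(x):
    """erf^{-1}(x) = Phi^{-1}((x + 1) / 2) / sqrt(2), via the library quantile."""
    return NormalDist().inv_cdf((x + 1) / 2) / math.sqrt(2)

for x in (0.1, 0.5, 0.9, 0.99):
    print(x, erfinv_series(x), erfinv_reference(x))
```

The truncated series agrees well for small |x| but degrades quickly as |x| approaches 1, which is where the tails of the normal distribution live.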

Multivariate Distributions

Bivariate normal distributions:

$$p(v_1, v_2) = \frac{1}{2\pi} e^{-(v_1^2 + v_2^2)/2}.$$

Box-Müller method: from $u_1, u_2 \sim$ uniform distributions,

$$v_1 = \sqrt{-2 \ln u_1} \cos(2\pi u_2), \qquad v_2 = \sqrt{-2 \ln u_1} \sin(2\pi u_2).$$

Problems

Difficult to calculate the inverse in most cases (sometimes, even impossible!).

Other methods (e.g., rejection method) are inefficient.

So – the Markov chain Monte Carlo (MCMC) way!
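The Box-Müller transform can be sketched directly from the two formulas. The function name `box_muller`, the sample size and the seed are assumptions; the only practical detail added here is shifting the first uniform draw away from zero so that the logarithm stays finite.

```python
import math
import random
from statistics import fmean, pstdev

def box_muller(num_pairs, seed=0):
    """Return two lists of independent standard normal samples."""
    rng = random.Random(seed)
    v1, v2 = [], []
    for _ in range(num_pairs):
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is finite
        u2 = rng.random()
        radius = math.sqrt(-2.0 * math.log(u1))
        v1.append(radius * math.cos(2.0 * math.pi * u2))
        v2.append(radius * math.sin(2.0 * math.pi * u2))
    return v1, v2

v1, v2 = box_muller(50_000)
print(fmean(v1), pstdev(v1), fmean(v2), pstdev(v2))
```

Both coordinates should show mean near 0 and standard deviation near 1, with no inverse CDF evaluation needed.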

Random Walk down the Markov Chains

Random walk – a drunkard's walk:

$$u_{t+1} = \mu + u_t + w_t,$$

where $w_t$ is a random variable, and $\mu$ is the drift.

For example, $w_t \sim N(0, \sigma^2)$ (Gaussian).

[Figure: two random-walk plots, one with horizontal axis from 0 to 500 and one from −15 to 20.]
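A one-dimensional version of this walk is a short loop. In the sketch below the function name `random_walk`, the default parameters and the seed are assumptions chosen for illustration.

```python
import random

def random_walk(num_steps, drift=0.0, sigma=1.0, start=0.0, seed=0):
    """Return [u_0, ..., u_T] with u_{t+1} = drift + u_t + w_t, w_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(num_steps):
        path.append(drift + path[-1] + rng.gauss(0.0, sigma))
    return path

path = random_walk(500)
drifting = random_walk(500, drift=0.1)
print(path[-1], drifting[-1])
```

Because both calls share a seed, the two paths use the same noise and differ only by the accumulated drift, which makes the role of µ easy to see when they are plotted together.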

Markov Chains

Markov chain: the next state depends only on the current state and the transition probability.

$$P(i, j) \equiv P(V_{t+1} = S_j \mid V_0 = S_p, \ldots, V_t = S_i) = P(V_{t+1} = S_j \mid V_t = S_i),$$

$$\implies P_{ij} \pi_i^* = P_{ji} \pi_j^*, \qquad \pi^* = \text{stationary probability distribution}.$$

Examples: Brownian motion

$$u_{i+1} = \mu + u_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2).$$
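For a small finite chain, the stationary distribution and the balance condition can be checked numerically. The three-state transition matrix below is a hypothetical example, not taken from the slides, and the function name `stationary_distribution` is an assumption. The chain was chosen to be reversible, since the balance condition above holds only for such chains.

```python
def stationary_distribution(P, num_iterations=200):
    """Iterate pi <- pi P from a uniform start (P is row-stochastic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(num_iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical three-state chain; P[i][j] is the probability of moving i -> j.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi_star = stationary_distribution(P)

# Largest violation of P_ij pi_i = P_ji pi_j over all pairs of states.
balance_gap = max(
    abs(pi_star[i] * P[i][j] - pi_star[j] * P[j][i])
    for i in range(3) for j in range(3)
)
print(pi_star, balance_gap)
```

For this matrix the iteration settles on (1/4, 1/2, 1/4), and the balance gap is zero up to rounding.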

Markov Chains

Monopoly (board games)

[Animation: Monopoly]

A Famous $Billion Markov Chain – PageRank

Google PageRank Algorithm (by Page et al., 1997)

Billions of web pages: pages = states, link probability $\sim 1/t$, where $t \approx$ the expected number of clicks.

Googling as a Markov Chain

$$\text{Rank}_j^{(t+1)} = \frac{1 - \alpha}{N} + \alpha \sum_{p_i \in \Omega(p_j)} \frac{\text{Rank}_i^{(t)}}{B(p_i)},$$

where $N$ = number of pages, $\Omega(p_j)$ is the set of pages linking to page $p_j$, $B(p_i)$ is the number of outbound links of page $p_i$, and $\alpha$ = a ranking factor ($\approx 0.85$). Initially, $\text{Rank}_i^{(t=0)} = 1/N$.

Let $R = [\text{Rank}_1, \ldots, \text{Rank}_N]^T$, and $L(p_i, p_j) = 0$ if no links $\implies$

$$R = \frac{1}{N} \begin{bmatrix} (1 - \alpha) \\ \vdots \\ (1 - \alpha) \end{bmatrix} + \alpha \begin{bmatrix} L(p_1, p_1) & \cdots & L(p_1, p_j) & \cdots & L(p_1, p_N) \\ \vdots & & \vdots & & \vdots \\ L(p_i, p_1) & \cdots & L(p_i, p_j) & \cdots & L(p_i, p_N) \\ \vdots & & \vdots & & \vdots \\ L(p_N, p_1) & \cdots & L(p_N, p_j) & \cdots & L(p_N, p_N) \end{bmatrix} R,$$

where $\sum_{i=1}^{N} L(p_i, p_j) = 1$. Google Matrix (stochastic, sparse).

$\implies$ a stationary probability distribution $R$ (update monthly).
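The rank update can be sketched as a plain iteration over a link list. Everything concrete here is an assumption made for illustration: the function name `pagerank`, the four-page link structure, and the requirement that every page has at least one outbound link. It is a toy version of the update rule, not a description of any production system.

```python
def pagerank(links, alpha=0.85, num_iterations=100):
    """Iterate Rank_j <- (1 - alpha)/N + alpha * sum_i Rank_i / B(p_i).

    `links[i]` lists the pages that page i links to; every page is assumed
    to have at least one outbound link.
    """
    n = len(links)
    rank = [1.0 / n] * n  # Rank_i^(0) = 1/N
    for _ in range(num_iterations):
        new_rank = [(1.0 - alpha) / n] * n
        for i, targets in enumerate(links):
            share = alpha * rank[i] / len(targets)  # B(p_i) = len(targets)
            for j in targets:
                new_rank[j] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: 0 -> 1, 2; 1 -> 2; 2 -> 0; 3 -> 2.
links = [[1, 2], [2], [0], [2]]
rank = pagerank(links)
print(rank, sum(rank))
```

The ranks stay normalised at every step, and a page with no inbound links (page 3 here) keeps only the baseline (1 − α)/N.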

Markov Chain Monte Carlo

Landmarks: Monte Carlo method (1930s, 1945, from 1950s), e.g., Metropolis Algorithm (1953), Metropolis-Hastings (1970).

Markov Chain Monte Carlo (MCMC) methods – a class of methods.

Really took off in the 1990s, now applied to a wide range of areas: physics, Bayesian statistics, climate change, machine learning, finance, economy, medicine, biology, materials and engineering ...

Metropolis-Hastings

The Metropolis-Hastings algorithm:

1 Begin with any initial $\theta_0$ at time $t \leftarrow 0$ such that $p(\theta_0) > 0$

2 Generate a candidate sample $\theta^* \sim q(\theta_t, \cdot)$ from a proposal distribution

3 Evaluate the acceptance probability $\alpha(\theta_t, \theta^*)$ given by

$$\alpha = \min \left[ \frac{p(\theta^*) \, q(\theta^*, \theta_t)}{p(\theta_t) \, q(\theta_t, \theta^*)}, 1 \right]$$

4 Generate a uniformly distributed random number $u \sim \text{Unif}[0, 1]$, and accept $\theta^*$ if $\alpha \ge u$. That is, if $\alpha \ge u$ then $\theta_{t+1} \leftarrow \theta^*$ else $\theta_{t+1} \leftarrow \theta_t$

5 Increase the counter or time $t \leftarrow t + 1$, and go to step 2
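The five steps can be sketched as a short sampler. This is a minimal illustration under stated assumptions rather than a reference implementation: the proposal is a symmetric Gaussian random walk (so the q terms cancel in α), the target is a standard normal given by its unnormalised log-density, and the names `metropolis_hastings`, `log_standard_normal`, `step` and the seed are all hypothetical choices.

```python
import math
import random
from statistics import fmean, pstdev

def metropolis_hastings(log_p, theta0, num_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a one-dimensional target p."""
    rng = random.Random(seed)
    theta = theta0                      # step 1: p(theta0) > 0 is assumed
    chain = []
    for _ in range(num_samples):
        # Step 2: candidate theta* ~ q(theta_t, .), a Gaussian around theta_t.
        candidate = theta + rng.gauss(0.0, step)
        # Step 3: q is symmetric, so alpha = min(p(theta*) / p(theta_t), 1).
        log_alpha = min(log_p(candidate) - log_p(theta), 0.0)
        # Step 4: accept when alpha >= u, compared on the log scale.
        u = 1.0 - rng.random()          # u in (0, 1] keeps log(u) finite
        if log_alpha >= math.log(u):
            theta = candidate
        chain.append(theta)             # step 5: advance t
    return chain

def log_standard_normal(x):
    """Unnormalised log-density of N(0, 1)."""
    return -0.5 * x * x

chain = metropolis_hastings(log_standard_normal, theta0=0.0, num_samples=50_000)
print(fmean(chain), pstdev(chain))
```

Working with log-densities avoids underflow when p is tiny, and only ratios of p are ever needed, so the normalising constant of the target never has to be known.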

Mixture distribution: A distribution with known mean and variance.

$$f(x \mid \mu, \sigma^2) = \sum_{i=1}^{K} \alpha_i \, p_i(x \mid \mu_i, \sigma_i^2), \qquad \sum_{i=1}^{K} \alpha_i = 1.$$

E.g., $\alpha_1 = \alpha_2 = 1/2$, $\mu_1 = 2$, $\mu_2 = -2$ and $\sigma_1 = \sigma_2 = 1$.

[Figure: two panels, a trace with horizontal axis from 0 to 10000 and values between −4 and 6, and a plot over −6 to 6 with vertical axis from 0 to 0.2.]
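For a mixture of Gaussians the mean and variance are known in closed form, which makes it a convenient test target. The sketch below draws from the example mixture directly (pick a component, then sample it) and compares the sample moments with the exact values; the function name `sample_mixture` and the seed are assumptions, and this direct sampler is a reference check rather than an MCMC method.

```python
import random
from statistics import fmean, pvariance

def sample_mixture(num_samples, weights, means, sigmas, seed=0):
    """Draw from sum_i alpha_i N(mu_i, sigma_i^2): pick a component, then sample it."""
    rng = random.Random(seed)
    components = rng.choices(range(len(weights)), weights=weights, k=num_samples)
    return [rng.gauss(means[i], sigmas[i]) for i in components]

samples = sample_mixture(50_000, weights=[0.5, 0.5], means=[2.0, -2.0],
                         sigmas=[1.0, 1.0])

# Exact moments: mean = sum alpha_i mu_i = 0,
# variance = sum alpha_i (sigma_i^2 + mu_i^2) - mean^2 = 5.
print(fmean(samples), pvariance(samples))
```

Any sampler aimed at this target, MCMC or otherwise, can be judged against the same two numbers.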

When to Stop the Chain

As the MCMC runs, convergence may be reached.

When does a chain converge? When to stop the chain ...?

Are the samples correlated?

[Figure: plot with horizontal axis from 0 to 900 and vertical axis from 0 to 600.]
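One simple way to look at the correlation question is the sample autocorrelation of the chain at increasing lags. The sketch below is an illustration under assumptions: the function name `autocorrelation` is hypothetical, and the chain is a synthetic autoregressive sequence standing in for MCMC output, not the result of any sampler in these slides.

```python
import random
from statistics import fmean

def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag (0 <= lag < len(chain))."""
    mean = fmean(chain)
    centred = [x - mean for x in chain]
    variance = sum(c * c for c in centred)
    covariance = sum(centred[t] * centred[t + lag]
                     for t in range(len(chain) - lag))
    return covariance / variance

# Hypothetical correlated chain: x_{t+1} = 0.9 x_t + Gaussian noise.
rng = random.Random(0)
chain = [0.0]
for _ in range(20_000):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

print([round(autocorrelation(chain, lag), 3) for lag in (0, 1, 10, 50)])
```

Slowly decaying autocorrelation means successive samples carry little new information, so the effective number of independent samples is much smaller than the chain length.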

A Long Single Chain or Multiple Short Chains?

When will a Markov chain converge in practice? If it has converged, what does it mean?

Is a very long chain really good enough (from a statistical point of view)?

How long is long enough?

Are multiple chains better?

How to improve the sampling efficiency and/or mixing properties?

Simulated Tempering

Simulated annealing: temperature $T$ from high to low.

Simulated tempering: raise $T$ to a higher value, then reduce it to low.

$$\pi_\tau = \pi(x)^{1/\tau}, \qquad \pi_\tau \to 1 \text{ as } \tau \to \infty.$$

The basic idea is to reduce from a very high $\tau$ to $\tau_0 = 1$.

[Figure: tempering flattens $\pi \ge 0$ into $\pi_\tau = \pi(x)^{1/\tau}$.]

Use flattened (near uniform) distributions as proposals/candidates to produce high-quality samples.
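The flattening effect of τ can be seen numerically. In the sketch below the target π(x) is assumed to be an equal mixture of N(2, 1) and N(−2, 1), and the names `mixture_density` and `tempered` are hypothetical; the code only evaluates the unnormalised tempered density and is not a full tempering sampler.

```python
import math

def mixture_density(x):
    """Equal mixture of N(2, 1) and N(-2, 1); a hypothetical target pi(x)."""
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    return 0.5 * norm * (math.exp(-0.5 * (x - 2.0) ** 2)
                         + math.exp(-0.5 * (x + 2.0) ** 2))

def tempered(x, tau):
    """Unnormalised tempered density pi_tau(x) = pi(x)^(1/tau)."""
    return mixture_density(x) ** (1.0 / tau)

# Ratio of the density at a mode (x = 2) to the valley between the modes (x = 0).
ratios = {tau: tempered(2.0, tau) / tempered(0.0, tau) for tau in (1, 2, 10, 100)}
print(ratios)
```

As τ grows the ratio approaches 1, so the valley between the modes stops being a barrier and a chain can move between them more easily.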

Sampling: Forward or Backward? Which Way?

Is this the only way?

No! – Coupling from the Past & Metaheuristics

If we go backward along the chain, are there any advantages? If so, how?

Is there a universally efficient sampling tool for drawing samples in general?

No! – No-free-lunch theorem (Wolpert & Macready, 1997)

The aim of the research is to find the best algorithm(s) for a given/specific problem/distribution.

Also Metaheuristics (very promising).

Thank you

References

Gamerman D., Markov Chain Monte Carlo, Chapman & Hall/CRC, (1997).
Corcoran J. and Tweedie R., Perfect sampling ..., Jour. Stat. Plan. Infer., 104, 297 (2002).
Cox M., Forbes A. B., Harris P. M., Smith I., Classification and solution of regression ..., NPL SSfM Report, (2004).
Propp J. & Wilson D., Exact sampling ..., Random Stru. Alg., 9, 223 (1996).
Yang X. S., Nature-Inspired Metaheuristic Algorithms, Luniver Press, (2008).
Yang X. S., Introduction to Computational Mathematics, World Scientific, (2008).
Yang X. S., Engineering Optimization: An Introduction with Metaheuristic Applications, Wiley, (2010).

Acknowledgement: EPSRC, SSfM, NPL, CUED, and London Maths Society.

Thank you!
