SlideShare uma empresa Scribd logo
1 de 43
Baixar para ler offline
MCMC for mixtures of Gaussians, and model 
selection 
Aaron McDaid, aaronmcdaid@gmail.com 
October 30, 2014 
1 / 36
Six models 
l 
l 
l 
l 
ll 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
ll l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
lll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
ll 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
−20 0 20 40 60 
0 20 40 60 80 100 
V1 
V2 
(a) 1. vvv 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
−10 −5 0 5 10 
−30 −20 −10 0 10 20 30 
V1 
V2 
(b) 2. eee 
2 / 36
Six models 
l 
l l 
l l 
l 
l 
l l 
l 
l 
l 
l 
ll 
l l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
ll l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l l l l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l ll 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
−10 0 10 20 30 
−20 −10 0 10 
V1 
V2 
(a) 3. vvi 
l 
l 
l 
ll 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
llll 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l l 
l l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
ll 
l 
ll 
ll 
l 
l l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
ll 
l 
l l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll l 
ll 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
−30 −20 −10 0 10 20 30 
−30 −20 −10 0 10 20 30 
V1 
V2 
(b) 4. eei 
3 / 36
Six models 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l l 
l 
l 
l l 
l 
l l 
l ll l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l ll 
l 
l 
l 
l 
l 
ll 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
ll 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
lll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
ll 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
lll 
l 
l 
l 
l l 
ll 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
ll 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l ll 
ll 
l 
l 
−20 −10 0 10 20 
−20 −10 0 10 20 
V1 
V2 
(a) 5. vii 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
ll 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
−100 0 100 200 
−200 −100 0 100 200 
V1 
V2 
(b) 6. eii 
4 / 36
Old Faithful N=272 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 
50 60 70 80 90 
eruptions 
waiting 
Old Faithful - Yellowstone National 
Park 
5 / 36
Old Faithful N=272 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
ll 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
l 
1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 
50 60 70 80 90 
V1 
V2 
Old Faithful - Yellowstone National 
Park 
6 / 36
Overview 
Goals 
De
ne the mclust model 
Bayes Factor and BIC - connection between mclust and 
MCMC 
Priors 
Integration (analytical and numerical) 
MCMC algorithm1 
Selecting from the six models via MCMC 
Evaluation (on synthetic data) 
One application 
1Mahlet G. Tadesse, Naijun Sha, and Marina Vannucci. Bayesian Variable 
Selection in Clustering High-Dimensional Data". In: Journal of the American 
Statistical Association 100.470 (June 2005), pp. 602{617. issn: 0162-1459. 
doi: 10.1198/016214504000001565. url: 
http://www.stat.rice.edu/~{}marina/papers/jasa05.pdf. 
7 / 36
Goals 
Not a `shootout' with mclust 
See what MCMC can do 
Calculate the Bayes Factor more precisely - is it better than 
BIC? 
Push to larger numbers of clusters 
8 / 36
Basic model 
N data points in a p-dimensional space. 
m 2 (fvvv; eee; vvi; eei; vii; eiig) 
K number of clusters 
k covariance of clusterk 
k mean Pof cluster k 
 
K 
k=1 k = 1 
zi P(zi = k) = k 
xi jzi=k  Normal(k ;k ): 
Mixture models 
P(xi jzi=k) = N(xi jk ;k ) 
P(xi ) = 
XK 
k=1 
kN(xi jk ;k ) 
9 / 36
mclust 
MLE (Maximum Likelihood Estimate) 
R package mclust2 
Given (K;m), use Expectation-Maximization (EM) algorithm 
to estimate (;;). 
P(Xjk ;k ;;m;K) 
Requires running EM for each possible combination of (K;m). 
Hundreds of runs may be required. f(K = 2;m = VVI); (K = 
3;m = EEI ); (K = 50;m = EEI ); : : : g 
Then use BIC to select among the models. 
2Chris Fraley and Adrian E. Raftery. MCLUST: Software for model-based 
cluster analysis. In: Journal of Classi
cation 16.2 (1999), pp. 297{306. 
10 / 36
mclust 
Why do we need model selection? 
vvv vvi vii 
eee eei eii 
De
ne  = (;;). 
P(Xj=^ eee;K;m=vvv;K) = P(Xj=^ eee;K;m=eee;K) 
Cannot maximize P(Xj;m;K) 
Count the degrees-of-freedom f , in order to penalize the more 
complex model. 
AIC = 2 log P(XjMLE 
m;K ;m;K)  2f 
BIC = 2 log P(XjMLE 
m;K ;m;K)  log(N)f 
11 / 36
Bayes Factor 
(BIC) Bayesian Information Criterion 
BIC  2 log 
Bayes Factor z }| { 
(P(Xjm;K)) 
P(X = Xobs jm;K) 
Informally, the average P(Xj;m;K) over all . 
Can we compute this (weighted) average more accurately? 
12 / 36
Bayes Factor 
(BIC) Bayesian Information Criterion 
BIC  2 log 
Bayes Factor z }| { 
(P(Xjm;K)) 
P(X = Xobs jm;K) 
Informally, the average P(Xj;m;K) over all . 
Can we compute this (weighted) average more accurately? 
P(Xjm=vvv;K) 
P(Xjm=eee;K) = 
R 
R P(X;jm=vvv;K) d 
P(X;jm=eee;K) d 
12 / 36
Full model 
N data points in a p-dimensional space. 
dependence distribution 
m  Uniform(fvvv; eee; vvi; eei; vii; eiig) 
K jK0  Poisson(1) 
 jK  Dirichlet(0): 
zi j;K P(zi = kj;K) = k 
k jm;K  Wishart1(V0; g0): 
k jk ;m;K  Normal(0; 1 
n0 
k ): 
xi jzi=k; k ;;m;K  Normal(k ;k ): 
0 = 
 
1 
2 
; 
1 
2 
; :::; 
1 
2 
 
0 = X 
n0 = 0:001 
g0 = (p+1)+n0(p+1) 
1n0 
 p + 1 +  
V0 = Cov(X)(g0  p  1) 
13 / 36
Dirichlet 
( 1K 
; 1K 
; :::; 1K 
). 
PK 
k=1 k = 1 
Dirichlet gives us random vectors 
Dirichlet(1; 2; :::; K) 
K = 4,  = (0:01; 0:09; 0:80; 0:10) 
May lead to empty clusters in the prior, and therefore in 
posterior too 
KjX  KTRUE 
Solution3. K  Poisson(1)jK  1 
3Agostino Nobile. Bayesian
nite mixtures: a note on prior speci
cation 
and posterior computation. In: arXiv preprint arXiv:0711.0458 (2007). 
14 / 36
Integration 
Joint probability 
P(X;;; z;;K;m) =P(Xj;; z;;K;m) 
 P(j;z;;K;m)0;n0 
 P(jz;;K;m)V0;g0 
 P(zj;K;m) 
 P(jK;m)0 
 P(K jm) 
 P(m) 
15 / 36
Integration 
In general, 
P(ajb) = 
X 
c 
P(a; cjb) 
P(ajb) = 
Z 
P(a; ejb) de 
P(ajb) = 
X 
c 
P(ajc; b)P(cjb) 
P(ajb) = 
Z 
P(aje; b)P(ejb) de 
16 / 36
Integration 
P(mjX) = 
1X 
K=1 
X 
z 
Z Z Z 
P(;; z;;K;mjX) d d d 
P(KjX) = 
X 
m 
X 
z 
Z Z Z 
P(;; z;;K;mjX) d d d 
P(K;mjX) = 
X 
z 
Z Z Z 
P(;; z;;K;mjX) d d d 
P(zjX) = 
1X 
K=1 
X 
m 
Z Z Z 
P(;; z;;K;mjX) d d d 
17 / 36
Integration 
P(z;K;mjX) = 
Z Z Z 
P(;; z;;K;mjX) d d d 
P(z;K;mjX) = 
Z Z Z 
1 
P(X) 
P(;; z;;K;m;X) d d d 
P(z;K;mjX) = 
1 
P(X) 
Z Z Z 
P(;;jz;K;m;X)P(z;K;m;X) d d P(z;K;m;X) 
P(z;K;mjX) = 
P(X) 
Z Z Z 
P(;;jz;K;m;X) d d d 
18 / 36
Mini-overview 
We speci
ed the model, with all our priors, earlier. How do we get 
our estimates? 
RJMCMC4 would give many estimates of 
P(;; z;;K;mjX). 
Want faster MCMC. 
Solve P(X; z;K;m) analytically. 
Use that to sample z;K;mjX. 
4Peter J. Green. Reversible Jump Markov Chain Monte Carlo computation 
and Bayesian model determination. In: Biometrika 82.4 (Dec. 1995), 
pp. 711{732. doi: 10.1093/biomet/82.4.711. url: 
http://dx.doi.org/10.1093/biomet/82.4.711. 
19 / 36
Mini-overview 
We speci
ed the model, with all our priors, earlier. How do we get 
our estimates? 
RJMCMC4 would give many estimates of 
P(;; z;;K;mjX). 
Want faster MCMC. 
Solve P(X; z;K;m) analytically. 
Use that to sample z;K;mjX. 
Count popular (m), or (K), or (m;K) in sample. P(m;KjX). 
(Proven identical to RJMCMC - dierent MCMC algorithms 
(usually) don't change results, just speed.) 
4Peter J. Green. Reversible Jump Markov Chain Monte Carlo computation 
and Bayesian model determination. In: Biometrika 82.4 (Dec. 1995), 
pp. 711{732. doi: 10.1093/biomet/82.4.711. url: 
http://dx.doi.org/10.1093/biomet/82.4.711. 
19 / 36
Mini-overview 
We speci
ed the model, with all our priors, earlier. How do we get 
our estimates? 
RJMCMC4 would give many estimates of 
P(;; z;;K;mjX). 
Want faster MCMC. 
Solve P(X; z;K;m) analytically. 
Use that to sample z;K;mjX. 
Count popular (m), or (K), or (m;K) in sample. P(m;KjX). 
(Proven identical to RJMCMC - dierent MCMC algorithms 
(usually) don't change results, just speed.) 
If desired, ;;jX; z;K;m is easily generated. 
4Peter J. Green. Reversible Jump Markov Chain Monte Carlo computation 
and Bayesian model determination. In: Biometrika 82.4 (Dec. 1995), 
pp. 711{732. doi: 10.1093/biomet/82.4.711. url: 
http://dx.doi.org/10.1093/biomet/82.4.711. 
19 / 36
Analytical integration 
P(X; z;K;m) = 
Z Z Z 
P(X;;; z;;K;m) d d d 
P(;;;X; z;K;m) = P(;;;X; z;K;m) 
P(;;jX; z;K;m)P(X; z;K;m) = P(X; z;K;mj;;)P(;;) 
P(X; z;K;m) = P(X;z;K;mj;;)P(;;) 
P(;;jX;z;K;m) 
20 / 36
Numerical integration (MCMC) 
Markov Chain Monte Carlo (MCMC) 
Begin with an initial estimate (z1;m1;K1) 
At each iteration, propose to perturb 
(zi ;mi ;Ki ) ) (zi;mi;Ki) 
Similar to current state, to enable a gradual `climb' towards 
the good estimates. 
21 / 36
Numerical integration (MCMC) 
Markov Chain Monte Carlo (MCMC) 
Begin with an initial estimate (z1;m1;K1) 
At each iteration, propose to perturb 
(zi ;mi ;Ki ) ) (zi;mi;Ki) 
Similar to current state, to enable a gradual `climb' towards 
the good estimates. 
h 
De
ne ai = min 
1; P(X;zi;mi;Ki) 
P(X;zi ;mi ;Ki ) 
q(zi ;mi ;Ki jzi;mi;Ki) 
q(zi;mi;Kijzi ;mi ;Ki ) 
i 
21 / 36
Numerical integration (MCMC) 
Markov Chain Monte Carlo (MCMC) 
Begin with an initial estimate (z1;m1;K1) 
At each iteration, propose to perturb 
(zi ;mi ;Ki ) ) (zi;mi;Ki) 
Similar to current state, to enable a gradual `climb' towards 
the good estimates. 
h 
De
ne ai = min 
1; P(X;zi;mi;Ki) 
P(X;zi ;mi ;Ki ) 
q(zi ;mi ;Ki jzi;mi;Ki) 
q(zi;mi;Kijzi ;mi ;Ki ) 
i 
(zi+1;mi+1;Ki+1) = (zi;mi;Ki) with probability ai . 
(zi+1;mi+1;Ki+1) = (zi ;mi ;Ki ) with probability 1ai . 
Resulting estimates will be drawn as z;m;KjX 
21 / 36
Numerical integration (MCMC) 
Markov Chain Monte Carlo (MCMC) 
Begin with an initial estimate (z1;m1;K1) 
At each iteration, propose to perturb 
(zi ;mi ;Ki ) ) (zi;mi;Ki) 
Similar to current state, to enable a gradual `climb' towards 
the good estimates. 
h 
De
ne ai = min 
1; P(X;zi;mi;Ki) 
P(X;zi ;mi ;Ki ) 
q(zi ;mi ;Ki jzi;mi;Ki) 
q(zi;mi;Kijzi ;mi ;Ki ) 
i 
(zi+1;mi+1;Ki+1) = (zi;mi;Ki) with probability ai . 
(zi+1;mi+1;Ki+1) = (zi ;mi ;Ki ) with probability 1ai . 
Resulting estimates will be drawn as z;m;KjX 
`Good' proposals don't aect the distribution, but they do 
improve speed 
21 / 36
The above is too slow. Still too much correlation, slowing the progress. 
So I run six chains, 
(z;KjX;m = vvv) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) 
(z;KjX;m = eee) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) 
(z;KjX;m = vvi) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) 
(z;KjX;m = eei) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) 
(z;KjX;m = vii) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) 
(z;KjX;m = eii) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) 
and combine the results. 
Results should be combined in proportion to P(mjX). 
22 / 36
*VVV* .. iteration: 0/10000 nonEmpty: 1 K: 1 nmi: 0 entropy: 0.000000 
*VVV* .. iteration: 50/10000 nonEmpty: 6 K: 6 nmi: 52.5095 entropy: 1.477233 
*VVV* .. iteration: 100/10000 nonEmpty: 9 K: 9 nmi: 72.713 entropy: 2.045612 
*VVV* .. iteration: 150/10000 nonEmpty: 9 K: 9 nmi: 75.5046 entropy: 2.124148 
*VVV* .. iteration: 200/10000 nonEmpty: 9 K: 9 nmi: 74.8402 entropy: 2.105455 
*VVV* .. iteration: 250/10000 nonEmpty: 10 K: 11 nmi: 77.8969 entropy: 2.191450 
*VVV* .. iteration: 300/10000 nonEmpty: 11 K: 11 nmi: 79.2266 entropy: 2.228856 
*VVV* .. iteration: 350/10000 nonEmpty: 12 K: 12 nmi: 82.1832 entropy: 2.312034 
*VVV* .. iteration: 400/10000 nonEmpty: 12 K: 12 nmi: 82.1832 entropy: 2.312034 
*VVV* .. iteration: 450/10000 nonEmpty: 11 K: 11 nmi: 81.1627 entropy: 2.283326 
*VVV* .. iteration: 500/10000 nonEmpty: 13 K: 13 nmi: 84.8982 entropy: 2.388416 
*VVV* .. iteration: 550/10000 nonEmpty: 13 K: 13 nmi: 84.8982 entropy: 2.388416 
*VVV* .. iteration: 600/10000 nonEmpty: 14 K: 14 nmi: 88.896 entropy: 2.500883 
*VVV* .. iteration: 650/10000 nonEmpty: 14 K: 14 nmi: 88.896 entropy: 2.500883 
*VVV* .. iteration: 700/10000 nonEmpty: 14 K: 15 nmi: 91.0987 entropy: 2.562850 
*VVV* .. iteration: 750/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 
*VVV* .. iteration: 800/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 
*VVV* .. iteration: 850/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 
*VVV* .. iteration: 900/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 
*VVV* .. iteration: 950/10000 nonEmpty: 16 K: 16 nmi: 94.6821 entropy: 2.663661 
*VVV* .. iteration: 1000/10000 nonEmpty: 15 K: 15 nmi: 93.7927 entropy: 2.638641 
*VVV* .. iteration: 1050/10000 nonEmpty: 14 K: 14 nmi: 91.2693 entropy: 2.567652 
*VVV* .. iteration: 1100/10000 nonEmpty: 14 K: 14 nmi: 91.0987 entropy: 2.562850 
*VVV* .. iteration: 1150/10000 nonEmpty: 14 K: 14 nmi: 91.2693 entropy: 2.567652 
*VVV* .. iteration: 1200/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 
*VVV* .. iteration: 1250/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 
*VVV* .. iteration: 1300/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 
*VVV* .. iteration: 1350/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 
*VVV* .. iteration: 1400/10000 nonEmpty: 16 K: 16 nmi: 96.4608 entropy: 2.713701 
*VVV* .. iteration: 1450/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 
23 / 36
(High level) description of complete algorithm:5 
I run six chains (z;KjX;m). In parallel, independently of each 
other. 
There is a variable M which is the `current'/`best' model. 
At iteration i , based on P(X; zi ;m;Ki ) (and other quantities). 
Can be proven that M will be distributed proportional to 
P(Xjm=M). 
5Bradley P Carlin and Siddhartha Chib. Bayesian model choice via Markov 
chain Monte Carlo methods. In: Journal of the Royal Statistical 
Society-Series B Methodological 57.3 (1995), pp. 473{484. 
24 / 36
(High level) description of complete algorithm:5 
I run six chains (z;KjX;m). In parallel, independently of each 
other. 
There is a variable M which is the `current'/`best' model. 
At iteration i , based on P(X; zi ;m;Ki ) (and other quantities). 
Can be proven that M will be distributed proportional to 
P(Xjm=M). 
to work well, need to train good pseudopriors in advance. 
5Bradley P Carlin and Siddhartha Chib. Bayesian model choice via Markov 
chain Monte Carlo methods. In: Journal of the Royal Statistical 
Society-Series B Methodological 57.3 (1995), pp. 473{484. 
24 / 36
Application 
Wine dataset N=178 p=27 
3 regions of Italy 
mclust (3,VVI) 
1 2 3 
1 58 1 
2 5 65 1 
3 48 
MCMC (3,EEI) 
1 2 3 
1 58 1 
2 2 66 3 
3 48 
25 / 36

Mais conteúdo relacionado

Destaque

Usa las vegas_4_hotels_luxueuses
Usa las vegas_4_hotels_luxueusesUsa las vegas_4_hotels_luxueuses
Usa las vegas_4_hotels_luxueusesfilipj2000
 
Haiti pour eve
Haiti pour eveHaiti pour eve
Haiti pour eveourbothy
 
Feudal System (short)
Feudal System (short)Feudal System (short)
Feudal System (short)benstory
 
Tribus de l'omo
Tribus de l'omoTribus de l'omo
Tribus de l'omofilipj2000
 
Cominfo11
Cominfo11Cominfo11
Cominfo11ATD13
 
Présentation d'Open Data Paris au Mobile 2.0 2011
Présentation d'Open Data Paris au Mobile 2.0 2011Présentation d'Open Data Paris au Mobile 2.0 2011
Présentation d'Open Data Paris au Mobile 2.0 2011Mairie de Paris
 
CRFCB AMU evolutions_catalogage_091213_enjeux_1
CRFCB AMU evolutions_catalogage_091213_enjeux_1CRFCB AMU evolutions_catalogage_091213_enjeux_1
CRFCB AMU evolutions_catalogage_091213_enjeux_1nonue12
 
Jean paris 1900-(cons)
Jean paris 1900-(cons)Jean paris 1900-(cons)
Jean paris 1900-(cons)filipj2000
 
EDOMA Présentation 2009
EDOMA Présentation 2009EDOMA Présentation 2009
EDOMA Présentation 2009huntziger
 

Destaque (20)

Usa las vegas_4_hotels_luxueuses
Usa las vegas_4_hotels_luxueusesUsa las vegas_4_hotels_luxueuses
Usa las vegas_4_hotels_luxueuses
 
Consejos en Trabajos con Goma Eva
Consejos en Trabajos con Goma EvaConsejos en Trabajos con Goma Eva
Consejos en Trabajos con Goma Eva
 
Haiti pour eve
Haiti pour eveHaiti pour eve
Haiti pour eve
 
Feudal System (short)
Feudal System (short)Feudal System (short)
Feudal System (short)
 
Tribus de l'omo
Tribus de l'omoTribus de l'omo
Tribus de l'omo
 
hjjjj
hjjjjhjjjj
hjjjj
 
Cominfo11
Cominfo11Cominfo11
Cominfo11
 
Info sacu
Info sacuInfo sacu
Info sacu
 
Présentation d'Open Data Paris au Mobile 2.0 2011
Présentation d'Open Data Paris au Mobile 2.0 2011Présentation d'Open Data Paris au Mobile 2.0 2011
Présentation d'Open Data Paris au Mobile 2.0 2011
 
¡Confianza alegre al trabajar!
¡Confianza alegre al trabajar!¡Confianza alegre al trabajar!
¡Confianza alegre al trabajar!
 
CRFCB AMU evolutions_catalogage_091213_enjeux_1
CRFCB AMU evolutions_catalogage_091213_enjeux_1CRFCB AMU evolutions_catalogage_091213_enjeux_1
CRFCB AMU evolutions_catalogage_091213_enjeux_1
 
Abraham
AbrahamAbraham
Abraham
 
Olympiades Mondiales 2009 Jour4
Olympiades Mondiales 2009  Jour4Olympiades Mondiales 2009  Jour4
Olympiades Mondiales 2009 Jour4
 
Zotero 3.0 - Doctorado Formación en la Sociedad del Conocimiento
Zotero 3.0 - Doctorado Formación en la Sociedad del ConocimientoZotero 3.0 - Doctorado Formación en la Sociedad del Conocimiento
Zotero 3.0 - Doctorado Formación en la Sociedad del Conocimiento
 
Jean paris 1900-(cons)
Jean paris 1900-(cons)Jean paris 1900-(cons)
Jean paris 1900-(cons)
 
EDOMA Présentation 2009
EDOMA Présentation 2009EDOMA Présentation 2009
EDOMA Présentation 2009
 
Info sacu
Info sacuInfo sacu
Info sacu
 
Biodiversidad
BiodiversidadBiodiversidad
Biodiversidad
 
Archivo 2
Archivo 2Archivo 2
Archivo 2
 
Archivo 1
Archivo 1Archivo 1
Archivo 1
 

Semelhante a MCMC for clustering of multivariate-Normal data

Trabajo
TrabajoTrabajo
Trabajoyucai
 
Book*
Book*Book*
Book*LPCO
 
Care giving training institution13
Care giving training institution13Care giving training institution13
Care giving training institution13berhanu taye
 
Ejercicios De Mecanografía
Ejercicios De MecanografíaEjercicios De Mecanografía
Ejercicios De Mecanografíawongaa
 
Ejercicios De MecanografíA
Ejercicios De MecanografíAEjercicios De MecanografíA
Ejercicios De MecanografíAwongaa
 
Huruf jawi bersambung
Huruf jawi bersambungHuruf jawi bersambung
Huruf jawi bersambungHasimah Muda
 
Mapa de un diseño para el PAFSu de una UMF
Mapa de un diseño para el PAFSu de una UMFMapa de un diseño para el PAFSu de una UMF
Mapa de un diseño para el PAFSu de una UMFBryan Bone
 

Semelhante a MCMC for clustering of multivariate-Normal data (14)

Mecanografia.
Mecanografia.Mecanografia.
Mecanografia.
 
Trabajo mecanet
Trabajo mecanetTrabajo mecanet
Trabajo mecanet
 
Trabajo de Mecanet
Trabajo de Mecanet Trabajo de Mecanet
Trabajo de Mecanet
 
Trabajo
TrabajoTrabajo
Trabajo
 
Book*
Book*Book*
Book*
 
Alfabeto Cursiva
Alfabeto CursivaAlfabeto Cursiva
Alfabeto Cursiva
 
Care giving training institution13
Care giving training institution13Care giving training institution13
Care giving training institution13
 
mecanografia
mecanografiamecanografia
mecanografia
 
Ejercicios De Mecanografía
Ejercicios De MecanografíaEjercicios De Mecanografía
Ejercicios De Mecanografía
 
Ejercicios De MecanografíA
Ejercicios De MecanografíAEjercicios De MecanografíA
Ejercicios De MecanografíA
 
Ejercicio mecanografia
Ejercicio mecanografiaEjercicio mecanografia
Ejercicio mecanografia
 
Huruf jawi bersambung
Huruf jawi bersambungHuruf jawi bersambung
Huruf jawi bersambung
 
DHV13
DHV13DHV13
DHV13
 
Mapa de un diseño para el PAFSu de una UMF
Mapa de un diseño para el PAFSu de una UMFMapa de un diseño para el PAFSu de una UMF
Mapa de un diseño para el PAFSu de una UMF
 

Último

Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxBerniceCayabyab1
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)itwameryclare
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfWildaNurAmalia2
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 

Último (20)

Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdfBUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
BUMI DAN ANTARIKSA PROJEK IPAS SMK KELAS X.pdf
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 

MCMC for clustering of multivariate-Normal data

  • 1. MCMC for mixtures of Gaussians, and model selection Aaron McDaid, aaronmcdaid@gmail.com October 30, 2014 1 / 36
  • 2. Six models l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l ll l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l ll l ll l l l l l l l l l lll l l l l l l l l l l l l l ll l l l l l l ll l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l ll l l l ll l l l l l l l l l l l l ll l l l l l ll ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l −20 0 20 40 60 0 20 40 60 80 100 V1 V2 (a) 1. vvv l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l −10 −5 0 5 10 −30 −20 −10 0 10 20 30 V1 V2 (b) 2. eee 2 / 36
  • 3. Six models l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l −10 0 10 20 30 −20 −10 0 10 V1 V2 (a) 3. vvi l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l ll l l llll l l l l l ll l l l l l l l l l l l l l l l l l l l l l ll l ll ll l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l ll l l l l ll l l l l l l l l l l l ll l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l ll l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l −30 −20 −10 0 10 20 30 −30 −20 −10 0 10 20 30 V1 V2 (b) 4. eei 3 / 36
  • 4. Six models l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l ll l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll ll l l ll l l l l l l l l l l l l ll l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l lll l l l l l l l l l l l l l l l l l l l l l l ll l ll l l l l l l l l l l l l l l l l l l l ll l l l l l l l lll l l l l l ll l l l ll l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l ll l l l l l l l l l l l l l l ll ll l l −20 −10 0 10 20 −20 −10 0 10 20 V1 V2 (a) 5. vii l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l ll l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l ll l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l −100 0 100 200 −200 −100 0 100 200 V1 V2 (b) 6. eii 4 / 36
  • 5. Old Faithful N=272 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 50 60 70 80 90 eruptions waiting Old Faithful - Yellowstone National Park 5 / 36
  • 6. Old Faithful N=272 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 50 60 70 80 90 V1 V2 Old Faithful - Yellowstone National Park 6 / 36
  • 8. ne the mclust model Bayes Factor and BIC - connection between mclust and MCMC Priors Integration (analytical and numerical) MCMC algorithm1 Selecting from the six models via MCMC Evaluation (on synthetic data) One application 1Mahlet G. Tadesse, Naijun Sha, and Marina Vannucci. Bayesian Variable Selection in Clustering High-Dimensional Data". In: Journal of the American Statistical Association 100.470 (June 2005), pp. 602{617. issn: 0162-1459. doi: 10.1198/016214504000001565. url: http://www.stat.rice.edu/~{}marina/papers/jasa05.pdf. 7 / 36
  • 9. Goals Not a `shootout' with mclust See what MCMC can do Calculate the Bayes Factor more precisely - is it better than BIC? Push to larger numbers of clusters 8 / 36
  • 10. Basic model N data points in a p-dimensional space. m 2 (fvvv; eee; vvi; eei; vii; eiig) K number of clusters k covariance of clusterk k mean Pof cluster k K k=1 k = 1 zi P(zi = k) = k xi jzi=k Normal(k ;k ): Mixture models P(xi jzi=k) = N(xi jk ;k ) P(xi ) = XK k=1 kN(xi jk ;k ) 9 / 36
  • 11. mclust MLE (Maximum Likelihood Estimate) R package mclust2 Given (K;m), use Expectation-Maximization (EM) algorithm to estimate (;;). P(Xjk ;k ;;m;K) Requires running EM for each possible combination of (K;m). Hundreds of runs may be required. f(K = 2;m = VVI); (K = 3;m = EEI ); (K = 50;m = EEI ); : : : g Then use BIC to select among the models. 2Chris Fraley and Adrian E. Raftery. MCLUST: Software for model-based cluster analysis. In: Journal of Classi
  • 12. cation 16.2 (1999), pp. 297{306. 10 / 36
  • 13. mclust Why do we need model selection? vvv vvi vii eee eei eii De
  • 14. ne = (;;). P(Xj=^ eee;K;m=vvv;K) = P(Xj=^ eee;K;m=eee;K) Cannot maximize P(Xj;m;K) Count the degrees-of-freedom f , in order to penalize the more complex model. AIC = 2 log P(XjMLE m;K ;m;K) 2f BIC = 2 log P(XjMLE m;K ;m;K) log(N)f 11 / 36
  • 15. Bayes Factor (BIC) Bayesian Information Criterion BIC 2 log Bayes Factor z }| { (P(Xjm;K)) P(X = Xobs jm;K) Informally, the average P(Xj;m;K) over all . Can we compute this (weighted) average more accurately? 12 / 36
  • 16. Bayes Factor (BIC) Bayesian Information Criterion BIC 2 log Bayes Factor z }| { (P(Xjm;K)) P(X = Xobs jm;K) Informally, the average P(Xj;m;K) over all . Can we compute this (weighted) average more accurately? P(Xjm=vvv;K) P(Xjm=eee;K) = R R P(X;jm=vvv;K) d P(X;jm=eee;K) d 12 / 36
  • 17. Full model N data points in a p-dimensional space. dependence distribution m Uniform(fvvv; eee; vvi; eei; vii; eiig) K jK0 Poisson(1) jK Dirichlet(0): zi j;K P(zi = kj;K) = k k jm;K Wishart1(V0; g0): k jk ;m;K Normal(0; 1 n0 k ): xi jzi=k; k ;;m;K Normal(k ;k ): 0 = 1 2 ; 1 2 ; :::; 1 2 0 = X n0 = 0:001 g0 = (p+1)+n0(p+1) 1n0 p + 1 + V0 = Cov(X)(g0 p 1) 13 / 36
  • 18. Dirichlet ( 1K ; 1K ; :::; 1K ). PK k=1 k = 1 Dirichlet gives us random vectors Dirichlet(1; 2; :::; K) K = 4, = (0:01; 0:09; 0:80; 0:10) May lead to empty clusters in the prior, and therefore in posterior too KjX KTRUE Solution3. K Poisson(1)jK 1 3Agostino Nobile. Bayesian
  • 19. nite mixtures: a note on prior speci
  • 20. cation and posterior computation. In: arXiv preprint arXiv:0711.0458 (2007). 14 / 36
  • 21. Integration Joint probability P(X;;; z;;K;m) =P(Xj;; z;;K;m) P(j;z;;K;m)0;n0 P(jz;;K;m)V0;g0 P(zj;K;m) P(jK;m)0 P(K jm) P(m) 15 / 36
  • 22. Integration In general, P(ajb) = X c P(a; cjb) P(ajb) = Z P(a; ejb) de P(ajb) = X c P(ajc; b)P(cjb) P(ajb) = Z P(aje; b)P(ejb) de 16 / 36
  • 23. Integration P(mjX) = 1X K=1 X z Z Z Z P(;; z;;K;mjX) d d d P(KjX) = X m X z Z Z Z P(;; z;;K;mjX) d d d P(K;mjX) = X z Z Z Z P(;; z;;K;mjX) d d d P(zjX) = 1X K=1 X m Z Z Z P(;; z;;K;mjX) d d d 17 / 36
  • 24. Integration P(z;K;mjX) = Z Z Z P(;; z;;K;mjX) d d d P(z;K;mjX) = Z Z Z 1 P(X) P(;; z;;K;m;X) d d d P(z;K;mjX) = 1 P(X) Z Z Z P(;;jz;K;m;X)P(z;K;m;X) d d P(z;K;m;X) P(z;K;mjX) = P(X) Z Z Z P(;;jz;K;m;X) d d d 18 / 36
  • 26. ed the model, with all our priors, earlier. How do we get our estimates? RJMCMC4 would give many estimates of P(;; z;;K;mjX). Want faster MCMC. Solve P(X; z;K;m) analytically. Use that to sample z;K;mjX. 4Peter J. Green. Reversible Jump Markov Chain Monte Carlo computation and Bayesian model determination. In: Biometrika 82.4 (Dec. 1995), pp. 711{732. doi: 10.1093/biomet/82.4.711. url: http://dx.doi.org/10.1093/biomet/82.4.711. 19 / 36
  • 28. ed the model, with all our priors, earlier. How do we get our estimates? RJMCMC4 would give many estimates of P(;; z;;K;mjX). Want faster MCMC. Solve P(X; z;K;m) analytically. Use that to sample z;K;mjX. Count popular (m), or (K), or (m;K) in sample. P(m;KjX). (Proven identical to RJMCMC - dierent MCMC algorithms (usually) don't change results, just speed.) 4Peter J. Green. Reversible Jump Markov Chain Monte Carlo computation and Bayesian model determination. In: Biometrika 82.4 (Dec. 1995), pp. 711{732. doi: 10.1093/biomet/82.4.711. url: http://dx.doi.org/10.1093/biomet/82.4.711. 19 / 36
  • 30. ed the model, with all our priors, earlier. How do we get our estimates? RJMCMC4 would give many estimates of P(;; z;;K;mjX). Want faster MCMC. Solve P(X; z;K;m) analytically. Use that to sample z;K;mjX. Count popular (m), or (K), or (m;K) in sample. P(m;KjX). (Proven identical to RJMCMC - dierent MCMC algorithms (usually) don't change results, just speed.) If desired, ;;jX; z;K;m is easily generated. 4Peter J. Green. Reversible Jump Markov Chain Monte Carlo computation and Bayesian model determination. In: Biometrika 82.4 (Dec. 1995), pp. 711{732. doi: 10.1093/biomet/82.4.711. url: http://dx.doi.org/10.1093/biomet/82.4.711. 19 / 36
  • 31. Analytical integration P(X; z;K;m) = Z Z Z P(X;;; z;;K;m) d d d P(;;;X; z;K;m) = P(;;;X; z;K;m) P(;;jX; z;K;m)P(X; z;K;m) = P(X; z;K;mj;;)P(;;) P(X; z;K;m) = P(X;z;K;mj;;)P(;;) P(;;jX;z;K;m) 20 / 36
  • 32. Numerical integration (MCMC) Markov Chain Monte Carlo (MCMC) Begin with an initial estimate (z1;m1;K1) At each iteration, propose to perturb (zi ;mi ;Ki ) ) (zi;mi;Ki) Similar to current state, to enable a gradual `climb' towards the good estimates. 21 / 36
  • 33. Numerical integration (MCMC) Markov Chain Monte Carlo (MCMC) Begin with an initial estimate (z1;m1;K1) At each iteration, propose to perturb (zi ;mi ;Ki ) ) (zi;mi;Ki) Similar to current state, to enable a gradual `climb' towards the good estimates. h De
  • 34. ne ai = min 1; P(X;zi;mi;Ki) P(X;zi ;mi ;Ki ) q(zi ;mi ;Ki jzi;mi;Ki) q(zi;mi;Kijzi ;mi ;Ki ) i 21 / 36
  • 35. Numerical integration (MCMC) Markov Chain Monte Carlo (MCMC) Begin with an initial estimate (z1;m1;K1) At each iteration, propose to perturb (zi ;mi ;Ki ) ) (zi;mi;Ki) Similar to current state, to enable a gradual `climb' towards the good estimates. h De
  • 36. ne ai = min 1; P(X;zi;mi;Ki) P(X;zi ;mi ;Ki ) q(zi ;mi ;Ki jzi;mi;Ki) q(zi;mi;Kijzi ;mi ;Ki ) i (zi+1;mi+1;Ki+1) = (zi;mi;Ki) with probability ai . (zi+1;mi+1;Ki+1) = (zi ;mi ;Ki ) with probability 1ai . Resulting estimates will be drawn as z;m;KjX 21 / 36
  • 37. Numerical integration (MCMC) Markov Chain Monte Carlo (MCMC) Begin with an initial estimate (z1;m1;K1) At each iteration, propose to perturb (zi ;mi ;Ki ) ) (zi;mi;Ki) Similar to current state, to enable a gradual `climb' towards the good estimates. h De
  • 38. ne ai = min 1; P(X;zi;mi;Ki) P(X;zi ;mi ;Ki ) q(zi ;mi ;Ki jzi;mi;Ki) q(zi;mi;Kijzi ;mi ;Ki ) i (zi+1;mi+1;Ki+1) = (zi;mi;Ki) with probability ai . (zi+1;mi+1;Ki+1) = (zi ;mi ;Ki ) with probability 1ai . Resulting estimates will be drawn as z;m;KjX `Good' proposals don't aect the distribution, but they do improve speed 21 / 36
  • 39. The above is too slow. Still too much correlation, slowing the progress. So I run six chains, (z;KjX;m = vvv) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) (z;KjX;m = eee) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) (z;KjX;m = vvi) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) (z;KjX;m = eei) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) (z;KjX;m = vii) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) (z;KjX;m = eii) { (zi ;mi = vvv;Ki ) ) (zi;mi = vvv;Ki) and combine the results. Results should be combined in proportion to P(mjX). 22 / 36
  • 40. *VVV* .. iteration: 0/10000 nonEmpty: 1 K: 1 nmi: 0 entropy: 0.000000 *VVV* .. iteration: 50/10000 nonEmpty: 6 K: 6 nmi: 52.5095 entropy: 1.477233 *VVV* .. iteration: 100/10000 nonEmpty: 9 K: 9 nmi: 72.713 entropy: 2.045612 *VVV* .. iteration: 150/10000 nonEmpty: 9 K: 9 nmi: 75.5046 entropy: 2.124148 *VVV* .. iteration: 200/10000 nonEmpty: 9 K: 9 nmi: 74.8402 entropy: 2.105455 *VVV* .. iteration: 250/10000 nonEmpty: 10 K: 11 nmi: 77.8969 entropy: 2.191450 *VVV* .. iteration: 300/10000 nonEmpty: 11 K: 11 nmi: 79.2266 entropy: 2.228856 *VVV* .. iteration: 350/10000 nonEmpty: 12 K: 12 nmi: 82.1832 entropy: 2.312034 *VVV* .. iteration: 400/10000 nonEmpty: 12 K: 12 nmi: 82.1832 entropy: 2.312034 *VVV* .. iteration: 450/10000 nonEmpty: 11 K: 11 nmi: 81.1627 entropy: 2.283326 *VVV* .. iteration: 500/10000 nonEmpty: 13 K: 13 nmi: 84.8982 entropy: 2.388416 *VVV* .. iteration: 550/10000 nonEmpty: 13 K: 13 nmi: 84.8982 entropy: 2.388416 *VVV* .. iteration: 600/10000 nonEmpty: 14 K: 14 nmi: 88.896 entropy: 2.500883 *VVV* .. iteration: 650/10000 nonEmpty: 14 K: 14 nmi: 88.896 entropy: 2.500883 *VVV* .. iteration: 700/10000 nonEmpty: 14 K: 15 nmi: 91.0987 entropy: 2.562850 *VVV* .. iteration: 750/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 *VVV* .. iteration: 800/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 *VVV* .. iteration: 850/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 *VVV* .. iteration: 900/10000 nonEmpty: 15 K: 15 nmi: 92.2898 entropy: 2.596360 *VVV* .. iteration: 950/10000 nonEmpty: 16 K: 16 nmi: 94.6821 entropy: 2.663661 *VVV* .. iteration: 1000/10000 nonEmpty: 15 K: 15 nmi: 93.7927 entropy: 2.638641 *VVV* .. iteration: 1050/10000 nonEmpty: 14 K: 14 nmi: 91.2693 entropy: 2.567652 *VVV* .. iteration: 1100/10000 nonEmpty: 14 K: 14 nmi: 91.0987 entropy: 2.562850 *VVV* .. iteration: 1150/10000 nonEmpty: 14 K: 14 nmi: 91.2693 entropy: 2.567652 *VVV* .. iteration: 1200/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 *VVV* .. iteration: 1250/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 *VVV* .. iteration: 1300/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 *VVV* .. iteration: 1350/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 *VVV* .. iteration: 1400/10000 nonEmpty: 16 K: 16 nmi: 96.4608 entropy: 2.713701 *VVV* .. iteration: 1450/10000 nonEmpty: 17 K: 17 nmi: 97.9391 entropy: 2.755290 23 / 36
  • 41. (High level) description of complete algorithm:5 I run six chains (z;KjX;m). In parallel, independently of each other. There is a variable M which is the `current'/`best' model. At iteration i , based on P(X; zi ;m;Ki ) (and other quantities). Can be proven that M will be distributed proportional to P(Xjm=M). 5Bradley P Carlin and Siddhartha Chib. Bayesian model choice via Markov chain Monte Carlo methods. In: Journal of the Royal Statistical Society-Series B Methodological 57.3 (1995), pp. 473{484. 24 / 36
  • 42. (High level) description of complete algorithm:5 I run six chains (z;KjX;m). In parallel, independently of each other. There is a variable M which is the `current'/`best' model. At iteration i , based on P(X; zi ;m;Ki ) (and other quantities). Can be proven that M will be distributed proportional to P(Xjm=M). to work well, need to train good pseudopriors in advance. 5Bradley P Carlin and Siddhartha Chib. Bayesian model choice via Markov chain Monte Carlo methods. In: Journal of the Royal Statistical Society-Series B Methodological 57.3 (1995), pp. 473{484. 24 / 36
  • 43. Application Wine dataset N=178 p=27 3 regions of Italy mclust (3,VVI) 1 2 3 1 58 1 2 5 65 1 3 48 MCMC (3,EEI) 1 2 3 1 58 1 2 2 66 3 3 48 25 / 36
  • 44. Synthetic data N = 400 K 2 f5; 10; 20g p 2 f16; 4g g0 2 fp; p + 1; p + 2g n0 2 f0:001; 0:01; 0:1g m 2 fvvv; eee; vvi; eei; vii; eiig 324 kinds of dataset. 5 realizations of each. A total of 1620 datasets. Ran mclust and MCMC algorithm on each 26 / 36
  • 45. N=400 K=5 m = vvv;K = 5; p = 16 mclust ^K ^m 36 5 VVV 4 3 VVV 3 6 VVV 1 4 VVV 1 2 VVV MCMC ^K ^m 39 VVV 5 3 VVV 4 1 VVI 8 1 VVI 11 1 EEE 10 m = vvv;K = 5; p = 4 ^m ^K 40 5 VVV 2 6 VVV 1 8 VVV 1 7 VVV 1 4 VVV ^m ^K 43 5 VVV 1 8 EEE 1 4 VVV 27 / 36
  • 46. N=400 K=20 m = eee;K = 20; p = 16 mclust ^K ^m 28 20 EEE 4 23 EEE 3 27 EEE 2 24 EEE 2 22 EEE 2 21 EEE ... MCMC ^K ^m 45 20 EEE m = eee;K = 20; p = 4 ^m ^K 13 20 EEE 8 19 EEE 4 23 EEE 3 18 EEE 3 15 EEE ... ^m ^K 28 20 EEE 4 17 EEE 3 16 EEE 3 13 EEE 3 12 EEE ... 28 / 36
  • 47. Synthetic data N=100 K=20 V1 −40 0 20 40 l l ll l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l lll l ll l l l ll l ll l l l l ll l l l l l l l l l l l l l l ll ll l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l ll l l l l l l l l l ll l l l l l l l l l l l l l −40 0 20 40 l l l l l l l ll l l l l l l ll l l l l l l l ll ll l l l l l ll l l lll l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l ll l l l l l l ll l l l l l l l l l l l ll l l ll l l l ll l l l lll l l l ll l l l l l l l l l l l l l l l l l l −40 0 20 40 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l ll l l l ll l l ll l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l ll l l l l lll l l l l l l l l l l l l l l l l l l l l l l l l l l −40 0 20 40 −40 0 20 l l l l ll l l l l l l l l l l l l l l l l l l l l l ll l l l l l ll l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l −40 0 20 l l l l l l ll l l ll l l l l l l l ll l l l l l l l l l l l l l l l ll l l ll l l ll l l ll l l l l l l l l l l l l l l l l l l l l V2 l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l ll l l l l ll l l l l l l ll l l l l l l l l l l l l l ll l l l ll l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l ll l l l l ll l ll ll l ll l l l l l l l ll l l l l ll l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l ll l l l ll l l ll l l ll l l l l l l l l l l l l l l l l l l ll l ll l l l l l l ll l ll l l l l l ll l l l l l l l l l l l l l ll l l ll l l l l l l l ll l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l ll l ll ll l l l l l l l l l l l l l l l l l l l l l l l l ll l l ll l l l l ll ll ll ll l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l ll l l l l l ll l l l ll l lll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l ll l ll l l ll l l l l l l ll l lll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l V3 l l ll l l l l l ll ll ll l l l l l l l ll l l l l l l l l ll l l l l l l ll l l l ll l lll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l ll l l l l l l l l l l lll l l l l l l l ll l l llll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l ll l lll l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l ll l l l ll l ll l l l l l l l l ll l l l l l l l l l l l l l l l l l l −40 0 20 l l l l l l l l l l l l l l l ll l l l l l l l l l ll l l l l l l lll l l ll l l l ll l ll l l l l l l l l l l ll ll l l l ll l l l l l l l l l l −40 0 20 ll l l l l l l ll l l l l l l l l l l ll ll l ll l l l l l l l ll l l l l l l ll l l l l l l ll l l l l l l l l l l ll l l l lll l ll l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l ll l lll l ll l l l l ll l l l l l l l l l ll l l l l l l l l l l l ll ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l ll l ll l l l ll l l l l l l l l l l l l l l l l ll l l l V4 ll l l l l l l l l l l l l l l l l l ll l l ll l l l l l l l l ll l l l l ll l l l l l l l l l l l ll ll l l l l l ll l l l l l l l l ll l l lll l l l l ll l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l ll lll l l l l l l l l l l l ll l l ll l l lll l l l l l l l l l l l l l l l l l l l ll l l l l l ll l l l l l l ll l l l l l l l l l l l l l l l ll l l l l l l ll l ll l l l lll l l l l l l l l l ll l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l ll l lll l ll l l l l l l l l l l l l l ll l l l l l l ll l l l l l l l l ll l l l l ll l l l l l l l l l l l l l l l l l l l l lll l l ll l ll l l l l l ll l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l ll l l l l l l l l l ll l l l l l l ll l l l l l l l l l l l l l l l l l l l ll ll ll l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l ll l l l l ll l lll l l l l l l l l l l l l l l ll l l l l l ll l l l lll ll ll l l l l l l l l ll l l l l l l l l l l l l l l l l l l l lll l l l ll l l ll l l l l l l l l l l l l l l l l l l l l l V5 l l l l l l l l l l l l l l l l l l ll l l l l l l l ll l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l ll l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l −40 0 20 l l l l l l l l l l l l l l l ll l ll l l l l l ll l l l ll l l l l l l l l l l l l l l l l l l l l l −40 0 20 l l l l l l l ll l l l l l l l l ll l l l l l l l l l l l l l l l l l l l ll l l l l ll l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l ll l l l l l l ll l l l l l l l l ll l l ll l l lll l ll l l l l ll l l l l l l l l l l l l l l l l l ll l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l ll l l l l l l l l l l l l l l l l l l l l l l ll l l l ll l l l l l l ll l ll l ll l l l l l l l l ll l l l l l l l l l l ll l l l lll l l l l l ll l l l l l l l l l l l ll l l l l l l l l l l l l l l l ll l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l ll l V6 lll l l l ll ll l l l ll l l l l l ll l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l ll l l l l ll l l ll l l l l ll l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l ll l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l ll l l ll l l ll l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l ll l l l l l l l l ll l l l l l l l ll l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l ll l l l ll l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l V7 −40 0 20 l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l ll l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l −40 0 20 40 −40 0 20 l l ll l l l l l lll l l l l l l l l l l l ll l l l l l l l ll l l l ll l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l ll l lll l l l l l l l l lll l ll l l l lll ll ll l l l l l l l l l ll l l l ll l l l l l ll l l l l l l l ll ll l l l l l −40 0 20 40 ll l l l l lll l l l l l l l l l ll l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l lll l l l l l ll l l l l ll ll l l l ll ll l l l l l l l l lll l l l l l l l l ll l l l l lll l l l l l ll l ll l l l l l l l l l l l l l l ll l l l l l l l l l l l −40 0 20 40 ll l l l lllll l l l ll l l l l ll l l l l l l l l l l l l ll l ll l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l lll l l l l l l l l l l l l l lll l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l −40 0 20 40 ll l l l l lll ll l l l ll l l l l l l l l l ll l ll l l l l lll l l l l l l l l l l l l l l l l l l l l ll l l l l l V8 29 / 36
  • 48. Synthetic data N=100 K=20 l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l −40 −20 0 20 40 −40 −20 0 20 40 V1 V2 30 / 36
  • 49. Synthetic data N=100 K=20 l l l l ll l l l l l l l l l l l l l l l l l −40 −20 0 20 40 −40 −20 0 20 40 V1 V2 31 / 36
  • 50. N = 100;K = 20; p = 8;m = vvv . 15 such datasets mclust ^K ^m 5 VVV 7 VVV 16 VII 19 VII 22 EII 23 EEE 23 EEE 25 EII 25 EII 26 EII 27 EEE 29 EEE 29 EEI 29 EII 34 EEE MCMC ^K ^m 17 VVV 17 VVV 17 VVV 17 VVV 18 VVV 18 VVV 18 VVV 18 VVV 19 VVV 19 VVV 19 VVV 19 VVV 20 VVV 20 VVV 20 VVV 32 / 36
  • 51. N = 100;K = 20; p = 8;m = eee . 15 such datasets mclust ^K ^m 19 EEE 19 EEE 19 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 21 EEE 22 EEE MCMC ^K ^m 18 EEE 19 EEE 19 EEE 19 EEE 19 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 20 EEE 33 / 36
  • 52. N = 100;K = 20; p = 8;m = vvi . 15 such datasets mclust ^K ^m 11 VVI 13 VVI 14 VVI 14 VVI 15 VVI 15 VVI 16 VVI 17 VVI 18 VVI 19 EEI 19 VVI 20 VII 20 VVI 21 EEI 24 EII MCMC ^K ^m 17 VVI 18 VVI 19 VVI 19 VVI 19 VVI 20 VVI 20 VVI 20 VVI 20 VVI 20 VVI 20 VVI 20 VVI 20 VVI 20 VVI 20 VVI 34 / 36
  • 53. N = 100;K = 20; p = 8;m = vii . 15 such datasets mclust ^K ^m 5 VVV 8 VII 12 VII 13 VII 17 VII 17 VII 18 VII 18 VII 18 VII 19 VII 19 VII 19 VII 20 VII 21 EII 32 EEE MCMC ^K ^m 14 VII 16 VVV 17 VVV 19 VII 19 VII 19 VII 19 VII 19 VII 19 VII 19 VII 20 VII 20 VII 20 VII 20 VII 20 VII 35 / 36
  • 54. Concluding remarks V** more dicult that E** VVV more dicult that VVI and VII Large K, small N, most dicult MCMC excels here (At
  • 55. rst, I expected dierently) Should repeat N = 100;K 2 f10; 20g across more p, more n0, et cetera. 36 / 36