SlideShare uma empresa Scribd logo
1 de 9
Baixar para ler offline
Confronting the
Partition Function
Lecture slides for Chapter 18 of Deep Learning
www.deeplearningbook.org
Ian Goodfellow
Last updated 2017-12-29
(Goodfellow 2017)
Unnormalized models
t many probabilistic models (commonly known as undi-
e defined by an unnormalized probability distribution
p̃ by dividing by a partition function Z(✓) to obtain a
n:
p(x; ✓) =
1
Z(✓)
p̃(x; ✓). (18.1)
integral (for continuous variables) or sum (for discrete
alized probability of all states:
Z
p̃(x)dx (18.2)
X
p̃(x). (18.3)
re defined by an unnormalized probability distribution
p̃ by dividing by a partition function Z(✓) to obtain a
on:
p(x; ✓) =
1
Z(✓)
p̃(x; ✓). (18.1)
integral (for continuous variables) or sum (for discrete
alized probability of all states:
Z
p̃(x)dx (18.2)
X
x
p̃(x). (18.3)
table for many interesting models.
where Z is
or
(Goodfellow 2017)
Gradient of log-likelihood
Positive phase:
push up on data points
Negative phase:
push down model
samples
arning undirected models by maximum likelihood particularly
he partition function depends on the parameters. The gradient of
d with respect to the parameters has a term corresponding to the
partition function:
r✓ log p(x; ✓) = r✓ log p̃(x; ✓) r✓ log Z(✓). (18.4)
l-known decomposition into the positive phase and negative
g.
directed models of interest, the negative phase is difficult. Models
ariables or with few interactions between latent variables typically
positive phase. The quintessential example of a model with a
positive phase and a difficult negative phase is the RBM, which has
t are conditionally independent from each other given the visible
where the positive phase is difficult, with complicated interactions
ariables, is primarily covered in chapter 19. This chapter focuses
(Goodfellow 2017)
Negative phase sampling
nally independent from each other given the visible
ive phase is difficult, with complicated interactions
marily covered in chapter 19. This chapter focuses
ve phase.
the gradient of log Z:
r✓ log Z (18.5)
=
r✓Z
Z
(18.6)
=
r✓
P
x p̃(x)
Z
(18.7)
=
P
x r✓p̃(x)
Z
. (18.8)
NTING THE PARTITION FUNCTION
= Ex⇠p(x)r✓ log p̃(x). (18.13)
de use of summation over discrete x, but a similar result
n over continuous x. In the continuous version of the
iz’s rule for differentiation under the integral sign to obtain
r✓
Z
p̃(x)dx =
Z
r✓p̃(x)dx. (18.14)
(Goodfellow 2017)
Basic learning algorithm for
undirected models
• For each minibatch:
• Generate model samples
• Compute positive phase using data samples
• Compute negative phase using model samples
• Combine positive and negative phases, do a
gradient step to update parameters
(Goodfellow 2017)
CHAPTER 18. CONFRONTING THE PARTITION FUNCTION
x
p(x)
The positive phase
pmodel(x)
pdata(x)
x
p(x)
The negative phase
pmodel(x)
pdata(x)
Figure 18.1: The view of algorithm 18.1 as having a “positive phase” and a “negative
phase.” (Left)In the positive phase, we sample points from the data distribution and
push up on their unnormalized probability. This means points that are likely in the
data get pushed up on more. (Right)In the negative phase, we sample points from the
model distribution and push down on their unnormalized probability. This counteracts
the positive phase’s tendency to just add a large constant to the unnormalized probability
everywhere. When the data distribution and the model distribution are equal, the positive
phase has the same chance to push up at a point as the negative phase has to push down.
When this occurs, there is no longer any gradient (in expectation), and training must
terminate.
for dreaming in humans and other animals (Crick and Mitchison, 1983), the idea
being that the brain maintains a probabilistic model of the world and follows the
(Goodfellow 2017)
Challenge: model samples are
slow
• Undirected models usually need Markov chains
• Naive approach: run the Markov chain for a long time
starting from random initialization each minibatch
• Speed tricks:
• Contrastive divergence: start the Markov chain from data
• Persistent contrastive divergence: for each minibatch,
continue the Markov chain from where it was for the
previous minibatch
(Goodfellow 2017)
Sidestep the problem
• Use other criteria besides likelihood so that there is
no need to compute Z or its gradient
• Pseudolikelihood
• Score matching
• Ratio matching
• Noise contrastive estimation
(Goodfellow 2017)
Estimating the Partition
Function
• To evaluate a trained model, we want to know the
likelihood
• This requires estimating Z, even if we trained using
a method that doesn’t differentiate Z
• Can estimate Z using annealed importance sampling

Mais conteúdo relacionado

Mais procurados

Physics_150612_01
Physics_150612_01Physics_150612_01
Physics_150612_01Art Traynor
 
Variational Inference
Variational InferenceVariational Inference
Variational InferenceTushar Tank
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsCaleb (Shiqiang) Jin
 
Bayesian model choice in cosmology
Bayesian model choice in cosmologyBayesian model choice in cosmology
Bayesian model choice in cosmologyChristian Robert
 
The Graph Minor Theorem: a walk on the wild side of graphs
The Graph Minor Theorem: a walk on the wild side of graphsThe Graph Minor Theorem: a walk on the wild side of graphs
The Graph Minor Theorem: a walk on the wild side of graphsMarco Benini
 
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...Understanding the Differences between the erfc(x) and the Q(z) functions: A S...
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...Ismael Torres-Pizarro, PhD, PE, Esq.
 
20191123 bayes dl-jp
20191123 bayes dl-jp20191123 bayes dl-jp
20191123 bayes dl-jpTaku Yoshioka
 
Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Christian Robert
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsChristian Robert
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distancesChristian Robert
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distancesChristian Robert
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chaptersChristian Robert
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapterChristian Robert
 

Mais procurados (20)

Physics_150612_01
Physics_150612_01Physics_150612_01
Physics_150612_01
 
Variational Inference
Variational InferenceVariational Inference
Variational Inference
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear models
 
Chapter 4
Chapter 4Chapter 4
Chapter 4
 
Bayesian model choice in cosmology
Bayesian model choice in cosmologyBayesian model choice in cosmology
Bayesian model choice in cosmology
 
The Graph Minor Theorem: a walk on the wild side of graphs
The Graph Minor Theorem: a walk on the wild side of graphsThe Graph Minor Theorem: a walk on the wild side of graphs
The Graph Minor Theorem: a walk on the wild side of graphs
 
Pattern baysin
Pattern baysinPattern baysin
Pattern baysin
 
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...Understanding the Differences between the erfc(x) and the Q(z) functions: A S...
Understanding the Differences between the erfc(x) and the Q(z) functions: A S...
 
20191123 bayes dl-jp
20191123 bayes dl-jp20191123 bayes dl-jp
20191123 bayes dl-jp
 
Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximations
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distances
 
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distances
 
ABC workshop: 17w5025
ABC workshop: 17w5025ABC workshop: 17w5025
ABC workshop: 17w5025
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chapters
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapter
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 

Semelhante a 18_partition.pdf

Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Olga Zinkevych
 
Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"tuxette
 
Synthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptxSynthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptxRupeshKumar301638
 
Lec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scgLec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scgRonald Teo
 
Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...
Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...
Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...Kohei Hayashi
 
GDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentGDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentElectronic Arts / DICE
 
5.3 Graphs of Polynomial Functions
5.3 Graphs of Polynomial Functions5.3 Graphs of Polynomial Functions
5.3 Graphs of Polynomial Functionssmiller5
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Valentin De Bortoli
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep LearningRayKim51
 
Reproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishReproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishtuxette
 
Inria Tech Talk - La classification de données complexes avec MASSICCC
Inria Tech Talk - La classification de données complexes avec MASSICCCInria Tech Talk - La classification de données complexes avec MASSICCC
Inria Tech Talk - La classification de données complexes avec MASSICCCStéphanie Roger
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksMLReview
 

Semelhante a 18_partition.pdf (20)

MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...
 
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
 
06 mlp
06 mlp06 mlp
06 mlp
 
Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"Reading revue of "Inferring Multiple Graphical Structures"
Reading revue of "Inferring Multiple Graphical Structures"
 
MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...
MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...
MUMS Opening Workshop - An Overview of Reduced-Order Models and Emulators (ED...
 
Distributed ADMM
Distributed ADMMDistributed ADMM
Distributed ADMM
 
Synthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptxSynthetic Image Data Generation using GAN &Triple GAN.pptx
Synthetic Image Data Generation using GAN &Triple GAN.pptx
 
Lec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scgLec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scg
 
Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...
Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...
Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal...
 
GDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentGDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game Development
 
5.3 Graphs of Polynomial Functions
5.3 Graphs of Polynomial Functions5.3 Graphs of Polynomial Functions
5.3 Graphs of Polynomial Functions
 
Dag in mmhc
Dag in mmhcDag in mmhc
Dag in mmhc
 
04 numerical
04 numerical04 numerical
04 numerical
 
Climate Extremes Workshop - Assessing models for estimation and methods for u...
Climate Extremes Workshop - Assessing models for estimation and methods for u...Climate Extremes Workshop - Assessing models for estimation and methods for u...
Climate Extremes Workshop - Assessing models for estimation and methods for u...
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
Quantitative Redistricting Workshop - Machine Learning for Fair Redistricting...
Quantitative Redistricting Workshop - Machine Learning for Fair Redistricting...Quantitative Redistricting Workshop - Machine Learning for Fair Redistricting...
Quantitative Redistricting Workshop - Machine Learning for Fair Redistricting...
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep Learning
 
Reproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfishReproducibility and differential analysis with selfish
Reproducibility and differential analysis with selfish
 
Inria Tech Talk - La classification de données complexes avec MASSICCC
Inria Tech Talk - La classification de données complexes avec MASSICCCInria Tech Talk - La classification de données complexes avec MASSICCC
Inria Tech Talk - La classification de données complexes avec MASSICCC
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial Networks
 

Mais de KSChidanandKumarJSSS (8)

12_applications.pdf
12_applications.pdf12_applications.pdf
12_applications.pdf
 
16_graphical_models.pdf
16_graphical_models.pdf16_graphical_models.pdf
16_graphical_models.pdf
 
10_rnn.pdf
10_rnn.pdf10_rnn.pdf
10_rnn.pdf
 
15_representation.pdf
15_representation.pdf15_representation.pdf
15_representation.pdf
 
14_autoencoders.pdf
14_autoencoders.pdf14_autoencoders.pdf
14_autoencoders.pdf
 
Adversarial ML - Part 2.pdf
Adversarial ML - Part 2.pdfAdversarial ML - Part 2.pdf
Adversarial ML - Part 2.pdf
 
Adversarial ML - Part 1.pdf
Adversarial ML - Part 1.pdfAdversarial ML - Part 1.pdf
Adversarial ML - Part 1.pdf
 
13_linear_factors.pdf
13_linear_factors.pdf13_linear_factors.pdf
13_linear_factors.pdf
 

Último

codes and conventions of film magazine and website.pptx
codes and conventions of film magazine and website.pptxcodes and conventions of film magazine and website.pptx
codes and conventions of film magazine and website.pptx17duffyc
 
Van Gogh Powerpoint for art lesson today
Van Gogh Powerpoint for art lesson todayVan Gogh Powerpoint for art lesson today
Van Gogh Powerpoint for art lesson todaylucygibson17
 
Jaro je tady - Spring is here (Judith) 4
Jaro je tady - Spring is here (Judith) 4Jaro je tady - Spring is here (Judith) 4
Jaro je tady - Spring is here (Judith) 4wistariecz
 
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证khuurq8kz
 
Call Girls Varanasi Just Call 8617370543Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8617370543Top Class Call Girl Service AvailableCall Girls Varanasi Just Call 8617370543Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8617370543Top Class Call Girl Service AvailableNitya salvi
 
Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...
Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...
Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...Nitya salvi
 
WhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) Delhi
WhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) DelhiWhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) Delhi
WhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) Delhidelhimunirka15
 
How to order fake Worcester State University diploma?
How to order fake Worcester State University diploma?How to order fake Worcester State University diploma?
How to order fake Worcester State University diploma?melodolykelton
 
Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...
Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...
Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...Nitya salvi
 
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...delhimunirka15
 
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu DhabiMussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabiromeke1848
 
Digital C-Type Printing: Revolutionizing The Future Of Photographic Prints
Digital C-Type Printing: Revolutionizing The Future Of Photographic PrintsDigital C-Type Printing: Revolutionizing The Future Of Photographic Prints
Digital C-Type Printing: Revolutionizing The Future Of Photographic PrintsMatte Image
 
Engineering Major for College_ Environmental Health Engineering by Slidesgo.pptx
Engineering Major for College_ Environmental Health Engineering by Slidesgo.pptxEngineering Major for College_ Environmental Health Engineering by Slidesgo.pptx
Engineering Major for College_ Environmental Health Engineering by Slidesgo.pptxDanielRemache4
 
Completed Event Presentation for Huma 1305
Completed Event Presentation for Huma 1305Completed Event Presentation for Huma 1305
Completed Event Presentation for Huma 1305jazlynjacobs51
 
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶delhimunirka15
 
Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...
Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...
Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...meghakumariji156
 
Museum of fine arts Lauren Simpson…………..
Museum of fine arts Lauren Simpson…………..Museum of fine arts Lauren Simpson…………..
Museum of fine arts Lauren Simpson…………..mvxpw22gfc
 
如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一
如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一
如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一8jg9cqy
 
Ignite Your Brand: Tailored Creative Solutions Proposal
Ignite Your Brand: Tailored Creative Solutions ProposalIgnite Your Brand: Tailored Creative Solutions Proposal
Ignite Your Brand: Tailored Creative Solutions ProposalCreative Labs
 
LESSON-1-MUSIC-Q4 also a reviewer mapeh.pptx
LESSON-1-MUSIC-Q4 also a reviewer mapeh.pptxLESSON-1-MUSIC-Q4 also a reviewer mapeh.pptx
LESSON-1-MUSIC-Q4 also a reviewer mapeh.pptxmatthewmirafuentes
 

Último (20)

codes and conventions of film magazine and website.pptx
codes and conventions of film magazine and website.pptxcodes and conventions of film magazine and website.pptx
codes and conventions of film magazine and website.pptx
 
Van Gogh Powerpoint for art lesson today
Van Gogh Powerpoint for art lesson todayVan Gogh Powerpoint for art lesson today
Van Gogh Powerpoint for art lesson today
 
Jaro je tady - Spring is here (Judith) 4
Jaro je tady - Spring is here (Judith) 4Jaro je tady - Spring is here (Judith) 4
Jaro je tady - Spring is here (Judith) 4
 
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
一比一原版美国西雅图大学毕业证(Seattle毕业证书)毕业证成绩单留信认证
 
Call Girls Varanasi Just Call 8617370543Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8617370543Top Class Call Girl Service AvailableCall Girls Varanasi Just Call 8617370543Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8617370543Top Class Call Girl Service Available
 
Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...
Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...
Just Call Vip call girls Farrukhabad Escorts ☎️8617370543 Two shot with one g...
 
WhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) Delhi
WhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) DelhiWhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) Delhi
WhatsApp-(# 9711106444 #)Call Girl in Noida Sector 80 Noida (Escorts) Delhi
 
How to order fake Worcester State University diploma?
How to order fake Worcester State University diploma?How to order fake Worcester State University diploma?
How to order fake Worcester State University diploma?
 
Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...
Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...
Call Girls In Sindhudurg Escorts ☎️8617370543 🔝 💃 Enjoy 24/7 Escort Service E...
 
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
Nehru Nagar, Call Girls ☎️ ((#9711106444)), 💘 Full enjoy Low rate girl💘 Genui...
 
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu DhabiMussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
Mussafah Call Girls +971525373611 Call Girls in Mussafah Abu Dhabi
 
Digital C-Type Printing: Revolutionizing The Future Of Photographic Prints
Digital C-Type Printing: Revolutionizing The Future Of Photographic PrintsDigital C-Type Printing: Revolutionizing The Future Of Photographic Prints
Digital C-Type Printing: Revolutionizing The Future Of Photographic Prints
 
Engineering Major for College_ Environmental Health Engineering by Slidesgo.pptx
Engineering Major for College_ Environmental Health Engineering by Slidesgo.pptxEngineering Major for College_ Environmental Health Engineering by Slidesgo.pptx
Engineering Major for College_ Environmental Health Engineering by Slidesgo.pptx
 
Completed Event Presentation for Huma 1305
Completed Event Presentation for Huma 1305Completed Event Presentation for Huma 1305
Completed Event Presentation for Huma 1305
 
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶
FULL ENJOY —📞9711106444 ✦/ Vℐℙ Call Girls in Ghaziabad | Delhi🫶
 
Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...
Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...
Top Rated Lucknow Escorts Service, ₹5000 Best Hot Call Girls With Room +91-82...
 
Museum of fine arts Lauren Simpson…………..
Museum of fine arts Lauren Simpson…………..Museum of fine arts Lauren Simpson…………..
Museum of fine arts Lauren Simpson…………..
 
如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一
如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一
如何办理(Flinders毕业证)弗林德斯大学毕业证毕业证成绩单原版一比一
 
Ignite Your Brand: Tailored Creative Solutions Proposal
Ignite Your Brand: Tailored Creative Solutions ProposalIgnite Your Brand: Tailored Creative Solutions Proposal
Ignite Your Brand: Tailored Creative Solutions Proposal
 
LESSON-1-MUSIC-Q4 also a reviewer mapeh.pptx
LESSON-1-MUSIC-Q4 also a reviewer mapeh.pptxLESSON-1-MUSIC-Q4 also a reviewer mapeh.pptx
LESSON-1-MUSIC-Q4 also a reviewer mapeh.pptx
 

18_partition.pdf

  • 1. Confronting the Partition Function Lecture slides for Chapter 18 of Deep Learning www.deeplearningbook.org Ian Goodfellow Last updated 2017-12-29
  • 2. (Goodfellow 2017) Unnormalized models t many probabilistic models (commonly known as undi- e defined by an unnormalized probability distribution p̃ by dividing by a partition function Z(✓) to obtain a n: p(x; ✓) = 1 Z(✓) p̃(x; ✓). (18.1) integral (for continuous variables) or sum (for discrete alized probability of all states: Z p̃(x)dx (18.2) X p̃(x). (18.3) re defined by an unnormalized probability distribution p̃ by dividing by a partition function Z(✓) to obtain a on: p(x; ✓) = 1 Z(✓) p̃(x; ✓). (18.1) integral (for continuous variables) or sum (for discrete alized probability of all states: Z p̃(x)dx (18.2) X x p̃(x). (18.3) table for many interesting models. where Z is or
  • 3. (Goodfellow 2017) Gradient of log-likelihood Positive phase: push up on data points Negative phase: push down model samples arning undirected models by maximum likelihood particularly he partition function depends on the parameters. The gradient of d with respect to the parameters has a term corresponding to the partition function: r✓ log p(x; ✓) = r✓ log p̃(x; ✓) r✓ log Z(✓). (18.4) l-known decomposition into the positive phase and negative g. directed models of interest, the negative phase is difficult. Models ariables or with few interactions between latent variables typically positive phase. The quintessential example of a model with a positive phase and a difficult negative phase is the RBM, which has t are conditionally independent from each other given the visible where the positive phase is difficult, with complicated interactions ariables, is primarily covered in chapter 19. This chapter focuses
  • 4. (Goodfellow 2017) Negative phase sampling nally independent from each other given the visible ive phase is difficult, with complicated interactions marily covered in chapter 19. This chapter focuses ve phase. the gradient of log Z: r✓ log Z (18.5) = r✓Z Z (18.6) = r✓ P x p̃(x) Z (18.7) = P x r✓p̃(x) Z . (18.8) NTING THE PARTITION FUNCTION = Ex⇠p(x)r✓ log p̃(x). (18.13) de use of summation over discrete x, but a similar result n over continuous x. In the continuous version of the iz’s rule for differentiation under the integral sign to obtain r✓ Z p̃(x)dx = Z r✓p̃(x)dx. (18.14)
  • 5. (Goodfellow 2017) Basic learning algorithm for undirected models • For each minibatch: • Generate model samples • Compute positive phase using data samples • Compute negative phase using model samples • Combine positive and negative phases, do a gradient step to update parameters
  • 6. (Goodfellow 2017) CHAPTER 18. CONFRONTING THE PARTITION FUNCTION x p(x) The positive phase pmodel(x) pdata(x) x p(x) The negative phase pmodel(x) pdata(x) Figure 18.1: The view of algorithm 18.1 as having a “positive phase” and a “negative phase.” (Left)In the positive phase, we sample points from the data distribution and push up on their unnormalized probability. This means points that are likely in the data get pushed up on more. (Right)In the negative phase, we sample points from the model distribution and push down on their unnormalized probability. This counteracts the positive phase’s tendency to just add a large constant to the unnormalized probability everywhere. When the data distribution and the model distribution are equal, the positive phase has the same chance to push up at a point as the negative phase has to push down. When this occurs, there is no longer any gradient (in expectation), and training must terminate. for dreaming in humans and other animals (Crick and Mitchison, 1983), the idea being that the brain maintains a probabilistic model of the world and follows the
  • 7. (Goodfellow 2017) Challenge: model samples are slow • Undirected models usually need Markov chains • Naive approach: run the Markov chain for a long time starting from random initialization each minibatch • Speed tricks: • Contrastive divergence: start the Markov chain from data • Persistent contrastive divergence: for each minibatch, continue the Markov chain from where it was for the previous minibatch
  • 8. (Goodfellow 2017) Sidestep the problem • Use other criteria besides likelihood so that there is no need to compute Z or its gradient • Pseudolikelihood • Score matching • Ratio matching • Noise contrastive estimation
  • 9. (Goodfellow 2017) Estimating the Partition Function • To evaluate a trained model, we want to know the likelihood • This requires estimating Z, even if we trained using a method that doesn’t differentiate Z • Can estimate Z using annealed importance sampling