From Pixel RNN
to Pixel CNN++
2020. 01. 16 (Thu)
Dongheon Lee (이동헌)
Contents
Taxonomy of Generative Models
(1) Pixel RNN (Google DeepMind, arXiv, 2016)
(2) Pixel CNN (Google DeepMind, arXiv, 2016)
(3) Gated Pixel CNN (Google DeepMind, NIPS, 2016)
(4) Pixel CNN++ (OpenAI, ICLR, 2017)
Taxonomy of Generative Models
Generative models can be summarized as models trained by maximum likelihood.
Various strategies exist depending on how the likelihood is handled
(for example, whether it is approximated or represented exactly).
Taxonomy of Generative Models
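Defining the density (= prior distribution, model)
(+) Relatively easy to handle, and the model's behavior is predictable to some extent
(-) Limited in that it cannot produce results beyond what we already know
Sampling without defining the density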
Taxonomy of Generative Models
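Samples are generated from the distribution produced by the generator
(unlike a Markov chain, samples are generated without an input)
If a sample x′ is drawn repeatedly, x′ eventually converges to a sample from p_model(x)
(+) Reasonable performance when the variance between samples is not high
(-) Performance degrades in high dimensions and computation is slow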
Taxonomy of Generative Models
During training, the density can be computed mathematically (by calculus)
Neural autoregressive model →
a model that predicts its current value from its own previous values
Taxonomy of Generative Models
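• Encoder: obtains the latent code z from the data x
• Decoder: from a latent code z, produces a reconstructed sample from p_θ(x | z) that should be close to the data x used to obtain the latent code
• Prior p(z) of z: usually Normal
← A VAE expresses the joint distribution as an integral; because this integral cannot be computed directly, it is estimated with variational inference.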
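As a hedged illustration of the integral mentioned on this slide: in standard VAE notation (the symbol q_φ(z | x) for the encoder is assumed here, since the slide's own formulas did not survive extraction), the marginal likelihood and the variational lower bound that replaces it are usually written as follows.

```latex
p_\theta(x) = \int p(z)\, p_\theta(x \mid z)\, dz

\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```

The first line is the intractable integral; the second is the bound that is optimized instead of the exact likelihood.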
(1) Pixel RNN
• The key to an autoregressive model is fixing the order of the dependencies among the data.
• One effective approach to tractably model a joint distribution of the pixels in the image is to cast it as a product of conditional distributions.
→ Proceed in pixel order (1 to n²)
Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
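The product-of-conditionals idea can be sketched in a few lines. This is a minimal illustration on a toy binary image, not the Pixel RNN network: `toy_conditional` is a hypothetical stand-in for the learned conditional p(x_i | x_1, ..., x_{i-1}), and the raster-scan order (row by row, left to right) is the ordering described above.

```python
import math


def toy_conditional(context):
    """Hypothetical stand-in for the network: P(x_i = 1 | previous pixels)."""
    if not context:
        return 0.5
    # Laplace-smoothed fraction of 'on' pixels seen so far.
    return (sum(context) + 1) / (len(context) + 2)


def joint_log_likelihood(image):
    """Sum log p(x_i | x_1..x_{i-1}) over pixels in raster-scan order."""
    pixels = [p for row in image for p in row]  # row by row, left to right
    total = 0.0
    for i, x in enumerate(pixels):
        p_on = toy_conditional(pixels[:i])
        total += math.log(p_on if x == 1 else 1.0 - p_on)
    return total


image = [[1, 0], [1, 1]]  # n = 2, so n^2 = 4 pixels
log_p = joint_log_likelihood(image)
```

Because every conditional is itself normalized, the probabilities of all 16 possible 2 × 2 binary images sum to one, which is what makes the likelihood tractable.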
(1) Pixel RNN
[Figure: Pixel RNN architecture]
(1) Pixel RNN
• Proceed in R, G, B order
MASK
: First layer (mask A): each of the RGB channels is connected to the previous channels and to the context, but not to itself.
: Subsequent layers (mask B): the channels are also connected to themselves.
Multiple residual blocks (the number differs per model)
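A minimal sketch of how the two masks could be built, under simplifying assumptions: one feature map per color (real models use groups of feature maps per color), and the hypothetical helper name `build_mask`. Type "A" corresponds to the first-layer mask and type "B" to the mask used in subsequent layers.

```python
def build_mask(kernel_size, mask_type, n_colors=3):
    """Return mask[out_color][in_color][row][col] with 0/1 entries.

    'A' (first layer): at the center pixel a color sees only earlier
    colors (R before G before B). 'B' (later layers): it also sees itself.
    """
    assert mask_type in ("A", "B")
    center = kernel_size // 2
    mask = []
    for out_c in range(n_colors):
        per_out = []
        for in_c in range(n_colors):
            grid = [[0] * kernel_size for _ in range(kernel_size)]
            for r in range(kernel_size):
                for c in range(kernel_size):
                    # Context: pixels above, or to the left in the same row.
                    if r < center or (r == center and c < center):
                        grid[r][c] = 1
            if in_c < out_c or (mask_type == "B" and in_c == out_c):
                grid[center][center] = 1
            per_out.append(grid)
        mask.append(per_out)
    return mask


mask_a = build_mask(3, "A")
mask_b = build_mask(3, "B")
```

For example, `mask_a[1][0]` (G output reading R input) has a 1 at the center, while `mask_a[0][0]` (R reading R) does not; multiplying the kernel weights by the mask before each convolution enforces the ordering.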
(1) Pixel RNN
Row LSTM
• input-to-state & state-to-state: multiplication → convolution
[Figure: Input and Hidden State maps]
https://www.slideshare.net/thinkingfactory/pr12-pixelrnn-jaejun-yoo?from_action=save
(1) Pixel RNN
Diagonal BiLSTM (2x1 conv)
• input-to-state & state-to-state
• A diagonal convolution is difficult to implement, so skew the feature maps → it can be parallelized
[Figure: Input and Hidden State maps]
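The skewing trick can be sketched as follows. This is an illustrative sketch with hypothetical helper names (`skew`, `unskew`): row i of the feature map is shifted right by i positions, so that every column of the skewed map holds one diagonal of the original map and can be processed in a single parallel step.

```python
def skew(feature_map, pad=0):
    """Offset row i by i positions: n x m map -> n x (m + n - 1) map."""
    n_rows = len(feature_map)
    n_cols = len(feature_map[0])
    width = n_cols + n_rows - 1
    out = []
    for i, row in enumerate(feature_map):
        out.append([pad] * i + list(row) + [pad] * (width - n_cols - i))
    return out


def unskew(skewed, n_cols):
    """Undo skew() by slicing each row back to its original columns."""
    return [row[i:i + n_cols] for i, row in enumerate(skewed)]


fmap = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
skewed = skew(fmap)
# skewed == [[1, 2, 3, 0, 0],
#            [0, 4, 5, 6, 0],
#            [0, 0, 7, 8, 9]]
```

Column 2 of the skewed map is `[3, 5, 7]`, the anti-diagonal of the original, which is why a column-wise convolution on the skewed map behaves like a diagonal one.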
(2) Pixel CNN
• input-to-state
[Figure: Input and Hidden State maps]
Experiments
• Discrete softmax distribution
• Negative log-likelihood (NLL)
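To make the two items concrete, here is a small sketch of the NLL of a single sub-pixel value under a 256-way softmax, converted to bits. The function name and the uniform logits are illustrative assumptions; averaging this quantity over all sub-pixels gives the bits-per-dimension figure in which such models are commonly compared.

```python
import math


def softmax_nll_bits(logits, target):
    """NLL (in bits) of one sub-pixel value under a discrete softmax."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    nll_nats = log_z - logits[target]
    return nll_nats / math.log(2)


# A model that knows nothing assigns equal probability to all 256 values.
uniform_logits = [0.0] * 256
bits = softmax_nll_bits(uniform_logits, target=17)
```

With uniform logits the result is 8 bits, the cost of storing a raw 8-bit value; a trained model should score below that.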
(3) Gated Pixel CNN
• Improving Pixel CNN performance
1) ReLU → Gated Activation Unit → Conditional PixelCNN
[Figure: a single layer in the Gated PixelCNN architecture]
Condition: (V_{k,g} ∗ s is an unmasked 1 × 1 convolution, h = s)
Van den Oord, Aaron, et al. "Conditional image generation with pixelcnn decoders." Advances in neural information processing systems. 2016.
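A minimal sketch of the gated activation unit, tanh(filter) ⊙ sigmoid(gate), which replaces ReLU. It assumes the masked convolutions have already been applied: `conv_f` and `conv_g` are hypothetical names for the filter and gate pre-activations, and the conditioning terms are reduced to scalars for brevity.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_activation(conv_f, conv_g, cond_f=0.0, cond_g=0.0):
    """Element-wise tanh(f + cond_f) * sigmoid(g + cond_g).

    conv_f / conv_g: outputs of the masked 'filter' and 'gate'
    convolutions at each position.
    cond_f / cond_g: conditioning terms added before the nonlinearity
    (scalars shared across positions, an assumption of this sketch).
    """
    return [math.tanh(f + cond_f) * sigmoid(g + cond_g)
            for f, g in zip(conv_f, conv_g)]


out = gated_activation([0.0, 2.0, -2.0], [0.0, 10.0, -10.0])
```

An open gate (large positive gate input) passes tanh of the filter almost unchanged, while a closed gate (large negative input) suppresses it toward zero.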
(3) Gated Pixel CNN
2) Stacks: removing the blind spot
1. Horizontal stack: conditions only on the current row, and takes as input the output of the previous layer as well as the output of the vertical stack.
2. Vertical stack: conditions on all the rows above the current pixel. It has no masking. Its output is fed into the horizontal stack, and its receptive field grows in a rectangular fashion.
[Figure: receptive field around the current pixel, PixelCNN vs. Gated PixelCNN]
https://towardsdatascience.com/auto-regressive-generative-models-pixelrnn-pixelcnn-32d192911173
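The blind spot that the two stacks are designed to remove can be demonstrated with a short sketch. It assumes a plain stack of masked 3 × 3 convolutions (each layer sees the three pixels above plus the left neighbor and the center); the function names are hypothetical.

```python
def masked_receptive_field(center, size, n_layers):
    """Pixels reachable from `center` through stacked masked 3x3 convs."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0)]
    seen = {center}
    for _ in range(n_layers):
        nxt = set()
        for (r, c) in seen:
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < size and 0 <= cc < size:
                    nxt.add((rr, cc))
        seen = seen | nxt
    return seen


def blind_spot(center, size, n_layers):
    """Earlier pixels (raster order) that the masked stack never reaches."""
    r0, c0 = center
    context = {(r, c) for r in range(size) for c in range(size)
               if r < r0 or (r == r0 and c < c0)}
    return context - masked_receptive_field(center, size, n_layers)


missed = blind_spot(center=(3, 2), size=6, n_layers=10)
```

On this 6 × 6 grid the pixels (1, 5), (2, 4) and (2, 5) precede the current pixel in raster order but are never seen, however many layers are stacked; a separate vertical stack covering all rows above avoids this.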
(4) Pixel CNN++
1) Discretized logistic mixture likelihood
The softmax layer used to compute the conditional distribution of a pixel, although effective, is very costly in terms of memory. It also makes gradients sparse early in training.
→ To counter this, we assume a latent color intensity with a continuous distribution, akin to that used in variational autoencoders. It is rounded to its nearest 8-bit representation to give the pixel value. The intensity follows a (mixture of) logistic distribution(s), so the probability of each pixel value can be computed easily.
→ This method is memory efficient: the output has fewer dimensions, which gives denser gradients, thus addressing both problems.
Salimans, Tim, et al. "Pixelcnn++: Improving the pixelcnn with discretized logistic mixture likelihood and other modifications." arXiv preprint arXiv:1701.05517 (2017).
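A sketch of the discretized logistic mixture for a single 8-bit sub-pixel, under stated assumptions: pixel values are kept on the 0..255 scale, and the mixture weights, means and scales below are made-up illustrative numbers. The probability of value x is the logistic CDF mass falling in [x − 0.5, x + 0.5], with the edge values 0 and 255 absorbing the remaining tails.

```python
import math


def sigmoid(v):
    """Numerically stable logistic CDF."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def discretized_logistic_mixture(x, weights, means, scales):
    """P(x) for an integer pixel value x in 0..255."""
    prob = 0.0
    for pi, mu, s in zip(weights, means, scales):
        upper = 1.0 if x == 255 else sigmoid((x + 0.5 - mu) / s)
        lower = 0.0 if x == 0 else sigmoid((x - 0.5 - mu) / s)
        prob += pi * (upper - lower)
    return prob


# Illustrative parameters only: two components instead of 256 logits.
weights, means, scales = [0.7, 0.3], [40.0, 200.0], [5.0, 12.0]
probs = [discretized_logistic_mixture(v, weights, means, scales)
         for v in range(256)]
```

The 256 probabilities sum to one by construction, yet the network only has to output a handful of parameters per component rather than 256 logits.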
(4) Pixel CNN++
2) Other Modification
• Conditioning on whole pixels: PixelCNN factorizes the model over the 3 sub-pixels according to color (RGB), which complicates the model. The dependency between the color channels of a pixel is relatively simple and does not require a deep model.
→ Therefore, it is better to condition on whole pixels instead of separate colors, and then output a joint distribution over all 3 channels of the predicted pixel.
• Downsampling: PixelCNN struggles to capture long-range dependencies, which is one reason it does not match the performance of PixelRNN. To overcome this, the layers are downsampled using convolutions of stride 2. Downsampling reduces the input size and thus increases the relative size of the receptive field. It also loses some information, which can be compensated for by adding extra short-cut connections.
(4) Pixel CNN++
2) Other Modification
• Short-cut connections: the model follows the encoder-decoder structure of U-Net. Layers 2 and 3 are downsampled, and then layers 5 and 6 are upsampled. Residual connections from the encoder to the decoder provide the localized information.
• Dropout: since PixelCNN and PixelCNN++ are both very powerful models, they are likely to overfit the data if not regularized. So dropout is applied on the residual path after the first convolution.
Thank you
