Nguyen, Anh, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, and Jeff Clune. "Plug & play generative networks: Conditional iterative generation of images in latent space." arXiv preprint arXiv:1612.00005 (2016).

Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. In addition, we provide a unified probabilistic interpretation of related activation maximization methods and call the general class of models "Plug and Play Generative Networks". PPGNs are composed of 1) a generator network G that is capable of drawing a wide range of image types and 2) a replaceable "condition" network C that tells the generator what to draw. We demonstrate the generation of images conditioned on a class (when C is an ImageNet or MIT Places classification network) and also conditioned on a caption (when C is an image captioning network). Our method also improves the state of the art of Multifaceted Feature Visualization, which generates the set of synthetic inputs that activate a neuron in order to better understand how deep neural networks operate. Finally, we show that our model performs reasonably well at the task of image inpainting. While image models are used in this paper, the approach is modality-agnostic and can be applied to many types of data.


- 1. Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space Anh Nguyen, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, Jeff Clune [GitHub] [Arxiv] Slides by Víctor Garcia UPC Computer Vision Reading Group (27/01/2017)
- 2. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 3. Introduction Interpretation of different frameworks that generate images by maximizing p(x, y) = p(x) · p(y|x), where the prior p(x) encourages the sample to look realistic and the condition p(y|x) encourages it to look like a particular class
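In log-gradient form, this factorization splits the sampler's drift into the two labeled terms (a sketch consistent with the slide's prior/condition reading; notation follows the deck):

```latex
\log p(x, y) = \log p(x) + \log p(y \mid x)
\quad\Longrightarrow\quad
\frac{\partial \log p(x, y)}{\partial x}
  = \underbrace{\frac{\partial \log p(x)}{\partial x}}_{\text{prior: look realistic}}
  + \underbrace{\frac{\partial \log p(y \mid x)}{\partial x}}_{\text{condition: look like class } y}
```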
- 4. Introduction Image Generation: ● High Resolution Images (227x227) GANs struggle to generate images larger than 64x64
- 5. Introduction Image Generation: ● High Resolution Images ● Intra-Class Variance
- 6. Introduction Image Generation: ● High Resolution Images ● Intra-Class Variance ● Inter-Class Variance (1000-ImageNet classes)
- 7. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 8. Probabilistic Interpretation of the method Metropolis-adjusted Langevin algorithm (MALA), an MCMC algorithm for iteratively producing random samples from a distribution p(x):
- 9. Probabilistic Interpretation of the method Metropolis-adjusted Langevin algorithm (MALA), an MCMC algorithm for iteratively producing random samples: Current state
- 10. Probabilistic Interpretation of the method Metropolis-adjusted Langevin algorithm (MALA), an MCMC algorithm for iteratively producing random samples: Future State Current state
- 11. Probabilistic Interpretation of the method Metropolis-adjusted Langevin algorithm (MALA), an MCMC algorithm for iteratively producing random samples: Future State Current state Gradient to the natural manifold of p(x)
- 12. Probabilistic Interpretation of the method Metropolis-adjusted Langevin algorithm (MALA), an MCMC algorithm for iteratively producing random samples: Gradient to the natural manifold of p(x) Noise Future State Current state
- 13. Probabilistic Interpretation of the method Future State Current state Gradient to the natural manifold of p(x) Noise
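As a concrete toy, the update described on these slides can be run on a 1-D Gaussian target, where the gradient of log p(x) is known in closed form. A minimal numpy sketch (the Metropolis accept/reject correction is omitted, so this is plain Langevin dynamics; all names here are illustrative, not from the paper):

```python
import numpy as np

def langevin_step(x, grad_log_p, eps, rng):
    # One unadjusted Langevin step: drift toward high-density
    # regions of p(x), plus Gaussian exploration noise.
    return x + eps * grad_log_p(x) + np.sqrt(2.0 * eps) * rng.normal()

# Toy target: p(x) = N(3, 1), so d log p(x)/dx = -(x - 3).
grad_log_p = lambda x: -(x - 3.0)

rng = np.random.default_rng(0)
x, samples = 0.0, []
for t in range(5000):
    x = langevin_step(x, grad_log_p, eps=0.1, rng=rng)
    if t >= 1000:              # discard burn-in
        samples.append(x)

# The chain's empirical mean and std should approach 3 and 1.
print(np.mean(samples), np.std(samples))
```

The gradient term pulls each state toward the data manifold while the noise keeps the chain exploring, which is exactly the two-part intuition the slides build.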
- 14. Probabilistic Interpretation of the method p(x)
- 15. Probabilistic Interpretation of the method p(x) Step towards an image that causes the classifier to produce a higher score for class C Step towards a more generic image Noise
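The three labeled pieces on this slide combine into a single update, reconstructed here to match the paper's decomposition (ε₁ weights the prior step, ε₂ the class-conditional step, and ε₃ the noise; ablating these coefficients yields the different PPGN variants):

```latex
x_{t+1} = x_t
  + \epsilon_1 \,\frac{\partial \log p(x_t)}{\partial x_t}
  + \epsilon_2 \,\frac{\partial \log p(y = y_c \mid x_t)}{\partial x_t}
  + \mathcal{N}\!\left(0, \epsilon_3^2\right)
```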
- 16. Probabilistic Interpretation of the method xt Rough example
- 17. Probabilistic Interpretation of the method y_co = Content activations y_st = Style activations Rough example
- 18. Probabilistic Interpretation of the method xt+i Rough example
- 19. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 20. Method Why Plug & Play?
- 21. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 22. Method | PPGN-x: DAE model of p(x) What is a Denoising Autoencoder? x h(x) R(x)
- 23. Method | PPGN-x: DAE model of p(x) What is a Denoising Autoencoder? x_noise h(x) x N(0,σ^2) R(x)
- 24. Method | PPGN-x: DAE model of p(x) What is a Denoising Autoencoder? x_noise h(x) x N(0,σ^2) R(x)
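The DAE matters here because its reconstruction error approximates the score of the data distribution, (R(x) − x)/σ² ≈ ∂ log p(x)/∂x (a result due to Alain & Bengio, 2014), which is exactly the prior-gradient term the sampler needs. For a standard-normal toy distribution the optimal denoiser is known in closed form, so the connection can be checked numerically (the toy setup below is mine, not the paper's):

```python
import numpy as np

# Toy check of the DAE/score connection:
# (R(x) - x) / sigma^2  ≈  d log p(x) / dx   for small noise sigma.
# For p(x) = N(0, 1) the optimal denoiser is linear and known in
# closed form: R(x) = x / (1 + sigma^2), the posterior mean of the
# clean signal given the noisy observation.
sigma = 0.1
R = lambda x: x / (1.0 + sigma**2)            # optimal DAE for N(0, 1)
score_est = lambda x: (R(x) - x) / sigma**2   # DAE-based score estimate
score_true = lambda x: -x                     # exact d log p / dx for N(0, 1)

xs = np.linspace(-3, 3, 7)
err = np.max(np.abs(score_est(xs) - score_true(xs)))
print(err)  # small: the estimate matches the score up to O(sigma^2)
```

This is what lets PPGN-x substitute a trained DAE's reconstruction term for the intractable gradient of log p(x) in the sampling update.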
- 25. Method | PPGN-x: DAE model of p(x)
- 26. Method | PPGN-x: DAE model of p(x)
- 27. Method | PPGN-x: DAE model of p(x) 1) Data is poorly modeled: samples are blurry 2) Samples change slowly between steps (slow mixing)
- 28. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 29. Method | DGN-AM: sampling without a learned prior Deep Generator Network-based Activation Maximization Sampling is faster if we move in the h subspace (fc6 of AlexNet) instead of in the image space x
- 30. Method | DGN-AM: sampling without a learned prior Deep Generator Network-based Activation Maximization Discriminator 1/0 AlexNet fc6
- 31. Method | DGN-AM: sampling without a learned prior Once the network G is trained, we derive the update equation for the MALA algorithm
- 32. Method | DGN-AM: sampling without a learned prior Once the network G is trained, we derive the update equation for the MALA algorithm
- 33. Method | DGN-AM: sampling without a learned prior Once the network G is trained, we derive the update equation for the MALA algorithm
- 34. Method | DGN-AM: sampling without a learned prior Once the network G is trained, we derive the update equation for the MALA algorithm No learned prior No noise
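A toy version of the DGN-AM ascent (no learned prior term, no noise): move the latent code h so that the classifier's log-probability for a target class, evaluated on G(h), increases. The linear "generator" and "classifier" below are illustrative stand-ins for the paper's AlexNet-based networks:

```python
import numpy as np

rng = np.random.default_rng(0)
W_g = rng.normal(size=(4, 3))   # toy "generator": x = W_g @ h
W_c = rng.normal(size=(2, 4))   # toy "classifier": logits = W_c @ x

def log_p_class(h, c):
    # log softmax probability of class c for the generated image G(h)
    logits = W_c @ (W_g @ h)
    return logits[c] - np.log(np.sum(np.exp(logits)))

def grad_log_p_class(h, c):
    logits = W_c @ (W_g @ h)
    p = np.exp(logits - np.max(logits))
    p /= p.sum()
    one_hot = np.eye(len(logits))[c]
    # chain rule back through the classifier and the generator
    return W_g.T @ W_c.T @ (one_hot - p)

h, c = np.zeros(3), 0
for _ in range(200):
    h = h + 0.1 * grad_log_p_class(h, c)   # epsilon_2 term only

print(log_p_class(h, c))  # class-c log-probability rises toward 0
```

With no prior or noise term the ascent converges to a single high-scoring code, which matches the slide's observation that DGN-AM produces the same image after many steps.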
- 35. Method | DGN-AM: sampling without a learned prior + Different modes from different starts - Same image after many steps - Low mixing speed
- 36. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 37. Method | PPGN-h: Generator and DAE model of p(h) A 7-layer DAE is added to model the prior p(h) in order to increase the mixing speed
- 38. Method | PPGN-h: Generator and DAE model of p(h) The equation is the following: Prior p(h) Conditioned Gradient Noise
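Written out, the update this slide labels plugs the DAE reconstruction R_h into the prior term (a reconstruction from the slide's labels; C denotes the condition/classifier network and G the generator):

```latex
h_{t+1} = h_t
  + \epsilon_1 \big(R_h(h_t) - h_t\big)
  + \epsilon_2 \,\frac{\partial \log C\!\left(y = y_c \mid G(h_t)\right)}{\partial h_t}
  + \mathcal{N}\!\left(0, \epsilon_3^2\right)
```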
- 39. Method | PPGN-h: Generator and DAE model of p(h) - Similar to the previous case: low diversity - The p(h) model learned by the DAE is too simple
- 40. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 41. Method | Joint PPGN-h: joint Generator and DAE To model p(h) in a richer way DAE: h/fc6 → ? → h/fc6
- 42. Method | Joint PPGN-h: joint Generator and DAE To model p(h) in a richer way DAE: h/fc6 → ? → h/fc6 Joint Generator and DAE: h/fc6 x h/fc6 G E
- 43. Method | Joint PPGN-h: joint Generator and DAE To model p(h) in a richer way DAE: h/fc6 → ? → h/fc6 Joint Generator and DAE: h/fc6 x h/fc6 G E Using the already existing networks, we train the generator G to act as a DAE in conjunction with the encoder network E
- 44. Method | Joint PPGN-h: joint Generator and DAE AlexNet The equation is the same as before
- 45. Method | Joint PPGN-h: joint Generator and DAE - Faster mixing - Better quality
- 46. Method | Joint PPGN-h: joint Generator and AE AlexNet The equation is the same as before
- 47. Method | Joint PPGN-h: joint Generator and AE - Faster mixing - Better quality
- 48. Method | Joint PPGN-h: joint Generator and DAE Noise sweeps For the last model we test the reconstruction of different h/fc6 vectors when adding different noise levels: fc6 + N(0, σ^2)
- 49. Method | Joint PPGN-h: joint Generator and AE Noise sweeps For the last model we test the reconstruction of different h/fc6 vectors when adding different noise levels:
- 50. Method | Joint PPGN-h: joint Generator and AE Noise sweeps
- 51. Method | Joint PPGN-h: joint Generator and AE Noise sweeps We can still recover much of the image's information even when mapping with a lot of noise: many inputs map to one output.
- 52. Method | Joint PPGN-h: joint Generator and DAE Combination of Losses Comparison of losses: real images vs. samples trained with different loss combinations (figure)
- 53. Method | Joint PPGN-h: joint Generator and DAE Combination of Losses
- 54. Method | Joint PPGN-h: joint Generator and DAE Combination of Losses
- 55. Method | Joint PPGN-h: joint Generator and DAE Evaluating: Qualitatively
- 56. Method | Joint PPGN-h: joint Generator and DAE Evaluating: Qualitatively
- 57. Method | Joint PPGN-h: joint Generator and DAE Evaluating: Qualitatively
- 58. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 59. Further Experiments | Captioning MS-COCO Dataset
- 60. Further Experiments | Captioning
- 61. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 62. Further Experiments | MFV Multifaceted Feature Visualization
- 63. Multifaceted Feature Visualization Further Experiments | MFV
- 64. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
- 65. Further Experiments | Inpainting
- 66. Further Experiments | Inpainting
- 67. Further Experiments | Inpainting
- 68. Further Experiments | Inpainting
- 69. Further Experiments | Inpainting
- 70. Conclusions ● Using only a GAN loss for the reconstruction, GANs collapse into fewer modes, far from the original p(x). ● Using extra losses it is possible to reconstruct the images better, even for 1000 classes and at higher resolution. A one-to-one mapping helps to prevent the typical missing-modes problem in the latent space. ● It would be great to also generate the embedding space for these high-resolution multi-class images instead of using a supervised learned space.
