
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space (UPC Reading Group)

Slides by Víctor Garcia about the paper:

Nguyen, Anh, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, and Jeff Clune. "Plug & play generative networks: Conditional iterative generation of images in latent space." arXiv preprint arXiv:1612.00005 (2016).

Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. In addition, we provide a unified probabilistic interpretation of related activation maximization methods and call the general class of models "Plug and Play Generative Networks". PPGNs are composed of 1) a generator network G that is capable of drawing a wide range of image types and 2) a replaceable "condition" network C that tells the generator what to draw. We demonstrate the generation of images conditioned on a class (when C is an ImageNet or MIT Places classification network) and also conditioned on a caption (when C is an image captioning network). Our method also improves the state of the art of Multifaceted Feature Visualization, which generates the set of synthetic inputs that activate a neuron in order to better understand how deep neural networks operate. Finally, we show that our model performs reasonably well at the task of image inpainting. While image models are used in this paper, the approach is modality-agnostic and can be applied to many types of data.



  1. Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space. Anh Nguyen, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, Jeff Clune [GitHub] [Arxiv]. Slides by Víctor Garcia. UPC Computer Vision Reading Group (27/01/2017)
  2. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  3. Introduction. Interpretation of different frameworks that generate images by maximizing p(x, y) = p(x) · p(y|x). The prior term p(x) encourages samples to look realistic; the condition term p(y|x) encourages them to look like a particular class.
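In log space this product becomes a sum, so a gradient step on the joint splits into exactly the two terms the slide names (a short restatement added here, not from the original slides):

```latex
\nabla_x \log p(x, y)
  = \underbrace{\nabla_x \log p(x)}_{\text{prior: look realistic}}
  + \underbrace{\nabla_x \log p(y \mid x)}_{\text{condition: look like class } y}
```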
  4. Introduction. Image Generation: ● High-resolution images (227x227). GANs struggle to generate images larger than 64x64.
  5. Introduction. Image Generation: ● High-resolution images ● Intra-class variance
  6. Introduction. Image Generation: ● High-resolution images ● Intra-class variance ● Inter-class variance (1000 ImageNet classes)
  7. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  8. Probabilistic Interpretation of the method. The Metropolis-adjusted Langevin algorithm (MALA) is an MCMC algorithm for iteratively producing random samples from a distribution p(x).
  9.-13. Probabilistic Interpretation of the method. Slides 9-13 build up the MALA update term by term: the current state, the future state, the gradient towards the natural manifold of p(x), and noise.
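The equation images on these slides did not survive extraction; reconstructed from the paper, the unconditional update those four labels annotate is (ε1 is a step size, ε3 a noise scale):

```latex
\underbrace{x_{t+1}}_{\text{future state}}
  = \underbrace{x_t}_{\text{current state}}
  + \underbrace{\epsilon_1 \frac{\partial \log p(x_t)}{\partial x_t}}_{\text{gradient to the manifold of } p(x)}
  + \underbrace{N(0, \epsilon_3^2)}_{\text{noise}}
```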
  14.-15. Probabilistic Interpretation of the method. Decomposing the update for p(x, y): a step towards a more generic image (the prior p(x) term), a step towards an image that causes the classifier to produce a higher score for class y_c (the condition term), and noise.
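A minimal sketch of that three-term update in code (assumes PyTorch; `classifier` and `log_prior_grad` are hypothetical stand-ins for the condition network and the prior-gradient estimate):

```python
# Minimal sketch of the three-term sampling update (assumes PyTorch).
import torch

def sampling_step(x, classifier, log_prior_grad, y_c, eps1, eps2, eps3):
    """One MALA-style iteration: prior step + condition step + noise."""
    x = x.detach().requires_grad_(True)
    # Condition term: gradient of the log-probability of class y_c w.r.t. x.
    log_p_y = torch.log_softmax(classifier(x), dim=1)[:, y_c].sum()
    cond_grad = torch.autograd.grad(log_p_y, x)[0]
    with torch.no_grad():
        x_next = (x
                  + eps1 * log_prior_grad(x)      # step towards a more generic image
                  + eps2 * cond_grad              # step towards a higher class score
                  + eps3 * torch.randn_like(x))   # noise keeps the chain exploring
    return x_next
```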
  16.-18. Probabilistic Interpretation of the method. Rough example: starting from an image x_t, with y_co = content activations and y_st = style activations, the chain iterates to x_{t+i}.
  19. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  20. Method. Why "Plug & Play"?
  21. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  22. Method | PPGN-x: DAE model of p(x). What is a Denoising Autoencoder? [Diagram: x → h(x) → R(x)]
  23.-24. Method | PPGN-x: DAE model of p(x). What is a Denoising Autoencoder? [Diagram: x corrupted with noise N(0, σ²) gives x_noise → h(x) → R(x)]
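A minimal sketch of such a DAE (the slides only fix the x → h(x) → R(x) structure and the Gaussian corruption; layer sizes here are illustrative, assuming PyTorch):

```python
# Minimal DAE sketch: corrupt x with Gaussian noise, encode to h(x),
# reconstruct R(x). Layer sizes are illustrative (assumes PyTorch).
import torch
import torch.nn as nn

class DAE(nn.Module):
    def __init__(self, dim=4096, hidden=1024, sigma=0.1):
        super().__init__()
        self.sigma = sigma
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())  # x -> h(x)
        self.dec = nn.Linear(hidden, dim)                            # h(x) -> R(x)

    def forward(self, x):
        x_noise = x + self.sigma * torch.randn_like(x)  # x_noise = x + N(0, sigma^2)
        return self.dec(self.enc(x_noise))

# Training objective: reconstruct the *clean* input from the corrupted one, e.g.
# loss = ((dae(x) - x) ** 2).mean()
```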
  25.-26. Method | PPGN-x: DAE model of p(x).
  27. Method | PPGN-x: DAE model of p(x). Drawbacks: 1) the data is poorly modeled (blurry samples); 2) the chain changes slowly between samples.
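Why a DAE can stand in for the prior at all: for small Gaussian corruption, the reconstruction residual of a DAE approximates the gradient of the log-density (a result of Alain and Bengio that the paper builds on):

```latex
\frac{\partial \log p(x)}{\partial x} \approx \frac{R(x) - x}{\sigma^2}
```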
  28. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  29. Method | DGN-AM: sampling without a learned prior. Deep Generator Network-based Activation Maximization. It is faster to move in the h subspace (fc6 of AlexNet) than in the image space x.
  30. Method | DGN-AM: sampling without a learned prior. Deep Generator Network-based Activation Maximization. [Diagram: G is trained to invert AlexNet fc6 codes, with a discriminator providing a real/fake (1/0) signal]
  31.-34. Method | DGN-AM: sampling without a learned prior. Once the network G is trained, we plug it into the MALA update equation. In this variant there is no learned prior and no noise, as in the sketch below.
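A minimal sketch of one DGN-AM step (assumes PyTorch; `G`, `C`, and the step size `eps2` follow the slides' notation, the function name is hypothetical):

```python
# Sketch of one DGN-AM step: gradient ascent on the class-y_c score through
# the frozen generator G and condition network C, entirely in the fc6 code h.
# No learned prior term, no noise (assumes PyTorch).
import torch

def dgn_am_step(h, G, C, y_c, eps2):
    h = h.detach().requires_grad_(True)
    x = G(h)                                                # decode code -> image
    log_p_y = torch.log_softmax(C(x), dim=1)[:, y_c].sum()
    grad_h = torch.autograd.grad(log_p_y, h)[0]             # backprop through C and G
    return (h + eps2 * grad_h).detach()                     # condition term only
```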
  35. Method | DGN-AM: sampling without a learned prior. + Different modes are reached from different starting points. - The chain settles on the same image after many steps. - Low mixing speed.
  36. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  37. Method | PPGN-h: Generator and DAE model of p(h). A 7-layer DAE is added to model the prior p(h) in order to increase mixing speed.
  38. Method | PPGN-h: Generator and DAE model of p(h). The update equation combines three terms: the prior p(h), the conditioned gradient, and noise.
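Reconstructed from the paper (the slide's equation image is missing), the PPGN-h update, with R_h denoting the DAE trained on fc6 codes:

```latex
h_{t+1} = h_t
  + \underbrace{\epsilon_1 \big(R_h(h_t) - h_t\big)}_{\text{prior } p(h)}
  + \underbrace{\epsilon_2 \frac{\partial \log p\big(y = y_c \mid G(h_t)\big)}{\partial h_t}}_{\text{conditioned gradient}}
  + \underbrace{N(0, \epsilon_3^2)}_{\text{noise}}
```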
  39. Method | PPGN-h: Generator and DAE model of p(h). - Similar to the previous case: low diversity. - The p(h) model learned by the DAE is too simple.
  40. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  41. Method | Joint PPGN-h: joint Generator and DAE. To model p(h) in a more complex way: a DAE over h/fc6 (h/fc6 → ? → h/fc6).
  42. Method | Joint PPGN-h: joint Generator and DAE. Joint Generator and DAE: h/fc6 → G → x → E → h/fc6.
  43. Method | Joint PPGN-h: joint Generator and DAE. With the same existing networks, we train the generator G to act as a DAE in conjunction with the encoder E, as sketched below.
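One plausible reading of the slide's diagram, stated explicitly: the composition E ∘ G plays the role of the DAE over h, with the noise injected on the code:

```latex
\tilde{h} = h + \eta, \quad \eta \sim N(0, \sigma^2 I), \qquad R_h(h) = E\big(G(\tilde{h})\big)
```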
  44.-45. Method | Joint PPGN-h: joint Generator and DAE. The update equation is the same as before (AlexNet as the condition network). - Faster mixing. - Better quality.
  46.-47. Method | Joint PPGN-h: joint Generator and AE. The update equation is the same as before. - Faster mixing. - Better quality.
  48.-49. Method | Joint PPGN-h: joint Generator and DAE/AE. Noise sweeps: for the last model, we test the reconstruction of different h/fc6 vectors when adding different noise levels (fc6 + N(0, σ²)).
  50.-51. Method | Joint PPGN-h: joint Generator and AE. Noise sweeps: we can still recover much of the image's information even when mapping with a lot of noise. Many codes map to one image.
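A minimal sketch of that noise-sweep test (assumes PyTorch; `E` (image → fc6) and `G` (fc6 → image) are the networks from the slides):

```python
# Sketch of the noise sweep: corrupt an image's fc6 code with increasing
# noise levels and decode each, to see how much of the image survives.
import torch

def noise_sweep(x, E, G, sigmas=(0.0, 0.1, 0.5, 1.0, 2.0)):
    with torch.no_grad():
        h = E(x)                                              # image -> fc6 code
        return [G(h + s * torch.randn_like(h)) for s in sigmas]
```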
  52. Method | Joint PPGN-h: joint Generator and DAE. Combination of losses. [Figure: comparison of reconstructions under different loss combinations against real images]
  53.-54. Method | Joint PPGN-h: joint Generator and DAE. Combination of losses.
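A minimal sketch of how such a loss combination could be wired up (the three terms match the slides' comparison; the weights and the adversarial form are assumptions, not the paper's exact values):

```python
# Sketch of a generator loss combining pixel reconstruction, fc6 feature
# reconstruction, and an adversarial term (assumes PyTorch).
import torch
import torch.nn.functional as F

def generator_loss(x, x_rec, h, h_rec, d_fake,
                   lambda_img=1.0, lambda_feat=1.0, lambda_adv=0.01):
    l_img = F.mse_loss(x_rec, x)                 # image-space L2
    l_feat = F.mse_loss(h_rec, h)                # feature-space L2 on fc6 codes
    l_adv = -torch.log(d_fake + 1e-8).mean()     # fool the discriminator
    return lambda_img * l_img + lambda_feat * l_feat + lambda_adv * l_adv
```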
  55.-57. Method | Joint PPGN-h: joint Generator and DAE. Evaluating: qualitatively.
  58. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  59.-60. Further Experiments | Captioning. MS-COCO Dataset.
  61. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  62.-63. Further Experiments | MFV. Multifaceted Feature Visualization.
  64. Index ● Introduction ● Probabilistic Interpretation of the method ● Methods and Experiments ○ PPGN-x: DAE model of p(x) ○ DGN-AM: sampling without a learned prior ○ PPGN-h: Generator and DAE model of p(h) ○ Joint PPGN-h: joint Generator and DAE ● Further Experiments ○ Image Generation: Captioning ○ Image Generation: Multifaceted Feature Visualization ○ Image inpainting ● Conclusions
  65.-69. Further Experiments | Inpainting.
  70. Conclusions ● Using only GAN losses for reconstruction, the model collapses onto a few modes, far from the original p(x). ● Adding extra losses makes it possible to reconstruct images well, even across 1000 classes and at higher resolution; the one-to-one mapping helps prevent the typical missing-modes problem in the latent space. ● It would be interesting to also learn the embedding space for these high-resolution, multi-class images, instead of relying on a supervised, pre-learned feature space.
