This paper proposes AmbientGAN, which trains a generative adversarial network from partial or noisy observations rather than fully-observed samples. AmbientGAN trains the discriminator on the measurement domain rather than the raw data domain, so the generator can be trained without large amounts of clean training data. The paper proves that it is theoretically possible to recover the original data distribution even when the measurement process is not invertible, and presents experiments showing that AmbientGAN generates high-quality samples and recovers the underlying data distribution from various types of lossy and noisy measurements.
1. AmbientGAN: Generative Models
From Lossy Measurements
ICLR 2018 Reading Group
Jaejun Yoo
Clova ML / NAVER
12th Mar, 2018
by A. Bora, E. Price, A.G. Dimakis
2. Motivation
“How can we train our generative models with partial, noisy observations?”
In many settings, it is expensive or even impossible to obtain fully-observed
samples, but economical to obtain partial, noisy samples.
Why do we care?
3. This paper proposes
• AmbientGAN: train the discriminator not on the raw data
domain but on the measurement domain.
• Propose a way to train generative models from noisy, corrupted, or incomplete data.
• Prove that it is theoretically possible to recover the original
true data distribution even though the measurement process
is not invertible.
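The core idea — pass both real data and generator output through the (simulated) measurement function before the discriminator ever sees them — can be sketched roughly as follows. All shapes, names, and the tiny linear "networks" here are illustrative stand-ins, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def measure_block_pixels(x, p=0.5, rng=rng):
    """Block-Pixels measurement: zero out each pixel independently with prob p."""
    mask = rng.random(x.shape) >= p
    return x * mask

# Hypothetical stand-ins for the real networks (purely illustrative).
def generator(z, W):
    return np.tanh(z @ W)                  # fake "images"

def discriminator(y, v):
    return 1.0 / (1.0 + np.exp(-(y @ v)))  # prob. that a measurement is real

# One conceptual AmbientGAN forward pass:
z = rng.standard_normal((4, 8))            # latent batch
W = rng.standard_normal((8, 16))           # generator "weights"
v = rng.standard_normal(16)                # discriminator "weights"

x_fake = generator(z, W)
y_fake = measure_block_pixels(x_fake)      # simulate measuring the fake data
x_real = rng.standard_normal((4, 16))
y_real = measure_block_pixels(x_real)      # the real data is only ever observed as measurements

# The discriminator only ever sees the measurement domain, never raw images.
d_fake = discriminator(y_fake, v)
d_real = discriminator(y_real, v)
```

The key design point is that the simulated measurement must be differentiable, so the discriminator's gradient can flow back through it into the generator.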
8. Limitation of GAN
• Training Stability
• Forgetting problem
• Requires Good (or fully observed) Training Samples!!
• To train the generator, we need a lot of good images
• In many settings, it is expensive or even impossible to obtain fully-observed samples,
but economical to obtain partial, noisy samples.
9. Limitation of GAN (cont.)
Compressed sensing attempts to address this problem by exploiting models of the data's structure, e.g., sparsity.
Bora et al. ICML 2017, “Compressed Sensing using Generative Models”
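The recovery idea in Bora et al. can be sketched with a toy linear "generator" standing in for a trained deep generative model: search the latent space for a z whose generated output matches the observed measurements. All sizes and the plain gradient-descent recipe below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: G(z) = Wz is a fixed linear map standing in for a trained generator.
n, m, k = 50, 20, 5                  # signal dim, number of measurements, latent dim
W = rng.standard_normal((n, k))
z_true = rng.standard_normal(k)
x_true = W @ z_true                  # unknown signal, assumed to lie in range(G)

A = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement matrix
y = A @ x_true                                 # observed measurements (m < n)

# Recover z by gradient descent on ||A G(z) - y||^2.
M = A @ W
lr = 1.0 / (2.0 * np.linalg.norm(M, 2) ** 2)   # safe step size from the spectral norm
z = np.zeros(k)
for _ in range(1000):
    grad = 2.0 * M.T @ (M @ z - y)             # gradient of the squared residual
    z -= lr * grad

x_hat = W @ z
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

Even with m < n measurements, the signal is recovered because it lies in the low-dimensional range of the generator — the same role sparsity plays in classical compressed sensing.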
11. Chicken and Egg problem
• Okay, they showed it is possible to solve the recovery problem with a small number of (good) measurements by using generative models!
• But what if it is not even possible to gather good data in the first place?
• How can we collect enough data to train a generative model to start with?
15. Measurements? (in this paper)
• Block-Pixels: set each pixel to 0 independently with probability 𝑝
• Conv + Noise: Gaussian kernel blur + additive random Gaussian noise
• Block-Patch: set a randomly chosen 𝑘 × 𝑘 patch to 0
• Keep-Patch: set all pixels outside a randomly chosen 𝑘 × 𝑘 patch to 0
• Extract-Patch: extract a 𝑘 × 𝑘 patch and use it as the measurement
• Pad-Rotate-Project: zero padding + rotation by a random angle θ + vertical 1D projection
• Pad-Rotate-Project-𝛉: same as Pad-Rotate-Project, but θ is included with the measurement
• Gaussian-Projection: project onto a randomly sampled Gaussian vector
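A few of these measurement functions are easy to sketch in NumPy. Patch sizes, blur widths, and function names below are illustrative choices, not the paper's exact settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def block_patch(x, k=8, rng=rng):
    """Block-Patch: zero out a randomly placed k x k patch."""
    y = x.copy()
    h, w = x.shape
    i = rng.integers(0, h - k + 1)
    j = rng.integers(0, w - k + 1)
    y[i:i + k, j:j + k] = 0
    return y

def keep_patch(x, k=8, rng=rng):
    """Keep-Patch: zero out everything except a randomly placed k x k patch."""
    y = np.zeros_like(x)
    h, w = x.shape
    i = rng.integers(0, h - k + 1)
    j = rng.integers(0, w - k + 1)
    y[i:i + k, j:j + k] = x[i:i + k, j:j + k]
    return y

def conv_noise(x, sigma_blur=1.0, sigma_noise=0.1, rng=rng):
    """Conv+Noise: separable Gaussian blur followed by additive Gaussian noise."""
    r = int(3 * sigma_blur)
    t = np.arange(-r, r + 1)
    kern = np.exp(-t ** 2 / (2 * sigma_blur ** 2))
    kern /= kern.sum()
    # Blur rows, then columns (a 2D Gaussian blur is separable).
    y = np.apply_along_axis(lambda row: np.convolve(row, kern, mode="same"), 1, x)
    y = np.apply_along_axis(lambda col: np.convolve(col, kern, mode="same"), 0, y)
    return y + sigma_noise * rng.standard_normal(x.shape)

img = rng.random((32, 32))
```

Each function maps an image to a measurement of the same kind the discriminator would see; for training, the same corruption is applied to generator outputs.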
25. In Conclusion
• Using AmbientGAN, it is possible to train the generator without fully-observed data
• In theory, training the generator this way recovers the true data distribution, provided the measurement distribution uniquely determines the data distribution (even though each individual measurement is not invertible) and the simulated measurement function is differentiable
• Empirically, a good approximation of the data distribution can be recovered even when the measurement model is not exactly known
26. Possible Applications?
• OCR
• Gayoung’s Webtoon data
• Adding a reconstruction loss and a cyclic loss
• Learnable 𝒇𝜽⁻¹(⋅) by FC layers
• etc.