Generative Adversarial Network (+ Laplacian Pyramid GAN)
Generative Adversarial Network
NamHyuk Ahn
Generative Adversarial Network
What is a Generative model?
• Goal: learn a mapping X → Y, i.e. P(y|x)
• Discriminative model (classifier)
  • Directly learns the conditional distribution P(y|x) from training data
  • SVM, Logistic Regression
• Generative model (classifier)
  • Learns the joint probability, P(x,y) = P(x|y) * P(y)
  • Estimates the parameters of P(x|y) and P(y) from training data
  • Uses Bayes' rule to calculate P(y|x)
  • Naive Bayes, GMM
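The generative-classifier recipe above (estimate P(x|y) and P(y), then apply Bayes' rule) can be sketched in a few lines. This is a minimal illustration that assumes one-dimensional inputs and a Gaussian P(x|y) per class; the function names `fit_generative` and `posterior` are hypothetical and not from the slides.

```python
import numpy as np

def fit_generative(x, y):
    """Estimate P(y) and a 1-D Gaussian P(x|y) for every class (illustrative assumption)."""
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])    # P(y)
    means = np.array([x[y == c].mean() for c in classes])    # mean of P(x|y)
    stds = np.array([x[y == c].std() for c in classes])      # std of P(x|y)
    return classes, priors, means, stds

def posterior(x_new, priors, means, stds):
    """Bayes' rule: P(y|x) = P(x|y) P(y) / sum over y' of P(x|y') P(y')."""
    likelihood = np.exp(-0.5 * ((x_new - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    joint = likelihood * priors                               # P(x, y)
    return joint / joint.sum()

# Toy data: class 0 clusters around 0.5, class 1 around 4.5.
x = np.array([0.0, 0.5, 1.0, 4.0, 4.5, 5.0])
y = np.array([0, 0, 0, 1, 1, 1])
classes, priors, means, stds = fit_generative(x, y)
print(posterior(0.5, priors, means, stds))
```

A discriminative model would skip the likelihood and prior entirely and fit P(y|x) directly.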
Generative vs. Discriminative
• Generative
  • Probabilistic “model” of each class
  • Decision boundary is “where one model becomes more likely”
  • Natural use of unlabeled data (unsupervised learning)
• Discriminative
  • Focuses on the decision boundary
  • More powerful with lots of data
  • Supervised tasks only
IAML2.23: Generative vs. discriminative learning
Generative vs. Discriminative
http://slideplayer.com/slide/6982498/
Adversarial Learning
http://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-generative-models-and-adversarial-training-upc-2016
[Figure: counterfeiters produce fake currency; police try to tell it apart from real currency]
(neural) Network
http://cs231n.github.io
Adversarial Network
• Notations
  • $p_g$: generator's distribution over data $x$
  • $p_z(z)$: prior on input noise
  • $G(z; \theta_g)$: generator function (MLP with parameters $\theta_g$)
  • $D(x; \theta_d)$: discriminator function that outputs a single scalar
• $D(x)$ is the probability that $x$ came from the real data rather than from $p_g$: ideally 1 for real input and 0 for generated input
• Goals (cost function)
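For reference, the cost function in Goodfellow et al. (2014), which these slides call eq1, is a two-player minimax game written in the notation above:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D is trained to assign high probability to real samples and low probability to generated ones, while G is trained to make $D(G(z))$ large.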
Training procedure
• Optimizing D to completion in the inner loop is a bad idea (computationally prohibitive, and it overfits)
• Instead, alternate k steps of optimizing D with one step of optimizing G
  • D is maintained near its optimal solution
  • G moves slowly toward its optimum
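The alternating schedule above (k discriminator steps, then one generator step) can be sketched on a toy problem. This is not the paper's implementation: it assumes 1-D data drawn from N(4, 1), a linear generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), and hand-derived gradients; all names and hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_toy_gan(steps=200, k=1, batch=64, lr=0.05, seed=0):
    """Alternate k gradient-ascent steps on D with one gradient-descent step on G."""
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0   # generator parameters:      G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator parameters:  D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        for _ in range(k):
            x = rng.normal(4.0, 1.0, batch)   # real samples (toy data distribution)
            z = rng.normal(0.0, 1.0, batch)   # noise samples from the prior
            g = a * z + b                     # generated samples
            d_real = sigmoid(w * x + c)
            d_fake = sigmoid(w * g + c)
            # Ascend V = mean(log D(x)) + mean(log(1 - D(G(z)))).
            # dV/dlogit is (1 - d_real) for real inputs and (-d_fake) for fakes.
            grad_w = np.mean((1.0 - d_real) * x) + np.mean(-d_fake * g)
            grad_c = np.mean(1.0 - d_real) + np.mean(-d_fake)
            w += lr * grad_w
            c += lr * grad_c
        # One generator step: descend mean(log(1 - D(G(z)))).
        z = rng.normal(0.0, 1.0, batch)
        d_fake = sigmoid(w * (a * z + b) + c)
        grad_a = np.mean(-d_fake * w * z)     # chain rule through the logit w*g + c
        grad_b = np.mean(-d_fake * w)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b, w, c

print(train_toy_gan(steps=200, k=1))
```

Larger k keeps D closer to its optimum for the current G at the cost of more computation per generator update.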
• In practice, eq1 may not provide sufficient gradient for G to learn
  • In the early stage, G outputs poor examples, so D can reject them with high confidence
  • $\log(1 - D(G(z)))$ saturates ($D(G(z)) \approx 0$, so the term is close to $\log 1 = 0$)
• Rather than training G to minimize $\log(1 - D(G(z)))$, train it to maximize $\log D(G(z))$
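The saturation argument above can be checked numerically. The sketch below assumes the discriminator ends in a sigmoid, so D(G(z)) = sigmoid(logit), and compares the gradient of each generator objective with respect to that logit; the function name is hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def generator_grads_wrt_logit(logit):
    """Gradients of the two generator objectives w.r.t. the discriminator logit."""
    d = sigmoid(logit)           # D(G(z))
    saturating = -d              # d/dlogit of log(1 - sigmoid(logit))
    non_saturating = 1.0 - d     # d/dlogit of log(sigmoid(logit))
    return saturating, non_saturating

# Early in training D rejects fakes confidently, i.e. the logit is very negative.
for logit in (-6.0, 0.0, 6.0):
    sat, non_sat = generator_grads_wrt_logit(logit)
    print(f"logit={logit:+.1f}  saturating={sat:+.4f}  non-saturating={non_sat:+.4f}")
```

When D(G(z)) is near 0 the first gradient nearly vanishes while the second stays close to 1, which is why the alternative objective gives G a stronger learning signal early on.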
Global Optimality of $p_g = p_{data}$
Global Optimality of $p_g = p_{data}$
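For reference, the optimality result in Goodfellow et al. (2014) can be summarized as follows. For a fixed G the optimal discriminator has a closed form, and substituting it into the objective gives a criterion $C(G)$ that is minimized exactly when the generator matches the data distribution:

```latex
D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}

C(G) = \max_D V(G, D) = -\log 4 + 2 \cdot \mathrm{JSD}\big(p_{data} \,\|\, p_g\big)
```

Since the Jensen-Shannon divergence is non-negative and zero only for identical distributions, the global minimum $C(G) = -\log 4$ is reached if and only if $p_g = p_{data}$, where $D^*_G(x) = 1/2$ everywhere.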
Convergence of Algorithm 1
Theoretically, cool. But in practice, GANs do not always show good performance
Result
Why is GAN important?
• Use GAN in semi-supervised learning
  • Features from the discriminator can improve performance when only limited labeled data is available
• Vector arithmetic
  • Generate fake images of bedrooms (DCGAN)
  • [man with glasses] - [man without glasses] + [woman without glasses] = [woman with glasses]
• Conditional GAN
  • GAN is trained in an unsupervised manner, but it can also model $p(x|c)$ by adding the class label to both G and D
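The vector-arithmetic bullet above operates on the generator's input noise vectors, not on pixels. A minimal sketch, assuming each concept is represented by the average latent code of a few exemplars (as reported for DCGAN) and using placeholder arrays instead of real latent codes; the function names are hypothetical:

```python
import numpy as np

def concept_vector(z_samples):
    """Average the latent codes of several exemplars of one visual concept."""
    return np.mean(z_samples, axis=0)

def glasses_arithmetic(man_glasses, man_plain, woman_plain):
    """[man with glasses] - [man without glasses] + [woman without glasses]."""
    return (concept_vector(man_glasses)
            - concept_vector(man_plain)
            + concept_vector(woman_plain))

# Placeholder latent codes: 3 exemplars per concept, 4 dimensions each.
man_glasses = np.full((3, 4), 3.0)
man_plain = np.full((3, 4), 1.0)
woman_plain = np.full((3, 4), 2.0)

z_result = glasses_arithmetic(man_glasses, man_plain, woman_plain)
print(z_result)   # would be passed to a trained generator to render the image
```

With a trained generator, decoding `z_result` is what would yield the [woman with glasses] image.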
Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
Image Pyramid
• An image pyramid is a multi-scale image representation
• Generating a pyramid
  • Blur the previous pyramid image
  • Subsample pixels
• Various types of pyramid
  • Gaussian pyramid
  • Laplacian pyramid
  • ...
Figure from David Forsyth
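The blur-then-subsample recipe above can be sketched for a Gaussian pyramid. This is a minimal illustration assuming a grayscale image, a separable [1, 2, 1]/4 binomial kernel as the blur, and subsampling by taking every second pixel; the function names are hypothetical.

```python
import numpy as np

def blur(img):
    """Separable 3-tap binomial blur with edge padding (a small Gaussian approximation)."""
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    p = np.pad(img, 1, mode="edge")
    # Horizontal pass over the padded image, then vertical pass.
    h = k[0] * p[:, :-2] + k[1] * p[:, 1:-1] + k[2] * p[:, 2:]
    return k[0] * h[:-2, :] + k[1] * h[1:-1, :] + k[2] * h[2:, :]

def gaussian_pyramid(img, levels):
    """Level 0 is the input; each further level is the blurred, 2x-subsampled previous one."""
    pyramid = [img.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

image = np.arange(64, dtype=float).reshape(8, 8)
for level in gaussian_pyramid(image, 3):
    print(level.shape)
```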
Laplacian Pyramid
http://cs.brown.edu/courses/csci1430/2011/results/proj1/georgem/explained.jpg
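A Laplacian pyramid stores, at each level, the detail lost between one scale and the next coarser one, plus a small low-pass residual. The sketch below assumes a grayscale image with even side lengths, 2x2 average pooling as the downsampler, and nearest-neighbor upsampling; these operators and the function names are illustrative choices, not taken from the slides.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging each 2x2 block."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by repeating each pixel (nearest neighbor)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Band-pass levels h_k = I_k - upsample(I_{k+1}), then the coarsest image."""
    bands = []
    current = img.astype(float)
    for _ in range(levels - 1):
        coarse = downsample(current)
        bands.append(current - upsample(coarse))   # high-frequency detail at this scale
        current = coarse
    bands.append(current)                          # low-pass residual
    return bands

def reconstruct(bands):
    """Invert the pyramid: start from the residual and add detail back, coarse to fine."""
    img = bands[-1]
    for band in reversed(bands[:-1]):
        img = upsample(img) + band
    return img

rng = np.random.default_rng(0)
image = rng.random((8, 8))
bands = laplacian_pyramid(image, 3)
print(np.allclose(reconstruct(bands), image))
```

The coarse-to-fine reconstruction loop is the same structure that LAPGAN uses when it samples an image.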
Conditional GAN
• Add a conditional variable → $D(x|c)$ or $G(z|c)$
• Variable $c$ can be anything
  • Class label
  • Tags corresponding to the image
  • Additional image information
  • ...
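One simple way to feed the conditional variable to a network is to concatenate it with the other input. The sketch below assumes $c$ is a class label encoded as a one-hot vector and appended to the noise vector $z$; this is one common choice rather than the only one, and the function names are hypothetical.

```python
import numpy as np

def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot vector."""
    v = np.zeros(num_classes)
    v[label] = 1.0
    return v

def conditional_generator_input(z, label, num_classes):
    """Build the input of G(z|c) by appending the one-hot label c to the noise z."""
    return np.concatenate([z, one_hot(label, num_classes)])

rng = np.random.default_rng(0)
z = rng.normal(size=5)                      # noise vector
g_input = conditional_generator_input(z, label=2, num_classes=4)
print(g_input.shape)                        # noise dimensions plus label dimensions
```

The discriminator can be conditioned the same way, by appending the label encoding to its input $x$.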
Laplacian Pyramid GAN : Sampling procedure
Laplacian Pyramid GAN : Sampling procedure
Start with a generator that outputs the coarsest-scale image (Gaussian pyramid level) $\tilde{I}_3$
Laplacian Pyramid GAN : Sampling procedure
1. Upsample the Gaussian pyramid image $\tilde{I}_3$ to $l_2$ (green arrow)
2. Input noise $z_2$ and $l_2$ to the generator ($l_2$ is the conditional information, orange arrow)
3. The generator outputs the Laplacian pyramid image $\tilde{h}_2$
Laplacian Pyramid GAN : Sampling procedure
1. Upsample the Gaussian pyramid image $\tilde{I}_2$ to $l_1$ (green arrow)
2. Input noise $z_1$ and $l_1$ to the generator ($l_1$ is the conditional information, orange arrow)
3. The generator outputs the Laplacian pyramid image $\tilde{h}_1$
Laplacian Pyramid GAN : Sampling procedure
Finally, create the generated image $\tilde{I}_0$
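The sampling procedure above can be sketched as a coarse-to-fine loop. Following Denton et al. (2015), each level adds the generated Laplacian image to the upsampled image from the previous level. The generators below are untrained stand-ins so that the loop runs; their names, the 4x4 starting size, and the nearest-neighbor upsampler are assumptions made for illustration.

```python
import numpy as np

def upsample(img):
    """Double resolution by repeating each pixel (nearest neighbor)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def toy_coarse_generator(z):
    """Stand-in for the coarsest generator: maps noise to a 4x4 image."""
    return np.tanh(z[:16]).reshape(4, 4)

def toy_residual_generator(z, l):
    """Stand-in for a conditional generator: returns a detail image shaped like l."""
    return 0.1 * np.tanh(np.resize(z, l.shape))

def sample_lapgan(coarse_generator, residual_generators, rng, noise_dim=16):
    """Coarse-to-fine sampling; residual_generators are ordered from coarse to fine."""
    img = coarse_generator(rng.normal(size=noise_dim))   # coarsest generated image
    for generator in residual_generators:
        l = upsample(img)                  # conditional information (green arrow)
        z = rng.normal(size=noise_dim)     # fresh noise for this level
        h = generator(z, l)                # generated Laplacian (high-pass) image
        img = l + h                        # image at the next finer scale
    return img

rng = np.random.default_rng(0)
sample = sample_lapgan(toy_coarse_generator, [toy_residual_generator] * 3, rng)
print(sample.shape)
```

With three conditional generators the 4x4 starting image grows to 8x8, 16x16, and finally 32x32.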
Laplacian Pyramid GAN : Training procedure
Result
Result
Believe it or not
Why is LAPGAN better?
• LAPGAN doesn't use a single global Generator/Discriminator
  • Instead, it separates the image into a multi-scale pyramid
  • Other multi-scale approaches might be helpful as well
• Each G/D pair only covers one scale of the pyramid
• Believe it or not, LAPGAN produces sharper images
• (my thought)
  • Each generator focuses on generating a Laplacian pyramid level, which is a high-pass (edge) image, given conditional information
  • This idea can make the generators produce much sharper images
Other GAN topics
• GAN
• LAPGAN
• DCGAN
• InfoGAN
• Bidirectional GAN
• EBGAN
• ...
DCGAN (15.11)
EBGAN (16.09)
StackGAN (16.12)
StackGAN (16.12)
Reference
• Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014.
• Denton, Emily L., Soumith Chintala, and Rob Fergus. "Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks." Advances in Neural Information Processing Systems. 2015.