Lecture 7: Generative Adversarial Networks (source: uvadlc.github.io/lectures/apr2019/lecture7-gan.pdf)
TRANSCRIPT
UVA DEEP LEARNING COURSE - EFSTRATIOS GAVVES - GENERATIVE ADVERSARIAL NETWORKS
Lecture 7: Generative Adversarial Networks
Efstratios Gavves
Lecture overview
o Gentle intro to generative models
o Generative Adversarial Networks
o Variants of Generative Adversarial Networks
Generative models
Types of Learning
o Generative modelling
  ◦ Learn the joint pdf: $p(x, y)$
  ◦ Model the world → Perform tasks, e.g. use Bayes rule to classify: $p(y|x)$
  ◦ Naïve Bayes, Variational Autoencoders, GANs
o Discriminative modelling
  ◦ Learn the conditional pdf: $p(y|x)$
  ◦ Task-oriented
  ◦ E.g., Logistic Regression, SVM
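The generative route to classification can be sketched in a few lines. This is only an illustration under stated assumptions: the joint table `joint` and its probabilities are made up, and `posterior` and `classify` are hypothetical helper names, not part of the lecture.

```python
# Hypothetical joint distribution p(x, y) over a binary feature x and a binary label y.
joint = {
    (0, 0): 0.40, (0, 1): 0.10,
    (1, 0): 0.15, (1, 1): 0.35,
}

def posterior(joint, x):
    """Return p(y | x) for every label y via Bayes rule: p(x, y) / p(x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)  # marginal p(x)
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

def classify(joint, x):
    """Pick the label with the highest posterior probability."""
    post = posterior(joint, x)
    return max(post, key=post.get)

print(posterior(joint, 1))
print(classify(joint, 1))
```

A discriminative model would learn the table returned by `posterior` directly and never represent `joint`.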
Types of Learning
o What to pick?
  ◦ V. Vapnik: “One should solve the [classification] problem directly and never solve a more general [and harder] problem as an intermediate step.”
o Typically, discriminative models are selected to do the job
o Generative models give us more theoretical guarantees that the model is going to work as intended
  ◦ Better generalization
  ◦ Less overfitting
  ◦ Better modelling of causal relationships
Applications of generative modeling?
o Act as a regularizer in discriminative learning
  ◦ Discriminative learning often too goal-oriented
  ◦ Overfitting to the observations
o Semi-supervised learning
  ◦ Missing data
o Simulating “possible futures” for Reinforcement Learning
o Data-driven generation/sampling/simulation
Applications: Image Generation
Applications: Super-resolution
Applications: Cross-model translation
A map of generative models
Explicit density models
o Plug in the model density function to likelihood
o Then maximize the likelihood
Problems
o Model must be complex enough to match data complexity
o Also, model must be computationally tractable
o More details in the next lectures
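As a small illustration of "plug in the density, then maximize the likelihood", the sketch below fits a 1-D Gaussian, for which the maximizer is known in closed form. The data values and the function names `gaussian_log_likelihood` and `fit_gaussian_mle` are assumptions chosen for the example.

```python
import math

def gaussian_log_likelihood(data, mu, sigma):
    """Sum of log N(x; mu, sigma^2) over the data set."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )

def fit_gaussian_mle(data):
    """Closed-form maximum likelihood estimates: sample mean and (biased) std."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return mu, sigma

data = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy training set
mu_hat, sigma_hat = fit_gaussian_mle(data)
print(mu_hat, sigma_hat, gaussian_log_likelihood(data, mu_hat, sigma_hat))
```

A single Gaussian is tractable but far too simple for images, which is the tension the two "Problems" bullets describe.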
Generative modeling: Case I
o Density estimation
[Figure panels: Train set, Fitted model]
Implicit density models
o No explicit probability density function (pdf) needed
o Instead, a sampling mechanism to draw samples from the pdf without knowing the pdf
Implicit density models: GANs
o Sample data in parallel
o Few restrictions on generator model
o No Markov Chains needed
o No variational bounds
o Better qualitative examples
  ◦ Weak but true
Generative modeling: Case II
o Sample Generation
[Figure panels: Train examples, New samples (ideally)]
What is a GAN?
o Generative
  ◦ You can sample novel input samples
  ◦ E.g., you can literally “create” images that never existed
o Adversarial
  ◦ Our generative model $G$ learns adversarially, by fooling a discriminative oracle model $D$
o Network
  ◦ Implemented typically as a (deep) neural network
  ◦ Easy to incorporate new modules
  ◦ Easy to learn via backpropagation
GAN: Intuition
o Assume you have two parties
  ◦ Police: wants to recognize fake money as reliably as possible
  ◦ Counterfeiter: wants to make as realistic fake money as possible
o The police force the counterfeiter to get better (and vice versa)
o Solution relates to Nash equilibrium
GAN: Pipeline
Generator network $x = G(z; \theta^{(G)})$
o Must be differentiable
o No invertibility requirement
o Trainable for any size of $z$
o Can make $x$ conditionally Gaussian given $z$, but no strict requirement
[Figure: $z \to x$]
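To make the idea of sampling through $G$ concrete, here is a toy sketch in which the generator is just an affine map of Gaussian noise. The affine form, the parameter tuple `theta` and the helper `sample` are assumptions for illustration; a real generator would be a deep network. Note that the code never evaluates a density, it only draws $z$ and pushes it through $G$.

```python
import random

def generator(z, theta):
    """Toy generator x = G(z; theta): an affine map, differentiable in theta."""
    scale, shift = theta
    return scale * z + shift

def sample(theta, n, seed=0):
    """Draw n samples by pushing standard Gaussian noise through the generator."""
    rng = random.Random(seed)
    return [generator(rng.gauss(0.0, 1.0), theta) for _ in range(n)]

theta = (2.0, 5.0)  # hypothetical parameters
xs = sample(theta, 10000)
mean = sum(xs) / len(xs)
print(mean)  # close to the shift parameter
```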
Generator & Discriminator: Implementation
o The discriminator is just a standard neural network
o The generator looks like an inverse discriminator
Training definitions
o Minimax
o Maximin
o Heuristic, non-saturating game
o Max likelihood game
Minimax Game
o $J^{(D)} = -\frac{1}{2}\mathbb{E}_{x \sim p_{data}} \log D(x) - \frac{1}{2}\mathbb{E}_{z \sim p_z} \log(1 - D(G(z)))$
o $D(x) = 1$ → The discriminator believes that $x$ is a true image
o $D(G(z)) = 1$ → The discriminator believes that $G(z)$ is a true image
o Equilibrium is a saddle point of the discriminator loss
o Resembles Jensen-Shannon divergence
o Generator minimizes the log-probability of the discriminator being correct
NIPS 2016 Tutorial: Generative Adversarial Networks
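The discriminator cost can be estimated from two mini-batches of discriminator outputs. The sketch below assumes the outputs are already probabilities strictly between 0 and 1; the function name `discriminator_loss` and the example values are illustrative only.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Monte Carlo estimate of J^(D) = -1/2 E[log D(x)] - 1/2 E[log(1 - D(G(z)))]."""
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return -0.5 * real_term - 0.5 * fake_term

# D outputs on real images and on generated images (made-up numbers).
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
print(confident, undecided)
```

An undecided discriminator that outputs 0.5 everywhere gives exactly $\log 2$, which is the value reached at the equilibrium of the game.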
A reasonable loss for the generator?
Minimax Game
o For the simple case of a zero-sum game: $J^{(G)} = -J^{(D)}$
o So, we can summarize the game by
  $V(\theta^{(D)}, \theta^{(G)}) = -J^{(D)}(\theta^{(D)}, \theta^{(G)})$
o Easier theoretical analysis
o In practice not used: when the discriminator starts to recognize fake samples, the generator gradients vanish
Heuristic non-saturating game
o $J^{(D)} = -\frac{1}{2}\mathbb{E}_{x \sim p_{data}} \log D(x) - \frac{1}{2}\mathbb{E}_{z \sim p_z} \log(1 - D(G(z)))$
o $J^{(G)} = -\frac{1}{2}\mathbb{E}_{z \sim p_z} \log D(G(z))$
o Equilibrium not any more describable by single loss
o Generator maximizes the log-probability of the discriminator being mistaken
  ◦ Good $G(z)$ → $D(G(z)) = 1$ → $\log D(G(z))$ is maximized, i.e. the cost $J^{(G)}$ is minimized
o Heuristically motivated; generator can still learn even when discriminator successfully rejects all generator samples
DCGAN Architecture
Examples
Even vector space arithmetics …
[Figure labels: Man with glasses, Man, Woman, Woman with glasses]
Modifying GANs for Max-Likelihood
o $J^{(D)} = -\frac{1}{2}\mathbb{E}_{x \sim p_{data}} \log D(x) - \frac{1}{2}\mathbb{E}_{z} \log(1 - D(G(z)))$
o $J^{(G)} = -\frac{1}{2}\mathbb{E}_{z} \exp\left(\sigma^{-1}(D(G(z)))\right)$, where $\sigma$ is the logistic sigmoid
o When discriminator is optimal, the generator gradient matches that of maximum likelihood
On distinguishability criteria for estimating generative models
Comparison of Generator Losses
o When a sample is likely fake, the minimax and the ML costs are flat → almost no gradients in early steps; the non-saturating heuristic is not flat there
o The ML cost variant generates gradients mostly from the “good generations” → all gradients from few samples → high variance → Variance reduction?
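The difference between the three generator costs shows up in their derivatives with respect to the discriminator logit. The sketch below assumes $D(G(z)) = \sigma(a)$ for a logit $a$, drops the common factor $\frac{1}{2}$, and uses per-sample costs $\log(1 - \sigma(a))$ (minimax), $-\log \sigma(a)$ (non-saturating) and $-\exp(a)$ (maximum likelihood); the function names are illustrative.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def generator_grads(a):
    """Derivative of each per-sample generator cost w.r.t. the logit a."""
    d = sigmoid(a)
    return {
        "minimax": -d,                   # d/da log(1 - sigmoid(a))
        "non_saturating": -(1.0 - d),    # d/da -log(sigmoid(a))
        "max_likelihood": -math.exp(a),  # d/da -exp(a), since sigma^{-1}(D) = a
    }

likely_fake = generator_grads(-5.0)  # discriminator is confident the sample is fake
likely_real = generator_grads(5.0)   # discriminator is fooled
print(likely_fake)
print(likely_real)
```

For a confidently rejected sample only the non-saturating cost keeps a gradient of magnitude close to 1, while the maximum likelihood cost concentrates its gradient on the few samples that already fool the discriminator.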
Optimal discriminator
o Optimal $D(x)$ for any $p_{data}(x)$ and $p_{model}(x)$ is always
  $D(x) = \frac{p_{data}(x)}{p_{data}(x) + p_{model}(x)}$
o Estimating this ratio with supervised learning (discriminator) is the key
[Figure labels: Discriminator, Data, Model distribution]
Why is this the optimal discriminator?
o $L(D, G) = \int_x \left( p_r(x) \log D(x) + p_g(x) \log(1 - D(x)) \right) dx$
  ◦ Maximize $L(D, G)$ w.r.t. $D$: set $\frac{dL}{dD} = 0$ and ignore the integral (we sample over all $x$)
  ◦ The function $x \mapsto a \log x + b \log(1 - x)$ attains its max in $[0, 1]$ at $\frac{a}{a+b}$
o The optimal discriminator
  $D^*(x) = \frac{p_r(x)}{p_r(x) + p_g(x)}$
  ◦ And at optimality $p_g(x) \to p_r(x)$, thus
  $D^*(x) = \frac{1}{2}$ and $L(G^*, D^*) = -2 \log 2$
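The pointwise claim, that $a \log d + b \log(1 - d)$ peaks at $d = \frac{a}{a+b}$, can be checked numerically with a brute-force grid search. The values of `a` and `b` below stand in for $p_r(x)$ and $p_g(x)$ at a single point $x$ and are made up for the example.

```python
import math

def objective(d, a, b):
    """Integrand of L(D, G) at one point x, with a = p_r(x), b = p_g(x), d = D(x)."""
    return a * math.log(d) + b * math.log(1.0 - d)

def argmax_on_grid(a, b, steps=9999):
    """Brute-force maximizer over an evenly spaced grid inside (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda d: objective(d, a, b))

a, b = 0.3, 0.1  # hypothetical density values at one x
best = argmax_on_grid(a, b)
print(best, a / (a + b))
```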
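GANs and Jensen-Shannon divergence
o By expanding the Jensen-Shannon divergence, we have
  $D_{JS}(p_r \| p_g) = \frac{1}{2} D_{KL}\left(p_r \,\Big\|\, \frac{p_r + p_g}{2}\right) + \frac{1}{2} D_{KL}\left(p_g \,\Big\|\, \frac{p_r + p_g}{2}\right)$
  $= \frac{1}{2}\left(\log 2 + \int_x p_r(x) \log \frac{p_r(x)}{p_r(x) + p_g(x)} dx\right) + \frac{1}{2}\left(\log 2 + \int_x p_g(x) \log \frac{p_g(x)}{p_r(x) + p_g(x)} dx\right)$
  $= \frac{1}{2}\left(\log 4 + L(G, D^*)\right)$
https://lilianweng.github.io/lil-log/2017/08/20/from-GAN-to-WGAN.html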
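The relation between the value of the game at the optimal discriminator and the Jensen-Shannon divergence, $L(G, D^*) = 2 D_{JS}(p_r \| p_g) - 2 \log 2$, can be verified on small discrete distributions. The two histograms below are arbitrary examples, and sums replace the integrals.

```python
import math

def kl(p, q):
    """Discrete KL divergence, skipping bins where p is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence via the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def value_at_optimal_d(p_r, p_g):
    """L(G, D*) with D*(x) = p_r(x) / (p_r(x) + p_g(x)), summed over the support."""
    total = 0.0
    for pr, pg in zip(p_r, p_g):
        d_star = pr / (pr + pg)
        total += pr * math.log(d_star) + pg * math.log(1.0 - d_star)
    return total

p_r = [0.5, 0.3, 0.2]  # toy data distribution
p_g = [0.2, 0.3, 0.5]  # toy generator distribution
print(value_at_optimal_d(p_r, p_g), 2 * js(p_r, p_g) - 2 * math.log(2))
```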
Is the divergence important?
o Does the divergence make a difference?
o Is there a difference between KL-divergence, Jensen-Shannon divergence, …
  $D_{KL}(p_r \| p_g) = \int_x p_r \log \frac{p_r}{p_g} dx$
  $D_{JS}(p_r \| p_g) = \frac{1}{2} D_{KL}\left(p_r \,\Big\|\, \frac{p_r + p_g}{2}\right) + \frac{1}{2} D_{KL}\left(p_g \,\Big\|\, \frac{p_r + p_g}{2}\right)$
o Let’s check the KL-divergence
Is the divergence important?
o Forward KL divergence: $D_{KL}(p(x) \| q^*(x))$ → high probability everywhere that the data occurs
o Backward KL divergence: $D_{KL}(q^*(x) \| p(x))$ → low probability wherever the data does not occur
o Which version makes the model “conservative”?
  $D_{KL}(p_r \| p_g) = \int_x p_r \log \frac{p_r}{p_g} dx$
  $p_r$ is what we get and cannot change; $p_g$ is what we make through our model and (through training) change
Is the divergence important?
o $D_{KL}(p(x) \| q^*(x))$ → high probability everywhere that the data occurs
o $D_{KL}(q^*(x) \| p(x))$ → low probability wherever the data does not occur
o Which version makes the model “conservative”?
o $D_{KL}(q^*(x) \| p(x)) = \int q^*(x) \log \frac{q^*(x)}{p(x)} dx$
  ◦ Avoid areas where $p(x) \to 0$
o Zero-forcing
  ◦ $q^*(x) \to 0$ in areas where the approximation $\frac{q^*(x)}{p(x)}$ cannot be good
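A small discrete example can illustrate the two behaviours. The histograms are invented for the purpose: `p` is a bimodal "data" distribution, `q_wide` spreads mass everywhere and `q_narrow` commits to one mode. Under these assumptions the forward KL prefers the wide model, while the backward KL prefers the narrow, conservative one.

```python
import math

def kl(p, q):
    """Discrete KL divergence D_KL(p || q), skipping bins where p is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.48, 0.02, 0.02, 0.48]         # bimodal data distribution (toy)
q_wide = [0.25, 0.25, 0.25, 0.25]    # covers both modes and the gap between them
q_narrow = [0.94, 0.02, 0.02, 0.02]  # covers only the first mode

forward = {"wide": kl(p, q_wide), "narrow": kl(p, q_narrow)}
backward = {"wide": kl(q_wide, p), "narrow": kl(q_narrow, p)}
print(forward)
print(backward)
```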
KL vs JS
o JS is symmetric, KL is not
GAN Problems: Reaching Nash equilibrium causes instabilities
o GAN training is a mini-max optimization
  ◦ Non-cooperative game with a tied objective
o Training is not always easy
  ◦ When optimizing one player/network, we might hurt the other one → oscillations
o Assume two players and $f(x, y) = xy$. We optimize one step at a time
  ◦ Player 1 minimizes: $\min_x f_1(x) = xy \Rightarrow \frac{\partial f_1}{\partial x} = y \Rightarrow x_{t+1} = x_t - \eta \cdot y$
  ◦ Player 2 minimizes: $\min_y f_2(y) = -xy \Rightarrow \frac{\partial f_2}{\partial y} = -x \Rightarrow y_{t+1} = y_t + \eta \cdot x$
https://lilianweng.github.io/lil-log/2017/08/20/from-GAN-to-WGAN.html
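The two update rules can be simulated directly. This sketch assumes both players update from the same iterate $(x_t, y_t)$; the starting point, the step size `eta` and the number of steps are arbitrary choices. Under that assumption the iterates circle the equilibrium $(0, 0)$ with growing radius instead of converging to it.

```python
def simulate(x0, y0, eta, steps):
    """Gradient steps for player 1 (minimizes xy) and player 2 (minimizes -xy)."""
    x, y = x0, y0
    history = [(x, y)]
    for _ in range(steps):
        # Both updates use the previous iterate (simultaneous update).
        x, y = x - eta * y, y + eta * x
        history.append((x, y))
    return history

history = simulate(1.0, 1.0, eta=0.1, steps=200)
radius_start = (history[0][0] ** 2 + history[0][1] ** 2) ** 0.5
radius_end = (history[-1][0] ** 2 + history[-1][1] ** 2) ** 0.5
print(radius_start, radius_end)
```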
GAN Problems: Vanishing Gradients
$J^{(D)} = -\frac{1}{2}\mathbb{E}_{x \sim p_{data}} \log D(x) - \frac{1}{2}\mathbb{E}_{z} \log(1 - D(G(z)))$
$J^{(G)} = -\frac{1}{2}\mathbb{E}_{z} \log D(G(z))$
o If the discriminator is quite bad → no accurate feedback for generator → no reasonable generator gradients
o But, if the discriminator is perfect, $D(x) = D^*(x)$ → gradients go to 0 → no learning anymore
o Bad when this happens early in the training
  ◦ Easier to train the discriminator than the generator
GAN Problems: Mode collapse
o Very low variability
o It is safer for the generator to produce samples from the mode it knows it approximates well
GAN Problems: Low dimensional supports
o Data lie in low-dim manifolds
o However, the manifold is not known
o During training $p_g$ is not perfect either, especially in the start
o So, the supports of $p_r$ and $p_g$ are non-overlapping and disjoint → not good for KL/JS divergences
o Easy to find a discriminating line
Wasserstein GAN
o Instead of KL/JS, use Wasserstein (Earth Mover’s) Distance
  $W(p_r, p_g) = \inf_{\gamma \in \Pi(p_r, p_g)} \mathbb{E}_{(x, y) \sim \gamma} |x - y|$
o Even for non-overlapping supports, the distance is meaningful
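The contrast with JS can be seen on two point masses with disjoint supports. The sketch below is restricted to histograms on a shared, evenly spaced 1-D grid, where the Earth Mover's distance reduces to the summed gap between the two cumulative distributions; that restriction and the helper names are assumptions made for the example.

```python
import math

def wasserstein_1d(p, q, spacing=1.0):
    """Earth Mover's distance between two histograms on the same 1-D grid."""
    total, cdf_gap = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi          # running difference of the two CDFs
        total += abs(cdf_gap) * spacing
    return total

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [1.0, 0.0, 0.0, 0.0]       # all mass at position 0
q_near = [0.0, 1.0, 0.0, 0.0]  # all mass at position 1
q_far = [0.0, 0.0, 0.0, 1.0]   # all mass at position 3

print(wasserstein_1d(p, q_near), wasserstein_1d(p, q_far))
print(js(p, q_near), js(p, q_far))
```

JS stays at $\log 2$ no matter how far apart the two masses are, so it gives no signal about which generator is closer, whereas the Wasserstein distance shrinks as the masses approach each other.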
Feature matching
o Instead of matching image statistics, match feature statistics
  $J^{(D)} = \left\| \mathbb{E}_{x \sim p_{data}} f(x) - \mathbb{E}_{z \sim p_z} f(G(z)) \right\|_2^2$
o $f$ can be any statistic of the data, like the mean or the median
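A minimal sketch of the feature matching cost on two mini-batches follows. The feature map `f` here is a hypothetical stand-in (it returns the value and its square); in practice it would typically be an intermediate layer of the discriminator. The batches are made-up scalars.

```python
def mean_features(batch, f):
    """Average the feature vectors f(x) over a mini-batch."""
    feats = [f(x) for x in batch]
    dim = len(feats[0])
    return [sum(v[i] for v in feats) / len(feats) for i in range(dim)]

def feature_matching_loss(real_batch, fake_batch, f):
    """Squared L2 distance between mean real features and mean generated features."""
    mu_real = mean_features(real_batch, f)
    mu_fake = mean_features(fake_batch, f)
    return sum((r - g) ** 2 for r, g in zip(mu_real, mu_fake))

def f(x):
    # Hypothetical feature map for illustration only.
    return [x, x * x]

real_batch = [1.0, 3.0]  # mean features [2.0, 5.0]
fake_batch = [2.0, 2.0]  # mean features [2.0, 4.0]
print(feature_matching_loss(real_batch, fake_batch, f))
```

In this toy case the two batches share the same mean but not the same second moment, so the loss is driven entirely by the second feature.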
Training procedure
o Use SGD-like algorithm of choice
  ◦ Adam Optimizer is a good choice
o Use two mini-batches simultaneously
  ◦ The first mini-batch contains real examples from the training set
  ◦ The second mini-batch contains fake generated examples from the generator
o Optional: run k steps of one player (e.g. discriminator) for every step of the other player (e.g. generator)
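The alternation schedule can be written as a small framework-agnostic skeleton. The callbacks `d_step` and `g_step` are hypothetical placeholders for "one optimizer update of the discriminator on a real and a fake mini-batch" and "one optimizer update of the generator"; no actual networks are trained here.

```python
def train(num_iterations, k, d_step, g_step):
    """Run k discriminator updates for every generator update."""
    for _ in range(num_iterations):
        for _ in range(k):
            d_step()  # update D on one real and one generated mini-batch
        g_step()      # update G using the freshly updated D

# Record the order of updates instead of training real models.
calls = []
train(3, 2, lambda: calls.append("D"), lambda: calls.append("G"))
print(calls)
```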
Use labels if possible
o Learning a conditional model $p(y|x)$ often generates better samples
  ◦ Denton et al., 2015
o Even learning $p(x, y)$ makes samples look more realistic
  ◦ Salimans et al., 2016
o Conditional GANs are a great addition for learning with labels
One-sided label smoothing
o Default discriminator cost:
  cross_entropy(1., discriminator(data)) + cross_entropy(0., discriminator(samples))
o One-sided label smoothing:
  cross_entropy(0.9, discriminator(data)) + cross_entropy(0., discriminator(samples))
o Do not smooth negative labels:
  cross_entropy(1.-alpha, discriminator(data)) + cross_entropy(beta, discriminator(samples))
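The pseudocode above can be turned into a runnable sketch. It assumes `cross_entropy(target, prob)` is the binary cross-entropy between a soft target and a predicted probability strictly inside (0, 1), and that the discriminator outputs are already available as lists of probabilities; `discriminator_cost` and its `real_label` argument are illustrative names.

```python
import math

def cross_entropy(target, prob):
    """Binary cross-entropy between a (possibly soft) target and a probability."""
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))

def discriminator_cost(d_data, d_samples, real_label=1.0, fake_label=0.0):
    """Mean cost on real outputs plus mean cost on generated outputs."""
    real = sum(cross_entropy(real_label, p) for p in d_data) / len(d_data)
    fake = sum(cross_entropy(fake_label, p) for p in d_samples) / len(d_samples)
    return real + fake

d_data = [0.999, 0.95]   # D outputs on real data (made up)
d_samples = [0.05, 0.1]  # D outputs on generated samples (made up)

default_cost = discriminator_cost(d_data, d_samples)
smoothed_cost = discriminator_cost(d_data, d_samples, real_label=0.9)  # one-sided
print(default_cost, smoothed_cost)
```

With the smoothed target 0.9, an extremely confident output such as 0.999 on real data costs more than an output of 0.9, which is what discourages overconfidence; the fake label stays at 0.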
Benefits of label smoothing
o Max likelihood often is overconfident
  ◦ Might return accurate prediction, but too high probabilities
o Good regularizer
  ◦ Szegedy et al., 2015
o Does not reduce classification accuracy, only confidence
o Specifically for GANs
  ◦ Prevents discriminator from giving very large gradient signals to generator
  ◦ Prevents extrapolating to encourage extreme samples
Batch normalization
o Generally, good practice for neural networks
o Given inputs $X = \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$
o Compute mean and standard deviation of features of $X$: $\mu_{bn}, \sigma_{bn}$
o Normalize features
  ◦ Subtract mean, divide by standard deviation
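The normalization step can be sketched for a mini-batch stored as a list of feature vectors. The small constant `eps` that guards against division by zero is an assumption not shown on the slide, and the learnable scale and shift of full batch normalization are omitted.

```python
import math

def batch_stats(batch):
    """Per-feature mean and standard deviation over the mini-batch."""
    m, dim = len(batch), len(batch[0])
    mu = [sum(x[j] for x in batch) / m for j in range(dim)]
    sigma = [math.sqrt(sum((x[j] - mu[j]) ** 2 for x in batch) / m) for j in range(dim)]
    return mu, sigma

def batch_norm(batch, eps=1e-5):
    """Subtract the batch mean and divide by the batch standard deviation."""
    mu, sigma = batch_stats(batch)
    return [[(x[j] - mu[j]) / (sigma[j] + eps) for j in range(len(x))] for x in batch]

batch = [[1.0, 10.0], [3.0, 30.0]]  # two samples, two features
normalized = batch_norm(batch)
print(normalized)  # each feature is roughly -1 for the first sample, +1 for the second
```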
Batch normalization: Graphically
[Figure: Layer k → Layer k+1, with $z_k = h(x_{k-1})$ and $x_{k+1} = z_k$]
Batch normalization: Graphically
Layer k: $z_k = h(x_{k-1})$
Batch norm ($\mu_{bn}^{(t)}, \sigma_{bn}^{(t)}$)
Layer k+1: $x_{k+1} = \frac{z_k - \mu_{bn}}{\sigma_{bn}}$
![Page 55: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/55.jpg)
But batch normalization can cause strong intra-batch correlation
![Page 56: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/56.jpg)
Reference batch normalization
o Training with two mini-batches
o One fixed reference mini-batch for computing the mean and standard deviation
o The other for doing the training as usual
o Proceed as normal, but take the mean and standard deviation for the batch norm from the fixed reference mini-batch
o Problem: overfitting to the reference mini-batch
[Figure: iterations 1 to 3. At each iteration the reference mini-batch provides $\mu_{bn}, \sigma_{bn}$ for the standard mini-batch, which yields the gradient $\partial J^{(t)} / \partial \theta$ for $t = 1, 2, 3$.]
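Reference batch normalization can be sketched as follows, assuming raw features stand in for layer activations. In a real network the reference statistics would be recomputed from the reference mini-batch's activations under the current parameters; `feature_stats` and `reference_batch_norm` are illustrative names.

```python
import math

def feature_stats(batch, eps=1e-5):
    """Per-feature mean and standard deviation of a mini-batch."""
    m, d = len(batch), len(batch[0])
    mean = [sum(x[j] for x in batch) / m for j in range(d)]
    std = [math.sqrt(sum((x[j] - mean[j]) ** 2 for x in batch) / m + eps)
           for j in range(d)]
    return mean, std

def reference_batch_norm(batch, ref_mean, ref_std):
    """Normalize the training mini-batch with statistics of the reference mini-batch."""
    return [[(x[j] - ref_mean[j]) / ref_std[j] for j in range(len(x))] for x in batch]

reference = [[0.0], [2.0], [4.0]]               # fixed once, reused at every iteration
ref_mean, ref_std = feature_stats(reference)

# The same example [2.0] placed in two different training mini-batches.
out_a = reference_batch_norm([[2.0], [4.0]], ref_mean, ref_std)
out_b = reference_batch_norm([[2.0], [100.0]], ref_mean, ref_std)
```

Here `out_a[0]` and `out_b[0]` are identical: the normalized value of an example no longer depends on the other examples in its mini-batch, which removes the intra-batch correlation at the price of depending entirely on the reference mini-batch.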
![Page 57: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/57.jpg)
Solution: Virtual batch normalization
o Mini-batch = standard mini-batch + reference (fixed) mini-batch
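[Figure: iterations 1 to 3. At each iteration the batch-norm mean and standard deviation are computed from the standard mini-batch together with the reference mini-batch, and the iteration yields the gradient $\partial J^{(t)} / \partial \theta$ for $t = 1, 2, 3$.]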
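A hedged sketch of virtual batch normalization, following the per-example formulation of Salimans et al. (2016), where the statistics for each example come from the reference mini-batch plus that example itself. The slide states the combination more loosely, so treat the exact pooling here as an assumption; the function name and `eps` are also illustrative.

```python
import math

def virtual_batch_norm(batch, reference, eps=1e-5):
    """Normalize each example with statistics of the reference batch plus that example."""
    d = len(reference[0])
    out = []
    for x in batch:
        pool = reference + [x]          # fixed reference mini-batch + the current example
        n = len(pool)
        mean = [sum(p[j] for p in pool) / n for j in range(d)]
        std = [math.sqrt(sum((p[j] - mean[j]) ** 2 for p in pool) / n + eps)
               for j in range(d)]
        out.append([(x[j] - mean[j]) / std[j] for j in range(d)])
    return out

reference = [[0.0], [2.0], [4.0]]
out_a = virtual_batch_norm([[2.0], [4.0]], reference)
out_b = virtual_batch_norm([[2.0], [100.0]], reference)
```

As with reference batch normalization, the output for an example is independent of the other training examples in its mini-batch, but the example now contributes to its own statistics, which is meant to reduce overfitting to the reference mini-batch.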
![Page 58: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/58.jpg)
Balancing Generator & Discriminator
o Usually the discriminator wins
  ◦ That's good, in that the theoretical justification assumes a perfect discriminator
o Usually the discriminator network is bigger than the generator
o Sometimes running the discriminator more often than the generator works better
  ◦ However, no real consensus
o Do not limit the discriminator to avoid making it too smart
  ◦ Better to use the non-saturating cost
  ◦ Better to use label smoothing
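Why the non-saturating cost helps when the discriminator is strong can be checked numerically. The sketch below assumes the discriminator output is a sigmoid of a logit `a` and compares the gradient of the two generator costs with respect to that logit; the function names are illustrative.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(a):
    """d/da of log(1 - sigmoid(a)): the minimax generator cost, as a function of the logit a."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da of -log(sigmoid(a)): the non-saturating generator cost."""
    return -(1.0 - sigmoid(a))

a = -8.0                      # the discriminator confidently rejects the generated sample
g_sat = grad_saturating(a)
g_ns = grad_non_saturating(a)
```

When the discriminator rejects a sample with high confidence, `g_sat` is close to zero while `g_ns` has magnitude close to one, so the generator still receives a learning signal without the discriminator being weakened.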
![Page 59: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/59.jpg)
Open Question: Non-convergence
o Optimization is tricky and unstable
  ◦ Finding a saddle point does not imply a global minimum
o An equilibrium might not even be reached
o Mode collapse is the most severe form of non-convergence
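A standard toy illustration of an equilibrium that is never reached (not taken from the slides) is the game min over x, max over y of V(x, y) = x·y. Its equilibrium is at (0, 0), yet simultaneous gradient steps move away from it. The step size and number of steps below are arbitrary choices.

```python
def simultaneous_gradient_steps(x, y, lr=0.1, steps=100):
    """Simultaneous gradient descent on x and ascent on y for V(x, y) = x * y."""
    radii = []
    for _ in range(steps):
        grad_x, grad_y = y, x                        # dV/dx = y, dV/dy = x
        x, y = x - lr * grad_x, y + lr * grad_y      # both players update at once
        radii.append((x * x + y * y) ** 0.5)         # distance from the equilibrium (0, 0)
    return x, y, radii

_, _, radii = simultaneous_gradient_steps(1.0, 1.0)
```

Each step multiplies the squared distance from the equilibrium by (1 + lr²), so the iterates spiral outward instead of converging.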
![Page 60: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/60.jpg)
Open Question: Mode collapse
o The discriminator converges to the correct distribution
o The generator, however, places all mass in the most likely point
![Page 61: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/61.jpg)
Open Question: Mode collapse
o The discriminator converges to the correct distribution
o The generator, however, places all mass in the most likely point
o Problem: low sample diversity
![Page 62: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/62.jpg)
Minibatch features
o Classify each sample by comparing it to other examples in the mini-batch
o If samples are too similar, the model is penalized
[Figure: a sample compared against its mini-batch; panels "Penalized" and "Not penalized".]
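One way to make "too similar" concrete is a per-sample closeness score over the mini-batch. The sketch below is a simplified version of the minibatch discrimination idea of Salimans et al. (2016): it omits the learned projection and works directly on given feature vectors, so the function name and the choice of the L1 distance on raw features are assumptions.

```python
import math

def minibatch_similarity(features):
    """For each sample, sum exp(-L1 distance) to every other sample in the mini-batch."""
    scores = []
    for i, f_i in enumerate(features):
        total = 0.0
        for j, f_j in enumerate(features):
            if i != j:
                l1 = sum(abs(a - b) for a, b in zip(f_i, f_j))
                total += math.exp(-l1)
        scores.append(total)
    return scores

collapsed = [[1.0, 1.0], [1.0, 1.01], [1.01, 1.0]]    # nearly identical samples
diverse = [[0.0, 0.0], [3.0, -2.0], [-4.0, 5.0]]      # well-separated samples
collapsed_scores = minibatch_similarity(collapsed)
diverse_scores = minibatch_similarity(diverse)
```

A discriminator that receives this score as an extra input feature can tell a collapsed mini-batch (high scores) from a diverse one (scores near zero) and penalize the former.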
![Page 63: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/63.jpg)
Open Question: Evaluation of GANs
o Despite the nice images, who cares?
o It would be nice to quantitatively evaluate the model
o For GANs it is hard even to estimate the likelihood
![Page 64: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/64.jpg)
Open Question: Discrete outputs
o The generator must be differentiable
o It cannot be differentiable if its outputs are discrete
o E.g., harder to make it work for text
o Possible workarounds
  ◦ REINFORCE [Williams, 1992]
  ◦ Concrete distribution [Maddison et al., 2016]
  ◦ Gumbel softmax [Jang et al., 2016]
  ◦ Train the GAN to generate continuous embeddings
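To give a feel for the Gumbel softmax workaround, here is a minimal sampling sketch in plain Python. It draws a relaxed (continuous) one-hot vector from given category log-probabilities; the temperature 0.5, the clipping constant, and the function name are assumptions, and no gradient computation is shown.

```python
import math
import random

def gumbel_softmax_sample(logits, temperature=0.5, rng=random):
    """Draw a relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # keep u strictly inside (0, 1)
        gumbel = -math.log(-math.log(u))                  # standard Gumbel noise
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)                                     # subtract the max for stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
sample = gumbel_softmax_sample([math.log(0.7), math.log(0.2), math.log(0.1)], rng=rng)
```

The output is a probability vector that approaches a one-hot vector as the temperature goes to zero, while remaining a smooth function of the logits, which is what allows gradients to flow back to the generator.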
![Page 65: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/65.jpg)
Open Question: Semi-supervised classification
![Page 66: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/66.jpg)
Interpretable latent codes
o InfoGAN [Chen et al., 2016]
![Page 67: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/67.jpg)
GAN spinoffs
o Conditional GANs
  ◦ Standard GANs have no encoder!
o Actor-critic
  ◦ Related to reinforcement learning
[Figure: Conditional GAN]
![Page 68: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/68.jpg)
Connections to Reinforcement Learning
o GANs interpreted as actor-critic [Pfau and Vinyals, 2016]
o GANs as inverse reinforcement learning [Finn et al., 2016]
o GANs for imitation learning [Ho and Ermon, 2016]
![Page 69: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/69.jpg)
Application: Image to Image translation
![Page 70: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/70.jpg)
Application: Style transfer
![Page 71: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/71.jpg)
Application: Face generation
o https://www.youtube.com/watch?v=XOxxPcy5Gr4
![Page 72: Lecture 7: Generative Adversarial Networks](https://reader036.vdocuments.net/reader036/viewer/2022070712/5ecde339c9dc5a794236dd78/html5/thumbnails/72.jpg)
Summary
o GANs are generative models that use supervised learning to approximate an intractable cost function
o GANs can simulate many cost functions, including maximum likelihood
o Finding Nash equilibria in high-dimensional, continuous, non-convex games is an important open research problem
o GAN research is in its infancy, with most works published only in 2016. Not mature enough yet, but very compelling results