
Page 1:

Lecture 11 Recap

I2DL: Prof. Niessner, Dr. Dai 1

Page 2:

Transfer Learning

I2DL: Prof. Niessner, Dr. Dai 3

Figure: source distribution P1 (large dataset) and target distribution P2 (small dataset)

Use what has been learned in one setting for another setting.

Page 3:

Transfer Learning

I2DL: Prof. Niessner, Dr. Dai 5

Figure: a network trained on ImageNet is reused for a new dataset with C classes; the early layers are kept FROZEN, and only the newly added classifier layers are TRAINed.

[Donahue et al., ICML’14] DeCAF, [Razavian et al., CVPRW’14] CNN Features off-the-shelf

Source: http://cs231n.stanford.edu/slides/2016/winter1516_lecture11.pdf
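To make this concrete, here is a minimal PyTorch sketch of the frozen-feature-extractor setup described above (the ResNet-18 backbone, the number of classes C, and all names are illustrative assumptions, not taken from the slides):

```python
import torch
import torch.nn as nn
import torchvision

# Load a CNN pre-trained on ImageNet and freeze all of its parameters.
model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False  # FROZEN feature extractor

# Replace the final fully connected layer with a fresh one for C classes.
C = 10  # number of classes in the new (small) dataset -- an assumption
model.fc = nn.Linear(model.fc.in_features, C)  # this layer will be TRAINed

# Only the new head's parameters are given to the optimizer.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, C, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```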

Page 4:

Basic Structure of RNNs
• We want a notion of "time" or "sequence"

I2DL: Prof. Niessner, Dr. Dai 7

The hidden state A_t is computed from the current input x_t and the previous hidden state A_{t-1}:

A_t = θ_c A_{t-1} + θ_x x_t

Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/
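A minimal sketch of this recurrence in PyTorch (dimensions and names are illustrative; in practice a nonlinearity such as tanh is applied on top of the linear update shown on the slide):

```python
import torch

input_dim, hidden_dim, seq_len = 16, 32, 10

# Parameters of the recurrence A_t = theta_c * A_{t-1} + theta_x * x_t
theta_c = torch.randn(hidden_dim, hidden_dim) * 0.01
theta_x = torch.randn(hidden_dim, input_dim) * 0.01

x = torch.randn(seq_len, input_dim)   # input sequence
A = torch.zeros(hidden_dim)           # initial hidden state A_0

for t in range(seq_len):
    # linear recurrence from the slide, wrapped in tanh as is common in practice
    A = torch.tanh(theta_c @ A + theta_x @ x[t])

print(A.shape)  # torch.Size([32]) -- final hidden state
```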

Page 5:

Long-Term Dependencies

I2DL: Prof. Niessner, Dr. Dai 8

Example: "I moved to Germany … so I speak German fluently." Predicting the word "German" requires information from much earlier in the sequence.

Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

Page 6:

Long Short-Term Memory Units (LSTM)

I2DL: Prof. Niessner, Dr. Dai 11

Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

Page 7:

Long Short-Term Memory Units
• Key ingredients
  – Cell state: transports the information through the unit

I2DL: Prof. Niessner, Dr. Dai 12

Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

Page 8:

LSTM
• The cell state acts as a highway for the gradient to flow

I2DL: Prof. Niessner, Dr. Dai 13

Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/
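A short PyTorch sketch of using an LSTM unit (batch size, sequence length, and feature sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

# batch of 4 sequences, 10 time steps, 16 input features
x = torch.randn(4, 10, 16)

lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)

# h: hidden states for every time step; (h_n, c_n): final hidden and cell state.
# The cell state c_n is the "highway" carrying information across time steps.
h, (h_n, c_n) = lstm(x)

print(h.shape, h_n.shape, c_n.shape)
# torch.Size([4, 10, 32]) torch.Size([1, 4, 32]) torch.Size([1, 4, 32])
```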

Page 9:

RNNs in Computer Vision
• Caption generation

I2DL: Prof. Niessner, Dr. Dai 14

[Xu et al., PMLR’15] Neural Image Caption Generation

Page 10:

Autoencoders

I2DL: Prof. Niessner, Dr. Dai 15

Page 11:

Machine Learning

Supervised learning
• Labels or target classes
• Goal: learn a mapping from input to label
• Classification, regression

I2DL: Prof. Niessner, Dr. Dai 16

Page 12:

Machine Learning

Unsupervised learning vs. Supervised learning

Figure: dog and cat example images, labeled DOG / CAT on the supervised side

I2DL: Prof. Niessner, Dr. Dai 17

Page 13:

Machine Learning

Unsupervised learning
• No labels or target classes
• Find out properties of the structure of the data
• Examples: clustering (k-means), dimensionality reduction (PCA)

Figure: the same dog and cat images, now without labels

I2DL: Prof. Niessner, Dr. Dai 18

Page 14:

Machine Learning

Unsupervised learning vs. Supervised learning

Figure: dog and cat example images

I2DL: Prof. Niessner, Dr. Dai 19

Page 15:

Machine Learning

Unsupervised learning vs. Supervised learning

Figure: dog and cat example images

I2DL: Prof. Niessner, Dr. Dai 20

Page 16:

Autoencoders
• Unsupervised approach for learning a lower-dimensional feature representation from unlabeled training data

I2DL: Prof. Niessner, Dr. Dai 21

Source: https://hackernoon.com

Page 17:

Autoencoders
• From an input image to a feature representation (bottleneck layer)
• Encoder: a CNN in our case

Figure: input image x → Conv encoder → latent code z

Source: https://bit.ly/37dpsbQ

I2DL: Prof. Niessner, Dr. Dai 22

Page 18:

Autoencoders
• Why do we need this dimensionality reduction?
  – To capture the patterns, the most meaningful factors of variation in our data
• Other dimensionality reduction methods?

I2DL: Prof. Niessner, Dr. Dai 23

Page 19:

Autoencoder Training

Figure: Input Image → Conv (encoder) → Transpose Conv (decoder) → Output Image

Reconstruction loss (e.g., L1, L2) between input and output

I2DL: Prof. Niessner, Dr. Dai 24

Source: https://bit.ly/37dpsbQ
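A minimal PyTorch sketch of such an encoder–decoder trained with an L2 reconstruction loss (the layer sizes and 28x28 input are illustrative assumptions, not the architecture from the slides):

```python
import torch
import torch.nn as nn

# Encoder: conv layers map the image down to a low-dimensional bottleneck.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=4, stride=2, padding=1),   # 28x28 -> 14x14
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=4, stride=2, padding=1),  # 14x14 -> 7x7
    nn.ReLU(),
)

# Decoder: transpose convolutions map the bottleneck back to image size.
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),  # 7x7 -> 14x14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=4, stride=2, padding=1),   # 14x14 -> 28x28
    nn.Sigmoid(),
)

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
criterion = nn.MSELoss()  # L2 reconstruction loss; nn.L1Loss() would be the L1 variant

x = torch.rand(8, 1, 28, 28)    # unlabeled input images
x_hat = decoder(encoder(x))     # reconstruction
loss = criterion(x_hat, x)      # compare the reconstruction to the input itself
loss.backward()
optimizer.step()
```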

Page 20:

Autoencoder Training

Latent space z with dim(z) < dim(x)

Figure: Input x → latent space z → Reconstruction x′ (input images and their reconstructions)

I2DL: Prof. Niessner, Dr. Dai 25

Page 21:

Autoencoder Training
• No labels required
• We can use unlabeled data to first learn its structure

Latent space z with dim(z) < dim(x)

Figure: Input x → latent space z → Reconstruction x′

I2DL: Prof. Niessner, Dr. Dai 26

Page 22:

Autoencoder Use Cases

Figure: embedding of MNIST digits in the learned latent space

I2DL: Prof. Niessner, Dr. Dai 27

Source: https://lts2.epfl.ch/blog/perekres/2015/02/21/layer-by-layer-visualizations-of-mnist-dataset-feature-representations/

Page 23:

Autoencoder for Pre-Training
• Test case: medical applications based on CT images
  – Large set of unlabeled data
  – Small set of labeled data
• We cannot take a network pre-trained on ImageNet. Why?
  – The image features are different for CT scans vs. natural images

I2DL: Prof. Niessner, Dr. Dai 28

Page 24:

Autoencoder for Pre-Training
• Test case: medical applications based on CT images
  – Large set of unlabeled data
  – Small set of labeled data
• We can pre-train our network using an autoencoder to "learn" the type of features present in CT images

I2DL: Prof. Niessner, Dr. Dai 29

Page 25:

Autoencoder for Pre-Training
• Step 1: Unsupervised training with an autoencoder

Figure: Input → encoder → decoder → Reconstruction

I2DL: Prof. Niessner, Dr. Dai 30

Source: https://bit.ly/37dpsbQ

Page 26:

Autoencoder for Pre-Training
• Step 2: Supervised training with the labeled data
  – Throw away the decoder, keep the trained encoder

Figure: Input → encoder (kept) → decoder (discarded) → Reconstruction

I2DL: Prof. Niessner, Dr. Dai 31

Source: https://bit.ly/37dpsbQ

Page 27:

Autoencoder for Pre-Training
• Step 2: Supervised training with the labeled data

Figure: input x → encoder → latent code z → classifier → prediction y; the loss compares y to the ground-truth label y∗ and is backpropagated as always

I2DL: Prof. Niessner, Dr. Dai 32
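A small PyTorch sketch of this two-step recipe, reusing the encoder architecture from the autoencoder sketch above and attaching a classification head (head size, class count, and loss are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Step 1 (done above): train encoder + decoder on the unlabeled data.
# Step 2: throw away the decoder, keep the encoder, add a classifier head.
encoder = nn.Sequential(                        # same architecture as in step 1;
    nn.Conv2d(1, 16, 4, stride=2, padding=1),   # in practice, load its trained weights
    nn.ReLU(),
    nn.Conv2d(16, 32, 4, stride=2, padding=1),
    nn.ReLU(),
)
classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 7 * 7, 10))

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()), lr=1e-4)
criterion = nn.CrossEntropyLoss()

x = torch.rand(8, 1, 28, 28)           # small labeled dataset
y_star = torch.randint(0, 10, (8,))    # ground-truth labels

y = classifier(encoder(x))             # prediction from pre-trained features
loss = criterion(y, y_star)            # supervised loss, backprop as always
loss.backward()
optimizer.step()
```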

Page 28:

Why use Autoencoders?
• Pre-training, as mentioned before
  – Image → same image reconstructed
  – Use the encoder as a "feature extractor"
• Use them to get pixel-wise predictions
  – Image → semantic segmentation
  – Low-resolution image → high-resolution image
  – Image → depth map

I2DL: Prof. Niessner, Dr. Dai 33

Page 29:

Autoencoders for Pixel-wise Predictions

I2DL: Prof. Niessner, Dr. Dai 34

Page 30:

Semantic Segmentation (FCN)
• Recall Fully Convolutional Networks

Can we do better?

[Long et al., CVPR’15] Fully Convolutional Networks for Semantic Segmentation

I2DL: Prof. Niessner, Dr. Dai 35

Page 31:

SegNet

[Badrinarayanan et al., TPAMI‘16] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

I2DL: Prof. Niessner, Dr. Dai 36

Page 32:

SegNet

Figure: example segmentations showing Input, Ground Truth, and SegNet prediction

[Badrinarayanan et al., TPAMI‘16] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

I2DL: Prof. Niessner, Dr. Dai 37

Page 33:

SegNet
• Encoder: normal convolutional filters + pooling
• Decoder: upsampling + convolutional filters

[Badrinarayanan et al., TPAMI‘16] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

I2DL: Prof. Niessner, Dr. Dai 38

Page 34:

SegNet
• Encoder: normal convolutional filters + pooling
• Decoder: upsampling + convolutional filters

[Badrinarayanan et al., TPAMI‘16] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

I2DL: Prof. Niessner, Dr. Dai 39

Page 35:

SegNet
• Encoder: normal convolutional filters + pooling
• Decoder: upsampling + convolutional filters
• The convolutional filters in the decoder are learned via backprop; their goal is to refine the upsampling

[Badrinarayanan et al., TPAMI‘16] SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

I2DL: Prof. Niessner, Dr. Dai 40
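A minimal PyTorch sketch of this encoder/decoder pattern. SegNet specifically upsamples by reusing the max-pooling indices from the encoder; the channel counts and the 5-class output here are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)  # encoder pooling, keep indices
unpool = nn.MaxUnpool2d(2, stride=2)                    # decoder upsampling from indices

enc_conv = nn.Conv2d(3, 16, 3, padding=1)   # encoder: convolution + pooling
dec_conv = nn.Conv2d(16, 5, 3, padding=1)   # decoder: learned conv refining the upsampling
                                            # (5 = number of classes, an assumption)

x = torch.randn(1, 3, 64, 64)

# Encoder: convolution, then pooling (remember where the maxima were).
f = torch.relu(enc_conv(x))
f, indices = pool(f)                         # 64x64 -> 32x32

# Decoder: unpool back to full resolution using the stored indices,
# then a learned convolution refines the sparse upsampled map.
u = unpool(f, indices)                       # 32x32 -> 64x64
logits = dec_conv(u)                         # per-pixel class scores

print(logits.shape)  # torch.Size([1, 5, 64, 64])
```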

Page 36:

Generative Models
• Given training data, how can we generate new samples from the same distribution?

Figure: real images vs. generated images

Source: https://openai.com/blog/generative-models/

I2DL: Prof. Niessner, Dr. Dai 42

Page 37:

Generative Models

Taxonomy of generative models:
• Explicit density
  – Tractable density: Fully Visible Belief Nets
  – Approximate density
    · Variational: Variational Autoencoder
    · Markov Chain: Boltzmann Machine
• Implicit density
  – Markov Chain: GSN
  – Direct: GAN

I2DL: Prof. Niessner, Dr. Dai 43

Figure copyright and adapted from Ian Goodfellow, Tutorial on Generative Adversarial Networks, 2017

Page 38:

Variational Autoencoders

I2DL: Prof. Niessner, Dr. Dai 44

Page 39:

Autoencoders
• Encode the input into a representation (bottleneck) and reconstruct it with the decoder

Figure: x → Encoder (Conv) → z → Decoder (Transpose Conv) → reconstructed x

I2DL: Prof. Niessner, Dr. Dai 45

Page 40:

Autoencoders
• Encode the input into a representation (bottleneck) and reconstruct it with the decoder

Figure: latent space learned by an autoencoder on MNIST

Source: https://bit.ly/37ctFMS

I2DL: Prof. Niessner, Dr. Dai 46

Page 41:

Variational Autoencoder

Figure: x → Encoder (Conv, parameters φ) → z → Decoder (Transpose Conv, parameters θ) → x
The encoder models q_φ(z|x); the decoder models p_θ(x|z)

I2DL: Prof. Niessner, Dr. Dai 47

Page 42:

Variational Autoencoder

Goal: sample from the latent distribution to generate new outputs!

Figure: x → Encoder (Conv, φ) → z → Decoder (Transpose Conv, θ) → x

I2DL: Prof. Niessner, Dr. Dai 48

Page 43:

Variational Autoencoder

• The latent space is now a distribution
• Specifically, it is a Gaussian

Figure: x → Encoder (φ) → (μ_{z|x}, Σ_{z|x}) → Sample z → Decoder (θ) → x

z|x ∼ N(μ_{z|x}, Σ_{z|x})

I2DL: Prof. Niessner, Dr. Dai 49

Page 44:

Variational Autoencoder

• The latent space is now a distribution
• Specifically, it is a Gaussian

Figure: x → Encoder (φ) → mean μ_{z|x} and diagonal covariance Σ_{z|x}

z|x ∼ N(μ_{z|x}, Σ_{z|x})

I2DL: Prof. Niessner, Dr. Dai 50

Page 45:

Variational Autoencoder

• Training

Figure: x → Encoder (φ) → (μ_{z|x}, Σ_{z|x}) → Sample z → Decoder (θ) → x

z|x ∼ N(μ_{z|x}, Σ_{z|x})

I2DL: Prof. Niessner, Dr. Dai 51

Page 46:

Variational Autoencoder
• The sampling operation is not differentiable
  -> We can't backpropagate through the latent space

Figure: x → Encoder (φ) → (μ_{z|x}, Σ_{z|x}) → Sample z (🚫 gradient blocked) → Decoder (θ) → x

z|x ∼ N(μ_{z|x}, Σ_{z|x})

I2DL: Prof. Niessner, Dr. Dai 52

Page 47:

Reparametrization Trick
• Now we only need to backpropagate through an addition and a multiplication

Figure: sample ε ∼ N(0, 1), then z = μ_{z|x} + Σ_{z|x} * ε; the stochastic sampling node is moved outside the backpropagation path

I2DL: Prof. Niessner, Dr. Dai 53
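A minimal PyTorch sketch of the reparametrization trick (the encoder outputs and the latent dimension are illustrative placeholders):

```python
import torch

# Assume the encoder produced, for a batch of 8 inputs, a 32-dimensional
# mean and (diagonal) log-variance of the latent Gaussian q(z|x).
mu = torch.randn(8, 32, requires_grad=True)
logvar = torch.randn(8, 32, requires_grad=True)

# Reparametrization: sample eps from a fixed N(0, I), then build z with
# only a multiplication and an addition, both differentiable w.r.t. mu, logvar.
eps = torch.randn_like(mu)
std = torch.exp(0.5 * logvar)
z = mu + std * eps        # z | x ~ N(mu, diag(std^2))

# Gradients now flow from any loss on z back into mu and logvar.
z.sum().backward()
print(mu.grad.shape, logvar.grad.shape)  # torch.Size([8, 32]) twice
```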

Page 48:

Variational Autoencoder

• Test time: sample from the latent space

Figure: Sample z ∼ N(μ_{z|x}, Σ_{z|x}) → Decoder (θ) → generated x

I2DL: Prof. Niessner, Dr. Dai 54
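At test time only the decoder is needed; a short sketch of generating new images by decoding sampled latent vectors (the decoder architecture, latent size, and the choice to sample from a standard normal are illustrative assumptions):

```python
import torch
import torch.nn as nn

latent_dim = 32

# Placeholder decoder: maps a latent vector to a 28x28 image.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 7 * 7 * 32),
    nn.ReLU(),
    nn.Unflatten(1, (32, 7, 7)),
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
    nn.Sigmoid(),
)

# Sample latent codes and decode them into new images.
z = torch.randn(16, latent_dim)
with torch.no_grad():
    samples = decoder(z)
print(samples.shape)  # torch.Size([16, 1, 28, 28])
```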

Page 49:

Autoencoder vs VAE

Figure: reconstructions from an Autoencoder vs. a Variational Autoencoder vs. Ground Truth

Source: https://github.com/kvfrans/variational-autoencoder

I2DL: Prof. Niessner, Dr. Dai 55

Page 50:

Autoencoder Overview
• Autoencoders (AE)
  – Reconstruct the input
  – Unsupervised learning
  – Latent space features are useful
• Variational Autoencoders (VAE)
  – Probability distribution in latent space (e.g., Gaussian)
  – Interpretable latent space (head pose, smile)
  – Sample from the model to generate output

I2DL: Prof. Niessner, Dr. Dai 56

Page 51:

Generative Adversarial Networks (GANs)

I2DL: Prof. Niessner, Dr. Dai 57

Page 52:

Generative Adversarial Networks (GANs)

Figure: the "GAN zoo", a long list of published GAN variants

Source: https://github.com/hindupuravinash/the-gan-zoo

I2DL: Prof. Niessner, Dr. Dai 58

Page 53:

Convolution and Deconvolution

Figure: convolution (no padding, no stride) vs. transposed convolution (no padding, no stride), each mapping an Input grid to an Output grid

Source: https://github.com/vdumoulin/conv_arithmetic

I2DL: Prof. Niessner, Dr. Dai 59
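A quick PyTorch sketch contrasting the two operations by their effect on spatial size (kernel size and channel counts are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 4, 4)  # 4x4 input, single channel

conv = nn.Conv2d(1, 1, kernel_size=3, stride=1, padding=0)
tconv = nn.ConvTranspose2d(1, 1, kernel_size=3, stride=1, padding=0)

y = conv(x)     # convolution shrinks the map:     4x4 -> 2x2
z = tconv(y)    # transposed convolution grows it: 2x2 -> 4x4

print(y.shape)  # torch.Size([1, 1, 2, 2])
print(z.shape)  # torch.Size([1, 1, 4, 4])
```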

Page 54:

Autoencoder

Figure: Conv encoder and Deconv (transposed convolution) decoder

I2DL: Prof. Niessner, Dr. Dai 60

Page 55:

Decoder as Generative Model

Latent space z with dim(z) < dim(x)

Test time: -> reconstruction from a 'random' vector -> Output Image

Reconstruction loss (often L2)

I2DL: Prof. Niessner, Dr. Dai 63

Page 56:

Decoder as Generative Model

Interpolation between two chair models

I2DL: Prof. Niessner, Dr. Dai 64

[Dosovitskiy et al., ‘14] Learning to Generate Chairs

Page 57:

Decoder as Generative Model

Morphing between chair models

I2DL: Prof. Niessner, Dr. Dai 65

[Dosovitskiy et al., ‘14] Learning to Generate Chairs

Page 58:

Decoder as Generative Model

Latent space z with dim(z) < dim(x)

"Test time": -> reconstruction from a 'random' vector

Reconstruction loss: often L2, i.e., the sum of squared distances
-> L2 distributes the error equally
-> the mean is optimal
-> the result is blurry

Instead of L2, can we "learn" a loss function?

I2DL: Prof. Niessner, Dr. Dai 66

Page 59:

Generative Adversarial Networks (GANs)

[Goodfellow et al., NIPS‘14] Generative Adversarial Networks (slide from McGuinness)

Figure: z → Generator G → G(z) → Discriminator D → D(G(z))

I2DL: Prof. Niessner, Dr. Dai 67

Page 60:

Generative Adversarial Networks (GANs)

Figure: z → Generator G → G(z) → Discriminator D → D(G(z)); real data x → Discriminator D → D(x)

I2DL: Prof. Niessner, Dr. Dai 68

[Goodfellow et al., NIPS‘14] Generative Adversarial Networks (slide from McGuinness)

Page 61:

Generative Adversarial Networks (GANs)

Figure: the discriminator separating real data from fake data

[Goodfellow, NIPS‘16] Tutorial: Generative Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 69

Page 62:

GANs: Loss Functions
• Minimax game:
  – G minimizes the probability that D is correct
  – The equilibrium is a saddle point of the discriminator loss
• Discriminator loss (binary cross-entropy):

  J^(D) = -1/2 E_{x∼p_data}[log D(x)] - 1/2 E_z[log(1 - D(G(z)))]

• Generator loss:

  J^(G) = -J^(D)

• D provides supervision (i.e., gradients) for G

[Goodfellow et al., NIPS‘14] Generative Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 70

Page 63:

GANs: Loss Functions
• Heuristic method (often used in practice):
  – G maximizes the log-probability of D being mistaken
  – G can still learn even when D rejects all generator samples
• Discriminator loss:

  J^(D) = -1/2 E_{x∼p_data}[log D(x)] - 1/2 E_z[log(1 - D(G(z)))]

• Generator loss:

  J^(G) = -1/2 E_z[log D(G(z))]

[Goodfellow et al., NIPS‘14] Generative Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 71
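A sketch of these two losses written directly from the formulas above; d_real and d_fake stand for D(x) and D(G(z)), and all names are placeholders:

```python
import torch

def discriminator_loss(d_real, d_fake, eps=1e-8):
    # J^(D) = -1/2 E[log D(x)] - 1/2 E[log(1 - D(G(z)))]
    return -0.5 * torch.log(d_real + eps).mean() - 0.5 * torch.log(1 - d_fake + eps).mean()

def generator_loss_heuristic(d_fake, eps=1e-8):
    # Non-saturating heuristic: J^(G) = -1/2 E[log D(G(z))]
    return -0.5 * torch.log(d_fake + eps).mean()

def generator_loss_minimax(d_real, d_fake):
    # Minimax variant: J^(G) = -J^(D)
    return -discriminator_loss(d_real, d_fake)

# Example with dummy discriminator outputs in (0, 1):
d_real = torch.tensor([0.9, 0.8])
d_fake = torch.tensor([0.2, 0.3])
print(discriminator_loss(d_real, d_fake), generator_loss_heuristic(d_fake))
```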

Page 64:

Alternating Gradient Updates
• Step 1: Fix G, and perform a gradient step to minimize the discriminator loss

  J^(D) = -1/2 E_{x∼p_data}[log D(x)] - 1/2 E_z[log(1 - D(G(z)))]

• Step 2: Fix D, and perform a gradient step to minimize the generator loss

  J^(G) = -1/2 E_z[log D(G(z))]

I2DL: Prof. Niessner, Dr. Dai 72
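A minimal sketch of one alternating update in PyTorch; the log terms above are implemented here with binary cross-entropy, and the models, data, and hyperparameters are placeholders:

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCELoss()  # binary cross-entropy implements the log terms above

x_real = torch.randn(64, data_dim)   # stand-in for real training data
z = torch.randn(64, latent_dim)

# Step 1: fix G, update D (classify real as 1, fake as 0).
d_real, d_fake = D(x_real), D(G(z).detach())  # detach: no gradient into G
loss_D = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

# Step 2: fix D, update G (non-saturating loss: make D output 1 on fakes).
d_fake = D(G(z))
loss_G = bce(d_fake, torch.ones_like(d_fake))
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```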

Page 65:

Training a GAN

Source: https://medium.com/ai-society/gans-from-scratch-1-a-deep-introduction-with-code-in-pytorch-and-tensorflow-cb03cdcdba0f

I2DL: Prof. Niessner, Dr. Dai 74

Page 66:

GANs: Loss Functions

Figure: plot comparing the Minimax and the Heuristic (non-saturating) generator loss

[Goodfellow et al., NIPS‘14] Generative Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 75

Page 67:

DCGAN: Generator

Generator of Deep Convolutional GANs

I2DL: Prof. Niessner, Dr. Dai 76
[Radford et al., ICLR‘16] DCGAN: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Page 68:

DCGAN: Results

Results on MNIST

I2DL: Prof. Niessner, Dr. Dai 77
[Radford et al., ICLR‘16] DCGAN: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Page 69:

DCGAN: Results

Results on CelebA (200k relatively well aligned portrait photos)

I2DL: Prof. Niessner, Dr. Dai 78

[Radford et al., ICLR‘16] DCGAN: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Page 70:

Conditional Generative Adversarial Networks (cGANs)

I2DL: Prof. Niessner, Dr. Dai 79

Page 71:

Pix2Pix: Image-to-Image Translation

I2DL: Prof. Niessner, Dr. Dai

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 80

Page 72:

Figure: z → Generator G → G(z) → Discriminator D → real or fake?

min_G max_D E_{z,x}[log D(G(z)) + log(1 - D(x))]

With paired training data (x, y) this becomes:

min_G max_D E_{x,y}[log D(G(x)) + log(1 - D(y))]

I2DL: Prof. Niessner, Dr. Dai 81

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

Page 73:

Figure: input x → Generator G → G(x) → Discriminator D → real or fake?

min_G max_D E_{x,y}[log D(G(x)) + log(1 - D(y))]

I2DL: Prof. Niessner, Dr. Dai 82

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

Page 74:

Figure: the Generator G produces G(x) from x; the Discriminator D, looking only at G(x), answers "Real!"

min_G max_D E_{x,y}[log D(G(x)) + log(1 - D(y))]

I2DL: Prof. Niessner, Dr. Dai 83

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

Page 75:

min_G max_D E_{x,y}[log D(G(x)) + log(1 - D(y))]

Figure: a different generated output G(x) is also judged "Real too!" by the Discriminator D

I2DL: Prof. Niessner, Dr. Dai 84

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

Page 76:

min_G max_D E_{x,y}[log D(x, G(x)) + log(1 - D(x, y))]

The conditional discriminator now asks: real or fake pair?
Match the joint distribution: p(x, G(x)) ∼ p(x, y)

Figure: D sees the fake pair (x, G(x)) and the real pair (x, y)

I2DL: Prof. Niessner, Dr. Dai 85

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks
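A sketch of this key idea, a discriminator that scores the pair (input, output) by channel-wise concatenation, as in conditional GANs; the tiny network below is a toy placeholder, not the PatchGAN used in the paper:

```python
import torch
import torch.nn as nn

class PairDiscriminator(nn.Module):
    """Scores an (input image, output image) pair as real or fake."""
    def __init__(self, in_channels=3, out_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            # the pair is concatenated along the channel dimension
            nn.Conv2d(in_channels + out_channels, 32, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=1))

D = PairDiscriminator()
x = torch.randn(4, 3, 64, 64)       # input image (e.g., an edge map)
y_real = torch.randn(4, 3, 64, 64)  # real target image
y_fake = torch.randn(4, 3, 64, 64)  # stand-in for G(x)

print(D(x, y_real).shape, D(x, y_fake).shape)  # per-patch real/fake scores
```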

Page 77:

Pix2Pix

I2DL: Prof. Niessner, Dr. Dai 86

Page 78:

Edges → Images

Figure: Input / Output example pairs

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks
Edges from [Xie et al., ICCV’15] Holistically-Nested Edge Detection

I2DL: Prof. Niessner, Dr. Dai 87

Page 79:

Sketches → Images

Figure: Input / Output example pairs

Trained on Edges → Images

Data from [Eitz, Hays, Alexa, 2012]

I2DL: Prof. Niessner, Dr. Dai 88

Page 80:

Figure: Input / Output / Ground Truth examples

Data from maps.google.com

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 89

Page 81:

BW → Color

Figure: Input / Output example pairs

Data from ImageNet

[Isola et al., CVPR‘17] Pix2Pix: Image-to-Image Translation with Conditional Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 90

Page 82:

GAN Applications

I2DL: Prof. Niessner, Dr. Dai 91

Page 83:

BigGAN: HD Image Generation

I2DL: Prof. Niessner, Dr. Dai 92

[Brock et al., ICLR‘19] BigGAN: Large Scale GAN Training for High Fidelity Natural Image Synthesis

Page 84:

StyleGAN: Face Image Generation

I2DL: Prof. Niessner, Dr. Dai 93

[Karras et al., ‘18] StyleGAN: A Style-Based Generator Architecture for Generative Adversarial Networks
[Karras et al., ‘19] StyleGAN2: Analyzing and Improving the Image Quality of StyleGAN

Page 85:

CycleGAN: Unpaired Image-to-Image Translation

[Zhu et al., ICCV‘17] CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

I2DL: Prof. Niessner, Dr. Dai 94

Page 86:

SPADE: GAN-Based Image Editing

[Park et al., CVPR‘19] SPADE: Semantic Image Synthesis with Spatially-Adaptive Normalization

I2DL: Prof. Niessner, Dr. Dai 95

Page 87:

Few-Shot-Vid2Vid: Single-Shot Video Generation

I2DL: Prof. Niessner, Dr. Dai 96

• https://www.youtube.com/watch?v=APoB1u3kTOU
• https://www.youtube.com/watch?v=kkA6CHRovKA

[Wang et al., NeurIPS‘19] Few-Shot Video-to-Video Synthesis

Page 88:

References for Further Reading
• https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf

• https://phillipi.github.io/pix2pix/

• http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture13.pdf

I2DL: Prof. Niessner, Dr. Dai 97

Page 89:

Next Lecture

I2DL: Prof. Niessner, Dr. Dai 98

• Next lecture on 28th January: course outlook

• Reminder: Exercise 3 due tomorrow at 18:00

• Thursday exercise session on exercise 3 and exam tips

Page 90:

See you next time!

I2DL: Prof. Niessner, Dr. Dai 99