Deep Generative Models
Mijung Kim
Posted on 21-Apr-2017

Discriminative vs. Generative Learning

• Discriminative learning: learn p(y|x) directly.
• Generative learning: model p(y) and p(x|y) first, then derive the posterior via Bayes' rule:

  p(y|x) = p(x|y) p(y) / p(x)
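The generative route above can be sketched in a few lines. This is a minimal illustration with made-up class priors and likelihood numbers (the classes "spam"/"ham" and the feature values are hypothetical, not from the slides): model p(y) and p(x|y) as tables, then apply Bayes' rule to get p(y|x).

```python
# Generative learning sketch: model p(y) and p(x|y) explicitly,
# then derive the posterior p(y|x) = p(x|y) p(y) / p(x).
# All numbers below are illustrative, hand-picked values.

p_y = {"spam": 0.3, "ham": 0.7}                      # class prior p(y)
p_x_given_y = {                                      # likelihood p(x|y)
    ("money", "spam"): 0.8, ("money", "ham"): 0.1,
    ("hello", "spam"): 0.2, ("hello", "ham"): 0.9,
}

def posterior(x):
    """Return p(y|x) for every class y via Bayes' rule."""
    p_x = sum(p_x_given_y[(x, y)] * p_y[y] for y in p_y)   # evidence p(x)
    return {y: p_x_given_y[(x, y)] * p_y[y] / p_x for y in p_y}

post = posterior("money")   # posterior over classes after observing "money"
```

A discriminative model would instead fit p(y|x) directly, without ever representing p(x|y) or p(x).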

Undirected Graph vs. Directed Graph

Undirected:
• Boltzmann Machines
• Restricted Boltzmann Machines
• Deep Boltzmann Machines

Directed:
• Sigmoid Belief Networks
• Variational Autoencoders (VAE)
• Generative Adversarial Networks (GAN)

Both (undirected + directed):
• Deep Belief Networks

[Figure: an undirected and a directed graph over nodes a, b, c]

Boltzmann Machines

• Stochastic recurrent neural network and Markov random field, invented by Hinton and Sejnowski in 1985
• P(x) = exp(−E(x)) / Z
  – E(x): energy function
  – Z: partition function, chosen so that Σ_x P(x) = 1
• Energy-based model: exp(−E(x)) is positive for every x, so every state gets a valid positive probability
• Single visible layer and single hidden layer
• Fully connected: not practical to implement
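The definition P(x) = exp(−E(x)) / Z can be made concrete by enumerating all states of a tiny model. This is a sketch with a hand-picked 3-unit weight matrix (illustrative values, not from the slides); it computes the partition function by brute force, which is exactly what becomes intractable for large, fully connected BMs.

```python
import itertools
import math

# Tiny energy-based model over 3 binary units with an illustrative
# symmetric weight matrix (no biases, for brevity).
W = [[0.0, 0.5, -0.2],
     [0.5, 0.0, 0.3],
     [-0.2, 0.3, 0.0]]

def energy(x):
    # Boltzmann machine energy: E(x) = -sum_{i<j} W_ij x_i x_j
    return -sum(W[i][j] * x[i] * x[j]
                for i in range(3) for j in range(i + 1, 3))

states = list(itertools.product([0, 1], repeat=3))
Z = sum(math.exp(-energy(x)) for x in states)        # partition function
P = {x: math.exp(-energy(x)) / Z for x in states}    # P(x) = exp(-E(x)) / Z
```

Note that `Z` requires a sum over all 2^n states, which is why exact inference in a fully connected BM does not scale.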

Restricted Boltzmann Machines

• Used for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling
• P(v = v, h = h) = (1/Z) exp(−E(v, h))
• Two layers, like BMs, but with no connections within a layer
• Building blocks of deep probabilistic models
• Trained with Gibbs sampling plus Contrastive Divergence (CD) or Persistent CD
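The CD training mentioned above can be sketched as a single CD-1 update: one Gibbs step from the data, then a parameter update from the difference between data and model statistics. The layer sizes, learning rate, and training vector below are all illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Tiny RBM: 4 visible units, 3 hidden units (illustrative sizes).
W = rng.normal(0, 0.1, size=(4, 3))   # visible-hidden weights
b = np.zeros(4)                        # visible biases
c = np.zeros(3)                        # hidden biases

def cd1_step(v0, lr=0.1):
    """One Contrastive Divergence (CD-1) update on one visible vector."""
    global W, b, c
    # Positive phase: sample h0 ~ P(h | v0)
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(3) < ph0).astype(float)
    # Negative phase: one Gibbs step back down and up
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(4) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Update: data statistics minus (one-step) model statistics
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    b += lr * (v0 - v1)
    c += lr * (ph0 - ph1)

data = np.array([1.0, 1.0, 0.0, 0.0])
for _ in range(100):
    cd1_step(data)
```

Persistent CD differs only in that the negative-phase chain is kept running across updates instead of being restarted at the data.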

Comparison between BMs and RBMs

[Figure: a BM with connections both within and between layers v1–v3 and h1–h4, vs. an RBM with only visible-to-hidden connections]

BM:  E(v, h) = −v^T R v − h^T S h − v^T W h − b^T v − c^T h
RBM: E(v, h) = −b^T v − c^T h − v^T W h

Deep Belief Networks

• Unsupervised
• Works on small datasets
• Stacked RBMs
• Pre-train each RBM, one layer at a time
• Undirected (top layer) + directed (lower layers)

Top two layers form an RBM; the layers below form a sigmoid belief net.

• P(v, h1, h2, h3) = P(v | h1) P(h1 | h2) P(h2, h3)
• P(v | h1) = Π_i P(v_i | h1)
• P(h1 | h2) = Π_j P(h1_j | h2)
• P(h2, h3) = (1/Z(W3)) exp(h2^T W3 h3)
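The factorization above gives a direct recipe for sampling from a DBN: run Gibbs sampling in the top RBM to draw (h2, h3), then propagate down through the directed sigmoid layers. This sketch uses small random weights and illustrative layer sizes (none of the numbers come from the slides).

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Illustrative DBN weights: W1 (v-h1), W2 (h1-h2), W3 (h2-h3 top RBM).
n_v, n1, n2, n3 = 4, 3, 3, 2
W1 = rng.normal(0, 0.5, (n_v, n1))
W2 = rng.normal(0, 0.5, (n1, n2))
W3 = rng.normal(0, 0.5, (n2, n3))

def sample_dbn(gibbs_steps=50):
    """Sample v: Gibbs on the top RBM P(h2, h3), then go down
    through the sigmoid belief net via P(h1 | h2) and P(v | h1)."""
    h2 = (rng.random(n2) < 0.5).astype(float)
    for _ in range(gibbs_steps):              # approximate draw from P(h2, h3)
        h3 = (rng.random(n3) < sigmoid(h2 @ W3)).astype(float)
        h2 = (rng.random(n2) < sigmoid(h3 @ W3.T)).astype(float)
    h1 = (rng.random(n1) < sigmoid(h2 @ W2.T)).astype(float)   # P(h1 | h2)
    v = (rng.random(n_v) < sigmoid(h1 @ W1.T)).astype(float)   # P(v | h1)
    return v

v = sample_dbn()
```

Only the top pair needs Gibbs sampling; once (h2, h3) is drawn, the directed layers are sampled in a single ancestral pass.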

[Figure: DBN with layers v, h1, h2, h3 and weights W1, W2, W3; the top pair (h2, h3) is an undirected RBM, the lower layers form a directed sigmoid belief net]

Limitations of DBNs (by Ruslan Salakhutdinov)

• Explaining away
• Greedy layer-wise pre-training: no joint optimization over all layers
• Approximate inference is purely feed-forward: no combined bottom-up and top-down pass

http://www.slideshare.net/zukun/p05-deep-boltzmann-machines-cvpr2012-deep-learning-methods-for-vision

Deep Boltzmann Machines

• Unsupervised
• Works on small datasets
• Stacked RBMs
• Pre-train each RBM
• Fully undirected

β€’ π‘ƒπœƒ v =

1

𝑍(πœƒ)Οƒh1,h2,h3 exp( v

π‘‡π‘Š1h1 +

h1π‘‡π‘Š2h2 + h2π‘‡π‘Š3h3)

β€’ πœƒ = {π‘Š1,π‘Š2,π‘Š3}

β€’ Bottom-up and Top-down:

β€’ 𝑃 β„Žπ‘—2 = 1|h1, h3 =

𝜎(Οƒπ’Œπ‘Ύπ’Œπ’‹πŸ‘ π’‰π’Œ

πŸ‘ + Οƒπ’Žπ‘Ύπ’Žπ’‹πŸ π’‰π’Ž

𝟏 )

11

β„Ž1

β„Ž2

β„Ž3

v

π‘Š3

π‘Š2

π‘Š1

Variational Autoencoders (VAE)

[Figure: encoder q(z|x) maps x to μ and σ; z is sampled from N(μ, σ); decoder p(x|z) reconstructs x']

L(q) = −D_KL( q(z|x) ‖ p_model(z) ) + E_{z~q(z|x)}[ log p_model(x|z) ]

z ~ N(μ, σ)

http://www.slideshare.net/KazukiNitta/variational-autoencoder-68705109
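When the encoder outputs a diagonal Gaussian and the prior is N(0, I), the KL term in the loss above has a closed form, −D_KL = ½ Σ (1 + log σ² − μ² − σ²), which is what VAE implementations typically compute. A small numeric check of that formula (the test values are illustrative):

```python
import numpy as np

def neg_kl_gaussian(mu, log_var):
    """-D_KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form:
    0.5 * sum(1 + log sigma^2 - mu^2 - sigma^2)."""
    return 0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

mu = np.array([0.0, 0.0])
log_var = np.array([0.0, 0.0])      # sigma = 1, i.e. q equals the prior
val = neg_kl_gaussian(mu, log_var)  # KL is 0 when q matches the prior
```

The reconstruction term E_{z~q}[log p(x|z)] has no such closed form and is estimated by sampling z, which is where the reparameterization trick on the next slide comes in.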

Stochastic Gradient Variational Bayes (SGVB) Estimator

[Figure: same encoder–decoder as before; the feed-forward pass samples ε ~ N(0, I) and sets z = μ + εσ, so backpropagation flows through μ and σ]

ε ~ N(0, I)
z = μ + εσ

http://dpkingma.com/wordpress/wp-content/uploads/2014/05/2014-03_talk_iclr.pdf
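The reparameterization z = μ + εσ moves all randomness into ε, so z is a deterministic, differentiable function of μ and σ. A quick numeric sanity check (μ = 2.0 and σ = 0.5 are arbitrary example values): samples produced this way do follow N(μ, σ²).

```python
import numpy as np

rng = np.random.default_rng(3)

def reparameterize(mu, sigma, n_samples=100000):
    """z = mu + eps * sigma with eps ~ N(0, I): the noise lives in eps,
    so gradients can flow through mu and sigma."""
    eps = rng.standard_normal(n_samples)
    return mu + eps * sigma

z = reparameterize(2.0, 0.5)   # samples should have mean ~2.0, std ~0.5
```

Sampling z directly from N(μ, σ) would block the gradient at the sampling step; with the trick, ∂z/∂μ = 1 and ∂z/∂σ = ε, both well-defined.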

Generative Adversarial Networks (GAN)

[Figure: the generator turns noise into a sample; the discriminator sees real data samples and generated samples and answers yes or no (real or fake)]

https://ishmaelbelghazi.github.io/ALI
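The yes/no game above is the minimax objective V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))], which the discriminator maximizes and the generator minimizes. A minimal sketch that evaluates V from discriminator outputs (the sample probability lists are made-up values); at the equilibrium, where D outputs 0.5 everywhere, V = −log 4.

```python
import math

def gan_value(d_real, d_fake):
    """GAN minimax objective V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))],
    estimated from discriminator outputs on real and generated samples."""
    term_real = sum(math.log(d) for d in d_real) / len(d_real)
    term_fake = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return term_real + term_fake

# At equilibrium the discriminator cannot tell real from fake and
# outputs 0.5 on everything, giving V = log(1/2) + log(1/2) = -log 4.
v_eq = gan_value([0.5, 0.5], [0.5, 0.5])
```

A confident discriminator (high D on real samples, low D on fakes) pushes V above this equilibrium value, which is exactly what the generator trains to prevent.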

Convolutional Neural Networks

• Image classification

Deep Convolutional Generative Adversarial Networks (DCGAN)

https://openai.com/blog/generative-models/

Real Images vs. Generated Images

http://kenkihara.com/projects/GAN.html