lecture 9 vae variants 40mins - github pages
TRANSCRIPT
![Page 1: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/1.jpg)
VAE variantsHao Dong
Peking University
1
![Page 2: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/2.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
2
VAE variants
Hierarchical representation learning
Temporal representa2on learning
Representation learning
![Page 3: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/3.jpg)
• Convolutional VAE• Conditional VAE• IWAE• β-VAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
3
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 4: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/4.jpg)
4
Convolutional Variational Autoencoder
• Limitations of vanilla VAE• The size of weight of fully connected layer == input size x output size• If VAE uses fully connected layers only, will lead to curse of dimensionality
when the input dimension is large (e.g., image).
• Solution
Image is modified from: Deep Clustering with Convolutional Autoencoder. NIPS 2017.
Fully connected layers here(independent variables)
![Page 5: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/5.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
5
VAE variants
Hierarchical representa2on learning
Temporal representation learning
Representation learning
![Page 6: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/6.jpg)
6
Conditional Variational Autoencoder
• Train and inference with labelled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
![Page 7: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/7.jpg)
7
Recap: Variational Autoencoder
Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013
• Recap: Se>ng up the objecAve• Maximise P(X)• Set Q(z) to be an arbitrary distribuGon
![Page 8: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/8.jpg)
8
Recap: Variational Autoencoder
Auto-Encoding Variational Bayes. Diederik P. Kingma, Max Welling. ICLR 2013
• Recap: Setting up the objective
encoder ideal reconstruction KLD
![Page 9: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/9.jpg)
9
Condi6onal Varia6onal Autoencoder
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
• Setting up the objective with labels• Maximise P(Y|X)• Set Q(z) to be an arbitrary distribution
![Page 10: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/10.jpg)
10
Conditional Variational Autoencoder
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
• Setting up the objective
encoder ideal
reconstruction KLD
![Page 11: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/11.jpg)
11
Conditional Variational Autoencoder
• Train and inference without labelled data i.e., vanilla VAE
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
![Page 12: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/12.jpg)
12
Conditional Variational Autoencoder
• Train and inference with labelled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
![Page 13: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/13.jpg)
13
Conditional Variational Autoencoder
• Train and inference with labelled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
![Page 14: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/14.jpg)
14
Conditional Variational Autoencoder
• Train and inference with labeled data.
Learning structured output representation using deep conditional generative models. NeurIPS 2015.
![Page 15: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/15.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
15
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 16: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/16.jpg)
16
Before we start
• Disentangled / Factorised representation• Each variable in the inferred latent representation is only sensitive to one
single generative factor and relatively invariant to other factors• Good interpretability and easy generalization to a variety of tasks
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 17: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/17.jpg)
17
Before we start
• Unsupervised hierarchical representation learning
Rethinking Style and Content Disentanglement in Variational Autoencoders. ICLR 2018 Workshops.
![Page 18: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/18.jpg)
18
β-VAE
• Unsupervised representation learning• Augment the original VAE framework with a single hyper-parameter β
that modulates the learning constraints• Impose a limit on the capacity of the latent information channel
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 19: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/19.jpg)
19
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 20: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/20.jpg)
20
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 21: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/21.jpg)
21
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 22: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/22.jpg)
22
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 23: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/23.jpg)
23
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 24: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/24.jpg)
24
β-VAE
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 25: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/25.jpg)
25
β-VAE
• Discussion: It is really unsupervised?
• It is unsupervised/self-supervised learning, because it does not need any label data
• It is not fully unsupervised learning, it works because of the inductive bias of the neural network model, the hierarchical design introduces prior knowledge about the data
β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. ICLR 2017.
![Page 26: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/26.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
26
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 27: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/27.jpg)
27
IWAE(Importance Weighted Autoencoder)
Importance Weighted Autoencoders. ICLR 2016.
• Optimise a tighter lower bound than VAE• VAE just optimises a lower bound of log P(X)
encoder ideal reconstruction KLD
-ELBO
ELBO
![Page 28: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/28.jpg)
28
IWAE
Importance Weighted Autoencoders. ICLR 2016.
• Optimise a tighter lower bound than VAE
![Page 29: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/29.jpg)
29
IWAE
Importance Weighted Autoencoders. ICLR 2016.
![Page 30: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/30.jpg)
30
IWAE
Importance Weighted Autoencoders. ICLR 2016.
• Why “Importance weighted”
where
![Page 31: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/31.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
31
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 32: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/32.jpg)
32
Ladder VAE
• To learn hierarchical latent representation• Deep models with several layers of dependent stochastic variables are difficult to train
• Limiting the improvements obtained using these highly expressive models
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
![Page 33: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/33.jpg)
33
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
encode decode encode decode
Hierarchical VAE Ladder VAE
![Page 34: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/34.jpg)
34
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
![Page 35: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/35.jpg)
35
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
![Page 36: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/36.jpg)
36
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
![Page 37: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/37.jpg)
37
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
![Page 38: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/38.jpg)
38
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
![Page 39: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/39.jpg)
39
Ladder VAE
LVAE: Ladder Variational Autoencoder. Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, and Ole Winther. NIPS 2016.
![Page 40: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/40.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
40
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 41: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/41.jpg)
41
Progressive + Fade-in VAE
• Discussion
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
encoders decoders
high-level
middle-level
low-level
Can we directly train a hierarchical VAE with ladder structure like that?
![Page 42: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/42.jpg)
42
Progressive + Fade-in VAE
• Discussion
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
encoders decoders
high-level
middle-level
low-level
Information SHORTCUT problemall information go through the low-level path,other paths are ignored.model is lazy…
![Page 43: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/43.jpg)
43
Progressive + Fade-in VAE
• Progressive + Fade-in
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
Stage 1: enable high-level path Stage 2: enable high-level and middle paths
Stage 3: enable all paths …
𝛼 increase from 0 to 1 gradually
![Page 44: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/44.jpg)
44
Progressive + Fade-in VAE
• Results
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
High-level: background and foreground colors
Middle-level: shape
Low-level: …
![Page 45: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/45.jpg)
45
Progressive + Fade-in VAE
• Results
Progressive Learning and Disentanglement of Hierarchical Representations. ICLR 2020.
High-level Middle-level Low-level
![Page 46: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/46.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
46
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 47: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/47.jpg)
47
VAE in speech
• Learning latent representations for style control and transfer in end-to-end speech synthesis
• RNN as encoder
Learning latent representations for style control and transfer in end-to-end speech synthesis. ICASSP 2019.
![Page 48: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/48.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
48
VAE variants
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 49: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/49.jpg)
50
TD-VAE
• State-space model as a Markov Chain model
Temporal Difference VAE. ICLR 2019.
![Page 50: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/50.jpg)
51
TD-VAE
Temporal Difference VAE. ICLR 2019.
![Page 51: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/51.jpg)
52
TD-VAE
Temporal Difference VAE. ICLR 2019.
![Page 52: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/52.jpg)
53
TD-VAE
Temporal Difference VAE. ICLR 2019.
![Page 53: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/53.jpg)
54
TD-VAE
Temporal Difference VAE. ICLR 2019.
![Page 54: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/54.jpg)
55
TD-VAE
Temporal Difference VAE. ICLR 2019.
![Page 55: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/55.jpg)
• Convolutional VAE• Conditional VAE• β-VAE• IWAE• Ladder VAE• Progressive + Fade-in VAE• VAE in speech• Temporal Difference VAE (TD-VAE)
56
Summary
Hierarchical representation learning
Temporal representation learning
Representation learning
![Page 56: Lecture 9 VAE variants 40mins - GitHub Pages](https://reader034.vdocuments.net/reader034/viewer/2022043013/626b5db638bce23a7b3270df/html5/thumbnails/56.jpg)
Thanks
57