![Page 1: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/1.jpg)
Deep Learning 4
Autoencoder, Attention (spatial transformer), Multi-modal learning, Neural Turing Machine, Memory Networks, Generative Adversarial Net
Jian Li
IIIS, Tsinghua
![Page 2: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/2.jpg)
Autoencoder
![Page 3: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/3.jpg)
Autoencoder
• Unsupervised learning
• Let the learning algorithm figure out the structure of the data (without supervised information)
• Compact representation
• Sparse representation
• Representation learning (related to dictionary learning)
Both the input and the output are x
![Page 4: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/4.jpg)
Denoising Autoencoder
• Artificially add some noise to the input
• The higher-level representations are relatively stable and robust to the corruption of the input
• It is necessary to extract features that are useful for representation of the input distribution
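A minimal sketch of the denoising idea, assuming masking noise (randomly zeroing inputs) and a tied-weight toy encoder/decoder; the names `corrupt` and `denoising_loss`, the noise type, and the network are illustrative assumptions, not taken from the slides. The key point is that the network sees the corrupted input but is scored against the clean x.

```python
import numpy as np

def corrupt(x, drop_prob, rng):
    """Masking noise (an assumed noise type): zero each entry with probability drop_prob."""
    mask = rng.random(x.shape) >= drop_prob
    return x * mask

def denoising_loss(x, encode, decode, drop_prob, rng):
    """Encode the corrupted input, but measure reconstruction error against the clean x."""
    x_noisy = corrupt(x, drop_prob, rng)
    x_hat = decode(encode(x_noisy))
    return float(np.mean((x_hat - x) ** 2))

# Toy tied-weight autoencoder, purely for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5))
W = rng.normal(scale=0.1, size=(5, 3))
encode = lambda v: np.tanh(v @ W)
decode = lambda h: h @ W.T
loss = denoising_loss(x, encode, decode, drop_prob=0.3, rng=rng)
```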
![Page 5: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/5.jpg)
Sparse Autoencoder
• We can make the hidden layer larger, and at the same time encourage the sparsity of the code
• By adding a sparsity-encouraging regularization term, e.g. one that compares the average activation of neuron j in the hidden layer with 𝜌: Bernoulli(0.05)
• or by manually zeroing all but the few strongest hidden unit activations
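Both options can be sketched in a few lines. The penalty below assumes the commonly used KL form, a sum over hidden neurons of KL(Bernoulli(𝜌) ‖ Bernoulli(average activation of neuron j)); that specific form, and the helper names `kl_sparsity_penalty` and `keep_top_k`, are assumptions for illustration.

```python
import numpy as np

def kl_sparsity_penalty(hidden, rho=0.05, eps=1e-8):
    """Sum over hidden units of KL(Bernoulli(rho) || Bernoulli(rho_hat_j)).

    hidden: (batch, units) activations in (0, 1), e.g. sigmoid outputs.
    rho_hat_j is the average activation of unit j over the batch.
    """
    rho_hat = np.clip(hidden.mean(axis=0), eps, 1.0 - eps)
    kl = rho * np.log(rho / rho_hat) + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat))
    return float(kl.sum())

def keep_top_k(hidden, k):
    """Zero all but the k strongest activations in each row (requires k >= 1)."""
    weakest = np.argsort(hidden, axis=1)[:, :-k]  # indices of everything except the top k
    out = hidden.copy()
    np.put_along_axis(out, weakest, 0.0, axis=1)
    return out

on_target = kl_sparsity_penalty(np.full((10, 4), 0.05))
too_active = kl_sparsity_penalty(np.full((10, 4), 0.5))
sparse_code = keep_top_k(np.array([[0.1, 0.9, 0.5]]), k=1)
```

The penalty is zero when every unit's average activation equals 𝜌 and grows as units become more active than the target.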
![Page 6: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/6.jpg)
Variational autoencoder (VAE)
• Bayesian approach
• Perspective from variational inference
Distribution generated by the decoder
The distribution learnt by the encoder to approximate the posterior distribution p(z|x)
Prior of the code
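For reference, the three labelled quantities appear in the usual VAE training objective as follows. The notation is assumed from Kingma and Welling's formulation and may differ from the slide: $p_\theta(x \mid z)$ is the distribution generated by the decoder, $q_\phi(z \mid x)$ is the distribution learnt by the encoder, and $p(z)$ is the prior of the code.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```

The first term rewards accurate reconstruction; the second keeps the encoder's distribution close to the prior.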
![Page 7: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/7.jpg)
A quick intro to variational inference
• Typically, the posterior is hard to compute and sample from (the MCMC approach can be pretty slow)
• We wish to use q (from some parametric family) to approximate the posterior p(z|x)
ELBO: evidence lower bound, ELBO ≤ log p(x)
Minimizing the KL divergence between q and p(z|x) is equivalent to maximizing the ELBO (since the evidence log p(x) doesn’t depend on q)
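The equivalence follows from a short derivation, written here with $q(z)$ for the approximating distribution (the symbol names are assumed; the slide's own notation is not recoverable):

```latex
\begin{aligned}
\mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)
  &= \mathbb{E}_{q}\big[\log q(z) - \log p(z \mid x)\big] \\
  &= \mathbb{E}_{q}\big[\log q(z) - \log p(x, z)\big] + \log p(x) \\
  &= \log p(x) - \mathrm{ELBO}(q), \\
\mathrm{ELBO}(q)
  &= \mathbb{E}_{q}\big[\log p(x, z)\big] - \mathbb{E}_{q}\big[\log q(z)\big].
\end{aligned}
```

Because the KL divergence is non-negative, $\mathrm{ELBO}(q) \le \log p(x)$, with equality exactly when $q(z) = p(z \mid x)$.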
![Page 8: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/8.jpg)
Contractive autoencoder (CAE)
• Perspective from manifold learning
• Encourage the encoding to be contractive
Frobenius norm of the Jacobian matrix of the encoder activations with respect to the input
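A small sketch of this penalty, assuming a single-layer sigmoid encoder h = sigmoid(Wx + b); the encoder form and the function names are illustrative assumptions. For this encoder the Jacobian is diag(h(1 − h))·W, so its squared Frobenius norm has a closed form, which the code checks against a finite-difference Jacobian.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def contractive_penalty(W, b, x):
    """Squared Frobenius norm of dh/dx for h = sigmoid(W x + b), in closed form."""
    h = sigmoid(W @ x + b)
    dh = h * (1.0 - h)  # derivative of the sigmoid at each hidden unit
    return float(np.sum((dh ** 2) * np.sum(W ** 2, axis=1)))

def numerical_jacobian(W, b, x, eps=1e-6):
    """Central-difference Jacobian of the encoder, used only as a sanity check."""
    J = np.zeros((W.shape[0], x.shape[0]))
    for i in range(x.shape[0]):
        step = np.zeros_like(x)
        step[i] = eps
        J[:, i] = (sigmoid(W @ (x + step) + b) - sigmoid(W @ (x - step) + b)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)
x = rng.normal(size=3)
penalty = contractive_penalty(W, b, x)
check = float(np.sum(numerical_jacobian(W, b, x) ** 2))
```

In training, this penalty would be added to the reconstruction loss with a weighting coefficient.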
Related reading material: http://www.deeplearningbook.org/version-2015-10-03/contents/manifolds.html
![Page 9: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/9.jpg)
Spatial Transformer Networks: an attention mechanism
Jaderberg et al., “Spatial Transformer Networks”, NIPS 2015
![Page 10: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/10.jpg)
• We would like to pay attention to certain areas of an image
![Page 11: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/11.jpg)
![Page 12: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/12.jpg)
Affine transformation, but it can be a more general transform
𝜃: parameters we need to learn
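A sketch of how an affine 𝜃 could turn a regular target grid into sampling coordinates. It assumes a 2×3 matrix 𝜃 acting on homogeneous target coordinates normalized to [-1, 1]; the function name `affine_grid` and the normalization convention are assumptions for illustration.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map a regular out_h x out_w target grid through a 2x3 affine matrix theta.

    Returns (xs, ys): source coordinates, in normalized [-1, 1] units,
    for every target pixel.
    """
    ys, xs = np.meshgrid(np.linspace(-1, 1, out_h),
                         np.linspace(-1, 1, out_w), indexing="ij")
    target = np.stack([xs.ravel(), ys.ravel(), np.ones(out_h * out_w)])  # 3 x N
    source = theta @ target                                              # 2 x N
    return source[0].reshape(out_h, out_w), source[1].reshape(out_h, out_w)

identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
zoom_in = np.array([[0.5, 0.0, 0.0],
                    [0.0, 0.5, 0.0]])  # attend to the central half of the input
xs_id, ys_id = affine_grid(identity, 3, 3)
xs_zoom, ys_zoom = affine_grid(zoom_in, 3, 3)
```

With the identity matrix the sampling points coincide with the target grid; scaling by 0.5 restricts them to the central region, which is the cropping/attention behaviour.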
![Page 13: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/13.jpg)
• The module can be inserted anywhere in a network
• Used several times in later DeepMind papers
The set of sampling points
![Page 14: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/14.jpg)
Output V is determined by input U and sampling points $(x_i^s, y_i^s) \in T_\theta(G)$
The localization network can be an FC network or a CNN. The last layer should be a regression layer to produce 𝜃
$(x_i^s, y_i^s) \in T_\theta(G)$ indicates which points in U we want to focus on
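A sketch of how V could be read out of U at the sampling points, assuming bilinear interpolation on a single-channel input with coordinates normalized to [-1, 1]. Clamping out-of-range points to the border is a simplification chosen here; the name `bilinear_sample` is hypothetical.

```python
import numpy as np

def bilinear_sample(U, xs, ys):
    """Sample a single-channel map U (H x W, with H, W >= 2) at normalized coords.

    xs, ys: arrays of the same shape holding x and y in [-1, 1].
    Points outside the map are clamped to the border (a simplifying assumption).
    """
    H, W = U.shape
    px = (xs + 1.0) * (W - 1) / 2.0  # normalized -> pixel coordinates
    py = (ys + 1.0) * (H - 1) / 2.0
    x0 = np.clip(np.floor(px).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(py).astype(int), 0, H - 2)
    wx = np.clip(px - x0, 0.0, 1.0)  # interpolation weights
    wy = np.clip(py - y0, 0.0, 1.0)
    top = (1 - wx) * U[y0, x0] + wx * U[y0, x0 + 1]
    bottom = (1 - wx) * U[y0 + 1, x0] + wx * U[y0 + 1, x0 + 1]
    return (1 - wy) * top + wy * bottom

U = np.arange(16, dtype=float).reshape(4, 4)
ys, xs = np.meshgrid(np.linspace(-1, 1, 4), np.linspace(-1, 1, 4), indexing="ij")
V_identity = bilinear_sample(U, xs, ys)  # identity grid reproduces U
V_center = bilinear_sample(U, np.array([[0.0]]), np.array([[0.0]]))
```

Because the output is a weighted sum of input pixels with weights that depend smoothly on the coordinates, gradients can flow back to both U and 𝜃.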
![Page 15: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/15.jpg)
![Page 16: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/16.jpg)
• We can even learn the target grid B (using “thin plate spline”) (again through backprop)
![Page 17: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/17.jpg)
Multimodal representation learning: Image Caption 2
Kiros et al., “Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models”
![Page 18: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/18.jpg)
![Page 19: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/19.jpg)
Overview
Map CNN codes and RNN codes to a common space
For details of SC-NLM, please see the paper
![Page 20: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/20.jpg)
![Page 21: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/21.jpg)
![Page 22: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/22.jpg)
Details
• LSTM notations used in this work
![Page 23: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/23.jpg)
Details
• D: length of the CNN code (CNN can be AlexNet, VggNet, or ResNet)
x
v
𝑊𝑇: precomputed using e.g. word2vec (we don’t learn it)
![Page 24: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/24.jpg)
Details
• Optimize a pairwise ranking loss (𝜃: the parameters to be learnt, i.e. 𝑊_𝐼 and the LSTM parameters)
Max-margin formulation; 𝛼 is the margin
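A sketch of a max-margin pairwise ranking loss over a batch of matching image/sentence embeddings. It assumes cosine similarity as the score and uses every other item in the batch as a contrastive example; both choices, the default margin value, and the function name are assumptions for illustration rather than the paper's exact setup.

```python
import numpy as np

def pairwise_ranking_loss(X, V, alpha=0.2):
    """Max-margin ranking loss; row i of X (image) matches row i of V (sentence)."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    S = X @ V.T        # S[i, j] = cosine similarity of image i and sentence j
    pos = np.diag(S)   # scores of the true pairs
    # Image as anchor: every wrong sentence should score at least alpha below the true one.
    cost_sent = np.maximum(0.0, alpha - pos[:, None] + S)
    # Sentence as anchor: every wrong image should score at least alpha below the true one.
    cost_img = np.maximum(0.0, alpha - pos[None, :] + S)
    np.fill_diagonal(cost_sent, 0.0)  # true pairs are not contrastive examples
    np.fill_diagonal(cost_img, 0.0)
    return float(cost_sent.sum() + cost_img.sum())

aligned = pairwise_ranking_loss(np.eye(3), np.eye(3))
shuffled = pairwise_ranking_loss(np.eye(3), np.roll(np.eye(3), 1, axis=0))
```

The loss is zero when every true pair beats all mismatched pairs by at least the margin, and positive otherwise.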
![Page 25: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/25.jpg)
Neural Turing Machine [Graves et al.]
![Page 26: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/26.jpg)
“Memory”
![Page 27: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/27.jpg)
Overview
![Page 28: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/28.jpg)
Read
![Page 29: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/29.jpg)
Write
![Page 30: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/30.jpg)
Addressing Mechanism (overview)
Where to look in the memory
![Page 31: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/31.jpg)
Addressing (details)
![Page 32: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/32.jpg)
Addressing (Details)
![Page 33: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/33.jpg)
Go over the process
![Page 34: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/34.jpg)
![Page 35: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/35.jpg)
![Page 36: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/36.jpg)
![Page 37: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/37.jpg)
![Page 38: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/38.jpg)
![Page 39: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/39.jpg)
![Page 40: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/40.jpg)
Memory Network [Weston et al.][Sukhbaatar et al.]
![Page 41: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/41.jpg)
Memory Network [Weston et al.][Sukhbaatar et al.]
![Page 42: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/42.jpg)
Overview
![Page 43: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/43.jpg)
Memory Module
![Page 44: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/44.jpg)
Memory Vectors
![Page 45: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/45.jpg)
Q&A Example
![Page 46: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/46.jpg)
Generative Adversarial Nets (GAN) [Goodfellow et al.]
![Page 47: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/47.jpg)
Generative Models
• Most work on deep generative models has focused on providing a parametric specification of a probability distribution function (like Deep Belief Net, PixelCNN, PixelRNN)
• Train these models by maximizing the log likelihood
$\max \sum_i \log P(x_i, y_i)$
• Difficulty: intractable probabilistic computations
![Page 48: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/48.jpg)
Generative Adversarial Nets
• Two neural networks: a generative model and a discriminative model
• A two-player minimax game
• One network for generation (e.g., generating images), one for classification (distinguishing the true data from the generated data)
• Hence, if the generative model produces the same distribution as the true data distribution, the discriminative model wouldn’t be able to distinguish them. This is an equilibrium point!
• But in practice, we can’t achieve this point. The discriminative model is a bit too strong for the current generative model.
![Page 49: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/49.jpg)
The discriminative model D tries to distinguish whether x is from the original data distribution or from the generated distribution.
![Page 50: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/50.jpg)
Generative Adversarial Nets
$z$: prior noise (e.g., Gaussian) for the generative model
$D(x)$: the discriminative model outputs a single scalar, which is Prob[x is from the data (rather than from the generative model)]
The generative model wants to minimize $\log[1 - D(G(z))]$
The discriminative model wants to assign correct labels (from G or from data)
The value of the minimax game:
$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$
• $D(x)$: D wants it large; G has no control over this
• $D(G(z))$: D wants it small; G wants it large
• The goal is to reach an equilibrium.
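A sketch of estimating the value of the game from samples, assuming the form given by Goodfellow et al., E[log D(x)] + E[log(1 − D(G(z)))]. The function name `gan_value` and the clipping constant are illustrative assumptions; the inputs are the discriminator's outputs on real and generated batches.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: D's outputs on samples from the data.
    d_fake: D's outputs on generated samples G(z).
    """
    d_real = np.clip(d_real, eps, 1.0 - eps)  # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# D cannot tell the two apart: it outputs 0.5 everywhere.
undecided = gan_value(np.full(100, 0.5), np.full(100, 0.5))
# D separates real from generated samples confidently.
confident = gan_value(np.full(100, 0.9), np.full(100, 0.1))
```

When D outputs 0.5 everywhere the value is −log 4, which is the value at the equilibrium where the generated distribution matches the data distribution; a discriminator that separates the two pushes the value higher.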
![Page 51: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/51.jpg)
Training
D: maximization
G: minimization
A variant of best-response to reach an equilibrium
![Page 52: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/52.jpg)
Training Process
For another visualized training process, see http://cs.stanford.edu/people/karpathy/gan/
Figure labels: data distribution; generated distribution; D(x) before training D; D(x) after training D; after training G
![Page 53: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/53.jpg)
Experiments
![Page 54: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/54.jpg)
Final Notes
• We have covered the basics and several recent “end products”. Many important ideas developed by many researchers led to the cool stuff you see today
• A fast-growing area (2000+ people at NIPS 2013, 8000+ people at this year's NIPS)
• In many cases, the design is more of an art than a science
• But that doesn’t mean DL is just “tuning parameters”
• Important things we didn’t cover
• Things related to graphical models, Bayesian approaches
• Deep Belief Net (Restricted Boltzmann Machine)
• Autoencoder (Variational Autoencoder, Stacked Autoencoder)
• Stacking traditional “shallow” models ...
• Lots of applications in NLP (word2vec, topic models)
• Unsupervised learning
• Transfer learning
• Theoretical results
• ...
![Page 55: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/55.jpg)
• Deep Reinforcement Learning
• Play games
• Playing Atari with Deep Reinforcement Learning
• https://github.com/kuz/DeepMind-Atari-Deep-Q-Learner
• Search technique (Monte-Carlo Tree Search): AlphaGo
• Open-source Facebook Go engine:
• https://github.com/facebookresearch/darkforestGo
![Page 56: Deep Learning 2 - Tsinghua Universityiiis.tsinghua.edu.cn/~jianli/courses/ML2016/DL4.pdfDeep Learning 4 Autoencoder ... •Compact representation •Sparse representation •Representation](https://reader034.vdocuments.net/reader034/viewer/2022042620/5abaf1887f8b9a441d8c3fa2/html5/thumbnails/56.jpg)
• Some slides are borrowed from cs231n at Stanford, from the slides for “End-To-End Memory Networks” by Sukhbaatar et al., and from wiki
• Variational inference, Blei.
• Thanks to Jianbo Guo for preparing some of the GAN slides