multi-modal embeddings: from discriminative to generative models and creative ai
![Page 1: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/1.jpg)
Roelof Pieters (@graphific)
Multi-Modal Embeddings: from Discriminative to Generative Models and Creative AI
2 May 2016
KTH
www.csc.kth.se/~roelof/ [email protected]
![Page 2: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/2.jpg)
Modalities
![Page 3: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/3.jpg)
Common Generative Architectures
• AE/VAE
• DBN
• Latent Vectors/Manifold Walking
• RNN / LSTM / GRU
• Modded RNNs (e.g. Biaxial RNN)
• CNN + LSTM/GRU
• X + Mixture density network (MDN)
• DRAW
• GRAN
• DCGAN
• DeepDream & other CNN visualisations
• Splitting/Remixing NNs:
  • Image Analogy
  • Style Transfer
  • Semantic Style Transfer
  • Texture Synthesis
• Compositional Pattern-Producing Networks (CPPN)
• NEAT
• CPPN w/ GAN+VAE
![Page 4: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/4.jpg)
what we’ll cover
![Page 5: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/5.jpg)
Text Generation
![Page 6: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/6.jpg)
Alex Graves (2014) Generating Sequences With Recurrent Neural Networks
Wanna Play ?
Text Prediction (1)
![Page 7: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/7.jpg)
![Page 8: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/8.jpg)
![Page 9: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/9.jpg)
Alex Graves (2014) Generating Sequences With Recurrent Neural Networks
Wanna Play ?
Text Prediction (2)
![Page 10: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/10.jpg)
![Page 11: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/11.jpg)
Alex Graves (2014) Generating Sequences With Recurrent Neural Networks
![Page 12: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/12.jpg)
Alex Graves (2014) Generating Sequences With Recurrent Neural Networks
Wanna Play ?
Handwriting Prediction
![Page 13: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/13.jpg)
Alex Graves (2014) Generating Sequences With Recurrent Neural Networks
Wanna Play ?
Handwriting Prediction (we’re skipping the mixture density network details for now)
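For readers who do want a glimpse of the skipped part: a mixture density output layer emits mixture weights, means and spreads at each step, and the next value is sampled from that mixture. Below is a minimal NumPy sketch, assuming a 1-D Gaussian mixture. The names `sample_from_mixture`, `logits`, `means` and `log_stds` are hypothetical, and the bivariate Gaussians and end-of-stroke probability of the handwriting model are left out.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def sample_from_mixture(logits, means, log_stds, rng):
    """Draw one sample from a 1-D Gaussian mixture.

    logits, means and log_stds stand in for raw network outputs
    (hypothetical names), each of shape (n_components,).
    """
    weights = softmax(logits)                 # mixture weights sum to 1
    stds = np.exp(log_stds)                   # exp keeps the spreads positive
    k = rng.choice(len(weights), p=weights)   # pick one component
    return rng.normal(means[k], stds[k])      # sample from that component

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.0, -1.0])
means = np.array([-3.0, 0.0, 3.0])
log_stds = np.array([-1.0, -1.0, -1.0])
samples = np.array([sample_from_mixture(logits, means, log_stds, rng)
                    for _ in range(1000)])
```

Most samples land near the first mean, because that component has by far the largest weight.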
![Page 14: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/14.jpg)
![Page 15: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/15.jpg)
Wanna Play ?
Text generation
Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
![Page 16: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/16.jpg)
![Page 17: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/17.jpg)
![Page 18: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/18.jpg)
![Page 19: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/19.jpg)
Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
![Page 20: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/20.jpg)
Andrej Karpathy, Justin Johnson, Li Fei-Fei (2015) Visualizing and Understanding Recurrent Networks
![Page 21: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/21.jpg)
Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
![Page 22: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/22.jpg)
http://www.creativeai.net/posts/aeh3orR8g6k65Cy9M/generating-magic-cards-using-deep-recurrent-convolutional
![Page 23: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/23.jpg)
More…
more at:
http://gitxiv.com/category/natural-language-processing-nlp
http://www.creativeai.net/?cat%5B0%5D=read-write
![Page 24: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/24.jpg)
Image Generation
![Page 25: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/25.jpg)
Turn Convnet Around: “Deep Dream”
Image -> NN -> What do you (think) you see -> What’s the (text) label
Image -> NN -> What do you (think) you see -> feed back activations -> optimize image to “fit” the ConvNet’s “hallucination” (iteratively)
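The second pipeline (feed the activations back, then optimise the image) amounts to gradient ascent on the input pixels. Below is a toy NumPy sketch of that loop, under loud assumptions: one random linear map plus ReLU stands in for a pretrained ConvNet layer, and `dream`, `objective` and `step_size` are illustrative names, not the Deep Dream code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained ConvNet layer: one random linear map + ReLU.
# (Assumption for illustration; Deep Dream uses a pretrained network.)
W = rng.normal(size=(16, 64))

def activations(image):
    return np.maximum(W @ image, 0.0)

def objective(image):
    """How strongly the layer 'sees' its features in the image."""
    return 0.5 * np.sum(activations(image) ** 2)

def gradient(image):
    """Gradient of the objective with respect to the image pixels."""
    return W.T @ activations(image)

def dream(image, steps=20, step_size=0.1):
    """Iteratively nudge the image so the layer's activations grow."""
    image = image.copy()
    for _ in range(steps):
        g = gradient(image)
        # Normalise the step so its size does not depend on gradient scale.
        image += step_size * g / (np.abs(g).mean() + 1e-8)
    return image

start = rng.normal(size=64)   # a flattened 8x8 "image"
dreamed = dream(start)
```

After the loop, `objective(dreamed)` is larger than `objective(start)`: the image has moved toward what the layer responds to.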
![Page 26: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/26.jpg)
Google, Inceptionism: Going Deeper into Neural Networks
Turn Convnet Around: “Deep Dream”
see also: www.csc.kth.se/~roelof/deepdream/
![Page 27: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/27.jpg)
![Page 28: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/28.jpg)
code, youtube
Roelof Pieters 2015
![Page 29: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/29.jpg)
https://www.flickr.com/photos/graphific/albums/72157657250972188
Single Units
Roelof Pieters 2015
![Page 30: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/30.jpg)
Multifaceted Feature Visualization
Anh Nguyen, Jason Yosinski, Jeff Clune (2016) Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks
![Page 31: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/31.jpg)
Multifaceted Feature Visualization
![Page 32: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/32.jpg)
Multifaceted Feature Visualization
![Page 33: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/33.jpg)
Preferred stimuli generation
Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, and Jeff Clune (2016) AI Neuroscience: Understanding Deep Neural Networks by Synthetically Generating the Preferred Stimuli for Each of Their Neurons
![Page 34: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/34.jpg)
![Page 35: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/35.jpg)
Inter-modal: Style Transfer (“Style Net” 2015)
Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, 2015. A Neural Algorithm of Artistic Style (GitXiv)
![Page 36: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/36.jpg)
![Page 37: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/37.jpg)
Inter-modal: Image Analogies (2001)
A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin (2001) Image Analogies, SIGGRAPH 2001 Conference Proceedings.
A. Hertzmann (2001) Algorithms for Rendering in Artistic Styles. Ph.D. thesis, New York University, May 2001.
![Page 38: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/38.jpg)
Inter-modal: Image Analogies (2001)
![Page 39: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/39.jpg)
Inter-modal: Style Transfer (“Style Net” 2015)
Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, 2015. A Neural Algorithm of Artistic Style (GitXiv)
![Page 40: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/40.jpg)
style layers: 3_1,4_1,5_1
![Page 41: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/41.jpg)
style layers: 3_2
![Page 42: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/42.jpg)
style layers: 5_1
![Page 43: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/43.jpg)
style layers: 5_2
![Page 44: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/44.jpg)
style layers: 5_3
![Page 45: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/45.jpg)
style layers: 5_4
![Page 46: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/46.jpg)
style layers: 5_1 + 5_2 + 5_3 + 5_4
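The “style layers” setting on these slides picks which layers’ feature correlations (Gram matrices) enter the style loss. Here is a minimal NumPy sketch of that sum over layers. It assumes random arrays in place of real ConvNet feature maps and a simplified normalisation; `gram_matrix` and `style_loss` are illustrative names.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel feature correlations.

    features: array of shape (channels, height, width).
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def style_loss(generated, style, layers):
    """Sum of squared Gram-matrix differences over the chosen layers.

    generated, style: dicts mapping a layer name to its feature maps.
    layers: which layer names contribute, e.g. ["5_1", "5_2"].
    """
    total = 0.0
    for name in layers:
        diff = gram_matrix(generated[name]) - gram_matrix(style[name])
        total += np.sum(diff ** 2)
    return total

rng = np.random.default_rng(0)
# Random stand-ins for ConvNet feature maps (an assumption for illustration).
style_feats = {name: rng.normal(size=(8, 4, 4)) for name in ["5_1", "5_2"]}
gen_feats = {name: rng.normal(size=(8, 4, 4)) for name in ["5_1", "5_2"]}

loss_one = style_loss(gen_feats, style_feats, ["5_1"])
loss_both = style_loss(gen_feats, style_feats, ["5_1", "5_2"])
```

Adding a layer to the list adds its term to the loss, which is why different layer choices give visibly different stylisations.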
![Page 48: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/48.jpg)
Inter-modal: Style Transfer+MRF (2016)
Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis, 2016, Chuan Li, Michael Wand
![Page 49: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/49.jpg)
![Page 50: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/50.jpg)
Inter-modal: Pretrained Style Transfer (2016)
Texture Networks: Feed-forward Synthesis of Textures and Stylized Images, 2016, Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky
500x speedup! (avg min loss from 10s to 20ms)
![Page 51: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/51.jpg)
Inter-modal: Pretrained Style Transfer #2 (2016)
Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, Chuan Li, Michael Wand, 2016
similarly 500x speedup
![Page 52: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/52.jpg)
Inter-modal: Perceptual Loss ST (2016)
Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
![Page 53: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/53.jpg)
Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
![Page 54: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/54.jpg)
Inter-modal: Style Transfer (“Style Net” 2015)
@DeepForger
![Page 55: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/55.jpg)
![Page 56: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/56.jpg)
Inter-modal: Semantic Style Transfer
![Page 57: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/57.jpg)
https://github.com/alexjc/neural-doodle
Semantic Style Transfer (“Neural Doodle”)
![Page 58: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/58.jpg)
Synthesize textures
![Page 59: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/59.jpg)
Synthesise textures (random weights)
Random Weights (kinda like “Extreme Learning Machines”)
Experiment:
- which activation function should I use?
- pooling?
- min nr of units?
https://nucl.ai/blog/extreme-style-machines/
![Page 60: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/60.jpg)
Synthesise textures (random weights)
totally random initialised weights:
https://nucl.ai/blog/extreme-style-machines/
![Page 61: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/61.jpg)
Synthesise textures (random weights)
https://nucl.ai/blog/extreme-style-machines/
Activation Functions
![Page 62: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/62.jpg)
![Page 63: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/63.jpg)
![Page 64: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/64.jpg)
Synthesise textures (random weights)
https://nucl.ai/blog/extreme-style-machines/
Down-sampling
![Page 65: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/65.jpg)
![Page 66: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/66.jpg)
![Page 67: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/67.jpg)
Synthesise textures (random weights)
https://nucl.ai/blog/extreme-style-machines/
Nr of units
![Page 68: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/68.jpg)
![Page 69: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/69.jpg)
![Page 70: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/70.jpg)
“Style Transfer” papers
• Image Analogies, 2001, A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, D. Salesin
• A Neural Algorithm of Artistic Style, 2015, Leon A. Gatys, Alexander S. Ecker, Matthias Bethge
• Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis, 2016, Chuan Li, Michael Wand
• Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks, 2016, Alex J. Champandard
• Texture Networks: Feed-forward Synthesis of Textures and Stylized Images, 2016, Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, Victor Lempitsky
• Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, Justin Johnson, Alexandre Alahi, Li Fei-Fei
• Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, 2016, Chuan Li, Michael Wand
• @DeepForger
![Page 71: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/71.jpg)
“A stop sign is flying in blue skies.”
“A herd of elephants flying in the blue skies.”
Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov, 2015. Generating Images from Captions with Attention (arxiv) (examples)
Caption -> Image generation
![Page 72: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/72.jpg)
Image Colorisation
![Page 73: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/73.jpg)
More…
more at:
http://gitxiv.com/category/computer-vision
http://www.creativeai.net/ (no category for images just yet)
![Page 74: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/74.jpg)
Audio Generation
![Page 75: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/75.jpg)
Early LSTM music composition (2002)
Douglas Eck and Jürgen Schmidhuber (2002) Learning The Long-Term Structure of the Blues
![Page 76: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/76.jpg)
Markov constraints
http://www.flow-machines.com/
![Page 77: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/77.jpg)
![Page 78: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/78.jpg)
Audio Generation: Midi
https://soundcloud.com/graphific/pyotr-lstm-tchaikovsky
A Recurrent Latent Variable Model for Sequential Data, 2016, J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio
+ “modded” VRNN:
![Page 79: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/79.jpg)
Audio Generation: Midi
https://soundcloud.com/graphific/neural-remix-net
A Recurrent Latent Variable Model for Sequential Data, 2016, J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio
+ “modded” VRNN:
![Page 80: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/80.jpg)
Audio Generation: Raw
Gated Recurrent Unit (GRU)
Stanford CS224d project
Aran Nayebi, Matt Vitelli (2015) GRUV: Algorithmic Music Generation using Recurrent Neural Networks
![Page 81: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/81.jpg)
LSTM improvements
• Recurrent Batch Normalization http://gitxiv.com/posts/MwSDm6A4wPG7TcuPZ/recurrent-batch-normalization
• also batch-normalises the hidden-to-hidden transition (earlier work only normalised the input-to-hidden transformation of RNNs)
• faster convergence and improved generalization
![Page 82: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/82.jpg)
LSTM improvements
• Weight normalisation http://gitxiv.com/posts/p9B6i9Kzbkc5rP3cp/weight-normalization-a-simple-reparameterization-to
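Weight normalisation rewrites each weight vector as a direction `v` and a separately learned length `g`, i.e. w = g * v / ||v||. A minimal NumPy sketch of just that reparameterisation follows; the function name `weight_norm` and the example values are illustrative assumptions, and training is left out.

```python
import numpy as np

def weight_norm(v, g):
    """Reparameterise a weight vector as w = g * v / ||v||."""
    return g * v / np.linalg.norm(v)

rng = np.random.default_rng(0)
v = rng.normal(size=5)   # direction parameter
g = 2.5                  # scale parameter (learned separately)
w = weight_norm(v, g)
```

The length of `w` is always `g`, whatever the scale of `v`, so the optimiser can adjust direction and magnitude independently.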
![Page 83: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/83.jpg)
LSTM improvements
• Associative Long Short-Term Memory http://gitxiv.com/posts/jpfdiFPsu5c6LLsF4/associative-long-short-term-memory
![Page 84: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/84.jpg)
LSTM improvements
• Bayesian RNN dropout http://gitxiv.com/posts/CsCDjy7WpfcBvZ88R/bayesianrnn
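The core idea of Bayesian (variational) RNN dropout is to sample one dropout mask per sequence and reuse it at every time step, on the recurrent connections as well as the inputs. Below is a minimal NumPy sketch on a plain tanh RNN. The function name, shapes and `keep_prob` are assumptions for illustration, and it is not the linked implementation.

```python
import numpy as np

def rnn_forward_variational_dropout(xs, W_x, W_h, keep_prob, rng):
    """Run a tanh RNN with one dropout mask shared across all time steps."""
    hidden_size = W_h.shape[0]
    input_size = W_x.shape[1]
    # Sample the masks once per sequence, not once per time step.
    mask_x = rng.binomial(1, keep_prob, size=input_size) / keep_prob
    mask_h = rng.binomial(1, keep_prob, size=hidden_size) / keep_prob
    h = np.zeros(hidden_size)
    states = []
    for x in xs:
        # The same masks are applied at every step of the sequence.
        h = np.tanh(W_x @ (x * mask_x) + W_h @ (h * mask_h))
        states.append(h)
    return np.array(states), mask_x, mask_h

rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 3))            # 6 time steps, 3 input features
W_x = rng.normal(size=(4, 3)) * 0.1
W_h = rng.normal(size=(4, 4)) * 0.1
states, mask_x, mask_h = rnn_forward_variational_dropout(
    xs, W_x, W_h, keep_prob=0.5, rng=rng)
```

Standard dropout would resample the masks inside the loop; moving the sampling outside the loop is the whole difference.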
![Page 85: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/85.jpg)
Chor-RNN
Continuous Generation
Luka Crnkovic-Friis & Louise Crnkovic-Friis (2016) Generative Choreography using Deep Learning
![Page 86: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/86.jpg)
• Mixture Density LSTM
Generative
![Page 87: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/87.jpg)
![Page 88: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/88.jpg)
![Page 89: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/89.jpg)
![Page 90: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/90.jpg)
![Page 91: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/91.jpg)
Wanna be Doing Deep Learning?
![Page 92: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/92.jpg)
Deep Learning with Python
Python has a wide range of deep learning-related libraries available
Low level:
deeplearning.net/software/theano
caffe.berkeleyvision.org
tensorflow.org/
High level:
lasagne.readthedocs.org/en/latest
and of course:
keras.io
![Page 95: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/95.jpg)
Questions?
love letters? existential dilemmas? academic questions? gifts?
find me at: www.csc.kth.se/~roelof/
@graphific
Oh, and soon we’re looking for Creative AI enthusiasts!
- job
- internship
- thesis work
in AI (Deep Learning) & Creativity
![Page 96: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/96.jpg)
https://medium.com/@ArtificialExperience/creativeai-9d4b2346faf3
![Page 97: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/97.jpg)
Creative AI > a “brush” > rapid experimentation
human-machine collaboration
![Page 98: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/98.jpg)
Creative AI > a “brush” > rapid experimentation
(YouTube, Paper)
![Page 99: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/99.jpg)
![Page 100: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/100.jpg)
Creative AI > a “brush” > rapid experimentation
(Vimeo, Paper)
![Page 101: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/101.jpg)
Generative Adversarial Nets
Emily Denton, Soumith Chintala, Arthur Szlam, Rob Fergus, 2015. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks (GitXiv)
![Page 102: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/102.jpg)
Generative Adversarial Nets
Alec Radford, Luke Metz, Soumith Chintala, 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (GitXiv)
![Page 103: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/103.jpg)
![Page 104: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/104.jpg)
Generative Adversarial Nets
Alec Radford, Luke Metz, Soumith Chintala, 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (GitXiv)
“turn” vector created from four averaged samples of faces looking left vs looking right.
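The “turn” vector is plain arithmetic on latent codes: average the codes of the left-looking faces, average the right-looking ones, and subtract. Here is a minimal NumPy sketch, assuming random vectors in place of real latent codes and a 100-dimensional latent space. `z_left`, `z_right` and `apply_turn` are hypothetical names, and no generator is involved.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 100   # assumed size of the latent space

# Hypothetical latent codes whose generated faces look left / right.
z_left = rng.normal(size=(4, latent_dim))
z_right = rng.normal(size=(4, latent_dim))

# Average each group, then subtract to get the direction of the "turn".
turn = z_right.mean(axis=0) - z_left.mean(axis=0)

def apply_turn(z, amount):
    """Move a latent code along the turn direction (amount 0 to 1)."""
    return z + amount * turn

z_face = z_left.mean(axis=0)
z_turned = apply_turn(z_face, 1.0)
```

Feeding `apply_turn(z_face, amount)` for increasing `amount` to a trained generator would render the face gradually rotating.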
![Page 105: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/105.jpg)
Generative Adversarial Nets
walking through the manifold
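A walk through the manifold can be sketched as interpolation between two latent codes, rendering one image per step. Below is a minimal NumPy version, assuming linear interpolation and random stand-in codes. The name `walk` is illustrative, and the trained generator that would turn each row into an image is not included.

```python
import numpy as np

def walk(z_start, z_end, n_steps):
    """Linearly interpolate between two latent codes."""
    ts = np.linspace(0.0, 1.0, n_steps)
    return np.array([(1.0 - t) * z_start + t * z_end for t in ts])

rng = np.random.default_rng(0)
z_a = rng.normal(size=100)
z_b = rng.normal(size=100)
path = walk(z_a, z_b, n_steps=9)
# Each row of `path` would be fed to a trained generator to render one frame.
```

Smooth transitions between the rendered frames are what suggests the model has learned a continuous representation, not memorised samples.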
![Page 106: Multi-modal embeddings: from discriminative to generative models and creative ai](https://reader031.vdocuments.net/reader031/viewer/2022030317/5871529d1a28ab8e5b8b4855/html5/thumbnails/106.jpg)
Generative Adversarial Nets
top: unmodified samples
bottom: same samples dropping out “window” filters