Creative Computing with Deep Learning
Charles P. Martin, 2017
Deep Learning! All aboard the hype train!
Musical Examples
More Examples
Zelda Attention RNN Output
DeepJazz - 2-layer LSTM
Flow Machines
WaveNet piano
Big Questions
1. How do Artificial Neural Networks (ANNs) work?
2. How do you use them artistically?
3. Why have ANN systems worked so well for images, and so poorly for music?
Artificial Neural Networks
“Units” - Artificial Neurons
https://www.tensorflow.org/get_started/mnist/beginners
[Figure: an artificial neuron; a bias b is added to the inputs (“+ b”)]
Artificial Neural Network
Multilayer Perceptrons
Modern “Deep” Networks
● VGG16 - Popular image recognition network design uses 16 layers (138M parameters).
● Other deep image recognition networks have more than 150 layers.
● Networks start to look like layered multinomial functions.
Simple Example: Handwritten Numbers
https://www.tensorflow.org/get_started/mnist/beginners
How do we choose which number an input represents?
Simple Example: Handwritten Numbers
https://www.tensorflow.org/get_started/mnist/beginners
Or just:
As equations:
As vectors:
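The equations from this slide did not survive extraction. The following is a hedged reconstruction based on the linked TensorFlow MNIST tutorial rather than on the slide itself, so the symbol names (evidence, W, x, b, y) are that tutorial's and may differ from what was shown:

```latex
\begin{aligned}
\text{evidence}_i &= \sum_j W_{i,j}\, x_j + b_i \\
y_i &= \operatorname{softmax}(\text{evidence})_i
     = \frac{\exp(\text{evidence}_i)}{\sum_k \exp(\text{evidence}_k)} \\
y &= \operatorname{softmax}(W x + b)
\end{aligned}
```

Here x is the flattened 784-pixel image, W holds one row of weights per digit class, and y is a vector of 10 class probabilities; the predicted digit is the index of the largest entry of y.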
Simple Example: Handwritten Numbers
https://www.tensorflow.org/get_started/mnist/beginners
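# TensorFlow 1.x graph API, as in the linked tutorial
import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 784])  # batch of flattened 28x28 images
W = tf.Variable(tf.zeros([784, 10]))  # weights: 784 pixels x 10 digit classes
b = tf.Variable(tf.zeros([10]))  # one bias per class
y = tf.nn.softmax(tf.matmul(x, W) + b)  # predicted class probabilities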
Check out example in Jupyter Notebook
Training
- Define some “Loss” function: L(w,b) = cross_entropy(y,y’)
- Find values for the parameters so that the loss is minimized.
- Tricky with lots of parameters
  - “Simple” Example has 7850 params
- Take partial derivative of L with respect to each parameter, then adjust the parameter a little bit so that we can expect L to be lower.
- Typical technique for finding these partials is “back propagation”, a.k.a. reverse-mode differentiation.
  - http://colah.github.io/posts/2015-08-Backprop/
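The training loop described on this slide can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the notebook's code: the three-point toy dataset, the learning rate, and the function names are all made up, and the gradient uses the closed form for a single softmax layer (p_k minus the one-hot target) instead of a general backpropagation library.

```python
import math

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict(W, b, x):
    # W holds one row of weights per class
    z = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return softmax(z)

def mean_loss(W, b, data):
    # cross-entropy between prediction and the true class index t
    return sum(-math.log(predict(W, b, x)[t]) for x, t in data) / len(data)

def gradient_step(W, b, data, lr):
    # For softmax + cross-entropy: dL/dz_k = p_k - 1[k == target]
    n = len(data)
    gW = [[0.0] * len(W[0]) for _ in W]
    gb = [0.0] * len(b)
    for x, t in data:
        p = predict(W, b, x)
        for k in range(len(W)):
            dz = p[k] - (1.0 if k == t else 0.0)
            gb[k] += dz / n
            for j, xj in enumerate(x):
                gW[k][j] += dz * xj / n
    # adjust every parameter a little bit against its partial derivative
    new_W = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W, gW)]
    new_b = [bi - lr * g for bi, g in zip(b, gb)]
    return new_W, new_b

# Hypothetical toy data: 2 features, 3 classes
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 1.0], 2)]
W = [[0.0, 0.0] for _ in range(3)]
b = [0.0] * 3
loss_before = mean_loss(W, b, data)
for _ in range(200):
    W, b = gradient_step(W, b, data, lr=0.5)
loss_after = mean_loss(W, b, data)

# Parameter count of the MNIST softmax model: weights plus biases
n_params_mnist = 784 * 10 + 10
```

With all-zero parameters the model predicts each of the three classes with equal probability, so the starting loss is ln 3; each step then lowers it. The last line reproduces the 7850-parameter figure quoted for the “simple” example.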
Demo
- See Jupyter Notebooks “MNIST-Tensorflow” and “MNIST-Keras”
Recurrent Neural Networks
Idea: Units receive input and previous output.
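As a rough sketch of this idea, assuming a simple tanh recurrent layer (the slide does not name a specific cell, and the weights below are arbitrary illustrative values), each step combines the current input with the previous output:

```python
import math

def rnn_step(x, h_prev, W_x, W_h, b):
    """One step of a simple recurrent layer: h = tanh(W_x x + W_h h_prev + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def run_rnn(xs, W_x, W_h, b):
    h = [0.0] * len(b)  # initial "previous output" is all zeros
    outputs = []
    for x in xs:
        h = rnn_step(x, h, W_x, W_h, b)
        outputs.append(h)
    return outputs

# Hypothetical weights: 1 input feature, 2 recurrent units
W_x = [[0.5], [-0.5]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

# Both sequences end with the same input, but their histories differ
outs_a = run_rnn([[1.0], [0.0]], W_x, W_h, b)
outs_b = run_rnn([[0.0], [0.0]], W_x, W_h, b)
```

The final outputs of the two runs differ even though the final inputs are identical, because the first run carries information forward from its earlier input.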
Training
- Training examples consist of sequences of input and output data.
- E.g., training a network on a vocabulary of four letters (h, e, l, o) and the example “hello”
  - X would be “h, e, l, l”
  - y would be “e, l, l, o”
- Each step finds Loss gradient over all examples and through time.
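The “hello” example can be written out as data. This is a small sketch with made-up helper names (`one_hot`, `make_example`): the inputs are every character except the last, and the targets are the same sequence shifted by one.

```python
vocab = ["h", "e", "l", "o"]
char_to_index = {c: i for i, c in enumerate(vocab)}

def one_hot(index, size):
    v = [0] * size
    v[index] = 1
    return v

def make_example(text):
    inputs = text[:-1]   # "hell"
    targets = text[1:]   # "ello"
    X = [one_hot(char_to_index[c], len(vocab)) for c in inputs]
    y = [char_to_index[c] for c in targets]
    return X, y

X, y = make_example("hello")
# X is four one-hot vectors for h, e, l, l
# y is the index of the next character at each step: e, l, l, o
```

Note that the input “l” appears twice with different targets (“l” then “o”), which is why the network needs its previous output as well as the current input.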
Long Short-Term Memory Cells
- Cells can have “memory” of some internal state that is preserved in between time-steps.
- LSTM (Long Short-Term Memory Cell) is very popular
- Another is GRU (Gated Recurrent Unit).
- Not much evidence that either has a clear advantage over the other.
- Many “casual” deep learners just work with LSTM.
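To make the “memory” idea concrete, here is a sketch of the standard LSTM update for a single scalar cell. Real layers use vectors and weight matrices; the parameter names and values here are illustrative assumptions.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step with scalar input, output h, and cell state c."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # keep part of the old state, add part of the new
    h = o * math.tanh(c)     # output exposes a gated view of the state
    return h, c

# Hypothetical parameters: forget gate nearly open, input gate nearly closed
params = {
    "wf": 0.0, "uf": 0.0, "bf": 10.0,
    "wi": 0.0, "ui": 0.0, "bi": -10.0,
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wg": 1.0, "ug": 0.0, "bg": 0.0,
}
h, c = lstm_step(1.0, 0.0, 0.8, params)
```

With these gate settings the cell state stays very close to its previous value of 0.8 regardless of the input, which is the sense in which the state is preserved between time-steps.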
“Unreasonable” Effectiveness
- RNNs seem to work really well for certain tasks.
- Generating natural language at character level is a good example
- Andrej Karpathy had a lot of fun with the char-RNN idea.
Demo character-level RNN
- Look at charRNN-Keras Jupyter Notebook
Still possible to have kinks:
Generated: be not afraid of greatness: some are born great, stal3n3r333333ng3l33n333s333 3f the3rd3n33n33s333s3333333333333333333333333333n33l33c33333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333
Generated: be not afraid of greatness: some are born great, such so the hands;i have the ready we mad bring some which thee.
sicinio:so the man that his doliester to break their town.
second servinam:the noble her own or say the hands and the prest and down,and i am madam, sir, peterving the bark.
wance:some we hence and not down they the honour to branged,and i grace the provilly frattion, or and strive be firstbut since be disse'eret?
A note on Convolutional Neural Networks.
- Mainly used in image recognition
- Convolution layers apply “learnable filters” to segments of image.
- “De-convolution” can undo this process and produce image.
- Successful results in “deep dream” images and “style transfer”
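Applying a filter to segments of an image can be sketched as follows. This is a plain-Python illustration with a hand-picked kernel and a tiny made-up image; in a trained CNN the kernel values are learned, and frameworks implement the operation far more efficiently.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and take a weighted sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

# Hypothetical 3x4 image: dark on the left, bright on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1]]  # hand-picked here; a CNN would learn these values
feature_map = conv2d_valid(image, edge_kernel)
```

The resulting feature map is non-zero only where the image changes from dark to bright, so this particular filter acts as a vertical edge detector.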
Folk-RNN
arXiv:1604.08723 [cs.SD]
Data and Model
- Data are folk tune transcriptions in “ABC” format from thesession.org
- Cleaned corpus includes 23636 transcriptions all transposed to C.
- 137 unique tokens (vocabulary elements) covering pitch, structure, duration, meter, etc.
- Data are encoded as one-hot vectors.
- Network architecture: three LSTM layers of 512 units each.
- Paper found that folk-RNN (with 137 elements in vocab) outperforms a char-RNN on the same data despite similar numbers of params.
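The difference between a token vocabulary and a character vocabulary can be illustrated with the opening of the cleaned-up transcription on this slide. This sketch assumes tokens are separated by whitespace, which is a guess based on how the cleaned example is printed rather than the paper's actual tokeniser.

```python
# Header and first bar of the cleaned transcription (start token omitted)
excerpt = "M:4/4 K:Cdor g c c c' b 2 a b |"

tokens = excerpt.split()        # assumption: whitespace-delimited tokens
vocab = sorted(set(tokens))
token_to_index = {tok: i for i, tok in enumerate(vocab)}

def one_hot(index, size):
    v = [0] * size
    v[index] = 1
    return v

encoded = [one_hot(token_to_index[t], len(vocab)) for t in tokens]

n_tokens = len(tokens)   # sequence length at token level
n_chars = len(excerpt)   # sequence length at character level
```

Multi-character symbols such as `M:4/4` and `K:Cdor` become single steps, so the token sequence is much shorter than the character sequence for the same music.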
T: A Cup Of Tea
M: 4/4
L: 1/8
K: Ador
eAAa ~g2fg|eA~A2 BGBd|eA~A2 ~g2fg|1af (3gfe dG~G2:|2af (3gfe d2^cd||eaag efgf|eaag ed (3Bcd|eaag efgb|af (3gfe d2^cd:|
Cleaned up:
<s> M:4/4 K:Cdor g c c c’ b 2 a b | g c c 2 d B d f | g c c 2 b 2 a b |1 c’ a (3 b a g f B B 2 :| |2 c’ a (3 b a g f 2 =e f | g c’ c’ b g a b a | g c’ c’ b g f (3 d e f | g c’ c’ b g a b d’ | c’ a (3 b a g f 2 =e f :| <\s>
Generating “Transcriptions” via LSTM
- FolkRNN produces lead sheets (“Transcriptions”) which are then interpreted by performers.
- Authors interested in evaluating and comparing music generation techniques.
- How useful are they? How much cherry picking needs to be done? What could their role be in assisting human creativity?
- https://highnoongmt.wordpress.com/2017/03/19/benchmarking-music-generation-systems/
Applications in composition?
- The Millennial Whoop Jig
- “Given the ubiquity of this short musical phrase called the “Millennial Whoop”, let’s compose some Celtic-style music that “features” it. I am going to use the folk-rnn deep LSTM model that we trained and presented recently. (This is a continuation of my explorations of using deep learning for assisting the process of music composition.)”
- “Let’s start with a jig, which is a 6/8 dance. Using our model trained on over 23,000 folk music transcriptions, I seed with the millennial whoop as the beginning measure and ask for 3 transcriptions in C major:”
WaveNet
Generating Raw Audio
- Convolutional network. Input is a series of digital audio samples, output is predicted next sample.
- So far, primary use-case is speech synthesis
- Network is also conditioned on phonetic features to define sounds to make.
- DeepMind demonstrate a “musical” example trained on “classical piano” music, which makes piano-like sounds.
- https://deepmind.com/blog/wavenet-generative-model-raw-audio/
- WaveNet: a generative model for raw audio
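WaveNet's convolutions are causal and dilated, so each output depends only on current and past samples. The following is a simplified sketch of that one building block with made-up weights and a toy signal; it leaves out the gating, residual connections, and output quantisation of the real model.

```python
def causal_dilated_conv(samples, weights, dilation):
    """out[t] = sum_k weights[k] * samples[t - k * dilation], zero-padded on the left."""
    out = []
    for t in range(len(samples)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # never look at future samples
                acc += w * samples[idx]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

signal = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical audio samples
layer1 = causal_dilated_conv(signal, [1.0, 1.0], dilation=1)
layer2 = causal_dilated_conv(layer1, [1.0, 1.0], dilation=2)
```

With all weights set to 1, the last value of `layer2` is the sum of the last four input samples, matching `receptive_field(2, [1, 2])`. Doubling the dilation at each layer lets a stack cover a long stretch of audio with few layers.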
Magenta - https://magenta.tensorflow.org
Coordinated MIDI Generation
- Extensive project for generating music with ANNs
- Open source: https://github.com/tensorflow/magenta
- Collection of approaches for generating music (and images).
- RNNs tuned for monophonic melodies, polyphony, melody + chords, drums, etc.
- Written in TensorFlow with abstracted data cleaning, training, generation, and models. Code is tough to read.
- Magenta-Discuss is a good resource! 50% thoughtfulness, 50% noobs.
- Blog also good
Magenta Demo
- Magenta Attention Model trained on MIDI from all Legend of Zelda soundtracks (up to 2016).
- Generated 2048 notes with no primer.
- Magenta work seems useful, but lacks research goals.
- Could easily be repurposed by researchers / creators - if so, good chance Google would promote the work too!
Zelda Attention RNN Output
Neural Mashups
- Take an image
- Use a captioning network (im2txt) to generate some sort of lyrics.
- Set them to a generated song
- Use speech synth to sing
- Headline: “AI wrote a Christmas carol!”
- E.g., https://vimeo.com/192711856
- Flow Machines “pop songs” are (better developed) examples of this.
Deep Learning Handwriting Paths
LSTM + Mixture Distribution Network
https://greydanus.github.io/2016/08/21/handwriting/
Fake Kanji Generator
MDN
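The slide names the technique without detail. In a mixture network the model outputs the parameters of a mixture of distributions, and the next value is sampled from that mixture. The sketch below assumes a one-dimensional Gaussian mixture with made-up weights, means, and standard deviations; handwriting models predict 2-D pen movements, and one dimension is used here only for brevity.

```python
import math
import random

def mixture_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture at x."""
    total = 0.0
    for w, mu, sd in zip(weights, means, stds):
        norm = 1.0 / (sd * math.sqrt(2.0 * math.pi))
        total += w * norm * math.exp(-0.5 * ((x - mu) / sd) ** 2)
    return total

def sample_mixture(weights, means, stds, rng):
    """Pick a component according to its weight, then sample from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])

# Hypothetical mixture parameters; in an MDN the network would output these
weights, means, stds = [0.7, 0.3], [-1.0, 2.0], [0.5, 0.5]

rng = random.Random(0)  # seeded for repeatability
samples = [sample_mixture(weights, means, stds, rng) for _ in range(1000)]
sample_mean = sum(samples) / len(samples)
```

Because the output is a distribution and not a single value, the same network state can produce different plausible continuations each time it is sampled.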
Deep Learning Ensemble Interactions
RNN for Ensemble Performances
- 9 possible gestures
- 4 players input, 3 players output
- Input and output encoded as one-hot vectors (i.e., 9^4 input possibilities, 9^3 output possibilities).
- Tried a quartet and duet configuration
- Network used was similar to folkRNN (3 LSTM layers).
- Broken right now! Built for old version of TensorFlow… ;_;
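One way to turn a combination of gestures into a single one-hot index is to read the gestures as digits of a base-9 number. This is an assumed scheme for illustration only; the slide gives the sizes (9^4 and 9^3) but not the exact mapping that was used.

```python
NUM_GESTURES = 9

def encode(gestures):
    """Map a list of gesture ids (each 0-8) to one integer class index."""
    index = 0
    for g in gestures:
        if not 0 <= g < NUM_GESTURES:
            raise ValueError("gesture id out of range")
        index = index * NUM_GESTURES + g
    return index

def decode(index, num_players):
    """Inverse of encode: recover the per-player gesture ids."""
    gestures = []
    for _ in range(num_players):
        index, g = divmod(index, NUM_GESTURES)
        gestures.append(g)
    return list(reversed(gestures))

n_input_classes = NUM_GESTURES ** 4   # four players in
n_output_classes = NUM_GESTURES ** 3  # three players out

quartet_index = encode([1, 2, 3, 4])
```

The class counts grow exponentially with the number of players (6561 inputs and 729 outputs here), which is worth keeping in mind when choosing this kind of representation for larger ensembles.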
ANN to recreate “tiny touch performances”
Where is this going?
Challenges
- Defining a network is super easy.
- Finding good data is hard, good representation can be very hard!
- Training can take a long time, easier with big GPUs
  - workstation with gaming GPUs is handy for experiments
  - Big models trained on several GPUs
- Trained models can be evaluated on simpler systems - even browsers and mobile devices.
- Can be hard to figure out goals or use cases even for interesting data.
- Much blog hype on the internet focusses on existing data, and well-defined problems. But getting these two things is much of the challenge.