![Page 1: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/1.jpg)
Let’s talk about Bach
![Page 3: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/3.jpg)
Let’s talk about Bach BachBot
![Page 4: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/4.jpg)
BachBot: Automatic Stylistic Composition with LSTM
Feynman Liang GOTO Amsterdam, 13 June 2017
![Page 5: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/5.jpg)
About Me
• Software Engineer at
• MPhil Machine Learning @ University of Cambridge
• Joint work w/ Microsoft Research Cambridge
![Page 6: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/6.jpg)
Executive summary
• Deep recurrent neural network model for music capable of:
  • Polyphony
  • Automatic composition
  • Harmonization
• Learns music theory without prior knowledge
• Only 7% out of 1779 participants in a musical Turing test performed better than random chance
![Page 7: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/7.jpg)
Goals of the work
• Where is the frontier of computational creativity?
• How much has deep learning advanced automatic composition?
• How do we evaluate generative models?
vs
![Page 8: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/8.jpg)
Overview
• Music primer
• Chorales dataset preparation
• RNN primer
• The BachBot model
• Results
• Musical Turing test
![Page 9: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/9.jpg)
If you’re a hands-on type
feynmanliang.github.io/bachbot-slides

    docker pull fliang/bachbot:aibtb
    docker run --name bachbot -it fliang/bachbot:cornell
    bachbot datasets prepare
    bachbot datasets concatenate_corpus scratch/BWV-*.utf
    bachbot make_h5
    bachbot train
    bachbot sample ~/bachbot/scratch/checkpoints/*/checkpoint_<ITER>.t7 -t
    bachbot decode sampled_stream ~/bachbot/scratch/sampled_$TMP.utf
    docker cp bachbot:/root/bachbot/scratch/out .
![Page 10: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/10.jpg)
Music primer
![Page 11: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/11.jpg)
Modern music notation
![Page 12: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/12.jpg)
Pitch: how “high” or “low” a note is
![Page 13: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/13.jpg)
Duration: how “long” a note is
![Page 14: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/14.jpg)
Polyphony: multiple simultaneous voices
![Page 15: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/15.jpg)
Piano roll: convenient computational representation
![Page 16: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/16.jpg)
Fermatas and phrases
![Page 17: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/17.jpg)
Chorales dataset preparation
![Page 19: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/19.jpg)
Transpose to Cmaj/Amin (convenience), quantize to 16th notes (computational)
![Page 20: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/20.jpg)
Transposition preserves relative pitches
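A minimal Python sketch of why transposition is safe: shifting every MIDI pitch by the same number of semitones leaves the intervals between notes unchanged. The function names and the example chord are illustrative assumptions, not BachBot's preprocessing code.

```python
def transpose(pitches, semitones):
    """Shift every MIDI pitch number by the same number of semitones."""
    return [p + semitones for p in pitches]


def intervals(pitches):
    """Semitone distances between consecutive pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


# Hypothetical example: a D major triad (D4, F#4, A4) moved down 2 semitones.
d_major = [62, 66, 69]
c_major = transpose(d_major, -2)

print(c_major)                                    # [60, 64, 67]
print(intervals(d_major) == intervals(c_major))   # True
```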
![Page 21: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/21.jpg)
Quantization to 16th notes preserves meter and affects less than 0.2% of dataset
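Quantization can be sketched as snapping each onset to the nearest 16th-note grid point. This is a hedged illustration: the time unit (quarter-note lengths), the `quantize` helper, and the sample onsets are assumptions, and Python's `round` resolves exact ties to the even grid index.

```python
from fractions import Fraction

SIXTEENTH = Fraction(1, 4)  # one 16th note, measured in quarter-note lengths


def quantize(offset, step=SIXTEENTH):
    """Snap a time offset (in quarter notes) to the nearest multiple of `step`."""
    return round(Fraction(offset) / step) * step


# Hypothetical onsets: values already on the grid stay put, the last one moves.
onsets = [Fraction(0), Fraction(1, 2), Fraction(3, 4), Fraction(13, 10)]
snapped = [quantize(t) for t in onsets]
changed = sum(1 for a, b in zip(onsets, snapped) if a != b)

print(snapped[-1])  # 5/4
print(changed)      # 1
```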
![Page 22: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/22.jpg)
Handling polyphony
Question: How many chords can be constructed from 4 voices, each with 128 pitches?
Answer: O(128^4)! Data sparsity issue…
![Page 23: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/23.jpg)
Serialize in SATB order
START
(59, True) (56, True) (52, True) (47, True)
|||
(59, True) (56, True) (52, True) (47, True)
|||
(.) (57, False) (52, False) (48, False) (45, False)
|||
(.) (57, True) (52, True) (48, True) (45, True)
|||
END
O(128^4) => O(128) vocab. size!
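The serialization idea can be sketched in Python as follows. This is an assumption-laden illustration rather than BachBot's encoder: the `serialize` function and the frame layout are hypothetical, and the second element of each note is copied from the slide as an opaque flag.

```python
START, END, FRAME_DELIM, FERMATA = "START", "END", "|||", "(.)"


def serialize(frames):
    """Flatten frames into one token sequence.

    Each frame is (notes, has_fermata); notes holds one (midi_pitch, flag)
    pair per voice, listed soprano, alto, tenor, bass.
    """
    tokens = [START]
    for notes, has_fermata in frames:
        if has_fermata:
            tokens.append(FERMATA)
        tokens.extend(str(note) for note in notes)
        tokens.append(FRAME_DELIM)
    tokens.append(END)
    return tokens


first = [(59, True), (56, True), (52, True), (47, True)]
frames = [
    (first, False),
    (first, False),
    ([(57, False), (52, False), (48, False), (45, False)], True),
    ([(57, True), (52, True), (48, True), (45, True)], True),
]
tokens = serialize(frames)

print(len(tokens))       # 24 tokens in the sequence
print(len(set(tokens)))  # 15 distinct symbols
print(128 ** 4)          # 268435456 possible 4-voice chords
```

Because each token describes a single voice, the vocabulary grows with the number of distinct notes rather than with the number of distinct chords.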
![Page 24: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/24.jpg)
Size of pre-processed dataset
![Page 25: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/25.jpg)
Recurrent neural networks (RNN) primer
![Page 26: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/26.jpg)
Neuron
Input x, output y, parameters w, activations z
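A single neuron in the notation above can be sketched as a weighted sum followed by a nonlinearity. The bias term `b` and the choice of a sigmoid are assumptions made for illustration; the slide names only x, y, w, and z.

```python
import math


def neuron(x, w, b=0.0):
    """Return (z, y): the weighted sum of the inputs and its sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = 1.0 / (1.0 + math.exp(-z))
    return z, y


z, y = neuron(x=[1.0, 2.0], w=[0.5, -0.25])
print(z, y)  # 0.0 0.5
```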
![Page 27: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/27.jpg)
Feedforward neural network
![Page 28: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/28.jpg)
Memory cell
![Page 29: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/29.jpg)
Long short-term memory (LSTM) cell
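For readers who want the cell in code, here is a scalar version of the standard textbook LSTM update. It is a sketch under assumptions: the talk may use a different variant, and the parameter layout and gate names are chosen here for readability.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def lstm_step(x, h, c, params):
    """One step of a scalar LSTM cell.

    params maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h + b

    i = sigmoid(pre("input"))         # how much new content to write
    f = sigmoid(pre("forget"))        # how much old cell state to keep
    o = sigmoid(pre("output"))        # how much cell state to expose
    g = math.tanh(pre("candidate"))   # proposed new content
    c_next = f * c + i * g
    h_next = o * math.tanh(c_next)
    return h_next, c_next


# Hypothetical weights: all zero, so every gate sits at 0.5 and g is 0.
params = {name: (0.0, 0.0, 0.0)
          for name in ("input", "forget", "output", "candidate")}
h, c = lstm_step(x=1.0, h=0.0, c=2.0, params=params)
print(c)  # 1.0, half of the old cell state survives the forget gate
```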
![Page 30: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/30.jpg)
Stacking memory cells to form a deep RNN
Unrolling for training
![Page 31: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/31.jpg)
The BachBot model
![Page 32: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/32.jpg)
Sequential prediction training criteria
https://karpathy.github.io/2015/05/21/rnn-effectiveness/
![Page 33: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/33.jpg)
Model formulation
• RNN dynamics
• Initial state (all 0s)
• Prob. distr. over sequences
Need to choose the RNN parameters in order to maximize the probability of the real Bach chorales.
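One standard way to write the ingredients named on this slide, offered as an assumption rather than the slide's exact notation: $h_t$ is the hidden state, $x_t$ the symbol at step $t$, $\theta$ the RNN parameters, $f_\theta$ the recurrent update, and $\mathcal{D}$ the set of Bach chorales.

```latex
\begin{align}
h_0 &= \mathbf{0}, \\
h_t &= f_\theta(h_{t-1}, x_t), \\
P_\theta(x_{1:T}) &= \prod_{t=1}^{T} P_\theta(x_t \mid x_{1:t-1})
                   = \prod_{t=1}^{T} P_\theta(x_t \mid h_{t-1}), \\
\theta^\star &= \arg\max_\theta \sum_{x_{1:T} \in \mathcal{D}} \log P_\theta(x_{1:T}).
\end{align}
```

The hidden state $h_{t-1}$ stands in for the whole prefix $x_{1:t-1}$, which is what lets a fixed-size network assign probabilities to sequences of any length.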
![Page 34: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/34.jpg)
Back-propagation
![Page 35: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/35.jpg)
Optimizing model architecture
![Page 36: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/36.jpg)
GPUs deliver an 8x performance speedup
![Page 37: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/37.jpg)
Depth matters (to a certain point)
![Page 38: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/38.jpg)
Hidden state size matters (to a certain point)
![Page 39: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/39.jpg)
LSTM memory cells matter
![Page 40: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/40.jpg)
The final model
• Architecture
  • 32-dimensional vector space embedding
  • 3-layer, 256-hidden-unit LSTMs
• “Tricks” during training
  • 30% dropout
  • Batch normalization
  • 128-timestep truncated back-propagation through time (BPTT)
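To get a feel for the size of such a model, the parameter count can be estimated with a few lines of arithmetic. This is a rough sketch under assumptions: the vocabulary size of 128 is hypothetical, each gate is given a single bias vector (some libraries use two), and batch normalization parameters are ignored.

```python
def lstm_layer_params(input_size, hidden_size):
    """Parameters in one LSTM layer: four gates, each with input weights,
    recurrent weights, and one bias vector."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def model_params(vocab_size, embed_dim=32, hidden_size=256, num_layers=3):
    embedding = vocab_size * embed_dim
    lstm = lstm_layer_params(embed_dim, hidden_size)
    lstm += (num_layers - 1) * lstm_layer_params(hidden_size, hidden_size)
    output = hidden_size * vocab_size + vocab_size  # projection back to symbols
    return embedding + lstm + output


VOCAB_SIZE = 128  # hypothetical; the slides give only the order of magnitude
print(model_params(VOCAB_SIZE))  # 1383552
```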
![Page 41: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/41.jpg)
Dropout improves generalization
![Page 42: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/42.jpg)
Automatic composition
• Sample next symbol from the model’s predictive distribution
• Condition model on sampled symbol
• Repeat
https://karpathy.github.io/2015/05/21/rnn-effectiveness/
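The sample, condition, repeat loop can be sketched in Python as below. The `toy_model` function is a hypothetical stand-in for the trained network, with made-up symbols and probabilities; only the loop structure reflects the steps listed above.

```python
import random


def toy_model(history):
    """Stand-in for the trained network: returns P(next symbol | history)."""
    if len(history) >= 8:
        return {"END": 1.0}
    return {"(60, True)": 0.5, "(64, True)": 0.3, "|||": 0.15, "END": 0.05}


def compose(model, rng, max_len=100):
    sequence = ["START"]
    while sequence[-1] != "END" and len(sequence) < max_len:
        dist = model(sequence)                   # condition on everything so far
        symbols, probs = zip(*dist.items())
        next_symbol = rng.choices(symbols, weights=probs, k=1)[0]  # sample
        sequence.append(next_symbol)             # repeat with the longer history
    return sequence


piece = compose(toy_model, random.Random(0))
print(piece)
```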
![Page 43: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/43.jpg)
Harmonization
• Some parts are fixed (e.g. a given melody)
• Need to choose highest probability symbols for the free parts
• Greedy 1-best selection
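Greedy 1-best selection can be sketched as a left-to-right pass that keeps fixed symbols and fills each free slot with the single most probable symbol. The `toy_model` below is a hypothetical stand-in with invented probabilities over MIDI pitches, and the template format (`None` marks a free slot) is an assumption for illustration.

```python
def harmonize(template, model):
    """Fill free slots (None) left to right with the most probable symbol."""
    sequence = []
    for slot in template:
        if slot is None:
            dist = model(sequence)
            slot = max(dist, key=dist.get)  # greedy 1-best choice
        sequence.append(slot)
    return sequence


def toy_model(history):
    """Hypothetical stand-in: scores pitches relative to the previous one."""
    last = history[-1]
    return {last - 3: 0.2, last - 4: 0.5, last - 7: 0.3}


# Fixed melody pitches alternate with free slots for an accompanying voice.
result = harmonize([60, None, 62, None], toy_model)
print(result)  # [60, 56, 62, 58]
```

A greedy pass commits to each choice without looking ahead at later fixed symbols, so it is fast but not guaranteed to find the most probable overall completion.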
![Page 44: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/44.jpg)
Results
![Page 45: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/45.jpg)
Activations are difficult to interpret!
Input and memory cell (layer 1 and 2) activations
![Page 46: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/46.jpg)
Layers closer to the output resemble piano roll (consequence of sequential training criteria)
Memory cell (layer 3) activations, output activations, and predictions
![Page 47: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/47.jpg)
Model learns music theory!
• L1N64 and L1N138: Perfect cadences with root position chords in tonic key
• L1N151: A minor cadences ending phrases 2 and 4
• L1N87 and L2N37: I6 chords
![Page 48: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/48.jpg)
Harmonization: given a melody (here C major scale)
![Page 49: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/49.jpg)
Harmonization: produce the accompanying parts
https://soundcloud.com/bachbot
![Page 50: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/50.jpg)
Harmonization: can also be used on non-Baroque melodies!
https://soundcloud.com/bachbot
![Page 52: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/52.jpg)
![Page 53: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/53.jpg)
![Page 54: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/54.jpg)
Participants by country
![Page 55: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/55.jpg)
Participants by age group and music experience
![Page 56: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/56.jpg)
Correct discrimination rates for composition (SATB) and harmonization (others) questions
![Page 57: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/57.jpg)
More experienced respondents tend to do better
![Page 58: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/58.jpg)
Conclusion
• Deep LSTM generative model for composing, completing, and generating polyphonic music
• Open source (github.com/feynmanliang/bachbot)
• Integrated into Magenta’s “Polyphonic RNN” model (magenta.tensorflow.org)
• Appears to learn music-theoretic concepts without prior knowledge
• Largest music Turing test to date with over 1779 participants
• Average performance on music Turing test only 7% better than random guessing
![Page 59: Let’s talk about Bach](https://reader031.vdocuments.net/reader031/viewer/2022012423/6177b24218249e47a06fbd9a/html5/thumbnails/59.jpg)
Thank You!
• Questions?
• Need a world-class development team for your next project?
• Email me at: [email protected]