Fraunhofer IDMT
Jakob Abeßer (1,3), Stefan Balke (2), Klaus Frieler (3),
Martin Pfleiderer (3), Meinard Müller (2)
Deep Learning for Jazz Walking Bass Transcription
1 Semantic Music Technologies Group, Fraunhofer IDMT, Ilmenau, Germany
2 International Audio Laboratories Erlangen, Germany
3 Jazzomat Research Project, University of Music Franz Liszt, Weimar, Germany
Motivation: What is a Walking Bass Line?
- Example: Miles Davis: So What (Paul Chambers: bass)
- Our assumptions for this work:
  - Quarter notes (mostly chord tones)
  - Representation: beat-wise pitch values
[Figure: score excerpt of a walking bass line over Dm7 (D, F, A, C); beat-wise pitches: C A F A D F A D A D A F A F]
Motivation: How is this useful?
- Harmonic analysis
  - Composition (lead sheet) vs. actual performance
  - Polyphonic transcription from ensemble recordings is challenging
  - Walking bass line can provide first clues about local harmonic changes
- Features for style & performer classification
Problem Setting
- Challenges
  - Bass is typically not salient
  - Overlap with drums (bass drum) and piano (lower register)
  - High variability of recording quality and playing styles
  - Example: Lester Young: Body and Soul
- Goals
  - Train a DNN to extract a bass pitch saliency representation
  - Postprocessing: use manual beat annotations to obtain beat-wise bass pitch values
Outline
- Dataset
- Approach
  - Bass Saliency Mapping
  - Semi-Supervised Model Training
  - Beat-Informed Late Fusion
- Evaluation
- Results
- Summary & Outlook
Dataset
- Weimar Jazz Database (WJD) [1]
  - 456 high-quality jazz solo transcriptions
  - Annotations: solo melody, beats, chords, segments (phrase, chorus, ...)
  - 41 files with bass annotations
- Data augmentation (+): pitch-shifting by ±1 semitone (sox library [2])
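The augmentation step above can be sketched as follows, assuming the sox command-line tool [2] is used for the ±1-semitone shifts; the file names and helper name are illustrative, not the authors' actual code. sox's `pitch` effect takes the shift in cents (100 cents = 1 semitone).

```python
# Sketch: build one sox command per pitch shift for a single input file.
# sox's "pitch" effect expects the shift in cents (100 cents = 1 semitone).

def augmentation_commands(infile, semitones=(-1, 1)):
    """Return sox command lines that pitch-shift `infile` by each semitone offset."""
    cmds = []
    for n in semitones:
        outfile = infile.replace(".wav", f"_shift{n:+d}.wav")
        cmds.append(["sox", infile, outfile, "pitch", str(100 * n)])
    return cmds

for cmd in augmentation_commands("bass_track.wav"):
    print(" ".join(cmd))
```

Each command could then be run with `subprocess.run`, tripling the training material (original plus two shifted copies).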
Bass-Saliency Mapping
- Data-driven approach
  - Use a Deep Neural Network (DNN) to learn a mapping from the magnitude spectrogram to a bass saliency representation
- Spectral analysis
  - Resampling to 22.05 kHz
  - Constant-Q magnitude spectrogram (librosa [3])
  - Pitch range: MIDI pitch 28 (41.2 Hz) to 67 (392 Hz)
- Multilabel classification
  - Input dimensions: 40 * NContextFrames
  - Output dimensions: 40
  - Learning target: bass pitch annotations from the WJD
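The input layout above (40 pitch bins times NContextFrames) can be illustrated with a small NumPy sketch; `spec` stands for the constant-Q magnitude spectrogram, and the function name and edge-padding choice are assumptions, not the authors' exact implementation.

```python
import numpy as np

def stack_frames(spec, n_context=5):
    """Stack n_context neighbouring spectrogram frames into one input vector per frame.

    spec: array of shape (n_bins, n_frames), here (40, T).
    Returns an array of shape (n_bins * n_context, n_frames).
    """
    n_bins, n_frames = spec.shape
    half = n_context // 2
    # Pad at the edges so every frame has a full context window.
    padded = np.pad(spec, ((0, 0), (half, half)), mode="edge")
    # Column t of the output gathers frames t-half .. t+half of the original.
    out = np.stack([padded[:, t:t + n_frames] for t in range(n_context)])
    return out.reshape(n_context * n_bins, n_frames)

spec = np.random.rand(40, 100)          # 40 pitch bins, 100 frames
X = stack_frames(spec, n_context=5)
print(X.shape)                          # (200, 100): 40 * 5 inputs per frame
```

With 5 context frames this yields the 200-dimensional input per frame implied by the slide, while the target stays a 40-dimensional saliency vector.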
DNN Hyperparameters
- Layer-wise training [4, 5]
  - Least-squares estimate for weight & bias initialization
- 3-5 fully connected layers, MSE loss
- Frame stacking (3-5 context frames) & feature normalization
- Activation functions: ReLU, sigmoid (final layer)
- Dropout & L2 weight regularization
- Adadelta optimizer
  - Mini-batch size = 500
  - 500 epochs / layer
  - Learning rate = 1
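A minimal NumPy sketch of the network shape described on this slide: fully connected layers with ReLU hidden activations, a sigmoid output layer, and an MSE loss. The layer sizes and the random initialization are illustrative assumptions (the authors initialize with a least-squares estimate and train layer-wise with Adadelta).

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(n_in, n_out):
    # Simplification: random init instead of the least-squares estimate [4, 5].
    return rng.normal(0.0, 0.01, (n_in, n_out)), np.zeros(n_out)

# Illustrative sizes: 200 stacked inputs (40 bins * 5 context frames) -> 40 pitches.
layers = [init_layer(200, 128), init_layer(128, 128), init_layer(128, 40)]

def forward(x):
    """Forward pass: ReLU on hidden layers, sigmoid on the final layer."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        x = sigmoid(x) if i == len(layers) - 1 else relu(x)
    return x

def mse(pred, target):
    return np.mean((pred - target) ** 2)

x = rng.random((500, 200))   # one mini-batch of 500 stacked frames
y = forward(x)
print(y.shape)               # (500, 40): per-frame bass pitch saliency
```

The sigmoid output keeps each of the 40 pitch activations in [0, 1], matching the multilabel saliency formulation.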
Semi-Supervised Training
- Goal: use predictions on unseen data as additional training data
[Diagram legend: F = features, T = targets]
Semi-Supervised Training: Sparsity-based Selection
- Train model on labelled dataset D1+ (3899 notes)
- Predict on unlabelled dataset D2+ (11697 notes)
- Select additional training data whose saliency sparsity [6] exceeds a threshold t
- Re-train the model on the enlarged set
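The selection step above can be sketched with the sparseness measure of Hoyer [6]; the threshold value and function names here are assumptions for illustration.

```python
import numpy as np

def hoyer_sparseness(x):
    """Hoyer sparseness [6]: 1 for a one-hot vector, 0 for a constant vector."""
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum())
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

def select_frames(saliency, threshold=0.6):
    """Keep indices of predicted frames whose pitch saliency is sparse enough."""
    scores = np.array([hoyer_sparseness(frame) for frame in saliency.T])
    return np.where(scores > threshold)[0]

one_hot = np.eye(40)[:, :1]      # maximally sparse frame (one active pitch)
flat = np.full((40, 1), 0.5)     # maximally dense frame (no clear pitch)
sal = np.hstack([one_hot, flat])
print(select_frames(sal))        # [0]: only the sparse frame is kept
```

Intuitively, frames where the model is confident produce a near one-hot saliency column and pass the threshold, while ambiguous frames are discarded rather than fed back as pseudo-labels.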
Beat-Informed Late Fusion
- Use manual beat annotations from the Weimar Jazz Database
- Find the most salient pitch per beat
[Figure: frame-wise saliency grouped into beats 1-4]
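A small sketch of this fusion step, assuming beat positions are given as frame indices and the saliency map uses MIDI pitch 28 as its lowest bin; the averaging within each beat is one plausible aggregation, not necessarily the authors' exact choice.

```python
import numpy as np

def beatwise_pitches(saliency, beat_frames, midi_offset=28):
    """Aggregate frame-wise saliency within each beat and keep the most salient pitch.

    saliency: array of shape (40, n_frames); beat_frames: frame index per beat.
    Returns one MIDI pitch per beat interval.
    """
    pitches = []
    for start, end in zip(beat_frames[:-1], beat_frames[1:]):
        segment = saliency[:, start:end].mean(axis=1)   # average over the beat
        pitches.append(midi_offset + int(np.argmax(segment)))
    return pitches

saliency = np.zeros((40, 8))
saliency[22, 0:4] = 1.0   # bin 22 -> MIDI 50 (D3) active in the first beat
saliency[17, 4:8] = 1.0   # bin 17 -> MIDI 45 (A2) active in the second beat
print(beatwise_pitches(saliency, [0, 4, 8]))  # [50, 45]
```

This reduces the frame-wise saliency map to the beat-wise pitch representation assumed for walking bass lines (one quarter-note pitch per beat).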
Evaluation
- Use manual beat annotations from the Weimar Jazz Database
- Compare against state-of-the-art bass transcription algorithms
  - D: Dittmar, Dressler, and Rosenbauer [8]
  - SG: Salamon, Serrà, and Gómez [9]
  - RK: Ryynänen and Klapuri [7]
Example
- Chet Baker: Let's Get Lost (0:04-0:09)
- Initial model
  - M1: without data augmentation
  - M1+: with data augmentation
- Semi-supervised learning
  - M2(0,+): threshold t0
  - M2(1,+): threshold t1
  - M2(2,+): threshold t2
  - M2(3,+): threshold t3
- Reference methods
  - D: Dittmar et al.
  - SG: Salamon et al.
  - RK: Ryynänen & Klapuri
Results
Summary
- The data-driven approach seems to enhance non-salient instruments.
- Beneficial:
  - Data augmentation & dataset enlargement
  - Frame stacking (stable bass pitches)
  - Beat-informed late fusion
- Semi-supervised training did not improve accuracy but made bass-saliency maps sparser
- The model is limited to the training set's pitch range
Acknowledgements
- Thanks to the authors for sharing bass transcription algorithms / results.
- Thanks to the students for transcribing the bass lines.
- German Research Foundation
  - DFG PF 669/7-2: Jazzomat Research Project (2012-2017)
  - DFG MU 2686/6-1: Stefan Balke and Meinard Müller
References
[1] Weimar Jazz Database: http://jazzomat.hfm-weimar.de
[2] sox: http://sox.soundforge.net
[3] McFee, B., Raffel, C., Liang, D., Ellis, D. P. W., McVicar, M., Battenberg, E., and Nieto, O., "librosa: Audio and Music Signal Analysis in Python," in Proc. of the Scientific Computing with Python Conference (SciPy), Austin, Texas, 2015.
[4] Uhlich, S., Giron, F., and Mitsufuji, Y., "Deep neural network based instrument extraction from music," in Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 2135-2139, Brisbane, Australia, 2015.
[5] Balke, S., Dittmar, C., Abeßer, J., and Müller, M., "Data-Driven Solo Voice Enhancement for Jazz Music Retrieval," in Proc. of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, USA, 2017.
[6] Hoyer, P. O., "Non-negative Matrix Factorization with Sparseness Constraints," J. of Machine Learning Research, 5, pp. 1457-1469, 2004.
[7] Ryynänen, M. P. and Klapuri, A., "Automatic transcription of melody, bass line, and chords in polyphonic music," Computer Music J., 32, pp. 72-86, 2008.
[8] Dittmar, C., Dressler, K., and Rosenbauer, K., "A Toolbox for Automatic Transcription of Polyphonic Music," in Proc. of the Audio Mostly Conf., pp. 58-65, 2007.
[9] Salamon, J., Serrà, J., and Gómez, E., "Tonal Representations for Music Retrieval: From Version Identification to Query-by-Humming," Int. J. of Multimedia Inf. Retrieval, 2, pp. 45-58, 2013.
- Jazzomat Research Project & Weimar Jazz Database: http://jazzomat.hfm-weimar.de/
- Python code and trained model available: https://github.com/jakobabesser/walking_bass_transcription_dnn
- Additional online demos: https://www.audiolabs-erlangen.de/resources/MIR/2017-AES-WalkingBassTranscription
Thank You!