feature extraction for musical genre classification mus15

Post on 19-Nov-2021

4 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Feature Extraction for Musical GenreClassification

MUS15

Kilian Merkelbach

June 22, 2015

Kilian Merkelbach | June 21, 2015 0/26

Feature Extraction for Musical Genre Classification MUS15

Kilian Merkelbach | June 22, 2015 1/26

1 Intro

2 History

3 MethodsTzanetakis & CookCorrea, Costa & Saito

4 Conclusion

Intro -

Contents

Kilian Merkelbach | June 22, 2015 2/26

Intro -

What are genres?

Kilian Merkelbach | June 22, 2015 3/26

Intro -

Genres provide order

Kilian Merkelbach | June 22, 2015 4/26

Intro -

Pipeline - Feature extraction

Kilian Merkelbach | June 22, 2015 5/26

Intro -

Pipeline - Training and Classification

Kilian Merkelbach | June 22, 2015 6/26

1 Intro

2 History

3 MethodsTzanetakis & CookCorrea, Costa & Saito

4 Conclusion

History -

Contents

Kilian Merkelbach | June 22, 2015 7/26

I 1998: First MP3 players availableI 1999: Napster, mp3.com go onlineI Commercial and private music collections explode

History -

MP3 revolution

Kilian Merkelbach | June 22, 2015 8/26

I 2001: Tzanetakis and Cook build foundation by publishing first paperI from then on: methods evolve from previous ones, always trying to

improve performance

History -

First steps

Kilian Merkelbach | June 22, 2015 9/26

1 Intro

2 History

3 MethodsTzanetakis & CookCorrea, Costa & Saito

4 Conclusion

Methods -

Contents

Kilian Merkelbach | June 22, 2015 10/26

I Tzanetakis, Cook: "Musical Genre Classification of Audio Signals"I Correa, Costa, Saito: "Tracking the Beat: Classification of Music

Genres and Synthesis of Rhythms"

Methods -

State of the Art

Kilian Merkelbach | June 22, 2015 11/26

I Timbral Texture Features (19 dim.)I Rhythmic Content Features (6 dim.)I Pitch Content Features (5 dim.)

Methods - Tzanetakis & Cook

Features Overview

Kilian Merkelbach | June 22, 2015 12/26

I based on short time Fourier transform (STFT)I Analysis Window (23ms) vs. Texture Window (1s)

[1]

Methods - Tzanetakis & Cook

Timbral Texture Features I

Kilian Merkelbach | June 22, 2015 13/26

I calculate four features from STFT output and use their mean andvariance for classification

I e.g. Spectral Flux:

Ft =

N∑n=1

(Nt[n]−Nt−1[n])2

I additional features based on Mel-frequency cepstral coefficients(MFCCs)

Methods - Tzanetakis & Cook

Timbral Texture Features II

Kilian Merkelbach | June 22, 2015 14/26

I Discrete Wavelet Transform (DWT) to analyze signalI autocorrelation function recognizes strong beats (example)I map peaks into Beat Histogram

Methods - Tzanetakis & Cook

Rhythmic Content Features

Kilian Merkelbach | June 22, 2015 15/26

[3]

Methods - Tzanetakis & Cook

Beat Histogram

Kilian Merkelbach | June 22, 2015 16/26

I Pitch Histogram: like Beat Histogram (0.5-1.5s) but with shorter timeframe (2-50ms)

Methods - Tzanetakis & Cook

Pitch Content Features

Kilian Merkelbach | June 22, 2015 17/26

I 59% accuracy with 10 different genresI 77% among classical genresI 61% among jazz genres

[3]

Methods - Tzanetakis & Cook

Results

Kilian Merkelbach | June 22, 2015 18/26

I Songs given as MIDI filesI Use directed graphs to describe song

Methods - Correa, Costa & Saito

Tracking the Beat

Kilian Merkelbach | June 22, 2015 19/26

I Weighted directed graphs with note lengths as verticesI Weights defined by frequency of note sequence

[2]

Methods - Correa, Costa & Saito

Digraphs

Kilian Merkelbach | June 22, 2015 20/26

I Build digraph for each song: 18 · 18 = 324 dim.I Additional features from digraph: 15 dim.I Use PCA to obtain 52 dim. feature vector

Methods - Correa, Costa & Saito

Features

Kilian Merkelbach | June 22, 2015 21/26

I 85.72% accuracy with 4 different genres (blues, bossa-nova , reggaeand rock)

Methods - Correa, Costa & Saito

Results

Kilian Merkelbach | June 22, 2015 22/26

1 Intro

2 History

3 MethodsTzanetakis & CookCorrea, Costa & Saito

4 Conclusion

Conclusion -

Contents

Kilian Merkelbach | June 22, 2015 23/26

I College students found correct genre (out of 10) with 70% accuracyafter 3 seconds [3]

I Machines are on parI Human experts still better if there are many genres

Conclusion -

Prediction by humans

Kilian Merkelbach | June 22, 2015 24/26

I step away from speech recognition features (e.g. MFCCs)I fuzzy classification (e.g. 90% Rock, 10% Blues)I larger genre sets, higher accuracy

Conclusion -

Future Research

Kilian Merkelbach | June 22, 2015 25/26

Wikimedia Commons.Short time fourier transform, 2006.

Debora C Correa, Luciano da F Costa, and Jose H Saito.Tracking the beat: Classification of music genres and synthesis ofrhythms.

G. Tzanetakis and P. Cook.Musical genre classification of audio signals.Speech and Audio Processing, IEEE Transactions on,10(5):293–302, Jul 2002.

Conclusion -

References

Kilian Merkelbach | June 22, 2015 26/26

top related