Adapted representations of audio signals for music instrument recognition

Pierre Leveau

Laboratoire d’Acoustique Musicale, Paris - France

GET - ENST (Télécom Paris), France

Summary

• Master Thesis: Music instrument recognition on solo performances with signal segmentation (transient part / release part)

• Ph. D. Thesis: Structured and sparse decompositions: application to audio indexing

Music Instrument Recognition

• Basic Scheme

Training DB (manually indexed) → Feature extraction → Classification model

File to analyze → Feature extraction → Comparison to the model → Decision
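The basic scheme can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which features or which classification model are used, so the diagonal Gaussian per instrument, the synthetic 4-dimensional features, and the names `train_models`, `log_likelihood` and `recognize` are all hypothetical.

```python
import numpy as np

def train_models(features, labels):
    """Fit one diagonal Gaussian per instrument label (an assumed, very simple model)."""
    labels = np.array(labels)
    models = {}
    for label in sorted(set(labels.tolist())):
        rows = features[labels == label]
        models[label] = (rows.mean(axis=0), rows.var(axis=0) + 1e-6)
    return models

def log_likelihood(frames, mean, var):
    """Sum of per-frame diagonal-Gaussian log-likelihoods."""
    diff = frames - mean
    per_frame = -0.5 * np.sum(np.log(2 * np.pi * var) + diff ** 2 / var, axis=1)
    return per_frame.sum()

def recognize(frames, models):
    """Compare the file's frames to every model and return the best-scoring label."""
    return max(models, key=lambda label: log_likelihood(frames, *models[label]))

# Synthetic stand-in for a manually indexed training database.
rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0.0, 1.0, (200, 4)), rng.normal(3.0, 1.0, (200, 4))])
train_labels = ["flute"] * 200 + ["piano"] * 200
models = train_models(train, train_labels)

# Synthetic stand-in for the features of a file to analyze.
test_frames = rng.normal(3.0, 1.0, (50, 4))
decision = recognize(test_frames, models)
```

The decision is taken on all frames of the file at once, by summing frame log-likelihoods, which is one common way to turn frame-level scores into a file-level decision.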

Music instrument recognition on solo performances with signal segmentation (transient part / release part)

Feature Extraction

• Feature Extraction on frames of fixed size (30 ms)

Analysis Frames
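As a minimal sketch of feature extraction on fixed 30 ms frames: the slides give the frame size but not the hop, the window, or the features, so the non-overlapping hop, the Hann window, and the two features (energy and spectral centroid) are assumptions chosen for illustration. `frame_signal` and `frame_features` are hypothetical names.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=30.0, hop_ms=30.0):
    """Cut x into fixed-size frames; returns an array of shape (n_frames, frame_len)."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    n_frames = 1 + (len(x) - frame_len) // hop if len(x) >= frame_len else 0
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_features(frames, sample_rate):
    """Per-frame energy and spectral centroid (assumed example features)."""
    window = np.hanning(frames.shape[1])
    spectrum = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    energy = np.sum(frames ** 2, axis=1)
    centroid = (spectrum @ freqs) / (spectrum.sum(axis=1) + 1e-12)
    return np.column_stack([energy, centroid])

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)   # one second of a 440 Hz tone
frames = frame_signal(x, sr)        # 30 ms at 16 kHz is 480 samples per frame
feats = frame_features(frames, sr)  # one row of features per frame
```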

Music Note Scheme

[Figure: energy versus time for a note. Example: strong-attack instrument]

Interest of transients for Music Instrument Recognition

[Figure: four panels: piano, trumpet, cello, flute]

Chosen Method

• Signal segmentation into transient part / release part

• Approximation: fixed-length transients

• Requires an automatic onset detection algorithm

• Study of solo performances

Onset Detection

• Detection function (e.g. high-frequency content, spectral difference, phase deviation…)

• Peak-picking
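The two stages above can be sketched as follows, using high-frequency content as the detection function. The frame size, hop, threshold rule, and minimum gap between peaks are assumptions, not values from the slides, and `hfc_detection_function` and `pick_peaks` are hypothetical names.

```python
import numpy as np

def hfc_detection_function(x, frame_len=1024, hop=512):
    """High-frequency content: spectral energy weighted by bin index, per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    weights = np.arange(frame_len // 2 + 1)
    values = np.empty(n_frames)
    for n in range(n_frames):
        frame = x[n * hop : n * hop + frame_len] * window
        values[n] = np.sum(weights * np.abs(np.fft.rfft(frame)) ** 2)
    return values

def pick_peaks(values, threshold_ratio=0.3, min_gap=3):
    """Keep local maxima of the positive increase that exceed a fixed threshold."""
    rise = np.maximum(np.diff(values, prepend=values[0]), 0.0)
    threshold = threshold_ratio * rise.max()
    peaks = []
    for n in range(1, len(rise) - 1):
        is_max = rise[n] >= rise[n - 1] and rise[n] > rise[n + 1]
        if is_max and rise[n] > threshold and (not peaks or n - peaks[-1] >= min_gap):
            peaks.append(n)
    return np.array(peaks)

# Synthetic test signal: two decaying 2 kHz notes starting at 0.5 s and 1.25 s.
sr = 8000
x = np.zeros(2 * sr)
for onset_s in (0.5, 1.25):
    start = int(onset_s * sr)
    t = np.arange(len(x) - start) / sr
    x[start:] += np.exp(-5.0 * t) * np.sin(2 * np.pi * 2000.0 * t)

values = hfc_detection_function(x)
peaks = pick_peaks(values)
onset_times = (peaks * 512 + 512) / sr  # frame centers, in seconds
```

The time resolution of the detected onsets is limited by the hop size, so a detection is only expected to fall near the true onset, not exactly on it.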

Evaluation of Onset Detection

• Requires a reference onset database

• ROC Curves

[Figure: ROC curve, good detections (%) versus false alarms (%)]
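One point of such a ROC curve can be computed as in the sketch below. The slides do not define the two rates, so the definitions used here are assumptions: good detections are counted relative to the number of reference onsets, and false alarms relative to the number of detections. `evaluate_onsets` and its tolerance default are hypothetical.

```python
def evaluate_onsets(detected, reference, tolerance=0.05):
    """Match each reference onset to at most one detection within +/- tolerance seconds.

    Returns (good detections %, false alarms %) under the assumed definitions.
    """
    detected = sorted(detected)
    used = [False] * len(detected)
    good = 0
    for ref in sorted(reference):
        for i, det in enumerate(detected):
            if not used[i] and abs(det - ref) <= tolerance:
                used[i] = True
                good += 1
                break
    good_rate = 100.0 * good / len(reference) if len(reference) > 0 else 0.0
    false_rate = 100.0 * (len(detected) - good) / len(detected) if len(detected) > 0 else 0.0
    return good_rate, false_rate

reference = [0.5, 1.0, 1.5, 2.0]         # hand-labeled onsets, in seconds
detected = [0.52, 0.98, 1.7, 2.01, 2.6]  # output of a detector at one threshold
rates = evaluate_onsets(detected, reference)
```

Sweeping the peak-picking threshold and recomputing the pair of rates at each setting traces out the full curve.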

Sound Onset Labelization

[Figure: spectrogram and signal plot]

Listening to the sound and positioning the labels

Reference Onset and Sound Databases

Onset Database

Annotation precision depends on the file type

Detection function evaluation must take it into account

Annotation precision: examples

[Figure: two panels: trumpet, cello]

Developed Detection Function

Complex Spectral Difference:

Delta Complex Spectral Difference:

[Figure: two panels: guitar, violin]
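The formulas on this slide were not preserved in the transcript. As a hedged illustration only, the sketch below follows the commonly used complex-domain definition of a spectral difference: each bin is predicted from the previous frame's magnitude and a linearly extrapolated phase, and the detection function sums the distances between the observed and predicted bins. It may differ from the author's exact formulation, and it does not attempt the delta variant. The function name, frame size, and hop are assumptions.

```python
import numpy as np

def complex_spectral_difference(x, frame_len=1024, hop=512):
    """Sum over bins of |observed - predicted| complex spectrum, per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    spectra = np.array([
        np.fft.rfft(x[n * hop : n * hop + frame_len] * window)
        for n in range(n_frames)
    ])
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)
    values = np.zeros(n_frames)
    for n in range(2, n_frames):
        # Prediction: previous magnitude, phase extrapolated from the two previous frames.
        predicted = magnitude[n - 1] * np.exp(1j * (2.0 * phase[n - 1] - phase[n - 2]))
        values[n] = np.sum(np.abs(spectra[n] - predicted))
    return values

sr = 8000
t = np.arange(2 * sr) / sr
steady = np.sin(2 * np.pi * 1000.0 * t)  # stationary tone: prediction should hold
onset_signal = steady.copy()
onset_signal[:4000] = 0.0                # same tone, but starting at 0.5 s

steady_values = complex_spectral_difference(steady)
onset_values = complex_spectral_difference(onset_signal)
```

For the stationary tone the prediction is essentially exact and the function stays near zero; for the second signal it peaks in the frames around the note start.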

Detection Function comparison

[Figure: ROC curves for two tolerance windows, $T_{ROC} = 100$ ms and $T_{ROC} = T_{opt}$]

Signal segmentation

[Figure: analysis frames, each labeled T (transient) or R (release)]
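A minimal sketch of this segmentation, assuming fixed-length transients that start at each detected onset. The 60 ms transient length and the rule that a frame is a transient frame when its first sample falls inside a transient are assumptions, and `label_frames` is a hypothetical name.

```python
import numpy as np

def label_frames(n_frames, hop, sample_rate, onset_samples, transient_ms=60.0):
    """Label each analysis frame 'T' (transient) or 'R' (release).

    A frame is 'T' when its first sample lies within transient_ms after an onset.
    """
    starts = np.arange(n_frames) * hop
    transient_len = int(round(sample_rate * transient_ms / 1000.0))
    labels = np.full(n_frames, "R")
    for onset in onset_samples:
        labels[(starts >= onset) & (starts < onset + transient_len)] = "T"
    return labels

sr = 16000
hop = 480  # 30 ms frames without overlap
labels = label_frames(n_frames=10, hop=hop, sample_rate=sr, onset_samples=[960, 3360])
sequence = "".join(labels)
```

With these inputs, frames 2, 3, 7 and 8 are transient frames. Features from T and R frames can then be routed to separate models.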

Music Instrument recognition on transients - Results

Music instrument recognition on transients only implies:
- a large decrease in the size of the learning database
- for a fixed duration of the test signal, less data on which to base a decision.

Results are worse than for recognition on all frames.

Perspectives

• Increase the onset database size for a more robust evaluation

• Improve the robustness of the onset detection algorithm

• Merge decisions on the transient and steady parts, and compare with classical static recognition

• Select features adapted to each part of the note

Ph. D. Thesis

Subject: Sparse and structured decompositions: application to audio indexing

Under the supervision of Gaël Richard (GET - ENST, Paris) and Laurent Daudet (Laboratoire d’Acoustique Musicale, Paris)

Sparse Representations

• Classical representations: orthogonal transforms (e.g. Fourier Transform, STFT, MDCT, Wavelet Transform…)

• Redundant representations: $x = \sum_{\gamma} \alpha_\gamma u_\gamma$

Sparse representations (only on $N$ terms): $x \approx \sum_{i=0}^{N-1} \alpha_i u_i$, with $u_i \in \mathbb{R}^N$

$D$: redundant dictionary

Dictionary Example

$D = C \cup W$

C: MDCT basis (useful to represent tonal parts of signals)

W: DWT basis (useful to represent transient parts of signals)
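The idea of a dictionary built as the union of two bases can be illustrated with the sketch below. To keep it short and dependency-free, it swaps in simpler stand-ins: an orthonormal DCT-II basis in place of the MDCT basis and a Haar wavelet basis in place of a general DWT basis. These substitutions, the signal length of 64, and the function names are assumptions for illustration, not the dictionary used in the thesis.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis, one atom per row (stand-in for the MDCT basis C)."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * (t + 0.5) * k / n) * np.sqrt(2.0 / n)
    basis[0] /= np.sqrt(2.0)
    return basis

def haar_basis(n):
    """Orthonormal Haar wavelet basis for n a power of two (stand-in for the DWT basis W)."""
    if n == 1:
        return np.ones((1, 1))
    coarse = haar_basis(n // 2)
    averages = np.kron(coarse, [1.0, 1.0])          # coarser atoms, stretched by two
    details = np.kron(np.eye(n // 2), [1.0, -1.0])  # finest-scale, well-localized atoms
    return np.vstack([averages, details]) / np.sqrt(2.0)

n = 64
C = dct_basis(n)        # long oscillating atoms, suited to tonal content
W = haar_basis(n)       # short localized atoms, suited to transient content
D = np.vstack([C, W])   # 2n atoms of length n: a redundant dictionary
```

Each basis alone can represent any signal of length 64, so stacking both gives twice as many atoms as dimensions, which is what makes the dictionary redundant.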

Algorithms

• Matching Pursuit (and its variants):
- Greedy algorithms
- Based on an iterative search
- A faster algorithm needs a suboptimal search

• Molecular Matching Pursuit:
- Gives structured, perceptually relevant organizations of the atoms (by grouping significant coefficients)
- Faster than standard MP
- Fast-varying frequencies (e.g. vibrato) cannot be efficiently represented
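The greedy, iterative search of standard Matching Pursuit can be sketched as follows. This is the plain algorithm only, not Molecular Matching Pursuit, whose grouping of coefficients is not detailed in the slides. The toy dictionary (unit spikes plus an orthonormal DCT-II basis) and the names `dct_basis` and `matching_pursuit` are assumptions for illustration.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis, one unit-norm atom per row."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * (t + 0.5) * k / n) * np.sqrt(2.0 / n)
    basis[0] /= np.sqrt(2.0)
    return basis

def matching_pursuit(x, dictionary, n_atoms):
    """Greedy decomposition of x over unit-norm atoms stored as dictionary rows."""
    residual = x.astype(float).copy()
    decomposition = []
    for _ in range(n_atoms):
        correlations = dictionary @ residual            # inner product with every atom
        best = int(np.argmax(np.abs(correlations)))     # most correlated atom
        coefficient = correlations[best]
        residual -= coefficient * dictionary[best]      # remove its contribution
        decomposition.append((best, coefficient))
    return decomposition, residual

n = 32
dictionary = np.vstack([dct_basis(n), np.eye(n)])  # rows 0..31: DCT atoms, 32..63: spikes
x = 3.0 * dictionary[4] + 2.0 * dictionary[n + 10]  # one tonal atom plus one spike

decomposition, residual = matching_pursuit(x, dictionary, n_atoms=2)
picked = [index for index, _ in decomposition]
```

Two iterations recover the two atoms that built the signal. The residual is small but not zero because the atoms are not orthogonal to each other. Each iteration scans the whole dictionary, which is why faster variants restrict the search.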

Application to music instrument recognition

Classical Music Instrument Recognition:
Signal → Feature Extraction → features → Comparison to statistical models → Decision

Music Instrument Recognition with sparse decomposition:
Signal → MMP → Structured Representation → Feature Extraction (which features?) → features → Comparison to statistical models (which models?) → Decision

To be continued…

Thank you for your attention.