Adapted representations of audio signals for music
instrument recognition
Pierre Leveau
Laboratoire d’Acoustique Musicale, Paris - France
GET - ENST (Télécom Paris), France
Pierre Leveau - ENST - LAM 2
Summary
• Master Thesis: Music instrument recognition on solo performances with signal segmentation (transient part / release part)
• Ph. D. Thesis: Structured and sparse decompositions: application to audio indexing
Music Instrument Recognition
• Basic Scheme
Training: Training DB (manually indexed) → Feature extraction → Classification model
Recognition: File to analyze → Feature extraction → Comparison to the model → Decision
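The basic scheme can be sketched as follows. The slides do not say which classifier is used, so the diagonal Gaussian model per instrument, the feature dimension, and the synthetic feature values below are all assumptions made for illustration.

```python
import numpy as np

def train(features_by_class):
    """Fit one diagonal Gaussian (mean, std per feature) for each instrument."""
    return {name: (f.mean(axis=0), f.std(axis=0) + 1e-9)
            for name, f in features_by_class.items()}

def classify(model, test_features):
    """Sum per-frame log-likelihoods and return the best-scoring instrument."""
    scores = {}
    for name, (mean, std) in model.items():
        z = (test_features - mean) / std
        scores[name] = float(np.sum(-0.5 * z ** 2 - np.log(std)))
    return max(scores, key=scores.get)

# Hypothetical 4-dimensional features: 200 training frames per instrument.
rng = np.random.default_rng(0)
training_db = {
    "instrument_a": rng.normal(0.0, 1.0, size=(200, 4)),
    "instrument_b": rng.normal(3.0, 1.0, size=(200, 4)),
}
model = train(training_db)
test_frames = rng.normal(3.0, 1.0, size=(50, 4))   # frames of the file to analyze
decision = classify(model, test_frames)
```

Summing log-likelihoods over frames is one simple way to turn frame-level comparisons into a single decision for the file.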
Feature Extraction
• Feature Extraction on frames of fixed size (30 ms)
Analysis Frames
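A minimal sketch of fixed-size frame analysis. The 30 ms frame length comes from the slide; the 15 ms hop, the Hann window, and the choice of the spectral centroid as the example feature are assumptions.

```python
import numpy as np

def frame_signal(x, sr, frame_ms=30.0, hop_ms=15.0):
    """Cut x into overlapping frames of fixed duration (one frame per row)."""
    frame_len = int(round(sr * frame_ms / 1000.0))
    hop = int(round(sr * hop_ms / 1000.0))
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def spectral_centroid(frames, sr):
    """Illustrative per-frame feature: magnitude-weighted mean frequency in Hz."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

sr = 16000
t = np.arange(sr) / sr                   # one second of signal
x = np.sin(2 * np.pi * 440.0 * t)        # hypothetical test tone
frames = frame_signal(x, sr)             # 30 ms frames
centroids = spectral_centroid(frames, sr)
```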
Music Note Scheme
[Figure: energy versus time for a note. Ex: strong attack instrument]
Interest of transients for Music Instrument Recognition
[Figure panels: piano, trumpet, cello, flute]
Chosen Method
• Signal segmentation into transient part / release part
• Approximation: fixed length transients
• Need for an automatic onset detection algorithm
• Study of solo performances
Onset Detection
• Detection function (ex: high frequency content, spectral difference, phase deviation…)
• Peak-picking
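The two steps above can be sketched as follows, using the spectral difference as the detection function. The frame length, hop size, threshold ratio, minimum gap between peaks, and the synthetic test signal are all assumptions chosen for illustration, not values from the slides.

```python
import numpy as np

def stft_mag(x, frame_len=1024, hop=512):
    """Magnitude spectrogram with a Hann window (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.abs(np.fft.rfft(x[idx] * np.hanning(frame_len), axis=1))

def spectral_difference(mag):
    """Detection function: sum over bins of the magnitude increase between frames."""
    diff = np.diff(mag, axis=0)
    df = np.sum(np.maximum(diff, 0.0), axis=1)
    return np.concatenate(([0.0], df))

def pick_peaks(df, threshold_ratio=0.3, min_gap=3):
    """Local maxima above a fraction of the global maximum, min_gap frames apart."""
    threshold = threshold_ratio * df.max()
    peaks = []
    for n in range(1, len(df) - 1):
        if df[n] > threshold and df[n] >= df[n - 1] and df[n] > df[n + 1]:
            if not peaks or n - peaks[-1] >= min_gap:
                peaks.append(n)
    return np.array(peaks, dtype=int)

# Hypothetical test signal: two decaying 440 Hz bursts starting at 0.25 s and 0.60 s.
sr = 16000
x = np.zeros(sr)
for start in (0.25, 0.60):
    n0 = int(start * sr)
    t = np.arange(sr - n0) / sr
    x[n0:] += np.exp(-8.0 * t) * np.sin(2 * np.pi * 440.0 * t)

frame_len, hop = 1024, 512
df = spectral_difference(stft_mag(x, frame_len, hop))
# Report each onset at the center of the frame where the peak occurs.
onset_times = (pick_peaks(df) * hop + frame_len / 2) / sr
```

A fixed threshold relative to the global maximum is the simplest choice; an adaptive threshold would be a natural refinement for real recordings.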
Evaluation of Onset Detection
• Necessity of a reference onset database
• ROC Curves
[Figure: ROC curve, good detections (%) versus false alarms (%)]
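One point of such a curve could be computed as sketched below. The matching rule (one detection per reference onset, within a symmetric tolerance) and the normalization of the two rates are assumptions; the slides do not define them. Sweeping the peak-picking threshold and recomputing the point would trace the full curve.

```python
def match_onsets(detected, reference, tolerance=0.05):
    """Greedy one-to-one matching of detections to reference onsets (times in s)."""
    detected = sorted(detected)
    used = [False] * len(detected)
    good = 0
    for ref in sorted(reference):
        best, best_dist = None, tolerance
        for j, det in enumerate(detected):
            dist = abs(det - ref)
            if not used[j] and dist <= best_dist:
                best, best_dist = j, dist
        if best is not None:
            used[best] = True
            good += 1
    return good, len(detected) - good

def roc_point(detected, reference, tolerance=0.05):
    """Good detections (% of reference onsets) and false alarms (% of detections)."""
    good, false_alarms = match_onsets(detected, reference, tolerance)
    good_rate = 100.0 * good / len(reference) if len(reference) else 0.0
    fa_rate = 100.0 * false_alarms / len(detected) if len(detected) else 0.0
    return good_rate, fa_rate

# Hypothetical annotations and detections, in seconds.
reference = [0.25, 0.60, 1.00]
detected = [0.26, 0.45, 0.61]
good_rate, fa_rate = roc_point(detected, reference)
```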
Sound Onset Labeling
• Spectrogram
• Signal plot
• Sound listening and label positioning
Reference Onset and Sound Databases
Onset Database
Annotation precision depends on the file type
The evaluation of detection functions must take it into account
Annotation precision: examples
[Figure panels: trumpet, cello]
Developed Detection Function
Complex Spectral Difference:
Delta Complex Spectral Difference:
[Figure panels: guitar, violin]
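The formulas on this slide did not survive in the transcript. As a hedged sketch, the code below follows the commonly used definition of the complex spectral difference: each bin of frame n is predicted from frames n-1 and n-2 (same magnitude, linearly extrapolated phase), and the detection function sums the prediction error over bins. Whether this matches the slide exactly is an assumption, and the Delta variant is not reconstructed here. Frame length, hop, and test signal are hypothetical.

```python
import numpy as np

def stft(x, frame_len=1024, hop=256):
    """Complex STFT with a Hann window (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.fft.rfft(x[idx] * np.hanning(frame_len), axis=1)

def complex_spectral_difference(X):
    """Sum over bins of |X(n) - prediction(n)|, zero for the first two frames."""
    mag = np.abs(X)
    phase = np.angle(X)
    # Prediction for frame n: magnitude of frame n-1, phase 2*phi(n-1) - phi(n-2).
    predicted = mag[1:-1] * np.exp(1j * (2.0 * phase[1:-1] - phase[:-2]))
    csd = np.sum(np.abs(X[2:] - predicted), axis=1)
    return np.concatenate(([0.0, 0.0], csd))

# Hypothetical test signal: steady 440 Hz tone, with a 660 Hz tone entering at 0.5 s.
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)
n0 = sr // 2
x[n0:] += np.sin(2 * np.pi * 660.0 * t[: sr - n0])

frame_len, hop = 1024, 256
csd = complex_spectral_difference(stft(x, frame_len, hop))
peak_frame = int(np.argmax(csd))
peak_time = (peak_frame * hop + frame_len / 2) / sr
```

On a stationary tone the prediction is nearly exact, so the function stays close to zero until new spectral content appears.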
Detection Function comparison
[Figure: ROC curves for two tolerance windows, T_ROC = 100 ms and T_ROC = T_opt]
Signal segmentation
[Figure: analysis frames labeled T (transient) or R (release)]
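A minimal sketch of this labeling, assuming fixed-length transients as stated earlier in the talk. The 60 ms transient length, the frame parameters, and the onset times are hypothetical values; a frame is assigned to T when its center falls within the transient window after an onset.

```python
import numpy as np

def label_frames(n_frames, hop, frame_len, sr, onset_times, transient_ms=60.0):
    """Return one label per analysis frame: 'T' (transient) or 'R' (release)."""
    centers = (np.arange(n_frames) * hop + frame_len / 2.0) / sr
    labels = np.full(n_frames, "R", dtype="<U1")
    for onset in onset_times:
        in_transient = (centers >= onset) & (centers < onset + transient_ms / 1000.0)
        labels[in_transient] = "T"
    return labels

# 30 ms frames with a 15 ms hop at 16 kHz; two hypothetical onsets.
labels = label_frames(n_frames=20, hop=240, frame_len=480, sr=16000,
                      onset_times=[0.05, 0.20])
sequence = "".join(labels)
```

Features can then be split with a boolean mask such as `labels == "T"` to train and test on each part separately.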
Music Instrument recognition on transients - Results
Music instrument recognition on transients only implies:
- a large decrease in the size of the training database
- for a fixed duration of the test signal, less data on which to base a decision
Results are worse than for recognition on all frames
Perspectives
• Increase the onset database size for a more robust evaluation
• Improve the robustness of the onset detection algorithm
• Merge decisions on the transient and steady parts, and compare with classical static recognition
• Select features adapted for each part of the notes.
Ph. D. Thesis
Subject:
Sparse and structured decompositions: application to audio indexing
Under supervision of Gaël Richard (GET - ENST, Paris)
and Laurent Daudet (Laboratoire d’Acoustique Musicale, Paris)
Sparse Representations
• Classical representations: Orthogonal transform (ex: Fourier Transform, STFT, MDCT, Wavelet Transform…)
• Redundant representations:
$x = \sum_{i} \alpha_i u_i$
Sparse representations (only on N terms):
$x \approx \sum_{i=0}^{N-1} \alpha_i u_i, \quad u_i \in \mathbb{R}^N$
$D = \{u_i\}$: Redundant dictionary
Dictionary Example
$D = C \cup W$
C: MDCT basis (useful to represent tonal parts of signals)
W: DWT basis (useful to represent transient parts of signals)
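To illustrate why the union of a tonal basis and a transient basis is useful, here is a small sketch. It swaps in simpler stand-ins for the bases named on the slide: an orthonormal DCT-II basis in place of the MDCT for C, and an orthonormal Haar wavelet basis for W. The signal length, the test signals, and the helper names are assumptions.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis; columns are cosine atoms (stand-in for C)."""
    k = np.arange(n)[None, :]          # atom (frequency) index
    t = np.arange(n)[:, None]          # time index
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    C[:, 0] /= np.sqrt(2.0)
    return C

def haar_basis(n):
    """Orthonormal Haar basis for n a power of two; columns are atoms (stand-in for W)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        m = H.shape[0]
        top = np.kron(H, np.array([[1.0, 1.0]]))             # coarser-scale rows
        bottom = np.kron(np.eye(m), np.array([[1.0, -1.0]]))  # finest-scale wavelets
        H = np.vstack([top, bottom]) / np.sqrt(2.0)
    return H.T

def n_significant(basis, x, tol=1e-8):
    """Number of analysis coefficients of x in an orthonormal basis above tol."""
    return int(np.sum(np.abs(basis.T @ x) > tol))

n = 64
C, W = dct_basis(n), haar_basis(n)
D = np.hstack([C, W])                  # redundant dictionary: 2n atoms in dimension n

tone = C[:, 5]                         # a purely tonal signal
click = np.zeros(n)
click[20] = 1.0                        # a purely transient signal
```

Comparing `n_significant(C, tone)` with `n_significant(W, tone)`, and likewise for `click`, shows that each signal is compact in one basis and spread out in the other, which is the motivation for searching in both.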
Algorithms
• Matching Pursuit (and its variants):
- Greedy algorithms
- Based on an iterative search
- A faster algorithm needs a suboptimal search
• Molecular Matching Pursuit:
- Gives structured, perceptually relevant organizations of the atoms (by grouping significant coefficients)
- Faster than standard MP
- Fast varying frequencies (ex: vibrato) cannot be efficiently represented
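The greedy iterative search of standard Matching Pursuit can be sketched as follows. This is plain MP only; the molecular variant and its grouping of coefficients are not sketched. The random unit-norm dictionary, the signal built from three atoms, and the iteration count are hypothetical choices made to keep the example short.

```python
import numpy as np

def matching_pursuit(x, D, n_iter):
    """Plain MP on a dictionary D with unit-norm atoms as columns.

    Returns the accumulated coefficients and the residual norm after each step.
    """
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    norms = [np.linalg.norm(residual)]
    for _ in range(n_iter):
        correlations = D.T @ residual
        best = int(np.argmax(np.abs(correlations)))      # most correlated atom
        coeffs[best] += correlations[best]
        residual -= correlations[best] * D[:, best]      # remove its contribution
        norms.append(np.linalg.norm(residual))
    return coeffs, np.array(norms)

# Hypothetical redundant dictionary: 256 random unit-norm atoms in dimension 64.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)

# Signal synthesized from three atoms.
x = 2.0 * D[:, 10] - 1.5 * D[:, 100] + 1.0 * D[:, 200]
coeffs, norms = matching_pursuit(x, D, n_iter=20)
```

Each iteration costs one full correlation with the dictionary, which is the step that faster variants approximate with a suboptimal search.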
Application to music instrument recognition
Classical Music Instrument Recognition:
Signal → Feature Extraction → features → Comparison to statistical models → Decision
Music Instrument Recognition with sparse decomposition:
Signal → MMP → Structured Representation → Feature Extraction (which features?) → features → Comparison to statistical models (which models?) → Decision
To be continued…
Thank you for your attention.