Classifying Motion Picture Audio
Eirik Gustavsen
07.06.07
Post on 19-Dec-2015
Outline
• Motivation
• Thesis
• State of the Art
• Proposed system
• Experimental setup
• Results
• Future work
• Conclusion
Motivation
• Most projects classify clear classes or classes with noise.
• Few clear boundaries in motion picture audio
• Subjective descriptions of movies
• Difficult to compare movie content
Thesis
It is possible to automatically create a table of contents of a motion picture, based on its audio track only.
Research questions
• Find the best low-level descriptors (LLDs) to classify motion picture audio
• Detect boundaries between audio classes within complex audio segments
• Automatically create a TOC based on the audio track only
Low Level Descriptors
• Total of 23 low level descriptors
TIME DOMAIN
• Audio Power
• Audio Wave Form
• Root-Mean Square
• Short Time Energy
• Low Short Time Energy Ratio
• Zero-Crossing Rate
• High Zero-Crossing Rate Ratio
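As a rough illustration of the time-domain descriptors listed above, the sketch below computes a few of them per frame in plain Python. The slides do not give the exact definitions used in the thesis, so the formulas here are common textbook forms. The function names, the frame and hop sizes, and the 0.5 and 1.5 threshold factors for the two ratio descriptors are assumptions.

```python
import math

def frames(signal, size, hop):
    """Split a sequence of samples into frames of `size` samples, `hop` apart."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def root_mean_square(frame):
    """Square root of the mean squared sample value."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def short_time_energy(frame):
    """Mean squared sample value of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def low_ste_ratio(energies, factor=0.5):
    """Share of frames whose energy is below `factor` times the mean energy.
    The 0.5 default is an assumed, commonly used threshold."""
    mean = sum(energies) / len(energies)
    return sum(1 for e in energies if e < factor * mean) / len(energies)

def high_zcr_ratio(rates, factor=1.5):
    """Share of frames whose ZCR is above `factor` times the mean ZCR.
    The 1.5 default is an assumed, commonly used threshold."""
    mean = sum(rates) / len(rates)
    return sum(1 for r in rates if r > factor * mean) / len(rates)
```

A typical use would be to cut the audio track into frames, compute the energy and zero-crossing rate of each frame, and then take the two ratio descriptors over a longer window of frames.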
FREQUENCY DOMAIN
• Audio Spectrum Centroid
• Fundamental Frequency
• 10 Mel-Frequency Cepstral Coefficients
• Spectrum Flux
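Two of the frequency-domain descriptors above can be sketched in a few lines. This is a simplified illustration, not the thesis implementation: the centroid here is a magnitude-weighted mean on a linear frequency axis, whereas the MPEG-7 Audio Spectrum Centroid is defined on a log-frequency power spectrum. The flux is taken as the squared difference between consecutive magnitude spectra. A naive DFT is used only to keep the example dependency-free, and all function names are assumptions.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 of one frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectrum_centroid(mags, sample_rate, frame_size):
    """Magnitude-weighted mean frequency in Hz (linear-frequency simplification)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_width = sample_rate / frame_size
    return sum(k * bin_width * m for k, m in enumerate(mags)) / total

def spectrum_flux(prev_mags, mags):
    """Sum of squared differences between two consecutive magnitude spectra."""
    return sum((m - p) ** 2 for p, m in zip(prev_mags, mags))
```

For a frame holding a single sinusoid, the centroid sits at the sinusoid's frequency. The flux between two identical frames is zero and grows as the spectrum changes from one frame to the next.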
Dimensionality reduction
Principal components analysis (PCA) is a technique used to reduce multidimensional data sets to lower dimensions for analysis.
f(1), f(2), f(3), f(4), f(5), ..., f(23) → PCA → d(1), d(2), d(3)
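The reduction from 23 descriptor values to three dimensions can be sketched as follows. The slides do not say how PCA was computed, so this minimal version makes its own choices: it centers the feature vectors, builds the sample covariance matrix, and finds the leading eigenvectors by power iteration with deflation. The function name `pca_reduce`, the iteration count, and the seeded random start vector are assumptions. In practice a linear-algebra library would normally be used.

```python
import math
import random

def pca_reduce(rows, n_components=3, iterations=200):
    """Project feature vectors onto their first `n_components` principal components."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]

    # Sample covariance matrix of the centered features (dim x dim).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]

    rng = random.Random(0)  # fixed seed so the result is reproducible
    components = []
    for _component in range(n_components):
        # Power iteration: repeated multiplication by the covariance matrix
        # converges to the eigenvector with the largest eigenvalue.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _step in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue belonging to v.
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim))
                         for i in range(dim))
        components.append(v)
        # Deflation: remove the found component so the next pass finds the next one.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(dim)]
               for i in range(dim)]

    # d(1..n_components) for every input vector f(1..dim).
    return [[sum(c[j] * comp[j] for j in range(dim)) for comp in components]
            for c in centered]
```

Each input row plays the role of f(1)...f(23) for one audio frame, and each output row holds the corresponding d(1), d(2), d(3). Because the descriptors have very different ranges, scaling them to comparable variance before the reduction is usually advisable.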
Sample Results
Music with low volume
Clear speech
Speech with background environmental sounds
Fading between music and speech
Speech with background music
Jingle
"Some mistakes"
Future Work
• To be done in this thesis
  – Post-processing
  – TOC
• Open research questions for future work
  – New motion picture audio classes
  – Detecting sound objects
  – Speech recognition
Conclusion
• Pre-processing makes it possible to classify motion picture audio correctly
• Using the right combination of LLDs improves the classification result