Discovery and Characterization of Melodic Motives in Large Audio Music Collections

Post on 22-Nov-2014

DESCRIPTION

Presentation for my PhD proposal Defense at Music Technology Group, UPF, Barcelona, Spain (2013).

TRANSCRIPT

  • 1. Discovery and Characterization of Melodic Motives in Large Audio Music Collections. PhD Proposal Defense, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain. Sankalp Gulati (sankalp.gulati@upf.edu). Supervisor: Prof. Xavier Serra.
  • 2. Patterns. Images in the right half taken from (Mueen & Keogh, 2009) and (Mueen & Keogh, 2010).
  • 3. Melodic Patterns. Top right image taken from (Mueen & Keogh, 2009) and (Mueen & Keogh, 2010).
  • 4. Melodic Motives (Patterns). Top right image taken from (Mueen & Keogh, 2009) and (Mueen & Keogh, 2010).
  • 5. Melodic Motives + Discovery. Discovery: Induction, Extraction, Matching, Retrieval. Image taken from (Mueen & Keogh, 2009).
  • 6. Large Audio Music Collections. Discovery, Melodic Motives, Large Audio Music Collections: > 500,000; > 550 hours.
  • 7. Characterization. Discovery, Characterization, Melodic Motives, Large Audio Music Collections. Transform: N dimensions.
  • 8. Discovery and Characterization of Melodic Motives in Large Audio Music Collections. PhD Proposal Defense, Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain. Sankalp Gulati. Supervisor: Prof. Xavier Serra.
  • 9. Introduction: Music Information Research (MIR).
  • 10. Introduction. Music -> Melody (pitch, loudness, timbre). "It is melody that enables us to distinguish one work from another. It is melody that human beings are innately able to reproduce by singing, humming, and whistling. It is melody that makes music memorable: we are likely to recall a tune long after we have forgotten its text." (Selfridge-Field, 1998). Audio example.
  • 11. Introduction. Melodic Analysis: Melodic Motives. Computational melodic motivic analysis: Hungarian, Slovak, French, Sicilian, Bulgarian and Appalachian folk melodies (Juhász, 2006); Cretan, Nova Scotia and Essen folk melodies (Conklin and Anagnostopoulou, 2010, 2006); Tunisian modal music (Lartillot & Ayari, 2006).
  • 12. Introduction. Melodic motivic discovery in audio music signals? Is it needed? Why so little work? Solution?
  • 13. Introduction. Indian Art Music: Opportunities. Heterophonic music; melodic framework (Rāg); importance of melodic phrases (Pakads, Chalans); available audio music repertoire.
  • 14. Introduction: Broad Research Goals. Computational methodology for melodic motivic discovery in large audio music collections utilizing domain-specific knowledge: melodic motivic analysis methodology; similarity measures based on melodic motives; compilation of a sizeable audio music collection of Indian art music; summary and compilation of existing literature.
  • 15. Introduction: Goals and Motivation. Motivation: lack of approaches for melodic motif extraction in audio signals; lack of utilization of domain-specific knowledge in computational methodologies; furthering the state of the art in pattern processing in MIR.
  • 16. Proposed Methodology
  • 17. Proposed Methodology: Overview. Block diagram for the proposed methodology.
  • 18. Proposed Methodology: Data Collection. Audio; metadata; annotations; > 550 hours.
  • 19. Proposed Methodology: Melodic Feature Extraction. Pitch, loudness and timbre features. Pitch: F0 frequency contour of the predominant melodic source, using (Salamon & Gómez, 2012). Loudness: perceptual loudness computed using only the predominant melodic source, using (Zwicker, 1977). Timbre: centroid of the spectral envelope of the predominant melodic source, using (Röbel & Rodet, 2005). Block diagram stages: audio; predominant F0 frequency estimation; synthesis of the predominant melodic source; loudness feature extraction; timbre feature extraction.
  • 20. Proposed Methodology: Melodic Feature Extraction. Evaluation of predominant F0 frequency estimation: 6 Hindustani music pieces, ~45 mins.
  • 21. Proposed Methodology: Melodic Representation. Compact + abstract/reduced. Challenges: heavy meandering around notes (Gamakas); svar intonation; Aroh-Avroh dependent svar intonation; F0 frequency contour versus musical pitch perception. [Figure: predominant F0 frequency (cents) against time (1 sample = 10 ms).]
  • 22. Proposed Methodology: Melodic Representation. Continuous time-varying values of pitch, loudness and timbral features. Possibilities: melody transcription; SAX-based symbolic representation; parametric representation (no studies!); saddle-point-based representation. Domain knowledge: svar intonation profiles.
  • 23. Proposed Methodology: Melodic Similarity. Challenges: melodic representation; large timing variations; pitch variations (ornamentations); differentiating a characteristic phrase from a melodic sequence that uses the same svars; fixing the similarity threshold. Audio example. Dynamic time warping (initial experiments): DTW > (SAX + Euclidean distances) (Ross, Vinutha, and Rao, 2012).
  • 24. Proposed Methodology: Melodic Similarity. Possibilities: Euclidean and Mahalanobis distance measures; HMM-based distance measures; dynamic time warping based distances (step and boundary conditions, constraints, context-dependent DTW). Domain knowledge: DTW constraint parameters; pattern-dependent similarity threshold; weighted distance measures.
  • 25. Proposed Methodology: Pattern Extraction. Challenges: melodic segmentation; different motif lengths; large volume of audio data; exact melodic similarity ~ parametric melodic representation. [Figure: predominant F0 frequency (Hz) against time (1 sample = 10 ms), with match matrix.]
  • 26. Proposed Methodology: Pattern Extraction. Ongoing work: music parallelism; melodic segmentation; motif discovery in the time series analysis domain; fast brute-force exhaustive pattern search; pruning strategies. [Figure: predominant F0 frequency (Hz) against time (1 sample = 10 ms).]
  • 27. Proposed Methodology: Pattern Extraction. Possibilities: sparse similarity matrices; lower bounds on distance measures; phase space embedding/recurrence plots; suffix trees (~parametric representation). Domain knowledge: probable phrase boundaries; pruning rules; motif characteristics.
  • 28. Proposed Methodology: Melodic Motivic Analysis. Challenges: non-uniform length of motives. Directions: clustering (k-means clustering, self-organizing maps); fractal analysis. Application: Rāg characterization (Rāg-specific motives, shared motives). Transform: N dimensions.
  • 29. Proposed Methodology: Evaluation. Challenges: no annotated corpus; human subjectivity in similarity-related tasks. Listening tests; feedback through Dunya users.
  • 30. References. Selfridge-Field, E. (1998). Conceptual and representational issues in melodic comparison. Computing in Musicology: A Directory of Research, (11), 3-64. Juhász, Z. (2006, June). A systematic comparison of different European folk music traditions using self-organizing maps. Journal of New Music Research, 35(2), 95-112. Conklin, D., & Anagnostopoulou, C. (2006). Segmental pattern discovery in music. INFORMS Journal on Computing, 18(3), 285-293. Lartillot, O., & Ayari, M. (2006). Motivic pattern extraction in music, and application to the study of Tunisian modal music. South African Computer Journal, 36, 16-28. Salamon, J., & Gómez, E. (2012, August). Melody extraction from polyphonic music signals using pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20(6), 1759-1770. Zwicker, E. (1977). Procedure for calculating loudness of temporally variable sounds. The Journal of the Acoustical Society of America, 62(3), 675-682. Röbel, A., & Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In Proc. DAFx. Ross, J. C., Vinutha, T., & Rao, P. (2012). Detecting melodic motifs from audio for Hindustani classical music. In Proceedings of the 13th International Society for Music Information Retrieval Conference, Porto, Portugal. Mueen, A., Keogh, E. J., Zhu, Q., Cash, S., & Westover, M. B. (2009, April). Exact discovery of time series motifs. In SDM (pp. 473-484). Mueen, A., & Keogh, E. (2010, July). Online discovery and maintenance of time series motifs. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1089-1098). ACM.
  • 31. Work Plan
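
Slides 21, 23 and 24 mention an F0 contour expressed in cents and dynamic time warping (DTW) with constraints as a candidate similarity measure. The following is a minimal Python sketch of those two ideas, not the method used in the proposal. The function names, the use of a tonic as the cents reference, the absolute-difference local cost and the Sakoe-Chiba band as the constraint are all assumptions made for illustration.

```python
import math


def hz_to_cents(f0_hz, tonic_hz):
    """Convert F0 values in Hz to cents relative to a reference frequency.

    Using the tonic as the reference is an assumption; the slides only
    state that the contour is plotted in cents.
    """
    return [1200.0 * math.log2(f / tonic_hz) for f in f0_hz]


def dtw_distance(a, b, band=None):
    """Accumulated DTW cost between two 1-D sequences.

    band is the largest allowed |i - j| between aligned sample indices
    (a Sakoe-Chiba band); None means no constraint.
    """
    n, m = len(a), len(b)
    if band is not None:
        # The band must be at least |n - m| wide to reach the final cell.
        band = max(band, abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = 1 if band is None else max(1, i - band)
        hi = m if band is None else min(m, i + band)
        for j in range(lo, hi + 1):
            local = abs(a[i - 1] - b[j - 1])  # local cost: absolute difference
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]
```

With `band=0` and equal-length inputs the function reduces to a sample-by-sample comparison, which makes it easy to see how much of a mismatch is explained by timing variation alone.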

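Slides 26 and 27 list a brute-force exhaustive pattern search with pruning strategies. A minimal Python sketch of that idea follows: it finds the closest pair of non-overlapping fixed-length subsequences in a contour and abandons a comparison as soon as it cannot beat the best pair found so far. This is an illustration only, not the exact motif discovery algorithm of Mueen and Keogh and not the proposal's own search. The function name, the fixed window length `w` and the plain Euclidean distance are assumptions.

```python
def closest_motif_pair(series, w):
    """Find the closest pair of non-overlapping length-w subsequences.

    Returns (distance, i, j) with i < j as start indices, or None when the
    series is too short to hold two non-overlapping windows.
    """
    n = len(series)
    best_sq = float("inf")
    best = None
    for i in range(n - w + 1):
        # Starting j at i + w skips overlapping (trivial) matches.
        for j in range(i + w, n - w + 1):
            acc = 0.0
            for k in range(w):
                diff = series[i + k] - series[j + k]
                acc += diff * diff
                if acc >= best_sq:
                    # Early abandoning: this pair cannot beat the best so far.
                    break
            else:
                # Loop finished without abandoning, so this pair is the new best.
                best_sq = acc
                best = (i, j)
    if best is None:
        return None
    return (best_sq ** 0.5, best[0], best[1])
```

The search is still quadratic in the number of windows; early abandoning only shortens the inner loop, which is why the slides also list lower bounds and domain-specific pruning rules as further options.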