Modelling Music Similarity
Frans Wiering, Utrecht University
22 August 2007
Outline
- Music similarity (refresh memory)
- Melody retrieval
  - pitch only
  - geometric methods
- Harmony retrieval
- Chroma matching
- Evaluation
Music similarity
- Central issue in music IR
- Many levels of musical similarity
  - many different tasks
  - different features
  - given a task, expert judgements are pretty consistent
- Identity generally not the issue
  - e.g. performance and notation differences
  - or: issues of 'work'
  - human performance problems: Query By Humming
    - incomplete (just one voice)
    - imprecision, errors
Measuring similarity
- Usually expressed as a single non-negative real number
  - allows ordering in a list
  - allows use of standard evaluation methods
Retrieval methods
Symbolic data
- string-based methods
  - usually pitch-only
  - exact, substring, approximate matching methods
- set-based methods
  - usually pitch, duration, onset time
  - geometric distance measures such as EMD/PTD, C-Brahms
- graph-based methods
- probabilistic methods
  - Markov models
  - similarity derived from transition probabilities
Audio data
- fingerprinting
  - no pitch/rhythm detection
  - exact match only
- chroma-based matching
  - finds musically similar passages
- self-organising maps
  - clustering musical genres
Classroom exercise: melodic similarity
how to model the similarity between the melodies?
Melody retrieval by pitch
- General idea:
  - pitch is the most important feature
  - others can be discarded
  - represent melody as a string
  - apply string matching techniques
- Some ways of representing melodies
  - pitch names
  - interval (distance between 2 pitches)
  - gross contour: same/up/down (Parsons Code)
  - refined contour: same/step up/leap up/step down/leap down
pitch names:      c c g g a a g g f f e e d d e c
interval:         0 4 0 1 0 -1 0 -1 0 -1 0 -1 0 +1 -2
gross contour:    S U S U S D S D S D S D S U D
refined contour:  s U s u s d s d s d s d s u D
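The four representations can be derived from one another mechanically. The following is a minimal sketch, assuming a melody without accidentals that stays within one octave, intervals counted in scale steps, and a "leap" defined as two or more steps; the function names are illustrative only.

```python
# Scale-step index of each pitch name within one octave
# (assumption: no accidentals, no octave changes).
STEP = {name: i for i, name in enumerate("cdefgab")}

def intervals(melody):
    """Signed distance in scale steps between consecutive pitch names."""
    steps = [STEP[name] for name in melody.split()]
    return [b - a for a, b in zip(steps, steps[1:])]

def gross_contour(ivs):
    """Same/Up/Down string (Parsons Code)."""
    return "".join("S" if i == 0 else "U" if i > 0 else "D" for i in ivs)

def refined_contour(ivs):
    """s = same, u/d = step up/down, U/D = leap up/down (assumed: leap >= 2 steps)."""
    def symbol(i):
        if i == 0:
            return "s"
        if i > 0:
            return "u" if i == 1 else "U"
        return "d" if i == -1 else "D"
    return "".join(symbol(i) for i in ivs)

melody = "c c g g a a g g f f e e d d e c"
ivs = intervals(melody)
print(ivs)
print(gross_contour(ivs))
print(refined_contour(ivs))
```

Each step discards information: the contour strings can be computed from the intervals, but not the other way round.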
Themefinder
- Several 1-dimensional search options, e.g.
  - pitch
  - interval
  - contour
  - rhythm
- wildcards
- matching by regular expressions
- ca. 40,000 themes
  - Barlow and Morgenstern (1948)
  - ESAC encodings
  - Lincoln, 16th Century Motet
- www.themefinder.org
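Searching a one-dimensional representation with wildcards and regular expressions can be sketched as follows. This is not Themefinder's implementation or query syntax; the theme names and contour strings are made up for illustration, and Python's standard `re` module stands in for the search engine.

```python
import re

# Hypothetical mini-collection: theme id -> gross contour string (S/U/D).
themes = {
    "theme_a": "SUSUSDSDSDSDSUD",
    "theme_b": "UUDDSUDU",
    "theme_c": "DDUSUSD",
}

def search(pattern):
    """Return the ids of all themes whose contour contains a match for the pattern."""
    regex = re.compile(pattern)
    return sorted(name for name, contour in themes.items() if regex.search(contour))

# "." acts as a single-symbol wildcard, "+" as "one or more".
print(search("SUSU.D"))
print(search("U+D+S"))
```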
Sample result
- Example after Byrd & Crawford (2000)
- non-identical hits
  - different rhythm
  - different meter
- do we find these similar?
- does this help end users?
- Query: +m2 +M2 P1 -M2 -m2 -M2
Why pitch-only retrieval is unsatisfactory
- Remember: time structure is the most stable element of melody (Sloboda & Parker 1985)
- Information contribution of other 3 parameters (estimate for Western music; Byrd & Crawford 2000)
  - pitch: 50%
  - rhythm: 40%
  - timbre + dynamics: 10%
- Melodic confounds (Selfridge-Field 1998):
  - rests
  - repeated notes
  - grace notes, ornamentation
- People remember high-level concepts, not notes
Mental model of a song
[Figure: Ah, vous dirai-je maman, shown at melody level, phrase level, chunk level and subchunk level; labels: A, A, B, analysis, synthesis]
- analysis: from ear to LTM
  - (sub)chunks created by similarity and continuity
    - a lot of parallelism
  - boundaries by leaps and harmony
  - chunks may have a harmonic aspect too (I, V, V->I)
- synthesis: from LTM to focus of attention
  - recollection using general characteristics of phrases and chunks
  - enables understanding of variation, performance
  - notes are reconstituted through some musical grammar
Set-based approaches to melody retrieval in polyphony
- General idea:
  - compare note sets: find supersets, calculate distance
  - usually take onset, pitch and duration into account (OPD)
  - hopefully more tolerant against some of the problems of melodic variety
- Clausen, Engelbrecht, Meyer, Schmitz (2000): PROMS
  - matches onset times; wildcards
  - elegant indexing
- Lemström, Mäkinen, Ukkonen, Turkia (several articles, 2003-4): C-Brahms
  - algorithms for matching line segments
    - P1: onsets
    - P2: partial match onset times
    - P3: common shared time
  - attention to time complexity
- Typke, Veltkamp, Wiering (2006): Orpheus
  - matching, using EMD and PTD
C-Brahms: P1-P3
- General task: find translations of pattern P in T
- P1: all starting points in P match starting points in T (example)
- P2: same, with subsets of P
- P3: find the one with maximum overlap
- see also: http://www.cs.helsinki.fi/group/cbrahms/algorithm-visualisations.html
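To make P1 and P2 concrete, here is a naive sketch, assuming notes are given as (onset, pitch) points with onsets in beats and pitches as MIDI numbers. It uses brute-force checks and is not the C-Brahms algorithms themselves, which pay particular attention to time complexity; the function names are illustrative.

```python
from collections import Counter

def p1_matches(pattern, text):
    """P1: translations (d_onset, d_pitch) that map every pattern point onto a text point."""
    text_set = set(text)
    first_onset, first_pitch = pattern[0]
    found = set()
    for onset, pitch in text_set:
        shift = (onset - first_onset, pitch - first_pitch)
        if all((x + shift[0], y + shift[1]) in text_set for x, y in pattern):
            found.add(shift)
    return sorted(found)

def p2_best(pattern, text):
    """P2: the translation that matches the largest subset of the pattern, with its size."""
    votes = Counter((tx - px, ty - py) for px, py in pattern for tx, ty in set(text))
    return votes.most_common(1)[0]

pattern = [(0, 60), (1, 62), (2, 64)]
text = [(0, 55), (4, 67), (5, 69), (6, 71), (8, 60), (9, 62), (10, 65)]
print(p1_matches(pattern, text))  # complete, transposed occurrences
print(p2_best(pattern, text))     # best partial occurrence
```

P3 additionally takes note durations into account (line segments instead of points) and is omitted here.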
Earth Mover’s Distance
- The Earth Mover's Distance (EMD) measures similarity by calculating a minimum flow that would match two sets of weighted points. One set emits weight, the other one receives weight (Y. Rubner 1998; S. Cohen 1999)
- Constraints:
  - no negative flow
  - no point emits or receives more than its weight
  - the lighter point set is completely matched: partial matching
Application to melody
- Researched in Rainer Typke's PhD thesis (2007)
- Models melodic contour
- Represent notes as weighted point sets in 2-dimensional space (pitch, time)
- Weight represents duration
  - other possibilities: contour/metric position etc.
[Figure caption: here, the 'earth' is only moved along the temporal axis]
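The representation can be sketched as follows, assuming notes are given as (onset, pitch, duration) triples with onsets and durations in beats and pitches as MIDI numbers. The Euclidean ground distance and the unscaled axes are assumptions of this sketch, not something the slides specify.

```python
import math

def to_weighted_points(notes):
    """Turn (onset, pitch, duration) notes into ((time, pitch), weight) pairs.

    The weight of each point is the note's duration.
    """
    return [((onset, pitch), duration) for onset, pitch, duration in notes]

def total_weight(points):
    """Sum of all point weights (the amount of 'earth' in the set)."""
    return sum(weight for _, weight in points)

def ground_distance(p, q):
    """Assumed ground distance: Euclidean distance in the (time, pitch) plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

notes = [(0, 60, 1), (1, 60, 1), (2, 67, 2)]
points = to_weighted_points(notes)
print(points)
print(total_weight(points))
```

How the time axis is scaled relative to the pitch axis determines how costly a rhythmic deviation is compared with a pitch deviation, so in practice that scaling is a parameter to tune.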
Another example
- Interesting properties
  - tolerant against melodic confounds
  - suitable for polyphony
  - continuous
  - partial matching
- Disadvantage
  - triangle inequality doesn't hold
  - less suitable for indexing
[Figure caption: after alignment, the 'earth' is moved both along the temporal axis and along the pitch axis]
Test on RISM A/II
finds 15 out of 16 known instances of Roslin Castle
Proportional Transportation Distance (PTD)
- Giannopoulos & Veltkamp (2002)
  - EMD, weights of sets normalised to 1
  - triangle inequality holds
  - suitable for indexing
  - no partial matching
- Test on RISM A/II
  - only hits with approximately same length
  - need 4 queries to find all known items
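In formula form, the normalisation turns the inequality constraints of the EMD into equalities. The notation is an assumption made for this sketch: set A has weights w_i with total W, set B has weights u_j with total U, d_ij is the ground distance and f_ij the flow between point i of A and point j of B.

```latex
\begin{aligned}
&\hat{w}_i = \frac{w_i}{W}, \qquad \hat{u}_j = \frac{u_j}{U}, \\
&f_{ij} \ge 0, \qquad \sum_{j} f_{ij} = \hat{w}_i, \qquad \sum_{i} f_{ij} = \hat{u}_j, \\
&\mathrm{PTD}(A, B) = \min_{F} \sum_{i}\sum_{j} f_{ij}\, d_{ij}
\end{aligned}
```

Because every point must now emit or receive exactly its normalised weight, no part of either set can be left unmatched, which is why partial matching is lost.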
False positive (EMD)
- Problems arise when length and/or number of notes differs considerably
Segmenting
- Solution: apply segmentation
  - create segments of comparable length
  - overlapping segments of 6-9 consecutive notes
  - perform search for each query segment
  - search results are combined
- Evaluation: better Recall-Precision averages
- Disadvantage: very large number of segments
  - possible solution: use 'cognitive segmenting'
  - many algorithms have been proposed for this
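A minimal sketch of the exhaustive segmentation, assuming a melody is simply a list of notes; the function name and defaults are illustrative. It also shows why the number of segments grows quickly.

```python
def segments(notes, min_len=6, max_len=9):
    """All overlapping runs of min_len..max_len consecutive notes."""
    return [
        notes[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(notes) - length + 1)
    ]

melody = list(range(16))   # stand-in for a 16-note melody
segs = segments(melody)
print(len(segs))           # number of segments for one short melody
```

A single 16-note melody already yields 11 + 10 + 9 + 8 = 38 segments, each of which has to be indexed or searched separately.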
Graph matching
- Melodies can be represented as graphs
- Compare graphs
- Tested on folksong collection
- Will be discussed by Renier Leuken on Friday
Concluding remarks about melody retrieval
- Lots of creativity goes into melody; difficult to give rules
  - not a 'basic musical structure' (Temperley 2001)
- Important to use multiple features
  - pitch, rhythm
  - harmony
- Melody is not an object but a process that takes place over time
  - role of expectation in perception
  - can be modelled too (Huron 2006)
Harmonic matching
- Use chords and chord relationships in retrieval
- Tonality: system for interpreting pitches or chords through their relationships to a reference pitch, dubbed tonic (Huron 2006)
- Relatively few different chords are used
  - constructed in similar way
  - connected in relatively stereotyped patterns
- Most basic unit: triad
  - consists of 3 different pitches
  - 24 consonant triads, each of which can function as a tonic
- Tonality is a 'basic musical structure'
  - much standardisation
  - fewer problems in dealing with creativity than in melody
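The 24 consonant triads can be enumerated as pitch-class sets. The sketch below assumes pitch classes numbered 0-11 starting from C and sharp-only note names; the naming scheme is illustrative.

```python
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def triads():
    """Map triad name -> set of its three pitch classes (0 = C, ..., 11 = B)."""
    result = {}
    for root, name in enumerate(NAMES):
        # major triad: root, major third (+4 semitones), fifth (+7 semitones)
        result[name + " major"] = frozenset({root, (root + 4) % 12, (root + 7) % 12})
        # minor triad: root, minor third (+3 semitones), fifth (+7 semitones)
        result[name + " minor"] = frozenset({root, (root + 3) % 12, (root + 7) % 12})
    return result

all_triads = triads()
print(len(all_triads))
print(sorted(all_triads["C major"]), sorted(all_triads["A minor"]))
```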
Example: OMRAS harmonic matching
- Jeremy Pickens et al., Polyphonic Score Retrieval Using Polyphonic Audio Queries: A Harmonic Modeling Approach (2002)
- Example of audio to symbolic matching
  - compares complete pieces
  - harmonic aspect makes it particularly nice
- Main steps
  - audio recording -> MIDI transcription
  - compare to MIDI representations of scores in database
  - output ranked results
- OMRAS: Online Music Recognition And Searching, www.omras.org
OMRAS in more detail
- Transcriptions contain many errors
  - most of these: harmonics of correct pitches
  - these disappear when 'simultaneities' are reduced to simple chords
- Each simultaneity is compared to all 24 triads (12 major, 12 minor)
  - no decision, but a value for each tonality
  - employs Krumhansl-Kessler frequency profiles
- Out of the 24 values for each simultaneity, a Markov model for transitions between triads is generated
- Models of query and documents are compared
  - Language modelling: estimating the probability of generating a query that conforms to the model of the document
- Tested by means of retrieval of variations (Mozart, Ah, vous; Lachrimae; Folia)
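The language-modelling step can be illustrated with a deliberately simplified sketch. It assumes each simultaneity has already been reduced to a single hard chord label, whereas the approach described above keeps a value for all 24 triads; add-one smoothing and the chord sequences are likewise assumptions made for illustration.

```python
import math

def transition_model(chords, states, alpha=1.0):
    """First-order Markov model: P(next chord | current chord), with additive smoothing."""
    counts = {s: {t: alpha for t in states} for s in states}
    for current, following in zip(chords, chords[1:]):
        counts[current][following] += 1
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }

def log_likelihood(query, model):
    """Log-probability that the document model generates the query's chord transitions."""
    return sum(math.log(model[a][b]) for a, b in zip(query, query[1:]))

states = ["C", "F", "G", "Am"]
doc1 = transition_model("C F G C F G C".split(), states)
doc2 = transition_model("Am G F Am G F Am".split(), states)
query = "C F G C".split()

# Rank documents by how well their model explains the query.
ranking = sorted(
    [("doc1", log_likelihood(query, doc1)), ("doc2", log_likelihood(query, doc2))],
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```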
Chroma matching
- Most promising type of audio matching
- Idea: extract chroma
  - choose time interval
  - perform FFT -> frequency spectrum
  - determine energy for each pitch
  - sum for same pitch in different octaves
  - create vectors
  - find nearest neighbour(s)
- illustration from Dan Ellis: labrosa.ee.columbia.edu/projects/coversongs/
- Applications:
  - cover song identification (Dan Ellis, best in MIREX 2006)
  - approximate audio matching (Casey 2006)
  - audio alignment; audio-notation alignment (Syncplayer, Clausen & Müller) http://www-mmdb.iai.uni-bonn.de/projects/syncplayer/
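The chroma extraction steps can be sketched in pure Python. This is an illustration under simplifying assumptions, not the method of any of the systems named above: a naive DFT stands in for the FFT, each frequency bin is assigned to its nearest equal-tempered pitch (A4 = 440 Hz), and there is no windowing or tuning estimation.

```python
import cmath
import math

def dft_power(frame):
    """Power spectrum of one frame (naive DFT; an FFT would be used in practice)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))) ** 2
        for k in range(n // 2)
    ]

def chroma(frame, sample_rate):
    """12-dimensional chroma vector: energy per pitch class, summed over octaves."""
    n = len(frame)
    vector = [0.0] * 12
    for k, power in enumerate(dft_power(frame)):
        if k == 0:
            continue  # skip the DC component
        freq = k * sample_rate / n
        midi = round(69 + 12 * math.log2(freq / 440.0))  # nearest pitch, A4 = 69
        vector[midi % 12] += power                       # fold octaves together
    total = sum(vector)
    return [v / total for v in vector] if total else vector

# Test signal: the note A in two octaves (440 Hz and 880 Hz).
sample_rate, n = 4096, 512
frame = [
    math.sin(2 * math.pi * 440 * t / sample_rate) + math.sin(2 * math.pi * 880 * t / sample_rate)
    for t in range(n)
]
vector = chroma(frame, sample_rate)
print(max(range(12), key=lambda pc: vector[pc]))  # index of the strongest pitch class (0 = C)
```

Sequences of such vectors, one per time interval, are what a nearest-neighbour search would then compare.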
Selected tools for MIR
- Survey by Paul Lamere (May 2005): http://www.music-ir.org/evaluation/tools.html
- Audio
  - Marsyas (http://marsyas.sness.net/): audio processing, specifically MIR
  - Matlab (commercial)
- Symbolic
  - MIDI Toolbox (http://www.jyu.fi/musica/miditoolbox/): functions for analyzing and visualizing MIDI files, uses Matlab
  - jMIR (jmir.sourceforge.net): Java software suite for MIR research, mainly feature extraction and classification
  - SIMILE (Müllensiefen & Frieler 2004): set of similarity algorithms
MIR evaluation: MIREX
- Music Information Retrieval Evaluation eXchange
  - http://www.music-ir.org/mirexwiki/
  - yearly competition (?) since 2004/5
- Many kinds of tasks
  - feature extraction, e.g. chord, onset detection, melody extraction
  - classification, e.g. mood, genre
  - identification, e.g. artist, cover song
  - audio to score alignment
  - similarity and retrieval
    - query by humming
    - audio similarity and retrieval
    - symbolic melodic similarity
Performance evaluation
- Data collection
  - shortage of suitable test collections
  - problem: copyright
- Human judgements
  - pooling method: manually remove false positives from combined output of participants
  - ground truth: establish ideal result set from complete data set
  - drawback: reasons behind human judgements are inaccessible
- Evaluation measure
  - Precision and Recall often used
  - similarity judgements are usually not binary
  - need suitable methods
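For reference, precision and recall on a binary ground truth can be computed as follows; the document identifiers are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved items that are relevant.
    Recall: share of relevant items that were retrieved."""
    retrieved = list(retrieved)
    relevant = set(relevant)
    hits = sum(1 for doc in retrieved if doc in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

print(precision_recall(["a", "b", "x", "y"], {"a", "b", "c"}))
```

Both measures treat relevance as yes/no and ignore the order of the result list, which is why graded or ranked similarity judgements need a different measure.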
Case: evaluation of melodic similarity
- ground truth: perfect answer to a musical query, determined by domain experts
- experiment
  - Database: RISM A/II, c. 470,000 melodies
  - select n queries
  - filtering: sets of 50 candidate hits
  - task:
    - decide relevant/not
    - rank relevant items by similarity to the query
Expert interface
Sample result
all ground truths are on http://rainer.typke.org/mirex05.0.html
Average Dynamic Recall
- Ground truth partially ordered (i.e. some items have the same rank)
- Algorithm output ordered by similarity
- Used in MIREX 2005, 2006; SHREC
- after Typke, Veltkamp & Wiering 2006
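A sketch of how Average Dynamic Recall can be computed, following the definition as it is usually summarised: at each position i (up to the number of relevant items), recall is measured against all ground-truth items that could legitimately appear at or before position i, and these recall values are averaged. The representation of the ground truth as a list of groups of equally ranked items is an assumption of this sketch.

```python
def average_dynamic_recall(ranking, groups):
    """ranking: algorithm output, best match first.
    groups: ground truth as a list of groups; items within a group share a rank."""
    total = sum(len(group) for group in groups)
    recalls = []
    for i in range(1, total + 1):
        # Allowed at position i: every group up to the one containing the i-th item.
        allowed, covered = set(), 0
        for group in groups:
            allowed |= set(group)
            covered += len(group)
            if covered >= i:
                break
        hits = sum(1 for doc in ranking[:i] if doc in allowed)
        recalls.append(hits / i)
    return sum(recalls) / total

ground_truth = [["a"], ["b", "c"], ["d"]]
print(average_dynamic_recall(["a", "c", "b", "d"], ground_truth))  # order within a group is free
print(average_dynamic_recall(["b", "a", "c", "x"], ground_truth))
```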
Summary
- what we discussed
  - melody retrieval
    - pitch only
    - geometric methods
  - harmony retrieval (Markov modelling)
  - chroma matching
  - evaluation
- what we did not discuss
  - many other symbolic methods
  - audio matching
  - classification
- where to go from here
  - http://www.ismir.net/proceedings/ (493 papers)
References (1)
H. Barlow & S. Morgenstern. Dictionary of Musical Themes. 1948.
D. Byrd & T. Crawford. Problems of Music Information Retrieval in the Real World. Information Processing and Management 38, 249-272, 2000.
M. Casey. Audio Tools for Music Discovery and Structural Analysis. 2006. [http://www.methodsnetwork.ac.uk/activities/es02mainpage.html]
M. Clausen, R. Engelbrecht, D. Meyer & J. Schmitz. PROMS: a web-based tool for searching in polyphonic music. In ISMIR 2000.
P. P. Giannopoulos & R. C. Veltkamp. A pseudo-metric for weighted point sets. In Proceedings of the 7th European Conference on Computer Vision (ECCV). Springer-Verlag, 715–730, 2002.
D. Huron. Sweet Anticipation: Music and the Psychology of Expectation. Bradford Books, 2006.
D. Müllensiefen & K. Frieler. Cognitive Adequacy in the Measurement of Melodic Similarity: Algorithmic vs. Human Judgements. Computing in Musicology 13, 147-176, 2004.
M. Müller, F. Kurth & M. Clausen. Audio matching via chroma-based statistical features. In ISMIR 2005, 288–295.
References (2)
J. Pickens et al. Polyphonic Score Retrieval Using Polyphonic Audio Queries: A Harmonic Modeling Approach. In ISMIR 2002.
E. Selfridge-Field. Conceptual and representational issues in melodic comparison. Computing in Musicology 11, 3–64, 1998.
D. Temperley. The Cognition of Basic Musical Structures. MIT Press, 2001.
R. Typke. Music Retrieval based on Melodic Similarity. PhD thesis, Utrecht University, 2007.
R. Typke, R. C. Veltkamp & F. Wiering. A measure for evaluating retrieval techniques based on partially ordered ground truth lists. In International Conference on Multimedia & Expo (ICME), 2006.
R. Typke, F. Wiering & R. C. Veltkamp. Transportation distances and human perception of melodic similarity. Musicae Scientiae Discussion Forum 4A, 153-181, 2007.
E. Ukkonen, K. Lemström & V. Mäkinen. Geometric algorithms for transposition invariant content-based music retrieval. In ISMIR 2003, 193–199.