modeling meter and harmony: a preference-rule approachsleator/papers/music-modeling.pdf · tem for...

18
10 Computer Music Journal Computer Music Journal, 23:1, pp. 10–27, Spring 1999 © 1999 Massachusetts Institute of Technology. Introduction Computational music analysis is an important project for a number of reasons. From a psycho- logical point of view, systems for performing mu- sical processes may shed light on how human listeners perform such processes, just as computa- tional work in vision and problem solving has led to important insights in those domains. While re- cent research has revealed much about listeners’ mental representations of music (Dowling and Harwood 1986; Butler 1992), there is much to be learned in this area. There are practical applica- tions to automatic music analysis as well. Many kinds of music-related tasks that computers might perform—for example, producing a score from raw pitch data, detecting probable errors in a score, searching a melodic database for matches to a query, generating an accompaniment for a melody, or performing statistical analysis of musical styles for the purposes of automatic composition—re- quire an ability to process music and extract vari- ous kinds of structure and information from it. Extracting this information proves to be a highly complex task. In this article, we present a computational sys- tem for analyzing metrical and harmonic structure. The system is designed for Western tonal music, particularly art music of the “common-practice” period. As input, the program takes a list of notes with pitch, on-time, and off-time; it produces a rep- resentation showing a metrical structure (consist- ing of several levels of beats) and a harmonic structure (consisting of a partitioning of the piece into segments, each labeled with a root). (There are important reasons for combining the metrical and harmonic systems into a single program, which we will explain.) Our approach is based on preference rules. Preference rules are criteria for selecting an analysis of a piece out of many possible ones. The preferred analysis is the one which, on balance, best satisfies the rules. The current article presents an overview of our project; more information is available at the World Wide Web site http:// www.link.cs.cmu.edu/music-analysis. The com- plete source code for our system (written in C) is available at this site, and we also provide a number of sample input files and a utility program for gen- erating input files from MIDI files. The site also contains information about recent work that we were unable to include here. Both metrical analysis and harmonic analysis have been addressed before in computational music studies. The problem of harmonic analysis has re- ceived surprisingly little attention. The few at- tempts that have been made (notably by Winograd [1968] and Maxwell [1992]) have had limited suc- cess. The most important limitation is that they are unable to handle cases where the notes of a chord are not stated fully and simultaneously, such as arpeggiations, incomplete chords, and unaccom- panied melodies. They also require information that is not generally present in listening, such as key signatures, pitch spellings, and rhythmic val- ues. (Temperley [1997] has reviewed this work.) More work has been done on metrical analysis (Longuet-Higgins and Steedman 1971; Steedman 1977; Chafe, Mont-Reynaud, and Rush 1982; Povel and Essens 1985; Allen and Dannenberg 1990; Lee Modeling Meter and Harmony: A Preference- Rule Approach David Temperley * and Daniel Sleator *School of Music Weigel Hall Ohio State University Columbus, Ohio 43201, USA [email protected] School of Computer Science Carnegie-Mellon University 5000 Forbes Avenue Pittsburgh, Pennsylvania 15213, USA [email protected]

Upload: duongdung

Post on 29-Jun-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

10 Computer Music Journal

Computer Music Journal 231 pp 10ndash27 Spring 1999copy 1999 Massachusetts Institute of Technology

Introduction

Computational music analysis is an importantproject for a number of reasons From a psycho-logical point of view systems for performing mu-sical processes may shed light on how humanlisteners perform such processes just as computa-tional work in vision and problem solving has ledto important insights in those domains While re-cent research has revealed much about listenersrsquomental representations of music (Dowling andHarwood 1986 Butler 1992) there is much to belearned in this area There are practical applica-tions to automatic music analysis as well Manykinds of music-related tasks that computers mightperformmdashfor example producing a score from rawpitch data detecting probable errors in a scoresearching a melodic database for matches to aquery generating an accompaniment for a melodyor performing statistical analysis of musical stylesfor the purposes of automatic compositionmdashre-quire an ability to process music and extract vari-ous kinds of structure and information from itExtracting this information proves to be a highlycomplex task

In this article we present a computational sys-tem for analyzing metrical and harmonic structureThe system is designed for Western tonal musicparticularly art music of the ldquocommon-practicerdquoperiod As input the program takes a list of noteswith pitch on-time and off-time it produces a rep-resentation showing a metrical structure (consist-ing of several levels of beats) and a harmonic

structure (consisting of a partitioning of the pieceinto segments each labeled with a root) (There areimportant reasons for combining the metrical andharmonic systems into a single program which wewill explain) Our approach is based on preferencerules Preference rules are criteria for selecting ananalysis of a piece out of many possible ones Thepreferred analysis is the one which on balancebest satisfies the rules The current article presentsan overview of our project more information isavailable at the World Wide Web site httpwwwlinkcscmuedumusic-analysis The com-plete source code for our system (written in C) isavailable at this site and we also provide a numberof sample input files and a utility program for gen-erating input files from MIDI files The site alsocontains information about recent work that wewere unable to include here

Both metrical analysis and harmonic analysishave been addressed before in computational musicstudies The problem of harmonic analysis has re-ceived surprisingly little attention The few at-tempts that have been made (notably by Winograd[1968] and Maxwell [1992]) have had limited suc-cess The most important limitation is that theyare unable to handle cases where the notes of achord are not stated fully and simultaneously suchas arpeggiations incomplete chords and unaccom-panied melodies They also require informationthat is not generally present in listening such askey signatures pitch spellings and rhythmic val-ues (Temperley [1997] has reviewed this work)More work has been done on metrical analysis(Longuet-Higgins and Steedman 1971 Steedman1977 Chafe Mont-Reynaud and Rush 1982 Poveland Essens 1985 Allen and Dannenberg 1990 Lee

Modeling Meter andHarmony A Preference-Rule Approach

David Temperley and Daniel Sleatordagger

School of MusicWeigel HallOhio State UniversityColumbus Ohio 43201 USAtemperley1osuedudaggerSchool of Computer ScienceCarnegie-Mellon University5000 Forbes AvenuePittsburgh Pennsylvania 15213 USAsleator+cscmuedu

Temperley and Sleator 11

1991 Rosenthal 1992 Parncutt 1994) In this areatoo previous systems have been limited in both theinputs they handle and the representations theyproduce Some can handle only quantized inputs(Longuet-Higgins and Steedman 1971 Steedman1977 Povel and Essens 1985 Lee 1991 Parncutt1994) some consider only rhythmic informationwithout considering pitch in any way (Longuet-Higgins and Steedman 1971 Povel and Essens 1985Allen and Dannenberg 1990 Lee 1991 Parncutt1994) some produce only a single level of beats(Povel and Essens 1985 Parncutt 1994) and all ofthese systems are limited to monophonic musicThe current system attempts to overcome theselimitations

Quite apart from the issue of performance thesystem presented here differs from most other har-monic and metrical analysis systems in an impor-tant way For the most part the systems mentionedabove are not preference-rule systems Rather theymight better be called procedural systems they arebest described in terms of the procedure they fol-low rather than the output they produce Cer-tainly these systems reflect principles about thekinds of metrical structures that are desirable likehaving strong beats on long notes However it isnot usually possible to describe the output of thesystem as one that satisfies some set of criteria(Two exceptions are the systems of Povel andEssens and Parncutt which could be viewed assimple preference-rule systems) With preference-rule systems by contrast the output can be de-scribed simply as the one that best satisfies thepreference rules To put it another way the prefer-ence rules themselves provide a kind of high-leveldescription of what the system is doing which isnot usually available in procedural systems

The conceptual simplicity of preference-rule sys-tems is one of their attractive features Howeverthey present a computational problem how to findthe optimal analysis of a piece out of a huge num-ber of possibilities We will propose a solution tothis problem that seems to have quite broad appli-cability to musical preference-rule systems

We begin by describing our two preference-rulesystems With each system we begin with an in-formal description and then explain the way it is

formalized and implemented We also discuss theprocedure we have devised for solving the searchproblem posed above We examine several ex-amples of the programrsquos output pointing to somevirtues and failings Finally we consider some pos-sible directions for improvement and discusssome important general advantages of the prefer-ence-rule approach

Meter The GTTM Model

Before we begin a word is needed about the inputto the program The input we assume is a ldquonotelistrdquo giving the pitch (in integer notation middleC = 60) and the on-time and off-time (in millisec-onds) of a series of notes It is perhaps easiest tothink of this as a two-dimensional representationwith pitch on one axis and time on the othersimilar to a piano roll Each event is represented asa line segment on this plane the length of the seg-ment corresponding to the notersquos duration Theprogram requires no information beyond this inparticular it does not require other informationcommonly available in scores such as bar lineskey signatures rhythmic notation and the spell-ings of pitches

While our approach to meter builds on variousearlier attempts it is most closely related to thetheory of Lerdahl and Jackendoff as presented in AGenerative Theory of Tonal Music (hereaftercalled GTTM) (1983) Lerdahl and Jackendoff pro-pose a theory of meter stated as a set of well-formedness (or hard-and-fast) rules and preferencerules The preference rules in GTTM are stated in-formally and the authors do not discuss how theymight be formalized or implemented Lerdahl andJackendoff propose that a metrical structure con-sists of several levels of equally spaced beats(where beats are points in time not necessarilyevents) Every second or third beat at one level is abeat at the next level up (By convention the levelwith the fastest beats is called the lowest level) Inmusic notation the time signature of a piece usu-ally indicates several levels of the metrical struc-ture For example a piece in 68 time has one levelof eighth-note beats every third beat at that level

12 Computer Music Journal

forms a dotted-quarter level of beats and everysecond beat at that level forms a dotted-half (orldquoone-measurerdquo) level In addition there may alsobe one or two levels above the measure so-calledhypermetrical levels

Even assuming that a metrical structure mustinvolve several levels of equally spaced beats onemust still determine the duple or triple relation-ships between levels the tempo (the time inter-vals between beats) and the placing of the beatsrelative to the music For this purpose Lerdahland Jackendoff posit a set of metrical preferencerules stating the criteria whereby listeners inferthe correct structure Consider Figure 1 the tradi-tional American melody ldquoOh Susannahrdquo The cor-rect metrical structure is shown above the staff(The metrical and harmonic analysis shown hereis the one produced by our program with one dif-ference that we will explain) The most importantrule is that beats (especially higher-level or

ldquostrongrdquo beats) should whenever possible coincidewith the onsets of events Second there is a prefer-ence for strong beats to coincide with longerevents In ldquoOh Susannahrdquo for example this favorsplacing quarter-note beats on even-numberedeighth-note beats (the second fourth and so on)since this aligns them with the dotted eighthnotes Similarly this rule favors placing half-notebeats on odd-numbered quarter notes since thisaligns the long note in measure 4 with a half-notebeat We state these rules as follows (our wordingdiffers slightly from that in GTTM)

Event rulemdashprefer a structure that alignsbeats with event onsetsLength rulemdashprefer a structure that alignsstrong beats with onsets of longer events

Note that the preferred metrical structure is theone that is preferred on balance it may not be pre-ferred at every moment of the piece For example

Figure 1 ldquoOh Susannahrdquoshowing the correct metri-cal and harmonic struc-ture Each row of dotsindicates a level of themetrical structure eachdot indicates a beat on

the onset of the note be-low Each chord symbolrepresents a chord spanbeginning on the note be-low and extending to thebeginning of the followingchord span

Temperley and Sleator 13

the second A in measure 10 is a long note on afairly weak beat thus violating the length rule buton balance this structure is still the preferred one

When we turn to polyphonic music severalcomplications arise Consider Figure 2 the open-ing of Mozartrsquos Sonata K 332 The correct metri-cal structure is indicated by the time signaturewhy is this the perceived structure Clearly theeighth-note level is determined by the event ruleThe quarter-note level could also be explained bythe event rule on certain eighth-note beats wehave two event onsets not just one This brings upan important point about the event rule the moreevent onsets at a time point the better a beat loca-tion it is Regarding the dotted-half-note level onemight suppose that it was due to the long notes inthe right hand This raises the question of what ismeant by the ldquolengthrdquo of an event The actuallength of events is available in the input and thiscould be used However this is problematic In theMozart excerpt the long notes in the right handwould probably still seem long (and hence goodcandidates for strong beats) even if they wereplayed staccato Alternatively one might assumethat the length of a note corresponds to its inter-onset interval (IOI) which is the time interval be-tween the notersquos onset and the onset of thefollowing note (This is the approach taken bymost previous meter-finding programs) This ap-proach works fairly well in monophonic music (al-

though even there it encounters problems as wewill show) however it is totally unworkable inpolyphonic music In the Mozart piece the IOI ofthe first right-hand event is only one eighth noteit is followed immediately by a note in the lefthand Intuitively what we want is the IOI of anote within that line of the texture we call thisthe registral IOI However separating the eventsof a texture into lines is a highly complex taskthat we will not address here Our solution to thisproblem which is crude but fairly effective is todefine the registral IOI of an event as the inter-on-set interval to the next event within a certainrange of pitch we adopt the value of ninesemitones However it sometimes proves usefulto use duration as well In Figure 3 the fact thatthe melody note is held makes it clear that it is ina separate line from the lower notes and hence istruly long despite its short registral IOI Takingall this into account we propose a measure of aneventrsquos length that is used for the purpose of thelength rule the length of a note is the maximumof its duration and its registral IOI

Lerdahl and Jackendoffrsquos theory neglects one ex-tremely important aspect of meter The GTTMrules state that beats at the ldquotactusrdquo level and im-mediately higher levels must be exactly evenlyspaced The tactus is the most salient level ofmeter usually corresponding to the main ldquobeatrdquoof the music in colloquial terms In actual perfor-mance however beats are of course not exactlyevenly spaced at any level There may be deliber-ate fluctuations in timing as well as errors andimperfections This is a crucial point because itmeans that we cannot simply infer the metricalstructure at the beginning of a piece and extrapo-late it metronomically through the rest of thepiece (even if we had the ability to do that) we

Figure 2 Mozartrsquos SonataK 332 first movementmeasures 1ndash5

Figure 3 A case where du-ration affects the perceivedlength of notes

Figure 2

Figure 3

14 Computer Music Journal

must continuously adjust the meter to fit the mu-sic that is heard (There is no doubt that listenerscan and do make these adjustments in general wehave little difficulty relocating the beat after afermata or ritard or even a complete change inmeter or tempo) In principle accommodating thisinto Lerdahl and Jackendoffrsquos theory is straightfor-ward we simply make the requirement for regu-larity of beats at each level into a preference rulerather than a well-formedness (hard-and-fast) ruleWe state this rule as follows

Regularity rulemdashprefer beats at each level tobe maximally evenly spaced

Maximizing the regularity of each level thus be-comes a flexible constraint to be balanced againstthe other preference rules

The regularity rule brings up an important fea-ture of our system In the past computationalanalysis of rhythm has generally been divided intotwo problems One is quantization the roundingoff of durations and time points in a piece to mul-tiples of a common beat (see for example Desainand Honing 1992) The other is metrical analysisimposing higher levels of beats on an existinglower level Models that assume a quantized input(such as Lerdahl and Jackendoffrsquos) are really onlyaddressing the second problem However an im-portant recent realization of music artificial intel-ligence has been that quantization and meterfinding are really part of the same process In im-posing a low level of beats on a piece of musicmarking the onsets of events one is in effect iden-tifying their position and duration in terms of inte-ger values of those beats Most notably Rosenthal(1992) proposed a model that accomplishes bothlow-level quantization and medium-level meterfinding Our system is similar in this respect per-forming both quantization and meter finding in asingle process

The three rules discussed abovemdashthe event rulethe length rule and the regularity rulemdashform thecore of the meter program It remains to be ex-plained how they are formalized and implementedThe basic idea is straightforward The system mustconsider all possible analyses of a piece where ananalysis is simply a well-formed metrical struc-

ture consisting of several rows of beats as out-lined above Each metrical preference rule assigns anumerical score to each analysis indicating howwell the analysis satisfies that rule The preferredanalysis is the one with the highest total score

For now let us consider the simpler situation ofderiving a single row of beats whose time intervalsmay vary over a broad range say 400ndash1600 msecthe typical range for the tactus level We will re-turn later to the problem of adding other levelsAn analysis then is a row of beats imposed on agiven piece The input representation is first quan-tized into very short segments of 35 msec whichwe call pips (the value of 35 msec was simplyfound to be optimal through trial and error) Everyevent onset and offset is rounded to the nearestpip beats also may occur only at the start of a pip

An analysis can then be evaluated in the follow-ing way Each pip containing a beat is given a nu-merical score indicating how many events haveonsets at that point This score is also weighted bythe length of the events starting there In sum-ming these note scores for all the pips we have anumerical representation of how well the analysissatisfies the event and length rules Each pip con-taining a beat is also given a score indicating howevenly spaced it is in the context of the previousbeats in that analysis There are various ways thatthis could be quantified but we have found a verysimple method that works well the regularity of abeat Bn is simply given by the absolute differencebetween the interval between Bn and Bnndash1 and thatbetween Bnndash1 and Bnndash2 This beat-interval scoretherefore acts as a penalty a representation is pre-ferred in which the beat-interval scores are mini-mized The preferred tactus level is the one whosetotal note score and (negative) beat-interval scoreis highest One complication here is that simplydefining the note score for a beat level as the sumof the note scores for all its beats gives an advan-tage to analyses with more beats Thus we weightthe note score of each beat by the square root of itsbeat interval (the interval to the previous beat)The square root was chosen simply because it is aconvenient function whose growth rate is betweena constant (which would be too small) and a linearfunction (which would be too large)

Temperley and Sleator 15

It is because of the beat-interval scores that thescores must be computed in terms of entire analy-ses The best beat location within (for example) a 1-sec segment depends in part on the beat-intervalpenalty for different pips but this penalty dependson the location of previous beats which in turn de-pends on their beat-interval scores (and their notescores) and so on In theory then all possibleanalyses must be considered to be sure of findingthe highest-scoring one overall However the num-ber of possible analyses increases exponentiallywith the number of pips in the piece There is there-fore a search problem to be solved of finding the cor-rect analysis without actually generating them allwe discuss later how we solve this problem

The preferred tactus level for a piece then isthe one that best satisfies the three preferencerules quantified in the way just described It re-mains to be explained how the other levels are de-rived The program begins with the tactus level itthen generates two levels above the tactus and twobelow Five levels prove to be sufficient for thegreat majority of pieces We have adopted the con-vention of calling the tactus level 2 the upper lev-els 3 and 4 and the lower levels 1 and 0 Thegeneration of the upper levels is straightforwardLevel 3 is generated in exactly the same way aslevel 2 with the added stipulation that every beatat level 3 must also be a beat at level 2 and ex-actly one or two level-2 beats must elapse betweeneach pair of level-3 beats Level 4 is then generatedfrom level 3 in the same manner There is no ex-plicit requirement for regularity in terms of the re-lationship between levels (duple or triple) but thistends to emerge naturally since there is pressureon each level to be regular in itself The lower lev-els are slightly more complicated At level 1 apossible duple or triple division is generated foreach level-2 beat The best choice between dupleand triple and the exact placing of the lower-levelbeats within the tactus beat is determined usingthe same preference rules described above How-ever there is also a penalty for switching fromduple to triple from one tactus beat to another Inthis way other things being equal there is a pref-erence for maintaining either duple or triple divi-sion once it is established

A Model of Harmony

Another essential aspect of musical structure isharmony Elsewhere one of us has proposed amodel for harmonic analysis (Temperley 1997)this is the basis for the system we present hereLike the metrical program the main input to theharmonic program is simply a ldquopiano rollrdquo a listof events with a time and duration for each event(although metrical information is also required aswe explain) The program produces a representa-tion of segments which we call chord spans la-beled with roots This is somewhat different fromconventional harmonic analysis or ldquoRoman-nu-meral analysisrdquo whereas a Roman-numeral analy-sis represents each root relative to the current keythe current program simply labels roots in abso-lute terms

The model we employ is again a preference-rule system giving a set of criteria for choosingthe preferred analysis out of all the possible onesA possible harmonic analysis is simply a completesegmentation of the piece where each segment islabeled with a root Again ldquoOh Susannahrdquo pro-vides a simple illustration of the programrsquos rulesWhat we would consider the correct analysis is in-dicated by the root symbols below the staff (Un-like metrical structuremdashwhich is usuallyindicated by the score of a piecemdashthe correct har-monic structure for a piece is to some extent amatter of opinion While there are certainly caseswhere people disagree about the correct analysisfor a piece there is a great deal of agreement aswell) Assuming the first segment consists of thefirst three measures (plus the opening pickup)why does C seem like the correct choice of roothere The most obvious reason is that C E and Gare all present and these are chord tones of C ma-jor One question is how we know which notes inthe segment are chord tones as opposed to orna-mental tones we will return to this But even oncethis is known the chord tones in a segment oftendo not uniquely identify a chord For examplemeasures 9ndash10 contain F and A these could implyeither an F-major chord or a D-minor chord al-though F major seems preferred To capture suchintuitions we stipulate that roots are preferred

16 Computer Music Journal

that result in certain pitch-root relationships butsome relationships are preferred to others We ex-press this in the following rule

Compatibility rulemdashprefer roots that resultin certain pitch-root relationships The fol-lowing relationships are preferred in this or-der 1 5 3 3 7 5 9 ornamental

In measures 9ndash10 a root of F is thus preferredover D given the pitches F and A F results inpitch-root relationships of 1 and 3 whereas D re-sults in 5 and 3

The compatibility rule says that an event thatdoes not have a chord-tone relationship with thecurrent root is ldquoornamentalrdquo However not justany event can be ornamental some events are bet-ter ornamental tones than others If one considersthe most common types of ornamental tonesmdashpassing tones neighbor tones suspensions and ap-poggiaturasmdashone finds that they all share acommon characteristic they are closely followedby another event a half-step or whole-step away inpitch In addition we have found that there is apreference for ornamental tones to be metricallyweak although sometimes they are metricallystrong (This of course requires access to metricalstructure we will return to this in a moment) In ascale passage for example the metrically strongnotes are more likely to be heard as chord tonesWe express these intuitions as follows

Ornamental dissonance rulemdashin labelingevents as ornamental prefer events that are(1) closely followed by another event a half-step or whole-step away and (2) metricallyweak

According to this rule the D in the pickup tomeasure 1 the A at the end of measure 1 and theD at the end of measure 2 are all good ornamentaldissonances since all are closely followedstepwise However the G in measure 2 (for ex-ample) is not closely followed stepwise If thisevent were treated as ornamentalmdashthat is if a root

were chosen for that segment of which G was nota chord tonemdashthe ornamental dissonance rulewould be severely violated

Now consider measure 4 According to thecompatibility rule D would be the preferred rootfor this segment (since the root D results in apitch-root relationship of 1 with pitch D) Why isG more preferred Intuitively G is a ldquocloserrdquo har-mony to C than D is and should therefore be pre-ferred in such cases Here we introduce a simplespatial model the line of fifths similar to thecircle of fifths but extending infinitely in eitherdirection as shown in Figure 4 Choosing a rootfor a segment is therefore a matter of mapping itonto a point on the line of fifths Given thismodel we can say that there is a preference forchoosing roots so that roots of nearby segmentsare close together on the line This leads to athird rule

Harmonic variance rulemdashprefer roots suchthat roots of nearby chord spans are close to-gether on the line of fifths

The line of fifths raises the issue of spellingMuch work in music theory (especially atonaltheory) and music cognition has assumed thatpitches are categorized into ldquopitch classesrdquo suchthat pitches one or more octave apart are membersof the same class However in conventional tonaltheory and music notation distinctions are madebetween different spellings of the same pitch egA versus G We call these tonal pitch classes(TPCs) as opposed to the twelve neutral pitchclasses of atonal theory It is our view that thesespelling distinctions are an important aspect oftonal harmony and must be represented (For adiscussion of this issue see Temperley 1997) Thisadds a further task for the program as well as di-viding the piece into segments labeled with rootsit must also choose a spelling label for each pitchevent hence mapping it onto the line of fifthsThe main criterion for doing this is a simple oneanalogous to the harmonic variance rule above

Figure 4 The line offifths

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 11

1991 Rosenthal 1992 Parncutt 1994) In this areatoo previous systems have been limited in both theinputs they handle and the representations theyproduce Some can handle only quantized inputs(Longuet-Higgins and Steedman 1971 Steedman1977 Povel and Essens 1985 Lee 1991 Parncutt1994) some consider only rhythmic informationwithout considering pitch in any way (Longuet-Higgins and Steedman 1971 Povel and Essens 1985Allen and Dannenberg 1990 Lee 1991 Parncutt1994) some produce only a single level of beats(Povel and Essens 1985 Parncutt 1994) and all ofthese systems are limited to monophonic musicThe current system attempts to overcome theselimitations

Quite apart from the issue of performance thesystem presented here differs from most other har-monic and metrical analysis systems in an impor-tant way For the most part the systems mentionedabove are not preference-rule systems Rather theymight better be called procedural systems they arebest described in terms of the procedure they fol-low rather than the output they produce Cer-tainly these systems reflect principles about thekinds of metrical structures that are desirable likehaving strong beats on long notes However it isnot usually possible to describe the output of thesystem as one that satisfies some set of criteria(Two exceptions are the systems of Povel andEssens and Parncutt which could be viewed assimple preference-rule systems) With preference-rule systems by contrast the output can be de-scribed simply as the one that best satisfies thepreference rules To put it another way the prefer-ence rules themselves provide a kind of high-leveldescription of what the system is doing which isnot usually available in procedural systems

The conceptual simplicity of preference-rule sys-tems is one of their attractive features Howeverthey present a computational problem how to findthe optimal analysis of a piece out of a huge num-ber of possibilities We will propose a solution tothis problem that seems to have quite broad appli-cability to musical preference-rule systems

We begin by describing our two preference-rulesystems With each system we begin with an in-formal description and then explain the way it is

formalized and implemented We also discuss theprocedure we have devised for solving the searchproblem posed above We examine several ex-amples of the programrsquos output pointing to somevirtues and failings Finally we consider some pos-sible directions for improvement and discusssome important general advantages of the prefer-ence-rule approach

Meter The GTTM Model

Before we begin a word is needed about the inputto the program The input we assume is a ldquonotelistrdquo giving the pitch (in integer notation middleC = 60) and the on-time and off-time (in millisec-onds) of a series of notes It is perhaps easiest tothink of this as a two-dimensional representationwith pitch on one axis and time on the othersimilar to a piano roll Each event is represented asa line segment on this plane the length of the seg-ment corresponding to the notersquos duration Theprogram requires no information beyond this inparticular it does not require other informationcommonly available in scores such as bar lineskey signatures rhythmic notation and the spell-ings of pitches

While our approach to meter builds on variousearlier attempts it is most closely related to thetheory of Lerdahl and Jackendoff as presented in AGenerative Theory of Tonal Music (hereaftercalled GTTM) (1983) Lerdahl and Jackendoff pro-pose a theory of meter stated as a set of well-formedness (or hard-and-fast) rules and preferencerules The preference rules in GTTM are stated in-formally and the authors do not discuss how theymight be formalized or implemented Lerdahl andJackendoff propose that a metrical structure con-sists of several levels of equally spaced beats(where beats are points in time not necessarilyevents) Every second or third beat at one level is abeat at the next level up (By convention the levelwith the fastest beats is called the lowest level) Inmusic notation the time signature of a piece usu-ally indicates several levels of the metrical struc-ture For example a piece in 68 time has one levelof eighth-note beats every third beat at that level

12 Computer Music Journal

forms a dotted-quarter level of beats and everysecond beat at that level forms a dotted-half (orldquoone-measurerdquo) level In addition there may alsobe one or two levels above the measure so-calledhypermetrical levels

Even assuming that a metrical structure mustinvolve several levels of equally spaced beats onemust still determine the duple or triple relation-ships between levels the tempo (the time inter-vals between beats) and the placing of the beatsrelative to the music For this purpose Lerdahland Jackendoff posit a set of metrical preferencerules stating the criteria whereby listeners inferthe correct structure Consider Figure 1 the tradi-tional American melody ldquoOh Susannahrdquo The cor-rect metrical structure is shown above the staff(The metrical and harmonic analysis shown hereis the one produced by our program with one dif-ference that we will explain) The most importantrule is that beats (especially higher-level or

ldquostrongrdquo beats) should whenever possible coincidewith the onsets of events Second there is a prefer-ence for strong beats to coincide with longerevents In ldquoOh Susannahrdquo for example this favorsplacing quarter-note beats on even-numberedeighth-note beats (the second fourth and so on)since this aligns them with the dotted eighthnotes Similarly this rule favors placing half-notebeats on odd-numbered quarter notes since thisaligns the long note in measure 4 with a half-notebeat We state these rules as follows (our wordingdiffers slightly from that in GTTM)

Event rulemdashprefer a structure that alignsbeats with event onsetsLength rulemdashprefer a structure that alignsstrong beats with onsets of longer events

Note that the preferred metrical structure is theone that is preferred on balance it may not be pre-ferred at every moment of the piece For example

Figure 1 ldquoOh Susannahrdquoshowing the correct metri-cal and harmonic struc-ture Each row of dotsindicates a level of themetrical structure eachdot indicates a beat on

the onset of the note be-low Each chord symbolrepresents a chord spanbeginning on the note be-low and extending to thebeginning of the followingchord span

Temperley and Sleator 13

the second A in measure 10 is a long note on afairly weak beat thus violating the length rule buton balance this structure is still the preferred one

When we turn to polyphonic music severalcomplications arise Consider Figure 2 the open-ing of Mozartrsquos Sonata K 332 The correct metri-cal structure is indicated by the time signaturewhy is this the perceived structure Clearly theeighth-note level is determined by the event ruleThe quarter-note level could also be explained bythe event rule on certain eighth-note beats wehave two event onsets not just one This brings upan important point about the event rule the moreevent onsets at a time point the better a beat loca-tion it is Regarding the dotted-half-note level onemight suppose that it was due to the long notes inthe right hand This raises the question of what ismeant by the ldquolengthrdquo of an event The actuallength of events is available in the input and thiscould be used However this is problematic In theMozart excerpt the long notes in the right handwould probably still seem long (and hence goodcandidates for strong beats) even if they wereplayed staccato Alternatively one might assumethat the length of a note corresponds to its inter-onset interval (IOI) which is the time interval be-tween the notersquos onset and the onset of thefollowing note (This is the approach taken bymost previous meter-finding programs) This ap-proach works fairly well in monophonic music (al-

though even there it encounters problems as wewill show) however it is totally unworkable inpolyphonic music In the Mozart piece the IOI ofthe first right-hand event is only one eighth noteit is followed immediately by a note in the lefthand Intuitively what we want is the IOI of anote within that line of the texture we call thisthe registral IOI However separating the eventsof a texture into lines is a highly complex taskthat we will not address here Our solution to thisproblem which is crude but fairly effective is todefine the registral IOI of an event as the inter-on-set interval to the next event within a certainrange of pitch we adopt the value of ninesemitones However it sometimes proves usefulto use duration as well In Figure 3 the fact thatthe melody note is held makes it clear that it is ina separate line from the lower notes and hence istruly long despite its short registral IOI Takingall this into account we propose a measure of aneventrsquos length that is used for the purpose of thelength rule the length of a note is the maximumof its duration and its registral IOI

Lerdahl and Jackendoffrsquos theory neglects one ex-tremely important aspect of meter The GTTMrules state that beats at the ldquotactusrdquo level and im-mediately higher levels must be exactly evenlyspaced The tactus is the most salient level ofmeter usually corresponding to the main ldquobeatrdquoof the music in colloquial terms In actual perfor-mance however beats are of course not exactlyevenly spaced at any level There may be deliber-ate fluctuations in timing as well as errors andimperfections This is a crucial point because itmeans that we cannot simply infer the metricalstructure at the beginning of a piece and extrapo-late it metronomically through the rest of thepiece (even if we had the ability to do that) we

Figure 2 Mozartrsquos SonataK 332 first movementmeasures 1ndash5

Figure 3 A case where du-ration affects the perceivedlength of notes

Figure 2

Figure 3

14 Computer Music Journal

must continuously adjust the meter to fit the mu-sic that is heard (There is no doubt that listenerscan and do make these adjustments in general wehave little difficulty relocating the beat after afermata or ritard or even a complete change inmeter or tempo) In principle accommodating thisinto Lerdahl and Jackendoffrsquos theory is straightfor-ward we simply make the requirement for regu-larity of beats at each level into a preference rulerather than a well-formedness (hard-and-fast) ruleWe state this rule as follows

Regularity rulemdashprefer beats at each level tobe maximally evenly spaced

Maximizing the regularity of each level thus be-comes a flexible constraint to be balanced againstthe other preference rules

The regularity rule brings up an important fea-ture of our system In the past computationalanalysis of rhythm has generally been divided intotwo problems One is quantization the roundingoff of durations and time points in a piece to mul-tiples of a common beat (see for example Desainand Honing 1992) The other is metrical analysisimposing higher levels of beats on an existinglower level Models that assume a quantized input(such as Lerdahl and Jackendoffrsquos) are really onlyaddressing the second problem However an im-portant recent realization of music artificial intel-ligence has been that quantization and meterfinding are really part of the same process In im-posing a low level of beats on a piece of musicmarking the onsets of events one is in effect iden-tifying their position and duration in terms of inte-ger values of those beats Most notably Rosenthal(1992) proposed a model that accomplishes bothlow-level quantization and medium-level meterfinding Our system is similar in this respect per-forming both quantization and meter finding in asingle process

The three rules discussed abovemdashthe event rulethe length rule and the regularity rulemdashform thecore of the meter program It remains to be ex-plained how they are formalized and implementedThe basic idea is straightforward The system mustconsider all possible analyses of a piece where ananalysis is simply a well-formed metrical struc-

ture consisting of several rows of beats as out-lined above Each metrical preference rule assigns anumerical score to each analysis indicating howwell the analysis satisfies that rule The preferredanalysis is the one with the highest total score

For now let us consider the simpler situation ofderiving a single row of beats whose time intervalsmay vary over a broad range say 400ndash1600 msecthe typical range for the tactus level We will re-turn later to the problem of adding other levelsAn analysis then is a row of beats imposed on agiven piece The input representation is first quan-tized into very short segments of 35 msec whichwe call pips (the value of 35 msec was simplyfound to be optimal through trial and error) Everyevent onset and offset is rounded to the nearestpip beats also may occur only at the start of a pip

An analysis can then be evaluated in the follow-ing way Each pip containing a beat is given a nu-merical score indicating how many events haveonsets at that point This score is also weighted bythe length of the events starting there In sum-ming these note scores for all the pips we have anumerical representation of how well the analysissatisfies the event and length rules Each pip con-taining a beat is also given a score indicating howevenly spaced it is in the context of the previousbeats in that analysis There are various ways thatthis could be quantified but we have found a verysimple method that works well the regularity of abeat Bn is simply given by the absolute differencebetween the interval between Bn and Bnndash1 and thatbetween Bnndash1 and Bnndash2 This beat-interval scoretherefore acts as a penalty a representation is pre-ferred in which the beat-interval scores are mini-mized The preferred tactus level is the one whosetotal note score and (negative) beat-interval scoreis highest One complication here is that simplydefining the note score for a beat level as the sumof the note scores for all its beats gives an advan-tage to analyses with more beats Thus we weightthe note score of each beat by the square root of itsbeat interval (the interval to the previous beat)The square root was chosen simply because it is aconvenient function whose growth rate is betweena constant (which would be too small) and a linearfunction (which would be too large)

Temperley and Sleator 15

It is because of the beat-interval scores that thescores must be computed in terms of entire analy-ses The best beat location within (for example) a 1-sec segment depends in part on the beat-intervalpenalty for different pips but this penalty dependson the location of previous beats which in turn de-pends on their beat-interval scores (and their notescores) and so on In theory then all possibleanalyses must be considered to be sure of findingthe highest-scoring one overall However the num-ber of possible analyses increases exponentiallywith the number of pips in the piece There is there-fore a search problem to be solved of finding the cor-rect analysis without actually generating them allwe discuss later how we solve this problem

The preferred tactus level for a piece then isthe one that best satisfies the three preferencerules quantified in the way just described It re-mains to be explained how the other levels are de-rived The program begins with the tactus level itthen generates two levels above the tactus and twobelow Five levels prove to be sufficient for thegreat majority of pieces We have adopted the con-vention of calling the tactus level 2 the upper lev-els 3 and 4 and the lower levels 1 and 0 Thegeneration of the upper levels is straightforwardLevel 3 is generated in exactly the same way aslevel 2 with the added stipulation that every beatat level 3 must also be a beat at level 2 and ex-actly one or two level-2 beats must elapse betweeneach pair of level-3 beats Level 4 is then generatedfrom level 3 in the same manner There is no ex-plicit requirement for regularity in terms of the re-lationship between levels (duple or triple) but thistends to emerge naturally since there is pressureon each level to be regular in itself The lower lev-els are slightly more complicated At level 1 apossible duple or triple division is generated foreach level-2 beat The best choice between dupleand triple and the exact placing of the lower-levelbeats within the tactus beat is determined usingthe same preference rules described above How-ever there is also a penalty for switching fromduple to triple from one tactus beat to another Inthis way other things being equal there is a pref-erence for maintaining either duple or triple divi-sion once it is established

A Model of Harmony

Another essential aspect of musical structure isharmony Elsewhere one of us has proposed amodel for harmonic analysis (Temperley 1997)this is the basis for the system we present hereLike the metrical program the main input to theharmonic program is simply a ldquopiano rollrdquo a listof events with a time and duration for each event(although metrical information is also required aswe explain) The program produces a representa-tion of segments which we call chord spans la-beled with roots This is somewhat different fromconventional harmonic analysis or ldquoRoman-nu-meral analysisrdquo whereas a Roman-numeral analy-sis represents each root relative to the current keythe current program simply labels roots in abso-lute terms

The model we employ is again a preference-rule system giving a set of criteria for choosingthe preferred analysis out of all the possible onesA possible harmonic analysis is simply a completesegmentation of the piece where each segment islabeled with a root Again ldquoOh Susannahrdquo pro-vides a simple illustration of the programrsquos rulesWhat we would consider the correct analysis is in-dicated by the root symbols below the staff (Un-like metrical structuremdashwhich is usuallyindicated by the score of a piecemdashthe correct har-monic structure for a piece is to some extent amatter of opinion While there are certainly caseswhere people disagree about the correct analysisfor a piece there is a great deal of agreement aswell) Assuming the first segment consists of thefirst three measures (plus the opening pickup)why does C seem like the correct choice of roothere The most obvious reason is that C E and Gare all present and these are chord tones of C ma-jor One question is how we know which notes inthe segment are chord tones as opposed to orna-mental tones we will return to this But even oncethis is known the chord tones in a segment oftendo not uniquely identify a chord For examplemeasures 9ndash10 contain F and A these could implyeither an F-major chord or a D-minor chord al-though F major seems preferred To capture suchintuitions we stipulate that roots are preferred

16 Computer Music Journal

that result in certain pitch-root relationships butsome relationships are preferred to others We ex-press this in the following rule

Compatibility rulemdashprefer roots that resultin certain pitch-root relationships The fol-lowing relationships are preferred in this or-der 1 5 3 3 7 5 9 ornamental

In measures 9ndash10 a root of F is thus preferredover D given the pitches F and A F results inpitch-root relationships of 1 and 3 whereas D re-sults in 5 and 3

The compatibility rule says that an event thatdoes not have a chord-tone relationship with thecurrent root is ldquoornamentalrdquo However not justany event can be ornamental some events are bet-ter ornamental tones than others If one considersthe most common types of ornamental tonesmdashpassing tones neighbor tones suspensions and ap-poggiaturasmdashone finds that they all share acommon characteristic they are closely followedby another event a half-step or whole-step away inpitch In addition we have found that there is apreference for ornamental tones to be metricallyweak although sometimes they are metricallystrong (This of course requires access to metricalstructure we will return to this in a moment) In ascale passage for example the metrically strongnotes are more likely to be heard as chord tonesWe express these intuitions as follows

Ornamental dissonance rulemdashin labelingevents as ornamental prefer events that are(1) closely followed by another event a half-step or whole-step away and (2) metricallyweak

According to this rule the D in the pickup tomeasure 1 the A at the end of measure 1 and theD at the end of measure 2 are all good ornamentaldissonances since all are closely followedstepwise However the G in measure 2 (for ex-ample) is not closely followed stepwise If thisevent were treated as ornamentalmdashthat is if a root

were chosen for that segment of which G was nota chord tonemdashthe ornamental dissonance rulewould be severely violated

Now consider measure 4 According to thecompatibility rule D would be the preferred rootfor this segment (since the root D results in apitch-root relationship of 1 with pitch D) Why isG more preferred Intuitively G is a ldquocloserrdquo har-mony to C than D is and should therefore be pre-ferred in such cases Here we introduce a simplespatial model the line of fifths similar to thecircle of fifths but extending infinitely in eitherdirection as shown in Figure 4 Choosing a rootfor a segment is therefore a matter of mapping itonto a point on the line of fifths Given thismodel we can say that there is a preference forchoosing roots so that roots of nearby segmentsare close together on the line This leads to athird rule

Harmonic variance rulemdashprefer roots suchthat roots of nearby chord spans are close to-gether on the line of fifths

The line of fifths raises the issue of spellingMuch work in music theory (especially atonaltheory) and music cognition has assumed thatpitches are categorized into ldquopitch classesrdquo suchthat pitches one or more octave apart are membersof the same class However in conventional tonaltheory and music notation distinctions are madebetween different spellings of the same pitch egA versus G We call these tonal pitch classes(TPCs) as opposed to the twelve neutral pitchclasses of atonal theory It is our view that thesespelling distinctions are an important aspect oftonal harmony and must be represented (For adiscussion of this issue see Temperley 1997) Thisadds a further task for the program as well as di-viding the piece into segments labeled with rootsit must also choose a spelling label for each pitchevent hence mapping it onto the line of fifthsThe main criterion for doing this is a simple oneanalogous to the harmonic variance rule above

Figure 4 The line offifths

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

12 Computer Music Journal

forms a dotted-quarter level of beats and everysecond beat at that level forms a dotted-half (orldquoone-measurerdquo) level In addition there may alsobe one or two levels above the measure so-calledhypermetrical levels

Even assuming that a metrical structure mustinvolve several levels of equally spaced beats onemust still determine the duple or triple relation-ships between levels the tempo (the time inter-vals between beats) and the placing of the beatsrelative to the music For this purpose Lerdahland Jackendoff posit a set of metrical preferencerules stating the criteria whereby listeners inferthe correct structure Consider Figure 1 the tradi-tional American melody ldquoOh Susannahrdquo The cor-rect metrical structure is shown above the staff(The metrical and harmonic analysis shown hereis the one produced by our program with one dif-ference that we will explain) The most importantrule is that beats (especially higher-level or

ldquostrongrdquo beats) should whenever possible coincidewith the onsets of events Second there is a prefer-ence for strong beats to coincide with longerevents In ldquoOh Susannahrdquo for example this favorsplacing quarter-note beats on even-numberedeighth-note beats (the second fourth and so on)since this aligns them with the dotted eighthnotes Similarly this rule favors placing half-notebeats on odd-numbered quarter notes since thisaligns the long note in measure 4 with a half-notebeat We state these rules as follows (our wordingdiffers slightly from that in GTTM)

Event rulemdashprefer a structure that alignsbeats with event onsetsLength rulemdashprefer a structure that alignsstrong beats with onsets of longer events

Note that the preferred metrical structure is theone that is preferred on balance it may not be pre-ferred at every moment of the piece For example

Figure 1 ldquoOh Susannahrdquoshowing the correct metri-cal and harmonic struc-ture Each row of dotsindicates a level of themetrical structure eachdot indicates a beat on

the onset of the note be-low Each chord symbolrepresents a chord spanbeginning on the note be-low and extending to thebeginning of the followingchord span

Temperley and Sleator 13

the second A in measure 10 is a long note on afairly weak beat thus violating the length rule buton balance this structure is still the preferred one

When we turn to polyphonic music severalcomplications arise Consider Figure 2 the open-ing of Mozartrsquos Sonata K 332 The correct metri-cal structure is indicated by the time signaturewhy is this the perceived structure Clearly theeighth-note level is determined by the event ruleThe quarter-note level could also be explained bythe event rule on certain eighth-note beats wehave two event onsets not just one This brings upan important point about the event rule the moreevent onsets at a time point the better a beat loca-tion it is Regarding the dotted-half-note level onemight suppose that it was due to the long notes inthe right hand This raises the question of what ismeant by the ldquolengthrdquo of an event The actuallength of events is available in the input and thiscould be used However this is problematic In theMozart excerpt the long notes in the right handwould probably still seem long (and hence goodcandidates for strong beats) even if they wereplayed staccato Alternatively one might assumethat the length of a note corresponds to its inter-onset interval (IOI) which is the time interval be-tween the notersquos onset and the onset of thefollowing note (This is the approach taken bymost previous meter-finding programs) This ap-proach works fairly well in monophonic music (al-

though even there it encounters problems as wewill show) however it is totally unworkable inpolyphonic music In the Mozart piece the IOI ofthe first right-hand event is only one eighth noteit is followed immediately by a note in the lefthand Intuitively what we want is the IOI of anote within that line of the texture we call thisthe registral IOI However separating the eventsof a texture into lines is a highly complex taskthat we will not address here Our solution to thisproblem which is crude but fairly effective is todefine the registral IOI of an event as the inter-on-set interval to the next event within a certainrange of pitch we adopt the value of ninesemitones However it sometimes proves usefulto use duration as well In Figure 3 the fact thatthe melody note is held makes it clear that it is ina separate line from the lower notes and hence istruly long despite its short registral IOI Takingall this into account we propose a measure of aneventrsquos length that is used for the purpose of thelength rule the length of a note is the maximumof its duration and its registral IOI

Lerdahl and Jackendoffrsquos theory neglects one ex-tremely important aspect of meter The GTTMrules state that beats at the ldquotactusrdquo level and im-mediately higher levels must be exactly evenlyspaced The tactus is the most salient level ofmeter usually corresponding to the main ldquobeatrdquoof the music in colloquial terms In actual perfor-mance however beats are of course not exactlyevenly spaced at any level There may be deliber-ate fluctuations in timing as well as errors andimperfections This is a crucial point because itmeans that we cannot simply infer the metricalstructure at the beginning of a piece and extrapo-late it metronomically through the rest of thepiece (even if we had the ability to do that) we

Figure 2 Mozartrsquos SonataK 332 first movementmeasures 1ndash5

Figure 3 A case where du-ration affects the perceivedlength of notes

Figure 2

Figure 3

14 Computer Music Journal

must continuously adjust the meter to fit the mu-sic that is heard (There is no doubt that listenerscan and do make these adjustments in general wehave little difficulty relocating the beat after afermata or ritard or even a complete change inmeter or tempo) In principle accommodating thisinto Lerdahl and Jackendoffrsquos theory is straightfor-ward we simply make the requirement for regu-larity of beats at each level into a preference rulerather than a well-formedness (hard-and-fast) ruleWe state this rule as follows

Regularity rulemdashprefer beats at each level tobe maximally evenly spaced

Maximizing the regularity of each level thus be-comes a flexible constraint to be balanced againstthe other preference rules

The regularity rule brings up an important fea-ture of our system In the past computationalanalysis of rhythm has generally been divided intotwo problems One is quantization the roundingoff of durations and time points in a piece to mul-tiples of a common beat (see for example Desainand Honing 1992) The other is metrical analysisimposing higher levels of beats on an existinglower level Models that assume a quantized input(such as Lerdahl and Jackendoffrsquos) are really onlyaddressing the second problem However an im-portant recent realization of music artificial intel-ligence has been that quantization and meterfinding are really part of the same process In im-posing a low level of beats on a piece of musicmarking the onsets of events one is in effect iden-tifying their position and duration in terms of inte-ger values of those beats Most notably Rosenthal(1992) proposed a model that accomplishes bothlow-level quantization and medium-level meterfinding Our system is similar in this respect per-forming both quantization and meter finding in asingle process

The three rules discussed abovemdashthe event rulethe length rule and the regularity rulemdashform thecore of the meter program It remains to be ex-plained how they are formalized and implementedThe basic idea is straightforward The system mustconsider all possible analyses of a piece where ananalysis is simply a well-formed metrical struc-

ture consisting of several rows of beats as out-lined above Each metrical preference rule assigns anumerical score to each analysis indicating howwell the analysis satisfies that rule The preferredanalysis is the one with the highest total score

For now let us consider the simpler situation ofderiving a single row of beats whose time intervalsmay vary over a broad range say 400ndash1600 msecthe typical range for the tactus level We will re-turn later to the problem of adding other levelsAn analysis then is a row of beats imposed on agiven piece The input representation is first quan-tized into very short segments of 35 msec whichwe call pips (the value of 35 msec was simplyfound to be optimal through trial and error) Everyevent onset and offset is rounded to the nearestpip beats also may occur only at the start of a pip

An analysis can then be evaluated in the follow-ing way Each pip containing a beat is given a nu-merical score indicating how many events haveonsets at that point This score is also weighted bythe length of the events starting there In sum-ming these note scores for all the pips we have anumerical representation of how well the analysissatisfies the event and length rules Each pip con-taining a beat is also given a score indicating howevenly spaced it is in the context of the previousbeats in that analysis There are various ways thatthis could be quantified but we have found a verysimple method that works well the regularity of abeat Bn is simply given by the absolute differencebetween the interval between Bn and Bnndash1 and thatbetween Bnndash1 and Bnndash2 This beat-interval scoretherefore acts as a penalty a representation is pre-ferred in which the beat-interval scores are mini-mized The preferred tactus level is the one whosetotal note score and (negative) beat-interval scoreis highest One complication here is that simplydefining the note score for a beat level as the sumof the note scores for all its beats gives an advan-tage to analyses with more beats Thus we weightthe note score of each beat by the square root of itsbeat interval (the interval to the previous beat)The square root was chosen simply because it is aconvenient function whose growth rate is betweena constant (which would be too small) and a linearfunction (which would be too large)

Temperley and Sleator 15

It is because of the beat-interval scores that thescores must be computed in terms of entire analy-ses The best beat location within (for example) a 1-sec segment depends in part on the beat-intervalpenalty for different pips but this penalty dependson the location of previous beats which in turn de-pends on their beat-interval scores (and their notescores) and so on In theory then all possibleanalyses must be considered to be sure of findingthe highest-scoring one overall However the num-ber of possible analyses increases exponentiallywith the number of pips in the piece There is there-fore a search problem to be solved of finding the cor-rect analysis without actually generating them allwe discuss later how we solve this problem

The preferred tactus level for a piece then isthe one that best satisfies the three preferencerules quantified in the way just described It re-mains to be explained how the other levels are de-rived The program begins with the tactus level itthen generates two levels above the tactus and twobelow Five levels prove to be sufficient for thegreat majority of pieces We have adopted the con-vention of calling the tactus level 2 the upper lev-els 3 and 4 and the lower levels 1 and 0 Thegeneration of the upper levels is straightforwardLevel 3 is generated in exactly the same way aslevel 2 with the added stipulation that every beatat level 3 must also be a beat at level 2 and ex-actly one or two level-2 beats must elapse betweeneach pair of level-3 beats Level 4 is then generatedfrom level 3 in the same manner There is no ex-plicit requirement for regularity in terms of the re-lationship between levels (duple or triple) but thistends to emerge naturally since there is pressureon each level to be regular in itself The lower lev-els are slightly more complicated At level 1 apossible duple or triple division is generated foreach level-2 beat The best choice between dupleand triple and the exact placing of the lower-levelbeats within the tactus beat is determined usingthe same preference rules described above How-ever there is also a penalty for switching fromduple to triple from one tactus beat to another Inthis way other things being equal there is a pref-erence for maintaining either duple or triple divi-sion once it is established

A Model of Harmony

Another essential aspect of musical structure isharmony Elsewhere one of us has proposed amodel for harmonic analysis (Temperley 1997)this is the basis for the system we present hereLike the metrical program the main input to theharmonic program is simply a ldquopiano rollrdquo a listof events with a time and duration for each event(although metrical information is also required aswe explain) The program produces a representa-tion of segments which we call chord spans la-beled with roots This is somewhat different fromconventional harmonic analysis or ldquoRoman-nu-meral analysisrdquo whereas a Roman-numeral analy-sis represents each root relative to the current keythe current program simply labels roots in abso-lute terms

The model we employ is again a preference-rule system giving a set of criteria for choosingthe preferred analysis out of all the possible onesA possible harmonic analysis is simply a completesegmentation of the piece where each segment islabeled with a root Again ldquoOh Susannahrdquo pro-vides a simple illustration of the programrsquos rulesWhat we would consider the correct analysis is in-dicated by the root symbols below the staff (Un-like metrical structuremdashwhich is usuallyindicated by the score of a piecemdashthe correct har-monic structure for a piece is to some extent amatter of opinion While there are certainly caseswhere people disagree about the correct analysisfor a piece there is a great deal of agreement aswell) Assuming the first segment consists of thefirst three measures (plus the opening pickup)why does C seem like the correct choice of roothere The most obvious reason is that C E and Gare all present and these are chord tones of C ma-jor One question is how we know which notes inthe segment are chord tones as opposed to orna-mental tones we will return to this But even oncethis is known the chord tones in a segment oftendo not uniquely identify a chord For examplemeasures 9ndash10 contain F and A these could implyeither an F-major chord or a D-minor chord al-though F major seems preferred To capture suchintuitions we stipulate that roots are preferred

16 Computer Music Journal

that result in certain pitch-root relationships butsome relationships are preferred to others We ex-press this in the following rule

Compatibility rulemdashprefer roots that resultin certain pitch-root relationships The fol-lowing relationships are preferred in this or-der 1 5 3 3 7 5 9 ornamental

In measures 9ndash10 a root of F is thus preferredover D given the pitches F and A F results inpitch-root relationships of 1 and 3 whereas D re-sults in 5 and 3

The compatibility rule says that an event thatdoes not have a chord-tone relationship with thecurrent root is ldquoornamentalrdquo However not justany event can be ornamental some events are bet-ter ornamental tones than others If one considersthe most common types of ornamental tonesmdashpassing tones neighbor tones suspensions and ap-poggiaturasmdashone finds that they all share acommon characteristic they are closely followedby another event a half-step or whole-step away inpitch In addition we have found that there is apreference for ornamental tones to be metricallyweak although sometimes they are metricallystrong (This of course requires access to metricalstructure we will return to this in a moment) In ascale passage for example the metrically strongnotes are more likely to be heard as chord tonesWe express these intuitions as follows

Ornamental dissonance rulemdashin labelingevents as ornamental prefer events that are(1) closely followed by another event a half-step or whole-step away and (2) metricallyweak

According to this rule the D in the pickup tomeasure 1 the A at the end of measure 1 and theD at the end of measure 2 are all good ornamentaldissonances since all are closely followedstepwise However the G in measure 2 (for ex-ample) is not closely followed stepwise If thisevent were treated as ornamentalmdashthat is if a root

were chosen for that segment of which G was nota chord tonemdashthe ornamental dissonance rulewould be severely violated

Now consider measure 4 According to thecompatibility rule D would be the preferred rootfor this segment (since the root D results in apitch-root relationship of 1 with pitch D) Why isG more preferred Intuitively G is a ldquocloserrdquo har-mony to C than D is and should therefore be pre-ferred in such cases Here we introduce a simplespatial model the line of fifths similar to thecircle of fifths but extending infinitely in eitherdirection as shown in Figure 4 Choosing a rootfor a segment is therefore a matter of mapping itonto a point on the line of fifths Given thismodel we can say that there is a preference forchoosing roots so that roots of nearby segmentsare close together on the line This leads to athird rule

Harmonic variance rulemdashprefer roots suchthat roots of nearby chord spans are close to-gether on the line of fifths

The line of fifths raises the issue of spellingMuch work in music theory (especially atonaltheory) and music cognition has assumed thatpitches are categorized into ldquopitch classesrdquo suchthat pitches one or more octave apart are membersof the same class However in conventional tonaltheory and music notation distinctions are madebetween different spellings of the same pitch egA versus G We call these tonal pitch classes(TPCs) as opposed to the twelve neutral pitchclasses of atonal theory It is our view that thesespelling distinctions are an important aspect oftonal harmony and must be represented (For adiscussion of this issue see Temperley 1997) Thisadds a further task for the program as well as di-viding the piece into segments labeled with rootsit must also choose a spelling label for each pitchevent hence mapping it onto the line of fifthsThe main criterion for doing this is a simple oneanalogous to the harmonic variance rule above

Figure 4 The line offifths

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 13

the second A in measure 10 is a long note on afairly weak beat thus violating the length rule buton balance this structure is still the preferred one

When we turn to polyphonic music severalcomplications arise Consider Figure 2 the open-ing of Mozartrsquos Sonata K 332 The correct metri-cal structure is indicated by the time signaturewhy is this the perceived structure Clearly theeighth-note level is determined by the event ruleThe quarter-note level could also be explained bythe event rule on certain eighth-note beats wehave two event onsets not just one This brings upan important point about the event rule the moreevent onsets at a time point the better a beat loca-tion it is Regarding the dotted-half-note level onemight suppose that it was due to the long notes inthe right hand This raises the question of what ismeant by the ldquolengthrdquo of an event The actuallength of events is available in the input and thiscould be used However this is problematic In theMozart excerpt the long notes in the right handwould probably still seem long (and hence goodcandidates for strong beats) even if they wereplayed staccato Alternatively one might assumethat the length of a note corresponds to its inter-onset interval (IOI) which is the time interval be-tween the notersquos onset and the onset of thefollowing note (This is the approach taken bymost previous meter-finding programs) This ap-proach works fairly well in monophonic music (al-

though even there it encounters problems as wewill show) however it is totally unworkable inpolyphonic music In the Mozart piece the IOI ofthe first right-hand event is only one eighth noteit is followed immediately by a note in the lefthand Intuitively what we want is the IOI of anote within that line of the texture we call thisthe registral IOI However separating the eventsof a texture into lines is a highly complex taskthat we will not address here Our solution to thisproblem which is crude but fairly effective is todefine the registral IOI of an event as the inter-on-set interval to the next event within a certainrange of pitch we adopt the value of ninesemitones However it sometimes proves usefulto use duration as well In Figure 3 the fact thatthe melody note is held makes it clear that it is ina separate line from the lower notes and hence istruly long despite its short registral IOI Takingall this into account we propose a measure of aneventrsquos length that is used for the purpose of thelength rule the length of a note is the maximumof its duration and its registral IOI

Lerdahl and Jackendoffrsquos theory neglects one ex-tremely important aspect of meter The GTTMrules state that beats at the ldquotactusrdquo level and im-mediately higher levels must be exactly evenlyspaced The tactus is the most salient level ofmeter usually corresponding to the main ldquobeatrdquoof the music in colloquial terms In actual perfor-mance however beats are of course not exactlyevenly spaced at any level There may be deliber-ate fluctuations in timing as well as errors andimperfections This is a crucial point because itmeans that we cannot simply infer the metricalstructure at the beginning of a piece and extrapo-late it metronomically through the rest of thepiece (even if we had the ability to do that) we

Figure 2 Mozartrsquos SonataK 332 first movementmeasures 1ndash5

Figure 3 A case where du-ration affects the perceivedlength of notes

Figure 2

Figure 3

14 Computer Music Journal

must continuously adjust the meter to fit the mu-sic that is heard (There is no doubt that listenerscan and do make these adjustments in general wehave little difficulty relocating the beat after afermata or ritard or even a complete change inmeter or tempo) In principle accommodating thisinto Lerdahl and Jackendoffrsquos theory is straightfor-ward we simply make the requirement for regu-larity of beats at each level into a preference rulerather than a well-formedness (hard-and-fast) ruleWe state this rule as follows

Regularity rulemdashprefer beats at each level tobe maximally evenly spaced

Maximizing the regularity of each level thus be-comes a flexible constraint to be balanced againstthe other preference rules

The regularity rule brings up an important fea-ture of our system In the past computationalanalysis of rhythm has generally been divided intotwo problems One is quantization the roundingoff of durations and time points in a piece to mul-tiples of a common beat (see for example Desainand Honing 1992) The other is metrical analysisimposing higher levels of beats on an existinglower level Models that assume a quantized input(such as Lerdahl and Jackendoffrsquos) are really onlyaddressing the second problem However an im-portant recent realization of music artificial intel-ligence has been that quantization and meterfinding are really part of the same process In im-posing a low level of beats on a piece of musicmarking the onsets of events one is in effect iden-tifying their position and duration in terms of inte-ger values of those beats Most notably Rosenthal(1992) proposed a model that accomplishes bothlow-level quantization and medium-level meterfinding Our system is similar in this respect per-forming both quantization and meter finding in asingle process

The three rules discussed abovemdashthe event rulethe length rule and the regularity rulemdashform thecore of the meter program It remains to be ex-plained how they are formalized and implementedThe basic idea is straightforward The system mustconsider all possible analyses of a piece where ananalysis is simply a well-formed metrical struc-

ture consisting of several rows of beats as out-lined above Each metrical preference rule assigns anumerical score to each analysis indicating howwell the analysis satisfies that rule The preferredanalysis is the one with the highest total score

For now let us consider the simpler situation ofderiving a single row of beats whose time intervalsmay vary over a broad range say 400ndash1600 msecthe typical range for the tactus level We will re-turn later to the problem of adding other levelsAn analysis then is a row of beats imposed on agiven piece The input representation is first quan-tized into very short segments of 35 msec whichwe call pips (the value of 35 msec was simplyfound to be optimal through trial and error) Everyevent onset and offset is rounded to the nearestpip beats also may occur only at the start of a pip

An analysis can then be evaluated in the follow-ing way Each pip containing a beat is given a nu-merical score indicating how many events haveonsets at that point This score is also weighted bythe length of the events starting there In sum-ming these note scores for all the pips we have anumerical representation of how well the analysissatisfies the event and length rules Each pip con-taining a beat is also given a score indicating howevenly spaced it is in the context of the previousbeats in that analysis There are various ways thatthis could be quantified but we have found a verysimple method that works well the regularity of abeat Bn is simply given by the absolute differencebetween the interval between Bn and Bnndash1 and thatbetween Bnndash1 and Bnndash2 This beat-interval scoretherefore acts as a penalty a representation is pre-ferred in which the beat-interval scores are mini-mized The preferred tactus level is the one whosetotal note score and (negative) beat-interval scoreis highest One complication here is that simplydefining the note score for a beat level as the sumof the note scores for all its beats gives an advan-tage to analyses with more beats Thus we weightthe note score of each beat by the square root of itsbeat interval (the interval to the previous beat)The square root was chosen simply because it is aconvenient function whose growth rate is betweena constant (which would be too small) and a linearfunction (which would be too large)

Temperley and Sleator 15

It is because of the beat-interval scores that thescores must be computed in terms of entire analy-ses The best beat location within (for example) a 1-sec segment depends in part on the beat-intervalpenalty for different pips but this penalty dependson the location of previous beats which in turn de-pends on their beat-interval scores (and their notescores) and so on In theory then all possibleanalyses must be considered to be sure of findingthe highest-scoring one overall However the num-ber of possible analyses increases exponentiallywith the number of pips in the piece There is there-fore a search problem to be solved of finding the cor-rect analysis without actually generating them allwe discuss later how we solve this problem

The preferred tactus level for a piece then isthe one that best satisfies the three preferencerules quantified in the way just described It re-mains to be explained how the other levels are de-rived The program begins with the tactus level itthen generates two levels above the tactus and twobelow Five levels prove to be sufficient for thegreat majority of pieces We have adopted the con-vention of calling the tactus level 2 the upper lev-els 3 and 4 and the lower levels 1 and 0 Thegeneration of the upper levels is straightforwardLevel 3 is generated in exactly the same way aslevel 2 with the added stipulation that every beatat level 3 must also be a beat at level 2 and ex-actly one or two level-2 beats must elapse betweeneach pair of level-3 beats Level 4 is then generatedfrom level 3 in the same manner There is no ex-plicit requirement for regularity in terms of the re-lationship between levels (duple or triple) but thistends to emerge naturally since there is pressureon each level to be regular in itself The lower lev-els are slightly more complicated At level 1 apossible duple or triple division is generated foreach level-2 beat The best choice between dupleand triple and the exact placing of the lower-levelbeats within the tactus beat is determined usingthe same preference rules described above How-ever there is also a penalty for switching fromduple to triple from one tactus beat to another Inthis way other things being equal there is a pref-erence for maintaining either duple or triple divi-sion once it is established

A Model of Harmony

Another essential aspect of musical structure isharmony Elsewhere one of us has proposed amodel for harmonic analysis (Temperley 1997)this is the basis for the system we present hereLike the metrical program the main input to theharmonic program is simply a ldquopiano rollrdquo a listof events with a time and duration for each event(although metrical information is also required aswe explain) The program produces a representa-tion of segments which we call chord spans la-beled with roots This is somewhat different fromconventional harmonic analysis or ldquoRoman-nu-meral analysisrdquo whereas a Roman-numeral analy-sis represents each root relative to the current keythe current program simply labels roots in abso-lute terms

The model we employ is again a preference-rule system giving a set of criteria for choosingthe preferred analysis out of all the possible onesA possible harmonic analysis is simply a completesegmentation of the piece where each segment islabeled with a root Again ldquoOh Susannahrdquo pro-vides a simple illustration of the programrsquos rulesWhat we would consider the correct analysis is in-dicated by the root symbols below the staff (Un-like metrical structuremdashwhich is usuallyindicated by the score of a piecemdashthe correct har-monic structure for a piece is to some extent amatter of opinion While there are certainly caseswhere people disagree about the correct analysisfor a piece there is a great deal of agreement aswell) Assuming the first segment consists of thefirst three measures (plus the opening pickup)why does C seem like the correct choice of roothere The most obvious reason is that C E and Gare all present and these are chord tones of C ma-jor One question is how we know which notes inthe segment are chord tones as opposed to orna-mental tones we will return to this But even oncethis is known the chord tones in a segment oftendo not uniquely identify a chord For examplemeasures 9ndash10 contain F and A these could implyeither an F-major chord or a D-minor chord al-though F major seems preferred To capture suchintuitions we stipulate that roots are preferred

16 Computer Music Journal

that result in certain pitch-root relationships butsome relationships are preferred to others We ex-press this in the following rule

Compatibility rulemdashprefer roots that resultin certain pitch-root relationships The fol-lowing relationships are preferred in this or-der 1 5 3 3 7 5 9 ornamental

In measures 9ndash10 a root of F is thus preferredover D given the pitches F and A F results inpitch-root relationships of 1 and 3 whereas D re-sults in 5 and 3

The compatibility rule says that an event thatdoes not have a chord-tone relationship with thecurrent root is ldquoornamentalrdquo However not justany event can be ornamental some events are bet-ter ornamental tones than others If one considersthe most common types of ornamental tonesmdashpassing tones neighbor tones suspensions and ap-poggiaturasmdashone finds that they all share acommon characteristic they are closely followedby another event a half-step or whole-step away inpitch In addition we have found that there is apreference for ornamental tones to be metricallyweak although sometimes they are metricallystrong (This of course requires access to metricalstructure we will return to this in a moment) In ascale passage for example the metrically strongnotes are more likely to be heard as chord tonesWe express these intuitions as follows

Ornamental dissonance rulemdashin labelingevents as ornamental prefer events that are(1) closely followed by another event a half-step or whole-step away and (2) metricallyweak

According to this rule the D in the pickup tomeasure 1 the A at the end of measure 1 and theD at the end of measure 2 are all good ornamentaldissonances since all are closely followedstepwise However the G in measure 2 (for ex-ample) is not closely followed stepwise If thisevent were treated as ornamentalmdashthat is if a root

were chosen for that segment of which G was nota chord tonemdashthe ornamental dissonance rulewould be severely violated

Now consider measure 4 According to thecompatibility rule D would be the preferred rootfor this segment (since the root D results in apitch-root relationship of 1 with pitch D) Why isG more preferred Intuitively G is a ldquocloserrdquo har-mony to C than D is and should therefore be pre-ferred in such cases Here we introduce a simplespatial model the line of fifths similar to thecircle of fifths but extending infinitely in eitherdirection as shown in Figure 4 Choosing a rootfor a segment is therefore a matter of mapping itonto a point on the line of fifths Given thismodel we can say that there is a preference forchoosing roots so that roots of nearby segmentsare close together on the line This leads to athird rule

Harmonic variance rulemdashprefer roots suchthat roots of nearby chord spans are close to-gether on the line of fifths

The line of fifths raises the issue of spellingMuch work in music theory (especially atonaltheory) and music cognition has assumed thatpitches are categorized into ldquopitch classesrdquo suchthat pitches one or more octave apart are membersof the same class However in conventional tonaltheory and music notation distinctions are madebetween different spellings of the same pitch egA versus G We call these tonal pitch classes(TPCs) as opposed to the twelve neutral pitchclasses of atonal theory It is our view that thesespelling distinctions are an important aspect oftonal harmony and must be represented (For adiscussion of this issue see Temperley 1997) Thisadds a further task for the program as well as di-viding the piece into segments labeled with rootsit must also choose a spelling label for each pitchevent hence mapping it onto the line of fifthsThe main criterion for doing this is a simple oneanalogous to the harmonic variance rule above

Figure 4 The line offifths

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

14 Computer Music Journal

must continuously adjust the meter to fit the mu-sic that is heard (There is no doubt that listenerscan and do make these adjustments in general wehave little difficulty relocating the beat after afermata or ritard or even a complete change inmeter or tempo) In principle accommodating thisinto Lerdahl and Jackendoffrsquos theory is straightfor-ward we simply make the requirement for regu-larity of beats at each level into a preference rulerather than a well-formedness (hard-and-fast) ruleWe state this rule as follows

Regularity rulemdashprefer beats at each level tobe maximally evenly spaced

Maximizing the regularity of each level thus be-comes a flexible constraint to be balanced againstthe other preference rules

The regularity rule brings up an important fea-ture of our system In the past computationalanalysis of rhythm has generally been divided intotwo problems One is quantization the roundingoff of durations and time points in a piece to mul-tiples of a common beat (see for example Desainand Honing 1992) The other is metrical analysisimposing higher levels of beats on an existinglower level Models that assume a quantized input(such as Lerdahl and Jackendoffrsquos) are really onlyaddressing the second problem However an im-portant recent realization of music artificial intel-ligence has been that quantization and meterfinding are really part of the same process In im-posing a low level of beats on a piece of musicmarking the onsets of events one is in effect iden-tifying their position and duration in terms of inte-ger values of those beats Most notably Rosenthal(1992) proposed a model that accomplishes bothlow-level quantization and medium-level meterfinding Our system is similar in this respect per-forming both quantization and meter finding in asingle process

The three rules discussed abovemdashthe event rulethe length rule and the regularity rulemdashform thecore of the meter program It remains to be ex-plained how they are formalized and implementedThe basic idea is straightforward The system mustconsider all possible analyses of a piece where ananalysis is simply a well-formed metrical struc-

ture consisting of several rows of beats as out-lined above Each metrical preference rule assigns anumerical score to each analysis indicating howwell the analysis satisfies that rule The preferredanalysis is the one with the highest total score

For now let us consider the simpler situation ofderiving a single row of beats whose time intervalsmay vary over a broad range say 400ndash1600 msecthe typical range for the tactus level We will re-turn later to the problem of adding other levelsAn analysis then is a row of beats imposed on agiven piece The input representation is first quan-tized into very short segments of 35 msec whichwe call pips (the value of 35 msec was simplyfound to be optimal through trial and error) Everyevent onset and offset is rounded to the nearestpip beats also may occur only at the start of a pip

An analysis can then be evaluated in the follow-ing way Each pip containing a beat is given a nu-merical score indicating how many events haveonsets at that point This score is also weighted bythe length of the events starting there In sum-ming these note scores for all the pips we have anumerical representation of how well the analysissatisfies the event and length rules Each pip con-taining a beat is also given a score indicating howevenly spaced it is in the context of the previousbeats in that analysis There are various ways thatthis could be quantified but we have found a verysimple method that works well the regularity of abeat Bn is simply given by the absolute differencebetween the interval between Bn and Bnndash1 and thatbetween Bnndash1 and Bnndash2 This beat-interval scoretherefore acts as a penalty a representation is pre-ferred in which the beat-interval scores are mini-mized The preferred tactus level is the one whosetotal note score and (negative) beat-interval scoreis highest One complication here is that simplydefining the note score for a beat level as the sumof the note scores for all its beats gives an advan-tage to analyses with more beats Thus we weightthe note score of each beat by the square root of itsbeat interval (the interval to the previous beat)The square root was chosen simply because it is aconvenient function whose growth rate is betweena constant (which would be too small) and a linearfunction (which would be too large)

Temperley and Sleator 15

It is because of the beat-interval scores that thescores must be computed in terms of entire analy-ses The best beat location within (for example) a 1-sec segment depends in part on the beat-intervalpenalty for different pips but this penalty dependson the location of previous beats which in turn de-pends on their beat-interval scores (and their notescores) and so on In theory then all possibleanalyses must be considered to be sure of findingthe highest-scoring one overall However the num-ber of possible analyses increases exponentiallywith the number of pips in the piece There is there-fore a search problem to be solved of finding the cor-rect analysis without actually generating them allwe discuss later how we solve this problem

The preferred tactus level for a piece then isthe one that best satisfies the three preferencerules quantified in the way just described It re-mains to be explained how the other levels are de-rived The program begins with the tactus level itthen generates two levels above the tactus and twobelow Five levels prove to be sufficient for thegreat majority of pieces We have adopted the con-vention of calling the tactus level 2 the upper lev-els 3 and 4 and the lower levels 1 and 0 Thegeneration of the upper levels is straightforwardLevel 3 is generated in exactly the same way aslevel 2 with the added stipulation that every beatat level 3 must also be a beat at level 2 and ex-actly one or two level-2 beats must elapse betweeneach pair of level-3 beats Level 4 is then generatedfrom level 3 in the same manner There is no ex-plicit requirement for regularity in terms of the re-lationship between levels (duple or triple) but thistends to emerge naturally since there is pressureon each level to be regular in itself The lower lev-els are slightly more complicated At level 1 apossible duple or triple division is generated foreach level-2 beat The best choice between dupleand triple and the exact placing of the lower-levelbeats within the tactus beat is determined usingthe same preference rules described above How-ever there is also a penalty for switching fromduple to triple from one tactus beat to another Inthis way other things being equal there is a pref-erence for maintaining either duple or triple divi-sion once it is established

A Model of Harmony

Another essential aspect of musical structure isharmony Elsewhere one of us has proposed amodel for harmonic analysis (Temperley 1997)this is the basis for the system we present hereLike the metrical program the main input to theharmonic program is simply a ldquopiano rollrdquo a listof events with a time and duration for each event(although metrical information is also required aswe explain) The program produces a representa-tion of segments which we call chord spans la-beled with roots This is somewhat different fromconventional harmonic analysis or ldquoRoman-nu-meral analysisrdquo whereas a Roman-numeral analy-sis represents each root relative to the current keythe current program simply labels roots in abso-lute terms

The model we employ is again a preference-rule system giving a set of criteria for choosingthe preferred analysis out of all the possible onesA possible harmonic analysis is simply a completesegmentation of the piece where each segment islabeled with a root Again ldquoOh Susannahrdquo pro-vides a simple illustration of the programrsquos rulesWhat we would consider the correct analysis is in-dicated by the root symbols below the staff (Un-like metrical structuremdashwhich is usuallyindicated by the score of a piecemdashthe correct har-monic structure for a piece is to some extent amatter of opinion While there are certainly caseswhere people disagree about the correct analysisfor a piece there is a great deal of agreement aswell) Assuming the first segment consists of thefirst three measures (plus the opening pickup)why does C seem like the correct choice of roothere The most obvious reason is that C E and Gare all present and these are chord tones of C ma-jor One question is how we know which notes inthe segment are chord tones as opposed to orna-mental tones we will return to this But even oncethis is known the chord tones in a segment oftendo not uniquely identify a chord For examplemeasures 9ndash10 contain F and A these could implyeither an F-major chord or a D-minor chord al-though F major seems preferred To capture suchintuitions we stipulate that roots are preferred

16 Computer Music Journal

that result in certain pitch-root relationships butsome relationships are preferred to others We ex-press this in the following rule

Compatibility rulemdashprefer roots that resultin certain pitch-root relationships The fol-lowing relationships are preferred in this or-der 1 5 3 3 7 5 9 ornamental

In measures 9ndash10 a root of F is thus preferredover D given the pitches F and A F results inpitch-root relationships of 1 and 3 whereas D re-sults in 5 and 3

The compatibility rule says that an event thatdoes not have a chord-tone relationship with thecurrent root is ldquoornamentalrdquo However not justany event can be ornamental some events are bet-ter ornamental tones than others If one considersthe most common types of ornamental tonesmdashpassing tones neighbor tones suspensions and ap-poggiaturasmdashone finds that they all share acommon characteristic they are closely followedby another event a half-step or whole-step away inpitch In addition we have found that there is apreference for ornamental tones to be metricallyweak although sometimes they are metricallystrong (This of course requires access to metricalstructure we will return to this in a moment) In ascale passage for example the metrically strongnotes are more likely to be heard as chord tonesWe express these intuitions as follows

Ornamental dissonance rulemdashin labelingevents as ornamental prefer events that are(1) closely followed by another event a half-step or whole-step away and (2) metricallyweak

According to this rule the D in the pickup tomeasure 1 the A at the end of measure 1 and theD at the end of measure 2 are all good ornamentaldissonances since all are closely followedstepwise However the G in measure 2 (for ex-ample) is not closely followed stepwise If thisevent were treated as ornamentalmdashthat is if a root

were chosen for that segment of which G was nota chord tonemdashthe ornamental dissonance rulewould be severely violated

Now consider measure 4 According to thecompatibility rule D would be the preferred rootfor this segment (since the root D results in apitch-root relationship of 1 with pitch D) Why isG more preferred Intuitively G is a ldquocloserrdquo har-mony to C than D is and should therefore be pre-ferred in such cases Here we introduce a simplespatial model the line of fifths similar to thecircle of fifths but extending infinitely in eitherdirection as shown in Figure 4 Choosing a rootfor a segment is therefore a matter of mapping itonto a point on the line of fifths Given thismodel we can say that there is a preference forchoosing roots so that roots of nearby segmentsare close together on the line This leads to athird rule

Harmonic variance rulemdashprefer roots suchthat roots of nearby chord spans are close to-gether on the line of fifths

The line of fifths raises the issue of spellingMuch work in music theory (especially atonaltheory) and music cognition has assumed thatpitches are categorized into ldquopitch classesrdquo suchthat pitches one or more octave apart are membersof the same class However in conventional tonaltheory and music notation distinctions are madebetween different spellings of the same pitch egA versus G We call these tonal pitch classes(TPCs) as opposed to the twelve neutral pitchclasses of atonal theory It is our view that thesespelling distinctions are an important aspect oftonal harmony and must be represented (For adiscussion of this issue see Temperley 1997) Thisadds a further task for the program as well as di-viding the piece into segments labeled with rootsit must also choose a spelling label for each pitchevent hence mapping it onto the line of fifthsThe main criterion for doing this is a simple oneanalogous to the harmonic variance rule above

Figure 4 The line offifths

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 15

It is because of the beat-interval scores that thescores must be computed in terms of entire analy-ses The best beat location within (for example) a 1-sec segment depends in part on the beat-intervalpenalty for different pips but this penalty dependson the location of previous beats which in turn de-pends on their beat-interval scores (and their notescores) and so on In theory then all possibleanalyses must be considered to be sure of findingthe highest-scoring one overall However the num-ber of possible analyses increases exponentiallywith the number of pips in the piece There is there-fore a search problem to be solved of finding the cor-rect analysis without actually generating them allwe discuss later how we solve this problem

The preferred tactus level for a piece then isthe one that best satisfies the three preferencerules quantified in the way just described It re-mains to be explained how the other levels are de-rived The program begins with the tactus level itthen generates two levels above the tactus and twobelow Five levels prove to be sufficient for thegreat majority of pieces We have adopted the con-vention of calling the tactus level 2 the upper lev-els 3 and 4 and the lower levels 1 and 0 Thegeneration of the upper levels is straightforwardLevel 3 is generated in exactly the same way aslevel 2 with the added stipulation that every beatat level 3 must also be a beat at level 2 and ex-actly one or two level-2 beats must elapse betweeneach pair of level-3 beats Level 4 is then generatedfrom level 3 in the same manner There is no ex-plicit requirement for regularity in terms of the re-lationship between levels (duple or triple) but thistends to emerge naturally since there is pressureon each level to be regular in itself The lower lev-els are slightly more complicated At level 1 apossible duple or triple division is generated foreach level-2 beat The best choice between dupleand triple and the exact placing of the lower-levelbeats within the tactus beat is determined usingthe same preference rules described above How-ever there is also a penalty for switching fromduple to triple from one tactus beat to another Inthis way other things being equal there is a pref-erence for maintaining either duple or triple divi-sion once it is established

A Model of Harmony

Another essential aspect of musical structure isharmony Elsewhere one of us has proposed amodel for harmonic analysis (Temperley 1997)this is the basis for the system we present hereLike the metrical program the main input to theharmonic program is simply a ldquopiano rollrdquo a listof events with a time and duration for each event(although metrical information is also required aswe explain) The program produces a representa-tion of segments which we call chord spans la-beled with roots This is somewhat different fromconventional harmonic analysis or ldquoRoman-nu-meral analysisrdquo whereas a Roman-numeral analy-sis represents each root relative to the current keythe current program simply labels roots in abso-lute terms

The model we employ is again a preference-rule system giving a set of criteria for choosingthe preferred analysis out of all the possible onesA possible harmonic analysis is simply a completesegmentation of the piece where each segment islabeled with a root Again ldquoOh Susannahrdquo pro-vides a simple illustration of the programrsquos rulesWhat we would consider the correct analysis is in-dicated by the root symbols below the staff (Un-like metrical structuremdashwhich is usuallyindicated by the score of a piecemdashthe correct har-monic structure for a piece is to some extent amatter of opinion While there are certainly caseswhere people disagree about the correct analysisfor a piece there is a great deal of agreement aswell) Assuming the first segment consists of thefirst three measures (plus the opening pickup)why does C seem like the correct choice of roothere The most obvious reason is that C E and Gare all present and these are chord tones of C ma-jor One question is how we know which notes inthe segment are chord tones as opposed to orna-mental tones we will return to this But even oncethis is known the chord tones in a segment oftendo not uniquely identify a chord For examplemeasures 9ndash10 contain F and A these could implyeither an F-major chord or a D-minor chord al-though F major seems preferred To capture suchintuitions we stipulate that roots are preferred

16 Computer Music Journal

that result in certain pitch-root relationships butsome relationships are preferred to others We ex-press this in the following rule

Compatibility rulemdashprefer roots that resultin certain pitch-root relationships The fol-lowing relationships are preferred in this or-der 1 5 3 3 7 5 9 ornamental

In measures 9ndash10 a root of F is thus preferredover D given the pitches F and A F results inpitch-root relationships of 1 and 3 whereas D re-sults in 5 and 3

The compatibility rule says that an event thatdoes not have a chord-tone relationship with thecurrent root is ldquoornamentalrdquo However not justany event can be ornamental some events are bet-ter ornamental tones than others If one considersthe most common types of ornamental tonesmdashpassing tones neighbor tones suspensions and ap-poggiaturasmdashone finds that they all share acommon characteristic they are closely followedby another event a half-step or whole-step away inpitch In addition we have found that there is apreference for ornamental tones to be metricallyweak although sometimes they are metricallystrong (This of course requires access to metricalstructure we will return to this in a moment) In ascale passage for example the metrically strongnotes are more likely to be heard as chord tonesWe express these intuitions as follows

Ornamental dissonance rulemdashin labelingevents as ornamental prefer events that are(1) closely followed by another event a half-step or whole-step away and (2) metricallyweak

According to this rule the D in the pickup tomeasure 1 the A at the end of measure 1 and theD at the end of measure 2 are all good ornamentaldissonances since all are closely followedstepwise However the G in measure 2 (for ex-ample) is not closely followed stepwise If thisevent were treated as ornamentalmdashthat is if a root

were chosen for that segment of which G was nota chord tonemdashthe ornamental dissonance rulewould be severely violated

Now consider measure 4 According to thecompatibility rule D would be the preferred rootfor this segment (since the root D results in apitch-root relationship of 1 with pitch D) Why isG more preferred Intuitively G is a ldquocloserrdquo har-mony to C than D is and should therefore be pre-ferred in such cases Here we introduce a simplespatial model the line of fifths similar to thecircle of fifths but extending infinitely in eitherdirection as shown in Figure 4 Choosing a rootfor a segment is therefore a matter of mapping itonto a point on the line of fifths Given thismodel we can say that there is a preference forchoosing roots so that roots of nearby segmentsare close together on the line This leads to athird rule

Harmonic variance rulemdashprefer roots suchthat roots of nearby chord spans are close to-gether on the line of fifths

The line of fifths raises the issue of spellingMuch work in music theory (especially atonaltheory) and music cognition has assumed thatpitches are categorized into ldquopitch classesrdquo suchthat pitches one or more octave apart are membersof the same class However in conventional tonaltheory and music notation distinctions are madebetween different spellings of the same pitch egA versus G We call these tonal pitch classes(TPCs) as opposed to the twelve neutral pitchclasses of atonal theory It is our view that thesespelling distinctions are an important aspect oftonal harmony and must be represented (For adiscussion of this issue see Temperley 1997) Thisadds a further task for the program as well as di-viding the piece into segments labeled with rootsit must also choose a spelling label for each pitchevent hence mapping it onto the line of fifthsThe main criterion for doing this is a simple oneanalogous to the harmonic variance rule above

Figure 4 The line offifths

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

16 Computer Music Journal

that result in certain pitch-root relationships butsome relationships are preferred to others We ex-press this in the following rule

Compatibility rulemdashprefer roots that resultin certain pitch-root relationships The fol-lowing relationships are preferred in this or-der 1 5 3 3 7 5 9 ornamental

In measures 9ndash10 a root of F is thus preferredover D given the pitches F and A F results inpitch-root relationships of 1 and 3 whereas D re-sults in 5 and 3

The compatibility rule says that an event thatdoes not have a chord-tone relationship with thecurrent root is ldquoornamentalrdquo However not justany event can be ornamental some events are bet-ter ornamental tones than others If one considersthe most common types of ornamental tonesmdashpassing tones neighbor tones suspensions and ap-poggiaturasmdashone finds that they all share acommon characteristic they are closely followedby another event a half-step or whole-step away inpitch In addition we have found that there is apreference for ornamental tones to be metricallyweak although sometimes they are metricallystrong (This of course requires access to metricalstructure we will return to this in a moment) In ascale passage for example the metrically strongnotes are more likely to be heard as chord tonesWe express these intuitions as follows

Ornamental dissonance rulemdashin labelingevents as ornamental prefer events that are(1) closely followed by another event a half-step or whole-step away and (2) metricallyweak

According to this rule the D in the pickup tomeasure 1 the A at the end of measure 1 and theD at the end of measure 2 are all good ornamentaldissonances since all are closely followedstepwise However the G in measure 2 (for ex-ample) is not closely followed stepwise If thisevent were treated as ornamentalmdashthat is if a root

were chosen for that segment of which G was nota chord tonemdashthe ornamental dissonance rulewould be severely violated

Now consider measure 4 According to thecompatibility rule D would be the preferred rootfor this segment (since the root D results in apitch-root relationship of 1 with pitch D) Why isG more preferred Intuitively G is a ldquocloserrdquo har-mony to C than D is and should therefore be pre-ferred in such cases Here we introduce a simplespatial model the line of fifths similar to thecircle of fifths but extending infinitely in eitherdirection as shown in Figure 4 Choosing a rootfor a segment is therefore a matter of mapping itonto a point on the line of fifths Given thismodel we can say that there is a preference forchoosing roots so that roots of nearby segmentsare close together on the line This leads to athird rule

Harmonic variance rulemdashprefer roots suchthat roots of nearby chord spans are close to-gether on the line of fifths

The line of fifths raises the issue of spellingMuch work in music theory (especially atonaltheory) and music cognition has assumed thatpitches are categorized into ldquopitch classesrdquo suchthat pitches one or more octave apart are membersof the same class However in conventional tonaltheory and music notation distinctions are madebetween different spellings of the same pitch egA versus G We call these tonal pitch classes(TPCs) as opposed to the twelve neutral pitchclasses of atonal theory It is our view that thesespelling distinctions are an important aspect oftonal harmony and must be represented (For adiscussion of this issue see Temperley 1997) Thisadds a further task for the program as well as di-viding the piece into segments labeled with rootsit must also choose a spelling label for each pitchevent hence mapping it onto the line of fifthsThe main criterion for doing this is a simple oneanalogous to the harmonic variance rule above

Figure 4 The line offifths

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 17

Pitch-variance rulemdashprefer spellings for pitchevents such that nearby events are close to-gether on the line of fifths

The representation of pitch spellingsmdashwhich wecall the TPC representationmdashis closely integratedwith the harmonic representation In the firstplace the compatibility rule which stipulates thepreferred pitch-root relationships refers to tonalpitch classes not neutral ones That is if a seg-ment contains A -C-E the preferred root for thatsegment is A not G However the program seeksthe overall representationmdashof both roots andpitchesmdashthat maximally satisfies all the rules theharmonic rules and the pitch-variance rule Insome cases harmonic considerations may force apitch spelling that would be less preferred giventhe pitch-variance rule alone In this way har-monic considerations ldquofeed backrdquo to influencepitch spelling

A final harmonic rule concerns the division ofthe piece into chord spans We have generally as-sumed chord spans of one measure or more butwhy do we assume this Alternatively one couldposit a separate chord span for each note In gen-eral there seems to be a preference for longer chordspans or at least an avoidance of very short onesHowever there is another principle involved aswell Consider the G segment in measure 4 Whynot begin this chord span two notes earlier on theprevious D (thus making the C a neighbor tone)The reason is that there is a preference for startingchord spans on strong beats of the metrical struc-ture This has the added effect of avoiding veryshort chord spans since strong beats are never veryclose together if a chord span is very short eitherthat chord span or the following one must begin ona weak beat We express this as follows

Strong-beat rulemdashprefer to start chord spanson strong beats

This important rule explains the motivation forincorporating the harmonic and metrical programsinto a single program Since harmonic analysis re-quires metrical information it is useful to be ableto use the output of the metrical program as inputto the harmonic program There is a problem here

however as metrical analysis sometimes requiresharmonic information We will return to this issuebelow

The implementation approach we take with theharmonic program is similar to that taken withthe metrical program The harmonic program con-siders all possible analyses of the entire piece inthis case a possible analysis is simply a completesegmentation of the piece into chord spans labeledwith roots (A spelling must also be chosen foreach pitch event The way this is handledcomputationally is rather complex and is not dis-cussed here) Again we begin by dividing the pieceinto short segments Here we can use the already-generated metrical structure Recall that any chordspan beginning on a very weak beat will get an ex-tremely high penalty It seems intuitively plau-sible to exclude altogether chord-span boundariesthat do not coincide with beats at all Thereforewe divide the piece into segments based on thelowest level of the metrical structure (usually onthe order of 100ndash300 msec) and stipulate thatchord-span boundaries can only occur at these seg-ment boundaries A root is chosen for each seg-ment and a chord span then emerges as any groupof contiguous segments with the same root (Sincethe number of possible roots is infinite we arbi-trarily limit it to a range of 48 roots on the line offifths and stipulate that the first root must be inthe range from F to D )

In evaluating a harmonic analysis each prefer-ence rule assigns a numerical score to each seg-ment indicating how well it satisfies that ruleBeginning with the compatibility rule we give thesegment a positive score for each event (or part ofan event) in the segment that has a chord-tone re-lationship with the root given by the analysismore preferred relationships result in higherscores Regarding ornamental tones each eventthat is not a chord tone of the root receives a pen-alty depending on how ldquogoodrdquo it is as an orna-mental tone that is how closely followed it is bythe next event a whole step or half step away andhow metrically weak it is better ornamental tonesreceive lower penalties Turning to the strong-beatrule we assign a penalty to each segment whoseroot is different from the previous segment de-

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

18 Computer Music Journal

pending on the strength of the beat at that pointweaker beats result in higher penalties The har-monic variance rule is slightly more complex Foreach segment we take the mean position on theline of fifths of all previous roots in the pieceweighted for recency so that more recent seg-ments affect it more We then look at the distanceof the current segmentrsquos root from this ldquocenter ofgravityrdquo This provides a measure of how close thecurrent segmentrsquos root is to previous roots Wethen assign each segment a penalty based on thisvalue (Incidentally one advantage of using theline of fifths as a space for roots and pitchclassesmdashrather than a circular space for ex-amplemdashis that it permits this easy way of calculat-ing spatial proximity)

If only the compatibility rule and the ornamen-tal dissonance rule were involved finding the pre-ferred analysis would be easy since the root foreach segment could simply be chosen in isolationWhat makes it difficult are the variance andstrong-beat rules Because of these the preferredroot for a segment depends on its context As withthe metrical program then there is a search prob-lem to be solved of finding the highest-scoringanalysis out of all possible ones

The Search Procedure

Our algorithms for finding optimal solutions tothe search problems described above are based ondynamic programming (Cormen Leiserson andRivest 1990) The procedures for the harmonic andmeter preference-rule systems are similar and aresummarized in this section

We explain our approach using a somewhat sim-plified example Imagine a harmonic algorithmthat uses only two of the rules discussed here thecompatibility rule as proposed above and a ldquono-changerdquo rule which simply penalizes all harmonicchanges (similar to the strong-beat rule except dis-regarding meter) Assume also for simplicityrsquossake that there are only twelve roots Now imag-ine a piece consisting of a series of segments Eachsegment has a compatibility score for each rootreflecting how compatible the pitches of the seg-

ment are with the root We can represent thesescores in a table as shown in Figure 5 The num-bers in each cell represent the compatibility scoresfor each segment with each root (Ignore the num-bers in parentheses for the moment) Let us as-sume that the no-change rule imposes a penalty of2 points Each analysis is thus a path along thetable with one step in each column each analysisreceives a compatibility score which is the totalof the scores in the squares it steps in minus 2points for each move from one row to anotherThe object is to find the highest-scoring path

Let us begin by considering just the first twosegments S1

and S2 We calculate the best analysisof these two segments by simply considering all144 possibilities each root for S1 combined witheach root for S2 (The best analysis is C-G with ascore of 5 + 6 ndash 2 = 9) Now we proceed to the thirdsegment S3 The score for a three-segment analy-sis can be viewed as the score for the first two seg-ments (already calculated) plus the compatibilityscore and no-change penalty (if any) for the thirdsegment It might appear that we have to calculatescores for all three-segment analyses to find thebest one However there is a major shortcut wecan take here Consider the analyses of S1ndashS2 thatend in root F We have already calculated thescores for these and found that F-F scores thehighest When adding on a third segment then weneed only consider the highest scoring of the

Figure 5 A hypotheticalsearch table for the sim-plified harmonic algo-rithm

Segment1 2 3 4 5

RootC 5 3 (8C) 1 (9C) 1 (12G) 0 (15E)C 0 1 (4C) 0 (7G) 0 (11G) 2 (17E)D 2 0 (3C) 0 (7G) 2 (13G) 0 (15E)E 0 4 (7C) 0 (7G) 0 (11G) 0 (15E)E 1 0 (3C) 5 (12G) 5 (17E) 3 (20E)F 4 3 (7F) 2 (9F) 1 (12G) 1 (16E)F 0 0 (3C) 0 (7G) 0 (11G) 0 (15E)G 3 6 (9C) 4 (13G) 2 (15G) 3 (18E)A 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)A 1 1 (4C) 2 (9G) 3 (14G) 6 (21E)B 0 1 (4C) 1 (8G) 0 (11G) 2 (17E)B 1 0 (3C) 2 (9G) 3 (14G) 0 (15E)

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 19

analyses ending in each root at S2 There is no rea-son that any of the others would ever be preferredS3 only cares about what the root was in the previ-ous segment anything prior to that is irrelevantThus we proceed as follows For each root at S2 werecord the score for the highest-scoring analysisending at that root along with the S1 root that theanalysis entails These ldquobest-so-farrdquo scores alongwith the best previous root are shown in paren-theses in the table At S3 we consider each root atS3 combined with each best-so-far analysis endingat each root at S2 for each root at S3 we choose anew best-so-far analysis and record the score andthe S2 root it entails We repeat this processthrough the entire piece In this way each squarein each column points back to a square in the pre-vious column When we get to the end of thepiece we look at the best-so-far scores for theroots in the final segment and choose the highestscore By tracing this analysis back through thetable we can then reconstruct the highest-scoringpossible analysis for the entire piece In this caseit is C-G-E-E-A

At any point in a piece one of the roots willhave the highest-scoring best-so-far analysis andthat will represent the preferred analysis at thatpoint For example at S3 the preferred root is GBut consider S4 here the preferred root is E and itpoints back to a root of E at S3 In this way thesystem naturally handles the ldquogarden-pathrdquo effectthe phenomenon of revising onersquos initial interpre-tation of a segment based on what follows

This is the procedure we use for both the metri-cal and harmonic programs With the meter pro-gram we can imagine the columns of the tablerepresenting pips Each pip has a note score rowsof the table represent beat intervals ie the timeto some previous pip Consider the simplifiedproblem of deriving a single beat level A beatlevel is a path through the table placing beats atcertain locations (Unlike the example above abeat analysis only steps in certain columns of thetable) There is a penalty for switching rows (iebeat intervals) from one step to the next The pre-ferred analysis is the one that hits the highest notescores with the fewest and smallest shifts in beatinterval So the problem becomes that of filling

the entry in the table for a specific pip and beat in-terval This is done by looking back to some previ-ous pip (determined by the beat interval) andchoosing the best beat interval to use at that priorpip When evaluating this choice we have avail-able the current beat interval and the hypotheticalprevious beat interval so we can assign the appro-priate beat-interval penalty Once the end of thetable is reached the highest score represents thepreferred analysis overall and this can be tracedback through the table

Regarding the harmonic program the simple al-gorithm described above can handle the compat-ibility rule the ornamental dissonance rule andthe strong-beat rule (which penalizes changes ofharmony depending on the strength of the beat)The dynamic programming table at each segmentonly needs to be a one-dimensional array that di-mension is the choice of root of the segment Toincorporate the harmonic variance rule and thepitch-variance rule (determining the spellings ofpitches) things get a lot more complex and weend up with a four-dimensional table (rather thana one-dimensional table) to fill in Space limita-tions prevent us from giving the details here

In practice the computation described above isintractable because the four-dimensional tablegets too large We handle this by pruning the tablefor a given segment immediately after the table forthat segment is constructed We simply keep onlythose elements that are within some constant ofthe highest score in the table We have found avalue for this constant that maintains a reasonablespeed for the system without compromising accu-racy

Examples

We have tested the program on a number of piecesand sections of pieces These include both quan-tized files generated from scores (eg Finale files)and unquantized files generated from live perfor-mances on a MIDI keyboard Most are pieces fromthe common-practice (Bach to Brahms) era mainlypiano pieces there are also a number of unaccom-panied melodies All of the input files we have

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

20 Computer Music Journal

tested (including those discussed below) are avail-able at our World Wide Web site Interested read-ers can judge the programrsquos performance forthemselves In this section we discuss representa-tive examples of the programrsquos output pointing tosome strengths and weaknesses along the waySome of the problems we discuss here have re-cently been addressed and are detailed on ourWorld Wide Web site

Both the metrical and harmonic programs in-volve a number of parameters The weight of eachpreference rule relative to the others must bespecified Many rules also involve internal param-eters for example in the compatibility rule therelative preference of different pitch-root relation-ships We have simply adjusted these parameterson a trial-and-error basis After many tests and ad-justments we have found a set of values thatseems to produce generally good results (The pa-rameters can easily be adjusted users of the pro-gram may wish to experiment with this)

Our first example is the opening of the Courantefrom Bachrsquos Suite for Violoncello in C Major Thescore is shown in Figure 6 the programrsquos output isin Figure 7

A word of explanation is needed about the for-mat of the output Each line of the output repre-

sents a beat at the lowest level of meter (Recallthat these are treated as indivisible units by theharmonic program) At the far left is the time pointat which that unit starts given in millisecondsRemember that these time points have been quan-tized to pipsmdashunits of 35 msec To the right of thatis a series of xs indicating the levels of beatspresent at the time point In both this example andthe next the lowest level of meter in the programrsquosoutput is omitted to save space thus the tactuslevel is the second from the left Next is the root ofthe segment shown according to its position on theline of fifths and finally the pitches contained inthe segment are listed (by their chosen spellings)and represented graphically on the line of fifthsEach pitch event is only indicated by name in thesegment in which it begins Graphically eachpitch event is represented by a + symbol in the seg-ment where it begins and by a | symbol in subse-quent segments where it is present If more thanone event of a particular TPC is present this is in-dicated with 2 3 etc as appropriate

The Bach example demonstrates several nice as-pects of the program The metrical structure iscorrect here with the possible exception of thehighest level (The input here is quantized gener-ated from a score rather than a live performance)

Figure 6 Bachrsquos Suite forVioloncello No 3Courante measures 1ndash12

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 21

Figure 7 The programrsquosoutput for the Bach pas-sage shown in Figure 6

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

22 Computer Music Journal

The meter is somewhat subtle since (with the ex-ception of measure 8) there is a note on everyeighth-note beat and all the notes are the samelength One important cue seems to be the lownotes at the beginning of measures 2 and 4 Theconcept of registral IOI is important here althoughthese notes are not actually long they are notclosely followed by another note in the same regis-ter which makes them seem long Turning to theharmonic structure this is (in our opinion) exactlycorrect in this passage In a piece such as this ofcourse it is essential for the program to be able tohandle harmonies in which the notes are statedsuccessively rather than all at once Note also theimportance of meter in determining the harmonyIt is this that allows the program to decide wherechord segments begin and end Without it for in-stance the first G segment might begin two notesearlier

Our second example the opening of the slowmovement of Beethovenrsquos Op 13 (ldquoPathetiquerdquo )Sonata tests an important aspect of the meter pro-gram its ability to handle fluctuations in tempoThe score is shown in Figure 8 and the programrsquosoutput is in Figure 9

Unlike the previous example this example isfrom a performance played on a MIDI keyboard Ifone consults the time data in the left column itcan be seen that the performer (David Temperley)makes considerable variations in the tempo Forconvenience the intervals between tactus beatsare shown at the far left these are not normallydisplayed by the program (The program considersthe tactus to be the sixteenth-note level here the

eighth-note level would perhaps be a betterchoice) Again the very lowest level of the metri-cal structure is not shown in this example thusthe tactus level is the second from the left Amongthe features of the expressive timing here are aslight slowing down at the beginnings and endingsof measures a significant acceleration in the firstthree quarters of measure 3 (the mean tactus inter-val here is 437 msec as opposed to a mean of 491msec for the whole passage) and most notably asubstantial ritard at the end of measure 4 withtwo consecutive beats of 630 msec Altogether thetactus intervals fluctuate between 420ndash630 msecThe program is able to handle all of these fluctua-tions maintaining the correct metrical structurethroughout The chords here greatly help the pro-gram to identify where the strong beats are Unac-companied melodies played with rhythmicfreedom often give the program trouble

The next example the opening of SchubertrsquosMoment Musical 6 (see Figure 10) demonstratessome strengths and weaknesses of the harmonicprogram In this case rather than showing the out-put we simply show the programrsquos analysis aschord symbols on the score each chord symbol in-dicates a chord span beginning on the note belowThe analysis of the opening phrase could be im-proved in several respects Labeling measure 1 asD is odd it would make more sense to considermeasures 1ndash2 as part of a single B chord Simi-larly measure 3 would probably be better labeledan A chord with a long appoggiatura One generalweakness of the program is that it has no knowl-edge of pedals Measure 7 would probably best be

Figure 8 BeethovenrsquosSonata Op 13(ldquoPathetiquerdquo) IImeasures 1ndash5

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 23

Figure 9 The programrsquosoutput for the Beethovenpassage shown in Figure 8

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

24 Computer Music Journal

analyzed as a B 7 chord over an E pedal Similarlymeasure 15 is an E 7 chord over an A bass Beingignorant of such possibilities the program musttry something else Measure 7 is analyzed reason-ably as an E chord with several appoggiaturas butmeasure 15 is bizarrely labeled as a B chord Turn-ing to the more chromatic harmonies the programanalyzes the last beat of measure 10 as 1-3- 5- 7with root Gmdasha ldquoFrench sixthrdquomdashjust as it shouldHowever the ldquoGerman sixthrdquo in measures 16ndash17is incorrectly labeled as an F dominant seventhmoreover the D is misspelled as an E (the onlyspelling error in this passage) This error brings upanother general failing of the program it has noknowledge of voice leading In measures 16ndash17the fact that the DE resolves to an E would nor-mally require a D spelling rather than E The

programrsquos ignorance of voice leading results in afair number of spelling mistakes For the mostpart however the pitch-variance rule and thefeedback from harmonic considerations ensure thecorrect spellings For example the program is ableto correctly identify the B and E in measures 10ndash12and the C and F in measures 16ndash18

The harmonic program has other weaknesses Ithas no knowledge of several common types of or-namental tones such as anticipations and escapetones It sometimes misses common progressionslike II-V-I analyzing them in some other strangeway (As listeners perhaps we give a ldquobonusrdquo tosuch progressions due to their familiarity andstructural importance) The output of the programis also somewhat unsatisfactory in that it indi-cates only the roots of chords without further in-

Figure 10 SchubertrsquosMoment Musical No 6(Op 94) measures 1ndash20showing the programrsquosharmonic analysis

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 25

formation such as mode (major or minor) exten-sion (triad or seventh) and inversion We hope toaddress these problems

Further Issues

We have explored several possible improvementsof the metrical program In its basic form as de-scribed above the program finds and maintainsthe correct tactus level on a large majority ofpieces Its performance on the lower levels ismostly quite good although it sometimes missesnotes in rapid scale passages (We do not want it tohit every note Some notes are extrametrical thatis notes that would be notated in small note headsin a score rolled chords grace notes trillsChopinesque ornamental flourishes and so onBut the program does not always correctly distin-guish the metrical notes from the extrametricalones) The performance on the upper levels isweaker especially on level 4 Frequently the pro-gram correctly identifies level 4 as duple which itusually is but chooses the incorrect phase (this oc-casionally happens with level 3 as well) In ldquoOhSusannahrdquo for example (see Figure 1) the programjudges even-numbered measures as metricallystrong at level 4 the same error occurs in the Bachcourante (see Figure 7) The reason for this is clearthere are often long notes at the ends of phraseswhich makes the program prefer them as strongalthough they are often weak at higher metricallevels The real solution to this problem would beto incorporate what Lerdahl and Jackendoff callldquogrouping structurerdquo Grouping structure is a hier-archy of segments with lower-level segments rep-resenting motives and larger segmentsrepresenting phrases and sections As Lerdahl andJackendoff note grouping affects meter in thatthere is a preference to locate strong beats near thebeginning of groups However getting a computerto recognize grouping boundaries proves to be avery difficult problem and our preliminary effortshave been unsuccessful Instead we have adopteda cruder solution At level 4 the program ignoresthe length rule and simply prefers beat locationsthat hit the maximum number of event onsets In

addition we give a slight bonus at level 4 for plac-ing a level-4 beat on the first level-3 beat of thepiece (rather than the second or third) thus en-couraging a strong level-4 beat near the beginningof the piece This fix improves performance some-what but incorporating grouping structure wouldclearly be more satisfactory

Making use of the harmonic analysis is anotherapproach to improving the performance of themetrical program on the higher levels Considerthe Schumann piece shown in Figure 11 thetactus here is the quarter note Our standard met-rical program will identify the level above thetactus as triple but out of phase with the notatedmeter (the first quarter note is metrically strong)Perceptually the important cue here seems to beharmony there are clear chord changes on the sec-ond and fifth tactus beats which favors strongbeats there (again this factor is noted by Lerdahland Jackendoff) The idea then is to let the har-monic analysis influence the metrical analysis byfavoring strong beats at changes of harmony Thispresents a serious chicken-and-egg problem how-ever since meter is crucial as input to harmonyOne solution would be to compute everything atonce optimizing over both the metrical and har-monic rules but we have not yet found an effi-cient way of doing this Another solution whichwe are currently exploring is to first run the piecethrough the harmonic program generating a provi-sional harmonic analysis then run the output ofthat through the meter program which is nowmodified to prefer strong beats at points of har-monic change and finally run this output throughthe harmonic program again to generate the finalharmonic analysis This technique can be used toimprove the metrical analysis on a number ofpieces including the Schumann piece discussedhere The software included in our distributioncan be run in this manner

Another factor that is involved in meter is loud-ness It would be quite easy to incorporate loud-ness as a factor in our system however we do notbelieve it is a major determinant of metrical struc-ture (see Rosenthal 1992 for discussion) Finallythe factor of parallelism should be mentioned thepreference to align metrical levels with patterns of

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

26 Computer Music Journal

repetition This is undoubtedly a factor in meterbut we have not yet addressed the major task of in-corporating it (For an interesting attempt tohandle parallelism see Steedman 1977)

While the main purpose of our system is to gen-erate satisfactory analyses for pieces it has somefurther interesting features that deserve mentionOne is its handling of real-time listening As notedearlier the systemrsquos analysis of one segment of apiecemdashboth metrical and harmonicmdashis the one en-tailed by the optimal analysis of the piece as awhole the analysis of a segment may be affectedby both its prior and subsequent context Since thesystem processes pieces in a left-to-right fashionits initial analysis of a segment may be revisedbased on what happens afterwards In this way theprogram may shed light on subtle nuances of lis-tening such as the garden-path effect mentionedearlier This is an emergent feature of the systemthat seems to warrant further exploration

Another interesting feature of the system relatesto its evaluation of possible analyses In searchingfor the optimal analysis of a piece the system is ineffect assigning numerical scores to various analy-ses of the piece and comparing them When it isfinished what it produces is not only an optimalanalysis but a numerical score for that analysisMore importantly it produces a series of scores forsegments of the analysis and for the differentrules indicating how well each segment satisfieseach rule In some cases an analysis of a segmentmay be found that satisfies all the rules reasonablywell In other cases however the best analysisavailable may be quite low scoring on one or sev-eral of the rules and may (in comparison to othersegments) be quite low scoring overall These

scores may reflect interesting aspects of music andmusical experience To take just two examples asegment featuring wide harmonic leaps on the lineof fifthsmdasha sudden move to a remote harmonymdashwould score poorly on the harmonic variance rulea passage featuring harmonic changes on very weakbeats or very rapid harmonic motion would scorepoorly on the strong-beat rule These are devices intonal music that can be used to create tension orinstability In this way the numerical scores pro-duced by the program can be seen as indicators ofthe degree of tension of musical passages Like thegarden-path effects observed earlier nothing spe-cial has to be done to produce these numericalscores They arise naturally as a result of thesystemrsquos search for the optimal analysis

In several ways then preference-rule systemsprovide powerful models of aspects of music cog-nition While we have focused here on meter andharmony it seems that preference-rule systemswould be applicable to other aspects of musicalstructure as well One clear candidate is groupingstructure the segmentation of a piece into mo-tives phrases and sections Lerdahl andJackendoff offer a preference-rule system for group-ing in GTTM and an attempt to implement thiscomputationally would certainly be worthwhile(Another interesting approach to grouping is foundin Tenney and Polanskyrsquos work [1980]) Streamingthe sorting of events into contrapuntal lines is an-other kind of musical structure that appears tolend itself well to preference-rule modeling Toour knowledge this has not been attempted evenat an informal level Finally key structure seemswell suited to a preference-rule approach Severalmodels have been presented for how key judg-

Figure 11 SchumannrsquosOp 15 (Kinderscenen)No 2

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49

Temperley and Sleator 27

ments are made for a passage of music One well-known proposal is Krumhanslrsquos key-profile model(1990) in which the distribution of pitches in apassage is matched to an ideal distribution foreach key with the best match representing thepreferred key One weakness of Krumhanslrsquosmodel is that it has no mechanism for handlingmodulation We are currently investigating somepossible solutions to this problem

Perhaps the most attractive feature of the prefer-ence-rule approach from the point of view of mod-eling cognition is the one we first mentioned itprovides a high-level way of describing a cognitiveprocess In a sense it is a way of breaking down theproblem The preference-rule system itself can bestudied tested and refined to determine whether itproduces good results without great concern (atleast for the moment) for whether the current real-ization of the system has cognitive validity As wehave said the search procedure we have proposedhas some psychologically attractive features aswell but the preference-rule system itself couldstill be right even if the search procedure turnedout to be completely wrong Even so it is still im-portant to have an implementation since this al-lows one to test whether onersquos preference-rulesystem can really work Both the successes andfailures of our program lend insight into the factorsthat are important in harmonic and metrical analy-sis Combined with other kinds of studies such asexperimental psychological work we hope thatcomputational studies such as ours will provideconverging evidence about the mental structuresand processes involved in music cognition

References

Allen P and R Dannenberg 1990 ldquoTracking MusicalBeats in Real Timerdquo Proceedings of the 1990 Interna-tional Computer Music Conference San FranciscoInternational Computer Music Association pp 140ndash143

Butler D 1992 The Musicianrsquos Guide to Perceptionand Cognition New York Schirmer

Chafe C B Mont-Reynaud and L Rush 1982 ldquoTo-ward an Intelligent Editor of Digital Audio Recogni-tion of Musical Constructsrdquo Computer MusicJournal 6(1)30ndash41

Cormen T C Leiserson and R Rivest 1990 Intro-duction to Algorithms Cambridge MassachusettsMIT Press

Desain P and H Honing 1992 Music Mind and Ma-chine Amsterdam Thesis Publishers

Dowling J and D Harwood 1986 Music CognitionOrlando Florida Academic Press

Krumhansl C 1990 Cognitive Foundations of MusicalPitch Oxford Oxford University Press

Lee C 1991 ldquoThe Perception of Metrical Structure Ex-perimental Evidence and a Modelrdquo In P Howell RWest and I Cross eds Representing Musical Struc-ture London Academic Press pp 59ndash127

Lerdahl F and R Jackendoff 1983 A GenerativeTheory of Tonal Music Cambridge MassachusettsMIT Press

Longuet-Higgins H C and M Steedman 1971 ldquoOnInterpreting Bachrdquo Machine Intelligence 6221ndash241

Maxwell H J 1992 ldquoAn Expert System for HarmonicAnalysis of Tonal Musicrdquo In M Balaban K Ebciogluand O Laske eds Understanding Music with AICambridge Massachusetts MIT Press pp 335ndash353

Parncutt R 1994 ldquoA Perceptual Model of Pulse Sa-lience and Metrical Accent in Musical RhythmsrdquoMusic Perception 11(4)409ndash464

Povel D-J and P Essens 1985 ldquoPerception of Tempo-ral Patternsrdquo Music Perception 2(4)411ndash440

Rosenthal D 1992 ldquoEmulation of Human Rhythm Per-ceptionrdquo Computer Music Journal 16(1)64ndash76

Steedman M 1977 ldquoThe Perception of MusicalRhythm and Metrerdquo Perception 6(5)555ndash570

Temperley D 1997 ldquoAn Algorithm for HarmonicAnalysisrdquo Music Perception 15(1)31ndash68

Tenney J and L Polansky 1980 ldquoTemporal GestaltPerception in Musicrdquo Journal of Music Theory24(2)205ndash241

Winograd T 1968 ldquoLinguistics and the ComputerAnalysis of Tonal Harmonyrdquo Journal of MusicTheory 12(1)2ndash49