


    Generative Tools for Interactive Composition: Real-Time Musical Structures Based on Schaeffer’s TARTYP and on Klumpenhouwer Networks

    Israel Neuman
    University of Iowa
    Department of Computer Science
    14 MacLean Hall
    Iowa City, Iowa 52242-1419, USA
    [email protected]

    Abstract: Interactive computer music is comparable to improvisation because it includes elements of real-time composition performed by the computer. This process of real-time composition often incorporates stochastic techniques that remap a predetermined fundamental structure to a surface of sound processing. The hierarchical structure is used to pose restrictions on the stochastic processes, but, in most cases, the hierarchical structure in itself is not created in real time. This article describes how existing musical analysis methods can be converted into generative compositional tools that allow composers to generate musical structures in real time. It proposes a compositional method based on generative grammars derived from Pierre Schaeffer’s TARTYP, and describes the development of a compositional tool for real-time generation of Klumpenhouwer networks. The approach is based on the intersection of musical ideas with fundamental concepts in computer science including generative grammars, predicate logic, concepts of structural representation, and various methods of categorization.

    Computer Music Journal, 38:2, pp. 63–77, Summer 2014
    doi:10.1162/COMJ_a_00240
    © 2014 Massachusetts Institute of Technology.

    Music is a time-based sequence of audible events that emerge from an underlying structure. Such a fundamental structure is more often than not multi-layered and hierarchical in nature. Although hierarchical structures are commonly conceptualized as predetermined compositional elements, analysis of jazz performances highlights their existence in improvised music, where performers use their knowledge of the musical language and performance practices to compose in real time. For example, Steven Block has shown sophisticated pitch organizations in free jazz compositions by Ornette Coleman, John Coltrane, Cecil Taylor, and Anthony Braxton. In the conclusion of his study, Block (1990, p. 202) calls for greater analytical attention to “the complexities of the musical fabric in free jazz” and more innovative interpretation of the “hybrid language of many free compositions.” Jazz groups, such as the Art Ensemble of Chicago, collectively create complex musical “fabrics” that are unified by hierarchical structures. In many cases, the only pre-composed elements in these compositions are a chord progression or a melody, and in free improvisation the entire musical structure is composed in real time.

    Interactive computer music is comparable to improvisation because it includes elements of real-time composition performed by the computer. Arne Eigenfeldt (2011, p. 13) maintains that “composers of real-time computer music have most often relied upon constrained random procedures to make musical decisions.” Hence, the process of real-time composition often incorporates stochastic techniques that remap a predetermined fundamental structure to a surface of sound processing. The hierarchical structure is used to pose restrictions on the stochastic processes and maintain the unity of the piece. Yet, unlike in free jazz improvisation, the hierarchical structure in itself is not created in real time, but only fleshed out by stochastic means.

    The eventual goal of my research is to develop software that would allow composers to regenerate musical structures in real time in the same manner as an improviser. In this article, I describe how existing methods used in musical analysis can be converted into generative compositional tools. My approach is based on the intersection of musical ideas with fundamental concepts in computer science, including generative grammars, predicate logic, concepts of structural representation, and various methods of categorization. The relevancy of such methods to music is rooted in their effectiveness as language definers and structure generators. Since these are fairly simple and commonly used methods, they can also be readily embedded in interactive tools. Because of the central role occupied by the Max language in interactive composition, I restrict my software development to tools that can be embedded and used in Max/MSP and Pure Data (Pd).

    The bulk of this article is a revision of a conference paper (Neuman 2013) that proposed a compositional method using an interactive compositional tool based on generative grammars derived from Pierre Schaeffer’s Tableau Récapitulatif de la Typologie (TARTYP, “Summary Table of Typology [of sound objects],” cf. Schaeffer 1966). These grammars enable the creation of new hierarchical musical structures that are, in turn, derived from the hierarchical structure of Schaeffer’s table. These complex structures are brought to life at the surface of the composition in a versatile way, utilizing the spectral signatures of sound objects from Schaeffer’s audio examples.

    The following background section presents the motivation for creating a TARTYP-based interactive compositional tool and reviews related work. The third section of this article presents the reading of TARTYP that formed the foundation for the generative grammars presented in the fourth section. The fifth and sixth sections describe the design and implementation of the software. In the seventh section, I describe my current research, which is focused on the development of a compositional tool for real-time generation of Klumpenhouwer networks (Lewin 1990, 1994). In the final section, I voice a call for the development of an extensible tool that gives the composer the freedom to incorporate a broader collection of musical theories, classifications, languages, and generative methods for interactive composition.


    In 1957, Noam Chomsky introduced phrase structure (PS) grammars, a form of generation systems denoted [Σ, F], where Σ is a set of initial symbols and F is a set of rewrite rules. Such rewrite rules have proven to be a suitable tool for musical analysis and composition, where a phrase is often expanded or contracted to define the different hierarchical levels of musical structures. Hence, if a set of rewrite rules defines and produces a “legal” musical phrase it can also produce musical structures that are more complex. Steven R. Holtzman (1981) demonstrated the use of the Generative Grammar Definition Language to describe the micro-components of musical objects, as well as complete sections of a composition. Fred Lerdahl and Ray Jackendoff (1993) compare their use of PS grammars for defining hierarchical structures of temporal organization to the processes known in Schenkerian theory as prolongation and reduction.

    Both Chomsky’s theory and contemporary music in general have strong ties to set theory. Chomsky was interested in classes of derivations and sets of rules that would generate the same terminal language (Chomsky 1966; Lasnik, Depiante, and Stepanov 2000). Yet PS grammars are effective in describing musical language because they can be used to create multiple legal variants of the same sentence. Consider Chomsky’s example “the man hit the ball” (Chomsky 1966). In the grammar describing this sentence, the terminals are the words (the, man, hit, the, ball): variants of this sentence can be formed, e.g., by replacing nouns with placeholder variables, yielding “the X hit the Y,” where X = [man | woman | boy] and Y = [ball | table | wall]. By replacing each terminal by typed variables denoting classes of words, the same set of rewrite rules would produce legal variants of the original sentence.
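    The typed-variable idea can be illustrated with a minimal Python sketch that enumerates the variants of Chomsky’s sentence. The names WORD_CLASSES and variants are hypothetical and serve only this illustration; the word classes are the ones listed in the text.

```python
from itertools import product

# Word classes from the example in the text; the dictionary name is illustrative.
WORD_CLASSES = {
    "X": ["man", "woman", "boy"],
    "Y": ["ball", "table", "wall"],
}


def variants(template):
    """Return every sentence obtained by filling the typed variables in template."""
    # A token that names a word class expands to its members; any other token stays fixed.
    slots = [WORD_CLASSES.get(token, [token]) for token in template.split()]
    return [" ".join(choice) for choice in product(*slots)]


sentences = variants("the X hit the Y")
print(len(sentences))   # 9 variants (3 choices for X times 3 for Y)
print(sentences[0])     # the man hit the ball
```

    The rewrite rules themselves are untouched here: only the terminals have been widened from single words to classes, which is the same move the TARTYP grammars make when a terminal stands for a subset of sound-object classes.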

    The recent focus on classification in musical research corresponds to the introduction of musical set theory, where terms such as “pitch class” and “interval class” are commonly used. Serialism expands the use of classifications to compositional elements other than pitch, such as rhythm, dynamics, articulation, orchestration, and timbre. A well-formed classification method of compositional material, however, is not always sufficient for defining ways for composing out this material. Pierre Boulez’s system for multiplication of pitch-class sets produces domains or collections of pitch-class sets that are structurally tied to the twelve-tone series from which they originate. Although it is known that Boulez used this classification in Le marteau sans maître (1955), the order by which the collections are used in the composition is a matter of some debate among researchers (Koblyakov 1990; Heinemann 1998).

    Schaeffer introduced TARTYP in 1966 as part of his typology of sound objects (Schaeffer 1966). The table is a classification of sound objects based on their properties in the time and frequency domains, and it introduces an alphanumeric notation for sound objects. Its structure alludes to inter-relationships between subcollections of sound objects. Nevertheless, Schaeffer provides limited direction for how to use the classification in a compositional process, focusing mainly on using the table’s notation to construct sequences of symbols describing sounds that are more complex (Thoresen 2007).

    Most studies of TARTYP offer a translation, adaptation, or revision of this classification of sound objects. Lasse Thoresen (2007), in his adaptation, removes some of the elements defining the time-domain axis of the table. Robert Normandeau (2010), in his revision to the table, interprets its main pairs of sound characteristics: mass-facture, duration-variation, and balance-originality. John Dack (2001) discusses the Excentric sound objects, a group of sound objects that Schaeffer defined as unsuitable for music; according to Dack, however, these objects are commonly used in electroacoustic compositions. These studies, like many others (such as the work described in this article), rely on Michel Chion’s Guide to Sound Objects ([1983] 2009) for a lexical collection of many terms in Schaefferian theory, including the terms defining TARTYP.

    The work of Bernard Bel (1992, 1998; Bel and Kippen 1992) combines the Schaefferian approach to sound objects and generative grammars in a creative environment that he calls the Bol Processor. This environment supports composition and improvisation, using a system of rewrite rules. The grammars of the Bol Processor are derived from the metaphoric language for drumming in Asia and Africa called Qa’idas. The focus here is on the mapping of the objects to a structural organization in physical time; hence, only the time-domain properties of sound objects are considered (i.e., it does not take into account classifications and defining features). The output of the Bol Processor is a list of MIDI messages or a CSound score. Although it has some real-time capabilities, the Bol Processor relies on an external sound processor to produce real-time output.

    Figure 1. Pierre Schaeffer’s TARTYP (after Chion [1983] 2009). The time-domain terms along the upper row of the table and the frequency-domain terms along the leftmost column are highlighted in gray.

    In contrast, the TARTYP-based compositional tools presented in the following sections utilize existing interactive real-time environments; as extensions of the Java-based MaxObject class, these tools are embedded in Max/MSP or Pd and directly engage the sound-processing capabilities of these environments. The grammars of these tools are based on Schaeffer’s classification of sound objects as presented in TARTYP and as interpreted by Chion ([1983] 2009) and Normandeau (2010). These grammars, as well as the compositional process suggested in this article, take into account the characteristics of sound objects in both the time and frequency domain as presented in the TARTYP table and demonstrated by Schaeffer’s sound examples (Schaeffer 2012).

    TARTYP Sound-Object Classification

    Sound-object characteristics are specified in TARTYP at the margins of the table, with time-domain characteristics along the upper row and the frequency-domain characteristics along the leftmost column (see Figure 1). The frequency-domain terms that correspond to rows in the table describe the frequency content of sound objects by their variability on a scale between fixed sound (mass) and unpredictable noise. Hence, fixed mass can be of definite pitch (harmonic sound) or of complex pitch (with some inharmonicity and noise). In addition, the not very variable mass is a glissando-like sound and the unpredictable variation of mass is a non-periodic noise sound. Variation in this table refers to internal variation in the frequency domain, i.e., sounds such that their endings differ from their beginnings (Chion [1983] 2009).

    The seven terms that describe the time-domain characteristics correspond to columns in the table. The central column is labeled impulse, referring to a very short sound. To the right of this column appear terms describing the characteristics or gain envelopes of iterative sounds, including formed iteration, (iterative) nonexistent facture, and (iterative) unpredictable facture. To the left appear terms describing the characteristics of held sounds including formed sustain, (held) nonexistent facture, and (held) unpredictable facture. The term facture refers to the way a sound evolves over time (Normandeau 2010). Nonexistent factures are sounds that are too redundant to exhibit change over time. Similarly, unpredictable factures exhibit too much instability to be “formed.” A formed sound is a sound of medium duration with a “closed” facture, very similar to a traditional musical note (Chion [1983] 2009).

    Combinations of time-domain characteristics (i.e., the table’s column headings) with frequency-domain characteristics (i.e., the row headings) result in the sound objects notated in the body of the table. Hence, the combination of an impulse-type envelope and a definite pitch points to the sound object notated as N′. Similarly, sound object X′′ is defined by a formed-iteration type of envelope and a complex pitch. This notation provides an abstraction of the different types of sound objects. Clearly, there are multiple sound objects that fit the defining characteristics of N′ or X′′: thus, in Schaeffer’s sound examples, most of the notation symbols are demonstrated by more than one sample (Schaeffer 2012). Each symbol therefore represents a sound-object class, a collection of sound objects that share the same TARTYP defining characteristics.

        (En)   Hn      N   N′   N′′   Zn   (An)
        (Ex)   Hx      X   X′   X′′   Zx   (Ax)
        (Ey)   Tn Tx   Y   Y′   Y′′   Zy   (Ay)
         E     T       W   ϕ    K     P     A

    Figure 2. The division of the body of TARTYP into subcollections of sound-object classes. The central block of nine classes (N through Y′′) carries the label BALANCED.

    As indicated by Chion ([1983] 2009), TARTYP is subdivided into subcollections of sound-object classes that are not explicitly notated in the table (see Figure 2). The center of the table is a collection of nine sound-object classes (N, N′, N′′, X, X′, X′′, Y, Y′, Y′′). These sound-object classes are called Balanced sound objects. The columns to the right and left of the Balanced sound objects (three rows from the top) define the Redundant and Homogeneous (RH) subcollection, which is further subdivided into RH Held (Hn, Hx, Tn, Tx) and RH Iter (Zn, Zx, Zy). The entire bottom row of the table and the far-right and far-left columns constitute the subcollection of Excentric objects, some of which are further subdivided into Sample objects (in the far-left column: En, Ex, Ey, E) and Accumulation objects (in the far-right column: An, Ax, Ay, A).

    Note that the sound-object classes in the bottom row of the table (E, T, W, ϕ, K, P, A) are simply referred to as Excentric objects. Hence, the term “Excentric” is applied to two layers of subcollections: a larger subcollection of Excentric objects that includes the smaller subcollections of Sample and Accumulation objects; and the smaller, second-layer subcollection of Excentric objects. The sound-object class E is a member of both the Sample objects and the smaller subcollection of Excentric objects, and the sound-object class A is a member of both the Accumulation objects and the smaller subcollection of Excentric objects.

    TARTYP-Derived Grammars

    In this section, I present generative grammars derived from the TARTYP classification of sound objects and its defining elements, as well as from the structure of the table and its subcollections. The rewrite rules of these grammars use the time and frequency properties specified at the margins of the table as terminals. For each of the subcollections of the table, I define a grammar in which each terminal equals a subset of the notated sound-object classes. A set of rewrite rules in such a grammar yields a large number of paths, each of which can be composed out as a sequence of sound objects. In addition, I present a Table grammar that unifies all the subcollection-based grammars and initiates a hierarchical structure of sound-object sequences.

    To explain and exemplify a subcollection-based grammar, the following discussion will focus on the grammar of the Balanced object subcollection. As stated previously, this collection includes the nine sound-object classes at the center of the table (N, N′, N′′, X, X′, X′′, Y, Y′, Y′′). The grammar for this subcollection uses the set of terminals specified in Equations 1.1 through 1.6. The right-hand side of these definitions also specifies the subsets of the Balanced object subcollection that are equivalent to each one of these terminals.

        DEFINITE    = {[N | N′ | N′′ |]+}       (1.1)
        COMPLEX     = {[X | X′ | X′′ |]+}       (1.2)
        VARIABLE    = {[Y | Y′ | Y′′ |]+}       (1.3)
        IMPULSE     = {[N′ | X′ | Y′ |]+}       (1.4)
        FORMED_ITER = {[N′′ | X′′ | Y′′ |]+}    (1.5)
        FORMED_SUS  = {[N | X | Y |]+}          (1.6)
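    The six terminals are simply the rows and columns of the Balanced block of the table, which the following Python sketch makes explicit. It is an illustration only: the names ROWS, COLUMNS, GRID, and terminal_subsets are hypothetical, ASCII apostrophes stand in for the prime signs, and the two-word terminals are written with underscores.

```python
# Balanced block of TARTYP as described in the text:
# rows are frequency-domain terms, columns are time-domain terms.
ROWS = ["DEFINITE", "COMPLEX", "VARIABLE"]
COLUMNS = ["FORMED_SUS", "IMPULSE", "FORMED_ITER"]
GRID = [
    ["N", "N'", "N''"],
    ["X", "X'", "X''"],
    ["Y", "Y'", "Y''"],
]


def terminal_subsets():
    """Map each terminal to the set of sound-object classes it stands for."""
    subsets = {}
    for r, row_name in enumerate(ROWS):
        subsets[row_name] = set(GRID[r])                  # one full row
    for c, column_name in enumerate(COLUMNS):
        subsets[column_name] = {row[c] for row in GRID}   # one full column
    return subsets


SUBSETS = terminal_subsets()
for name in ROWS + COLUMNS:
    print(name, sorted(SUBSETS[name]))
```

    Because every class sits at the crossing of one row and one column, each class belongs to exactly two of these subsets; X, for instance, is both COMPLEX and FORMED_SUS.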

    The general structure of a rewrite rule in a generative grammar is:

        head -> body    (2)

    denoting that the head of the rule can be rewritten as the body. More specifically, the structure of rules in the subcollection-based grammars discussed in this section is:

        head -> [terminal] terminal [non-terminal]    (3)

    where the items in [] are optional, yielding four different legal instantiations:

        head -> terminal                          (4.1)
        head -> terminal terminal                 (4.2)
        head -> terminal non-terminal             (4.3)
        head -> terminal terminal non-terminal    (4.4)

    A terminal corresponds to subsets of sound-object classes, and a non-terminal may be any other symbol. The head is always selected from the set of non-terminal symbols plus a special start symbol.

    An example of a legal set of rewrite rules for the Balanced grammar having special start symbol “balanced” is:

        bal_expre_v  -> VARIABLE                               (5.1)
        bal_expre_fs -> FORMED_SUS                             (5.2)
        bal_expre_fi -> FORMED_ITER COMPLEX                    (5.3)
        bal_expre_fs -> FORMED_SUS IMPULSE                     (5.4)
        bal_expre_i  -> IMPULSE FORMED_SUS bal_expre_fs        (5.5)
        bal_expre_v  -> VARIABLE FORMED_ITER bal_expre_fi      (5.6)
        bal_expre_fi -> FORMED_ITER IMPULSE bal_expre_i        (5.7)
        bal_expre_fi -> FORMED_ITER VARIABLE bal_expre_v       (5.8)
        bal_expre_v  -> VARIABLE FORMED_SUS bal_expre_fs       (5.9)
        bal_expre_fs -> FORMED_SUS FORMED_SUS bal_expre_fs     (5.10)
        bal_expre_fs -> FORMED_SUS IMPULSE bal_expre_i         (5.11)
        bal_expre_i  -> IMPULSE IMPULSE bal_expre_i            (5.12)
        bal_expre_v  -> VARIABLE IMPULSE bal_expre_i           (5.13)
        bal_expre_c  -> COMPLEX FORMED_SUS bal_expre_fs        (5.14)
        bal_expre_fs -> FORMED_SUS COMPLEX bal_expre_c         (5.15)
        bal_expre_d  -> DEFINITE FORMED_SUS bal_expre_fs       (5.16)
        bal_expre_c  -> COMPLEX FORMED_ITER bal_expre_fi       (5.17)
        bal_expre_v  -> VARIABLE VARIABLE bal_expre_v          (5.18)
        balanced     -> FORMED_SUS bal_expre_fs                (5.19)
        balanced     -> IMPULSE bal_expre_i                    (5.20)

    Figure 3. A path generated from the sample rule set of Equations 5.1–5.20. This is one of many alternative legal paths through the space defined by these rules.

    Here, all caps indicate terminals, and lower case indicates non-terminals. A rule set is legal when: (1) it contains at least one rule with a start head, (2) it contains no duplicate rules, and (3) all non-terminal symbols that appear in the body of a rule have at least one other corresponding rule where they appear only in the head.
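    The three legality criteria can be checked mechanically. The Python sketch below is one possible reading, not the author’s Java implementation: criterion (3) is interpreted as “some other rule in the set has that non-terminal as its head,” and the helper names parse and is_legal are hypothetical. The rule text restates Equations 5.1–5.20 with underscores in the symbol names, and rule 5.10 is assumed to use the FORMED_SUS terminal.

```python
START = "balanced"

# Equations 5.1-5.20, one rule per line (underscored symbol names are an assumption).
RULE_TEXT = """
bal_expre_v -> VARIABLE
bal_expre_fs -> FORMED_SUS
bal_expre_fi -> FORMED_ITER COMPLEX
bal_expre_fs -> FORMED_SUS IMPULSE
bal_expre_i -> IMPULSE FORMED_SUS bal_expre_fs
bal_expre_v -> VARIABLE FORMED_ITER bal_expre_fi
bal_expre_fi -> FORMED_ITER IMPULSE bal_expre_i
bal_expre_fi -> FORMED_ITER VARIABLE bal_expre_v
bal_expre_v -> VARIABLE FORMED_SUS bal_expre_fs
bal_expre_fs -> FORMED_SUS FORMED_SUS bal_expre_fs
bal_expre_fs -> FORMED_SUS IMPULSE bal_expre_i
bal_expre_i -> IMPULSE IMPULSE bal_expre_i
bal_expre_v -> VARIABLE IMPULSE bal_expre_i
bal_expre_c -> COMPLEX FORMED_SUS bal_expre_fs
bal_expre_fs -> FORMED_SUS COMPLEX bal_expre_c
bal_expre_d -> DEFINITE FORMED_SUS bal_expre_fs
bal_expre_c -> COMPLEX FORMED_ITER bal_expre_fi
bal_expre_v -> VARIABLE VARIABLE bal_expre_v
balanced -> FORMED_SUS bal_expre_fs
balanced -> IMPULSE bal_expre_i
"""


def parse(text):
    """Turn 'head -> body' lines into (head, body_tuple) pairs."""
    rules = []
    for line in text.strip().splitlines():
        head, body = line.split("->")
        rules.append((head.strip(), tuple(body.split())))
    return rules


def is_legal(rules, start=START):
    # (1) at least one rule has the start symbol as its head
    if not any(head == start for head, _ in rules):
        return False
    # (2) no duplicate rules
    if len(set(rules)) != len(rules):
        return False
    # (3) every non-terminal (lower-case symbol) in a body heads some other rule
    for i, (_, body) in enumerate(rules):
        for symbol in body:
            if symbol.islower() and not any(
                head == symbol for j, (head, _) in enumerate(rules) if j != i
            ):
                return False
    return True


RULES = parse(RULE_TEXT)
print(is_legal(RULES))
```

    Note that legality in this sense does not require every rule to be reachable from the start symbol: rule 5.16, whose head bal_expre_d appears in no body, is permitted.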

    The rewrite rules are used to construct a path, that is, a sequence of terminal symbols selected according to the grammar expressed in the rewrite rules. To construct a path, set it initially to the special start symbol (e.g., “balanced” in rules 5.19 and 5.20). While the path contains a non-terminal symbol, select at random a rule having that non-terminal as the head, and replace the non-terminal with the body of the rule. Figure 3 shows an example of a path constructed by the set of rules from Equations 5.1–5.20, consisting of a sequence of terminals derived from rules 5.20, 5.12, 5.5, and 5.2.
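    To make the rewriting step concrete, the following Python sketch replays the four rules cited for Figure 3 in the stated order. The function name apply_rules is hypothetical, symbol names are written with underscores, and the printed sequence is what these four rules yield under this reading rather than a transcription of the figure.

```python
# The four rules cited in the text, keyed by equation number.
RULES = {
    "5.20": ("balanced", ["IMPULSE", "bal_expre_i"]),
    "5.12": ("bal_expre_i", ["IMPULSE", "IMPULSE", "bal_expre_i"]),
    "5.5": ("bal_expre_i", ["IMPULSE", "FORMED_SUS", "bal_expre_fs"]),
    "5.2": ("bal_expre_fs", ["FORMED_SUS"]),
}


def apply_rules(start, rule_ids):
    """Rewrite the path step by step, replacing each head with its rule body."""
    path = [start]
    for rule_id in rule_ids:
        head, body = RULES[rule_id]
        position = path.index(head)            # ValueError if the rule does not apply
        path[position:position + 1] = body     # replace the non-terminal in place
    return path


path = apply_rules("balanced", ["5.20", "5.12", "5.5", "5.2"])
print(" ".join(path))
# IMPULSE IMPULSE IMPULSE IMPULSE FORMED_SUS FORMED_SUS
```

    Each rule carries at most one non-terminal, always in final position, so the path grows only at its right end and the derivation stops as soon as a terminal-only rule such as 5.2 is chosen.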

    The structure of a rewrite rule and a rule set in the other grammars (RH Held, RH Iter, Excentric, Sample, and Accumulation) is similar to that of the Balanced grammar. The differences between these grammars lie in the set of terminal symbols of each grammar and in the equivalent subsets of sound-object classes. The following example specifies the set of terminals and the equivalent subsets of sound-object classes for the Sample grammar:

        DEFINITE      = {[ En ]+}                  (6.1)
        COMPLEX       = {[ Ex ]+}                  (6.2)
        VARIABLE      = {[ Ey ]+}                  (6.3)
        UNPREDICTABLE = {[ E ]+}                   (6.4)
        HELD_UF       = {[ En | Ex | Ey | E ]+}    (6.5)

    To combine these six grammars into a single process, I add collections of rules whose terminals correspond to the six different special start symbols of the six different grammars. Thus, generating a path in this Table grammar consists of sequencing invocations of the other six grammars accordingly, starting from the new global special start symbol “start.” Note that the new grammar reflects the structure of TARTYP subcollections; thus, I can implement the Sub-Table grammar as shown in Figure 4. The latter represents the larger subcollection of Excentric objects. The title “Sub-Table” is used to prevent duplication. Hence, in the grammars, the name “Excentric” represents only the smaller subcollection of Excentric objects in the bottom row of the TARTYP table.

    Figure 4. Grammar-based hierarchical structure enabled by the Table and Sub-Table grammars. This tree-like hierarchy unifies all the types of grammar.

    Rule Generation

    I implemented a generate-and-test algorithm designed to produce a set of rewrite rules that conforms to the three legality criteria discussed in the previous section. The input to this generation algorithm consists of a set of parameters indicating how many rules are in the rule set, how many of these rules are required for each special start symbol, and how many rules have bodies consisting solely of terminal nodes. The algorithm selects the next rule to generate in accordance with the rule set parameters (e.g., having a special start symbol as head, or having only terminals in the body). The terminals in the rule body are randomly selected from the terminal symbols and then, optionally, a non-terminal symbol is included in accordance with the terminal symbols used. The head of the rule is either a special start symbol or one of the other non-terminal symbols. A duplicate rule is rejected, and a new rule is generated to replace it. Once the rule set is complete, it is checked for consistency, ensuring that every non-terminal in a rule body has at least one matching head in some other rule in the rule set. If a rule with no such match exists, it is eliminated, and a new rule is generated to replace it (see Figure 5).

    Figure 5. Formal specification of an algorithm for generating legal rule sets as described in text.
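    The generate-and-test procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription of Figure 5 or of the author’s Java classes: all function and parameter names are hypothetical, the head is named after the first terminal of the body and the optional trailing non-terminal after the last one (a convention inferred from the sample rules 5.1–5.20), a single start symbol is used, and a max_tries bound is added so the loop cannot run forever.

```python
import random

# Terminal -> suffix of the matching non-terminal (naming inferred from Equations 5.1-5.20).
SUFFIX = {"DEFINITE": "d", "COMPLEX": "c", "VARIABLE": "v",
          "IMPULSE": "i", "FORMED_ITER": "fi", "FORMED_SUS": "fs"}
START = "balanced"


def random_rule(rng, start_head, terminal_only):
    """Draw one rule of the requested kind as a (head, body) pair."""
    terminals = [rng.choice(sorted(SUFFIX)) for _ in range(rng.choice([1, 2]))]
    head = START if start_head else "bal_expre_" + SUFFIX[terminals[0]]
    tail = [] if terminal_only else ["bal_expre_" + SUFFIX[terminals[-1]]]
    return head, tuple(terminals + tail)


def unmatched(rules):
    """Indices of rules whose trailing non-terminal is the head of no other rule."""
    return [k for k, (_, body) in enumerate(rules)
            if body[-1].islower()
            and not any(h == body[-1] for j, (h, _) in enumerate(rules) if j != k)]


def generate_rule_set(n_rules, n_start, n_terminal_only, rng, max_tries=10000):
    # Plan the kind of every rule: start-headed rules first, terminal-only rules last.
    kinds = [(k < n_start, k >= n_rules - n_terminal_only) for k in range(n_rules)]
    rules = []
    for _ in range(max_tries):
        if len(rules) < n_rules:                       # generation phase
            candidate = random_rule(rng, *kinds[len(rules)])
            if candidate not in rules:                 # reject duplicates
                rules.append(candidate)
            continue
        bad = unmatched(rules)                         # test phase
        if not bad:
            return rules                               # complete and consistent
        candidate = random_rule(rng, *kinds[bad[0]])
        if candidate not in rules:
            rules[bad[0]] = candidate                  # replace the offending rule
    raise RuntimeError("no legal rule set found within max_tries")


rule_set = generate_rule_set(20, 2, 5, random.Random(1))
for head, body in rule_set:
    print(head, "->", " ".join(body))
```

    Replacing a rule with one of the same kind keeps the requested counts of start-headed and terminal-only rules intact, so only the consistency criterion has to be re-tested after each replacement.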

    Once a legal rule set is produced, a second algorithm can be used to repeatedly extract a path or a sequence of terminals. First, a rule is randomly selected from among all the rules whose head matches the start symbol. Next, a rule is randomly selected from among all the rules whose head matches the non-terminal ending of the rule previously selected. Finally, if the rule selected in the second step has a non-terminal, the second step is repeated.
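    The extraction steps can be sketched in Python as a simple chain of random choices. The rule list below is a subset of the sample Balanced rules (5.2, 5.4, 5.5, 5.12, 5.19, and 5.20), written with underscored symbol names; the function name extract_path and the max_steps guard are assumptions added for this illustration.

```python
import random

START = "balanced"

# Subset of the sample rules as (head, body) pairs; lower-case symbols are non-terminals.
RULES = [
    ("bal_expre_fs", ("FORMED_SUS",)),
    ("bal_expre_fs", ("FORMED_SUS", "IMPULSE")),
    ("bal_expre_i", ("IMPULSE", "FORMED_SUS", "bal_expre_fs")),
    ("bal_expre_i", ("IMPULSE", "IMPULSE", "bal_expre_i")),
    ("balanced", ("FORMED_SUS", "bal_expre_fs")),
    ("balanced", ("IMPULSE", "bal_expre_i")),
]


def extract_path(rules, rng, start=START, max_steps=1000):
    """Chain randomly chosen rules until one without a trailing non-terminal is hit."""
    by_head = {}
    for head, body in rules:
        by_head.setdefault(head, []).append(body)
    path, head = [], start
    for _ in range(max_steps):
        body = rng.choice(by_head[head])
        if body[-1].islower():          # rule ends in a non-terminal: keep going
            path.extend(body[:-1])
            head = body[-1]
        else:                           # terminal-only body: the path is complete
            path.extend(body)
            return path
    raise RuntimeError("path did not terminate within max_steps")


path = extract_path(RULES, random.Random(0))
print(" ".join(path))
```

    Calling extract_path repeatedly with the same rule set corresponds to sending successive bangs to a grammar object: the rules stay fixed while each call may yield a different path.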

    Interactive Interface and Composition

    These algorithms were implemented as four Java classes. The four classes are embedded in the Max/MSP or the Pd environments as extensions of the MaxObject class. They are combined to create an mxj type object (or pdj object in Pd) that creates, in real time, a legal set of rules in one of the grammars and then constructs (again, in real time) multiple legal paths from the same set of rules. The process repeats, generating new sets of rules and extracting additional paths. There are mxj objects available for all grammars (Balanced, RH Held, RH Iter, Excentric, Sample, and Accumulation), as well as for the structural grammars Table and Sub-Table. These grammar objects function in a simple way. They receive a list of integers in the left inlet. This list is interpreted by the object as the number of rules of each type to be generated. The set of rules is posted in the Max window. Following the generation of the rule set, with each bang received in the left inlet the object outputs a path or a sequence of terminals as a symbol list from its outlet.

    Figure 6. A Max/MSP patch using a coll object for storing and accessing a path as a list.

    I now present a way to create an interactive interface with the mxj grammar objects, as well as a method to compose with this interactive interface. Because an mxj grammar object produces a path as a symbol list, this list can be stored in a coll object to allow access to items on this list. The list is entered into the coll object and read from it using the patch shown in Figure 6. The list is read, in this case, using an arbitrary timing based on the bangs generated by a metro object; however, a similar patch can be used to reflect a structural time organization instead.

    The patch in Figure 6 is embedded in the patch in Figure 7. The latter includes the grammar object mxj BalRuleM. As shown in the figure, a message box sends a list to the inlet of this object that specifies the number of rules to be generated. The live.text object bangs the mxj BalRuleM object to output a path. The patch in Figure 7 is part of a larger patch that simulates the tree-like hierarchy represented in Figure 4. The upper part of this larger patch includes the grammar object mxj TabRuleM that generates a rule set in the Table grammar. When a path is output by this object, the list of terminals is compared in a select object that activates the grammars Balanced, RH Held, or RH Iter, or the Sub-Table grammar that, in turn, would activate the grammars Excentric, Sample, and Accumulation. The activity in this tree-like patch is monitored in an interface such as the one shown in Figure 8.

    Figure 7. A Max/MSP patch using a grammar object to generate rule sets and paths.

    In the tree-like hierarchy shown in Figure 4, the Table and Sub-Table grammars provide the background level of the structural organization while the other grammars are the middle ground of this organization. A path extracted in the Table or Sub-Table grammars would be advanced in relation to the paths extracted in the middle-ground grammars. If, for example, the path [BALANCED, RH_HELD, BALANCED, RH_ITER] is extracted in the Table grammar, when the first terminal is read it activates a path in the Balanced grammar. The second terminal will be read only when the path in the Balanced grammar has ended. Therefore, the middle-ground grammars are a defining element in the time organization of the hierarchical structure generated by this patch.
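    The nesting of background and middle-ground paths can be sketched in Python with a generator. The Table path is the one from the example above; the middle-ground paths in MIDDLE_PATHS are hypothetical stand-ins for the output of the grammar objects (the terminals of the RH grammars are not listed in the text), and the function name play_hierarchy is likewise an assumption.

```python
def play_hierarchy(table_path, extract_middle_path):
    """Yield (table_terminal, middle_terminal) events in playback order."""
    for table_terminal in table_path:
        # The next Table terminal is read only after this inner path is exhausted.
        for middle_terminal in extract_middle_path(table_terminal):
            yield table_terminal, middle_terminal


# Hypothetical fixed paths standing in for the middle-ground grammar objects.
MIDDLE_PATHS = {
    "BALANCED": ["IMPULSE", "FORMED_SUS"],
    "RH_HELD": ["DEFINITE", "COMPLEX"],
    "RH_ITER": ["VARIABLE"],
}

table_path = ["BALANCED", "RH_HELD", "BALANCED", "RH_ITER"]
events = list(play_hierarchy(table_path, MIDDLE_PATHS.__getitem__))
for event in events:
    print(event)
```

    In a live setting the lookup would be replaced by a fresh path extraction each time a Table terminal is read, so the two BALANCED entries would generally unfold into different middle-ground paths.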

    There are many ways to connect such a structure to a foreground of sound generation and processing. One possible way is to map the terminals through select objects to bang messages that activate the playback of sound files. An alternative method uses spectral signatures derived from Pierre Schaeffer’s sound recordings that exemplified TARTYP and its defining characteristics (Schaeffer 2012). From each one of the sound objects in these examples of Schaeffer’s, I extracted 64-bin frames from a fast Fourier transform (FFT) to create a spectral signature. These frames were saved as lists of values in simple text files. In a Pd patch like the one shown in Figure 9, the FFT frames are used to filter a live sound and apply the spectral signature of a Schaefferian sound object on the live sound. Similarly, the waveform of a Schaefferian sound object, stored in an array object, is used to generate the amplitude envelope of the processed signal.
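    The filtering step can be pictured as a per-bin gain applied to each analysis block of the live signal. The Python sketch below illustrates that general technique only; it does not reproduce the Pd patch. It assumes a 128-sample block so that bins 0 through 63 give 64 gains, treats the stored signature as a list of non-negative gains, uses a naive DFT to stay within the standard library, and all names in it are hypothetical.

```python
import cmath
import math

BLOCK = 128          # assumed analysis block size
BINS = BLOCK // 2    # 64 gains, one per bin 0..63


def dft(samples):
    """Naive discrete Fourier transform (adequate for one short block)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(spectrum):
    """Inverse transform, keeping the real part of each output sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def apply_signature(block, gains):
    """Scale every bin of one block by the matching gain of the signature."""
    spectrum = dft(block)
    shaped = [0j] * BLOCK                 # the Nyquist bin (index 64) stays at zero
    for k in range(BINS):
        shaped[k] = spectrum[k] * gains[k]
        if k > 0:
            # Scale the mirrored bin identically so the output stays real-valued.
            shaped[BLOCK - k] = spectrum[BLOCK - k] * gains[k]
    return inverse_dft(shaped)


# A test tone that falls exactly on bin 4, and a signature that keeps only that bin.
tone = [math.sin(2 * math.pi * 4 * t / BLOCK) for t in range(BLOCK)]
keep_bin_4 = [1.0 if k == 4 else 0.0 for k in range(BINS)]
kept = apply_signature(tone, keep_bin_4)
print(max(abs(a - b) for a, b in zip(kept, tone)))
```

    A real-time implementation would use overlapping windowed blocks and a fast transform; the sketch only shows how a stored frame of 64 values can act as a filter on incoming audio.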

    The sub-patch [pd Paths] in the upper-right corner of Figure 9 uses the grammar objects to generate rule sets and sequences of terminals in the way exemplified in Figures 6 and 7. The patch then selects FFT frames and waveforms originating from sound objects that are members of the subsets equivalent to terminals in the sequence. The data of these FFT frames and waveforms are entered into the arrays FFTframe and AmpEnv. For example, if the terminal DEFINITE is part of a path extracted from the Balanced grammar, it will cause the selection of FFT frames and waveforms extracted from the sound objects exemplifying the sound-object classes N, N′, or N′′.

    This sound-processing system is combined with the hierarchical structure generated by the grammars discussed in previous sections. As discussed before, the subcollection grammars define the time organization at the middle ground of this compositional process. At the foreground, the temporal organization is derived from the amplitude envelopes generated by the waveforms selected from Schaefferian sound objects. For example, if the DEFINITE terminal of the previous paragraph is followed by an IMPULSE terminal, the latter will be read once the envelope generated for the DEFINITE terminal ends.

    Figure 8. A sample interface used to monitor the activity in the tree-like Max/MSP patch described in the text. In this interface, for each grammar the extracted path is shown in a large box, and the terminal currently playing is shown in the message box under the title of the grammar.

    Interactive Klumpenhouwer Networks

    The discussion so far has focused on using Schaeffer’s taxonomy of sound objects, as expressed in TARTYP, to create a generative compositional tool. In this section, I will describe the development of an interactive tool for real-time generation of Klumpenhouwer networks (K-networks) that gives the composer more flexibility in defining underlying structures and the context of application. Henry Klumpenhouwer and David Lewin introduced the K-networks in 1990 (Lewin 1994) to describe the pitch organization of post-tonal compositions based on transformational relationships. Although K-networks are traditionally an analytical tool used to identify structural pitch-class relationships in a composition, I have applied a generative approach to this method for incorporating such networks as structural elements in real-time composition. In the following paragraphs, I present a very basic introduction to the K-network analytical method and the generative compositional tool that is based on this method.

    The basic element of the network is the graph describing transformational relationships within a pitch-class set. The nodes of this graph represent pitch classes and the edges represent the transformations between these pitch classes. There are two types of edges: unidirectional edges for transposition (marked $T_n$, where $n$ is an integer mod 12) and bidirectional edges for inversion (marked $I_n$, where $n$ is an integer mod 12). A K-network is expanded to include multiple pitch-class sets by isomorphism. Two networks are considered isographic if they have the same configuration of nodes and arrows and there exists a function that maps the transformation system used to label the arrows of one network into the transformation system used to label the arrows of the other. Hence, if the transformation $X$ labels an arrow of the one network, then the transformation $F(X)$ labels the corresponding arrow of the other (Lewin 1990).
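The arrow labels of such a graph follow directly from pitch-class arithmetic: T_n maps x to y when y = x + n (mod 12), and I_n maps x to y (and y to x) when x + y = n (mod 12). The Python sketch below labels a three-node graph. The arrow layout (two I arrows and one T arrow) and the function names are assumptions chosen for illustration; they are not taken from the article's figures.

```python
from typing import List, Sequence, Tuple

Arrow = Tuple[str, int, int]  # (kind, source node index, target node index)


def t_level(x: int, y: int) -> int:
    """Transposition level n such that T_n maps pitch class x to y."""
    return (y - x) % 12


def i_index(x: int, y: int) -> int:
    """Inversion index n such that I_n exchanges pitch classes x and y."""
    return (x + y) % 12


def label_arrows(pcs: Sequence[int], arrows: Sequence[Arrow]) -> List[str]:
    """Return one label such as 'T1' or 'I2' per arrow of the graph."""
    labels = []
    for kind, src, dst in arrows:
        if kind == "T":
            n = t_level(pcs[src], pcs[dst])
        else:
            n = i_index(pcs[src], pcs[dst])
        labels.append(f"{kind}{n}")
    return labels


# Assumed layout: I between nodes 0 and 1, I between nodes 1 and 2, T from node 0 to 2.
LAYOUT: List[Arrow] = [("I", 0, 1), ("I", 1, 2), ("T", 0, 2)]

labels = label_arrows([3, 10, 4], LAYOUT)
```

With this layout the set [3,10,4] is labeled I1, I2, T1, since 3 + 10 = 13 = 1, 10 + 4 = 14 = 2, and 4 - 3 = 1 (all mod 12).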


    Figure 9. A sample sound-processing Pd patch applying the spectral signature of a Schaefferian sound object on live sound.

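    Five rules for isomorphism apply to Klumpenhouwer networks (Lewin 1990). The first four rules are summarized in the following equations:

    $$F\langle 1, j\rangle(T_n) = T_n; \quad F\langle 1, j\rangle(I_n) = I_{n+j} \tag{7.1}$$
    $$F\langle 11, j\rangle(T_n) = T_{11n}; \quad F\langle 11, j\rangle(I_n) = I_{11n+j} \tag{7.2}$$
    $$F\langle 5, j\rangle(T_n) = T_{5n}; \quad F\langle 5, j\rangle(I_n) = I_{5n+j} \tag{7.3}$$
    $$F\langle 7, j\rangle(T_n) = T_{7n}; \quad F\langle 7, j\rangle(I_n) = I_{7n+j} \tag{7.4}$$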

    The fifth and last rule simply states that two K-networks will be isographic only if one of the rules expressed in Equations 7.1–7.4 holds.
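Equations 7.1–7.4 translate into a small search: for two networks with the same configuration, try each of the 48 pairs ⟨u, j⟩ and keep those that map every arrow label of the first network onto the corresponding label of the second. The Python sketch below does this for networks given as lists of (kind, n) labels. The representation and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Label = Tuple[str, int]  # ("T", n) or ("I", n), with n taken mod 12


def apply_f(u: int, j: int, label: Label) -> Label:
    """Apply F<u,j> to one arrow label, following Equations 7.1-7.4."""
    kind, n = label
    if kind == "T":
        return ("T", (u * n) % 12)
    return ("I", (u * n + j) % 12)


def find_isographies(net_a: List[Label], net_b: List[Label]) -> List[Tuple[int, int]]:
    """Return every <u, j> that maps the labels of net_a onto those of net_b.

    Both lists must name corresponding arrows in the same order.
    """
    if len(net_a) != len(net_b):
        return []
    return [(u, j)
            for u in (1, 5, 7, 11)
            for j in range(12)
            if [apply_f(u, j, label) for label in net_a] == net_b]


# Example labels: I1, I2, T1 for the first network and I7, I6, T11 for the second.
first = [("I", 1), ("I", 2), ("T", 1)]
second = [("I", 7), ("I", 6), ("T", 11)]

matches = find_isographies(first, second)
```

For these labels the only match is ⟨11, 8⟩: T1 becomes T11, and the inversion indices become 11·1 + 8 = 19 = 7 and 11·2 + 8 = 30 = 6 (mod 12). An empty result would mean that the two networks are not isographic.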

    Positively isographic networks are networks that share the same configuration of nodes and arrows, where the transposition levels of corresponding arrows are equal and the inversion indices of corresponding arrows differ by some constant $j$ mod 12. Negatively isographic networks are networks that share the same configuration of nodes and arrows, where the transposition levels of corresponding arrows are complements and the inversion indices of corresponding arrows sum to some constant $j$ mod 12 (Lewin 1990). The transformation between two pitch-class set networks is often referred to as hyper-transformation (Lambert 2002). Based on Equations 7.1–7.4, the hyper-transformation between two isographic K-networks is marked $F\langle u, j\rangle$, where $u \in \{1, 5, 7, 11\}$ and $j$ is some constant mod 12. The hyper-transformation between two K-networks, which are both isographic to a given third K-network, can be calculated by Lewin’s formula 1 (Lewin 1990):

    $$F\langle u, j\rangle F\langle v, k\rangle = F\langle uv, uk + j\rangle \tag{8.1}$$
    $$F\langle u, j\rangle F\langle v, k\rangle(T_n) = F\langle u, j\rangle(T_{vn}) = T_{uvn} \tag{8.2}$$
    $$F\langle u, j\rangle F\langle v, k\rangle(I_n) = F\langle u, j\rangle(I_{vn+k}) = I_{uvn+uk+j} \tag{8.3}$$

    In positive and negative isographs, two types of hyper-transformations are considered: hyperT for positive isographs and hyperI for negative isographs, with $u = 1$ and $u = 11$, respectively. Hence, the relationships expressed in Equations 8.1–8.3 yield four possible cases specified by Lewin’s formula 2 (Lewin 1990):

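    $$\langle 1, j\rangle\langle 1, k\rangle = \langle 1, j + k\rangle \quad \mathrm{hyper}T_{j+k} \tag{9.1}$$
    $$\langle 1, j\rangle\langle 11, k\rangle = \langle 11, j + k\rangle \quad \mathrm{hyper}I_{j+k} \tag{9.2}$$
    $$\langle 11, j\rangle\langle 1, k\rangle = \langle 11, j - k\rangle \quad \mathrm{hyper}I_{j-k} \tag{9.3}$$
    $$\langle 11, j\rangle\langle 11, k\rangle = \langle 1, j - k\rangle \quad \mathrm{hyper}T_{j-k} \tag{9.4}$$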
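The four cases of Equations 9.1–9.4 are what Equation 8.1 gives when u and v are restricted to 1 and 11, because 11 acts as -1 mod 12 and 11 · 11 = 121 = 1 (mod 12). The short Python check below composes pairs ⟨u, j⟩ according to Equation 8.1 and names the result. The function names `compose` and `hyper_label` are hypothetical.

```python
from typing import Tuple

Hyper = Tuple[int, int]  # (u, j), standing for F<u, j>


def compose(outer: Hyper, inner: Hyper) -> Hyper:
    """Equation 8.1: F<u,j> F<v,k> = F<uv, uk + j>, everything mod 12."""
    u, j = outer
    v, k = inner
    return ((u * v) % 12, (u * k + j) % 12)


def hyper_label(f: Hyper) -> str:
    """Name a hyper-transformation with u in {1, 11} as hyperT_j or hyperI_j."""
    u, j = f
    if u == 1:
        return f"hyperT{j}"
    if u == 11:
        return f"hyperI{j}"
    raise ValueError("only u = 1 and u = 11 are named here")


# One instance of each case, with j = 3 and k = 5.
case_9_1 = hyper_label(compose((1, 3), (1, 5)))     # hyperT, index j + k = 8
case_9_2 = hyper_label(compose((1, 3), (11, 5)))    # hyperI, index j + k = 8
case_9_3 = hyper_label(compose((11, 3), (1, 5)))    # hyperI, index j - k = 10 (mod 12)
case_9_4 = hyper_label(compose((11, 3), (11, 5)))   # hyperT, index j - k = 10 (mod 12)
```

Two negative isographies therefore compose to a positive one, which is why two sets that are both negatively isographic to a third set are related to each other by a hyperT.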

    Using predicate logic and the Prolog programming language, I have designed a tool that generates a K-network and subsequently extracts random paths from within the network. A path is a sequence of pitch-class sets that can be incorporated in the pitch organization of a composition or that can be mapped to other compositional elements. The Prolog program that implements the tool thus necessarily has two modes, one to generate the K-network and one to extract paths from an existing network. In the first mode, the program receives a set of user-defined parameters and generates the K-network accordingly. These parameters include an initial pitch-class set, a set of u values (such that u ∈ {1,11}+) specifying positive and negative isographs, a set of constant j values, and a set of transposition levels specifying the “bass note” of each newly generated pitch-class set. The program analyzes the initial set, generates isographs according to the specified parameters, and returns the generated network as well as all its hyperT and hyperI relations.

    Figure 10. Prolog code generating a K-network.

    The Prolog code in Figure 10 generates a K-network that includes multiple pitch-class sets. The genAndAssert() statement is the top-level call invoked to generate a network. It returns an aKNet() statement that specifies the pitch-class sets of a K-network and a new unique label L assigned to this K-network. Furthermore, genAndAssert() inserts aKNet() statements into a database, which then can be used to derive the hyperI() and hyperT() statements that define the internal structure of the network.
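Figure 10 is not reproduced in this text, so the following Python sketch is only one possible reading of the generation step, not the author's Prolog code. It assumes that a three-note set is analyzed with an inversion arrow between its first and second notes, an inversion arrow between its second and third notes, and a transposition arrow from its first to its third note; that each isograph is obtained by applying F⟨u, j⟩ to those arrows; and that the "bass note" is the first note of the new set. The function names are hypothetical. Under these assumptions, the parameter lists of the example query that follows yield, among others, the sets [6,1,7], [5,2,4], and [10,3,9], which also occur in the sample path quoted later in the article.

```python
from typing import List, Sequence, Tuple


def analyze(pcs: Sequence[int]) -> Tuple[int, int, int]:
    """Return (I index of notes 0-1, I index of notes 1-2, T level of note 0 -> 2)."""
    a, b, c = pcs
    return (a + b) % 12, (b + c) % 12, (c - a) % 12


def isograph(pcs: Sequence[int], u: int, j: int, bass: int) -> List[int]:
    """Build the set whose arrows are F<u,j> of the arrows of pcs, starting on bass."""
    i_ab, i_bc, _t = analyze(pcs)
    first = bass % 12
    second = (u * i_ab + j - first) % 12    # I with index u*n + j maps first to second
    third = (u * i_bc + j - second) % 12    # I with index u*n + j maps second to third
    return [first, second, third]


def generate(initial: Sequence[int], us: Sequence[int], js: Sequence[int],
             basses: Sequence[int]) -> List[List[int]]:
    """Return the initial set followed by one isograph per (u, j, bass) triple."""
    return [list(initial)] + [isograph(initial, u, j, b)
                              for u, j, b in zip(us, js, basses)]


network = generate([3, 10, 4],
                   [11, 11, 11, 11, 1],
                   [2, 1, 8, 2, 6],
                   [4, 10, 5, 10, 6])
```

The transposition arrow needs no separate step: because the T level of the initial set equals the difference of its two inversion indices, the third note computed above automatically lies u times that level above the first.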

    For example, consider the following query:

    genAndAssert([3,10,4],[11,11,11,11,1],[2,1,8,2,6],[4,10,5,10,6],_).

    which inserts the following K-network description in the database:


    The unique newly generated label for this network is “aknet0.” The query also inserts the following associated set of hyper-transformations into the database:


    Once the network has been generated, the program, operating in the second mode, can perform a random-walk search to extract paths from this network on demand. Although the search tree of any K-network can potentially grow exponentially as the size of the network increases, the user can specify the start and end nodes (pitch-class sets), the maximum depth of the search, and a limit to the number of repetitions (cycles) in the generated path when issuing the appropriate chain() statements, as shown in Figure 11. A chain() statement is the Prolog query that generates a path from a known network with optional start and destination sets. The search tree is rooted in the hyperI() and hyperT() statements. The program looks for a random path in this tree, meaning it picks a random branch at each choice point as well as a random instantiation from among the legal instantiations at the root hyperI() and hyperT() statements. Recall that hyperT() is a unidirectional edge in the network graph, and hyperI() is a bidirectional edge; hence, there are three possible mapping functions between two sets in the network. These mappings are represented explicitly by the following three statements.

    related(X,Y,T,L,0) :- hyperT(X,Y,T,L).
    related(X,Y,I,L,1) :- hyperI(X,Y,I,L).
    related(X,Y,I,L,2) :- hyperI(Y,X,I,L).

    Figure 11. Prolog code generating a path from a known K-network.

    Using a random number generator (that has been seeded randomly) to select which of the three legal paths to take at each call to a related() statement, the modrelated() statements shown here step through all three related clauses in some randomly chosen but deterministic order determined by the input random “seed,” S:

    modrelated(X,Y,T,L,S) :-
        N is mod(S,3),
        related(X,Y,T,L,N).

    modrelated(X,Y,T,L,S) :-
        N is mod(S+1,3),
        related(X,Y,T,L,N).

    modrelated(X,Y,T,L,S) :-
        N is mod(S+2,3),
        related(X,Y,T,L,N).
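The effect of the three modrelated() clauses can be stated compactly: for a seed S, the related() cases are tried in the order S mod 3, (S+1) mod 3, (S+2) mod 3 as Prolog backtracks, so the seed rotates the starting point while every case is still reached. A minimal Python sketch of that ordering (the function name `clause_order` and the `CASES` table are hypothetical):

```python
from typing import List

# Meaning of the fifth argument of related(), as given in the three statements above.
CASES = {
    0: "hyperT(X,Y)",
    1: "hyperI(X,Y)",
    2: "hyperI(Y,X)",
}


def clause_order(seed: int) -> List[int]:
    """Order in which the three related() cases are tried for a given seed."""
    return [(seed + offset) % 3 for offset in range(3)]


order_for_seed_4 = clause_order(4)
tried = [CASES[n] for n in order_for_seed_4]
```

For the seed 4 the cases are tried in the order 1, 2, 0: hyperI in its stored direction first, then hyperI reversed, and hyperT last.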

    The chain() statement in Figure 11 uses the modrelated() statements to build a path. The user can impose a fixed horizon by specifying a value for the parameter H. For example, if H = 4 the program will execute four recursive steps and return a path of length five or less. The user can also limit the maximum number of repetitions (or cycles) in the path by specifying a value for D. For example, to find a path from source set [3,10,4] to destination set [6,1,7] with a length of no more than four steps, and with no more than two copies of any set in the network labeled aknet0, the query is:

    findPath([3,10,4],[6,1,7],Path, aknet0,4,2).


    The program may return, depending on the initial random seed, for example:

    Path = [[3,10,4],[6,1,7],[5,2,4],[10,3,9],[6,1,7]]
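The bounded random walk can be sketched in Python as a depth-first search that shuffles its options at every choice point. This is a simplified illustration rather than a transcription of the chain() code in Figure 11: it stops the first time the destination is reached, whereas the sample path above passes through [6,1,7] once before ending there. The hyperT and hyperI facts listed in the sketch are hypothetical stand-ins for the database contents, which are not reproduced in this text, and all function names are assumptions.

```python
import random
from typing import List, Optional, Sequence, Tuple

Pcs = Tuple[int, ...]
Edge = Tuple[Pcs, Pcs]

# Hypothetical facts for illustration only.
HYPER_T: List[Edge] = [((3, 10, 4), (6, 1, 7)), ((5, 2, 4), (10, 3, 9))]
HYPER_I: List[Edge] = [((6, 1, 7), (5, 2, 4)), ((10, 3, 9), (6, 1, 7)),
                       ((3, 10, 4), (5, 2, 4))]


def neighbours(node: Pcs, hyper_t: Sequence[Edge], hyper_i: Sequence[Edge]) -> List[Pcs]:
    """Sets reachable in one step: hyperT one way only, hyperI in both directions."""
    out = [y for x, y in hyper_t if x == node]
    out += [y for x, y in hyper_i if x == node]
    out += [x for x, y in hyper_i if y == node]
    return out


def find_path(start: Pcs, goal: Pcs, hyper_t: Sequence[Edge], hyper_i: Sequence[Edge],
              horizon: int, max_copies: int, rng: random.Random) -> Optional[List[Pcs]]:
    """Random depth-first search with at most `horizon` steps and `max_copies` per set."""

    def extend(path: List[Pcs]) -> Optional[List[Pcs]]:
        if len(path) > 1 and path[-1] == goal:
            return path                       # destination reached
        if len(path) - 1 >= horizon:
            return None                       # horizon exhausted, backtrack
        options = neighbours(path[-1], hyper_t, hyper_i)
        rng.shuffle(options)                  # random branch at each choice point
        for nxt in options:
            if path.count(nxt) < max_copies:  # limit repetitions of any set
                found = extend(path + [nxt])
                if found is not None:
                    return found
        return None

    return extend([start])


path = find_path((3, 10, 4), (6, 1, 7), HYPER_T, HYPER_I,
                 horizon=4, max_copies=2, rng=random.Random(1))
```

Passing a seeded `random.Random` instance makes a run repeatable, which parallels the "randomly chosen but deterministic" order that the seed S gives the Prolog search.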

    Future Work

    The latest research in musical artificial intelligence has followed trends and computational models derived from biological evolution, cultural evolution, and social interaction (Miranda 2011). A great deal of attention has been given to the development of compositional tools for interactive music and computer improvisation. For example, Eigenfeldt (2011) uses multi-agent social interaction software and evolutionary systems to model the spontaneity of improvisation and the musical interaction between improvisers. David Plans and Davide Morelli (2011) attribute the success of free improvisers to experience-based acquisition of listening skill and “thin-slicing” intuition. They have used machine learning, genetic coevolution algorithms, and fast-and-frugal heuristics to develop software to mimic these skills. Free improvisers, however, as noted in the beginning of this article, define their own “hybrid language.” Their listening skill and thin-slicing intuition emerge, by and large, from mastering this musical language. Although machine-learning techniques generally require a great deal of sample data and can be cumbersome to apply in a real-time context, tools for defining musical languages and generating musical structures are inexpensive and can be embedded in interactive environments such as Max/MSP.

    It is straightforward to implement the tools that were presented in this preliminary research as real-time musical structure generators for the Max/MSP and Pd environments. The implementation’s limitations emerge from what Miller Puckette defines as the Max paradigm. The latter, according to Puckette (2002, p. 31), is “a way of combining predesigned building blocks into configurations useful for real-time computer music performance.” These predesigned building blocks insulate the user from basic programming elements such as data structures, Boolean functions, logic operations, and type definitions, all of which are common constructs in conventional programming languages, including C, the underlying language of Max/MSP. This limits the user’s ability to implement complex algorithms and manipulate more elaborate data structures such as multi-dimensional arrays. In order to create the tools described in this article, I have written new Max externals. My implementation entails making contextual choices that limit the flexibility available to the composer. Hence, the composer can use my TARTYP-based grammar object to regenerate different sets of rewrite rules, but cannot, for example, change the set of terminal symbols in order to refer to a different sound taxonomy or compositional classification. Similarly, the composer cannot change the functions in the K-network code to generate a new type of network, because such changes would require rewriting the externals’ code.

    In future work I propose to develop software to allow composers of interactive music to generate musical structures in real time in the manner of a free improviser. My goal is to design a tool that enables the composer to define a musical language and specify a generative method without resorting to writing his or her own Max externals. The main challenge in developing such a tool will be to accommodate the largest possible array of creative ideas. My plan is to follow the model of the FTM shared library developed by the Real-Time Musical Interactions research team at the Institut de Recherche et de Coordination Acoustique/Musique (IRCAM). This model emerged from the idea that the integration and manipulation of more complex data structures in Max/MSP will “open new possibilities to the user for powerful and efficient data representations and modularization of applications” (Schnell et al. 2005). FTM has been used as the foundation for the design of applications for score following, sound analysis and re-synthesis, statistical modeling, database access, advanced signal processing, and gestural analysis. It includes data structures, editors and visualization tools, expression evaluation, and file import and export. It allows the static and dynamic instantiation of FTM classes and the dynamic creation of objects.


    Conclusion

    This article has presented two generative compositional tools that generate and re-compose unified musical structures in real time. The first tool is based on generative grammars that reflect the structure of Schaeffer’s classification of sound objects as presented in his TARTYP, while the grammars preserve the terminology and meaningful interclass relationships which are an essential part of this table. The article presents a compositional method that utilizes this tool and maintains contextual ties to the same resource. The second tool continues this same generative approach with Klumpenhouwer networks. The level of structural relationships generated by these tools is achieved partly by predetermined programming elements, as well as by a predetermined choice of context and resources. In future work, I hope to develop tools that will allow the user to define musical languages and specify generative methods according to the composer’s choices of context and resources, while maintaining the same level of structural relationships.


    Acknowledgments

    I would like to thank Professor Alberto M. Segre from the University of Iowa Department of Computer Science for his guidance and insight, and for collaborating with me in the development of the software discussed in this article.


    References

    Bel, B. 1992. “Symbolic and Sonic Representations of Sound-Object Structures.” In K. Ebcioglu, O. Laske, and M. Balaban, eds. Understanding Music with AI: Perspectives on Music Cognition. Cambridge, Massachusetts: MIT Press, pp. 65–109.

    Bel, B. 1998. “Migrating Musical Concepts: An Overview of the Bol Processor.” Computer Music Journal 22(2):56–64.

    Bel, B., and J. Kippen. 1992. “Bol Processor Grammars.” In K. Ebcioglu, O. Laske, and M. Balaban, eds. Understanding Music with AI: Perspectives on Music Cognition. Cambridge, Massachusetts: MIT Press, pp. 366–400.

    Block, S. 1990. “Pitch-Class Transformation in Free Jazz.” Music Theory Spectrum 12(2):181–202.

    Chion, M. (1983) 2009. Guide to Sound Objects: Pierre Schaeffer and Musical Research. J. Dack and C. North, trans. Paris: Buchet/Chastel. Available online Accessed 9 January 2013.

    Chomsky, N. 1966. Syntactic Structures. The Hague: Mouton.

    Dack, J. 2001. “At the Limits of Schaeffer’s TARTYP.” In Proceedings of the International Conference “Music without Walls? Music without Instruments?” Available online at documents/art-design-and-humanities-documents/research/mtirc/nowalls/mww-dack.pdf. Accessed 23 October 2013.

    Eigenfeldt, A. 2011. “A-Life Multi-Agent Modeling for Real-Time Complex Rhythmic Interaction.” In E. R. Miranda, ed. A-Life for Music: Music and Computer Models of Living System. Middleton, Wisconsin: A-R Editions, pp. 13–35.

    Heinemann, S. 1998. “Pitch-Class Set Multiplication in Theory and Practice.” Music Theory Spectrum 20(1):72–96.

    Holtzman, S. R. 1981. “Using Generative Grammars for Music Composition.” Computer Music Journal 5(1):51–64.

    Koblyakov, L. 1990. Pierre Boulez: A World of Harmony. New York: Harwood.

    Lambert, P. 2002. “Isographies and Some Klumpenhouwer Networks They Involve.” Music Theory Spectrum 24(2):165–195.

    Lasnik, H., M. Depiante, and A. Stepanov. 2000. Syntactic Structures Revisited: Contemporary Lectures on Classic Transformational Theory. Cambridge, Massachusetts: MIT Press.

    Lerdahl, F., and R. Jackendoff. 1993. “An Overview of Hierarchical Structure in Music.” In S. M. Schwanauer and D. A. Levitt, eds. Machine Models of Music. Cambridge, Massachusetts: MIT Press, pp. 289–312.

    Lewin, D. 1990. “Klumpenhouwer Networks and Some Isographies That Involve Them.” Music Theory Spectrum 12(1):83–120.

    Lewin, D. 1994. “A Tutorial on Klumpenhouwer Networks, Using the Chorale in Schoenberg’s Opus 11, No. 2.” Journal of Music Theory 38(1):79–101.

    Miranda, E. R. 2011. “Preface.” In E. R. Miranda, ed. A-Life for Music: Music and Computer Models of Living System. Middleton, Wisconsin: A-R Editions, pp. xix–xxiv.

    Neuman, I. 2013. “Generative Grammars for Interactive Composition Based on Schaeffer’s TARTYP.” In Proceedings of the International Computer Music Conference, pp. 132–139.

    Normandeau, R. 2010. “A Revision of the TARTYP Published by Pierre Schaeffer.” In Proceedings of the Seventh Electroacoustic Music Studies Network Conference. Available at www.ems-network.Org/IMG/pdf EMS10 Normandeau.pdf. Accessed 10 July 2012.

    Plans, D., and D. Morelli. 2011. “Using Coevolution in Music Improvisation.” In E. R. Miranda, ed. A-Life for Music: Music and Computer Models of Living System. Middleton, Wisconsin: A-R Editions, pp. 37–52.

    Puckette, M. 2002. “Max at Seventeen.” Computer Music Journal 26(4):31–43.

    Schaeffer, P. 1966. Traité des objets musicaux. Paris: Éditions du Seuil.

    Schaeffer, P. 2012. “Schaeffer’s Typology of Sound Objects.” In Cinema for the Ear: A History and Aesthetics of Electroacoustic Music. Available at Accessed 10 July 2012.

    Schnell, N., et al. 2005. “FTM: Complex Data Structures for Max.” In Proceedings of the International Computer Music Conference, pp. 9–12.

    Thoresen, L. 2007. “Spectromorphological Analysis of Sound Objects: An Adaptation of Pierre Schaeffer’s Typomorphology.” Organised Sound 12(2):129–141.
