some mathematical tools for music making

Some Mathematical Tools for Music-Making

Miller Puckette∗

Abstract

Electronic music constantly uses transformations offunctions of time. Some frequently-used mathemat-ical operations are described, with an eye to theireffects on sound spectra and possible musical appli-cations.

1 Introduction

Music, like any art form, defies scientific analysis:the scientific method seeks traits common to a givenspecies, whereas a piece of music is significant pre-cisely because of its differences from other pieces ofmusic. But even if we aren’t able to find many reli-able scientific laws about music, we can sometimescreate interesting new tools for making it. Doingso requires a combination of intuition about musicand mathematical understanding of the underlyingmedium, sound. Here I’ll discuss one point of viewon sound as a medium, which emphasizes its depen-dence on time.

More than any other art form, music concerns it-self with time. A piece of music takes us on a pathfrom its beginning to its end. A work of visual art,reflected from right to left, might still be recognizablythe same work. But a piece of music turned from backto front is garbage. Time is the independent coordi-nate of music on which the dependent ones (pitch,loudness, ...) depend. Musical phrases are functionsof time in the same way that a gesture in a paintingis a function of two spatial dimensions.

This time-orientation may derive partly from thefact that music is usually transmitted using sound.

∗CRCA, Cal(it)2, UCSD. Presented at Art+Math 2005(Boulder, Co.)

When we perceive a sound, we can detect extraor-dinarily small changes in timing, easily down to amillisecond; but our acoustical perception of spatiallocation, even in the best of conditions, is limited toabout one degree of resolution and is usually muchworse. Visually, we can sometimes resolve images toabout 1/60 of a degree, both horizontally and ver-tically, but our eyes usually can’t perceive time tobetter than 50 milliseconds or so of accuracy.

This is reflected in the common digital formatsfor storing sounds and moving images. Sounds areusually stored at between 40,000 and 50,000 framesper second, but with only between 2 and 6 spatialchannels. Video normally requires only 30 framesper second but even low-quality video typically needs500,000 or more spatial channels (pixels).

Since music is made of sound, and since sound,in our perception at least, can approximately be re-duced to one or a few real-valued functions of time,we can get some insight into the workings of music bythinking about the real-valued functions of time andour perception of them. This will never lead to any-thing like a theory of music, but can shed light on thereasons music is what it is in certain respects—andcan help us greatly in our attempts to make music.

1.1 Time symmetry

The space of real-valued functions is symmetric withrespect to translations in time: f(t) 7→ f(t+τ). Thisis a linear operator whose eigenfunctions are sinu-soids:

f(t) = Aeiαt

where A is an amplitude and α an angular frequency.They behave under translations like this:

f(t + τ) = eiατf(t)

1

spectralenvelope

|A|

Figure 1: Spectrum of a periodic signal.

To our great fortune, our ears perform (very approx-imately) a sort of eigenvalue expansion of an incom-ing sound. This feature may have evolved in order tohelp us distinguish individual sound sources from theunruly mixture of sounds that reaches our ears. Ourhearing systems appear to search for additive, ap-proximately periodic components in complex sounds.

1.2 Periodicity

Both natural sounds and the sound of the humanvoice abound with signals (real-valued functions oftime) that are approximately periodic. Since period-icity is a time symmetry, it is natural to look at theeigenvalue expansion of a periodic function of time.A signal f(t) with period τ (and hence with frequency2π/τ radians per unit of time) can be expanded as:

f(t) = . . . + A−1e

−iαt + A0 + A1eiαt + A2e

2iαt + . . .

which is the well-known Fourier series for f(t). Sinceperiodic sounds occupy only a discrete subset of allavailable frequencies, it is possible to imagine con-fronting a spectrum of unknown origin and analyzingit as a sum of periodic functions.

Spectra of periodic functions (their Fourier series)can be graphed as in Figure 1. The amplitude ofthe ith harmonic is |Ai|. The timbre of the periodicsound is thought to depend mostly on a (not well-defined) curve called the spectral envelope, shown inthe figure; if the fundamental frequency is changedbut the spectral envelope kept the same, the resultingsound often has a similar timbre.

Combining two or more periodic signals whose fun-damental frequencies have simple ratios (fractionswith integers less than about 7 in the numerator anddenominator) gives rise to spectra with many sharedfrequencies; this is the basis for the Helmholtz the-ory of harmony [1], which depends on the fact that

|A|

Figure 2: Interference pattern between two time-shifted signals. Top: spectrum of the original signal.Bottom: the result.

the Fourier series, considered as a spectrum, occupiesfrequencies at integer multiples of the fundamentalfrequency.

2 Operations on signals

With this simple spectral model of sounds in mind,we can now develop some of the fundamental tech-niques for operating on sounds. These techniquesrecur constantly in efforts to synthesize and pro-cess musical sounds electronically, for example witha computer.

2.1 Interference patterns (filtering)

Adding a sound to a time-delayed copy of itself setsup an interference pattern in the spectrum of thesound. This is the acoustic analog of a diffractiongrating in optics. If the delay between two copies isτ , the two will interfere constructively at frequencies0, 2π/τ , 4π/τ , . . . and destructively halfway betweenthese points. Figure 2 shows a spectrum of an inputsignal, and the resulting spectrum from adding twocopies, one delayed in time compared to the other.

Linear combinations of many differently delayedcopies of a signal give rise to more complicated in-terference patterns in the spectrum. A huge fieldof study is concerned with choosing particular lin-ear combinations so that the interference pattern hasdesired properties, such as enhancing one frequencyrange compared to another [2]. Electrical engineersand electronic musicians call this technique filtering.

2

Filters arise in the natural world whenever soundencounters a cavity or barrier; for instance, the hu-man vocal tract can be thought of as filtering the rawoutput of the glottis (vocal fold). To see why this isso, imagine the sound of the glottis scattering, sepa-rately, off each point of the surface of the vocal cavity.At the output (the mouth and nose) you get the su-perposition of all the scattered (and hence delayed)copies: a filter. This is essentially the same pictureas Feynman used to describe quantum scattering asa superposition of all possible paths of the particlesin a system.

2.2 Frequency shifting (modulation)

Returning to our complex sinusoid, f(t) = A ·exp(iαt), we try multiplying it by another one, sayg(t) = exp(iβt). We get a third one,

f(t)g(t) = Aei(α+β)t

of frequency α+β. Since in the real world we usuallyhave access only to the real part of an incoming signaland can only send real-valued signals to our speakers,a more frequently encountered scenario is:

f(t) = A cos(αt), g(t) = cos(βt)

f(g)g(t) =A

2[cos((α + β)t) + cos((α − β)t)]

If the function f(t) has many sinusoidal components,by the distributive law, multiplying by g(t) = cos(βt)acts individually on each one. Figure 3 shows a possi-ble spectrum of a periodic function f(t) and the resultof multiplying it by a real-valued sinusoid. Engineersand electronic musicians call this modulation.

The resulting spectrum can again be that of a peri-odic signal (if the ratio of the frequency of the mod-ulating signal g(t) to the original fundamental fre-quency is a fraction with small numerator and de-nominator), or otherwise it might have no audiblefundamental. Both possibilities can be musically use-ful, depending on the context.

The spectral envelope of the result resembles thatof the original, unmodulated signal as long as themodulating frequency is small compared to the orig-inal fundamental; otherwise it can be greatly dis-torted. So the one operation can affect both tuning

|A|

Figure 3: Modulating a periodic signal. Top: Spec-trum of the original signal; Bottom: the result ofmultiplying it by a real-valued sinusoid.

and spectral envelope with some degree of indepen-dent control.

2.3 Nonlinear transfer functions(waveshaping)

A technique familiarized in Rock and Roll music ofthe sixties, but with antecedents in electronic music,is simply to distort sounds to change their timbres. Iff(t) is an incoming signal, we compose it with a non-linear transfer function h(t), and listen to the result,h(f(t)). In the R&R tradition, f(t) is the electricguitar signal and h(t) is the transfer function of theoverdriven amplifier.

For example, let f(t) be a real-valued “sinusoid”with time-varying amplitude:

f(t) = A(t) cos(αt)

and choose h(t) = t2 as a transfer function, giving:

h(f(t)) =A2(t)

2[cos(2αt) + 1]

Possible input and output functions are graphedin Figure 4. If h(t) is chosen to be a polynomial ora convergent power series, the actions of the mono-mials in h(t) will be mixed in ratios depending onA(t): changing amplitude in the input changes tim-bre in the output. Skillfully chosen input and transferfunctions can give rise to “nice” spectra; for instance,choosing

h(t) =1

1 + t2

3

t

f(t)

Figure 4: Waveshaping a sinusoid. Top: the incomingsound as a function of time. Bottom: the result ofapplying a simple non-linear function.

turns a sinusoid into a spectrum with exponentiallydropping partials, with the rate of rolloff determinedby A [3].

Using primarily these three fundamental tech-niques, electronic musicians operate on a startingpalette of sinusoids, white noise, and/or recordedsounds, to produce a huge variety of electronic soundsfor use as raw materials in making new forms of mu-sic.

3 Analysis

It is frequently desirable, in dealing with soundselectronically, to analyze the frequency content of asound. In the simplest situation we assume the soundis a finite sum of sinusoids and we would like to knowtheir frequencies and (complex) amplitudes. Thiswould be easy except for the fact that the frequen-cies and amplitudes in question are usually changing,sometimes quite rapidly. Very few sounds in natureor of electronic origin are well modeled as a sum ofeternally unchanging sinusoids.

The most frequently taken approach to this prob-lem is to extract a short segment of the signal tobe analyzed, hoping that whatever components arepresent haven’t had time to vary much within thesegment. For example, if the function to analyze isf(t), one could first multiply it by a windowing func-tion such as:

w(t) =

{

12 [cos(2πt/S) + 1] |t| < S/20 otherwise

real

imaginary

t

Figure 5: A (complex-valued) sinusoidal wave packetas a function of time.

where S is the length of the segment to analyze. Sup-pose that f is a complex sinusoid:

f(t) = Aeiαt

so that the product w(t)f(t) is as shown in Figure5. The Fourier transform of the product (as an L2

function [4, p. 168]) is:

FT {w(t)f(t)} (ω) = FT {w(t)} (ω − α)

or, in words, it is just the Fourier transform of thewindowing function w(t) shifted in frequency by α,as shown in Figure 6.

The main peak of Figure 6 is 8π/S in width, cen-tered about α. The shorter we make the time seg-ment S, the more spread-out the frequency-domainpeak will appear. This is the Heisenberg uncertaintyprinciple in action [5, p. 126].

Since the Fourier transform is linear, a superposi-tion of sinusoids would give a superposition of peakson the frequency axis. To fully resolve them we wouldneed the peaks to be separated by the peak width,8π/S. If the sound is periodic, the analysis shouldbe done over a length of time containing at least fourperiods of the sound. (In practice we can often allowsome overlap, reducing this to about three).

Looking at a sound’s Fourier transform, we candetermine the frequencies and amplitudes of its sinu-soidal components by fitting the observed peaks withtheir known theoretical shape. The spectral envelopecan be estimated as well.

Furthermore, a given audio signal can be modi-fied in interesting ways by taking its Fourier trans-form (using a sequence of overlapping analysis seg-ments called windows), performing some operation,

4

amplitude

Figure 6: Fourier transform of the wave packet.

and then taking the inverse Fourier transforms to re-construct a modified signal. For example, the spec-tral envelope of one signal can be “stamped” onanother by modifying the magnitude of the latter’sFourier transform non-uniformly.

Many details have been glossed over in this verybrief introduction; moreover, only that part of elec-tronic music which deals with realization of pieces ofmusic has been treated. Present-day research alsotouches on the design of real-time software systemsand human-friendly controls for music making; com-puter understanding of musical form and computer-aided composition; music perception; and music inmultimedia applications. Within the narrower fielddescribed here, many problems remain open andimprovements are constantly sought in our existingrepertoire of techniques. Mathematicians can findexcellent work here.

References

[1] Hermann Helmholtz. On the Sensations of Tone.Dover, New York, fourth edition, 1954. Transla-tion, A.J. Ellis.

[2] Kenneth Steiglitz. A Digital Signal Processing

Primer. Addison-Wesley, Menlo Park, California,1996.

[3] Miller S. Puckette. Formant-based audio synthe-sis using nonlinear distortion. Journal of the Au-

dio Engineering Society, 43(1):224–227, 1995.

[4] Walter Rudin. Functional Analysis. McGraw-Hill,New York, 1973.

[5] Charles F. Stevens. The Six Core Theories of

Modern Physics. MIT Press, Cambridge, Mas-sachusetts, 1995.

5

some mathematical tools for music making

Documents