new noninvasive measurement of the cochlear traveling-wave...

J. Acoust. Soc. Am. 93 (6), June 1993, p. 3333. 0001-4966/93/063333-20$06.00 © 1993 Acoustical Society of America


Noninvasive measurement of the cochlear traveling-wave ratio

Christopher A. Shera and George Zweig

Hearing Research Laboratory, Signition, Inc., P.O. Box 1020, Los Alamos, New Mexico 87544; Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545; and Physics Department, California Institute of Technology, Pasadena, California 91125

(Received 6 February 1992; accepted for publication 1 March 1993)

The microstructure of threshold hearing curves and the frequency spectra of evoked otoacoustic emissions both often evince a roughly periodic series of maxima and minima. Current models for the generation of otoacoustic emissions explain the observed spectral regularity by supposing that, since the cochlea maps frequency into position, the spectral periodicity mirrors a spatial oscillation in the mechanics of the organ of Corti. In this view emissions are generated when forward-traveling waves reflect from periodic corrugations in the mechanics, suggesting that the amplitude of the cochlear traveling-wave ratio (defined to be the ratio of the backward- and forward-traveling cochlear waves at the stapes) should manifest pronounced maxima and minima with a corresponding periodicity. This paper describes measurements of stimulus-frequency emissions, establishes their analyticity properties, and uses them to explore the spatial distribution of mechanical inhomogeneities (emission "generators") in the human cochlea. The approximate form and frequency dependence of the cochlear traveling-wave ratio are determined noninvasively. The amplitude of the empirical traveling-wave ratio is a slowly varying, nonperiodic function of frequency, suggesting that the distribution of inhomogeneities is uncorrelated with the periodicity found in the threshold microstructure. The observed periodicities arise predominantly from the cyclic variation in relative phase between the forward- and backward-traveling waves at the stapes.

PACS numbers: 43.64.Kc, 43.64.Jb, 43.64.Bt, 43.64.Yp

INTRODUCTION

The study of cochlear mechanics began with Helmholtz (Helmholtz, 1863), who viewed the organ of Corti as a miniature harp connected string by string to neural fibers. Sensations of tone were created as sound waves induced the strings to resonate in sympathetic vibration, exciting corresponding fibers that sent electrical signals to the brain. This view of cochlear mechanics was overturned by the experiments of von Békésy (von Békésy, 1960), who showed that structures within the organ of Corti are not under tension, at least in cadavers. By directly observing the motion of the basilar membrane, von Békésy demonstrated that a pure tone generates a forward-traveling wave that propagates through the cochlea to a region of maximal membrane displacement beyond which it is strongly attenuated. The region of maximal displacement varies monotonically with the frequency of the tone. Low-frequency tones stimulate regions near the apex of the cochlea; higher-frequency tones excite regions closer to the stapes. This "textbook" picture of traveling-wave excitation in the cochlea was, until recently, believed correct at all stimulus levels.¹

It is now known, however, that the ear creates sound while listening to sound (Kemp, 1978). For example, a recent model of cochlear mechanics deduced from measurements of basilar-membrane motion (Zweig, 1991) predicts that cellular force generators in the cochlea (presumably the outer hair cells) amplify traveling waves somewhat as a laser amplifies light. Consequently, small backward-traveling waves, originating from forward-traveling waves by scattering from spatial inhomogeneities in the mechanics of the organ of Corti (Manley, 1983; Wright, 1984; Lonsbury-Martin et al., 1988), are amplified as they travel backwards to the stapes, from which they are partially reflected. Unreflected waves vibrate the middle-ear bones and ultimately appear in the ear canal as sound ("otoacoustic emissions"). The generation of large backward-traveling waves radically changes our view of wave motion in the cochlea at low sound-pressure levels. The superposition of forward- and backward-traveling waves leads to a standing-wave component in the cochlear response. In this picture, active elements amplify the forward and backward waves, thereby increasing the sensitivity of hearing.

The threshold hearing curve shows periodic minima (Elliot, 1958) at frequencies that correlate strongly with maxima in the spectra of otoacoustic emissions (Horst et al., 1983; Zwicker and Schloth, 1984). The ear emits most loudly at those frequencies for which it is most sensitive. Cochlear excitation patterns produced when the ear listens to quiet sounds are qualitatively different from those produced in response to louder sounds, presumably because the cellular force generators are limited in the energy they can emit.

Otoacoustic emissions and the microstructure of the threshold hearing curve may be controlled from the central nervous system. Experiments have shown, for example, that contralateral tones can alter both the amplitude and frequency of spontaneous and evoked emissions (Mott et al., 1989). Whitehead (1991) has described similar, centrally mediated variations in emission characteristics. In addition, careful measurements of stimulus-frequency emissions (Zwicker and Schloth, 1984) have an analytic structure inconsistent with that of a causal system (Shera and Zweig, 1986, unpublished analysis), raising the intriguing possibility that feedback from the brain plays a major role in controlling evoked emission. Since the stimuli used in the measurements are periodic and therefore "predictable," the brain may actually be anticipating its input and altering the mechanical state of the cochlea accordingly. As seen from the ear canal, the cochlea would then appear acausal in its response.

This paper describes measurements of otoacoustic emissions performed with an accuracy sufficient to test this apparent acausality and to determine the frequency dependence of the complex-valued traveling-wave ratio. Determination of the traveling-wave ratio enables one to explore the nature of the mechanical inhomogeneities (or, more generally, the emission "generators") from which backward-traveling waves originate. [The traveling-wave ratio, denoted $R(\omega)$, is defined to be the ratio of the backward- to the forward-traveling wave measured at the basal end of the cochlea near the stapes.] The distribution of inhomogeneities cannot currently be determined from measurements of basilar-membrane motion because those measurements are made at a single point. A quantitative analysis of otoacoustic emissions, however, makes an exploration of the spatial variation of mechanical properties possible.

Two simple spatial distributions of the inhomogeneities are consistent with current experiments. Since the cochlea maps frequency into position, one possibility is that the spatial variation of mechanical characteristics correlates strongly with the periodicities observed in otoacoustic emissions and the threshold microstructure (e.g., Manley, 1983; Strube, 1985; Peisl, 1988; Strube, 1989; Fukazawa, 1992). For example, the effective damping of the organ of Corti could be made to mirror the microstructure of the threshold hearing curve, with smaller damping (and therefore greater sensitivity) occurring at points corresponding to threshold minima (where quieter sounds can be detected).

Another possibility is that the spatial variation of mechanical characteristics is uncorrelated with the periodicities manifest in the frequency distribution of spectral peaks in otoacoustic emissions and the corresponding minima in the microstructure of the threshold hearing curve. In this case the inhomogeneities could, for example, be densely and randomly distributed along the organ of Corti and the periodic spectral variations due entirely to the frequency dependence of the relative phase at the stapes of the forward- and backward-traveling waves, leading to an alternation of constructive and destructive interference as the frequency of the stimulus is varied monotonically. A mechanism explaining the emergence of such striking spectral order through the scattering of cochlear waves by what may be essentially random spatial inhomogeneities is presented elsewhere (Shera and Zweig, 1993).

Evidence for or against these two possibilities, both capable of explaining the oscillatory structure of the measured ear-canal pressure $P_{ec}$ and the correlated threshold microstructure, is provided by the form of the traveling-wave ratio $R$, which contains information, carried back to the stapes by the reflected wave, about possible spatial inhomogeneities in apical regions of the cochlea. A traveling-wave ratio whose amplitude $|R|$ manifests correlated, periodic variations (displaying, for example, pronounced maxima at the "dip" frequencies of the hearing threshold curve and the corresponding peaks of $|P_{ec}|$) strongly suggests a spatial variation of mechanical parameters (distribution of emission generators) that mirrors the hearing threshold curve, with reflection ("stimulus reemission") occurring at discrete, periodic intervals.² Alternatively, a traveling-wave ratio whose amplitude varies relatively slowly and nonperiodically suggests that the underlying inhomogeneities are dense in the medium and their spatial distribution more irregular or random (Shera and Zweig, 1993). The oscillations in ear-canal pressure $P_{ec}$ then arise because the phase of $R$ varies monotonically with increasing frequency, so that the phase factor $R/|R|$ alternately passes through plus and minus one. The existence of these two possibilities is evident from the limiting case in which the middle ear is idealized as a simple mechanical transformer (and the Norton equivalent impedance of the transducer system is assumed infinite). The ear-canal pressure $P_{ec}(\omega;R)$ then has the form³ (Kemp, 1980; Shera and Zweig, 1992b)

$$\frac{P_{ec}(\omega;R)}{P_{ec}(\omega;0)} = \frac{1+R}{1-R}. \tag{1}$$
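As a rough numerical illustration of Eq. (1), the following Python/NumPy sketch (not part of the original work) evaluates the pressure ratio for a traveling-wave ratio whose amplitude is constant and whose phase rotates linearly with frequency. The amplitude of 0.2 and the 100-Hz phase period are arbitrary assumptions chosen for illustration, not measured values.

```python
import numpy as np

def pressure_ratio(R):
    """Eq. (1): P_ec(w; R) / P_ec(w; 0) for the idealized middle ear."""
    return (1 + R) / (1 - R)

# Hypothetical traveling-wave ratio: constant amplitude, linearly rotating phase.
f = np.linspace(1000.0, 2000.0, 2001)      # frequency grid (Hz), 0.5-Hz steps
R_amp = 0.2                                 # assumed |R|, independent of frequency
period_hz = 100.0                           # assumed phase period (illustrative)
R = R_amp * np.exp(-2j * np.pi * f / period_hz)

level_db = 20 * np.log10(np.abs(pressure_ratio(R)))

# |R| is flat, yet the pressure ripples between the extremes reached
# when R passes through +|R| and -|R|.
peak_db = 20 * np.log10((1 + R_amp) / (1 - R_amp))
print(f"ripple: {level_db.min():+.2f} dB to {level_db.max():+.2f} dB")
```

With these assumed numbers the level swings by about ±3.5 dB once per 100 Hz, even though $|R|$ itself never changes. This is the second of the two possibilities described above: periodic structure in the pressure without periodic structure in $|R|$.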

Although $R$ cannot be determined from measurements of otoacoustic emissions without detailed knowledge of middle-ear transfer coefficients, certain characteristics of its frequency variation can be found by exploiting a property of those emissions that enables one to "filter out" many of the unknown effects of the stimulus-delivery system and the middle ear. The oscillations in the ear-canal pressure $P_{ec}$ are superimposed on a smooth, slowly varying "background," whose form is determined by the transfer characteristics of the measuring apparatus, the external and middle ears, and cochlear mechanics and geometry near the stapes. Those two components of the pressure can be separated by filtering. The assumption that the unknown transfer characteristics are slowly varying then enables one to determine the principal frequency variation of $R$.

Starting with the empirical form for $R$ determined here, another paper solves the cochlear "inverse scattering problem" and outlines a candidate theory for the origin of evoked emission (Shera and Zweig, 1993). That model shows how the measurements and analysis techniques discussed below can be employed to provide noninvasive probes of cochlear mechanics and determine characteristics of the impedance of the organ of Corti near the peak of the basilar-membrane response function.


FIG. 1. Schematic diagram of the stimulus-delivery and recording system.

I. THE MEASUREMENT

A. Equipment and methods

Otoacoustic emissions were measured in the human ear canal using a setup whose block diagram is illustrated in Fig. 1. Acoustic stimuli were delivered and the response recorded with miniature transducers sealed in the ear canal (the Etymōtic Research ER-2 earphone and ER-10 microphone and preamp). Stimulus-frequency emissions were measured with the two-channel Hewlett-Packard HP-3562A signal analyzer operating in swept-sine mode (Blackham et al., 1987). The output of the HP-3562A sinusoidal voltage oscillator was fed to channel A of the analyzer, which adjusted the source to maintain a constant amplitude, and then attenuated (Wavetek 5080.1) before being delivered to the ER-2 earphone. With a constant input voltage, the ER-2 is designed to produce an approximately constant sound pressure at the eardrum. After amplification (Brüel & Kjær 2610), the microphone signal from the ER-10 was filtered (Wavetek/Rockland 852, 4th-order Butterworth high-pass filter with a cutoff frequency of 600 Hz) and returned to the signal analyzer (channel B), which measured the complex ratio of the voltage signals at its two input ports. Stimulus frequencies were stepped in a phase-continuous manner (e.g., at zero crossings) from the highest measured frequency to the lowest. (Because high frequencies are mapped closer to the stapes than lower frequencies, sweeping from high to low advances the traveling-wave envelope into previously unstimulated regions of the cochlea. In practice, the direction of the sweep appears to make little difference, except perhaps at frequencies near a spontaneous emission.) To ensure a steady-state response, the analyzer allowed a settling time of 20 ms after stepping to a new frequency. In addition, the response at each frequency was measured for an integration time $\Delta T$ (with $\Delta T > 200$ ms) chosen to be much greater than reported latencies of click-evoked echoes, which are typically less than 20 ms for all frequencies greater than 800 Hz (e.g., Neely et al., 1988; Zweig et al., 1992).
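To make concrete what "the complex ratio of the voltage signals" at one stimulus frequency means, here is a minimal Python/NumPy sketch that projects two sampled channels onto a complex exponential over the integration time and divides the results. It is a generic single-frequency Fourier estimate and is not claimed to be the HP-3562A's internal algorithm; the sample rate, amplitudes, phase lag, and noise level are all hypothetical.

```python
import numpy as np

def complex_amplitude(signal, t, freq_hz):
    """Project a sampled signal onto exp(-i 2 pi f t) (single-frequency Fourier sum)."""
    return 2.0 * np.mean(signal * np.exp(-2j * np.pi * freq_hz * t))

def channel_ratio(v_b, v_a, t, freq_hz):
    """Complex ratio V_B / V_A at the stimulus frequency."""
    return complex_amplitude(v_b, t, freq_hz) / complex_amplitude(v_a, t, freq_hz)

# Synthetic example (all values hypothetical).
fs = 50_000.0                    # sample rate (Hz)
freq = 1500.0                    # stimulus frequency (Hz)
T_int = 0.2                      # integration time (s), a whole number of cycles
t = np.arange(int(fs * T_int)) / fs

v_a = np.cos(2 * np.pi * freq * t)                      # reference channel
v_b = 0.5 * np.cos(2 * np.pi * freq * t - np.pi / 3)    # attenuated, phase-lagged
rng = np.random.default_rng(0)
v_b_noisy = v_b + 0.05 * rng.standard_normal(t.size)    # additive microphone noise

expected = 0.5 * np.exp(-1j * np.pi / 3)
ratio_clean = channel_ratio(v_b, v_a, t, freq)
ratio_noisy = channel_ratio(v_b_noisy, v_a, t, freq)
```

Because uncorrelated noise averages down roughly as the square root of the number of samples, lengthening the integration time improves the estimate, which is consistent with the longer integration times used for the low-level measurements reported later.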

Subjects were seated comfortably in a reclining chair (BackSaver Classic) in a double-walled sound booth (Industrial Acoustics). Each of the three subjects reported on here was between 20 and 40 years of age and had audiometrically normal hearing.

FIG. 2. Equivalent circuit for the stimulus-delivery and recording system, which is represented by its Norton equivalent source amplitude $U_s$ and source admittance $Y_s$. The residual ear-canal space, middle ear, and vestibular space are represented by an equivalent two-port network and its transfer matrix. Seen from the basal end of the organ of Corti, the cochlear response at the driving frequency is characterized by the input impedance $Z(\omega;A)$.

B. Equivalent circuit

In Fig. 2 the acoustic properties of the stimulus-delivery and recording system are represented by their Norton equivalent source $U_s$ and source admittance $Y_s$. The volume velocity $U_s$ is related to the voltage $V_i$ delivered by the voltage source (and sent, via an attenuator, to the earphone) through the transfer function $K_i$; that is,

$$U_s = K_i V_i. \tag{2}$$

Similarly, the transfer function $K_o$ relates the ear-canal pressure $P_{ec}$ and the measured microphone output voltage $V_o$:

$$P_{ec} = K_o V_o. \tag{3}$$

The measurement thus consists of determining the dimensionless ratio

$$\rho'(\omega) \equiv V_o / V_i. \tag{4}$$

(The use of the diacritical prime is discussed below.)

When the ratio $\rho'$ is independent of stimulus amplitude $A$, the circuit illustrated in Fig. 2 implies that

$$\rho'(\omega) = \frac{Y' Z_{ec}}{1 + Y_s Z_{ec}}, \tag{5}$$

where

$$Y' \equiv K_i / K_o, \tag{6}$$

and $Z_{ec}(\omega)$ represents the ear-canal load impedance seen from the tip of the earphone assembly. The residual ear-canal space and middle ear (including the vestibular space) are represented as a reciprocal two-port network (e.g., Friedland et al., 1961; Shera and Zweig, 1992a) with transfer matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The load impedance $Z_{ec}$ therefore has the value

$$Z_{ec} = \frac{aZ + b}{cZ + d}, \tag{7}$$

where $Z(\omega)$ is the cochlear input impedance. Thus $Z_{ec}$ depends on the mechanics of the ear canal, the middle ear, and the cochlea.
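The chain from cochlear impedance to measured ratio can be sketched in a few lines of Python. The sketch assumes the readings $Z_{ec} = (aZ+b)/(cZ+d)$ for Eq. (7) and $\rho' = Y' Z_{ec}/(1 + Y_s Z_{ec})$ for Eq. (5), the latter following from the Norton circuit of Fig. 2 because the source current $U_s$ divides between $Y_s$ and $1/Z_{ec}$. The lossless-tube matrix elements and every numerical value below are hypothetical and serve only to exercise the formulas.

```python
import cmath

def load_impedance(Z, a, b, c, d):
    """Eq. (7): ear-canal load impedance seen through the two-port network."""
    return (a * Z + b) / (c * Z + d)

def rho_prime(Z_ec, Y_prime, Y_s):
    """Eq. (5): measured voltage ratio for a Norton source with admittance Y_s."""
    return Y_prime * Z_ec / (1 + Y_s * Z_ec)

# Hypothetical reciprocal two-port: a lossless tube section of electrical
# length theta and characteristic impedance Z0 (illustrative values only).
theta, Z0 = 0.4, 2.0
a = d = cmath.cos(theta)
b = 1j * Z0 * cmath.sin(theta)
c = 1j * cmath.sin(theta) / Z0
determinant = a * d - b * c          # equals 1 for a reciprocal network

Z_cochlea = 3.0 + 1.0j               # hypothetical cochlear input impedance
Z_ec = load_impedance(Z_cochlea, a, b, c, d)
measured = rho_prime(Z_ec, Y_prime=0.5, Y_s=0.1)
```

Two limits are worth noting. With an identity transfer matrix the load is simply the cochlear impedance, and with $Y_s = 0$ (an ideal current source) the measured ratio reduces to $Y' Z_{ec}$.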


FIG. 3. Control measurements in a cavity. The figure plots, as a function of frequency (1000 to 3000 Hz), the function $\rho_{cav}(\omega;A)$ measured in a rigid-walled cylindrical cavity (consisting of a hole of approximate length 3.1 cm and volume 1.6 cm³ drilled in a block of Plexiglas) at two different stimulus levels: (△) measurements at 40 dB SL; (▽) measurements at 20 dB SL. The measurement integration time was one second. Except for increased noise at the lower level (especially noticeable about the impedance minimum near 2700 Hz, a value in close agreement with the resonant frequency, $f = c/4l$, expected for a closed tube of length $l$), $\rho_{cav}$ is, as expected, independent of $A$ (note that the point symbols superpose to form a six-pointed star). The phase of $\rho_{cav}$ is dominated by a delay originating primarily in the stimulus-delivery tubes. With that delay subtracted, $\rho_{cav}$ is a minimum-phase function, as indicated by the solid line, which represents a smoothed, minimum-phase fit to the measurements at 40 dB SL (Zweig and Konishi, 1987; Konishi and Zweig, 1989).
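The quoted agreement between the impedance minimum and the quarter-wave resonance can be checked with a line of arithmetic. The Python sketch below assumes a sound speed of 343 m/s (air at about 20 °C), which the text does not state.

```python
def quarter_wave_resonance_hz(length_m, sound_speed_m_s=343.0):
    """First resonance f = c / (4 l) of a tube closed at one end."""
    return sound_speed_m_s / (4.0 * length_m)

f_res = quarter_wave_resonance_hz(0.031)   # cavity length of 3.1 cm
print(f"{f_res:.0f} Hz")
```

Under that assumption the resonance falls near 2766 Hz, close to the observed minimum near 2700 Hz given that the cavity length is only approximate.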

As a control and check of the linearity of the measurement system, the ratio $\rho'$ was measured with the probe assembly inserted into a rigid-walled cylindrical cavity. The resulting measurements of $\rho_{cav}$, performed at sound pressure levels corresponding to sensation levels (SL) of 20 and 40 dB above threshold, are shown in Fig. 3. Except for differences in the noise level near the impedance minimum (arising from the zero in the reactance at the cavity resonance), the two data sets are indistinguishable.

Chiefly because of delays introduced by the stimulus-delivery tubes, the phase of $K_i$, and hence of $\rho'$, is dominated by a delay $e^{-i\omega T}$, where the delay time $T$ falls in the range 1-2 ms. Rather than working with $\rho'$, it is thus convenient to define the quantities

$$\rho \equiv \rho' e^{i\omega T} \quad \text{and} \quad Y \equiv Y' e^{i\omega T}, \tag{8}$$

in which that delay has been removed. With the delay subtracted, $\rho_{cav}$ becomes a minimum-phase function. In all measurements that follow, the delay time $T$, determined individually for each measurement by using a least-squares linear fit to the phase of $\rho'$, has been removed in this way.
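One way to carry out the delay removal of Eq. (8) is sketched below in Python with NumPy: unwrap the phase of $\rho'$, fit a straight line in angular frequency by least squares, and multiply by $e^{i\omega T}$ with $T$ taken from the slope. The function name and the synthetic data are assumptions made for illustration; the original analysis code is not available.

```python
import numpy as np

def remove_delay(freq_hz, rho_prime):
    """Fit a line to the unwrapped phase of rho' and strip the implied delay.

    Returns (rho, T) with rho = rho_prime * exp(+i w T) and T in seconds.
    """
    omega = 2 * np.pi * np.asarray(freq_hz)
    phase = np.unwrap(np.angle(rho_prime))
    slope, _intercept = np.polyfit(omega, phase, 1)   # phase ~ slope * w + const
    T = -slope                                         # delay factor is exp(-i w T)
    return rho_prime * np.exp(1j * omega * T), T

# Synthetic check: a constant complex background behind a pure 1.5-ms delay.
freq = np.linspace(1000.0, 2000.0, 201)
background = 3.0 * np.exp(0.4j)
T_true = 1.5e-3
measured = background * np.exp(-2j * np.pi * freq * T_true)

rho, T_fit = remove_delay(freq, measured)
```

The frequency spacing must be fine enough that the phase changes by less than π between adjacent points, otherwise the unwrapping step will miscount cycles and the fitted delay will be wrong.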

The qualitative features of $\rho$ for the human ear are described in Sec. II, where two components of the response are identified. Those components are separated in Sec. III. Sections IV and V then show how contributions to $\rho$ arising from the reflection of cochlear waves can be extracted from the measured response.

FIG. 4. Variation with sound pressure level. The figure plots, as a function of frequency (1000 to 2000 Hz), the functions $\rho(\omega;A)$ measured in subject JEM-R at stimulus levels $A$ indicated on the right in dB relative to threshold (SL). The average values of the curves approximately superpose but have been offset by 5 dB for clarity. The vertical dotted lines indicate the frequencies of the subject's known spontaneous emissions in this frequency range (cf. Fig. 5). At the highest level, $\rho(\omega;A)$ is a smooth, slowly varying function of frequency. As the stimulus level is reduced, oscillations with a period of roughly 100 Hz appear superimposed on that smooth background. Note the increased noise, which originates almost entirely from $V_o$, at lower stimulus levels. In the data discussed below (and in Fig. 3), the signal-to-noise ratio was substantially improved by increasing the measurement integration time from the 200 ms used here.

II. FEATURES AND ANALYTICITY OF $\rho(\omega;A)$

Figure 4 shows the results of a series of measurements performed at varying sound-pressure levels (subject JEM-R). Two components are identifiable, distinguished by their behavior with stimulus level. At the highest levels, $\rho$ is a smooth and slowly varying function of frequency. As the stimulus level is reduced, however, an oscillatory component, with a period of roughly 90 Hz, appears superimposed on that smooth background, which may itself vary with stimulus level. The observations suggest that at low levels (where, as shown below, the response is linear) the function $\rho(\omega;A)$ can be viewed as the sum of two components: a slowly varying background component arising predominantly from the middle and external ears, and an oscillatory component, presumably arising from "stimulus reemission" within the cochlea. In Sec. III those two components of the response are separated, focusing on the region between the two spontaneous emissions at approximately 1200 and 1660 Hz (cf. Fig. 5).

FIG. 5. Spectral density of the ear-canal pressure, plotted against frequency (1000 to 2000 Hz), measured without external stimulation in subject JEM-R (average of 800 spectra). The vertical dotted lines, positioned at reproducible spectral peaks, indicate the known spontaneous emissions in this frequency range. Subsequent analysis focuses on the region between the two spontaneous emissions near 1200 and 1660 Hz.

A. Linearity at low levels

Figure 6 demonstrates the appearance of a linear regime by overlaying measurements made at 0 and 5 dB relative to threshold (SL). The measurements superpose over much of the frequency range, although a frequency-dependent temporal shift in the background component is apparent, especially above 1500 Hz. (Similar changes in the background component are apparent when consecutive measurements are made at the same stimulus level, indicating that the variation is due principally, if not entirely, to temporal drift, rather than to a frequency-dependent nonlinearity in the mechanics responsible for the background.) A technique for estimating the background that is not subject to artifacts introduced by such shifts is introduced in Sec. III. Subsequent sections then focus on finding the form of the traveling-wave ratio in the linear regime.

FIG. 6. Linearity at low levels. The figure plots, as a function of frequency (1200 to 1500 Hz), the functions $\rho(\omega;A)$ measured (with an 8-s integration time) in subject JEM-R at stimulus levels 0 dB SL (points joined by a dotted line) and 5 dB SL (triangles joined by a line). As in Fig. 4, the vertical dotted line indicates a known spontaneous emission. Aside from a low-frequency drift in the background (most probably due to static pressure changes in the middle-ear cavities and/or temperature variations in the recording microphone), the two functions nearly superpose, indicating that the response is linear at these levels.

FIG. 7. Measurements of ear-canal pressure at 10 dB SL (solid line), plotted against frequency (1400 to 1800 Hz), from Fig. 2 of Zwicker and Schloth (1984). The dashed line in the upper (lower) panel represents the Hilbert transform of the data in the lower (upper) panel. Were the measurements both causal and minimum phase, the solid and dashed lines would everywhere superpose. (Allowing for the presence of a time delay similar to that subtracted from $\rho'$ does not improve the agreement.) Although measurement errors are not given in the paper, the smoothness of the curves suggests that the random errors are small. A Hilbert-transform analysis of the real and imaginary parts of the pressure indicates that unless their errors are substantially greater than implied, the measurements could not have originated in a causal system.

B. Analyticity properties at low levels

Among the most careful published measurements of stimulus-frequency emissions are those of Zwicker and Schloth (1984). Representative measurements are reproduced in Fig. 7 (data from Fig. 2 recorded from subject A.S.1 in the low-level linear regime at 10 dB SL). As shown below, however, their data have an analytic structure inconsistent with that of a causal system.

Causality requires that output never precede input. Restated in the frequency domain, causality requires that the real and imaginary parts of the measured pressure be Hilbert transforms of one another (Bode, 1945). Thus,⁴

$$\mathrm{Re}\{P(\omega)\} = -\frac{1}{\pi}\, \mathcal{P}\!\!\int_{-\infty}^{\infty} \frac{\mathrm{Im}\{P(\omega')\}}{\omega' - \omega}\, d\omega', \tag{9}$$

and

$$\mathrm{Im}\{P(\omega)\} = \frac{1}{\pi}\, \mathcal{P}\!\!\int_{-\infty}^{\infty} \frac{\mathrm{Re}\{P(\omega')\}}{\omega' - \omega}\, d\omega', \tag{10}$$

where $\mathcal{P}\!\int$ represents a Cauchy principal-value integral. The measured pressure must satisfy these relations unless there are other, unaccounted-for inputs to the ear modifying the response (e.g., signals coming from the central nervous system). As measured from the ear canal, the ear might then appear acausal in its response.

FIG. 8. Analyticity properties of $\rho(\omega)$. The figure plots, as a function of frequency (1200 to 1500 Hz), the function $\ln \rho$ at 5 dB SL (data from Fig. 6) together with smoothed, minimum-phase fits (solid lines) to the measurements (Zweig and Konishi, 1987; Konishi and Zweig, 1989). The error bars (estimated by comparison with the fits) correspond to 0.125 dB in the amplitude and 0.8° in the phase. Except near the spontaneous emission (dotted line), the fit is excellent. Unlike the measurements of Zwicker and Schloth (1984), which, if accurate, could not have originated in a causal system, the measurements of stimulus-frequency emissions reported here are both causal and minimum-phase.
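The causality test expressed by the Hilbert-transform relations can be illustrated on sampled data. The Python/NumPy sketch below uses the discrete analogue for a finite-length sequence: the real part of the spectrum fixes the even part of the impulse response, and if the response is causal the odd part, and hence the imaginary part of the spectrum, follows. This is a generic DFT construction chosen for illustration, not the procedure used in the paper, and the impulse response is synthetic.

```python
import numpy as np

def imag_from_real_causal(re_H):
    """Predict Im{H} from Re{H}, assuming the impulse response is causal.

    re_H is sampled on a full DFT grid; the impulse response is assumed
    to vanish over the second half of the grid (the "negative times").
    """
    n = len(re_H)
    h_even = np.fft.ifft(re_H).real       # even part of the impulse response
    fold = np.zeros(n)
    fold[0] = 1.0
    fold[1:(n + 1) // 2] = 2.0            # double the positive-time samples
    if n % 2 == 0:
        fold[n // 2] = 1.0
    return np.fft.fft(h_even * fold).imag

n = 256
k = np.arange(n)
h = np.where(k < 64, np.exp(-k / 10.0) * np.cos(0.3 * k), 0.0)   # causal response
H = np.fft.fft(h)
err_causal = np.max(np.abs(imag_from_real_causal(H.real) - H.imag))

# The same response started 8 samples "early" violates causality.
H_early = np.fft.fft(np.roll(h, -8))
err_acausal = np.max(np.abs(imag_from_real_causal(H_early.real) - H_early.imag))
```

For the causal response the predicted and actual imaginary parts agree to rounding error, whereas for the shifted response they disagree substantially. A mismatch of this kind between predicted and measured curves is what signals apparent acausality in measured data.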

Shown in Fig. 7 for comparison with the measurements are the corresponding Hilbert-transform pairs, in which the measurements of the amplitude were used to predict the phase, and vice versa.⁵ Unless the measurement errors are much larger than suggested by the apparent noise level, the expected analyticity properties would normally require that the measured and predicted curves superpose.

To explore, however, whether the apparent acausality reflects changes in the mechanical state of the ear induced by feedback from the brain, Fig. 8 plots the real and imaginary parts of $\ln\mathcal{P}$ at 5 dB SL (subject JEM-R), together with smoothed, minimum-phase fits to the measurements (Zweig and Konishi, 1987; Konishi and Zweig, 1989). The plotted error bars, estimated by comparison with the minimum-phase fit, correspond to 0.125 dB in the amplitude and 0.8° in the phase. The fit is everywhere excellent, except within the immediate neighborhood of the known spontaneous emission.⁶ Everywhere else, however, $\mathcal{P}$ satisfies not only the constraints of causality but also the more stringent analyticity requirements of minimum-phase behavior (Bode, 1945). Demonstrated here in the linear regime in one subject, these analyticity properties hold at all stimulus levels and are universal among measurements we have examined (seven ears in four subjects).

Although the origin of the peculiar analyticity properties of the measurements of Zwicker and Schloth is not known, their apparent acausality appears unlikely to reflect a general involvement of the central nervous system in the generation or control of otoacoustic emissions.

III. SEPARATING THE CROOKED FROM THE STRAIGHT

Understanding the origin of the oscillatory component in $\mathcal{P}(\omega)$ is facilitated by subtracting the smooth background and working with the quantity $\Delta(\omega)$, defined by

$$\Delta(\omega) = \frac{\mathcal{P}(\omega) - \mathcal{P}_0(\omega)}{\mathcal{P}_0(\omega)}, \tag{11}$$

where $\mathcal{P}_0$ is the background. (Since the analysis focuses on measurements in the linear regime, the dependence on stimulus amplitude $A$ has been omitted.) Defining $\Delta$ in this manner, rather than as the simple difference $\mathcal{P}-\mathcal{P}_0$, guarantees that the extracted oscillatory component will be independent of the absolute scale of the background $\mathcal{P}_0$.

The two components of $\mathcal{P}(\omega)$, namely the smooth background $\mathcal{P}_0(\omega)$ and the oscillations $\Delta(\omega)$, can be separated by filtering, thereby allowing each measurement of $\mathcal{P}(\omega)$ to serve as its own control against possible variations in the background that occur during the course of the measurement. The oscillatory component occurs in the frequency response and can be removed by passing $\mathcal{P}$ through a low-pass filter, just as if the measured signal had been recorded in the time domain. By smoothing $\mathcal{P}$ in this way, one obtains an estimate of the background $\mathcal{P}_0$.

Figure 9 plots the background $\mathcal{P}_0$ estimated by smoothing the data shown in Fig. 8 (data near the spontaneous emission have not been included here). By subtracting $\mathcal{P}_0$ from $\mathcal{P}$ and normalizing according to Eq. (11) one obtains an estimate of the oscillatory component $\Delta$ shown in Fig. 10. Details of the smoothing and a discussion of the systematic errors it may introduce are given in the Appendix, which demonstrates that the conclusions of this paper are not especially sensitive to errors in the estimate of the smooth background.
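The separation described above can be sketched on synthetic data. The example below is a minimal illustration, not the smoothing procedure of the Appendix: it assumes a boxcar (moving-average) filter whose width equals one oscillation period, a linearly varying background, and a constant-amplitude oscillation. All numerical values and function names are hypothetical.

```python
import cmath
import math

def moving_average(values, width):
    """Centered boxcar average; `width` must be odd. Edge samples are dropped."""
    half = width // 2
    return [sum(values[k - half:k + half + 1]) / width
            for k in range(half, len(values) - half)]

# Synthetic data (assumed values, for illustration only).
period_hz = 84.6                  # oscillation period of the fine structure
n_per_period = 47                 # samples per period (odd, so the boxcar is centered)
df = period_hz / n_per_period
freqs = [1250.0 + k * df for k in range(4 * n_per_period)]

def background(f):
    """Slowly varying background P0(f)."""
    return 1.0 + 0.001 * (f - 1350.0)

def oscillation(f):
    """Oscillatory component Delta(f)."""
    return 0.15 * cmath.exp(-2j * math.pi * (f - 1350.0) / period_hz)

measured = [background(f) * (1.0 + oscillation(f)) for f in freqs]

# Low-pass filter along the frequency axis: a boxcar spanning exactly one
# oscillation period averages the oscillation away, leaving the background.
p0_est = moving_average(measured, n_per_period)
half = n_per_period // 2

# Eq. (11): Delta = (P - P0) / P0, evaluated where the boxcar is defined.
delta_est = [(p - p0) / p0 for p, p0 in zip(measured[half:], p0_est)]
max_err = max(abs(d - oscillation(f)) for f, d in zip(freqs[half:], delta_est))
```

Because the background is not exactly constant across the window, a small residual error remains in the recovered oscillation; in this synthetic case it is well below the oscillation amplitude of 0.15.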

FIG. 9. The background extracted by smoothing. The figure plots the functions $\mathcal{P}(\omega)$ (symbols) and the background $\mathcal{P}_0(\omega)$ (---) obtained by smoothing the data shown in Fig. 8 (see the Appendix). An interpolant for $\mathcal{P}$, obtained by using bandlimited $\sin(f)/f$ interpolation (e.g., Papoulis, 1977), is given by the solid line. Data near the spontaneous emission are not shown. Abscissa: frequency (Hz).

An estimate of the oscillation period $\Delta f$ can be obtained by fitting a sinusoid to the data for $\Delta$. A least-squares fit yields

$$\Delta f = 84.6 \pm 0.2\ \mathrm{Hz}, \tag{12}$$

where the period has been assumed constant over the interval. The next section shows how the oscillatory component $\Delta$ can be understood as arising from wave reflection in the cochlea. The measurements are then used to explore the frequency variation of the traveling-wave ratio.
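A sinusoid fit of this kind can be sketched as follows. The text does not specify the fitting procedure, so the example assumes one simple possibility: a grid search over trial periods in which, for each trial, the best-fitting complex amplitude is obtained by projection. The data are synthetic, and the function name `fit_period` and all numerical settings are hypothetical.

```python
import cmath
import math

def fit_period(freqs, delta, candidates):
    """Return the candidate period (Hz) whose complex sinusoid best fits `delta`.

    For each trial period T, the least-squares amplitude of the model
    c * exp(-2j*pi*f/T) is the mean projection of the data onto that sinusoid,
    and the residual is smallest where the projection magnitude is largest.
    """
    def projection(T):
        return abs(sum(d * cmath.exp(2j * math.pi * f / T)
                       for f, d in zip(freqs, delta)))
    return max(candidates, key=projection)

# Synthetic oscillatory component with a known period (assumed values).
true_period = 84.6
freqs = [1250.0 + 2.0 * k for k in range(100)]
delta = [0.15 * cmath.exp(-2j * math.pi * (f - 1350.0) / true_period) for f in freqs]

candidates = [80.0 + 0.1 * k for k in range(101)]   # trial periods from 80 to 90 Hz
best = fit_period(freqs, delta, candidates)
```

With noisy data one would refine the grid or follow the search with a local optimizer, and estimate the uncertainty from the curvature of the residual near its minimum.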

IV. THE TRAVELING-WAVE RATIO

This section presents an interpretive framework within which to continue analysis of the measurements. The oscillatory component in the response is viewed as originating through the creation of backward-traveling waves, presumably by the partial reflection of the forward-traveling wave. That reflection is likely to occur predominantly near the wave's characteristic place in the apical turns of the cochlea, where the response to the forward-traveling wave is largest (Shera and Zweig, 1993). By changing the pressure and volume velocity near the stapes, backward-traveling waves modify the effective value of the cochlear input impedance.

FIG. 10. The oscillatory component $\Delta(\omega) \equiv \mathcal{P}/\mathcal{P}_0 - 1$ obtained from the bandlimited interpolants from Fig. 9. The real and imaginary parts are represented, respectively, by solid and dashed (---) lines. Note that the amplitude and frequency of the oscillations are nearly constant and the real and imaginary parts approximately 90° out of phase. Abscissa: frequency.

A. The cochlear input impedance

At stimulus amplitudes $A$ for which the mechanics are linear, the cochlear response seen from the basal end of the organ of Corti can be characterized by the cochlear input impedance,⁷ defined as the ratio of the pressure difference $P(x,\omega)$ across the organ of Corti to the volume velocity $U(x,\omega)$ of the cochlear fluids in the scala vestibuli:

$$Z(\omega) \equiv \left.\frac{P}{U}\right|_{x=0;\ \text{cochlea driven forward}}. \tag{13}$$

The position $x=0$ corresponds to the basal end of the organ of Corti. At moderate intensities, the cochlear response varies strongly with $A$ (Kemp, 1979); at high intensities ($A > A^l$), however, the relative amplitude of those nonlinear contributions is always small (Kemp and Chum, 1980; Zwicker and Schloth, 1984), and the ratio $P/U$ becomes independent of the amplitude of the stimulating tone.

At stimulus amplitudes $A > A^l$ and frequencies $\omega \ll \omega_{c0}$, where $\omega_{c0}$ is the characteristic angular frequency at the beginning of the organ of Corti ($x=0$), the basal turn of the cochlea is analogous to a linear, one-dimensional mechanical transmission line (Zwislocki-Mościcki, 1948; Peterson and Bogert, 1950; Zweig, 1991) with an input impedance,

$$Z(\omega) \approx Z_c(\omega), \quad \text{for } A > A^l, \tag{14}$$

depending principally on mechanical characteristics and cochlear geometry near the stapes (Shera and Zweig, 1991a). The stimulus amplitude $A^l$ corresponds to roughly 60 dB above threshold (Zwicker and Schloth, 1984).

At lower intensities, however, the response near the stapes contains significant contributions from more apical regions of the cochlea. Measurements of evoked otoacoustic emissions (e.g., Zwicker and Schloth, 1984) indicate that their amplitude varies linearly with the stimulus at low sound-pressure levels (see also Sec. II A). In this low-amplitude linear regime, the input impedance has the form

$$Z(\omega) = Z_c\,\frac{1+R(\omega)}{1-R(\omega)}, \quad \text{for } A < A_l, \tag{15}$$

familiar from transmission-line theory (e.g., Slater, 1942). (The super- and subscripted $l$'s identify the high- and low-level linear regimes, respectively.) The amplitude $A_l$ corresponds to roughly 5-10 dB above threshold in humans.

Equation (15) follows (Shera and Zweig, 1991b) from the observation that measurements of the cochlear input impedance in cat (Lynch et al., 1982) imply that the wavelength of the traveling pressure wave (or, equivalently, the characteristic impedance $Z_c(\omega)$ of the transmission line) changes slowly in the basal turn (Shera and Zweig, 1991a). The tapering symmetry that guarantees a slowly changing wavelength is assumed applicable to the human cochlea as well.⁸

The function $R(\omega)$, for which Eq. (15) constitutes a definition, is the traveling-wave ratio evaluated at the basal end of the cochlear spiral. Note that the high-amplitude input impedance $Z_c(\omega)$ is recovered in the limit $R \to 0$; reflections are negligible at those intensities. The following section focuses on finding the form of $R$ in the low-level linear regime.
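The relation between the input impedance and the traveling-wave ratio in Eq. (15) can be illustrated with a short sketch. The function names and the numerical values of the characteristic impedance and of $R$ are hypothetical; only the algebra of Eq. (15) is taken from the text.

```python
def impedance_from_ratio(z_c, r):
    """Eq. (15): input impedance Z for characteristic impedance Z_c and ratio R."""
    return z_c * (1 + r) / (1 - r)

def ratio_from_impedance(z_c, z):
    """Eq. (15) solved for R: R = (Z - Z_c) / (Z + Z_c)."""
    return (z - z_c) / (z + z_c)

z_c = 2.0 + 0.5j                 # hypothetical characteristic impedance (arbitrary units)
r = 0.17 * complex(0.6, -0.8)    # hypothetical traveling-wave ratio with |R| = 0.17

z = impedance_from_ratio(z_c, r)
r_back = ratio_from_impedance(z_c, z)          # recovers the ratio from the impedance
z_no_reflection = impedance_from_ratio(z_c, 0.0)   # limit R -> 0 gives Z = Z_c
```

The round trip shows that Eq. (15) is invertible wherever $Z + Z_c \neq 0$, which is why it can serve as a definition of $R$.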

V. FINDING THE FORM OF R

A. A power-series representation for p

The expectation of finding $|R|$ less than unity suggests expanding $\mathcal{P}$ in a power series about $R=0$. Equation (7) indicates that $Z_{ec}$ constitutes a bilinear transform of $Z$, which, according to Eq. (15), is but a bilinear transform of $R$. When combined with Eq. (5) for $\mathcal{P}$, those sequential bilinear transforms imply that the power series has the form

$$\mathcal{P}/\mathcal{P}_0 = 1 + pR\,(1 + qR + q^2R^2 + \cdots), \tag{16}$$

where the smooth background is recovered as the limit

$$\mathcal{P}_0 = \lim_{R\to 0}\mathcal{P}. \tag{17}$$

The coefficients have the values

$$\mathcal{P}_0 = \frac{[\ldots]}{1 + Y_s Z_{ec}^{l}}, \tag{18}$$

$$p = \frac{2Z_c\,[\ldots]}{(d + cZ_c)^2\,(1 + Y_s Z_{ec}^{l})}; \tag{19}$$

and

$$q = 1 - \frac{2Z_c\,(aY_s + c)}{(d + cZ_c)\,(1 + Y_s Z_{ec}^{l})}; \tag{20}$$

where

$$Z_{ec}^{l} \equiv \frac{aZ_c + b}{cZ_c + d}. \tag{21}$$

Note that the coefficients $\mathcal{P}_0$, $p$, and $q$ are determined by the characteristics of the measurement equipment, ear canal and middle ear, and the basal region of the cochlea near the stapes. By contrast, the traveling-wave ratio $R$ provides information about mechanical characteristics and possible spatial inhomogeneities in more apical regions of the cochlea close to the characteristic-frequency point (Shera and Zweig, 1993).

B. Interpreting the power series

Convergence of the power series requires $|qR| < 1$. By re-expressing the power series in the equivalent language of middle-ear scattering coefficients (Shera and Zweig, 1992b), one can show that the coefficient $q$ represents the net reflection coefficient for retrograde cochlear waves measured at the stapes. Assuming the middle ear to be a passive mechanical system therefore yields the constraint $|q| < 1$. Note that $q$ is close to one at frequencies for which the source admittance $Y_s$ is small and the middle ear "stiff" (so that $|cZ_c| \ll 1$).

FIG. 11. Polar plot of the oscillatory component. The figure plots the real versus the imaginary part of $\Delta(\omega)$ for varying $\omega$ obtained from the data of Fig. 10. The roughly circular trajectory, traversed clockwise about the origin (arrow), is marked with an asterisk (*) at intervals of approximately 5 Hz.

The function $qR$ therefore represents the product of two reflection coefficients, both evaluated at the stapes but measured by driving the system (as seen from the stapes) in opposite directions. Note that terms in the power series proportional to $R^2$ or higher vanish in the limit that the stapes represents a perfectly reflectionless boundary to retrograde cochlear waves (i.e., in the limit $|q| \to 0$). The higher-order terms thus arise from multiple internal reflection within the cochlea.

In terms of the power-series expansion, the oscillatory component assumes the form

$$\Delta = pR\,(1 + qR + q^2R^2 + \cdots). \tag{22}$$

Note for future reference that

$$\ln\Delta = \ln p + \ln R + \ln(1 + qR + q^2R^2 + \cdots) \tag{23}$$

$$\phantom{\ln\Delta} \approx \ln p + \ln R + qR \quad (|qR| \ll 1); \tag{24}$$

as shown later, the inequality for $|qR|$ is valid for the measurements reported here.
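The step from Eq. (23) to Eq. (24) can be made explicit by summing the geometric series, which converges for $|qR| < 1$, and then expanding the logarithm:

```latex
\begin{align*}
1 + qR + q^2R^2 + \cdots &= \frac{1}{1 - qR}, \\
\ln\!\left(\frac{1}{1 - qR}\right) &= qR + \tfrac{1}{2}(qR)^2 + \tfrac{1}{3}(qR)^3 + \cdots
\approx qR \quad (|qR| \ll 1).
\end{align*}
```

For a product of magnitude $|qR| \approx 0.12$, the value reported later in the text, the first neglected term $\tfrac{1}{2}(qR)^2$ has magnitude of about 0.007.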

C. An approximate form for R

The real and imaginary parts of $\Delta$ are plotted against one another in Fig. 11. The roughly circular trajectory traced out clockwise about the origin in the complex plane indicates that the oscillations in the real and imaginary parts of $\Delta$ are roughly 90° out of phase and of nearly constant amplitude and frequency. Recall that to leading order $\Delta$ equals $pR$, where the coefficient $p$ (determined by characteristics of the recording system, middle ear, and cochlea near the stapes) is likely to vary slowly with frequency compared to $R(\omega)$.⁹


To first approximation, therefore, $\Delta$ is proportional to $R$. The traveling-wave ratio then has the form of a circular path of slowly varying radius centered on the origin in the complex plane and traced out with nearly constant angular velocity as the frequency is varied uniformly. Symbolically,¹⁰

$$R \approx R_1\, e^{-i[(\omega-\omega_1)\tau + \varphi_1]}, \tag{25}$$

where the radius $R_1(\epsilon\omega)$ and "velocity" $\tau(\epsilon\omega)$ are both real, slowly varying functions of frequency (as indicated by the small parameter $\epsilon$); the phase shift $\varphi_1$ is a real constant. The reference frequency $\omega_1$ emphasizes the local nature of the approximation.¹¹ Note that $1/\tau$ is simply the oscillation period $\Delta f$. A heuristic description of evoked emissions incorporating a traveling-wave ratio with constant amplitude and linear phase has been proposed by Kemp (1980).
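As a worked check, the oscillation period quoted in Eq. (12) fixes the value of the delay used later in the text:

```latex
\tau = \frac{1}{\Delta f} = \frac{1}{84.6\ \mathrm{Hz}} \approx 11.82\ \mathrm{ms}.
```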

The procedure used to extract the background $\mathcal{P}_0$ filters out any slowly varying dc component $R_0$ in the traveling-wave ratio. If any such component is present and large, $R$ would be more appropriately written

$$R \approx R_0 + R_1\, e^{-i[(\omega-\omega_1)\tau + \varphi_1]}, \tag{26}$$

where $R_0 = R_0(\epsilon\omega)$ is a slowly varying complex function of frequency. Geometrically, $R_0$ reflects a possible offset of the circular trajectories from their apparent center about the origin.

Although the experiments reported here do not constrain the magnitude of such a component, other observations suggest that it is small. For example, measurements of stimulus-frequency emissions (e.g., Zwicker and Schloth, 1984) indicate that low-level emission curves oscillate about an "average" given by the corresponding emission curve, appropriately rescaled, measured at high levels. Our own measurements at higher levels, when controlled for temporal shifts in the background, corroborate those findings (see the Appendix). No significant dc offset connected with $R_0$ is apparent. In addition, since $R_0$ is slowly varying, the corresponding group delay, given by $d\angle\{R_0\}/d\omega$, is, by hypothesis, much shorter than $\tau$. Any component $R_0$ in the traveling-wave ratio would therefore contribute a short-latency response to measurements of click- or tone-burst-evoked otoacoustic emissions. No such component is observed in human ears (e.g., Kemp and Brown, 1983a).¹² Thus, if those measurements are correct (and the coefficients $\mathcal{P}_0$, $p$, and $q$ are, as expected, essentially independent of stimulus amplitude),

$$|R_0/R_1| < 1; \tag{27}$$

the procedure employed here then captures the dominant contributions to $R$.

D. Predictions of the power series: Multiple internal reflection

The approximate form (25) for $R(\omega)$ was obtained by combining measurements of $\mathcal{P}(\omega)$ with the leading term in the power-series expansion (16) for $\mathcal{P}(\omega)$ deduced from theory. As the frequency is varied, the phase of $R(\omega)$ (and, as a result, the real and imaginary parts of $\Delta(\omega)$) rotates with a "frequency" $\tau$. The leading term, $pR$, in the series for $\Delta$ corresponds to a cochlear traveling wave that has undergone only a single reflection before appearing in the ear canal. The power series predicts, however, the existence and properties of higher-order contributions to $\Delta$ arising from multiple internal reflection within the cochlea. For example, the higher-order $pqR^2 = pRqR$ and $pq^2R^3 = pRqRqR$ terms in the series indicate that $\Delta(\omega)$ should contain additional, smaller oscillations at two and three times the fundamental (i.e., at fractional values $\tfrac{1}{2}$ and $\tfrac{1}{3}$ of the spectral oscillation period $1/\tau$).

These predictions are most easily verified by noting that such higher-order oscillations should be readily apparent in the "temporal spectrum" of $\Delta$. If the coefficients $p$ and $q$ are slowly varying, the empirical form for $R$ implies the existence of a one-to-one correspondence between terms in the power series and "temporal peaks" in the inverse Fourier transform of $\Delta(\omega)$. The correspondence follows from the fact that the Fourier transform of $e^{-in\omega\tau}$, representing the strong frequency dependence of $R^n$, is proportional to the $\delta$ function $\delta(t-n\tau)$. Consequently, the analysis framework predicts that a series of equally spaced temporal peaks (that is, echoes) should appear at times $n\tau$ (corresponding to spectral oscillation periods of $1/n\tau$) and with amplitudes occurring in geometric progression. For example, in the idealized case in which $p$, $q$, and the parameters defining the approximate form for $R$ in Eq. (25) are all independent of frequency one obtains

$$F^{-1}\{\Delta\} = pR_1 e^{i\theta_1}\left[\delta(t-\tau) + qR_1 e^{i\theta_1}\,\delta(t-2\tau) + q^2R_1^2 e^{2i\theta_1}\,\delta(t-3\tau) + \cdots\right], \tag{28}$$

where

$$\theta_1 \equiv \omega_1\tau - \varphi_1. \tag{29}$$

Figure 12 shows the temporal spectrum of $\Delta(\omega)$, denoted $F^{-1}\{\Delta(\omega)\}$, where $F\{\cdot\}$ represents the operation of Fourier transformation, for the data shown in Fig. 9. As predicted by the series expansion (22) for $\Delta$, at least two, and perhaps three, temporal peaks are visible at fractional values $1$, $\tfrac{1}{2}$, and $\tfrac{1}{3}$ of the fundamental oscillation period (approximately 87 Hz). As described above, adjacent temporal peaks correspond to terms in $\Delta$ proportional to $R$, $R^2$, and $R^3$. The progressive decrease in peak amplitude indicates that the series converges and provides ex-post-facto justification for the expansion. Inspection shows that the peaks are approximately colinear, implying that their amplitudes occur in geometric progression, as predicted by Eq. (22).
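The correspondence between terms in the power series and temporal peaks can be sketched numerically. The example below is an idealized illustration of Eq. (28), assuming frequency-independent $p$, $q$, $R_1$, and $\tau$, a reference frequency $\omega_1 = 0$, and a delay equal to an integer number of time bins; all numerical values are hypothetical. It uses a direct inverse discrete Fourier transform so that no external library is needed.

```python
import cmath
import math

n_bins = 256                   # number of frequency samples (assumed)
m = 16                         # delay tau expressed in time bins (assumed)
p = 0.9 * cmath.exp(0.3j)      # hypothetical coefficient p
q = 0.7 * cmath.exp(-0.4j)     # hypothetical coefficient q
r1, phi1 = 0.17, 0.5           # hypothetical amplitude and phase constant of R

def ratio(k):
    """Approximate form (25) for R on a uniform frequency grid (omega_1 = 0)."""
    return r1 * cmath.exp(-1j * (2 * math.pi * k * m / n_bins + phi1))

# Summed power series, Eq. (22): Delta = p R / (1 - q R).
delta = [p * ratio(k) / (1 - q * ratio(k)) for k in range(n_bins)]

def inverse_dft(values, n):
    """Sample n of the inverse discrete Fourier transform."""
    size = len(values)
    return sum(v * cmath.exp(2j * math.pi * k * n / size)
               for k, v in enumerate(values)) / size

# Temporal peaks at the delays tau, 2*tau, and 3*tau.
peaks = [inverse_dft(delta, n * m) for n in (1, 2, 3)]
peak_ratios = [peaks[1] / peaks[0], peaks[2] / peaks[1]]
expected = q * r1 * cmath.exp(-1j * phi1)   # q R_1 exp(i theta_1) with omega_1 = 0
```

Each successive peak is smaller than its predecessor by the same complex factor, so the peak amplitudes fall on a straight line when plotted on a logarithmic scale.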

FIG. 12. Temporal spectral peaks arising from multiple internal reflection. The figure plots the amplitude of the (discrete) inverse Fourier transform $F^{-1}\{\Delta\}$ (symbols connected by lines) for the data of Fig. 10. The transform is plotted versus the oscillation period $\Delta f$, which decreases along the abscissa and has units of reciprocal time (Hz). Abscissa: period of the oscillation (Hz).

The ratio of adjacent peaks is determined by the product $qR$, a function that can be determined from the data by extracting higher-order terms in the series. The process is facilitated by substituting the approximate form (25) for $R$ into Eq. (24) for $\ln\Delta$:

$$\mathrm{Re}\{\ln\Delta\} = \mathrm{Re}\{\ln p\} + \ln R_1 + |q|R_1\cos[(\omega-\omega_1)\tau + \varphi] + \cdots \tag{30}$$

and

$$\mathrm{Im}\{\ln\Delta\} = \mathrm{Im}\{\ln p\} - [(\omega-\omega_1)\tau + \varphi_1] - |q|R_1\sin[(\omega-\omega_1)\tau + \varphi] + \cdots, \tag{31}$$

where

$$\varphi \equiv \varphi_1 - \angle\{q\}. \tag{32}$$

For comparison, the empirical determination of $\ln\Delta$ is plotted in Fig. 13. If the functions $\ln p$, $R_1$, and $q$ are slowly varying, $\mathrm{Re}\{\ln\Delta\}$ consists of a nearly constant background upon which oscillations of amplitude $|q|R_1$ and approximate period $1/\tau$ are superposed. Such oscillations, shifted 90° in phase, appear also in $\mathrm{Im}\{\ln\Delta\}$, but in this case superposed on a line of slope $-\tau$.

Equations (30) and (31) for the real and imaginary parts of $\ln\Delta$ suggest that an especially convenient representation of the data can be obtained by differentiating, which also serves to remove much of the unknown, but slowly varying background. We therefore define the dimensionless function

$$\eta \equiv \frac{d\ln\Delta}{d\Omega} \approx \frac{d\ln R}{d\Omega} + q\,\frac{dR}{d\Omega}, \tag{33}$$

where the dimensionless frequency $\Omega$ is defined by

$$\Omega \equiv \tau\omega. \tag{34}$$

FIG. 13. The function $\ln\Delta(\omega)$ computed by using $\Delta(\omega)$ from Fig. 10. As predicted, the real part consists of oscillations superimposed upon a nearly constant background. Similar oscillations, shifted 90° in phase, appear in the imaginary part, but superimposed upon a line of nearly constant slope $-\tau$. The straight line in the phase indicates the presence of a delay, presumably due to the round-trip travel time between the stapes and the region of maximal reflection (Shera and Zweig, 1993). Abscissa: frequency (Hz).

By using the approximate form (25) for $R$ and the assumption that $R_1$ and $\tau$ are slowly varying one finds

$$\eta \approx |q|R_1\cos[(\omega-\omega_1)\tau + \varphi] - i\left\{1 + |q|R_1\sin[(\omega-\omega_1)\tau + \varphi]\right\}. \tag{35}$$

Thus, if the approximate form (25) is correct, $\eta$ is a causal function whose real and imaginary parts oscillate with a period of $1/\tau$ Hz and with an amplitude $|q|R_1$. Note that the model predicts that $\mathrm{Im}\{\eta\}$ oscillates about the average value $-1$. These predictions can be compared with experiment by computing $\eta$ from the empirical determination of $\Delta$.
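Two of these predictions, the average value of $-1$ for the imaginary part and the oscillation amplitude $|q|R_1$, can be checked on synthetic data with a short sketch. The example assumes the first-order model of Eq. (24) with $p = 1$ and constant parameters, and estimates the derivative in Eq. (33) by central differences; the parameter values and function names are hypothetical. It does not attempt to reproduce the phase convention of Eq. (35).

```python
import cmath
import math

q = 0.7 * cmath.exp(-0.3j)    # hypothetical net stapes reflection coefficient
r1 = 0.12 / 0.7               # chosen so that |q| R_1 = 0.12
phi1 = 0.4                    # hypothetical phase constant
n_per_period = 64             # samples per oscillation period (assumed)
d_omega = 2 * math.pi / n_per_period   # step in the dimensionless frequency Omega

def delta(k):
    """Delta at grid point k, from Eq. (24) with p = 1: ln Delta = ln R + qR."""
    r = r1 * cmath.exp(-1j * (k * d_omega + phi1))
    return r * cmath.exp(q * r)

def eta(k):
    """Central-difference estimate of d ln(Delta) / d Omega at grid point k.

    Taking the logarithm of the ratio of neighbouring samples avoids having
    to unwrap the phase of Delta.
    """
    return cmath.log(delta(k + 1) / delta(k - 1)) / (2 * d_omega)

# Evaluate eta over exactly one oscillation period.
values = [eta(k) for k in range(1, n_per_period + 1)]
mean_imag = sum(v.imag for v in values) / len(values)
amplitudes = [abs(v + 1j) for v in values]   # distance of eta from the point -i
```

In this idealized case the values of $\eta$ trace a circle of radius close to $|q|R_1$ centered on $-i$; with measured data the derivative amplifies noise, which is why the text applies a low-pass filter.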

Figure 14 plots the function $\eta$ obtained by differentiating the data of Fig. 13. The dimensionless frequency $\Omega$ was defined by the value $\tau = 11.82$ ms obtained from the estimate (12) for the oscillation period $\Delta f$. Since differentiation amplifies high-frequency noise components, the function shown has, for clarity, been low-pass filtered to isolate the dominant frequency dependence and suppress higher-order terms in $\eta$ (e.g., those proportional to $R^2$ or higher, where terms in $\eta$ of order $R^2$ correspond to terms in $\Delta$ of order $R^3$). Note that $\mathrm{Im}\{\eta\}$ oscillates about the value $-1$, as predicted by Eq. (35) by using the approximate form (25) for $R$.

FIG. 14. Comparison between theory and experiment. The figure plots the function $\eta(\omega)$ computed from Eq. (33) and the data of Fig. 13 (solid line). For clarity, the function has been filtered to isolate the dominant frequency variation and remove high-frequency noise. Shown for comparison (dotted line) are the predictions of Eq. (35) with parameter values $|q|R_1 = 0.12$, $\tau = 11.82$ ms, and $\varphi = 1.93$ rad determined by a least-squares fit to the data (real and imaginary parts simultaneously). For this data set, the reference frequency $\omega_1$ was taken to have the value $\omega_1/2\pi = 1350$ Hz. Note that Eq. (35) predicts that the imaginary part oscillates about $-1$, in agreement with the empirical result. Abscissa: frequency (Hz).

FIG. 15. Uncertainty in the estimate of $\eta$. The figure plots the function $\eta(\omega)$ computed by using the data at 0 dB SL from Fig. 6 (---). For comparison, the values are shown superimposed on the data (5 dB SL) from Fig. 14. The differences between the curves, originating principally through changes in the slowly varying background caused by temporal drift, are small and give an indication of the uncertainty in the analysis. Below 5 dB SL, $\eta(\omega)$ is essentially independent of stimulus level. Abscissa: frequency (Hz).

Shown for comparison are the predictions of Eq. (35), where the parameters $R_1$, $\tau$, and $q$ were assumed constant over the interval before fitting the data (parameter values are given in the figure caption). The agreement between the empirical determination of $\eta$ and the model result implies that the foregoing analysis, including the approximate local form for $R$, is essentially correct.

E. Parameter values and their uncertainties

An indication of the error in the analysis is provided by Fig. 15, which overlays Fig. 14 with the function $\eta$ computed by using the other data set illustrated in Fig. 6 (i.e., the data taken at a stimulus level of 0 dB). The differences between the curves originate principally with changes induced in the background by the low-frequency temporal shifts discussed in Sec. II. The Appendix discusses the magnitude of the systematic errors introduced by the choice of filter used to extract the smooth background $\mathcal{P}_0$.

Figure 16 overlays Fig. 15 with additional estimates of $\eta$ computed by using measurements on the same ear performed several months later. Combining the results yields the "mean $\eta$" and its standard deviation shown in Fig. 17. Averaging model fits to the data yields the parameter estimates: $|q|R_1 = 0.122 \pm 0.013$, $\tau = 11.72 \pm 0.12$ ms, and $\varphi = 2.1 \pm 0.2$ rad. Note that no explicit model for the effects of the background is included in Eq. (35). The background provides a slowly varying, secular variation in the average value of the curves and thus affects, for example, the optimal choice of the amplitude $|q|R_1$. Note that our determination of $\eta$ provides an estimate of the product $qR$ appearing in the series expansion for $\mathcal{P}$. That estimate can be expressed in terms of $\eta$ through Eq. (35):

$$qR \approx \eta^* - i = \mathrm{Re}\{\eta\} - i[1 + \mathrm{Im}\{\eta\}], \tag{36}$$

where $\eta^*$ is the complex conjugate of $\eta$.

FIG. 16. Variation of $\eta(\omega)$ over time. The figure plots the functions $\eta(\omega)$ computed by using three additional measurements of $\mathcal{P}$ made in the low-level linear regime on the same subject several months later (shown with dotted, dot-dashed, and dot-dot-dashed lines). For comparison, the functions are shown superimposed on those from Fig. 15. Abscissa: frequency (Hz).

FIG. 17. An estimate (solid line) of the function $\eta(\omega)$ computed by averaging the functions shown in Fig. 16 and plotted with error bars representing standard deviations. Fitting the model of Eq. (35) separately to each of the five data sets and averaging the results yields the parameter estimates given in the text. Shown for comparison (dotted line) are the predictions of the model using those average parameter values. Abscissa: frequency.

F. Quantitative prediction: The ratio of adjacent temporal peaks

An important prediction can be tested by recalling that the framework predicts that the inverse Fourier transform of $\Delta$ consists of a series of peaks occurring in geometric progression (see Fig. 12). Indeed, the power series (22) for $\Delta(\omega)$ predicts that knowledge of $qR$ determines the successive ratios of all higher-order terms in the expansion. When the parameter values are assumed constant over the interval, Eq. (28) implies that the ratio of adjacent temporal peaks in the inverse Fourier transform of $\Delta$ assumes the approximate value

$$\frac{(\mathrm{peak})_{n+1}}{(\mathrm{peak})_n} \approx |q|R_1\, e^{i(\omega_1\tau - \varphi)}. \tag{37}$$

Note that Eq. (37) represents a complex number: adjacent peaks differ in both amplitude and phase.

FIG. 18. Predicted colinear temporal harmonics. The figure plots the inverse Fourier transform $F^{-1}\{\Delta(\omega)\}$ (symbols connected by dotted lines) from Fig. 12. The temporal peaks corresponding to harmonics of the spectral oscillation period are shown with enlarged solid symbols. The analysis presented above predicts that those peaks occur in geometric progression and should therefore appear colinear on the scales used here. Straight lines, with slopes obtained from Eq. (37) by using the averaged parameter values given in the text, are superimposed for comparison (agreement is expected only at points shown with large symbols). Since the coefficients $p$ and $q$ and the parameters defining the approximate form for $R$ all vary slowly over the (finite) measured frequency interval, the empirical temporal spectrum deviates from the idealized limit, given in Eq. (28), for which the amplitude is nonzero only at discrete intervals corresponding to spectral oscillations at the fundamental and its harmonics (i.e., to delay times representing integer multiples of $\tau$). Abscissa: period of the oscillation (Hz).

Figure 18 reproduces the amplitude and phase of the temporal spectrum of $\Delta(\omega)$. Shown for comparison are straight lines whose slopes were determined from Eq. (37) by using the averaged parameter values given above. The foregoing analysis correctly predicts both the approximate colinearity and the complex ratio (37) of adjacent temporal peaks. By reducing the measurement noise floor even further, one could presumably verify these predictions for temporal peaks of even higher order.
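The slopes implied by Eq. (37) can be evaluated with a few lines of arithmetic. The sketch below uses the averaged parameter estimates quoted in the text; pairing them with the reference frequency of 1350 Hz, which is stated only in the caption of Fig. 14, is an assumption, as are the variable names and the choice to express the amplitude step in decibels.

```python
import cmath
import math

# Averaged parameter estimates quoted in the text.
q_r1 = 0.122          # |q| R_1 (dimensionless)
tau = 11.72e-3        # delay in seconds
phi = 2.1             # phase in radians
f1 = 1350.0           # reference frequency omega_1 / (2 pi) in Hz (assumed to apply here)

# Eq. (37): complex ratio of adjacent temporal peaks.
peak_ratio = q_r1 * cmath.exp(1j * (2 * math.pi * f1 * tau - phi))

step_db = 20 * math.log10(abs(peak_ratio))   # amplitude change from one peak to the next, in dB
step_phase = cmath.phase(peak_ratio)         # phase change per peak, wrapped to (-pi, pi]
```

Each successive temporal peak is therefore expected to lie roughly 18 dB below its predecessor, with a fixed phase increment between adjacent peaks.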

G. Estimating $q$ and $R_1$ individually

The amplitude of the oscillations in $\eta$ depends on the amplitude of the traveling-wave ratio through the expression $|q|R_1$, where $|q|$ depends on unknown properties of the middle ear. If $|q|$ is close to one, however, the oscillation amplitude is set principally by the value of $R_1$. Figure 19 shows values of $|q|$ calculated from Eq. (20) for published models of the "average" human middle ear (Zwislocki, 1962; Kringlebotn, 1988). Although their predictions differ in detail, the models are in qualitative agreement and predict that $q$ is a slowly varying function everywhere close to but less than one in magnitude (see Sec. V B). Reference to Eq. (20) shows that $q$ would be exactly one were the source admittance $Y_s$ zero and the middle ear perfectly "stiff"; that is, were the eardrum a rigid plate and the ossicular joints rigid (Shera and Zweig, 1991c; Shera and Zweig, 1992a).

FIG. 19. The net stapes reflection coefficient $q(\omega)$ for retrograde cochlear waves predicted by the middle-ear models of Zwislocki (solid line) and Kringlebotn (- - -). The residual ear-canal space was modeled as a rigid-walled tube of length 1 cm and cross-sectional area 0.4 cm². The calculations assume that the source admittance $Y_s$ is zero; more realistic values yield very similar results. Note that $q$ approaches one at low frequencies where the eardrum and ossicular joints become "stiff." The models agree in their prediction that $q$ is everywhere reasonably close to but less than one in magnitude (cf. Shera and Zweig, 1991b, Fig. 7). Abscissa: frequency (Hz).

Since $|q| < 1$, the value $|q|R_1 = 0.12$ determined from the data provides a lower limit on the amplitude of the traveling-wave ratio. Assuming that the middle-ear model predictions are roughly accurate for this subject yields the estimate $|q| = 0.7$ in the frequency range of the measurements. Thus, $R_1 = 0.16$-$0.18$ in this subject. (The uncertainty reflects only the approximate uncertainty in the determination of the product $|q|R_1$; the uncertainty in $|q|$ is not known.) The corresponding standing-wave ratio, which gives the approximate ratio of the cochlear pressure at an antinode to that at a node, is related to $R$ by

$$\mathrm{SWR} = \frac{1+|R|}{1-|R|}, \tag{38}$$

and is therefore roughly 1.4 near the stapes.
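The arithmetic behind these estimates is short enough to sketch directly. The values of $|q|R_1$ and $|q|$ are those quoted in the text; the function and variable names are hypothetical.

```python
def standing_wave_ratio(r_abs):
    """Eq. (38): SWR = (1 + |R|) / (1 - |R|), valid for |R| < 1."""
    if not 0.0 <= r_abs < 1.0:
        raise ValueError("|R| must lie in [0, 1)")
    return (1.0 + r_abs) / (1.0 - r_abs)

q_r1 = 0.12      # |q| R_1 determined from the data (value from the text)
q_abs = 0.7      # |q| estimated from the middle-ear models (value from the text)

r1 = q_r1 / q_abs                  # amplitude of the traveling-wave ratio
swr = standing_wave_ratio(r1)      # standing-wave ratio near the stapes
```

Dividing out $|q|$ gives an amplitude near 0.17, in the middle of the quoted range, and a standing-wave ratio of about 1.4.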

FIG. 20. Measurements of $\ln\mathcal{P}$ in subject CKL-R at approximately 5 dB SL together with smoothed, minimum-phase fits (solid line) to the measurements. The error bars correspond to 0.125 dB in the amplitude and 0.8° in the phase. The dashed line (---) represents the estimate of $\ln\mathcal{P}_0$ obtained by smoothing. Abscissa: frequency (Hz).

H. Anomalies and other subjects

The analysis above has focused on determining the form of $R$ in a region where $p$ has a simple, regular structure. Reference to Fig. 4 indicates, however, that the periodic pattern can fluctuate somewhat erratically and is interrupted by regions where the regular pattern is distorted. Those general features are also found in measurements on other ears. Typical results are given in Figs. 20-23, which illustrate representative measurements of $p$, demonstrate its minimum-phase analyticity properties, and give the corresponding functions $\eta$. Although the empirical functions $\eta$ tend to have more structure than the simple form Eq. (35) (reflecting either an incomplete removal of the background or deviations in $R$ from the approximate form given by Eq. (25)), the data from all subjects support the conclusion that, within the more regular regions, the amplitude of $R$ varies relatively slowly (and nonperiodically) with frequency compared to the phase, in general agreement with the idealized form (25). Figures 22 and 23 provide an example of data collected from an anomalous region flanked by regions of greater regularity. Such regions may result from local fluctuations in the distribution of mechanical inhomogeneities conjectured to give rise to evoked emission (Shera and Zweig, 1993).

[Figure 21 plot: two panels versus Frequency (Hz), 2100-2500 Hz.]

FIG. 21. The function $\eta(\omega)$ computed from the data of Fig. 20 (solid line) and filtered to isolate the dominant frequency variation and remove high-frequency noise. Shown for comparison (dotted line) are the predictions of Eq. (35) with parameter values $|q|R_1 = 0.143$, $\tau = 7.93$ ms, and $\phi_1 = 1.65$ rad. The reference frequency $\omega_1$ has the value $\omega_1/2\pi = 2300$ Hz.

[Figure 22 plot: two panels versus Frequency (Hz), 950-1150 Hz.]

FIG. 22. Measurements of $\ln p$ in subject MGC-R at approximately 5 dB SL together with smoothed, minimum-phase fits (solid line) to the measurements. The error bars correspond to 0.06 dB in the amplitude and 0.4° in the phase. The dashed line represents the estimate of $\ln p_0$ obtained by smoothing. Note the presence of an anomalous region, centered about 1050 Hz and delimited by short-dashed lines, flanked by regions of more regular behavior.

[Figure 23 plot: two panels versus Frequency (Hz), 950-1150 Hz.]

FIG. 23. The function $\eta(\omega)$ computed from the data of Fig. 22 (solid line) and filtered to remove high-frequency noise. The anomalous region centered near 1050 Hz divides the data into two regions flanked by intervals of more regular behavior. The parameter $\tau = 22$ ms used in the computation was estimated from the data to the left of the anomaly.

VI. DISCUSSION

A. Interpreting the approximate form for R

The amplitude of the cochlear traveling-wave ratio is a relatively slowly varying and nonperiodic function of frequency, suggesting that the distribution of mechanical inhomogeneities (emission generators) responsible for evoked emission is uncorrelated with the periodicities observed in the microstructure of threshold hearing curves. Although the amplitude of $R$ varies slowly, its phase rotates rapidly. The observed periodicities thus arise predominantly from the cyclic variation in relative phase between the forward- and backward-traveling waves at the stapes; as the frequency varies monotonically, the phasor giving the phase of $R$ passes alternately through plus and minus one, giving rise to the peaks and valleys in the measured ear-canal pressure [cf. Eq. (1)].

In the time domain, the locally linear phase corresponds to a delay, presumably given by the round-trip travel time to and from the site of generation of the backward-traveling wave (Neely et al., 1988; Shera and Zweig, 1993). The values of $\tau$ found here are consistent with emission latencies reported elsewhere (e.g., Norton and Neely, 1987).
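The argument can be illustrated numerically. The sketch below is a toy model rather than the paper's Eq. (1): it assumes that the measured ratio behaves like $1 + R$, holds the amplitude of $R$ constant, and lets its phase rotate linearly with an assumed round-trip delay of 12 ms (of the order quoted in the Summary). All variable names are illustrative. Even though the amplitude of $R$ has no periodicity at all, the magnitude of the ratio shows regularly spaced peaks whose spacing is the reciprocal of the delay.

```python
import numpy as np

tau = 12e-3   # assumed round-trip delay (s)
R1 = 0.17     # assumed constant amplitude of the traveling-wave ratio

f = np.arange(1000.0, 1600.0, 0.5)          # frequency grid (Hz)
R = R1 * np.exp(-2j * np.pi * f * tau)      # constant amplitude, rotating phase

# Toy stand-in for the measured ratio: unit background plus one reflection.
ratio = np.abs(1.0 + R)

# Locate interior local maxima of the ripple.
is_peak = (ratio[1:-1] > ratio[:-2]) & (ratio[1:-1] >= ratio[2:])
peak_freqs = f[1:-1][is_peak]
spacing = np.diff(peak_freqs).mean()

print(f"mean peak spacing = {spacing:.1f} Hz, 1/tau = {1.0 / tau:.1f} Hz")
```

A delay of 12 ms therefore corresponds to a spectral period of roughly 83 Hz in this toy model.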

1. Variation of the delay with frequency

The approximate empirical form for $R(\omega)$ given in Eq. (25) was determined from data covering a limited frequency range over which the delay $\tau$ appears nearly constant. The approximate global variation of the delay with frequency can be obtained from measurements of emission latency. Measurements of tone-burst-evoked emissions indicate that their latency, here denoted $L$, varies roughly inversely with frequency (Wilson, 1980; Norton and Neely, 1987; Zweig et al., 1992). Since the latency, or group delay, represents the rate at which the relative phase of the incident and reflected waves (i.e., the phase of $R$) varies with frequency, consistency with the latency measurements requires

$$\frac{d\theta}{d\omega} \approx \frac{\xi}{\omega}, \tag{39}$$

where $\theta$ denotes the phase of $R$ and the parameter $\xi$ varies only slowly with frequency. Contributions to the measured latency arising from the middle ear have been assumed small. Solving the differential equation for $\theta$ yields

$$\theta(\omega) \approx \xi \ln(\omega/\omega_{c0}), \tag{40}$$

where the frequency scale $\omega_{c0}$ represents the maximum frequency of hearing.

One thus obtains an approximate form for $R(\omega)$ incorporating the global variation in latency with frequency:

$$R(\omega) \approx R_1(\epsilon\omega)\, e^{-i\xi \ln(\omega/\omega_{c0})}, \tag{41}$$

where $R_1$ is a slowly varying function of frequency.

One can recover the local empirical form (25) in any neighborhood by expanding the phase,

$$\theta = \xi \ln(\omega/\omega_{c0}), \tag{42}$$

in a Taylor series about some arbitrary reference frequency $\omega_1$:

$$\theta(\omega) = \theta(\omega_1) + \left.\frac{d\theta}{d\omega}\right|_{\omega_1} (\omega - \omega_1) + \cdots. \tag{43}$$

Evaluating the derivative yields

$$\theta(\omega) \approx \theta_1 + \xi\,(\omega - \omega_1)/\omega_1, \tag{44}$$

where $\theta_1 = \theta(\omega_1)$ is a constant and the derivative $d\xi/d\omega$ is, by definition, small and has therefore been neglected. In a neighborhood about $\omega_1$ the traveling-wave ratio therefore has the approximate form

$$R(\omega) \approx R_1 e^{-i\theta_1} e^{-i(\omega-\omega_1)\tau_1} = R_1' e^{-i(\omega-\omega_1)\tau_1}, \tag{45}$$

where $R_1'$ is slowly varying, and the local (approximately constant) delay $\tau_1$ has the value

$$\tau_1 = \xi/\omega_1. \tag{46}$$

The parameter $\xi/2\pi$ thus represents the delay measured in units of the stimulus period. Higher-order terms in the Taylor series expansion of the phase can be used to show that the approximate local form for $R(\omega)$ can be expected to hold over a frequency interval $\Delta\omega$ about $\omega_1$ satisfying $\Delta\omega/2 \ll \omega_1$, in agreement with the observed local constancy of the delay $\tau$ in the measurements reported here.
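The relation between the logarithmic phase and the local delay can be checked numerically. The sketch below takes the values quoted in the Summary (a delay of about 12 ms near 1300 Hz), converts them into the slowly varying parameter of Eq. (39) (called `xi` in the code, a label chosen here for convenience), and confirms that the group delay of a phase proportional to $\ln(\omega/\omega_{c0})$ reproduces the local delay. The value used for the maximum frequency of hearing is an assumption made only so that the function can be evaluated; it drops out of the derivative.

```python
import numpy as np

f1 = 1300.0                   # reference frequency (Hz), from the Summary
tau1 = 12e-3                  # local delay (s), from the Summary
omega1 = 2.0 * np.pi * f1

xi = tau1 * omega1            # local delay times reference frequency
periods = xi / (2.0 * np.pi)  # delay in units of the stimulus period


def theta(omega, omega_c0=2.0 * np.pi * 20e3):
    """Logarithmic phase; omega_c0 is an assumed, illustrative value."""
    return xi * np.log(omega / omega_c0)


# Central-difference estimate of the group delay d(theta)/d(omega) at omega1.
h = 1e-3 * omega1
tau_numeric = (theta(omega1 + h) - theta(omega1 - h)) / (2.0 * h)

print(f"delay = {periods:.1f} stimulus periods")
print(f"group delay at f1 = {1e3 * tau_numeric:.3f} ms")
```

With these inputs the delay amounts to about 15.6 stimulus periods.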

B. Another way of measuring the background

This paper has introduced a filtering technique for separating the oscillatory component in the measured ear-canal pressure from the smooth background determined, among other things, by unknown characteristics of the middle ear. The method, outlined in the Appendix, requires only a single measurement at low sound-pressure levels, thereby allowing each measurement to serve as its own control, both against shifts in the background that occur during the course of the measurement and against nonlinearities in the stimulus-delivery and recording system.

The vanishing of the oscillatory component with increasing intensity seen in Fig. 4 suggests defining an alternative background, denoted $p_\infty$, by the limit

$$p_\infty(\omega) = \lim_{A \to \infty} p(\omega; A), \tag{47}$$

where, in practice, the limit is achieved for stimulus amplitudes $A \gtrsim A_1$.

The alternative background $p_\infty$, which provides the basis for the "vector subtraction" method introduced by Kemp (Kemp, 1979; Kemp and Chum, 1980), was not employed here because a low-frequency temporal shift (perhaps due to static pressure changes in the middle-ear cavities or temperature variations in the recording microphone) made it difficult to measure $p_\infty$ with sufficient accuracy. If these speculations concerning the origin of the shift are correct, one expects it to originate either in the microphone transfer function $K_0$ or in the middle-ear cavity impedance $Z_\mathrm{cav}$, which manifests itself through the transfer coefficients $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ (Shera and Zweig, 1992a). Note that if the shift arises predominantly through $K_0$, Eqs. (18)-(20) imply that the functions $p$ and $q$ remain unaffected.

The Appendix presents measurements of $p$ and $p_\infty$ in which both transducer nonlinearities and the temporal shift have been controlled for (the latter by using an A-B-A measurement sequence that permits "before" and "after" comparisons). Differences between $p_\infty$ and the background $p_0$ extracted by smoothing are small (and arise principally because of smoothing artifacts caused by the finite size of the measured frequency interval), suggesting that $p_0 = p_\infty$. Cochlear nonlinearities thus manifest themselves principally in the oscillatory component of the measured ear-canal pressure.

VII. SUMMARY

(1) Accurate measurements of stimulus-frequency evoked otoacoustic emissions have been made in the low-level linear regime. Unlike the measurements reported by Zwicker and Schloth (1984), the measurements described here are consistent with both causal and minimum-phase behavior.

(2) The measured response, expressed in the form of the dimensionless ratio $p(\omega)$, consists of an oscillatory component $\Delta(\omega)$ superimposed on a slowly varying "background" $p_0$:

$$p(\omega) = p_0(\epsilon\omega)\,[1 + \Delta(\omega)], \tag{48}$$

where the presence of the small parameter $\epsilon$ indicates that $p_0$ varies slowly relative to $\Delta$. A novel smoothing technique was developed and used to separate those two components. Whereas the linear background is determined principally by the acoustic properties of the recording system and middle ear, the oscillatory component varies nonlinearly with sound pressure level and originates within the cochlea. Measurements of the high-amplitude limit $p_\infty$ indicate that the smoothing technique accurately extracts the level-dependent, oscillatory component of the ear-canal sound pressure (see the Appendix).

(3) The oscillatory component $\Delta(\omega)$ was represented as originating through the reflection (re-emission) of forward-traveling waves, presumably by mechanical inhomogeneities (emission generators) in the apical turns of the cochlea. Then $\Delta(\omega)$ can be expressed as a power series in the dimensionless traveling-wave ratio, $R(\omega)$, which measures the net reflected wave, relative to the forward-traveling wave, at the basal end of the cochlea near the stapes.

(4) The fundamental of the oscillatory component in $p(\omega)$ corresponds to the first term in the series expansion in $R(\omega)$ and represents an evoked cochlear echo that has undergone a single reflection before appearing in the ear canal. The power series predicts, and the measurements reported here confirm, the existence and properties of higher-order terms in the expansion corresponding to spectral oscillations at harmonics of the fundamental that originate through multiple internal reflection within the cochlea.

(5) The extracted oscillatory component $\Delta(\omega)$ was analyzed to determine the principal frequency variation of $R(\omega)$, which was shown, locally, to have the approximate form

$$R(\omega) \approx R_0 + R_1 e^{-i[(\omega - \omega_1)\tau + \phi_1]}, \tag{49}$$

where $R_1(\epsilon\omega)$ and $\tau(\epsilon\omega)$ are both real, slowly varying functions of frequency ($\epsilon \ll 1$); the phase shift $\phi_1$ is a real constant. Although the magnitude of any additional slowly varying component $R_0(\epsilon\omega)$ is not determined by the measurements reported here, other published measurements suggest that $|R_0/R_1| \ll 1$. Typically, $R_1 = O(1/5)$ and $\tau \approx 12$ ms at frequencies $\omega_1/2\pi$ near 1300 Hz. In individual subjects, the delay $\tau$ can be estimated with this technique to within a few tenths of a millisecond.

(6) The amplitude of the cochlear traveling-wave ratio is a relatively slowly varying and nonperiodic function of frequency, suggesting that the mechanical inhomogeneities (emission generators) are densely distributed and appear uncorrelated with the periodicities observed in the microstructure of threshold hearing curves. Although the amplitude of $R$ varies slowly, its phase rotates rapidly. As conjectured by Kemp (1980) and demonstrated here, the observed periodicities arise predominantly from the sinusoidal variation in relative phase between the forward- and backward-traveling waves at the stapes. The locally linear phase presumably arises as the result of wave propagation delays due to a total round-trip travel time $\tau$ from the stapes to the region of reflection and back again. A model in which the orderly, almost periodic pattern of spectral maxima and minima emerges dynamically from the scattering of cochlear waves by random inhomogeneities in the organ of Corti is presented elsewhere (Shera and Zweig, 1993). The model indicates how noninvasive measurements of the sort reported here can be used to determine characteristics of the impedance of the organ of Corti near the peak of the basilar-membrane response function.

ACKNOWLEDGMENTS

The authors thank Jennifer McDowell for her invaluable assistance with the measurements and gratefully acknowledge her remarkable patience as an experimental subject. This work was supported, in part, by DARPA and AFOSR contract N00014-86-C0399, the Theoretical Division of Los Alamos National Laboratory, the Office of Health and Environmental Research in the Department of Energy, and a National Science Foundation Graduate Fellowship to C. A. S.

APPENDIX: SMOOTHING TO EXTRACT THE BACKGROUND

This Appendix defines more explicitly the smoothing operation used to estimate the background $p_0$ and compares it with the high-amplitude background $p_\infty$ measured experimentally.

Unlike the more familiar case of time-domain filtering, the oscillatory component to be removed/extracted occurs here in the frequency response. Smoothing involves convolving $p$, wiggles and all, with a smoothing function $S$ of finite bandwidth (e.g., a Gaussian):

$$p_0 = S \ast p, \tag{A1}$$

where $\ast$ denotes the operation of convolution.

The convolution is equivalent to a multiplication in the time domain. For example, let $\mathcal{F}\{\cdot\}$ represent the operation of Fourier transformation and $\hat{S}$ the inverse Fourier transform of the smoothing function:

$$\hat{S} = \mathcal{F}^{-1}\{S\}. \tag{A2}$$

(Note that $\hat{S}$ will have a low-pass characteristic in the time domain.) Then, when the width of the smoothing function (filter cutoff) is chosen appropriately,

$$p_0 = \mathcal{F}\{\hat{S} \times \mathcal{F}^{-1}\{p\}\}, \tag{A3}$$

where $\times$ denotes ordinary multiplication.

Such filtering preserves the causality properties of $p$. Since $p$ is causal (Fig. 8), the corresponding impulse response $\mathcal{F}^{-1}\{p\}$ vanishes for negative times. So, therefore, does the product $\hat{S} \times \mathcal{F}^{-1}\{p\}$, implying that the function $p_0$ extracted in this way is also causal.¹³
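The equivalence between convolution in frequency and multiplication in the conjugate domain can be verified on sampled data. The sketch below is a discrete, periodic analogue built with NumPy's FFT: it compares a direct circular convolution of a stand-in frequency response with a Gaussian smoothing function against the product of their inverse transforms. The random test data, the Gaussian width, and the variable names are all assumptions made for illustration; the factor `N` reflects NumPy's normalization of `ifft`.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64

# Stand-in "frequency response": arbitrary complex samples.
p = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Periodic Gaussian smoothing function centered on sample 0, unit area.
k = np.arange(N)
dist = np.minimum(k, N - k)          # circular distance from sample 0
S = np.exp(-(dist / 3.0) ** 2)
S /= S.sum()

# Direct circular convolution in the frequency domain.
direct = np.array(
    [sum(S[m] * p[(j - m) % N] for m in range(N)) for j in range(N)]
)

# Same operation as a product in the conjugate ("time") domain.
S_hat = np.fft.ifft(S)
via_product = N * np.fft.fft(S_hat * np.fft.ifft(p))

print(np.allclose(direct, via_product))  # True
```

In practice the product form is the cheaper of the two, since it replaces an order-$N^2$ sum with FFTs.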

Ideally, the filter $\hat{S}$ should have a sharp spectral cutoff (in this case, a sharp temporal cutoff in the time domain) but avoid prolonged ringing in the impulse response (i.e., in the frequency response or smoothing function $S$). Here we approximate those ideal characteristics by employing one of a class of "recursive-exponential" filters $\hat{S}_n$, defined by

$$\hat{S}_n(t; t_c) = 1/\Gamma_n(\lambda_n t/t_c), \tag{A4}$$

where $t_c$ is the filter cutoff and $\Gamma_n$ is defined recursively:

$$\Gamma_{n+1}(t) = e^{\Gamma_n(t) - 1}, \quad \text{with } \Gamma_1(t) = e^{t^2}. \tag{A5}$$

[Figure A1 plot: two panels, with abscissae giving frequency and time in units of the filter cutoff.]

FIG. A1. Smoothing functions $S_n(f; f_c)$ (top), representing the Fourier transforms of the corresponding recursive-exponential filters $\hat{S}_n(t; t_c)$ (bottom), for two values of $n$: (solid line) the 10th-order smoothing function $S_{10}$ used to extract $p_0$; (dotted line) the Gaussian smoothing function $S_1$. The smoothing functions are normalized to a maximum value of one and the two abscissae represent frequency and time in units of the filter cutoff. Despite the sharp temporal cutoff in $\hat{S}_{10}$, the smoothing function $S_{10}$ displays only minimal spectral ringing.

[Figure A2 plot: two panels versus Frequency (Hz), 1200-1500 Hz.]

FIG. A2. Two ways of measuring the background. The figure plots the function $p(\omega)$ (triangles connected by solid lines) and the background $p_0$ (dashed line) obtained by smoothing (subject JEM-R, 5 dB SL). After subtracting a ramp function so that the first and last points were made zero, the data segment was convolved with the smoothing function $S_{10}$ with a filter cutoff of $f_c = 200$ Hz and periodic boundary conditions. Data lying outside the two dotted lines (whose distance from the edges represents the approximate width of the smoothing function) were then discarded to reduce end effects. Shown for comparison (smooth solid line) is the high-amplitude background $p_\infty$ measured at 60 dB above threshold. Between the dotted lines, differences between the two backgrounds are small and are due almost entirely to end effects introduced by the boundary conditions.

The scale factor $\lambda_n$ is set by the requirement that the filter amplitude be $1/e$ at the cutoff point $t_c$:

$$\lambda_n = \sqrt{\gamma_n}, \quad \text{where } \gamma_{n+1} = \ln(\gamma_n + 1) \text{ with } \gamma_1 = 1. \tag{A6}$$

Note that the lowest-order ($n = 1$) filter and its Fourier transform are simple Gaussians. The filters $\hat{S}_n$ are entire functions and have no poles or other unpleasantness to contribute exponentially damped sinusoids or other large oscillations to the impulse response.
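The recursive definition translates directly into a few lines of code. The sketch below reads Eqs. (A4)-(A6) as $\hat{S}_n(t;t_c) = 1/\Gamma_n(\lambda_n t/t_c)$ with $\Gamma_1(t) = e^{t^2}$, $\Gamma_{k+1} = e^{\Gamma_k - 1}$, $\lambda_n = \sqrt{\gamma_n}$, $\gamma_1 = 1$, and $\gamma_{k+1} = \ln(\gamma_k + 1)$; the function names are illustrative. Floating-point overflow far beyond the cutoff is deliberately allowed, since it simply drives the filter value to zero.

```python
import numpy as np


def gamma_coeff(n):
    """gamma_n: gamma_1 = 1 and gamma_{k+1} = ln(gamma_k + 1)."""
    g = 1.0
    for _ in range(n - 1):
        g = np.log(g + 1.0)
    return g


def recursive_exponential_filter(t, t_c, n=10):
    """Evaluate 1 / Gamma_n(lambda_n * t / t_c) for scalar or array t."""
    lam = np.sqrt(gamma_coeff(n))
    x = lam * np.asarray(t, dtype=float) / t_c
    with np.errstate(over="ignore"):
        G = np.exp(x ** 2)            # Gamma_1
        for _ in range(n - 1):
            G = np.exp(G - 1.0)       # Gamma_{k+1} = exp(Gamma_k - 1)
        return 1.0 / G


# The filter equals one at t = 0 and 1/e at the cutoff for every order n.
for order in (1, 2, 10):
    print(order, recursive_exponential_filter([0.0, 1.0], t_c=1.0, n=order))
```

Raising the order sharpens the transition around $t_c$ while keeping the same $1/e$ point, which is the property the scale factor is designed to enforce.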

The filters $\hat{S}_n(t; t_c)$ are defined to have (half) widths $\Delta t$ such that $\Delta t/t_c = 1$. The width $\Delta f_n$ of the corresponding smoothing function $S_n(f; f_c) = \mathcal{F}\{\hat{S}_n(t; t_c)\}$ satisfies

$$\Delta f_n/f_c \ge 1/\pi, \tag{A7}$$

where $f_c = 1/t_c$. The lower limit of the inequality is attained with the Gaussian smoothing function ($n = 1$).
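The Gaussian limit can be worked out explicitly. The short derivation below assumes the Fourier convention $S(f) = \int \hat{S}(t)\, e^{-2\pi i f t}\, dt$ (the paper's own convention is not restated in this section) and takes the half-width to be the $1/e$ point, as for the filter itself.

```latex
\begin{align*}
\hat{S}_1(t;t_c) &= e^{-(t/t_c)^2}, \\
S_1(f;f_c) &= \int_{-\infty}^{\infty} \hat{S}_1(t;t_c)\, e^{-2\pi i f t}\, dt
            = \sqrt{\pi}\, t_c\, e^{-(\pi f t_c)^2}, \\
\frac{S_1(\Delta f)}{S_1(0)} &= e^{-1}
  \;\Longrightarrow\; \pi\, \Delta f\, t_c = 1
  \;\Longrightarrow\; \frac{\Delta f}{f_c} = \frac{1}{\pi}.
\end{align*}
```

For a cutoff of $f_c = 200$ Hz, the Gaussian smoothing function would therefore have a half-width of about 64 Hz; higher-order filters are wider.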

As an example, smoothing functions $S_n(f; f_c)$ and corresponding filters $\hat{S}_n(t; t_c)$ are illustrated in Fig. A1 for two values of $n$: simple Gaussians (i.e., $n = 1$) and the 10th-order smoothing functions ($n = 10$) used to extract $p_0$. Despite the sharp temporal cutoff in $\hat{S}_{10}$, the corresponding smoothing function $S_{10}$ displays only minimal spectral ringing.

In practice, measurements are only available over a finite frequency range, and the smoothing operation is complicated by end effects. Throughout this paper, the measured frequency interval was chosen to include an integral number of spectral cycles and smoothing was performed using periodic boundary conditions (i.e., as if the measured frequency interval had been wrapped around a cylinder). The smooth background so obtained was discarded near the "seam" (over a distance from either end of the interval corresponding to the approximate width of the smoothing function). Figure A2 illustrates the procedure.
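The procedure can be sketched end to end on synthetic data. The code below follows the steps described in this Appendix and in the caption of Fig. A2: subtract a ramp so that the first and last points are zero, smooth with periodic boundary conditions by multiplying in the conjugate domain, and flag the points near the seam for discarding. Several details are assumptions made for illustration: the recursive-exponential window is read from Eqs. (A4)-(A6), the ramp is added back after smoothing, the width discarded at each end is passed in as a hypothetical parameter `edge_hz`, and the test signal (a linear background plus an 85-Hz-period ripple spanning exactly four cycles) is invented.

```python
import numpy as np


def rex_filter(t, t_c, n=10):
    """Recursive-exponential window, as read from Eqs. (A4)-(A6)."""
    g = 1.0
    for _ in range(n - 1):
        g = np.log(g + 1.0)                   # gamma_{k+1} = ln(gamma_k + 1)
    x = np.sqrt(g) * np.asarray(t, dtype=float) / t_c
    with np.errstate(over="ignore"):
        G = np.exp(x ** 2)                    # Gamma_1
        for _ in range(n - 1):
            G = np.exp(G - 1.0)               # Gamma_{k+1} = exp(Gamma_k - 1)
        return 1.0 / G


def smooth_background(p, df, f_c, edge_hz, n=10):
    """Estimate the background of p sampled every df Hz (illustrative)."""
    p = np.asarray(p, dtype=complex)
    N = p.size
    idx = np.arange(N)
    # Remove a ramp so that the first and last points are zero.
    ramp = p[0] + (p[-1] - p[0]) * idx / (N - 1)
    resid = p - ramp
    # Conjugate ("time") axis for samples spaced df apart in frequency.
    t = np.fft.fftfreq(N, d=df)
    window = rex_filter(t, 1.0 / f_c, n)
    # The FFT imposes periodic boundary conditions on the smoothing.
    smooth = np.fft.fft(window * np.fft.ifft(resid)) + ramp  # assumption: ramp restored
    # Flag points within edge_hz of either end of the interval.
    n_edge = int(round(edge_hz / df))
    valid = (idx >= n_edge) & (idx < N - n_edge)
    return smooth, valid


# Synthetic example: linear background plus four cycles of ripple.
df = 1.0
f = 1200.0 + df * np.arange(340)
background = 1.0 + 1e-3 * (f - 1200.0)
p = background + 0.1 * np.cos(2.0 * np.pi * f / 85.0)

estimate, valid = smooth_background(p, df, f_c=200.0, edge_hz=70.0)
err = np.max(np.abs(estimate[valid] - background[valid]))
print(f"max background error away from the seam: {err:.2e}")
```

Because the ripple spans a whole number of cycles, it falls on a single conjugate-domain sample beyond the cutoff and is removed cleanly; the residual error comes from the small mismatch at the seam.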

Shown in Fig. A2 for comparison is the background $p_\infty$ measured at 60 dB SL. To control for the slow temporal drift, measurements were made using an $A_1$-$B_1$-$A_2$-$B_2$-$\cdots$ paradigm, where different letters represent different stimulus levels (5 and 60 dB, respectively). The alternating sequence was repeated until two consecutive measurements taken at the same level superposed (i.e., until $A_n = A_{n-1}$ or $B_n = B_{n-1}$). Only those final superposing measurements (and the intervening measurement at the other stimulus level) were analyzed further. It typically took roughly 30 min of measurement time for the recording system/subject to settle down and begin to yield reproducible results free of significant drift. Nonlinearities in the stimulus-delivery and recording equipment were compensated for by "calibrating" the system at the two stimulus levels using the cylindrical cavity discussed in Fig. 3. Between the dotted lines (where end effects are small), $p_0$ and $p_\infty$ almost overlap; their difference is small relative to the size of the oscillatory component.

[Figure A3 plot: two panels versus Frequency (Hz), 1250-1450 Hz.]

FIG. A3. The background $p_0(\omega)$ obtained by (under) smoothing the data segment of Fig. 8 with the function $S_{10}$ with cutoff period $f_c = 88$ Hz (dotted line) so that small oscillations remain in the background. Figure 9, shown for comparison, plots $p$ and $p_0$ using a cutoff period of 200 Hz.

Figures A3 and A4 indicate the nature of the systematic errors introduced into the analysis by the choice of filter cutoff. The figures plot the smooth background $p_0$ and the corresponding function $\eta$ computed using the filter $\hat{S}_{10}$ with a cutoff period of 88 Hz. The results obtained in the text using a cutoff period of 200 Hz are given for comparison. The smaller cutoff leaves clearly visible oscillations in the background.

Although one might naively expect the amplitude of the oscillations in $\eta$ to be reduced correspondingly, Fig. A4 shows that those oscillations have, if anything, increased. To understand this, consider the function $p(\omega; R)$ as a function of $R$. Ideally, smoothing should yield the background function $p(\omega; 0)$. If the width of the smoothing function is too small, however, filtering will (to first order in $R$) instead yield the function $p(\omega; \epsilon R)$, where $\epsilon$ is nonzero but presumably small (with $0 \le |\epsilon| \le 1$). Although the oscillations in the corresponding function $\Delta$, defined by $\Delta = p(\omega; R)/p(\omega; \epsilon R) - 1$, clearly decrease in amplitude, the same is not true of the logarithm. Indeed, taking the logarithm and expanding in power series yields [cf. Eq. (24)]

lnA•ln[(1--e)œl+lnR+(l+e)qR (IqRI41). (AS)

The coefficient of the term proportional to $R$ (and hence the amplitude of the oscillations in $\eta$) therefore increases with the filtering error $\epsilon$. In the limit $\epsilon \to 1$, the oscillations in $\eta$ have twice their proper amplitude.

[Figure A4 plot: two panels versus Frequency (Hz), 1200-1500 Hz.]

FIG. A4. The function $\eta(\omega)$ computed by using the (under) smoothed background from Fig. A3 extracted with the filter $\hat{S}_{10}$ and a cutoff period of 88 Hz (dotted line). For comparison, the figure is shown superimposed on the data from Fig. 14, for which the background was extracted using a cutoff period of 200 Hz. Although differences are small, the oscillations have generally increased slightly in amplitude.

¹See, however, the work of LePage (LePage, 1987; LePage, 1990), who describes experiments identifying another possible component, a "summating baseline shift," in the response of the organ of Corti and suggests that they provide evidence for the dynamic control of cochlear tuning, achieved by varying the tension in the radial fibers of the pars pectinata. The outer hair cells then serve (returning to Helmholtz) much like the pedals on an orchestral harp.

²The mechanics underlying the production of evoked otoacoustic emissions (represented here, for concreteness, as "wave reflection from mechanical inhomogeneities in the organ of Corti") can, without effect on the conclusions of this paper, be summarized more generically as "stimulus re-emission by a distribution of emission generators."

³Equation (1) for the pressure ratio $P_{ec}(\omega; R)/P_{ec}(\omega; 0)$ follows from Eqs. (16)-(21) in the limit that $Y_s = 0$ and the middle ear acts like a simple mechanical transformer (e.g., has transfer coefficients $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ satisfying the relations $ad = 1$ and $b \approx c \approx 0$).

⁴Dispersion relations (9) and (10) are valid when the measured pressure vanishes as $\omega \to \infty$. When that is not the case, modified, or subtracted, dispersion relations exist (Bode, 1945). In addition, the subtracted form of the dispersion relations may be computationally more convenient if the low-frequency behavior of the pressure is known but the high-frequency behavior poorly determined (Zweig, 1976; Zweig and Konishi, 1987).

⁵For simplicity, and for consistency with the representation given in Fig. 2 of Zwicker and Schloth (1984), the Hilbert-transform analysis given in Fig. 7 is performed on the log-amplitude and phase of the pressure (i.e., on the real and imaginary parts of $\ln P$ rather than $P$). The analysis therefore tests for the more stringent analyticity properties of minimum-phase behavior. Minimum-phase behavior, required of driving-point impedances, is found in the emission measurements reported here. Hilbert-transform analysis based on the real and imaginary parts of $P$ demonstrates that the measurements of Zwicker and Schloth cannot represent an accurate characterization of a causal system.

⁶Zwicker and Schloth (1984) report their subject (A.S.I) as having no measurable spontaneous emission.

⁷Here we assume that the cochlear contents are essentially incompressible (Shera and Zweig, 1992c).

⁸Relaxing the assumption of tapering symmetry and adopting the more general expression for the input impedance valid when the cochlear wave impedances depend on the direction of propagation (Shera and Zweig, 1991b) changes only the values of the coefficients and not the form of the power series for $p$ obtained as Eq. (16) below.

⁹In this context, a function $y(f)$ varies slowly with frequency if its fractional change over a typical period of oscillation $\Delta f$ is small; that is, if

$$\Delta f \left| \frac{d \ln y}{df} \right| \ll 1.$$

For the frequency region near 1 kHz, $\Delta f = 90$ Hz. When $y$ satisfies this inequality, we write $y(\epsilon f)$ to indicate that the frequency derivative is multiplied by a small parameter ($\epsilon \ll 1$).

¹⁰Globally, $R(\omega; A)$ has the form

$$R(\omega; A) = |R|(\omega; A)\, e^{-i\theta(\omega; A)},$$

where $|R|$ and $\theta$ are, respectively, the real-valued amplitude and phase. The requirement that the Fourier transform of $Z(\omega; A)$ be real imposes the symmetry

$$R^*(-\omega; A) = R(\omega; A),$$

and, consequently, $|R|$ and $\theta$ are even and odd functions of frequency, respectively.

¹¹When fitting the data, the reference frequency $\omega_1$ also serves to decrease the sensitivity of the phase shift $\phi_1$ to changes in the parameter $\tau$.

¹²A short-latency component has, however, been identified in the gerbil (Kemp and Brown, 1983b).

¹³An estimate of $p_0$ guaranteed to have the same minimum-phase behavior as $p$ is given by

$$p_0 = \exp\left[\mathcal{F}\{\hat{S} \times \mathcal{F}^{-1}\{\ln p\}\}\right].$$

Numerically, this estimate is essentially indistinguishable from the estimate obtained by filtering $p$ directly.

Békésy, G. von (1960). Experiments in Hearing (McGraw-Hill, New York).

Blackham, R. C., Vasil, J. A., Atkinson, E. S., and Potter, R. W. (1987). "Measurement modes and digital demodulation for a low-frequency analyzer," Hewlett-Packard J. 38(1), 17-25.

Bode, H. (1945). Network Analysis and Feedback Amplifier Design (Van Nostrand Reinhold, Princeton).

Elliot, E. (1958). "A ripple effect in the audiogram," Nature 181, 1076.

Friedland, B., Wing, O., and Ash, R. (1961). Principles of Linear Networks (McGraw-Hill, New York).

Fukazawa, T. (1992). "Evoked otoacoustic emissions in a nonlinear model of the cochlea," Hear. Res. 59, 17-24.

Helmholtz, H. L. F. (1863). Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik (Vieweg, Braunschweig), translated by A. J. Ellis, On the Sensations of Tone as a Physiological Basis for the Theory of Music (Dover, New York, 1954).

Horst, J. W., Wit, H. P., and Ritsma, R. J. (1983). "Psychophysical aspects of cochlear acoustic emissions ('Kemp tones')," in Hearing--Physiological Bases and Psychophysics, edited by R. Klinke and R. Hartmann (Springer-Verlag, Berlin), pp. 89-94.

Kemp, D. T. (1978). "Stimulated acoustic emissions from within the human auditory system," J. Acoust. Soc. Am. 64, 1386-1391.

Kemp, D. T. (1979). "The evoked cochlear mechanical response and the auditory microstructure--Evidence for a new element in cochlear mechanics," Scand. Audiol. Suppl. 9, 35-47.

Kemp, D. T. (1980). "Towards a model for the origin of cochlear echoes," Hear. Res. 2, 533-548.

Kemp, D. T., and Chum, R. A. (1980). "Observations on the generator mechanism of stimulus frequency acoustic emissions--Two tone suppression," in Psychophysical, Physiological and Behavioural Studies in Hearing, edited by G. Van Den Brink and F. A. Bilsen (Delft U.P., Delft), pp. 34-42.

Kemp, D. T., and Brown, A. M. (1983a). "An integrated view of cochlear mechanical nonlinearities observable from the ear canal," in Mechanics of Hearing, edited by E. de Boer and M. A. Viergever (Martinus Nijhoff, The Hague), pp. 75-82.

Kemp, D. T., and Brown, A. M. (1983b). "A comparison of mechanical nonlinearities in the cochleae of man and gerbil from ear canal measurements," in Hearing--Physiological Bases and Psychophysics, edited by R. Klinke and R. Hartmann (Springer-Verlag, Berlin), pp. 75-82.

Konishi, S., and Zweig, G. (1989). "Smoothing causal functions" (in preparation).

Kringlebotn, M. (1988). "Network model for the human middle ear," Scand. Audiol. 17, 75-85.

LePage, E. L. (1987). "Frequency-dependent self-induced bias of the basilar membrane and its potential for controlling sensitivity and tuning in the mammalian cochlea," J. Acoust. Soc. Am. 82, 139-154.

LePage, E. L. (1990). "Helmholtz revisited: Direct mechanical data suggest a physical model for dynamic control of mapping frequency to place along the cochlear partition," in Mechanics and Biophysics of Hearing, edited by P. Dallos, C. D. Geisler, J. W. Matthews, M. A. Ruggero (Springer-Verlag, Berlin), pp. 278-285.

Lonsbury-Martin, B. L., Martin, G. K., Probst, R., and Coats, A. C. (1988). "Spontaneous otoacoustic emissions in the nonhuman primate. II. Cochlear anatomy," Hear. Res. 33, 69-94.

Lynch, T. J., Nedzelnitsky, V., and Peake, W. T. (1982). "Input impedance of the cochlea in cat," J. Acoust. Soc. Am. 72, 108-130.

Manley, G. A. (1983). "Frequency spacing of acoustic emissions: A possible explanation," in Mechanisms of Hearing, edited by W. R. Webster and L. M. Aitkin (Monash U.P., Clayton, Australia), pp. 36-39.

Mott, J. B., Norton, S. J., Neely, S. T., and Warr, W. B. (1989). "Changes in spontaneous otoacoustic emissions produced by acoustic stimulation of the contralateral ear," Hear. Res. 38, 229-242.

Neely, S. T., Norton, S. J., Gorga, M. P., and Jesteadt, W. (1988). "Latency of auditory brainstem responses and otoacoustic emissions using tone-burst stimuli," J. Acoust. Soc. Am. 83, 652-656.

Norton, S. J., and Neely, S. T. (1987). "Tone-burst-evoked otoacoustic emissions from normal-hearing subjects," J. Acoust. Soc. Am. 81, 1860-1872.

Papoulis, A. (1977). Signal Analysis (McGraw-Hill, New York).

Peisl, W. (1988). "Simulation von zeitverzögerten evozierten otoakustischen Emissionen mit Hilfe eines digitalen Innenohrmodells," in Fortschritte der Akustik DAGA '88 (DPG-GmbH, Bad Honnef).

Peterson, L. C., and Bogert, B. P. (1950). "A dynamical theory of the cochlea," J. Acoust. Soc. Am. 22, 369-381.

Shera, C. A., and Zweig, G. (1991a). "A symmetry suppresses the cochlear catastrophe," J. Acoust. Soc. Am. 89, 1276-1289.

Shera, C. A., and Zweig, G. (1991b). "Reflection of retrograde waves within the cochlea and at the stapes," J. Acoust. Soc. Am. 89, 1290-1305.

Shera, C. A., and Zweig, G. (1991c). "Phenomenological characterization of eardrum transduction," J. Acoust. Soc. Am. 90, 253-262.

Shera, C. A. (1992). "Listening to the Ear," Ph.D. thesis, California Institute of Technology.

Shera, C. A., and Zweig, G. (1992a). "Middle-ear phenomenology: The view from the three windows," J. Acoust. Soc. Am. 92, 1356-1370.

Shera, C. A., and Zweig, G. (1992b). "Analyzing reverse middle-ear transmission: Noninvasive Gedankenexperiments," J. Acoust. Soc. Am. 92, 1371-1381.

Shera, C. A., and Zweig, G. (1992c). "An empirical bound on the compressibility of the cochlea," J. Acoust. Soc. Am. 92, 1382-1388.

Shera, C. A., and Zweig, G. (1993). "Dynamic symmetry creation: The origin of spectral periodicity in evoked otoacoustic emission," submitted to J. Acoust. Soc. Am.

Slater, J. C. (1942). Microwave Transmission (McGraw-Hill, New York).

Strube, H. W. (1985). "A computationally efficient basilar-membrane model," Acustica 58, 207-214.

Strube, H. W. (1989). "Evoked otoacoustic emissions as cochlear Bragg reflections," Hear. Res. 38, 35-45.

Whitehead, M. L. (1991). "Slow variations in the amplitude and frequency of spontaneous otoacoustic emissions," Hear. Res. 53, 269-280.

Wilson, J. P. (1980). "Evidence for a cochlear origin for acoustic re-emissions, threshold fine-structure and tonal tinnitus," Hear. Res. 2, 233-252.

Wright, A. A. (1984). "Dimensions of the cochlear stereocilia in man and in guinea pig," Hear. Res. 13, 89-98.

Zweig, G. (1976). "Basilar membrane motion," in Cold Spring Harbor Symposia on Quantitative Biology (Cold Spring Harbor Laboratory, Cold Spring Harbor), Vol. XL, pp. 619-633.

Zweig, G., and Konishi, S. (1987). "Constraints on measurements of causal or minimum-phase systems," accepted for publication in J. Acoust. Soc. Am.

Zweig, G. (1991). "Finding the impedance of the organ of Corti," J. Acoust. Soc. Am. 89, 1229-1254.

Zweig, G., McDowell, J. E., and Shera, C. A. (1992). "Latency of tone-burst-evoked otoacoustic emissions" (in preparation).

Zwicker, E., and Schloth, E. (1984). "Interrelation of different oto-acoustic emissions," J. Acoust. Soc. Am. 75, 1148-1154.

Zwislocki, J. (1962). "Analysis of the middle-ear function. Part I: Input impedance," J. Acoust. Soc. Am. 34, 1514-1523.

Zwislocki-Mościcki, J. (1948). "Theorie der Schneckenmechanik: Qualitative und quantitative Analyse," Acta Otolaryngol. Suppl. 72, 1-112.

3352 J. Acoust. Soc. Am., Vol. 93, No. 6, June 1993 C. A. Shera and G. Zweig: Cochlear waves and emissions 3352