Encyclopedia of Physical Science and Technology - Classical Physics



Encyclopedia of Physical Science and Technology EN001-05 May 25, 2001 16:7

Acoustic Chaos

Werner Lauterborn
Universität Göttingen

I. The Problem of Acoustic Cavitation Noise
II. The Period-Doubling Noise Sequence
III. A Fractal Noise Attractor
IV. Lyapunov Analysis
V. Period-Doubling Bubble Oscillations
VI. Theory of Driven Bubbles
VII. Other Systems
VIII. Philosophical Implications

GLOSSARY

Bifurcation Qualitative change in the behavior of a system when a parameter (temperature, pressure, etc.) is altered (e.g., period-doubling bifurcation); related to phase change in thermodynamics.

Cavitation Rupture of liquids when subject to tension, either in flow fields (hydraulic cavitation) or by an acoustic wave (acoustic cavitation).

Chaos Behavior (motion) with all signs of statistics despite an underlying deterministic law (often, deterministic chaos).

Fractal Object (set of points) that does not have a smooth structure with an integer dimension (e.g., three dimensional). Instead, a fractal (noninteger) dimension must be ascribed to it.

Period doubling Special way of obtaining chaotic (irregular) motion; the period of a periodic motion doubles repeatedly until, in the limit of infinite doubling, aperiodic motion is obtained.

Phase space Space spanned by the dependent variables of a dynamic system. A point in phase space characterizes a specific state of the system.

Strange attractor In dissipative systems, the motion tends to certain limit forms (attractors). When the motion comes to rest, this attractor is called a fixed point. Chaotic motions run on a strange attractor, which has involved properties (e.g., a fractal dimension).

THE PAST FEW years have seen a remarkable development in physics, which may be described as the upsurge of “chaos.” Chaos is a term scientists have adapted from common language to describe the motion or behavior of a system (physical or biological) that, although governed by an underlying deterministic law, is irregular and, in the long term, unpredictable.

Chaotic motion seems to appear in any sufficiently complex dynamical system. Acoustics, that part of physics that describes the vibration of usually larger ensembles of molecules in gases, liquids, and solids, is no exception. Because a main necessary ingredient of chaotic dynamics is nonlinearity, acoustic chaos is closely related to nonlinear oscillations and waves in gases, liquids, and solids. It is the science of never-repeating sound waves. This property it shares with noise, a term having its origin in acoustics and formerly attributed to every sound signal with a broadband Fourier spectrum. But Fourier analysis is especially adapted to linear oscillatory systems. The standard interpretation of the lines in a Fourier spectrum is that each line corresponds to a (linear) mode of vibration and a degree of freedom of the system. However, as examples from chaos physics show, a broadband spectrum can already be obtained with just three (nonlinear) degrees of freedom (that is, three dependent variables). Chaos physics thus develops a totally new view of the noise problem. It is a deterministic view, but it is still an open question how far the new approach will reach in explaining still unsolved noise problems (e.g., the 1/f-noise spectrum encountered so often). The detailed relationship between chaos and noise is still an area of active research. An example in which the properties of acoustic noise could be related to chaotic dynamics is given below for the case of acoustic cavitation noise.

Acoustic chaos appears in an experiment when a liquid is irradiated with sound of high intensity. The liquid then ruptures to form bubbles or cavities (almost empty bubbles). The phenomenon is known as acoustic cavitation and is accompanied by intense noise emission, the acoustic cavitation noise. It has its origin in the bubbles set into oscillation in the sound field. Bubbles are nonlinear oscillators, and it can be shown both experimentally and theoretically that they exhibit chaotic oscillations after a series of period doublings. The acoustic emission from these bubbles is then a chaotic sound wave (i.e., it is irregular and never repeats). This is acoustic chaos.

I. THE PROBLEM OF ACOUSTIC CAVITATION NOISE

The projection of high-intensity sound into liquids has been investigated ever since sound came into use for locating objects under water. It was soon noticed that at too high an intensity the liquid may rupture, giving rise to acoustic cavitation. This phenomenon is accompanied by broadband noise emission, which is detrimental to the useful operation of, for instance, a sonar device.

The noise emission presents an interesting physical problem that may be formulated in the following way. A sound wave of a single frequency (a pure tone) is transformed into a broadband sound spectrum, consisting of an (almost) infinite number of neighboring frequencies. What is the physical mechanism that causes this transformation? The question may even be shifted in its emphasis to ask what physical mechanisms are known to convert a single frequency to a broadband spectrum. This could not be answered before chaos theory was developed. However, although chaos theory is now well established, a physical (intuitive) understanding is still lacking.

II. THE PERIOD-DOUBLING NOISE SEQUENCE

To investigate the sound emission from acoustic cavitation, the experimental arrangement depicted in Fig. 1 is used. To irradiate the liquid (water), a piezoceramic cylinder (PZT-4) of 76-mm length, 76-mm inner diameter, and 5-mm wall thickness is used. When driven at its main resonance, 23.56 kHz, a high-intensity acoustic field is generated in the interior and cavitation is easily achieved. The noise is picked up by a broadband hydrophone and digitized at rates up to 60 MHz after suitable lowpass filtering (for correct analog-to-digital conversion for later processing) and strong filtering of the driving frequency, which would otherwise dominate the noise output. The experiment is fully computer controlled. The amplitude of the driving sound field can be made an arbitrary function of time via a programmable synthesizer. In most cases, linear ramp functions are applied to study the buildup of noise when the driving pressure amplitude in the liquid is increased.

From the data stored in the memory of the transient recorder, power spectra are calculated via the fast-Fourier-transform algorithm from usually 4096 samples out of the 128 × 1024 samples stored. This yields about 1000 short-time spectra when the 4096 samples are shifted by 128 samples from one spectrum to the next.
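The windowing arithmetic above can be checked with a short sketch. It is only an illustration in Python, assuming a plain rectangular window and a textbook radix-2 FFT; the window function and scaling actually used in the experiment are not specified in the text, and the function names (`fft`, `count_windows`, `short_time_power_spectra`) are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def count_windows(total, width=4096, hop=128):
    """Number of full windows of `width` samples when shifted by `hop`."""
    return (total - width) // hop + 1


def short_time_power_spectra(samples, width=4096, hop=128):
    """Power spectrum (bins 0 .. width/2 - 1) of each shifted window."""
    spectra = []
    for start in range(0, len(samples) - width + 1, hop):
        spectrum = fft(samples[start:start + width])
        spectra.append([abs(c) ** 2 for c in spectrum[: width // 2]])
    return spectra


# 128 x 1024 stored samples, 4096-sample windows shifted by 128 samples
print(count_windows(128 * 1024))  # 993, i.e. "about 1000" spectra
```

Shifting by only 128 samples means that consecutive windows overlap heavily, so the sequence of spectra follows the slow ramp of the driving amplitude in fine steps.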

FIGURE 1 Experimental arrangement for measurements on acoustic cavitation noise (chaotic sound).

FIGURE 2 Power spectra of acoustic cavitation noise at different excitation levels (related to the pressure amplitudes of the driving sound wave). (From Lauterborn, W. (1986). Phys. Today 39, S-4.)

Figure 2 shows four power spectra from one such experiment. Each diagram gives the excitation level at the transducer in volts, the time in milliseconds since the experiment (irradiating the liquid with a linear ramp of increasing excitation) started, and the power spectrum at this time. At the beginning of the experiment, at low sound intensity, only the driving frequency f0 shows up. In the upper left diagram of Fig. 2 the third harmonic, 3f0, is present. When comparing both lines it should be remembered that the driving frequency is strongly damped by filtering. In the lower left-hand diagram, many more lines are present. Of special interest is the spectral line at f0/2 (and its harmonics). A well-known feature of nonlinear systems is that they produce higher harmonics. Not yet widely known is that subharmonics can also be produced by some nonlinear systems. These then seem to spontaneously divide the applied frequency f0 to yield, for example, exactly half that frequency (or exactly one-third). This phenomenon has become known as a period-doubling (-tripling) bifurcation. A large class of systems has been found to show period doubling, among them driven nonlinear oscillators. A peculiar feature of the period-doubling bifurcation is that it occurs in sequences; that is, when one period-doubling bifurcation has occurred, it is likely that further period doubling will occur upon altering a parameter of the system, and so on, often in an infinite series. Acoustic cavitation has been one of the first experimental examples known to exhibit this series. In Fig. 2, the upper right-hand diagram shows the noise spectrum after further period doubling to f0/4. The doubling sequence can be observed via f0/8 and f0/16 up to f0/32 (not shown here). It is obvious that the spectrum is rapidly “filled” with lines and gets more and more dense. The limit of the infinite series yields an aperiodic motion with a densely (though not homogeneously) packed power spectrum, that is, broadband noise (but characteristically colored by lines). One such noise spectrum is shown in Fig. 2 (lower right-hand diagram). Thus, at least one way of turning a pure tone into broadband noise has been found: successive period doubling.
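How quickly the spectrum fills can be made concrete with a little arithmetic. The sketch below is an illustration only: it lists the lowest subharmonic after each doubling for the driving frequency quoted in the text and counts the possible line positions between 0 and f0, assuming lines can appear at every multiple of the lowest subharmonic. The function names are hypothetical.

```python
def subharmonic_cascade(f0_hz, doublings):
    """Lowest spectral line after each period doubling: f0/2, f0/4, ..."""
    return [f0_hz / 2 ** n for n in range(1, doublings + 1)]


def lines_up_to_f0(doublings):
    """Possible line positions at multiples of f0/2**doublings in (0, f0]."""
    return 2 ** doublings


f0 = 23_560.0  # driving frequency in Hz (23.56 kHz, from the text)
for n, f in enumerate(subharmonic_cascade(f0, 5), start=1):
    print(f"after {n} doubling(s): lowest line {f:9.2f} Hz, "
          f"{lines_up_to_f0(n)} possible lines up to f0")
```

After the five doublings reported (down to f0/32), up to 32 lines can already sit between 0 and f0, which is why the spectrum looks nearly continuous well before the aperiodic limit is reached.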

This finding has a deeper implication. If a system becomes aperiodic through the phenomenon of repeated period doubling, then this is a strong indication that the irregularity attained in this way is of simple deterministic origin. This implies that acoustic cavitation noise is not a basically statistical phenomenon but a deterministic one. It also implies that a description of the system with usual statistical means may not be appropriate and that a successful description by some deterministic theory may be feasible.

III. A FRACTAL NOISE ATTRACTOR

In Section II the sound signal has been treated by Fourier analysis. Fourier analysis is a decomposition of a signal into a sum of simple waves (normal modes) and is said to give the degrees of freedom of the described system. Chaos theory shows that this interpretation must be abandoned. Broadband noise, for instance, is usually thought to be due to a high (nearly infinite) number of degrees of freedom that, superposed, yield noise. Chaotic systems, however, have the ability to produce noise with only a few (nonlinear) degrees of freedom, that is, with only a few dependent variables. Also, it has been found that continuous systems with only three dependent variables are capable of chaotic motions and thus of producing noise. Chaos theory has developed new methods to cope with this problem. One of these is phase-space analysis, which in conjunction with fractal dimension estimation is capable of yielding the intrinsic degrees of freedom of the system. This method has been applied to inspect acoustic cavitation noise. The answer it may give is the dimension of the dynamical system producing acoustic cavitation noise. See SERIES.

FIGURE 3 Strange attractor of acoustic cavitation noise obtained by phase-space analysis of experimental data (a time series of pressure values sampled at 1 MHz). The attractor is rotated to visualize its three-dimensional structure. (Courtesy of J. Holzfuss. From Lauterborn, W. (1986). In “Frontiers in Physical Acoustics” (D. Sette, ed.), pp. 124–144, North Holland, Amsterdam.)

The sampled noise data are first used to construct a noise attractor in a suitable phase space. Then the (fractal) dimension of the attractor is determined. The procedure to construct an attractor in a space of chosen dimension n simply consists in combining n samples (not necessarily consecutive ones) to an n-tuple, whose entries are interpreted as the coordinate values of a point in n-dimensional Euclidean space. An example of a noise attractor constructed in this way is given in Fig. 3. The attractor has been obtained from a time series of pressure values p(kts), k = 1, ..., 4096, with ts = 1 µsec, taken at a sampling frequency of fs = 1/ts = 1 MHz, by forming the three-tuples [p(kts), p(kts + T), p(kts + 2T)], k = 1, ..., 4086, with T = 5 µsec. The frequency of the driving sound field has been 23.56 kHz. The attractor in Fig. 3 is shown from different views to demonstrate its nearly flat structure. It is most remarkable that the result is not an unstructured cluster of points, as is expected for noise, but a quite well-defined object. This suggests that the dynamical system producing the noise has only a few nonlinear degrees of freedom. The flat appearance of the attractor in a three-dimensional phase space (Fig. 3) suggests that only three essential degrees of freedom are needed for the system. This is confirmed by a fractal dimension analysis, which yields a dimension of d = 2.5 for this attractor. Unfortunately, no method has yet been conceived for constructing the equations of motion from the data.
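The construction of the three-tuples can be sketched in a few lines. This is a minimal illustration, not the original analysis code: `delay_embed` is a hypothetical helper, and the list of sample indices below merely stands in for the measured pressure values. With a 1-MHz sampling rate, the delay T = 5 µsec corresponds to 5 samples.

```python
def delay_embed(series, delay, dim=3):
    """Return the dim-tuples (s[k], s[k+delay], ..., s[k+(dim-1)*delay])."""
    n_points = len(series) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("series too short for this delay and dimension")
    return [
        tuple(series[k + j * delay] for j in range(dim))
        for k in range(n_points)
    ]


# Stand-in data: 4096 sample indices instead of measured pressure values.
pressure = list(range(4096))
points = delay_embed(pressure, delay=5)  # T = 5 samples = 5 µsec at 1 MHz
print(len(points))   # 4086
print(points[0])     # (0, 5, 10)
```

The count of 4086 points follows directly from the data length: the last usable starting index must leave room for two further delays of 5 samples each, so 4096 - 2 × 5 = 4086 tuples can be formed.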

IV. LYAPUNOV ANALYSIS

Chaotic systems exhibit what is called sensitive dependence on initial conditions. This expression has been introduced to denote the property of a chaotic system that differences in the initial conditions, however small, are persistently magnified because of the dynamics of the system. This property is captured mathematically by the notion of Lyapunov exponents and Lyapunov spectra. Their definition can be illustrated by the deformation of a small sphere of initial conditions along a fiducial trajectory (see Fig. 4). The expansion or contraction is used to define the Lyapunov exponents λi, i = 1, 2, ..., m, where m is the dimension of the phase space of the system. When, on the average, for example, r1(t) (an axis of the deformed sphere at time t) is larger than r1(0), then λ1 > 0 and there is a persistent magnification in the system. The set λi, i = 1, ..., m, whereby the λi usually are ordered λ1 ≥ λ2 ≥ ··· ≥ λm, is called the Lyapunov spectrum.

FIGURE 4 Idea for defining Lyapunov exponents. A small sphere in phase space is deformed to an ellipsoid, indicating expansion or contraction of neighboring trajectories.

FIGURE 5 Acoustic cavitation bubble field in water inside a cylindrical piezoelectric transducer of about 7 cm in diameter. Two planes in depth are shown about 5 mm apart. The pictures are obtained by photographs from the reconstructed three-dimensional image of a hologram taken with a ruby laser.

In dissipative systems, the final motion takes place on attractors. Besides the fractal dimension, as discussed in the previous section, the Lyapunov spectrum may serve to characterize these attractors. When at least one Lyapunov exponent is greater than zero, the attractor is said to be chaotic. Progress in the field of nonlinear dynamics has made possible the calculation of the Lyapunov spectrum from a time series. It could be shown that acoustic cavitation in the region of broadband noise emission is characterized by one positive Lyapunov exponent.
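The sign criterion can be tried out on a toy system. The sketch below does not reproduce the time-series method referred to above; as a stand-in it uses the logistic map x → r x (1 − x), a standard one-dimensional example that is not part of this article, for which the single Lyapunov exponent is the long-time average of ln|f′(x)| along an orbit. The function name and parameter values are illustrative assumptions.

```python
import math


def logistic_lyapunov(r, x0=0.3, transient=1000, steps=20000):
    """Estimate the Lyapunov exponent of x -> r*x*(1-x) by averaging ln|f'(x)|."""
    x = x0
    for _ in range(transient):          # discard the approach to the attractor
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(steps):
        slope = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / steps


print(logistic_lyapunov(3.2))  # negative: period-2 motion
print(logistic_lyapunov(3.9))  # positive: chaotic motion
```

A negative value means that neighboring orbits converge (periodic attractor); a positive value means persistent magnification of small differences, which is the defining property of a chaotic attractor.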

V. PERIOD-DOUBLING BUBBLE OSCILLATIONS

Thus far, only the acoustic signal has been investigated. An optical inspection of the liquid inside the piezoelectric cylinder (see Fig. 1) reveals that a highly structured cloud of bubbles or cavities is present (Fig. 5), oscillating and moving in the sound field. It is obviously these bubbles that produce the noise. If this is the case, the bubbles must move chaotically and should show the period-doubling sequence encountered in the noise output. This has been confirmed by holographic investigations in which a hologram of the bubble field has been taken once per period of the driving sound field. Holograms have been taken because the bubbles move in three dimensions, and it is difficult to photograph them at high resolution when an extended depth of view is needed. In one experiment the driving frequency was 23,100 Hz, which means 23,100 holograms per second have been taken. The total number of holograms, however, was limited to a few hundred. Figure 6a gives an example of a series of photographs taken from a holographic series. In this case, two period-doubling bifurcations have already taken place, since the oscillations only repeat after four cycles of the driving sound wave. The first period doubling is strongly visible; the second one can only be seen by careful inspection. Figure 6b gives the noise power spectrum taken simultaneously with the holograms. The acoustic measurements show both period doublings more clearly than the optical measurement (documented in Fig. 6a), as the f0/4 (f0 = 23.1 kHz) spectral line is strongly present together with its harmonics.

FIGURE 6 Reconstructed images from (a) a holographic series taken at 23,100 holograms per second of bubbles inside a piezoelectric cylinder driven at 23,100 Hz and (b) the corresponding power spectrum of the noise emitted. Two period doublings have taken place.

FIGURE 7 Period-doubling route to chaos for a driven bubble oscillator. Left column: radius-time solution curves; middle left column: trajectories in phase space; middle right column: Poincaré section plots; right column: power spectra. Rn is the radius of the bubble at rest, Ps and ν are the pressure amplitude and frequency of the driving sound field, respectively. (From Lauterborn, W., and Parlitz, U. (1988). J. Acoust. Soc. Am. 84, 1975.)

VI. THEORY OF DRIVEN BUBBLES

A theory has not yet been developed that can account for the dynamics of a bubble field as shown in Fig. 5. The most advanced theory is only able to describe the motion of a single spherical bubble in a sound field. Even with suitable simplifications the model is a highly nonlinear ordinary differential equation of second order for the radius R of the bubble as a function of time. With a sinusoidal driving term (sound wave) the phase space is three dimensional, just sufficient for a dynamical system to show irregular (chaotic) motion. The model is an example of a driven nonlinear oscillator for which chaotic solutions in certain parameter regions are by now standard. However, period doubling and irregular motion were found in the late 1960s in numerical calculations, when chaos theory was not yet available and the interpretation of the results was therefore difficult. The surprising fact is that even this simple model of a purely spherically oscillating bubble set into oscillation by a sound wave yields successive period doubling up to chaotic oscillations. Figure 7 demonstrates the period-doubling route to chaos in four ways. The leftmost column gives the radius of the bubble in the sound field as a function of time, where the dot on the curve indicates the lapse of a full period of the driving sound field. The next column shows the corresponding trajectories in the plane spanned by the radius of the bubble and its velocity. The dots again mark the lapse of a full period of the driving sound field. The third column shows so-called Poincaré section plots. Here, only the dots after the lapse of one full period of the driving sound field are plotted in the radius–velocity plane of the bubble motion. Period doubling is seen most easily here and also the evolution of a strange (or chaotic) attractor. The rightmost column gives the power spectra of the radial bubble motion. The filling of the spectrum with successive lines in between the old lines is evident, as is the ultimate filling when the chaotic motion is reached.
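The stroboscopic idea behind a Poincaré section plot can be sketched for any periodically driven second-order oscillator. The bubble equation itself is not given in this article, so the example below substitutes the driven Duffing oscillator as a stand-in; the parameter values are a commonly quoted set for irregular motion and, like the function names, are assumptions made for illustration. Position, velocity, and driving phase together form the three-dimensional phase space mentioned above; sampling once per driving period fixes the phase and leaves points in the position–velocity plane.

```python
import math


def duffing_rhs(t, x, v, delta=0.3, gamma=0.5, omega=1.2):
    """x'' + delta*x' - x + x**3 = gamma*cos(omega*t), as a first-order system."""
    return v, -delta * v + x - x ** 3 + gamma * math.cos(omega * t)


def poincare_section(n_periods=200, steps_per_period=200, omega=1.2,
                     x=0.1, v=0.0):
    """Integrate with classical RK4 and record (x, v) once per driving period."""
    h = (2.0 * math.pi / omega) / steps_per_period
    t = 0.0
    points = []
    for _ in range(n_periods):
        for _ in range(steps_per_period):
            k1x, k1v = duffing_rhs(t, x, v, omega=omega)
            k2x, k2v = duffing_rhs(t + h / 2, x + h / 2 * k1x,
                                   v + h / 2 * k1v, omega=omega)
            k3x, k3v = duffing_rhs(t + h / 2, x + h / 2 * k2x,
                                   v + h / 2 * k2v, omega=omega)
            k4x, k4v = duffing_rhs(t + h, x + h * k3x,
                                   v + h * k3v, omega=omega)
            x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            t += h
        points.append((x, v))  # stroboscopic sample at a fixed driving phase
    return points


section = poincare_section()
print(len(section))  # one point per driving period
```

In such a plot a motion with the period of the driving collapses to a single point, a period-doubled motion to two points, and a chaotic motion to a scattered set that traces out the strange attractor.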

A compact way to show the period-doubling route to chaos is by plotting the radius of the bubble as a function of a parameter of the system that can be varied, e.g., the frequency of the driving sound field. Figure 8a gives an example for a bubble with a radius at rest of Rn = 10 µm, driven by a sound field of frequency ν between 390 kHz and 510 kHz at a pressure amplitude of Ps = 290 kPa. The period-doubling cascade to chaos is clearly visible. In the chaotic region, “windows” of periodicity show up, as regularly experienced with other chaotic systems. In Fig. 8b the largest Lyapunov exponent λmax is plotted. It is seen that λmax > 0 when the chaotic region is reached. Figure 8c gives a further characterization of the system by the winding number w. The winding number describes the winding of a neighboring trajectory around the given one per period of the bubble oscillation. It can be seen that this quantity changes quite regularly in the period-doubling sequence, and rules can be given for this change.

FIGURE 8 (a) A period-doubling cascade as seen in the bifurcation diagram. (b) The corresponding largest Lyapunov exponent λmax. (c) The winding number w. (From Parlitz, U. et al. (1990). J. Acoust. Soc. Am. 88, 1061.)

The driven bubble system shows resonances at various frequencies that can be labeled by the ratio of the linear resonance frequency of the bubble to the driving frequency of the sound wave. Figure 9 gives an example of the complicated response characteristic of a driven bubble. At somewhat higher driving than given in the figure the oscillations start to become chaotic. A chaotic bubble attractor is shown in Fig. 10. To better reveal its structure, it is not the total trajectory that is plotted but only the points in the velocity–radius plane of the bubble wall at a fixed phase of the driving. These points hop around on the attractor in an irregular fashion. These chaotic bubble oscillations must be considered as the source of the chaotic sound output observed in acoustic cavitation.
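To get a feeling for the frequency ratio used to label the resonances, the linear resonance frequency can be estimated. The article gives no formula, so the sketch below assumes Minnaert's classical expression, which neglects surface tension and viscosity and is therefore only a rough estimate for a bubble as small as 10 µm; the ambient pressure, density, and polytropic exponent are assumed values for an air bubble in water.

```python
import math


def minnaert_frequency(radius_m, p0=101_325.0, rho=998.0, kappa=1.4):
    """Linear resonance frequency (Hz) of a gas bubble, neglecting surface
    tension and viscosity (Minnaert's formula)."""
    return math.sqrt(3.0 * kappa * p0 / rho) / (2.0 * math.pi * radius_m)


f_res = minnaert_frequency(10e-6)          # bubble with Rn = 10 µm
for nu in (390e3, 510e3):                  # driving range quoted for Fig. 8
    print(f"nu = {nu / 1e3:.0f} kHz, f_res / nu = {f_res / nu:.2f}")
```

Under these assumptions the estimate comes out at roughly 330 kHz, so the driving frequencies of 390 to 510 kHz quoted for Fig. 8 lie somewhat above the linear resonance.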

FIGURE 9 Frequency response curves (resonance curves) for a bubble in water with a radius at rest of Rn = 10 µm for different sound pressure amplitudes pA of 0.4, 0.5, 0.6, 0.7, and 0.8 bar. (From Lauterborn, W. (1976). J. Acoust. Soc. Am. 59, 283.)

VII. OTHER SYSTEMS

Are there other systems in acoustics with chaotic dynamics? The answer is surely yes, although the subtleties of chaotic dynamics make it difficult to locate them easily.

When looking for chaotic acoustic systems, the question arises as to what ingredients an oscillatory system, such as an acoustic one, must possess to be susceptible to chaos. The full answer is not yet known, but some understanding is emerging. A necessary, but unfortunately not sufficient, ingredient is nonlinearity. Next, period doubling is known to be a precursor of chaos. It is a peculiar fact that, when one period doubling has occurred, another one is likely to appear, and indeed a whole series with slight alterations of parameters. Further, the appearance of oscillations when a parameter is altered points to an intrinsic instability of a system and thus to the possibility of its becoming a chaotic one. Altogether, two distinct classes can be formulated: (1) periodically driven passive nonlinear systems (oscillators) and (2) self-excited systems (oscillators). Passive means that in the absence of any external driving the system stays at rest as, for instance, a pendulum does. But a pendulum has the potential to oscillate chaotically when being driven periodically, for instance by a sinusoidally varying torque. This is easily shown experimentally by the repeated period doubling that soon appears at higher periodic driving. Self-excited systems develop sustained oscillations from seemingly constant exterior conditions. One example is Rayleigh–Bénard convection, where a liquid layer is heated from below in a gravitational field. The system goes chaotic at a high enough temperature difference between the bottom and surface of the liquid layer. Self-excited systems may also be driven, giving an important subclass of this type. The simplest model in this class is the driven van der Pol oscillator. A real physical system of this category is the weather (the atmosphere). It is periodically driven by solar radiation with a period of 24 hr, and it is a self-excited system, as even constant heating by the sun may lead to Rayleigh–Bénard convection as observed on a faster time scale.

FIGURE 10 A numerically calculated strange bubble attractor (Ps = 300 kPa, ν = 600 kHz). (Courtesy of U. Parlitz.)

The first reported period-doubled oscillation from a periodically driven passive system dates back to Faraday in 1831. Starting with the investigation of sound-emitting, vibrating surfaces with the help of Chladni figures, Faraday used water instead of sand, which amounts to vibrating a layer of liquid vertically. He was very astonished by the result: regular spatial patterns of different kinds appeared and, above all, these patterns were oscillating at half the frequency of the vertical motion of the plate. Photography had not yet been invented to catch the motion, but Faraday may well have seen chaotic motion without knowing it. It is interesting to note that there is a connection to the oscillation of bubbles as considered before. Besides purely spherical oscillations, bubbles are susceptible to surface oscillations, as are drops of liquid. The Faraday case of a vibrating flat surface of a liquid may be considered as the limiting case of either a bubble of larger and larger size or a drop of larger and larger size, when the surface is bent around up or down. Today, the Faraday patterns and Faraday oscillations can be observed better, albeit still with difficulties, as it is a three-dimensional (space), nonlinear, dynamical (time) system; that is, it requires three space coordinates and one time coordinate to be followed. This is at the border of present-day technology both numerically and experimentally. The latest measurements have singled out mode competition as the mechanism underlying the complex dynamics. Figure 11 gives two examples of oscillatory patterns: a periodic hexagonal structure (Fig. 11a) and its dissolution on the way to chaotic motion (Fig. 11b) at the higher vertical driving oscillation amplitude of a thin liquid layer.

FIGURE 11 Two patterns appearing on the surface of a liquid layer vibrated vertically in a cylindrical container: (a) regular hexagonal pattern at low amplitude, and (b) pattern when approaching chaotic vibration. (Courtesy of Ch. Merkwirth.)

The other class of self-excited systems in acoustics is quite large. It comprises (1) musical instruments, (2) thermoacoustic oscillators as used today for cooling with sound waves, and (3) speech production via the vocal folds. Period doubling could be observed in most of these systems; however, very few investigations have been done so far concerning their chaotic properties.

VIII. PHILOSOPHICAL IMPLICATIONS

The results of chaos physics have shed new light on the relation between determinism and predictability and on how seemingly random (irregular) motion is produced. It has been found that deterministic laws do not imply predictability. The reason is that there are deterministic laws which persistently show a sensitive dependence on initial conditions. This means that after a finite, mostly short time every significant digit of a measurement has been lost, and another measurement after that time yields a value that appears to come from a random process. Chaos physics has thus shown how random (seemingly random, one must say) motion is produced out of determinism and has developed convincing methods (some of them exemplified in the preceding sections on acoustic chaos) to classify such motion. Random motion is thereby replaced by chaotic motion. Chaos physics suggests that one should not resort too quickly to statistical methods when faced with irregular data but instead should try a deterministic approach. Thus, chaos physics has sharpened our view considerably on how nature operates.
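The loss of significant digits can be put into a rough formula. The estimate below is a standard back-of-the-envelope argument rather than a result from this article; the symbols δ0 (initial measurement uncertainty), Δ (largest tolerable error), and t_p (prediction horizon) are introduced here for illustration, while λ1 is the largest Lyapunov exponent of Section IV.

```latex
\delta(t) \approx \delta_0 \, e^{\lambda_1 t}
\quad\Longrightarrow\quad
t_p \approx \frac{1}{\lambda_1} \ln\frac{\Delta}{\delta_0}
```

Because the horizon grows only logarithmically with the measurement precision, improving the initial measurement by a factor of 1000 extends t_p by only about 6.9/λ1.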

But, as always in physics, when progress has been madeon one problem other problems pile up. Quantum mechan-ics is thought to be the correct theory to describe nature.It contains “true” randomness. But, what then about therelationship between classical deterministic physics andquantum mechanics? Chaos physics has revived interestin these questions and formulated new specific ones, for in-stance, on how chaotic motion crosses the border to quan-tum mechanics. What is the quantum mechanical equiva-lent to sensitive dependence on initial conditions?

The exploration of chaos physics, including its relation to quantum mechanics, is therefore thought to be one of the big scientific enterprises of the new century. It is hoped that acoustic chaos will accompany this enterprise further as an experimental testing ground.

SEE ALSO THE FOLLOWING ARTICLES

ACOUSTICAL MEASUREMENT • CHAOS • FOURIER SERIES • FRACTALS • QUANTUM MECHANICS

BIBLIOGRAPHY

Lauterborn, W., and Holzfuss, J. (1991). “Acoustic chaos.” Int. J. Bifur-cation and Chaos 1, 13–26.

Lauterborn, W., and Parlitz, U. (1988). Methods of chaos physics andtheir application to acoustics. J. Acoust. Soc. Am. 84, 1975–1993.

Parlitz, U., Englisch, V., Scheffezyk, C., and Lauterborn, W. (1990).“Bifurcation structure of bubble oscillators.” J. Acoust. Soc. Am. 88,1061–1077.

Ruelle, D. (1991). “Chance and Chaos,” Princeton Univ. Press, Princeton,NJ.

Schuster, H. G. (1995). “Deterministic Chaos: An Introduction,” Wiley-VCH, Weinheim.


Encyclopedia of Physical Science and Technology EN001-08 May 25, 2001 16:4

Acoustical Measurement

Allan J. Zuckerwar

NASA Langley Research Center

I. Instruments for Measuring the Properties of Sound

II. Instruments for Processing Acoustical Data

III. Examples of Acoustical Measurements

GLOSSARY

Anechoic Having no reflections or echoes.

Audio Pertaining to sound within the frequency range of human hearing, nominally 20 Hz to 20 kHz.

Coupler Small leak-tight enclosure into which acoustic devices are inserted for the purpose of calibration, measurement, or testing.

Diffuse field Region of uniform acoustic energy density.

Free field Region where sound propagation is unaffected by boundaries.

Harmonic Pertaining to a pure tone, that is, a sinusoidal wave at a single frequency: an integral multiple of a fundamental tone.

Infrasonic Pertaining to sound at frequencies below the limit of human hearing, nominally 20 Hz.

Reverberant Highly reflecting.

Ultrasonic Pertaining to sound at frequencies above the limit of human hearing, nominally 20 kHz.

A SOUND WAVE propagating through a medium produces deviations in pressure and density about their mean or static values. The deviation in pressure is called the acoustic or sound pressure, which has standard international (SI) units of pascal (Pa) or newton per square meter (N/m²). Because of the vast range of amplitude covered in acoustic measurements, the sound pressure is conveniently represented on a logarithmic scale as the sound pressure level (SPL). The SPL unit is the decibel (dB), defined as

$$\mathrm{SPL\,(dB)} = 20 \log_{10}(p/p_0)$$

in which $p$ is the root mean square (rms) sound pressure amplitude and $p_0$ the reference pressure of 20 × 10⁻⁶ Pa. The equivalent SPLs of some common units are the following:

Unit                 SPL (dB)
pascal (Pa)           93.98
psi (lb/in.²)        170.75
atmosphere (atm)     194.09
torr (mm Hg)         136.48
bar                  193.98
dyne/cm²              73.98
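The SPL definition can be checked numerically against the unit equivalents listed above. The following is a minimal Python sketch, not part of the original article; the function name `spl_db` and the unit-to-pascal conversion factors are assumptions introduced here (the factors are standard values, not quoted from the text).

```python
import math

P_REF = 20e-6  # reference pressure p0 in Pa, as given in the text

def spl_db(p_rms_pa: float) -> float:
    """Sound pressure level in dB re 20 micropascal for an rms pressure in Pa."""
    return 20.0 * math.log10(p_rms_pa / P_REF)

# Standard conversion factors to pascals (assumed here, not from the article).
UNITS_IN_PA = {
    "pascal": 1.0,
    "psi": 6894.757,
    "atmosphere": 101325.0,
    "torr": 133.322,
    "bar": 1.0e5,
    "dyne/cm^2": 0.1,
}

for name, pascals in UNITS_IN_PA.items():
    print(f"{name:12s} {spl_db(pascals):7.2f} dB")
```

Each printed level should agree with the tabulated equivalent to within rounding in the second decimal place.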

The levels of some familiar sound sources and environments are listed in Table I.

The displacement per unit time of a fluid particle due to the sound wave, superimposed on that due to its thermal motion, is called the acoustic particle velocity, in units of meters per second. Determination of the sound pressure and acoustic particle velocity at every point completely specifies an acoustic field, just as the voltages and currents completely specify an electrical network. Thus, acoustical instrumentation serves to measure one of these quantities or both. Since in most cases the relationship between


TABLE I Representative Sound Pressure Levels of Familiar Sound Sources and Environments

Source or environment                           Level (dB)

Concentrated sources: re 1 m
  Four-jet airliner                             155
  Pipe organ, loudest                           125
  Auto horn, loud                               115
  Power lawnmower                               100
  Conversation                                   60
  Whisper                                        20

Diffuse environments
  Concert hall, loud orchestra                  105
  Subway interior                                95
  Street corner, average traffic                 80
  Business office                                60
  Library                                        40
  Bedroom at night                               30

Threshold levels
  Of pain                                       130
  Of hearing impairment, continuous exposure     90
  Of hearing                                      0
  Of detection, good microphone                  −2

sound pressure and particle velocity is known, it is sufficient to measure only one quantity, usually the sound pressure. The scope of this article is to describe instrumentation for measuring the properties of sound in fluids, primarily in air and water, and in the audio (20 Hz–20 kHz) and infrasonic (<20 Hz) frequency ranges. Although many instrumentation techniques conform to national and international standards, the standards are not cited in the text, but a selected list is given in the bibliography.

I. INSTRUMENTS FOR MEASURING THE PROPERTIES OF SOUND

A. Measurement of Sound Pressure

A device that senses a sound pressure in a gas and provides a proportional electrical output voltage is a microphone. Functionally, microphones fall into two categories: entertainment or broadcasting microphones and measurement microphones. Entertainment microphones, comprising mainly electret, ribbon, and moving-coil microphones, conform to the requirements of speech and music and have preferred directionality. Measurement microphones, on the other hand, may have capabilities that extend well beyond these requirements, both in frequency response and in dynamic range, and are for the most part omnidirectional. Nevertheless, the distinguishing mark of a measurement microphone is consistent performance in the face of prolonged service, exposure to various environmental conditions, and the passage of time. Recalibration often yields microphone sensitivities that are repeatable to within tenths of a decibel, a feature not required in entertainment microphones. The common types of measurement microphone are air condenser, electret condenser, ceramic, and piezoresistive microphones.

1. Air Condenser Microphone

Beginning with the successful operational unit reported by Wente in 1917, the evolution of condenser microphone design culminated in the celebrated Western Electric model 640 AA in 1948, which serves as the prototype of modern air condenser microphones. Because its operation depends on a purely geometric effect, the air condenser microphone remains the most stable and most widely used measurement microphone today. Its basic construction is shown in Fig. 1. An incident sound pressure $p$ excites motion of the membrane, changing the capacitance between the membrane and backplate and producing a proportional output voltage. The mechanical and electrical functions will be described separately in that order.

As the membrane vibrates, it compresses and expands the air layer in the gap and creates a reaction pressure, which opposes motion of the membrane. The reaction pressure is partially relieved by the flow of air through the openings in the backplate, and these determine the damping of the membrane–air layer system. The backplate may contain one or more “rings” of holes and nearly always a slot around its periphery. The flow of air through the openings depends on the pressure difference across them. Because the pressure at any one opening depends on the pressures at all the other openings, both in the gap and in the backchamber, the pressures at the openings are coupled together at both locations and their analysis is subject to a very complicated boundary-value problem. However, through simplifying assumptions the problem can be solved approximately but accurately, taking into account the specifics of the backplate configuration. The solution yields a mechanical sensitivity $M_m$ (m/Pa) of the form:

FIGURE 1 Basic construction of an air condenser microphone.


$$M_m \equiv \frac{d}{p} = \frac{1}{T K^2} \cdot \frac{J_2(Ka)}{J_0(Ka) + D} \tag{1}$$

where $d$ is the mean membrane displacement from equilibrium (m), $p$ the incident sound pressure (N/m²), $K$ the wave number for sound propagation in the membrane (m⁻¹), $a$ the membrane radius (m) and $T$ its tension (N/m), and the $J$'s are the Bessel functions of the first kind. The complex term $D$, which accounts for the effect of the reaction pressure in the gap, can be expressed explicitly in terms of the air layer and backplate parameters, but such a derivation is beyond the scope of this article. The membrane wave number is given by:

$$K = 2\pi f \left(\rho_M t_M / T\right)^{1/2} \tag{2}$$

where $\rho_M$ is the membrane density (kg/m³), $t_M$ the membrane thickness (m), and $f$ the acoustic frequency (Hz). The upper cutoff frequency of the microphone lies close to the undamped fundamental resonant frequency of the membrane:

$$f_R = \left(\frac{T}{6.825\,\rho_M t_M a^2}\right)^{1/2} \tag{3}$$

At frequencies well below $f_R$, that is, over the normal operating range of the microphone, $Ka \ll 1$ and the Bessel functions in Eq. (1) can be represented by their leading terms. The microphone can be represented by the equivalent lumped elements, the first four terms in the expansion of Eq. (1), shown in Fig. 2a. Its acoustic impedance (Section I.E) is

FIGURE 2a (a) Equivalent lumped element representation of a condenser microphone. (b) As a transmitter with the mechanical elements referred to the electrical side. (c) As a receiver with the electrical elements referred to the mechanical side. The transformation ratio φ is equal to the static charge on the backplate (or membrane) divided by the volume of the gap and microphone admittance.

FIGURE 2b Polarization circuit of an air condenser microphone, consisting of (I) the condenser microphone, (II) the charging network, and (III) the input elements of the preamplifier.

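$$Z_m \equiv \frac{p}{U} = \frac{p}{j\omega d\,\pi a^2} = \frac{1}{j\omega M_m \pi a^2} = j\omega M + \frac{1}{j\omega}\left(\frac{1}{C_M} + \frac{1}{C_A}\right) + R_A \tag{4}$$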

where $\omega$ is the angular frequency $= 2\pi f$ and $U$ the volume velocity of the incident sound. In terms of the microphone parameters, the membrane mass $M$, membrane compliance $C_M$, air layer compliance $C_A$, and air layer resistance $R_A$ are

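$$M = \frac{4}{3}\,\frac{\rho_M t_M}{\pi a^2} \tag{5}$$

$$C_M = \frac{(\pi a^2)^2}{8\pi T} \tag{6}$$

$$C_A = \frac{(\pi a^2)^2}{8\pi T D'} \tag{7}$$

$$R_A = \frac{8\pi T D''}{\omega (\pi a^2)^2} \tag{8}$$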

in which $D'$ and $D''$ are the real and imaginary parts of $D$. Typical values for the 1/2- and 1-in. microphones are listed in Table II.
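The lumped-element impedance of Eq. (4) can be evaluated directly from the representative values in Table II. The Python sketch below is illustrative only; the function names are hypothetical, the element values are treated as frequency independent, and the resonance is taken as the frequency at which the reactive part of Eq. (4) vanishes, which is a simplification of the damped membrane behavior described in the text.

```python
import math

# Representative lumped elements from Table II, converted to SI units:
# M in kg/m^4, C_M and C_A in m^5/N, R_A in N*s/m^5.
TABLE_II = {
    "1/2-in.": {"M": 950.0, "C_M": 5e-14, "C_A": 90e-14, "R_A": 15e7},
    "1-in.": {"M": 300.0, "C_M": 100e-14, "C_A": 500e-14, "R_A": 2e7},
}

def acoustic_impedance(f_hz, M, C_M, C_A, R_A):
    """Eq. (4): Z_m = jwM + (1/jw)(1/C_M + 1/C_A) + R_A."""
    w = 2.0 * math.pi * f_hz
    return 1j * w * M + (1.0 / C_M + 1.0 / C_A) / (1j * w) + R_A

def lumped_resonance_hz(M, C_M, C_A, **_unused):
    """Frequency at which the mass and stiffness terms of Eq. (4) cancel."""
    stiffness = 1.0 / C_M + 1.0 / C_A
    return math.sqrt(stiffness / M) / (2.0 * math.pi)

for size, elements in TABLE_II.items():
    f0 = lumped_resonance_hz(**elements)
    z = acoustic_impedance(1000.0, **elements)
    print(f"{size}: resonance near {f0:,.0f} Hz, |Z_m| at 1 kHz = {abs(z):.3g} N*s/m^5")
```

The resonances obtained this way fall in the same range as the resonant frequencies quoted for the corresponding microphone sizes in Table III, which is as much agreement as representative values can be expected to give.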

Most microphones are designed such that the low-frequency membrane displacement is controlled by the membrane compliance. In this case, the mechanical sensitivity can be approximated by

$$M_m \approx \frac{a^2}{8T} \tag{9}$$

The membrane motion can be used to provide a proportional voltage by the arrangement shown in Fig. 2b. The electrical circuit is divided into three sections, representing (I) the microphone, having a time-varying membrane–backplate capacitance $C$, and stray capacitance $C_s$; (II) a

TABLE II Representative Values of the Lumped Elements of 1/2- and 1-in. Condenser Microphones at Midband Frequencies and 1 atm Pressure

Lumped element    1/2-in.    1-in.    Units
M                 950        300      kg/m⁴
C_M               5          100      × 10⁻¹⁴ m⁵/N
C_A               90         500      × 10⁻¹⁴ m⁵/N
R_A               15         2        × 10⁷ N·sec/m⁵


charging network, consisting of a polarization voltage $V_0$ and charging resistance $R_c$; and (III) the input resistance $R_i$ and capacitance $C_i$ of a preamplifier (once called a cathode follower). The preamplifier is placed as close to the microphone cartridge as possible in order to minimize the input capacitance $C_i$. A blocking capacitor before the preamplifier is not shown. The polarization voltage source maintains a static charge on the capacitor. At frequencies above a certain lower limiting frequency, dependent on the charging time constant of the circuit (see Section I.A.9), the electrical sensitivity $M_e$ can be approximated by:

$$M_e \equiv \frac{\upsilon}{d} = \frac{1}{d}\,\frac{\Delta C}{C_E}\,V_0 = \frac{V_0}{d_0} \tag{10}$$

where $\Delta C$ is the variation in $C$, $C_E = C + C_s + C_i$, and $d_0$ is the static gap between the membrane and backplate. The overall microphone sensitivity is the product of Eqs. (9) and (10):

$$M_p = M_m M_e \approx \frac{a^2 V_0}{8 T d_0} \tag{11}$$

Despite its approximate nature, Eq. (11) illustrates, together with Eq. (3), the effect of design parameters on microphone performance. A high sensitivity is favored by a large membrane radius, high polarization voltage, low membrane tension, and small membrane–backplate gap. A high-frequency response is favored by a high membrane tension and small membrane density, thickness, and radius. Choice of the ratio $a^2/T$, which plays conflicting roles regarding sensitivity and frequency response, requires a design compromise. Generally, the tension is made as high as practical and the frequency response is controlled by the membrane radius $a$.

A good membrane material has high tensile strength in order to maintain high tension, high ductility so that it can be stretched tightly without cracking or wrinkling, and good resistance to corrosion. Some suitable materials are fine-grained nickel, titanium, and certain grades of stainless steel. Typical values of thickness and tension are $t_M$ = 5 µm and $T$ = 2000–4000 N/m, corresponding to a tensile stress of 4–8 × 10⁸ N/m². In some microphones, the tension can be adjusted by means of a tightening ring, controlled by the turn of a screw after the membrane is clamped or welded in place. In practice, the tension is adjusted beyond the design value and then reduced by heat treatment for additional stability. The membrane is exceedingly delicate and usually covered with a protective grid.

The polarization voltage and static gap are determined by limitations on the electric field in the gap and by practical electronic considerations. Typical values are $V_0$ = 200 V and $d_0$ = 20 µm. Because the polarization voltage reduces $d_0$ due to electrostatic attraction, considerations of electrical and mechanical stability may be important. In battery-operated units, a polarization voltage of $V_0$ = 28 V is commonly used. The electrical resistance $R_E = R_c \parallel R_i$ is of the order of gigaohms, corresponding to a charging time constant of several tenths of a second.
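The design compromise between sensitivity and frequency response can be made concrete with a short calculation. The Python sketch below combines the low-frequency sensitivity $a^2 V_0 / 8 T d_0$ of Eq. (11) with the undamped membrane resonance, computed here from the condition that $Ka$ equals the first zero of $J_0$ (2.405). The tension, thickness, polarization voltage, and gap are the typical values quoted in the text; the membrane radii and the nickel density of 8850 kg/m³ are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

J0_FIRST_ZERO = 2.405  # first zero of the Bessel function J0

def resonance_hz(radius_m, tension, density, thickness):
    """Undamped fundamental resonance of a stretched circular membrane."""
    surface_density = density * thickness           # kg/m^2
    wave_speed = math.sqrt(tension / surface_density)  # m/s in the membrane
    return J0_FIRST_ZERO * wave_speed / (2.0 * math.pi * radius_m)

def sensitivity_v_per_pa(radius_m, tension, v0, gap_m):
    """Low-frequency open-circuit sensitivity, Eq. (11)."""
    return radius_m ** 2 * v0 / (8.0 * tension * gap_m)

TENSION = 3000.0    # N/m, middle of the quoted 2000-4000 N/m range
THICKNESS = 5e-6    # m
V0 = 200.0          # V
GAP = 20e-6         # m
DENSITY = 8850.0    # kg/m^3, assumed nickel membrane

for radius in (8.9e-3, 4.45e-3):  # assumed radii, roughly 1-in. and 1/2-in. class
    s = sensitivity_v_per_pa(radius, TENSION, V0, GAP)
    f_r = resonance_hz(radius, TENSION, DENSITY, THICKNESS)
    print(f"a = {radius * 1e3:.2f} mm: {s * 1e3:.1f} mV/Pa, f_R = {f_r:,.0f} Hz")
```

Halving the radius at fixed tension quarters the sensitivity and doubles the resonant frequency, which is the trade-off described above.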

2. Electret Condenser Microphone

An electret material possesses a permanent electrical dipole moment. When used in a condenser microphone, it provides the polarization voltage between the membrane and backplate in place of the external supply voltage. Electret materials are generally high-resistivity polymers, a prime example being PTFE Teflon. They are fabricated by heating a film of the material almost to its melting point and subjecting it to an intense electric field. The net dipole moment results from either rotation of permanent dipoles in polar materials or from migration of free charge carriers. In either case, when the material is cooled to room temperature the net dipole moment is “frozen-in.”

A typical construction is shown in Fig. 3. Here the electret is bonded to the backplate. This has an advantage over arrangements where the electret is bonded to the membrane because electret materials are not very suitable for performing the mechanical function of a membrane. The lower surface makes electrical contact with the backplate and thus is at the same potential as the membrane (a metal). The voltage $V_0$ at the upper surface and across the gap is

$$V_0 = \frac{\sigma t}{\varepsilon \varepsilon_0} \tag{12}$$

where $\sigma$ is the surface charge density (C/m²), $t$ the foil thickness (m), $\varepsilon$ the dielectric constant, and $\varepsilon_0$ the dielectric permittivity of free space (8.85 × 10⁻¹² F/m, or farad per meter). Typical values of $\sigma$ = 10⁻⁴ C/m², $t$ = 2 × 10⁻⁵ m, and $\varepsilon$ = 2 lead to a voltage $V_0$ ≈ 100 V, which is comparable to the polarization voltages used in air condenser microphones.
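The arithmetic behind the quoted voltage can be reproduced in a few lines. This is a minimal Python sketch using only the typical values stated above; the function name `electret_voltage` is hypothetical.

```python
EPS0 = 8.85e-12  # F/m, permittivity of free space as quoted in the text

def electret_voltage(sigma_c_per_m2, thickness_m, dielectric_constant):
    """Eq. (12): voltage across the gap produced by the electret foil."""
    return sigma_c_per_m2 * thickness_m / (dielectric_constant * EPS0)

v0 = electret_voltage(1e-4, 2e-5, 2.0)
print(f"V0 = {v0:.0f} V")
```

The result is slightly above 100 V, consistent with the rounded figure in the text.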

The capacity of an electret material to retain its surface charge is highly temperature dependent. In one test, an electret microphone stored at 50°C and 95% relative humidity lost sensitivity at a rate of ∼1 dB/year. Under normal ambient conditions an electret microphone can be expected to retain its initial sensitivity for many years.

FIGURE 3 Typical construction of an air electret microphone. Sometimes the electret is bonded to the membrane.


The acoustic performance of the “backelectret” microphone is not much different from that of the air condenser microphone. The elimination of the external polarization voltage supply, however, has a significant advantage. The generation of a high dc voltage and the extensive filtering needed to obtain a low noise floor, ripple, and hum require bulky components (except for battery-operated equipment). The absence of this requirement greatly enhances the miniaturization potential of electret-based instrumentation.

3. Ceramic Microphone

A ceramic microphone utilizes the piezoelectric effect, the generation of a surface charge density as the result of an applied stress. Traditionally, the sensing element is constructed of a piezoelectric ceramic, such as lead zirconate titanate (PZT).

The great rigidity of piezoelectric ceramics makes their fabrication in the form of a membrane impractical: rather, such elements would operate as vibrating plates, for which the microphone compliance is controlled not by static tension but by the elastic modulus. This would lead to two fundamental difficulties. First, the plate compliance would be too small to permit a reasonable mechanical sensitivity. Second, the sharp mechanical resonances of ceramic materials make them difficult to dampen.

An arrangement to circumvent these difficulties is shown in Fig. 4. The sound pressure on the membrane is transmitted to the ceramic element through a connecting rod. The sole purpose of the backplate is to provide the required mechanical damping of the membrane. Ceramic microphones are characterized by ruggedness, low cost, and simple electronics. However, there has been a trend to replace them with low-cost electrets.

4. Piezoresistive Microphone

The piezoresistive microphone exploits the physical effect known as “piezoresistivity,” the dependence of electrical resistivity upon mechanical stress or strain. The

FIGURE 4 Basic construction of a ceramic microphone. The ceramic is usually a bimorph element, that is, two crystals sandwiched together to form a single assembly.

FIGURE 5 Basic construction of a piezoresistive microphone. Dopant is diffused or ion-implanted to form the piezoresistors.

microphone membrane is a thin, micromachined silicon wafer on which dopant is diffused or implanted to form the resistors of a Wheatstone bridge, as shown in Fig. 5. Acoustical excitation deflects the membrane to generate a time-varying stress in the strategically positioned resistors. A proportional output voltage appears at the output of the bridge. Advantages include small size and low output impedance, making possible remote-control electronics and thus installation in limited confines. Representative specifications are given in Table III. Suited for high acoustic sound pressures, it is a favorite for wind tunnel and aerospace testing. A disadvantage is the temperature sensitivity of the resistors, requiring rather sophisticated compensation techniques.

5. Microphone Specifications

A microphone user examining a specification sheet will usually find the items listed in Table III. It is instructive to examine their meanings and implications. Specifications regarding environmental conditions will be considered later.

The nominal size refers to the outside diameter of the cartridge, which is slightly larger than the active membrane diameter.

The open-circuit sensitivity is the output voltage per unit sound pressure without the loading of the preamplifier input impedance. Sometimes this specification is given in decibels relative to a sensitivity of 1 V/Pa. For example, an open-circuit sensitivity of −40 dB is equivalent to 10 mV/Pa.
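The conversion between the two ways of quoting open-circuit sensitivity is a direct application of the decibel definition. The Python sketch below is illustrative; the function names are hypothetical.

```python
import math

def db_re_1v_per_pa_to_mv_per_pa(level_db: float) -> float:
    """Convert a sensitivity level in dB re 1 V/Pa to mV/Pa."""
    return 1000.0 * 10.0 ** (level_db / 20.0)

def mv_per_pa_to_db_re_1v_per_pa(sens_mv_per_pa: float) -> float:
    """Convert a sensitivity in mV/Pa to dB re 1 V/Pa."""
    return 20.0 * math.log10(sens_mv_per_pa / 1000.0)

print(db_re_1v_per_pa_to_mv_per_pa(-40.0))   # the example from the text
print(mv_per_pa_to_db_re_1v_per_pa(50.0))    # a 50 mV/Pa microphone
```

The first call reproduces the −40 dB example from the text; the second shows that a sensitivity of 50 mV/Pa corresponds to about −26 dB re 1 V/Pa.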

In a strict sense, the membrane resonant frequency is the fundamental resonant frequency of the membrane in


TABLE III Representative Microphone Specifications

Specification                       Air condenser   Air condenser   Air condenser   Electret    Ceramic     Piezoresistive
Nominal size (in.)                  1               1/2             1/4             1/2         1           0.092
Open circuit sensitivity (mV/Pa)    50              15              2               10          10          0.025
Resonant frequency (Hz)             8000            25,000          75,000          14,000      —           70,000
Frequency range, −2 dB (Hz)         2–7000          4–20,000        8–70,000        4–20,000    2–12,000    0–20,000
Dynamic range (dB)                  15–145          25–160          35–170          30–145      25–150      80–190
Polarized capacitance (pF)          60              20              6               30          400         —
Equivalent air volume (cm³)         0.15            0.01            0.0005          0.015       0.5         —

vacuum. In practical terms, it is the frequency at which the membrane displacement in air lags the applied pressure by 90°, which is less than the vacuum resonant frequency because of damping by the air layer.

At the upper cutoff frequency, the microphone sensitivity falls 2 dB, sometimes specified as 3 dB, on the high-frequency side of the damped membrane response. The specification is shown for pressure microphones and is different for free-field microphones (see Section I.A.6). The lower cutoff frequency is discussed in Section I.A.9.

The dynamic range is the range of sound pressure amplitudes over which the microphone operates linearly, usually specified in decibels. The upper limit is determined by total harmonic distortion, typically taken at 3 or 4%. In a system using a polarization voltage, the harmonic distortion is of electrical origin and not mechanical, that is, not due to nonlinear membrane displacement. This may not be true for systems using carrier electronics (see Section I.A.9). For a given sound pressure the harmonic distortion is less for small microphones because of lower output voltage. The lower limit of the dynamic range is determined by the noise floor, the rms output voltage over a specified frequency range (usually the “A-weighted” band, Section II.A.3) in the absence of sound. Since acoustic applications generally require a signal-to-noise ratio of better than 1 : 1 (0 dB), the lower limit of the dynamic range is usually specified as 5 dB above the noise floor. The two most important sources of noise are (1) Brownian motion of air molecules impinging on the membrane, primarily on the air layer side; and (2) noise generated in the preamplifier. The availability of high-quality field-effect transistors has reduced the preamplifier noise to a secondary role. Today it is possible to produce 1-in. condenser microphone systems having a noise floor several decibels below the threshold of hearing (0 dB) over the audible range of frequencies. It is interesting that the membrane displacement of a 1-in. condenser microphone having a 200-V polarization voltage and responding to a sound pressure of 0 dB is 10⁻¹³ m.

The polarized capacitance is the membrane–backplate capacitance with the polarization voltage applied. This is an important parameter in the determination of sensitivity and low-frequency response. A high capacitance helps reduce the noise floor.

Below the membrane resonant frequency the membrane acoustic impedance appears compliant, that is, dominated by $C_M$ and $C_A$ in Fig. 2. The equivalent air volume $V_e$ is the volume of an enclosure that would present the same acoustic impedance as the membrane,

$$V_e = \frac{\gamma P_0}{j\omega Z_m} = \gamma P_0 C_M \tag{13}$$

where $\gamma$ is the specific heat ratio for air (=1.4), $P_0$ the ambient pressure, and $Z_m$ the acoustic impedance of the microphone (see Fig. 2). If a microphone cartridge is inserted into a coupler, as for calibration purposes, then the equivalent volume must be added to the volume of the coupler.
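As a rough consistency check, Eq. (13) can be evaluated with the membrane compliances of Table II and compared with the equivalent air volumes of Table III. The Python sketch below assumes an ambient pressure of 10⁵ N/m² and uses a hypothetical function name; the two tables give representative values, so only approximate agreement should be expected.

```python
GAMMA = 1.4   # specific heat ratio for air
P0 = 1.0e5    # N/m^2, ambient pressure (assumed, approximately 1 atm)

def equivalent_volume_cm3(membrane_compliance_m5_per_n):
    """Eq. (13): V_e = gamma * P0 * C_M, converted from m^3 to cm^3."""
    return GAMMA * P0 * membrane_compliance_m5_per_n * 1e6

# Membrane compliances C_M from Table II (m^5/N).
for size, c_m in (("1/2-in.", 5e-14), ("1-in.", 100e-14)):
    print(f"{size}: V_e = {equivalent_volume_cm3(c_m):.3g} cm^3")
```

The computed volumes are of the same order as the 0.01 and 0.15 cm³ entries listed in Table III for the 1/2- and 1-in. air condenser microphones.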

6. Directional Properties

A microphone will not disturb a sound field if its dimensions are much smaller than the wavelength of the incident sound. For this reason the preamplifier (or adapter) is made as compact as possible, having a diameter not exceeding that of the microphone cartridge. Since the pressure distribution is uniform over the membrane area, the response of a microphone to this type of excitation is called the pressure response. An example is shown in Fig. 6. In this case, the mechanical damping of the microphone is designed for maximum flatness, corresponding to a mechanical quality factor of ∼0.7, and the microphone is called a pressure microphone. Most calibration methods provide a uniform pressure distribution and thus yield the pressure response.

As the wavelength in a free sound field approaches the dimensions of the microphone, reflection and diffraction cause considerable changes in the pressure distribution about the microphone. Figure 7 shows the mean pressure over the membrane surface versus frequency for different angles of incidence. The response of the microphone


FIGURE 6 Pressure response of a 1/2-in. air condenser microphone. At the resonant frequency $f_R$, the membrane displacement lags the incident pressure by 90°.

under this condition is called the free-field response. At normal incidence (φ = 0°), the free-field effects are greatest; at grazing incidence (φ = 90°), the pressure distribution is about the same as for the uniform pressure condition. The random incidence response curve can be regarded as the mean response when incidence from all directions is equally probable. A set of curves as shown in Fig. 7 can be used to correct the free-field response to an equivalent pressure response. A free-field microphone is intentionally made overdamped to yield the flattest frequency response for normal incidence. As a result, the upper cutoff frequency far exceeds that of the pressure response, sometimes by as much as a factor of 2. The upper cutoff frequencies shown in Table III are for pressure microphones.

7. Microphone Calibration

The three most widely used techniques for microphone calibration are the electrostatic actuator, the pistonphone, and the reciprocity procedure.

FIGURE 7 Free-field response of a 1/2-in. air condenser microphone. (Redrawn with permission, courtesy of Bruel & Kjaer Instruments, Inc., Marlborough, MA.)

The electrostatic actuator is a flat metallic electrode positioned at a nominal distance $d_1$ from the microphone membrane. A voltage applied between the electrode and membrane produces a uniform electrostatic pressure on the membrane. If an ac voltage $\upsilon_a$ is superimposed upon a high dc polarization voltage $V_a$, then the applied ac voltage and resulting electrostatic pressure have the same frequency. The membrane responds to the electrostatic pressure as it would to a sound pressure. The electrode is slotted to relieve acoustic loading between the electrode and membrane. The electrostatic pressure exciting the membrane is

$$p = \frac{\varepsilon_0 V_a \upsilon_a}{d_1^2} \tag{14}$$

With typical values $V_a$ = 800 V, $\upsilon_a$ = 30 V rms, and $d_1$ = 0.0005 m, the rms pressure is 0.85 Pa, or 93 dB. The technique is excellent for obtaining frequency response but is not suitable for absolute calibration because of a twofold uncertainty in the distance $d_1$. First, the slots in the electrode necessitate a theoretically derived correction and, second, the polarization voltage shifts the equilibrium position of the membrane toward the actuator electrode.

The essential parts of a pistonphone are a coupler, into which the microphone cartridge is inserted and sealed, usually with an O-ring, and a vibrating piston of known displacement. The piston may be driven by a cam having a sinusoidal contour, generating a pure tone, or by a crankshaft, which in addition produces considerable second-harmonic distortion. The frequency is controlled through the speed of the cam or crankshaft. The pressure generated in the coupler is

$$p = \frac{\gamma P_0 S_p d}{V} \tag{15}$$

where $S_p$ is the piston area, $d$ the rms stroke, and $V$ the volume of the coupler (including the equivalent volume of the microphone). With typical values of $\gamma$ = 1.4, $P_0$ = 10⁵ N/m² (1 atm), $S_p$ = 10⁻⁵ m², $d$ = 1.4 × 10⁻⁴ m, and $V$ = 2 × 10⁻⁵ m³, the rms pressure is 9.8 Pa, or 114 dB. At audio frequencies, a precision of a couple of tenths of a decibel is attainable. At low frequencies a correction is needed for nonadiabatic compression. In the form of a portable, battery-operated device, it is ideally suited for quick calibration of microphones in the field. Two limitations are fixed amplitude and relatively low operating frequency (several hundred hertz maximum). Devices using moving-coil drivers (without servo control) are not true pistonphones, for the generated volume velocity depends on the acoustic impedance of the load.
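The pistonphone figures can be verified from Eq. (15), and the same calculation shows why the equivalent volume of the microphone has to be included in $V$. In the Python sketch below, the typical values are those quoted in the text, with the piston area taken in square meters; the pairing with a microphone of 0.15 cm³ equivalent volume (the 1-in. entry of Table III) is an assumption made for illustration, and the function names are hypothetical.

```python
import math

GAMMA = 1.4
P0 = 1.0e5      # N/m^2
P_REF = 20e-6   # Pa

def pistonphone_pressure(piston_area_m2, rms_stroke_m, volume_m3):
    """Eq. (15): rms pressure generated in the coupler, in Pa."""
    return GAMMA * P0 * piston_area_m2 * rms_stroke_m / volume_m3

def spl_db(p_rms):
    return 20.0 * math.log10(p_rms / P_REF)

AREA = 1e-5        # m^2
STROKE = 1.4e-4    # m, rms
V_TOTAL = 2e-5     # m^3, coupler volume including the microphone
V_EQUIV = 0.15e-6  # m^3, assumed equivalent volume of the microphone

p = pistonphone_pressure(AREA, STROKE, V_TOTAL)
print(f"p = {p:.1f} Pa, SPL = {spl_db(p):.1f} dB")

# Level error incurred if the equivalent volume were left out of V.
p_wrong = pistonphone_pressure(AREA, STROKE, V_TOTAL - V_EQUIV)
print(f"error from omitting V_e: {spl_db(p_wrong) - spl_db(p):.3f} dB")
```

For a coupler of this size the omission costs only a few hundredths of a decibel, but the error grows as the coupler volume shrinks relative to the equivalent volume.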

The reciprocity procedure is based on the following electromechanical principle. When a reversible transducer is operated as a receiver, the ratio of open-circuit output voltage $\upsilon$ to applied acoustic pressure $p$ will equal some constant $A$. Then, when it is operated as a transmitter,


the ratio of the generated volume velocity $U$ to the input current $I$ will equal the same constant $A$, if the acoustic load $Z_r$ is small (see Fig. 2a.a, b). According to this procedure, three transducers are placed pairwise in an acoustic coupler: a transmitter T, a reversible transducer R, and the test microphone M. In test 1, transmitter T generates a sound pressure $P_T$ to excite receiver R, resulting in an open-circuit voltage $\upsilon_R = A_R P_T$. In test 2, test microphone M replaces transducer R to yield $\upsilon_M = A_M P_T$. These lead to the relationship:

$$A_M = A_R \frac{\upsilon_M}{\upsilon_R} \tag{16}$$

In test 3, transducer R, as transmitter, excites test microphone M, resulting in $\upsilon'_M = A_M P_R$. However, the above-stated reciprocity property, $U_R = A_R I_R$, and the known acoustic impedance of the coupler, $Z_C = P_R/U_R$, lead to the relationship:

$$A_R = \frac{U_R}{I_R} = \frac{P_R}{Z_C I_R} = \frac{\upsilon'_M}{A_M Z_C I_R} \tag{17}$$

Substitution of Eq. (17) into (16) yields:

$$A_M = \left(\frac{\upsilon_M \upsilon'_M}{Z_C \upsilon_R I_R}\right)^{1/2} \tag{18}$$

Thus, the microphone sensitivity depends only on electrical quantities, which are measurable to high precision, and a readily determinable acoustic impedance. The reciprocity method, the most precise of all known methods, can achieve absolute precisions of the order of hundredths of a decibel. A free-field variation of the procedure is similar but is beset with practical difficulties.
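The three-test procedure can be rehearsed numerically by generating the voltages that an ideal set of transducers would produce and then recovering $A_M$ from Eq. (18). The Python sketch below is a synthetic check under stated assumptions: all numerical values (sensitivities, coupler impedance, pressure, and current) are hypothetical, and only magnitudes are used, whereas in practice the coupler impedance is a complex, frequency-dependent quantity.

```python
import math

def reciprocity_sensitivity(v_m, v_m_prime, v_r, i_r, z_c):
    """Eq. (18): microphone sensitivity A_M from electrical quantities and Z_C."""
    return math.sqrt(v_m * v_m_prime / (z_c * v_r * i_r))

# Hypothetical "true" values used only to synthesize the measurements.
A_M_TRUE = 0.050   # V/Pa, test microphone M
A_R_TRUE = 0.012   # V/Pa, reversible transducer R
Z_C = 7.0e9        # N*s/m^5, magnitude of coupler acoustic impedance
P_T = 1.0          # Pa, pressure generated by transmitter T
I_R = 1.0e-6       # A, drive current into R in test 3

v_r = A_R_TRUE * P_T         # test 1: T excites R
v_m = A_M_TRUE * P_T         # test 2: T excites M
u_r = A_R_TRUE * I_R         # reciprocity: U_R = A_R * I_R
p_r = Z_C * u_r              # pressure generated by R in the coupler
v_m_prime = A_M_TRUE * p_r   # test 3: R excites M

a_m = reciprocity_sensitivity(v_m, v_m_prime, v_r, I_R, Z_C)
print(f"recovered A_M = {a_m * 1e3:.3f} mV/Pa")
```

The recovered sensitivity equals the value used to synthesize the data, and neither $P_T$ nor $A_R$ has to be known to obtain it, which is the point of the method.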

8. Microphone Performance in Harsh Environments

An increase in ambient temperature has three primary effects on condenser microphone parameters: a decrease in membrane tension, normally an increase in membrane–backplate gap, and an increase in air viscosity. The first two have compensating effects on the midband sensitivity. The last increases the membrane damping and is important only near resonance. As a result, the midband sensitivity of condenser microphones generally has a small temperature coefficient, typically <0.01 dB/°C over an interval from −50 to +60°C.

The air compliance $C_A$ (Fig. 2) is inversely proportional and the air layer resistance $R_A$ is directly proportional to the air density. Thus, a change in ambient pressure has little effect on midband sensitivity but has a strong influence on membrane damping. Typically, the pressure coefficient of the midband sensitivity is −10⁻⁵ dB/Pa.

Humidity is detrimental to condenser microphone performance primarily when it condenses and short-circuits the membrane to the backplate. Water vapor near saturation can cause arcing under an intense electric polarization field. A dehumidifier, containing a desiccant and inserted between the preamplifier and cartridge, has proved successful in keeping the interior of the cartridge dry. High relative humidities apparently do not affect the surface charge of an electret significantly. The immunity of the ceramic microphone from harmful effects of humidity is one reason for its popularity for many years.

The vibration sensitivity of a microphone depends on the direction in which the vibration is applied and is maximum when this direction is normal to the membrane surface. In this case, the equivalent acoustic pressure is

$$p = \rho_M\, t_M\, a_V \tag{19}$$

where aV is the acceleration of the applied vibration. For a nickel membrane of density ρM = 8850 kg/m³ and of thickness tM = 5 × 10⁻⁶ m, an applied acceleration aV = 9.8 m/sec² (1 g) will produce an equivalent sound pressure of 0.43 Pa, or 87 dB. A low membrane surface density ρMtM is the key to suppressing vibration sensitivity.
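As a numerical check of Eq. (19), the short Python sketch below evaluates the equivalent pressure for the nickel membrane quoted above and converts it to a sound pressure level. The function names are illustrative, and the 20 µPa reference pressure is an assumption (the conventional reference for airborne sound) rather than a value stated in the text.

```python
import math

P_REF = 20e-6  # Pa; assumed conventional reference pressure for airborne sound


def vibration_equivalent_pressure(rho_m, t_m, a_v):
    """Eq. (19): equivalent pressure (Pa) for vibration normal to the membrane."""
    return rho_m * t_m * a_v


def pressure_to_spl(p):
    """Sound pressure level in dB re P_REF."""
    return 20.0 * math.log10(p / P_REF)


# Nickel membrane from the text: 8850 kg/m^3, 5 micrometres thick, 1 g vibration.
p_eq = vibration_equivalent_pressure(8850.0, 5e-6, 9.8)
spl = pressure_to_spl(p_eq)
print(f"p = {p_eq:.2f} Pa, SPL = {spl:.0f} dB")
```

Halving the surface density ρMtM in this sketch halves the equivalent pressure, that is, lowers the vibration-induced level by about 6 dB.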

In outdoor measurements, the ambient wind will generate considerable noise in a microphone and may disturb the intended measurement. A possible countermeasure is to install a windscreen, constructed of a fibrous material, either self-supporting or supported on a wire frame about the microphone. In principle, the material displays a high flow resistance to the quasi-static pressure of the wind but low resistance to acoustic pressures. A windscreen has two contrary effects, however. Its presence creates turbulence, an effect that can be minimized by making the dimensions of the windscreen sufficiently large, and it creates an acoustic cavity with excitable modes. The windscreen is most effective at high frequencies and in a wind direction normal to the membrane. Overall, in moderate winds (<30 km/hr) a windscreen may reduce the wind noise over the audio band by as much as 10–20 dB.

Often it is necessary to make sound pressure measurements in extremely hostile or inaccessible locations, for example, in jet engine exhausts or in the ear canal. Such measurements can be realized with the aid of a probe tube, a long, thin, hard-walled tube of the general configuration shown in Fig. 8. Ideally, the sound pressure pm at the microphone, coupled to one end of the tube in a coupler cavity, will be the same as the test pressure pT at the probe tip. This will be approximately the case at long acoustic wavelengths or when the load impedance at one end matches the characteristic impedance of the tube. Otherwise, the natural tube resonances will cause an undulating frequency response. Since the characteristic tube impedance is resistive, impedance matching can be achieved by means of an acoustic damping material placed at the probe tip. The response of the probe tube for the underdamped, correctly damped, and overdamped cases is shown at the bottom of the figure.

Page 20: Encyclopedia of Physical Science and Technology - Classical Physics

P1: FVZ Revised Pages

Encyclopedia of Physical Science and Technology EN001-08 April 20, 2001 12:45

Acoustical Measurement 99

FIGURE 8 Probe tube and its response when it is underdamped, correctly damped, and overdamped. (Redrawn with permission, courtesy of Bruel & Kjaer Instruments, Inc., Marlborough, MA.)

9. Infrasonic Measurements

There are two reasons for the low-frequency rolloff of a condenser microphone using a polarization voltage: one electrical and the other mechanical. The electrical rolloff is due to the charging time constant of the circuit of Fig. 3, for which the output voltage is

$$\upsilon = \frac{C V_0}{C_E}\,\frac{j\omega R_E C_E}{1 + j\omega R_E C_E} \tag{20}$$

where CE = C + Cs + Ci and RE = R‖RC. The electrical cutoff frequency is (2πRECE)⁻¹.

The origin of the mechanical rolloff lies in the capillary vent tube shown in Fig. 1. The vent is necessary for static pressure equalization on both sides of the membrane; otherwise, the microphone will show an undesirable response to changes in ambient pressure. The mechanical cutoff frequency is (2πRVCV)⁻¹, where RV is the acoustic resistance of the capillary tube and CV the acoustic compliance of the backchamber. Both the electrical and mechanical cutoff frequencies can be controlled through the choice of design components and are usually set equal to one another.
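The two cutoff expressions above share the form (2πRC)⁻¹, and the frequency-dependent factor of Eq. (20) is a first-order high-pass response. The sketch below evaluates both; the component values (1 GΩ and 80 pF) are hypothetical and chosen only to give a cutoff of a few hertz, and the function names are illustrative.

```python
import math


def cutoff_frequency(resistance, capacitance):
    """Cutoff frequency (Hz) of the form 1/(2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance * capacitance)


def highpass_factor(f, resistance, capacitance):
    """Frequency-dependent factor j*w*R*C / (1 + j*w*R*C) of Eq. (20)."""
    jwrc = 1j * 2.0 * math.pi * f * resistance * capacitance
    return jwrc / (1.0 + jwrc)


R_E = 1.0e9    # ohm, hypothetical
C_E = 80e-12   # farad, hypothetical

f_c = cutoff_frequency(R_E, C_E)
gain_at_cutoff = abs(highpass_factor(f_c, R_E, C_E))
gain_db = 20.0 * math.log10(gain_at_cutoff)
print(f"f_c = {f_c:.2f} Hz, response at f_c = {gain_db:.2f} dB")
```

At the cutoff frequency the magnitude of the factor is 1/√2, that is, the response is about 3 dB below its midband value; the same function applies to the mechanical cutoff if RV and CV are supplied in consistent acoustic units.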

For infrasonic measurements, the mechanical rolloff can be eliminated by closing the vent tube and the electrical rolloff through the use of a microphone carrier system. Here, the microphone is made the capacitive element in a tank circuit, which is tuned to an electrical carrier frequency, typically 1–10 MHz. Motion of the membrane detunes the tank circuit, an effect that is used to amplitude- or frequency-modulate the carrier voltage. Such a system responds to static changes in microphone capacitance, in other words, to frequencies down to dc. A voltage-controlled capacitor placed across the microphone capacitance allows provision for automatic feedback compensation of capacitance changes due to changes in ambient pressure, as well as for remote calibration of the electronic system (the “insertion” technique). However, the time constant of the feedback control system places a lower limit on the measurable signal frequencies. A carrier system has a relatively high noise floor, typically 50 dB over 20 Hz to 20 kHz for a 1-in. microphone, but a wide frequency response, from dc to about 20% of the carrier frequency, microphone permitting. An infrasonic microphone system is best calibrated by an infrasonic pistonphone, but the coupler must be extremely leak tight.

10. Fiberoptic Sensors

The transmission of acoustically generated signals through optical fibers has two major advantages over transmission through copper conductors. The first is immunity from electromagnetic interference, which dramatically reduces the practical problems associated with grounding, shielding, and guarding. The second is the remote placement of the supporting optoelectronics, permitting the sensing element (e.g., membrane) to operate in harsh environments and confined locations. Classification of fiberoptic sensors follows the property of light that is modulated: wavelength (or phase), intensity, or polarization. Phase-modulating sensors are further classified into grating and interferometric sensors, and intensity-modulating sensors are classified according to whether the modulation affects the guided or evanescent light wave. Polarization-modulating sensors have not enjoyed a level of development comparable to that of the others and will not be discussed further. As a rule, interferometric sensors employ single-mode fibers and intensity-modulating sensors multimode fibers. Sensors can be designed to serve as microphones or hydrophones.

In the Mach-Zehnder interferometer (Fig. 9a) light from the source splits at the first coupler, one beam passing through the sensor fiber and the other through the reference fiber. They recombine at the second coupler and are detected at the photodetector. A sound wave incident upon the sensor fiber modulates the phase of the sensor beam relative to that of the reference beam. The resulting temporal interference produces an optical signal proportional to the acoustical excitation.

The Fabry-Perot interferometer (Fig. 9b) passes the source light through a coupler, where a fraction, say 50%, continues toward the membrane, the remainder being lost at the absorption cell (to prevent reflection back to the coupler). The membrane and the end of the first fiber comprise a Fabry-Perot cavity, thus generating an interference pattern which is modulated by the sound wave incident upon the membrane. The modulated light returns to the coupler and 50% passes on to the photodetector.

FIGURE 9 Fiberoptic sensors. (a) Mach-Zehnder interferometer; (b) Fabry-Perot interferometer; (c) fiberoptic lever; and (d) microbend sensor.

In the intensity-modulating fiberoptic lever (Fig. 9c), light passing through a bundle of transmitting fibers is reflected from the membrane. A fraction of the reflected light is intercepted by a bundle of receiving fibers. A sound wave incident upon the membrane modulates the fraction of received light, and thus the intensity of light into the receiving bundle and photodetector.

In the microbend sensor (Fig. 9d), the optical fiber is constrained to a periodic deformation. The periodicity determines the coupling between the optical propagating modes and radiating modes (exiting the fiber). A sound wave incident upon the deformer plate modulates the fiber deformation, the modal coupling, and consequently the light to the photodetector.

In general, the sensitivity of a fiberoptic pressure sensor is the product of three component sensitivities: (1) mechanical, change of sensing element displacement per unit sound pressure; (2) optical, change of optical signal per unit sensing element displacement; and (3) electronic, change of output voltage per unit optical signal in the photodetector. In a well-designed sensor, the threshold sensitivity is limited by the shot noise in the photodetector.

B. Measurement of Sound Intensity and Power

1. Sound Intensity

Sound intensity is the sound energy passing through a unit normal area per unit time. It is a vector quantity having units of watts per square meter (W/m²). This definition can be expressed in terms of the following fundamental acoustic parameters,

I = 〈p(t)u(t)〉 (21)

where the pressure and particle velocity are time-varying quantities and 〈 〉 denotes a time average. For harmonic waves:

$$I = \operatorname{Re}\left(\mathbf{p}\,\mathbf{u}^{*}\right) = pu\cos\theta \tag{22}$$

where boldface symbols denote rms time averages; Re, the real part; the asterisk, a complex conjugate; and θ, the temporal phase angle between p and u. For illustration, let us apply Eq. (22) to two simple examples. For a plane wave, u = p/ρc, θ = 0, and I = p²/ρc, where ρ is the air density and c the sound velocity. For a standing wave, u = p/ρc, θ = π/2, and I = 0. To avoid the difficult measurement of sound particle velocity, we invoke Newton’s second law:

$$u = -\rho^{-1}\int \nabla p \, dt \tag{23}$$

where ∇p is the pressure gradient. Substituting Eq. (23) into (21) yields an expression in p alone:

$$I = -\left\langle \frac{p}{\rho} \int \nabla p \, dt \right\rangle \tag{24}$$

In a practical measurement system, two microphones are aligned coaxially face to face, or in a coplanar arrangement, and separated by a fixed distance r in the sound field (Fig. 10). As long as r ≪ λ, where λ is the wavelength, Eq. (24) can be approximated as follows:

$$I_r = -\frac{1}{\rho}\left\langle \frac{p_A + p_B}{2} \int \frac{p_A - p_B}{r}\, dt \right\rangle \tag{25}$$

FIGURE 10 Measurement of sound intensity with a two-microphone arrangement. The microphones can also be aligned with coplanar membranes.


Note that p is represented by the mean value of pA and pB. Following the measurement of the sound pressures pA and pB, the sound intensity is evaluated by the signal-processing operations of summing, integrating, multiplying, and time averaging. It is good practice to repeat a measurement with the microphones switched in order to cancel the effect of phase differences between channels. Equation (25) yields the intensity component along the microphone axis. For sound propagation in a direction α relative to this axis, Ir must be divided by cos α.

A second measurement method is based on the fact that the time-averaging operation indicated in Eq. (24) is closely related to the crosspower spectrum GAB between pA and pB. Formal analysis yields:

$$I_r = -(\omega\rho r)^{-1}\,\operatorname{Im} G_{AB} \tag{26}$$

where Im denotes the imaginary part. If the microphone signals are applied to a correlator or digitally to a computer, Eq. (26) can be used to compute the intensity.
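The time-domain chain described for Eq. (25) (summing, integrating, multiplying, and time averaging) can be sketched digitally as follows. This is a minimal illustration, not the processing used in any particular instrument: the function name, the trapezoidal integration, the air properties (ρ = 1.204 kg/m³, c = 343 m/sec), and the synthetic 500-Hz plane wave are all assumptions. The sign convention is taken from Eq. (25) exactly as printed, under which a wave travelling from microphone B toward microphone A yields a positive result.

```python
import math


def intensity_two_microphone(p_a, p_b, dt, r, rho):
    """Time-domain estimate of Eq. (25), using the sign convention as printed."""
    n = len(p_a)
    running_integral = 0.0                 # integral of (pA - pB)/r from t = 0
    accumulated = 0.0
    previous = (p_a[0] - p_b[0]) / r
    for k in range(1, n):
        current = (p_a[k] - p_b[k]) / r
        running_integral += 0.5 * (previous + current) * dt   # trapezoidal rule
        previous = current
        mean_pressure = 0.5 * (p_a[k] + p_b[k])                # summing
        accumulated += mean_pressure * running_integral        # multiplying
    return -accumulated / (rho * (n - 1))                      # time averaging


# Synthetic plane wave of 1 Pa rms travelling from B (x = 0) to A (x = r).
RHO, C = 1.204, 343.0             # assumed air density and sound speed
f, fs, r = 500.0, 50_000.0, 0.012
k_wave = 2.0 * math.pi * f / C
amp = math.sqrt(2.0)
n_samples = 10_000                # exactly 100 periods at 100 samples per period
t = [i / fs for i in range(n_samples + 1)]
p_b = [amp * math.cos(2.0 * math.pi * f * ti) for ti in t]
p_a = [amp * math.cos(2.0 * math.pi * f * ti - k_wave * r) for ti in t]

i_est = intensity_two_microphone(p_a, p_b, 1.0 / fs, r, RHO)
i_plane = 1.0 / (RHO * C)         # p_rms^2 / (rho c) for the same wave
print(i_est, i_plane)
```

Averaging over a whole number of periods removes the arbitrary constant of integration. The small residual difference between the two printed values reflects the finite-difference approximation, which is why the spacing r must remain much smaller than the wavelength.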

Applications of sound intensity include sound power measurement (see Section I.B.3) and intensity mapping to locate sound sources and sinks.

2. Acoustic Enclosures

Before discussing sound power measurement, we shall find it helpful to discuss four basic types of acoustic enclosure: anechoic, reverberant, resonant, and Helmholtz. These are depicted in Figs. 11a–d.

FIGURE 11 Acoustic enclosures. (a) Anechoic: Waves from the sound source (dark circle) are absorbed without reflection. (b) Reverberant: Waves experience multiple reflections to establish a uniform sound energy density. (c) Resonant: Reflected waves reinforce waves generated from the source to establish a standing wave. (d) Helmholtz: At long wavelengths, the chamber behaves as a compliance and its small opening as an acoustic mass.

An anechoic chamber has no reflections (echoes) from its walls and thus simulates free-field conditions. The anechoic condition is realized by lining the walls of the chamber with wedges made of a porous, absorptive material such as rock wool, glass wool, or foam. Typical wedge dimensions are 20–60 cm at the base and a wedge angle of 10–15°. Since the floor must also be lined, an enclosure usually has a wire mesh just above the wedge tips to support personnel and equipment. The free-field approximation is truest at high frequencies. The low-frequency limit is determined by the criterion:

h/λ > (1 + Rw)/(1 − Rw) (27)

where h is the appropriate chamber dimension and Rw the reflection coefficient of the wedge along its axis. Using typical values Rw = 0.1 and h = 10 m, we find λ < 8.18 m, corresponding to a minimum operating frequency of about 42 Hz. The background noise level of a good anechoic chamber may lie lower than 0 dB SPL over most of the audio range. However, the mark of quality is how well the sound pressure from a point source adheres to the 1/r spherical spreading law, which is perturbed by reflections from the walls. Within confines no closer than about a meter from the walls, verification of the law to within 1 dB is a reasonable design goal.

A reverberation chamber is used to produce ideally a spatially uniform sound energy density. The walls are hard and highly reflective, such that a sound ray emitted from a source will experience multiple reflections in haphazard fashion, as shown in Fig. 11b, and will eventually fill the room. When the source is turned on, the energy density builds up until dissipation balances the sound power emitted by the source. The resulting sound field is called a diffuse field, independent of location or direction, a situation difficult to achieve in practice. Hard reflecting objects and moving reflectors enhance the diffuseness. The volume V of the room should exceed 3λ³ for octave analysis and 9λ³ for third-octave analysis (see Section II.A.2), where λ is the largest wavelength. Recommended ratios for room dimensions are 1 : 2^(1/3) : 4^(1/3). The source should be located at least λ/4 from the walls, and microphones at locations removed from known peaks and valleys in sound pressure. The reverberation chamber is widely used for measurement of sound absorption of materials and sound power emission of sources.

An enclosure is said to be resonant if a reflected wave returns to the source, in the direction from which it was emitted, after progressing an integral number of wavelengths. The returning waves reinforce the emitted waves to produce a standing wave pattern, which grows in amplitude until dissipation equals emission. In Fig. 11c, a plane wave is reflected between parallel walls. The resonant frequency for this one-dimensional case is

fR = c/λ = nc/2h (28)


where h is the distance between the walls and n an integer. In two and three dimensions, the standing wave patterns can become quite complex. Resonance conditions also exist for bound cylindrical and spherical waves.

A Helmholtz resonator is the acoustic analog of a mass–spring system. Two acoustic elements are coupled together: a chamber and an opening in the form of an orifice or tube. At wavelengths much greater than the element dimensions, the air in the opening moves as a unit and thus behaves as an acoustic mass; the chamber behaves as a compliance. The resonant frequency of the mass–compliance system is

$$f_R = \frac{c}{2\pi}\left(\frac{S}{lV}\right)^{1/2} \tag{29}$$

where S is the cross-sectional area and l the effective length of the opening, and V is the volume of the chamber. This frequency is much lower than the natural frequencies of the chamber alone. Helmholtz resonators are very common in nature and technology, from caves to wine jugs to leaky rooms, and at one time were used to determine pitch. We have already come across an example: the backchamber–vent hole system of a condenser microphone.
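A quick numerical illustration of the Helmholtz resonance may be useful. The sketch below uses the standard mass–compliance result f_R = (c/2π)(S/lV)^(1/2); the jug dimensions and the sound speed of 343 m/sec are hypothetical values chosen for illustration, and the 0.2-m chamber depth used for comparison with Eq. (28) is likewise an assumption.

```python
import math


def helmholtz_frequency(c, s, l_eff, v):
    """Resonant frequency (Hz) of a chamber of volume v with a neck of
    cross-sectional area s and effective length l_eff."""
    return (c / (2.0 * math.pi)) * math.sqrt(s / (l_eff * v))


C_AIR = 343.0          # m/sec, assumed
f_r = helmholtz_frequency(C_AIR, s=3.0e-4, l_eff=0.05, v=1.0e-3)  # 1-litre jug

# Lowest one-dimensional standing-wave frequency, Eq. (28) with n = 1,
# for an assumed chamber depth of 0.2 m.
f_chamber = C_AIR / (2.0 * 0.2)

print(f"Helmholtz: {f_r:.1f} Hz, first chamber mode: {f_chamber:.1f} Hz")
```

For these assumed dimensions the Helmholtz resonance lies several times below the first standing-wave resonance of the chamber, in line with the remark that it is much lower than the natural frequencies of the chamber alone.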

3. Sound Power

Sound power is the integrated normal intensity over a surface enclosing an acoustic source and has units of watts:

$$W = \int_S \mathbf{I}\cdot d\mathbf{S} \tag{30}$$

In the absence of absorption the sound power is independent of distance from the source. Three methods for measuring sound power are free field, diffuse field, and reference source.

The free-field method assumes that the sound field is (1) far field, (2) spherically spreading, and (3) in an anechoic environment. The sound source must be located either outdoors or in an anechoic chamber in order to approximate free-field conditions. Microphone measurement stations are located on an imaginary sphere (source-suspended) or hemisphere (source-grounded), having a radius r several times greater than the longest wavelength. The sound power is determined from the mean value pm² of the squares of the sound pressure measurements at all the microphone stations:

$$W = \left(\frac{p_m^2}{\rho c}\right)\left(4\pi r^2\right)F \tag{31}$$

The directivity factor F has the value 1 if the source is suspended or 0.5 if the source is grounded on a perfectly reflecting surface. At 20°C, 1 atm, ρc = 415 N·sec/m³ in air. For example, if pm = 20 µPa, 4πr² = 1 m², and F = 1, then W ≈ 10⁻¹² W.
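The worked example for Eq. (31) can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names are illustrative, the four identical microphone readings are invented for the example, and the 10⁻¹² W reference used to express the result as a level is the conventional one rather than a value given in the text.

```python
import math

RHO_C_AIR = 415.0   # value of rho*c quoted in the text for air at 20 C, 1 atm
W_REF = 1.0e-12     # W; assumed conventional reference for sound power level


def mean_square(pressures_rms):
    """Mean of the squared rms pressures over all microphone stations."""
    return sum(p * p for p in pressures_rms) / len(pressures_rms)


def free_field_sound_power(pressures_rms, radius, directivity, rho_c=RHO_C_AIR):
    """Eq. (31): sound power (W) from pressures measured on a sphere or hemisphere."""
    area = 4.0 * math.pi * radius ** 2
    return (mean_square(pressures_rms) / rho_c) * area * directivity


# Example from the text: pm = 20 micropascals, 4*pi*r^2 = 1 m^2, F = 1.
radius = math.sqrt(1.0 / (4.0 * math.pi))
power = free_field_sound_power([20e-6] * 4, radius, 1.0)
level = 10.0 * math.log10(power / W_REF)
print(f"W = {power:.3e} W ({level:+.2f} dB re 1 pW)")
```

The result is slightly below 10⁻¹² W because ρc = 415 rather than exactly 400; the difference is a fraction of a decibel.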

If the sound intensity, using microphone pairs, is measured instead of the sound pressure alone, the requirements on far-field, spherically spreading sound propagation can be relaxed. However, so far this modification has not gained widespread usage.

The diffuse-field method is based on the fact that the sound energy density is uniform throughout a reverberant enclosure. If the sound source is placed in a good reverberation chamber, then the energy density adjusts to a steady-state condition, whereby the sound power emitted by the source balances the power dissipated in the chamber. The experimental arrangement is similar to that used in free-field measurements, but now the sound power becomes:

$$W = \frac{p_m^2\,R}{4\rho c} \tag{32}$$

where R is the reverberation constant of the room and can be determined from the measurement of reverberation time (see Section III.A).

Often the enclosure about a sound source is neither anechoic nor reverberant, and accommodation to acoustic requirements is not possible, for example, if the source is a heavy machine. In such a case, the reference source method may prove useful. Here, an identical set of measurements is taken both for the test source and for a reference source of known sound power. Then,

$$W = W_r\,\frac{p_m^2}{p_{mr}^2} \tag{33}$$

where Wr and pmr² are the sound power and mean-squared sound pressure for the reference source.

C. Measurement of Acoustic Particle Velocity

This difficult measurement is generally avoided if the acoustic particle velocity can be related simply to the sound pressure. Otherwise, one approach is to exploit the relationship between particle velocity and sound pressure gradient given in Eq. (23).

1. Pressure Gradient Microphone

Measurement of a sound pressure gradient can be achieved by any of the principles used to measure sound pressure alone. A highly successful device is the foil–electret gradient microphone of Sessler and West, illustrated in Fig. 12. It is assumed that the microphone is sufficiently small not to perturb the sound field, requiring l < λ/2, and that the sound pressures on either side of the membrane are the same as those at the protective grids, namely, p1 and p2.

FIGURE 12 Foil–electret gradient microphone. This device can be used to measure acoustic particle velocity.

A metallized electret foil is stretched tightly across, and in contact with, the backplate. Thus the response is compliance-controlled, although a mass-controlled design is feasible. An air gap exists only because of irregularities in the backplate surface. Holes in the backplate and both protective grids permit the sound wave to gain access to both sides of the membrane. The membrane responds to the net normal component of sound pressure,

$$p_i = \left|-\nabla p\; l' \cos\theta\right|$$

which is related to sound particle velocity u through Eq. (23). For a harmonic wave of frequency f:

$$p_i = 2\pi f\, u\, \rho_0\, l' \cos\theta \tag{34}$$

where ρ0 is the static density of air and l ≈ l′ the effective grid spacing. Measurement of pi thus determines u since all other quantities in Eq. (34) are known. Measurement of the response at constant amplitude but increasing frequency is a good check as to whether the microphone is truly responding to pressure gradient.
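Equation (34) and the frequency check just described can be illustrated with a short sketch. The air density of 1.204 kg/m³, the 5-mm effective grid spacing, and the particle velocity of 1 mm/sec are hypothetical values, and the function names are illustrative.

```python
import math


def gradient_pressure(u, f, l_eff, theta=0.0, rho0=1.204):
    """Eq. (34): net pressure (Pa) on the membrane for particle velocity u (m/sec)."""
    return 2.0 * math.pi * f * u * rho0 * l_eff * math.cos(theta)


def particle_velocity(p_i, f, l_eff, theta=0.0, rho0=1.204):
    """Eq. (34) solved for the particle velocity u."""
    return p_i / (2.0 * math.pi * f * rho0 * l_eff * math.cos(theta))


u_true = 1.0e-3                                   # m/sec, hypothetical
p_1k = gradient_pressure(u_true, 1000.0, 0.005)
p_2k = gradient_pressure(u_true, 2000.0, 0.005)
u_back = particle_velocity(p_1k, 1000.0, 0.005)
print(p_1k, p_2k / p_1k, u_back)
```

At constant particle velocity the net pressure doubles when the frequency doubles (6 dB per octave), which is the signature one looks for when checking that the microphone is responding to the pressure gradient.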

2. Other Methods

The Rayleigh disk is based on the fact that a sound wave will exert a torque on a suspended disk oriented at a suitable angle, usually 45°, with respect to the sound field. The torque is proportional to the square of the incident sound particle velocity. In a practical apparatus the suspending thread exercises a resisting torque, which balances the sound-generated torque. Measurement of the angular displacement, usually by optical means, permits determination of the torque and thus particle velocity. Such measurements are most conveniently conducted in a tube since the direction of sound propagation is known. At one time, the British Post Office used this method for the absolute calibration of microphones.

The hot-wire anemometer is not suitable for purely acoustic excitation but has been used successfully for acoustic excitation superimposed on a mean flow.

D. Measurement of Sound Speed and Attenuation

1. Sound Speed

The distance per unit time through which a phase point of a sound signal propagates is the sound speed and, with specified direction, is called the phase velocity. It is a physical property of the medium, although it has a slight dependence on the frequency. The theoretical expression for the sound speed in a gas at low pressures, c = (γRT/M)^(1/2), where γ is the ratio of specific heats at constant pressure and constant volume, R the universal gas constant, T the absolute temperature, and M the molecular weight, is very reliable. Measurements of sound speed are used to determine the (nonideal) equation of state and the second virial coefficient. Measurements are usually taken at low frequencies (low kilohertz range) because thermoviscous (classical) absorption increases very strongly with frequency. In a liquid, c = (κρ)^(−1/2), where κ is the adiabatic compressibility and ρ the density; classical absorption is not so great a problem, and sound speed shows minimal dispersion well into the high megahertz region. Measurements are used to determine the compressibility and the nonlinearity parameters. Of course, the sound speed is also of interest to acousticians engaged in sound propagation problems. At 20°C, 1 atm, the speed of sound is 343.23 m/sec in dry air, 1482.34 m/sec in pure water, and 1529.03 m/sec in seawater (3.5% salinity). Two apparatuses commonly used for the measurement are the cylindrical resonator and the spherical resonator.
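The two expressions for the sound speed can be evaluated directly. In the sketch below, the gas constant, the specific-heat ratio γ = 1.4, the molar mass of dry air (0.028965 kg/mol), and the water density of 998.2 kg/m³ are assumed handbook-style values that do not appear in the text; the function names are illustrative.

```python
import math

R_GAS = 8.314462618   # J/(mol K), universal gas constant


def gas_sound_speed(gamma, temperature_k, molar_mass):
    """c = (gamma R T / M)^(1/2) for a gas at low pressure."""
    return math.sqrt(gamma * R_GAS * temperature_k / molar_mass)


def liquid_sound_speed(kappa, rho):
    """c = (kappa rho)^(-1/2), with kappa the adiabatic compressibility."""
    return 1.0 / math.sqrt(kappa * rho)


def adiabatic_compressibility(c, rho):
    """Inverse relation: compressibility (1/Pa) from a measured sound speed."""
    return 1.0 / (rho * c * c)


c_air = gas_sound_speed(1.4, 293.15, 0.028965)        # dry air at 20 C (assumed inputs)
kappa_water = adiabatic_compressibility(1482.34, 998.2)
c_water = liquid_sound_speed(kappa_water, 998.2)
print(f"c_air = {c_air:.2f} m/sec, kappa_water = {kappa_water:.3e} 1/Pa")
```

With these assumed inputs the ideal-gas expression gives a value close to the 343.23 m/sec quoted for dry air, and the inverse relation shows how a measured sound speed in a liquid yields its compressibility.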

2. Cylindrical Resonator

This resonator is a long hollow cylinder capped by two parallel end plates. It employs a transmitter and receiver, either separately or as a single unit and usually located in the end plates. Many types of electroacoustic transducers are used: piezoelectric, electrostatic, electrodynamic, and so on. If the excitation is in the form of a short pulse or tone burst (see Section I.D.4), then c = L/T, where L is the distance from source to receiver and T the corresponding transit time. If an axial mode of the resonator is excited, then:

c = 2L fn/n, n = 1, 2, . . . (35)

The advantages of the cylindrical geometry are modal purity and ease of fabrication, especially for measurements at high pressures. The disadvantages are the relatively large thermoviscous losses at the walls, which require a correction to the sound speed, and the existence of cutoff frequencies, which limit the maximum usable frequency. The first nonaxial mode has a diametric node at a frequency:

f = 0.5861c/2R (36)

where R is the internal radius. If the resonator is symmetrical about the cylindrical axis, this mode will not be strongly excited and the limiting frequency will be that corresponding to the first radial mode:

f = 1.2187c/2R (37)

For example, if L = 0.25 m, R = 0.025 m, and f1 = 600 Hz (measured), then Eq. (35) yields c = 300 m/sec. Ignoring the mode defined by Eq. (36), we find the cutoff frequency from Eq. (37) to be f = 7312 Hz, which permits the use of 7312/600 ≈ 12 axial modes.
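The cutoff calculation in this example can be sketched as follows. The coefficients are those printed in Eqs. (36) and (37); the function names are illustrative, and the inputs are the sound speed, radius, and fundamental frequency quoted in the example.

```python
def first_nonaxial_cutoff(c, radius):
    """Eq. (36): frequency (Hz) of the first nonaxial mode (diametric node)."""
    return 0.5861 * c / (2.0 * radius)


def first_radial_cutoff(c, radius):
    """Eq. (37): frequency (Hz) of the first radial mode."""
    return 1.2187 * c / (2.0 * radius)


def usable_axial_modes(f1, f_cutoff):
    """Number of axial harmonics of f1 that lie below the cutoff frequency."""
    return int(f_cutoff // f1)


c, radius, f1 = 300.0, 0.025, 600.0
f_nonaxial = first_nonaxial_cutoff(c, radius)
f_radial = first_radial_cutoff(c, radius)
n_modes = usable_axial_modes(f1, f_radial)
print(f"{f_nonaxial:.1f} Hz, {f_radial:.1f} Hz, {n_modes} axial modes")
```

If the resonator is not symmetrical about its axis, the lower cutoff of Eq. (36) applies instead, and the number of usable axial modes falls accordingly.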

3. Spherical Resonator

The excitation of a radial wave in a spherical resonator has an important advantage: The particle velocity is normal to the walls, and the wall losses are typically 10 : 1 lower than those of a cylindrical resonator at a comparable frequency. The major difficulty lies in fabricating a spherical shell with precise internal dimensions. Small glass spheres of volumes from 0.5 to 100 liters permit measurement in liquids from about 5 kHz to 2 MHz. Larger resonators invariably consist of two metallic hemispheres joined by flanges at the equator. The sound speed is related to the radial mode frequencies by:

c = 2π R fn/νn (38)

where R is the radius of the sphere and νn = 4.493, 7.725, 10.904, . . . for n = 1, 2, 3, . . . . Geometric imperfections affect c only to second order. Measurement precisions of 0.02% are commonplace; precisions of 0.003% have been achieved.
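A short sketch of Eq. (38) follows. It also checks the tabulated νn against the condition tan ν = ν, which is the standard eigenvalue condition for radial modes in a rigid-walled sphere; that condition is not stated in the text and is introduced here only as a consistency check. The sphere radius and sound speed in the example are hypothetical.

```python
import math

NU = (4.493, 7.725, 10.904)   # eigenvalues quoted in the text for n = 1, 2, 3


def sound_speed_from_radial_mode(radius, f_n, n):
    """Eq. (38): sound speed (m/sec) from the n-th radial mode frequency."""
    return 2.0 * math.pi * radius * f_n / NU[n - 1]


def radial_mode_frequency(radius, c, n):
    """Eq. (38) solved for the mode frequency (Hz)."""
    return NU[n - 1] * c / (2.0 * math.pi * radius)


# Consistency check: the quoted values nearly satisfy tan(nu) = nu.
residuals = [math.tan(nu) - nu for nu in NU]

# Round trip with hypothetical values: 0.1-m radius, 343 m/sec.
f_2 = radial_mode_frequency(0.1, 343.0, 2)
c_back = sound_speed_from_radial_mode(0.1, f_2, 2)
print(residuals, f_2, c_back)
```

The residuals are small but not zero because the eigenvalues are quoted to only four significant figures; a measurement aiming at the precisions cited above would need them to more digits.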

4. Sound Attenuation

Sound attenuation is the reduction in amplitude of a sound wave as it propagates through a medium. It may be the result of spreading, scattering, or absorption (direct conversion to heat). The same apparatus can be used to measure both speed and attenuation; often both quantities are measured together.

There are many ways to designate sound attenuation, most associated with a particular experimental technique. The quality factor Q is the most fundamental measure because it is based on energy considerations. If a medium is subjected to periodic acoustic excitation, then:

$$Q = 2\pi \times \frac{\text{maximum energy stored}}{\text{energy dissipation per cycle}} \tag{39}$$

The quality factor, a physical property of the medium just as the density and compliance are, is sensitive to physical and chemical changes and sometimes strongly frequency dependent, but it is ill-suited to direct measurement. Rather, Q is determined through its relationship to the other measures of attenuation shown in the following tabulation.

Measure                                   Relation to Q
Attenuation constant, Np/m                α = π/λQ
Reciprocal time constant, Np/sec          β = πf/Q
Logarithmic decrement, Np/wavelength      Δ = π/Q
Resonant halfwidth, Hz                    Δf = f0/Q
Tangent of phase angle                    tan δ = 1/Q
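The relations in the tabulation can be collected in a single helper, sketched below. The dictionary keys and function name are illustrative, and the example values (Q = 100 at 1 kHz with c = 343 m/sec) are hypothetical.

```python
import math


def attenuation_measures(q, f0, c):
    """Attenuation measures of the tabulation for quality factor q,
    frequency f0 (Hz), and sound speed c (m/sec)."""
    wavelength = c / f0
    return {
        "alpha_np_per_m": math.pi / (wavelength * q),   # attenuation constant
        "beta_np_per_sec": math.pi * f0 / q,            # reciprocal time constant
        "log_decrement_np": math.pi / q,                # logarithmic decrement
        "halfwidth_hz": f0 / q,                         # resonant halfwidth
        "tan_delta": 1.0 / q,                           # loss tangent
    }


m = attenuation_measures(q=100.0, f0=1000.0, c=343.0)
for name, value in m.items():
    print(f"{name}: {value:.5g}")
```

Because c = fλ, the spatial and temporal measures are linked by β = αc, which provides a quick consistency check on any pair of measurements.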

When the wavelength is much smaller than the dimensions of the experimental enclosure, and this includes propagation in the free field, the attenuation constant α is the appropriate measure. A popular method of determining α is the pulse–echo method, illustrated in Fig. 13a. A transmitter T launches a tone burst, a packet of harmonic waves of frequency f, in a cylindrical resonator. In the sequence of reflections from the end plates, a receiver R at a fixed station measures the amplitude of the packet, which attenuates by a factor exp(−αx) over a propagation distance x. A suitable pulse width must be chosen by the observer: Too great a width will decrease spatial resolution, but too small a width will decrease frequency resolution. More sophisticated techniques fall into the domain of ultrasonic measurements. The attenuation constant α is the most common measure of attenuation in fluids and the one normally shown in tabulations of absorption values.

FIGURE 13 Measurement of sound attenuation by (a) the pulse–echo, (b) the free decay, and (c) the resonant halfwidth methods.

The attenuation can also be described in terms of a time constant (or its reciprocal), as is common in reverberation measurements. In a typical experiment, white noise is used to excite a large number of reverberation chamber modes, which can be observed individually on a fixed receiver through narrow band filtering. When the excitation is removed, the amplitude decay, of the form exp(−βt), is observed on each filter. The decay envelope is shown in Fig. 13b.

The next three measures pertain to the resonant technique, whereby a natural mode of a resonator is excited, usually an axial mode of a cylinder or radial mode of a sphere.

When the excitation is removed, the sound pressure amplitude of the mode decays freely, as in the reverberation technique. The logarithmic decrement Δ is the natural logarithm of the amplitude ratio of two successive peaks, shown in Fig. 13b:

$$\Delta = \ln\left(p_n / p_{n+1}\right) \tag{40}$$

Usually Δ is averaged over a range of the free decay, typically 10–40 dB. This method is most suitable for media with a high Q, say Q > 10; in media with an extremely high Q, such as degassed distilled water, for which Q > 10⁶, this is the only feasible method for measuring sound attenuation.
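The averaging of the logarithmic decrement over a stretch of free decay can be sketched as below. The peak amplitudes are synthetic, generated for a hypothetical Q of 500, and the function names are illustrative; the relation Q = π/Δ is taken from the tabulation above.

```python
import math


def log_decrement(peaks):
    """Average of Eq. (40) over all pairs of successive peak amplitudes."""
    ratios = [math.log(peaks[i] / peaks[i + 1]) for i in range(len(peaks) - 1)]
    return sum(ratios) / len(ratios)


def quality_factor_from_decrement(decrement):
    """Q = pi / (logarithmic decrement)."""
    return math.pi / decrement


# Synthetic free decay: successive peak amplitudes for an assumed Q of 500.
q_true = 500.0
peaks = [math.exp(-math.pi * n / q_true) for n in range(200)]

decrement = log_decrement(peaks)
q_est = quality_factor_from_decrement(decrement)
decay_db = 20.0 * math.log10(peaks[0] / peaks[-1])
print(f"decrement = {decrement:.5f} Np, Q = {q_est:.1f}, range = {decay_db:.1f} dB")
```

The 200 synthetic peaks span roughly 11 dB of decay, at the lower end of the averaging range mentioned above; with measured data, noise on individual peaks is what makes the averaging worthwhile.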

In a steady-state experiment, the attenuation can be determined by the halfwidth of the resonance curve, shown in Fig. 13c. The sound pressure amplitude is measured as the frequency is incremented or swept through the resonant frequency f0. The halfwidth Δf is defined as the frequency interval between the two points where p = pmax/√2; that is, where the amplitude is 3 dB down from the peak pmax. This method cannot be used when the halfwidth is too narrow, due to instability of f0, or when the halfwidth is too broad, due to poor peak definition. Furthermore, any losses inherent in the transmitter itself will contribute to the halfwidth. With an efficient transmitter this method can be used effectively over a range of Q from about two to several hundred.

If the medium is very lossy (Q < 2), the most effective measure of attenuation is the loss tangent tan δ. Such low Q’s are found in some polymeric liquids, seldom in gases. Because of the high loss the sample is made the lossy element of a composite resonator, in which independent measurements of force and displacement yield the phase angle δ.

The measured attenuation in a resonator contains three principal components: wall absorption, absorption due to fluid–structure interaction, and the constituent absorption that is to be measured. The wall and structural components, called the background absorption, can be determined through measurements on a background fluid having negligible constituent absorption over the range of measurement parameters. In gases, argon and nitrogen are frequently used for this purpose.

5. Free-Field Measurements of Attenuation

In the free field, corrections must be made for spreading. For a spherical source, the sound pressure falls as 1/r. At sufficiently large distances from the source, a spherical wave can be approximated by a plane wave, and the correction is not needed.

6. Optoacoustical Method: Laser-Induced Thermal Acoustics

The passage of laser light through a fluid can inducea strain either thermally (resonant) or electrostrictively(nonresonant). A typical laser-induced thermal acoustics(LITA) arrangement is shown in Fig. 14. Typical compo-nent specifications are shown in parentheses. Light froma pulsed pump laser (λpump = 532 nm) is split into twobeams which intersect at a small angle (2θ = 0.9). Opticalinterference fringes of spatial period = λpump/(2 sin θ )generate electrostrictively counterpropagating ultrasonicwaves of fixed wavelength to form a Bragg grating, shownin the insert. A long-pulsed probe laser (750 nm) illu-minates the grating, which diffracts a small fraction ofthe probe beam at an angle φ to a photomultiplier. Thediffracted signal is normalized to the direct probe sig-nal measured at the photodetector. Since the acousticalwavelength is known from the intersection angle 2θ and

FIGURE 14 Laser-induced thermal acoustics.


pump laser wavelength, and the frequency is known from the photomultiplier signal, the speed of sound of the fluid medium can be measured. A “referenced” version of LITA, implemented to avoid the large error associated with the intersection angle measurement, splits the pump and probe beams and directs them to a second LITA cell containing a fluid of known sound speed.
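The sound-speed evaluation described above can be sketched numerically. This is a minimal illustration, not the procedure of any particular LITA instrument: it assumes the quoted crossing angle of 0.9 is in degrees, that the grating period equals the acoustic wavelength, and that the frequency passed in is the acoustic frequency already extracted from the photomultiplier signal. The function name and the 10-MHz example frequency are hypothetical.

```python
import math

def lita_sound_speed(pump_wavelength_m, crossing_angle_deg, acoustic_freq_hz):
    """Return (grating period in m, sound speed in m/s).

    crossing_angle_deg is the full intersection angle 2*theta (assumed degrees).
    """
    theta = math.radians(crossing_angle_deg) / 2.0            # half-angle theta
    grating_period = pump_wavelength_m / (2.0 * math.sin(theta))
    # Speed of sound = acoustic frequency x acoustic wavelength
    return grating_period, acoustic_freq_hz * grating_period

# Pump wavelength and angle quoted in the text; the frequency is illustrative only.
period, c = lita_sound_speed(532e-9, 0.9, 10.0e6)
print(f"grating period = {period * 1e6:.2f} um, sound speed = {c:.1f} m/s")
```

Because the period varies as 1/sin θ, a small error in the crossing angle maps almost directly into the same fractional error in sound speed, which is the motivation for the referenced version mentioned above.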

E. Measurement of Acoustic Impedance

The relationship between sound pressure and acoustic flow velocity plays a central role in the analysis of acoustic devices, such as mufflers and musical instruments, and in the determination of a sound field in the presence of a boundary. Quantitatively, this relationship is described by one of three types of acoustic impedance. The acoustic impedance, the ratio of sound pressure to volume velocity Z = p/U, is used in analyzing acoustic circuits, where devices are represented by equivalent lumped elements. It is a property of the medium, frequency, and geometry and has units of N·sec/m⁵ = kg/(sec·m⁴) = Rayl/m².

The interaction between a sound wave and a boundary depends on the specific acoustic impedance of the boundary relative to that of the propagation medium. The specific acoustic impedance is the ratio of sound pressure to acoustic particle velocity, z = p/u. For a plane wave z = ρc, and it is basically a property of the medium, although it can have complex frequency-dependent parts. It is related to Z through z = ZS, where S is the cross-sectional area, and has units of N·sec/m³ = kg/(sec·m²) = Rayl. Thus measurement of z readily leads to the determination of Z and vice versa.

The mechanical or radiation impedance, the ratio of force to particle velocity, is of interest in systems containing both discrete and continuous components but is not discussed here.

Methods of measuring acoustic impedance fall into three broad categories: (1) impedance tube and waveguide methods in general, (2) free-field methods, and (3) direct measurement of sound pressure and volume velocity.

1. Impedance Tube

The test specimen is located at one end of a rigid tube and a transmitter at the other end (Fig. 15a). It is important to distinguish between materials of local reaction and those of extended reaction. In the former, the behavior of one point on the surface depends only on excitation at that point and not on events taking place elsewhere in the material. In the latter, acoustic excitation at a point on the surface generates waves that propagate laterally throughout the material. Generally, a material is locally reacting if normal acoustic penetration does not exceed a wavelength.

FIGURE 15 Measurement of acoustic impedance with an impedance tube. (a) Impedance tube, and (b) standing wave pattern and its envelopes.

For a locally reacting material, a thin test specimen (thickness ≪ λ/4) is backed by a λ/4 air gap sandwiched between the specimen and a massive reflector. For a material of extended reaction, a specimen of approximate thickness λ/4 is backed by the massive reflector directly against its surface.

The transmitter is tuned to establish a standing wave pattern, which is probed by a microphone located either within the tube or at the end of a probe tube. The observer slides the probe along the impedance tube axis and records the standing wave pattern L(x) in decibels (Fig. 15b). Here L(x) = 20 log[p(x)/p0], where the reference pressure p0 is immaterial. The impedance is evaluated from the pressure standing wave ratio L0 at the specimen surface, which cannot be measured directly but is computed from a best fit to the trend of Lmax and Lmin shown in the figure. After computing the antilog of the pressure standing wave ratio K0 and related quantities,

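$$ K_0 = 10^{L_0/20} \tag{41} $$

$$ \phi = \left( \frac{x_1}{x_2 - x_1} - \frac{1}{2} \right) \times 360^\circ \tag{42} $$

$$ M = \tfrac{1}{2}\left( K_0 + K_0^{-1} \right) \tag{43} $$

$$ N = \tfrac{1}{2}\left( K_0 - K_0^{-1} \right) \tag{44} $$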

we determine the real z′ and imaginary z′′ parts of the specific acoustic impedance relative to that of air, ρc:

$$ \frac{z'}{\rho c} = \frac{1}{M - N\cos\phi} \tag{45} $$

$$ \frac{z''}{\rho c} = \frac{N\sin\phi}{M - N\cos\phi} \tag{46} $$

This method is capable of yielding measurements of high precision, to within a few percent based on repeatability.


FIGURE 16 Measurement of acoustic impedance by a free-field method.

A major source of error lies in the determination of x, which may be ill-defined for a rough or fibrous specimen surface. To improve surface definition, a face sheet composed of a fine-meshed gauze of low acoustic resistance can be used. A disadvantage is the time required to take the number of measurements needed to establish the standing wave pattern. More modern methods based on the transfer function between two microphone stations reduce the measurement time considerably.
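The evaluation chain of Eqs. (41) through (46) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: x1 and x2 are taken to be the distances from the specimen surface to the first and second pressure minima (the text does not define them explicitly), and the function names and sample readings are hypothetical. The second function recomputes the same impedance from a complex reflection coefficient as an independent cross-check.

```python
import cmath
import math

def impedance_from_standing_wave(L0_db, x1, x2):
    """Relative specific impedance z/(rho c) = z' + j z'' from Eqs. (41)-(46)."""
    K0 = 10.0 ** (L0_db / 20.0)                               # Eq. (41)
    phi = math.radians((x1 / (x2 - x1) - 0.5) * 360.0)        # Eq. (42), degrees -> radians
    M = 0.5 * (K0 + 1.0 / K0)                                 # Eq. (43)
    N = 0.5 * (K0 - 1.0 / K0)                                 # Eq. (44)
    denom = M - N * math.cos(phi)
    return complex(1.0 / denom, N * math.sin(phi) / denom)    # Eqs. (45), (46)

def impedance_from_reflection(L0_db, x1, x2):
    """Cross-check: build the reflection coefficient R, then z = (1 + R)/(1 - R)."""
    K0 = 10.0 ** (L0_db / 20.0)
    phi = math.radians((x1 / (x2 - x1) - 0.5) * 360.0)
    R = (K0 - 1.0) / (K0 + 1.0) * cmath.exp(1j * phi)
    return (1.0 + R) / (1.0 - R)

# Hypothetical readings: 12-dB standing wave ratio, minima at 0.08 m and 0.25 m
z_rel = impedance_from_standing_wave(12.0, 0.08, 0.25)
z_check = impedance_from_reflection(12.0, 0.08, 0.25)
print(z_rel, z_check)
```

A standing wave ratio of 0 dB gives K0 = 1, M = 1, and N = 0, so the relative impedance reduces to 1: the specimen is matched to the air in the tube.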

2. Free-Field Methods

A transmitter sends an incident wave at an angle ψ toward the test specimen, from which it is reflected, also at an angle ψ, toward a receiver (Fig. 16). The specific acoustic impedance is evaluated from the reflection coefficient Rp = pr/pi:

$$ \frac{z}{\rho c} = \frac{1}{\sin\psi} \left( \frac{1 + R_p}{1 - R_p} \right) \tag{47} $$

A variety of techniques, both transient and steady state, have been devised to determine the three wave components pr, pi, and pd. One steady-state method utilizes three separate measurements at each frequency: (1) with the specimen in place, yielding p1 = pr + pd; (2) with the specimen replaced by a reflector of high impedance, yielding p2 = p′r + pd ≈ pi + pd; and (3) with the reflector removed, yielding p3 = pd alone. Thus,

$$ R_p = \frac{p_r}{p_i} = \frac{p_1 - p_3}{p_2 - p_3} \tag{48} $$

Free-field methods are used for testing materials at short wavelengths and are popular for outdoor measurements of the earth’s ground surface.
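The three-measurement procedure of Eqs. (47) and (48) can be illustrated as follows. The complex pressure values are invented purely for demonstration, and the function names are hypothetical; the sketch only shows how the pd contribution cancels in the ratio.

```python
import math

def reflection_coefficient(p1, p2, p3):
    """Eq. (48): the third measurement (pd alone) is subtracted from the other two."""
    return (p1 - p3) / (p2 - p3)

def relative_impedance(Rp, psi_deg):
    """Eq. (47): z/(rho c) for an angle psi given in degrees."""
    return (1.0 + Rp) / (1.0 - Rp) / math.sin(math.radians(psi_deg))

# Invented complex pressure amplitudes (arbitrary units)
p_d = 0.30 + 0.10j    # component present in all three measurements
p_i = 1.00 + 0.00j    # incident wave
p_r = 0.40 - 0.20j    # wave reflected from the specimen

Rp = reflection_coefficient(p_r + p_d, p_i + p_d, p_d)
z_rel = relative_impedance(Rp, 30.0)
print(Rp, z_rel)
```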

3. Direct Measurement of Sound Pressure and Volume Velocity

For measurement of the acoustic impedance within an acoustic device, the sound pressure can be measured with the aid of a probe tube (Section I.A.8), but measurement of the acoustic particle or volume velocity is difficult (Section I.C).

The most common method of attacking the latter problem is to control the volume velocity at the transmitter. This can be achieved in several ways: (1) by mounting a displacement sensor on the driver; (2) by using a dual driver, directing one side to the test region and the other side to a known impedance Zk and using U = p/Zk; and (3) by exciting a driving piston with a cam so that the generated volume velocity will be independent of acoustic load. The first two methods rely on the integrity of the velocity measurement technique; the third is limited to relatively low frequencies. To measure the specific acoustic impedance of a material, a transmitter, receiver, and test specimen are mounted in a coupler; the impedance of the latter must be taken into account.

II. INSTRUMENTS FOR PROCESSING ACOUSTICAL DATA

A. Filters

The representation of an acoustic time history in the domain of an integral (or discrete) transform has two advantages. First, it transforms an integrodifferential (or difference) equation into a more tractable algebraic equation. Second, it often separates relevant signal from irrelevant signal and random noise. The two most common transforms used in acoustics are the Fourier transform for continuous time histories and the z transform for discrete (sampled) time histories. The Fourier transform represents a time history f(t) in the frequency domain,

$$ F(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) \exp(-j\omega t)\, dt \tag{49} $$

with ω = 2πf. The z transform represents the sampled values f(nTs) in the z domain:

$$ F(z) = \sum_{n=0}^{\infty} f(nT_s)\, z^{-n} \tag{50} $$

where Ts is the sample interval and n the sample number. The representation of a time history in the transformed domain is called a spectrum. We shall be concerned with the frequency spectrum.

Filters fulfill three major functions in acoustics: spectral selection, analysis, and shaping. It is assumed that the reader is familiar with the general characteristics of filters and with filter terminology.

1. Spectral Selection

We shall present two examples of spectral selection. The first is antialiasing. In sampled systems, it is essential that all frequency components above half the sampling frequency fs be suppressed to avoid “aliasing,” that is, the appearance of components of frequency fs − f in the observed spectrum. This is a consequence of the Nyquist sampling theorem. The maximum frequency for which


the spectrum is uncorrupted by aliasing is called the Nyquist frequency. The second example is signal-to-noise improvement. The observed signal is often a pure tone, for which narrow-band filtering will produce a considerable improvement in signal-to-noise (S/N) ratio. If the noise is “white,” that is, has a uniform spectral power density, then the noise power is proportional to bandwidth, and a reduction in bandwidth from BW to BN improves the S/N ratio by 10 log(BW/BN) decibels.
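The folding rule behind aliasing can be made concrete with a short sketch. The helper below is hypothetical and uses illustrative frequencies; it returns the apparent frequency of a component after sampling and reduces to fs − f for components lying between fs/2 and fs.

```python
def alias_frequency(f, fs):
    """Apparent frequency (Hz) of a component at f (Hz) after sampling at fs (Hz)."""
    f_mod = f % fs                  # sampled spectra repeat every fs
    return min(f_mod, fs - f_mod)   # fold into the band 0 .. fs/2

fs = 10_000.0
for f in (3_000.0, 7_000.0, 13_000.0):
    print(f"{f:8.0f} Hz appears at {alias_frequency(f, fs):6.0f} Hz")
```

All three components are indistinguishable once sampled, which is why the suppression must take place in the analog domain before the A/D conversion.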

2. Spectral Analysis

The role of the filter here is to permit observation of a narrow portion of a wideband spectrum. The selected band is specified by a center frequency f0 and a bandwidth B, defined as the frequency interval about f0 where the output/input ratio remains within 3 dB of that at the center. The bandwidth of the filter may be constant (i.e., independent of f0) or a constant percentage of f0.

The constant-bandwidth filter is advantageous in cases where the measured spectrum is rich in detail over a limited frequency range, for example, where a series of harmonics appears as the result of nonlinear distortion or where a number of sharp resonances are generated from a complex sound source. The constant-percentage-bandwidth filter is more appropriate in cases where the measured spectrum encompasses a large number of decades, say two or more; where the source is unstable, constantly shifting its prominent frequencies; or where the power transmitted over a band of frequencies is of interest, as in noise control engineering.

Popular choices for the constant-percentage bandwidth are the octave (factor of 2), 1/3, 1/6, 1/12, and 1/24 octave. The bandwidth of a 1/3-octave filter, for example, is 2^(1/6) f0 − 2^(−1/6) f0 = 0.231 f0. The 1/3-octave filter, in fact, is the most widely used in acoustic spectral analysis. The reason is rooted in a property of human auditory response. Consider an experiment in which a human subject is exposed to a 60-dB narrow-band tone at 10 kHz. If the amplitude and center frequency of the tone remain fixed but the bandwidth increases, the subject will perceive no change in loudness until the bandwidth reaches 2.3 kHz, and then the loudness begins to increase. This is called the critical bandwidth and has a value of ∼1/3 octave. If the test is repeated at other, sufficiently high center frequencies, the resulting critical bandwidth remains at about 1/3 octave. For sound measurements geared to human response, then, a narrower bandwidth does not influence loudness and a greater bandwidth yields a false measurement of loudness; hence, the choice of 1/3-octave spectral resolution. A list of preferred 1/3-octave center frequencies is given in Table IV. The audible spectrum, 20 Hz to 20 kHz, encompasses thirty-one 1/3-octave bands.

TABLE IV Preferred 1/3-Octave Center Frequenciesᵃ

16     20     25     31.5   40     50
63     80     100    125    160    200

ᵃ In hertz (also ×10 or ×100).
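The fractional-octave arithmetic quoted above is easy to verify. The following sketch uses a hypothetical helper name and an arbitrary 1-kHz center frequency; it computes the band edges and bandwidth of a 1/n-octave filter and counts the 1/3-octave bands spanning the audible range.

```python
import math

def fractional_octave_band(f0, n):
    """Lower edge, upper edge, and bandwidth (Hz) of a 1/n-octave band centered on f0."""
    half_step = 2.0 ** (1.0 / (2.0 * n))   # edges lie half a band either side of f0
    lower, upper = f0 / half_step, f0 * half_step
    return lower, upper, upper - lower

for n in (1, 3, 6, 12, 24):
    lower, upper, bw = fractional_octave_band(1000.0, n)
    print(f"1/{n:<2d} octave: {lower:7.1f} .. {upper:7.1f} Hz, B = {bw / 1000.0:.3f} f0")

# 20 Hz to 20 kHz is three decades, i.e. about 30 third-octave steps, hence 31 bands
n_bands = round(3 * math.log2(20000.0 / 20.0)) + 1
print(n_bands)
```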

3. Spectral Shaping

The perceived loudness of a tone of constant amplitude is a strong function of frequency and amplitude. Many acoustic instruments feature not only a linear response, an objective measurement of sound pressure, but also a weighted response, which conforms to the frequency response of the human ear. The function of a weighting filter is to shape an acoustic spectrum to match the response of the ear. Three standard frequency response curves, called A, B, and C curves, conform to equal loudness curves at 40, 70, and 100 phons, respectively. A phon is a unit of loudness, usually specified in decibels; it is the same as the SPL at 1 kHz but differs at most other frequencies. The D weight has been proposed for applications involving aircraft noise measurement. The filter response curves for the A, B, C, and D weighting are shown in Fig. 17.

B. Spectrum Analyzers

A spectrum analyzer enables an observer to view the frequency spectrum of an acoustic time history on an output device such as a television monitor, chart recorder, or digital printer. A real-time analyzer produces a complete, continuously updated spectrum without interruption. The first real-time analyzers were analog in nature, based on either of two principles: (1) time compression, which used a frequency transformation to speed up processing time, or (2) a parallel bank of analog filters and detectors. The advent of VLSI (very large-scale integration) in the semiconductor industry made the all-digital, real-time analyzer a reality, offering competitive cost and enhanced stability, linearity, and flexibility.

FIGURE 17 Response curves of A, B, C, and D weighting filters.


The spectrum analyzer performs the basic functions of preamplification, analog filtering, detection, analog-to-digital (A/D) conversion, logic control, computation, and output presentation. The frequency range usually covers the audio band but may exceed it at both ends. Digital real-time analyzers operate on either of two principles: the digital filter or the fast Fourier transform (FFT).

1. Digital Filter

The transfer function of a two-pole analog filter is written:

$$ H(s) = \frac{(s + r_1)(s + r_2)}{(s + p_1)(s + p_2)} \tag{51} $$

where s is the Laplace operator, r1,2 the zeros, and p1,2 the poles. The filter characteristics (gain and cutoff frequencies) are fixed and can be changed only by changing the components making up the filter. The frequency response can be found by replacing s by jω.

The digital filter accepts samples f(nTs) of the time history from an A/D converter, where Ts is the sample interval, and yields an output in the form of a sequence of numbers. The transfer function is represented in the z domain:

$$ H(z) = \frac{A_0 + A_1 z^{-1} + A_2 z^{-2}}{1 - B_1 z^{-1} - B_2 z^{-2}} \tag{52} $$

where z⁻¹ = exp(−sTs) is called the unit delay operator, since multiplication by z⁻¹ is equivalent to delaying the sequence by one sample number. Synthesis of H(z) requires a system that performs the basic operations of multiplying, summing, and delaying. Noteworthy is the fact that once the filter characteristics are set by choice of coefficients A0 . . . B2, the frequency response parameters (center frequency f0 and bandwidth B for a bandpass filter) are controlled by the sample rate fs = 1/Ts. For example, doubling fs doubles f0 and B. Thus, the digital filter is a constant-percentage-bandwidth filter and is appropriate for those applications where such is required (Section II.A.2). Typically, the filters are six-pole Butterworth or Chebyshev filters of 1/3-octave bandwidth. Several two-pole filters can be cascaded to produce filters with more poles, or the data can be recirculated through the same filter several times.
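The scaling of the response with sample rate can be demonstrated with a short sketch of Eq. (52). The coefficient values below are hypothetical (a two-pole resonator with poles at radius 0.95 and angle 0.2π), not those of any particular analyzer, and the function names are invented for illustration. The first function evaluates H(z) on the unit circle; the second applies the corresponding difference equation using only multiplication, summation, and delay.

```python
import numpy as np

def biquad_response(a, b, f, fs):
    """Evaluate Eq. (52) at z = exp(j 2 pi f / fs); a = (A0, A1, A2), b = (B1, B2)."""
    z1 = np.exp(-2j * np.pi * np.asarray(f, dtype=float) / fs)   # z^-1
    return (a[0] + a[1] * z1 + a[2] * z1**2) / (1.0 - b[0] * z1 - b[1] * z1**2)

def biquad_filter(a, b, x):
    """y[n] = A0 x[n] + A1 x[n-1] + A2 x[n-2] + B1 y[n-1] + B2 y[n-2]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y[n] = a[0] * x[n] + a[1] * x1 + a[2] * x2 + b[0] * y1 + b[1] * y2
    return y

# Hypothetical bandpass coefficients: poles at r*exp(+/- j*theta), zeros at z = +/-1
r, theta = 0.95, 2 * np.pi * 0.1
a = (1.0, 0.0, -1.0)
b = (2 * r * np.cos(theta), -r**2)

f = np.linspace(0.0, 4000.0, 9)
H_slow = biquad_response(a, b, f, fs=10_000.0)        # passband near 1 kHz
H_fast = biquad_response(a, b, 2 * f, fs=20_000.0)    # same coefficients, doubled fs
print(np.allclose(H_slow, H_fast))
```

With the coefficients held fixed, the response depends only on the ratio f/fs, so the whole curve stretches in proportion to the sample rate and the ratio B/f0 is unchanged.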

2. Fast Fourier Transform

First consider the discrete Fourier transform (DFT), the digital version of Eq. (49),

$$ F(k) = \frac{1}{N} \sum_{n=0}^{N-1} f(n) \exp(-j 2\pi k n / N) \tag{53} $$

where f(n) is the value of the nth time sample, k the frequency component number, and N the block size, or number of time samples. The time resolution depends on the time window, Δt = T/N, and the frequency resolution depends on the sampling frequency, Δf = fmax/N. Obviously, the filter is a constant-bandwidth filter and again is suited to the appropriate applications (Section II.A.2). In contrast to the digital filtering technique, the data throughput is not continuous but is segmented into data blocks. Thus, for real-time analysis, the analyzer must be capable of processing one block of data while simultaneously acquiring a new block.

The FFT exploits the symmetry properties of the DFT to reduce the number of computations. The DFT requires N² multiplications to transform a data block of N samples from the time domain to the frequency domain; the FFT requires only N log₂ N multiplications. For a block size of N = 1024 samples, the reduction is over a factor of 100.
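The equivalence of the two computations, and the operation count quoted above, can be checked with a brief sketch. The direct transform follows Eq. (53) literally as an N × N matrix product; NumPy's `np.fft.fft` omits the 1/N factor, so it is applied explicitly. The block length of 256 and the random test signal are arbitrary choices.

```python
import numpy as np

def direct_dft(f):
    """Eq. (53) evaluated literally: N*N complex multiplications."""
    N = len(f)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # matrix of exp(-j 2 pi k n / N)
    return W @ f / N

rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
F_direct = direct_dft(signal)
F_fft = np.fft.fft(signal) / len(signal)   # np.fft.fft leaves out the 1/N of Eq. (53)
print(np.allclose(F_direct, F_fft))

N = 1024
reduction = N**2 / (N * np.log2(N))        # ratio of multiplication counts
print(reduction)
```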

The DFT of Eq. (53) differs from the continuous Fourier transform in three ways, each presenting a data-processing problem that must be addressed by man/machine. First, the transformed function is a sampled time history. The sampling frequency must be at least twice the highest frequency present in the signal, as explained in Section II.A.1. In fact, it is beneficial to choose an even higher sampling frequency. For example, in a six-pole low-pass filter, the signal is down ∼18 dB at 1/2 octave past the cutoff frequency fc, that is, at √2 fc. If the sampling frequency is only fs = 2fc, a strong component at this frequency will “fold over” as a component of frequency fs − √2 fc ≈ 0.6 fc, attenuated only 18 dB, and may have a level comparable to the true signal. Increasing the sampling frequency to fs = 2.5 fc will relieve the problem in this case, since the folded component then falls above fc.

Second, the filter time window yields the well-known sin x/x transform. In the frequency domain, the window spectrum is convolved with the signal spectrum and introduces ripples in the latter. The sidelobes of the sin x/x spectrum introduce leakage of power from a spectral component to its neighbors. A countermeasure to this effect is to use a Hanning window, a weighting time function that is maximum at the center of the window and zero at its edges. The Hanning window improves the sidelobe suppression at the expense of increased bandwidth. However, the Hanning window may not be needed if the signal is small at the edges of the window.

Finally, the digitally computed transform of the sampled time history must itself be presented as a sampled frequency spectrum. This fact is responsible for the so-called picket fence effect, whereby we do not observe the complete spectrum but only samples. Thus, we may miss a sharp peak and observe only the slopes. A Hanning window also helps to compensate for this effect.
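The leakage effect and its suppression can be seen in a small numerical experiment. This sketch assumes an arbitrary block of 1024 samples and a tone placed midway between two frequency bins (the worst case for both leakage and the picket fence effect); `np.hanning` supplies the window.

```python
import numpy as np

N = 1024
n = np.arange(N)
tone = np.cos(2 * np.pi * 100.5 * n / N)   # frequency midway between bins 100 and 101

def spectrum_db(x, window):
    """Magnitude spectrum of the windowed block, in dB relative to its own peak."""
    mag = np.abs(np.fft.rfft(x * window))
    return 20 * np.log10(mag / mag.max())

rect_db = spectrum_db(tone, np.ones(N))       # plain time window (sin x/x leakage)
hann_db = spectrum_db(tone, np.hanning(N))    # Hanning-weighted window

# Leakage level about 50 bins away from the tone
print(f"rectangular: {rect_db[150]:.1f} dB, Hanning: {hann_db[150]:.1f} dB")
```

The price of the lower sidelobes is a broader main lobe, so two closely spaced tones are harder to separate with the weighted window.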

Examples of acoustic signals necessitating analysis in real time are signals in the form of a sequence of transients,


as speech; aircraft flyover noise, as required by the Federal Aviation Administration (FAA); and measurements where the analyzer is an element in a control loop. For other types of signals, such as stationary or quasi-stationary signals, or transients shorter than the time window, the time history can be stored and analyzed at a later time.

3. Correlation

Many spectrum analyzers provide the capability of computing the autocorrelation and cross-correlation functions and their Fourier transforms, namely, the spectral and cross-spectral density functions. These operations are used to compare the data at one test station with that at another station. The cross-correlation of the time-varying functions f1(t) and f2(t) is expressed in terms of a time delay τ:

$$ g_{12}(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T f_1(t)\, f_2(t + \tau)\, dt \tag{54} $$

The Fourier transform of this function is the cross-power spectral density function:

$$ G_{12}(f) = \int_{-\infty}^{\infty} g_{12}(\tau) \exp(-j 2\pi f \tau)\, d\tau \tag{55} $$

If f1(t) and f2(t) are the same signal, say at station 1, then Eqs. (54) and (55) yield g11(τ) and G11(f), the autocorrelation function and the spectral power density function.

Two important acoustic applications of the cross-functions are transfer function determination and time delay estimation. Let us consider the transfer function. Suppose a noise or vibration source produces responses at two stations f1(t) and f2(t), having Fourier transforms F1(f) and F2(f). The transfer function H12(f) = F2(f)/F1(f) is related to the power spectra as follows:

$$ H_{12}(f) = G_{12}(f) / G_{11}(f) \tag{56} $$

Thus, Eq. (56) permits the determination of H12(f), while the source is operating in its natural condition.
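A minimal simulation of Eq. (56) follows. Everything specific in it is an assumption made for illustration: the "true" transfer function is a hypothetical first-order low-pass, the source is white noise, station 2 carries additional uncorrelated noise, and the spectral densities are estimated by averaging over data blocks, with G12 formed as the conjugate of F1 times F2 so that it matches the sign convention of Eqs. (54) and (55).

```python
import numpy as np

rng = np.random.default_rng(1)
n_blocks, N = 200, 256
freqs = np.fft.rfftfreq(N, d=1.0)            # cycles per sample
H_true = 1.0 / (1.0 + 1j * freqs / 0.1)      # hypothetical first-order low-pass

G11 = np.zeros(len(freqs))
G12 = np.zeros(len(freqs), dtype=complex)
for _ in range(n_blocks):
    F1 = np.fft.rfft(rng.standard_normal(N))          # source seen at station 1
    noise = np.fft.rfft(0.1 * rng.standard_normal(N)) # uncorrelated noise at station 2
    F2 = H_true * F1 + noise
    G11 += np.abs(F1) ** 2                            # auto-spectrum (unnormalized)
    G12 += np.conj(F1) * F2                           # cross-spectrum (unnormalized)

H_est = G12 / G11                                     # Eq. (56); normalization cancels
print(np.max(np.abs(H_est - H_true)))
```

Because the added noise is uncorrelated with the source, its contribution to G12 averages toward zero, and the estimate converges on the true transfer function as more blocks are accumulated.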

Now consider time delay estimation. Suppose an acoustic signal propagates from station 1 to station 2 in time τ0. Then g12(τ) will show a peak at τ = τ0, and G12(f) will have a phase angle φ12 = 2πf τ0. If the signal is a pure tone, say a cosine wave, then g12(τ) will also be a cosine wave of the same frequency but shifted by φ12; that is, the maximum will be displaced by an angle φ12. If the time delay τ0 exceeds the period 1/f of the wave, then g12(τ) will reveal two maxima and thus a twofold ambiguity in τ0. Consequently, the maximum delay that can be uniquely determined is τmax < 1/f. If the signal is a mixture of two tones of frequencies f1 and f2, then the maximum delay will be determined by the beat frequency, τmax < (f2 − f1)⁻¹. Formal analysis leads to the criterion:

$$ \tau_{\max} < 0.3 / (f_2 - f_1) \tag{57} $$

where f2 − f1 is the bandwidth of the signal. In these two cases, the cross-spectral density is strongly peaked at a few prominent frequencies. If, on the other hand, the time signal is strongly peaked as in the case of a narrow pulse, comprising a broad spectrum of frequencies, there is a criterion on minimum system bandwidth to measure a given delay similar to Eq. (57), with the inequality reversed.
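Time delay estimation from the peak of the cross-correlation can be sketched as follows. The example is idealized: the signal is broadband noise, the delay is an exact number of samples, and a circular shift stands in for propagation so that the FFT-based (circular) correlation is exact. The sample rate and delay are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 10_000.0                      # sample rate in Hz (illustrative)
N = 4096
true_delay_samples = 37

f1 = rng.standard_normal(N)                 # broadband signal at station 1
f2 = np.roll(f1, true_delay_samples)        # same signal arriving later at station 2

# Circular cross-correlation g12(tau) = sum over t of f1(t) f2(t + tau), via the FFT
g12 = np.fft.irfft(np.conj(np.fft.rfft(f1)) * np.fft.rfft(f2), n=N)

lag = int(np.argmax(g12))
tau0 = lag / fs
print(f"estimated delay: {lag} samples = {tau0 * 1e3:.2f} ms")
```

Replacing the noise with a pure tone makes g12 periodic in τ, so the peak repeats every period and the delay can no longer be identified uniquely, as discussed above.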

The autocorrelation function reveals the presence of periodic signals in the presence of noise.

An important function in acoustic signal processing isthe coherence function,

$$ C_{12}(f) = \frac{|G_{12}(f)|^2}{G_{11}(f)\, G_{22}(f)} \tag{58} $$

which has a value between 0 and 1. This function serves as a criterion as to whether the signals received at stations 1 and 2 have the same cause. It should have a reasonably high value even in measurement systems subject to noise and random events.
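The behavior of the coherence function can be illustrated with synthetic data. In this sketch the spectral densities of Eq. (58) are estimated by averaging over 100 blocks (with a single block the estimate is identically 1, so averaging is essential); the signals, noise level, and function name are assumptions chosen for demonstration.

```python
import numpy as np

def coherence(blocks1, blocks2):
    """Eq. (58) with spectral densities averaged over the rows (data blocks)."""
    F1 = np.fft.rfft(blocks1, axis=1)
    F2 = np.fft.rfft(blocks2, axis=1)
    G11 = np.mean(np.abs(F1) ** 2, axis=0)
    G22 = np.mean(np.abs(F2) ** 2, axis=0)
    G12 = np.mean(np.conj(F1) * F2, axis=0)
    return np.abs(G12) ** 2 / (G11 * G22)

rng = np.random.default_rng(3)
source = rng.standard_normal((100, 256))
related = 0.5 * source + 0.1 * rng.standard_normal((100, 256))   # same cause plus noise
unrelated = rng.standard_normal((100, 256))                      # independent signal

c_related = coherence(source, related)
c_unrelated = coherence(source, unrelated)
print(c_related.mean(), c_unrelated.mean())
```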

C. Sound Level Meters

A sound level meter is a compact portable instrument, usually battery-operated, for measuring SPL at a selected location. The microphone signal is preamplified (attenuated), weighted, again amplified (attenuated), detected, and displayed on an analog meter. The detector is a square-law detector followed by an averaging (mean or rms) network. There are a variety of additional features such as calibration, overload indication, and external connectors for filters and output signal.

The directional response of the microphone affects the accuracy of the measurement. In a free field, corrections are based on curves such as those in Fig. 7 if the angle of incidence is known. In a diffuse field, the random response curve must be relied on: The smaller the microphone, the more accurate are the results.

Two switch selections available to the user are weighting and time constant. The weighting networks are linear (unweighted), A, B, C, and sometimes D (Section II.A.3). For stationary or quasi-stationary signals, a “fast” or “slow” time constant, based on the response to a 200- or 500-msec signal, respectively, is used. The fast response follows time-varying sound pressures more closely at the expense of accuracy; the slow response offers a higher confidence level for the rms sound pressure measurement. Impulsive signals present something of a problem. Current standards specify a time constant of 35 msec, in an attempt to simulate the response of the human ear, plus the capability of storing the peak or rms value of the applied signal. To prevent saturation resulting from high peak amplitudes, the detector circuit must be capable of sustaining a crest factor, the ratio of peak to rms signal, of at least 5.


D. Storage of Acoustical Data

Up to the mid-1970s the workhorse of acoustical data storage was the magnetic tape recorder in both AM and FM versions. The major limitation was the limited dynamic range, amounting to less than 40 dB for AM tape and 50–55 dB for FM tape. This was followed by 7- to 9-track digital tape, which improved the dynamic range but in the 1980s yielded to VHS (video home systems) cassettes having greater storage density. Typical specifications for VHS cassette recorders, which are still on the market today, are 70 dB dynamic range, dc to 80-kHz frequency response, and recording time ranging from 50 min to 426.7 hr at sample rates of 1280 and 2.5 thousand samples per second, respectively. With the explosive development of personal computers, the development of digital storage systems has proceeded at a comparable pace. These are classified as either random-access or sequential-access devices.

Random-access devices include hard drives, CD (compact disc) writers, and DVD (digital versatile disc) RAM (random-access memory) devices. The hard drive typically has a storage capacity of 20 gigabytes (GB) and a data transfer rate of over 10 megabytes (MB) per second. Traditional hard drives are not meant for archiving data nor for removal from one system to another. Now more options with removable hard-disk systems are available, such as the “Jaz” and “Orb,” which have the disk in a removable cartridge. These cartridge-based hard drives have capacities of up to 2 GB and sustained data rates of over 8 MB/sec. The removable DVD-RAM has shown capacities of 5.2 GB and transfer rates of up to 1 MB per second.

While sequential access times are significantly greater than those of random-access devices, sequential access provides the highest storage capacities (up to 50 GB per tape) and very high sustained data transfer rates of over 6 MB/sec. Advanced intelligent tape carries an electronic memory device on each tape that speeds up the search process. In addition, 8-mm, digital linear tape, and 4-mm tapes are among forms of storage that allow up to 20 terabytes of information to be stored and accessed in a cost-effective manner.

High-quality digital storage devices conform to the Small Computer Systems Interface (SCSI) standard. The advantages are far-reaching. The conforming devices (including those mentioned above) are easily upgraded, mutually compatible, and interchangeable from one system to another. A single SCSI controller can control up to 15 independent SCSI devices.

An option available to users of digital storage devices is data compression, whereby data density is compressed by a two-to-one ratio. Most compression schemes are very robust and, combined with error detection and correction, produce error rates on the order of 10⁻¹⁵.

E. The Computer as an Instrument in Acoustical Measurements

The integration of a digital computer into an acoustic measurement system offers many practical advantages in addition to improved specifications regarding dynamic range, data storage density, flexibility, and cost effectiveness. Many acoustic measurements require inordinately complex evaluation procedures. The capability of performing an on-line evaluation during a test provides the user with an immediate readout of the evaluated data; this may aid in the making of decisions regarding further data acquisition and the choice of test parameters. The decision-making procedure can even be automated. The digital data can readily be telecommunicated over ordinary telephone lines. Most digital systems accommodate a great variety of peripheral equipment.

Figure 18 shows examples of a computer integrated into an acoustical measurement system:

1. Active noise cancellation (Fig. 18a). The computer implements real-time digital filters 1 and 2, which serve as adaptive controllers to produce the required responses of noise-cancelling speakers 1 and 2.

2. Spatial transformation of sound fields (Fig. 18b). A cross spectrum analyzer yields a cross spectral representation of a sound field, based on acoustical measurements over a selected scan plane; then, a computer predicts the near field from the scan data using near field acoustic holography and the far field from the Helmholtz integral equation.

3. Computer-steered microphone arrays (Fig. 18c). In a large room, such as an auditorium or conference hall, the computer introduces a preprogrammed time delay in each microphone of a rectangular array, thus steering the array to the direction of high selectivity; coordinating more than one array, it controls the location to which the reception is especially sensitive.

III. EXAMPLES OF ACOUSTICAL MEASUREMENTS

A. Measurement of Reverberation Time

Reverberation time (RT) is the time required for the sound in a room to decay over a specific dynamic range, usually taken to be 60 dB, when a source is suddenly interrupted. The Sabine formula relates the RT to the properties of the room:

$$ T = 0.161\, V / (\alpha S) \tag{59} $$

where V is the volume of the room, S the area of its surfaces, and α the absorption coefficient due to losses


FIGURE 18 Measurement systems using computers. (a) Active noise cancellation. (b) Spatial transformation of sound fields. (c) Computer-steered microphone arrays. (Courtesy of NASA, B&K Instruments, and J. Acoust. Soc. Am.)

in the air and at the surfaces. Recommended values for a 500-Hz tone in a 1000-m³ room are about 1.6 sec for a church, 1.2 sec for a concert hall, 1.0 sec for a broadcasting studio, and 0.8 sec for a motion picture theater, the values increasing slightly with room size. The room constant R, appearing in Eq. (32), is related to α through:

$$ R = S\alpha / (1 - \alpha) \tag{60} $$
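Equations (59) and (60) lend themselves to a quick worked example. The sketch below assumes SI units (the constant 0.161 carries units of sec/m) and a hypothetical cubic room of 1000 m³ with 600 m² of surface; the function names are illustrative. It inverts the Sabine formula to find the absorption coefficient that gives the 1.2-sec value quoted for a concert hall.

```python
def sabine_rt(volume_m3, surface_m2, alpha):
    """Eq. (59): reverberation time in seconds (SI units assumed)."""
    return 0.161 * volume_m3 / (alpha * surface_m2)

def alpha_from_rt(volume_m3, surface_m2, rt_s):
    """Eq. (59) solved for the absorption coefficient."""
    return 0.161 * volume_m3 / (rt_s * surface_m2)

def room_constant(surface_m2, alpha):
    """Eq. (60): room constant R in square meters."""
    return surface_m2 * alpha / (1.0 - alpha)

V, S = 1000.0, 600.0                  # hypothetical 10 m x 10 m x 10 m room
alpha = alpha_from_rt(V, S, 1.2)      # target: 1.2 sec
R = room_constant(S, alpha)
print(f"alpha = {alpha:.3f}, RT = {sabine_rt(V, S, alpha):.2f} sec, R = {R:.1f} m^2")
```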

A typical measuring arrangement is shown in Fig. 19a. A sound source is placed at a propitious location and the response is averaged over several microphone locations about the room.

If the source is excited into a pure tone, the measurement is beset with two basic difficulties. The act of switching generates additional tones, which establish beat frequencies and irregularities on the decay curves; furthermore, the excitation of room resonances can produce a break in the slope of the decay curve (Fig. 19b). The smoothness of the decay curve can be improved by widening the bandwidth of the source. Three types of excitation are used for this purpose: random noise, an impulse, or a warble tone, in which the center frequency is FM-modulated. The 1/3-octave analyzer performs two functions: It permits the frequency dependence of the RT to be determined, and it provides a logarithmic output to linearize the free decay curve. The output device can be a recorder (logarithmic if the analyzer provides a linear output) or a digital data acquisition system. The microphones can be multiplexed or measured individually. A typical decay curve obtained by this method is shown in Fig. 19c. Because many

FIGURE 19 Measurement of reverberation time. (a) Experimental arrangement showing microphones (circles) positioned at suitable locations about the room. (b) Response curve showing a break in slope due to simultaneous room resonances. (c) Response curve showing unambiguous reverberation.


measurements are averaged to enhance confidence level, the method is time-consuming. If 20 averages are taken over each 1/3-octave band from 125 Hz to 10 kHz, then 400 decay curves would have to be evaluated.

B. Measurement of Impulsive Noises

The measurement of noise from impulsive sources such as gunshots, explosives, punch presses, and impact hammers, as well as short transients in general, requires considerable care on the part of the observer. Sometimes these sources are under the observer’s control, but on some occasions their occurrence is unpredictable, often affording but a single opportunity to make the measurement. By nature such sources are of large amplitude and short duration, requiring instruments capable of handling high crest factors and extended frequency content.

Measurement of the peak pressure does not give information on duration. Of greater interest is the measurement of rms pressure, from which loudness and energy content can be inferred. Such a measurement can be made with simple analog equipment, such as an impulse sound level meter. The pressure signal is squared and time-averaged, the square root extracted, and the result presented on a meter. The averaging time will affect the measurement. By convention, an averaging time constant of 35 msec is recommended in an effort to simulate the response of the human ear.
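The square, average, and square-root chain described above can be sketched digitally. The example below is only an illustration: it assumes a one-pole exponential average as the time-averaging stage and a 48-kHz sampling rate, neither of which is specified in the text, and the function name is hypothetical.

```python
import math

def running_rms(samples, fs, tau=0.035):
    """Square, exponentially time-average, and take the square root."""
    alpha = 1.0 - math.exp(-1.0 / (fs * tau))   # one-pole smoothing factor
    mean_square = 0.0
    out = []
    for x in samples:
        mean_square += alpha * (x * x - mean_square)
        out.append(math.sqrt(mean_square))
    return out

fs = 48_000                                     # assumed sampling rate, Hz
n_on = fs * 5 // 1000                           # 5-msec pulse of unit amplitude
n_off = fs * 50 // 1000                         # followed by 50 msec of silence
pulse = [1.0] * n_on + [0.0] * n_off
reading = running_rms(pulse, fs)
print(max(reading))
```

Because the pulse is much shorter than the 35-msec time constant, the largest reading stays well below the pulse amplitude of 1, which illustrates how the averaging time affects the measurement.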

A description of an impulsive source in the frequency domain has several advantages. First, sources can be identified by their characteristic spectral signatures. Second, those components bearing a large amount of energy can be identified, as for noise control purposes. Finally, the response of an acoustic device to the signal is more readily analyzed in the frequency domain than in the time domain. Consider the Fourier spectrum of a pulse of constant amplitude A and duration T, shown in Fig. 20a:

$$F(f) = AT\,\frac{\sin(\pi f T)}{\pi f T} \tag{61}$$
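As a quick numerical illustration of Eq. (61), the sketch below evaluates the spectrum of a rectangular pulse at a few frequencies. The amplitude and duration are assumed example values, and the function name is illustrative.

```python
import math

def pulse_spectrum(f, A, T):
    """Eq. (61): F(f) = A*T*sin(pi*f*T)/(pi*f*T), with F(0) = A*T."""
    x = math.pi * f * T
    return A * T if x == 0.0 else A * T * math.sin(x) / x

A, T = 1.0, 1e-3            # assumed example: unit amplitude, 1-msec pulse
for f in (0.0, 500.0, 1000.0, 1500.0):
    print(f, pulse_spectrum(f, A, T))
```

The spectrum equals AT at zero frequency and passes through its first null at f = 1/T (1000 Hz for these values), which sets the width of the main lobe.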

Suppose the pulse is applied to an ideal, unity-gain filter of bandwidth B, center frequency f0, and phase slope tL = dφ/dω. The spectrum of the pulse and transfer function of the filter are shown in Fig. 20b. The filter output will exhibit a characteristic ringing response; if T ≪ 1/B, this can be approximated as:

$$\upsilon_0(t) \approx 2ABT\,\frac{\sin(\pi f_0 T)}{\pi f_0 T} \times \frac{\sin \pi B(t - t_L)}{\pi B(t - t_L)}\,\cos(2\pi f_0 t) \tag{62}$$

shown in Fig. 20c. The spectral component F(f0) is intimately related both to the peak response of the envelope, occurring at t = tL, and to the integrated-squared response:

$$\upsilon_{0,\mathrm{peak}} \approx 2ABT\,\frac{\sin(\pi f_0 T)}{\pi f_0 T} = 2B\,F(f_0) \tag{63}$$

$$E = \int_{-\infty}^{\infty} \upsilon_0^2(t)\,dt \approx 2A^2 B T^2 \left[\frac{\sin(\pi f_0 T)}{\pi f_0 T}\right]^2 = 2B\,F^2(f_0) \tag{64}$$

FIGURE 20 Measurement of impulsive noise. (a) Time history of a single pulse and (b) its amplitude–frequency spectrum together with that of an ideal narrow-band filter. (c) Time history of the filter response to the single pulse. (d) Reconstruction of the pulse spectrum from the outputs of several adjacent narrowband filters. (e) Time history of a periodic sequence of pulses and (f) its Fourier series amplitude spectrum (with envelope).

Thus, the Fourier spectrum can be reconstructed from the measurements corresponding to Eq. (63) or (64), using narrow-band filters of different center frequencies (Fig. 20d). If the condition T ≪ 1/B is not fulfilled, the filter response shows two bursts, each similar to that of Fig. 20c and separated by the pulse duration T, and Eqs. (63) and (64) are no longer valid.
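The role of the condition T ≪ 1/B can be illustrated numerically. The sketch below makes simplifying assumptions that are not in the text: the pulse is centered on t = 0 so that its spectrum is real, and the ideal filter has zero phase slope, so that the output at the envelope center is twice the integral of F(f) over the passband. This value is compared with the approximation 2B F(f0) of Eq. (63). All numerical values and function names are illustrative.

```python
import math

def pulse_spectrum(f, A, T):
    """Eq. (61) for a rectangular pulse of amplitude A and duration T."""
    x = math.pi * f * T
    return A * T if x == 0.0 else A * T * math.sin(x) / x

def filter_output_at_centre(A, T, f0, B, steps=2000):
    """2 * integral of F(f) over the passband (midpoint rule).

    Assumes a pulse centred at t = 0 and a zero-phase ideal filter,
    so every passband component adds in phase at t = 0.
    """
    df = B / steps
    total = 0.0
    for i in range(steps):
        f = f0 - B / 2 + (i + 0.5) * df
        total += pulse_spectrum(f, A, T) * df
    return 2.0 * total

A, T, f0 = 1.0, 1e-3, 400.0                 # assumed example values
for B in (20.0, 100.0, 800.0):              # T*B = 0.02, 0.1, 0.8
    exact = filter_output_at_centre(A, T, f0, B)
    approx = 2.0 * B * pulse_spectrum(f0, A, T)      # Eq. (63)
    print(B, exact, approx, abs(exact - approx) / abs(exact))
```

For the narrow bandwidths the two values agree closely; as TB approaches 1 the spectrum is no longer nearly constant across the passband and the approximation degrades.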

If the impulsive noise is repetitive or if a single pulse can be reproduced repetitively, the pulse sequence can be represented as a Fourier series. The Fourier coefficients Fn are given by Eq. (61) if f is replaced by n/Tr, where Tr is the pulse repetition interval. The time history and Fourier spectrum are shown in Figs. 20e,f. The number of components per spectral lobe depends on the ratio T/Tr. Too large a ratio will yield too few components for accurate representation of the spectrum, but too small a ratio must be avoided due to crest factor limitations in the analyzing equipment. A reasonable compromise is a ratio between 0.2 and 0.5. The reconstruction of short-transient spectra is a fine application of constant-bandwidth filtering.
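The dependence of the number of spectral lines on T/Tr can be made concrete with a short sketch. It samples Eq. (61) at the harmonics n/Tr, as the text describes (any overall scaling of the coefficients is left aside), and counts the harmonics that fall below the first null at f = 1/T. The function names and example values are illustrative.

```python
import math

def line_spectrum(A, T, Tr, n_max):
    """Eq. (61) sampled at the harmonics f = n/Tr, n = 1 ... n_max."""
    lines = []
    for n in range(1, n_max + 1):
        x = math.pi * n * T / Tr
        lines.append(A * T * math.sin(x) / x)
    return lines

def lines_in_main_lobe(T, Tr):
    """Number of harmonics n/Tr lying strictly below the first null at 1/T."""
    return math.ceil(Tr / T) - 1

# Durations in msec (assumed example values): T/Tr = 0.2 and T/Tr = 0.5
print(lines_in_main_lobe(2.0, 10.0), lines_in_main_lobe(5.0, 10.0))
```

With T/Tr = 0.2 four lines fall inside the main lobe, whereas T/Tr = 0.5 leaves only one, which is why larger ratios represent the spectrum poorly.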

Some final notes are pertinent to the measurement of transients:

1. The principles discussed here for the constant-amplitude pulse apply to short pulses of other shapes. The features of the spectrum of Fig. 20b are generally retained. In the case of a tone burst, the main lobe is displaced from the origin.

2. If the occurrence of the transient event cannot be predicted, a digital event recorder will prove useful. A time history is continuously sampled and transferred to a buffer. The buffer content is transferred to a storage register only when the signal exceeds a threshold. In this manner the pre- and postevent background signals are included on both sides of the event.

3. For long transients, such as sonic booms, a 1/3-octave real-time analyzer may prove advantageous because the condition T ≪ 1/B may be difficult to fulfill, and energy content may spread over a wide band of frequencies. The rms response,

$$\upsilon_{0,\mathrm{rms}} = \sqrt{E/\tau} \tag{65}$$

depends on the averaging (or integrating) time τ. For a fixed value of τ there will be a low-frequency rolloff due to the long response times, tL, of the filters, which causes part of the signal to be excluded from the averaging. At high frequencies some error will occur because of high crest factors.
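The event-recorder scheme in note 2 can be sketched with a circular buffer. This is a minimal illustration rather than a description of any particular instrument: the function name, the buffer lengths, and the use of a simple amplitude threshold are all assumptions.

```python
from collections import deque

def capture_event(samples, threshold, pre, post):
    """Return the first event with `pre` samples before and `post` after it."""
    history = deque(maxlen=pre)        # circular buffer of the latest samples
    stream = iter(samples)
    for x in stream:
        if abs(x) > threshold:
            record = list(history) + [x]          # pre-event background + trigger
            for _, y in zip(range(post), stream):
                record.append(y)                  # post-event samples
            return record
        history.append(x)
    return None                        # no sample exceeded the threshold

signal = [0.0, 0.1, -0.1, 0.0, 0.2, 5.0, 3.0, 1.0, 0.1, 0.0, 0.0]
print(capture_event(signal, threshold=1.0, pre=3, post=4))
```

Only the most recent `pre` samples are retained while waiting, so memory use stays fixed however long the recorder waits for the event.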

C. Measurement of Aircraft Noise

Aircraft noise measurements can be organized into two broad categories: aircraft noise monitoring and aircraft flyover testing, the latter for both engineering applications and noise certification.

1. Aircraft Noise Monitoring

Aircraft noise is measured routinely at numerous airports around the world to evaluate noise exposure in adjacent communities and to compare noise sources. The instruments are basically weather-protected sound level meters, covering an SPL range from about 60 to 120 dB at frequencies up to 10 kHz, and are generally installed near the airport boundaries.

2. Aircraft Flyover Testing for Certification

Federal Aviation Regulations, “Part 36, Noise Standards: Aircraft Type and Airworthiness Certification,” define instrumentation requirements and test procedures for aircraft noise certification. The instrumentation system consists of microphones and their mounting, recording and reproducing equipment, calibrators, analysis equipment, and attenuators.

For subsonic transports and turbojet-powered airplanes, microphones are located on the extended centerline of the runway, 6500 m from the start of takeoff or 2000 m from the threshold of approach, and on the sideline 450 m from the runway. The microphones are of the capacitive type, either pressure or free field, with a minimum frequency response from 44 to 11,200 Hz. If the wind exceeds 6 knots, a windscreen is used.

If the recording and reproducing instrument is a magnetic tape recorder, it has a minimum dynamic range of 45 dB (noise floor to 3% distortion level), with a standard reading level 10 dB below the maximum and a frequency response comparable to that of the microphone.

The analyzer is a 1/3-octave, real-time analyzer, having 24 bands in the frequency interval from 50 to 10,000 Hz. It has a minimum crest factor of 3, a minimum dynamic range of 60 dB, and a specified response time and provides an rms output from each filter every 500 msec.
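The band count and the overall frequency span of such an analyzer can be checked with a short sketch. It assumes the base-10 convention for 1/3-octave bands (center frequency 10^(n/10) Hz, band edges a factor 10^(1/20) below and above the center), which is not stated in the text; the function name is illustrative.

```python
def third_octave_band(n):
    """Lower edge, centre, and upper edge (Hz) of base-10 band number n."""
    centre = 10 ** (n / 10)
    half_width = 10 ** (1 / 20)        # half a band in the base-10 system
    return centre / half_width, centre, centre * half_width

bands = [third_octave_band(n) for n in range(17, 41)]   # nominal 50 Hz ... 10 kHz
lowest_edge = bands[0][0]
highest_edge = bands[-1][2]
print(len(bands), round(lowest_edge, 1), round(highest_edge))
```

Under this assumption the 24 bands extend from roughly 45 Hz to roughly 11.2 kHz, which is close to the minimum microphone frequency response quoted above.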

Field calibrations are performed immediately before and after each day’s testing. The microphone–preamplifier system is calibrated with a normal incidence pressure calibrator, the electronic system with “pink” noise (constant power in each 1/3-octave band), and the magnetic tape recorder with the aid of a pistonphone.

After the recorded data are corrected to reference atmospheric conditions and reference flight conditions, an effective perceived noise level, a measure of subjective response, is evaluated.

The noise spectrum from a noncertification flyover of a Boeing 747 aircraft is shown in Fig. 21. The aircraft had a speed of 130 m/sec, an altitude of 60 m, and a position directly over the microphone at the time the noise was recorded. The microphone was located 1.2 m above the ground, and the averaging time of the analyzer was 0.9 sec.

FIGURE 21 Noncertification flyover noise spectrum, in 1/3 octaves, of a Boeing 747 aircraft. The reference level of 0 dB is arbitrary. (Courtesy of NASA.)

SEE ALSO THE FOLLOWING ARTICLES

ACOUSTIC CHAOS • ACOUSTICS, LINEAR • ACOUSTIC WAVE DEVICES • ANALOG SIGNAL ELECTRONIC CIRCUITS • SIGNAL PROCESSING, ACOUSTIC • ULTRASONICS AND ACOUSTICS • UNDERWATER ACOUSTICS



Encyclopedia of Physical Science and Technology EN001E-09 May 25, 2001 16:16

Acoustics, Linear
Joshua E. Greenspon
J. G. Engineering Research Associates

I. Introduction
II. Physical Phenomena in Linear Acoustics
III. Basic Assumptions and Equations in Linear Acoustics
IV. Free Sound Propagation
V. Sound Propagation with Obstacles
VI. Free and Confined Waves
VII. Sound Radiation and Vibration
VIII. Coupling of Structure/Medium (Interactions)
IX. Random Linear Acoustics

GLOSSARY

Attenuation Reduction in amplitude of a wave as it travels.

Condensation Ratio of density change to static density.

Coupling Mutual interaction between two wave fields.

Diffraction Bending of waves around corners and over barriers.

Dispersion Dependence of velocity on frequency, manifested by distortion in the shape of a disturbance.

Elastic waves Traveling disturbances in solid materials.

Ergodic Statistical process in which each record is statistically equivalent to every other record. Ensemble averages over a large number of records at fixed times can be replaced by corresponding time averages on a single representative record.

Impedance Pressure per unit velocity.

Interaction Effect of two media on each other.

Medium Material through which a wave propagates.

Nondispersive medium Medium in which the velocity is independent of frequency and the shape of the disturbance remains undistorted.

Normal mode Shape function of a wave pattern in transmission.

Propagation Motion of a disturbance characteristic of radiation or other phenomena governed by wave equations.

Ray Line drawn along the path that the sound travels, perpendicular to the wave front.

Reflection Process of a disturbance bouncing off an obstacle.

Refraction Change in propagation direction of a wave with change in medium density.

Reverberation Wave pattern set up in an enclosed space.

Scattering Property of waves in which a sound pattern is formed around an obstacle enveloped by an incoming wave.

Sommerfeld radiation condition Equation stating that waves must go out from their source towards infinity and not come in from infinity.

Standing waves Stationary wave pattern.

Wave guide Structure or channel along which the wave is confined.

ACOUSTICS is the science of sound: its generation, transmission, reception, and effects. Linear acoustics is the study of the physical phenomena of sound in which the ratio of density change to static density is small, typically much less than 0.1. A sound wave is a disturbance that produces vibrations of the medium in which it propagates.

I. INTRODUCTION

A unified treatment of the principles of linear acoustics must begin with the well-known phenomena of single-frequency acoustics. A second essential topic is random linear acoustics, a relatively new field, which is given a tutorial treatment in the final section of this article.

The objective is to present the elementary principles of linear acoustics and then to use straightforward mathematical development to describe some advanced concepts. Section II gives a physical description of phenomena in acoustics. Section III starts with the difference between linear and nonlinear acoustics and leads to the derivation of the basic wave equation of linear acoustics. Section IV discusses the fundamentals of normal-mode and ray acoustics, which are used extensively in studies of underwater sound propagation. In Section V, details are given on sound propagation as it is affected by barriers and obstacles. Sections VI–VIII deal with waves in confined spaces; sound radiation, with methods of solution to determine the sound radiated by structures; and the coupling of sound with its surroundings. Section IX discusses the fundamentals of random systems as applied to structural acoustics.

II. PHYSICAL PHENOMENA IN LINEAR ACOUSTICS

A. Sound Propagation in Air, Water, and Solids

Many practical problems are associated with the propagation of sound waves in air or water. Sound does not propagate in free space but must have a dense medium to propagate. Thus, for example, when a sound wave is produced by a voice, the air particles in front of the mouth are vibrated, and this vibration, in turn, produces a disturbance in the adjacent air particles, and so on. [See ACOUSTICAL MEASUREMENT.]

If the wave travels in the same direction as the particles are being moved, it is called a longitudinal wave. This same phenomenon occurs whether the medium is air, water, or a solid. If the wave is moving perpendicular to the moving particles, it is called a transverse wave.

The rate at which a sound wave thins out, or attenuates, depends to a large extent on the medium through which it is propagating. For example, sound attenuates more rapidly in air than in water, which is the reason that sonar is used more extensively under water than in air. Conversely, radar (electromagnetic energy) attenuates much less in air than in water, so that it is more useful as a communication tool in air.

Sound waves travel in solid or fluid materials by elastic deformation of the material, which is called an elastic wave. In air (below a frequency of 20 kHz) and in water, a sound wave travels at constant speed without its shape being distorted. In solid material, the velocity of the wave changes, and the disturbance changes shape as it travels. This phenomenon in solids is called dispersion. Air and water are for the most part nondispersive media, whereas most solids are dispersive media.

B. Reflection, Refraction, Diffraction, Interference, and Scattering

Sound propagates undisturbed in a nondispersive medium until it reaches some obstacle. The obstacle, which can be a density change in the medium or a physical object, distorts the sound wave in various ways. (It is interesting to note that sound and light have many propagation characteristics in common: The phenomena of reflection, refraction, diffraction, interference, and scattering for sound are very similar to the phenomena for light.) [See WAVE PHENOMENA.]

1. Reflection

When sound impinges on a rigid or elastic obstacle, part of it bounces off the obstacle, a characteristic that is called reflection. The reflection of sound back toward its source is called an echo. Echoes are used in sonar to locate objects under water. Most people have experienced echoes in air by calling out in an empty hall and hearing their words repeated as the sound bounces off the walls.

2. Refraction and Transmission

Refraction is the change of direction of a wave when it travels from a medium in which it has one velocity to a medium in which it has a different velocity. Refraction of sound occurs in the ocean because the temperature of the water changes with depth, which causes the velocity of sound also to change with depth. For simple ocean models, the layers of water at different temperatures act as though they are layers of different media. The following example explains refraction: Imagine a sound wave that is constant over a plane (i.e., a plane wave) in a given medium and a line drawn perpendicular to this plane (i.e., the normal to the plane) which indicates the travel direction of the wave. When the wave travels to a different medium, the normal bends, thus changing the direction of the sound wave. This normal line is called a ray and is discussed later with ray acoustics in Section IV.A.

When a sound wave impinges on a plate, part of the wave reflects and part goes through the plate. The part that goes through the plate is the transmitted wave. Reflection and transmission are related phenomena that are used extensively to describe the characteristics of sound baffles and absorbers.

3. Diffraction

Diffraction is associated with the bending of sound waves around or over barriers. A sound wave can often be heard on the other side of a barrier even if the listener cannot see the source of the sound. However, the barrier projects a shadow, called the shadow zone, within which the sound cannot be heard. This phenomenon is similar to that of a light that is blocked by a barrier.

4. Interference

Interference is the phenomenon that occurs when two sound waves converge. In linear acoustics the sound waves can be superimposed. When this occurs, the waves interfere with each other, and the resultant sound is the sum of the two waves, taking into consideration the magnitude and the phase of each wave.

5. Scattering

Sound scattering is related closely to reflection and transmission. It is the phenomenon that occurs when a sound wave envelops an obstacle and breaks up, producing a sound pattern around the obstacle. The sound travels off in all directions around the obstacle. The sound that travels back toward the source is called the backscattered sound, and the sound that travels away from the source is known as the forwardscattered field.

C. Standing Waves, Propagating Waves, and Reverberation

When a sound wave travels freely in a medium without obstacles, it continues to propagate unless it is attenuated by some characteristic of the medium, such as absorption.

When sound waves propagate in an enclosed space, they reflect from the walls of the enclosure and travel in a different direction until they hit another wall. In a regular enclosure, such as a rectangular room, the waves reflect back and forth between the sound source and the wall, setting up a constant wave pattern that no longer shows the characteristics of a traveling wave. This wave pattern, called a standing wave, results from the superposition of two traveling waves propagating in opposite directions. The standing wave pattern exists as long as the source continues to emit sound waves. The continuous rebounding of the sound waves causes a reverberant field to be set up in the enclosure. If the walls of the enclosure are absorbent, the reverberant field is decreased. If the sound source stops emitting the waves, the reverberant standing wave field dies out because of the absorptive character of the walls. The time it takes for the reverberant field to decay is sometimes called the time constant of the room.

D. Sound Radiation

The interaction of a vibrating structure with a medium produces disturbances in the medium that propagate out from the structure. The sound field set up by these propagating disturbances is known as the sound radiation field. Whenever there is a disturbance in a sound medium, the waves propagate out from the disturbance, forming a radiation field.

E. Coupling and Interaction between Structures and the Surrounding Medium

A structure vibrating in air produces sound waves, which propagate out into the air. If the same vibrating structure is put into a vacuum, no sound is produced. However, whether the vibrating body is in a vacuum or air makes little difference in the vibration patterns, and the reaction of the structure to the medium is small. If the same vibrating body is put into water, the high density of water compared with air produces marked changes in the vibration and consequent radiation from the structure. The water, or any heavy liquid, produces two main effects on the structure. The first is an added mass effect, and the second is a damping effect known as radiation damping. The same type of phenomenon also occurs in air, but to a much smaller degree unless the body is traveling at high speed. The coupling phenomenon in air at these speeds is associated with flutter.

F. Deterministic (Single-Frequency) Versus Random Linear Acoustics

When the vibrations are not single frequency but are random, new concepts must be introduced. Instead of dealing with ordinary parameters such as pressure and velocity, it is necessary to use statistical concepts such as auto- and cross-correlation of pressure in the time domain and auto- and cross-spectrum of pressure in the frequency domain. Frequency is a continuous variable in random systems, as opposed to a discrete variable in single-frequency systems. In some acoustic problems there is randomness in both space and time. Thus, statistical concepts have to be applied to both time and spatial variables.

III. BASIC ASSUMPTIONS AND EQUATIONS IN LINEAR ACOUSTICS

A. Linear Versus Nonlinear Acoustics

The basic difference between linear and nonlinear acoustics is determined by the amplitude of the sound. The amplitude is dependent on a parameter, called the condensation, that describes how much the medium is compressed as the sound wave moves. When the condensation reaches certain levels, the sound becomes nonlinear. The major difference between linear and nonlinear acoustics can best be understood by deriving the one-dimensional wave equation for sound waves and studying the parameters involved in the derivation. Consider a plane sound wave traveling down a tube, as shown in Fig. 1.

Let the cross-sectional area of the tube be A and let ξ be the particle displacement along the x axis from the equilibrium position. Applying the principle of conservation of mass to the volume A dx before and after it is displaced, the following equation is obtained:

$$\rho A\,dx\,(1 + \partial\xi/\partial x) = \rho_o A\,dx \tag{1}$$

The mass of the element before the disturbance arrives is ρo A dx, where ρo is the original density of the medium. The mass of this element as the disturbance passes is changed to:

$$\rho A\,dx\,(1 + \partial\xi/\partial x)$$

FIGURE 1 Propagation of a plane one-dimensional sound wave. A = cross-sectional area of tube; ξ = particle displacement along the x axis; p = acoustic pressure.

where ρ is the new density of the disturbed medium. This disturbed density ρ can be defined in terms of the original density ρo by the following relation:

$$\rho = \rho_o(1 + S) \tag{2}$$

where S is called the condensation. By substituting Eq. (2) into (1) we obtain:

$$(1 + S)(1 + \partial\xi/\partial x) = 1 \tag{3}$$

If p is the sound pressure at x, then p + (∂p/∂x) dx is the sound pressure at x + dx (by expanding p into a Taylor series in x and neglecting higher-order terms in dx). Applying Newton’s law to the differential element, we find that:

$$-\frac{\partial p}{\partial x} = \rho_o \frac{\partial^2 \xi}{\partial t^2} \tag{4}$$

If it is assumed that the process of sound propagation is adiabatic (i.e., there is no exchange of heat during the process), then the pressure and density are related by the following equation:

$$\frac{P}{p_o} = \left(\frac{\rho}{\rho_o}\right)^{\gamma} \tag{5}$$

where P = total pressure = p + po, p is the disturbance sound pressure, and γ is the adiabatic constant, which has a value of about 1.4 for air. Using Eqs. (2) and (3) gives:

$$\rho = \frac{\rho_o}{1 + \partial\xi/\partial x}$$

Thus,

$$\frac{\partial p}{\partial x} = \frac{\partial P}{\partial x} = -\gamma p_o \left(1 + \frac{\partial \xi}{\partial x}\right)^{-1-\gamma} \frac{\partial^2 \xi}{\partial x^2} \tag{6}$$

Substituting into Eq. (4) gives:

$$\frac{\gamma p_o\,\partial^2\xi/\partial x^2}{(1 + \partial\xi/\partial x)^{1+\gamma}} = \rho_o \frac{\partial^2 \xi}{\partial t^2} \tag{7}$$

or finally,

$$c^2\,\frac{\partial^2\xi/\partial x^2}{(1 + \partial\xi/\partial x)^{1+\gamma}} = \frac{\partial^2 \xi}{\partial t^2} \tag{8}$$

where c² = poγ/ρo (c is the sound speed in the medium). If ∂ξ/∂x is small compared with 1, then Eq. (3) gives:

$$S = -\frac{\partial \xi}{\partial x} \tag{9}$$

and (8) gives:

$$c^2 \frac{\partial^2 \xi}{\partial x^2} = \frac{\partial^2 \xi}{\partial t^2} \tag{10}$$

Thus,

$$\xi = f_1(x - ct) + f_2(x + ct) \tag{10a}$$

Equations (9) and (10) are the linear acoustic approximations. The first term in Eq. (10a) is an undistorted traveling wave that is moving at speed c in the +x direction, and the second term is an undistorted traveling wave moving with speed c in the −x direction.
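That a sum of one wave traveling toward +x and one traveling toward −x satisfies Eq. (10) can be checked numerically. The sketch below uses two arbitrary smooth wave shapes and central finite differences; the shapes, the sound speed, and the step sizes are assumed example values, and the function names are illustrative.

```python
import math

c = 343.0                                    # assumed sound speed, m/s

def xi(x, t):
    """Sum of a +x-going and a -x-going wave of arbitrary smooth shape."""
    f1 = math.exp(-((x - c * t) ** 2))       # travels toward +x
    f2 = 0.5 * math.sin(x + c * t)           # travels toward -x
    return f1 + f2

def second_difference(g, s, h):
    """Central finite-difference estimate of the second derivative of g at s."""
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)

def wave_residual(x, t, hx=1e-3, ht=1e-6):
    """Return (c^2 * xi_xx - xi_tt, xi_tt) at the point (x, t)."""
    xi_xx = second_difference(lambda s: xi(s, t), x, hx)
    xi_tt = second_difference(lambda s: xi(x, s), t, ht)
    return c * c * xi_xx - xi_tt, xi_tt

residual, xi_tt = wave_residual(0.3, 1e-3)
print(residual, xi_tt)
```

The residual is many orders of magnitude smaller than either term of Eq. (10), limited only by the finite-difference error.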

Condensation values S of the order of 1 are characteristic of sound waves with amplitudes approaching 200 dB rel. 0.0002 dyne/cm². The threshold of pain is about 130 dB rel. 0.0002 dyne/cm². This is the sound pressure level that results in a feeling of pain in the ear. The condensation value for this pressure is about S = 0.0001. For a condensation value S = 0.1, we are in the nonlinear region. This condensation value corresponds to a sound pressure level of about 177 dB rel. 0.0002 dyne/cm². All the ordinary sounds that we hear such as speech and music (even very loud music) are usually well below 120 dB rel. 0.0002 dyne/cm². A person who is very close to an explosion or is exposed to sonar transducer sounds underwater would suffer permanent damage to his hearing because the sounds are usually well above 130 dB rel. 0.0002 dyne/cm².
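The correspondence between sound pressure level and condensation can be estimated from the linear relation p = ρo c² S = γ po S. The sketch below assumes a static pressure of one standard atmosphere and treats the level as an rms pressure; both are assumptions, and the function names are illustrative.

```python
import math

P_REF = 2.0e-5          # 0.0002 dyne/cm^2 expressed in Pa
P_ATM = 101_325.0       # assumed static pressure po, Pa
GAMMA = 1.4             # adiabatic constant for air (from the text)

def condensation_from_spl(spl_db):
    """Linear estimate S = p/(gamma*po), p being the pressure at this level."""
    p = P_REF * 10 ** (spl_db / 20)
    return p / (GAMMA * P_ATM)

def spl_from_condensation(S):
    """Inverse relation: level in dB for a given condensation."""
    return 20 * math.log10(S * GAMMA * P_ATM / P_REF)

print(spl_from_condensation(0.1))     # close to the 177 dB quoted above
print(spl_from_condensation(1.0))     # just under 200 dB
print(condensation_from_spl(130.0))   # a few times 1e-4
```

Under these assumptions S = 0.1 falls at about 177 dB, S = 1 at a little under 200 dB, and the threshold of pain at a condensation of a few times 10⁻⁴, the same order of magnitude as the value quoted above.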

B. Derivation of Basic Equations

It is now necessary to derive the general three-dimensional acoustic wave equations. In Section III.A, the one-dimensional wave equation was derived for both the linear and nonlinear cases. If there is a fluid particle in the medium with coordinates x, y, z, the fluid particle can move in three dimensions. Let the displacement vector of the particle be b having components ξ, η, ζ, as shown in Fig. 2.

The velocity vector q is

$$\mathbf{q} = \partial \mathbf{b}/\partial t \tag{11}$$

FIGURE 2 The fluid particle. x, y, z = rectangular coordinates of fluid particle; b = displacement vector of the particle (components of b are ξ, η, ζ).

Let this velocity vector have components u, v, w where

$$u = \partial\xi/\partial t, \qquad v = \partial\eta/\partial t, \qquad w = \partial\zeta/\partial t \tag{12}$$

As the sound wave passes an element of volume, V = dx dy dz, the element changes volume because of the displacement ξ, η, ζ. The change in length of the element in the x, y, z directions, respectively, is (∂ξ/∂x) dx, (∂η/∂y) dy, (∂ζ/∂z) dz; so the new volume is V + ΔV where:

$$V + \Delta V = dx\left(1 + \frac{\partial \xi}{\partial x}\right) dy\left(1 + \frac{\partial \eta}{\partial y}\right) dz\left(1 + \frac{\partial \zeta}{\partial z}\right) \tag{13}$$

The density of the medium before displacement is ρo and the density during displacement is ρo(1 + S), as in the one-dimensional case developed in the last section. Applying the principle of conservation of mass to the element before and after displacement, we find that:

$$(1 + S)(1 + \partial\xi/\partial x)(1 + \partial\eta/\partial y)(1 + \partial\zeta/\partial z) = 1 \tag{14}$$

Now we make the linear acoustic approximation that ∂ξ/∂x, ∂η/∂y, and ∂ζ/∂z are small compared with 1. So Eq. (14) becomes the counterpart of Eq. (9) in one dimension:

$$S = -(\partial\xi/\partial x + \partial\eta/\partial y + \partial\zeta/\partial z) \tag{15}$$

This equation is called the equation of continuity for linear acoustics.

The equations of motion for the element dx dy dz are merely three equations in the three coordinate directions that parallel the one-dimensional Eq. (4); thus, the three equations of motion are

$$-\frac{\partial p}{\partial x} = \rho_o \frac{\partial^2 \xi}{\partial t^2}, \qquad -\frac{\partial p}{\partial y} = \rho_o \frac{\partial^2 \eta}{\partial t^2}, \qquad -\frac{\partial p}{\partial z} = \rho_o \frac{\partial^2 \zeta}{\partial t^2} \tag{16}$$

If one differentiates the first of these equations with respect to x, the second with respect to y, and the third with respect to z, and adds them, then letting

$$\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$$

one obtains:

$$\nabla^2 p = \rho_o \frac{\partial^2 S}{\partial t^2} \tag{17}$$

Now we introduce the adiabatic assumption in Eq. (17); that is,

$$\frac{P}{p_o} = \left(\frac{\rho}{\rho_o}\right)^{\gamma} \tag{18}$$

where P = total pressure = p + po and p is the sound pressure due to the disturbance. Since

$$\rho = \rho_o(1 + S), \qquad P/p_o = (1 + S)^{\gamma} \tag{19}$$

FIGURE 3 Temperatures, velocities, and refraction angles of the sound. T1, T2, . . . , Tn = temperatures of the n layers of the model medium; V1, V2, . . . , Vn = sound velocities in the n layers of the model medium.

For small S, the binomial theorem applied to Eq. (19) gives:

$$\frac{P - p_o}{\rho_o} = S c^2 \tag{20}$$

(c being the adiabatic sound velocity, as discussed for the one-dimensional case). Thus,

$$p = \rho_o S c^2$$

Substituting into Eq. (17) we obtain:

$$c^2 \nabla^2 p = \partial^2 p/\partial t^2 \tag{21}$$
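The size of the error made by keeping only the first binomial term in Eq. (19) can be gauged with a short calculation. The sketch below compares the exact adiabatic pressure ratio with its linear approximation, which is equivalent to p = ρo S c² because c² = γpo/ρo. The chosen values of S and the function names are illustrative.

```python
GAMMA = 1.4   # adiabatic constant for air

def pressure_ratio_exact(S):
    """(P - po)/po from the adiabatic law, Eq. (19)."""
    return (1.0 + S) ** GAMMA - 1.0

def pressure_ratio_linear(S):
    """First-order binomial term, equivalent to p = rho_o * S * c**2."""
    return GAMMA * S

def linearization_error(S):
    """Relative error of the linear term with respect to the exact ratio."""
    exact = pressure_ratio_exact(S)
    return (exact - pressure_ratio_linear(S)) / exact

for S in (1e-4, 1e-2, 1e-1):
    print(S, linearization_error(S))
```

The relative error grows roughly in proportion to S: it is negligible for the condensations of ordinary sounds and reaches the percent level near S = 0.1.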

C. Intensity and Energy

The one-dimensional equation for plane waves is given by Eq. (10). The displacement for a harmonic wave traveling in the +x direction can be written:

$$\xi = A e^{i(\omega t - kx)}$$

The pressure is given by Eq. (20); that is, p = ρo co² S, where S = −∂ξ/∂x for the one-dimensional wave [Eq. (9)]. Then,

$$p = -\rho_o c_o^2 \frac{\partial \xi}{\partial x}$$

The velocity is given by u = ∂ξ/∂t, so, for one-dimensional harmonic waves, p = ρo co²(ik)ξ and u = iωξ, but k = ω/co. Thus, p = ρo co u. The intensity is defined as the power flow per unit area (or the rate at which energy is transmitted per unit area). Thus, I = p ∂ξ/∂t = pu. The energy per unit area is the work done on the medium by the pressure in going through displacement ξ, that is, Ef = pξ. And by the above,

$$I = p^2/\rho_o c_o$$
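As a numerical illustration of the plane-wave relations p = ρo co u and I = p²/ρo co, the sketch below evaluates the particle velocity and intensity for a sound pressure of 1 Pa. The density and sound speed values for air and water are typical round numbers assumed for the example, and the function names are illustrative.

```python
RHO_AIR, C_AIR = 1.21, 343.0          # assumed values for air, kg/m^3 and m/s
RHO_WATER, C_WATER = 1000.0, 1480.0   # assumed values for fresh water

def particle_velocity(p, rho, c):
    """u = p/(rho*c), from the plane-wave relation p = rho*c*u."""
    return p / (rho * c)

def intensity(p, rho, c):
    """Plane-wave intensity I = p^2/(rho*c); W/m^2 for SI inputs."""
    return p ** 2 / (rho * c)

p = 1.0                                # sound pressure, Pa
print(intensity(p, RHO_AIR, C_AIR), intensity(p, RHO_WATER, C_WATER))
```

For the same pressure, the intensity in water is several thousand times smaller than in air because the product ρo co is so much larger.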

IV. FREE SOUND PROPAGATION

A. Ray Acoustics

Characteristics of sound waves can be studied by the same theory regardless of whether the sound is propagating in air or water. The simplest of the sound-propagation theories is known as ray acoustics. A sound ray is a line drawn normal to the wave front of the sound wave as the sound travels. In Section II.B.2, refraction was described as the bending of sound waves when going from one medium to another. When a medium such as air or water has a temperature gradient, then it can be thought of as having layers that act as different media. The objective of this theory, or any other transmission theory, is to describe the sound field at a distance from the source. There are two main equations of ray theory. The first is Snell’s law of refraction, which states that

$$\frac{V_1}{\cos\theta_1} = \frac{V_2}{\cos\theta_2} = \frac{V_3}{\cos\theta_3} = \cdots = \frac{V_n}{\cos\theta_n} \tag{22}$$

where V1, V2, . . . , Vn are the velocities of sound through the various layers of the medium, which are at different temperatures as shown in Fig. 3.
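Ray tracing through a layered medium with Eq. (22) can be sketched as follows. The example assumes that the angles θ are measured from the layer boundaries (grazing angles), which is the reading suggested by the cosine form of the law; the layer velocities and the launch angle are assumed example values, and the function name is illustrative.

```python
import math

def grazing_angles(theta1_deg, velocities):
    """Apply V1/cos(theta1) = Vk/cos(thetak) layer by layer, Eq. (22).

    Returns the angle in each layer in degrees, or None from the layer
    in which no real angle exists and the ray is turned back.
    """
    invariant = velocities[0] / math.cos(math.radians(theta1_deg))
    angles = []
    turned_back = False
    for v in velocities:
        cos_theta = v / invariant
        if turned_back or cos_theta > 1.0:
            turned_back = True         # deeper layers are never reached
            angles.append(None)
        else:
            angles.append(math.degrees(math.acos(cos_theta)))
    return angles

layers = [1500.0, 1510.0, 1520.0, 1540.0]      # assumed velocities, m/s
print(grazing_angles(10.0, layers))
```

As the velocity increases from layer to layer the ray flattens, and in the last layer of this example the ray cannot penetrate at all, which is the turning behavior exploited in ray-tracing calculations.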

The second relation is that the power flow remains constant along a ray tube (i.e., there is conservation of energy along a ray tube). A ray tube is a closed surface formed by adjacent rays, as shown in Fig. 4. If the power flow remains constant along a ray tube, then,

$$\frac{p_1^2 A_1}{\rho_o c_1} = \frac{p_2^2 A_2}{\rho_o c_2} = \cdots = \frac{p_n^2 A_n}{\rho_o c_n} \tag{23}$$

FIGURE 4 Ray tube. A1, A2, . . . , An = cross-sectional area of the ray tube at the n stations along the tube.

where p refers to the sound pressure, A is the cross-sectional area of the ray tube, ρo the mass density of the medium, and c the sound velocity in the medium. But, p²/ρoc = I, the sound intensity. Thus,

$$I_1 A_1 = I_2 A_2 = \cdots = I_n A_n \tag{24}$$

The intensity can therefore be found at any point if the initial intensity I1 and the areas of the ray tube A1, . . . , An are known. The ray tube and the consequent areas can be determined by tracing the rays. The tracing is done by using Snell’s law (Eq. 22). The velocities of propagation V1, V2, . . . , Vn are determined from the temperature and salinity of the layer. One such equation for sound velocity is

$$V = 1449 + 4.6T - 0.055T^2 + 0.0003T^3 + (1.39 - 0.012T)(s - 35) + 0.017d \tag{25}$$

where V is the velocity of sound in meters per second, T the temperature in degrees centigrade, s the salinity in parts per thousand, and d the depth in meters. The smaller the ray tubes, that is, the closer together the rays, the more accurate are the results.
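Equations (24) and (25) translate directly into code. The sketch below evaluates the sound velocity formula and the intensity at successive stations of a ray tube; the input values are assumed examples and the function names are illustrative.

```python
def sound_speed(T, s, d):
    """Eq. (25): T in degrees centigrade, s in parts per thousand, d in meters."""
    return (1449.0 + 4.6 * T - 0.055 * T**2 + 0.0003 * T**3
            + (1.39 - 0.012 * T) * (s - 35.0) + 0.017 * d)

def intensity_along_tube(I1, areas):
    """Eq. (24): I_k = I_1 * A_1 / A_k at each station of the ray tube."""
    return [I1 * areas[0] / a for a in areas]

print(sound_speed(10.0, 35.0, 0.0))              # surface water at 10 degrees
print(sound_speed(10.0, 35.0, 100.0))            # same water at 100 m depth
print(intensity_along_tube(1.0, [1.0, 2.0, 4.0]))
```

The intensity falls in inverse proportion to the cross-sectional area of the tube, so spreading of the rays alone accounts for the loss along the path in this simple model.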

Simple ray-acoustics theory is good only at high frequencies (usually in the kilohertz region). For low frequencies (e.g., less than 100 Hz), another theory, the normal mode theory, has to be used to compute transmission characteristics.

B. Normal Mode Theory

The normal mode theory consists of forming a solution of the acoustic wave equation that satisfies specific boundary conditions. Consider the sound velocity $c(z)$ as a function of the depth $z$, and let $h$ be the depth of the medium. The medium is bounded by a free surface at $z = 0$ and a rigid bottom at $z = h$. Let a point source be located at $z = z_1$, $r = 0$ (in cylindrical coordinates), as shown in Fig. 5.

The pressure $p$ is given by the Helmholtz equation:

$$\frac{\partial^2 p}{\partial r^2} + \frac{1}{r}\frac{\partial p}{\partial r} + \frac{\partial^2 p}{\partial z^2} + k^2(z)\,p = -\frac{2}{r}\,\delta(z - z_1)\,\delta(r), \qquad k^2(z) = \frac{\omega^2}{c(z)^2} \tag{26}$$

FIGURE 5 Geometry for normal mode propagation. $r$, $z$ = cylindrical coordinates of a point in the medium; $h$ = depth of medium; $z_1$ = $z$ coordinate of the source.

The δ functions describe the point source. The boundary conditions are

$$p(r, 0) = 0 \ \text{(free surface)}, \qquad \frac{\partial p}{\partial z}(r, h) = 0 \ \text{(rigid bottom)} \tag{27}$$

Equations (26) and (27) essentially constitute all of the physics of the solution. The rest is mathematics. The solution of the homogeneous form of Eq. (26) is first found by separation of variables. Since the wave has to be an outgoing wave, this solution is

$$p(r, z) = H_0^{(1)}(\xi r)\,\psi(z, \xi) \tag{28}$$

where $H_0^{(1)}$ is the Hankel function of the first kind of order zero. The function $\psi(z, \xi)$ satisfies the equation:

$$\frac{d^2\psi}{dz^2} + [k^2(z) - \xi^2]\,\psi = 0 \tag{29}$$

with boundary conditions:

$$\psi(0, \xi) = 0, \qquad \frac{d\psi(h, \xi)}{dz} = 0 \tag{30}$$

Since Eq. (29) is a second-order linear differential equation, let the two linearly independent solutions be $\psi_1(z, \xi)$ and $\psi_2(z, \xi)$. Thus, the complete solution is

$$\psi(z, \xi) = B_1\psi_1(z, \xi) + B_2\psi_2(z, \xi) \tag{31}$$

where $B_1$ and $B_2$ are constants. Substitution of Eq. (31) into (30) leads to an equation from which the allowable values of $\xi$ (the eigenvalues) can be obtained, that is,

$$\psi_1(0, \xi)\frac{d\psi_2(h, \xi)}{dz} - \psi_2(0, \xi)\frac{d\psi_1(h, \xi)}{dz} = 0 \tag{32}$$

The $n$th root of this equation is called $\xi_n$. The ratio of the constants $B_1$ and $B_2$ is

$$\frac{B_1}{B_2} = -\frac{\psi_2(0, \xi_n)}{\psi_1(0, \xi_n)} \tag{33}$$

The $H_0^{(1)}(\xi_n r)\,\psi(z, \xi_n)$ are known as the normal mode functions, and the solution of the original inhomogeneous equation can be expanded in terms of these normal mode functions as follows:

$$p(r, z) = \sum_n A_n H_0^{(1)}(\xi_n r)\,\psi(z, \xi_n) \tag{34}$$

with unknown coefficients $A_n$, which will be determined next. Substituting Eq. (34) into (26) and employing the relation for the Hankel function,

$$\left(\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr} + \xi_n^2\right) H_0^{(1)}(\xi_n r) = \frac{2i}{\pi r}\,\delta(r) \tag{35}$$


leads to:

$$\sum_n A_n \psi_n(z) = i\pi\,\delta(z - z_1) \tag{36}$$

Next, multiply Eq. (36) by $\psi_m(z)$ and integrate over the depth 0 to $h$. Using the orthogonality of the modes, which states

$$\int_0^h \psi_n(z)\,\psi_m(z)\,dz = 0 \quad \text{if } m \neq n \tag{37}$$

we find that:

$$A_n = \frac{i\pi\,\psi_n(z_1)}{\int_0^h \psi_n^2(z)\,dz} \tag{38}$$

So,

$$p(r, z) = \pi i \sum_n \frac{\psi_n(z_1)\,\psi_n(z)\,H_0^{(1)}(\xi_n r)}{\int_0^h \psi_n^2(z)\,dz} \tag{39}$$

If the medium consists of a single layer with constant velocity $c_o$, it is found that:

$$\psi_n(z) = \cosh b_n z, \qquad \xi_n = \sqrt{b_n^2 + k^2}, \qquad b_n = i\left(n + \tfrac{1}{2}\right)\frac{\pi}{h}, \qquad k = \frac{\omega}{c_o} \tag{40}$$
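To make Eq. (40) concrete, the following sketch computes the horizontal wavenumbers $\xi_n$ for a constant-velocity layer and counts how many modes have a real $\xi_n$, which is read here as the condition for a mode to propagate rather than decay with range. The function names and the sample values (100 Hz, 100 m depth, 1500 m/s) are assumptions chosen for illustration.

```python
import cmath
import math


def modal_wavenumbers(freq_hz, depth_m, c0, n_modes):
    """Return xi_n of Eq. (40) for n = 0 .. n_modes - 1."""
    k = 2.0 * math.pi * freq_hz / c0
    xis = []
    for n in range(n_modes):
        b_n = 1j * (n + 0.5) * math.pi / depth_m   # b_n is purely imaginary
        xis.append(cmath.sqrt(b_n * b_n + k * k))  # xi_n = sqrt(b_n^2 + k^2)
    return xis


def count_propagating(freq_hz, depth_m, c0, n_modes=200):
    """Count modes with real xi_n, i.e. (n + 1/2) * pi / h < k."""
    k = 2.0 * math.pi * freq_hz / c0
    return sum(1 for n in range(n_modes)
               if (n + 0.5) * math.pi / depth_m < k)


if __name__ == "__main__":
    print(count_propagating(100.0, 100.0, 1500.0))
    print(modal_wavenumbers(100.0, 100.0, 1500.0, 3))
```

Modes beyond the count returned by `count_propagating` have an imaginary $\xi_n$, so their Hankel-function factor in Eq. (39) decays with range.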

C. Underwater Sound Propagation

Ray theory and normal mode theory are used extensively in studying the transmission of sound in the ocean. At frequencies below 100 Hz, the normal mode theory is necessary.

Two types of sonar are used underwater: active and passive. Active sonar produces sound waves that are sent out to locate objects by receiving echoes. Passive sonar listens for sounds. Since the sound rays are bent by refraction (as discussed in Section IV.B), there are shadow zones in which the sound does not travel. Thus a submarine located in a shadow zone has very little chance of being detected by sonar.

Since sound is the principal means of detection under water, there has been extensive research in various aspects of this field. The research has been divided essentially into three areas: generation, propagation, and signal processing. Generation deals with the mechanisms of producing the sound, propagation deals with the transmission from the source to the receiver, and signal processing deals with analyzing the signal to extract information.

D. Atmospheric Sound Propagation

It has been shown that large-amplitude sounds such as sonic booms from supersonic aircraft can be detected at very low frequencies (called infrasonic frequencies) at distances greater than 100 km from the source. In particular, the Concorde sonic boom has been studied at about 300 km, and signals of about 0.6 N/m² were received at frequencies of the order of 0.4 Hz. The same phenomenon occurs for thunder and explosions on the ground.

The same principles hold in the atmosphere as in water for the bending of rays in areas of changing temperature. Because of the large attenuation of higher frequency sound in air as opposed to water, sound energy is not used for communication in air. For example, considering the various mechanisms of absorption in the atmosphere, the total attenuation is about 24 dB per kiloyard at 100 Hz, whereas under water the sound attenuates at 100 Hz at about 0.001 dB per kiloyard.

V. SOUND PROPAGATION WITH OBSTACLES

A. Refraction

Refraction is the bending of sound waves (see Section II.B.2). The transmission of sound through water with various temperature layers and the application of Snell’s law have already been treated in this article. Transmission of sound through water is probably the most extensive practical application of refraction in acoustics.

B. Reflection and Transmission

There are many practical problems related to reflection and transmission of sound waves. One example is used here to acquaint the reader with the concepts involved in the reflection and transmission of sound.

Consider a sound wave coming from one medium and hitting another, as shown in Fig. 6. What happens in layered media, such as the temperature layers described in connection with ray acoustics and underwater sound, can now be noted. When ray acoustics and underwater sound were discussed, only refraction and transmission were described. The entire process for one transition layer can now be explained.

The mass density and sound velocity in the upper medium are $\rho$, $c$ and in the lower medium are $\rho_1$, $c_1$. The pressure in the incident wave, $p_{\text{inc}}$, can be written:

$$p_{\text{inc}} = p_o e^{ik(x\sin\theta - z\cos\theta)}, \qquad k = \omega/c \tag{41}$$

(i.e., assuming a plane wave front). From Snell’s law, it is found that the refracted wave angle $\theta_1$ is determined by the relation:

$$\frac{\sin\theta}{\sin\theta_1} = \frac{c}{c_1} \tag{42}$$

There is also a reflected wave that goes off at angle $\theta$, as shown in the figure. The magnitude of the reflected wave


FIGURE 6 Reflection and transmission from an interface. $\rho$, $c$ = mass density and sound velocity in upper medium; $\rho_1$, $c_1$ = mass density and sound velocity in lower medium; $\theta$ = angle of incidence and reflection; $\theta_1$ = angle of refraction.

is not known, but since its direction $\theta$ is known, it can be written in the form:

$$p_{\text{refl}} = V p_o e^{ik(x\sin\theta + z\cos\theta)} \tag{43}$$

where $V$ is the reflection coefficient. Similarly, the refracted wave can be written in the form

$$p_{\text{refrac}} = W p_o e^{ik_1(x\sin\theta_1 - z\cos\theta_1)} \tag{44}$$

where $k_1 = \omega/c_1$ and $W$ is the transmission coefficient. The boundary conditions at $z = 0$ are

$$p_{\text{upper}} = p_{\text{lower}} \quad \text{(acoustic pressure is continuous across the boundary)}$$

$$(v_z)_{\text{upper}} = (v_z)_{\text{lower}} \quad \text{(particle velocity normal to the boundary is continuous across the boundary)} \tag{45}$$

The velocity is related to the pressure by the expression:

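$$\frac{\partial p}{\partial z} = \rho\frac{\partial v_z}{\partial t} \tag{46}$$

For harmonic motion, $v \sim e^{i\omega t}$, so:

$$\frac{\partial p}{\partial z} = i\omega\rho v_z \tag{47}$$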

The second boundary condition at z = 0 is, therefore,

$$\frac{1}{\rho}\frac{\partial p_{\text{upper}}}{\partial z} = \frac{1}{\rho_1}\frac{\partial p_{\text{lower}}}{\partial z} \tag{48}$$

The total field in the upper medium consists of the reflected and incident waves combined, so:

$$p_{\text{upper}} = p_{\text{inc}} + p_{\text{refl}} = p_o e^{ikx\sin\theta}\left(e^{-ikz\cos\theta} + V e^{ikz\cos\theta}\right) \tag{49}$$

Substituting into the boundary conditions, we find that:

$$p_o e^{ikx\sin\theta}(1 + V) = W p_o e^{ik_1 x\sin\theta_1}$$

so

$$1 + V = W e^{ik_1 x\sin\theta_1 - ikx\sin\theta} \tag{50}$$

Since $1 + V$ is independent of $x$, $e^{ik_1 x\sin\theta_1 - ikx\sin\theta}$ must also be independent of $x$. Thus,

$$k_1\sin\theta_1 = k\sin\theta \tag{51}$$

which is Snell’s law. Thus, the first boundary condition leads to Snell’s law. The second boundary condition leads to the equation:

$$\frac{1}{\rho}\,p_o e^{ikx\sin\theta}(-ik\cos\theta + V ik\cos\theta) = \frac{1}{\rho_1}\,W p_o e^{ik_1 x\sin\theta_1}(-ik_1\cos\theta_1) \tag{52}$$

Substituting Eq. (51) into (50) gives:

$$1 + V = W \tag{53}$$

and substituting Eq. (53) into (52) gives:

$$\frac{1}{\rho}\,e^{ikx\sin\theta}(-ik\cos\theta + V ik\cos\theta) = \frac{1}{\rho_1}(1 + V)\,e^{ik_1 x\sin\theta_1}(-ik_1\cos\theta_1) \tag{54}$$

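or,

$$\frac{\rho_1}{\rho}(-ik\cos\theta + V ik\cos\theta) = (1 + V)(-ik_1\cos\theta_1)$$

So

$$V = \frac{(\rho_1/\rho)\,k\cos\theta - k_1\cos\theta_1}{(\rho_1/\rho)\,k\cos\theta + k_1\cos\theta_1} \tag{55}$$

$$V = \frac{\dfrac{\rho_1}{\rho}\cos\theta - \dfrac{k_1}{k}\cos\theta_1}{\dfrac{\rho_1}{\rho}\cos\theta + \dfrac{k_1}{k}\cos\theta_1}, \qquad \frac{k_1}{k} = \frac{c}{c_1} \tag{56}$$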

Equations (51), (53), and (56) give the unknowns $\theta_1$, $V$, and $W$ as functions of the known quantities $\rho_1$, $\rho$, $c_1$, $c$, and $\theta$. Note that if the two media are the same, then $V = 0$ and $W = 1$. Thus, there is no reflection, and the transmission is


FIGURE 7 Diffraction over a wide barrier. Plot of the ratio of the square of the diffracted sound pressure amplitude $p_{\text{Diffr}}$ to the square of the amplitude $p_{\text{ATL}}$ expected at an equivalent distance $L$ from the source in the absence of the barrier. Source and listener locations are as indicated in the sketch, with $z_s = z_L$, on opposite sides of a rectangular three-sided barrier. Computations based on the Maekawa approximation and on the double-edge diffraction theory are presented for listener angle $\theta$ between 0° and 90°. Here, $L$ represents a distance of 30 wavelengths ($10\lambda + 10\lambda + 10\lambda$). (Reprinted by permission from Pierce, A. D. (1974). J. Acoust. Soc. Am. 55 (5), 953.)

100%; that is, the incident wave continues to move along the original path. As $\theta \to \pi/2$ (grazing incidence), $V \to -1$ and $W \to 0$. That is, no wave is transmitted into the second medium at grazing incidence. For $\theta$ such that $(\rho_1/\rho)\cos\theta = (k_1/k)\cos\theta_1$, the reflection coefficient vanishes, and there is complete transmission, similar to the case in which the two media are the same.
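The relations (51), (53), and (56) can be evaluated directly, as in the following sketch. The function name, the use of a complex square root for $\cos\theta_1$ (so that angles beyond the critical angle do not raise an error), and the sample material values are assumptions made for this illustration.

```python
import cmath
import math


def reflection_transmission(rho, c, rho1, c1, theta):
    """Return (V, W) for a plane wave incident at angle theta (radians)."""
    sin_t1 = (c1 / c) * math.sin(theta)          # Snell's law, Eq. (51)
    cos_t1 = cmath.sqrt(1.0 - sin_t1 ** 2)       # complex past the critical angle
    m = rho1 / rho                               # density ratio
    n = c / c1                                   # k1 / k, Eq. (56)
    v = (m * math.cos(theta) - n * cos_t1) / (m * math.cos(theta) + n * cos_t1)
    w = 1.0 + v                                  # Eq. (53)
    return v, w


if __name__ == "__main__":
    # Identical media: no reflection, full transmission.
    print(reflection_transmission(1000.0, 1500.0, 1000.0, 1500.0, 0.3))
    # Hypothetical denser, faster lower medium at normal incidence.
    print(reflection_transmission(1000.0, 1500.0, 2000.0, 3000.0, 0.0))
```

At normal incidence the expression reduces to the familiar impedance form $(\rho_1 c_1 - \rho c)/(\rho_1 c_1 + \rho c)$, which gives a quick check on any implementation.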

C. Diffraction

One of the most interesting practical applications of diffraction is in barriers. Figure 7 shows results of diffraction over a wide barrier.

This plot illustrates how sound bends around corners. As the listener gets closer to the barrier (i.e., as $\theta \to 0$), the sound is reduced by almost 40 dB for the case shown. When the listener is at the top of the barrier ($\theta \to 90°$), the reduction is only 20 dB. In the first case ($\theta \to 0$), the sound has to bend around the two corners. However, for the second case, it has to bend only around the first corner. Such barriers are often used to block the noise from superhighways to housing developments. As can be seen from the curve, the listener has to be well in the shadow zone to achieve maximum benefits.

D. Interference

If the pressure in two different acoustic waves is $p_1$, $p_2$ and the velocity of the waves is $u_1$, $u_2$, respectively, then the intensity $I$ for each of the waves is

$$I_1 = p_1 u_1, \qquad I_2 = p_2 u_2 \tag{57}$$

When the waves are combined, the pressure $p$ in the combined wave is

$$p = p_1 + p_2 \tag{58}$$

and the velocity $u$ in the combined wave is

$$u = u_1 + u_2 \tag{59}$$

The intensity of the combined wave is

$$I = pu = (p_1 + p_2)(u_1 + u_2) = p_1 u_1 + p_2 u_2 + p_2 u_1 + p_1 u_2 = I_1 + I_2 + (p_2 u_1 + p_1 u_2) \tag{60}$$

Equation (60) states that the intensity of the combined wave is not merely the sum of the intensities of the individual waves; there is an extra term, called the interference term. The fact that the superposition principle does not hold for intensity in linear acoustics is known as interference. If both $u_1$ and


$u_2$ are positive, then what results is called constructive interference. If $u_1 = -u_2$, then $I = 0$ and what results is called destructive interference.

E. Scattering

The discussion of reflection, refraction, and transmission was limited to waves that impinged on a flat infinite surface such as the interface between two fluids. In those cases, the phenomena of reflection, refraction, and transmission were clear cut, and the various phenomena could be separated.

If the acoustic wave encounters a finite object, the processes are not so clear cut and cannot be separated. The process of scattering actually involves reflection, refraction, and transmission combined, but it is called scattering because the wave scatters from the object in all directions.

Consider the classical two-dimensional problem of a plane wave impinging on a rigid cylinder, as shown in Fig. 8. The intensity of the scattered wave can be written:

$$I_s \approx \left(\frac{2 I_o a}{\pi r}\right)|\psi_s(\theta)|^2 \tag{61}$$

where $I_o$ is the intensity of the incident wave ($I_o = P_o^2/2\rho_o c_o$, where $P_o$ is the pressure in the incident wave, $\rho_o$ is the density of the medium, and $c_o$ is the sound velocity in the medium), and $\psi_s(\theta)$ is a distribution function. Figure 9 shows the scattered power and distribution in intensity for various values of $ka$.

Several interesting cases can be noted. If $ka \to 0$, then the wavelength of the sound is very large compared with the radius of the cylinder, and the scattered power goes to zero. This means that the sound travels as if the object were not present at all. If the wavelength is very small

FIGURE 8 Plane wave impinging on a rigid cylinder. $I_o$ = intensity of incident plane wave; $a$ = radius of cylinder; $r$, $\theta$ = cylindrical coordinates of field point.

compared with the cylinder radius, it can be shown that most of the scattering is in the backward direction in the form of an echo or reflection, in the same manner as would occur at normal incidence of a plane wave on an infinite plane. Thus, for small $ka$ (low frequency), there is mostly forward scattering, and for large $ka$ (high frequency), there is mostly backscattering.

Consider now the contrast between scattering from elastic bodies and scattering from rigid bodies. Let a plane wave of magnitude $p_o$ and frequency $\omega$ impinge broadside on a cylinder, as shown in Fig. 10(a). Let $f_\infty(\theta)$ be defined as follows:

$$f_\infty(\theta) = \left(\frac{2r}{a}\right)^{1/2}\frac{p_s(\theta)}{p_o} = \text{form function}$$

where $r$ = radial distance to the point where the scattered pressure is being measured; $a$ = outside radius of the cylinder; $b$ = inside radius of a shell whose outside radius is $a$; $p_s(\theta)$ = amplitude of scattered pressure; $p_o$ = amplitude of incident wave; $ka = \omega a/c_o$; $\omega = 2\pi f$; $f$ = frequency of incoming wave; $c_o$ = sound velocity in the medium.

Figure 10(b) shows the form function for a rigid cylinder as a function of $ka$. Figure 10(c) shows this function for a rigid sphere of outside radius $a$. Contrast this with Fig. 10(d), which gives the form function for a solid aluminum cylinder in water, and with Fig. 10(e), which shows the function for elastic aluminum shells of various thicknesses. As one can see, the elasticity of the body has a dominant effect on the acoustic scattering from the body.

VI. FREE AND CONFINED WAVES

A. Propagating Waves

The acoustic wave equation states that the pressure satisfies the equation:

$$c^2\nabla^2 p = \frac{\partial^2 p}{\partial t^2} \tag{62}$$

For illustrative purposes, consider the one-dimensional case in which the waves are traveling in the $x$ direction. The equation satisfied in this case is

$$c^2\frac{\partial^2 p}{\partial x^2} = \frac{\partial^2 p}{\partial t^2} \tag{63}$$

The most general solution to this equation can be written in the form:

$$p = f_1(x + ct) + f_2(x - ct) \tag{64}$$

This solution consists of two free traveling waves moving in opposite directions.


FIGURE 9 Scattered power and distribution in intensity for a rigid cylinder. $\psi_s(\theta)$ = angular distribution function; $a$ = radius of cylinder; $I_o$ = incident intensity of plane wave; $k = \omega/c_o$; $\omega$ = frequency of wave; $c_o$ = sound velocity in the medium. (Reprinted by permission from Lindsay, R. B. (1960). “Mechanical Radiation,” McGraw-Hill, New York.)

B. Standing Waves

Consider the waves described immediately above, but limit the discussion to harmonic motion of frequency $\omega$. One of the waves takes the form:

$$p = A\cos(kx - \omega t) \tag{65}$$

where $k = \omega/c = 2\pi/\lambda$ and $\lambda$ = wavelength of the sound. This equation can also be written in the form:

$$p = A\cos k(x - ct) \tag{66}$$

If this wave hits a rigid barrier, another wave of equal magnitude is reflected back toward the source of the wave motion. The reflected wave is of the form:

$$p = A\cos k(x + ct) \tag{67}$$

If the source continues to emit waves of the form of Eq. (66), and reflections of the form of Eq. (67) come back, then the resulting pattern is a superposition of the waves, that is,

$$p = A\cos(kx + \omega t) + A\cos(kx - \omega t) \tag{68}$$

or

$$p = 2A\cos kx\,\cos\omega t$$

The resultant wave pattern no longer has the characteristics of a traveling wave. The pattern is stationary and is known as a standing wave.
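A short numerical check of Eq. (68) is sketched below: the sum of the two traveling waves is compared with the product form $2A\cos kx\cos\omega t$, and the pressure at a node (a quarter wavelength from $x = 0$) is shown to stay at zero for all times. The function names and sample values are assumptions for illustration.

```python
import math


def superposed(a, k, omega, x, t):
    """Sum of the incident and reflected waves, Eq. (68)."""
    return a * math.cos(k * x + omega * t) + a * math.cos(k * x - omega * t)


def standing(a, k, omega, x, t):
    """Product form of the standing wave, 2 A cos(kx) cos(wt)."""
    return 2.0 * a * math.cos(k * x) * math.cos(omega * t)


if __name__ == "__main__":
    a, wavelength, c = 1.0, 2.0, 340.0
    k = 2.0 * math.pi / wavelength
    omega = k * c
    for t in (0.0, 0.001, 0.002):
        # First column: the two forms agree; second: the node at x = lambda/4.
        print(superposed(a, k, omega, 0.3, t) - standing(a, k, omega, 0.3, t),
              superposed(a, k, omega, wavelength / 4.0, t))
```

The fixed positions of the zeros, independent of $t$, are what distinguish the standing wave from either traveling wave alone.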

C. Reverberation

When traveling waves are sent out in an enclosed space, they reflect from the walls and form a standing wave pattern in the space. This is a very simple description of a very complicated process in which waves impinge on the walls from various angles, reflect, and impinge again on another wall, and so on. The process of reflection takes place continually, and the sound is built up into a sound field known as a reverberant field. If the source of the sound is cut off, the field decays. The amount of time that it takes for the sound energy density to decay by a factor of $10^6$ (i.e., 60 dB) is called the reverberation time of the room. The sound energy density concept was used by Sabine in his fundamental discoveries on reverberation. He found that sound fills a reverberant room in such a way that the average energy per unit volume (i.e., the energy density) in any region is nearly the same as in any other region.

The amount of reverberation depends on how much sound is absorbed by the walls in each reflection and in the air. The study of room reverberation and the answering of questions such as how much sound is absorbed by people in the room, and by other absorbers placed in the room, are included in the field of architectural acoustics. The acoustical design of concert halls or any structures in which sound and reverberation are of importance is a specialized and intricate art.


FIGURE 10 (a) The geometry used in the description of the scattering of a plane wave by an infinitely long cylinder. (b) The form function for a rigid cylinder. (c) The form function vs. $ka$ for a rigid sphere.

D. Wave Guides and Ducts

When a wave is confined in a pipe or duct, the duct is known as a wave guide because it prevents the wave from moving freely into the surrounding medium. In discussing normal mode theory in connection with underwater sound propagation, the boundary conditions were stipulated on the surface and bottom. The problem thus became one of propagation in a wave guide.

One wave guide application that leads to interesting implications when coupled and uncoupled systems are considered is the propagation of axially symmetric waves in a fluid-filled elastic pipe. If axially symmetric pressure waves of magnitude $p_o$ and frequency $\omega$ are sent out from a plane wave source in a fluid-filled circular pipe, the pressure at any time $t$ and at any location $x$ from the source can be written as follows:

$$p \approx p_o\left[1 + i\omega\left(\frac{r^2}{2ac\zeta_t}\right)\right] e^{-(x\kappa_t/a) + i[\omega/c + (\sigma_t/a)]x - i\omega t} \tag{69}$$

where $r$ is the radial distance from the center of the pipe to any point in the fluid, $a$ the mean radius of the pipe, $c$ the sound velocity in the fluid inside the pipe, $\omega$ the radian frequency of the sound, $x$ the longitudinal distance from the disturbance to any point in the fluid, and $z_t$ the impedance of the pipe such that:

$$\frac{1}{z_t} = \frac{1}{\rho c\,\zeta_t} = \frac{1}{\rho c}(\kappa_t - i\sigma_t) \tag{70}$$

where $\kappa_t/\rho c$ is the conductance of the pipe and $\sigma_t/\rho c$ the susceptance of the pipe. The approximate value of the wave velocity $\upsilon$ down the tube is

$$\upsilon = c\left[1 - \sigma_t\left(\frac{\lambda}{2\pi a}\right)\right] \tag{71}$$

If the tube wall were perfectly rigid, then the tube impedance would be infinite ($\sigma_t \to 0$) and the velocity would be $c$. The attenuation is given by the factor $e^{-\kappa_t x/a}$ in Eq. (69). If the tube were perfectly rigid ($\kappa_t = 0$), then the attenuation would be zero. If the tube is flexible, then energy gradually leaks out as the wave travels and the wave in the fluid attenuates. This phenomenon is used extensively in trying to reduce sound in tubes and ducts by using acoustic liners. These acoustic liners are flexible and absorb energy as the wave travels down the tube.

One critical item must be mentioned at this point. It has been assumed that the tube impedance (or conductance and susceptance) can be calculated independently. This is an assumption that can lead to gross errors in certain cases. It will be discussed further when coupled systems are considered.

The equation for an axisymmetric wave propagating in a rigid circular duct or pipe is as follows:

$$p(r, z) = p_{m0}\,J_0(\alpha_{0m} r/a)\,e^{i(\gamma_{0m} z - \omega t)}$$

where $p_{m0}$ is the amplitude of the pressure wave, $r$ is the radial distance from the center of the pipe to any point, $a$ is the radius of the pipe, $z$ is the distance along the pipe, $\omega$ is the radian frequency, and $t$ is time.


FIGURE 10 (continued). (d) Top, the form function for an aluminum cylinder in water. Bottom, comparison of theory (——) and experimental observation (the points) for an aluminum cylinder in water.

$\gamma_{0m}$ and $\alpha_{0m}$ are related by the following formula:

$$\gamma_{0m} = \left[k^2 - (\alpha_{0m}/a)^2\right]^{1/2}$$

$J_0$ is the Bessel function of order 0, and

$$k = 2\pi f/c_i$$

In the above relation, $f$ is the frequency of the wave and $c_i$ is the sound velocity in the fluid inside the pipe (for water this sound velocity is about 5000 ft/sec and for air it is about 1100 ft/sec). The values of $\alpha_{0m}$ for the first few $m$ are

$m = 0$: $\alpha_{00} = 0$

$m = 1$: $\alpha_{01} = 3.83$

$m = 2$: $\alpha_{02} = 7.02$

If $k < \alpha_{0m}/a$, then $\gamma_{0m}$ is a pure imaginary number and the pressure takes the form:

$$p(r, z) = P_{m0}\,J_0(\alpha_{0m} r/a)\,e^{-|\gamma_{0m}| z}\,e^{-i\omega t}$$

which is the equation for a decaying wave in the $z$ direction. For frequencies which give $k < \alpha_{0m}/a$, no wave is propagated down the tube. Propagation takes place only for frequencies at which $k > \alpha_{0m}/a$. Since $\alpha_{00} = 0$, propagation always takes place for this mode. The frequency at which $\gamma_{0m}$ is 0 is called the cutoff frequency and is as follows:

$$f_{0m} = c_i\,\alpha_{0m}/2\pi a$$

For frequencies below the cutoff frequency, no propagation takes place. For general asymmetric waves, the pressure takes the form:

$$p(r, z, \phi) = P_{nm}\,J_n(\alpha_{nm} r/a)\,e^{i(\gamma_{nm} z - \omega t)}\cos n\phi$$

where $\gamma_{nm} = [k^2 - (\alpha_{nm}/a)^2]^{1/2}$ and $f_{nm} = c_i\,\alpha_{nm}/2\pi a$. A few values of $\alpha_{nm}$ for $n > 0$ are as follows:

$\alpha_{10} = 1.84$, $\alpha_{11} = 5.31$, $\alpha_{12} = 8.53$

$\alpha_{20} = 3.05$, $\alpha_{21} = 6.71$, $\alpha_{22} = 9.97$

It is seen that only $\alpha_{00} = 0$, and this is the only mode that propagates at all frequencies regardless of the size of the duct.

Consider a 10-in.-diameter pipe containing water. The lowest cutoff frequency greater than the 00 mode is


FIGURE 10 (continued). (e) The form function vs. $ka$ over the range $0.2 \le ka \le 20$ for aluminum shells with $b/a$ values of (a) 0.85, (b) 0.90, (c) 0.92, (d) 0.94, (e) 0.96, (f) 0.98, and (g) 0.99. (Figs. 10(a–e) reprinted by permission from Neubauer, W. G. (June 1986). “Acoustic Reflection from Surfaces and Shapes,” Chapter 4, Eq. (6) and Figs. 1, 2, 7(a), 13, and 27, Naval Research Laboratory, Washington, D.C.)

3516 Hz. Thus, nothing will propagate below 3516 Hz except the lowest axisymmetric mode (i.e., the 00 mode). If the pipe were 2 in. in diameter, then nothing would propagate in the pipe below 17,580 Hz except the 00 mode. This means that in a great many practical cases, no matter what is exciting the sound inside the duct, only the lowest axisymmetric mode (00 mode) will propagate in the duct.
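The pipe example can be reproduced from the cutoff formula $f_{nm} = c_i\alpha_{nm}/2\pi a$, as sketched below. The dictionary simply holds the $\alpha_{nm}$ values tabulated in the text, and the function names are assumptions for this illustration. Because the tabulated $\alpha_{10} = 1.84$ is rounded, the computed cutoffs agree with the quoted 3516 Hz and 17,580 Hz only to within a few hertz.

```python
import math

# alpha_nm values quoted in the text, keyed by the index pair (n, m).
ALPHA = {
    (0, 0): 0.0, (0, 1): 3.83, (0, 2): 7.02,
    (1, 0): 1.84, (1, 1): 5.31, (1, 2): 8.53,
    (2, 0): 3.05, (2, 1): 6.71, (2, 2): 9.97,
}


def cutoff_frequency(alpha_nm, c_i, radius):
    """Cutoff frequency f_nm = c_i * alpha_nm / (2 * pi * a)."""
    return c_i * alpha_nm / (2.0 * math.pi * radius)


def lowest_nonzero_cutoff(c_i, radius):
    """Return the mode with the smallest nonzero alpha and its cutoff."""
    mode = min((nm for nm in ALPHA if ALPHA[nm] > 0.0), key=ALPHA.get)
    return mode, cutoff_frequency(ALPHA[mode], c_i, radius)


if __name__ == "__main__":
    c_water = 5000.0                                  # ft/sec, as in the text
    print(lowest_nonzero_cutoff(c_water, 5.0 / 12.0))   # 10-in.-diameter pipe
    print(lowest_nonzero_cutoff(c_water, 1.0 / 12.0))   # 2-in.-diameter pipe
```

Since the cutoff scales inversely with the radius, shrinking the pipe by a factor of five raises every cutoff by the same factor.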

VII. SOUND RADIATION AND VIBRATION

A. Helmholtz Integral, Sommerfeld Radiation Condition, and Green’s Functions

In this presentation, acoustic fields that satisfy the wave equation of linear acoustics are of interest:

$$c^2\nabla^2 p = \partial^2 p/\partial t^2 \tag{72}$$

where $p$ is the pressure in the field at point $P$ and at time $t$. For sinusoidal time-varying fields,

$$p(P, t) = p(P)\,e^{i\omega t} \tag{73}$$

so that $p$ satisfies the Helmholtz equation:

$$\nabla^2 p + k^2 p = 0, \qquad k^2 = \omega^2/c^2 \tag{74}$$

This Helmholtz equation can be solved in general form by introducing an auxiliary function called the Green’s function. First, let $\varphi$ and $\psi$ be any two functions of space variables that have first and second derivatives on $S$ and outside $S$ (see Fig. 11). Let $V_o$ be the volume between $S_o$ and the boundary at $\infty$. Green’s theorem states that within the volume $V_o$,

$$\int_S \left(\varphi\frac{\partial\psi}{\partial n} - \psi\frac{\partial\varphi}{\partial n}\right) dS = \int_{V_o} \left(\varphi\nabla^2\psi - \psi\nabla^2\varphi\right) dV_o \tag{75}$$

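where $S$ denotes the entire boundary surface and $V_o$ the entire volume of the region outside $S_o$. In Eq. (75), $\partial/\partial n$ denotes the normal derivative at the boundary surface. Rearrange the terms in Eq. (75) and subtract

$$\int_{V_o} k^2\varphi\psi\,dV_o$$

from each side, and the result is

$$\int_S \varphi\frac{\partial\psi}{\partial n}\,dS - \int_{V_o}\varphi\left(\nabla^2\psi + k^2\psi\right) dV_o = \int_S \psi\frac{\partial\varphi}{\partial n}\,dS - \int_{V_o}\psi\left(\nabla^2\varphi + k^2\varphi\right) dV_o \tag{76}$$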

Now choose $\varphi$ as the pressure $p$ in the region $V_o$; thus,

$$\nabla^2 p + k^2 p = \nabla^2\varphi + k^2\varphi = 0 \tag{77}$$

and choose $\psi$ as a function that satisfies:

$$\nabla^2\psi + k^2\psi = \delta(P - P') \tag{78}$$

where $\delta(P - P')$ is a δ function of field points $P$ and $P'$. Choose another symbol for $\psi$, that is,

$$\psi = g(P, P', \omega) \tag{79}$$

By virtue of the definition of the δ function, the following is obtained:

$$\int_{V_o}\varphi(P')\,\delta(P - P')\,dP' = \varphi(P) \tag{80}$$


FIGURE 11 The volume and boundary surface. $S_\infty$ = surface at $\infty$; $S_o$ = radiating surface; $V_o$ = volume between $S_\infty$ and $S_o$; $S_1$, $S_2$ = surfaces connecting $S_o$ to $S_\infty$.

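Thus, Eq. (76) becomes:

$$\int_S p(S, \omega)\frac{\partial g(P, S, \omega)}{\partial n}\,dS - p(P, \omega) = \int_S g(P, S, \omega)\frac{\partial p(S, \omega)}{\partial n}\,dS \tag{81}$$

or

$$p(P, \omega) = \int_S \left[p(S, \omega)\frac{\partial g(P, S, \omega)}{\partial n} - g(P, S, \omega)\frac{\partial p(S, \omega)}{\partial n}\right] dS \tag{82}$$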

It is now clear that the arbitrary function $\psi$ was chosen so that the volume integral would reduce to the pressure at $P$. The function $g$ is the Green’s function, which thus far is a completely arbitrary solution of:

$$\nabla^2 g(P, P', \omega) + k^2 g(P, P', \omega) = \delta(P - P') \tag{83}$$

For this Green’s function to be a possible solution, it must satisfy the condition that there are only outgoing traveling waves from the surface $S_o$ to $\infty$, and no waves are coming in from $\infty$. Sommerfeld formulated this condition as follows:

$$\lim_{r\to\infty} r\left(\frac{\partial g}{\partial r} - ikg\right) = 0 \tag{84}$$

where $r$ is the distance from any point on the surface to any point in the field. A solution that satisfies Eqs. (83) and (84) can be written as follows:

$$g = \frac{1}{4\pi}\frac{e^{ikr}}{r} \tag{85}$$

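This function is known as the free-space Green’s function. Thus, Eq. (82) can be written in terms of this free-space Green’s function as follows:

$$p(P, \omega) = \frac{1}{4\pi}\int_S \left[p(S, \omega)\frac{\partial}{\partial n}\left(\frac{e^{ikr}}{r}\right) - \frac{e^{ikr}}{r}\frac{\partial p(S, \omega)}{\partial n}\right] dS \tag{86}$$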

Several useful alternative forms of Eqs. (82) and (86) can be derived. If a Green’s function can be found whose normal derivative vanishes on the boundary $S_o$, then:

$$\frac{\partial g}{\partial n}(P, S, \omega) = 0 \quad \text{on } S_o$$

and Eq. (82) becomes:

$$p(P, \omega) = -\int_S g(P, S, \omega)\frac{\partial p}{\partial n}\,dS \tag{87}$$

Alternatively, if a Green’s function can be found that itself vanishes on the boundary $S_o$, then $g(P, S, \omega) = 0$ on $S_o$ and Eq. (82) becomes:

$$p(P, \omega) = \int_S p(S, \omega)\frac{\partial g(P, S, \omega)}{\partial n}\,dS \tag{88}$$

From Newton’s law,

$$\partial p/\partial n = -\rho w_n \tag{89}$$

where $w_n$ is the normal acceleration of the surface. Thus, Eq. (87) can be written as:

$$p(P, \omega) = \rho\int_S g(P, S, \omega)\,w_n\,dS \tag{90}$$

If $\gamma$ is the angle between the outward normal to the surface at $S$ and the line drawn from $S$ to $P$, then Eq. (86) can be written in the form (assuming harmonic motion):

$$p(P, \omega) = \frac{1}{4\pi}\int_S \left(\rho w_n(S) - ikp(S)\cos\gamma\right)\frac{e^{ikr}}{r}\,dS \tag{91}$$

Since $p(S)$ is unknown, Eq. (86) or, alternatively, Eq. (91) is an integral equation for $p$.


An interesting limiting case can be studied. Assume a low-frequency oscillation of the body enclosed by $S_o$. For this case, $k$ is very small and Eq. (91) reduces to:

$$p(P, \omega) = \frac{1}{4\pi r}\int_S \rho w_n\,dS \tag{92}$$

or

$$p(P, \omega) = \rho V/4\pi R \tag{93}$$

where $V$ is the volume acceleration of the body.

For some cases of slender bodies vibrating at low frequency, it can be argued that the term involving $p$ in Eq. (91) is small compared with the acceleration term. For these cases,

$$p(P, \omega) \approx \frac{1}{4\pi}\int_S \rho w_n(S)\frac{e^{ikr}}{r}\,dS \tag{94}$$

B. Rayleigh’s Formula for Planar Sources

It was stated in the last section that if a Green’s function could be found whose normal derivative vanishes at the boundary, then the sound pressure radiated from the surface could be written as Eq. (87) or, alternatively, using Eq. (89):

$$p(P, \omega) = \rho\int_S g(P, S, \omega)\,w_n(S)\,dS \tag{95}$$

It can be shown that such a Green’s function for an infinite plane is exactly twice the free-space Green’s function, that is,

$$g(P, S, \omega) = 2 \times \frac{1}{4\pi}\frac{e^{ikr}}{r} \tag{96}$$

The argument here is that the pressure field generated by two identical point sources in free space displays a zero derivative in the direction normal to the plane of symmetry of the two sources.

Substituting Eq. (96) into Eq. (95) gives:

$$p = \frac{\rho}{2\pi}\int_S \frac{e^{ikr}}{r}\,w_n(S)\,dS \tag{97}$$

Equation (97) is known as Rayleigh’s formula for planar sources.
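As one way to see Rayleigh’s formula at work, the sketch below integrates it numerically for a uniformly accelerating circular disc in an infinite plane, evaluated at a point on the axis of the disc. It takes the formula in the form $p = (\rho/2\pi)\int_S (e^{ikr}/r)\,w_n\,dS$, i.e., twice the free-space Green’s function with the $1/4\pi$ factor. The disc geometry, the ring-by-ring midpoint rule, the function names, and the sample values are all assumptions for this example. On the axis the integral can also be done in closed form by substituting $r^2 = z^2 + s^2$, which gives a check on the quadrature.

```python
import cmath
import math


def rayleigh_on_axis(rho, w_n, k, a, z, n_rings=4000):
    """Midpoint-rule evaluation of Rayleigh's formula on the disc axis."""
    ds = a / n_rings
    total = 0.0 + 0.0j
    for i in range(n_rings):
        s = (i + 0.5) * ds                 # ring radius on the disc
        r = math.hypot(z, s)               # distance from ring to field point
        total += cmath.exp(1j * k * r) / r * w_n * (2.0 * math.pi * s * ds)
    return rho / (2.0 * math.pi) * total


def closed_form_on_axis(rho, w_n, k, a, z):
    """Exact on-axis integral: rho * w_n * (e^{ik r_edge} - e^{ikz}) / (ik)."""
    r_edge = math.hypot(z, a)
    return rho * w_n * (cmath.exp(1j * k * r_edge) - cmath.exp(1j * k * z)) / (1j * k)


if __name__ == "__main__":
    args = (1000.0, 1.0, 20.0, 0.1, 0.5)   # rho, w_n, k, a, z (hypothetical values)
    print(abs(rayleigh_on_axis(*args)), abs(closed_form_on_axis(*args)))
```

For field points off the axis no such closed form is available in general, and the surface integral has to be evaluated numerically over the full disc.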

C. Vibrating Structures and Radiation: Multipole Contributions

To make it clear how the above relations can be applied, consider the case of a slender vibrating cylindrical surface

FIGURE 12 Geometry of the vibrating cylinder. $\mathbf{R}_{S_1}$ = radius vector to point $S_1$ on the cylinder surface; $\mathbf{R}_1$ = radius vector to the far-field point; $\mathbf{a}_{R_1}$ = unit vector in the direction of $\mathbf{R}_1$; $P_1$ = far-field point (with spherical coordinates $R_1$, $\varphi_1$, $\theta_1$). (Reprinted by permission from Greenspon, J. E. (1967). J. Acoust. Soc. Am. 41 (5), 1203.)

at low frequency such that Eq. (94) applies. The geometry of the problem is shown in Fig. 12.

$$\frac{e^{ikr}}{r} = \frac{e^{ikR_1}}{R_1}\,e^{-ik(\mathbf{a}_{R_1}\cdot\mathbf{R}_{S_1})}$$

$$\mathbf{a}_{R_1}\cdot\mathbf{R}_{S_1} = z_o\cos\theta_1 + x_o\sin\theta_1\cos\varphi_1 + y_o\sin\theta_1\sin\varphi_1 \tag{98}$$

where $x_o$, $y_o$, $z_o$ are the rectangular coordinates of a point on the vibrating surface of the structure, $\mathbf{R}_{S_1}$ is the radius vector to point $S_1$ on the surface, $R_1$, $\theta_1$, $\varphi_1$ are the spherical coordinates of point $P_1$ in the far field, and $\mathbf{a}_{R_1}$ is a unit vector in the direction of $\mathbf{R}_1$ (the radius vector from the origin to the far-field point). Thus, $\mathbf{a}_{R_1}\cdot\mathbf{R}_{S_1}$ is the projection of $\mathbf{R}_{S_1}$ on $\mathbf{R}_1$, making $R_1 - \mathbf{a}_{R_1}\cdot\mathbf{R}_{S_1}$ the distance from the far-field point to the surface point.

Assume the acceleration distribution of the cylindrical surface to be that of a freely supported shell subdivided into its modes, that is,

$$w_n(S) = \sum_{m=1}^{\infty}\sum_{q=0}^{\infty}\left(A_{mq}\cos q\varphi + B_{mq}\sin q\varphi\right)\sin\frac{m\pi z}{l} \tag{99}$$

Expression (99) is a half-range Fourier expansion in the longitudinal $z$ direction (which is chosen as the distance along the generator) between bulkheads, which are a distance $l$ apart. It is a full-range Fourier expansion in the peripheral $\varphi$ direction. It is known that such an expression does approximately satisfy the differential equations of shell vibration and practical end conditions at the edges of the compartment. Substituting Eqs. (98) and (99) into (94) and integrating results in:


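$$p(P_1,\omega) = \frac{\rho_o e^{ikR_1}}{4\pi R_1}\sum_{m=1}^{\infty}\sum_{q=0}^{\infty}\left(A_{mq}\cos q\varphi_1 + B_{mq}\sin q\varphi_1\right) J_q(ka\sin\theta_1)\,2\pi a l\, i^q \left\{\left[\frac{1}{2(m\pi - kl\cos\theta_1)} + \frac{1}{2(m\pi + kl\cos\theta_1)}\right] - \left[\frac{\cos(m\pi - kl\cos\theta_1)}{2(m\pi - kl\cos\theta_1)} + \frac{\cos(m\pi + kl\cos\theta_1)}{2(m\pi + kl\cos\theta_1)}\right] - i\left[\frac{\sin(m\pi - kl\cos\theta_1)}{2(m\pi - kl\cos\theta_1)} - \frac{\sin(m\pi + kl\cos\theta_1)}{2(m\pi + kl\cos\theta_1)}\right]\right\}$$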

Consider the directivity pattern in the horizontal plane of the cylindrical structure, that is, at $\varphi_1 = \pi/2$, and let us examine the $A_{mq}$ term in the series above. After some algebraic manipulation, the amplitude of the pressure can be written as follows. For $m\pi \neq kl\cos\theta_1$,

$$I_{mq} = \left|\,p_{mq}\Big/\left(2\pi a l\,A_{mq}\,\frac{\rho_o e^{ikR_1}}{4\pi R_1}\cos q\varphi_1\right)\right| = \frac{\sqrt{2}\,m\pi\,J_q(ka\sin\theta_1)\sqrt{1-(-1)^m\cos(kl\cos\theta_1)}}{(m\pi)^2-(kl\cos\theta_1)^2}$$

For $m\pi = kl\cos\theta_1$,

$$I_{mq} = \tfrac{1}{2}J_q(ka\sin\theta_1) \tag{100}$$

where $J_q$ is the Bessel function of order q. Figure 13 shows the patterns of the far-field pressure for various values of ka, m, and q. A source pattern is defined as one that is uniform in all directions. A dipole pattern has two lobes, a quadrupole pattern has four lobes, and so on. Note that for ka = 0.1 and q = 1, the modes m = 1, 3, 5 all show dipole-type radiation. In general, Fig. 13 shows how the multipole contributions depend upon the spatial pattern of the acceleration of the structure.

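For low frequencies where $kl \ll m\pi$, it is seen that (noting that $\pi l/\lambda = kl/2$), for m even,

$$I_{mq} \approx \frac{2J_q(ka\sin\theta_1)\,\sin\!\left(\dfrac{\pi l}{\lambda}\cos\theta_1\right)}{m\pi} \tag{101}$$

and for m odd,

$$I_{mq} \approx \frac{2J_q(ka\sin\theta_1)\,\cos\!\left(\dfrac{\pi l}{\lambda}\cos\theta_1\right)}{m\pi}$$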

Thus, at low frequencies (i.e., for $kl \ll m\pi$), the structural modes radiate as though there were two sources at the edges of the compartment (i.e., a distance l apart). What results is the directivity pattern for two point sources at the edges of the compartment, modified by $J_q(ka\sin\theta_1)$. If the longitudinal mode number m is even, then the sources are 180 degrees out of phase, and if m is odd, the sources are in phase. Such modes are called edge modes.
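As a rough numerical illustration of Eqs. (100) and (101), the following Python sketch evaluates the magnitude of the directivity factor and compares it with the low-frequency edge-mode form. The function names, the midpoint-rule evaluation of the Bessel function, and the sample values (ka = 0.1, kl = 0.4, θ1 = 60°) are assumptions chosen for illustration; they are not taken from the original reference.

```python
import math

def bessel_j(q, x, steps=2000):
    """Integer-order Bessel function J_q(x) from Bessel's integral
    J_q(x) = (1/pi) * int_0^pi cos(q*t - x*sin(t)) dt (midpoint rule)."""
    h = math.pi / steps
    total = sum(math.cos(q * (n + 0.5) * h - x * math.sin((n + 0.5) * h))
                for n in range(steps))
    return total * h / math.pi

def directivity(m, q, ka, kl, theta):
    """Magnitude of I_mq from Eq. (100) at polar angle theta (radians)."""
    x = kl * math.cos(theta)
    jq = abs(bessel_j(q, ka * math.sin(theta)))
    denom = (m * math.pi) ** 2 - x ** 2
    if abs(denom) < 1e-9:                 # special case m*pi = kl*cos(theta)
        return 0.5 * jq
    root = math.sqrt(max(0.0, 1.0 - (-1) ** m * math.cos(x)))
    return math.sqrt(2.0) * m * math.pi * jq * root / abs(denom)

def edge_mode_approx(m, q, ka, kl, theta):
    """Low-frequency form, Eq. (101) and its odd-m counterpart (kl << m*pi)."""
    half = 0.5 * kl * math.cos(theta)     # (pi*l/lambda)*cos(theta)
    trig = math.sin(half) if m % 2 == 0 else math.cos(half)
    jq = abs(bessel_j(q, ka * math.sin(theta)))
    return 2.0 * jq * abs(trig) / (m * math.pi)

# Assumed example: ka = 0.1 with L/a = 4 gives kl = 0.4.
for m in (1, 2):
    exact = directivity(m, 0, 0.1, 0.4, math.radians(60.0))
    approx = edge_mode_approx(m, 0, 0.1, 0.4, math.radians(60.0))
    print(m, exact, approx)
```

For these values the two expressions agree to within about one percent, which is consistent with the edge-mode interpretation at low frequency.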

D. Vibrations of Flat Plates in Water

Consider a simply supported flat rectangular elastic plate placed in an infinite plane baffle and excited by a point force as shown in Fig. 14. The plate is made of aluminum and is square with each side being 13.8 in. long. Its thickness is 1/4 in., and it has a damping loss factor of 0.05. The force is equal to 1 lb, has a frequency of 3000 cps, and is located at x0 = 9 in., y0 = 7 in. If the plate is stopped at the instant when the deflection is a maximum, the velocity pattern would be as shown in Fig. 14. Since the velocity is just the frequency multiplied by the deflection, this is an instantaneous picture of the plate.

VIII. COUPLING OF STRUCTURE/MEDIUM (INTERACTIONS)

A. Coupled Versus Uncoupled Systems

When a problem involving the interaction between two media can be solved without involving the solution in both media simultaneously, the problem is said to be uncoupled. One example of a coupled system is a vibrating structure submerged in a fluid. Usually the amplitude of vibration depends on the dynamic fluid pressure, and the dynamic pressure in the fluid depends on the amplitude of vibration. In certain limiting cases, the pressure on the structure can be written as an explicit function of the velocity of the structure. In these cases, the system is said to be uncoupled.

Another example is a pipe containing an acoustic liner. Sometimes it is possible to represent the effect of the liner by an acoustic impedance, as described in a previous section of this presentation. Such a theory was offered by Morse. As Scott indicated, implicit in Morse’s theory is the assumption that the motion of the surface between the liner and the fluid depends only on the acoustic impedance and the local acoustic pressure, and not on the acoustic pressure elsewhere. This is associated with the concept of “local” and “extended” reaction. In a truly coupled system, the motion of the surface depends on the distribution of acoustic pressure, and, conversely, the acoustic pressure depends on the distribution of motion. Thus, the reaction of the surface and the pressure produced are interrelated at all points. The motion of the surface at point A is not only governed by the pressure at point A. There is motion at A due to pressure at B and, conversely, motion at B due to pressure at A. Figure 15 illustrates how this assumption can lead to errors in the phase velocity and attenuation in lined ducts.

FIGURE 13 Horizontal directivity patterns for a cylinder in which L/a = 4, where L is the length of the cylinder and a is its radius. The plots show $I_{mq}$ as a function of $\theta_1$ at $\varphi_1 = \pi/2$ for various values of ka, m, and q. (a) ka = 0.1, q = 0; (b) ka = 0.1, q = 1; (c) ka = 0.1, q = 2; (d) ka = 1.0, q = 0; (e) ka = 1.0, q = 1; (f) ka = 3.0, q = 0; (g) ka = 3.0, q = 1; (h) ka = 3.0, q = 5. The numbers shown on the figure are the values of m. (Reprinted by permission from Greenspon, J. E. (1967). J. Acoust. Soc. Am. 41(5), 1205.)

The alternative is to solve the completely coupled problem of the liner and fluid as outlined by Scott.

In aeroelastic or hydroelastic problems, it is necessary to solve the coupled problem of structure and fluid because the stability of the system usually depends on the feeding of energy from the fluid to the structure. Similarly, in acoustoelastic problems such as soft duct liners in pipes, the attenuation of the acoustic wave in the pipe is dependent upon the coupling of this acoustic wave with the wave in the liner. Similar problems were encountered with viscoelastic liners in water-filled pipes. A recent solution of the coupled problem gave realistic results, whereas the uncoupled problem gave attenuations that were much too high, in much the same manner as that shown in Fig. 15.

FIGURE 14 Real part of velocity of plate.

B. Methods for Simplifying Coupled Systems

1. Impedance Considerations

In cases of acoustic waves propagating in pipes or ducts that are not quite rigid but have some elastic deformation as the wave passes, it is satisfactory to use an acoustic impedance to represent the effect of the pipe on the acoustic wave. Only when the acoustic liner or pipe is rather soft and undergoes considerable deformation during the passage of the acoustic wave is it necessary to solve the coupled problem.

It is interesting to contrast a typical uncoupled problem with a typical coupled problem. Consider a plane acoustic wave incident on a plane surface, where ζ is the dimensionless specific impedance of the surface (Fig. 16). The impedance is given by:

$$\frac{p}{\partial p/\partial n} = \frac{Z_n}{ik\rho c} = \frac{\zeta}{ik}, \qquad k = \omega/c \tag{102}$$

In the above equation, p is the pressure, $\partial p/\partial n$ is the normal derivative of the pressure, $Z_n$ is the normal impedance of the surface, ρ is the density of the medium, and c is the sound velocity in the medium. If $p_i$ is the magnitude of the incident pressure and $p_r$ is the magnitude of the reflected pressure from the surface, then the reflection coefficient R can be written as:

$$R = \frac{p_r}{p_i} = \frac{\zeta\cos\varphi_i - 1}{\zeta\cos\varphi_i + 1} \tag{103}$$

Contrast this result with the coupled problem of the reflection coefficient of an incident wave on an elastic boundary (Fig. 17). In this case, the reflection coefficient, R, and the transmission coefficient, S, are

$$R = \frac{Z_- + Z_m - Z_+}{Z_- + Z_m + Z_+} \tag{104}$$

$$S = \frac{2Z_-}{Z_+ + Z_m + Z_-} \tag{104a}$$


FIGURE 15 Comparison of the results of experiment and Morse’s theory for a narrow lined duct. (Reprinted by permission from Scott, R. A. (1946). Proc. Phys. Soc. 58, 358.)

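where

$$Z_- = \frac{\rho_- c_-}{\cos\varphi_t}, \qquad Z_+ = \frac{\rho_+ c_+}{\cos\varphi_i}, \qquad \cos\varphi_t = \sqrt{1 - \left(\frac{c_-}{c_+}\right)^2\sin^2\varphi_i}$$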

$$Z_m = i\omega\delta\left[\left(\frac{c_m}{c_+}\right)^2\sin^2\varphi_i - 1\right]$$

if the elastic surface is a membrane, and

$$Z_m = Z_p = i\omega\delta\left[\left(\frac{c_p}{c_+}\right)^4\sin^4\varphi_i - 1\right]$$

if the elastic surface is a plate.

FIGURE 16 Plane wave incident on impedance surface. ζ = impedance of surface; ρ, c = density and sound velocity in medium; $\varphi_i$ = angle of incidence.

Here δ is the mass per unit area of the surface, $\rho_-$ the density of the lower medium, $\rho_+$ the density of the upper medium, $c_-$ the sound velocity in the lower medium, $c_+$ the sound velocity in the upper medium, $c_m = \sqrt{T/\delta}$ the sound velocity in the membrane, T the tension in the membrane, $c_p = c_+\sqrt{\omega/\nu_+}$ the sound velocity in the plate, $\nu_+^2 = 12\delta(1-\nu^2)c_+^4/(Eh^3)$, E Young's modulus and ν Poisson's ratio for the plate material, and h the thickness of the plate.


FIGURE 17 Plane wave incident on elastic boundary. $\rho_+, c_+$ = density and sound velocity in the upper medium; $\rho_-, c_-$ = density and sound velocity in the lower medium; $\varphi_i$ = angle of incidence.

It is seen that the reflection coefficient for the coupled problem depends on both the media and the characteristics of the surface in between the media. In the uncoupled problem, the reflection coefficient depended only on the impedance of the surface.

Figure 18 illustrates the reflection and transmission coefficients between 0 and 15,000 Hz for 3/8-in., 1/2-in., and 1-in. steel plates with water on both sides and with water on one side and air on the other.

Note that, with water on both sides, most of the sound gets transmitted through the plate at low frequency (below 2000 Hz), whereas most of the sound gets reflected at 15,000 Hz. For plates with air on one side and water on the incoming wave side, the plate acts like a perfect baffle; that is, all energy gets reflected back into the water and no transmission to the air takes place.
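The trends described for Fig. 18 can be checked with a minimal Python sketch of Eqs. (104) and (104a) for a plate. The material constants for steel, water, and air used below are assumed nominal values (they are not given in the text), the transmission angle is taken from Snell's law, and the function name is hypothetical.

```python
import cmath
import math

INCH = 0.0254  # metres per inch

def plate_coefficients(freq_hz, h, rho_up, c_up, rho_low, c_low,
                       phi_i=math.radians(30.0),
                       E=2.0e11, nu=0.3, rho_plate=7850.0):
    """Return (R, S) from Eqs. (104) and (104a) for a plate of thickness h (m).
    E, nu and rho_plate are assumed nominal values for steel."""
    omega = 2.0 * math.pi * freq_hz
    delta = rho_plate * h                                   # mass per unit area
    nu_plus_sq = 12.0 * delta * (1.0 - nu ** 2) * c_up ** 4 / (E * h ** 3)
    cp_ratio4 = omega ** 2 / nu_plus_sq                     # (c_p / c_+)^4
    sin_i = math.sin(phi_i)
    cos_t = cmath.sqrt(1.0 - (c_low / c_up) ** 2 * sin_i ** 2)  # Snell's law
    z_up = rho_up * c_up / math.cos(phi_i)                  # Z+
    z_low = rho_low * c_low / cos_t                         # Z-
    z_m = 1j * omega * delta * (cp_ratio4 * sin_i ** 4 - 1.0)
    total = z_low + z_m + z_up
    return (z_low + z_m - z_up) / total, 2.0 * z_low / total

WATER = (1000.0, 1500.0)   # assumed density (kg/m^3) and sound speed (m/s)
AIR = (1.21, 343.0)        # assumed density (kg/m^3) and sound speed (m/s)

for f in (500.0, 15000.0):
    R, S = plate_coefficients(f, 1.0 * INCH, *WATER, *WATER)
    print("water/water", f, abs(R), abs(S))
R, S = plate_coefficients(500.0, 1.0 * INCH, *WATER, *AIR)
print("water/air", abs(R), abs(S))
```

With these assumed values the 1-in. plate passes most of the sound at 500 Hz and reflects most of it at 15,000 Hz when water is on both sides, while the air-backed plate gives |R| close to unity.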

2. Asymptotic Approximations

In classical scattering problems, solutions to the Helmholtz equation are sought,

$$\nabla^2 p + k^2 p = 0, \qquad k = \omega/c \tag{105}$$

that satisfy the boundary condition at the fluid-structure interface:

$$\partial p/\partial n = \rho\omega^2 w \tag{106}$$

where p is the fluid pressure, ρ is the density of the medium, and w is the normal component of the displacement of the surface of the structure. The pressure in the field can be written as

$$p = p_I + p_{sc} \tag{107}$$

where p is the total field pressure, $p_I$ is the pressure in the incident wave, and $p_{sc}$ is the pressure in the scattered wave. For points P on the structural surface S, the scattered pressure can be written in terms of the Helmholtz relation:

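$$p_{sc}(P) = \frac{1}{4\pi}\int_S\left[p(S)\,\frac{\partial}{\partial n}\left(\frac{e^{ikr}}{r}\right) - \frac{e^{ikr}}{r}\,\frac{\partial p(S)}{\partial n}\right]dS \tag{108}$$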

The equation of the elastic structure can be written in operator form as:

$$L(w) = p(S) \tag{109}$$

where L is a differential operator. Equation (108) is an integral equation for the pressure. Equations (106) to (109) constitute the set of equations needed to solve for the scattered pressure. The integral equation (108) is solved by coupling Eqs. (106), (107), and (109) with Eq. (108), dividing the surface into many parts, and solving the resulting system of equations.

An alternative method of solution is offered by the asymptotic approximations, which give the scattered pressure explicitly in terms of the motion of the surface. First, write the equation of motion of the elastic structure in matrix form as follows:

$$M_s\ddot{x} + C_s\dot{x} + K_s x = f_{int} + f_{ext}$$

$$f_{ext} = -G A_f\,(p_I + p_{sc}) \tag{110}$$

$$G^T\dot{x} = u_I + u_{sc}$$

where x is the structural displacement vector; $M_s$, $C_s$, and $K_s$ are the structural mass, damping, and stiffness matrices, respectively; $A_f$ is a diagonal area matrix for the fluid-structure interface; G is a transformation matrix that relates the forces on the structure to those on the interface; and $f_{int}$ is the known internal force vector. The terms $p_I$ and $u_I$ are the (known) free-field pressure and fluid particle velocity associated with the incident wave, and $p_{sc}$ and $u_{sc}$ are the pressure and fluid particle velocity for the scattered wave. The dots denote differentiation with respect to time. The following fluid-structure interaction equations are then used to relate the pressure and motion on the fluid-structure interface.

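1. First doubly asymptotic approximation (DAA1):

$$M_f\dot{p}_s + \rho c A_f p_s = \rho c M_f\left(G^T\ddot{x} - \dot{u}_I\right) \tag{111}$$

2. Second doubly asymptotic approximation (DAA2):

$$M_f\ddot{p}_s + \rho c A_f\dot{p}_s + \rho c\,\Omega_f A_f p_s = \rho c\left[M_f\left(G^T\dddot{x} - \ddot{u}_I\right) + \Omega_f M_f\left(G^T\ddot{x} - \dot{u}_I\right)\right] \tag{112}$$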

FIGURE 18 (a) Reflection (R) and transmission (S) coefficients for 3/8-in., 1/2-in., and 1-in. steel plates with water on both sides ($\varphi_i$ = 30°). (b) Reflection (R) and transmission (S) coefficients for 3/8-in., 1/2-in., and 1-in. steel plates with water on the incident wave side and air on the other side ($\varphi_i$ = 30°). (Note that ordinates are in thousandths; thus, R ≈ 1 and S is very small.)

Here $p_s$ is the scattered pressure, $M_f$ and $\Omega_f$ are the fluid added-mass and frequency matrices pertaining to the fluid-structure interface, and $G^T$ is the transpose of G.

In essence, the doubly asymptotic approximations uncouple the fluid-structure interaction by giving an explicit relation between pressure and motion of the surface. An illustration of the uncoupling procedure of the asymptotic approximations can be formulated by taking the very simplest case of a plane wave. The pressure and velocity are related by (this holds for high frequency):

p = ρocu (113)

where p is the pressure and u is the velocity. If this pressure were applied to a simple elastic plate, the resulting differential equation would be (noting that $u = \dot{w}$):

$$D\nabla^4 w + \rho h\ddot{w} = -\rho_o c\,\dot{w} \tag{114}$$

Thus, for this simple case, the entire fluid-structure interaction can be solved by one equation instead of having to solve the Helmholtz integral equation coupled with the elastic plate equation.

FIGURE 19 Block diagram for the system. $x_1(t), x_2(t), \ldots, x_n(t)$ = n inputs; $h_1, h_2, \ldots, h_n$ = n transfer functions; $y_1(t), y_2(t), \ldots, y_m(t)$ = m outputs.

IX. RANDOM LINEAR ACOUSTICS

A. Linear System Equations

1. Impulse Response

FIGURE 20 Input divided into infinitesimal impulses. t = time; τ = the value of time at which x(τ) is taken; Δτ = time width of impulse; x(τ) = value of the impulse at time τ.

Consider a system with n inputs and m outputs as shown in Fig. 19. Each of the inputs and outputs is a function of time. The central problem lies in trying to determine the outputs, or some function of the outputs, in terms of the inputs or some function of them. Let any input x(t) be divided into a succession of impulses as shown in Fig. 20. Let h(t − τ) be the response at time t due to a unit impulse at time τ. A unit impulse is defined as one in which the area under the input versus time curve is unity. Thus, if the base is Δτ, the height of the unit impulse is 1/Δτ. Thus, h(t − τ) is the response per unit area (or per unit impulse at t = τ). The area (or impulse) is x(τ)Δτ. The response at time t is the sum of the responses due to all the impulses for all time up to t, that is, from −∞ to t. But it is physically impossible for a system to respond to anything but past inputs; therefore,

h(t − τ ) = 0 for τ > t (115)

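Thus, the upper limit of integration in the superposition integral $y(t) = \int_{-\infty}^{t} h(t-\tau)\,x(\tau)\,d\tau$ can be changed to +∞ without altering its value. By a simple change of variable (θ = t − τ) it can be demonstrated that:

$$y(t) = \int_{-\infty}^{+\infty} h(\theta)\,x(t-\theta)\,d\theta \tag{116}$$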

Since there are n inputs and m outputs, there has to be one of these equations for each input and output. Thus, for the ith input and jth output,

$$y_{ij}(t) = \int_{-\infty}^{+\infty} h_{ij}(\tau)\,x_i(t-\tau)\,d\tau \tag{117}$$
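The convolution in Eq. (116) can be illustrated with a short numerical sketch. The first-order impulse response, its time constant, and the unit-step input below are assumptions chosen only because the exact response, 1 − exp(−t/τc), is known in closed form.

```python
import math

def convolve_response(h, x, t, dt, n_terms):
    """Approximate y(t) = int h(theta) x(t - theta) d(theta), Eq. (116),
    by a midpoint Riemann sum over 0 <= theta < n_terms*dt (h is causal)."""
    total = 0.0
    for n in range(n_terms):
        theta = (n + 0.5) * dt
        total += h(theta) * x(t - theta) * dt
    return total

tau_c = 0.5                                   # assumed time constant (s)
h = lambda th: math.exp(-th / tau_c) / tau_c if th >= 0 else 0.0
x = lambda t: 1.0 if t >= 0 else 0.0          # assumed unit-step input

y_num = convolve_response(h, x, 1.0, 1e-3, 5000)
y_exact = 1.0 - math.exp(-1.0 / tau_c)
print(y_num, y_exact)
```

The sum only needs to run over positive θ because the impulse response vanishes for negative argument, which is the causality condition of Eq. (115).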

2. Frequency Response Function

The frequency response function or the transfer function (the system function, as it is sometimes known) is defined as the ratio of the complex output amplitude to the complex input amplitude for a steady-state sinusoidal input. (The frequency response function is the output per unit sinusoidal input at frequency ω.) Thus, the input is

$$x_i(t) = x_i(\omega)\,e^{i\omega t} \tag{118}$$

and the corresponding output is

$$y_j(t) = y_j(\omega)\,e^{i\omega t} \tag{119}$$

where $x_i(\omega)$ and $y_j(\omega)$ are the complex amplitudes of the input and output, respectively. Then the frequency response function $H_{ij}(\omega)$ is

$$H_{ij}(\omega) = \frac{y_j(\omega)}{x_i(\omega)} \tag{120}$$

For sinusoidal input and output, Eq. (117) becomes:

$$\frac{y_j(\omega)}{x_i(\omega)} = \int_{-\infty}^{+\infty} h_{ij}(\tau)\,e^{-i\omega\tau}\,d\tau \tag{121}$$

This shows that the frequency response function is the Fourier transform of the unit impulse response.
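Relation (121) can be checked numerically for a case where both sides are known. The sketch below assumes a first-order impulse response h(t) = exp(−t/τc)/τc for t ≥ 0, whose Fourier transform is 1/(1 + iωτc); the system, the time constant, and the frequency are illustrative assumptions.

```python
import cmath
import math

def frequency_response(h, omega, dt=1e-4, n_terms=100_000):
    """Midpoint-rule estimate of H(omega) = int_0^inf h(tau) e^{-i omega tau} d tau,
    i.e. Eq. (121) for a causal impulse response truncated at n_terms*dt."""
    total = 0j
    for n in range(n_terms):
        tau = (n + 0.5) * dt
        total += h(tau) * cmath.exp(-1j * omega * tau) * dt
    return total

tau_c = 0.5                                        # assumed time constant (s)
h = lambda t: math.exp(-t / tau_c) / tau_c         # assumed impulse response
omega = 3.0                                        # rad/s

H_num = frequency_response(h, omega)
H_exact = 1.0 / (1.0 + 1j * omega * tau_c)
print(H_num, H_exact)
```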

3. Statistics of the Response

Since the linear process is assumed to be random, the results are based on statistical operations on the process. In this section, the pertinent statistical parameters will be derived. Referring back to Eq. (117), we see that the total response $y_j$ is the sum over all inputs. Thus,

$$y_j(t) = \sum_{i=1}^{n}\int_{-\infty}^{+\infty} h_{ij}(\tau)\,x_i(t-\tau)\,d\tau \tag{122}$$

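The cross correlation between outputs $y_j(t)$ and $y_k(t)$ is defined as follows:

$$C_{jk}(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T} y_j(t)\,y_k(t+\tau)\,dt \tag{123}$$

From the definition of $C_{jk}(\tau)$ it is seen that:

$$C_{kj}(\tau) = \langle y_k(t)\,y_j(t+\tau)\rangle = C_{jk}(-\tau)$$

where:

$$\langle(\;)\rangle = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}(\;)\,dt$$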


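Substituting Eq. (122) and rearranging,

$$C_{jk}(\tau) = \sum_{s=1}^{n}\sum_{r=1}^{n}\int_{-\infty}^{+\infty}du\int_{-\infty}^{+\infty}dv\;h_{js}(u)\,h_{kr}(v)\left[\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}x_s(t-u)\,x_r(t-v+\tau)\,dt\right] \tag{124}$$

where $h_{js}$ denotes the impulse response at output j due to input s (from here on the first index labels the output). By definition of the cross correlation,

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}x_s(t-u)\,x_r(t-v+\tau)\,dt = C_{sr}(u-v+\tau) \tag{125}$$

Thus,

$$C_{jk}(\tau) = \sum_{s=1}^{n}\sum_{r=1}^{n}\int_{-\infty}^{+\infty}du\int_{-\infty}^{+\infty}dv\;\left[h_{js}(u)\,h_{kr}(v)\,C_{sr}(u-v+\tau)\right] \tag{126}$$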

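The cross spectrum $G_{jk}(\omega)$ is defined as the Fourier transform of the cross correlation. The inverse Fourier transform relation is, then,

$$C_{jk}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{+\infty}G_{jk}(\omega)\,e^{i\omega\tau}\,d\omega$$

Thus,

$$G_{jk}(\omega) = \int_{-\infty}^{+\infty}C_{jk}(\tau)\,e^{-i\omega\tau}\,d\tau \tag{127}$$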

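Note that:

$$G_{kj}(\omega) = \int_{-\infty}^{+\infty}C_{kj}(\tau)\,e^{-i\omega\tau}\,d\tau = \int_{-\infty}^{+\infty}C_{jk}(-\tau)\,e^{-i\omega\tau}\,d\tau = \int_{-\infty}^{+\infty}C_{jk}(\theta)\,e^{i\omega\theta}\,d\theta = G^*_{jk}(\omega)$$

where $G^*_{jk}$ is the complex conjugate of $G_{jk}$.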

Substituting Eq. (126), changing variables θ = u − υ + τ, and using definition (127) and relation (121),

$$G_{jk}(\omega) = \sum_{s=1}^{n}\sum_{r=1}^{n}H^*_{js}(\omega)\,H_{kr}(\omega)\,G_{sr}(\omega) \tag{128}$$

in which $H^*_{js}$ is the complex conjugate of $H_{js}$. Equation (128) gives the cross spectrum of the outputs $G_{jk}(\omega)$ in terms of the cross spectrum of the inputs $G_{sr}(\omega)$. In matrix notation, Eq. (128) can be written as:

$$G^o(\omega) = H^*\,G^i\,H^T \tag{129}$$

where $G^o$ is the output matrix of cross spectra, $G^i$ is the input matrix of cross spectra, and H is the matrix of transfer functions. $H^T$ denotes the transpose matrix of H, and $H^*$ is the complex conjugate matrix of H. Thus,

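$$G^o(\omega) = \begin{bmatrix} G^o_{11}(\omega) & G^o_{12}(\omega) & G^o_{13}(\omega) & \cdots & G^o_{1k}(\omega) \\ G^o_{21}(\omega) & & & & \\ \vdots & & & & \\ G^o_{j1}(\omega) & \cdots & & & G^o_{jj}(\omega) \end{bmatrix} \tag{130}$$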

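By virtue of the fact that $G_{ij}(\omega) = G^*_{ji}(\omega)$, the above matrix is a square Hermitian matrix. The input matrix $G^i$ is

$$G^i(\omega) = \begin{bmatrix} G^i_{11}(\omega) & G^i_{12}(\omega) & G^i_{13}(\omega) & \cdots & G^i_{1s}(\omega) \\ G^i_{21}(\omega) & & & & \\ \vdots & & & & \\ G^i_{r1}(\omega) & \cdots & & & G^i_{rr}(\omega) \end{bmatrix} \tag{131}$$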

$G^i$ is also Hermitian, of order r × r = s × s = r × s. The transfer function matrix is a k × r complex matrix (not Hermitian):

$$H(\omega) = \begin{bmatrix} H_{11}(\omega) & H_{12}(\omega) & \cdots & H_{1r}(\omega) \\ H_{21}(\omega) & & & \\ \vdots & & & \\ H_{k1}(\omega) & \cdots & & H_{kr}(\omega) \end{bmatrix} \tag{132}$$

4. Important Quantities Derivable from the Cross Spectrum

The cross spectrum can be used as a starting point to derive several important quantities. The spectrum of the response at point j is obtained by letting k = j in Eq. (128). The autocorrelation is obtained by letting k = j in Eq. (123). The mean-square response is further obtained from the autocorrelation by letting τ = 0. If the Fourier inverse of Eq. (127) is used to determine the mean square, then:

$$C_{jj}(0) = \frac{1}{2\pi}\int_{-\infty}^{+\infty}G_{jj}(\omega)\,d\omega = \text{mean square} = M_j^2 = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}y_j^2(t)\,dt \tag{133}$$

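If the mean square is desired over a frequency band $\Delta\Omega = \Omega_2 - \Omega_1$, then it is given by:

$$\left(M_j^2\right)_{\Delta\Omega} = \frac{1}{2\pi}\int_{\Omega_1}^{\Omega_2}G_{jj}(\omega)\,d\omega \tag{134}$$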


The mean value of $y_j(t)$ is defined as:

$$M_j = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}y_j(t)\,dt \tag{135}$$

The variance $\sigma_j^2$ is defined as the mean square value about the mean:

$$\sigma_j^2 = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}\left[y_j(t) - M_j\right]^2 dt \tag{136}$$

The square root of the variance is known as the standard deviation. By using Eqs. (133) and (135), Eq. (136) can be written:

$$\sigma_j^2 = M_j^2 - (M_j)^2 \tag{137}$$

The mean, the variance, and the standard deviation are three important parameters involved in probability distributions. Note that if the process is one with zero mean value, then the variance is equal to the mean square, and the standard deviation is the root mean square.
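As a quick check of Eq. (137), the sketch below evaluates the time averages of Eqs. (133), (135), and (136) numerically for an assumed deterministic test signal, y(t) = 1 + 2 sin t, averaged over a whole number of periods. For this signal the mean is 1, the mean square is 3, and the variance is 2.

```python
import math

def time_average(f, T, n=20000):
    """Midpoint-rule estimate of (1/2T) * integral_{-T}^{+T} f(t) dt."""
    dt = 2.0 * T / n
    return sum(f(-T + (i + 0.5) * dt) for i in range(n)) * dt / (2.0 * T)

y = lambda t: 1.0 + 2.0 * math.sin(t)     # assumed test signal
T = 10.0 * math.pi                        # whole number of periods

mean = time_average(y, T)                                   # Eq. (135)
mean_square = time_average(lambda t: y(t) ** 2, T)          # Eq. (133)
variance = time_average(lambda t: (y(t) - mean) ** 2, T)    # Eq. (136)

print(mean, mean_square, variance, mean_square - mean ** 2)
```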

The above quantities are associated with the ordinary spectrum rather than the cross spectrum. An important physical quantity associated with the cross spectrum is the coherence, which is defined as:

$$\gamma_{jk}^2(\omega) = \frac{|G_{jk}(\omega)|^2}{G_{jj}(\omega)\,G_{kk}(\omega)} \tag{138}$$

The lower limit of $\gamma_{jk}^2$ must be zero since the lower limit of $|G_{jk}(\omega)|$ is zero. This corresponds to no correlation between the signals at j and k. In addition, $\gamma_{jk}^2 \le 1$. Going back to Eq. (128), we see that if there is only one input (so that s = r), then:

$$G_{jk}(\omega) = H^*_{js}\,H_{kr}\,G_{rr} \tag{139}$$

Thus,

$$\gamma_{jk}^2 = \frac{H^*_{js}H_{kr}\,H_{js}H^*_{kr}\,G_{rr}^2}{H^*_{js}H_{js}G_{rr}\;H^*_{kr}H_{kr}G_{rr}} = 1 \tag{140}$$

So the field is completely coherent for a single input to the system.
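The single-input result can be illustrated with a small sketch that evaluates the output cross spectra at one frequency, with the input matrix indexed as in the matrix form (129), and then forms the coherence of Eq. (138). The transfer-function values and input spectra are arbitrary assumed numbers, and the function names are hypothetical.

```python
def output_cross_spectra(H, G_in):
    """Output cross spectra at one frequency, following G_o = H* G_i H^T:
    G_jk = sum_s sum_r conj(H[j][s]) * H[k][r] * G_in[s][r]."""
    n_out, n_in = len(H), len(G_in)
    return [[sum(H[j][s].conjugate() * H[k][r] * G_in[s][r]
                 for s in range(n_in) for r in range(n_in))
             for k in range(n_out)] for j in range(n_out)]

def coherence(G, j, k):
    """Squared coherence, Eq. (138)."""
    return abs(G[j][k]) ** 2 / (G[j][j].real * G[k][k].real)

# One input driving two outputs (assumed values).
H_single = [[1 + 2j], [0.5 - 1j]]
G_single = output_cross_spectra(H_single, [[2.0 + 0j]])
print(coherence(G_single, 0, 1))

# Two mutually uncorrelated inputs (diagonal input matrix, assumed values).
H_two = [[1 + 2j, 1 + 0j], [0.5 - 1j, 2j]]
G_two = output_cross_spectra(H_two, [[2.0 + 0j, 0j], [0j, 1.0 + 0j]])
print(coherence(G_two, 0, 1))
```

The first case gives unit coherence regardless of the transfer-function values, while adding a second, uncorrelated input lowers the coherence below unity.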

In an acoustic field, sound emanating from a single source is coherent. If the coherence is less than unity, then the field is partially coherent. The partial coherence effect is sometimes due to the fact that the source is of finite extent. It is also sometimes due to the fact that there are several sources causing the radiation and these sources are correlated in some way with each other.

5. The Cross Spectrum in Terms of Fourier Transforms

The cross spectrum can also be expressed in terms of Fourier transforms alone. To see this, start with the basic definition of cross spectrum as given by Eq. (127), where:

$$C_{jk}(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}y_j(t)\,y_k(t+\tau)\,dt \tag{141}$$

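Thus,

$$G_{jk}(\omega) = \int_{-\infty}^{+\infty}\left\{\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}y_j(t)\,y_k(t+\tau)\,dt\right\}e^{-i\omega\tau}\,d\tau \tag{142}$$

Letting t = u and t + τ = υ, we have:

$$G_{jk}(\omega) = \int_{-\infty}^{+\infty}\left\{\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}y_j(u)\,y_k(\upsilon)\,du\right\}e^{-i\omega(\upsilon-u)}\,d\upsilon \tag{143}$$

$$\phantom{G_{jk}(\omega)} = \int_{-\infty}^{+\infty}\left\{\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}y_j(u)\,e^{i\omega u}\,du\right\}y_k(\upsilon)\,e^{-i\omega\upsilon}\,d\upsilon \tag{144}$$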

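The next step is true only under the condition that the process is ergodic. In this case, the last equation can be written as:

$$G_{jk}(\omega) = \lim_{T\to\infty}\frac{1}{2T}\left(\int_{-T}^{+T}y_j(u)\,e^{i\omega u}\,du\right)\left(\int_{-T}^{+T}y_k(\upsilon)\,e^{-i\omega\upsilon}\,d\upsilon\right) \tag{145}$$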

This last relation can then be written:

$$G_{jk}(\omega) = \lim_{T\to\infty}\frac{y^*_j(T,\omega)\,y_k(T,\omega)}{2T} \tag{146}$$

where:

$$y^*_j(T,\omega) = \int_{-T}^{+T}y_j(t)\,e^{i\omega t}\,dt \tag{147}$$

is the complex conjugate of $y_j(T,\omega)$, and

$$y_k(T,\omega) = \int_{-T}^{+T}y_k(t)\,e^{-i\omega t}\,dt \tag{148}$$

Equation (146) expresses the cross spectrum in terms of the limit of the product of truncated Fourier transforms.

6. The Conceptual Meaning of Cross Correlation, Cross Spectrum, and Coherence

Given two functions of time x(t) and y(t), the cross correlation between these two functions is defined mathematically by the formula:

$$C(x, y, \tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T}x(t)\,y(t+\tau)\,dt \tag{149}$$


This formula states that we take x at any time t, multiply it by y at a time t + τ (i.e., at a time τ later than t), and sum the product over all values (−T < t < +T). The result is then divided by 2T. In real systems, naturally, T is finite, and the meaning of ∞ in the formula is that various values of T must be tried to make sure that the same answer results independent of T.

For two arbitrary functions of time, formula (149) has no real meaning. It is only when the two signals have something to do with each other that the cross correlation tells us something. To see this point clearly, consider an arbitrary random wave train moving in space. (It could be an acoustic wave, an elastic wave, an electromagnetic wave, etc.) Let $x(t) = x_1(t)$ be the response (pressure, stress, etc.) at one point, and $y(t) = x_2(t)$ be the response at another point. Now form the cross correlation between $x_1$ and $x_2$ (the limit is omitted, it being understood):

$$C(x_1, x_2, \tau) = \frac{1}{2T}\int_{-T}^{+T}x_1(t)\,x_2(t+\tau)\,dt \tag{150}$$

When the points coincide (i.e., $x_1 = x_2$), the relation becomes:

$$C(x_1, \tau) = \frac{1}{2T}\int_{-T}^{+T}x_1(t)\,x_1(t+\tau)\,dt \tag{151}$$

and if τ = 0,

$$C(x_1, 0) = \frac{1}{2T}\int_{-T}^{+T}x_1^2(t)\,dt \tag{152}$$

which is, by definition, the mean square value of the response at point 1. For other values of τ, Eq. (151) defines the autocorrelation at point 1. It is the mean product of the response at one time t and the response at a time τ later than t. Thus, Eq. (150) is the mean product between the response at point 1 and the response at point 2 at a time τ later.

FIGURE 21 Input and output in a linear system. x(t) = input; y(t) = output.

Now, going back to the random wave train, let us assume that it is traveling in a nondispersive medium (i.e., with velocity independent of frequency). It is seen that if the wave train leaves point 1 at time t (see Fig. 21) and travels through the system with no distortion, then:

$$y(t) = x_2(t) = A\,x_1(t-\tau_1) \tag{153}$$

where A is some decay constant giving the amount that the wave has decreased in amplitude from point 1 to point 2, and $\tau_1$ is the time of travel from 1 to 2. Forming the cross correlation $C(x_1, x_2, \tau)$ gives

$$C(x_1, x_2, \tau) = \frac{1}{2T}\int_{-T}^{+T}x_1(t)\,A\,x_1(t+\tau-\tau_1)\,dt = A\,C(x_1, x_1, \tau-\tau_1) \tag{154}$$

Thus, the cross correlation of a random wave train in a nondispersive system has exactly the same form as the autocorrelation of the wave at the starting point, except that the peak occurs at a time delay corresponding to the time necessary for the wave to travel between the points. In the absence of attenuation, the wave is transmitted undisturbed in the medium. However, in most cases it is probable that the peak is attenuated somewhat as the wave progresses. It is thus seen that cross correlation is an extremely useful concept for measuring the time delay of a propagating random signal. In the above case, it had to be assumed that the signal was propagating in a nondispersive medium and that when the cross correlation was done, the signal was actually being traced as it moved through the system.
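The use of Eq. (154) for delay estimation can be sketched with sampled data. The example below is illustrative only: the white-noise record, the delay of 17 samples, and the decay constant A = 0.6 are assumed values, and the correlation is a finite-record estimate of Eq. (150).

```python
import random

def cross_correlation(x1, x2, lag):
    """Finite-record estimate of Eq. (150) at integer lag >= 0 (in samples)."""
    n = len(x1) - lag
    return sum(x1[i] * x2[i + lag] for i in range(n)) / n

random.seed(0)                      # fixed seed so the sketch is repeatable
x1 = [random.gauss(0.0, 1.0) for _ in range(2000)]
delay, A = 17, 0.6                  # assumed travel time (samples) and decay
x2 = [A * x1[i - delay] if i >= delay else 0.0 for i in range(len(x1))]

lags = range(0, 50)
best = max(lags, key=lambda lag: cross_correlation(x1, x2, lag))
print(best, cross_correlation(x1, x2, best))
```

The lag at which the estimate peaks recovers the assumed travel time, and the peak height is approximately A times the mean square of the signal at point 1.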

Consider the meaning of cross correlation if the system was dispersive (i.e., if the velocity was a function of frequency). White has addressed himself to this question and has demonstrated that time delays in the cross correlation can still be measured with confidence if the signal that is traveling is band-limited noise. For dispersive systems where the velocity is a function of frequency, it has been pointed out in the literature that time delays can also be obtained. For this case, the following cross spectrum is formed:

$$S_{12}(\omega) = \int_{-\infty}^{+\infty}C_{12}(\tau)\,e^{-i\omega\tau}\,d\tau \tag{155}$$

The cross spectrum is a complex number and can be written in terms of amplitude and phase angle $\theta_{12}(\omega)$ as follows:

$$S_{12}(\omega) = |S_{12}(\omega)|\,e^{-i\theta_{12}(\omega)} \tag{156}$$

The phase angle $\theta_{12}(\omega)$ is actually the phase between input and output at frequency ω. The time delay from input to output is then:

$$\tau(\omega) = \theta_{12}(\omega)/\omega \tag{157}$$

Suppose that the signal has lost its propagating properties in that it has reflected many times and set up a reverberant field in the system. Consider the physical meaning of cross correlation in this case. To answer this question partially, examine an optical field. In optical systems, extensive use has been made of the concept of partial coherence.

At the beginning of this section, two functions x(t) and y(t) were chosen, and the cross correlation between them was formed. It was pointed out that if the two functions are completely arbitrary, then there is no real physical meaning to the cross correlation. However, if the two functions are descriptions of response at points of a field, then there is a common ground to interpret cross correlation. Thus, the cross correlation and any function that is derived therefrom give some measure of the dependence of the vibrations at one point on the vibrations at the other point. This is a general statement, and to tie it down, the concept of coherence has been used.

In the optical case, suppose light is coming into two points in a field. If the light comes from a small, single source of narrow spectral range (i.e., almost single frequency), then the light at the two field points is dependent. If each point in the field receives light from a different source, then the light at the two field points is independent. The first case is termed coherent, and the field at the points is highly correlated (or dependent). The second case is termed incoherent, and the field between the two points is uncorrelated (independent).

These are the two extreme cases, and between them there are degrees of coherence (i.e., partial coherence). Just as in everyday usage, coherence is analogous to clarity or sharpness of the field, whereas incoherence is tantamount to haziness or “jumbledness.” The same idea is used when speaking about someone’s speech or written article. If it is concise and presented clearly, it is coherent. If the ideas and presentation are jumbled, they can be called incoherent.

Single-frequency radiation is coherent radiation; radiation with a finite bandwidth is not. The partial coherence associated with finite spectral bandwidth is called the temporal (or timewise) coherence. On the other hand, light or sound emanating from a single source gives coherent radiation, but a point source is never actually obtained. The partial coherence effect, due to the fact that the source is of finite extent, is termed space coherence. The point source gives unit coherence in a system, whereas an extended source gives coherence somewhat less than unity.

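The square of the coherence $\gamma_{12}(\omega)$ between signals at points 1 and 2 at frequency ω is defined as:

$$\gamma_{12}^2(\omega) = \frac{|S_{12}(\omega)|^2}{S_{11}(\omega)\,S_{22}(\omega)} \tag{158}$$

where:

$$S_{12}(\omega) = \int_{-\infty}^{+\infty}C_{12}(\tau)\,e^{-i\omega\tau}\,d\tau \tag{159}$$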

in which $C_{12}(\tau)$ is the cross correlation between signals at points 1 and 2. The function $S_{12}(\omega)$ is the cross spectrum between signals at points 1 and 2, and $S_{11}(\omega)$ and $S_{22}(\omega)$ are the autospectra of the signals at points 1 and 2, respectively. Wolf has other ways of defining coherence by functions called the complex degree of coherence or the mutual coherence function, but conceptually these all amount to the same normalized cross spectrum as given by Eq. (158).

Although there are formal proofs that $\gamma_{12}(\omega)$ is always between 0 and 1, one can reason this out nonrigorously by going back to the basic physical ideas associated with correlation and coherence. If the signals at two points are uncorrelated, and therefore incoherent, the cross correlation is zero; thus $\gamma_{12}(\omega)$ is zero. If the signals are perfectly correlated, then this is tantamount to saying that the signals in the field are a result of input to the system from a single source, as shown in Fig. 22.

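As seen before, the cross spectrum S_yz(ω) can be written in terms of the input spectrum S_x(ω) and the transfer functions Y_z(iω) and Y_y(iω) as follows:

\[ S_{yz}(i\omega) = Y_y(i\omega)\, Y_z^{*}(i\omega)\, S_x(\omega) \tag{160} \]

Thus,

\[ \gamma_{yz}^{2}(\omega) = \frac{|S_{yz}(\omega)|^{2}}{S_{yy}(\omega)\, S_{zz}(\omega)} \tag{161} \]

\[ = \frac{Y_y(i\omega)\, Y_z^{*}(i\omega)\, Y_y^{*}(i\omega)\, Y_z(i\omega)\, S_x^{2}(\omega)}{|Y_y(i\omega)|^{2} S_x(\omega)\, |Y_z(i\omega)|^{2} S_x(\omega)} = 1 \tag{162} \]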

So that:

\[ 0 \le \gamma_{12}(\omega) \le 1 \tag{163} \]

where the upper limit corresponds to complete coherence. Between the cases of complete coherence and complete incoherence there are many degrees of partial coherence.
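The two limiting cases of Eq. (158) can be illustrated numerically. The following is a minimal sketch, not a definitive estimator: it assumes white-noise signals, a simple segment-averaged spectrum estimate, and two hypothetical short filters standing in for the transfer functions Y_y and Y_z of the single-input system. The function name `coherence_squared` and all parameter values are illustrative assumptions.

```python
import numpy as np

def coherence_squared(y, z, nseg=64):
    """Segment-averaged estimate of |S_yz|^2 / (S_yy * S_zz), as in Eq. (158)."""
    n = (len(y) // nseg) * nseg
    Y = np.fft.rfft(y[:n].reshape(nseg, -1), axis=1)  # one spectrum per segment
    Z = np.fft.rfft(z[:n].reshape(nseg, -1), axis=1)
    S_yz = np.mean(np.conj(Y) * Z, axis=0)            # cross spectrum estimate
    S_yy = np.mean(np.abs(Y) ** 2, axis=0)            # autospectrum estimates
    S_zz = np.mean(np.abs(Z) ** 2, axis=0)
    return np.abs(S_yz) ** 2 / (S_yy * S_zz)

rng = np.random.default_rng(0)
x = rng.standard_normal(64 * 256)                     # single random input x(t)

# Two outputs driven by the same input through different (hypothetical) filters.
y = x + 0.5 * np.roll(x, 1)
z = x - 0.4 * np.roll(x, 2)
gamma2_single = coherence_squared(y, z)

# Two statistically independent signals.
gamma2_indep = coherence_squared(rng.standard_normal(x.size),
                                 rng.standard_normal(x.size))

print(gamma2_single.min(), gamma2_indep.mean())
```

With a single source the estimate stays close to unity at every frequency, while for independent signals it falls toward zero as more segments are averaged.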

B. Statistical Acoustics

1. Physical Concept of Transfer Function

In Section IX.A.2 it was shown that H_ij was the transfer function that gave the output at j per unit sinusoidal input at i. Suppose there is an acoustic field which is generated by a group of sound sources and these sources are surrounded by an imaginary surface S_o as shown in Fig. 23.

FIGURE 22 Single input or coherent system. x(t) = input; S_x(ω) = spectrum of input; Y_z(iω) = transfer function for z output; Y_y(iω) = transfer function for y output; z(t), y(t) = outputs.

Through each element of S_o, sound passes into the field. Thus, each element of S_o, denoted by ds, can be considered a source that radiates sound into the field. Consider the pressure dp(P, ω) at field point P at frequency ω due to radiation out of element ds,

\[ dp(P,\omega) = H_p(P,S,\omega)\, p(S,\omega)\, ds \tag{164} \]

where H_p(P, S, ω) is the pressure at field point P per unit area of S due to a unit sinusoidal input pressure on S. The total pressure in the field at point P is

\[ p(P,\omega) = \int_{S_o} H_p(P,S,\omega)\, p(S,\omega)\, ds \tag{165} \]

If motion (e.g., acceleration) of the surface S is considered instead of pressure, the counterpart to Eq. (165) is

\[ p(P,\omega) = \int_{S_o} H_a(P,S,\omega)\, a(S,\omega)\, ds \tag{166} \]

where H_a(P, S, ω) is the transfer function associated with acceleration; that is, it is the pressure at field point P per unit area of S due to a unit sinusoidal input acceleration of S.

Applying these ideas to Eq. (128), it is seen that the cross spectrum of the field pressure can immediately be written in terms of the cross spectrum of the surface pressure or the cross spectrum of the surface acceleration. For surface pressure, Eq. (128) becomes:

\[ G(P,Q,\omega) = \int_{S_i}\int_{S_r} H_p^{*}(P,S_i,\omega)\, H_p(Q,S_r,\omega)\, G(S_i,S_r,\omega)\, ds_i\, ds_r \tag{167} \]

FIGURE 23 Surface surrounding the sources. S_o = surrounding surface; V_o = volume outside the surface; P, Q = field points.

Comparing Eq. (167) with Eq. (128) we find that the points j, k become field points P, Q. G(P, Q, ω) is the cross spectrum of pressure at field points P, Q. The transfer functions H*_js(ω) become H*_p(P, S_i, ω) in which S_i is the surface point or ith input point. H_kr(ω) becomes H_p(Q, S_r, ω) where S_r is the other surface point or rth input point. G(S_i, S_r, ω) is the input cross spectrum, which is the cross spectrum of surface pressure. The summations over r and s become integrals over S_i and S_r. For acceleration,

\[ G(P,Q,\omega) = \int_{S_i}\int_{S_r} H_a^{*}(P,S_i,\omega)\, H_a(Q,S_r,\omega)\, A(S_i,S_r,\omega)\, ds_i\, ds_r \tag{168} \]

The transfer functions are those for acceleration, and A(S_i, S_r, ω) is the cross spectrum of the surface acceleration. The relation can be written for any other surface input such as velocity.
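Equation (167) can be checked on a discretized surface, where the double surface integral becomes a matrix product. The sketch below is illustrative only: the transfer matrix `H`, the element areas `ds`, and the random surface pressures are hypothetical stand-ins, not quantities taken from the text. It compares the field cross spectrum obtained from the surface cross spectrum with the one computed directly from the simulated field pressures.

```python
import numpy as np

rng = np.random.default_rng(0)
n_surf, n_field, n_samp = 6, 3, 2000
ds = np.full(n_surf, 0.1)                 # hypothetical element areas ds_i

# Hypothetical transfer functions H[P, i] = H_p(P, S_i, omega) at one frequency.
H = rng.standard_normal((n_field, n_surf)) + 1j * rng.standard_normal((n_field, n_surf))

# Correlated random surface pressures p(S_i, omega), one row per realization.
mix = rng.standard_normal((n_surf, n_surf))
p_surf = (rng.standard_normal((n_samp, n_surf))
          + 1j * rng.standard_normal((n_samp, n_surf))) @ mix

# Surface cross spectrum G(S_i, S_r) = <p*(S_i) p(S_r)>.
G_surf = p_surf.conj().T @ p_surf / n_samp

# Discrete form of Eq. (167): sum_i sum_r H*(P,S_i) H(Q,S_r) G(S_i,S_r) ds_i ds_r.
Hds = H * ds
G_field = Hds.conj() @ G_surf @ Hds.T

# Direct route: propagate each realization with Eq. (165), then average.
p_field = p_surf @ Hds.T
G_direct = p_field.conj().T @ p_field / n_samp

print(np.allclose(G_field, G_direct))
```

The same matrix form applies to Eq. (168) when `H` holds acceleration transfer functions and `G_surf` is replaced by the acceleration cross spectrum A(S_i, S_r, ω).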

2. Response in Terms of Green’s Functions

The Green's functions for single-frequency systems were taken up in a previous section. The transfer function for pressure is associated with the Green's function that vanishes over the surface. Thus,

\[ H_p(P,S,\omega) = \frac{\partial g_1(P,S,\omega)}{\partial n} \tag{169} \]

FIGURE 24 Surface surrounding main sources with presence of other field sources. S_o = surrounding surface; Q_k, Q_j = strengths of other sources not within S_o.

The transfer function for acceleration is associated with the Green's function whose normal derivative vanishes over S_o. Thus,

\[ H_a(P,S,\omega) = \rho\, g_2(P,S,\omega) \tag{169a} \]

The statistical relations for the field pressure can therefore immediately be written in terms of the cross spectrum of pressure or acceleration over the surface surrounding the sources,

\[ G(P,Q,\omega) = \int_{S_i}\int_{S_r} \frac{\partial g_1^{*}(P,S_i,\omega)}{\partial n_i}\, \frac{\partial g_1(Q,S_r,\omega)}{\partial n_r}\, G(S_i,S_r,\omega)\, ds_i\, ds_r \tag{170} \]

or

\[ G(P,Q,\omega) = \int_{S_i}\int_{S_r} \rho^{2}\, g_2^{*}(P,S_i,\omega)\, g_2(Q,S_r,\omega)\, A(S_i,S_r,\omega)\, ds_i\, ds_r \tag{171} \]

These relations give the cross spectrum of the field pressure as a function of either the cross spectrum of the surface pressure G(S_i, S_r, ω) or the cross spectrum of the surface acceleration A(S_i, S_r, ω). Equation (170) was derived by Parrent using a different approach. At this point, one should review the relationship between Eqs. (170) and (171) for acoustic systems and (128) for general linear systems. It is evident that the inputs in Eqs. (170) and (171) are G(S_i, S_r, ω) and A(S_i, S_r, ω), respectively, and the output is G(P, Q, ω) in both cases. The frequency response functions are the transfer functions described by Eqs. (169) and (169a).

3. Statistical Differential Equations Governing the Sound Field

Consider the general case where there are source terms present in the field equation. This is tantamount to saying that outside the series of main radiating sources which have been surrounded by a surface (see Fig. 24) there are other sources arbitrarily located in the field. For example, in the case of turbulence surrounding a moving structure, the turbulent volume constitutes such a source, whereas the surface of the structure surrounds all the other vibrating sources. The equation governing the propagation of sound waves in the medium is

\[ \nabla^{2} p(P,t) - \frac{1}{c_0^{2}}\, \frac{\partial^{2} p(P,t)}{\partial t^{2}} = V(Q,t) \tag{172} \]

where:

\[ V(Q,t) = \sum_i V_i(Q_i,t) \tag{173} \]

In the above equation, V is a general source term that may consist of a series of sources at various points in the medium. Actually, the medium being considered is bounded internally by S_o so that the sources inside S_o are not in the medium. The sources at Q_i, however, are in the medium.

The various types of source terms that can enter acoustical fields arise from the injection of mass, momentum, heat energy, or vorticity into the field. These are discussed by Morse and Ingard and will not be treated here. It is assumed that the source term V(Q, t) is a known function of space and time, or if it is a random function, then some statistical information is known about it such as its cross correlation or cross spectrum.

In cases where the field is random, a statistical description has to be used. The cross-correlation function Γ(P_1, P_2, τ) between pressures at field points P_1 and P_2 is defined as:

\[ \Gamma(P_1,P_2,\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} p(P_1,t)\, p(P_2,t+\tau)\, dt \tag{174} \]

(To the author's knowledge, one of the first pieces of work on correlation in wavefields was the paper of Marsh.)


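The Fourier transform U(P, ω) of the pressure p(P, t) is

\[ U(P,\omega) = \int_{-\infty}^{+\infty} p(P,t)\, e^{-i\omega t}\, dt \tag{175} \]

Taking the inverse, we can write the above equation

\[ \nabla^{2} p = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \nabla^{2} U(P,\omega)\, e^{i\omega t}\, d\omega \tag{176} \]

Also from the Fourier transform of V(Q, t):

\[ W(Q,\omega) = \int_{-\infty}^{+\infty} V(Q,t)\, e^{-i\omega t}\, dt \tag{177} \]

and its inverse:

\[ V(Q,t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} W(Q,\omega)\, e^{i\omega t}\, d\omega \tag{178} \]

Substitution into the original nonhomogeneous wave equation (172) gives:

\[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} \left[ \nabla^{2} U + \frac{\omega^{2}}{c_0^{2}}\, U - W \right] e^{i\omega t}\, d\omega = 0 \tag{179} \]

Thus, for this relation to hold for all P and all t, the bracket must vanish at every ω, so that, with k = ω/c_0,

\[ \nabla^{2} U(P,\omega) + k^{2}\, U(P,\omega) = W(Q,\omega) \tag{180} \]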

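The cross spectrum G(P_1, P_2, ω) between the pressures at P_1 and P_2 at frequency ω is defined in terms of the cross correlation Γ(P_1, P_2, τ) by:

\[ G(P_1,P_2,\omega) = \int_{-\infty}^{+\infty} \Gamma(P_1,P_2,\tau)\, e^{i\omega\tau}\, d\tau \tag{181} \]

and the inverse is

\[ \Gamma(P_1,P_2,\tau) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} G(P_1,P_2,\omega)\, e^{-i\omega\tau}\, d\omega \tag{182} \]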

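Thus, Γ(P_1, P_2, τ) can be written as:

\[ \Gamma(P_1,P_2,\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} \left[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} U(P_1,\omega)\, e^{-i\omega t}\, d\omega \right] \times \left[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} U(P_2,\omega)\, e^{-i\omega(t+\tau)}\, d\omega \right] dt \tag{183} \]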

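So

\[ \nabla_2^{2}\, \Gamma(P_1,P_2,\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} \left[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} U(P_1,\omega)\, e^{-i\omega t}\, d\omega \right] \times \left[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} \nabla_2^{2}\, U(P_2,\omega)\, e^{-i\omega(t+\tau)}\, d\omega \right] dt \tag{184} \]

where ∇_2² stands for operations performed in the P_2(x_2, y_2, z_2) coordinates. Also,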

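\[ \frac{\partial^{2} \Gamma(P_1,P_2,\tau)}{\partial \tau^{2}} = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} \left[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} U(P_1,\omega)\, e^{-i\omega t}\, d\omega \right] \times \left[ \frac{1}{2\pi} \int_{-\infty}^{+\infty} (-\omega^{2})\, U(P_2,\omega)\, e^{-i\omega(t+\tau)}\, d\omega \right] dt \tag{185} \]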

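Thus,

\[ \nabla_2^{2}\, \Gamma(P_1,P_2,\tau) - \frac{1}{c_0^{2}}\, \frac{\partial^{2} \Gamma(P_1,P_2,\tau)}{\partial \tau^{2}} = \langle p(P_1,t)\, V(Q_2,t+\tau) \rangle \tag{186} \]

It should be clear from an analysis similar to that given above that the following relation also holds:

\[ \nabla_1^{2}\, \Gamma(P_2,P_1,\tau) - \frac{1}{c_0^{2}}\, \frac{\partial^{2} \Gamma(P_2,P_1,\tau)}{\partial \tau^{2}} = \langle p(P_2,t)\, V(Q_1,t+\tau) \rangle \tag{187} \]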

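This set of Eqs. (186) and (187) is an extension of the equation obtained by Eckart and Wolf. If the source term were zero, then:

\[ \begin{aligned} \nabla_2^{2}\, \Gamma(P_1,P_2,\tau) - \frac{1}{c_0^{2}}\, \frac{\partial^{2} \Gamma(P_1,P_2,\tau)}{\partial \tau^{2}} &= 0 \\ \nabla_1^{2}\, \Gamma(P_2,P_1,\tau) - \frac{1}{c_0^{2}}\, \frac{\partial^{2} \Gamma(P_2,P_1,\tau)}{\partial \tau^{2}} &= 0 \end{aligned} \tag{188} \]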

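However, since:

\[ \Gamma(P_2,P_1,\tau) = \Gamma(P_1,P_2,-\tau) \tag{189} \]

and

\[ \frac{\partial^{2}}{\partial \tau^{2}} = \frac{\partial^{2}}{\partial(-\tau)^{2}} \tag{190} \]

Then,

\[ \nabla_{1,2}^{2}\, \Gamma(P_1,P_2,\tau) - \frac{1}{c_0^{2}}\, \frac{\partial^{2} \Gamma(P_1,P_2,\tau)}{\partial \tau^{2}} = 0 \tag{191} \]

From the above relations, it is seen that the cross correlation is propagated in the same way that the original pressure wave propagates, except that real time t is replaced by correlation time τ.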
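The statement that the cross correlation propagates like the pressure wave, with τ in place of t, can be illustrated for the simplest case of a random plane wave p(x, t) = s(t − x/c_0). The sketch below is a hedged illustration under that assumption: the sound speed, sample rate, and receiver positions are hypothetical values chosen so that the travel time is a whole number of samples, and the limit in Eq. (174) is replaced by a finite-record average.

```python
import numpy as np

rng = np.random.default_rng(1)
c0 = 1500.0                # assumed sound speed, m/s
fs = 48000.0               # assumed sample rate, Hz
x1, x2 = 0.0, 3.0          # assumed receiver positions along the propagation direction, m
n, pad = 4096, 200

s = rng.standard_normal(n + pad)          # random source waveform s(t)
d1 = int(round(x1 / c0 * fs))             # travel time to P1 in samples
d2 = int(round(x2 / c0 * fs))             # travel time to P2 in samples
p1 = s[pad - d1: pad - d1 + n]            # p(P1, t) = s(t - x1/c0)
p2 = s[pad - d2: pad - d2 + n]            # p(P2, t) = s(t - x2/c0)

# Finite-record version of Eq. (174) for non-negative lags tau.
lags = np.arange(0, 150)
gamma = np.array([np.mean(p1[: n - l] * p2[l:]) for l in lags])

tau_peak = lags[np.argmax(gamma)] / fs
print(tau_peak, (x2 - x1) / c0)
```

The correlation peak sits at τ = (x_2 − x_1)/c_0: moving P_2 downstream shifts the peak in τ exactly as it would shift the arrival of the pressure pulse in t.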

The nonhomogeneous counterparts given by Eqs. (186) and (187) state that the source term takes the statistical form of the cross correlation between the pressure p at a reference point and the source function V. Taking the Fourier transform of Eqs. (186) and (187), we see that the cross spectrum satisfies:

\[ \begin{aligned} \nabla_2^{2}\, G(P_1,P_2,\omega) + k^{2}\, G(P_1,P_2,\omega) &= \Lambda_2(P_1,Q_2,\omega) \\ \nabla_1^{2}\, G(P_2,P_1,\omega) + k^{2}\, G(P_2,P_1,\omega) &= \Lambda_1(P_2,Q_1,\omega) \end{aligned} \tag{192} \]

FIGURE 25 The loaded structure. f(r_o, t) = force component at location r_o and time t; w(r, t) = deflection component at r at time t.

where Λ_1 and Λ_2 are the Fourier transforms of the cross correlation between the reference pressure and source function, that is

\[ \begin{aligned} \Lambda_2(P_1,Q_2,\omega) &= \int_{-\infty}^{+\infty} \langle p(P_1,t)\, V(Q_2,t+\tau) \rangle\, e^{-i\omega\tau}\, d\tau \\ \Lambda_1(P_2,Q_1,\omega) &= \int_{-\infty}^{+\infty} \langle p(P_2,t)\, V(Q_1,t+\tau) \rangle\, e^{-i\omega\tau}\, d\tau \end{aligned} \tag{193} \]

Thus, Λ_1 and Λ_2 are cross-spectrum functions between the pressure and source term.

In Eqs. (186), (187), (192), and (193), it is important to note that one point is being used as a reference point and the other is the actual variable. For example, in Eq. (187), the varying is being done in the P_1(x_1, y_1, z_1) coordinates; thus, all cross correlations are performed with P_2 fixed. Conversely, in Eq. (186) all the operations are being carried out in the P_2 space with P_1 remaining fixed.

C. Statistics of Structures

1. Integral Relation for the Response

Let the loading (per unit area) on the structure be represented by the function f(r_0, t) where r_0 is the position vector of a loaded point on the body with respect to a fixed system of axes, as shown in Fig. 25. Let the unit impulse response be h(r, r_0, t − θ); this is the output at r corresponding to a unit impulse at t = 0 and at location r_0. The response at r at time t due to an arbitrary distributed excitation f(r_0, t) can then be written:

\[ w(\mathbf{r},t) = \int_{\mathbf{r}_0} d\mathbf{r}_0 \int_{-\infty}^{t} f(\mathbf{r}_0,\theta)\, h(\mathbf{r},\mathbf{r}_0,t-\theta)\, d\theta \tag{194} \]

The integration is taken over the whole loaded surface denoted by r_0. Since the loading is usually random in nature, only the statistics of the response (that is, the mean square values, the power spectral density, and so on) are determinable. Thus, let U = t − θ and form the cross correlation of the response at two points r_1 and r_2. This cross correlation is denoted by R_w and is

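\[ R_w(\mathbf{r}_1,\mathbf{r}_2,\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} \left[ \int_{\mathbf{r}_0} d\mathbf{r}_0 \int_{-\infty}^{+\infty} f(\mathbf{r}_0,t-U_1)\, h(\mathbf{r}_1,\mathbf{r}_0,U_1)\, dU_1 \right] \times \left[ \int_{\mathbf{r}_0'} d\mathbf{r}_0' \int_{-\infty}^{+\infty} f(\mathbf{r}_0',t-U_2+\tau)\, h(\mathbf{r}_2,\mathbf{r}_0',U_2)\, dU_2 \right] dt \tag{195} \]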


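or

\[ R_w(\mathbf{r}_1,\mathbf{r}_2,\tau) = \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} d\mathbf{r}_0\, d\mathbf{r}_0' \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} h(\mathbf{r}_1,\mathbf{r}_0,U_1)\, h(\mathbf{r}_2,\mathbf{r}_0',U_2)\, dU_1\, dU_2 \times \left[ \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} f(\mathbf{r}_0,t-U_1)\, f(\mathbf{r}_0',t-U_2+\tau)\, dt \right] \tag{196} \]

We assume a stationary process so that the loading is only a function of the difference of the times t − U_2 + τ and t − U_1. Let

\[ \tau_3 = (t - U_2 + \tau) - (t - U_1) = U_1 - U_2 + \tau \tag{197} \]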

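then,

\[ R_w(\mathbf{r}_1,\mathbf{r}_2,\tau) = \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} d\mathbf{r}_0\, d\mathbf{r}_0' \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} h(\mathbf{r}_1,\mathbf{r}_0,U_1)\, h(\mathbf{r}_2,\mathbf{r}_0',U_2)\, R_\ell(\mathbf{r}_0,\mathbf{r}_0',\tau_3)\, dU_1\, dU_2 \tag{198} \]

where R_ℓ(r_0, r_0′, τ_3) is the cross-correlation function of the loading. Now form the cross spectrum of the response: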

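\[ S_w(\mathbf{r}_1,\mathbf{r}_2,\omega) = \int_{-\infty}^{+\infty} R_w(\mathbf{r}_1,\mathbf{r}_2,\tau)\, e^{-i\omega\tau}\, d\tau, \qquad e^{-i\omega\tau} = e^{-i\omega(\tau_3 - U_1 + U_2)} \tag{199} \]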

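Thus,

\[ S_w(\mathbf{r}_1,\mathbf{r}_2,\omega) = \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} d\mathbf{r}_0\, d\mathbf{r}_0' \int_{-\infty}^{+\infty} h(\mathbf{r}_1,\mathbf{r}_0,U_1)\, e^{i\omega U_1}\, dU_1 \times \int_{-\infty}^{+\infty} h(\mathbf{r}_2,\mathbf{r}_0',U_2)\, e^{-i\omega U_2}\, dU_2 \times \left[ \int_{-\infty}^{+\infty} R_\ell(\mathbf{r}_0,\mathbf{r}_0',\tau_3)\, e^{-i\omega\tau_3}\, d\tau_3 \right] \tag{200} \]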

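but the Fourier transform of the impulse response function is the Green's function. Thus,

\[ \begin{aligned} \int_{-\infty}^{+\infty} h(\mathbf{r}_1,\mathbf{r}_0,U_1)\, e^{i\omega U_1}\, dU_1 &= G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega) \\ \int_{-\infty}^{+\infty} h(\mathbf{r}_2,\mathbf{r}_0',U_2)\, e^{-i\omega U_2}\, dU_2 &= G(\mathbf{r}_2,\mathbf{r}_0',\omega) \end{aligned} \tag{201} \]

where G* denotes the complex conjugate of G. The Green's function G(r_2, r_0′, ω) is the response at r_2 due to a unit sinusoidal load at r_0′, and G*(r_1, r_0, ω) is the complex conjugate of the response at r_1 due to a unit sinusoidal load at r_0. The bracket can be written:

\[ S_\ell(\mathbf{r}_0,\mathbf{r}_0',\omega) = \int_{-\infty}^{+\infty} R_\ell(\mathbf{r}_0,\mathbf{r}_0',\tau_3)\, e^{-i\omega\tau_3}\, d\tau_3 \tag{202} \]

where S_ℓ is the cross spectrum of the load. Thus, the expression for the cross spectrum of the response becomes:

\[ S_w(\mathbf{r}_1,\mathbf{r}_2,\omega) = \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} d\mathbf{r}_0\, d\mathbf{r}_0'\, G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega)\, G(\mathbf{r}_2,\mathbf{r}_0',\omega)\, S_\ell(\mathbf{r}_0,\mathbf{r}_0',\omega) \tag{203} \]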

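The spectrum at any point r_1 is obtained by setting r_1 = r_2; thus,

\[ S_w(\mathbf{r}_1,\omega) = \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} d\mathbf{r}_0\, d\mathbf{r}_0'\, G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega)\, G(\mathbf{r}_1,\mathbf{r}_0',\omega)\, S_\ell(\mathbf{r}_0,\mathbf{r}_0',\omega) \tag{204} \]

Note the equivalence between Eq. (203) and the general Eq. (128) for linear systems.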

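In many practical cases, especially in turbulence excitation, the cross spectrum of the loading takes a homogeneous form as follows:

\[ S_\ell(\mathbf{r}_0,\mathbf{r}_0',\omega) = S(\mathbf{r}_0 - \mathbf{r}_0',\omega) \tag{205} \]

Let r_0 − r_0′ = ξ. Equation (203) now becomes:

\[ S_w(\mathbf{r}_1,\mathbf{r}_2,\omega) = \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} d\mathbf{r}_0\, d\mathbf{r}_0'\, G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega)\, G(\mathbf{r}_2,\mathbf{r}_0',\omega)\, S(\boldsymbol{\xi},\omega) \tag{206} \]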

White has shown that by applying Parseval's theorem and letting r_1 = r_2, the above equation can be written:

\[ S_w(\mathbf{r}_1,\mathbf{r}_1',\omega) = (2\pi)^{2} \int_{\mathbf{k}} S(\mathbf{k},\omega)\, \psi(\mathbf{k},\omega)\, d\mathbf{k} \tag{207} \]

where:

\[ \psi(\mathbf{k},\omega) = \left| \frac{1}{(2\pi)^{2}} \int_{-\infty}^{+\infty} G(\mathbf{r}_1,\mathbf{r},\omega)\, e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}_1)}\, d\mathbf{r} \right|^{2} \tag{208} \]

In the above equations, S(k, ω) is the spectrum of the excitation field in wave number space, and ψ(k, ω) is the square of the Fourier transform of the Green's function, which can be obtained very quickly on a computer by application of the fast Fourier transform technique.
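As a rough illustration of how ψ(k, ω) in Eq. (208) might be evaluated with the fast Fourier transform, the sketch below samples a Green's function on a square grid centered on r_1 and applies a two-dimensional FFT. The particular form of `G` (an outgoing wave with a Gaussian decay), the grid size, and the wavenumber `k0` are purely hypothetical choices for demonstration; any sampled Green's function could be substituted.

```python
import numpy as np

n, L = 64, 8.0                       # assumed grid points per side and side length
dx = L / n
x = (np.arange(n) - n // 2) * dx     # coordinates of r - r1
X, Y = np.meshgrid(x, x, indexing="ij")
r = np.hypot(X, Y)

# Hypothetical Green's function G(r1, r, omega) sampled on the grid.
k0, r_decay = 4.0, 1.5
G = np.exp(1j * k0 * r) * np.exp(-(r / r_decay) ** 2)

# Eq. (208): the kernel e^{+i k.(r - r1)} matches the inverse-FFT sign convention,
# so the sum over grid points is n*n*ifft2(G). The grid offset only changes the
# phase of the transform, which drops out when the modulus is squared.
F = (dx * dx) / (2 * np.pi) ** 2 * (n * n) * np.fft.ifft2(G)
psi = np.abs(F) ** 2
dk = 2 * np.pi / (n * dx)            # wavenumber spacing of the FFT grid

# Parseval check: the wavenumber integral of psi equals the spatial
# integral of |G|^2 divided by (2*pi)^2.
lhs = psi.sum() * dk ** 2
rhs = (dx * dx) / (2 * np.pi) ** 2 * np.sum(np.abs(G) ** 2)
print(lhs, rhs)
```

The wavenumbers associated with each entry of `psi` are given by `2 * np.pi * np.fft.fftfreq(n, d=dx)` along each axis.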

The Green's functions take on the true spatial character of an influence function. They represent the response at one point due to a unit sinusoidal load at another point. The inputs are loads, the outputs are deflections, and the linear black boxes are pieces of the structure as used in the first section of this chapter.

A few very interesting results can immediately be written from Eq. (204). Supposing a body is loaded by a single random force at point p, the loading ℓ(r, t) can be written:

\[ \ell(\mathbf{r},t) = P(t)\, \delta(\mathbf{r} - \mathbf{r}_p) \tag{209} \]

The δ function signifies that ℓ is 0 except when r = r_p. Thus,

\[ S_\ell(\mathbf{r}_0,\mathbf{r}_0',\omega) = S_p(\omega)\, \delta(\mathbf{r}_0 - \mathbf{r}_p)\, \delta(\mathbf{r}_0' - \mathbf{r}_p) \tag{210} \]

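The spectrum of the response is, therefore,

\[ S_w(\mathbf{r}_1,\omega) = \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} d\mathbf{r}_0\, d\mathbf{r}_0'\, G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega)\, G(\mathbf{r}_1,\mathbf{r}_0',\omega)\, S_p(\omega)\, \delta(\mathbf{r}_0 - \mathbf{r}_p)\, \delta(\mathbf{r}_0' - \mathbf{r}_p) \tag{211} \]

\[ = S_p(\omega) \int_{\mathbf{r}_0} G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega)\, \delta(\mathbf{r}_0 - \mathbf{r}_p)\, d\mathbf{r}_0 \int_{\mathbf{r}_0'} G(\mathbf{r}_1,\mathbf{r}_0',\omega)\, \delta(\mathbf{r}_0' - \mathbf{r}_p)\, d\mathbf{r}_0' \tag{212} \]

\[ = S_p(\omega)\, |G(\mathbf{r}_1,\mathbf{r}_p,\omega)|^{2} \tag{213} \]

The spectrum of the response is the square absolute value of the Green's function multiplied by the spectrum of the force. The Green's function in this case is the response at r_1 due to unit sinusoidal force of frequency ω at r_p (p being the loading point).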

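Suppose there is a group of independent forces on the structure. The cross correlation between them is 0, so:

\[ S_\ell(\mathbf{r}_0,\mathbf{r}_0',\omega) = S(\mathbf{r}_0,\omega)\, \delta(\mathbf{r}_0 - \mathbf{r}_0') \tag{214} \]

That is, S_ℓ = 0 except when r_0 = r_0′, so:

\[ \begin{aligned} S_w(\mathbf{r}_1,\omega) &= \int_{\mathbf{r}_0}\int_{\mathbf{r}_0'} G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega)\, G(\mathbf{r}_1,\mathbf{r}_0',\omega)\, S(\mathbf{r}_0,\omega)\, \delta(\mathbf{r}_0 - \mathbf{r}_0')\, d\mathbf{r}_0\, d\mathbf{r}_0' \\ &= \int_{\mathbf{r}_0'} G(\mathbf{r}_1,\mathbf{r}_0',\omega) \left[ \int_{\mathbf{r}_0} G^{*}(\mathbf{r}_1,\mathbf{r}_0,\omega)\, S(\mathbf{r}_0,\omega)\, \delta(\mathbf{r}_0 - \mathbf{r}_0')\, d\mathbf{r}_0 \right] d\mathbf{r}_0' \\ &= \int_{\mathbf{r}_0'} G(\mathbf{r}_1,\mathbf{r}_0',\omega)\, G^{*}(\mathbf{r}_1,\mathbf{r}_0',\omega)\, S(\mathbf{r}_0',\omega)\, d\mathbf{r}_0' \\ &= \int_{\mathbf{r}_0'} |G(\mathbf{r}_1,\mathbf{r}_0',\omega)|^{2}\, S(\mathbf{r}_0',\omega)\, d\mathbf{r}_0' \end{aligned} \tag{215} \]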

If there are n forces, each with spectrum S(r_n, ω),

\[ S_w(\mathbf{r}_1,\omega) = \sum_n |G(\mathbf{r}_1,\mathbf{r}_n,\omega)|^{2}\, S(\mathbf{r}_n,\omega) \tag{215a} \]

The response is just the sum of the spectra for each force acting separately.
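The reduction from the double sum of Eq. (204) to the single sum of Eq. (215a) can be verified with a small discrete example. In this sketch the Green's function values `G[j, n]` and the force spectra `S_force` are arbitrary illustrative numbers (assumptions, not data from the text); independence of the forces is expressed by a diagonal load cross-spectrum matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
n_resp, n_force = 3, 4

# Hypothetical Green's function values G[j, n] = G(r_j, r_n, omega) at one frequency.
G = rng.standard_normal((n_resp, n_force)) + 1j * rng.standard_normal((n_resp, n_force))

# Spectra of the individual forces; independence makes the cross spectrum diagonal.
S_force = np.array([1.0, 0.5, 2.0, 0.1])
S_load = np.diag(S_force)

# Discrete form of Eq. (204): double sum over loading points n and m.
S_full = np.einsum("jn,nm,jm->j", G.conj(), S_load, G).real

# Eq. (215a): sum over forces of |G|^2 times each force spectrum.
S_sum = (np.abs(G) ** 2) @ S_force

print(np.allclose(S_full, S_sum))
```

If the forces were partially correlated, the off-diagonal entries of `S_load` would be nonzero and only the double-sum form would apply.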

2. Computation of the Response in Terms of Modes

The general variational equation of motion for any elastic structure can be written as:

\[ \iiint_V \left[ \rho\,(\ddot{u}\,\delta u + \ddot{v}\,\delta v + \ddot{w}\,\delta w) + \delta W \right] dV - \iint_S (X_\nu\,\delta u + Y_\nu\,\delta v + Z_\nu\,\delta w)\, dS = 0 \tag{216} \]

where ρ is mass density of body; u, v, w are displacements at any point; δu, δv, δw are variations of the displacements; X_ν, Y_ν, Z_ν are surface forces; dS is the elemental surface area; dV is the elemental volume; and δW is the variation of potential energy. In accordance with Love's analysis, let the displacements in the normal modes be described by:

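\[ u = u_r \varphi_r', \quad v = v_r \varphi_r', \quad w = w_r \varphi_r' \tag{217} \]

where φ_r′ = A_r cos p_r t, p_r being the natural frequency of the rth mode. Now let the forced motion of the system be described by:

\[ u = \sum_r u_r \varphi_r, \quad v = \sum_r v_r \varphi_r, \quad w = \sum_r w_r \varphi_r \tag{218} \]

where u_r, v_r, w_r are the mode shapes, and φ_r is a function of time. In accordance with Love, let

\[ \begin{aligned} u &= u_r \varphi_r, & \delta u &= u_s \varphi_s \\ v &= v_r \varphi_r, & \delta v &= v_s \varphi_s \\ w &= w_r \varphi_r, & \delta w &= w_s \varphi_s \end{aligned} \tag{219} \]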

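Substituting into the variational equation of motion, we obtain the following:

\[ \iiint_V \rho\,(u_r \ddot{\varphi}_r u_s \varphi_s + v_r \ddot{\varphi}_r v_s \varphi_s + w_r \ddot{\varphi}_r w_s \varphi_s)\, dV + \iiint_V \delta W\, dV = \iint_S (X_\nu u_s \varphi_s + Y_\nu v_s \varphi_s + Z_\nu w_s \varphi_s)\, dS \tag{220} \]

However, since the modal functions satisfy the equation for free vibration:

\[ \iiint_V \delta W\, dV = \iiint_V \rho\,(p_r^{2} u_r \varphi_r u_s \varphi_s + p_r^{2} v_r \varphi_r v_s \varphi_s + p_r^{2} w_r \varphi_r w_s \varphi_s)\, dV \tag{221} \]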


and Love shows that:

\[ \iiint_V \rho\,(u_r u_s + v_r v_s + w_r w_s)\, dV = 0, \quad r \ne s \tag{222} \]

the final equation of motion becomes:

\[ \ddot{\varphi}_r(t) + p_r^{2}\, \varphi_r(t) = F_r(t) \tag{223} \]

where M_r = ∭_V ρ(u_r² + v_r² + w_r²) dV is the generalized mass for the rth mode and

\[ F_r(t) = \frac{1}{M_r} \iint_S \left[ X_\nu(t)\, u_r + Y_\nu(t)\, v_r + Z_\nu(t)\, w_r \right] dS \tag{224} \]

If structural damping is taken into account, it can be written as another generalized force that opposes the motion:

\[ (F_r)_{\text{damping}} = -\kappa\, \dot{\varphi}_r \iiint_V (u_r^{2} + v_r^{2} + w_r^{2})\, dV \tag{225} \]

where κ is the damping force per unit volume per unit velocity. Finally, the equation of motion becomes:

\[ \ddot{\varphi}_r + \psi_r\, \dot{\varphi}_r + p_r^{2}\, \varphi_r = F_r \tag{226} \]

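where:

\[ \psi_r = \frac{\kappa}{M_r} \iiint_V (u_r^{2} + v_r^{2} + w_r^{2})\, dV \]

It is convenient to employ the vector notation; thus, let the displacement functions in the rth mode be written as:

\[ \mathbf{q}_r = u_r\, \mathbf{i} + v_r\, \mathbf{j} + w_r\, \mathbf{k} \tag{227} \]

where i, j, k are the unit vectors in the x, y, z directions, respectively. Let

\[ \mathbf{F}(S,t) = X_\nu\, \mathbf{i} + Y_\nu\, \mathbf{j} + Z_\nu\, \mathbf{k} \tag{228} \]

Thus,

\[ \begin{aligned} M_r &= \iiint_V \rho\, \mathbf{q}_r \cdot \mathbf{q}_r\, dV, & \mathbf{q}_r &= \mathbf{q}_r(V) \\ F_r(t) &= \frac{1}{M_r} \iint_S \mathbf{F} \cdot \mathbf{q}_r\, ds, & \mathbf{F} &= \mathbf{F}(S,t) \end{aligned} \tag{229} \]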

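The Fourier transform of F(S, t) is

\[ \mathbf{S}_F(S,\omega) = \int_{-\infty}^{+\infty} \mathbf{F}(S,t)\, e^{-i\omega t}\, dt \tag{230} \]

and the Fourier transform of φ_r is

\[ S_{\varphi_r}(\omega) = \int_{-\infty}^{+\infty} \varphi_r(t)\, e^{-i\omega t}\, dt \tag{231} \]

Now,

\[ \begin{aligned} \varphi_r(t) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} S_{\varphi_r}(\omega)\, e^{i\omega t}\, d\omega \\ \dot{\varphi}_r(t) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} i\omega\, S_{\varphi_r}(\omega)\, e^{i\omega t}\, d\omega \\ \ddot{\varphi}_r(t) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} (-\omega^{2})\, S_{\varphi_r}(\omega)\, e^{i\omega t}\, d\omega \end{aligned} \tag{232} \]

\[ S_{\dot{\varphi}_r}(\omega) = i\omega\, S_{\varphi_r}(\omega), \qquad S_{\ddot{\varphi}_r}(\omega) = -\omega^{2}\, S_{\varphi_r}(\omega) \tag{233} \]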

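Let β_r be the damping constant for the rth mode. Now take the Fourier transform of the equation of motion:

\[ S_{\ddot{\varphi}_r} + \beta_r\, S_{\dot{\varphi}_r} + p_r^{2}\, S_{\varphi_r} = S_{F_r} \tag{234} \]

which is

\[ -\omega^{2}\, S_{\varphi_r}(\omega) + i\omega\beta_r\, S_{\varphi_r}(\omega) + p_r^{2}\, S_{\varphi_r}(\omega) = S_{F_r}(\omega) \tag{235} \]

where:

\[ S_{F_r}(\omega) = \frac{1}{M_r} \iint_S \mathbf{S}_F(\mathbf{r}_s,\omega) \cdot \mathbf{q}_r(\mathbf{r}_s)\, ds \tag{236} \]

So,

\[ S_{\varphi_r}(\omega) = \frac{\iint_S \mathbf{S}_F(\mathbf{r}_s,\omega) \cdot \mathbf{q}_r(\mathbf{r}_s)\, ds}{M_r \left[ (p_r^{2} - \omega^{2}) + i\omega\beta_r \right]} \]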
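The modal response just obtained is a single-degree-of-freedom frequency response scaled by the generalized force. As a minimal numerical sketch, assuming illustrative values for M_r, p_r, and β_r and a unit value for the surface integral of S_F · q_r (all hypothetical), the magnitude of S_φr can be evaluated over a frequency sweep to show the resonance at ω ≈ p_r:

```python
import numpy as np

# Illustrative (assumed) modal parameters.
M_r = 2.0        # generalized mass of mode r
p_r = 100.0      # natural frequency of mode r, rad/s
beta_r = 2.0     # damping constant of mode r

omega = np.linspace(50.0, 150.0, 1001)
force_integral = 1.0   # assumed unit value of the surface integral of S_F . q_r

# Modal response spectrum from the expression following Eq. (236).
S_phi = force_integral / (M_r * ((p_r ** 2 - omega ** 2) + 1j * omega * beta_r))

i_peak = int(np.argmax(np.abs(S_phi)))
omega_peak = omega[i_peak]

# At omega = p_r the real part of the bracket vanishes, leaving 1 / (M_r * beta_r * p_r).
peak_expected = force_integral / (M_r * beta_r * p_r)
print(omega_peak, abs(S_phi[i_peak]), peak_expected)
```

For light damping the peak height is controlled almost entirely by β_r, which is why the lightly damped approximations later in the section depend on the damping ratio.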

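In dealing with statistical averaging, the cross correlation function is used. The cross correlation between the displacement at two points in any direction (the direction can be different at the two points) is

\[ \Gamma_q(\mathbf{r}_1,\mathbf{r}_2,\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} q(\mathbf{r}_1,t)\, q(\mathbf{r}_2,t+\tau)\, dt \tag{237} \]

We are picking a given direction at each point, so the two quantities are scalar (no longer vector). Then,

\[ \begin{aligned} q &= \sum_r q_r \varphi_r \\ q(\mathbf{r}_2,t+\tau) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} S_q(\mathbf{r}_2,\omega)\, e^{i\omega(t+\tau)}\, d\omega \\ q(\mathbf{r}_1,t) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} S_q(\mathbf{r}_1,\omega)\, e^{i\omega t}\, d\omega \end{aligned} \tag{238} \]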


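So,

\[ \begin{aligned} \Gamma_q(\mathbf{r}_1,\mathbf{r}_2,\tau) &= \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{+T} q(\mathbf{r}_1,t) \left( \frac{1}{2\pi} \int_{-\infty}^{+\infty} S_q(\mathbf{r}_2,\omega)\, e^{i\omega(t+\tau)}\, d\omega \right) dt \\ &= \lim_{T\to\infty} \frac{1}{2T} \int_{-\infty}^{+\infty} S_q(\mathbf{r}_2,\omega)\, e^{i\omega\tau} \left( \frac{1}{2\pi} \int_{-T}^{+T} q(\mathbf{r}_1,t)\, e^{i\omega t}\, dt \right) d\omega \\ &= \lim_{T\to\infty} \frac{1}{2T}\, \frac{1}{2\pi} \int_{-\infty}^{+\infty} S_q^{T}(\mathbf{r}_2,\omega)\, S_q^{T*}(\mathbf{r}_1,\omega)\, e^{i\omega\tau}\, d\omega \end{aligned} \tag{239} \]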

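Now, the power spectral density of the displacement is defined in terms of the cross correlation as:

\[ \Gamma_q(\mathbf{r}_1,\mathbf{r}_2,\tau) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} G_q(\mathbf{r}_1,\mathbf{r}_2,\omega)\, e^{i\omega\tau}\, d\omega \tag{240} \]

Then,

\[ G_q(\mathbf{r}_1,\mathbf{r}_2,\omega) = \lim_{T\to\infty} \frac{1}{2T}\, S_q^{T*}(\mathbf{r}_1,\omega)\, S_q^{T}(\mathbf{r}_2,\omega) \tag{241} \]

Now,

\[ S_q^{T}(\mathbf{r}_2,\omega) = \sum_r q_r(\mathbf{r}_2)\, S_{\varphi_r}^{T}(\omega) \tag{242} \]

Thus,

\[ G_q(\mathbf{r}_1,\mathbf{r}_2,\omega) = \lim_{T\to\infty} \frac{1}{2T} \sum_r \sum_k q_r(\mathbf{r}_1)\, S_{\varphi_r}^{T*}(\omega)\, q_k(\mathbf{r}_2)\, S_{\varphi_k}^{T}(\omega) \tag{243} \]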

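\[ G_q(\mathbf{r}_1,\mathbf{r}_2,\omega) = \sum_r \sum_k \frac{q_r(\mathbf{r}_1)\, q_k(\mathbf{r}_2)}{Y_r^{*}(i\omega)\, Y_k(i\omega)} \iint_{S_u} \iint_{S_v} \lim_{T\to\infty} \frac{1}{2T} \left[ \mathbf{S}_F^{T*}(\mathbf{r}_{S_u},\omega) \cdot \mathbf{q}_r(\mathbf{r}_{S_u}) \right] \left[ \mathbf{S}_F^{T}(\mathbf{r}_{S_v},\omega) \cdot \mathbf{q}_k(\mathbf{r}_{S_v}) \right] ds_u\, ds_v \tag{244} \]

Now, if the integrand is written in the double surface integral, it is

\[ \begin{aligned} & S_X^{T*} S_X^{T} u_r u_k + S_X^{T*} S_Y^{T} u_r v_k + S_X^{T*} S_Z^{T} u_r w_k + S_Y^{T*} S_X^{T} v_r u_k \\ & + S_Y^{T*} S_Y^{T} v_r v_k + S_Y^{T*} S_Z^{T} v_r w_k + S_Z^{T*} S_X^{T} w_r u_k \\ & + S_Z^{T*} S_Y^{T} w_r v_k + S_Z^{T*} S_Z^{T} w_r w_k \end{aligned} \tag{245} \]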

Note the tensor properties of the last expression involving each component of loading. Note that in the general formula involving the dot product, the component of the modal vector in the direction of the loading function at the two points has to be taken. Now, assuming that the loading is normal to the surface of the structure, our concern is with the cross spectral density of the normal acceleration at two points r_1s and r_2s on the surface:

\[ a_n(\mathbf{r}_{1s},\mathbf{r}_{2s},\omega) = \omega^{4} \sum_r \sum_k \frac{[q_r(\mathbf{r}_{1s})\, q_k(\mathbf{r}_{2s})]_n}{Y_r^{*}(i\omega)\, Y_k(i\omega)} \iint_{S_u} \iint_{S_v} G_p(\mathbf{r}_{S_u},\mathbf{r}_{S_v},\omega)\, q_r(\mathbf{r}_{S_u})\, q_k(\mathbf{r}_{S_v})\, ds_u\, ds_v \tag{246} \]

where G_p(r_Su, r_Sv, ω) is the cross spectral density of the loading normal to the surface at points r_Su and r_Sv. The mean square acceleration over a frequency band Ω_1 to Ω_2 at point r_1s is given by

\[ \langle a_n(\mathbf{r}_{1s})^{2} \rangle = \frac{1}{2\pi} \int_{\Omega_1}^{\Omega_2} a_n(\mathbf{r}_{1s},\mathbf{r}_{1s},\omega)\, d\omega \tag{247} \]

Equation (246) is nothing other than Eq. (203) with the integrand expanded in terms of modes of the structure, and Eq. (203) in turn is nothing other than Eq. (128) written for a continuous structure instead of just a linear black box system.

D. Coupled Structural Acoustic Systems

Equation (171) stated that the cross spectral density of the field pressure in terms of acceleration spectra on the surface is

\[ G(P_1,P_2,\omega) = \frac{1}{(4\pi)^{2}} \int_{S_1}\int_{S_2} \rho_0^{2}\, a_n(S_1,S_2,\omega)\, g(P_1,S_1,\omega)\, g^{*}(P_2,S_2,\omega)\, dS_1\, dS_2 \tag{248} \]

Furthermore, it was found in the last section that the cross-spectral density of the normal acceleration for a structure in which the loading is normal to the surface can then be written:

\[ a_n(S_1,S_2,\omega) = \omega^{4} \sum_r \sum_m \frac{q_{rn}(S_1)\, q_{mn}(S_2)}{Y_r(i\omega)\, Y_m^{*}(i\omega)}\, C_{rm}(\omega) \tag{249} \]

in which:

\[ C_{rm}(\omega) = \int_{S_1}\int_{S_2} G(S_1,S_2,\omega)\, q_{rn}(S_1)\, q_{mn}(S_2)\, dS_1\, dS_2 \]

where G(S_1, S_2, ω) is the cross-spectral density of the pressure that excites the structure, and q_rn(S_1) is the normal component of the rth mode evaluated at point S_1 of the surface. If the damping in the structure is relatively low, then in accordance with the analysis of Powell, Hurty, and Rubenstein, the cross-product terms can be neglected and

\[ a_n(S_1,S_2,\omega) \approx \omega^{4} \sum_r \frac{q_{rn}(S_1)\, q_{rn}(S_2)}{|Y_r(i\omega)|^{2}}\, C_{rr}(\omega) \tag{250} \]

where:

\[ C_{rr}(\omega) = \int_{S_1}\int_{S_2} G(S_1,S_2,\omega)\, q_{rn}(S_1)\, q_{rn}(S_2)\, dS_1\, dS_2 \]

To carry the analysis further, a Green's function must be obtained. Using the analysis of Strasberg and Morse and Ingard as a guide, we assume the use of a free-field Green's function. The analysis, although approximate, then comes out in general form instead of being limited to a particular surface. Therefore, let

\[ g(P_1,S_1,\omega) = \frac{e^{ikR_1}}{R_1}\, e^{-ik(\mathbf{a}_{R_1} \cdot \mathbf{R}_{S_1})} \tag{251} \]

in which (see Fig. 11):

\[ \mathbf{a}_{R_1} \cdot \mathbf{R}_{S_1} = z_o \cos\theta_1 + x_o \sin\theta_1 \cos\varphi_1 + y_o \sin\theta_1 \sin\varphi_1 \]

where x_o, y_o, z_o are the rectangular coordinates of the point on the vibrating surface of the structure; R_S1 is the radius vector to point S_1 on the surface; R_1, θ_1, φ_1 are the spherical coordinates of point P_1 in the far field; a_R1 is a unit vector in the direction of R_1 (the radius vector from the origin to the far-field point). Thus, a_R1 · R_S1 is the projection of R_S1 on R_1, making R_1 − a_R1 · R_S1 the distance from the far-field point to the surface point. Therefore,

\[ g^{*}(P_2,S_2,\omega) = \frac{e^{-ikR_2}}{R_2}\, e^{ik(\mathbf{a}_{R_2} \cdot \mathbf{R}_{S_2})} \tag{253} \]

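Combining Eqs. (248), (249), (251), and (253) gives the following expression for the cross spectrum of far-field pressure:

\[ G(P_1,P_2,\omega) = \frac{\omega^{4} \rho_o^{2}\, e^{ik(R_1 - R_2)}}{(4\pi)^{2} R_1 R_2} \sum_r \sum_m \frac{I_r(\theta_1,\varphi_1,\omega)\, I_m^{*}(\theta_2,\varphi_2,\omega)\, C_{rm}}{Y_r(i\omega)\, Y_m^{*}(i\omega)} \tag{254} \]

where:

\[ \begin{aligned} I_r &= \int_{S_1} q_{rn}(S_1)\, e^{-ik(\mathbf{a}_{R_1} \cdot \mathbf{R}_{S_1})}\, dS_1 \\ I_m^{*} &= \int_{S_2} q_{mn}(S_2)\, e^{ik(\mathbf{a}_{R_2} \cdot \mathbf{R}_{S_2})}\, dS_2 \end{aligned} \tag{255} \]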

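With the low damping approximation given by Eq. (250), the far-field auto spectrum at point P_1 is

\[ G(P_1,P_1,\omega) \approx \frac{\rho_o^{2}\, \omega^{4}}{(4\pi)^{2} R_1^{2}} \sum_r \frac{|I_r(\theta_1,\varphi_1,\omega)|^{2}}{|Y_r(i\omega)|^{2}}\, C_{rr}(\omega) \tag{256} \]

The far-field mean square pressure in a frequency band ΔΩ = Ω_2 − Ω_1 can be written:

\[ \langle p(P_1)^{2} \rangle = \frac{1}{2\pi} \int_{\Omega_1}^{\Omega_2} G(P_1,P_1,\omega)\, d\omega \tag{257} \]

Thus,

\[ \langle p(P_1)^{2} \rangle = \frac{\rho_o^{2}}{(4\pi)^{2} R_1^{2}}\, \frac{1}{2\pi} \int_{\Omega_1}^{\Omega_2} \left[ \sum_r \frac{\omega^{4}\, C_{rr}(\omega)}{|Y_r(i\omega)|^{2}}\, |I_r(\theta_1,\varphi_1,\omega)|^{2} \right] d\omega \tag{258} \]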

In cases in which the structure is lightly damped, the following can be written:

\[ \frac{1}{2\pi} \int_{\Omega_1}^{\Omega_2} \frac{\omega^{4}\, C_{rr}(\omega)\, d\omega}{|Y_r(i\omega)|^{2}} \approx \frac{p_r\, C_{rr}(p_r)}{8\, \zeta_r M_r^{2}} \tag{259} \]

where C_rr(p_r) is defined as the joint acceptance evaluated at the natural frequency p_r (p_r consists of those natural frequencies between Ω_1 and Ω_2), M_r is the total generalized mass of the rth mode (including virtual mass), and

\[ \zeta_r = C_r / (C_c)_r \tag{260} \]

where C_r is the damping constant for the rth mode (including radiation damping) and (C_c)_r is the critical damping constant for that mode. Thus, the mean square pressure at the far-field point P_1 in the frequency band ΔΩ is

\[ \langle p(P_1)^{2} \rangle \approx \frac{\rho_o^{2}}{R_1^{2}}\, \frac{1}{(4\pi)^{2}} \sum_{r\ \text{in}\ \Delta\Omega} \frac{p_r\, C_{rr}(p_r)}{8\, \zeta_r M_r^{2}}\, |I_r(\theta_1,\varphi_1,p_r)|^{2} \tag{261} \]

In Eq. (261), C_rr(p_r) describes the characteristics of the generalized force of the random loading, p_r/8ζ_r M_r² describes the characteristics of the structure, and I_r describes the directivity of the noise field. The sum is taken over those modes that resonate in the band.
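The lightly damped approximation of Eq. (259) can be checked numerically for a single mode. The sketch below rests on several stated assumptions: the modal impedance is taken as Y_r(iω) = M_r[(p_r² − ω²) + iωβ_r] with β_r = 2ζ_r p_r, the joint acceptance C_rr is set to a constant (unity) across the band, the prefactor of the band integral is taken as 1/(2π), and the band edges are arbitrary illustrative values.

```python
import numpy as np

# Illustrative (assumed) modal parameters.
M_r, p_r, zeta_r = 1.0, 100.0, 0.01
beta_r = 2.0 * zeta_r * p_r                  # assumed relation between beta_r and zeta_r

# Assumed band Omega_1..Omega_2 containing the resonance, finely sampled.
omega = np.linspace(80.0, 120.0, 40001)

# |Y_r(i omega)|^2 with Y_r = M_r [(p_r^2 - omega^2) + i omega beta_r] (assumed form).
Y2 = M_r ** 2 * ((p_r ** 2 - omega ** 2) ** 2 + (omega * beta_r) ** 2)
integrand = omega ** 4 / Y2                  # C_rr(omega) = 1 assumed

# Left side of Eq. (259) by the trapezoidal rule.
integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(omega)) / (2 * np.pi)

# Right side of Eq. (259).
approx = p_r / (8.0 * zeta_r * M_r ** 2)

ratio = integral / approx
print(integral, approx, ratio)
```

Under these assumptions the two sides agree to within a few percent; the residual difference comes from the finite band and from the off-resonance part of the integrand, both of which the approximation ignores.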

SEE ALSO THE FOLLOWING ARTICLES

ACOUSTICAL MEASUREMENT • ACOUSTIC CHAOS • ACOUSTIC WAVE DEVICES • SIGNAL PROCESSING, ACOUSTIC • WAVE PHENOMENA


BIBLIOGRAPHY

Ando, Y. (1998). "Architectural Acoustics: Blending Sound Sources, Sound Fields, and Listener," Modern Acoustics and Signal Processing, Springer-Verlag.

Brekhovskikh, L. M., and Godin, O. A. (1998). "Acoustics of Layered Media I: Plane and Quasi-Plane Waves," Springer Series on Wave Phenomena, Vol. 5, Springer-Verlag.

Brekhovskikh, L. M., and Godin, O. A. (1999). "Acoustics of Layered Media II: Point Sources and Bounded Beams," Springer Series on Wave Phenomena, Vol. 14, Springer-Verlag.

Howe, M. S. (1998). "Acoustics of Fluid-Structure Interactions," Cambridge University Press.

Kishi, T., Ohtsu, M., and Yuyama, S., eds. (2000). "Acoustic Emission - Beyond the Millennium," Elsevier.

Munk, W., Worcester, P., and Wunsch, C. (1995). "Ocean Acoustic Tomography," Cambridge University Press.

Ohayon, R., and Soize, C. (1998). "Structural Acoustics and Vibration," Academic Press.

Tohyama, M., Suzuki, H., and Ando, Y. (1996). "The Nature and Technology of Acoustic Space," Academic Press, London.


Encyclopedia of Physical Science and Technology EN002E-94 May 19, 2001 20:28

Chaos
Joshua Socolar
Duke University

I. Introduction
II. Classical Chaos
III. Dissipative Dynamical Systems
IV. Hamiltonian Systems
V. Quantum Chaos

GLOSSARY

Cantor set Simple example of a fractal set of points with a noninteger dimension.

Chaos Technical term referring to the irregular, unpredictable, and apparently random behavior of deterministic dynamical systems.

Deterministic equations Equations of motion with no random elements for which formal existence and uniqueness theorems guarantee that once the necessary initial and boundary conditions are specified the solutions in the past and future are uniquely determined.

Dissipative system Dynamical system in which frictional or dissipative effects cause volumes in the phase space to contract and the long-time motion to approach an attractor consisting of a fixed point, a periodic cycle, or a strange attractor.

Dynamical system System of equations describing the time evolution of one or more dependent variables. The equations of motion may be difference equations if the time is measured in discrete units, a set of ordinary differential equations, or a set of partial differential equations.

Ergodic theory Branch of mathematics that introduces statistical concepts to describe average properties of deterministic dynamical systems.

Extreme sensitivity to initial conditions Refers to the rapid, exponential divergence of nearby trajectories in chaotic dynamical systems.

Fractal Geometrical structure with self-similar structure on all scales that may have a noninteger dimension, such as the outline of a cloud, a coastline, or a snowflake.

Hamiltonian system Dynamical system that conserves volumes in phase space, such as a mechanical oscillator moving without friction, the motion of a planet, or a particle in an accelerator.

KAM theorem The Kolmogorov–Arnold–Moser theorem proves that when a small, nonlinear perturbation is applied to an integrable Hamiltonian system it remains nearly integrable if the perturbation is sufficiently small.

Kicked rotor Simple model of a Hamiltonian dynamical system that is exactly described by the classical standard map and the quantum standard map.

Kolmogorov–Sinai entropy Measure of the rate of mixing in a chaotic dynamical system that is closely related to the average Lyapunov exponent, which measures the exponential rate of divergence of nearby trajectories.

Localization Quantum interference effect, introduced by Anderson in solid-state physics, which inhibits the transport of electrons in disordered or chaotic dynamical systems as in the conduction of electrons in disordered media or the microwave excitation and ionization of highly excited hydrogen atoms.

Lyapunov exponent A real number λ specifying the average exponential rate at which nearby trajectories in phase space diverge or converge.

Mixing Technical term from ergodic theory that refers to dynamical behavior that resembles the evolution of cream poured in a stirred cup of coffee.

Period-doubling bifurcations Refers to a common route from regularity to chaos in chaotic dynamical systems in which a sequence of periodic cycles appears in which the period increases by a factor of two as a control parameter is varied.

Phase space Mathematical space spanned by the dependent variables of the dynamical system. For example, a mechanical oscillator moving in one dimension has a two-dimensional phase space spanned by the position and momentum variables.

Poincaré section Stroboscopic picture of the evolution of a dynamical system in which the values of two dependent variables are plotted as points in a plane each time the other dependent variables assume a specified set of values.

Random matrix theory Theory introduced to describe the statistical fluctuations of the spacings of nuclear energy levels based on the statistical properties of the eigenvalues of matrices with random elements.

Resonance overlap criterion Simple analytical estimate of the conditions for breakup of KAM surfaces leading to widespread, global chaos.

Strange attractor Aperiodic attracting set with a fractal structure that often characterizes the longtime dynamics of chaotic dissipative systems.

Trajectory A path in the phase space of a dynamical system that is traced out by a system starting from a particular set of initial values of the dependent variables.

Universality Refers to the detailed quantitative similarity of the transition from regular behavior to chaos in a broad class of disparate dynamical systems.

A WIDE VARIETY of natural phenomena exhibit complex, irregular behavior. In the past, many of these phenomena were considered to be too difficult to analyze; however, the advent of high-speed digital computers coupled with new mathematical and physical insight has led to the development of a new interdisciplinary field of science called nonlinear dynamics, which has been very successful in finding some underlying order concealed in nature’s complexity. In particular, research in the latter half of the 20th century has revealed how very simple, deterministic mathematical models of physical and biological systems can exhibit surprisingly complex behavior. The apparently random behavior of these deterministic, nonlinear dynamical systems is called chaos.

Since many different fields of science and engineering are confronted with difficult problems involving nonlinear equations, the field of nonlinear dynamics has evolved in a highly interdisciplinary manner, with important contributions coming from biologists, mathematicians, engineers, and physicists. In the physical sciences, important advances have been made in our understanding of complex processes and patterns in dissipative systems, such as damped, driven, nonlinear oscillators and turbulent fluids, and in the derivation of statistical descriptions of Hamiltonian systems, such as the motion of celestial bodies and the motion of charged particles in accelerators and plasmas. Moreover, the predictions of chaotic behavior in simple mechanical systems have led to the investigation of the manifestations of chaos in the corresponding quantum systems, such as atoms and molecules in very strong fields.

This article attempts to describe some of the fundamental ideas; to highlight a few of the important advances in the study of chaos in classical dissipative and Hamiltonian systems; and to indicate some of the implications for quantum systems.

I. INTRODUCTION

In the last 25 years, the word chaos has emerged as a technical term to refer to the complex, irregular, and apparently random behavior of a wide variety of physical phenomena, such as turbulent fluid flow, oscillating chemical reactions, vibrating structures, the behavior of nonlinear electrical circuits, the motion of charged particles in accelerators and fusion devices, the orbits of asteroids, and the dynamics of atoms and molecules in strong fields. In the past, these complex phenomena were often referred to as random or stochastic, which meant that researchers gave up all hope of providing a detailed microscopic description of these phenomena and restricted themselves to statistical descriptions alone. What distinguishes chaos from these older terms is the recognition that many complex physical phenomena are actually described by deterministic equations, such as the Navier–Stokes equations of fluid mechanics, Newton’s equations of classical mechanics, or Schrödinger’s equation of quantum mechanics, and the important discovery that even very simple, deterministic equations of motion can exhibit exceedingly complex behavior and structure that is indistinguishable from an idealized random process. Consequently, a new term was required to describe the irregular behavior of these deterministic dynamical systems that reflected the newfound hope for a deeper understanding of these various physical phenomena. These realizations also led to the rapid development of a new, highly interdisciplinary field of scientific research called nonlinear dynamics, which is devoted to the description of complex, but deterministic, behavior and to the search for “order in chaos.”

The rise of nonlinear dynamics was stimulated by the combination of some old and often obscure mathematics from the early part of the 20th century that was preserved and developed by isolated mathematicians in the United States, the Soviet Union, and Europe; the deep natural insight of a number of pioneering researchers in meteorology, biology, and physics; and the widespread availability of high-speed digital computers with high-resolution computer graphics. The mathematicians constructed simple, but abstract, dynamical systems that could generate complex behavior and geometrical patterns. Then, early researchers studying the nonlinear evolution of weather patterns and the fluctuations of biological populations realized that their crude approximations to the full mathematical equations, in the form of a single difference equation or a few ordinary differential equations, could also exhibit behavior as complex and seemingly random as the natural phenomena. Finally, high-speed computers provided a means for detailed computer experiments on these simple mathematical models with complex behavior. In particular, high-resolution computer graphics have enabled experimental mathematicians to search for order in chaos that would otherwise be buried in reams of computer output. This rich interplay of mathematical theory, physical insight, and computer experimentation, which characterizes the study of chaos and the field of nonlinear dynamics, will be clearly illustrated in each of the examples discussed in this article.

Chaos research in the physical sciences and engineering can be divided into three distinct areas relating to the study of nonlinear dynamical systems that correspond to (1) classical dissipative systems, such as turbulent flows or mechanical, electrical, and chemical oscillators; (2) classical Hamiltonian systems, where dissipative processes can be neglected, such as charged particles in accelerators and magnetic confinement fusion devices or the orbits of asteroids and planets; and (3) quantum systems, such as atoms and molecules in strong static or intense electromagnetic fields or electrons confined to submicron-scale cavities.

The study of chaos in classical systems (both dissipative and Hamiltonian) is now a fairly well-developed field that has been described in great detail in a number of popular and technical books. In particular, the term chaos has a very precise mathematical definition for classical nonlinear systems, and many of the characteristic features of chaotic motion, such as the extreme sensitivity to initial conditions, the appearance of strange attractors with noninteger fractal dimensions, and the period-doubling route to chaos, have been cataloged in a large number of examples and applications, and new discoveries continue to fill technical journals. In Section II, we will begin with a precise definition of chaos for classical systems and present a very simple mathematical example that illustrates the origin of this complex, apparently random motion in simple deterministic dynamical systems. In Section II, we will also consider additional examples to illustrate some of the other important general features of chaotic classical systems, such as the notion of geometric structures with noninteger dimensions.

Some of the principal accomplishments of the application of these new ideas to dissipative systems include the discovery of a universal theory for the transition from regular, periodic behavior to chaos via a sequence of period-doubling bifurcations, which provides quantitative predictions for a wide variety of physical systems, and the discovery that mathematical models of turbulence with as few as three nonlinear differential equations can exhibit chaotic behavior that is governed by a strange attractor.

The ideas and analytical methods introduced by the simple models of nonlinear dynamics have provided important analogies and metaphors for describing complex natural phenomena that should ultimately pave the way for a better theoretical understanding. Section III will be devoted to a detailed discussion of several models of dissipative systems with important applications in the description of turbulence and the onset of chaotic behavior in a variety of nonlinear oscillators.

The latter portion of Section III introduces concepts associated with the description of dissipative dynamical systems with many degrees of freedom and briefly discusses some issues that have been central to chaos research in the last decade.

In the realm of Hamiltonian systems, the exact nonlinear equations for the motion of particles in accelerators and fusion devices and of celestial bodies are simple enough to be analyzed using the analytical and numerical methods of nonlinear dynamics without any gross approximations. Consequently, accurate quantitative predictions can be made of the conditions for the onset of chaotic behavior, which play significant roles in the design of accelerators and fusion devices and in understanding the irregular dynamics of asteroids. Moreover, the important realization that only a few interacting particles, representing a small number of degrees of freedom, can exhibit motion that is sufficiently chaotic to permit a statistical description has greatly enhanced our understanding of the microscopic foundations of statistical mechanics, which have remained an outstanding problem of theoretical physics for over a century. Section IV will examine several simple mathematical models of Hamiltonian systems with applications to the motion of particles in accelerators and fusion devices and to the motion of celestial bodies.

Finally, in Section V, we will discuss the more recent and more controversial studies of the quantum behavior of strongly coupled and strongly perturbed Hamiltonian systems, which are classically chaotic. In contrast to the theory of classical chaos, there is not yet a consensus on the definition of quantum chaos because the Schrödinger equation is a linear equation for the deterministic evolution of the quantum wave function, which is incapable of exhibiting the strong dynamical instability that defines chaos in nonlinear classical systems. Nevertheless, both numerical studies of model problems and real experiments on atoms and molecules reveal that quantum systems can exhibit behavior that resembles classical chaos for long times. In addition, considerable research has been devoted to identifying the distinct signatures or symptoms of the quantum behavior of classically chaotic systems. At present, the principal contribution of these studies has been the demonstration that the atomic and molecular physics of strongly perturbed and strongly coupled systems can be very different from that predicted by the traditional perturbative methods of quantum mechanics. For example, experiments with highly excited hydrogen atoms in strong microwave fields have revealed a novel ionization mechanism that depends strongly on the intensity of the radiation but only weakly on the frequency. This dependence is just the opposite of the quantum photoelectric effect, but the sharp onset of ionization in the experiments is very well described by the onset of chaos in the corresponding classical system.

II. CLASSICAL CHAOS

This section provides a summary of the fundamental ideas that underlie the discussion of chaos in all classical dynamical systems. It begins with a precise definition of chaos and illustrates the important features of the definition using some very simple mathematical models. These examples are also used to exhibit some important properties of chaotic dynamical systems, such as extreme sensitivity to initial conditions, the unpredictability of the long-time dynamics, and the possibility of geometric structures corresponding to strange attractors with noninteger, fractal dimensions. The manifestations of these fundamental concepts in more realistic examples of dissipative and Hamiltonian systems will be provided in Sections III and IV.

A. The Definition of Chaos

The word chaos describes the irregular, unpredictable, and apparently random behavior of nonlinear dynamical systems that are described mathematically by the deterministic iteration of nonlinear difference equations or the evolution of systems of nonlinear ordinary or partial differential equations. The precise mathematical definition of chaos requires that the dynamical system exhibit mixing behavior with positive Kolmogorov–Sinai entropy (or positive average Lyapunov exponent). This definition of chaos invokes a number of concepts from ergodic theory, which is a branch of mathematics that arose in response to attempts to reconcile statistical mechanics with the deterministic equations of classical mechanics. Although the equations that describe the evolution of chaotic dynamical systems are fully deterministic (no averages over random forces or initial conditions are involved), the complexity of the dynamics invites a statistical description. Consequently, the statistical concepts of ergodic theory provide a natural language to define and characterize chaotic behavior.

1. Ergodicity

A central concept familiar to physicists because of its importance to the foundations of statistical mechanics is the notion of ergodicity. Roughly speaking, a dynamical system is ergodic if the system comes arbitrarily close to every possible point (or state) in the accessible phase space over time. In this case, the celebrated ergodic theorem guarantees that long-time averages of any physical quantity can be determined by performing averages over phase space with respect to a probability distribution. However, although there has been considerable confusion in the physical literature, ergodicity alone is not sufficiently irregular to account for the complex behavior of turbulent flows or interacting many-body systems. A simple mathematical example clearly reveals these limitations.

Consider the dynamical system described by the difference equation,

$$x_{n+1} = x_n + a \quad (\mathrm{Mod}\ 1) \qquad (1)$$

which takes a real number $x_n$ between 0 and 1, adds another real number $a$, and subtracts the integer part of the sum (Mod 1) to return a value of $x_{n+1}$ on the unit interval [0, 1]. The sequence of numbers $\{x_n\}_{n=0,1,2,3,\ldots}$ generated by iterating this one-dimensional map describes the time history of the dynamical variable $x_n$ (where time is measured in discrete units labeled by $n$). If $a = p/q$ is a rational number (where $p$ and $q$ are integers), then starting with any initial $x_0$, this dynamical system generates a time sequence of $x_n$ that returns to $x_0$ after $q$ iterations, since $x_q = x_0 + p$ (Mod 1) $= x_0$ (Mod 1). In this case, the long-time behavior is described by a periodic cycle of period $q$ that visits only $q$ different values of $x$ on the unit interval [0, 1]. Since this time sequence does not come arbitrarily close to every point in the unit interval (which is the phase or state space of this dynamical system), this map is not ergodic for rational values of $a$.

However, if $a$ is an irrational number, the time sequence never repeats and $x_n$ will come arbitrarily close to every point in the unit interval. Moreover, since the time sequence visits every region of the unit interval with equal probability, the long-time averages of any functions of the dynamical variable $x$ can be replaced by spatial averages with respect to the uniform probability distribution $P(x) = 1$ for $x$ in [0, 1]. Therefore, for irrational values of $a$, this dynamical system, described by a single, deterministic difference equation, is an ergodic system.

Unfortunately, the time sequence generated by this map is much too regular to be chaotic. For example, if we initially colored all the points in the phase space between 0 and 1/4 red and iterated the map, then the red points would remain clumped together in a continuous interval (Mod 1) for all time. But, if we pour a little cream in a stirred cup of coffee or release a dyed gas in the corner of the room, the different particles of the cream or the colored gas quickly spread uniformly over the accessible phase space.
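The behavior of Eq. (1) described above can be checked with a short numerical sketch. The function name `rotation_orbit`, the sample values $a = 3/7$ and $x_0 = 1/10$, and the iteration counts are illustrative assumptions, not part of the original discussion; the irrational value of $a$ is the golden-mean value used in Fig. 1.

```python
from fractions import Fraction
import math

def rotation_orbit(x0, a, n):
    """Return [x_0, ..., x_n] for the map x_{k+1} = (x_k + a) mod 1 of Eq. (1)."""
    orbit = [x0]
    for _ in range(n):
        orbit.append((orbit[-1] + a) % 1)
    return orbit

# Rational a = p/q (here 3/7, an arbitrary choice): exact rational
# arithmetic shows that the orbit closes after q = 7 iterations.
x0 = Fraction(1, 10)
rational_orbit = rotation_orbit(x0, Fraction(3, 7), 7)
period_returns = rational_orbit[7] == x0      # x_q equals x_0
distinct_values = len(set(rational_orbit))    # only q distinct points are visited

# Irrational a, approximated in floating point.
a = (math.sqrt(5) - 1) / 2
orbit = rotation_orbit(0.1, a, 100_000)
time_average = sum(orbit) / len(orbit)        # compare with the space average 1/2

# "Red points": the end points of the interval [0, 1/4] keep their
# separation (Mod 1) however often the map is applied.
left, right = 0.0, 0.25
for _ in range(1000):
    left, right = (left + a) % 1, (right + a) % 1
width = (right - left) % 1

print(period_returns, distinct_values, time_average, width)
```

The time average of $x$ approaches the uniform space average 1/2, yet the red interval never spreads, which is the sense in which the map is ergodic but not mixing.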

2. Mixing

A stronger notion of statistical behavior is required to describe turbulent flows and the approach to equilibrium in many-body systems. In ergodic theory, this property is naturally called mixing. Roughly speaking, a dynamical system described by deterministic difference or differential equations is said to be a mixing system if sets of initial conditions that cover limited regions of the phase space spread throughout the accessible phase space and evolve in time like the particles of cream in coffee. Once again a simple difference equation serves to illustrate this concept.

Consider the shift map:

$$x_{n+1} = 2x_n \quad (\mathrm{Mod}\ 1) \qquad (2)$$

which takes $x_n$ on the unit interval, multiplies it by 2, and subtracts the integer part to return a value of $x_{n+1}$ on the unit interval. If we take almost any initial condition, $x_0$, then this deterministic map generates a time sequence $x_n$ that never repeats and for long times is indistinguishable from a random process. Since the successive iterates wander over the entire unit interval and come arbitrarily close to every point in the phase space, this map is ergodic. Moreover, as for Eq. (1), the long-time averages of any function of the $x_n$ can be replaced by the spatial average with respect to the uniform probability distribution $P(x) = 1$.

However, the dynamics of each individual trajectory is much more irregular than that generated by Eq. (1). If we were to start with a set of red initial conditions on the interval [0, 1/4], then it is easy to see that these points would be uniformly dispersed on the unit interval after only two iterations of the map. Therefore, we call this dynamical system a mixing system. (Of course, if we were to choose very special initial conditions, such as $x_0 = 0$ or $x_0 = p/2^m$, where $p$ and $m$ are positive integers, then the time sequence would still be periodic. However, in the set of all possible initial conditions, these exceptional initial conditions are very rare. Mathematically, they comprise a set of zero measure, which means the chance of choosing one of these special initial conditions by accident is nil.)
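The dispersal of the red points can be verified directly. The following is a minimal sketch, assuming a sample of 1000 evenly spaced initial conditions on [0, 1/4) and exact rational arithmetic; the names `shift`, `red`, and `counts` are illustrative.

```python
from fractions import Fraction

def shift(x):
    """One iteration of Eq. (2): x -> 2x mod 1."""
    return (2 * x) % 1

# 1000 evenly spaced "red" initial conditions on [0, 1/4).
N = 1000
red = [Fraction(k, 4 * N) for k in range(N)]

# Apply the map twice to every red point.
after_two = [shift(shift(x)) for x in red]

# Count how many red points land in each quarter of the unit interval.
counts = [0, 0, 0, 0]
for x in after_two:
    counts[int(4 * x)] += 1

print(counts)  # [250, 250, 250, 250]
```

After two iterations the sample that started in a single quarter of the unit interval populates all four quarters equally, in contrast to the rigid translation produced by Eq. (1).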

It is very easy to see that the time sequences generated by the vast majority of possible initial conditions are as random as the time sequence generated by flipping a coin. Simply write the initial condition in binary representation, that is, $x_0 = 0.0110011011100011010\ldots$. Multiplication by 2 corresponds to a register shift that moves the binary point to the right (just like multiplying a decimal number by 10). Therefore, when we iterate Eq. (2), we read off successive digits in the initial condition. If the leading digit to the left of the binary point is a 1, then the Mod 1 replaces it by a 0. Since a theorem by Martin-Löf guarantees that the binary digits of almost every real number are a random sequence with no apparent order, the time sequence $x_n$ generated by iterating this map will also be random. In particular, if we call out heads whenever the leading digit is a 1 (which means that $x_n$ lies on the interval [1/2, 1]) and tails whenever the leading digit is a 0 (which means that $x_n$ lies on the interval [0, 1/2]), then the time sequence $x_n$ generated by this deterministic difference equation will jump back and forth between the left and right halves of the unit interval in a process that is indistinguishable from that generated by a series of coin flips.
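The correspondence between iterating Eq. (2) and reading off binary digits can be sketched as follows. The initial condition is truncated to the 19 binary digits quoted above, which is an assumption made so that exact rational arithmetic can be used; the helper names are illustrative.

```python
from fractions import Fraction

def shift(x):
    """One iteration of Eq. (2): x -> 2x mod 1."""
    return (2 * x) % 1

def coin_flips(x0, n):
    """Iterate the shift map n times, recording H when x_k >= 1/2 and T otherwise."""
    flips, x = [], x0
    for _ in range(n):
        flips.append("H" if x >= Fraction(1, 2) else "T")
        x = shift(x)
    return "".join(flips)

# Binary digits of the initial condition, truncated to a finite string.
bits = "0110011011100011010"
x0 = Fraction(int(bits, 2), 2 ** len(bits))

flips = coin_flips(x0, len(bits))
expected = bits.replace("1", "H").replace("0", "T")
print(flips == expected)  # True: the heads/tails record is the digit string

# Caveat: a binary floating-point number carries only 53 significant bits,
# so a naive floating-point iteration exhausts its digits and sticks at 0.
x = 0.1
for _ in range(60):
    x = shift(x)
float_collapsed = (x == 0.0)
print(float_collapsed)  # True
```

The floating-point collapse is a reminder that every finite binary string is one of the exceptional initial conditions $p/2^m$; long numerical runs of this map therefore need exact or extended-precision arithmetic.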

The technical definition of chaos refers to the behavior of the time sequence generated by a mixing system, such as the shift map defined by Eq. (2). This simple, deterministic dynamical system with random behavior is the simplest chaotic system, and it serves as the paradigm for all chaotic systems.

3. Extreme Sensitivity to Initial Conditions

One of the essential characteristics of chaotic systems is that they exhibit extreme sensitivity to initial conditions. This means that two trajectories with initial conditions that are arbitrarily close will diverge at an exponential rate. The exponential rate of divergence in mixing systems is related to a positive Kolmogorov–Sinai entropy. For simple systems, such as the one-dimensional maps defined by Eqs. (1) and (2), this local instability is characterized by the average Lyapunov exponent, which in practice is much easier to evaluate than the Kolmogorov–Sinai entropy.

FIGURE 1 A graph of the return map defined by Eq. (1) for $a = (\sqrt{5} - 1)/2 \approx 0.618$. The successive values of the time sequence $\{x_n\}_{n=1,2,3,\ldots}$ are simply determined by taking the old values of $x_n$ and reading off the new values $x_{n+1}$ from the graph.

FIGURE 2 A graph of the return map defined by Eq. (2). For values of $x_n$ between 0 and 0.5, the map increases linearly with slope 2; but for $x_n$ larger than 0.5, the action of the Mod 1 requires that the line reenter the unit square at 0.5 and rise again to 1.0.

It is easy to see that Eq. (2) exhibits extreme sensitivity to initial conditions with a positive average Lyapunov exponent, while Eq. (1) does not. If we consider two nearby initial conditions $x_0$ and $y_0$, which are $d_0 = |x_0 - y_0|$ apart, then after one iteration of a map, $x_{n+1} = F(x_n)$, of the form of Eqs. (1) or (2), the two trajectories will be approximately separated by a distance $d_1 = |(dF/dx)(x_0)|\, d_0$. Clearly, if $|dF/dx| < 1$, the distance between the two points decreases; if $|dF/dx| > 1$, the distance increases; while if $|dF/dx| = 1$, the two trajectories remain approximately the same distance apart. We can easily see by differentiating the map or looking at the slopes of the graphs of the return maps in Figs. 1 and 2 that $|dF/dx| = 1$ for Eq. (1), while $|dF/dx| = 2$ for Eq. (2). Therefore, after many iterations of Eq. (1), nearby initial conditions will generate trajectories that stay close together (the red points remain clumped), while the trajectories generated by Eq. (2) diverge at an exponential rate (the red points mix throughout the phase space). Moreover, the average Lyapunov exponent for these one-dimensional maps, defined by:

$$\lambda = \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N} \ln \left| \frac{dF}{dx}(x_n) \right| \qquad (3)$$

provides a direct measure of the exponential rate of divergence of nearby trajectories. Since the slopes of the return maps for Eqs. (1) and (2) are the same for almost all values of $x_n$, the average Lyapunov exponents can be easily evaluated. For Eq. (1), we get $\lambda = 0$, while Eq. (2) gives $\lambda = \ln 2 > 0$.
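The two Lyapunov exponents quoted above can be reproduced with a few lines of code. This is a sketch under stated assumptions: the average is taken over a finite number of terms (1000, an arbitrary choice) rather than the limit in Eq. (3), and the function `average_lyapunov` and the initial conditions are illustrative.

```python
import math
from fractions import Fraction

def shift(x):
    """Eq. (2): x -> 2x mod 1."""
    return (2 * x) % 1

def average_lyapunov(F, dF, x0, N):
    """Finite-N estimate of Eq. (3): the mean of ln|dF/dx| along an orbit."""
    total, x = 0.0, x0
    for _ in range(N):
        total += math.log(abs(dF(x)))
        x = F(x)
    return total / N

a = (math.sqrt(5) - 1) / 2
lam_rotation = average_lyapunov(lambda x: (x + a) % 1, lambda x: 1.0, 0.1, 1000)
lam_shift = average_lyapunov(shift, lambda x: 2.0, Fraction(1, 7), 1000)
print(lam_rotation, lam_shift, math.log(2))

# Direct check of the divergence: two initial conditions 2**-30 apart
# are 2**-20 apart (Mod 1) after 10 iterations of Eq. (2).
d0 = Fraction(1, 2 ** 30)
x, y = Fraction(1, 7), Fraction(1, 7) + d0
for _ in range(10):
    x, y = shift(x), shift(y)
separation = (y - x) % 1
print(separation == Fraction(1, 2 ** 20))  # True
```

Because both maps have constant slope, the estimate does not depend on the orbit chosen; for maps with varying slope, such as those in later sections, the orbit average in Eq. (3) becomes essential.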

However, it is important to note that not all trajectories generated by Eq. (2) diverge exponentially. As mentioned earlier, the rational $x_0$'s with even denominators generate regular periodic orbits. Although these points are a set of measure zero compared with all of the real numbers on the unit interval, they are dense, which means that in every subinterval, no matter how small, we can always find one of these periodic orbits. The significance of these special trajectories is that, figuratively speaking, they play the role of rocks or obstructions in a rushing stream around which the other trajectories must wander. If this dense set of periodic points were not present in the phase space, then the extreme sensitivity to initial conditions alone would not be sufficient to guarantee mixing behavior. For example, if we iterated Eq. (2) without the Mod 1, then all trajectories would diverge exponentially, but the red points would never be able to spread throughout the accessible space, which would consist of the entire positive real axis. In this case, the dynamical system is simply unstable, not chaotic.


4. Unpredictability

One important consequence of the extreme sensitivity to initial conditions is that long-term prediction of the evolution of chaotic dynamical systems, like predicting the weather, is a practical impossibility. Although chaotic dynamical systems are fully deterministic, which means that once the initial conditions are specified the solution of the differential or difference equations is uniquely determined for all time, this does not mean that it is humanly possible to find the solution for all time. If nearby initial conditions diverge exponentially, then any errors in specifying the initial conditions, no matter how small, will also grow exponentially. For example, if we can only specify the initial condition in Eq. (2) to an accuracy of one part in a thousand, then the uncertainty in predicting $x_n$ will double each time step. After only 10 time steps, the uncertainty will be as large as the entire phase space, so that even approximate predictions will be impossible. (If we can specify the initial conditions to double-precision accuracy on a digital computer [1 part in $10^{18}$], then we can only provide approximate predictions of the future values of $x_n$ for 60 time steps before the error spans the entire unit interval.) In contrast, if we specify the initial condition for Eq. (1) to an accuracy of $10^{-3}$, then we can always predict the future values of $x_n$ to the same accuracy. (Of course, errors in the time evolution can also arise from uncertainties in the parameters in the equations of evolution. However, if we can only specify the parameter $a$ in Eq. (1) to an accuracy of $10^{-3}$, we could still make approximate predictions for the time sequence for as many as $10^3$ iterations before the uncertainty becomes as large as the unit interval.)
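The step counts quoted in this paragraph follow from simple arithmetic on the error growth. A minimal sketch, in which the function name `prediction_horizon` is hypothetical and the accuracy values are those stated in the text:

```python
import math

def prediction_horizon(initial_error, growth_factor=2.0):
    """Number of iterations until an error that is multiplied by
    growth_factor at every step becomes as large as the unit interval."""
    return math.ceil(math.log(1.0 / initial_error) / math.log(growth_factor))

# Eq. (2): the error doubles at every step.
horizon_coarse = prediction_horizon(1e-3)    # accuracy of one part in a thousand
horizon_fine = prediction_horizon(1e-18)     # accuracy of one part in 10**18
print(horizon_coarse, horizon_fine)          # 10 60

# Eq. (1) with an uncertainty of 1e-3 in the parameter a: the error grows
# only linearly (by 1e-3 per step), so it reaches 1 after about 1/1e-3 steps.
horizon_linear = round(1.0 / 1e-3)
print(horizon_linear)                        # 1000
```

Improving the initial accuracy by fifteen orders of magnitude extends the prediction horizon of the chaotic map only from 10 to 60 steps, whereas the horizon of the regular map scales in direct proportion to the accuracy.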

B. Fractals

Another common feature of chaotic dynamical systems is the natural appearance of geometrical structures with noninteger dimensions. For example, in dissipative dynamical systems, described by systems of differential equations, the presence of dissipation (friction) causes the long-time behavior to converge to a geometrical structure in the phase space called an attractor. The attractor may consist of a single fixed point with dimension 0, a periodic limit cycle described by a closed curve with dimension 1, or, if the long-time dynamics is chaotic, the attracting set may resemble a curve with an infinite number of twists, turns, and folds that never closes on itself. This strange attractor is more than a simple curve with dimension 1, but it may fail to completely cover an area of dimension 2. In addition, strange attractors are found to exhibit the same level of structure on all scales. If we look at the complex structure through a microscope, it does not look any simpler no matter how much we increase the magnification. The term fractal was coined by Benoit Mandelbrot to describe these complex geometrical objects. Like the shapes of snowflakes and clouds and the outlines of coastlines and mountain ranges, these fractal objects are best characterized by a noninteger dimension.

The simplest geometrical object with a noninteger dimension is the middle-thirds Cantor set. If we take the unit interval [0, 1] and remove all of the points in the middle third, then we will be left with a set consisting of two pieces, [0, 1/3] and [2/3, 1], each of length 1/3. If we remove the middle thirds of these remaining pieces, we get a set consisting of four pieces of length 1/9. By repeating this construction ad infinitum, we end up with a strange set of points called a Cantor set. Although it consists of points, none is isolated. In fact, if we magnify any interval containing elements of the set (for example, the segment contained on the interval $[0, 1/3^n]$ for any positive $n$), then the magnified interval will look the same as the complete set (see Fig. 3).

FIGURE 3 The middle-thirds Cantor set is constructed by first removing the points in the middle third of the unit interval and then successively removing the middle thirds of the remaining intervals ad infinitum. This figure shows the first four stages of Cantor set construction. After the first two steps, the first segment is magnified to illustrate the self-similar structure of the set.

In order to calculate a dimension for this set, we must first provide a mathematical definition of dimension that agrees with our natural intuition for geometrical objects with integer dimension. Although there are a variety of different definitions of dimension corresponding to different levels of mathematical rigor, one definition that serves our purpose is to define the dimension of a geometrical object in terms of the number of boxes of uniform size required to cover the object. For example, if we consider two-dimensional boxes with sides of length $L$ (for example, $L = 1$ cm), then the number of boxes required to cover a two-dimensional object, $N(L)$, will be approximately equal to the area measured in units of $L^2$ (that is, cm$^2$). Now, if we decrease the size of the boxes to $L'$, then the number of boxes $N(L')$ will increase approximately as $(L/L')^2$. [If $L' = 1$ mm, then $N(L') \approx 100\,N(L)$.] Similarly, if we try to cover a one-dimensional object, such as a closed curve, with these boxes, the number of boxes will increase only as $(L/L')$. (For $L = 1$ cm and $L' = 1$ mm, $N(L)$ will be approximately equal to the length of the curve in centimeters, while $N(L')$ will be approximately the length of the curve in millimeters.) In general,

$$N(L) \propto L^{-d} \qquad (4)$$

where $d$ is the dimension of the object. Therefore, one natural mathematical definition of dimension is provided by the equation:

$$d = \lim_{L \to 0} \frac{\log N(L)}{\log(1/L)} \qquad (5)$$

obtained by taking the logarithm of both sides of Eq. (4). For common geometrical objects, such as a point, a simple curve, an area, or a volume, this definition yields the usual integer dimensions 0, 1, 2, and 3, respectively. However, for the strange fractal sets associated with many chaotic dynamical systems, this definition allows for the possibility of noninteger values. For example, if we count the number of boxes required to cover the middle-thirds Cantor set at each level of construction, we find that we can always cover every point in the set using $2^n$ boxes of length $(1/3)^n$ (that is, 2 boxes of length 1/3, 4 boxes of length 1/9, etc.). Therefore, Eq. (5) yields a dimension of $d = \log 2/\log 3 = 0.63093\ldots$, which reflects the intricate self-similar structure of this set.
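The box count for the Cantor set can be carried out explicitly. The sketch below assumes a finite construction level (10) and stores interval end points as integers in units of $3^{-\mathrm{level}}$ to avoid rounding; the function names are illustrative.

```python
import math

def cantor_left_endpoints(level):
    """Left end points of the 2**level intervals that remain after `level`
    construction steps, as integers in units of 3**-level."""
    endpoints = [0]
    for _ in range(level):
        # Each remaining interval keeps its first and last thirds.
        endpoints = [3 * e + offset for e in endpoints for offset in (0, 2)]
    return endpoints

def box_count(level, k):
    """Number of boxes of length 3**-k that contain part of the level-`level`
    approximation to the Cantor set (requires k <= level)."""
    scale = 3 ** (level - k)
    return len({e // scale for e in cantor_left_endpoints(level)})

level = 10
counts = [box_count(level, k) for k in range(1, level + 1)]

# Eq. (5) evaluated at box size L = 3**-k.
dims = [math.log(n) / math.log(3 ** k) for k, n in enumerate(counts, start=1)]

print(counts)      # 2, 4, 8, ..., 1024
print(dims[-1])    # approximately 0.63093
```

For this exactly self-similar set the ratio in Eq. (5) is the same at every box size, so no limit needs to be taken; for attractors of real dynamical systems the ratio must be estimated from the slope of $\log N(L)$ against $\log(1/L)$ over a range of scales.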

III. DISSIPATIVE DYNAMICAL SYSTEMS

In this section, we will examine three important examples of simple mathematical models that can exhibit chaotic behavior and that arise in applications to problems in science and engineering. Each represents a dynamical system with dissipation so that the long-time behavior converges to an attractor in the phase space. The examples increase in complexity from a single difference equation, such as Eqs. (1) and (2), to a system of two coupled difference equations and then to a system of three coupled ordinary differential equations. Each example illustrates the characteristic properties of chaos in dissipative dynamical systems with irregular, unpredictable behavior that exhibits extreme sensitivity to initial conditions and fractal attractors.

A. The Logistic Map

The first example is a one-dimensional difference equation, like Eqs. (1) and (2), called the logistic map, which is defined by:

$$x_{n+1} = a x_n (1 - x_n) \equiv F(x_n) \tag{6}$$

For values of the control parameter $a$ between 0 and 4, this nonlinear difference equation also takes values of $x_n$ between 0 and 1 and returns a value $x_{n+1}$ on the unit interval. However, as $a$ is varied, the time sequences $x_n$ generated by this map exhibit extraordinary transitions from regular behavior, such as that generated by Eq. (1), to chaos, such as that generated by Eq. (2). Although this mathematical model is too simple to be directly applicable to problems in physics and engineering, which are usually described by differential equations, Mitchell Feigenbaum has shown that the transition from order to chaos in dissipative dynamical systems exhibits universal characteristics (to be discussed later), so that the logistic map is representative of a large class of dissipative dynamical systems. Moreover, since the analysis of this deceptively simple difference equation involves a number of standard techniques used in the study of nonlinear dynamical systems, we will examine it in considerable detail.

As noted in the seminal review article in 1974 by Robert May, a biologist who considered the logistic map as a model for annual variations of insect populations, the time evolution generated by the map can be easily studied using a graphical analysis of the return maps displayed in Fig. 4. Equation (6) describes an inverted parabola that intercepts the $x_{n+1} = 0$ axis at $x_n = 0$ and 1, with a maximum of $x_{n+1} = a/4$ at $x_n = 0.5$. Although this map can be easily iterated using a short computer program, the qualitative behavior of the time sequence $x_n$ generated by any initial $x_0$ can be examined by simply tracing lines on the graph of the return map with a pencil as illustrated in Fig. 4.

For values of $a < 1$, almost every initial condition is attracted to $x = 0$ as shown in Fig. 4 for $a = 0.95$. Clearly, $x = 0$ is a fixed point of the nonlinear map. If we start with $x_0 = 0$, then the logistic map returns the value $x_n = 0$ for all future iterations. Moreover, a simple linear analysis, such as that used to define the Lyapunov exponent in Section II, shows that for $a < 1$ this fixed point is stable. (Initial conditions that are slightly displaced from the origin will be attracted back since $|(dF/dx)(0)| = a < 1$.)

FIGURE 4 Return maps for the logistic map, Eq. (6), are shown for four different values of the control parameter $a$. These figures illustrate how pencil and paper can be used to compute the time evolution of the map. For example, if we start our pencil at an initial value of $x_0 = 0.6$ for $a = 0.95$, then the new value of $x_1$ is determined by tracing vertically to the graph of the inverted parabola. Then, to get $x_2$, we could return to the horizontal axis and repeat this procedure, but it is easier to simply reflect off the 45° line and return to the parabola. Successive iterations of this procedure give rapid convergence to the stable, fixed point at $x = 0$. However, if we start at $x_0 = 0.1$ for $a = 2.9$, our pencil computer diverges from $x = 0$ and eventually settles down to a stable, fixed point at the intersection of the parabola and the 45° line. Then, when we increase $a > 3$, this fixed point repels the trace of the trajectory, which settles into either a periodic cycle, such as the period-2 cycle for $a = 3.2$, or a chaotic orbit, such as that for $a = 4.0$.

However, when the control parameter is increased to $a > 1$, this fixed point becomes unstable and the long-time behavior is attracted to a new fixed point, as shown in Fig. 4 for $a = 2.9$, which lies at the other intersection of the 45° line and the graph of the return map. In this case, the dynamical system approaches an equilibrium with a nonzero value of the dependent variable $x$. Elementary algebra shows that this point corresponds to the nonzero root of the quadratic equation $x = ax(1 - x)$ given by $x^* = (a - 1)/a$. Again, a simple linear analysis of small displacements from this fixed point reveals that it remains stable for values of $a$ between 1 and 3. When $a$ becomes larger than 3, this fixed point also becomes unstable and the long-time behavior becomes more complicated, as shown in Fig. 4.
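The fixed-point behavior described above is easy to reproduce numerically. The sketch below is illustrative only: the helper names are assumptions, and the slope formula $dF/dx(x^*) = 2 - a$ is obtained by substituting $x^* = (a-1)/a$ into $dF/dx = a(1 - 2x)$, which is the linear stability test the text refers to.

```python
def logistic(x, a):
    """One step of the logistic map, Eq. (6)."""
    return a * x * (1.0 - x)


def iterate(x0, a, n_steps):
    """Return x after n_steps applications of the logistic map."""
    x = x0
    for _ in range(n_steps):
        x = logistic(x, a)
    return x


def fixed_point_slope(a):
    """Slope dF/dx at the nonzero fixed point x* = (a - 1)/a; algebraically 2 - a."""
    x_star = (a - 1.0) / a
    return a * (1.0 - 2.0 * x_star)


# a = 0.95: orbits decay to x = 0.  a = 2.9: orbits settle on x* = 1.9/2.9.
print(iterate(0.6, 0.95, 2000))
print(iterate(0.1, 2.9, 2000), (2.9 - 1.0) / 2.9)
# |slope| < 1 means stable; the slope passes through -1 at a = 3.
print(fixed_point_slope(2.9), fixed_point_slope(3.2))
```

The starting values 0.6 and 0.1 are the ones used in the caption of Fig. 4; the number of iterations (2000) is an arbitrary choice that is more than enough for convergence at these parameter values.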

1. Period Doubling

For values of $a$ slightly bigger than 3, empirical observations of the time sequences for this nonlinear dynamical system generated by using a hand calculator, a digital computer, or our “pencil computer” reveal that the long-time behavior approaches a periodic cycle of period 2, which alternates between two different values of $x$. Because of the large nonlinearity in the difference equation, this periodic behavior could not be deduced from any analytical arguments based on exact solutions or from perturbation theory. However, as typically occurs in the field of nonlinear dynamics, the empirical observations provide us with clues to new analytical procedures for describing and understanding the dynamics. Once again, the graphical analysis provides an easy way of understanding the origin of the period-2 cycle.

FIGURE 5 The return maps are shown for the second iterate of the logistic map, $F^{(2)}$, defined by Eq. (7). The fixed points at the intersection of the 45° line and the map correspond to values of $x$ that repeat every two periods. For $a = 2.9$, the two intersections are just the period-1 fixed points at 0 and $x^*$, which repeat every period and therefore every other period, as well. However, when $a$ is increased to 3.2, the peaks and valleys of the return map become more pronounced and pass through the 45° line and two new fixed points appear. Both of the old, fixed points are now unstable because the absolute value of the slope of the return map is larger than 1, but the new points are stable, and they correspond to the two elements of the period-2 cycle displayed in Fig. 4. Moreover, because the portion of the return map contained in the dashed box resembles an inverted image of the original logistic map, one might expect that the same bifurcation process will be repeated for each of these period-2 points as $a$ is increased further.

Consider a new map,

$$x_{n+2} = F^{(2)}(x_n) = F[F(x_n)] = a^2\left(x_n - x_n^2\right) - a^3\left(x_n^2 - 2x_n^3 + x_n^4\right) \tag{7}$$

constructed by composing the logistic map with itself. The graph of the corresponding return map, which gives the values of $x_n$ every other iteration of the logistic map, is displayed in Fig. 5. If we use the same methods of analysis as we applied to Eq. (6), we find that there can be at most four fixed points that correspond to the intersection of the graph of the quartic return map with the 45° line. Because the fixed points of Eq. (7) are values of $x$ that return every other iteration, these points must be members of the period-2 cycles of the original logistic map. However, since the period-1 fixed points of the logistic map at $x = 0$ and $x^*$ are automatically period-2 points, two of the fixed points of Eq. (7) must be $x = 0$ and $x = x^*$. When $1 < a < 3$, these are the only two fixed points of Eq. (7), as shown in Fig. 5 for $a = 2.9$. However, when $a$ is increased above 3, two new fixed points of Eq. (7) appear, as shown in Fig. 5 for $a = 3.2$, on either side of the fixed point at $x = x^*$, which has just become unstable.

Therefore, when the stable period-1 point at $x^*$ becomes unstable, it gives birth to a pair of fixed points, $x^{(1)}$ and $x^{(2)}$, of Eq. (7), which form the elements of the period-2 cycle found empirically for the logistic map. This process is called a pitchfork bifurcation. For values of $a$ just above 3, these new fixed points are stable and the long-time dynamics of the second iterate of the logistic map, $F^{(2)}$, is attracted to one or the other of these fixed points. However, as $a$ increases, the new fixed points move away from $x^*$, the graphs of the return maps for Eq. (7) get steeper and steeper, and when $|dF^{(2)}/dx| > 1$ at $x^{(1)}$ and $x^{(2)}$ the period-2 cycle also becomes unstable. (A simple application of the chain rule of differential calculus shows that both periodic points destabilize at the same value of $a$, since $F(x^{(1)}) = x^{(2)}$, $F(x^{(2)}) = x^{(1)}$, and $(dF^{(2)}/dx)(x^{(1)}) = (dF/dx)(x^{(2)})\,(dF/dx)(x^{(1)}) = (dF^{(2)}/dx)(x^{(2)})$.)
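The period-2 points and the chain-rule statement can be made concrete. The sketch below is an illustration under stated assumptions: the closed-form expression used in `period2_points` is not given in the text. It follows from dividing the fixed-point equation of Eq. (7) by the two period-1 factors, which leaves the quadratic $a^2x^2 - a(a+1)x + (a+1) = 0$. All function names are hypothetical.

```python
import math


def logistic(x, a):
    """One step of the logistic map, Eq. (6)."""
    return a * x * (1.0 - x)


def logistic_slope(x, a):
    """Derivative dF/dx of the logistic map."""
    return a * (1.0 - 2.0 * x)


def period2_points(a):
    """Roots of a^2 x^2 - a (a + 1) x + (a + 1) = 0; real and distinct for a > 3."""
    root = math.sqrt((a + 1.0) * (a - 3.0))
    return (a + 1.0 - root) / (2.0 * a), (a + 1.0 + root) / (2.0 * a)


def cycle_multiplier(a):
    """Chain rule: the slope of F^(2) is the same product at both cycle points."""
    x1, x2 = period2_points(a)
    return logistic_slope(x1, a) * logistic_slope(x2, a)


x1, x2 = period2_points(3.2)
print(x1, x2, logistic(x1, 3.2), logistic(x2, 3.2))  # F swaps the two points
print(cycle_multiplier(3.2), cycle_multiplier(3.5))  # stable, then unstable
```

Expanding the product gives a multiplier of $4 + 2a - a^2$, which reaches $-1$ at $a = 1 + \sqrt{6} \approx 3.449$; under this sketch that is the parameter value at which the period-2 cycle loses stability.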

Once again, empirical observations of the long-time behavior of the iterates of the map reveal that when the period-2 cycle becomes unstable it gives birth to a stable period-4 cycle. Then, as $a$ increases, the period-4 cycle becomes unstable and undergoes a pitchfork bifurcation to a period-8 cycle, then a period-16 cycle, and so on. Since the successive period-doubling bifurcations require smaller and smaller changes in the control parameter, this bifurcation sequence rapidly accumulates to a cycle of infinite period at $a_\infty = 3.57\ldots$.

FIGURE 6 A bifurcation diagram illustrates the variety of long-time behavior exhibited by the logistic map as the control parameter $a$ is increased from 3.5 to 4.0. The sequences of period-doubling bifurcations from period-4 to period-8 to period-16 are clearly visible in addition to ranges of $a$ in which the orbits appear to wander over continuous intervals and ranges of $a$ in which periodic orbits, including odd periods, appear to emerge from the chaos.

This sequence of pitchfork bifurcations is clearly displayed in the bifurcation diagram shown in Fig. 6. This graph is generated by iterating the map for several hundred time steps for successive values of $a$. For each value of $a$, we plot only the last hundred values of $x_n$ to display the long-time behavior. For $a < 3$, all of these points land close to the fixed point at $x^*$; for $a > 3$, these points alternate between the two period-2 points, then between the four period-4 points, and so on.
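The procedure for building the bifurcation diagram can be sketched as follows. This is a hedged illustration rather than the program used for Fig. 6: the function names, the initial value 0.3, the 2000-step transient (longer than the "several hundred" steps mentioned in the text, to be safe near bifurcation points), and the tolerance-based period detector are all assumptions of this sketch.

```python
def logistic(x, a):
    """One step of the logistic map, Eq. (6)."""
    return a * x * (1.0 - x)


def attractor_samples(a, x0=0.3, n_transient=2000, n_keep=100):
    """Discard a transient, then return the last n_keep iterates for plotting."""
    x = x0
    for _ in range(n_transient):
        x = logistic(x, a)
    samples = []
    for _ in range(n_keep):
        x = logistic(x, a)
        samples.append(x)
    return samples


def detected_period(samples, tol=1e-6, max_period=32):
    """Smallest p for which the samples repeat every p steps (within tol), else None."""
    for p in range(1, max_period + 1):
        if all(abs(samples[i + p] - samples[i]) < tol
               for i in range(len(samples) - p)):
            return p
    return None


# The (a, x) pairs below are the points one would plot in the diagram.
diagram = [(a, x)
           for a in (2.9, 3.2, 3.5, 3.9)
           for x in attractor_samples(a)]
for a in (2.9, 3.2, 3.5, 3.9):
    print(a, detected_period(attractor_samples(a)))
```

A result of `None` simply means that no period up to `max_period` was found in the retained samples, which is what one expects where the orbit wanders over a continuous interval.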

The origin of each of these new periodic cycles can be qualitatively understood by applying the same analysis that we used to explain the birth of the period-2 cycle from period 1. For the period-4 cycle, we consider the second iterate of the period-2 map:

$$x_{n+4} = F^{(4)}(x_n) = F^{(2)}\left[F^{(2)}(x_n)\right] = F(F(F(F(x_n)))) \tag{8}$$

In this case, the return map is described by a polynomial of degree 16 that can have as many as 16 fixed points that correspond to intersections of the 45° line with the graph of the return map. Two of these period-4 points correspond to the period-1 fixed points at 0 and $x^*$, and for $a > 3$, two correspond to the period-2 points at $x^{(1)}$ and $x^{(2)}$. The remaining 12 period-4 points can form three different period-4 cycles that appear for different values of $a$. Figure 7 shows a graph of $F^{(4)}(x_n)$ for $a = 3.2$, where the period-2 cycle is still stable, and for $a = 3.5$, where the unstable period-2 cycle has bifurcated into a period-4 cycle. (The other two period-4 cycles are only briefly stable for other values of $a > a_\infty$.)

We could repeat the same arguments to describe the origin of period 8; however, now the graph of the return map of the corresponding polynomial of degree 256 would begin to tax the abilities of our graphics display terminal as well as our eyes. Fortunately, the “slaving” of the stability properties of each periodic point via the chain-rule argument (described previously for the period-2 cycle) means that we only have to focus on the behavior of the successive iterates of the map in the vicinity of the periodic point closest to $x = 0.5$. In fact, a close examination of Figs. 4, 5, and 7 reveals that the bifurcation process for each $F^{(N)}$ is simply a miniature replica of the original period-doubling bifurcation from the period-1 cycle to the period-2 cycle. In each case, the return map is locally described by a parabolic curve (although it is not exactly a parabola beyond the first iteration, and the curve is flipped over for every other $F^{(N)}$).

Because each successive period-doubling bifurcation is described by the fixed points of a return map $x_{n+N} = F^{(N)}(x_n)$ with ever greater oscillations on the unit interval, the amount the parameter $a$ must increase before the next bifurcation decreases rapidly, as shown in the bifurcation diagram in Fig. 6. The differences in the changes in the control parameter for each succeeding bifurcation, $a_{n+1} - a_n$, decrease at a geometric rate that is found to rapidly converge to a value of:

$$\delta = \lim_{n \to \infty} \frac{a_n - a_{n-1}}{a_{n+1} - a_n} = 4.6692016\ldots \tag{9}$$

In addition, the maximum separation of the stable daughter cycles of each pitchfork bifurcation also decreases rapidly, as shown in Fig. 6, by a geometric factor that rapidly converges to:

α = 2.502907875 . . . (10)
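The constant $\delta$ can be estimated numerically with a short program. The sketch below swaps in a standard variant of Eq. (9): instead of the bifurcation values $a_n$, it uses the "superstable" parameter values at which $x = 0.5$ itself belongs to the cycle of period $2^k$, whose spacings shrink at the same geometric rate. That substitution, the Newton iteration, the function names, and the starting guess of 3.5 for the period-4 case (where the text reports a stable period-4 cycle) are assumptions of this sketch, not the procedure described in the article.

```python
import math


def orbit_and_sensitivity(a, n_iter):
    """Iterate Eq. (6) n_iter times from x = 0.5; return x and dx/da."""
    x, dx_da = 0.5, 0.0
    for _ in range(n_iter):
        # Differentiate x_{n+1} = a x_n (1 - x_n) with respect to a (uses the old x).
        dx_da = x * (1.0 - x) + a * (1.0 - 2.0 * x) * dx_da
        x = a * x * (1.0 - x)
    return x, dx_da


def superstable_parameter(k, guess, n_newton=30):
    """Newton solve for a such that x = 0.5 returns to itself after 2**k steps."""
    a = guess
    for _ in range(n_newton):
        x, dx_da = orbit_and_sensitivity(a, 2 ** k)
        step = (x - 0.5) / dx_da
        a -= step
        if abs(step) < 1e-15:
            break
    return a


# Periods 1 and 2 are known in closed form: a = 2 and a = 1 + sqrt(5).
params = [2.0, 1.0 + math.sqrt(5.0)]
params.append(superstable_parameter(2, 3.5))
deltas = []
for k in range(3, 9):
    ratio = (params[-2] - params[-3]) / (params[-1] - params[-2])
    deltas.append(ratio)
    # Extrapolate the next parameter value using the latest ratio as a guess.
    guess = params[-1] + (params[-1] - params[-2]) / ratio
    params.append(superstable_parameter(k, guess))
deltas.append((params[-2] - params[-3]) / (params[-1] - params[-2]))

# Sum the remaining geometric series to estimate the accumulation point.
a_infinity = params[-1] + (params[-1] - params[-2]) / (deltas[-1] - 1.0)
print(deltas)
print(a_infinity)
```

The successive ratios should settle toward the value quoted in Eq. (9), and the extrapolated accumulation point should agree with $a_\infty = 3.57\ldots$ given earlier; double-precision arithmetic limits how many doublings can usefully be resolved this way.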

2. Universality

The fact that each successive period doubling is controlled by the behavior of the iterates of the map, $F^{(N)}(x)$, near $x = 0.5$, lies at the root of a very significant property, called universality, of nonlinear dynamical systems that exhibit sequences of period-doubling bifurcations. In the process of developing a quantitative description of period doubling in the logistic map, Feigenbaum discovered that the precise functional form of the map did not seem to matter. For example, he found that a map on the unit interval described by $F(x) = a \sin \pi x$ gave a similar sequence of period-doubling bifurcations. Although the values of the control parameter $a$ at which each period-doubling bifurcation occurs are different, he found that both the ratios of the changes in the control parameter and the separations of the stable daughter cycles decreased at the same geometrical rates $\delta$ and $\alpha$ as the logistic map.

FIGURE 7 The appearance of the period-4 cycle as $a$ is increased from 3.2 to 3.5 is illustrated by these graphs of the return maps for the fourth iterate of the logistic map, $F^{(4)}$. For $a = 3.2$, there are only four period-4 fixed points that correspond to the two unstable period-1 points and the two stable period-2 points. However, when $a$ is increased to 3.5, the same process that led to the birth of the period-2 fixed points is repeated in miniature. Moreover, the similarity of the portion of the map near $x_n = 0.5$ to the original map indicates how this same bifurcation process occurs again as $a$ is increased.

This observation ultimately led to a rigorous proof, using the mathematical methods of the renormalization group borrowed from the theory of critical phenomena, that these geometrical ratios were universal numbers that would apply to the quantitative description of any period-doubling sequence generated by nonlinear maps with a single quadratic extremum. The logistic map and the sine map are just two examples of this large universality class. The great significance of this result is that the global details of the dynamical system do not matter. A thorough understanding of the simple logistic map is sufficient for describing both qualitatively and, to a large extent, quantitatively the period-doubling route to chaos in a wide variety of nonlinear dynamical systems. In fact, we will see that this universality class extends beyond one-dimensional maps to nonlinear dynamical systems described by more realistic physical models corresponding to two-dimensional maps, systems of ordinary differential equations, and even partial differential equations.

3. Chaos

Of course, these stable periodic cycles, described by Feigenbaum’s universal theory, are not chaotic. Even the cycle with an infinite period at the period-doubling accumulation point $a_\infty$ has a zero average Lyapunov exponent. However, for many values of $a$ above $a_\infty$, the time sequences generated by the logistic map have a positive average Lyapunov exponent and therefore satisfy the definition of chaos. Figure 8 plots the average Lyapunov exponent computed numerically using Eq. (3) for the same range of values of $a$, as displayed in the bifurcation diagram in Fig. 6.

FIGURE 8 The values of the average Lyapunov exponent, computed numerically using Eq. (3), are displayed for the same values of $a$ shown in Fig. 6. Positive values of $\lambda$ correspond to chaotic dynamics, while negative values represent regular, periodic motion.
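A numerical estimate of the average Lyapunov exponent takes only a few lines. The sketch below assumes that Eq. (3), which lies outside this section, is the usual orbit average of $\log |dF/dx|$; the function name, the initial value 0.3, and the sample sizes are illustrative choices rather than the settings used for Fig. 8.

```python
import math


def lyapunov_logistic(a, x0=0.3, n_transient=1000, n_average=100000):
    """Average of log|dF/dx| along an orbit of the logistic map, Eq. (6)."""
    x = x0
    for _ in range(n_transient):          # let the orbit settle onto the attractor
        x = a * x * (1.0 - x)
    total = 0.0
    for _ in range(n_average):
        slope = abs(a * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # guard against log(0) at x = 0.5
        x = a * x * (1.0 - x)
    return total / n_average


for a in (2.9, 3.2, 3.9, 4.0):
    print(a, lyapunov_logistic(a))
```

For a stable fixed point or cycle the estimate reduces to the logarithm of the cycle's slope per step and is therefore negative; for chaotic parameter values it fluctuates with the sample size, so only its sign and rough magnitude should be read from a finite run.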

Wherever the trajectory appears to wander chaotically over continuous intervals, the average Lyapunov exponent is positive. However, embedded in the chaos for $a_\infty < a < 4$ we see stable periodic attractors in the bifurcation diagram with sharply negative average Lyapunov exponents. The most prominent periodic cycle is the period-3 cycle, which appears near $a_3 = 3.83$. In fact, between $a_\infty$ and $a_3$, there is a range of values of $a$ for which cycles of every odd and even period are stable. However, the intervals for the longer cycles are too small to discern in Fig. 6. The period-5 cycle near $a = 3.74$ and the period-6 cycles near $a = 3.63$ and $a = 3.85$ are the most readily apparent in both the bifurcation diagram and the graph of the average Lyapunov exponent.

Although these stable periodic cycles are mathematically dense over this range of control parameters, the values of $a$ where the dynamics are truly chaotic can be mathematically proven to be a significant set with nonzero measure. The proof of the positivity of the average Lyapunov exponent is much more difficult for the logistic map than for Eq. (2) since $\log |(dF/dx)(x_n)|$ can take on both negative and positive values depending on whether $x_n$ is close to 1/2 or to 0 or 1. However, one simple case for which the logistic map is easily proven to be chaotic is for $a = 4$. In this case, the time sequence appears to wander over the entire unit interval in the bifurcation diagram, and the numerically computed average Lyapunov exponent is positive. If we simply change variables from $x_n$ to $y_n = (2/\pi) \sin^{-1} \sqrt{x_n}$, then the logistic map for $a = 4$ transforms to the tent map:

$$y_{n+1} = \begin{cases} 2y_n, & 0 \le y_n \le 0.5 \\ 2(1 - y_n), & 0.5 \le y_n \le 1 \end{cases} \tag{11}$$

which is closely related to the shift map, Eq. (2). In particular, since $|dF/dy| = 2$, the average Lyapunov exponent is found to be $\lambda = \log 2 \approx 0.693$, which is the same as the numerical value for the logistic map.
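The change of variables can be verified numerically. The following sketch (with illustrative function names) compares two routes: taking one logistic step at $a = 4$ and then changing variables, versus changing variables first and then taking one tent-map step. If the transformation is correct the two results agree up to rounding error.

```python
import math


def logistic4(x):
    """Logistic map, Eq. (6), at a = 4."""
    return 4.0 * x * (1.0 - x)


def tent(y):
    """Tent map, Eq. (11)."""
    return 2.0 * y if y <= 0.5 else 2.0 * (1.0 - y)


def to_tent_variable(x):
    """Change of variables y = (2/pi) * arcsin(sqrt(x))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


def max_conjugacy_error(n_points=999):
    """Largest mismatch between the two routes over a grid of x values in (0, 1)."""
    worst = 0.0
    for i in range(1, n_points + 1):
        x = i / (n_points + 1.0)
        via_logistic = to_tent_variable(logistic4(x))  # step, then transform
        via_tent = tent(to_tent_variable(x))           # transform, then step
        worst = max(worst, abs(via_logistic - via_tent))
    return worst


print(max_conjugacy_error())
```

The identity behind the check is $4\sin^2\theta\,(1 - \sin^2\theta) = \sin^2 2\theta$ with $\theta = \pi y/2$, so one logistic step doubles the angle, which is exactly what the tent map does to $y$ after folding at $y = 0.5$.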

B. The Henon Map

Most nonlinear dynamical systems that arise in physical applications involve more than one dependent variable. For example, the dynamical description of any mechanical oscillator requires at least two variables: a position and a momentum variable. One of the simplest dissipative dynamical systems that describes the coupled evolution of two variables was introduced by Michel Henon in 1976. It is defined by taking a one-dimensional quadratic map for $x_{n+1}$ similar to the logistic map and coupling it to a second linear map for $y_{n+1}$:

$$x_{n+1} = 1 - a x_n^2 + y_n \tag{12a}$$

$$y_{n+1} = b x_n \tag{12b}$$

This pair of difference equations takes points in the $x$–$y$ plane with coordinates $(x_n, y_n)$ and maps them to new points $(x_{n+1}, y_{n+1})$. The behavior of the sequence of points generated by successive iterates of this two-dimensional map from an initial point $(x_0, y_0)$ is determined by the values of two control parameters $a$ and $b$. If $a$ and $b$ are both 0, then Eq. (12) maps every point in the plane to the attracting fixed point at (1, 0) after at most two iterations.

If $b = 0$ but $a$ is nonzero, then the Henon map reduces to a one-dimensional quadratic map that can be transformed into the logistic map by shifting the variable $x$. Therefore, for $b = 0$ and even for $b$ small, the behavior of the time sequence of points generated by the Henon map closely resembles the behavior of the logistic map. For small values of $a$, the long-time iterates are attracted to stable periodic orbits that exhibit a sequence of period-doubling bifurcations to chaos as the nonlinear control parameter $a$ is increased. For small but nonzero $b$, the main difference from the one-dimensional maps is that these regular orbits of period $N$ are described by $N$ points in the $x$–$y$ plane rather than points on the unit interval. (In addition, the basin of attraction for these periodic cycles consists of a finite region in the plane rather than the unit interval alone. Just as in the one-dimensional logistic map, if a point lies outside this basin of attraction, then the successive iterates diverge to $\infty$.)

The Henon map remains a dissipative map with time sequences that converge to a finite attractor as long as $|b|$ is less than 1. This is easy to understand if we think of the action of the map as a coordinate transformation in the plane from the variables $(x_n, y_n)$ to $(x_{n+1}, y_{n+1})$. From elementary calculus, we know that the Jacobian of this coordinate transformation, which is given by the determinant of the matrix,

$$M = \begin{pmatrix} -2ax & 1 \\ b & 0 \end{pmatrix} \tag{13}$$

describes how the area covered by any set of points increases or decreases under the coordinate transformation. In this case, $J = \operatorname{Det} M = -b$. When $|J| > 1$, areas grow larger and sets of initial conditions disperse throughout the $x$–$y$ plane under the iteration of the map. But, when $|J| < 1$, the areas decrease under each iteration, so areas must contract to sets of points that correspond to the attractors.
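The statements about Eq. (12) and the Jacobian in Eq. (13) can be checked directly. In the sketch below the function names are illustrative; the parameter values $a = 1.4$, $b = 0.3$ and the starting point (0, 0) are the ones quoted in the caption of Fig. 9, and the test point (0.7, -0.2) is an arbitrary choice.

```python
def henon(x, y, a, b):
    """One step of the Henon map, Eqs. (12a) and (12b)."""
    return 1.0 - a * x * x + y, b * x


def jacobian_determinant(x, a, b):
    """Determinant of the matrix M in Eq. (13); it equals -b for every x."""
    m11, m12 = -2.0 * a * x, 1.0
    m21, m22 = b, 0.0
    return m11 * m22 - m12 * m21


def orbit(x0, y0, a, b, n_steps):
    """Return the list of points visited during n_steps iterations."""
    points = []
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = henon(x, y, a, b)
        points.append((x, y))
    return points


# With a = b = 0 any starting point reaches (1, 0) within two iterations.
print(orbit(0.7, -0.2, 0.0, 0.0, 2))
# The area contraction factor per iteration is |J| = |b|.
print(jacobian_determinant(0.3, 1.4, 0.3))
# Iterates for a = 1.4, b = 0.3 remain in a bounded region of the plane.
points = orbit(0.0, 0.0, 1.4, 0.3, 10000)
print(min(p[0] for p in points), max(p[0] for p in points))
```

Plotting `points` as a scatter plot would give a picture of the kind shown in Fig. 9; the plotting step is omitted here to keep the sketch free of external libraries.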

1. Strange Attractors

FIGURE 9 The first 10,000 iterates of the two-dimensional Henon map trace the outlines of a strange attractor in the $x_n$–$y_n$ plane. The parameters were chosen to be $a = 1.4$ and $b = 0.3$ and the initial point was (0, 0).

However, these attracting sets need not be a simple fixed point or a finite number of points forming a periodic cycle. In fact, when the parameters $a$ and $b$ have values that give rise to chaotic dynamics, the attractors can be exceedingly complex, composed of an uncountable set of points that form intricate patterns in the plane. These strange attractors are best characterized as fractal objects with noninteger dimensions.

Figure 9 displays 10,000 iterates of the Henon map for $a = 1.4$ and $b = 0.3$. (In this case, the initial point was chosen to be $(x_0, y_0) = (0, 0)$, but any initial point in the basin of attraction would give similar results.) Because $b < 1$, the successive iterates rapidly converge to an intricate geometrical structure that looks like a line that is folded on itself an infinite number of times. The magnifications of the sections of the attractor shown in Fig. 10 display the detailed self-similar structure. The cross sections of the folded line resemble the Cantor set described in Section II.B. Therefore, since the attractor is more than a line but less than an area (since there are always gaps between the strands at every magnification), we might expect it to be characterized by a fractal dimension that lies between 1 and 2. In fact, an application of the box-counting definition of fractal dimension given by Eq. (5) yields a fractal dimension of $d = 1.26\ldots$.
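A rough box-counting estimate for the Henon attractor can be sketched as follows. This is only a finite-size approximation of Eq. (5): the number of points, the range of box sizes, and the least-squares fit are all assumptions of this sketch, and the result depends on them, so it should not be expected to reproduce the quoted dimension to more than about one decimal place.

```python
import math


def henon_points(n_points, a=1.4, b=0.3, n_transient=1000):
    """Iterate Eq. (12) from (0, 0) and return points after a short transient."""
    x, y = 0.0, 0.0
    points = []
    for i in range(n_transient + n_points):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= n_transient:
            points.append((x, y))
    return points


def count_boxes(points, box_size):
    """Number of grid boxes of side box_size that contain at least one point."""
    return len({(math.floor(x / box_size), math.floor(y / box_size))
                for x, y in points})


def box_dimension(points, box_sizes):
    """Least-squares slope of log N(L) against log(1/L)."""
    xs = [math.log(1.0 / size) for size in box_sizes]
    ys = [math.log(count_boxes(points, size)) for size in box_sizes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((u - mean_x) * (v - mean_y) for u, v in zip(xs, ys))
    denominator = sum((u - mean_x) ** 2 for u in xs)
    return numerator / denominator


attractor = henon_points(100000)
estimate = box_dimension(attractor, [2.0 ** -k for k in range(3, 8)])
print(estimate)
```

Smaller boxes require many more points before every occupied box is actually visited, which is why careful dimension estimates use far longer orbits than this illustration.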

Moreover, if you were to watch a computer screen while these points are plotted, you would see that they wander about the screen in a very irregular manner, slowly revealing this complex structure. Numerical measurements of the sensitivity to initial conditions and of the average Lyapunov exponents (which are more difficult to compute than for one-dimensional maps) indicate that the dynamics on this strange attractor are indeed chaotic.

C. The Lorenz Attractor

The study of chaos is not restricted to nonlinear difference equations such as the logistic map and the Henon map. Systems of coupled nonlinear differential equations also exhibit the rich variety of behavior that we have already seen in the simplest nonlinear dynamical systems described by maps. A classic example is provided by the Lorenz model described by three coupled nonlinear differential equations:

$$\frac{dx}{dt} = -\sigma x + \sigma y \tag{14a}$$

$$\frac{dy}{dt} = -xz + rx - y \tag{14b}$$

$$\frac{dz}{dt} = xy - bz \tag{14c}$$

These equations were introduced in 1963 by Edward Lorenz, a meteorologist, as a severe truncation of the Navier–Stokes equations describing Rayleigh–Benard convection in a fluid (like Earth’s atmosphere), which is heated from below in a gravitational field. The dependent variable $x$ represents a single Fourier mode of the stream function for the velocity flow; the variables $y$ and $z$ represent two Fourier components of the temperature field; and the constants $r$, $\sigma$, and $b$ are the Rayleigh number, the Prandtl number, and a geometrical factor, respectively.

The Lorenz equations provide our first example of a model dynamical system that is reasonably close to a real physical system. (The same equations provide an even better description of optical instabilities in lasers, and similar equations have been introduced to describe chemical oscillators.) Numerical studies of the solutions of these equations, starting with Lorenz’s own pioneering work using primitive digital computers in 1963, have revealed the same complexity as the Henon map. In fact, Henon originally introduced Eq. (12) as a simple model that exhibits the essential properties of the Lorenz equations.
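A numerical study of Eq. (14) needs an integration scheme, which the text does not specify. The sketch below assumes a classical fourth-order Runge–Kutta step with a fixed step size of 0.01; the parameter values $r = 28$, $\sigma = 10$, $b = 8/3$ and the initial point (1, 1, 1) are the ones quoted for Fig. 11, while the function names and the perturbation size $10^{-9}$ are illustrative. It follows two nearby initial conditions to show the sensitivity to initial conditions.

```python
import math


def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations, Eqs. (14a) to (14c)."""
    x, y, z = state
    return (-sigma * x + sigma * y, -x * z + r * x - y, x * y - b * z)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, 0.5 * dt))
    k3 = lorenz(shifted(state, k2, 0.5 * dt))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (c1 + 2.0 * c2 + 2.0 * c3 + c4)
                 for s, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4))


def integrate(state, dt, n_steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        trajectory.append(state)
    return trajectory


def distance(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


traj_a = integrate((1.0, 1.0, 1.0), 0.01, 5000)
traj_b = integrate((1.0 + 1e-9, 1.0, 1.0), 0.01, 5000)
separations = [distance(p, q) for p, q in zip(traj_a, traj_b)]
print(separations[0], max(separations))
```

The separation grows from the initial perturbation until it is comparable to the size of the attractor itself, after which it can grow no further; beyond that point the two computed orbits are effectively unrelated.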

FIGURE 10 To see how strange the attractor displayed in Fig. 9 really is, we show two successive magnifications of a strand of the attractor contained in the box in Fig. 9. Here, (a) shows that the single strand in Fig. 9 breaks up into several distinct bands, which show even finer structure in (b) when the map is iterated 10,000,000 time steps.

A linear analysis of the evolution of small volumes in the three-dimensional phase space spanned by the dependent variables $x$, $y$, and $z$ shows that this dissipative dynamical system rapidly contracts sets of initial conditions to an attractor. When the Rayleigh number $r$ is less than 1, the point $(x, y, z) = (0, 0, 0)$ is an attracting fixed point. But, when $r > 1$, a wide variety of different attractors that depend in a complicated way on all three parameters $r$, $\sigma$, and $b$ are possible. Like the Henon map, the long-time behavior of the solutions of these differential equations can be attracted to fixed points; to periodic cycles, which are described by limit cycles consisting of closed curves in the three-dimensional phase space; and to strange attractors, which are described by a fractal structure in phase space. In the first two cases, the dynamics is regular and predictable, but the dynamics on strange attractors is chaotic and unpredictable (as unpredictable as the weather).
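The volume contraction mentioned above can be written out explicitly. The short derivation below is a standard calculation added for illustration; the symbol $V(t)$ for the volume occupied by a small cloud of initial conditions is introduced here and does not appear in the original text.

```latex
% Divergence of the Lorenz flow, Eqs. (14a)-(14c):
\nabla \cdot \mathbf{f}
  = \frac{\partial}{\partial x}\left(-\sigma x + \sigma y\right)
  + \frac{\partial}{\partial y}\left(-xz + rx - y\right)
  + \frac{\partial}{\partial z}\left(xy - bz\right)
  = -(\sigma + 1 + b).

% A small phase-space volume V(t) carried by the flow therefore obeys
\frac{dV}{dt} = -(\sigma + 1 + b)\,V
  \quad\Longrightarrow\quad
  V(t) = V(0)\,e^{-(\sigma + 1 + b)\,t}.

% For sigma = 10 and b = 8/3 the contraction rate is 41/3, about 13.7,
% so volumes shrink by a factor of roughly 10^{-6} per unit time.
```

Because the contraction rate is a constant, independent of position and of $r$, every set of initial conditions collapses onto a set of zero volume, which is consistent with the attractors described above.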

The possibility of strange attractors for three or more autonomous differential equations, such as the Lorenz model, was established mathematically by Ruelle and Takens. Figure 11 shows a three-dimensional graph of the famous strange attractor for the Lorenz equations corresponding to the values of the parameters $r = 28$, $\sigma = 10$, and $b = 8/3$, which provides a graphic illustration of the consequences of their theorem. The initial conditions were chosen to be (1, 1, 1). The trajectory appears to loop around on two surfaces that resemble the wings of a butterfly, jumping from one wing to the other in an irregular manner. However, a close inspection of these surfaces reveals that under successive magnification they exhibit the same kind of intricate, self-similar structure as the striations of the Henon attractor. This detailed structure is best revealed by a so-called Poincare section of continuous dynamics, shown in Fig. 12, which was generated by plotting a point in the $x$–$z$ plane every time the orbit passes from negative $y$ to positive $y$.
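The construction of the Poincare section can be sketched in code. As before, the integrator (a fixed-step fourth-order Runge–Kutta scheme), the step size, the length of the discarded transient, and the linear interpolation used to locate each crossing of $y = 0$ are assumptions of this sketch, not details taken from the text.

```python
def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations, Eqs. (14a) to (14c)."""
    x, y, z = state
    return (-sigma * x + sigma * y, -x * z + r * x - y, x * y - b * z)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, 0.5 * dt))
    k3 = lorenz(shifted(state, k2, 0.5 * dt))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (c1 + 2.0 * c2 + 2.0 * c3 + c4)
                 for s, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4))


def poincare_section(state, dt, n_steps, n_transient):
    """Collect (x, z) each time y passes from negative to positive values."""
    section = []
    for i in range(n_steps):
        new_state = rk4_step(state, dt)
        y_old, y_new = state[1], new_state[1]
        if i >= n_transient and y_old < 0.0 <= y_new:
            # Linear interpolation between the two steps to approximate y = 0.
            w = -y_old / (y_new - y_old)
            section.append((state[0] + w * (new_state[0] - state[0]),
                            state[2] + w * (new_state[2] - state[2])))
        state = new_state
    return section


section_points = poincare_section((1.0, 1.0, 1.0), 0.01, 10000, 1000)
print(len(section_points))
print(section_points[:3])
```

A scatter plot of `section_points` in the $x$–$z$ plane would correspond to the kind of picture shown in Fig. 12; far more crossings than this short run produces are needed before any fine structure becomes visible.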

Since we could imagine that this Poincare section was generated by iterating a pair of nonlinear difference equations, such as the Henon map, it is easy to understand, by analogy with the analysis described in Section III.B, how the time evolution can be chaotic with extreme sensitivity to initial conditions and how this cross section of the Lorenz attractor, as well as the Lorenz attractor itself, can have a noninteger, fractal dimension.

D. Applications

Perhaps the most significant conclusion that can be drawn from these three examples of dissipative dynamical systems that exhibit chaotic behavior is that the essential features of the behavior of the more realistic Lorenz model are well described by the properties of the much simpler Henon map and to a large extent by the logistic map. These observations provide strong motivation to hope that simple nonlinear systems will also capture the essential properties of even more complex dynamical systems that describe a wide variety of physical phenomena with irregular behavior. In fact, the great advances of nonlinear dynamics and the study of chaos in the last [20] years can be attributed to the fulfillment of this hope in both numerical studies of more complicated mathematical models and experimental studies of a variety of complicated natural phenomena.

FIGURE 11 The solution of the Lorenz equations for the parameters $r = 28$, $\sigma = 10$, and $b = 8/3$ rapidly converges to a strange attractor. This figure shows a projection of this three-dimensional attractor onto the $x$–$z$ plane, which is traced out by approximately 100 turns of the orbit.

The successes of this program of reducing the essential features of complicated dynamical processes to simple nonlinear maps or to a few coupled, nonlinear differential equations have been well documented in a number of conference proceedings and textbooks. For example, the universality of the period-doubling route to chaos and the appearance of strange attractors have been demonstrated in numerical studies of a wide variety of nonlinear maps and of systems of nonlinear ordinary and partial differential equations. Even more importantly, Feigenbaum’s universal constants $\delta$ and $\alpha$, which characterize the quantitative scaling properties of the period-doubling sequence, have been measured to good accuracy in a number of careful experiments on Rayleigh–Benard convection and nonlinear electrical circuits and in oscillating chemical reactions (such as the Belousov–Zhabotinsky reaction), laser oscillators, acoustical oscillators, and even the response of heart cells to electrical stimuli. In addition, a large number of papers have been devoted to the measurement of the fractal dimensions of strange attractors that may govern the irregular, chaotic behavior of chemical reactions, turbulent flows, climatic changes, and brainwave patterns.

Perhaps the most important lesson of nonlinear dynamics has been the realization that complex behavior need not have complex causes and that many aspects of irregular, unpredictable phenomena may be understood in terms of simple nonlinear models. However, the study of chaos also teaches us that despite an underlying simplicity and order we will never be able to describe the precise behavior of chaotic systems analytically nor will we succeed in making accurate long-term predictions no matter how much computer power is available. At the very best, we may hope to discern some of this underlying order in an effort to develop reliable statistical methods for making predictions for average properties of chaotic systems.

E. Hyperchaos

The notion of Lyapunov exponent can be extended to systems of differential equations or higher dimensional maps. In general, a system has a set of Lyapunov exponents, each characterizing the average stretching or shrinking of phase space in a particular direction. The logistic map discussed above has only one Lyapunov exponent because it has only one dependent variable that can be displaced. In the Lorenz model, the system has only one positive Lyapunov exponent, but has three altogether. Consider an initial point in phase space that is on the strange attractor and another point displaced infinitesimally from it. The trajectories followed from these two points may remain the same distance apart (on average), diverge exponentially, or converge exponentially. The first case corresponds to a Lyapunov exponent of zero and is realized when the displacement lies along the trajectory of the initial point. The two points then follow the same trajectory, but displaced in time. The second case corresponds to a positive Lyapunov exponent, the third to a negative one. In the Lorenz system, the fact that the attractor has a planar structure locally indicates that trajectories converge in the direction transverse to the plane, hence one of the Lyapunov exponents is negative.

FIGURE 12 A Poincare section of the Lorenz attractor is constructed by plotting a cross section of the butterfly wings. This graph is generated by plotting the point in the x–z plane each time the orbit displayed in Fig. 11 passes through y = 0. This view of the strange attractor is analogous to that displayed in Fig. 9 for the Henon map. This figure appears to consist of only a single strand, but this is because of the large contraction rate of the Lorenz model. Successive magnifications would reveal a fine-scale structure similar to that shown in Fig. 10 for the Henon map.

An arbitrary, infinitesimal perturbation will almost certainly have some projection on each of the directions corresponding to the different Lyapunov exponents. Since the growth of the perturbation in the direction associated with the largest Lyapunov exponent is exponentially faster than that in any other direction, the observed trajectory divergence will occur in that direction. In numerical models, one can measure the n largest Lyapunov exponents by integrating the linearized equations for the deviations from a given trajectory for n different initial conditions. One must repeatedly rescale the deviations to avoid both exponential growth that causes overflow errors and problems associated with the convergence of all the deviations to the direction associated with the largest exponent.
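The procedure just described can be sketched for a two-dimensional map. The example below is only an illustration: it assumes the Henon map in its common textbook form with parameters a = 1.4 and b = 0.3 (these values and the function names are assumptions, not taken from this article), evolves two deviation vectors with the linearized map, and after every step rescales them and removes from the second its component along the first, so that the two do not collapse onto the most unstable direction.

```python
import math

def henon(x, y, a=1.4, b=0.3):
    """One step of the Henon map in a common textbook form (assumed here)."""
    return 1.0 - a * x * x + y, b * x

def henon_jacobian(x, y, a=1.4, b=0.3):
    """Linearized map at (x, y): rows are the gradients of the new x and new y."""
    return [[-2.0 * a * x, 1.0], [b, 0.0]]

def lyapunov_spectrum(n_steps=20000, n_transient=1000):
    """Estimate both Lyapunov exponents by repeated rescaling of deviations."""
    x, y = 0.0, 0.0
    for _ in range(n_transient):          # let the orbit settle onto the attractor
        x, y = henon(x, y)
    u = [1.0, 0.0]                        # first deviation vector
    v = [0.0, 1.0]                        # second deviation vector
    sum1 = sum2 = 0.0
    for _ in range(n_steps):
        J = henon_jacobian(x, y)
        # Evolve both deviations with the linearized equations.
        u = [J[0][0] * u[0] + J[0][1] * u[1], J[1][0] * u[0] + J[1][1] * u[1]]
        v = [J[0][0] * v[0] + J[0][1] * v[1], J[1][0] * v[0] + J[1][1] * v[1]]
        # Rescale u, remove its direction from v, then rescale v (Gram-Schmidt).
        nu = math.hypot(u[0], u[1])
        u = [u[0] / nu, u[1] / nu]
        dot = v[0] * u[0] + v[1] * u[1]
        v = [v[0] - dot * u[0], v[1] - dot * u[1]]
        nv = math.hypot(v[0], v[1])
        v = [v[0] / nv, v[1] / nv]
        # The logarithms of the rescaling factors accumulate the growth rates.
        sum1 += math.log(nu)
        sum2 += math.log(nv)
        x, y = henon(x, y)
    return sum1 / n_steps, sum2 / n_steps

if __name__ == "__main__":
    print(lyapunov_spectrum())
```

Because this map contracts areas by the constant factor b at every step, the two estimates should sum to log b, which gives a simple consistency check on the rescaling.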

It is possible to have an attractor with two or more Lyapunov exponents greater than zero. This is sometimes referred to as “hyperchaos” and is common in systems with many degrees of freedom. The problem of distinguishing between hyperchaos and stochastic fluctuations in interpreting experimental data has received substantial attention. We are typically presented with an experimental trace of the time variation of a single variable and wish to determine whether the system that generated it was essentially deterministic or stochastic. The distinction here is quantitative rather than qualitative. If the observed fluctuations involve so many degrees of freedom that it appears hopeless to model them with a simple set of deterministic equations, we label it stochastic and introduce noise terms into the equations.

“Time-series analysis” algorithms have been developed to identify underlying deterministic dynamics in apparently random systems. The central idea behind these algorithms is the construction of a representation of the strange attractor (if it exists) via delay coordinates. Given a time series for a single variable x(t), the n-dimensional vector X(t) = (x(t), x(t − τ), x(t − 2τ), . . . , x(t − (n − 1)τ)) is formed, where τ is a fixed delay time comparable to the scale on which x(t) fluctuates. For sufficiently large n, the topological structure of the attractor for X(t) will generically be identical to that of the dynamical system that generated the data. This allows measures of the geometric structure of the trajectory in the space of delay coordinates, called the embedding space, to provide an upper bound on the dimension of the true attractor. As a practical matter, hyperchaos with more than about 10 positive Lyapunov exponents is extremely difficult to identify unambiguously.
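The delay-coordinate construction can be sketched in a few lines. The helper name `delay_embed` is hypothetical, and the sketch assumes a uniformly sampled series with the delay τ expressed as a whole number of samples; the sine-wave input and the choice of delay are only for illustration.

```python
import math

def delay_embed(series, n, tau):
    """Return the delay vectors X(t) = (x(t), x(t - tau), ..., x(t - (n-1) tau)).

    `series` is a list of samples, `n` the embedding dimension, and `tau`
    the delay in samples. Only times t with a complete history are kept.
    """
    start = (n - 1) * tau
    return [tuple(series[t - j * tau] for j in range(n))
            for t in range(start, len(series))]

# Usage: embed a sampled sine wave in two dimensions.
x = [math.sin(0.1 * i) for i in range(200)]
vectors = delay_embed(x, n=2, tau=16)   # delay of about a quarter period (assumed choice)
```

For a periodic signal such as this one the embedded points fall on a closed loop; for data from a chaotic system the same construction is intended to recover the geometry of the strange attractor.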

F. Spatiotemporal Chaos

The term spatiotemporal chaos has been used to refer to any system in which some variable exhibits chaotic motion in time and the spatial structure of the system varies with time as well. Real systems are always composed of spatially extended materials for which a fully detailed mathematical model would require either the use of partial differential equations or an enormous number of ordinary differential equations. In many cases, the dynamics of the vast majority of degrees of freedom represented in these equations need not be solved explicitly. All but a few dependent variables exhibit trivial behavior, decaying exponentially quickly to a steady state, or else oscillate at amplitudes negligible for the problem at hand. The remaining variables are described by a few ordinary differential equations (or maps) of the type discussed in the previous section.

In some cases, the relevant variables are easily identified. For example, it is not difficult to guess that the motion of a pendulum can be described by solving coupled equations for the position and velocity of the bob. One generally does not have to worry about the elastic deformations of the bar. In other cases, the relevant variables are amplitudes of modes of oscillation that can have a nontrivial spatial structure. A system of ordinary differential equations for the amplitudes of a few modes may correspond to a complicated pattern of activity in real space. For example, a violin string can vibrate in different harmonic modes, each of which corresponds to a particular shape of the string that oscillates sinusoidally in time. A model of large-amplitude vibrations might be cast in the form of coupled, nonlinear equations for the amplitudes of a few of the lowest frequency modes. If those equations yielded chaos, the spatial shape of the string would fluctuate in complicated, unpredictable ways. This complex motion in space and time is sometimes referred to as spatiotemporal chaos, though it is a rather simple version since the dynamics simplifies greatly when the correct modes are identified.

In general, models of smaller systems require fewer variables in the following sense. What determines the number of modes necessary for an accurate description is the smallest scale spatial variation that has an appreciable probability of occurring at a noticeable amplitude. Since large-amplitude variations over very short length scales generally require large amounts of energy, there will be an effective small-scale cutoff determined by the strength with which the system is driven. Systems whose size is comparable to the cutoff scale will require the analysis of only a few modes; in systems much larger than this scale many modes may be involved and the dynamics can be considerably more complex. The term “spatiotemporal chaos” is sometimes reserved for this regime.

Interest in the general subject of turbulence and its statistical description has led to a number of studies of deterministic systems that exhibit spatiotemporal chaos with a level of complexity proportional to the volume of the system. By analogy with thermodynamic properties that are proportional to the volume, such systems are said to exhibit “extensive chaos.” A well-studied example is the irregular pattern of activity known as Benard convection, where a fluid confined to a thin, horizontal layer is heated from below. As the temperature difference between the bottom and top surface of the fluid is increased, the fluid begins to move, arranging itself in a pattern of roughly cylindrical rolls in which warm fluid rises on one side and falls on the other. At the onset of convection, the rolls form straight stripes (apart from boundary effects). As the temperature difference is increased further, the rolls may form more complicated patterns of spirals and defects that continually move around, never settling into a periodic pattern or steady state. The question of whether the “spiral defect chaos” state is an example of extensive chaos is not easy to answer directly, but numerical simulations of models exhibiting similar behavior can be analyzed in detail.

To establish the fact that a numerical model exhibits extensive chaos, one must define an appropriate quantity that characterizes the complexity of the chaotic attractor. A quantity that has proven useful is the Lyapunov dimension, $D_\lambda$. Let $\lambda_1$ be the largest Lyapunov exponent, $\lambda_2$ the second largest, etc. Note that in most extended systems the exponents with higher indices become increasingly strongly negative. Let $N$ be the largest integer for which $\sum_{i=1}^{N} \lambda_i > 0$. We define the Lyapunov dimension as

$$D_\lambda = N + \frac{1}{|\lambda_{N+1}|} \sum_{i=1}^{N} \lambda_i .$$
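The definition translates directly into a short function. This is a minimal sketch: the function name is hypothetical, the handling of the two edge cases (no positive partial sum, or no exponent left to divide by) is an assumption not covered by the definition, and the sample exponents are illustrative values of roughly the size often quoted for the Lorenz model rather than results from this article.

```python
def lyapunov_dimension(exponents):
    """Lyapunov dimension D = N + (sum of the N largest exponents) / |lambda_{N+1}|."""
    lam = sorted(exponents, reverse=True)   # lambda_1 >= lambda_2 >= ...
    partial = 0.0
    n = 0
    # N is the largest number of leading exponents whose sum is still positive.
    for value in lam:
        if partial + value > 0.0:
            partial += value
            n += 1
        else:
            break
    if n == 0:
        return 0.0          # assumed convention: no expanding direction at all
    if n == len(lam):
        return float(n)     # assumed convention: no lambda_{N+1} is available
    return n + partial / abs(lam[n])

# Illustrative exponents (one positive, one zero, one strongly negative).
d = lyapunov_dimension([0.906, 0.0, -14.572])
print(d)
```

With one positive, one zero, and one strongly negative exponent, N = 2 and the result lies slightly above 2, consistent with a locally planar attractor with fine transverse structure.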

Numerical studies of systems of partial differential equations such as the complex Ginzburg-Landau equation in two dimensions have demonstrated the existence of attractors for which $D_\lambda$ does indeed grow proportionally to the system volume; that is, extensive chaos does exist in simple, spatially extended systems, and the spiral defect chaos state is a real example of this phenomenon.

G. Control and Synchronization of Chaos

Over the past 10 years, mathematicians, physicists, and engineers have become increasingly interested in the possibilities of using the unique properties of chaotic systems for novel applications. A key feature of strange attractors that spurred much of this effort was that they have embedded within them an infinite set of perfectly periodic trajectories. These trajectories, called unstable periodic orbits (UPOs), lie on the attractor but are not normally observed because they are unstable. In 1990, Ott, Grebogi, and Yorke pointed out that UPOs could form the basis of a switching system. Using standard techniques of control theory for feedback stabilization, we can arrange for an intrinsically chaotic system to follow a selected UPO. By turning off that feedback and turning on a different one, we can stabilize a different UPO. The beauty of the scheme is that we are guaranteed, due to the nature of the strange attractor, that the system will come very close to the desired UPO in a relatively short time. Thus, our feedback system need only be capable of applying tiny perturbations to the system. The chaotic dynamics does the hard work of switching from the vicinity of one orbit to the vicinity of the other.
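The idea of stabilizing an unstable periodic orbit with tiny perturbations can be illustrated on the logistic map. The sketch below is written in the spirit of the scheme described above and is not the authors' algorithm: the parameter value r0 = 3.9, the window half-width `eps`, and all names are assumptions chosen for illustration. The parameter is left alone until the chaotic orbit wanders close to the unstable period-1 point, and is then adjusted by the small amount that cancels the deviation to linear order.

```python
def logistic(x, r):
    """Logistic map with control parameter r."""
    return r * x * (1.0 - x)

def controlled_orbit(r0=3.9, x0=0.3, n_steps=5000, eps=0.005):
    """Iterate the map, applying a small parameter kick only near the target point."""
    x_star = 1.0 - 1.0 / r0            # unstable period-1 point of the unperturbed map
    f_x = 2.0 - r0                     # derivative of the map with respect to x there
    f_r = x_star * (1.0 - x_star)      # derivative of the map with respect to r there
    x = x0
    orbit = []
    kicks = []
    for _ in range(n_steps):
        dx = x - x_star
        # Outside the window no control is applied and the chaos does the searching.
        # Inside it, dr is chosen so that f_x * dx + f_r * dr = 0.
        dr = -f_x * dx / f_r if abs(dx) < eps else 0.0
        x = logistic(x, r0 + dr)
        orbit.append(x)
        kicks.append(dr)
    return x_star, orbit, kicks

if __name__ == "__main__":
    x_star, orbit, kicks = controlled_orbit()
    print("target:", x_star, "final:", orbit[-1])
    print("largest parameter kick:", max(abs(k) for k in kicks))
```

The largest kick is bounded by the window size, which is the sense in which only tiny perturbations are needed; a different orbit could be targeted by switching to a different window and feedback rule.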

The notion that chaos can be suppressed using small feedback perturbations has generated a great deal of interest even independent of the possibility of switching between UPOs. At the time of this writing, applications of “controlling chaos” (or simply suppressing it) are being actively pursued in systems as diverse as semiconductor lasers, mechanical systems, fluid flows, and the electrodynamics of cardiac tissue.

A development closely related to controlling chaos has been the use of simple coupling between two nearly identical chaotic systems to synchronize their chaotic behaviors. Given two identical chaotic systems that are uncoupled, their behaviors deviate wildly from each other because of the exponential divergence of nearby initial conditions. It is possible, however, to couple the systems together in a simple way such that the original strange attractor is not altered, but the two systems follow the same trajectory on that attractor. The coupling must be based on the differences between the values of corresponding variables in the two systems. When the systems are synchronized, those differences vanish and the two systems follow the chaotic attractor. If the two systems begin to diverge, however, a feedback is generated via the coupling. An appropriately chosen coupling scheme can maintain the synchronized motion. Synchronization is currently being pursued as a novel means for efficient transmission of information through an electronic or optical channel.
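A minimal sketch of difference-based coupling, using two identical logistic maps as stand-ins for the chaotic systems. The map, the parameter r = 3.9, the coupling strength c = 0.4, and the function names are all assumptions made for illustration. The coupling term is proportional to the difference between the two systems, so it vanishes when they are synchronized and leaves the synchronized motion identical to that of a single uncoupled map.

```python
def logistic(x, r=3.9):
    """A single chaotic unit (logistic map, assumed for illustration)."""
    return r * x * (1.0 - x)

def coupled_pair(x, y, c, n_steps):
    """Iterate two identical maps with symmetric difference coupling of strength c."""
    gaps = []
    for _ in range(n_steps):
        fx, fy = logistic(x), logistic(y)
        # Each system is nudged toward the other; the nudge is zero when fx == fy.
        x, y = fx + c * (fy - fx), fy + c * (fx - fy)
        gaps.append(abs(x - y))
    return gaps

# Uncoupled: a difference of 1e-7 in the initial conditions grows to order one.
free = coupled_pair(0.3, 0.3000001, c=0.0, n_steps=200)
# Coupled: even widely separated initial conditions are pulled together.
locked = coupled_pair(0.3, 0.7, c=0.4, n_steps=200)
print(max(free[-50:]), locked[-1])
```

With this choice of c the difference between the two systems is multiplied by at most 0.2 × 3.9 = 0.78 per step, so synchronization is guaranteed here; weaker coupling would require a comparison with the rate of divergence of the uncoupled map.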

IV. HAMILTONIAN SYSTEMS

Although most physics textbooks on classical mechanics are largely devoted to the description of Hamiltonian systems in which dissipative, frictional forces can be neglected, such systems are rare in nature. The most important examples arise in celestial mechanics, which describes the motions of planets and stars; accelerator design, which deals with tenuous beams of high-energy charged particles moving in guiding magnetic fields; and the physics of magnetically confined plasmas, which is primarily concerned with the dynamics of trapped electrons and ions in high-temperature fusion devices. Although few in number, these examples are very important.

In this section, we will examine three simple examples of classical Hamiltonian systems that exhibit chaotic behavior. The first example is the well-known baker’s transformation, which clearly illustrates the fundamental concepts of chaotic behavior in Hamiltonian systems. Although it has no direct applications to physical problems, the baker’s transformation, like the logistic map, serves as a paradigm for all chaotic Hamiltonian systems. The second example is the standard map, which has direct applications in the description of the behavior of a wide variety of periodically perturbed nonlinear oscillators ranging from particle motion in accelerators and plasma fusion devices to the irregular rotation of Hyperion, one of the moons of Saturn. Finally, we will consider the Henon–Heiles model, which corresponds to an autonomous Hamiltonian system with two degrees of freedom, describing, for example, the motion of a particle in a nonaxisymmetric, two-dimensional potential well or the interaction of three nonlinear oscillators (the three-body problem).

A. The Baker’s Transformation

The description of a Hamiltonian system, like a frictionless mechanical oscillator, requires at least two dependent variables that usually correspond to a generalized position variable and a generalized momentum variable. These variables define a phase space for the mechanical system, and the solutions of the equations of motion describe the motion of a point in the phase space. Starting from the initial conditions specified by an initial point in the two-dimensional plane, the time evolution generated by the equations of motion traces out a trajectory or orbit.

The distinctive feature of Hamiltonian systems is that the areas or volumes of small sets of initial conditions are preserved under the time evolution, in contrast to the dissipative systems, such as the Henon map or the Lorenz model, where phase-space volumes are contracted. Therefore, Hamiltonian systems are not characterized by attractors, either regular or strange, but the dynamics can nevertheless exhibit the same rich variety of behavior with regular periodic and quasi-periodic cycles and chaos.

The simplest Hamiltonian systems correspond to area-preserving maps on the x–y plane. One well-studied example is the so-called baker’s transformation, defined by a pair of difference equations:

$$x_{n+1} = 2x_n \pmod{1} \tag{15a}$$

$$y_{n+1} = \begin{cases} 0.5\,y_n, & 0 \le x_n \le 0.5 \\ 0.5\,(y_n + 1), & 0.5 \le x_n \le 1 \end{cases} \tag{15b}$$

The action of this map is easy to describe by using the analogy of how a baker kneads dough (hence, the origin of the name of the map). If we take a set of points $(x_n, y_n)$ covering the unit square ($0 \le x_n \le 1$ and $0 \le y_n \le 1$), Eq. (15a) requires that each value of $x_n$ be doubled so that the square (or dough) is stretched out in the x direction to twice its original length. Then, Eq. (15b) reduces the values of $y_n$ by a factor of two and simultaneously cuts the resulting rectangular set of points (or dough) in half at x = 1 and places one piece on top of the other, which returns the dough to its original shape, as shown in Fig. 13. Then, this dynamical process (or kneading) is repeated over and over again.

Since area is preserved under each iteration, this dynamical system is Hamiltonian. This can be easily seen mathematically if we think of the successive iterates of the baker’s transformation as changes of coordinates from $x_n, y_n$ to $x_{n+1}, y_{n+1}$. As in the case of the Henon map, we can analyze the effects of this transformation by evaluating the Jacobian of the coordinate transformation, which is the determinant of the matrix:

$$M = \begin{pmatrix} 2 & 0 \\ 0 & \tfrac{1}{2} \end{pmatrix} \tag{16}$$

Since J = Det M = 1, we know from elementary integral calculus that volumes are preserved by this change of variables.

FIGURE 13 The baker’s transformation takes all of the points in the unit square (the dough), compresses them vertically by a factor of 1/2, and stretches them out horizontally by a factor of 2. Then, this rectangular set of points is cut at x = 1, the two resulting rectangles are stacked one on top of the other to return the shape to the unit square, and the transformation is repeated over again. In the process, a “raisin,” indicated schematically by the black dot, wanders chaotically around the unit square.

1. Chaotic Mixing

Starting from a single initial condition $(x_0, y_0)$, the time evolution will be described by a sequence of points in the plane. (To return to the baking analogy we could imagine that $(x_0, y_0)$ specifies the initial coordinate of a raisin in the dough.) For almost all initial conditions, the trajectories generated by this simple, deterministic map will be chaotic. Because the evolution of the x coordinate is completely determined by the one-dimensional, chaotic shift map, Eq. (2), the trajectory will move from the right half to the left half of the unit square in a sequence that is indistinguishable from the sequence of heads and tails generated by flipping a coin. Moreover, since the location of the orbit in the upper half or lower half of the unit square is determined by the same random sequence, the successive iterates of the initial point (the raisin) will wander around the unit square in a chaotic fashion.
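The wandering of the raisin can be followed with a short sketch of Eq. (15). Two choices here are assumptions made for illustration: the boundary point x = 0.5, which Eq. (15b) assigns to both branches, is placed in the right half, and exact rational arithmetic (Python's `fractions` module) is used because repeated doubling in ordinary floating point discards one binary digit of the initial condition per step. The function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def baker(x, y):
    """One iteration of the baker's transformation on the unit square."""
    if x < HALF:
        return 2 * x, y / 2            # left half: stretch, squash, stay at the bottom
    return 2 * x - 1, (y + 1) / 2      # right half: stretch, squash, stack on top

def itinerary(x, y, n_steps):
    """Record whether the orbit is in the left (L) or right (R) half before each step."""
    symbols = []
    for _ in range(n_steps):
        symbols.append("L" if x < HALF else "R")
        x, y = baker(x, y)
    return "".join(symbols), (x, y)

# x0 = 5/16 has the binary expansion 0.0101, which the itinerary reproduces.
path, final_point = itinerary(Fraction(5, 16), Fraction(1, 3), 4)
print(path, final_point)
```

The left/right sequence is simply the binary expansion of $x_0$, which is why a typical initial condition produces a sequence as irregular as coin flips, and each visit to the right half sends the point to the upper half of the square on the next step.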

In this simple model, it is easy to see that the mechanism responsible for the chaotic dynamics is the process of stretching and folding of the phase space. In fact, this same stretching and folding lies at the root of all chaotic behavior in both dissipative and Hamiltonian systems. The stretching is responsible for the exponential divergence of nearby trajectories, which is the cause of the extreme sensitivity to initial conditions that characterizes chaotic dynamics. The folding ensures that trajectories return to the initial region of phase space so that the unstable system does not simply explode.

Since the stretching only occurs in the x direction for the baker’s transformation, we can easily compute the value of the exponential divergence of nearby trajectories, which is simply the logarithm of the largest eigenvalue of the matrix M. Therefore, the baker’s transformation has a positive Kolmogorov–Sinai entropy, λ = log 2, so that the dynamics satisfy our definition of chaos.
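As a brief worked check of this value, using only the matrix M of Eq. (16): a small horizontal separation $\delta_0$ doubles at every iteration, and the two eigenvalues of M give one expanding and one contracting rate that cancel, as area preservation requires.

$$\delta_n = 2^n \delta_0 = \delta_0\, e^{n \log 2}, \qquad \lambda_1 = \log 2, \qquad \lambda_2 = \log \tfrac{1}{2} = -\log 2, \qquad \lambda_1 + \lambda_2 = \log(\operatorname{Det} M) = 0 .$$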

B. The Standard Map

Our second example of a simple Hamiltonian system that exhibits chaotic behavior is the standard map described by the pair of nonlinear difference equations:

$$x_{n+1} = x_n + y_{n+1} \pmod{2\pi} \tag{17a}$$

and

$$y_{n+1} = y_n + k \sin x_n \pmod{2\pi} \tag{17b}$$

Starting from an initial point $(x_0, y_0)$ on the “2π square,” Eq. (17b) determines the new value of $y_1$ and Eq. (17a) gives the new value of $x_1$. The behavior of the trajectory generated by successive iterates is determined by the control parameter k, which measures the strength of the nonlinearity.
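The order of the two updates matters, as a minimal sketch of Eq. (17) makes explicit. The function names are hypothetical; the code simply applies Eq. (17b) first and then feeds the new y into Eq. (17a).

```python
import math

TWO_PI = 2.0 * math.pi

def standard_map(x, y, k):
    """One iteration of the standard map: update y first, then x with the new y."""
    y_new = (y + k * math.sin(x)) % TWO_PI
    x_new = (x + y_new) % TWO_PI
    return x_new, y_new

def orbit(x0, y0, k, n_steps):
    """Return the list of points visited, starting from (x0, y0)."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = standard_map(x, y, k)
        points.append((x, y))
    return points

# With k = 0 the angular velocity y never changes and x advances uniformly.
print(orbit(1.0, 0.5, 0.0, 3))
```

Plotting the points returned by `orbit` for many initial conditions and several values of k is the usual way to produce phase-space portraits of this map.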

The standard map provides a remarkably good model of a wide variety of physical phenomena that are properly described by systems of nonlinear differential equations (hence, the name standard map). In particular, it serves as a paradigm for the response of all nonlinear oscillators to periodic perturbations. For example, it provides an approximate description of a particle interacting with a broad spectrum of traveling waves, an electron moving in the imperfect magnetic fields of magnetic bottles used to confine fusion plasmas, and the motion of an electron in a highly excited hydrogen atom in the presence of intense electromagnetic fields. In each case, $x_n$ and $y_n$ correspond to the values of the generalized position and momentum variables, respectively, at discrete times n. Since this model exhibits most of the generic features of Hamiltonian systems that exhibit a transition from regular behavior to chaos, we will examine this example in detail.

FIGURE 14 Successive iterates of the two-dimensional standard map for a number of different initial conditions are displayed for four values of the control parameter k. For small values of k, the orbits trace out smooth, regular curves in the two-dimensional phase space that become more distorted by resonance effects as k increases. For k = 1, the interaction of these resonances has generated visible regions of chaos in which individual trajectories wander over large regions of the phase space. However, for k = 1 and k = 2, the chaotic regions coexist with regular islets of stability associated with strong nonlinear resonances. The boundaries of the chaotic regions are defined by residual KAM curves. For still larger values of k (not shown), these regular islands shrink until they are no longer visible in these figures.

The standard map actually provides the exact mathematical description for one physical system called the “kicked rotor.” Consider a rigid rotor in the absence of any gravitational or frictional forces that is subject to periodic kicks every unit of time n = 1, 2, 3, . . . . Then, $x_n$ and $y_n$ describe the angle and the angular velocity (angular momentum) just before the nth kick. The rotor can be kicked either forward or backward depending on the sign of $\sin x_n$, and the strengths of the kicks are determined by the value of k. As the nonlinear parameter k is increased, the trajectories generated by this map exhibit a dramatic transition from regular, ordered behavior to chaos. This remarkable transformation is illustrated in Fig. 14, where a number of trajectories are plotted for four different values of k.

When k = 0, the value of y remains constant at $y_0$ and the value of $x_n$ increases each iteration by the amount $y_0$ (Mod 2π, which means that if $x_n$ does not lie on the interval [0, 2π] we add or subtract 2π until it does). In this case, the motion is regular and the trajectories trace out straight lines in the phase space. The rotor rotates continuously at the constant angular velocity $y_0$. If $y_0$ is a rational multiple of 2π, then Eq. (17a), like Eq. (1), exhibits a periodic cycle. However, if $y_0$ is an irrational multiple of 2π, then the dynamics is quasi-periodic for almost all initial values of $x_0$ and the points describing the orbit gradually trace out a solid horizontal line in the phase space.

1. Resonance Islands

As k is increased to k = 0.5, most of the orbits remain regular and lie on smooth curves in the phase space; however, elliptical islands begin to appear around the point (π, 0) = (π, 2π). (Remember, the intrinsic periodicity of the map implies that the top of the 2π square is connected to the bottom and the right-hand side to the left.) These islands correspond to a resonance between the weak periodic kicks and the rotational frequency of the rotor. Consequently, when the kicks and the rotations are synchronous, the rotor is accelerated. However, because it is a nonlinear oscillator (as opposed to a linear, harmonic oscillator), the rotation frequency changes as the velocity increases, so that the motion goes out of resonance. The kicks then retard the motion, and the velocity decreases until the rotation velocity returns to resonance, after which this pattern is repeated. The orbits associated with these quasi-periodic cycles of increasing and decreasing angular velocity trace out elliptical paths in the phase space, as shown in Fig. 14.

The center of the island, (π, 0), corresponds to a period-1 point of the standard map. (This is easy to check by simply plugging (π, 0) into the right-hand side of Eq. (17).) Figure 14 also shows indications of a smaller island centered at (π, π). Again, it is easy to verify that this point is a member of a period-2 cycle (the other element is the point (2π, π) = (0, π)). In fact, there are resonance islands described by chains of ellipses throughout the phase space associated with periodic orbits of all orders. However, most of these islands are much too small to show up in the graphs displayed in Fig. 14.
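Why an island surrounds (π, 0) rather than the other period-1 point at (0, 0) can be seen from the linearization of Eq. (17). The sketch below computes the Jacobian matrix of one iteration with respect to $(x_n, y_n)$; its determinant is 1 everywhere (area preservation), and, by the standard criterion for area-preserving maps, which is used here as an assumption not stated in the text, a period-1 point is stable (elliptic) when the magnitude of the trace is less than 2. The function names are hypothetical.

```python
import math

def jacobian(x, k):
    """Jacobian of one iteration of Eq. (17); it depends only on the angle x."""
    c = k * math.cos(x)
    # Row 1: derivatives of x_{n+1} = x_n + y_n + k sin x_n.
    # Row 2: derivatives of y_{n+1} = y_n + k sin x_n.
    return [[1.0 + c, 1.0],
            [c, 1.0]]

def det_and_trace(m):
    """Determinant and trace of a 2 x 2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return det, m[0][0] + m[1][1]

k = 0.5
for label, x in (("(pi, 0)", math.pi), ("(0, 0)", 0.0)):
    det, trace = det_and_trace(jacobian(x, k))
    kind = "elliptic (island center)" if abs(trace) < 2.0 else "hyperbolic (unstable)"
    print(label, "det =", det, "trace =", trace, kind)
```

The trace at (π, 0) is 2 − k, so under this criterion the period-1 island persists for 0 < k < 4, while the trace at (0, 0) is 2 + k and that point is unstable for any positive k.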

As the strength of the kicks increases, these islands increase in size and become more prominent. For k = 1, several different resonance island chains are clearly visible corresponding to the period-1, period-2, period-3, and period-7 cycles. However, as the resonance regions increase in size and they begin to overlap, individual trajectories between the resonance regions become confused, and the motion becomes chaotic. These chaotic orbits no longer lie on smooth curves in the phase space but begin to wander about larger and larger areas of the phase space as k is increased. For k = 2, a single orbit wanders over more than half of the phase space, and for k = 5 (not shown), a single orbit would appear to uniformly cover the entire 2π square (although a microscopic examination would always reveal small regular regions near some periodic points).

2. The Kolmogorov–Arnold–Moser Theorem

Since the periodic orbits and the associated resonance regions are mathematically dense in the phase space (though a set of measure zero), there are always small regions of chaos for any nonzero value of k. However, for small values of k, an important mathematical theorem, called the Kolmogorov–Arnold–Moser (KAM) theorem, guarantees that if the perturbation applied to the integrable Hamiltonian system is sufficiently small, then most of the trajectories will lie on smooth curves, such as those displayed in Fig. 14 for k = 0 and k = 0.5. However, Fig. 14 clearly shows that some of these so-called KAM curves (also called KAM surfaces or invariant tori in higher dimensions) persist for relatively large values of k ∼ 1.

The significance of these KAM surfaces is that they form barriers in the phase space. Although these barriers can be circumvented by the slow process of Arnold diffusion in four or more dimensions, they are strictly confining in the two-dimensional phase space of the standard map. This means that orbits starting on one side cannot cross to the other side, and the chaotic regions will be confined by these curves. However, as resonance regions grow with increasing k and begin to overlap, these KAM curves are destroyed and the chaos spreads, as shown in Fig. 14.

The critical $k_c$ for this onset of global chaos can be estimated analytically using Chirikov’s resonance overlap criterion, which yields an approximate value of $k_c \approx 2$. However, a more precise value of $k_c$ can be determined by a detailed examination of the breakup of the last confining KAM curve. Since the resonance regions associated with low-order periodic orbits are the largest, the last KAM curve to survive is the one furthest from a periodic orbit. This corresponds to an orbit with the most irrational value of average rotation frequency, which is the golden mean $= (\sqrt{5} - 1)/2$. Careful numerical studies of the standard map show that the golden mean KAM curve, which is the last smooth curve to divide the top of the phase space from the bottom, is destroyed for $k \approx 1$ (more precisely for $k_c = 0.971635406$). For $k > k_c$, MacKay et al. (1987) have shown that this last confining curve breaks up into a so-called cantorus, which is a curve filled with gaps resembling a Cantor set. These gaps allow chaotic trajectories to leak through so that single orbits can wander throughout large regions of the phase space, as shown in Fig. 14 for k = 2.

3. Chaotic Diffusion

Because of the intrinsic nonlinearity of Eq. (17b), the restriction of the map to the 2π square was only a graphical convenience that exploited the natural periodicities of the map. However, in reality, both the angle variable and the angular velocity of a real physical system described by the standard map can take on all real values. In particular, when the golden mean KAM torus is destroyed, the angular velocity associated with the chaotic orbits can wander to arbitrarily large positive and negative values.

Because the chaotic evolution of both the angle and angular velocity appears to execute a random walk in the phase space, it is natural to attempt to describe the dynamics using a statistical description despite the fact that the underlying dynamical equations are fully deterministic. In fact, when $k \gg k_c$, careful numerical studies show that the evolution of an ensemble of initial conditions can be well described by a diffusion equation. Consequently, this simple deterministic dynamical system provides an interesting model for studying the problem of the microscopic foundations of statistical mechanics, which is concerned with the question of how the reversible and deterministic equations of classical mechanics can give rise to the irreversible and statistical equations of classical statistical mechanics and thermodynamics.
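A rough numerical illustration of this diffusive behavior follows. It is only a sketch under stated assumptions: the value k = 5, the ensemble size, the random seed, and the function name are arbitrary choices, and the angular velocity is deliberately not reduced Mod 2π so that it is free to wander. If the ensemble spreads diffusively, the mean-square change of y should grow roughly in proportion to the number of kicks.

```python
import math
import random

def mean_square_spread(k, n_steps, n_orbits=1000, seed=0):
    """Mean-square change of the unwrapped angular velocity over an ensemble."""
    rng = random.Random(seed)
    two_pi = 2.0 * math.pi
    states = [(rng.uniform(0.0, two_pi), rng.uniform(0.0, two_pi))
              for _ in range(n_orbits)]
    start = [y for _, y in states]
    history = []
    for _ in range(n_steps):
        new_states = []
        for x, y in states:
            y = y + k * math.sin(x)      # no Mod: y may take any real value
            x = (x + y) % two_pi         # the angle itself remains periodic
            new_states.append((x, y))
        states = new_states
        msd = sum((y - y0) ** 2 for (_, y), y0 in zip(states, start)) / n_orbits
        history.append(msd)
    return history

if __name__ == "__main__":
    h = mean_square_spread(5.0, 200)
    # For diffusive spreading, doubling the number of kicks roughly doubles the spread.
    print(h[99], h[199], h[199] / h[99])
```

Repeating the experiment for small k, where confining KAM curves survive, would instead show the spread saturating rather than growing without bound.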

C. The Henon–Heiles Model

Our third example of a Hamiltonian system that exhibits a transition from regular behavior to chaos is described by a system of four coupled, nonlinear differential equations. It was originally introduced by Michel Henon and Carl Heiles in 1964 as a model of the motion of a star in a nonaxisymmetric, two-dimensional potential corresponding to the mean gravitational field in a galaxy. The equations of motion for the two components of the position and momentum,

$$\frac{dx}{dt} = p_x \tag{18a}$$

$$\frac{dy}{dt} = p_y \tag{18b}$$

$$\frac{dp_x}{dt} = -x - 2xy \tag{18c}$$

$$\frac{dp_y}{dt} = -y + y^2 - x^2 \tag{18d}$$

are generated by the Hamiltonian

$$H(x, y, p_x, p_y) = \frac{p_x^2}{2} + \frac{p_y^2}{2} + \frac{1}{2}(x^2 + y^2) + x^2 y - \frac{1}{3} y^3 \tag{19}$$

where the mass is taken to be unity. Equation (19) corresponds to the Hamiltonian of two uncoupled harmonic oscillators $H_0 = (p_x^2/2) + (p_y^2/2) + \frac{1}{2}(x^2 + y^2)$ (consisting of the sum of the kinetic and a quadratic potential energy) plus a cubic perturbation $H_1 = x^2 y - \frac{1}{3} y^3$, which provides a nonlinear coupling for the two linear oscillators.
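As a worked check that this Hamiltonian generates the equations of motion given above, Hamilton's equations can be applied term by term:

$$\frac{dx}{dt} = \frac{\partial H}{\partial p_x} = p_x, \qquad \frac{dy}{dt} = \frac{\partial H}{\partial p_y} = p_y,$$

$$\frac{dp_x}{dt} = -\frac{\partial H}{\partial x} = -x - 2xy, \qquad \frac{dp_y}{dt} = -\frac{\partial H}{\partial y} = -y - x^2 + y^2 .$$

The linear terms come from $H_0$ and the quadratic terms from the cubic perturbation $H_1$, so the coupling between the two oscillators disappears when x and y are small.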

Since the Hamiltonian is independent of time, it is a constant of motion that corresponds to the total energy of the system $E = H(x, y, p_x, p_y)$. When E is small, both the values of the momenta $(p_x, p_y)$ and the positions (x, y) must remain small. Therefore, in the limit $E \ll 1$, the cubic perturbation can be neglected and the motion will be approximately described by the equations of motion for the unperturbed Hamiltonian, which are easily integrated analytically. Moreover, the application of the KAM theorem to this problem guarantees that as long as E is sufficiently small the motion will remain regular. However, as E is increased, the solutions of the equations of motion, like the orbits generated by the standard map, will become increasingly complicated. First, nonlinear resonances will appear from the coupling of the motions in the x and the y directions. As the energy increases, the effect of the nonlinear coupling grows, the sizes of the resonances grow, and, when they begin to overlap, the orbits begin to exhibit chaotic motion.

1. Poincaré Sections

Although Eq. (18) can be easily integrated numerically for any value of E, it is difficult to display the transition from regular behavior to chaos graphically because the resulting trajectories move in a four-dimensional phase space spanned by $x$, $y$, $p_x$, and $p_y$. Although we can use the constancy of the energy to reduce the dimension of the accessible phase space to three, the graphs of the resulting three-dimensional trajectories would be even less revealing than the three-dimensional graphs of the Lorenz attractor since there is no attractor to consolidate the dynamics. However, we can simplify the display of the trajectories by exploiting the same device used to relate the Hénon map to the Lorenz model. If we plot the value of $p_x$ versus $x$ every time the orbit passes through $y = 0$, then we can construct a Poincaré section of the trajectory that provides a very clear display of the transition from regular behavior to chaos.
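The construction just described can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code behind Fig. 15: it uses a fixed-step Runge–Kutta integrator, records only upward crossings of $y = 0$ (so that $p_y > 0$), and locates each crossing by linear interpolation between steps. The function names and default parameters are hypothetical.

```python
import math


def rhs(state):
    """Henon-Heiles equations of motion for (x, y, px, py)."""
    x, y, px, py = state
    return (px, py, -x - 2.0 * x * y, -y + y * y - x * x)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def nudge(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(nudge(k1, 0.5 * dt))
    k3 = rhs(nudge(k2, 0.5 * dt))
    k4 = rhs(nudge(k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def poincare_section(E, x0, px0, dt=0.01, n_steps=50000):
    """Return (x, px) points collected each time y crosses zero from below.

    The orbit starts on the section itself (y = 0) with py > 0 fixed by the energy E.
    """
    py0_squared = 2.0 * E - px0 * px0 - x0 * x0
    if py0_squared <= 0.0:
        raise ValueError("(x0, px0) lies outside the region accessible at energy E")
    state = (x0, 0.0, px0, math.sqrt(py0_squared))
    points = []
    for _ in range(n_steps):
        new = rk4_step(state, dt)
        if state[1] < 0.0 <= new[1]:  # upward crossing of y = 0
            frac = -state[1] / (new[1] - state[1])  # linear interpolation weight
            points.append((state[0] + frac * (new[0] - state[0]),
                           state[2] + frac * (new[2] - state[2])))
        state = new
    return points


if __name__ == "__main__":
    for x, px in poincare_section(1.0 / 12.0, 0.0, 0.1)[:5]:
        print(f"x = {x:+.4f}, px = {px:+.4f}")
```

Repeating the call for a grid of starting values (x0, px0) at fixed E and plotting all returned points in one frame gives a picture of the kind described in the text. On the section, energy conservation requires $x^2 + p_x^2 \le 2E$, which bounds the plotted region.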

Figure 15 displays these Poincaré sections for a number of different initial conditions corresponding to three different energies, E = 1/12, 1/8, and 1/6. For very small E, most of the trajectories lie on an ellipsoid in four-dimensional phase space, so the intersection of the orbits with the $p_x$–$x$ plane traces out simple ellipses centered at $(x, p_x) = (0, 0)$. For E = 1/12, these ellipses are distorted and island chains associated with the nonlinear resonances between the coupled motions appear; however, most orbits appear to remain on smooth, regular curves. Finally, as E is increased to 1/8 and 1/6, the Poincaré sections reveal a transition from ordered motion to chaos, similar to that observed in the standard map.

FIGURE 15 Poincaré sections for a number of different orbits generated by the Hénon–Heiles equations are plotted for three different values of the energy E. These figures were created by plotting the position of the orbit in the $x$–$p_x$ plane each time the solutions of the Hénon–Heiles equations passed through $y = 0$ with positive $p_y$. For E = 1/12, the effect of the perturbation is small and the orbits resemble the smooth but distorted curves observed in the standard map for small k, with resonance islands associated with coupling of the x and y oscillations. However, as the energy increases and the effects of the nonlinearities become more pronounced, large regions of chaotic dynamics become visible and grow until most of the accessible phase space appears to be chaotic for E = 1/6. (These figures can be compared with the less symmetrical Poincaré sections plotted in the $y$–$p_y$ plane that usually appear in the literature.)

In particular, when E = 1/6, a single orbit appears to uniformly cover most of the accessible phase space defined by the surface of constant energy in the full four-dimensional phase space. Although the dynamics of individual trajectories is very complicated in this case, the average properties of an ensemble of trajectories generated by this deterministic but chaotic dynamical system should be well described using the standard methods of statistical mechanics. For example, we may not be able to predict when a star will move chaotically into a particular region of the galaxy, but the average time that the star spends in that region can be computed by simply measuring the relative volume of the corresponding region of the phase space.

D. Applications

The earliest applications of the modern ideas of nonlinear dynamics and chaos to Hamiltonian systems were in the field of accelerator design starting in the late 1950s. In order to maintain a beam of charged particles in an accelerator or storage ring, it is important to understand the dynamics of the corresponding Hamiltonian equations of motion for very long times (in some cases, for more than $10^8$ revolutions). For example, the nonlinear resonances associated with the coupling of the radial and vertical oscillations of the beam can be described by models similar to the Hénon–Heiles equations, and the coupling to field oscillations around the accelerator can be approximated by models related to the standard map. In both cases, if the nonlinear coupling or perturbations are too large, the chaotic orbits can cause the beam to defocus and run into the wall.

Similar problems arise in the description of magnetically confined electrons and ions in plasma fusion devices. The densities of these thermonuclear plasmas are sufficiently low that the individual particle motions are effectively collisionless on the time scales of the experiments, so dissipation can be neglected. Again, the nonlinear equations describing the motion of the plasma particles can exhibit chaotic behavior that allows the particles to escape from the confining fields. For example, electrons circulating along the guiding magnetic field lines in a toroidal confinement device called a tokamak will feel a periodic perturbation because of slight variations in magnetic fields, which can be described by a model similar to the standard map. When this perturbation is sufficiently large, electron orbits can become chaotic, which leads to an anomalous loss of plasma confinement that poses a serious impediment to the successful design of a fusion reactor.

The fact that a high-temperature plasma is effectively collisionless also raises another problem in which chaos actually plays a beneficial role and which goes right to the root of a fundamental problem of the microscopic foundations of statistical mechanics. The problem is this: how do you heat a collisionless plasma? How do you make an irreversible transfer of energy from an external source, such as the injection of a high-energy particle beam or high-intensity electromagnetic radiation, to a reversible, Hamiltonian system? The answer is chaos. For example, the application of intense radio-frequency radiation induces a strong periodic perturbation on the natural oscillatory motion of the plasma particles. Then, if the perturbation is strong enough, the particle motion will become chaotic. Although the motion remains deterministic and reversible, the chaotic trajectories associated with the ensemble of particles can wander over a large region of the phase space, in particular to higher and lower velocities. Since the temperature is a measure of the range of possible velocities, this process causes the plasma temperature to increase.
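The heating mechanism can be illustrated with a toy calculation. The sketch below is not a plasma model: it assumes the standard map in the form $p_{n+1} = p_n + k \sin\theta_n$, $\theta_{n+1} = \theta_n + p_{n+1}$ as a stand-in for a periodically perturbed particle, and follows the mean kinetic energy of an ensemble that starts with zero momentum and random angles. The function name, ensemble size, and kick strengths are illustrative assumptions.

```python
import math
import random


def mean_kinetic_energy(k, n_particles=2000, n_kicks=200, seed=0):
    """Ensemble average of p**2 / 2 after each kick of the (assumed) standard map."""
    rng = random.Random(seed)
    thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_particles)]
    momenta = [0.0] * n_particles
    history = []
    for _ in range(n_kicks):
        for i in range(n_particles):
            momenta[i] += k * math.sin(thetas[i])  # kick
            thetas[i] = (thetas[i] + momenta[i]) % (2.0 * math.pi)  # free rotation
        history.append(sum(p * p for p in momenta) / (2.0 * n_particles))
    return history


if __name__ == "__main__":
    weak = mean_kinetic_energy(0.5)
    strong = mean_kinetic_energy(5.0)
    print("weak perturbation, final mean energy:  ", weak[-1])
    print("strong perturbation, final mean energy:", strong[-1])
```

For the weak perturbation the momenta stay confined and the ensemble does not heat. For the strong perturbation the mean energy keeps growing, roughly in proportion to the number of kicks, even though every individual trajectory is deterministic and reversible.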

Progress in the understanding of chaotic behavior has also caused a revival of interest in a number of problems related to celestial mechanics. In addition to Hénon and Heiles' work on stellar dynamics described previously, Jack Wisdom at MIT has recently solved several old puzzles relating to the origin of meteorites and the presence of gaps in the asteroid belt by invoking chaos. Each time an asteroid that initially lies in an orbit between Mars and Jupiter passes the massive planet Jupiter, it feels a gravitational tug. This periodic perturbation on small orbiting asteroids results in a strong resonant interaction when the two frequencies are related by low-order rational numbers. As in the standard map and the Hénon–Heiles model, if this resonant interaction is sufficiently strong, the asteroid motion can become chaotic. The ideal Kepler ellipses begin to precess and elongate until their orbits cross the orbit of Earth. Then, we see them as meteors and meteorites, and the depletion of the asteroid belt leaves gaps that correspond to the observations.
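The location of these low-order resonances follows from Kepler's third law, which relates the orbital period $T$ and semi-major axis $a$ by $T^2 \propto a^3$. The short sketch below assumes a value of 5.204 AU for Jupiter's semi-major axis and labels a resonance p:q when the asteroid completes p orbits while Jupiter completes q; the function name is hypothetical.

```python
JUPITER_SEMI_MAJOR_AXIS_AU = 5.204  # assumed value for Jupiter's orbit


def resonance_semi_major_axis(p, q, a_jupiter=JUPITER_SEMI_MAJOR_AXIS_AU):
    """Semi-major axis (AU) at which an asteroid makes p orbits per q orbits of Jupiter.

    The period ratio is T_asteroid / T_jupiter = q / p, and a scales as T**(2/3).
    """
    return a_jupiter * (q / p) ** (2.0 / 3.0)


if __name__ == "__main__":
    for p, q in [(3, 1), (5, 2), (7, 3), (2, 1)]:
        print(f"{p}:{q} resonance at a = {resonance_semi_major_axis(p, q):.2f} AU")
```

The 3:1 and 2:1 resonances fall near 2.5 and 3.3 AU, inside the main belt between Mars and Jupiter, which is where prominent gaps in the distribution of asteroids are observed.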

The study of chaotic behavior in Hamiltonian systems has also found many recent applications in physical chemistry. Many models similar to the Hénon–Heiles model have been proposed for the description of the interaction of coupled nonlinear oscillators that correspond to atoms in a molecule. The interesting questions here relate to how energy is transferred from one part of the molecule to the other. If the classical dynamics of the interacting atoms is regular, then the transfer of energy is impeded by KAM surfaces, such as those in Figs. 14 and 15. However, if the classical dynamics is fully chaotic, then the molecule may exhibit equipartition of energy as predicted by statistical theories. Even more interesting is the common case where some regions of the phase space are chaotic and some are regular. Since most realistic, classical models of molecules involve more than two degrees of freedom, the unraveling of this complex phase-space structure in six or more dimensions remains a challenging problem.

Finally, most recently there has been considerable interest in the classical Hamiltonian dynamics of electrons in highly excited atoms in the presence of strong magnetic fields and intense electromagnetic radiation. The studies of the regular and chaotic dynamics of these strongly perturbed systems have provided a new understanding of the atomic physics in a realm in which conventional methods of quantum perturbation theory fail. However, these studies of chaos in microscopic systems, like those of molecules, have also raised profound, new questions relating to whether the effects of classical chaos can survive in the quantum world. These issues will be discussed in Section V.

V. QUANTUM CHAOS

The discovery that simple nonlinear models of classical dynamical systems can exhibit behavior that is indistinguishable from a random process has naturally raised the question of whether this behavior persists in the quantum realm where the classical nonlinear equations of motion are replaced by the linear Schrödinger equation. This is currently a lively area of research. Although there is general consensus on the key problems, the solutions remain a subject of controversy. In contrast to the subject of classical chaos, there is not even agreement on the definition of quantum chaos. There is only a list of possible symptoms for this poorly characterized disease. In this section, we will briefly discuss the problem of quantum chaos and describe some of the characteristic features of quantum systems that correspond to classically chaotic Hamiltonian systems. Some of these features will be illustrated using a simple model that corresponds to the quantized description of the kicked rotor described in Section IV.B. Then, we will conclude with a description of the comparison of classical and quantum theory with real experiments on highly excited atoms in strong fields.

A. The Problem of Quantum Chaos

Guided by Bohr's correspondence principle, it might be natural to conclude that quantum mechanics should agree with the predictions of classical chaos for macroscopic systems. In addition, because chaos has played a fundamental role in improving our understanding of the microscopic foundations of classical statistical mechanics, one would hope that it would play a similar role in shoring up the foundations of quantum statistical mechanics. Unfortunately, quantum mechanics appears to be incapable of exhibiting the strong local instability that defines classical chaos as a mixing system with positive Kolmogorov–Sinai entropy.

One way of seeing this difficulty is to note that the Schrödinger equation is a linear equation for the wave function, and neither the wave function nor any observable quantities (determined by taking expectation values of self-adjoint operators) can exhibit extreme sensitivity to initial conditions. In fact, if the Hamiltonian system is bounded (like the Hénon–Heiles model), then the quantum mechanical energy spectrum is discrete and the time evolution of all quantum mechanical quantities is doomed to quasiperiodic behavior, such as that of Eq. (1).
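The origin of this quasiperiodicity can be made explicit with a short worked expansion. The symbols $\phi_n$, $c_n$, and $A_{mn}$ are introduced here only for illustration: $\phi_n$ are the energy eigenstates with discrete eigenvalues $E_n$, $c_n$ are the expansion coefficients of the initial state, and $A_{mn} = \langle \phi_m | A | \phi_n \rangle$ are the matrix elements of an observable $A$.

$$\psi(t) = \sum_n c_n\, e^{-i E_n t/\hbar}\, \phi_n$$

$$\langle A \rangle (t) = \sum_{m,n} c_m^{*}\, c_n\, A_{mn}\, e^{\,i (E_m - E_n) t/\hbar}$$

Every term oscillates at one of the discrete frequencies $(E_m - E_n)/\hbar$. When only a finite number of levels contribute appreciably, the expectation value is a finite sum of sinusoids, which is quasiperiodic and cannot show the exponential divergence associated with classical chaos.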

Although the question of the existence of quantum chaos remains a controversial topic, nearly everyone agrees that the most important questions relate to how quantum systems behave when the corresponding classical Hamiltonian systems exhibit chaotic behavior. For example, how does the wave function behave for strongly perturbed oscillators, such as those modeled by the classical standard map, and what are the characteristics of the energy levels for a system of strongly coupled oscillators, such as those described by the Hénon–Heiles model?

B. Symptoms of Quantum Chaos

Even though the Schrödinger equation is a linear equation, the essential nonintegrability of chaotic Hamiltonian systems carries over to the quantum domain. There are no known examples of chaotic classical systems for which the corresponding wave equations can be solved analytically. Consequently, theoretical searches for quantum chaos have also relied heavily on numerical solutions. These detailed numerical studies by physical chemists and physicists studying the dynamics of molecules and the excitation and ionization of atoms in strong fields have led to the identification of several characteristic features of the quantum wave functions and energy levels that reveal the manifestation of chaos in the corresponding classical systems.

One of the most studied characteristics of nonintegrable quantum systems that correspond to classically chaotic Hamiltonian systems is the appearance of irregular energy spectra. The energy levels in the hydrogen atom, described classically by regular, elliptical Kepler orbits, form an orderly sequence, $E_n = -1/(2n^2)$, where n = 1, 2, 3, . . . is the principal quantum number. However, the energy levels of chaotic systems, such as the quantum Hénon–Heiles model, do not appear to have any simple order at large energies that can be expressed in terms of well-defined quantum numbers. This correspondence makes sense since the quantum numbers that define the energy levels of integrable systems are associated with the classical constants of motion (such as angular momentum), which are destroyed by the nonintegrable perturbation. For example, Fig. 16 displays the calculated energy levels for a hydrogen atom in a magnetic field that shows the transition from the regular spectrum at low magnetic fields to an irregular spectrum (“spaghetti”) at high fields in which the magnetic forces are comparable to the Coulomb binding fields.

FIGURE 16 The quantum mechanical energy levels for a highly excited hydrogen atom in a strong magnetic field are highly irregular. This figure shows the numerically calculated energy levels as a function of the square of the magnetic field for a range of energies corresponding to quantum states with principal quantum numbers n ≈ 40–50. Because the magnetic field breaks the natural spherical and Coulomb symmetries of the hydrogen atom, the energy levels and associated quantum states exhibit a jumble of multiple avoided crossings caused by level repulsion, which is a common symptom of quantum systems that are classically chaotic. [From Delande, D. (1988). Ph.D. thesis, Université Pierre & Marie Curie, Paris.]

This irregular spacing of the quantum energy levels can be conveniently characterized in terms of the statistics of the energy level spacings. For example, Fig. 17 shows a histogram of the energy level spacings, $s = E_{i+1} - E_i$, for the hydrogen atom in a magnetic field that is strong enough to make most of the classical electron orbits chaotic. Remarkably, this distribution of energy level spacings, $P(s)$, is identical to that found for a much more complicated quantum system with irregular spectra: compound nuclei. Moreover, both distributions are well described by the predictions of random matrix theory, which simply replaces the nonintegrable (or unknown) quantum Hamiltonian with an ensemble of large matrices with random values for the matrix elements. In particular, this distribution of energy level spacings is expected to be given by the Wigner–Dyson distribution, $P(s) \sim s \exp(-s^2)$, displayed in Fig. 17. Although these random matrices cannot predict the location of specific energy levels, they do account for many of the statistical features relating to the fluctuations in the energy level spacings.

FIGURE 17 The repulsion of the quantum mechanical energy levels displayed in Fig. 16 results in a distribution of energy level spacings, $P(s)$, in which accidental degeneracies (s = 0) are extremely rare. This figure displays a histogram of the energy level spacings for 1295 levels, such as those in Fig. 16. This distribution compares very well with the Wigner–Dyson distribution (solid curve), which is predicted for the energy level spacing for random matrices. If the energy levels were uncorrelated random numbers, then they would be expected to have a Poisson distribution indicated by the dashed curve. [From Delande, D., and Gay, J. C. (1986). Phys. Rev. Lett. 57, 2006.]

Despite the apparent statistical character of the quantum energy levels for classically chaotic systems, these level spacings are not completely random. If they were completely uncorrelated, then the spacing statistics would obey a Poisson distribution, $P(s) \sim \exp(-s)$, which would predict a much higher probability of nearly degenerate energy levels. The absence of degeneracies in chaotic systems is easily understood because the interaction of all the quantum states induced by the nonintegrable perturbation leads to a repulsion of nearby levels. In addition, the energy levels exhibit an important long-range correlation called spectral rigidity, which means that fluctuations about the average level spacing are relatively small over a wide energy range. Michael Berry has traced this spectral rigidity in the spectra of simple chaotic Hamiltonians to the persistence of regular (but not necessarily stable) periodic orbits in the classical phase space. Remarkably, these sets of measure-zero classical orbits appear to have a dominant influence on the characteristics of the quantum energy levels and quantum states.
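The contrast between level repulsion and uncorrelated levels can be reproduced with the smallest possible random matrices. The sketch below is an illustration under stated assumptions rather than a calculation from the article: it draws real symmetric 2 × 2 matrices with Gaussian entries (off-diagonal variance half the diagonal variance), for which the eigenvalue spacing has a closed form, and compares the fraction of very small spacings with that of exponentially distributed (Poisson) spacings. With spacings scaled to unit mean, the Wigner form reads $P(s) = (\pi/2)\, s \exp(-\pi s^2/4)$. All function names are hypothetical.

```python
import math
import random


def random_matrix_spacings(n_samples, seed=0):
    """Eigenvalue spacings of random real symmetric 2x2 matrices, scaled to unit mean."""
    rng = random.Random(seed)
    spacings = []
    for _ in range(n_samples):
        a = rng.gauss(0.0, 1.0)             # diagonal elements, variance 1
        d = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, math.sqrt(0.5))  # off-diagonal element, variance 1/2
        # Eigenvalues of [[a, b], [b, d]] differ by sqrt((a - d)**2 + 4 b**2).
        spacings.append(math.sqrt((a - d) ** 2 + 4.0 * b * b))
    mean = sum(spacings) / len(spacings)
    return [s / mean for s in spacings]


def poisson_spacings(n_samples, seed=1):
    """Spacings between uncorrelated levels: exponential distribution with unit mean."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) for _ in range(n_samples)]


def wigner_surmise(s):
    """Wigner spacing distribution normalized to unit area and unit mean spacing."""
    return 0.5 * math.pi * s * math.exp(-0.25 * math.pi * s * s)


def fraction_below(spacings, threshold):
    """Fraction of spacings smaller than the threshold."""
    return sum(1 for s in spacings if s < threshold) / len(spacings)


if __name__ == "__main__":
    n = 20000
    print("near-degenerate fraction, random matrices:",
          fraction_below(random_matrix_spacings(n), 0.1))
    print("near-degenerate fraction, Poisson levels: ",
          fraction_below(poisson_spacings(n), 0.1))
```

Near-degeneracies are roughly ten times rarer for the random matrices than for uncorrelated levels, which is the level repulsion visible as the dip of the histogram toward s = 0 in a plot such as Fig. 17.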

Experimental studies of the energy levels of Rydberg atoms in strong magnetic fields by Karl Welge and collaborators at the University of Bielefeld appear to have confirmed many of these theoretical and numerical predictions. Unfortunately, the experiments can only resolve a limited range of energy levels, which makes the confirmation of statistical predictions difficult. However, the experimental observations of this symptom of quantum chaos are very suggestive. In addition, the experiments have provided very striking evidence for the important role of classical regular orbits embedded in the chaotic sea of trajectories in determining gross features in the fluctuations in the irregular spectrum. In particular, there appears to be a one-to-one correspondence between regular oscillations in the spectrum and the periods of the shortest periodic orbits in the classical Hamiltonian system. Although the corresponding classical dynamics of these simple systems is fully chaotic, the quantum mechanics appears to cling to the remnants of regularity.

Another, more direct symptom of quantum chaos is quantum behavior that resembles the predictions of classical chaos. In the cases of atoms or molecules in strong electromagnetic fields where classical chaos predicts ionization or dissociation, this symptom is unambiguous. (The patient dies.) However, quantum systems appear to be capable of mimicking classical chaotic behavior only for finite times determined by the density of quantum states (or the size of the quantum numbers). In the case of as few as 50 interacting particles, this break time may exceed the age of the universe; however, for small quantum systems, such as those described by the simple models of Hamiltonian chaos, this time scale, where the Bohr correspondence principle for chaotic systems breaks down, may be accessible to experimental measurements.

C. The Quantum Standard Map

One model system that has greatly enhanced our understanding of the quantum behavior of classically chaotic systems is the quantum standard map, which was first introduced by Casati et al. in 1979. The Schrödinger equation for the kicked rotor described in Section IV.B also reduces to a map that describes how the wave function (expressed in terms of the unperturbed quantum eigenstates of the rotor) spreads at each kick. Although this map is formally described by an infinite system of linear difference equations, these equations can be solved numerically to good approximation by truncating the set of equations to a large but finite number (typically, ≤1000 states).

The comparison of the results of these quantum calculations with the classical results for the evolution of the standard map over a wide range of parameters has revealed a number of striking features. For short times, the quantum evolution resembles the classical dynamics generated by evolving an ensemble of initial conditions with the same initial energy or angular momenta but different initial angles. In particular, when the classical dynamics is chaotic, the quantum mechanical average of the kinetic energy also increases linearly up to a break time where the classical dynamics continue to diffuse in angular velocity but the quantum evolution freezes and eventually exhibits quasi-periodic recurrences to the initial state. Moreover, when the classical mechanics is regular the quantum wave function is also confined by the KAM surfaces for short times but may eventually “tunnel” or leak through.
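The freezing of the quantum energy growth can be seen in a small numerical experiment. The sketch below is an illustration under stated assumptions, not the calculation of Casati et al.: it assumes a kicked rotor with Hamiltonian $p^2/2 + K\cos\theta$ applied as a periodic train of delta kicks, an effective Planck constant `hbar`, and a basis truncated to 1024 rotor states. Instead of iterating the difference equations directly, it switches between the momentum and angle representations with a fast Fourier transform. The function name and parameter values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def quantum_kicked_rotor_energy(K=5.0, hbar=1.0, n_states=1024, n_kicks=400):
    """Mean kinetic energy after each kick, starting from the zero-momentum rotor state.

    Returns (energies, final momentum-space amplitudes).
    """
    # Integer momentum quantum numbers m (p = hbar * m), in FFT ordering.
    m = np.fft.fftfreq(n_states, d=1.0 / n_states)
    theta = 2.0 * np.pi * np.arange(n_states) / n_states
    free_phase = np.exp(-0.5j * hbar * m ** 2)            # free rotation between kicks
    kick_phase = np.exp(-1j * (K / hbar) * np.cos(theta))  # effect of one kick
    psi_m = np.zeros(n_states, dtype=complex)
    psi_m[0] = 1.0                                         # start in the m = 0 state
    energies = []
    for _ in range(n_kicks):
        psi_theta = np.fft.ifft(psi_m)          # momentum -> angle representation
        psi_theta = psi_theta * kick_phase      # kick is diagonal in the angle
        psi_m = np.fft.fft(psi_theta) * free_phase  # back to momentum, then rotate
        energies.append(0.5 * hbar ** 2 * np.sum(m ** 2 * np.abs(psi_m) ** 2))
    return np.array(energies), psi_m


if __name__ == "__main__":
    K, n_kicks = 5.0, 400
    energies, _ = quantum_kicked_rotor_energy(K=K, n_kicks=n_kicks)
    classical_estimate = K ** 2 / 4.0 * n_kicks  # rough diffusive growth, K**2/4 per kick
    print("quantum energy after first kick:", energies[0])
    print("quantum energy after last kick: ", energies[-1])
    print("classical diffusive estimate:   ", classical_estimate)
```

The first kick raises the energy by $K^2/4$, exactly as for a classical ensemble with random angles. After a modest number of kicks the quantum energy stops growing and fluctuates about a plateau, while the classical diffusive estimate continues to rise linearly.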


This relatively simple example shows that quantum mechanics is capable of stabilizing the dynamics of the classically chaotic systems and destabilizing the regular classical dynamics, depending on the system parameters. In addition, this dramatic quantum suppression of classical chaos in the quantum standard map has been related to the phenomenon of Anderson localization in solid-state physics where an electron in a disordered lattice will remain localized (will not conduct electricity) through destructive quantum interference effects. Although there is no random disorder in the quantum standard map, the classical chaos appears to play the same role.

D. Microwave Ionization of Highly Excited Hydrogen Atoms

As a consequence of these suggestive results for the quantum standard map, there has been a considerable effort to see whether the manifestations of classical chaos and its suppression by quantum interference effects could be observed experimentally in a real quantum system consisting of a hydrogen atom prepared in a highly excited state that is then exposed to intense microwave fields.

Since the experiments can be performed with atoms prepared in states with principal quantum numbers as high as n = 100, one could hope that the dynamics of this electron with a 0.5-µm Bohr radius would be well described by classical dynamics. In the presence of an intense oscillating field, this classical nonlinear oscillator is expected to exhibit a transition to global chaos such as that exhibited by the classical standard map at k ≈ 1. For example, Fig. 18 shows a Poincaré section of the classical action-angle phase space for a one-dimensional model of a hydrogen atom in an oscillating field for parameters that correspond closely to those of the experiments. For small values of the classical action I, which correspond to low quantum numbers by the Bohr–Sommerfeld quantization rule, the perturbing field is much weaker than the Coulomb binding fields and the orbits lie on smooth curves that are bounded by invariant KAM tori. However, for larger values of I, the relative size of the perturbation increases and the orbits become chaotic, filling large regions of phase space and wandering to arbitrarily large values of the action and ionizing. Since these chaotic orbits ionize, the classical theory predicts an ionization mechanism that depends strongly on the intensity of the radiation and only weakly on the frequency, which is just the opposite of the dependence of the traditional photoelectric effect.

FIGURE 18 This Poincaré section of the classical dynamics of a one-dimensional hydrogen atom in a strong oscillating electric field was generated by plotting the value of the classical action I and angle θ once every period of the perturbation with strength $I^4 F = 0.03$ and frequency $I^3 \Omega = 1.5$. In the absence of the perturbations, the action (which corresponds to principal quantum number n by the Bohr–Sommerfeld quantization rule) is a constant of motion. In this case, different initial conditions (corresponding to different quantum states of the hydrogen atom) would trace out horizontal lines in the phase space, such as those in Fig. 14, for the standard map at k = 0. Since the Coulomb binding field decreases as $1/I^4$ (or $1/n^4$), the relative strength of the perturbation increases with I. For a fixed value of the perturbing field F, the classical dynamics is regular for small values of I with a prominent nonlinear resonance below I = 1.0. A prominent pair of islands also appears near I = 1.1, but it is surrounded by a chaotic sea. Since the chaotic orbits can wander to arbitrarily high values of the action, they ultimately lead to ionization of the atom.

In fact, this chaotic ionization mechanism was first experimentally observed in the pioneering experiments of Jim Bayfield and Peter Koch in 1974, who observed the sharp onset of ionization in atoms prepared in the n ≈ 66 state, when a 10-GHz microwave field exceeded a critical threshold. Subsequently, the agreement of the predictions of classical chaos with the quantum measurements has been confirmed for a wide range of parameters corresponding to principal quantum numbers from n = 32 to 90. Figure 19 shows the comparison of the measured thresholds for the onset of ionization with the theoretical predictions for the onset of classical chaos in a one-dimensional model of the experiment.
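The numbers quoted for these experiments can be related to the scaled variables through atomic units. The sketch below is a unit-conversion aid only; the constants are standard values of the atomic units of time, electric field, and length, the function names are hypothetical, and the symbol Ω for the microwave angular frequency is an assumed notation. The scaled frequency is taken as $n^3$ times the angular frequency in atomic units, and the scaled field as $n^4$ times the field in atomic units.

```python
import math

ATOMIC_UNIT_OF_TIME_S = 2.4188843265857e-17         # seconds
ATOMIC_UNIT_OF_FIELD_V_PER_CM = 5.14220674763e9     # volts per centimeter
BOHR_RADIUS_M = 5.29177210903e-11                   # meters


def orbit_radius_m(n):
    """Characteristic radius n**2 * a0 of a hydrogen atom with principal quantum number n."""
    return n * n * BOHR_RADIUS_M


def scaled_frequency(n, frequency_hz):
    """Ratio of the microwave frequency to the Kepler orbital frequency 1/n**3 (atomic units)."""
    omega_atomic = 2.0 * math.pi * frequency_hz * ATOMIC_UNIT_OF_TIME_S
    return n ** 3 * omega_atomic


def scaled_field(n, field_v_per_cm):
    """Ratio of the applied field to the Coulomb binding field 1/n**4 (atomic units)."""
    return n ** 4 * field_v_per_cm / ATOMIC_UNIT_OF_FIELD_V_PER_CM


def field_from_scaled(n, scaled):
    """Laboratory field in V/cm corresponding to a given scaled field strength."""
    return scaled * ATOMIC_UNIT_OF_FIELD_V_PER_CM / n ** 4


if __name__ == "__main__":
    print("radius for n = 100 (micrometers):", orbit_radius_m(100) * 1e6)
    print("scaled frequency for n = 66 at 10 GHz:", scaled_frequency(66, 10e9))
    print("field in V/cm for scaled field 0.03 at n = 66:", field_from_scaled(66, 0.03))
```

For n = 100 the radius comes out close to the 0.5 µm quoted in the text, and for n ≈ 66 a 10-GHz field corresponds to a scaled frequency somewhat below 1/2, so the microwave frequency is slower than the electron's Kepler motion.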

FIGURE 19 A comparison of the threshold field strengths for the onset of microwave ionization predicted by the classical theory for the onset of chaos (solid curve) with the results of experimental measurements on real hydrogen atoms with n = 32 to 90 (open squares) and with estimates from the numerical solution of the corresponding Schrödinger equation (crosses). The threshold field strengths are conveniently plotted in terms of the scaled variable $n^4 F = I^4 F$, which is the ratio of the perturbing field F to the Coulomb binding field $1/n^4$, versus the scaled frequency $n^3 \Omega = I^3 \Omega$, which is the ratio of the microwave frequency to the Kepler orbital frequency $1/n^3$. The prominent features near rational values of the scaled frequency, $n^3 \Omega = 1$, 1/2, 1/3, and 1/4, which appear in both the classical and quantum calculations as well as the experimental measurements, are associated with the presence of nonlinear resonances in the classical phase space.

Moreover, detailed numerical studies of the solution of the Schrödinger equation for the one-dimensional model have revealed that the quantum mechanism that mimics the onset of classical chaos is the abrupt delocalization of the evolving wave packet when the perturbation exceeds a critical threshold. However, these quantum calculations also showed that in a parameter range just beyond that studied in the original experiments the threshold fields for quantum delocalization would become larger than the classical predictions for the onset of chaotic ionization. This quantum suppression of the classical chaos would be analogous to that observed in the quantum standard map. Very recently, the experiments in this new regime have been performed, and the experimental evidence supports the theoretical prediction for quantum suppression of classical chaos, although the detailed mechanisms remain a topic of controversy.

These experiments and the associated classical and quantum theories are parts of the exploration of the frontiers of a new regime of atomic and molecular physics for strongly interacting and strongly perturbed systems. As our understanding of the dynamics of the simplest quantum systems improves, these studies promise a number of important applications to problems in atomic and molecular physics, physical chemistry, solid-state physics, and nuclear physics.

SEE ALSO THE FOLLOWING ARTICLES

ACOUSTIC CHAOS • ATOMIC AND MOLECULAR COLLISIONS • COLLIDER DETECTORS FOR MULTI-TEV PARTICLES • FLUID DYNAMICS • FRACTALS • MATHEMATICAL MODELING • MECHANICS, CLASSICAL • NONLINEAR DYNAMICS • QUANTUM THEORY • TECTONOPHYSICS • VIBRATION, MECHANICAL

BIBLIOGRAPHY

Baker, G. L., and Gollub, J. P. (1990). “Chaotic Dynamics: An Introduction,” Cambridge University Press, New York.

Berry, M. V. (1983). “Semi-classical mechanics of regular and irregular motion,” In “Chaotic Behavior of Deterministic Systems” (G. Iooss, R. H. G. Helleman, and R. Stora, eds.), p. 171. North-Holland, Amsterdam.

Berry, M. V. (1985). “Semi-classical theory of spectral rigidity,” Proc. R. Soc. Lond. A 400, 229.

Bohr, T., Jensen, M. H., Paladin, G., and Vulpiani, A. (1998). “Dynamical Systems Approach to Turbulence,” Cambridge University Press, New York.

Campbell, D., ed. (1983). “Order in Chaos, Physica 7D,” Plenum, New York.

Casati, G., ed. (1985). “Chaotic Behavior in Quantum Systems,” Plenum, New York.

Casati, G., Chirikov, B. V., Shepelyansky, D. L., and Guarneri, I. (1987). “Relevance of classical chaos in quantum mechanics: the hydrogen atom in a monochromatic field,” Phys. Rep. 154, 77.

Crutchfield, J. P., Farmer, J. D., Packard, N. H., and Shaw, R. S. (1986). “Chaos,” Sci. Am. 255, 46.

Cvitanović, P., ed. (1984). “Universality in Chaos,” Adam Hilger, Bristol. (This volume contains a collection of the seminal articles by M. Feigenbaum, E. Lorenz, R. M. May, and D. Ruelle, as well as an excellent review by R. H. G. Helleman.)

Ford, J. (1983). “How random is a coin toss?” Phys. Today 36, 40.

Giannoni, M.-J., Voros, A., and Zinn-Justin, J., eds. (1990). “Chaos and Quantum Physics,” Elsevier Science, London.

Gleick, J. (1987). “Chaos: Making a New Science,” Viking, New York.

Gutzwiller, M. C. (1990). “Chaos in Classical and Quantum Mechanics,” Springer-Verlag, New York. (This book treats the correspondence between classical chaos and relevant quantum systems in detail, on a rather formal level.)

Jensen, R. V. (1987a). “Classical chaos,” Am. Sci. 75, 166.

Jensen, R. V. (1987b). “Chaos in atomic physics,” In “Atomic Physics 10” (H. Narumi and I. Shimamura, eds.), p. 319, North-Holland, Amsterdam.

Jensen, R. V. (1988). “Chaos in atomic physics,” Phys. Today 41, S-30.

Jensen, R. V., Susskind, S. M., and Sanders, M. M. (1991). “Chaotic ionization of highly excited hydrogen atoms: comparison of classical and quantum theory with experiment,” Phys. Rep. 201, 1.

Lichtenberg, A. J., and Lieberman, M. A. (1983). “Regular and Stochastic Motion,” Springer-Verlag, New York.

MacKay, R. S., and Meiss, J. D., eds. (1987). “Hamiltonian Dynamical Systems,” Adam Hilger, Bristol.

Mandelbrot, B. B. (1982). “The Fractal Geometry of Nature,” Freeman, San Francisco.

Ott, E. (1981). “Strange attractors and chaotic motions of dynamical systems,” Rev. Mod. Phys. 53, 655.

Ott, E. (1993). “Chaos in Dynamical Systems,” Cambridge University Press, New York. (This is a comprehensive, self-contained introduction to the subject of chaos, presented at a level appropriate for graduate students and researchers in the physical sciences, mathematics, and engineering.)

Physics Today (1985). “Chaotic orbits and spins in the solar system,” Phys. Today 38, 17.

Schuster, H. G. (1984). “Deterministic Chaos,” Physik-Verlag, Weinheim, F. R. G.


Encyclopedia of Physical Science and Technology EN002-95 May 19, 2001 20:57

Charged-Particle Optics
P. W. Hawkes
CNRS, Toulouse, France

I. Introduction
II. Geometric Optics
III. Wave Optics
IV. Concluding Remarks

GLOSSARY

Aberration A perfect lens would produce an image that was a scaled representation of the object; real lenses suffer from defects known as aberrations and measured by aberration coefficients.

Cardinal elements The focusing properties of optical components such as lenses are characterized by a set of quantities known as cardinal elements; the most important are the positions of the foci and of the principal planes and the focal lengths.

Conjugate Planes are said to be conjugate if a sharp image is formed in one plane of an object situated in the other. Corresponding points in such pairs of planes are also called conjugates.

Electron lens A region of space containing a rotationally symmetric electric or magnetic field created by suitably shaped electrodes or coils and magnetic materials is known as a round (electrostatic or magnetic) lens. Other types of lenses have lower symmetry; quadrupole lenses, for example, have planes of symmetry or antisymmetry.

Electron prism A region of space containing a field in which a plane but not a straight optic axis can be defined forms a prism.

Image processing Images can be improved in various ways by manipulation in a digital computer or by optical analog techniques; they may contain latent information, which can similarly be extracted, or they may be so complex that a computer is used to reduce the labor of analyzing them. Image processing is conveniently divided into acquisition and coding; enhancement; restoration; and analysis.

Optic axis In the optical as opposed to the ballistic study of particle motion in electric and magnetic fields, the behavior of particles that remain in the neighborhood of a central trajectory is studied. This central trajectory is known as the optic axis.

Paraxial Remaining in the close vicinity of the optic axis. In the paraxial approximation, all but the lowest order terms in the general equations of motion are neglected, and the distance from the optic axis and the gradient of the trajectories are assumed to be very small.

Scanning electron microscope (SEM) Instrument in which a small probe is scanned in a raster over the surface of a specimen and provokes one or several signals, which are then used to create an image on a cathode-ray tube or monitor. These signals may be X-ray intensities or secondary electron or backscattered electron currents, and there are several other possibilities.


Scanning transmission electron microscope (STEM) As in the scanning electron microscope, a small probe explores the specimen, but the specimen is thin and the signals used to generate the images are detected downstream. The resolution is comparable with that of the transmission electron microscope.

Scattering When electrons strike a solid target or pass through a thin object, they are deflected by the local field. They are said to be scattered, elastically if the change of direction is effected with negligible loss of energy, inelastically when the energy loss is appreciable.

Transmission electron microscope (TEM) Instrument closely resembling a light microscope in its general principles. A specimen area is suitably illuminated by means of condenser lenses. An objective close to the specimen provides the first stage of magnification, and intermediate and projector lenses magnify the image further. Unlike that of glass lenses, the strength of these lenses can be varied at will, and the total magnification can hence be varied from a few hundred times to hundreds of thousands of times. Either the object plane or the plane in which the diffraction pattern of the object is formed can be made conjugate to the image plane.

OF THE MANY PROBES used to explore the structure of matter, charged particles are among the most versatile. At high energies they are the only tools available to the nuclear physicist; at lower energies, electrons and ions are used for high-resolution microscopy and many related tasks in the physical and life sciences. The behavior of the associated instruments can often be accurately described in the language of optics. When the wavelength associated with the particles is unimportant, geometric optics is applicable, and the geometric optical properties of the principal optical components (round lenses, quadrupoles, and prisms) are therefore discussed in detail. Electron microscopes, however, are operated close to their theoretical limit of resolution, and to understand how the image is formed a knowledge of wave optics is essential. The theory is presented and applied to the two families of high-resolution instruments.

I. INTRODUCTION

Charged particles in motion are deflected by electric and magnetic fields, and their behavior is described either by the Lorentz equation, which is Newton’s equation of motion modified to include any relativistic effects, or by Schrödinger’s equation when spin is negligible. There are many devices in which charged particles travel in a restricted zone in the neighborhood of a curve, or axis, which is frequently a straight line, and in the vast majority of these devices, the electric or magnetic fields exhibit some very simple symmetry. It is then possible to describe the deviations of the particle motion by the fields in the familiar language of optics. If the fields are rotationally symmetric about an axis, for example, their effects are closely analogous to those of round glass lenses on light rays. Focusing can be described by cardinal elements, and the associated defects resemble the geometric and chromatic aberrations of the lenses used in light microscopes, telescopes, and other optical instruments. If the fields are not rotationally symmetric but possess planes of symmetry or antisymmetry that intersect along the optic axis, they have an analog in toric lenses, for example the glass lenses in spectacles that correct astigmatism. The other important field configuration is the analog of the glass prism; here the axis is no longer straight but a plane curve, typically a circle, and such fields separate particles of different energy or wavelength just as glass prisms redistribute white light into a spectrum.

In these remarks, we have been regarding charged particles as classical particles, obeying Newton’s laws. The mention of wavelength reminds us that their behavior is also governed by Schrödinger’s equation, and the resulting description of the propagation of particle beams is needed to discuss the resolution of electron-optical instruments, notably electron microscopes, and indeed any physical effect involving charged particles in which the wavelength is not negligible.

Charged-particle optics is still a young subject. The first experiments on electron diffraction were made in the 1920s, shortly after Louis de Broglie associated the notion of wavelength with particles, and in the same decade Hans Busch showed that the effect of a rotationally symmetric magnetic field acting on a beam of electrons traveling close to the symmetry axis could be described in optical terms. The first approximate formula for the focal length was given by Busch in 1926–1927. The fundamental equations and formulas of the subject were derived during the 1930s, with Walter Glaser and Otto Scherzer contributing many original ideas, and by the end of the decade the German Siemens Company had put the first commercial electron microscope with magnetic lenses on the market. The latter was a direct descendant of the prototypes built by Max Knoll, Ernst Ruska, and Bodo von Borries from 1932 onwards. Comparable work on the development of an electrostatic instrument was being done by the AEG Company.

Subsequently, several commercial ventures were launched, and French, British, Dutch, Japanese, Swiss, American, Czechoslovakian, and Russian electron microscopes appeared on the market as well as the German instruments. These are not the only devices that depend on charged-particle optics, however. Particle accelerators also use electric and magnetic fields to guide the particles being accelerated, but in many cases these fields are not static but dynamic; frequently the current density in the particle beam is very high. Although the traditional optical concepts need not be completely abandoned, they do not provide an adequate representation of all the properties of “heavy” beams, that is, beams in which the current density is so high that interactions between individual particles are important. The use of very high frequencies likewise requires different methods and a new vocabulary that, although known as “dynamic electron optics,” is far removed from the optics of lenses and prisms. This account is confined to the charged-particle optics of static fields or fields that vary so slowly that the static equations can be employed with negligible error (scanning devices); it is likewise restricted to beams in which the current density is so low that interactions between individual particles can be neglected, except in a few local regions (the crossover of electron guns).

New devices that exploit charged-particle optics are constantly being added to the family that began with the transmission electron microscope of Knoll and Ruska. Thus, in 1965, the Cambridge Instrument Co. launched the first commercial scanning electron microscope after many years of development under Charles Oatley in the Cambridge University Engineering Department. Here, the image is formed by generating a signal at the specimen by scanning a small electron probe over the latter in a regular pattern and using this signal to modulate the intensity of a cathode-ray tube. Shortly afterward, Albert Crewe of the Argonne National Laboratory and the University of Chicago developed the first scanning transmission electron microscope, which combines all the attractions of a scanning device with the very high resolution of a “conventional” electron microscope. More recently still, fine electron beams have been used for microlithography, for in the quest for microminiaturization of circuits, the wavelength of light sets a lower limit on the dimensions attainable. Finally, there are many devices in which the charged particles are ions of one or many species. Some of these operate on essentially the same principles as their electron counterparts; in others, such as mass spectrometers, the presence of several ion species is intrinsic. The laws that govern the motion of all charged particles are essentially the same, however, and we shall consider mainly electron optics; the equations are applicable to any charged particle, provided that the appropriate mass and charge are inserted.

II. GEOMETRIC OPTICS

A. Paraxial Equations

Although it is, strictly speaking, true that any beam of charged particles that remains in the vicinity of an arbitrary curve in space can be described in optical language, this is far too general a starting point for our present purposes. Even for light, the optics of systems in which the axis is a skew curve in space, developed for the study of the eye by Allvar Gullstrand and pursued by Constantin Carathéodory, are little known and rarely used. The same is true of the corresponding theory for particles, developed by G. A. Grinberg and Peter Sturrock. We shall instead consider the other extreme case, in which the axis is straight and any magnetic and electrostatic fields are rotationally symmetric about this axis.

1. Round Lenses

We introduce a Cartesian coordinate system in which the z axis coincides with the symmetry axis, and we provisionally denote the transverse axes X and Y. The motion of a charged particle of rest mass $m_0$ and charge Q in an electrostatic field E and a magnetic field B is then determined by the differential equation

$$
\frac{d}{dt}(\gamma m_0 \mathbf{v}) = Q(\mathbf{E} + \mathbf{v} \times \mathbf{B}), \qquad \gamma = (1 - v^2/c^2)^{-1/2},
\tag{1}
$$

which represents Newton’s second law modified for relativistic effects (Lorentz equation); v is the velocity. For electrons, we have $e = -Q \approx 1.6 \times 10^{-19}$ C and $e/m_0 \approx 176$ C/µg. Since we are concerned with static fields, the time of arrival of the particles is often of no interest, and it is then preferable to differentiate not with respect to time but with respect to the axial coordinate z. A fairly lengthy calculation yields the trajectory equations

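$$
\begin{aligned}
\frac{d^2 X}{dz^2} &= \frac{\rho^2}{g}\left(\frac{\partial g}{\partial X} - X'\frac{\partial g}{\partial z}\right) + \frac{Q\rho}{g}\left[Y'(B_z + X'B_X) - B_Y(1 + X'^2)\right] \\
\frac{d^2 Y}{dz^2} &= \frac{\rho^2}{g}\left(\frac{\partial g}{\partial Y} - Y'\frac{\partial g}{\partial z}\right) + \frac{Q\rho}{g}\left[-X'(B_z + Y'B_Y) + B_X(1 + Y'^2)\right]
\end{aligned}
\tag{2}
$$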

in which $\rho^2 = 1 + X'^2 + Y'^2$ and $g = \gamma m_0 v$. By specializing these equations to the various cases of interest, we obtain equations from which the optical properties can be derived by the “trajectory method.” It is well known that equations such as Eq. (1) are identical with the Euler–Lagrange equations of a variational principle of the form

$$
W = \int_{t_0}^{t_1} L(\mathbf{r}, \mathbf{v}, t)\, dt = \text{extremum}
\tag{3}
$$

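provided that $t_0$, $t_1$, $\mathbf{r}(t_0)$, and $\mathbf{r}(t_1)$ are held constant. The Lagrangian L has the form

$$
L = m_0 c^2\left[1 - (1 - v^2/c^2)^{1/2}\right] + Q(\mathbf{v} \cdot \mathbf{A} - \Phi)
\tag{4}
$$

in which $\Phi$ and $\mathbf{A}$ are the scalar and vector potentials corresponding to E, $\mathbf{E} = -\operatorname{grad} \Phi$, and to B, $\mathbf{B} = \operatorname{curl} \mathbf{A}$. For static systems with a straight axis, we can rewrite Eq. (3) in the form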

$$
S = \int_{z_0}^{z_1} M(X, Y, z, X', Y')\, dz,
\tag{5}
$$

where

$$
M = (1 + X'^2 + Y'^2)^{1/2} g(\mathbf{r}) + Q(X' A_X + Y' A_Y + A_z).
\tag{6}
$$

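The Euler–Lagrange equations,

$$
\frac{d}{dz}\left(\frac{\partial M}{\partial X'}\right) = \frac{\partial M}{\partial X}; \qquad \frac{d}{dz}\left(\frac{\partial M}{\partial Y'}\right) = \frac{\partial M}{\partial Y}
\tag{7}
$$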

again define trajectory equations. A very powerful method of analyzing optical properties is based on a study of the function M and its integral S; this is known as the method of characteristic functions, or eikonal method.

We now consider the special case of rotationally symmetric systems in the paraxial approximation; that is, we examine the behavior of charged particles, specifically electrons, that remain very close to the axis. For such particles, the trajectory equations collapse to a simpler form, namely,

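$$
\begin{aligned}
X'' + \frac{\gamma\phi'}{2\hat\phi}X' + \frac{\gamma\phi''}{4\hat\phi}X + \frac{\eta B}{\hat\phi^{1/2}}Y' + \frac{\eta B'}{2\hat\phi^{1/2}}Y &= 0 \\
Y'' + \frac{\gamma\phi'}{2\hat\phi}Y' + \frac{\gamma\phi''}{4\hat\phi}Y - \frac{\eta B}{\hat\phi^{1/2}}X' - \frac{\eta B'}{2\hat\phi^{1/2}}X &= 0
\end{aligned}
\tag{8}
$$

in which $\phi(z)$ denotes the distribution of electrostatic potential on the optic axis, $\phi(z) = \Phi(0, 0, z)$, and $\hat\phi(z) = \phi(z)[1 + e\phi(z)/2m_0c^2]$. Likewise, $B(z)$ denotes the magnetic field distribution on the axis. These equations are coupled, in the sense that X and Y occur in both, but this can be remedied by introducing new coordinate axes x, y, inclined to X and Y at an angle $\theta(z)$ that varies with z; x = 0, y = 0 will therefore define not planes but surfaces. By choosing $\theta(z)$ such that

$$
\frac{d\theta}{dz} = \frac{\eta B}{2\hat\phi^{1/2}}; \qquad \eta = (e/2m_0)^{1/2},
\tag{9}
$$

we find

$$
\begin{aligned}
x'' + \frac{\gamma\phi'}{2\hat\phi}x' + \frac{\gamma\phi'' + \eta^2 B^2}{4\hat\phi}x &= 0 \\
y'' + \frac{\gamma\phi'}{2\hat\phi}y' + \frac{\gamma\phi'' + \eta^2 B^2}{4\hat\phi}y &= 0.
\end{aligned}
\tag{10}
$$

FIGURE 1 Paraxial solutions demonstrating image formation.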

These differential equations are linear, homogeneous, and second order. The general solution of either is a linear combination of any two linearly independent solutions, and this fact is alone sufficient to show that the corresponding fields $B(z)$ and potentials $\phi(z)$ have an imaging action, as we now show. Consider the particular solution $h(z)$ of Eq. (10) that intersects the axis at $z = z_o$ and $z = z_i$ (Fig. 1). A pencil of rays that intersects the plane $z = z_o$ at some point $P_o(x_o, y_o)$ can be described by

$$
\begin{aligned}
x(z) &= x_o g(z) + \lambda h(z) \\
y(z) &= y_o g(z) + \mu h(z)
\end{aligned}
\tag{11}
$$

in which $g(z)$ is any solution of Eq. (10) that is linearly independent of $h(z)$ such that $g(z_o) = 1$, and $\lambda$, $\mu$ are parameters; each member of the pencil corresponds to a different pair of values of $\lambda$, $\mu$. In the plane $z = z_i$, we find

$$
x(z_i) = x_o g(z_i); \qquad y(z_i) = y_o g(z_i)
\tag{12}
$$

for all $\lambda$ and $\mu$ and hence for all rays passing through $P_o$. This is true for every point in the plane $z = z_o$, and hence the latter will be stigmatically imaged in $z = z_i$.
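The imaging argument can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes a purely magnetic lens with a bell-shaped axial field $B(z) = B_0/(1 + z^2/a^2)$ (a model chosen here for convenience, not prescribed by the text), lumps the constants into a dimensionless strength `KAPPA2` $= \eta^2 B_0^2 a^2/4\hat\phi$, and integrates $x'' + k(z)x = 0$ with a classical Runge-Kutta scheme. The names `focusing`, `rk4_step`, `trace`, and `find_image_plane` are hypothetical helpers.

```python
KAPPA2 = 3.0  # assumed dimensionless lens strength eta^2 B0^2 a^2 / (4 phi_hat)
A = 1.0       # assumed half-width a of the bell-shaped axial field


def focusing(z):
    """Coefficient k(z) in x'' + k(z) x = 0 for B(z) = B0 / (1 + z^2/a^2)."""
    return KAPPA2 / (A * A * (1.0 + (z / A) ** 2) ** 2)


def rk4_step(z, x, xp, dz):
    """Advance (x, x') by one classical Runge-Kutta step of size dz."""
    def rhs(zz, xx, pp):
        return pp, -focusing(zz) * xx

    k1x, k1p = rhs(z, x, xp)
    k2x, k2p = rhs(z + dz / 2, x + dz / 2 * k1x, xp + dz / 2 * k1p)
    k3x, k3p = rhs(z + dz / 2, x + dz / 2 * k2x, xp + dz / 2 * k2p)
    k4x, k4p = rhs(z + dz, x + dz * k3x, xp + dz * k3p)
    x_new = x + dz / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    xp_new = xp + dz / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
    return x_new, xp_new


def trace(x0, xp0, z0, z1, steps=4000):
    """Integrate a paraxial ray from z0 to z1; returns (x, x') at z1."""
    dz = (z1 - z0) / steps
    x, xp = x0, xp0
    for i in range(steps):
        x, xp = rk4_step(z0 + i * dz, x, xp, dz)
    return x, xp


def find_image_plane(z_o, dz=1e-3, z_max=10.0):
    """Locate the next zero of h(z), the ray with h(z_o) = 0 and h'(z_o) = 1."""
    z, h, hp = z_o, 0.0, 1.0
    while z < z_max:
        h_new, hp_new = rk4_step(z, h, hp, dz)
        if h > 0.0 and h_new <= 0.0:
            # Linear interpolation between the two bracketing steps.
            return z + dz * h / (h - h_new)
        z, h, hp = z + dz, h_new, hp_new
    raise ValueError("h(z) has no second zero before z_max")


z_o, x_o = -2.0, 1.0
z_i = find_image_plane(z_o)
# Rays leaving the same object point with different slopes lambda.
landings = [trace(x_o, lam, z_o, z_i)[0] for lam in (-0.3, 0.0, 0.2, 0.5)]
magnification = landings[1]  # the lambda = 0 ray is g(z), so this is g(z_i)
print(z_i, landings)
```

All four rays should land at essentially the same height in the plane $z = z_i$, and that common height divided by $x_o$ is the magnification $g(z_i)$. For this particular choice of strength and object position, the known closed-form solution of the bell-shaped model gives $z_i = a/2$ and a magnification of $-1/2$, which the numerical values should reproduce closely.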

Furthermore, both ratios $x(z_i)/x_o$ and $y(z_i)/y_o$ are equal to the constant $g(z_i)$, which means that any pattern of points in $z = z_o$ will be reproduced faithfully in the image plane, magnified by this factor $g(z_i)$, which is hence known as the (transverse) magnification and denoted by M.

The form of the paraxial equations has numerous other consequences. We have seen that the coordinate frame x–y–z rotates relative to the fixed frame X–Y–Z about the optic axis, with the result that the image will be rotated with respect to the object if magnetic fields are used. In an instrument such as an electron microscope, the image therefore rotates as the magnification is altered, since the latter is affected by altering the strength of the magnetic field and Eq. (9) shows that the angle of rotation is a function of this quantity. Even more important is the fact that the coefficient of the linear term is strictly positive in the case of magnetic fields. This implies that the curvature of any solution x(z) is opposite in sign to x(z), with the result that the field always drives the electrons toward the axis; magnetic electron lenses always have a convergent action. The same is true of the overall effect of electrostatic lenses, although the reasoning is not quite so simple.

A particular combination of any two linearly independent solutions of Eq. (10) forms the invariant known as the Wronskian. This quantity is defined by

$$
\hat\phi^{1/2}(x_1 x_2' - x_1' x_2); \qquad \hat\phi^{1/2}(y_1 y_2' - y_1' y_2)
\tag{13}
$$

Suppose that we select $x_1 = h$ and $x_2 = g$, where $h(z_o) = h(z_i) = 0$ and $g(z_o) = 1$ so that $g(z_i) = M$. Then

$$
\hat\phi_o^{1/2} h_o' = \hat\phi_i^{1/2} h_i' M
\tag{14}
$$

The ratio $h_i'/h_o'$ is the angular magnification $M_A$ and so

$$
M M_A = (\hat\phi_o/\hat\phi_i)^{1/2}
\tag{15}
$$

or $M M_A = 1$ if the lens has no overall accelerating effect and hence $\hat\phi_o = \hat\phi_i$. Identifying $\hat\phi^{1/2}$ with the refractive index, Eq. (15) is the particle analog of the Smith–Helmholtz formula of light optics. Analogs of all the other optical laws can be established; in particular, we find that the longitudinal magnification $M_l$ is given by

$$
M_l = M/M_A = (\hat\phi_i/\hat\phi_o)^{1/2} M^2
\tag{16}
$$

and that Abbe’s sine condition and Herschel’s condition take their familiar forms.
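The step from the Wronskian to the magnification relations can be written out explicitly. The following worked evaluation is a sketch that assumes only the boundary values quoted above, $h(z_o) = h(z_i) = 0$, $g(z_o) = 1$, $g(z_i) = M$, and writes $\hat\phi$ for the relativistically corrected axial potential:

```latex
\begin{aligned}
W(z_o) &= \hat\phi_o^{1/2}\,[\,h(z_o)\,g'(z_o) - h'(z_o)\,g(z_o)\,] = -\hat\phi_o^{1/2}\,h_o' ,\\
W(z_i) &= \hat\phi_i^{1/2}\,[\,h(z_i)\,g'(z_i) - h'(z_i)\,g(z_i)\,] = -\hat\phi_i^{1/2}\,h_i'\,M ,\\
W(z_o) = W(z_i) \;&\Longrightarrow\; \hat\phi_o^{1/2}\,h_o' = \hat\phi_i^{1/2}\,h_i'\,M ,\\
M_A = \frac{h_i'}{h_o'} \;&\Longrightarrow\; M M_A = \left(\frac{\hat\phi_o}{\hat\phi_i}\right)^{1/2},
\qquad
M_l = \frac{M}{M_A} = \left(\frac{\hat\phi_i}{\hat\phi_o}\right)^{1/2} M^2 .
\end{aligned}
```

The terms in $g'$ drop out because they are multiplied by $h$, which vanishes in both conjugate planes; this is why the result does not depend on which solution $g$ is chosen.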

We now show that image formation by electron lenses can be characterized with the aid of cardinal elements: foci, focal lengths, and principal planes. First, however, we must explain the novel notions of real and asymptotic imaging. So far, we have simply spoken of rotationally symmetric fields without specifying their distribution in space. Electron lenses are localized regions in which the magnetic or electrostatic field is strong and outside of which the field is weak but, in theory at least, does not vanish. Some typical lens geometries are shown in Fig. 2.

If the object and image are far from the lens, in effectively field-free space, or if the object is not a physical specimen but an intermediate image of the latter, the image formation can be analyzed in terms of the asymptotes to rays entering or emerging from the lens region. If, however, the true object or image is immersed within the lens field, as frequently occurs in the case of magnetic lenses, a different method of characterizing the lens properties must be adopted, and we shall speak of real cardinal elements. We consider the asymptotic case first.

It is convenient to introduce the solutions of Eq. (10) that satisfy the boundary conditions

$$
\lim_{z \to -\infty} G(z) = 1; \qquad \lim_{z \to \infty} \bar{G}(z) = 1
\tag{17}
$$

FIGURE 2 Typical electron lenses: (a–c) electrostatic lenses, of which (c) is an einzel lens; (d–e) magnetic lenses of traditional design.

These are rays that arrive at or leave the lens parallel to the axis (Fig. 3). As usual, the general solution is $x(z) = \alpha G(z) + \beta \bar{G}(z)$, where $\alpha$ and $\beta$ are constants. We denote the emergent asymptote to $G(z)$ thus:

$$
\lim_{z \to \infty} G(z) = G_i'(z - z_{Fi})
\tag{18}
$$

We denote the incident asymptote to $\bar{G}(z)$ thus:

$$
\lim_{z \to -\infty} \bar{G}(z) = \bar{G}_o'(z - z_{Fo})
\tag{19}
$$

FIGURE 3 Rays $G(z)$ and $\bar{G}(z)$.


FIGURE 4 Focal and principal planes.

Clearly, all rays incident parallel to the axis have emergent asymptotes that intersect at $z = z_{Fi}$; this point is known as the asymptotic image focus. It is not difficult to show that the emergent asymptotes to any family of rays that are parallel to one another but not to the axis intersect at a point in the plane $z = z_{Fi}$. By applying a similar reasoning to $\bar{G}(z)$, we recognize that $z_{Fo}$ is the asymptotic object focus. The incident and emergent asymptotes to $G(z)$ intersect in a plane $z_{Pi}$, which is known as the image principal plane (Fig. 4). The distance between $z_{Fi}$ and $z_{Pi}$ is the asymptotic image focal length:

$$
z_{Fi} - z_{Pi} = -1/G_i' = f_i
\tag{20}
$$

We can likewise define $z_{Po}$ and $f_o$:

$$
z_{Po} - z_{Fo} = 1/\bar{G}_o' = f_o
\tag{21}
$$

The Wronskian tells us that $\hat\phi^{1/2}(G\bar{G}' - G'\bar{G})$ is constant and so

$$
\hat\phi_o^{1/2}\,\bar{G}_o' = -\hat\phi_i^{1/2}\,G_i'
$$

or

$$
f_o/\hat\phi_o^{1/2} = f_i/\hat\phi_i^{1/2}
\tag{22}
$$

In magnetic lenses and in electrostatic lenses that provide no overall acceleration, $\hat\phi_o = \hat\phi_i$ and so $f_o = f_i$; we drop the subscript when no confusion can arise.

The coupling between an object space and an image space is conveniently expressed in terms of $z_{Fo}$, $z_{Fi}$, $f_o$, and $f_i$. From the general solution $x = \alpha G + \beta \bar{G}$, we see that

$$
\begin{aligned}
\lim_{z \to -\infty} x(z) &= \alpha + \beta(z - z_{Fo})/f_o \\
\lim_{z \to \infty} x(z) &= -\alpha(z - z_{Fi})/f_i + \beta
\end{aligned}
\tag{23}
$$

and likewise for y(z). Eliminating $\alpha$ and $\beta$, we find

$$
\begin{pmatrix} x_2 \\ x_2' \end{pmatrix} =
\begin{pmatrix}
-\dfrac{z_2 - z_{Fi}}{f_i} & f_o + \dfrac{(z_1 - z_{Fo})(z_2 - z_{Fi})}{f_i} \\
-\dfrac{1}{f_i} & \dfrac{z_1 - z_{Fo}}{f_i}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_1' \end{pmatrix}
\tag{24}
$$

where $x_1$ denotes x(z) in some plane $z = z_1$ on the incident asymptote and $x_2$ denotes x(z) in some plane $z = z_2$ on the emergent asymptote; $x' = dx/dz$. The matrix that appears in this equation is widely used to study systems with many focusing elements; it is known as the (paraxial) transfer matrix and takes slightly different forms for the various elements in use, quadrupoles in particular. We denote the transfer matrix by T.

If the planes $z_1$ and $z_2$ are conjugate, the point of arrival of a ray in $z_2$ will vary with the position coordinates of its point of departure in $z_1$ but will be independent of the gradient at that point. The transfer matrix element $T_{12}$ must therefore vanish,

$$
(z_o - z_{Fo})(z_i - z_{Fi}) = -f_o f_i
\tag{25}
$$

in which we have replaced $z_1$ and $z_2$ by $z_o$ and $z_i$ to indicate that these are now conjugates (object and image). This is the familiar lens equation in Newtonian form. Writing $z_{Fi} = z_{Pi} + f_i$ and $z_{Fo} = z_{Po} - f_o$, we obtain

$$
\frac{f_o}{z_{Po} - z_o} + \frac{f_i}{z_i - z_{Pi}} = 1
\tag{26}
$$

the thick-lens form of the regular lens equation. Between conjugates, the matrix T takes the form

$$
T = \begin{pmatrix} M & 0 \\ -\dfrac{1}{f_i} & \dfrac{f_o}{f_i}\dfrac{1}{M} \end{pmatrix}
\tag{27}
$$

in which M denotes the asymptotic magnification, the height of the image asymptote to $G(z)$ in the image plane.
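The relation between the general transfer matrix, Newton's lens equation, and the conjugate form of T can be illustrated with a few lines of code. This is a minimal sketch; the function names `transfer_matrix` and `conjugate_plane` and the numerical cardinal elements are illustrative assumptions, not values taken from the text.

```python
def transfer_matrix(z1, z2, z_Fo, z_Fi, f_o, f_i):
    """Asymptotic paraxial transfer matrix between planes z1 and z2, Eq. (24)."""
    t11 = -(z2 - z_Fi) / f_i
    t12 = f_o + (z1 - z_Fo) * (z2 - z_Fi) / f_i
    t21 = -1.0 / f_i
    t22 = (z1 - z_Fo) / f_i
    return ((t11, t12), (t21, t22))


def conjugate_plane(z_o, z_Fo, z_Fi, f_o, f_i):
    """Image plane from Newton's form of the lens equation, Eq. (25)."""
    return z_Fi - f_o * f_i / (z_o - z_Fo)


# Assumed cardinal elements (arbitrary length units): both principal
# planes then lie at z = 0, since z_Po = z_Fo + f_o and z_Pi = z_Fi - f_i.
z_Fo, z_Fi, f_o, f_i = -2.0, 2.0, 2.0, 2.0
z_o = -6.0

z_i = conjugate_plane(z_o, z_Fo, z_Fi, f_o, f_i)
T = transfer_matrix(z_o, z_i, z_Fo, z_Fi, f_o, f_i)
M = T[0][0]  # between conjugates, T11 is the magnification

print("image plane:", z_i)
print("T =", T)
```

With these numbers the off-diagonal element $T_{12}$ vanishes at the conjugate plane, $T_{11}$ gives the magnification, and $T_{22}$ equals $f_o/(f_i M)$, as Eq. (27) requires. Cascading several lenses would amount to multiplying such matrices, with drift spaces absorbed into the choice of $z_1$ and $z_2$.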

If, however, the object is a real physical specimen and not a mere intermediate image, the asymptotic cardinal elements cannot in general be used, because the object may well be situated inside the field region and only a part of the field will then contribute to the image formation. Fortunately, objective lenses, in which this situation arises, are normally operated at high magnification with the specimen close to the real object focus, the point at which the ray $\bar{G}(z)$ itself intersects the axis [whereas the asymptotic object focus is the point at which the asymptote to $\bar{G}(z)$ in object space intersects the optic axis]. The corresponding real focal length is then defined by the slope of $\bar{G}(z)$ at the object focus $F_o$: $f = 1/\bar{G}'(F_o)$; see Fig. 5.

FIGURE 5 Real focus and focal length.


2. Quadrupoles

In the foregoing discussion, we have considered only rotationally symmetric fields and have needed only the axial distributions $B(z)$ and $\phi(z)$. The other symmetry of greatest practical interest is that associated with electrostatic and magnetic quadrupoles, widely used in particle accelerators. Here, the symmetry is lower, the fields possessing planes of symmetry and antisymmetry only; these planes intersect in the optic axis, and we shall assume forthwith that electrostatic and magnetic quadrupoles are disposed as shown in Fig. 6. The reason for this is simple: The paraxial equations of motion for charged particles traveling through quadrupoles separate into two uncoupled equations only if this choice is adopted. This is not merely a question of mathematical convenience; if quadrupole fields overlap and the total system does not have the symmetry indicated, the desired imaging will not be achieved.

FIGURE 6 (a) Magnetic and (b) electrostatic quadrupoles.

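The paraxial equations are now different in the x–z and y–z planes:

$$
\begin{aligned}
\frac{d}{dz}\left(\hat\phi^{1/2}x'\right) + \frac{\gamma\phi'' - 2\gamma p_2 + 4\eta Q_2\hat\phi^{1/2}}{4\hat\phi^{1/2}}\,x &= 0 \\
\frac{d}{dz}\left(\hat\phi^{1/2}y'\right) + \frac{\gamma\phi'' + 2\gamma p_2 - 4\eta Q_2\hat\phi^{1/2}}{4\hat\phi^{1/2}}\,y &= 0
\end{aligned}
\tag{28}
$$

in which we have retained the possible presence of a round electrostatic lens field $\phi(z)$. The functions $p_2(z)$ and $Q_2(z)$ that also appear characterize the quadrupole fields; their meaning is easily seen from the field expansions [for $B(z) = 0$]:

$$
\begin{aligned}
\Phi(x, y, z) = {} & \phi(z) - \tfrac{1}{4}(x^2 + y^2)\phi''(z) + \tfrac{1}{64}(x^2 + y^2)^2\phi^{(4)}(z) \\
& + \tfrac{1}{2}(x^2 - y^2)p_2(z) - \tfrac{1}{24}(x^4 - y^4)p_2''(z) \\
& + \tfrac{1}{24}p_4(z)(x^4 - 6x^2y^2 + y^4) + \cdots
\end{aligned}
\tag{29}
$$

or, in cylindrical polar coordinates,

$$
\begin{aligned}
\Phi(r, \psi, z) = {} & \phi(z) - \tfrac{1}{4}r^2\phi'' + \tfrac{1}{64}r^4\phi^{(4)} \\
& + \tfrac{1}{2}p_2 r^2\cos 2\psi - \tfrac{1}{24}p_2'' r^4\cos 2\psi + \tfrac{1}{24}p_4 r^4\cos 4\psi + \cdots
\end{aligned}
$$

$$
\begin{aligned}
A_x &= -\frac{x}{12}(x^2 - 3y^2)Q_2'(z) \\
A_y &= \frac{y}{12}(y^2 - 3x^2)Q_2'(z) \\
A_z &= \tfrac{1}{2}(x^2 - y^2)Q_2(z) - \tfrac{1}{24}(x^4 - y^4)Q_2''(z) + \tfrac{1}{24}(x^4 - 6x^2y^2 + y^4)Q_4(z)
\end{aligned}
\tag{30}
$$

The terms $p_4(z)$ and $Q_4(z)$ characterize octopole fields, and we shall refer to them briefly in connection with the aberration correction below.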
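The Cartesian and polar forms of the multipole terms are related by elementary trigonometric identities: $r^2\cos 2\psi = x^2 - y^2$, $r^4\cos 2\psi = x^4 - y^4$, and $r^4\cos 4\psi = x^4 - 6x^2y^2 + y^4$. A quick numerical spot check is sketched below; the helper names and the sample points are arbitrary choices made for illustration.

```python
import math


def cartesian_terms(x, y):
    """Transverse dependence of the p2, p2'' and p4 terms in Cartesian form."""
    return (x * x - y * y,
            x ** 4 - y ** 4,
            x ** 4 - 6 * x * x * y * y + y ** 4)


def polar_terms(r, psi):
    """The same three terms written with r and the azimuth psi."""
    return (r * r * math.cos(2 * psi),
            r ** 4 * math.cos(2 * psi),
            r ** 4 * math.cos(4 * psi))


def max_mismatch(points):
    """Largest absolute difference between the two forms over the points."""
    worst = 0.0
    for x, y in points:
        r, psi = math.hypot(x, y), math.atan2(y, x)
        for c, p in zip(cartesian_terms(x, y), polar_terms(r, psi)):
            worst = max(worst, abs(c - p))
    return worst


sample_points = [(0.3, 0.1), (-0.2, 0.5), (0.7, -0.4), (-0.6, -0.6)]
mismatch = max_mismatch(sample_points)
print("largest mismatch:", mismatch)
```

The mismatch should be at the level of floating-point rounding. The twofold ($\cos 2\psi$) and fourfold ($\cos 4\psi$) azimuthal dependence is what distinguishes the quadrupole and octopole contributions from the rotationally symmetric terms in $r^2$ and $r^4$.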

It is now necessary to define separate transfer matrices for the x–z plane and for the y–z plane. These have exactly the same form as Eqs. (24) and (27), but we have to distinguish between two sets of cardinal elements. For arbitrary planes $z_1$ and $z_2$, we have


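$$
\begin{aligned}
T^{(x)} &= \begin{pmatrix}
-\dfrac{z_2 - z_{Fi}^{(x)}}{f_{xi}} & \dfrac{\left(z_2 - z_{Fi}^{(x)}\right)\left(z_1 - z_{Fo}^{(x)}\right)}{f_{xi}} + f_{xo} \\
-\dfrac{1}{f_{xi}} & \dfrac{z_1 - z_{Fo}^{(x)}}{f_{xi}}
\end{pmatrix} \\
T^{(y)} &= \begin{pmatrix}
-\dfrac{z_2 - z_{Fi}^{(y)}}{f_{yi}} & \dfrac{\left(z_2 - z_{Fi}^{(y)}\right)\left(z_1 - z_{Fo}^{(y)}\right)}{f_{yi}} + f_{yo} \\
-\dfrac{1}{f_{yi}} & \dfrac{z_1 - z_{Fo}^{(y)}}{f_{yi}}
\end{pmatrix}
\end{aligned}
\tag{31}
$$

Suppose now that $z = z_{xo}$ and $z = z_{xi}$ are conjugate so that $T^{(x)}_{12} = 0$; in general, $T^{(y)}_{12} \neq 0$ and so a point in the object plane $z = z_{xo}$ will be imaged as a line parallel to the y axis. Similarly, if we consider a pair of conjugates $z = z_{yo}$ and $z = z_{yi}$, we obtain a line parallel to the x axis. The imaging is hence astigmatic, and the astigmatic differences in object and image space can be related to the magnification

$$
\begin{aligned}
\Lambda_i &:= z_{xi} - z_{yi} = \Lambda_{Fi} - f_{xi}M_x + f_{yi}M_y \\
\Lambda_o &:= z_{xo} - z_{yo} = \Lambda_{Fo} + f_{xo}/M_x - f_{yo}/M_y,
\end{aligned}
\tag{32}
$$

where

$$
\begin{aligned}
\Lambda_{Fi} &:= z_{Fi}^{(x)} - z_{Fi}^{(y)} = \Lambda_i(M_x = M_y = 0) \\
\Lambda_{Fo} &:= z_{Fo}^{(x)} - z_{Fo}^{(y)} = \Lambda_o(M_x = M_y \to \infty).
\end{aligned}
\tag{33}
$$

Solving the equations $\Lambda_i = \Lambda_o = 0$ for $M_x$ and $M_y$, we find that there is a pair of object planes for which the image is stigmatic though not free of distortion.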
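The statement that a pair of stigmatic object planes exists can be made concrete. Eliminating $M_y$ between the two conditions (using $M_y = (f_{xi}M_x - \Lambda_{Fi})/f_{yi}$ from $\Lambda_i = 0$) leaves a quadratic in $M_x$, whose two roots correspond to the two planes. The sketch below assumes this elimination; the function name and the numerical cardinal elements are hypothetical and chosen only so that the roots are real.

```python
import math


def stigmatic_magnifications(L_Fi, L_Fo, f_xi, f_yi, f_xo, f_yo):
    """Solve Lambda_i = Lambda_o = 0 of Eq. (32) for the pairs (M_x, M_y).

    Substituting M_y = (f_xi*M_x - L_Fi)/f_yi into Lambda_o = 0 gives
    a*M_x**2 + b*M_x + c = 0 with the coefficients below.
    """
    a = L_Fo * f_xi
    b = f_xo * f_xi - f_yo * f_yi - L_Fo * L_Fi
    c = -f_xo * L_Fi
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        raise ValueError("no pair of real stigmatic solutions for these inputs")
    solutions = []
    for sign in (1.0, -1.0):
        M_x = (-b + sign * math.sqrt(disc)) / (2 * a)
        M_y = (f_xi * M_x - L_Fi) / f_yi
        solutions.append((M_x, M_y))
    return solutions


def residuals(M_x, M_y, L_Fi, L_Fo, f_xi, f_yi, f_xo, f_yo):
    """Astigmatic differences (Lambda_i, Lambda_o) of Eq. (32)."""
    lam_i = L_Fi - f_xi * M_x + f_yi * M_y
    lam_o = L_Fo + f_xo / M_x - f_yo / M_y
    return lam_i, lam_o


# Assumed cardinal elements: converging in x-z, diverging in y-z.
params = dict(L_Fi=1.0, L_Fo=1.0, f_xi=2.0, f_yi=-3.0, f_xo=2.0, f_yo=-3.0)
pairs = stigmatic_magnifications(**params)
checks = [residuals(M_x, M_y, **params) for M_x, M_y in pairs]
print(pairs)
print(checks)
```

Both residuals should vanish to rounding accuracy for each root. In general $M_x \neq M_y$ at these planes, which is the distortion referred to above: the image is sharp but stretched differently in the two directions.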

3. Prisms

There is an important class of devices in which the optic axis is not straight but a simple curve, almost invariably lying in a plane. The particles remain in the vicinity of this curve, but they experience different focusing forces in the plane and perpendicular to it. In many cases, the axis is a circular arc terminated by straight lines. We consider the situation in which charged particles travel through a magnetic sector field (Fig. 7); for simplicity, we assume that the field falls abruptly to zero at entrance and exit planes (rather than curved surfaces) and that the latter are normal to the optic axis, which is circular. We regard the plane containing the axis as horizontal. The vertical field at the axis is denoted by $B_o$, and off the axis, $B = B_o(r/R)^{-n}$ in the horizontal plane. It can then be shown, with the notation of Fig. 7, that paraxial trajectory equations of the form

$$
x'' + k_v^2 x = 0; \qquad y'' + k_H^2 y = 0
\tag{34}
$$

describe the particle motion, with $k_H^2 = (1 - n)/R^2$ and $k_v^2 = n/R^2$. Since these are identical in appearance with the quadrupole equations but do not have different signs, the particles will be focused in both directions but not in the same “image” plane unless $k_H = k_v$ and hence $n = \tfrac{1}{2}$. The cases $n = 0$, for which the magnetic field is homogeneous, and $n = \tfrac{1}{2}$ have been extensively studied. Since prisms are widely used to separate particles of different energy or momentum, the dispersion is an important quantity, and the transfer matrices are usually extended to include this information. In practice, more complex end faces are employed than the simple planes normal to the axis considered here, and the fringing fields cannot be completely neglected, as they are in the sharp cutoff approximation.

FIGURE 7 Passage through a sector magnet.
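The role of the field index n can be illustrated with the elementary transfer matrix of an equation of the form $u'' + k^2 u = 0$. The sketch below works in the sharp cutoff approximation and considers only the path inside the field; drift spaces, inclined end faces, and dispersion are deliberately left out, and the function names are hypothetical.

```python
import math


def sector_k2(n, R):
    """Return (k_H^2, k_v^2) for field index n and axis radius R."""
    return (1.0 - n) / R ** 2, n / R ** 2


def transfer(k2, s):
    """2x2 transfer matrix of u'' + k2*u = 0 over a path length s."""
    if k2 > 0:
        k = math.sqrt(k2)
        return ((math.cos(k * s), math.sin(k * s) / k),
                (-k * math.sin(k * s), math.cos(k * s)))
    if k2 < 0:
        k = math.sqrt(-k2)
        return ((math.cosh(k * s), math.sinh(k * s) / k),
                (k * math.sinh(k * s), math.cosh(k * s)))
    return ((1.0, s), (0.0, 1.0))  # no focusing: a plain drift


R = 1.0

# n = 1/2: equal focusing strength in both directions.
k2_H, k2_v = sector_k2(0.5, R)
double_focusing = math.isclose(k2_H, k2_v)

# n = 0 (homogeneous field): focusing in one direction only.
k2_H0, k2_v0 = sector_k2(0.0, R)
half_turn = transfer(k2_H0, math.pi * R)       # after a 180-degree arc
vertical_drift = transfer(k2_v0, math.pi * R)  # unfocused direction

print(double_focusing, half_turn, vertical_drift)
```

For the homogeneous field the matrix after a half turn has a vanishing upper-right element, so rays leaving a point in the focused direction are reunited after 180 degrees, while the other direction behaves as a drift. For $n = \tfrac{1}{2}$ the two matrices coincide for every path length, which is the double-focusing condition stated above.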

Electrostatic prisms can be analyzed in a similar way and will not be discussed separately.

B. Aberrations

1. Traditional Method

The paraxial approximation describes the dominant focusing in the principal electron-optical devices, but this is inevitably perturbed by higher order effects, or aberrations. There are several kinds of aberrations. By retaining higher order terms in the field or potential expansions, we obtain the family of geometric aberrations. By considering small changes in particle energy and lens strength, we obtain the chromatic aberrations. Finally, by examining the effect of small departures from the assumed symmetry of the field, we obtain the parasitic aberrations.

All these types of aberrations are conveniently studied by means of perturbation theory. Suppose that we have obtained the paraxial equations as the Euler–Lagrange equations of the paraxial form of M [Eq. (6)], which we denote $M^{(P)}$. Apart from a trivial change of scale, we have

$$
M^{(P)} = -\frac{1}{8\hat\phi^{1/2}}(\gamma\phi'' + \eta^2 B^2)(x^2 + y^2) + \tfrac{1}{2}\hat\phi^{1/2}(x'^2 + y'^2)
\tag{35}
$$

Suppose now that $M^{(P)}$ is perturbed to $M^{(P)} + M^{(A)}$. The second term $M^{(A)}$ may represent additional terms, neglected in the paraxial approximation, and will then enable us to calculate the geometric aberrations; alternatively, $M^{(A)}$ may measure the change in $M^{(P)}$ when particle energy and lens strength fluctuate, in which case it tells us the chromatic aberration. Other field terms yield the parasitic aberration. We illustrate the use of perturbation theory by considering the geometric aberrations of round lenses. Here, we have

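$$
\begin{aligned}
M^{(A)} = M^{(4)} = {} & -\tfrac{1}{4}L_1(x^2 + y^2)^2 - \tfrac{1}{2}L_2(x^2 + y^2)(x'^2 + y'^2) - \tfrac{1}{4}L_3(x'^2 + y'^2)^2 \\
& - R(xy' - x'y)^2 - P\hat\phi^{1/2}(x^2 + y^2)(xy' - x'y) \\
& - Q\hat\phi^{1/2}(x'^2 + y'^2)(xy' - x'y)
\end{aligned}
\tag{36}
$$

with

$$
\begin{aligned}
L_1 &= \frac{1}{32\hat\phi^{1/2}}\left(\frac{\phi''^2}{\hat\phi} - \gamma\phi^{(4)} + \frac{2\gamma\phi''\eta^2 B^2}{\hat\phi} + \frac{\eta^4 B^4}{\hat\phi} - 4\eta^2 BB''\right) \\
L_2 &= \frac{1}{8\hat\phi^{1/2}}(\gamma\phi'' + \eta^2 B^2) \\
L_3 &= \tfrac{1}{2}\hat\phi^{1/2}; \qquad P = \frac{\eta}{16\hat\phi^{1/2}}\left(\frac{\gamma\phi'' B}{\hat\phi} + \frac{\eta^2 B^3}{\hat\phi} - B''\right) \\
Q &= \frac{\eta B}{4\hat\phi^{1/2}}; \qquad R = \frac{\eta^2 B^2}{8\hat\phi^{1/2}}
\end{aligned}
\tag{37}
$$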

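and with $S^{(A)} = \int_{z_o}^{z} M^{(4)}\,dz$, we can show that

$$\begin{aligned}
\frac{\partial S^{(A)}}{\partial x_a} &= p_x^{(A)}\,t(z) - x^{(A)}\phi^{1/2}\,t'(z) \\
\frac{\partial S^{(A)}}{\partial x_o} &= p_x^{(A)}\,s(z) - x^{(A)}\phi^{1/2}\,s'(z)
\end{aligned} \tag{38}$$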

where $s(z)$ and $t(z)$ are the solutions of Eq. (10) for which $s(z_o) = t(z_a) = 1$, $s(z_a) = t(z_o) = 0$, and $z = z_a$ denotes some aperture plane. Thus, in the image plane,

$$x^{(A)} = -(M/W)\,\partial S^{(A)}_{oi}/\partial x_a \tag{39}$$

where $S^{(A)}_{oi}$ denotes $\int_{z_o}^{z_i} M^{(4)}\,dz$, with a similar expression for $y^{(A)}$. The quantities with superscript (A) indicate the departure from the paraxial approximation, and we write

$$\begin{aligned}
x_i &= x^{(A)}/M = -(1/W)\,\partial S^{(A)}_{oi}/\partial x_a \\
y_i &= y^{(A)}/M = -(1/W)\,\partial S^{(A)}_{oi}/\partial y_a
\end{aligned} \tag{40}$$

The remainder of the calculation is lengthy but straightforward. Into $M^{(4)}$, the paraxial solutions are substituted and the resulting terms are grouped according to their dependence on $x_o$, $y_o$, $x_a$, and $y_a$. We find that $S^{(A)}$ can be written

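$$\begin{aligned}
-S^{(A)}/W = {} & \tfrac{1}{4}E r_o^4 + \tfrac{1}{4}C r_a^4 + \tfrac{1}{2}A\left(V^2 - v^2\right) \\
& + \tfrac{1}{2}F r_o^2 r_a^2 + D r_o^2 V + K r_a^2 V \\
& + v\left(d r_o^2 + k r_a^2 + aV\right)
\end{aligned} \tag{41}$$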

with

$$\begin{aligned}
r_o^2 &= x_o^2 + y_o^2; \qquad r_a^2 = x_a^2 + y_a^2 \\
V &= x_o x_a + y_o y_a; \qquad v = x_o y_a - x_a y_o
\end{aligned} \tag{42}$$

and

$$\begin{aligned}
x_i = {} & x_a\left[C r_a^2 + 2KV + 2kv + (F - A) r_o^2\right] + x_o\left(K r_a^2 + 2AV + av + D r_o^2\right) - y_o\left(k r_a^2 + aV + d r_o^2\right) \\
y_i = {} & y_a\left[C r_a^2 + 2KV + 2kv + (F - A) r_o^2\right] + y_o\left(K r_a^2 + 2AV + av + D r_o^2\right) + x_o\left(k r_a^2 + aV + d r_o^2\right)
\end{aligned} \tag{43}$$

Each coefficient A, C, ..., d, k represents a different type of geometric aberration. Although all lenses suffer from every aberration, with the exception of the anisotropic aberrations described by k, a, and d, which are peculiar to magnetic lenses, the various aberrations are of very unequal importance when lenses are used for different purposes. In microscope objectives, for example, the incident electrons are scattered within the specimen and emerge at relatively steep angles to the optic axis (several milliradians or tens of milliradians). Here, it is the spherical (or aperture) aberration C that dominates, and since this aberration does not vanish on the optic axis, being independent of $r_o$, it has an extremely important effect on image quality. Of the geometric aberrations, it is this spherical aberration that determines the resolving power of the electron microscope. In the subsequent lenses of such instruments, the image is progressively enlarged until the final magnification, which may reach 100,000× or 1,000,000×, is attained. Since angular magnification is inversely proportional to transverse magnification, the


angular spread of the beam in these projector lenses will be tiny, whereas the off-axis distance becomes large. Here, therefore, the distortions D and d are dominant.
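The link between the aberration function of Eq. (41) and the aberrations of Eq. (43) can be checked numerically: by Eq. (40), $x_i$ and $y_i$ are the derivatives of $-S^{(A)}/W$ with respect to $x_a$ and $y_a$. The following is a minimal sketch in Python, not part of the original derivation. The coefficient values in `COEFFS` and the test point are arbitrary illustrative numbers, and the expression used for $y_i$ includes the term in $y_o$ that mirrors the $x_o$ term of $x_i$, as the symmetry of Eq. (41) requires.

```python
def invariants(xo, yo, xa, ya):
    """Return r_o^2, r_a^2, V and v as defined in Eq. (42)."""
    return (xo * xo + yo * yo, xa * xa + ya * ya,
            xo * xa + yo * ya, xo * ya - xa * yo)


def minus_S_over_W(xo, yo, xa, ya, c):
    """Right-hand side of Eq. (41)."""
    ro2, ra2, V, v = invariants(xo, yo, xa, ya)
    return (0.25 * c["E"] * ro2 ** 2 + 0.25 * c["C"] * ra2 ** 2
            + 0.5 * c["A"] * (V * V - v * v) + 0.5 * c["F"] * ro2 * ra2
            + c["D"] * ro2 * V + c["K"] * ra2 * V
            + v * (c["d"] * ro2 + c["k"] * ra2 + c["a"] * V))


def aberrations(xo, yo, xa, ya, c):
    """(x_i, y_i) written out term by term as in Eq. (43)."""
    ro2, ra2, V, v = invariants(xo, yo, xa, ya)
    common = c["C"] * ra2 + 2 * c["K"] * V + 2 * c["k"] * v + (c["F"] - c["A"]) * ro2
    bracket_o = c["K"] * ra2 + 2 * c["A"] * V + c["a"] * v + c["D"] * ro2
    bracket_rot = c["k"] * ra2 + c["a"] * V + c["d"] * ro2
    xi = xa * common + xo * bracket_o - yo * bracket_rot
    yi = ya * common + yo * bracket_o + xo * bracket_rot
    return xi, yi


def numerical_gradient(xo, yo, xa, ya, c, h=1e-5):
    """Central-difference derivatives of Eq. (41) with respect to x_a and y_a."""
    gx = (minus_S_over_W(xo, yo, xa + h, ya, c)
          - minus_S_over_W(xo, yo, xa - h, ya, c)) / (2 * h)
    gy = (minus_S_over_W(xo, yo, xa, ya + h, c)
          - minus_S_over_W(xo, yo, xa, ya - h, c)) / (2 * h)
    return gx, gy


# Arbitrary illustrative coefficients (assumed, not taken from any real lens).
COEFFS = dict(E=0.7, C=1.3, A=-0.4, F=0.9, D=0.5, K=-1.1, d=0.3, k=0.8, a=-0.6)

if __name__ == "__main__":
    point = (0.3, -0.2, 0.5, 0.4)  # (x_o, y_o, x_a, y_a), arbitrary
    print(aberrations(*point, COEFFS))
    print(numerical_gradient(*point, COEFFS))
```

The two printed pairs should agree to within the finite-difference error, which is one way of confirming that every term of Eq. (43) follows from Eq. (41).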

A characteristic aberration figure is associated with each aberration. This figure is the pattern in the image plane formed by rays from some object point that cross the aperture plane around a circle. For the spherical aberration, this figure is itself a circle, irrespective of the object position, and the effect of this aberration is therefore to blur the image uniformly, each Gaussian image point being replaced by a disk of radius $MCr_a^3$. The next most important aberration for objective lenses is the coma, characterized by K and k, which generates the comet-shaped streak from which it takes its name. The coefficients A and F describe Seidel astigmatism and field curvature, respectively; the astigmatism replaces stigmatic imagery by line imagery, two line foci being formed on either side of the Gaussian image plane, while the field curvature causes the image to be formed not on a plane but on a curved image surface. The distortions are more graphically understood by considering their effect on a square grid in the object plane. Such a grid is swollen or shrunk by the isotropic distortion D and warped by the anisotropic distortion d; the latter has been evocatively styled as a pocket handkerchief distortion. Figure 8 illustrates these various aberrations.
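The aberration figures described above can be generated directly from the terms in C and K of Eq. (43). The sketch below is an illustration under stated assumptions rather than a reproduction of Fig. 8: it keeps only the spherical aberration C and the isotropic coma K, works in the object-referred coordinates $x_i$, $y_i$ of Eq. (40) (so the image-plane figure is M times larger), and uses arbitrary coefficient values in the usage lines.

```python
import math


def aberration_figure(C, K, xo, yo, ra, n=12):
    """Points (x_i, y_i) for rays crossing the aperture plane on a circle of radius ra.

    Only the C (spherical) and K (isotropic coma) terms of Eq. (43) are kept.
    """
    points = []
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        xa, ya = ra * math.cos(theta), ra * math.sin(theta)
        ra2 = ra * ra
        V = xo * xa + yo * ya
        common = C * ra2 + 2.0 * K * V
        points.append((xa * common + xo * K * ra2,
                       ya * common + yo * K * ra2))
    return points


if __name__ == "__main__":
    # Spherical aberration alone: a circle of radius C * ra**3 about the
    # Gaussian image point, whatever the object position.
    for x, y in aberration_figure(C=2.0, K=0.0, xo=0.3, yo=-0.1, ra=0.5):
        print(round(math.hypot(x, y), 12))

    # Coma alone, object point on the x axis: a circle of radius K*xo*ra**2
    # whose centre is displaced to x = 2*K*xo*ra**2.
    for x, y in aberration_figure(C=0.0, K=1.5, xo=0.2, yo=0.0, ra=0.5):
        print(round(math.hypot(x - 2 * 1.5 * 0.2 * 0.25, y), 12))
```

As the aperture radius grows, the coma circles grow and move further from the Gaussian image point, and their envelope is the comet-shaped streak.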

Each aberration has a large literature, and we confine this account to the spherical aberration, an eternal preoccupation of microscope lens designers. In practice, it is more convenient to define this in terms of angle at the specimen, and recalling that $x(z) = x_o s(z) + x_a t(z)$, we see that $x'_o = x_o s'(z_o) + x_a t'(z_o)$. Hence,

$$x_i = C x_a\left(x_a^2 + y_a^2\right) = \frac{C}{t_o'^3}\,x'_o\left(x_o'^2 + y_o'^2\right) + \cdots \tag{44}$$

and we therefore write $C_s = C/t_o'^3$ so that

$$x_i = C_s x'_o\left(x_o'^2 + y_o'^2\right); \qquad y_i = C_s y'_o\left(x_o'^2 + y_o'^2\right) \tag{45}$$

It is this coefficient $C_s$ that is conventionally quoted and tabulated. A very important and disappointing property of $C_s$ is that it is intrinsically positive: The formula for it can be cast into positive-definite form, which means that we cannot hope to design a round lens free of this aberration by skillful choice of geometry and excitation. This result is known as Scherzer's theorem. An interesting attempt to upset the theorem was made by Glaser, who tried setting the integrand that occurs in the formula for $C_s$, and that can be written as the sum of several squared terms, equal to zero and solving the resulting differential equation for the field (in the magnetic case). Alas, the field distribution that emerged was not suitable for image formation, thus confirming the truth of the theorem, but it has been found useful in β-ray spectroscopy. The full implications of the theorem were established by Werner Tretner, who established the lower limit for $C_s$ as a function of the practical constraints imposed by electrical breakdown, magnetic saturation, and geometry.

FIGURE 8 Aberration patterns: (a) spherical aberration; (b) coma; (c–e) distortions.

Like the cardinal elements, the aberrations of objective lenses require a slightly different treatment from those of condenser lenses and projector lenses. The reason is easily understood: In magnetic objective lenses (and probe-forming lenses), the specimen (or target) is commonly immersed deep inside the field and only the field region downstream contributes to the image formation. The


spherical aberration is likewise generated only by this part of the field, and the expression for $C_s$ as an integral from object plane to image plane reflects this. In other lenses, however, the object is in fact an intermediate image, formed by the next lens upstream, and the whole lens field contributes to the image formation and hence to the aberrations. It is then the coupling between incident and emergent asymptotes that is of interest, and the aberrations are characterized by asymptotic aberration coefficients. These exhibit an interesting property: They can be expressed as polynomials in reciprocal magnification m (m = 1/M), with the coefficients in these polynomials being determined by the lens geometry and excitation and independent of magnification (and hence of object position). This dependence can be written

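$$\begin{pmatrix} C \\ K \\ A \\ F \\ D \end{pmatrix} = Q \begin{pmatrix} m^4 \\ m^3 \\ m^2 \\ m \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} k \\ a \\ d \end{pmatrix} = q \begin{pmatrix} m^2 \\ m \\ 1 \end{pmatrix} \tag{46}$$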

in which Q and q have the patterns

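$$Q = \begin{pmatrix} x & x & x & x & x \\ 0 & x & x & x & x \\ 0 & 0 & x & x & x \\ 0 & 0 & x & x & x \\ 0 & 0 & 0 & x & x \end{pmatrix}; \qquad q = \begin{pmatrix} x & x & x \\ 0 & x & x \\ 0 & 0 & x \end{pmatrix} \tag{47}$$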

where an x indicates that the matrix element is a nonzero quantity determined by the lens geometry and excitation.
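The zero pattern of Q in Eq. (47) fixes the degree of each polynomial in m: C is quartic, K cubic, A and F quadratic, and D linear. A short sketch shows how Eq. (46) would be evaluated. The matrix `Q_EXAMPLE` simply places 1 wherever Eq. (47) has an x; these entries are placeholders chosen for illustration, since real values come from the lens geometry and excitation.

```python
# Placeholder matrix with the zero pattern of Eq. (47); the nonzero
# entries (all set to 1.0 here) are assumed values, not lens data.
Q_EXAMPLE = [
    [1.0, 1.0, 1.0, 1.0, 1.0],  # C
    [0.0, 1.0, 1.0, 1.0, 1.0],  # K
    [0.0, 0.0, 1.0, 1.0, 1.0],  # A
    [0.0, 0.0, 1.0, 1.0, 1.0],  # F
    [0.0, 0.0, 0.0, 1.0, 1.0],  # D
]


def asymptotic_coefficients(Q, m):
    """Evaluate (C, K, A, F, D) = Q (m^4, m^3, m^2, m, 1) as in Eq. (46)."""
    powers = [m ** 4, m ** 3, m ** 2, m, 1.0]
    names = ["C", "K", "A", "F", "D"]
    return {name: sum(q * p for q, p in zip(row, powers))
            for name, row in zip(names, Q)}


if __name__ == "__main__":
    # m = 1/M; m -> 0 corresponds to very high magnification, where only
    # the last column of Q contributes.
    print(asymptotic_coefficients(Q_EXAMPLE, 0.0))
    print(asymptotic_coefficients(Q_EXAMPLE, 2.0))
```

Because Q is independent of magnification, it need be determined only once for a given geometry and excitation; the coefficients for any object position then follow from a single matrix product.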

Turning now to chromatic aberrations, we have

$$\Delta M^{(P)} = \frac{\partial M^{(2)}}{\partial \phi}\,\Delta\phi + \frac{\partial M^{(2)}}{\partial B}\,\Delta B \tag{48}$$

and a straightforward calculation yields

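$$\begin{aligned}
x^{(c)} &= -\left(C_c x'_o + C_D x_o - C_\theta y_o\right)\left(\gamma\frac{\Delta\phi_o}{\phi_o} - 2\frac{\Delta B_0}{B_0}\right) \\
y^{(c)} &= -\left(C_c y'_o + C_D y_o + C_\theta x_o\right)\left(\gamma\frac{\Delta\phi_o}{\phi_o} - 2\frac{\Delta B_0}{B_0}\right)
\end{aligned} \tag{49}$$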

for magnetic lenses or

$$x^{(c)} = -\left(C_c x'_o + C_D x_o\right)\frac{\Delta\phi_o}{\phi_o} \tag{50}$$

with a similar expression for $y^{(c)}$, for electrostatic lenses.

In objective lenses, the dominant aberration is the (axial) chromatic aberration $C_c$, which causes a blur in the image that is independent of the position of the object point, like that due to $C_s$. The coefficient $C_c$ also shares with $C_s$ the property of being intrinsically positive. The coefficients $C_D$ and $C_\theta$ affect projector lenses, but although they are pure distortions, they may well cause blurring since the term in $\Delta\phi_o$ and $\Delta B_0$ represents a spread, as in the case of the initial electron energy, or an oscillation, typically at mains frequency, coming from the power supplies.
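To give a feeling for the magnitudes involved, the axial blurs implied by Eqs. (45) and (49) can be estimated for an object point on the axis ($x_o = y_o = 0$) and a ray leaving at angle α. The sketch below is only an order-of-magnitude illustration: the values of $C_s$, $C_c$, α, and the relative energy spread are assumed round numbers, γ is set to 1 (nonrelativistic), and the results are object-referred, that is, before multiplication by the magnification.

```python
def spherical_blur(Cs, alpha):
    """Radius C_s * alpha**3 from Eq. (45) for an axial object point."""
    return Cs * alpha ** 3


def chromatic_blur(Cc, alpha, dphi_over_phi, dB_over_B=0.0, gamma=1.0):
    """Magnitude of the axial term of Eq. (49): C_c * alpha * |gamma*dphi/phi - 2*dB/B|."""
    return abs(Cc * alpha * (gamma * dphi_over_phi - 2.0 * dB_over_B))


if __name__ == "__main__":
    Cs = 1.0e-3            # m, assumed
    Cc = 1.0e-3            # m, assumed
    alpha = 1.0e-2         # rad (10 mrad), within the range quoted for objectives
    dphi_over_phi = 1e-5   # e.g. 1 eV spread at 100 kV, assumed
    print(spherical_blur(Cs, alpha))                  # about 1e-9 m
    print(chromatic_blur(Cc, alpha, dphi_over_phi))   # about 1e-10 m
```

Neither expression contains $x_o$ or $y_o$, which is the sense in which both blurs are independent of the position of the object point.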

Although a general theory can be established for the parasitic aberrations, this is much less useful than the theory of the geometric and chromatic aberrations, because the parasitic aberrations are those caused by accidental, unsystematic errors: imperfect roundness of the openings in a round lens, for example, or inhomogeneity of the magnetic material of the yoke of a magnetic lens, or imperfect alignment of the polepieces or electrodes. We therefore merely point out that one of the most important parasitic aberrations is an axial astigmatism due to the weak quadrupole field component associated with ellipticity of the openings. So large is this aberration, even in carefully machined lenses, that microscopes are equipped with a variable weak quadrupole, known as a stigmator, to cancel this unwelcome effect.

We will not give details of the aberrations of quadrupoles and prisms here. Quadrupoles have more independent aberrations than round lenses, as their lower symmetry leads us to expect, but these aberrations can be grouped into the same families: aperture aberrations, comas, field curvatures, astigmatisms, and distortions. Since the optic axis is straight, they are third-order aberrations, like those of round lenses, in the sense that the degree of the dependence on $x_o$, $x'_o$, $y_o$, and $y'_o$ is three. The primary aberrations of prisms, on the other hand, are of second order, with the axis now being curved.

2. Lie Methods

An alternative way of using Hamiltonian mechanics to study the motion of charged particles has been developed, by Alex Dragt and colleagues especially, in which the properties of Lie algebra are exploited. This has come to be known as Lie optics. It has two attractions, one very important for particle optics at high energies (accelerator optics): first, interrelations between aberration coefficients are easy to establish, and second, high-order perturbations can be studied systematically with the aid of computer algebra and, in particular, of the differential algebra developed for the purpose by Martin Berz. At lower energies, the Lie methods provide a useful check of results obtained by the traditional procedures, but at higher energies they give valuable information that would be difficult to obtain in any other way.


C. Instrumental Optics: Components

1. Guns

The range of types of particle sources is very wide, from the simple triode gun with a hairpin-shaped filament relying on thermionic emission to the plasma sources furnishing high-current ion beams. We confine this account to the thermionic and field-emission guns that are used in electron-optical instruments to furnish modest electron currents: thermionic guns with tungsten or lanthanum hexaboride emitters, in which the electron emission is caused by heating the filament, and field-emission guns, in which a very high electric field is applied to a sharply pointed tip (which may also be heated). The current provided by the gun is not the only parameter of interest and is indeed often not the most crucial. For microscope applications, a knowledge of brightness B is much more important; this quantity is a measure of the quality of the beam. Its exact definition requires considerable care, but for our present purposes it is sufficient to say that it is a measure of the current density per unit solid angle in the beam. For a given current, the brightness will be high for a small area of emission and if the emission is confined to a narrow solid angle. In scanning devices, the writing speed and the brightness are interrelated, and the resulting limitation is so severe that the scanning transmission electron microscope (STEM) came into being only with the development of high-brightness field-emission guns. Apart from a change of scale with $\phi_2/\phi_1$ in accelerating structures, the brightness is a conserved quantity in electron-optical systems (provided that the appropriate definition of brightness is employed).

The simplest and still the most widely used electron gun is the triode gun, consisting of a heated filament or cathode, an anode held at a high positive potential relative to the cathode, and, between the two, a control electrode known as the wehnelt. The latter is held at a small negative potential relative to the cathode and serves to define the area of the cathode from which electrons are emitted. The electrons converge to a waist, known as the crossover, which is frequently within the gun itself (Fig. 9). If $j_c$ is the current density at the center of this crossover and $\alpha_s$ is the angular spread (defined in Fig. 9), then

$$B = j_c/\pi\alpha_s^2 \tag{51}$$

It can be shown that B cannot exceed the Langmuir limit $B_{\max} = je\phi/\pi kT$, in which j is the current density at the filament, e is the elementary charge, φ is the accelerating voltage, k is Boltzmann's constant ($1.4 \times 10^{-23}$ J/K), and T is the filament temperature. The various properties of the gun vary considerably with the size and position of the wehnelt and anode and the potentials applied to them; the general behavior has been satisfactorily explained in terms of a rather simple model by Rolf Lauer.

FIGURE 9 Electron gun and formation of the crossover.
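Equation (51) and the Langmuir limit are simple enough to evaluate directly. The following sketch does so in SI units; the filament current density, accelerating voltage, and temperature in the usage lines are assumed illustrative values for a thermionic gun, not figures taken from the text, and the constants are the exact SI values rather than the rounded value quoted above.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_BOLTZMANN = 1.380649e-23   # Boltzmann's constant, J/K


def brightness(jc, alpha_s):
    """Eq. (51): B = j_c / (pi * alpha_s**2), in A m^-2 sr^-1."""
    return jc / (math.pi * alpha_s ** 2)


def langmuir_limit(j, phi, T):
    """Langmuir limit B_max = j * e * phi / (pi * k * T)."""
    return j * E_CHARGE * phi / (math.pi * K_BOLTZMANN * T)


if __name__ == "__main__":
    j = 3.0e4      # A/m^2 (3 A/cm^2), assumed filament current density
    phi = 1.0e5    # V, assumed accelerating voltage
    T = 2700.0     # K, assumed filament temperature
    print(f"B_max = {langmuir_limit(j, phi, T):.3e} A m^-2 sr^-1")
```

The limit scales linearly with accelerating voltage and inversely with filament temperature, which is why brightness is usually compared at a stated voltage.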

The crossover is a region in which the current density is high, and frequently high enough for interactions between the beam electrons to be appreciable. A consequence of this is a redistribution of the energy of the particles and, in particular, an increase in the energy spread by a few electron volts. This effect, detected by Hans Boersch in 1954 and named after him, can be understood by estimating the mean interaction using statistical techniques.

Another family of thermionic guns has rare-earth boride cathodes, LaB6 in particular. These guns were introduced in an attempt to obtain higher brightness than a traditional thermionic gun could provide, and they are indeed brighter sources; they are technologically somewhat more complex, however. They require a slightly better vacuum than tungsten triode guns, and in the first designs the LaB6 rod was heated indirectly by placing a small heating coil around it; subsequently, however, directly heated designs were developed, which made these guns more attractive for commercial purposes.

Even LaB6 guns are not bright enough for the needs of the high-resolution STEM, in which a probe only a few tenths of a nanometer in diameter is raster-scanned over a thin specimen and the transmitted beam is used to form an image (or images). Here, a field-emission gun is indispensable. Such guns consist of a fine tip and two (or more) electrodes, the first of which creates a very high electric field at the tip, while the second accelerates the electrons thus extracted to the desired accelerating voltage. Such guns operate satisfactorily only if the vacuum is very good indeed; the pressure in a field-emission gun must be some five or six orders of magnitude lower than that in a thermionic triode gun. The resulting brightness is


FIGURE 10 Electrostatic einzel lens design: (A) lens casing; (B and C) insulators.

appreciably higher, but the current is not always sufficient when only a modest magnification is required.

We repeat that the guns described above form only one end of the spectrum of particle sources. Others have large flat cathodes. Many are required to produce high currents and current densities, in which case we speak of space-charge flow; these are the Pierce guns and PIGs (Pierce ion guns).

2. Electrostatic Lenses

Round electrostatic lenses take the form of a series of plates in which a round opening has been pierced or circular cylinders all centered on a common axis (Fig. 10). The potentials applied may be all different or, more often, form a simple pattern. The most useful distinction in practice separates lenses that create no overall acceleration of the beam (although, of course, the particles are accelerated and decelerated within the lens field) and those that do produce an overall acceleration or deceleration. In the first case, the usual configuration is the einzel lens, in which the outer two of the three electrodes are held at anode potential (or at the potential of the last electrode of any lens upstream if this is not at anode potential) and the central electrode is held at a different potential. Such lenses were once used in electrostatic microscopes and are still routinely employed when the insensitivity of electrostatic systems to voltage fluctuations that affect all the potentials equally is exploited. Extensive sets of curves and tables describing the properties of such lenses are available.

Accelerating lenses with only a few electrodes have also been thoroughly studied; a configuration that is of interest today is the multielectrode accelerator structure. These accelerators are not intended to furnish very high particle energies, for which very different types of accelerator are employed, but rather to accelerate electrons to energies beyond the limit of the simple triode structure, which cannot be operated above ∼150 kV. For microscope and microprobe instruments with accelerating voltages in the range of a few hundred kilovolts up to a few megavolts, therefore, an accelerating structure must be inserted between the gun and the first condenser lens. This structure is essentially a multielectrode electrostatic lens with the desired accelerating voltage between its terminal electrodes. This point of view is particularly useful when a field-emission gun is employed because of an inconvenient aspect of the optics of such guns: The position and size of the crossover vary with the current emitted. In a thermionic gun, the current is governed essentially by the temperature of the filament and can hence be varied by changing the heating current. In field-emission guns, however, the current is determined by the field at the tip and is hence varied by changing the potential applied to the first electrode, which in turn affects the focusing field inside the gun. When such a gun is followed by an accelerator, it is not easy to achieve a satisfactory match for all emission currents and final accelerating voltages unless both gun and accelerator are treated as optical elements. Miniature lenses and guns and arrays of these are being fabricated, largely to satisfy the needs of nanolithography. A spectacular achievement is the construction of a scanning electron microscope that fits into the hand, no bigger than a pen. The optical principles are the same as for any other lens.

3. Magnetic Lenses

There are several kinds of magnetic lenses, but the vast majority have the form of an electromagnet pierced by a circular canal along which the electrons pass. Figure 11 shows such a lens schematically, and Fig. 12 illustrates a more realistic design in some detail. The magnetic flux

FIGURE 11 Typical field distribution in a magnetic lens.


FIGURE 12 Modern magnetic objective lens design. (Courtesy of Philips, Eindhoven.)

is provided by a coil, which usually carries a low current through a large number of turns; water cooling prevents overheating. The magnetic flux is channeled through an iron yoke and escapes only at the gap, where the yoke is terminated with polepieces of high permeability. This arrangement is chosen because the lens properties will be most favorable if the axial magnetic field is in the form of a high, narrow bell shape (Fig. 11) and the use of a high-permeability alloy at the polepieces enables one to create a strong axial field without saturating the yoke. Considerable care is needed in designing the exact shape of these polepieces, but for a satisfactory choice, the properties of the lens are essentially determined by the gap S, the bore D (or the front and back bores if these are not the same), and the excitation parameter J; the latter is defined by $J = NI/\phi_o^{1/2}$, where NI is the product of the number of turns of the coil and the current carried by it and $\phi_o$ is the relativistic accelerating voltage; S and D are typically of the order of millimeters and J is a few amperes per (volts)$^{1/2}$. The quantity NI can be related to the axial field strength with the aid of Ampere's circuital theorem (Fig. 13); we see that

$$\int_{-\infty}^{\infty} B(z)\,dz = \mu_0 NI \quad \text{so that} \quad NI \propto B_0$$

the maximum field in the gap, the constant of proportionality being determined by the area under the normalized flux distribution $B(z)/B_0$.

FIGURE 13 Use of Ampere's circuital theorem to relate lens excitation to axial field strength.

Although accurate values of the optical properties of magnetic lenses can be obtained only by numerical methods, in which the field distribution is first calculated by one of the various techniques available (finite differences, finite elements, and boundary elements in particular), their variation can be studied with the aid of field models. The most useful (though not the most accurate) of these is Glaser's bell-shaped model, which has the merits of simplicity, reasonable accuracy, and, above all, the possibility of expressing all the optical quantities such as focal length, focal distance, the spherical and chromatic aberration coefficients $C_s$ and $C_c$, and indeed all the third-order aberration coefficients, in closed form, in terms of circular functions. In this model, B(z) is represented by

$$B(z) = B_0/\left(1 + z^2/a^2\right) \tag{52}$$

and writing $w^2 = 1 + k^2$, $k^2 = \eta^2 B_0^2 a^2/4\phi_0$, $z = a\cot\psi$, the paraxial equation has the general solution

$$x(\psi) = (A\cos w\psi + B\sin w\psi)/\sin\psi \tag{53}$$

The focal length and focal distance can be written down immediately, and the integrals that give $C_s$ and $C_c$ can be evaluated explicitly. This model explains very satisfactorily the way in which these quantities vary with the excitation and with the geometric parameter a.
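The closed-form solution for the bell-shaped field can be checked numerically. The sketch below makes two assumptions that are not spelled out in this section: that the paraxial equation, Eq. (10), reduces for a purely magnetic lens to $x'' + (\eta^2 B^2/4\phi)x = 0$, and that the circular functions in Eq. (53) have argument $w\psi$. It also evaluates the ampere-turns that Ampere's circuital theorem gives for this field, $NI = \pi a B_0/\mu_0$, using the conventional value $\mu_0 = 4\pi \times 10^{-7}$ H/m. The numerical values of a, $k^2$, A, and B are arbitrary.

```python
import math

MU0 = 4.0e-7 * math.pi  # H/m, conventional value


def ampere_turns(B0, a):
    """NI from the integral of B0/(1 + z^2/a^2) over all z, which is pi*a*B0."""
    return math.pi * a * B0 / MU0


def glaser_ray(z, a, k2, A, B):
    """x(z) from Eq. (53), with z = a*cot(psi), 0 < psi < pi, and w^2 = 1 + k^2."""
    w = math.sqrt(1.0 + k2)
    psi = math.atan2(a, z)
    return (A * math.cos(w * psi) + B * math.sin(w * psi)) / math.sin(psi)


def paraxial_residual(z, a, k2, A, B, h=1e-4):
    """x'' + k^2 x / (a^2 (1 + z^2/a^2)^2), with x'' by central differences."""
    def x(s):
        return glaser_ray(s, a, k2, A, B)
    xpp = (x(z + h) - 2.0 * x(z) + x(z - h)) / h ** 2
    return xpp + k2 / (a * a * (1.0 + (z / a) ** 2) ** 2) * x(z)


if __name__ == "__main__":
    for z in (-2.0, 0.0, 1.5):
        print(z, paraxial_residual(z, a=1.0, k2=2.0, A=0.3, B=1.0))
    # B0 = 1 T with a = 1 mm (assumed values) needs about 2500 ampere-turns.
    print(ampere_turns(1.0, 1.0e-3))
```

The residuals should be zero to within the finite-difference error. Dropping w from the arguments would reduce x to a straight line in z, which cannot satisfy the equation for nonzero excitation.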

The traditional design of Fig. 12 has many minor variations in which the bore diameter is varied and the yoke shape altered, but the optical behavior is not greatly affected. The design adopted is usually a compromise between the optical performance desired and the technological needs of the user. In high-performance systems, the specimen is usually inside the field region and may be inserted either down the upper bore (top entry) or laterally through the gap (side entry). The specimen-holder mechanism requires a certain volume, especially if it is of one of the sophisticated models that permit in situ experiments: specimen heating, to study phase changes in alloys, for example, or specimen cooling to liquid nitrogen or liquid helium temperature, or straining; specimen rotation and tilt are routine requirements of the metallurgist. All this requires space in the gap region, which is further encumbered by a cooling device to protect the specimen from contamination, the stigmator, and the objective aperture drive. The desired optical properties must be achieved subject to the demands on space of all these devices, as far as this is possible. As Ugo Valdre has said, the interior of an electron microscope objective should be regarded as a microlaboratory.

Magnetic lenses commonly operate at room temper-ature, but there is some advantage in going to very


FIGURE 14 Superconducting lens system: (1) objective (shielding lens); (2) intermediate with iron circuit; (3) specimen holder; and (4) corrector device.

low temperature and running in the superconducting regime. Several designs have been explored since Andre Laberrigue, Humberto Fernandez-Moran, and Hans Boersch introduced the first superconducting lenses, but only one has survived, the superconducting shielding lens introduced by Isolde Dietrich and colleagues at Siemens (Fig. 14). Here, the entire lens is at a very low temperature, the axial field being produced by a superconducting coil and concentrated into the narrow gap region by superconducting tubes. Owing to the Meissner–Ochsenfeld effect, the magnetic field cannot penetrate the metal of these superconducting tubes and is hence concentrated in the gap. The field is likewise prevented from escaping from the gap by a superconducting shield. Such lenses have been incorporated into a number of microscopes and are particularly useful for studying material that must be examined at extremely low temperatures; organic specimens that are irretrievably damaged by the electron beam at higher temperatures are a striking example.

Despite their very different technology, these superconducting lenses have essentially the same optical properties as their warmer counterparts. This is not true of the various magnetic lenses that are grouped under the heading of unconventional designs; these were introduced mainly by Tom Mulvey, although the earliest, the minilens, was devised by Jan Le Poole. The common feature of these lenses, which are extremely varied in appearance, is that the space occupied by the lens is very different in volume or shape from that required by a traditional lens. A substantial reduction in the volume can be achieved by increasing the current density in the coil; in the minilens (Fig. 15), the value may be ∼80 A/mm², whereas in a conventional lens, 2 A/mm² is a typical figure. Such lenses are employed as auxiliary lenses in zones already occupied by other elements, such as bulky traditional lenses. After the initial success of these minilenses, a family of miniature lenses came into being, with which it would be possible to reduce the dimensions of the huge, heavy lenses used for very high voltage microscopes (in the megavolt range). Once the conventional design had been questioned, it was natural to inquire whether there was any advantage to be gained by abandoning its symmetric shape. This led to the invention of the pancake lens, flat like a phonograph record, and various single-polepiece or “snorkel” lenses (Fig. 16). These are attractive in situations where the electrons are at the end of their trajectory, and the single-polepiece design of Fig. 16 can be used with a target in front of it or a gun beyond it. Owing to their very flat shape, such lenses, with a bore, can be used to render microscope projector systems free of certain distortions, which are otherwise very difficult to eliminate.
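The saving in volume that comes from a higher current density follows from simple arithmetic: for a given excitation NI, the conductor cross section of the coil is NI divided by the current density. The sketch below compares the two current densities quoted above, read as ∼80 A/mm² for the minilens and 2 A/mm² for a conventional lens. The excitation of 2500 ampere-turns is an assumed illustrative value, and the estimate ignores insulation, packing fraction, and cooling channels.

```python
def coil_cross_section_mm2(ampere_turns, current_density_A_per_mm2):
    """Conductor cross section (mm^2) needed to carry a given excitation NI."""
    return ampere_turns / current_density_A_per_mm2


if __name__ == "__main__":
    NI = 2500.0  # ampere-turns, assumed for illustration
    print(coil_cross_section_mm2(NI, 2.0))    # conventional lens: 1250.0 mm^2
    print(coil_cross_section_mm2(NI, 80.0))   # minilens: 31.25 mm^2
```

For the same excitation the winding cross section shrinks by the ratio of the current densities, a factor of 40 here, at the price of much more demanding cooling.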

This does not exhaust all the types of magnetic lens. For many years, permanent-magnet lenses were investigated in the hope that a simple and inexpensive microscope could be constructed with them. An addition to the family of traditional lenses is the unsymmetric triple-polepiece lens, which offers the same advantages as the single-polepiece designs in the projector system. Magnetic lens studies have also been revivified by the needs of electron beam lithography.

4. Aberration Correction

The quest for high resolution has been a persistent preoccupation of microscope designers since these instruments came into being. Scherzer's theorem (1936) was therefore a very unwelcome result, showing as it did that the principal resolution-limiting aberration could never vanish in

FIGURE 15 Minilens.


FIGURE 16 Some unconventional magnetic lenses.

round lenses. It was Scherzer again who pointed out (1947) the various ways of circumventing his earlier result by introducing aberration correctors of various kinds. The proof of the theorem required rotational symmetry, static fields, the absence of space charge, and the continuity of certain properties of the electrostatic potential. If any one of these requirements is relaxed, aberration correction is in principle possible, but only two approaches have achieved any measure of success.

The most promising type of corrector was long believed to be that obtained by departing from rotational symmetry, and it was with such devices that correction was at last successfully achieved in the late 1990s. Such correctors fall into two classes. In the first, quadrupole lenses are employed. These introduce new aperture aberrations, but by adding octopole fields, the combined aberration of the round lens and the quadrupoles can be cancelled. At least four quadrupoles and three octopoles are required.

FIGURE 17 Correction of spherical aberration in a scanning transmission electron microscope. (Left) Schematic diagram of the quadrupole–octopole corrector and typical trajectories. (Right) Incorporation of the corrector in the column of a Vacuum Generators STEM. [From Krivanek, O. L., et al. (1997). Institute of Physics Conference Series 153, 35. Copyright IOP Publishing.]

A corrector based on this principle has been incorporated into a scanning transmission electron microscope by O. Krivanek at the University of Cambridge (Fig. 17). In the second class of corrector, the nonrotationally symmetric elements are sextupoles. A suitable combination of two sextupoles has a spherical aberration similar to that of a round lens but of opposite sign, and the undesirable second-order aberrations cancel out (Fig. 18). The technical difficulties of introducing such a corrector in a high-resolution transmission electron microscope have been overcome by M. Haider (Fig. 19).

Quadrupoles and octopoles had seemed the most likely type of corrector to succeed because the disturbance to the existing instrument, already capable of an unaided resolution of a few angstroms, was slight. The family of correctors that employ space charge or charged foils placed across the beam perturb the microscope rather more. Efforts continue to improve lenses by inserting one or more

FIGURE 18 Correction of spherical aberration in a transmission electron microscope. Arrangement of round lenses and sextupoles (hexapoles) that forms a semiaplanatic objective lens. The distances are chosen to eliminate radial coma. [From Haider, M., et al. (1995). Optik 99, 167. Copyright Wissenschaftliche Verlagsgesellschaft.]


FIGURE 19 (a) The corrector of Fig. 18 incorporated in a transmission electron microscope. (b) The phase contrast transfer function of the corrected microscope. Dashed line: no correction. Full line: corrector switched on, energy width (a measure of the temporal coherence) 0.7 eV. Dotted line: energy width 0.2 eV. Chromatic aberration remains a problem, and the full benefit of the corrector is obtained only if the energy width is very narrow. [From Haider, M., et al. (1998). J. Electron Microsc. 47, 395. Copyright Japanese Society of Electron Microscopy.]

FIGURE 20 Foil lens and polepieces of an objective lens to be corrected. [From Hanai, T., et al. (1998). J. Electron Microsc. 47, 185. Copyright Japanese Society of Electron Microscopy.]

foils in the path of the electrons, with a certain measure of success, but doubts still persist about this method. Even if a reduction in total $C_s$ is achieved, the foil must have a finite thickness and will inevitably scatter the electrons traversing it. How is this scattering to be separated from that due to the specimen? Figure 20 shows the design employed in an ongoing Japanese project.

An even more radical solution involves replacing the static objective lens by one or more microwave cavities. In Scherzer's original proposal, the incident electron beam was broken into short pulses and the electrons far from the axis would hence arrive at the lens slightly later than those traveling near the axis. By arranging that the axial electrons encounter the maximum field so that the peripheral electrons experience a weaker field, Scherzer argued, the effect of Cs could be eliminated since, in static lenses, the peripheral electrons are too strongly focused. Unfortunately, when we insert realistic figures into the corresponding equations, we find that the necessary frequency is in the gigahertz range, with the result that the electrons spend a substantial part of a cycle, or more than a cycle, within the microwave field. Although this means that the simple explanation is inadequate, it does not invalidate the principle, and experiment and theory both show that microwave cavity lenses can have positive or negative spherical aberration coefficients. The principal obstacles to their use are the need to produce very short pulses containing sufficient current and, above all, the fact that the beam emerging from such cavity lenses has a rather large energy

FIGURE 21 Microwave cavity lens between the polepieces of a magnetic lens. (Courtesy of L. C. Oldfield.)

spread, which makes further magnification a problem. An example is shown in Fig. 21.

Finally, we mention the possibility of a posteriori correction in which we accept the deleterious effect of Cs on the recorded micrograph but attempt to reduce or eliminate it by subsequent digital or analog processing of the image. A knowledge of the wave theory of electron image formation is needed to understand this idea and we therefore defer discussion of it to Section III.B.

5. Prisms, Mirrors, and Energy Analyzers

Magnetic and electrostatic prisms and systems built up from these are used mainly for their dispersive properties in particle optics. We have not yet encountered electron mirrors, but we mention them here because a mirror action is associated with some prisms; if electrons encounter a potential barrier that is high enough to halt them, they will be reflected and a paraxial optical formalism can be developed to describe such mirror optics. This is less straightforward than for lenses, since the ray gradient is far from small at the turning point, which means that one of the usual paraxial assumptions, that off-axis distance and ray gradient are everywhere small, is no longer justified.

The simplest magnetic prisms, as we have seen, are sector fields created by magnets of the C-type or picture-frame arrangement (Fig. 22) with circular poles or sector poles with a sector or rectangular yoke. These or analogous electrostatic designs can be combined in many ways, of which we can mention only a small selection. A very ingenious arrangement, which combines magnetic deflection with an electrostatic mirror, is the Castaing–Henry analyzer (Figs. 23a–23c), which has the constructional convenience that the incident and emergent optic axes are in line; its optical properties are such that an energy-filtered image or an energy spectrum from a selected area can be obtained. A natural extension of this is the magnetic filter (Fig. 23d), in which the mirror is suppressed; if the particle energy is not too high, use of the electrostatic analog of this can be envisaged (Fig. 23e). It is possible to eliminate many of the aberrations of such filters by arranging the system not only symmetrically about the mid-plane (x′ − x in Fig. 23d), but also antisymmetrically about the planes midway between the mid-plane and the optic axis. A vast number of prism combinations have been explored by Veniamin Kel'man and colleagues in Alma-Ata in the quest for high-performance mass and electron spectrometers.

Energy analysis is a subject in itself, and we can do no more than mention various other kinds of energy or

FIGURE 22 (a) C-type and (b) picture-frame magnets, typically having (c) sector-shaped yoke and poles.

FIGURE 23 Analyzers: (a–c) Castaing–Henry analyzer; (d) filter; and (e) electrostatic analog of the filter.

FIGURE 24 Möllenstedt analyzer.

momentum analyzers. The Wien filter consists of crossed electrostatic and magnetic fields, through which particles of a particular energy will pass undeflected, whereas all others will be deviated from their path. The early β-ray spectrometers exploited the fact that the chromatic aberration of a lens causes particles of different energies to be focused in different planes. The Möllenstedt analyzer is based on the fact that rays in an electrostatic lens far from the axis are rapidly separated if their energies are different (Fig. 24). The Ichinokawa analyzer is the magnetic analog of this and is used at higher accelerating voltages where electrostatic lenses are no longer practicable. In retarding-field analyzers, a potential barrier is placed in the path of the electrons and the current recorded as the barrier is progressively lowered.
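The Wien-filter condition can be made concrete with a small numerical sketch. It assumes ideal uniform crossed fields, for which the electric and magnetic forces cancel at the speed v = E/B, and it uses the nonrelativistic relation between speed and kinetic energy. The function names and the field values are illustrative assumptions and do not describe any particular instrument.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron rest mass, kg


def wien_pass_velocity(e_field, b_field):
    """Speed (m/s) at which eE = evB, so the particle is undeflected."""
    return e_field / b_field


def wien_pass_energy_ev(e_field, b_field, mass=M_ELECTRON):
    """Nonrelativistic kinetic energy (eV) of the undeflected particles."""
    v = wien_pass_velocity(e_field, b_field)
    return 0.5 * mass * v * v / E_CHARGE


def net_transverse_force(speed, e_field, b_field, charge=-E_CHARGE):
    """Net transverse force q(E - vB) in newtons; zero at the pass speed."""
    return charge * (e_field - speed * b_field)


if __name__ == "__main__":
    e_field, b_field = 1.0e5, 5.0e-3  # V/m and T (illustrative values)
    v0 = wien_pass_velocity(e_field, b_field)
    print(f"pass speed  = {v0:.3e} m/s")
    print(f"pass energy = {wien_pass_energy_ev(e_field, b_field):.1f} eV")
    for factor in (0.9, 1.0, 1.1):
        force = net_transverse_force(factor * v0, e_field, b_field)
        print(f"v = {factor:.1f} v0 -> transverse force = {force:+.3e} N")
```

Particles slower or faster than the pass speed feel forces of opposite sign, which is what allows a slit behind the filter to select a narrow energy band.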

6. Combined Deflection and Focusing Devices

In the quest for microminiaturization, electron beam lithography has acquired considerable importance. It proves to be advantageous to include focusing and deflecting fields within the same volume, and the optical properties of such combined devices have hence been thoroughly studied, particularly their aberrations. It is important to keep the adverse effect of these aberrations small, especially because the beam must be deflected far from the original optical axis. An ingenious way of achieving this, proposed by Hajime Ohiwa, is to arrange that the optic axis effectively shifts parallel to itself as the deflecting field is applied; for this, appropriate additional deflection, round, and multipole fields must be superimposed, and the result may be regarded as a "moving objective lens" (MOL) or "variable-axis lens" (VAL). Perfected immersion versions of these and of the "swinging objective lens" (SOL) have been developed, in which the target lies within the field region.

III. WAVE OPTICS

A. Wave Propagation

The starting point here is not the Newton–Lorentz equations but Schrödinger's equation; we shall use the nonrelativistic form, which can be trivially extended to include relativistic effects for magnetic lenses. Spin is thus neglected, which is entirely justifiable in the vast majority of practical situations. The full Schrödinger equation takes the form

$$-\frac{\hbar^2}{2m_0}\nabla^2\Psi + \frac{e\hbar}{i m_0}\,\mathbf{A}\cdot\operatorname{grad}\Psi + \left(-e\Phi + \frac{e^2}{2m_0}A^2\right)\Psi - i\hbar\,\frac{\partial\Psi}{\partial t} = 0 \tag{54}$$

and writing

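$$\Psi(x, y, z, t) = \psi(x, y, z)\,e^{-i\omega t} \tag{55}$$

we obtain

$$-\frac{\hbar^2}{2m_0}\nabla^2\psi + \frac{e\hbar}{i m_0}\,\mathbf{A}\cdot\operatorname{grad}\psi + \left(-e\Phi + \frac{e^2}{2m_0}A^2\right)\psi = E\psi \tag{56}$$

with

$$E = \hbar\omega \tag{57}$$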

where ħ = h/2π and h is Planck's constant. The free-space solution corresponds to

$$p = h/\lambda$$

or

$$\lambda = \frac{h}{(2em_0\phi_0)^{1/2}} \approx \frac{12.5}{\phi_0^{1/2}} \tag{58}$$

where p is the momentum.

As in the case of geometric optics, we consider the paraxial approximation, which for the Schrödinger equation takes the form

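$$-\hbar^2\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2}\right) + \frac{1}{2}em_0\left(\phi'' + \eta^2B^2\right)(x^2 + y^2)\psi - i\hbar p'\psi - 2i\hbar p\,\frac{\partial\psi}{\partial z} = 0 \tag{59}$$

and we seek a wavelike solution:

$$\psi(x, y, z) = a(z)\exp[iS(x, y, z)/\hbar]. \tag{60}$$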
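As a numerical aside on Eq. (58), the following sketch evaluates the free-space wavelength from CODATA constants. With λ in ångströms and φ0 in volts, the nonrelativistic coefficient evaluates to about 12.26, close to the rounded figure quoted in Eq. (58). The relativistic variant, in which φ0 is replaced by φ0(1 + eφ0/2m0c²), is a standard refinement that the text does not write out and is included here as an assumption; the function names are likewise illustrative.

```python
import math

H_PLANCK = 6.62607015e-34      # Planck's constant, J s
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron rest mass, kg
C_LIGHT = 2.99792458e8         # speed of light, m/s


def wavelength_nonrel(phi0):
    """Eq. (58): lambda = h / sqrt(2 e m0 phi0), with phi0 in volts, result in metres."""
    return H_PLANCK / math.sqrt(2.0 * E_CHARGE * M_ELECTRON * phi0)


def wavelength_rel(phi0):
    """Same formula with the relativistically corrected potential (assumed refinement)."""
    eps = E_CHARGE / (2.0 * M_ELECTRON * C_LIGHT ** 2)
    return wavelength_nonrel(phi0 * (1.0 + eps * phi0))


if __name__ == "__main__":
    print(f"coefficient: {wavelength_nonrel(1.0) * 1e10:.3f} angstrom V^(1/2)")
    for volts in (1.0e4, 1.0e5, 3.0e5):
        print(f"{volts:9.0f} V: nonrel {wavelength_nonrel(volts) * 1e12:7.3f} pm, "
              f"rel {wavelength_rel(volts) * 1e12:7.3f} pm")
```

At 100 kV the two expressions already differ by roughly 5%, which is why the relativistic form is normally used for transmission microscopes.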

After some calculation, we obtain the required equation describing the propagation of the wave function through electrostatic and magnetic fields:

$$\psi(x, y, z) = \frac{p_o^{3/2}}{2\pi i\hbar\,h(z)\,p^{1/2}}\exp\left[\frac{ipg'(z)}{2\hbar g(z)}(x^2 + y^2)\right] \iint_{-\infty}^{\infty}\psi(x_o, y_o, z_o)\exp\left\{\frac{ip_o}{2\hbar g(z)h(z)}\left[(x - x_og)^2 + (y - y_og)^2\right]\right\}dx_o\,dy_o \tag{61}$$

This extremely important equation is the basis for all that follows. In it, g(z) and h(z) now denote the solutions of the geometric paraxial equations satisfying the boundary conditions g(zo) = h′(zo) = 1, g′(zo) = h(zo) = 0. Reorganizing the various terms, Eq. (61) can be written

$$\psi(x, y, z) = \frac{1}{i\lambda rh(z)}\iint_{-\infty}^{\infty}\psi(x_o, y_o, z_o)\exp\left\{\frac{i\pi}{\lambda h(z)}\left[g(z)\left(x_o^2 + y_o^2\right) - 2(x_ox + y_oy) + rh'(z)(x^2 + y^2)\right]\right\}dx_o\,dy_o \tag{62}$$

with λ = h/po and r = p/po = (φ/φo)^{1/2}.

Let us consider the plane z = zd in which g(z) vanishes, g(zd) = 0. For the magnetic case (r = 1), we find

$$\psi(x_d, y_d, z_d) = \frac{E_d}{i\lambda h(z_d)}\iint\psi(x_o, y_o, z_o)\exp\left[-\frac{2\pi i}{\lambda h(z_d)}(x_ox_d + y_oy_d)\right]dx_o\,dy_o \tag{63}$$

with $E_d = \exp[i\pi h'(z_d)(x_d^2 + y_d^2)/\lambda h(z_d)]$, so that, scale factors apart, the wave function in this plane is the Fourier transform of the same function in the object plane.
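The Fourier-transform relation of Eq. (63) can be checked numerically in one dimension. The sketch below discretises the integral by the midpoint rule for a slit-shaped object wave and compares it with the analytic transform sin(πuw)/(πu), where u = x_d/(λh(z_d)) plays the role of spatial frequency. The slit object, the grid, and all function names are assumptions made for illustration; the prefactor and the quadratic phase E_d are omitted because they do not affect the comparison.

```python
import cmath
import math


def make_grid(n, length):
    """Midpoints of n equal cells covering [-length/2, length/2], plus the cell size."""
    dx = length / n
    return [(-length / 2) + (k + 0.5) * dx for k in range(n)], dx


def slit(xs, width):
    """Object wave: unit amplitude inside a slit of the given width, zero outside."""
    return [1.0 if abs(x) < width / 2 else 0.0 for x in xs]


def diffraction_amplitude(psi_o, xs, dx, u):
    """Midpoint-rule version of the integral in Eq. (63) at frequency u."""
    return dx * sum(p * cmath.exp(-2j * math.pi * u * x) for p, x in zip(psi_o, xs))


def slit_transform(u, width):
    """Analytic Fourier transform of the slit."""
    return width if u == 0 else math.sin(math.pi * u * width) / (math.pi * u)


if __name__ == "__main__":
    xs, dx = make_grid(2000, 2.0)
    psi_o = slit(xs, 0.5)
    for u in (0.0, 1.0, 2.0, 3.0):
        numeric = diffraction_amplitude(psi_o, xs, dx, u)
        print(f"u = {u:.1f}: numeric = {numeric.real:+.6f}, "
              f"analytic = {slit_transform(u, 0.5):+.6f}")
```

The zeros of the pattern fall at multiples of 1/w, the familiar reciprocal relation between object size and the scale of the diffraction pattern.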

We now consider the relation between the wave function in the object plane and in the image plane z = zi conjugate to this, in which h(z) vanishes: h(zi) = 0. It is convenient to calculate this in two stages, first relating ψ(xi, yi, zi) to the wave function in the exit pupil plane of the lens, ψ(xa, ya, za), and then calculating the latter with the aid of Eq. (62). Introducing the paraxial solutions G(z), H(z) such that

$$G(z_a) = H'(z_a) = 1; \qquad G'(z_a) = H(z_a) = 0$$

we have

$$\psi(x_i, y_i, z_i) = \frac{1}{i\lambda H(z_i)}\iint\psi(x_a, y_a, z_a)\exp\left\{\frac{i\pi}{\lambda H(z_i)}\left[G(z_i)\left(x_a^2 + y_a^2\right) - 2(x_ax_i + y_ay_i) + H'(z_i)\left(x_i^2 + y_i^2\right)\right]\right\}dx_a\,dy_a \tag{64}$$

Using Eq. (62), we find

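$$M\psi(x_i, y_i, z_i)E_i = \iint\psi(x_o, y_o, z_o)K(x_i, y_i; x_o, y_o)E_o\,dx_o\,dy_o \tag{65}$$

where M is the magnification, M = g(zi), and

$$E_i = \exp\left[\frac{i\pi}{\lambda M}\,\frac{g_ah_i' - g_i'h_a}{h_a}\left(x_i^2 + y_i^2\right)\right], \qquad E_o = \exp\left[\frac{i\pi g_a}{\lambda h_a}\left(x_o^2 + y_o^2\right)\right] \tag{66}$$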

These quadratic factors are of little practical consequence; they measure the curvature of the wave surface arriving at the specimen and at the image. If the diffraction pattern plane coincides with the exit pupil, then Eo = 1. We write h(za) = f since this quantity is in practice close to the focal length, so that for the case zd = za,

$$E_i = \exp\left[-\frac{i\pi g_i'}{\lambda M}\left(x_i^2 + y_i^2\right)\right] \tag{67}$$

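The most important quantity in Eq. (65) is the function K(xi, yi; xo, yo), which is given by

$$K(x, y; x_o, y_o) = \frac{1}{\lambda^2f^2}\iint A(x_a, y_a)\exp\left\{-\frac{2\pi i}{\lambda f}\left[\left(x_o - \frac{x}{M}\right)x_a + \left(y_o - \frac{y}{M}\right)y_a\right]\right\}dx_a\,dy_a \tag{68}$$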

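or introducing the spatial frequency components

$$\xi = x_a/\lambda f; \qquad \eta = y_a/\lambda f \tag{69}$$

we find

$$K(x, y; x_o, y_o) = \iint A(\lambda f\xi, \lambda f\eta)\exp\left\{-2\pi i\left[\left(x_o - \frac{x}{M}\right)\xi + \left(y_o - \frac{y}{M}\right)\eta\right]\right\}d\xi\,d\eta \tag{70}$$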

In the paraxial approximation, the aperture function A is simply a mathematical device defining the area of integration in the aperture plane: A = 1 inside the pupil and A = 0 outside the pupil. If we wish to include the effect of geometric aberrations, however, we can represent them as a phase shift of the electron wave function at the exit pupil. Thus, if the lens suffers from spherical aberration, we write

$$A(x_a, y_a) = a(x_a, y_a)\exp[-i\gamma(x_a, y_a)] \tag{71}$$

in which

$$\gamma = \frac{2\pi}{\lambda}\left[\frac{1}{4}C_s\left(\frac{x_a^2 + y_a^2}{f^2}\right)^2 - \frac{1}{2}\Delta\,\frac{x_a^2 + y_a^2}{f^2}\right] = \frac{\pi\lambda}{2}\left[C_s\lambda^2(\xi^2 + \eta^2)^2 - 2\Delta(\xi^2 + \eta^2)\right] \tag{72}$$

the last term allowing for any defocus Δ, that is, any small difference between the object plane and the plane conjugate to the image plane. All the third-order geometric aberrations can be included in the phase shift γ, but we consider only Cs and the defocus Δ. This limitation is justified by the fact that Cs is the dominant aberration of objective lenses and proves to be extremely convenient because Eq. (65) relating the image and object wave functions then has the form of a convolution, which it loses if other aberrations are retained (although coma can be accommodated rather uncomfortably). It is now the amplitude function a(xa, ya) that represents the physical pupil, being equal to unity inside the opening and zero elsewhere.
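A short sketch may help to connect the two forms of the phase shift in Eq. (72) and the aperture function of Eq. (71). It evaluates γ once in pupil coordinates (xa, ya) and once in spatial-frequency coordinates (ξ, η) = (xa, ya)/λf, which should agree, and builds A = a exp(−iγ) for a circular pupil. The numerical values (Cs = 1 mm, λ = 3.7 pm, f = 2 mm, defocus 60 nm) and the function names are illustrative assumptions.

```python
import cmath
import math


def gamma_pupil(xa, ya, f, cs, defocus, wavelength):
    """First form of Eq. (72), in exit-pupil coordinates."""
    t = (xa * xa + ya * ya) / (f * f)
    return (2.0 * math.pi / wavelength) * (0.25 * cs * t * t - 0.5 * defocus * t)


def gamma_frequency(xi, eta, cs, defocus, wavelength):
    """Second form of Eq. (72), in spatial-frequency coordinates."""
    q2 = xi * xi + eta * eta
    return 0.5 * math.pi * wavelength * (cs * wavelength ** 2 * q2 * q2 - 2.0 * defocus * q2)


def aperture_function(xa, ya, f, cs, defocus, wavelength, pupil_radius):
    """Eq. (71): A = a exp(-i gamma), with a = 1 inside a circular pupil and 0 outside."""
    a = 1.0 if xa * xa + ya * ya <= pupil_radius ** 2 else 0.0
    return a * cmath.exp(-1j * gamma_pupil(xa, ya, f, cs, defocus, wavelength))


if __name__ == "__main__":
    cs, defocus, lam, f = 1.0e-3, 60.0e-9, 3.7e-12, 2.0e-3
    xa, ya = 8.0e-6, 6.0e-6
    print("gamma (pupil form)     =", gamma_pupil(xa, ya, f, cs, defocus, lam))
    print("gamma (frequency form) =", gamma_frequency(xa / (lam * f), ya / (lam * f), cs, defocus, lam))
    print("A inside a 20 um pupil =", aperture_function(xa, ya, f, cs, defocus, lam, 2.0e-5))
```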

In the light of all this, we rewrite Eq. (65) as

$$E_i\psi(x_i, y_i, z_i) = \frac{1}{M}\iint K\left(\frac{x_i}{M} - x_o, \frac{y_i}{M} - y_o\right)E_o\,\psi_o(x_o, y_o, z_o)\,dx_o\,dy_o \tag{73}$$

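Defining the Fourier transforms of ψo, ψi, and K as follows,

$$\begin{aligned}
\tilde\psi_o(\xi, \eta) &= \iint E_o\psi_o\exp[-2\pi i(\xi x_o + \eta y_o)]\,dx_o\,dy_o \\
\tilde\psi_i(\xi, \eta) &= \iint E_i\psi_i(Mx_i, My_i)\exp[-2\pi i(\xi x_i + \eta y_i)]\,dx_i\,dy_i \\
&= \frac{1}{M^2}\iint E_i\psi_i(x_i, y_i)\exp\left[-\frac{2\pi i(\xi x_i + \eta y_i)}{M}\right]dx_i\,dy_i \\
\tilde K(\xi, \eta) &= \iint K(x, y)\exp[-2\pi i(\xi x + \eta y)]\,dx\,dy
\end{aligned} \tag{74}$$

in which small departures from the conventional definitions have been introduced to assimilate inconvenient factors, Eq. (65) becomes

$$\tilde\psi_i(\xi, \eta) = \frac{1}{M}\tilde K(\xi, \eta)\,\tilde\psi_o(\xi, \eta) \tag{75}$$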

This relation is central to the comprehension of electron-optical image-forming instruments, for it tells us that the formation of an image may be regarded as a filtering operation. If K̃ were equal to unity, the image wave function would be identical with the object wave function, appropriately magnified; but in reality K̃ is not unity and different spatial frequencies of the wave leaving the specimen, ψ(xo, yo, zo), are transferred to the image with different weights. Some may be suppressed, some attenuated, some may have their sign reversed, and some, fortunately, pass through the filter unaffected. The notion of spatial frequency is the spatial analog of the temporal frequency, and we associate high spatial frequencies with fine detail and low frequencies with coarse detail; the exact interpretation is in terms of the Fourier transform, as we have seen.

We shall use Eqs. (73) and (75) to study image formation in two types of optical instruments, the transmission electron microscope (TEM) and its scanning transmission counterpart, the STEM. This is the subject of the next section.

B. Instrumental Optics: Microscopes

The conventional electron microscope (TEM) consists of a source, condenser lenses to illuminate a limited area of the specimen, an objective to provide the first stage of magnification beyond the specimen, and projector lenses, which magnify the first intermediate image or, in diffraction conditions, the pattern formed in the plane denoted by z = zd in the preceding section. In the STEM, the role of the condenser lenses is to demagnify the crossover so that a very small electron probe is formed on the specimen. Scanning coils move this probe over the surface of the latter in a regular raster, and detectors downstream measure the current transmitted. There are inevitably several detectors, because transmission microscope specimens are essentially transparent to electrons, and thus there is no diminution of the total current but there is a redistribution of the directions of motion present in the beam. Electron-optical specimens deflect electrons but do not absorb them. In the language of light optics, they are phase specimens, and the electron microscope possesses means of converting invisible phase variations into amplitude variations that the eye can see.

We now examine image formation in the TEM in more detail. We first assume, and it is a very reasonable first approximation, that the specimen is illuminated by a parallel

uniform beam of electrons or, in other words, that the wave incident on the specimen is a constant. We represent the effect of the specimen on this wave by a multiplicative specimen transparency function S(xo, yo), which is a satisfactory model for the very thin specimens employed for high-resolution work and for many other specimens. This specimen transparency is a complex function, and we write

$$S(x_o, y_o) = [1 - s(x_o, y_o)]\exp[i\varphi(x_o, y_o)] \tag{76a}$$

$$\phantom{S(x_o, y_o)} = 1 - s + i\varphi \tag{76b}$$

for small values of s and ϕ. The real term s requires some explanation, for our earlier remarks suggest that s must vanish if no electrons are halted by the specimen. We retain the term in s for two reasons. First, some electrons may be scattered inelastically in the specimen, in which case they must be regarded as lost in this simple monochromatic and hence monoenergetic version of the theory. Second, all but linear terms have been neglected in the approximate expression (76b) and, if necessary, the next-higher-order term (−½ϕ²) can be represented by s.

The wave leaving the specimen is now proportional to S; normalizing so that the constant of proportionality is unity, we substitute

$$\psi(x_o, y_o, z_o) = 1 - s + i\varphi \tag{77}$$

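into Eq. (75). Again denoting Fourier transforms by the tilde, we have

$$\tilde\psi_i(\xi, \eta) = \frac{1}{M}\tilde K(\xi, \eta)\left[\delta(\xi, \eta) - \tilde s(\xi, \eta) + i\tilde\varphi(\xi, \eta)\right] = \frac{1}{M}a\exp(-i\gamma)\left(\delta - \tilde s + i\tilde\varphi\right) \tag{78}$$

and hence

$$\psi_i(Mx_i, My_i) = \frac{1}{M}\iint a\exp(-i\gamma)\left(\delta - \tilde s + i\tilde\varphi\right)\exp[2\pi i(\xi x_i + \eta y_i)]\,d\xi\,d\eta \tag{79}$$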

The current density at the image, which is what we see on the fluorescent screen of the microscope and record on film, is proportional to $\psi_i\psi_i^*$. We find that if both ϕ and s are small,

$$M^2\psi_i\psi_i^* \approx 1 - 2\iint_{-\infty}^{\infty}a\tilde s\cos\gamma\,\exp[2\pi i(\xi x + \eta y)]\,d\xi\,d\eta + 2\iint_{-\infty}^{\infty}a\tilde\varphi\sin\gamma\,\exp[2\pi i(\xi x + \eta y)]\,d\xi\,d\eta \tag{80}$$

FIGURE 25 Function sin γ at Scherzer defocus $\Delta = (C_s\lambda)^{1/2}$.

and writing $j = M^2\psi_i\psi_i^*$ and $C = j - 1$, we see that

$$\tilde C = -2a\tilde s\cos\gamma + 2a\tilde\varphi\sin\gamma \tag{81}$$

This justifies our earlier qualitative description of image formation as a filter process. Here we see that the two families of spatial frequencies characterizing the specimen, ϕ̃ and s̃, are distorted before they reach the image by the linear filters cos γ and sin γ. The latter is by far the more important. A typical example is shown in Fig. 25. The distribution 2 sin γ can be observed directly by examining an amorphous phase specimen, a very thin carbon film, for example. The spatial frequency spectrum of such a specimen is fairly uniform over a wide range of frequencies so that C̃ ∝ sin γ. A typical spectrum is shown in Fig. 26, in which the radial intensity distribution is proportional to sin² γ. Such spectra can be used to estimate the defocus Δ and the coefficient Cs very accurately.
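The linear filter relation of Eq. (81) can be compared with an exact calculation for a weak phase object. In the one-dimensional sketch below the specimen is a pure phase cosine of amplitude 0.01 rad (so s = 0), the wave exp(iϕ) is passed through the transfer function exp(−iγ) without any approximation, and the resulting intensity is compared with the linear prediction 1 + 2 sin γ · ϕ from Eq. (80). The toy form of γ, scaled so that sin γ = −1 at the object frequency, the unit magnification, and the naive transform are all assumptions made for illustration.

```python
import cmath
import math


def dft(values, sign=-1):
    """Naive discrete Fourier transform; sign=+1 gives the unnormalised inverse."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * k * m / n)
                for m, v in enumerate(values)) for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [z / n for z in dft(spectrum, sign=+1)]


def gamma(kappa, k_ref):
    """Toy phase shift with the quartic-minus-quadratic shape of Eq. (72),
    scaled so that gamma = -pi/2 at frequency index k_ref."""
    t = (kappa / k_ref) ** 2
    return 0.5 * math.pi * (t * t - 2.0 * t)


def exact_intensity(phi, k_ref):
    """|image wave|^2 for the full object wave exp(i phi), with a = 1 everywhere."""
    n = len(phi)
    spectrum = dft([cmath.exp(1j * p) for p in phi])
    filtered = [s * cmath.exp(-1j * gamma(min(k, n - k), k_ref))
                for k, s in enumerate(spectrum)]
    return [abs(z) ** 2 for z in idft(filtered)]


def linear_intensity(phi, k0, k_ref):
    """Eq. (80) with s = 0 for a phase cosine at the single frequency index k0."""
    return [1.0 + 2.0 * math.sin(gamma(k0, k_ref)) * p for p in phi]


if __name__ == "__main__":
    n, k0 = 64, 8
    phi = [0.01 * math.cos(2 * math.pi * k0 * m / n) for m in range(n)]
    exact = exact_intensity(phi, k_ref=k0)
    linear = linear_intensity(phi, k0, k_ref=k0)
    print("largest deviation from the linear theory:",
          max(abs(a - b) for a, b in zip(exact, linear)))
```

The deviation is of second order in ϕ, which is the sense in which the weak-object approximation of Eq. (76b) is adequate for very thin specimens.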

The foregoing discussion is idealized in two respects, both serious in practice. First, the illuminating beam has been assumed to be perfectly monochromatic, whereas in reality there will be a spread of wavelengths of several parts per million; in addition, the wave incident on the specimen has been regarded as a uniform plane wave, which is equivalent to saying that it originated in an ideal point source. Real sources, of course, have a finite size, and the single plane wave should therefore be replaced by a spectrum of plane waves incident at a range of small angles to the specimen. The step from point source and monochromatic beam to finite source size and finite wavelength spread is equivalent to replacing perfectly coherent illumination by partially coherent radiation, with the wavelength spread corresponding to temporal partial coherence and the finite source size corresponding to spatial partial coherence. (We cannot discuss the legitimacy of separating these effects here, but simply state that this is almost always permissible.) Each can be represented by an envelope function, which multiplies the coherent transfer functions sin γ and cos γ. This is easily seen for the temporal partial coherence. Let us associate a probability distribution H(f), ∫H(f) df = 1, with the current density at each point in the image, the argument f being some

FIGURE 26 Spatial frequency spectrum (right) of an amorphous phase specimen (left).

convenient measure of the energy variation in the beam incident on the specimen. Hence, dj/j = H(f) df. From Eq. (80), we find

$$j = 1 - \int a\tilde sT_s\exp[2\pi i(\xi x + \eta y)]\,d\xi\,d\eta + \int a\tilde\varphi T_\varphi\exp[2\pi i(\xi x + \eta y)]\,d\xi\,d\eta \tag{82}$$

where

$$T_s = 2\int\cos\gamma(\xi, \eta, f)H(f)\,df, \qquad T_\varphi = 2\int\sin\gamma(\xi, \eta, f)H(f)\,df \tag{83}$$

and if f is a measure of the defocus variation associated with the energy spread, we may set Δ equal to Δo + f, giving

$$T_s = 2\cos\gamma\int H(f)\cos[\pi\lambda f(\xi^2 + \eta^2)]\,df, \qquad T_\varphi = 2\sin\gamma\int H(f)\cos[\pi\lambda f(\xi^2 + \eta^2)]\,df \tag{84}$$

if H(f) is even, and a slightly longer expression when it is not.

The familiar sin γ and cos γ are thus clearly seen to be modulated by an envelope function, which is essentially the Fourier transform of H(f). A similar result can be obtained for the effect of spatial partial coherence, but the demonstration is longer. Some typical envelope functions are shown in Fig. 27.
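The envelope integral in Eq. (84) can be evaluated explicitly once a form is chosen for H(f). The sketch below assumes a Gaussian distribution of defocus with standard deviation σ, an assumption not made in the text, for which the integral has the closed form exp[−½(πλσq²)²] with q² = ξ² + η². A trapezoidal quadrature is compared with this closed form; the numerical values and function names are illustrative.

```python
import math


def gaussian_h(f, sigma):
    """Assumed defocus distribution H(f): normalised Gaussian of standard deviation sigma."""
    return math.exp(-0.5 * (f / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def envelope_numeric(q, wavelength, sigma, steps=4000, span=8.0):
    """Trapezoidal estimate of the integral of H(f) cos(pi lambda f q^2) over f."""
    df = 2.0 * span * sigma / steps
    total = 0.0
    for k in range(steps + 1):
        f = -span * sigma + k * df
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * gaussian_h(f, sigma) * math.cos(math.pi * wavelength * f * q * q)
    return total * df


def envelope_analytic(q, wavelength, sigma):
    """Closed form of the same integral for the Gaussian H(f)."""
    return math.exp(-0.5 * (math.pi * wavelength * sigma * q * q) ** 2)


if __name__ == "__main__":
    wavelength, sigma = 3.7e-12, 5.0e-9  # metres (illustrative)
    for q in (1.0e9, 2.0e9, 3.0e9, 4.0e9, 5.0e9):
        print(f"q = {q:.1e} 1/m: numeric = {envelope_numeric(q, wavelength, sigma):.6f}, "
              f"analytic = {envelope_analytic(q, wavelength, sigma):.6f}")
```

The envelope falls off as exp(−const · q⁴), so a narrower energy spread (smaller σ) extends the transfer of fine detail to higher spatial frequencies.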

An important feature of the function sin γ is that it gives us a convenient measure of the resolution of the microscope. Beyond the first zero of the function, information is no longer transferred faithfully, but in the first zone the transfer is reasonably correct until the curve begins to dip toward zero for certain privileged values of the defocus, $\Delta = (C_s\lambda)^{1/2}$, $(3C_s\lambda)^{1/2}$, and $(5C_s\lambda)^{1/2}$; for the first of these values, known as the Scherzer defocus,

FIGURE 27 Envelope functions characterizing (a) spatial and (b) temporal partial coherence.

the zero occurs at the spatial frequency $(C_s\lambda^3)^{-1/4}$; the reciprocal of this multiplied by one of various factors has long been regarded as the resolution limit of the electron microscope, but transfer function theory enables us to understand the content of the image in the vicinity of the limit in much greater detail. The arrival of commercial electron microscopes equipped with spherical aberration correctors is having a profound influence on the practical exploitation of transfer theory. Hitherto, the effect of spherical aberration dictated the mode of operation of the TEM when the highest resolution was required. When the coefficient of spherical aberration has been rendered very small by correction, this defect is no longer the limiting factor and other modes of operation become of interest.
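The orders of magnitude involved can be seen from a short calculation. The sketch below evaluates the Scherzer defocus, the length scale (Csλ³)^{1/4}, and the first nonzero root of γ for illustrative values Cs = 1 mm and λ = 3.7 pm. With γ taken exactly as in Eq. (72) and Δ = (Csλ)^{1/2}, the root falls at √2 (Csλ³)^{−1/4}, that is, at the spatial frequency quoted above up to a numerical factor of order unity. The function names and parameter values are assumptions.

```python
import math


def aberration_phase(q, cs, defocus, wavelength):
    """gamma of Eq. (72) as a function of the radial spatial frequency q."""
    q2 = q * q
    return 0.5 * math.pi * wavelength * (cs * wavelength ** 2 * q2 * q2 - 2.0 * defocus * q2)


def scherzer_defocus(cs, wavelength):
    """Scherzer defocus (Cs lambda)^(1/2)."""
    return math.sqrt(cs * wavelength)


def resolution_scale(cs, wavelength):
    """Length scale (Cs lambda^3)^(1/4) that sets the resolution limit."""
    return (cs * wavelength ** 3) ** 0.25


def first_zero_frequency(cs, defocus, wavelength):
    """Smallest q > 0 at which gamma, and hence sin gamma, vanishes."""
    return math.sqrt(2.0 * defocus / (cs * wavelength ** 2))


if __name__ == "__main__":
    cs, wavelength = 1.0e-3, 3.7e-12  # metres (illustrative)
    delta = scherzer_defocus(cs, wavelength)
    q0 = first_zero_frequency(cs, delta, wavelength)
    print(f"Scherzer defocus      = {delta * 1e9:.1f} nm")
    print(f"(Cs lambda^3)^(1/4)   = {resolution_scale(cs, wavelength) * 1e9:.3f} nm")
    print(f"first zero of sin(g)  = {q0 * 1e-9:.3f} 1/nm")
```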

We now turn to the STEM. Here a bright source, typically a field-emission gun, is focused onto the specimen; the small probe is scanned over the surface and, well beyond the specimen, a far-field diffraction pattern of each elementary object area is formed. This pattern is sampled by a structured detector, which in the simplest case consists of a plate with a hole in the center, behind which is another plate or, more commonly, an energy analyzer. The signals from the various detectors are displayed on cathode-ray tubes, locked in synchronism with the scanning coils of the microscope. The reason for this combination of annular detector and central detector is to be found in the laws describing electron scattering. The electrons incident on a thin specimen may pass through unaffected; or they may be deflected with virtually no transfer of energy to the specimen, in which case they are said to be elastically scattered; or they may be deflected and lose energy, in which case they are inelastically scattered. The important point is that, on average, inelastically scattered electrons are deflected through smaller angles than those scattered elastically, with the result that the annular detector receives mostly elastically scattered particles, whereas the central detector collects those that have suffered inelastic collisions. The latter therefore have a range of energies, which can be separated by means of an energy analyzer, and we could, for example, form an image with the electrons corresponding to the most probable energy loss for some particular chemical element of interest. Another imaging mode exploits electrons that have been Rutherford scattered through rather large angles.

These modes of STEM image formation and others that we shall meet below can be explained in terms of a transfer function theory analogous to that derived for the TEM. This is not surprising, for many of the properties of the STEM can be understood by regarding it as an inverted TEM, the TEM gun corresponding to the small central detector of the STEM and the large recording area of the TEM to the source in the STEM, spread out over a large zone if we project back the scanning probe. We will not pursue this analogy here, but most texts on the STEM explore it in some detail. Consider now a probe centered on a point $\mathbf{x}_o = \boldsymbol{\xi}$ in the specimen plane of the STEM. We shall use a vector notation here, so that $\mathbf{x}_o = (x_o, y_o)$, and similarly for other coordinates. The wave emerging from the specimen will be given by

$$\psi(\mathbf{x}_o; \boldsymbol{\xi}) = S(\mathbf{x}_o)K(\boldsymbol{\xi} - \mathbf{x}_o) \tag{85}$$

in which $S(\mathbf{x}_o)$ is again the specimen transparency and K describes the incident wave and, in particular, the effect of the pupil size, defocus, and aberrations of the probe-forming lens, the last member of the condenser system. Far below the specimen, in the detector plane (subscript d) the wave function is given by

$$\psi_d(\mathbf{x}_d, \boldsymbol{\xi}) = \int S(\mathbf{x}_o)K(\boldsymbol{\xi} - \mathbf{x}_o)\exp(-2\pi i\,\mathbf{x}_d\cdot\mathbf{x}_o/\lambda R)\,d\mathbf{x}_o \tag{86}$$

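in which R is a measure of the effective distance between the specimen and the detector. The shape of the detector (and its response if this is not uniform) can most easily be expressed by introducing a detector function $D(\mathbf{x}_d)$, equal to zero outside the detector and equal to its response, usually uniform, over its surface. The detector records incident current, and the signal generated is therefore proportional to

$$\begin{aligned}
j_d(\boldsymbol{\xi}) &= \int|\psi_d(\mathbf{x}_d; \boldsymbol{\xi})|^2D(\mathbf{x}_d)\,d\mathbf{x}_d \\
&= \iiint S(\mathbf{x}_o)S^*(\mathbf{x}_o')K(\boldsymbol{\xi} - \mathbf{x}_o)K^*(\boldsymbol{\xi} - \mathbf{x}_o')\exp[-2\pi i\,\mathbf{x}_d\cdot(\mathbf{x}_o - \mathbf{x}_o')/\lambda R]\,D(\mathbf{x}_d)\,d\mathbf{x}_o\,d\mathbf{x}_o'\,d\mathbf{x}_d
\end{aligned} \tag{87}$$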

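or introducing the Fourier transform of the detector response,

$$j_d(\boldsymbol{\xi}) = \iint S(\mathbf{x}_o)S^*(\mathbf{x}_o')K(\boldsymbol{\xi} - \mathbf{x}_o)K^*(\boldsymbol{\xi} - \mathbf{x}_o')\,\tilde D\left(\frac{\mathbf{x}_o - \mathbf{x}_o'}{\lambda R}\right)d\mathbf{x}_o\,d\mathbf{x}_o' \tag{88}$$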

We shall use this formula below to analyze the signals collected by the simpler detectors, but first we derive the STEM analog of the filter Eq. (81). For this we introduce

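the Fourier transforms of S and K into the expression for $\psi_d(\mathbf{x}_d, \boldsymbol{\xi})$. Setting $\mathbf{u} = \mathbf{x}_d/\lambda R$, we obtain

$$\begin{aligned}
\psi_d(\lambda R\mathbf{u}; \boldsymbol{\xi}) &= \int S(\mathbf{x}_o)K(\boldsymbol{\xi} - \mathbf{x}_o)\exp(-2\pi i\,\mathbf{u}\cdot\mathbf{x}_o)\,d\mathbf{x}_o \\
&= \iiint\tilde S(\mathbf{p})\tilde K(\mathbf{q})\exp[-2\pi i\,\mathbf{x}_o\cdot(\mathbf{u} - \mathbf{p} + \mathbf{q})]\exp(2\pi i\,\mathbf{q}\cdot\boldsymbol{\xi})\,d\mathbf{p}\,d\mathbf{q}\,d\mathbf{x}_o \\
&= \iint\tilde S(\mathbf{p})\tilde K(\mathbf{q})\,\delta(\mathbf{u} - \mathbf{p} + \mathbf{q})\exp(2\pi i\,\mathbf{q}\cdot\boldsymbol{\xi})\,d\mathbf{p}\,d\mathbf{q} \\
&= \int\tilde S(\mathbf{p})\tilde K(\mathbf{p} - \mathbf{u})\exp[2\pi i\,\boldsymbol{\xi}\cdot(\mathbf{p} - \mathbf{u})]\,d\mathbf{p}
\end{aligned} \tag{89}$$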

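After some calculation, we obtain an expression for $j_d(\boldsymbol{\xi}) = \int j(\mathbf{x}_d; \boldsymbol{\xi})D(\mathbf{x}_d)\,d\mathbf{x}_d$ and hence for its Fourier transform

$$\tilde j_d(\mathbf{p}) = \int\tilde j(\mathbf{x}_d; \mathbf{p})D(\mathbf{x}_d)\,d\mathbf{x}_d \tag{90}$$

Explicitly,

$$\tilde j_d(\mathbf{p}) = \delta(\mathbf{p})\int|\tilde K(\mathbf{x}_d/\lambda R)|^2D(\mathbf{x}_d)\,d\mathbf{x}_d - \tilde s(\mathbf{p})\int q_s(\mathbf{x}_d/\lambda R; \mathbf{p})D(\mathbf{x}_d)\,d\mathbf{x}_d + i\tilde\varphi(\mathbf{p})\int q_\varphi(\mathbf{x}_d/\lambda R; \mathbf{p})D(\mathbf{x}_d)\,d\mathbf{x}_d \tag{91}$$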

for weakly scattering objects, s ≪ 1, ϕ ≪ 1. The spatial frequency spectrum of the bright-field image signal is thus related to s and ϕ by a filter relation very similar to that obtained for the TEM.

We now return to Eqs. (87) and (88) to analyze the annular and central detector configuration. For a small axial detector, we see immediately that

$$j_d(\boldsymbol{\xi}) \propto \left|\int S(\mathbf{x}_o)K(\boldsymbol{\xi} - \mathbf{x}_o)\,d\mathbf{x}_o\right|^2 \tag{92}$$

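which is very similar to the image observed in a TEM. For an annular detector, we divide $S(\mathbf{x}_o)$ into an unscattered and a scattered part, $S(\mathbf{x}_o) = 1 + \sigma_s(\mathbf{x}_o)$. The signal consists of two main contributions, one of the form $\int[\sigma_s(\mathbf{x}_o) + \sigma_s^*(\mathbf{x}_o)]\,|K(\boldsymbol{\xi} - \mathbf{x}_o)|^2\,d\mathbf{x}_o$, and the other $\int|\sigma_s(\mathbf{x}_o)|^2\,|K(\boldsymbol{\xi} - \mathbf{x}_o)|^2\,d\mathbf{x}_o$. The latter term usually dominates.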
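The behavior of the central and annular detectors can be explored with a one-dimensional discretisation of Eqs. (86) and (87). In the sketch below the probe K is a Gaussian, the specimen is a pure phase object containing a single Gaussian "atom", and the detector function D selects ranges of the frequency index in the far field. All of these choices, together with the grid size and the detector radii, are assumptions made for illustration. For a pure phase specimen the signal on a detector that collects everything is independent of probe position, in line with the earlier remark that such specimens redistribute the current without reducing it.

```python
import cmath
import math

N = 64  # samples across the (periodic) specimen plane


def dft(values):
    """Naive discrete Fourier transform (far-field pattern of the exit wave)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * m / n)
                for m, v in enumerate(values)) for k in range(n)]


def probe(xi, x, sigma=4.0):
    """Gaussian stand-in for K(xi - x_o), with periodic wrap-around."""
    d = (xi - x + N / 2) % N - N / 2
    return math.exp(-d * d / (2.0 * sigma ** 2))


def specimen(x, centre=32.0, width=1.5, peak_phase=1.0):
    """Pure phase object S = exp(i phi) with one Gaussian 'atom'."""
    phi = peak_phase * math.exp(-(x - centre) ** 2 / (2.0 * width ** 2))
    return cmath.exp(1j * phi)


def detector_signal(xi, detector):
    """Discrete analogue of Eq. (87): sum of |psi_d|^2 weighted by D."""
    exit_wave = [specimen(x) * probe(xi, x) for x in range(N)]
    psi_d = dft(exit_wave)
    return sum(abs(a) ** 2 * detector(min(k, N - k)) for k, a in enumerate(psi_d))


def everything(kappa):
    return 1.0


def central(kappa):
    return 1.0 if kappa <= 2 else 0.0


def annular(kappa):
    return 1.0 if kappa >= 10 else 0.0


if __name__ == "__main__":
    for xi in range(0, N, 4):
        print(f"xi = {xi:2d}: total = {detector_signal(xi, everything):9.3f}, "
              f"central = {detector_signal(xi, central):9.3f}, "
              f"annular = {detector_signal(xi, annular):9.5f}")
```

The annular signal is appreciable only when the probe sits on the scatterer, whereas the central signal dips there, which is the complementary behavior described in the text.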

We have seen that the current distribution in the detector plane at any instant is the far-field diffraction pattern of the object element currently illuminated. The fact that we have direct access to this wealth of information about the specimen is one of the remarkable and attractive features of the STEM, rendering possible imaging modes that present insuperable difficulties in the TEM. The simple detectors so far described hardly exploit this wealth of information at all, since only two total currents are measured, one falling on the central region, the other on the annular detector. A slightly more complicated geometry permits us to extract directly information about the phase variation $\varphi(\mathbf{x}_o)$ of the specimen transparency $S(\mathbf{x}_o)$. Here the detector is divided into four quadrants, and by forming appropriate linear combinations of the four signals thus generated, the gradient of the phase variation can be displayed immediately. This technique has been used to study the magnetic fields across domain boundaries in magnetic materials.

Other detector geometries have been proposed, and it is of interest that it is not necessary to equip the microscope with a host of different detectors, provided that the instrument has been interfaced to a computer. It is one of the conveniences of all scanning systems that the signal that serves to generate the image is produced sequentially and can therefore be dispatched directly to computer memory for subsequent or on-line processing if required. By forming the far-field diffraction pattern not on a single large detector but on a honeycomb of very small detectors and reading the currents detected by each cell into framestore memory, complete information about each elementary object area can be recorded. Framestore memory can be programmed to perform simple arithmetic operations, and the framestore can thus be instructed to multiply the incoming intensity data by 1 or 0 in such a way as to mimic any desired detector geometry. The signals from connected regions of the detector (quadrants, for example) are then added, and the total signal on each part is then stored, after which the operation is repeated for the next object element under the probe. Alternatively, the image of each elementary object area can be exploited to extract information about the phase and amplitude of the electron wave emerging from the specimen.
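The idea of multiplying the recorded intensities by 1 or 0 to mimic a detector geometry can be sketched in a few lines. The example below represents one far-field pattern as a small square array, builds masks for left and right halves, a central disc, and an annulus, and sums the masked intensities. The toy pattern (a uniform disc that may be displaced, as a crude model of a beam deflected by a phase gradient), the array size, and the function names are assumptions.

```python
def make_mask(size, predicate):
    """Array of 0/1 weights; predicate receives (ky, kx) measured from the pattern centre."""
    c = (size - 1) / 2.0
    return [[1 if predicate(i - c, j - c) else 0 for j in range(size)]
            for i in range(size)]


def detector_signal(pattern, mask):
    """Multiply the recorded intensities by the mask and add them up."""
    return sum(p * m for prow, mrow in zip(pattern, mask) for p, m in zip(prow, mrow))


def disc_pattern(size, radius, shift_x=0.0, shift_y=0.0):
    """Toy far-field pattern: uniform disc, optionally displaced from the axis."""
    c = (size - 1) / 2.0
    return [[1.0 if (i - c - shift_y) ** 2 + (j - c - shift_x) ** 2 <= radius ** 2 else 0.0
             for j in range(size)] for i in range(size)]


SIZE = 16
RIGHT = make_mask(SIZE, lambda ky, kx: kx > 0)
LEFT = make_mask(SIZE, lambda ky, kx: kx < 0)
CENTRE = make_mask(SIZE, lambda ky, kx: ky * ky + kx * kx < 9.0)
ANNULUS = make_mask(SIZE, lambda ky, kx: ky * ky + kx * kx >= 36.0)

if __name__ == "__main__":
    for label, pattern in (("undeflected", disc_pattern(SIZE, 4.0)),
                           ("deflected", disc_pattern(SIZE, 4.0, shift_x=2.0))):
        difference = detector_signal(pattern, RIGHT) - detector_signal(pattern, LEFT)
        print(f"{label:12s}: right - left = {difference:+.0f}, "
              f"centre = {detector_signal(pattern, CENTRE):.0f}, "
              f"annulus = {detector_signal(pattern, ANNULUS):.0f}")
```

The same stored patterns can be reprocessed with any number of masks after the scan, which is the practical advantage the text points out.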

A STEM imaging mode that is capable of furnishing very high resolution images has largely superseded the modes described above. Electrons scattered through relatively wide angles (Rutherford scattering) and collected by an annular detector with appropriate dimensions form an "incoherent" image of the specimen structure, but with phase information converted into amplitude variations in the image. Atomic columns can be made visible by this technique, which is rapidly gaining importance.

The effect of partial coherence in the STEM can be analyzed by a reasoning similar to that followed for the TEM; we will not reproduce this here.

C. Image Processing

1. Interference and Holography

The resolution of electron lenses is, as we have seen, limited by the spherical aberration of the objective lens, and many types of correctors have been devised in the hope of overcoming this limit. It was realized by Dennis Gabor in the late 1940s, however, that although image detail beyond the limit cannot be discerned by eye, the information is still there if only we could retrieve it. The method he proposed for doing this was holography, but it was many years before his idea could be successfully put into practice; this had to await the invention of the laser and the development of high-performance electron microscopes. With the detailed understanding of electron image formation, the intimate connection between electron holography, electron interference, and transfer theory has become much clearer, largely thanks to Karl-Joseph Hanszen and colleagues in Braunschweig. The electron analogs of the principal holographic modes have been thoroughly explored with the aid of the Möllenstedt biprism. In the hands of Akira Tonomura in Tokyo and Hannes Lichte in Tübingen, electron holography has become a tool of practical importance.

The simplest type of hologram is the Fraunhofer in-line hologram, which is none other than a defocused electron image. Successful reconstruction requires a very coherent source (a field-emission gun) and, if the reconstruction is performed light-optically rather than digitally, glass lenses with immense spherical aberration. Such holograms should permit high-contrast detection of small weak objects.

The next degree of complexity is the single-sideband hologram, which is a defocused micrograph obtained with half of the diffraction pattern plane obscured. From the two complementary holograms obtained by obscuring each half in turn, separate phase and amplitude reconstruction is, in principle, possible. Unfortunately, this procedure is extremely difficult to put into practice, because charge accumulates along the edge of the plane that cuts off half the aperture and severely distorts the wave fronts in its vicinity; compensation is possible, but the usefulness of the technique is much reduced.

FIGURE 28 (Left) Ray diagram showing how an electron hologram is formed. (Right) Cross-section of an electron microscope equipped for holography. [From Tonomura, A. (1999). “Electron Holography,” Springer-Verlag, Berlin/New York.]

In view of these comments, it is not surprising that off-axis holography, in which a true reference wave interferes with the image wave in the recording plane, has completely supplanted these earlier arrangements. In the in-line methods, the reference wave is, of course, to be identified with the unscattered part of the main beam. Figure 28 shows an arrangement suitable for obtaining the hologram; the reference wave and image wave are recombined by the electrostatic counterpart of a biprism. In the reconstruction step, a reference wave must again be suitably combined with the wave field generated by the hologram, and the most suitable arrangement has been found to be that of the Mach–Zehnder interferometer. Many spectacular results have been obtained in this way, largely thanks to the various interference techniques developed by the school of A. Tonomura and the Bolognese group. Here, the reconstructed image is made to interfere with a plane wave. The two may be exactly aligned and yield an interference pattern representing the magnetic field in the specimen, for example; often, however, it is preferable to arrange that they are slightly inclined with respect to one another since phase “valleys” can then be distinguished from “hills.” In another arrangement, the twin images are made to interfere, thereby amplifying the corresponding phase shifts twofold (or more, if higher order diffracted beams are employed).


FIGURE 29 Arrangement of lenses and mirrors suitable for interference microscopy. [From Tonomura, A. (1999). “Electron Holography,” Springer-Verlag, Berlin/New York.]

Electron holography has a great many ramifications, which we cannot describe here, but we repeat that many of the problems that arise in the reconstruction step vanish if the hologram is available in digital form and can hence be processed in a computer. We now examine the related techniques, although not specifically in connection with holography.

2. Digital Processing

If we can sample and measure the gray levels of the electron image accurately and reliably, we can employ the computer to process the resulting matrix of image gray-level measurements in many ways. The simplest techniques, usually known as image enhancement, help to adapt the image to the visual response or to highlight features of particular interest. Many of these are routinely available on commercial scanning microscopes, and we will say no more about them here. The class of methods that allow image restoration to be achieved offer solutions of more difficult problems. Restoration filters, for example, reduce the adverse effect of the transfer functions of Eq. (81). Here, we record two or more images with different values of the defocus and hence with different forms of the transfer function and seek the weighted linear combinations of these images, or rather of their spatial frequency spectra, that yield the best estimates (in the least-squares sense) of ϕ and s. By using a focal series of such images, we can both cancel, or at least substantially reduce, the effect of the transfer functions sin γ and cos γ and fill in the information missing from each individual picture around the zeros of these functions.

Another problem of considerable interest, in other fields as well as in electron optics, concerns the phase of the object wave for strongly scattering objects. We have seen that the specimens studied in transmission microscopy are essentially transparent: The image is formed not by absorption but by scattering. The information about the specimen is therefore in some sense coded in the angular distribution of the electron trajectories emerging from the specimen. In an ideal system, this angular distribution would be preserved, apart from magnification effects, at the image and no contrast would be seen. Fortunately, however, the microscope is imperfect; contrast is generated by the loss of electrons scattered through large angles and intercepted by the diaphragm or “objective aperture” and by the crude analog of a phase plate provided by the combination of spherical aberration and defocus. It is the fact that the latter affects the angular distribution within the beam and converts it to a positional dependence with a fidelity that is measured by the transfer function sin γ that is important. The resulting contrast can be related simply to the specimen transparency only if the phase and amplitude variations are small, however, and this is true of only a tiny class of specimens. For many of the remainder, the problem remains. It can be expressed graphically by saying that we know from our intensity record where the electrons arrive (amplitude) but not their directions of motion at the point of arrival (phase). Several ways of obtaining this missing information have been proposed, many inspired by the earliest suggestion, the Gerchberg–Saxton algorithm. Here, the image and diffraction pattern of exactly the same area are recorded, and the fact that the corresponding wave functions are related by a Fourier transform is used to find the phase iteratively. First, the known amplitudes in the image, say, are given arbitrary phases and the Fourier transform is calculated; the amplitudes thus found are then replaced by the known diffraction pattern amplitudes and the process is repeated. After several iterations, the unknown phases should be recovered. This procedure encounters many practical difficulties and some theoretical ones as well, since the effect of noise is difficult to incorporate. This and several related algorithms have now been thoroughly studied and their reliability is well understood. In these iterative procedures, two signals generated by the object are required (image and diffraction pattern or two images at different defocus values in particular). If a STEM is used, this multiplicity of information is available in a single record if the intensity distribution associated with every object pixel is recorded and not reduced to one or a few summed values. A sequence of Fourier transforms and mask operations that generate the phase and amplitude of the electron wave has been devised by John Rodenburg.
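The Gerchberg–Saxton iteration described above can be illustrated with a small one-dimensional toy in Python. This is only a sketch under stated assumptions: the function names (`dft`, `with_amplitude`, `gerchberg_saxton`), the synthetic test wave, the random starting phases, and the iteration count are all hypothetical choices made for the illustration, and a slow direct Fourier sum replaces the fast transforms a real implementation would use.

```python
import cmath
import math
import random


def dft(values, inverse=False):
    """Unitary discrete Fourier transform (direct O(n^2) sum, demo only)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(v * cmath.exp(sign * 2j * math.pi * m * k / n)
                    for k, v in enumerate(values))
        for m in range(n)
    ]


def with_amplitude(wave, amplitudes):
    """Keep the phase of each sample but impose the measured amplitude."""
    return [a * cmath.exp(1j * cmath.phase(w)) for w, a in zip(wave, amplitudes)]


def gerchberg_saxton(image_amp, diff_amp, iterations=30, seed=0):
    rng = random.Random(seed)
    # Known image amplitudes are given arbitrary starting phases.
    wave = [a * cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for a in image_amp]
    errors = []
    for _ in range(iterations):
        spectrum = dft(wave)
        # Mismatch between computed and measured diffraction amplitudes.
        errors.append(math.sqrt(sum((abs(s) - d) ** 2
                                    for s, d in zip(spectrum, diff_amp))))
        # Replace the computed amplitudes by the measured ones, transform
        # back, and re-impose the measured image amplitudes.
        spectrum = with_amplitude(spectrum, diff_amp)
        wave = with_amplitude(dft(spectrum, inverse=True), image_amp)
    return wave, errors


# Synthetic "specimen" wave (hypothetical): smooth amplitude, random phase.
n = 16
_rng = random.Random(1)
true_wave = [(1.0 + 0.5 * math.cos(2 * math.pi * k / n))
             * cmath.exp(1j * _rng.uniform(-1.0, 1.0)) for k in range(n)]
image_amp = [abs(w) for w in true_wave]
diff_amp = [abs(s) for s in dft(true_wave)]
estimate, errors = gerchberg_saxton(image_amp, diff_amp)
```

With a unitary transform the diffraction-plane mismatch cannot increase from one iteration to the next, but, as the text cautions, a small mismatch does not by itself guarantee that the true phases have been found.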

A very different group of methods has grown up around the problem of three-dimensional reconstruction. The two-dimensional projected image that we see in the microscope often gives very little idea of the complex spatial relationships of the true structure, and techniques have therefore been developed for reconstructing the latter. They consist essentially in combining the information provided by several different views of the specimen, supplemented if possible by prior knowledge of an intrinsic symmetry of the structure. The fact that several views are required reminds us that not all specimens can withstand the electron onslaught that such multiple exposure represents. Indeed, there are interesting specimens that cannot be directly observed at all, because they are destroyed by the electron dose that would be needed to form a discernible image. Very low dose imaging must therefore be employed, and this has led to the development of an additional class of image restoration methods. Here, the aim is first to detect the structures, invisible to the unaided eye, and superimpose low-dose images of identical structures in such a way that the signal increases more rapidly than the noise and so gradually emerges from the surrounding fog. Three-dimensional reconstruction may then be the next step. The problem here, therefore, is first to find the structures, then to align them in position and orientation with the precision needed to achieve the desired resolution. Some statistical check must be applied to be sure that all the structures found are indeed the same and not members of distinct groups that bear a resemblance to one another but are not identical. Finally, individual members of the same group are superposed. Each step demands a different treatment. The individual structures are first found by elaborate cross-correlation calculations. Cross-correlation likewise enables us to align them with high precision. Multivariate analysis is then used to classify them into groups or to prove that they do, after all, belong to the same group and, a very important point, to assign probabilities to their membership of a particular group.
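The align-and-superpose step can be sketched in one dimension as follows. This is a hedged toy, not a description of any production package: the reference profile `template`, the helper names, the noise level, and the use of circular shifts only (no rotation, no classification step) are all assumptions introduced for the illustration.

```python
import random


def shifted(values, d):
    """Circularly shift a profile to the right by d samples."""
    n = len(values)
    return [values[(k - d) % n] for k in range(n)]


def circular_cross_correlation(image, reference):
    """cc[s] = sum_i image[i + s] * reference[i], with wrap-around."""
    n = len(reference)
    return [sum(image[(i + s) % n] * reference[i] for i in range(n))
            for s in range(n)]


def best_shift(reference, image):
    """Shift at which the image correlates most strongly with the reference."""
    cc = circular_cross_correlation(image, reference)
    return max(range(len(cc)), key=cc.__getitem__)


def align_and_average(reference, images):
    """Undo each image's estimated shift, then average the aligned copies."""
    n = len(reference)
    total = [0.0] * n
    for img in images:
        s = best_shift(reference, img)
        for i in range(n):
            total[i] += img[(i + s) % n]
    return [v / len(images) for v in total]


# Hypothetical structure and a stack of shifted, noisy low-dose "images".
template = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
rng = random.Random(0)
stack = []
for _ in range(200):
    d = rng.randrange(len(template))
    stack.append([v + rng.gauss(0.0, 0.5) for v in shifted(template, d)])
average = align_and_average(template, stack)
```

For uncorrelated noise the summed signal grows in proportion to the number of images while the noise grows only as its square root, which is the sense in which the structure "emerges from the fog". Aligning against a fixed reference can bias the average toward that reference, which is one reason the statistical checks mentioned above matter.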

IV. CONCLUDING REMARKS

Charged-particle optics has never remained stationary with the times, but the greatest upheaval has certainly been that caused by the widespread availability of large, fast computers. Before, the analysis of electron lenses relied heavily on rather simple field or potential models, and much ingenuity was devoted to finding models that were at once physically realistic and mathematically tractable. Apart from sets of measurements, guns were almost virgin territory. The analysis of in-lens deflectors would have been unthinkable but fortunately was not indispensable since even the word microminiaturization had not yet been coined. Today, it is possible to predict with great accuracy the behavior of almost any system; it is even possible to obtain aberration coefficients, not by evaluating the corresponding integrals, themselves obtained as a result of exceedingly long and tedious algebra, but by solving the exact ray equations and fitting the results to the known aberration pattern. This is particularly valuable when parasitic aberrations, for which aberration integrals are not much help, are being studied. Moreover, the aberration integrals can themselves now be established not by long hours of laborious calculation, but by means of one of the computer algebra languages. A knowledge of the fundamentals of the subject, presented here, will always be necessary for students of the subject, but modern numerical methods now allow them to go as deeply as they wish into the properties of the most complex systems.

SEE ALSO THE FOLLOWING ARTICLES

ACCELERATOR PHYSICS AND ENGINEERING • HOLOGRAPHY • QUANTUM OPTICS • SCANNING ELECTRON MICROSCOPY • SCATTERING AND RECOILING SPECTROSCOPY • SIGNAL PROCESSING, DIGITAL • WAVE PHENOMENA

BIBLIOGRAPHY

Carey, D. C. (1987). “The Optics of Charged Particle Beams,” Harwood Academic, London.

Chapman, J. N., and Craven, A. J., eds. (1984). “Quantitative Electron Microscopy,” SUSSP, Edinburgh.

Dragt, A. J., and Forest, E. (1986). Adv. Electron. Electron Phys. 67, 65–120.

Feinerman, A. D., and Crewe, D. A. (1998). “Miniature electron optics.” Adv. Imaging Electron Phys. 102, 187–234.

Frank, J. (1996). “Three-Dimensional Electron Microscopy of Macromolecular Assemblies,” Academic Press, San Diego.

Glaser, W. (1952). “Grundlagen der Elektronenoptik,” Springer-Verlag, Vienna.

Glaser, W. (1956). Elektronen- und Ionenoptik, Handb. Phys. 33, 123–395.

Grivet, P. (1972). “Electron Optics,” 2nd Ed. Pergamon, Oxford.

Hawkes, P. W. (1970). Adv. Electron. Electron Phys., Suppl. 7. Academic Press, New York.

Hawkes, P. W., ed. (1973). “Image Processing and Computer-Aided Design in Electron Optics,” Academic Press, New York.

Hawkes, P. W., ed. (1980). “Computer Processing of Electron Microscope Images,” Springer-Verlag, Berlin and New York.

Hawkes, P. W., ed. (1982). “Magnetic Electron Lenses,” Springer-Verlag, Berlin and New York.

Hawkes, P. W., and Kasper, E. (1989, 1994). “Principles of Electron Optics,” Academic Press, San Diego.

Hawkes, P. W., ed. (1994). “Selected Papers on Electron Optics,” SPIE Milestones Series, Vol. 94.

Humphries, S. (1986). “Principles of Charged Particle Acceleration.” (1990). “Charged Particle Beams,” Wiley-Interscience, New York and Chichester.

Lawson, J. D. (1988). “The Physics of Charged-Particle Beams,” Oxford Univ. Press, Oxford.

Lencova, B. (1997). Electrostatic Lenses. In “Handbook of Charged Particle Optics” (J. Orloff, ed.), pp. 177–221, CRC Press, Boca Raton, FL.

Livingood, J. J. (1969). “The Optics of Dipole Magnets,” Academic Press, New York.

Orloff, J., ed. (1997). “Handbook of Charged Particle Optics,” CRC Press, Boca Raton, FL.

Reimer, L. (1997). “Transmission Electron Microscopy,” Springer-Verlag, Berlin and New York.

Reimer, L. (1998). “Scanning Electron Microscopy,” Springer-Verlag, Berlin and New York.

Saxton, W. O. (1978). Adv. Electron. Electron Phys., Suppl. 10. Academic Press, New York.

Septier, A., ed. (1967). “Focusing of Charged Particles,” Vols. 1 and 2, Academic Press, New York.

Septier, A., ed. (1980–1983). Adv. Electron. Electron Phys., Suppl. 13A–C. Academic Press, New York.

Tonomura, A. (1999). “Electron Holography,” Springer-Verlag, Berlin.

Tsuno, K. (1997). Magnetic Lenses for Electron Microscopy. In “Handbook of Charged Particle Optics” (J. Orloff, ed.), pp. 143–175, CRC Press, Boca Raton, FL.

Wollnik, H. (1987). “Optics of Charged Particles,” Academic Press, Orlando.


Encyclopedia of Physical Science and Technology EN005B197 June 8, 2001 19:35

Elasticity

Herbert Reismann
State University of New York at Buffalo

I. One-Dimensional Considerations
II. Stress
III. Strain
IV. Hooke’s Law and Its Limits
V. Strain Energy
VI. Equilibrium and the Formulation of Boundary Value Problems
VII. Examples

GLOSSARY

Anisotropy A medium is said to be anisotropic if the value of a measured, physical field quantity depends on the orientation (or direction) of measurement.

Eigenvalue and eigenvector Consider the matrix equation AX = λX, where A is an n × n square matrix, and X is an n-dimensional column vector. In this case, the scalar λ is an eigenvalue, and X is the associated eigenvector.

Isotropy A medium is said to be isotropic if the value of a measured, physical field quantity is independent of orientation.

ELASTICITY THEORY is the (mathematical) study of the behavior of those solids that have the property of recovering their size and shape when the forces that cause the deformation are removed. To some extent, almost all solids display this property. In this article, most of the discussion will be limited to the special case of linearly elastic solids, where deformation is proportional to applied forces. This topic is usually referred to as classical elasticity theory. This branch of mathematical physics was formulated during the nineteenth century and, since its inception, has been developed and refined to form the background and foundation for disciplines such as structural mechanics; stress analysis; strength of materials; plates and shells; solid mechanics; and wave propagation and vibrations in solids. These topics are fundamental to solving present-day problems in many branches of modern engineering and applied science. They are used by structural (civil) engineers, aerospace engineers, mechanical engineers, geophysicists, geologists, and bioengineers, to name a few. The deformation, vibrations, and structural integrity of modern high-rise buildings, airplanes, and high-speed rotating machinery are predicted by applying the modern theory of elasticity.

I. ONE-DIMENSIONAL CONSIDERATIONS

If we consider a suitably prepared rod of mild steel, with (original) length L and cross-sectional area A, subjected to a longitudinal, tensile force of magnitude F, then the rod will experience an elongation of magnitude ΔL, as shown in Fig. 1a. So that we can compare the behavior of rods of differing cross section in a meaningful manner, it is convenient to define the (uniform) axial stress in the rod by σ = F/A and the (uniform) axial strain by ε = ΔL/L. We note that the unit of stress is force per unit of (original) area and the unit of strain is change in length divided by original length. If, in a typical tensile test, we plot stress σ versus strain ε, we obtain the curve shown in Fig. 1b. In the case of mild steel, and many other ductile materials, this curve has a straight-line portion that extends over the range 0 < σ < σp, where σp is the proportional limit. The slope of this line is σ/ε = E, where E is known as Young’s modulus (Thomas Young, 1773–1829). When σp < σ, the stress–strain curve is no longer linear, as shown in Fig. 1b. When the rod is extended beyond σ = σp (the proportional limit), it suffers a permanent set (deformation) upon removal of the load F. At σ = Y (the yield point), the strain will increase considerably for relatively small increases in stress (Fig. 1b). For the majority of structural applications, it is desirable to remain in the linearly elastic, shape-recoverable range of stress and strain (0 ≤ σ ≤ σp). The mathematical material model that is based on this assumption is said to display linear material characteristics. For example, an airplane wing will deflect in flight because of air loads and maneuvers, but when the loads are removed, the wing reverts to its original shape. If this were not the case, the wing’s lifting capability would not be reliably predictable, and, of course, this would not be desirable. In addition, if the load is doubled, the deflection will also double.

FIGURE 1 (a) Tension rod. (b) Stress–strain curve (ductile material).

Within the context of the international system of units (Système International, or SI), the unit of stress is the pascal (Pa). One pascal is equal to one newton per square meter (N m⁻²). The unit of strain is meter per meter, and thus strain is a dimensionless quantity. We note that 1 N m⁻² = 1 Pa = 1.4504 × 10⁻⁴ psi and 1 psi = 6894.76 Pa.
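The one-dimensional definitions can be exercised numerically. The following Python sketch uses a hypothetical test rod (the force, area, length, and elongation are invented for illustration and chosen so that the modulus comes out near the value usually quoted for steel); only the formulas σ = F/A, ε = ΔL/L, E = σ/ε and the pascal-to-psi factor are taken from the text.

```python
PSI_PER_PA = 1.4504e-4  # conversion factor quoted in the text


def axial_stress(force_n, area_m2):
    """Uniform axial stress, sigma = F / A, in pascals."""
    return force_n / area_m2


def axial_strain(elongation_m, length_m):
    """Uniform axial strain, epsilon = delta L / L (dimensionless)."""
    return elongation_m / length_m


def youngs_modulus(stress_pa, strain):
    """Slope of the straight-line portion of the stress-strain curve."""
    return stress_pa / strain


# Hypothetical tensile test, assumed to stay below the proportional limit.
force = 20.7e3        # N
area = 1.0e-4         # m^2
length = 1.0          # m
elongation = 1.0e-3   # m

sigma = axial_stress(force, area)        # about 2.07e8 Pa
eps = axial_strain(elongation, length)   # 1.0e-3
E = youngs_modulus(sigma, eps)           # about 2.07e11 Pa
sigma_psi = sigma * PSI_PER_PA           # the same stress expressed in psi
```

Because the model is linear, doubling `force` doubles `elongation` for fixed `E`, which is the proportionality the wing example appeals to.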

Typical values of the Young’s (elastic) modulus E and yield stress in tension Y for some ductile materials are shown in Table II in Section IV. The tension test of a rod and naive definitions of stress and strain are associated with one-dimensional considerations. Elasticity theory is concerned with the generalization of these concepts to the general, three-dimensional case.

II. STRESS

Elastic solids are capable of transmitting forces, and the concept of stress in a solid is a sophistication and generalization of the concept of force. We consider a material point P in the interior of an elastic solid and pass an oriented plane Π through P with unit normal vector n (see Fig. 2). Consider the portion of the solid which is shaded. Then on a (small) area ΔA surrounding the point P, there will act a net force ΔF, and the stress vector at P is defined by the limiting process

$$\mathbf{T}(\mathbf{n}) = \lim_{\Delta A \to 0} \frac{\Delta \mathbf{F}}{\Delta A}. \tag{1}$$

FIGURE 2 Stress vector and components.

It is to be noted that the magnitude as well as the direction of the stress vector T depends upon the orientation of n. If we resolve the stress vector along the (arbitrarily chosen) (x, y, z) = (x1, x2, x3) axes, then we can write


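$$\begin{aligned}
\mathbf{T}_1 &= \tau_{11}\mathbf{e}_1 + \tau_{12}\mathbf{e}_2 + \tau_{13}\mathbf{e}_3\\
\mathbf{T}_2 &= \tau_{21}\mathbf{e}_1 + \tau_{22}\mathbf{e}_2 + \tau_{23}\mathbf{e}_3\\
\mathbf{T}_3 &= \tau_{31}\mathbf{e}_1 + \tau_{32}\mathbf{e}_2 + \tau_{33}\mathbf{e}_3,
\end{aligned} \tag{2}$$

where T_i = T(e_i) for i = 1, 2, 3; that is, the T_i are stress vectors acting upon the three coordinate planes and e_i are unit vectors associated with the coordinate axes (x, y, z) = (x1, x2, x3). We note that here and in subsequent developments, we use the convenient and common notation τ12 ≡ τxy, T1 ≡ Tx, T2 ≡ Ty, etc. In other words, the subscripts 1, 2, 3 take the place of x, y, z. We can also write

$$\begin{aligned}
T_1 &= \tau_{11}n_1 + \tau_{12}n_2 + \tau_{13}n_3\\
T_2 &= \tau_{21}n_1 + \tau_{22}n_2 + \tau_{23}n_3\\
T_3 &= \tau_{31}n_1 + \tau_{32}n_2 + \tau_{33}n_3,
\end{aligned} \tag{3}$$

where

$$\mathbf{n} = \mathbf{e}_1 n_1 + \mathbf{e}_2 n_2 + \mathbf{e}_3 n_3$$

and

$$\mathbf{T}(\mathbf{n}) = \mathbf{e}_1 T_1 + \mathbf{e}_2 T_2 + \mathbf{e}_3 T_3 = \mathbf{T}_1 n_1 + \mathbf{T}_2 n_2 + \mathbf{T}_3 n_3. \tag{4}$$

This last expression is known as the lemma of Cauchy (A. L. Cauchy, 1789–1857). The stress tensor components

$$[\tau_{ij}] = \begin{bmatrix} \tau_{11} & \tau_{12} & \tau_{13}\\ \tau_{21} & \tau_{22} & \tau_{23}\\ \tau_{31} & \tau_{32} & \tau_{33} \end{bmatrix} \tag{5}$$

can be visualized with reference to Fig. 3, with all stresses shown acting in the positive sense. We note that τij is the stress component acting on the face with normal e_i, in the direction of the vector e_j.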

With reference to Fig. 2, it can also be shown that relative to the plane Π, the normal component N and the shear component S are given by

$$T_n \equiv N = \mathbf{T}\cdot\mathbf{n} = \sum_{i=1}^{3} T_i n_i = \sum_{i=1}^{3}\sum_{j=1}^{3} n_i n_j \tau_{ij} \tag{6a}$$

and

$$T_s \equiv S = \mathbf{T}\cdot\mathbf{s} = \sum_{i=1}^{3} T_i s_i = \sum_{i=1}^{3}\sum_{j=1}^{3} n_i s_j \tau_{ij}, \tag{6b}$$

where n · s = 0 and n, s are unit vectors normal and parallel to the plane Π, respectively.

FIGURE 3 Stress tensor components.

At every interior point of a stressed solid, there exist at least three mutually perpendicular directions for which all shearing stresses τij, i ≠ j, vanish. This preferred axis system is called the principal axis system. It can be found by solving the algebraic eigenvalue–eigenvector problem characterized by

$$\begin{bmatrix} \tau_{11}-\sigma & \tau_{12} & \tau_{13}\\ \tau_{21} & \tau_{22}-\sigma & \tau_{23}\\ \tau_{31} & \tau_{32} & \tau_{33}-\sigma \end{bmatrix} \begin{bmatrix} n_1\\ n_2\\ n_3 \end{bmatrix} = \begin{bmatrix} 0\\ 0\\ 0 \end{bmatrix}, \tag{7}$$

where n1, n2, and n3 are the direction cosines of the principal axis system such that n1² + n2² + n3² = 1; and σ1, σ2, and σ3 are the (scalar) principal stress components. The necessary and sufficient condition for the existence of a solution for Eq. (7) is obtained by setting the coefficient determinant equal to zero. The result is

$$\sigma^3 - I_1\sigma^2 + I_2\sigma - I_3 = 0, \tag{8}$$

where the quantities

$$I_1 = \tau_{11} + \tau_{22} + \tau_{33}, \tag{9a}$$

$$I_2 = \begin{vmatrix} \tau_{11} & \tau_{12}\\ \tau_{21} & \tau_{22} \end{vmatrix} + \begin{vmatrix} \tau_{22} & \tau_{23}\\ \tau_{32} & \tau_{33} \end{vmatrix} + \begin{vmatrix} \tau_{33} & \tau_{31}\\ \tau_{13} & \tau_{11} \end{vmatrix}, \tag{9b}$$

and

$$I_3 = \begin{vmatrix} \tau_{11} & \tau_{12} & \tau_{13}\\ \tau_{21} & \tau_{22} & \tau_{23}\\ \tau_{31} & \tau_{32} & \tau_{33} \end{vmatrix} \tag{9c}$$

are known as the first, second, and third stress invariants, respectively. For example, we consider these stress tensor components at a point P of a solid, relative to the x, y, z axes:

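$$[\tau_{ij}] = \begin{bmatrix} 3 & 1 & 1\\ 1 & 0 & 2\\ 1 & 2 & 0 \end{bmatrix}. \tag{10}$$

Thus,

$$I_1 = 3, \quad I_2 = -6, \quad I_3 = -8$$

and

$$\sigma^3 - 3\sigma^2 - 6\sigma + 8 = (\sigma - 4)(\sigma - 1)(\sigma + 2) = 0.$$

Consequently, the principal stresses at P are σ1 = 4, σ2 = 1, and σ3 = −2. With the aid of Eq. (7), it can be shown that the principal directions at P are given by the mutually perpendicular unit vectors

$$\begin{aligned}
\mathbf{n}^{(1)} &= \mathbf{e}_1\frac{2}{\sqrt{6}} + \mathbf{e}_2\frac{1}{\sqrt{6}} + \mathbf{e}_3\frac{1}{\sqrt{6}}\\
\mathbf{n}^{(2)} &= \mathbf{e}_1\left(-\frac{1}{\sqrt{3}}\right) + \mathbf{e}_2\frac{1}{\sqrt{3}} + \mathbf{e}_3\frac{1}{\sqrt{3}}\\
\mathbf{n}^{(3)} &= \mathbf{e}_1(0) + \mathbf{e}_2\left(-\frac{1}{\sqrt{2}}\right) + \mathbf{e}_3\left(\frac{1}{\sqrt{2}}\right).
\end{aligned} \tag{11}$$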
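The worked example can be checked numerically. The short Python sketch below recomputes the invariants of Eqs. (9a)–(9c) for the tensor of Eq. (10), evaluates the characteristic cubic at the quoted principal stresses, and confirms that each quoted direction satisfies the eigenvalue problem of Eq. (7). The function names are illustrative choices, not part of the original article.

```python
import math


def stress_invariants(t):
    """First, second, and third invariants of a 3x3 stress tensor."""
    i1 = t[0][0] + t[1][1] + t[2][2]
    i2 = (t[0][0] * t[1][1] - t[0][1] * t[1][0]
          + t[1][1] * t[2][2] - t[1][2] * t[2][1]
          + t[2][2] * t[0][0] - t[2][0] * t[0][2])
    i3 = (t[0][0] * (t[1][1] * t[2][2] - t[1][2] * t[2][1])
          - t[0][1] * (t[1][0] * t[2][2] - t[1][2] * t[2][0])
          + t[0][2] * (t[1][0] * t[2][1] - t[1][1] * t[2][0]))
    return i1, i2, i3


def characteristic(sigma, invariants):
    """Left-hand side of the cubic sigma^3 - I1 sigma^2 + I2 sigma - I3."""
    i1, i2, i3 = invariants
    return sigma ** 3 - i1 * sigma ** 2 + i2 * sigma - i3


def mat_vec(t, n):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(t[i][j] * n[j] for j in range(3)) for i in range(3)]


tau = [[3, 1, 1], [1, 0, 2], [1, 2, 0]]
invariants = stress_invariants(tau)
principal = [4, 1, -2]
directions = [
    [2 / math.sqrt(6), 1 / math.sqrt(6), 1 / math.sqrt(6)],
    [-1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3)],
    [0.0, -1 / math.sqrt(2), 1 / math.sqrt(2)],
]
residuals = [characteristic(s, invariants) for s in principal]
```

Each entry of `residuals` vanishes, and `mat_vec(tau, n)` reproduces σ times `n` for every pair of principal stress and direction.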

When the Cartesian axes are rotated in a rigid manner from x1, x2, x3 to x1′, x2′, x3′, as shown in Fig. 4, the components of the stress tensor transform according to the rule

$$\tau_{p'q'} = \sum_{i=1}^{3}\sum_{j=1}^{3} a_{p'i}\,a_{q'j}\,\tau_{ij}, \tag{12}$$

where a_{p′i} = cos(x_{p′}, x_i) = cos(e_{p′}, e_i) are the nine direction cosines that orient the primed coordinate system relative to the unprimed system.

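For example, consider the rotation of axes characterized by the table of direction cosines

$$[a_{i'j}] = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{2}{3} & -\tfrac{1}{3}\\[2pt] \tfrac{1}{3} & \tfrac{2}{3} & -\tfrac{2}{3}\\[2pt] \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix}. \tag{13}$$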

The stress components τij in Eq. (10) relative to the x, y, z axes will become

$$[\tau_{p'q'}] = \begin{bmatrix} 0.889 & 0.778 & 0.222\\ 0.778 & -1.444 & 1.444\\ 0.222 & 1.444 & 3.556 \end{bmatrix} \tag{14}$$

when referred to x′, y′, z′ axes, according to the law of transformation [Eq. (12)]. The extreme shear stress at a point is given by τmax = ½(σ1 − σ3) and this value is τmax = ½(4 + 2) = 3 for the stress tensor [Eq. (10)]. It should be noted that the principal stresses are ordered, that is, σ1 ≥ σ2 ≥ σ3, and that σ1 (σ3) is the largest (smallest) normal stress for all possible planes through the point P.

FIGURE 4 Principal axes.
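The transformation law of Eq. (12) can be verified for this example with exact rational arithmetic. In the Python sketch below, the helper name `rotate_tensor` is an illustrative assumption; the direction cosines and stress components are those of Eqs. (13) and (10).

```python
from fractions import Fraction as Fr


def rotate_tensor(a, t):
    """Apply tau'[p][q] = sum_i sum_j a[p][i] * a[q][j] * t[i][j]."""
    return [[sum(a[p][i] * a[q][j] * t[i][j]
                 for i in range(3) for j in range(3))
             for q in range(3)]
            for p in range(3)]


# Direction cosines of Eq. (13) and stress components of Eq. (10).
a = [[Fr(2, 3), Fr(-2, 3), Fr(-1, 3)],
     [Fr(1, 3), Fr(2, 3), Fr(-2, 3)],
     [Fr(2, 3), Fr(1, 3), Fr(2, 3)]]
tau = [[3, 1, 1], [1, 0, 2], [1, 2, 0]]

tau_rot = rotate_tensor(a, tau)
tau_rot_decimal = [[round(float(v), 3) for v in row] for row in tau_rot]

# The rows of a must be orthonormal for a rigid rotation of axes.
gram = [[sum(a[p][k] * a[q][k] for k in range(3)) for q in range(3)]
        for p in range(3)]
```

The exact entries are ninths (8/9, 7/9, 2/9, −13/9, 13/9, 32/9), which round to the decimals of Eq. (14), and the trace remains equal to I1 = 3, as it must for an invariant.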

If we now establish a coordinate system coincident with principal axes then in “principal stress space,” the normal stress N and the shear stress S on a plane characterized by the outer unit normal vector n are, respectively,

$$N = n_1^2\sigma_1 + n_2^2\sigma_2 + n_3^2\sigma_3 \tag{15a}$$

and

$$S^2 = n_1^2 n_2^2(\sigma_1 - \sigma_2)^2 + n_2^2 n_3^2(\sigma_2 - \sigma_3)^2 + n_3^2 n_1^2(\sigma_3 - \sigma_1)^2, \tag{15b}$$

where σ1, σ2, and σ3 are principal stresses. We now visualize eight planes, the normal to each of which makes equal angles with respect to principal axes. The shear stress acting upon these planes is known as the octahedral shear stress τ0, and its magnitude is

$$\tau_0 = \tfrac{1}{3}\left[(\sigma_1 - \sigma_2)^2 + (\sigma_2 - \sigma_3)^2 + (\sigma_3 - \sigma_1)^2\right]^{1/2} \ge 0. \tag{16}$$

It can be shown that the octahedral shear stress is related to the average of the square of all possible shear stresses at the point, and the relation is

$$\tfrac{3}{5}(\tau_0)^2 = \langle S^2 \rangle. \tag{17}$$


It can also be shown that

$$9\tau_0^2 = 2I_1^2 - 6I_2, \tag{18}$$

where I1 and I2 are the first and second stress invariants, respectively [see Eqs. (9a) and (9b)]. We also note the bound

$$1 \le \sqrt{\frac{3}{2}}\,\frac{\tau_0}{\tau_{\max}} \le \frac{2}{\sqrt{3}} \tag{19}$$

and the associated implication that √(3/2) τ0 ≅ 1.08 τmax with a maximum error of about 7%. Returning to the stress tensor [Eq. (10)], we have

$$\tau_{\max} = 3$$

and

$$9\tau_0^2 = 2I_1^2 - 6I_2 = (2)(9) + (6)(6) = 54,$$

or

$$\tau_0 = \sqrt{6} = 2.4495,$$

and

$$1 \le \sqrt{\frac{3}{2}}\,\frac{\tau_0}{\tau_{\max}} = \frac{\sqrt{3/2}\,(2.4495)}{3} \le \frac{2}{\sqrt{3}},$$

or

$$1 = 1 \le 1.1547.$$

FIGURE 5 Strain.
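A few lines of Python reproduce these numbers from the principal stresses of the example. The function names are illustrative assumptions; the formulas are Eq. (16), τmax = ½(σ1 − σ3), and the invariant relation of Eq. (18) written in terms of principal stresses.

```python
import math


def octahedral_shear(s1, s2, s3):
    """Octahedral shear stress from the three principal stresses, Eq. (16)."""
    return math.sqrt((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2) / 3.0


def max_shear(s1, s3):
    """Extreme shear stress for ordered principal stresses s1 >= s3."""
    return 0.5 * (s1 - s3)


s1, s2, s3 = 4.0, 1.0, -2.0        # principal stresses of the example
tau0 = octahedral_shear(s1, s2, s3)
tau_max = max_shear(s1, s3)

# Invariants expressed through principal stresses.
i1 = s1 + s2 + s3
i2 = s1 * s2 + s2 * s3 + s3 * s1

ratio = math.sqrt(1.5) * tau0 / tau_max   # middle term of the bound, Eq. (19)
upper = 2.0 / math.sqrt(3.0)              # about 1.1547
```

For this stress state the ratio sits at the lower end of the bound, which happens whenever the intermediate principal stress lies midway between the other two.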

III. STRAIN

In our discussion of the concept of stress, we noted that stress characterizes the action of a “force at a point” in a solid. In a similar manner, we shall show that the concept of strain can be used to quantify the notion of “deformation at a point” in a solid.

We consider a (small) quadrilateral element in the unstrained solid with dimensions dx, dy, and dz. The sides of the element are taken to be parallel to the coordinate axes. After deformation, the volume element has the shape of a rectangular parallelepiped with edges of length (dx + du), (dy + dv), (dz + dw). With reference to Fig. 5, the material point P in the undeformed configuration is carried into the point P′ in the deformed configuration. A projection of the element sides onto the x–y plane, before and after deformation, is shown in Fig. 5. We note that all changes in length and angles are small, and they have been exaggerated for purposes of clarity. We now define extensional strain εxx = ε11 as change in length per unit length, and therefore for the edge PA (in Fig. 5), we have

$$\varepsilon_{xx} = \frac{[dx + (\partial u/\partial x)\,dx] - dx}{dx} = \frac{\partial u}{\partial x}$$

and

$$\varepsilon_{yy} = \frac{[dy + (\partial v/\partial y)\,dy] - dy}{dy} = \frac{\partial v}{\partial y},$$

and a projection onto the y–z plane will result in

$$\varepsilon_{zz} = \frac{[dz + (\partial w/\partial z)\,dz] - dz}{dz} = \frac{\partial w}{\partial z}.$$

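The shear strain is defined as one-half of the decrease of the originally right angle APB. Thus, with reference to Fig. 5, we have

$$2\varepsilon_{xy} = 2\varepsilon_{yx} = \frac{(\partial v/\partial x)\,dx}{dx + (\partial u/\partial x)\,dx} + \frac{(\partial u/\partial y)\,dy}{dy + (\partial v/\partial y)\,dy} = \frac{\partial v/\partial x}{1 + (\partial u/\partial x)} + \frac{\partial u/\partial y}{1 + (\partial v/\partial y)} = \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}$$

because it is assumed that

$$1 \gg \partial u/\partial x; \quad 1 \gg \partial v/\partial y \quad \text{(small rotations)}.$$

In a similar manner, using projections onto the planes y–z and z–x, we can show that

$$2\varepsilon_{yz} = \frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}$$

and

$$2\varepsilon_{zx} = \frac{\partial w}{\partial x} + \frac{\partial u}{\partial z}.$$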

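Consequently, the complete (linearized) strain-displacement relations are given by

$$\begin{bmatrix} \varepsilon_{xx} & \varepsilon_{xy} & \varepsilon_{xz}\\ \varepsilon_{yx} & \varepsilon_{yy} & \varepsilon_{yz}\\ \varepsilon_{zx} & \varepsilon_{zy} & \varepsilon_{zz} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial u}{\partial x} & \dfrac{1}{2}\left(\dfrac{\partial u}{\partial y} + \dfrac{\partial v}{\partial x}\right) & \dfrac{1}{2}\left(\dfrac{\partial u}{\partial z} + \dfrac{\partial w}{\partial x}\right)\\[8pt] \dfrac{1}{2}\left(\dfrac{\partial v}{\partial x} + \dfrac{\partial u}{\partial y}\right) & \dfrac{\partial v}{\partial y} & \dfrac{1}{2}\left(\dfrac{\partial v}{\partial z} + \dfrac{\partial w}{\partial y}\right)\\[8pt] \dfrac{1}{2}\left(\dfrac{\partial w}{\partial x} + \dfrac{\partial u}{\partial z}\right) & \dfrac{1}{2}\left(\dfrac{\partial w}{\partial y} + \dfrac{\partial v}{\partial z}\right) & \dfrac{\partial w}{\partial z} \end{bmatrix}. \tag{20}$$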

Equation (20) characterizes the deformation of the solid at a point. If we define the mutually perpendicular unit vectors n and s with reference to a plane Π through a point P in a solid (see Fig. 2), then it can be shown that the extensional strain N in the direction n is given by the formula

$$N = \sum_{i=1}^{3}\sum_{j=1}^{3} \varepsilon_{ij}\,n_i n_j \tag{21a}$$

and the shear strain relative to the vectors n and s is

$$S = \frac{1}{2}\sum_{i=1}^{3}\sum_{j=1}^{3} \varepsilon_{ij}\,n_i s_j. \tag{21b}$$
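Equations (20) and (21a) translate directly into code: the strain tensor is the symmetric part of the displacement gradient, and the extensional strain in a direction n is a double sum over its components. The Python sketch below uses a hypothetical displacement gradient whose numerical values are invented for illustration; the function names are likewise assumptions.

```python
def strain_from_gradient(grad):
    """Linearized strain, eps[i][j] = (du_i/dx_j + du_j/dx_i) / 2, Eq. (20)."""
    return [[0.5 * (grad[i][j] + grad[j][i]) for j in range(3)]
            for i in range(3)]


def extensional_strain(eps, n):
    """Extensional strain in the direction of the unit vector n, Eq. (21a)."""
    return sum(eps[i][j] * n[i] * n[j] for i in range(3) for j in range(3))


# Hypothetical small displacement gradient, grad[i][j] = d u_i / d x_j.
grad = [[1.0e-3, 2.0e-3, 0.0],
        [0.0, -5.0e-4, 0.0],
        [0.0, 0.0, 0.0]]

eps = strain_from_gradient(grad)
n_x = [1.0, 0.0, 0.0]
n_y = [0.0, 1.0, 0.0]
strain_along_x = extensional_strain(eps, n_x)   # equals eps[0][0]
strain_along_y = extensional_strain(eps, n_y)   # equals eps[1][1]
```

Only the symmetric part of the gradient enters the strain; the antisymmetric part describes a small rigid rotation and produces no deformation.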

Equation (21a) expresses the extensional strain and Eq. (21b) expresses the shearing strain for an arbitrarily chosen element; therefore, we can infer that the nine (six independent) quantities εij (i = 1, 2, 3; j = 1, 2, 3) provide a complete characterization of strain associated with a material point in the solid. It can be shown that the nine quantities εij constitute the components of a tensor of order two in a three-dimensional space, and the appropriate law of transformation under a rotation of coordinate axes is

$$\varepsilon_{p'q'} = \sum_{i=1}^{3}\sum_{j=1}^{3} a_{p'i}\,a_{q'j}\,\varepsilon_{ij}; \quad p = 1, 2, 3; \quad q = 1, 2, 3, \tag{22}$$

where the a_{p′i} are direction cosines as in Eq. (12). As in the case of stress, there will be at least one set of mutually perpendicular axes for which the shearing strains vanish. These axes are principal axes of strain. They are found in a manner that is entirely analogous to the determination of principal stresses and axes. (See Section II.)

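It should be noted that a single-valued, continuous displacement field for a simply connected region is guaranteed provided that the six equations of compatibility of A. J. C. Barré de Saint-Venant (1797–1886) are satisfied:

$$\frac{\partial^2 \varepsilon_{xx}}{\partial y^2} + \frac{\partial^2 \varepsilon_{yy}}{\partial x^2} = 2\,\frac{\partial^2 \varepsilon_{xy}}{\partial x\,\partial y}, \tag{23a}$$

$$\frac{\partial^2 \varepsilon_{xx}}{\partial y\,\partial z} = \frac{\partial}{\partial x}\left(-\frac{\partial \varepsilon_{yz}}{\partial x} + \frac{\partial \varepsilon_{xz}}{\partial y} + \frac{\partial \varepsilon_{xy}}{\partial z}\right), \tag{23b}$$

and there are two additional equations for each of Eqs. (23a) and (23b), which are readily obtained by cyclic permutation of x, y, z.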
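Written out, the cyclic permutation x → y → z → x gives the following four additional equations. They are derived here from Eqs. (23a) and (23b) using the symmetry εij = εji and are not numbered in the original article:

```latex
\begin{aligned}
\frac{\partial^2 \varepsilon_{yy}}{\partial z^2}
  + \frac{\partial^2 \varepsilon_{zz}}{\partial y^2}
  &= 2\,\frac{\partial^2 \varepsilon_{yz}}{\partial y\,\partial z}, \\
\frac{\partial^2 \varepsilon_{zz}}{\partial x^2}
  + \frac{\partial^2 \varepsilon_{xx}}{\partial z^2}
  &= 2\,\frac{\partial^2 \varepsilon_{zx}}{\partial z\,\partial x}, \\
\frac{\partial^2 \varepsilon_{yy}}{\partial z\,\partial x}
  &= \frac{\partial}{\partial y}\left(
     -\frac{\partial \varepsilon_{zx}}{\partial y}
     + \frac{\partial \varepsilon_{yx}}{\partial z}
     + \frac{\partial \varepsilon_{yz}}{\partial x}\right), \\
\frac{\partial^2 \varepsilon_{zz}}{\partial x\,\partial y}
  &= \frac{\partial}{\partial z}\left(
     -\frac{\partial \varepsilon_{xy}}{\partial z}
     + \frac{\partial \varepsilon_{zy}}{\partial x}
     + \frac{\partial \varepsilon_{zx}}{\partial y}\right).
\end{aligned}
```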

IV. HOOKE’S LAW AND ITS LIMITS

The most general linear relationship between stress ten-sor and strain tensor components at a point in a solid isgiven by

Page 141: Encyclopedia of Physical Science and Technology - Classical Physics

P1: GKX/GJK P2: FQP Final Pages/FFV QC: FGE

Encyclopedia of Physical Science and Technology EN005B197 June 8, 2001 19:35

Elasticity 807

τi j =3∑

k=1

3∑l=1

Ci jklεkl ; i = 1, 2, 3;

j = 1, 2, 3, (24)

where the 34 = 81 constants Ci jkl are the elastic constantsof the solid. If a strain energy density function exists (seeSection V), and in view of the fact that the stress and straintensor components are symmetric, the elastic constantsmust satisfy the relations

Ci jkl = Ci jlk, Ci jkl = C jikl , Ci jkl = Ckli j ,

(25)

and therefore the number of independent elastic con-stants is reduced to 1

2 (62 − 6) + 6 = 21 for the generalanisotropic elastic solid. If, in addition, the elastic proper-ties of the solid are independent of orientation, the numberof independent elastic constants can be reduced to two. Inthis case of an isotropic elastic solid, the relation betweenstress and strain is given by

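$$\begin{aligned}
E\varepsilon_{xx} &= \tau_{xx} - \nu(\tau_{yy} + \tau_{zz}) \\
E\varepsilon_{yy} &= \tau_{yy} - \nu(\tau_{zz} + \tau_{xx}) \\
E\varepsilon_{zz} &= \tau_{zz} - \nu(\tau_{xx} + \tau_{yy}) \\
2G\varepsilon_{xy} &= \tau_{xy} \\
2G\varepsilon_{yz} &= \tau_{yz} \\
2G\varepsilon_{zx} &= \tau_{zx},
\end{aligned} \tag{26}$$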

where G = E/[2(1 + ν)] is the shear modulus, E is Young’s modulus (see Section I), and ν is Poisson’s ratio (S. D. Poisson, 1781–1840). Equation (26) is known as Hooke’s law (Robert Hooke, 1635–1693) for a linearly elastic, isotropic solid. A listing of typical values of the elastic constants is provided in Table I.
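As a rough illustration of Eq. (26), the following Python sketch evaluates the strain components for a given stress state. The function names, the dictionary layout, and the 100 MPa uniaxial load are assumptions made for this example only; the material constants are those listed for steel in Table I.

```python
def shear_modulus(E, nu):
    """Shear modulus G = E / [2(1 + nu)], as stated with Eq. (26)."""
    return E / (2.0 * (1.0 + nu))


def strains_from_stresses(stress, E, nu):
    """Apply Eq. (26) to a symmetric stress state.

    `stress` maps the keys xx, yy, zz, xy, yz, zx to values in Pa.
    Returns the strain components under the same keys.
    """
    G = shear_modulus(E, nu)
    s = stress
    return {
        "xx": (s["xx"] - nu * (s["yy"] + s["zz"])) / E,
        "yy": (s["yy"] - nu * (s["zz"] + s["xx"])) / E,
        "zz": (s["zz"] - nu * (s["xx"] + s["yy"])) / E,
        "xy": s["xy"] / (2.0 * G),
        "yz": s["yz"] / (2.0 * G),
        "zx": s["zx"] / (2.0 * G),
    }


# Steel (Table I) under an illustrative 100 MPa uniaxial tension along x.
E_STEEL, NU_STEEL = 20.7e10, 0.29
uniaxial = {"xx": 100e6, "yy": 0.0, "zz": 0.0, "xy": 0.0, "yz": 0.0, "zx": 0.0}
strain = strains_from_stresses(uniaxial, E_STEEL, NU_STEEL)

print(f"G      = {shear_modulus(E_STEEL, NU_STEEL):.3e} Pa")
print(f"eps_xx = {strain['xx']:.3e}")
print(f"eps_yy = {strain['yy']:.3e}")
```

In uniaxial tension the lateral strain is −ν times the axial strain, which is the usual way Poisson's ratio is read off from a tension test.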

TABLE I Typical Values of Elastic Constants^a

Material     ν        E (Pa)^b         G (Pa)^b
Aluminum     0.34     6.89 × 10^10     2.57 × 10^10
Concrete     0.20     2.76 × 10^10     1.15 × 10^10
Copper       0.34     8.96 × 10^10     3.34 × 10^10
Glass        0.25     6.89 × 10^10     2.76 × 10^10
Nylon        0.40     2.83 × 10^10     1.01 × 10^10
Rubber       0.499    1.96 × 10^6      0.654 × 10^6
Steel        0.29     20.7 × 10^10     8.02 × 10^10

a Adapted from Reismann, H., and Pawlik, P. S. (1980). “Elasticity: Theory and Applications,” Wiley (Interscience), New York.
b Note that 1 Pa = 1 N m^−2 = 1.4504 × 10^−4 lb in.^−2.

TABLE II Some Material Properties for Ductile Materials^a

Material                            Yield point stress,     Young’s modulus,    Strain at yield point,
                                    σY (tension, Pa)        E (Pa)              εY (tension)
Aluminum alloy (2024 T4)            290 × 10^6              7.30 × 10^10        0.00397
Brass                               103 × 10^6              10.3 × 10^10        0.00100
Bronze                              138 × 10^6              10.3 × 10^10        0.00134
Magnesium alloy                     138 × 10^6              4.50 × 10^10        0.00307
Steel (low carbon, structural)      248 × 10^6              20.7 × 10^10        0.00120
Steel (high carbon)                 414 × 10^6              20.7 × 10^10        0.00200

a Adapted from Reismann, H., and Pawlik, P. S. (1980). “Elasticity: Theory and Applications,” Wiley (Interscience), New York.

Many failure theories for solids have been proposed, and they are usually associated with specific classes of materials. In the case of a ductile material with a well-defined yield point (see Fig. 1b), there are at least two failure theories that yield useful results.

A. The Hencky-Mises Yield Criterion

This theory predicts failure (yielding) at a point of the solid when $9\tau_0^2 \ge 2Y^2$, where τ0 is the octahedral shear stress [see Eq. (16)] and Y is the yield stress in tension (see Fig. 1b). In this case, the ratio of yield stress in tension Y to the yield stress in pure shear τ has the value $Y/\tau = \sqrt{3}$.

B. The Tresca Yield Criterion

This theory postulates that yielding occurs when the extreme shear stress τmax at a point attains the value τmax ≥ Y/2. We note that for this theory the ratio of yield stress in tension to the yield stress in pure shear is equal to Y/τ = 2. A listing of the values of Y for some commonly used materials is given in Table II.
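The two criteria can be compared numerically. The sketch below is a minimal illustration, assuming the octahedral shear stress is computed from the principal stresses in the form that reappears in Eq. (49); the function names and the pure-shear test state are choices made for this example, and Y is the Table II value for low-carbon structural steel.

```python
import math


def octahedral_shear(s1, s2, s3):
    """Octahedral shear stress from the three principal stresses."""
    return math.sqrt((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2) / 3.0


def max_shear(s1, s2, s3):
    """Extreme shear stress: half the largest principal-stress difference."""
    return (max(s1, s2, s3) - min(s1, s2, s3)) / 2.0


def yields_hencky_mises(s1, s2, s3, Y):
    """Hencky-Mises criterion: yielding when 9 * tau_0**2 >= 2 * Y**2."""
    return 9.0 * octahedral_shear(s1, s2, s3) ** 2 >= 2.0 * Y ** 2


def yields_tresca(s1, s2, s3, Y):
    """Tresca criterion: yielding when tau_max >= Y / 2."""
    return max_shear(s1, s2, s3) >= Y / 2.0


Y = 248e6  # Pa, low-carbon structural steel (Table II)

# Pure shear of magnitude tau has principal stresses (tau, 0, -tau).
tau_mises = Y / math.sqrt(3.0)  # shear yield stress implied by Hencky-Mises
tau_tresca = Y / 2.0            # shear yield stress implied by Tresca

# A shear stress between the two thresholds separates the criteria.
tau = 0.55 * Y
print("Hencky-Mises predicts yielding:", yields_hencky_mises(tau, 0.0, -tau, Y))
print("Tresca predicts yielding:      ", yields_tresca(tau, 0.0, -tau, Y))
```

For pure shear the Tresca criterion is the more conservative of the two, since it predicts yielding at τ = Y/2 rather than at τ = Y/√3.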

V. STRAIN ENERGY

We now consider an interior material point P in a stressed, elastic solid. We can construct a Cartesian coordinate system x, y, z with origin at P, which is coincident with principal axes at P. The point P is enclosed by a small, rectangular parallelepiped with sides of length dx, dy, and dz. The areas of the sides of the parallelepiped are dAz = dx dy, dAx = dy dz, dAy = dz dx, and the volume is dV = dx dy dz. The potential (or strain) energy stored in the linearly elastic solid is equal to the work of the external forces. Consequently, neglecting heat generation, if W is the strain energy per unit volume (strain energy density), we have


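$$\begin{aligned}
W\,dV &= \tfrac{1}{2}(\tau_{xx}\,dA_x)(dx\,\varepsilon_{xx}) + \tfrac{1}{2}(\tau_{yy}\,dA_y)(dy\,\varepsilon_{yy}) + \tfrac{1}{2}(\tau_{zz}\,dA_z)(dz\,\varepsilon_{zz}) \\
&= \tfrac{1}{2}(\tau_{xx}\varepsilon_{xx} + \tau_{yy}\varepsilon_{yy} + \tau_{zz}\varepsilon_{zz})\,dV,
\end{aligned}$$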

and therefore the strain energy density referred to principal axes is

$$W = \tfrac{1}{2}(\tau_{xx}\varepsilon_{xx} + \tau_{yy}\varepsilon_{yy} + \tau_{zz}\varepsilon_{zz}).$$

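In the general case of arbitrary (in general, nonprincipal) axes, this expression assumes the form

$$\begin{aligned}
W = {} & \tfrac{1}{2}(\tau_{xx}\varepsilon_{xx} + \tau_{xy}\varepsilon_{xy} + \tau_{xz}\varepsilon_{xz}) \\
& + \tfrac{1}{2}(\tau_{yx}\varepsilon_{yx} + \tau_{yy}\varepsilon_{yy} + \tau_{yz}\varepsilon_{yz}) \\
& + \tfrac{1}{2}(\tau_{zx}\varepsilon_{zx} + \tau_{zy}\varepsilon_{zy} + \tau_{zz}\varepsilon_{zz}),
\end{aligned}$$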

or, in abbreviated notation,

$$W = \frac{1}{2}\sum_{i=1}^{3}\sum_{j=1}^{3} \tau_{ij}\,\varepsilon_{ij}. \tag{27}$$

In view of the relations in Eqs. (24) and (27), the expression for strain energy density can be written in the form

$$W = \frac{1}{2}\sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3}\sum_{l=1}^{3} C_{ijkl}\,\varepsilon_{ij}\,\varepsilon_{kl}. \tag{28a}$$

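In the case of an isotropic elastic material [see Eq. (26)], this equation reduces to

$$W = \frac{1}{2}\left[\lambda(\varepsilon_{11} + \varepsilon_{22} + \varepsilon_{33})^2 + 2G\sum_{i=1}^{3}\sum_{j=1}^{3} \varepsilon_{ij}\,\varepsilon_{ij}\right], \tag{28b}$$

where

$$\lambda = \frac{E\nu}{(1 + \nu)(1 - 2\nu)}.$$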

Thus, with reference to Eq. (28), we note that the strain energy density is a quadratic function of the strain tensor components, and W vanishes when the strain field vanishes. Equation (28) serves as a potential (generating) function for the generation of the stress field, that is,

$$\tau_{ij} = \frac{\partial W(\varepsilon_{ij})}{\partial \varepsilon_{ij}} = \sum_{k=1}^{3}\sum_{l=1}^{3} C_{ijkl}\,\varepsilon_{kl}; \qquad i = 1, 2, 3; \quad j = 1, 2, 3 \tag{29}$$

[see Eq. (24)]. The concept of strain energy serves as the starting point for many useful and important investigations in elasticity theory and its applications. For details, the reader is referred to the extensive literature, a small selection of which can be found in the Bibliography.
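The relations among Eqs. (27), (28b), and (29) can be checked numerically for an isotropic material. The sketch below is an illustration under stated assumptions: it takes the isotropic stress–strain law in the Lamé form τij = λ(ε11 + ε22 + ε33)δij + 2Gεij (a standard restatement of Eq. (26), not written out in this section), uses the copper constants of Table I, and applies an arbitrary, hypothetical strain state. All function names are illustrative.

```python
def lame_constants(E, nu):
    """Return (lambda, G) for an isotropic solid."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    G = E / (2.0 * (1.0 + nu))
    return lam, G


def isotropic_stress(eps, E, nu):
    """Stress tensor from strain, assuming the Lame form of Hooke's law."""
    lam, G = lame_constants(E, nu)
    tr = eps[0][0] + eps[1][1] + eps[2][2]
    return [[lam * tr * (1.0 if i == j else 0.0) + 2.0 * G * eps[i][j]
             for j in range(3)] for i in range(3)]


def energy_eq27(tau, eps):
    """Strain energy density W = (1/2) sum_ij tau_ij eps_ij, Eq. (27)."""
    return 0.5 * sum(tau[i][j] * eps[i][j] for i in range(3) for j in range(3))


def energy_eq28b(eps, E, nu):
    """Strain energy density of an isotropic solid, Eq. (28b)."""
    lam, G = lame_constants(E, nu)
    tr = eps[0][0] + eps[1][1] + eps[2][2]
    contraction = sum(eps[i][j] ** 2 for i in range(3) for j in range(3))
    return 0.5 * (lam * tr ** 2 + 2.0 * G * contraction)


def perturbed(eps, i, j, h):
    """Copy of eps with component (i, j) shifted by h."""
    out = [row[:] for row in eps]
    out[i][j] += h
    return out


E, nu = 8.96e10, 0.34            # copper, Table I
eps = [[5e-4, 1e-4, 0.0],        # hypothetical symmetric strain state
       [1e-4, -2e-4, 0.0],
       [0.0, 0.0, 1e-4]]

tau = isotropic_stress(eps, E, nu)
W27 = energy_eq27(tau, eps)
W28 = energy_eq28b(eps, E, nu)

# Eq. (29): tau_11 should equal dW/d(eps_11); use a central difference.
h = 1e-8
dW = (energy_eq28b(perturbed(eps, 0, 0, h), E, nu)
      - energy_eq28b(perturbed(eps, 0, 0, -h), E, nu)) / (2.0 * h)

print(f"W from Eq. (27)  = {W27:.6e} J/m^3")
print(f"W from Eq. (28b) = {W28:.6e} J/m^3")
print(f"tau_11 = {tau[0][0]:.6e} Pa, dW/d(eps_11) = {dW:.6e} Pa")
```

Because W is quadratic in the strains, the central difference reproduces the derivative to within rounding error.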

VI. EQUILIBRIUM AND THE FORMULATION OF BOUNDARY VALUE PROBLEMS

External agencies usually deform a solid by two distinct types of loadings: (a) surface tractions and (b) body forces. Surface tractions act by virtue of the application of normal and shearing stresses to the surface of the solid, while body forces act upon the interior, distributed mass of the solid. For example, a box resting on a table is subjected to (normal) surface traction forces at the interface between tabletop and box bottom, whereas gravity causes forces to be exerted upon the contents of the box.

Consider a solid body B bounded by the surface S in a state of static equilibrium. Then at every internal point of B, these partial differential equations must be satisfied:

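$$\begin{aligned}
\frac{\partial \tau_{xx}}{\partial x} + \frac{\partial \tau_{xy}}{\partial y} + \frac{\partial \tau_{xz}}{\partial z} + F_x &= 0 \\
\frac{\partial \tau_{yx}}{\partial x} + \frac{\partial \tau_{yy}}{\partial y} + \frac{\partial \tau_{yz}}{\partial z} + F_y &= 0 \\
\frac{\partial \tau_{zx}}{\partial x} + \frac{\partial \tau_{zy}}{\partial y} + \frac{\partial \tau_{zz}}{\partial z} + F_z &= 0,
\end{aligned} \tag{30}$$

where τxy = τyx, τyz = τzy, τzx = τxz, and F = Fx ex + Fy ey + Fz ez is the body force vector per unit volume.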

The admissible boundary conditions associated with Eq. (30) may be stated in the form:

$$\mathbf{T} \equiv (T_1, T_2, T_3) \ \text{on } S_1 \quad \text{and} \quad \mathbf{u} \equiv (u, v, w) \ \text{on } S_2, \tag{31}$$

where T is the surface traction vector [see Eq. (4)], u is the displacement vector, and S = S1 + S2 denotes the bounding surface of the solid.

The solution of a problem in (three-dimensional) elasticity theory requires the determination of

the displacement vector field u,
the stress tensor field τij, and
the strain tensor field εij
in B. (32)

This solution is required to satisfy the equations of equilibrium [Eq. (30)], the equations of compatibility [Eq. (23)], the strain-displacement relations [Eq. (20)], and the stress–strain relations [Eq. (26) or (24)], as well as the boundary conditions [Eq. (31)]. This is a formidable task, even for relatively simple geometries and boundary conditions, and the exact or approximate solution requires extensive use of advanced analytical as well as numerical mathematical methods in most cases.


VII. EXAMPLES

A. Example A

We consider an elastic cylinder of length L with an arbitrary cross section. The cylinder is composed of a linearly elastic, isotropic material with Young’s modulus E and Poisson’s ratio ν. The cylinder is inserted into a perfectly fitting cavity in a rigid medium, as shown in Fig. 6, and subjected to a uniformly distributed normal stress τzz = T on the free surface at z = L. We assume that the bottom of the cylinder remains in smooth contact with the rigid medium, and that the lateral surfaces between the cylinder and the rigid medium are smooth, thus capable of transmitting normal surface tractions only. Moreover, normal displacements over the lateral surfaces are prevented. Thus, we have the displacement field

$$u = v = 0, \qquad w = (\delta/L)\,z,$$

where δ is the z displacement of the top of the cylinder. With the aid of Eq. (20), we obtain the strain field

$$\varepsilon_{xx} = \varepsilon_{yy} = 0; \qquad \varepsilon_{zz} = \delta/L; \qquad \varepsilon_{ij} \equiv 0, \quad i \neq j. \tag{33}$$

FIGURE 6 Transversely constrained cylinder.

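In view of Eqs. (26) and (33), we have

$$\tau_{xx} - \nu(\tau_{yy} + \tau_{zz}) = 0,$$

$$\tau_{yy} - \nu(\tau_{xx} + \tau_{zz}) = 0,$$

and

$$\tau_{zz} - \nu(\tau_{yy} + \tau_{xx}) = E(\delta/L),$$

and therefore,

$$\begin{aligned}
\tau_{xx} = \tau_{yy} &= \frac{\nu}{1 - \nu}\,\tau_{zz} \\
\tau_{zz} &= \frac{E\delta}{L}\,\frac{(1 - \nu)}{(1 - 2\nu)(1 + \nu)} = T, \\
\tau_{ij} &= 0 \quad \text{for } i \neq j.
\end{aligned} \tag{34}$$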

In the case of a copper cylinder, we have (see Table I) ν = 0.34, E = 8.96 × 10^10 Pa; and for an axial strain εzz = δ/L = 0.0005, we readily obtain

τxx = τyy = 35.53 × 10^6 Pa

and

τzz = 68.9 × 10^6 Pa.

Thus, when we compress the copper cylinder with a stress τzz = T = −68.9 × 10^6 Pa, there will be induced a lateral compressive stress τxx = τyy = −35.53 × 10^6 Pa. We note that the strain field [Eq. (33)] satisfies the equations of compatibility [Eq. (23)] and the stress field [Eq. (34)] satisfies the equations of equilibrium [Eq. (30)] provided the body force vector field F vanishes (or is negligible).
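The copper figures quoted above follow directly from Eq. (34). The short Python sketch below reproduces the arithmetic; the function name is an assumption made for illustration, and the inputs are the values stated in the text.

```python
def constrained_cylinder_stresses(E, nu, axial_strain):
    """Evaluate Eq. (34) for the laterally constrained cylinder.

    Returns (tau_zz, tau_lateral), where tau_lateral = tau_xx = tau_yy.
    """
    tau_zz = E * axial_strain * (1.0 - nu) / ((1.0 - 2.0 * nu) * (1.0 + nu))
    tau_lateral = nu / (1.0 - nu) * tau_zz
    return tau_zz, tau_lateral


E_CU, NU_CU = 8.96e10, 0.34      # copper, Table I
tau_zz, tau_xx = constrained_cylinder_stresses(E_CU, NU_CU, 0.0005)

# Lateral strain from Eq. (26); it should vanish because of the constraint.
eps_xx = (tau_xx - NU_CU * (tau_xx + tau_zz)) / E_CU

print(f"tau_zz          = {tau_zz / 1e6:.2f} x 10^6 Pa")
print(f"tau_xx = tau_yy = {tau_xx / 1e6:.2f} x 10^6 Pa")
print(f"eps_xx          = {eps_xx:.2e}")
```

The computed stresses agree with the quoted values to within rounding in the last digit. The lateral constraint raises the effective axial stiffness above E by the factor (1 − ν)/[(1 − 2ν)(1 + ν)], about 1.54 for copper.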

B. Example B

We consider the case of plane, elastic pure bending (or flexure) of a beam by end couples as shown in Fig. 7. In the reference state, the z axis and the beam longitudinal axis coincide. The cross section of the beam (normal to the z axis) is constant and symmetrical with respect to the y axis. Its area is denoted by the symbol A, and the centroid of A is at (0, 0, z). The beam is acted upon by end moments Mx = M such that

$$M_x = \int_A \tau_{zz}\,y\,dA = M$$

and

$$M_y = \int_A \tau_{zz}\,x\,dA = 0.$$


FIGURE 7 Pure bending of a beam.

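The present situation suggests the stress field

$$\begin{bmatrix}
\tau_{xx} & \tau_{xy} & \tau_{xz} \\
\tau_{yx} & \tau_{yy} & \tau_{yz} \\
\tau_{zx} & \tau_{zy} & \tau_{zz}
\end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & \dfrac{My}{I}
\end{bmatrix}, \tag{35}$$

where $I = \int_A y^2\,dA$, on account of physical reasoning and (elementary) Euler-Bernoulli beam theory. Upon substitution of Eq. (35) into Eq. (26), and in view of Eq. (20), we obtain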

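$$\begin{aligned}
\varepsilon_{xx} &= -\frac{\nu}{E}\,\tau_{zz} = -\frac{\nu}{E}\,\frac{M}{I}\,y = \frac{\partial u}{\partial x} \\
\varepsilon_{yy} &= -\frac{\nu}{E}\,\tau_{zz} = -\frac{\nu}{E}\,\frac{M}{I}\,y = \frac{\partial v}{\partial y} \\
\varepsilon_{zz} &= \frac{\tau_{zz}}{E} = \frac{M}{EI}\,y = \frac{\partial w}{\partial z},
\end{aligned} \tag{36}$$

and all shearing strains vanish.

We now integrate the partial differential equations in (36), subject to the following boundary conditions: At (x, y, z) = (0, 0, 0) we require u = v = w = 0 and

$$\frac{\partial u}{\partial z} = \frac{\partial v}{\partial z} = \frac{\partial u}{\partial y} = 0.$$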

Thus, the beam displacement field is given by

$$\begin{aligned}
u &= -\frac{M\nu}{EI}\,xy \\
v &= -\frac{M}{2EI}\left[z^2 + \nu(y^2 - x^2)\right] \\
w &= \frac{M}{EI}\,yz.
\end{aligned} \tag{37}$$

We note that the strain field (36) satisfies the equation of compatibility (23) and the stress field (35) satisfies the equations of equilibrium (30) provided the body force vector field F vanishes (or is negligible).
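As a consistency check, the displacement field (37) can be differentiated numerically and compared with the strain field (36). The sketch below assumes the usual small-strain definition, εij = ½(∂ui/∂xj + ∂uj/∂xi), for the strain-displacement relations of Eq. (20); the beam data (M, I) and the sample point are hypothetical values chosen only for illustration.

```python
def beam_displacement(x, y, z, M, E, I, nu):
    """Displacement field of Eq. (37) for pure bending."""
    u = -(M * nu / (E * I)) * x * y
    v = -(M / (2.0 * E * I)) * (z ** 2 + nu * (y ** 2 - x ** 2))
    w = (M / (E * I)) * y * z
    return (u, v, w)


def strain_tensor(point, field, h=1e-4):
    """Small-strain tensor (du_i/dx_j + du_j/dx_i)/2 by central differences."""
    grad = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        u_plus, u_minus = field(*plus), field(*minus)
        for i in range(3):
            grad[i][j] = (u_plus[i] - u_minus[i]) / (2.0 * h)
    return [[0.5 * (grad[i][j] + grad[j][i]) for j in range(3)]
            for i in range(3)]


# Hypothetical beam: steel constants from Table I, assumed M and I.
M, E, I, nu = 100.0, 20.7e10, 1.0e-8, 0.29
point = (0.01, 0.005, 0.2)       # sample point (x, y, z) in metres

eps = strain_tensor(point, lambda x, y, z: beam_displacement(x, y, z, M, E, I, nu))
eps_zz_expected = M * point[1] / (E * I)     # third line of Eq. (36)

print(f"eps_zz (numerical) = {eps[2][2]:.6e}")
print(f"eps_zz (Eq. 36)    = {eps_zz_expected:.6e}")
print(f"eps_xx / eps_zz    = {eps[0][0] / eps[2][2]:.4f}")
```

Since the displacement components are quadratic in the coordinates, central differences are exact apart from rounding, so the shearing strains come out as zero to machine precision.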

With reference to Fig. 7, in the reference configuration, the top surface of the beam is characterized by the plane y = b. Subsequent to deformation, the top surface of the beam is characterized by

$$v = -\frac{M}{2EI}(z^2 - \nu x^2) - \frac{\nu M b^2}{2EI}, \tag{38}$$

and for (x, y, z) = (0, b, 0) we have

$$v(0, b, 0) = -\frac{\nu M b^2}{2EI}.$$

We now write Eq. (38) in the form

$$V = v + \frac{\nu M b^2}{2EI} = -\frac{M}{2EI}(z^2 - \nu x^2), \tag{39}$$

and we note that V denotes the deflection of the (originally) plane top surface of the beam. The contour lines V = constant of this saddle surface are shown in Fig. 8a. We note that the contour lines consist of two families of hyperbolas, each having two branches. The asymptotes are straight lines characterized by V = 0, so that $\tan\alpha = z/x = \sqrt{\nu}$.

An experimental technique called holographic interferometry is uniquely suited to measure sufficiently small deformations of a beam loaded as shown in Fig. 7. In Fig. 8b we show a double-exposure hologram of the deformed top surface of a beam loaded as shown in Fig. 7. This hologram was obtained by the application of a two (light) beam technique, utilizing Kodak Holographic 120-02 plates. The laser was a 10-mW He-Ne laser, 632.8 nm, with beam ratio 4:1. The fringe lines in Fig. 8b correspond to the contour lines of Fig. 8a. The close correspondence between theory and experiment is readily observed. We also note that this technique results in the nondestructive, experimental determination of Poisson’s ratio ν of the beam.
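The relation tan α = √ν is what makes the fringe pattern usable as a measurement of Poisson's ratio. A minimal Python sketch of the conversion in both directions follows; the function names are illustrative assumptions, and no measured angle from Fig. 8b is implied.

```python
import math


def asymptote_angle_deg(nu):
    """Angle alpha (degrees) of the V = 0 asymptotes, from tan(alpha) = sqrt(nu)."""
    return math.degrees(math.atan(math.sqrt(nu)))


def poisson_from_angle(alpha_deg):
    """Poisson's ratio recovered from a measured asymptote angle (degrees)."""
    return math.tan(math.radians(alpha_deg)) ** 2


# Asymptote angles predicted for some of the Table I materials.
for name, nu in [("Aluminum", 0.34), ("Glass", 0.25), ("Steel", 0.29)]:
    print(f"{name:8s} nu = {nu:.2f}  alpha = {asymptote_angle_deg(nu):.2f} deg")
```

Because ν depends on tan²α, an error in the measured angle is amplified in the inferred ratio, so the asymptote direction has to be read from the hologram with some care.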

FIGURE 8 (a) Contour lines V = constant. (b) Double-exposure hologram of deformed plate surface. (Holographic work was performed by P. Malyak in the laboratory of D. P. Malone, Department of Electrical Engineering, State University of New York at Buffalo.) [This hologram is taken from Reismann, H., and Pawlik, P. S. (1980). “Elasticity: Theory and Applications,” Wiley (Interscience), New York.]

C. Example C

We wish to find the displacement, stress field, and strain field in a spherical shell of thickness (b − a) > 0 subjected to uniform, internal fluid (or gas) pressure p. The shell is bounded by concentric spherical surfaces with outer radius r = b and inner radius r = a, and we designate the center of the shell by O. In view of the resulting point-symmetric displacement field, there will be no shear stresses acting upon planes passing through O and upon spherical surfaces a ≤ r ≤ b. Consequently, at each point of the shell interior, the principal stresses are radial tension (or compression) τrr and circumferential tension (or compression) τθθ, the latter having equal magnitude in all circumferential directions.

To obtain the pertinent equation of equilibrium, we consider a volume element (free body) bounded by two pairs of radial planes passing through O, each pair subtending a (small) angle Δθ, and two spherical surfaces with radii r and r + Δr. Invoking the condition of (radial) static equilibrium, we obtain

$$(\tau_{rr} + \Delta\tau_{rr})\left[(r + \Delta r)\,\Delta\theta\right]^2 - \tau_{rr}(r\,\Delta\theta)^2 = 2\tau_{\theta\theta}\left(r + \frac{\Delta r}{2}\right)\Delta r\,(\Delta\theta)^2.$$

We now divide this equation by (Δθ)² and Δr, and then take the limit as Δr → 0, so that Δτrr/Δr → dτrr/dr. The result of these manipulations is the stress equation of equilibrium

$$\frac{d\tau_{rr}}{dr} + \frac{2}{r}(\tau_{rr} - \tau_{\theta\theta}) = 0. \tag{40}$$

In view of the definition of strain in Section III, the strain-displacement relations for the present problem are

$$\varepsilon_{rr} = \frac{(dr + du) - dr}{dr} = \frac{du}{dr}, \qquad \varepsilon_{\theta\theta} = \frac{2\pi(r + u) - 2\pi r}{2\pi r} = \frac{u}{r}, \tag{41}$$

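where the letter u denotes radial displacement. For our present purpose, we now write Hooke’s law (26) in the following form:

$$\begin{aligned}
\tau_{rr} &= (\lambda + 2G)\,\varepsilon_{rr} + 2\lambda\,\varepsilon_{\theta\theta} \\
\tau_{\theta\theta} &= 2(\lambda + G)\,\varepsilon_{\theta\theta} + \lambda\,\varepsilon_{rr},
\end{aligned} \tag{42}$$

where

$$\lambda = \frac{E\nu}{(1 + \nu)(1 - 2\nu)} = \frac{2G\nu}{(1 - 2\nu)}.$$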

If we substitute Eq. (41) into Eq. (42) and then substitute the resulting equations into Eq. (40), we obtain the displacement equation of equilibrium

$$\frac{d^2 u}{dr^2} + \frac{2}{r}\,\frac{du}{dr} - \frac{2}{r^2}\,u = 0. \tag{43}$$

The spherical shell has a free boundary at r = b and is stressed by internal gas (or liquid) pressure acting upon the spherical surface r = a. Consequently, the boundary conditions are

$$\tau_{rr}(a) = -p, \tag{44}$$

where p ≥ 0 and τrr(b) = 0. The solution of the differential equation (43) subject to the boundary conditions (44) is

$$u = \frac{p a^3 r}{3K(b^3 - a^3)} + \frac{p a^3 b^3}{4G(b^3 - a^3)\,r^2}, \qquad a \le r \le b, \tag{45}$$

where K = E/[3(1 − 2ν)] = (3λ + 2G)/3 is the modulus of volume expansion, or bulk modulus. Upon substitution of Eq. (45) into Eq. (41), we obtain the strain field

$$\begin{aligned}
\varepsilon_{rr} &= \frac{p a^3}{3K(b^3 - a^3)} - \frac{p a^3 b^3}{2G(b^3 - a^3)\,r^3} \\
\varepsilon_{\theta\theta} &= \frac{p a^3}{3K(b^3 - a^3)} + \frac{p a^3 b^3}{4G(b^3 - a^3)\,r^3},
\end{aligned} \tag{46}$$

and upon substitution of Eq. (46) into Eq. (42), we obtain the stress field


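$$\begin{aligned}
\tau_{rr} &= \frac{p a^3}{(b^3 - a^3)}\left[1 - \left(\frac{b}{r}\right)^3\right] = \sigma_2 = \sigma_3 \le 0 \\
\tau_{\theta\theta} &= \frac{p a^3}{(b^3 - a^3)}\left[1 + \frac{1}{2}\left(\frac{b}{r}\right)^3\right] = \sigma_1 \ge 0.
\end{aligned} \tag{47}$$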

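We also note the following relations:

$$\tau_{rr} + 2\tau_{\theta\theta} = \frac{3 p a^3}{(b^3 - a^3)}, \qquad \varepsilon_{rr} + 2\varepsilon_{\theta\theta} = \frac{p a^3}{K(b^3 - a^3)}, \qquad \frac{\tau_{rr} + 2\tau_{\theta\theta}}{\varepsilon_{rr} + 2\varepsilon_{\theta\theta}} = 3K. \tag{48}$$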

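With reference to Eq. (16), the octahedral shear stress is

$$\tau_0 = \frac{1}{3}\left[(\sigma_1 - \sigma_2)^2 + (\sigma_2 - \sigma_3)^2 + (\sigma_3 - \sigma_1)^2\right]^{1/2} = \frac{\sqrt{2}}{2}\,\frac{p a^3}{(b^3 - a^3)}\left(\frac{b}{r}\right)^3, \tag{49}$$

and the maximum shear stress (as a function of r) is

$$\tau_{\max} = \frac{1}{2}(\sigma_1 - \sigma_3) = \frac{3}{4}\,\frac{p a^3}{(b^3 - a^3)}\left(\frac{b}{r}\right)^3, \tag{50}$$

and we note that for the present case we have $\tau_0/\tau_{\max} = (2\sqrt{2})/3 \cong 0.9428$ and [see Eq. (19)]

$$1 < \sqrt{\frac{3}{2}}\,\frac{\tau_0}{\tau_{\max}} = \frac{2}{\sqrt{3}}. \tag{51}$$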

We now apply the failure criterion due to Hencky-Mises (see Section IV): Yielding will occur when $3\tau_0 = \sqrt{2}\,Y$, where Y denotes the yield stress in simple tension of the shell material. Upon application of this criterion and with the aid of Eq. (49), we obtain

$$p = \frac{2}{3}\,\frac{(b^3 - a^3)}{a^3}\left(\frac{r}{b}\right)^3 Y, \tag{52}$$

and the smallest value of p results when r = a. Thus we conclude that the Hencky-Mises failure criterion predicts yielding on the surface r = a when

$$p = \frac{2}{3}\left[1 - \left(\frac{a}{b}\right)^3\right] Y. \tag{53}$$

The criterion due to Tresca (see Section IV) predicts failure when τmax = Y/2. With the aid of Eq. (50), this results again in Eq. (53), and we conclude that for the present example, the failure criteria of Hencky-Mises and Tresca predict the same pressure at incipient failure of the shell given by the formula (53).
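The statements above can be verified numerically. The sketch below evaluates the stress field (47) and the yield pressure (53), then checks the boundary conditions (44) and both yield criteria at r = a. The shell radii are hypothetical values chosen for illustration, Y is the Table II value for low-carbon structural steel, and the function names are assumptions of this example.

```python
import math


def shell_stresses(r, a, b, p):
    """Radial and circumferential stresses of Eq. (47) at radius r."""
    c = p * a ** 3 / (b ** 3 - a ** 3)
    tau_rr = c * (1.0 - (b / r) ** 3)
    tau_tt = c * (1.0 + 0.5 * (b / r) ** 3)
    return tau_rr, tau_tt


def yield_pressure(a, b, Y):
    """Internal pressure at incipient yielding, Eq. (53)."""
    return (2.0 / 3.0) * (1.0 - (a / b) ** 3) * Y


def octahedral_shear(s1, s2, s3):
    """Octahedral shear stress from three principal stresses, as in Eq. (49)."""
    return math.sqrt((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2) / 3.0


a, b = 0.10, 0.12        # hypothetical inner and outer radii, m
Y = 248e6                # Pa, low-carbon structural steel (Table II)

p = yield_pressure(a, b, Y)
tau_rr_a, tau_tt_a = shell_stresses(a, a, b, p)   # inner surface
tau_rr_b, _ = shell_stresses(b, a, b, p)          # outer surface

# The circumferential stress acts equally in both circumferential directions.
tau_0 = octahedral_shear(tau_tt_a, tau_tt_a, tau_rr_a)
tau_max = 0.5 * (tau_tt_a - tau_rr_a)

print(f"yield pressure p = {p / 1e6:.2f} x 10^6 Pa")
print(f"tau_rr(a) / p    = {tau_rr_a / p:.4f}")
print(f"3 tau_0 / Y      = {3.0 * tau_0 / Y:.4f}  (sqrt(2) at yield)")
print(f"tau_max / Y      = {tau_max / Y:.4f}  (1/2 at yield)")
```

Both criteria are met at the same pressure because two of the three principal stresses coincide, which is the case in which the bound of Eq. (51) is attained.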

SEE ALSO THE FOLLOWING ARTICLES

ELASTICITY, RUBBERLIKE • FRACTURE AND FATIGUE • MECHANICS, CLASSICAL • MECHANICS OF STRUCTURES • NUMERICAL ANALYSIS • STRUCTURAL ANALYSIS, AEROSPACE

BIBLIOGRAPHY

Boresi, A. P., and Chong, K. P. (1987). “Elasticity in Engineering Mechanics,” Elsevier, Amsterdam.

Brekhovskikh, L., and Goncharov, V. (1985). “Mechanics of Continua and Wave Dynamics,” Springer-Verlag, Berlin and New York.

Filonenko-Borodich, M. (1963). “Theory of Elasticity,” Peace Publishers, Moscow.

Fung, Y. C. “Foundations of Solid Mechanics,” Prentice-Hall, Englewood Cliffs, NJ.

Green, A. E., and Zerna, W. (1968). “Theoretical Elasticity,” 2nd ed., Oxford Univ. Press, London and New York.

Landau, L. D., and Lifshitz, E. M. (1970). “Theory of Elasticity” (Vol. 7 of Course of Theoretical Physics), 2nd ed., Pergamon, Oxford.

Leipholz, H. (1974). “Theory of Elasticity,” Noordhoff-International Publications, Leyden, The Netherlands.

Lur’e, A. I. (1964). “Three-Dimensional Problems of the Theory of Elasticity,” Wiley (Interscience), New York.

Novozhilov, V. V. (1961). “Theory of Elasticity,” Office of Technical Services, U.S. Department of Commerce, Washington, D.C.

Parkus, H. (1968). “Thermoelasticity,” Ginn (Blaisdell), Boston.

Parton, V. Z., and Perlin, P. I. (1984). “Mathematical Methods of the Theory of Elasticity,” Vols. I and II, Mir, Moscow.

Reismann, H., and Pawlik, P. S. (1974). “Elastokinetics,” West, St. Paul, Minn.

Reismann, H., and Pawlik, P. S. (1980). “Elasticity: Theory and Applications,” Wiley (Interscience), New York.

Solomon, L. (1968). “Élasticité Linéaire,” Masson, Paris.

Southwell, R. V. (1969). “An Introduction to the Theory of Elasticity,” Dover, New York.

Timoshenko, S. P., and Goodier, J. N. (1970). “Theory of Elasticity,” 3rd ed., McGraw-Hill, New York.


Encyclopedia of Physical Science and Technology EN005I-210 June 15, 2001 20:29

Electromagnetic Compatibility

J. F. Dawson
A. C. Marvin
C. A. Marshman
University of York

I. Sources of Electromagnetic Interference
II. Effects of Interference
III. Interference Coupling Paths and Their Control
IV. Design for Electromagnetic Compatibility
V. Electromagnetic Compatibility Regulations and Standards
VI. Measurement and Instrumentation

GLOSSARY

Antenna factor The factor by which the received voltage at a specified load is multiplied to determine the received field at the antenna.

Common-mode current/voltage The component of current/voltage which exists equally and in the same direction on a pair of conductors or multiconductor bundle, i.e., the return is via a common ground connection (cf. differential mode).

Crosstalk Unintentional transfer of energy from one circuit to another by inductive or capacitive coupling or by means of a common impedance (e.g., in a common return conductor).

Differential mode current/voltage The component of current/voltage which exists equally and in opposite directions on a pair of conductors (cf. common mode).

Shielding effectiveness The ratio of electric or magnetic field strength without a shield to that with the shield present (larger numbers mean better shielding).

Skin depth The depth of the layer in which radiofrequency current flows on the surface of a conductor.

Skin effect The confinement, at high frequencies, of current to a thin layer close to the surface of a conductor.

Source The source of electromagnetic interference.

Victim A circuit or system affected by electromagnetic interference.

ELECTROMAGNETIC COMPATIBILITY (EMC) is the ability of electrical and electronic systems to coexist with each other without causing or suffering from malfunction due to electromagnetic interference (EMI) from each other or from natural causes. As we rely more and more upon electronic systems for the day-to-day operation of our factories, houses, and transport systems, the need to achieve electromagnetic compatibility has increased in importance. This has resulted in the design, analysis, and measurement techniques discussed in this article.


The limits on electromagnetic (EM) emissions from equipment, and the levels of EMI that equipment must tolerate in its operating environment, are determined by standards organizations, in particular the International Electrotechnical Commission (IEC) and its CISPR committee (Comité International Spécial des Perturbations Radioélectriques). The guidelines laid down in the standards may be enforced through regulations.

I. SOURCES OF ELECTROMAGNETIC INTERFERENCE

A. Natural Sources

1. Electrostatic Discharge

When differing materials are in sliding contact, one material may lose electrons to the other; this is the triboelectric effect. This results in a buildup of electrical charge. The electric field due to the charge can cause electrical breakdown of the air (or other insulating material) surrounding the source of the charge, resulting in an electrostatic discharge (ESD).

The rate of charge transfer depends on the materials in contact. Electrostatic discharge can be reduced by using materials which are closely matched in the triboelectric series or by using materials with a low conductivity which allow the charge to leak away before it accumulates sufficiently to discharge due to a breakdown of insulation.

A common cause of electrostatic discharge is the use of synthetic clothing and furniture. Electric charge is induced on the human body due to friction between clothing or shoes and furniture or floor coverings; the body capacitance (a few hundred picofarads) can charge to voltages as high as 15 kV. When the body comes in close proximity to electronic equipment a spark between the body and metal on the equipment may occur. This can result in a large current flow with a very fast rise time (<1 nsec) and a duration of about 100 nsec, which may disrupt or damage the electronic equipment as well as radiating electromagnetic energy which may disturb nearby equipment. The fast-rise-time, initial peak is due to the discharge of the finger and arm, while the slower, secondary peak is due to the discharge of the remainder of the body (Fig. 1).
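To give a feel for the magnitudes involved, the following Python sketch estimates the charge and energy stored on the body capacitance and a crude mean discharge current. The 200 pF capacitance is an assumed value within the "few hundred picofarads" quoted above, and dividing the charge by the 100 nsec duration is only an order-of-magnitude estimate, not a model of the waveform in Fig. 1.

```python
def stored_charge_and_energy(capacitance_f, voltage_v):
    """Charge Q = C V and energy W = C V^2 / 2 stored on a capacitance."""
    charge = capacitance_f * voltage_v
    energy = 0.5 * capacitance_f * voltage_v ** 2
    return charge, energy


BODY_CAPACITANCE = 200e-12   # F, assumed value ("a few hundred picofarads")
BODY_VOLTAGE = 15e3          # V, upper figure quoted in the text
DISCHARGE_DURATION = 100e-9  # s, approximate duration quoted in the text

charge, energy = stored_charge_and_energy(BODY_CAPACITANCE, BODY_VOLTAGE)
mean_current = charge / DISCHARGE_DURATION   # crude average over the event

print(f"stored charge = {charge * 1e6:.1f} microcoulombs")
print(f"stored energy = {energy * 1e3:.1f} millijoules")
print(f"mean current  = {mean_current:.0f} A (order of magnitude only)")
```

Under these assumptions the stored energy is only a few tens of millijoules, yet the short duration implies currents of tens of amperes, which is why the event can damage sensitive components.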

Electrostatic discharge can also occur on aircraft due to air friction and on satellites due to direct bombardment with charged particles.

2. Lightning

Lightning is the result of the ionization of the air due to charge accumulated in clouds. This is thought to be due to a triboelectric effect between ice crystals. Lightning has a much larger energy than the electrostatic discharge phenomenon described above.

Lightning discharges have a rise time of the order of 1 µsec and decay in about 50 µsec. Lightning strikes can induce large currents (up to 100 kA) and voltages (up to 100 kV) in conductors and may therefore be a source of electromagnetic interference to electronic systems.

FIGURE 1 Approximate current waveform for electrostatic discharge.

3. Solar Storms

Solar storms can induce large currents in power networks. This low-frequency interference phenomenon resulted in a blackout of the Hydro-Quebec power grid in Canada during the 1989 solar maximum; it has been of much interest to power companies worldwide, who have spent considerable resources to harden their distribution systems to prevent similar occurrences during the 2000 solar maximum.

B. Man-Made Sources

1. Intentional Sources

Radio and radar transmitters, industrial, scientific, and medical (ISM) equipment using high-power radiofrequency energy, and microwave ovens and other equipment which produces significant radiofrequency fields intentionally are also sources of interference for systems which may be susceptible to their emissions.

The proliferation of mobile phones is a significant source of interference. Their use must be controlled near sensitive systems such as medical monitoring equipment and in aircraft (to prevent interference with navigation aids). The increase in wireless networking for portable computing devices is likely to increase this problem.


2. Unintentional Sources

Most electrical and electronic equipment has the potential to cause electromagnetic interference. Particular sources include:

• Electrical circuit breakers, contactors, and relays, which are likely to draw arcs as they open or close
• Arcs such as those created by welding equipment and furnaces and discharges such as those in fluorescent lighting
• Brushgear in electrical machinery
• Solid state switching circuits ranging from logic circuits to switched-mode power converters
• Radiofrequency oscillators (including unintentional ones).

II. EFFECTS OF INTERFERENCE

A. Interference with Radio Communications and Navigation Aids

EMC began with the need to prevent electrical noise generated by trams (trolley cars) and automobile ignition systems from interfering with broadcast radio transmission. The main factor driving limits on electromagnetic emissions in EMC standards is still the prevention of radio interference. The interference from mobile phones and portable electronic equipment has become a problem in certain sensitive environments. In aircraft, where sensitive radio receivers are used for communications and navigation, the use of portable electronic equipment is prohibited during critical phases of flight (i.e., takeoff and landing).

B. Malfunction of Electronic Systems

Equipment which does not interfere with radio reception is unlikely to interfere with other electronic systems; however, many of the sources of interference described above produce a large enough disturbance to interfere with the normal operation of electronic systems.

1. Demodulation and Intermodulation

Radiofrequency interference is often outside the passband of a circuit and does not directly interfere with the wanted signal. However, all active components have a degree of nonlinearity, which means that radiofrequency interference which enters a circuit can be demodulated to produce signals within the passband of the circuit. This is the most common cause of interference effects in analog circuits.

If more than one interfering frequency is present, sum and difference frequencies of the fundamental components and their harmonics (intermodulation products) are generated by any nonlinearities, which can result in new frequency components that may be within the passband of the circuit.
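The way in which two out-of-band signals can land inside a passband is easy to enumerate. The Python sketch below lists the intermodulation products |m f1 + n f2| up to a chosen order that fall within a given band. The two interfering frequencies, the audio passband, and the function name are hypothetical values chosen for illustration.

```python
def intermod_products(f1, f2, max_order, passband):
    """List intermodulation products that fall inside a passband.

    Each result is a tuple (order, m, n, f) with f = |m*f1 + n*f2| and
    order = |m| + |n| <= max_order. Only m >= 0 is scanned, because
    (m, n) and (-m, -n) give the same frequency.
    """
    low, high = passband
    found = set()
    for m in range(0, max_order + 1):
        for n in range(-max_order, max_order + 1):
            order = abs(m) + abs(n)
            if order == 0 or order > max_order:
                continue
            if m == 0 and n < 0:
                continue  # duplicate of the (0, +n) harmonic
            f = abs(m * f1 + n * f2)
            if low <= f <= high:
                found.add((order, m, n, f))
    return sorted(found)


# Two hypothetical interferers 10 kHz apart near 100 MHz, audio passband.
F1_HZ, F2_HZ = 100_000_000, 100_010_000
AUDIO_BAND_HZ = (20, 20_000)

products = intermod_products(F1_HZ, F2_HZ, max_order=3, passband=AUDIO_BAND_HZ)
for order, m, n, f in products:
    print(f"order {order}: {m}*f1 {n:+d}*f2 = {f} Hz")
```

In this illustration neither interferer is anywhere near the audio band, but their second-order difference frequency is, which is the mechanism described in the paragraph above.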

2. Data Corruption

The presence of interference in digital circuits can induce timing jitter (which may cause failure due to violation of timing constraints) and, at higher levels, direct corruption of data when the noise margin of the logic is exceeded.

3. Damage

High levels of interference can cause damage to components which may result in their failure or reduced reliability (latent failures). Electrostatic discharge is one of the most common causes of damage to electronic components both in their handling (prior to manufacture) and in service. The high-intensity radiated fields (HIRF) in the vicinity of radio, television, and radar transmitters can induce sufficient energy in electronic systems to cause damage; this is of particular concern to the aerospace industry, where sensitive electronic systems must operate in the vicinity of high-power radars and radio transmitters.

III. INTERFERENCE COUPLING PATHS AND THEIR CONTROL

A. Conducted Interference

At frequencies below 30 MHz interference can propagate efficiently along power circuits within buildings and other installations. Above 30 MHz the attenuation in electrical wiring limits the propagation of interference and direct radiation often provides a lower loss path.

Conducted interference can be resolved into differential and common-mode components. Figure 2 shows two grounded enclosures linked by two wires. A source in one enclosure drives a load in the other and, as expected, a current Idm (the differential mode current) flows in each wire in opposing directions. No current flows in the ground. Figure 3 shows a circuit in which a common-mode current flows equally, in the same direction, in each of two conductors, returning through the common ground connection. The flow of common-mode current is often caused by external fields inducing current in the ground loop or by the two wires having different impedances to ground, resulting in the generation of an EMF which can drive a current around the ground loop. Potential differences between different grounding points, due to earth-leakage currents and lightning strikes, can also be a source of common-mode current. In practice, both differential and common-mode currents are present in most cases (Fig. 4).

FIGURE 2 The flow of differential mode current in a two-wire connection between grounded enclosures.

FIGURE 3 The flow of common-mode current in a two-wire connection between grounded enclosures.
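The decomposition into the two modes is simple arithmetic once a sign convention is fixed. The sketch below assumes that both wire currents are measured in the same reference direction (from source enclosure to load enclosure) and that the common-mode current is quoted per conductor, so that the ground return carries twice that value. The function names, this convention, and the example currents are assumptions made for illustration.

```python
def decompose_modes(i_wire1, i_wire2):
    """Split two wire currents into differential and common-mode parts.

    Assumed convention: i_wire1 = i_cm + i_dm and i_wire2 = i_cm - i_dm,
    with both currents measured in the same reference direction.
    """
    i_dm = (i_wire1 - i_wire2) / 2.0
    i_cm = (i_wire1 + i_wire2) / 2.0
    return i_dm, i_cm


def ground_return_current(i_wire1, i_wire2):
    """Net current that must return through the common ground connection."""
    return i_wire1 + i_wire2


# Hypothetical measurement: 1.05 A out on wire 1, 0.95 A back on wire 2.
i1, i2 = 1.05, -0.95
i_dm, i_cm = decompose_modes(i1, i2)

print(f"differential mode current   = {i_dm:.3f} A")
print(f"common-mode current (each)  = {i_cm:.3f} A")
print(f"current returning in ground = {ground_return_current(i1, i2):.3f} A")
```

A purely differential signal gives equal and opposite wire currents and no ground current; any imbalance between the two wire currents shows up as a common-mode component.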

1. Sources

Switch mode power supplies, inverters, and speed controllers generate switching transients, which may appear as conducted interference in other systems. Linear power supplies generate harmonic currents, which increase losses in the distribution system but do not directly affect other electronic systems.

Electrical switchgear is a source of conducted interference due to the arcing which occurs when contacts are operated. The rapid establishment and breaking of an arc (showering arc) which can occur when switching loads with inductive and capacitive elements can result in a burst of short (5 nsec rise time, 50 nsec width), high-voltage (several kilovolts) transients, lasting for a few tens of milliseconds, known as a fast transient burst. Damped oscillatory transients at frequencies of 100 kHz to 1 MHz can also be generated during contact operation.

Brushgear on electrical machines can cause broadband conducted electrical noise.

Induced currents in ground, power supply, and signal wiring due to lightning and electrostatic discharge can cause significant interference and damage to electronic systems.

FIGURE 4 The total current: the sum of common-mode and differential mode currents.

2. Control

Filters, transient suppressors, and isolation techniques may be used to reduce the amplitude of conducted interference at both the source and victim. Shielded cables may help to reduce coupling of interference between cables running in close proximity (e.g., in a cable duct).

It is not practical to completely filter transients from electrical switchgear, so potential victim equipment must have inherent immunity. Care must be taken to ensure that interference does not bypass the protection circuits (e.g., by direct coupling between the input and output connections).

Filters work by using frequency-dependent impedances (e.g., capacitors and inductors) consisting of series elements, which are intended to reduce the flow of interfering current, and shunt elements, which are intended to allow the interfering current to bypass the victim circuit. Clearly a filter can only be effective when the spectrum of the interference differs from the spectrum of the desired signal or power source. The large transient voltages that can occur at the input of filters used in EMC applications mean that care must be taken to ensure that inductor cores do not saturate (reducing their effectiveness) and that the dielectric strength of capacitors is not exceeded.
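The action of series and shunt elements can be illustrated with a short numerical sketch. The Python fragment below assumes the simplest possible case, a single series inductor followed by a shunt capacitor across a resistive load, and computes the insertion loss (the ratio of load voltage without and with the filter). The component values and the 50-ohm source and load resistances are hypothetical, chosen only for illustration; ideal, unsaturated components are assumed.

```python
import math

def insertion_loss_db(freq_hz, l_series=100e-6, c_shunt=100e-9,
                      r_source=50.0, r_load=50.0):
    """Insertion loss in dB of a series-L, shunt-C low-pass section.

    All default values are illustrative assumptions, not data from the text.
    freq_hz must be greater than zero.
    """
    omega = 2.0 * math.pi * freq_hz
    z_series = 1j * omega * l_series            # series element: impedance rises with frequency
    z_cap = 1.0 / (1j * omega * c_shunt)        # shunt element: impedance falls with frequency
    z_shunt = (z_cap * r_load) / (z_cap + r_load)  # capacitor in parallel with the load

    v_without = r_load / (r_source + r_load)              # load voltage, no filter (1 V source)
    v_with = z_shunt / (r_source + z_series + z_shunt)    # load voltage with the filter fitted
    return 20.0 * math.log10(abs(v_without) / abs(v_with))

if __name__ == "__main__":
    for f in (1e3, 1e5, 1e6, 1e7):
        print(f"{f:10.0f} Hz  insertion loss = {insertion_loss_db(f):6.1f} dB")
```

With these assumed values the section passes low-frequency power almost unattenuated and attenuates interference strongly in the megahertz range, which is only useful if the wanted signal lies well below the interference spectrum.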

Safety must be considered for filters used on power circuits. In particular, the presence of capacitors between line and chassis ground can cause an equipment chassis to become “live” if the chassis ground becomes disconnected. Capacitors to be used in line-power circuits are subject to regulatory control which limits the size of the capacitor, so as to limit the current flow in case of electric shock via a “live” chassis due to a disconnected ground. These capacitors must be self-healing so as to correct any dielectric damage due to overvoltage transients.

Filters for EMC differ from filters for communications and signal processing because they operate in a less well controlled environment: source and load impedances may vary rapidly with frequency (a few ohms to a few kilo-ohms, at any phase angle, is typical). In order to control the filter behavior with a wide range of load and source impedances, lossy elements are often incorporated into high-quality filters. These include lossy ferrite cores and simple resistors.

Transient suppressors are used in conjunction with filters to minimize the effect of large-amplitude transients on electronic equipment. The spark gap is used widely as a transient suppressor and has the advantages of low capacitance, high impedance when not activated, and the ability to shunt very high currents when the arc is struck; however, it is relatively slow in operation (on the order of microseconds), and metal vapor produced from the electrodes when an arc is struck is deposited in the envelope and leads to a falling resistance and eventual failure. The low voltage required to sustain an arc means that some means of quenching the arc must be used in power circuits (e.g., a fuse or contact breaker). Metal oxide varistors have a faster response than a spark gap, and their inherent capacitive nature can be an advantage in some applications. The varistor has a constant voltage characteristic and may be used in power circuits to limit transients without any quenching mechanism. The voltage and current capabilities of a varistor are lower than those of a spark gap. Avalanche diodes optimized for speed are used as transient suppressors to protect sensitive solid state circuitry; these are essentially low-voltage devices (a few tens of volts) with a very fast switching time (nanoseconds). In practical suppression systems all three devices may be used in conjunction with filter elements to prevent damage from high-energy transients such as those induced by nearby lightning strikes.

In the case of common-mode currents, both filters and transient suppressors rely on bypassing some of the interfering signal to a ground or chassis connection, rather than having it flow through the internal ground circuitry. A good connection to a metal chassis for transient suppressors and filters is essential for their correct operation.

Signal circuits may be protected from common-mode, conducted interference by means of opto- or transformer-based isolators, which break the ground loop and prevent the flow of common-mode current.

Shielded cables are useful for minimizing the effect of the coupling of common-mode, conducted interference between cables in close proximity. Shielding is also appropriate when the spectra of the signal or power connection overlap the interference spectrum, so that filters are not applicable. The common-mode currents flow on the cable shield, rather than the signal wires, reducing the effect of interference. However, the flow of large interference currents on cable shields can cause problems in itself. If a cable shield is used as a zero-volt reference for the signal wires, then any potential difference along the cable shield due to the flow of common-mode current appears in series with the signal voltages. Also, the presence of a large circulating current in the cable shield can cause safety problems. This should be addressed by the proper safety bonding of equipment and building grounding systems.

B. Radiated Interference

1. Emissions from Electronic Equipment

The electromagnetic radiation from an electrical circuit increases with the rate of change of current and/or voltage in the circuit. Efficient antennas must be a significant fraction of a wavelength (e.g., half- or quarter-wave resonance), so that equipment begins to radiate efficiently when it is of the order of one half-wavelength large or has cables attached of that length. Typical desktop equipment has leads of the order of 1 m attached and so begins to radiate electromagnetic noise efficiently in the VHF band (Fig. 5). Tracks on printed circuit boards (PCBs), heatsinks, apertures, and seams (joints) in equipment cases and other small structures can also become efficient antennas when their length becomes a significant fraction of a wavelength. In desktop equipment this radiation mechanism becomes significant in the UHF band. With microprocessor operating speeds moving into the gigahertz region, radiation from small structures is becoming a significant factor.

FIGURE 5 Total radiated power from an equipment enclosure with a 1-m lead excited by a 1-V (common-mode) source. Note the drop in resonant frequency when the lead is grounded due to image currents in the ground.
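The link between physical size and the band in which radiation becomes efficient can be checked with a rough calculation. The sketch below assumes free-space propagation and takes the half-wave resonance f = c/(2L) as the onset of efficient radiation; the 10-cm structure is a hypothetical example of a PCB track or seam, and the band edges are the conventional 30 MHz to 300 MHz (VHF) and 300 MHz to 3 GHz (UHF) designations.

```python
C0 = 299_792_458.0  # speed of light in free space, m/s

def half_wave_frequency_hz(length_m):
    """Frequency at which a conductor of the given length is half a wavelength long."""
    return C0 / (2.0 * length_m)

def band_name(freq_hz):
    """Classify a frequency using the conventional VHF and UHF band edges."""
    if 30e6 <= freq_hz < 300e6:
        return "VHF"
    if 300e6 <= freq_hz < 3e9:
        return "UHF"
    return "other"

if __name__ == "__main__":
    # 1 m: typical desktop lead (from the text); 0.1 m: assumed small structure
    for length in (1.0, 0.1):
        f = half_wave_frequency_hz(length)
        print(f"{length:4.1f} m -> {f / 1e6:7.1f} MHz ({band_name(f)})")
```

A 1-m lead is half a wavelength long at about 150 MHz, in the VHF band, while a structure of about 10 cm reaches the same condition at about 1.5 GHz, in the UHF band. In practice, dielectric loading and nearby ground structures shift these frequencies, so the figures are indicative only.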

2. Susceptibility of Electronic Equipment

Radiated interference enters electronic equipment through cables, apertures, seams, etc., the same paths that allow emissions. Although the propagation mechanisms are identical, the circuits that are likely to be affected by interference entering a system are often not the same as those likely to cause radiated emissions. Therefore measures taken to suppress emissions do not necessarily have any effect on the susceptibility of equipment to external sources of interference and vice versa.

3. Control

Radiated interference is controlled in part by means of filters, shielding, and the physical layout of a system. The careful design of software and circuits is also an important factor.


Filters prevent signals at unwanted frequencies from passing between a system and external cabling which may act as an antenna. A significant example is the use of a ferrite bead on cabling to reduce the common-mode currents on the bundle or screen. Uncontrolled common-mode currents on cables are the most common cause of radiation from equipment in the VHF band. The common-mode current may be induced by imbalance in the signal and return connections in the cable, potential differences in the grounding structure of equipment, or currents coupled via internal cable looms. A cable that is an efficient radiator is also an effective receiver of interference (the path into or out of a piece of equipment is reciprocal), so a filter will have the same effect on immunity as on emissions.

Screened cables can greatly reduce cable radiation and ingress of interference via cables. At radio frequencies the currents induced by internal wires tend to flow on the inside of the screen, while currents induced by external fields flow on the outside due to the skin effect; the two tend not to interact. The imperfections in braided screens limit their effectiveness compared with a solid screen as frequency increases. The performance of a cable screen is specified by its transfer impedance $Z_t$ such that the equivalent voltage source $V_s$ on the inner conductor due to a current $I_s$ on the outside of the screen, for an element of cable of length $\delta l$, is given by

$$V_s = I_s Z_t \, \delta l.$$

Figure 6 shows the typical variation of $Z_t$ with frequency for solid and braided shielded cables.
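As a worked illustration of the transfer impedance relation, the sketch below evaluates the induced voltage for a short length of cable. It assumes the cable is electrically short, so that the contributions of successive elements add directly, and that the transfer impedance is expressed per unit length. The numerical values (shield current, transfer impedance, and length) are hypothetical and are not taken from Fig. 6.

```python
def induced_voltage_v(shield_current_a, zt_ohm_per_m, length_m):
    """Equivalent series voltage on the inner conductor, V = I * Zt * length.

    Valid only for an electrically short cable (length much less than a
    wavelength), where the per-element voltages add in phase.
    """
    return shield_current_a * zt_ohm_per_m * length_m

if __name__ == "__main__":
    # Assumed example: 100 mA of shield current, Zt = 10 milliohm/m, 2 m of cable
    v = induced_voltage_v(0.1, 10e-3, 2.0)
    print(f"Induced voltage: {v * 1e3:.2f} mV")
```

The linear dependence on both shield current and transfer impedance shows why a screen with a transfer impedance that rises with frequency, as a braid does, gives progressively less protection at high frequencies.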

In many cases the limiting factor in the performance of a screened cable is the manner in which its screen is connected to the equipment enclosure. A good-quality connector which maintains a 360-deg connection of the screen to the equipment enclosure is required to realize the full performance of a cable. A termination in which the screen is gathered into a loop (often known as a pigtail) before connection to the enclosure will significantly degrade the performance of the screen (Fig. 7).

FIGURE 6 Transfer impedance of typical solid and good braided screened cables.

FIGURE 7 A connector with 360-deg cable termination and a connector with a pigtail screen connection.

A conductive enclosure can greatly reduce the propagation of electromagnetic radiation between electronic circuits and the environment. This aids immunity to external interference and reduces electromagnetic emissions. The shielding effectiveness of an enclosure depends on the frequency of the electromagnetic radiation, the electromagnetic properties of the material from which it is made, the geometry of the enclosure, and its contents.

Figure 8 shows the shielding effectiveness of a sealed enclosure of relatively low conductivity (carbon-fiber-reinforced plastic). It has a large electric field shielding at all frequencies. Its magnetic field shielding effectiveness is poor at low frequencies; low-frequency magnetic shielding is difficult to achieve without a good conductor and/or magnetic materials. In both electric and magnetic cases the shielding increases rapidly at high frequencies due to the skin effect; current from external fields flows only on the outside of the enclosure, causing no disturbance internally (and vice versa for currents due to internal fields). A sealed metal enclosure would have a larger shielding effectiveness. The shielding effectiveness of a metal enclosure with an aperture is also shown in Fig. 8. The aperture dominates the screening of this metal enclosure: electromagnetic energy can more easily pass through the aperture with increasing frequency. The resonant behavior of the metal enclosure can be seen at 700 MHz. This can result in a field enhancement in the enclosure. The analytical solution used for the sealed enclosure does not include the effects of resonances which will be present in real enclosures. In practical metallic enclosures with apertures and unshielded cable penetrations a shielding effectiveness of about 20 dB is typical. Note that the fields within an enclosure can vary considerably with position, so that a value given at a single measurement point is of limited use.

FIGURE 8 Shielding effectiveness at the center of a sealed carbon-fiber reinforced plastic (CFRP) composite enclosure compared with a metal enclosure (cube) of the same volume with an aperture (computed by approximate methods).
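To put the decibel figures in perspective, the sketch below converts between shielding effectiveness in dB and the corresponding ratio of field strengths. It assumes the usual definition in terms of field amplitude, SE = 20 log10(field without shield / field with shield); the function names are illustrative.

```python
import math

def field_ratio_from_db(se_db):
    """Factor by which the field amplitude is reduced for a given SE in dB."""
    return 10.0 ** (se_db / 20.0)

def se_db_from_fields(field_unshielded, field_shielded):
    """Shielding effectiveness in dB from field amplitudes in the same units."""
    return 20.0 * math.log10(field_unshielded / field_shielded)

if __name__ == "__main__":
    for se in (20.0, 40.0, 60.0):
        print(f"{se:4.0f} dB -> field reduced by a factor of {field_ratio_from_db(se):.0f}")
```

On this definition, the 20 dB quoted as typical for a practical enclosure corresponds to only a tenfold reduction in field strength, and a negative value would indicate field enhancement such as that produced by an enclosure resonance.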

IV. DESIGN FOR ELECTROMAGNETIC COMPATIBILITY

A. The Design Process

Electromagnetic compatibility is affected by almost every aspect of the design and construction of a piece of equipment. It is therefore necessary to integrate EMC considerations into every stage of the design. Figure 9 shows an idealized view of the design process. Design rules for EMC can be applied from the first concept; as the design process continues, rules become hard constraints which may be determined by factors other than just EMC. At each stage in the design some estimate or prediction of EMC must be made. Eventually the system must be tested to see if it meets its EMC specification. Failure to meet the specification must result in redesign until the specification is met.

FIGURE 9 An idealized view of the design process.

B. Design Rules and Constraints

Design rules encapsulate a range of measures that are thought to improve the EMC of a system. Often the amount of improvement is difficult to quantify and may vary depending on the details of the system under consideration.

1. Design Concept

If the EMC implications are considered as the design concept is developed, then areas where EMC is an important consideration can be highlighted and an EMC control plan formulated. Alternatives can be considered where EMC weaknesses are suspected. We suggest the following rules.

a. Partition the system into noisy, quiet, robust, and susceptible parts; each can then be considered separately (though the robust can be placed with the noisy, and the quiet with the susceptible).

b. Select internal and external interfaces to minimize emissions and susceptibility (i.e., use the largest signals and narrowest bandwidths possible in circuits that may be susceptible to interference, and use the smallest signals and narrowest bandwidths in circuits that may generate interference).

c. Consider where filtering and shielding are required.

d. Plan the monitoring of EMC throughout the design and development process.

2. Robust, Quiet Circuits

If circuits can be made robust in the presence of interference, then the need for shielding and filtering can be reduced. We suggest the following rules.

a. Select logic circuits with the lowest bandwidth and highest noise margins.

b. Minimize the bandwidth of analog circuits.

c. Apply adequate decoupling on analog and digital circuits.

d. Consider the recovery of analog circuits from transients (simple measures such as limiter diodes can reduce recovery times drastically).

e. Consider carefully partitioning and noise propagation in power supplies.

f. Ensure that unused states in digital (and microprocessor) circuits have transitions into safe states to allow recovery after disruption by interference and use “watchdog” circuits to force reset after failure in microprocessor systems.

g. Separate I/O busses from the main processor bus to reduce interference transfer to and from interfaces.

h. Use filters and/or isolation to prevent interference propagation.

3. Robust Software

a. Provide system integrity checks (e.g., error detection on code and data).

b. Check peripheral inputs for sensible values (e.g., reject transient changes caused by interference).

c. Check and/or reinitialize peripheral devices periodically to allow recovery from EMI-induced failures.

d. Ensure unused interrupt vectors and unused memory are initialized to cause predictable operation if accessed as a result of interference-induced errors.
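Rules (a) and (b) above can be sketched in a few lines of code. The Python fragment below is a minimal illustration only, assuming a CRC-32 checksum as the error-detection method and a median-of-samples test with fixed limits as the plausibility check; the function names, the sample values, and the limits are all hypothetical, and real embedded firmware would normally implement the same ideas in C or assembly.

```python
import zlib
from statistics import median

def checksum(data: bytes) -> int:
    """CRC-32 of a block of code or data, stored alongside it when written."""
    return zlib.crc32(data)

def integrity_ok(data: bytes, expected_crc: int) -> bool:
    """Rule (a): detect corruption by recomputing and comparing the CRC."""
    return zlib.crc32(data) == expected_crc

def plausible_reading(samples, low, high):
    """Rule (b): reject transient spikes in a peripheral input.

    Takes the median of several consecutive samples, so a single
    interference-induced outlier is ignored, then checks the result
    against the physically sensible range [low, high].
    Returns the accepted value, or None if the reading is implausible.
    """
    value = median(samples)
    return value if low <= value <= high else None

if __name__ == "__main__":
    config = b"gain=4;offset=12"
    crc = checksum(config)
    print(integrity_ok(config, crc))               # unmodified block
    print(integrity_ok(b"gain=4;offset=92", crc))  # block with a corrupted byte

    # Three temperature samples, one corrupted by a transient
    print(plausible_reading([20.1, 20.2, 950.0], low=0.0, high=100.0))
```

A median is used here rather than a mean because a single large transient shifts the mean but leaves the median unchanged; when the reading is rejected, the caller can retain the last good value or reinitialize the peripheral as in rule (c).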

4. Quiet Software

Minimize unnecessary activity (e.g., poll/update interfaces only when necessary, use interrupts to detect changed conditions rather than polling, halt the processor when not active).

5. Physical Layout

a. Partition circuits to minimize propagation of interference from noisy circuits and to susceptible circuits.

b. Provide nearby return for each power and signal connection (by use of power/ground-planes, twisted pairs, shielded cables, etc.).

c. Minimize the physical size of critical circuits to minimize radiation/pickup.

d. Ensure proper termination of cable screens (the screen and enclosure should form a continuous volume in which the conductors are contained; pigtails should not be used).

e. Minimize the dimensions of any apertures and seams in shielded enclosures (the interference propagation depends on the largest dimension; many small holes are better than a single large hole; seams must have good electrical connection, avoiding long gaps).

C. Analysis

Analytical techniques for the solution of electromagnetic problems are complex and applicable only to very simple geometries. This has made the direct analytical solution of real EMC problems nearly impossible. However, approximate analysis can often provide useful insight into the magnitude of potential problems and relative performance of possible solutions. With the advent of cheap desktop computing, the evaluation of complex analytical approximations can be achieved in seconds.

D. Computer-Aided Design

Computer-aided design (CAD) tools have permeated much of engineering design and are well established in areas such as circuit analysis and the design of electrical machinery, but are still new to electromagnetic compatibility analysis.

1. Numerical Electromagnetic Solvers

A numerical solution of the electromagnetic properties of arbitrary geometries is possible in principle. In practice the large computational resources required prevent the solution of problems as complex as the prediction of electromagnetic compatibility of complete electronic systems. Numerical methods are widely used to solve simplified geometries in order to allow a better understanding of EMC problems.

2. Signal Integrity

Signal integrity is one area where the use of CAD is well established. Many commercial tools are available for the prediction of signal propagation and crosstalk on printed circuit boards.

3. Design Rules Checking

Automated checking of design rules is an area where CAD can help improve the ease of design. One example is the checking of design rule compliance in printed circuit board layout. A PCB may have many thousands of tracks across six or more layers, making manual checking a slow and error-prone process. Automatic design-rule-checking software is commercially available and can be used to check manufacturing, signal integrity, and EMC rules.

4. Knowledge-Based Systems

Knowledge-based systems attempt to encapsulate the knowledge of a human expert in a computer package. Commercial products which provide design advice and diagnosis of EMC problems are available.

5. Design Frameworks

The difficulty of fully predicting the EMC performance of electronic systems has led to the concept of a design framework which can be used to combine a range of information on a system which improves in quality as the design and prototype construction progress. At the concept stage a rough estimate of the EMC performance of the system can be obtained by the use of past data, approximate/analytical solutions, and numerical models. This may be enhanced by measurements on subsystems or more-detailed numerical models as the design progresses.

V. ELECTROMAGNETIC COMPATIBILITY REGULATIONS AND STANDARDS

Here we review the EMC regulations and standards of the United States and Europe. Other countries and areas using or adopting EMC regulations include Japan, Australasia, and Taiwan.

A. EMC Regulations

1. Rationale

EMC regulations exist to enforce control of the unwanted emissions from electrical or electronic equipment. This controls ‘pollution’ of the radio spectrum and provides an environment for the reliable operation of all electrical or electronic equipment.

2. Federal Communications Commission (FCC) Regulations

The FCC administers the use of the radiofrequency spectrum in the United States. Title 47 of the Code of Federal Regulations covers telecommunications and, in five volumes, regulates the intentional and incidental use of the spectrum. The parts relevant to EMC are contained in Chapter 1: Part 15, Radio Frequency Devices, and Part 18, Industrial, Scientific, and Medical Equipment.

Part 15 governs emissions from intentional and unintentional radiators and sets out the regulations, technical specifications, and administrative requirements to enable equipment to be marketed without an individual license. Subpart A is concerned with digital devices, subpart B with unintentional radiators, and subpart C with intentional radiators. The FCC classifies equipment into Class A and Class B. Essentially Class A equipment is intended for use in an industrial or commercial environment, while Class B is intended for the residential environment. Accordingly, verification tests for Class A devices are performed by the manufacturer and retained on file; certification by the FCC is not required. For Class B devices FCC certification must be obtained; this is achieved by examining a manufacturer’s test results.

The technical requirements for the emission limits are laid down for both conducted emissions and radiated emissions. The methods of measurement are defined by the American National Standards Institute (ANSI) standard C63.4, Methods of Measurement of Radio-Noise Emissions from Low Voltage Electrical and Electronic Equipment in the Range 9 kHz to 40 GHz. The emission limits and the ANSI test methods are derived from CISPR 22 (see Section V.B). Where the device’s highest internally generated frequency is greater than 1 GHz, the highest emission frequency to be measured is determined as five times this frequency.

FIGURE 10 The FCC and Euronorm radiated emissions limits measured at 10 m.

Part 18 covers equipment designed to generate and locally use radiofrequency (RF) energy at frequencies greater than 9 kHz for industrial, scientific, and medical (ISM) purposes. It also includes microwave ovens. ISM frequencies are defined at the international level by the International Telecommunications Union (ITU). These frequencies are then allocated at a national level by the national authorities; in the United States the frequencies are allocated by the FCC and are listed in Part 18. Limits and measurements broadly follow CISPR 11 (see Section V.B). Most ISM equipment is subject to FCC certification.

The FCC regulations exclude most industrial electronics equipment.

3. European EMC Regulations: EMC Directive (89/336/EEC)

EMC regulations apply throughout Europe and have had a major impact on the development of EMC regulations throughout the world.

The European regulations result from European Commission Directive 89/336/EEC, which affects all electrical or electronic systems or products sold throughout the European Economic Area (EEA). It also encompasses all electromagnetic phenomena. As a “new approach directive,” the technical requirements are defined by European standards. The new approach directives were designed to remove technical barriers to trade within the European community.

The essential protection requirements of the EMC Directive are as follows:

Equipment should be constructed so that it will not affect broadcast services or the intended function of other equipment (the emission aspect).

Equipment should have an inherent immunity to externally generated electromagnetic disturbances (the immunity aspect).

Note that the FCC regulations and Japanese Voluntary Council for the Control of Interference (VCCI) requirements do not have an equivalent requirement for immunity.

The EMC Directive specifies the routes available to manufacturers to show that their product complies with these protection requirements.

The simplest route is to demonstrate compliance with an appropriate European standard. This is a standard whose reference number has been published in the Official Journal of the European Communities (OJEC) and is a CENELEC (the European electrical standards body) Euro Norm (EN) that has been transposed into a national standard. An example is EN 55022 (the same as CISPR 22), the emission standard for information technology equipment; the transposed UK standard is BS EN 55022 and the transposed German standard is DIN EN 55022. The standards define emission limits, immunity levels, and the tests that should be performed on equipment to show that it meets these limits and levels. While the European regulations do not explicitly require a product to be tested in order to demonstrate compliance with the protection requirements, it must be demonstrated that the product complies with the standard, and therefore by implication it must be tested in accordance with the standard. Testing may be performed by the manufacturer or by a third party. There is no requirement for the testing laboratory to have accreditation; however, the use of an accredited laboratory will provide a manufacturer with an assurance that the testing has been performed to the standard correctly, and the manufacturer can obtain an accredited test certificate.

When standards are not available to a manufacturer or the equipment has features that mean that a standard can only be partly applied, then the manufacturer must use the “technical construction file” (TCF) route to compliance. Essentially the manufacturer assembles the technical information demonstrating that the product meets the protection requirements. These data, which are likely to include test results, must be reviewed by a competent body appointed by the national authorities. The requirements for a competent body are laid down in Annex II to the EMC Directive. A competent body must demonstrate that it has the appropriate expertise, operates systems that ensure client confidentiality, and has the independence to make an impartial judgement. Such systems are usually ensured by quality assurance to standards such as ISO 9002 and EN 45011.

The essential features of a TCF are:

Part I: Description of the apparatus
a. Identification of the apparatus
b. A technical description

Part II: Procedures used to ensure conformity of the apparatus to the protection requirements
a. Technical rationale
b. Detail of significant design aspects
c. Test data

Part III: Report (or certificate) from a competent body

For radio transmission equipment (including transceivers) compliance with the Radio and Telecommunications Terminal Equipment (R&TTE) Directive is required except in the case of air traffic management equipment, which is required to conform with the EMC Directive by the “type examination” route. This means that the equipment must be submitted to a notified body (NB; an organization which has been notified to the European Commission by the national competent authority). The NB will require a type examination to be performed. This may be carried out by the NB or one of the NB’s approved test laboratories.

When conformance with the protection requirements of the EMC Directive has been demonstrated by one of these three methods, a Declaration of Conformity is issued by the manufacturer and the European Community mark, the CE marking, is affixed to the product or its packaging. It should be noted that the CE marking implies that the product complies with all of the new approach directives applicable to it (e.g., machinery safety).

The Australian EMC Framework follows broadly the same pattern as the European regulations, while the U.S. FCC regulations are much more specific and apply to the emission aspects only of “digital” and “industrial, scientific, and medical equipment,” Parts 15 and 18, respectively, of the Code of Federal Regulations (CFR) 47 (see Section V.A.2).

B. Overview of Standards

1. Standards Rationale

In order to achieve electromagnetic compatibility between electrical/electronic apparatus, it is necessary to control (a) emissions from equipment and (b) the level of immunity of equipment to such emissions. This is achieved by using guidelines published as standards, which may be enforced by regulations.

Most standards follow the recommendations of the Comité International Spécial des Perturbations Radioélectriques (or International Special Committee on Radio Interference), CISPR, for establishing emission limits, susceptibility levels, and test procedures. CISPR is a committee of the International Electrotechnical Commission (IEC).

Examples of CISPR recommendations are:

CISPR 11: Limits and methods of measurement of radio disturbance characteristics of industrial, scientific and medical (ISM) radiofrequency equipment (example equivalents: FCC Part 18 and EN 55011).

CISPR 22: Limits and methods of measurement of radio interference characteristics of information technology equipment (example equivalents: FCC Part 15, ANSI C63.4, EN 55022, and the Japanese VCCI requirements).

2. Relevant Standards for Conformancewith EU EMC Regulations

a. General. The EMC Directive defines two meth-ods for demonstrating compliance with the protection re-quirements. With self-certification, the manufacturer isable to declare that apparatus conforms to relevant stan-dards. Alternatively, a technical construction file can beprepared, which must include a technical report or a cer-tificate from a competent body.

For manufacturers to self-certify their products, theymust be designed, built, and tested to meet the require-ments of “relevant standards.” A “relevant standard” isdefined by Article 7 of the EMC Directive as a nationalstandard that has been harmonized with a standard whosereference number has been published in the Official Jour-nal of the European Communities (OJEC).

In practice this means that a relevant standard is a EuroNorm (EN), published by CENELEC, the European Com-mittee for Electrotechnical Standardisation.

Euro Norms are derived from CISPR and other IECpublications. It is necessary for individual EEA memberstates to harmonize their own national standards with theappropriate EN. This means that identical standards willbe used in all EEA countries. For example, the Britishstandard that covers emissions from information technol-ogy equipment is BS EN 55022. This is harmonized withEN 55 022 and is identical to CISPR 22.

There are two categories of relevant standard: (a) theproduct, or product family, specific standard, and (b) thegeneric standard. A product-specific standard applies to aparticular type of product or family of products, for exam-ple, EN 55 022, which applies to information technologyequipment. A product-specific standard takes precedenceover generic standards. A generic standard is categorizedaccording to environment type (for example, “residential,commercial, and light industry”) and applies to a broadrange of product types. Either category of standard mayrefer to ‘reference’ or ‘basic’ standards.

A considerable number of product types have been cov-ered by relevant standards. A representative listing is givenin Table I.

b. Generic emission standards. The generic emis-sion standard is EN 50081. Part 1 covers residential, com-mercial, and light industry environments. Part 2 covers theindustrial environment.

Part 1 principally restates the emission limits and testmethods defined by EN 55022 Class B, which is theproduct-specific emission standard for IT equipment; Part2 does the same with EN 55011, which is the product-specific standard for ISM equipment.

c. Generic immunity standards. The generic im-munity standard is EN 50082. Parts 1 and 2 have thesame environmental classification as the generic emissionstandard. These reference the basic standards were intro-duced by the IEC in the IEC 61000 series and adopted byCENELEC as the EN 61000 series.

Generally, both product-specific and generic standards not only define emission limits and immunity levels, but also specify the test methods to be employed. Manufacturers using these standards to demonstrate compliance with the EMC Directive must be familiar with the contents, and appreciate the implications, of all harmonized EMC standards. In particular they must be aware of sections open to misinterpretation, deficiencies within the standards, and test methods that require significant financial investment.

d. List of representative relevant (harmonised) standards. For manufacturers required to self-certify their products for compliance with the EU EMC Directive, a list of the available product-specific and generic standards is essential. These are made available by the European Commission; an example list is given in Table I.

3. Military Standards

EMC requirements for military equipment have been well understood for many years and as a result the standards and test methods are well established. DEF STAN 59-41 is the UK MOD standard covering all aspects of EMC from selection of requirements through management of projects to testing and test reporting. The equivalent U.S. standards are MIL-STD-461, which covers EMC requirements, and MIL-STD-462, which covers the test methods.

TABLE I Representative List of Harmonized European EMC Standards

Product-specific standards: emission
  EN 50065-1      Mains signaling equipment
  EN 55011        Industrial, scientific, and medical (ISM)
  EN 55013        Broadcast receivers and associated equipment
  EN 55014        Household appliances
  EN 55015        Luminaires
  EN 55022        Information technology equipment (ITE)
  EN 55103-1      Audio, video, audio visual, and entertainment lighting control apparatus for professional use

Product-specific standards: immunity
  EN 55020        Broadcast receivers and associated equipment
  EN 55104        Household appliances
  EN 50130-4      Alarm systems

Product-specific standards: emission and immunity
  EN 50091-2      Uninterruptable power systems (UPS)
  EN 50121        Railway applications
  EN 50199        Arc welding equipment
  EN 60601-2-3    Electro-medical devices
  EN 61131-2      Programmable controllers (PLCs)

Generic standards: emission
  EN 50081-1      Generic class: residential, commercial, and light industry
  EN 50081-2      Generic class: industrial

Generic standards: immunity
  EN 50082-1      Generic class: residential, commercial, and light industry
  EN 50082-2      Generic class: industrial

Basic standards
  EN 61000-3-2    Harmonics
  EN 61000-3-3    Voltage fluctuation and flicker
  EN 61000-4-2    ESD
  EN 61000-4-3    Radiated immunity
  EN 61000-4-4    EFT/B
  EN 61000-4-5    Surge
  EN 61000-4-6    Conducted RF immunity
  EN 61000-4-8    Power frequency magnetic field immunity
  EN 61000-4-11   Voltage dips, interruptions

Effects such as radiation hazards, detection of data from unintentional emissions, and electronic countermeasures are not considered as EMC topics, although related in a number of ways.

The following designations are used in the titles of military standards: R, radiated; C, conducted; MF, magnetostatic field; E, emissions; S, susceptibility (referred to as immunity in commercial standards).

Examples are DEF STAN 59-41, where the first radiated susceptibility test is designated DRS01; and MIL-STD-462, where the first radiated susceptibility test is designated RS101.

Generally the testing methods defined in commercial standards are used in the military standards but the frequency ranges are greater and the severity of susceptibility/immunity test levels is much higher. Examples include MIL-STD-461D RS103, which covers a frequency range of 10 kHz to 40 GHz, compared with EN 61000-4-3, which covers only the frequency range 80 MHz to 1 GHz; and MIL-STD-461D RS103, which requires immunity to a field strength of 200 V/m, compared with EN 61000-4-3, which requires an immunity level of only 10 V/m.
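The comparison above can be put in logarithmic terms, which is how EMC levels are usually quoted. The following is a minimal sketch using only the figures given in the text; the function names are illustrative and not part of any standard.

```python
import math

def field_ratio_db(e1_v_per_m: float, e2_v_per_m: float) -> float:
    """Ratio of two field strengths in decibels (field quantities use 20*log10)."""
    return 20.0 * math.log10(e1_v_per_m / e2_v_per_m)

def decades(f_low_hz: float, f_high_hz: float) -> float:
    """Number of frequency decades spanned by a range."""
    return math.log10(f_high_hz / f_low_hz)

# Field-strength levels quoted in the text: 200 V/m (military) vs 10 V/m (commercial).
level_gap_db = field_ratio_db(200.0, 10.0)

# Frequency ranges quoted in the text.
military_span = decades(10e3, 40e9)    # 10 kHz to 40 GHz
commercial_span = decades(80e6, 1e9)   # 80 MHz to 1 GHz

print(f"Level gap: {level_gap_db:.1f} dB")
print(f"Span: {military_span:.1f} decades vs {commercial_span:.1f} decades")
```

On these figures the military test level is about 26 dB above the commercial one, and the military range spans roughly six and a half decades against about one.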


VI. MEASUREMENT AND INSTRUMENTATION

The electromagnetic compatibility of an electronic system can only be fully demonstrated when the system is taken into service. Clearly such a course of action is not acceptable in today’s engineering environment and the risks of not achieving EMC must be minimized. Along with the incorporation of EMC in the design process, some attempt must be made to ensure that the equipment will be compatible before it is released into the market. This is achieved by making EMC measurements. Measurements made on a complete system before release to market can be made with reference to EMC standards that quantify the levels of acceptable interference. In addition to these measurements, EMC measurements can be made on subsystems bought in from other suppliers or on incomplete parts of the system in order to assess the efficacy of design measures taken before the system is complete. This latter process is particularly important in the design of complex or large systems incorporating electronics such as aircraft or other types of vehicle. Again, relevant standards may be employed.

Many electronic systems can be assessed for EMC in standardized test facilities operated by EMC test houses. There are size limitations on such equipment, but items such as PCs, household appliances, or TV and hi-fi equipment can readily be transported to such facilities. Larger equipment such as motor vehicles is tested in specialist test houses. Very large equipment must be tested after installation, as often the details of the installation can have a bearing on its performance. Normally subsystems of large equipment will have been tested prior to installation.

EMC measurements are inevitably simplified mimics of reality. Consider the hypothetical problem of assessing the interference caused on a computer by a nearby arc welder. The welder is a source of electromagnetic interference energy and the computer is disrupted by that energy. When either item is designed, the interference scenario cannot be predicted in detail. Thus the unwanted interference energy leaving the arc welder must be measured and compared to a predetermined level defined in a standard. This is referred to as an emissions measurement. The interference energy incident upon the computer will cause disruption if the computer is not adequately immune to this interference. Thus, the immunity of the computer to external interference must be measured. The standards are devised such that the required immunity of electronic systems to external interference is greater than the aggregate emissions from neighboring systems. Some systems such as radio transmitters have intentional emissions at energy levels much higher than would ordinarily be allowed. Immunity requirements are adjusted to account for this.

In addition to classifying EMC measurements into emissions and immunity, a further simplifying breakdown is required. The interference between the arc welder and the computer in the above example occurs as a consequence of the transfer of energy between the two. The path that this energy takes may not be immediately apparent. For example, the interference may be a consequence of interference energy conducted away from the welder via the supply mains and entering the computer via its supply. Alternatively, it may be due to interference radiated from the welder’s leads being picked up by a peripheral lead on the computer. It is not possible to determine each interference scenario in advance, but it can be stated in general that interference energy propagation between the source and the victim is by either conduction or radiation. For this reason, measurements are made for both mechanisms. Consideration of the physics of the energy propagation mechanism leads to a further simplification. In general, conducted interference is a low-frequency phenomenon and radiated interference is a high-frequency phenomenon. This arises because the efficiency of any structure acting as a transmitting or receiving antenna increases with frequency. For significant radiative energy transfer, the “antenna” needs to be comparable to the wavelength in its linear dimensions, i.e., typically more than a tenth of a wavelength. For a system with linear dimensions of 1 m this implies a wavelength of 10 m, corresponding to a frequency of 30 MHz. The boundary is fuzzy. Few radiated emissions measurements are made below 30 MHz and similarly few conducted emissions measurements are made above 100 MHz.
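The arithmetic behind the 30-MHz figure can be sketched as follows. The one-tenth-wavelength criterion is the rule of thumb quoted in the text; the function name and the second example size are illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def radiation_threshold_hz(size_m: float, fraction: float = 0.1) -> float:
    """Frequency at which a structure of length size_m equals `fraction` of a wavelength."""
    wavelength_m = size_m / fraction
    return C / wavelength_m

# A system with linear dimensions of 1 m, as in the text.
f_1m = radiation_threshold_hz(1.0)
print(f"1 m system: {f_1m / 1e6:.1f} MHz")

# A hypothetical 0.3 m enclosure reaches the same criterion at a higher frequency.
f_small = radiation_threshold_hz(0.3)
print(f"0.3 m system: {f_small / 1e6:.1f} MHz")
```

The threshold scales inversely with size, which is one reason the boundary between conducted and radiated measurements cannot be drawn sharply.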

The following sections briefly outline the principal measurement techniques.

A. Emissions

1. Conducted

Conducted emission measurements are made on cables conveying power and signals to and from equipment. Care must be taken to ensure that the signals measured are emerging from the equipment under test (EUT) and are not due to other sources connected to the cable.

Two types of emission are measured. Common-mode emissions measurements use a calibrated current transformer to measure the total interference current present on a cable. The output of the current transformer is fed to a measurement receiver that indicates the voltage present at its input port. The current transformer is calibrated in terms of its transfer impedance. This impedance relates the receiver input voltage to the current flowing on the cable. The EUT can be isolated from the far end of the cable by placing an absorbing ferrite clamp around the cable.


Such a clamp provides a stable absorbing load for the interference currents on the cable and the required isolation from other devices connected to the cable. Often the current transformer is combined in the same structure as the absorbing clamp.
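The role of the transfer impedance in a common-mode measurement can be illustrated with a short sketch. The probe value of 5 ohms and the receiver reading of 1 mV are hypothetical numbers chosen for the example, not figures from any standard or data sheet.

```python
import math

def cable_current_a(receiver_voltage_v: float, transfer_impedance_ohm: float) -> float:
    """Common-mode cable current from the receiver voltage: I = V / Z_T."""
    return receiver_voltage_v / transfer_impedance_ohm

def cable_current_dbua(receiver_dbuv: float, transfer_impedance_dbohm: float) -> float:
    """Same relation in logarithmic units: I[dBuA] = V[dBuV] - Z_T[dB ohm]."""
    return receiver_dbuv - transfer_impedance_dbohm

# Hypothetical current transformer (Z_T = 5 ohm) and receiver reading (1 mV).
z_t_ohm = 5.0
v_receiver = 1e-3

i_cable = cable_current_a(v_receiver, z_t_ohm)
i_cable_db = cable_current_dbua(
    20.0 * math.log10(v_receiver / 1e-6),  # receiver reading in dBuV
    20.0 * math.log10(z_t_ohm),            # transfer impedance in dB relative to 1 ohm
)
print(f"{i_cable * 1e6:.0f} uA, {i_cable_db:.1f} dBuA")
```

Working in decibels turns the division into a subtraction, which is convenient because receivers and emission limits are normally expressed in logarithmic units.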

Differential-mode emissions measurements are made as voltage measurements between individual pairs of conductors in a cable. Again isolation is required. The isolation is provided by a line impedance stabilization network (LISN) inserted into the conductor pair. The simplest form of LISN provides a low-pass filter for the intentional signals on the conductor pair and a barrier for higher frequency interference signals propagating in either direction along the cable. Along with the receiver’s input impedance, it provides a stable and defined measurement impedance for the interference signals.

2. Radiated

Radiated emission measurements are made using a defined environment into which the EUT radiates. The current international standard environment is the open-area test site (OATS). The radiation from the EUT is measured using a calibrated antenna and a measurement receiver. The EUT and antenna are positioned on the OATS at the foci of an ellipse. The area of the OATS is defined as the area of the ellipse with a major diameter of twice the focal length (the distance between the foci) and a minor diameter of the square root of three times the focal length. Typical EUT-antenna spacings are 10 and 3 m. A 10-m OATS thus has an elliptical area of 20 by 17.3 m. The signal received by the antenna is the combination of the direct wave from the EUT and the ground reflection. In order to preserve the repeatability of measurements on a given site and the reproducibility of measurements between sites, the ground reflection has to be stabilized against changes introduced by climatic effects. A metallic ground plane is used beneath the antenna and the EUT and in the space between them for this purpose. In order to measure the maximum signal arising from the combination of the two waves, the antenna height is scanned from 1 to 4 m. The OATS suffers from the presence of ambient signals that can mask the emissions from the EUT. This disadvantage can be overcome by enclosing the OATS in a screened room with radio-absorbing material on the walls. Such a facility is called a semi-anechoic chamber. Recently, it has been suggested that a fully anechoic chamber with radio absorber on its floor would be a better environment for measuring radiated emissions. Such a chamber would not need to have the antenna height scan and would be compatible with chambers used for radiated immunity measurements as described below. It remains to be seen if this suggestion will be adopted.
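The ellipse dimensions quoted above follow directly from the EUT-antenna spacing. The sketch below reproduces them and checks that the foci are indeed separated by that spacing; the function name is illustrative, and "focal length" is taken to mean the distance between the foci.

```python
import math

def oats_ellipse(spacing_m: float) -> tuple[float, float, float]:
    """Return (major diameter, minor diameter, area) of the OATS ellipse.

    The EUT and antenna sit at the foci, separated by spacing_m.
    """
    major = 2.0 * spacing_m
    minor = math.sqrt(3.0) * spacing_m
    # Consistency check: the focal separation is 2*sqrt(a^2 - b^2).
    a, b = major / 2.0, minor / 2.0
    assert math.isclose(2.0 * math.sqrt(a * a - b * b), spacing_m)
    area = math.pi * a * b
    return major, minor, area

for d in (3.0, 10.0):
    major, minor, area = oats_ellipse(d)
    print(f"{d:.0f} m site: {major:.1f} m x {minor:.1f} m, area {area:.0f} m^2")

major_10, minor_10, area_10 = oats_ellipse(10.0)
```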

B. Immunity

1. Conducted

Conducted immunity is measured using transducers similar to those used for conducted emissions. Energy is injected onto cables in both common mode and differential mode. Conducted emission measurements are made in the frequency domain, using receivers tuned across a specified frequency range. Conducted immunity measurements can also be made in the frequency domain by injecting energy at specified frequencies. Time-domain waveforms such as pulses or bursts of pulses can also be used in order to simulate known threats to the EUT.

2. Radiated

The radiated immunity of an EUT is measured by illuminating the EUT with a radio wave that simulates the perceived threat. This is always done in an anechoic chamber in order to prevent the radiated energy from causing interference to other systems. In general, the threat to an EUT is likely to come from an intentional radio transmitter, usually a low-power mobile transmitter. The EUT is illuminated at an appropriate field strength by an amplitude-modulated signal, the modulation of which mimics the modulation of the threat. For example, for analog amplitude modulation schemes the chosen standard is 80% modulation depth with a 1-kHz tone. GSM mobile phone modulation is simulated by a 217-Hz pulse modulation. Frequency and phase modulation are simulated by a constant-strength carrier. The EUT needs to be observed when under stress. For this reason the threat is applied at a series of frequencies, each with a defined dwell time. The frequency is normally incremented in 1% or 2% steps.
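The stepped sweep described above can be sketched in a few lines. The 80 MHz to 1 GHz range is the EN 61000-4-3 range mentioned earlier in the article; the 3-s dwell time is a hypothetical value chosen only to show how test duration scales with step size.

```python
def stepped_frequencies(f_start_hz: float, f_stop_hz: float,
                        step_fraction: float = 0.01) -> list[float]:
    """Test frequencies increasing by a fixed percentage, ending exactly at f_stop_hz."""
    freqs = []
    f = f_start_hz
    while f < f_stop_hz:
        freqs.append(f)
        f *= 1.0 + step_fraction  # each step is a fixed fraction of the current frequency
    freqs.append(f_stop_hz)
    return freqs

dwell_s = 3.0  # hypothetical dwell time per frequency

sweep_1pct = stepped_frequencies(80e6, 1e9, 0.01)
sweep_2pct = stepped_frequencies(80e6, 1e9, 0.02)

for name, sweep in (("1%", sweep_1pct), ("2%", sweep_2pct)):
    minutes = len(sweep) * dwell_s / 60.0
    print(f"{name} steps: {len(sweep)} frequencies, about {minutes:.0f} min")
```

Because the steps are geometric, halving the step size roughly doubles the number of test frequencies and hence the sweep time.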

3. Electrostatic Discharge

Electrostatic discharge (ESD) is a further electromagnetic phenomenon that may cause equipment malfunction. The most common scenario is the discharge of a charged human body through a finger onto the EUT. The source of the charge is the triboelectric effect acting on floor coverings and synthetic clothing. Charging potentials of up to 16 kV can be experienced. An electrostatic discharge gun simulates this threat by approximating the charged human body with a series resistor/capacitor circuit with the capacitor charged to an appropriate potential. Typical circuit values for an adult human are 200 pF in series with 200 ohms. The discharge is through an artificial finger either with an air discharge to the EUT or a direct contact discharge. The ESD event results in significant reactive fields in the vicinity of the discharge. A further test requires a discharge to an earthed plate close to the EUT. The potential disturbance via the reactive field is assessed.
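The series resistor/capacitor model makes it easy to estimate the scale of an ESD event. The sketch below uses the circuit values and maximum potential quoted in the text; the peak current assumes an idealized discharge into a short circuit, which is a simplification, and the function name is illustrative.

```python
def esd_model(capacitance_f: float, resistance_ohm: float, voltage_v: float):
    """Return (stored energy in J, RC time constant in s, ideal peak current in A)."""
    energy_j = 0.5 * capacitance_f * voltage_v ** 2   # energy stored on the capacitor
    tau_s = resistance_ohm * capacitance_f            # exponential decay time constant
    i_peak_a = voltage_v / resistance_ohm             # assumes a zero-impedance load
    return energy_j, tau_s, i_peak_a

# Values from the text: 200 pF in series with 200 ohms, charged to 16 kV.
energy, tau, i_peak = esd_model(200e-12, 200.0, 16e3)
print(f"Energy {energy * 1e3:.1f} mJ, time constant {tau * 1e9:.0f} ns, peak {i_peak:.0f} A")
```

The stored energy is small, but it is delivered in tens of nanoseconds, which is why the discharge produces strong fields and currents with significant high-frequency content.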

SEE ALSO THE FOLLOWING ARTICLES

ELECTROMAGNETICS • MICROWAVE COMMUNICATIONS • RADAR • RADIO PROPAGATION • RADIO SPECTRUM UTILIZATION • SOLAR SYSTEM, MAGNETIC AND ELECTRIC FIELDS • WIRELESS COMMUNICATION



Encyclopedia of Physical Science and Technology En005I-967 June 15, 2001 20:29

Electromagnetics

Sheila Prasad
Northeastern University

I. Historical Introduction
II. Maxwell’s Equations
III. Electromagnetic Waves
IV. Applications of Electromagnetics
V. Recent Developments

GLOSSARY

Antenna Structure that is designed in such a way that it will radiate electromagnetic power efficiently.

Charge Fundamental physical quantity that is indestructible and is characterized by mutual interactions with other charges.

Current Time rate of change of charges that are in motion.

Electric field Force per unit charge.

Electromagnetic energy Energy stored in the electromagnetic field.

Elementary dipole Positive and negative charge that are tightly bound together.

Elementary magnet Electron rotating about an axis.

Magnetic field Force produced by an electric current.

Magnetization Orientation of elementary magnets along parallel axes due to an external force.

Phase velocity Speed with which a wave front moves in space.

Plane wave Wave for which constant-phase surfaces are planes perpendicular to the direction of propagation.

Potential Potential energy of the electromagnetic field.

Spherical wave Wave for which constant-phase surfaces are spheres perpendicular to the direction of propagation.

Wavelength Distance between two constant-phase surfaces with a phase difference of 360°.

ELECTROMAGNETICS is the description of the electricity and magnetism that exist in space and various materials. These physical phenomena are described in terms of electric and magnetic fields created from electric charges and currents and forces associated with them. A precise mathematical formulation of these physical effects is given in Maxwell’s equations. The energy in the electromagnetic field is transported by electromagnetic waves, which travel in unrestricted space, infinitely large material media, or in physical structures that guide the wave in specific directions.

I. HISTORICAL INTRODUCTION

The history of the development of electromagnetics is the history of the development of electrical science. Abstract mathematical theory was applied to the description of physical phenomena, and this eventually evolved into modern technology, which is continually changing.

The history of electromagnetism may be treated in terms of the three periods of its growth. In the first, the fundamental concepts of action at a distance between charges and currents were developed. Earlier, it was proposed that action could take place only by contact through a material medium. This concept is as old as the history of man. Aristotle was a believer in such action, and this same idea was propounded by the philosophers of the East who predated Aristotle. Much later, even Newton considered the idea that one body could act on another through empty space an impossible one. It was much easier for these philosophers to explain the process of throwing a stone, which implied action by contact, than the mere falling of a stone due to the interaction of it and the earth with no visible push.

It was Newton, however, who made the concept of action at a distance acceptable. His law of gravitation gave the force between two masses at a distance without referring to a mechanical medium. The mathematician Euler attempted to explain the theories of gravitation, light transmission, and the interaction of permanent magnets in terms of the intervening material medium, the ether. Newton’s inverse square law of gravitation formed the basis of early work on the interaction between charges at a distance independent of any intermediate material in contact. The inverse square law for electric charges was first suggested by Priestley (1766), who used an electrometer, and was discussed by Cavendish (1771), and the formulation familiar to us today was given by Coulomb (1785), who carried out experiments using a torsion balance. The inverse square law for magnetic poles was expressed by Michell (1750) for the first time. Much later (1820–1821), the magnetic effects of currents were investigated by Oersted, Biot, Savart, and Faraday. During this time, Laplace formulated a law of action at a distance between elements of current and magnetic dipoles. Ampere (1823) performed experiments that led to the law of force between current elements, which was, once again, a law of action at a distance. Ohm’s law (1826) was followed by Faraday’s law of induction (1832) and Lenz’s law (1834). Potential theory was expounded by Gauss and Green separately during this same period. Neumann and Weber (1845–1847) expounded their work on induction resulting from current-carrying conductors in motion and due to the rise and decay of currents. The current and voltage laws that are the basis for electrical engineering were given by Kirchhoff (1845). The work of developing a fundamental law of electromagnetic action at a distance was continued by Grassmann, Riemann, and Clausius during this period. The idea of the propagation of this action was first suggested by Gauss. This was extended by Riemann (1858), who showed that such propagation moved with a velocity equal to the velocity of light. The last important work of this period was by Lorenz (1867), who suggested using retarded scalar and vector potentials. This showed clearly that it was not necessary to have contact with a medium to have action at a distance.

The second period of growth was marked by the outstanding work of Maxwell, which laid the foundation of electromagnetics. Kelvin (1847) attempted to explain the results of his electrical experiments with theories of elasticity. The ideas of Kelvin and Faraday on electromagnetic force were the basis for Maxwell’s investigations (1864). However, Maxwell’s entire hypothesis was based on an all-pervading mechanical medium, the ether. The field equations formulated by Maxwell govern all electromagnetic phenomena and form the basis of an understanding of electromagnetics. The equations were so comprehensive that they included all the earlier observations: the laws of Coulomb, Gauss, Faraday, and Ampere. Maxwell’s formulation was simplified by Heaviside and Hertz. The flow of energy in the ether was proposed by Poynting (1884) to be governed by a vector that now bears his name. Hertz (1887) demonstrated the existence of electromagnetic waves in the ether. Radio transmission was achieved for the first time, and Maxwell’s theories were verified.

The third and final period of the growth of electromagnetic theory involved the development of the theory of retarded action at a distance between charges and currents. Lorentz (1895) coordinated the earlier theories of action between charges and currents with Maxwell’s general theory of the state of the ether. He theorized that matter contains electrons that act on each other in various ways to produce all electromagnetic (including optical) effects. Lorentz assumed that the electromagnetic field characterizes and is propagated by the ether.

The ether was proposed as a means of transporting electrical effects from one charge to another rather than the basis of all electromagnetic phenomena as proposed by Maxwell. The theory of relativity laid to rest all claims about the legitimacy of the ether hypothesis, and the conclusion was that there is no ether. Maxwell’s theory of the electromagnetic ether is only of historical significance. However, the field equations of Maxwell continue to be the basis of macroscopic electromagnetic theory. The fundamental law of macroscopic electromagnetism as expressed in the field and force equations is interpreted as retarded action at a distance.

II. MAXWELL’S EQUATIONS

A. The Density Functions

Electric charge is the fundamental physical quantity from which all other concepts in electromagnetics are derived.


It is indestructible in that it can be neither created nor destroyed; this is the principle of conservation of charge. The electrodynamical model, which serves as the mathematical foundation for electromagnetics, depends on continuous functions, which are called densities or density functions. These depend on and take account of the magnitude, the distribution, and the relative velocities of the charges. These density functions are termed the volume density of charge ρ and the surface density of charge η when the statistically stationary state is being considered. In this state, the charges are all in random motion. The statistical behavior of the charges in motion is obtained by taking a time average; from the statistical point of view, a volume containing millions of charges moving randomly is indistinguishable from the same volume containing the same charges with each fixed at an average rest position. Hence, the statistically stationary state is essentially a static state. The volume density of charge ρ describes the average condition of total charge throughout the region under consideration but does not describe the average separation and orientation of the statistical rest positions of positive and negative charges. When a region consisting of tightly bound charges (positive and negative) is exposed to an external force that attracts negative and repels positive charges, the negative charge is pulled away from the positive charge, and the separation of the two charges is oriented in the same direction as the external force. Such a structure is called a dipole. If the positive charge q is separated a distance d from an equal negative charge, then the average polarization or dipole moment is given by

$$\mathbf{p} = q\,\mathbf{d}, \tag{1}$$

where $\mathbf{d}$ is a vector drawn from the center of the negative charge to the center of the positive charge. The volume density of polarization (also called the polarization) is denoted by $\mathbf{P}$. It is the polarization per unit volume. Since the volume function $\mathbf{P}$ is a measure of the average density of polarization vectors due to individual dipoles in a small region about a point, the component of $\mathbf{P}$ directed along the outward or external normal to a closed surface at any point,

$$P_n = \hat{\mathbf{n}} \cdot \mathbf{P} \tag{2}$$

($\hat{\mathbf{n}}$ is the unit external normal), gives the average sum of the outwardly directed normal components of the polarization vectors piercing the unit area of the surface on which $P_n$ is defined. Then

$$\oint_{\Sigma} \hat{\mathbf{n}} \cdot \mathbf{P}\, d\sigma \tag{3}$$

(where $\Sigma$ is a closed surface and $d\sigma$ the element of surface) measures the number of polarization vectors piercing the surface normally as well as the total positive charge leaving (or negative charge entering) the volume enclosed by the surface. Hence, the net addition of positive charge per unit volume (or removal of negative charge per unit volume) is given by

$$-\frac{1}{\Delta\tau}\oint_{\Sigma} \hat{\mathbf{n}} \cdot \mathbf{P}\, d\sigma, \tag{4}$$

where $\Delta\tau$ is the volume enclosed by $\Sigma$. Then

$$\lim_{\Delta\tau \to 0}\left(-\frac{1}{\Delta\tau}\oint_{\Sigma} \hat{\mathbf{n}} \cdot \mathbf{P}\, d\sigma\right) = -\nabla \cdot \mathbf{P} \quad (\text{divergence of } \mathbf{P}). \tag{5}$$

Hence the divergence represents the total outward normal flux of the polarization per unit volume, and its negative measures the charge added.
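The limit in Eq. (5) can be checked numerically by evaluating the outward flux of a polarization field through a small cube and dividing by the volume. The sketch below does this with a midpoint rule; the field is a hypothetical example chosen so that its divergence is known in closed form, and the function names are illustrative.

```python
def polarization(x: float, y: float, z: float) -> tuple[float, float, float]:
    """Hypothetical polarization field (arbitrary units); its divergence is 3x - 1."""
    return (x * x, x * y, -z)

def flux_per_volume(field, center, h: float, n: int = 20) -> float:
    """Approximate (1/volume) * closed-surface integral of n.P over a cube of side h."""
    cx, cy, cz = center
    half = h / 2.0
    step = h / n
    d_area = step * step
    flux = 0.0
    for i in range(n):
        for j in range(n):
            u = -half + (i + 0.5) * step
            v = -half + (j + 0.5) * step
            # Pairs of opposite faces: outward normal component on +face minus -face.
            flux += (field(cx + half, cy + u, cz + v)[0]
                     - field(cx - half, cy + u, cz + v)[0]) * d_area
            flux += (field(cx + u, cy + half, cz + v)[1]
                     - field(cx + u, cy - half, cz + v)[1]) * d_area
            flux += (field(cx + u, cy + v, cz + half)[2]
                     - field(cx + u, cy + v, cz - half)[2]) * d_area
    return flux / h ** 3

point = (1.0, 2.0, 3.0)
div_estimate = flux_per_volume(polarization, point, h=0.01)
bound_charge_density = -div_estimate  # charge added per unit volume, as in Eq. (5)
print(f"flux per volume = {div_estimate:.6f}, analytic divergence = {3 * point[0] - 1:.6f}")
```

A positive divergence at a point means polarization vectors leave the cell on balance, so positive charge is removed and the added charge density is negative.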

The steady state is a generalization of the static state. It is characterized by a steady drift or circulation of electric charges relative to the statistically stationary rest positions that characterize the static condition. A steady average flow is assumed to be superimposed on the random motions of the charges. A steady drift might consist of one kind of charge flowing in a definite direction or of two kinds of charges flowing in opposite directions. Such a drift is called a convection current. A special form of the convection current is the steady drift of electrons relative to statistically stationary nuclei. This is called a conduction current. The volume density of the moving charge or the volume density of the convection or conduction current, denoted by J, is a measure of the average drift of electric charges both in magnitude and direction. If there is a layer of free electrons moving with a steady drift velocity on the surface, a surface density of convection current may be defined.

In the steady state, the model consisting of electrons rotating about an axis through an atom is an elementary magnet. It is produced by forces causing the electrons in an atom to change their random orbits and circulate about a common axis. Such elementary magnets are oriented along parallel axes with a common direction of rotation, and this orientation is called the magnetization due to circulation, mc. The magnetization per unit volume or the volume density of magnetization of the circulating electrons is denoted by Mc. There is also a magnetization due to the spin of the electrons. The spin is the property of the electron whereby it has an intrinsic angular momentum in addition to the angular momentum of its orbital motion. The spin magnetization is denoted by ms and the volume density of spin magnetization is denoted by Ms. The total volume density of magnetization is M = Mc + Ms.

The magnetization vector M is a continuous function that measures the average density and direction of magnetization vectors in a small region about any point. Since it is parallel to the axis of rotation of the circulating charges, M is perpendicular to the plane of motion of the charges represented by it. The vector n × (−M) is perpendicular to both M and n, and its magnitude is proportional to that of M (the negative sign preserves the right-hand screw convention for the rotating current whirls of positive charge). The magnitude of n × (−M) is a measure of the average vector sum of the tangential components of the magnetization vectors.

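The expression

$$\oint_{\Sigma} \hat{\mathbf{n}} \times \mathbf{M}\, d\sigma \tag{6}$$

is a measure of the average number of positive half-current whirls that are cut and added to the volume enclosed by the closed surface $\Sigma$. Hence the current per unit volume appearing in $\Delta\tau$ is

$$\frac{1}{\Delta\tau}\oint_{\Sigma} \hat{\mathbf{n}} \times \mathbf{M}\, d\sigma. \tag{7}$$

This represents the mean density of magnetization current in $\Delta\tau$. Then

$$\lim_{\Delta\tau \to 0} \frac{1}{\Delta\tau}\oint_{\Sigma} \hat{\mathbf{n}} \times \mathbf{M}\, d\sigma = \nabla \times \mathbf{M} \quad (\text{curl of } \mathbf{M}). \tag{8}$$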

The actual effective volume and surface densities of chargeare given by

ρ(r ) = ρ(r ) − ∇ · P(r ),(9)

η(r ) = η(r ) + n · P(r ).

The effective volume and surface densities of current are

J(r ) = J(r ) + ∇ × M(r ),(10)

K(r ) = K(r ) − n × M(r ).

The static and steady states are both stationary since the density functions are independent of time. In the nonstationary state, the same density functions may be used to describe the instantaneous distributions, but they vary in time at every point. Hence,

$$\frac{\partial \bar{\rho}}{\partial t} = \frac{\partial \rho}{\partial t} - \nabla \cdot \frac{\partial \mathbf{P}}{\partial t}. \tag{11}$$

If Δτ is a volume cell enclosed by a surface Σ, the total charge is ρΔτ, and the rate of increase of the positive charge is given by its time derivative. By conservation of charge, this must be equal to the net positive charge entering across Σ per unit time. The resulting equation is

$$\frac{\partial \rho}{\partial t} = \lim_{\Delta\tau \to 0}\left(-\frac{1}{\Delta\tau}\int_{\Sigma} \mathbf{n} \cdot \mathbf{J}\, d\sigma\right) = -\nabla \cdot \mathbf{J}, \tag{12}$$

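which is the equation of continuity for electric charge. With Eq. (11) and the vector identity ∇ · ∇ × M = 0, the more general equation of continuity is obtained as

$$\frac{\partial \bar{\rho}}{\partial t} + \nabla \cdot \overline{\rho_m \mathbf{v}} = 0, \tag{13}$$

where the essential volume density of moving charge is defined as

$$\overline{\rho_m \mathbf{v}} = \mathbf{J} + \nabla \times \mathbf{M} + \frac{\partial \mathbf{P}}{\partial t}. \tag{14}$$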
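
The step from Eqs. (11) and (12) to Eq. (13) can be spelled out as follows. This is only a sketch of the algebra; it writes the effective charge density with an overbar and the essential density of moving charge as $\overline{\rho_m \mathbf{v}}$, which is a notational assumption about the flattened symbols in the text.

```latex
\begin{aligned}
\frac{\partial \bar{\rho}}{\partial t}
  &= \frac{\partial \rho}{\partial t} - \nabla \cdot \frac{\partial \mathbf{P}}{\partial t}
     && \text{Eq. (11)} \\
  &= -\nabla \cdot \mathbf{J} - \nabla \cdot \frac{\partial \mathbf{P}}{\partial t}
     && \text{Eq. (12)} \\
  &= -\nabla \cdot \left( \mathbf{J} + \nabla \times \mathbf{M}
       + \frac{\partial \mathbf{P}}{\partial t} \right)
     && \text{since } \nabla \cdot \nabla \times \mathbf{M} = 0 \\
  &= -\nabla \cdot \overline{\rho_m \mathbf{v}}
     && \text{Eq. (14)}
\end{aligned}
```

Adding the curl of M changes nothing in the divergence, which is why the magnetization current can be included in the moving-charge density without disturbing charge conservation.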

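The general surface equation of continuity is

$$\frac{\partial \bar{\eta}}{\partial t} + \nabla \cdot \overline{\eta_m \mathbf{v}} - \mathbf{n} \cdot \overline{\rho_m \mathbf{v}} = 0, \tag{15}$$

where $\bar{\eta}$ and $\overline{\rho_m \mathbf{v}}$ are defined in Eqs. (9) and (14) and $\overline{\eta_m \mathbf{v}}$ is the surface density of moving charge or current defined in Eq. (10), and

$$\overline{\eta_m \mathbf{v}} \equiv \bar{\mathbf{K}} = \mathbf{K} - \mathbf{n} \times \mathbf{M}.$$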

B. Maxwell’s Equations

The density fields of matter described in Section II.A have to be interconnected, and that is the fundamental purpose of the mathematical description of space in terms of its electromagnetic properties. In such a description, space consists of a coordinate system that assigns three coordinates to every point to provide a relationship to an arbitrarily selected region. In certain regions, which are located by these coordinates, the density fields characterizing matter have nonzero values. These regions define the positions of the mathematical bodies in terms of the coordinates. In empty space the density fields are zero. The mathematical model includes all of space in order to interconnect the different density fields that are scattered over it. This is done by assigning two vectors to every point in space whether it is empty or has nonzero density fields. The electrical structure of mathematical space is described in terms of two vector fields: the electric vector E and the magnetic vector B. An electric field is said to exist in a region in which E has a value at every point. A magnetic field exists in a region where B has a value at every point. The superposition of the two fields is called the electromagnetic field. Thus, the mathematical description of the structure of space is completely identified with the electromagnetic field. The definition of each of the two vectors E and B involves a numerical, experimentally determined proportionality constant with appropriate dimensions. These are the fundamental electric constant ε0, called the permittivity of free space, and the fundamental magnetic constant µ0, called the permeability of free space. These factors are necessary to get the numerical coordination between the mathematical model of electromagnetism and experimental measurements.

The definition of the vectors E and B in terms of the density fields that characterize the space occupied by matter depends on a fundamental theorem of vector analysis. This theorem states that a vector field is uniquely determined if its divergence and curl are specified and if the normal component of the field is known over a closed

surface or if the vector vanishes as 1/r² at infinity, where r is the distance from the density distribution.

The definition of the vectors E and B in terms of their respective divergences and curls is the second fundamental principle of electromagnetics. The first fundamental principle is the conservation of charge, which was expressed mathematically in the equation of continuity. The second principle (which contains the first) is expressed by a set of partial differential equations, called Maxwell’s equations. These express the divergence and curl of the E and B vectors in terms of the density functions and the constants ε0 and µ0 as follows:

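$$\varepsilon_0 \nabla \cdot \mathbf{E} = \bar{\rho}, \tag{16}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \tag{17}$$

$$\mu_0^{-1} \nabla \times \mathbf{B} = \overline{\rho_m \mathbf{v}} + \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}, \tag{18}$$

$$\nabla \cdot \mathbf{B} = 0. \tag{19}$$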

It is assumed that the region (or regions) that is characterized by ρ is as a whole at rest relative to the observer. The defining relations, Eqs. (16)–(19), completely describe the electromagnetic field in terms of the essential volume characteristics. The vectors E and B define a macroscopic field.

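In the stationary states, all of the functions are constant in time, and the field equations are of the form

$$\varepsilon_0 \nabla \cdot \mathbf{E} = \bar{\rho}, \tag{20}$$

$$\nabla \times \mathbf{E} = 0, \tag{21}$$

$$\mu_0^{-1} \nabla \times \mathbf{B} = \bar{\mathbf{J}} = \mathbf{J} + \nabla \times \mathbf{M}, \tag{22}$$

$$\nabla \cdot \mathbf{B} = 0. \tag{23}$$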

The unit for the electric vector E is volts per meter, and the unit for the magnetic vector B is webers per square meter or teslas. The numerical values of the universal constants ε0 and µ0 are obtained from standard experiments to be

$$\varepsilon_0 = 8.854 \times 10^{-12} \approx \frac{1}{36\pi} \times 10^{-9}\ \text{F/m}$$

and

$$\mu_0 = 1.257 \times 10^{-6} = 4\pi \times 10^{-7}\ \text{H/m},$$

respectively.

C. Field Equations at a Surface: Boundary Conditions

A boundary surface is defined to be either the mathematical envelope between a charged region and empty space, where the density fields associated with the region vanish, or the mathematical envelope between two electrically different regions in contact, where the density fields associated with the two change abruptly. Conditions at the boundary between a charged region and space are obtained from those for two charged regions in contact by setting one set of density fields equal to zero.

Since the electromagnetic vectors E and B are defined in terms of all of the volume densities, they cannot represent more rapid fluctuations in electrical conditions than can the densities themselves. Therefore, discontinuities in E and B can exist only at a boundary where an abrupt change from one set of densities to another occurs.

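Maxwell’s equations, written for surface effects on the boundary between two regions, have the form

$$\varepsilon_0 \mathbf{n}_1 \cdot \mathbf{E}_1 + \varepsilon_0 \mathbf{n}_2 \cdot \mathbf{E}_2 = -(\eta_1 + \eta_2 + \mathbf{n}_1 \cdot \mathbf{P}_1 + \mathbf{n}_2 \cdot \mathbf{P}_2), \tag{24}$$

$$\mathbf{n}_1 \times \mathbf{E}_1 + \mathbf{n}_2 \times \mathbf{E}_2 = 0, \tag{25}$$

$$\mu_0^{-1} (\mathbf{n}_1 \times \mathbf{B}_1 + \mathbf{n}_2 \times \mathbf{B}_2) = -(\mathbf{K}_1 + \mathbf{K}_2 - \mathbf{n}_1 \times \mathbf{M}_1 - \mathbf{n}_2 \times \mathbf{M}_2), \tag{26}$$

$$\mathbf{n}_1 \cdot \mathbf{B}_1 + \mathbf{n}_2 \cdot \mathbf{B}_2 = 0, \tag{27}$$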

where n1 and n2 are the exterior unit normals (pointing out of the region in each case). The boundary conditions can be interpreted easily.

The relations (24) and (27) apply to the normal components of the vectors E and B. Thus, Eq. (24) states that the normal component of the electric vector is discontinuous in crossing a boundary surface, and the magnitude of the discontinuity is the essential surface characteristic of charge $\bar{\eta}$ divided by ε0. Equation (27) states that the normal component of the magnetic vector is continuous across all boundaries. The interpretation of Eqs. (25) and (26) is somewhat more involved, since the vector product of the external normal to the surface of a region and one of the field vectors does not specify any particular component of the field vector. It defines an axial vector that has the magnitude of the tangential component of the field vector at the surface and a direction normal to the plane formed by the field vector and the external normal. It follows that the magnitude of the discontinuity of the axial vector so defined equals the discontinuity in the tangential component of the field vector. Therefore, Eq. (25) requires that the tangential component of the electric vector be continuous in crossing all boundaries, whereas Eq. (26) requires the tangential component of the magnetic vector to be discontinuous by a magnitude equal to the essential surface density of current $\overline{\eta_m \mathbf{v}}$ multiplied by µ0. These boundary conditions are illustrated in Fig. 1.

The field equations may also be written in terms of the auxiliary field vectors D and H, which are defined as

$$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P} \tag{28}$$

and

$$\mathbf{H} = \mu_0^{-1} \mathbf{B} - \mathbf{M}. \tag{29}$$

FIGURE 1 Electric (a) and magnetic (b) fields at a boundary.

At all points in space or in bodies where P and M vanish, D = ε0E and H = µ0⁻¹B. Since ε0 and µ0 are scalars, the vectors D and H point in the same direction as E and B at all points where P and −M vanish or are not defined.

Maxwell’s equations expressed in terms of the auxiliary vectors D and H are

∇ · D = ρ, (30)

∇ × E = −∂B/∂t, (31)

∇ × H = J + ∂D/∂t, (32)

∇ · B = 0. (33)

The corresponding surface equations are

n1 · D1 + n2 · D2 = −(η1 + η2), (34)

n1 × E1 + n2 × E2 = 0, (35)

n1 × H1 + n2 × H2 = −(K1 + K2), (36)

n1 · B1 + n2 · B2 = 0. (37)

The boundary conditions are illustrated in Figs. 2 and 3, with η = 0 and K = 0. Media with P parallel to and in the same direction as E are said to be dielectric. Media with −M parallel to and in the same direction as B are diamagnetic. Media with −M parallel and directed opposite to B are paramagnetic when M is small and ferromagnetic when M is large.

FIGURE 2 Electric (a) and magnetic (b) field vectors at a boundary.

Two universal constants may be derived from ε0 and µ0: the characteristic velocity of space (the velocity of light), c = 1/√(ε0µ0) ≈ 3 × 10⁸ m/sec, and the characteristic resistance of space, ζ0 = √(µ0/ε0) ≈ 120π ohms.
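
As a quick numerical check of these two derived constants, the following minimal Python sketch evaluates them from the values of ε0 and µ0 quoted earlier. The function and variable names are illustrative choices, not part of the article.

```python
import math

# Values quoted in the text (SI units).
EPSILON_0 = 8.854e-12        # permittivity of free space, F/m
MU_0 = 4 * math.pi * 1e-7    # permeability of free space, H/m


def characteristic_velocity(eps: float, mu: float) -> float:
    """Return 1/sqrt(eps*mu), the characteristic velocity in m/s."""
    return 1.0 / math.sqrt(eps * mu)


def characteristic_resistance(eps: float, mu: float) -> float:
    """Return sqrt(mu/eps), the characteristic resistance in ohms."""
    return math.sqrt(mu / eps)


c = characteristic_velocity(EPSILON_0, MU_0)
zeta_0 = characteristic_resistance(EPSILON_0, MU_0)

print(f"c      = {c:.4e} m/s")
print(f"zeta_0 = {zeta_0:.2f} ohms (120*pi = {120 * math.pi:.2f} ohms)")
```

The round figures 3 × 10⁸ m/sec and 120π ohms are approximations; the computed values differ from them by well under one percent.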

D. Integral Forms of the Field Equations

Maxwell’s equations defining the electromagnetic field consist of four simultaneous partial differential equations. It is possible to transform them into integral relations that are often more convenient in the solution of problems,

FIGURE 3 Magnetic field vectors at a boundary.

particularly those characterized by symmetry. This is accomplished by using general integral theorems of calculus. The two theorems are the divergence theorem and Stokes’s theorem.

The divergence theorem transforms a volume integral into an integral evaluated over a surface enclosing the volume. Let A be a continuous vector point function that defines a vector field. If any volume V is chosen in this field, it will be contained in a closed surface S. Let dV be an element of the volume, dS an element of the enclosing surface, and n an external normal to the surface. The theorem is

$$\int_{V} \nabla \cdot \mathbf{A}\, dV = \int_{S} \mathbf{n} \cdot \mathbf{A}\, dS. \tag{38}$$
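
The divergence theorem can be checked numerically on a simple case. The sketch below is only an illustration under stated assumptions: the field A = (x², y², z²) and the unit cube are arbitrary choices made here, and both integrals are approximated with the midpoint rule.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def field(x: float, y: float, z: float) -> Vec3:
    """Illustrative vector field A = (x^2, y^2, z^2)."""
    return (x * x, y * y, z * z)


def divergence(x: float, y: float, z: float) -> float:
    """Analytic divergence of the field above: 2x + 2y + 2z."""
    return 2.0 * x + 2.0 * y + 2.0 * z


def volume_integral(n: int) -> float:
    """Midpoint-rule integral of div A over the unit cube [0, 1]^3."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += divergence((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h ** 3


def surface_flux(n: int) -> float:
    """Midpoint-rule integral of n . A over the six faces of the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            # Each pair of opposite faces: outward normal is +axis on the
            # far face and -axis on the near face.
            total += field(1.0, u, v)[0] - field(0.0, u, v)[0]
            total += field(u, 1.0, v)[1] - field(u, 0.0, v)[1]
            total += field(u, v, 1.0)[2] - field(u, v, 0.0)[2]
    return total * h ** 2


print(volume_integral(10), surface_flux(10))
```

For this field both sides equal 3; the midpoint rule is exact for a linear integrand, so the two numbers agree to rounding error.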

Stokes’s theorem transforms a surface integral over a cap- or cup-shaped surface (a surface that does not enclose a volume) into a line integral around the closed boundary of the surface. Consider any open cap- or cup-shaped surface S, which may have any form whatsoever, from a flat disc enclosed by the boundary line s to a deep balloon with only a narrow opening enclosed by the boundary line s. Let this surface be entirely in the field of a continuous vector point function A. The theorem is

$$\int_{S(\text{cap})} \mathbf{n} \cdot (\nabla \times \mathbf{A})\, dS = \oint_{s(\text{closed line})} \mathbf{A} \cdot d\mathbf{s}. \tag{39}$$

The line integration around the closed boundary s is to be performed such that the right-hand screw convention is satisfied with respect to the normal n to the surface S.

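Stokes’s theorem and the divergence theorem may be applied to the four field equations. In integral form they become:

1. Gauss’s law:

$$\varepsilon_0 \oint_{S(\text{closed})} \mathbf{n} \cdot \mathbf{E}\, dS = \bar{Q}. \tag{40}$$

2. Faraday’s law:

$$\oint_{s} \mathbf{E} \cdot d\mathbf{s} = -\frac{\partial}{\partial t} \int_{S(\text{cap})} \mathbf{N} \cdot \mathbf{B}\, dS. \tag{41}$$

3. The Ampere–Maxwell theorem:

$$\mu_0^{-1} \oint_{s} \mathbf{B} \cdot d\mathbf{s} = \bar{I} + \frac{\partial}{\partial t}\, \varepsilon_0 \int_{S(\text{cap})} \mathbf{N} \cdot \mathbf{E}\, dS. \tag{42}$$

4. Gauss’s law for the magnetic field:

$$\oint_{S(\text{closed})} \mathbf{n} \cdot \mathbf{B}\, dS = 0. \tag{43}$$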

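Here n is the external normal to a closed surface S enclosing the volume V, N is the normal to a cap surface S (open surface) bounded by a closed contour s,

$$\bar{Q} = \int_{\tau} \bar{\rho}\, dV + \int_{\Sigma} \bar{\eta}\, dS, \tag{44}$$

and

$$\bar{I} = \int_{S(\text{cap})} \mathbf{N} \cdot \overline{\rho_m \mathbf{v}}\, dS + \oint_{s} \mathbf{N} \cdot \overline{\eta_m \mathbf{v}}\, ds. \tag{45}$$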

The integral forms may be written in terms of the auxiliary vectors. In Eq. (40), ε0E and $\bar{Q}$ are replaced by D and Q, respectively; Q is obtained from Eq. (44) by replacing $\bar{\rho}$ and $\bar{\eta}$ by ρ and η. In Eq. (42), µ0⁻¹B and $\bar{I}$ are replaced by H and I, respectively, and ε0E by D; I is obtained from Eq. (45) by replacing $\overline{\rho_m \mathbf{v}}$ by J and $\overline{\eta_m \mathbf{v}}$ by K.

E. Field Equations in Simple Media

Maxwell’s equations in their general forms together with the definitions of the essential densities involve four volume densities: ρ, P, J, and M. In many materials, these densities are induced by an externally maintained electromagnetic field. In simple media or linear media, there is a linear relation between the density functions on the one hand and the exciting field on the other. In such media,

$$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P} = (1 + \chi_e)\varepsilon_0 \mathbf{E}, \tag{46}$$

$$\mathbf{H} = \mu_0^{-1} \mathbf{B} - \mathbf{M} = (1 + \chi_m)^{-1} \mu_0^{-1} \mathbf{B}, \tag{47}$$

$$\mathbf{J}_f = \sigma \mathbf{E}, \tag{48}$$

where χe is the electric susceptibility, χm the magnetic susceptibility, and σ the conductivity. The quantities (1 + χe) and (1 + χm) are each represented by a symbol standing for the properties of linear polarizability or linear magnetizability. Hence εr = (1 + χe) and µr = (1 + χm); εr is the relative dielectric constant or relative permittivity of a linearly polarizable medium and µr is the relative permeability of a linearly magnetizable medium. Then ε = ε0εr and µ = µ0µr; ε is the dielectric constant or permittivity and µ is the permeability. The auxiliary vectors and the polarization and magnetization vectors may be written in terms of these symbols.

The field equations in simple media can now be written as

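$$\varepsilon \nabla \cdot \mathbf{E} = \rho_f, \tag{49}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \tag{50}$$

$$\mu^{-1} \nabla \times \mathbf{B} = \sigma \mathbf{E} + \varepsilon \frac{\partial \mathbf{E}}{\partial t}, \tag{51}$$

$$\nabla \cdot \mathbf{B} = 0, \tag{52}$$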

where the volume density of free charge ρf is the only density function appearing explicitly. Equations (49)–(52) are valid in linearly polarizing, magnetizing, and conducting media.

The boundary conditions are also expressed solely in terms of free charge densities as follows:

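$$\varepsilon_1 \mathbf{n}_1 \cdot \mathbf{E}_1 + \varepsilon_2 \mathbf{n}_2 \cdot \mathbf{E}_2 = -(\eta_{1f} + \eta_{2f}), \tag{53}$$

$$\mathbf{n}_1 \times \mathbf{E}_1 + \mathbf{n}_2 \times \mathbf{E}_2 = 0, \tag{54}$$

$$\mu_1^{-1} (\mathbf{n}_1 \times \mathbf{B}_1) + \mu_2^{-1} (\mathbf{n}_2 \times \mathbf{B}_2) = -(\mathbf{K}_{1f} + \mathbf{K}_{2f}), \tag{55}$$

$$\mathbf{n}_1 \cdot \mathbf{B}_1 + \mathbf{n}_2 \cdot \mathbf{B}_2 = 0. \tag{56}$$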

The field equations and the boundary conditions may be specialized for different materials. For good conductors the conductivity is large, and it may be assumed that the free charge is on the surface; the volume density of charge is therefore approximately zero. Equations (49) and (51) now become

∇ · E = 0 (57)

and

∇ × B = µσE. (58)

At the boundary between two good conductors, numbered 1 and 2, with the assumption that K1f and K2f are each zero, Eq. (55) becomes

$$\mu_1^{-1} (\mathbf{n}_1 \times \mathbf{B}_1) + \mu_2^{-1} (\mathbf{n}_2 \times \mathbf{B}_2) = 0. \tag{59}$$

All the other equations are unchanged in the interior as well as on the surface.

For nonconductors, there is no free charge distribution on the surface or in the interior of the material and the conductivity is very small. Thus, the field equations (49) and (51) for the interior become

∇ · E = 0 (60)

and

∇ × B = µε ∂E/∂t. (61)

At boundaries between two nonconductors, numbered 1 and 2, there are no free charges or currents. Hence, the right-hand sides of Eqs. (53) and (55) vanish. All the other equations remain unchanged. When there is a boundary between a nonconductor and a conductor, Eqs. (53)–(56) may be suitably adapted.

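The first-order partial differential equations can be converted to second-order equations. Hence,

$$\nabla \times \nabla \times \mathbf{E} + \mu\sigma \frac{\partial \mathbf{E}}{\partial t} + \mu\varepsilon \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0. \tag{62}$$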

The second-order equation for the magnetic field is obtained from Eq. (62) by replacing E by B.
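
The conversion to Eq. (62) takes one step with the simple-media equations (50) and (51). The following is a sketch of that step, assuming µ, ε, and σ are uniform in space and constant in time (an assumption made here so that they can be moved through the derivatives).

```latex
\begin{aligned}
\nabla \times \nabla \times \mathbf{E}
  &= -\frac{\partial}{\partial t}\left( \nabla \times \mathbf{B} \right)
     && \text{curl of Eq. (50)} \\
  &= -\frac{\partial}{\partial t}\left( \mu\sigma \mathbf{E}
       + \mu\varepsilon \frac{\partial \mathbf{E}}{\partial t} \right)
     && \text{Eq. (51)} \\
  &= -\mu\sigma \frac{\partial \mathbf{E}}{\partial t}
     - \mu\varepsilon \frac{\partial^2 \mathbf{E}}{\partial t^2}.
\end{aligned}
```

Moving both terms to the left-hand side gives Eq. (62).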

The field equations have been formulated for arbitrary time dependence in the foregoing. For most practical applications, sinusoidal signals are used. Signals that are in the form of pulses can also be decomposed into sinusoidal components with the use of Fourier analysis. Therefore, Maxwell’s equations are obtained assuming a periodic (or harmonic) time dependence.

The equations for the interior are

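$$\varepsilon_0 \nabla \cdot \mathbf{E} = \bar{\rho}, \tag{63}$$

$$\nabla \times \mathbf{E} = -j\omega \mathbf{B}, \tag{64}$$

$$\mu_0^{-1} \nabla \times \mathbf{B} = \overline{\rho_m \mathbf{v}} + j\omega\varepsilon_0 \mathbf{E}, \tag{65}$$

$$\nabla \cdot \mathbf{B} = 0. \tag{66}$$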

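The boundary conditions are unchanged and are given by Eqs. (53)–(56). The definitions of $\bar{\rho}$, $\bar{\eta}$, and $\overline{\eta_m \mathbf{v}}$ are unchanged. The volume current density function becomes

$$\overline{\rho_m \mathbf{v}} = \mathbf{J} + \nabla \times \mathbf{M} + j\omega \mathbf{P}. \tag{67}$$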

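The equations of continuity are

$$\nabla \cdot \overline{\rho_m \mathbf{v}} + j\omega \bar{\rho} = 0 \tag{68}$$

and

$$\nabla \cdot (\overline{\eta_m \mathbf{v}}_1 + \overline{\eta_m \mathbf{v}}_2) + j\omega(\bar{\eta}_1 + \bar{\eta}_2) - (\mathbf{n}_1 \cdot \overline{\rho_m \mathbf{v}}_1 + \mathbf{n}_2 \cdot \overline{\rho_m \mathbf{v}}_2) = 0. \tag{69}$$

The field vectors and the density functions are now of the form

$$\mathbf{E}(\mathbf{r}, t) = \mathrm{Re}\left[\mathbf{E}(\mathbf{r}, \omega)\, e^{j\omega t}\right]. \tag{70}$$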

The field equations and boundary conditions for simple media with periodic time dependence are easily obtained from the above relations by expressing J and P in terms of E, and M in terms of B, for nonconductors and conductors.

There are certain generalized coefficients for simple media that are of great importance in electromagnetics. The complex quantity k is known as the wavenumber and is defined as

$$k = \omega(\mu\tilde{\varepsilon})^{1/2}, \tag{71}$$

where $\tilde{\varepsilon}$ is a complex permittivity defined as

$$\tilde{\varepsilon} = \varepsilon - j\sigma/\omega. \tag{72}$$

When the medium is air, $\tilde{\varepsilon}$ = ε0, µ = µ0, and k becomes the free-space wavenumber k0 = ω/c, where c is the velocity of light; k is dimensionally a reciprocal length measured in reciprocal meters. The complex wavenumber defined in Eq. (71) may be written as k = β − jα, where β is the real phase constant and α is the real attenuation constant. Corresponding to the velocity in air, a phase velocity in the medium is defined as v = ω/β. The loss tangent p is defined as

$$p = \sigma/\omega\varepsilon, \tag{73}$$

so that

$$k = k_e(1 - jp)^{1/2}, \tag{74}$$

where $k_e = \omega(\mu\varepsilon)^{1/2} = k_0 \varepsilon_r^{1/2}$. The characteristic phase velocity v and characteristic impedance ζ of a simple

medium may be defined by analogy with air with ε and µ replacing ε0 and µ0.

For poor conductors, the loss tangent is very small, p² ≪ 1, and β = ke = ω/v. The attenuation is also very small and is given by α = σζ/2. For good conductors, p ≫ 1, the contribution of εr may be neglected, and β = α = (ωσµ/2)^{1/2}. The reciprocal of α is dimensionally a length called the skin depth or skin thickness for a good conductor. It is ds = 1/α = (2/ωµσ)^{1/2}. It is defined as the depth at which the electric field reduces to 1/e of its value at the surface of the conductor, where e is the base of the Napierian logarithm.
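
The relation between the complex wavenumber and the skin depth can be illustrated numerically. The sketch below is not from the article: the function names are hypothetical, and the conductivity 5.8 × 10⁷ S/m (a commonly quoted figure for copper) and the frequency 1 MHz are assumed example values.

```python
import cmath
import math

EPSILON_0 = 8.854e-12        # F/m, value quoted in the text
MU_0 = 4 * math.pi * 1e-7    # H/m, value quoted in the text


def wavenumber(freq_hz: float, eps_r: float, mu_r: float, sigma: float) -> complex:
    """Complex wavenumber k = omega * sqrt(mu * (eps - j*sigma/omega))."""
    omega = 2.0 * math.pi * freq_hz
    eps_complex = complex(eps_r * EPSILON_0, -sigma / omega)
    # The principal square root has a positive real part and a negative
    # imaginary part here, so k comes out in the form beta - j*alpha.
    return omega * cmath.sqrt(mu_r * MU_0 * eps_complex)


def skin_depth_good_conductor(freq_hz: float, mu_r: float, sigma: float) -> float:
    """Good-conductor approximation d_s = sqrt(2 / (omega * mu * sigma))."""
    omega = 2.0 * math.pi * freq_hz
    return math.sqrt(2.0 / (omega * mu_r * MU_0 * sigma))


# Assumed example: copper-like conductor at 1 MHz.
k = wavenumber(1.0e6, 1.0, 1.0, 5.8e7)
beta, alpha = k.real, -k.imag
d_s = skin_depth_good_conductor(1.0e6, 1.0, 5.8e7)

print(f"beta  = {beta:.4e} 1/m")
print(f"alpha = {alpha:.4e} 1/m")
print(f"1/alpha = {1.0 / alpha:.4e} m, d_s = {d_s:.4e} m")
```

For a loss tangent this large, β and α from the full expression agree with each other and with 1/ds to many digits, which is the good-conductor limit described above.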

F. Scalar and Vector Potential Functions

In the solution of electromagnetic field problems, it is often easier to use auxiliary functions defined in terms of the field vectors, such as potential functions. Two such functions are the scalar potential function φ and the vector potential function A. The four first-order field equations can be transformed into two second-order equations in the potential functions that can be integrated. The magnetic vector B has a zero divergence and it is said to be a solenoidal vector. Using the vector identity ∇ · ∇ × C = 0, one can derive B from a vector potential function A,

∇ × A = B. (75)

The vector point function A is defined incompletely in Eq. (75), as its divergence is not known. This function is known as the magnetic vector potential.

To define a scalar potential it is necessary to find a vector with vanishing curl. From the symmetry of electric and magnetic quantities, the second field equation should be used for this purpose, since this is the electric analogue of the magnetic fourth equation. The second equation in its most general form is given in Eq. (17). With the substitution of Eq. (75), this becomes

∇ × (E + ∂A/∂t) = 0. (76)

The vector (E + ∂A/∂t) is a potential vector because its curl vanishes. It can be derived from a scalar potential φ defined by

−∇φ = E + ∂A/∂t. (77)

This is also a fundamentally important relation. The scalar potential defined by Eq. (77) is called the electric scalar potential. It is seen that if the scalar and vector potentials φ and A are known, the electromagnetic vectors E and B may be determined directly from them.

With the scalar and vector potentials defined, the next step is to eliminate E and B from the field equations. It is to be noted that the scalar potential φ has been defined completely, but the definition of A will not be complete until its divergence is defined. The potential functions have been defined in terms of two of the field equations. They must still satisfy the other two. The divergence of A is now defined as follows:

∇ · A + ε0µ0 ∂φ/∂t = 0. (78)

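This is known as the Lorentz condition. With Eq. (78) and the relations between the field vectors and the potential functions, the following d’Alembert equations governing the potential functions φ and A are obtained:

$$\nabla^2 \phi - \mu_0\varepsilon_0 \frac{\partial^2 \phi}{\partial t^2} = -\frac{\bar{\rho}}{\varepsilon_0} \tag{79}$$

and

$$\nabla^2 \mathbf{A} - \mu_0\varepsilon_0 \frac{\partial^2 \mathbf{A}}{\partial t^2} = -\mu_0\, \overline{\rho_m \mathbf{v}}. \tag{80}$$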

In the stationary state, the second terms in Eqs. (79) and (80) vanish and each reduces to Poisson’s equation. In regions where there are no densities present, Poisson’s equation reduces to Laplace’s equation. In simple media, the equations may be obtained directly by writing ε for ε0, µ for µ0, ρf for $\bar{\rho}$, and Jf for $\overline{\rho_m \mathbf{v}}$. The potential equations and the Lorentz condition may be written for periodic time dependence by the substitutions ∂/∂t = jω and ∂²/∂t² = −ω².
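
The way the Lorentz condition produces the d’Alembert equation for φ can be sketched in a few lines, starting from Eqs. (16), (77), and (78). The overbar on the charge density is a notational assumption about the flattened symbol in the text.

```latex
\begin{aligned}
\nabla \cdot \mathbf{E}
  &= \nabla \cdot \left( -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} \right)
   = -\nabla^2 \phi - \frac{\partial}{\partial t}\left( \nabla \cdot \mathbf{A} \right)
     && \text{Eq. (77)} \\
  &= -\nabla^2 \phi + \varepsilon_0 \mu_0 \frac{\partial^2 \phi}{\partial t^2}
     && \text{Eq. (78)} \\
  &= \frac{\bar{\rho}}{\varepsilon_0}
     && \text{Eq. (16)}
\end{aligned}
\quad\Longrightarrow\quad
\nabla^2 \phi - \mu_0 \varepsilon_0 \frac{\partial^2 \phi}{\partial t^2}
  = -\frac{\bar{\rho}}{\varepsilon_0}.
```

The same substitution in Eq. (18), together with the identity ∇ × ∇ × A = ∇(∇ · A) − ∇²A, leads to the corresponding equation for A.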

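The solutions to the potential equations with periodic time dependence may be written in different forms, but it is desirable to choose those that satisfy the Lorentz condition. A particularly useful form of the solution is a particular integral called the Helmholtz integral. These integrals for the scalar and vector potentials are

$$\phi = \frac{1}{4\pi\varepsilon_0} \left[ \int_{\tau} \frac{\bar{\rho}'}{R}\, e^{-jk_0 R}\, d\tau' + \int_{\Sigma} \frac{\bar{\eta}'}{R}\, e^{-jk_0 R}\, d\sigma' \right], \tag{81}$$

$$\mathbf{A} = \frac{\mu_0}{4\pi} \left[ \int_{\tau} \frac{\overline{\rho_m \mathbf{v}}'}{R}\, e^{-jk_0 R}\, d\tau' + \int_{\Sigma} \frac{\overline{\eta_m \mathbf{v}}'}{R}\, e^{-jk_0 R}\, d\sigma' \right], \tag{82}$$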

where dτ′ represents an element of volume and dσ′ represents an element of surface. The integration is carried out over all regions and surfaces where the density functions are nonzero and are defined. R is the distance from a point P where φ and A are to be determined to a variable point of integration P′. In Cartesian coordinates, R is the distance between P(x, y, z) and P′(x′, y′, z′) and is given by R = [(x − x′)² + (y − y′)² + (z − z′)²]^{1/2}. Equations (81) and (82) represent the solutions to the d’Alembert equations for the scalar and vector potentials for a periodic time dependence outside regions where the charge and current densities exist. The solutions in the interior of regions where charge and current densities exist are obtained by solving the appropriate boundary value problem. It may be noted that k0 = ω(ε0µ0)^{1/2}.

G. Electromagnetic Force and Energy

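The concepts of force and energy are very important in any discussion of electromagnetics. The electromagnetic force and energy may be expressed in terms of the field vectors as well as the potential functions. The electromagnetic force acting on a body is given by

$$\mathbf{F} = q\mathbf{E} + \overline{q_m \mathbf{v}} \times \mathbf{B}, \tag{83}$$

with

$$q = \int_{\tau} \bar{\rho}\, d\tau + \int_{\Sigma} \bar{\eta}\, d\sigma, \tag{84}$$

$$\overline{q_m \mathbf{v}} = \int_{\tau} \overline{\rho_m \mathbf{v}}\, d\tau + \int_{\Sigma} \overline{\eta_m \mathbf{v}}\, d\sigma. \tag{85}$$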

If all the charges are in motion and none is stationary (as, for example, in an electron stream), q and qm coincide and Eq. (83) reduces to

F = q(E + v × B). (86)
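
Equation (86) is straightforward to evaluate for a single moving charge. The following is a minimal sketch with hypothetical helper names; the numerical inputs in the example are arbitrary illustrative values.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Right-handed vector product a x b."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def lorentz_force(q: float, e_field: Vec3, velocity: Vec3, b_field: Vec3) -> Vec3:
    """Force F = q (E + v x B) on a charge q moving with velocity v (SI units)."""
    v_cross_b = cross(velocity, b_field)
    return (
        q * (e_field[0] + v_cross_b[0]),
        q * (e_field[1] + v_cross_b[1]),
        q * (e_field[2] + v_cross_b[2]),
    )


# Illustrative case: unit charge, E along x, v along y, B along z.
force = lorentz_force(1.0, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
print(force)
```

In this example v × B points along x, so the magnetic term adds to the electric term and the force is purely along x.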

A vector S called the Poynting vector is defined as

S = E × H. (87)

It has the units of volt-ampere per square meter or watt per square meter. The total outward normal flux of the Poynting vector T is given by

$$T = \int_{\Sigma} (\mathbf{n} \cdot \mathbf{S})\, d\sigma, \tag{88}$$

where Σ represents the surface enclosing the volume τ. The electric energy density is equal to ½ D · E, and the magnetic energy density is equal to ½ H · B. Therefore, the total electromagnetic energy stored in the volume τ is

$$U = U_E + U_M = \int_{\tau} \left( \tfrac{1}{2}\mu^{-1} B^2 + \tfrac{1}{2}\varepsilon E^2 \right) d\tau. \tag{89}$$

The principle of conservation of energy requires that the net energy lost must be equal to the net energy gained. Since power is the time rate of change of energy, the principle of energy conservation can be written as the principle of power conservation. The Poynting theorem gives the power conservation equation for simple media,

$$T = -\frac{\partial U}{\partial t} - \int_{\tau} \mathbf{J}_f \cdot \mathbf{E}\, d\tau. \tag{90}$$

T measures the time rate of the increase of energy associated with all the regions outside the volume τ; that is, it represents the flow of power across the surface Σ enclosing the volume τ. The element U is defined in Eq. (89), and the first term on the right of Eq. (90) defines the time rate of decrease of the total electromagnetic energy in the volume τ; the second term on the right represents the time rate of heat dissipated in the volume τ, and it is associated with moving free charges. It is observed that as the energy (or power) outside τ increases, the energy (or power) inside τ decreases, thus conserving energy. Equation (90) is obtained from Maxwell’s equations. The free-space Poynting theorem is written by substituting ε0 for ε and µ0 for µ in Eq. (90).

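For harmonic time dependence, a complex Poynting vector may be defined as

$$\mathbf{S}^{*} = \tfrac{1}{2} (\mathbf{E} \times \mathbf{H}^{*}). \tag{91}$$

A real power equation is now obtained:

$$\int_{\Sigma} \mathrm{Re}(\mathbf{S}^{*})\, d\sigma = \frac{1}{2} \int_{\tau} \frac{J_f^2}{\sigma}\, d\tau, \tag{92}$$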

where Re(S∗) is the time-average power density leaving the unit area of the surface Σ. The time-average electric and magnetic energy functions may be defined as 〈UE〉 = UE/2 and 〈UM〉 = UM/2. The total time-average electromagnetic energy is the sum of the two quantities. It remains a constant due to energy conservation, and hence the time-average electric energy is equal to the time-average magnetic energy, and the total energy may be written as

〈U (t)〉 = 2〈UE(t)〉 = 2〈UM(t)〉. (93)

III. ELECTROMAGNETIC WAVES

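The Helmholtz integrals that define the potential field in terms of the essential densities of charge and current were seen in Eqs. (81) and (82). Since the relations between the electric and magnetic vectors and the potential functions are known, explicit formulas for E and B may be obtained. These are the integral solutions of Maxwell’s equations. The general integrals for the electromagnetic field with periodic time dependence are

$$\mathbf{E} = \frac{1}{4\pi\varepsilon_0} \int_{\tau} e^{-jk_0 R} \left[ \frac{\hat{\mathbf{R}}\, \bar{\rho}'}{R^2} + \left( \bar{\rho}'\, \hat{\mathbf{R}} - \frac{\overline{\rho_m \mathbf{v}}'}{c} \right) \frac{jk_0}{R} \right] d\tau' + \mathbf{S}_E, \tag{94}$$

$$\mathbf{B} = \frac{\mu_0}{4\pi} \int_{\tau} e^{-jk_0 R}\, (\overline{\rho_m \mathbf{v}}' \times \hat{\mathbf{R}}) \left( \frac{1}{R^2} + \frac{jk_0}{R} \right) d\tau' + \mathbf{S}_B, \tag{95}$$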

where SE and SB are the surface integrals that are obtained from the volume integrals in each case by writing η for ρ and Σ for τ, $\hat{\mathbf{R}}$ is a unit vector from the variable point P′ to the fixed point P where the field is to be calculated, and R is the distance between P′ and P.

If the electromagnetic field at points in space is due to currents and charges on a cylindrical conductor (such as a dipole antenna) with a small cross section and its axis

along the z axis of a coordinate system, the integrals in Eqs. (94) and (95) may be further simplified. The conductor is assumed to extend from z = −h to z = +h along the z axis and to have a radius a. The expressions for E and B are obtained from Eqs. (94) and (95) by replacing the volume integral by a line integral between the limits z = −h and z = +h; $\bar{\rho}'$ becomes q′, $\overline{\rho_m \mathbf{v}}'$ becomes $\hat{\mathbf{z}} I'_z$, and dτ′ becomes dz′. There is no surface integral and

$$q' = \int_{S} \bar{\rho}'\, dS' + \oint_{s} \bar{\eta}'\, ds', \tag{96}$$

$$I'_z = \int_{S} \overline{\rho_m v_z}'\, dS' + \oint_{s} \overline{\eta_m v_z}'\, ds'. \tag{97}$$

In Eqs. (96) and (97), the first integral is taken over the cross section S = πa² and the second integral is taken over the circumference s = 2πa; q′ and I′z are the amplitudes of the total charge per unit length and the total axial conduction current. For a thin conductor, the contributions due to radial and azimuthal current are negligible. The instantaneous values of E and B are obtained by introducing the time factor e^{jωt} and taking the real part. The charge is expressed in terms of the current using the equation of continuity:

$$\frac{dI'_z}{dz'} + j\omega q' = 0. \tag{98}$$

The vector E has a component along R and a second component parallel to z and hence parallel to the current in the conductor; B is everywhere perpendicular to both R and z and hence is directed along tangents to circles drawn with the conductor as the common axis. Therefore, E and B are everywhere mutually perpendicular, and the equiphase surfaces of E and B due to the periodically varying charge and current in an element dz′ of the conductor are spherical shells.

The general electromagnetic field can be expressed as the sum of two components: the induction field and the radiation field. The induction (or near-zone) field dominates in the immediate vicinity of the volume on and in which charge and current densities exist. The radiation (or far-zone) field is dominant at great distances from the source distributions. The near zone is characterized by the condition k0R ≪ 1, and only the terms with 1/R² are significant in the volume and surface integrals in Eqs. (94) and (95). In the stationary state, all the densities are constant in time, and the expressions for the electrostatic and magnetostatic field are obtained from Eqs. (94) and (95). At ω = 0 and at low frequencies, the induction field is dominant and the radiation field is negligibly small. In the special case where R is larger than the largest dimension in the volume τ, the field vector is given by

$$\mathbf{E} = \hat{\mathbf{R}}\, \frac{q}{4\pi\varepsilon_0 R^2} = \frac{\mathbf{F}}{Q}, \tag{99}$$

where F is the electrostatic force on a charge Q, and Eq. (99) is a statement of Coulomb’s law, a fundamental postulate of electromagnetism. The expression for the near-zone B vector is called the Biot–Savart law.
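
The Coulomb field of Eq. (99) can be evaluated directly for a point charge. The sketch below uses a hypothetical function name and treats the source as a single point charge, which is the limiting case the text describes; the charge value and positions are arbitrary examples.

```python
import math
from typing import Tuple

EPSILON_0 = 8.854e-12  # F/m, value quoted in the text

Vec3 = Tuple[float, float, float]


def coulomb_field(q: float, source: Vec3, point: Vec3) -> Vec3:
    """Electrostatic field (V/m) at `point` due to a point charge q at `source`.

    Implements E = R_hat * q / (4*pi*eps0*R^2), with R_hat directed from the
    source point to the field point.
    """
    rx, ry, rz = (p - s for p, s in zip(point, source))
    dist = math.sqrt(rx * rx + ry * ry + rz * rz)
    if dist == 0.0:
        raise ValueError("field point coincides with the source point")
    # Dividing by dist**3 combines the 1/R^2 magnitude with the unit vector.
    scale = q / (4.0 * math.pi * EPSILON_0 * dist ** 3)
    return (scale * rx, scale * ry, scale * rz)


# Illustrative case: 1 nC at the origin, observed at 1 m and 2 m along x.
e_1m = coulomb_field(1.0e-9, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
e_2m = coulomb_field(1.0e-9, (0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
print(e_1m, e_2m)
```

Doubling the distance reduces the field by a factor of four, the inverse-square behavior that distinguishes the near-zone terms from the 1/R radiation terms.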

The radiation or far-zone field is defined by k0R ≫ 1. When this condition is satisfied, the radiation field is dominant, and the terms involving 1/R in Eqs. (94) and (95) define the E and B vectors. The 1/R² terms are vanishingly small. As before, the fields may be specialized for the case of a cylindrical conductor that is thin.

The electromagnetic field in the far zone due to periodically varying charges and currents in an element of length dz′ may be described in terms of a wave picture with spherical equiphase surfaces expanding with the constant velocity c in free space. The magnetic waves are transverse; the electric waves are both transverse and longitudinal.

Spherical and plane electromagnetic waves will now be described. Since the conductor of length 2h producing the field is at a very large distance from the point of observation P where the radiation field is to be determined, the inequality R ≫ h is satisfied. Here, R is the distance from any point on the conductor to P. It follows that R0 ≫ h is also satisfied, where R0 is the distance from the origin located at the center of the conductor to P, as shown in Fig. 4. Then the field vectors may be written as

$$\mathbf{E} = \frac{jk_0}{4\pi\varepsilon_0}\, \frac{e^{-jk_0 R_0}}{R_0} \int_{-h}^{+h} \left( \hat{\mathbf{R}}_0\, q' - \frac{\hat{\mathbf{z}}\, I'_z}{c} \right) e^{jk_0 z' \cos\Theta}\, dz' \tag{100}$$

and

$$\mathbf{B} = \frac{j\mu_0 k_0}{4\pi}\, \frac{e^{-jk_0 R_0}}{R_0} \int_{-h}^{+h} (\hat{\mathbf{z}} \times \hat{\mathbf{R}}_0)\, I'_z\, e^{jk_0 z' \cos\Theta}\, dz'. \tag{101}$$

It may be shown that

$$\mathbf{E} = c\,(\mathbf{B} \times \hat{\mathbf{R}}_0). \tag{102}$$

FIGURE 4 Antenna in space.

In the system of spherical coordinates (R0, Θ, Φ), the E vector has a Θ component and the B vector has a Φ component. All other components vanish. The field in the radiation zone is of the form

$$E_{\Theta} = f(h, \Theta)\, \frac{e^{-jk_0 R_0}}{R_0} = F(\Theta, R_0) = cB_{\Phi}, \tag{103}$$

with

$$f(h, \Theta) = \frac{j\omega\mu_0}{4\pi} \int_{-h}^{+h} I'_z\, e^{jk_0 z' \cos\Theta} \sin\Theta\, dz'. \tag{104}$$

The function F describes transverse spherical waves expanding radially outward with a constant velocity c. The spherical equiphase surfaces are separated by a constant distance, which is the wavelength λ0 = 2π/k0.
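
The spacing of the equiphase surfaces follows directly from k0 = ω/c. The short sketch below computes the free-space wavenumber and wavelength for a given frequency and confirms that the phase factor e^{−jk0R} repeats after one wavelength. The function names and the 300 MHz example frequency are illustrative assumptions.

```python
import cmath
import math

EPSILON_0 = 8.854e-12        # F/m, value quoted in the text
MU_0 = 4 * math.pi * 1e-7    # H/m, value quoted in the text
C = 1.0 / math.sqrt(EPSILON_0 * MU_0)  # characteristic velocity, m/s


def free_space_wavenumber(freq_hz: float) -> float:
    """k0 = omega / c in reciprocal meters."""
    return 2.0 * math.pi * freq_hz / C


def wavelength(freq_hz: float) -> float:
    """lambda0 = 2*pi / k0 in meters."""
    return 2.0 * math.pi / free_space_wavenumber(freq_hz)


def phase_factor(freq_hz: float, distance_m: float) -> complex:
    """Radial phase factor exp(-j*k0*R) of the spherical wave."""
    return cmath.exp(-1j * free_space_wavenumber(freq_hz) * distance_m)


freq = 300.0e6  # assumed example frequency, Hz
lam = wavelength(freq)
print(f"k0 = {free_space_wavenumber(freq):.4f} 1/m, lambda0 = {lam:.4f} m")
print(phase_factor(freq, lam), phase_factor(freq, lam / 2.0))
```

At one wavelength the phase factor returns to 1, and at half a wavelength it equals −1, so successive surfaces of equal phase are one wavelength apart.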

In most problems involving the far-zone field of a radiating structure, only a small part of space is involved. The function F can then be approximated by

$$F e^{j\omega t} = K e^{j(\omega t - k_0 s)}. \tag{105}$$

Equation (105) characterizes the approximate distribution of E and B at a large distance from the radiating structure within a volume V whose dimensions are small compared with R. It is readily interpreted in terms of a simple wave picture. The element s may be written in terms of the Cartesian coordinates as

s = xl0 + ym0 + zn0, (106)

where l0, m0, and n0 are the direction cosines, defined as the cosines of the angles between each of the coordinate axes and R0. Equation (106) defines a plane at right angles to s and at a distance s from the origin. Then Eq. (105) may be described using a picture of plane equiphase surfaces at right angles to s and traveling along s with a constant velocity c. The arcs of radially expanding spherical equiphase surfaces defined by Eq. (104) are assumed to be approximately plane and of constant amplitude provided the distance to the volume from the source is sufficiently great and the solid angle subtended by it is small. Hence, the spherical electromagnetic wave may be approximated by the plane electromagnetic wave.

In problems involving the field in the radiation zone, it is convenient to use rectangular coordinates. The x axis coincides with the spherical coordinate R, the y axis is tangent to and in the direction of the coordinate Φ at P, and the z axis is tangent to and in the direction −Θ at P; x, y, and z form a right-handed system, and the electromagnetic field at P in a small region surrounding it is given by

$$E_\Theta = -E_z, \qquad B_\Phi = B_y. \tag{107}$$

In this case, the radiation zone field may be written as

$$-E_z(t) = cB_y(t) = Ke^{j(\omega t - k_0 x)}. \tag{108}$$

This expression is interpreted in terms of plane transverse waves normal to the x axis and traveling along it with a constant velocity c. The energy is transported in the x direction, and the Poynting vector is in the x direction.

The E vector is always parallel to the z axis and is said to be linearly polarized along the z axis. The B vector is linearly polarized along the y axis. If the orientation of the source of the radiation field, the antenna, is changed, the axes of polarization of E and B are changed. However, the electric and magnetic fields at any point in the radiation zone of a dipole antenna are always linearly polarized along mutually perpendicular axes. Furthermore, these axes lie in a plane at right angles to the line joining the point to the center of the antenna.

If there is more than one antenna, the field will still be in a plane at right angles to the line from the center of the antenna to the point of observation, but E and B will each have components in the z and y directions. If only one frequency is involved,

$$\mathbf{E}(t) = -(\hat{\mathbf{z}}E_z + \hat{\mathbf{y}}E_y)\,e^{j\omega t} \tag{109}$$

and

$$\mathbf{B}(t) = (\hat{\mathbf{y}}B_y + \hat{\mathbf{z}}B_z)\,e^{j\omega t}, \tag{110}$$

with

$$-E_z = cB_y = Ke^{-jk_0x}, \qquad -E_y = cB_z = Ne^{-jk_0x}, \tag{111}$$

where K and N are complex numbers given by

$$K = ae^{jg}; \qquad N = be^{jp}. \tag{112}$$

When a ≠ b and δ = p − g ≠ 0, the loci of the ends of the vectors E(t) and B(t) are ellipses. When a = b and δ = ±π/2, the loci are circles, and when δ = 0 or π, the fields are linearly polarized.
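The polarization cases above can be checked numerically. The following is a minimal sketch, assuming the field is evaluated at x = 0 so that the physical z and y components of E(t) are the real parts of K e^{jωt} and N e^{jωt}; the function names and the tolerance are illustrative choices, not part of the original text.

```python
import cmath
import math


def classify_polarization(a, b, delta, tol=1e-9):
    """Classify the locus of the tip of E(t) for amplitudes a, b and phase difference delta = p - g."""
    # Wrap delta into (-pi, pi] so that delta and delta + 2*pi are treated alike.
    d = math.atan2(math.sin(delta), math.cos(delta))
    # A vanishing component, or components in phase / in antiphase, gives a straight line.
    if a < tol or b < tol or abs(math.sin(d)) < tol:
        return "linear"
    # Equal amplitudes in phase quadrature give a circle.
    if abs(a - b) < tol and abs(abs(d) - math.pi / 2) < tol:
        return "circular"
    return "elliptical"


def field_locus(a, b, g, p, n=8):
    """Sample the real (z, y) components of E(t) over one period at x = 0."""
    K = a * cmath.exp(1j * g)
    N = b * cmath.exp(1j * p)
    points = []
    for i in range(n):
        phase = cmath.exp(1j * 2 * math.pi * i / n)  # e^{j omega t}
        points.append(((K * phase).real, (N * phase).real))
    return points
```

For example, `classify_polarization(1.0, 1.0, math.pi / 2)` returns `"circular"`, and every point returned by `field_locus(1.0, 1.0, 0.0, math.pi / 2)` lies on the unit circle.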

IV. APPLICATIONS OF ELECTROMAGNETICS

A. Transmission of Electromagnetic Waves

A generator producing an electric signal is the source of electromagnetic energy. This energy is transmitted from the generator in a wave motion in metallic or nonmetallic structures, which deliver the energy either to radiating structures such as antennas or to transformers at low frequencies. Up to the ultrahigh-frequency range, these guiding structures are transmission lines, which consist of two or more conductors placed in a specific configuration such as a coaxial one for two conductors. For higher frequencies, waveguides, strip lines, and microstrips are used. The electromagnetic field in these structures is obtained by solving the second-order partial differential equations obtained from Maxwell’s equations with the appropriate boundary conditions. For transmission lines, the field is of the transverse electromagnetic type, where the electric and magnetic fields are orthogonal to each other and transverse to the direction of propagation. The electromagnetic field for waveguides has components transverse to the direction of propagation as well as in the direction of propagation. The fields are of the transverse electric type (the component of the electric field in the direction of propagation vanishes) and of the transverse magnetic type (the component of the magnetic field in the direction of propagation vanishes). Approximate solutions of Maxwell’s equations show that the field in microstrips is of the quasi-transverse electromagnetic type. At frequencies in the optical range, dielectric waveguides of very small cross section are used. The wave equation is solved subject to the boundary conditions for dielectrics. For such structures, hybrid modes are obtained with field components in all the coordinate directions.

B. Radiation of Electromagnetic Waves

Power from the generator is transmitted in a wave motion and delivered to the structure that radiates the energy into space. The radiating structure is called an antenna. The radiated field is determined by using the integral solutions for the second-order differential equations. It is usually more convenient to use the Helmholtz integrals for the scalar and vector potential functions and to determine the electromagnetic field from them. The current and charge distributions on the antenna have to be assumed in order to be able to use the integral solutions. The Helmholtz integrals may also be used to determine the current distribution on the antenna. The boundary conditions on the conducting surface of the antenna are used, and this yields an integral equation for the current that may be solved approximately. The electromagnetic field in the radiation zone depends on the structure of the antenna. In practical problems, the antenna system is designed to give the desired field.

C. Reception of Electromagnetic Waves

The radiated electromagnetic field is received by an antenna system consisting of one or more antennas. The properties of the antenna have to be known for it to be used efficiently as a receiver. An important theorem of electromagnetics is used to show that the properties of an antenna used as a receiver can be predicted from its performance as a transmitter. This is called the Rayleigh–Carson reciprocal theorem. It is derived from Maxwell’s equations and subject only to the condition that the essential density of moving charge ρmv is linearly related to the electric field. It can be shown that this is true in all simply polarizing, magnetizing, and conducting media. The theorem may be expressed as follows: if a generator with an electromotive force (EMF) or driving potential difference of complex amplitude $V^e$ between its terminals maintains a current of complex amplitude I through a load connected between any other pair of terminals in the same or in a coupled network, the current in the load is unaltered if load and generator are interchanged, provided that the impedances connected between each pair of terminals are the same in both cases and the generator maintains the same EMF. Suppose that when $V_j^{e\prime}$ is applied at the terminals j, a current $I_i'$ exists at terminals i, and when $V_i^{e\prime\prime}$ is applied at terminals i, a current $I_j''$ exists at terminals j. The reciprocal theorem reduces to the form

$$I_j''\,V_j^{e\prime} = I_i'\,V_i^{e\prime\prime}. \tag{113}$$

The theorem may be further simplified if the same potential difference is applied in the one case across the terminals j as in the other case across the terminals i. When

$$V_j^{e\prime} = V_i^{e\prime\prime}, \tag{114}$$

it follows that the reciprocal theorem becomes

$$I_j'' = I_i'. \tag{115}$$

When this relation is applied to an antenna, it can be shown that the directional properties of the antenna are the same for transmission and reception. Hence, the energy absorbed by a receiving antenna can be determined from its radiation zone field when used as a transmitting antenna.
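The reciprocal theorem can be illustrated on a simple lumped network. The sketch below is only an illustration under stated assumptions: the coupled network is taken to be a two-mesh circuit with self-impedances z11 and z22 and a shared (mutual) branch impedance z12, both mesh currents are taken in the same rotational sense, and all names and numerical values are hypothetical. The EMF is placed first in one mesh and then in the other, and the current in the opposite mesh is compared.

```python
def mesh_currents(v1, v2, z11, z22, z12):
    """Solve the two mesh equations
           z11*i1 - z12*i2 = v1
          -z12*i1 + z22*i2 = v2
    by Cramer's rule and return (i1, i2)."""
    det = z11 * z22 - z12 * z12
    i1 = (v1 * z22 + v2 * z12) / det
    i2 = (v2 * z11 + v1 * z12) / det
    return i1, i2


def reciprocity_residual(v_j, v_i, z11, z22, z12):
    """Return I''_j * V'_j - I'_i * V''_i, which Eq. (113) says is zero."""
    _, i_i_prime = mesh_currents(v_j, 0.0, z11, z22, z12)          # EMF at terminals j
    i_j_double_prime, _ = mesh_currents(0.0, v_i, z11, z22, z12)   # EMF at terminals i
    return i_j_double_prime * v_j - i_i_prime * v_i
```

With, say, `z11 = 110 + 5j`, `z22 = 145 + 7j`, and `z12 = 50`, the residual vanishes to rounding error for any pair of complex EMFs, because the mesh impedance matrix is symmetric.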

D. Scattering and Diffraction of Electromagnetic Waves

Electromagnetic energy is transmitted from generators to antennas or other radiating systems in which oscillating currents are set up over a wide band of frequencies corresponding to the frequencies of the generator. These in turn induce similar currents in surrounding bodies and regions of matter. This interaction has been described earlier in terms of trains of electromagnetic waves traveling outward from the source. When they encounter obstacles, currents are induced in them, and this results in the generation of a secondary electromagnetic field known as the scattered or reradiated field. Where it penetrates the geometrical shadow, it is known as the diffracted field. The nature of the scattered and diffracted field depends on the electrical properties, shape, and orientation of the scattering obstacle relative to the incident field from the distant antenna. The actual calculation of the scattered field and the induced currents that generate it is carried out by solving Maxwell’s equations with the associated boundary conditions. A few of the problems are analytically solvable, among them being that of the scattering and diffraction by a conducting or totally absorbing (black) half-plane. This is one of the most important applications of electromagnetics in frequencies ranging from the radio to the optical range.

The problem is most easily solved by considering the reflection of plane electromagnetic waves from a perfectly conducting, infinite half-plane. An important theorem of electromagnetics is used to replace the half-plane by an equivalent source called the virtual source. The total field is then due to the actual source and the virtual source. The theorem of images determines the virtual source. If a conductor is placed above a perfectly conducting plane, the plane can be replaced by an identical image conductor arranged to be the exact geometrical image of the first conductor except in one respect. All the currents and charges, while the same in magnitude at image points in the image conductor, are opposite in direction and sign, respectively (Fig. 5). The resultant scalar and vector potentials at any point P are the sum of the potentials due to the actual conductor and its image. Since the conducting plane is at z = 0, the tangential electric field due to the actual and image conductors together should vanish at z = 0. The theorem of images permits the substitution of a relatively simple problem involving a conductor and its image in space for a rather difficult one involving a single conductor over a perfectly conducting plane.
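The idea behind the theorem of images can be sketched in its simplest static form. The example below is an illustration only, assuming a single point charge q at height d above the plane z = 0 (not the general time-varying conductor discussed in the text): the plane is replaced by an image charge −q at depth d, and the summed scalar potential is evaluated. The function name and the sample values are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def potential_with_image(q, d, x, y, z):
    """Scalar potential at (x, y, z) of a charge q at (0, 0, d) plus its image -q at (0, 0, -d)."""
    r_actual = math.sqrt(x * x + y * y + (z - d) ** 2)
    r_image = math.sqrt(x * x + y * y + (z + d) ** 2)
    return q / (4 * math.pi * EPS0) * (1.0 / r_actual - 1.0 / r_image)
```

Everywhere on the plane z = 0 the two distances are equal, so the potential is zero there, which is the boundary condition a perfectly conducting plane at zero potential imposes. Above the plane the sum reproduces the field of the charge in the presence of the conductor.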

The virtual source for the scattering problem is obtained from the theorem of images, and the complete field is then the incident field together with the reflected field, which is the field of the virtual source. The scattered field may be calculated from Maxwell’s equations and appropriate boundary conditions. When the geometry of the scattering obstacle changes, similar methods may be employed to determine the field. The scattering of the electromagnetic field from conducting cylinders and spheres can be solved analytically.

FIGURE 5 Conductor with image.

In all the applications of electromagnetics, experimental studies have to be carried out to verify theoretical conclusions. The experiment is often simplified if models of convenient size are used to simulate the system under study. The scale model may be much smaller or much larger than the actual system and of a size suitable for laboratory experiments. There is a relationship between the size of the scale model and the actual system. The determination of this relationship with the use of Maxwell’s equations is called electrodynamical similitude. This theory of models provides the relationship between lengths, frequencies, conductivities, permittivities, and permeabilities in the system under study and the scale model.
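As a rough illustration of such a relationship, the sketch below uses one commonly quoted set of scaling rules, which is an assumption here and is not stated in the text: for a model whose lengths are reduced by a factor n, the frequency and the conductivity are both multiplied by n while the permittivity and permeability are left unchanged. The check uses the skin depth of a good conductor, which should shrink by the same factor n as every other length. All names and values are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m


def scale_model(length_m, frequency_hz, conductivity_s_per_m, n):
    """Return model parameters for a 1/n geometric scale model (assumed rules)."""
    return {
        "length_m": length_m / n,
        "frequency_hz": frequency_hz * n,
        "conductivity_s_per_m": conductivity_s_per_m * n,
    }


def skin_depth(frequency_hz, conductivity_s_per_m, mu=MU0):
    """Skin depth of a good conductor: sqrt(2 / (omega * mu * sigma))."""
    omega = 2 * math.pi * frequency_hz
    return math.sqrt(2.0 / (omega * mu * conductivity_s_per_m))
```

Under these assumed rules, both the length measured in wavelengths and the ratio of skin depth to length are the same in the model and in the full-size system.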

V. RECENT DEVELOPMENTS

The understanding of electromagnetics has been much enhanced by the use of computer simulation and optimization. Extensive software has become available for both personal and mainframe computers. Typical of such software are interactive graphics programs which show coordinate transformations, fields, and operators as well as electromagnetic wave motion. It is now possible to visualize the electromagnetic field and its behavior. Video cassettes have also become available which give a further appreciation of the electromagnetic field.

Computational methods for the solution of electromagnetic problems have also attained greater sophistication, and these new numerical techniques have led to more accurate solutions. New computer-aided methods of numerical modeling are being developed, and the study of electromagnetics has taken on a new dimension.

SEE ALSO THE FOLLOWING ARTICLES

ELECTRODYNAMICS, QUANTUM • ELECTROMAGNETIC COMPATIBILITY • FERROMAGNETISM • GEOMAGNETISM • MAGNETIC MATERIALS • RADIO PROPAGATION • SOLAR SYSTEM, MAGNETIC AND ELECTRIC FIELDS • WAVE PHENOMENA



Encyclopedia of Physical Science and Technology EN007F-300 June 30, 2001 16:44

Gravitational Wave Physics
Kostas D. Kokkotas
Aristotle University of Thessaloniki

I. Introduction
II. Theory of Gravitational Waves
III. Detection of Gravitational Radiation
IV. Astronomical Sources of Gravitational Waves

GLOSSARY

Black hole A region of spacetime in which there is such an immense concentration of material within a small volume that spacetime curves over on itself and the escape velocity from it exceeds the speed of light.

Neutron star A very compact “dead” star, whose degenerate material is composed almost entirely of neutrons.

Parsec (pc) The distance at which a body would have an annual parallax of 1 second of arc. It equals 3.26 light-years or 3.084 × 10^13 km.

Pulsar A rotating neutron star, with an off-axis magnetic field, that emits regular pulses of radiation.

Supernova A stellar outburst during which a star (close to the end of its life) suddenly increases in brightness roughly 1 million times.

Virgo cluster A vast cluster of thousands of galaxies, of which 2500 are fairly bright. The average distance from earth is about 16 Mpc. So called because its center appears to lie in the constellation Virgo.

GRAVITATIONAL WAVES are propagating fluctuations of gravitational fields, that is, “ripples” in spacetime, generated mainly by moving massive bodies. These distortions of spacetime travel with the speed of light. Every body in the path of such a wave feels a tidal gravitational force that acts perpendicular to the wave’s direction of propagation; these forces change the distance between given points, and the size of the change is proportional to the distance between the points. Gravitational waves can be detected by devices which measure the induced length changes. The frequencies and the amplitudes of the waves are related to the motion of the masses involved. Thus, the analysis of gravitational waveforms allows us to learn about their source and, if there are more than two detectors involved in observation, to estimate the distance and position of their source in the sky.

I. INTRODUCTION

Einstein first postulated the existence of gravitational waves in 1916 as a consequence of his theory of general relativity, but no direct detection of such waves has been made yet. The best evidence thus far for their existence is due to the work of 1993 Nobel laureates Joseph Taylor and Russell Hulse. They observed in 1974 two neutron stars orbiting faster and faster around each other, exactly what would be expected if the binary neutron star was losing energy in the form of emitted gravitational waves. The predicted rate of orbital acceleration caused by gravitational radiation emission according to general relativity was verified observationally, with high precision.

Cosmic gravitational waves, upon arriving on earth, are much weaker than the corresponding electromagnetic waves. The reason is that strong gravitational waves are emitted by very massive compact sources undergoing very violent dynamics. These kinds of sources are not very common, and so the corresponding gravitational waves come from large astronomical distances. On the other hand, the waves thus produced propagate essentially unscathed through space, without being scattered or absorbed by intervening matter.

A. Why Are Gravitational Waves Interesting?

Detection of gravitational waves is important for two reasons: First, their detection is expected to open up a new window for observational astronomy since the information they carry is very different from that carried by electromagnetic waves. This new window onto the universe will complement our view of the cosmos and will help us unveil the fabric of spacetime around black holes, observe directly the formation of black holes or the merging of binary systems consisting of black holes or neutron stars, search for rapidly spinning neutron stars, dig deep into the very early moments of the origin of the universe, and look at the very center of the galaxies where supermassive black holes weighing millions of solar masses are hidden. These are only some of the great scientific discoveries scientists can expect to make during the early years of the 21st century. Second, detecting gravitational waves is important for our understanding of the fundamental laws of physics; the proof that gravitational waves exist will verify a fundamental 85-year-old prediction of general relativity. Also, by comparing the arrival times of light and gravitational waves from, e.g., supernovae, Einstein’s prediction that light and gravitational waves travel at the same speed could be checked. Finally, we could verify that they have the polarization predicted by general relativity.

B. How Will We Detect Them?

Up to now, the only indication of the existence of gravitational waves is the indirect evidence that the orbital energy in the Hulse–Taylor binary pulsar is drained away at a rate consistent with the prediction of general relativity. The gravitational wave is a signal, the shape of which depends upon the changes in the gravitational field of its source. As mentioned earlier, any body in the path of the wave will feel an oscillating tidal gravitational force that acts in a plane perpendicular to the wave’s direction of propagation. This means that a group of freely moving masses placed on a plane perpendicular to the direction of propagation of the wave will oscillate as long as the wave passes through them, and the distance between them will vary as a function of time, as in Fig. 1. Thus, the detection of gravitational waves can be accomplished by monitoring the tiny changes in the distance between freely moving test masses. These changes are extremely small; for example, when the Hulse–Taylor binary system finally merges, the strong gravitational wave signal that will be emitted will change the distance between two particles on earth that are 1 km apart by much less than the diameter of the atomic nucleus! To measure such motions of macroscopic objects is a tremendous challenge for experimentalists. As early as the mid-1960s, Joseph Weber designed and constructed heavy metal bars, seismically isolated, to which a set of piezoelectric strain transducers were bonded in such a way that they could detect vibrations of the bar if it had been excited by a gravitational wave. Today, there are a number of such apparatuses operating around the world which have achieved unprecedented sensitivities, but they still are not sensitive enough to detect gravitational waves. Another form of gravitational wave detector that is more promising uses laser beams to measure the distance between two well-separated masses. Such devices are basically kilometer-sized laser interferometers consisting of three masses placed in an L-shaped configuration. The laser beams are reflected back and forth between the mirrors attached to the three masses, the mirrors lying several kilometers away from each other. A gravitational wave passing by will cause the lengths of the two arms to oscillate with time. When one arm contracts, the other expands, and this pattern alternates. The result is that the interference pattern of the two laser beams changes with time. With this technique, higher sensitivities could be achieved than are possible with the bar detectors. It is expected that laser interferometric detectors are the ones that will provide us with the first direct detection of gravitational waves.
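The size of the length changes involved can be estimated with a short calculation. The sketch below assumes the commonly used convention that a wave of dimensionless amplitude h changes a separation L by about ΔL = hL/2; the amplitude 10^-21 and the nuclear diameter of roughly 10^-15 m are illustrative round numbers chosen here, not values given in the text.

```python
def length_change(strain, separation_m):
    """Approximate change in separation, delta_L = h * L / 2 (assumed convention)."""
    return 0.5 * strain * separation_m


ASSUMED_STRAIN = 1e-21              # illustrative wave amplitude h
ASSUMED_NUCLEUS_DIAMETER_M = 1e-15  # rough order of magnitude

delta_l = length_change(ASSUMED_STRAIN, 1000.0)  # two particles 1 km apart
ratio = delta_l / ASSUMED_NUCLEUS_DIAMETER_M     # fraction of a nuclear diameter
```

Because ΔL is proportional to L, lengthening the separation from 1 km to several kilometers raises the displacement in direct proportion, which is one reason long interferometer arms are attractive.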

II. THEORY OF GRAVITATIONAL WAVES

Newton’s theory of gravity has enjoyed great success in describing many aspects of our everyday life and additionally explains most of the motions of celestial bodies in the universe. General relativity corrected Newton’s theory and is recognized as one of the most ingenious creations of the human mind. The laws of general relativity, though, in the case of slowly moving bodies and weak gravitational fields reduce to the standard laws of Newtonian theory. Nevertheless, general relativity is conceptually different from Newton’s theory as it introduces the notion of spacetime and its geometry. One of the basic differences of the two theories concerns the speed of propagation of any change in a gravitational field. As the apple falls from the tree, we have a rearrangement of the distribution of mass of the earth, the gravitational field changes, and a distant observer with a high-precision instrument will detect this change. According to Newton, the changes of the field are instantaneous, i.e., they propagate with infinite speed; if this were true, however, the principle of causality would break down. No information can travel faster than the speed of light. In Einstein’s theory there is no such ambiguity; the information of the varying gravitational field propagates with finite speed, the speed of light, as a ripple in the fabric of spacetime. These are the gravitational waves. The existence of gravitational waves is an immediate consequence of any relativistic theory of gravity. However, the strength and the form of the waves depend on the details of the gravitational theory. This means that the detection of gravitational waves will also serve as a test of basic gravitational theory.

FIGURE 1 The effects of a gravitational wave traveling perpendicular to the plane of a circular ring of particles, sketched as a series of snapshots. The deformations due to the two polarizations are shown.

The fundamental geometrical framework of relativistic metric theories of gravity is spacetime, which mathematically can be described as a four-dimensional manifold whose points are called events. Every event is labeled by four coordinates $x^\mu$ (μ = 0, 1, 2, 3); the three coordinates $x^i$ (i = 1, 2, 3) give the spatial position of the event, while $x^0$ is related to the coordinate time t ($x^0 = ct$, where c is the speed of light, which unless otherwise stated will be set equal to 1). The choice of the coordinate system is quite arbitrary, and coordinate transformations of the form $x'^\mu = f^\mu(x^\lambda)$ are allowed. The motion of a test particle is described by a curve in spacetime. The distance ds between two neighboring events, one with coordinates $x^\mu$ and the other with coordinates $x^\mu + dx^\mu$, can be expressed as a function of the coordinates via a symmetric tensor $g_{\mu\nu}(x^\lambda) = g_{\nu\mu}(x^\lambda)$, i.e.,

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu. \tag{1}$$

This is a generalization of the standard measure of distance between two points in Euclidean space. For the Minkowski spacetime (the spacetime of special relativity), $g_{\mu\nu} \equiv \eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$. The symmetric tensor $g_{\mu\nu}$ is called the metric tensor or simply the metric of the spacetime. In general relativity the gravitational field is described by the metric tensor alone, but in many other theories one or more supplementary fields may be needed as well. In what follows, we will consider only the general relativistic description of gravitational fields since most of the alternative theories fail to pass the experimental tests.
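Equation (1) with the Minkowski metric can be evaluated directly. The following is a minimal sketch in units with c = 1; the helper name `interval_squared` and the sample displacements are illustrative.

```python
# Minkowski metric eta = diag(-1, 1, 1, 1), in units with c = 1.
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def interval_squared(metric, dx):
    """Evaluate ds^2 = g_{mu nu} dx^mu dx^nu for a displacement dx = (dt, dx, dy, dz)."""
    return sum(metric[m][n] * dx[m] * dx[n] for m in range(4) for n in range(4))


light_ray = interval_squared(ETA, (1, 1, 0, 0))    # null separation
clock_tick = interval_squared(ETA, (1, 0, 0, 0))   # timelike separation
ruler = interval_squared(ETA, (0, 3, 4, 0))        # spacelike separation
```

The three sample displacements give ds² = 0, −1, and 25, respectively, showing how the sign of ds² distinguishes null, timelike, and spacelike separations in this signature.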

The information about the degree of curvature (i.e., the deviation from flatness) of a spacetime is encoded in the metric of the spacetime. According to general relativity, any distribution of mass bends the spacetime fabric, and the Riemann tensor $R_{\kappa\lambda\mu\nu}$ (which is a function of the metric tensor $g_{\mu\nu}$ and of its first and second derivatives) is a measure of the spacetime curvature. The Riemann tensor has 20 independent components. When it vanishes, the corresponding spacetime is flat.
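The count of 20 independent components follows from the standard result that the Riemann tensor of an n-dimensional space has n²(n² − 1)/12 algebraically independent components once its symmetries are taken into account. A one-line check (the function name is illustrative):

```python
def riemann_independent_components(n):
    """Number of algebraically independent Riemann components in n dimensions."""
    return n * n * (n * n - 1) // 12
```

For n = 4 this gives 20, in agreement with the text; in two dimensions a single component remains.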

In the following presentation, we will consider mass distributions, which we will describe by the stress-energy tensor $T^{\mu\nu}(x^\lambda)$. For a perfect fluid (a fluid or gas with isotropic pressure but without viscosity or shear stresses) the stress-energy tensor is given by the following expression:

$$T^{\mu\nu}(x^\lambda) = (\rho + p)\,u^\mu u^\nu + p\,g^{\mu\nu}, \tag{2}$$

where $p(x^\lambda)$ is the local pressure, $\rho(x^\lambda)$ is the local energy density, and $u^\mu(x^\lambda)$ is the four-velocity of the infinitesimal fluid element characterized by the event $x^\lambda$.
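Equation (2) can be evaluated explicitly in flat spacetime. The sketch below assumes the Minkowski metric diag(−1, 1, 1, 1), units with c = 1, and a fluid moving along the x axis with speed v; the function name and arguments are illustrative.

```python
import math

# Inverse Minkowski metric; numerically identical to diag(-1, 1, 1, 1).
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def perfect_fluid_stress_energy(rho, p, velocity_x=0.0):
    """T^{mu nu} = (rho + p) u^mu u^nu + p eta^{mu nu} for a fluid moving along x."""
    gamma = 1.0 / math.sqrt(1.0 - velocity_x * velocity_x)
    u = [gamma, gamma * velocity_x, 0.0, 0.0]  # four-velocity, u.u = -1
    return [[(rho + p) * u[m] * u[n] + p * ETA[m][n] for n in range(4)] for m in range(4)]


def trace(t):
    """Contract T^{mu nu} with eta_{mu nu}."""
    return sum(ETA[m][n] * t[m][n] for m in range(4) for n in range(4))
```

For a fluid at rest the result is diag(ρ, p, p, p): the time-time component is the energy density and the spatial diagonal carries the isotropic pressure. The trace equals 3p − ρ for any velocity, as expected for a scalar.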

Einstein’s gravitational field equations connect the curvature tensor (see below) and the stress-energy tensor through the fundamental relation

$$G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = kT_{\mu\nu}. \tag{3}$$

This means that the gravitational field, which is directly connected to the geometry of spacetime, is related to the distribution of matter and radiation in the universe. By solving the field equations, both the gravitational field (the $g_{\mu\nu}$) and the motion of matter are determined. $R_{\mu\nu}$ is the so-called Ricci tensor and comes from a contraction of the Riemann tensor ($R_{\mu\nu} = g^{\sigma\rho}R_{\sigma\mu\rho\nu}$), R is the scalar curvature ($R = g^{\rho\sigma}R_{\rho\sigma}$), while $G_{\mu\nu}$ is the so-called Einstein tensor, $k = 8\pi G/c^4$ is the coupling constant of the theory, and G is the gravitational constant, which, unless otherwise stated, will be considered equal to 1. The vanishing of the Ricci tensor corresponds to a spacetime free of any matter distribution. However, this does not imply that the Riemann tensor is zero. As a consequence, in the empty space far from any matter distribution, the Ricci tensor will vanish, while the Riemann tensor can be nonzero; this means that the effects of a propagating gravitational wave in an empty spacetime will be described via the Riemann tensor.

A. Linearized Theory

Now let us assume that an observer is far away from a given static matter distribution, and the spacetime in which he or she lives is described by a metric $g_{\mu\nu}$. Any change in the matter distribution, i.e., in $T_{\mu\nu}$, will induce a change in the gravitational field, which will be recorded as a change in the metric. The new metric will be

$$\tilde{g}_{\mu\nu} = g_{\mu\nu} + h_{\mu\nu}, \tag{4}$$

where $h_{\mu\nu}$ is a tensor describing the variations induced in the spacetime metric. As we will describe analytically later, this new tensor describes the propagation of ripples in spacetime curvature, i.e., the gravitational waves. In order to calculate the new tensor $h_{\mu\nu}$ we have to solve Einstein’s equations for the varying matter distribution. This is not an easy task in general. However, there is a convenient, yet powerful way to proceed, namely to assume that $h_{\mu\nu}$ is small ($|h_{\mu\nu}| \ll 1$), so that we need only keep terms linear in $h_{\mu\nu}$ in our calculations. In making this approximation we are effectively assuming that the disturbances produced in spacetime are not huge. This linearization approach has proved extremely useful for calculations, and, for weak fields at least, gives accurate results for the generation of the waves and for their propagation.

The first attempt to prove that in general relativity gravitational perturbations propagate as waves with the speed of light is due to Einstein himself. Shortly after the formulation of his theory (the year after), he proved that, assuming linearized perturbations around a flat metric, i.e., $g_{\mu\nu} = \eta_{\mu\nu}$, the tensor

$$\bar{h}_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h^\alpha{}_\alpha \tag{5}$$

is governed by a wave equation which admits plane wave solutions similar to the ones of electromagnetism; here $h_{\mu\nu}$ is the metric perturbation and $\bar{h}_{\mu\nu}$ is the gravitational field (or the “trace reverse” of $h_{\mu\nu}$). Then the linear field equations in vacuum have the form

$$\left(-\frac{\partial^2}{\partial t^2} + \nabla^2\right)\bar{h}^{\mu\nu} \equiv \partial_\lambda\partial^\lambda\bar{h}^{\mu\nu} = 0 \tag{6}$$

($\partial_\mu k_\alpha \equiv \partial k_\alpha/\partial x^\mu$), which is the three-dimensional wave equation. To obtain the above simplified form, the condition $\partial_\mu\bar{h}^{\mu\nu} = 0$, known as Hilbert’s gauge condition (equivalent to the Lorentz gauge condition of electromagnetism), has been assumed. A gauge transformation is a suitable change of coordinates defined by

$$x'^\mu \equiv x^\mu + \xi^\mu, \tag{7}$$

which induces a redefinition of the gravitational field tensor

$$\bar{h}'_{\mu\nu} \equiv \bar{h}_{\mu\nu} - \partial_\mu\xi_\nu - \partial_\nu\xi_\mu + \eta_{\mu\nu}\partial_\lambda\xi^\lambda. \tag{8}$$

It can be easily proved that $\xi^\mu$ must satisfy the condition

$$\partial_\mu\partial^\mu\xi^\alpha = 0, \tag{9}$$

so that the new gravitational field is in agreement with the Hilbert gauge condition. The solution of Eq. (9) defines the four components of $\xi^\mu$ so that the new tensor $\bar{h}'_{\mu\nu}$ is also a solution of the wave equation (6), and thus has the same physical meaning as $\bar{h}_{\mu\nu}$. In general, gauge transformations correspond to symmetries of the field equations, which means that the field equations are invariant under such transformations. This implies that the field equations do not determine the field uniquely; however, this ambiguity in determining the field is devoid of any physical meaning.

The simplest solution to the wave equation (6) is a plane wave solution of the form

$$\bar{h}^{\mu\nu} = A^{\mu\nu}e^{ik_\alpha x^\alpha}, \tag{10}$$

where $A^{\mu\nu}$ is a constant symmetric tensor, the polarization tensor, in which information about the amplitude and the polarization of the waves is encoded, while $k_\alpha$ is a constant vector, the wave vector, which determines the propagation direction of the wave and its frequency. In physical applications we will use only the real part of the above wave solution. By applying the Hilbert gauge condition on the plane wave solution we obtain the relation $A^{\mu\nu}k_\mu = 0$, the geometrical meaning of which is that $A^{\mu\nu}$ and $k_\mu$ are orthogonal. This relation can be written as four equations that impose four conditions on $A^{\mu\nu}$, and this is the first step in reducing the number of its independent components. As a consequence, $A^{\mu\nu}$, instead of having 10 independent components (as has every symmetric second-rank tensor in a four-dimensional space), has only 6 independent ones. Further substitution of the plane wave solution in the wave equation leads to the important equation $k_\mu k^\mu = k_0^2 - (k_x^2 + k_y^2 + k_z^2) = 0$, which means that $k_\mu$ is a lightlike or null vector, i.e., the wave propagates on the light-cone. This means that the speed of the wave is 1, i.e., equal to the speed of light. The frequency of the wave is $\omega = k_0$.
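The null condition and the component counting above can be checked with a few lines. This is a minimal sketch in units with c = 1; the function names are illustrative, and the Minkowski square is written with the signature (−, +, +, +), which differs from the expression in the text only by an overall sign and therefore vanishes in the same cases.

```python
import math


def null_wave_vector(kx, ky, kz):
    """Return (k0, kx, ky, kz) with k0 = |k|, so that the wave travels at speed 1."""
    k0 = math.sqrt(kx * kx + ky * ky + kz * kz)
    return (k0, kx, ky, kz)


def minkowski_square(k):
    """k_mu k^mu with signature (-, +, +, +); zero for a null vector."""
    return -k[0] ** 2 + k[1] ** 2 + k[2] ** 2 + k[3] ** 2


def symmetric_components(n):
    """Independent components of a symmetric second-rank tensor in n dimensions."""
    return n * (n + 1) // 2


# 10 components, minus 4 gauge-condition equations, leaves 6.
after_hilbert_gauge = symmetric_components(4) - 4
```

The phase speed is ω/|k| = k0/|k| = 1 by construction, which is the statement that the wave propagates at the speed of light.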

Up to this point it has been shown that $A^{\mu\nu}$ has six arbitrary components, but due to the gauge freedom, i.e., the freedom in choosing the four components of the vector $\xi^\mu$, the actual number of its independent components can be reduced to two in a suitably chosen gauge. The transverse-traceless or TT gauge is an example of such a gauge. In this gauge, only the spatial components of $h_{\mu\nu}$ are nonzero (hence $h_{\mu 0} = 0$), which means that the wave is transverse to its own direction of propagation, and, additionally, the sum of the diagonal components is zero ($h^\mu{}_\mu = h^0{}_0 + h^1{}_1 + h^2{}_2 + h^3{}_3 \equiv h = 0$) (traceless). Due to this last property and Eq. (5), in this gauge there is no difference between $h_{\mu\nu}$ (the perturbation of the metric) and the field defined in Eq. (5) (the gravitational field). It is customary to write the gravitational wave solution in the TT gauge as $h^{\rm TT}_{\mu\nu}$.

That $A^{\mu\nu}$ has only two independent components means that a gravitational wave is completely described by two dimensionless amplitudes, $h_+$ and $h_\times$, say. If, for example, we assume a wave propagating along the z direction, then the amplitude $A^{\mu\nu}$ can be written as

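$$A^{\mu\nu} = h_+\,\varepsilon^{\mu\nu}_+ + h_\times\,\varepsilon^{\mu\nu}_\times, \tag{11}$$

where $\varepsilon^{\mu\nu}_+$ and $\varepsilon^{\mu\nu}_\times$ are the so-called unit polarization tensors defined by

$$\varepsilon^{\mu\nu}_+ \equiv \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad \varepsilon^{\mu\nu}_\times \equiv \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{12}$$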
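The properties claimed for the polarization tensors can be checked directly. The following is a minimal sketch, assuming the component order (t, x, y, z), the signature (+, −, −, −) implied by the null condition above, and helper names that are purely illustrative.

```python
# Sketch: verify properties of the unit polarization tensors of Eq. (12).
# Assumptions (not from the text): component order (t, x, y, z),
# signature (+, -, -, -), and all helper names.

EPS_PLUS = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, 0],
]
EPS_CROSS = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
METRIC_DIAG = [1, -1, -1, -1]  # diagonal of the flat metric


def lower_index(vector):
    """Lower the index of a four-vector with the flat metric."""
    return [METRIC_DIAG[i] * vector[i] for i in range(4)]


def contract_first_index(tensor, covector):
    """Return A^{mu nu} k_mu as a list over nu."""
    return [sum(tensor[mu][nu] * covector[mu] for mu in range(4))
            for nu in range(4)]


def trace(tensor):
    """Return eta_{mu nu} A^{mu nu}."""
    return sum(METRIC_DIAG[i] * tensor[i][i] for i in range(4))


def double_contraction(a, b):
    """Sum of a[mu][nu] * b[mu][nu]; metric signs cancel for spatial tensors."""
    return sum(a[m][n] * b[m][n] for m in range(4) for n in range(4))


omega = 1.0
k_upper = [omega, 0.0, 0.0, omega]  # wave vector for propagation along z
k_lower = lower_index(k_upper)
null_norm = sum(k_lower[i] * k_upper[i] for i in range(4))

print("k is null:", null_norm == 0.0)
print("transverse:", contract_first_index(EPS_PLUS, k_lower),
      contract_first_index(EPS_CROSS, k_lower))
print("traces:", trace(EPS_PLUS), trace(EPS_CROSS))
print("plus . cross:", double_contraction(EPS_PLUS, EPS_CROSS))
```

Both tensors are transverse to the wave vector and traceless, and their mutual contraction vanishes, which is the sense in which the two polarization states are independent.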

As mentioned earlier, the Riemann tensor is a measure of the curvature of spacetime. A gravitational wave propagating in a flat spacetime generates periodic distortions which can be described in terms of the Riemann tensor. In linearized theory the Riemann tensor takes the following gauge-independent form

$$R_{\kappa\lambda\mu\nu} = \frac{1}{2}\left(\partial_{\nu\kappa}h_{\lambda\mu} + \partial_{\lambda\mu}h_{\kappa\nu} - \partial_{\kappa\mu}h_{\lambda\nu} - \partial_{\lambda\nu}h_{\kappa\mu}\right), \tag{13}$$

which is considerably simplified by choosing the TT gauge:

$$R^{\rm TT}_{j0k0} = -\frac{1}{2}\frac{\partial^2}{\partial t^2}h^{\rm TT}_{jk}, \qquad j, k = 1, 2, 3. \tag{14}$$

Furthermore, in the Newtonian limit

$$R^{\rm TT}_{j0k0} \approx \frac{\partial^2\Phi}{\partial x^j\,\partial x^k}, \tag{15}$$

where $\Phi$ describes the gravitational potential in Newtonian theory. Earlier we defined the Riemann tensor as a geometrical object, but this tensor has a simple physical interpretation: it is the tidal force field and describes the relative acceleration between two particles in free fall. If we assume two particles moving freely along geodesics of a curved spacetime with coordinates $x^\mu(\tau)$ and $x^\mu(\tau) + \xi^\mu(\tau)$ [for a given value of the proper time $\tau$, $\xi^\mu(\tau)$ is the displacement vector connecting the two events], it can be shown that, in the case of slowly moving particles,

$$\frac{d^2\xi^k}{dt^2} \approx -R^{k\,{\rm TT}}{}_{0j0}\,\xi^j. \tag{16}$$

This is a simplified form of the equation of geodesic deviation. Hence, the tidal force acting on a particle is

$$f^k \approx -m\,R^k{}_{0j0}\,\xi^j, \tag{17}$$

where m is the mass of the particle. Equation (17) corresponds to the standard Newtonian relation for the tidal force acting on a particle in a field $\Phi$.


Keeping this in mind, we will try to visualize the effect of a gravitational wave. Let us first consider two freely falling particles hit by a gravitational wave traveling along the z direction with the (+) polarization present only, i.e.,

$$h^{\mu\nu} = h_+\,\varepsilon^{\mu\nu}_+ \cos[\omega(t - z)]. \tag{18}$$

Then, the measured distance $\xi^x$ between the two particles, originally at a distance $\xi^x_0$ along the x direction, will be

$$\xi^x = \left[1 - \frac{1}{2}h_+\cos[\omega(t - z)]\right]\xi^x_0$$

or

$$\delta\xi^x = \xi^x - \xi^x_0 = -\frac{1}{2}h_+\cos[\omega(t - z)]\,\xi^x_0, \tag{19}$$

which implies that the relative distance $\delta\xi^x$ between the two particles will oscillate with frequency $\omega$. This does not mean that the particles’ coordinate positions change; they remain at rest relative to the coordinates, but the measured distance between them oscillates. If the particles were placed originally along the y direction, the measured distance would oscillate according to

$$\xi^y = \left[1 + \frac{1}{2}h_+\cos[\omega(t - z)]\right]\xi^y_0$$

or

$$\delta\xi^y = \xi^y - \xi^y_0 = \frac{1}{2}h_+\cos[\omega(t - z)]\,\xi^y_0. \tag{20}$$

FIGURE 2 The tidal field lines of force for a gravitational wave with polarization (+) (left panel) and (×) (right panel). The orientation of the field lines changes every half period, producing the deformations seen in Fig. 1. Any point accelerates in the direction of the arrows, and the denser the lines, the stronger the acceleration. Since the acceleration is proportional to the distance from the center of mass, the force lines get denser as one moves away from the origin. For the polarization (×) the force lines undergo a 45° rotation.

In other words, the measured distances along the two axes oscillate out of phase, that is, when the distance between two particles along the x direction is maximum, the distance of two other particles along the y direction is minimum, and after half a period, it is the other way around. The effects are similar for the other polarization, where the axes along which the oscillations are out of phase are at an angle of 45° with respect to the first ones. This is visualized in Fig. 1, where the effect of a passing gravitational wave on a ring of particles is shown as a series of snapshots closely separated in time.
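The out-of-phase stretching described by Eqs. (19) and (20) can be sketched numerically for a ring of test particles. This is only an illustration: the function name, the number of particles, and the strongly exaggerated amplitude are assumptions chosen for visibility, and the two equations are applied component by component to each particle's separation from the center of the ring.

```python
import math


def displaced_ring(h_plus, phase, n_particles=8, radius=1.0):
    """Ring positions in the x-y plane under a (+) wave, from Eqs. (19)-(20).

    `phase` stands for omega * (t - z); `h_plus` is the wave amplitude.
    """
    stretch = 0.5 * h_plus * math.cos(phase)
    points = []
    for i in range(n_particles):
        angle = 2.0 * math.pi * i / n_particles
        x0 = radius * math.cos(angle)
        y0 = radius * math.sin(angle)
        # Eq. (19) shortens x separations while Eq. (20) lengthens y ones.
        points.append(((1.0 - stretch) * x0, (1.0 + stretch) * y0))
    return points


# Exaggerated amplitude (realistic values are of order 1e-22).
ring_start = displaced_ring(0.2, 0.0, n_particles=4)
ring_half_period = displaced_ring(0.2, math.pi, n_particles=4)

print("phase 0 :", ring_start[0], ring_start[1])
print("phase pi:", ring_half_period[0], ring_half_period[1])
```

At phase 0 the ring is squeezed along x and stretched along y; half a period later the roles are exchanged, which is the sequence of snapshots shown for the (+) polarization.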

Another way of understanding the effects of gravitational waves is to study the tidal force field lines. In the TT gauge the equation of the geodesic deviation (16) takes the simple form

$$\frac{d^2\xi^k}{dt^2} \approx \frac{1}{2}\frac{d^2h^{\rm TT}_{jk}}{dt^2}\,\xi^j \tag{21}$$

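and the corresponding tidal force is

$$f^k \approx \frac{m}{2}\frac{d^2h^{\rm TT}_{jk}}{dt^2}\,\xi^j. \tag{22}$$

For the wave given by Eq. (18) the two nonzero components of the tidal force are

$$f^x \approx \frac{m}{2}h_+\omega^2\cos[\omega(t - z)]\,\xi^x_0, \qquad f^y \approx -\frac{m}{2}h_+\omega^2\cos[\omega(t - z)]\,\xi^y_0. \tag{23}$$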

It can be easily proved that the divergence of the tidal force is zero ($\partial f^x/\partial\xi^x + \partial f^y/\partial\xi^y = 0$). It can therefore be represented graphically by field lines as in Fig. 2.

Let us now return to the two polarization states represented by the two matrices $\varepsilon^{\mu\nu}_+$ and $\varepsilon^{\mu\nu}_\times$. It is impossible to construct the (+) pattern from the (×) pattern and vice versa; they are orthogonal polarization states. By analogy with electromagnetic waves, the two polarizations could be added with phase difference (±π/2) to obtain circularly polarized waves. The effect of circularly polarized waves on a ring of particles is to deform the ring into a rotating ellipse with either positive or negative helicity. The particles themselves do not rotate; they only oscillate in and out around their initial positions. These circularly polarized waves carry angular momentum, the amount of which is (2/ω) times the energy carried by the wave. If we consider the gravitational field, then, according to quantum field theory, the waves are associated with fundamental particles responsible for the gravitational interaction, and a quantum of the field will have energy ħω and consequently spin 2ħ. This means that the quanta of the gravitational field, the gravitons, are spin-2, massless particles (since they travel with the speed of light). Another way of explaining why a graviton should be a spin-2 particle comes from observing Figure 1. One can see that a gravitational wave is invariant under rotations of 180° about its direction of propagation; the pattern repeats itself after half a period. For comparison, electromagnetic waves are invariant under rotations of 360°. In the quantum mechanical description of massless particles, the wavefunction of a particle is invariant under rotations of 360°/s, where s is the spin of the particle. Thus the photon is a spin-1 particle and the graviton is a spin-2 particle. In other relativistic theories of gravity, the wave field has other symmetries and therefore these theories attribute different spins to the gravitons.

B. Properties of Gravitational Waves

Gravitational waves, once they are generated, propagate almost unimpeded. Indeed, it has been proven that they are even harder to stop than neutrinos! The only significant changes they suffer as they propagate are a decrease in amplitude as they travel away from their source and a redshift (cosmological, gravitational, or Doppler), as is the case for electromagnetic waves.

There are other effects that marginally influence the gravitational waveforms, for instance, absorption by interstellar or intergalactic matter intervening between the observer and the source, which is extremely weak (actually, the extremely weak coupling of gravitational waves with matter is the main reason that gravitational waves have not been observed). Scattering and dispersion of gravitational waves are also practically unimportant, although they may have been important during the early phases of the universe (this is also true for absorption). Gravitational waves can be focused by strong gravitational fields and also can be diffracted, exactly as happens with electromagnetic waves.

There are also a number of “exotic” effects that gravitational waves can experience due to the nonlinear nature of Einstein’s equations (purely general-relativistic effects), such as scattering by the background curvature, the existence of tails of the waves that interact with the waves themselves, parametric amplification by the background curvature, nonlinear coupling of waves with themselves (creation of geons, that is, bundles of gravitational waves held together by their own self-generated curvature), and even formation of singularities by colliding waves (for such exotic phenomena see the extensive review by Thorne, 1987). These aspects of nonlinearity affect the majority of gravitational wave sources, and from this point of view our understanding of gravitational wave generation is based on approximations. However, the error in these approximations for most processes that generate gravitational waves is expected to be quite small. Powerful numerical codes, using state-of-the-art computer software and hardware, have been developed (and continue to be developed) for minimizing all possible sources of error in order to have as accurate as possible an understanding of the processes that generate gravitational waves and the waveforms produced.

For most of the properties mentioned above there is a correspondence with electromagnetic waves. Gravitational waves are fundamentally different, however, even though they share similar wave properties away from the source. Gravitational waves are emitted by coherent bulk motions of matter (for example, by the implosion of the core of a star during a supernova explosion) or by coherent oscillations of spacetime curvature, and thus they serve as a probe of such phenomena. By contrast, cosmic electromagnetic waves are mainly the result of incoherent radiation by individual atoms or charged particles. As a consequence, from the cosmic electromagnetic radiation we mainly learn about the form of matter in various regions of the universe, especially about its temperature and density, or about the existence of magnetic fields. Strong gravitational waves are emitted from regions of spacetime where gravity is very strong and the velocities of the bulk motions of matter are near the speed of light. Since most of the time these areas are either surrounded by thick layers of matter that absorb electromagnetic radiation or do not emit any electromagnetic radiation at all (black holes), the only way to study these regions of the universe is via gravitational waves.

C. Energy Flux Carried by Gravitational Waves

Gravitational waves carry energy and cause a deformation of spacetime. The stress-energy carried by gravitational waves cannot be localized within a wavelength. Instead, one can say that a certain amount of stress-energy is contained in a region of space which extends over several wavelengths. It can be proven that in the TT gauge of linearized theory the stress-energy tensor of a gravitational wave (in analogy with the stress-energy tensor of a perfect fluid as defined earlier) is given by


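$$t^{\rm GW}_{\mu\nu} = \frac{1}{32\pi}\left\langle\left(\partial_\mu h^{\rm TT}_{ij}\right)\left(\partial_\nu h^{\rm TT}_{ij}\right)\right\rangle, \tag{24}$$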

where the angular brackets are used to indicate averaging over several wavelengths. For the special case of a plane wave propagating in the z direction which we considered earlier, the stress-energy tensor has only three nonzero components, which take the simple form

$$t^{\rm GW}_{00} = \frac{t^{\rm GW}_{zz}}{c^2} = -\frac{t^{\rm GW}_{0z}}{c} = \frac{1}{32\pi}\frac{c^2}{G}\,\omega^2\left(h_+^2 + h_\times^2\right), \tag{25}$$

where $t^{\rm GW}_{00}$ is the energy density, $t^{\rm GW}_{zz}$ is the momentum flux, and $t^{\rm GW}_{0z}$ is the energy flow along the z direction per unit area and unit time (for practical reasons we have restored the normal units). The energy flux has all the properties one would anticipate by analogy with electromagnetic waves: (a) it is conserved (the amplitude dies out as $1/r$, the flux as $1/r^2$), (b) it can be absorbed by detectors, and (c) it can generate curvature like any other energy source in Einstein’s formulation of relativity. As an example, by using the above relation, we will estimate the energy flux in gravitational waves from the collapse of the core of a supernova to create a 10-solar-mass black hole at a distance of 50 million light-years (15 Mpc) from the earth (at the distance of the Virgo cluster of galaxies). A conservative estimate of the amplitude of the waves on earth (as we will show later) is of the order of $10^{-22}$ (at a frequency of about 1 kHz). This corresponds to a flux of about 3 ergs/cm² sec. This is an enormous amount of energy flux and is about 10 orders of magnitude larger than the observed energy flux in electromagnetic waves! The basic difference is the duration of the two signals; a gravitational wave signal will last a few milliseconds, whereas an electromagnetic signal lasts many days. This example provides a useful numerical formula for the energy flux:

$$F = 3\left(\frac{f}{1\ {\rm kHz}}\right)^2\left(\frac{h}{10^{-22}}\right)^2\ \frac{\rm ergs}{{\rm cm}^2\ {\rm sec}}, \tag{26}$$

from which one can easily estimate the flux on earth, given the amplitude (on earth) and the frequency of the waves.
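The coefficient in Eq. (26) can be cross-checked against Eq. (25). The sketch below takes the energy flux to be c times the energy density of Eq. (25), and assumes an amplitude of 10⁻²² in each of the two polarizations; that choice, the rounded constants, and the function name are assumptions made for illustration.

```python
import math

G_CGS = 6.674e-8   # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10   # speed of light, cm s^-1


def gw_energy_flux(frequency_hz, h_plus, h_cross):
    """Energy flux c * t00 from Eq. (25), in erg cm^-2 s^-1."""
    omega = 2.0 * math.pi * frequency_hz
    prefactor = C_CGS ** 3 / (32.0 * math.pi * G_CGS)
    return prefactor * omega ** 2 * (h_plus ** 2 + h_cross ** 2)


# Assumed: both polarizations present with amplitude 1e-22 at 1 kHz.
flux = gw_energy_flux(1.0e3, 1.0e-22, 1.0e-22)
print(f"flux = {flux:.2f} erg / (cm^2 s)")
```

Under these assumptions the result is close to the 3 ergs/cm² sec quoted in the text, and the quadratic scaling with frequency and amplitude in Eq. (26) follows directly from the form of Eq. (25).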

D. Generation of Gravitational Waves

As mentioned earlier, when the gravitational field is strong there are a number of nonlinear effects on the generation and propagation of gravitational waves. For example, nonlinear effects are significant during the last phases of black hole formation. The analytic description of such a dynamically changing spacetime is impossible, and until numerical relativity provides us with accurate estimates of the dynamics of gravitational fields under such extreme conditions we have to be content with order-of-magnitude estimates. Furthermore, there are differences in the predictions of various relativistic theories of gravity in the case of high concentrations of rapidly varying energy distributions. However, all metric theories of gravity, as long as they admit the correct Newtonian limit, make similar predictions for the total amount of gravitational radiation emitted by “weak” gravitational wave sources, that is, sources where the energy content is small enough to produce only small deformations of the flat spacetime and where all motions are slow compared to the velocity of light.

Let us now try to understand the nature of gravitational radiation by starting from the production of electromagnetic radiation. Electromagnetic radiation emitted by slowly varying charge distributions can be decomposed into a series of multipoles, where the amplitude of the $2^\ell$-pole ($\ell = 0, 1, 2, \ldots$) contains a small factor $a^\ell$, with $a$ equal to the ratio of the diameter of the source to the typical wavelength, namely, a number typically much smaller than 1. From this point of view the strongest electromagnetic radiation would be expected for monopolar radiation ($\ell = 0$), but this is completely absent because the electromagnetic monopole moment is proportional to the total charge, which does not change with time (it is a conserved quantity). Therefore, electromagnetic radiation consists only of $\ell \geq 1$ multipoles, the strongest being the electric dipole radiation ($\ell = 1$), followed by the weaker magnetic dipole and electric quadrupole radiation ($\ell = 2$). One could proceed with a similar analysis for gravitational waves and by following the same arguments show that mass conservation (which plays the role of charge conservation in electromagnetic theory) will exclude monopole radiation. Also, the rate of change of the mass dipole moment is proportional to the linear momentum of the system, which is a conserved quantity, and therefore there cannot be any mass dipole radiation in Einstein’s relativity theory. The next strongest form of electromagnetic radiation is the magnetic dipole. For the case of gravity, the change of the “magnetic dipole” is proportional to the angular momentum of the system, which is also a conserved quantity, and thus there is no dipolar gravitational radiation of any sort. It follows that gravitational radiation is of quadrupolar or higher nature and is directly linked to the quadrupole moment of the mass distribution.

As early as 1918, Einstein derived the quadrupole formula for gravitational radiation. This formula states that the wave amplitude $h_{ij}$ is proportional to the second time derivative of the quadrupole moment of the source:

$$h_{ij} = \frac{2}{r}\frac{G}{c^4}\,\ddot{Q}^{\rm TT}_{ij}\left(t - \frac{r}{c}\right), \tag{27}$$


where

$$Q^{\rm TT}_{ij}(x) = \int\rho\left(x^i x^j - \frac{1}{3}\delta^{ij}r^2\right)d^3x \tag{28}$$

is the quadrupole moment in the TT gauge evaluated at the retarded time $t - r/c$ and $\rho$ is the matter density in a volume element $d^3x$ at the position $x^i$. This result is quite accurate for all sources as long as the reduced wavelength $\bar{\lambda} = \lambda/2\pi$ is much longer than the source size R. It should be pointed out that the above result can be derived via a quite cumbersome calculation in which we solve the wave equation (6) with a source term $T_{\mu\nu}$ on the right-hand side. In the course of such a derivation, a number of assumptions must be used. In particular, the observer must be located at a distance $r \gg \bar{\lambda}$, far greater than the reduced wavelength (in what is called the radiation zone), and $T_{\mu\nu}$ must not change very quickly.

Using the formulas (24) and (25) for the energy carried by gravitational waves, one can derive the luminosity in gravitational waves as a function of the third-order time derivative of the quadrupole moment tensor. This is the quadrupole formula

$$L_{\rm GW} = -\frac{dE}{dt} = \frac{1}{5}\frac{G}{c^5}\left\langle\dddot{Q}_{ij}\,\dddot{Q}_{ij}\right\rangle. \tag{29}$$

Based on this formula, we derive some additional formulas which provide order-of-magnitude estimates for the amplitude of the gravitational waves and the corresponding power output of a source. First, the quadrupole moment of a system is approximately equal to the mass M of the part of the system that moves, times the square of the size R of the system. This means that the third-order time derivative of the quadrupole moment is

$$\dddot{Q}_{ij} \sim \frac{MR^2}{T^3} \sim \frac{M\upsilon^2}{T} \sim \frac{E_{\rm ns}}{T}, \tag{30}$$

where $\upsilon$ is the mean velocity of the moving parts, $E_{\rm ns}$ is the kinetic energy of the component of the source’s internal motion that is nonspherical, and T is the time scale for a mass to move from one side of the system to the other. The time scale (or period) is actually proportional to the inverse of the square root of the mean density of the system

$$T \sim \sqrt{R^3/GM}. \tag{31}$$

This relation provides a rough estimate of the characteristic frequency of the system, $f \sim 1/(2\pi T)$. Then the luminosity of gravitational waves of a given source is approximately

$$L_{\rm GW} \sim \frac{G^4}{c^5}\left(\frac{M}{R}\right)^5 \sim \frac{G}{c^5}\left(\frac{M}{R}\right)^2\upsilon^6 \sim \frac{c^5}{G}\left(\frac{R_{\rm Sch}}{R}\right)^2\left(\frac{\upsilon}{c}\right)^6, \tag{32}$$

where $R_{\rm Sch} = 2GM/c^2$ is the Schwarzschild radius of the source. It is obvious that the maximum value of the luminosity in gravitational waves can be achieved if the source’s dimensions are of the order of its Schwarzschild radius and the typical velocities of the components of the system are of the order of the speed of light. This explains why we expect the best gravitational wave sources to be highly relativistic compact objects. The above formula also sets an upper limit on the power emitted by a source, which for $R \sim R_{\rm Sch}$ and $\upsilon \sim c$ is

$$L_{\rm GW} \sim c^5/G = 3.6 \times 10^{59}\ {\rm ergs/sec}. \tag{33}$$

This is an immense power, often called the luminosity of the universe.
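As a quick numerical check of the limit in Eq. (33), the last form of Eq. (32) can be evaluated directly. The function name and the rounded CGS constants below are illustrative assumptions, and the result is only an order-of-magnitude estimate, as in the text.

```python
import math

G_CGS = 6.674e-8    # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10    # speed of light, cm s^-1
M_SUN_G = 1.989e33  # solar mass, g


def gw_luminosity_estimate(mass_g, radius_cm, velocity_cm_s):
    """Order-of-magnitude luminosity from the last form of Eq. (32), erg/s."""
    r_sch = 2.0 * G_CGS * mass_g / C_CGS ** 2
    return (C_CGS ** 5 / G_CGS) * (r_sch / radius_cm) ** 2 \
        * (velocity_cm_s / C_CGS) ** 6


L_MAX = C_CGS ** 5 / G_CGS  # upper limit of Eq. (33)

mass = 1.4 * M_SUN_G
r_sch = 2.0 * G_CGS * mass / C_CGS ** 2
at_limit = gw_luminosity_estimate(mass, r_sch, C_CGS)
slow_source = gw_luminosity_estimate(mass, 10.0 * r_sch, 0.1 * C_CGS)

print(f"c^5/G = {L_MAX:.2e} erg/s")
print(f"R = 10 R_Sch, v = 0.1 c gives a fraction {slow_source / L_MAX:.1e}")
```

The steep dependence on compactness and velocity is apparent: a source ten times larger than its Schwarzschild radius, moving at a tenth of the speed of light, already falls eight orders of magnitude below the limit.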

Using the above order-of-magnitude estimates, we can get a rough estimate of the amplitude of gravitational waves at a distance r from the source:

$$h \sim \frac{G}{c^4}\frac{E_{\rm ns}}{r} \sim \frac{G}{c^4}\frac{\varepsilon E_{\rm kin}}{r}, \tag{34}$$

where $\varepsilon E_{\rm kin}$ (with $0 \leq \varepsilon \leq 1$) is the fraction of kinetic energy of the source that is able to produce gravitational waves. The factor $\varepsilon$ is a measure of the asymmetry of the source and implies that only a time-varying quadrupole moment will emit gravitational waves. For example, even if a huge amount of kinetic energy is involved in a given explosion and/or implosion, if the event takes place in a spherically symmetric manner, there will be no gravitational radiation.

Another formula for the amplitude of gravitational waves can be derived from the flux formula (26). If, for example, we consider an event (perhaps a supernova explosion) at the Virgo cluster during which the energy equivalent of $10^{-4}$ solar masses is released in gravitational waves at a frequency of 1 kHz and with signal duration of the order of 1 msec, the amplitude of the gravitational waves on earth will be

$$h \approx 10^{-22}\left(\frac{E}{10^{-4}\,M_{\rm Sun}}\right)^{1/2}\left(\frac{f}{1\ {\rm kHz}}\right)^{-1}\left(\frac{\tau}{1\ {\rm msec}}\right)^{-1/2}\left(\frac{r}{15\ {\rm Mpc}}\right)^{-1}. \tag{35}$$

These numbers explain why experimenters are trying so hard to build ultrasensitive detectors.
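The route from the flux formula (26) to the amplitude estimate (35) can be sketched as follows. The sketch assumes that the energy is radiated isotropically over a sphere of radius r during the signal duration, so that the flux is E/(4πr²τ); that assumption, the rounded constants, and the function name are not taken from the text.

```python
import math

M_SUN_G = 1.989e33  # solar mass, g
C_CGS = 2.998e10    # speed of light, cm s^-1
MPC_CM = 3.086e24   # one megaparsec, cm


def burst_amplitude(energy_msun, frequency_hz, duration_s, distance_mpc):
    """Amplitude on earth obtained by inverting the flux formula, Eq. (26)."""
    energy_erg = energy_msun * M_SUN_G * C_CGS ** 2
    distance_cm = distance_mpc * MPC_CM
    # Assumed: isotropic emission spread evenly over the signal duration.
    flux = energy_erg / (4.0 * math.pi * distance_cm ** 2 * duration_s)
    # Eq. (26): F = 3 (f / 1 kHz)^2 (h / 1e-22)^2 erg cm^-2 s^-1.
    return 1.0e-22 * math.sqrt(flux / 3.0) / (frequency_hz / 1.0e3)


h_virgo = burst_amplitude(1.0e-4, 1.0e3, 1.0e-3, 15.0)
print(f"h = {h_virgo:.2e}")
```

With the reference values of Eq. (35) this gives an amplitude of order 10⁻²², and the scalings with energy, frequency, duration, and distance match the exponents in Eq. (35).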

Finally, it is useful to know the damping time, that is, the time it takes for a source to transform a fraction 1/e of its energy into gravitational radiation. One can obtain a rough estimate from the formula

$$\tau = \frac{E_{\rm kin}}{L_{\rm GW}} \sim \frac{1}{c}R\left(\frac{R}{R_{\rm Sch}}\right)^3. \tag{36}$$

For example, for a non-radially oscillating neutron star with a mass of roughly 1.4 solar masses and a radius of 12 km, the damping time will be of the order of 1 msec. Also, by using formula (31), we get an estimate for the frequency of oscillation which is directly related to the frequency of the emitted gravitational waves, roughly 2 kHz for the above case.
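The neutron star numbers quoted above can be reproduced from Eqs. (31) and (36). In this sketch the frequency is taken as f ≈ 1/(2πT), i.e., T is treated as the inverse of an angular frequency; this is an assumption, chosen because it reproduces the quoted value of roughly 2 kHz. The function names and rounded SI constants are likewise illustrative.

```python
import math

G_SI = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C_SI = 2.998e8       # speed of light, m s^-1
M_SUN_KG = 1.989e30  # solar mass, kg


def schwarzschild_radius(mass_kg):
    """R_Sch = 2GM/c^2 in meters."""
    return 2.0 * G_SI * mass_kg / C_SI ** 2


def dynamical_time(mass_kg, radius_m):
    """Time scale of Eq. (31), T ~ sqrt(R^3 / GM), in seconds."""
    return math.sqrt(radius_m ** 3 / (G_SI * mass_kg))


def damping_time(mass_kg, radius_m):
    """Rough damping time of Eq. (36), (R/c)(R/R_Sch)^3, in seconds."""
    ratio = radius_m / schwarzschild_radius(mass_kg)
    return (radius_m / C_SI) * ratio ** 3


mass = 1.4 * M_SUN_KG
radius = 12.0e3  # 12 km

tau = damping_time(mass, radius)
# Assumed convention: angular frequency ~ 1/T, so f ~ 1/(2 pi T).
frequency = 1.0 / (2.0 * math.pi * dynamical_time(mass, radius))

print(f"damping time ~ {tau * 1e3:.2f} msec")
print(f"frequency    ~ {frequency / 1e3:.2f} kHz")
```

Both values come out at the order of magnitude stated in the text, about a millisecond and somewhat below 2 kHz, which is all that estimates of this kind can be expected to deliver.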

E. Rotating Binary System

Among the most interesting sources of gravitational waves are binaries. The inspiraling of such systems, consisting of black holes or neutron stars, is, as we will discuss later, the most promising source for gravitational wave detectors. Binary systems are also the sources of gravitational waves whose dynamics we understand the best. If we assume that the two bodies making up the binary lie in the x−y plane and their orbits are circular (see Fig. 3), then the only nonvanishing components of the quadrupole tensor are

$$Q_{xx} = -Q_{yy} = \frac{1}{2}\mu a^2\cos 2\Omega t, \qquad Q_{xy} = Q_{yx} = \frac{1}{2}\mu a^2\sin 2\Omega t, \tag{37}$$

where $\Omega$ is the orbital angular velocity, $a$ is the distance between the two bodies, $\mu = M_1M_2/M$ is the reduced mass of the system, and $M = M_1 + M_2$ is its total mass.

According to Eq. (29) the gravitational radiation luminosity of the system is

$$L_{\rm GW} = \frac{32}{5}\frac{G}{c^5}\mu^2a^4\Omega^6 = \frac{32}{5}\frac{G^4}{c^5}\frac{M^3\mu^2}{a^5}, \tag{38}$$

where, in order to obtain the last part of the relation, we have used Kepler’s third law, $\Omega^2 = GM/a^3$. As the gravitating system loses energy by emitting radiation, the distance between the two bodies shrinks at a rate

$$\frac{da}{dt} = -\frac{64}{5}\frac{G^3}{c^5}\frac{\mu M^2}{a^3}, \tag{39}$$

and the orbital frequency increases accordingly ($\dot{T}/T = 1.5\,\dot{a}/a$). If, for example, the present separation of the two stars is $a_0$, then the binary system will coalesce after a time

$$\tau = \frac{5}{256}\frac{c^5}{G^3}\frac{a_0^4}{\mu M^2}. \tag{40}$$

FIGURE 3 A system of two bodies orbiting around their common center of gravity. Binary systems are the best sources of gravitational waves.
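Equations (38) to (40) are linked by energy balance and can be checked against each other numerically. The sketch below assumes a Newtonian orbital energy of −GM₁M₂/(2a) for the circular binary, which is not stated in the text; the example masses, the separation, the function names, and the rounded constants are also illustrative.

```python
import math

G_SI = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C_SI = 2.998e8       # speed of light, m s^-1
M_SUN_KG = 1.989e30  # solar mass, kg


def reduced_and_total_mass(m1, m2):
    """Return (mu, M) for the two bodies."""
    total = m1 + m2
    return m1 * m2 / total, total


def binary_luminosity(m1, m2, a):
    """Gravitational wave luminosity of a circular binary, Eq. (38)."""
    mu, total = reduced_and_total_mass(m1, m2)
    return 32.0 / 5.0 * G_SI ** 4 / C_SI ** 5 * total ** 3 * mu ** 2 / a ** 5


def separation_rate(m1, m2, a):
    """Rate of change da/dt of the orbital separation, Eq. (39)."""
    mu, total = reduced_and_total_mass(m1, m2)
    return -64.0 / 5.0 * G_SI ** 3 / C_SI ** 5 * mu * total ** 2 / a ** 3


def coalescence_time(m1, m2, a0):
    """Time to coalescence from separation a0, Eq. (40)."""
    mu, total = reduced_and_total_mass(m1, m2)
    return 5.0 / 256.0 * C_SI ** 5 / G_SI ** 3 * a0 ** 4 / (mu * total ** 2)


m1 = m2 = 1.4 * M_SUN_KG
a0 = 1.0e9  # illustrative separation in meters

# Energy balance: d/dt[-G m1 m2 / (2a)] must equal -L_GW (assumed Newtonian energy).
energy_loss_rate = G_SI * m1 * m2 / (2.0 * a0 ** 2) * separation_rate(m1, m2, a0)
balance_ok = math.isclose(energy_loss_rate, -binary_luminosity(m1, m2, a0),
                          rel_tol=1e-12)

# Integrating Eq. (39) gives a(t) = a0 (1 - t/tau)^(1/4), so da/dt(0) = -a0/(4 tau).
tau = coalescence_time(m1, m2, a0)
slope_ok = math.isclose(separation_rate(m1, m2, a0), -a0 / (4.0 * tau),
                        rel_tol=1e-12)

print("energy balance consistent:", balance_ok)
print("Eq. (40) consistent with Eq. (39):", slope_ok)
```

Because the shrinkage rate grows as the inverse cube of the separation, most of the coalescence time is spent at wide separations, which is why the time of Eq. (40) scales with the fourth power of the initial separation.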

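Finally, the amplitude of the gravitational waves is

$$h = 5 \times 10^{-22}\left(\frac{M}{2.8\,M_{\rm Sun}}\right)^{2/3}\left(\frac{\mu}{0.7\,M_{\rm Sun}}\right)\left(\frac{f}{100\ {\rm Hz}}\right)^{2/3}\left(\frac{15\ {\rm Mpc}}{r}\right). \tag{41}$$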

In all these formulas we have assumed that the orbits are circular. In general, the orbits of the two bodies are approximately ellipses, but it has been shown that long before the coalescence of the two bodies, the orbits become circular, at least for long-lived binaries, due to gravitational radiation. Also, the amplitude of the emitted gravitational waves depends on the angle between the line of sight and the axis of angular momentum; formula (41) refers to an observer along the axis of the orbital angular momentum. The complete formula for the amplitude contains angular factors of order 1. The relative strength of the two polarizations depends on that angle as well.

If three or more detectors observe the same signal, it is possible to reconstruct the full waveform and deduce many details of the orbit of the binary system.

As an example, we will provide some details of the well-studied pulsar PSR 1913+16 (the Hulse–Taylor pulsar), which is expected to coalesce after ∼3.5 × 10⁸ years. The binary system is roughly 5 kpc away from earth, the masses of the two neutron stars are estimated to be ∼1.4 solar masses each, and the present period of the system is ∼7 hr, 45 min. The predicted rate of period change is $\dot{T} = -2.40 \times 10^{-12}$ sec/sec, while the corresponding observed value is in excellent agreement with the predictions, i.e., $\dot{T} = (-2.30 \pm 0.22) \times 10^{-12}$ sec/sec; finally the present amplitude of gravitational waves is of the order of $h \sim 10^{-23}$ at a frequency of ∼7 × 10⁻⁵ Hz.

III. DETECTION OF GRAVITATIONAL RADIATION

The first attempt to detect gravitational waves was undertaken by Joseph Weber during the early 1960s. He developed the first resonant mass detector and inspired many other physicists to build new detectors and explore from a theoretical viewpoint possible cosmic sources of gravitational radiation.

A pair of masses joined by a spring can be viewed as the simplest conceivable detector; see Figure 4. In practice, a cylindrical massive metal bar or even a massive sphere is used instead of this simple system. When a gravitational wave hits such a device, it causes the bar to vibrate. By monitoring this vibration, we can reconstruct the true waveform. The next step following resonant mass detectors was the replacement of the spring by pendulums. In this new detector the motions induced by a passing gravitational wave would be detected by monitoring, via laser interferometry, the relative change in the distance of two freely suspended bodies. The use of interferometry is probably the most decisive step in our attempt to detect gravitational wave signals. In what follows, we will discuss both resonant bars and laser interferometric detectors.

FIGURE 4 A pair of masses joined by a spring can be viewed as the simplest conceivable detector.

Although the basic principle of such detectors is very simple, the sensitivity of detectors is limited by various sources of noise. The internal noise of the detectors can be Gaussian or non-Gaussian. Non-Gaussian noise may occur several times per day, for example through strain releases in the suspension systems which isolate the detector from environmental mechanical sources of noise, and the only way to remove this type of noise is via comparisons of the data streams from various detectors. The so-called Gaussian noise obeys the probability distribution of Gaussian statistics and can be characterized by a spectral density $S_n(f)$. The observed signal at the output of a detector consists of the true gravitational wave strain h and Gaussian noise. The optimal method to detect a gravitational wave signal leads to the following signal-to-noise ratio:

$$\left(\frac{S}{N}\right)^2_{\rm opt} = 2\int_0^\infty\frac{|\tilde{h}(f)|^2}{S_n(f)}\,df, \tag{42}$$

where $\tilde{h}(f)$ is the Fourier transform of the signal waveform. It is clear from this expression that the sensitivity of gravitational wave detectors is limited by noise.
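To make Eq. (42) concrete, the integral can be evaluated numerically for a toy case. In this sketch the infinite frequency range is truncated to a finite band, and both the flat signal spectrum and the flat noise spectral density are invented purely for illustration; they do not describe any real detector or source.

```python
def optimal_snr(h_tilde, noise_psd, f_min, f_max, n_steps=10000):
    """Evaluate Eq. (42) with the trapezoidal rule over [f_min, f_max].

    h_tilde(f) is the Fourier transform of the signal and noise_psd(f)
    is the spectral density S_n(f); both are supplied as callables.
    """
    df = (f_max - f_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        f = f_min + i * df
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * abs(h_tilde(f)) ** 2 / noise_psd(f)
    return (2.0 * total * df) ** 0.5


# Toy inputs (assumed): flat |h(f)| = 1e-23 / Hz, flat S_n = 1e-44 / Hz,
# signal confined to the band 100-200 Hz.
snr = optimal_snr(lambda f: 1.0e-23, lambda f: 1.0e-44, 100.0, 200.0)
print(f"S/N = {snr:.3f}")
```

For these flat spectra the integral reduces to 2|h̃|²Δf/Sₙ, so the numerical value can be compared with the closed form; lowering the noise spectral density or widening the band occupied by the signal raises the signal-to-noise ratio.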

A. Resonant Detectors

Suppose that a gravitational wave propagating along the z axis with pure (+) polarization impinges on an idealized detector, two masses joined by a spring along the x axis as in Fig. 4. We will assume that $h_{\mu\nu}$ describes the strain produced in the spacetime by the passing wave. We will try to calculate the amplitude of the oscillations induced on the spring detector by the wave and the amount of energy absorbed by this detector. The tidal force induced on the detector is given by Eq. (23), and the masses will move according to the following equation of motion:

$$\ddot{\xi} + \dot{\xi}/\tau + \omega_0^2\,\xi = -\frac{1}{2}\omega^2 L h_+ e^{i\omega t}, \tag{43}$$

where $\omega_0$ is the natural vibration frequency of the detector, $\tau$ is the damping time of the oscillator due to frictional forces, L is the separation between the two masses, and $\xi$ is the relative change in the distance of the two masses. The gravitational wave plays the role of the driving force for the ideal oscillator, and the solution to the above equation is

$$\xi = -\frac{\frac{1}{2}\omega^2 L h_+ e^{i\omega t}}{\omega_0^2 - \omega^2 + i\omega/\tau}. \tag{44}$$

If the frequency $\omega$ of the impinging wave is near the natural frequency $\omega_0$ of the oscillator (near resonance), the detector is excited into large-amplitude motions and it rings like a bell. Actually, in the case of $\omega = \omega_0$, we get the maximum amplitude $\xi_{\max} = \omega_0\tau L h_+/2$. Since the size of the detector L and the amplitude of the gravitational waves $h_+$ are fixed, large-amplitude motions can be achieved only by increasing the quality factor $Q = \omega_0\tau$ of the detector. In practice, the frequency of the detector is fixed by its size and the only improvement we can get is by choosing the type of material so that long relaxation times are achieved.
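The sharpness of the resonance can be illustrated by evaluating the magnitude of the steady-state solution (44) on and off resonance. The detector parameters below are illustrative values of the kind used for bar detectors, and the wave amplitude is an arbitrary assumption; only the magnitude of the response is computed, so the overall sign of the solution plays no role.

```python
import math


def response_amplitude(omega, omega0, tau, length, h_plus):
    """Magnitude of the steady-state displacement from Eq. (44)."""
    denominator = complex(omega0 ** 2 - omega ** 2, omega / tau)
    return abs(0.5 * omega ** 2 * length * h_plus / denominator)


# Illustrative parameters (assumed for this sketch).
omega0 = 1660.0        # natural frequency, s^-1
quality = 2.0e5        # quality factor Q = omega0 * tau
tau = quality / omega0
length = 1.5           # m
h_plus = 1.0e-20

on_resonance = response_amplitude(omega0, omega0, tau, length, h_plus)
off_resonance = response_amplitude(0.99 * omega0, omega0, tau, length, h_plus)

print(f"on resonance : {on_resonance:.3e} m")
print(f"1% detuned   : {off_resonance:.3e} m")
print("matches Q L h / 2:",
      math.isclose(on_resonance, 0.5 * quality * length * h_plus, rel_tol=1e-9))
```

A detuning of only one percent already reduces the response by several thousand for this quality factor, which illustrates why a high-Q detector responds appreciably only to waves very close to its resonant frequency.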

The cross section is a measure of the interception ability of a detector. For the special case of resonance, the average cross section of the test detector, assuming any possible direction of the wave, is

$$\sigma = \frac{32\pi}{15}\frac{G}{c^3}\,\omega_0 Q M L^2. \tag{45}$$


This formula is general; it applies even if we replace our toy detector with a massive metal cylinder, for example, Weber’s first detector. That detector had the following characteristics: mass M = 1410 kg, length L = 1.5 m, diameter 66 cm, resonant frequency $\omega_0$ = 1660 Hz, and quality factor $Q = \omega_0\tau = 2 \times 10^5$. For these values the calculated cross section is roughly 3 × 10⁻¹⁹ cm², which is quite small, and even worse, it can be reached only when the frequency of the impinging wave is very close to the resonance frequency (the typical resonance width is usually of the order of 0.1–1 Hz).

In reality, the efficiency of a resonant bar detector depends on several other parameters. Here, we will discuss only the more fundamental ones. Assuming perfect isolation of the resonant bar detector from any external source of noise (acoustical, seismic, electromagnetic), the thermal noise is the only factor limiting our ability to detect gravitational waves. Thus, in order to detect a signal, the energy deposited by the gravitational wave every $\tau$ seconds should be larger than the energy kT due to thermal fluctuations. This leads to a formula for the minimum detectable energy flux of gravitational waves, which, following Eq. (25), leads to a minimum detectable strain amplitude h,

$$h \geq \frac{1}{\omega_0 L Q}\sqrt{\frac{15kT}{M}}. \tag{46}$$

For Weber’s detector, at room temperature this yields a minimum detectable strain of the order of $10^{-20}$. However, this estimate of the minimum sensitivity applies only to gravitational waves whose duration is at least as long as the damping time of the bar’s vibrations and whose frequency perfectly matches the resonant frequency of the detector.
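The thermal limit of Eq. (46) is easy to evaluate for the parameters quoted for Weber's detector. The sketch assumes a room temperature of 300 K and uses the quoted value 1660 directly as $\omega_0$ in s⁻¹; both choices, like the function name, are assumptions made for illustration.

```python
import math

K_BOLTZMANN = 1.381e-23  # Boltzmann constant, J/K


def min_detectable_strain(omega0, length_m, quality, temperature_k, mass_kg):
    """Thermal-noise limit on the strain amplitude, Eq. (46)."""
    thermal_velocity = math.sqrt(15.0 * K_BOLTZMANN * temperature_k / mass_kg)
    return thermal_velocity / (omega0 * length_m * quality)


# Parameters quoted in the text for Weber's detector; 300 K is assumed.
h_room = min_detectable_strain(1660.0, 1.5, 2.0e5, 300.0, 1410.0)
# Same bar cooled to liquid-helium temperature (illustrative).
h_cold = min_detectable_strain(1660.0, 1.5, 2.0e5, 4.2, 1410.0)

print(f"room temperature: h_min ~ {h_room:.1e}")
print(f"at 4.2 K        : h_min ~ {h_cold:.1e}")
```

The limit improves only with the square root of the temperature and of the mass, but linearly with the quality factor, which is consistent with the emphasis placed on high-Q materials and cryogenic operation.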

FIGURE 5 A sketch of Nautilus at Frascati, near Rome, probably the most sensitive resonant detector available.

For burst signals or for periodic signals which are off-resonance (with regard to the detector) the sensitivity of a resonant bar detector decreases further by several orders of magnitude.

In reality, modern resonant bar detectors are quite complicated devices, consisting of a solid metallic cylinder weighing a few tons and suspended in vacuo by a cable wrapped under its center of gravity (Figure 5). This suspension system protects the antenna from external mechanical shocks. The whole system is cooled down to temperatures of a few kelvins or even millikelvins. To monitor the vibrations of the bar, piezoelectric transducers (or the more modern capacitive ones) are attached to the bar. The transducers convert the bar’s mechanical energy into electrical energy. The signal is amplified by an ultralow-frequency amplifier by using a device called a SQUID (superconducting quantum interference device) before it becomes available for data analysis. Transducers and amplifiers of electronic signals require careful design to achieve low noise combined with adequate signal transfer.

The above description of the resonant bar detectors shows that, in order to achieve high sensitivity, one has to:

1. Create more massive antennas. Today, most antennas are about 50% more massive than Weber’s early antenna. There are studies and research plans for future construction of spherical antennas weighing up to 100 tons.

2. Obtain higher quality factor Q. Modern antennas generally use aluminum alloy 5056 ($Q \sim 4 \times 10^{7}$); niobium (which is used in the Niobe detector) is even better ($Q \sim 10^{8}$), but much more expensive. Silicon or sapphire bars would enhance the quality factor even more, but experimenters must first find a way to produce large, single pieces of these crystals.

3. Lower the temperature of the antenna as much as possible. Advanced cryogenic techniques have been used and the resonant bar detectors are probably the coldest places in the universe. Typical cooling temperatures for the most advanced antennas are below the temperature of liquid helium.

4. Achieve strong coupling between the antenna and the electronics and low electrical noise. The bar detectors include the best available technology related to transducers and integrate the most recent advances in SQUID technology.

Since the early 1990s, a number of resonant bar detectors have been in nearly continuous operation in several places around the world. They have achieved sensitivities of a few times $10^{-21}$, but there has been no clear evidence of gravitational wave detection. As we will discuss later, they will have a good chance of detecting a gravitational wave signal from a supernova explosion in our galaxy, although this is a rather rare event (2–5 per century).

The details of the most sensitive cryogenic bar detectors in operation are as follows:

Allegro (Baton Rouge, LA): Mass 2296 kg (Aluminum 5056), length 3 m, bar temperature 4.2 K, mode frequency 896 Hz.

Auriga (Legnaro, Italy): Mass 2230 kg (Aluminum 5056), length 2.9 m, bar temperature 0.2 K, mode frequency 913 Hz.

Explorer (CERN, Switzerland): Mass 2270 kg (Aluminum 5056), length 3 m, bar temperature 2.6 K, mode frequency 906 Hz.

Nautilus (Frascati, Italy): Mass 2260 kg (Aluminum 5056), length 3 m, bar temperature 0.1 K, mode frequency 908 Hz.

Niobe (Perth, Australia): Mass 1500 kg (niobium), length 1.5 m, bar temperature 5 K, mode frequency 695 Hz.
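As a rough illustration, the thermal-noise bound of Eq. (46) can be evaluated for the bars listed above. This is only a sketch: the quality factors are the representative material values quoted earlier ($4 \times 10^{7}$ for aluminum 5056, $10^{8}$ for niobium), assumed here rather than measured for each instrument, and the function and variable names are illustrative. The result is the idealized limit for a long, perfectly on-resonance signal, so it lies well below the burst sensitivities actually achieved.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def thermal_strain_limit(mass_kg, length_m, temp_k, freq_hz, q):
    """Eq. (46): h >= sqrt(15 k T / M) / (omega0 * L * Q)."""
    omega0 = 2.0 * math.pi * freq_hz
    return math.sqrt(15.0 * K_B * temp_k / mass_kg) / (omega0 * length_m * q)


# (mass kg, length m, temperature K, mode frequency Hz, ASSUMED Q)
BARS = {
    "Allegro": (2296, 3.0, 4.2, 896, 4e7),
    "Auriga": (2230, 2.9, 0.2, 913, 4e7),
    "Explorer": (2270, 3.0, 2.6, 906, 4e7),
    "Nautilus": (2260, 3.0, 0.1, 908, 4e7),
    "Niobe": (1500, 1.5, 5.0, 695, 1e8),
}

limits = {name: thermal_strain_limit(*params) for name, params in BARS.items()}
for name, h in limits.items():
    print(f"{name:9s} idealized thermal limit h ~ {h:.1e}")
```

The ordering of the results mainly reflects the bar temperatures, since the masses, lengths, and frequencies of the aluminum bars are nearly identical.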

Also, there are plans for construction of massive spherical resonant detectors, the advantages of which will be their high mass (100 tons), their broader sensitivity band (up to 100–200 Hz), and their omnidirectional sensitivity. In a spherical detector, five modes at a time will be excited, which is equivalent to five independent detectors oriented in different ways. This offers the opportunity, in the case of detection, to obtain direct information about the polarization of the wave and the direction to the source.

B. Beam Detectors

1. Laser Interferometers

A laser interferometer is an alternative gravitational wave detector that offers the possibility of very high sensitivities over a broad frequency band. Originally, the idea was to construct a new type of resonant detector with much larger dimensions. As one can realize from the relations (45) and (46), the longer the resonant detector is, the more sensitive it becomes. One could then try to measure the relative change in the distance of two well-separated masses by monitoring their separation via a laser beam that continuously bounces back and forth between them. (This technique is actually used in searching for gravitational waves by using the so-called Doppler tracking technique, where a distant interplanetary spacecraft is monitored from earth through a microwave tracking link; the earth and spacecraft act as free particles.) Soon, it was realized that it is much easier to use laser light to measure relative changes in the lengths of two perpendicular arms; see Figure 6. Gravitational waves that are propagating perpendicular to the plane of the interferometer will increase the length of one arm of the interferometer and shorten the length of the other arm. This technique of monitoring waves is based on Michelson interferometry. L-shaped interferometers are particularly suited to the detection of gravitational waves owing to the quadrupolar nature of the waves.

Figure 6 shows a schematic design of a Michelson interferometer; the three masses $M_0$, $M_1$, and $M_2$ are freely suspended. Note that the resonant frequencies of these pendulums should be much smaller than the frequencies of the waves that we are supposed to detect since the pendulums are supposed to behave like free masses. Mirrors are attached to $M_1$ and $M_2$, and the mirror attached to mass $M_0$ splits the light into two perpendicular directions. The light is reflected at the two corner mirrors and returns to the beam splitter. The splitter now half-transmits and half-reflects each of the beams. One part of each beam goes back to the laser, and the other parts are combined to reach the photodetector, where the fringe pattern is monitored. If a gravitational wave slightly changes the lengths of the two arms, the fringe pattern will change, and so by monitoring the changes of the fringe pattern one can measure the changes in the arm lengths and consequently monitor the incoming gravitational radiation.¹

¹ In practice, things are arranged so that all light that returns on the beam splitter from the corner mirrors is sent back into the laser, and only if there is some motion of the masses is there an output at the photodetector.

FIGURE 6 Schematic design of a Michelson interferometer.

Let us consider an impinging gravitational wave with amplitude h and (+) polarization propagating perpendicular to the plane of the detector. We will further assume that the frequency is much higher than the resonant frequency of the pendulums and the wavelength is much longer than the arm length of the detector. Such a wave will generate a change of $\Delta L \sim hL/2$ in the arm length along the x direction and an opposite change in the arm length along the y direction, according to equations (19) and (20). The total difference in length between the two arms will be

$$\frac{\Delta L}{L} \sim h. \qquad (47)$$

For a gravitational wave with amplitude $h \sim 10^{-21}$ and detector arm length 4 km (such as LIGO), this will induce a change in the arm length of about $\Delta L \sim 10^{-16}$ cm. In the general case, when a gravitational wave with arbitrary polarization impinges on the detector from a random direction, the above formula will be modified by some angular coefficients of order 1.
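The order-of-magnitude estimate above follows directly from Eq. (47). A minimal sketch of the arithmetic, with an illustrative function name that is not part of the original text:

```python
def arm_length_change_cm(strain, arm_length_km):
    """Eq. (47): delta_L ~ h * L, returned in centimetres."""
    return strain * arm_length_km * 1.0e5  # 1 km = 1e5 cm


delta_l = arm_length_change_cm(1e-21, 4.0)
print(f"delta_L ~ {delta_l:.0e} cm")
```

For $h \sim 10^{-21}$ and a 4 km arm this gives a few times $10^{-16}$ cm, consistent with the figure quoted in the text.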

If the light bounces a few times between the mirrors before it is collected in the photodiode, the effective arm length of the detector is increased considerably and the measured variations of the arm lengths will be increased accordingly. This is a quite efficient procedure for making the arm length longer. For example, a gravitational wave at a frequency of 100 Hz has a wavelength of 3000 km, and if we assume 100 bounces of the laser beam in the arms of the detector, the effective arm length of the detector is 100 times larger than the actual arm length, but still this is 10 times smaller than the wavelength of the incoming wave. The optical cavity that is created between the mirrors of the detector is known as a Fabry–Perot cavity and is used in modern interferometers.
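The comparison between the effective arm length and the gravitational wavelength can be checked numerically. The sketch below assumes a 4 km physical arm (the LIGO value quoted earlier) and 100 bounces; the helper names are illustrative only.

```python
C_KM_S = 299_792.458  # speed of light, km/s


def gw_wavelength_km(freq_hz):
    """Wavelength of a gravitational wave travelling at the speed of light."""
    return C_KM_S / freq_hz


def effective_arm_km(arm_km, bounces):
    """Effective optical arm length after the beam is folded `bounces` times."""
    return arm_km * bounces


wavelength = gw_wavelength_km(100.0)    # roughly 3000 km
effective = effective_arm_km(4.0, 100)  # 400 km for the assumed 4 km arm
print(f"wavelength = {wavelength:.0f} km, effective arm = {effective:.0f} km, "
      f"ratio = {wavelength / effective:.1f}")
```

The ratio comes out somewhat below ten, so the long-wavelength assumption still holds, although only by roughly an order of magnitude.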

In the remainder of this subsection we will focus on the Gaussian sources of noise and their expected influence on the sensitivity of laser interferometers.

a. Photon shot noise. When a gravitational wave produces a change $\Delta L$ in the arm length, the phase difference between the two light beams changes by an amount $\Delta\phi = 2b\,\Delta L/\lambda$, where λ is the reduced wavelength of the laser light ($\sim 10^{-8}$ cm) and b is the number of bounces of the light in each arm. It is expected that a detectable gravitational wave will produce a phase shift of the order of $10^{-9}$ rad. The precision of the measurements, though, is ultimately restricted by fluctuations in the fringe pattern due to fluctuations in the number of detected photons. The number of photons that reach the detector is proportional to the intensity of the laser beam and can be estimated via the relation $N = N_0 \sin^2(\Delta\phi/2)$, where $N_0$ is the number of photons that the laser supplies and N is the number of detected photons. Inversion of this equation leads to an estimation of the relative change of the arm lengths $\Delta L$ by measuring the number of the emerging photons N. However, there are statistical fluctuations in the population of photons, which are proportional to the square root of the number of photons. This implies an uncertainty in the measurement of the arm length


$$\delta(\Delta L) \sim \frac{\lambda}{2b\sqrt{N_0}}. \qquad (48)$$

Thus, the minimum gravitational wave amplitude that we can measure is

$$h_{\min} = \frac{\delta(\Delta L)}{L} = \frac{\Delta L}{L} \sim \frac{\lambda}{b L N_0^{1/2}} \sim \frac{1}{bL}\left(\frac{\hbar c \lambda}{\tau I_0}\right)^{1/2}, \qquad (49)$$

where $I_0$ is the intensity of the laser light (∼5–10 W) and τ is the duration of the measurement. This limitation in the detector’s sensitivity due to the photon counting uncertainty is known as photon shot noise. For a typical laser interferometer the photon shot noise is the dominant source of noise for frequencies above 200 Hz, while the square root of its power spectral density, $\sqrt{S_n(f)}$, for frequencies 100–200 Hz is of the order of $3 \times 10^{-23}\ \mathrm{Hz}^{-1/2}$.
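To get a feel for the numbers in Eq. (49), the photon count can be written as $N_0 = I_0 \tau \lambda / (\hbar c)$, with λ the reduced wavelength. The following sketch evaluates the shot-noise limit under stated assumptions: a 10 W laser, a 1 ms measurement, 100 bounces, a 4 km arm, and a reduced wavelength corresponding to 1064 nm light. The wavelength and bounce count are illustrative choices, not values taken from a specific detector design.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 299_792_458.0       # speed of light, m/s


def photon_count(power_w, tau_s, reduced_wavelength_m):
    """Photons supplied in time tau: N0 = I0 * tau * lambda / (hbar * c)."""
    return power_w * tau_s * reduced_wavelength_m / (HBAR * C)


def shot_noise_strain(power_w, tau_s, reduced_wavelength_m, bounces, arm_m):
    """Eq. (49): h_min ~ lambda / (b * L * sqrt(N0))."""
    n0 = photon_count(power_w, tau_s, reduced_wavelength_m)
    return reduced_wavelength_m / (bounces * arm_m * math.sqrt(n0))


lam = 1.064e-6 / (2.0 * math.pi)  # ASSUMED laser: 1064 nm, reduced wavelength
h_shot = shot_noise_strain(10.0, 1e-3, lam, 100, 4000.0)
h_shot_4x = shot_noise_strain(40.0, 1e-3, lam, 100, 4000.0)
print(f"h_min (10 W) ~ {h_shot:.1e}, h_min (40 W) ~ {h_shot_4x:.1e}")
```

Quadrupling the laser power halves the shot-noise limit, which is the $I_0^{-1/2}$ scaling of Eq. (49).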

b. Radiation pressure noise. According to formula (49), the sensitivity of a detector can be increased by increasing the intensity of the laser. However, a very powerful laser produces a large radiation pressure on the mirrors. Then an uncertainty in the measurement of the momentum deposited on the mirrors leads to a proportional uncertainty in the position of the mirrors or, equivalently, in the measured change in the arm lengths. Then, the minimum detectable strain is limited by

$$h_{\min} \sim \frac{\tau}{m}\,\frac{b}{L}\left(\frac{\tau \hbar I_0}{c \lambda}\right)^{1/2}, \qquad (50)$$

where m is the mass of the mirrors. As we have seen, the photon shot noise decreases as the laser power increases, while the inverse is true for the noise due to radiation pressure fluctuations. If we try to minimize these two types of noise with respect to the laser power, we get a minimum detectable strain for the optimal power via the very simple relation

$$h_{\min} \sim \frac{1}{L}\left(\frac{\tau \hbar}{m}\right)^{1/2}, \qquad (51)$$

which for the LIGO detector (where the mass of the mirrors is ∼100 kg and the arm length is 4 km), for an observation time of 1 msec, gives $h_{\min} \approx 10^{-23}$.
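The quoted value can be reproduced from Eq. (51) with the numbers given in the text (100 kg mirrors, 4 km arms, 1 msec observation time). The function name below is illustrative.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s


def optimal_strain(arm_m, mirror_kg, tau_s):
    """Eq. (51): h_min ~ (1 / L) * sqrt(tau * hbar / m)."""
    return math.sqrt(tau_s * HBAR / mirror_kg) / arm_m


h_opt = optimal_strain(4000.0, 100.0, 1e-3)
print(f"h_min ~ {h_opt:.1e}")
```

The result is just under $10^{-23}$, in agreement with the order-of-magnitude figure above.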

c. Quantum limit. An additional source of uncertainty in the measurements is set by Heisenberg’s principle, which says that the knowledge of the position and the momentum of a body is restricted by the relation $\Delta x \cdot \Delta p \geq \hbar$. For an observation that lasts some time τ, the smallest measurable displacement of a mirror of mass m is $\Delta L$; assuming that the momentum uncertainty is $\Delta p \approx m \cdot \Delta L/\tau$, we get a minimum detectable strain due to quantum uncertainties

$$h_{\min} = \frac{\Delta L}{L} \sim \frac{1}{L}\left(\frac{\tau \hbar}{m}\right)^{1/2}. \qquad (52)$$

Surprisingly, this is identical to the optimal limit that we calculated earlier for the other two types of noise. The standard quantum limit does set a fundamental limit on the sensitivity of beam detectors. An interesting feature of the quantum limit is that it depends only on a single parameter, the mass of the mirrors.
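The coincidence between the two limits can be made explicit with a short worked calculation. The sketch below assumes the shot-noise and radiation-pressure limits scale as $I_0^{-1/2}$ and $I_0^{1/2}$, respectively, with the prefactors A and B introduced here only for bookkeeping; order-one numerical factors are ignored throughout.

```latex
% Shot noise and radiation pressure as functions of the laser power I_0
h_{\rm shot} = A\, I_0^{-1/2}, \qquad
A = \frac{1}{bL}\left(\frac{\hbar c \lambda}{\tau}\right)^{1/2}, \qquad
h_{\rm rp} = B\, I_0^{1/2}, \qquad
B = \frac{\tau b}{mL}\left(\frac{\tau \hbar}{c \lambda}\right)^{1/2}.

% The two contributions are equal at I_0 = A/B, where each equals sqrt(AB)
I_0^{\rm opt} = \frac{A}{B} = \frac{m c \lambda}{b^{2} \tau^{2}}, \qquad
h_{\min} \sim \sqrt{AB} = \frac{1}{L}\left(\frac{\tau \hbar}{m}\right)^{1/2}.

% Heisenberg's relation with \Delta x = \Delta L and \Delta p \approx m\,\Delta L/\tau
\Delta L \cdot \frac{m\,\Delta L}{\tau} \geq \hbar
\;\Longrightarrow\;
\Delta L \geq \left(\frac{\tau \hbar}{m}\right)^{1/2}
\;\Longrightarrow\;
h_{\min} = \frac{\Delta L}{L} \sim \frac{1}{L}\left(\frac{\tau \hbar}{m}\right)^{1/2}.
```

The number of bounces b and the wavelength λ cancel in the product AB, which is why only the mirror mass (together with the arm length and observation time) survives in the final expression.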

d. Seismic noise. At frequencies below 60 Hz, the noise in the interferometers is dominated by seismic noise. This noise is due to geological activity of the earth and human sources, e.g., traffic and explosions. The vibrations of the ground couple to the mirrors via the wire suspensions which support them. This effect is strongly suppressed by properly designed suspension systems. Still, seismic noise is very difficult to eliminate at frequencies below 5–10 Hz.

e. Residual gas-phase noise. The statistical fluctuations of the residual gas density induce a fluctuation of the refractive index and consequently of the monitored phase shift. Hence, the residual gas pressure through which the laser beams travel should be extremely low. For this reason the laser beams are enclosed in pipes over their entire length. Inside the pipes a high vacuum of the order of $10^{-9}$ torr guarantees elimination of this type of noise.

Prototype laser interferometric detectors have been constructed in the United States, Germany, and the United Kingdom. These detectors have an arm length of a few tens of meters and they have achieved sensitivities of the order of $h \sim 10^{-19}$. A new generation of laser interferometric detectors is under construction and their operation will start by the year 2001, with the first science run to commence around 2002–2003. The American LIGO (Laser Interferometer Gravitational-Wave Observatory) project consists of two detectors with arm length of 4 km, one in Hanford, Washington, and one in Livingston, Louisiana. The detector in Hanford includes, in the same vacuum system, a second detector with an arm length of 2 km.

The Italian/French Virgo detector of arm length 3 km at Cascina near Pisa, Italy, is designed to have better sensitivity at lower frequencies. GEO600 is a German/British detector built in Hannover, Germany. It has a 600 m arm length and is going to be in operation roughly at the same time as LIGO. The completed TAMA300 detector in Tokyo has an arm length of 300 m and is at an advanced stage of testing of the various components.

2. Space Detectors

Both bar and laser interferometers are high-frequency detectors, but there are a number of interesting gravitational wave sources which emit signals at lower frequencies. The seismic noise provides an insurmountable obstacle in any earth-based experiment and the only way to overcome this barrier is to fly a laser interferometer in space. LISA (Laser Interferometer Space Antenna) is such a system. It has been proposed by European and American scientists and has been adopted by the European Space Agency (ESA) as a cornerstone mission; recently NASA joined the effort. The launch date is expected to be around 2008.

FIGURE 7 Schematic design of the space interferometer LISA.

LISA will consist of three identical drag-free spacecraft forming an equilateral triangle with one spacecraft at each vertex (Fig. 7). The distance between any two vertices (the arm length) is $5 \times 10^{6}$ km. The spacecraft will be placed into the same heliocentric orbit as earth, but about 20° behind earth. The equilateral triangle will be inclined at an angle of 60° with respect to earth’s orbital plane. The three spacecraft will track each other optically by using laser beams. Because of the diffraction losses it is not feasible to reflect the beams back and forth as is done with LIGO. Instead, each spacecraft will have its own laser. The lasers will be phase locked to each other, achieving the same kind of phase coherence as LIGO does with mirrors. The configuration will function as three partially independent and partially redundant gravitational wave interferometers.

At frequencies $f \geq 10^{-3}$ Hz, LISA’s noise is mainly due to photon shot noise. The sensitivity curve steepens at $f \sim 3 \times 10^{-2}$ Hz because at larger frequencies the gravitational wave’s period is shorter than the round-trip light travel time in each arm. For $f \leq 3 \times 10^{-2}$ Hz, the noise is due to buffeting-induced random motions of the spacecraft and cannot be removed by the drag-compensation system. LISA’s sensitivity is roughly the same as that of LIGO, but at $10^{5}$ times lower frequency. Since the gravitational wave energy flux scales as $F \sim f^{2}h^{2}$, this corresponds to $10^{10}$ times better energy sensitivity than LIGO.
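Two of the numbers in this paragraph follow from simple scalings and can be checked with a short sketch. The function names are illustrative; the only inputs are the $5 \times 10^{6}$ km arm length, the $10^{5}$ frequency ratio, and the flux scaling $F \sim f^{2}h^{2}$ stated above.

```python
C_KM_S = 299_792.458  # speed of light, km/s


def round_trip_frequency_hz(arm_km):
    """Frequency whose period equals the round-trip light travel time in one arm."""
    return C_KM_S / (2.0 * arm_km)


def energy_sensitivity_gain(freq_ratio, strain_ratio=1.0):
    """With F ~ f^2 h^2, factor by which the detectable flux is reduced."""
    return 1.0 / (freq_ratio ** 2 * strain_ratio ** 2)


f_rt = round_trip_frequency_hz(5.0e6)  # close to 3e-2 Hz
gain = energy_sensitivity_gain(1e-5)   # same strain, 1e5 times lower frequency
print(f"round-trip frequency ~ {f_rt:.3f} Hz, energy sensitivity gain ~ {gain:.1e}")
```

The round-trip frequency lands at about 0.03 Hz, matching the point where the sensitivity curve is said to steepen.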

3. Satellite Tracking

The Doppler delay of communication signals between earth-based stations and spacecraft underlies another type of gravitational wave detector. A radio signal of frequency $\nu_0$ is transmitted to a spacecraft and is coherently transported back to earth, where it is received and its frequency measured with a highly stable clock (typically a hydrogen maser). The relative change $\Delta\nu/\nu_0$ as a function of time is monitored. A gravitational wave propagating through the solar system causes small perturbations in $\Delta\nu/\nu_0$. The relative shift in the frequency of the signals is proportional to the amplitude of gravitational waves. With this technique, broad-band searches are possible in the millihertz frequency band, and thanks to very stable atomic clocks it is possible to achieve sensitivities of order $h_{\min} \sim 10^{-13}$–$10^{-15}$. Noise sources that affect the sensitivity of Doppler tracking experiments can be divided into two broad classes: (a) instrumental and (b) related to propagation. At the high-frequency end of the band accessible to Doppler tracking, thermal noise dominates over all other noise sources, typically at about 0.1 Hz. Among all other sources of instrumental noise (transmitter and receiver, mechanical stability of the antenna, stability of the spacecraft, etc.), clock noise has been shown to be the most important instrumental source of frequency fluctuations. The propagation noise is due to fluctuations in the index of refraction of the troposphere, ionosphere, and interplanetary solar plasma. Both NASA and ESA have performed such measurements and there is continued effort in this direction.

4. Pulsar Timing

Pulsars are extremely stable clocks and by measuring irregularities in their pulses we expect to set upper limits on background gravitational waves (see next section). If an observer monitors simultaneously two or more pulsars, the correlation of their signals can be used to detect gravitational waves. Since such observation requires time scales of the order of 1 year, this means that the waves have to be of extremely low frequencies.

IV. ASTRONOMICAL SOURCES OF GRAVITATIONAL WAVES

The new generation of gravitational wave detectors (LIGO, Virgo) have very good chances of detecting gravitational waves, but until these expectations are fulfilled, we can only make educated guesses as to the possible astronomical sources of gravitational waves. The detectability of these sources depends on three parameters: their intrinsic gravitational wave luminosity, their event rate, and their distance from the earth. The luminosity can be approximately estimated via the quadrupole formula discussed earlier. Even though there are certain restrictions in its applicability (weak field, slow motion), it provides a very good order-of-magnitude estimate for the expected gravitational wave flux on earth. The rate at which various events with high luminosity in gravitational waves take place is extrapolated from astronomical observations in the electromagnetic spectrum. Still, there might be a number of gravitationally luminous sources, for example, binary black holes, for which we have no direct observations in the electromagnetic spectrum. Finally, the amplitude of gravitational wave signals decreases as one over the distance to the source. Thus, a signal from a supernova explosion might be clearly detectable if the event takes place in our galaxy (2–3 events per century) but it is highly unlikely to be detected if the supernova explosion occurs at far greater distances, of order 100 Mpc, where the event rate is high and at least a few events per day take place. All three factors have to be taken into account when discussing sources of gravitational waves, but we will not discuss this matter further, as this is treated elsewhere.

It was mentioned earlier that the frequency of gravitational waves is proportional to the square root of the mean density of the emitting system; this is approximately true for any gravitating system. For example, neutron stars usually have masses of around 1.4 solar masses and radii of the order of 10 km; thus, if we use these numbers in the relation $f \sim \sqrt{GM/R^{3}}$, we find that an oscillating neutron star will emit gravitational waves primarily at frequencies of 2–3 kHz. By analogy, a black hole 100 times more massive than the sun will have a radius of ∼300 km and the natural oscillation frequency will be around 100 Hz. Finally, for a binary system, Kepler’s law (see Section II.E) provides a direct and accurate estimation of the frequency of the emitted gravitational waves. For two 1.4-solar-mass neutron stars orbiting around each other at a distance of 160 km, Kepler’s law predicts an orbital frequency of 50 Hz, which leads to an observed gravitational wave frequency of 100 Hz.
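The three estimates in this paragraph can be reproduced with a few lines of code. The sketch below makes one interpretive assumption that is not stated in the text: $\sqrt{GM/R^{3}}$ is treated as an angular frequency and divided by 2π to obtain a frequency in hertz. The constants are rounded and the function names are illustrative.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30  # solar mass, kg


def dynamical_frequency_hz(mass_kg, radius_m):
    """f ~ sqrt(G M / R^3) / (2 pi); the 2 pi is an assumption of this sketch."""
    return math.sqrt(G * mass_kg / radius_m ** 3) / (2.0 * math.pi)


def orbital_frequency_hz(m1_kg, m2_kg, separation_m):
    """Kepler's third law for a circular binary: Omega^2 = G (m1 + m2) / a^3."""
    return math.sqrt(G * (m1_kg + m2_kg) / separation_m ** 3) / (2.0 * math.pi)


f_ns = dynamical_frequency_hz(1.4 * M_SUN, 10e3)     # neutron star
f_bh = dynamical_frequency_hz(100.0 * M_SUN, 300e3)  # 100 solar-mass black hole
f_orb = orbital_frequency_hz(1.4 * M_SUN, 1.4 * M_SUN, 160e3)
f_gw = 2.0 * f_orb  # quadrupole radiation is emitted at twice the orbital frequency

print(f"neutron star ~ {f_ns:.0f} Hz, black hole ~ {f_bh:.0f} Hz")
print(f"binary: orbital ~ {f_orb:.0f} Hz, gravitational wave ~ {f_gw:.0f} Hz")
```

With these inputs the neutron star comes out near 2 kHz, the black hole near 100 Hz, and the binary near 50 Hz orbital frequency, in line with the values quoted above.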

A. Radiation from Gravitational Collapse

Type II supernovae are associated with the core collapse of a massive star together with a shock-driven expansion of a luminous shell which leaves behind a rapidly rotating neutron star or, if the core has mass of >2–3 solar masses, a black hole. The typical signal from such an explosion is broadband and peaked at around 1 kHz. Detection of such a signal has been the goal of detector development over the last three decades. However, we still know little about the efficiency with which this process produces gravitational waves. For example, an exactly spherical collapse will not produce any gravitational radiation at all. The key issue is the kinetic energy of the nonspherical motions since the gravitational wave amplitude is proportional to this [Eq. (30)]. After 30 years of theoretical and numerical attempts to simulate gravitational collapse, there is still no great progress in understanding the efficiency of this process in producing gravitational waves. For a conservative estimate of the energy in nonspherical motions during the collapse, relation (31) leads to events of an amplitude detectable in our galaxy, even by bar detectors. The next generation of laser interferometers would be able to detect such signals from the Virgo cluster at a rate of a few events per month.

The main source of nonsphericity during the collapse is the angular momentum. During the contraction phase, the angular momentum is conserved and the star spins up to rotational periods of the order of 1 msec. In this case, a number of subsequent processes with large luminosity might take place in this newly born neutron star. A number of instabilities, such as the so-called bar mode instability and the r-mode instability, may occur which radiate copious amounts of gravitational radiation immediately after the initial burst. Gravitational wave signals from these rotationally induced stellar instabilities are detectable from sources in our galaxy and are marginally detectable if the event takes place in the nearby cluster of about 2500 galaxies, the Virgo cluster, 15 Mpc away from the earth. Additionally, there will be weaker but extremely useful signals due to subsequent oscillations of the neutron star; f, p, and w modes are some of the main patterns of oscillations (normal modes) of the neutron star that observers might search for. These modes have been studied in detail, and once detected in the signal, they would provide a sensitive probe of the neutron star structure and its supranuclear equation of state. Detectors with high sensitivity in the kilohertz band will be needed in order to fully develop this so-called gravitational wave asteroseismology.

If the collapsing central core is unable to drive off its surrounding envelope, then the collapse continues and finally a black hole forms. In this case the instabilities and oscillations discussed above are absent and the newly formed black hole radiates away within a few milliseconds any deviations from axisymmetry and ends up as a rotating or Kerr black hole. The characteristic oscillations of black holes (normal modes) are well studied, and this unique ringing down of a black hole could be used as a direct probe of their existence. The frequency of the signal is inversely proportional to the black hole mass. For example, it was stated earlier that a 100-solar-mass black hole will oscillate at a frequency of ∼100 Hz (an ideal source for LIGO), while a supermassive one with mass $10^{7}$ solar masses, which might be excited by an infalling star, will ring down at a frequency of $10^{-3}$ Hz (an ideal source for LISA). The analysis of such a signal should reveal directly the two parameters that characterize any (uncharged) black hole, namely its mass and angular momentum.
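The inverse scaling of the ringdown frequency with mass connects the two examples in this paragraph. A minimal sketch, normalized to the 100-solar-mass, 100 Hz reference point quoted in the text (the function name and the normalization parameters are illustrative, and the dependence on spin is ignored):

```python
def ringdown_frequency_hz(mass_solar, ref_mass_solar=100.0, ref_freq_hz=100.0):
    """Scale f proportional to 1/M from a reference black hole (spin ignored)."""
    return ref_freq_hz * ref_mass_solar / mass_solar


f_smbh = ringdown_frequency_hz(1e7)
print(f"10^7 solar-mass black hole rings down at ~ {f_smbh:.0e} Hz")
```

Scaling the reference point by a factor of $10^{5}$ in mass lowers the frequency by the same factor, giving the $10^{-3}$ Hz value quoted for LISA.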

B. Radiation from Spinning Neutron Stars

A perfectly axisymmetric rotating body does not emit any gravitational radiation. Neutron stars are axisymmetric configurations, but small deviations cannot be ruled out. Irregularities in the crust (perhaps imprinted at the time of crust formation), strains that have built up as the stars have spun down, off-axis magnetic fields, and/or accretion could distort the axisymmetry. A bump that might be created at the surface of a neutron star spinning with frequency f will produce gravitational waves at a frequency of 2f and such a neutron star will be a weak but continuous and almost monochromatic source of gravitational waves. The radiated energy comes at the expense of the rotational energy of the star, which leads to a spindown of the star. If gravitational wave emission contributes considerably to the observed spindown of pulsars, then we can estimate the amount of the emitted energy. The corresponding amplitude of gravitational waves from nearby pulsars (a few kpc away) is of the order of $h \sim 10^{-25}$–$10^{-26}$, which is extremely small. If we accumulate data for a sufficiently long time, e.g., 1 month, then the effective amplitude, which increases as the square root of the number of cycles, could easily go up to the order of $h_c \sim 10^{-22}$. We must admit that we are extremely ignorant of the degree of asymmetry in rotating neutron stars, and these estimates are probably very optimistic. On the other hand, if we do not observe gravitational radiation from a given pulsar we can place a constraint on the degree of nonaxisymmetry of the star.
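The gain from long integration can be illustrated with a short calculation of the effective amplitude, taken as the signal amplitude multiplied by the square root of the number of observed cycles. The gravitational wave frequency of 60 Hz used below (a star spinning at 30 Hz) is a hypothetical choice for illustration, as are the function and variable names.

```python
import math


def effective_amplitude(h, gw_freq_hz, obs_time_s):
    """Effective amplitude sqrt(n) * h, with n = f * T cycles observed."""
    n_cycles = gw_freq_hz * obs_time_s
    return math.sqrt(n_cycles) * h


MONTH_S = 30 * 24 * 3600  # one month, taken as 30 days

# HYPOTHETICAL pulsar emitting at 60 Hz with h = 1e-26
h_c = effective_amplitude(1e-26, 60.0, MONTH_S)
print(f"effective amplitude after one month ~ {h_c:.1e}")
```

About $10^{8}$ cycles are accumulated in a month at this frequency, raising the effective amplitude by four orders of magnitude to roughly $10^{-22}$.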

C. Radiation from Binary Systems

Binary systems are the best sources of gravitational waves because they emit copious amounts of gravitational radiation, and for every system we know exactly the amplitude and frequency of the gravitational waves in terms of the masses of the two bodies and their separation (see Section II.E). If a binary system emits detectable gravitational radiation in the bandwidth of our detectors, we can easily identify the parameters of the system. According to the formulas of Section II.E, the observed frequency change will be $\dot f \sim f^{11/3} M_{\rm chirp}^{5/3}$ and the corresponding amplitude will be $h \sim M_{\rm chirp}^{5/3} f^{2/3}/r = \dot f/(f^{3} r)$, where $M_{\rm chirp}^{5/3} = \mu M^{2/3}$ is a combination of the total and reduced mass of the system called the chirp mass. Since both the frequency $f$ and its rate of change $\dot f$ are measurable quantities, we can immediately compute the chirp mass (from the first relation), thus obtaining a measure of the masses involved. The second relation provides a direct estimate of the distance of the source. These relations have been derived using the Newtonian theory to describe the orbit of the system and the quadrupole formula for the emission of gravitational waves. Post-Newtonian theory (inclusion of the most important relativistic corrections in the description of the orbit) can provide more accurate estimates of the individual masses of the components of the binary system.
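Written out, the two scaling relations can be inverted for the quantities of astrophysical interest. This is only a restatement of the proportionalities above, with all numerical constants (factors of G, c, and order-one coefficients) suppressed:

```latex
% Chirp mass from the measured frequency and its rate of change
\dot f \propto f^{11/3} M_{\rm chirp}^{5/3}
\;\Longrightarrow\;
M_{\rm chirp} \propto \left(\dot f\, f^{-11/3}\right)^{3/5} = \dot f^{\,3/5} f^{-11/5}.

% Distance from the measured amplitude, frequency, and frequency drift
h \propto \frac{M_{\rm chirp}^{5/3} f^{2/3}}{r} = \frac{\dot f}{f^{3} r}
\;\Longrightarrow\;
r \propto \frac{\dot f}{f^{3} h}.
```

The chirp mass drops out of the second relation, so the distance is fixed by observable quantities alone.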

When analyzing the data of periodic signals, the effective amplitude is not the amplitude of the signal alone, but $h_c = \sqrt{n} \cdot h$, where n is the number of cycles of the signal within the frequency range where the detector is sensitive. A system consisting of two typical neutron stars will be detectable by LIGO from the time the frequency of the gravitational waves is ∼10 Hz until the final coalescence around 1000 Hz. This process will last for about 15 min and the total number of observed cycles will be of the order of $10^{4}$, which leads to an enhancement of the detectability by a factor of 100. Binary neutron star systems and binary black hole systems with masses of the order of 50 solar masses are the primary sources for LIGO. Given the anticipated sensitivity of LIGO, binary black hole systems are the most promising sources and could be detected as far as 200 Mpc away. The event rate with the present estimated sensitivity of LIGO is probably a few events per year, but future improvement of detector sensitivity (the LIGO II phase) could lead to the detection of at least one event per month. Supermassive black hole systems of a few million solar masses are the primary sources for LISA. These binary systems are rare, but due to the huge amount of energy released, they should be detectable from as far as the boundaries of the observable universe.
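The duration and cycle count quoted for a neutron star binary can be checked against the Newtonian quadrupole chirp. The sketch below assumes the standard coefficient $\dot f = (96/5)\,\pi^{8/3} (G M_{\rm chirp}/c^{3})^{5/3} f^{11/3}$, which is not written out in this section, and integrates it analytically; the function names are illustrative.

```python
import math

G_MSUN_OVER_C3 = 4.925491e-6  # G * M_sun / c^3, in seconds


def chirp_mass_solar(m1, m2):
    """Chirp mass from M_chirp^(5/3) = mu * M^(2/3), in solar masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def time_to_coalescence_s(mchirp_solar, f_hz):
    """Time left before merger when the wave frequency is f (Newtonian chirp)."""
    tm = G_MSUN_OVER_C3 * mchirp_solar
    return (5.0 / 256.0) * tm ** (-5.0 / 3.0) * (math.pi * f_hz) ** (-8.0 / 3.0)


def cycles_in_band(mchirp_solar, f_lo_hz, f_hi_hz):
    """Number of wave cycles emitted while sweeping from f_lo to f_hi."""
    tm = G_MSUN_OVER_C3 * mchirp_solar
    sweep = f_lo_hz ** (-5.0 / 3.0) - f_hi_hz ** (-5.0 / 3.0)
    return tm ** (-5.0 / 3.0) * sweep / (32.0 * math.pi ** (8.0 / 3.0))


mc = chirp_mass_solar(1.4, 1.4)
t_insp = time_to_coalescence_s(mc, 10.0)
n_cyc = cycles_in_band(mc, 10.0, 1000.0)
snr_gain = math.sqrt(n_cyc)

print(f"chirp mass ~ {mc:.2f} solar masses")
print(f"time in band ~ {t_insp / 60.0:.0f} min, cycles ~ {n_cyc:.1e}, "
      f"sqrt(n) ~ {snr_gain:.0f}")
```

For two 1.4-solar-mass stars this gives roughly a quarter of an hour in band and of order $10^{4}$ cycles, so the enhancement factor $\sqrt{n}$ is close to 100, as stated above.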

D. Cosmological Gravitational Waves

One of the strongest pieces of evidence in favor of the Big Bang scenario is the 2.7 K cosmic microwave background radiation. This thermal radiation first bathed the universe around 1 million years after the Big Bang. By contrast, the gravitational radiation background anticipated by theorists was produced at Planck times, i.e., at $10^{-43}$ sec or earlier after the Big Bang. Such gravitational waves have traveled almost unimpeded through the universe since they were generated. The observation of cosmological gravitational waves will be one of the most important contributions of gravitational wave astronomy. These primordial gravitational waves will be, in a sense, another source of noise for our detectors and so they will have to be much stronger than any other internal detector noise in order to be detected. Otherwise, confidence in detecting such primordial gravitational waves could be gained by using a system of two detectors and cross-correlating their outputs. The two LIGO detectors are well placed for such a correlation.

SEE ALSO THE FOLLOWING ARTICLES

COSMOLOGY • GLOBAL GRAVITY MODELING • GRAVITATIONAL WAVE ASTRONOMY • NEUTRON STARS • PULSARS • RELATIVITY, GENERAL • SUPERNOVAE

BIBLIOGRAPHY

Blair, D. G. (1991). “The Detection of Gravitational Waves,” Cambridge University Press, Cambridge.

Marck, J.-A., and Lasota, J.-P. (eds.). (1997). “Relativistic Gravitation and Gravitational Radiation,” Cambridge University Press, Cambridge.

Saulson, P. R. (1994). “Fundamentals of Interferometric Gravitational Wave Detectors,” World Scientific, Singapore.

Thorne, K. S. (1987). “Gravitational radiation.” In “300 Years of Gravitation” (Hawking, S. W., and Israel, W., eds.), Cambridge University Press, Cambridge.


Encyclopedia of Physical Science and Technology EN007J-312 June 29, 2001 19:43

Heat Transfer
George Alanson Greene
Brookhaven National Laboratory

I. Conduction Heat Transfer
II. Heat Transfer by Convection
III. Thermal Radiation Heat Transfer
IV. Boiling Heat Transfer
V. Physical and Transport Properties

GLOSSARY

Boiling The phenomenon of heat transfer from a surface to a liquid with vaporization.

Conduction Heat transfer within a solid or a motionless fluid by transmission of mechanical vibrations and free electrons.

Convection Heat transfer within a flowing fluid by translation of macroscopic fluid volumes from hot regions to colder regions.

Heat transfer coefficient An engineering approximation defined by Newton’s law of cooling which relates the heat flux to the overall temperature difference in a system.

Thermal radiation The transport of thermal energy from a surface by nonionizing electromagnetic waves.

Temperature A scalar quantity which defines the internal energy of matter.

HEAT TRANSFER plays an essential role in everything that we do. Our bodies are exposed to a changing environment yet to live we must remain at 98.6°F. The dynamics of our planet’s atmosphere and oceans are driven by the seasonal variations in heat flux from our Sun. These dynamics, in turn, dictate whether it will rain or snow, whether there will be hurricanes or tornadoes, drought or floods, if crops will grow or die. We burn fuels to heat our homes, our power plants create steam to turn turbines to make electricity, and we reverse the process to cool our dwellings with air conditioning for comfort. There are several modes for the transport of heat which we experience daily. Among these are conduction of heat in solids by molecular vibrations, convection of heat in fluids by the motion of fluid elements from hot to cold regions, thermal radiation in which heat is transferred from surface to surface by electromagnetic radiation, and boiling heat transfer in which heat is transferred from a surface by causing a liquid-to-vapor phase change in an adjacent fluid.

I. CONDUCTION HEAT TRANSFER

Heat transfer in opaque solids occurs exclusively by the process of conduction. If a solid (or a stationary liquid or gas) is transparent or translucent, heat can be transferred in a solid by both conduction and radiation, and in fluids, heat can be transferred by conduction, radiation, and convection. In general, materials in which heat is transferred by conduction only are solid bodies; the addition of convection and radiation to these systems enters the solutions through conditions imposed at the boundaries.

A. Physics of Conduction and Thermal Conductivity

The mechanisms of heat conduction depend to a great extent on the structure of the solid. For metals and other electrically conducting materials, heat is conducted through a solid by atomic and molecular vibrations about their equilibrium positions and by the mobility of free conduction-band electrons through the solid, the same electrons which conduct electricity. There is, in fact, a rigorous relationship between the thermal and the electrical conductivities in metals, known as the Wiedemann–Franz law. In nonmetallic or dielectric materials, lattice vibrations induced by atomic vibrations, otherwise known as phonons, are the principal mechanism of heat conduction. A phonon can be considered a quantum of thermal energy in a thermoelastic wave of fixed frequency passing through a solid, much like a photon is a quantum of energy of a fixed frequency in electromagnetic radiation theory, hence the origin of the name. It is the absence of free conduction-band electrons that makes dielectrics poor heat and electrical conductors, relying only on phonons or lattice vibrations to transfer heat energy through a solid. This is intuitively less efficient than conduction in a metal or conductor, as will be discussed. It is also clear that in dielectrics which rely on phonon transport for heat transfer, anything which reduces the phonon transport in a material will correspondingly reduce its heat transfer efficiency. An example is the effect of dislocations or impurities in crystals and alloying in metals.

The roots of our understanding of thermal conductivity by phonon transport come from kinetic theory and particle physics. Phonon transport in a dielectric is analogous to the thermal conductivity of a gas, which depends on collisions between gas molecules to transfer heat. Considering the phonons as particles in the spirit of the dual particle-wave nature of electromagnetic theory, the thermal conductivity of a dielectric solid can be shown to be represented by the relationship $k_p = \rho c_v v \lambda / 3$, where $\rho$ is the phonon density, $c_v$ is the heat capacity at constant volume, $v$ is the average phonon velocity, and $\lambda$ is the mean free path of the phonon. For heat conduction by phonon transport, the phonon velocity is on the order of the sound speed in the solid and the mean free path is on the order of the interatomic spacing. Although the phonon density increases with increasing temperature, the thermal conductivity may remain unchanged or even decrease as the temperature increases if the effect of the vibrations is to diminish the mean free path by an equivalent factor or more. If a dielectric is raised to a very high temperature, heat conduction is increased by thermal excitation of bound electrons which causes them to take on the characteristics of free electrons as in metals, hence the increased thermal conductivity as we find in metals. In extreme cases, this can be accompanied by electron or x-ray emission from the solid.

In metals, conduction by phonons is enhanced by conduction by electrons, as just described for high-temperature dielectrics. The derivation from quantum mechanics is parallel to that for phonon transport, except that $c$ is the electron heat capacity, $v$ is the Fermi velocity of the free electrons, and $\lambda$ is the electronic mean free path of the valence electrons. Due to its complexity, the details of the derivation of the electron contribution to the thermal conductivity will not be presented. However, it is easy to show that in a metal, the total thermal conductivity is the sum of the phonon contribution and the electron contribution, $k = k_p + k_e$. In pure metals, the electron contribution to the thermal conductivity may be 30 times greater than the phonon contribution at room temperature. For a more rigorous derivation of the thermal conductivity, the reader is directed to the literature on solid-state physics and quantum mechanics.

B. Fundamental Law of Heat Conduction

The second law of thermodynamics requires that heat is transferred from one body to another body only if the two bodies are at different temperatures, and that the heat flows from the body at the higher temperature to the body at the lower temperature. In essence, this is a statement that a thermal gradient must exist in the solid and that heat flows down the thermal gradient. In addition, the first law of thermodynamics requires that the thermal energy is conserved in the absence of heat sources or sinks in the body. It follows from this that a body has a temperature distribution which is a function of space and time, $T = T(x, y, z, t)$, and that the thermal field within the solid is constructed by the superposition of an infinite number of isothermal surfaces which never intersect, lest some point of intersection in space be simultaneously at two or more temperatures, which is impossible.

Consider a semi-infinite solid whose boundaries are parallel and isothermal at different temperatures. Eventually the temperature distribution within the body will become invariant with time and the heat flow from surface one to surface two becomes $q_{1-2} = -kA(T_2 - T_1)/d = kA(T_1 - T_2)/d$, where $q_{1-2}$ is the heat flow rate, $A$ is the area normal to the heat flow, $T_1$ and $T_2$ are the temperatures of the two isothermal bounding surfaces, and $d$ is the separation between the surfaces. In the limit that the separation along the coordinate normal to the isothermal plane approaches zero, this equation then becomes

$$q_n = -kA \frac{\partial T}{\partial n} \tag{1a}$$

or

$$q''_n = -k \frac{\partial T}{\partial n}, \tag{1b}$$

which is known as Fourier’s heat conduction equation. The heat flow per unit area per unit time across a surface is called the heat flux $q''$ and has units of W/m$^2$. The heat flux is a vector and can be calculated for any point in a solid if the temperature field and thermal conductivity are known.
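As a minimal sketch of how the steady slab relation and Eq. (1b) are used, the following Python fragment evaluates the heat flow through a plane slab and the corresponding flux. The function names and the numerical values are illustrative assumptions, not taken from the text; heat flow is counted positive from the hot face to the cold face.

```python
def slab_heat_flow(k, area, t_hot, t_cold, thickness):
    """Steady heat flow rate (W) through a plane slab, positive from hot to cold."""
    if thickness <= 0:
        raise ValueError("thickness must be positive")
    return k * area * (t_hot - t_cold) / thickness


def heat_flux(k, dT_dn):
    """Heat flux (W/m^2) from Eq. (1b): q'' = -k dT/dn."""
    return -k * dT_dn


# Illustrative values only (assumed, not from the text):
# k = 50 W/m K, 1 m^2 slab, 10 mm thick, 100 K temperature difference.
q = slab_heat_flow(k=50.0, area=1.0, t_hot=400.0, t_cold=300.0, thickness=0.01)

# The same result from the gradient form: dT/dn = (T_cold - T_hot)/d is negative,
# so the flux along n (hot to cold) is positive.
q_flux = heat_flux(k=50.0, dT_dn=(300.0 - 400.0) / 0.01)
print(q, q_flux)
```

For unit area the two results coincide, which is the sense in which Eq. (1b) is the limiting form of the slab relation.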

C. Differential Heat Conduction Equation

The differential heat conduction equations derive from the application of Fourier’s law of heat conduction, and the basic character of these equations is dependent upon shape and varies as a function of the coordinate system chosen to represent the solid. If Fourier’s equation is applied to a simple, isotropic solid in Cartesian coordinates and if the thermal conductivity is assumed to be constant, the equation for the transient conservation of thermal energy due to conduction of heat in a solid with a heat source (or heat sink) can be derived as follows,

$$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2} + \frac{q'''}{k} = \frac{1}{\alpha} \frac{\partial T}{\partial t}, \tag{2}$$

where $q'''$ is the volumetric heat source and $\alpha$ is the thermal diffusivity, $\alpha = k/(\rho c)$. If the heat source is equal to zero, this reduces to the Fourier equation. If the temperature in the solid is invariant with respect to time, this becomes the Poisson equation. Furthermore, if the temperature is time-invariant and the heat source is zero, this becomes the Laplace equation. Other forms of the thermal energy conservation equations in a solid can be derived for other coordinate systems, and the eventual solutions depend on the initial and boundary conditions which are imposed. It is the solutions to these equations that are modeled in commercially available computer analysis packages for heat transfer solutions in solids. The reader is referred to VanSant for a thorough listing of analytical solutions to the heat conduction equations in many coordinate systems and subject to numerous initial and boundary conditions in order to experience the elegance of analytical solutions to problems in heat transfer physics and to appreciate the relationship between heat transfer and mathematics.

A curiosity of the parabolic differential form of the heat conduction equation just presented is that it implies that the velocity of propagation of a thermal wave in a solid is infinite. This is a consequence of the fact that the solution predicts that the effects of a thermal disturbance in a solid are felt immediately at a distance infinitely removed from the disturbance itself. This is in spite of the definition of the thermal conductivity which is based upon finite speed of propagation of free electrons or phonons in matter. In practical applications, this outcome is inconsequential because the effect at infinity is generally small. However, there are circumstances in which this peculiarity in the equations may actually become significant and lead to erroneous results, for instance, in heat transfer problems at very low temperatures or very short time scales, in which cases the finite speed of propagation of heat becomes important. Two examples of such circumstances which can be encountered in practice are cryogenic heat transfer near absolute zero and rapid energy transfer in materials due to subatomic particles which travel at the speed of light. It has been suggested that the form of the differential equations for conduction heat transfer should be the damped-wave or the hyperbolic heat conduction equation, often called the telegraph equation, which includes the finite speed of propagation of heat, $C$, as shown below without derivation.

$$\frac{1}{C^2} \frac{\partial^2 T}{\partial t^2} + \frac{1}{\alpha} \frac{\partial T}{\partial t} = \frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2}. \tag{3}$$

For most practical problems in heat conduction, the solutions to the parabolic and hyperbolic heat conduction equations are essentially identical; however, the cautions offered in Eq. (3) should be evaluated in circumstances where the finite propagation speed could become important, especially when using commercial equation solvers which will undoubtedly not model the hyperbolic effect just described. Rendering the damped-wave equation dimensionless will reveal to the analyst when the wave propagation term and the diffusion term on the left-hand side of the hyperbolic heat conduction equation are of the same order of magnitude and both must be included in the solution, for instance, when $t \sim \alpha/C^2$.
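The scaling behind the estimate $t \sim \alpha/C^2$ can be sketched as follows. The reference time $t_0$, reference length $L$, and dimensionless temperature $\theta$ are introduced here only for illustration and are not part of the original presentation; the sketch is written for one space dimension, and the other two scale identically.

```latex
% Scale time and space in Eq. (3): t = t_0 t^*, x = L x^*, T -> \theta.
\frac{L^2}{C^2 t_0^2}\,\frac{\partial^2 \theta}{\partial t^{*2}}
  + \frac{L^2}{\alpha t_0}\,\frac{\partial \theta}{\partial t^{*}}
  = \frac{\partial^2 \theta}{\partial x^{*2}}

% Ratio of the wave-propagation coefficient to the diffusion coefficient:
\frac{L^2/(C^2 t_0^2)}{L^2/(\alpha t_0)} = \frac{\alpha}{C^2 t_0}
```

The ratio is of order one when $t_0 \sim \alpha/C^2$; for observation times much longer than this, the wave term is negligible and the parabolic equation is recovered.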

A continued discussion of conduction heat transfer in solids would require the solutions to many special heat transfer cases for which the parabolic heat conduction equation can be easily integrated. Examples of these can be found in every heat transfer textbook, and they will not be solved here. However, two examples will be discussed which illustrate powerful techniques for solving contemporary heat transfer problems. These are the lumped heat capacity approximation for transient heat conduction in solids with convection, and the numerical decomposition of the differential heat conduction equation for finite difference computer analysis. By necessity, these discussions will be brief but illustrative.


D. Lumped Heat Capacity Approximation in Transient Conduction

Some heat transfer systems, usually involving a “small” body or a body which is thin in the direction of heat transfer, can be analyzed under the assumption that their temperature is uniform spatially, only a function of time. This is called the lumped heat capacity assumption. The system can be analyzed as a function of time only, greatly simplifying the analysis. This situation can be illustrated by considering a small, spherical object at an initial temperature $T_0$ which is suddenly submerged in a fluid at temperature $T_\infty$ which imposes a heat transfer coefficient $h$ at the surface of the sphere, with the units W/m$^2$K. If the sphere has density $\rho$, specific heat $c_p$, surface area $A$, and volume $V$, the transient energy conservation equation can be written as follows:

$$\frac{dT(t)}{dt} = -\frac{hA}{\rho c_p V}\,(T(t) - T_\infty), \tag{4}$$

and the solution for the time-dependent dimensionless temperature of the sphere, $\theta = (T(t) - T_\infty)/(T_0 - T_\infty)$, becomes $\theta = \exp(-\tau) = \exp(-\mathrm{Bi} \cdot \mathrm{Fo})$, where Bi is the Biot number ($\mathrm{Bi} = h\ell/k$), which is the ratio of the internal heat transfer resistance to the external heat transfer resistance, Fo is the Fourier number ($\mathrm{Fo} = \alpha t/\ell^2$), the dimensionless time, and $\ell$ is the characteristic length of the body; consistency with Eq. (4) requires $\ell = V/A$. In order to simplify the solution for the transiently cooled sphere, a condition was imposed that the spatial variations of the temperature in the sphere were small. This condition is satisfied if the resistance to heat transfer inside the object is small compared to the external resistance to heat transfer from the sphere to the fluid. Mathematically, the condition is $\mathrm{Bi} \le 0.1$, that is, the internal resistance is at least an order of magnitude smaller than the external resistance. If this condition is satisfied, heat transfer solutions can be greatly simplified.
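The lumped solution is simple enough to evaluate directly. The sketch below assumes the characteristic length is $V/A$ (so that Bi · Fo reproduces the exponent implied by Eq. (4)); the function names and the material values are hypothetical and chosen only to illustrate the calculation.

```python
import math


def biot_number(h, k, volume, area):
    """Bi = h*l/k with the characteristic length l = V/A (assumed here)."""
    return h * (volume / area) / k


def fourier_number(alpha, t, volume, area):
    """Fo = alpha*t/l**2 with l = V/A."""
    length = volume / area
    return alpha * t / length**2


def lumped_temperature(t, t0, t_inf, h, k, rho, cp, volume, area):
    """Temperature from theta = exp(-Bi*Fo); meaningful only when Bi <= 0.1."""
    alpha = k / (rho * cp)
    bi = biot_number(h, k, volume, area)
    fo = fourier_number(alpha, t, volume, area)
    return t_inf + (t0 - t_inf) * math.exp(-bi * fo)


# Illustrative (assumed) values: a 10 mm diameter steel-like sphere in a fluid.
radius = 0.005
volume = 4.0 / 3.0 * math.pi * radius**3
area = 4.0 * math.pi * radius**2
props = dict(h=100.0, k=40.0, rho=7800.0, cp=470.0, volume=volume, area=area)

bi = biot_number(props["h"], props["k"], volume, area)
print("Bi =", bi, "lumped model applicable:", bi <= 0.1)
print("T(60 s) =", lumped_temperature(60.0, t0=500.0, t_inf=300.0, **props))
```

Checking the Biot number before using the exponential solution is the important design step: if Bi exceeds roughly 0.1, internal temperature gradients matter and the full conduction equation is needed.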

E. Finite Difference Representation of Steady-State Heat Conduction

In practice, it is frequently not possible to achieve analytical solutions to heat transfer problems in spite of simplifications and approximations. It is often necessary to resort to numerical solutions because of complexities involving geometry and shape, variable physical and transport properties, and complex and variable initial and boundary conditions. Figure 1 shows a rectilinear two-dimensional solid, divided into a grid of equally spaced nodes; it will be assumed that a steady-state temperature field exists. Three nodes are depicted in Fig. 1: (a) an interior node which is surrounded by other nodes in the solid, (b) a node on the insulated boundary of the solid, and (c) a node on the convective boundary of the solid.

FIGURE 1 Cartesian coordinate grid for finite difference analysis.

An analytical solution to such a simple heat transfer problem could be a formidable task; however, analysis by decomposing the energy equation into a form suitable for finite difference numerical analysis will greatly simplify the task. In other words, an energy balance is performed on each shaded control volume (one for every node), allowing for heat transfer across each face of the control volume from surrounding nodes or, in the case of the convective boundary, the surrounding fluid. Assuming for convenience that $\Delta x = \Delta y$, that the temperature is not a function of time, and that the thermal conductivity is a constant, the finite difference equations can be easily derived, and they are presented here for the three nodes in the example in Fig. 1.

$$\text{Interior node:} \quad 0 = T_2 + T_3 + T_4 + T_5 - 4T_1$$

$$\text{Insulated boundary:} \quad 0 = T_2 + T_4 + 2T_3 - 4T_1 \tag{5}$$

$$\text{Convective boundary:} \quad 0 = \frac{1}{2}(2T_3 + T_2 + T_4) + \left(\frac{h \cdot \Delta x}{k}\right) T_\infty - \left(\frac{h \cdot \Delta x}{k} + 2\right) T_1.$$

Note the appearance in the convective boundary node equation of the term $(h \Delta x/k)$, the finite difference form of the Biot number which was introduced in the preceding section. Such equations can be written for all the nodes and assembled in a manner convenient for iterative solution. Simple examples such as those shown here will rapidly converge; more complex problems will require more complex algorithms and stringent convergence criteria.
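One way to assemble the three node equations for iterative solution is a Gauss–Seidel sweep, sketched below. The geometry is an assumption made for illustration and is not the layout of Fig. 1: the top and bottom rows are held at fixed temperatures, the left column is insulated, and the right column exchanges heat with a fluid at `t_inf`. The name `solve_plate` and all numerical values are hypothetical.

```python
def solve_plate(nx, ny, t_top, t_bottom, t_inf, biot, tol=1e-8, max_iter=100000):
    """Gauss-Seidel solution of the node equations (5) on an nx-by-ny grid.

    Assumed layout: rows 0 and ny-1 are fixed-temperature boundaries,
    column 0 is insulated, column nx-1 is convective with grid Biot
    number biot = h*dx/k.  T[j][i] is the temperature at row j, column i.
    """
    guess = 0.5 * (t_top + t_bottom)
    T = [[guess] * nx for _ in range(ny)]
    for i in range(nx):
        T[0][i] = t_bottom
        T[ny - 1][i] = t_top

    for _ in range(max_iter):
        max_change = 0.0
        for j in range(1, ny - 1):
            for i in range(nx):
                up, down = T[j + 1][i], T[j - 1][i]
                if i == 0:
                    # Insulated boundary: 0 = T2 + T4 + 2*T3 - 4*T1
                    new = (up + down + 2.0 * T[j][1]) / 4.0
                elif i == nx - 1:
                    # Convective boundary:
                    # 0 = (2*T3 + T2 + T4)/2 + Bi*T_inf - (Bi + 2)*T1
                    new = (0.5 * (2.0 * T[j][i - 1] + up + down)
                           + biot * t_inf) / (biot + 2.0)
                else:
                    # Interior node: 0 = T2 + T3 + T4 + T5 - 4*T1
                    new = (up + down + T[j][i - 1] + T[j][i + 1]) / 4.0
                max_change = max(max_change, abs(new - T[j][i]))
                T[j][i] = new
        if max_change < tol:
            break
    return T


# Illustrative case: both fixed edges at 100, fluid at 0, h*dx/k = 1.
T = solve_plate(nx=6, ny=7, t_top=100.0, t_bottom=100.0, t_inf=0.0, biot=1.0)
print([round(v, 2) for v in T[3]])  # mid-height row, insulated side first
```

Each update solves one node equation for $T_1$ using the latest neighbor values, which is why no matrix needs to be stored; for larger grids a direct sparse solver or an accelerated iteration would normally be preferred.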


II. HEAT TRANSFER BY CONVECTION

In the preceding section, we discussed the mechanisms of conduction heat transfer as the sole agent which transports heat energy within a solid. Convection was only considered insofar as it entered the problem through the boundary conditions. For fluids, however, this is true only under the conditions that the fluid is motionless, a condition almost never realized in practice. In general, fluids are in motion either by pumping or by buoyancy, and the heat transfer in fluids in motion is enhanced over conduction because the moving fluid particles carry heat with them as internal energy; the transport of heat through a fluid by the motion of macroscopic fluid particles is called convection. We will now consider methods of modeling convective heat transfer and the concept of the heat transfer coefficient which is the fundamental variable in convection. The analysis of convective heat transfer is more complex than conduction in solids because the motion of the fluid must be considered simultaneously with the energy transfer process. The general approach assumes that the fluid is a continuum instead of the more basic and complex approach assuming individual particles. Although fundamental issues such as the thermodynamic state and the transport properties of the fluid cannot be solved theoretically by the continuum approach, the solutions to the fluid mechanics and heat transfer are made more tractable; parallel studies at the molecular level can resolve the thermodynamic and transport issues. In practice, the thermodynamic and transport properties, although available from theoretical studies on a molecular level, are generally input to the study of heat transfer empirically.

FIGURE 2 Schematic of internal flow and external flow boundary layers.

A. Internal and External Convective Flows

There are two general classes of problems in convective heat transfer: internal convection in channels and pipes in which the flow patterns become fully developed and spatially invariant after traversing an initial entrance length and the heat flux is uniform along the downstream surfaces, and external convection over surfaces which produces a shear or boundary layer which continues to grow in the direction of the flow and which never becomes fully developed or spatially invariant. Both internal duct flows and external boundary layer flows can be either laminar or turbulent, depending upon the magnitude of a dimensionless parameter of the fluid mechanics known as the Reynolds number. For internal flows, the flow is laminar if the Reynolds number is less than $2 \times 10^3$; for external flows, the rule of thumb is that the flow is laminar if the Reynolds number is less than $5 \times 10^5$. Schematic representations of both an internal flow case and an external boundary layer are shown in Figs. 2a,b, respectively.

B. Fluid Mechanics and the Reynolds Number

Steady flow in a channel (internal flow) and external flow over a boundary are governed by a balance of forces in the fluid in which inertial forces and pressure forces are balanced by viscous forces on the fluid. This leads to the familiar concept of a constant pressure drop in a water pipe which provides the force to overcome friction along the pipe walls and thus provides the desired flow rate of water out the other end. For Newtonian fluids, viscous or shear forces in the fluid are described by a relationship between the stress (force/unit area) between the fluid layers which results in a shear of the velocity field in the fluid as follows, $\tau = \mu \cdot \partial u/\partial y$, where $\tau$ is the shear stress in the fluid, $\partial u/\partial y$ is the rate of strain of the fluid, and $\mu$, the constant of proportionality, is a fluid transport property known as the dynamic viscosity. This is the Newtonian stress–strain relationship and it forms the basis for the fundamental equations of fluid mechanics. The force balance in the direction of flow which provides for a state of equilibrium on a fluid element in the flow can be written as a balance of differential pressure forces normal to the fluid element by tangential shear forces on the fluid element as shown in the following:

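$$-\frac{\partial P}{\partial x} + \frac{\partial \tau}{\partial y} = \sum F_x = \frac{D}{Dt}(\rho u). \tag{6}$$

Substituting the Newtonian stress–strain relationship into this force balance, we find that

$$-\frac{\partial P}{\partial x} + \mu \frac{\partial^2 u}{\partial y^2} = \rho \left[ u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} \right] \tag{7a}$$

and in dimensionless form, this becomes

$$-\frac{\partial P^*}{\partial x^*} + \left( \frac{\mu}{\rho U L} \right) \frac{\partial^2 u^*}{\partial y^{*2}} = u^* \frac{\partial u^*}{\partial x^*} + v^* \frac{\partial u^*}{\partial y^*}, \tag{7b}$$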

where the quantity $(\rho U L/\mu)$ is called the Reynolds number of the flow, and it represents the ratio of inertial forces to viscous forces in the fluid. The Reynolds number, sometimes written as $\mathrm{Re} = UL/\nu$, where $\nu$ is the kinematic viscosity, $\nu = \mu/\rho$, is the similarity parameter of fluid mechanics which provides the convenience of similarity solutions to general classes of fluid mechanics problems (i.e., pressure drop in laminar or turbulent pipe flow can be scaled by the Reynolds number, regardless of the velocity, diameter, or viscosity) and is the parameter which predicts when laminar conditions transition to turbulence. The Reynolds number plays a fundamental role in predicting convection and convective heat transfer.
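A short sketch of how the Reynolds number is used in practice follows. It combines $\mathrm{Re} = UL/\nu$ with the laminar limits quoted earlier for internal and external flows ($2 \times 10^3$ and $5 \times 10^5$); the function names and the water-like viscosity are assumptions for illustration, and the limits are rules of thumb rather than sharp boundaries.

```python
# Laminar limits quoted in the text for the two classes of flow.
LAMINAR_LIMIT = {"internal": 2.0e3, "external": 5.0e5}


def reynolds_number(velocity, length, kinematic_viscosity):
    """Re = U*L/nu, with L the pipe diameter or the distance along a plate."""
    return velocity * length / kinematic_viscosity


def flow_regime(re, kind):
    """Classify a flow as 'laminar' or 'turbulent' for kind in LAMINAR_LIMIT."""
    return "laminar" if re < LAMINAR_LIMIT[kind] else "turbulent"


# Illustrative (assumed) case: water-like fluid, nu = 1e-6 m^2/s,
# flowing at 1 m/s in a 50 mm pipe and over a 50 mm long plate.
re = reynolds_number(velocity=1.0, length=0.05, kinematic_viscosity=1.0e-6)
print(re, flow_regime(re, "internal"), flow_regime(re, "external"))
```

The same Reynolds number can therefore indicate turbulent flow in a pipe and laminar flow over a plate, because the characteristic length and the transition criterion differ between the two classes of problem.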

C. The Convective Thermal Energy Equation and the Nusselt Number

In order to solve for heat transfer in convective flows, an energy balance is constructed on an elemental fluid element. In Cartesian coordinates, this usually involves the balance of convection of heat into and out of the elemental volume in the direction of the flow (x-direction) and conduction of heat into and out of the elemental volume transverse to the flow (y-direction). Taylor series expansions of the convective heat flux in the x-direction and the conduction heat flux in the y-direction permit the representation of the heat balance on the differential fluid volume in differential form. The statement of thermal energy conservation on the differential unit volume of fluid $dx \cdot dy$ can be written in the form of the laminar boundary layer equation as

$$\rho c_p \left[ u \frac{\partial T}{\partial x} + v \frac{\partial T}{\partial y} \right] = k \frac{\partial^2 T}{\partial y^2}. \tag{8}$$

Furthermore, if the velocity, temperature, and coordinates are nondimensionalized by the characteristic scales of the problem, such as the maximum velocity $U$, the overall length $L$, and the overall temperature difference $T_w - T_\infty$, the dimensionless form of the laminar boundary layer equation becomes

$$u^* \frac{\partial \theta}{\partial x^*} + v^* \frac{\partial \theta}{\partial y^*} = \frac{1}{\mathrm{Re} \cdot \mathrm{Pr}} \frac{\partial^2 \theta}{\partial y^{*2}}, \tag{9}$$

subject to the appropriate boundary conditions. Note the appearance in Eq. (9) of the familiar Reynolds number, $\mathrm{Re} = UL/\nu$, and the appearance of another dimensionless parameter, the Prandtl number, $\mathrm{Pr} = \mu c_p/k$, which includes the physical and transport properties of fluid mechanics and heat transfer. In simple terms, the Prandtl number represents the ratio of the thickness of the hydrodynamic boundary layer to the thickness of the thermal boundary layer. If the Prandtl number equals unity, both boundary layers grow at the same rate. For all problems of convective heat transfer in fluids, the dominant dimensionless scaling or modeling parameters are the Reynolds number and the Prandtl number. Solutions to the thermal energy equations of convective heat transfer are complex and can only be described here in general terms. To express the overall effect of convection on heat transfer, we call upon Newton’s law of cooling given by

$$q = hA(T_w - T_\infty) = -kA \left( \frac{\partial T}{\partial y} \right)_w, \tag{10a}$$

where $h$ is the heat transfer coefficient which has units of W/m$^2$K. The heat transfer coefficient can also be written as

$$h = \frac{-k(\partial T/\partial y)_w}{(T_w - T_\infty)}. \tag{10b}$$

The dimensional solutions to the laminar boundary layer equations (and the thermal convective energy equations in general) involve solutions as shown previously for the heat transfer coefficient. The solution for the heat transfer coefficient may be nondimensionalized by multiplying by the characteristic length scale of the problem and dividing by the thermal conductivity. In this manner, the dimensionless heat transfer coefficient is introduced, and is called the Nusselt number,

$$\mathrm{Nu} = \frac{h \cdot x}{k} = f(\mathrm{Re}, \mathrm{Pr}), \tag{11}$$

which is the general form of most solutions in forced-flow convective heat transfer.


In closing the discussion on convective heat transfer, it would be useful to present two examples of the dimensionless heat transfer coefficient, the solution to the thermal energy equation, for the two cases presented earlier in Fig. 2. The first example is the laminar flow external boundary layer which was depicted in Fig. 2b. For the case of laminar boundary convective heat transfer over a horizontal, flat surface, the dimensionless heat transfer coefficient, which is the solution to the thermal energy equation, becomes as follows:

$$\mathrm{Nu}(x) = \frac{h(x) \cdot x}{k} = 0.332\, \mathrm{Pr}^{1/3}\, \mathrm{Re}(x)^{1/2}. \tag{12}$$

The second example is fully developed flow in a smooth circular pipe which was depicted in Fig. 2a. For the case of fully developed turbulent flow in a smooth circular pipe, the solution of the convective thermal energy equation becomes as follows:

$$\mathrm{Nu}_d = \frac{h \cdot d}{k} = 0.023\, \mathrm{Re}(d)^{0.8}\, \mathrm{Pr}^{n}, \tag{13}$$

where $n = 0.4$ for heating and $n = 0.3$ for cooling. Equation (13) is called the Dittus–Boelter equation for turbulent heat transfer in a pipe. The derivations given in this chapter were, by necessity, simplifications of more rigorous derivations which may be found in the bibliography.
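The two correlations can be evaluated with a few lines of code. The sketch below implements Eqs. (12) and (13) as written and converts a Nusselt number back to a heat transfer coefficient through $h = \mathrm{Nu} \cdot k/L$; the function names and the sample property values are assumptions made for illustration.

```python
def nusselt_flat_plate_laminar(re_x, pr):
    """Local Nusselt number for a laminar boundary layer, Eq. (12)."""
    return 0.332 * pr ** (1.0 / 3.0) * re_x ** 0.5


def nusselt_dittus_boelter(re_d, pr, heating=True):
    """Fully developed turbulent pipe flow, Eq. (13): n = 0.4 heating, 0.3 cooling."""
    n = 0.4 if heating else 0.3
    return 0.023 * re_d ** 0.8 * pr ** n


def heat_transfer_coefficient(nu, k, length):
    """Invert Nu = h*L/k for h (W/m^2 K); L is x for the plate, d for the pipe."""
    return nu * k / length


# Illustrative (assumed) values: Re = 1e4, Pr = 1, k = 0.6 W/m K, L = 0.05 m.
nu_plate = nusselt_flat_plate_laminar(re_x=1.0e4, pr=1.0)
nu_pipe = nusselt_dittus_boelter(re_d=1.0e4, pr=1.0, heating=True)
print(nu_plate, heat_transfer_coefficient(nu_plate, k=0.6, length=0.05))
print(nu_pipe, heat_transfer_coefficient(nu_pipe, k=0.6, length=0.05))
```

Each correlation applies only in its own regime (laminar external flow for Eq. (12), turbulent fully developed pipe flow for Eq. (13)), so a Reynolds number check should precede either call in real use.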

III. THERMAL RADIATION HEAT TRANSFER

In the preceding sections, we examined two fundamental modes of heat transfer, conduction and convection, and have shown how they are developed from a fundamental theoretical approach. We now turn our attention to the third fundamental mode of heat transfer, thermal radiation.

A. Physical Mechanisms of Thermal Radiation

Thermal radiation is the form of electromagnetic radiation that is emitted by a body as a result of its temperature. There are many types of electromagnetic radiation; some are ionizing and some are nonionizing. Electromagnetic radiation generally becomes more ionizing with increasing frequency, for instance, x-rays and γ-rays. At lower frequencies, electromagnetic radiation becomes less ionizing, for instance, visible, thermal, and radio wave radiation. However, this is not a hard and fast rule. The spectrum of thermal radiation includes the portion of the frequency band of the electromagnetic spectrum which includes infrared, visible, and ultraviolet radiation. Regardless of the type of electromagnetic radiation being considered, all electromagnetic radiation is propagated at the speed of light, $c = 3 \times 10^{10}$ cm/s, and this speed is equal to the product of the wavelength and frequency of the radiation, $c = \lambda\nu$, where $\lambda$ is the wavelength and $\nu$ is the frequency. The portion of the electromagnetic spectrum which is considered thermal covers the range of wavelength from 0.1 to 100 µm; in comparison, the visible light portion of the thermal spectrum in which humans can see is very narrow, covering only from 0.35 to 0.75 µm. If we were insects, we might see in the infrared range of the spectrum. If we did, warm bodies would look like multicolored objects but the glass windows in our homes would be opaque because infrared is reflected by glass just like visible light is reflected by mirrors. Since the windows in our homes are transparent to visible light, they let in solar radiation which is emitted in the visible spectrum; since they are opaque to infrared radiation, the surfaces in your house which radiate in the infrared do not radiate out to space at night; your house loses heat by conduction through the walls and convection from the outside surfaces.

The emission or propagation of thermal energy takes place as discrete photons, each having a quantum of energy $E$ given by $E = h\nu$, where $h$ is Planck’s constant ($h = 6.625 \times 10^{-34}$ J · s). An analogy is sometimes used to characterize the propagation of thermal radiation as particles such as the molecules of a gas, each having mass, momentum, and energy, a so-called photon “gas.” In this fashion, we have that the energy of the photons in the “photon gas” is $E = h\nu = mc^2$, the photon mass is $m = h\nu/c^2$, and the photon momentum is $p = h\nu/c$. It follows from statistical thermodynamics that the radiation energy density per unit volume per unit wavelength can be derived as

$$u_\lambda = \frac{8\pi h c \lambda^{-5}}{\exp(hc/\lambda kT) - 1}, \tag{14}$$

where $k$ is Boltzmann’s constant ($k = 1.38 \times 10^{-23}$ J/molecule · K). If the energy density of the radiating gas is integrated over all wavelengths, the total energy emitted is proportional to the absolute temperature of the emitting surface to the fourth power as

$$E_b = \sigma T^4, \tag{15}$$

where $E_b$ is the energy radiated by an ideal radiator (or black body) and $\sigma$ is the Stefan–Boltzmann constant ($\sigma = 5.67 \times 10^{-8}$ W/m$^2$K$^4$). $E_b$ is called the emissive power of a black body. The term “black body” should be taken with caution for although most surfaces which look black to the eye are ideal radiators, other surfaces such as ice and some white paints are also black at long wavelengths. Equation (15) is known as the Stefan–Boltzmann law of radiation heat transfer for an ideal thermal radiator. We have now developed three fundamental laws of classical heat transfer:


Fourier’s law of conduction heat transfer
Newton’s law of convective cooling
Stefan–Boltzmann law of thermal radiation

B. Radiation Properties

The properties of thermal radiation are not dissimilar to our experience with visible light. When thermal radiation is incident upon a surface, part is reflected, part is absorbed, and part is transmitted. The fraction reflected is called the reflectivity $\rho$, the fraction absorbed is called the absorptivity $\alpha$, and the fraction transmitted is called the transmissivity $\tau$. These three variables satisfy the identity that $\alpha + \rho + \tau = 1$. Since most solid bodies do not transmit thermal radiation, $\tau = 0$ and the identity reduces to $\alpha + \rho = 1$.

There are two types of surfaces when it comes to the reflection of thermal radiation from a surface: specular and diffuse. If the angle of incidence of incoming radiation is equal to the angle of reflected radiation, the reflected radiation is called specular. If the reflected radiation is distributed uniformly in all directions regardless of the angle of incidence, the reflected radiation is called diffuse. In general, polished smooth surfaces are more specular and rough surfaces are more diffuse. The emissive power of a surface $E$ is defined as the energy emitted from the surface per unit area per unit time. If you consider a body of surface area $A$ inside a black enclosure and in thermal equilibrium with the enclosure, an energy balance on the enclosed surface states that $E \cdot A = q_i \cdot A \cdot \alpha$; in other words, the energy emitted from the body is equal to the fraction of the incident energy absorbed from the black enclosure. If the surface inside the black enclosure is itself a black body, the statement of thermal equilibrium then becomes as follows: $E_b \cdot A = q_i \cdot A \cdot (1)$, where $\alpha = 1$ for the enclosed black body. Dividing these two statements of thermal equilibrium, we get $E/E_b = \alpha$; in other words, the ratio of the emissive power of a body to the emissive power of a black body at the same temperature is equal to the absorptivity of the surface, $\alpha$. If this ratio holds such that the absorptivity $\alpha$ is equal to the emissivity $\varepsilon$ for all wavelengths, we have Kirchhoff’s law, $\varepsilon = \alpha$, for a grey body or for grey body radiation. In other words, the surface is a grey body such that the monochromatic emissivity of the surface $\varepsilon_\lambda$ is a constant for all wavelengths.

In practice, the emissivities of various surfaces can vary by a great deal as a function of wavelength, temperature, and surface conditions. A graphical example of the variations in the total hemispherical emissivity of Inconel 718 as a function of surface condition and temperature as reported by the author is shown in Fig. 3. However, the convenience of the assumptions of a grey body, one whose monochromatic emissivity ελ is independent of wavelength, and of Kirchhoff's law, that ε = α, makes many practical problems more tractable to solution.

FIGURE 3 Total hemispherical emissivity of Inconel 718: (a) shiny, (b) oxidized in air for 15 min at 815°C, (c) sandblasted and oxidized in air for 15 min at 815°C.

Planck developed a formula for the monochromatic emissive power of a black body from quantum mechanics, as shown in the following:

$$E_{b\lambda} = \frac{C_1 \lambda^{-5}}{\exp(C_2/\lambda T) - 1}, \qquad (16)$$

where C1 = 3.743 × 10⁸ W · µm⁴/m² and C2 = 1.439 × 10⁴ µm · K, and λ is the wavelength in micrometers. Planck's distribution function given in Eq. (16) predicts that the maximum of the monochromatic black body emissive power shifts to shorter wavelengths as the absolute temperature increases, and that the peak in Ebλ increases as the wavelength decreases. An illustrative example of the trends of Planck's distribution function is that while a very hot object such as a white-hot ingot of steel radiates in the visible spectrum, λ ∼ 1 µm, as it cools it will radiate in increasingly longer wavelengths until it is in the infrared spectrum, λ ∼ 100 µm.

A relationship between the temperature and the peak wavelength of Planck's black body emissive power distribution function, known as Wien's displacement law, is given here:

$$\lambda_{\max} \, T = 2897.6\ \mu\mathrm{m} \cdot \mathrm{K}. \qquad (17)$$

This relationship determines the peak wavelength of the emissive power distribution for a black body at any temperature T. If the body is grey with an average emissivity ε, the value of Ebλ is simply multiplied by ε to get Eλ, but as a first approximation, the location of the peak of Planck's distribution function remains unchanged.
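As a rough numerical check of how Eqs. (16) and (17) fit together, the sketch below evaluates Planck's distribution with the constants quoted above and compares the scanned location of its maximum with Wien's displacement law. The function names, the scan range, and the overflow cutoff are illustrative choices and are not part of the article.

```python
import math

C1 = 3.743e8            # W·µm^4/m^2, constant quoted with Eq. (16)
C2 = 1.439e4            # µm·K, constant quoted with Eq. (16)
WIEN_CONSTANT = 2897.6  # µm·K, constant in Eq. (17)


def planck_emissive_power(wavelength_um, temperature_k):
    """Monochromatic black body emissive power of Eq. (16), in W/(m^2·µm)."""
    x = C2 / (wavelength_um * temperature_k)
    if x > 700.0:
        # exp(x) would overflow a float; the emissive power is negligible here.
        return 0.0
    return C1 * wavelength_um ** -5 / math.expm1(x)


def wien_peak_wavelength(temperature_k):
    """Peak wavelength in µm from Wien's displacement law, Eq. (17)."""
    return WIEN_CONSTANT / temperature_k


def scanned_peak_wavelength(temperature_k, lo_um=0.1, hi_um=200.0, steps=20000):
    """Locate the maximum of Eq. (16) by a brute-force scan on a log-spaced grid."""
    ratio = (hi_um / lo_um) ** (1.0 / steps)
    best_wavelength, best_power = lo_um, 0.0
    wavelength = lo_um
    for _ in range(steps + 1):
        power = planck_emissive_power(wavelength, temperature_k)
        if power > best_power:
            best_wavelength, best_power = wavelength, power
        wavelength *= ratio
    return best_wavelength


for temperature in (300.0, 1000.0, 2000.0):
    print(temperature,
          wien_peak_wavelength(temperature),
          scanned_peak_wavelength(temperature))
```

The two peak estimates should agree to within the grid spacing, and the peak should move to shorter wavelengths as the temperature rises, which is the trend described in the text.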

C. Radiation Shape Factors

Surfaces that radiate thermal energy radiate to each other, and it is necessary to know how much heat leaving Surface 1 gets to Surface 2 and vice versa, in order to determine the surface heat flux. The function that determines the fraction of the heat leaving Surface 1 that is incident on Surface 2 is called the radiation shape factor, F1−2. Consider two black surfaces A1 and A2 at two different temperatures T1 and T2. The energy leaving A1 arriving at A2 is Eb1 A1 F1−2, and the energy leaving A2 arriving at A1 is Eb2 A2 F2−1. Since the surfaces are black and all incident energy is absorbed (ε1 = ε2 = 1), the net radiative energy exchange between the two surfaces is q1−2 = Eb1 A1 F1−2 − Eb2 A2 F2−1. Setting both surfaces to the same temperature forces q1−2 to zero and Eb1 = Eb2; therefore A1 F1−2 = A2 F2−1. This relationship is known as the reciprocity relationship for radiation shape factors and can be written in general as Am Fm,n = An Fn,m. This relationship is geometrical and applies for grey diffuse surfaces as well as black surfaces. Since our surfaces were black, we can substitute the black body emissive power, Ebi = σTi⁴, to get the result

$$q_{1-2} = A_1 F_{1-2}\, \sigma \left( T_1^4 - T_2^4 \right). \qquad (18)$$
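The reciprocity argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: the areas, shape factor, and temperatures are hypothetical, and the Stefan-Boltzmann constant is the commonly quoted value rather than one taken from this section.

```python
SIGMA = 5.670e-8  # W/(m^2·K^4), Stefan-Boltzmann constant (commonly quoted value)


def reciprocal_shape_factor(area_1, f_12, area_2):
    """F2-1 obtained from the reciprocity relation A1*F1-2 = A2*F2-1."""
    return area_1 * f_12 / area_2


def black_body_emissive_power(temperature_k):
    """Total black body emissive power, Eb = sigma*T^4, in W/m^2."""
    return SIGMA * temperature_k ** 4


def net_black_exchange(area_1, f_12, temp_1, temp_2):
    """Net heat flow from black surface 1 to black surface 2, Eq. (18), in W."""
    return area_1 * f_12 * SIGMA * (temp_1 ** 4 - temp_2 ** 4)


# Hypothetical geometry: a 1 m^2 surface that sends half its radiation to a 2 m^2 surface.
area_1, area_2, f_12 = 1.0, 2.0, 0.5
f_21 = reciprocal_shape_factor(area_1, f_12, area_2)

# Two-way bookkeeping (energy 1->2 minus energy 2->1) versus the compact form of Eq. (18).
q_two_way = (black_body_emissive_power(600.0) * area_1 * f_12
             - black_body_emissive_power(300.0) * area_2 * f_21)
print(f_21, q_two_way, net_black_exchange(area_1, f_12, 600.0, 300.0))
```

Both ways of counting give the same net exchange, and the exchange vanishes when the two temperatures are equal, as the derivation requires.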

In general, the solution for shape factors involves geometrical calculus. However, many shape factors have been tabulated in books, which simplifies the analyses significantly. There are also relations between shape factors which are useful for constructing complex shape factors from an assembly of simpler shape factors. Considerable time could be spent on these relations; they will not be discussed here, and the reader is directed to the references for more details.

D. Heat Exchange Between Nonblack Bodies

We have just derived a useful equation for the heat flux between two black, diffuse surfaces, q1−2 = A1 F1−2 (Eb1 − Eb2). In analogy to Ohm's law, this can be rewritten in the form of a resistance to heat transfer as q1−2 = (Eb1 − Eb2)/Rspatial, where Rspatial = 1/(A1 F1−2). It is implied in this formulation that since both bodies are black and thus perfect emitters, they have no surface resistance to radiation, only a geometrical spatial resistance. If both surfaces were grey, they would have ε < 1, and there would be associated with each surface a resistance due to its emissivity, a thermodynamic resistance in addition to the spatial resistance just shown. Let us examine this problem in more general terms. The problem of determining the radiation heat transfer between black surfaces becomes one of determining the geometric shape factor. The problem becomes more complex when considering nonblack bodies because not all energy incident on a surface is absorbed; some is reflected back and some is reflected out of the system entirely. In order to solve the general problem of radiation heat transfer between grey, diffuse, isothermal surfaces, we must define two new concepts, the radiosity J and the irradiation G.

The radiosity J is defined as the total radiation which leaves a surface per unit area per unit time, and the irradiation G is defined as the total energy incident on a surface per unit area per unit time. Both are assumed uniform over a surface for convenience. Assuming that τ = 0 and ρ = (1 − ε), the equation for the radiosity J is as follows:

$$J = \varepsilon E_b + \rho G = \varepsilon E_b + (1 - \varepsilon) G. \qquad (19)$$

Since the net energy leaving the surface is the difference between the radiosity and the irradiation, we find

$$\frac{q}{A} = J - G = \varepsilon E_b + (1 - \varepsilon) G - G, \qquad (20)$$

and solving for G from Eq. (19) and substituting in Eq. (20), we get the following solution for the surface heat flux:

$$q = \frac{\varepsilon A}{1 - \varepsilon}\,(E_b - J) = \frac{E_b - J}{(1 - \varepsilon)/\varepsilon A}. \qquad (21)$$
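The algebra between Eqs. (19) and (21) is short enough to spell out. The following is a sketch of the intermediate steps, using only the symbols already defined and assuming ε ≠ 1 so that the division is allowed:

```latex
\begin{align*}
G &= \frac{J - \varepsilon E_b}{1 - \varepsilon}
  && \text{(Eq. (19) solved for } G\text{)} \\
\frac{q}{A} &= J - G
  = \frac{(1 - \varepsilon) J - J + \varepsilon E_b}{1 - \varepsilon}
  = \frac{\varepsilon\,(E_b - J)}{1 - \varepsilon}
  && \text{(substituted into } q/A = J - G\text{)}
\end{align*}
```

Multiplying through by A gives the surface heat flow in terms of the difference between the black body emissive power and the radiosity.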

In another analogy to Ohm's law, the left-hand side of Eq. (21) can be considered a current, the numerator on the right-hand side a potential difference, and the denominator a surface resistance to radiative heat transfer. We now consider the exchange of radiant energy between two surfaces A1 and A2. The energy leaving A1 which reaches A2 is J1 A1 F1−2, and the energy leaving A2 which reaches A1 is J2 A2 F2−1. Therefore, the net energy transfer from A1 to A2 is q1−2 = J1 A1 F1−2 − J2 A2 F2−1, and using the reciprocity relation for shape factors we find

$$q_{1-2} = (J_1 - J_2) A_1 F_{1-2} = \frac{J_1 - J_2}{1/(A_1 F_{1-2})}, \qquad (22)$$

where 1/(A1 F1−2) is the spatial resistance to radiative heat transfer between A1 and A2.

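A resistance network may now be constructed for two isothermal, grey, diffuse surfaces in radiative exchange with each other by dividing the overall potential difference by the sum of the three resistances as follows:

$$q_{1-2} = \frac{E_{b1} - E_{b2}}{\dfrac{1 - \varepsilon_1}{\varepsilon_1 A_1} + \dfrac{1}{A_1 F_{1-2}} + \dfrac{1 - \varepsilon_2}{\varepsilon_2 A_2}} = \frac{\sigma \left( T_1^4 - T_2^4 \right)}{\dfrac{1 - \varepsilon_1}{\varepsilon_1 A_1} + \dfrac{1}{A_1 F_{1-2}} + \dfrac{1 - \varepsilon_2}{\varepsilon_2 A_2}}. \qquad (23)$$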
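The three-resistance network of Eq. (23) translates directly into code. The sketch below is a minimal illustration, assuming the commonly quoted Stefan-Boltzmann constant; the function names and the sample geometry are hypothetical.

```python
SIGMA = 5.670e-8  # W/(m^2·K^4), Stefan-Boltzmann constant (commonly quoted value)


def surface_resistance(emissivity, area):
    """Surface resistance (1 - eps)/(eps*A) from Eq. (21); zero for a black surface."""
    return (1.0 - emissivity) / (emissivity * area)


def spatial_resistance(area_1, f_12):
    """Spatial resistance 1/(A1*F1-2) from Eq. (22)."""
    return 1.0 / (area_1 * f_12)


def grey_exchange(temp_1, temp_2, eps_1, eps_2, area_1, area_2, f_12):
    """Net heat flow between two grey, diffuse, isothermal surfaces, Eq. (23), in W."""
    total_resistance = (surface_resistance(eps_1, area_1)
                        + spatial_resistance(area_1, f_12)
                        + surface_resistance(eps_2, area_2))
    return SIGMA * (temp_1 ** 4 - temp_2 ** 4) / total_resistance


# Hypothetical case: two surfaces at 800 K and 300 K.
q_grey = grey_exchange(800.0, 300.0, 0.6, 0.8, 2.0, 5.0, 0.4)
q_black = grey_exchange(800.0, 300.0, 1.0, 1.0, 2.0, 5.0, 0.4)
print(q_grey, q_black)
```

Setting both emissivities to 1 removes the surface resistances and recovers the black-surface result of Eq. (18), which is a convenient sanity check on any implementation.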

This approach can be readily extended to include more than two surfaces exchanging radiant energy, but the equations quickly become unwieldy, so no example will be presented here.

One example of a problem which may be easily solved with this network method and which frequently arises in practice, such as in the design of experiments, is the problem of two grey, diffuse, and isothermal infinite parallel surfaces. In this problem, A1 = A2 and F1−2 = 1 since all the radiation leaving one surface reaches the other surface. Substituting for F1−2 in Eq. (23) and dividing by A = A1 = A2, we find that the net heat flow per unit area becomes

$$\frac{q_{1-2}}{A} = \frac{\sigma \left( T_1^4 - T_2^4 \right)}{1/\varepsilon_1 + 1/\varepsilon_2 - 1}. \qquad (24)$$

A second example which serves to illustrate this technique is the problem of two long concentric cylinders, with A1 being the inner cylinder and A2 the outer cylinder. Once again, applying Eq. (23) and noting that F1−2 = 1, we find that

$$\frac{q_{1-2}}{A_1} = \frac{\sigma \left( T_1^4 - T_2^4 \right)}{1/\varepsilon_1 + (A_1/A_2)(1/\varepsilon_2 - 1)}. \qquad (25)$$

In the limit that (A1/A2) → 0, for instance, for a small convex object inside a very large enclosure, this reduces to the simple solution shown in Eq. (26):

$$\frac{q_{1-2}}{A_1} = \sigma \varepsilon_1 \left( T_1^4 - T_2^4 \right). \qquad (26)$$

These are only two simple examples of the power of the radiation network approach to solving radiative heat transfer problems with many mutually irradiating surfaces.
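The closed forms in Eqs. (24) to (26) are easy to compare numerically. The sketch below is illustrative: the emissivities and temperatures are hypothetical, and the Stefan-Boltzmann constant is the commonly quoted value.

```python
SIGMA = 5.670e-8  # W/(m^2·K^4), Stefan-Boltzmann constant (commonly quoted value)


def parallel_plates_flux(temp_1, temp_2, eps_1, eps_2):
    """Net flux between infinite parallel grey surfaces, Eq. (24), in W/m^2."""
    return SIGMA * (temp_1 ** 4 - temp_2 ** 4) / (1.0 / eps_1 + 1.0 / eps_2 - 1.0)


def concentric_cylinders_flux(temp_1, temp_2, eps_1, eps_2, area_ratio):
    """Net flux per unit inner area for long concentric cylinders, Eq. (25).

    area_ratio is A1/A2, the inner area divided by the outer area.
    """
    denominator = 1.0 / eps_1 + area_ratio * (1.0 / eps_2 - 1.0)
    return SIGMA * (temp_1 ** 4 - temp_2 ** 4) / denominator


def small_object_flux(temp_1, temp_2, eps_1):
    """Net flux from a small convex object in a large enclosure, Eq. (26)."""
    return SIGMA * eps_1 * (temp_1 ** 4 - temp_2 ** 4)


# Hypothetical conditions: hot inner surface at 500 K, cooler outer surface at 300 K.
t1, t2, e1, e2 = 500.0, 300.0, 0.7, 0.3
for ratio in (1.0, 0.5, 0.0):
    print(ratio, concentric_cylinders_flux(t1, t2, e1, e2, ratio))
print(parallel_plates_flux(t1, t2, e1, e2), small_object_flux(t1, t2, e1))
```

With A1/A2 = 1 the cylinder result coincides with the parallel-plate formula, and with A1/A2 = 0 it coincides with the small-object limit, so Eq. (25) interpolates between the two.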

The study of thermal radiative heat transfer goes on to consider radiative exchange between a gas and a heat transfer surface, complex radiation networks in absorbing and transmitting media, solar radiation and radiation within planetary atmospheres, and complex considerations of combined conduction–convection–radiation heat transfer problems. The reader is encouraged to investigate these and other topics in radiative heat transfer further.

IV. BOILING HEAT TRANSFER

The phenomenon of heat transfer from a surface to a liquid with a phase change to the vapor phase by the formation of bubbles is called boiling heat transfer. When a pool of liquid at its saturation temperature is heated by an adjacent surface which is at a temperature just above the liquid saturation temperature, heat transfer may proceed without phase change by single-phase buoyancy or natural convection.

FIGURE 4 Pool boiling curve for water.

A. Onset of Pool Boiling

As the surface temperature is increased, bubbles appear on the heater surface, signaling the onset of nucleation and incipient pool boiling. The rate of heat transfer by pool boiling, as this is called, is usually represented graphically by presenting the surface heat flux, q″w, versus surface superheat, Tw − Tsat. This is referred to as the boiling curve. The process of boiling heat transfer is quite nonlinear, the result of the appearance of a number of regimes of boiling which depend fundamentally upon different heat transfer processes.

The components of the pool boiling curve have been well established and are shown graphically in Fig. 4. The first regime of the boiling curve is the natural convection regime, essentially a regime just preceding boiling, in which the heat transfer is by single-phase flow without vapor generation. In this regime, buoyancy of the hot liquid adjacent to the surface of the heater forces liquid to rise in the cooler liquid pool, followed by fresh cold liquid passing over the heater to repeat the process.

B. Nucleate Boiling

A further increase in the surface superheat or the surface heat flux will drive the system to the onset of nucleate boiling (ONB), the point on the boiling curve at which bubbles first appear on the heater surface. The rate of vapor bubble growth, the area density of nucleation sites which become active, the bubble frequency, and the bubble departure diameter manifest themselves as dominant parameters controlling the heat flux as the pool enters the nucleate boiling regime, all of which are increasing functions of the surface superheat. Without further discussion, it is mentioned that the rate of heat transfer in the nucleate boiling regime is extremely sensitive to various properties and conditions including system pressure, liquid agitation, and subcooling; surface finish, age, and coatings; dissolved noncondensible gases in the liquid; size and orientation of the heater; and nonwetting and treated surfaces. Heat fluxes in the nucleate pool boiling regime increase very rapidly with small increases in the surface superheat. The literature contains numerous efforts to develop generalized correlations for nucleate pool boiling applicable to a wide range of liquids and generalized to include many of the properties and conditions previously listed. As a minimum, any successful correlation must include provisions which reflect the character or conditions of the heater surface as well as the properties of the boiling fluids themselves, requirements which have presented formidable obstacles to the development of any universally applicable correlation. One of the earliest attempts at such a correlation was developed by Rohsenow (1952); it is reproduced in the following equation for its historical significance and continued applicability:

$$\frac{c_\ell \,(T_w - T_{sat})}{i_{\ell g}} = C_{sf} \left[ \frac{q''}{\mu_\ell\, i_{\ell g}} \left( \frac{\sigma}{g(\rho_\ell - \rho_g)} \right)^{0.5} \right]^{0.33} \mathrm{Pr}_\ell^{\,s}, \qquad (27)$$

where Csf ∼ 0.013, s = 1 for water, and s = 1.7 for all other fluids. Examination of this correlation reveals a theme underlying all of heat transfer, and that is the essential requirement for accurate values of the physical and transport properties of the fluids of interest.
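To give a feel for how steeply Eq. (27) ties heat flux to superheat, the sketch below evaluates the correlation in both directions. It reads the symbols as liquid specific heat, latent heat, liquid viscosity, surface tension, liquid and vapor densities, and liquid Prandtl number. The water property values are approximate figures for saturated water near atmospheric pressure and are assumptions for illustration, not data from the article.

```python
import math

G = 9.81  # m/s^2, gravitational acceleration

# Approximate saturated-water properties near 1 atm. Illustrative assumptions only.
WATER = {
    "c_l": 4217.0,     # J/(kg·K), liquid specific heat
    "i_lg": 2.257e6,   # J/kg, latent heat of vaporization
    "mu_l": 2.79e-4,   # kg/(m·s), liquid viscosity
    "sigma": 0.0589,   # N/m, surface tension
    "rho_l": 957.9,    # kg/m^3, liquid density
    "rho_g": 0.596,    # kg/m^3, vapor density
    "pr_l": 1.76,      # liquid Prandtl number
}


def capillary_length(props):
    """Length scale (sigma / (g*(rho_l - rho_g)))**0.5 appearing in Eq. (27)."""
    return math.sqrt(props["sigma"] / (G * (props["rho_l"] - props["rho_g"])))


def rohsenow_superheat(heat_flux, props, c_sf=0.013, s=1.0):
    """Surface superheat Tw - Tsat (K) for a given heat flux (W/m^2), Eq. (27)."""
    group = heat_flux / (props["mu_l"] * props["i_lg"]) * capillary_length(props)
    return c_sf * group ** 0.33 * props["pr_l"] ** s * props["i_lg"] / props["c_l"]


def rohsenow_heat_flux(superheat, props, c_sf=0.013, s=1.0):
    """Heat flux (W/m^2) for a given superheat (K), Eq. (27) solved for q''."""
    lhs = props["c_l"] * superheat / props["i_lg"]
    group = (lhs / (c_sf * props["pr_l"] ** s)) ** (1.0 / 0.33)
    return group * props["mu_l"] * props["i_lg"] / capillary_length(props)


for superheat in (5.0, 10.0, 20.0):
    print(superheat, rohsenow_heat_flux(superheat, WATER))
```

Because of the 0.33 exponent, doubling the superheat multiplies the predicted heat flux by about eight, which illustrates the statement that nucleate boiling heat fluxes rise very rapidly with small increases in superheat. The same sensitivity applies to errors in the property values.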

C. Critical Heat Flux

As the surface superheat in nucleate pool boiling continues to increase, the resulting increase in the boiling heat flux is accompanied by an increase in active nucleation sites on the surface of the heater, thus resulting in an increasing vapor production rate per unit area. The boiling heat flux will continue to increase up to a point at which the liquid can no longer remove any more heat from the surface due to vapor blanketing of the surface, restriction of liquid flow to the surface, and flooding effects which push liquid droplets away from the surface. There is no general agreement as to which of these mechanisms is responsible for the boiling crisis which ensues, and indeed each may be controlling under different geometric conditions. Regardless, soon the pool boiling curve reaches a peak heat flux which is called the critical heat flux (CHF). The critical heat flux in pool boiling is predominantly a hydrodynamic phenomenon, in which insufficient liquid is able to reach the heater surface due to the rate at which vapor is leaving the surface. As such, it is an unstable condition in pool boiling which should be avoided in engineered systems through design. There are two routes by which CHF can be reached. The first is by controlling the temperature of the heater surface, in which case the system will simply return to nucleate boiling if the superheat is reduced or enter into transition boiling if the superheat exceeds that at CHF.

D. Film Boiling

It is more likely, however, that in most engineering systems the actual independent variable would be the heat flux, not the surface temperature. In this case, any increase in the surface heat flux above the CHF limit would induce a huge temperature excursion in the surface as it became vapor-blanketed and heated up adiabatically. This temperature excursion would continue until the imposed heat load could be transferred to the boiling liquid almost exclusively by thermal radiation from the surface. Because the vapor blanketing of the heater prevents liquid–solid contact, this boiling regime is called film boiling, and the occurrence of the thermal excursion from CHF into film boiling is known as burnout. This term comes from the fact that the resulting surface temperatures are, in general, so high that the surface and thus the equipment is damaged. Film boiling as a heat transfer process does not enjoy wide commercial application because such high temperatures are generally undesirable.

In film boiling, a continuous vapor film blankets the heater surface and prevents direct contact of liquid with the surface. Vapor is generated at the interface between the vapor film and the overlying liquid pool by conduction through the vapor film and thermal radiation across the vapor film from the hot surface. It is of interest to note that in film boiling, the boiling heat flux is insensitive to the surface conditions, unlike nucleate boiling, in which surface conditions or surface finish may play a dominant role. In transition boiling, the unstable regime between nucleate and film boiling, surface conditions do influence the data, providing evidence that there is some liquid–surface contact in transition boiling which is not manifested in the film boiling regime. Because of this decoupling of the boiling process from the heater surface conditions, film boiling is more tractable to analysis.

The classical analysis of film boiling from a horizontal surface was performed by Berenson (1961). Many others have since contributed to the understanding of film boiling, notably by extending his work to very high superheats as well as to include liquid subcooling effects. The original model derived by Berenson for the film boiling heat transfer coefficient is reproduced in the following:

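$$h = 0.425 \left[ \frac{k_g^3\, \rho_g (\rho_\ell - \rho_g)\, g\, (i_{\ell g} + 0.4\, c_{p,g}\, \Delta T_{sat})}{\mu_g\, \Delta T_{sat}\, \left( \sigma / g(\rho_\ell - \rho_g) \right)^{1/2}} \right]^{0.25}. \qquad (28)$$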

For extension to higher temperatures at which thermal radiation becomes significant, a simple correction is made to the calculated film boiling heat flux by adding a radiative heat transfer contribution.
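As an order-of-magnitude illustration of Berenson's model, the sketch below evaluates the film boiling coefficient under one reading of Eq. (28): vapor-film properties carry the subscript g, the temperature difference is the surface superheat Tw − Tsat, and the viscosity is that of the vapor. That reading, the function name, and the rough steam and water property values are assumptions made for illustration only. The radiative correction mentioned above is not included.

```python
import math

G = 9.81  # m/s^2, gravitational acceleration


def berenson_coefficient(superheat, k_g, rho_g, rho_l, mu_g, cp_g, i_lg, sigma):
    """Film boiling heat transfer coefficient in W/(m^2·K), one reading of Eq. (28).

    superheat is Tw - Tsat in K; vapor properties use the subscript g.
    """
    buoyancy = G * (rho_l - rho_g)                 # g*(rho_l - rho_g)
    length = math.sqrt(sigma / buoyancy)           # (sigma / (g*(rho_l - rho_g)))**0.5
    latent = i_lg + 0.4 * cp_g * superheat         # latent heat plus vapor superheating
    group = k_g ** 3 * rho_g * buoyancy * latent / (mu_g * superheat * length)
    return 0.425 * group ** 0.25


# Rough properties for water boiling at 1 atm with a steam film. Illustrative assumptions.
STEAM_FILM = dict(k_g=0.033, rho_g=0.46, rho_l=957.9, mu_g=1.6e-5,
                  cp_g=1980.0, i_lg=2.257e6, sigma=0.0589)

for superheat in (200.0, 300.0, 400.0):
    h = berenson_coefficient(superheat, **STEAM_FILM)
    print(superheat, h, h * superheat)  # coefficient and conductive film heat flux
```

Under these assumed values the coefficient is a few hundred W/m²·K at most, far below nucleate boiling values, and it falls slowly as the superheat grows, so the heat flux increases less than linearly with superheat.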

V. PHYSICAL AND TRANSPORT PROPERTIES

Accurate and reliable thermophysical property data play a significant role in all heat transfer applications. Whether designing a laboratory experiment, analyzing a theoretical problem, or constructing a large-scale heat transfer facility, it is crucial to the success of the project that the physical properties that go into the solution are accurate, lest the project be a failure with adverse financial consequences as well as environmental and safety implications. In the solutions of heat transfer problems, numerous physical and transport properties enter into consideration, all of which are functions of system parameters such as temperature and pressure. These properties also vary significantly from material to material even when intuition suggests otherwise, such as between alloys of a similar base metal. Physical and transport properties of matter are surprisingly difficult to measure accurately, although the literature abounds with measurements which are presented with great precision and which frequently disagree with other measurements of the same property by other investigators by a wide margin. Although this can sometimes be the result of variations in the materials or the system parameters, all too often it is the result of flawed experimental techniques. Measurements of physical properties should be left to specialists whenever possible. It should come as no surprise, therefore, that the dominant sources of uncertainties or errors in analytical and experimental heat transfer frequently come from uncertainties or errors in the thermophysical properties themselves. This concluding section presents four tables of measured physical properties for selected materials under various conditions to illustrate the variability which can be encountered between materials and, in one case, the variability of a single material property as a function of temperature alone.

TABLE I Physical Properties of Pure Metals and Selected Alloys at 300 K

| Material | Tmelt (K) | ρ (kg/m³) | cp (J/kg · K) | k (W/m · K) | α · 10⁶ (m²/s) |
|---|---|---|---|---|---|
| Aluminum | 933 | 2702 | 903 | 237 | 97.1 |
| Bismuth | 545 | 9780 | 122 | 7.9 | 6.6 |
| Copper | 1358 | 8933 | 385 | 401 | 117 |
| Gold | 1336 | 19300 | 129 | 317 | 127 |
| Iron | 1810 | 7870 | 447 | 80 | 23 |
| 304 SS | 1670 | 7900 | 477 | 14.9 | 3.95 |
| 316 SS | ∼1670 | 8238 | 468 | 13.4 | 3.5 |
| Lead | 601 | 11340 | 129 | 35 | 24 |
| Nickel | 1728 | 8900 | 444 | 91 | 23 |
| Inconel 600 | 1700 | 8415 | 444 | 14.9 | 4.0 |
| Inconel 625 | – | 8442 | 410 | 9.8 | 2.8 |
| Inconel 718 | 1609 | 8193 | 436 | 11.2 | 3.1 |
| Platinum | 2045 | 21450 | 133 | 72 | 25 |
| Silver | 1235 | 10500 | 235 | 429 | 174 |
| Tin | 505 | 7310 | 227 | 67 | 40 |
| Titanium | 1953 | 4500 | 522 | 22 | 9.3 |
| Tungsten | 3660 | 19300 | 132 | 174 | 68 |
| Zirconium | 2125 | 6570 | 278 | 23 | 12.4 |

Table I presents the most frequently used physical and transport properties of selected pure metals and several common alloys at 300 K. Listed are commonly quoted values for the density, specific heat, thermal conductivity, and thermal diffusivity for 18 metals and alloys. Heat transfer applications frequently require these properties at ambient temperature due to their use as structural materials. For properties at other temperatures, the reader is referred to Touloukian's 13-volume series on the thermophysical properties of matter. It can be seen in Table I that some of the properties vary quite significantly from metal to metal. A judicious choice of metal or alloy for a particular application usually involves optimization of not only the thermophysical properties of that metal but also the mechanical properties and corrosion resistance.
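One simple way to screen tabulated property data for internal consistency is to recompute the thermal diffusivity from its standard definition, α = k/(ρ cp), and compare it with the listed value. The definition is the usual one and is not stated in this section. The sketch below does this for a few rows copied from Table I; the function name and the choice of rows are illustrative.

```python
# Rows copied from Table I:
# (density kg/m^3, specific heat J/(kg·K), conductivity W/(m·K), tabulated alpha*1e6 m^2/s)
TABLE_I_ROWS = {
    "Aluminum": (2702.0, 903.0, 237.0, 97.1),
    "Copper": (8933.0, 385.0, 401.0, 117.0),
    "Iron": (7870.0, 447.0, 80.0, 23.0),
    "304 SS": (7900.0, 477.0, 14.9, 3.95),
    "Silver": (10500.0, 235.0, 429.0, 174.0),
}


def thermal_diffusivity(density, specific_heat, conductivity):
    """Thermal diffusivity alpha = k / (rho * cp), in m^2/s."""
    return conductivity / (density * specific_heat)


for material, (rho, cp, k, alpha_tabulated) in TABLE_I_ROWS.items():
    alpha_computed = thermal_diffusivity(rho, cp, k) * 1.0e6
    print(material, round(alpha_computed, 2), alpha_tabulated)
```

For these rows the recomputed values match the tabulated ones to within rounding. A larger mismatch in a data set from another source would be a reason to question one of the inputs.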

Table II presents the most frequently used physical and transport properties for selected gases at 300 K. Once again, 300 K represents a temperature routinely encountered in practical applications. The table considers five common gases and lists seven properties of general interest and frequent use. The reader is cautioned against the use of these properties at temperatures other than 300 K. All these properties with the exception of the Prandtl number are strong functions of temperature, and significant errors can result if they are extrapolated to other conditions. The reader is once again directed to Touloukian for detailed property data.

TABLE II Properties of Selected Gases at Atmospheric Pressure and 300 K

| Gas | ρ (kg/m³) | cp (kJ/kg · K) | µ · 10⁵ (kg/m · s) | ν · 10⁶ (m²/s) | k (W/m · K) | α · 10⁴ (m²/s) | Pr |
|---|---|---|---|---|---|---|---|
| Air | 1.18 | 1.01 | 1.98 | 16.8 | 0.026 | 0.22 | 0.708 |
| Hydrogen | 0.082 | 14.3 | 0.90 | 109.5 | 0.182 | 1.55 | 0.706 |
| Oxygen | 1.30 | 0.92 | 2.06 | 15.8 | 0.027 | 0.22 | 0.709 |
| Nitrogen | 1.14 | 1.04 | 1.78 | 15.6 | 0.026 | 0.22 | 0.713 |
| CO2 | 1.80 | 0.87 | 1.50 | 8.3 | 0.017 | 0.11 | 0.770 |

Applications of heat transfer at very low temperatures such as at liquid nitrogen (76 K) and liquid helium (4 K) temperatures present unique challenges to the experimentalist and require knowledge of the cryogenic properties of matter. Table III presents a summary of the temperature-dependent specific heat of six common cryogenic materials over the temperature range from 2 to 40 K to illustrate the extreme sensitivity of this property in particular (and most cryogenic properties in general) to even slight variations in temperature. Clearly, experiments, analyses, or designs which do not use precise, accurate, and reliable data for the physical properties of materials at cryogenic temperatures will suffer from large uncertainties. In addition, precise temperature control is a necessity at these temperatures.

For research applications, these uncertainties could easily render experimental results and research conclusions invalid. The reader is cautioned to seek out the most reliable data for thermophysical properties when operating under cryogenic conditions.

TABLE III Specific Heat of Selected Materials at Cryogenic Temperatures

Specific heat (J/kg · K)

| T (K) | Al | Cu | α-Iron | Ti | Ice | Quartz |
|---|---|---|---|---|---|---|
| 2 | 0.05 | 0.0066 | 0.183 | 0.146 | 0.12 | – |
| 4 | 0.26 | 0.0217 | 0.382 | 0.317 | 0.98 | – |
| 6 | 0.50 | 0.0545 | 0.615 | 0.540 | 3.3 | – |
| 8 | 0.88 | 0.114 | 0.900 | 0.840 | 7.8 | – |
| 10 | 1.4 | 0.205 | 1.24 | 1.26 | 15 | 0.7 |
| 15 | 4.0 | 0.663 | 2.49 | 3.30 | 54 | 4.0 |
| 20 | 8.9 | 1.76 | 4.50 | 7.00 | 114 | 11.3 |
| 30 | 31.5 | 6.53 | 12.4 | 24.5 | 229 | 22.1 |
| 40 | 77.5 | 14.2 | 29.0 | 57.1 | 340 | 65.3 |
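The sensitivity of the cryogenic specific heat to temperature can be put into a single number per temperature interval. The sketch below, which uses the aluminum and ice columns of Table III, estimates the local exponent n in c ∝ Tⁿ between adjacent rows. A value of n means that a 1% error in temperature shifts the specific heat by roughly n%. The power-law reading and the function name are illustrative assumptions, not a model proposed in the article.

```python
import math

TEMPERATURES_K = [2, 4, 6, 8, 10, 15, 20, 30, 40]

# Columns copied from Table III, specific heat in J/(kg·K).
SPECIFIC_HEAT = {
    "Al": [0.05, 0.26, 0.50, 0.88, 1.4, 4.0, 8.9, 31.5, 77.5],
    "Ice": [0.12, 0.98, 3.3, 7.8, 15, 54, 114, 229, 340],
}


def local_exponents(temperatures, values):
    """Slope of ln(c) versus ln(T) between each pair of adjacent table rows."""
    exponents = []
    for i in range(len(temperatures) - 1):
        d_ln_c = math.log(values[i + 1] / values[i])
        d_ln_t = math.log(temperatures[i + 1] / temperatures[i])
        exponents.append(d_ln_c / d_ln_t)
    return exponents


for material, column in SPECIFIC_HEAT.items():
    rounded = [round(n, 2) for n in local_exponents(TEMPERATURES_K, column)]
    print(material, rounded)
```

For aluminum every interval gives an exponent above 1, so relative temperature errors are amplified in the specific heat across the whole tabulated range.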

Finally, there are occasions in heat transfer when it is advantageous to utilize liquid metals as a heat transfer medium. Table IV lists 12 low melting point metals commonly encountered in practice and lists their melting temperatures and boiling temperatures for comparison. The choice of a suitable liquid metal for a particular application does not provide for the flexibility which engineers and scientists have come to expect from other fluids at ordinary temperatures. Often the choice of an appropriate liquid metal for a particular application depends on the phase change temperatures as shown in Table IV. When and if no suitable liquid metal is found among the pure metals, alloys can be used instead. These alloys or mixtures usually have physical properties and melting/boiling temperatures which are significantly different from their constituent elements. No data for liquid metal alloys are given here; indeed, the data for liquid metal alloys are meager.

TABLE IV Melting and Boiling Temperatures of Some Common Liquid Metals

| Metal | Tmelt (K) | Tboil (K) |
|---|---|---|
| Lithium | 452 | 1590 |
| Sodium | 371 | 1151 |
| Phosphorus | 317 | 553 |
| Potassium | 337 | 1035 |
| Gallium | 303 | 2573 |
| Rubidium | 312 | 969 |
| Indium | 430 | 2373 |
| Tin | 505 | 2548 |
| Cesium | 302 | 1033 |
| Mercury | 234 | 630 |
| Lead | 601 | 2023 |
| Bismuth | 544 | 1833 |

It is crucial to the success of any heat transfer experiment, analysis, or facility that careful and judicious choices are made in the selection of the materials to be used and that the data on their thermophysical properties are accurate and precise. It is difficult and expensive to determine these properties on an application by application basis, and property data which have not been measured by specialists may suffer large uncertainties and errors. The results of experiments and analyses can only be as accurate and reliable as their data, and frequently that accuracy and reliability are limited by the thermophysical properties which were used.

SEE ALSO THE FOLLOWING ARTICLES

CRYOGENICS • DIELECTRIC GASES • ELECTROMAGNETICS • FUELS • HEAT EXCHANGERS • HEAT FLOW • THERMAL ANALYSIS • THERMODYNAMICS • THERMOMETRY

BIBLIOGRAPHY

Berenson, P. J. (1961). “Film boiling heat transfer from a horizontal surface,” J. Heat Transfer 83, 351–358.

Eckert, E. R. G., and Drake, R. M. (1972). “Analysis of Heat and Mass Transfer,” McGraw-Hill, New York.

Hartnett, J. P., Irvine, T. F., Jr., Cho, Y. I., and Greene, G. A. (1964–present). “Advances in Heat Transfer,” Academic Press, Boston.

Rohsenow, W. M., Hartnett, J. P., and Ganic, E. N. (1985). “Handbook of Heat Transfer Fundamentals,” McGraw-Hill, New York.

Rohsenow, W. M. (1952). “A method of correlating heat transfer data for surface boiling of liquids,” Trans. ASME 74, 969.

Sparrow, E. M., and Cess, R. D. (1970). “Radiation Heat Transfer,” Brooks/Cole Publishing Company, Belmont, CA.

Touloukian, Y. S., et al. (1970). “Thermophysical Properties of Matter,” TPRC Series (v. 1–13), Plenum Press, New York.

VanSant, J. R. (1980). “Conduction Heat Transfer Solutions,” Lawrence Livermore National Laboratory, UCRL-52863.

P1: GQT/MBR P2: GNH Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN008b-386 June 29, 2001 16:56

Liquids, Structure and Dynamics

Thomas Dorfmuller

University of Bielefeld

I. Introduction
II. Structure of Liquids
III. Dynamic Properties of Liquids
IV. Molecular Interactions and Complex Liquids
V. Compartmented Liquids
VI. Glass-Forming Liquids
VII. Gels
VIII. Dynamics in Complex Liquids
IX. Conclusions

GLOSSARY

Amphiphiles Molecules consisting of a hydrophobic and a hydrophilic moiety.

Complex liquids Liquids that consist of molecules whose anisotropic shape, specific interactions, and intramolecular conformations determine their properties to a significant degree.

Dynamic spectroscopy Spectroscopic technique that uses the shape of spectral lines to obtain information about the dynamics of molecules.

Gels Polymer–liquid mixtures displaying no steady-state flow. Gels are cross-linked solutions.

Glass Solidlike amorphous state of matter.

Liquid crystals Liquids consisting of highly anisotropic molecules whose orientations are strongly correlated.

Micelles Aggregates formed in an aqueous solution of a detergent.

Microemulsions Ternary solution of water, oil, and a detergent that forms dropletlike aggregates.

Pair correlation function Function of distance r that describes the probability of finding a molecule at a place located at a distance r from the origin, given that another molecule is located at the origin. See Fig. 1.

Phase diagram Diagram having the coordinates pressure and temperature, in which the solid, liquid, and gaseous phases occupy different regions.

Plastic crystals Crystalline phases displaying a liquidlike rotational mobility of the molecules.

Simple liquids Liquids consisting of spherical or nearly spherical molecules interacting with central forces and without conformational degrees of freedom.

Time correlation function Statistical quantity used to describe the temporal evolution of a random process. See Eq. (13).

THE LIQUID STATE is a condensed state of matter that is roughly characterized by high molecular mobility and a low degree of order compared with solids. Liquids can be distinguished from gases by their high density.

Liquid structure is studied by various scattering methods and liquid dynamics by a large number of spectroscopic techniques.

The liquid phase of matter is of paramount importance in physics, chemistry, biology, and engineering sciences. In particular, liquid or liquidlike systems such as micelles, microemulsions, gels, and membranes play important roles in biology and in industrial applications. Although much detailed knowledge of liquids has been accumulated, the study of many fundamental issues in liquid-state physics and chemistry is actively pursued in many fields of science.

I. INTRODUCTION

The liquid state of matter cannot be easily defined in an unambiguous and consistent way. It is often defined in terms of the phase diagram (i.e., with respect to the solid and gaseous state). However, the distinction between the liquid phase and the gas phase is not sharp in the critical region of the phase diagram, and the distinction of a liquid from a solid is also unclear for substances that have a tendency to supercool and are able to form glasses at low temperatures. Furthermore, many fluid substances are known that display specific structures, such as liquid crystals and micelles, where some of the criteria usually attributed to liquids do not apply. One could give a wide, but still to some extent ambiguous, definition of liquids by saying that a liquid is a disordered condensed phase. This would then include glasses, which due to the low mobility of the constituent molecules are usually regarded as amorphous solid systems. Another case where the limits of the liquid state are ill-defined is that of disordered clusters of molecules and of two-dimensional disordered arrangements on surfaces. The question of whether these phases should be considered liquids is a matter of the context in which they are studied.

Liquids can be classified according to the properties of the molecules that constitute them. We thus distinguish between atomic and molecular liquids, among nonpolar, polar, and ionic liquids, and between liquids whose molecules do or do not display hydrogen bonding. Since interparticle interactions play a central role in determining the properties of liquids, we can broadly classify simple and complex liquids according to the way in which the molecules or atoms of the liquid interact. The simplest liquids are those consisting of atoms of noble gases. Thus, liquid argon, being considered as the prototype of a simple liquid, has been the object of many studies because of the absence of any complicating features in the intermolecular interaction. What distinguishes, for example, liquid argon from most other liquids is the spherical shape of its atoms leading to central interaction forces, the dispersive character of the interparticle forces, and the absence of internal degrees of freedom.

Other liquids that can be considered simple should consist of atoms or molecules with shapes not deviating much from a sphere; they should not display noncentral, angle-dependent, or specific saturable interparticle forces like, for example, those that lead to the formation of hydrogen bonds; and, finally, the internal degrees of freedom, especially configurational degrees of freedom, should not much influence the properties of the liquid. According to these criteria, the majority of liquids are complex, the concept of a simple liquid being the result of an extrapolation of the properties of a relatively small number of liquids whose molecules comply to some extent with the above-mentioned requirements of simple liquids. The concept of a simple liquid has been very fruitful in contributing to the development of the concepts that are necessary to describe the essentials of the liquid state. However, since most interesting and important liquids must be reckoned among the complex liquids, the study of these is extremely important.

II. STRUCTURE OF LIQUIDS

The ordered structure of crystalline solids is a consequence of interparticle interactions leading to a dependence of the free energy of a system of interacting particles on their arrangement in space. The stability of a particular crystalline structure at thermodynamic equilibrium results from the minimum of the free energy achieved in this state. Because such interactions are negligible in gases at low pressure, we observe chaos instead of order in gaseous systems. The case of liquids is intermediate between the two extremes of perfect order in ideal crystalline solids and complete lack of order in ideal gases. More precisely, the molecules of the liquids interact, and as a consequence they tend to arrange themselves in a constrained structure. On the other hand, the thermal energy of a liquid is high enough so that the molecules rearrange themselves rapidly and continuously. As a result, order in most liquids is not constant in time, nor does it extend over distances larger than a few molecular diameters. If the shapes of the molecules and the attractive forces between molecules are anisotropic, then we may observe orientational order in a liquid, with the molecules aligning themselves so that the orientation of neighboring molecules is not completely random. The opposite case is observed in plastic crystals, which are positionally ordered systems in which the molecules are, however, relatively free to rotate; so we do not have orientational order as in classical crystals.


Both translationally and rotationally ordered structures can be described by appropriate functions of spatial variables such as interparticle distance and relative orientation. Translational order is described by the radial pair correlation function g(r) and orientational order by the static orientational correlation function. The function g(r) is the probability of finding a molecule at a point A at a distance located between r and r + dr from another point B, given that another molecule is at B. The static orientational correlation is a number reflecting the probability that a molecule located at a distance r from another molecule is oriented in such a way that the two molecules form an angle between ϑ and ϑ + dϑ. For this definition to apply, the molecules must have a symmetry allowing the definition of a physically identifiable orientation axis. This is the case, for example, with linear and cylindrically symmetric molecules. Thus, the size of this quantity gives us a clue as to whether molecules located at a distance r from each other tend to align in parallel, antiparallel, or perpendicular orientations and characterizes the average angle between such molecules.

The importance of both quantities stems from the fact that they can be measured by the diffraction of electromagnetic radiation and by scattering of slow neutrons, both having a wavelength of approximately the average interparticle distance. Figure 1 displays a characteristic shape of g(r), illustrating the exclusion of neighbors at small distances from the reference molecule. The first, more pronounced, peak at r1 corresponds to the shell of the nearest neighbors. The second at r2 and the further peaks come from more distant and hence more diffusely distributed shells. The oscillations characteristic of the radial distribution decay after a small number of maxima and minima, showing that no long-range order is present at distances as large as a few molecular diameters. For large values of the distance r the product ρg(r) approaches the value of the average number density ρ = N/V of the equilibrium distribution, where N is the number of molecules contained in the volume V. In contrast to this, a system with long-range order would display nondecaying oscillations of g(r) over significant distances.

FIGURE 1 A typical radial pair correlation function of a simple liquid. Note the maxima at r1 and r2, which illustrate the increased probability of finding a molecule at these distances from the central molecule. Note also the value 0 at small distances and the limiting value of 1 at large distances. The first is a consequence of the repulsive interaction of molecules, and the second illustrates the randomization of the mean particle density at large distances.

FIGURE 2 The pair potential energy between two molecules A and B. The two molecules, represented by the spheres, are shown at the equilibrium distance rAB if the potential is a Lennard–Jones potential. In terms of the hard core repulsive potential (vertical dashed line) this is the contact position with the center-to-center distance equal to rAB. The Lennard–Jones potential, $V(r) = 4E_{AB}[(r_0/r)^{12} - (r_0/r)^{6}]$, results from the superposition of the $r^{-12}$ repulsive branch (upper part of the solid curve) and the $r^{-6}$ attractive branch (lower half of the solid curve).

Although the general form of the intermolecular forces is known, it is very difficult to derive directly from this approximate knowledge the exact shape of the radial distribution function. As a useful approximation, however, the repulsive branch of the potential has been approximated by a hard core potential and the attractive branch expressed by a simple inverse power of the intermolecular center-to-center distance. This is illustrated in Fig. 2. With such an approximation it was found that the main features of the radial pair distribution function can be explained qualitatively even if we completely neglect attraction. It thus appears that most of the liquid structure is the result of the steep repulsive intermolecular pair potential.


Steep in this context means a potential energy function that can be expressed as an inverse power of the intermolecular distance with an exponent that is significantly larger than the value n = 6 found in dispersion forces. Usually, steep potentials are approximated by an exponent n = 12 or larger because this is the case for the repulsive part of the Lennard–Jones potential illustrated in Fig. 2. In most cases the repulsive branch of the potentials was shown to be much steeper than the attractive branch, although some complex liquids, such as associating liquids, do have steep attractive branches that are essential in determining their structure. In this latter case the potential has the shape of a narrow, steep-walled well leading to relatively stable dimers or higher aggregates.
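The notion of steepness can be illustrated with a minimal sketch of the Lennard–Jones form quoted in the caption of Fig. 2. The names `epsilon` (standing for the well depth written E_AB in the caption), `r0`, and the helper `wall_growth` are illustrative choices, not part of the original text, and reduced units are assumed.

```python
import math

def lennard_jones(r, epsilon=1.0, r0=1.0):
    """Lennard-Jones pair potential V(r) = 4*epsilon*[(r0/r)**12 - (r0/r)**6]."""
    x6 = (r0 / r) ** 6
    return 4.0 * epsilon * (x6 * x6 - x6)

def wall_growth(n, squeeze=0.9):
    """Factor by which an r**(-n) term grows when the distance shrinks to squeeze*r."""
    return squeeze ** (-n)

r_min = 2.0 ** (1.0 / 6.0)        # analytic position of the minimum, in units of r0
v_min = lennard_jones(r_min)      # equals -epsilon at the minimum

print(f"r_min / r0 = {r_min:.4f}")
print(f"V(r_min) / epsilon = {v_min:.4f}")
print(f"V(r0) / epsilon = {lennard_jones(1.0):.4f}")   # the potential crosses zero at r = r0
# A 10% compression raises an r^-12 term about 3.5 times, an r^-6 term about 1.9 times.
print(f"growth for n = 12: {wall_growth(12):.2f}, for n = 6: {wall_growth(6):.2f}")
```

The comparison of the two growth factors is only meant to show why an exponent of 12 counts as steep relative to the dispersion value of 6.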

In a system of particles interacting through central pair forces, we can derive a simple equation between the excess internal energy U of the N molecules due to intermolecular interactions, the pair potential V(r), and the radial distribution function g(r):

$$U = 2\pi N \rho \int_0^{\infty} g(r)\, V(r)\, r^{2}\, dr. \tag{1}$$
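As a rough numerical illustration of Eq. (1), the sketch below evaluates the excess energy per molecule, U/N, for a Lennard–Jones potential in reduced units. It assumes the dilute-gas form g(r) = exp(−V(r)/kT) for the radial distribution function, which is an assumption made here purely to have a closed expression; for a real liquid, g(r) would have to come from experiment, theory, or simulation. The function names and the values of `rho` and `temperature` are hypothetical.

```python
import math

def lj(r):
    """Lennard-Jones potential in reduced units (energy in epsilon, length in r0)."""
    x6 = r ** -6
    return 4.0 * (x6 * x6 - x6)

def excess_energy_per_molecule(rho, temperature, r_lo=0.3, r_hi=10.0, n=20000):
    """Trapezoidal estimate of U/N = 2*pi*rho * integral of g(r) V(r) r^2 dr.

    ASSUMPTION: g(r) is replaced by its dilute-gas limit exp(-V/kT).
    The integration limits truncate the negligible core and tail contributions.
    """
    h = (r_hi - r_lo) / n
    total = 0.0
    for i in range(n + 1):
        r = r_lo + i * h
        v = lj(r)
        f = math.exp(-v / temperature) * v * r * r
        total += f if 0 < i < n else 0.5 * f   # half weight at both end points
    return 2.0 * math.pi * rho * total * h

u = excess_energy_per_molecule(rho=0.05, temperature=2.0)
print(f"U/N = {u:.4f} (in units of epsilon)")
```

The result is negative because, at this temperature, the attractive well outweighs the small region of the repulsive wall that g(r) allows the molecules to sample.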

We can also derive an expression for the equation of state for this system in terms of the radial distribution function and the gradient of V(r):

$$PV = NkT \left[ 1 - \frac{\rho}{6kT} \int_V g(r)\, r\, \frac{dV(r)}{dr}\, d\mathbf{r} \right]. \tag{2}$$

In this equation k is the Boltzmann constant and P the pressure. The compressibility equation, Eq. (3), connects the isothermal compressibility, defined as $\beta_T = -V^{-1}(\partial V/\partial P)_T$, with the radial pair distribution function:

$$\rho k T \beta_T = 1 + \rho \int_V [g(r) - 1]\, d\mathbf{r}. \tag{3}$$

The liquid structure that is inherent in g(r) can also be described in terms of the structure factor S(k), which is a quantity used to describe neutron and X-ray scattering experiments. The elastic scattering of, for example, neutrons having a typical wavelength of 1 Å is determined by the local arrangement of the scattering atoms. The structure factor is connected to the radial distribution function by means of the equation

$$S(\mathbf{k}) = 1 + \rho \int_V \exp(-i\,\mathbf{k} \cdot \mathbf{r})\, [g(\mathbf{r}) - 1]\, d\mathbf{r}. \tag{4}$$

This equation shows that the structure factor can be expressed in terms of the Fourier transform in space of the radial distribution function. For isotropic liquids the radial pair distribution function is a function of the modulus r and the structure factor of the modulus of the vector k. If the liquid is anisotropic, we must use instead the full vectors r and k.
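For the isotropic case, the reduction of the three-dimensional Fourier transform to a one-dimensional integral over the modulus r can be written out explicitly. The following is a standard worked step, given here as a sketch in spherical polar coordinates with the polar axis chosen along k:

```latex
\begin{aligned}
S(k) &= 1 + \rho \int_0^{\infty} [g(r) - 1]\, r^{2}\, dr
        \int_0^{\pi} e^{-i k r \cos\theta} \sin\theta \, d\theta
        \int_0^{2\pi} d\varphi \\
     &= 1 + 4\pi\rho \int_0^{\infty} [g(r) - 1]\, \frac{\sin(kr)}{kr}\, r^{2}\, dr .
\end{aligned}
```

The angular integration is what turns the complex exponential into the real factor sin(kr)/(kr), so that S depends only on the modulus k.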

Actually, S(k) describes the liquid structure in k-space, which is the reciprocal of ordinary r-space. The role of this function in describing the structure of the liquid is determined by the character of the incident radiation, which is characterized by its wavevector ki, and that of the scattered radiation, characterized by ks. The conservation of momentum of the system, liquid + incident radiation + scattered radiation, leads to a scattering intensity for a given angle of observation that depends only on S(k), where the vector k is defined by

k = ks − ki. (5)

S(k) curves can be calculated by model theories that can then be tested against the experimental curves obtained from neutron scattering experiments.
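The connection between the vector k of Eq. (5) and the angle of observation can be made explicit for elastic scattering. The sketch below assumes that the moduli of ki and ks are equal, both being 2π/λ, and writes θ for the angle between them; the symbols θ and λ are introduced here for illustration only:

```latex
\begin{aligned}
k^{2} &= |\mathbf{k}_s - \mathbf{k}_i|^{2}
       = k_s^{2} + k_i^{2} - 2 k_s k_i \cos\theta
       = 2 k_i^{2} (1 - \cos\theta), \\
k &= 2 k_i \sin\frac{\theta}{2} = \frac{4\pi}{\lambda} \sin\frac{\theta}{2} .
\end{aligned}
```

Fixing the angle of observation at a given wavelength therefore fixes the modulus k at which S(k) is sampled.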

In the cases of linear (e.g., nitrogen and carbon disulfide) and tetrahedral molecules (e.g., yellow phosphorus and carbon tetrachloride), the diffraction methods have shown that the former indeed tend to align in the liquid, whereas the latter in some cases form interlocked structures. The structural information obtained by diffraction methods is important but far from complete, and the confirmation by model calculations is essential. The calculation of accurate structure factors from diffraction experiments is often hampered by correction problems and problems of interpretation. The use of isotopically substituted molecules has proved essential in obtaining the necessary data to calculate the more detailed atom–atom pair distribution functions of molecular liquids.

With increasing complexity of the molecules, the problems increase too. In the case of some more complex molecules, however, such as acetonitrile, chloroform, methylene chloride, and methanol, the diffraction methods have given structure factors that compare favorably with theoretical data. In particular, one can confirm the formation of hydrogen-bonded chainlike structures, which are expected from the physical properties of these substances and from some dynamic data. The above relations can be extended to describe more complex polyatomic molecular liquids if appropriate parameters for the description of the molecular coordinates (i.e., either the relative position of the center of mass r and the angles describing the orientation of the molecule or the set of parameters specifying the position of all the atoms in the molecule) are introduced. The radial pair distribution function then becomes a function of all these coordinates and is generally much too complex to calculate or even to visualize.

A very useful simplification of the description and hence the calculation of liquid structure using site correlation functions was introduced with the so-called RISM theory. This theory incorporates the chemical structure of the molecules into the model by approximating them to objects consisting of hard fused spheres modeling their chemical structures. The usefulness of this view in describing the pair correlation function stems from the fact that it is mainly the shape of the molecule that is critical in determining the structure of the liquid.

If, on the other hand, the role of the long-range attractive part of the intermolecular potential is important (e.g., in the calculation of thermodynamic properties or in the case of strongly structured liquids such as water or some other polar liquids), different methods must be used, such as perturbation theories. These are based upon the assumption that we can split the intermolecular interactions into a simple, well-defined part and another part that is considered a small perturbation but that confers on the more complex system the specific properties we want to calculate.

The main problem lies in the question whether it is possible to find an adequate reference state leading to a known structure so that the interactions of the more complex liquid can be obtained by adding a perturbation term to the reference potential. The hard-sphere potential was often chosen as a reference, but its usefulness for polyatomic liquids has been seriously questioned.

The strengths and limitations of theoretical models that are used to obtain a quantitative description of liquid structure are often assessed by comparing the results with X-ray and neutron diffraction data and with the results of computer simulation calculations. The latter provide us with a method of calculating numerically the properties of model liquids consisting of molecules with a well-defined intermolecular potential. In molecular dynamics computer simulations, the classical trajectories of an ensemble of molecules are calculated by solving the equations of motion. The static properties (i.e., pair distribution functions, equations of state, and internal energy) are calculated as averages over a sufficiently large number of equilibrium configurations created from the trajectories. The value of computer simulation lies in the possibility of calculating separately the effects of different features of real molecules: shape, the potential energy parameters, the dipole moment, and several others. This proved to be very valuable for the understanding of the important factors affecting the structure of liquids. On the other hand, our incomplete knowledge of the intermolecular potential of real molecules prevents the computer simulations from giving an exact replica of the real liquid. Furthermore, due to the necessary restrictions in computer capacity, the ensembles that can be reasonably handled consist of 1000 molecules or fewer. The averages over such ensembles are considered to represent, to a sufficient degree of accuracy, statistical averages in a bulk liquid consisting of some 10^20 molecules. This entails problems of a statistical nature that have been only partially solved.
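How a pair distribution function is extracted from a stored configuration can be sketched as follows. The example is a minimal histogram estimate of g(r) in a cubic box with periodic boundaries; the function name, the box size, and the use of uniformly random (ideal-gas) positions in place of a real simulation trajectory are all assumptions made for illustration. For such uncorrelated positions the estimate should scatter around 1 at all distances.

```python
import math
import random

def radial_distribution(positions, box, n_bins=20):
    """Histogram estimate of g(r) for points in a cubic periodic box of side `box`."""
    n = len(positions)
    r_max = box / 2.0                  # the minimum-image convention holds up to box/2
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box * round(d / box)   # nearest periodic image along this axis
                d2 += d * d
            r = math.sqrt(d2)
            if r < r_max:
                # each pair is counted once for each of its two molecules
                counts[min(int(r / dr), n_bins - 1)] += 2
    rho = n / box ** 3
    g = []
    for b, c in enumerate(counts):
        shell = 4.0 / 3.0 * math.pi * ((b + 1) ** 3 - b ** 3) * dr ** 3
        g.append(c / (n * rho * shell))     # normalize by the ideal-gas expectation
    return dr, g

rng = random.Random(0)                      # fixed seed for reproducibility
box = 10.0
points = [(rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
          for _ in range(300)]
dr, g = radial_distribution(points, box)
outer = g[10:]                              # outer shells have the best statistics
print(f"mean g(r) over outer shells: {sum(outer) / len(outer):.3f}")
```

In an actual simulation the same histogram would be accumulated over many equilibrium configurations, which is the averaging step described above.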

It is fair to say that our knowledge of the structure of liquids has advanced in the last two decades, the essential mechanisms determining liquid structure being understood in principle. What is still lacking is an accurate knowledge of intermolecular potentials derived either from experiments in the liquid state or by ab initio quantum-mechanical calculations. Furthermore, we must be aware that most of our models are approximations and that we still are unable to predict whether a given approximation is adequate to give good thermodynamic, structural, or dynamical data. We also do not know why most of the currently used models give good results for data of one of the above-mentioned classes but poor results for another class.

III. DYNAMIC PROPERTIES OF LIQUIDS

The description of the equilibrium state of a liquid by means of the radial pair distribution function or the structure factor can be extended to include time-dependent properties of liquids. This can be done by the use of the Van Hove correlation function G(r, t), which has been introduced as a tool for the description of quasi-elastic neutron-scattering results. This function is both space- and time-dependent. In analogy to the definition of the static radial distribution function by means of Eq. (1), the Van Hove correlation function is defined as a time-dependent density–density correlation function:

$$G(\mathbf{r}, t) = \frac{\langle \rho(\mathbf{r}, t)\, \rho(\mathbf{0}, 0) \rangle}{\langle \rho(\mathbf{0}, 0) \rangle^{2}}, \tag{6}$$

where G(r, t) is the probability of finding a particle i in a region dr around a point r at time t, given that there was a particle j at the origin at time t = 0.

To separate the motion of particles in a laboratory-fixed frame of reference from the relative motion of the particles, it is convenient to separate G(r, t) into a self and a distinct part:

G(r, t) = Gs(r, t) + Gd(r, t). (7)

Figure 3 illustrates the behavior of Gs(r, t) and Gd(r, t) on three time scales. The time scales are considered with respect to the so-called structural relaxation time τ, which is defined as the average time required to change the local configuration of the liquid. In Fig. 3, the following cases are distinguished:

1. The time scale is short with respect to the structural relaxation time.

2. The time scale is similar to the structural relaxation time.

3. The time scale is long with respect to the structural relaxation time.

FIGURE 3 The shapes of (a) the self term and (b) the distinct term of the Van Hove space-time correlation function. At times that are short with respect to the structural relaxation time, Gs(r, t) is sharply peaked since the reference molecule did not have time to change its position significantly, whereas Gd(r, t) is zero due to the repulsion of the reference molecule at the origin, which does not allow another molecule to occupy the same position. With increasing time, as t becomes similar to τ, Gs(r, t) broadens because the probability of finding the reference molecule away from the origin increases. On the other hand, at small distances from the origin, Gd(r, t) increases as the probability of finding another molecule at the origin is no longer zero. Finally, at long times (t ≫ τ), the probability of finding a given molecule at a distance r is small and independent of the distance from the origin, and the probability of finding some molecule at r is 1.

One sees that at short times Gs is sharply peaked about r = 0, whereas Gd displays the characteristic oscillations similar to the time-independent radial pair distribution function in Fig. 1. At long times, both functions vary little in space and approach the steady-state value as the local distribution is nearly averaged out at times long compared to τ.

The described generalization leading to the space- and time-dependent Van Hove correlation function is readily extended to the structure factor, which then becomes frequency dependent. The use of a frequency-dependent dynamic structure factor S(k, ω) stems from the study of the spectra of scattered slow neutrons. The symbol ω stands for the angular frequency. Thermal neutrons are well suited to the study of the dynamics of liquids because their energy is comparable to kT, and the wavelength associated with the neutrons is comparable to intermolecular distances at liquid densities. The measurable derivative d²σ/dΩ dω of the differential cross section dσ/dΩ is directly related to S(k, ω) by the equation

$$\frac{d^{2}\sigma}{d\Omega\, d\omega} = b^{2} \left( \frac{k_1}{k_0} \right) S(k, \omega), \tag{8}$$

where σ is the total cross section and b the scattering length typical for the scattering atom (of the order of magnitude of the nuclear radius). If the molecule is heteronuclear, this relation has the form of a sum over all j atomic species with scattering lengths b_j. Ω represents the solid angle under which the scattered radiation is detected, and k0 and k1 are the moduli of the wavevectors of the neutrons before and after the scattering event, respectively.

The dynamic structure factor can be separated into a self and a distinct part, Ss(k, ω) and Sd(k, ω), corresponding to the self and distinct parts of the Van Hove correlation function. This separation acknowledges the fact that the molecular motion detected by neutron scattering involves both single-particle and collective motions. We can use two extreme models to describe the situation.

A. The Perfect Gas Model

The assumption of a free motion of the molecules with a mass M and the most probable velocity $v_0 = \sqrt{2kT/M}$ leads to the following expressions for the Van Hove correlation function:

$$G_s(r, t) = \left( \pi^{1/2} v_0 t \right)^{-3} \exp\left[ -\frac{r^{2}}{(v_0 t)^{2}} \right], \tag{9a}$$

$$G_d(r, t) = \rho. \tag{9b}$$

From this expression we can obtain the following expression for the dynamic structure factor for this model:

$$S(k, \omega) = \left( k v_0 \pi^{1/2} \right)^{-1} \exp\left[ -\frac{\omega^{2}}{(k v_0)^{2}} \right]. \tag{10}$$

In liquids, this limit of a free (i.e., collisionless) motion is realized in the limit r → 0 and t → 0 corresponding to k → ∞ and ω → ∞. Such behavior is approximated in scattering experiments with wavelengths significantly shorter than the average interparticle spacing of a few angstroms (thermal neutrons) and probing times shorter than the average time between successive collisions.

B. Single-Particle Motion in the Hydrodynamic Limit

The opposite extreme to this limit is obtained at long times and large distances corresponding to k → 0 and ω → 0. In this range the molecular interactions and not their masses determine the motion that is now monitored by the scattering of long-wavelength radiation (e.g., light scattering). This latter range of liquid dynamics corresponds to the hydrodynamic limit. In this limit the liquid can be treated as a continuum to which the hydrodynamic equations apply, the molecular details being formally introduced as extensions of the classical Navier–Stokes equations (e.g., by using a frequency-dependent viscosity to take care of molecular relaxation processes). In the hydrodynamic limit (i.e., when r and t are sufficiently large), the single-particle correlation function obeys a diffusion equation similar to Fick’s differential diffusion equation. Under appropriate boundary conditions, we obtain the following integrated form:

$$G_s(r, t) = (4\pi D t)^{-3/2} \exp\left( -\frac{r^{2}}{4Dt} \right). \tag{11}$$

This is also Gaussian, differing, however, in the time dependence from the case of the free motion in Eq. (9a). The corresponding dynamic structure factor is given by

$$S_s(k, \omega) = \frac{1}{\pi}\, \frac{D k^{2}}{\omega^{2} + (D k^{2})^{2}}. \tag{12}$$

This expression represents the spectrum of the scattered intensity at a fixed value of the wavevector (i.e., at a fixed angle of observation).
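The different time dependence of the two Gaussian forms of Gs can be made concrete by computing the mean-square displacement, i.e., the second radial moment of Gs. The sketch below integrates both forms numerically: the diffusive form of Eq. (11), and a free-particle Gaussian of width v0·t, which is the reading of the perfect-gas expression assumed here. The helper names and the numerical values of D and v0 are illustrative only.

```python
import math

def gs_diffusive(r, t, D):
    """Eq. (11): self correlation function in the hydrodynamic (diffusive) limit."""
    return (4.0 * math.pi * D * t) ** -1.5 * math.exp(-r * r / (4.0 * D * t))

def gs_free(r, t, v0):
    """Free-particle Gaussian of width v0*t (assumed perfect-gas form)."""
    a = v0 * t
    return (math.sqrt(math.pi) * a) ** -3 * math.exp(-(r / a) ** 2)

def radial_moment(gs, power, r_max, n=20000):
    """Trapezoidal integral of r**power * gs(r) * 4*pi*r**2 from 0 to r_max."""
    h = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        f = r ** power * gs(r) * 4.0 * math.pi * r * r
        total += f if 0 < i < n else 0.5 * f   # half weight at both end points
    return total * h

D, v0 = 1.0, 1.5     # illustrative values in arbitrary units
for t in (1.0, 2.0, 4.0):
    msd_diff = radial_moment(lambda r: gs_diffusive(r, t, D), 2, r_max=60.0)
    msd_free = radial_moment(lambda r: gs_free(r, t, v0), 2, r_max=60.0)
    print(f"t = {t}: diffusive <r^2> = {msd_diff:.3f} (6Dt = {6 * D * t:.3f}), "
          f"free <r^2> = {msd_free:.3f} (1.5 (v0 t)^2 = {1.5 * (v0 * t) ** 2:.3f})")
```

Both functions are normalized to unity, but the mean-square displacement grows linearly in t in the diffusive case and quadratically in t for free motion.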

The time correlation function formalism has been shown to be adequate for representing liquid dynamics in a convenient way. Thus, some experimental methods such as photon correlation spectroscopy directly give time correlation functions; others such as infrared and Raman bandshape analysis operate in the frequency domain, and the obtained spectra can be Fourier transformed to give time correlation functions. Figure 4 visualizes the relation between the two domains. This procedure is based on the fluctuation-dissipation theorem of statistical mechanics, which connects random thermal fluctuations in a medium to the power spectrum characterizing the frequency spectrum of the process. A time correlation function of a dynamical molecular variable A(t) (e.g., a dipole moment) is defined by

C(t) = 〈A(0)A(t)〉. (13)

FIGURE 4 Relation between (a) the power spectrum (frequency domain) of a relaxation process and (b) the correlation function (time domain) of a dynamical variable describing this process. In dynamic spectroscopy, the half width at half height of the spectral line is measured, and the correlation function is obtained by Fourier transforming the spectral profile. The relaxation time can be directly calculated from the linewidth by the relation indicated in the figure if the spectral profile is a Lorentzian. In photon correlation spectroscopy, which operates in the time domain, a correlation function is directly measured.

The correlation time τ of a process described by the above correlation function is defined as the integral of the correlation function over the time from t = 0 to t = ∞. If the process described by the above correlation function is diffusive, then the correlation function is exponential:

$$C(t) = C(0) \exp\left( -\frac{t}{\tau} \right). \tag{14}$$

The time constant τ describing the decay of the correlation is termed a relaxation time. The corresponding spectrum I(ω) has a Lorentzian shape given by

$$I(\omega) = \frac{1}{1 + (\omega - \omega_0)^{2} \tau^{2}}. \tag{15}$$

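The Lorentzian bandshape as indicated in the figure is characterized by the relaxation time, which can thus be extracted from the half width at half height of the spectral band, written here as Δν_{1/2} and expressed in frequency units, by

$$\tau = \frac{1}{2\pi\, \Delta\nu_{1/2}}. \tag{16}$$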

The spectral band function I(ω) and the corresponding correlation function C(t), as illustrated in Fig. 4, are a Fourier transform pair (i.e., they can be uniquely transformed into each other).
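The Fourier-pair relation between an exponential correlation function and a Lorentzian band can be checked numerically. The sketch below takes the cosine transform of C(t) = exp(−t/τ) by simple trapezoidal quadrature and compares it with the analytic Lorentzian τ/(1 + ω²τ²), i.e., Eq. (15) centered at ω0 = 0 and not normalized to unit height. The function name, the cutoff of 40 relaxation times, and the value of `tau` are assumptions made for illustration.

```python
import math

def spectrum(omega, tau, t_max_factor=40.0, n=40000):
    """Trapezoidal cosine transform of C(t) = exp(-t/tau) over 0 <= t <= t_max_factor*tau."""
    t_max = t_max_factor * tau
    h = t_max / n
    # end points carry half weight; C(0) * cos(0) = 1
    total = 0.5 * (1.0 + math.exp(-t_max / tau) * math.cos(omega * t_max))
    for i in range(1, n):
        t = i * h
        total += math.exp(-t / tau) * math.cos(omega * t)
    return total * h

tau = 2.0                                    # relaxation time, arbitrary units
peak = spectrum(0.0, tau)                    # analytic value: tau
half = spectrum(1.0 / tau, tau)              # analytic value: tau / 2
print(f"I(0) = {peak:.5f}, expected {tau:.5f}")
print(f"I(1/tau) / I(0) = {half / peak:.5f}, expected 0.5")
```

The band falls to half its peak height at an angular-frequency offset of 1/τ, which is the relation that allows a relaxation time to be read from a measured half width.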

The macroscopic transport coefficients such as the mass diffusion coefficient, the thermal conductivity coefficient, and the macroscopic shear viscosity have been related to the time integral of pertinent correlation functions. Thus, the mass diffusion coefficient D is given by

$$D = \frac{1}{3} \int_0^{\infty} \langle \mathbf{v}(0) \cdot \mathbf{v}(t) \rangle\, dt. \tag{17}$$

In this equation v(t) is the molecular velocity at time t.

We presently have at our disposal several sources of information about the molecular dynamics of liquids. Among them the most important experimental techniques are Rayleigh and Raman light scattering; infrared and far infrared spectroscopy; NMR spectroscopy; fluorescence anisotropy methods, either stationary or time dependent; and time-dependent spectroscopy from the nanosecond to the femtosecond time scale. On the other hand, one of the most important sources of dynamical information on liquids is the computer simulation by means of molecular dynamics. The method aiming at extracting dynamical information from the shape of spectra is termed dynamical spectroscopy. The dynamical information contained in spectral bandshapes is in most cases complex, since the spectra reflect rotational, translational, and vibrational broadening mechanisms that cannot be uniquely sorted out. Each spectroscopic method has its strengths and its inherent restrictions, which is why most of the progress has been obtained by the simultaneous application of several methods on the same liquid. One of the most serious restrictions is that some methods (e.g., Rayleigh scattering) probe collective motions that are very difficult to relate exactly to single-molecule motions. On the other hand, collective motions, essential to the understanding of liquid dynamics, are of great interest in themselves. At the other extreme, NMR data probe essentially single-molecule motions; however, in contrast to optical spectroscopic methods, they do not give all the information contained in full time correlation functions but only correlation times.
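The difference between a full time correlation function and the single numbers derived from it (a correlation time, or a transport coefficient as in Eq. (17)) can be illustrated numerically. The sketch below assumes a single-exponential velocity autocorrelation function, ⟨v(0)·v(t)⟩ = 3(kT/M) exp(−t/τ_v), which is a model chosen here for simplicity and not a result from the text; real liquids show more structure. The names `vacf`, `trapezoid`, `kT_over_M`, and `tau_v` are hypothetical, and reduced units are used.

```python
import math

def vacf(t, kT_over_M=1.0, tau_v=0.5):
    """Model velocity autocorrelation function 3*(kT/M)*exp(-t/tau_v).

    ASSUMPTION: single-exponential decay, used only as an illustration.
    """
    return 3.0 * kT_over_M * math.exp(-t / tau_v)

def trapezoid(f, t_max, n=50000):
    """Trapezoidal integral of f from 0 to t_max."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max))
    for i in range(1, n):
        total += f(i * h)
    return total * h

integral = trapezoid(vacf, t_max=30.0)       # 60 decay times; the tail is negligible
diffusion = integral / 3.0                   # Eq. (17)
correlation_time = integral / vacf(0.0)      # integral of the normalized function
print(f"D = {diffusion:.5f} (analytic (kT/M) * tau_v = 0.5)")
print(f"correlation time = {correlation_time:.5f} (analytic tau_v = 0.5)")
```

Many different decay shapes share the same integral, which is why a method that yields only correlation times carries less information than one that yields the full correlation function.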

In all the dynamical methods, we must be aware of instrumental restrictions with regard to the accessible time scale. The extension of the time scale over as many decades as possible is vital to the understanding of the underlying molecular mechanism and is one of the main experimental goals in this area. For this reason it is important to obtain reliable data with different methods on time scales that complement each other and can be properly adjusted to each other. This is, for example, the case with the simultaneous study of the depolarized light scattering and fluorescence anisotropy of the same label molecule dissolved in a liquid, which can be used to monitor molecular rotations on a time scale extending from a few picoseconds to approximately 1 µs (i.e., over six decades).

In recent years the use of ultrashort laser pulses has become an increasingly important tool for the study of liquid dynamics. By this method it has become possible to study directly in the time domain processes taking place in the picosecond and subpicosecond time range. These include orientational processes in liquids, the rates of charge transfer processes, the rate of recombination of ions to molecules in a liquid cage, and a number of solvent-dependent photophysical processes. Thus, vibration–rotation coupling and the rates of vibrational energy and phase relaxation were studied by picosecond spectroscopy, and the corresponding rates could be determined in some cases. Although the data obtained in the time domain and those in the frequency domain are rigorously linked, the former sometimes allow us to circumvent serious instrumental complications. In combination with spectroscopic lineshape analysis, real-time techniques have significantly improved our understanding of liquid-state dynamics.

Intermolecular dynamics is manifested in so-called interaction-induced spectra. This phenomenon, which leads to the occurrence of forbidden spectral lines appearing in high-density gases and in liquids, has been studied extensively in recent years and has been shown to be helpful in obtaining information mainly about the short-time dynamics of liquids. The main mechanism by which these spectra are produced is the induction of a time-dependent dipole on a molecule by electric fields of other molecules in its immediate neighborhood and the interaction of this dipole with the electric field of the light. The intermolecular inducing fields may be coulomb, dipole, or higher-multipole fields, as the case may be. Furthermore, high-energy collisions produce distortions of the colliding molecules, thus inducing transient dipoles that also contribute to interaction-induced spectra. It appears that the wealth of information concealed in interaction-induced spectra is presently the main problem encountered in the analysis of such data.

IV. MOLECULAR INTERACTIONS AND COMPLEX LIQUIDS

The structures as well as the dynamics of liquids in equilibrium are determined by interactions of the molecules. Knowledge of these interactions is essential in obtaining a theoretically founded description of the physics of liquids. However, it is still very difficult to carry out quantum mechanical ab initio calculations of the intermolecular potential, although the basic understanding of the interactions between molecules is available. Thus, such calculations have been fruitful only for a small number of molecules consisting of a relatively small number of atoms.

The alternative to theoretical calculations is to determine intermolecular potentials by accurate gas-phase experiments that probe essentially two-particle interactions. However, this has also proved at least ambiguous since it is generally not possible to use unmodified gas-phase potentials for liquids. On the other hand, the reverse method of determining intermolecular potentials from liquid-state data is also unrewarding because the method is model-dependent and, additionally, because most measurable quantities present themselves as integrals over ensembles of molecules from which the integrand cannot be uniquely determined.

For the present we must use empirical potentials that represent averages over those effects that cannot yet be explicitly taken into account. A major problem that is still unsolved is the calculation of many-particle interactions in an ensemble of interacting molecules. This is critical for the description of a liquid since at liquid densities, due to the small distances between interacting molecules, we are, in principle, not allowed to express the total interaction in the liquid as a sum of pairwise additive interactions, neglecting the many-body character of the problem. Some experimental results have been interpreted by assuming that many-particle interactions are important; however, such interpretations are still far from being quantitative.

V. COMPARTMENTED LIQUIDS

The nature of molecular interactions is very important in determining the structure and the properties of complex liquids. Many molecules consist of two different parts, the one interacting strongly with water (the hydrophile) and the other not (the hydrophobe). In aqueous solutions we observe compact or lamellar liquid aggregated structures, depending on the nature of the solute, the temperature, and the concentration. The micelles thus formed are assumed to have a rather compact hydrophobic core surrounded by a hydrophilic shell. The hydrophiles are generally ionic or polar groups, whereas the hydrophobe is often an aliphatic chain. The driving force of aggregation in water of such molecules, called amphiphiles or detergents, is the minimization of the total free energy resulting from contributions from the water–water, water–hydrophile, and hydrophile–hydrophile interactions. The phase diagram of amphiphile–water mixtures displays several distinct phases with quite different properties, depending on the temperature and the concentration of its constituents. The structure of the resulting aggregates is a function of the nature of the hydrophile and of the length of the hydrophobic chain (see Fig. 5). Also the structure of the liquid within the micellar core seems to be in some cases different from that in bulk liquids, with the effect that solubilized species may display a specific behavior as regards reactivity, acidity, mobility, etc.

Another example of compartmentation of practical importance is the case of microemulsions, which are formed when water, oil, and a detergent are mixed in appropriate proportions. Such systems are used to solubilize otherwise insoluble substances and to promote chemical reactions by capturing the reactants in their interior and thus increasing the local concentration of the reactants. Catalytic reactions in micelles and microemulsions play an increasing role in chemistry.

The occurrence of localized compartmented liquidlike phases is a very important phenomenon and plays a major role in biological systems where both fluidity and compartmentation are essential. The internal fluidity of compartmented liquid phases is studied intensively by spectroscopic methods such as ESR spectroscopy and fluorescence anisotropy decay of convenient dissolved or chemically bound labels.

FIGURE 5 The structure of a micelle in water.

Even in the absence of distinct phases, molecules that consist of groups with different affinities to the solvent give rise to more localized and less randomized structures that affect the physical and chemical properties of liquid mixtures. Hydrogen-bonding molecules can belong to this category. Charge transfer interactions may affect in a similar fashion the local structure of a liquid mixture.

VI. GLASS-FORMING LIQUIDS

Amorphous substances with a solidlike rigidity play an important technological role, and their study is one of the major fields in materials science. Such substances are generally obtained when a liquid is cooled below its melting point while preventing crystallization. Glass-forming liquids (i.e., those that can be obtained in the glassy state) must have special properties connected with the symmetry, configuration, and flexibility of the molecules or their ability to form intermolecular bonds. Polymeric liquids are among the best studied glass-forming systems.

The properties of liquids at different temperatures can basically be understood in terms of the kinetic energy and the intermolecular potential of the molecules. In some cases, however, during the process of cooling, the viscosity of a liquid increases by several orders of magnitude in a rather narrow temperature range. The explanation usually given for this extreme slowing down of most of the molecular dynamics is that in such cases the thermal energy kT becomes similar to or smaller than the intermolecular potential energy required by the molecules to accommodate in the respective equilibrium configuration. Thus, the source of the high viscosity is the freezing of intramolecular configurations, while at sufficiently high temperatures the molecules are able to move past each other, allowing local stresses to be relieved at a much faster rate. In the case of nonrigid molecules such as most polymers this process is supported by the adaptation of the molecule to the constraints produced by the environment and external forces. In the high-viscosity state, on the other hand, intramolecular barriers may prevent the molecule from undergoing configurational changes, this process leading to an increasing rigidity of the molecule itself. As the relaxation of the undercooled liquid to thermodynamic equilibrium becomes slower than the cooling rate, instead of crystal formation we observe the formation of a rigid amorphous glass. The temperature $T_g$ at which this occurs is known as the glass point. Several authors define the glass point as the temperature at which the time scale of the molecular motions pertinent to the relaxation to equilibrium becomes macroscopic (i.e., of the order of seconds to hours). Another definition stresses the aspect that glass formation can be viewed as a thermodynamic phase transition.

Macroscopically, glasses can be distinguished from ordinary liquids by the presence of elasticity. We express this by saying that glasses respond to external stress (mainly shear stress) predominantly by an elastic mechanism (full recovery after the stress has been relieved) while liquids respond by a viscous mechanism (no recovery after the stress has been relieved). Actually, both states of matter display viscoelasticity (i.e., viscous as well as elastic response); however, this is generally observed only in the intermediate cases where the response changes from predominantly viscous to predominantly elastic. The plot of the elasticity modulus versus temperature in Fig. 6 shows the transition from the glassy to the rubbery and from there to the liquid state.

The observation of viscoelastic behavior is, of course, a matter of the time scale of the experimental technique used to study the dynamics. Thus, the elastic response is apparent only when the time scale of the deformation is comparable to the time required for molecules to accommodate in the new equilibrium configuration. Viscoelastic properties of liquids and glasses are studied by measuring mechanical, ultrasonic, and rheological quantities such as various elastic moduli and viscosity coefficients. Furthermore, several spectroscopic techniques such as dielectric relaxation, time-dependent Kerr effect, and light-scattering spectroscopy have been applied successfully to the study of glass-forming systems. By these methods we obtain a characteristic relaxation time resulting from an exponential decay of some property such as dielectric polarization or from mechanical deformation. In many cases, however, the data indicate the presence of more than one relaxation process; these have often been described in terms of a slow α-process, a faster β-process, and other processes indicated by the Greek letters γ, δ, and so on.

FIGURE 6 Plot of the elasticity modulus versus temperature, showing the transition from the glassy to the rubbery to the liquid state.

The α-process especially is crucial in determining the mechanical properties of glass-forming systems, and attempts are made to synthesize molecules with a given temperature dependence of the α-relaxation process. This would allow us to obtain materials with definite useful mechanical properties in a given temperature range. The interpretation of these processes at a molecular level, however, is still an unsolved problem. Many attempts to rationalize the data in a semiphenomenological manner are based on the concept of the free volume available to the molecular motion and hence to the relaxation of nonequilibrium configurations. All these phenomena, which have been extensively studied in polymer melts, are also observed in several glass-forming low-molecular-weight liquids (e.g., o-terphenyl, decalin, salol, and polyalcohols).

An important observation in supercooled liquids is the dependence of the physical properties of these substances on the thermal history of the sample. The question whether glass formation is the effect of kinetic constraints only, or whether other factors play a role, is still open. The existence of metallic glasses and the observation of a glassy phase in computer simulations of molecules as simple as argon may be important clues to this question.

VII. GELS

When a low-molecular-weight liquid is dissolved in a high-molecular-weight system (the stationary component), which often is a cross-linked polymer, under certain conditions we observe the formation of a gel. This is a macroscopically homogeneous liquid with high internal mobility but no macroscopic steady-state flow. Gel formation requires the presence of more or less stable cross-links to prevent viscous flow as well as a fluid component that must be a good solvent for the stationary component.

Figure 7 displays schematically the structure of a gel.

FIGURE 7 The structure of a gel. The points represent the molecules of the liquid. The cross-linked polymer is represented by the lines.

VIII. DYNAMICS IN COMPLEX LIQUIDS

One of the dynamical problems studied theoretically and experimentally very extensively is the rotational motion of molecules in liquids. The molecular rotation in gases is solely determined by the kinetic energy and the moment(s) of inertia of the molecule. Under the action of frequent random collisions with other molecules, the rotation of a particular molecule in a liquid is continuously perturbed, and this is reflected in an exponential time dependence of the correlation function of the orientation of the molecular axes. Generally, at short times of the order of 0.1 ps and less, the motion is determined by the molecular moment of inertia and the temperature, whereas at longer times, the motion is determined by angular momentum exchange due to the frequent collisions with other molecules. This can be described by a friction exerted on the rotating molecule by its neighbors. It was shown by Debye that under certain simplifying assumptions this so-called rotational diffusion can be described by a relaxation time $\tau_{OR}$ which is connected to the macroscopic viscosity $\eta$ of the liquid:

$$\tau_{OR} = \frac{\eta V}{kT}.$$

In this equation $V$ is a characteristic volume, the hydrodynamic volume of the molecule. If the shape and size of the rotating molecule are known, this relation can be used to probe the local viscosity in liquid systems (e.g., in micelles and membranes). Such a local viscosity can be different from the macroscopic viscosity and is accessible only through measurements done on label molecules.
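The relation between relaxation time, viscosity, and hydrodynamic volume can be illustrated with a minimal numerical sketch. The function names, the spherical label of radius 0.5 nm, the viscosity of 1.0e-3 Pa s, and the temperature of 293 K are illustrative assumptions, not values taken from the text.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann constant in J/K (SI value)


def rotational_relaxation_time(viscosity, hydrodynamic_volume, temperature):
    """Debye relation tau_OR = eta * V / (k * T); all inputs in SI units."""
    return viscosity * hydrodynamic_volume / (BOLTZMANN_K * temperature)


def local_viscosity(tau_or, hydrodynamic_volume, temperature):
    """Invert the same relation to estimate the viscosity sensed by a label."""
    return tau_or * BOLTZMANN_K * temperature / hydrodynamic_volume


# Hypothetical label molecule: a sphere of radius 0.5 nm (assumed).
radius = 0.5e-9                              # m
volume = 4.0 / 3.0 * math.pi * radius**3     # m^3
tau = rotational_relaxation_time(1.0e-3, volume, 293.0)
print(f"tau_OR = {tau * 1e12:.0f} ps")
```

Used in the inverse direction, a measured relaxation time of a label of known size gives an estimate of the local viscosity, which is the way the relation is applied to micelles and membranes.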

Since molecular labels are convenient indicators of the local microdynamics of the liquid in their neighborhood, they can also be used to test theoretical models of liquid-state dynamics. The experimental methods currently used are NMR relaxation, Raman linewidth measurements, dynamic light-scattering spectroscopy, fluorescence anisotropy, and dielectric relaxation. The theory of rotating molecules in a liquid medium interacting in an uncorrelated random fashion with the surroundings has been described by models amenable to analytical calculations in the case of simple liquids. The quantities that enter the calculation are molecular moments of inertia, molecular masses, and intermolecular forces. In the case of more complex liquids, the assumption of a diffusive motion in a continuum is made, and the parameters of the model are hydrodynamic quantities that can be compared with the corresponding macroscopic data (e.g., the macroscopic shear viscosity).

The translational motion of very large molecules in liquids, such as diluted polymers or supramolecular aggregates like micelles and microemulsions, has been studied by light-scattering methods to obtain information about the molecular weight and size of the diffusing entity, its polydispersity, the interactions with other species (e.g., ions) or, at higher concentrations, interactions between the diffusing molecules themselves, and, finally, the internal flexibility and the rate of configurational changes.

IX. CONCLUSIONS

The liquid state includes a large number of phenomenologically very different systems such as simple liquids, micelles, microemulsions, polymer melts, liquid crystals, and gels. All these systems have in common (1) a rather high molecular mobility which may, however, be restricted in different ways depending on the system, and (2) molecular disorder which may also be restricted in different ways. The study of the liquid state involves most of the modern physical methods, and the theory of its molecular aspect requires elaborate statistical mechanical methods. The study of the liquid state is progressing at a rapid rate although several basic problems still remain unanswered.

SEE ALSO THE FOLLOWING ARTICLES

FLUID DYNAMICS • GLASS • HYDROGEN BOND • LIQUID CRYSTALS (PHYSICS) • MICELLES • MOLECULAR HYDRODYNAMICS • PERMITTIVITY OF LIQUIDS • POTENTIAL ENERGY SURFACES • RHEOLOGY OF POLYMERIC LIQUIDS • X-RAY SMALL-ANGLE SCATTERING

Encyclopedia of Physical Science and Technology EN009B-414 July 19, 2001 18:46

Mechanics, Classical

A. Douglas Davis
Eastern Illinois University

I. Kinematics
II. Newton’s Laws of Motion
III. Applications
IV. Work and Energy
V. Momentum
VI. Rigid Body Motion
VII. Central Forces
VIII. Alternate Forms

GLOSSARY

Conservation Certain quantities (e.g., energy and momentum) remain the same for a system before, during, and after some interaction (often a collision). Such quantities are said to be conserved.

Dynamics Explanation of the cause of motion. This involves forces acting on massive bodies and the motion that ensues.

Energy Ability to do work; stored-up work.

Kinematics Description of motion.

Momentum Mass multiplied by the velocity. Momentum is a vector.

Statics Study of forces acting on bodies at rest.

Work Distance a body moves multiplied by the component of force in the direction of the motion. Work is a scalar.

CLASSICAL MECHANICS is the study of ordinary, massive objects, for example, the study of objects roughly the size of a bread box traveling at roughly sixty miles an hour. It is to be distinguished from quantum mechanics, which deals with particles or systems of particles that are extremely small, and it should also be distinguished from relativity, which deals with extremely high velocities.

Classical mechanics can be divided into statics, kinematics, and dynamics. Statics is the study of forces on a body at rest. Kinematics develops equations that merely describe the motion without regard to its cause. Dynamics seeks to explain the cause of the motion.

I. KINEMATICS

Motion of an object must be described in terms of its position relative to some reference frame. If the motion occurs in one dimension (as along a straight highway or railroad track) the position will usually be written as $x$. If the motion occurs in two or three dimensions (as an airplane circling an airport or a spacecraft on its way to Jupiter) the position will be written as $\mathbf{r}$. The position is a vector. For the one-dimensional case, the vector nature of the position shows up as the sign of $x$. For example, the position may be considered positive to the right; then it will be negative to the left. Position is commonly measured in meters. Of course, position may also be measured in kilometers, centimeters, feet, or miles as the need arises.

Velocity is the time rate of change of the position of an object. It can be written as

$$v = \frac{\Delta x}{\Delta t}$$

or

$$v = \frac{dx}{dt}$$

for the one-dimensional case or, for the three-dimensional case, as

$$\mathbf{v} = \frac{\Delta \mathbf{r}}{\Delta t}$$

or

$$\mathbf{v} = \frac{d\mathbf{r}}{dt},$$

where $x$ or $\mathbf{r}$ is the position of the object of interest. Velocity describes how fast the object is moving and in which direction. That means that velocity is a vector. Speed is the magnitude (just the “how fast” without the direction) of velocity. Speed is a scalar. Both are commonly measured in m/s.

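Acceleration is the time rate of change of the velocity of an object. It can be written as

$$a = \frac{\Delta v}{\Delta t}$$

or

$$a = \frac{dv}{dt}$$

for the one-dimensional case or, for the three-dimensional case, as

$$\mathbf{a} = \frac{\Delta \mathbf{v}}{\Delta t}$$

or

$$\mathbf{a} = \frac{d\mathbf{v}}{dt}.$$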

Acceleration is commonly measured in meters per second per second (m/s²). An acceleration of 10 m/s² means that the velocity increases by 10 m/s every second. Other systems of units could be used. For example, automotive engineers may find it useful to express a car’s acceleration in miles/h/s. An acceleration of 4.3 miles/h/s means that a car’s velocity increases by 4.3 miles/h every second.

A. Constant Acceleration

If the position is known as a function of time, then the velocity and acceleration are quite easy to determine by applying their definitions. However, it is more usually the case that the acceleration is known and the velocity and position are wanted. For constant acceleration, $a$, in one dimension the velocity and position at some time $t$ can be found from

$$v = v_0 + at$$

$$x = x_0 + v_0 t + \tfrac{1}{2} a t^2,$$

where $x_0$ is the initial position at $t = 0$ and $v_0$ is the initial velocity at $t = 0$. Often it is useful to determine the velocity at some position, rather than at some time. These two equations can be solved to provide

$$v^2 = v_0^2 + 2a(x - x_0).$$
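The three constant-acceleration relations can be checked against each other numerically. The following is a minimal sketch; the function names and the sample values (start at the origin, initial velocity 5 m/s, acceleration 2 m/s², elapsed time 3 s) are assumptions chosen only for illustration.

```python
def velocity(v0, a, t):
    """v = v0 + a t for constant acceleration a."""
    return v0 + a * t


def position(x0, v0, a, t):
    """x = x0 + v0 t + (1/2) a t^2 for constant acceleration a."""
    return x0 + v0 * t + 0.5 * a * t**2


# Illustrative (assumed) values in SI units.
x0, v0, a, t = 0.0, 5.0, 2.0, 3.0
v = velocity(v0, a, t)
x = position(x0, v0, a, t)

# The time-free relation v^2 = v0^2 + 2 a (x - x0) should agree.
lhs = v**2
rhs = v0**2 + 2.0 * a * (x - x0)
print(v, x, lhs, rhs)
```

Both sides of the last relation agree because it is obtained by eliminating $t$ between the first two equations.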

B. Nonconstant Acceleration

Acceleration is connected to position through a second-order differential equation

$$a = \frac{d^2 x}{dt^2}$$

or

$$\mathbf{a} = \frac{d^2 \mathbf{r}}{dt^2},$$

so the solution of $x(t)$ or $\mathbf{r}(t)$ from $a$ or $\mathbf{a}$ may, indeed, be rather difficult. If the acceleration is known as a function of time, $a(t)$, then it may be integrated directly to yield

$$v(t) = v_0 + \int_0^t a(t')\, dt'$$

and

$$x(t) = x_0 + \int_0^t v(t')\, dt'.$$

A few other, special cases exist, which may be solved directly. In general, though, ideas from dynamics, such as energy conservation or momentum conservation, are usually necessary in solving for or understanding the motion of an object when its acceleration is not constant.
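When $a(t)$ is known but the integrals are inconvenient to do by hand, they can be approximated numerically. The sketch below uses the trapezoidal rule, which is one common choice and not a method prescribed by the text; the function name, the step count, and the test acceleration $a(t) = 6t$ are assumptions made for illustration.

```python
def integrate_motion(accel, x0, v0, t_end, steps=1000):
    """Integrate a(t) twice with the trapezoidal rule.

    accel is a function of time returning the acceleration.
    Returns the approximate (x, v) at time t_end.
    """
    dt = t_end / steps
    x, v = x0, v0
    for i in range(steps):
        t = i * dt
        # Velocity update: average acceleration over the step times dt.
        v_new = v + 0.5 * (accel(t) + accel(t + dt)) * dt
        # Position update: average velocity over the step times dt.
        x += 0.5 * (v + v_new) * dt
        v = v_new
    return x, v


# Assumed example: a(t) = 6 t with x0 = v0 = 0, so that analytically
# v(t) = 3 t^2 and x(t) = t^3.
x_end, v_end = integrate_motion(lambda t: 6.0 * t, x0=0.0, v0=0.0, t_end=2.0)
print(x_end, v_end)
```

For this test case the exact results at $t = 2$ are $v = 12$ and $x = 8$, so the printed values give a direct sense of the discretization error.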

II. NEWTON’S LAWS OF MOTION

A. Inertia

Once a car is moving a braking system is necessary to bring it back to a stop. Or a book lying on a desk requires a push or a shove from the outside to start it moving. Both of these situations are examples of inertia. Because of inertia, an object tends to continue to do what it is presently doing. This seems to have been first understood by Galileo and was first clearly stated by Sir Isaac Newton in the first of his three laws of motion: “In the absence of forces from the outside, a body at rest will remain at rest and a body in motion will continue in motion along the same straight line with the same velocity.”

Friction is a force that is nearly always present and sometimes masks this idea of inertia. If a book is given a shove across a table it may stop before reaching the edge. The law of inertia, Newton’s first law of motion, is still valid. But there is a force from the outside, the force of friction. An ordinary car will not coast forever; it will eventually come to rest. But there are forces from the outside which cause this (namely, friction due to the air and friction between tires and roadway).

The idea of inertia is important because it asserts that motion continues because of the motion present. There need not be an active, continuing agent present at all times as the motion continues.

B. Force (F = ma)

While inertia is important, motion is far more interesting when there is a force present. If there are several forces acting on a body, it is the net force (the vector sum of all the forces) that is important. Newton’s second law of motion states that “in the presence of a net or unbalanced force a body will experience an acceleration. That acceleration is inversely proportional to the mass of the body and is directly proportional to the force and in the direction of the force.”

This can be written as $\mathbf{a} = \mathbf{F}/m$ although it is more commonly written as $\mathbf{F} = m\mathbf{a}$. It is little exaggeration to say that almost all of classical mechanics derives directly from Newton’s second law.

Velocity and acceleration are easily and often confused. Most people are more familiar with velocity or speed. However, it is the acceleration that is of most use in determining and describing motion and its cause.

A force is a push or a pull. Force is anything that causes an acceleration. Newton’s second law can be used as the definition of a force.

Newton’s second law also provides an operational definition of the mass of an object. It is a measure of how much “stuff” there is in an object. It is a measure of how difficult it is to accelerate an object. By definition, a particular block of platinum–iridium alloy has been designated to have a mass of exactly 1 kg. If the same force is applied to this (or an identical) block and to another block and the other block is found to have an acceleration exactly one-half that of the standard block, then the other block’s mass is 2 kg.

The mass of an object is always the same. It is independent of altitude or position. As we shall see, there is an important distinction between mass and weight.

In the metric system (or SI units), mass is measured in kilograms, force in newtons, and acceleration in m/s². A force of one newton causes a 1 kg mass to accelerate at 1 m/s².

C. Action–Reaction

Newton’s third law of motion states that “if object 1 exerts a force on object 2 then object 2 also exerts a force back on object 1. The two forces are identical in magnitude and opposite in direction.”

An example of this is the force you exert down on a chair when you sit on it and the force the chair exerts up on you. When an airplane propeller pushes back on the air, the air pushes forward on the propeller. As the sun pulls on earth, earth also pulls back on the sun. It is impossible to exert a force on an object without an additional force being exerted by that object. Notice that the forces in question are always exerted on different objects.

III. APPLICATIONS

A. Straight-Line Motion

Any constant force produces a constant acceleration so the kinematic equations for constant acceleration are immediately usable.

1. Free Fall

Freely falling objects near the earth’s surface are found to have a constant acceleration of 9.8 m/s² (or 32 ft/s²) downward if air resistance can be neglected. This acceleration is usually labeled “g”; that is, g = 9.8 m/s² = 32 ft/s². To produce the same acceleration, the forces on two different bodies must be proportional to the masses. That means that the force of gravity must be proportional to the mass of a body. This force of gravity is called weight $W$ and

$$W = mg.$$

The kinematics equations that describe an object in free fall, then, are simply

$$v = v_0 - gt$$

$$x = x_0 + v_0 t - \tfrac{1}{2} g t^2,$$

where the acceleration $a$ has just been replaced with $-g$ (the minus sign merely indicates downward).

2. Simple Harmonic Motion

A spring exerts a linear restoring force. As a spring is stretched or compressed, the force it exerts is proportional to how far it has been stretched or compressed from its unstretched, uncompressed, equilibrium position. And the force is directed to move the stretched or compressed spring back to that equilibrium position. This force can be described by the equation

$$F = -kx,$$

where $x$ is the displacement from equilibrium (positive for stretch and negative for compression), $k$ is a spring constant that describes the strength of the spring, and $F$ is the force.

If an object of mass $m$ is attached to such a spring the motion that it undergoes is known as simple harmonic motion. That motion can be described by

$$x(t) = A \sin(\omega t + \varphi).$$

$A$ is called the amplitude of the motion and is the maximum displacement from the equilibrium position. The motion is symmetric; the object will move as far on one side of the equilibrium position as on the other side. $\varphi$ is a phase angle determined by the initial conditions $(x_0, v_0)$. $\omega$ is the angular frequency in rad/s. It is related to the more usual frequency $f$ of cycles per second by

$$f = \frac{\omega}{2\pi}.$$

For such a mass on a spring the angular frequency is equal to

$$\omega = \sqrt{k/m}.$$

The period $T$ is the amount of time required for a single cycle; therefore, the period and frequency are related by

$$f = 1/T.$$

A small object of mass $m$ suspended by a cord of length $l$ and allowed to swing back and forth is called a simple pendulum. For small amplitudes the motion of such a simple pendulum is also simple harmonic motion. For the simple pendulum the angular frequency is given by

$$\omega = \sqrt{g/l}.$$

Note that for both examples of simple harmonic motion, the frequency is independent of the amplitude.
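A short numerical sketch may help tie the angular frequency, the frequency, and the period together. It uses the standard relation between angular frequency and frequency, $f = \omega/(2\pi)$; the function names and the sample spring constant, mass, and cord length are assumptions chosen only for illustration.

```python
import math


def spring_angular_frequency(k, m):
    """omega = sqrt(k / m) for a mass m on a spring of constant k."""
    return math.sqrt(k / m)


def pendulum_angular_frequency(length, g=9.8):
    """omega = sqrt(g / l) for a simple pendulum at small amplitude."""
    return math.sqrt(g / length)


def frequency(omega):
    """Cycles per second: f = omega / (2 pi)."""
    return omega / (2.0 * math.pi)


def period(omega):
    """Time for one cycle: T = 1 / f = 2 pi / omega."""
    return 2.0 * math.pi / omega


# Assumed examples: k = 100 N/m with m = 1 kg, and a pendulum 1 m long.
w_spring = spring_angular_frequency(100.0, 1.0)
w_pendulum = pendulum_angular_frequency(1.0)
print(w_spring, frequency(w_spring), period(w_spring))
print(w_pendulum, frequency(w_pendulum), period(w_pendulum))
```

Neither function takes the amplitude as an input, which reflects the statement that the frequency of simple harmonic motion does not depend on amplitude.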

B. Three-Dimensional Motion

Newton’s second law makes it easy to extend the ideas of straight-line motion to projectile motion, the motion followed by a body thrown and released near the earth’s surface. Observe this motion from far, far away in the plane of the motion and it looks like the object has simply been thrown upward. The force of gravity acts to accelerate it downward. The vertical part of the motion appears to be simply free fall and that is just motion with constant acceleration. Observe this motion from far above and it looks like the object is moving at constant velocity. There is no horizontal force. The horizontal part of the motion appears to be simply constant velocity.

The path a projectile takes is a parabola. The range is the horizontal distance an object will go if it is thrown from and lands back on the same level surface. The range $R$ is given by

$$R = \frac{v_0^2 \sin 2\theta}{g},$$

where $v_0$ is the initial speed and $\theta$ is the angle above the horizontal at which the projectile is thrown. Note that the range is the same for complementary angles; that is, $\theta$ and $90^\circ - \theta$ give the same range. Maximum range is found for $\theta = 45^\circ$.
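The two remarks about the range formula (equal range for complementary angles, maximum at 45°) are easy to confirm numerically. This is a minimal sketch; the function name and the launch speed of 20 m/s are assumptions used only for illustration, and air resistance is neglected as in the formula itself.

```python
import math


def projectile_range(v0, theta_deg, g=9.8):
    """Range on level ground: R = v0^2 sin(2 theta) / g, theta in degrees."""
    theta = math.radians(theta_deg)
    return v0**2 * math.sin(2.0 * theta) / g


# Assumed launch speed of 20 m/s.
for angle in (15, 30, 45, 60, 75):
    print(angle, round(projectile_range(20.0, angle), 2))
```

The printed ranges for 15° and 75°, and for 30° and 60°, coincide, and the 45° entry is the largest.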

IV. WORK AND ENERGY

A. Work

Work done by a constant force $\mathbf{F}$ is defined as the distance $D$ an object moves multiplied by the component of force in that direction. Pushing on a wall may tire your body but no work has been done according to this definition. If a yo-yo swings in a circle, the string continually exerts a force perpendicular to the direction of motion and no work is done. Work is a scalar quantity. The units of work are newton-meters or joules.

B. Kinetic Energy

The amount of work done on a body is equal to an increase in the quantity $\tfrac{1}{2}mv^2$. That is,

$$W = \tfrac{1}{2} m v_f^2 - \tfrac{1}{2} m v_0^2,$$

where $v_f$ is the final speed after the work has been done and $v_0$ was the original speed before. Because the object is moving, it has the ability to do work on something else; it could exert a force on another object over some distance. This ability to do work is called energy. Energy is a scalar. The quantity $\tfrac{1}{2}mv^2$ is called the kinetic energy; it is energy due to motion.
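As a short check of this statement, consider the special case of a constant net force $F$ acting along the line of motion (an assumption made here for simplicity; the result holds more generally). Using Newton's second law, $F = ma$, and the constant-acceleration relation $v^2 = v_0^2 + 2a(x - x_0)$ from the kinematics section, the work done over the displacement $x - x_0$ is

$$W = F\,(x - x_0) = m a\,(x - x_0) = m\,\frac{v_f^2 - v_0^2}{2} = \tfrac{1}{2} m v_f^2 - \tfrac{1}{2} m v_0^2 .$$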

The kinetic energy associated with the random motion of molecules due to heat is called thermal energy.

C. Potential Energy

Doing work on an object may change its position or condition. Lifting an object requires doing work against gravity. Because of its higher position, the object can then do work on something else as it falls; thus, it has gravitational potential energy. If an object of mass $m$ is lifted from an initial height $y_0$ to a final height $y$ its potential energy is changed by an amount

$$\Delta \mathrm{PE} = mg(y - y_0).$$

Stretching or compressing a spring requires work to be done. That work done is stored up in the spring; the spring can be released and can do work on something else. This elastic potential energy of a spring stretched or compressed a distance $x$ from its equilibrium position is

$$\mathrm{PE} = \tfrac{1}{2} k x^2.$$

D. Conservation of Energy

Work and energy are useful because the work done on a system by forces from outside the system is equal to the change in the total energy of the system. The total energy of a system is the sum of the potential, kinetic, and thermal energies. If the work from external forces is zero then the total energy of the system remains constant; the total energy is conserved.

A simple pendulum is an example of a system for which the external forces do no work. The force exerted by the supporting string on the mass of a pendulum is always perpendicular to the direction of motion so no work is done. Therefore, the energy must be conserved. If a pendulum is lifted some distance and released, it begins with some amount of gravitational potential energy. As it swings that potential energy decreases but its speed increases, which means the kinetic energy increases. The sum of the kinetic and potential energies remains constant.
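The exchange between potential and kinetic energy in the pendulum can be followed with a minimal sketch. It assumes a frictionless pendulum released from rest, with heights measured from the lowest point of the swing; the function names, the bob mass of 0.2 kg, and the release height of 0.5 m are illustrative assumptions.

```python
import math

G = 9.8  # m/s^2, acceleration of gravity near the earth's surface


def speed_at_height(release_height, height, g=G):
    """Speed of a bob released from rest, from m g h0 = m g h + (1/2) m v^2."""
    if height > release_height:
        raise ValueError("the bob cannot rise above its release height")
    return math.sqrt(2.0 * g * (release_height - height))


def total_energy(mass, height, speed, g=G):
    """Sum of gravitational potential energy and kinetic energy."""
    return mass * g * height + 0.5 * mass * speed**2


# Assumed values: 0.2 kg bob released 0.5 m above the lowest point.
mass, h0 = 0.2, 0.5
for h in (0.5, 0.25, 0.0):
    v = speed_at_height(h0, h)
    print(h, round(v, 3), round(total_energy(mass, h, v), 6))
```

The last column stays the same at every height, while the speed grows as the bob descends.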

A roller coaster offers another example of a system that alternately changes potential energy (height) into kinetic energy (speed) and vice versa. For both a roller coaster and a pendulum, friction will eventually cause the system to stop. Friction can be considered an external force or we can look at the thermal energy associated with the slight increase in temperature of the wheels and rails as a roller coaster runs.

V. MOMENTUM

Momentum, usually designated by $\mathbf{p}$, is defined by multiplying the mass $m$ of an object by its velocity $\mathbf{v}$,

$$\mathbf{p} = m\mathbf{v}.$$

It is similar to kinetic energy in that momentum increases with increasing speed. But it is different in that momentum is a vector quantity.

Like energy, momentum is useful because it is conserved. In the absence of external forces, the total momentum of a system of particles remains constant. Even though the internal forces between the particles may be very complicated, the vector sum of all the momenta of all the particles remains constant. Conservation of momentum is related to Newton’s third law of motion (action and reaction).

A. Collisions

When two objects collide (as two billiard balls hitting or two cars crashing into each other) the forces are very difficult to measure or predict. But conservation of momentum means that the vector sum of the momenta of the two objects before the collision will be the same as the vector sum of the momenta of the two objects after the collision. By itself, this is not sufficient to completely solve for the velocities of the two objects after the collision (assuming the conditions before the collision are given). But the final velocities can be found in two very useful extremes. If the kinetic energy is also conserved, that is, no energy is lost to heat or deforming the objects, the collision is termed elastic. The additional information provided is enough to solve for the final motion. If the two objects stick together the collision is termed totally inelastic and the maximum amount of kinetic energy is lost. Note that momentum is always conserved whether the collision is totally elastic, totally inelastic, or anywhere in between.
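The two extremes can be worked out explicitly for motion along a line. The sketch below uses the standard one-dimensional textbook results, which follow from momentum conservation plus, in the elastic case, kinetic energy conservation; the formulas are not given in the text itself, and the function names and sample masses and velocities are assumptions.

```python
def elastic_collision(m1, v1, m2, v2):
    """Final velocities for a one-dimensional elastic collision."""
    v1f = ((m1 - m2) * v1 + 2.0 * m2 * v2) / (m1 + m2)
    v2f = ((m2 - m1) * v2 + 2.0 * m1 * v1) / (m1 + m2)
    return v1f, v2f


def totally_inelastic_collision(m1, v1, m2, v2):
    """Common final velocity when the two objects stick together."""
    return (m1 * v1 + m2 * v2) / (m1 + m2)


def kinetic_energy(m, v):
    return 0.5 * m * v**2


# Assumed example: a 2 kg object at 3 m/s strikes a 1 kg object at rest.
m1, v1, m2, v2 = 2.0, 3.0, 1.0, 0.0
v1f, v2f = elastic_collision(m1, v1, m2, v2)
vf = totally_inelastic_collision(m1, v1, m2, v2)

print("momentum before:", m1 * v1 + m2 * v2)
print("elastic:", v1f, v2f, "momentum after:", m1 * v1f + m2 * v2f)
print("totally inelastic:", vf, "momentum after:", (m1 + m2) * vf)
```

Comparing `kinetic_energy` before and after each case shows that the elastic collision keeps the kinetic energy unchanged, while the totally inelastic one loses part of it even though the momentum is the same in both.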

B. Rocket Propulsion

A car’s motion can be understood by looking at the wheels as they push on the pavement and understanding that the pavement pushes back on the wheels. But how, then, does a rocket move and accelerate in space? There is nothing else around for it to push on that can push back on it. A rocket burns fuel that is exhausted from the rocket’s engine at high velocity. As momentum is carried in one direction by the fuel, an equal amount of momentum is carried in the opposite direction by the rocket. If you stand in a child’s wagon and throw bricks in one direction you will be moved in the other direction. As momentum is carried in one direction by the bricks, an equal amount of momentum is carried in the opposite direction by you and the wagon. The idea is the same as that used in explaining rocket propulsion. If gravity can be neglected, a rocket’s final velocity is given by

v = v0 + u ln (m0/m),

where v0 is its initial velocity, u is the exhaust velocity of the burned gases, m0 is the initial mass of the rocket, and m is the final mass of the rocket.
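As a quick numerical illustration of this formula, here is a minimal sketch; the function name and the sample numbers are hypothetical and chosen only to show the logarithmic dependence on the mass ratio.

```python
import math


def rocket_final_velocity(v0, u, m0, m):
    """Evaluate v = v0 + u * ln(m0 / m), neglecting gravity.

    v0: initial velocity, u: exhaust velocity,
    m0: initial mass, m: final mass (same units as m0).
    """
    if m <= 0 or m0 < m:
        raise ValueError("require 0 < m <= m0")
    return v0 + u * math.log(m0 / m)


if __name__ == "__main__":
    # Illustrative values: exhaust velocity 3000 m/s, mass ratio 4.
    print(rocket_final_velocity(0.0, 3000.0, 4.0, 1.0))
```

Because the mass ratio enters through a logarithm, doubling the final speed gained requires squaring the mass ratio.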

VI. RIGID BODY MOTION

A. Center of Mass

For a system of particles of mass $m_i$, each located at position $\mathbf{r}_i$, the mass-weighted average position of the particles is called the center of mass and is defined by

$$\mathbf{R} = \frac{\sum_i m_i \mathbf{r}_i}{\sum_i m_i},$$

where $\sum_i$ means to sum over all values of $i$. The total mass of the system of particles is $M = \sum_i m_i$.


Encyclopedia of Physical Science and Technology EN009B-414 July 19, 2001 18:46


For a rigid body the summation over individual masses is replaced by an integral over the volume of the body. The center of mass is then defined by

$$\mathbf{R} = \frac{1}{M} \iiint_V \rho\,\mathbf{r}\,dV,$$

where $M$ is the total mass of the body, given by

$$M = \iiint_V \rho\,dV.$$

Here $\rho$ is the mass density (mass per unit volume), $\mathbf{r}$ is just the location vector, and $V$ is the volume of the body. As with all vector equations, this may be easier to understand in component form. The three coordinates $(X, Y, Z)$ of the center of mass are

$$X = \frac{1}{M} \iiint_V \rho\,x\,dV,$$

$$Y = \frac{1}{M} \iiint_V \rho\,y\,dV,$$

$$Z = \frac{1}{M} \iiint_V \rho\,z\,dV.$$

The center of mass is a “uniquely interesting point,” for even though the motion of individual particles or rotations of the body may be frustratingly complicated, the motion of the center of mass will be that of a single point particle with mass M.
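For a set of discrete particles the definition reduces to a mass-weighted average, which can be sketched in a few lines. The function name and sample values below are hypothetical.

```python
def center_of_mass(masses, positions):
    """Mass-weighted average position of point particles.

    masses: sequence of particle masses m_i
    positions: sequence of coordinate tuples r_i (all of the same length)
    """
    total_mass = sum(masses)
    dims = len(positions[0])
    return tuple(
        sum(m * r[d] for m, r in zip(masses, positions)) / total_mass
        for d in range(dims)
    )


if __name__ == "__main__":
    # Illustrative values: a 1 kg mass at the origin and a 3 kg mass at x = 4 m.
    print(center_of_mass([1.0, 3.0], [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]))
```

The result lies three quarters of the way toward the heavier particle, as expected for a 1:3 mass ratio.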

B. Angular Momentum

Just as linear momentum was useful in understanding and predicting translational motion because of its conservation, so another conserved quantity (called the angular momentum) will be useful in discussing rotational motions. The angular momentum L of a small particle relative to some origin is given by

L = r × p,

where r is the location of the particle from the origin, p is its momentum, and × indicates the vector cross product. For a system of particles, the total angular momentum is the vector sum of the individual angular momenta. For an extended body, the total angular momentum requires evaluating an integral over the volume of the body.

1. Rotation about a Fixed Axis

For rotation about a fixed axis, there is a strong correlation with straight-line motion. The mass is replaced by a “rotational mass” that depends upon the geometry of the mass (how far it is located from the axis of rotation). This “rotational mass” is called the moment of inertia I. For a thin-walled hollow cylinder of mass M and radius R rotating about its symmetry axis, the moment of inertia is $I = MR^2$. For a solid cylinder, $I = \tfrac{1}{2}MR^2$. Force is replaced by a “rotational force” that depends upon the force and its placement from the axis of rotation; this is called a torque T. While a small force applied at the doorknob side opens a door easily, a large force will be required if it is applied back near the hinge; the rotational effect in the two cases is the same. Torque is given by

T = rF sin θ,

where r is the distance from the axis of rotation, F is the force, and θ is the angle between the two.

Just as a distance x labels the position of a mass on a straight track, an angle θ (measured in radians) labels the angular position of a rotating object. Angular velocity ω describes its speed of rotation in rad/s and angular acceleration α describes the rate of change of angular velocity in rad/s².

The rotational equivalent of F = ma is T = Iα. The angular momentum for rotation about a fixed axis is L = Iω, which closely parallels P = Mv for the linear case.

2. Rotation in General

In general, however, rotation can be more complicated than straight-line motion. Angular momentum remains a conserved quantity. But in general angular momentum is given by

L = Iω,

where I is now a tensor. This brings about the interesting case in which the angular momentum L and the angular velocity ω may not necessarily be parallel to each other. They are parallel only when the body spins about one of its three mutually perpendicular principal axes. This can be explored by tossing a book or tennis racket in the air spinning about each of these axes. For the longest and shortest axes the rotation is stable, and L and ω stay in the same direction; for the medium-length axis the rotation is unstable, the object begins to tumble, and L and ω do not remain in the same direction.
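The tensor relation can be made concrete with a small numerical sketch. The inertia tensor below is a hypothetical example, written in a frame aligned with the principal axes so that it is diagonal; the function names are likewise illustrative.

```python
def angular_momentum(inertia, omega):
    """Return L = I * omega for a 3x3 inertia tensor given as nested lists."""
    return tuple(
        sum(inertia[i][j] * omega[j] for j in range(3)) for i in range(3)
    )


def cross(a, b):
    """Vector cross product of two 3-component sequences."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def is_parallel(a, b, tol=1e-12):
    """Two vectors are parallel when their cross product vanishes."""
    return all(abs(c) < tol for c in cross(a, b))


# Hypothetical principal moments of inertia (arbitrary units), all different.
INERTIA = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]

if __name__ == "__main__":
    for omega in [(0.0, 0.0, 2.0), (1.0, 1.0, 0.0)]:
        L = angular_momentum(INERTIA, omega)
        print(omega, L, is_parallel(L, omega))
```

Spinning about a single principal axis gives L parallel to ω, whereas an angular velocity with components along two axes of unequal moment gives an L that points in a different direction.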

VII. CENTRAL FORCES

A. Definitions

A central force is one whose direction is always along a radius; that is, either toward or away from a point that can be used as an origin (or force center), and whose magnitude depends solely upon the distance from that origin, r. A central force can always be written as

$$\mathbf{F} = F(r)\,\hat{\mathbf{r}},$$

where $\hat{\mathbf{r}}$ is a unit vector in the radial direction. Central forces are important because many real situations involve central forces. The gravitational force between two masses and the electrostatic force between two charges are both central forces. Motion due to a central force will always be confined to a plane.

B. Gravity

Gravity is the force of attraction between two massive bodies. First described by Sir Isaac Newton, the force of gravity between two bodies with masses $m_1$ and $m_2$ separated by a distance r is given by

$$F_G = \frac{G m_1 m_2}{r^2},$$

where G is a universal constant ($G = 6.672 \times 10^{-11}$ N m²/kg²). This expression is valid for calculating the force earth exerts on an apple near its surface or the force earth exerts on our moon or the force our sun exerts on Jupiter.

1. Kepler’s Laws of Planetary Motion

Before Newton discovered this law of universal gravitation, Johannes Kepler found, based upon careful observational data, that the motion of the planets in our solar system could be explained by three laws:

1. Planets move in orbits that are ellipses with the sun at one focus (elliptical orbits).

2. Areas swept out by the radius vector from the sun to a planet in equal times are equal (equal areas in equal times).

3. The square of a planet’s period is proportional to the cube of the semimajor axis of its orbit ($T^2 \propto r^3$).

It was a great triumph of Newton’s law of universal gravitation that it could explain and predict Kepler’s laws of planetary motion. Kepler’s second law is true for any central force; it is the result of conservation of angular momentum. The other two laws depend upon gravity being an inverse square force.

2. Orbits

Planets travel in elliptical orbits about the sun. Satellites travel in elliptical orbits about their planet. If the speed of a satellite is suddenly increased, the shape of the elliptical orbit elongates. If a satellite has enough velocity to escape and never return to the planet, the path it travels is a parabola or a hyperbola. Escape velocity is the minimum velocity that will allow a satellite to travel away from its planet and never return. If a satellite leaves earth’s surface with a velocity of about 40,000 km/h (25,000 miles/h) it will escape from earth and never return.
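The quoted escape speed can be checked with a short calculation. This sketch assumes the standard energy-balance result $v_{esc} = \sqrt{2GM/R}$, which the article does not state explicitly, and uses approximate values for the earth's mass and radius that are not given in the text; G is the value quoted above.

```python
import math

G = 6.672e-11       # N m^2/kg^2, value quoted in the article
M_EARTH = 5.972e24  # kg, approximate (assumed; not from the article)
R_EARTH = 6.371e6   # m, approximate mean radius (assumed; not from the article)


def escape_velocity(mass, radius, g_const=G):
    """Escape speed in m/s from the surface of a spherical body.

    Follows from setting kinetic energy equal to the gravitational
    binding energy: (1/2) m v^2 = G M m / R.
    """
    return math.sqrt(2.0 * g_const * mass / radius)


if __name__ == "__main__":
    v = escape_velocity(M_EARTH, R_EARTH)
    print(f"{v:.0f} m/s = {v * 3.6:.0f} km/h")
```

The result is a little over 11 km/s, consistent with the figure of about 40,000 km/h.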

C. Harmonic Oscillator

A mass suspended between three sets of identical, mutually perpendicular springs forms an isotropic, three-dimensional simple harmonic oscillator. The springs provide a restoring force of the form

F = −kr

so this three-dimensional harmonic oscillator experiences a central force. Examples of such a system are atoms in certain crystals, where the interatomic bonds act as the springs in this simple case. If the springs (or interatomic bonds for a crystal) are not all identical, then the force due to a displacement in one direction will be different from that for another direction. The harmonic oscillator is then anisotropic, and the restoring force can no longer be described as a central force.

VIII. ALTERNATE FORMS

Newton’s second law of motion, F = ma, can be used to solve for the motion in many situations. But the same information can be written in different forms and used in situations where direct solution of F = ma is very difficult or perhaps impossible.

A. Lagrange’s Equations

Lagrange’s equations of motion can be written as

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = \frac{\partial L}{\partial q_k},$$

where $q_k$ is a “generalized coordinate” and L is called the Lagrangian function. The Lagrangian function is the difference between the kinetic energy and the potential energy, L = KE − PE. The dot means a time derivative, $\dot{q}_k = dq_k/dt$.

B. Hamilton’s Equations

Hamilton’s equations of motion can be written as

$$\dot{q}_k = \partial H/\partial p_k$$

and

$$\dot{p}_k = -\partial H/\partial q_k,$$

where, again, $q_k$ is a “generalized coordinate,” $p_k$ is a “generalized momentum,” and H is called the Hamiltonian function. For many situations, the Hamiltonian H is the total energy of the system.
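Hamilton's equations are first order in time, which makes them convenient for step-by-step numerical integration. The sketch below is an illustration only: it assumes a one-dimensional harmonic oscillator with $H = p^2/2m + kq^2/2$ and uses the semi-implicit (symplectic) Euler scheme, neither of which is prescribed by the article; all names are hypothetical.

```python
def hamiltonian(q, p, m=1.0, k=1.0):
    """Total energy H = p^2 / (2m) + k q^2 / 2 of the assumed oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def step(q, p, dt, m=1.0, k=1.0):
    """One semi-implicit Euler step of Hamilton's equations."""
    p_new = p - dt * k * q       # dp/dt = -dH/dq = -k q
    q_new = q + dt * p_new / m   # dq/dt = +dH/dp = p / m
    return q_new, p_new


def integrate(q0, p0, dt, n_steps, m=1.0, k=1.0):
    """Advance (q, p) through n_steps steps of size dt."""
    q, p = q0, p0
    for _ in range(n_steps):
        q, p = step(q, p, dt, m, k)
    return q, p


if __name__ == "__main__":
    # With m = k = 1 the period is 2*pi, about 6283 steps of size 0.001.
    q, p = integrate(1.0, 0.0, 0.001, 6283)
    print(q, p, hamiltonian(q, p))
```

After one period the state returns close to its starting point and the energy stays close to its initial value, as expected when H is the conserved total energy.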


C. Poisson Brackets and Quantum Mechanics

Hamilton’s equations can be rewritten in terms of Poisson brackets as

$$\dot{q}_k = [q_k, H]$$

and

$$\dot{p}_k = [p_k, H],$$

where the Poisson brackets are defined by

$$[A, B] = \sum_k \left[ \frac{\partial A}{\partial q_k}\frac{\partial B}{\partial p_k} - \frac{\partial B}{\partial q_k}\frac{\partial A}{\partial p_k} \right].$$

This formulation is especially interesting because it allows for an easy and direct transfer of ideas from classical mechanics to quantum mechanics.
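The bracket definition can be checked numerically for a single degree of freedom. This is a rough sketch under stated assumptions: partial derivatives are approximated by central differences, the Hamiltonian is a hypothetical harmonic oscillator with m = 2 and k = 5, and the function names are illustrative.

```python
def poisson_bracket(A, B, q, p, h=1e-5):
    """Approximate [A, B] = dA/dq * dB/dp - dB/dq * dA/dp at the point (q, p).

    A and B are functions of (q, p); derivatives use central differences.
    """
    dA_dq = (A(q + h, p) - A(q - h, p)) / (2.0 * h)
    dA_dp = (A(q, p + h) - A(q, p - h)) / (2.0 * h)
    dB_dq = (B(q + h, p) - B(q - h, p)) / (2.0 * h)
    dB_dp = (B(q, p + h) - B(q, p - h)) / (2.0 * h)
    return dA_dq * dB_dp - dB_dq * dA_dp


def H(q, p, m=2.0, k=5.0):
    """Assumed harmonic-oscillator Hamiltonian (illustrative parameters)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def coordinate(q, p):
    return q


def momentum(q, p):
    return p


if __name__ == "__main__":
    q0, p0 = 0.3, 0.7
    print(poisson_bracket(coordinate, H, q0, p0))  # close to p0 / m
    print(poisson_bracket(momentum, H, q0, p0))    # close to -k * q0
```

The two brackets reproduce the right-hand sides of Hamilton's equations, $p/m$ and $-kq$, and the bracket of the coordinate with the momentum is unity.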

SEE ALSO THE FOLLOWING ARTICLES

CELESTIAL MECHANICS • CRITICAL DATA IN PHYSICS AND CHEMISTRY • ELECTROMAGNETICS • MECHANICS OF STRUCTURES • NONLINEAR DYNAMICS • QUANTUM MECHANICS • RELATIVITY, GENERAL • STATISTICAL MECHANICS • VIBRATION, MECHANICAL



Encyclopedia of Physical Science and Technology EN010C-458 July 19, 2001 20:58

Molecular Hydrodynamics

Sidney Yip
Massachusetts Institute of Technology

Jean Pierre Boon
Université Libre de Bruxelles

I. Motivation
II. Density Correlation Function
III. Linearized Hydrodynamics
IV. Generalized Hydrodynamics
V. Kinetic Theory
VI. Mode Coupling Theory
VII. Lattice Gas Hydrodynamics

GLOSSARY

Diffusion Dissipation of thermal fluctuations by essentially random (or stochastic) motions of atoms (as in thermal or concentration diffusion).

Generalized hydrodynamics Theoretical description of fluctuations in fluids based on the extension of the equations of linearized hydrodynamics to finite frequencies and wavelengths.

(k, ω) space Region of wavenumber and frequency where thermal fluctuations are being studied.

Lattice gas automata Class of cellular automata designed to model fluid systems using discrete space and time implementation.

Memory function Space–time dependent kernel appearing in the equation of motion for a time correlation function, which contains the effects of static and dynamical interactions.

Mode coupling A theory in which the interatomic interactions are expressed in terms of products of two or more modes of thermal fluctuations, such as the densities of particle number, current, and energy.

Propagation Cooperative motion of atoms characterized by a peak at finite frequency in the frequency spectrum of density fluctuations (as in pressure wave propagation).

Thermal fluctuations Spontaneous localized fluctuations in the particle number, momentum, and energy densities of atoms in a fluid at thermal equilibrium.

Time correlation function Function that expresses the correlation of dynamical variables evaluated at two different time (and space) points.

Uncorrelated binary collisions Sequence of two-body collisions in which the collisions are taken to be independent even though pairs of atoms can recollide one or more times.

MOLECULAR HYDRODYNAMICS is the theoretical description of spontaneous localized fluctuations in space and time of the particle number density, the current density, and the energy density in a fluid at thermal equilibrium. Its domain of applicability ranges from low frequencies and long wavelengths, where the linearized equations of hydrodynamics are applicable, to frequencies and wavelengths comparable to interatomic collision frequencies and mean free paths.

I. MOTIVATION

When a fluid is disturbed locally from equilibrium, it will relax by allowing the perturbation to dissipate throughout the system. At the macroscopic level this response involves the processes of mass diffusion, viscous flow, and thermal conduction, which are the mechanisms by which the transport of mass, momentum, and energy can take place. In the absence of an external disturbance, we can still speak of the dynamical behavior of a fluid in these terms. The reason is that the same processes also govern the dissipation of spontaneous fluctuations that are always present on the microscopic level in a fluid at finite temperature. So the fluid can be considered as a “reservoir of thermal excitations” extending over a broad range of wavelengths and frequencies from the hydrodynamic scale down to the range of the intermolecular potential. Thus, the study of thermal fluctuations is fundamental to the understanding of the molecular basis of fluid dynamics.

The conventional theory of fluid dynamics invariably begins with the equations of hydrodynamics. The basic assumption of hydrodynamics is that changes in the fluid take place sufficiently slowly in space and time that the system can be considered to be in a state of local thermodynamic equilibrium. Under this condition we have a closed set of equations describing the space–time variations of the conserved variables, namely, the mass, momentum, and energy densities. These equations become explicit when the thermodynamic derivatives and the transport coefficients occurring in them are known; however, such constants are not determined within the hydrodynamic theory, and therefore must be provided by either measurement or more fundamental calculations.

The equations of hydrodynamics have an extremely wide range of scientific and technological applications. They are valid for disturbances of arbitrary magnitude provided the space and time variations are slow on the molecular scales, with lengths measured in collision mean free path $l$ and times in inverse collision frequency $\omega_c^{-1}$. In terms of the wavelength, $2\pi/k$, and frequency $\omega$, of the fluctuations, the hydrodynamic description is valid only in the region of low $(k, \omega)$, where $kl \ll 1$ and $\omega \ll \omega_c$.

When the condition of slow variations is not fully satisfied, we expect the fluid behavior to show molecular or nonhydrodynamic effects. Unless the fluctuations are far removed from the hydrodynamic region of (k, ω), the discrepancies often appear only in a subtle and gradual manner. This suggests that extensions or generalizations of the hydrodynamic description may be useful and may be accomplished by retaining the basic structure of the equations, while replacing the thermodynamic derivatives and transport coefficients by functions that directly reflect the molecular structure of the fluid and the effects of individual intermolecular collisions. The result is then a theory that is valid even on the scales of collision mean free path and mean free time, a theory that may be called molecular hydrodynamics. In essence, molecular hydrodynamics is a description that considers both the macroscopic behavior of mass, momentum, and energy transport, and the microscopic properties of local structure and intermolecular collisions.

There are several reasons why a study of the extension of hydrodynamics is important. First, we obtain a better understanding of the validity of hydrodynamics. Second, an appreciation of how the details of molecular structure and collisional dynamics can affect the behavior of the conserved variables is essential to the study of transport phenomena on the molecular level. Finally, it is one of the basic aims of nonequilibrium statistical mechanics to develop a unified theory of liquids that treats not only the processes in the hydrodynamic region of (k, ω), but also the molecular behaviors that manifest at higher values of wavenumber and frequency.

II. DENSITY CORRELATION FUNCTION

The fundamental quantities in the study of thermal fluctuations in fluids are space- and time-dependent correlation functions. These functions are the natural quantities for theoretical analyses as well as laboratory measurements. They are well defined for a wide variety of physical systems, and they possess both macroscopic properties and interpretations at the microscopic level.

For the fluid system of interest we imagine an assembly of N identical particles (molecules), each of mass m, contained in a volume V. The molecules have no internal degrees of freedom, and they are assumed to interact through a two-body, additive, central potential u(r). The fluid is in thermal equilibrium, at a state far from any phase transition. Also, there are no external fields imposed, so the system is invariant to spatial translation, rotation, and inversion.

A time correlation function is the thermodynamic average of a product of two dynamical variables, each expressing the instantaneous deviation of a fluid property from its equilibrium value. The dynamical variables that we wish to consider are the number density,

$$n(\mathbf{r}, t) = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \delta(\mathbf{r} - \mathbf{R}_i(t)) \tag{1}$$


where $\mathbf{R}_i(t)$ denotes the position of particle $i$ at time $t$, and the current density,

$$\mathbf{j}(\mathbf{r}, t) = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \mathbf{v}_i(t)\,\delta(\mathbf{r} - \mathbf{R}_i(t)) \tag{2}$$

where $\mathbf{v}_i(t)$ is the velocity of particle $i$ at time $t$.

The thermodynamic average of a dynamical variable $A(\mathbf{r}, t)$ is defined as

$$\langle A(\mathbf{r}, t) \rangle = \int d^3R_1 \cdots d^3R_N\, d^3P_1 \cdots d^3P_N\, f_{eq}(\mathbf{R}^N, \mathbf{P}^N)\, A(\mathbf{r}, t) \tag{3}$$

where $f_{eq}$ is an equilibrium distribution of particle positions $\mathbf{R}^N = (\mathbf{R}_1, \ldots, \mathbf{R}_N)$ and momenta $\mathbf{P}^N = (\mathbf{P}_1, \ldots, \mathbf{P}_N)$. Typically we adopt the canonical ensemble in evaluating Eq. (3),

$$f_{eq}(\mathbf{R}^N, \mathbf{P}^N) = Q_N^{-1} \exp(-\beta U) \prod_{i=1}^{N} f_0(\mathbf{P}_i) \tag{4}$$

with $\beta = (k_B T)^{-1}$, $T$ being the fluid temperature and $k_B$ Boltzmann’s constant. $U(\mathbf{R}^N)$ is the potential energy, $f_0(\mathbf{P})$ is the normalized Maxwell–Boltzmann distribution

$$f_0(\mathbf{P}) = (\beta/2\pi m)^{3/2} \exp(-\beta P^2/2m) \tag{5}$$

and $Q_N$ is the configurational integral

$$Q_N = \int d^3R_1 \cdots d^3R_N \exp(-\beta U) \tag{6}$$

where $U$ is the potential energy of the system. Applying Eq. (3) to Eqs. (1) and (2) gives the average values $\langle n(\mathbf{r}, t) \rangle = \sqrt{N}/V$ and $\langle \mathbf{j}(\mathbf{r}, t) \rangle = 0$. Notice that in general a dynamical variable depends on the particle positions $\mathbf{R}^N$ and momenta $\mathbf{P}^N$, and also on the position $\mathbf{r}$ and time $t$, where the property is being considered. On the other hand, the average values are independent of $\mathbf{r}$ and $t$ because the system is uniform and in equilibrium.

Given the dynamical variable $n(\mathbf{r}, t)$ we define the time-dependent density correlation function as

$$G(|\mathbf{r} - \mathbf{r}'|, t - t') = V \langle \delta n(\mathbf{r}', t')\, \delta n(\mathbf{r}, t) \rangle = \frac{1}{n} \left\langle \sum_{i,j}^{N} \delta(\mathbf{r}' - \mathbf{R}_i(t'))\, \delta(\mathbf{r} - \mathbf{R}_j(t)) \right\rangle - n \tag{7}$$

where $\delta n(\mathbf{r}, t) = n(\mathbf{r}, t) - \langle n(\mathbf{r}, t) \rangle$, and $n = N/V$ is the average number density of the fluid at equilibrium. Despite its rather simple appearance this function contains all the structural and dynamical information concerning density fluctuations. Note that $G$ depends on the separation $|\mathbf{r} - \mathbf{r}'|$ because of rotational invariance and it is a function of $t - t'$ because of time translational invariance. Without loss of generality we can take $\mathbf{r}' = 0$ and $t' = 0$.

FIGURE 1 The time correlation function circle with its interconnected segments of theory, experiment, and atomistic simulation.

The density correlation function is the leading member of a group of time correlation functions that have received attention in recent studies of nonequilibrium statistical mechanics. These functions have become the standard language for experimentalists and theorists alike, because they can be measured directly and they are well-defined quantities for which microscopic calculations can be formulated. Moreover, time correlation functions are accessible by atomistic simulations. Figure 1 shows the complementary nature of theoretical, experimental, and simulation studies of time correlation functions. In this article we are primarily concerned with the theoretical developments, which, however, rely on simulation data and scattering measurements for guidance and validation.

It is instructive to note the simple physical interpretation of G(r, t), which we can deduce from its definition. Consider a laboratory coordinate system placed in the fluid such that at time t = 0 a particle is at the origin. At a later time t, place an element of volume d³r at the position r. Then G(r, t)d³r is the average (or expected) number of particles in the element of volume at r at time t, given that a particle was located at the origin initially. The initial value of G is

G(r, 0) = δ(r) + ng(r ) (8)

where $g(r)$ is the equilibrium pair distribution function

$$n^2 g(r) = \sum_{\substack{i,j \\ i \neq j}} \langle \delta(\mathbf{r} - \mathbf{R}_i)\, \delta(\mathbf{R}_j) \rangle \tag{9}$$


III. LINEARIZED HYDRODYNAMICS

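The linearized hydrodynamic equations for a fluid with no internal degrees of freedom consist of the continuity equation, which expresses mass or particle number conservation,

$$\frac{\partial \rho_1(\mathbf{r}, t)}{\partial t} + \rho_0 \nabla \cdot \mathbf{v}(\mathbf{r}, t) = 0 \tag{10}$$

the Navier–Stokes equation, which expresses momentum or current conservation,

$$\rho_0 \frac{\partial \mathbf{v}(\mathbf{r}, t)}{\partial t} + \frac{c_0^2}{\gamma} \nabla \rho_1(\mathbf{r}, t) + \frac{c_0^2 \alpha \rho_0}{\gamma} \nabla T_1(\mathbf{r}, t) - \eta \nabla \nabla \cdot \mathbf{v}(\mathbf{r}, t) = 0 \tag{11}$$

and the energy transport equation, which expresses kinetic energy conservation,

$$\rho_0 C_v \frac{\partial T_1(\mathbf{r}, t)}{\partial t} - \frac{C_v(\gamma - 1)}{\alpha} \frac{\partial \rho_1(\mathbf{r}, t)}{\partial t} - \lambda \nabla^2 T_1(\mathbf{r}, t) = 0 \tag{12}$$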

In these equations, $\rho = \rho_0 + \rho_1$ is the local number density, $T = T_0 + T_1$ is the local temperature, and $\mathbf{v}$ is the velocity, with subscripts 0 and 1 denoting the equilibrium value and instantaneous deviation, respectively. The ratio of specific heats at constant pressure and constant volume, $C_p/C_v$, is $\gamma$. The combination of shear and bulk viscosities $\tfrac{4}{3}\eta_s + \eta_B$ is denoted by $\eta$, and $c_0$, $\alpha$, $\lambda$ are, respectively, the adiabatic sound speed, the thermal expansion coefficient, and the thermal conductivity. Equations (10)–(12) are linearized in the sense that $\rho_1$, $T_1$, and $\mathbf{v}$ are assumed to be small, and therefore, only terms to first order in these quantities need be kept. This assumption makes the description valid only for small-amplitude disturbances such as thermal fluctuations.

The parameters of the hydrodynamic description are thus the thermodynamic coefficients $\alpha$, $\gamma$, and $c_0$ and the transport coefficients $\eta_s$, $\eta_B$, and $\lambda$. Once these are specified, the equations can be used to calculate explicitly the spatial and temporal distributions of the particle number, current, and energy densities for a given set of boundary and initial conditions.

We will be interested in the decay of a density pulse created by thermal fluctuations in a uniform, infinite fluid medium. For this problem it will be most convenient to discuss the solutions in wavevector space by taking the Fourier transform in configuration space and solving the resulting equations as an initial value problem with initial values,

$$\rho_1(\mathbf{r}, t = 0) = \delta(\mathbf{r}) + n[g(r) - 1], \qquad \mathbf{v}(\mathbf{r}, t = 0) = 0, \qquad T_1(\mathbf{r}, t = 0) = 0 \tag{13}$$

Equation (13) states that at time t = 0 a density pulse occurs in the fluid in the form of a particle localized at the origin of the coordinate system plus a distribution of particles according to $n[g(r) - 1]$; also, there are no current or temperature perturbations. The meaning of $\rho_1(\mathbf{r}, t)$ as the density response to this initial condition is the spatial distribution of this density pulse as time evolves. Notice that any particle in the fluid can contribute to $\rho_1(\mathbf{r}, t)$ for t > 0, not just the particle originally located at the origin. After taking the Fourier transform of Eqs. (10)–(12), we can solve for

$$n(\mathbf{k}, t) = \int d^3r\, e^{i\mathbf{k} \cdot \mathbf{r}} \rho_1(\mathbf{r}, t) \tag{14}$$

The calculation is best carried out by taking the Laplace transform in time, for example,

$$n(\mathbf{k}, s) = \int_0^\infty dt\, e^{-st} n(\mathbf{k}, t) \tag{15}$$

thus obtaining a system of coupled algebraic equations for the Laplace–Fourier transformed densities $n(\mathbf{k}, s)$, $\mathbf{v}(\mathbf{k}, s)$, and $T_1(\mathbf{k}, s)$. The system of equations is homogeneous, and for nontrivial solutions the transform variable $s$ has to satisfy a cubic equation.

Since the hydrodynamics description is applicable only when spatial and temporal variations of the densities occur smoothly, it is appropriate to look for roots of the cubic equation to lowest orders in the wavenumbers. To order two,

$$s_\pm = \pm i c_0 k - \Gamma k^2, \qquad s_3 = -\lambda k^2/\rho_0 C_p \tag{16}$$

where $\Gamma = [\eta + \lambda(C_v^{-1} - C_p^{-1})]/2\rho_0$. As a result of the density pulse, both pressure and temperature fluctuations are induced. The pair of complex roots $s_\pm$ describes the propagation of pressure fluctuations as damped sound waves, with speed $c_0$ and attenuation $\Gamma$. The root $s_3$ describes the diffusion of temperature fluctuations with attenuation $\lambda/\rho_0 C_p$.
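The three hydrodynamic roots are simple enough to evaluate directly. The sketch below follows Eq. (16), writing the sound attenuation coefficient as `gamma_att`; the function name, argument names, and sample values are hypothetical, and the inputs are assumed to be in mutually consistent units.

```python
def hydrodynamic_modes(k, c0, eta, lam, rho0, cv, cp):
    """Return (s_plus, s_minus, s_heat) to second order in k, as in Eq. (16).

    c0: adiabatic sound speed, eta: combined viscosity (4/3 shear + bulk),
    lam: thermal conductivity, rho0: equilibrium density,
    cv, cp: specific heats at constant volume and constant pressure.
    """
    # Sound attenuation coefficient: [eta + lam * (1/cv - 1/cp)] / (2 rho0).
    gamma_att = (eta + lam * (1.0 / cv - 1.0 / cp)) / (2.0 * rho0)
    s_plus = complex(-gamma_att * k * k, c0 * k)    # damped sound wave
    s_minus = complex(-gamma_att * k * k, -c0 * k)  # its complex conjugate
    s_heat = -lam * k * k / (rho0 * cp)             # purely diffusive mode
    return s_plus, s_minus, s_heat


if __name__ == "__main__":
    # Illustrative, dimensionless parameter values only.
    print(hydrodynamic_modes(0.1, 1000.0, 1.0, 2.0, 1.0, 1.0, 2.0))
```

The sound modes oscillate at frequency $c_0 k$ while decaying at a rate proportional to $k^2$; the thermal mode has no oscillatory part.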

Using Eq. (16) we can invert the Laplace transformed solution for $n(\mathbf{k}, s)$ and compute the correlation function

$$F(k, t) = \langle n(\mathbf{k}, t)\, n(-\mathbf{k}) \rangle = \int d^3r\, e^{i\mathbf{k} \cdot \mathbf{r}} G(r, t) \tag{17}$$

The result is

$$F(k, t) = S(k) \left[ \frac{C_p - C_v}{C_p} \exp\left( -\frac{\lambda k^2}{\rho_0 C_p} t \right) + \frac{C_v}{C_p} \exp(-\Gamma k^2 t) \cos c_0 k t \right] \tag{18}$$

where

$$S(k) = \langle n(\mathbf{k})\, n(-\mathbf{k}) \rangle = F(k, t = 0) \tag{19}$$

is known as the static structure factor of the fluid. In the long wavelength limit, it is a thermodynamic quantity: $S(k \to 0) \to n k_B T \chi_T$, where $\chi_T$ is the isothermal compressibility. Equation (18) shows that there are two components in the time decay of density fluctuations, an exponential decay associated with heat diffusion, and a damped oscillatory decay associated with pressure (sound) propagation.
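The two decay channels in Eq. (18) can be evaluated directly. The following sketch computes the normalized function F(k, t)/S(k); it assumes the weights are written in terms of the specific heat ratio ($1 - 1/\gamma$ and $1/\gamma$), takes the thermal diffusivity $\lambda/\rho_0 C_p$ and the sound attenuation coefficient as given inputs, and uses hypothetical names and parameter values.

```python
import math


def normalized_density_correlation(k, t, c0, sound_att, thermal_diff, gamma):
    """F(k, t) / S(k) from Eq. (18).

    sound_att: sound attenuation coefficient (damping of the sound modes)
    thermal_diff: lambda / (rho0 * Cp)
    gamma: ratio of specific heats Cp / Cv
    """
    heat = (1.0 - 1.0 / gamma) * math.exp(-thermal_diff * k * k * t)
    sound = (1.0 / gamma) * math.exp(-sound_att * k * k * t) * math.cos(c0 * k * t)
    return heat + sound


if __name__ == "__main__":
    # Illustrative, dimensionless parameter values only.
    for t in (0.0, 0.5, 1.0, 5.0):
        print(t, normalized_density_correlation(1.0, t, 10.0, 0.5, 0.3, 1.5))
```

At t = 0 the two weights sum to one, so the function starts at S(k) and then decays through a non-oscillatory thermal part and an oscillating acoustic part.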

The dynamics of density fluctuations can be studied directly by scattering beams of thermal neutrons or laser light from the fluid and measuring the frequency spectrum of the scattered radiation. In such experiments, the frequency spectrum of the density fluctuation is measured,

$$S(k, \omega) = \int_{-\infty}^{\infty} dt\, e^{-i\omega t} F(k, t) \tag{20}$$

In contrast to $S(k)$, which is what we obtain from a neutron or X-ray diffraction experiment, $S(k, \omega)$ is called the dynamic structure factor because it gives information about both structure and dynamics of the fluid.

Since we can probe the fluid structure at different wavenumbers, the frequency behavior of $S(k, \omega)$ can vary considerably from the hydrodynamic regime of long wavelengths ($kl \ll 1$) to the regime of free particle flow ($kl \gg 1$). The frequency spectrum of density fluctuations in the hydrodynamic regime is characterized by three well-defined spectral lines, corresponding to the three modes in $F(k, t)$ or the three roots to the dispersion equation as given in Eq. (16). From (18) and (20), one obtains

$$S(k, \omega) = S(k) \left\{ \frac{C_p - C_v}{C_p}\, \frac{2\lambda k^2/\rho_0 C_p}{\omega^2 + (\lambda k^2/\rho_0 C_p)^2} + \frac{C_v}{C_p} \left[ \frac{\Gamma k^2}{(\omega + c_0 k)^2 + (\Gamma k^2)^2} + \frac{\Gamma k^2}{(\omega - c_0 k)^2 + (\Gamma k^2)^2} \right] \right\} \tag{21}$$

The spectrum is composed of a central peak with maximum at $\omega = 0$ and whose full width at half maximum is $2\lambda k^2/\rho_0 C_p$. This peak is called the Rayleigh line; its integrated intensity is given by $S(k)[1 - 1/\gamma]$. There are also two equally displaced side peaks with maxima at $\omega_\pm = \pm c_0 k$ and whose full width at half maximum is $2\Gamma k^2$; these are called the Brillouin doublet and their integrated intensity is given by $S(k)/\gamma$. The intensity ratio of the Rayleigh component to the Brillouin components is $\gamma - 1$, a quantity known as the Landau–Placzek ratio. Note that a more accurate solution contains cross terms involving heat diffusion and pressure propagation and gives rise to an asymmetry in the Brillouin components.
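The Landau–Placzek ratio can be verified numerically by integrating the Rayleigh and Brillouin Lorentzians separately. In this sketch the Rayleigh term carries the factor of 2 that results from transforming the exponential decay in Eq. (18) with the convention of Eq. (20); the function names, the plain trapezoidal integration, and the parameter values are illustrative assumptions.

```python
def rayleigh_line(omega, k, thermal_diff, gamma):
    """Central (heat diffusion) Lorentzian, normalized by S(k)."""
    width = thermal_diff * k * k
    return (1.0 - 1.0 / gamma) * 2.0 * width / (omega ** 2 + width ** 2)


def brillouin_doublet(omega, k, c0, sound_att, gamma):
    """Pair of sound-mode Lorentzians at omega = +/- c0 k, normalized by S(k)."""
    width = sound_att * k * k
    left = width / ((omega + c0 * k) ** 2 + width ** 2)
    right = width / ((omega - c0 * k) ** 2 + width ** 2)
    return (left + right) / gamma


def trapezoid(func, lo, hi, n):
    """Composite trapezoidal rule with n equal intervals."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * step)
    return total * step


def landau_placzek_ratio(k, c0, sound_att, thermal_diff, gamma,
                         window=2000.0, n=100000):
    """Ratio of integrated Rayleigh to integrated Brillouin intensity."""
    rayleigh = trapezoid(
        lambda w: rayleigh_line(w, k, thermal_diff, gamma), -window, window, n)
    brillouin = trapezoid(
        lambda w: brillouin_doublet(w, k, c0, sound_att, gamma), -window, window, n)
    return rayleigh / brillouin


if __name__ == "__main__":
    # Illustrative, dimensionless values; gamma = 1.5 should give a ratio near 0.5.
    print(landau_placzek_ratio(1.0, 50.0, 1.0, 1.0, 1.5))
```

With a specific heat ratio of 1.5 the computed ratio comes out close to $\gamma - 1 = 0.5$, up to the small contribution of the Lorentzian tails outside the integration window.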

There are other time correlation functions of interest, such as the transverse current correlation,

$$J_t(k, t) = \frac{1}{N} \sum_{j,k} \left\langle v_j^T(t)\, v_k^T(0)\, e^{i\mathbf{k} \cdot [\mathbf{R}_j(t) - \mathbf{R}_k(0)]} \right\rangle \tag{22}$$

where $v_j^T(t)$ is the transverse component (direction perpendicular to $\mathbf{k}$) of the velocity of the $j$th particle at time $t$. From the Navier–Stokes equation (11) we find

$$\frac{\partial J_t(k, t)}{\partial t} = -\nu k^2 J_t(k, t) \tag{23}$$

where $\nu = \eta/\rho_0$. The corresponding frequency spectrum is a Lorentzian function,

$$J_t(k, \omega) = \frac{2 v_0^2 \nu k^2}{\omega^2 + (\nu k^2)^2} \tag{24}$$

with $J_t(k, t = 0) = v_0^2 = (\beta m)^{-1}$.

We see that at long wavelengths transverse current fluctuations in a fluid dissipate by simple diffusion at a rate given by $\nu k^2$.

IV. GENERALIZED HYDRODYNAMICS

The hydrodynamic description of fluctuations in fluids is expected to become inappropriate at finite values of $(k, \omega)$, where the conditions $kl \ll 1$ and $\omega \ll \omega_c$ no longer hold; here $l$ is the collision mean free path and $\omega_c$ the mean collision frequency (see Section I). Nevertheless, we can extend the hydrodynamic description by allowing the thermodynamic coefficients in Eqs. (11) and (12) to become wavenumber dependent and the transport coefficients to become k- and ω-dependent. This method of extension is called generalized hydrodynamics.

The basic idea of generalized hydrodynamics can be simply presented by considering the case of the transverse current fluctuations. One of the fundamental differences between simple liquids and solids is that the former cannot support a shear stress, which is another way of saying that they have zero shear modulus. On the other hand, it is also known that at sufficiently short wavelengths or high frequencies shear waves can propagate through a simple liquid because then the system behaves like a viscoelastic medium. We have seen that according to hydrodynamics the frequency spectrum of the transverse current correlation function, (24), describes a diffusion process at all frequencies. The absence of a propagating mode in (24) is an example of the inability of linearized hydrodynamics to treat viscoelastic behavior at finite (k, ω).

In the approach of generalized hydrodynamics we extend (23) by postulating the equation

$$\frac{\partial}{\partial t} J_t(k,t) = -k^2 \int_0^t dt'\, K_t(k, t - t')\, J_t(k, t') \tag{25}$$

The kernel $K_t(k,t)$ is called a memory function; it is itself a time correlation function like $J_t(k,t)$. The role of $K_t$ is to enable $J_t$ to take on a short-time behavior that is distinctly different from its behavior at long times. It is reasonable that a quantity such as $K_t(k,t)$ should be present in the extension of hydrodynamics. With the introduction of a suitable $K_t(k,t)$, we expect that (25) will give shear wave propagation at finite $(k,\omega)$, while in the limit of small $(k,\omega)$ we recover Eq. (23).

On a phenomenological basis, without specifying $K_t(k,t)$ completely by a systematic derivation, we can require this function to satisfy conditions that incorporate certain properties of the function that we can readily derive. The two properties of $K_t(k,t)$ most relevant to the present discussion are

$$K_t(k, t = 0) = (nm)^{-1} G_\infty(k) \tag{26}$$

and

$$\lim_{k \to 0} \int_0^\infty dt\, K_t(k,t) = \nu \tag{27}$$

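where $G_\infty(k)$ is the high-frequency shear modulus. Both $G_\infty$ and $\nu$ are actually properties of $J_t(k,\omega)$,

$$\frac{(k v_0)^2}{nm}\, G_\infty(k) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\, \omega^2 J_t(k,\omega) \tag{28}$$

$$2 v_0^2 \nu = \lim_{\omega \to 0} \lim_{k \to 0} \left( \frac{\omega}{k} \right)^2 J_t(k,\omega) \tag{29}$$

Moreover, Eq. (28) can be reduced to a kinetic contribution $(k v_0^2)^2$ and an integral over $g(r)$ and potential function derivative that can be evaluated by quadrature.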
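A quick consistency check of Eq. (29), offered as a sketch rather than a derivation: inserting the hydrodynamic spectrum of Eq. (24) and taking the limits in the stated order recovers the kinematic viscosity.

```latex
% Using Eq. (24) for J_t(k,\omega):
\left( \frac{\omega}{k} \right)^2 J_t(k,\omega)
   = \frac{2 v_0^2\, \nu\, \omega^2}{\omega^2 + (\nu k^2)^2}
   \;\xrightarrow{\;k \to 0\;}\; 2 v_0^2 \nu
   \qquad (\omega \neq 0 \text{ fixed})
```

The result is then independent of $\omega$, so the subsequent limit $\omega \to 0$ is trivial. Taking the limits in the opposite order would give zero, which is why the order in Eq. (29) matters.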

Equations (26) and (27) may be regarded as constraints or “boundary conditions” on $K_t(k,t)$, but by themselves they do not determine the memory function. Empirical forms have been proposed for $K_t(k,t)$ with adjustable parameters determined by imposing Eqs. (26) and (27). As an example, we consider the exponential or single relaxation time model,

$$K_t(k,t) = \frac{G_\infty(k)}{nm} \exp[-t/\tau(k)] \tag{30}$$

where we are still free to specify the wavenumber-dependent relaxation time $\tau(k)$. Notice that Eq. (26) has already been incorporated. Applying Eq. (27) we obtain $\tau(k = 0) = nm\nu/G_\infty(0)$, a quantity sometimes called the Maxwell relaxation time in viscoelastic theories. Furthermore, we expect $\tau(k)$ to be a decreasing function of $k$ on the grounds that fluctuations at shorter wavelengths generally dissipate more rapidly. The simple interpolation expression (with the transverse relaxation time now written $\tau_t(k)$)

$$\frac{1}{\tau_t^2(k)} = \frac{1}{\tau_t^2(0)} + (k v_0)^2 \tag{31}$$

would be consistent with this expectation and entails no further parameters. There exist more elaborate models for $\tau(k)$ as well as for $K_t(k,t)$, but the model Eq. (30) with (31) has the virtue of simplicity. Then Eq. (25) gives

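$$J_t(k,\omega) = \frac{2 v_0^2 k^2 K_t(k,0)}{\tau_t(k)} \left\{ \left[ \omega^2 - \left( k^2 K_t(k,0) - \frac{1}{2\tau_t^2(k)} \right) \right]^2 + \frac{1}{\tau_t^2(k)} \left[ k^2 K_t(k,0) - \frac{1}{4\tau_t^2(k)} \right] \right\}^{-1} \tag{32}$$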

The effect of the memory function now may be seen in the spectral behavior of $J_t(k,\omega)$. Whenever

$$k^2 K_t(k,0) > \frac{1}{2\tau_t^2(k)} \tag{33}$$

there will exist a finite frequency where the denominator in Eq. (32) is a minimum, and $J_t(k,\omega)$ will show a resonant peak. The resonant structure indicates a propagating mode associated with shear waves. Notice that Eq. (33) cannot hold at sufficiently small $k$; thus, in the long wavelength limit Eq. (32) can only describe diffusion, in agreement with Eq. (24). Figure 2 shows the data of molecular dynamics simulation; we see clear evidence of the onset of shear waves as $k$ increases.
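The onset of the shear-wave peak can be illustrated with a minimal Python sketch of Eqs. (30) to (33). The parameter values are hypothetical reduced units chosen only for illustration, and $K_t(k,0) = G_\infty(k)/nm$ is taken to be independent of $k$, which is a simplifying assumption not made in the text.

```python
import math

def tau_t(k, tau0, v0):
    """Relaxation time from the interpolation formula, Eq. (31)."""
    return 1.0 / math.sqrt(1.0 / tau0**2 + (k * v0) ** 2)

def jt_spectrum(omega, k, K0, tau, v0):
    """Transverse current spectrum of Eq. (32) for the exponential memory model."""
    a = k * k * K0                       # k^2 K_t(k, 0)
    shift = a - 1.0 / (2.0 * tau**2)     # squared peak frequency when positive
    denom = (omega**2 - shift) ** 2 + (a - 1.0 / (4.0 * tau**2)) / tau**2
    return 2.0 * v0**2 * a / tau / denom

def peak_frequency(k, K0, tau):
    """Frequency that minimizes the denominator of Eq. (32); 0.0 if Eq. (33) fails."""
    shift = k * k * K0 - 1.0 / (2.0 * tau**2)
    return math.sqrt(shift) if shift > 0.0 else 0.0

# Hypothetical reduced units (assumptions): v0 = 1, K_t(k,0) = 25, nu = 1.
v0, K0, nu = 1.0, 25.0, 1.0
tau0 = nu / K0   # Maxwell relaxation time tau(0) = nm*nu/G_inf(0), from Eq. (27)

for k in (0.5, 2.0, 4.0, 8.0):
    tau = tau_t(k, tau0, v0)
    print(f"k = {k:4.1f}  shear-wave peak at omega = {peak_frequency(k, K0, tau):.3f}")
```

With these illustrative numbers the condition of Eq. (33) is first met near $k = 25/7$, so the two smaller wavenumbers show purely diffusive spectra and the two larger ones show a peak at finite frequency.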

Generalized hydrodynamic descriptions for other time correlation functions also can be developed by using memory function equations such as Eq. (25). We will briefly summarize the results for density and longitudinal current fluctuations. The continuity equation, Eq. (6), is an exact expression, unlike the Navier–Stokes or the energy transport equation. One of its implications is a rigorous relation between the density correlation function, $F(k,t)$, and the longitudinal current correlation function $J_l(k,t)$. The latter is defined in a similar way as Eq. (22), with the transverse component $v_j^T$ replaced by the longitudinal component (direction parallel to k). In terms of the dynamic structure factor $S(k,\omega)$, the relation is

$$J_l(k,\omega) = \left( \frac{\omega}{k} \right)^2 S(k,\omega) \tag{34}$$


FIGURE 2 Normalized transverse current correlation function of liquid argon at various wavenumbers: molecular dynamics simulation data (circles), and exponential memory function model with $\tau_t(k)$ given by Eq. (31) (solid curves) and by a more elaborate expression (dashed curves).

Since this holds in general, we will focus our attention on $J_l(k,\omega)$. For purposes of illustration we assume that temperature fluctuations can be ignored. This means that we can set $T_1 = 0$ in Eq. (11) and obtain

$$J_l(k,\omega) = 2 v_0^2\, \frac{\nu\, (\omega k)^2}{[\omega^2 - (c_T k)^2]^2 + [\omega \nu k^2]^2} \tag{35}$$

with $c_T = c_0/\sqrt{\gamma}$ being the isothermal sound speed. We see that in the hydrodynamic description the longitudinal current fluctuations, in contrast to the transverse current fluctuations, propagate at a frequency essentially given by $\omega \sim c_T k$. If temperature fluctuations were not neglected, the propagation frequency would be $c_0 k$ and the damping constant would be governed by the sound attenuation coefficient [cf. Eq. (21)] instead of $\nu$ as in Eq. (35).

The inadequacy of the hydrodynamic description Eq. (35) at finite $(k,\omega)$ values is more subtle than is the case of $J_t(k,\omega)$. We find that Eq. (35) gives an overestimate of the damping of fluctuations, and it does not describe any of the effects associated with the intermolecular structure as manifested through the static structure factor $S(k)$. The extension of Eq. (35) can proceed if we write

$$\frac{\partial J_l(k,t)}{\partial t} = -\int_0^t dt'\, K_l(k, t - t')\, J_l(k, t') \tag{36}$$

with

$$K_l(k,t) = \frac{(k v_0)^2}{S(k)} + k^2 \phi_l(k,t) \tag{37}$$

The form of $K_l(k,t)$ is motivated by the coupling of Eqs. (10) and (11), and the generalization of the isothermal compressibility $\chi_T$, $n k_B T \chi_T \to S(k)$. Combining Eqs. (36) and (37) gives

$$J_l(k,\omega) = \frac{2 v_0^2 (\omega k)^2\, \phi_l'(k,\omega)}{\left[ \omega^2 - \dfrac{(k v_0)^2}{S(k)} + \omega k^2 \phi_l''(k,\omega) \right]^2 + \left[ \omega k^2 \phi_l'(k,\omega) \right]^2} \tag{38}$$

where $\phi_l'$ and $\phi_l''$ are the real and imaginary parts of

$$\phi_l(k,s) = \int_0^\infty dt\, e^{-st} \phi_l(k,t) \tag{39}$$

with $s = i\omega$, and they describe the dissipative and reactive responses, respectively.
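The step from Eqs. (36) and (37) to Eq. (38) can be sketched as follows. The sketch assumes the initial value $J_l(k, t = 0) = v_0^2$ and the relation $J_l(k,\omega) = 2\,\mathrm{Re}\,\tilde J_l(k, s = i\omega)$ between the spectrum and the one-sided Laplace transform $\tilde J_l$; both are conventions adopted here for illustration.

```latex
% Laplace transform of Eq. (36), with \tilde K_l from Eq. (37):
\tilde J_l(k,s) = \frac{v_0^2}{s + \tilde K_l(k,s)}, \qquad
\tilde K_l(k,s) = \frac{(k v_0)^2}{S(k)\, s} + k^2 \phi_l(k,s)

% Set s = i\omega and write \phi_l(k, i\omega) = \phi_l' + i \phi_l'':
\tilde J_l(k, i\omega) =
  \frac{i \omega\, v_0^2}
       {\dfrac{(k v_0)^2}{S(k)} - \omega^2 - \omega k^2 \phi_l'' + i\, \omega k^2 \phi_l'}

% Taking twice the real part:
J_l(k,\omega) = 2\, \mathrm{Re}\, \tilde J_l(k, i\omega)
  = \frac{2 v_0^2 (\omega k)^2 \phi_l'}
         {\left[ \omega^2 - \dfrac{(k v_0)^2}{S(k)} + \omega k^2 \phi_l'' \right]^2
          + \left[ \omega k^2 \phi_l' \right]^2}
```

Replacing $\phi_l'$ by a constant $\nu$, dropping $\phi_l''$, and replacing $(k v_0)^2/S(k)$ by $(c_T k)^2$ reduces this expression to the hydrodynamic form of Eq. (35).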

It is evident from a comparison of Eq. (38) with Eq. (35) that in addition to the generalization of the isothermal compressibility, the longitudinal viscosity has become a complex k- and ω-dependent quantity. Through $\phi_l(k,t)$ we can again introduce physical models and use various properties to determine the $k$ dependence. One way to characterize the breakdown of hydrodynamics in the case of $J_l(k,\omega)$ is to follow the frequency of the propagating mode as $k$ increases. Notice first that by virtue of Eq. (34) $J_l(k,\omega)$ always shows a peak at a nonzero frequency. At small $k$ this peak is associated with sound propagation. If we define

$$c(k) = \frac{\omega_m(k)}{k} \tag{40}$$

where $\omega_m(k)$ is the peak position, then $c(k)$ in the long wavelength limit is the adiabatic sound speed. This being the case, it is reasonable to regard Eq. (40) as the speed at which collective modes propagate in the fluid at any wavenumber. In terms of $c(k)$ we have a well-defined quantity for discussing the variation of propagation speed at finite $k$. Notice that we do not refer to the propagating fluctuations at finite $k$ as sound waves, because the latter are excitations that manifest clearly in $S(k,\omega)$.

FIGURE 3 Variation of propagation velocities with wavenumber in liquid argon. Generalized hydrodynamics results are given as the solid curve denoted by $c(k)$ and by the dashed curve, neutron scattering measurements are denoted by the closed circles and slash marks, and computer simulation data are denoted by open circles. The quantities $c_0(k)$ and $c_\infty(k)$ are defined in the text.

There exist computer simulation results and neutron inelastic scattering data on simple liquids from which $c(k)$ can be determined. Figure 3 shows a comparison of these results with a generalized hydrodynamics calculation. Also shown are the adiabatic sound speed $c_0(k)$ and the high-frequency sound speed $c_\infty(k)$,

$$c_0(k) = v_0\, [\gamma / S(k)]^{1/2} \tag{41}$$

$$c_\infty(k) = \left\{ \frac{1}{nm} \left[ \frac{4}{3} G_\infty(k) + K_\infty(k) \right] \right\}^{1/2} \tag{42}$$

where $K_\infty$ is the high-frequency bulk modulus. It is seen in Fig. 3 that $c_0(k)$ and $c_\infty(k)$ provide lower and upper bounds on $c(k)$. The fact that $c(k)$ deviates from both may be attributed to dynamical effects, which cannot be described through static properties such as in Eqs. (41) and (42). Relative to the adiabatic sound speed $c_0(k \to 0)$ we see in $c(k)$ first an enhancement as $k$ increases up to about 1 Å$^{-1}$, then a sharp decrease at larger $k$. The former behavior, a positive dispersion, is due to shear relaxation, whereas the latter, a strong negative dispersion, is due to structural correlation effects represented by $S(k)$. From this discussion we may conclude that an expression such as Eq. (38), with rather simple physical models for $\phi_l(k,t)$, provides a semiquantitatively correct description of density and current fluctuations at finite $(k,\omega)$.

V. KINETIC THEORY

In the theory of particle and radiation transport in fluids there exists a well established connection between the continuum approach as represented by the hydrodynamics equations and the molecular approach as represented by kinetic equations in phase space, an example of which is the Boltzmann equation in gas dynamics. Through this connection we can obtain expressions for calculating the input parameters in the continuum equations, such as the transport coefficients in Eqs. (11) and (12). We can also solve the kinetic equations directly to analyze thermal fluctuations at finite $(k,\omega)$, and in this way take into account, explicitly, the effects of spatial correlations and detailed dynamics of molecular collisions. In contrast to generalized hydrodynamics, the kinetic theory method allows us to derive, rather than postulate, the space–time memory functions like $K(k,t)$.

The essence of the kinetic theory description is that particle motions are followed in both configuration and momentum space. Analogous to Section II we begin with the phase space density

$$A(\mathbf{r}\mathbf{p}t) = \sum_{i=1}^{N} \delta(\mathbf{r} - \mathbf{R}_i(t))\, \delta(\mathbf{p} - \mathbf{P}_i(t)) \tag{43}$$

and the time-dependent phase-space density correlation function [cf. Eq. (7)]

$$C(\mathbf{r} - \mathbf{r}', \mathbf{p}\mathbf{p}', t) = \langle \delta A(\mathbf{r}\mathbf{p}t)\, \delta A(\mathbf{r}'\mathbf{p}'0) \rangle \tag{44}$$

with $\langle A \rangle = n f_0(p)$. The fundamental quantity in the analysis is now $C(\mathbf{r}, \mathbf{p}\mathbf{p}', t)$, from which the time correlation functions of Section II can be obtained by appropriate integration over the momentum variables. For example,

$$G(\mathbf{r}, t) = \int d^3p\, d^3p'\, C(\mathbf{r}, \mathbf{p}\mathbf{p}', t) \tag{45}$$

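Various methods have been proposed to derive the equation governing $C(\mathbf{r}, \mathbf{p}\mathbf{p}', t)$. All the results can be put into the generic form

$$\left( z - \frac{\mathbf{k} \cdot \mathbf{p}}{m} \right) C(\mathbf{k}\mathbf{p}\mathbf{p}'z) - \int d^3p''\, \phi(\mathbf{k}\mathbf{p}\mathbf{p}''z)\, C(\mathbf{k}\mathbf{p}''\mathbf{p}'z) = -i\, C_0(\mathbf{k}\mathbf{p}\mathbf{p}') \tag{46}$$

where

$$C(\mathbf{k}\mathbf{p}\mathbf{p}'z) = \int d^3r \int_0^\infty dt\, e^{i(\mathbf{k} \cdot \mathbf{r} - zt)}\, C(\mathbf{r}, \mathbf{p}\mathbf{p}', t) \tag{47}$$

with the initial condition

$$C_0(\mathbf{k}\mathbf{p}\mathbf{p}') = \int d^3r\, e^{i\mathbf{k} \cdot \mathbf{r}}\, C(\mathbf{r}, \mathbf{p}\mathbf{p}', t = 0) = n f_0(p)\, \delta(\mathbf{p} - \mathbf{p}') + n^2 f_0(p) f_0(p')\, h(k) \tag{48}$$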


and $n h(k) = S(k) - 1$. In Eq. (46) the function $\phi(\mathbf{k}\mathbf{p}\mathbf{p}'z)$ is the phase-space memory function, which plays the same role as the memory function $K(k,t)$ in Eq. (25) or Eq. (36). It contains all the effects of molecular interactions. If $\phi$ were identically zero, then Eq. (46) would describe a noninteracting system in which the particles move in straight line trajectories at constant velocities. We can also think of $\phi$ as the collision kernel in a transport equation.
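The noninteracting limit mentioned above can be made explicit with a short worked step. It uses only Eq. (46) with $\phi = 0$ and the transform convention of Eq. (47); the convergence condition $\mathrm{Im}\, z < 0$ is an assumption added here so that the time integral exists.

```latex
% Eq. (46) with \phi = 0:
C(\mathbf{k}\mathbf{p}\mathbf{p}'z) = \frac{-i\, C_0(\mathbf{k}\mathbf{p}\mathbf{p}')}{z - \mathbf{k} \cdot \mathbf{p}/m}

% This is the one-sided transform (convention of Eq. (47), Im z < 0) of
C(\mathbf{k}\mathbf{p}\mathbf{p}', t) = C_0(\mathbf{k}\mathbf{p}\mathbf{p}')\, e^{\,i (\mathbf{k} \cdot \mathbf{p}/m)\, t},
\qquad \text{since} \quad
\int_0^\infty dt\, e^{-izt}\, e^{\,i (\mathbf{k} \cdot \mathbf{p}/m)\, t} = \frac{-i}{z - \mathbf{k} \cdot \mathbf{p}/m}
```

The phase factor $e^{i(\mathbf{k} \cdot \mathbf{p}/m)t}$ is what a particle moving with constant velocity $\mathbf{p}/m$ contributes to a density fluctuation of wavevector $\mathbf{k}$, which is the straight-line free streaming described in the text.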

There are a number of formal properties of $\phi$ pertaining to symmetries, conservation laws, and asymptotic behavior, which one can analyze. Also, explicit calculations have been made under different conditions, such as low density, weak coupling, or relaxation time models. In general, it is useful to separate $\phi$ into an instantaneous, or static, part and a time-varying, or collisional, part,

$$\phi(\mathbf{k}\mathbf{p}\mathbf{p}'z) = \phi^{(s)}(\mathbf{k}\mathbf{p}) + \phi^{(c)}(\mathbf{k}\mathbf{p}\mathbf{p}'z) \tag{49}$$

where

$$\phi^{(s)}(\mathbf{k}\mathbf{p}) = -\frac{\mathbf{k} \cdot \mathbf{p}}{m}\, n f_0(p)\, C(k) \tag{50}$$

The quantity $C(k) = [S(k) - 1]/nS(k)$ is known as the direct correlation function. Physically $\phi^{(s)}$ represents the effects of mean field interactions with $nC(k)$ as the effective potential of the fluid system.

The calculation of $\phi^{(c)}$ is a difficult problem because we have to deal with the details of collision dynamics. It can be shown that in the limit of low densities, low frequencies, and small wavenumbers, $\phi^{(c)}$ reduces to the collision kernel in the linearized Boltzmann equation. This connection is significant because the Boltzmann equation is the fundamental equation in the study of transport coefficients and of the response of a gas to external perturbations.

The basic assumption underlying the Boltzmann equation is that intermolecular interactions can be treated as a sequence of uncorrelated binary collisions. This assumption renders the equation much more tractable, but it also limits the validity of the equation to low-density gases. Figure 4 shows the frequency spectrum of density fluctuations in xenon gas at 349.6 K and 1.03 atm calculated according to the procedure:

$$S(k,\omega) = \frac{1}{\pi}\, \mathrm{Re} \int d^3p\, d^3p'\, C(\mathbf{k}\mathbf{p}\mathbf{p}'z) \Big|_{z = i\omega} \tag{51}$$

where $C$ is determined from Eq. (46) with $\phi^{(c)}$ given by the binary collision kernel for hard sphere interactions. At such a low density it is valid to ignore $\phi^{(s)}$ and the second term in Eq. (48).

Also shown in Fig. 4 are the experimental data from light scattering spectroscopy. The good agreement is evidence that the linearized Boltzmann equation provides an accurate description of thermal fluctuations in low-density gases in the kinetic regime where $kl \sim 1$. The agreement is less satisfactory when the data are compared with the results of hydrodynamics; in this case the calculated spectrum shows essentially no structure. This again indicates that at finite $(k,\omega)$ the hydrodynamic theory overestimates the damping of density fluctuations. Generally speaking, kinetic theory calculations have been quantitatively useful in the analysis of light scattering experiments on gases and gas mixtures.

FIGURE 4 Frequency spectrum of dynamic structure factor in xenon gas at 349.6 K and 1.03 atm; light scattering data for 6328 Å incident light and scattering angle of 169.4° are shown as closed circles while the full curve denotes results obtained using the linearized Boltzmann equation for hard spheres. Calculated spectrum has been convolved with the resolution function shown by the dashed curve.

For moderately dense systems, typically fluids at around the critical density, the Boltzmann equation needs to be modified to take into account the local structure of the fluid. In the case of hard spheres, the modified equation generally adopted is the generalized Enskog equation, which involves $g(\sigma)$, the pair distribution function at contact (with $\sigma$ the hard sphere diameter); the collision term differs from the collision integral in the linearized Boltzmann equation for hard spheres only in the presence of two phase factors, which represent the nonlocal spatial effects in collisions between molecules of finite size.

Figure 5 shows the frequency spectra of density fluctuations obtained from simulation and kinetic theory at rather long wavelengths in hard sphere fluids and at three densities, corresponding roughly to half the critical density, 1.7 times critical density, and liquid density at the triple point. The $k\sigma$ values are such that using the expression $l^{-1} = \sqrt{2}\, \pi n \sigma^2 g(\sigma)$ for the collision mean free path, we find that for the three cases (a)–(c) a molecule on the average would have suffered about 1, 5, and 20 collisions, respectively, in traversing a distance equal to the hard sphere diameter. On this basis we might expect the spectra in (b) and (c) to be dominated by hydrodynamic behavior, while that in (a) should show significant deviations.
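The collision counts quoted above follow from $\sigma/l = \sqrt{2}\, \pi\, n\sigma^3 g(\sigma)$. The sketch below reproduces them for the three reduced densities of Fig. 5. The contact value $g(\sigma)$ is estimated here from the Carnahan–Starling expression, which is an assumption introduced for illustration and not the source of the values used in the article; the estimates come out close to the values 1.22, 2.06, and 4.98 that appear in the caption of Fig. 5.

```python
import math

def g_contact_cs(n_star):
    """Carnahan-Starling estimate of g(sigma) for hard spheres (an assumption)."""
    eta = math.pi * n_star / 6.0            # packing fraction for n* = n sigma^3
    return (1.0 - 0.5 * eta) / (1.0 - eta) ** 3

def collisions_per_diameter(n_star, g_sigma):
    """sigma / l, using 1/l = sqrt(2) * pi * n * sigma^2 * g(sigma)."""
    return math.sqrt(2.0) * math.pi * n_star * g_sigma

# Reduced densities n sigma^3 of cases (a)-(c) in Fig. 5.
for label, n_star in (("a", 0.1414), ("b", 0.471), ("c", 0.884)):
    g = g_contact_cs(n_star)
    print(f"({label}) n*={n_star}: g(sigma) ~ {g:.2f}, "
          f"sigma/l ~ {collisions_per_diameter(n_star, g):.1f}")
```

The three ratios are of order 1, 5, and 20, in line with the estimate in the text.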


FIGURE 5 Frequency spectra of dynamic structure factor $S(k,\omega)$ in hard sphere fluids at three densities; simulation data are shown as open circles while the solid curves denote results obtained using the generalized Enskog equation. Dimensionless frequency $\omega^*$ is defined as $\omega\tau_E/k\sigma$, with the Enskog collision time $\tau_E^{-1} = 4\sqrt{\pi}\, n v_0 \sigma^2 g(\sigma)$. The only inputs to the calculations are $S(k)$ and $g(\sigma)$, which can be obtained from the simulation data. In (b) the effects of ignoring entirely the static part of the memory function and of using the conventional Enskog equation are also shown. For (a) $n\sigma^3 = 0.1414$, $k\sigma = 0.412$, $S(k) = 0.563$, $g(\sigma) = 1.22$; (b) $n\sigma^3 = 0.471$, $k\sigma = 0.616$, $S(k) = 0.149$, $g(\sigma) = 2.06$; and (c) $n\sigma^3 = 0.884$, $k\sigma = 0.759$, $S(k) = 0.0271$, $g(\sigma) = 4.98$.

The theoretical curves in Fig. 5 are kinetic model solutions to the generalized Enskog equation. They are seen to describe quantitatively the computer simulation data. We could have expected good agreement in the lowest density case, which is nevertheless two orders of magnitude higher in density than a gas under standard conditions. That the theory is still accurate at condition (b) is already somewhat unexpected. So it is rather surprising that a kinetic theory that treats the interactions as only uncorrelated binary collisions is applicable at liquid density, as shown in (c). Indeed, at three times the present value of $k\sigma$ a characteristic discrepancy appears in the high-density case, as shown in Fig. 6.

The failure of the generalized Enskog equation to account properly for the simulation results at low frequencies can be traced to the presence of a slower decaying component in the data for $F(k,t)$. It seems reasonable to associate this with the relaxation of clusters of particles, which should become important at high densities. Just like the onset of shear wave propagation, this characteristic feature is part of the viscoelastic behavior expected of dense fluids. In order to describe such effects in the present context, it is now recognized that correlated collisions will have to be included in the kinetic equation. Aside from density and thermal fluctuations, it is also known that the transport coefficients derived from the Enskog equation are in error by up to a factor of 2 at the liquid density when compared to computer simulation data for hard spheres. Moreover, simulation studies have revealed a nonexponential, long-time decay of the velocity autocorrelation function that cannot be explained by the Enskog theory.

Any attempt to treat correlated collision effects necessarily leads to nonlinear kinetic equations. For practical calculations it appears that only the correlated binary collisions, called ring collisions, are tractable. To incorporate these dynamical processes in the kinetic theory, we can develop a formalism wherein $\phi^{(c)}$ is given as the sum $\phi^{(c)} = \phi_E + \phi_R$, where $\phi_E$ is the memory function for the generalized Enskog equation, and $\phi_R$ describes the ring collision contribution. In essence $\phi_R$ can be expressed schematically as $\phi_R = VCCV$, where $V$ is an effective interaction, which involves the actual intermolecular potential and the equilibrium distribution function of the fluid, and $C$ is the phase space correlation function. The important point to note is that the memory function now depends quadratically on $C$, thereby making Eq. (46) a nonlinear kinetic equation. The appearance of nonlinearity, or feedback effects, is not so surprising when we recognize that in a dense medium the motions of a molecule will have considerable effects on its surroundings, which in turn will react and influence its subsequent motions. The inclusion of correlated collisions is a significant development in the study of transport called renormalized kinetic theory.

FIGURE 6 The density correlation function and normalized transverse current correlation in a hard sphere fluid at a density of $n\sigma^3 = 0.884$. The $k$ value is $2.28\, \sigma^{-1}$. Computer simulation data are given by the circles, while calculations using the generalized Enskog equation or the mode coupling theory are denoted by the dashed and solid curves, respectively.

The presence of ring collisions unavoidably makes the analysis of time correlation functions considerably more difficult. Nevertheless, it can be shown analytically that we obtain a number of nontrivial collective properties characteristic of a dense fluid, such as a power law decay of the velocity autocorrelation function, and nonanalytic density expansions of sound dispersion and transport coefficients.

VI. MODE COUPLING THEORY

There exists another method of analyzing time correlation functions, which has common features with both generalized hydrodynamics and renormalized kinetic theory. In this approach we formulate an approximate expression for the space–time memory function that is itself nonlinear in the time correlation functions. The method is called mode coupling because the correlation functions describe the hydrodynamic modes, the conserved variables of density, momentum, and energy, in the small $(k,\omega)$ limit, and they are brought together to represent higher order correlations, which are important in a strongly coupled system such as a liquid.

The mode coupling approach has been particularly successful in describing the dynamics of dense fluids; it is the only tractable microscopic theory of dense simple fluids. To describe the mode coupling formalism, we consider the density correlation function or its Laplace transform

$$S(k,z) \equiv i \int_0^\infty dt\, e^{izt} F(k,t) \equiv \mathcal{L}[F(k,t)] \tag{52}$$

and similarly for $J_l(k,z)$, the longitudinal current correlation function. Using the continuity equation we find

$$S(k,z) = -\frac{S(k)}{z} + \left( \frac{k}{z} \right)^2 J_l(k,z) \tag{53}$$

which is just another form of Eq. (34). Now we write an equation for $J_l(k,z)$ [cf. Eq. (36)] in the form

$$J_l(k,z) = -v_0^2 \left[ z - \frac{\Omega_0^2(k)}{z} + D(k,z) \right]^{-1} \tag{54}$$

where $\Omega_0^2(k) = (k v_0)^2/S(k)$ and $D(k,z)$ is the memory function. Combining Eqs. (53) and (54) gives

$$S(k,z) = -S(k) \left[ z - \frac{\Omega_0^2(k)}{z + D(k,z)} \right]^{-1} \tag{55}$$

which is an exact equation. The basic assumption underlying the mode coupling theory is the approximate expression derived for $D(k,z)$. In essence one obtains

$$D(k,z) \simeq \frac{\Omega_0^2(k)}{v} + \int d^3k'\, V(\mathbf{k}, \mathbf{k}')\, \mathcal{L}\big[ F(k', t)\, F(|\mathbf{k} - \mathbf{k}'|, t) \big] \tag{56}$$

where $v$ is a characteristic collision frequency usually taken from the Enskog theory, and $V(\mathbf{k}, \mathbf{k}')$ is an effective interaction. Equation (56) is an example of a two-mode coupling approximation involving two density modes $F(k,t)$. Depending on the problem, we can have other products containing modes from the group $F(k,t)$, $F_s(k,t)$, $J_t(k,t)$, $J_l(k,t)$, and the energy fluctuation, and for each mode coupling term there will be an appropriate vertex interaction $V(\mathbf{k}, \mathbf{k}')$.

The calculation of $S(k,z)$ is fully specified by combining Eqs. (55) and (56). By expressing the memory function back in terms of the correlation function, we obtain a self-consistent description capable of treating feedback effects. These are the effects that become important at high densities, and that we have tried to treat in the kinetic theory approach through the ring collisions.
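The algebra that leads from Eqs. (53) and (54) to the closed form of Eq. (55) can be spot-checked numerically. The following sketch evaluates both routes at an arbitrary complex frequency; all numerical values are hypothetical and serve only to exercise the identity, and the memory function $D(k,z)$ is replaced by a fixed complex number.

```python
def s_from_current(z, k, v0, S, D):
    """S(k,z) from Eq. (53), with J_l(k,z) taken from Eq. (54)."""
    omega0_sq = (k * v0) ** 2 / S            # Omega_0^2(k) = (k v0)^2 / S(k)
    j_l = -v0**2 / (z - omega0_sq / z + D)   # Eq. (54)
    return -S / z + (k / z) ** 2 * j_l       # Eq. (53)

def s_closed_form(z, k, v0, S, D):
    """S(k,z) from Eq. (55)."""
    omega0_sq = (k * v0) ** 2 / S
    return -S / (z - omega0_sq / (z + D))

# Hypothetical test point (assumed values, not taken from the article).
z, D = 0.7 + 0.3j, 0.2 + 1.1j
k, v0, S = 1.5, 0.8, 0.4

lhs = s_from_current(z, k, v0, S, D)
rhs = s_closed_form(z, k, v0, S, D)
print(abs(lhs - rhs))   # difference at round-off level if the two forms agree
```

Because Eq. (55) is exact, the agreement holds for any choice of $D$; the approximation enters only when Eq. (56) is substituted for $D(k,z)$.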

Mode coupling calculations were first applied to analyze the transverse and longitudinal current correlation functions in liquid argon and liquid rubidium. The theory was found to give a satisfactory account of the computer simulation results on shear wave propagation in $J_t$ and the dispersion behavior of $\omega_m(k)$ in $J_l$. The theory was then reformulated for the case of hard spheres and extensive numerical results were obtained and compared in detail with simulation data. It was shown that the viscoelastic behavior discussed previously, which could not be explained by the generalized Enskog equation, is now well described. The improvement due to mode coupling can be seen in Fig. 6.

Another problem where the capability of mode coupling analysis to treat dense medium effects can be demonstrated is the Lorentz model. This is the study of the diffusion of a tagged particle in a random medium of stationary scatterers and of its localization when the scatterer density $n$ exceeds a critical value $n_c$. The system can be characterized by the diffusion coefficient of the tagged particle $D$, which plays the role of an order parameter. The model then exhibits two distinct phases, a “diffusion” phase with $D \neq 0$ when $n < n_c$, and a “localization” phase with $D = 0$ when $n > n_c$.

FIGURE 7 Density variation of the diffusion coefficient in the two-dimensional Lorentz model where the hard disks can overlap: mode coupling theory (solid curve) and computer simulation data. $D_0$ is the diffusion coefficient given by the Enskog theory.

Figure 7 shows the density variation of $D$ in the case of the two-dimensional Lorentz model with hard disk scatterers that can overlap. The mode coupling theory gives satisfactory results if the density is scaled according to $n_c$. As for the prediction of $n_c$, the theory gives $n_c^* = n_c\sigma^d = 0.64$ and 0.72 for $d = 2$ and $d = 3$, respectively, while molecular dynamics simulations give 0.37 and 0.72. Here the tagged particle and the stationary scatterers are both hard spheres of diameter $\sigma$, and $d$ is the dimensionality of the system. The fact that the theory does not give an accurate value for $n_c$ in two dimensions indicates that the statistical distribution of system configurations in which the particle becomes trapped requires a more complicated treatment than the simplest mode coupling approximation. On the other hand, the density variation of the velocity autocorrelation function observed by simulation, particularly its nonexponential decay at long times for $n < n_c$, can be calculated very satisfactorily.

In view of the successful attempts at describing the dense, hard sphere fluids and the Lorentz model, we might wonder what mode coupling theory will give at densities beyond the normal liquid density, typically taken to be the triple point density of a van der Waals liquid, $n^* = n\sigma^3 = 0.884$. On intuition alone we expect that as the atoms in the fluid are pushed more closely against each other, structural rearrangement becomes more and more difficult so that at a certain density the local structure will no longer relax on the time scale of observation. This condition of structural arrest is a fundamental characteristic of solidification, and it is appropriate to ask if mode coupling theory can describe such a highly cooperative process. Indeed, a certain self-consistent approximation in mode-coupling theory will lead to a model that exhibits a freezing transition. The signature of the transition is that the system becomes nonergodic at a critical value of the density or temperature.

To demonstrate that the mode-coupling formalism can describe a transition from ergodic to nonergodic behavior, we consider a schematic model for the normalized dynamic structure factor $\varphi(z) \equiv S(k,z)/S(k)$, and ignore the wavenumber dependence in the problem. In analogy with Eq. (55) we write

$$-\varphi(z)^{-1} = z + K(z) \tag{57}$$

$$-\Omega_0^2\, K(z)^{-1} = z + M(z) \tag{58}$$

with $M(z)$ playing the role of $D(k,z)$. We will consider two different approximations to the memory function $M(z)$,

$$M(z) \simeq M_0(z) + m(z) \tag{59}$$

and

$$M(z) \simeq M_0(z) + \frac{m(z)}{1 - \Delta(z)\, m(z)} \tag{60}$$

with

$$M_0(t) = \omega\, \delta(t) \tag{61}$$

$$m(t) = 4\lambda \Omega_0^2 F^2(t) \tag{62}$$

$$\Delta(t) = \lambda' F(t) J(t) \tag{63}$$

Equations (59) and (62) constitute the original mode-coupling approximation, henceforth denoted as the LBGS model, in which only the coupling of density fluctuation modes, with $F(t)$ defined by Eq. (18), is considered. Equations (60) and (63) constitute an extension in which the coupling to longitudinal current modes, with $J(t)$ given by Eq. (34), is also considered. We will refer to this as the extended mode-coupling approximation. In both models $M_0(z) = i\omega$ is the Enskog-theory contribution to the memory function, where $\omega$ is an effective collision frequency. The coupling coefficients $\lambda$ and $\lambda'$ will be treated as density- and temperature-dependent constants. Comparing the two models, we see that the difference lies in the presence of $\Delta(z)$ in Eq. (60).

It may seem remarkable that an apparently simple approximation of coupling two density modes can provide a dynamical model of freezing. To see how this comes about, notice that the quantity of interest in the analysis is the relaxation behavior of the time-dependent density correlation function $G(r,t)$, or its Fourier transform $F(t) = F(k,t)$. Under normal conditions one expects $F(t \to \infty) = 0$ because all thermal fluctuations in an equilibrium system should die out if one waits long enough. When freezing occurs, this condition no longer holds as some correlations now can persist for all times. The condition that $F(t)$ stays finite as $t \to \infty$ means the system has become nonergodic. To see that Eq. (59) can give such a transition, we look for a solution to the closed set of equations (57), (58), and (59) of the form

$$\varphi(z) = -\frac{f}{z} + (1 - f)\, \varphi_v(z) \tag{64}$$

where the first term is that component of $F(t)$ which does not vanish at long times, $F(t \to \infty) = f$, and $\varphi_v(z)$ is a well-behaved function that is not singular at small $z$. Since $\varphi_v(z)$ is not pertinent to our discussion, we do not need to show it explicitly. Inserting this result into (59) yields

$$M(z) = -\frac{4\lambda \Omega_0^2 f^2}{z} + M_v(z) \tag{65}$$

with $M_v(z)$ representing all the terms that are nonsingular at small $z$; we obtain

$$\varphi(z) = -\frac{4\lambda f^2}{1 + 4\lambda f^2}\, \frac{1}{z} - \Psi(z)\, \frac{1}{1 + 4\lambda f^2} \tag{66}$$

with $\Psi(z)$ also well behaved. For Eqs. (64) and (66) to be compatible, we must require

$$f = \frac{4\lambda f^2}{1 + 4\lambda f^2} \tag{67}$$

This is a simple quadratic equation for $f$, with solution $f = 1/2 + (1/2)(1 - 1/\lambda)^{1/2}$. Therefore, we see that in order for the postulated form of the density correlation function to be an acceptable solution to Eqs. (57), (58), and (59), $f$ must be real, or $\lambda > 1$.
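The self-consistency condition of Eq. (67) is easy to explore numerically. The minimal sketch below returns the nondecaying component $f$ for a given coupling $\lambda$ and checks that it satisfies Eq. (67). The function names are illustrative. Note that $f = 0$ also solves Eq. (67) algebraically for any $\lambda$; the code returns it only in the region $\lambda < 1$, where no real nonzero solution exists.

```python
import math

def nonergodicity_parameter(lam):
    """Nondecaying component f from Eq. (67); 0.0 when no real nonzero root exists."""
    if lam < 1.0:
        return 0.0                        # ergodic phase: only f = 0 is available
    return 0.5 + 0.5 * math.sqrt(1.0 - 1.0 / lam)

def residual(f, lam):
    """Left side minus right side of Eq. (67); zero for a self-consistent f."""
    return f - 4.0 * lam * f * f / (1.0 + 4.0 * lam * f * f)

for lam in (0.5, 1.0, 2.0, 10.0):
    f = nonergodicity_parameter(lam)
    print(f"lambda = {lam:5.1f}  f = {f:.4f}  residual = {residual(f, lam):.1e}")
```

At $\lambda = 1$ the formula gives $f = 1/2$, so in this schematic model the nondecaying component appears discontinuously at the transition rather than growing from zero.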

The implication of this analysis is that in the LBGS model the ergodic phase is defined by the region $\lambda < 1$, where the nondecaying component $f$ must vanish, and a nonergodic phase exists for $\lambda > 1$. The onset of nonergodicity signifies the freezing in of some of the structural degrees of freedom in the fluid; therefore, it may be regarded as a transition from a liquid to a glass. The origin of this transition is purely dynamical since it arises from a nonlinear feedback mechanism introduced through $m(z)$. The freezing or localization of the particles shows up as a simple pole in the low-frequency behavior of $\varphi(z)$, a consequence of the fact that $M(z) \approx 1/z$ at low frequencies.

The LBGS model is the first mode-coupling approximation providing a dynamical description of an ergodic to nonergodic transition. The transition also has been derived using a nonlinear fluctuating hydrodynamics formulation. Analysis of the LBGS model shows that the diffusion coefficient $D$ has a power-law density dependence, $D \sim (n_c - n)^\alpha$, with exponent $\alpha \simeq 1.76$, and correspondingly the reciprocal of the shear viscosity coefficient $\eta$ behaves in the same way. There exist experimental and molecular dynamics simulation data that provide evidence supporting the density and temperature variation of transport coefficients predicted by the model. Specifically, diffusivity data for the supercooled liquid methylcyclohexane and for hard-sphere and Lennard–Jones fluids obtained by simulation are found to have a density dependence that can be fitted to the predicted power law. The fact that the mode-coupling approximation is able to give a reasonable description of transport properties in liquids at high densities and low temperatures beyond the triple point is considered rather remarkable.

The LBGS model also has been found to provide the theoretical basis for interpreting recent neutron and light scattering measurements on dynamical relaxations in dense fluids. These experiments show that the temporal relaxation of the density correlation function F(k, t) is nonexponential, $F(k, t) \sim \exp[-(t/\tau)^{\beta}]$, with β distinctly less than unity. This behavior of scaling, in the sense of F being a function of t/τ, where τ is a temperature-dependent relaxation time, and of stretching, in the sense of β < 1, is also given by Eq. (59) provided a term λ′′F(t) is added to Eq. (62). Thus, the ability of the mode-coupling approximation to describe the dynamical features of relaxation in dense fluids has considerable current experimental support.
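The two properties named here, scaling and stretching, can be made concrete with a minimal sketch of the stretched-exponential form $\exp[-(t/\tau)^{\beta}]$. The value β = 0.6 and the function name are assumptions chosen only for illustration.

```python
import math

def stretched_exponential(t: float, tau: float, beta: float) -> float:
    """F = exp[-(t / tau)**beta]; beta = 1 recovers simple exponential decay."""
    if t < 0 or tau <= 0 or beta <= 0:
        raise ValueError("require t >= 0, tau > 0, beta > 0")
    return math.exp(-((t / tau) ** beta))

if __name__ == "__main__":
    beta = 0.6  # hypothetical stretching exponent, chosen only for illustration
    # Scaling: two different relaxation times give the same F at equal t / tau.
    print(stretched_exponential(2.0, 1.0, beta), stretched_exponential(20.0, 10.0, beta))
    # Stretching: relative to beta = 1, decay is faster for t < tau, slower for t > tau.
    for x in (0.1, 1.0, 10.0):
        print(x, stretched_exponential(x, 1.0, beta), stretched_exponential(x, 1.0, 1.0))
```

At t = τ the function equals 1/e for any β, so τ keeps its meaning as the relaxation time while β controls only the shape of the decay.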

The successes of the approximation Eq. (59) notwithstanding, it does have an important physical shortcoming: it does not treat the hopping motions of atoms when they are trapped in positions of local potential minima. These motions are expected to be dominant at sufficiently low temperature of supercooling; their presence means that the system should remain in the ergodic phase, although the relaxation times can become exceedingly long. For this reason, the predicted transition of the LBGS model is called the ideal glass transition; in reality one does not expect such a transition to be observed. The extended mode-coupling model, Eq. (60), in fact provides a cutoff for the ideal glass transition by virtue of the presence of an additional z-dependent term. One can see this quite simply from the small-z behavior of Eqs. (57), (58), and (60). With this term nonzero, ϕ(z) no longer has a singular component varying like 1/z, so F(t) will always vanish at sufficiently long times.

Even though the two approximations, Eqs. (59) and (60), give different predictions for the transition, one has to resort to numerical results in order to see the differences between the two models in their descriptions of F(k, t) in the time region accessible to computer simulation and neutron and light scattering measurements. In Fig. 8 we show the intermediate scattering function F(k, t) of a fluid calculated by simulation using a truncated Lennard–Jones interaction at various fluid densities ($n^* = n\sigma^3$). One sees that as $n^*$ increases, the relaxation of F(k, t) becomes increasingly slow. Compared to these simulation results, the corresponding mode-coupling calculations, using a model equivalent to Eq. (59), show the same qualitative behavior of slowing down of relaxation; however, the mode-coupling model predicts a freezing effect that is too strong. This discrepancy is not seen in a model equivalent to Eq. (60). Thus, there is numerical evidence that the cutoff mechanism of the transition, represented by the additional term in Eq. (60), is rather significant.

FIGURE 8 Relaxation of density correlation function F(t) at wavenumber k = 2 Å$^{-1}$ obtained by molecular dynamics simulation at various reduced densities $n^*$ and reduced temperature $T^* = 0.6$. Time unit τ is defined as $(m\sigma^2/\varepsilon)^{1/2}$.

To what extent can the dynamical features of supercooled liquids be described by mode-coupling models such as Eqs. (59) and (60)? Although these approximations seem to give semiquantitative results when compared to the available experimental and simulation results, it is also recognized that hopping motions should be incorporated in order that the theory be able to give a realistic account of the liquid-to-glass transition.

VII. LATTICE GAS HYDRODYNAMICS

Fluctuations extend continuously from the molecular level to the hydrodynamic scale, but we have seen that there are experimental and theoretical limitations to the ranges where they can be probed and computed. Indeed, no theory provides a fully explicit analytical description of space–time dynamics establishing the bridge between kinetic theory and hydrodynamic theory, and scattering techniques have limited ranges of wavelengths over which fluctuation correlations can be probed. With numerical computational methods one can realize molecular dynamics simulations that, in principle, could cover the whole desired range, but in practice there are computation time and memory requirement limitations.

Lattice gas automata (LGA) are discrete models constructed as an extremely simplified version of a many-particle system where pointlike particles residing on a regular lattice move from node to node and undergo collisions when their trajectories meet at the same node. The remarkable fact is that, if the collisions occur according to some simple logical rules and if the lattice has the proper symmetry, this automaton shows global behavior very similar to that of real fluids. Furthermore, the lattice gas automaton exhibits two important features: (i) It usually resides on large lattices, and so possesses a large number of degrees of freedom; and (ii) its microscopic Boolean nature, combined with the (generally) stochastic rules that govern its microscopic dynamics, results in intrinsic fluctuations. Therefore, the lattice gas can be considered as a “reservoir of thermal excitations” in much the same way as an actual fluid, and so can be used as a “virtual laboratory” for the analysis of fluctuations, starting from a microscopic description.

A lattice gas automaton consists of a set of particles moving on a regular d-dimensional lattice $\mathcal{L}$ at discrete time steps, $t = n\Delta t$, with n an integer. The lattice is composed of V nodes labeled by the d-dimensional position vectors $\mathbf{r} \in \mathcal{L}$. Associated to each node there are b channels (labeled by indices i, j, . . . , running from 1 to b). At a given time t, channels are either empty [the occupation variable $n_i(\mathbf{r}, t) = 0$] or occupied by one particle [$n_i(\mathbf{r}, t) = 1$]. If channel i at node r is occupied, then there is a particle at the specified node r, with a velocity $\mathbf{c}_i$. The set of allowed velocities is such that the condition $\mathbf{r} + \mathbf{c}_i\Delta t \in \mathcal{L}$ is fulfilled. It may be required that the set $\{\mathbf{c}_i\}_{i=1}^{b}$ be invariant under a certain group of symmetry operations in order to ensure that the transformation properties of the tensorial objects that appear in the dynamical equations are the same as those in a continuum [such as the Navier–Stokes equation (11)]. The “exclusion principle” requirement that the maximum occupation be of one particle per channel allows for a representation of the automaton configuration in terms of a set of bits $\{n_i(\mathbf{r}, t);\ i = 1, \ldots, b;\ \mathbf{r} \in \mathcal{L}\}$. The evolution rules are thus simply logical operations over sets of bits, which can be implemented in an exact manner in a computer.

The time evolution of the automaton takes place in two stages: propagation and collision. We reserve the notation $n(\mathbf{r}, t) \equiv \{n_i(\mathbf{r}, t)\}_{i=1}^{b}$ for the precollisional configuration of node r at time t, and $n^*(\mathbf{r}, t) \equiv \{n_i^*(\mathbf{r}, t)\}_{i=1}^{b}$ for the configuration after collision. In the propagation step, particles are moved according to their velocity

\[ n_i(\mathbf{r} + \mathbf{c}_i\Delta t,\ t + \Delta t) = n_i^*(\mathbf{r}, t) \tag{68} \]

The (local) collision step is implemented by redistributing the particles occupying a given node r among the channels associated to that node, according to a given prescription, which can be stochastic. The collision step can be represented symbolically by

\[ n_i^*(\mathbf{r}, t) = \sum_{\sigma} \sigma_i\, \xi_{n(\mathbf{r},t)\to\sigma} \tag{69} \]

where $\xi_{n(\mathbf{r},t)\to\sigma}$ is a random variable equal to 1 if, starting from configuration n(r, t), configuration $\sigma \equiv \{\sigma_i\}_{i=1}^{b}$ is the outcome of the collision, and 0 otherwise. The physics of the problem is reflected in the choice of transition matrix $\xi_{s\to\sigma}$. Taking an average over the random variable (assuming homogeneity of the stochastic process in both space and time), and using Eq. (68), we obtain

\[ n_i(\mathbf{r} + \mathbf{c}_i,\ t + 1) = \sum_{\sigma, s} \sigma_i\, \langle\xi\rangle_{s\to\sigma}\, \delta[n(\mathbf{r}, t), s] \tag{70} \]

where automaton units (Δt = 1) are used. These microdynamic equations constitute the basis for the theoretical description of correlations in lattice gas automata.
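The collision and propagation stages can be sketched with Boolean arrays. The example below is a minimal illustration and not the model used in the article: it assumes the simplest four-channel square-lattice automaton (HPP type) with a deterministic head-on collision rule, which corresponds to a transition matrix ξ whose entries are 0 or 1 with certainty. All function names are hypothetical.

```python
import numpy as np

# Channel velocities c_i of the 4-channel square-lattice (HPP-type) model,
# an assumed example: east, north, west, south, in lattice units per step.
C = np.array([[1, 0], [0, 1], [-1, 0], [0, -1]])

def collide(n):
    """Deterministic collision: a head-on pair alone on a node turns by 90 degrees.

    n has shape (4, Lx, Ly) with Boolean entries n_i(r, t); returns n*_i(r, t).
    """
    e, no, w, s = n
    ew = e & w & ~no & ~s          # east-west pair, other channels empty
    ns = no & s & ~e & ~w          # north-south pair, other channels empty
    swap = ew | ns
    out = n.copy()
    out[0] = (e & ~swap) | ns
    out[2] = (w & ~swap) | ns
    out[1] = (no & ~swap) | ew
    out[3] = (s & ~swap) | ew
    return out

def propagate(n_star):
    """Eq. (68) with dt = 1: n_i(r + c_i, t + 1) = n*_i(r, t), periodic lattice."""
    return np.stack([np.roll(n_star[i], shift=tuple(C[i]), axis=(0, 1))
                     for i in range(len(C))])

def step(n):
    """One full time step: collision followed by propagation."""
    return propagate(collide(n))

def mass(n):
    return int(n.sum())

def momentum(n):
    return np.tensordot(n.sum(axis=(1, 2)), C, axes=1)  # sum_i N_i c_i

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = rng.random((4, 32, 32)) < 0.3      # random initial bits, density 0.3
    m0, p0 = mass(n), momentum(n)
    for _ in range(100):
        n = step(n)
    print(mass(n) == m0, np.array_equal(momentum(n), p0))
```

Because every operation acts on bits, mass and momentum are conserved exactly rather than to floating-point accuracy, which is the sense in which the evolution rules can be implemented in an exact manner.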

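Starting from Eq. (70), by performing an ensemble average over an arbitrary distribution of initial occupation numbers (denoted by angular brackets), one derives a hierarchy of coupled equations for the n-particle distribution functions, analogous to the BBGKY hierarchy in continuous systems. The first two equations in this hierarchy are

\[ f_i(\mathbf{r} + \mathbf{c}_i,\ t + 1) = \sum_{\sigma, s} \sigma_i\, \langle\xi\rangle_{s\to\sigma}\, \langle\delta(n(\mathbf{r}, t), s)\rangle, \tag{71} \]

\[ \begin{aligned} f^{(2)}_{ij}(\mathbf{r} + \mathbf{c}_i,\ \mathbf{r}' + \mathbf{c}_j,\ t + 1) ={}& (1 - \delta(\mathbf{r}, \mathbf{r}')) \sum_{\sigma, s, \sigma', s'} \sigma_i \sigma'_j\, \langle\xi\rangle_{s\to\sigma} \langle\xi\rangle_{s'\to\sigma'}\, \langle\delta(n(\mathbf{r}, t), s)\, \delta(n(\mathbf{r}', t), s')\rangle \\ &+ \delta(\mathbf{r}, \mathbf{r}') \sum_{s, \sigma} \sigma_i \sigma_j\, \langle\xi\rangle_{s\to\sigma}\, \langle\delta(n(\mathbf{r}, t), s)\rangle \end{aligned} \tag{72} \]

where

\[ f_i(\mathbf{r}, t) = \langle n_i(\mathbf{r}, t)\rangle \tag{73} \]

\[ f^{(2)}_{ij}(\mathbf{r}, \mathbf{r}', t) = \langle n_i(\mathbf{r}, t)\, n_j(\mathbf{r}', t)\rangle \tag{74} \]

are the one- and two-particle distribution functions, respectively. The fluctuations of the channel occupation number are $\delta n_i(\mathbf{r}, t) = n_i(\mathbf{r}, t) - f_i(\mathbf{r}, t)$, and the corresponding pair correlation function reads

\[ G_{ij}(\mathbf{r}, \mathbf{r}', t) = \langle\delta n_i(\mathbf{r}, t)\, \delta n_j(\mathbf{r}', t)\rangle \tag{75} \]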
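The definitions (73) and (75) can be illustrated by estimating $f_i$ and the on-node correlation $G_{ij}(\mathbf{r}, \mathbf{r}, t)$ from an ensemble of configurations. The sketch below assumes, purely for illustration, an ensemble of statistically independent channels with the same density f; the ensemble size, density, and function names are hypothetical. For such an uncorrelated state the diagonal of G should approach f(1 − f) and the off-diagonal elements should approach zero.

```python
import numpy as np

def one_particle_distribution(samples):
    """f_i = <n_i>: ensemble average over axis 0 of samples with shape (M, b)."""
    return samples.mean(axis=0)

def pair_correlation(samples):
    """G_ij = <dn_i dn_j> with dn_i = n_i - f_i, estimated from the ensemble."""
    dn = samples - one_particle_distribution(samples)
    return dn.T @ dn / samples.shape[0]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    f, b, m = 0.3, 6, 200_000                 # hypothetical density, channels, samples
    samples = (rng.random((m, b)) < f).astype(float)   # uncorrelated channels
    g = pair_correlation(samples)
    print(np.round(np.diag(g), 3))            # close to f * (1 - f)
    print(np.round(g[0, 1], 3))               # close to 0
```

Deviations of a measured G from this uncorrelated reference are what the ring kinetic description is designed to capture.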

Using a cluster expansion and neglecting three-point correlations, the hierarchy of equations can be approximately truncated to yield the generalized Boltzmann equation for the single-particle distribution function $f_i(\mathbf{r}, t)$,

\[ f_i(\mathbf{r} + \mathbf{c}_i,\ t + 1) - f_i(\mathbf{r}, t) = \Omega^{(1,0)}_i(\mathbf{r}, t) + \sum_{k<l} \Omega^{(1,2)}_{i,kl}(\mathbf{r}, t)\, G_{kl}(\mathbf{r}, \mathbf{r}, t) \tag{76} \]

and the ring kinetic equation for the equal-time pair correlation function

\[ G_{ij}(\mathbf{r} + \mathbf{c}_i,\ \mathbf{r}' + \mathbf{c}_j,\ t + 1) - \sum_{kl} W_{ij,kl}(\mathbf{r}, \mathbf{r}', t)\, G_{kl}(\mathbf{r}, \mathbf{r}', t) = B_{ij}(\mathbf{r}, t)\, \delta(\mathbf{r}, \mathbf{r}') \tag{77} \]

On the right-hand side of (76) the first term $\Omega^{(1,0)}_i$ represents the discrete nonlinear Boltzmann collision term, and the second term contains a set of correlated collision sequences (ring events). In the ring equation (77), $W_{ij,kl}(\mathbf{r}, \mathbf{r}', t)$ is a product of two homogeneous propagators,

\[ W_{ij,kl}(\mathbf{r}, \mathbf{r}', t) = \left(\delta_{ik} + \Omega^{(1,1)}_{i,k}(\mathbf{r}, t)\right)\left(\delta_{jl} + \Omega^{(1,1)}_{j,l}(\mathbf{r}', t)\right) \tag{78} \]

where $\Omega^{(1,1)}_{i,j} = \partial\Omega^{(1,0)}_i/\partial f_j$ is the linearized collision operator, and the on-node source term $B_{ij}(\mathbf{r}, t)$ is a function of $f_i(\mathbf{r}, t)$ and of $G_{ij}(\mathbf{r}, \mathbf{r}', t)$.

In order to have a full description of the dynamics of a fluid, temperature should be associated to the lattice gas, which is only possible if the model system possesses a velocity distribution. A minimal two-dimensional thermal LGA can be constructed in the following manner: Particles reside on the two-dimensional triangular lattice (with hexagonal symmetry), have unit mass, and undergo displacements with velocity moduli 1, √3, and 2 (in lattice unit length per time step), and so have energies 1/2, 3/2, and 2, respectively; particles at rest have zero energy. Speeds 1 and 2 correspond to displacements by one and two lattice unit lengths, respectively, in one time step along any of the six lattice directions, and speed √3 corresponds to displacements to the next nearest neighboring nodes along any of the six directions bisecting the lattice directions. This defines a lattice gas with the basic conservation laws (mass, momentum, and energy) and the correct symmetry.
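The velocity set just described can be written out explicitly. The following sketch builds it and checks the quoted moduli and energies; it assumes a single rest channel per node (the text does not say how many rest particles are allowed), and the function names are hypothetical.

```python
import math

def thermal_lga_velocities():
    """Velocity set described in the text: rest, speed 1, speed sqrt(3), speed 2."""
    unit = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
    rest = [(0.0, 0.0)]            # assumption: one rest channel
    speed1 = unit
    # Next-nearest neighbours: sum of two adjacent lattice directions (bisectors).
    speed_sqrt3 = [(unit[k][0] + unit[(k + 1) % 6][0],
                    unit[k][1] + unit[(k + 1) % 6][1]) for k in range(6)]
    speed2 = [(2 * cx, 2 * cy) for cx, cy in unit]
    return rest + speed1 + speed_sqrt3 + speed2

def kinetic_energy(c, mass=1.0):
    """Energy (1/2) m |c|^2 of a particle with velocity c."""
    return 0.5 * mass * (c[0] ** 2 + c[1] ** 2)

if __name__ == "__main__":
    velocities = thermal_lga_velocities()
    print(len(velocities), "channels")
    print(sorted({round(kinetic_energy(c), 9) for c in velocities}))
```

Under the single-rest-channel assumption this gives 19 channels per node with energy levels 0, 1/2, 3/2, and 2, and the velocities sum to zero, as required for an isotropic set.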

The existence of spontaneous thermal fluctuations in this LGA is convincingly evidenced by the analysis of the dynamic structure factor S(k, ω), which in the linearized Boltzmann approximation is given by

\[ S(\mathbf{k}, \omega) = 2\,\mathrm{Re} \sum_{ij} \left( \frac{1}{e^{i\omega + i\mathbf{k}\cdot\mathbf{c}} - 1 - \Omega} \right)_{ij} \kappa_j + \rho S(\mathbf{k}) \tag{79} \]

where Ω is the linearized Boltzmann collision operator, Re denotes the real part, ρ is the particle density per node, and $\kappa_j = f_j(1 - f_j)$, with $f_j$ the average particle density per channel. The expression for S(k, ω) is in general too complicated to be calculated analytically, but it can be used for direct numerical evaluation at all values of k. However, for small k values, that is, in the hydrodynamic regime where spatial and temporal variations are smooth, S(k, ω) can be computed explicitly by low-k expansion; the dynamic structure factor then takes the Landau–Placzek form as given by Eq. (21) (plus cross terms). A typical spectrum is given in Fig. 9. It shows that the spectral density of the lattice gas density fluctuations in the hydrodynamic domain exhibits the characteristic lineshape of the Rayleigh–Brillouin spectrum observed experimentally in real fluids.

FIGURE 9 The lattice gas dynamic structure factor in the hydrodynamic regime: S(k, ω) as a function of the frequency ω at high density and low k. Comparison between the simulation results (solid line) and the theoretical predictions: hydrodynamic spectrum (dashed curve), and Boltzmann theory (dotted curve).

A very interesting aspect of the lattice gas approach is that the eigenvalues $z_\mu(\mathbf{k})$ of the kinetic propagator

\[ e^{-i\mathbf{k}\cdot\mathbf{c}}\,(1 + \Omega)\,\psi_\mu(\mathbf{k}) = e^{z_\mu(\mathbf{k})}\,\psi_\mu(\mathbf{k}) \tag{80} \]

can be computed analytically for low k values, yielding the expressions for the transport coefficients, and numerically for any value of k. So here is a model system for which all modes can be computed as functions of k, delineating the domains of validity of hydrodynamics, generalized hydrodynamics, and the kinetic regime. For instance, it is found that the predictions of the lattice Boltzmann equation in which no small-k and/or small-ω approximations have been made are valid over quite a wide range of wavenumbers, in good agreement with simulation results.

Lattice gas automata, as described so far, evolve according to an iterated sequence of mass- and momentum-preserving local collisions followed by propagation. Nonlocal interactions can be incorporated in the LGA dynamics via long-distance momentum transfer simulating attraction and/or repulsion between particles, by modifying the orientation of the velocity vectors from a diverging configuration to a converging configuration to simulate attractive forces and vice versa for repulsive forces. While in local collisions momentum redistribution is a node-located process, in nonlocal interactions (NLI) momentum is exchanged between two particles residing on nodes separated by a (fixed or variable) distance r: mass is conserved locally, momentum is conserved globally. From the statistical mechanical viewpoint, LGAs with NLIs form an interesting class of models in that they include an elementary process that is essential for “nonideal” behavior. At the macroscopic level, the main feature exhibited by LGAs with NLIs is a “liquid–gas”-type phase separation with bubble and drop formation.

The dynamics of LGA virtual particles is not governed by Newton’s equation of motion, and the concepts of force and potential cannot be used in the sense of classical mechanics. Moreover, in real fluids, each particle is subjected a priori to the force field of all particles (whose effect is quantified by the potential of mean force), whereas in discrete lattice gases with NLIs, each particle interacts nonlocally with at most one other particle at a time. So, stricto sensu, the usual concept of intermolecular potential does not apply to lattice gases. In the LGA with NLIs, the idea of an interaction range is introduced by governing the interaction distance according to a probability distribution $p(r) \propto r^{-\mu}$. For sufficiently long times and large number of particles, the implementation of a probability distribution of interaction distances has a resulting effect similar to an effective interaction potential. In order to define a quantity that can be identified as an interaction potential in a discrete lattice gas, one can use the following heuristic argument: The rate of momentum exchange caused by the nonlocal interaction is given by $F(r) = \zeta\kappa_2^2\, q(r)$, with $\kappa_2 = f(1 - f)$, and where ζ is a numerical factor whose value corresponds to the average amount of momentum transfer. F(r) is then interpreted as a force, from which a “pair potential” can be defined as the discrete analog of the potential in continuum mechanics: $u(r) = -\gamma\,\mathcal{F}(r)$, where $\mathcal{F}(r)$ is the repartition (cumulative distribution) function corresponding to the distribution q(r), which is directly related to p(r). So u(r) is well defined once p(r) is fixed. For instance, with a power-law distribution $p(r) \propto r^{-\mu}$ such that the interactions are repulsive for r = 1 and attractive for $r = 2, \ldots, r_{\max}$, u(r) exhibits a form compatible with the expected typical pair interaction potential of simple fluids.

The static structure factor $\rho S(k) = \langle \sum_{i,j} \delta n_i^*(k, t)\, \delta n_j(k, t) \rangle$ is a constant in the ideal gas, and so it is in the ideal lattice gas: $S_0(k) = (1 - f)$, reflecting the absence of spatial density correlations [the factor (1 − f) arises because of the exclusion principle]. Now, for LGAs with NLIs, S(k) should be of the form $S(k)/S_0(k) = 1 + f h(k)$, where h(k) is the Fourier transform of the pair correlation function [g(r) − 1] and is therefore related to the potential of mean force φ(r) since g(r) = exp[−βφ(r)] (here β is an arbitrary constant). Thus, by measuring the density fluctuation correlations in lattice gas simulations, one can extract a function φ(r) from the measured static structure factor. Figure 10 shows that both the radial distribution function g(r) and the potential function φ(r) resemble those obtained in real fluid measurements.

At the macroscopic level, LGAs with NLIs can exhibit spinodal decomposition, and in the appropriate density range, one can “quench” the system by increasing the interaction range $r_{\max}$. Then S(k) measured at increasing values of $r_{\max}$ grows dramatically at low k. Using the expression for the compressibility χ of the lattice gas, one has

\[ S(k \to 0) = \rho\beta^{-1}\chi \simeq \frac{1 - f}{1 - \zeta\kappa_3\langle r\rangle_q}; \qquad \kappa_3 = f(1 - f)(1 - 2f) \tag{81} \]

where $\langle r\rangle_q$ is the expectation of r computed with the distribution q(r). Since $\langle r\rangle_q$ increases with $r_{\max}$, we see that S(k → 0) grows accordingly. The increase of S(k) at low k is characteristic of the amplification of long-range correlations near a phase transition.
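The growth of S(k → 0) with the interaction range can be illustrated numerically from the right-hand side of Eq. (81). The sketch below makes a strong simplifying assumption: since the exact relation between q(r) and p(r) is not given here, it uses the normalized power law p(r) itself in place of q(r). The parameter values (f, ζ, µ) are hypothetical and all function names are made up for this example.

```python
def power_law_distribution(r_max: int, mu: float) -> dict:
    """Normalized p(r) proportional to r**(-mu) for r = 1, ..., r_max."""
    weights = {r: r ** (-mu) for r in range(1, r_max + 1)}
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}

def mean_distance(dist: dict) -> float:
    """Expectation value of r under the given discrete distribution."""
    return sum(r * p for r, p in dist.items())

def s_k0(f: float, zeta: float, mean_r: float) -> float:
    """Right-hand side of Eq. (81): (1 - f) / (1 - zeta * kappa_3 * <r>)."""
    kappa3 = f * (1 - f) * (1 - 2 * f)
    denom = 1 - zeta * kappa3 * mean_r
    if denom <= 0:
        raise ValueError("denominator is not positive for these parameters")
    return (1 - f) / denom

if __name__ == "__main__":
    f, zeta, mu = 0.2, 1.0, 1.0        # hypothetical parameter values
    for r_max in (4, 6, 8, 10):
        mean_r = mean_distance(power_law_distribution(r_max, mu))
        print(r_max, round(mean_r, 3), round(s_k0(f, zeta, mean_r), 3))
```

For these illustrative values the mean interaction distance and S(k → 0) both increase monotonically with r_max, in line with the qualitative statement above; the denominator approaching zero corresponds to the divergence expected near the phase transition.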

The Boltzmann approximation neglects all mode-coupling effects. A consequence of mode coupling is that time correlation functions generally exhibit long-time behavior, usually in the form of algebraic decay, $\sim t^{-d/2}$, where d is the space dimension. The existence of these long-time tails implies that in dimensions less than or equal to 2, the hydrodynamic equations are valid only for regimes in which mode-coupling effects are negligible, and in dimensions 3 and higher, the form of the hydrodynamic equations remains valid, but the transport coefficients are renormalized.

In mode-coupling theory (see Section VI), one starts with the idea that the long-time behavior can be explained on the basis of hydrodynamic arguments. Consider the case of the velocity of a tagged particle; the mode that describes the decay of its velocity correlations, the shear mode, and the mode that describes particle displacements, the diffusion mode, are coupled. The assumption is that eventually the particle velocity will be equal to the fluid velocity, so that the velocity of the particle is expressed in terms of the particle probability density and of the fluid velocity fluctuations. The former obeys the diffusion equation and the latter the linearized Navier–Stokes equation. So the basic assumption combines the solutions of the two equations, and the result for the normalized velocity autocorrelation function ψ(t) reads

\[ \psi(t) \simeq \frac{d - 1}{d}\, \frac{1}{n}\, [4\pi(\nu + D_s)t]^{-d/2} \tag{82} \]

where n is the number density, and $D_s$ and ν denote the self-diffusion and the kinematic viscosity coefficients, respectively. The same result holds for lattice gases with $n = \rho/v_0$, the number density per elementary unit volume of the lattice (e.g., $v_0 = \sqrt{3}/2$ in the triangular lattice), and with an additional factor (1 − f) because of the exclusion principle.
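Eq. (82) is simple enough to evaluate directly. The sketch below is a plain transcription of the formula with the optional lattice gas factor (1 − f); the function name and the numerical values in the usage example are hypothetical.

```python
import math

def vacf_long_time_tail(t, d, n, nu, d_s, f=None):
    """Eq. (82): psi(t) ~ ((d - 1) / d) * (1 / n) * [4 pi (nu + D_s) t]**(-d / 2).

    If a channel density f is given, the extra lattice gas factor (1 - f)
    mentioned in the text is applied.
    """
    if t <= 0:
        raise ValueError("the asymptotic form is meaningful only for t > 0")
    psi = (d - 1) / d / n * (4 * math.pi * (nu + d_s) * t) ** (-d / 2)
    return psi * (1 - f) if f is not None else psi

if __name__ == "__main__":
    # Hypothetical transport coefficients in lattice units, triangular lattice.
    n = 2.0 / (math.sqrt(3) / 2)          # n = rho / v0 with rho = 2 (assumed)
    for t in (10, 100, 1000):
        print(t, vacf_long_time_tail(t, d=2, n=n, nu=0.5, d_s=0.3, f=0.3))
```

In two dimensions the tail decays as 1/t, so multiplying the time by 10 divides ψ by 10; in three dimensions the same change divides it by 10^{3/2}.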


FIGURE 10 (a) Radial distribution function g(r) and (b) potential function $\phi(r) \equiv -\ln g(r)$ of the lattice gas with nonlocal interactions governed by the probability distribution $p(r) \propto r^{-\mu}$ for $1 \le r \le r_{\max}$; $r_{\max} = 6$, µ = 0 (circles), $r_{\max} = 8$, µ = 0 (squares), $r_{\max} = 10$, µ = 1 (diamonds). Lines are guides to the eye.

In order to compute the velocity autocorrelation function of a particle in the lattice gas where all particles are indistinguishable, one must be able to follow the dynamics of a given particle. For this purpose, a very efficient procedure was developed so that high computational accuracy could be achieved for the detection of long-time tails, which requires high precision. Simulations were performed for 2D and 3D lattice gases, and the decay of the computed velocity autocorrelation function, $\sim t^{-d/2}$, was found to be in agreement with the mode-coupling prediction; extended mode-coupling theory improved these results to almost perfect agreement with the simulation results for the amplitude factor of the long-time tail. The evidence of the algebraic decay $\sim t^{-d/2}$ of the velocity autocorrelation function as demonstrated by lattice gas simulations constitutes by far the most accurate verification to date of mode-coupling predictions for hydrodynamic long-time tails: it is one of the convincing achievements of the lattice gas approach to statistical mechanics, showing that LGAs are well suited to serve as a testing ground for concepts in kinetic theory.

ACKNOWLEDGMENT

This work has been supported by the National Science Foundation under Grant CHE-8806767 and by the Fonds National de la Recherche Scientifique (FNRS, Belgium).

SEE ALSO THE FOLLOWING ARTICLES

FLUID DYNAMICS • HYDRODYNAMICS OF SEDIMENTARY BASINS • LIQUIDS, STRUCTURE AND DYNAMICS



Encyclopedia of Physical Science and Technology EN010I-467 July 16, 2001 15:28

Musical Acoustics
A. H. Benade
Case Western Reserve University

I. Introduction
II. The Plucked and Struck String Instruments
III. The Singing Voice
IV. The Wind Instruments
V. The Bowed String Instruments
VI. The Aptness of Instrumental Sounds in Rooms

GLOSSARY

Harmonic signal Signal whose sinusoidal components have frequencies that are (any) integer multiples $n f_0$ of some “fundamental” frequency $f_0$. The repetition rate of such a signal is $f_0$, whether or not the component $f_0$ is itself present.

Heterodyne components “Crossbred” components having frequencies $f_\nu = m f_a \pm n f_b \pm \cdots$ (m, n, . . . integers) that are present in the response of a nonlinear system when it is driven by the frequencies $f_a, f_b, \ldots$.

Impedance Ratio F/v of the oscillating excitory force F to the resulting velocity response v at some specified point in a system that is driven at the frequency f. The ratio governs all interactions of the system with whatever is connected to it at its driving point, and it has maxima (or minima) at the system’s modal frequencies.

Inharmonic signal Signal whose component frequencies are not integer multiples of a common basis $f_0$.

Linear system System whose net response to a superposition of stimuli is the sum of the responses to each taken separately, with its response spectrum including only the frequencies present in the stimulus spectrum.

Modal frequencies “Natural frequencies” characteristic of the vibrations of a complex system when it is struck and allowed to ring. The structure of the system determines these frequencies in a unique way (see mode of oscillation).

Mode of oscillation One of a set of distinct ways that a finite linear system moves when it is impulsively disturbed, with each member of the set having its own modal frequency, decay rate, and vibration shape.

Nonlinear system System whose response to a set of stimuli is not a simple superposition of the effects of the stimuli, and whose response spectrum therefore includes heterodyne frequencies in addition to those present in the original stimuli.

Precedence effect Ability of the auditory system to combine several signals arriving in close succession into a single detailed percept, based on all the information that has been collected.

Radiation Transmission of vibrations from a structure into the adjacent (approximately unbounded) region, as from a piano string into its soundboard, or from a trumpet into the concert hall.

Resonance Generic name for the selectively strong response of a modal system when it is driven by a sinusoidal force whose frequency is close to one of the system’s modal frequencies.

Room average The result of averaging a source’s signal spectra observed at many points in a room, which gives a useful measure of the sound spectrum of the source itself. Averaging is needed because the transmission of sound between two points in a room is a chaotic function of both position and frequency.

Spectrum A listing of the frequencies and amplitudes of the sinusoidal components making up any signal.

Vibration shape The characteristic distribution throughout a system of the motion of each mode of its vibration, with all parts of the system moving synchronously at the corresponding modal frequency, if this mode alone has been excited.
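The glossary definition of heterodyne components, $f_\nu = m f_a \pm n f_b$, can be illustrated by enumerating the combination frequencies for two driving tones. The sketch below is only an illustration: truncating the list at a maximum order |m| + |n| is an assumption made here to keep the list finite, and the function name and `max_order` parameter are hypothetical.

```python
def heterodyne_frequencies(f_a: float, f_b: float, max_order: int = 2) -> list:
    """Positive frequencies |m*f_a + n*f_b| for integer m, n with |m| + |n| <= max_order."""
    found = set()
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            if 0 < abs(m) + abs(n) <= max_order:
                freq = abs(m * f_a + n * f_b)
                if freq > 0:
                    found.add(round(freq, 9))
    return sorted(found)

if __name__ == "__main__":
    print(heterodyne_frequencies(400.0, 500.0))
```

For driving tones at 400 and 500 Hz the list up to second order contains the two stimuli, their second harmonics (800 and 1000 Hz), and the difference and sum frequencies (100 and 900 Hz). A linear system would respond only at 400 and 500 Hz.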

MUSICAL ACOUSTICS is the scientific study of the arts of performance and composition and the craft of instrument construction as these interact in the context of music. The scientific disciplines most directly involved are oscillation physics and perception psychology, though the muscular and neurological branches of physiology also make important contributions. The emphasis of the present article is on the dynamical properties of musical instruments as these are influenced by the environment in which they are played and by the needs of the listener’s auditory mechanisms.

I. INTRODUCTION

This article on musical acoustics opens with an outline of those salient properties of the human auditory system which permit it to function as a processor of musical sounds in the concert hall. The apparently chaotic properties of the sound-transmission process in halls will also be sketched, along with an indication of how the ear can collect the data it needs undistracted by the confusion. Even though the article concerns itself primarily with the physical behavior of musical instruments and musical sounds, it is important for the reader to begin with a good idea of the perceptual requirements that instruments must satisfy if they are to be musically successful. The physical properties of musical sounds and of the transmission path from source to listeners are complicated. Music is possible because its sounds are of a type whose significant features may readily be deduced from the received signals. For signals of the proper type, the auditory processor proves to be an extremely efficacious signal processor that performs so effortlessly that few people are even remotely aware of the complexity of its task.

The following assertions will serve to provide a framework for the perception parts of this initial outline.

1. The mechanical motions of a primary sound source give rise to an acoustical signal that is ultimately processed by the listener’s neurophysiological system.

2. Sounds from a source normally come to a listener’s ears via a set of multiply reflected transmission paths in a room.

3. The listener’s auditory system selects one or more subsets of the signal information coming to his ears, and he uses these in mutually supportive ways to recognize features of interest. The subsets that come via different transmission paths are not always equivalent.

4. The system normally has several modes for processing and recognizing any given signal feature. It is then able to resolve perceptual ambiguities, choosing those modes that give consistent results while “setting aside” the others.

5. Much of the recognition function of the auditory system is based on sorting the signal properties into categories associated with the characteristic behavior of generic types of musical sound sources. Because of this, even partial information is often sufficient to distinguish between instruments and for following individual instrumental voices in an ensemble.

6. The musically important recognition processes are all perceptually robust. That is, they are based on properties of the musical signal that not only survive the transmission path but also are insensitive to the presence of other musical signals or noise.

A. Structural Sketch of the Auditory Detection System

We will pass over the outer ear (a sound collector) and the middle ear (a coupling device and first overload protector) to focus our attention on the major functional part of the inner ear, the cochlea. The basilar membrane within the cochlea provides the preliminary frequency sorting device in the auditory signal path, as well as the transduction equipment that encodes mechanical vibrations into nerve impulses for further processing. The sorting function of the cochlea is very simple: if a sound made up of several sinusoidal components of widely separated frequencies enters the ear, each one produces its own localized disturbance at some point along the basilar membrane, with low frequencies at one end and high frequencies at the other. Thus, the firing of receptors at a particular position along the membrane indicates that a certain frequency is present in the original sound.

This sorting by place along the basilar membrane is not fine-grained. The perceived correlate (pitch) of the vibrational frequency of a sinusoid is associated with the place of maximum vibration of the basilar membrane, while the maximum implied by the response gradient in the “skirts” of the response region tends to define it with greater precision.

When two stimuli whose frequencies $f_a$ and $f_b$ differ by less than about 20% enter the ear, their disturbances overlap on the basilar membrane, and so stimulate the same set of receptors. The two stimuli produce cyclic alternations in the vigor of the local vibration as they run in and out of step with one another, with an alternation (mechanical beating) rate equal to the difference $f_a - f_b$ between the stimulus frequencies. On the other hand, when the acoustic stimulus frequencies are separated by much more than about 25%, two essentially unrelated and mechanically noninteracting sets of receptors transmit data about the two stimuli.
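The statement that the alternation rate equals $f_a - f_b$ follows from the trigonometric identity sin A + sin B = 2 sin[(A + B)/2] cos[(A − B)/2]. The sketch below checks this numerically for two equal-amplitude sinusoids; equal amplitudes and the function names are assumptions made for this illustration.

```python
import math

def two_tone(t: float, f_a: float, f_b: float) -> float:
    """Sum of two equal-amplitude sinusoids at frequencies f_a and f_b (Hz)."""
    return math.sin(2 * math.pi * f_a * t) + math.sin(2 * math.pi * f_b * t)

def envelope(t: float, f_a: float, f_b: float) -> float:
    """Slowly varying amplitude 2|cos(pi (f_a - f_b) t)| of the combined vibration."""
    return 2 * abs(math.cos(math.pi * (f_a - f_b) * t))

def beat_rate(f_a: float, f_b: float) -> float:
    """Number of amplitude maxima per second, |f_a - f_b|."""
    return abs(f_a - f_b)

if __name__ == "__main__":
    f_a, f_b = 444.0, 440.0
    print("beat rate:", beat_rate(f_a, f_b), "per second")
    for t in (0.0, 0.0625, 0.125, 0.25):
        print(t, round(envelope(t, f_a, f_b), 6))
```

The combined signal is a vibration at the mean frequency (f_a + f_b)/2 whose amplitude swells and fades |f_a − f_b| times per second, because the envelope is the absolute value of a cosine at half the difference frequency.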

A very large fraction of auditory processing theory is shaped by the existence of critical bands whose ultimate origin lies in the distributed response of the basilar membrane (although modified by a certain amount of near-neighbor interaction of the receptors themselves). Over most of the frequency range of musical interest, each critical band extends over a range of roughly 26% (1/3 octave).

Because of its importance in musical listening, we should examine a few examples of the role of the critical band phenomenon, beginning with the simplest. When the ear is presented with two closely spaced sinusoids, such that fa − fb ≤ 20 Hz (well within a critical band), the listener directly perceives the mechanical pulsations of the basilar membrane as a pulsation in loudness of a sinusoidal signal, whose pitch belongs with a frequency of about (fa + fb)/2. This is exactly in accordance with the expectations of a physicist. However, when fa − fb ≥ 20 Hz, the signal is not perceived as having a rapidly pulsating loudness, but as a rather rough sound instead. The roughness of this sound decreases rapidly, however, as the frequency difference is increased toward the extent of the critical band (e.g., to about 114 Hz for sinusoids lying near the note A4, whose repetition rate is 440 Hz).
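The three regimes described above (slow beating, roughness, and separately heard tones) can be sketched as a small classifier. This is only an illustration: the function name `two_tone_percept` is hypothetical, the 20 Hz limit and the 26% band extent are the rough figures quoted in the text, and the sharp thresholds are a simplification, since the real transitions are gradual.

```python
def two_tone_percept(fa, fb, beat_limit_hz=20.0, band_fraction=0.26):
    """Classify how two sinusoids (in Hz) are heard, using the text's rough limits."""
    diff = abs(fa - fb)            # mechanical beat rate, fa - fb
    mean = (fa + fb) / 2.0         # frequency with which the perceived pitch belongs
    band = band_fraction * mean    # assumed critical-band extent (about 1/3 octave)
    if diff <= beat_limit_hz:
        label = "beating"          # heard as a pulsation in loudness
    elif diff < band:
        label = "rough"            # inside the critical band, too fast to follow
    else:
        label = "separate"         # essentially non-interacting receptor sets
    return diff, mean, label


for pair in [(440.0, 444.0), (440.0, 500.0), (440.0, 660.0)]:
    print(pair, two_tone_percept(*pair))
```

Near A4 the assumed band extent comes out at about 0.26 × 440 ≈ 114 Hz, which is the figure quoted in the paragraph above.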

The determination of loudness is another auditory process that is dominated by the critical band phenomenon. The perceived loudness of a sound having all of its acoustical energy E concentrated in a single critical band varies very nearly as E^0.3 for stimuli within the musically important range of signal levels. This functional relationship holds unchanged whether all the signal energy is carried by a single sinusoid or by a group of closely spaced ones, or even when the critical band is filled by random noise. When, however, two or more widely separated critical bands are provided with stimuli E1, E2, . . . , the total loudness is perceived as the simple sum of the loudnesses contributed by each in its own right. For components of intermediate spacing, allowance must be made for the fact that the band edges are not sharply marked, so that they “shade off” from one to the next. In any event, the loudness will be greater when a given amount of power is apportioned among several critical bands than when it is concentrated within a single critical band.
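A short numerical sketch of this summation rule follows. It assumes the idealized case of widely separated critical bands, an exponent of exactly 0.3, and arbitrary energy units; the function name `total_loudness` is hypothetical.

```python
def total_loudness(band_energies, exponent=0.3):
    """Sum the per-band loudness E**0.3 over widely separated critical bands."""
    return sum(e ** exponent for e in band_energies)


concentrated = total_loudness([1.0])        # all energy in one critical band
spread = total_loudness([0.25] * 4)         # same total energy, four bands

print(f"one band:   {concentrated:.3f}")
print(f"four bands: {spread:.3f}")
print(f"ratio:      {spread / concentrated:.3f}  (equals 4**0.7)")
```

Splitting a fixed energy equally among N separated bands multiplies the loudness by N^0.7 under these assumptions, which is why the spread signal is the louder one.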

We now examine how the vigor of acoustical signals is neurologically encoded. The basilar membrane is richly supplied with receptors all along its length (some 1500 per critical band). If the local vibration at the position of one of these receptors is sufficiently strong (to first approximation), it fires once per cycle of the vibration, sending an electrical pulse to the higher centers of neurological processing. These receptors have a wide variety of threshold sensitivities, so that only a few can be fired by a weak local vibration, while many of them produce pulses when the vibration is strong. In the language of signal processing, we may say that the signal frequency is coded in part by its location along the cochlea and in part by the repetition rate of neural firing, whereas the signal strength is chiefly represented by the number of receptors that fire in each burst (or volley, as it is customarily called).

Once a receptor has fired, it becomes insensitive for a refractory period that lasts about 1 ms, after which its sensitivity rapidly returns to normal. For sinusoidal stimuli having a frequency of 2000 to 3000 Hz, a given receptor will fire (on the average) only on every second or third cycle of the stimulus. For a given strength of mechanical disturbance in the cochlea, then, the average number of receptors actually firing per cycle is less for high-frequency stimuli than for those occurring below about 1000 Hz.

This last statement may lead us to speculate that the ultimate perception of musical sounds may well be different for tones that have a majority of their sinusoidal components below 1000 Hz and those having a preponderance of high-frequency components. There are a number of familiar properties of musical sounds that support such speculations. For the moment, we need only to suggest the simplest of them all: the vast bulk of musical composition worldwide is written to encompass a pitch range from a few notes below C4 (whose repetition rate is 261 Hz) to a few notes above C6 (repetition rate 1044 Hz). Furthermore, we recognize the existence of many instruments and many musical parts that are pitched in regions of the musical scale that are very much lower than this “musical heartland,” while there are very few to be found in the higher-pitched musical regions. We shall return to these matters in Section VI of this article.

Communications engineers who make use of pulse coding for their signals will recognize that the threshold behavior of the auditory receptors and the subsequent and-gate and or-gate behavior of higher-level synapses joins with the “hit one, miss a few, and hit another” responses of the primary receptors to generate pulse trains of frequency f0 at many points in the nervous system network. This happens regardless of the details of the network's structure, provided that some of the receptors are stimulated by incoming sinusoids whose frequencies αf0, βf0, γf0 are integer multiples of the same fundamental repetition rate. For example, an ear provided with signals having the frequencies 600, 800, and 1000 Hz will (among other things) give rise to a widespread neurological pulse rate f0 = 200 Hz. On the other hand, sounds containing collections of inharmonic components (where α, β, γ, . . . are not integers) give rise to essentially chaotic pulse trains in the nervous system, although subsets of these tend to coalesce to give nearly periodic but fluctuating volleys if members of the subsets are approximately harmonic. We find that the existence and nature of many fundamental properties of musical structure are suggested (if not implied) by this global property that results from the mere existence of synaptic action. In particular, we may use it as a clear hint of why the ear responds to a sound made up of a collection of harmonic partials as a single clear perceptual entity, the musical tone, while an unrelated collection is perceived as a jumble of separately heard sounds.

The fact that each one of several simultaneously presented harmonic complexes is clearly heard as a tone in its own right is also strongly hinted at by the same properties of the pulse coding action. Obviously, integer relations between the repetition rates of these tones will themselves be expected to produce perceptually significant phenomena of the sort that underlie formal music theory.
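The shared repetition rate of a set of exactly harmonic components is simply their greatest common divisor. The sketch below is an arithmetic illustration, not a model of neural processing; it assumes integer frequencies in Hz, and the function name `common_repetition_rate` is hypothetical.

```python
from functools import reduce
from math import gcd


def common_repetition_rate(frequencies_hz):
    """Largest f0 (Hz) of which every integer frequency is a whole multiple."""
    return reduce(gcd, frequencies_hz)


harmonic = [600, 800, 1000]      # the example given in the text
inharmonic = [600, 817, 1000]    # one component detuned (illustrative values)

print(common_repetition_rate(harmonic))    # 200
print(common_repetition_rate(inharmonic))  # 1
```

For the detuned set the only common rate is 1 Hz, far below anything heard as a pitch, which mirrors the text's distinction between a single tone and a jumble of separately heard components.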

B. Sound Transmission in a Room

It has already been remarked that the signal path between the sound source and receiver in a room is highly variable, and even chaotic. Because experience shows that performers and listeners alike find it easier to carry on their musical activities in a room than in a reflection-free environment, we must first learn something of the physical nature of the acoustical transmission path and then outline a few of the neuropsychological methods used by the listeners to musically exploit information gained via this path.

Consider a prototype experiment in which an oscillator sinusoidally drives a loudspeaker at one point in a room. Let the oscillator frequency be slowly raised, while a microphone placed at another point in the room has its output signal traced out on the moving paper of a strip-chart recorder. Figure 1 shows the resulting record of the transmission of sound from the loudspeaker to each of three different microphone positions in a laboratory room. The first feature of these tracings to catch the eye is their extreme irregularity; we further note that the three traces are entirely different. If, as a matter of fact, we were to produce 30 or 40 traces of this type using various randomly chosen positions for the sound source and the detector, the traces would all be different. However, the mathematical average of the various microphone signal curves does have a well-defined meaning: it serves to inform the physicist of the strength of the loudspeaker signal itself. The sloping dashed line shown in the figure indicates such a room average, and shows that the loudspeaker in the present experiment produced a steadily increasing excitation of the room as its frequency was progressively raised.

FIGURE 1 Traces showing the extreme irregularity of sound transmission in a room.

Room-average spectra obtained from musical instruments are well-defined, and they are enormously informative about instrumental behavior, but only as long as the player is assigned a familiar performance task adequately specified in musical terms. Note also that, contrary to the belief of some recording engineers, combining all the microphone signals at the input of a single analyzer accomplishes nothing more than a reshuffling of the musical cards. The result has no more significance than does the analysis of a single microphone!

C. Auditory Processes in a Large Room

We have learned that the transmission of sounds in a room is (from the point of view of the physicist) complex and varies randomly from point to point. Let us therefore inquire about one of several methods of signal processing used by the auditory system that permits it to make “measurements” of the arriving sound at a rate that far exceeds the abilities of a scientist using his best equipment.

First consider the signals received by the listener’s ear during the commencement of a musical sound in a room. He is provided with a series of early reflections. The first to arrive is the direct sound from the instrument, and then in quick succession come the first reflections from the walls, the floor, and the ceiling of the room. If we were using our eyes in a mirrored room instead of our ears, we could say that these initial reflections provide us with front, left-hand, and right-hand views, plus information from above and below. In both the acoustical and the optical versions we find reflections of reflections, these becoming ever weaker and less well defined as the various complexities of the reflection and transmission process take their toll. Then we come to an important fact: A few early reflections present the listener/observer with nearly complete physical information about all aspects of the signal source.

In a mirrored room, we would have to shift our conscious attention from one available view to another, and then intellectually combine the gathered information into an overall picture of the observed subject. The musical listener, on the other hand, can perform his compilation of data in what is perceived to be zero time! The successful collection, storage, comparison, and interpretation of these early-arrival data is possible if the successive early reflections bring in their messages within a time interval of about 35 ms of one another. This implies that the reflecting surfaces must be sufficiently close to either the source or the listener that most of the sounds arriving from them have traveled no more than about 10 m farther than their predecessors. It is impossible to overemphasize the importance of this fact! We have here the essential clue as to how the auditory system carries on a major part of its work in the concert hall. There is one more feature of the signal processing behavior of the musical ear to which we must give our serious attention: reflections that have traveled distances of more than about 40 m farther than the direct sounds are actively disruptive of the musical recognition process (see Fig. 2). The general nature of this type of auditory processing has been recognized for over 150 years and empirically exploited for a century. Its basic manifestations were scientifically formalized in the 1950s under the name precedence effect.
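The distances and times quoted above are related through the speed of sound. The sketch below assumes a speed of 343 m/s (room-temperature air, a value not given in the text) and uses hypothetical function names. The "enhancing" label follows the Fig. 2 reading, in which arrivals within about 35 ms of the original sound help the listener; delays between the two limits are left unclassified because the text does not characterize them.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def extra_delay_ms(extra_path_m):
    """Delay (ms) of a reflection that travels extra_path_m farther than the direct sound."""
    return 1000.0 * extra_path_m / SPEED_OF_SOUND_M_S


def classify_reflection(extra_path_m):
    """Rough label based on the 35 ms and 40 m figures quoted in the text."""
    if extra_delay_ms(extra_path_m) <= 35.0:
        return "enhancing"
    if extra_path_m > 40.0:
        return "disruptive"
    return "intermediate"  # not characterized in the text


for path in (10.0, 20.0, 50.0):
    print(f"{path:5.1f} m -> {extra_delay_ms(path):6.1f} ms, {classify_reflection(path)}")
```

With this assumed speed, 35 ms corresponds to roughly 12 m of extra path, consistent with the "about 10 m" quoted above, and 40 m corresponds to a delay of roughly 117 ms.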

From the point of view of a physicist, the buildup of sound at some place in a room is produced by the superposition of the direct sound and a set of successive reflections that arrive with random phase, an ever-increasing arrival rate, and progressively weakening amplitudes. Clearly, the buildup of sound (and its analogous reverberant decay) is itself a violently fluctuating random process. Just as averaging many samples of steady-state sound in a room gives a useful room average p_avg, so also does the averaging of many buildups or decays lead to a measure (the only meaningful one) of the reverberation time T_r of the room.

FIGURE 2 Reflections arriving within about 35 ms of the original sound enhance the listener’s perception of musical sounds in many ways. Later arrivals degrade these perceptions.

Taking the two averages p_avg and T_r together, we may then say that the average onset behavior p_onset(t) of the room sound follows the rule

$$ p_{\mathrm{onset}}(t) = p_{\mathrm{avg}}\left(1 - e^{-6.91\,t/T_r}\right), \tag{1} $$

while the decay is given by

$$ p_{\mathrm{decay}}(t) = p_{\mathrm{avg}}\, e^{-6.91\,t/T_r}. \tag{2} $$

We must not forget, however, that it is only the earliest 50 or 60 ms of the onset that contribute to fine-grained musical detection, while the remainder provides little more than the “aroma” of earlier sounds.
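The constant 6.91 in Eqs. (1) and (2) is very nearly ln(1000), so that the decaying quantity falls by a factor of 1000 in one reverberation time. The sketch below evaluates both equations; it assumes that p is a pressure amplitude (so that a factor of 1000 corresponds to 60 dB), and the reverberation time of 2 s is an arbitrary illustrative value.

```python
import math


def p_onset(t, p_avg, t_r):
    """Average onset of room sound, Eq. (1)."""
    return p_avg * (1.0 - math.exp(-6.91 * t / t_r))


def p_decay(t, p_avg, t_r):
    """Average reverberant decay, Eq. (2)."""
    return p_avg * math.exp(-6.91 * t / t_r)


t_r = 2.0  # reverberation time in seconds (illustrative value)
level_after_t_r_db = 20.0 * math.log10(p_decay(t_r, 1.0, t_r))
print(f"level after one reverberation time: {level_after_t_r_db:.1f} dB")

# Fraction of the final average level reached in the first 50 ms of onset.
print(f"onset after 50 ms: {p_onset(0.050, 1.0, t_r):.3f} of p_avg")
```

Under these assumptions the onset curve has reached only a small fraction of its final value during the first 50 ms, which suggests that the musically decisive early interval occupies just the beginning of the buildup.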

When summarizing the foregoing discussion of the functioning of the human auditory system in the concert hall, we begin by emphasizing that while the ear can collect data over several tens of milliseconds, these early reflections are fused into a single percept. The time of occurrence of the percept turns out to be the instant at which the earliest contribution arrives, which is normally the arrival time of the direct sound from the instrument. Also, it is perceived as coming from the point in the room from which the instrument itself transmits the first-arriving signal, and its loudness is perceived as being accumulated from the entire sequence of early arrivals. The individual arrivals are used together to provide a mutually confirmatory basis upon which we can assess such musical features as tone color, stability of production, and the type of articulation chosen by the performer for the note under consideration. It is quite correct to say that the ear is able to deduce the room-average sound by a suitable processing of the room-caused fluctuations in its onset (or decay).

II. THE PLUCKED AND STRUCK STRING INSTRUMENTS

A. The Guitar

The guitar will be used as a prototype instrument to trace the major relationships that adapt the physical structure of a musical instrument to the auditory requirements of the musical ear. As indicated in Fig. 3, the generic guitar may be described as consisting of a set of six strings supported on a hollow, thin-walled body and a fairly rigid neck. When a string is plucked, its vibrating length extends from an anchorage on the bridge (attached to the top plate of the body) to one of the frets on the neck (as selected by the player). The instrument is of course normally played in a room or concert hall. The physicist recognizes from this listing that he is dealing with a coupled sequence of one-, two-, and three-dimensional vibratory systems. Taken in aggregate, such a sequence has an enormous number of modes to be excited at one point (on a string) by a player and observed (in the room) by the listener. The characteristic impedance of the string subsystem is much smaller than that of the wooden structure to which it is attached, while the characteristic impedance of the body is enormously greater than that of the air-filled room. Because of these differences, the three subsystems interact weakly, and their modal properties may usefully be analyzed one by one.

FIGURE 3 Diagram of a guitar showing the names of its major parts.

The string itself is a one-dimensional system. Plucking it excites a set of modal oscillations whose frequencies are arranged in an almost precise whole-number relation as

$$ f_n = \frac{n}{2L}\left(\frac{T}{\mu}\right)^{1/2}. \tag{3} $$

Here L is the string length, T its tension, and µ its mass per unit length. However, the exactness of this harmonic relationship is slightly disrupted by a small correction arising from the barlike stiffness of the string and by irregular perturbations associated with its coupling to the guitar body.
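Equation (3) can be evaluated directly for an ideal, perfectly flexible string. In the sketch below the length, tension, and mass per unit length are illustrative values only (not taken from the text), the function name is hypothetical, and the stiffness and body-coupling corrections just mentioned are ignored.

```python
import math


def string_mode_frequencies(length_m, tension_n, mass_per_length, n_modes=6):
    """Mode frequencies f_n = (n / 2L) * sqrt(T / mu) of an ideal string, Eq. (3)."""
    wave_speed = math.sqrt(tension_n / mass_per_length)  # transverse wave speed, m/s
    return [n * wave_speed / (2.0 * length_m) for n in range(1, n_modes + 1)]


# Illustrative values: 0.65 m string, 70 N tension, 0.4 g/m.
modes = string_mode_frequencies(0.65, 70.0, 4.0e-4)
for n, f in enumerate(modes, start=1):
    print(f"mode {n}: {f:7.1f} Hz  (ratio to first mode: {f / modes[0]:.1f})")
```

The printed ratios are whole numbers, which is the harmonic relationship the following paragraphs rely on.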

Because the modal frequencies of a guitar string enjoy a harmonic relationship, the instrument transmits to the listener sounds having frequency patterns that his neurophysiological processor responds to with particular vividness, and which govern the formal structure of music.

The vibrations of a plucked string drive the guitar body mainly by way of an oscillatory force exerted on the bridge, although there is a musically significant excitation applied by the other end of the string and transmitted to the body via the neck. Once corrections have been made for modal behavior associated with the guitar’s barlike neck and the air contained within the instrument’s cavity, the mean spacing of the body modes is essentially constant (100 Hz) over the entire frequency range. Mathematical analysis shows that this behavior is to be expected for a platelike object, even when segments of various thicknesses are joined into a boxlike structure.

When the guitar body is excited by the abruptly begun vibrations of a plucked string, its motion can be classified into two separate but musically significant categories. First, the body vibrates as a driven system at the frequencies of the excitatory string modes. The vigor of each component motion depends on the point of application of the driving force, and its decay in amplitude is governed by the decay rate of the corresponding string vibration. The sound components produced in the room by these vibrations exert a major influence on the pitch, loudness, and tone color of the guitar’s perceived signal. The second aspect of the guitar sound is associated with the body, in which a transient vibration is set up whenever a string is plucked. The frequency components and the decay rates of this sound are those characteristic of the body modes themselves. While there is little energy associated with this part of the sound arriving at the ear, it contributes significantly to the hollow, woody tone color and mild initial thump that is characteristic of all guitar notes.

The ability of the guitar body to convert its string vibration excitation into audible sound in the room depends significantly on the ratio of the wave speed in the body to that of sound in air, on the number and kind of discontinuities existing in the body structure, and on the frequencies and strengths of the modal resonances of the body. However, for an elaborately intertwined set of reasons that will be elucidated little by little throughout the course of this article, the excitatory force applied by the string to the guitar bridge is converted into sound in the room with an average efficacy that is almost independent of frequency. To be sure, there are peaks and dips in the radiation processes associated with the guitar body parameters mentioned previously, but (for reasons that will become clearer over the remaining course of this article) these may vary widely from instrument to instrument without destroying the recognizability of its sound as being guitarlike. Only one or two of these peaks of radiative efficacy play a major role in subjective judgments of guitar quality, and these lie in the region of the lowest frequency components of the guitar’s lowest notes.

The physics of plucked string motion is such that the amplitudes of the various modes excited by a given type of plucking depend very strongly on the position of the plectrum along the string and on its breadth and hardness. Temporarily setting aside the modifications arising from the nature of the plectrum and the stiffness of the string, the force Fn exerted on the bridge by the nth vibrational mode of a string is

$$ F_n = \frac{2F_0}{\pi n}\,\sin\!\left(\frac{n\pi x_e}{L}\right). \tag{4} $$

Here L is the vibrating length of the string, and xe is the distance away from the bridge that an excitatory force F0 is applied by means of a very narrow plectrum. The presence of the mode number n in the denominator shows that the high-frequency modes exert progressively less driving force on the bridge (and so produce less sound in the room) than do the lower modes.

When a guitar is played, the plucking point position xe tends to be roughly constant, while the player chooses any one of a wide variety of string lengths L by his use of the frets. Equation (3) may be used to reexpress Fn as the function Fa(f) that gives the driving force exerted by any string oscillation having the frequency f, whether it be associated with a large-n mode of a long string, or a small-n mode of a short one:

$$ F_a(f) \equiv K \cdot E_a(f) = K\left[\frac{\sin(\pi f/f_a)}{\pi f/f_a}\right]. \tag{5} $$

Here K = 2F0xe/L is a proportionality constant, and fa is the first-mode frequency of the string when it is shortened to make its length equal to the bridge-to-plectrum distance xe.
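The step from Eq. (4) to Eq. (5) can be checked numerically: with f = n·f1 and fa = f1·L/xe (both of which follow from Eq. (3)), the two expressions should agree mode by mode. The sketch below is such a check under the ideal-string assumption; the numerical values and function names are illustrative, not taken from the text.

```python
import math


def bridge_force_mode(n, f0_force, xe, length):
    """Bridge force of the nth mode, Eq. (4)."""
    return (2.0 * f0_force / (math.pi * n)) * math.sin(n * math.pi * xe / length)


def envelope_a(f, fa):
    """Spectrum envelope E_a(f) = sin(pi f / fa) / (pi f / fa), Eq. (5)."""
    x = math.pi * f / fa
    return 1.0 if x == 0.0 else math.sin(x) / x


# Illustrative values: unit plucking force, 0.65 m string plucked 0.10 m from the bridge.
f0_force, length, xe, f1 = 1.0, 0.65, 0.10, 110.0
fa = f1 * length / xe            # first-mode frequency of a string of length xe
k = 2.0 * f0_force * xe / length  # proportionality constant K

for n in range(1, 9):
    direct = bridge_force_mode(n, f0_force, xe, length)
    via_envelope = k * envelope_a(n * f1, fa)
    print(f"n={n}: Eq.(4) {direct:+.6f}   Eq.(5) {via_envelope:+.6f}")
```

The two columns coincide to rounding error, which supports reading K as 2F0xe/L.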

We may usefully characterize the function Ea(f) as the spectrum envelope of the bridge driving force produced by a simple plucked string. When a particular string length Lp is chosen by the player, the resulting sound is made up of sinusoids having frequencies that are integer multiples of the first-mode frequency fp. The resulting force spectrum amplitudes at the bridge are then proportional to Ea(f) evaluated at the frequencies nfp.

The left-hand curve in Fig. 4 shows the general shape of the Ea(f) function. There are “spectral notches” in Ea due to the zeros of the sine function (i.e., frequencies of zero sound production), while the whole curve has its behavior dominated by a solid trend line, which shows that Ea is essentially constant for frequencies below the breakpoint value α = fa/π and falls with an asymptotic rate 1/f for f ≫ α.

A guitar is not normally excited by a narrow plectrum. The width of the actual plectrum (or fingernail, or finger tip) joins with the inherent stiffness of the string to produce a rounding-off of the string profile near the plucking point, instead of the abruptly angled profile that was assumed for the calculation of Eqs. (4) and (5). This rounding-off of the string profile produces a systematic modification Eb(f) of the guitar’s sound spectrum envelope as

FIGURE 4 Ea(f), Eb(f), and Ec(f) combine to produce the spectrum envelope of guitar tones. Ea is associated with the point of plucking, Eb with plectrum width, and Ec with string stiffness effects.

$$ F(f) = K\,E_a(f)\cdot E_b(f), \tag{6} $$

where

$$ E_b(f) = \frac{\sin(\pi f/f_b)}{(\pi f/f_b)\left[1 + (f/f_b)^2\right]}. \tag{7} $$

Here fb is the first-mode frequency of a string whose length wb is equal to the width of the curved region near the plucking point. The general nature of Eb(f) is shown by the middle curve in Fig. 4. Again, the notches associated with the sine function are shown dotted, leaving a solid line to show the main trend of this contribution to the spectrum envelope. Note that Eb is essentially flat below a breakpoint frequency β = fb/π determined by wb, while the amplitude falls at an asymptotic rate of 1/f^3 for frequencies well above β.

If the second breakpoint frequency β is much higher than the first, the overall spectrum envelope starts out independent of frequency at low frequencies, rolls over and begins to fall as 1/f for f > α, and then falls at the rapid rate of (1/f)(1/f^3) = 1/f^4 above f = β, where the joint effects of Ea and Eb are active. If α and β are not so very different, there is a transition region of intermediate slope between the level behavior at low frequencies and the 1/f^4 high-frequency fall-off. Figure 5 shows the results of this behavior, as found in the practical world of guitar playing. Here the bridge force is shown as a function of frequency for a string that is plucked at a fairly normal point 1/8 of the way along it by a plectrum whose width gives a local string curvature extending over 1/16 of the “open” (maximum) string length. For reasons that will become clearer as we go along, a line is drawn on this graph to suggest that the high-frequency behavior of the spectrum is well approximated by a 1/f^3 falloff rate, with a breakpoint located at about five times the open string first-mode frequency. The spectral notches are shown explicitly only for the Ea(f) aspect of the behavior.

FIGURE 5 Overall spectrum envelope of the guitar bridge driving force. The trend is from constancy at low frequencies to a high-frequency rolloff proportional to 1/f^3.
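The breakpoints for the plucking geometry just described can be worked out from the definitions α = fa/π and β = fb/π. The sketch below normalizes the open-string first-mode frequency to 1 and assumes, per Eq. (3), that a string segment 1/8 (or 1/16) as long has a first-mode frequency 8 (or 16) times higher. Function and variable names are hypothetical.

```python
import math


def sinc_term(f, f_ref):
    """sin(pi f / f_ref) / (pi f / f_ref), with the limiting value 1 at f = 0."""
    x = math.pi * f / f_ref
    return 1.0 if x == 0.0 else math.sin(x) / x


def envelope_a(f, fa):
    """Plucking-point envelope, Eq. (5)."""
    return sinc_term(f, fa)


def envelope_b(f, fb):
    """Plectrum-width envelope, Eq. (7)."""
    return sinc_term(f, fb) / (1.0 + (f / fb) ** 2)


f1 = 1.0        # open-string first-mode frequency (normalized)
fa = 8.0 * f1   # plucking point 1/8 of the way along the string
fb = 16.0 * f1  # curved region extending over 1/16 of the string
alpha, beta = fa / math.pi, fb / math.pi
print(f"alpha = {alpha:.2f} f1, beta = {beta:.2f} f1")

for n in (1, 2, 4, 6, 12, 20, 30):
    amplitude = abs(envelope_a(n * f1, fa) * envelope_b(n * f1, fb))
    level_db = 20.0 * math.log10(max(amplitude, 1e-12))  # floor avoids log of zero
    print(f"harmonic {n:2d}: {level_db:7.1f} dB")
```

Under these assumptions β comes out near 5.1 times the open-string first-mode frequency, which appears consistent with the breakpoint "at about five times" that frequency mentioned for Fig. 5.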

As remarked earlier, when a player is performing music on a guitar, he tends to pluck the strings at a roughly constant distance from the bridge. The bridge driving-force spectrum for all the notes played on any one string will then share a single spectral envelope of the form shown in Fig. 5, including the notches. While the spectrum envelopes for the adjacent strings (tuned to different pitches) are alike in form, the notches and breakpoints are displaced bodily to higher or lower frequencies by amounts depending on the tunings of these other strings. Taken as a group, however, the force exerted on the bridge by all the strings has an overall envelope that is frequency independent at low frequencies and varies roughly as 1/f^3 at high frequencies. The transition region between those two behaviors is blurred somewhat, due to the differences between breakpoint frequencies belonging to the individual strings.

While the gross envelope for the bridge driving-force spectrum is in fact made up of six distinct parts (one for each string), the mechanism for conversion of this excitation into the room-average sound is very much the same for all strings. It is all mediated by the same set of body resonances, and these can produce fluctuations in the radiated sound above and below the essentially frequency-independent trend line of the overall radiation process for sounds emitted by a platelike object. Thus the observed room-average spectrum is found to have fluctuations above and below an overall envelope whose shape is very similar to that of the curve in Fig. 5. It is easy to estimate the expected number of fluctuations over any frequency span of interest. This is equal to the number of body modes found in this span, augmented by the corresponding number of notches in the drive-force spectrum.

It is appropriate here to inject an additional piece of information about the human auditory processor. A music listener is (in a laboratory situation) readily able to detect the strengthening or weakening of one or more sinusoidal components of a harmonic collection. However, in the processing of music or speech he does not “pay attention” very much to the presence of holes or notches in the spectra of the sounds that he processes as a means for recognizing them or assessing their tone color. More precisely, failure to provide notches will be noticed and criticized in an attempted sound synthesis, but (except for certain special cases) their mere presence and their mean spacing are of more significance than their exact positions.

B. The Harpsichord and Piano

The harpsichord and piano are acoustically similar to the guitar in that they have a set of vibrating strings that only gradually communicate their energy to a two-dimensional platelike structure (the soundboard), which in turn passes on the vibration in the form of audible sound to the room. Once again, for musico-neurological reasons, it is important that the primary string mode oscillations take place at frequencies that are in whole-number relation to one another. The major difference that distinguishes these instruments from the guitar is the fact that they lack frets, so that each string is used at only a single vibrating length; also, the place and manner of plucking or striking is chosen by the instrument’s maker rather than by its player.

The strings of a harpsichord are excited by a set of plectra (jacks) operated from a keyboard, so that in many respects the dynamical behavior of a harpsichord is identical with that of the guitar. As a result, the main features of the spectral envelope of harpsichord sounds in a room are the same as those of a guitar. The curve shown in Fig. 5 and the accompanying discussion apply equally well to the harpsichord, the chief difference being in a slightly different distribution of spectral notches associated with the excitation mechanism, and a greatly decreased mean spacing (20 Hz) of the radiation irregularities that are associated with the plate modes of the soundboard. The (as yet undiscussed) air resonances of the cavity under the harpsichord’s soundboard play a very much smaller role in determining the overall tone than is the case for the guitar.

The mechanical structure of the piano is quite analogous to that of the harpsichord, but the use of hammers rather than plectra to excite its strings causes several modifications to the overall envelope function. While it is commonly believed that the striking point should have an effect on the envelope similar to that given by Ea(f) for the plucked string, in fact the corresponding function is much less dependent on frequency, with only small dips appearing at the frequencies of the “notches” of Ea(f). However, the width and softness of the hammer join with the string’s stiffness to give rise to an envelope function Eb(f) exactly as given for the plucked strings. There is, however, one more dynamical influence on the spectral envelope of a struck string: When the hammer strikes the string, it bounces off again after a time that is jointly determined by the hammer mass and elasticity, the string tension, the position of the striking point along the string, and the length of the string. The details of the dependency of the hammer contact time on these parameters are complicated, and it will suffice for us to present only its main spectral consequences. It gives rise to an envelope function Ec(f) that is satisfactorily represented by a simplified formula as

$$ E_c(f) = \frac{\cos\!\left(\pi f/4f_c\right)}{1 + (f/f_c)^2}. \tag{8} $$

Here fc is very nearly equal to the reciprocal of the hammer contact time. The nature of this function is shown in the right-hand part of Fig. 4. The envelope function has a familiar form, being essentially constant at low frequencies, having a number of deep notches, and ultimately falling away as 1/f^2 at high frequencies with a breakpoint γ for the main trend at f = fc. The analog to Eq. (6) for the piano is then

$$ F(f) = K \cdot E_b(f)\cdot E_c(f). \tag{9} $$

At very high frequencies (f ≫ α, β, and γ), the drive-force spectrum falls away as (1/f^3)(1/f^2) = 1/f^5. As before, the behavior at somewhat lower frequencies can have an apparent fall rate represented by some intermediate exponent that depends on the maker’s choice of β and γ.

In the piano, the distance of the hammer’s striking point from the string end varies smoothly (sometimes in two or more segments) from about 10 or 12% of the string length at the bass end of the scale to about 8% at the treble end. Similarly, the mass, width, and softness of the hammers fall progressively in going up the scale from the bass end. Taken together, these four varying parameters of piano design provide the maker with his chief means for achieving what he calls a “good” tone for the instrument simultaneously with uniformity of loudness and of keyboard “feel.”

The hammer mass has a direct influence on the feel of the keys. It also plays a major role in determining the loudness of the note via its effect on the kinetic energy that it converts into vibrational energy of the string. The hammer mass (as well as its softness to some extent) joins with the striking point and string parameters to control the contact time during a hammer blow. At C2 near the bottom of the scale, the design is such that the hammer mass is about 1/30 of the total string mass, and the hammer’s contact time is around 4 ms ($\gamma \approx 250$ Hz); at the midscale C4, the string and hammer masses are about equal, and the contact time is about 1.5 ms; at C7, near the top of the scale, it has fallen to 1 ms ($\gamma \approx 1000$ Hz). In a related manner, the string stiffness joins with hammer softness to determine the string-curvature envelope $E_b(f)$ and its corresponding breakpoint frequency $\beta$.
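The breakpoint values quoted above follow from taking $f_c$ as the reciprocal of the hammer contact time. A minimal sketch of that arithmetic is given here; the function name is hypothetical, and the figure printed for C4 is computed from the quoted 1.5 ms contact time rather than stated in the text.

```python
def breakpoint_frequency_hz(contact_time_ms):
    """Breakpoint taken as the reciprocal of the hammer contact time."""
    return 1000.0 / contact_time_ms


# Contact times quoted in the text for three notes of the piano scale.
contact_times_ms = {"C2": 4.0, "C4": 1.5, "C7": 1.0}

for note, t_ms in contact_times_ms.items():
    print(f"{note}: contact time {t_ms} ms, "
          f"breakpoint about {breakpoint_frequency_hz(t_ms):.0f} Hz")
```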

Figure 6 shows the remarkable uniformity in the trend line for the measured room-average spectra (defined in Section I) of notes taken from the musically dominant midrange portion of a grand piano’s scale. These notes, running scale-wise from G3 up to G5 (having repetition rates 192 to 768 Hz), will be recognized as lying in the region in which the auditory processor is particularly quick, precise, and confident. As a result, any regularities shown by the spectra of these notes have strong implications about the manner in which the ear deals with such notes. In the figure, the dots located along the zero-decibel line represent the normalized amplitudes of the first-mode frequency components of all 14 played notes. The remaining dots then give the relative amplitudes of the remaining higher harmonic components (expressed in decibels relative to the “fundamental” components). About half of the notes shown here were played and measured several times, over a period of five years, using a variety of analysis techniques. The variability due to all causes (irregularity of striking the keys, statistical fluctuations of the room measurement, wear of the piano, and differences due to altered analysis technique) may be shown to be about ±2 dB for the position of any one dot on the curve. For this reason it is possible to attribute the observed scattering of the points about their basic trend line almost wholly to the excitatory spectrum notches and to the radiation effects of soundboard resonances in the body of the piano. The magnitude of these fluctuations is consistent with estimates based on the resonance properties of a soundboard.

FIGURE 6 Measured room-average spectrum envelope of piano tones. Above about 800 Hz the components weaken as $1/f^3$.

Figure 6 illustrates a spectral property that is shared by nearly all of the familiar midrange musical instruments. Here, as in the guitar and harpsichord, we find fluctuations about an essentially constant low-frequency average trend, plus a rolloff with a $1/f^3$ dependency at high frequencies. Dividing the two spectral regions, there is also a well-defined break point that lies close to 780 Hz for the piano.

To recapitulate, the proportioning of the piano’s string length and strike point and its hammer’s breadth, mass, and softness causes the critical frequency parameters $f_b$ and $f_c$ to vary widely for strings over the midrange playing scale. Nevertheless, the maker has arranged to distribute them in such a way as to preserve the absolute spectral envelope over a wide range of playing notes. Since it is clearly not an accident that the piano and harpsichord have developed with proportions of the type implied previously, we are led to inquire as to what perceptual constraints are imposed on the design by the needs of the listener’s musical ear. The answers to this inquiry (implied in large measure by the auditory properties outlined in Section I) will be made more explicit in the remaining course of this article.

C. Radiation Behavior of Platelike Vibrators

A number of significant musical properties of the guitar, harpsichord, and piano have been elucidated by an examination of the ways in which their platelike body or soundboard structures communicate the vibrations imposed on them by the string to the surrounding air in the room (violins of course also communicate this way). It has already been asserted that the trend of radiating ability of such structures driven by oscillating forces is essentially independent of frequency except at low frequencies. Because a number of musical complexities are associated with this apparently simple trend, it is worthwhile to devote some space to a brief outline of the radiation physics that is involved.

To begin with, consider a thin plate of limitless extent, driven at some point by a sinusoidal driving force of fixed magnitude $F_0$ and variable frequency $f$. Analysis shows that the vibrational velocity produced at the driving point of such a plate is proportional to the magnitude of the driving force, but independent of its frequency. For definiteness, let the plate be of spruce about 3 mm thick (as is the case for the guitar and violin top plate, or the soundboard of a harpsichord). Also, we temporarily limit the driving frequency to values that lie below about 3000 Hz.

Despite the fact that the entire surface of the plate is set into vibration by the excitatory force, only a small patch near the driving point is actually able to emit sound into the air! The radius $r_{\mathrm{rad}}$ of this radiatively effective patch is about 16 cm at 100 Hz, and it varies inversely as the square root of the frequency. Thus, the area of the active patch varies as $1/f$. Since the radiation ability of a small vibrating piston is proportional to its area, velocity amplitude, and vibrational frequency, the sound emitted by the board not only comes from a tightly localized spot at the point of excitation, but also the amount radiated is entirely independent of frequency.
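The cancellation described above can be checked numerically. The following sketch assumes only the scaling stated in the text (a patch radius of 16 cm at 100 Hz varying as $1/\sqrt{f}$, with the driving-point velocity held fixed); the function and constant names are illustrative.

```python
import math

R_REF_CM = 16.0   # patch radius quoted for a 3 mm spruce plate at 100 Hz
F_REF_HZ = 100.0  # reference frequency for that radius


def patch_radius_cm(f_hz):
    """Radius of the radiatively effective patch, varying as 1/sqrt(f)."""
    return R_REF_CM * math.sqrt(F_REF_HZ / f_hz)


def relative_radiation(f_hz):
    """Piston radiation (area x velocity x frequency) relative to 100 Hz.

    The velocity amplitude is independent of frequency for the limitless
    plate, so it cancels out of the ratio.
    """
    area_ratio = (patch_radius_cm(f_hz) / R_REF_CM) ** 2
    return area_ratio * (f_hz / F_REF_HZ)


for f in (100.0, 400.0, 1600.0, 2500.0):
    print(f"{f:6.0f} Hz: radius {patch_radius_cm(f):5.2f} cm, "
          f"relative radiation {relative_radiation(f):.3f}")
```

The shrinking area ($\propto 1/f$) is exactly offset by the factor of $f$ in the piston's radiating ability, which is why the relative radiation stays at unity across the band.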

When the size of the plate is restricted by any kind of boundary (free, hinged, or clamped), additional radiation becomes possible from a striplike region extending a distance $r_{\mathrm{rad}}$ (defined previously) inward from these boundaries. Any sort of rigid blocking applied at some point, or hole cut in the plate, also gives rise to a radiatively active region of width $r_{\mathrm{rad}}$ around the discontinuity, and the system retains its essentially frequency-independent radiating behavior. The fact that the system is now of finite extent means that it has a large number of vibrational modes (whose mean spacing is set mainly by the thickness and total plate area of the structure). The system’s net radiated power then fluctuates symmetrically above and below the large-plate trend line in a manner controlled by the size, width, and damping of the modal response peaks and dips. Curiously enough, the general level of radiation is hardly influenced by the plate damping produced by its own internal friction.

Above a certain coincidence frequency $f_{\mathrm{coinc}}$ (the previously mentioned 3000 Hz for a spruce plate 3 mm thick), the entire vibrating plate abruptly becomes able to radiate into air. For a limitless plate, the radiating power becomes enormous just above $f_{\mathrm{coinc}}$, and it then drops off to a new frequency-independent value that is considerably greater than that found below $f_{\mathrm{coinc}}$. However, for a finite-sized system broken up into many parts (as in the musical structures), there is no readily detectable alteration in the overall radiating ability as the drive frequency traverses $f_{\mathrm{coinc}}$, although many details of the directional distribution of sound are drastically changed.

The coincidence frequency is inversely proportional to the plate thickness, and the radius $r_{\mathrm{rad}}$ of the radiatively active regions is proportional to the square root of the thickness; this means that for a piano (whose plate thickness is about triple that of a harpsichord), $f_{\mathrm{coinc}}$ falls to about 1000 Hz and $r_{\mathrm{rad}}$ is about 27 cm at 100 Hz.
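The piano figures can be reproduced from the two scaling rules just stated, starting from the 3 mm spruce reference values (3000 Hz and 16 cm at 100 Hz). The sketch below is only that arithmetic; the function name and default arguments are assumptions for illustration.

```python
import math


def scale_plate(thickness_ratio, f_coinc_ref_hz=3000.0, r_rad_ref_cm=16.0):
    """Scale the reference plate values to a plate `thickness_ratio` times thicker.

    Coincidence frequency varies as 1/thickness; the radius of the
    radiatively active region (at 100 Hz) varies as sqrt(thickness).
    """
    f_coinc_hz = f_coinc_ref_hz / thickness_ratio
    r_rad_cm = r_rad_ref_cm * math.sqrt(thickness_ratio)
    return f_coinc_hz, r_rad_cm


f_coinc, r_rad = scale_plate(3.0)  # piano soundboard, about triple thickness
print(f"f_coinc about {f_coinc:.0f} Hz, r_rad about {r_rad:.1f} cm at 100 Hz")
```

The computed radius comes out a little under 28 cm, in line with the rounded value of about 27 cm quoted above.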

The practical implications of the radiation properties briefly discussed here are numerous. To begin with, it should be clear that the boxlike (and therefore irregular) structure of the guitar and violin joins with the sound holes and miscellaneous internal bracing to greatly increase the sound output from what would otherwise be very soft-voiced instruments (as are the lutes and viols of simpler construction that were developed earlier).

By the beginning of the seventeenth century, the harpsichord soundboard had already acquired numerous heavy struts along with structural discontinuities provided by the bridges needed to serve two complete sets of strings (the so-called 4- and 8-ft arrays) and a hitch-pin rail between the bridges to bear the tension of the 4-ft strings. These strong and heavy discontinuities play an important role in providing the free, clear sound that is characteristic of a really fine harpsichord. The instruments by the French builder Pascal-Joseph Taskin (1723–1793), which are noted for the fullness of their tone, are provided with an unusually rigid set of bracings. It is important to notice that despite the completely counterintuitive nature of the lumpy and discontinuous structures that favor sound production, the best makers nevertheless discovered and adhered to designs whose underlying acoustical virtues have only been elucidated in the late twentieth century.

A quick look at the modern piano shows a similar adaptation of the vibrating structure to its radiating task, although we might today ask (by analogy) whether or not a few properly placed extra braces might improve the sound somewhat.

Some of the difficulty often faced by sound engineers when making recordings of the piano is readily understood in terms of the ever-shifting patchwork of sources that are active. The tendency of many engineers to place their microphones close to the piano means that rapidly changing, and often perceptually conflicting, signals are registered in the two recording channels. The ear is so accustomed to assembling the sounds from all over the soundboard, via the mediation of a room, that anything else can confuse it. As a matter of fact, the ear is so insistent on receiving piano sounds from a random patchwork of shifting sources that successful electronic syntheses of piano music can be done using the simplest of waveform sources as long as signals are randomly distributed to an array of six to ten small loudspeakers placed on a flat board!

D. Piano Onset and Decay Phenomena

When a hammer strikes a piano string, the string is subjected to an impulsive blow that contains many frequency components. This blow, which is transmitted to the bridge in the form of a continuously distributed drive-force spectrum whose shape is exactly the same as the spectrum envelope of the discrete string vibration drive forces, is heard in the room as a distinct thump. Those parts of each measured room-average spectrum that lie between the strongly represented string-vibration components clearly show the envelope of the thump part of the net sound. As a matter of fact, the similarity between the shapes of the thump envelopes of various notes, and of each one of these to the overall envelope displayed in Fig. 6, serves as a good confirmation of our picture of the piano sound-generation process.

Despite the fact that the piano sound is produced by an impulsive excitation taking place in only a very few milliseconds, the buildup of radiated sound in its neighborhood takes place over a period of time that is 10 to 50 times longer. We will begin our search for an explanation for this slow buildup by outlining the vibrational energy budget of a piano tone. Setting aside temporarily the energy associated with the initial thump, we can recognize that each string mode is abruptly supplied with its share of energy at the time of the hammer blow. Over the succeeding seconds, some of this energy is dissipated unproductively as heat within the body of the wire and at its anchorages. There is also a flow of vibrational energy into the soundboard, to set up its vibrations. Overall, the board vibrations build up under the stimulus of the string until the board’s own energy loss rate to internal friction and to radiation into the room (and to some extent back into the string) is equal to the input rate from the string.

Globally speaking then, we would expect the radiated sound to rise in amplitude for a while (as the soundboard comes into equilibrium with the string vibrations) and then to decay gradually as the string gives up its energy. The initial part of a curve showing this behavior calculated for a typical pianolike string and soundboard is shown by the solid line in Fig. 7. This sort of calculation is able to give a good account of the main feature of the onset times (35 to 50 ms). However, the measured behavior for a piano shows considerably more complexity, for the following reasons:

1. The initial thump is instantly transmitted to the soundboard, and the resulting wave travels across it and suffers numerous reflections.

2. During the initial epoch, while both the thump and the main tonal components are spreading across the soundboard, and making their first few reflections, all frequency components are able to radiate fairly efficiently. As a result, components in the sound output have a significant representation at very early times. This behavior is schematized by the irregular line in Fig. 7.

3. The cross-influences of the three piano strings and bridge belonging to a typical piano note make the strings’ own decay quite irregular. This irregularity is then reflected in the long-term decay of the tone.

All this is an example of the statistical fluctuation behavior that was outlined for rooms in Section I of this article.

FIGURE 7 Smooth beaded curve: Trend of initial buildup of soundboard vibrational energy after excitation of a string. Irregular curve: Schematic representation of the actual buildup.
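The smooth rise-then-decay trend of the energy budget described in this section can be imitated by a very crude two-reservoir model. This is an assumed sketch, not the calculation behind Fig. 7: the string energy is taken to decay exponentially at a rate `string_rate`, a fixed share of that outflow feeds the soundboard, and the board loses energy at a rate `board_rate`. Both rate values below are hypothetical, chosen only so that the peak lands in the 35 to 50 ms range quoted above.

```python
import math


def board_energy(t, string_rate, board_rate, coupling=1.0, e0=1.0):
    """Soundboard energy for dEb/dt = coupling*Es - board_rate*Eb,
    with string energy Es = e0*exp(-string_rate*t) and Eb(0) = 0."""
    return coupling * e0 / (board_rate - string_rate) * (
        math.exp(-string_rate * t) - math.exp(-board_rate * t)
    )


def time_of_peak(string_rate, board_rate):
    """Time at which the board energy is greatest (set dEb/dt = 0)."""
    return math.log(board_rate / string_rate) / (board_rate - string_rate)


STRING_RATE = 1.0    # 1/s, hypothetical slow decay of the string
BOARD_RATE = 100.0   # 1/s, hypothetical fast loss rate of the soundboard

t_peak = time_of_peak(STRING_RATE, BOARD_RATE)
print(f"board energy peaks after about {1000.0 * t_peak:.1f} ms")
for t_ms in (5, 20, 50, 200, 1000):
    e = board_energy(t_ms / 1000.0, STRING_RATE, BOARD_RATE)
    print(f"t = {t_ms:5d} ms   relative board energy = {e:.5f}")
```

The model reproduces only the smooth trend; the irregular early behavior listed in items 1 to 3 lies outside its scope.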


III. THE SINGING VOICE

The instruments discussed so far have belonged to a class of musical sound generators in which the primary source of acoustical energy (the vibrating string) is abruptly set in motion and allowed to die away over the next few seconds. There is another, much wider class of musical instruments (including the human voice, the woodwinds and brasses, and the violin family) in which the oscillations of the primary vibrator are able to sustain themselves over relatively long periods of time, drawing their energy from a nonoscillatory source such as the musician’s wind supply or the steady motion of his bow arm.

The tone production system of the singing voice provides an excellent introduction to this class of continuous-tone instruments for two reasons. First, discussion is simplified by the fact that the primary exciter (the singer’s larynx) maintains its own oscillations in a manner that is quasi-independent of the vocal tract air passages that it excites. Second, a further expository simplification comes about because the frequency of its oscillations is controlled by a set of muscles that are distinct from those that determine the shape of the upper airway.

Fundamentally, the larynx acts as a self-operating flow control valve that admits puffs of compressed air into the vocal tract in a regular sequence whose repetition rate $f_0$ determines the pitch of the note being sung. Since this flow signal is of a strictly periodic nature, the frequencies of its constituent sinusoids are exact integer multiples $nf_0$ of the laryngeal oscillation rate, and they therefore produce a sound of the type that is very well suited to the auditory processes of musical listening.

Aside from its ability to generate continuous rather than decaying sounds, the singer’s voice-production mechanism, with its signal path from primary sound source (larynx) to concert hall via the vocal passages, is quite analogous in its physical behavior to that which leads the sound from vibrating string to the room by way of a soundboard or guitar body. Because the behavior of the sound transmission path through the vocal tract is far more important for present purposes than is the spectral description of the excitatory pulse train, we temporarily limit ourselves to the simple remark that the source component (of frequency $nf_0$) is related to the lowest component (of frequency $f_0$) by a factor $(1/n^2)A(n)$, where $A(n)$ has a relatively constant trend line plus a few irregularly spaced notches or quasi-notches. The perceptual significance of these notches is relatively small, as in the case of the stringed instrument spectra. In short, the spectrum envelope of any voiced sound in the room includes a factor $1/f^2$ due to the source spectrum as a major contributor to its overall shape.

The vocal tract air column extending from the laryngeal source to the singer’s open mouth may be analyzed as a nonuniform but essentially one-dimensional waveguide, whose detailed shape can be modified by actions of the throat, jaw, tongue, and lip muscles. One end of this duct is bounded by the high acoustical impedance presented by the larynx, and the other by the low impedance of the singer’s open mouth aperture. Acoustical theory shows that such a bounded, one-dimensional medium has its natural frequencies spaced in a roughly uniform manner. Furthermore, the 15-cm length of this region implies that this mean spacing is about 1000 Hz, so that we expect no more than three or four such resonance frequencies in the region below 4000 Hz that contains the musically significant part of the voice spectrum. The signal transmission path from larynx to the listening room has a transfer function $T_{lr}(f)$ that is the product of three factors. One is a term $T_1(f)$ falling smoothly as $1/f^{1/2}$, associated with acoustical energy losses that take place at the walls of the vocal tract. Another factor, $T_2(f)$, has to do with the efficacy of sound emission from the mouth aperture into the room. This rises smoothly with a magnitude proportional to the signal frequency $f$. The third factor, $T_3(f)$, fluctuates above and below a constant trend line, and it depends on the shape given to the vocal tract passage by its controlling muscles.

The peaks in $T_3(f)$ lie at frequencies that correspond to the normal-mode frequencies of the vocal tract if it is imagined to be closed off at the larynx and open at the mouth. The dips in the transmission function lie, on the other hand, at the modal frequencies of the vocal tract considered as an air column that is open at both ends. Both the peaks and dips have widths of about 50 Hz in the frequency range below 1500 Hz, rising to about 200 Hz at 4000 Hz. These peaks and dips tend to rise or fall above and below the trend line by about ±10 dB (i.e., factors of about $3^{\pm 1}$). The nature of the overall vocal tract transfer function $T_{lr}(f)$ between larynx and the room is summarized as

$$T_{lr}(f) = (1/f)^{1/2} \times (f) \times (\text{peaks and dips}) = (f)^{1/2} \times (\text{peaks and dips}). \tag{10}$$
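For a rough feel for where the peaks and dips of $T_3(f)$ fall, the sketch below idealizes the vocal tract as a uniform tube 15 cm long. The uniform bore and the sound speed of 350 m/s are assumptions introduced here; a real tract is nonuniform, which is precisely what shifts the peaks from vowel to vowel. The last helper converts the ±10 dB excursions to amplitude factors.

```python
SPEED_OF_SOUND = 350.0  # m/s, assumed value for warm air in the vocal tract
TRACT_LENGTH = 0.15     # m, the length quoted in the text


def closed_open_modes(limit_hz):
    """Quarter-wave resonances (2k-1)c/4L: idealized transmission peaks."""
    base = SPEED_OF_SOUND / (4.0 * TRACT_LENGTH)
    modes, k = [], 1
    while (2 * k - 1) * base < limit_hz:
        modes.append((2 * k - 1) * base)
        k += 1
    return modes


def open_open_modes(limit_hz):
    """Half-wave resonances k*c/2L: idealized transmission dips."""
    base = SPEED_OF_SOUND / (2.0 * TRACT_LENGTH)
    modes, k = [], 1
    while k * base < limit_hz:
        modes.append(k * base)
        k += 1
    return modes


def db_to_amplitude_ratio(db):
    """Convert a level difference in decibels to an amplitude ratio."""
    return 10.0 ** (db / 20.0)


print("peaks (Hz):", [round(f) for f in closed_open_modes(4000.0)])
print("dips  (Hz):", [round(f) for f in open_open_modes(4000.0)])
print("+10 dB is a factor of", round(db_to_amplitude_ratio(10.0), 2))
```

Under these assumptions the mean spacing of the peaks is $c/2L$, a little above 1000 Hz, and three peaks fall below 4000 Hz, consistent with the estimate of three or four resonances given earlier.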

Because the vocal source spectrum has a relatively featureless $1/f^2$ behavior, it is convenient to display graphically the product $(1/f^2)\,T_{lr}(f)$, representing the sound normally measurable via the room-averaging procedure. Figure 8 presents such curves computed for three configurations of the vocal tract.

The pattern of peaks and dips in the $T_{lr}(f)$ function is of major perceptual significance: Each vowel or other voice sound is associated with a particular vocal tract configuration, and so a particular $T_{lr}(f)$. Speech then consists of a rapidly changing set of spectral envelope patterns, which are recognized by the listener in a manner that depends almost not at all on the nature of the laryngeal source spectrum. Thus, a singer who produces the vowel aah at a pitch of A2 ($f_0 = 110$ Hz) supplies his listeners with a generated sound whose harmonic components can be evaluated from the curve for aah (see Fig. 8) at the discrete frequencies 110, 220, 330, . . . Hz, as indicated by the small circles on the curve. Similarly, a singer producing the vowel ooh at G4 ($f_0 = 392$ Hz) emits a sound whose spectrum has component amplitudes that are related in the manner indicated by the small x’s.

FIGURE 8 Schematic representation of spectrum envelope curves for three sung vowels.
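The way a sung note samples the vowel envelope only at multiples of $f_0$ can be made concrete by listing the harmonics that fall within the musically significant region below 4000 Hz. The sketch below uses the two pitches from the text; the function name and the choice of 4000 Hz as a default limit are illustrative assumptions.

```python
def harmonic_frequencies(f0_hz, limit_hz=4000.0):
    """Frequencies n*f0 (n = 1, 2, ...) that do not exceed limit_hz."""
    n_max = int(limit_hz // f0_hz)
    return [n * f0_hz for n in range(1, n_max + 1)]


for note, f0 in (("A2", 110.0), ("G4", 392.0)):
    harmonics = harmonic_frequencies(f0)
    print(f"{note} (f0 = {f0:.0f} Hz): {len(harmonics)} harmonics "
          f"up to {harmonics[-1]:.0f} Hz; first three {harmonics[:3]}")
```

The low note samples the envelope at 36 closely spaced points, whereas the high note offers only 10, so the envelope of a high-pitched vowel is much more sparsely defined.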

The vowel pattern recognition abilities of the human listener are highly developed. Whispered speech is perfectly comprehensible, even though the source signal consists of a densely distributed random collection of sinusoids (white noise) rather than the discrete and harmonic collection of voiced speech. Furthermore, a radio announcer is completely intelligible whether the receiver tone controls are set to “treble boost, bass cut” (nearly equivalent to multiplying $T_{lr}(f)$ by $f$) or to “treble cut, bass boost” (which is nearly the same as multiplying the spectrum by $1/f$). The recognition process requires, as a matter of fact, only the existence of properly located peaks relative to the local trend of the spectrum, while the positions and depths of the dips are essentially irrelevant, to the point where many electronic speech synthesizers omit them entirely. Thus, all that is really necessary is to specify the frequencies of the lowest three or four transmission peaks for each sound. These are denoted by $F_1$, $F_2$, $F_3$, and $F_4$ and are called the formant frequencies (and are about 17% higher for women than for men).

So far, no clear distinction needs to be made between speech and song beyond the need in music for precisely defined pitch (and thence values of $f_0$). The musician has, however, three special resources that have little significance in speech. First of all, the source spectrum shape, which is approximated by $A_n = A_1/n^2$ for the “mildest” and most speechlike tone color, may be modified to the form

$$A_n = A_1\left[\frac{1 + (1/\gamma)^2}{1 + (n/\gamma)^2}\right]. \tag{11}$$

The components for which $n$ is less than $\gamma$ have then essentially the same amplitude as $A_1$, while the $1/n^2$ fall-off is postponed to the higher ($n \gg \gamma$) components. In the extreme case, $\gamma$ can be as large as 3.
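A short numerical look at Eq. (11) shows how the parameter $\gamma$ postpones the fall-off. In this sketch (function name and the sample values of $\gamma$ are illustrative assumptions), a very small $\gamma$ recovers the speechlike $A_1/n^2$ spectrum, while $\gamma = 3$ keeps the first few components close to $A_1$.

```python
def source_amplitude(n, gamma, a1=1.0):
    """A_n of Eq. (11): a1 * (1 + (1/gamma)**2) / (1 + (n/gamma)**2)."""
    return a1 * (1.0 + (1.0 / gamma) ** 2) / (1.0 + (n / gamma) ** 2)


for gamma in (0.01, 1.0, 3.0):
    row = ", ".join(f"{source_amplitude(n, gamma):.3f}" for n in range(1, 9))
    print(f"gamma = {gamma:4}: A_1..A_8 = {row}")
```

By construction $A_n = A_1$ at $n = 1$ for every $\gamma$, so the formula changes only the relative strength of the higher components.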

The second resource of the singer is the use of what is known as formant tuning. Consider a soprano who is asked to sing the vowel aah at the pitch D5 at the top of the treble staff. This means that her sound consists of sinusoids having frequencies close to $nf_0 = 587, 1175, 1762, \ldots$ Hz. Her normally spoken “aah” would have a second-formant frequency $F_2$ close to 1260 Hz, but she may choose to alter her vocal tract shape (and thus modify the vowel sound) somewhat, in order to place $F_2$ exactly on top of the 1175-Hz second harmonic of the sung note. This sort of tuning is effective only for notes sung at a pitch high enough that a formant frequency can be adjusted to one of the first three voice harmonics, which assures that the rapid $1/n^{3/2}$ fall-off in amplitude has not significantly reduced the prominence of the tuned component in the net sound. The most obvious benefit to be gained from formant tuning is almost trivial: The net loudness of the note is increased to an extent that can be useful if the singer is struggling for audibility against an overpowering accompanist. Subtler, and more significant musically, is a sort of glow and fullness that is imparted to the tone of a formant-tuned note. The perceptual reasons for this are not entirely clear, but the fame of many fine sopranos is enhanced by their skillful use of the technique.
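The soprano example can be worked through as follows. The sketch finds which of the first three harmonics lies nearest a given formant and how far the formant would have to move; the function name is hypothetical, and the value 587.33 Hz for D5 assumes equal temperament with A4 at 440 Hz.

```python
def nearest_harmonic(f0_hz, formant_hz, max_harmonic=3):
    """Return (n, n*f0) for the harmonic among the first max_harmonic
    that lies closest to the given formant frequency."""
    n = min(range(1, max_harmonic + 1),
            key=lambda k: abs(k * f0_hz - formant_hz))
    return n, n * f0_hz


D5_HZ = 587.33        # assumed equal-tempered D5 (A4 = 440 Hz)
SPOKEN_F2_HZ = 1260.0  # second formant of the spoken "aah", from the text

n, target = nearest_harmonic(D5_HZ, SPOKEN_F2_HZ)
print(f"nearest harmonic: n = {n} at about {target:.0f} Hz")
print(f"required formant shift: about {SPOKEN_F2_HZ - target:.0f} Hz downward")
```

A shift of well under 100 Hz is small compared with the formant frequency itself, which is why the vowel is modified only "somewhat" by the tuning.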

The third spectral modification that is available to singers (especially for tenors, and for the highest notes of other males) is a rather curious one: A systematic modification of the vocal tract region near the larynx and/or a manner of vowel production that makes the second and third formant frequencies almost coincident can give rise to an extremely strong transmission peak in the neighborhood of 3000 Hz. This peak is referred to as the singer’s formant, regardless of its mode of production. The presence of such a formant considerably increases the net sound output power of the voice, a fact that joins with certain features of the ear’s perception mechanism to produce a large increase in the loudness of all tones sung in this way. It also produces what is usually referred to as tonal brilliance and penetrating character.

When used flexibly and tastefully by true artists, all three of these vocal resources greatly enhance the beauty and expressiveness of the musical line. For them to have such effects, they must be used subtly, with close attention to the meaning of the music and the words. As a group, lesser singers do not make use of formant tuning except to increase their loudness. Among this same group of less-than-satisfactory performers, the other two forms of vocal production are used incessantly, in part to call attention to themselves by mere loudness, and in part as evidence that they have what is called a “trained voice.” It is a curious fact that if any other musical instrument (or a loudspeaker) had a strong and invariable peak around 3000 Hz in its spectral envelope, it would be subject to instant and bitter criticism.

IV. THE WIND INSTRUMENTS

The first and most important feature of the family of wind instruments (from the point of view of physics and perception psychology) is the fact that its tones are self-sustaining. The duration of these tones is limited only by the desire of the player and the sufficiency of his air supply. As hinted already in connection with the oscillations in the singer’s larynx, self-sustaining oscillators of necessity give rise to sounds made up of exactly harmonic components.

The second distinguishing feature of the wind instruments is that the air column whose natural frequencies control the frequency and wave shape of the primary oscillation is also the device that transmits the resulting sounds to the listening room. It is no longer possible (as with the stringed instruments and with the voice) to describe a vibration source that is essentially independent of the transmission mechanisms that convert its output into the sounds that we hear.

A. The Structure of a Wind Instrument

Figure 9 will serve to introduce the essential features of a musical wind instrument as it is seen by a physicist.

FIGURE 9 Basic structure of a wind instrument: Air supply, flow controller, air column, and dynamical coupling between the latter two.

To begin with, the player is responsible for providing a supply of compressed air to the instrument’s reed system. This reed system functions as a flow controller that admits puffs of air into an adjustable air column belonging to the instrument itself. The system oscillates because the flow controller is actuated by an acoustical signal generated within the upper end of the air column; this signal is in fact the air column’s response to the excitatory flow injected via the reed.

The structural features that serve to distinguish between the two major families of wind instruments may be summarized as follows:

1. A woodwind is recognized by the fact that the length of its air column is adjusted by means of a sequence of toneholes that are opened or closed in various combinations to determine the desired notes. The oboe, clarinet, saxophone, and flute are all members of this family.

2. A brass instrument is distinguished by the fact that its air column continues uninterrupted from mouthpiece to bell, the necessary length adjustments being provided either via segments of additional tubing that are added into the bore by means of valves, as in the trumpet, or by means of a sliding extension of the sort found on the trombone.

The sound production process in all wind instruments involves the action of an air flow controller under the influence of acoustical disturbances produced within the air column. This provides another description of the various kinds of wind instrument in terms of the flow controllers that are found on the different instruments.

1. The cane reed is found on the clarinet, oboe, bassoon, and saxophone. It is not (for present purposes) necessary to further distinguish between single and double reeds; they both share the dynamical property that the valve action is such as to decrease the air flow through them when the pressure is increased within the player’s mouth.

2. The lip reed normally used on brass instruments and the cornetto is the second major type of flow controller. Here the valve action is such that the transmitted flow is increased by an increment of the pressure in the player’s mouth.

3. Flutes, recorders, and most organ pipes are kept in oscillation through the action of a third type of controller that may aptly be described as an air reed. Here we find an air jet whose path is deflected into and out of the air column through the action of the velocity of the air as it oscillates up and down the length of the governing air column.

It should be emphasized that while the nature of the flow controller itself is very important, it does not usefully distinguish the instrumental wind families from each other in the essential features of their oscillatory behavior.

B. The Oscillation Process: Time-Domain Version

There are two main ways of describing the oscillation processes of a self-sustained instrument. The time-domain description deals with the temporal growth and evolution of a small initial impulse that leaves the flow controller and then is reflected back and forth between the two terminations of the air column: the tone holes and/or bell at its lower-end termination and the flow-controlling reed system at its upper end.

When the initial impulse is reflected from the lower termination, it suffers a change of form and an enfeeblement of its vigor (as does each of its successors). The change in form arises because of the acoustical complexity of the termination, and the loss in amplitude occurs because some of the incident wave energy has been lost in the journey and some transmitted into the outside air. At the reed end, another very different form of terminating complexity produces a change in the reflected shape of each returning impulse that travels up to it from the bottom end of the instrument. The size of this regenerated pulse also increases because it receives energy supplied by the incoming compressed air from the player’s lungs.

As the tonal start-up process evolves toward the condition of steady oscillation, the wave shape stabilizes into one in which each reflection at the lower end of the air column is modified in such a way as to “undo” the modification that takes place at the reed.

C. The Frequency-Domain Description of Wind Instrument Oscillation

The time-domain description of the oscillation process is readily susceptible to mathematical analysis and permits detailed calculation of the sound spectra produced by a given reed and air column. However, it is ill-adapted to the task of showing general relations between the mechanical structure of an instrument and its playing behavior, nor will it guide its maker in adjusting it for improved tone and response.

Fortunately, a second way of picturing the oscillatory system, the frequency-domain version, can readily deal with such questions and is well suited for our present descriptive purposes. In the frequency-domain analysis, we start by relating the proportions of an instrument to the natural frequencies and dampings of the various vibrational modes of the controlling air column. For present purposes, it suffices to describe the flow controller merely by reiterating that the increment of flow produced by an increment of control signal is not in proportion to it. In particular, a sinusoidal control signal of frequency $f = P$ gives rise to a pulsating flow that may be analyzed into constituent sinusoids having a harmonic set of frequencies $P, 2P, 3P, \ldots$. Additional components appear when the excitatory signal is itself the superposition of several sinusoids. If these have the frequencies $P, Q, R, S, \ldots$, the resulting flow signal will contain an elaborate collection of components having frequencies that can be described by

f = |αP ± βQ ± γ R ± δS|. (12)

Here α, β, γ, and δ are integers that can take on any values between zero and an upper limit N, which can be as high as 4 or 5. Clearly, hundreds of these frequencies can be present in the flow. (Because of their cross-bred ancestry, they are known as heterodyne frequencies.) It is also clear from their very number and their computational origins that they are distributed over a frequency range extending from zero to more than N times the highest of the stimulus frequencies, and that the amplitude of each of these new components is determined by a combination of the amplitudes of all the original components. This means that the energy associated with each flow component is determined jointly by all members of the controlling set of sinusoids. It is this cross-coupling of stimuli and responses having widely different frequencies that underlies the dynamical behavior of all self-sustained musical instruments, and so governs their musical properties.
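To get a feeling for how quickly the heterodyne components of Eq. (12) multiply, the following minimal sketch enumerates them for a small set of stimulus frequencies. The function name `heterodyne_frequencies` and the stimulus values 200 Hz and 310 Hz are illustrative assumptions, not taken from the text; only the frequencies are listed, since Eq. (12) says nothing about amplitudes.

```python
from itertools import product

def heterodyne_frequencies(stimuli, n_max):
    """Sorted, distinct values of |a*P ± b*Q ± ...| with integer weights 0..n_max."""
    freqs = set()
    # Letting each weight run over -n_max..n_max covers every ± choice in Eq. (12).
    for weights in product(range(-n_max, n_max + 1), repeat=len(stimuli)):
        f = abs(sum(w * s for w, s in zip(weights, stimuli)))
        if f > 0:  # drop the zero-frequency (steady flow) term
            freqs.add(round(f, 6))
    return sorted(freqs)

# Hypothetical stimulus pair P = 200 Hz, Q = 310 Hz with upper limit N = 2.
components = heterodyne_frequencies([200.0, 310.0], n_max=2)
print(len(components), "distinct components from", components[0], "to", components[-1], "Hz")
```

Even with two stimuli and N = 2 the highest component lies well above N times the higher stimulus frequency; with four stimuli and N = 4 or 5 the count runs into the hundreds, as the text states.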

Consider the behavior of a reed coupled to an air column designed in such a way as to have only a single resonant mode of oscillation. When blown softly, the system will oscillate at a frequency f0 that is very nearly equal to the modal frequency. Because the strengths of the higher numbered heterodyne components always fall toward zero under conditions of weak excitation, almost the entire efficacy of the flow controller is focused on supplying excitation to the air column at the frequency f0 of its own maximum response, and the system can oscillate efficiently. However, the system is perfectly stable, because any tendency of the system to “run away” leads to the production of heterodyne components that dissipate energy. These components do not replenish themselves because the air column does not respond strongly to them and so does not “instruct” the reed to reproduce them. If we try to play loudly by blowing harder on such a single-mode instrument, the sinusoid at f0 hardly changes in strength, but some hissing noise appears and the reed either chokes up entirely (on the cane reed instruments) or blows wide open (on the brasses), with a complete cessation of tone.

Similar behavior is observed for a multimode air column if the mode frequencies are randomly placed. Usually the system starts as though only the strongest resonance was present, and the choking-up of the oscillation is more abrupt because of the enormously increased number of unproductive heterodyne components that are produced when the blowing pressure is increased.

The discussion so far has shown the conditions under which a reed-plus-air-column system cannot play; it is time now to describe the requirements for a system that will produce sounds other than weak sinusoids. Suppose that the air column has a shape such that its natural frequencies themselves form a (very nearly) harmonic set, in the manner

$f_n = n f_1 + \varepsilon_n$, (13)

where the discrepancy εn is a measure of the inharmonicity. The heterodyne frequencies will now form small clumps closely grouped around the exactly harmonic frequencies n f1. As a group, the modes thus appear able to cooperate with the reed and so to regenerate the flow-stimulus energy (distributed in narrow clumps of ever-growing complexity). What actually happens is that the modes quickly lock together to produce a strictly harmonic oscillation with a repetition rate f0 such that the overall energy production of the system is maximized. Such a mode-locked regime of oscillation turns out to be increasingly quick-starting and stable in all respects if the air-column mode frequencies are increasingly well aligned into harmonic relationship. It will also run over a wide range of blowing pressures (and so produce a musically useful range of loudness).

D. Musically Useful Air-Column Shapes

We have just learned that for a self-sustained multimode oscillation to exist at all, the air-column shape must be such that the natural frequencies of its modes are in very nearly exact harmonic relationships. There are very few possible basic shapes that can meet this criterion. For instruments of the reed woodwind type there are two, for the brass instruments there are two, and for the air-reed (flute) family there is only one. It can be shown that because the cane-reed and lip-reed instruments have pressure-operated flow controllers, the relevant air column’s natural frequencies are those calculated or measured under the condition that its blowing end be closed off by means of an air-tight plug, while the downstream end is left in open communication with the outside air via the tone holes and bell. On the other hand, for the velocity-operated air reed of the flute family, it is necessary to consider the air-column modal frequencies for the condition when both ends are open.

The clarinet family is the sole representative of the cylindrical bore (first) type of possible reed woodwind while the oboe, bassoon, and saxophone belong to the basically conical second group. The trumpet, trombone, and French horn are representatives of the outward-flaring hyperbolic-shaped air columns suitable for brass instruments, while the flugelhorn and certain baritone horns are familiar examples of the conical second group. The flutes, on the other hand, are all based on a straight-sided tube, which can have positive, negative, or zero taper. That is, they can either expand or contract conically in going downstream from the blowing end, or can be untapering (i.e., cylindrical).

Because of the acoustical complexities of the tone holes at the lower ends of all the woodwinds and of the mouthpiece and reed structures at the upper ends of both brasses and woodwinds, the actual air-column shapes of the various instruments differ in many small ways from their prototypical bases. In all cases, however, the differences are such as to align the modal frequencies of the complete air column in the required harmonic relationship.

E. Sound Spectra in the Mouthpiece/Reed Cavity

It would require a lengthy discussion to explain the ways in which the mouthpiece/reed cavity sound pressure spectrum (which “instructs” the flow-controlling reed) takes its form, but it is not difficult to describe its general nature for the various kinds of wind instrument.

It is clear that a lot of the f0 fundamental component will be present in the mouthpiece spectrum: not only is it directly generated via the lowest frequency air-column resonance but also by difference-heterodyne action between every adjacent pair of the higher harmonics. In similar fashion, there will be a fair amount of heterodyne contribution to the second harmonic component arising from (at the very least) the nonlinear interaction of every alternate pair of harmonics. Analogous contributions are likewise made to the higher tonal components by ever more complex combinations of pairs of peaks in the resonance curve.

Details aside, the foregoing considerations are themselves able to imply a tendency for the successive harmonics of the generated tone to be progressively weaker. A closer look brings in information about the manner in which the properties of the air column and reed act to determine the spectrum. For a cane-reed woodwind (such as a clarinet, saxophone, or oboe) or for a lip-reed brass instrument it is possible to show that as long as the playing level is low enough that the reed does not pound completely closed during any part of its swing, the behavior of the pressure amplitudes pn of the various harmonics is well caricatured by

$p_n = \left(\frac{p_1}{p_0}\right)^n \frac{Z_n F(f_n) + M_n}{[F(f_n) - Z_n] + D_n}$. (14)

Here Zn is the height of the flow-induced resonance response curve (input impedance) at the frequency fn of the nth harmonic component of the played tone; the factor (p1/p0)^n gives the influence of the playing level on the overall spectrum; p1 is the amplitude of the fundamental component of the tone; and p0 is a reference pressure defined such that (p1/p0) = 1 when the reed just closes at one extreme of its cyclic swing. We shall postpone an explanation of the musical recognizability of p0 or its relation to the ordinary loudness specifications that run from pianissimo to fortissimo in the player’s natural vocabulary and will note only that an important measure of the strength of any component is the magnitude of its associated air-column resonance peak.
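The effect of the playing-level factor (p1/p0)^n is easiest to see in decibels. The sketch below isolates that factor alone, on the simplifying assumption that every other term in Eq. (14) stays fixed while the fundamental is changed; the function name `harmonic_level_change_db` is an illustrative choice.

```python
import math

def harmonic_level_change_db(n, p1_change_db):
    """Level change of harmonic n when the fundamental changes by p1_change_db.

    Only the (p1/p0)**n factor of Eq. (14) is considered; all other terms
    are assumed constant (a simplification made for this sketch).
    """
    ratio = 10 ** (p1_change_db / 20.0)    # amplitude ratio of the fundamental
    return 20.0 * math.log10(ratio ** n)   # algebraically n * p1_change_db

# Softening the fundamental by 3 dB lowers harmonic n by about 3n dB.
for n in range(1, 5):
    print(n, round(harmonic_level_change_db(n, -3.0), 2))
```

Under this assumption the higher harmonics fade n times faster than the fundamental, so a very soft tone tends toward a single sinusoid.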

The functions Mn and Dn are slowly changing, and they describe the already-mentioned nonlinear processes of energy exchange between spectral components that assure amplitude stability and well-defined waveforms at all playing levels. The function F(fn) describes the relation between the reed’s primary flow-controlling ability and the frequency of some signal component that may be acting upon it:

$F(f_n) = \pm K_r \left[1 - (f_n/f_r)^2\right]$. (15)

Here Kr is a constant, fn is the frequency of the signal component, and fr is the (player-controllable) natural frequency of the reed taken by itself. The plus sign in the defining equation for F applies to the cane reeds (which are pushed open by an increase in mouthpiece pressure), while the minus sign applies to the lip reeds belonging to the ordinary brass instruments (which are pushed closed by an increase in mouthpiece pressure). Since F must be positive if oscillation is to be supported, the lowest few harmonics of the tone of cane reed instruments must have fn ≤ fr, whereas for the brasses fn ≥ fr for all the components.
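The sign rule in Eq. (15) can be checked with a few numbers. This is a minimal sketch: the function name `flow_control`, the default Kr = 1, and the sample frequencies are assumptions chosen only for illustration.

```python
def flow_control(f_n, f_r, k_r=1.0, reed="cane"):
    """Evaluate Eq. (15): F = ±K_r [1 - (f_n / f_r)**2].

    reed="cane" uses the plus sign, reed="lip" the minus sign.
    k_r = 1.0 is an arbitrary placeholder for the constant K_r.
    """
    sign = 1.0 if reed == "cane" else -1.0
    return sign * k_r * (1.0 - (f_n / f_r) ** 2)

# Cane reed: F is positive only for components below the reed frequency.
print(flow_control(500.0, 1650.0, reed="cane") > 0)    # component below f_r
print(flow_control(2000.0, 1650.0, reed="cane") > 0)   # component above f_r

# Lip reed: F is positive only for components above the reed frequency.
print(flow_control(700.0, 350.0, reed="lip") > 0)
print(flow_control(200.0, 350.0, reed="lip") > 0)
```

The two reed types thus support oscillation on opposite sides of the reed's own natural frequency, which is the content of the inequalities fn ≤ fr and fn ≥ fr.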

Figure 10 illustrates the state of affairs for the woodwind family of instruments. The measured input impedance curve Z(f) is shown for an English horn fingered to give the air column for playing its own (written) note C4. Also shown is a typical flow-control curve F(f) with the reed frequency set at 1650 Hz. Notice (for future reference) that the air column almost completely lacks resonance peaks above what is known as its cutoff frequency fc, which lies near 1200 Hz.

FIGURE 10 Measured resonance curve for a typical woodwind air column (English horn), along with a curve showing the nature of the primary flow-control function F(f).

FIGURE 11 Measured resonance curve for a brass instrument air column (trumpet). It is shown along with a typical brass instrument flow-control function F(f).

Figure 11 shows in a similar fashion the impedance curve and F(f) for a trumpet in the case in which the player has set fr ≈ 350 Hz in preparation for sounding his written note G4. This note has its major energy production associated with the cooperation of resonance peaks 3, 6, and 9 (which are in accurately harmonic relationship on a good trumpet). Once again we call attention to the absence of air-column resonances above a cutoff frequency, this time lying near 1500 Hz. The pressure spectrum in the trumpet’s mouthpiece is difficult to guess by eye because it depends on the product of the heights of the Zn peaks and the rising F(f) curve; however, it is clear that the components having frequencies above the 1500-Hz cutoff are very weak. There are thus two reasons (acting for both woodwinds and brasses) why the spectrum should have a strong first harmonic and progressively weaker second, third, and fourth components, with rapidly disappearing components above that. In addition to the falloff associated with the weakening heterodyne contribution, we have at high frequencies a progressive reduction in the resonance peak heights, and above fc all energy production ceases.

So far, resonance curves have been presented for only one of the many air-column configurations that are possible via the fingerings available to a player. It goes almost without saying that when a player desires to sound a lower note, he lengthens the air column of a woodwind by closing a tone hole, or adds an extra length of tubing to the bore of a brass instrument by means of a valve piston or slide. In this way the frequencies of all the resonance peaks are shifted downward by a factor of 1/1.05946 for every semitone lowering of the desired pitch. An example of this behavior is presented in Fig. 12, where the resonance curves are presented for the written notes lying between C4 and G4 of a clarinet. The leftmost peak labeled by the numeral 1 is the first-mode resonance for the air column used to produce C4; the leftmost peak marked with a 2 similarly indicates the second-mode peak belonging to the same column arrangement, and so on for the higher-numbered peaks. In an exactly parallel way, the rightmost numerals 1, 2, 3, . . . indicate the corresponding resonance peaks for the note G4. There are two noteworthy features in this set of impedance curves. The first is shared by all wind instrument resonance curves: The cutoff frequency (above which there are no peaks) remains the same for all fingerings. The second feature is characteristic of the clarinet family alone: While peaks 1, 2, 3, . . . are in the strict whole-number relationship demanded by the cooperative nature of wind instrument tone production, the modal frequencies f1, f2, f3, . . . lie in a 1, 3, 5, . . . sequence, with dips in the resonance curves appearing at the positions of the even multiples of the mode-1 frequency. An immediate consequence of this fact is that despite the restorative powers of the alternate-component heterodynes, the even-numbered members of the generated mouthpiece pressure spectrum are weaker than the odd-numbered ones.

Because of the fixed cutoff frequency shared by all the air columns used to play notes on a given instrument, and because all the notes share the same mouthpiece and reed structure, it is possible to construct spectrum envelope formulas for the notes of the various classes of instrument. These formulas have a mathematical structure very much like those presented earlier in connection with the bridge-driving forces exerted by the strings of a guitar, harpsichord, or piano. The basic physics that determines them is of course entirely different here, since wind instrument oscillations are active, nonlinear, and self-sustaining.

FIGURE 12 Clarinet resonance curves measured for the air columns used in playing the notes C4 to G4.
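The per-semitone factor 1/1.05946 quoted above for the downward shift of the resonance peaks is the equal-tempered semitone ratio, 2^(1/12) ≈ 1.05946. As a minimal sketch, the helper below rescales a set of peak frequencies by a given number of semitones; the function name `shifted_peaks` and the sample peak values (chosen to follow a clarinet-like 1, 3, 5 sequence) are hypothetical, not measured data.

```python
SEMITONE = 2 ** (1 / 12)   # equal-tempered semitone ratio, about 1.05946

def shifted_peaks(peaks_hz, semitones_down):
    """Scale every resonance peak down by the given number of semitones."""
    factor = SEMITONE ** (-semitones_down)
    return [f * factor for f in peaks_hz]

# Hypothetical mode frequencies in a 1, 3, 5 ratio.
peaks = [260.0, 780.0, 1300.0]
print([round(f, 1) for f in shifted_peaks(peaks, 1)])   # one semitone lower
print([round(f, 1) for f in shifted_peaks(peaks, 7)])   # a fifth lower
```

Because every peak is multiplied by the same factor, the ratios between the peaks are preserved. This idealized scaling does not model the cutoff frequency, which the text says stays fixed for all fingerings.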

It is fairly obvious from the mathematical nature of the (p1/p0)^n factor that if the player sounds his notes progressively more softly (without changing the tension of his lips or the setting of the reed), the higher-n components fall away very much more quickly than the lower members of the sequence. In decibel language, we can say that for every decibel that the p1 component is weakened, the level of pn falls by n dB. At the softest possible level, then, we expect the tone within the mouthpiece to have degenerated into a single sinusoid of frequency f1.

In actual practice, the oboe is essentially unplayable at levels for which (p1/p0) < 1, while the bassoon is almost never played thus. The saxophone can be so used, but normally it too is used in the domain where p1/p0 > 1. Only the clarinet is played at the levels discussed so far, and for it, the customary forte instruction gives a tone for which p1/p0 is little or no larger than unity. This raises the immediate question of what happens to the spectral envelope when an instrument is played at the higher dynamic levels. The answer varies with the instrument, and while it is fairly well known, it would take us too far afield to discuss it here. The solid curve in Fig. 13 presents the general shape of the internal (mouthpiece) spectral envelope for the nonclarinet woodwinds. The corresponding internal spectral envelope for the brasses is shown in the same figure by a closely dotted curve, while the behavior of the odd and even components of the clarinet’s spectrum is shown by the pair of dashed lines. All these curves are calculated on the assumption that the factor (p1/p0)^n belongs to the instrument’s normal mezzoforte playing level.

The interplay between the direct energy processes at the air-column resonance peaks and the heterodyne transfer of energy between components is made vivid by the following observations. The heights of the various resonance peaks in Fig. 11 show clearly that there can be little direct production of energy by the first two or three components of the played tone C4 (led cooperatively by peaks 2, 4, 6, . . .).

FIGURE 13 Internal (mouthpiece or reed-cavity) spectrum envelopes. Nonclarinet woodwinds, solid curve; brasses, dotted curve; clarinet, dashed curves, one for the odd-numbered components of the played note, and one for the evens.

For brass instruments, on the other hand, the dotted spectrum envelope curve given in Fig. 13 shows that the actual strength of the generated fundamental component in the mouthpiece is the strongest of all, while components 2, 3, and 4 are progressively weaker. Clearly, a great deal of heterodyne action is needed to transfer the majority of the high-frequency generated power into the low-frequency part of the spectrum. We shall meet similar behavior in the tone production processes of the violin.

Consider next what happens when the trumpet player sounds what he calls the F2 pedal note, whose repetition rate is exactly one-third that of C4. The harmonic components 3, 6, 9, . . . of the pedal tone lie at air-column peaks 2, 4, 6, . . . and so can act as direct producers of acoustical energy. Meanwhile, the tonal components (1, 2), (4, 5), (7, 8), . . . are away from any resonance peaks, and so exist only because of the heterodyne conversion process. The shape of the measured mouthpiece spectrum envelope for components 3, 6, 9, . . . for F2 is almost identical with that belonging to C4. The remaining (heterodyned) components have a very similar envelope, but one that is many decibels weaker.
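The bookkeeping for the pedal note can be written out explicitly. In this sketch, which is only an illustration of the counting argument above, harmonic n of the pedal tone is labeled "direct" when n is a multiple of 3 (it then coincides with air-column peak 2n/3) and "heterodyne" otherwise; the function name `classify_pedal_harmonics` is an assumption.

```python
def classify_pedal_harmonics(n_harmonics):
    """Map each pedal-tone harmonic to ("direct", peak_number) or ("heterodyne", None).

    Follows the text's pairing: harmonics 3, 6, 9, ... sit on peaks 2, 4, 6, ...
    """
    result = {}
    for n in range(1, n_harmonics + 1):
        if n % 3 == 0:
            result[n] = ("direct", 2 * n // 3)
        else:
            result[n] = ("heterodyne", None)
    return result

for n, (kind, peak) in classify_pedal_harmonics(9).items():
    print(n, kind, peak)
```

Two of every three components of the pedal tone therefore owe their existence entirely to the heterodyne conversion process.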

F. Transformation of the Mouthpiece Spectrum Envelope into the Room-Average Envelope

Attention has already been called to the fact that regardless of the air-column configuration, each musical wind instrument has a cutoff frequency fc above which it lacks resonance response peaks. The same air-column physics that produces a falling away of the resonance peak heights immediately below fc (and so a reduced production of the corresponding mouthpiece pressure components) also plays a significant role in the transformation of the mouthpiece sound into the one enjoyed by listeners in the concert hall.

It is a property of the sequence of open tone holes at the lower end of a woodwind that, at low frequencies, sound is almost exclusively radiated from only the first of the holes, while the lower holes come into active play one by one as the signal frequency is raised. Above the cutoff frequency fc, all of the holes are fully active as radiators, and the sound emission not only becomes nearly independent of frequency, but also essentially complete.

While the acoustical laws of sound transmission and radiation from a brass instrument bell are quite different from those governing the woodwind tone holes, here too we find very weak emission of low-frequency sounds. The bell’s radiation effectiveness then rises steadily with frequency until it is once more complete for signals above fc.

FIGURE 14 Spectrum transformation function converting the internal spectrum envelope into the room-average one. Nonclarinet woodwinds, solid curve; brasses, dotted line; clarinet, a generally rising dashed line for the odd-numbered tonal components and a horizontal dashed line for the even components. For clarity both clarinet curves have been displaced downward 15 dB from the other curves.

All this explains why there are no resonance peaks above fc, and why those having frequencies just below fc are not very high: the energy loss associated with radiation acts to provide a frequency-dependent damping on the resonance, a damping that becomes complete above fc. In other words, the same phenomena that increase the emission of high-frequency sound from the interior of a wind instrument to the room around it also lead to a progressively falling ability of the instrument to generate high-frequency sounds within itself (this is shown by the dotted curve in Fig. 13).

The solid line in Fig. 14 shows the behavior of nonclarinet woodwinds in transferring their internal sounds into the room. Note that the transfer becomes essentially complete for components having frequencies above the cutoff frequency. The dotted and the dashed curves of this figure show the analogous spectrum transformation function for the brass instruments and for the odd and even components of the clarinet spectrum.

G. Overall Spectrum Envelopes of Wind Instruments in a Room

Figure 15 illustrates the nature of the room-average sound-spectrum envelopes of the main reed instrument classes, as calculated from the curves for their mouthpiece spectra and their transformation functions. Once again the solid curve pertains to the nonclarinet woodwinds, the dotted line to the brasses, and the pair of dashed lines to the clarinets. The essential correctness of these diagrammatic representations is clearly shown in Fig. 16a–c, which presents the room-average spectrum envelopes measured for the oboe (C4 to C6), trumpet (E3 to F#5), and clarinet (E3 to C6). These were obtained using room-averaging techniques quite similar to those used to obtain the spectrum envelope of a piano (see Fig. 6).

FIGURE 15 External spectrum envelopes. Nonclarinet woodwinds, solid line; brasses, dotted line; clarinet, a pair of dashed lines: one for the odd and one for the even components of the tone.

Exactly as is the case for the purely heterodyned even-harmonic components of the clarinet tone, the (1, 2), (4, 5), (7, 8), . . . components of the trumpet F2 pedal note are weakly generated but strongly radiated. As a result, in the measured room-averaged spectrum of this note these components essentially fit the envelope belonging to the directly generated 3, 6, 9, . . . components (being only about 3 dB weaker).

Study of a large variety of instruments (including soprano, alto, and bass representatives of each family) shows that the basic curves change very little from one example to another. In all cases the general trend at high frequencies is for the envelope to fall away as 1/f^3, with a breakpoint close to 1500 Hz for the soprano instruments (oboe, trumpet, clarinet). For the alto instruments (English horn, alto saxophone, alto clarinet), the breakpoint is around 1000 Hz, paralleling the fact that their playing range lies a musical fifth below the soprano instruments. For the next lower range of instruments (trombone, tenor saxophone, bass clarinet), the break lies around 1500/2 = 750 Hz, while the bassoon (whose playing range lies at one-third the frequency of the oboe) has its break near 1500/3 = 500 Hz.
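The scaling of the breakpoint with playing range, and the steepness of a 1/f^3 falloff, can be checked with a short calculation. This is a sketch under stated assumptions: the helper names are hypothetical, the 1/f^3 law is treated as an amplitude law (so the level is 20 log10 of it), and only the region above the breakpoint is modeled.

```python
import math

SOPRANO_BREAK_HZ = 1500.0   # breakpoint quoted for oboe, trumpet, clarinet

def breakpoint_hz(range_ratio):
    """Breakpoint for an instrument whose range lies range_ratio times lower."""
    return SOPRANO_BREAK_HZ / range_ratio

def rolloff_db(f_hz, f_break_hz):
    """Level relative to the breakpoint for a 1/f**3 amplitude falloff (f >= f_break)."""
    return -60.0 * math.log10(f_hz / f_break_hz)

print(breakpoint_hz(3 / 2))   # a fifth lower (alto instruments)
print(breakpoint_hz(2))       # an octave lower (trombone, tenor saxophone, bass clarinet)
print(breakpoint_hz(3))       # one-third the frequency (bassoon)
print(round(rolloff_db(3000.0, 1500.0), 1))   # one octave above the soprano breakpoint
```

On the amplitude assumption, a 1/f^3 envelope falls by roughly 18 dB for each octave above the breakpoint.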

Almost nothing has been said so far about the flute family of woodwinds beyond a description of its associated flow controller and the basic nature of its usable air column. Despite the flute’s apparent mechanical simplicity, it is in many ways dynamically more subtle than the other woodwinds and somewhat less well understood.

FIGURE 16 Measured room-average spectra: (a) oboe; (b) trumpet; (c) clarinet. For clarity the even-component data have been displaced downward by 20 dB.

Because the flute’s flow controller is velocity operated rather than pressure operated, a somewhat roundabout proof shows that it is the peaks in the admittance curve (flow/pressure) rather than those of its reciprocal, the impedance curve, that cooperate with and instruct the flute’s air reed. The room-average spectrum of a flute may be expected a priori to be quite different from that of the other woodwinds for two reasons. First, the oscillation dynamics of the primary energy production mechanism are drastically different. Second, there are two sources of sound radiation into the room: one is the familiar one associated with the tone hole lattice (fc ≈ 2000 Hz), while the other is the oscillatory flow at the embouchure hole across which the player blows. It is somewhat as though the room were supplied simultaneously with the internal and the external spectrum of a normal woodwind!

The room-average spectrum envelope for a flute has a curiously simple form that is well represented by

$E(f) = e^{-f/f_a}$. (16)

Here fa is near 800 Hz for the ordinary concert flute, close to 530 Hz for the alto, and 1600 Hz for the piccolo (as might be expected for such systematically scaled instruments).
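Equation (16) is simple enough to tabulate directly. The sketch below converts it to decibels on the assumption that E(f) is an amplitude envelope (the text does not say whether E is amplitude or energy); the dictionary keys and the function name `flute_envelope_db` are illustrative.

```python
import math

# Characteristic frequencies f_a quoted in the text, in Hz.
F_A = {"concert flute": 800.0, "alto flute": 530.0, "piccolo": 1600.0}

def flute_envelope_db(f_hz, f_a_hz):
    """Eq. (16) in decibels, assuming E(f) is an amplitude: 20*log10(exp(-f/f_a))."""
    return 20.0 * math.log10(math.exp(-f_hz / f_a_hz))

for name, f_a in F_A.items():
    levels = [round(flute_envelope_db(f, f_a), 1) for f in (500.0, 1000.0, 2000.0)]
    print(name, levels)
```

On this reading the envelope drops by about 8.7 dB for every increase of fa in frequency, so the decibel level falls linearly with frequency rather than with its logarithm.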

V. THE BOWED STRING INSTRUMENTS

The bowed string family of musical instruments shares many features with the instruments that have been discussed so far. For this reason the present section can serve both as a review and elaboration of the earlier material and as an introduction to another major class of instruments.

As indicated in Fig. 17, the violin, like the guitar, has a boxlike structure with a rigid neck and a set of strings whose vibrating length can be controlled by the player’s fingers. To a first approximation then, the two instrumental types have similar dynamical processes that convert the driving-force spectrum (transferred from the strings via the bridge) to the sound spectrum that is measured in the concert hall. On the other hand, the excitation mechanism of the self-sustained string oscillation of violin family instruments proves to be essentially the same as that which generates sound in the woodwind and brass instruments.

FIGURE 17 Structural parts of the violin, and their names.

A. The Excitatory Functioning of the Bow

The mechanism used by the violin family to keep a string in vibration is easily sketched: The frictional force exerted at the contact point between bow and string is smaller when there is a fast slipping of the bow hair over the string and larger when the sliding rate is slower. Thus, during those parts of its oscillation cycle when the contact point of the string chances to be swinging in the same direction as that of the rapidly moving bow (so that the slipping velocity is small), there is a strong frictional force urging the string forward in the direction of its motion. During the other half of the cycle, the string is moving in a direction opposite to that of the bow. Under these conditions the slipping velocity is large, making the frictional force quite small. Notice that this frictional drag is still exerted in the direction of the (forward) bow motion, and it therefore acts to somewhat retard the (backward) vibrational motion of the string. In short, during part of each vibratory cycle a strong force acts to augment the oscillation, and during the remainder of the cycle there is a weaker depleting action. The oscillation builds up until the bow’s energy augmentation process exactly offsets all forms of energy dissipation that may take place at the bowing point and everywhere else. It is clear that the vigor of the oscillation is ultimately limited by the fact that during its “forward” swing the string velocity at the bowing point can equal, but not exceed, the velocity of the bow; otherwise the friction would reverse itself and pull the string velocity back down to match that of the bow.

In the earliest version of the formal theory of the bowed string excitation process, it was assumed that the string and bow hair stick and move together during the forward motion of the string, while during the return swing the friction is negligible. Such a theory is remarkably successful in predicting many of the most obvious features of the oscillation but is powerless to give a dependable account of the actual driving-force spectrum envelope as it might appear at the bridge.

B. The Frequency-Domain Formulation Reapplied

In the framework of the resonance-curve/excitation-controller theory of self-sustained oscillators, it is easy to see that the bow/string interaction serves as a velocity-operated force controller; this is in contrast to the pressure-operated flow controllers found in the woodwinds and brasses. To be consistent then, in our theory we replace the pressure-response curves of flow-driven air columns (as measured in the mouthpiece) by the velocity-response curve of a force-driven string (measured at the bowing point). Figure 18 shows a velocity-response (driving-point admittance) curve calculated for a violin D string fingered to play F4. In this example, it has been assumed that the bow crosses the string at a point one-tenth of the string length away from the bridge (about 25 mm). Notice the remarkable similarity of this resonance curve to the one shown in Fig. 11 for the air column of a trumpet.

FIGURE 18 Violin bowing-point admittance (velocity/force) curve for F4, with the bow one-tenth of the way along the string from the bridge to the player’s left-hand fingers.

For the trumpet, the increasing height of the resonance peaks and the initial falling away beyond their maximum is determined chiefly by the design of the mouthpiece cup (a cavity) and back bore (a constriction). The ultimate disappearance of the peaks is, as already explained, due to the radiation loss to the room suffered by the air column. In the case of a violin, however, it is the distance of the bowing point from the bridge that determines the frequency region (around 1750 Hz in the example) where the admittance peaks are tallest. The subsequent weakening of the higher frequency peaks is controlled jointly by the rising influence of frictional and radiation damping and by some bow physics that is related to that which produces “notches” in the plucked string’s Ea(f) function [see Eq. (5)].

There exists a force-control function representing the bow/string interaction that is analogous to the wind instrument F(f) function [see Eq. (15)]. While this analog to F is not shown in Fig. 18, it may be taken to be quite similar to the one shown for woodwinds in Fig. 10. There is unfortunately no simple description for the bow properties that together play the role of the reed resonance frequency fr in limiting the production of energy at high frequencies.

C. Spectrum Systematics

The small height of the lowest few resonance peaks inFig. 18 shows that the major part of the total energyproduction comes via the tall response peaks that lie athigher frequencies. Not surprisingly, the nonlinear natureof the bow/string stick-slip force results in heterodyne ef-fects that lead to a bowing point spectrum very similarto that of a trumpet mouthpiece pressure spectrum. Thesystematic transfer of high-frequency energy into low-frequency vibrational components is as effective for theviolin as for the trumpet, so that (as in the trumpet) thedriving-point spectrum ends up with the first componentstrongest and higher ones becoming progressively weaker.

The simplest stick-slip theory of the bow/string oscillation gives a reasonably accurate initial picture of the spectrum envelope E_v(f) for the string velocity at the bowing point:

\[ E_v(f) = \frac{\sin(\pi f / f_\beta)}{\pi f / f_\beta}. \tag{17} \]

Here f_β is defined in terms of the point of application of the bow on the string in exactly the same manner as f_a was defined via the plucking point in Eq. (5). The shape of E_v(f) is shown exactly by the curve for E_a(f) in Fig. 4. The actual velocity spectrum envelope of a bowed string is quite similar to that implied by Eq. (17), except that (a) the spectral notches do not go all the way to zero, and (b) at high frequencies the effects of damping, etc., reduce the spectral amplitudes very considerably.
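The idealized envelope of Eq. (17) is easy to tabulate. The following is a minimal sketch, not the author's procedure; the numerical values of the notch frequency (`F_BETA`) and of the fundamental are hypothetical and chosen only so that one harmonic lands on a notch.

```python
import math

def bowed_velocity_envelope(f, f_beta):
    """Idealized envelope of Eq. (17): sin(x)/x with x = pi * f / f_beta."""
    x = math.pi * f / f_beta
    if x == 0.0:
        return 1.0  # limiting value of sin(x)/x as x -> 0
    return math.sin(x) / x

# Hypothetical values for illustration only; the text gives no number for f_beta.
F_BETA = 3500.0       # Hz, assumed notch frequency
FUNDAMENTAL = 350.0   # Hz, assumed playing frequency

for harmonic in range(1, 11):
    f = harmonic * FUNDAMENTAL
    value = bowed_velocity_envelope(f, F_BETA)
    print(f"harmonic {harmonic:2d}  {f:7.1f} Hz  E_v = {value:+.4f}")
```

With these assumed numbers the tenth harmonic coincides with the first notch of the idealized envelope; as the text notes, real bowed strings do not show a complete null there.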

So far the discussion of the spectral properties of the string velocity at the bowing point has served as a means for clarifying the fundamental energy production processes of the bowed string. What actually leads to the radiation of sound in the room, however, is the force exerted by the string on the bridge and the consequent emission of sound by the vibrating violin body. The spectrum transformation function relating the “internal” (bowing-point) spectrum to the “external” (room-average) spectrum must thus be considered in two parts. The first part relates the bowing-point velocity to the bridge force, while the second part converts this force spectrum into the one measured in the room.

The bowing-point velocity/bridge-force spectrum transformation function T_vF(f) turns out, according to the simplest theory, to be 1/sin(πf/f_β). The most striking consequence of this fact is that it exactly cancels out the notches in the simple formula for E_v(f). In other words, at those frequencies for which there is supposedly no velocity signal at all at the bowing point, the


FIGURE 19 Simplified transformation function connecting the violin bridge drive-force spectrum envelope with the room-average spectrum envelope.

transformation function is so enormously effective that it apparently “creates” a drive force at the bridge! Curiously enough, although the stick-slip versions of both the velocity-spectrum and force-transformation functions have been common knowledge for over a century, serious attempts to resolve the paradox have been made only recently. Much of the necessary information is currently available, but it has not been fitted together into a coherent whole. For present purposes, it will suffice to remark that the spectrum envelope of the bridge drive force is roughly constant at low frequencies and falls away fairly quickly for frequencies above a breakpoint that is determined in part by the distance of the bowing point from the bridge.
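As a quick check on the cancellation described here, the two simplest-theory expressions can be multiplied directly. The symbol E_F(f) for the resulting bridge-force envelope is introduced only for this illustration and does not appear in the original text:

```latex
E_F(f) \;=\; E_v(f)\,T_{vF}(f)
      \;=\; \frac{\sin(\pi f/f_\beta)}{\pi f/f_\beta}\cdot\frac{1}{\sin(\pi f/f_\beta)}
      \;=\; \frac{f_\beta}{\pi f}.
```

The sine factors cancel, so the product shows no notches at the multiples of f_β. Formally, a zero of E_v(f) is multiplied by an infinity of T_vF(f), which is the paradox in question. This idealized product should be read only as a sketch: the bridge-force envelope described in the text is roughly flat at low frequencies and falls away above a breakpoint.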

Figure 19 outlines the main behavior of the bridge-force-to-room transformation function. This must of course be evaluated for a drive force whose direction lies roughly parallel to the plane of the violin’s top plate and tangent to the curved paths of the string anchorages on the bridge (see Fig. 17). To a first approximation the trend line is horizontal, in agreement with the general assertions made about the force-to-room transformation in connection with the plucked and struck string instruments. However, there is a very rapid weakening in the radiating ability of the body in the low-frequency region below a strong peak near 260 Hz. This radiative transformation peak and the associated loss of low-frequency efficacy (whose cognate on the guitar falls at 85 Hz and below) are associated with a joint vibrational mode of the elastic-walled body cavity and the Helmholtz resonator formed by this air cavity and the apertures in it provided by the f holes (see Fig. 17 for details and terminology).

There is a second radiativity peak just below 500 Hz on a violin. This one is associated with the resonant response of a body mode in which the top plate vibrates (chiefly on the bass-bar side) in a sort of twisting motion having a quasi-fulcrum at the position of the sound post. The back plate is also in vigorous motion, being coupled to the top plate by the sound post.

(In the guitar, this radiativity peak is relatively unimportant. While it too has a body mode in which the bridge rocks strongly, the lack of a bass bar and sound post makes for a vibrational symmetry that gives a very small radiation of sound. Furthermore, the lowness of a guitar bridge means that this mode is only very weakly driven by a vibrating string.)

The violin has two more strong radiativity peaks. One is found near 3000 Hz, and the other near 6000 Hz, beyond which the radiativity falls as 1/f^2 or faster. The first of these peaks is determined by a bridge-plus-body mode in which the predominant motion is a rocking of the top part of the bridge about its waist. The second peak belongs to a mode in which there is a sort of bouncing motion (normal to the plane of the top plate) of the upper part of the bridge on the bent “legs” connecting its waist to its feet. Analogs to these peaks do not exist on the guitar.

There are many additional resonance-related peaks and dips in the transformation function besides those described previously and indicated in Fig. 19. For a violin these are spaced (on the average) only about 35 Hz apart, and they are proportionally closer on larger members of the family. We have dealt explicitly here only with those of major acoustical and musical importance whose positions along the frequency axis are well established for each family of bowed instruments. It is an important part of a fiddle maker’s skill to place these selected peaks in their correct frequency positions. He must also properly proportion the interactions of the various parts of an instrument (e.g., by suitably dimensioning and placing the sound post).

D. The Violin’s Measured Room-Average Spectrum and Its Implications

Figure 20 shows the room-average spectra of all chromatic notes between a violin’s bottom G3 and the A#4 that lies

FIGURE 20 Measured room-average spectra of violin notes of the chromatic scale between G3 and A#4.


somewhat more than an octave above. One feature calls instant attention to itself in the spectrum envelope implied by these data: This measured envelope is similar to that of soprano wind instruments and also the piano in that the envelope is roughly uniform at low frequencies and falls away at the rate of about 1/f^3 at high frequencies. The similarity is closest between the violin and the wind instruments, because the breakpoint between the low- and high-frequency regions lies in all cases near 1500 Hz!

Comparison of the envelope shape from Fig. 20 with the transformation function of Fig. 19 causes an initial feeling of surprise. Figure 19 shows a strong radiativity peak around 3000 Hz and another one near 450 Hz. Neither of these shows up in the spectral envelope of the radiated sound. While no one seems to have worked out the details yet, the basic explanation has already been met among the wind instruments. Efficacy of radiation generally means a lowering of the resonance peaks that operate the primary excitation controller (reed or bow). As a result, there is less energy produced at the radiativity peaks than elsewhere, thus offsetting the increased emission of the energy that is produced. However, the fact that all parts of the generated spectrum are strongly interconnected by heterodyne action makes it impossible to make detailed predictions of what will happen from general principles alone.

More detailed comparison of the radiativity curve and the spectrum envelope shows further evidence that their relationship is not simple: The strong radiativity peak at 260 Hz differs from the others in that it does appear to strengthen the radiated tonal components that coincide with it. Furthermore, the rapid decrease in radiativity below 250 Hz is reflected in a rapid loss of power in the corresponding components of the violin’s tone. We also see (Fig. 20) hints of a strong emission of sound in the regions around 400 Hz and clear indications of even stronger emission around 550 Hz, despite the fact that there are no prominent resonances to be found at these frequencies in the violin’s modal collection. Other hints of systematic fine structure in the observed spectrum are tantalizingly visible in the present data, hints that strengthen and weaken surprisingly when the data are displayed in different ways. As remarked earlier, much more remains to be done to elucidate the detailed origins of the violin’s spectral envelope, as is the case with many other features of its acoustical behavior. Meanwhile, clues as to what sorts of phenomena are to be expected may be looked for in the spectral relations among the components of the brass instrument pedal tones.

The musical interpretability of the bowed string sound is made yet more difficult by the fact that the human auditory system is readily able to recognize the tonal influences of all the resonances displayed in Fig. 19, along with several other less well-marked or invariable ones belonging to this complex system. This is despite the fact that they are not visible in the measured spectrum. Physicists must always remember in cases like this that while the ear does not analyze sounds in the ways most readily chosen by laboratory scientists, it must in the final analysis act upon whatever pieces of physical data offer themselves, many of which can be at least listed for the scientist’s serious consideration, even though they may be difficult for him to measure.

Claims are made from time to time that “the secret of Stradivari” has been discovered. Such claims arise in part because of a sometimes unrecognized conflict between the remarkably effective but subliminal routines of musical listening and the highly intellectualized activities of a laboratory researcher, and in part because of everyone’s romantic desire to create a better instrument. Each discoverer proclaims some “truth” that he has found. If the “scientific” discoverer is often less guarded in his claims than his craftsman or musician counterpart, it is because he often knows only one aspect of the primary oscillation problem or of the vibration/radiation aspects of the net sound production process. Moreover, he is not subjected to the discipline of successful practice in the real-world fields of instrument making or musical performance, where partial success is often equivalent to failure!

VI. THE APTNESS OF INSTRUMENTAL SOUNDS IN ROOMS

The diverse musical instruments that we have studied share a remarkable number of properties. Let us list some of these and attempt to relate them to the ways in which they provide useful data to the auditory processor.

All of the standard orchestral instruments generate sounds that are (note by note) made up of groups of sinusoids whose frequencies are whole-number multiples of some repetition rate. We might ask, at least for the plucked or struck string instruments, whether it is an accident that this should be so, since it comes about (and only approximately at that) via the choice of thin, elongated, uniform wires as the primary vibrating object. Why should such vibrators take precedence over vibrating plates or membranes, or even over wires of nonuniform cross section? In the case of the wind and bowed instruments (including the singing voice), self-sustained oscillations are possible only under conditions where the resulting spectrum is of the strictly harmonic type. Here, then, the traditional instrument maker has no choice: It is impossible for him to provide inharmonic sound sources.

For the moment, the question remains partly open as to why the harmonic-type instruments are dominant. We have, however, been given a strong hint by the observation that the auditory processor treats such aggregations of spectral components in a special way: it perceives each such grouping as an individual, compact tone. It can also distinguish several such tones at the same time, and even recognize well-marked relationships between them (such as the octave or the musical fifth).

The problem remains, however, as to whether the recognizability of individual harmonic groups can survive the room’s transmission path. Regardless of the complexity of transmission of amplitude or phase, the frequencies of the components radiated from an instrument arrive unaltered at the listener’s ear. It is an easily verified fact that the pitch of a harmonic complex is almost rigorously established by the harmonic pattern of its component frequencies (which determine in a mathematically unique way its repetition rate), rather than by the amplitudes of these components. In other words, as long as even a few of the partials of each instrument’s tone detectably arrive at the listener’s ear, the pitch (music’s most important attribute) is well established.
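The statement that the component frequencies alone fix the repetition rate can be illustrated with a small sketch. It assumes exactly harmonic partials given as whole numbers of hertz, which is an idealization (real partials are measured with some error, and the ear tolerates slight mistuning); the function name is hypothetical.

```python
from functools import reduce
from math import gcd

def repetition_rate(partials_hz):
    """Greatest common divisor of integer-valued partial frequencies (Hz).

    For an exactly harmonic complex this is the repetition rate of the
    waveform, whatever the amplitudes of the partials may be.
    """
    return reduce(gcd, partials_hz)

# Two different subsets of the harmonics of a 200 Hz tone: the fundamental
# itself is absent from both, yet the same repetition rate is recovered.
print(repetition_rate([600, 800, 1000]))
print(repetition_rate([400, 1000, 1400]))
```

Amplitudes never enter the calculation, which mirrors the point that a room can alter the strengths of the partials without disturbing the pitch.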

Great emphasis has been laid throughout this article on the fact that each instrument is constructed in such a way that all of its notes share a common spectrum envelope. It has also been pointed out that for the keyboard instruments at least, it is a matter of considerable difficulty to achieve such an envelope. The structure of the brass instruments, on the other hand, almost guarantees a well-defined envelope; and even though it is possible to build woodwind instruments that lack an envelope, many things become easier if one is arranged for them. Finally, the guitarist and the violinist were found to have instruments that inherently tend to produce a spectral envelope, but one whose breakpoint and high-frequency slope can be influenced by the player via his choice of plucking or bowing point.

What are the perceptual reasons for these instruments to have evolved to produce a well-defined spectral envelope? This question can be answered at least in part by the facts of radiation acoustics and musical perception in rooms. It takes only a very limited collection of auditory samples of transmission-distorted data from an instrument for the listener to form an impression of the breakpoint and high-frequency slope, and so (even for a single note in a musical passage) to permit him to decide which of the instruments before him has produced it.

A question that is less easy to decipher is why the instruments seem to have very nearly the same envelopes. A partial explanation is to be found in the observation that the bulk of the available acoustical energy in a tone is allocated to the first four or five partials, which puts these energy packages into a set of independent critical bands, thus maximizing the net loudness. A less obvious explanation is that the high-frequency rolloff may be a way of preventing excessive tonal roughness of the sort that comes about when too many harmonics find themselves in the same critical band. For example, harmonics 7, 8, and 9 will contribute some roughness because they all lie within the 25% bandwidth of significant mutual interaction. This explanation is not adequate, however. It turns out that critical-band-induced roughnesses of this sort are not strongly active, whereas (for the soprano instruments at least) an insufficient rolloff rate tends to produce a quite unacceptable tone color.
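The arithmetic behind the remark about harmonics 7, 8, and 9 can be sketched as follows. The sketch assumes that the 25% band is centered symmetrically on a given harmonic (so that it extends 12.5% to either side); that reading of the bandwidth, and the function name, are assumptions made for illustration.

```python
def harmonics_in_band(center_harmonic, relative_bandwidth=0.25, max_harmonic=20):
    """List the harmonic numbers lying in a band centered on one harmonic.

    The band is assumed to span center * (1 -/+ relative_bandwidth / 2).
    Harmonic numbers stand in for frequencies, since all partials scale
    with the same fundamental.
    """
    half = relative_bandwidth / 2.0
    low = center_harmonic * (1.0 - half)
    high = center_harmonic * (1.0 + half)
    return [n for n in range(1, max_harmonic + 1) if low <= n <= high]

for center in (2, 4, 5, 8, 12):
    print(center, harmonics_in_band(center))
```

Under this assumption the lowest harmonics each occupy a band of their own, while from the eighth harmonic upward the neighbors begin to share a band, in line with the example in the text.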

While much remains to be done to fully clarify questions of the sort raised in the preceding paragraph, we can find hints as to where the answers may be sought. Psychoacousticians have shown the existence of a tonal attribute known as sharpness (which is what its German originator calls it in English, but edginess, or harshness, would be a better term). This attribute may be calculated dependably from the spectral envelope of a sound, its power level, and the frequency range in which its components are found. We may summarize the calculation method thus. The total loudness N perceived by the listener is given by the integral of a loudness density function n(z) that is a perceptual cognate of the product of the physicist’s spectral envelope E(f) and the level of the acoustical signals received by the listener:

\[ N = \int n(z)\,dz. \tag{18} \]

It also takes into account varying amounts of interaction between spectral components that are not widely separated in frequency. Here the variable z is the transformation of the ordinary frequency axis into a perceptual coordinate such that increments of one unit of z correspond to the width of one critical band. The sharpness, S, is then calculated as an overlap integral of n(z) and a sharpness weighting function g(z), given by

\[ S = \frac{\text{const}}{\ln(N/20 + 1)} \int n(z)\,g(z)\,dz. \tag{19} \]

The sharpness weighting function g(z) is small at low values of z, and it rises rapidly for values of z that correspond to frequencies above 1000 Hz. It should probably not be taken as accidental that the most important contributions to the sharpness integral arise above 1000 Hz, in the region where the primary receptors are beginning to randomly misfire relative to their mechanical stimuli.
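The structure of Eqs. (18) and (19) can be sketched numerically with a trapezoidal rule. Everything numerical below is hypothetical: the loudness densities, the ramp-shaped weighting `g_toy`, and the value of `const` are toy choices made only to show how the two integrals combine, not psychoacoustic data.

```python
import math

def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y taken at positions x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

def total_loudness(n, z):
    """Eq. (18): N = integral of n(z) dz."""
    return trapezoid(n, z)

def sharpness(n, g, z, const=1.0):
    """Eq. (19): S = const / ln(N/20 + 1) * integral of n(z) g(z) dz."""
    big_n = total_loudness(n, z)
    overlap = trapezoid([ni * gi for ni, gi in zip(n, g)], z)
    return const * overlap / math.log(big_n / 20.0 + 1.0)

# Toy critical-band axis and toy functions (all hypothetical).
z = [float(i) for i in range(25)]
g_toy = [max(0.0, (zi - 8.0) / 16.0) for zi in z]   # small at low z, rising above
n_low = [1.0 if zi <= 8.0 else 0.0 for zi in z]     # loudness concentrated at low z
n_high = [1.0 if zi >= 16.0 else 0.0 for zi in z]   # same total loudness, high z

print(total_loudness(n_low, z), total_loudness(n_high, z))
print(sharpness(n_low, g_toy, z), sharpness(n_high, g_toy, z))
```

The two toy tones have the same total loudness, but only the one whose loudness density overlaps the rising part of the weighting function acquires appreciable sharpness.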

Figure 21a shows the function n(z) calculated for a harmonic tone having a fundamental frequency f_0 of 200 Hz, a spectral envelope with breakpoint frequency 1500 Hz,


FIGURE 21 (a) Loudness density function n(z) calculated for harmonic tones based on 200 Hz, and having a spectral envelope with 1500 Hz break frequency and 1/f^3 high-frequency rolloff. Also shown is the sharpness function g(z). (b) Similar curves, for a tone having the same spectral envelope but with an 800 Hz fundamental frequency.

and 1/f^3 high-frequency rolloff. Also plotted is the sharpness function g(z). Qualitatively speaking, we may understand the net sharpness as being related to the area of the shaded region lying between the z axis and the two curves. Figure 21b similarly illustrates the case of a tone having the same spectral envelope and a fundamental frequency of 800 Hz.
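The two tones of Fig. 21 differ in how their partials sample one and the same envelope. The sketch below assumes a simple smooth envelope model, 1/(1 + (f/f_break)^3), which is flat well below the breakpoint and falls as 1/f^3 well above it; this functional form, the 8000 Hz cutoff, and the function names are assumptions chosen for illustration and are not taken from the text.

```python
def envelope(f, f_break=1500.0, rolloff_power=3):
    """Assumed envelope model: about 1 below f_break, about (f_break/f)**3 above."""
    return 1.0 / (1.0 + (f / f_break) ** rolloff_power)

def partial_amplitudes(f0, f_max=8000.0):
    """(frequency, envelope value) for each harmonic of f0 up to f_max."""
    amps = []
    k = 1
    while k * f0 <= f_max:
        amps.append((k * f0, envelope(k * f0)))
        k += 1
    return amps

def partials_below_break(f0, f_break=1500.0):
    """Number of harmonics of f0 at or below the breakpoint frequency."""
    return int(f_break // f0)

for f0 in (200.0, 800.0):
    print(f0, partials_below_break(f0), len(partial_amplitudes(f0)))
```

A 200 Hz tone places several strong partials below the 1500 Hz breakpoint, whereas an 800 Hz tone places only its fundamental there, so a larger share of its partials falls in the region where the sharpness weighting is significant.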

We find by direct electronic synthesis (or by sounding a specially made laboratory wind instrument) that an unpleasant harshness is attributed to tones having a raised breakpoint or reduced rate of high-frequency falloff relative to the one we described previously. Furthermore, the tone is generally pronounced to lack piquancy and to be somewhat dull and muffled when the envelope has a lowered breakpoint or steepened falloff.

It is not difficult to anticipate from the general nature of Fig. 21 that instruments built to play in the alto, tenor, and bass ranges will be very little influenced by the sharpness phenomenon, freeing them (in accordance with observation and experiment) from the constraints that appear to hold for the soprano instruments. On the other hand, treble instruments (having breakpoint frequencies of 2000 Hz or above) are found to have a great deal of sharpness regardless of the high-frequency envelope slope. In the orchestra these instruments are rarely used, and then only for special purposes. Quantitative study of the relation of sharpness, loudness, and spectrum envelope for instruments in various pitch ranges is in its infancy, but already a considerable amount of consistency is apparent.

One flaw is to be noticed in the apparently coherent picture that has been sketched above: A most important musical instrument, the piano, seems not to provide the otherwise universal spectral envelope. Here the break frequency turns out to lie near 800 Hz rather than 1500 Hz. However, it is a simple matter to electronically refilter a set of recorded piano tones to move the breakpoint to 1500 Hz without changing anything else. Listening tests on such a modified (normalized) sound show at once that the tone does not so much become harsh as that the pounding of the hammers becomes obtrusive. Readers who have listened to the actual sound of the early-nineteenth-century pianoforte will have heard a mild form of the same kind of hammer clang. (Do not count on a recording to inform you, because many recordings have been so tampered with that nothing can be learned from them.) Apparently, pianos have evolved away from an original design based on the harpsichord (where the continuous-spectrum impulsive hammer sound is not produced, and the spectrum envelope is essentially of the familiar type) to one in which one tonal virtue is sacrificed to avoid a serious flaw.

The perceptual symbiosis that exists between a musical instrument and the concert hall in which it is played can be illustrated further by considering the details of the primary radiation processes whose signals are compiled in making a room-average spectrum. For every instrument family, the spectrum envelope of the sounds radiated in some particular direction (in reflection-free surroundings) differs significantly from that radiated in some other direction. In many cases smoothly varying discrepancies between two such envelopes can amount to as much as 40 dB. We also find that the signals from microphones placed at various positions close to an instrument have peculiar and highly irregular spectra that have no easily recognizable relationship with more thoughtfully obtained spectra. We have already seen what happens when many individual samples of the more distant version of the sound are combined into a room average: a reliable picture of it emerges. Our hearing mechanism and the surrounding hall join to accomplish just this task. The concert halls in which we normally listen to music offer many reflections, which means that data concerning all aspects (literally!) of the emitted sound are made available to our auditory processors. The chief reason (from the point of view of music) why the room-average spectrum is important is that the ear actually can assemble the equivalent information by means of early-reflection processing and/or multiple-sample averaging via use of two-ear, moving-listener, moving-source data collected over the span of several seconds. Almost none of this multiplicity of data is available for processing in reflection-free surroundings, which provides a significant hint as to why serious performers and listeners alike tend to dislike open-air music: It subjects them to auditory deprivation.

Despite the noise-reduction and harmonic distortion-free techniques of digital recording and the use of compact disks, many modern attempts at musical recording are frequently quite unsatisfactory. Recording engineers sometimes misuse their technical resources in an attempt to remove the confusion from the recorded sound by the use of reflection-free studios, partitions between instruments, and the “mixing down” and “filter enhancement” of signals from numerous highly directional microphones (each placed very close to its own instrument). These actions (which are increasingly resented by performing classical musicians) produce distortion of the primary musical data when they do not eliminate them altogether. On the other hand, recordings of the sort made in the 1950s and 1960s using two or three microphones properly placed in a good concert hall have never been surpassed, at least in the informed judgement of those listeners to classical music whose experience has been gained largely by actual concert-going. In short, for music we need and enjoy all of the data from our instruments, instruments that have evolved over several centuries to communicate their voices effectively in the environment of a concert hall.

SEE ALSO THE FOLLOWING ARTICLES

ACOUSTICAL MEASUREMENT • ACOUSTICS, LINEAR • SIGNAL PROCESSING, ACOUSTIC • SIGNAL PROCESSING, GENERAL • ULTRASONICS AND ACOUSTICS



Encyclopedia of Physical Science and Technology EN010C-484 July 16, 2001 16:7

Nonlinear Dynamics
F. C. Moon
Cornell University

I. Introduction
II. The Undamped Pendulum
III. Nonlinear Resonance
IV. Self-Excited Oscillations: Limit Cycles
V. Stability and Bifurcations
VI. Flows and Maps: Poincaré Sections
VII. One-Dimensional Maps, Bifurcations, and Chaos
VIII. Fractals and Chaotic Vibrations
IX. Fractal Dimension
X. Lyapunov Exponents and Chaotic Dynamics
XI. The Lorenz Equations: A Model for Convection Dynamics
XII. Spatiotemporal Dynamics: Solitons
XIII. Controlling Chaos
XIV. Conclusion

GLOSSARY

Bifurcation Denotes the change in the type of long-time dynamical motion when some parameter or set of parameters is varied (e.g., as when a rod under a compressive load buckles: one equilibrium state changes to two stable equilibrium states).

Chaotic motion Denotes a type of motion that is sensitive to changes in initial conditions. A motion for which trajectories starting from slightly different initial conditions diverge exponentially. A motion with positive Lyapunov exponent.

Controlling chaos The ability to use the parameter sensitivity of chaotic attractors to stabilize any unstable, periodic orbit in a strange attractor.

Duffing’s equation Second-order differential equation with a cubic nonlinearity and harmonic forcing, \( \ddot{x} + c\dot{x} + bx + ax^3 = f_0 \cos \omega t \).

Feigenbaum number Property of a dynamical system related to the period-doubling sequence. The ratio of successive differences between period-doubling bifurcation parameters approaches the number 4.669. . . . This property and the Feigenbaum number have been discovered in many physical systems in the prechaotic regime.

Fractal dimension Quantitative property of a set of points in an n-dimensional space that measures the extent to which the points fill a subspace as the number of points becomes very large.

Hopf bifurcation Emergence of a limit cycle oscillation from an equilibrium state as some system parameter is varied.

Limit cycle In engineering literature, a periodic motion that arises from a self-excited or autonomous system as in aeroelastic flutter or electrical oscillations. In dynamical systems literature, it also includes forced periodic motions (see also Hopf bifurcation).

Linear operator Denotes a mathematical operation (e.g., differentiation, multiplication by a constant) in which the action on the sum of two functions is the sum of the action of the operation on each function, similar to the principle of superposition.

Lorenz equations Set of three first-order autonomous differential equations that exhibit chaotic solutions. This set of equations is one of the principal paradigms for chaotic dynamics.

Lyapunov exponents Numbers that measure the exponential attraction or separation in time of two adjacent trajectories in phase space with different initial conditions. A positive Lyapunov exponent indicates a chaotic motion in a dynamical system with bounded trajectories. (Sometimes spelled Liapunov.)

Nonlinearity Property of an input–output system or mathematical operation for which the output is not linearly proportional to the input. For example, y = cx^n (n ≠ 1), or y = x dx/dt, or y = c(dx/dt)^2.

Period doubling Sequence of periodic vibrations in which the period doubles as some parameter in the problem is varied. In the classic model, these frequency-halving bifurcations occur at smaller and smaller intervals of the control parameter. Beyond a critical accumulation parameter value, chaotic vibrations occur. This scenario to chaos has been observed in many physical systems but is not the only route to chaos (see Feigenbaum number).

Phase space In mechanics, an abstract mathematical space with coordinates that are generalized coordinates and generalized momenta. In dynamical systems governed by a set of first-order evolution equations, the coordinates are the state variables or components of the state vector.

Poincaré section (map) Sequence of points in phase space generated by the penetration of a continuous evolution trajectory through a generalized surface or plane in the space. For a periodically forced second-order nonlinear oscillator, a Poincaré map can be obtained by stroboscopically observing the position and velocity at a particular phase of the forcing function.

Quasi-periodic Vibration motion consisting of two or more incommensurate frequencies.

Saddle point In the geometric theory of ordinary differential equations, an equilibrium point with real eigenvalues with at least one positive and one negative eigenvalue.

Solitons Nonlinear wave-like solutions that can occur in a chain of coupled nonlinear oscillators.

Strange attractor Attracting set in phase space on which chaotic orbits move; an attractor that is not an equilibrium point or a limit cycle, or a quasi-periodic attractor. An attractor in phase space with fractal dimension.

Van der Pol equation Second-order differential equation with linear restoring force and nonlinear damping, which exhibits a limit cycle behavior. The classic mathematical paradigm for self-excited oscillations.
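Several of the glossary entries above (chaotic motion, Lyapunov exponents, period doubling) can be made concrete with a one-dimensional map. The sketch below uses the logistic map x → r x (1 − x), a standard illustrative example that is not itself defined in the glossary; the function name, the starting value, and the iteration counts are arbitrary choices.

```python
import math

def logistic_lyapunov(r, x0=0.2, n_transient=1000, n_iter=100_000):
    """Estimate the Lyapunov exponent of the logistic map x -> r*x*(1-x).

    The exponent is the long-run average of ln|f'(x)| along an orbit,
    where f'(x) = r*(1 - 2*x) is the local stretching factor of the map.
    """
    x = x0
    for _ in range(n_transient):           # discard the initial transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        slope = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / n_iter

for r in (2.8, 3.2, 3.5, 4.0):
    print(r, logistic_lyapunov(r))
```

A negative estimate corresponds to a periodic (attracting) motion and a positive one to chaotic motion in the sense of the glossary; for r = 4 the estimate should come out close to ln 2.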

DYNAMICS is the mathematical study of the way systems change in time. The models that measure this change include differential equations and difference equations, as well as symbol dynamics. The subject involves techniques for deriving mathematical models as well as the development of methods for finding solutions to the equations of motion. Such techniques involve both analytic methods, such as perturbation techniques, and numerical methods.

I. INTRODUCTION

In the classical physical sciences, such as mechanics or electromagnetics, the methods to derive mathematical models are classified as dynamics, advanced dynamics, Lagrangian mechanics, or Hamiltonian mechanics. In this review, we discuss neither techniques for deriving equations nor the specific solution methods. Instead, we describe some of the phenomena that characterize how nonlinear systems change in time, such as nonlinear resonance, limit cycles, coupled motions, and chaotic dynamics.

An important class of problems in this subject consists of those problems for which energy is conserved. Systems in which all the active forces can be derived from a force potential are sometimes called conservative. A branch of dynamics that deals with such systems is called Hamiltonian mechanics.

The qualifier nonlinear implies that the forces (or voltages, etc.) that produce change in physical problems are not linearly proportional to the variables that describe the state of the system, such as position and velocity in mechanical systems (or charges and currents in electrical systems). Mathematically, the term linear refers to the action of certain mathematical operators L, such as multiplication by a constant, differentiation, or indefinite integration. A linear operator is one that can be distributed among a sum of functions without interaction, that is,

$$L[a f(t) + b g(t)] = aL[f(t)] + bL[g(t)].$$

Nonlinear operators, such as those that square or cube a function, do not obey this property. Dynamical systems that have nonlinear mathematical models behave very differently from ones that have linear models. In the following, we describe some of the unique features of nonlinear dynamical systems.

Another distinction is whether the motion is bounded or not. Thus, for a mass on an elastic spring, the restoring forces act to constrain the motion, whereas in the case of a rocket, the distance from some fixed reference can grow without bound. In this review, we discuss only bounded problems typically involving vibrating phenomena.

Mathematical models in dynamical systems generally take one of three forms: differential equations (or flows), difference equations (called maps), and symbol dynamic equations. Although the physical laws from which the models are derived are often second-order differential equations, the theory of nonlinear dynamics is best studied by rewriting these equations in the form of first-order equations. For example, Newton's law of conservation of momentum for a unit mass with one degree of freedom is usually written as a second-order differential equation:

$$\ddot{x} = F(x, \dot{x}, t). \tag{1}$$

In nonlinear dynamics one often rewrites this in the form

$$\dot{x} = y, \qquad \dot{y} = F(x, y, t). \tag{2}$$
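The first-order form of Eq. (2) is also the form that standard numerical integrators expect. The following is a minimal sketch, not a method given in this article: it advances (x, y) with the classical fourth-order Runge-Kutta scheme. The names `rk4_step` and `integrate`, and the linear test force F = −x, are assumptions chosen only for illustration.

```python
import math

def rk4_step(F, x, y, t, dt):
    """One classical Runge-Kutta step for dx/dt = y, dy/dt = F(x, y, t)."""
    k1x, k1y = y, F(x, y, t)
    k2x, k2y = y + 0.5 * dt * k1y, F(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, t + 0.5 * dt)
    k3x, k3y = y + 0.5 * dt * k2y, F(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, t + 0.5 * dt)
    k4x, k4y = y + dt * k3y, F(x + dt * k3x, y + dt * k3y, t + dt)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x_new, y_new

def integrate(F, x0, y0, t_end, n_steps):
    """Integrate from t = 0 to t_end and return the final phase-space point."""
    dt = t_end / n_steps
    x, y = x0, y0
    for i in range(n_steps):
        x, y = rk4_step(F, x, y, i * dt, dt)
    return x, y

# Hypothetical test case: a linear oscillator, F = -x, followed for one period.
x_end, y_end = integrate(lambda x, y, t: -x, 1.0, 0.0, 2.0 * math.pi, 1000)
```

For the linear test force the orbit is a closed circle in the (x, y) plane, so after one period the computed point should return close to its starting value (1, 0).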

The motion is then viewed in phase space with vector components (x, y) corresponding to position and velocity. (In advanced dynamics, phase space is sometimes defined in terms of generalized position coordinates and generalized momentum coordinates.) For more complex problems, one studies dynamical models with differential equations in an N-dimensional phase space with N components $x_1(t), x_2(t), \ldots, x_i(t), \ldots, x_N(t)$, where the equation of motion takes the form

$$\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x}, t). \tag{3}$$

For the single oscillator of Eq. (2), for example, $x_1 = x$ and $x_2 \equiv \dot{x} = y$.

Difference equations or maps are also used in nonlinear dynamics and are sometimes derived from or related to continuous flows in phase space by observing the motion or state of the system at discrete times, that is, $x_n \equiv x(t_n)$. In distinction to Eq. (3), the subscript refers to different times or different events in the history of the system. First- and second-order maps have the following forms:

$$x_{n+1} = f(x_n) \tag{4a}$$

or

$$x_{n+1} = f(x_n, y_n), \qquad y_{n+1} = g(x_n, y_n). \tag{4b}$$

Examples are given later in this article.

Another model is obtained when the variable $x_n$ is restricted to a finite set of values, say (0, 1, 2). In this case, there is no need to think in terms of numbers because one can make a correspondence between (0, 1, 2) and any set of symbols such as $(a_1, a_2, a_3)$ = (L, C, R) or (R, Y, B). Thus, in some systems we may be interested only in whether the particle is to the left (L), right (R), or in the center (C) with respect to some reference. We can also label states with colors, such as red (R), yellow (Y), or blue (B). The evolution of a system is then expressed in the form

$$a_{n+1} = h(a_n). \tag{5}$$

Here, however, $h(a_n)$ may not be an explicit algebraic expression but a rule that may incorporate inequalities. For example, suppose that $x(t_n)$ is the position of some particle at time $t_n$. Then one could have

$$a_{n+1} = \mathrm{L} \ \text{if} \ x_n < 0, \qquad a_{n+1} = \mathrm{R} \ \text{if} \ x_n \geq 0.$$

An equilibrium solution might be LLLL . . . , whereas a periodic motion has the form RRLRRLRRL . . . or LRLRLR . . . .

For a given physical system, one can use all three types of models.
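The left/right rule above can be sketched in a few lines of Python. The helper names (`symbol`, `symbol_sequence`) and the two toy maps used to generate the positions are hypothetical and serve only to produce an equilibrium-like and a periodic symbol string.

```python
def symbol(x):
    """Symbol assigned to a position: L if x < 0, R if x >= 0."""
    return "L" if x < 0 else "R"

def symbol_sequence(f, x0, n):
    """Iterate the map x -> f(x) and record the symbols of x_0 ... x_{n-1}."""
    x = x0
    symbols = []
    for _ in range(n):
        symbols.append(symbol(x))
        x = f(x)
    return "".join(symbols)

# Toy maps (assumptions for illustration only):
# a contraction that stays on the left, and a sign-flipping map.
resting = symbol_sequence(lambda x: 0.5 * x, -1.0, 4)
alternating = symbol_sequence(lambda x: -x, 1.0, 6)
```

Here `resting` is the string LLLL and `alternating` is RLRLRL, so the symbol strings distinguish the two kinds of motion without reference to the numerical values of x.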

II. THE UNDAMPED PENDULUM

A. Free Vibrations

A classical paradigm in nonlinear dynamics is the circular motion of a mass under the force of gravity (Fig. 1). A balance equation between the gravitational torque and the rate of change of angular momentum yields the nonlinear ordinary differential equation

$$\ddot{\theta} + (g/L)\sin\theta = 0, \tag{6}$$

where g is the gravitational acceleration and L the length of the pendulum. A standard approach to understanding the dynamics of this system is to analyze the stability of motion of the linearized equations about equilibrium positions.

FIGURE 1 (a) The classical pendulum under the force of gravity. (b) Phase plane sketch of motions of the pendulum showing solutions near the origin (center) and solutions near θ = ±π (saddle point).


Using the form of Eq. (2) or (3) one has

$$\dot{\theta} = \Omega, \qquad \dot{\Omega} = -\omega_0^2 \sin\theta, \tag{7}$$

where

$$\omega_0^2 \equiv g/L.$$

Equilibrium points of Eq. (3) are defined by $\mathbf{F}(\mathbf{x}_e) = 0$. In the example of the pendulum, $\mathbf{x} = (\theta, \Omega)$ and $\theta_e = \pm m\pi$, $\Omega_e = 0$. Because the torque is periodic in θ, we can restrict θ to −π < θ ≤ π. In a linearized analysis, we define a perturbation variable $\varphi = \theta - \theta_e$ so that sin θ is replaced by ±ϕ, depending on whether $\theta_e = 0$ or π. About $\theta_e = 0$, one finds that the linearized motion is oscillatory [i.e., $\theta(t) = A \sin(\omega_0 t + B)$, where A and B are determined from initial conditions]. The motion in the phase plane (θ, Ω) takes the form of an elliptic orbit with clockwise rotation (Fig. 1). Such motion is known as a center. The equilibrium at $\theta_e = \pm\pi$ can be shown to be unstable; it is known as a saddle, with trajectories that are also shown in Figure 1. (One should note that the saddles at $\theta_e = \pm\pi$ are physically the same.) Using the conservation of energy, one can show that the phase plane motions of the linearized system qualitatively represent those of the nonlinear system. These local qualitative pictures of the nonlinear phase plane motion can often be pieced together to form a global picture, as in Figure 1. The trajectory separating the inner orbits (libration) from the outer or rotary orbits is known as a separatrix. For small motions the period of oscillation is $2\pi/\omega_0$, or $2\pi (L/g)^{1/2}$. However, the period of libration increases with increasing amplitude and approaches infinity as the orbit approaches the separatrix. The dependence of the free oscillation period or frequency on the amplitude is characteristic of nonlinear systems.
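The growth of the libration period with amplitude can be checked numerically. The sketch below does not come from this article: it uses the standard closed form for a pendulum released from rest at angle $\theta_0$, $T = 2\pi / [\omega_0\, \mathrm{AGM}(1, \cos(\theta_0/2))]$, where AGM is the arithmetic-geometric mean. The function names are assumptions.

```python
import math

def agm(a, b, n_iter=40):
    """Arithmetic-geometric mean of two non-negative numbers."""
    for _ in range(n_iter):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def libration_period(theta0, omega0=1.0):
    """Period of Eq. (6) for release from rest at 0 < theta0 < pi."""
    return 2.0 * math.pi / (omega0 * agm(1.0, math.cos(0.5 * theta0)))

small = libration_period(1e-3)            # close to the linear value 2*pi/omega0
quarter = libration_period(math.pi / 2)   # roughly 18% longer than the linear value
near_separatrix = libration_period(3.1)   # grows without bound as theta0 -> pi
```

As $\theta_0$ approaches π the second argument of the mean tends to zero, the mean itself tends to zero, and the period diverges, which is the behavior described for orbits approaching the separatrix.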

III. NONLINEAR RESONANCE

FIGURE 2 Phase plane motions for an oscillator with a nonlinear restoring force [Duffing's equation (8)]. (a) Hard spring problem, α, β > 0. (b) Soft spring problem, α > 0, β < 0. (c) Two-well potential problem, α < 0, β > 0.

A classical model for nonlinear effects in elastic mechanical systems is a mass on a spring with nonlinear stiffness. This model is represented by the differential equation (known as Duffing's equation)

$$\ddot{x} + 2\gamma\dot{x} + \alpha x + \beta x^3 = f(t). \tag{8}$$

This equation can also be used to describe certain nonlinear electrical circuits. When the linear damping term and external forcing are zero (i.e., γ = f = 0), the system is conservative and the nonlinear dynamics in the $(x, \dot{x} = y)$ phase plane can exhibit a number of different patterns of behavior, depending on the signs of α and β. When α, β > 0, the system has a single equilibrium point, a center, where the frequency of oscillation increases with amplitude. For α > 0 and β < 0, the frequency decreases with amplitude (i.e., the period increases as in the pendulum) and the motion is unbounded outside the separatrix. For α < 0 and β > 0, there are three equilibria: two stable and one unstable (a saddle), as in Figure 2c. Such motions represent the dynamics of a particle in a two-well potential.
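The three cases can be reproduced by locating the equilibria of the undamped, unforced form of Eq. (8) and inspecting the curvature of the potential $V(x) = \alpha x^2/2 + \beta x^4/4$ at each one. This is a minimal sketch under the assumption that α and β are both nonzero; the function name is hypothetical.

```python
import math

def duffing_equilibria(alpha, beta):
    """Equilibria of x'' + alpha*x + beta*x**3 = 0 and their phase-plane type."""
    points = [0.0]
    if alpha * beta < 0:                      # extra roots of alpha + beta*x**2 = 0
        r = math.sqrt(-alpha / beta)
        points += [-r, r]
    result = []
    for x in sorted(points):
        curvature = alpha + 3.0 * beta * x * x   # V''(x) at the equilibrium
        kind = "center" if curvature > 0 else "saddle"
        result.append((x, kind))
    return result

hard_spring = duffing_equilibria(1.0, 1.0)    # one center
soft_spring = duffing_equilibria(1.0, -1.0)   # center flanked by two saddles
two_well = duffing_equilibria(-1.0, 1.0)      # saddle flanked by two centers
```

A positive curvature corresponds to a potential minimum (center) and a negative curvature to a potential maximum (saddle), matching panels (a) to (c) of Figure 2.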

FIGURE 3 Nonlinear resonance for the hard spring problem: response amplitude versus driving frequency.

Forced vibration of the damped system [Eq. (8)] represents an important class of problems in engineering. If the input force is oscillatory (i.e., $f = f_0 \cos \omega t$), the response of the system x(t) can be a periodic, subharmonic, or chaotic function of time. A periodic output has the same frequency as the input, whereas a subharmonic motion has a period that is a multiple of the input period 2π/ω:

$$x(t) \sim A \cos[(n/m)\omega t + B], \tag{9}$$

where n and m are integers. When the motion is periodic, the classic phenomenon of hysteretic nonlinear resonance occurs, as in Figure 3. The output of the system has a different response for increasing versus decreasing forcing frequency in the vicinity of the linear natural frequency $\sqrt{\alpha}$. Also, the dotted curves in Figure 3 represent unstable motions that result in jumps in the response as frequency is increased or decreased. However, the output motion may not always be periodic, as Figure 3 implies, and may change to a subharmonic or chaotic motion depending on the parameters $(\gamma, \alpha, \beta, f_0, \omega)$. The multiplicity of possible solutions is not often pointed out in more classical treatments of nonlinear oscillations. Chaotic vibrations are discussed in the following.
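The multiple response amplitudes behind the hysteresis can be illustrated with the single-harmonic (harmonic balance) approximation, which is a standard textbook estimate and is not derived in this article. Assuming $x \approx A\cos(\omega t - \phi)$ in Eq. (8) gives $A^2[(\alpha - \omega^2 + \tfrac{3}{4}\beta A^2)^2 + (2\gamma\omega)^2] = f_0^2$. The sketch below counts the real amplitudes satisfying this relation; all parameter values and the function name are assumptions for illustration.

```python
import math

def response_amplitudes(omega, alpha=1.0, beta=1.0, gamma=0.05, f0=0.3,
                        u_max=10.0, n_grid=10000):
    """Amplitudes A solving the harmonic-balance relation, found via u = A**2."""
    def h(u):
        detuning = alpha - omega ** 2 + 0.75 * beta * u
        return u * (detuning ** 2 + (2.0 * gamma * omega) ** 2) - f0 ** 2

    roots = []
    du = u_max / n_grid
    for i in range(n_grid):
        a, b = i * du, (i + 1) * du
        if h(a) * h(b) < 0:                  # sign change brackets a root
            for _ in range(60):              # refine by bisection
                m = 0.5 * (a + b)
                if h(a) * h(m) <= 0:
                    b = m
                else:
                    a = m
            roots.append(math.sqrt(0.5 * (a + b)))
    return roots

below = response_amplitudes(0.5)     # single response amplitude
inside = response_amplitudes(1.5)    # three amplitudes: two stable, one unstable
above = response_amplitudes(2.5)     # single response amplitude again
```

For these assumed parameters the relation has three solutions in a band of driving frequencies above the linear natural frequency, which is where the jumps of Figure 3 occur; the approximation says nothing about subharmonic or chaotic responses.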

IV. SELF-EXCITED OSCILLATIONS: LIMIT CYCLES

Dynamic systems with both sources and sinks for energy comprise an important class of nonlinear phenomena. These include mechanical systems with relative motion between parts, fluid flow around solid objects, biochemical and chemical reactions, and circuits with negative resistance (created by active electronic devices such as operational amplifiers or feedback circuits), as shown in Figure 4. The source of energy may create an unstable spiral equilibrium point while the source of dissipation may limit the oscillation motion to a steady motion or closed orbit in the phase space, as shown in Figure 5. The classical model for this limit cycle phenomenon is the so-called Van der Pol equation given by

$$\ddot{x} - \gamma\dot{x}(1 - \beta x^2) + \omega_0^2 x = f(t). \tag{10}$$

FIGURE 4 Sources of self-excited oscillations. (a) Dry friction between a mass and a moving belt. (b) Aeroelastic forces on a vibrating airfoil. (c) Negative resistance in an active circuit element.

FIGURE 5 Phase plane portrait for a limit cycle oscillation. (a) Small γ [Eq. (10)]. (b) Relaxation oscillations, large γ [Eq. (10)].

When f(t) = 0, the system is called autonomous, and the origin is the only equilibrium point in the phase plane, that is, $(x, \dot{x} = y) = (0, 0)$. This point can be shown to be an unstable spiral when γ > 0. When γ is small, the limiting orbit in a set of normalized coordinates ($\beta = \omega_0^2 = 1$) is a circle of radius 2. As shown in Figure 5a, solutions inside the circle spiral out and onto the limit cycle while those outside spiral inward and onto the limit orbit. The frequency of the resulting periodic motion for $\beta = \omega_0 = 1$ is one radian per nondimensional time unit.

When γ is larger (e.g., γ ∼ 10), the motion takes a special form known as a relaxation oscillation, as shown in Figure 5b. It is periodic but is not sinusoidal, that is, it includes higher harmonics. The system exhibits sudden periodic shifts in motion.
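The small-γ result, a limit cycle of radius close to 2 that attracts orbits from both inside and outside, can be checked by direct integration of the unforced Eq. (10). This is a minimal sketch using a fourth-order Runge-Kutta step; the step size, run length, and function name are assumptions rather than values from the article.

```python
def van_der_pol_amplitude(x0, gamma=0.1, beta=1.0, omega0=1.0,
                          t_end=200.0, dt=0.01, t_tail=20.0):
    """Largest |x| over the last t_tail time units, starting from (x0, 0)."""
    def accel(x, y):
        # Eq. (10) with f(t) = 0, solved for the second derivative
        return gamma * y * (1.0 - beta * x * x) - omega0 ** 2 * x

    x, y = x0, 0.0
    n_steps = int(round(t_end / dt))
    n_tail = int(round(t_tail / dt))
    amplitude = 0.0
    for i in range(n_steps):
        k1x, k1y = y, accel(x, y)
        k2x, k2y = y + 0.5 * dt * k1y, accel(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = y + 0.5 * dt * k2y, accel(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = y + dt * k3y, accel(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        if i >= n_steps - n_tail:
            amplitude = max(amplitude, abs(x))
    return amplitude

from_inside = van_der_pol_amplitude(0.1)    # starts near the unstable origin
from_outside = van_der_pol_amplitude(4.0)   # starts outside the limit cycle
```

Both runs should settle to an amplitude near 2, independent of the starting point, which is what distinguishes a limit cycle from the amplitude-dependent orbits of a conservative oscillator.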

If periodic forcing is added to a self-excited system such as Eq. (10) [i.e., $f(t) = f_0 \cos \omega_1 t$], then more complicated motions can occur. Note that when a nonlinear system is forced, superposition of free and forced motion is not valid. Two important phenomena in forced, self-excited systems are mentioned here: entrained oscillation and combination or quasi-periodic oscillations. When the driving frequency is close to the limit cycle frequency, the output x(t) may become entrained at the driving frequency. For larger differences between driving and limit cycle frequencies, the output may be a combination of the two frequencies in the form

$$x = A_1 \cos \omega_0 t + A_2 \cos \omega_1 t. \tag{11}$$

When $\omega_0$ and $\omega_1$ are incommensurate (i.e., $\omega_0/\omega_1$ is an irrational number), the motion is said to be quasi-periodic, or almost periodic. Phase plane orbits of Eq. (11) are not closed when $\omega_0$ and $\omega_1$ are incommensurate.

V. STABILITY AND BIFURCATIONS

FIGURE 6 Bifurcation diagrams. (a) Pitchfork bifurcation, the transition from one to two stable equilibrium positions. (b) Hopf bifurcation, the transition from stable spiral to limit cycle oscillation.

The existence of equilibria or steady periodic solutions is not sufficient to determine if a system will actually behave that way. The stability of these solutions must also be checked. As parameters are changed, a stable motion can become unstable and new solutions may appear. The study of the changes in the dynamic behavior of systems as parameters are varied is the subject of bifurcation theory. Values of the parameters at which the qualitative or topological nature of the motion changes are known as critical or bifurcation values.

An example of a simple bifurcation is the equation for motion in a two-well potential [Eq. (8)]. Suppose we view α as a control parameter. Then in Eq. (8), the topology of the phase space flow depends critically on whether α < 0 or α > 0, as shown in Figure 2a and c, for zero damping and forcing. Thus α = 0 is known as the critical or bifurcation value. A standard bifurcation diagram plots the values of the equilibrium solution as a function of α (Fig. 6a) and is known as a pitchfork bifurcation. When damping is present, the diagram is still valid. In this case, one stable spiral is transformed into two stable spirals and a saddle as α decreases from positive to negative values.
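The damped pitchfork can be sketched by linearizing the unforced Eq. (8) about each equilibrium $x_e$, which gives the characteristic equation $s^2 + 2\gamma s + k = 0$ with local stiffness $k = \alpha + 3\beta x_e^2$. The code below is an illustration under the assumptions β > 0, γ > 0, and α ≠ 0; the function name and default values are hypothetical.

```python
import math

def classify_equilibria(alpha, beta=1.0, gamma=0.1):
    """Equilibria of x'' + 2*gamma*x' + alpha*x + beta*x**3 = 0 and their type."""
    if alpha < 0:
        r = math.sqrt(-alpha / beta)
        points = [-r, 0.0, r]
    else:
        points = [0.0]
    result = []
    for xe in points:
        k = alpha + 3.0 * beta * xe * xe      # local linear stiffness
        if k < 0:
            kind = "saddle"                   # real roots of opposite sign
        elif k > gamma * gamma:
            kind = "stable spiral"            # complex roots, negative real part
        else:
            kind = "stable node"              # real roots, both negative
        result.append((xe, kind))
    return result

before = classify_equilibria(1.0)     # alpha > 0: a single stable spiral
after = classify_equilibria(-1.0)     # alpha < 0: spiral, saddle, spiral
```

Sweeping α through zero with this function traces out the three branches of the pitchfork diagram in Figure 6a.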

A bifurcation for the emergence of a limit cycle in a physical system is shown in Figure 6b. This is sometimes known as a Hopf bifurcation. Here, the equilibrium point changes from a stable spiral or focus to an unstable spiral that limits onto a periodic orbit.

VI. FLOWS AND MAPS: POINCARÉ SECTIONS

An old technique for analyzing solutions to differential equations, developed by Poincaré around the turn of the 20th century, has now assumed greater importance in the modern study of dynamical systems. The Poincaré section is a method to transform a continuous dynamical process in time into a set of difference equations of the form of Eq. (4b), known in modern parlance as a map. The study of maps obtained from Poincaré sections of flows is based on the theory that certain topological features of the motion in time are preserved in the discrete time dynamics of maps.

To illustrate how a Poincaré section is obtained, imagine that a system of three first-order differential equations of the form of Eq. (3) has solutions that can be represented by continuous trajectories in the Cartesian space (x, y, z), where $x_1(t) = x$, $x_2(t) = y$, and $x_3(t) = z$ (Fig. 7). If the solutions are bounded, then the solution curve is contained within some finite volume in this space. We then choose some surface through which the orbits of the motion pierce. If a coordinate system is set up on this two-dimensional surface with coordinates (ξ, η), then the position of the (n + 1)th orbit penetration $(\xi_{n+1}, \eta_{n+1})$ is a function of the nth orbit penetration through the solution of the original set of differential equations.

A period-one orbit means that

$$\xi_{n+1} = \xi_n, \qquad \eta_{n+1} = \eta_n.$$

A period-m orbit is defined such that

$$\xi_{n+m} = \xi_n, \qquad \eta_{n+m} = \eta_n.$$

Such orbits in the map correspond to periodic and subharmonic motions in the original continuous motion. On the other hand, if the sequence of points in the map seems to lie on a closed curve in the Poincaré surface, the motion is termed quasi-periodic and corresponds to the sum of two time-periodic functions of different incommensurate frequencies, as in Eq. (11).

Motions whose Poincaré maps have either a finite set of points (periodic or subharmonic motion) or a closed curve of points are known as classical attractors. A motion with a set of Poincaré points that is not a classical attractor and that has certain fractal properties is known as a strange attractor. Strange attractor motions are related to chaotic motions and are defined as follows.

FIGURE 7 Poincaré section. Construction of a difference equation model (map) from a continuous dynamic model.


FIGURE 8 Experimental Poincaré map for chaotic motions of a particle in a two-well potential with periodic forcing and damping [Eq. (10)].

In certain periodically forced problems, there is a natural way to obtain a Poincaré section or map. Consider the damped mass with a nonlinear spring and time periodic force

$$\dot{x} = y, \qquad \dot{y} = -\gamma y - F(x) + f_0 \cos \omega t. \tag{12}$$

A Poincaré section can be obtained in this system by defining a third variable z = ωt, where 0 ≤ z < 2π, so that the system is converted to an autonomous system of equations using $\dot{z} = \omega$. We also connect the planes defined by z = 0 and z = 2π so that the motion takes place in a toroidal volume (Fig. 7). The Poincaré map is obtained by observing (x, y) at a particular phase of the forcing function. This represents a stroboscopic picture of the motion. Experimentally, one can perform the phase plane trace at a particular phase $z = z_0$ on a storage oscilloscope (Fig. 8).
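The stroboscopic construction can be sketched numerically: integrate the autonomous system in (x, y, z) and record (x, y) once per forcing period. In the sketch below the restoring force F(x) is assumed to take the two-well form used later in Eq. (18), and the integrator, step count, parameter values, and function names are illustrative assumptions.

```python
import math

def restoring_force(x):
    """F(x) in Eq. (12); the two-well form of Eq. (18) is assumed here."""
    return -0.5 * x * (1.0 - x * x)

def poincare_points(gamma, f0, omega, n_periods, x0=1.0, y0=0.0,
                    z0=0.0, steps_per_period=200):
    """Sample (x, y) each time the forcing phase z returns to z0."""
    def rhs(x, y, z):
        return y, -gamma * y - restoring_force(x) + f0 * math.cos(z), omega

    dt = 2.0 * math.pi / (omega * steps_per_period)
    x, y, z = x0, y0, z0
    points = []
    for _period in range(n_periods):
        for _step in range(steps_per_period):
            k1 = rhs(x, y, z)
            k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], z + 0.5 * dt * k1[2])
            k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], z + 0.5 * dt * k2[2])
            k4 = rhs(x + dt * k3[0], y + dt * k3[1], z + dt * k3[2])
            x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
            y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
            z += dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6.0
        z = z0                                # identify the planes z0 and z0 + 2*pi
        points.append((x, y))
    return points

# Weak forcing (assumed values): the motion settles onto a period-one orbit,
# so successive Poincare points converge to a single fixed point of the map.
pts = poincare_points(gamma=0.3, f0=0.01, omega=1.0, n_periods=60)
```

With stronger forcing the same routine would return a scattered set of points, and plotting them is the numerical analog of the oscilloscope trace in Figure 8.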

VII. ONE-DIMENSIONAL MAPS, BIFURCATIONS, AND CHAOS

A simple linear difference equation has the form

$$x_{n+1} = \lambda x_n. \tag{13}$$

This equation can be solved explicitly to obtain $x_n = A\lambda^n$, as the reader can check. The solution is stable (i.e., $|x_n| \to 0$ as $n \to \infty$) if |λ| < 1 and unstable if |λ| > 1. The linear equation [Eq. (13)] is often used as a model for population growth in chemistry and biology. A more realistic model, which accounts for a limitation of resources in a given species population, is the so-called logistic equation

$$x_{n+1} = \lambda x_n (1 - x_n). \tag{14}$$

FIGURE 9 Graphical solution to a first-order difference equation. The example shown is the parabolic or logistic map.

This is a nonlinear difference equation that has equilibrium points x = 0 and x = 1 − 1/λ. One can examine the stability of nonlinear maps in the same way as for flows by linearizing the right-hand side of Eq. (14) about the equilibrium or fixed points.
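Linearizing Eq. (14) about a fixed point $x^*$ gives a perturbation that is multiplied by $f'(x^*) = \lambda(1 - 2x^*)$ at each iteration, so the fixed point is stable when the magnitude of this slope is less than one, exactly as for Eq. (13). A minimal sketch, with hypothetical function names:

```python
def logistic(x, lam):
    """Right-hand side of Eq. (14)."""
    return lam * x * (1.0 - x)

def fixed_points(lam):
    """Solutions of x = lam*x*(1 - x), assuming lam != 0."""
    return [0.0, 1.0 - 1.0 / lam]

def multiplier(x_star, lam):
    """Slope of the map at a fixed point."""
    return lam * (1.0 - 2.0 * x_star)

def is_stable(x_star, lam):
    return abs(multiplier(x_star, lam)) < 1.0

# For lam = 2.5 the nonzero fixed point 0.6 is stable (slope -0.5);
# for lam = 3.2 it has lost stability (slope -1.2).
stable_case = is_stable(fixed_points(2.5)[1], 2.5)
unstable_case = is_stable(fixed_points(3.2)[1], 3.2)
```

The slope at the nonzero fixed point equals 2 − λ, so it passes through −1 at λ = 3, which is where the first period doubling takes place.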

FIGURE 10 Possible solutions to the quadratic or parabolic map [Eq. (14)]. (a) Steady or period-one motion. (b) Period-two and period-four motions. (c) Chaotic motions.

FIGURE 11 Period-doubling bifurcation diagram for a first-order nonlinear difference equation [Eq. (14)].

The orbit of a one-dimensional map can be constructed graphically by reference to Figure 9, in which the (n + 1)th value is reflected about the identity orbit (straight line). An orbit consists of a sequence of points $x_n$ that can exhibit transient, periodic, or chaotic behavior, as shown in Figure 10. These properties of solutions can be represented by the bifurcation diagram in Figure 11, where λ is a control parameter. As λ is varied, periodic solutions change character to subharmonic orbits of twice the period of the previous orbit. The bifurcation values $\lambda_n$ of λ accumulate at a critical value at which nonperiodic orbits appear. The sequence of values of λ at which period doubling occurs has been shown by Feigenbaum to satisfy the relation

$$\lim_{n\to\infty} \frac{\lambda_n - \lambda_{n-1}}{\lambda_{n+1} - \lambda_n} = 4.6692\ldots. \tag{15}$$
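The ratio in Eq. (15) can be estimated numerically for the logistic map. The sketch below swaps in a commonly used variant rather than locating the bifurcation values themselves: it finds the superstable parameter values, at which x = 1/2 lies on a cycle of period $2^k$, by Newton's method. Ratios of successive differences of these values approach the same limit. The function names and the seed value for the ratio are assumptions of this sketch.

```python
import math

def superstable(lam_guess, k, newton_steps=20):
    """Parameter at which x = 1/2 returns to itself after 2**k iterations."""
    lam = lam_guess
    for _ in range(newton_steps):
        x, dx = 0.5, 0.0                      # dx tracks d(x_n)/d(lam)
        for _ in range(2 ** k):
            dx = x * (1.0 - x) + lam * (1.0 - 2.0 * x) * dx
            x = lam * x * (1.0 - x)
        lam -= (x - 0.5) / dx                 # Newton update on x_N(lam) = 1/2
    return lam

def feigenbaum_estimates(k_max=7, delta_seed=4.7):
    """Superstable parameters and successive ratio estimates."""
    lams = [2.0, 1.0 + math.sqrt(5.0)]        # exact values for periods 1 and 2
    deltas = []
    delta = delta_seed                        # rough seed, used only for the first guess
    for k in range(2, k_max + 1):
        guess = lams[-1] + (lams[-1] - lams[-2]) / delta
        lam = superstable(guess, k)
        delta = (lams[-1] - lams[-2]) / (lam - lams[-1])
        lams.append(lam)
        deltas.append(delta)
    return lams, deltas

lams, deltas = feigenbaum_estimates()
```

Each new parameter value is predicted from the previous two using the latest ratio estimate, which keeps Newton's method close to the intended root; the estimates in `deltas` settle toward the value quoted in Eq. (15) as the period increases.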

These results have assumed great importance in the study of dynamic models in classical physics for two reasons. First, in many experiments, Poincaré sections of the dynamics often (but not always) reveal the qualities of a one-dimensional map. Second, Feigenbaum and others have shown that this period-doubling phenomenon is not only a prelude to chaos but is also universal when the one-dimensional map x → f(x) has at least one maximum or hump. Universal means that no matter what physical variable is controlled, it shows the same scaling properties as Eq. (15). This has been confirmed by many experiments in physics in solid- and fluid-state problems. However, when the underlying dynamics reveals a two-dimensional map, then the period-doubling route to chaos may not be unique.

A two-dimensional map that can be calculated directly from the principles of the dynamics of a ball bouncing on a vibrating platform under the force of gravity is shown in Figure 12a and b (Guckenheimer and Holmes have given a derivation). The difference equations are given by

$$x_{n+1} = (1 - \varepsilon)x_n + \kappa \sin y_n, \qquad y_{n+1} = y_n + x_{n+1}, \tag{16}$$

where $x_n$ is the velocity before impact, $y_n$ the time of impact normalized by the frequency of the vibrating table (i.e., y = ωt, modulo 2π), and κ is proportional to the amplitude of the vibrating table in Figure 12a. The parameter ε is proportional to the energy lost at each impact with the table. When the system is conservative, ε = 0, Eqs. (16) are essentially a Poincaré map of the continuous motion obtained by observing the phase (time) and velocity of impact when the ball hits the table. The first equation is a momentum balance relation before and after impact, whereas the second equation is found by integrating the free flight motion of the ball between impacts.
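The distinction between the conservative and dissipative cases of Eq. (16) shows up in the Jacobian determinant of the map, which works out to 1 − ε: areas in the (x, y) plane are preserved when ε = 0 and shrink at every iteration when ε > 0. The sketch below iterates the map and checks the determinant by finite differences; the function names and the sample point are assumptions.

```python
import math

def bounce_map(x, y, eps, kappa):
    """One iteration of Eq. (16), without reducing y modulo 2*pi."""
    x_next = (1.0 - eps) * x + kappa * math.sin(y)
    y_next = y + x_next
    return x_next, y_next

def orbit(x0, y0, eps, kappa, n):
    """Iterate the map n times, reducing the phase y modulo 2*pi."""
    points = []
    x, y = x0, y0
    for _ in range(n):
        x, y = bounce_map(x, y, eps, kappa)
        y %= 2.0 * math.pi
        points.append((x, y))
    return points

def jacobian_determinant(x, y, eps, kappa, h=1e-6):
    """Central-difference estimate of the area scaling factor of the map."""
    xp, yp = bounce_map(x + h, y, eps, kappa)
    xm, ym = bounce_map(x - h, y, eps, kappa)
    dxdx, dydx = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    xp, yp = bounce_map(x, y + h, eps, kappa)
    xm, ym = bounce_map(x, y - h, eps, kappa)
    dxdy, dydy = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    return dxdx * dydy - dxdy * dydx

conservative = jacobian_determinant(0.3, 1.1, eps=0.0, kappa=1.0)
dissipative = jacobian_determinant(0.3, 1.1, eps=0.4, kappa=6.0)
```

Plotting the output of `orbit` for many initial conditions with ε = 0, or for one long run with ε = 0.4 and κ near 6, should give pictures of the kind described for Figure 12c and b.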

These equations have also been used to model an electron in an electromagnetic field. This map is sometimes known as the standard map.

FIGURE 12 Dynamics of the second-order standard maps [Eq. (16)]. (a) Physical model of a ball bouncing on a vibrating table. (b) Iterations of the map with dissipation ε = 0.4, κ ∼ 6 [Eq. (16)], showing fractal structure characteristic of strange attractors. (c) Iteration of the map for many different initial conditions showing regular and stochastic motions (no dissipation ε = 0, κ ∼ 1).

In this problem, one can compare the difference between chaos in a conservative system (ε = 0) and chaos in a system for which there is dissipation. When ε = 0 (Fig. 12c), the map shows there are periodic orbits (fixed points in the map) and quasi-periodic motions, as evidenced by the closed map orbits. Islands of chaos exist for initial conditions starting near the saddle points of the map. In the dissipative case (Fig. 12b), the chaotic orbit shows a characteristic fractal structure, but the forcing amplitude κ needed to obtain chaotic motion is much larger than that required for chaos in the conservative case ε = 0.

VIII. FRACTALS AND CHAOTIC VIBRATIONS

One of the remarkable discoveries in nonlinear dynamics in recent years is the existence of randomlike solutions to deterministic differential equations and maps. Stochastic motions in nondissipative or conservative systems were known around the time of Poincaré. However, the discovery of such motions in problems with damping or dissipation was a surprise to many theorists and has led to experimental observations of chaotic phenomena in many areas of classical physics. Technically, a chaotic motion is one in which the solution is extremely sensitive to initial conditions, so much so that trajectories in phase space starting from neighboring initial conditions diverge exponentially from one another on the average. In a flow, this divergence of trajectories can take place only in a phase space of at least three dimensions. In a map, however, one can have chaotic behavior in a first-order nonlinear difference equation, as described in the logistic map example of Eq. (14).

In the dissipative standard map for the bouncing ball [Eq. (16)], chaotic solutions exist when impact energy is lost (ε > 0) for κ ∼ 6. A typical long iterative map of such a solution is shown in Figure 12b. The iterates appear to occur randomly along the sets of parallel curves. If this solution is looked at with a finer grid, this parallel structure continues to appear. The occurrence of self-similar structure at finer and finer scales in this set of points is called fractal. Fractal structure in the Poincaré map is typical of chaotic attractors.

IX. FRACTAL DIMENSION

FIGURE 13 Definition of fractal dimension of a set of points in terms of the number of covering cubes N(ε).

A quantitative measure of the fractal property of strange attractors is the fractal dimension. This quantity is a measure of the degree to which a set of points covers some integer n-dimensional subspace of phase space. There are many definitions of this measure. An elementary definition is called the capacity dimension. One considers a large number of points in an n-dimensional space and tries to cover these points with a set of N hypercubes of size ε. If the points were uniformly distributed along a linear curve, the number of cubes required to cover the set would vary as N ∼ 1/ε (Fig. 13). If the points were distributed on a two-dimensional surface, then $N \sim \varepsilon^{-2}$. When the points are not uniformly distributed, one might find that $N \sim \varepsilon^{-d}$. If this behavior continues as ε → 0 and the number of points increases, then we define the capacity dimension as

$$d = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log(1/\varepsilon)}. \tag{17}$$
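Eq. (17) can be tried on a set whose dimension is known exactly. The sketch below uses the middle-third Cantor set, an example that is not part of this article, and counts covering boxes of size $3^{-k}$; points are stored as integers so that the box counts are exact. The function names and the chosen levels are assumptions.

```python
import math

def cantor_points(level):
    """Left endpoints of the 2**level Cantor intervals, in units of 3**-level."""
    points = [0]
    for _ in range(level):
        points = [3 * p for p in points] + [3 * p + 2 for p in points]
    return points

def box_count(points, level, k):
    """Number of boxes of size 3**-k (k <= level) that contain a point."""
    scale = 3 ** (level - k)
    return len({p // scale for p in points})

def capacity_dimension(level=10, k=8):
    """Finite-resolution estimate of Eq. (17) with box size 3**-k."""
    n_boxes = box_count(cantor_points(level), level, k)
    return math.log(n_boxes) / math.log(3 ** k)

d_estimate = capacity_dimension()
```

Because the number of occupied boxes doubles each time the box size is divided by three, the estimate equals log 2 / log 3 ≈ 0.63, a noninteger value between that of a point set and that of a line.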

Other definitions exist that attempt to measure fractal properties, such as the information dimension and the correlation dimension. In the latter, one chooses a sphere or hypersphere of size ε and counts the number of points of the set in the sphere. When this is repeated for every point in the set, the sum is called the correlation function C(ε). If the set is fractal, then $C \sim \varepsilon^{d_c}$ or $d_c = \lim(\log C / \log \varepsilon)$, as ε → 0. It has been found that $d_c \leq d$, where d is the capacity dimension.

As an example, consider the chaotic dynamics of a particle in a two-well potential. The equation of motion is given by

$$\ddot{x} + \gamma\dot{x} - \tfrac{1}{2}x(1 - x^2) = f_0 \cos \omega t. \tag{18}$$

This is a version of the Duffing equation [Eq. (8)]. It has been shown by Holmes of Cornell University that for chaos to exist, the amplitude of the forcing function must be greater than a critical value, that is,

$$f_0 > \frac{\gamma\sqrt{2}\,\cosh(\pi\omega/\sqrt{2})}{3\pi\omega}. \tag{19}$$

The regions of chaos in the parameter plane ($f_0$, ω) are shown in Figure 14, as determined by numerical experiments. The above criterion gives a good lower bound. Equation (18) is a model for the vibrations of a buckled beam. Experiments with chaotic vibrations of a buckled beam show good agreement with this criterion [Eq. (19)].
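The criterion of Eq. (19) is simple to evaluate. The following sketch computes the threshold forcing amplitude for given damping and driving frequency; the function name and the sample parameter values are assumptions chosen for illustration.

```python
import math

def holmes_threshold(gamma, omega):
    """Right-hand side of Eq. (19): lower bound on f0 for chaos in Eq. (18)."""
    root2 = math.sqrt(2.0)
    return gamma * root2 * math.cosh(math.pi * omega / root2) / (3.0 * math.pi * omega)

# Sample values (assumed): light damping, driving frequency of one.
f_crit = holmes_threshold(gamma=0.1, omega=1.0)
```

For these sample values the threshold is about 0.07. The bound is proportional to γ, so doubling the damping doubles the forcing amplitude required, and the hyperbolic cosine makes it grow rapidly at high driving frequencies.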

FIGURE 14 Regions of chaotic and regular motion for the two-well potential problem [Eq. (18)].

The fractal dimension of the Poincaré map of the chaotic motions of Eq. (18) depends on the damping γ. When γ is large (γ ∼ 0.5), the dimension $d_c \sim 1.1$, and when γ is small (γ ∼ 0.01), $d_c \sim 1.9$. The fractal dimension of the set of points in Figure 15a is close to $d_c = 1.5$. This means that the points do not cover the two-dimensional plane. This is evidenced by the voidlike structure of this chaotic map as it is viewed at finer and finer scales.

X. LYAPUNOV EXPONENTS AND CHAOTIC DYNAMICS

FIGURE 15 Poincaré maps of chaotic motions of the two-well potential problem with fractal dimensions of (a) d = 1.5 and (b) d = 1.1 for two different damping ratios.

A measure of the sensitivity of dynamical motion to changes in initial conditions is the Lyapunov exponent. Thus, if two trajectories start close to one another, the distance d(t) between the two orbits increases exponentially in time for small times, that is,

$$d(t) = d_0\, 2^{\lambda t}. \tag{20}$$

When λ is averaged over many points along the chaotic trajectory, λ is called the Lyapunov exponent. In a chaotic motion λ > 0, whereas for regular motion λ ≤ 0.
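For a one-dimensional map the averaging described above reduces to the mean of $\log_2 |f'(x_n)|$ along an orbit, with time measured in iterations. The sketch below applies this to the logistic map of Eq. (14); the function name, starting point, and iteration counts are assumptions.

```python
import math

def lyapunov_exponent(lam, x0=0.3, n_transient=1000, n_iter=50000):
    """Average of log2|f'(x)| along an orbit of x -> lam*x*(1 - x)."""
    x = x0
    for _ in range(n_transient):              # discard the initial transient
        x = lam * x * (1.0 - x)
    total, count = 0.0, 0
    for _ in range(n_iter):
        slope = abs(lam * (1.0 - 2.0 * x))
        if slope > 0.0:                       # skip the isolated point x = 1/2
            total += math.log2(slope)
            count += 1
        x = lam * x * (1.0 - x)
    return total / count

chaotic = lyapunov_exponent(4.0)    # positive: nearby orbits separate
periodic = lyapunov_exponent(3.2)   # negative: orbits converge onto a 2-cycle
```

At λ = 4 (the map parameter, not the exponent) the estimate should be close to one, meaning that on average the separation of neighboring orbits doubles at each iteration in the sense of Eq. (20).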

In a two-dimensional map, one imagines a small circle of initial conditions about some point on the attractor (Fig. 16). If the radius of this circle is ε, then after several iterations of the map (say n), the circle may be mapped into an ellipse with principal axes of dimension $(2\varepsilon\mu_1^n, 2\varepsilon\mu_2^n)$; $\mu_1$, $\mu_2$ are called Lyapunov numbers, and the exponents are found from $\lambda_i = \log \mu_i$. If the system is dissipative, the area decreases after each iteration of the map (i.e., $\lambda_1 + \lambda_2 < 0$). If $\lambda_1 > \lambda_2$, then in a chaotic motion $\lambda_1 > 0$ and $\lambda_2 < 0$.

Thus, regions of phase space are stretched in one direction and contracted in another direction (Fig. 16). Iterating this stretching and contraction through the map eventually produces the fractal structure seen in Figure 15. Of importance to the understanding of such motions are concepts such as horseshoe maps and Cantor sets. Space does not permit a discussion of these ideas, but they may be found in several of the modern references on the subject.

The Lyapunov exponents ($\lambda_1$, $\lambda_2$) can be used to calculate another measure of fractal dimension called the Lyapunov dimension. For points in a plane, such as a two-dimensional Poincaré map, this measure is given by

$$d_L = 1 - \frac{\lambda_1}{\lambda_2} = 1 + \frac{\log \mu_1}{\log(1/\mu_2)}, \tag{21}$$

where $\lambda_1 > 0$, $\lambda_2 < 0$. This relation can be extended to higher dimensional maps.
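Eq. (21) translates directly into code. In the sketch below the function name and the sample Lyapunov numbers are hypothetical; they are chosen so that the map stretches by a factor of 2 and contracts by a factor of 4 at each iteration.

```python
import math

def lyapunov_dimension(mu1, mu2):
    """Eq. (21) for a planar map with Lyapunov numbers mu1 > 1 > mu2 > 0."""
    lam1, lam2 = math.log(mu1), math.log(mu2)
    if not (lam1 > 0.0 > lam2):
        raise ValueError("expected one stretching and one contracting direction")
    return 1.0 - lam1 / lam2

d_L = lyapunov_dimension(2.0, 0.25)
```

With these sample numbers the result is 1.5. Because only the ratio of the two exponents enters, the base of the logarithm does not affect the result.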

XI. THE LORENZ EQUATIONS: A MODEL FOR CONVECTION DYNAMICS

FIGURE 16 Sketch of sensitivity of motion to initial conditions as measured by Lyapunov exponents.

FIGURE 17 (a) Sketch of local motion near the three equilibria for the Lorenz equations [Eq. (22)], as a model for thermo-fluid convection (b).

As a final illustration of the new concepts in nonlinear dynamics, we consider a set of three equations proposed by Lorenz of MIT in 1963 as a crude model for thermal gradient induced fluid convection under the force of gravity. Such motions occur in oceans, atmosphere, home heating, and many engineering devices. In this model the variable x represents the amplitude of the fluid velocity stream function and y and z measure the time history of the temperature distribution in the fluid (a derivation has been given by Lichtenberg and Lieberman). In nondimensional form, these equations become

$$\dot{x} = \sigma(y - x), \qquad \dot{y} = rx - y - xz, \qquad \dot{z} = xy - bz. \tag{22}$$

These equations would be linear were it not for the two terms xz and xy in the second and third equations. For those familiar with fluid mechanics, σ is a Prandtl number and r is similar to a Rayleigh number. The parameter b is a geometric factor. If $(\dot{x}, \dot{y}, \dot{z}) \equiv \mathbf{v}$ were to represent a velocity vector in phase space, then the divergence $\nabla \cdot \mathbf{v} = -(\sigma + b + 1) < 0$. This implies that a volume of initial conditions decreases as the motion moves in time. For σ = 10, b = 8/3 (a favorite set of parameters for experts in the subject), there are three equilibria for r > 1, with the origin an unstable saddle (Fig. 17a). When r ≳ 25, the other two equilibria become unstable spirals (Fig. 17a) and a complex chaotic trajectory moves between regions near all three equilibria, as shown in Figure 18.

An appropriate Poincaré map of this flow shows it to be nearly a one-dimensional map with period-doubling behavior as r is increased close to the critical value of r ≈ 25. The fractal dimension of this attractor has been found to be 2.06, which indicates that the motion lies close to a two-dimensional surface.
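The behavior described for Eq. (22) can be explored numerically. The following Python sketch integrates the Lorenz equations with a classical fourth-order Runge-Kutta step, evaluates the two nonzero equilibria (obtained by setting the right-hand sides of Eq. (22) to zero), and follows two trajectories that start 10⁻⁸ apart. The value r = 28, the step size, the initial conditions, and all function names are assumptions made for illustration; the article specifies only σ = 10, b = 8/3, and r greater than about 25.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # r = 28 is an assumed value above ~25


def lorenz_rhs(state, sigma=SIGMA, r=R, b=B):
    """Right-hand side of Eq. (22)."""
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * bb + 2.0 * c + d)
        for s, a, bb, c, d in zip(state, k1, k2, k3, k4)
    )


def nontrivial_equilibria(r=R, b=B):
    """x = y = +/- sqrt(b (r - 1)), z = r - 1, valid for r > 1."""
    c = math.sqrt(b * (r - 1.0))
    return [(c, c, r - 1.0), (-c, -c, r - 1.0)]


def flow_divergence(sigma=SIGMA, b=B):
    """Divergence of the phase-space velocity: -(sigma + b + 1)."""
    return -(sigma + b + 1.0)


def integrate(state, dt=0.01, steps=5000):
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(lorenz_rhs, state, dt)
        trajectory.append(state)
    return trajectory


traj_a = integrate((1.0, 1.0, 1.0))
traj_b = integrate((1.0 + 1e-8, 1.0, 1.0))  # tiny change in the initial state
max_separation = max(math.dist(p, q) for p, q in zip(traj_a, traj_b))
print("divergence:", flow_divergence())
print("largest separation of the two trajectories:", max_separation)
```

The negative divergence reflects the contraction of phase-space volume, while the growth of the separation between the two runs illustrates the sensitivity to initial conditions sketched in Fig. 16.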

XII. SPATIOTEMPORAL DYNAMICS: SOLITONS

Nonlinear dynamics models can be used to study spatially extended systems such as acoustic waves, electrical transmission problems, plasma waves, and so forth. These problems have been modeled by using a linear chain of discrete oscillators with nearest neighbor coupling as shown in Figure 19. Of course, the limit of such models is continuum physics, such as acoustics or fluid mechanics, for which one uses partial differential equations in space and time. When the coupling between the oscillators is nonlinear, then a phenomenon known as soliton wave dynamics can occur. Solitons are pulse-like waves that can propagate along the linear chain. Left- and right-moving waves can intersect and emerge as left and right waves without distortion (see Fig. 20). One example is the so-called Toda lattice, in which the interparticle force is assumed to vary exponentially. (See Toda, 1989.)

$$\ddot{x}_j + F(x_j - x_{j-1}) + F(x_j - x_{j+1}) = 0,$$

$$F(x) = \beta[\exp(bx) - 1].$$

FIGURE 18 Trajectories of chaotic solution to the Lorenz equations for thermo-fluid convection.

A classic problem of dynamics on a finite particle chain was posed by the famous physicist Enrico Fermi and two colleagues at Los Alamos in the 1950s. They had expected that if energy were placed in one spatial mode, then the nonlinear coupling would disperse the waves into the N classic vibration modes. However, to their surprise, most of the energy stayed in the first spatial mode and eventually, after a finite time, all the energy returned to the original initial energy mode. This phenomenon, the recurrence problem in finite degree of freedom systems, is known as the Fermi–Pasta–Ulam problem. (See Gaponov-Grekhov and Rabinovich, 1992.)

FIGURE 19 Chain of coupled nonlinear oscillators.

FIGURE 20 Soliton wave dynamics along a chain of nonlinear oscillators.

However, it is also now recognized that a different set of initial conditions could result in spatiotemporal stochasticity or spatiotemporal chaos. (See, e.g., Moon, 1992.)

XIII. CONTROLLING CHAOS

One of the most inventive ideas to come out of modern nonlinear dynamics is the control of chaos. It is based on the concept that a system with a chaotic attractor may be used as a source of controlled periodic motions. This idea originated in the work of three University of Maryland researchers in 1990: E. Ott, C. Grebogi, and J. Yorke (OGY). (See Kapitaniak, 1996, for a review of this subject.) This example of nonlinear thinking has resulted in the design of systems to control chaotic modulation of lasers, systems to control heart arrhythmias, and circuits to encrypt and decode information.

FIGURE 21 Poincaré map amplitude versus time for chaotic and controlled chaos of a period-four orbit for a two-well potential nonlinear oscillator.

The idea is based on several premises.

1. The nonlinear system has a chaotic or strange attractor.

2. The strange attractor is robust under some variation of a control parameter.

3. There exists an infinite number of unstable periodic orbits in the strange attractor.

4. There exists a control law that will locally stabilize the unstable motion in the vicinity of the saddle points of the orbit map.

There are many variations of this OGY method, some based on the analysis of the underlying nonlinear map and some based on experimental techniques. An example of controlled dynamics is shown in Figure 21 from the author’s laboratory. The vertical scale shows the Poincaré map sampled output of a vibrating nonlinear elastic beam. The horizontal scale shows the time. The figure shows first a chaotic signal; then, when control is initiated, a period-four orbit appears; then chaos returns when the control is switched off. The control consists of a pulsed magnetic force on the beam. The control force is only active for a fraction of the period. The method uses the inherent parameter sensitivity of the underlying chaotic attractor to achieve control.
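The premises listed above can be illustrated on a system far simpler than the vibrating beam. The Python sketch below applies an OGY-style rule to the one-dimensional logistic map x → r x (1 − x): the orbit wanders chaotically until it falls within a small window around the unstable fixed point, and only then is the parameter r nudged by a small amount chosen to cancel the linearized deviation. This is a simplified one-dimensional variant written for illustration, not the algorithm used in the author's experiment; the map, the nominal value r = 3.9, the window size, and all names are assumptions.

```python
def logistic(x, r):
    """One iteration of the logistic map."""
    return r * x * (1.0 - x)


def run(r0=3.9, x0=0.3, steps=20000, window=0.005, control=True):
    """Iterate the map; optionally stabilize its unstable fixed point.

    Returns the fixed point and a list of (x, dr) pairs, where dr is the
    parameter perturbation applied on that step.
    """
    x_star = 1.0 - 1.0 / r0                # fixed point of the unperturbed map
    slope = 2.0 - r0                       # df/dx at x_star (|slope| > 1: unstable)
    sensitivity = x_star * (1.0 - x_star)  # df/dr at x_star
    x = x0
    history = []
    for _ in range(steps):
        dx = x - x_star
        dr = 0.0
        if control and abs(dx) < window:
            # Choose dr so that slope*dx + sensitivity*dr = 0 to first order.
            dr = -slope * dx / sensitivity
        x = logistic(x, r0 + dr)
        history.append((x, dr))
    return x_star, history


x_star, controlled = run(control=True)
_, free = run(control=False)

tail_controlled = [x for x, _ in controlled[-100:]]
tail_free = [x for x, _ in free[-100:]]
largest_kick = max(abs(dr) for _, dr in controlled)

print("fixed point:", x_star)
print("spread of last 100 iterates, controlled:", max(tail_controlled) - min(tail_controlled))
print("spread of last 100 iterates, uncontrolled:", max(tail_free) - min(tail_free))
print("largest parameter perturbation used:", largest_kick)
```

The design point is that the perturbation stays small (bounded by the window size) and is zero most of the time before capture, which mirrors the statement that the control force is active only for a fraction of the period.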

XIV. CONCLUSION

Dynamics is the oldest branch of physics. Yet, 300 years after the publication of Newton’s Principia (1687), new discoveries are still emerging. The ideas of Newton, Euler, Lagrange, Hamilton, and Poincaré, once conceived in the context of orbital mechanics of the planets, have now spread across all areas of physics and even biology. Just as the new science of dynamics in the seventeenth century gave birth to a new mathematics (namely, the calculus), so have the recent discoveries in chaotic dynamics ushered in modern concepts in geometry and topology (such as fractals), which the 21st century novitiate in dynamics must master to grasp the subject fully. The bibliography lists only a small sample of the literature in dynamics. However, the author hopes that it will give the interested reader some place to start the exciting journey into the field of nonlinear dynamics.

SEE ALSO THE FOLLOWING ARTICLES

CHAOS • DYNAMICS OF ELEMENTARY CHEMICAL REACTIONS • FLUID DYNAMICS • FRACTALS • MATHEMATICAL MODELING • MECHANICS, CLASSICAL • NONLINEAR PROGRAMMING • VIBRATION, MECHANICAL

BIBLIOGRAPHY

Abraham, R. H., and Shaw, C. D. (1983). “Dynamics: The Geometry of Behavior,” Parts 1–3. Aerial Press, Santa Cruz, CA.

Gaponov-Grekhov, A. V., and Rabinovich, M. I. (1992). “Nonlinearities in Action,” Springer-Verlag, New York.

Guckenheimer, J., and Holmes, P. J. (1983). “Nonlinear Oscillations; Dynamical Systems and Bifurcations of Vector Fields.” Springer-Verlag, New York.

Jackson, A. (1989). “Perspectives in Nonlinear Dynamics,” Vol. 1. Cambridge Univ. Press, New York.

Kapitaniak, T. (1996). “Controlling Chaos,” Academic Press, London.

Lichtenberg, A. J., and Lieberman, M. A. (1983). “Regular and Stochastic Motion,” Springer-Verlag, New York.

Minorsky, N. (1962). “Nonlinear Oscillations,” Van Nostrand, Princeton, NJ.

Moon, F. C. (1992). “Chaotic and Fractal Dynamics,” Wiley, New York.

Nayfeh, A. H., and Balachandran, B. (1993). “Nonlinear Dynamics,” Wiley, New York.

Schuster, H. G. (1984). “Deterministic Chaos,” Physik-Verlag GmbH, Weinheim, Federal Republic of Germany.

Strogatz, S. H. (1994). “Nonlinear Dynamics and Chaos,” Addison Wesley, Reading, MA.

Toda, M. (1989). “Theory of Nonlinear Lattices,” Springer-Verlag, Berlin.

Encyclopedia of Physical Science and Technology EN012F-590 July 26, 2001 10:59

Polarization and Polarimetry

Kent Rochford
National Institute of Standards and Technology

I. Polarization States
II. Polarizers
III. Retarders
IV. Mathematical Representations
V. Polarimetry

GLOSSARY

Birefringence The property of optically anisotropic materials, such as crystals, of having the phase velocity of propagation dependent on the direction of propagation and polarization. Numerically, birefringence is the refractive index difference between eigenpolarizations.

Diattenuation The property of having optical transmittance depend on the incident polarization state. In diattenuators, the eigenpolarizations will have principal transmittances Tmax and Tmin, and diattenuation is quantified as (Tmax − Tmin)/(Tmax + Tmin). Diattenuation may occur during propagation when absorption coefficients depend on polarization (also called dichroism) or at interfaces.

Eigenpolarization A polarization state that propagates unchanged through optically anisotropic materials. Eigenpolarizations are orthogonal in homogeneous polarization elements.

Jones calculus A mathematical treatment for describing fully polarized light. Light is represented by 2 × 1 complex Jones vectors and polarization components as 2 × 2 complex Jones matrices.

Mueller calculus A mathematical treatment for describing completely, partially, or unpolarized light. Light is represented by the 4 × 1 real Stokes vector and polarization components as 4 × 4 real Mueller matrices.

Polarimetry The measurement of the polarization state of light or the polarization properties (retardance, diattenuation, and depolarization) of materials.

Polarized light A light wave whose electric field vector traces a generally elliptical path. Linear and circular polarizations are special cases of elliptical polarization. In general, light is partially polarized, and is a mixture of polarized light and unpolarized light.

Polarizer A device with diattenuation approaching 1 that transmits one unique polarization state regardless of incident polarization.

Retardance The optical phase shift between two eigenpolarizations.

Unpolarized light Light of finite spectral width whose instantaneous polarization randomly varies over all states during the detection time. Not strictly a polarization state of light.

THE POLARIZATION state is one of the fundamental characteristics (along with intensity, wavelength, and coherence) required to describe light. The earliest recorded observation of polarization effects was reported by Bartholinus, who observed double refraction in calcite in 1669. Huygens demonstrated the concept of polarization by passing light through two calcite crystals in 1690. Today, the measurement, manipulation, and control of polarization play an important role in optical sciences.

I. POLARIZATION STATES

Light can be represented as an electromagnetic wave that satisfies Maxwell’s equations. A transverse electromagnetic wave has electric and magnetic field components that are orthogonal to the direction of propagation. As the wave propagates, the strengths of these transverse fields oscillate in space and time, and the polarization state is defined by the direction of the electric field vector E.

For our discussion, we will use a right-handed Cartesian coordinate system with orthogonal unit vectors x̂, ŷ, and ẑ. A monochromatic plane wave E(z, t) traveling in vacuum along the z direction with time t can be written as

$$\mathbf{E}(z, t) = \mathrm{Re}\left\{ \hat{\mathbf{x}} E_x \exp[i(\omega t - k_0 z + \phi_x)] + \hat{\mathbf{y}} E_y \exp[i(\omega t - k_0 z + \phi_y)] \right\} \tag{1a}$$

or

$$\mathbf{E}(z, t) = \hat{\mathbf{x}} E_x \cos(\omega t - k_0 z + \phi_x) + \hat{\mathbf{y}} E_y \cos(\omega t - k_0 z + \phi_y), \tag{1b}$$

where ω is the angular optical frequency and Ex and Ey are the electric field amplitudes along the x and y axes, respectively. The free-space wavenumber is k0 = 2π/λ for wavelength λ, and φx and φy are absolute phases. The difference in phase between the two component fields is then φ = φy − φx. The direction of E and the polarization of the wave depend on the field amplitudes Ex and Ey and the phases φx and φy.

A. Linear Polarization

A wave is linearly polarized if an observer looking along the propagation axis sees the tip of the oscillating electric field vector confined to a straight line. Figure 1 depicts the wave propagation for two different linear polarizations when Eq. (1b) is plotted for φx = φy = 0. In one example in Fig. 1, Ey = 0 and the light is linearly polarized along the x axis; in the other, Ex = 0 and the light is polarized along the y axis.

For a field represented by Eqs. (1a) and (1b), light will be linearly polarized whenever φ = mπ, where m is an integer; the direction of linear polarization depends on the magnitudes of Ex and Ey. For example, if Ex = Ey, the vector sum of these orthogonal fields yields a wave polarized at 45° from the x axis. If Ex = −Ey (or if Ex = Ey and φ = π), the light is linearly polarized at −45°. For in-phase component fields (φ = 0), the linear polarization is oriented at an angle α = tan⁻¹(Ey/Ex) with respect to the x axis.

FIGURE 1 Two linearly polarized waves. The electric field vector of x-polarized light oscillates in the xz plane. The shaded wave is y-polarized light in the yz plane.

In general, linear polarization states are often defined by an orientation angle, though descriptive terms such as x- or y-polarized, or vertical or horizontal, may be used. However, when a wave is incident upon a boundary, two specific linearly polarized states are defined. The plane of incidence (Fig. 2) is the plane containing the incident ray and the boundary normal. The linear polarization in the plane of incidence is called p-polarization and the field component perpendicular to the plane is s-polarized. This convention is used with the Fresnel equations (Section II.A) to determine the transmittance, reflectance, and phase shift when light encounters a boundary.

FIGURE 2 Light waves at a boundary. The plane of incidence coincides with the plane of the page. Incident, reflected, and transmitted p-polarized waves are in the plane of incidence. The corresponding s-polarizations (not shown) would be perpendicular to the plane of incidence.

FIGURE 3 The electric field propagation for right-circular polarization, Eq. (2), when t = 0. At a fixed time, the tip of the electric field vector traces a right-handed corkscrew as the wave propagates along the +z direction.

B. Circular Polarization

Another special case occurs when Ex = Ey = E0 and the field components have a 90° relative phase difference [φ = (m + 1/2)π]. If φ = π/2, Eq. (1b) becomes

$$\begin{aligned} \mathbf{E}_{\mathrm{rcp}} &= E_0[\hat{\mathbf{x}} \cos(\omega t - k_0 z) + \hat{\mathbf{y}} \cos(\omega t - k_0 z + \pi/2)] \\ &= E_0[\hat{\mathbf{x}} \cos(\omega t - k_0 z) - \hat{\mathbf{y}} \sin(\omega t - k_0 z)]. \end{aligned} \tag{2}$$

As the wave advances through space the magnitude of Ercp is constant but the tip of this electric field vector traces a circular path about the propagation axis at a frequency ω. A wave with this behavior is said to be right-circularly polarized.

Figure 3 shows the electric field vector for right-circular polarization when viewed at a fixed time (t = 0); here the field will trace a right-handed spiral in space. An observer looking toward the origin from a distant point (z > 0) would see the vector tip rotating counterclockwise as the field travels along z. In contrast, the same observer looking at a right-circularly polarized field at a fixed position (for example, z = 0) would see the vector rotation trace out a clockwise circle in the xy plane as time advances. This difference in the sense of rotation between space and time is often a source of confusion, and depends on notation (see Section I.F).

When light is left-circularly polarized the field traces out a left-handed spiral in space at a fixed time and a counterclockwise circle in time at a fixed position. Equation (1b) describes left-circular polarization when Ex = Ey = E0 and φ = −π/2:

$$\begin{aligned} \mathbf{E}_{\mathrm{lcp}} &= E_0[\hat{\mathbf{x}} \cos(\omega t - k_0 z) + \hat{\mathbf{y}} \cos(\omega t - k_0 z - \pi/2)] \\ &= E_0[\hat{\mathbf{x}} \cos(\omega t - k_0 z) + \hat{\mathbf{y}} \sin(\omega t - k_0 z)]. \end{aligned} \tag{3}$$

Right- and left-circular polarizations are orthogonal states and can be used as a basis pair for representing other polarization states, much as orthogonal linear states are combined to create circular polarization. Adding equal amounts of right- and left-circularly polarized light will yield a linearly polarized state. For example,

$$\tfrac{1}{2}\mathbf{E}_{\mathrm{rcp}} + \tfrac{1}{2}\mathbf{E}_{\mathrm{lcp}} = \hat{\mathbf{x}} E_0 \cos(\omega t - k_0 z). \tag{4}$$

In contrast, adding equal quantities of left- and right-circular polarization that are out of phase [by adding an additional π phase to both component fields in Eq. (2)] yields

$$-\tfrac{1}{2}\mathbf{E}_{\mathrm{rcp}} + \tfrac{1}{2}\mathbf{E}_{\mathrm{lcp}} = \hat{\mathbf{y}} E_0 \sin(\omega t - k_0 z). \tag{5}$$

In general, equal amounts of left- and right-circular polarization combine to produce a linear polarization with an azimuthal angle equal to half the phase difference.
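The sums in Eqs. (4) and (5) are easy to check numerically. The short Python sketch below represents each field by its (x, y) components as a function of the phase ωt − k0z and forms the two combinations; the function names and the sampled phase values are illustrative assumptions.

```python
import math


def e_rcp(theta, e0=1.0):
    """Right-circular field of Eq. (2); theta stands for omega*t - k0*z."""
    return (e0 * math.cos(theta), -e0 * math.sin(theta))


def e_lcp(theta, e0=1.0):
    """Left-circular field of Eq. (3)."""
    return (e0 * math.cos(theta), e0 * math.sin(theta))


def combine(a, field_a, b, field_b):
    """Weighted sum a*field_a + b*field_b of two (Ex, Ey) pairs."""
    return (a * field_a[0] + b * field_b[0], a * field_a[1] + b * field_b[1])


for k in range(8):
    theta = 0.4 * k
    in_phase = combine(0.5, e_rcp(theta), 0.5, e_lcp(theta))       # Eq. (4)
    out_of_phase = combine(-0.5, e_rcp(theta), 0.5, e_lcp(theta))  # Eq. (5)
    magnitude = math.hypot(*e_rcp(theta))  # constant for circular polarization
    print(f"{theta:4.1f}  {in_phase}  {out_of_phase}  {magnitude:.3f}")
```

At every phase the first combination should have a vanishing y component and the second a vanishing x component, that is, linear polarization along x and along y, respectively.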

C. Elliptical Polarization

For elliptically polarized light the electric field vector rotates at ω but varies in amplitude so that the tip traces out an ellipse in time at a fixed position z. Elliptical polarization is the most general state and linear and circular polarizations are simply special degenerate forms of elliptically polarized light. Because of this generality, attributes of this state can be applied to all polarization states.

The polarization ellipse (Fig. 4) can provide useful quantities for describing the polarization state. The azimuthal angle α of the semi-major ellipse axis from the x axis is given by

$$\tan(2\alpha) = \tan(2\beta) \cos(\phi), \tag{6}$$

where tan(β) = Ey/Ex and 0 ≤ β ≤ π/2, so that α = β for in-phase fields (φ = 0). The ellipticity tan |ε| = b/a, the ratio of the semi-minor and semi-major axes, is calculated from the amplitudes and phases of Eq. (1) as

$$\tan(\varepsilon) = \tan\left[\tfrac{1}{2} \sin^{-1}(\sin 2\beta \, \sin \phi)\right]. \tag{7}$$

FIGURE 4 The polarization ellipse showing fields Ex and Ey, ellipticity tan |ε| = b/a, and azimuthal angle α. The tip of the electric field E traces this elliptical path in the transverse plane as the field propagates down the z axis.

Polarization is right-elliptical when 0° < φ < 180° and tan(ε) > 0 and left-elliptical when −180° < φ < 0° and tan(ε) < 0.
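As a rough illustration of how the ellipse parameters follow from the field amplitudes and relative phase, the Python sketch below returns the azimuth α and the ellipticity angle ε. It assumes the azimuth relation in the form tan 2α = tan 2β cos φ, which reduces to α = tan⁻¹(Ey/Ex) for in-phase fields as stated in Section I.A, together with sin 2ε = sin 2β sin φ from Eq. (7). The function name, the use of atan2 to select the quadrant, and the test values are assumptions made for this example.

```python
import math


def polarization_ellipse(ex, ey, phi):
    """Return (alpha, epsilon) in radians.

    ex, ey : nonnegative field amplitudes along x and y
    phi    : relative phase phi_y - phi_x in radians
    """
    if ex < 0.0 or ey < 0.0 or (ex == 0.0 and ey == 0.0):
        raise ValueError("amplitudes must be nonnegative and not both zero")
    # tan(2*beta) = 2*ex*ey / (ex**2 - ey**2); atan2 keeps the correct quadrant.
    alpha = 0.5 * math.atan2(2.0 * ex * ey * math.cos(phi), ex * ex - ey * ey)
    # sin(2*beta) = 2*ex*ey / (ex**2 + ey**2)
    s = 2.0 * ex * ey * math.sin(phi) / (ex * ex + ey * ey)
    s = max(-1.0, min(1.0, s))  # guard against rounding just outside [-1, 1]
    epsilon = 0.5 * math.asin(s)
    return alpha, epsilon


cases = {
    "linear at 45 degrees": (1.0, 1.0, 0.0),
    "linear along y": (0.0, 1.0, 0.0),
    "right circular": (1.0, 1.0, math.pi / 2),
    "left circular": (1.0, 1.0, -math.pi / 2),
    "general ellipse": (2.0, 1.0, math.pi / 3),
}
for name, (ex, ey, phi) in cases.items():
    alpha, epsilon = polarization_ellipse(ex, ey, phi)
    print(f"{name:22s} alpha = {math.degrees(alpha):7.2f} deg, tan(eps) = {math.tan(epsilon):+.3f}")
```

For circular polarization the azimuth is not physically meaningful, so the value returned for α in those two cases should be ignored; the sign of tan(ε) distinguishes right- from left-handed states as described above.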

D. Unpolarized Light

Monochromatic, or single-frequency, light must necessarily be in some polarization state. Light that contains a band of wavelengths does not share this requirement.

Quasi-monochromatic light can be represented by modifying Eq. (1b) as

$$\mathbf{E}(z, t) = \mathrm{Re}\left( \hat{\mathbf{x}} E_x(t) \exp\{i[\omega_m t + \phi_x(t)]\} + \hat{\mathbf{y}} E_y(t) \exp\{i[\omega_m t + \phi_y(t)]\} \right), \tag{8}$$

where ωm is the mean frequency of an electric field with bandwidth Δω < ωm. Taking the real part of this complex analytic representation yields the true field. Whereas the field amplitudes Ei(t) and phases φi(t) are constants for strictly monochromatic light, these quantities fluctuate irregularly when the light has finite bandwidth. The pairs of functions Ei(t) and φi(t) have statistical correlations that depend on the spectral bandwidth of the light source. The coherence time τ ∼ 2π/Δω describes the time scale during which the pairs of functions show similar time response. For some brief time t ≪ τ, Ei(t) and φi(t) are essentially constant, and E(t) possesses some elliptical polarization state, but a later field E(t + τ) will have a different elliptical polarization. Light is described as unpolarized, or natural, if the time evolutions of the pairs of functions are totally uncorrelated within the detection time, and any polarization state is equally likely during these successive time intervals.

While strictly monochromatic light cannot be unpolarized, natural light can be polarized into any desired elliptical state by passing it through the appropriate polarizer. Indeed, when unpolarized light is incident on a polarizer, the detected output intensity is independent of the polarization state transmitted by the polarizer. This occurs because a unique polarization exists for an infinitesimal time t ≪ τ and the average projection of these arbitrary states on a given polarizer is 1/2 over the relatively long integration time of the detector. In the absence of dispersive effects, unpolarized light, when totally polarized by an ideal polarizer, will behave much like monochromatic polarized light.

It is often desirable to have unpolarized light, especially when the undesired polarization dependence of components degrades optical system performance. For example, the responsivity of photodetectors can exhibit polarization dependence and cause measurements of optical power to vary with the polarization even when intensity is constant. In some cases, pseudo-depolarizers are useful for modifying polarization to produce light that approximates unpolarized light (Section III.F). For quasi-monochromatic light, the orthogonal field components can be differentially delayed, or retarded, longer than τ, so that the fields become uncorrelated. Alternatively, repeatedly varying the polarization state over a time shorter than the detector response causes the measurement to include the influence of many polarization states. This method, known as polarization scrambling, can reduce some undesirable polarization effects by averaging polarizations.

The previous discussion implicitly assumes that the light has uniform properties over the wavefront. However, the polarization can be varied over the spatial extent of the beam using a spatially varying retardance. Further description of these methods and their limitations is found in the discussion on optical retarders.

E. Degree of Polarization

Light that is neither polarized nor unpolarized is partially polarized. The fraction of the intensity that is polarized for a time much longer than the optical period is called the degree of polarization P and ranges from P = 0 for unpolarized light to P = 1 when a light beam is completely polarized in any elliptical state. Light is partially polarized when 0 < P < 1. Partially polarized light occurs when Ei(t) and φi(t) are not completely uncorrelated, and the instantaneous polarization states are limited to a subset of possible states. Partially polarized light may also be represented as a sum of completely polarized and unpolarized components.

We can also define a degree of linear polarization (the fraction of light intensity that is linearly polarized) or a degree of circular polarization (the fraction that is circularly polarized). Degrees of polarization can be described formally using the coherency matrix or Stokes vector formalism described in Section IV.

F. Notation

The choice of coordinate system and the form of the field in Eqs. (1a) and (1b) are not unique. We have chosen a right-handed coordinate system such that the cross-product is x̂ × ŷ = ẑ and used fields with a time dependence exp[i(ωt − kz)] rather than the complex conjugate exp[−i(ωt − kz)]. Both choices are equally valid, but may result in different descriptions of the same polarization states. Descriptions of circular polarization in particular are often contradictory because of the confusion arising from the use of varied conventions. In this article we follow the “Nebraska Convention” adopted in 1968 by the participants of the Conference on Ellipsometry at the University of Nebraska.

Also, the choice of the Cartesian basis set for describing the electric field is common but not obligatory. Any polarization state can be decomposed into a combination of any pair of orthogonal polarizations. Thus Eqs. (1a) and (1b) could be written in terms of right- and left-circular states or orthogonal elliptical states.

II. POLARIZERS

An ideal polarizer transmits only one unique state of polarization regardless of the state of the incident light. Polarizers may be delineated as linear, circular, or elliptical, depending on the state that is produced. Linear polarizers that transmit a linear state are the most common and are often simply called “polarizers.”

The transmission axis of a linear polarizer corresponds to the direction of the output light’s electric field oscillation. This axis is fixed by the device, though polarizers can be oriented (rotated normal to the incident light) to select the azimuthal orientation of the output state. When linearly polarized light is incident on a linear polarizer, the transmittance T from the polarizer follows Malus’s law,

$$T = \cos^2 \theta, \tag{9}$$

where θ is the angle between the input polarization’s azimuth and the polarizer’s transmission axis. When the incident light is formed by a linear polarizer, Eq. (9) describes the transmission through two polarizers with angle θ between transmission axes. In this configuration the second polarizer is often called an analyzer, and the polarizer and analyzer are said to be crossed when the transmittance is minimized (θ = 90°).

Since an ideal polarizer transmits only one polarization state it must block all others. In practice polarizers are not ideal, and imperfect polarizers do not exclude all other states. For an imperfect polarizer Malus’s law becomes

$$T = (T_{\max} - T_{\min}) \cos^2 \theta + T_{\min}, \tag{10}$$

where Tmax and Tmin are called the principal transmittances, and transmittance T varies between these values. The extinction ratio Tmin/Tmax provides a useful measure of polarizer performance. Diattenuation is the dependence of transmittance on incident polarization, and can be quantified as (Tmax − Tmin)/(Tmax + Tmin), where the maximum and minimum transmittances occur for orthogonal polarizations in homogeneous elements. (Homogeneous polarization elements have eigenpolarizations that are orthogonal and we consider such elements exclusively in this article.) Polarizers are optical elements that have a diattenuation approaching 1.
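A small numerical sketch may help make Eqs. (9) and (10) and the two figures of merit concrete. In the Python fragment below, the principal transmittances Tmax = 0.9 and Tmin = 10⁻⁵ are hypothetical values chosen only for illustration, and the function names are likewise assumptions.

```python
import math


def malus_transmittance(theta, t_max=1.0, t_min=0.0):
    """Eq. (10); with t_max = 1 and t_min = 0 it reduces to Eq. (9).

    theta is the angle in radians between the input polarization azimuth
    and the transmission axis.
    """
    return (t_max - t_min) * math.cos(theta) ** 2 + t_min


def extinction_ratio(t_max, t_min):
    """T_min / T_max; smaller values indicate a better polarizer."""
    return t_min / t_max


def diattenuation(t_max, t_min):
    """(T_max - T_min) / (T_max + T_min); approaches 1 for a good polarizer."""
    return (t_max - t_min) / (t_max + t_min)


T_MAX, T_MIN = 0.9, 1e-5  # hypothetical principal transmittances

for degrees in (0, 30, 45, 60, 90):
    theta = math.radians(degrees)
    ideal = malus_transmittance(theta)
    real = malus_transmittance(theta, T_MAX, T_MIN)
    print(f"theta = {degrees:2d} deg  ideal T = {ideal:.6f}  imperfect T = {real:.6f}")

print("extinction ratio:", extinction_ratio(T_MAX, T_MIN))
print("diattenuation:", diattenuation(T_MAX, T_MIN))
```

With crossed polarizers (θ = 90°) the ideal transmittance vanishes to rounding error, while the imperfect polarizer leaks approximately Tmin.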

Most interfaces with nonnormal optical incidence will exhibit some linear diattenuation since the Fresnel reflection and transmission coefficients depend on the polarization. High-performance polarizers exploit these effects to achieve very high diattenuations by differentially reflecting and transmitting orthogonal polarizations. In contrast, dichroism is a material property in which diattenuation occurs as light travels through the medium. Most commercial polarizers exploit dichroism, polarization-dependent reflection or refraction in birefringent crystals, or polarization-dependent reflectance and transmittance in dielectric thin-film structures.

A. Fresnel Equations

Maxwell’s equations applied to a plane wave at an interface between two dielectric media provide the relationship among incident, transmitted, and reflected wave amplitudes and phases. Figure 2 shows the electric fields and wavevectors for a wave incident upon the interface between two lossless, isotropic dielectric media. The plane of incidence contains all three wavevectors and is used to define two specific linear polarization states; p-polarized light has its electric field vector within the plane of incidence, and s-polarized light is perpendicular to this plane. The law of reflection, θi = θr, provides the direction of the reflected wave. The refraction angle is given by Snell’s law,

$$n_i \sin \theta_i = n_t \sin \theta_t. \tag{11}$$

Fresnel’s equations yield the amplitudes of the transmitted field Et and reflected field Er as fractions of the incident field Ei. For p-polarized light in isotropic, homogeneous, dielectric media, the amplitude reflectance rp is

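$$r_p = \left(\frac{E_r}{E_i}\right)_p = \frac{n_t \cos \theta_i - n_i \cos \theta_t}{n_i \cos \theta_t + n_t \cos \theta_i} \tag{12}$$

and amplitude transmittance tp is

$$t_p = \left(\frac{E_t}{E_i}\right)_p = \frac{2 n_i \cos \theta_i}{n_i \cos \theta_t + n_t \cos \theta_i}. \tag{13}$$

For s-polarized light, the corresponding Fresnel equations are

$$r_s = \left(\frac{E_r}{E_i}\right)_s = \frac{n_i \cos \theta_i - n_t \cos \theta_t}{n_i \cos \theta_i + n_t \cos \theta_t} \tag{14}$$

and

$$t_s = \left(\frac{E_t}{E_i}\right)_s = \frac{2 n_i \cos \theta_i}{n_i \cos \theta_i + n_t \cos \theta_t}. \tag{15}$$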

The Fresnel reflectance for cases ni/nt = 1.5 and nt/ni = 1.5 is shown in Fig. 5. At an incidence angle θB = tan⁻¹(nt/ni) (for nt > ni), known as the Brewster angle, rp = 0 and p-polarized light is totally transmitted. In a pile-of-plates polarizer, plates of glass are oriented at the Brewster angle so that only s-polarized light is reflected from each plate, and the successive diattenuations from each plate increase the degree of polarization of transmitted light.

FIGURE 5 Fresnel reflectances for p-polarized (solid curve) and s-polarized (dashed) light for cases ni/nt = 1.5 and nt/ni = 1.5. The amplitude reflectance is 0 for p-polarized light at the Brewster angle θB, and is one for all polarizations when the incidence angle is θ ≥ θc.

When ni > nt, both polarizations may be completely reflected if the incidence angle is larger than the critical angle θc,

$$\theta_c = \sin^{-1} \frac{n_t}{n_i}. \tag{16}$$

When θi ≥ θc, the light undergoes total internal reflection (TIR). For these incidence angles no net energy is transmitted beyond the interface and an evanescent field propagates along the direction θt. The reflectance can be reduced from 1 if the medium beyond the interface is thinner than a few wavelengths and followed by a higher refractive index material. The resulting frustrated total internal reflection allows energy to flow across the interface, leading to nonzero transmittance. For this reason, TIR devices using glass–air interfaces must be kept free of contaminants that may frustrate the TIR. Birefringent crystal polarizers obtain very high extinction ratios by transmitting one linear polarization while forcing the orthogonal polarization to undergo TIR.
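The Fresnel coefficients of Eqs. (12) to (15), the Brewster angle, and the critical angle of Eq. (16) can be evaluated with a few lines of code. The Python sketch below is one way to do this under stated assumptions: lossless, isotropic media with real indices; index values 1.0 and 1.5 chosen to match the ratio used in Fig. 5; and hypothetical function names. Beyond the critical angle, cos θt is taken as the imaginary root returned by `cmath.sqrt`; the magnitudes of the reflection coefficients do not depend on which sign of the root is chosen.

```python
import cmath
import math


def fresnel_amplitudes(n_i, n_t, theta_i):
    """Return (r_p, t_p, r_s, t_s) for incidence angle theta_i in radians.

    Values are complex; beyond the critical angle cos(theta_t) is imaginary.
    """
    cos_i = math.cos(theta_i)
    sin_t = n_i * math.sin(theta_i) / n_t      # Snell's law, Eq. (11)
    cos_t = cmath.sqrt(1.0 - sin_t * sin_t)
    r_p = (n_t * cos_i - n_i * cos_t) / (n_i * cos_t + n_t * cos_i)  # Eq. (12)
    t_p = 2.0 * n_i * cos_i / (n_i * cos_t + n_t * cos_i)            # Eq. (13)
    r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)  # Eq. (14)
    t_s = 2.0 * n_i * cos_i / (n_i * cos_i + n_t * cos_t)            # Eq. (15)
    return r_p, t_p, r_s, t_s


def brewster_angle(n_i, n_t):
    """Incidence angle at which r_p vanishes."""
    return math.atan(n_t / n_i)


def critical_angle(n_i, n_t):
    """Eq. (16); defined only when n_i > n_t."""
    if n_i <= n_t:
        raise ValueError("total internal reflection requires n_i > n_t")
    return math.asin(n_t / n_i)


# Illustrative case: air (1.0) to glass (1.5) at the Brewster angle.
theta_b = brewster_angle(1.0, 1.5)
r_p, t_p, r_s, t_s = fresnel_amplitudes(1.0, 1.5, theta_b)
print("Brewster angle (deg):", math.degrees(theta_b))
print("|r_p|^2 and |r_s|^2 at Brewster:", abs(r_p) ** 2, abs(r_s) ** 2)

# Glass to air beyond the critical angle: both polarizations fully reflected.
theta_c = critical_angle(1.5, 1.0)
r_p, _, r_s, _ = fresnel_amplitudes(1.5, 1.0, theta_c + math.radians(10.0))
print("critical angle (deg):", math.degrees(theta_c))
print("|r_p| and |r_s| beyond critical angle:", abs(r_p), abs(r_s))
```

The intensity reflectance for each polarization is the squared magnitude of the corresponding amplitude reflectance, which is the quantity compared at the Brewster angle above.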

B. Birefringent Crystal Polarizers

Birefringent polarizers spatially separate an incident beam into two orthogonally polarized beams. In a conventional polarizer, the undesired polarization is eliminated by directing one beam into an optical absorber so that a single polarization is transmitted. Alternatively, a polarizing beamsplitter transmits two distinct orthogonally polarized beams that are angularly separated or displaced.

In birefringent materials, the incident polarization is decomposed into two orthogonal states called principal polarizations or eigenpolarizations. When the eigenpolarizations travel at the same velocity (and see the same refractive index), the direction of propagation is called an optic axis (see Section III.A). When light does not travel along an optic axis, the eigenpolarizations see different refractive indices and thus propagate at different velocities through the material.

When light enters or exits a birefringent material at a nonnormal angle θ that is not along an optic axis, the eigenpolarizations refract at different angles, undergoing what is termed double refraction. Also, each eigenpolarization may encounter different reflectance or transmittance at interfaces (since Fresnel coefficients depend on the refractive indices), and diattenuation results. Complete diattenuation occurs if one eigenpolarization undergoes total internal reflection while the other eigenpolarization is transmitted.

Most birefringent polarizers are made from calcite, a naturally occurring mineral. Calcite is abundant in its polycrystalline form, but optical-grade calcite required for polarizers is rare, which makes birefringent polarizers more costly than most other types. Calcite transmits from below 250 nm to above 2 µm and is used for visible and near-infrared applications. Other birefringent crystals, such as magnesium fluoride (with transmittance from 140 nm to 7 µm), can be used at some wavelengths for which calcite is opaque.

Prism polarizers are composed of two birefringent prisms cut at an internal incidence angle that transmits only one eigenpolarization while totally internally reflecting the other (Fig. 6). The prisms are held together by a thin cement layer or may be separated by an air gap and externally held in place for use with higher power laser beams. The transmitted beam contains only one eigenpolarization since the orthogonal polarization is completely reflected. The prisms are aligned with parallel optic axes, so that this transmitted beam undergoes very small deviations, usually less than 5 min of arc. Often the reflected beam also contains a small amount of the transmitted eigenpolarization since nonzero reflectance results if the refractive indices of the cement and transmitted eigenpolarization are not exactly equal. Because the reflected beam has poorer extinction, it is usually eliminated by placing an index-matched absorbing layer on the side face toward which light is reflected.

FIGURE 6 Glan–Thompson prism polarizer. At the interface, p-polarized light reflects (and is typically absorbed by a coating at the side of the prism) and s-polarized light is transmitted. The optic axes (shown as dots) are perpendicular to the page.

Glan prism polarizers are the most common birefringent crystal polarizer. They exhibit superior extinction; extinction ratios of 10⁻⁵ to 10⁻⁶ are typical, and extinctions below 10⁻⁷ are possible. The small residual transmittance can arise from material imperfection, scattering at the prism faces, or misalignment of the optic axes in each prism of the polarizer.

Because total internal reflection requires incidence angles larger than the critical angle θc, the polarizer operates over a limited range of input angles that is often asymmetric about normal incidence. The semi-field angle is the maximum angle for which output light is completely polarized regardless of the rotational orientation of the polarizer (that is, for any azimuthal angle of output polarization). The field angle is twice the semi-field angle. The field angle depends on the refractive index of the intermediate layer (cement or air) and the internal angle of the contacted prisms. Since the incidence angle at the contacting interface depends in part on the refractive index when light is nonnormally incident on the polarizer, the field angle is wavelength dependent.
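As a rough numerical illustration of why the usable angular range is limited, the following sketch computes the critical angles seen by the two eigenpolarizations at an air-gap interface. The calcite indices used (about 1.658 and 1.486 near 589 nm) are assumed textbook values rather than figures from this article, the extraordinary ray is assumed to see the full index ne, and the function name is illustrative only.

```python
import math

def critical_angle_deg(n_crystal, n_gap=1.0):
    """Critical angle (degrees) for light travelling from the crystal into the gap."""
    return math.degrees(math.asin(n_gap / n_crystal))

# Assumed calcite indices near 589 nm (textbook values, not from this article).
N_O, N_E = 1.658, 1.486

theta_c_o = critical_angle_deg(N_O)  # ordinary eigenpolarization
theta_c_e = critical_angle_deg(N_E)  # extraordinary eigenpolarization

# Internal incidence angles between the two critical angles totally reflect
# one eigenpolarization while the other is still transmitted.
window = theta_c_e - theta_c_o
print(f"ordinary: {theta_c_o:.1f} deg, extraordinary: {theta_c_e:.1f} deg, "
      f"window: {window:.1f} deg")
```

Under these assumptions the window of internal angles is only a few degrees wide, which is consistent with the limited field angle described above.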

Birefringent crystal polarizing beamsplitters transmit two orthogonal polarizations. Glan prism polarizers can act as beamsplitters if the reflected beam exits through a polished surface, though extinction is degraded. Polarizing beamsplitters with better extinction separate the beams through refraction at the interface. In Rochon prisms, light linearly polarized in the plane normal to the prism is transmitted undeviated, while the orthogonal polarization is deviated by an angle dependent on the prism wedge angle and birefringence (Fig. 7a). Senarmont polarizing beamsplitters are similar, but the polarizations of the deviated and undeviated beams are interchanged. Wollaston polarizers (Fig. 7b) deviate both output eigenpolarizations with nearly equal but opposite angles when the input beam is normally incident. For all these polarizers, the deviation angle depends on the wedge angle and varies with wavelength.

C. Interference Polarizers

The Fresnel equations show that the transmittance and reflectance of obliquely incident light will depend on the polarization. Dielectric stacks made of alternating high- and low-refractive index layers with quarter-wave optical thickness can be tailored to provide reflectances and transmittances with large diattenuation. Optical thickness depends on incidence angle, and polarizers based on quarter-wave layers are sensitive to incidence angle and wavelength. Designs that increase the wavelength range do so at the expense of input angle range, and vice versa. Polarizing beamsplitter cubes are made by depositing the stack on the hypotenuse of a right-angle prism and cementing the coated side to the hypotenuse of a second prism.

FIGURE 7 (a) Rochon and (b) Wollaston polarizers. The directions of the optic axes are shown in each prism (as dots for axes perpendicular to page and as a two-arrow line for axes in the plane of the page).

The extinction of these devices is limited by the defects in the coating layers or the optical quality of the optical substrate material through which light must pass. The state of polarization may also be altered by the birefringence in the substrate. Commercial thin-film polarizers are available with an extinction of about 10⁻⁵.

D. Dichroic Polarizers

Some molecules are optically anisotropic, and light polarized along one molecular direction may undergo greater absorption than perpendicularly polarized light. When these molecules are randomly oriented, this molecular-level diattenuation will average out as the light propagates through the thickness, and bulk diattenuation may not be observed. However, linear polarizers can be made by orienting dichroic molecules or crystals in a plastic or glass matrix that maintains a desired alignment of the transmission axes. Extinction ratios between 10⁻² and 10⁻⁵ are possible in oriented dichroics in the visible and near-infrared regions.

Dichroic sheet polarizers are available with larger areas and at lower cost than other polarizer types. Also, the acceptance angle, or maximum input angle from normal incidence that does not result in degraded extinction, is typically large in dichroics because diattenuation occurs during bulk propagation rather than at interfaces. However, the maximum transmittance of these polarizers may be significantly less than unity since the transmission axis may also absorb light. Because absorbed light will heat the material and may cause damage at high power, incident powers are limited.

III. RETARDERS

Retarders are devices that induce a phase difference, or retardation, between orthogonally polarized components of a light wave. Linear retarders are the most common and produce a retardance φ = φy − φx [using the notation of Eqs. (1a) and (1b)] between orthogonal linear polarizations. Circular retarders cause a phase shift between right- and left-circular polarizations and are often called rotators because circular retardance changes the azimuthal angle of linearly polarized light. Because the polarization state of light is determined by the relative amplitudes and phase shifts between orthogonal components, retarders are useful for altering and controlling a wave's polarization. In fact, an arbitrary polarization state can be converted to any other state using an appropriate retarder.

A. Linear Birefringence

In optically anisotropic materials, such as crystals, the phase velocity of propagation generally depends on the direction of propagation and polarization. The optic axes are propagation directions for which the phase velocity is independent of the azimuth of linear polarization. For other propagation directions, two orthogonal eigenaxes perpendicular to the propagation define the linear polarizations of waves that propagate through the crystal with constant phase velocity. These eigenpolarizations are linear states whose refractive indices are determined by the crystal's dielectric tensor and propagation direction. Light polarized in an eigenpolarization will propagate through an optically anisotropic material with unchanging polarization, while light in other polarization states will change with distance as the beam propagates.

Uniaxial crystals and materials that behave uniaxially are commonly used in birefringent retarders and polarizers. These crystals have a single optic axis, two principal refractive indices no and ne, and a linear birefringence Δn = ne − no. When light travels parallel to the optic axis, the eigenpolarizations are degenerate, and all polarizations propagate with index no. For light traveling in other directions, one eigenpolarization has refractive index no and the other's varies with direction between no and ne (and equals ne when the propagation is perpendicular to the optic axis).

B. Waveplates

Waveplates are linear retarders made from birefringent materials. Rewriting Eq. (1a) for propagation through a birefringent medium of length L yields

$$ \mathbf{E}(z = L, t) = \mathrm{Re}\left\{ \hat{\mathbf{x}}\, E_x \exp[i(\omega t - k_0 n_x L)] + \hat{\mathbf{y}}\, E_y \exp[i(\omega t - k_0 n_y L)] \right\}, \tag{17} $$

where the x and y directions coincide with eigenpolarizations and the absolute phases are initially equal (at z = 0, φx = φy = 0). The retardance φ = k0(nx − ny)L is the relative phase shift between eigenpolarizations and depends on the wavelength, the propagation distance, and the difference between the refractive indices of the eigenpolarizations. If the z axis is an optic axis, then nx = ny = no, and there is no retardance; if z is perpendicular to an optic axis, the retardance is φ = ±k0(no − ne)L. In general, the retardance over a path of length L in a material with birefringence Δn is given by

$$ \phi = \frac{2\pi\, \Delta n\, L}{\lambda}. \tag{18} $$

Retardance may be specified in radians, degrees [φ = 360° · (no − ne)L/λ0], or length [φ = (no − ne)L].
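The three ways of quoting retardance can be compared with a short sketch of Eq. (18). The function name and the plate values below (birefringence 0.01, thickness 25 µm, wavelength 1 µm) are illustrative assumptions, not values from the article.

```python
import math

def retardance(delta_n, length_m, wavelength_m):
    """Return retardance as (radians, degrees, path difference in metres)."""
    path_difference_m = abs(delta_n) * length_m
    waves = path_difference_m / wavelength_m
    return 2.0 * math.pi * waves, 360.0 * waves, path_difference_m

# Assumed example plate: birefringence 0.01, 25 um thick, used at 1 um.
phi_rad, phi_deg, phi_len = retardance(0.01, 25e-6, 1.0e-6)
print(f"{phi_rad:.4f} rad, {phi_deg:.1f} deg, {phi_len * 1e9:.0f} nm")
```

For these assumed numbers the path difference is a quarter of the wavelength, so the plate acts as a quarter-wave retarder.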

A waveplate that introduces a π-radian or 180° phase shift between the eigenpolarizations is called a half-wave plate. Upon exiting the plate, the two eigenpolarizations have a λ/2 relative delay and are exactly out of phase. A half-wave plate requires a birefringent material with thickness given by

$$ L_{\lambda/2} = \frac{\lambda_0 (2m + 1)}{2\, |n_o - n_e|}, \tag{19} $$

where the waveplate order m is a nonnegative integer that need not equal 0 since additional retardances of 360° do not affect the phase relationship. Quarter-wave plates are another common component and provide phase shifts of 90° or π/2.
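A minimal sketch of Eq. (19) shows how quickly the plate thickness grows with order. The function name and the example values (wavelength 1 µm, birefringence magnitude 0.01) are assumptions chosen for round numbers.

```python
def half_wave_thickness(wavelength_m, delta_n, order=0):
    """Thickness from Eq. (19) for a half-wave plate of the given order m >= 0."""
    if order < 0:
        raise ValueError("order must be a nonnegative integer")
    return wavelength_m * (2 * order + 1) / (2.0 * abs(delta_n))

# Assumed values: 1 um wavelength, birefringence magnitude 0.01.
zeroth_order = half_wave_thickness(1.0e-6, 0.01)           # m = 0
fifth_order = half_wave_thickness(1.0e-6, 0.01, order=5)   # m = 5
print(f"m=0: {zeroth_order * 1e6:.0f} um, m=5: {fifth_order * 1e6:.0f} um")
```

The zeroth-order plate in this example is only tens of micrometres thick, while the fifth-order plate is eleven times thicker.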

The eigenaxis with the lower refractive index (no in positive uniaxial crystals such as quartz, and ne in negative uniaxial crystals such as calcite) is called the fast axis of the retarder due to the faster phase velocity and is often marked by the manufacturer. The eigenaxes can be identified by rotating the retarder between crossed polarizers until the transmittance is minimized. When the polarizer transmission axis coincides with the retarder eigenaxis, the input polarization matches the eigenpolarization, and the light travels through the crystal unchanged until blocked by the analyzer. An input different from the eigenpolarization will exit the crystal in a different polarization state and will not be completely blocked by the analyzer.

Waveplates are commonly made using quartz, mica, or plastic sheets that are stretched to produce an anisotropy that gives rise to birefringence. At visible wavelengths, ne − no ∼ 0.009 for quartz, and the corresponding zeroth-order (m = 0) quarter-wave plate thickness of ∼40 µm poses a severe manufacturing challenge. Mica can be cleaved into thin sections to obtain zeroth-order retardance, but the resulting waveplate usually has poorer spatial uniformity. Polymeric materials often have lower birefringence and can be most easily fabricated into zeroth-order waveplates.

In many applications, retardance of integral multiples of 2π is unimportant, and multiple-order (m ≥ 1) waveplates are often lower in cost because the increased thickness eases fabrication. However, this approach can result in increased retardance errors. For example, retardance depends on the wavelength [explicitly in Eq. (18) or through dispersion]. Also, retardance can change with temperature or with nonnormal incidence angles that vary the optical thickness and propagation direction. Retardance errors arising from changes in wavelength, temperature, or incidence angle increase linearly with thickness and make multiple-order waveplates inadvisable in applications that demand accurate retardance.
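The linear growth of wavelength-induced error with plate order can be sketched from Eq. (18) alone. The example below assumes that dispersion of the birefringence is negligible, so that total retardance scales as 1/λ; the function name and the 1% wavelength shift are illustrative.

```python
def quarter_wave_error_deg(order, wavelength_ratio):
    """Retardance error (degrees) of a quarter-wave plate of the given order.

    wavelength_ratio is (actual wavelength) / (design wavelength).
    Dispersion is neglected, so total retardance scales as 1 / wavelength.
    """
    design_deg = 360.0 * order + 90.0
    return design_deg / wavelength_ratio - design_deg

# Compare a true zeroth-order plate with multiple-order plates at +1% wavelength.
errors = {m: quarter_wave_error_deg(m, 1.01) for m in (0, 1, 10)}
for m, err in errors.items():
    print(f"order {m}: error {err:.2f} deg")
```

Under this assumption the error is just under one degree for the zeroth-order plate and 41 times larger for the tenth-order plate, since the total design retardance is 41 times larger.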

Compound zeroth-order waveplates represent a compromise between manufacturability and performance when true zeroth-order waveplates are not easily obtained. When two similar waveplates are aligned with orthogonal optic axes, the phase shifts in each waveplate have opposite sign and the combined retardance will be the difference between the two retardances. Compound zeroth-order retarders are made by combining two multiple-order waveplates in this way so that the net retardance is less than 2π. For example, two multiple-order waveplates with retardance φ1 = 20π + π/2 and φ2 = −20π can be combined to yield a compound zeroth-order quarter-wave plate. Compound zeroth-order waveplates exhibit the same wavelength and temperature dependence as zeroth-order waveplates since retardance errors are proportional to the difference of plate thicknesses. However, input angle dependence is the same as in a multiple-order waveplate with equivalent total thickness.

C. Compensators

A compensator is a variable linear retarder that can be adjusted over a continuous range of values (Fig. 8). In a Babinet compensator, two wedged plates of birefringent material are oriented with their optic axes perpendicular. In this arrangement, the individual wedges impart opposite signs of retardance, and the net retardance is the difference between the individual magnitudes. The magnitudes depend on the thickness of each wedge traversed by the optical beam. Typically one wedge is fixed and the other translated by a micrometer drive so that this moving wedge presents a variable thickness in the beam path, and the net retardance depends on the micrometer adjustment. The use of two wedges eliminates the beam deviation and the output beam is collinear to the input.

FIGURE 8 (a) Babinet and (b) Soleil–Babinet compensators. One wedge moves in the direction of the vertical arrow to adjust the retardance. The directions of the optic axes are shown using notation from Fig. 7.

The Babinet compensator has the disadvantage that the retardance varies across the optical beam because the relative thicknesses of each wedge and corresponding net retardance vary over the beam in the direction of wedge travel. This can be overcome using a Soleil (or Babinet–Soleil) compensator. In this device the two wedged pieces have coincident optic axes and translation of the moving wedge changes the total thickness and retardance of the combined retarder. The total thickness of this two-wedge piece is now constant over the useful aperture. A parallel plate of fixed retardance is placed after the wedge, in the same manner as a compound zeroth-order retarder, to improve performance.

D. Rhombs

Retarders can also be fabricated of materials that do not exhibit birefringence. The phase shift between s- and p-polarized waves that occurs at a total internal reflection (Section II, Fresnel equations) can be exploited to obtain a linear retarder. When light is incident at angles larger than the critical angle, the retardance at the reflection is

$$ \phi = \phi_p - \phi_s = 2 \tan^{-1}\left[ \frac{\cos\theta_i \sqrt{\sin^2\theta_i - (n_t/n_i)^2}}{\sin^2\theta_i} \right] \tag{20} $$

and depends on the incidence angle and refractive indices. Here ni is the index of the denser medium in which the light travels and nt that of the rarer medium, so that nt/ni < 1. A Fresnel rhomb is a solid parallelogram fabricated so that a beam at normal incidence at the entrance face totally reflects twice within the rhomb to provide a net retardance of π/2. This retarder is, however, very sensitive to the incidence angle and laterally displaces the beam. Concatenating two Fresnel rhombs (Fig. 9) provides collinear output and can greatly reduce the sensitivity of retardance to incident angle since retardance changes at the first pair of reflections are partially canceled by the second pair.
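The retardance per reflection in Eq. (20) can be evaluated numerically. The sketch below takes the index ratio under the square root as rarer-to-denser (less than 1), as total internal reflection requires; the glass index of 1.51 and the 54.6° internal angle are assumed illustrative values, not figures from the article.

```python
import math

def tir_retardance_deg(theta_i_deg, n_rel):
    """Retardance (degrees) at one total internal reflection.

    n_rel is the rarer-to-denser index ratio (< 1); theta_i is the internal
    incidence angle, which must exceed the critical angle asin(n_rel).
    """
    theta = math.radians(theta_i_deg)
    sin_sq = math.sin(theta) ** 2
    if sin_sq <= n_rel ** 2:
        raise ValueError("incidence angle is not beyond the critical angle")
    arg = math.cos(theta) * math.sqrt(sin_sq - n_rel ** 2) / sin_sq
    return math.degrees(2.0 * math.atan(arg))

# Assumed example: glass of index 1.51 in air, internal angle 54.6 degrees.
per_reflection = tir_retardance_deg(54.6, 1.0 / 1.51)
two_reflections = 2.0 * per_reflection
print(f"one reflection: {per_reflection:.2f} deg, two: {two_reflections:.2f} deg")
```

With these assumed values each reflection contributes close to 45°, so the two reflections in a rhomb give approximately the quarter-wave retardance described above.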

FIGURE 9 Two Fresnel rhombs concatenated to form a Fresnel double rhomb.

Total-internal-reflection retarders are less sensitive to wavelength variation than waveplates whose retardance increases with L/λ since the rhomb retardance does not depend on the optical path length. Wavelength dependence is limited only by the material dispersion dn/dλ, which contributes small retardance changes. Thus, rhomb devices are more nearly achromatic than waveplates and can be operated over ranges of 100 nm or more. Rhomb devices are much larger than waveplates, and the clear aperture has practical limits since increasing cross section requires a proportional increase in length. Performance can also be compromised by the presence of birefringence in the bulk glass. Birefringence, arising from stresses in material production or optical fabrication, can lead to spatial variations and path-length dependence, and limit retardance stability to several degrees if not mitigated.

E. Circular Retarders

Some materials can exhibit circular birefringence, or optical activity, in which the eigenpolarizations are right- and left-circular and the retardance is a phase shift between these two circular states. Circular retarders are often called rotators because incident linear polarization will generally exit at a different azimuthal angle that depends on the rotary power (circular retardance per unit length) and thickness. A material that rotates linearly polarized light clockwise (as viewed by an observer facing the light source) is termed dextrorotary or right-handed, while counterclockwise rotation occurs in levorotary, or left-handed, materials. The sense of rotation is fixed with respect to the propagation direction; if the beam exiting an optically active material is reflected back through the material, the polarization will be restored to the initial azimuth. Thus a double pass through an optically active material will cause no net rotation of linear polarization.

Crystalline quartz exhibits optical activity that is most evident when propagation is along the optic axis and retardance is absent. The property is not limited to crystalline materials, however; molecules that are chiral (that lack a plane or center of symmetry and are not superposable on their mirror image) can yield optical activity. Enantiomers are chiral molecules that share common molecular formulas and ordering of atoms but differ in the three-dimensional arrangement of atoms; separate enantiomers have equal rotary powers but differ in the sense of rotation. Liquids and solutions of chiral molecules such as sugars may be optically active if an excess of one enantiomer is present.

In solution, each enantiomeric form will rotate light, and the net rotation depends on the relative quantities of dextrorotary and levorotary enantiomers. Mixtures with equal quantities of enantiomers present are called racemic and the net rotation is zero. Most naturally synthesized organic chiral molecules, for example, sugars and carbohydrates, occur in only one enantiomeric form. Saccharimetry, the measurement of the optical rotary power of sugar solutions, is used to determine the concentration of sugar in single-enantiomer solutions.

F. Electrooptic and Magnetooptic Effects

In some materials, retardance can be induced by an electric or magnetic field. These effects are exploited to create active devices that produce an electrically controllable retardance.

Crystals that are not centrosymmetric may exhibit a linear birefringence proportional to an applied electric field called the linear electrooptic effect or Pockels effect. In these materials, applied fields cause an otherwise isotropic crystal to behave uniaxially (and uniaxial crystals to become biaxial). Crystal symmetry determines the direction of the optic axes and the form of the electrooptic tensor. The magnitude of the induced birefringence thus depends on the polarization direction, the applied field strength and direction, and the material.

The electrically induced birefringence can be appreciable in some materials, and the Pockels effect is widely used in retardance modulators, phase modulators, and amplitude modulators. Modulators are often characterized by their half-wave voltage Vπ, the voltage needed to cause a 180° phase shift or retardance. Vπ can vary from ∼10 V in waveguide modulators to hundreds or thousands of volts in bulk modulators.
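Because the Pockels retardance is proportional to the applied field, the half-wave voltage fixes the retardance at any other voltage. The sketch below assumes an idealized, lossless amplitude modulator in which the retarder axes sit at 45° between crossed polarizers, a common arrangement that is not described in the article; the function names and the 300 V half-wave voltage are illustrative.

```python
import math

def pockels_retardance_rad(voltage, v_pi):
    """Retardance (radians), linear in voltage and equal to pi at V = V_pi."""
    return math.pi * voltage / v_pi

def crossed_polarizer_transmission(voltage, v_pi):
    """Ideal transmittance of a retarder at 45 degrees between crossed polarizers."""
    return math.sin(pockels_retardance_rad(voltage, v_pi) / 2.0) ** 2

V_PI = 300.0  # assumed half-wave voltage in volts
for volts in (0.0, 150.0, 300.0):
    t = crossed_polarizer_transmission(volts, V_PI)
    print(f"{volts:5.0f} V -> transmittance {t:.2f}")
```

In this idealized model the transmittance rises from zero at no applied voltage to unity at the half-wave voltage.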

The Kerr, or quadratic, electrooptic effect occurs in solids, liquids, or gases and has no symmetry requirements. In this effect, the linear birefringence magnitude is proportional to the square of the applied electric field and the induced optic axis is parallel to the field direction. The effect is typically smaller than the Pockels effect and is often negligible in Pockels materials.

The Faraday effect is an induced circular birefringence that is proportional to an applied magnetic field. It is often called Faraday rotation because the circular birefringence rotates linearly polarized light by an angle proportional to the field. The Faraday effect can occur in all materials, though the magnitude is decreased by birefringence.

In contrast to optical activity, the sense of Faraday rotation is determined by the direction of the magnetic field. Thus, a double-pass configuration in which light exiting a Faraday rotator reflects and propagates back through the material will yield twice the rotation of a single pass. This property is exploited in optical isolators, or components that transmit light in only one direction. In the simplest isolators, a 45° Faraday rotator is placed between polarizers with transmission axes at 0° and 45°. In the forward direction, light linearly polarized at 0° is azimuthally rotated 45° to coincide with the analyzer axis and is fully transmitted; backward light input at 45° rotates to 90° and is completely blocked by the polarizer at 0°.
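The isolator described here can be sketched with 2×2 Jones matrices. The example works in a fixed laboratory frame and models the Faraday rotator as the same +45° azimuth rotation for both propagation directions, which is the nonreciprocal behavior stated above; the helper names are hypothetical and all elements are assumed ideal and lossless. The rotation matrix uses the standard counterclockwise sign convention, which may differ from the sign convention of the rotator entry in Table II.

```python
import math

def polarizer(theta_deg):
    """Jones matrix of an ideal linear polarizer at angle theta."""
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    return [[c * c, s * c], [s * c, s * s]]

def rotation(theta_deg):
    """Rotate the polarization azimuth by +theta in a fixed lab frame."""
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    return [[c, -s], [s, c]]

def apply(matrix, vec):
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]

def transmitted_power(vec, elements):
    """Send vec through elements in the order listed; return the output power."""
    for element in elements:
        vec = apply(element, vec)
    return abs(vec[0]) ** 2 + abs(vec[1]) ** 2

# Faraday rotation keeps the same lab-frame sense for both directions.
faraday = rotation(45.0)
forward = transmitted_power([1.0, 0.0],
                            [polarizer(0.0), faraday, polarizer(45.0)])
backward = transmitted_power([0.0, 1.0],
                             [polarizer(45.0), faraday, polarizer(0.0)])
print(f"forward: {forward:.3f}, backward: {backward:.3f}")
```

In the backward direction the light that passes the 45° polarizer is rotated to 90° and is rejected by the 0° polarizer, whatever the polarization of the backward input.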

Faraday mirrors, made by combining a 45° Faraday rotator with a plane mirror, have the extraordinary property of "unwinding" polarization changes caused by propagation. Polarized light that passes through an arbitrary retarder, reflects off a Faraday mirror, and retraces the input path will exit with a fixed polarization for all magnitudes or orientations of the retarder so long as the retardance is unchanged during the round-trip time. When the input light is linearly polarized, the return light is always orthogonally polarized for all intervening retardances. These devices find applications in fiber optic systems since bend-induced retardance is difficult to control in an ordinary optical fiber.

G. Pseudo-Depolarizers

Conversion of a polarized, collimated light beam into a beam that is truly unpolarized is difficult. Methods for obtaining truly unpolarized light rely on diffuse scattering, such as passing light through ground glass plates or an integrating sphere. These methods result in light propagating over a large range of solid angles and decrease the irradiance, or power per unit area, away from the depolarizer. The loss is often unacceptable when a collimated beam is needed.

Approximations to the unpolarized state can be created using pseudo-depolarizers that produce a large variety of states over time, wavelength, or the beam cross section. As described in Section I, temporal decorrelation requires that the beam propagate through a retardance that is much larger than the light's coherence length Lc = cτ ≈ 2πc/Δω. If nonmonochromatic, linearly polarized light bisects the axes of a waveplate with sufficiently large retardance, the two linear eigenpolarizations will emerge with a relative phase shift that rapidly and arbitrarily changes on the order of the coherence time. At any moment the instantaneous output state will be restricted to a point on the Poincaré sphere (see Section IV) along the great circle connecting the ±45° and circular polarization states. When the detector is slower than τ, the averaged response will include the influence of all these states.
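The effect of detector averaging can be illustrated with normalized Stokes parameters. For linear input at 45° to the waveplate axes, an instantaneous relative phase φ gives a state on the great circle through the ±45° and circular states, (1, 0, cos φ, sin φ). The sketch below assumes, as a stand-in for a slow detector, that the phase is spread uniformly over a full cycle; the function names are illustrative.

```python
import math

def stokes_after_retarder(phase):
    """Normalized Stokes vector for 45-degree linear input after a retardance."""
    return (1.0, 0.0, math.cos(phase), math.sin(phase))

def averaged_dop(num_phases):
    """Degree of polarization after averaging over evenly spaced phases."""
    sums = [0.0, 0.0, 0.0, 0.0]
    for k in range(num_phases):
        state = stokes_after_retarder(2.0 * math.pi * k / num_phases)
        for i, value in enumerate(state):
            sums[i] += value
    s0, s1, s2, s3 = (total / num_phases for total in sums)
    return math.sqrt(s1 * s1 + s2 * s2 + s3 * s3) / s0

single_state = averaged_dop(1)     # one instantaneous state: fully polarized
time_average = averaged_dop(360)   # many phases: polarized part averages out
print(f"instantaneous: {single_state:.3f}, averaged: {time_average:.3e}")
```

Each instantaneous state is fully polarized, but the average over the great circle has a vanishing polarized part, which is the sense in which the output approximates unpolarized light.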

Lyot depolarizers are configurations of two retarders that perform this temporal decorrelation for any input polarization state. These are commonly made by concatenating thick birefringent plates that act as high-order waveplates or by connecting lengths of polarization-maintaining (PM) fiber. PM fiber has about one wavelength of retardance every few millimeters, and can be obtained in lengths sufficient to decorrelate multimode laser light.

A polarized light beam can also be converted to a beam with a spatial distribution of states to approximate unpolarized light, without the requirements on spectral bandwidth. For example, the retardance across a wedged waveplate is not spatially uniform, and an incident beam will exit with a spatially varying polarization. When detected by a single photodetector, the influence of all the states will be averaged in the output response. These methods often satisfy needs for unpolarized light, but clearly depend on the details and requirements of the application.

IV. MATHEMATICAL REPRESENTATIONS

Several methods have been developed to facilitate the representation of polarization states, polarization elements, and the evolution of polarization states as light passes through components. Using quasimonochromatic fields, the 2×2 coherency matrix can be used to represent polarizations and determine the degree of polarization of light. The four-element Stokes vector describes the state of light using readily measurable intensities and can be related to the coherency matrix. Mueller calculus represents optical components as real 4×4 matrices; when combined with Stokes vectors it provides a quantitative description of the interaction of light and optical components. In contrast, Jones calculus represents components using complex 2×2 matrices and represents light using two-element electric field vectors. Jones calculus cannot describe partially polarized or unpolarized light, but retains phase information so that coherent beams can be properly combined. Finally, the Poincaré sphere is a pictorial representation that is useful for conceptually understanding the interaction between retarders and polarization states. A brief discussion introduces each of these methods.

A. Coherency Matrix

Using Eq. (8), we can define orthogonal field components of a quasi-monochromatic plane wave, Ex = Ex(t) exp[i(ωt − k0z + φx(t))], and likewise for Ey. The coherency matrix J is given by

$$ \mathbf{J} = \begin{bmatrix} \langle E_x E_x^* \rangle & \langle E_x E_y^* \rangle \\ \langle E_y E_x^* \rangle & \langle E_y E_y^* \rangle \end{bmatrix} = \begin{bmatrix} J_{xx} & J_{xy} \\ J_{yx} & J_{yy} \end{bmatrix}, \tag{21} $$

where the angle brackets denote a time average and the asterisk denotes the complex conjugate. The total irradiance I is given by the trace of the matrix, Tr(J) = Jxx + Jyy, and the degree of polarization is

$$ P = \sqrt{1 - \frac{4|\mathbf{J}|}{(J_{xx} + J_{yy})^2}}, \tag{22} $$

where |J| is the determinant of the matrix. Recalling the notation for elliptical light, one can find the azimuthal angle α of the semi-major ellipse axis from the x axis and the ellipticity angle ε of the polarized component as

$$ \alpha = \frac{1}{2} \tan^{-1}\left[ \frac{J_{xy} + J_{yx}}{J_{xx} - J_{yy}} \right], \qquad \varepsilon = \frac{1}{2} \sin^{-1}\left[ \frac{-i(J_{xy} - J_{yx})}{P\,(J_{xx} + J_{yy})} \right]. \tag{23} $$

Partially polarized light can be decomposed into polarized and unpolarized components and expressed using coherency matrices as J = Jp + Ju. Thus the state of the polarized portion of light can be extracted from the coherency matrix even when light is partially polarized. The coherency matrix representation of several states is provided in Table I.
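Equation (22) and the decomposition J = Jp + Ju can be checked numerically. The sketch below uses plain nested lists for the 2×2 matrix; the function name and the example mixture (equal irradiances of x-polarized and unpolarized light) are assumptions made for illustration.

```python
import math

def degree_of_polarization(j):
    """Degree of polarization from a 2x2 coherency matrix, following Eq. (22)."""
    (jxx, jxy), (jyx, jyy) = j
    det = (jxx * jyy - jxy * jyx).real
    trace = (jxx + jyy).real
    # Clamp tiny negative values caused by rounding before taking the root.
    return math.sqrt(max(0.0, 1.0 - 4.0 * det / trace ** 2))

linear_x = [[1.0, 0.0], [0.0, 0.0]]
unpolarized = [[0.5, 0.0], [0.0, 0.5]]

# J = Jp + Ju with equal irradiance in the polarized and unpolarized parts.
mixture = [[0.5 * linear_x[r][c] + 0.5 * unpolarized[r][c] for c in range(2)]
           for r in range(2)]

dop_values = [degree_of_polarization(m) for m in (linear_x, unpolarized, mixture)]
print(dop_values)
```

The mixture carries half of its irradiance in the polarized component, and Eq. (22) returns a degree of polarization of one half accordingly.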

B. Mueller Calculus

In Mueller calculus the polarization state of light is represented by a four-element Stokes vector S. The Stokes parameters s0, s1, s2, and s3 are related to the coherency matrix elements or the quasi-monochromatic field representation through

$$ \begin{aligned} s_0 &= J_{xx} + J_{yy} = \langle E_x(t)^2 \rangle + \langle E_y(t)^2 \rangle \\ s_1 &= J_{xx} - J_{yy} = \langle E_x(t)^2 \rangle - \langle E_y(t)^2 \rangle \\ s_2 &= J_{xy} + J_{yx} = 2 \langle E_x(t) E_y(t) \cos(\phi) \rangle \\ s_3 &= i(J_{yx} - J_{xy}) = 2 \langle E_x(t) E_y(t) \sin(\phi) \rangle, \end{aligned} \tag{24} $$

where the angle brackets denote a time averaging required for nonmonochromatic light. Each Stokes parameter is related to the difference between light intensities of specified orthogonal pairs of polarization states. Thus, the Stokes vector is easily found by measuring the power Pt transmitted through six different polarizers. Specifically,

$$ \mathbf{S} = \begin{bmatrix} s_0 \\ s_1 \\ s_2 \\ s_3 \end{bmatrix} = \begin{bmatrix} P_{0} + P_{90} \\ P_{0} - P_{90} \\ P_{+45} - P_{-45} \\ P_{\mathrm{rcp}} - P_{\mathrm{lcp}} \end{bmatrix}, \tag{25} $$

so that s0 is the total power or irradiance of the light beam, s1 is the difference of the powers that pass through horizontal (along x) and vertical (along y) linear polarizers, s2 is the difference between +45° and −45° linearly polarized powers, and s3 is the difference between right- and left-circularly polarized powers. The values of the Stokes parameters are limited to s0² ≥ s1² + s2² + s3² and are often normalized so that s0 = 1 and −1 ≤ s1, s2, s3 ≤ 1. Table I lists normalized Stokes vectors for several polarization states.
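Equation (25) translates directly into a few lines of code. The sketch below assumes ideal polarizers; the function name and the six example powers, chosen to represent horizontally polarized light of unit power, are illustrative.

```python
def stokes_from_powers(p0, p90, p_plus45, p_minus45, p_rcp, p_lcp):
    """Stokes vector from six transmitted powers, following Eq. (25)."""
    return (p0 + p90, p0 - p90, p_plus45 - p_minus45, p_rcp - p_lcp)

# Assumed readings for unit-power light polarized along x with ideal polarizers:
# fully passed at 0 deg, blocked at 90 deg, and half passed by the other four.
stokes_x = stokes_from_powers(1.0, 0.0, 0.5, 0.5, 0.5, 0.5)
print(stokes_x)
```

The result matches the Table I entry for light linearly polarized along x.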

The degree of polarization [Eq. (22)] can be written in terms of Stokes parameters as

$$ P = \frac{\sqrt{s_1^2 + s_2^2 + s_3^2}}{s_0}. \tag{26} $$

Additionally, we can define the degree of linear polarization (the fraction of light in a linearly polarized state) by replacing the numerator of Eq. (26) with $\sqrt{s_1^2 + s_2^2}$, or the degree of circular polarization by replacing the numerator with s3.
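The three measures defined around Eq. (26) can be computed together. In this minimal sketch the function name and the example Stokes vector are assumptions chosen so that the results are round numbers.

```python
import math

def polarization_degrees(stokes):
    """Return (total, linear, circular) degrees of polarization."""
    s0, s1, s2, s3 = stokes
    total = math.sqrt(s1 * s1 + s2 * s2 + s3 * s3) / s0   # Eq. (26)
    linear = math.sqrt(s1 * s1 + s2 * s2) / s0
    circular = s3 / s0
    return total, linear, circular

# Assumed partially polarized state with a purely linear polarized part.
dop, dolp, docp = polarization_degrees((1.0, 0.3, 0.4, 0.0))
print(f"P = {dop:.2f}, linear = {dolp:.2f}, circular = {docp:.2f}")
```

Here half of the light is linearly polarized and none is circularly polarized, so the total and linear degrees of polarization coincide.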

An optical component that changes the incident polarization state from S to some output state S′ (through reflection, transmission, or scattering) can be described by a 4 × 4 Mueller matrix M. This transformation is given by

$$ \mathbf{S}' = \begin{bmatrix} s_0' \\ s_1' \\ s_2' \\ s_3' \end{bmatrix} = \mathbf{M}\mathbf{S} = \begin{bmatrix} m_{00} & m_{01} & m_{02} & m_{03} \\ m_{10} & m_{11} & m_{12} & m_{13} \\ m_{20} & m_{21} & m_{22} & m_{23} \\ m_{30} & m_{31} & m_{32} & m_{33} \end{bmatrix} \begin{bmatrix} s_0 \\ s_1 \\ s_2 \\ s_3 \end{bmatrix}, \tag{27} $$

where M can be a product of n cascaded components Mi using

$$ \mathbf{M} = \prod_{i=1}^{n} \mathbf{M}_i. \tag{28} $$

Matrix multiplication is not commutative and the product must be formed in the order that light reaches each component. For a system of three components in which the light is first incident on component 1 and ultimately exits component 3, S′ = M3M2M1S, for example.
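The importance of ordering can be seen by cascading two ideal linear polarizers in both orders. The sketch below uses the Table II form of the polarizer Mueller matrix; the helper names (`cascade`, `apply`, and so on) are hypothetical, and plain nested lists stand in for matrices.

```python
import math

def polarizer_mueller(theta_deg):
    """Mueller matrix of an ideal linear polarizer at angle theta."""
    c = math.cos(math.radians(2.0 * theta_deg))
    s = math.sin(math.radians(2.0 * theta_deg))
    rows = [[1.0, c, s, 0.0],
            [c, c * c, c * s, 0.0],
            [s, c * s, s * s, 0.0],
            [0.0, 0.0, 0.0, 0.0]]
    return [[0.5 * x for x in row] for row in rows]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def cascade(components):
    """Combine components listed in the order the light meets them."""
    total = [[float(i == j) for j in range(4)] for i in range(4)]
    for m in components:
        total = matmul(m, total)  # each later component multiplies from the left
    return total

def apply(m, stokes):
    return [sum(m[i][k] * stokes[k] for k in range(4)) for i in range(4)]

unpolarized = [1.0, 0.0, 0.0, 0.0]
p0, p45 = polarizer_mueller(0.0), polarizer_mueller(45.0)
out_a = apply(cascade([p0, p45]), unpolarized)  # 0 deg first, then 45 deg
out_b = apply(cascade([p45, p0]), unpolarized)  # 45 deg first, then 0 deg
print(out_a, out_b)
```

Both orders pass a quarter of the unpolarized input power, but the output is polarized at 45° in the first case and at 0° in the second, so the two products are different matrices.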

Examples of Mueller matrices for several homogeneous polarization components are given in Table II. The Mueller matrix for a component can be experimentally obtained by measuring S′ for at least 16 judiciously selected S inputs, and procedures for measurement and data reduction are well developed.

C. Jones Calculus

In Jones calculus a two-element vector represents the amplitude and phase of the orthogonal electric field components and the phase information is preserved during calculation. This allows the coherent superposition of waves and is useful for describing the polarization state in systems such as interferometers that combine beams. Since this method is based on coherent waves, however, the Jones vector describes only fully polarized states, and partially or unpolarized states and depolarizing components cannot be represented.

TABLE I Matrix Representations of Selected Polarization States^a
(Matrices are written row by row; commas separate entries and semicolons separate rows. Stokes and Jones vectors are column vectors written as rows.)

Linear along x (α = β = 0; tan ε = 0)
    Coherency matrix: I [1, 0; 0, 0]
    Stokes vector: [1, 1, 0, 0]
    Jones vector: [1, 0]

Linear along y (α = β = 90°; tan ε = 0)
    Coherency matrix: I [0, 0; 0, 1]
    Stokes vector: [1, −1, 0, 0]
    Jones vector: [0, 1]

Linear at +45° (α = β = 45°; tan ε = 0)
    Coherency matrix: (1/2) I [1, 1; 1, 1]
    Stokes vector: [1, 0, 1, 0]
    Jones vector: (1/√2) [1, 1]

General linear (−90° < α < 90°; tan ε = 0)
    Coherency matrix: I [cos²α, sin α cos α; sin α cos α, sin²α]
    Stokes vector: [1, cos 2α, sin 2α, 0]
    Jones vector: [cos α, sin α]

Right circular (tan ε = 1; φ = 90°; β = 45°)
    Coherency matrix: (1/2) I [1, i; −i, 1]
    Stokes vector: [1, 0, 0, 1]
    Jones vector: (1/√2) [1, i]

Left circular (tan ε = −1; φ = −90°; β = 45°)
    Coherency matrix: (1/2) I [1, −i; i, 1]
    Stokes vector: [1, 0, 0, −1]
    Jones vector: (1/√2) [1, −i]

General elliptical
    Stokes vector: [1, cos 2ε cos 2α, cos 2ε sin 2α, sin 2ε]
    Jones vector: [cos β e^(−iφ/2), sin β e^(iφ/2)]

Unpolarized
    Coherency matrix: (1/2) I [1, 0; 0, 1]
    Stokes vector: [1, 0, 0, 0]
    Jones vector: None

a The parameters α, β, ε, and φ are defined corresponding to elliptical light as discussed in Section I. Extensive lists of Stokes and Jones vectors are available in several texts.
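The Stokes and Jones columns of Table I can be cross-checked with a short conversion routine. This is a sketch under one stated assumption: the sign of s3 is chosen so that the right-circular Jones vector listed in Table I, (1/√2)[1, i], maps to s3 = +1. The function name is illustrative.

```python
import math

def jones_to_stokes(ex, ey):
    """Stokes parameters of a fully polarized state from Jones components."""
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    s1 = abs(ex) ** 2 - abs(ey) ** 2
    cross = ex.conjugate() * ey
    s2 = 2.0 * cross.real
    # Sign convention assumed so that [1, i]/sqrt(2) gives s3 = +1 (Table I).
    s3 = 2.0 * cross.imag
    return (s0, s1, s2, s3)

r = 1.0 / math.sqrt(2.0)
table_rows = {
    "linear x": jones_to_stokes(1.0, 0.0),
    "linear +45": jones_to_stokes(r, r),
    "right circular": jones_to_stokes(r, 1j * r),
    "left circular": jones_to_stokes(r, -1j * r),
}
for name, stokes in table_rows.items():
    print(name, [round(v, 6) for v in stokes])
```

Because a Jones vector always yields s0² = s1² + s2² + s3², this conversion can only produce fully polarized Stokes vectors, in line with the limitation noted above.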

Recalling Eqs. (1a) and (1b), one can write a vector formulation of the complex representation for a fully coherent field,

$$\mathbf{E} = e^{i\omega t} \begin{bmatrix} E_x e^{i\phi_x} \\ E_y e^{i\phi_y} \end{bmatrix}, \tag{29}$$

where the space-dependent term kz has been omitted. When the time dependence is also omitted, this vector is known as the full Jones vector. For generality, the Jones vector J is often written in a normalized form,

$$\mathbf{J} = \begin{bmatrix} \cos\beta \\ \sin\beta\, e^{i\phi} \end{bmatrix} = \begin{bmatrix} \cos\beta\, e^{-i\phi/2} \\ \sin\beta\, e^{i\phi/2} \end{bmatrix}, \tag{30}$$

where φ = φy − φx and tan(β) = Ey/Ex. The Jones vector can also be found from the polarization azimuthal angle α and ellipticity tan(ε) of the polarization ellipse using

$$\phi = \tan^{-1}\left[\frac{\tan(2\varepsilon)}{\sin(2\alpha)}\right], \qquad \beta = \tfrac{1}{2}\cos^{-1}\left[\cos(2\varepsilon)\cos(2\alpha)\right]. \tag{31}$$

Table I provides examples of Jones vectors for several polarization states.

TABLE II Matrix Representation of Optical Components

(Matrix rows are separated by semicolons.)

Component | Mueller matrix | Jones matrix
Linear diattenuator with maximum (minimum) transmission p1² (p2²), or absorber (if p = p1 = p2) | (1/2) [p1 + p2, p1 − p2, 0, 0; p1 − p2, p1 + p2, 0, 0; 0, 0, 2√(p1 p2), 0; 0, 0, 0, 2√(p1 p2)] | [p1, 0; 0, p2]
Linear polarizer at 0° | (1/2) [1, 1, 0, 0; 1, 1, 0, 0; 0, 0, 0, 0; 0, 0, 0, 0] | [1, 0; 0, 0]
Linear polarizer at an angle θ | (1/2) [1, cos 2θ, sin 2θ, 0; cos 2θ, cos² 2θ, cos 2θ sin 2θ, 0; sin 2θ, cos 2θ sin 2θ, sin² 2θ, 0; 0, 0, 0, 0] | [cos²θ, sin θ cos θ; sin θ cos θ, sin²θ]
Half-wave (δ = 180°) linear retarder with the fast axis at 0° | [1, 0, 0, 0; 0, 1, 0, 0; 0, 0, −1, 0; 0, 0, 0, −1] | [1, 0; 0, −1]
Quarter-wave (δ = 90°) linear retarder with the fast axis at 0° | [1, 0, 0, 0; 0, 1, 0, 0; 0, 0, 0, 1; 0, 0, −1, 0] | [e^{iπ/4}, 0; 0, e^{−iπ/4}]
General linear retarder: retardance δ, fast axis at angle β from x axis | [1, 0, 0, 0; 0, cos 4β sin²(δ/2) + cos²(δ/2), sin 4β sin²(δ/2), −sin 2β sin δ; 0, sin 4β sin²(δ/2), −cos 4β sin²(δ/2) + cos²(δ/2), cos 2β sin δ; 0, sin 2β sin δ, −cos 2β sin δ, cos δ] | [e^{iδ/2} cos²β + e^{−iδ/2} sin²β, i sin 2β sin(δ/2); i sin 2β sin(δ/2), e^{−iδ/2} cos²β + e^{iδ/2} sin²β]
Right circular retardance δ, or rotator with θ = δ/2 | [1, 0, 0, 0; 0, cos δ, sin δ, 0; 0, −sin δ, cos δ, 0; 0, 0, 0, 1] | [cos(δ/2), sin(δ/2); −sin(δ/2), cos(δ/2)]
Mirror | [1, 0, 0, 0; 0, 1, 0, 0; 0, 0, −1, 0; 0, 0, 0, −1] | [1, 0; 0, −1]
Faraday mirror | [1, 0, 0, 0; 0, −1, 0, 0; 0, 0, 1, 0; 0, 0, 0, −1] | [0, −1; −1, 0]
Depolarizer | [1, 0, 0, 0; 0, 0, 0, 0; 0, 0, 0, 0; 0, 0, 0, 0] | None
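As a numerical illustration of Eq. (31), the following minimal sketch converts an azimuth α and ellipticity angle ε into the Jones-vector parameters φ and β of Eq. (30). The function name is hypothetical, and the use of a two-argument arctangent (so that the circular states with ε = ±45° stay finite) is an assumption of this sketch rather than part of the text; it may select a different branch from a plain arctangent.

```python
import math

def ellipse_to_jones_angles(alpha_deg, eps_deg):
    """Return (phi_deg, beta_deg) of Eq. (30) from azimuth alpha and ellipticity angle eps."""
    two_a = math.radians(2.0 * alpha_deg)
    two_e = math.radians(2.0 * eps_deg)
    # tan(phi) = tan(2 eps) / sin(2 alpha), written as sin(2 eps) / (cos(2 eps) sin(2 alpha))
    # so that atan2 can handle eps = +/-45 degrees (assumption of this sketch).
    phi = math.atan2(math.sin(two_e), math.cos(two_e) * math.sin(two_a))
    beta = 0.5 * math.acos(math.cos(two_e) * math.cos(two_a))
    return math.degrees(phi), math.degrees(beta)

# Right-circular light (eps = 45 degrees) should reproduce the Table I entry phi = 90, beta = 45.
phi_rc, beta_rc = ellipse_to_jones_angles(0.0, 45.0)
```

For linear light at +45° (α = 45°, ε = 0) the same function gives φ = 0 and β = 45°, in agreement with Table I.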

The polarization properties of optical components can be represented as 2×2 Jones matrices (Table II). The output polarization state is J′ = MJ, where the Jones matrix M may be constructed from a cascade of components Mi using Eq. (28). In general the matrices do not commute, so they must be ordered as in Mueller calculus, with the rightmost matrix representing the first element on which the light is incident, and so on.
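The relation J′ = MJ and the ordering rule can be sketched in a few lines. The example below is illustrative only: the helper names are hypothetical, the matrices are the general linear retarder and linear polarizer entries of Table II, and plain nested lists stand in for a matrix library.

```python
import cmath
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(m, v):
    """Apply a 2x2 Jones matrix to a Jones vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def linear_retarder(delta_deg, beta_deg):
    """General linear retarder of Table II: retardance delta, fast axis at beta."""
    half = math.radians(delta_deg) / 2.0
    b = math.radians(beta_deg)
    c2, s2 = math.cos(b) ** 2, math.sin(b) ** 2
    off = 1j * math.sin(2.0 * b) * math.sin(half)
    return [[cmath.exp(1j * half) * c2 + cmath.exp(-1j * half) * s2, off],
            [off, cmath.exp(-1j * half) * c2 + cmath.exp(1j * half) * s2]]

def linear_polarizer(theta_deg):
    """Linear polarizer of Table II with transmission axis at theta."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c * c, s * c], [s * c, s * s]]

# x-polarized light through a quarter-wave plate with its fast axis at +45 degrees.
x_polarized = [1, 0]
qwp_45 = linear_retarder(90.0, 45.0)
out = mat_vec(qwp_45, x_polarized)  # close to (1/sqrt(2)) [1, i], right circular in Table I

# Cascade: the retarder is met first, so it is the rightmost factor.
system = mat_mul(linear_polarizer(0.0), qwp_45)
reversed_system = mat_mul(qwp_45, linear_polarizer(0.0))  # a different matrix
```

Comparing `system` with `reversed_system` shows directly that the two orderings describe different optical systems.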


Jones used this calculus to establish three theorems that describe the minimum number of optical elements needed to describe a cascade of many elements at a given wavelength:

1. A system of any number of linear retarders and rotators (circular retarders) can be reduced to a system composed of only one retarder and one rotator.

2. A system of any number of partial polarizers and rotators can be reduced to a system composed of only one partial polarizer and one rotator.

3. A system of any number of retarders, partial polarizers, and rotators can be reduced to a system composed of only two retarders, one partial polarizer, and, at most, one rotator.

The Jones matrices in Table II assume forward propagation. In some cases, for example with nonreciprocal components such as Faraday rotators, backward propagation must be described explicitly. Furthermore, since fields are used to represent polarization states, the phase shift arising from normal-incidence reflection may be important. For propagation in reciprocal media, the transformation from the forward Jones matrix to the backward case is given by

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}_{\text{forward}} \rightarrow \begin{bmatrix} a & -c \\ -b & d \end{bmatrix}_{\text{backward}}. \tag{32}$$

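For nonreciprocal behavior, such as the Faraday effect, the transformation is instead

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}_{\text{forward}} \rightarrow \begin{bmatrix} a & -b \\ -c & d \end{bmatrix}_{\text{backward}}. \tag{33}$$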
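The two transformations, Eqs. (32) and (33), differ only in whether the off-diagonal elements are swapped. A minimal sketch follows, with hypothetical function names and the rotator of Table II used as a test case.

```python
import math

def backward_reciprocal(m):
    """Eq. (32): transpose the matrix and negate the off-diagonal elements."""
    (a, b), (c, d) = m
    return [[a, -c], [-b, d]]

def backward_nonreciprocal(m):
    """Eq. (33): negate the off-diagonal elements without swapping them."""
    (a, b), (c, d) = m
    return [[a, -b], [-c, d]]

def rotator(theta_deg):
    """Rotator Jones matrix in the form listed in Table II (theta = delta/2)."""
    t = math.radians(theta_deg)
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

forward = rotator(30.0)
# For this matrix Eq. (32) returns the forward matrix unchanged,
# while Eq. (33) returns the matrix of a rotator with angle -30 degrees.
recip_back = backward_reciprocal(forward)
faraday_back = backward_nonreciprocal(forward)
```

When a cascade mixes both kinds of element, each factor would be transformed with the appropriate function before the backward product is formed.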

When M is composed of a cascade of Mi that includes both reciprocal and nonreciprocal polarization elements, each matrix must be transformed and a new combined matrix calculated. Upon reflection, the light is backward propagating, and the Jones vector can be transformed to the forward-propagating form (for direct comparison with the input vector, for example) by changing the sign of its second element; in other words,

$$\mathbf{J}_{\text{forward}} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \mathbf{J}_{\text{backward}}. \tag{34}$$

The calculi discussed above are applicable to problems in which the polarization properties are lumped, that is, the system consists of simple components such as ideal waveplates, rotators, and polarizers. Because the Jones (or Mueller) matrix of a cascade depends on the order of multiplication, an optical component with intermixed polarization properties cannot generally be represented by simply multiplying the matrices that represent each individual property. For example, a component in which linear retardance (represented by Jones matrix ML) and circular retardance (MC) are both distributed throughout the element is not properly represented by either MLMC or MCML.

A method known as the Jones N-matrix formulation can be used to find a single Jones matrix that properly describes the distribution of multiple polarization properties. The N-matrix represents the desired property over a vanishingly small optical path. The differential N-matrices for each desired property can be summed and the combined properties found by an integration along the optical path. Tables of N-matrices and algorithms for calculating corresponding Jones matrices can be found in several references.

Jones and Mueller matrices can be related to each other under certain conditions. Jones matrices differing only in absolute phase (in other words, a phase common to both orthogonal eigenpolarizations) can be transformed into a unique Mueller matrix that will have up to seven independent elements, though the phase information will be lost. Thus Mueller matrices for distributed polarization properties can be derived from Jones matrices calculated using N-matrices. Conversely, nondepolarizing Mueller matrices [which satisfy the condition Tr(MM^T) = 4m00², where M^T is the transpose of M] can be transformed into a Jones matrix.
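The nondepolarizing condition can be checked numerically. The sketch below assumes the condition in its usual form, Tr(MM^T) = 4m00² with m00 the upper-left element, and uses the fact that Tr(MM^T) equals the sum of the squares of all sixteen elements; the function name and tolerance are hypothetical.

```python
def is_nondepolarizing(m, tol=1e-9):
    """Test Tr(M M^T) = 4 * m00**2 for a 4x4 Mueller matrix given as nested lists."""
    trace_mmt = sum(x * x for row in m for x in row)  # Tr(M M^T) = sum of squared elements
    return abs(trace_mmt - 4.0 * m[0][0] ** 2) <= tol

# Entries taken from Table II.
polarizer_0 = [[0.5, 0.5, 0, 0], [0.5, 0.5, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
half_wave_0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
depolarizer = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

checks = {
    "polarizer": is_nondepolarizing(polarizer_0),
    "half-wave retarder": is_nondepolarizing(half_wave_0),
    "depolarizer": is_nondepolarizing(depolarizer),
}
```

The polarizer and retarder pass the test and therefore have Jones equivalents, whereas the depolarizer fails it, consistent with the "None" entry in its Jones column.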

D. Poincaré Sphere

The Poincaré sphere provides a visual method for representing polarization states and calculating the effects of polarizing components. Each state of polarization is represented by a unique point on the sphere defined by its azimuthal angle α, the ellipticity tan |ε|, and the handedness. Orthogonal polarizations occupy points at opposite ends of a sphere diameter. Propagation through retarders is represented by a sphere rotation that translates the polarization state from an initial point to a final polarization.

Figure 10 shows a Poincaré sphere with several polarizations labeled. Point x represents linear polarization along the x axis and point y represents y-polarized light. Right-circular polarization (tan ε = 1) lies at the north pole, and all polarizations above the equator are right-elliptical. Similarly, the south pole represents left-circular polarization (tan ε = −1), and states below the equator are left-elliptically polarized. (In many texts the locations of the circular states are reversed; while a source of confusion, this change is valid so long as other conventions are observed.)

In Fig. 10, a general polarization state with azimuthal angle α and ellipticity angle ε is represented by the point p with longitude 2α and latitude 2ε. Linear polarizations have zero ellipticity (tan |ε| = 0) and are located along the equator. A linear polarization with azimuthal angle α from the x axis is located at a longitudinal angle 2α along the equator from point x. Polarization states that lie upon a circle parallel to the equator have the same ellipticity but different orientations. Polarizations at opposite ends of a diameter have the same ellipticity, perpendicular azimuthal angles, and opposite handedness.

FIGURE 10 The Poincaré sphere. The polarization represented by point p is located using the azimuthal angle α (in the equatorial plane measured from point x) and the ellipticity angle ε (a meridional angle measured from the equator toward the north pole). Linear polarization along the x axis is located at point x, linear polarization along the y axis is represented by point y, and rcp and lcp denote right- and left-circularly polarized states, respectively. The origin represents unpolarized light.

The Poincaré sphere can also be used to show the effect of a retarder on an incident polarization state. A retarder oriented with a fast axis at α and an ellipticity and handedness given by tan ε can be represented by a point R on the sphere located at angles 2α and 2ε. For a given input polarization represented by point p, a circle centered at point R that includes point p is the locus of the output polarization states possible for all retardance magnitudes. A specific retardance magnitude δ is represented by a clockwise arc of angle δ along the circle from the point p. The endpoint of this arc represents the polarization state output from the retarder.

Consider x-polarized light incident on a quarter-wave linear retarder oriented with its fast axis at +45° from horizontal; using Jones calculus, we find that right-circular polarization should exit the waveplate. To show this graphically using the Poincaré sphere, we locate the point +45°, which represents the retarder orientation. The initial polarization is at point x; for a retardance δ = 90°, we trace a clockwise arc centered at the point +45° that subtends 90° from point x. This arc ends at the north pole, so the resulting output is right-circular polarization. If the retardance were δ = 180°, the arc would subtend 180°, and the output light would be y-polarized. Similarly, left-circular polarization results if δ = 270° (or if δ = 90° and the fast axis is oriented at −45°). The evolution of the polarization through additional components can be traced by locating each retarder's representation on the sphere, defining a circle centered on this point that passes through the polarization output from the previous retarder, and tracing a new arc through an angle equal to the retardance.

Comparing the Poincaré sphere definitions to Eq. (25) shows that for normalized Stokes vectors (s0 = 1), each vector element corresponds to a point along Cartesian axes centered at the sphere's origin. Stokes element s1 (= cos 2ε cos 2α) falls along the axis between the x- and y-polarized points; s1 = 1 corresponds to point x and s1 = −1 corresponds to point y. Values of s2 (= cos 2ε sin 2α) correspond to points along the diameter connecting the ±45° linear polarization points; s2 = −1 corresponds to the −45° point. Element s3 (= sin 2ε) is along the axis between the north and south poles. These projections on the Poincaré sphere can be equivalently represented by rewriting Eq. (25) and normalizing to obtain

$$\begin{bmatrix} s_0 \\ s_1 \\ s_2 \\ s_3 \end{bmatrix} = \begin{bmatrix} 1 \\ \cos(2\varepsilon)\cos(2\alpha) \\ \cos(2\varepsilon)\sin(2\alpha) \\ \sin(2\varepsilon) \end{bmatrix}. \tag{35}$$

Any fully polarized state on the surface of the sphere can be found using these Cartesian coordinates. Partially polarized states will map to a point within the sphere, and unpolarized light is represented by the origin.
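Equation (35) translates directly into code. In the sketch below the function name is hypothetical, and the optional `dop` argument, which scales s1 to s3 so that a partially polarized state lands inside the sphere at a radius equal to its degree of polarization, is an assumption added for illustration.

```python
import math

def stokes_from_ellipse(alpha_deg, eps_deg, dop=1.0):
    """Normalized Stokes vector [s0, s1, s2, s3] from azimuth alpha and ellipticity angle eps."""
    two_a = math.radians(2.0 * alpha_deg)
    two_e = math.radians(2.0 * eps_deg)
    return [1.0,
            dop * math.cos(two_e) * math.cos(two_a),   # axis through points x and y
            dop * math.cos(two_e) * math.sin(two_a),   # axis through the +/-45 degree points
            dop * math.sin(two_e)]                     # polar axis

def sphere_radius(s):
    """Distance of the state from the sphere's origin."""
    return math.sqrt(s[1] ** 2 + s[2] ** 2 + s[3] ** 2)

x_state = stokes_from_ellipse(0.0, 0.0)        # point x on the equator
rcp_state = stokes_from_ellipse(0.0, 45.0)     # north pole
partial = stokes_from_ellipse(30.0, 10.0, dop=0.4)
```

Here `sphere_radius(partial)` returns 0.4 (to rounding), placing the state inside the unit sphere, while fully polarized states lie on its surface.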

V. POLARIMETRY

Polarimetry is the measurement of a light wave's polarization state, or the characterization of an optical component's or material's polarization properties. Complete polarimeters measure the full Stokes vector of an optical beam or measure the full Mueller matrix of a sample. In many cases, however, some characteristics can be neglected and the measurement of all Stokes or Mueller elements is not necessary. Incomplete polarimeters measure a subset of characteristics and may be used when simplifying assumptions about the light wave (for example, that the degree of polarization is 1) or sample (for example, that a retarder exhibits negligible diattenuation or depolarization) are appropriate. In this section, a few techniques are briefly described for illustration.

A. Light Measurement

A polarization analyzer, or light-measuring polarimeter, characterizes the polarization properties of an optical beam. An optical beam's Stokes vector can be completely characterized by measuring the six optical powers listed in Eq. (25) using ideal polarizers. When the optical beam's properties are time invariant, the measurements can be performed sequentially by measuring the power transmitted through four orientations of a linear polarizer and two additional measurements with a quarter-wave retarder (oriented ±45° with respect to the polarizer's axis) placed before the polarizer. In practice, as few as four measurements are required since s2 = 2P+45 − s0 and s3 = 2Prcp − s0.
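The four-measurement reduction can be sketched as follows. The relations s2 = 2P+45 − s0 and s3 = 2Prcp − s0 are quoted from the text; the relations s0 = P0 + P90 and s1 = P0 − P90 are assumed here to be the corresponding parts of Eq. (25), which is not reproduced in this section. Function names are hypothetical, and ideal polarizers are assumed.

```python
import math

def stokes_from_powers(p0, p90, p45, prcp):
    """Stokes vector from four powers measured behind ideal polarizers."""
    s0 = p0 + p90          # assumed form of Eq. (25)
    s1 = p0 - p90          # assumed form of Eq. (25)
    s2 = 2.0 * p45 - s0    # as stated in the text
    s3 = 2.0 * prcp - s0   # as stated in the text
    return [s0, s1, s2, s3]

def degree_of_polarization(s):
    """Ratio of polarized power to total power."""
    return math.sqrt(s[1] ** 2 + s[2] ** 2 + s[3] ** 2) / s[0]

# Unit-power right-circular light: half the power passes any linear polarizer,
# all of it passes a right-circular polarizer.
rcp = stokes_from_powers(0.5, 0.5, 0.5, 1.0)

# Unit-power unpolarized light: half the power passes every ideal polarizer.
unpolarized = stokes_from_powers(0.5, 0.5, 0.5, 0.5)
```

The degree of polarization evaluates to 1 for the first beam and 0 for the second, which is a quick consistency check on a measured data set.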

The Stokes vector can alternatively be measured with a single circular polarizer made by combining a quarter-wave plate (with the fast axis at 45°) with a linear polarizer. Prcp is measured when the retarder side faces the source. Flipping the device so that the retarder faces the detector allows measurement of P0, P90, and P±45.

The Stokes vector elements can be measured simultaneously with multiple detector configurations. Division-of-amplitude polarimeters use beamsplitters to direct fractions of the power to appropriate polarization analyzers. In division-of-wavefront polarization analyzers, the polarization is assumed to be uniform over the optical beam, and subdivisions of the beam's cross section are directed to appropriate analyzers.

Incomplete light-measuring polarimeters are useful when the light is fully polarized (degree of polarization approaches 1). For example, the ellipticity magnitude and azimuth can be found by analyzing the light with a rotating linear polarizer and measuring the minimum and maximum transmitted powers. Linear polarization yields a detected signal with maximum modulation, while minimum modulation occurs for circular polarization. The handedness of the ellipticity can be found using a right- (or left-) circular polarizer.

These methods are photometric, and accurate optical power measurements are required to determine the light characteristics. Before the availability of photodetectors, null methods that rely on adjusting system settings until light transmission is minimized were developed, and these are still useful today. For example, an incomplete polarimetric null system for analyzing polarized light uses a calibrated Babinet–Soleil compensator followed by a linear polarizer. Adjusting both the retardance δ and the angle θ between the fast axis and polarizer axis until the transmitted power is zero yields the ellipticity angle ω (using sin 2ω = sin 2θ sin δ) and azimuthal angle α (using tan α = tan 2θ cos δ). When unpolarized light is present, the minimum transmission is not zero, and photometric measurement of this power can be used to obtain the degree of polarization.

B. Sample Measurement

A polarization generator is used to illuminate the sample with known states of polarization to measure the sample's polarization properties. The reflected or transmitted light is then characterized by a polarization analyzer, and the properties of the sample are inferred from changes between the input and output states.

A common configuration for determining the Mueller matrix combines a fixed linear polarizer and a rotating quarter-wave retarder for polarization generation with a rotating quarter-wave retarder followed by a fixed linear polarizer for analysis. Power is measured as the two retarders are rotated at different rates (one rotates five times faster than the other), and the Mueller matrix elements are found from Fourier analysis of the resulting time series. Alternatively, measurements can be taken at 16 (or more) specific combinations of generator and analyzer states, typically with the polarizers fixed and at specified retarder orientations. Data reduction techniques have been developed for efficiently determining the Mueller matrix from such measurements. Several methods include measurements at additional generator/analyzer combinations to overdetermine the matrix; least-squares techniques are then applied to reduce the influence of nonideal system components and decrease measurement error.

Because of their simplicity and reduced number of variables, incomplete polarimeters can often provide a more accurate measurement of a single polarization property when other characteristics are negligible. For example, there are many methods for measuring linear retardance in samples with negligible circular retardance, diattenuation, and depolarization, and these are often applicable to measurements of high-quality waveplates.

In a rotating analyzer system, the retarder is placed between two linear polarizers so that the input polarization bisects the retarder's birefringence axes. Linear retardance is calculated from measurements of the transmitted power when the analyzer is parallel (P0) and perpendicular (P90) to the input polarizer using |δ| = cos−1[(P0 − P90)/(P0 + P90)]. In this measurement, retardance is limited to two quadrants (for example, measurements of 90° and 270° = −90° retarders will both yield δ = 90°). If a biasing quarter-wave retarder is placed between the input polarizer and retarder and both retarders are aligned with the fast axis at 45°, retardance in quadrants 1 and 4 (|δ| ≤ 90°) can be measured from δ = sin−1[(P90 − P0)/(P90 + P0)]. There are several null methods, including those that use a variable compensator aligned with the retarder at 45° between crossed polarizers (retardance is measured by adjusting a calibrated compensator until no light is detected) or that use a fixed quarter-wave-biasing retarder and rotate the polarizer and/or analyzer until a null is obtained.
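The two photometric retardance formulas quoted above can be evaluated as in this minimal sketch. The function names are hypothetical, and the clamping of the power ratio to the interval [−1, 1], which guards against measurement noise pushing the ratio slightly out of range, is an added assumption.

```python
import math

def _ratio(numerator, denominator):
    """Power ratio clamped to [-1, 1] so that acos/asin stay defined with noisy data."""
    return max(-1.0, min(1.0, numerator / denominator))

def retardance_rotating_analyzer(p0, p90):
    """|delta| in degrees from cos^-1[(P0 - P90)/(P0 + P90)]."""
    return math.degrees(math.acos(_ratio(p0 - p90, p0 + p90)))

def retardance_with_bias(p0, p90):
    """delta in degrees from sin^-1[(P90 - P0)/(P90 + P0)], valid for |delta| <= 90."""
    return math.degrees(math.asin(_ratio(p90 - p0, p90 + p0)))

# Equal parallel and crossed powers correspond to a quarter-wave (90 degree) retarder
# in the unbiased arrangement and to zero retardance in the biased one.
quarter_wave = retardance_rotating_analyzer(0.5, 0.5)
biased_zero = retardance_with_bias(0.5, 0.5)
```

Because only the ratio of powers enters, the source power and detector responsivity cancel, but any stray light or polarizer leakage adds directly to both readings.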

Ellipsometry is a related technique that allows the measurement of isotropic optical properties of surfaces and thin films from the polarization change induced upon reflection. Linearly polarized light is directed toward the sample at known incidence angles, and the reflected light is analyzed to determine its polarization ellipse.

Application of electromagnetic models to the configuration (for example, via Fresnel equations) allows one to calculate the refractive index, extinction coefficient, and film thickness from the measured ellipticities. Ellipsometry can be extended to other configurations using various incident polarizations and polarization analyzers to measure polarimetric quantities, blurring any distinction between ellipsometry and polarimetry.

SEE ALSO THE FOLLOWING ARTICLES

ELECTROMAGNETICS • LIGHT SOURCES • OPTICAL DIFFRACTION • WAVE PHENOMENA

BIBLIOGRAPHY

Anonymous (1984). “Polarization: Definitions and Nomenclature, Instrument Polarization,” International Commission on Illumination, Paris.

Azzam, R. M. A., and Bashara, N. M. (1997). “Ellipsometry and Polarized Light,” North Holland, Amsterdam.

Bennett, J. M. (1995). “Polarization.” In “Handbook of Optics,” Vol. 1, pp. 5.1–5.30 (Bass, M., ed.), McGraw-Hill, New York.

Bennett, J. M. (1995). “Polarizers.” In “Handbook of Optics,” Vol. 2, pp. 3.1–3.70 (Bass, M., ed.), McGraw-Hill, New York.

Born, M., and Wolf, E. (1980). “Principles of Optics,” Pergamon Press, Oxford.

Chipman, R. A. (1995). “Polarimetry.” In “Handbook of Optics,” Vol. 1, pp. 22.1–22.37 (Bass, M., ed.), McGraw-Hill, New York.

Collett, E. (1993). “Polarized Light: Fundamentals and Applications,” Marcel Dekker, New York.

Hecht, E., and Zajac, A. (1979). “Optics,” Addison-Wesley, Reading, MA.

Kliger, D. S., Lewis, J. W., and Randall, C. E. (1990). “Polarized Light in Optics and Spectroscopy,” Academic Press, San Diego, CA.

Yariv, A., and Yeh, P. (1984). “Optical Waves in Crystals,” Wiley, New York.


Encyclopedia of Physical Science and Technology EN013D-648 July 26, 2001 20:28

Radiometry and Photometry
Ross McCluney
Florida Solar Energy Center

I. Background
II. Radiometry
III. Photometry
IV. Commonly Used Geometric Relationships
V. Principles of Flux Transfer
VI. Sources
VII. Optical Properties of Materials
VIII. The Detection of Radiation
IX. Radiometers and Photometers, Spectroradiometers, and Spectrophotometers
X. Calibration of Radiometers and Photometers

GLOSSARY

Illuminance, Ev The area density of luminous flux, the luminous flux per unit area at a specified point in a specified surface that is incident on, passing through, or emerging from that point in the surface (unit: lm · m^−2 = lux).

Irradiance, Ee The area density of radiant flux, the radiant flux per unit area at a specified point in a specified surface that is incident on, passing through, or emerging from that point in the surface (unit: watt · m^−2).

Luminance, Lv The area and solid angle density of luminous flux, the luminous flux per unit projected area and per unit solid angle incident on, passing through, or emerging from a specified point in a specified surface, and in a specified direction in space (units: lumen · m^−2 · sr^−1 = cd · m^−2).

Luminous efficacy, Kr The ratio of luminous flux in lumens to radiant flux (total radiation) in watts in a beam of radiation (units: lumen/watt).

Luminous flux, Φv The V(λ)-weighted integral of the spectral flux Φλ over the visible spectrum (unit: lumen).

Luminous intensity, Iv The solid angle density of luminous flux, the luminous flux per unit solid angle incident on, passing through, or emerging from a point in space and propagating in a specified direction (units: lm · sr^−1 = cd).

Photopic spectral luminous efficiency function, V(λ) The standardized relative spectral response of a human observer under photopic (cone vision) conditions over the wavelength range of visible radiation.

Projected area, Ao Unidirectional projection of the area bounded by a closed curve in a plane onto another plane making some angle θ to the first plane.

Radiance, Le The area and solid angle density of radiant flux, the radiant flux per unit projected area and per unit solid angle incident on, passing through, or emerging from a specified point in a specified surface, and in a specified direction in space (units: watt · m^−2 · sr^−1).

Radiant flux, Φe The time rate of flow of radiant energy (unit: watt).

Radiant intensity, Ie The solid angle density of radiant flux, the radiant flux per unit solid angle incident on, passing through, or emerging from a point in space and propagating in a specified direction (units: watt · sr^−1).

Solid angle, Ω The area A on a sphere of the radial projection of a closed curve in space onto that sphere, divided by the square r² of the radius of that sphere.

Spectral radiometric quantities The spectral “concentration” of quantity Q, denoted Qλ, is the derivative dQ/dλ of the quantity with respect to wavelength λ, where “Q” is any one of: radiant flux, irradiance, radiant intensity, or radiance.

RADIOMETRY is a system of language, mathematical formulations, and instrumental methodologies used to describe and measure the propagation of radiation through space and materials. The radiation so studied is normally confined to the ultraviolet (UV), visible (VIS), and infrared (IR) parts of the spectrum, but the principles are applicable to radiant energy of any form that propagates in space and interacts with matter in known ways, similar to those of electromagnetic radiation. This includes other parts of the electromagnetic spectrum and radiation composed of a flow of particles whose trajectories follow known laws of ray optics through space and through materials. Radiometric principles are applied to beams of radiation at a single wavelength or to those composed of a broad range of wavelengths. They can also be applied to radiation diffusely scattered from a surface or volume of material. Application of these principles to radiation propagating through absorbing and scattering media generally leads to mathematically sophisticated and complex treatments when high precision is required. That important topic, called radiative transfer, is not treated in this article.

Photometry is a subset of radiometry and deals only with radiation in the visible portion of the spectrum. Photometric quantities are defined in such a way that they incorporate the variations in spectral sensitivity of the human eye over the visible spectrum, as a spectral weighting function built into their definition.

In determining spectrally broadband radiometric quantities, no spectral weighting function is used (or one may consider that a weighting “function” of unity (1.0) is applied at all wavelengths).

The scope of this treatment is limited to definitions of the primary quantities in radiometry and photometry, the derivations of several useful relationships between them, the rudiments of setting up problems in radiation transfer, short discussions of material properties in a radiometric context, and a very brief discussion of electronic detectors of electromagnetic radiation. The basic design of radiometers and photometers and the principles of their calibration are described as well.

Until the latter third of the 20th century, the fields of radiometry and photometry developed somewhat independently. Photometry was beset with a large variety of different quantities, names of those quantities, and units of measurement. In the 1960s and 1970s several authors contributed articles aimed at bringing order to the apparent confusion. Also, the International Lighting Commission (CIE, Commission Internationale de l'Éclairage) and the International Electrotechnical Commission (CEI, Commission Electrotechnique Internationale) worked to standardize a consistent set of symbols, units, and nomenclature, culminating in the International Lighting Vocabulary, jointly published by the CIE and the CEI. The recommendations of that publication are followed here. The CIE has become the primary international authority on terminology and basic concepts in radiometry and photometry.

I. BACKGROUND

A. Units and Nomenclature

Radiant flux is defined as the time rate of flow of energy through space. It is given the Greek symbol Φ and the metric unit watt (a joule of energy per second). An important characteristic of radiant flux is its distribution over the electromagnetic spectrum, called a spectral distribution or spectrum. The Greek symbol λ is used to symbolize the wavelength of monochromatic radiation, radiation having only one frequency and wavelength. The unit of wavelength is the meter, or a submultiple of the meter, according to the rules of the Système International, the international system of units (the metric system). The unit of frequency is the hertz (abbreviated Hz), defined to be a cycle (or period) per second. The symbol for frequency is the Greek ν. The relationship between frequency ν and wavelength λ is shown in the equation

$$\lambda\nu = c, \tag{1}$$

where c is the speed of propagation in the medium (more familiarly called the “speed of light”). The spectral concentration of radiant flux at (or around) a given wavelength λ is given the symbol Φλ, the name spectral radiant flux, and the units watts per unit wavelength. An example of this is the watt per nanometer (abbreviated W/nm). The names, definitions, and units of additional radiometric quantities are provided in Section II.
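Equation (1) gives a direct conversion between wavelength and frequency. The sketch below is illustrative: the function name is hypothetical, and the default speed is the defined vacuum value of the speed of light, so for another medium the propagation speed in that medium would have to be supplied, together with the wavelength measured in that medium.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # defined speed of light in vacuum

def frequency_hz(wavelength_m, speed_m_per_s=C_VACUUM_M_PER_S):
    """Frequency from Eq. (1), nu = c / lambda."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return speed_m_per_s / wavelength_m

# Green light with a vacuum wavelength of 500 nm.
nu_green = frequency_hz(500e-9)
```

For 500 nm this evaluates to roughly 6.0 × 10^14 Hz, which places the visible range near the hundreds-of-terahertz part of the frequency scale.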


The electromagnetic spectrum is diagramed in Fig. 1. The solar and visible spectral regions are expanded to the right of the scale. Though sound waves are not electromagnetic waves, the range of human-audible sound is shown in Fig. 1 for comparison.

The term “light” can only be applied in principle to electromagnetic radiation over the range of visible wavelengths. Radiation outside this range is invisible to the human eye and therefore cannot be called light. Infrared and ultraviolet radiation cannot be termed “light.”

Names and spectral ranges have been standardized for the ultraviolet, visible, and infrared portions of the spectrum. These are shown in Table I.

B. Symbols and Naming Conventions

When the wavelength symbol λ is used as a subscript on a radiometric quantity, the result denotes the concentration of the quantity at a specific wavelength, as if one were dealing with a monochromatic beam of radiation at this wavelength only. This means that the range Δλ of wavelengths in the beam, around the wavelength λ of definition, is infinitesimally small, and can therefore be defined in terms of the mathematical derivative as follows. Let Q be a radiometric quantity, such as flux, and ΔQ be the amount of this quantity over a wavelength interval Δλ centered at wavelength λ. The spectral version of quantity Q, at wavelength λ, is the derivative of Q with respect to wavelength, defined to be the limit as Δλ goes to zero of the ratio ΔQ/Δλ:

$$Q_\lambda = \frac{dQ}{d\lambda}. \tag{2}$$

This notation refers to the “concentration” of the radiometric quantity Q, at wavelength λ, rather than to its functional dependence on wavelength. The latter would be notated as Qλ(λ). Though seemingly redundant, this notation is correct within the naming convention established for the field of radiometry.

FIGURE 1 Wavelength and frequency ranges over the electromagnetic spectrum.

TABLE I CIE Vocabulary for Spectral Regions

Name | Wavelength range
UV-C | 100 to 280 nm
UV-B | 280 to 315 nm
UV-A | 315 to 400 nm
VIS | Approx. 360–400 to 760–800 nm
IR-A^a | 780 to 1400 nm
IR-B | 1.4 to 3.0 µm
IR-C^b | 3 µm to 1 mm

^a Also called “near IR” or NIR.
^b Also called “far IR” or FIR.
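Because a spectral quantity Qλ is a density with respect to wavelength, the quantity contained in a finite band is recovered by integrating Qλ over that band. The following minimal sketch does this for sampled spectral radiant flux using the trapezoidal rule; the function name, the sample values, and the choice of integration rule are all assumptions made for illustration.

```python
def band_flux_w(wavelengths_nm, spectral_flux_w_per_nm):
    """Integrate sampled spectral flux (W/nm) over wavelength (nm) with the trapezoidal rule."""
    if len(wavelengths_nm) != len(spectral_flux_w_per_nm):
        raise ValueError("sample lists must have equal length")
    total = 0.0
    for i in range(1, len(wavelengths_nm)):
        width = wavelengths_nm[i] - wavelengths_nm[i - 1]
        mean_density = 0.5 * (spectral_flux_w_per_nm[i] + spectral_flux_w_per_nm[i - 1])
        total += mean_density * width
    return total

# Hypothetical flat spectrum of 2 W/nm sampled between 400 and 700 nm.
flat = band_flux_w([400.0, 500.0, 700.0], [2.0, 2.0, 2.0])
```

For the flat spectrum the result is simply the density times the bandwidth, 2 W/nm × 300 nm = 600 W, which illustrates why the units of a spectral quantity always carry a "per unit wavelength" factor.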

When dealing with the optical properties of materials rather than with concentrations of flux at a given wavelength, the subscripting convention is not used. Instead, the functional dependence on wavelength is notated directly, as with the spectral transmittance T(λ). Spectral optical properties such as this one are spectral weighting functions, not flux distributions, and their functional dependence on wavelength is shown in the conventional manner.

C. Geometric Concepts

In radiometry and photometry one is concerned with several geometrical constructs helpful in defining the spatial characteristics of radiation. The most useful are areas, plane angles, and solid angles.

The areas of interest are planar ones (including small differential elements of area used in definitions and derivations), nonplanar ones (areas on curved surfaces), and what are called projected areas. The latter are areas resulting when an original area is projected at some angle θ, as viewed from an infinite distance away. Projected areas are unidirectional projections of the area bounded by a closed curve in a plane onto another plane, one making angle θ to the first, as illustrated in Fig. 2.

A plane angle is defined by two straight lines intersecting at a point. The space between these lines in the plane defined by them is the plane angle. It is measured in radians (2π radians in a circle) or degrees (360 degrees to a circle). In preparation for defining solid angle it is pointed out that the plane angle can also be defined in terms of the radial projection of a line segment in a plane onto a point, as illustrated in Fig. 3.

A plane angle is the quotient of the arc length s and the radius r of a radial projection of segment C of a curve in a plane onto a circle of radius r lying in that plane and centered at the vertex point P about which the angle is being defined.


FIGURE 2 Illustration of the definition of projected areas.

If θ is the angle and s is the arc length of the projection onto a circle of radius r, then the defining equation is

$$\theta = \frac{s}{r}. \tag{3}$$

According to Eq. (3), the plane angle is a dimensionless quantity. However, to aid in communication, it has been given the unit radian, abbreviated rad. The radian measure of a plane angle can be converted to degree measure by multiplying by the conversion constant 180/π.
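As a small illustration of Eq. (3) and the radian-to-degree conversion, the following Python sketch computes a plane angle from an arc length and a radius. The function names are hypothetical and chosen only for this example.

```python
import math

def plane_angle_rad(arc_length, radius):
    """Plane angle in radians from Eq. (3): theta = s / r."""
    return arc_length / radius

def rad_to_deg(theta_rad):
    """Convert radians to degrees by multiplying by 180/pi."""
    return theta_rad * 180.0 / math.pi

# A quarter of a unit circle has arc length pi/2.
theta = plane_angle_rad(math.pi / 2, 1.0)
print(theta, rad_to_deg(theta))
```

Because the angle is a ratio of two lengths, any length unit may be used as long as s and r share it.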

A similar approach can be used to define solid angle. A solid angle is defined by a closed curve in space and a point, as illustrated in Fig. 4.

A solid angle is the quotient of the area A and the square of the radius r of a radial projection of a closed curve C in space onto a sphere of radius r centered at the vertex point P relative to which the angle is being defined.

If Ω is the solid angle being defined, A is the area on the sphere enclosed by the projection of the curve onto that sphere, and r is the sphere’s radius, then the defining equation is

$$\Omega = \frac{A}{r^2}. \tag{4}$$

According to Eq. (4), the solid angle is dimensionless. However, to aid in communication, it has been given the unit steradian, abbreviated sr. Since the area of a sphere is 4π times the square of its radius, for a unit radius sphere the area is 4π and the solid angle subtended by it is 4π sr. The solid angle subtended by a hemisphere is 2π sr. It is important to note that the area A in Eq. (4) is the area on the sphere of the projection of the curve C. It is not the area of a plane cut through the sphere and containing the projection of curve C. Indeed, the projections of some curves in space onto a sphere do not lie in a plane.

FIGURE 3 Definition of the plane angle.

FIGURE 4 Definition of the solid angle.

One projection that does lie in a plane is of particular interest: the projection of a circle in a plane perpendicular to a radius of the sphere, as illustrated in Fig. 5, which also shows a hemispherical solid angle. Let α be the plane angle subtended by the radius of the circle at the center of the sphere, called the “half-angle” of the cone. It can be shown that the solid angle subtended by the circle is given by

$$\Omega = 2\pi (1 - \cos\alpha). \tag{5}$$

If α = 0 then Ω = 0, and if α = 90° then Ω = 2π sr, as required. A derivation of Eq. (5) is provided on pp. 28–30 of McCluney (1994).

FIGURE 5 (a) Geometry for determining the solid angle of a right circular cone. α is the “half-angle” of the cone. (b) Geometry of a hemispherical solid angle.
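The limiting cases of Eq. (5) can be checked numerically. The Python sketch below evaluates the closed form and compares it with a direct midpoint-rule integration of the standard spherical-coordinate element of solid angle, sin θ dθ dφ, over the cap. The function names and the step count are assumptions made for this illustration, not part of the original treatment.

```python
import math

def cone_solid_angle(half_angle_rad):
    """Closed form of Eq. (5): Omega = 2*pi*(1 - cos(alpha)), in steradians."""
    return 2.0 * math.pi * (1.0 - math.cos(half_angle_rad))

def cone_solid_angle_numeric(half_angle_rad, steps=10_000):
    """Midpoint-rule integral of sin(theta) d(theta) d(phi) over the cone."""
    d_theta = half_angle_rad / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += math.sin(theta) * d_theta
    # The integrand does not depend on phi, so the phi integral gives 2*pi.
    return 2.0 * math.pi * total

for alpha_deg in (0.0, 30.0, 90.0, 180.0):
    alpha = math.radians(alpha_deg)
    print(alpha_deg, cone_solid_angle(alpha), cone_solid_angle_numeric(alpha))
```

A half-angle of 90° reproduces the hemisphere (2π sr) and 180° the full sphere (4π sr).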


D. The Metric System

To clarify the symbols, units, and nomenclature of radiometry and photometry, the international system of units and related standards known as the metric system was embraced. There have been several versions of the metric system over the last couple of centuries. The current modernized one is named Le Système International d’Unités (SI). It was established in 1960 by international agreement. The Bureau International des Poids et Mesures (BIPM) regularly publishes a document containing revisions and new recommendations on terminology and units. The International Organization for Standardization (ISO) publishes standards on the practical uses of the SI system in a variety of fields. Many national standards organizations around the world publish their own standards governing the use of this system, or translations of the BIPM documents into the languages of their countries. In the United States the units metre and litre are spelled meter and liter, respectively.

The SI system calls for adherence to standard prefixes for standard orders of magnitude, listed in Table II. There are some simple rules governing the use of these prefixes. The prefix symbols are to be printed in roman type without spacing between the prefix symbol and the unit symbol. The grouped unit symbol plus its prefix is inseparable but may be raised to a positive or negative power and combined with other unit symbols. Examples: cm², nm, µm, klx, 1 cm² = 10⁻⁴ m². No more than one prefix can be used at a time. A prefix should never be used alone, except in descriptions of systems of units.

There are now two classes of units in the SI system:

Base units and symbols: meter (m), kilogram (kg), second (s), ampere (A), kelvin (K), mole (mol), and candela (cd). Note that the abbreviations of units named for a person are capitalized, but the full unit name is not. (For example, the watt was named for James Watt and is abbreviated “W.”)

Derived units: joule (= kg · m² · s⁻² = N · m), watt (= J · s⁻¹), lumen (= cd · sr), and lux (= lm · m⁻²). These are formed by combining base units according to algebraic relations linking the corresponding physical quantities. The laws of chemistry and physics are used to determine the algebraic combinations resulting in the derived units. Also included are the units of angle (radian, rad) and solid angle (steradian, sr).

TABLE II SI Prefixes

Factor   Prefix       Symbol     Factor   Prefix   Symbol
10²⁴     yotta        Y          10⁻¹     deci     d
10²¹     zetta        Z          10⁻²     centi    c
10¹⁸     exa          E          10⁻³     milli    m
10¹⁵     peta         P          10⁻⁶     micro    µ
10¹²     tera         T          10⁻⁹     nano     n
10⁹      giga         G          10⁻¹²    pico     p
10⁶      mega         M          10⁻¹⁵    femto    f
10³      kilo         k          10⁻¹⁸    atto     a
10²      hecto        h          10⁻²¹    zepto    z
10¹      deka, deca   da         10⁻²⁴    yocto    y
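The rule that a prefixed unit symbol is raised to a power as a whole (so that cm² means (cm)², not c·m²) can be illustrated with a short Python sketch. The dictionary holds only a subset of the standard SI prefixes, and the names used here are hypothetical.

```python
import math

# Subset of the SI prefixes, keyed by symbol.
SI_PREFIX_FACTORS = {"k": 1e3, "c": 1e-2, "m": 1e-3, "µ": 1e-6, "n": 1e-9}

def prefixed_unit_factor(prefix, power=1):
    """Factor converting (prefix + unit)**power to unit**power.

    The prefix and unit are grouped first and then raised to the power,
    so 1 cm^2 = (1e-2 m)^2 = 1e-4 m^2.
    """
    return SI_PREFIX_FACTORS[prefix] ** power

print(prefixed_unit_factor("c", 2))   # m^2 per cm^2
print(prefixed_unit_factor("n"))      # m per nm
```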

A previously separate third class called supplementary units, combinations of the above units and units for plane and solid angle, was eliminated by the General Conference on Weights and Measures (CGPM, Conférence Générale des Poids et Mesures) during its 9–12 October 1995 meeting. The radian and steradian were moved into the SI class of derived units.

Some derived units are given their own names, to avoid having to express every unit in terms of its base units. The symbol “·” is used to denote multiplication and “/” denotes division. Both are used to separate units in combinations. It is permissible to replace “·” with a space, but some standards require it to be included. In 1969 the following additional non-SI units were accepted by the International Committee for Weights and Measures for use with SI units: day, hour, and minute of time; degree, minute, and second of angle; the litre (10⁻³ m³); and the tonne (10³ kg). In the United States the latter two are spelled “liter” and “metric ton,” respectively.

The World Wide Web contains many sites describing and explaining the SI system. A search on “The Metric System” with any search engine should yield several. The United States government site at http://physics.nist.gov/cuu/Units/ is comprehensive and provides links to other web pages of importance.

E. The I-P System

The most prominent alternative to the metric system is the inch-pound (I-P) or so-called “English” system of units. In this system the foot and pound are units for length and mass. The British thermal unit (Btu) is the unit of energy. This system is used little for radiometry and photometry around the world today, with the possible exception of the United States, where many illumination engineers still work with a mixed metric/I-P unit, the foot-candle (lumen · ft⁻²), as their unit of illuminance. There are about 10.76 square feet in a square meter, so one foot-candle equals about 10.76 lux. The I-P system is being deprecated. However, in order to read older texts in radiometry and photometry using the I-P system, some familiarity with its units is advised. Tables 10.3 and 10.4 of McCluney (1994) provide conversion factors for many non-SI units.
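The factor of about 10.76 follows directly from the length of the foot. A minimal Python sketch, assuming the international foot of exactly 0.3048 m and using hypothetical function names:

```python
FOOT_IN_METERS = 0.3048                      # international foot, exact by definition
SQFT_PER_SQM = 1.0 / FOOT_IN_METERS ** 2     # about 10.76 ft^2 in 1 m^2

def footcandles_to_lux(fc):
    """One lumen spread over 1 ft^2 is a higher illuminance than over 1 m^2."""
    return fc * SQFT_PER_SQM

def lux_to_footcandles(lx):
    return lx / SQFT_PER_SQM

print(SQFT_PER_SQM)
print(footcandles_to_lux(50.0))   # a hypothetical 50 fc reading expressed in lux
```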


II. RADIOMETRY

A. Definitions of Fundamental Quantities

There are five fundamental quantities of radiometry: radiant energy, radiant flux, radiant intensity, irradiance, and radiance. Each has a photometric counterpart, described in the next section.

Radiant energy, Q, is the quantity of energy propagating into, through, or emerging from a specified surface area in a specified period of time (unit: joule). Radiant energy is of interest in applications involving pulses of radiation, or exposure of a receiving surface to temporally continuous radiant energy over a specific period of time. An equivalent unit is the watt · sec.

Radiant flux (power), Φ, is the time rate of flow of radiant energy (unit: watt). One watt is 1 J · sec⁻¹. The defining equation is the derivative of the radiant energy Q with respect to time t:

$$\Phi = \frac{dQ}{dt}. \tag{6}$$

Radiant flux is the quantity of energy passing through a surface or region of space per unit time. When specifying a radiant flux value, the spatial extent of the radiation field included in the specification should be described.

Irradiance, E, is the area density of radiant flux, the radiant flux per unit area at a specified point in a specified surface that is incident on, passing through, or emerging from that point in the surface (unit: watt · m⁻²). All directions in the hemispherical solid angle producing the radiation at that point are to be included. The defining equation is

$$E = \frac{d\Phi}{ds_o}, \tag{7}$$

where dΦ is an infinitesimal element of radiant flux and ds_o is an element of area in the surface. (The subscript “o” is used to indicate that this area is in an actual surface and is not a projected area.) The flux incident on a point in a surface can come from any direction in the hemispherical solid angle of incidence, or all of them, with any directional distribution. The flux can also be that leaving the surface in any direction in the hemispherical solid angle of emergence from the surface.

The irradiance leaving a surface can be called the exitance and can be given the symbol M, to distinguish it from the irradiance incident on the surface, but it has the same units and defining equation as irradiance. (The term emittance, related to the emissivity, is reserved for use in describing a dimensionless optical property of a material’s surface and cannot be used for emitted irradiance.)

Since there is no mathematical or physical distinction between flux incident upon, passing through, or leaving a surface, the term irradiance is used throughout this article to describe the flux per unit area in all three cases.

Irradiance is a function of position in the surface specified for its definition.

When speaking of irradiance, one should be careful both to describe the surface and to indicate at which point on the surface the irradiance is being evaluated, unless this is very clear in the context of the discussion, or the irradiance is known or assumed to be constant over the whole surface.

Radiant intensity, I, is the solid angle density of radiant flux, the radiant flux per unit solid angle incident on, passing through, or emerging from a point in space and propagating in a specified direction (units: watt · sr⁻¹). The defining equation is

$$I = \frac{d\Phi}{d\omega}, \tag{8}$$

where dΦ is an element of flux incident on or emerging from a point within element dω of solid angle in the specified direction. The representation of dω in spherical coordinates is illustrated in Fig. 6.

Radiant intensity is a function of direction from its point of specification, and may be written as I(θ, φ) to indicate its dependence upon the spherical coordinates (θ, φ) specifying a direction in space. Its definition is illustrated in Fig. 7.

Intensity is a useful concept for describing the directional distribution of radiation from a point source (or a source very small compared with the distance from it to the observer or detector of that radiation). The concept can be applied to extended sources having the same intensity at all points, in which case it refers to that subset of the radiation emanating from the entire source of finite and known area which flows into the same infinitesimal solid angle direction for each point in that area. (The next quantity to be described, radiance, is generally a more appropriate quantity for describing the directional distribution of radiation from nonpoint sources.)

FIGURE 6 Representation of the element of solid angle dω in spherical coordinates.

FIGURE 7 Geometry for the definition of intensity.

When speaking of intensity, one should be careful to describe the point of definition and the direction of radiation from that point for clarity of discourse, unless this is obvious in the context of the discussion, or it is known that the intensity is constant for all directions.

The word “intensity” is frequently used in optical physics. Most often the radiometric quantity being described is not intensity but irradiance.

Radiance, L, is the area and solid angle density of radiant flux, the radiant flux per unit projected area and per unit solid angle incident on, passing through, or emerging from a specified point in a specified surface, and in a specified direction (units: watt · m⁻² · sr⁻¹). The defining equation is

$$L = \frac{d^2\Phi}{d\omega\, ds} = \frac{d^2\Phi}{d\omega\, ds_o \cos\theta}, \tag{9}$$

where ds = ds_o cos θ is the projected area, the area of the projection of elemental area ds_o along the direction of propagation to a plane perpendicular to this direction, dω is an element of solid angle in the specified direction, and θ is the angle this direction makes with the normal (perpendicular) to the surface at the point of definition, as illustrated in Fig. 8.
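Rearranging Eq. (9) gives the element of flux carried by a small beam, d²Φ = L dω ds_o cos θ. The Python sketch below evaluates this for small but finite elements, which is only an approximation to the differential relation. All numerical values and function names are hypothetical.

```python
import math

def projected_area(ds_o, theta_rad):
    """Projected area ds = ds_o * cos(theta)."""
    return ds_o * math.cos(theta_rad)

def flux_element(radiance, ds_o, d_omega, theta_rad):
    """Flux in watts through area ds_o into solid angle d_omega, from Eq. (9).

    radiance : W m^-2 sr^-1
    ds_o     : m^2, element of area in the actual surface
    d_omega  : sr, element of solid angle about the chosen direction
    theta_rad: angle between that direction and the surface normal
    """
    return radiance * d_omega * projected_area(ds_o, theta_rad)

# Hypothetical case: L = 100 W m^-2 sr^-1, 1 cm^2 of surface,
# 0.001 sr of solid angle, viewed 60 degrees from the normal.
print(flux_element(100.0, 1e-4, 1e-3, math.radians(60.0)))
```

At 60° from the normal the projected area, and therefore the flux for a given radiance, is half its value at normal incidence.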

Radiance is a function of both position and direction. For many real sources, it is a strongly varying function of direction. It is the most general quantity for describing the propagation of radiation through space and transparent or semitransparent materials. The radiant flux, radiant intensity, and irradiance can be derived from the radiance by the mathematical process of integration over a finite surface area and/or over a finite solid angle, as demonstrated in Section IV.B.

FIGURE 8 Geometry for the definition of radiance.

Since radiance is a function of position in a defined surface as well as direction from it, it is important when speaking of radiance to specify the surface, the point in it, and the direction from it. All three pieces of information are important for the proper specification of radiance. For example, we may wish to speak of the radiance emanating from a point on the ground and traveling upward toward the lens of a camera in an airplane or satellite traveling overhead. We specify the location of the point, the surface from which the flux emanates, and the direction of its travel toward the center of the lens. Since the words “radiance” and “irradiance” can sound very similar in rapidly spoken or slurred English, one can avoid confusion by speaking of the point and the surface that are common to both concepts, and then clearly specifying the direction when talking about radiance.

B. Definitions of Spectral Quantities

The spectral or wavelength composition of the five fundamental quantities of radiometry is often of interest. We speak of the “spectral distribution” of the quantities, and by this is meant the possibly varying magnitudes of them at different wavelengths or frequencies over whatever spectral range is of interest. As before, if we let Q represent any one of the five radiometric quantities, we define the spectral “concentration” of that quantity, denoted Qλ, to be the derivative of the quantity with respect to wavelength λ. (The derivative with respect to frequency ν or wavenumber (1/λ) is also possible but less used.)

$$Q_\lambda = \frac{dQ}{d\lambda}. \tag{10}$$

This defines the radiometric “quantity” per unit wavelength interval and can also be called the spectral power density. It has the same units as those of the quantity Q divided by wavelength. The spectral radiometric quantity Qλ is in one respect the more fundamental of the two, since it contains more information, the spectral distribution of Q, rather than just its total magnitude. The two are related by the integral

$$Q = \int_0^\infty Q_\lambda \, d\lambda. \tag{11}$$

If Qλ is zero outside some wavelength range (λ1, λ2), then the integral of Eq. (11) can be replaced by

$$Q = \int_{\lambda_1}^{\lambda_2} Q_\lambda \, d\lambda. \tag{12}$$

The symbols and units of the spectral radiant quantities are listed in Table III.

TABLE III Symbols and Units of the Five Spectral Radiometric Quantities

Quantity                   Symbol   Units
Spectral radiant energy    Qλ       J · nm⁻¹
Spectral radiant flux      Φλ       W · nm⁻¹
Spectral irradiance        Eλ       W · m⁻² · nm⁻¹
Spectral intensity         Iλ       W · sr⁻¹ · nm⁻¹
Spectral radiance          Lλ       W · m⁻² · sr⁻¹ · nm⁻¹
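In practice Qλ is usually known only at discrete wavelengths, and Eq. (12) is approximated by a numerical quadrature. The Python sketch below uses the trapezoidal rule, which is one common choice and not something prescribed by the text. The flat spectrum is hypothetical test data.

```python
def integrate_spectrum(wavelengths_nm, q_lambda):
    """Trapezoidal estimate of Eq. (12): Q = integral of Q_lambda d(lambda).

    wavelengths_nm : increasing sample wavelengths in nm
    q_lambda       : spectral quantity per nm at those wavelengths
    """
    total = 0.0
    for i in range(len(wavelengths_nm) - 1):
        width = wavelengths_nm[i + 1] - wavelengths_nm[i]
        total += 0.5 * (q_lambda[i] + q_lambda[i + 1]) * width
    return total

# Hypothetical flat spectral irradiance of 2 W m^-2 nm^-1 from 400 to 700 nm.
wavelengths = [400 + 10 * i for i in range(31)]
e_lambda = [2.0] * len(wavelengths)
print(integrate_spectrum(wavelengths, e_lambda))  # 2 * 300 = 600 W m^-2
```

The result carries the units of the spectral quantity multiplied by nm, so a spectral irradiance in W · m⁻² · nm⁻¹ integrates to an irradiance in W · m⁻².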

III. PHOTOMETRY

A. Introduction

Photometry is a system of language, mathematical formulations, and instrumental methodologies used to describe and measure the propagation of light through space and materials. In consequence, the radiation so studied is confined to the visible (VIS) portion of the spectrum. Only visible radiation is called light.

In photometry, all the radiant quantities defined in Section II are adapted or specialized to indicate the human eye’s response to them. This response is built into the definitions. Familiarity with the five basic radiometric quantities introduced in that section makes the study of the corresponding quantities in photometry, a subset of radiometry, much easier.

The human eye responds only to light having wavelengths between about 360 and 800 nm. Radiometry deals with electromagnetic radiation at all wavelengths and frequencies, while photometry deals only with visible light, that portion of the electromagnetic spectrum which stimulates vision in the human eye.

Radiation having wavelengths below 360 nm, down to about 100 nm, is called ultraviolet, or UV, meaning “beyond the violet.” Radiation having wavelengths greater than 830 nm, up to about 1 mm, is called infrared, or IR, meaning “below the red.” “Below” in this case refers to the frequency of the radiation, not to its wavelength. (Solving Eq. (1) for frequency yields the equation ν = c/λ, showing the inverse relationship between frequency and wavelength.) The infrared portion of the spectrum lies beyond the red, having frequencies below and wavelengths above those of red light. Since the eye is very insensitive to light at wavelengths between 360 and about 410 nm and between about 720 and 830 nm, at the edges of the visible spectrum, many people cannot see radiation in portions of these ranges. Thus, the visible edges of the UV and IR spectra are as uncertain as the edges of the VIS spectrum.

The term “light” should only be applied to electromagnetic radiation in the visible portion of the spectrum, lying between 380 and 770 nm. With this terminology, there is no such thing as “ultraviolet light,” nor does the term “infrared light” make any sense. Radiation outside these wavelength limits is radiation, not light, and should not be referred to as light.

B. The Sensation of Vision

After passing through the cornea, the aqueous humor, the iris and lens, and the vitreous humor, light entering the eye is received by the retina, which contains two general classes of receptors: rods and cones. Photopigments in the outer segments of the rods and cones absorb radiation, and the absorbed energy is converted within the receptors into neural electrochemical signals which are then transmitted to subsequent neurons, the optic nerve, and the brain.

The cones are primarily responsible for day vision and the seeing of color. Cone vision is called photopic vision. The rods come into play mostly for night vision, when illumination levels entering the eye are very low. Rod vision is called scotopic vision. An individual’s relative sensitivity to various wavelengths is strongly influenced by the absorption spectra of the photoreceptors, combined with the spectral transmittance of the preretinal optics of the eye. The relative spectral sensitivity depends on light level, and this sensitivity shifts toward the blue (shorter wavelength) portion of the spectrum as the light level drops, due to the shift in spectral sensitivity when going from cones to rods.

The spectral response of a human observer under photopic (cone vision) conditions was standardized by the International Commission on Illumination, the Commission Internationale de l’Éclairage (CIE), in 1924. Although the actual spectral response of humans varies somewhat from person to person, an agreed standard response curve has been adopted, as shown graphically in Fig. 9 and listed numerically in Table IV.

The values in Table IV are taken from the Lighting Handbook of the Illuminating Engineering Society of North America (IESNA). Since the symbol V(λ) is normally used to represent this spectral response, the curve in Fig. 9 is often called the “V-lambda curve.”

The 1924 CIE spectral luminous efficiency function for photopic vision defines what is called “the CIE 1924 Standard Photopic Photometric Observer.” The official values were originally given for the wavelength range from 380 to 780 nm at 10-nm intervals but were then “completed by interpolation, extrapolation, and smoothing from earlier values adopted by the CIE in 1924 and 1931” to the wavelength range from 360 to 830 nm on 1-nm intervals, and these were then recommended by the International Committee of Weights and Measures (CIPM) in 1976. The values below 380 and above 769 nm are so small as to be of little value for most photometric calculations and are therefore not included in Table IV.

Any individual’s eye may depart somewhat from the response shown in Fig. 9, and when light levels are moderately low, the other set of retinal receptors (rods) comes into use. This regime is called “scotopic vision” and is characterized by a different relative spectral response.

The relative spectral response curve for scotopic vision is similar in shape to the one shown in Fig. 9, but the peak is shifted from 555 to about 510 nm. The lower wavelength cutoff in sensitivity remains at about 380 nm, however, while the upper limit drops to about 640 nm. More information about scotopic vision can be found in various books on vision as well as in the IESNA Lighting Handbook. The latter contains both plotted and tabulated values for the scotopic spectral luminous efficiency function.

FIGURE 9 Human photopic spectral luminous efficiency.

C. Definitions of Fundamental Quantities

Five fundamental quantities in radiometry were defined in Section II.A. The photometric ones corresponding to the last four are easily defined in terms of their radiometric counterparts as follows. Let Qλ(λ) be one of the following: spectral radiant flux Φλ, spectral irradiance Eλ, spectral intensity Iλ, or spectral radiance Lλ. The corresponding photometric quantity, Qv, is defined as follows:

$$Q_v = 683 \int_{380}^{770} Q_\lambda(\lambda)\, V(\lambda)\, d\lambda, \tag{13}$$

with wavelength λ having the units of nanometers.

The subscript v (standing for “visible” or “visual”) is placed on photometric quantities to distinguish them from radiometric quantities, which are given the subscript e (standing for “energy”). These subscripts may be dropped, as they were in previous sections, when the meaning is clear and no ambiguity results. Four fundamental radiometric quantities, and the corresponding photometric ones, are listed in Table V, along with the units for each.

To illustrate the use of Eq. (13), the conversion from spectral irradiance to illuminance is given by

$$E_v = 683 \int_{380}^{770} E_\lambda(\lambda)\, V(\lambda)\, d\lambda. \tag{14}$$
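A numerical version of Eq. (14) can be sketched in Python as follows. The V(λ) samples are the 10-nm standard values from the second column of Table IV, covering 380 to 760 nm. The integral is truncated at 760 nm on the assumption that the remaining contribution up to 770 nm is negligible, and the trapezoidal rule is an assumed quadrature choice. The flat test spectrum is hypothetical.

```python
K_M = 683.0  # lm/W, the constant appearing in Eqs. (13) and (14)

# Photopic V(lambda) at 380, 390, ..., 760 nm (standard values, Table IV).
V_LAMBDA_10NM = [
    0.00004, 0.00012, 0.0004, 0.0012, 0.0040, 0.0116, 0.023, 0.038,
    0.060, 0.091, 0.139, 0.208, 0.323, 0.503, 0.710, 0.862, 0.954,
    0.995, 0.995, 0.952, 0.870, 0.757, 0.631, 0.503, 0.381, 0.265,
    0.175, 0.107, 0.061, 0.032, 0.017, 0.0082, 0.0041, 0.0021,
    0.00105, 0.00052, 0.00025, 0.00012, 0.00006,
]
STEP_NM = 10.0

def illuminance_lux(e_lambda):
    """Trapezoidal estimate of Eq. (14).

    e_lambda: spectral irradiance in W m^-2 nm^-1 sampled at the same
    39 wavelengths as V_LAMBDA_10NM (380 to 760 nm in 10-nm steps).
    """
    if len(e_lambda) != len(V_LAMBDA_10NM):
        raise ValueError("expected one sample per 10-nm wavelength")
    weighted = [e * v for e, v in zip(e_lambda, V_LAMBDA_10NM)]
    total = 0.0
    for i in range(len(weighted) - 1):
        total += 0.5 * (weighted[i] + weighted[i + 1]) * STEP_NM
    return K_M * total

# Hypothetical flat spectral irradiance of 1 W m^-2 nm^-1.
flat = [1.0] * len(V_LAMBDA_10NM)
print(illuminance_lux(flat))
```

For a measured spectrum on a finer grid, the 1-nm interpolated values of Table IV would give a better estimate than the 10-nm samples used here.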

The basic unit of luminous flux, the lumen, is like a “light-watt.” It is the luminous equivalent of the radiant flux or power. Similarly, luminous intensity is the photometric equivalent of radiant intensity. It gives the luminous flux in lumens emanating from a point, per unit solid angle in a specified direction, and therefore has the units of lumens per steradian or lm/sr, given the name candela. This unit is one of the seven base units of the metric system. More information about the metric system as it relates to radiometry and photometry can be found in Chapter 10 of McCluney (1994).

Luminous intensity is a function of direction from its point of specification, and may be written as Iv(θ, φ) to indicate its dependence upon the spherical coordinates (θ, φ) specifying a direction in space, illustrated in Fig. 6.

Illuminance is the photometric equivalent of irradiance and is like a “light-watt per unit area.” Illuminance is a function of position (x, y) in the surface on which it is defined and may therefore be written as Ev(x, y). Most light meters measure illuminance and are calibrated to read in lux. The lux is an equivalent term for the lumen per square meter and is abbreviated lx.

In the inch-pound (I-P) system of units, the unit for illuminance is the lumen per square foot, or lumen · ft⁻², which also has the odd name “foot-candle,” abbreviated “fc,” even though connection with candles and the candela is mainly historical and indirect. The I-P system is being discontinued in photometry, to be replaced by the metric system, used exclusively in this treatment. For more information on the connections between modern metric photometry and the antiquated and deprecated units, the reader is directed to Chapter 10 of McCluney (1994). As with radiant exitance, illuminance leaving a surface can be called luminous exitance.

TABLE IV Photopic Spectral Luminous Efficiency V(λ)

The first column gives the wavelength λ in nm and the second the standard value at that wavelength. Columns 1 to 9 give the values interpolated at intervals of 1 nm, for λ + 1 nm to λ + 9 nm.

λ, nm Standard 1 2 3 4 5 6 7 8 9
380 .00004 .000045 .000049 .000054 .000058 .000064 .000071 .000080 .000090 .000104
390 .00012 .000138 .000155 .000173 .000193 .000215 .000241 .000272 .000308 .000350
400 .0004 .00045 .00049 .00054 .00059 .00064 .00071 .00080 .00090 .00104
410 .0012 .00138 .00156 .00174 .00195 .00218 .00244 .00274 .00310 .00352
420 .0040 .00455 .00515 .00581 .00651 .00726 .00806 .00889 .00976 .01066
430 .0116 .01257 .01358 .01463 .01571 .01684 .01800 .01920 .02043 .02170
440 .023 .0243 .0257 .0270 .0284 .0298 .0313 .0329 .0345 .0362
450 .038 .0399 .0418 .0438 .0459 .0480 .0502 .0525 .0549 .0574
460 .060 .0627 .0654 .0681 .0709 .0739 .0769 .0802 .0836 .0872
470 .091 .0950 .0992 .1035 .1080 .1126 .1175 .1225 .1278 .1333
480 .139 .1448 .1507 .1567 .1629 .1693 .1761 .1833 .1909 .1991
490 .208 .2173 .2270 .2371 .2476 .2586 .2701 .2823 .2951 .3087
500 .323 .3382 .3544 .3714 .3890 .4073 .4259 .4450 .4642 .4836
510 .503 .5229 .5436 .5648 .5865 .6082 .6299 .6511 .6717 .6914
520 .710 .7277 .7449 .7615 .7776 .7932 .8082 .8225 .8363 .8495
530 .862 .8739 .8851 .8956 .9056 .9149 .9238 .9320 .9398 .9471
540 .954 .9604 .9661 .9713 .9760 .9803 .9840 .9873 .9902 .9928
550 .995 .9969 .9983 .9994 1.0000 1.0002 1.0001 .9995 .9984 .9969
560 .995 .9926 .9898 .9865 .9828 .9786 .9741 .9691 .9638 .9581
570 .952 .9455 .9386 .9312 .9235 .9154 .9069 .8981 .8890 .8796
580 .870 .8600 .8496 .8388 .8277 .8163 .8046 .7928 .7809 .7690
590 .757 .7449 .7327 .7202 .7076 .6949 .6822 .6694 .6565 .6437
600 .631 .6182 .6054 .5926 .5797 .5668 .5539 .5410 .5282 .5156
610 .503 .4905 .4781 .4658 .4535 .4412 .4291 .4170 .4049 .3929
620 .381 .3690 .3575 .3449 .3329 .3210 .3092 .2977 .2864 .2755
630 .265 .2548 .2450 .2354 .2261 .2170 .2082 .1996 .1912 .1830
640 .175 .1672 .1596 .1523 .1452 .1382 .1316 .1251 .1188 .1128
650 .107 .1014 .0961 .0910 .0862 .0816 .0771 .0729 .0688 .0648
660 .061 .0574 .0539 .0506 .0475 .0446 .0418 .0391 .0366 .0343
670 .032 .0299 .0280 .0263 .0247 .0232 .0219 .0206 .0194 .0182
680 .017 .01585 .01477 .01376 .01281 .01192 .01108 .01030 .00956 .00886
690 .0082 .00759 .00705 .00656 .00612 .00572 .00536 .00503 .00471 .00440
700 .0041 .00381 .00355 .00332 .00310 .00291 .00273 .00256 .00241 .00225
710 .0021 .001954 .001821 .001699 .001587 .001483 .001387 .001297 .001212 .001130
720 .00105 .000975 .000907 .000845 .000788 .000736 .000688 .000644 .000601 .000560
730 .00052 .000482 .000447 .000415 .000387 .000360 .000335 .000313 .000291 .000270
740 .00025 .000231 .000214 .000198 .000185 .000172 .000160 .000149 .000139 .000130
750 .00012 .000111 .000103 .000096 .000090 .000084 .000078 .000074 .000069 .000064
760 .00006 .000056 .000052 .000048 .000045 .000042 .000039 .000037 .000035 .000032

Luminance can be thought of as “photometric brightness,” meaning that it comes relatively close to describing physically the subjective perception of “brightness.” Luminance is the quantity of light flux passing through a point in a specified surface in a specified direction, per unit projected area at the point in the surface and per unit solid angle in the given direction. The units for luminance are therefore lm · m⁻² · sr⁻¹. A more common unit for luminance is the cd · m⁻², which is the same as the lumen per steradian and per square meter.

TABLE V Basic Quantities of Radiometry and Photometry

Radiometric quantity   Symbol   Units              Photometric quantity   Symbol   Units
Radiant flux           Φe       watt (W)           Luminous flux          Φv       lumen (lm)
Radiant intensity      Ie       W/sr               Luminous intensity     Iv       lumen/sr = candela (cd)
Irradiance             Ee       W/m²               Illuminance            Ev       lumen/m² = lux (lx)
Radiance               Le       W · m⁻² · sr⁻¹     Luminance              Lv       lm · m⁻² · sr⁻¹ = cd/m²

D. Luminous Efficacy of Radiation

Radiation luminous efficacy, Kr, is the ratio of luminous flux (light) in lumens to radiant flux (total radiation) in watts in a beam of radiation. It is an important concept for converting between radiometric and photometric quantities. Its units are the lumen per watt, lm/W.

Luminous efficacy is not an efficiency, since it is not a dimensionless ratio of energy input to energy output. It is a measure of the effectiveness of a beam of radiation in stimulating the perception of light in the human eye. If Qv is any of the four photometric quantities (Φv, Ev, Iv, or Lv) defined previously and Qe is the corresponding radiometric quantity, then the luminous efficacy associated with these quantities has the following defining equation:

$$K_r = \frac{Q_v}{Q_e} \quad [\mathrm{lm \cdot W^{-1}}]. \tag{15}$$

Qe is an integral over all wavelengths for which Qλ is nonzero, while Qv depends on an integral, Eq. (13), over only the visible portion of the spectrum, where V(λ) is nonzero. The luminous efficacy of a beam of infrared-only radiation is zero, since none of the flux in the beam is in the visible portion of the spectrum. The same can be said of ultraviolet-only radiation.
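Equation (15) can be illustrated with a beam made of a few monochromatic components, for which the integral of Eq. (13) reduces to a sum of 683 V(λ) times the radiant flux of each component. The Python sketch below is a minimal example under that assumption. The beam composition and function names are hypothetical, and V is taken as 1.0 at the 555-nm peak and 0 in the infrared.

```python
K_M = 683.0  # lm/W, constant from Eq. (13)

def luminous_flux_lm(components):
    """Luminous flux of a beam of monochromatic components.

    components: list of (radiant_flux_W, V_lambda) pairs.
    """
    return K_M * sum(flux * v for flux, v in components)

def luminous_efficacy(q_v, q_e):
    """Eq. (15): K_r = Q_v / Q_e in lm/W."""
    if q_e <= 0.0:
        raise ValueError("radiometric quantity must be positive")
    return q_v / q_e

# Hypothetical beam: 1 W at 555 nm (V = 1.0) plus 1 W of infrared (V = 0).
beam = [(1.0, 1.0), (1.0, 0.0)]
q_e = sum(flux for flux, _ in beam)
q_v = luminous_flux_lm(beam)
print(luminous_efficacy(q_v, q_e))  # half of 683 lm/W
```

Adding invisible radiation to a beam raises Qe without changing Qv, so it always lowers the luminous efficacy.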

The International Committee for Weights and Measures (CIPM), meeting at the International Bureau of Weights and Measures near Paris, France, in 1977 set the value 683 lm/W for the spectral luminous efficacy (K_r) of monochromatic radiation having a wavelength of 555 nm in standard air. In 1979 the candela was redefined to be the luminous intensity, in a given direction, of a source emitting monochromatic radiation of frequency 540 × 10¹² hertz and that has a radiant intensity in that direction of 1/683 W/sr. The candela is one of the seven fundamental units of the metric system. As a result of the redefinition of the candela, the value 683 shown in Eq. (13) is not a recommended good value for K_r but instead follows from the definition of the candela in SI units. (Prior to 1979, the candela was realized by a platinum approximation to a blackbody. After the 1979 redefinition of the candela, it can be realized from the absolute radiometric scale using any of a variety of absolute detection methods discussed in Section X.)
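The link between the 540 × 10¹² Hz of the candela definition and the 555 nm wavelength quoted for standard air can be checked with a few lines of Python. This is only an illustrative sketch: the refractive index of air used here (`N_AIR`) is an assumed round value, not a figure from this article.

```python
# Speed of light in vacuum (m/s), exact by definition of the metre.
C_VACUUM = 299_792_458.0
# Frequency named in the 1979 definition of the candela (Hz).
F_CANDELA = 540e12
# ASSUMPTION: approximate refractive index of standard air near 555 nm.
N_AIR = 1.00028


def wavelength_nm(frequency_hz, refractive_index=1.0):
    """Wavelength in nanometres of radiation of a given frequency in a medium."""
    return C_VACUUM / (refractive_index * frequency_hz) * 1e9


if __name__ == "__main__":
    print(f"in vacuum: {wavelength_nm(F_CANDELA):.2f} nm")
    print(f"in air:    {wavelength_nm(F_CANDELA, N_AIR):.2f} nm")
```

Under these assumptions the wavelength in air comes out very close to 555 nm, while the vacuum wavelength is slightly longer.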

IV. COMMONLY USED GEOMETRIC RELATIONSHIPS

There are several important spatial integrals which can be developed from the definitions of the principal radiometric and photometric quantities. This discussion of some of them will use radiometric terminology, with the understanding that the same derivations and relationships apply to the corresponding photometric quantities.

A. Lambertian Sources and the Cosine Law

To simplify some derivations, an important property, approximately exhibited by some sources and surfaces, is useful. Any surface, real or imaginary, whose radiance is independent of direction is said to be a Lambertian radiator. The surface can be self-luminous, as in the case of a source, or it can be a reflecting or transmitting one. If the radiance emanating from it is independent of direction, this radiation is considered to be Lambertian. A Lambertian radiator can be thought of as a window onto an isotropic radiant flux field.

Finite Lambertian radiators obey Lambert’s cosine law, which is that the flux in a given direction leaving an element of area in the surface varies as the cosine of the angle θ between that direction and the perpendicular to the surface element: dΦ(θ) = dΦ(0) cos θ. This is because the projected area in the direction θ decreases with the cosine of that angle. In the limit, when θ = 90 degrees, the flux drops to zero because the projected area is zero.

There is another version of the cosine law. It has to do not with the radiance leaving a surface but with how radiation from a uniform and collimated beam (a beam with all rays parallel to each other and equal in strength) incident on a plane surface is distributed over that surface as the angle of incidence changes.

This is illustrated as follows: A horizontal rectangle of length L and width W receives flux from a homogeneous beam of collimated radiation of irradiance E, making an angle θ with the normal (perpendicular) to the plane of the rectangle, as shown in Fig. 2. If Φ is the flux over the projected area A, given by E times A, this same flux Φ_o will be falling on the larger horizontal area A_o = L · W, producing horizontal irradiance E_o = Φ_o/A_o. The flux is the same on the two areas (Φ = Φ_o). Equating them gives

$$E A = E_o A_o. \tag{16}$$

But A = A_o cos θ, so that

$$E_o = E \cos\theta. \tag{17}$$

This is another way of looking at the cosine law. Although it deals with the irradiance falling on a surface, if the surface is perfectly transparent, or even imaginary, it will also describe the irradiances (or exitances) emerging from the other side of the surface.
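As a small worked illustration of Eq. (17), the following Python sketch tabulates the irradiance on a horizontal plane for several angles of incidence. The beam irradiance of 1000 W/m² is a hypothetical value chosen only for the example.

```python
import math


def horizontal_irradiance(beam_irradiance, incidence_deg):
    """Eq. (17): irradiance on a plane whose normal makes the given angle
    (in degrees) with a collimated beam of irradiance beam_irradiance."""
    return beam_irradiance * math.cos(math.radians(incidence_deg))


if __name__ == "__main__":
    e_beam = 1000.0  # W/m^2, hypothetical collimated beam
    for angle in (0, 30, 60, 85):
        print(f"{angle:2d} deg -> {horizontal_irradiance(e_beam, angle):7.1f} W/m^2")
```

At 60 degrees the plane receives half the beam irradiance, and the value falls toward zero as the beam approaches grazing incidence.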

B. Flux Relationships

Radiance and irradiance are quite different quantities. Radiance describes the angular distribution of radiation while irradiance adds up all this angular distribution over a specified solid angle and lumps it together. The fundamental relationship between them is embodied in the equation

$$E = \int_{\Omega} L(\theta, \phi) \cos\theta \, d\omega \tag{18}$$

for a point in the surface on which they are defined. In this and subsequent equations, the lower case dω is used to identify an element of solid angle and the upper case Ω to identify a finite solid angle.

If Ω = 0 in Eq. (18), there is no solid angle and there can be no irradiance! When we speak of a collimated beam of some given irradiance, say E_o, we are talking about the irradiance contained in a beam of nearly parallel rays, but which necessarily have some small angular spread to them, filling a small but finite solid angle Ω, so that (18) can be nonzero. A perfectly collimated beam contains no irradiance, because there is no directional spread to its radiation; the solid angle is zero. Perfect collimation is a useful concept for theoretical discussions, however, and it is encountered frequently in optics. When speaking of collimation in experimental situations, what is usually meant is “quasi-collimation,” nearly perfect collimation.

If the radiance L(θ, φ) in Eq. (18) is constant over the range of integration (over the hemispherical solid angle), then it can be removed from the integral and the result is

$$E = \pi L. \tag{19}$$

This result is obtained from Eq. (18) by replacing dω with its equivalent in spherical coordinates, sin θ dθ dφ, and integrating the result over the angular ranges of 0 to 2π for φ and 0 to π/2 for θ.

A constant-radiance surface is called a Lambertian surface, so (19) applies only to such surfaces.
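The step from Eq. (18) to Eq. (19) can be checked numerically. The sketch below is an illustration rather than a general-purpose integrator: it assumes, for brevity, that the radiance does not depend on the azimuth φ, so the φ integral contributes a factor of 2π. The function and variable names are hypothetical.

```python
import math


def hemispherical_irradiance(radiance, n_theta=2000):
    """Midpoint-rule evaluation of Eq. (18) over the full hemisphere.

    radiance: function L(theta) in W m^-2 sr^-1, theta in radians.
    ASSUMPTION: L is independent of phi, so the phi integral is 2*pi.
    """
    d_theta = (math.pi / 2) / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        # d(omega) = sin(theta) d(theta) d(phi); cos(theta) projects the area.
        total += radiance(theta) * math.cos(theta) * math.sin(theta) * d_theta
    return 2 * math.pi * total


if __name__ == "__main__":
    l0 = 100.0  # hypothetical radiance, W m^-2 sr^-1
    lambertian = hemispherical_irradiance(lambda theta: l0)
    print(lambertian, math.pi * l0)  # the two values should agree closely
```

For a direction-dependent radiance such as L(θ) = L₀ cos θ the same routine gives 2πL₀/3 rather than πL₀, which shows why (19) is restricted to Lambertian surfaces.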

It is instructive to show how Eq. (18) can be derived from the definition of radiance. Eq. (9) is solved for d²Φ and the result divided by ds_o. Since d²Φ/ds_o = dE by Eq. (7), we have

$$dE = L \cos\theta \, d\omega. \tag{20}$$

Integrating (20) yields (18). Similarly, one can replace the quotient d²Φ/dω with the differential dI [from Eq. (8)] in Eq. (9) and solve for dI. Integrating the result over the source area S_o yields

$$I = \int_{S_o} L \cos\theta \, ds_o. \tag{21}$$

Intensity is normally applied only to point sources, or to sources whose area S_o is small compared with the distance to them. However, Eq. (21) is valid, even for large sources, though it is not often used this way.

Solving (8) for dΦ = I dω, writing dω as da_o/R², and dividing both sides by da_o yields the expression

$$E = \frac{I \, d\omega}{da_o} = \frac{I \, da_o}{da_o R^2} = \frac{I}{R^2} \tag{22}$$

for the irradiance E a distance R from a point source of intensity I, on a surface perpendicular to the line between the point source and the surface where E is measured. This is an explicit form for what is known as “the inverse square law” for the decrease in irradiance with distance from a point source. The inverse square law is a consequence of the definition of solid angle and the “filling” of that solid angle with flux emanating from a point source.
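A minimal Python sketch of Eq. (22) follows. The intensity and distances are hypothetical values used only to show the scaling.

```python
def irradiance_from_point_source(intensity, distance):
    """Eq. (22): E = I / R^2 on a surface facing the source.

    intensity in W/sr, distance in m, result in W/m^2.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return intensity / distance**2


if __name__ == "__main__":
    intensity = 100.0  # W/sr, hypothetical point source
    for r in (1.0, 2.0, 4.0):
        print(f"R = {r} m -> E = {irradiance_from_point_source(intensity, r)} W/m^2")
```

Each doubling of the distance reduces the irradiance by a factor of four.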

Next comes the conversion from radiance L to flux Φ. Let the dependence of the radiance on position in a surface on which it is defined be indicated by generalized coordinates (u, v) in the surface of interest. Let the directional dependence be denoted by (θ, φ), so that L may be written as a function L(u, v, θ, φ) of position and direction. Solve (9), the definition of radiance, for d²Φ. The result is

$$d^2\Phi = L \cos\theta \, ds_o \, d\omega. \tag{23}$$

Integrating (23) over both the area S_o of the surface and the solid angle Ω of interest yields

$$\Phi = \int_{S_o} \int_{\Omega} L(u, v, \theta, \phi) \cos\theta \, d\omega \, ds_o. \tag{24}$$

In spherical coordinates, dω is given by sin θ dθ dφ. Letting the solid angle Ω over which (23) is integrated extend to the full 2π sr of the hemisphere, we have the total flux emitted by the surface in all directions:

$$\Phi = \int_{S_o} \int_0^{2\pi} \int_0^{\pi/2} L(u, v, \theta, \phi) \cos\theta \sin\theta \, d\theta \, d\phi \, ds_o. \tag{25}$$


V. PRINCIPLES OF FLUX TRANSFER

Only the geometrical aspects of flux transfer through a lossless and nonscattering medium are of interest in this section. The effects of absorption and scattering of radiation as it propagates through a transparent or semitransparent medium from a source to a receiver are outside the scope of this article. The effects of changes in the refractive index of the medium, however, are dealt with in Section V.E.

All uses of flux quantities in this section refer to both their radiant (subscript e) and luminous (subscript v) versions. The subscripts are left off for simplicity. When the terms radiance and irradiance are mentioned in this section, the discussion applies equally to luminance and illuminance, respectively.

A. Source/Receiver Geometry

The discussion begins with the drawing of Fig. 10 and the definition of radiance L in (9):

$$L = \frac{d^2\Phi}{d\omega \, ds_o \cos\theta}, \tag{26}$$

where θ is the angle made by the direction of emerging flux with respect to the normal to the surface of the source, ds_o is an infinitesimally small element of area at the point of definition in the source, and dω is an element of solid angle from the point of definition in the direction of interest.

In Fig. 10 are shown an infinitesimally small element ds_o of area at a point in a source, an infinitesimal element da_o of area at point P on a receiving surface, the distance R between these points, and the angles θ and ψ between the line of length R between the points and the normals to the surfaces at the points of intersection, respectively.

FIGURE 10 Source/receiver geometry.

B. Fundamental Equations of Flux Transfer

The element dω of solid angle subtended by element of projected receiver area da = da_o cos ψ at distance R from the source is

$$d\omega = \frac{da}{R^2} = \frac{da_o \cos\psi}{R^2} \tag{27}$$

so that, solving (26) for d²Φ and using (27), the element of flux received at point P from the element ds_o of area of the source is given by

$$d^2\Phi = \frac{L \, ds_o \cos\theta \, da_o \cos\psi}{R^2} \tag{28}$$

with the total flux received by area A_o from source area S_o being given by

$$\Phi = \int_{S_o} \int_{A_o} \frac{L \, ds_o \cos\theta \, da_o \cos\psi}{R^2}. \tag{29}$$

This is the fundamental (and very general within the assumptions of this section) equation describing the transfer of radiation from a source surface of finite area to a receiving surface of finite area. Most problems of flux transfer involve this integration (or a related version shown later, giving the irradiance E instead of the flux). For complex or difficult geometries the problem can be quite complex analytically because in such cases L, θ, ψ, and R will be possibly complicated functions of position in both the source and the receiver surfaces. The general dependency of L on direction is also embodied in this equation, since the direction from a point in the source to a point in the receiver generally changes as the point in the receiver moves over the receiving surface.

The evaluation of (29) involves setting up diagrams of the geometry and using them to determine trigonometric and other analytic relationships between the geometric variables in (29). If the problem is expressed in cartesian coordinates, for example, then the dependences of L, θ, ψ, R, ds_o, and da_o upon those coordinates must be determined so that the integrals in (29) can be evaluated.

Two important simplifications allow us to address a large class of problems in radiometry and photometry with ease, by simplifying the mathematical analysis.

The first results when the source of radiance is known to be Lambertian and to have the same value for all points in the source surface. This makes L constant over all ranges of integration, both the integration over the source area and the one over the solid angle of emerging directions from each point on the surface. In such a case, the radiance can be removed from all integrals over these variables. The remaining integrals are seen to be purely geometric in character. The second simplification arises when one doesn’t want the total flux over the whole receiving surface, only the flux per unit area at a point on that surface, the irradiance E at point P in Fig. 10. In this case, we can divide both sides of (28) by the element of area in the receiving surface, da_o, to get

$$dE = \frac{L \cos\theta \cos\psi}{R^2} \, ds_o. \tag{30}$$

This equation is the counterpart of (28) when it is the irradiance E of the receiving surface that is desired. For the total irradiance at point P, one must integrate this equation over the portion S_o of the source surface contributing to the flux at P.

$$E = \int_{S_o} \frac{L \cos\theta \cos\psi}{R^2} \, ds_o. \tag{31}$$

When L is constant over direction, it can be removed from this integral and one is left with a simpler integration to perform. Equation (31) is the counterpart to (29) when it is the irradiance E at the receiving point that is of interest rather than the total flux over area A_o.
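To make the integration in Eq. (31) concrete, the following Python sketch evaluates it for one simple geometry that is not taken from this article: a uniform Lambertian disc and a receiving point P on the disc's axis, with the receiving surface parallel to the disc. In that case θ = ψ, and the standard closed-form result E = πL a²/(a² + h²) is available as a check. All names and numerical values are hypothetical.

```python
import math


def on_axis_irradiance_disc(radiance, disc_radius, distance, n_rings=4000):
    """Midpoint-rule integration of Eq. (31) over a Lambertian disc.

    The disc is split into thin rings; every element of a ring sees the
    on-axis point P at the same angle, so theta = psi for each ring.
    """
    dr = disc_radius / n_rings
    total = 0.0
    for i in range(n_rings):
        r = (i + 0.5) * dr                 # ring radius at the midpoint
        r_squared = distance**2 + r**2     # R^2 from the ring to P
        cos_theta = distance / math.sqrt(r_squared)
        ring_area = 2 * math.pi * r * dr   # ds_o for the ring
        total += radiance * cos_theta * cos_theta / r_squared * ring_area
    return total


def closed_form(radiance, disc_radius, distance):
    """Analytic result for the same geometry, used only as a check."""
    return math.pi * radiance * disc_radius**2 / (disc_radius**2 + distance**2)


if __name__ == "__main__":
    L, a, h = 1000.0, 0.1, 0.5  # W m^-2 sr^-1, m, m (hypothetical)
    print(on_axis_irradiance_disc(L, a, h), closed_form(L, a, h))
```

Because L is constant here, it could equally be taken outside the sum, leaving a purely geometric integral.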

C. Simplified Source/Receiver Geometries

If the source area S_o is small with respect to the distance R to the point P of interest (i.e., if the maximum dimension of the source is small compared with R), then R², cos ψ, and cos θ do not vary much over the range of integration shown in (31) and they can be removed from the integral. If L does not vary over S_o, then it also can be removed from the integral, even if L is direction dependent, because the range of integration over direction is so small; that is, only the one direction from the source to point P in the receiver is of interest. We are left with an approximate version of (30) for small homogeneous sources some distance from the point of reception:

$$E \approx \frac{L S_o \cos\theta \cos\psi}{R^2}. \tag{32}$$

This equation contains within it both the cosine law and the inverse square law.

If the source and receiving surfaces face each other directly, so that θ and ψ are zero, both of the cosines in this equation have values of unity and the equation is still simpler in form.
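How small must the source be for Eq. (32) to be adequate? One way to get a feel for this, sketched below, is to compare Eq. (32) with the standard closed-form on-axis result for a uniform Lambertian disc of radius a at distance h, E = πL a²/(a² + h²). The disc geometry and the numbers are assumptions introduced for illustration; they are not taken from this article.

```python
import math


def exact_disc(radiance, a, h):
    """On-axis irradiance from a Lambertian disc (standard closed form)."""
    return math.pi * radiance * a**2 / (a**2 + h**2)


def small_source(radiance, a, h):
    """Eq. (32) with S_o = pi a^2, R = h, and theta = psi = 0."""
    return radiance * math.pi * a**2 / h**2


def relative_error(a, h, radiance=1.0):
    """Fractional overestimate made by Eq. (32) for this geometry."""
    return small_source(radiance, a, h) / exact_disc(radiance, a, h) - 1.0


if __name__ == "__main__":
    for ratio in (0.5, 0.2, 0.1, 0.05):
        print(f"a/h = {ratio:4.2f}: error = {relative_error(ratio, 1.0):.4f}")
```

For this geometry the fractional error equals (a/h)², so a source whose radius is one tenth of the distance is handled to about 1%.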

D. Configuration Factor

In analyzing complicated radiation transfer problems, it is frequently helpful to introduce what is called the configuration factor. Alternate names for this factor are the view, angle, shape, interchange, or exchange factor. It is defined to be the fraction of total flux from the source surface that is received by the receiving surface. It is given the symbol F_s−r or F_1−2, indicating flux transfer from source to receiver or from Surface 1 to Surface 2. In essence, it indicates the details of how flux is transferred from a source area of some known form to a reception area. Its value is most evident when the source radiance is of such a nature that it can be taken from the integrals, leaving integrals over only geometric variables.

The geometry can still be quite complex, making analytical expressions for F_1−2 difficult to determine and calculate. Many important geometries have already been analyzed, however, and the resulting configuration factors published.

In many problems, one is most concerned with the magnitude and spectral distribution of the source radiance and the corresponding spectral irradiance in a receiving surface, rather than with the geometrical aspects of the problem expressed by the shape factor. It is very convenient in such cases to separate the spectral variations from the geometrical ones. Once the configuration factor has been determined for a situation with nonchanging geometry, it remains constant and attention can be focused on the variable portion of the solution.

A general expression for the configuration factor results from dividing (29) for the flux Φ_r on the receiver by (24) for the total flux Φ_s emitted by the source.

$$F_{s-r} = \frac{\Phi_r}{\Phi_s} = \frac{\displaystyle\int_{S_o} \int_{A_o} \frac{L \cos\theta \cos\psi}{R^2} \, ds_o \, da_o}{\displaystyle\int_{S_o} \int_{2\pi} L \cos\theta \, d\omega \, ds_o}. \tag{33}$$

This is the most general expression for the configuration factor. If the source is Lambertian and homogeneous, or if S_o and A_o are small in relation to R², then L can be removed from the integrals, resulting in

$$F_{s-r} = \frac{1}{\pi S_o} \int_{S_o} \int_{A_o} \frac{\cos\theta \cos\psi}{R^2} \, ds_o \, da_o, \tag{34}$$

a more conventional form for the configuration factor. As desired, it is purely geometric and has no radiation components. For homogeneous Lambertian sources of radiance L, the flux to a receiver, Φ_s−r, is given by

$$\Phi_{s-r} = \pi S_o L F_{s-r}. \tag{35}$$
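As an illustration of Eqs. (34) and (35), the sketch below evaluates the configuration factor for one simple case chosen for this example only: a small Lambertian source element facing a coaxial, parallel receiving disc. With the source element small, the integral over S_o cancels the S_o in the denominator of (34), and the known closed form for this geometry, F = a²/(a² + h²), serves as a check. Function names and values are hypothetical.

```python
import math


def config_factor_element_to_disc(disc_radius, distance, n_rings=4000):
    """Eq. (34) for a small source element and a coaxial parallel disc.

    The receiver is divided into rings; for each ring theta = psi and
    cos(theta) * cos(psi) = distance^2 / R^2.
    """
    dr = disc_radius / n_rings
    total = 0.0
    for i in range(n_rings):
        r = (i + 0.5) * dr
        r_squared = distance**2 + r**2
        cos_product = distance**2 / r_squared
        total += cos_product / r_squared * 2 * math.pi * r * dr  # da_o of ring
    return total / math.pi


def flux_to_receiver(radiance, source_area, factor):
    """Eq. (35): flux from a homogeneous Lambertian source to the receiver."""
    return math.pi * source_area * radiance * factor


if __name__ == "__main__":
    f = config_factor_element_to_disc(disc_radius=1.0, distance=1.0)
    print(f)  # close to 0.5 for a = h
    print(flux_to_receiver(radiance=1000.0, source_area=1e-4, factor=f))
```

Once F is known for a fixed geometry, only the radiance needs to be updated when the source output changes.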

E. Effect of Refractive Index Changes

For a ray propagating through an otherwise homogeneous medium without losses, it can be shown that the quantity L/n² is invariant along the ray. L is the radiance and n is the refractive index of the medium. If the refractive index is constant, the radiance L is constant along the ray. This is known as the invariance of radiance.


Suppose this ray passes through a (specular) interface between two isotropic and homogeneous media of different refractive indices, n₁ and n₂, and suppose there is neither absorption nor reflection at the interface. In this case it can be shown that

$$\frac{L_1}{n_1^2} = \frac{L_2}{n_2^2}. \tag{36}$$

Equation (36) shows how radiance invariance is modified for rays passing through interfaces between two media with different refractive indices.

A consequence of (36) is that a ray entering a medium of different refractive index will have its radiance altered, but upon emerging back into the original medium the original radiance will be restored, neglecting absorption, scattering, and reflection losses.
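The round trip described above can be traced with a short Python sketch of Eq. (36). The refractive index of 1.5 for the second medium is an assumed, glass-like value used only for illustration.

```python
def radiance_in_medium(radiance_1, n1, n2):
    """Eq. (36) solved for L2: radiance after a lossless interface n1 -> n2."""
    return radiance_1 * (n2 / n1) ** 2


if __name__ == "__main__":
    l_air = 100.0             # W m^-2 sr^-1, hypothetical
    n_air, n_glass = 1.0, 1.5  # ASSUMPTION: illustrative indices
    l_glass = radiance_in_medium(l_air, n_air, n_glass)
    l_back = radiance_in_medium(l_glass, n_glass, n_air)
    print(l_glass, l_back)
```

Inside the denser medium the radiance is larger by the factor (n₂/n₁)², and it returns to its original value when the ray re-enters the first medium.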

This is what happens to rays passing through the lens of an imaging system. The radiance associated with every ray contributing to a point in an image is the same as when that ray left the object on the other side of the lens (ignoring reflection and transmission losses in the lens). Since this is true of all rays making up an image point, the radiance of an image formed by a perfect, lossless lens equals the radiance of the object (the source).

This may seem paradoxical. Consider the case of a focusing lens, one producing a greater irradiance in the image than in the object. How can a much brighter image have the same radiance as that of the object? The answer is that the increased flux per unit area in the image is balanced by an equal reduction in the flux per unit solid angle incident on the image. This trading of flux per unit area for flux per unit solid angle is what allows the radiance to remain essentially unchanged.

VI. SOURCES

A. Introduction

The starting point in solving most problems of radiation transfer is determining the magnitude and the angular and spectral distributions of emission from the source. The optical properties of any materials on which that radiation is incident are also important, especially their spectral and directional properties. This section provides comparative information about a variety of sources commonly found in radiometric and photometric problems within the UV, VIS, and IR parts of the spectrum. Definitions used in radiometry and photometry for the reflection, transmission, and absorption properties of materials are provided in Section VII.

Spectral distributions are probably the most important characteristics of sources that must be considered in the design of radiometric systems intended to measure all or portions of those distributions. The matching of a proper detector/filter combination to a given radiation source is one of the most important tasks facing the designer. Section VIII deals with detectors.

B. Blackbody Radiation

All material objects above a temperature of absolute zero emit radiation. The hotter they are, the more they emit. The constant agitation of the atoms and molecules making up all objects involves accelerated motion of electrical charges (the electrons and protons of the constituent atoms). The fundamental laws of electricity and magnetism, as embodied in Maxwell’s equations, predict that any accelerated motion of charges will produce radiation. The constant jostling of atoms and molecules in material substances above a temperature of absolute zero produces electromagnetic radiation over a broad range of wavelengths and frequencies.

1. Stefan–Boltzmann Law

The total radiant flux emitted per unit area from the surface of an object at temperature T is expressed by the Stefan–Boltzmann law, in the form

$$M_{bb} = \sigma T^4, \tag{37}$$

where M_bb is the exitance of (irradiance leaving) the surface in a vacuum, σ is the Stefan–Boltzmann constant (5.67031 × 10⁻⁸ W · m⁻² · K⁻⁴), and T is the temperature in kelvins. The units for M_bb in (37) are W · m⁻². Using (37), a blackbody at 27°C (27 + 273 = 300 K) emits at the rate of about 460 W/m². At 100°C this rate increases to about 1097 W/m².
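The two numerical examples above can be reproduced with a few lines of Python. The sketch uses the value of σ quoted in the text and the same 273 K offset; the function names are hypothetical.

```python
SIGMA = 5.67031e-8  # Stefan-Boltzmann constant quoted in the text, W m^-2 K^-4


def celsius_to_kelvin(temp_c):
    """Uses the rounded offset of 273 K, as in the text's example."""
    return temp_c + 273.0


def blackbody_exitance(temp_k):
    """Eq. (37): total exitance of a blackbody in W/m^2."""
    return SIGMA * temp_k**4


if __name__ == "__main__":
    for temp_c in (27.0, 100.0):
        temp_k = celsius_to_kelvin(temp_c)
        print(f"{temp_c:5.1f} C = {temp_k:5.1f} K -> {blackbody_exitance(temp_k):7.1f} W/m^2")
```

The fourth-power dependence is visible in the result: raising the temperature by about 24% (300 K to 373 K) more than doubles the exitance.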

Equation (37) applies to what is called a perfect or full emitter, one emitting the maximum quantity of radiation possible for a surface at temperature T. Such an emitter is called a blackbody, and its emitted radiation is called blackbody radiation.

A blackbody is defined as an ideal body that allows all incident radiation to pass into it (zero reflectance) and that absorbs internally all the incident radiation (zero transmittance). This must be true for all wavelengths and all angles of incidence. According to this definition, a blackbody is a perfect absorber, having absorptance 1.0 at all wavelengths and directions. Due to the law of the conservation of energy, the sum of the reflectance R and absorptance A of an opaque surface must be unity, A + R = 1.0. Thus, if a blackbody has an absorptance of 1.0, its reflectance must be zero. Accordingly, a perfect blackbody at room temperature would appear totally black to the eye, hence the origin of the name. Only a few surfaces, such as carbon black, carborundum, and gold black, approach a blackbody in these optical properties.


The radiation emitted by a surface is in general distributed over a range of angles filling the hemisphere and over a range of wavelengths. The angular distribution of radiance from a blackbody is constant; that is, the radiance is independent of direction; it is Lambertian. Specifically, this means that L_λ(θ, φ) = L_λ(0, 0) = L_λ. Thus, the relationship between the spectral radiance L_bbλ and spectral exitance M_bbλ of a blackbody is given by (19), repeated here as

$$M_{bb\lambda} = \pi L_{bb\lambda}. \tag{38}$$

If L_bbλ is in W · m⁻² · sr⁻¹ · nm⁻¹ then the units of M_bbλ will be W · m⁻² · nm⁻¹.

2. Greybodies

Imperfect emitters, which emit less than a blackbody at any given temperature, can be called greybodies if their spectral shape matches that of a blackbody. If that shape differs from that of a blackbody the emitter is called a nonblackbody.

The Stefan–Boltzmann law still applies to greybodies, but an optical property factor must be included in (37) and (38) for them to be correct for greybodies. That is the emissivity of the surface, defined and discussed in Section VII.C.3.

3. Planck’s Law

As the temperature changes, the spectral distribution of the radiation emitted by a blackbody shifts. In 1901, Max Planck made a radical new assumption, that radiant energy is quantized, and used it to derive an equation for the spectral radiant energy density in a cavity at thermal equilibrium (a good theoretical approximation of a blackbody). By assuming a small opening in the side of the cavity and examining the spectral distribution of the emerging radiation, he derived an equation for the spectrum emitted by a blackbody. The equation, now called Planck’s blackbody spectral radiation law, accurately predicts the spectral radiance of blackbodies in a vacuum at any temperature. Using the notation of this text the equation is

$$L_{bb\lambda} = \frac{2hc^2}{\lambda^5 \left(e^{hc/\lambda k T} - 1\right)}, \tag{39}$$

where h = 6.626176 × 10⁻³⁴ J · s is Planck’s constant, c = 2.9979246 × 10⁸ m/s is the speed of light in a vacuum, and k = 1.380662 × 10⁻²³ J · K⁻¹ is Boltzmann’s constant. With λ in meters, these values give L_bbλ in W · m⁻² · sr⁻¹ per meter of wavelength; multiplying by 10⁻⁶ expresses it in W · m⁻² · µm⁻¹ · sr⁻¹. Plots of the spectral distribution of a blackbody for different temperatures are illustrated in Fig. 11. Each curve is labeled with its temperature in kelvins. Insignificant quantities of blackbody radiation lie in the visible portion of the spectrum for temperatures below about 1000 K. With increasing temperatures, blackbody radiation first appears red, then white, and at very high temperatures it has a bluish appearance.

FIGURE 11 Exitance spectra for blackbodies at various temperatures from 300 to 20,000 K, calculated using Eq. (47).

From (38), the spectral exitance M_bbλ of a blackbody at temperature T is just the spectral radiance L_bbλ given in (39) multiplied by π.

$$M_{bb\lambda} = \frac{2\pi hc^2}{\lambda^5 \left(e^{hc/\lambda k T} - 1\right)}. \tag{40}$$
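Equation (40) is straightforward to evaluate numerically. The Python sketch below, offered as an illustration under stated assumptions, codes the spectral exitance with the constants quoted in the text and then integrates it over wavelength to confirm that the total agrees with the Stefan–Boltzmann law (37). The integration limits, the logarithmic wavelength grid, and the function names are choices made for this example.

```python
import math

H = 6.626176e-34    # Planck's constant, J s (value quoted in the text)
C = 2.9979246e8     # speed of light in vacuum, m/s
K = 1.380662e-23    # Boltzmann's constant, J/K
SIGMA = 5.67031e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def spectral_exitance(wavelength_m, temp_k):
    """Eq. (40): blackbody spectral exitance in W m^-2 per metre of wavelength."""
    x = H * C / (wavelength_m * K * temp_k)
    if x > 700.0:      # exp(x) would overflow; the exitance is negligible here
        return 0.0
    return 2 * math.pi * H * C**2 / (wavelength_m**5 * math.expm1(x))


def total_exitance(temp_k, lam_min=1e-8, lam_max=1e-2, n=20000):
    """Integrate Eq. (40) over wavelength on a logarithmic grid (midpoint rule)."""
    du = math.log(lam_max / lam_min) / n
    total = 0.0
    for i in range(n):
        lam = lam_min * math.exp((i + 0.5) * du)
        total += spectral_exitance(lam, temp_k) * lam * du  # d(lambda) = lam du
    return total


if __name__ == "__main__":
    for temp in (300.0, 3000.0, 6000.0):
        print(temp, total_exitance(temp), SIGMA * temp**4)
```

Dividing `spectral_exitance` by π gives the spectral radiance of Eq. (39), and multiplying either by 10⁻⁶ converts from per metre to per micrometre of wavelength.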

4. Luminous Efficacy of Blackbody Radiation

Substituting (40) for the hemispherical spectral exitance of a blackbody into (11) for E_e and (14) for E_v, for each of several different temperatures T, one can calculate the radiation luminous efficacy K_bb of blackbody radiation as a function of temperature. Some numerical results are given in Table VI, where it can be seen that, as expected, the luminous efficacy increases as the body heats up to white hot temperatures. At very high temperatures K_bb declines, since the radiation is then strongest in the ultraviolet, outside of the visible portion of the spectrum.
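The calculation described above can be sketched in Python. Two simplifications are assumptions of this example and not the procedure behind the tabulated values: V(λ) is replaced by a commonly quoted Gaussian fit instead of the tabulated CIE function, and the total radiant exitance is taken from the Stefan–Boltzmann law rather than from a second numerical integral. The results should therefore fall in the neighbourhood of the tabulated efficacies without reproducing them exactly.

```python
import math

H, C, K = 6.626176e-34, 2.9979246e8, 1.380662e-23  # constants quoted in the text
SIGMA = 5.67031e-8                                  # W m^-2 K^-4
K_MAX = 683.0                                       # lm/W at the peak of V(lambda)


def v_lambda_approx(wavelength_um):
    """ASSUMPTION: Gaussian fit to the photopic V(lambda), wavelength in micrometres."""
    return 1.019 * math.exp(-285.4 * (wavelength_um - 0.559) ** 2)


def spectral_exitance(wavelength_m, temp_k):
    """Eq. (40), W m^-2 per metre of wavelength."""
    x = H * C / (wavelength_m * K * temp_k)
    if x > 700.0:  # avoid overflow; contribution is negligible
        return 0.0
    return 2 * math.pi * H * C**2 / (wavelength_m**5 * math.expm1(x))


def blackbody_efficacy(temp_k, n=2000):
    """Luminous efficacy of blackbody radiation in lm/W (approximate)."""
    lo, hi = 0.36e-6, 0.83e-6          # visible range used for the integral, m
    d_lam = (hi - lo) / n
    luminous = 0.0
    for i in range(n):
        lam = lo + (i + 0.5) * d_lam
        luminous += spectral_exitance(lam, temp_k) * v_lambda_approx(lam * 1e6) * d_lam
    return K_MAX * luminous / (SIGMA * temp_k**4)


if __name__ == "__main__":
    for temp in (2000, 3000, 6000, 10000, 20000):
        print(f"{temp:6d} K: {blackbody_efficacy(temp):6.1f} lm/W")
```

The output shows the rise toward a maximum at several thousand kelvins followed by a decline at higher temperatures, the qualitative behaviour described in the paragraph above.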

5. Experimental Approximation of a Blackbody

The angular and spectral characteristics of a blackbody can be approximated with an arrangement similar to the one shown in Fig. 12. A metal cylinder is hollowed out to form a cavity with a small opening in one end. At the opposite end is placed a conically shaped “light trap,” whose purpose is to multiply reflect incoming rays, with maximum absorption at each reflection, in such a manner that a very large number of reflections must take place before any incident ray can emerge back out the opening. With the absorption high on each reflection, a vanishingly small fraction of incident flux, after being multiply reflected and scattered, emerges from the opening. In consequence, only a very tiny portion of the radiation passing into the cavity through the opening is reflected back out of the cavity.

The temperature of the entire cavity is controlled by heating elements and thick outside insulation so that all surfaces of the interior are at precisely the same (known) temperature and any radiation escaping from the cavity will be that emitted from the surfaces within the cavity. The emerging radiation will be rendered very nearly isotropic by the multiple reflections taking place inside (at least over the useful solid angle indicated in Fig. 12, for which the apparatus is designed).

FIGURE 12 Schematic diagram of an approximation to a blackbody.

TABLE VI Blackbody Luminous Efficacy Values

| Temperature (K) | Luminous efficacy (lm/W) |
|---|---|
| 500 | 7.6 × 10⁻¹³ |
| 1,000 | 2.0 × 10⁻⁴ |
| 1,500 | 0.103 |
| 2,000 | 1.83 |
| 2,500 | 8.71 |
| 3,000 | 21.97 |
| 4,000 | 56.125 |
| 5,000 | 81.75 |
| 6,000 | 92.9 |
| 7,000 | 92.8 |
| 8,000 | 87.3 |
| 9,000 | 79.2 |
| 10,000 | 70.6 |
| 15,000 | 37.1 |
| 20,000 | 20.4 |
| 30,000 | 7.8 |
| 40,000 | 3.7 |
| 50,000 | 2.0 |

C. Electrically Powered Sources

Modern tungsten halogen lamps in quartz envelopes produce output spectra that are somewhat similar in shape to those of blackbody distributions. A representative spectral distribution is shown in Fig. 13. This lamp covers a wide spectral range, including the near UV, the visible, and much of the infrared portion of the spectrum. Only the region from about 240 to 2500 nm is shown in Fig. 13.

Although quartz halogen lamps produce usable outputs in the ultraviolet region, at least down to 200 nm, the output at these short wavelengths is quite low and declines rapidly with decreasing wavelength. Deuterium arc lamps overcome the limitations of quartz halogen lamps in this spectral region, and they do so with little output above a wavelength of 500 nm except for a strong but narrow emission line at about 660 nm. The spectral irradiance from a deuterium lamp is plotted in Fig. 14.

Xenon arc lamps have a more balanced output over the visible but exhibit strong spectral “spikes” that pose problems in some applications. Short arc lamps, such as those using xenon gas, are the brightest manufactured sources, with the exception of lasers. Because of the nature of the arc discharges, these lamps emit a continuum of output over wavelengths covering the ultraviolet and visible portions of the spectrum.

The spectral irradiance outputs of the three sources just mentioned cover the near UV, the visible, and the near IR. They are plotted along with the spectrum of a 50-W mercury-vapor arc lamp in Fig. 14. Mercury lamps emit strong UV and visible radiation, with strong spectral lines in the ultraviolet superimposed over continuous spectra.

FIGURE 13 Spectral irradiance from a quartz halogen lamp 50 cm from the filament.

FIGURE 14 Spectral irradiance distributions for four sources of radiation.

Tungsten halogen lamps have substantial output in the near infrared. For better coverage of IR-A and IR-B, other sources are more commonly used. Typical outputs from several infrared laboratory sources are shown in Fig. 15. The sources are basically electrical resistance heaters, ceramic and other substances that carry electrical current, which become hot due to ohmic heating, and which emit broadband infrared radiation with moderately high radiance.

In addition to these relatively broadband sources, there are numerous others that emit over more restricted spectral ranges. The light-emitting diode (LED) is one example. It is made of a semiconductor diode with a P-N junction designed so that electrical current through the junction in the forward bias direction produces the emission of optical radiation. The spectral range of emission is limited, but not so narrow as to be considered truly monochromatic. A sample LED spectrum is shown in Fig. 16. LEDs are efficient converters of electrical energy into radiant flux.

FIGURE 15 Spectral irradiance distributions from four sources of infrared radiation.

Lasers deserve special mention. An important characteristic of lasers is their extremely narrow spectral output distribution, effectively monochromatic. A consequence of this is high optical coherence, whereby the phases of the oscillations in electric and magnetic field strength vectors are preserved to some degree over time and space. Another characteristic is the high spectral irradiance they can produce. For more information the reader is referred to modern textbooks on lasers and optics. Most gas discharge lasers exhibit a high degree of collimation, an attribute with many useful optical applications.

A problem with highly coherent sources in radiometry and photometry is that not all of the relationships developed so far in this article governing flux levels are strictly correct. The reason is the possibility for constructive and destructive interference when two coherent beams of the same wavelength overlap. The superposition of two or more coherent monochromatic beams will produce a combined irradiance at a point that is not always a simple sum of the irradiances of the two beams at the point of superposition. A combined irradiance level can be more and can be less than the sum of the individual beam irradiances, since it depends strongly on the phase difference between the two beams at the point of interest. The predictions of radiometry and photometry can be preserved whenever they are averaged over many variations in the phase difference between the two overlapping beams.
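The dependence on phase difference can be illustrated with the standard two-beam interference expression, E = E₁ + E₂ + 2√(E₁E₂) cos δ, which is not derived in this article and is introduced here as an assumption (it holds for two fully coherent beams with the same polarization). The Python sketch below shows the extremes and confirms that averaging over phase recovers the simple radiometric sum.

```python
import math


def combined_irradiance(e1, e2, phase_diff):
    """Two fully coherent, equally polarized beams with phase difference (rad)."""
    return e1 + e2 + 2 * math.sqrt(e1 * e2) * math.cos(phase_diff)


def phase_averaged_irradiance(e1, e2, n=3600):
    """Average of the combined irradiance over one full cycle of phase."""
    step = 2 * math.pi / n
    return sum(combined_irradiance(e1, e2, (i + 0.5) * step) for i in range(n)) / n


if __name__ == "__main__":
    e1 = e2 = 1.0  # W/m^2, hypothetical equal beams
    print(combined_irradiance(e1, e2, 0.0))       # constructive interference
    print(combined_irradiance(e1, e2, math.pi))   # destructive interference
    print(phase_averaged_irradiance(e1, e2))      # close to e1 + e2
```

For equal beams the combined irradiance ranges from zero to four times that of a single beam, while the phase average is simply the sum of the two.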

D. Solar Radiation and Daylight

Following its passage through the atmosphere, direct beam solar radiation exhibits the spectral distribution shown in Fig. 17. The fluctuations at wavelengths over 700 nm are the result of absorption by various gaseous constituents of the atmosphere, the most noticeable of which are water vapor and CO₂. The V-lambda curve is also shown in Fig. 17 for comparison.

FIGURE 16 Relative spectral exitance of a red light-emitting diode.


FIGURE 17 Spectral irradiance of terrestrial clear sky direct beam solar radiation.

The spectral distribution of blue sky radiation is similar to that shown in Fig. 17, but the shape is skewed by what is called Rayleigh scattering, the scattering of radiation by molecular-sized particles in the atmosphere. Rayleigh scattering is proportional to the inverse fourth power of the wavelength. Thus, blue light is scattered more prominently than red light. This is responsible for the blue appearance of sky light. (The accompanying removal of light at short wavelengths shifts the apparent color of beam sunlight toward the red end of the spectrum, responsible for the orange-red appearance of the sun at sunrise and sunset.) The spectral distribution of daylight is important in the field of colorimetry and for many other applications, including the daylight illumination of building interiors.
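As a quick numerical illustration of the inverse fourth power law, the sketch below compares the relative scattering strength at two wavelengths. The wavelengths 450 nm and 650 nm are chosen here only as representative "blue" and "red" values and are not taken from the article.

```python
def rayleigh_relative_scattering(wavelength_nm, reference_nm):
    """Scattering strength at wavelength_nm relative to reference_nm (lambda^-4 law)."""
    return (reference_nm / wavelength_nm) ** 4

# Representative blue and red wavelengths (assumed values for illustration).
ratio = rayleigh_relative_scattering(450.0, 650.0)
print(f"450 nm is scattered about {ratio:.1f} times more strongly than 650 nm")
```

With these values the blue light is scattered somewhat more than four times as strongly as the red.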

VII. OPTICAL PROPERTIES OF MATERIALS

A. Introduction

Central to radiometry and photometry is the interaction of radiation with matter. This section provides a discussion of the properties of real materials and their abilities to emit, reflect, refract, absorb, transmit, and scatter radiation. Only the rudiments can be addressed here, dealing mostly with terminology and basic concepts. For more information on the optical properties of matter, the reader is directed to available texts on optics and optical engineering, as well as other literature on material properties.

B. Terminology

The improved uniformity in symbols, units, and nomenclature in radiometry and photometry has been extended to the optical properties of materials. Proper terminology can now be identified for the processes of reflection, transmission, and emission of radiant flux by or through material media. Although symbols have been standardized for most of these properties, there are a few exceptions.

To begin, the CIE definitions for reflectance, transmittance, and absorptance are provided:

1. Reflectance (for incident radiation of a given spectral composition, polarization and geometrical distribution) (ρ): Ratio of the reflected radiant or luminous flux to the incident flux in the given conditions (unit: 1)

2. Transmittance (for incident radiation of given spectral composition, polarization and geometrical distribution) (τ): Ratio of the transmitted radiant or luminous flux to the incident flux in the given conditions (unit: 1)

3. Absorptance: Ratio of the absorbed radiant or luminous flux to the incident flux under specified conditions (unit: 1)

These definitions make explicit the point that radiation incident upon a surface can have nonconstant distributions over the directions of incidence, over polarization state, and over wavelength (or frequency). Thus, when one wishes to measure these optical properties, it must be specified how the incident radiation is distributed in wavelength and direction and how the emergent detected radiation is so distributed if the measurement is to have meaning. Polarization effects are not dealt with here. The wavelength dependence of radiometric properties of materials is indicated with a functional dependence on λ, thus: τ(λ), ρ(λ), and α(λ).

The directional dependencies are indicated by specifying the spherical angular coordinates (θ, φ) of the incident and emergent beams.

In other fields it is common to assign the ending -ivity to intensive, inherent, or bulk properties of materials. The ending -ance is reserved for the extensive properties of a fixed quantity of substance, for example a portion of the substance having a certain length or thickness. (Sometimes the term intrinsic is used instead of intensive and extrinsic is used instead of extensive.) Figure 18 illustrates the difference between intrinsic and extrinsic reflection properties and introduces the concept of interface reflectivity. An example from electronics is the 30 ohm electrical resistance of a 3 cm length of a conductor having a resistivity (expressed here as resistance per unit length for a fixed cross section) of 10 ohms/cm.

According to this usage in radiometry, reflectance is reserved for the fraction of incident flux reflected (under defined conditions of irradiation and reception) from a finite and specified portion of material, such as a 1-cm-thick plate of fused silica glass having parallel, roughened surfaces in air. The reflectivity of a material, such as BK7 glass, would refer to the ratio of reflected to incident flux for the perfectly smooth (polished) interface between an optically infinite thickness of the material and some other material, such as air or vacuum. The infinite thickness is specified to ensure that reflected flux from no other interface can contribute to that reflected by the interface of interest, and to ensure the collection of subsurface flux scattered by molecules of the material.

FIGURE 18 (a) Intrinsic versus (b) extrinsic reflection properties of a material. (c) Interface reflectivity.

CIE definitions for the intrinsic optical properties of matter read as follows:

1. Reflectivity (of a material) (ρ∞): Reflectance of a layer of the material of such a thickness that there is no change of reflectance with increase in thickness (unit: 1)

2. Spectral transmissivity (of an absorbing material) (τi,o(λ)): Spectral internal transmittance of a layer of the material such that the path of the radiation is of unit length, and under conditions in which the boundary of the material has no influence (unit: 1)

3. Spectral absorptivity (of an absorbing material) (αi,o(λ)): Spectral internal absorptance of a layer of the material such that the path of the radiation is of unit length, and under conditions in which the boundary of the material has no influence (unit: 1)

One can further split the reflectivity ρ∞ into interface-only and bulk property components. We use the symbol ρ̄, rho with a bar over it, to indicate the interface contribution to the reflectivity. The interface transmissivity τ̄ is included in this notational custom. Since there is presumed to be no absorption when radiation passes through or reflects from an interface, τ̄ + ρ̄ = 1.0.

To denote the optical properties of whole objects, such as parallel sided plates of a material of specific thickness, we use upper case Roman font characters, as with the symbol R for reflectance, and the “-ance” suffix.

This terminology is summarized as follows:

ρ̄  Reflectivity of an interface
ρ  Reflectivity of a pure substance, including both bulk and interface processes
R  Reflectance of an object
τ̄  Transmissivity of an interface
τ  (Internal) linear transmissivity of (a unit length of) a transparent or partially transparent substance, away from interfaces; unit: m⁻¹
T  Transmittance of an object
α  (Internal) linear absorptivity of (a unit length of) a transparent or partially transparent substance, away from interfaces; unit: m⁻¹
A  Absorptance of an object

C. Surface and Interface Optical Properties

1. Conductor Optical Properties

A perfect conductor, characterized by infinitely great conductivity, has an infinite refractive index, and penetration of electromagnetic radiation to any depth is prohibited. This produces perfect reflectivity. Real conductors such as aluminum and silver do not have perfect conductivities nor do they have perfect reflectivities. Their reflectivities are quite high, however, over broad spectral ranges. They are therefore useful in radiometric and photometric applications. Unprotected mirrors made of these materials, unfortunately, tend to degrade with exposure to air over time and they are seldom used without protective overcoatings. The normal incidence spectral reflectances of optical quality glass mirrors coated with aluminum, with aluminum having a magnesium fluoride protective overcoat, with aluminum having a silicon monoxide overcoat, with silver having a protective dielectric coating, and with gold are shown in Fig. 19. The reflectance of these surfaces, already quite high at visible and infrared wavelengths, increases with incidence angle, approaching unity at 90°.

2. Nonconductor Optical Properties

Consider the extremely thin surface region of a perfectly smooth homogeneous and isotropic dielectric material, its interface with another medium such as air, water, or a vacuum, an interface normally too thin to absorb significant quantities of the radiation incident on it. Absorption is not considered in this discussion since it is considered to be a bulk or volume characteristic of the material. Radiation incident upon an interface between two different materials is split into two parts. Some is reflected, and the rest is transmitted. The fraction of incident flux that is reflected is called the interface reflectivity ρ̄ and the fraction transmitted is the interface transmissivity τ̄. The variations of ρ̄ and τ̄ with angle of incidence are given by Fresnel’s formulas, which can be found in most optical textbooks.

FIGURE 19 Spectral reflectances of commercially available metallic mirror materials.
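Fresnel's formulas are not reproduced in this article, so the following is only a sketch based on their usual textbook form for a non-absorbing dielectric interface and unpolarized incident radiation. The function names and the refractive index value 1.5 are assumptions made for illustration.

```python
import math

def fresnel_interface_reflectivity(n1, n2, theta_i_deg):
    """Unpolarized interface reflectivity for radiation passing from index n1 to n2."""
    theta_i = math.radians(theta_i_deg)
    sin_t = n1 * math.sin(theta_i) / n2          # Snell's law
    if abs(sin_t) >= 1.0:
        return 1.0                               # total internal reflection
    theta_t = math.asin(sin_t)
    cos_i, cos_t = math.cos(theta_i), math.cos(theta_t)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)  # perpendicular
    r_p = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)  # parallel
    return 0.5 * (r_s ** 2 + r_p ** 2)

def interface_transmissivity(n1, n2, theta_i_deg):
    """No absorption at the interface, so transmissivity is one minus reflectivity."""
    return 1.0 - fresnel_interface_reflectivity(n1, n2, theta_i_deg)

# Air to a glass of assumed index 1.5 at several angles of incidence.
for angle in (0.0, 45.0, 80.0):
    print(angle, fresnel_interface_reflectivity(1.0, 1.5, angle))
```

At normal incidence this reduces to ((n1 − n2)/(n1 + n2))², about 4% for the assumed index, and the reflectivity rises toward unity at grazing incidence.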

When the bulk medium optical properties are considered, the situation is more complicated, since the transmitted flux can be absorbed and “re-reflected” and/or scattered by the medium below the interface, by direction- and wavelength-dependent processes.

When both interface and interior optical processes are considered together, the spectral and directional variations in transmissivity and reflectivity become still more important, and the absorptivity of the medium also comes into play. The wavelength dependence of the optical properties of materials is indicated with a functional λ notation thus: τ(λ), α(λ), and ρ(λ). The direction of an element of solid angle dω is indicated using the spherical angular coordinates (θ, φ), illustrated in Fig. 6. Using these coordinates, the directional dependence of optical properties is indicated with the functional notation: τ(θ, φ) and ρ(θ, φ), and the combined spectral and directional properties thus: τ(λ, θ, φ) and ρ(λ, θ, φ).

3. Surface Emission Properties

The emissive properties of greybody and nonblackbody surfaces are characterized by their emissivity ε. Emissivity is the ratio of the actual emission of thermal radiant flux from a surface to the flux that would be emitted by a perfect blackbody emitter at the same temperature. According to the terminology guidelines given earlier, the term emissivity should be reserved for the surface of an infinitely thick slab of pure material with a polished surface, while emittance would apply to a finite thickness of an actual object. For substances opaque at the wavelengths of emission, however, the intrinsic and extrinsic versions of ε are the same, leading to two acceptable names for the same quantity.

As was the case for reflectance and transmittance, emittance is in general a directional quantity and can be specified as ε(θ, φ). The directional emittance at normal incidence (θ = 0) is called the normal emittance. The average of the directional emittance over the whole hemispherical solid angle is called the hemispherical emittance. The emittances shown in Table VII are for hemispherical emittance into a vacuum.

The spectral exitance Mλ(λ) of a nonblackbody can be specified using the spectral emittance ε(λ):

$$M_\lambda(\lambda) = \varepsilon(\lambda)\, M_{bb\lambda}(\lambda) \tag{41}$$

with Mbbλ being given by (40).
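Equation (41) can be sketched numerically as follows. Equation (40) is not reproduced in this section, so the code assumes it is the usual Planck expression for blackbody spectral exitance; the emittance of 0.90 is the white paint value from Table VII, treated here as spectrally flat, which is a simplifying assumption.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light in vacuum, m/s
K_B = 1.380649e-23    # Boltzmann constant, J/K

def blackbody_spectral_exitance(wavelength_m, temperature_k):
    """Planck spectral exitance, in W m^-2 per metre of wavelength (assumed form of Eq. 40)."""
    c1 = 2.0 * math.pi * H * C ** 2
    c2 = H * C / K_B
    return c1 / (wavelength_m ** 5 * math.expm1(c2 / (wavelength_m * temperature_k)))

def spectral_exitance(wavelength_m, temperature_k, spectral_emittance):
    """Eq. (41): spectral emittance times the blackbody spectral exitance."""
    return spectral_emittance * blackbody_spectral_exitance(wavelength_m, temperature_k)

# White paint (emittance 0.90, assumed flat) at 300 K and a wavelength of 10 micrometres.
m_bb = blackbody_spectral_exitance(10e-6, 300.0)
m_paint = spectral_exitance(10e-6, 300.0, 0.90)
print(m_bb * 1e-6, m_paint * 1e-6)   # converted to W m^-2 per micrometre
```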

4. Directional Optical Properties

Radiation incident at a point in a surface can come to that point from many directions. The concept of a pencil of rays, rays filling a right circular conical solid angle, like the shape of the tip of a well-sharpened wooden pencil, is useful in describing the directional dependences of transmittance and reflectance, for both theoretical treatments and in practical measurements.

In making transmittance or reflectance measurements, a sample to be tested is illuminated with radiation filling some solid angular range of directions. The reflected or transmitted flux is then collected over another range of directions within some second solid angle.

In order for the transmittance or reflectance value to have meaning, either theoretically or experimentally, it is essential that the directions and solid angles of incidence and emergence be specified. These tell the ranges of angles involved in the measurement.

TABLE VII Hemispherical Emittance Values for Typical Materials

Material                         Emittance from 4 to 16 µm
White paint                      0.90
Black asphalt and roofing tar    0.93
Light concrete                   0.88
Pine wood                        0.60
Stainless steel                  0.18 to 0.28
Galvanized sheet metal           0.13 to 0.28
Aluminum sheet metal             0.09
Polished aluminum                0.05 to 0.08

FIGURE 20 Geometry for directional, conical, and hemispherical solid angles.

In discussing reflectance and transmittance, there are three categories of solid angles of interest, and several different definitions of reflectance and transmittance using combinations of these. The three solid angle categories are directional, conical, and hemispherical. They are illustrated in Fig. 20. There are nine possible combinations of these three kinds of solid angle, resulting in the nine names for them given below. The most commonly used ones are indicated in bold face type.

Bidirectional
Directional–conical
Directional–hemispherical
Conical–directional
Hemispherical–directional
Biconical
Conical–hemispherical
Hemispherical–conical
Bihemispherical

The first five of these are mainly found in theoretical discussions. The last four are used in reflectance and transmittance measurements. Solar optical property standards published by various organizations refer to conical–hemispherical measurements. The reason is that for most practical problems, it is only the total transmitted or reflected irradiance due to the directly incident beam alone that is of interest. For other applications and more general or more complex situations, the biconical definition is the most important (see Fig. 21). Theoretical treatments of radiative transfer deal almost exclusively with the “directional” versions of the definitions. Sometimes the terminology “bidirectional” is used to refer to biconical measurements. This is appropriate when the solid angles involved are small. Example geometries are shown in Figs. 22 and 23.

FIGURE 21 Geometry for the definition of biconical transmittance and reflectance.

FIGURE 22 Geometry for the definition of directional–hemispherical reflectance.

FIGURE 23 Geometry for the definition of conical–hemispherical transmittance.

VIII. THE DETECTION OF RADIATION

There is considerable variety in the kinds of devices (called detectors or sensors) available for the detection and measurement of optical radiation. Some respond to the heat produced when radiant energy is absorbed by a surface. Some convert this heat into mechanical movement, and some convert it into electricity. Photographic emulsions convert incident radiation into chemical changes made visible by the development process. Other detectors convert electromagnetic radiation directly into electrical energy.

Many electrical effects have been devised to amplify the typically small electrical signals produced by detectors to levels easier to measure. There are unavoidable small fluctuations found in the output signals of all detectors which mask or obscure the signal resulting from incident radiation. This is called noise, and various means have been devised to reduce its effect on measurement results.

Most detectors with high sensitivity (strong response to weak flux levels) have nonuniform spectral responses. Often the inherent spectral response of the detector is not the one desired for the application. In most such cases it is possible to add a spectrally selective filter, producing a combined filter/detector response closer to what is desired. Matching filters with detectors for this purpose can be difficult, but it is one of the most important problems in radiometry and photometry.

The output voltage or current of most detectors depends on more than just the strength of the incident flux. Temperature T can have an effect, as can the direction θ of incident radiation. If we combine all these dependencies into one single spectral response function, R(λ, Φλ, T, θ, x, y, z, ...), we can write an equation for the output signal S(λ) at wavelength λ as a function of the incident spectral radiant flux Φλ:

$$S(\lambda) = R(\lambda, \Phi_\lambda, T, \theta, x, y, z, \ldots)\,\Phi_\lambda + S_o, \tag{42}$$

where x, y, and z are other physical parameters on which the detector’s output might depend and So is the “dark signal,” the signal output of the detector (be it current or voltage) when the flux on it is zero.

Considering only the spectral dependency in the above equation, the total output signal S, in terms of the incident spectral irradiance Eλ(λ), a spectral altering filter transmittance T(λ), the detector responsivity R(λ), and the detector area A will be given by

$$S = A \int_0^\infty E_\lambda(\lambda)\, T(\lambda)\, R(\lambda)\, d\lambda + S_o. \tag{43}$$

If the detector spectral response R(λ) is constant, at the value Ro, over some wavelength range of interest, a spectrum altering filter will not be needed and the only integral remaining in (43) is over the spectral irradiance. The result is the simpler equation

$$S = A E_e R_o + S_o, \tag{44}$$

which may be solved for the incident irradiance:

$$E_e = k(S - S_o), \tag{45}$$

where k = 1/(A Ro) is the calibration constant for a detector used as a normal incidence irradiance meter and So is the dark signal.
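Equations (43) through (45) can be checked numerically with a short sketch. The trapezoidal integration, the function names, and every numerical value below (spectral range, detector area, responsivity, dark signal) are illustrative assumptions and do not describe any particular detector.

```python
def trapezoid(ys, xs):
    """Trapezoidal integral of samples ys taken at abscissae xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1))

def detector_signal(wavelengths, e_lambda, t_filter, r_detector, area, dark_signal):
    """Eq. (43): S = A * integral(E_lambda * T * R dlambda) + S_o."""
    integrand = [e * t * r for e, t, r in zip(e_lambda, t_filter, r_detector)]
    return area * trapezoid(integrand, wavelengths) + dark_signal

def irradiance_from_signal(signal, dark_signal, area, r0):
    """Eq. (45): E_e = k (S - S_o), with k = 1 / (A R_o)."""
    k = 1.0 / (area * r0)
    return k * (signal - dark_signal)

# Hypothetical case: flat spectral irradiance of 1 W m^-2 nm^-1 from 400 to 700 nm,
# no filter (T = 1), flat responsivity R_o = 0.5 A/W, area 1 cm^2, dark signal 1 nA.
wl = [400.0 + i for i in range(301)]
e_lam = [1.0] * len(wl)
t_flt = [1.0] * len(wl)
r_det = [0.5] * len(wl)
s = detector_signal(wl, e_lam, t_flt, r_det, area=1e-4, dark_signal=1e-9)
print(s, irradiance_from_signal(s, 1e-9, 1e-4, 0.5))
```

With a flat responsivity the integral collapses to Eq. (44), so inverting with Eq. (45) returns the band-integrated irradiance of 300 W/m² used to build the example.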

Values are published by manufacturers for the responsivity and other characteristics of their detectors. These performance figures are approximate and are used mainly for the selection of a detector with the proper characteristics for the given application, not for calibration purposes, with a notable exception, described in Section X.C.

It is important to note that the smaller the detector, the lower the noise level produced. There is therefore usually a noise penalty for using a detector having a sensitive surface significantly larger than the incident beam. The unused area contributes to both the dark current and to the noise but not to the signal. The signal-to-noise ratio (SNR) of a detector can therefore be improved by using a detector only as large as needed to match the beam of flux placed on the detector by the conditioning optics.

Often the flux incident on a detector is “chopped,” that is, made to switch on and off at some frequency f. Much of the noise in such detectors can be suppressed from the output signal if the alternating output signal from the detector is amplified only at the chopping frequency f. The larger the frequency bandwidth Δf of this amplification circuit, the greater the noise in the amplified signal. This leads to the concept of noise equivalent power, or NEP, of the detector. This is the flux incident on the detector, in units of watts, which produces an amplified signal just equal to the root mean square (rms) of the noise. It is generally desirable to have a low value of the NEP, which is quoted in units of W·Hz^(−1/2). The lower the value of the NEP, the lower the flux the detector can measure with a good SNR. Detectivity D is the reciprocal of NEP. Normalized detectivity, D*, is the detectivity normalized for detector area and frequency bandwidth. It has units of Hz^(1/2)·(cm²)^(1/2)·W⁻¹.

The spectral detectivities of a variety of detectors are shown in Fig. 24. One thing is clear from those plots: broad spectral coverage generally comes at the expense of detectivity.
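The relation between NEP, D, and D* can be sketched as below, using the common convention D* = sqrt(A·Δf)/NEP, with NEP in watts for the stated bandwidth. This convention is an assumption, since the article gives only the units. If the NEP is already quoted per root hertz, the bandwidth argument can be set to 1. The numerical values are hypothetical.

```python
import math

def detectivity(nep_w):
    """Detectivity D = 1 / NEP, in W^-1."""
    return 1.0 / nep_w

def normalized_detectivity(nep_w, area_cm2, bandwidth_hz):
    """D* = sqrt(A * delta_f) / NEP, in cm Hz^(1/2) W^-1 (assumed convention)."""
    return math.sqrt(area_cm2 * bandwidth_hz) / nep_w

# Hypothetical detector: NEP of 1 pW measured in a 1 Hz bandwidth, area 0.25 cm^2.
print(detectivity(1e-12))
print(normalized_detectivity(1e-12, 0.25, 1.0))
```

Normalizing in this way allows detectors of different sizes, measured with different amplifier bandwidths, to be compared on a single plot such as Fig. 24.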

Once an appropriate detector has been selected, it will generally be placed in an optical–mechanical system having the effect of conditioning the flux prior to its receipt by the detector. This conditioning can consist of chopping, focusing incident flux into a narrow conical solid angular range of angles, and/or spectral filtering.

IX. RADIOMETERS AND PHOTOMETERS, SPECTRORADIOMETERS, AND SPECTROPHOTOMETERS

A. Introduction

Radiometer is the term given to an instrument designed to measure radiant flux. Some radiometers measure the radiant flux contained in a beam having a known solid angle and cross-sectional area. Others measure the flux received from a large range of solid angles. Some radiometers measure over a large wavelength range. These are termed broadband. Others perform measurements only over a narrow spectral interval. When the shape of the spectral response of a broadband radiometer is made to match the human spectral photopic efficiency function, the V-lambda curve, it is called a photometer.

FIGURE 24 Representative spectral normalized detectivities of a variety of detectors.

Some narrow spectral interval radiometers are made which scan the position of their narrow spectral interval across the spectrum. These are called spectroradiometers. They are used to measure the spectral flux, irradiance, or radiance received by them.

Spectrophotometers are misnamed. This term is generally applied to neither radiometers nor photometers but to transmissometers or to reflectometers (instruments measuring an optical property) which scan over a range of monochromatic wavelengths. In spite of the inclusion of “photo” in the name, the human photopic spectral response function (the V-lambda curve) is generally not employed in the use of spectrophotometers. They might therefore more properly be called spectral transmissometers (or reflectometers).

Radiometers are divided into radiance and irradiance subclasses. Instruments with intentionally broad spectral coverage are called broadband radiometers. Photometers are similarly divided into luminance and illuminance versions.

B. Spectral Response Considerations

In the practical use of radiance (and irradiance) meters, it is especially important to be cognizant of the spectral limitations of the meter and to include these limits when reporting measurement results. Flux entering the meter having wavelengths outside its range of sensitivity will not be measured. In such cases, the measurements will only sample a portion of the incident flux and should be so reported. Well-built photometers do not share this characteristic. If the spectral response of a photometer strictly matches the shape of the V-lambda curve, then flux outside the visible wavelength range should not be measured, will not be measured, will not be recorded, and cannot be reported. Furthermore, in this case of perfect spectral correction, any spectral distribution of radiation incident on the photometer will be measured correctly without spectral response errors. On the other hand, if a photometer’s spectral response does not quite match the shape of the V-lambda curve, the resulting errors can be small or large depending upon the spectral distribution of flux from the source over the spectral region of the departure from V(λ) response. Consider, for example, the case of a measurement of the illuminance from a helium-neon laser beam at wavelength 632.8 nm. This wavelength is at the red edge of the visible spectrum and a relatively small error in the V-lambda correction of a photometer at this wavelength can yield a large error in the measurement of illuminance from this source.
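The sensitivity to spectral mismatch at 632.8 nm can be illustrated with a rough sketch. It replaces the tabulated V-lambda curve with a simple Gaussian stand-in, which is an assumption made purely for illustration and is not accurate enough for real photometry. The hypothetical mismatch modelled here is a meter response equal to the V-lambda shape displaced by a few nanometres.

```python
import math

def v_lambda_approx(wavelength_nm):
    """Rough Gaussian stand-in for the photopic V-lambda curve (illustration only)."""
    um = wavelength_nm / 1000.0
    return 1.019 * math.exp(-285.4 * (um - 0.559) ** 2)

def relative_error_from_shift(wavelength_nm, shift_nm):
    """Relative reading error for monochromatic light at wavelength_nm when the
    meter response is the V-lambda shape displaced by shift_nm toward the red."""
    actual = v_lambda_approx(wavelength_nm - shift_nm)
    ideal = v_lambda_approx(wavelength_nm)
    return actual / ideal - 1.0

# Compare a hypothetical 5 nm displacement near the peak and at the He-Ne line.
print(relative_error_from_shift(555.0, 5.0))
print(relative_error_from_shift(632.8, 5.0))
```

Within this approximation the same 5 nm displacement changes the reading by a few percent near the peak of the curve but by roughly 20% at 632.8 nm, where the curve is steep.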

C. Cosine Correction

A consequence of the cosine law is that the output of a perfect irradiance meter illuminated uniformly with collimated radiation fully filling its sensing area will decrease with the cosine of the angle of incidence as that angle increases from zero to 90°. Such behavior is called “good cosine correction.” Most detectors do not have this desirable characteristic by themselves. To restore good cosine response, some correction method is needed if a detector is to work properly as an irradiance or illuminance meter. Furthermore, the housings of many detectors shade their sensitive surfaces at some angles of incidence, again calling for some means of angular response correction.
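One way to quantify how well a meter follows the cosine law is to compare its measured relative response with the ideal value at each angle. The sketch below assumes readings normalized to the normal-incidence reading; the function names and the sample readings are hypothetical.

```python
import math

def ideal_cosine_response(theta_deg):
    """Relative output of a perfect irradiance meter at incidence angle theta."""
    if abs(theta_deg) >= 90.0:
        return 0.0
    return math.cos(math.radians(theta_deg))

def cosine_error(measured_relative, theta_deg):
    """Fractional deviation of a normalized reading from the cosine law."""
    ideal = ideal_cosine_response(theta_deg)
    if ideal == 0.0:
        raise ValueError("cosine error is undefined where the ideal response is zero")
    return measured_relative / ideal - 1.0

# Hypothetical normalized readings (reading at angle / reading at 0 degrees).
readings = {0.0: 1.000, 30.0: 0.860, 60.0: 0.450, 80.0: 0.120}
for angle, value in readings.items():
    print(angle, round(cosine_error(value, angle), 3))
```

Because the ideal response becomes small at large angles, a given absolute deviation there produces a large fractional error, which is one reason the large-angle region is difficult to correct.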

FIGURE 25 Schematic illustration of the features of an irradiance/illuminance meter.

A common method, shown in Fig. 25, for providing the needed correction is to cover the detector with a sheet of milk-white, highly diffusing, semitransparent material having good (and ideally constant) hemispherical–conical spectral transmittance over the spectral range of good detector sensitivity. The idea is that no matter how the incident radiation falls on this material, a fixed and constant fraction of it will be delivered to the detector over a range of angles. In practice, no diffusing sheet has been found that satisfies this ideal perfectly.

What is done is to experiment with a variety of diffusing materials, surface roughnesses, and geometrical configurations until a combination is found that provides reasonably good cosine correction.

A solution to this problem is to limit the size of the diffusing sheet and allow it to extend above the detector housing so that at large angles of incidence some of the incident flux will be received by the edge of the sheet, this edge being perpendicular to the front surface. Thus, as more and more flux is reflected from the front of the sheet, more and more will be transmitted through its edge, since in this case the incidence angle is decreasing and the exposed area increases. At an 80° angle of incidence, for example, little flux will enter through the front face of the diffuser, but much more will enter through the edge.

A problem remains, however. True cosine response drops to zero at 90°, whereas the exposed edge of the diffuser receives considerable quantities of flux at this angle. The cosine corrector must be designed so that no flux can reach the detector for angles of incidence at and greater than 90°. The usual solution to this requirement is illustrated generically in Fig. 25.

As the angle of incidence increases, more and more flux will reach the edge of the diffuser until the angle of incidence approaches 90°, at which point the shading ring begins to shade the edge. Finally, at 90°, the diffuser is shaded completely and no flux can reach it. The design of the specific dimensions of this cosine correction scheme depends strongly on the biconical optical properties of the diffusing sheet, the geometrical placement of the detector below it, and the angular response characteristics of the detector itself. Finding the right geometry is often a hit-or-miss proposition. Even if a good design is found, the quality of the corrected cosine response can suffer if the properties of the diffusing material change in time, from batch to batch of manufacture, or with the wavelength of incident radiation. Making a good cosine corrector is one of the most difficult problems in the manufacture of good quality, accurate irradiance and illuminance meters.

A way of providing better cosine correction than the one diagrammed in Fig. 25 is through the use of an integrating sphere. An arrangement for utilizing the desirable properties of the integrating sphere is illustrated in Fig. 26. The approach is based on the following idealized principle. Flux entering a small port in a hollow sphere whose interior surface is coated with a material of extremely high diffuse reflectance will be multiply reflected (scattered) a large number of times, in all directions, with little loss on each reflection. If a small hole is placed in the side of the sphere and shielded from flux that has not been reflected at least once by the sphere, the flux emerging from this hole will be a fixed and constant fraction of the flux entering the other hole, regardless of how the flux entering the input port is distributed in angle.

FIGURE 26 Integrating sphere cosine correction.

Real integrating spheres, with reflectances less than 1.0 and with entrance and exit ports of finite areas, cannot achieve this idealized performance. They can be made to approach it closely, but they are not efficient at delivering flux to the detector for measurement, so irradiance and illuminance meters employing integrating spheres generally suffer lower sensitivities.
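The loss of efficiency can be estimated with the idealized sphere relation commonly found in integrating sphere design guides, in which the fraction of input flux leaving the exit port is f_e·ρ/(1 − ρ(1 − f)). Here ρ is the wall reflectance, f_e is the exit port area as a fraction of the sphere surface, and f is the total port fraction. This relation is not given in the article. It assumes a perfectly diffuse wall and ignores baffles and the detector's field of view, so the sketch below is only a rough guide, and its numerical values are hypothetical.

```python
def sphere_throughput(reflectance, exit_port_fraction, total_port_fraction):
    """Fraction of the input flux that leaves through the exit port of an idealized sphere."""
    multiplier = reflectance / (1.0 - reflectance * (1.0 - total_port_fraction))
    return exit_port_fraction * multiplier

# Hypothetical sphere: exit port 1% of the wall area, all ports together 2%.
for rho in (0.98, 0.90):
    print(rho, sphere_throughput(rho, 0.01, 0.02))
```

Even with a highly reflective coating only a modest fraction of the input flux reaches the exit port in this model, and the fraction falls steeply as the wall reflectance decreases.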

X. CALIBRATION OF RADIOMETERS AND PHOTOMETERS

A. Introduction

Radiometers and photometers involve a number of components, all contributing to the overall sensitivity of the instrument to incident radiation. Although one could in principle determine the contribution of each individual component to the overall calibration of the instrument, in practice this procedure is seldom used. Instead, the complete instrument is calibrated all at once.

Calibration is usually a two-step process. First one determines the mathematical transformation needed to convert an output electrical signal into an estimate of the input flux in the units desired for the quantity being measured. Second, one ensures the accuracy of this transformation over time as the characteristics of the components making up the radiometer or photometer change or drift.

There are two approaches to calibrating or recalibrating a radiometer/photometer. In the first case, one uses the radiometer/photometer to measure flux from a standard source whose emitted flux is known accurately in the desired units and then applies a suitable transformation to convert the output signal to the proper magnitude and units of the standard input. For this to work, it is critical that the overall response of the radiometer/photometer be constant over the period of time between calibrations. The output conversion transformation can be either in hardware, where the sensitivity of the radiometer is adjusted so that it reads correctly, or in “software,” where a calibration constant is multiplied by the output signal to convert it to the proper value and units every time a measurement is made.

In the second approach to calibration, one measures the flux from an uncalibrated source, first with the device to be calibrated, and then with an already-calibrated standard radiometer/photometer having identical field of view and spectral response. The output of the device is then calibrated to be identical to the measured result and units obtained with the standard radiometer/photometer. Once the calibration is performed, or the calibration transformation is known, it can be applied to subsequent measurements and the device is thereby said to be calibrated.
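The "software" form of calibration described above amounts to determining a single multiplicative constant and applying it to later readings. The sketch below assumes a linear response with a known dark signal. The reference value may come either from a standard source or from a standard radiometer viewing the same source, and all names and numbers are hypothetical.

```python
def calibration_constant(known_value, signal, dark_signal):
    """Constant k such that k * (S - S_o) reproduces the known reference value."""
    net = signal - dark_signal
    if net <= 0.0:
        raise ValueError("net signal must be positive to calibrate")
    return known_value / net

def apply_calibration(k, signal, dark_signal):
    """Convert a later raw reading into calibrated units."""
    return k * (signal - dark_signal)

# Hypothetical calibration: reference irradiance 100 W/m^2 gives a 0.051 V reading
# with a 0.001 V dark signal; a later reading of 0.026 V is then converted.
k = calibration_constant(100.0, 0.051, 0.001)
print(k, apply_calibration(k, 0.026, 0.001))
```

The constant is only valid while the instrument response remains unchanged, which is why the interval between recalibrations matters.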

B. Standard Sources

For radiometers and photometers whose calibration driftsslowly over time, one can calibrate the device when itis first fabricated and then recalibrate it periodically oversome acceptable time period. For most accurate results, itis advisable to recalibrate frequently at first, and to thenincrease the time interval between recalibrations only aftera history of drift has been established. For precise radiom-etry and photometry, a working standard or transfer stan-dard is used to make frequent calibration checks between(or even during) measurements to account for the effectsof small residual drifts in the calibration of a radiometeror photometer.

Historically, the focus of calibration was on the preparation of standard sources, most notably standard lamps, which produce a known and constant quantity of flux giving a known irradiance at a fixed distance from the emitting element. A typical measurement configuration is illustrated schematically in Fig. 27(a). There has been a shift to the use of calibrated detection standards; that is, detectors whose responsivity is sufficiently constant and reproducible over time to make it possible to calibrate other detectors or radiometers based on these standard detectors. Standard lamps are still available as calibrated sources, however. These generally produce a fixed output flux distribution with wavelength. Because of the possibility of nonlinear response effects, radiometers and photometers should be calibrated only over their ranges of linearity, within which a standard lamp can be found.

Many nations maintain primary standards for radiometry (and photometry) in national laboratories dedicated to this purpose. In the United States, such standards are maintained by the National Institute of Standards and Technology in Gaithersburg, MD. From these are derived secondary standards (also called transfer standards) that can be maintained at private laboratories or by other organizations for the purpose of calibrating and recalibrating commercially and custom produced radiometers and photometers. Working standards are standards derived from secondary standards but which are designed and intended for easy and repeated use to check the calibration of a radiometric or photometric system periodically during or between measurements.

FIGURE 27 (a) Calibration arrangement for irradiance/illuminance. (b) Calibration arrangement for radiance/luminance.

1. Calibration of Radiance and Irradiance Meters

Calibrations using a standard lamp frequently utilize specially designed tungsten filament lamps whose emitting characteristics are known to be quite constant over a period of time if the lamp is not frequently used. Such lamps must be operated with precisely the same electrical current through the filament as when their calibrations were initially set. Specially designed power supplies are made for use with such lamps. These power supplies ensure the constancy of this filament current and also keep track of how many hours the filament has been operated since initial calibration.

One can obtain irradiance standard lamps commercially and use them for the calibration of broadband irradiance sensors. They must be operated according to manufacturer specifications, and care must be taken to avoid stray light from the source reflecting from adjacent objects and into the radiometer being calibrated.

Over the years, researchers at the National Institute of Standards and Technology have worked to develop improved standards of spectral radiance and irradiance for the ultraviolet, visible, and near infrared portions of the spectrum. The publications of that U.S. government agency should be consulted for the details.

2. Calibration of Luminance and Illuminance Meters

Standard sources of radiance and irradiance that emit usable quantities of radiation over the visible portion of the spectrum can be used as standards for the calibration of photometers if the photometric outputs of these sources are known. Commercial radiometric and photometric standards laboratories generally can supply photometric calibrations for their radiometric sources for modest additional cost. The most common source is the incandescent filament lamp, with its characteristic spectral output distribution. If the primary use of the photometer being calibrated is to measure light levels derived from sources with similar spectral distributions, and if the V-lambda correction of the photometer is good, then use of tungsten filament standard lamps is an acceptable means of calibration.

If the photometer is intended for measurement of radiation with substantially different spectral distribution and the V-lambda correction is not good, then significant measurement errors can result from calibration using tungsten sources. Fortunately, other standard spectral distributions have been defined. They are based on phases of daylight (primarily for colorimetric applications). Sources exhibiting approximations of these distributions have been developed. For cases of imperfect V-lambda correction, it is recommended that calibration sources be used that more closely match the distributions to be measured with the photometer.
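The size of the error described above can be estimated with a spectral mismatch correction factor, a ratio of four weighted integrals commonly used in photometry. The sketch below is illustrative only and rests on loudly stated assumptions that are not from this article: the V-lambda curve is replaced by a crude Gaussian centered at 555 nm, the photometer responsivity is a hypothetical Gaussian shifted 10 nm toward the red, the tungsten lamp is modeled as a 2856 K blackbody, and the “daylight-like” test source is modeled as a 6500 K blackbody. Real work would use tabulated CIE data and measured responsivities.

```python
import math


def planck(wavelength_nm, temperature_k):
    """Relative blackbody spectral distribution (arbitrary units)."""
    lam = wavelength_nm * 1e-9
    c2 = 1.4388e-2  # second radiation constant, m K
    return lam ** -5 / (math.exp(c2 / (lam * temperature_k)) - 1.0)


def gaussian(wavelength_nm, center_nm, width_nm):
    return math.exp(-0.5 * ((wavelength_nm - center_nm) / width_nm) ** 2)


def integrate(f, g, wavelengths):
    """Trapezoidal integral of f(lambda) * g(lambda) over the grid."""
    total = 0.0
    for a, b in zip(wavelengths[:-1], wavelengths[1:]):
        total += 0.5 * (f(a) * g(a) + f(b) * g(b)) * (b - a)
    return total


def mismatch_factor(test_spd, cal_spd, v_lambda, responsivity, wavelengths):
    """Factor by which the uncorrected reading of the test source is multiplied."""
    num = integrate(test_spd, v_lambda, wavelengths) * integrate(cal_spd, responsivity, wavelengths)
    den = integrate(test_spd, responsivity, wavelengths) * integrate(cal_spd, v_lambda, wavelengths)
    return num / den


WAVELENGTHS = list(range(380, 781))                 # nm, 1 nm steps


def v_approx(lam):      # ASSUMPTION: Gaussian stand-in for V-lambda
    return gaussian(lam, 555.0, 42.0)


def detector(lam):      # ASSUMPTION: hypothetical imperfect photometer
    return gaussian(lam, 565.0, 42.0)


def tungsten(lam):      # ASSUMPTION: 2856 K blackbody as the standard lamp
    return planck(lam, 2856.0)


def daylight_like(lam):  # ASSUMPTION: 6500 K blackbody as the test source
    return planck(lam, 6500.0)


F = mismatch_factor(daylight_like, tungsten, v_approx, detector, WAVELENGTHS)
print(f"spectral mismatch correction factor = {F:.3f}")
```

The factor equals 1 when the test and calibration sources have the same spectral distribution, or when the photometer responsivity matches V-lambda exactly, which mirrors the two conditions under which the text regards tungsten-lamp calibration as acceptable.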

C. Calibrated Detectors

Calibrated silicon photodetectors are now available as transfer or working standards based on the NIST absolute spectral responsivity scale. Current information about NIST calibration services can be found at the web site http://www.physics.nist.gov.

D. National Standards Laboratories

Anyone concerned with calibration of radiometers and photometers can benefit greatly from the work of the National Institute of Standards and Technology and its counterparts in other countries.

NSSN offers a web-based comprehensive data network on national, foreign, regional, and international standards and regulatory documents. A cooperative partnership between the American National Standards Institute (ANSI), U.S. private-sector standards organizations, government agencies, and international standards organizations, NSSN can help in the identification and location of national standards laboratories offering services in radiometry and photometry outside the United States.


World Wide Web address: http://www.nssn.org. The web address for the International Organization for Standardization (ISO) is http://www.iso.ch. A list of national metrology laboratories can be found at http://www.vnist.gov/oiaa/national.htm.

ACKNOWLEDGMENT

Portions reprinted with permission from McCluney, R. (1994). “Introduction to Radiometry and Photometry,” Artech House, Inc., Norwood, MA. www.artechhouse.com.

SEE ALSO THE FOLLOWING ARTICLES

COLOR SCIENCE • INFRARED SPECTROSCOPY • LIGHT SOURCES • OPTICAL DETECTORS • POLARIZATION AND POLARIMETRY • PHOTONIC BANDGAP MATERIALS • RADIATION, ATMOSPHERIC • RADIATION EFFECTS IN ELECTRONIC MATERIALS AND DEVICES • RADIATION SOURCES • RADIO ASTRONOMY, PLANETARY • REMOTE SENSING FROM SATELLITES

BIBLIOGRAPHY

Biberman, L. M. (1967). “Apples Oranges and UnLumens,” Appl. Optics 6, 1127.

Boyd, R. W. (1983). “Radiometry and the Detection of Optical Radiation,” Wiley, New York.

Budde, W. (1983). “Optical Radiation Measurements,” Wiley, New York.

Chandrasekhar, S. (1960). “Radiative Transfer,” Dover Publications, New York.

CIE (1990). “CIE 1988 2° Spectral Luminous Efficiency Function for Photopic Vision,” Tech. Rept. CIE 86. CIE, Vienna, Austria.

CIE (1987). “International Lighting Vocabulary,” 4th ed., Publ. No. 17.4. Commission Internationale de l’Eclairage (CIE), Vienna, and International Electrotechnical Commission (IEC). [Available in the U.S. from TLA-Lighting Consultants, 7 Pond St., Salem, MA 01970.]

Dereniak, E. L., and Crowe, D. G. (1984). “Optical Radiation Detectors,” Wiley, New York.

Goebel, D. G. (1967). Generalized integrating sphere theory. Appl. Optics 6, 125–128.

Grum, F., and Becherer, R. J. (1979). “Optical Radiation Measurements. Volume 1 Radiometry,” Academic Press, New York.

IES (2000). “The IESNA Lighting Handbook: Reference and Application,” 9th ed., Illuminating Engineering Society of North America, New York.

McCluney, R. (1994). “Introduction to Radiometry and Photometry,” Artech House, Norwood, MA.

Meyer-Arendt, J. R. (1968). Radiometry and photometry: units and conversion factors. Appl. Optics 7, 2081–2084.

Nicodemus, F. E. (1963). Radiance. Am. J. Phys. 31, 368–377.

Nicodemus, F. E. (1976). “Self-Study Manual on Optical Radiation Measurements,” NBS Technical Note 910, U.S. Department of Commerce, National Institute of Standards and Technology, Gaithersburg, MD.

Siegel, R., and Howell, J. R. (1992). “Thermal Radiation Heat Transfer,” 3rd ed., Hemisphere Publishing/McGraw-Hill, New York.

Spiro, I. J., and Schlessinger, M. (1989). “Infrared Technology Fundamentals,” Marcel Dekker, New York.

Taylor, B. N. (1995). “Guide for the Use of the International System of Units (SI),” NIST Special Publication 811, National Institute of Standards and Technology, Gaithersburg, MD.

Welford, W. T., and Winston, R. (1989). “High Collection Nonimaging Optics,” Academic Press, New York.

Page 334: Encyclopedia of Physical Science and Technology - Classical Physics

P1: GTV/GUB P2: GTY Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN016K-743 July 31, 2001 16:18

Superstring Theory
John H. Schwarz
California Institute of Technology

I. Supersymmetry
II. String Theory Basics
III. Superstrings
IV. From Superstrings to M-Theory

GLOSSARY

Compactification The process by which extra spatial dimensions form a very small (compact) manifold and become invisible at low energies. To end up with four large dimensions, this manifold should have six dimensions in the case of superstring theory or seven dimensions in the case of M theory.

D-brane A special type of p-brane that has the property that a fundamental string can terminate on it. Mathematically, this corresponds to Dirichlet boundary conditions, which is the reason for the use of the letter D.

M-theory A conjectured quantum theory in eleven dimensions, which is approximated at low energies by eleven-dimensional supergravity. It arises as the strong coupling limit of the type IIA and $E_8 \times E_8$ heterotic string theories. The letter M stands for magic, mystery, or membrane according to taste.

p-brane A dynamical excitation in a string theory that has p spatial dimensions. The fundamental string, for example, is a 1-brane. All of the other p-branes have tensions that diverge at weak coupling, and therefore they are nonperturbative.

S duality An equivalence between two string theories (such as type I and SO(32) heterotic) which relates one at weak coupling to the other at strong coupling and vice versa.

String theory A relativistic quantum theory in which the fundamental objects are one-dimensional loops called strings. Unlike quantum field theories based on point particles, consistent string theories unify gravity with the other forces.

Supergravity A supersymmetric theory of gravity. In addition to a spacetime metric field that describes spin 2 gravitons, the quanta of gravity, these theories contain one or more spin 3/2 gravitino fields. The gravitino fields are gauge fields for local supersymmetry.

Superstring A supersymmetric string theory. At weak coupling there are five distinct superstring theories, each of which requires ten-dimensional spacetime (nine spatial dimensions and one time dimension). These five theories are related by various dualities, which imply that they are different limits of a single underlying theory.

Supersymmetry A special kind of symmetry that relates bosons (particles with integer intrinsic spin) to fermions (particles with half-integer intrinsic spin). Unlike other symmetries, the associated conserved charges transform as spinors. According to a fundamental theorem, supersymmetry is the unique possibility for a nontrivial extension of the known symmetries of spacetime (translations, rotations, and Lorentz transformations).

T duality An equivalence between two string theories (such as type IIA and type IIB) which relates one with a small circular spatial dimension to the other with a large circular spatial dimension and vice versa.

MANY of the major developments in fundamental physics of the past century arose from identifying and overcoming contradictions between existing ideas. For example, the incompatibility of Maxwell’s equations and Galilean invariance led Einstein to propose the special theory of relativity. Similarly, the inconsistency of special relativity with Newtonian gravity led him to develop a new theory of gravity, which he called the general theory of relativity. More recently, the reconciliation of special relativity with quantum mechanics led to the development of quantum field theory. We are now facing another crisis of the same character. Namely, general relativity appears to be incompatible with quantum field theory. Any straightforward attempt to “quantize” general relativity leads to a nonrenormalizable theory. This means that the theory is inconsistent and needs to be modified at short distances or high energies. The way that string theory does this is to give up one of the basic assumptions of quantum field theory, the assumption that elementary particles are mathematical points. Instead, it is a quantum field theory of one-dimensional extended objects called strings. There are very few consistent theories of this type, but superstring theory shows great promise as a unified quantum theory of all fundamental forces including gravity. So far, nobody has constructed a realistic string theory of elementary particles that could serve as a new standard model of particles and forces, since there is much that needs to be better understood first. But that, together with a deeper understanding of cosmology, is the goal. This is very much a work in progress.

Even though string theory is not yet fully formulated, and we cannot yet give a detailed description of how the standard model of elementary particles should emerge at low energies, there are some general features of the theory that can be identified. These are features that seem to be quite generic irrespective of how various details are resolved. The first, and perhaps most important, is that general relativity is necessarily incorporated in the theory. It gets modified at very short distances/high energies but at ordinary distances and energies it is present in exactly the form proposed by Einstein. This is significant, because it is arising within the framework of a consistent quantum theory. Ordinary quantum field theory does not allow gravity to exist; string theory requires it. The second general fact is that Yang–Mills gauge theories of the sort that comprise the standard model naturally arise in string theory. We do not understand why the specific Yang–Mills gauge theory based on the symmetry group SU(3) × SU(2) × U(1) should be preferred, but (anomaly-free) theories of this general type do arise naturally at ordinary energies. The third general feature of string theory solutions is that they possess a special kind of symmetry called supersymmetry. The mathematical consistency of string theory depends crucially on supersymmetry, and it is very hard to find consistent solutions (i.e., quantum vacua) that do not preserve at least a portion of this supersymmetry. This prediction of string theory differs from the other two (general relativity and gauge theories) in that it really is a prediction. It is a generic feature of string theory that has not yet been observed experimentally.

I. SUPERSYMMETRY

Even though supersymmetry is a very important part of the story, the discussion here will be very brief. Like the electroweak symmetry in the standard model, supersymmetry is necessarily a broken symmetry. A variety of arguments, not specific to string theory, suggest that the characteristic energy scale associated with supersymmetry breaking should be related to the electroweak scale, in other words, in the range 100 GeV–1 TeV. (Recall that the rest mass of a proton or neutron corresponds to an energy of approximately 1 GeV. Also, the masses of the $W^\pm$ and $Z^0$ particles, which transmit the weak nuclear forces, correspond to energies of approximately 100 GeV.) Supersymmetry implies that all known elementary particles should have partner particles whose masses are in this general range. If supersymmetry were not broken, these particles would have exactly the same masses as the known particles, and that is definitely excluded. This means that some of these superpartners should be observable at the CERN Large Hadron Collider (LHC), which is scheduled to begin operating in 2005 or 2006. There is even a chance that Fermilab Tevatron experiments could find superparticles before then. (CERN is a lab outside of Geneva, Switzerland, and Fermilab is located outside of Chicago, IL.)

In most versions of phenomenological supersymmetry there is a multiplicatively conserved quantum number called R-parity. All known particles have even R-parity, whereas their superpartners have odd R-parity. This implies that the superparticles must be pair-produced in particle collisions. It also implies that the lightest supersymmetric particle (or LSP) should be absolutely stable. It is not known with certainty which superparticle is the LSP, but one popular guess is that it is a “neutralino.” This is an electrically neutral fermion that is a quantum-mechanical mixture of the partners of the photon, $Z^0$, and neutral Higgs particles. Such an LSP would interact very weakly, more or less like a neutrino. It is of considerable interest, since it has properties that make it an excellent dark matter candidate. There are experimental searches underway in Europe and in the United States for a class of dark matter particles called WIMPs (weakly interacting massive particles). Since the LSP is an example of a WIMP, these searches could discover the LSP some day. However, the current experiments might not have sufficient detector volume to compensate for the exceedingly small LSP cross sections, so we may have to wait for future upgrades of the detectors.

There are three unrelated arguments that point to the same 100 GeV–1 TeV mass range for superparticles. The one we have just been discussing, a neutralino LSP as an important component of dark matter, requires a mass of about 100 GeV. The precise number depends on the mixture that comprises the LSP, the density of LSPs, and a number of other details. A second argument is based on a theoretical issue called the hierarchy problem. This is the fact that in the standard model quantum corrections tend to renormalize the Higgs mass to an unacceptably high value. The way to prevent this is to extend the standard model to a supersymmetric standard model and to have the supersymmetry be broken at a scale comparable to the Higgs mass, and hence to the electroweak scale. This works because the quantum corrections to the Higgs mass are milder in the supersymmetric version of the theory. The third argument that gives an estimate of the supersymmetry-breaking scale is based on grand unification. If one accepts the notion that the standard model gauge group is embedded in a larger group such as SU(5) or SO(10), which is broken at a high mass scale, then the three standard model coupling constants should unify at that mass scale. Given the spectrum of particles, one can compute the variation of the couplings as a function of energy using renormalization group equations. One finds that if one only includes the standard model particles this unification fails quite badly. However, if one also includes all the supersymmetry particles required by the minimal supersymmetric extension of the standard model, then the couplings do unify at an energy of about $2 \times 10^{16}$ GeV. This is a very striking success. For this agreement to take place, it is necessary that the masses of the superparticles be less than a few TeV.
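The grand unification argument can be checked roughly with one-loop renormalization group running. The following is a minimal sketch under explicit assumptions that are not taken from this article: the inverse couplings at the Z mass are approximate illustrative inputs (with the U(1) coupling in the usual SU(5) normalization), the standard one-loop coefficients are used, two-loop effects are ignored, and all superpartners are taken to contribute from the Z mass upward. The function name `meeting_point` is hypothetical.

```python
import math

M_Z = 91.19  # GeV

# ASSUMPTION: approximate inverse gauge couplings at M_Z for U(1), SU(2), SU(3),
# with the U(1) coupling in SU(5) normalization. Illustrative values only.
INV_ALPHA_MZ = (59.0, 29.6, 8.5)

# Standard one-loop beta-function coefficients.
B_SM = (41 / 10, -19 / 6, -7)
B_MSSM = (33 / 5, 1, -3)


def inv_alpha(i, t, b):
    """One-loop running: 1/alpha_i(mu) = 1/alpha_i(M_Z) - b_i/(2 pi) * t, t = ln(mu/M_Z)."""
    return INV_ALPHA_MZ[i] - b[i] / (2 * math.pi) * t


def meeting_point(b):
    """Scale where the U(1) and SU(2) couplings meet, and how far SU(3) misses there."""
    t = 2 * math.pi * (INV_ALPHA_MZ[0] - INV_ALPHA_MZ[1]) / (b[0] - b[1])
    scale = M_Z * math.exp(t)
    miss = inv_alpha(2, t, b) - inv_alpha(0, t, b)
    return scale, miss


scale_sm, miss_sm = meeting_point(B_SM)
scale_mssm, miss_mssm = meeting_point(B_MSSM)
print(f"SM:   couplings 1 and 2 meet at {scale_sm:.1e} GeV, 1/alpha_3 misses by {miss_sm:+.2f}")
print(f"MSSM: couplings 1 and 2 meet at {scale_mssm:.1e} GeV, 1/alpha_3 misses by {miss_mssm:+.2f}")
```

With these inputs the supersymmetric case brings the three inverse couplings together to within a small fraction of a unit near $2 \times 10^{16}$ GeV, while the standard model case misses by several units. The precise numbers depend on the assumed inputs and on the neglected threshold and two-loop corrections.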

There is other support for this picture, such as the ease with which supersymmetric grand unification explains the masses of the top and bottom quarks and electroweak symmetry breaking. Despite all these indications, we cannot be certain that supersymmetry at the electroweak scale really is correct until it is demonstrated experimentally. One could suppose that all the successes that we have listed are a giant coincidence, and the correct description of TeV scale physics is based on something entirely different. The only way we can decide for sure is by doing the experiments. I am optimistic that supersymmetry will be found, and that the experimental study of the detailed properties of the superparticles will teach us a great deal.

II. STRING THEORY BASICS

A. Basic Ideas of String Theory

In conventional quantum field theory the elementary particles are mathematical points, whereas in perturbative string theory the fundamental objects are one-dimensional loops (of zero thickness). Strings have a characteristic length scale, which can be estimated by dimensional analysis. Since string theory is a relativistic quantum theory that includes gravity, it must involve the fundamental constants $c$ (the speed of light), $\hbar$ (Planck’s constant divided by $2\pi$), and $G$ (Newton’s gravitational constant). From these one can form a length, known as the Planck length

$$\ell_p = \left(\frac{\hbar G}{c^3}\right)^{1/2} = 1.6 \times 10^{-33}\ \text{cm}. \tag{1}$$

Similarly, the Planck mass is

$$m_p = \left(\frac{\hbar c}{G}\right)^{1/2} = 1.2 \times 10^{19}\ \text{GeV}/c^2. \tag{2}$$

Experiments at energies far below the Planck energy cannot resolve distances as short as the Planck length. Thus, at such energies, strings can be accurately approximated by point particles. From the viewpoint of string theory, this explains why quantum field theory has been so successful.
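The quoted values of the Planck length and Planck mass follow from the combinations $(\hbar G/c^3)^{1/2}$ and $(\hbar c/G)^{1/2}$. As a quick numerical check, the sketch below evaluates both in SI units and converts to the units used in the text. The constant values are standard published ones supplied here for illustration; they do not appear in the article itself.

```python
import math

# Physical constants in SI units (standard published values).
HBAR = 1.054571817e-34           # J s, Planck's constant divided by 2 pi
G = 6.67430e-11                  # m^3 kg^-1 s^-2, Newton's gravitational constant
C = 2.99792458e8                 # m/s, speed of light
JOULE_PER_GEV = 1.602176634e-10  # J per GeV

# Planck length, converted from meters to centimeters.
planck_length_cm = math.sqrt(HBAR * G / C ** 3) * 100.0

# Planck mass, expressed as a rest energy in GeV (that is, in units of GeV/c^2).
planck_mass_gev = math.sqrt(HBAR * C / G) * C ** 2 / JOULE_PER_GEV

print(f"Planck length = {planck_length_cm:.2e} cm")
print(f"Planck mass   = {planck_mass_gev:.2e} GeV/c^2")
```

Both results agree with the rounded figures in the text, $1.6 \times 10^{-33}$ cm and $1.2 \times 10^{19}$ GeV/$c^2$.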

As a string evolves in time it sweeps out a two-dimensional surface in spacetime, which is called the world sheet of the string. This is the string counterpart of the world line for a point particle. In quantum field theory, analyzed in perturbation theory, contributions to amplitudes are associated with Feynman diagrams, which depict possible configurations of world lines. In particular, interactions correspond to junctions of world lines. Similarly, perturbative string theory involves string world sheets of various topologies. A particularly significant fact is that these world sheets are generically smooth. The existence of interaction is a consequence of world-sheet topology rather than a local singularity on the world sheet. This difference from point-particle theories has two important implications. First, in string theory the structure of interactions is uniquely determined by the free theory. There are no arbitrary interactions to be chosen. Second, the occurrence of ultraviolet divergences in point-particle quantum field theories can be traced to the fact that interactions are associated with world-line junctions at specific spacetime points. Because the string world sheet is smooth, without any singular behavior at short distances, string theory has no ultraviolet divergences.

B. A Brief History of String Theory

String theory arose in the late 1960s out of an attempt to describe the strong nuclear force, which acts on a class of particles called hadrons. The first string theory that was constructed contained only bosons. The construction of a better string theory that also includes fermions led to the discovery of supersymmetric strings (later called superstrings) in 1971. The subject fell out of favor around 1973 with the development of quantum chromodynamics (QCD), which was quickly recognized to be the correct theory of strong interactions. Also, string theories had various peculiar features, such as extra dimensions and massless particles, which are not appropriate for a hadron theory.

Among the massless string states there is one that corresponds to a particle with two units of spin. In 1974, it was shown by Joel Scherk and the author (Scherk and Schwarz, 1974), and independently by Yoneya (1974), that this particle interacts like a graviton, so that string theory actually contains general relativity. This led us to propose that string theory should be used for unification of all elementary particles and forces rather than as a theory of hadrons and the strong nuclear force. This implied, in particular, that the string length scale should be comparable to the Planck length, rather than the size of hadrons ($10^{-13}$ cm), as had been previously assumed.

In the period now known as the “first superstring revolution,” which took place in 1984–1985, there were a number of important developments (described later in this article) that convinced a large segment of the theoretical physics community that this is a worthy area of research. By the time the dust settled in 1985 we had learned that there are five distinct consistent string theories, and that each of them requires spacetime supersymmetry in ten dimensions (nine spatial dimensions plus time). The theories, which will be described later, are called type I, type IIA, type IIB, SO(32) heterotic, and $E_8 \times E_8$ heterotic. In the “second superstring revolution,” which took place around 1995, we learned that the five string theories are actually special solutions of a completely unique underlying theory.

C. Compactification

In the context of the original goal of string theory, which was to explain hadron physics, extra dimensions are unacceptable. However, in a theory that incorporates general relativity, the geometry of spacetime is determined dynamically. Thus one could imagine that the theory admits consistent quantum solutions in which the six extra spatial dimensions form a compact space, too small to have been observed. The natural first guess is that the size of this space should be comparable to the string scale and the Planck length. Since the equations of the theory must be satisfied, the geometry of this six-dimensional space is not arbitrary. A particularly appealing possibility, which is consistent with the equations, is that it forms a type of space called a Calabi–Yau space (Candelas et al., 1985).

Calabi–Yau compactification, in the context of the $E_8 \times E_8$ heterotic string theory, can give a low-energy effective theory that closely resembles a supersymmetric extension of the standard model. There is actually a lot of freedom, because there are very many different Calabi–Yau spaces, and there are other arbitrary choices that can be made. Still, it is interesting that one can come quite close to realistic physics. It is also interesting that the number of quark and lepton families that one obtains is determined by the topology of the Calabi–Yau space. Thus, for suitable choices, one can arrange to end up with exactly three families. People were very excited by this scenario in 1985. Today, we tend to make a more sober appraisal that emphasizes all the arbitrariness that is involved, and the things that don’t work exactly right. Still, it would not be surprising if some aspects of this picture survive as part of the story when we understand the right way to describe the real world.

D. Perturbation Theory

Until 1995 it was only understood how to formulate string theories in terms of perturbation expansions. Perturbation theory is useful in a quantum theory that has a small dimensionless coupling constant, such as quantum electrodynamics, since it allows one to compute physical quantities as power series expansions in the small parameter. In quantum electrodynamics (QED) the small parameter is the fine-structure constant $\alpha \sim 1/137$. Since this is quite small, perturbation theory works very well for QED. For a physical quantity $T(\alpha)$, one computes (using Feynman diagrams)

$$T(\alpha) = T_0 + \alpha T_1 + \alpha^2 T_2 + \cdots. \tag{3}$$

It is the case generically in quantum field theory that expansions of this type are divergent. More specifically, they are asymptotic expansions with zero radius of convergence. Nonetheless, they can be numerically useful if the expansion parameter is small. The problem is that there are various nonperturbative contributions (such as instantons) that have the structure

$$T_{NP} \sim e^{-(\text{const.}/\alpha)}. \tag{4}$$
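The behavior of an asymptotic expansion with zero radius of convergence can be seen in a simple toy integral. The example below is not taken from the article and is not a field-theory calculation: it uses the function $F(\alpha) = \int_0^\infty e^{-t}/(1+\alpha t)\,dt$, whose formal power series is $\sum_n (-1)^n n!\,\alpha^n$. The partial sums approach the true value until roughly order $1/\alpha$ and then run away, and the best achievable accuracy is exponentially small in $1/\alpha$, the same form as the nonperturbative terms above. The value $\alpha = 0.1$ and the integration cutoff are arbitrary choices made for illustration.

```python
import math


def exact(alpha, upper=60.0, steps=60000):
    """Simpson's rule for the integral of exp(-t)/(1 + alpha t) from 0 to `upper`.

    The neglected tail beyond t = 60 is smaller than exp(-60).
    """
    h = upper / steps

    def f(t):
        return math.exp(-t) / (1.0 + alpha * t)

    total = f(0.0) + f(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0


def partial_sum(alpha, order):
    """Formal series sum_{n=0}^{order} (-1)^n n! alpha^n."""
    return sum((-1) ** n * math.factorial(n) * alpha ** n for n in range(order + 1))


alpha = 0.1
target = exact(alpha)
errors = [abs(partial_sum(alpha, n) - target) for n in range(31)]
best_order = min(range(31), key=errors.__getitem__)

print(f"numerical value          = {target:.6f}")
print(f"best truncation order    = {best_order}, error = {errors[best_order]:.1e}")
print(f"error at order 30        = {errors[30]:.1e}")
print(f"exp(-1/alpha)            = {math.exp(-1 / alpha):.1e}")
```

Truncating near the smallest term gives a useful approximation, while adding more terms makes the result worse without limit.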

In a theory such as QCD, there are problems for which perturbation theory is useful (due to asymptotic freedom) and other ones where it is not. For problems of the latter type, such as computing the hadron spectrum, nonperturbative methods of computation, such as lattice gauge theory, are required.

In the case of string theory the dimensionless string coupling constant, denoted $g_s$, is determined dynamically by the expectation value of a scalar field called the dilaton. There is no particular reason that this number should be small. So it is unlikely that a realistic vacuum could be analyzed accurately using perturbation theory. More importantly, these theories have many qualitative properties that are inherently nonperturbative. So one needs nonperturbative methods to understand them.

E. The Second Superstring Revolution

Around 1995 some amazing and unexpected “dualities” were discovered that provided the first glimpses into nonperturbative features of string theory. These dualities were quickly recognized to have three major implications.

The dualities enabled us to relate all five of the superstring theories to one another. This meant that, in a fundamental sense, they are all equivalent to one another. Another way of saying this is that there is a unique underlying theory, and what we had been calling five theories are better viewed as perturbation expansions of this underlying theory about five different points (in the space of consistent quantum vacua). This was a profoundly satisfying realization, since we really didn’t want five theories of nature. That there is a completely unique theory, without any dimensionless parameters, is the best outcome for which one could have hoped. To avoid confusion, it should be emphasized that even though the theory is unique, it is entirely possible that there are many consistent quantum vacua. Classically, the corresponding statement is that a unique equation can admit many solutions. It is a particular solution (or quantum vacuum) that ultimately must describe nature. At least, this is how a particle physicist would say it. If we hope to understand the origin and evolution of the universe, in addition to properties of elementary particles, it would be nice if we could also understand cosmological solutions.

A second crucial discovery was that the theory admits a variety of nonperturbative excitations, called p-branes, in addition to the fundamental strings. The letter p labels the number of spatial dimensions of the excitation. Thus, in this language, a point particle is a 0-brane, a string is a 1-brane, and so forth. The reason that p-branes were not discovered in perturbation theory is that they have tension (or energy density) that diverges as $g_s \to 0$. Thus they are absent from the perturbative theory.

The third major discovery was that the underlying theory also has an eleven-dimensional solution, which is called M-theory. Later, we will explain how the eleventh dimension arises.

One type of duality is called S duality. (The choice of the letter S has no great significance.) Two string theories (let’s call them A and B) are related by S duality if one of them evaluated at strong coupling is equivalent to the other one evaluated at weak coupling. Specifically, for any physical quantity $f$, one has

$$f_A(g_s) = f_B(1/g_s). \tag{5}$$

Two of the superstring theories, type I and SO(32) heterotic, are related by S duality in this way. The type IIB theory is self-dual. Thus S duality is a symmetry of the IIB theory, and this symmetry is unbroken if $g_s = 1$. Thanks to S duality, the strong coupling behavior of each of these three theories is determined by a weak-coupling analysis. The remaining two theories, type IIA and $E_8 \times E_8$ heterotic, behave very differently at strong coupling. They grow an eleventh dimension.

Another astonishing duality, which goes by the name of T duality, was discovered several years earlier. It can be understood in perturbation theory, which is why it was found first. But, fortunately, it often continues to be valid even at strong coupling. T duality can relate different compactifications of different theories. For example, suppose theory A has a compact dimension that is a circle of radius $R_A$ and theory B has a compact dimension that is a circle of radius $R_B$. If these two theories are related by T duality this means that they are equivalent provided that

$$R_A R_B = (\ell_s)^2, \tag{6}$$

where $\ell_s$ is the fundamental string length scale. This has the amazing implication that when one of the circles becomes small the other one becomes large. Later, we will explain how this is possible. T duality relates the two type II theories and the two heterotic theories. There are more complicated examples of the same phenomenon involving compact spaces that are more complicated than a circle, such as tori, K3, Calabi–Yau spaces, etc.
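The inverse relation between the two radii in Eq. (6) can be made concrete with a few numbers. The following is only an illustrative sketch: the function name `t_dual_radius` and the choice of units in which the string length equals 1 are assumptions made here, not part of the text.

```python
def t_dual_radius(radius: float, string_length: float = 1.0) -> float:
    """Radius of the T-dual circle, from R_A * R_B = string_length**2 (Eq. 6)."""
    if radius <= 0 or string_length <= 0:
        raise ValueError("radius and string length must be positive")
    return string_length ** 2 / radius


if __name__ == "__main__":
    # Radii are measured in units of the string length (an assumption of this sketch).
    for r_a in (0.01, 0.5, 1.0, 2.0, 100.0):
        print(f"R_A = {r_a:8.2f}  ->  R_B = {t_dual_radius(r_a):8.2f}")
```

Applying the map twice returns the original radius, and the radius equal to the string length is mapped to itself.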

F. The Origins of Gauge Symmetry

There are a variety of mechanisms that can give rise to Yang–Mills type gauge symmetries in string theory. Here, we will focus on two basic possibilities: Kaluza–Klein symmetries and brane symmetries.

The basic Kaluza–Klein idea goes back to the 1920s, though it has been much generalized since then. The idea is to suppose that the ten- or eleven-dimensional geometry has a product structure $M \times K$, where $M$ is Minkowski spacetime and $K$ is a compact manifold. Then, if $K$ has symmetries, these appear as gauge symmetries of the effective theory defined on $M$. The Yang–Mills gauge fields arise as components of the gravitational metric field with one direction along $K$ and the other along $M$. For example, if the space $K$ is an n-dimensional sphere, the symmetry group is SO(n + 1), if it is $CP^n$ (which has 2n dimensions) it is SU(n + 1), and so forth. Elegant as this may be, it seems unlikely that a realistic $K$ has any such symmetries. Calabi–Yau spaces, for example, do not have any.

A rather more promising way of achieving realistic gauge symmetries is via the brane approach. Here the idea is that a certain class of p-branes (called D-branes) have gauge fields that are restricted to their world volume. This means that the gauge fields are not defined throughout the ten- or eleven-dimensional spacetime but only on the (p + 1)-dimensional hypersurface defined by the D-branes. This picture suggests that the world we observe might be a D-brane embedded in a higher dimensional space. In such a scenario, there can be two kinds of extra dimensions: compact dimensions along the brane and compact dimensions perpendicular to the brane.

The traditional viewpoint, which in my opinion is still the best bet, is that all extra dimensions (of both types) have sizes of order $10^{-30}$–$10^{-32}$ cm corresponding to an energy scale of $10^{16}$–$10^{18}$ GeV. This makes them inaccessible to direct observation, though their existence would have definite low-energy consequences. However, one can and should ask “what are the experimental limits?” For compact dimensions along the brane, which support gauge fields, the nonobservation of extra dimensions in tests of the standard model implies a bound of about 1 TeV. The LHC should extend this to about 10 TeV. For compact dimensions “perpendicular to the brane,” which only support excitations with gravitational strength forces, the best bounds come from Cavendish-type experiments, which test the $1/R^2$ structure of the Newton force law at short distances. No deviations have been observed to a distance of about 1 mm so far. Experiments planned in the near future should extend the limit to about 100 µm. Obviously, observation of any deviation from $1/R^2$ would be a major discovery.
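The correspondence between the quoted sizes and energy scales follows from the order-of-magnitude relation E ≈ ħc/L. The sketch below assumes the standard value ħc ≈ 1.973 × 10⁻¹⁴ GeV cm, which is not given in the text, and the function names are introduced here purely for illustration.

```python
HBAR_C_GEV_CM = 1.973269804e-14  # hbar * c in GeV * cm (assumed standard value)


def energy_scale_gev(length_cm: float) -> float:
    """Energy scale (GeV) associated with a length (cm) via E ~ hbar * c / L."""
    return HBAR_C_GEV_CM / length_cm


def length_scale_cm(energy_gev: float) -> float:
    """Length scale (cm) associated with an energy (GeV) via L ~ hbar * c / E."""
    return HBAR_C_GEV_CM / energy_gev


if __name__ == "__main__":
    for size in (1e-30, 1e-32):
        print(f"L = {size:.0e} cm  ->  E ~ {energy_scale_gev(size):.1e} GeV")
    print(f"E = 1 TeV  ->  L ~ {length_scale_cm(1e3):.1e} cm")
```

With these inputs the two sizes map to roughly 2 × 10¹⁶ GeV and 2 × 10¹⁸ GeV, in line with the range quoted above, and 1 TeV corresponds to a length of about 2 × 10⁻¹⁷ cm.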

G. Conclusion

This introductory section has sketched some of the remarkable successes that string theory has achieved over the past 30 years. There are many others that did not fit in this brief survey. Despite all this progress, there are some very important and fundamental questions whose answers are unknown. It seems that whenever a breakthrough occurs, a host of new questions arise, and the ultimate goal still seems a long way off. To convince you that there is a long way to go, let us list some of the most important questions.

What is the theory? Even though a great deal is known about string theory and M-theory, it seems that the optimal formulation of the underlying theory has not yet been found. It might be based on principles that have not yet been formulated.

We are convinced that supersymmetry is present at high energies and probably at the electroweak scale, too. But we do not know how or why it is broken.

A very crucial problem concerns the energy density of the vacuum, which is a physical quantity in a gravitational theory. This is characterized by the cosmological constant $\Lambda$, which observationally appears to have a small positive value, so that the vacuum energy of the universe is comparable to the energy in matter. In Planck units this is a tiny number ($\Lambda \sim 10^{-120}$). If supersymmetry were unbroken, we could argue that $\Lambda = 0$, but if it is broken at the 1 TeV scale, that would seem to suggest $\Lambda \sim 10^{-60}$, which is very far from the truth. Despite an enormous amount of effort and ingenuity, it is not yet clear how superstring theory will conspire to break supersymmetry at the TeV scale and still give a value for $\Lambda$ that is much smaller than $10^{-60}$. The fact that the desired result is about the square of this might be a useful hint.

Even though the underlying theory is unique, there seem to be many consistent quantum vacua. We would very much like to formulate a theoretical principle (not based on observation) for choosing among these vacua. It is not known whether the right approach to the answer is cosmological, probabilistic, anthropic, or something else.

II. STRING THEORY BASICS

In this section we will describe the world-sheet dynamics of the original bosonic string theory. As we will see, this theory has various unrealistic and unsatisfactory properties. Nonetheless it is a useful preliminary before describing supersymmetric strings, because it allows us to introduce many of the key concepts without simultaneously addressing the added complications associated with fermions and supersymmetry.

We will describe string dynamics from a first-quantized world-sheet sum-over-histories point of view. This approach is closely tied to perturbation theory analysis. It should be contrasted with “second quantized” string field theory, which is based on field operators that create or destroy entire strings. To explain the methodology, let us begin by reviewing the world-line description of a massive point particle.

A. World-Line Description of a Massive Point Particle

A point particle sweeps out a trajectory (or world line) in spacetime. This can be described by functions $x^\mu(\tau)$ that describe how the world line, parameterized by $\tau$, is embedded in the spacetime, whose coordinates are denoted $x^\mu$. For simplicity, let us assume that the spacetime is flat Minkowski space with a Lorentz metric

$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{7}$$

Then, the Lorentz invariant line element is given by

$$ds^2 = -\eta_{\mu\nu}\, dx^\mu\, dx^\nu. \tag{8}$$

In units $\hbar = c = 1$, the action for a particle of mass $m$ is given by

$$S = -m \int ds. \tag{9}$$

This could be generalized to a curved spacetime by replacing $\eta_{\mu\nu}$ by a metric $g_{\mu\nu}(x)$, but we will not do so here. In terms of the embedding functions, $x^\mu(\tau)$, the action can be rewritten in the form

$$S = -m \int d\tau \sqrt{-\eta_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu}, \tag{10}$$

where dots represent $\tau$ derivatives. An important property of this action is invariance under local reparametrizations. This is a kind of gauge invariance, whose meaning is that the form of $S$ is unchanged under an arbitrary reparametrization of the world line $\tau \to \tilde{\tau}(\tau)$. Actually, one should require that the function $\tilde{\tau}(\tau)$ is smooth and monotonic ($d\tilde{\tau}/d\tau > 0$). The reparametrization invariance is a one-dimensional analog of the four-dimensional general coordinate invariance of general relativity. Mathematicians refer to this kind of symmetry as diffeomorphism invariance.

The reparametrization invariance of $S$ allows us to choose a gauge. A nice choice is the “static gauge”

$$x^0 = \tau. \tag{11}$$

In this gauge (renaming the parameter $t$) the action becomes

$$S = -m \int \sqrt{1 - \vec{v}^2}\; dt, \tag{12}$$

where

$$\vec{v} = \frac{d\vec{x}}{dt}. \tag{13}$$

Requiring this action to be stationary under an arbitrary variation of $\vec{x}(t)$ gives the Euler–Lagrange equations

$$\frac{d\vec{p}}{dt} = 0, \tag{14}$$

where

$$\vec{p} = \frac{\delta S}{\delta \vec{v}} = \frac{m\vec{v}}{\sqrt{1 - \vec{v}^2}}, \tag{15}$$

which is the usual result. So we see that standard relativistic kinematics follows from the action $S = -m \int ds$.
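The step from Eq. (12) to Eq. (15) can be checked numerically. The sketch below restricts to motion along one spatial direction and differentiates the static-gauge Lagrangian $L = -m\sqrt{1 - v^2}$ by a centered finite difference; the function names and the step size are illustrative choices, not part of the text.

```python
import math


def lagrangian(v: float, m: float = 1.0) -> float:
    """Static-gauge Lagrangian L = -m * sqrt(1 - v^2), in units with c = 1."""
    return -m * math.sqrt(1.0 - v * v)


def momentum_numeric(v: float, m: float = 1.0, h: float = 1e-6) -> float:
    """dL/dv estimated with a centered finite difference of step h."""
    return (lagrangian(v + h, m) - lagrangian(v - h, m)) / (2.0 * h)


def momentum_exact(v: float, m: float = 1.0) -> float:
    """Relativistic momentum m v / sqrt(1 - v^2) from Eq. (15)."""
    return m * v / math.sqrt(1.0 - v * v)


if __name__ == "__main__":
    for v in (0.1, 0.5, 0.9):
        print(v, momentum_numeric(v), momentum_exact(v))
```

The two columns should agree up to the small discretization error of the finite difference.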

B. World-Volume Actions

We can now generalize the analysis of the massive point particle to a p-brane of tension $T_p$. The action in this case involves the invariant (p + 1)-dimensional volume and is given by

$$S_p = -T_p \int d\mu_{p+1}, \tag{16}$$

where the invariant volume element is

$$d\mu_{p+1} = \sqrt{-\det\left(-\eta_{\mu\nu}\, \partial_\alpha x^\mu\, \partial_\beta x^\nu\right)}\; d^{p+1}\sigma. \tag{17}$$

Here the embedding of the p-brane into d-dimensional spacetime is given by functions $x^\mu(\sigma^\alpha)$. The index $\alpha = 0, \ldots, p$ labels the p + 1 coordinates $\sigma^\alpha$ of the p-brane world-volume and the index $\mu = 0, \ldots, d - 1$ labels the d coordinates $x^\mu$ of the d-dimensional spacetime. We have defined

$$\partial_\alpha x^\mu = \frac{\partial x^\mu}{\partial \sigma^\alpha}. \tag{18}$$

The determinant operation acts on the (p + 1) × (p + 1) matrix whose rows and columns are labeled by $\alpha$ and $\beta$. The tension $T_p$ is interpreted as the mass per unit volume of the p-brane. For a 0-brane, it is just the mass. The action $S_p$ is reparametrization invariant. In other words, substituting $\sigma^\alpha = \sigma^\alpha(\tilde{\sigma}^\beta)$, it takes the same form when expressed in terms of the coordinates $\tilde{\sigma}^\alpha$.

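Let us now specialize to the string, p = 1. Evaluating the determinant gives

$$S[x] = -T \int d\sigma\, d\tau\, \sqrt{\dot{x}^2 x'^2 - (\dot{x} \cdot x')^2}, \tag{19}$$

where we have defined $\sigma^0 = \tau$, $\sigma^1 = \sigma$, and

$$\dot{x}^\mu = \frac{\partial x^\mu}{\partial \tau}, \qquad x'^\mu = \frac{\partial x^\mu}{\partial \sigma}. \tag{20}$$

This action, called the Nambu–Goto action, was first proposed in 1970 (Nambu, 1970 and Goto, 1971). The Nambu–Goto action is equivalent to the action

$$S[x, h] = -\frac{T}{2} \int d^2\sigma\, \sqrt{-h}\, h^{\alpha\beta}\, \eta_{\mu\nu}\, \partial_\alpha x^\mu\, \partial_\beta x^\nu, \tag{21}$$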

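where $h_{\alpha\beta}(\sigma, \tau)$ is the world-sheet metric, $h = \det h_{\alpha\beta}$, and $h^{\alpha\beta}$ is the inverse of $h_{\alpha\beta}$. The Euler–Lagrange equations obtained by varying $h^{\alpha\beta}$ are

$$T_{\alpha\beta} = \partial_\alpha x \cdot \partial_\beta x - \frac{1}{2} h_{\alpha\beta}\, h^{\gamma\delta}\, \partial_\gamma x \cdot \partial_\delta x = 0. \tag{22}$$

The equation $T_{\alpha\beta} = 0$ can be used to eliminate the world-sheet metric from the action, and when this is done one recovers the Nambu–Goto action. (To show this take the determinant of both sides of the equation $\partial_\alpha x \cdot \partial_\beta x = \frac{1}{2} h_{\alpha\beta}\, h^{\gamma\delta}\, \partial_\gamma x \cdot \partial_\delta x$.)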
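The elimination of the world-sheet metric can be illustrated numerically at a single world-sheet point. The sketch below assumes that the constraint is solved by a world-sheet metric proportional to the induced metric $G_{\alpha\beta} = \partial_\alpha x \cdot \partial_\beta x$, with an arbitrary positive factor. It then checks that $T_{\alpha\beta}$ vanishes and that the integrand $\frac{1}{2}\sqrt{-h}\, h^{\alpha\beta} G_{\alpha\beta}$ reduces to $\sqrt{-\det G}$ (with the mostly-plus metric of Eq. (7), $\det G$ is negative for a timelike $\dot{x}$ and spacelike $x'$). All function names and sample vectors are hypothetical.

```python
import math


def minkowski_dot(a, b):
    """Dot product with the mostly-plus metric diag(-1, 1, ..., 1) of Eq. (7)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def induced_metric(xdot, xprime):
    """2x2 matrix G_ab = d_a x . d_b x built from the two tangent vectors."""
    return [[minkowski_dot(xdot, xdot), minkowski_dot(xdot, xprime)],
            [minkowski_dot(xprime, xdot), minkowski_dot(xprime, xprime)]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def trace_with_inverse(h, g):
    """h^{ab} G_ab, using the inverse of the world-sheet metric h."""
    h_inv = inverse2(h)
    return sum(h_inv[a][b] * g[a][b] for a in range(2) for b in range(2))


def polyakov_density(h, g):
    """(1/2) sqrt(-det h) h^{ab} G_ab, the integrand of Eq. (21) without -T."""
    return 0.5 * math.sqrt(-det2(h)) * trace_with_inverse(h, g)


def stress_tensor(h, g):
    """T_ab = G_ab - (1/2) h_ab h^{cd} G_cd, as in Eq. (22)."""
    tr = trace_with_inverse(h, g)
    return [[g[a][b] - 0.5 * h[a][b] * tr for b in range(2)] for a in range(2)]


if __name__ == "__main__":
    g = induced_metric((1.0, 0.2, 0.0), (0.1, 1.0, 0.3))  # sample tangent vectors
    h = [[3.7 * entry for entry in row] for row in g]      # h = 3.7 * G
    print(stress_tensor(h, g))                             # entries close to zero
    print(polyakov_density(h, g), math.sqrt(-det2(g)))     # the two should agree
```

The arbitrary factor 3.7 drops out of both checks, which anticipates the local rescaling symmetry of the action.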

In addition to reparametrization invariance, the action $S[x, h]$ has another local symmetry, called conformal invariance (or Weyl invariance). Specifically, it is invariant under the replacement

$$h_{\alpha\beta} \to \Lambda(\sigma, \tau)\, h_{\alpha\beta}, \qquad x^\mu \to x^\mu. \tag{23}$$

This local symmetry is special to the p = 1 case (strings).

The two reparametrization invariance symmetries of $S[x, h]$ allow us to choose a gauge in which the three functions $h_{\alpha\beta}$ (this is a symmetric 2 × 2 matrix) are expressed in terms of just one function. A convenient choice is the “conformally flat gauge”

$$h_{\alpha\beta} = \eta_{\alpha\beta}\, e^{\phi(\sigma, \tau)}. \tag{24}$$

Here $\eta_{\alpha\beta}$ denotes the two-dimensional Minkowski metric of a flat world-sheet. However, because of the factor $e^{\phi}$, $h_{\alpha\beta}$ is only “conformally flat.” Classically, substitution of this gauge choice into $S[x, h]$ yields the gauge-fixed action

$$S = \frac{T}{2} \int d^2\sigma\, \eta^{\alpha\beta}\, \partial_\alpha x \cdot \partial_\beta x. \tag{25}$$

Quantum mechanically, the story is more subtle. Instead of eliminating $h$ via its classical field equations, one should perform a Feynman path integral, using standard machinery to deal with the local symmetries and gauge fixing. When this is done correctly, one finds that in general $\phi$ does not decouple from the answer. Only for the special case d = 26 does the quantum analysis reproduce the formula we have given based on classical reasoning (Polyakov, 1981). Otherwise, there are correction terms whose presence can be traced to a conformal anomaly (i.e., a quantum-mechanical breakdown of the conformal invariance).

The gauge-fixed action [Eq. (25)] is quadratic in the x’s. Mathematically, it is the same as a theory of d free scalar fields in two dimensions. The equations of motion obtained by varying $x^\mu$ are simply free two-dimensional wave equations:

$$\ddot{x}^\mu - x''^\mu = 0. \tag{26}$$

This is not the whole story, however, because we must also take account of the constraints $T_{\alpha\beta} = 0$. Evaluated in the conformally flat gauge, these constraints are

$$T_{01} = T_{10} = \dot{x} \cdot x' = 0, \qquad T_{00} = T_{11} = \frac{1}{2}\left(\dot{x}^2 + x'^2\right) = 0. \tag{27}$$

Adding and subtracting gives

$$(\dot{x} \pm x')^2 = 0. \tag{28}$$

C. Boundary Conditions

To go further, one needs to choose boundary conditions. There are three important types. For a closed string one should impose periodicity in the spatial parameter $\sigma$. Choosing its range to be $\pi$ (as is conventional),

$$x^\mu(\sigma, \tau) = x^\mu(\sigma + \pi, \tau). \tag{29}$$

For an open string (which has two ends), each end can be required to satisfy either Neumann or Dirichlet boundary conditions (for each value of $\mu$):

$$\text{Neumann:} \quad \frac{\partial x^\mu}{\partial \sigma} = 0 \quad \text{at } \sigma = 0 \text{ or } \pi \tag{30}$$

$$\text{Dirichlet:} \quad \frac{\partial x^\mu}{\partial \tau} = 0 \quad \text{at } \sigma = 0 \text{ or } \pi. \tag{31}$$

The Dirichlet condition can be integrated, and then it specifies a spacetime location on which the string ends. The only way this makes sense is if the open string ends on a physical object: it ends on a D-brane. (D stands for Dirichlet.) If all the open-string boundary conditions are Neumann, then the ends of the string can be anywhere in the spacetime. The modern interpretation is that this means that there are spacetime-filling D-branes present.

Let us now consider the closed-string case in more detail. The general solution of the two-dimensional wave equation is given by a sum of “right-movers” and “left-movers”:

$$x^\mu(\sigma, \tau) = x_R^\mu(\tau - \sigma) + x_L^\mu(\tau + \sigma). \tag{32}$$

These should be subject to the following additional conditions:

1. $x^\mu(\sigma, \tau)$ is real

2. $x^\mu(\sigma + \pi, \tau) = x^\mu(\sigma, \tau)$

3. $(x'_L)^2 = (x'_R)^2 = 0$; these are the $T_{\alpha\beta} = 0$ constraints in Eq. (28)
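The first two conditions and the wave equation can be checked numerically for a sample field. The sketch below builds one real component from oscillating modes of the form $e^{-2in(\tau \mp \sigma)}$ with made-up coefficients, leaving out the zero-mode and momentum terms, and tests periodicity in $\sigma$ and the wave equation with centered finite differences. The mode coefficients and function names are assumptions chosen only for illustration; the constraint in condition 3 is not imposed here.

```python
import cmath


def mover(u, modes):
    """Real function of u built from modes exp(-2i n u) with n > 0.

    `modes` maps a positive integer n to a complex coefficient. Adding the
    complex-conjugate term (the -n mode) keeps the result real.
    """
    return sum(2.0 * (c * cmath.exp(-2j * n * u)).real for n, c in modes.items())


def x_field(sigma, tau, right_modes, left_modes):
    """x(sigma, tau) = x_R(tau - sigma) + x_L(tau + sigma), as in Eq. (32)."""
    return mover(tau - sigma, right_modes) + mover(tau + sigma, left_modes)


def wave_residual(sigma, tau, right_modes, left_modes, h=1e-3):
    """Finite-difference estimate of d^2x/dtau^2 - d^2x/dsigma^2."""
    def x(s, t):
        return x_field(s, t, right_modes, left_modes)
    d2_tau = (x(sigma, tau + h) - 2.0 * x(sigma, tau) + x(sigma, tau - h)) / h**2
    d2_sigma = (x(sigma + h, tau) - 2.0 * x(sigma, tau) + x(sigma - h, tau)) / h**2
    return d2_tau - d2_sigma


RIGHT = {1: 0.3 + 0.1j, 2: -0.05j}   # hypothetical right-moving coefficients
LEFT = {1: 0.2, 3: 0.04 - 0.02j}     # hypothetical left-moving coefficients

if __name__ == "__main__":
    import math
    s, t = 0.4, 1.3
    print(x_field(s, t, RIGHT, LEFT), x_field(s + math.pi, t, RIGHT, LEFT))
    print(wave_residual(s, t, RIGHT, LEFT))
```

Periodicity holds because every mode depends on $\sigma$ only through $e^{\pm 2in\sigma}$, which is unchanged under $\sigma \to \sigma + \pi$.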

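The first two conditions can be solved explicitly in terms of Fourier series:

$$x_R^\mu = \frac{1}{2} x^\mu + \ell_s^2\, p^\mu (\tau - \sigma) + \frac{i}{\sqrt{2}}\, \ell_s \sum_{n \neq 0} \frac{1}{n}\, \alpha_n^\mu\, e^{-2in(\tau - \sigma)}$$

$$x_L^\mu = \frac{1}{2} x^\mu + \ell_s^2\, p^\mu (\tau + \sigma) + \frac{i}{\sqrt{2}}\, \ell_s \sum_{n \neq 0} \frac{1}{n}\, \tilde{\alpha}_n^\mu\, e^{-2in(\tau + \sigma)}, \tag{33}$$

where the expansion parameters $\alpha_n^\mu$, $\tilde{\alpha}_n^\mu$ satisfy

$$\alpha_{-n}^\mu = \left(\alpha_n^\mu\right)^\dagger, \qquad \tilde{\alpha}_{-n}^\mu = \left(\tilde{\alpha}_n^\mu\right)^\dagger. \tag{34}$$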

The center-of-mass coordinate $x^\mu$ and momentum $p^\mu$ are also real. The fundamental string length scale $\ell_s$ is related to the tension $T$ by

$$T = \frac{1}{2\pi\alpha'}, \qquad \alpha' = \ell_s^2. \tag{35}$$

The parameter $\alpha'$ is called the universal Regge slope, since the string modes lie on linear parallel Regge trajectories with this slope.

D. Quantization

The analyses of closed-string left-moving modes, closed-string right-moving modes, and open-string modes are all very similar. Therefore, to avoid repetition, we will focus on the closed-string right-movers. Starting with the gauge-fixed action in Eq. (25), the canonical momentum of the string is

$$p^\mu(\sigma, \tau) = \frac{\delta S}{\delta \dot{x}_\mu} = T \dot{x}^\mu. \tag{36}$$

Canonical quantization (this is just free two-dimensional field theory for scalar fields) gives

$$[p^\mu(\sigma, \tau), x^\nu(\sigma', \tau)] = -i\hbar\, \eta^{\mu\nu}\, \delta(\sigma - \sigma'). \tag{37}$$

In terms of the Fourier modes (setting $\hbar = 1$) these become

$$[p^\mu, x^\nu] = -i\eta^{\mu\nu} \tag{38}$$

$$[\alpha_m^\mu, \alpha_n^\nu] = m\, \delta_{m+n,0}\, \eta^{\mu\nu}, \qquad [\tilde{\alpha}_m^\mu, \tilde{\alpha}_n^\nu] = m\, \delta_{m+n,0}\, \eta^{\mu\nu}, \tag{39}$$

and all other commutators vanish.

Recall that a quantum-mechanical harmonic oscillator can be described in terms of raising and lowering operators, usually called $a^\dagger$ and $a$, which satisfy

$$[a, a^\dagger] = 1. \tag{40}$$

We see that, aside from a normalization factor, the expansion coefficients $\alpha_{-m}^\mu$ and $\alpha_m^\mu$ are raising and lowering operators. There is just one problem. Because $\eta^{00} = -1$, the time components are proportional to oscillators with the wrong sign ($[a, a^\dagger] = -1$). This is potentially very bad, because such oscillators create states of negative norm, which could lead to an inconsistent quantum theory (with negative probabilities, etc.). Fortunately, as we will explain, the $T_{\alpha\beta} = 0$ constraints eliminate the negative-norm states from the physical spectrum.

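The classical constraint for the right-moving closed-string modes, $(x'_R)^2 = 0$, has Fourier components

$$L_m = \frac{T}{2} \int_0^\pi e^{-2im\sigma}\, (x'_R)^2\, d\sigma = \frac{1}{2} \sum_{n=-\infty}^{\infty} \alpha_{m-n} \cdot \alpha_n, \tag{41}$$

which are called Virasoro operators. Since $\alpha_m^\mu$ does not commute with $\alpha_{-m}^\mu$, $L_0$ needs to be normal-ordered:

$$L_0 = \frac{1}{2} \alpha_0^2 + \sum_{n=1}^{\infty} \alpha_{-n} \cdot \alpha_n. \tag{42}$$

Here $\alpha_0^\mu = \ell_s p^\mu / \sqrt{2}$, where $p^\mu$ is the momentum.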

E. The Free String Spectrum

Recall that the Hilbert space of a harmonic oscillator is spanned by states $|n\rangle$, $n = 0, 1, 2, \ldots$, where the ground state, $|0\rangle$, is annihilated by the lowering operator ($a|0\rangle = 0$) and

$$|n\rangle = \frac{(a^\dagger)^n}{\sqrt{n!}}\, |0\rangle. \tag{43}$$

Then, for a normalized ground-state ($\langle 0|0\rangle = 1$), one can use $[a, a^\dagger] = 1$ repeatedly to prove that

$$\langle m|n\rangle = \delta_{m,n} \tag{44}$$

and

$$a^\dagger a\, |n\rangle = n\, |n\rangle. \tag{45}$$
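These oscillator relations can be checked with explicit matrices. The sketch below represents $a$ on the first few states $|0\rangle, \ldots, |K-1\rangle$, an approximation introduced here because the true Fock space is infinite-dimensional. As a result the commutator equals the identity everywhere except in the last diagonal entry, which is a truncation artifact. The helper names are illustrative.

```python
import math


def lowering(size):
    """Matrix of the lowering operator a on |0>, ..., |size-1>: a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * size for _ in range(size)]
    for n in range(1, size):
        a[n - 1][n] = math.sqrt(n)
    return a


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    size = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(size)] for i in range(size)]


if __name__ == "__main__":
    a = lowering(6)
    a_dag = transpose(a)             # a is real, so its adjoint is the transpose
    number = matmul(a_dag, a)        # diagonal with entries 0, 1, 2, ...
    comm = commutator(a, a_dag)      # identity, except for the last diagonal entry
    print([round(number[i][i], 10) for i in range(6)])
    print([round(comm[i][i], 10) for i in range(6)])
```

The diagonal of `number` reproduces the eigenvalues in Eq. (45), and the leading block of `comm` reproduces Eq. (40).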

The string spectrum (of right-movers) is given by the product of an infinite number of harmonic-oscillator Fock spaces, one for each $\alpha_n^\mu$, subject to the Virasoro constraints (Virasoro, 1970)

$$(L_0 - q)\, |\phi\rangle = 0, \qquad L_n\, |\phi\rangle = 0, \quad n > 0. \tag{46}$$

Here $|\phi\rangle$ denotes a physical state, and $q$ is a constant to be determined. It accounts for the arbitrariness in the normal-ordering prescription used to define $L_0$. As we will see, the $L_0$ equation is a generalization of the Klein–Gordon equation. It contains $p^2 = -\partial \cdot \partial$ plus oscillator terms whose eigenvalue will determine the mass of the state.

It is interesting to work out the algebra of the Virasoro operators $L_m$, which follows from the oscillator algebra. The result, called the Virasoro algebra, is

$$[L_m, L_n] = (m - n)\, L_{m+n} + \frac{c}{12}\, (m^3 - m)\, \delta_{m+n,0}. \tag{47}$$

The second term on the right-hand side is called the “conformal anomaly term” and the constant $c$ is called the “central charge” or “conformal anomaly.” Each component of $x^\mu$ contributes one unit to the central charge, so that altogether $c = d$.

There is a more sophisticated way to describe the string spectrum (in terms of BRST cohomology), but it is equivalent to the more elementary approach presented here. In the BRST approach, gauge-fixing to the conformal gauge in the quantum theory requires the addition of world-sheet Faddeev–Popov ghosts, which turn out to contribute $c = -26$. Thus the total conformal anomaly of the $x^\mu$ and the ghosts cancels for the particular choice d = 26, as we asserted earlier. Moreover, it is also necessary to set the parameter $q = 1$, so that the mass-shell condition becomes

$$(L_0 - 1)\, |\phi\rangle = 0. \tag{48}$$

Since the mathematics of the open-string spectrum is the same as that of closed-string right-movers, let us now use the equations we have obtained to study the open-string spectrum. (Here we are assuming that the open-string boundary conditions are all Neumann, corresponding to spacetime-filling D-branes.) The mass-shell condition is

$$M^2 = -p^2 = -\frac{1}{2} \alpha_0^2 = N - 1, \tag{49}$$

where

$$N = \sum_{n=1}^{\infty} \alpha_{-n} \cdot \alpha_n = \sum_{n=1}^{\infty} n\, a_n^\dagger \cdot a_n. \tag{50}$$

The $a^\dagger$’s and $a$’s are properly normalized raising and lowering operators. Since each $a^\dagger a$ has eigenvalues 0, 1, 2, . . . , the possible values of $N$ are also 0, 1, 2, . . . . The unique way to realize $N = 0$ is for all the oscillators to be in the ground state, which we denote simply by $|0; p^\mu\rangle$, where $p^\mu$ is the momentum of the state. This state has $M^2 = -1$, which is a tachyon ($p^\mu$ is spacelike). Such a faster-than-light particle is certainly not possible in a consistent quantum theory, because the vacuum would be unstable. However, in perturbation theory (which is the framework we are implicitly considering) this instability is not visible. Since this string theory is only supposed to be a warm-up exercise before considering tachyon-free superstring theories, let us continue without worrying about the vacuum instability.

The first excited state, with $N = 1$, corresponds to $M^2 = 0$. The only way to achieve $N = 1$ is to excite the first oscillator once:

$$|\phi\rangle = \zeta_\mu\, \alpha_{-1}^\mu\, |0; p\rangle. \tag{51}$$

Here $\zeta_\mu$ denotes the polarization vector of a massless spin-one particle. The Virasoro constraint condition $L_1 |\phi\rangle = 0$ implies that $\zeta_\mu$ must satisfy

$$p^\mu \zeta_\mu = 0. \tag{52}$$

This ensures that the spin is transversely polarized, so there are d − 2 independent polarization states. This agrees with what one finds for a massless Maxwell or Yang–Mills field.

At the next mass level, where $N = 2$ and $M^2 = 1$, the most general possibility has the form

$$|\phi\rangle = \left(\zeta_\mu\, \alpha_{-2}^\mu + \lambda_{\mu\nu}\, \alpha_{-1}^\mu \alpha_{-1}^\nu\right) |0; p\rangle. \tag{53}$$

However, the constraints $L_1 |\phi\rangle = L_2 |\phi\rangle = 0$ restrict $\zeta_\mu$ and $\lambda_{\mu\nu}$. The analysis is interesting, but only the results will be described. If d > 26, the physical spectrum contains a negative-norm state, which is not allowed. However, when d = 26, this state becomes zero-norm and decouples from the theory. This leaves a pure massive “spin two” (symmetric traceless tensor) particle as the only physical state at this mass level.

Let us now turn to the closed-string spectrum. A closed-string state is described as a tensor product of a left-moving state and a right-moving state, subject to the condition that the $N$ value of the left-moving and the right-moving state is the same. The reason for this “level-matching” condition is that we have $(L_0 - 1)|\phi\rangle = (\tilde{L}_0 - 1)|\phi\rangle = 0$. The sum $(L_0 + \tilde{L}_0 - 2)|\phi\rangle$ is interpreted as the mass-shell condition, while the difference $(L_0 - \tilde{L}_0)|\phi\rangle = (N - \tilde{N})|\phi\rangle = 0$ is the level-matching condition.

Using this rule, the closed-string ground state is just

$$|0\rangle \otimes |0\rangle, \tag{54}$$

which represents a spin 0 tachyon with $M^2 = -2$. (The notation no longer displays the momentum $p$ of the state.) Again, this signals an unstable vacuum, but we will not worry about it here. Much more significant is the first excited state

$$|\phi\rangle = \zeta_{\mu\nu} \left(\alpha_{-1}^\mu |0\rangle \otimes \tilde{\alpha}_{-1}^\nu |0\rangle\right), \tag{55}$$

which has $M^2 = 0$. The Virasoro constraints $L_1 |\phi\rangle = \tilde{L}_1 |\phi\rangle = 0$ imply that $p^\mu \zeta_{\mu\nu} = 0$. Such a polarization tensor encodes three distinct spin states, each of which plays a fundamental role in string theory. The symmetric part of $\zeta_{\mu\nu}$ encodes a spacetime metric field $g_{\mu\nu}$ (massless spin two) and a scalar dilaton field $\phi$ (massless spin zero). The $g_{\mu\nu}$ field is the graviton field, and its presence (with the correct gauge invariances) accounts for the fact that the theory contains general relativity, which is a good approximation for energies well below the string scale. Its vacuum value determines the spacetime geometry. Similarly, the value of $\phi$ determines the string coupling constant ($g_s = \langle e^{\phi} \rangle$).

$\zeta_{\mu\nu}$ also has an antisymmetric part, which corresponds to a massless antisymmetric tensor gauge field $B_{\mu\nu} = -B_{\nu\mu}$. This field has a gauge transformation of the form

$$\delta B_{\mu\nu} = \partial_\mu \Lambda_\nu - \partial_\nu \Lambda_\mu, \tag{56}$$

(which can be regarded as a generalization of the gauge transformation rule for the Maxwell field: $\delta A_\mu = \partial_\mu \Lambda$). The gauge-invariant field strength (analogous to $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$) is

$$H_{\mu\nu\rho} = \partial_\mu B_{\nu\rho} + \partial_\nu B_{\rho\mu} + \partial_\rho B_{\mu\nu}. \tag{57}$$

The importance of the $B_{\mu\nu}$ field resides in the fact that the fundamental string is a source for $B_{\mu\nu}$, just as a charged particle is a source for the vector potential $A_\mu$. Mathematically, this is expressed by the coupling

$$q \int B_{\mu\nu}\, dx^\mu \wedge dx^\nu, \tag{58}$$

which generalizes the coupling of a charged particle to a Maxwell field

$$q \int A_\mu\, dx^\mu. \tag{59}$$

F. The Number of Physical States

The number of physical states grows rapidly as a function of mass. This can be analyzed quantitatively. For the open string, let us denote the number of physical states with $\alpha' M^2 = n - 1$ by $d_n$. These numbers are encoded in the generating function

$$G(w) = \sum_{n=0}^{\infty} d_n w^n = \prod_{m=1}^{\infty} (1 - w^m)^{-24}. \tag{60}$$

The exponent 24 reflects the fact that in 26 dimensions, once the Virasoro conditions are taken into account, the spectrum is exactly what one would get from 24 transversely polarized oscillators. It is easy to deduce from this generating function the asymptotic number of states for large $n$, as a function of $n$

$$d_n \sim n^{-27/4}\, e^{4\pi\sqrt{n}}. \tag{61}$$
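The first few degeneracies can be read off by expanding the product in Eq. (60) as a power series. The sketch below does this with integer arithmetic and, for comparison, evaluates the logarithm of the right-hand side of Eq. (61), which fixes the growth only up to an overall constant. Function names are illustrative.

```python
import math


def open_string_degeneracies(max_level, transverse=24):
    """Coefficients d_0..d_max_level of prod_{m>=1} (1 - w^m)^(-transverse)."""
    coeffs = [0] * (max_level + 1)
    coeffs[0] = 1
    for m in range(1, max_level + 1):
        # Multiply the truncated series by 1/(1 - w^m), once per transverse direction.
        for _ in range(transverse):
            for k in range(m, max_level + 1):
                coeffs[k] += coeffs[k - m]
    return coeffs


def log_asymptotic(n):
    """Logarithm of n^(-27/4) * exp(4 pi sqrt(n)), the growth law of Eq. (61)."""
    return -27.0 / 4.0 * math.log(n) + 4.0 * math.pi * math.sqrt(n)


if __name__ == "__main__":
    d = open_string_degeneracies(50)
    print(d[:5])
    for n in (10, 25, 50):
        print(n, math.log(d[n]), log_asymptotic(n))
```

The values $d_1 = 24$ and $d_2 = 324$ match the spectrum described earlier: 24 = d − 2 transverse polarizations of the massless vector, and 324 components of a symmetric traceless tensor of the 25-dimensional rotation group.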

This asymptotic degeneracy implies that the finite-temperature partition function

$$\mathrm{tr}\left(e^{-\beta H}\right) = \sum_{n=0}^{\infty} d_n\, e^{-\beta M_n} \tag{62}$$

diverges for $\beta^{-1} = T > T_H$, where $T_H$ is the Hagedorn temperature

$$T_H = \frac{1}{4\pi\sqrt{\alpha'}} = \frac{1}{4\pi \ell_s}. \tag{63}$$

$T_H$ might be the maximum possible temperature or else a critical temperature at which there is a phase transition.

G. The Structure of String Perturbation Theory

As we discussed in the first section, perturbation theory calculations are carried out by computing Feynman diagrams. Whereas in ordinary quantum field theory Feynman diagrams are webs of world lines, in the case of string theory they are two-dimensional surfaces representing string world-sheets. For these purposes, it is convenient to require that the world-sheet geometry is Euclidean (i.e., the world-sheet metric $h_{\alpha\beta}$ is positive definite). The diagrams are classified by their topology, which is very well understood in the case of two-dimensional surfaces. The world-sheet topology is characterized by the number of handles ($h$), the number of boundaries ($b$), and whether or not they are orientable. The order of the expansion (i.e., the power of the string-coupling constant) is determined by the Euler number of the world sheet $M$. It is given by $\chi(M) = 2 - 2h - b$. For example, a sphere has $h = b = 0$, and hence $\chi = 2$. A torus has $h = 1$, $b = 0$, and $\chi = 0$, a cylinder has $h = 0$, $b = 2$, and $\chi = 0$, and so forth. Surfaces with $\chi = 0$ admit a flat metric.

A scattering amplitude is given by a path integral of the schematic structure

$$\int Dh_{\alpha\beta}(\sigma)\, Dx^\mu(\sigma)\, e^{-S[h,x]} \prod_{i=1}^{n_c} \int_M V_{\alpha_i}(\sigma_i)\, d^2\sigma_i \prod_{j=1}^{n_o} \int_{\partial M} V_{\beta_j}(\sigma_j)\, d\sigma_j. \tag{64}$$

The action $S[h, x]$ is given in Eq. (21). $V_{\alpha_i}$ is a vertex operator that describes emission or absorption of a closed-string state of type $\alpha_i$ from the interior of the string world-sheet, and $V_{\beta_j}$ is a vertex operator that describes emission or absorption of an open-string state of type $\beta_j$ from the boundary of the string world-sheet. There are lots of technical details that are not explained here. In the end, one finds that the conformally inequivalent world-sheets of a given topology are described by a finite number of parameters, and thus these amplitudes can be recast as finite-dimensional integrals over these “moduli.” (The momentum integrals are already done.) The dimension of the resulting integral turns out to be

$$N = 3(2h + b - 2) + 2n_c + n_o. \tag{65}$$
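The topological bookkeeping in the Euler number and in Eq. (65) is easy to tabulate. The following sketch simply encodes the two formulas; the function names are hypothetical, and the disk with four open-string insertions is included as a sample case.

```python
def euler_number(handles: int, boundaries: int) -> int:
    """Euler number chi = 2 - 2h - b of an orientable world-sheet."""
    return 2 - 2 * handles - boundaries


def moduli_dimension(handles: int, boundaries: int, n_closed: int, n_open: int) -> int:
    """Dimension N = 3(2h + b - 2) + 2 n_c + n_o of the moduli integral, Eq. (65)."""
    return 3 * (2 * handles + boundaries - 2) + 2 * n_closed + n_open


if __name__ == "__main__":
    surfaces = {"sphere": (0, 0), "disk": (0, 1), "cylinder": (0, 2), "torus": (1, 0)}
    for name, (h, b) in surfaces.items():
        print(f"{name:8s} chi = {euler_number(h, b)}")
    # Disk with four open-string vertex operators on its boundary.
    print(moduli_dimension(handles=0, boundaries=1, n_closed=0, n_open=4))
```

Since $3(2h + b - 2) = -3\chi$, the count can equivalently be written as $N = -3\chi + 2n_c + n_o$.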

As an example consider the amplitude describing elastic scattering of two open-string ground states. In this case $h = 0$, $b = 1$, $n_c = 0$, $n_o = 4$, and therefore $N = 1$. In terms of the usual Mandelstam invariants $s = -(p_1 + p_2)^2$ and $t = -(p_1 - p_4)^2$, the result is

$$A(s, t) = g_s^2 \int_0^1 dx\, x^{-\alpha(s) - 1} (1 - x)^{-\alpha(t) - 1}, \tag{66}$$

where the Regge trajectory $\alpha(s)$ is

$$\alpha(s) = 1 + \alpha' s. \tag{67}$$

This integral is just the Euler beta function

$$A(s, t) = g_s^2\, B(-\alpha(s), -\alpha(t)) = g_s^2\, \frac{\Gamma(-\alpha(s))\, \Gamma(-\alpha(t))}{\Gamma(-\alpha(s) - \alpha(t))}. \tag{68}$$

This is the famous Veneziano amplitude (Veneziano, 1968), which got the whole subject started.
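The equality of the integral in Eq. (66) and the gamma-function form in Eq. (68) can be checked numerically wherever the integral converges, that is, for $-\alpha(s) > 0$ and $-\alpha(t) > 0$. The sketch below uses a simple midpoint rule. The function names, the default values $\alpha' = 1$ and $g_s = 1$, and the sample kinematics are assumptions made for illustration.

```python
import math


def beta_gamma(a: float, b: float) -> float:
    """Euler beta function via B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def beta_integral(a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of the integral of x^(a-1) (1-x)^(b-1) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)
    return total * h


def veneziano(s: float, t: float, alpha_prime: float = 1.0, g_s: float = 1.0) -> float:
    """A(s, t) = g_s^2 B(-alpha(s), -alpha(t)) with alpha(s) = 1 + alpha' s.

    math.gamma raises ValueError at non-positive integers, so this sketch is
    only meant for kinematics away from those points.
    """
    alpha_s = 1.0 + alpha_prime * s
    alpha_t = 1.0 + alpha_prime * t
    return g_s ** 2 * beta_gamma(-alpha_s, -alpha_t)


if __name__ == "__main__":
    print(beta_integral(2.5, 1.5), beta_gamma(2.5, 1.5), math.pi / 16)
    print(veneziano(-3.3, -2.6), veneziano(-2.6, -3.3))
```

The last line illustrates the symmetry of the amplitude under the exchange of $s$ and $t$.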

H. Recapitulation

This section described some of the basic facts of the 26-dimensional bosonic string theory. One significant point that has not yet been made clear is that there are actually a number of distinct theories depending on what kinds of strings one includes.

Oriented closed strings only

Oriented closed strings and oriented open strings; in this case one can incorporate U(n) gauge symmetry

Unoriented closed strings only

Unoriented closed strings and unoriented open strings; in this case one can incorporate SO(n) or Sp(n) gauge symmetry

As we have mentioned already, all the bosonic string theories are unphysical as they stand, because (in each case) the closed-string spectrum contains a tachyon. A tachyon means that one is doing perturbation theory about an unstable vacuum. This is analogous to the unbroken symmetry extremum of the Higgs potential in the standard model. In that case, we know that there is a stable minimum, where the Higgs field acquires a vacuum value. Recently, there has been success in demonstrating that open-string tachyons condense at a stable minimum, but the fate of the closed-string tachyon is still an open problem.

III. SUPERSTRINGS

Among the deficiencies of the bosonic string theory is the fact that there are no fermions. As we will see, the addition of fermions leads quite naturally to supersymmetry and hence superstrings. There are two alternative formalisms that are used to study superstrings. The original one, which grew out of the 1971 papers by Ramond and by Neveu and Schwarz (1971), is called the RNS formalism. In this approach, the supersymmetry of the two-dimensional world-sheet theory plays a central role. The second approach, developed by Michael Green and the author in the early 1980s (Green and Schwarz, 1981), emphasizes supersymmetry in the ten-dimensional spacetime. Which one is more useful depends on the problem being studied. Only the RNS approach will be presented here.

In the RNS formalism, the world-sheet theory is based on the d functions $x^\mu(\sigma, \tau)$ that describe the embedding of the world-sheet in the spacetime, just as before. However, in order to supersymmetrize the world-sheet theory, we also introduce d fermionic partner fields $\psi^\mu(\sigma, \tau)$. Note that $x^\mu$ transforms as a vector from the spacetime viewpoint, but as d scalar fields from the two-dimensional world-sheet viewpoint. The $\psi^\mu$ also transform as a spacetime vector, but as world-sheet spinors. Altogether, $x^\mu$ and $\psi^\mu$ describe d supersymmetry multiplets, one for each value of $\mu$.

The reparametrization invariant world-sheet action described in the preceding section can be generalized to have local supersymmetry on the world-sheet, as well. (The details of how that works are a bit too involved to describe here.) When one chooses a suitable conformal gauge ($h_{\alpha\beta} = e^{\phi} \eta_{\alpha\beta}$), together with an appropriate fermionic gauge condition, one ends up with a world-sheet theory that has global supersymmetry supplemented by constraints. The constraints form a super-Virasoro algebra. This means that in addition to the Virasoro constraints of the bosonic string theory, there are fermionic constraints, as well.

A. The Gauge-Fixed Theory

The globally supersymmetric world-sheet action that arises in the conformal gauge takes the form

$$S = -\frac{T}{2} \int d^2\sigma \left(\partial_\alpha x^\mu\, \partial^\alpha x_\mu - i \bar{\psi}^\mu \rho^\alpha \partial_\alpha \psi_\mu\right). \tag{69}$$

The first term is exactly the same as in Eq. (25) of the bosonic string theory. Recall that it has the structure of d free scalar fields. The second term that has now been added is just d free massless spinor fields, with Dirac-type actions. The notation is that $\rho^\alpha$ are two 2 × 2 Dirac matrices and $\psi = \begin{pmatrix} \psi_- \\ \psi_+ \end{pmatrix}$ is a two-component Majorana spinor. The Majorana condition simply means that $\psi_+$ and $\psi_-$ are real in a suitable representation of the Dirac algebra. In fact, a convenient choice is one for which

$$\bar{\psi} \rho^\alpha \partial_\alpha \psi = \psi_- \partial_+ \psi_- + \psi_+ \partial_- \psi_+, \tag{70}$$

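where $\partial_\pm$ represent derivatives with respect to $\sigma^\pm = \tau \pm \sigma$. In this basis, the equations of motion are simply

$$\partial_+\psi^\mu_- = \partial_-\psi^\mu_+ = 0. \tag{71}$$

Thus $\psi^\mu_-$ describes right-movers and $\psi^\mu_+$ describes left-movers.

Concentrating on the right-movers $\psi^\mu_-$, the global supersymmetry transformations, which are a symmetry of the gauge-fixed action, are

$$\begin{aligned}\delta x^\mu &= i\varepsilon\psi^\mu_- \\ \delta\psi^\mu_- &= -2\,\partial_- x^\mu\,\varepsilon.\end{aligned} \tag{72}$$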

It is easy to show that this is a symmetry of the action [Eq. (69)]. There is an analogous symmetry for the left-movers. (Accordingly, the world-sheet theory is said to have (1, 1) supersymmetry.) Continuing to focus on the right-movers, the Virasoro constraint is

$$(\partial_- x)^2 + \frac{i}{2}\,\psi^\mu_-\partial_-\psi_{\mu -} = 0. \tag{73}$$

The first term is what we found in the bosonic string theory, and the second term is an additional fermionic contribution. There is also an associated fermionic constraint

$$\psi^\mu_-\partial_- x_\mu = 0. \tag{74}$$

The Fourier modes of these constraints generate the super-Virasoro algebra. There is a second identical super-Virasoro algebra for the left-movers.

As in the bosonic string theory, the Virasoro algebra has conformal anomaly terms proportional to a central charge $c$. As in that theory, each component of $x^\mu$ contributes $+1$ to the central charge, for a total of $d$, while (in the BRST quantization approach) the reparametrization symmetry ghosts contribute $-26$. But now there are additional contributions. Each component of $\psi^\mu$ gives $+1/2$, for a total of $d/2$, and the local supersymmetry ghosts contribute $+11$. Adding all of this up gives a grand total of $c = \frac{3d}{2} - 15$. Thus, we see that the conformal anomaly cancels for the specific choice $d = 10$. This is the preferred critical dimension for superstrings, just as $d = 26$ is the critical dimension for bosonic strings. For other values the theory has a variety of inconsistencies.
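The bookkeeping in the preceding paragraph can be checked mechanically. The following is a minimal sketch that simply adds the four contributions quoted above and searches for the dimension in which they cancel; the function names and the search range are illustrative choices, not part of the original text.

```python
from fractions import Fraction

def total_central_charge(d, superstring=True):
    """Sum the conformal-anomaly contributions quoted in the text."""
    c = Fraction(d)              # each x^mu contributes +1
    c += Fraction(-26)           # reparametrization ghosts
    if superstring:
        c += Fraction(d, 2)      # each psi^mu contributes +1/2
        c += Fraction(11)        # local supersymmetry ghosts
    return c

def critical_dimensions(superstring=True, search=range(1, 41)):
    """Return every d in `search` for which the total central charge vanishes."""
    return [d for d in search if total_central_charge(d, superstring) == 0]

print(critical_dimensions(superstring=True))   # [10]
print(critical_dimensions(superstring=False))  # [26]
```

Exact rational arithmetic is used so that the half-integer fermion contribution cannot be lost to rounding.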

B. The R and NS Sectors

Let us now consider boundary conditions for $\psi^\mu(\sigma, \tau)$. (The story for $x^\mu$ is exactly as before.) First, let us consider open-string boundary conditions. For the action to be well-defined, it turns out that one must set $\psi_+ = \pm\psi_-$ at the two ends $\sigma = 0, \pi$. An overall sign is a matter of convention, so we can set

$$\psi^\mu_+(0, \tau) = \psi^\mu_-(0, \tau), \tag{75}$$

without loss of generality. But this still leaves two possibilities for the other end, which are called R and NS:

$$\begin{aligned}\text{R:}\quad &\psi^\mu_+(\pi, \tau) = \psi^\mu_-(\pi, \tau) \\ \text{NS:}\quad &\psi^\mu_+(\pi, \tau) = -\psi^\mu_-(\pi, \tau).\end{aligned} \tag{76}$$

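Combining these with the equations of motion $\partial_-\psi_+ = \partial_+\psi_- = 0$ allows us to express the general solutions as Fourier series

$$\begin{aligned}
\text{R:}\quad &\psi^\mu_- = \frac{1}{\sqrt{2}}\sum_{n\in\mathbb{Z}} d^\mu_n\, e^{-in(\tau-\sigma)}, \qquad \psi^\mu_+ = \frac{1}{\sqrt{2}}\sum_{n\in\mathbb{Z}} d^\mu_n\, e^{-in(\tau+\sigma)} \\
\text{NS:}\quad &\psi^\mu_- = \frac{1}{\sqrt{2}}\sum_{r\in\mathbb{Z}+1/2} b^\mu_r\, e^{-ir(\tau-\sigma)}, \qquad \psi^\mu_+ = \frac{1}{\sqrt{2}}\sum_{r\in\mathbb{Z}+1/2} b^\mu_r\, e^{-ir(\tau+\sigma)}.
\end{aligned} \tag{77}$$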

The Majorana condition implies that $d^\mu_{-n} = d^{\mu\dagger}_n$ and $b^\mu_{-r} = b^{\mu\dagger}_r$. Note that the index $n$ takes integer values, whereas the index $r$ takes half-integer values ($\pm\frac{1}{2}, \pm\frac{3}{2}, \ldots$). In particular, only the R boundary condition gives a zero mode.

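Canonical quantization of the free fermi fields $\psi^\mu(\sigma, \tau)$ is standard and straightforward. The result can be expressed as anticommutation relations for the coefficients $d^\mu_m$ and $b^\mu_r$:

$$\begin{aligned}
\text{R:}\quad &\{d^\mu_m, d^\nu_n\} = \eta^{\mu\nu}\delta_{m+n,0}, \qquad m, n \in \mathbb{Z} \\
\text{NS:}\quad &\{b^\mu_r, b^\nu_s\} = \eta^{\mu\nu}\delta_{r+s,0}, \qquad r, s \in \mathbb{Z} + \tfrac{1}{2}.
\end{aligned} \tag{78}$$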

Thus, in addition to the harmonic oscillator operators $\alpha^\mu_m$ that appear as coefficients in mode expansions of $x^\mu$, there are fermionic oscillator operators $d^\mu_m$ or $b^\mu_r$ that appear as coefficients in mode expansions of $\psi^\mu$. The basic structure $\{b, b^\dagger\} = 1$ is very simple. It describes a two-state system with $b\,|0\rangle = 0$ and $b^\dagger\,|0\rangle = |1\rangle$. The $b$'s or $d$'s with negative indices can be regarded as raising operators and those with positive indices as lowering operators, just as we did for the $\alpha^\mu_n$.

In the NS sector, the ground state $|0; p\rangle$ satisfies

$$\alpha^\mu_m\,|0; p\rangle = b^\mu_r\,|0; p\rangle = 0, \qquad m, r > 0 \tag{79}$$

which is a straightforward generalization of how we defined the ground state in the bosonic string theory. All the excited states obtained by acting with the $\alpha$ and $b$ raising operators are spacetime bosons. We will see later that the ground state, defined as we have done here, is again a tachyon. However, in this theory, as we will also see, there is a way by which this tachyon can (and must) be removed from the physical spectrum.

In the R sector there are zero modes that satisfy the algebra

$$\{d^\mu_0, d^\nu_0\} = \eta^{\mu\nu}. \tag{80}$$

This is the $d$-dimensional spacetime Dirac algebra. Thus the $d_0$'s should be regarded as Dirac matrices and all states in the R sector should be spinors in order to furnish representation spaces on which these operators can act. The conclusion, therefore, is that whereas all string states in the NS sector are spacetime bosons, all string states in the R sector are spacetime fermions.

In the closed-string case, the physical states are obtained by tensoring right- and left-movers, each of which is mathematically very similar to the open-string spectrum. This means that there are four distinct sectors of closed-string states: NS ⊗ NS and R ⊗ R describe spacetime bosons, whereas NS ⊗ R and R ⊗ NS describe spacetime fermions. We will return to explore what this gives later, but first we need to explore the right-movers by themselves in more detail.

The zero mode of the fermionic constraint $\psi^\mu\partial_- x_\mu = 0$ gives a wave equation for (fermionic) strings in the Ramond sector, $F_0|\psi\rangle = 0$, which is called the Dirac–Ramond equation. In terms of the oscillators

$$F_0 = \alpha_0 \cdot d_0 + \sum_{n \neq 0} \alpha_{-n} \cdot d_n. \tag{81}$$

The zero-mode piece of $F_0$, $\alpha_0 \cdot d_0$, has been isolated, because it is just the usual Dirac operator, $\gamma^\mu\partial_\mu$, up to normalization. (Recall that $\alpha^\mu_0$ is proportional to $p^\mu = -i\partial^\mu$, and $d^\mu_0$ is proportional to the Dirac matrices $\gamma^\mu$.) The fermionic ground state $|\psi_0\rangle$, which satisfies

$$\alpha^\mu_n\,|\psi_0\rangle = d^\mu_n\,|\psi_0\rangle = 0, \qquad n > 0, \tag{82}$$

satisfies the wave equation

$$\alpha_0 \cdot d_0\,|\psi_0\rangle = 0, \tag{83}$$

which is precisely the massless Dirac equation. Hence the fermionic ground state is a massless spinor.

C. The GSO Projection

In the NS (bosonic) sector the mass formula is

$$M^2 = N - \frac{1}{2}, \tag{84}$$

which is to be compared with the formula $M^2 = N - 1$ of the bosonic string theory. This time the number operator $N$ has contributions from the $b$ oscillators as well as the $\alpha$ oscillators. (The reason that the normal-ordering constant is $-1/2$ instead of $-1$ works as follows. Each transverse $\alpha$ oscillator contributes $-1/24$ and each transverse $b$ oscillator contributes $-1/48$. The result follows since the bosonic theory has 24 transverse directions and the superstring theory has 8 transverse directions.) Thus the ground state, which has $N = 0$, is now a tachyon with $M^2 = -1/2$.
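The arithmetic behind the two normal-ordering constants can be spelled out in a few lines. This sketch takes the per-oscillator shifts ($-1/24$ and $-1/48$) and the numbers of transverse directions directly from the paragraph above; the function and constant names are illustrative.

```python
from fractions import Fraction

ALPHA_SHIFT = Fraction(-1, 24)  # per transverse alpha oscillator (quoted in the text)
B_SHIFT = Fraction(-1, 48)      # per transverse b oscillator in the NS sector (quoted in the text)

def normal_ordering_constant(transverse_directions, with_b_oscillators):
    """Total shift of M^2: one alpha (and optionally one b) tower per transverse direction."""
    per_direction = ALPHA_SHIFT + (B_SHIFT if with_b_oscillators else 0)
    return transverse_directions * per_direction

bosonic = normal_ordering_constant(24, with_b_oscillators=False)
ns_sector = normal_ordering_constant(8, with_b_oscillators=True)
print(bosonic, ns_sector)  # -1 -1/2
```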

This is where things stood until the 1976 work of Gliozzi, Scherk, and Olive. They noted that the spectrum admits a consistent truncation (called the GSO projection), which is necessary for the consistency of the interacting theory. In the NS sector, the GSO projection keeps states with an odd number of $b$-oscillator excitations and removes states with an even number of $b$-oscillator excitations. Once this rule is implemented the only possible values of $N$ are half integers, and the spectrum of allowed masses is integral:

$$M^2 = 0, 1, 2, \ldots. \tag{85}$$

In particular, the bosonic ground state is now massless. The spectrum no longer contains a tachyon. The GSO projection also acts on the R sector, where there is an analogous restriction on the $d$ oscillators. This amounts to imposing a chirality projection on the spinors.

Let us look at the massless spectrum of the GSO-projected theory. The ground-state boson is now a massless vector, represented by the state $\zeta_\mu b^\mu_{-1/2}\,|0; p\rangle$, which (as before) has $d - 2 = 8$ physical polarizations. The ground-state fermion is a massless Majorana–Weyl fermion, which has $\frac{1}{4} \cdot 2^{d/2} = 8$ physical polarizations. Thus there are an equal number of bosons and fermions, as is required for a theory with spacetime supersymmetry. In fact, this is the pair of fields that enter into ten-dimensional super Yang–Mills theory. The claim is that the complete theory now has spacetime supersymmetry.
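As a quick check of the counting just quoted, the sketch below evaluates the two formulas from the text, $d - 2$ for the massless vector and $\frac{1}{4} \cdot 2^{d/2}$ for the massless Majorana–Weyl fermion, at $d = 10$. The function names are illustrative, and the fermion formula is only meant to be used for even $d$.

```python
def vector_polarizations(d):
    """Physical polarizations of a massless vector in d dimensions: d - 2."""
    return d - 2

def majorana_weyl_polarizations(d):
    """Physical polarizations quoted in the text: (1/4) * 2^(d/2), for even d."""
    if d % 2:
        raise ValueError("the formula is applied here only for even d")
    return 2 ** (d // 2) // 4

d = 10
print(vector_polarizations(d), majorana_weyl_polarizations(d))  # 8 8
```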

If there is spacetime supersymmetry, then there should be an equal number of bosons and fermions at every mass level. Let us denote the number of bosonic states with $M^2 = n$ by $d_{NS}(n)$ and the number of fermionic states with $M^2 = n$ by $d_R(n)$. Then we can encode these numbers in generating functions, just as we did for the bosonic string theory:

$$f_{NS}(w) = \sum_{n=0}^{\infty} d_{NS}(n)\,w^n = \frac{1}{2\sqrt{w}}\left[\prod_{m=1}^{\infty}\left(\frac{1 + w^{m-1/2}}{1 - w^m}\right)^{8} - \prod_{m=1}^{\infty}\left(\frac{1 - w^{m-1/2}}{1 - w^m}\right)^{8}\right] \tag{86}$$

$$f_R(w) = \sum_{n=0}^{\infty} d_R(n)\,w^n = 8\prod_{m=1}^{\infty}\left(\frac{1 + w^m}{1 - w^m}\right)^{8}. \tag{87}$$

The 8's in the exponents refer to the number of transverse directions in ten dimensions. The effect of the GSO projection is the subtraction of the second term in $f_{NS}$ and the reduction of the coefficient in $f_R$ from 16 to 8. In 1829, Jacobi discovered the formula

$$f_R(w) = f_{NS}(w). \tag{88}$$

(He used a different notation, of course.) For him this relation was an obscure curiosity, but we now see that it tells us that the number of bosons and fermions is the same at every mass level, which provides strong evidence for supersymmetry of this string theory in ten dimensions. A complete proof of supersymmetry for the interacting theory was constructed by Green and the author five years after the GSO paper (Green and Schwarz, 1981).
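Jacobi's identity, Eq. (88), can be checked level by level by expanding both generating functions as truncated power series. The sketch below is one way to do this, under the assumption that it is convenient to work in the auxiliary variable $q = \sqrt{w}$, so that all exponents in Eq. (86) become integers; the helper names are illustrative and the code checks only finitely many levels, so it is a sanity check rather than a proof.

```python
def times_one_plus(series, k, sign=1):
    """Return series * (1 + sign*q^k), truncated to len(series) terms."""
    out = list(series)
    for i in range(k, len(series)):
        out[i] += sign * series[i - k]
    return out

def times_geometric(series, k):
    """Return series / (1 - q^k) = series * (1 + q^k + q^2k + ...), truncated."""
    out = list(series)
    for i in range(k, len(out)):
        out[i] += out[i - k]
    return out

def power8(series, factor, *args):
    """Apply the same factor eight times (one per transverse direction)."""
    for _ in range(8):
        series = factor(series, *args)
    return series

def ns_product(order, sign):
    """prod_m ((1 + sign*q^(2m-1)) / (1 - q^(2m)))^8 up to q^(order-1)."""
    s = [1] + [0] * (order - 1)
    for k in range(1, order):
        if k % 2:   # odd power of q, i.e. w^(m-1/2): b oscillators
            s = power8(s, times_one_plus, k, sign)
        else:       # even power of q, i.e. w^m: alpha oscillators
            s = power8(s, times_geometric, k)
    return s

def ns_counts(levels):
    """d_NS(n) for n = 0..levels-1, from Eq. (86)."""
    order = 2 * levels
    plus, minus = ns_product(order, +1), ns_product(order, -1)
    # Dividing by 2*sqrt(w) = 2q picks out the odd powers q^(2n+1).
    return [(plus[2 * n + 1] - minus[2 * n + 1]) // 2 for n in range(levels)]

def r_counts(levels):
    """d_R(n) for n = 0..levels-1, from Eq. (87)."""
    order = 2 * levels
    s = [1] + [0] * (order - 1)
    for k in range(2, order, 2):   # q^k = w^(k/2)
        s = power8(s, times_one_plus, k)
        s = power8(s, times_geometric, k)
    return [8 * s[2 * n] for n in range(levels)]

print(ns_counts(3))  # [8, 128, 1152]
print(r_counts(3))   # [8, 128, 1152]
```

The first entry reproduces the 8 + 8 massless states discussed above; raising `levels` extends the comparison to higher masses.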

D. Type II Superstrings

We have described the spectrum of bosonic (NS) and fermionic (R) string states. This also gives the spectrum of left- and right-moving closed-string modes, so we can form the closed-string spectrum by forming tensor products as before. In particular, the massless right-moving spectrum consists of a vector and a Majorana–Weyl spinor. Thus the massless closed-string spectrum is given by

$$(\text{vector} + \text{MW spinor}) \otimes (\text{vector} + \text{MW spinor}). \tag{89}$$

There are actually two distinct possibilities, because the two MW spinors can have either opposite chirality or the same chirality.

When the two MW spinors have opposite chirality, the theory is called type IIA superstring theory, and its massless spectrum forms the type IIA supergravity multiplet. This theory is left-right symmetric. In other words, the spectrum is invariant under mirror reflection. This implies that the IIA theory is parity conserving. When the two MW spinors have the same chirality, the resulting type IIB superstring theory is chiral, and hence parity violating. In each case there are two gravitinos, arising from vector ⊗ spinor and spinor ⊗ vector, which are gauge fields for local supersymmetry. Thus, since both type II superstring theories have two gravitinos, they have local $N = 2$ supersymmetry in the ten-dimensional sense. The supersymmetry charges are Majorana–Weyl spinors, which have 16 real components, so the type II theories have 32 conserved supercharges. This is the same amount of supersymmetry as what is usually called $N = 8$ in four dimensions, and it is believed to be the most that is possible in a consistent interacting theory.

The type II superstring theories contain only oriented closed strings (in the absence of D-branes). However, there is another superstring theory, called type I, which can be obtained by a projection of the type IIB theory that only keeps the diagonal sum of the two gravitinos. Thus, this theory only has $N = 1$ supersymmetry (16 supercharges). It is a theory of unoriented closed strings. However, it can be supplemented by unoriented open strings. This introduces a Yang–Mills gauge group, which classically can be SO($n$) or Sp($n$) for any value of $n$. Quantum consistency singles out SO(32) as the unique possibility. This restriction can be understood in a number of ways. The way that it was first discovered was by considering anomalies.

E. Anomalies

Chiral (parity-violating) gauge theories can be inconsistent due to anomalies. This happens when there is a quantum mechanical breakdown of the gauge symmetry, which is induced by certain one-loop Feynman diagrams. (Sometimes one also considers breaking of global symmetries by anomalies, which does not imply an inconsistency. That is not what we are interested in here.) In the case of four dimensions, the relevant diagrams are triangles, with the chiral fields going around the loop and three gauge fields attached as external lines. In the case of the standard model, the quarks and leptons are chiral and contribute to a variety of possible anomalies. Fortunately, the standard model has just the right particle content so that all of the gauge anomalies cancel. If one omits the quark or lepton contributions, it does not work.

In the case of ten-dimensional chiral gauge theories, the potentially anomalous Feynman diagrams are hexagons, with six external gauge fields. The anomalies can be attributed to the massless fields, and therefore they can be analyzed in the low-energy effective field theory. There are several possible cases in ten dimensions:

$N = 1$ supersymmetric Yang–Mills theory. This theory has anomalies for every choice of gauge group.

Type I supergravity. This theory has gravitational anomalies.

Type IIA supergravity. This theory is nonchiral, and therefore it is trivially anomaly-free.

Type IIB supergravity. This theory has three chiral fields, each of which contributes to several kinds of gravitational anomalies. However, when their contributions are combined, the anomalies all cancel. (This result was obtained by Alvarez-Gaume and Witten, 1983.)

Type I supergravity coupled to super Yang–Mills. This theory has both gauge and gravitational anomalies for every choice of Yang–Mills gauge group except SO(32) and $E_8 \times E_8$. For these two choices, all the anomalies cancel. (This result was obtained by Green and Schwarz, 1984a.)

As we mentioned earlier, at the classical level one can define type I superstring theory for any orthogonal or symplectic gauge group. Now we see that at the quantum level, the only choice that is consistent is SO(32). For any other choice there are fatal anomalies. The term SO(32) is used here somewhat imprecisely. There are several different Lie groups that have the same Lie algebra. It turns out that the particular Lie group that is appropriate is Spin(32)/$\mathbb{Z}_2$. It contains one spinor conjugacy class in addition to the adjoint conjugacy class.

F. Heterotic Strings

The two Lie groups that are singled out, $E_8 \times E_8$ and Spin(32)/$\mathbb{Z}_2$, have several properties in common. Each of them has dimension = 496 and rank = 16. Moreover, their weight lattices correspond to the only two even self-dual lattices in 16 dimensions. This last fact was the crucial clue that led Gross, Harvey, Martinec, and Rohm (1985) to the discovery of the heterotic string soon after the anomaly cancellation result. One hint is the relation 10 + 16 = 26. The construction of the heterotic string uses the $d = 26$ bosonic string for the left-movers and the $d = 10$ superstring for the right-movers. The 16 extra left-moving dimensions are associated to an even self-dual 16-dimensional lattice. In this way one builds in the SO(32) or $E_8 \times E_8$ gauge symmetry.
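The coincidence of dimension and rank noted above is easy to verify. The sketch below uses the standard formulas $\dim \mathrm{SO}(n) = n(n-1)/2$ and $\operatorname{rank} \mathrm{SO}(n) = \lfloor n/2 \rfloor$, together with the known values $\dim E_8 = 248$ and $\operatorname{rank} E_8 = 8$; none of these formulas is stated in the text itself, and the names are illustrative.

```python
def so_dimension(n):
    """Dimension of the Lie algebra so(n): number of independent antisymmetric n x n matrices."""
    return n * (n - 1) // 2

def so_rank(n):
    """Rank of so(n)."""
    return n // 2

E8_DIMENSION, E8_RANK = 248, 8  # standard values for the exceptional group E8

groups = {
    "SO(32)": (so_dimension(32), so_rank(32)),
    "E8 x E8": (2 * E8_DIMENSION, 2 * E8_RANK),
}
for name, (dim, rank) in groups.items():
    print(name, dim, rank)  # both lines show 496 16

# The dimension count behind the heterotic construction: 10 + 16 = 26.
print(10 + so_rank(32))  # 26
```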

Thus, to recapitulate, by 1985 we had five consistent superstring theories: type I [with gauge group SO(32)], the two type II theories, and the two heterotic theories. Each is a supersymmetric ten-dimensional theory. The perturbation theory was studied in considerable detail, and while some details may not have been completed, it was clear that each of the five theories has a well-defined, ultraviolet-finite perturbation expansion, satisfying all the usual consistency requirements (unitarity, analyticity, causality, etc.). This was pleasing, though it was somewhat mysterious why there should be five consistent quantum gravity theories. It took another ten years until we understood that these are actually five special quantum vacua of a unique underlying theory.

G. T Duality

T duality, an amazing result obtained in the late 1980s, relates one string theory with a circular compact dimension of radius $R$ to another string theory with a circular dimension of radius $1/R$ (in units $\ell_s = 1$). This is very profound, because it indicates a limitation of our usual notions of classical geometry. Strings see geometry differently from point particles. Let us examine how this is possible.

The key to understanding T duality is to consider the kinds of excitations that a string can have in the presence of a circular dimension. One class of excitations, called Kaluza–Klein excitations, is a very general feature of any quantum theory, whether or not based on strings. The idea is that in order for the wave function $e^{ipx}$ to be single valued, the momentum along the circle must be a multiple of $1/R$, $p = n/R$, where $n$ is an integer. From the lower-dimensional viewpoint this is interpreted as a contribution $(n/R)^2$ to the square of the mass.

There is a second type of excitation that is special to closed strings. Namely, a closed string can wind $m$ times around the circular dimension, getting caught up on the topology of the space, contributing an energy given by the string tension times the length of the string

$$E_m = 2\pi R \cdot m \cdot T. \tag{90}$$

Putting $T = \frac{1}{2\pi}$ (for $\ell_s = 1$), this is just $E_m = mR$.

The combined energy-squared of the Kaluza–Klein and winding-mode excitations is

$$E^2 = \left(\frac{n}{R}\right)^2 + (mR)^2 + \cdots, \tag{91}$$

where the dots represent string oscillator contributions. Under T duality

$$m \leftrightarrow n, \qquad R \leftrightarrow 1/R. \tag{92}$$

Together, these interchanges leave the energy invariant. This means that what is interpreted as a Kaluza–Klein excitation in one string theory is interpreted as a winding-mode excitation in the T-dual theory, and the two theories have radii $R$ and $1/R$, respectively. The two principal examples of T-dual pairs are the two type II theories and the two heterotic theories. In the latter case there are additional technicalities that explain how the two gauge groups are related. Basically, when the compactification on a circle to nine dimensions is carried out in each case, it is necessary to include effects that we haven't explained (called Wilson lines) to break the gauge groups to SO(16) × SO(16), which is a common subgroup of SO(32) and $E_8 \times E_8$.
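The invariance of Eq. (91) under the interchange of Eq. (92) can be confirmed with exact arithmetic. This is a minimal sketch in units $\ell_s = 1$ that ignores the oscillator contributions (the dots in Eq. (91)); the function names and the sample quantum numbers are illustrative.

```python
from fractions import Fraction

def energy_squared(n, m, R):
    """Kaluza-Klein plus winding part of E^2, Eq. (91), in units l_s = 1."""
    return (Fraction(n) / R) ** 2 + (m * R) ** 2

def t_dual(n, m, R):
    """Apply Eq. (92): swap momentum and winding numbers and invert the radius."""
    return m, n, 1 / R

state = (3, 2, Fraction(5, 7))  # n = 3 momentum units, m = 2 windings, R = 5/7
print(energy_squared(*state) == energy_squared(*t_dual(*state)))  # True
```

A pure momentum state with $(n, m) = (1, 0)$ at radius $R$ therefore has the same energy as a pure winding state with $(n, m) = (0, 1)$ at radius $1/R$.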

IV. FROM SUPERSTRINGS TO M-THEORY

Superstring theory is currently undergoing a period of rapid development in which important advances in understanding are being achieved. The focus in this section will be on explaining why there can be an eleven-dimensional vacuum, even though there are only ten dimensions in perturbative superstring theory. The nonperturbative extension of superstring theory that allows for an eleventh dimension has been named M-theory. The letter M is intended to be flexible in its interpretation. It could stand for magic, mystery, or meta to reflect our current state of incomplete understanding. Those who think that two-dimensional supermembranes (the M2-brane) are fundamental may regard M as standing for membrane. An approach called Matrix theory is another possibility. And, of course, some view M-theory as the mother of all theories.

In the first superstring revolution we identified five distinct superstring theories, each in ten dimensions. Three of them, the type I theory and the two heterotic theories, have $N = 1$ supersymmetry in the ten-dimensional sense. Since the minimal ten-dimensional spinor is simultaneously Majorana and Weyl, this corresponds to 16 conserved supercharges. The other two theories, called type IIA and type IIB, have $N = 2$ supersymmetry (32 supercharges). In the IIA case the two spinors have opposite handedness so that the spectrum is left-right symmetric (nonchiral). In the IIB case the two spinors have the same handedness and the spectrum is chiral.

In each of these five superstring theories it became clear, and was largely proved, that there are consistent perturbation expansions of on-shell scattering amplitudes. In four of the five cases (heterotic and type II) the fundamental strings are oriented and unbreakable. As a result, these theories have particularly simple perturbation expansions. Specifically, there is a unique Feynman diagram at each order of the loop expansion. The Feynman diagrams depict string world-sheets, and therefore they are two-dimensional surfaces. For these four theories the unique $L$-loop diagram is a closed orientable genus-$L$ Riemann surface, which can be visualized as a sphere with $L$ handles. External (incoming or outgoing) particles are represented by $N$ points (or “punctures”) on the Riemann surface. A given diagram represents a well-defined integral of dimension $6L + 2N - 6$. This integral has no ultraviolet divergences, even though the spectrum contains states of arbitrarily high spin (including a massless graviton). From the viewpoint of point-particle contributions, string and supersymmetry properties are responsible for incredible cancellations. Type I superstrings are unoriented and breakable. As a result, the perturbation expansion is more complicated for this theory, and various world-sheet diagrams at a given order have to be combined properly to cancel divergences and anomalies.

As we explained in the previous section, T duality relates two string theories when one spatial dimension forms a circle (denoted $S^1$). Then the ten-dimensional geometry is $\mathbb{R}^9 \times S^1$. T duality identifies this string compactification with one of a second string theory also on $\mathbb{R}^9 \times S^1$. If the radii of the circles in the two cases are denoted $R_1$ and $R_2$, then

$$R_1 R_2 = \alpha'. \tag{93}$$

Here $\alpha' = \ell_s^2$ is the universal Regge slope parameter, and $\ell_s$ is the fundamental string length scale (for both string theories). Note that T duality implies that shrinking the circle to zero in one theory corresponds to decompactification of the dual theory.

The type IIA and IIB theories are T dual, so compactifying the nonchiral IIA theory on a circle of radius $R$ and letting $R \to 0$ gives the chiral IIB theory in ten dimensions. This means, in particular, that they should not be regarded as distinct theories. The radius $R$ is actually the vacuum value of a scalar field, which arises as an internal component of the ten-dimensional metric tensor. Thus the type IIA and type IIB theories in ten dimensions are two limiting points in a continuous moduli space of quantum vacua. The two heterotic theories are also T dual, though (as we mentioned earlier) there are additional technical details in this case. T duality applied to the type I theory gives a dual description, which is sometimes called type I′ or IA.

A. M-Theory

In the 1970s and 1980s various supersymmetry and supergravity theories were constructed. In particular, supersymmetry representation theory showed that the largest possible spacetime dimension for a supergravity theory (with spins $\le 2$) is eleven. Eleven-dimensional supergravity, which has 32 conserved supercharges, was constructed by Cremmer, Julia, and Scherk (1978). It has three kinds of fields: the graviton field (with 44 polarizations), the gravitino field (with 128 polarizations), and a three-index gauge field $C_{\mu\nu\rho}$ (with 84 polarizations). These massless particles are referred to collectively as the supergraviton. Eleven-dimensional supergravity is nonrenormalizable, and thus it cannot be a fundamental theory. However, we now believe that it is a low-energy effective description of M-theory, which is a well-defined quantum theory. This means, in particular, that higher-dimension terms in the effective action for the supergravity fields have uniquely determined coefficients within the M-theory setting, even though they are formally infinite (and hence undetermined) within the supergravity context.
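The polarization counts quoted above (44, 128, and 84) can be reproduced by counting components of representations of the little group SO(9) of a massless particle in eleven dimensions. The counting rules used in this sketch are the standard ones and are an assumption layered on the text, which only states the results; in particular, the gravitino count assumes a 16-component real SO(9) spinor with the gamma-trace part removed.

```python
from math import comb

def graviton_polarizations(D):
    """Symmetric traceless rank-2 tensor of the little group SO(D-2)."""
    k = D - 2
    return k * (k + 1) // 2 - 1

def three_form_polarizations(D):
    """Antisymmetric rank-3 tensor of SO(D-2)."""
    return comb(D - 2, 3)

def gravitino_polarizations_11d():
    """Vector-spinor of SO(9) minus its gamma-trace (assumes a 16-component real spinor)."""
    vector, spinor = 9, 16
    return vector * spinor - spinor

bosons = graviton_polarizations(11) + three_form_polarizations(11)
fermions = gravitino_polarizations_11d()
print(graviton_polarizations(11), three_form_polarizations(11), fermions)  # 44 84 128
print(bosons == fermions)  # True
```

The equality of bosonic and fermionic counts (128 each) is what one expects for a supersymmetric multiplet.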

Intriguing connections between type IIA string theory and eleven-dimensional supergravity have been known for a long time, but the precise relationship was only explained in 1995. The field equations of eleven-dimensional supergravity admit a solution that describes a supermembrane. In other words, this solution has the property that the energy density is concentrated on a two-dimensional surface. A three-dimensional world-volume description of the dynamics of this supermembrane, quite analogous to the two-dimensional world-volume actions of superstrings [in the GS formalism (Green and Schwarz, 1984b)], was constructed by Bergshoeff, Sezgin, and Townsend (1987). The authors suggested that a consistent eleven-dimensional quantum theory might be defined in terms of this membrane, in analogy to string theories in ten dimensions. (Most experts now believe that M-theory cannot be defined as a supermembrane theory.) Another striking result was that a suitable dimensional reduction of this supermembrane gives the (previously known) type IIA superstring world-volume action. For many years these facts remained unexplained curiosities until they were reconsidered by Townsend (1995) and by Witten (1995). The conclusion is that type IIA superstring theory really does have a circular eleventh dimension in addition to the previously known ten spacetime dimensions. This fact was not recognized earlier because the appearance of the eleventh dimension is a nonperturbative phenomenon, not visible in perturbation theory.

To explain the relation between M-theory and type IIA string theory, a good approach is to identify the parameters that characterize each of them and to explain how they are related. Eleven-dimensional supergravity (and hence M-theory, too) has no dimensionless parameters. The only parameter is the eleven-dimensional Newton constant, which, raised to a suitable power ($-1/9$), gives the eleven-dimensional Planck mass $m_p$. When M-theory is compactified on a circle (so that the spacetime geometry is $\mathbb{R}^{10} \times S^1$) another parameter is the radius $R$ of the circle. Now consider the parameters of type IIA superstring theory. They are the string mass scale $m_s$, introduced earlier, and the dimensionless string coupling constant $g_s$.

We can identify compactified M-theory with type IIA superstring theory by making the following correspondences:

$$m_s^2 = 2\pi R\, m_p^3 \tag{94}$$

$$g_s = 2\pi R\, m_s. \tag{95}$$

Using these one can derive $g_s = (2\pi R\, m_p)^{3/2}$ and $m_s = g_s^{1/3} m_p$. The latter implies that the eleven-dimensional Planck length is shorter than the string length scale at weak coupling by a factor of $(g_s)^{1/3}$.
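The two derived relations follow from Eqs. (94) and (95) by eliminating $R$ or $m_s$, and they can be confirmed numerically. The sketch below starts from illustrative values of $R$ and $m_p$ (chosen arbitrarily, in units where $m_p = 1$), computes $m_s$ and $g_s$, and compares with the closed forms quoted above.

```python
import math

def iia_parameters(R, m_p):
    """Map M-theory data (circle radius R, Planck mass m_p) to the IIA pair (m_s, g_s)."""
    m_s = math.sqrt(2 * math.pi * R * m_p ** 3)  # Eq. (94)
    g_s = 2 * math.pi * R * m_s                  # Eq. (95)
    return m_s, g_s

R, m_p = 0.05, 1.0  # illustrative values only
m_s, g_s = iia_parameters(R, m_p)

print(math.isclose(g_s, (2 * math.pi * R * m_p) ** 1.5))  # True
print(math.isclose(m_s, g_s ** (1 / 3) * m_p))            # True
print(g_s ** (1 / 3))  # ratio of Planck length to string length, 1/m_p : 1/m_s
```

For small $R$ the coupling $g_s$ is small and the ratio in the last line is below 1, in line with the statement that the Planck length is shorter than the string length at weak coupling.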

Conventional string perturbation theory is an expansion in powers of $g_s$ at fixed $m_s$. Equation (95) shows that this is equivalent to an expansion about $R = 0$. In particular, the strong coupling limit of type IIA superstring theory corresponds to decompactification of the eleventh dimension, so in a sense M-theory is type IIA string theory at infinite coupling.* This explains why the eleventh dimension was not discovered in studies of string perturbation theory.

These relations encode some interesting facts. For one thing, the fundamental IIA string actually is an M2-brane of M-theory with one of its dimensions wrapped around the circular spatial dimension. Denoting the string and membrane tensions (energy per unit volume) by $T_{F1}$ and $T_{M2}$, one deduces that

$$T_{F1} = 2\pi R\, T_{M2}. \tag{96}$$

However, $T_{F1} = 2\pi m_s^2$ and $T_{M2} = 2\pi m_p^3$. Combining these relations gives Eq. (94).

*The $E_8 \times E_8$ heterotic string theory is also eleven-dimensional at strong coupling.

B. Type II p-branes

Type II superstring theories contain a variety of $p$-brane solutions that preserve half of the 32 supersymmetries. These are solutions in which the energy is concentrated on a $p$-dimensional spatial hypersurface. (The world volume has $p + 1$ dimensions.) The corresponding solutions of supergravity theories were constructed by Horowitz and Strominger (1991). A large class of these $p$-brane excitations are called D-branes (or D$p$-branes when we want to specify the dimension), whose tensions are given by

$$T_{Dp} = 2\pi m_s^{p+1}/g_s. \tag{97}$$

This dependence on the coupling constant is one of the characteristic features of a D-brane. Another characteristic feature of D-branes is that they carry a charge that couples to a gauge field in the RR sector of the theory (Polchinski, 1995). The particular RR gauge fields that occur imply that $p$ takes even values in the IIA theory and odd values in the IIB theory.

In particular, the D2-brane of the type IIA theory corresponds to the supermembrane of M-theory, but now in a background geometry in which one of the transverse dimensions is a circle. The tensions check, because [using Eqs. (94) and (95)]

$$T_{D2} = 2\pi m_s^3/g_s = 2\pi m_p^3 = T_{M2}. \tag{98}$$

The mass of the first Kaluza–Klein excitation of the eleven-dimensional supergraviton is $1/R$. Using Eq. (95), we see that this can be identified with the D0-brane. More identifications of this type arise when we consider the magnetic dual of the M-theory supermembrane, which is a five-brane, called the M5-brane.* Its tension is $T_{M5} = 2\pi m_p^6$. Wrapping one of its dimensions around the circle gives the D4-brane, with tension

$$T_{D4} = 2\pi R\, T_{M5} = 2\pi m_s^5/g_s. \tag{99}$$

If, on the other hand, the M5-brane is not wrapped around the circle, one obtains the NS5-brane of the IIA theory with tension

$$T_{NS5} = T_{M5} = 2\pi m_s^6/g_s^2. \tag{100}$$
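The tension identities in Eqs. (96) and (98) to (100), together with the D0-brane mass, all follow from Eqs. (94), (95), and (97). The following sketch checks them numerically for arbitrary illustrative values of $R$ and $m_p$; the function names are hypothetical labels for the formulas in the text.

```python
import math

TWO_PI = 2 * math.pi

def iia_parameters(R, m_p):
    """(m_s, g_s) from Eqs. (94) and (95)."""
    m_s = math.sqrt(TWO_PI * R * m_p ** 3)
    return m_s, TWO_PI * R * m_s

def T_F1(m_s):
    return TWO_PI * m_s ** 2               # fundamental string tension

def T_Dp(p, m_s, g_s):
    return TWO_PI * m_s ** (p + 1) / g_s   # Eq. (97)

def T_NS5(m_s, g_s):
    return TWO_PI * m_s ** 6 / g_s ** 2    # right-hand side of Eq. (100)

def T_M2(m_p):
    return TWO_PI * m_p ** 3

def T_M5(m_p):
    return TWO_PI * m_p ** 6

R, m_p = 0.3, 1.7  # illustrative values only
m_s, g_s = iia_parameters(R, m_p)

checks = {
    "F1 = M2 wrapped on the circle, Eq. (96)": (T_F1(m_s), TWO_PI * R * T_M2(m_p)),
    "D2 = unwrapped M2, Eq. (98)": (T_Dp(2, m_s, g_s), T_M2(m_p)),
    "D0 mass = first Kaluza-Klein mass 1/R": (T_Dp(0, m_s, g_s), 1 / R),
    "D4 = M5 wrapped on the circle, Eq. (99)": (T_Dp(4, m_s, g_s), TWO_PI * R * T_M5(m_p)),
    "NS5 = unwrapped M5, Eq. (100)": (T_NS5(m_s, g_s), T_M5(m_p)),
}
for name, (lhs, rhs) in checks.items():
    print(name, math.isclose(lhs, rhs))  # each line ends with True
```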

To summarize, type IIA superstring theory is M-theory compactified on a circle of radius $R = g_s \ell_s$. M-theory is believed to be a well-defined quantum theory in eleven dimensions, which is approximated at low energy by eleven-dimensional supergravity. Its excitations are the massless supergraviton, the M2-brane, and the M5-brane. These account both for the (perturbative) fundamental string of the IIA theory and for many of its nonperturbative excitations. The identities that we have presented here are exact, because they are protected by supersymmetry.

*In general, the magnetic dual of a $p$-brane in $d$ dimensions is a $(d - p - 4)$-brane.

C. Type IIB Superstring Theory

Type IIB superstring theory, which is the other maximally supersymmetric string theory with 32 conserved supercharges, is also ten-dimensional, but unlike the IIA theory its two supercharges have the same handedness. At low energy, type IIB superstring theory is approximated by type IIB supergravity, just as eleven-dimensional supergravity approximates M-theory. In each case the supergravity theory is only well-defined as a classical field theory, but still it can teach us a lot. For example, it can be used to construct $p$-brane solutions and compute their tensions. Even though such solutions are only approximate, supersymmetry considerations ensure that the tensions, which are related to the kinds of conserved charges the $p$-branes carry, are exact. Since the IIB spectrum contains massless chiral fields, one should check whether there are anomalies that break the gauge invariances: general coordinate invariance, local Lorentz invariance, and local supersymmetry. In fact, the UV finiteness of the string theory Feynman diagrams ensures that all anomalies must cancel, as was verified from a field theory viewpoint by Alvarez-Gaume and Witten (1983).

Type IIB superstring theory or supergravity contains two scalar fields, the dilaton $\phi$ and an axion $\chi$, which are conveniently combined in a complex field

$$\rho = \chi + ie^{-\phi}. \tag{101}$$

The supergravity approximation has an SL(2, R) symmetry that transforms this field nonlinearly:

$$\rho \to \frac{a\rho + b}{c\rho + d}, \tag{102}$$

where a, b, c, d are real numbers satisfying ad − bc = 1. However, in the quantum string theory this symmetry is broken to the discrete subgroup SL(2, Z) (Hull and Townsend, 1995), which means that a, b, c, d are restricted to be integers. Defining the vacuum value of the ρ field to be

$$\langle\rho\rangle = \frac{\theta}{2\pi} + \frac{i}{g_s}, \tag{103}$$

the SL(2, Z) symmetry transformation ρ → ρ + 1 implies that θ is an angular coordinate. Moreover, in the special case θ = 0, the symmetry transformation ρ → −1/ρ takes $g_s \to 1/g_s$. This symmetry, called S duality, implies that coupling constant $g_s$ is equivalent to coupling constant $1/g_s$, so that, in the case of type IIB superstring theory, the weak coupling expansion and the strong coupling expansion are identical. (An analogous S-duality transformation relates the type I superstring theory to the SO(32) heterotic string theory.)
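The action of the two SL(2, Z) generators on the vacuum value (103) can be checked numerically. The following is a minimal sketch, not part of the original lectures; the function names `vacuum_rho`, `sl2z_act`, `coupling`, and `theta_angle` are hypothetical, and the coupling value 0.25 is an arbitrary illustration.

```python
import math

def vacuum_rho(theta, g_s):
    """Vacuum value <rho> = theta/(2*pi) + i/g_s, as in Eq. (103)."""
    return complex(theta / (2.0 * math.pi), 1.0 / g_s)

def sl2z_act(a, b, c, d, rho):
    """Apply rho -> (a*rho + b)/(c*rho + d), Eq. (102), for integers with ad - bc = 1."""
    if any(not isinstance(x, int) for x in (a, b, c, d)):
        raise TypeError("a, b, c, d must be integers for SL(2, Z)")
    if a * d - b * c != 1:
        raise ValueError("need ad - bc = 1")
    return (a * rho + b) / (c * rho + d)

def coupling(rho):
    """Read off g_s from Im(rho) = 1/g_s."""
    return 1.0 / rho.imag

def theta_angle(rho):
    """Read off theta from Re(rho) = theta/(2*pi)."""
    return 2.0 * math.pi * rho.real

rho = vacuum_rho(theta=0.0, g_s=0.25)   # illustrative weak coupling
s_dual = sl2z_act(0, -1, 1, 0, rho)     # rho -> -1/rho
shifted = sl2z_act(1, 1, 0, 1, rho)     # rho -> rho + 1

print("g_s after rho -> -1/rho:", coupling(s_dual))
print("theta/(2*pi) after rho -> rho + 1:", theta_angle(shifted) / (2.0 * math.pi))
```

Under these assumptions the first transformation inverts the coupling at θ = 0, while the second shifts θ by 2π and leaves the coupling unchanged, which is the sense in which θ is an angular coordinate.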

Recall that the type IIA and type IIB superstring theories are T dual, meaning that if they are compactified on circles of radii $R_A$ and $R_B$ one obtains equivalent theories for the identification $R_A R_B = \ell_s^2$. Moreover, we saw that the type IIA theory is actually M-theory compactified on a circle. The latter fact encodes nonperturbative information. It turns out to be very useful to combine these two facts and to consider the duality between M-theory compactified on a torus ($R^9 \times T^2$) and type IIB superstring theory compactified on a circle ($R^9 \times S^1$).

A torus can be described as the complex plane modded out by the equivalence relations $z \sim z + w_1$ and $z \sim z + w_2$. Up to conformal equivalence, the periods $w_1$ and $w_2$ can be replaced by 1 and τ, with Im τ > 0. In this characterization τ and τ′ = (aτ + b)/(cτ + d), where a, b, c, d are integers satisfying ad − bc = 1, describe equivalent tori. Thus a torus is characterized by a modular parameter τ and an SL(2, Z) modular group. The natural, and correct, conjecture at this point is that one should identify the modular parameter τ of the M-theory torus with the parameter ρ that characterizes the type IIB vacuum (Schwarz, 1995; Aspinwall, 1996). Then the duality of M-theory and type IIB superstring theory gives a geometrical explanation of the nonperturbative S-duality symmetry of the IIB theory: the transformation ρ → −1/ρ, which sends $g_s \to 1/g_s$ in the IIB theory, corresponds to interchanging the two cycles of the torus in the M-theory description. To complete the story, we should relate the area of the M-theory torus ($A_M$) to the radius of the IIB theory circle ($R_B$). This is a simple consequence of formulas given above:

$$m_p^3 A_M = (2\pi R_B)^{-1}. \tag{104}$$

Thus the limit $R_B \to 0$, at fixed ρ, corresponds to decompactification of the M-theory torus, while preserving its shape. Conversely, the limit $A_M \to 0$ corresponds to decompactification of the IIB theory circle. The duality can be explored further by matching the various p-branes in nine dimensions that can be obtained from either the M-theory or the IIB theory viewpoints. When this is done, one finds that everything matches nicely and that one deduces various relations among tensions (Schwarz, 1996).

Another interesting fact about the IIB theory is that it contains an infinite family of strings labeled by a pair of integers (p, q) with no common divisor (Schwarz, 1995). The (1, 0) string can be identified as the fundamental IIB string, while the (0, 1) string is the D-string. From this viewpoint, a (p, q) string can be regarded as a bound state


of p fundamental strings and q D-strings (Witten, 1996). These strings have a very simple interpretation in the dual M-theory description. They correspond to an M2-brane with one of its cycles wrapped around a (p, q) cycle of the torus. The minimal length of such a cycle is proportional to |p + qτ|, and thus (using τ = ρ) one finds that the tension of a (p, q) string is given by

$$T_{p,q} = 2\pi\,|p + q\rho|\,m_s^2. \tag{105}$$
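Equation (105) is easy to evaluate for specific charges. The sketch below is an illustration only: the function name `pq_string_tension` is hypothetical, tensions are quoted in units where $m_s = 1$, and the vacuum value ρ = 10i (that is, θ = 0 and $g_s$ = 0.1) is an arbitrary choice.

```python
import math

def pq_string_tension(p, q, rho, m_s=1.0):
    """Tension T_{p,q} = 2*pi*|p + q*rho|*m_s**2 of Eq. (105).

    p and q must be coprime integers, matching the labeling in the text.
    """
    if math.gcd(p, q) != 1:
        raise ValueError("p and q must have no common divisor")
    return 2.0 * math.pi * abs(p + q * rho) * m_s ** 2

rho = complex(0.0, 10.0)  # theta = 0, g_s = 0.1 (illustrative)

t_fundamental = pq_string_tension(1, 0, rho)  # fundamental string
t_dstring = pq_string_tension(0, 1, rho)      # D-string
t_bound = pq_string_tension(1, 1, rho)        # (1, 1) string

print("T(1,0) =", t_fundamental)
print("T(0,1) =", t_dstring)
print("T(1,1) =", t_bound)
```

With these illustrative numbers the D-string is heavier than the fundamental string by a factor $1/g_s$, and the (1, 1) tension is smaller than the sum of the other two, consistent with its interpretation as a bound state.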

Imagine that you lived in the nine-dimensional world that is described equivalently as M-theory compactified on a torus or as the type IIB superstring theory compactified on a circle. Suppose, moreover, you had very high energy accelerators with which you were going to determine the “true” dimension of spacetime. Would you conclude that ten or eleven is the correct answer? If either $A_M$ or $R_B$ were very large in Planck units there would be a natural choice, of course. But how could you decide otherwise? The answer is that either viewpoint is equally valid. What determines which choice you make is which of the massless fields you regard as “internal” components of the metric tensor and which ones you regard as matter fields. Fields that are metric components in one description correspond to matter fields in the dual one.

D. The D3-Brane and N = 4 Gauge Theory

D-branes have a number of special properties, which make them especially interesting. By definition, they are branes on which strings can end (D stands for Dirichlet boundary conditions). The end of a string carries a charge, and the D-brane world-volume theory contains a U(1) gauge field that carries the associated flux. When n Dp-branes are coincident, or parallel and nearly coincident, the associated (p + 1)-dimensional world-volume theory is a U(n) gauge theory (Witten, 1996). The $n^2$ gauge bosons $A_\mu^{ij}$ and their supersymmetry partners arise as the ground states of oriented strings running from the ith Dp-brane to the jth Dp-brane. The diagonal elements, belonging to the Cartan subalgebra, are massless. The field $A_\mu^{ij}$ with i ≠ j has a mass proportional to the separation of the ith and jth branes.

The U(n) gauge theory associated with a stack of n Dp-branes has maximal supersymmetry (16 supercharges). The low-energy effective theory, when the brane separations are small compared to the string scale, is supersymmetric Yang–Mills theory. These theories can be constructed by dimensional reduction of ten-dimensional supersymmetric U(n) gauge theory to p + 1 dimensions. A case of particular interest, which we shall now focus on, is p = 3. A stack of n D3-branes in type IIB superstring theory has a decoupled N = 4, d = 4 U(n) gauge theory associated to it. This gauge theory has a number of special features. For one thing, due to boson–fermion cancellations, there are no UV divergences at any order of perturbation theory. The beta function β(g) is identically zero, which implies that the theory is scale invariant. In fact, N = 4, d = 4 gauge theories are conformally invariant. The conformal invariance combines with the supersymmetry to give a superconformal symmetry, which contains 32 fermionic generators. Another important property of N = 4, d = 4 gauge theories is an electric-magnetic duality, which extends to an SL(2, Z) group of dualities. Now consider the N = 4 U(n) gauge theory associated to a stack of n D3-branes in type IIB superstring theory. There is an obvious identification that turns out to be correct. Namely, the SL(2, Z) duality of the gauge theory is induced from that of the ambient type IIB superstring theory. The D3-branes themselves are invariant under SL(2, Z) transformations.

As we have said, a fundamental (1, 0) string can end on a D3-brane. But by applying a suitable SL(2, Z) transformation, this configuration is transformed to one in which a (p, q) string ends on the D3-brane. The charge on the end of this string describes a dyon with electric charge p and magnetic charge q, with respect to the appropriate gauge field. More generally, for a stack of n D3-branes, any pair can be connected by a (p, q) string. The mass is proportional to the length of the string times its tension, which we saw is proportional to |p + qρ|. In this way one sees that the electrically charged particles, described by fundamental fields, belong to infinite SL(2, Z) multiplets. The other states are nonperturbative excitations of the gauge theory. The field configurations that describe them preserve half of the supersymmetry. As a result their masses are given exactly by the considerations described above. An interesting question, whose answer was unknown until recently, is whether N = 4 gauge theories in four dimensions also admit nonperturbative excitations that preserve 1/4 of the supersymmetry. The answer turns out to be that they do, but only if n ≥ 3. This result has a nice dual description in terms of three-string junctions (Bergman, 1998).

E. Conclusion

In this section we have described some of the interesting advances in understanding superstring theory that have taken place in the past few years. The emphasis has been on the nonperturbative appearance of an eleventh dimension in type IIA superstring theory, as well as its implications when combined with superstring T dualities. In particular, we argued that there should be a consistent quantum vacuum, whose low-energy effective description is given by eleven-dimensional supergravity.

What we have described makes a convincing self-consistent picture, but it does not constitute a complete formulation of M-theory. In the past several years there have been some major advances in that direction, which we will


briefly mention here. The first, which goes by the name of Matrix Theory, formulates M-theory in flat eleven-dimensional spacetime in terms of the supersymmetric quantum mechanics of N D0-branes in the large N limit (Banks et al., 1997). Matrix Theory has passed all tests that have been carried out, some of which are very nontrivial. The construction has a nice generalization to describe compactification of M-theory on a torus $T^n$. However, it does not seem to be useful for n > 5, and other compactification manifolds are (at best) awkward to handle. Another shortcoming of this approach is that it treats the eleventh dimension differently from the other ones.

Another proposal relating superstring and M-theory backgrounds to large N limits of certain field theories has been put forward by Maldacena (1998) and made more precise by Gubser, Klebanov, and Polyakov (1998), and by Witten (1998). [For a review of this subject, see Aharony et al. (2000).] In this approach, there is a conjectured duality (i.e., equivalence) between a conformally invariant field theory (CFT) in d dimensions and type IIB superstring theory or M-theory on an Anti-de-Sitter space (AdS) in d + 1 dimensions. The remaining 9 − d or 10 − d dimensions form a compact space, the simplest cases being spheres. Three examples with unbroken supersymmetry are $AdS_5 \times S^5$, $AdS_4 \times S^7$, and $AdS_7 \times S^4$. This approach is sometimes referred to as AdS/CFT duality. This is an extremely active and very promising subject. It has already taught us a great deal about the large N behavior of various gauge theories. As usual, the easiest theories to study are ones with a lot of supersymmetry, but it appears that in this approach supersymmetry breaking is more accessible than in previous ones. For example, it might someday be possible to construct the QCD string in terms of a dual AdS gravity theory, and use it to carry out numerical calculations of the hadron spectrum. Indeed, there have already been some preliminary steps in this direction.

To sum up, I would say that despite all of the successes that have been achieved in advancing our understanding of superstring theory and M-theory, there clearly is still a long way to go. In particular, despite much effort and several imaginative proposals, we still do not have a convincing mechanism for ensuring the vanishing (or extreme smallness) of the cosmological constant for nonsupersymmetric vacua. Superstring theory is a field with very ambitious goals. The remarkable fact is that they still seem to be realistic. However, it may take a few more revolutions before they are attained.

ACKNOWLEDGMENTS

This article is based on lectures presented at the NATO Advanced Study Institute Techniques and Concepts of High Energy Physics, which took place in St. Croix, Virgin Islands during June 2000. The author’s research is supported in part by the U.S. Dept. of Energy under Grant No. DE-FG03-92-ER40701.

SEE ALSO THE FOLLOWING ARTICLES

FIELD THEORY AND THE STANDARD MODEL • GROUP THEORY, APPLIED • PERTURBATION THEORY • QUANTUM THEORY • RELATIVITY, GENERAL

BIBLIOGRAPHY

Aharony, O., Gubser, S. S., Maldacena, J., Ooguri, H., and Oz, Y. (2000). Phys. Rep. 323, 183.
Aspinwall, P. S. (1996). Nucl. Phys. Proc. Suppl. 46, 30, hep-th/9508154.
Alvarez-Gaume, L., and Witten, E. (1983). Nucl. Phys. B234, 269.
Banks, T., Fischler, W., Shenker, S., and Susskind, L. (1997). Phys. Rev. D55, 5112, hep-th/9610043.
Bergman, O. (1998). Nucl. Phys. B525, 104, hep-th/9712211.
Bergshoeff, E., Sezgin, E., and Townsend, P. K. (1987). Phys. Lett. B189, 75.
Candelas, P., Horowitz, G. T., Strominger, A., and Witten, E. (1985). Nucl. Phys. B258, 46.
Cremmer, E., Julia, B., and Scherk, J. (1978). Phys. Lett. 76B, 409.
Gliozzi, F., Scherk, J., and Olive, D. (1976). Phys. Lett. 65B, 282.
Goto, T. (1971). Prog. Theor. Phys. 46, 1560.
Green, M. B., and Schwarz, J. H. (1984a). Phys. Lett. 149B, 117.
Green, M. B., and Schwarz, J. H. (1984b). Phys. Lett. 136B, 367.
Green, M. B., and Schwarz, J. H. (1981). Nucl. Phys. B181, 502; Nucl. Phys. B198, (1982) 252; Phys. Lett. 109B, 444.
Green, M. B., Schwarz, J. H., and Witten, E. (1987). “Superstring Theory,” in 2 vols., Cambridge Univ. Press, U.K.
Gross, D. J., Harvey, J. A., Martinec, E., and Rohm, R. (1985). Phys. Rev. Lett. 54, 502.
Gubser, S. S., Klebanov, I. R., and Polyakov, A. M. (1998). Phys. Lett. B428, 105, hep-th/9802109.
Horowitz, G. T., and Strominger, A. (1991). Nucl. Phys. B360, 197.
Hull, C., and Townsend, P. (1995). Nucl. Phys. B438, 109, hep-th/9410167.
Maldacena, J. (1998). Adv. Theor. Math. Phys. 2, 231, hep-th/9711200.
Nambu, Y. (1970). Notes prepared for the Copenhagen High Energy Symposium.
Neveu, A., and Schwarz, J. H. (1971). Nucl. Phys. B31, 86.
Polchinski, J. (1995). Phys. Rev. Lett. 75, 4724, hep-th/9510017.
Polchinski, J. (1998). “String Theory,” in 2 vols., Cambridge Univ. Press, U.K.
Polyakov, A. M. (1981). Phys. Lett. 103B, 207.
Ramond, P. (1971). Phys. Rev. D3, 2415.
Scherk, J., and Schwarz, J. H. (1974). Nucl. Phys. B81, 118.
Schwarz, J. H. (1995). Phys. Lett. B360, 13, Erratum: Phys. Lett. B364, 252, hep-th/9508143.
Schwarz, J. H. (1996). Phys. Lett. B367, 97, hep-th/9510086.
Townsend, P. K. (1995). Phys. Lett. B350, 184, hep-th/9501068.
Virasoro, M. (1970). Phys. Rev. D1, 2933.
Veneziano, G. (1968). Nuovo Cim. 57A, 190.
Witten, E. (1995). Nucl. Phys. B443, 85, hep-th/9503124.
Witten, E. (1996). Nucl. Phys. B460, 335, hep-th/9510135.
Witten, E. (1998). Adv. Theor. Math. Phys. 2, 253, hep-th/9802150.
Yoneya, T. (1974). Prog. Theor. Phys. 51, 1907.

Page 355: Encyclopedia of Physical Science and Technology - Classical Physics

P1: GTY/MBQ P2: GRB Final pages

Encyclopedia of Physical Science and Technology EN016J-96 July 31, 2001 17:27

Thermodynamics
Stanley I. Sandler
University of Delaware

I. Thermodynamic Systems and Properties
II. Mass and Energy Flows and the Equilibrium State
III. Laws of Thermodynamics
IV. Criteria for Equilibrium and Stability
V. Pure Component Properties
VI. Phase Equilibrium in One-Component Systems
VII. Thermodynamics of Mixtures and Phase Equilibrium
VIII. Mixture Phase Equilibrium Calculations
IX. Chemical Equilibrium
X. Electrolyte Solutions
XI. Coupled Reactions

GLOSSARY

Activity coefficient A measure of the extent to which the fugacity of a species in a mixture departs from ideal mixture or ideal Henry’s law behavior.

Equilibrium state A state in which there is no measurable change of properties and no flows.

Excess property The difference between the property in a mixture and that for an ideal mixture at the same temperature, pressure, and composition.

Homogeneous system A system of uniform properties.

Ideal mixture A mixture in which there is no change in volume, internal energy, or enthalpy of forming a mixture from its pure components at constant pressure at all temperatures and compositions.

Intensive property (or state variable) A property of a system that is independent of the mass of the system.

Multiphase system A heterogeneous system consisting of several phases, each of which is homogeneous.

Partial molar property The amount by which an extensive property of the system increases on the addition of an infinitesimal amount of a substance at constant temperature and pressure, expressed on a molar basis.

CHEMICAL THERMODYNAMICS is a science that is both simple and elegant and can be used to describe a large variety of physical and chemical phenomena at or near equilibrium. The basis of thermodynamics is a small set of laws based on experimental observation. These general


laws, combined with constitutive relations (that is, relations that describe how properties, for example the density, of a substance depend on the state of the system such as its temperature and pressure), allow scientists and engineers to calculate the work and heat flows accompanying a change of state and to identify the equilibrium state.

I. THERMODYNAMIC SYSTEMS AND PROPERTIES

Thermodynamics is the study of changes that occur in some part of the universe we designate as the system; everything else is the surroundings. A real or imagined boundary may separate the system from its surroundings. A collection of properties such as temperature, pressure, composition, density, refractive index, and other properties to be discussed later characterize the thermodynamic state of a system. The state of aggregation of the system (that is, whether it is a gas, liquid, or solid) is referred to as its phase. A system may be composed of more than one phase, in which case it is a heterogeneous system; a homogeneous system consists of only a single phase. Of most interest in thermodynamics are the changes that occur with a change in temperature, state of aggregation, composition (due to chemical reaction), and/or energy of the system.

Any element of matter contains three types of energy. First is its kinetic energy, which depends on its velocity and is given by $\tfrac{1}{2}mv^2$, where m is the mass and v is its center-of-mass velocity (though there may be an additional contribution due to rotational motion that we will not consider). A second contribution is the potential energy, denoted by mφ and due to gravity or electric and magnetic fields. The third, and generally the most important in thermodynamics, is the internal energy U (or internal energy per unit mass $\hat{U}$), which depends on the temperature, state of aggregation, and chemical composition of the substance. In thermodynamics, one is interested in changes in internal energy between two states of the system. For changes of state that do not involve chemical reaction, a reference state of zero internal energy can be chosen arbitrarily. However, if chemical reactions do occur, the reference state for the calculation of internal energies and other properties of each substance in the reaction must be chosen in such a way that the calculated changes on reaction equal the measured values.

There are many mechanisms by which the properties of a system can change. The mass of a system can change if mass flows into or out of the system across the system boundaries. Concentrations can change as a result of mass flows, volume changes, or chemical reaction. The energy of a system can change as a result of a number of different processes. As mass flows across the system boundary, each element of mass carries its properties, such as its internal and kinetic energy. Heat (thermal energy) can cross the system boundary by direct contact (conduction and convection) or by radiation. Work (mechanical energy) can be done on a system by compressing the system boundaries or by a drive shaft that crosses the system boundaries (as in a turbine or motor), or energy can be added as electrical energy (in a battery or electrochemical cell). Conversely, a system can do work on its surroundings by any of these mechanisms.

A system that does not exchange mass with its surroundings is said to be closed. A system that does not exchange thermal energy with its surroundings is referred to as an adiabatic system. A system that is of constant volume, adiabatic, and closed is called an isolated system. A system whose properties are the same throughout is referred to as a uniform system.

It is useful to distinguish between two types of system properties. Temperature, pressure, refractive index, and density are examples of intensive properties, that is, properties that do not depend on the size or extent of the system. Mass, volume, and total internal energy are examples of extensive properties, that is, properties that depend on the total size of the system. Extensive properties can be converted to intensive properties by dividing by the total mass or number of moles in the system. Volume per unit mass (reciprocal of density) and internal energy per mole are examples of intensive properties. Intensive properties are also known as state variables. Intensive variables per unit mass will be denoted with a caret (as in $\hat{V}$, to denote volume per unit mass), while those on a per mole basis are given an underbar (as in $\underline{U}$, to denote internal energy per mole). Also, X, Y, and Z will be used to indicate state properties such as $\underline{U}$ and $\underline{V}$, and T and P. A characteristic of a state property that is central to thermodynamic analyses is that its numerical value depends only on the state, not on the path used to get to that state. Consequently, in computing the change in value of a state property between two states, any convenient path between those states may be used, instead of the actual path.

An important experimental observation is that the specification of two independent state properties of a closed, uniform, one-component system completely fixes the values of the other state properties. For example, if two systems of the same substance in the same state of aggregation are at the same temperature and at the same pressure, all other state properties of the two systems, such as density, volume per unit mass, refractive index, internal energy per unit mass, and other properties that will be introduced shortly, will also be identical. To fix the size of the system,


one must also specify the value of one extensive variable (e.g., total mass or total volume).

II. MASS AND ENERGY FLOWS AND THE EQUILIBRIUM STATE

Flows into or out of a system can be of two types. One is a forced flow, as when a pump or other device creates a continual mechanical, thermal, or chemical driving force that results in a flow of mass or energy across the boundary of a system. The other type of flow, which we refer to as a natural flow, occurs into or out of a system as a result of an initial difference of some property between the system and its surroundings that in time will dissipate as a result of the flow. For example, if two metal blocks of different temperatures are put in contact, a flow of heat will occur from the block of higher temperature to the one of lower temperature until an equilibrium state is reached in which both blocks have the same temperature.

An important observation is that a closed isolated system, if initially nonuniform, will eventually reach a time-invariant state that is uniform (homogeneous system) or composed of several phases, each of which is of uniform properties. Such a state of time-invariant uniformity is the equilibrium state. Systems open to natural flows will also, in time, come to equilibrium. However, a system subjected to a continuous forced flow may in time come to a time-invariant, nonuniform steady state.

The methods of thermodynamics are used to identify, describe, and sometimes predict equilibrium states. These same methods can also be used to describe nonequilibrium and steady states provided that at each point in space and time the same relations between the state properties exist as they do in equilibrium. This implies that the internal relaxation times in the fluid must be fast compared to the time scales for changes imposed upon the system.

III. LAWS OF THERMODYNAMICS

There are four laws or experimental observations on which thermodynamics is based, though they are not always referred to as such. The first observation is that in all transformations, or changes of state, total mass is conserved (note that this need not be true in nuclear reactions, but these will not be considered here). The second observation, the first law of thermodynamics, is that in all transformations (again, except nuclear reactions) total energy is conserved. This has been known since the experiments of J. P. Joule over the period from 1837 to 1847.

The next observation, which leads to the second law, is that all systems not subject to forced flows or imposed gradients (of temperature, pressure, concentration, velocity, etc.) will eventually evolve to a state of thermodynamic equilibrium. Also, systems in stable equilibrium states will not spontaneously change into a nonequilibrium state. For example, an isolated block of metal with a temperature gradient will evolve to a state of uniform temperature, but not vice versa. The third law of thermodynamics is of a different character than the first two and is mentioned later.

A. Mass Balance

After choosing a system, one can write balance equations to encompass the experimental observations above. Chemists and physicists are generally interested in the application of the laws of thermodynamics to a change of state in closed systems, while engineers are frequently interested in open systems. For generality, the equations for an open, time-varying system will be written here. The mass balance for the one-component system schematically shown in Fig. 1 is

$$\frac{dM}{dt} = \sum_{j=1}^{N} (\dot{M})_j \tag{1}$$

where M is the total mass of the system at time t, and $(\dot{M})_j$ is the mass flow rate at the jth entry port into the system. For a mixture of C components, the total mass is the sum of the masses of each species i, $M = \sum_{i=1}^{C} M_i$ and $(\dot{M})_j = \sum_{i=1}^{C} (\dot{M}_i)_j$, where $(\dot{M}_i)_j$ is the flow rate of species i at the jth entry point. (Note that the mass balance could also be written on a molar basis; however, since the total number of moles and the number of moles of each species are not conserved on a chemical reaction, that form of the equation is more complicated.)
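The mass balance of Eq. (1) can be integrated numerically once the port flow rates are specified. The sketch below is a minimal illustration, not a method from the article: the function name `integrate_mass_balance`, the forward-Euler time stepping, and the tank with one inlet and one outlet are all assumptions made for the example. Flows into the system are taken as positive.

```python
def integrate_mass_balance(m0, port_flow_rates, t_end, dt=0.01):
    """Integrate dM/dt = sum_j (Mdot)_j with forward Euler steps.

    m0              -- initial mass of the system (kg)
    port_flow_rates -- list of callables t -> mass flow rate (kg/s),
                       positive for flow into the system
    t_end, dt       -- final time and time step (s)
    """
    n_steps = round(t_end / dt)
    mass = m0
    for k in range(n_steps):
        t = k * dt
        # Right-hand side of Eq. (1): sum over all entry ports.
        mass += dt * sum(rate(t) for rate in port_flow_rates)
    return mass

# Hypothetical tank: 2.0 kg/s entering at one port, 0.5 kg/s leaving at another.
inflow = lambda t: 2.0
outflow = lambda t: -0.5

final_mass = integrate_mass_balance(100.0, [inflow, outflow], t_end=10.0)
print("mass after 10 s (kg):", final_mass)
```

For constant flow rates the result can be checked by hand: the net rate is 1.5 kg/s, so the mass should rise by about 15 kg over 10 s.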

FIGURE 1 A schematic diagram of a system open to the flows of mass, heat, and work.


B. First Law

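Using the sign convention that any flow that increases the energy of the system is positive, the energy balance for an open system is

$$\frac{d}{dt}\left[M\left(\hat{U} + \frac{v^2}{2} + \phi\right)\right] = \sum_{j=1}^{N}\left[\dot{M}\left(\hat{H} + \frac{v^2}{2} + \phi\right)\right]_j + \dot{W} + \dot{Q} - P\frac{dV}{dt} \tag{2}$$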

The term on the left is the rate of change of the total energy of the system written as a product of the mass of the system and the energy per unit mass. This includes the internal energy $\hat{U}$, the kinetic energy $v^2/2$, and the potential energy φ. The first term on the right accounts for the fact that each element of mass entering or leaving the system carries with it its specific enthalpy, $\hat{H} = \hat{U} + P\hat{V}$, the sum of the specific internal energy and energy due to the product of the specific volume and the pressure at the entry port. This term is summed over all entry ports. The remaining terms are the rate at which work is done on the system, $\dot{W}$, by mechanisms that do not involve a change of the system boundaries, referred to as shaft work; the rate at which heat or thermal energy enters the system, $\dot{Q}$; and the rate at which work is done on the system by compression or expansion of the system boundaries. A version of this equation that explicitly includes different species in multicomponent mixtures will be considered later. Also, the equation above assumes a constant pressure at the system boundary. If this is not the case, the last term is replaced by an integral over the surface of the system.
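As a rough illustration of how Eq. (2) is used, the sketch below specializes it to a steady state (no accumulation, fixed boundaries) with one inlet and one outlet carrying the same mass flow rate, and with kinetic and potential energy changes neglected. Under those assumptions the balance reduces to $0 = \dot{M}(\hat{H}_{in} - \hat{H}_{out}) + \dot{W} + \dot{Q}$. The function name `steady_shaft_work` and the numerical values are hypothetical, chosen only to exercise the sign convention.

```python
def steady_shaft_work(m_dot, h_in, h_out, q_dot=0.0):
    """Shaft work rate W (J/s) from the steady-state form of Eq. (2).

    Assumes one inlet and one outlet with equal mass flow rate m_dot (kg/s),
    negligible kinetic and potential energy changes, and fixed boundaries.
    h_in, h_out -- specific enthalpies (J/kg) at the inlet and outlet
    q_dot       -- heat flow rate into the system (J/s)
    Positive W means work is done ON the system.
    """
    # 0 = m_dot*(h_in - h_out) + W + q_dot  ->  solve for W
    return -m_dot * (h_in - h_out) - q_dot

# Hypothetical adiabatic turbine: the stream leaves with lower enthalpy.
w_dot = steady_shaft_work(m_dot=2.0, h_in=3.0e6, h_out=2.5e6)
print("shaft work rate (W):", w_dot)
```

With these illustrative numbers the result is negative, meaning that the system delivers work to its surroundings, as expected for a turbine under the stated sign convention.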

C. Second Law

To complete the formulation of thermodynamics, a balance equation is needed for another state property of the system that accounts for such experimental observations as: (1) isolated systems evolve to a state of equilibrium and not in the opposite direction, and (2) while mechanical (kinetic and potential) energy can be completely converted into heat, thermal energy can only partially be converted into mechanical energy, the rest remaining as thermal energy of a lower temperature.

Because mass, energy, and momentum are the only conserved quantities and the momentum balance is of little use in thermodynamics, the additional balance equation will be for a nonconserved property, that is, a property that can be created or destroyed in a change of state.

There are many formulations of the second law of thermodynamics to describe these observations. The one that will be used here states, by postulate, that there is a state function called the entropy, denoted by the symbol S (and $\hat{S}$ for entropy per unit mass), with a rate of change given by:

$$\frac{d(M\hat{S})}{dt} = \sum_{j=1}^{N} (\dot{M})_j \hat{S}_j + \frac{\dot{Q}}{T} + \dot{S}_{gen} \tag{3}$$

where $\dot{S}_{gen}$, which is greater than or equal to zero, is the rate of entropy generation in a process due to nonuniformities, gradients, and irreversibilities in the system. It is found that $\dot{S}_{gen} = 0$ in a system at equilibrium without any internal flows, and that $\dot{S}_{gen}$ is greater than zero when such flows occur. The fact that $\dot{S}_{gen} \geq 0$ and cannot be less than zero encompasses the experimental observations above, as well as many others; indeed, $\dot{S}_{gen} \geq 0$ is the essence of the second law of thermodynamics. The third law of thermodynamics states that the entropy of all substances in the perfect crystalline state is zero at the absolute zero of temperature. This law is the basis for calculating absolute values of the entropy.
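The sign of the entropy generation can be illustrated with the two metal blocks mentioned earlier. The sketch below is an assumption-laden example rather than a result from the article: it treats each block as having a constant heat capacity C (so that its entropy change between temperatures $T_i$ and $T_f$ is $C \ln(T_f/T_i)$), and it takes the pair of blocks to be closed and adiabatic, so that the total entropy change equals the entropy generated. The function name `equilibrate_blocks` and the numbers are hypothetical.

```python
import math

def equilibrate_blocks(c1, t1, c2, t2):
    """Final temperature and total entropy generated when two blocks
    with constant heat capacities c1, c2 (J/K) and initial temperatures
    t1, t2 (K) are brought into contact inside an adiabatic enclosure.
    """
    # Energy balance (first law): c1*(tf - t1) + c2*(tf - t2) = 0
    t_final = (c1 * t1 + c2 * t2) / (c1 + c2)
    # Entropy balance (second law) with no heat or mass flows:
    # the total entropy change is the entropy generated.
    s_gen = c1 * math.log(t_final / t1) + c2 * math.log(t_final / t2)
    return t_final, s_gen

t_final, s_gen = equilibrate_blocks(1000.0, 400.0, 1000.0, 300.0)
print("final temperature (K):", t_final)
print("entropy generated (J/K):", s_gen)
```

Under these assumptions the generated entropy is positive whenever the initial temperatures differ and vanishes when they are equal, in line with $\dot{S}_{gen} \geq 0$.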

IV. CRITERIA FOR EQUILIBRIUM AND STABILITY

Consider a system that is closed (all $\dot{M} = 0$), adiabatic ($\dot{Q} = 0$), of constant volume (dV/dt = 0), without work flows ($\dot{W} = 0$), and stationary (so that there are no changes in kinetic or potential energy). The mass balance, first and second law equations for this system are

$$\frac{dM}{dt} = 0; \qquad M\frac{d\hat{U}}{dt} = 0; \qquad M\frac{d\hat{S}}{dt} = \dot{S}_{gen} \geq 0 \tag{4}$$

The first equation (mass balance) shows that the total mass of this system is constant, and the energy balance (first law) shows that the internal energy per unit mass is constant. The second law (entropy balance) states that the entropy of the system will increase until the system reaches the equilibrium state in which there are no internal flows so that $\dot{S}_{gen} = 0$, and $d\hat{S}/dt = 0$; that is, the entropy per unit mass is constant. Now, since $\hat{S}$ is increasing on the approach to equilibrium, and constant at equilibrium, it follows that the criterion for equilibrium is

$$\hat{S} = \text{maximum for a system of constant } M, U, \text{ and } V \tag{5a}$$

Mathematically, the equilibrium state is found by observing that for any differential change,

$$d\hat{S} = 0 \ \text{for a system of constant } M, U, \text{ and } V \quad \text{and} \quad d^2\hat{S} < 0 \tag{5b}$$


The first of these equations is used to identify a stationary state of the system, and the second ensures that the stationary state is a stable equilibrium state (that is, a state in which the entropy is a maximum subject to the constraints, and not a minimum).

Similar arguments can be used to identify the mathematical criteria for equilibrium and stability in systems subject to other constraints. Some results are

$$A = U - TS = \text{minimum}; \qquad dA = 0 \ \text{and} \ d^2A > 0 \quad \text{for a system of constant } M,\ T, \text{ and } V \tag{6}$$

and

$$G = H - TS = U + PV - TS = \text{minimum}; \qquad dG = 0 \ \text{and} \ d^2G > 0 \quad \text{for a system of constant } M,\ T, \text{ and } P \tag{7}$$

The equations above define the Helmholtz free energy A and the Gibbs free energy G.

From the first of the stability criteria above ($d^2S < 0$) one can derive that, for a stable equilibrium state to exist for a pure substance, the following criteria must be met:

$$C_V = \left(\frac{\partial \underline{U}}{\partial T}\right)_{\underline{V}} > 0 \quad \text{and} \quad \left(\frac{\partial P}{\partial \underline{V}}\right)_T < 0 \tag{8}$$

(In these equations, we have used an underbar to designate a molar property, and $C_V$ is the constant volume heat capacity.) If these criteria are not met, the state is not a stable one, and either another state of aggregation or a two-phase system (e.g., vapor + liquid) is the equilibrium state. The stability criteria for a multicomponent mixture are much more complicated, involving derivatives of the free energy function with respect to composition.

V. PURE COMPONENT PROPERTIES

A. Interrelationships Between State Variables

The first and second laws of thermodynamics are in terms of internal energy and entropy, though the properties that are easiest to measure are temperature and pressure. In order to determine how the properties of a pure substance change with changes in temperature and pressure, consider a stationary, closed system of constant mass without any shaft work. The first and second laws for such a system (on a molar basis) are

$$\frac{d\underline{U}}{dt} = \dot{Q} - P\frac{d\underline{V}}{dt} \quad \text{and} \quad \frac{d\underline{S}}{dt} = \frac{\dot{Q}}{T} + \dot{S}_{\mathrm{gen}} \tag{9}$$

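Our interest is in the change of properties between two equilibrium states and, since any convenient path can be used for the calculation, a reversible path is used so that $\dot{S}_{\mathrm{gen}} = 0$. Using this, and combining the two equations above, we obtain:

$$\frac{d\underline{U}}{dt} = T\frac{d\underline{S}}{dt} - P\frac{d\underline{V}}{dt} \tag{10}$$

usually written simply as $d\underline{U} = T\,d\underline{S} - P\,d\underline{V}$. By the chain rule of partial differentiation, one has

$$dX = \left(\frac{\partial X}{\partial Y}\right)_Z dY + \left(\frac{\partial X}{\partial Z}\right)_Y dZ \tag{11}$$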

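From this equation, we find that:

$$\left(\frac{\partial \underline{U}}{\partial \underline{S}}\right)_{\underline{V}} = T; \qquad \left(\frac{\partial \underline{U}}{\partial \underline{V}}\right)_{\underline{S}} = -P; \qquad \left(\frac{\partial \underline{S}}{\partial \underline{V}}\right)_{\underline{U}} = \frac{P}{T} \tag{12}$$

Two mathematical properties for the partial derivatives of interest here are

$$\left(\frac{\partial X}{\partial Y}\right)_Z = \frac{1}{(\partial Y/\partial X)_Z} \quad \text{and} \quad \left(\frac{\partial X}{\partial Y}\right)_Z = \left(\frac{\partial X}{\partial K}\right)_Z \left(\frac{\partial K}{\partial Y}\right)_Z \tag{13}$$

Using these equations together with Eq. (12) one obtains:

$$\left(\frac{\partial \underline{U}}{\partial \underline{S}}\right)_{\underline{V}} = T = \left(\frac{\partial \underline{U}}{\partial T}\right)_{\underline{V}} \left(\frac{\partial T}{\partial \underline{S}}\right)_{\underline{V}} = C_V \left(\frac{\partial T}{\partial \underline{S}}\right)_{\underline{V}}$$

so that

$$\left(\frac{\partial T}{\partial \underline{S}}\right)_{\underline{V}} = \frac{T}{C_V} \quad \text{or} \quad \left(\frac{\partial \underline{S}}{\partial T}\right)_{\underline{V}} = \frac{C_V}{T} \tag{14}$$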

B. Maxwell’s Relations

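A property of continuous mathematical functions, such as the thermodynamic properties here, is that mixed second derivatives are equal; that is,

$$\left.\frac{\partial}{\partial X}\right|_Y \left(\frac{\partial Z}{\partial Y}\right)_X = \left.\frac{\partial}{\partial Y}\right|_X \left(\frac{\partial Z}{\partial X}\right)_Y \tag{15}$$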

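Using this property with Eq. (10), one obtains the following Maxwell relations:

$$\left(\frac{\partial \underline{S}}{\partial \underline{V}}\right)_T = \left(\frac{\partial P}{\partial T}\right)_{\underline{V}}; \qquad \left(\frac{\partial \underline{S}}{\partial P}\right)_T = -\left(\frac{\partial \underline{V}}{\partial T}\right)_P;$$

$$\left(\frac{\partial T}{\partial P}\right)_{\underline{S}} = \left(\frac{\partial \underline{V}}{\partial \underline{S}}\right)_P; \qquad \left(\frac{\partial T}{\partial \underline{V}}\right)_{\underline{S}} = -\left(\frac{\partial P}{\partial \underline{S}}\right)_{\underline{V}} \tag{16}$$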

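Now, using the chain rule of partial differentiation and the Maxwell relations, we can write

$$d\underline{S} = \left(\frac{\partial \underline{S}}{\partial T}\right)_{\underline{V}} dT + \left(\frac{\partial \underline{S}}{\partial \underline{V}}\right)_T d\underline{V} = \frac{C_V}{T}\,dT + \left(\frac{\partial P}{\partial T}\right)_{\underline{V}} d\underline{V} \tag{17}$$

In a similar fashion, the following equations are obtained:

$$d\underline{S} = \frac{C_P}{T}\,dT - \left(\frac{\partial \underline{V}}{\partial T}\right)_P dP$$

$$d\underline{U} = C_V\,dT + \left[T\left(\frac{\partial P}{\partial T}\right)_{\underline{V}} - P\right] d\underline{V} \tag{18}$$

$$d\underline{H} = C_P\,dT + \left[\underline{V} - T\left(\frac{\partial \underline{V}}{\partial T}\right)_P\right] dP$$

where $C_P = (\partial \underline{H}/\partial T)_P$ is the constant pressure heat capacity.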

C. Equations of State

Two types of information are needed to use the equations above for calculating the changes in thermodynamic properties with a change of state. First is heat capacity data. This information is usually available for each component as a function of temperature for liquids and solids or for the ideal gas state (a gas at such low pressure that interactions between the molecules are of negligible importance). The second type of information needed is an interrelation between pressure, temperature, and specific volume, that is, a volumetric equation of state (EOS). Several examples are given below:

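Ideal gas EOS:

$$P\underline{V} = RT \quad \text{or} \quad Z(T,P) = \frac{P\underline{V}}{RT} = 1$$

van der Waals EOS:

$$P = \frac{RT}{\underline{V} - b} - \frac{a}{\underline{V}^2} \quad \text{or} \quad Z(T,P) = \frac{P\underline{V}}{RT} = \frac{\underline{V}}{\underline{V} - b} - \frac{a}{RT\,\underline{V}}$$

Peng–Robinson EOS:

$$P = \frac{RT}{\underline{V} - b} - \frac{a(T)}{\underline{V}(\underline{V} + b) + b(\underline{V} - b)}$$

Virial EOS:

$$Z(T,P) = \frac{P\underline{V}}{RT} = 1 + \frac{B(T)}{\underline{V}} + \frac{C(T)}{\underline{V}^2} + \cdots$$

Benedict–Webb–Rubin EOS:

$$\frac{P\underline{V}}{RT} = Z(T,P) = 1 + \left(B - \frac{A}{RT} - \frac{C}{RT^3}\right)\frac{1}{\underline{V}} + \left(b - \frac{a}{RT}\right)\frac{1}{\underline{V}^2} + \frac{a\alpha}{RT\,\underline{V}^5} + \frac{\beta}{RT^3\,\underline{V}^2}\left(1 + \frac{\gamma}{\underline{V}^2}\right)\exp\left(-\gamma/\underline{V}^2\right)$$

Many other volumetric equations of state have been proposed, including more complicated ones when high accuracy is needed. In these equations, a(T), B(T), and C(T) are functions of temperature; all other parameters are constants specific to each fluid.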
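As a minimal sketch of how such volumetric equations of state are evaluated, the Python functions below compute the pressure and the compressibility factor Z = PV/RT from the ideal gas, van der Waals, and Peng–Robinson forms. The function names are illustrative, and the van der Waals constants used for carbon dioxide in the usage lines are approximate handbook values supplied here as an assumption; they do not come from this article.

```python
R = 8.314462618  # gas constant, J/(mol K)


def pressure_ideal_gas(T, V):
    """Ideal gas pressure (Pa) at temperature T (K) and molar volume V (m^3/mol)."""
    return R * T / V


def pressure_van_der_waals(T, V, a, b):
    """van der Waals pressure (Pa); a in Pa m^6/mol^2, b in m^3/mol."""
    if V <= b:
        raise ValueError("molar volume must exceed the covolume b")
    return R * T / (V - b) - a / V**2


def pressure_peng_robinson(T, V, a_T, b):
    """Peng-Robinson pressure (Pa); a_T is the value of a(T) at this temperature."""
    if V <= b:
        raise ValueError("molar volume must exceed the covolume b")
    return R * T / (V - b) - a_T / (V * (V + b) + b * (V - b))


def compressibility(P, T, V):
    """Compressibility factor Z = P V / (R T)."""
    return P * V / (R * T)


# Usage with assumed, approximate van der Waals constants for CO2.
A_CO2 = 0.3640    # Pa m^6/mol^2 (assumed illustrative value)
B_CO2 = 4.267e-5  # m^3/mol (assumed illustrative value)
T, V = 300.0, 1.0e-3
Z_vdw = compressibility(pressure_van_der_waals(T, V, A_CO2, B_CO2), T, V)
Z_ideal = compressibility(pressure_ideal_gas(T, V), T, V)
```

With these inputs the van der Waals Z falls below the ideal gas value of one, reflecting the net attractive interactions at this density.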

The combination of heat capacity data, a volumetric equation of state, and Eqs. (17) and (18) allows the change in thermodynamic properties between any two states to be computed. However, again, a convenient path rather than the actual path is used for the calculation. For example, to compute the change in molar enthalpy between the states $(P_1, T_1)$ and $(P_2, T_2)$, the path followed is $(P_1, T_1) \to (P = 0, T_1) \to (P = 0, T_2) \to (P_2, T_2)$. In this way, the equation of state is used for steps 1 and 3, and the available ideal gas heat capacity is used in step 2:

$$\underline{H}(T_2, P_2) - \underline{H}(T_1, P_1) = \int_{P_1, T_1}^{P=0, T_1} \left[\underline{V} - T\left(\frac{\partial \underline{V}}{\partial T}\right)_P\right] dP + \int_{P=0, T_1}^{P=0, T_2} C_P^{IG}\,dT + \int_{P=0, T_2}^{P_2, T_2} \left[\underline{V} - T\left(\frac{\partial \underline{V}}{\partial T}\right)_P\right] dP \tag{19}$$

Similar equations are used to compute the change in other thermodynamic properties.

VI. PHASE EQUILIBRIUM IN ONE-COMPONENT SYSTEMS

A. Criterion for Phase Equilibrium

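For a one-component open system with no shaft work, the first and second law equations (on a molar basis) are

$$\frac{dU}{dt} = \dot{N}\underline{H} + \dot{Q} - P\frac{dV}{dt} \quad \text{and} \quad \frac{dS}{dt} = \dot{N}\underline{S} + \frac{\dot{Q}}{T} + \dot{S}_{\mathrm{gen}} \tag{20}$$

Again, to compute property changes consider a path on which $\dot{S}_{\mathrm{gen}} = 0$, to obtain:

$$\frac{dU}{dt} = T\frac{dS}{dt} - P\frac{dV}{dt} + \underline{G}\frac{dN}{dt}$$

or simply

$$dU = T\,dS - P\,dV + \underline{G}\,dN \tag{21}$$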

Analogous relations are obtained for other thermodynamic properties. For example,

$$dG = \left(\frac{\partial G}{\partial T}\right)_{P,N} dT + \left(\frac{\partial G}{\partial P}\right)_{T,N} dP + \left(\frac{\partial G}{\partial N}\right)_{T,P} dN = -S\,dT + V\,dP + \underline{G}\,dN \tag{22}$$

To obtain the criterion for phase equilibrium in a pure fluid, consider a closed system at constant temperature and pressure, consisting of two subsystems, I and II, with mass freely transferable between them. As the composite system is closed to external mass flows,

$$dN = dN^{I} + dN^{II} = 0 \quad \text{or} \quad dN^{II} = -dN^{I} \tag{23}$$

Because the temperature and pressure are fixed and are the same in both subsystems, the change in the Gibbs free energy of the combined system accompanying an interchange of mass is

$$dG = \underline{G}^{I}\,dN^{I} + \underline{G}^{II}\,dN^{II} \tag{24}$$

At equilibrium, G is a minimum [Eq. (7)], so that dG = 0 for all exchanges of mass. Therefore,

$$dG = 0 = \underline{G}^{I}\,dN^{I} + \underline{G}^{II}\,dN^{II} = \left(\underline{G}^{I} - \underline{G}^{II}\right) dN^{I}$$

This must be true for any value of $dN^{I}$, so that the condition for phase equilibrium is

$$\underline{G}^{I}(T, P) = \underline{G}^{II}(T, P) \quad \text{or equivalently} \quad f^{I}(T, P) = f^{II}(T, P) \tag{25}$$

where the fugacity, denoted by the symbol f, which is a function of temperature and pressure, is

$$\frac{f(T, P)}{P} = \exp\left[\frac{\underline{G}(T, P) - \underline{G}^{IG}(T, P)}{RT}\right] = \exp\left[\frac{1}{RT}\int_{P=0}^{P}\left(\underline{V} - \frac{RT}{P}\right) dP\right]$$

$$= \exp\left[\frac{1}{RT}\int_{\underline{V}=\infty}^{\underline{V}=ZRT/P}\left[\frac{RT}{\underline{V}} - P\right] d\underline{V} - \ln Z(T, P) + Z(T, P) - 1\right] \tag{26}$$

It is easily shown that Eq. (25) is the condition for equilibrium for composite systems subject to other constraints (e.g., closed systems at constant U and V or constant T and V, among others).

B. Calculation of Phase Equilibrium

Figure 2 shows isotherms (lines of constant temperature) on a pressure–volume plot computed using a typical equation of state of the van der Waals form. In this diagram, $T_1 < T_2 < T_3 < T_4 < T_5$. Note that at temperatures $T_1$ and $T_2$ there are regions where $(\partial P/\partial \underline{V})_T > 0$, which violates the stability criterion of Eq. (8). Consequently, two phases (a vapor and a liquid) will form in these regions. The thermodynamic properties of the coexisting states are found by requiring that the temperature, pressure, and fugacity be the same in both phases. Algorithms and computer codes for such calculations appear in the applied thermodynamics literature.

FIGURE 2 P–V–T plot for a typical cubic equation of state showing thermodynamically unstable regions (between points a and b). Point c is the critical point.

FIGURE 3 P–V–T plot for a cubic equation of state with the unstable region replaced with the vapor–liquid equilibrium coexistence region.

Figure 3 is a redrawn version of the previous figure replacing the unstable region with the dome-shaped two-phase coexistence region. The left side of the dome gives the liquid properties as a function of the state variables; the vapor properties are given by the right side. A tie line (horizontal line) of constant temperature and pressure connects the two equilibrium phases. The liquid and vapor properties become identical at the peak of the two-phase dome, referred to as the critical point, which is point c in Fig. 2. Mathematically, this is the point at which the equation of state has an inflection point, $(\partial P/\partial \underline{V})_T = (\partial^2 P/\partial \underline{V}^2)_T = 0$, and it is a unique point on a pure component phase diagram. The temperature, pressure, and molar volume at the critical point are referred to as the critical temperature, $T_c$, the critical pressure, $P_c$, and the critical volume, $\underline{V}_c$, respectively. These conditions are frequently used to determine the values of the parameters in an equation of state.
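The equal-fugacity calculation described above can be sketched for the van der Waals equation. The code below is an illustration under stated assumptions, not the algorithm of any particular reference: it works in reduced variables (Tr = T/Tc, Pr = P/Pc), which follow from applying the critical-point conditions to the van der Waals equation (a = 27R²Tc²/64Pc, b = RTc/8Pc), uses the closed-form fugacity coefficient obtained by inserting that equation into Eq. (26), ln(f/P) = Z − 1 − ln(Z − B) − A/Z, and updates the pressure by successive substitution. All function names are hypothetical.

```python
import math


def real_cubic_roots(c2, c1, c0):
    """Sorted real roots of z^3 + c2 z^2 + c1 z + c0 = 0 (Cardano / trigonometric form)."""
    shift = c2 / 3.0
    p = c1 - c2 * c2 / 3.0
    q = 2.0 * c2**3 / 27.0 - c2 * c1 / 3.0 + c0
    disc = (q / 2.0) ** 2 + (p / 3.0) ** 3
    if disc > 0.0:  # one real root
        s = math.sqrt(disc)
        cbrt = lambda v: math.copysign(abs(v) ** (1.0 / 3.0), v)
        return [cbrt(-q / 2.0 + s) + cbrt(-q / 2.0 - s) - shift]
    if p == 0.0:  # triple root
        return [-shift]
    r = 2.0 * math.sqrt(-p / 3.0)
    phi = math.acos(max(-1.0, min(1.0, 3.0 * q / (p * r))))
    return sorted(r * math.cos((phi - 2.0 * math.pi * k) / 3.0) - shift for k in range(3))


def ln_phi_vdw(Z, A, B):
    """ln(f/P) for the van der Waals EOS, with A = aP/(RT)^2 and B = bP/(RT)."""
    return Z - 1.0 - math.log(Z - B) - A / Z


def vdw_reduced_vapor_pressure(Tr, Pr_guess, tol=1e-10, max_iter=500):
    """Reduced vapor pressure at reduced temperature Tr < 1 by equating fugacities."""
    Pr = Pr_guess
    for _ in range(max_iter):
        A = 27.0 * Pr / (64.0 * Tr**2)
        B = Pr / (8.0 * Tr)
        # van der Waals cubic: Z^3 - (1 + B) Z^2 + A Z - A B = 0
        roots = [z for z in real_cubic_roots(-(1.0 + B), A, -A * B) if z > B]
        if len(roots) < 3:
            raise ValueError("pressure guess lies outside the three-root region")
        z_liq, z_vap = roots[0], roots[-1]
        # successive substitution: P_new = P * (f_liq / f_vap)
        Pr_new = Pr * math.exp(ln_phi_vdw(z_liq, A, B) - ln_phi_vdw(z_vap, A, B))
        if abs(Pr_new - Pr) < tol:
            return Pr_new, z_liq, z_vap
        Pr = Pr_new
    raise RuntimeError("vapor pressure iteration did not converge")


Pr_sat, Z_liq, Z_vap = vdw_reduced_vapor_pressure(0.9, 0.6)
```

The smallest and largest roots of the cubic are taken as the liquid and vapor; the middle root lies in the unstable region and is discarded. The initial guess must fall between the minimum and maximum of the van der Waals loop at the chosen temperature, otherwise only one root exists and the sketch raises an error.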

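When an equation of state is not available for a liquid, the fugacity is calculated from:

$$\frac{f(T, P)}{P} = \exp\left[\frac{1}{RT}\int_{P=0}^{P}\left(\underline{V} - \frac{RT}{P}\right) dP\right]$$

$$= \exp\left[\frac{1}{RT}\int_{P=0}^{P^{vap}(T)}\left(\underline{V}^{vap} - \frac{RT}{P}\right) dP + \frac{1}{RT}\int_{P^{vap}(T)}^{P}\left(\underline{V}^{liq} - \frac{RT}{P}\right) dP\right]$$

$$= \frac{f(T, P^{vap})}{P^{vap}(T)} \exp\left[\frac{1}{RT}\int_{P^{vap}(T)}^{P}\left(\underline{V}^{liq} - \frac{RT}{P}\right) dP\right]$$

$$= \frac{f(T, P^{vap})}{P^{vap}(T)}\,\frac{P^{vap}(T)}{P} \exp\left[\frac{1}{RT}\int_{P^{vap}(T)}^{P}\underline{V}^{liq}\,dP\right]$$

or

$$f(T, P) = P^{vap}(T)\,\frac{f(T, P^{vap})}{P^{vap}(T)} \exp\left[\frac{1}{RT}\int_{P^{vap}(T)}^{P}\underline{V}^{liq}\,dP\right] \tag{27}$$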

At low vapor and total pressures, this equation reduces to $f(T, P) = P^{vap}(T)$. At higher pressures, the value of the first correction term:

$$\frac{f(T, P^{vap})}{P^{vap}(T)} = \exp\left[\frac{1}{RT}\int_{P=0}^{P^{vap}(T)}\left(\underline{V}^{vap}(T, P) - \frac{RT}{P}\right) dP\right] \tag{28}$$

must be computed; note that this involves the equation of state only for the vapor. Finally, at very high pressures, the exponential term in Eq. (27), known as the Poynting correction, is computed using the liquid specific volume.
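To give a feel for the size of the Poynting correction in Eq. (27), the sketch below evaluates it under the added assumption that the liquid molar volume is independent of pressure, so the integral reduces to V_liq (P − Pvap)/RT. The water properties in the usage lines (vapor pressure near 3.17 kPa and molar volume near 18.07 cm³/mol at 298.15 K) are approximate values assumed for illustration, and the function names are hypothetical.

```python
import math

R = 8.314462618  # J/(mol K)


def poynting_factor(v_liq, P, P_vap, T):
    """exp[(1/RT) * integral of V_liq dP] for an incompressible liquid (assumption)."""
    return math.exp(v_liq * (P - P_vap) / (R * T))


def liquid_fugacity(P, T, P_vap, v_liq, phi_sat=1.0):
    """Eq. (27): f = Pvap * (f/P at saturation) * Poynting factor.

    phi_sat is f(T, Pvap)/Pvap from Eq. (28); 1.0 treats the saturated vapor as ideal.
    """
    return P_vap * phi_sat * poynting_factor(v_liq, P, P_vap, T)


# Liquid water at 298.15 K compressed to 100 bar (approximate, assumed property values).
T = 298.15
P_vap = 3170.0     # Pa
v_liq = 18.07e-6   # m^3/mol
correction = poynting_factor(v_liq, 1.0e7, P_vap, T)
f_low = liquid_fugacity(P_vap, T, P_vap, v_liq)
```

Even at 100 bar the correction for this example is only a few percent, which is why it is neglected at low total pressures.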

C. Clapeyron and Clausius–Clapeyron Equations

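At equilibrium between phases, the molar Gibbs free energy is the same in both phases, that is, $\underline{G}^{I}(T, P) = \underline{G}^{II}(T, P)$. For small changes in temperature, the corresponding change in the equilibrium pressure can be computed from:

$$d\underline{G}^{I}(T, P) = d\underline{G}^{II}(T, P)$$

$$\underline{V}^{I}\,dP - \underline{S}^{I}\,dT = \underline{V}^{II}\,dP - \underline{S}^{II}\,dT$$

or

$$\left(\frac{dP}{dT}\right)_{\underline{G}^{I} = \underline{G}^{II}} = \frac{\underline{S}^{II} - \underline{S}^{I}}{\underline{V}^{II} - \underline{V}^{I}} = \frac{1}{T}\left(\frac{\underline{H}^{II} - \underline{H}^{I}}{\underline{V}^{II} - \underline{V}^{I}}\right) = \frac{\Delta\underline{H}}{T\,\Delta\underline{V}} \tag{29}$$

which is the Clapeyron equation. This equation is applicable to vapor–liquid, solid–liquid, solid–vapor, and solid–solid phase transitions. In the case of low-pressure vapor–liquid equilibrium,

$$\Delta\underline{V} = \underline{V}^{vap} - \underline{V}^{liq} \approx \underline{V}^{vap} = \frac{RT}{P}$$

so that

$$\frac{d \ln P^{vap}}{dT} = \frac{\Delta\underline{H}^{vap}}{RT^2} \quad \text{and} \quad \ln\frac{P^{vap}(T_2)}{P^{vap}(T_1)} = \int_{T_1}^{T_2}\frac{\Delta\underline{H}^{vap}}{RT^2}\,dT \tag{30}$$

which is the Clausius–Clapeyron equation. For moderate ranges of temperature, where the heat of vaporization can be considered to be approximately constant, this becomes:

$$\ln\frac{P^{vap}(T_2)}{P^{vap}(T_1)} = -\frac{\Delta\underline{H}^{vap}}{R}\left(\frac{1}{T_2} - \frac{1}{T_1}\right) \tag{31a}$$

The simpler form of this equation,

$$\ln P^{vap}(T) = A - \frac{B}{T} \tag{31b}$$

is used as the basis for correlating vapor pressure data.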
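A short numerical sketch of Eqs. (31a) and (31b) follows. It extrapolates a vapor pressure from one known point with a constant heat of vaporization and also returns the equivalent constants A and B. The water data in the usage lines (normal boiling point 373.15 K at 101.325 kPa and a heat of vaporization of about 40.66 kJ/mol) are approximate values assumed for illustration; the function names are hypothetical.

```python
import math

R = 8.314462618  # J/(mol K)


def vapor_pressure_cc(T2, T1, P1, dH_vap):
    """Eq. (31a): vapor pressure at T2 from a known point (T1, P1), constant dH_vap (J/mol)."""
    return P1 * math.exp(-dH_vap / R * (1.0 / T2 - 1.0 / T1))


def correlation_constants(T1, P1, dH_vap):
    """Constants of Eq. (31b), ln Pvap = A - B/T, implied by one point and constant dH_vap."""
    B = dH_vap / R
    A = math.log(P1) + B / T1
    return A, B


# Water: estimate the vapor pressure at 363.15 K from the normal boiling point.
P_363 = vapor_pressure_cc(363.15, 373.15, 101.325, 40660.0)  # kPa
A, B = correlation_constants(373.15, 101.325, 40660.0)
```

Because the heat of vaporization actually varies with temperature, the estimate drifts from measured values as the temperature interval widens, which is one reason A and B are usually fitted to data rather than computed this way.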

VII. THERMODYNAMICS OF MIXTURES AND PHASE EQUILIBRIUM

A. Partial Molar Properties

The thermodynamic properties of a mixture are fixed once the values of two state variables (such as temperature and pressure) and the composition of the mixture are fixed. Composition can be specified by either the numbers of moles of all species or the mole fractions of all but one species (as the mole fractions must sum to one). Thus, for example, the change in the Gibbs free energy of a single-phase system of C components is

$$dG = \left(\frac{\partial G}{\partial T}\right)_{P,N} dT + \left(\frac{\partial G}{\partial P}\right)_{T,N} dP + \sum_{i=1}^{C}\left(\frac{\partial G}{\partial N_i}\right)_{T,P,N_{j \ne i}} dN_i = -S\,dT + V\,dP + \sum_{i=1}^{C}\bar{G}_i\,dN_i \tag{32}$$

In this equation, the notation of a partial molar property,

$$\bar{X}_i = \left(\frac{\partial X}{\partial N_i}\right)_{T,P,N_{j \ne i}} = \left(\frac{\partial (N\underline{X})}{\partial N_i}\right)_{T,P,N_{j \ne i}} \tag{33}$$

has been introduced. The partial molar property $\bar{X}_i$ is the amount by which the total system property, X, changes due to the addition of an infinitesimal amount of species i at constant temperature, constant pressure, and constant number of moles of all species except i (designated by $N_{j \ne i}$). A partial molar property is a function not only of species i, but of all species in the mixture and their compositions. Indeed, a major problem in applied thermodynamics is the determination of the partial molar properties.

From Eq. (32) and the first and second laws of thermodynamics, a number of other equations can be derived. Several are listed below:

$$dU = T\,dS - P\,dV + \sum_{i=1}^{C}\bar{G}_i\,dN_i$$

$$dH = T\,dS + V\,dP + \sum_{i=1}^{C}\bar{G}_i\,dN_i \tag{34}$$

$$dA = -S\,dT - P\,dV + \sum_{i=1}^{C}\bar{G}_i\,dN_i$$

Note that it is the partial molar Gibbs free energy that appears in each of these equations, which is an indication of its importance in thermodynamics. The partial molar Gibbs free energy of a species, $\bar{G}_i$, is also referred to as the chemical potential $\mu_i$. For simplicity of notation, $\bar{G}_i$ will be used here instead of the more commonly used $\mu_i$.

B. Criteria for Phase and Chemical Equilibrium in Mixtures

Extending the analysis of phase equilibrium used above for a pure fluid to a multicomponent, multiphase system, one obtains as the criterion for equilibrium that

$$\bar{G}_i^{I}(T, P, x^{I}) = \bar{G}_i^{II}(T, P, x^{II}) = \bar{G}_i^{III}(T, P, x^{III}) = \cdots \tag{35a}$$

or, equivalently,

$$\bar{f}_i^{I}(T, P, x^{I}) = \bar{f}_i^{II}(T, P, x^{II}) = \bar{f}_i^{III}(T, P, x^{III}) = \cdots \tag{35b}$$

where x is being used to indicate the vector of mole fractions of all species present. The fugacity of species i in a mixture, $\bar{f}_i$, will be discussed shortly.

Equilibrium in chemical reactions is another important area of chemical thermodynamics. The chemical reaction,

$$\alpha A + \beta B + \cdots \Leftrightarrow \rho R + \sigma S + \cdots$$

where $\alpha$, $\beta$, etc. are the stoichiometric coefficients, will be written as:

$$\rho R + \sigma S + \cdots - \alpha A - \beta B - \cdots = 0 \tag{36}$$

or simply as

$$\sum_{i=1}^{C}\nu_i\,\mathrm{I} = 0$$

The mole balance for each species in a chemical reaction can be written using the stoichiometric coefficients in the compact form,

$$N_i = N_{i,0} + \nu_i X \tag{37}$$

where $N_{i,0}$ is the number of moles of species i before any reaction has occurred, and X is the molar extent of reaction, which will have the same value for all species in the reaction. The Gibbs free energy for a closed system at constant temperature and pressure is

$$G(T, P, N) = \sum_{i=1}^{C} N_i\,\bar{G}_i(T, P, N) = \sum_{i=1}^{C}\left(N_{i,0} + \nu_i X\right)\bar{G}_i(T, P, N) \tag{38}$$

where N is used to indicate the vector of mole numbers of all species present. At equilibrium in a closed system at constant temperature and pressure, G is a minimum, and dG = 0. Since the only variation possible is in the molar extent of reaction X, it then follows that for chemical reaction equilibrium,

$$\sum_{i=1}^{C}\nu_i\,\bar{G}_i(T, P, N) = 0 \quad \text{single chemical reaction} \tag{39a}$$

In a multiple reaction system, defining $\nu_{ij}$ to be the stoichiometric coefficient for species i in the jth reaction, the equilibrium condition becomes:

$$\sum_{i=1}^{C}\nu_{ij}\,\bar{G}_i(T, P, N) = 0 \quad \text{for each reaction } j = 1, 2, \ldots \tag{39b}$$

In all multiple reaction systems, it is only necessary to consider a set of independent reactions, that is, a reaction set in which no reaction is a linear combination of the others.

Finally, for a system with multiple reactions and multiple phases, the criterion for equilibrium is that Eqs. (35) and (39) must be satisfied simultaneously. That is, for a state of equilibrium to exist in a multiphase, reacting system, each possible process (i.e., transfer of mass between phases or chemical reaction) must be in equilibrium for the system to be in equilibrium. This does not mean that the composition in each phase will be the same.

C. Gibbs Phase Rule

To fix the thermodynamic state of a pure-component, single-phase system, the specification of two state properties is required. Thus, the system is said to have two degrees of freedom, F. To fix the thermodynamic state of a nonreacting, C-component, single-phase system, the values of two state properties and C − 1 mole fractions are required (the remaining mole fraction is not an independent variable as all the mole fractions must sum to one) for a total of C + 1 variables. That is, F = C + 1. Consider a system consisting of C components, P phases, and M independent chemical reactions. Since C + 1 state properties are needed to fix each phase, it would appear that the system has P(C + 1) degrees of freedom. However, since the temperature is the same in all phases, specifying the temperature in one phase fixes its values in the P − 1 other phases. Similarly, fixing the pressure in one phase sets its values in the P − 1 remaining phases. That the fugacity of each species must be the same in each phase removes another C(P − 1) degrees of freedom. Finally, that the criterion for chemical equilibrium for each of the M independent reactions must be satisfied places M additional constraints on the system. Therefore, the actual number of degrees of freedom is

$$F = P(C + 1) - (P - 1) - (P - 1) - C(P - 1) - M = C - P - M + 2 \tag{40}$$

This result is the Gibbs phase rule. It is important to note that this gives the number of state properties needed to completely specify the thermodynamic state of each of the phases in the multicomponent, multiphase, multireaction system. However, such a specification does not give information on the relative amounts of the coexisting phases, or the total system size. Such additional information comes from the specification of the initial state and the species mass balances.
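The counting argument behind Eq. (40) can be checked with a few lines of Python. This is only a sketch; the function name and argument names are illustrative.

```python
def degrees_of_freedom(components, phases, reactions=0):
    """Gibbs phase rule, Eq. (40): F = C - P - M + 2."""
    # Count variables and constraints exactly as in the text.
    variables = phases * (components + 1)
    constraints = (
        (phases - 1)                    # equal temperature in all phases
        + (phases - 1)                  # equal pressure in all phases
        + components * (phases - 1)     # equal fugacity of each species
        + reactions                     # one condition per independent reaction
    )
    F = variables - constraints
    if F < 0:
        raise ValueError("too many phases or reactions for this number of components")
    return F


F_pure_single_phase = degrees_of_freedom(1, 1)   # pure fluid, one phase
F_pure_vle = degrees_of_freedom(1, 2)            # pure fluid, vapor + liquid
F_triple_point = degrees_of_freedom(1, 3)        # pure fluid, three phases
F_binary_vle = degrees_of_freedom(2, 2)          # binary vapor-liquid equilibrium
```

For a pure fluid in vapor–liquid equilibrium the single remaining degree of freedom is why the vapor pressure is a function of temperature alone.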

VIII. MIXTURE PHASE EQUILIBRIUM CALCULATIONS

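Central to the calculation of equilibria in mixtures is the fugacity of species i in the mixture, $\bar{f}_i$, which is given by:

$$\frac{\bar{f}_i(T, P, x)}{x_i P} = \exp\left[\frac{\bar{G}_i(T, P, x) - \bar{G}_i^{IGM}(T, P, x)}{RT}\right] = \exp\left[\frac{1}{RT}\int_{P=0}^{P}\left(\bar{V}_i - \frac{RT}{P}\right) dP\right]$$

$$= \exp\left[\frac{1}{RT}\int_{V=\infty}^{V=ZRT/P}\left[\frac{RT}{V} - N\left(\frac{\partial P}{\partial N_i}\right)_{T,V,N_{j \ne i}}\right] dV - \ln Z(T, P, x)\right] \tag{41}$$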

In this equation, the superscript IGM indicates an ideal gas mixture, that is, a mixture that has the following properties:

$$PV^{IGM} = \left(\sum_{i=1}^{C} N_i\right) RT \quad \text{or} \quad P\underline{V}^{IGM} = \left(\sum_{i=1}^{C} x_i\right) RT \quad \text{so that} \quad \bar{V}_i^{IGM}(T, P, x) = \underline{V}_i^{IG}(T, P) = RT/P$$

$$\underline{U}^{IGM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{U}_i^{IG}(T, P) \quad \text{so that} \quad \bar{U}_i^{IGM}(T, P, x) = \underline{U}_i^{IG}(T, P)$$

$$\underline{H}^{IGM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{H}_i^{IG}(T, P) \quad \text{so that} \quad \bar{H}_i^{IGM}(T, P, x) = \underline{H}_i^{IG}(T, P) \tag{42}$$

$$\underline{S}^{IGM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{S}_i^{IG}(T, P) - R\sum_{i=1}^{C} x_i \ln x_i \quad \text{so that} \quad \bar{S}_i^{IGM}(T, P, x) = \underline{S}_i^{IG}(T, P) - R \ln x_i$$

$$\underline{A}^{IGM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{A}_i^{IG}(T, P) + RT\sum_{i=1}^{C} x_i \ln x_i \quad \text{so that} \quad \bar{A}_i^{IGM}(T, P, x) = \underline{A}_i^{IG}(T, P) + RT \ln x_i$$

$$\underline{G}^{IGM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{G}_i^{IG}(T, P) + RT\sum_{i=1}^{C} x_i \ln x_i \quad \text{so that} \quad \bar{G}_i^{IGM}(T, P, x) = \underline{G}_i^{IG}(T, P) + RT \ln x_i$$

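Also of interest is the ideal mixture (IM), whose properties are given by:

$$V^{IM}(T, P, x) = \sum_{i=1}^{C} N_i\,\underline{V}_i(T, P) \quad \text{so that} \quad \bar{V}_i^{IM}(T, P, x) = \underline{V}_i(T, P)$$

$$\underline{U}^{IM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{U}_i(T, P) \quad \text{so that} \quad \bar{U}_i^{IM}(T, P, x) = \underline{U}_i(T, P)$$

$$\underline{H}^{IM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{H}_i(T, P) \quad \text{so that} \quad \bar{H}_i^{IM}(T, P, x) = \underline{H}_i(T, P)$$

$$\underline{S}^{IM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{S}_i(T, P) - R\sum_{i=1}^{C} x_i \ln x_i \quad \text{so that} \quad \bar{S}_i^{IM}(T, P, x) = \underline{S}_i(T, P) - R \ln x_i \tag{43}$$

$$\underline{A}^{IM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{A}_i(T, P) + RT\sum_{i=1}^{C} x_i \ln x_i \quad \text{so that} \quad \bar{A}_i^{IM}(T, P, x) = \underline{A}_i(T, P) + RT \ln x_i$$

$$\underline{G}^{IM}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{G}_i(T, P) + RT\sum_{i=1}^{C} x_i \ln x_i \quad \text{so that} \quad \bar{G}_i^{IM}(T, P, x) = \underline{G}_i(T, P) + RT \ln x_i$$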

While the equations for the ideal mixture appear very similar to those for the ideal gas mixture, there are two important distinctions between them. First, the IGM only relates to gaseous mixtures, while the IM is applicable to gases, liquids, and solids. Second, in the IGM the pure component property is that of the ideal gas at the conditions of the mixture, while in the IM the pure component properties are at the same temperature, pressure, and state of aggregation as the mixture. Note that in an ideal gas mixture,

$$\bar{V}_i^{IGM}(T, P, x) = \frac{RT}{P} \quad \text{so that} \quad \bar{f}_i^{IGM}(T, P, x) = x_i P \tag{44a}$$

while in an ideal mixture,

$$\bar{V}_i^{IM}(T, P, x) = \underline{V}_i(T, P) \quad \text{so that} \quad \bar{f}_i^{IM}(T, P, x) = x_i f_i(T, P) \tag{44b}$$

That is, in the ideal mixture the fugacity of a component is the product of the mole fraction and the pure component fugacity at the same temperature, pressure, and state of aggregation as the mixture.

A. Equations of State for Mixtures

Few mixtures are ideal gas mixtures, or even ideal mixtures; consequently, there are two ways to proceed. The first method is to use an equation of state; this is the description used for all gaseous mixtures and also for some liquid mixtures, though the latter may be difficult if the chemical functionalities of the species in the mixture are very different. Generally, the same forms of equations of state described earlier are used, though the parameters in the equations are now functions of composition. For the virial equation, this composition dependence is known exactly from statistical mechanics:

$$B(T, x) = \sum_{i=1}^{C}\sum_{j=1}^{C} x_i x_j B_{ij}(T), \qquad C(T, x) = \sum_{i=1}^{C}\sum_{j=1}^{C}\sum_{k=1}^{C} x_i x_j x_k C_{ijk}(T), \ \ldots \tag{45}$$

where the only composition dependence is that shown explicitly. For cubic equations of state, the following mixing rules:

$$a(T, x) = \sum_{i=1}^{C}\sum_{j=1}^{C} x_i x_j a_{ij}(T), \qquad b(x) = \sum_{i=1}^{C}\sum_{j=1}^{C} x_i x_j b_{ij} \tag{46}$$

and combining rules:

$$a_{ij}(T) = \sqrt{a_{ii}(T)\,a_{jj}(T)}\,(1 - k_{ij}), \qquad b_{ij} = \tfrac{1}{2}(b_{ii} + b_{jj}) \tag{47}$$

are used, where the binary interaction parameter $k_{ij}$ is adjusted to give the best fit of experimental data. Other, more complicated mixing rules have been introduced in the last decade to better describe mixtures containing very polar compounds and species of very different functionality. There are additional mixing and combining rules for the multiparameter equations of state, and each is specific to the equation used.

B. Phase Equilibrium Calculations Using an Equation of State

If an equation of state can be used to describe both the vapor and liquid phases of a mixture, it can then be used directly for phase equilibrium calculations based on equating the fugacity of each component in each phase:

$$\bar{f}_i^{L}(T, P, x) = \bar{f}_i^{V}(T, P, y) \tag{48}$$

where the superscripts L and V indicate the liquid and vapor phases, respectively, and x and y are the vectors of their compositions. Algorithms for computing this type of phase equilibrium are available elsewhere. Because the vapor and liquid phases of hydrocarbons (together with inorganic gases such as CO2) are well described by simple equations of state, the oil and gas industry typically does phase equilibrium calculations in this manner. Because of the limited applicability of EOS to the liquid phase of polar mixtures, the method below is commonly used for phase equilibrium calculations in the chemical industry.

C. Excess Properties and Activity Coefficients

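A description that can be used for liquid and solid mixtures is based on considering any thermodynamic property to be the sum of the ideal mixture property and a second term, the excess property, that accounts for the mixture being nonideal; that is,

$$\underline{H}(T, P, x) = \underline{H}^{IM}(T, P, x) + \underline{H}^{ex}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{H}_i(T, P) + \sum_{i=1}^{C} x_i\,\bar{H}_i^{ex}(T, P, x)$$

$$\underline{V}(T, P, x) = \underline{V}^{IM}(T, P, x) + \underline{V}^{ex}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{V}_i(T, P) + \sum_{i=1}^{C} x_i\,\bar{V}_i^{ex}(T, P, x) \tag{49}$$

$$\underline{G}(T, P, x) = \underline{G}^{IM}(T, P, x) + \underline{G}^{ex}(T, P, x) = \sum_{i=1}^{C} x_i\,\underline{G}_i(T, P) + RT\sum_{i=1}^{C} x_i \ln x_i + \sum_{i=1}^{C} x_i\,\bar{G}_i^{ex}(T, P, x)$$

where

$$\bar{H}_i^{ex} = \left(\frac{\partial (N\underline{H}^{ex})}{\partial N_i}\right)_{T,P,N_{j \ne i}}; \qquad \bar{V}_i^{ex} = \left(\frac{\partial (N\underline{V}^{ex})}{\partial N_i}\right)_{T,P,N_{j \ne i}}; \qquad \bar{G}_i^{ex} = \left(\frac{\partial (N\underline{G}^{ex})}{\partial N_i}\right)_{T,P,N_{j \ne i}}; \ \text{etc.} \tag{50}$$

Of special interest is the commonly used activity coefficient, $\gamma$, which is related to the excess partial molar Gibbs free energy as follows:

$$\bar{G}_i^{ex}(T, P, x) = RT \ln \gamma_i(T, P, x)$$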

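For changes in any mixture property $\theta(T, P, N)$ we can write:

$$d\theta(T, P, N) = d(N\underline{\theta}) = \sum_{i=1}^{C} N_i\,d\bar{\theta}_i + \sum_{i=1}^{C}\bar{\theta}_i\,dN_i$$

$$= N\left(\frac{\partial \underline{\theta}}{\partial T}\right)_{P,N} dT + N\left(\frac{\partial \underline{\theta}}{\partial P}\right)_{T,N} dP + \sum_{i=1}^{C}\bar{\theta}_i\,dN_i$$

Subtracting the two forms of the equation, and considering only changes at constant temperature and pressure, this reduces to:

$$\sum_{i=1}^{C} N_i\,d\bar{\theta}_i = 0 \quad \text{or} \quad \sum_{i=1}^{C} x_i\,d\bar{\theta}_i = 0 \tag{51a}$$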

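which for a binary mixture can be written as

$$x_1\left(\frac{\partial \bar{\theta}_1}{\partial x_1}\right)_{T,P} + x_2\left(\frac{\partial \bar{\theta}_2}{\partial x_1}\right)_{T,P} = 0$$

and

$$x_1\left(\frac{\partial \bar{\theta}_1^{ex}}{\partial x_1}\right)_{T,P} + x_2\left(\frac{\partial \bar{\theta}_2^{ex}}{\partial x_1}\right)_{T,P} = 0 \tag{51b}$$

since this equation is satisfied identically for the ideal mixture. Special cases of this equation are

$$x_1\left(\frac{\partial \bar{H}_1^{ex}}{\partial x_1}\right)_{T,P} + x_2\left(\frac{\partial \bar{H}_2^{ex}}{\partial x_1}\right)_{T,P} = 0; \qquad x_1\left(\frac{\partial \bar{V}_1^{ex}}{\partial x_1}\right)_{T,P} + x_2\left(\frac{\partial \bar{V}_2^{ex}}{\partial x_1}\right)_{T,P} = 0$$

$$x_1\left(\frac{\partial \bar{G}_1^{ex}}{\partial x_1}\right)_{T,P} + x_2\left(\frac{\partial \bar{G}_2^{ex}}{\partial x_1}\right)_{T,P} = RT\left[x_1\left(\frac{\partial \ln \gamma_1}{\partial x_1}\right)_{T,P} + x_2\left(\frac{\partial \ln \gamma_2}{\partial x_1}\right)_{T,P}\right] = 0 \tag{51c}$$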

These equations, forms of the Gibbs–Duhem equation, are useful in obtaining partial molar property information from experimental data and for testing the accuracy of such data. For example, by isothermal heat-of-mixing measurements over a range of concentrations, excess enthalpy data can be obtained as follows. For a binary mixture,

$$\Delta\underline{H}_{mix} = (x_1\bar{H}_1 + x_2\bar{H}_2) - (x_1\underline{H}_1 + x_2\underline{H}_2)$$

and

$$\left(\frac{\partial \Delta\underline{H}_{mix}}{\partial x_1}\right)_{T,P} = (\bar{H}_1 - \underline{H}_1) + x_1\left(\frac{\partial \bar{H}_1}{\partial x_1}\right)_{T,P} - (\bar{H}_2 - \underline{H}_2) + x_2\left(\frac{\partial \bar{H}_2}{\partial x_1}\right)_{T,P}$$

Using the Gibbs–Duhem equation and combining the two equations above give:

$$\Delta\underline{H}_{mix} - x_1\left(\frac{\partial \Delta\underline{H}_{mix}}{\partial x_1}\right)_{T,P} = \bar{H}_2 - \underline{H}_2$$

and

$$\Delta\underline{H}_{mix} + x_2\left(\frac{\partial \Delta\underline{H}_{mix}}{\partial x_1}\right)_{T,P} = \bar{H}_1 - \underline{H}_1 \tag{53}$$

Consequently, by having $\Delta\underline{H}_{mix}$ data as a function of composition so that the compositional derivatives can be evaluated, the partial molar enthalpies of each of the species at each composition can be obtained. If the $\Delta\underline{H}_{mix}$ data have been fitted to an equation, usually a polynomial in mole fraction, this can be done analytically. The graphical procedure shown in Fig. 4 can also be used, where the intercepts A and B then give the difference between the partial molar and pure component enthalpies at the indicated concentration.

FIGURE 4 Construction illustrating how the difference between the partial molar and pure-component enthalpies can be obtained graphically at a fixed composition from a plot of $\Delta\underline{H}_{mix}$ versus composition in a binary mixture.

Similar procedures can be used to obtain partial molar volume data from volume change on mixing data. From vapor–liquid equilibrium data, as will be described later, activity coefficient (excess Gibbs free energy) data can be obtained. Also, if partial molar property data have been obtained experimentally, they can be tested for thermodynamic consistency by using the Gibbs–Duhem equation either differentially on a point-by-point basis or by integration over the whole dataset.
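The use of Eq. (53) with fitted heat-of-mixing data can be sketched numerically. The code below takes any function returning the molar heat of mixing of a binary mixture as a function of x1 and evaluates the two intercepts with a central-difference derivative. The one-constant fit used in the usage lines, ΔH_mix = c·x1·x2 with c = 1000 J/mol, is a hypothetical example and does not represent data for any real mixture.

```python
def partial_molar_differences(dh_mix, x1, h=1e-6):
    """Return (H1bar - H1, H2bar - H2) at mole fraction x1 from Eq. (53).

    dh_mix: callable giving the molar heat of mixing as a function of x1.
    The composition derivative is approximated by a central difference.
    """
    slope = (dh_mix(x1 + h) - dh_mix(x1 - h)) / (2.0 * h)
    x2 = 1.0 - x1
    diff_1 = dh_mix(x1) + x2 * slope   # intercept at x1 = 1
    diff_2 = dh_mix(x1) - x1 * slope   # intercept at x1 = 0
    return diff_1, diff_2


# Hypothetical one-constant fit of heat-of-mixing data (J/mol).
c = 1000.0
fit = lambda x1: c * x1 * (1.0 - x1)

d1, d2 = partial_molar_differences(fit, 0.3)
```

For this particular fit the analytical results are c·x2² and c·x1², so the numerical values can be checked directly; the mole-fraction-weighted sum of the two differences also recovers ΔH_mix at the same composition.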

Algebraic expressions are generally used to fit excess property data as a function of composition. For example, when the two-parameter expression,

$$\underline{\theta}^{ex}(T, P, x) = \frac{a x_1 x_2}{x_1 + x_2 b} \tag{54a}$$

is used, one obtains, in general,

$$\bar{\theta}_1^{ex} = \frac{a b x_2^2}{(x_1 + x_2 b)^2} \quad \text{and} \quad \bar{\theta}_2^{ex} = \frac{a x_1^2}{(x_1 + x_2 b)^2} \tag{54b}$$

and, in particular,

$$\bar{G}_1^{ex} = RT \ln \gamma_1 = \frac{a b x_2^2}{(x_1 + x_2 b)^2} \quad \text{and} \quad \bar{G}_2^{ex} = RT \ln \gamma_2 = \frac{a x_1^2}{(x_1 + x_2 b)^2} \tag{54c}$$

which is the Van Laar model. There are many other, and more accurate, activity coefficient models in the thermodynamic literature that are used by chemists and engineers.
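As a sketch of how the Van Laar expressions of Eq. (54c) behave, the code below evaluates ln γ1 and ln γ2 and checks the Gibbs–Duhem relation of Eq. (51c) by finite differences. The parameters are written as alpha = a/RT (dimensionless) and beta = b; that scaling, the function names, and the numerical parameter values are choices made for this illustration.

```python
def van_laar_ln_gamma(x1, alpha, beta):
    """Return (ln gamma1, ln gamma2) from Eq. (54c), with alpha = a/(RT) and beta = b."""
    x2 = 1.0 - x1
    denom = (x1 + beta * x2) ** 2
    return alpha * beta * x2**2 / denom, alpha * x1**2 / denom


def gibbs_duhem_residual(x1, alpha, beta, h=1e-5):
    """x1 d(ln gamma1)/dx1 + x2 d(ln gamma2)/dx1, which should vanish [Eq. (51c)]."""
    up = van_laar_ln_gamma(x1 + h, alpha, beta)
    dn = van_laar_ln_gamma(x1 - h, alpha, beta)
    d1 = (up[0] - dn[0]) / (2.0 * h)
    d2 = (up[1] - dn[1]) / (2.0 * h)
    return x1 * d1 + (1.0 - x1) * d2


# Hypothetical parameter values, chosen only to exercise the functions.
alpha, beta = 1.5, 0.7
ln_g1, ln_g2 = van_laar_ln_gamma(0.35, alpha, beta)
residual = gibbs_duhem_residual(0.35, alpha, beta)
```

The limiting values are a useful sanity check: ln γ1 tends to alpha/beta as x1 approaches zero, and ln γ2 tends to alpha as x1 approaches one, while each coefficient goes to unity in its own pure-component limit.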

D. Phase Equilibrium Calculations Using Activity Coefficients

With this definition of the partial molar excess Gibbs free energy and the activity coefficient, the fugacity of a species in a liquid mixture can be computed from:

$$\bar{f}_i^{L}(T, P, x) = x_i\,\gamma_i(T, P, x)\,f_i^{L}(T, P) \tag{55}$$

where the fugacity of the pure component is equal to the vapor pressure of the pure component, $P^{vap}(T)$, if the vapor pressure and total pressure are low. If the vapor pressure is above ambient, then the fugacity at this pressure contains a correction that can be computed from the equation of state for the vapor. Also, if the total pressure is much above the pure component vapor pressure, a Poynting correction is added:

$$f_i^{L}(T, P) = P_i^{vap}(T)\left(\frac{f_i^{L}(T, P_i^{vap})}{P_i^{vap}}\right) \exp\left(\int_{P_i^{vap}(T)}^{P}\frac{\underline{V}_i^{L}}{RT}\,dP\right) \tag{56}$$

The calculation of vapor–liquid equilibrium using activity coefficient models is then based on:

$$\bar{f}_i^{L}(T, P, x) = x_i\,\gamma_i(T, P, x)\,f_i^{L}(T, P) = x_i\,\gamma_i(T, P, x)\,P_i^{vap}(T)\left(\frac{f_i^{L}(T, P_i^{vap})}{P_i^{vap}}\right) \exp\left(\int_{P_i^{vap}(T)}^{P}\frac{\underline{V}_i^{L}}{RT}\,dP\right) = \bar{f}_i^{V}(T, P, y) \tag{57}$$

A common application of this equation is to vapor–liquid equilibrium at low pressures, where the vapor can be considered to be an ideal gas mixture and all pressure corrections can be neglected. This leads to the simple equation,

$$x_i\,\gamma_i(T, x)\,P_i^{vap}(T) = y_i P \tag{58}$$

relating the compositions of the vapor and liquid phases. If vapor–liquid phase equilibrium data are available, this equation can be used to obtain values of $\gamma_i(T, x)$ and, therefore, $\bar{G}_i^{ex}(T, x)$ and $\underline{G}^{ex}(T, x) = \sum x_i\,\bar{G}_i^{ex}(T, x) = RT\sum x_i \ln \gamma_i(T, x)$. Alternatively, if activity coefficient or $\underline{G}^{ex}$ data are available or can be predicted, the compositions of the equilibrium phases can be computed. Note that for the case of an ideal solution ($\gamma_i = 1$ for all compositions), the low-pressure vapor–liquid equilibrium relation becomes:

$$x_i\,P_i^{vap}(T) = y_i P \tag{59a}$$

Also, summing over all species (since $\sum_{i=1}^{C} y_i = 1$), one then obtains for the ideal solution at low pressure:

$$P(T, x) = \sum_{i=1}^{C} x_i\,P_i^{vap}(T) \quad \text{and} \quad y_i = \frac{x_i\,P_i^{vap}(T)}{P(T, x)} = \frac{x_i\,P_i^{vap}(T)}{\sum_{j=1}^{C} x_j\,P_j^{vap}(T)} \tag{59b}$$

The first of these equations indicates that the total pressure is a linear function of liquid-phase mole fraction. This is known as Raoult’s law. The second equation establishes that the vapor and liquid compositions in an ideal solution will be different (except if, fortuitously, the vapor pressures of the components are equal).

The comparable equations for a nonideal mixture at low pressure are

$$P = \sum_{i=1}^{C} x_i\,\gamma_i(T, x)\,P_i^{vap}(T) \quad \text{and} \quad y_i = \frac{x_i\,\gamma_i(T, x)\,P_i^{vap}(T)}{\sum_{j=1}^{C} x_j\,\gamma_j(T, x)\,P_j^{vap}(T)} \tag{60a}$$

Figure 5 shows the pressure versus mole fraction behavior for various mixtures. In this figure, curve 1 is for an ideal solution (i.e., Raoult’s law). Curves 2 and 3 correspond to solutions with positive deviations from Raoult’s law as a result of the activity coefficients of both species being greater than unity. Curves 4 and 5 are similar for the case of negative deviations from Raoult’s law ($\gamma < 1$). Figure 6 is a plot of the vapor-phase mole fraction, y, versus the liquid-phase mole fraction, x, for these cases. The dashed line in the figure is x = y.
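The low-pressure relations of Eqs. (59b) and (60a) amount to a bubble-point pressure calculation, sketched below. The function accepts optional activity coefficients and defaults to the ideal solution (Raoult’s law). The function name and the vapor pressures in the usage lines are hypothetical illustration values.

```python
def bubble_point(x, p_vap, gamma=None):
    """Total pressure and vapor mole fractions from Eq. (60a).

    x: liquid mole fractions; p_vap: pure-component vapor pressures at T;
    gamma: activity coefficients at (T, x), or None for an ideal solution [Eq. (59b)].
    """
    if gamma is None:
        gamma = [1.0] * len(x)
    if not (len(x) == len(p_vap) == len(gamma)):
        raise ValueError("x, p_vap, and gamma must have the same length")
    partial = [xi * gi * pi for xi, gi, pi in zip(x, gamma, p_vap)]
    P = sum(partial)
    y = [p / P for p in partial]
    return P, y


# Equimolar binary liquid with hypothetical vapor pressures of 100 and 50 kPa.
P_ideal, y_ideal = bubble_point([0.5, 0.5], [100.0, 50.0])
P_nonideal, y_nonideal = bubble_point([0.5, 0.5], [100.0, 50.0], gamma=[1.3, 1.3])
```

In the ideal case the vapor is enriched in the more volatile species, and activity coefficients greater than unity raise the total pressure above the Raoult’s law value, the positive deviation sketched in curves 2 and 3 of Fig. 5.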


FIGURE 5 Pressure versus liquid composition curves for vapor–liquid equilibrium in a binary mixture. Curve 1 is for an ideal mixture (Raoult's law). Curves 2 and 3 are for nonideal solutions in which the activity coefficients are greater than unity, and curves 4 and 5 are for nonideal solutions in which the activity coefficients are less than unity. Curves 3 and 5 are for mixtures in which the solution nonideality is sufficiently great as to result in an azeotrope.

FIGURE 6 Liquid composition versus vapor composition (x vs. y) curves for the mixtures in Fig. 5. The dashed line is the line of x = y, and the point of crossing of this line is the azeotropic point.

Curve 3 in Fig. 5 is a case in which the nonideality is sufficiently great that there is a maximum in the pressure versus liquid composition curve. Mathematically, it can be shown that at this maximum the vapor and liquid compositions are identical. This is seen as a crossing of the x = y line in Fig. 6. Such a point is referred to as an azeotrope. Curve 5 is another example of a mixture having an azeotrope, although as a result of large negative deviations from Raoult's law. Azeotropes occur as a result of solution nonidealities and are most likely to occur in mixtures of chemically dissimilar species with vapor pressures that are reasonably close. An azeotrope in a binary mixture occurs if:

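$$\gamma_1(T, x_1) = \frac{P}{P_1^{\mathrm{vap}}(T)} = \frac{x_1 \gamma_1 P_1^{\mathrm{vap}}(T) + x_2 \gamma_2 P_2^{\mathrm{vap}}(T)}{P_1^{\mathrm{vap}}(T)} \quad\text{and}\quad \gamma_2(T, x_1) = \frac{P}{P_2^{\mathrm{vap}}(T)} \tag{61}$$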

If the azeotropic point of a mixture and the pure component vapor pressures have been measured, the two concentration-dependent activity coefficients can be calculated at this composition. This information can then be used to obtain values of the parameters in a two-parameter activity coefficient model, such as the Van Laar model discussed earlier, and then to predict values of the activity coefficients and the vapor–liquid equilibria over the whole concentration range. The occurrence of azeotropes in multicomponent mixtures is not very common. Calculations for nonideal mixtures at high pressures are considerably more complicated and are discussed in books on applied thermodynamics.
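The procedure just described can be sketched in a few lines of Python. This is an illustration only, and it rests on assumptions: the Van Laar model is taken in the common form ln γ1 = A/(1 + A x1/(B x2))² and ln γ2 = B/(1 + B x2/(A x1))², which may differ in notation from the form given earlier in the article, and the azeotrope data in the example are hypothetical.

```python
import math


def van_laar_from_azeotrope(x1, P, p1_vap, p2_vap):
    """Van Laar parameters (A, B) from one azeotropic point.

    At the azeotrope x_i = y_i, so gamma_i = P / P_i^vap (low pressure).
    Requires 0 < x1 < 1 and both ln(gamma_i) nonzero with the same sign.
    """
    x2 = 1.0 - x1
    ln_g1 = math.log(P / p1_vap)
    ln_g2 = math.log(P / p2_vap)
    A = ln_g1 * (1.0 + x2 * ln_g2 / (x1 * ln_g1)) ** 2
    B = ln_g2 * (1.0 + x1 * ln_g1 / (x2 * ln_g2)) ** 2
    return A, B


def van_laar_gammas(x1, A, B):
    """Activity coefficients from the assumed Van Laar form, for 0 < x1 < 1."""
    x2 = 1.0 - x1
    ln_g1 = A / (1.0 + A * x1 / (B * x2)) ** 2
    ln_g2 = B / (1.0 + B * x2 / (A * x1)) ** 2
    return math.exp(ln_g1), math.exp(ln_g2)


def bubble_pressure(x1, A, B, p1_vap, p2_vap):
    """Low-pressure total pressure of the nonideal binary mixture."""
    g1, g2 = van_laar_gammas(x1, A, B)
    return x1 * g1 * p1_vap + (1.0 - x1) * g2 * p2_vap


# Hypothetical azeotrope: x1 = 0.6 at P = 1.2 bar, with pure-component
# vapor pressures of 1.0 and 0.9 bar (illustrative numbers only).
A, B = van_laar_from_azeotrope(0.6, 1.2, 1.0, 0.9)
for x1 in (0.2, 0.4, 0.6, 0.8):
    print(x1, bubble_pressure(x1, A, B, 1.0, 0.9))
```

Because the two parameters are fitted to a single data point, the predicted curve passes exactly through the azeotrope; away from that composition it is an extrapolation whose quality depends on how well the model suits the mixture.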

E. Henry’s Law

There is an important complication that arises in the calculation of phase equilibrium with activity coefficients: To use Eq. (55) one must be able to calculate the fugacity of the pure component as a liquid at the temperature and pressure of the mixture. This is not possible, for example, if the dissolved component exists only as a gas (e.g., O2, CO2) or as a solid (e.g., sugar, a long-chain hydrocarbon) as a pure component at the mixture conditions. If the temperature and pressure are not very far from the melting point of the solid or boiling point of the gaseous species, Eq. (27) can still be used by extrapolation of the liquid fugacity (or vapor pressure) into the solid or gaseous states as appropriate. (Such a problem does not arise when using an equation of state, as the species fugacity in a mixture is calculated directly, not with respect to a pure component state.)

If extrapolation over a very large temperature range would be required, a different procedure is used. In this case, Eq. (53) is replaced by:


$$\bar{f}_i^{L}(T, P, x) = x_i \gamma_i^{*}(T, P, x)\, H_i(T, P) \tag{62a}$$

or

$$\bar{f}_i^{L}(T, P, x) = M_i \gamma_i^{\oplus}(T, P, M)\, \mathcal{H}_i(T, P) \tag{62b}$$

depending on the concentration units used. In these two equations, forms of Henry's law, the fugacity of a gaseous or solid component dissolved in a liquid is calculated based on extrapolation of its behavior when it is highly diluted. In the first equation, the initially linear dependence of the species fugacity at high dilution is used to find the Henry's law constant $H_i$. Then, the nonlinear behavior at higher concentrations is accounted for by the composition-dependent activity coefficient $\gamma_i^{*}$. In this description, the Henry's law constant depends on temperature and the solvent–solute pair. Also, normalization of the activity coefficient $\gamma_i^{*}$ is different from the activity coefficient used heretofore in that its value is unity when the species is infinitely dilute, while $\gamma_i = 1$ in the pure component limit. The relation between the two is

$$\gamma_i^{*}(x_i) = \frac{\gamma_i(x_i)}{\gamma_i(x_i = 0)} \tag{63}$$

The second form of Henry's law, Eq. (62b), is similar but based on using molality as the concentration variable.

Both types of Henry's law coefficients are generally determined from experiment. Once values are known as a function of temperature, solvent, and solute, the phase behavior involving a solute described by Henry's law can be calculated. For example, at low total pressure, we have for the vapor–liquid equilibrium of such a component:

$$x_i \gamma_i^{*}(T, P, x)\, H_i(T, P) = y_i P = P_i \quad\text{or}\quad M_i \gamma_i^{\oplus}(T, P, M)\, \mathcal{H}_i(T, P) = y_i P = P_i \tag{64}$$

depending on the concentration variable used. At higher pressures, a Poynting correction would have to be added to the left side of both equations, and the partial pressure of the species in the vapor phase, $P_i$, would be replaced by its fugacity, normally calculated from an equation of state.

IX. CHEMICAL EQUILIBRIUM

The calculation of chemical equilibrium is based on Eq. (39). While the partial molar Gibbs free energy or chemical potential of each species in the mixture is needed for the calculation, what is typically available is the Gibbs free energy of formation $\Delta G_f^{\circ}$ and the heat (enthalpy) of formation $\Delta H_f^{\circ}$ of the pure components from their elements, generally at 25°C and 1 bar. To proceed, one writes:

$$\bar{G}_i(T, P, x) = G_i(T, P = 1\ \mathrm{bar}) + \left[\bar{G}_i(T, P, x) - G_i(T, P = 1\ \mathrm{bar})\right] = G_i(T, P = 1\ \mathrm{bar}) + RT \ln \frac{\bar{f}_i(T, P, x)}{f_i(T, P = 1\ \mathrm{bar})} \tag{65}$$

Then, Eq. (39) can be written as:

$$\sum_{i=1}^{C} \nu_i \bar{G}_i(T, P, x) = \sum_{i=1}^{C} \nu_i \left[ G_i(T, P = 1\ \mathrm{bar}) + RT \ln \frac{\bar{f}_i(T, P, x)}{f_i(T, P = 1\ \mathrm{bar})} \right] = 0 \tag{66}$$

Common notation is to define the activity of each species as:

$$a_i(T, P, x) = \frac{\bar{f}_i(T, P, x)}{f_i(T, P = 1\ \mathrm{bar})} \tag{67}$$

and to define a chemical equilibrium constant $K(T)$ from:

$$RT \ln K(T) = -\sum_{i=1}^{C} \nu_i G_i(T, P = 1\ \mathrm{bar}) = -\sum_{i=1}^{C} \nu_i \Delta G_{f,i}^{\circ}(T, P = 1\ \mathrm{bar}) = -\Delta G_{rxn}^{\circ}(T) \tag{68}$$

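leading to:

$$K(T) = \prod_{i=1}^{C} \left( \frac{\bar{f}_i(T, P, x)}{f_i(T, P = 1\ \mathrm{bar})} \right)^{\nu_i} = \prod_{i=1}^{C} \left[ a_i(T, P, x) \right]^{\nu_i} \tag{69}$$

where $\Delta G_{rxn}^{\circ}(T)$ is the standard free energy of reaction, that is, the Gibbs free energy change that would occur between reactants in the pure component state to produce products, also as pure components. At 25°C,

$$RT \ln K(T = 25^{\circ}\mathrm{C}) = -\sum_{i=1}^{C} \nu_i G_i(T = 25^{\circ}\mathrm{C}, P = 1\ \mathrm{bar}) = -\sum_{i=1}^{C} \nu_i \Delta G_{f,i}^{\circ}(T = 25^{\circ}\mathrm{C}, P = 1\ \mathrm{bar}) = -\Delta G_{rxn}^{\circ}(T = 25^{\circ}\mathrm{C}) \tag{70}$$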

Also, the standard heat of reaction is

$$\Delta H_{rxn}^{\circ}(T = 25^{\circ}\mathrm{C}) = \sum_{i=1}^{C} \nu_i H_i(T = 25^{\circ}\mathrm{C}, P = 1\ \mathrm{bar}) = \sum_{i=1}^{C} \nu_i \Delta H_{f,i}^{\circ}(T = 25^{\circ}\mathrm{C}, P = 1\ \mathrm{bar})$$

and

$$\Delta H_{rxn}^{\circ}(T) = \Delta H_{rxn}^{\circ}(T = 25^{\circ}\mathrm{C}) + \int_{T = 25^{\circ}\mathrm{C}}^{T} \sum_{i=1}^{C} \nu_i C_{P,i}(T)\, dT \tag{71}$$

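Then, using:

$$\frac{\partial}{\partial T}\left( \frac{G}{T} \right)_P = -\frac{H}{T^2}$$

leads to

$$\left( \frac{\partial \ln K(T)}{\partial T} \right)_P = \frac{\Delta H_{rxn}^{\circ}(T)}{RT^2} \tag{72a}$$

and

$$\ln \frac{K(T)}{K(T = 25^{\circ}\mathrm{C})} = \int_{T = 25^{\circ}\mathrm{C}}^{T} \frac{\Delta H_{rxn}^{\circ}(T)}{RT^2}\, dT = -\frac{\Delta H_{rxn}^{\circ}(T = 25^{\circ}\mathrm{C})}{R} \left( \frac{1}{T} - \frac{1}{298.15} \right) + \int_{T = 25^{\circ}\mathrm{C}}^{T} \frac{\left[ \int_{T_1 = 25^{\circ}\mathrm{C}}^{T} \sum_{i=1}^{C} \nu_i C_{P,i}(T_1)\, dT_1 \right]}{RT^2}\, dT \tag{72b}$$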
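As a rough numerical illustration of Eqs. (68) and (72b), the sketch below computes ln K at 25°C from a standard Gibbs free energy of reaction and then shifts it to another temperature. It assumes the heat of reaction is independent of temperature, so the heat-capacity integral in Eq. (72b) is dropped. The function names are hypothetical, and the values for H2 + ½O2 → H2O(g) are commonly tabulated figures quoted here only as example inputs.

```python
R = 8.314      # gas constant, J/(mol K)
T0 = 298.15    # reference temperature (25 C), K


def ln_K_reference(dG_rxn):
    """ln K at 25 C from the standard Gibbs free energy of reaction (J/mol)."""
    return -dG_rxn / (R * T0)


def ln_K(T, dG_rxn, dH_rxn):
    """ln K at temperature T (K), ASSUMING a constant heat of reaction.

    Integrating d(ln K)/dT = dH/(R T^2) from T0 to T with constant dH gives
    ln K(T) = ln K(T0) - (dH/R) * (1/T - 1/T0).
    """
    return ln_K_reference(dG_rxn) - dH_rxn / R * (1.0 / T - 1.0 / T0)


# Example inputs (assumed, commonly tabulated values for water vapor):
dG = -228.57e3   # J/mol
dH = -241.82e3   # J/mol

for T in (298.15, 500.0, 1000.0):
    print(T, ln_K(T, dG, dH))
```

For an exothermic reaction such as this one, ln K falls as the temperature rises, in line with Eq. (72a). Over wide temperature ranges the neglected heat-capacity term can matter, so the result should be read as an estimate.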

For a liquid species at low and moderate pressure, and with the pure-component standard state, the activity is

$$a_i(T, P, x) = \frac{\bar{f}_i^{L}(T, P, x)}{f_i^{L}(T, P = 1\ \mathrm{bar})} = \frac{x_i \gamma_i(T, P, x) f_i^{L}(T, P)}{f_i^{L}(T, P = 1\ \mathrm{bar})} = x_i \gamma_i(T, P, x) \tag{73a}$$

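The activity of species in the vapor is

$$a_i(T, P, y) = \frac{\bar{f}_i^{V}(T, P, y)}{f_i^{V}(T, P = 1\ \mathrm{bar})} = \frac{y_i P}{1\ \mathrm{bar}} \tag{73b}$$

where the term on the right of the expression is correct only for an ideal gas mixture. Thus, for example, the chemical equilibrium relation for the low-pressure gas-phase reaction, H2 + ½O2 ↔ H2O is

$$K(T) = \frac{a_{\mathrm{H_2O}}}{a_{\mathrm{H_2}}\, a_{\mathrm{O_2}}^{1/2}} = \frac{\dfrac{y_{\mathrm{H_2O}} P}{1\ \mathrm{bar}}}{\dfrac{y_{\mathrm{H_2}} P}{1\ \mathrm{bar}} \left( \dfrac{y_{\mathrm{O_2}} P}{1\ \mathrm{bar}} \right)^{1/2}} = \frac{y_{\mathrm{H_2O}}}{y_{\mathrm{H_2}} \left( y_{\mathrm{O_2}} \right)^{1/2}} \left( \frac{1\ \mathrm{bar}}{P} \right)^{1/2} \tag{74}$$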

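which indicates that as the pressure increases, the conversion of hydrogen and oxygen to water is favored. The equilibrium relation for the low-pressure hydrogenation of benzene to cyclohexane involving hydrogen gas and liquid benzene and cyclohexane, C6H6 + 3H2 ↔ C6H12, is

$$K(T) = \frac{a_{\mathrm{C_6H_{12}}}}{a_{\mathrm{C_6H_6}}\, a_{\mathrm{H_2}}^{3}} = \frac{\dfrac{x_{\mathrm{C_6H_{12}}} \gamma_{\mathrm{C_6H_{12}}} f^{L}_{\mathrm{C_6H_{12}}}}{f^{L}_{\mathrm{C_6H_{12}}}}}{\dfrac{x_{\mathrm{C_6H_6}} \gamma_{\mathrm{C_6H_6}} f^{L}_{\mathrm{C_6H_6}}}{f^{L}_{\mathrm{C_6H_6}}} \left( \dfrac{y_{\mathrm{H_2}} P}{1\ \mathrm{bar}} \right)^{3}} = \frac{x_{\mathrm{C_6H_{12}}} \gamma_{\mathrm{C_6H_{12}}}}{x_{\mathrm{C_6H_6}} \gamma_{\mathrm{C_6H_6}}} \left( \frac{1\ \mathrm{bar}}{y_{\mathrm{H_2}} P} \right)^{3} = \frac{x_{\mathrm{C_6H_{12}}}}{x_{\mathrm{C_6H_6}}} \left( \frac{1\ \mathrm{bar}}{P_{\mathrm{H_2}}} \right)^{3} \tag{75}$$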

where in the last term in this equation the activity coefficients have been omitted, as benzene and cyclohexane are so chemically similar that they are expected to form an ideal solution, and $P_{\mathrm{H_2}} = y_{\mathrm{H_2}} P$ is the partial pressure of hydrogen in the gas phase.

If the reaction system is closed, then the equilibrium relations have to be solved together with the mass balances. For example, suppose three moles of hydrogen and one mole of oxygen are being reacted to form water. The mass balances for this reaction give:

Species       Initial moles   Moles at equilibrium   Equilibrium mole fraction
H2            3               3 − X                  (3 − X)/(4 − 0.5X)
O2            1               1 − 0.5X               (1 − 0.5X)/(4 − 0.5X)
H2O           0               X                      X/(4 − 0.5X)
Total moles                   4 − 0.5X

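The chemical equilibrium relation to be solved for the molar extent of reaction X is, then,

$$K(T) = \frac{y_{\mathrm{H_2O}}}{y_{\mathrm{H_2}} \left( y_{\mathrm{O_2}} \right)^{1/2}} \left( \frac{1\ \mathrm{bar}}{P} \right)^{1/2} = \frac{\dfrac{X}{4 - 0.5X}}{\dfrac{3 - X}{4 - 0.5X} \left( \dfrac{1 - 0.5X}{4 - 0.5X} \right)^{1/2}} \left( \frac{1\ \mathrm{bar}}{P} \right)^{1/2} = \frac{X (4 - 0.5X)^{1/2}}{(3 - X)(1 - 0.5X)^{1/2}} \left( \frac{1\ \mathrm{bar}}{P} \right)^{1/2}$$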

Therefore, once the temperature is specified so that the value of K(T) can be computed, and the pressure is fixed, the equilibrium molar extent of reaction X can be computed, and from that each of the equilibrium mole fractions.
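One way to carry out this calculation numerically is simple bisection on X. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the feed is the 3 mol H2 and 1 mol O2 of the example above, and the value of K used in the usage lines is an arbitrary number chosen so that the root is easy to see, not the actual equilibrium constant of the reaction.

```python
import math


def residual(X, K, P_bar):
    """Right-hand side of the equilibrium relation minus K.

    Valid for 0 <= X < 2 (oxygen is the limiting reactant in this feed).
    """
    if X >= 2.0:
        return math.inf
    rhs = (X * math.sqrt(4.0 - 0.5 * X)
           / ((3.0 - X) * math.sqrt(1.0 - 0.5 * X))
           / math.sqrt(P_bar))
    return rhs - K


def solve_extent(K, P_bar, iterations=200):
    """Bisection for the molar extent X; the residual rises monotonically on (0, 2)."""
    lo, hi = 0.0, 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid, K, P_bar) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def mole_fractions(X):
    """Equilibrium mole fractions (H2, O2, H2O) from the mass-balance table."""
    total = 4.0 - 0.5 * X
    return (3.0 - X) / total, (1.0 - 0.5 * X) / total, X / total


# Illustrative value of K only (assumed, not a property of the real reaction).
X = solve_extent(K=10.0, P_bar=1.0)
print(X, mole_fractions(X))
```

Bisection is used here because the relation is monotonic in X between 0 and 2, so a bracketing method cannot miss the root; a Newton-type method would also work but needs a good starting guess near X = 2 when K is very large.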

When several reactions occur simultaneously, a similar procedure is followed in that a chemical equilibrium relation is written for each of the independent reactions, and mass balances are used for each component. The solution can be complicated since all the reactions are coupled through the mass balances; that is, the molar extent for each reaction will appear in some or all of the equilibrium relations.

When there are many reactions possible, or when there is combined chemical and phase equilibrium, calculation by direct Gibbs free energy minimization may be a better way to proceed. In this method, expressions are written for the partial molar Gibbs free energy of every component in every possible phase (which will involve the mole fractions of all species in that phase), and then a search method is used to find the state of minimum Gibbs free energy (if temperature and pressure are fixed) subject to the mass balance constraints. That is, one identifies the state in which the total Gibbs free energy is a minimum directly, rather than using chemical equilibrium constants.

X. ELECTROLYTE SOLUTIONS

Electrolyte solutions are fundamentally different from the other mixtures so far considered. One reason is that the species, such as salts, ionize in solution so that the nature of the pure component and the substance in solution is very different. Another reason is that, because the ions are charged, the interactions are much stronger and longer range than among molecules. Consequently, the solutions are much more nonideal, and the activity coefficient models used for molecules, such as the simple Van Laar model, are not applicable. Also, the anions and cations originating from a single ionizable substance are present in a fixed ratio.

Consider the ionization reaction $A_{\nu_+}B_{\nu_-} = \nu_+ A^{z_+} + \nu_- B^{z_-}$. Since the initial molecule has no net charge, we have

$$\nu_+ z_+ + \nu_- z_- = 0 \tag{76a}$$

or, on a molar basis,

$$z_+ N_A + z_- N_B = 0 \tag{76b}$$

where $\nu_+$ and $\nu_-$ are the stoichiometric coefficients of the ions A and B in the molecule, and $z_+$ and $z_-$ are their charges. By Eq. (76b) the number of moles of each ion cannot be changed independently, so the partial molar Gibbs free energy of each ion cannot be separately measured. As the total molar concentration of salt can be varied, the customary procedure is to define a mean ionic activity coefficient $\gamma_\pm$ based on Henry's law, applicable to both ions, and referenced to a hypothetical ideal one-molal solution as follows:

$$\bar{G}_{AB}(T, P, M) = \bar{G}_{AB}^{\mathrm{Ideal}}(T, P, M = 1) + \nu RT \ln \left[ \frac{M_\pm \gamma_\pm}{M = 1} \right] \tag{77}$$

where $\nu = \nu_+ + \nu_-$ and $M_\pm^{\nu} = M_A^{\nu_+} M_B^{\nu_-}$ is the mean ionic molality. At very low ionic concentrations, the mean ionic activity coefficient $\gamma_\pm$ can be computed from the Debye–Hückel limiting law:

$$\ln \gamma_\pm = -\alpha |z_+ z_-| \sqrt{I} \tag{78}$$

where

$$I = \frac{1}{2} \sum_{i = \mathrm{ions}} z_i^2 M_i$$

In this equation, I is the ionic strength, the sum is over all ions in solution, and α is a temperature-dependent parameter whose value is 1.178 (mol/L)$^{-0.5}$ for water at 25°C. At higher ionic strengths, the following empirical extensions to the limiting law have been used:

$$\ln \gamma_\pm = -\frac{\alpha |z_+ z_-| \sqrt{I}}{1 + \beta \sqrt{I}} \quad\text{and}\quad \ln \gamma_\pm = -\frac{\alpha |z_+ z_-| \sqrt{I}}{1 + \beta \sqrt{I}} + \delta I \tag{79}$$

where β = 1.316 (mol/L)$^{-0.5}$ for water at 25°C, and δ is an adjustable parameter fit to experimental data. Note that Eq. (78) and the first of Eq. (79) predict a steep and continuing decrease of $\gamma_\pm$ with increasing ionic strength, while the last of Eq. (79) correctly predicts first a decrease in $\gamma_\pm$ and then an increase with increasing ionic strength.
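Equations (78) and (79) are simple enough to evaluate directly. The following Python sketch is illustrative only: the function names are hypothetical, the values of α and β are those quoted in the text for water at 25°C, and the value of δ in the usage lines is an arbitrary assumption, since δ must in practice be fitted to data. The code treats molality and mol/L as numerically interchangeable, which is an approximation reasonable only for dilute aqueous solutions.

```python
import math

ALPHA = 1.178  # from the text, water at 25 C
BETA = 1.316   # from the text, water at 25 C


def ionic_strength(molalities, charges):
    """I = 1/2 * sum(z_i^2 * M_i) over all ions in solution."""
    return 0.5 * sum(z * z * m for m, z in zip(molalities, charges))


def ln_gamma_limiting(z_plus, z_minus, I):
    """Debye-Hueckel limiting law, Eq. (78)."""
    return -ALPHA * abs(z_plus * z_minus) * math.sqrt(I)


def ln_gamma_extended(z_plus, z_minus, I, delta=0.0):
    """Extended forms of Eq. (79); delta = 0 gives the first form."""
    root = math.sqrt(I)
    return -ALPHA * abs(z_plus * z_minus) * root / (1.0 + BETA * root) + delta * I


# A 1:1 electrolyte at 0.01 molal: both ions contribute to I.
I = ionic_strength([0.01, 0.01], [+1, -1])
print(math.exp(ln_gamma_limiting(1, -1, I)))
print(math.exp(ln_gamma_extended(1, -1, I)))
print(math.exp(ln_gamma_extended(1, -1, I, delta=0.1)))  # delta assumed
```

Comparing the three printed values at increasing I shows the behavior described above: the limiting law and the first extension decrease monotonically, while a positive δ eventually turns the curve upward.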

Since a solvent of high dielectric constant is needed for a salt to ionize, ions are not found in the vapor phase at normal conditions. However, the strong nonideality of an electrolyte solution containing ions affects vapor–liquid and reaction equilibria. For example, silver chloride is only very slightly soluble in water. The equilibrium constant for the reaction AgCl → Ag+ + Cl− is

$$K = \frac{a_{\mathrm{Ag^+}}\, a_{\mathrm{Cl^-}}}{a_{\mathrm{AgCl}}} = \frac{\dfrac{M_{\mathrm{Ag^+}}}{M = 1} \dfrac{M_{\mathrm{Cl^-}}}{M = 1} (\gamma_\pm)^2}{1} = M_{\mathrm{Ag^+}} M_{\mathrm{Cl^-}} (\gamma_\pm)^2$$

so that

$$M_{\mathrm{Ag^+}} = \frac{K}{(\gamma_\pm)^2 M_{\mathrm{Cl^-}}} \tag{80}$$

The molality of the silver ion that will dissolve is affected by the addition of other ions. If a salt containing neither silver nor chloride ions (e.g., KNO3) is added to a silver chloride solution, the ionic strength of the solution will increase; this will result in a decrease in the mean ionic activity coefficient at low total ionic strength and an increase in the solubility of Ag+ ions. Conversely, at higher ionic strength, the mean ionic activity coefficient will increase, producing a decrease in the solubility of Ag+ ions. However, if a salt containing a Cl− ion is added, there will be a small ionic strength effect, but a large common-ion effect resulting in a decrease in the concentration of Ag+ ions and the solubility of AgCl. That is, because the value of the equilibrium constant is fixed, increasing the Cl− ion concentration by addition of a Cl-containing salt will depress the Ag+ ion concentration.
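The ionic strength and common-ion effects can be compared numerically with a short fixed-point iteration on Eq. (80). The sketch below is only an illustration under several assumptions: the equilibrium constant is a round illustrative value of the order usually quoted for AgCl near 25°C, the activity coefficient is taken from the first form of Eq. (79) with the α and β values given in the text, all added salts are 1:1 electrolytes that dissociate completely, and the function names are hypothetical.

```python
import math

ALPHA = 1.178      # from the text, water at 25 C
BETA = 1.316       # from the text, water at 25 C
K_AGCL = 1.8e-10   # ASSUMED illustrative equilibrium constant (molality basis)


def gamma_pm(I):
    """Mean ionic activity coefficient, first form of Eq. (79), 1:1 electrolyte."""
    root = math.sqrt(I)
    return math.exp(-ALPHA * root / (1.0 + BETA * root))


def agcl_solubility(m_inert=0.0, m_chloride=0.0, iterations=50):
    """Molality of dissolved Ag+ in the presence of added 1:1 salts.

    m_inert    : molality of a salt with no common ion (e.g., KNO3)
    m_chloride : molality of an added chloride salt (e.g., KCl)
    """
    s = math.sqrt(K_AGCL)  # ideal-solution starting guess
    for _ in range(iterations):
        I = s + m_inert + m_chloride          # ionic strength, all ions |z| = 1
        c = K_AGCL / gamma_pm(I) ** 2         # required product M_Ag * M_Cl
        # Solve s * (s + m_chloride) = c for s in a cancellation-free form
        s = 2.0 * c / (m_chloride + math.sqrt(m_chloride ** 2 + 4.0 * c))
    return s


print(agcl_solubility())                   # pure water
print(agcl_solubility(m_inert=0.01))       # inert salt added
print(agcl_solubility(m_chloride=0.01))    # common ion added
```

With these inputs the inert salt raises the computed Ag+ molality slightly, while the same molality of a chloride salt lowers it by orders of magnitude, which is the contrast the paragraph above describes.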

XI. COUPLED REACTIONS

For a state of equilibrium at constant temperature and pressure, the Gibbs free energy should be a minimum. If several chemical reactions occur in a system that are only linked through mass balances, then those reactions that reduce the Gibbs free energy of the system will occur, and those that increase G will not occur. There are, however, other reactions that are more closely coupled. One example is an electrolytic battery in which two electrochemical reactions occur, one of which increases the Gibbs free energy of the system while the other decreases it. When the two half cells are connected, if the sum of the two Gibbs free energy changes is negative, both reactions will occur, including the half-cell reaction that increases the Gibbs free energy of the system. That is, the reaction with a negative Gibbs free energy change is driving the one with a positive change.

Another example is the production of adenosine triphosphate (ATP), a molecule used to store energy in biological systems, by the phosphorylation of adenosine diphosphate (ADP), ADP + phosphate → ATP. The standard-state Gibbs free energy change for this process is 29.3 kJ, so this reaction, by itself, will have a very small equilibrium constant. However, by enzymatic reactions, it is coupled to the oxidation of glucose, C6H12O6 + 6O2 → 6CO2 + 6H2O, with a standard-state Gibbs free energy change of −2807.2 kJ, which is so large that it can drive the phosphorylation of many ADP molecules. In fact, the net overall reaction is

C6H12O6 + 6O2 + 38 ADP + 38 phosphate → 6CO2 + 6H2O + 38 ATP

for which ΔG° = −1756.8 kJ. There are many other examples in biological systems of complex enzymatic reaction networks resulting in one reaction driving another.

SEE ALSO THE FOLLOWING ARTICLES

BIOENERGETICS • HEAT TRANSFER • INTERNAL COMBUSTION ENGINES • PHYSICAL CHEMISTRY • STEAM TABLES

BIBLIOGRAPHY

Pitzer, K. S. (1995). “Thermodynamics,” 3rd ed., McGraw-Hill, New York.

Prausnitz, J. M., Lichtenthaler, R. N., and de Azevedo, E. G. (1999). “Molecular Thermodynamics of Fluid-Phase Equilibria,” 3rd ed., Prentice-Hall, Englewood Cliffs, NJ.

Rowlinson, J. S., and Swinton, F. L. (1982). “Liquids and Liquid Mixtures,” 3rd ed., Butterworths, London.

Sandler, S. I. (1999). “Chemical and Engineering Thermodynamics,” 3rd ed., Wiley, New York.

Smith, J. M., Van Ness, H. C., and Abbott, M. M. (1996). “Introduction to Chemical Engineering Thermodynamics,” 5th ed., McGraw-Hill, New York.


Encyclopedia of Physical Science and Technology EN016D-776 August 1, 2001 9:51

Thermometry

C. A. Swenson
Iowa State University

T. J. Quinn
Bureau International des Poids et Mesures

I. Introduction
II. Standards and Calibrations
III. Thermodynamic Temperatures
IV. Practical Thermometry

GLOSSARY

Fixed point Unique temperature that is associated with a well-defined thermodynamic state of a pure substance, and that generally involves two or three phases in equilibrium.

Ideal gas Assembly of noninteracting particles. Helium gas at a low pressure is a good approximation for an ideal gas.

International Temperature Scale of 1990 Internationally adopted temperature scale (abbreviated ITS-90 or T90) that provides a reference for all current thermometry.

Primary thermometer Device that directly determines thermodynamic temperatures.

Secondary thermometer Instrument that is used for practical thermometry and that must be calibrated in terms of a primary thermometer.

Standard platinum resistance thermometer Carefully specified secondary thermometer that is used in the definition of the IPTS-68 over much of its range.

Thermodynamic temperature Parameter (actually, an energy) that appears in theoretical calculations of thermal effects.

MODERN THERMOMETRY extends over at least 10 decades in temperature, from the temperatures reached in nuclear cooling experiments to those achieved in nuclear explosions. At both the lowest and the highest extremes, temperatures are measured using methods that are related directly to theory and, hence, correspond to thermodynamic temperatures. At intermediate temperatures, where high accuracy is most necessary, temperatures are defined in terms of secondary thermometers (such as the standard platinum resistance thermometer) that have proved to be stable and sensitive and to have calibrations that vary smoothly with thermodynamic temperature. These instruments serve as interpolation devices between a sequence of accurately defined fixed points to which temperatures have been assigned which correspond closely to thermodynamic values. The thermometers that are used in practical situations may be more convenient to use than either thermodynamic thermometers or scale-defining secondary thermometers, may be smaller in size, and/or may be more sensitive, while lacking the smoothness and/or stability criteria.

I. INTRODUCTION

The qualitative aspects of temperature and temperature differences are synonymous with the physiological sensations of “hot” and “cold.” These descriptions are ambiguous, since often it is the heat conductance or even the thermal mass of the material that is sensed, rather than its actual temperature. Hence, the temperature of a glass object always will seem to be less extreme than that of a metal object, even though the two objects are at the same temperature.

The measurement of temperature, or the science of thermometry, is made quantitative through the observation that the physical properties of materials (density, electrical resistance, and color, for instance) change reproducibly as they become “hotter” or “colder.” These changes, which can be relatively large and extremely reproducible for certain well-characterized materials, allow the design and construction of practical thermometers. An important requirement in any science is that measurements made in different localities and in different ways can be related quantitatively, so an agreement on the use of standards must exist. Thermometry standards are based on the observation that certain phenomena always occur at the same, highly reproducible, temperature. The temperatures at which water freezes and then boils under a pressure of 1 atm were recognized very early as being useful thermometric “fixed points,” and the Celsius (formerly called centigrade) temperature scale, t, was based on the assignment of 0 and 100°C, respectively, to these two phenomena. As described below, a number of fixed points are used today to define the currently accepted temperature scale.

Once fixed-point temperatures have been assigned, values are associated with intermediate temperatures by interpolation using a “thermometric parameter” that has been evaluated at both lower- and higher-temperature fixed points. This parameter could be, for instance, the expansion of a liquid in a glass bulb (the liquid-in-glass thermometer) or the electrical resistance of a platinum wire (the platinum resistance thermometer; PRT). Since these interpolations may give answers that depend on the material and/or the physical property involved, the standard temperature scale also must designate the type of interpolation device that is to be used. A carefully specified standard platinum resistance thermometer (SPRT) is the designated interpolation instrument over much of the intermediate temperature range, with other instruments important at the extremes of very high and very low temperatures.

The above discussion places no restrictions on what could be an arbitrary assignment of values to the various fixed points, although a “smooth” relationship between these and, for instance, the resistance of an SPRT would appear to be desirable. The concept of a characteristic thermal energy, or of a theoretical temperature, appears both in the science of thermodynamics and in theoretical calculations of thermal properties of materials. Hence, a natural additional requirement is that fixed-point temperatures (and interpolated values) coincide as closely as possible with theoretical (or thermodynamic, or absolute) temperatures, T, which will be measured in kelvins (K). This requirement can be satisfied using a “primary” thermometer, which is a practical device that can be understood completely in a theoretical sense (a gas thermometer, for instance) and that can be used experimentally to study fixed points and interpolation devices. In addition, for purely practical reasons, temperature intervals measured in kelvins and degrees Celsius should have identical numerical values. This was accomplished historically by making measurements with the primary thermometer at the two defining fixed points for the Celsius scale and by requiring that the corresponding temperature difference be exactly 100 K.

Temperatures on the Celsius scale may have either positive or negative values, since 0°C has been chosen arbitrarily, while T must always be positive, except for unusual situations, and T = 0 (absolute zero) has a definite meaning (see below). Once the above interval equivalence has been established, t and T will differ by an additive constant, which is the absolute temperature (in K) of the ice point. The triple point of water is much more reproducible than the ice point (see below), and the temperatures of this fixed point are defined to be 273.16 K and 0.01°C. This definition, which establishes the size of the kelvin, was based on the best data available in 1960 for the freezing and boiling points of water on the ideal gas scale. Modern measurements (see below) show that a discrepancy exists between this definition and the definition of the Celsius scale, since the temperature interval between the water freezing and the water boiling points is 99.974 K.

Standards decisions are made by the 48-nation General Conference on Weights and Measures (CGPM), which meets every 4 years (1991, 1995, 1999, etc.). The CGPM acts on the advice of 18 national technical experts who form the International Committee on Weights and Measures (CIPM). The CIPM, in turn, relies heavily on the bench scientists who make up the various Consultative Committees where the actual expertise is located. Thus, it is the Consultative Committee on Thermometry (CCT) that has primary responsibility for establishing and monitoring thermometry standards through recommendations that eventually are acted upon by the CGPM. The work of the consultative committees is coordinated by the International Bureau of Weights and Measures in Sèvres, just outside Paris, France. The CCT conducts its quality-control role through exchanges of personnel and devices among laboratories and carries out carefully organized international comparisons of thermometers and fixed points. It publishes the results of these exchanges as well as the results of critical evaluations of data. The CCT was responsible for the establishment, in January 1990, of the International Temperature Scale of 1990 (ITS-90), which replaced the International Practical Temperature Scale of 1968 (IPTS-68). Standards decisions are made with great care and after much deliberation, since mistakes have a long lifetime, with, historically, changes being made only every 20 years or so.

II. STANDARDS AND CALIBRATIONS

A. Fixed Points

A useful thermometric fixed point must be reproducible from sample to sample and must exhibit a sharp, well-defined “signal” to which other measurements can be referred easily. In practice, most fixed points are associated with the properties of high-purity, single-component materials. The practical realization of a fixed point with a high accuracy requires considerable care and experience in both the setting-up and the use of the device, and this is primarily a task for a standards laboratory. Fixed points of all kinds play such an important role in thermometry, however, that they must be a part of a discussion of temperature.

1. Triple Points

The triple point is the unique combination of temperature and pressure at which the liquid, solid, and vapor phases of a pure, single-component system coexist. The triple point of water provides an excellent illustration of this phenomenon; Fig. 1 is a photograph of a water triple-point cell that is used to realize 273.16 K with an accuracy of 10 µK ($10^{-5}$ K). The glass container contains only pure water, with all traces of air removed. The thermometer is inserted into the central well, around which ice is carefully frozen in a mantle, after which a narrow annulus of water is formed around this well by melting ice from the inside out. Thus, the temperature is uniquely defined since all three phases of pure water are present in equilibrium. The cell in Fig. 1 was removed from its refrigeration chamber for the photograph, but the ring of ice is present, and the thin sheath of water around the well is clearly visible.

FIGURE 1 A water triple-point cell for use with PRTs. [Courtesy of Jarrett Instrument Company.]

Triple points also are important at low temperatures. These are obtained by liquefying a gas (oxygen, argon, neon, and hydrogen are examples) in a sealed system and then carefully cooling it until the solid begins to form at the triple point. Impurities in the starting material can cause changes in the triple-point temperature as the sample is frozen (or melted), and the inherent accuracy of the system (a unique definition of the temperature) is lost. Problems of contamination during gas handling are minimized with a system (Fig. 2) in which a high-purity gas at room temperature and 100 atm is sealed permanently into a carefully cleaned stainless-steel container. As this cell is cooled to the triple point, solid and liquid collect around the copper thermometer well, and the temperature can remain extremely constant as the solid is frozen and then melted. Although these cells have been in use only since 1975, they appear to be remarkably stable with time. The development of sealed triple-point cells (some of which contain several different gases in different parts of the cell) has revolutionized the ease with which low-temperature fixed points can be realized. Similar systems also have been used to obtain high-quality triple points at higher temperatures for other pure materials, with mercury, gallium, and indium metals providing examples.

2. Freezing Points

The freezing point is the temperature at which the solid begins to form from the liquid in the presence of atmospheric pressure. The freezing point of water (which defines 0°C), for instance, is approximately 0.01°C lower than the triple point, primarily because the melting temperature of water is depressed by the application of pressure, although it also is affected by dissolved gases and other impurities. The uncontrollable impurity effects make the freezing point of water less satisfactory as a fixed point than the triple point. To prevent ambiguities, standards thermometry is referred exclusively to the triple point of water, which is defined to be exactly 0.01°C. Melting temperatures generally increase with applied pressure, so the freezing points for most materials are higher than the triple points. Since metals tend to oxidize at high temperatures when exposed to air, atmospheric pressure may be transmitted by an inert gas, but the effect is the same. Again, as for triple points, impurities can destroy the sharpness with which the freezing point can be defined.

FIGURE 2 An example of the design for a sealed triple-point cell.

3. Boiling Points: Vapor Pressures

The vapor pressure of a pure substance is a unique function of the temperature, so pressure control is equivalent to temperature control. The normal boiling points of pure substances (where the vapor pressure is 1 standard atm, or 101,325 Pa) have been used as fixed points, primarily those of water, oxygen, and hydrogen. Where possible, boiling points have been replaced as fixed points by triple points of other substances to eliminate problems due to pressure measurement and the existence of temperature gradients in the liquid. The vapor pressure–temperature relations for the liquefied helium isotopes, however, often are used directly for the calibration of other thermometers at temperatures from below 1 to 4.2 K. Reliable experimental results for the vapor pressure–temperature relation are available both for the common isotope of mass 4 (4He) and for the much rarer isotope of mass 3 (3He), and equations describing these form the lower temperature portion of the ITS-90. Other vapor pressure–temperature relations (hydrogen, neon, nitrogen, oxygen) are useful as secondary standards. In this type of measurement, care must be taken to avoid temperature gradients in the liquid (a sensing bulb is preferred) and cold spots along the pressure measuring tube.

4. Superconducting Transitions

The low-temperature electrical resistance of a number of pure metals disappears abruptly at a well-defined temperature that is characteristic of the metal. These superconducting transition temperatures (Tc) have been developed by the National Institute of Standards and Technology as thermometric fixed points for temperatures from 15 mK (tungsten) to 7.2 K (lead). Early data for polycrystalline materials showed appreciable widths for the transitions, and a corresponding lack of accuracy. Later work on single crystals gives much sharper transitions. The magnitude of Tc depends on the presence of a magnetic field, so care must be taken with magnetic shielding and, also, with the magnitude of the measuring field for the noncontact mutual inductance detection method used to determine Tc.

B. Interpolation Devices

A practical interpolation device must be sensitive, capable of a high accuracy and reproducibility, and convenient to use in different environments. The temperature dependence of its thermometric parameter must be “reasonable,” and understood at least qualitatively in a theoretical sense. A very carefully specified form of the platinum resistance thermometer (the SPRT) traditionally has been the interpolation instrument for international scales, and this instrument is used in the definition of the ITS-90 for temperatures from the triple point of hydrogen, 13.8033 K, to the freezing point of silver, 961.78 °C. Platinum has the advantages that it can be obtained with a high purity, can be formed easily into wire, has a very high melting point, and suffers little from oxidation. Many years of use have made the PRT a well-understood instrument both empirically and scientifically.

FIGURE 3 Typical standard platinum resistance thermometers. [Courtesy of Yellow Springs Instrument Company.]

Figure 3 shows two forms of a commercially available SPRT. In each case, the fine-wire sensing element (typically 25 Ω at the triple point of water) is mounted inside a thin, roughly 6-mm-diameter, 40-mm-long platinum sheath, with a glass or fused quartz seal for introducing the electrical leads. A small amount of “air” provides thermal conductance. A four-lead design allows an unambiguous definition of the resistance of the element. The “capsule” version is intended for low-temperature use, where it can be placed in a vacuum-insulated thermometer well, as for the sealed triple-point cell of Fig. 2. The disadvantage of the capsule form is that the four leads from the resistance element are at the same temperature as the capsule, so leakage resistances between the leads can become important at temperatures greater than 200 or 300 °C. The “long-stem” SPRT (Fig. 3, top) reduces this problem since the four leads leave the sealed enclosure at room temperature. Its length, however, makes this instrument impractical for use at temperatures below about 50 K. Internal electrical leakage, which even here becomes a problem for the highest temperatures (above 500 °C), can be minimized through the use of long-stem thermometers with ice-point resistances as low as 0.25 Ω. The stability of an SPRT can be determined through periodic checks of its resistance when it is immersed in a triple-point cell (Fig. 1). A good SPRT will give results that are reproducible to better than 0.1 mK even when different triple-point cells are used. The resistance-temperature characteristics of PRTs are discussed specifically in Section IV.


The SPRT becomes relatively insensitive at temperatures below roughly 13.8 K, and the low-temperature calibration is very sensitive to strains that are caused by shock. Other resistance thermometers are more satisfactory for use below 13.8 K (or even 20 K), most importantly those using a rhodium–iron alloy. At the lowest temperatures, the susceptibilities of elementary magnetic systems (electronic to a few millikelvins, then nuclear) show a particularly simple temperature dependence (the Curie–Weiss law; see below) and are used both for interpolation and extrapolation. The melting curve of the helium isotope of mass 3 (³He) also has a strong pressure–temperature relationship below 0.5 K and is being adopted as a thermometer for use down to 0.9 mK (see below).

At very high temperatures, above roughly 1000 °C, the radiation emitted by a black body can be measured accurately and is used as a measure of temperature (optical pyrometry). Only a single calibration point is required for these measurements, and overlap with the PRT scales is achieved, at least in laboratory measurements. The relative intensities of lines in optical emission or absorption spectra can change with temperature as higher energy levels are excited thermally. These relative intensities can be interpreted directly in terms of T.

C. The ITS-90

1. The Scale Definition

The currently accepted International Temperature Scale of 1990 differs appreciably from its immediate predecessor (the IPTS-68), with the magnitudes of the differences between the two scales shown in Fig. 4. The lower end of the scale now is 0.65 K rather than 13.8 K, differences from thermodynamic temperatures (especially at low temperature) are reduced to give increased smoothness, and the development of high-temperature SPRTs allows their use to the freezing point of silver (961.78 °C). The discontinuity in slope at 630 °C in Fig. 4 is related to the change at this temperature in the interpolation instrument which is used to define the IPTS-68. The relatively accurate and precise SPRT was used at lower temperatures, while the much less precise and stable (±0.2 K) platinum–10% rhodium/platinum thermocouple was used to the gold point.

FIGURE 4 Differences between the ITS-90 and its predecessor, the IPTS-68. [From the BIPM.]

The ITS-90 is defined in terms of the 17 fixed points in Table I, with vapor pressure–temperature relations for the helium isotopes extending the scale definition to 0.65 K. These fixed points are characterized as vapor pressure (v), triple point (tp), or freezing point (fp), with no boiling points being used. The triple point of water is assigned the exact value 273.16 K, with the relationship between the Kelvin and the Celsius temperatures defined as

$$t_{90}/^{\circ}\mathrm{C} = T_{90}/\mathrm{K} - 273.15; \qquad (1)$$

273.15 appears here instead of 273.16 since, as discussed in the Introduction (Section I), Celsius temperatures are based on the freezing, not the triple, point of water.
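Equation (1) can be sketched in a few lines of Python. The function names below are illustrative and are not part of the scale definition.

```python
def kelvin_to_celsius(t90_kelvin: float) -> float:
    """Eq. (1): t90/°C = T90/K - 273.15."""
    return t90_kelvin - 273.15


def celsius_to_kelvin(t90_celsius: float) -> float:
    """Inverse of Eq. (1)."""
    return t90_celsius + 273.15


# The triple point of water (273.16 K) lies 0.01 °C above the Celsius zero.
print(round(kelvin_to_celsius(273.16), 2))  # 0.01
```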

The ITS-90 is described most readily in terms of the four interpolation methods (instruments) which are used to define it in four distinct but overlapping temperature ranges. These overlaps represent a change in philosophy from the IPTS-68, since no overlap was allowed between the four ranges which defined that scale.

TABLE I Fixed-Point Temperatures for the ITS-90

No.  Fixed point           T90 (K)     t90 (°C)
 1   Helium (v)            3 to 5      −270.15 to −268.15
 2   e-Hydrogen (tp)       13.8033     −259.3467
 3   e-Hydrogen (v or g)   ≈17         ≈−256.15
 4   e-Hydrogen (v or g)   ≈20.3       ≈−252.85
 5   Neon (tp)             24.5561     −248.5939
 6   Oxygen (tp)           54.3584     −218.7916
 7   Argon (tp)            83.8058     −189.3442
 8   Mercury (tp)          234.3156    −38.8344
 9   Water (tp)            273.16      0.01
10   Gallium (fp)          302.9146    29.7646
11   Indium (fp)           429.7485    156.5985
12   Tin (fp)              505.078     231.928
13   Zinc (fp)             692.677     419.527
14   Aluminum (fp)         933.473     660.323
15   Silver (fp)           1234.93     961.78
16   Gold (fp)             1337.33     1064.18
17   Copper (fp)           1357.77     1084.62

The low-temperature portion of the ITS-90 is divided into two regions. For the lowest temperatures (0.65 to 5 K), explicit equations are given for the vapor pressure–temperature relations for the two helium isotopes. Temperatures between 3 K and the triple point of neon (24.5561 K) are defined by an interpolating constant volume gas thermometer (see Section III.B.1), which uses either ⁴He or ³He as the working substance. A procedure is given for correcting the gas thermometer pressures (slightly) for the nonideal behavior of these gases, after which the parameters for a parabolic pressure–temperature relation are determined from the corrected pressures at fixed points 1, 2, and 5 in Table I.

The platinum resistance thermometer (an SPRT) is used to define the ITS-90 from 13.8 K (2 in Table I) to 961.78 °C (the freezing point of silver; 15), with the acknowledgment that no single instrument is likely to be usable over this whole range. The characteristics of a real thermometer were used to generate an SPRT interpolation relation which, to obtain the required accuracy, is quite complex. To eliminate differences between thermometers due to different resistances, the primary variable which is used for interpolation is the dimensionless ratio of the thermometer resistance at a given temperature to its value at the triple point of water, 273.16 K,

$$W(T_{90}) = R(T_{90})/R(273.16\ \mathrm{K}). \qquad (2)$$

The triple-point value of R typically is approximately 25 Ω for an SPRT, which will be used from the lowest temperatures to, possibly, 400 °C, with smaller values (as low as 0.25 Ω) used for the highest-temperature applications.

A PRT that is acceptable for representing the ITS-90 (an SPRT) must have a high-purity, strain-free platinum element; the ITS-90 defines such an element as one for which either W(29.7646 °C) ≥ 1.11807 (the gallium point) or W(−38.8344 °C) ≤ 0.844235 (the mercury triple point). An SPRT that is to be used to the freezing point of silver in addition must have W(961.78 °C) ≥ 4.2844. These requirements eliminate many relatively inexpensive commercial thermometers. A practical requirement which is not stated in the scale is that an SPRT must have a reproducibility at the triple point of water after temperature cycling of better than 1 mK (preferably 0.1 mK). Thermometers which are used above the zinc point (419.527 °C; Table I) require careful treatment because of effects due to annealing of the platinum element.
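The acceptance criteria above, together with the resistance ratio of Eq. (2), can be expressed as a short check. The following Python sketch is only an illustration: the function names and the example resistances are hypothetical, and only the three threshold values come from the text.

```python
# Threshold values of W quoted in the text for an acceptable SPRT.
GALLIUM_MIN_W = 1.11807    # W(29.7646 °C) must be at least this
MERCURY_MAX_W = 0.844235   # W(-38.8344 °C) must be at most this
SILVER_MIN_W = 4.2844      # W(961.78 °C), needed only for use up to the silver point


def resistance_ratio(r_at_t: float, r_at_tpw: float) -> float:
    """Eq. (2): W = R(T90) / R(273.16 K)."""
    return r_at_t / r_at_tpw


def is_acceptable_sprt(w_gallium=None, w_mercury=None,
                       w_silver=None, use_to_silver=False) -> bool:
    """Return True if the measured ratios satisfy the stated criteria.

    Either the gallium or the mercury criterion is sufficient; the silver
    criterion is required in addition when use_to_silver is True.
    """
    purity_ok = (
        (w_gallium is not None and w_gallium >= GALLIUM_MIN_W)
        or (w_mercury is not None and w_mercury <= MERCURY_MAX_W)
    )
    if not purity_ok:
        return False
    if use_to_silver:
        return w_silver is not None and w_silver >= SILVER_MIN_W
    return True


# Hypothetical readings: 25.5 ohm at the water triple point, 28.513 ohm at gallium.
w_ga = resistance_ratio(28.513, 25.5)
print(is_acceptable_sprt(w_gallium=w_ga))
```

Passing `use_to_silver=True` without a silver-point ratio returns False, since the additional criterion cannot be verified.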

The mathematical functions that are required to describe the ITS-90 reference interpolation relation for an SPRT are quite complex. For temperatures from 13.8 to 273.16 K, a 13-term power series is required to give ln[Wr(T90)] as a function of ln[T90/273.16 K], while the inverse relation, which gives T90 as a function of Wr(T90), requires a 16-term power series. The corresponding power series for temperatures from 0 to 961.78 °C each contain “only” 10 terms.

Only rarely will the temperature dependence of the resistance for a real thermometer, W(T90), agree with that given by the reference function, Wr(T90). The values of W and Wr are compared at the various fixed points, and the differences are used to determine the parameters in a deviation function which then is used together with the reference relation to obtain T90. The details again are complex; an SPRT which is to be used from 13.8 to 273.16 K must be calibrated at points 2 through 9 (Table I) to determine the eight parameters in the deviation function. For a calibration which is to be used only within ±30 °C of the ice point, the thermometer need only be calibrated at the mercury point, the water triple point, and the gallium point to determine two parameters for the deviation function. All in all, 11 possible subranges are defined; 4 depend on the lowest temperature below 273.16 K at which the thermometer will be used, 1 is for temperatures near 0 °C, and 6 depend on the maximum temperature above 0 °C at which the thermometer will be used.

A question immediately arises as to the agreement that can be expected between temperatures obtained at, for instance, −15 °C, for a given thermometer which has been calibrated using five different procedures and five different sets of fixed points. This is the “uniqueness” problem. The belief is that the differences at a given temperature between calibrations using different ranges will be comparable with differences between different thermometers which are calibrated in a given range. This “nonuniqueness” will be a few tenths of a millikelvin near room temperature, less than 1 mK for the more extreme parts of the scale between 13.8 K and 420 °C, and should be less than 5 mK at the highest temperatures.

The highest range of the ITS-90, above the silver point, is defined by optical pyrometry, using Planck’s law to obtain the radiant emission from a black-body cavity for a given wavelength, λ, and bandwidth. The ratio of the spectral radiances at the temperature T90 and at the reference temperature, T90(X), is related to the absolute temperature by

$$\frac{L_\lambda(T_{90})}{L_\lambda[T_{90}(X)]} = \frac{\exp[c_2/\lambda T_{90}(X)] - 1}{\exp[c_2/\lambda T_{90}] - 1}, \qquad (3)$$

where T90(X) refers to any one of the silver [T90(Ag) = 1234.93 K], the gold [T90(Au) = 1337.33 K], or the copper [T90(Cu) = 1357.77 K] freezing points. Here the optical pyrometer both defines the scale and serves as the interpolation device. The ITS-90 specifies the use of the theoretical value for the constant c2, so there are no adjustable parameters in this relation. Proper realization of temperatures by pyrometry requires care in the design of the cavities in which the gold and the sample are located, and as with most thermometry, care must be taken to avoid systematic errors.
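Because Eq. (3) contains no adjustable parameters, it can be inverted in closed form to give T90 from a measured radiance ratio. The Python sketch below illustrates this under stated assumptions: the value c2 = 0.014388 m·K is the one adopted in the ITS-90 text, while the 650-nm wavelength and the function names are illustrative choices that do not come from this article.

```python
import math

C2 = 0.014388  # second radiation constant, m·K (value adopted in the ITS-90)


def radiance_ratio(t90: float, t_ref: float, wavelength: float) -> float:
    """Eq. (3): ratio of spectral radiances at T90 and at the reference T90(X)."""
    return math.expm1(C2 / (wavelength * t_ref)) / math.expm1(C2 / (wavelength * t90))


def temperature_from_ratio(ratio: float, t_ref: float, wavelength: float) -> float:
    """Invert Eq. (3) for T90 given a measured radiance ratio."""
    x_ref = math.expm1(C2 / (wavelength * t_ref))   # exp[c2/(lambda*T_ref)] - 1
    return C2 / (wavelength * math.log1p(x_ref / ratio))


T_GOLD = 1337.33        # K, gold freezing point
WAVELENGTH = 650e-9     # m, assumed pyrometer wavelength (illustrative)

r = radiance_ratio(1500.0, T_GOLD, WAVELENGTH)
print(r, temperature_from_ratio(r, T_GOLD, WAVELENGTH))
```

A source hotter than the reference gives a ratio greater than 1, and a ratio of exactly 1 returns the reference temperature.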

2. Calibration Procedures

Working thermometers (either transfer standards or working instruments) should be calibrated by following the procedures outlined in the basic ITS-90 document to reproduce the scale. In practice, this can be a cumbersome procedure, especially at low temperatures, where gas thermometry requires long-term experiments. In this temperature region, gas thermometry results will be transferred to highly stable rhodium–iron resistance thermometers, and most subsequent calibrations will be carried out in terms of “point-by-point” comparisons at thermal equilibrium between a set of standard thermometers and the unknown thermometer(s). This also may be true for higher, PRT, temperatures when calibrations are not carried out at a national standards laboratory. In this instance, “standards” which have been calibrated directly on the ITS-90 may be used as substitutes for true fixed point devices. Three standard thermometers are the useful minimum, since not more than one would be expected to show drift (instability) in any given period of time. The result is a table of temperatures and corresponding W’s, with the W’s converted to R(T90) using the measured R(273.16 K) = R0 to eliminate dependence on a standard resistance value. To a first approximation, small changes in R0 will have little effect on the W(T90) relationship for a thermometer.

For moderate and low temperatures, the sheaths of the thermometers can be inserted in individual mounting holes in an isothermal metal block. Thermal shielding of the block, anchoring of the leads to the block, vacuum insulation, and temperature control all are important factors in such a thermometer comparator. Variable-temperature baths (oil or possibly molten salt) are used at higher temperatures where long-stem thermometers must be used. Calibrations carried out by each of the national standards laboratories can be expected to be equivalent, and to represent the ITS-90 within stated uncertainties. Other calibration sources, which generally are traceable to a national standards laboratory, generally have less rigorous controls, and care must be taken in assessing the accuracy of calibrations that are supplied. If accuracy is important, the performance of a thermometer can be spot-checked with commercially available sealed fixed-point devices, with gallium (see Table I) being most useful near room temperature. This may be particularly important when highly accurate thermometry is required for the maintenance of standards or for biological studies.

D. Electrical Measurements

High-quality electrical measurements traditionally have used very accurate dc techniques. Voltages were measured potentiometrically in terms of standard cells, while resistances were measured using Wheatstone or other types of bridges. For accurate work, a standard resistor or a resistance thermometer is designed with four terminals, two of which are for the measuring current, while the second pair, mounted just inside the current leads at each end, measures the potential drop across the resistor. If a conventional Wheatstone-type bridge technique is used, the bridge determines the sum of the resistances of the resistor and of the leads, so a separate measurement of the resistance of a pair of leads at one end of the resistor (or thermometer) must be made. Care must be taken that the lead resistances are symmetrical. These measurements can be simplified if a potentiometer is used to compare directly the potential drops across a standard resistor and the unknown for a common current. In this case, negligible current flows through the potential leads, and no lead correction is required.

In both bridge and potentiometric measurements, parasitic emfs (voltages) can exist in the lead wires and the measuring instrument, with current reversal required to eliminate their effects. In addition, since the bridge contains standard resistances of various magnitudes, these must be intercompared and recalibrated regularly to detect aging effects. The linearity of a dc potentiometer also must be calibrated at regular intervals for the same reason.
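The potentiometric comparison and the current-reversal technique described above can be illustrated numerically. The following Python sketch assumes that the parasitic emf is constant during the two readings; the function names and all numerical values are hypothetical.

```python
def reversal_average(v_forward: float, v_reverse: float) -> float:
    """Combine readings taken with +I and -I.

    A constant parasitic emf adds equally to both readings, so half the
    difference retains only the I*R term.
    """
    return 0.5 * (v_forward - v_reverse)


def resistance_by_comparison(r_standard: float,
                             vx_fwd: float, vx_rev: float,
                             vs_fwd: float, vs_rev: float) -> float:
    """Unknown resistance from potential drops measured at a common current."""
    v_unknown = reversal_average(vx_fwd, vx_rev)
    v_standard = reversal_average(vs_fwd, vs_rev)
    return r_standard * v_unknown / v_standard


# Hypothetical example: 1 mA through a 25-ohm standard and a 27.5-ohm thermometer,
# with parasitic emfs of +2 microvolts and -1 microvolt in the two potential circuits.
current = 1.0e-3
r_std, r_true = 25.0, 27.5
emf_x, emf_s = 2.0e-6, -1.0e-6
r_measured = resistance_by_comparison(
    r_std,
    current * r_true + emf_x, -current * r_true + emf_x,
    current * r_std + emf_s, -current * r_std + emf_s,
)
print(r_measured)
```

Without the reversal, the 2-µV offset on a 27.5-mV signal would correspond to a relative error of roughly 7 parts in 10⁵ in this example.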

Modern semiconductor technology has caused major changes in the above procedures. First, voltmeters now routinely have extremely high input impedances (greater than 1000 MΩ) and linearities at the 10⁻⁶ level. Hence, most accurate electrical measurements now are made using these instruments rather than potentiometers or bridges. Modern multimeters often can be used in a four-terminal mode for resistance measurement, and most can be interfaced directly with a computer for experimental control and data acquisition.

When the highest accuracy in resistance measurement is required, variations of the potentiometer technique are used in which the accurate division of voltage levels is carried out using ratio transformers rather than resistive windings. These components are very similar to ideal transformers or inductors, with windings on a high-permeability mumetal toroid system for which the stability is determined by winding geometry rather than a physical property. The current comparator is a dc instrument in which the condition for zero magnetic flux in a core is used to determine the ratio of currents through two resistances (a standard and an unknown) when the potential drops across them are equal. The effects of parasitic voltages are eliminated by using current reversal. These instruments are in common use in standards laboratories and are capable of determining resistance ratios potentiometrically at the 10⁻⁸ level. This corresponds to better than 10 µK for an SPRT with a 25-Ω ice-point resistance and is better than the long-term stability of many standard resistances. It is for this reason that SPRT measurements are always expressed in terms of Eq. (2), using a direct determination of R(273.16 K).
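The correspondence between a ratio resolution of 10⁻⁸ and a temperature resolution of a few microkelvins can be checked with a rough estimate. The sketch below assumes a sensitivity dW/dT of about 4 × 10⁻³ K⁻¹ near the ice point, an approximate figure for pure platinum that is not stated in this article; the function name is illustrative.

```python
def temperature_resolution(ratio_resolution: float,
                           dw_dt: float = 4.0e-3) -> float:
    """Temperature equivalent (K) of a given resolution in W = R(T)/R(273.16 K).

    dw_dt is the assumed sensitivity dW/dT in 1/K (approximate value for
    platinum near 273 K).
    """
    return ratio_resolution / dw_dt


delta_t = temperature_resolution(1.0e-8)
print(f"{delta_t * 1e6:.1f} microkelvin")   # about 2.5 microkelvin

# The same resolution expressed as a resistance change for a 25-ohm element:
delta_r = 25.0 * 1.0e-8
print(f"{delta_r * 1e6:.2f} microohm")      # about 0.25 microohm
```

With this assumed sensitivity the estimate is consistent with the statement that 10⁻⁸ in the ratio corresponds to better than 10 µK.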

Various alternating current bridges and potentiometers have been constructed using ratio transformer techniques. Figure 5 shows a very simple version of an ac ratio-transformer bridge. The ac voltage drop across an unknown resistor is compared with a fraction of the voltage drop across a standard resistor. This fraction, which is determined by the turns ratio, is adjusted until a null is indicated at the detector. Typically, this is a phase-sensitive detector with transformer input and a sensitivity to extremely low (nV; 10⁻⁹ V) voltages. This bridge is useful primarily for temperature control, since the finite input impedance of the transformer (typically 10⁵ Ω at 400 Hz) causes unacceptable shunting of the reference resistor. The input impedance of the transformer can be increased greatly by sophisticated designs that use multiple cores and windings and operational amplifier feedback. As a result, accuracies of 10⁻⁸ are also reported for the ac measurement of a standard 25-Ω SPRT.

FIGURE 5 An elementary ac ratio-transformer bridge for resistance measurements.

Although the effects of parasitic dc voltages are eliminated with ac methods, frequency-dependent lead admittance effects (due to shunt capacitances between thermometer leads) are important, and both in-phase and quadrature balance conditions must be met. This is accomplished in Fig. 5 with the variable shunt capacitor. It is for this reason that ac bridges are restricted to relatively low resistance values for the most accurate work.

III. THERMODYNAMIC TEMPERATURES

A. General Concepts

The concept of thermodynamic temperature arises from the second law of thermodynamics and the existence of reversible heat effects, such as for the isothermal compression of an ideal gas. The maximum (Carnot) efficiency for a heat engine, for example, is expressed in terms of a ratio of thermodynamic temperatures.

Developments of statistical mechanics contain a characteristic energy that is the same for all systems that are in thermal equilibrium and that increases as the internal energy of a system is increased. This characteristic energy has properties that are identical to those of temperature as it is defined in both the thermodynamic and the practical senses. This characteristic energy appears in an elementary manner in the Boltzmann factor, which determines the relative populations of two states that are separated by an energy difference E,

$$N_1/N_2 = \exp(-E/k_B T). \qquad (4)$$

In this expression, kBT is the characteristic energy, and kB (as yet undetermined) is the Boltzmann constant. Equation (4) suggests that the concept of a level of temperature is purely relative. A collection of systems can be said to be at a low temperature (close to T = 0) if most (all) of them are in their lowest energy (ground) state, that is, if E ≫ kBT. Alternatively, a high temperature corresponds to an equal population of the levels. Whether or not a temperature is “high” or “low” thus depends on the characteristic energies of the system and is a purely relative concept. Absolute zero corresponds to a state at which every conceivable system is in its ground state. Negative temperatures occur when (as in some laser systems) an upper metastable level has been forced to have a larger population than a lower level.
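The relative character of “high” and “low” temperature in Eq. (4) is easy to see numerically. The Python sketch below uses the SI value of the Boltzmann constant; the function name and the example energy gap (equivalent to 1 K) are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K (SI defined value)


def population_ratio(energy_gap: float, temperature: float) -> float:
    """Eq. (4): N1/N2 = exp(-E / (kB * T)) for an upper level 1 and lower level 2."""
    return math.exp(-energy_gap / (K_B * temperature))


gap = K_B * 1.0  # energy difference equivalent to 1 K (illustrative)

for t in (0.1, 1.0, 100.0):
    print(t, population_ratio(gap, t))
# At 0.1 K (E >> kB*T) the upper level is almost empty; at 100 K (E << kB*T)
# the two populations are nearly equal.

# A population inversion (ratio > 1) corresponds formally to a negative temperature.
print(population_ratio(gap, -1.0))
```

The same gap therefore looks “cold” at 0.1 K and “hot” at 100 K, which is the sense in which the text calls temperature a relative concept.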

The relationship between theoretical and practical temperatures (see Section I) has been determined most often using measurements made with an ideal gas. The experimental equation of state for such a system is written

$$P V_m = RT, \qquad (5)$$

with Vm the volume per gram molecular weight of the gas, R the gas constant per mole (8.314 J/mol·K), and T related to the Celsius scale by Eq. (1). Since a Carnot heat engine with an ideal gas as the working medium has an efficiency identical to that of a Carnot cycle, T as it appears in Eq. (5) can be chosen to be equal to thermodynamic temperatures.

Statistical mechanics as applied to an ideal gas (a collection of noninteracting particles) also gives Eq. (5), if RT is assumed to be proportional to the characteristic thermal energy of the system and to the total number of particles. The association with Eq. (4) exists through the introduction of the gas constant per molecule, the Boltzmann constant, kB = R/NA, where NA, the Avogadro constant, is the number of molecules in a gram molecular weight of a substance. The characteristic thermal energy that appears in the Boltzmann relation is the same as that which appears in the ideal-gas law.

B. Absolute or Primary Thermometers

The use of fixed points and designated interpolation instruments would not be necessary if an absolute or primary thermometer could be used directly as a practical thermometer. A single calibration of such a thermometer at the triple point of water (273.16 K) would serve to standardize the thermometer once and for all. Unfortunately, most primary thermometers are relatively clumsy devices and may require elaborate instrumentation and possibly long equilibrium and/or measurement times.

Two exceptions are the optical pyrometer at high temperatures and the magnetic thermometer at low temperatures. In each of these cases, data are taken using the primary thermometric parameter, with this parameter related directly by theory to the absolute temperature. At intermediate temperatures, fixed points and easily used secondary thermometers must be used for the routine measurement of temperature. Primary thermometers, then, are used to establish the temperatures that are assigned to the fixed points and to test the smoothness and appropriateness of the calibration relations that are used with the secondary thermometers.

The following sections discuss briefly the various types of primary thermometers that have been used to obtain accurate thermodynamic temperatures. Gas thermometry in various forms traditionally has been of primary importance in this area, but modern optical pyrometry has comparable importance at high temperatures, and noise and magnetic thermometry also have had important complementary roles. The existence of several approaches for a given temperature range is important to provide confidence in the relationship between theory and experiment, and to provide information about the possible existence of systematic errors.

1. Gas Thermometry

The ideal-gas law [Eq. (5)] is valid experimentally for a real gas only in the low-pressure limit, with higher-order terms (the virial coefficients, not defined here) effectively causing R to be both pressure and temperature dependent for most experimental conditions. While these terms can be calculated theoretically, most gas thermometry data are taken for a variety of pressures, and the ideal-gas limit, and, hence, the ideal-gas temperature, is achieved through an extrapolation to P = 0. The slope of this extrapolation gives the virial coefficients, which are useful not only for experimental design, but also for comparison with theory. The following discussion of ideal-gas thermometry is concerned, first, with conventional gas thermometry, then with the measurement of sound velocities, and, finally, with the use of capacitance or interferometric techniques. Each of these instruments should give comparable results, although the “virial coefficients” will have different forms.

Gas thermometry in the past 20 years or so has benefited from a number of innovations that have improved the accuracy of the results. Pressures are measured using free piston (dead weight) gauges that are more flexible and easier to use than mercury manometers. The thermometric gas (usually helium) is separated from the pressure-measuring system by a capacitance diaphragm gauge, which gives an accurately defined room-temperature volume and a separation of the pressure-measurement system from the working gas. In addition, residual-gas analyzers can determine when the thermometric volume has been sufficiently degassed to minimize desorption effects.

In isothermal gas thermometry, absolute measurements of the pressure, volume, and quantity of a gas (number of moles) are used with the gas constant to determine the temperature directly from Eq. (5). Data are taken isothermally at several pressures, and the results are extrapolated to P = 0 to obtain the ideal-gas temperature as well as the virial coefficients. A measurement at 273.16 K gives the gas constant.
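The extrapolation to P = 0 can be sketched as a straight-line fit. The Python example below assumes that only the second virial coefficient B matters, so that PVm = RT + BP to first order in pressure; this truncation, the CODATA value of R, the function names, and the synthetic data (T = 20 K, B = −3 × 10⁻⁶ m³/mol) are all illustrative assumptions rather than values from this article.

```python
R = 8.314462618  # gas constant, J/(mol K) (CODATA value, assumed here)


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def ideal_gas_temperature(pressures, molar_volumes):
    """Extrapolate P*Vm/R to P = 0 along one isotherm.

    Returns (T, B): the ideal-gas temperature in K and the second virial
    coefficient in m^3/mol, assuming P*Vm = R*T + B*P.
    """
    apparent_t = [p * v / R for p, v in zip(pressures, molar_volumes)]
    intercept, slope = fit_line(pressures, apparent_t)
    return intercept, slope * R


# Synthetic isotherm generated from assumed values of T and B.
T_TRUE, B_TRUE = 20.0, -3.0e-6
pressures = [2.0e3, 4.0e3, 6.0e3, 8.0e3, 1.0e4]            # Pa
molar_volumes = [R * T_TRUE / p + B_TRUE for p in pressures]  # m^3/mol

t_fit, b_fit = ideal_gas_temperature(pressures, molar_volumes)
print(t_fit, b_fit)
```

Real data would carry noise and might need a higher-order term at larger pressures; the intercept is still the quantity identified with the ideal-gas temperature.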

A major problem in isothermal gas thermometry is determining the quantity of gas in the thermometer, since this ultimately requires the accurate measurement of a small difference between two large masses. Most often, this problem is bypassed by “filling” the thermometer to a known pressure at a standard temperature, with relative quantities of gas for subsequent fillings determined by division at this temperature between volumes that have a known ratio. The standard temperature may involve a fixed point or, for temperatures near the ice point, an SPRT that has been calibrated at the triple point of water. Since the volume of the gas for a given filling is constant for data taken on several subsequent isotherms, and the mass ratios are known very accurately, the absolute quantity of gas needs to be known only approximately. Excellent secondary thermometry is very important to reproduce the isotherm temperatures for subsequent gas thermometer fillings. The results for the isotherms (virial coefficients and temperatures) then are referenced to this standard “filling temperature.”

The procedure for constant-volume gas thermometry is very much the same as that for isotherm thermometry, but detailed bulb pressure data are taken as a function of temperature for one (and possibly more) “filling” of the bulb at the standard temperature. To first order, pressure ratios are equal to temperature ratios, with thermodynamic temperatures calculated using known virial coefficients. In practice, the virial coefficients vary slowly with temperature, so a relatively few isotherm determinations can be sufficient to allow the detailed investigation of a secondary thermometer to be carried out using many data points in a constant-volume gas thermometry experiment. If the constant-volume gas thermometer is to be used in an interpolating gas thermometer mode (as for the ITS-90), the major corrections are due to the nonideality of the gas. When a nonideality correction is made using known values for the virial coefficients, the gas thermometer can be calibrated at three fixed points (near 4 and at 13.8 and 24.6 K) to give a quadratic pressure–temperature relation that corresponds to T within roughly 0.1 mK.
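The three-point calibration of an interpolating gas thermometer amounts to passing a quadratic T(P) through three (pressure, temperature) pairs. The Python sketch below does this with Lagrange interpolation; the corrected pressures are invented for illustration, the lowest calibration temperature (4.2 K) is an assumed value within the helium vapor-pressure range, and the nonideality correction itself is not modeled.

```python
def quadratic_through(points):
    """Return T(P), the unique quadratic through three (P, T) calibration points."""
    (p0, t0), (p1, t1), (p2, t2) = points

    def t_of_p(p: float) -> float:
        # Lagrange form of the interpolating parabola.
        return (t0 * (p - p1) * (p - p2) / ((p0 - p1) * (p0 - p2))
                + t1 * (p - p0) * (p - p2) / ((p1 - p0) * (p1 - p2))
                + t2 * (p - p0) * (p - p1) / ((p2 - p0) * (p2 - p1)))

    return t_of_p


# Hypothetical corrected bulb pressures (Pa) at the three calibration temperatures (K).
calibration = [
    (4200.0, 4.2),        # helium vapor-pressure point (assumed 4.2 K)
    (13810.0, 13.8033),   # triple point of equilibrium hydrogen
    (24580.0, 24.5561),   # triple point of neon
]
t90 = quadratic_through(calibration)
print(t90(20000.0))  # interpolated temperature for a pressure between the fixed points
```

Because the pressure is nearly proportional to temperature, the quadratic term is small and acts as a correction to a linear relation.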

The velocity of sound in an ideal gas is given by

$$c^2 = (C_P/C_V)RT/M, \qquad (6)$$

where the heat capacity ratio (CP/CV) is 5/3 for a monatomic gas such as helium. Since times and lengths can be measured very accurately, the measurement of acoustic velocities by the detection of successive resonances in a cylindrical cavity (varying the length at constant frequency) appears to offer an ideal way to measure temperature. This is not completely correct, however, since boundary (wall and edge) effects that affect the velocity of sound are important even for the simplest case in which only one mode is present in the cavity (frequencies of a few kilohertz). These effects unfortunately become larger as the pressure is reduced. An excellent theory relates the attenuation in the gas to these velocity changes, but the situation is very complex and satisfactory results are possible only with complete attention to detail. An alternative configuration uses a spherical resonator in which the acoustic motion of the gas is perpendicular to the wall, thus eliminating viscosity boundary layer effects. The most reliable recent determination of the gas constant, R, is based on very careful sound velocity measurements in argon as a function of pressure at 273.16 K, using a spherical resonator.
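Equation (6) can be used in either direction: to predict the sound velocity at a known temperature or to infer the temperature from a measured velocity. The Python sketch below treats the gas as ideal and ignores the boundary and pressure corrections discussed above; the CODATA value of R, the molar mass of argon (0.039948 kg/mol), and the function names are assumptions made for the illustration.

```python
import math

R = 8.314462618           # gas constant, J/(mol K) (CODATA value, assumed here)
GAMMA_MONATOMIC = 5.0 / 3.0


def sound_speed(temperature: float, molar_mass: float,
                gamma: float = GAMMA_MONATOMIC) -> float:
    """Eq. (6): c = sqrt(gamma * R * T / M), in m/s for M in kg/mol."""
    return math.sqrt(gamma * R * temperature / molar_mass)


def temperature_from_sound_speed(speed: float, molar_mass: float,
                                 gamma: float = GAMMA_MONATOMIC) -> float:
    """Eq. (6) solved for T: T = M * c**2 / (gamma * R)."""
    return molar_mass * speed ** 2 / (gamma * R)


M_ARGON = 0.039948  # kg/mol (assumed molar mass of argon)

c_argon = sound_speed(273.16, M_ARGON)
print(c_argon)                                         # roughly 308 m/s
print(temperature_from_sound_speed(c_argon, M_ARGON))  # recovers 273.16 K
```

Read the other way, a velocity measured at the known temperature 273.16 K and extrapolated to zero pressure yields R, which is the use described in the text.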

The dielectric constant and index of refraction of an ideal gas also are density dependent through the Clausius–Mossotti equation,

$(\varepsilon_r - 1)/(\varepsilon_r + 2) = \alpha/V_m = \alpha P/RT$, (7)

in which $\varepsilon_r$ (= $\varepsilon/\varepsilon_0$) is the dielectric constant and α is the molar polarizability. Equation (7) suggests that an isothermal measurement of the dielectric constant as a function of pressure should be equivalent to an isothermal gas thermometry experiment, while an experiment at constant pressure is equivalent to a constant-volume gas thermometry experiment. The dielectric constant, which is very close to unity, is most easily determined in terms of the ratio of the capacitance of a stable capacitor that contains gas at the pressure P to its capacitance when evacuated. The results that are obtained when this ratio is measured using a three-terminal ratio transformer bridge are comparable in accuracy with those from conventional gas thermometry. An advantage is that the quantity of gas in the experiment need never be known, although care must be taken in cell design to ensure that the nonnegligible changes in cell dimensions with pressure can be understood in terms of the bulk modulus of the (copper) cell construction material.
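A minimal sketch of the idea follows, taking the ideal-gas molar volume RT/P in the Clausius–Mossotti relation so that the measured capacitance ratio gives the temperature. The helium molar polarizability is an approximate value supplied here as an assumption, and cell deformation and virial corrections are ignored; the function names are hypothetical.

```python
R = 8.314462618        # molar gas constant, J/(mol K)
ALPHA_HELIUM = 5.17e-7  # approximate molar polarizability of helium, m^3/mol


def dielectric_constant(P, T, alpha=ALPHA_HELIUM):
    """Relative permittivity of an ideal gas from Clausius-Mossotti."""
    x = alpha * P / (R * T)          # equals (eps_r - 1) / (eps_r + 2)
    return (1.0 + 2.0 * x) / (1.0 - x)


def temperature_from_capacitance_ratio(ratio, P, alpha=ALPHA_HELIUM):
    """Temperature from the ratio C(P)/C(vacuum), which equals eps_r."""
    x = (ratio - 1.0) / (ratio + 2.0)
    return alpha * P / (R * x)


eps = dielectric_constant(1.0e5, 273.16)
print(eps - 1.0)   # a few parts in 1e5, showing how close eps_r is to unity
print(temperature_from_capacitance_ratio(eps, 1.0e5))
```

The smallness of the quantity printed first is the practical difficulty: the capacitance ratio must be resolved to parts in 10^9 or better for millikelvin-level work.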

At high frequencies (those of visible light), the dielectric constant is equal to the square of the index of refraction of the gas ($\varepsilon_r = n^2$), so an interferometric experiment should also be useful as a primary thermometer. No results for this type of experiment have been reported, however.

2. Black-Body Radiation

The energy radiated from a black body is a function of both temperature and wavelength [Eq. (3)]. An ideal black body has an emissivity (and hence an absorptivity) of unity, or a zero reflectivity. The design of high-temperature black bodies to satisfy this condition requires considerable care. In practice, a usable design would consist of a long cylindrical graphite cavity with a roughened interior that is, for instance, surrounded by freezing gold to maintain isothermal conditions. The practical aspects of optical pyrometry are discussed briefly in Section IV. For the present purposes, optical pyrometry using well-defined wavelengths and sensitive detectors (so-called photon-counting techniques) can be used with Eq. (3) to measure relative temperatures with a high accuracy (better than 10 mK) at temperatures as low as the zinc point, 419.527 °C. This gives a valuable relationship between the high-temperature end of current gas thermometry experiments and the temperatures that are assigned to the gold and silver points.

The total energy that is radiated by a black body over all wavelengths [the integrated form of Eq. (3)] is the well-known Stefan–Boltzmann law,

$dW/dt = \sigma T^4$. (8)

Here, $\sigma = 2\pi^5 k_B^4/(15 c^2 h^3) = 5.67 \times 10^{-8}$ W/(m^2 K^4) is the Stefan–Boltzmann constant. Measurements of the power radiated from a black body at 273.16 K give σ directly, and, since both Planck’s constant, h, and the velocity of light, c, are well known, also give the Boltzmann constant, $k_B$. Relative emitted powers also give temperature ratios. Total radiation measurements [Eq. (8)] have been carried out for black bodies in the range from −130 °C to +100 °C using an absorber at a low temperature (roughly 2 K) to measure the total radiant power that is emitted.
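The expression for σ can be checked numerically, and the fourth-power law turned into a temperature ratio, with a few lines of code. This is only a sketch: the constants are the exact values fixed in the 2019 revision of the SI (an assumption of this example, since they postdate the article), and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s
C = 299792458.0       # speed of light, m/s


def stefan_boltzmann_constant():
    """sigma = 2 pi^5 k_B^4 / (15 c^2 h^3), in W/(m^2 K^4)."""
    return 2.0 * math.pi**5 * K_B**4 / (15.0 * C**2 * H**3)


def temperature_from_power_ratio(power, power_ref, T_ref=273.16):
    """Temperature from the ratio of total radiated powers (same emitter)."""
    return T_ref * (power / power_ref) ** 0.25


sigma = stefan_boltzmann_constant()
print(sigma)
# Doubling the temperature raises the radiated power sixteenfold.
print(temperature_from_power_ratio(16.0, 1.0))
```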

3. Noise Thermometry

Noise thermometry is another, quite different, system that can be understood completely from a theoretical standpoint and that can be realized in practice. The magnitude of the mean-square thermal noise voltage (Johnson or Nyquist noise) that is generated by thermal fluctuations of electrons across a pure electrical resistance, R, is given by

$(V^2)_{\mathrm{avg}} = 4 k_B T R\,\Delta f$. (9)

This simple exact expression assumes that R is frequency independent, with the mean-square noise voltage depending on R and the bandwidth in hertz, Δf, over which the measurement is made. These measurements are difficult, since, to achieve the needed accuracy, consistent measurements must be made of the long-time average of the square of a voltage. In most instances, the results are obtained as the ratio of the mean-square voltage at T to that at a standard temperature (possibly 273.16 K), so the absolute values of the voltages need not be determined. Instrumental stability is very important, however. Noise temperatures have been determined from as low as 17 mK [17 × 10^−3 K, using SQUID (Superconducting Quantum Interference Device) technology] to over 1000 °C. While noise thermometry is difficult to carry out in a routine fashion, the measurements involved are so different from those for gas thermometry and optical pyrometry that the results are extremely useful.
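To give a feeling for the magnitudes involved, the sketch below evaluates Eq. (9) for an illustrative resistor and bandwidth and shows the ratio method for the temperature. The 1-kΩ resistance, 10-kHz bandwidth, and function names are assumptions made for this example only.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K


def mean_square_noise_voltage(T, resistance, bandwidth):
    """Eq. (9): mean-square Johnson noise voltage in V^2."""
    return 4.0 * K_B * T * resistance * bandwidth


def temperature_from_noise_ratio(v2, v2_ref, T_ref=273.16):
    """Temperature from the ratio of mean-square voltages (same R, bandwidth)."""
    return T_ref * v2 / v2_ref


v2_300 = mean_square_noise_voltage(300.0, 1.0e3, 1.0e4)
v2_ref = mean_square_noise_voltage(273.16, 1.0e3, 1.0e4)
print(math.sqrt(v2_300))                             # about 0.4 microvolt rms
print(temperature_from_noise_ratio(v2_300, v2_ref))  # recovers 300 K
```

The sub-microvolt signal at room temperature, falling further in proportion to T, is why averaging time and amplifier stability dominate the practical difficulty.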

4. Magnetic Thermometry

The magnetic susceptibility of an ideal paramagnetic salt (a dilute assembly of magnetic moments) obeys Curie’s law,

$\chi = C/T$, (10)

where C, the Curie constant, is proportional to the number of ionic magnetic moments and their magnitudes. The magnetic moments may be due either to electronic or to nuclear effects, with a difference in magnitude of roughly 1000. Interactions between the moments eventually cause the breakdown of Eq. (10) at temperatures of the order of millikelvins (or higher) for electronic paramagnetism, and at temperatures 1000 times smaller for nuclear systems.

Magnetic thermometry involving electron spins is not strictly primary thermometry, since the number of moments in the sample cannot be determined with any precision, and Curie’s law is obeyed only approximately for any real system. Magnetic interactions between the moments and complications due to the existence of excited states for the ions cause difficulties in almost every case. An ion can be chosen for which the excited states are not populated for a given experiment, with deviations due to magnetic interactions expected on theoretical grounds to give first-order corrections to Curie’s law which are of the form

$\chi = A + B/(T + \Delta + \delta/T)$. (11)

The parameter A is due to temperature-independent diamagnetism and paramagnetism, while Δ represents effects due to surrounding moments, and δ arises because of complex spin systems. In practice, each of these parameters must be determined empirically.

While a paramagnetic salt such as cerium magnesium nitrate [CMN, Ce2Mg3(NO3)12·24H2O] shows almost-pure Curie law behavior (Δ = 0.3 mK, δ = 0), the dilution of its moments and consequent small susceptibility make measurements difficult above 2 K, with a breakdown of Eq. (11) arising near 4 K due to the beginning occupation of a higher-energy state of the cerium ion. Even at low temperatures, controversy exists for CMN as to the meaning of the “nonideality” parameters, and the significance of different values of Δ for single-crystal and powdered samples. The use of SQUID technology rather than conventional ratio-transformer mutual inductance bridges allows measurements to be made with extremely small samples. Paramagnetic salts with larger susceptibilities, which are useful at higher temperatures, will have larger values for the nonideality parameters and will show deviations from even Eq. (11) at temperatures not far below 1 K.
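Once the parameters of Eq. (11) have been fixed by calibration, a measured susceptibility can be converted to a temperature by solving a quadratic in T. The sketch below writes the interaction constant in Eq. (11) as Delta (an assumed name for that parameter) and uses made-up values for A and B in arbitrary units; only the 0.3 mK figure comes from the text.

```python
import math


def susceptibility(T, A, B, Delta, delta):
    """Eq. (11): chi = A + B / (T + Delta + delta / T)."""
    return A + B / (T + Delta + delta / T)


def temperature_from_susceptibility(chi, A, B, Delta, delta):
    """Invert Eq. (11), taking the larger root of T^2 - b*T + delta = 0."""
    y = B / (chi - A)      # equals T + Delta + delta / T
    b = y - Delta
    return 0.5 * (b + math.sqrt(b * b - 4.0 * delta))


# Hypothetical calibration constants; Delta = 0.3 mK and delta = 0 as for CMN.
A, B, Delta, delta = 1.0e-6, 1.0e-3, 0.3e-3, 0.0
chi_50mK = susceptibility(0.050, A, B, Delta, delta)
print(temperature_from_susceptibility(chi_50mK, A, B, Delta, delta))
```

Taking the larger root assumes that δ is small compared with T squared, which is the regime in which Eq. (11) is meant to apply.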

5. Helium Melting-Pressure Thermometry

At temperatures below the lower limit of the ITS-90, 0.65 K, a new low-temperature scale is being proposed by the CCT based on the relation between the pressure and the temperature of melting 3He. Although the helium melting temperature–pressure relation used in the new scale is closely related to the Clausius–Clapeyron equation, its temperature cannot be calculated directly from this equation with sufficient accuracy. Instead, the relation is based on experimental measurements using magnetic thermometry, noise thermometry, and nuclear-orientation thermometry. It is thus not strictly a primary thermometer. The new scale is referred to as the “Provisional Low-Temperature Scale, 0.9 mK to 1 K: PLTS-2000.” The scale is defined by the melting pressure–temperature relation of 3He together with four fixed points: the minimum in the melting pressure of 3He, at a temperature of about 315 mK and a pressure of 2.93 MPa, and the A, A–B, and Néel transitions in 3He, at temperatures of about 2.44, 1.9, and 0.9 mK, respectively.

6. Nuclear Orientation Thermometry

At temperatures below 100 mK or so, the splitting of nuclear energy levels in a single crystal may become comparable with the characteristic thermal energy, $k_B T$. The γ-ray emissions from the oriented nuclei then may be anisotropic, and the anisotropies can be used to determine the relative populations of these levels. In the simplest possible two-level case, Eq. (4) can be applied to obtain the temperature directly from these nuclear orientation experiments. Such measurements have been made from 10 to roughly 50 mK for radioactive cobalt of mass 60 in a single-crystal nonradioactive cobalt lattice. These have confirmed SQUID noise measurements in the assignment of absolute temperatures to the superconducting transitions of the National Bureau of Standards SRM 768 device. The energy levels of the nuclei involved must be understood in detail from other measurements before these methods can be used, but, again, it is useful that two independent measurements can be used to assign thermodynamic temperatures in an extreme region of the temperature spectrum.
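The two-level case can be sketched numerically, assuming that the relation referred to as Eq. (4) is the usual Boltzmann population ratio, n_upper/n_lower = exp(−ΔE/k_B T). The level splitting used below is a hypothetical value chosen to correspond to 20 mK; it is not the splitting of any particular nucleus, and the function names are likewise illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K


def population_ratio(energy_splitting, T):
    """Boltzmann ratio n_upper / n_lower for a two-level system."""
    return math.exp(-energy_splitting / (K_B * T))


def temperature_from_populations(n_upper, n_lower, energy_splitting):
    """Temperature implied by the measured relative populations."""
    return energy_splitting / (K_B * math.log(n_lower / n_upper))


splitting = K_B * 0.020               # hypothetical splitting, k_B x 20 mK
ratio = population_ratio(splitting, 0.015)
print(ratio)                          # upper level is less populated
print(temperature_from_populations(ratio, 1.0, splitting))
```

The method loses sensitivity when k_B T greatly exceeds the splitting, since the population ratio then approaches unity and its logarithm approaches zero.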

7. Spectroscopic Methods

Optical spectroscopy can give information about the relative populations of excited states in a very high-temperature system, such as a plasma. This information then can be combined with the Boltzmann relation or direct theoretical calculations to obtain the temperature directly, as for nuclear orientation experiments. Again, the system must be understood theoretically, and possible complications due to interactions must be recognized. This use of spectroscopic data for primary thermometry represents the only possible means for determining extremely high temperatures.

C. The ITS-90 and Thermodynamic Temperatures

Each of the above primary thermometers has been used for at least a limited temperature region in the establishment of the ITS-90. At the lowest temperatures, the scale is based on a combination of results from magnetic, noise, and gas thermometry, with several gas thermometry experiments of most importance from liquid helium and/or liquid hydrogen temperatures to 0 °C. These agree well with total-radiation experiments at temperatures above 240 K. Gas thermometry results overlap pyrometry data for temperatures from 457 to 661 °C, and the comparison of an SPRT with pyrometry data provided the SPRT reference function for temperatures from 660 °C to the silver point. The correspondence between the ITS-90 and thermodynamic temperatures is believed to vary from ±0.5 mK at the lowest temperatures to a maximum of ±2 mK for any temperature below 0 °C. At higher temperatures, the possible difference rises from ±3 mK at the steam point to ±25 mK at 660 °C. The three highest temperature reference points (based on freezing points for silver, gold, and copper) are expected to be internally consistent to within the accuracy of standards pyrometry and to have potential differences from thermodynamic temperatures of ±0.04, ±0.05, and ±0.06 K, respectively, which reflect the uncertainties at the primary reference temperature of 660 °C. The most important characteristic of the ITS-90, however, is that it is believed to be smoothly related to T at all temperatures, with no abrupt differences in slope such as appear in Fig. 4, where, on the scale of this figure, T90 is identical to T.

IV. PRACTICAL THERMOMETRY

Many types of thermometers are in general use, and many more have been proposed. The following is a brief summary of the characteristics of the more common types of secondary thermometers, with no attempt made to be complete or comprehensive. The choice of a type of thermometer for a given application is somewhat arbitrary, with the deciding factors sometimes dictated by rigorous constraints but more often by personal preferences and/or prejudices. The accuracy or longevity of a thermometer calibration (a certificate or a table) should not be taken for granted when a temperature must be known within specified limits. Checks should be made, either in terms of a close-by fixed point (the freezing point of water and the triple point of gallium are particularly useful near room temperature) or by comparison with one, but preferably two or more, carefully handled “standard” thermometers. An electrical instrument should never be relied upon to give answers that are correct to all of the significant figures that are generated in the display or in the printout, especially if important conclusions depend on these numbers.

A. Liquid-in-Glass Thermometers

These represent the oldest, and still very common, practical thermometers, although they are increasingly being replaced by low-cost electronic devices using semiconductor elements (see below) as the temperature sensor. They come in many forms and qualities with a variety of liquids, although mercury is the choice for accurate applications. A very good thermometer for use up to 100 °C can be calibrated to 0.01 °C or better and will remain stable at this level for a considerable period of time. Care must be taken in the use of such a thermometer, since the readings depend on the depth of immersion of the thermometer. Thus, they are most useful for measurements on liquids where a surface is defined. The disadvantage of liquid-in-glass thermometers is that they must be calibrated manually, a tedious process, and must be read by eye, with no opportunities for automated data acquisition.

B. Resistance Thermometers

Resistance thermometers, or, more strictly, thermometers for which a voltage reading depends on an applied current, quite naturally fall into two categories. The first includes pure metals and metallic alloys that exhibit a positive temperature coefficient of resistance. Alloys with very small coefficients are useful for constructing the standard resistances that must play an important role in the practical use of resistance thermometers. The second category includes primarily semiconducting materials, for which the temperature coefficient of resistance is negative. It also includes devices, such as diodes, for which the forward voltage is a function of temperature.

General considerations for the measurement of electrical resistance, discussed in Section II.D, are not repeated here. The reproducibility of a practical resistance thermometer is an important characteristic that is not always directly related to the cost. Its calibration also may depend critically on the magnitude of the measuring current, so care should be taken to follow the manufacturer’s (or calibrator’s) recommendations. Resistance thermometers often are used both for the control of temperature (as in a thermostat) and for the measurement of the temperature. In general, this is not a recommended procedure, since a temperature-control sensor generally is located in the vicinity of the source of heat or refrigeration and will not give a true average reading for the volume that is being controlled.

FIGURE 6 The temperature dependences of the resistances for two metallic resistance thermometers.

1. Metallic Thermometers

The platinum resistance thermometer (PRT) is a typical metallic thermometer; the temperature dependence of the resistance that is shown in the double-logarithmic plot in Fig. 6 is characteristic of most metals. Near room temperature and above, the electrical resistance of a pure metal is associated primarily with lattice vibrations and is proportional to T, with the temperature coefficient of resistance approximately independent of temperature. Impurity effects tend to dominate at low temperatures, where the resistance approaches a constant value as T approaches zero. The ratio of the room-temperature resistance to its low-temperature value (the resistance ratio) is a measure of the purity of a metal, and the ratio of 1000 for the SPRT in Fig. 6 (the nominal ice point resistance is 25 Ω) is characteristic of a very pure metal.

Industrial PRTs are constructed from a “potted” wire or a thin film bonded to a ceramic substrate. These have a characteristic resistance very similar to that of an SPRT near room temperature but have a relatively high value for the low-temperature resistance due to the quality of the platinum and also to the strains induced in fabrication. Standard calibration tables exist for these commercial PRTs for temperatures from 77 K upward, with the objective of allowing routine substitution and replacement of thermometers as needed. One of the difficulties in using pure metallic thermometers at temperatures below 20 K is that the resistance is very sensitive to strains that are induced by shocks, so great care must be taken in handling a calibrated SPRT. Hence, a PRT that was not wound in a strain-free configuration could be expected to be relatively more unstable than the much more expensive SPRT. An additional characteristic of inexpensive PRTs is that they are primarily two-lead devices. For most applications, it is useful to attach a second pair of leads so that the resistance of the thermometer is well defined.

The temperature dependence of an alloy thermometer is also shown in Fig. 6. The primary component of this thermometer is rhodium metal, with a slight amount (0.5%) of iron added as an alloying agent. The localized magnetic moment of the iron scatters electrons very well at low temperatures and is responsible for the relatively high 10 K resistance for this thermometer, which has a nominal 100-Ω room-temperature resistance. The interaction of these iron moments with the electrons also results in an approximately linear temperature dependence for the low-temperature resistivity, in contrast with the SPRT, as shown in Fig. 7 for temperatures to 0.25 K. This thermometer is much more satisfactory than the PRT at low temperatures because of both its sensitivity and its stability. The wire is extremely stiff and difficult to fabricate into a thermometer element. As a result, the thermometers are very insensitive to shock, and aging and annealing effects are virtually nonexistent. Rhodium–iron thermometers, which are packaged similarly to SPRTs, now form the basis for most practical low-temperature standards thermometry. They are available also in other packages for use in practical measurements, possibly (as Fig. 6 indicates) for temperatures up to room temperature. A single thermometer that can be used with a reasonable sensitivity from 0.5 to 300 K is a very useful device.

FIGURE 7 The resistance–temperature relations for several low-temperature thermometers. [The GE and CG results are through the courtesy of Lake Shore Cryotronics, Inc.]

2. Semiconductors

Figure 7 gives, along with low-temperature results for a rhodium–iron thermometer, a double-logarithmic plot of the resistance–temperature relationships for a number of low-temperature thermometers which are constructed from semiconducting materials. This presentation does not include an R-vs-T relationship for another often-used semiconducting thermometer, the thermistor (see below), which would be similar to that for the carbon–glass (CG) thermometer, but for higher temperatures.

Commercial radio resistors were used as the first semiconducting low-temperature thermometers, with the most popular being, first, those manufactured by Allen–Bradley (A-B), and, later, those manufactured by Speer. The bonding of the electrical leads to the composite material in these resistors proved to be quite rugged, and although small (occasionally large) resistance shifts occurred on subsequent coolings to liquid helium temperatures, the calibrations remained stable as long as the thermometers were kept cold. The thermometric characteristics of these two brands of resistors have the common feature that the temperature coefficient of the resistance is a smooth and monotonic function of the temperature. The details of their temperature variation are seen to be quite different, however, with the A-B resistors being very sensitive, while the Speer resistors have a reasonable resistance even at the lowest temperatures. These resistors are still used for low-temperature measurements, although improvements in their composition have changed (and downgraded) their thermometry characteristics. The carbon–glass thermometer, which uses fine carbon filaments deposited in a spongy-glass matrix, also has a well-behaved resistance–temperature characteristic, as well as a high sensitivity. This thermometer suffers from lead-attachment problems and has instabilities (minor for many purposes) that make it unsuitable for standards-type measurements. All three of these thermometers have only moderate magnetoresistance, so they are useful for measurements in a magnetic field.

Germanium resistance thermometers consist of a small crystal of doped germanium onto which four leads (two current, two potential) are attached. These lead resistances are comparable with the sensor resistance and are similarly temperature dependent. This thermometer element is in a sealed jacket with a low pressure of exchange gas. Figure 7 shows the resistance–temperature characteristics for three of these resistors (labeled GE), which are intended for different temperature ranges. The minimum usable temperature in each case is defined as that at which the resistance approaches 10^5 Ω. The shapes of the calibration curves are quite similar, with, as a crude approximation, d ln R/d ln T ≈ −2. A detailed inspection of these relations reveals a complex behavior, with a nonmonotonic temperature dependence for dR/dT, so the generation of analytical expressions for the resistance–temperature characteristic is difficult.

Germanium resistance thermometers served as the basis for low-temperature standards thermometry for many years, until rhodium–iron thermometers were introduced. The major advantages of germanium resistance thermometers for experimental work are their relatively small size, high sensitivity, and good stability. While the higher-resistance thermometers can be used up to 77 K, they cannot be used at much higher temperatures because the temperature coefficient changes sign and is positive near room temperature. Their magnetoresistance is rather high and complex, and they are seldom used for measurements in large magnetic fields. For accurate work above roughly 30 K, dc and ac calibrations of these thermometers may differ significantly, dependent on the frequency, so the measurement method corresponding to the calibration must be used.

Thermistors are two-lead sintered metal–oxide devices of a generally small mass, much smaller than that of any of the above thermometers. This, combined with the high sensitivity, is their major attraction. The extreme sensitivity requires that a thermistor be chosen to work in a specific temperature range, since otherwise the resistance will be either too small or too large. They have been used at temperatures from 4.2 K (seldom) to 700 °C (special design). Their stability can be quite good, especially for the bead designs, when they are handled with care.

The forward voltage of semiconducting diodes also has a well-defined dependence on temperature, which has been used to produce thermometers that are small in size and dependable. Figure 8 gives the voltage–temperature relationships for silicon and gallium arsenide diode thermometers as obtained with a 10-µA measuring current. The gallium arsenide calibration is smoother than that for the silicon diode, with the knee in the silicon curve being rather sharp. At low temperatures, the sensitivity of these thermometers can be quite good (better than 1 mK), with an accuracy and reproducibility of 0.1 K or better. At higher temperatures, these limits should be increased by about an order of magnitude. Standard voltage–temperature relations for selected classes of these diodes allow interchange of off-the-shelf devices with anticipated low-temperature and high-temperature accuracies of 0.1 and 1 K, respectively.

FIGURE 8 The temperature dependences of the forward voltages for two commercial diode thermometers. [Courtesy of Lake Shore Cryotronics, Inc.]

C. Thermocouples

The existence of a temperature gradient in a conductor will cause a corresponding emf to be generated in this conductor which depends on the gradient (the thermoelectric effect). While this emf (or voltage) cannot be measured directly for a single conductor, the difference between the thermal emfs for two materials can be measured and can be used to measure temperatures, as in a thermocouple. When two wires of dissimilar materials are joined at each end and the ends are kept at different temperatures, a (thermoelectric) voltage will appear across a break in the circuit. This voltage will depend on the temperature difference and, also, on the difference between the thermoelectric powers of the two materials. The rate of change of this voltage with temperature is called the “Seebeck coefficient.”

The thermocouple which was used to define the high-temperature IPTS-68 interpolation relation (platinum–10% rhodium/platinum) gives the emf (E)-vs-temperature relation, labeled S in Fig. 9. Noble-metal thermocouples typically have a relatively low sensitivity (roughly 10 µV/K) and calibrations which may change with strain and annealing. These drawbacks are compensated by the usefulness of these thermocouples for work at very high temperatures. In time, these traditional high-temperature thermocouples may be replaced by gold–platinum and/or platinum–palladium thermocouples, which have similar sensitivities but are more reproducible. More sensitive (base-metal) thermocouples are available for lower-temperature use, and two of these also are shown in Fig. 9. The type K (K) thermocouple uses nickel–chromium-vs-nickel–aluminum alloys, and the type T (T) uses copper vs a copper–nickel alloy. While Seebeck coefficients generally are very small below roughly 20 K, relatively large values (10 µV/K or so) are observed for dilute alloys (less than 0.1%) of iron in gold; these thermocouples are useful even below 1 K.

FIGURE 9 The voltage–temperature characteristics for typical noble-metal (S) and base-metal (K) thermocouples.

Thermocouples are convenient, especially when emfs are measured with modern semiconductor instrumentation. The reference junction generally is chosen to be at the ice point (0 °C), where precautions must be taken if an ice bath is used. The junction must be electrically isolated from the bath to prevent leakage to ground, which could give false readings, and it must extend sufficiently far into the bath so that heat conduction along the wires to the junction is not important. Finally, the junction must be surrounded by melting ice (a mixture of ice and water), not cold water, since the density of water is maximum at 4 °C and temperature gradients exist in water on which ice is floating. The ice bath can be replaced by an electronic device for which the output voltage simulates an ice bath and is independent of ambient temperature.

Thermocouples are relatively sensitive to their environment, and their calibration can be affected in many, sometimes subtle, ways. Annealing, oxidation, and alloying effects can change the Seebeck coefficient, while extraneous emfs are introduced when strains and a temperature gradient coexist along a wire. Care clearly must be taken in experimental arrangements involving thermocouples, and the standard tables that exist for the various commonly used types of thermocouples must be applied judiciously. It is important to remember that the thermal emf produced by a thermocouple is developed along that part of the wire passing through a temperature gradient; it has nothing to do with the junction. Consequently, strains and inhomogeneities present in that part of the wire in the temperature gradient will lead to errors in the temperature measurement.
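The statement that the emf is generated along the wire, rather than at the junction, can be illustrated with a small numerical sketch. It sums S(T) dT along a single homogeneous wire for two quite different temperature profiles that share the same end temperatures. The linear Seebeck coefficient used here is a made-up function for illustration only, not data for any real alloy, and the function names are hypothetical.

```python
def seebeck(T):
    """Hypothetical Seebeck coefficient of a homogeneous wire, in V/K."""
    return 5.0e-6 + 1.0e-8 * T


def wire_emf(temperatures):
    """Sum S(T) * dT over successive points along the wire.

    The midpoint value of S is used on each segment, which is exact
    for the linear S(T) assumed above.
    """
    emf = 0.0
    for t0, t1 in zip(temperatures, temperatures[1:]):
        emf += seebeck(0.5 * (t0 + t1)) * (t1 - t0)
    return emf


# Two temperature profiles (K) along the wire with identical end points.
smooth_profile = [273.15, 300.0, 400.0, 500.0]
uneven_profile = [273.15, 280.0, 290.0, 450.0, 480.0, 500.0]

emf_smooth = wire_emf(smooth_profile)
emf_uneven = wire_emf(uneven_profile)
print(emf_smooth, emf_uneven)   # equal for a homogeneous wire
```

If S were made to depend on position as well as temperature (a strained or contaminated section lying in the gradient), the two profiles would no longer give the same emf, which is the source of the calibration errors described above.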

D. Optical Pyrometry

Some of the problems involved in optical pyrometry were addressed in an earlier section, with the emissivity of the source a major concern. Commercial pyrometers have been in use for many years and have been a part of the International Temperature Scales since 1927. Early optical pyrometers matched the brightness of the radiation source with that of a filament as the filament current was varied. The temperature of the source was then calibrated directly in terms of the current through the filament. Neutral density filters are used to extend the range of these pyrometers to higher temperatures. Considerable skill is required to use these “disappearing filament” pyrometers (the filament disappears in an image of the source) reproducibly, but they are used widely in industry.

The visual instruments have been replaced in standards and, also, in most practical applications by photoelectric pyrometers, in which a silicon diode detector or a photomultiplier tube replaces the eye as the detector. These instruments have a high sensitivity and can be used with interference filters to increase their accuracy [Eq. (3)]. A major concern in optical pyrometry is that real objects do not show ideal black-body radiation characteristics but have an emittance that differs from that of a black body in a manner that can be a function of the temperature, wavelength, and surface condition. Pyrometers that operate at two or more distinct wavelengths provide at least partial compensation for these effects.

A recent development in high-temperature optical pyrometry uses a fine sapphire fiber light pipe and photoelectric detection to obtain the temperature of a system that cannot be viewed directly. The end of the fiber may be encapsulated to form a black body (producing a self-contained thermometer) or the fiber may be used to view directly the object whose temperature is to be determined. Very sensitive semiconducting infrared detectors have made possible the use of total-radiation thermometers at and above room temperature for noncontact detection of temperature changes in processing operations and even, for instance, to determine the location of “heat leaks” in the insulation of a house. The slight excess temperature associated with certain tumors in medical applications has also been detected in this way.
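How a two-wavelength instrument compensates for an unknown emittance can be sketched with the Wien approximation to the black-body spectrum, which is an assumption of this example (the article's Eq. (3) is not used directly). If the emittance is the same at both wavelengths, it cancels in the signal ratio; the wavelengths, emittance, and function names below are illustrative choices.

```python
import math

C2 = 1.438776877e-2   # second radiation constant h*c/k_B, in m K


def wien_radiance(wavelength, T, emissivity=1.0):
    """Spectral radiance in the Wien approximation, up to a constant factor."""
    return emissivity * wavelength**-5 * math.exp(-C2 / (wavelength * T))


def ratio_temperature(signal_1, signal_2, wavelength_1, wavelength_2):
    """Temperature from the ratio of signals at two wavelengths (gray body)."""
    log_ratio = math.log(signal_1 / signal_2)
    return (C2 * (1.0 / wavelength_2 - 1.0 / wavelength_1)
            / (log_ratio - 5.0 * math.log(wavelength_2 / wavelength_1)))


lam_1, lam_2 = 0.65e-6, 0.90e-6       # wavelengths in metres
s1 = wien_radiance(lam_1, 1500.0, emissivity=0.3)
s2 = wien_radiance(lam_2, 1500.0, emissivity=0.3)
print(ratio_temperature(s1, s2, lam_1, lam_2))   # recovers 1500 K
```

The compensation is only partial in practice because real emittances generally differ between the two wavelengths, and that difference enters the ratio directly.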

E. Miscellaneous Thermometry

Many other thermometric systems are useful, some forspecific applications. The variation with temperature ofcertain quartz piezoelectric coefficients gives a thermome-ter with a frequency readout. Very sensitive gas thermome-ters can be made with pressure changes sensed by changesin the resonant frequency of tunnel-diode circuits. Glass–ceramic capacitance thermometers are unique in that theyhave no magnetic field dependence, so are useful for low-temperature measurements in large magnetic fields.

Superconducting technology using SQUIDs allows the detection of very small changes in magnetic flux and, hence, in the current flowing through a loop of wire. Major advantages are the high sensitivity and the capability of using small samples in, for instance, magnetic thermometry and the measurement of low voltages. They have, for example, been used with gold–iron thermocouples for high-precision temperature measurements below 1 K. SQUIDs are primarily low-temperature devices but have been applied to routine measurements at room temperature and above.

Vapor pressure thermometry, with judicious choice of working substance, allows a very high sensitivity, but only, except at liquid helium temperatures, in a narrow temperature region. Here, capacitive diaphragm gauges and other modern pressure-sensing devices replace the conventional mercury manometer and allow remote readout of the pressures involved.

SEE ALSO THE FOLLOWING ARTICLES

CRITICAL DATA IN PHYSICS AND CHEMISTRY • CRYOGENICS • HEAT TRANSFER • THERMAL ANALYSIS • THERMODYNAMICS • THERMOELECTRICITY • TIME AND FREQUENCY

Encyclopedia of Physical Science and Technology EN017F-10 August 2, 2001 17:20

Underwater Acoustics

William A. Kuperman
University of California, San Diego

I. Ocean Acoustic Environment
II. Physical Mechanisms
III. Sonar Equation
IV. Sound Propagation Models
V. Quantitative Description of Propagation
VI. Sonar Array Processing
VII. Active Sonar Processing
VIII. Appendix: Units

GLOSSARY

Active sonar A sonar which emits sound and receives its echo.

Beamforming Phasing an array to form a set of “look directions.”

Convergence zone propagation Spatially periodic (≈35–65 km) refocusing of sound from a shallow source producing zones of high intensity near the surface due to the upward refracting nature of the sound speed profile and the absence of bottom interaction.

Decibels Ten times the logarithm in base 10 of a ratio of intensities.

Deep scattering layer A layer in the water column populated by organisms that scatter sound, and which typically undergoes diurnal variations in depth.

Deep sound channel A sound channel occurring in deep water whose axis is at the minimum of the sound speed profile and in which propagation does not involve interaction with the ocean surface or bottom.

Matched field processing Beamforming by matching the data on an array with the solutions of the wave equation specific to the environment.

Passive sonar A sonar which only receives sound.

Propagation loss The ratio, in decibels, between the acoustic intensity at a field point and the intensity at a reference distance (typically 1 m) from the source.

Reverberation The scattered acoustic field from an active sonar source which acts as interference in the sonar system.

Sound speed profile The speed of sound as a function of depth.

Surface duct A sound channel whose upper boundary is the ocean surface, formed when there is a local sound speed profile minimum near the ocean surface.

Transmission loss The negative of propagation loss.

IT IS WELL established that sound waves, rather than electromagnetic waves, propagate long distances in the ocean. Hence, in the ocean as opposed to air or a vacuum, there is SONAR (Sound Navigation and Ranging) instead of radar, acoustic communication instead of radio, and acoustic imaging and tomography instead of microwave or optical imaging or X-ray tomography. Underwater acoustics is the science of sound in water (most commonly in the ocean) and encompasses not only the study of sound propagation, but also the masking of sound signals by interfering phenomena and the signal processing for extracting these signals from interference. This article will present the basic physics of ocean acoustics and then discuss applications. The decibel units used in underwater acoustics are described in the Appendix.

I. OCEAN ACOUSTIC ENVIRONMENT

The acoustic properties of the ocean, such as the paths along which sound from a localized source travels, are mainly dependent on the ocean sound speed structure, which in turn is dependent on the oceanographic environment. The combination of water column and bottom properties leads to a set of generic sound propagation paths descriptive of most propagation phenomena in the ocean.

A. Ocean Environment

Sound speed in the ocean water column is a function of temperature, salinity, and ambient pressure. Since the ambient pressure is a function of depth, it is customary to express the sound speed (c) in meters per second as an empirical function of temperature (T) in degrees centigrade, salinity (S) in parts per thousand, and depth (z) in meters, for example,

$$c = 1449.2 + 4.6T - 0.055T^2 + 0.00029T^3 + (1.34 - 0.01T)(S - 35) + 0.016z. \tag{1}$$
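Equation (1) is simple enough to evaluate directly. The following is a minimal sketch, not part of the original article; the function name `sound_speed` and the sample temperature, salinity, and depth values are illustrative assumptions.

```python
def sound_speed(T, S, z):
    """Sound speed c in m/s from the empirical formula of Eq. (1).

    T: temperature in degrees centigrade
    S: salinity in parts per thousand
    z: depth in meters
    """
    return (1449.2 + 4.6 * T - 0.055 * T**2 + 0.00029 * T**3
            + (1.34 - 0.01 * T) * (S - 35.0) + 0.016 * z)


if __name__ == "__main__":
    # Hypothetical isothermal water column (4 degrees, 35 ppt):
    # only the pressure term 0.016 z changes with depth.
    for depth in (0.0, 1000.0, 4000.0):
        print(depth, sound_speed(T=4.0, S=35.0, z=depth))
```

In an isothermal layer the only depth dependence left is the last term, which is why the sound speed increases with depth there.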

Figure 1 shows a typical set of sound speed profiles, indicating greatest variability near the surface. In a warmer season (or warmer part of the day, sometimes referred to as the “afternoon effect”), the temperature increases near the surface and hence the sound speed increases toward the sea surface. In nonpolar regions where mixing near the surface due to wind and wave activity is important, a mixed layer of almost constant temperature is often created. In this isothermal layer sound speed increases with depth because of the increasing ambient pressure, the last term in Eq. (1). This is the surface duct region. Below the mixed layer is the thermocline where the temperature and hence the sound speed decreases with depth. Below the thermocline, the temperature is constant and the sound speed increases because of increasing ambient pressure. Therefore, between the deep isothermal region and the mixed layer, there is a depth of minimum sound speed referred to as the axis of the deep sound channel. However, in polar regions, the water is coldest near the surface, so that the minimum sound speed is at the surface. Figure 2 is a contour display of the sound speed structure of the North and South Atlantic with the deep sound channel axis indicated by the heavy dashed line. Note that the deep sound channel becomes shallower toward the poles. Aside from sound speed effects, the ocean volume is absorptive and will cause attenuation that increases with acoustic frequency.

FIGURE 1 Generic sound speed profiles.

Shallower water such as that in continental shelf and slope regions is not deep enough for the depth-pressure term in Eq. (1) to be significant. Thus the winter profile tends to isovelocity simply because of mixing, whereas the summer profile has a higher sound speed near the surface due to heating; both are schematically represented in Fig. 3.

The sound speed structure regulates the interaction of sound with the boundaries. The ocean is bounded above by air, and the air/sea interface is a perfect reflector; however, it is often rough, causing sound to scatter in directions away from the “specular” reflecting angle. The ocean bottom is typically a complicated, rough, layered structure supporting elastic waves. Its geoacoustic properties are summarized by density, compressional and shear speed, and attenuation profiles. The two basic interfaces, air/sea and sea/bottom, can be thought of as the boundaries of an acoustic waveguide whose internal index of refraction is determined by the fundamental oceanographic parameters represented in the sound speed equation, Eq. (1).

FIGURE 2 Sound speed contours at 5 m/sec intervals taken from the North and South Atlantic along 30.50°W. Dashed line indicates axis of deep sound channel (from Northrup 1974).

B. Basic Acoustic Propagation Paths

Sound propagation in the ocean can be qualitatively broken down into three classes: very short range, deep water, and shallow water propagation.

1. Very Short Range Propagation

The amplitude of the field from a point source in free space falls off with range r as $r^{-1}$; this geometric loss is called spherical spreading. Most sources of interest in the deep ocean are nearer the surface than the bottom. Hence, the two main short range paths are the direct path and the surface reflected path. When these two paths interfere, they produce a spatial distribution of sound often referred to as a Lloyd mirror pattern, as shown in the inset of Fig. 4. Also, with reference to Fig. 4, note that transmission loss is a decibel measure of relative intensity (see Appendix), the latter being proportional to the square of the acoustic amplitude.

2. Long Range Propagation Paths

Figure 5 is a schematic of propagation paths in the ocean resulting from the sound speed profiles (indicated by the dashed line) described above in Fig. 1. These paths can be understood from Snell's law,

$$\frac{\cos\theta(z)}{c(z)} = \text{constant}, \tag{2}$$

which relates the ray angle θ(z), with respect to the horizontal, to the local sound speed c(z) at depth z. The equation requires that the higher the sound speed, the smaller the angle with the horizontal, meaning that sound bends away from regions of high sound speed; or said another way, sound bends toward regions of low sound speed. Therefore, paths 1, 2, and 3 are the simplest to explain since they are paths that oscillate about the local sound speed minima. For example, path 3, depicted by a ray leaving a source near the deep sound channel axis at a small horizontal angle, propagates in the deep sound channel. This path, in temperate latitudes where the sound speed minimum is far from the surface, permits propagation over distances of thousands of kilometers. Path 4, which is at slightly steeper angles and is usually excited by a near surface source, is convergence zone propagation, a spatially periodic (35–65 km) refocusing phenomenon producing zones of high intensity near the surface due to the upward refracting nature of the deep sound-speed profile. Regions in between these zones are referred to as shadow regions. Referring back to Fig. 1, there may be a depth in the deep isothermal layer at which the sound speed is the same as it is at the surface; this depth is called the critical depth and is the lower limit of the deep sound channel. A positive critical depth specifies that the environment supports long distance propagation without bottom interaction, whereas a negative critical depth specifies that the ocean bottom is the lower boundary of the deep sound channel. The bottom bounce path 5 is also a periodic phenomenon but with a shorter cycle distance and shorter propagation distance because of losses when sound is reflected from the ocean bottom.

FIGURE 3 Typical summer and winter shallow water sound speed profiles.

3. Shallow Water and Waveguide Propagation

In general, the ocean can be thought of as an acoustic waveguide; this waveguide physics is particularly evident in shallow water (inshore out to the continental slope, typically to depths of a few hundred meters). Snell's law applied to the summer profile in Fig. 3 produces rays that bend more toward the bottom than those of the winter profile, for which the rays tend to be straight. This implies two effects with respect to the ocean bottom: (1) For a given range, there are more bounces off the ocean bottom in the summer than in the winter; (2) the ray angles intercepting the bottom are steeper in the summer than in the winter. A qualitative understanding of the reflection properties of the ocean bottom should therefore be very revealing of sound propagation in summer versus winter. Basically, near-grazing incidence is much less lossy than larger, more vertical angles of incidence. Since summer propagation paths have more bounces, each of which is at steeper angles than those of winter paths, summer shallow water propagation is lossier than in winter. This result is tempered by rough winter surface conditions that generate large scattering losses at the higher frequencies.

FIGURE 4 The inset shows the geometry of the Lloyd mirror effect. The plots show a comparison of Lloyd mirror to spherical spreading. Transmission losses are plotted in decibels corresponding to losses of $10 \log r^2$ and $10 \log r^4$, respectively, as explained in Section I.C.

FIGURE 5 Schematic representation of various types of sound propagation in the ocean.

For simplicity, we consider an isovelocity waveguide bounded above by the air/water interface and below by a two-fluid interface. From Section II.C, we have perfect reflection with a 180-degree phase change at the surface, and for paths more horizontal than the bottom critical angle, there will also be perfect bottom reflection. Therefore, as schematically indicated in Fig. 6a, ray paths within a cone of $2\theta_c$ will propagate unattenuated down the waveguide. Because the upgoing and downgoing rays have equal amplitudes, preferred angles will exist such that perfect constructive interference can occur. These particular angles can be associated with the normal modes of the waveguide as formally derived from the wave equation in Section IV. However, it is instructive to understand the geometric origin of the waveguide modal structure. Figure 6b is a schematic of a ray reflected from the bottom and then the surface of a “Pekeris” waveguide (an environment with constant sound speeds and densities in the water column and fluid bottom, respectively). Consider a ray along the path ACDF and its wavefront, which is perpendicular to the ray. The two downgoing rays of equal amplitude, AC and DF, will constructively interfere if points B and E have a phase difference of an integral multiple of 360 degrees (and similarly for upgoing rays). There will be a discrete set of angles up to the critical angle for which this constructive interference takes place and, hence, for which sound propagates. This discrete set, in terms of wave physics, is called the normal modes of the waveguide and is further discussed in Section IV.D.

FIGURE 6 Ocean waveguide propagation. (a) Long distance propagation occurs within a cone of $2\theta_c$. (b) There is a discrete set of paths that reflect off the bottom and surface and constructively interfere. For the example shown, the condition for constructive interference is that the phase change along BCDE be a multiple of $2\pi$.

C. Geometric Spreading Loss

The energy per unit time emitted by a sound source flows through a larger area with increasing range. Intensity is the power flux through a unit area, that is, the energy flow per unit time through a unit area. The simplest example of geometric loss is spherical spreading for a point source in free space, where the area increases as $4\pi r^2$, with r the range from the point source. Spherical spreading therefore results in an intensity decay proportional to $r^{-2}$. Since intensity is proportional to the square of the pressure amplitude, the fluctuations in pressure induced by the sound, p, decay as $r^{-1}$. For range-independent ducted propagation, that is, where rays are refracted or reflected back toward the horizontal direction, there is no loss associated with the vertical dimension. In this case, the spreading surface is the area of a cylinder whose axis is in the vertical direction passing through the source, $2\pi r H$, where H is the depth of the duct (waveguide) and is constant. Geometric loss in the near-field Lloyd mirror regime requires consideration of interfering beams from direct and surface-reflected paths. To summarize, the geometric spreading laws for the pressure field (recall that intensity is proportional to the square of the pressure) are:

Spherical spreading loss: $p \propto r^{-1}$

Cylindrical spreading loss: $p \propto r^{-1/2}$

Lloyd mirror loss: $p \propto r^{-2}$.
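Because intensity goes as the square of the pressure, a pressure law $p \propto r^{-n}$ corresponds to a loss of $20\,n \log r$ in decibels. The sketch below tabulates the three laws under that reading; the function name, the dictionary keys, and the 1 m reference range are illustrative assumptions.

```python
import math

# Pressure exponents n in p ∝ r^(-n) for the three spreading laws listed above.
SPREADING_EXPONENT = {"spherical": 1.0, "cylindrical": 0.5, "lloyd_mirror": 2.0}


def spreading_loss_db(r, law="spherical", r_ref=1.0):
    """Geometric spreading loss in dB at range r relative to r_ref.

    Intensity is proportional to pressure squared, so the loss is
    10 log10((r / r_ref)^(2n)) = 20 n log10(r / r_ref).
    """
    n = SPREADING_EXPONENT[law]
    return 20.0 * n * math.log10(r / r_ref)


if __name__ == "__main__":
    for law in SPREADING_EXPONENT:
        print(law, spreading_loss_db(1000.0, law))
```

The spherical and Lloyd mirror cases reproduce the $10 \log r^2$ and $10 \log r^4$ curves quoted in the caption of Fig. 4.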

II. PHYSICAL MECHANISMS

The physical mechanisms associated with the generation, reception, attenuation, and scattering of sound in the ocean are discussed in this section.

A. Transducers

A transducer either converts some form of energy to sound (source) or converts sound energy to an electrical signal (receiver). In underwater acoustics, piezoelectric and magnetostrictive transducers are commonly used; the former connects electric polarization to mechanical strain and the latter connects magnetization of a ferromagnetic material to mechanical strain. In addition there are electrodynamic transducers, in which sound pressure oscillations move a current-carrying coil through a magnetic field causing a back electromotive force, and electrostatic transducers, in which charged electrodes moving in a sound field change the capacitance of the system. Explosions, airguns, electric discharges, and lasers are also used as wideband sources.

B. Volume Attenuation

Volume attenuation increases with frequency. In Fig. 5, the losses associated with path 3 include only volume attenuation and scattering, because this path does not involve boundary interactions. The volume scattering can be biological in origin or arise from interaction with internal wave activity in the vicinity of the upper part of the deep sound channel where paths are refracted before they would interact with the surface. Both of these effects are small at low frequencies. This same internal wave region is also on the lower boundary of the surface duct, allowing scattering out of the surface duct, thereby also constituting a loss mechanism for the surface duct. This mechanism also leaks sound into the deep sound channel, a region which without scattering would be a shadow zone for a surface duct source. This type of scattering from internal waves is also a source of fluctuation of the sound field.

FIGURE 7 Regions of different dominant processes of attenuation of sound in seawater [From Urick, R. J. (1979). Sound Propagation in the Sea. Washington: U.S. G.P.O.]. The attenuation is given in dB per kiloyard.

Attenuation is characterized by an exponential decay of the sound field. If $A_0$ is the rms amplitude of the sound field at unit distance from the source, then the attenuation of the sound field causes the amplitude to decay with distance along the path, r:

$$A = A_0 \exp(-\alpha r), \tag{3}$$

where the unit of α is nepers/distance. The attenuation coefficient can be expressed in decibels per unit distance by the conversion $\alpha' = 8.686\alpha$. The frequency dependence of attenuation can be roughly divided into four regimes as displayed in Fig. 7. In Region I, leakage out of the sound channel is believed to be the main cause of attenuation. The main mechanisms associated with Regions II and III are boric acid and magnesium sulfate chemical relaxation. Region IV is dominated by the shear and bulk viscosity associated with fresh water. A summary of the approximate frequency dependence (f in kHz) of attenuation (in units of dB/km) is given by

$$\alpha'\,(\mathrm{dB/km}) = 3.3 \times 10^{-3} + \frac{0.11 f^2}{1 + f^2} + \frac{43 f^2}{4100 + f^2} + 2.98 \times 10^{-4} f^2, \tag{4}$$

with the terms sequentially associated with Regions I–IV in Fig. 7.
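Equations (3) and (4) can be combined to estimate how much of the amplitude survives volume attenuation over a given path. This is a minimal sketch; the function names and the sample frequencies and range are illustrative assumptions, and only the coefficients come from Eq. (4).

```python
import math


def attenuation_db_per_km(f_khz):
    """Approximate attenuation alpha' in dB/km from Eq. (4); f_khz in kHz."""
    f2 = f_khz**2
    return (3.3e-3                      # Region I: leakage from the sound channel
            + 0.11 * f2 / (1.0 + f2)    # Region II: boric acid relaxation
            + 43.0 * f2 / (4100.0 + f2) # Region III: magnesium sulfate relaxation
            + 2.98e-4 * f2)             # Region IV: viscosity


def amplitude_ratio(f_khz, r_km):
    """A / A0 from Eq. (3), using alpha = alpha' / 8.686 in nepers/km."""
    alpha_nepers_per_km = attenuation_db_per_km(f_khz) / 8.686
    return math.exp(-alpha_nepers_per_km * r_km)


if __name__ == "__main__":
    for f in (0.1, 1.0, 10.0, 100.0):
        print(f, attenuation_db_per_km(f), amplitude_ratio(f, r_km=10.0))
```

This loss is in addition to geometric spreading, which is not included here.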

C. Bottom Loss

FIGURE 8 The reflection and transmission process. Grazing angles are defined relative to the horizontal. (a) A plane wave is incident on an interface separating two media with densities and sound speeds ρ, c. R(θ) and T(θ) are reflection and transmission coefficients. Snell's law is a statement that $k_\perp$, the horizontal component of the wave vector, is the same for all three waves. (b) Rayleigh reflection curve [Eq. (5)] as a function of the grazing angle [θ in (a)] indicating critical angle $\theta_c$. The dashed curve shows that if the second medium is lossy, there is less than perfect reflection below the critical angle. Note that for the nonlossy bottom, there is complete reflection below the critical angle, but with a phase change.

The structure of the ocean bottom affects those acoustic paths which interact with the ocean bottom. This bottom interaction is summarized by bottom reflectivity, the amplitude ratio of reflected and incident plane waves at the ocean-bottom interface as a function of grazing angle, θ (see Fig. 8a). For a simple bottom which can be represented by a semi-infinite half-space with constant sound speed $c_b$ and density $\rho_b$, the reflectivity is given by

$$R(\theta) = \frac{\rho_b k_{wz} - \rho_w k_{bz}}{\rho_b k_{wz} + \rho_w k_{bz}}, \tag{5}$$

with the subscript w denoting water; the wavenumbers are given by

$$k_{iz} = (\omega/c_i) \sin\theta_i \equiv k_i \sin\theta_i; \quad i = w, b. \tag{6}$$

The incident and transmitted grazing angles are related by Snell's law,

$$c_b \cos\theta_w = c_w \cos\theta_b, \tag{7}$$

and the incident grazing angle $\theta_w$ is also equal to the angle of the reflected plane wave.

For this simple water-bottom interface for which we take $c_b > c_w$, there exists a critical grazing angle $\theta_c$ below which there is perfect reflection,

$$\cos\theta_c = \frac{c_w}{c_b}. \tag{8}$$

For a lossy bottom, there is no perfect reflection, as also indicated in a typical reflection curve in Fig. 8b. These results are approximately frequency independent. However, for a layered bottom, the reflectivity has a complicated frequency dependence. It should be pointed out that if the density of the second medium vanishes, the reflectivity reduces to the pressure release case of R(θ) = −1.
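The behavior described by Eqs. (5) to (8) can be checked numerically. The sketch below is illustrative only: the function names, the default water and bottom properties, and the frequency are assumed values, not taken from the article. Below the critical angle the vertical wavenumber in the bottom becomes imaginary; the root chosen here is the one returned by `cmath.sqrt`, which affects only the sign of the reflection phase, not its magnitude.

```python
import cmath
import math


def rayleigh_reflection(theta_w_deg, c_w=1500.0, rho_w=1000.0,
                        c_b=1600.0, rho_b=1800.0, freq_hz=100.0):
    """Reflection coefficient R(theta) of Eq. (5) for a fluid half-space.

    theta_w_deg is the grazing angle in the water. The default sound speeds
    (m/s) and densities (kg/m^3) are assumed, illustrative values.
    """
    theta_w = math.radians(theta_w_deg)
    omega = 2.0 * math.pi * freq_hz
    k_w = omega / c_w
    k_b = omega / c_b
    # Snell's law, Eq. (7): the horizontal wavenumber is the same in both media.
    k_h = k_w * math.cos(theta_w)
    k_wz = k_w * math.sin(theta_w)            # Eq. (6), i = w
    k_bz = cmath.sqrt(k_b**2 - k_h**2)        # Eq. (6), i = b (imaginary below theta_c)
    return (rho_b * k_wz - rho_w * k_bz) / (rho_b * k_wz + rho_w * k_bz)


def critical_angle_deg(c_w=1500.0, c_b=1600.0):
    """Critical grazing angle from Eq. (8); requires c_b > c_w."""
    return math.degrees(math.acos(c_w / c_b))


if __name__ == "__main__":
    print("critical angle (deg):", critical_angle_deg())
    for angle in (5.0, 15.0, 25.0, 45.0, 90.0):
        R = rayleigh_reflection(angle)
        print(angle, abs(R), math.degrees(cmath.phase(R)))
```

With these assumed values the magnitude is unity for grazing angles below the critical angle and drops above it, and the result does not depend on the chosen frequency.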

D. Scattering and Reverberation

Scattering caused by rough boundaries or volume inhomogeneities is a mechanism for loss (attenuation), reverberant interference, and fluctuation. Attenuation from volume scattering is addressed in Section II.B. In most cases, it is the mean or coherent (or specular) part of the acoustic field which is of interest for a sonar or communications application, and scattering causes part of the acoustic field to be randomized. Rough surface scattering out of the “specular direction” can be thought of as an attenuation of the mean acoustic field, and typically increases with increasing frequency. A formula often used to describe reflectivity from a rough boundary is

$$R'(\theta) = R(\theta) \exp\left(-\frac{\Gamma^2}{2}\right), \tag{9}$$

where R(θ) is the reflection coefficient of the smooth interface and Γ is the Rayleigh roughness parameter defined as $\Gamma \equiv 2k\sigma \sin\theta$, where $k = 2\pi/\lambda$, λ is the acoustic wavelength, and σ is the rms roughness (height).
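The rough-boundary correction of Eq. (9) is a single exponential factor. A minimal sketch follows, assuming the Rayleigh roughness parameter is $2k\sigma\sin\theta$ with $k = 2\pi f/c$; the function name, the 1500 m/s sound speed, and the example roughness are illustrative assumptions.

```python
import math


def rough_reflection(R_smooth, theta_deg, sigma_m, freq_hz, c=1500.0):
    """Coherent reflection coefficient from a rough boundary, Eq. (9).

    R_smooth: reflection coefficient of the smooth interface
    theta_deg: grazing angle in degrees
    sigma_m: rms roughness height in meters
    c: sound speed in m/s (assumed value)
    """
    k = 2.0 * math.pi * freq_hz / c                        # k = 2 pi / lambda
    gamma = 2.0 * k * sigma_m * math.sin(math.radians(theta_deg))
    return R_smooth * math.exp(-gamma**2 / 2.0)


if __name__ == "__main__":
    # Pressure-release sea surface (R = -1) with a hypothetical 0.5 m rms roughness.
    for f in (100.0, 500.0, 1000.0, 2000.0):
        print(f, rough_reflection(-1.0, theta_deg=10.0, sigma_m=0.5, freq_hz=f))
```

The coherent reflection weakens as either the frequency or the grazing angle increases, consistent with the statement that the loss typically increases with frequency.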

The scattered field is often referred to as reverberation. Surface, bottom, or volume scattering strength, $S_{S,B,V}$, is a simple parameterization of the production of reverberation and is defined as the ratio in decibels of the sound intensity scattered by a unit surface area or volume referenced to a unit distance, $I_{scat}$, to the incident plane wave intensity, $I_{inc}$,

$$S_{S,B,V} = 10 \log \frac{I_{scat}}{I_{inc}}. \tag{10}$$

The Chapman–Harris curves predict the ocean surface scattering strength in the 400–6400 Hz region,

$$S_S = 3.3\beta \log\frac{\theta}{30} - 42.4 \log\beta + 2.6; \qquad \beta = 107\,(w f^{1/3})^{-0.58}, \tag{11}$$

where θ is the grazing angle in degrees, w the wind speed in m/sec, and f the frequency in Hz.
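Equation (11) can be evaluated as written. The following sketch does so; the function name and the sample wind speed, frequency, and angles are illustrative assumptions, and the logarithms are taken to base 10, as is usual for decibel quantities.

```python
import math


def chapman_harris_db(theta_deg, wind_ms, freq_hz):
    """Surface scattering strength S_S in dB from Eq. (11).

    theta_deg: grazing angle in degrees
    wind_ms: wind speed in m/sec
    freq_hz: frequency in Hz (the curves are stated for 400-6400 Hz)
    """
    beta = 107.0 * (wind_ms * freq_hz ** (1.0 / 3.0)) ** (-0.58)
    return (3.3 * beta * math.log10(theta_deg / 30.0)
            - 42.4 * math.log10(beta) + 2.6)


if __name__ == "__main__":
    for theta in (10.0, 30.0, 60.0):
        print(theta, chapman_harris_db(theta, wind_ms=10.0, freq_hz=1000.0))
```

At a grazing angle of 30 degrees the first term vanishes, so the value there depends only on β.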

The simple characterization of bottom backscattering strength utilizes Lambert's rule for diffuse scattering,

$$S_B = A + 10 \log \sin^2\theta, \tag{12}$$

where the first term is determined empirically. Under the assumption that all incident energy is scattered into the water column with no transmission into the bottom, A is −5 dB. Typical realistic values for A which have been measured are −17 dB for large basalt Mid-Atlantic Ridge cliffs and −27 dB for sediment ponds.

Volume scattering strength is typically reduced to a surface scattering strength by taking $S_V$ as an average volume scattering strength within some layer at a particular depth; then the corresponding surface scattering strength is

$$S_S = S_V + 10 \log H, \tag{13}$$

where H is the layer thickness. The column or integrated scattering strength is defined as the case for which H is the total water depth.
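Equations (12) and (13) are both one-line decibel relations. A minimal sketch follows; the function names, the example volume scattering strength of −80 dB, and the 100 m layer are illustrative assumptions, while the values of A are those quoted in the text.

```python
import math


def lambert_backscatter_db(theta_deg, A_db=-27.0):
    """Bottom backscattering strength S_B from Lambert's rule, Eq. (12).

    A_db defaults to the sediment-pond value quoted in the text.
    """
    return A_db + 10.0 * math.log10(math.sin(math.radians(theta_deg)) ** 2)


def layer_scattering_strength_db(S_v_db, H_m):
    """Equivalent surface scattering strength of a layer, Eq. (13).

    S_v_db: average volume scattering strength in the layer (dB)
    H_m: layer thickness in meters
    """
    return S_v_db + 10.0 * math.log10(H_m)


if __name__ == "__main__":
    for A in (-5.0, -17.0, -27.0):
        print(A, lambert_backscatter_db(30.0, A_db=A))
    # Hypothetical layer: S_V = -80 dB over a thickness of 100 m.
    print(layer_scattering_strength_db(-80.0, 100.0))
```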

Volume scattering usually decreases with depth (about 5 dB per 300 m) with the exception of the deep scattering layer. For frequencies less than 10 kHz, fish with air-filled swim bladders are the main scatterers. Above 20 kHz, zooplankton or smaller animals that feed upon phytoplankton and the associated biological chain are the scatterers. The deep scattering layer (DSL) is deeper in the day than in the night, changing most rapidly during sunset and sunrise. This layer produces a strong scattering increase of 5–15 dB within 100 m of the surface at night and virtually no scattering in the daytime at the surface since it migrates down to hundreds of meters. Since higher pressure compresses the fish swim bladder, the backscattering acoustic resonance tends to be at a higher frequency during the day when the DSL migrates to greater depths. Examples of day and night scattering strengths are shown in Fig. 9.

Finally, near-surface bubbles and bubble clouds can be thought of as either volume or surface scattering mechanisms acting in concert with the rough surface. Bubbles have resonances (typically greater than 10 kHz) and at these resonances, scattering is strongly enhanced. Bubble clouds have collective properties; among these properties is that a bubbly mixture, as specified by its void fraction (total bubble gas volume divided by water volume), has a considerably lower sound speed than water.

E. Ambient Noise

FIGURE 9 Day and night scattering strength measurements using an explosive source as a function of frequency [from Chapman and Marshall (1966)]. The spectra measured at various times after the explosion are labeled with the depth of the nearest scatterer that could have contributed to the reverberation. The ordinate corresponds to $S_V$ in Eq. (13). [From Chapman, R. P. and Harris, H. H. (1962). “Surface backscattering strengths measured with explosive sound sources,” J. Acoust. Soc. Am. 34, 1592–1597.]

There are essentially two types of ocean acoustic noise: manmade and natural. Generally, shipping is the most important source of manmade noise, though noise from offshore oil rigs is becoming more and more prevalent. Typically, natural noise dominates at low frequencies (below 10 Hz) and high frequencies (above a few hundred Hz). Shipping fills in the region between 10 and a few hundred Hz. A summary of the spectrum of noise is shown in Fig. 10. The higher frequency noise is usually parameterized according to sea state (also Beaufort number) and/or wind. Table I summarizes the description of sea state.

The sound speed profile affects the vertical and angular distribution of noise in the deep ocean. When there is a positive critical depth (see Section I.B), sound from surface sources can travel long distances without interacting with the ocean bottom, but a receiver below this critical depth should sense less surface noise because propagation involves interaction with lossy boundaries, surface and/or bottom. This is illustrated in Fig. 11, which shows a deep water environment with measured ambient noise. Figure 12 is an example of vertical directivity of noise which also follows the propagation physics discussed above. The shallower depth is at the axis of the deep sound channel while the other is at the critical depth. The pattern is narrower at the critical depth where the sound paths tend to be horizontal since the rays are turning around at the lower boundary of the deep sound channel.

FIGURE 10 Composite of ambient noise spectra [From Wenz, G. M. (1962). “Acoustic ambient noise in the ocean: Spectra and sources,” J. Acoust. Soc. Am. 34, 1936–1956].

In a range-independent ocean, Snell's law predicts a horizontal noise notch at depths where the speed of sound is less than the near-surface sound speed. Returning to Eq. (2), and reading off the sound speeds from Fig. 11 at the surface (c = 1530 m/sec) and, say, 300 m (1500 m/sec), a horizontal ray (θ = 0) launched from the ocean surface would have an angle with respect to the horizontal of about 11° at 300 m depth. All other rays would arrive with greater vertical angles. Hence we expect this horizontal notch. However, the horizontal notch is often not seen at shipping noise frequencies. That is because shipping tends to be concentrated in continental shelf regions, and propagation down a continental slope converts high-angle rays to lower angles at each bounce. There are also deep sound channel shoaling effects that result in the same trend in angle conversion.
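The noise-notch angle quoted above follows directly from Snell's law, Eq. (2). The sketch below reproduces the arithmetic; the function name and the convention of returning `None` when a ray cannot reach the requested sound speed are illustrative choices.

```python
import math


def ray_angle_deg(c_start, theta_start_deg, c_here):
    """Grazing angle (degrees) of a ray where the sound speed is c_here.

    Uses Snell's law, cos(theta) / c = constant, Eq. (2). Returns None if the
    ray turns around before reaching a region with sound speed c_here.
    """
    cos_here = math.cos(math.radians(theta_start_deg)) * c_here / c_start
    if cos_here > 1.0:
        return None
    return math.degrees(math.acos(cos_here))


if __name__ == "__main__":
    # Horizontal ray at the surface (1530 m/sec) observed at 300 m (1500 m/sec).
    print(ray_angle_deg(1530.0, 0.0, 1500.0))
```

Since a horizontally launched surface ray already arrives at this angle, and steeper launch angles arrive steeper still, no surface-generated ray arrives within about 11° of the horizontal at that depth.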

III. SONAR EQUATION

A major application of underwater acoustics is sonar system technology. The performance of a sonar is often approximately described simply in terms of the sonar equation. The methodology of the sonar equation is analogous to an accounting procedure involving acoustic signal, interference, and system characteristics.

A. Passive Sonar Equation

TABLE I Descriptions of the Ocean Sea Surface

| Sea criteria | Beaufort scale | Wind speed range, knots (m/s) | Wind speed mean, knots (m/s) | 12-hr wind: wave height (a,b), ft (m) | Fully arisen sea: wave height (a,b), ft (m) | Fully arisen sea: duration (b,c), hr | Fully arisen sea: fetch (b,c), naut. miles (km) | Sea state scale |
|---|---|---|---|---|---|---|---|---|
| Mirrorlike | 0 | <1 (<0.5) | | | | | | 0 |
| Ripples | 1 | 1–3 (0.5–1.7) | 2 (1.1) | | | | | 1/2 |
| Small wavelets | 2 | 4–6 (1.8–3.3) | 5 (2.5) | <1 (<0.30) | <1 (<0.30) | | | 1 |
| Large wavelets, scattered whitecaps | 3 | 7–10 (3.4–5.4) | 8-1/2 (4.4) | 1–2 (0.30–0.61) | 1–2 (0.30–0.61) | <2.5 | <10 (<19) | 2 |
| Small waves, frequent whitecaps | 4 | 11–16 (5.5–8.4) | 13-1/2 (6.9) | 2–5 (0.61–1.5) | 2–6 (0.61–1.8) | 2.5–6.5 | 10–40 (19–74) | 3 |
| Moderate waves, many whitecaps | 5 | 17–21 (8.5–11.1) | 19 (9.8) | 5–8 (1.5–2.4) | 6–10 (1.8–3.0) | 6.5–11 | 40–100 (74–185) | 4 |
| Large waves, whitecaps everywhere, spray | 6 | 22–27 (11.2–14.1) | 24-1/2 (12.6) | 8–12 (2.4–3.7) | 10–17 (3.0–5.2) | 11–18 | 100–200 (185–370) | 5 |
| Heaped-up sea, blown spray, streaks | 7 | 28–33 (14.2–17.2) | 30-1/2 (15.7) | 12–17 (3.7–5.2) | 17–26 (5.2–7.9) | 18–29 | 200–400 (370–740) | 6 |
| Moderately high, long waves, spindrift | 8 | 34–40 (17.3–20.8) | 37 (19.0) | 17–24 (5.2–7.3) | 26–39 (7.9–11.9) | 29–42 | 400–700 (740–1300) | 7 |

a The average height of the highest one-third of the waves (significant wave height).
b Estimated from data given in U.S. Hydrographic Office (Washington, DC) publications HO 604 (1951) and HO 603 (1955).
c The minimum fetch and duration of the wind needed to generate a fully arisen sea.
Note. Approximate relation between scales of wind speed, wave height, and sea state [From Wenz, G. M. (1962). “Acoustic ambient noise in the ocean: Spectra and sources,” J. Acoust. Soc. Am. 34, 1936–1956].

A passive sonar system uses the radiated sound from a target to detect and locate the target. A radiating object of source level SL (all units are in decibels) is received at a hydrophone of a sonar system at a lower signal level S because of the transmission loss TL it suffers (e.g., cylindrical spreading plus attenuation or a TL computed from one of the propagation models of Section IV),

$$S = \mathrm{SL} - \mathrm{TL}. \tag{14}$$

The noise, N, at a single hydrophone is subtracted from Eq. (14) to obtain the signal-to-noise ratio at a single hydrophone,

SNR = SL − TL − N . (15)

Typically, a sonar system consists of an array or antenna of hydrophones which provides signal-to-noise enhancement through a beamforming process (see Section VI). This process is quantified in decibels by the array gain AG (see Section VI.B) that is added to the single-hydrophone SNR to give the SNR at the output of the beamformer,

SNR_BF = SL − TL − N + AG. (16)

Because detection involves additional factors, including sonar operator ability, it is necessary to specify a detection threshold, DT, the level that SNR_BF must reach for there to be a 50% (by convention) probability of detection. The difference between these two quantities is called signal excess (SE),

SE = SL − TL − N + AG − DT. (17)

This decibel bookkeeping leads to an important sonar engineering descriptor called the figure of merit, FOM, which is the transmission loss that gives a zero signal excess,

FOM = SL − N + AG − DT. (18)

The FOM encompasses the various parameters a sonar engineer must deal with: expected source level, the noise environment, array gain, and the detection threshold. Conversely, since the FOM is a transmission loss, one can use the output of a propagation model (or, if appropriate, a simple geometric loss plus attenuation) to estimate the maximum range at which a 50% probability of detection can be expected. This range changes with oceanographic conditions and is often referred to as the “range of the day” in navy sonar applications.
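The bookkeeping of Eqs. (17) and (18) can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original treatment: the function names, the numerical values, and the simple loss model (spherical spreading plus a linear attenuation term) are assumptions chosen only to show how an FOM is converted into a detection range.

```python
import math

def signal_excess(sl, tl, n, ag, dt):
    """Eq. (17): SE = SL - TL - N + AG - DT (all quantities in dB)."""
    return sl - tl - n + ag - dt

def figure_of_merit(sl, n, ag, dt):
    """Eq. (18): the transmission loss that gives zero signal excess."""
    return sl - n + ag - dt

def simple_tl(range_m, alpha_db_per_km):
    """Hypothetical loss model: spherical spreading plus linear attenuation."""
    return 20.0 * math.log10(range_m) + alpha_db_per_km * range_m / 1000.0

def detection_range(fom, alpha_db_per_km, r_lo=1.0, r_hi=1.0e7):
    """Range (m) at which simple_tl equals the FOM, found by bisection."""
    for _ in range(200):
        r_mid = math.sqrt(r_lo * r_hi)  # bisect in log-range
        if simple_tl(r_mid, alpha_db_per_km) < fom:
            r_lo = r_mid
        else:
            r_hi = r_mid
    return math.sqrt(r_lo * r_hi)

if __name__ == "__main__":
    # Illustrative numbers only (assumed, not taken from the text).
    fom = figure_of_merit(sl=140.0, n=70.0, ag=15.0, dt=5.0)
    r50 = detection_range(fom, alpha_db_per_km=0.06)
    print(f"FOM = {fom:.1f} dB, 50% detection range ~ {r50 / 1000.0:.1f} km")
```

With a propagation model from Section IV substituted for `simple_tl`, the same root search would give a range estimate for the actual environment.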


FIGURE 11 Noise in the deep ocean. (a) Sound speed profile and (b) noise level as a function of depth in the Pacific [From Morris, G. B. (1978). “Depth dependence of ambient noise in the Northeastern Pacific Ocean,” J. Acoust. Soc. Am. 64, 581–590].

B. Active Sonar Equation

A monostatic active sonar transmits a pulse to a target, and its echo is detected at a receiver colocated with the transmitter. A bistatic active sonar has the receiver in a different location than the transmitter. The main differences between the passive and active cases are the addition of a target strength term, TS; the fact that reverberation, and hence reverberation level, RL, is usually the dominant source of interference as opposed to noise; and the fact that the transmission loss is over two paths: transmitter to target and target to receiver. In the monostatic case, the transmission loss is 2TL, where TL is the one-way transmission loss, and in the bistatic case, the transmission loss is the sum (in dB) over paths from the transmitter to the target and the target to the receiver, TL1 + TL2. The concept of the detection threshold is useful for both passive and active sonars. Hence, for signal excess, we have

SE = SL − TL1 + TS − TL2 − (RL + N) + AG − DT. (19)

The corresponding FOM for an active system is defined for the maximum allowable two-way transmission loss with TS = 0 dB.

FIGURE 12 The vertical directionality of noise at the axis of the deep sound channel and at the critical depth in the Pacific [From Anderson, V. C. (1979). “Variations of the vertical directivity of noise with depth in the North Pacific,” J. Acoust. Soc. Am. 66, 1446–1452].
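As a hedged sketch of Eq. (19), the following Python function evaluates the active signal excess. Treating the interference term (RL + N) as a power sum of two decibel levels is an assumption of this example, as are the function names; the monostatic case is obtained by setting both path losses to the same one-way TL.

```python
import math

def db_power_sum(*levels_db):
    """Combine incoherent levels given in dB by adding their powers."""
    return 10.0 * math.log10(sum(10.0 ** (x / 10.0) for x in levels_db))

def active_signal_excess(sl, tl1, tl2, ts, rl, n, ag, dt):
    """Eq. (19), with the interference (RL + N) taken as a power sum (assumed)."""
    interference = db_power_sum(rl, n)
    return sl - tl1 + ts - tl2 - interference + ag - dt

def monostatic_signal_excess(sl, tl, ts, rl, n, ag, dt):
    """Monostatic case: the two-way loss is 2 TL."""
    return active_signal_excess(sl, tl, tl, ts, rl, n, ag, dt)
```

When the reverberation level is far below the noise level the result reduces to the noise-limited form SL − 2TL + TS − N + AG − DT, and when it is far above, to the reverberation-limited form.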

IV. SOUND PROPAGATION MODELS

The wave equation describing sound propagation is derived from the equations of hydrodynamics, and its coefficients and boundary conditions are descriptive of the ocean environment. There are essentially four types of models (computer solutions to the wave equation) to describe sound propagation in the sea:

1. Ray theory
2. The spectral method or fast field program (FFP)
3. Normal mode (NM)
4. Parabolic equation (PE)

All of these models allow for the fact that the ocean environment varies with depth. A model that also takes into account horizontal variations in the environment (i.e., sloping bottom or spatially variable oceanography) is termed range dependent. For high frequencies (a few kilohertz or above), ray theory is the most practical. The other three model types are more applicable and usable at lower frequencies (below a kilohertz). The models discussed here are essentially two-dimensional models, since the index of refraction has much stronger dependence on depth than on horizontal distance. Nevertheless, bottom topography and strong ocean features can cause horizontal refraction (out of the range-depth plane). Ray models are most easily extendable to include this added complexity. Full three-dimensional wave models are extremely computationally intensive. A compromise that often works for “weak” three-dimensional problems is the “N × 2D” approximation that combines two-dimensional solutions along radials to produce a three-dimensional solution.

A. The Wave Equation and Boundary Conditions

The wave equation for pressure, $p$, in cylindrical coordinates, with the range coordinates denoted by $\mathbf{r} = (x, y)$ and the depth coordinate denoted by $z$ (taken positive downward), for a source-free region is

$$\nabla^2 p(\mathbf{r}, z, t) - \frac{1}{c^2(\mathbf{r}, z)}\,\frac{\partial^2 p(\mathbf{r}, z, t)}{\partial t^2} = 0, \tag{20}$$

where $c(\mathbf{r}, z)$ is the sound speed in the wave-propagating medium.

It is convenient to solve Eq. (20) in the frequency domain by assuming a solution with a frequency dependence of $\exp(-i\omega t)$ to obtain the Helmholtz equation ($K \equiv \omega/c$),

$$\nabla^2 p(\mathbf{r}, z) + K^2 p(\mathbf{r}, z) = 0, \tag{21}$$

with

$$K^2(\mathbf{r}, z) = \frac{\omega^2}{c^2(\mathbf{r}, z)}. \tag{22}$$

The range-dependent environment manifests itself as the coefficient $K^2(\mathbf{r}, z)$ of the partial differential equation for the appropriate sound speed profile. The range-dependent bottom type and topography appear as boundary conditions. In underwater acoustics, both fluid and elastic (shear-supporting sediments and bottom strata) media are of interest. For simplicity we only consider fluid media below.

The most common plane interface boundary conditions encountered in underwater acoustics are the pressure release condition at the ocean surface,

$$p = 0, \tag{23}$$

and, at the ocean-bottom interface, the continuity of pressure

$$p_1 = p_2, \tag{24}$$

and vertical particle velocity

$$\frac{1}{\rho_1}\frac{\partial p_1}{\partial z} = \frac{1}{\rho_2}\frac{\partial p_2}{\partial z}, \tag{25}$$

where the $\rho_i$’s are the densities of the two media. These latter boundary conditions applied to the plane wave fields in Fig. 8a yield the Rayleigh reflection coefficient given by Eq. (1).

The Helmholtz equation for an acoustic field from a point source is

$$\nabla^2 G(\mathbf{r}, z) + K^2(\mathbf{r}, z)\,G(\mathbf{r}, z) = -\delta^2(\mathbf{r} - \mathbf{r}_s)\,\delta(z - z_s), \tag{26}$$

where the subscript “s” denotes the source coordinates. The acoustic field from a point source, $G(\mathbf{r})$, is obtained either by solving the boundary value problem of Eq. (26) (spectral method or normal modes) or by approximating Eq. (21) by an initial value problem (ray theory, parabolic equation).

B. Ray Theory

Ray theory is a geometrical, high-frequency approximate solution to Eq. (21) of the form

$$G(\mathbf{R}) = A(\mathbf{R}) \exp[i S(\mathbf{R})], \tag{27}$$

where the exponential term allows for rapid variations as a function of range and $A(\mathbf{R})$ is a more slowly varying “envelope” which incorporates both geometrical spreading and loss mechanisms. The geometrical approximation is that the amplitude varies slowly with range (i.e., $(1/A)\nabla^2 A \ll K^2$) so that Eq. (21) yields the eikonal equation

$$(\nabla S)^2 = K^2. \tag{28}$$

The ray trajectories are perpendicular to surfaces of constant phase (wavefronts), $S$, and may be expressed mathematically as follows:

$$\frac{d}{dl}\left[K \frac{d\mathbf{R}}{dl}\right] = \nabla K, \tag{29}$$

where $l$ is the arc length along the direction of the ray, and $\mathbf{R}$ is the displacement vector. The direction of average flux (energy) follows that of the trajectories, and the amplitude of the field at any point can be obtained from the density of rays.

The ray theory method is computationally rapid and extends to range-dependent problems. Furthermore, the ray traces give a physical picture of the acoustic paths. It is helpful in describing how sound redistributes itself when propagating long distances over paths that include shallow and deep environments and/or mid-latitude to polar regions. The disadvantage of conventional ray theory is that it does not include diffraction, including effects that describe the low-frequency dependence (“degree of trapping”) of ducted propagation.
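For a sound speed that depends on depth only, Eq. (29) reduces to three ordinary differential equations in arc length: dr/dl = cos θ, dz/dl = sin θ, and dθ/dl = −(1/c)(dc/dz) cos θ, where θ is the grazing angle measured positive downward. The following Python sketch integrates them with a fourth-order Runge-Kutta step. The function name, the linear profile used in the usage note, and the omission of surface and bottom reflections are assumptions made for illustration.

```python
import math

def trace_ray(c, dcdz, z0, theta0_deg, ds=10.0, n_steps=2000):
    """Trace one ray in a depth-dependent profile c(z); returns (r, z, theta) tuples.

    z is positive downward and theta is the grazing angle (rad), positive downward.
    Boundaries are not modeled: the ray is simply integrated for n_steps * ds meters.
    """
    def rhs(state):
        _, z, th = state
        return (math.cos(th), math.sin(th), -dcdz(z) / c(z) * math.cos(th))

    def shifted(state, k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    state = (0.0, z0, math.radians(theta0_deg))
    path = [state]
    for _ in range(n_steps):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, 0.5 * ds))
        k3 = rhs(shifted(state, k2, 0.5 * ds))
        k4 = rhs(shifted(state, k3, ds))
        state = tuple(s + ds / 6.0 * (a + 2.0 * b + 2.0 * cc + d)
                      for s, a, b, cc, d in zip(state, k1, k2, k3, k4))
        path.append(state)
    return path
```

A quick consistency check is Snell's law: along each ray, cos θ / c should stay constant. For an assumed linear profile c(z) = 1500 + 0.017 z (m/s), a ray launched downward turns at the depth where c equals c(z0)/cos θ0 and then refracts back toward the surface.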

C. Wavenumber Representation or Spectral Solution

The wave equation can be solved efficiently with spectral methods when the ocean environment does not vary with range. The term “Fast Field Program (FFP)” had been used because the spectral methods became practical with the advent of the fast Fourier transform (FFT). Assume a solution of Eq. (26) of the form

$$G(\mathbf{r}, z) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d^2k\; g(\mathbf{k}, z, z_s) \exp[i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}_s)], \tag{30}$$

which then leads to the equation for the depth-dependent Green’s function, $g(k, z, z_s)$,

$$\frac{d^2 g}{dz^2} + \left(K^2(z) - k^2\right) g = -\frac{1}{2\pi}\,\delta(z - z_s). \tag{31}$$

Furthermore, we assume azimuthal symmetry, $kr > 2\pi$, and $r_s = 0$ so that Eq. (30) reduces to

$$G(r, z) = \frac{\exp(-i\pi/4)}{(2\pi r)^{1/2}} \int_{-\infty}^{\infty} dk\; k^{1/2}\, g(k, z, z_s) \exp(ikr). \tag{32}$$

This integral is then evaluated using the FFT algorithm. Although the method was initially labeled “fast field,” it is fairly slow because of the time required to calculate the Green’s functions (solve Eq. 31). However, it has advantages when one wishes to calculate the “near-field” region or to include shear wave effects in elastic media; it is also often used as a benchmark for other less exact techniques. With a great deal of additional computational effort, this method is extendable to range-dependent environments.

D. Normal Mode Model

Rather than solve Eq. (31) for each $g$ for the complete set of $k$’s (typically thousands of times), one can utilize a normal mode expansion of the form

$$g(k, z) = \sum_n a_n(k)\, u_n(z), \tag{33}$$

where the quantities $u_n$ are eigenfunctions of the following eigenvalue problem:

$$\frac{d^2 u_n}{dz^2} + \left[K^2(z) - k_n^2\right] u_n(z) = 0. \tag{34}$$

The eigenfunctions, $u_n$, are zero at $z = 0$, satisfy the local boundary conditions descriptive of the ocean-bottom properties, and satisfy a radiation condition for $z \to \infty$. They form an orthonormal set in a Hilbert space with weighting function $\rho(z)$, the local density. The range of discrete eigenvalues corresponding to the poles in the integrand of Eq. (32) is given by the condition

$$\min[K(z)] < k_n < \max[K(z)]. \tag{35}$$

These discrete eigenvalues correspond to discrete angles within the critical angle cone in Fig. 6a as discussed in Section I.B.3. The eigenvalues $k_n$ typically have a small imaginary part $\alpha_n$, which serves as the modal attenuation representative of all the losses in the ocean environment.

Solving Eq. (26) using the normal mode expansion given by Eq. (33) yields (for the source at the origin)

$$G(r, z) = \frac{i}{4\rho(z_s)} \sum_n u_n(z_s)\, u_n(z)\, H_0^{(1)}(k_n r). \tag{36}$$

The asymptotic form of the Hankel function can be used in the above equation to obtain the well-known normal mode representation of a cylindrical (axis is depth) waveguide:

$$G(r, z) = \frac{i}{\rho(z_s)(8\pi r)^{1/2}} \exp(-i\pi/4) \sum_n \frac{u_n(z_s)\, u_n(z)}{k_n^{1/2}} \exp(i k_n r). \tag{37}$$

Equation (37) is a far-field solution of the wave equation and neglects the continuous spectrum ($k_n < \min[K(z)]$ of Ineq. 35) of modes. For purposes of illustrating the various portions of the acoustic field, we note that $k_n$ is a horizontal wavenumber so that a “ray angle” associated with a mode with respect to the horizontal can be taken to be $\theta = \cos^{-1}[k_n/K(z)]$. For a simple waveguide, the maximum sound speed is the bottom sound speed corresponding to $\min[K(z)]$. At this value of $K(z)$, we have from Snell’s law $\theta = \theta_c$, the bottom critical angle. In effect, if we look at a ray picture of the modes, the continuous portion of the mode spectrum corresponds to rays with grazing angles greater than the bottom critical angle of Fig. 8b and therefore outside the cone of Fig. 6a. This portion undergoes severe loss. Hence, we note that the continuous spectrum is the near (vertical) field and the discrete spectrum is the far (more horizontal, profile dependent) field falling within the cone in Fig. 6a.

The advantages of the normal mode procedure are that (1) the solution is available for all source and receiver configurations once the eigenvalue problem is solved; (2) it is easily extended to moderately range-dependent conditions using the adiabatic approximation; and (3) it can be applied (with more effort) to extremely range-dependent environments using coupled mode theory. However, it does not include a full representation of the near field.
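The mode sum of Eq. (37) can be tried out on the one case where Eq. (34) has a closed-form solution. The Python sketch below assumes an idealized waveguide that is not the penetrable-bottom ocean described above: constant sound speed and density, a pressure-release surface, and a perfectly rigid bottom at depth D. For that case the vertical wavenumbers are (n − 1/2)π/D and the modes are sines normalized so that the integral of u_n²/ρ over depth is one. All function names and parameter values are illustrative.

```python
import cmath
import math

def ideal_waveguide_modes(freq_hz, depth_m, c=1500.0):
    """Propagating modes (kz_n, k_n) of an isovelocity layer with a rigid bottom."""
    big_k = 2.0 * math.pi * freq_hz / c
    modes = []
    n = 1
    while True:
        kz = (n - 0.5) * math.pi / depth_m      # vertical wavenumber of mode n
        if kz >= big_k:                         # mode n is cut off
            break
        modes.append((kz, math.sqrt(big_k ** 2 - kz ** 2)))
        n += 1
    return modes

def mode_function(kz, z, depth_m, rho=1000.0):
    """u_n(z), normalized so that the integral of u_n^2 / rho over depth is 1."""
    return math.sqrt(2.0 * rho / depth_m) * math.sin(kz * z)

def field_far(r, z, zs, modes, depth_m, rho=1000.0):
    """Far-field mode sum of Eq. (37)."""
    pref = 1j * cmath.exp(-1j * math.pi / 4.0) / (rho * math.sqrt(8.0 * math.pi * r))
    total = 0j
    for kz, kn in modes:
        total += (mode_function(kz, zs, depth_m, rho)
                  * mode_function(kz, z, depth_m, rho)
                  * cmath.exp(1j * kn * r) / math.sqrt(kn))
    return pref * total

def transmission_loss(r, z, zs, modes, depth_m):
    """Loss in dB relative to the free-space field 1/(4 pi R) at R = 1 m."""
    return -20.0 * math.log10(abs(field_far(r, z, zs, modes, depth_m)) * 4.0 * math.pi)
```

Because the modes are computed once, the field at any other source or receiver depth costs only a re-evaluation of the sum, which is advantage (1) listed above.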

E. Adiabatic Mode Theory

All of the range-independent normal mode “machinery” developed for environmental ocean acoustic modeling applications can be adapted to mildly range-dependent conditions using adiabatic mode theory. The underlying assumption is that individual propagating normal modes adapt (but do not scatter or “couple” into each other) to the local environment. The coefficients of the mode expansion, $a_n$ in Eq. (33), now become mild functions of range, i.e., $a_n(k) \to a_n(k, r)$. This modifies Eq. (37) as follows:

$$G(r, z) = \frac{i}{\rho(z_s)(8\pi r)^{1/2}} \exp(-i\pi/4) \sum_n \frac{u_n(z_s)\, v_n(z)}{\bar{k}_n^{1/2}} \exp(i\bar{k}_n r), \tag{38}$$

where the range-averaged wavenumber (eigenvalue) is

$$\bar{k}_n = \frac{1}{r}\int_0^r k_n(r')\, dr', \tag{39}$$

and the $k_n(r')$ are obtained at each range segment from the eigenvalue problem Eq. (34) evaluated for the environment at that particular range along the path. The quantities $u_n$ and $v_n$ are the sets of modes at the source and the field positions, respectively.

Simply stated, the adiabatic mode theory leads to a description of sound propagation such that the acoustic field is a function of the modal structure at both the source and the receiver and some average propagation conditions between the two. Thus, for example, when sound emanates from a shallow region where only two discrete modes exist and propagates into a deeper region with the same bottom (same critical angle), the two modes from the shallow region adapt to the form of the first two modes in the deep region. However, the deep region can support many more modes; intuitively, we therefore expect the resulting two modes in the deep region will take up a smaller, more horizontal part of the cone of Fig. 6a than they take up in the shallow region. This means that sound rays going from shallow to deep tend to become more horizontal, which is consistent with a ray picture of downslope propagation. Finally, fully coupled mode theory for range-dependent environments has been developed but requires extremely intensive computation.
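The range-averaged eigenvalue of Eq. (39) is straightforward to evaluate numerically once the local eigenvalues are known. The Python sketch below again assumes an idealized isovelocity layer with a pressure-release surface and a rigid bottom, so that the local eigenvalue has a closed form; the depth function, the names, and the trapezoidal rule are illustrative assumptions rather than the method of any particular model.

```python
import math

def local_kn(n, freq_hz, depth_m, c=1500.0):
    """k_n for an ideal layer (pressure-release top, rigid bottom) of the local depth."""
    big_k = 2.0 * math.pi * freq_hz / c
    kz = (n - 0.5) * math.pi / depth_m
    if kz >= big_k:
        raise ValueError("mode is cut off at this depth")
    return math.sqrt(big_k ** 2 - kz ** 2)

def range_averaged_kn(n, freq_hz, depth_of_r, r, n_seg=1000):
    """Eq. (39) by the trapezoidal rule over n_seg range segments."""
    dr = r / n_seg
    vals = [local_kn(n, freq_hz, depth_of_r(i * dr)) for i in range(n_seg + 1)]
    return (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1]) * dr / r
```

For a downslope path the local k_n grows with depth, so the averaged value lies between the source-end and receiver-end eigenvalues, in line with the modes becoming more horizontal as described above.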

1. Parabolic Equation Model (PE)

The PE method was introduced into ocean acoustics and made viable with the development of the “Tappert split-step algorithm,” which utilized FFTs at each range step. Subsequent numerical developments greatly expanded the applicability of the parabolic equation.

2. Standard PE Split–Step Algorithm

The PE method is presently the most practical and encompassing wave-theoretic range-dependent propagation model. In its simplest form, it is a far-field, narrow-angle (∼ ±20° with respect to the horizontal, which is adequate for most underwater propagation problems) approximation to the wave equation. Assuming azimuthal symmetry about a source, we express the solution of Eq. (21) in cylindrical coordinates in a source-free region in the form

$$p(r, z) = \psi(r, z) \cdot H(r), \tag{40}$$

and we define $K^2(r, z) \equiv K_0^2 n^2$, $n$ therefore being an “index of refraction” $c_0/c$, where $c_0$ is a reference sound speed. Substituting Eq. (40) into Eq. (21) and taking $K_0^2$ as the separation constant, we end up with a Bessel equation for $H$ which has a Hankel function as the outgoing solution. If we use the asymptotic form of the Hankel function, $H_0^{(1)}(K_0 r)$, and invoke the “paraxial” (narrow angle) approximation,

$$\frac{\partial^2 \psi}{\partial r^2} \ll 2K_0 \frac{\partial \psi}{\partial r}, \tag{41}$$

we obtain the parabolic equation (in $r$),

$$\frac{\partial^2 \psi}{\partial z^2} + 2iK_0 \frac{\partial \psi}{\partial r} + K_0^2 (n^2 - 1)\psi = 0, \tag{42}$$

where we note that $n$ is a function of range and depth. We use a marching solution to solve the parabolic equation. There has been an assortment of numerical solutions, but the one that still remains a standard is the so-called “split-step” range-marching algorithm,

$$\psi(r + \Delta r, z) = \exp\left[\frac{iK_0}{2}(n^2 - 1)\Delta r\right] F^{-1}\left[\exp\left(-\frac{i\Delta r}{2K_0}s^2\right) \cdot F[\psi(r, z)]\right], \tag{43}$$

where $\Delta r$ is the range step and $s$ is the transform variable (vertical wavenumber) conjugate to $z$. The Fourier transforms $F$ are performed using FFTs. Equation (43) is the solution for $n$ constant, but the error introduced when $n$ (profile or bathymetry) varies with range and depth can be made arbitrarily small by increasing the transform size and decreasing the range-step size. It is possible to modify the split-step algorithm to increase its accuracy with respect to higher angle propagation.
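A single range step of Eq. (43) can be sketched in plain Python as follows. This is a bare illustration under stated assumptions: a small recursive radix-2 FFT stands in for a library routine, the depth grid is treated as periodic, and the pressure-release surface of Eq. (23) and any absorbing layer at the bottom of the grid are omitted. The function names are hypothetical.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1.0 if inverse else -1.0
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(x):
    """Inverse FFT with the 1/N normalization."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]

def split_step(psi, n2_minus_1, k0, dz, dr):
    """Advance psi(r, z) by one range step dr following Eq. (43).

    psi        : complex field samples on a uniform depth grid of spacing dz
    n2_minus_1 : values of n^2 - 1 on the same grid
    """
    nz = len(psi)
    # Vertical wavenumbers s in FFT ordering (non-negative first, then negative).
    s = [2.0 * math.pi * (j if j < nz // 2 else j - nz) / (nz * dz) for j in range(nz)]
    spec = fft(psi)
    spec = [cmath.exp(-1j * dr * sj ** 2 / (2.0 * k0)) * v for sj, v in zip(s, spec)]
    psi = ifft(spec)
    return [cmath.exp(0.5j * k0 * m * dr) * v for m, v in zip(n2_minus_1, psi)]
```

Both exponential factors have unit modulus when n is real, so the summed squared magnitude of the field is conserved from step to step, which is a convenient sanity check on any implementation.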

3. Generalized or Higher-Order PE Methods

Methods of solving the parabolic equation, including extensions to higher angle propagation, elastic media, and direct time domain solutions including nonlinear effects, have recently appeared. In particular, accurate high-angle solutions are important when the environment supports acoustic paths that become more vertical, such as when the bottom has a very high speed and, hence, a large critical angle with respect to the horizontal. In addition, for elastic propagation, the compressional and shear waves span a wide angle interval. Finally, Fourier synthesis for pulse modeling requires high accuracy in phase, and the high-angle PEs are more accurate in phase, even at the low angles.


FIGURE 13 Consistency between ray theory and normal mode theory. (a) Sound speed profile. (b) Ray trace. (c) Normal modes. (d) Propagation calculations.

Equation (42), with the second-order range derivative which was neglected because of Ineq. (41) restored, can be written in operator notation as

$$\left[P^2 + 2iK_0 P + K_0^2 (Q^2 - 1)\right]\psi = 0, \tag{44}$$

where

$$P \equiv \frac{\partial}{\partial r}, \qquad Q \equiv \sqrt{n^2 + \frac{1}{K_0^2}\frac{\partial^2}{\partial z^2}}. \tag{45}$$

Factoring Eq. (44) assuming weak range dependence and retaining only the factor associated with outgoing propagation yields a one-way equation

$$P\psi = iK_0 (Q - 1)\psi, \tag{46}$$

which is a generalization of the parabolic equation beyond the narrow angle approximation associated with Ineq. (41). If we define $Q = \sqrt{1 + q}$ and expand $Q$ in a Taylor series as a function of $q$, the standard PE method is recovered by $Q \approx 1 + 0.5q$. The wide-angle PE to arbitrary accuracy in angle, phase, etc., can be obtained from a Padé series representation of the $Q$ operator,

$$Q \equiv \sqrt{1 + q} = 1 + \sum_{j=1}^{n} \frac{a_{j,n}\, q}{1 + b_{j,n}\, q} + O(q^{2n+1}), \tag{47}$$

where $n$ is the number of terms in the Padé expansion and

$$a_{j,n} = \frac{2}{2n + 1}\sin^2\left(\frac{j\pi}{2n + 1}\right), \qquad b_{j,n} = \cos^2\left(\frac{j\pi}{2n + 1}\right). \tag{48}$$

The solution of Eq. (46) using Eqs. (47) and (48) has been implemented using finite difference techniques for fluid and elastic media.
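The quality of the Padé representation is easy to examine for a scalar q, which here stands in for the operator. The short Python sketch below evaluates the coefficients of Eq. (48) and the sum of Eq. (47); the function names are illustrative, and treating q as a plain number is a simplification made only to compare against the exact square root.

```python
import math

def pade_coefficients(n):
    """Coefficients a_{j,n} and b_{j,n} of Eq. (48) for j = 1..n."""
    a = [2.0 / (2 * n + 1) * math.sin(j * math.pi / (2 * n + 1)) ** 2
         for j in range(1, n + 1)]
    b = [math.cos(j * math.pi / (2 * n + 1)) ** 2 for j in range(1, n + 1)]
    return a, b

def pade_sqrt(q, n):
    """n-term Pade approximation of sqrt(1 + q), Eq. (47), for scalar q."""
    a, b = pade_coefficients(n)
    return 1.0 + sum(aj * q / (1.0 + bj * q) for aj, bj in zip(a, b))

def taylor_sqrt(q):
    """Standard (narrow-angle) PE approximation Q ~ 1 + 0.5 q."""
    return 1.0 + 0.5 * q
```

For n = 1 the coefficients are a = 1/2 and b = 1/4, i.e., the rational approximation 1 + 0.5q/(1 + 0.25q), and each added term reduces the error at a given q.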

V. QUANTITATIVE DESCRIPTION OF PROPAGATION

All of the models described above attempt to describe reality and to solve in one way or another the Helmholtz equation. They therefore should be consistent, and there is much insight to be gained from understanding this consistency. The models ultimately compute propagation loss, which is taken as the decibel ratio (see Appendix) of the pressure at the field point to a reference pressure, typically one meter from the source.

Figure 13 shows convergence zone type propagation for a simplified profile. The ray trace in Fig. 13b shows the cyclic focusing discussed in Section I.B. The same profile is used to calculate normal modes, shown in Fig. 13c, which when summed according to Eq. (37) exhibit the same cyclic pattern as the ray picture. Figure 13d shows both the normal mode (wave theory) and ray theory result. Ray theory exhibits sharply bounded shadow regions as expected, whereas the normal mode theory, which includes diffraction, shows that the acoustic field does exist in the shadow regions, and the convergence zones have structure.

Normal mode models sum the discrete modes which roughly correspond to angles of propagation within the cone of Fig. 6a. The spectral method can include the full field, discrete plus continuous, the latter corresponding to larger angles. The discussion following Eq. (37) defines these angles in terms of horizontal wavenumbers, and eigenvalues of the normal mode problem are a discrete set of horizontal wavenumbers. Hence the integrand (Green’s function) of the spectral method has peaks at the eigenvalues associated with the normal modes. These peaks are shown on the right of Fig. 14a. The smoother portion of the spectrum is the continuous part corresponding to the larger angles. Therefore, the consistency we expect between the normal mode and the spectral method and the physics of Fig. 6 is that the continuous portion of the spectral solution decays rapidly with range so that there should be complete agreement at long ranges between normal mode and spectral solutions. The Lloyd mirror effect, a near-field effect, should also be exhibited in the spectral solution but not the normal mode solution. These aspects are apparent in Fig. 14b. The PE solution is in good agreement with the other solutions but with some phase error associated with the average wavenumber that must be chosen in the split-step method. The PE solution, which contains part of the continuous spectrum including the Lloyd mirror beams, is more accurate than the normal mode solution at short range; however, the generalized PE can be made arbitrarily accurate at short range by including more expansion terms in Eq. (47).

FIGURE 14 Relationship between FFP, NM, and PE computations. (a) FFP Green’s function from Eq. (31). (b) Normal mode, spectral (FFP), and PE propagation results showing some agreement in near field and complete agreement in far field.

Range-dependent results are shown in Fig. 15. A ray trace, a ray trace field result, a PE result, and data are plotted together for a range-dependent sound speed profile environment. The models agree with the data in general, with the exception that the ray results predict too sharp a leading edge of the convergence zone.

Upslope propagation is modeled with the PE in Fig. 16. As the field propagates upslope, sound is dumped into the bottom in what appears to be discrete beams. The flat region has three modes and each is cut off successively as sound propagates into shallower water. The ray picture also has a consistent explanation of this phenomenon. The rays for each mode become steeper as they propagate upslope. When the ray angle exceeds the critical angle, the sound is significantly transmitted into the bottom. The locations where this takes place for each of the modes are identified by the three arrows.

FIGURE 15 Model and data comparison for a range-dependent case. (a) Profiles and ray trace for a case of a surface duct disappearing. (b) 250 Hz PE and 2 kHz ray trace comparisons with data.

FIGURE 16 Relation between upslope propagation (from PE calculation) showing individual mode cutoff and energy dumping in the bottom, and a corresponding ray schematic.

As a final example of how physical insight can be derived from models, we present a range-independent normal mode study of the optimum frequency of propagation in a shallow water environment with summer and winter (dashed lines) profiles as indicated in Fig. 17a with the source (S) and receiver (R) also indicated. Frequency versus range contours of propagation loss obtained from a wideband experiment (analyzed in 1/3 octave bands) are shown in Fig. 17b. One obtains the conventional type of propagation loss curves by taking a horizontal cut through the contour in Fig. 17b, with the result shown in Fig. 17c. Figure 17d is a model result from an incoherent (no cross terms) sum of normal modes. We note here, as an aside, that in shallow water environments, propagation loss obtained by incoherently summing the modes is approximately equal to 1/3 octave frequency averaging, which has the effect of averaging away modal interference. The frequency versus range contours reveal an optimum frequency in the 200–400 Hz region. This can be seen by observing the 80 dB contour, which goes out to long ranges in the region, whereas other frequencies, at say a range of 70 km, have much higher losses.

VI. SONAR ARRAY PROCESSING

Temporal processing such as digital signal processing is common to many fields. In this section we emphasize applications to underwater acoustics, mainly concentrating on spatial processing. Further, the array processing discussed below for passive sonars is also applicable to active sonar signal processing. Spatial sampling of a sound field is usually done by an array of transducers, although the synthetic aperture array, in which a sensor is moved through space to obtain measurements in both the time and space domains, is an important exception. Spatial sampling is analogous to temporal sampling, with the sampling interval replaced by the sensor spacing vectors. The Nyquist criterion requires that the spatial sampling rate (the reciprocal of the sensor spacing) be at least twice the highest spatial frequency of the measured sound field, i.e., that the sensors be spaced no more than half a wavelength apart.

A. Linear Plane Wave Beamforming and Spatiotemporal Sampling

The simplest example of array processing is phase shading in the frequency domain (or time delay in the time domain) to search for the bearing of a plane wave signal. This procedure is referred to as plane wave beamforming, or delay and sum beamforming in the time domain. For simplicity we consider a linear array, and we take θ as the bearing angle associated with the plane wave signal as shown in Fig. 18.

1. Frequency Domain Processing

A plane wave can be represented as

$$s(\theta) = \exp(i\mathbf{k}\cdot\mathbf{r}), \tag{49}$$

where we have suppressed the time dependence [$\exp(-i\omega t)$] and $k = |\mathbf{k}| = \omega/c$. The field is summed in phase if the receiving element (hydrophone or microphone) inputs at position $\mathbf{d}_i$ are multiplied by the complex conjugate of the plane wave phase factor,

$$w_i^* = \exp(-i\mathbf{k}\cdot\mathbf{d}_i) = \exp[-i d_i (k \sin\theta_s)], \tag{50}$$

where $\theta_s$ is a scanning angle. This process will have a maximum when the scanning angle equals the incident angle of the signal.

The output of this beamforming process is denoted $B(\theta_s)$, but often it is the power output of the beamformer that is of interest:

$$|B(\theta_s)|^2 = \left|\sum_{i=1}^{m} w_i^*(\theta_s)\,[s_i(\theta) + n_i]\right|^2 = \sum_{i,j=1}^{m} w_i^*(\theta_s)\,(s_{ij} + n_{ij})\, w_j(\theta_s), \tag{51}$$

where $s_i$ and $n_i$ are the signal and noise at the $i$th receiving element and where $s_{ij} + n_{ij}$ are elements of a cross-spectral density matrix which, when obtained from data, would involve Fourier transforms and ensemble averages as mentioned in the introduction and in the discussion following Eq. (53) augmented by Fig. 19. In writing down the right-hand side of Eq. (51), the signal and noise fields were assumed to be mutually incoherent.

FIGURE 17 (a) Shallow water environment with summer sound speed profile. (b) Frequency versus range contours of propagation loss obtained from a wide-band experiment (analyzed in 1/3 octave bands). (c) Propagation curves at three frequencies taken from the experimental contour curves. (d) Theoretical result using normal mode model.

We can write the above expression in matrix notation, where the boldface lower-case letters denote vectors and boldface upper-case letters denote matrices. Define a steering column vector $\mathbf{w}$ whose $i$th element is $w_i$ and a cross-spectral density matrix (CSDM) $\mathbf{K}$ of the signal and noise with elements $K_{ij} = s_{ij} + n_{ij}$, since the signal and noise are assumed to be independent. Equation (51) can be rewritten as

$$|B(\theta_s)|^2 = \mathbf{w}^{\dagger}(\theta_s)\,\mathbf{K}(\theta_{true})\,\mathbf{w}(\theta_s) \equiv \mathbf{w}^{\dagger}\mathbf{K}\mathbf{w}, \tag{52}$$

where “†” denotes the complex conjugate transpose operation. The CSDM or the covariance of the field is composed of uncorrelated signal and noise covariances,

$$\mathbf{K} = \mathbf{K}_s + \mathbf{K}_n. \tag{53}$$

The data across the array as represented in the matrix $\mathbf{K}$ contain the information that the source is in the direction $\theta_{true}$. Sometimes $\mathbf{w}(\theta_s)$ is referred to as a replica, and the above beamforming process is viewed as matching the received data across the array with a replica. The type of beamformer represented by Eq. (52) is called a linear or a Bartlett beamformer.

FIGURE 18 Geometry for plane wave beamforming.

FIGURE 19 Array, narrowband model, and sample covariance matrix estimation.

For the sample covariance estimation we assume we have an array with $N$ sensors located at $\mathbf{d}_i$, $i = 1, \ldots, N$, and a narrowband model as illustrated in Fig. 19. These covariances are estimated by segmenting the received data, $r_i(t)$, into “snapshots” using a sampling window, $W(t)$, that is unity in the interval $[0, T_w]$,

$$R_{li}(f) = \int_{T_l}^{T_l + T_w} r_i(t)\, W(t - T_l)\, e^{-j2\pi f t}\, dt, \tag{54}$$

where, here, the notation uses frequency, $f$, rather than angular frequency, $\omega$. In most beamforming algorithms the data vectors are averaged to form the sample covariance matrix

$$\mathbf{K}(f) = \frac{1}{L}\sum_{l=1}^{L} \mathbf{R}_l(f)\, \mathbf{R}_l(f)^H, \tag{55}$$

where $L$ is the number of snapshots.
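The chain from snapshots to a Bartlett bearing scan, Eqs. (50), (52), and (55), can be sketched in plain Python. The example below skips the Fourier transform of Eq. (54) and instead simulates narrowband snapshot vectors directly: a unit-amplitude plane wave with a random phase per snapshot plus independent Gaussian noise on each sensor. The simulation model, the half-wavelength line array in the usage note, and all function names are assumptions made for illustration.

```python
import cmath
import math
import random

def steering_vector(theta_deg, positions, k):
    """w_i = exp(i k d_i sin(theta)), so that w_i* is the factor in Eq. (50)."""
    st = math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * k * d * st) for d in positions]

def sample_csdm(snapshots):
    """Eq. (55): average of the outer products R R^H over the L snapshots."""
    n = len(snapshots[0])
    n_snap = len(snapshots)
    return [[sum(r[i] * r[j].conjugate() for r in snapshots) / n_snap
             for j in range(n)] for i in range(n)]

def bartlett_power(csdm, w):
    """Eq. (52): w^dagger K w, which is real for a Hermitian K."""
    n = len(w)
    return sum(w[i].conjugate() * csdm[i][j] * w[j]
               for i in range(n) for j in range(n)).real

def simulate_snapshots(theta_true, positions, k, n_snap, noise_std, rng):
    """Hypothetical data: plane wave with random phase plus sensor noise."""
    s = steering_vector(theta_true, positions, k)
    snaps = []
    for _ in range(n_snap):
        amp = cmath.exp(2j * math.pi * rng.random())
        snaps.append([amp * si + complex(rng.gauss(0.0, noise_std),
                                         rng.gauss(0.0, noise_std))
                      for si in s])
    return snaps

def scan_bearings(csdm, positions, k, angles_deg):
    """Bartlett power at each scan angle; the peak estimates the bearing."""
    return {th: bartlett_power(csdm, steering_vector(th, positions, k))
            for th in angles_deg}
```

As a usage example, a 16-element array with 7.5 m spacing at 100 Hz (half a wavelength for c = 1500 m/s) can be scanned with `scan_bearings(K, positions, k, range(-90, 91))`; the largest value is expected near the simulated bearing.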

2. Time Domain Processing

Time delay is the time domain analog of phase shading in the frequency domain. This can be derived formally by taking the Fourier transform of the beamforming process, with the result that the beamformer output is

$$B(t) = \sum_i r_i\left(t - \frac{d_i}{c}\sin\theta\right), \tag{56}$$

where $r_i(\cdot)$ is the time domain data at the $i$th phone. This process is referred to as delay and sum beamforming; the delay is simply the time interval associated with the phase shift in the frequency domain array processing.
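Delay and sum beamforming as written in Eq. (56) can be illustrated with a few lines of Python. To avoid interpolation, each channel is represented here as a function of continuous time; the simulated arrivals are given the sign convention that makes Eq. (56) align them at the true bearing. That convention, the pulse shape in the usage note, and the function names are assumptions of this sketch.

```python
import math

def delay_and_sum(channels, positions, theta_deg, c, times):
    """Eq. (56): B(t) = sum_i r_i(t - (d_i / c) sin(theta)).

    channels is a list of callables r_i(t), one per sensor position d_i.
    """
    st = math.sin(math.radians(theta_deg))
    return [sum(r(t - d * st / c) for r, d in zip(channels, positions))
            for t in times]

def beam_power(channels, positions, theta_deg, c, times):
    """Mean-square beamformer output over the supplied time samples."""
    out = delay_and_sum(channels, positions, theta_deg, c, times)
    return sum(v * v for v in out) / len(out)

def make_plane_wave_channels(pulse, positions, theta_true_deg, c):
    """Simulated arrivals r_i(t) = pulse(t + (d_i / c) sin(theta_true))."""
    st = math.sin(math.radians(theta_true_deg))
    return [lambda t, tau=d * st / c: pulse(t + tau) for d in positions]
```

Scanning `beam_power` over candidate bearings and picking the maximum gives the time domain counterpart of the frequency domain scan; at the true bearing the channels add coherently and the output is the number of sensors times the pulse.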

B. Some Beamformer Properties

FIGURE 20 Beamformer outputs. (a) Single source at a bearing of 45°. (b) Two sources with 6.3° angular separation. Solid line: linear processor (Bartlett). Dashed line: minimum variance distortionless processor (MV).

Figure 20 shows the output results of some plane wave beamformers for the cases of one and two incident signals. To be noted are the sidelobes of the Bartlett processor and the high-resolution performance of the adaptive processors (discussed in the next section). Some of the general attributes of an array processor are:

The main response axis (MRA): Generally, one normalizes the beampattern to have 0 dB, or unity gain, in the steered direction.

Beamwidth: An array with finite extent, or aperture, must have a finite beamwidth centered about the MRA, which is termed the “main lobe.”

Sidelobes: Sidelobes are angular or wavenumber regions where the array has a relatively strong response. Sometimes they can be comparable to the MRA, but in a well-designed array, they are usually −20 dB or lower, i.e., the response of the array is less than 0.1 to a signal in the direction of a sidelobe.

Wavenumber processing: Rather than scan through incident angles, one can scan through wavenumbers, $k \sin\theta_s \equiv \kappa_s$; scanning through all possible values of $\kappa_s$ results in nonphysical angles which correspond to waves not propagating at the acoustic medium speed. Such waves can exist, such as those associated with array vibrations. The beams associated with these wavenumbers are sometimes referred to as virtual beams. An important aspect of these beams is that their sidelobes can be in the physical angle region, thereby interfering with acoustic propagating signals.

Array gain: The array gain is defined as the decibel ratio of the signal-to-noise ratios of the array output to a single phone output. If the noise field is isotropic, the array gain is also termed the directivity index.

C. Adaptive Processing

There are high-resolution methods to suppress sidelobes, usually referred to as adaptive methods since the signal processing procedure constructs weight vectors that depend on the received data itself. We briefly describe one of these procedures: the Minimum Variance Distortionless Processor (MVDP), sometimes also called the Maximum Likelihood Method (MLM) directional spectrum estimation procedure.

We seek a weight vector $\mathbf{w}_{MV}$ applied to the matrix $\mathbf{K}$ such that its effect will be to minimize the output of the beamformer, Eq. (52), except in the look direction, where we want the signal to pass through undistorted. The weight vector is therefore chosen to minimize the functional

$$F = \mathbf{w}_{MV}^{\dagger} \mathbf{K} \mathbf{w}_{MV} + \alpha \left( \mathbf{w}_{MV}^{\dagger} \mathbf{w} - 1 \right). \tag{57}$$

The first term is the mean-square output of the array and the second term incorporates the constraint of unity gain by means of the Lagrangian multiplier $\alpha$. Following the method of Lagrange multipliers, we obtain the MV weight vector,

$$\mathbf{w}_{MV} = \frac{\mathbf{K}^{-1}\mathbf{w}}{\mathbf{w}^{\dagger}\mathbf{K}^{-1}\mathbf{w}}. \tag{58}$$

This new weight vector depends on the received data as represented by the cross-spectral density matrix; hence, the method is “adaptive.” Substituting back into the quadratic form of Eq. (52) gives the output of our MV processor,

$$B_{MV}(\theta_s) = \left[ \mathbf{w}^{\dagger}(\theta_s)\, \mathbf{K}^{-1}(\theta_{true})\, \mathbf{w}(\theta_s) \right]^{-1}. \tag{59}$$

The MV beamformer should have the same peak value at $\theta_{true}$ as the Bartlett beamformer, Eq. (52), but with sidelobes suppressed and a narrower main beam, indicating that it is a high-resolution beamformer. Examples are shown in Fig. 20.
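The comparison between the two processors can be tried numerically. The following is a minimal sketch, assuming a uniform line array, unit-norm plane-wave weight vectors, and a model covariance consisting of one signal plus white noise; the Bartlett output is taken here to be the quadratic form $\mathbf{w}^{\dagger}\mathbf{K}\mathbf{w}$. All function names and the array geometry are illustrative assumptions, not the article's.

```python
import cmath
import math

def steering(n, spacing_over_lambda, theta):
    """Unit-norm plane-wave weight vector for an n-element line array (assumed geometry)."""
    phase = 2 * math.pi * spacing_over_lambda * math.sin(theta)
    return [cmath.exp(1j * phase * i) / math.sqrt(n) for i in range(n)]

def model_covariance(w_true, signal_power, noise_power):
    """K = S w_true w_true^H + sigma^2 I: one plane-wave signal in white noise."""
    n = len(w_true)
    return [[signal_power * w_true[i] * w_true[j].conjugate()
             + (noise_power if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def solve(K, b):
    """Solve K x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [list(K[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            fac = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= fac * A[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        x[i] = (A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def bartlett(w, K):
    """Linear (Bartlett) output, the quadratic form w^H K w."""
    n = len(w)
    return sum(w[i].conjugate() * K[i][j] * w[j]
               for i in range(n) for j in range(n)).real

def minimum_variance(w, K):
    """Eq. (59): [w^H K^{-1} w]^{-1}, computed with a linear solve instead of an inverse."""
    x = solve(K, w)
    return 1.0 / sum(wi.conjugate() * xi for wi, xi in zip(w, x)).real
```

For this model both outputs equal the signal-plus-noise power in the true direction, while away from it the MV output is never larger than the Bartlett output, since the Bartlett weight itself satisfies the unity-gain constraint.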

D. Matched Field Processing

Matched field processing (MFP) is the three-dimensional generalization of the conventional lower-dimensional plane wave beamformer that matches the measured field at the array with replicas of the expected field for all source locations. These replicas, $\mathbf{w}$, are derived from propagation models as discussed in Section IV. The unique spatial structure of the field permits localization in range, depth, and azimuth depending on the array geometry and complexity of the ocean environment. The process is shown schematically in Fig. 21. MFP consists of systematically placing a test point source at each point of a search grid, computing the acoustic field (replicas) at all the elements of the array, and then correlating this modeled field with the data from the real point source, $\mathbf{K}(\mathbf{a}_{true})$, whose location is unknown. When the test point source is colocated with the true point source, the correlation will be a maximum. The output of this matched field processor, denoted $S(\mathbf{a})$ to indicate the generalization beyond plane wave beamforming, at each point in space $\mathbf{a}$ is, in analogy to Eq. (52),

$$S(\mathbf{a}) = \mathbf{w}^{\dagger}(\mathbf{a})\, \mathbf{K}(\mathbf{a}_{true})\, \mathbf{w}(\mathbf{a}), \tag{60}$$

where the peak of the output of the beamformer, $S(\mathbf{a})$, is at $\mathbf{a}_{true}$. $S(\mathbf{a})$ is also referred to as the ambiguity function (or surface) of the matched field processor because it also contains ambiguous peaks which are analogous to the sidelobes of a conventional plane wave beamformer. Sidelobe suppression can often be accomplished by using a nonlinear beamformer such as the MLM beamformer:

$$S_{MV}(\mathbf{a}) = \left[ \mathbf{w}^{\dagger}(\mathbf{a})\, \mathbf{K}^{-1}(\mathbf{a}_{true})\, \mathbf{w}(\mathbf{a}) \right]^{-1}. \tag{61}$$

A simulated vertical receive array example of the Bartlett and MVDP MFP processors for an ocean acoustic waveguide with a high signal-to-noise ratio is shown in Fig. 22. The two main factors that limit the performance of MFP are noise and the ability to accurately model the environment. Related to MFP is matched field tomography (MFT), which searches for the environmental parameters controlling the propagation (for example, the index of refraction, which may be a spatially dependent coefficient of the wave equation) rather than the source location.

VII. ACTIVE SONAR PROCESSING

An active sonar system transmits a pulse and extracts information from the echo it receives, as opposed to a passive sonar system, which extracts information from signals received from radiating sources. An active sonar system and its associated waveform is designed to detect targets and estimate their range, Doppler (speed), and bearing or to determine some properties of the medium such as ocean-bottom bathymetry, ocean currents, winds, particulate concentration, etc. The spatial processing methods already discussed are applicable to the active problem, so that in this section we emphasize the temporal aspects of active signal processing.

A. Active Sonar Signal Processing

The basic elements of an active sonar are: the (waveform) transmitter, the channel through which the signal, echo, and interference propagate, and the receiver. The receiver consists of some sort of matched filter, a square law device, and possibly a threshold device for the detector and range, Doppler, and bearing scanners for the estimator.

FIGURE 21 Matched field processing. (a) The true source location is obtained by modeling data at an array from a set of grid points and comparing the model with the actual data on the array. (b) Schematic diagram of the matched field processor.

The matched filter maximizes the ratio of the peak output signal power to the variance of the noise and is implemented by correlating the received signal with the transmitted signal. A simple description of the received signal, $r(t)$, is that it is an attenuated, delayed, and Doppler shifted version of the transmitted signal, $s_t(t)$,

$$r(t) \approx \mathrm{Re}\left[ \alpha e^{i\theta}\, s_t(t - T)\, e^{2\pi i f_c t}\, e^{2\pi i f_d t} + n(t) \right], \tag{62}$$

where $\alpha$ is the attenuation (transmission loss and target cross section), $\theta$ is a random phase from the range uncertainty compared to a wavelength, $T$ is the range delay time, $f_c$ is the center frequency of the transmitted signal, and $f_d$ is the Doppler shift caused by the target. The correlation process will have an output related to the following process,

$$C(\mathbf{a}) = \left| \int r(t)\, s(t; \mathbf{a})\, dt \right|^2 \tag{63}$$

where $s(t; \mathbf{a})$ is a replica of the transmitted signal modified by a parameter set $\mathbf{a}$ which includes the propagation-reflection process, e.g., range delay and Doppler rate. For the detection problem, the correlation receiver is used to generate a sufficient statistic which is the basis for a threshold comparison in deciding whether a target is present. The performance of the detector is described by receiver operating characteristic (ROC) curves, which plot the probability of detection versus the false alarm probability as parameterized by a statistic of the signal and noise levels. The parameters $\mathbf{a}$ set the range and Doppler value in the particular resolution cell of concern. To estimate these parameters, the correlation is done as a function of $\mathbf{a}$.

FIGURE 22 Simulated matched field results for the environment in Fig. 21a. (a) Bartlett MFP ambiguity surface. (b) Minimum variance distortionless MFP ambiguity surface.
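The correlation of Eq. (63), scanned over delay only, can be sketched for sampled real-valued signals as follows. This is an illustrative sketch: the parameter set is reduced to a single integer sample delay, Doppler is ignored, and the function names are hypothetical.

```python
def replica_correlation(received, replica):
    """Discrete analog of Eq. (63) scanned over delay k: |sum_j r[k + j] s[j]|^2."""
    n, m = len(received), len(replica)
    return [
        abs(sum(received[k + j] * replica[j] for j in range(m))) ** 2
        for k in range(n - m + 1)
    ]

def estimate_delay(received, replica):
    """Return the delay (in samples) that maximizes the correlator output."""
    output = replica_correlation(received, replica)
    return max(range(len(output)), key=output.__getitem__)
```

For example, with the replica `[1, -1, 1]` and the attenuated, delayed echo `[0, 0, 0.5, -0.5, 0.5, 0]`, the correlator output peaks at a delay of two samples.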

For a matched filter operating in a background of white noise and detecting a point target in a given range-Doppler resolution cell, the detection signal-to-noise ratio depends on the average energy-to-noise ratio and not on the shape of the signal. The waveform becomes a factor when there is a reverberant environment and when one is concerned with estimating target range and Doppler. A waveform’s potential for range and Doppler resolution can be ascertained from the ambiguity function of the transmitted signal. This ambiguity function is related to the correlation process of Eq. (63) for a transmitted signal scanned as a function of range and Doppler,

$$\Theta(\hat{T}, T_t, \hat{f}_d, f_{dt}) \propto \left| \int s_t(t - T_t)\, s_t(t - \hat{T})\, e^{2\pi i (f_{dt} - \hat{f}_d) t}\, dt \right|^2 \tag{64}$$

where $T_t$, $f_{dt}$ are the true target range (time) and Doppler and $\hat{T}$, $\hat{f}_d$ are the scanning estimates of range and Doppler. Figure 23 shows sketches of the ambiguity functions of some typical waveforms. The range resolution is determined by the reciprocal of the bandwidth and the Doppler resolution by the reciprocal of the duration. The coded or PR (pseudo-random) sequence can attain good resolution of both by appearing as long duration noise with a wide bandwidth. Ambiguity functions can be used to design desirable waveforms for particular situations. However, one must also consider the randomizing effect of the real ocean. The scattering function describes how a transmitted signal statistically redistributes its energy in the reverberant ocean environment, which causes multipath and Doppler spread. In particular, in a reverberation limited environment, increasing the transmitted power alone does not change the signal-to-reverberation level. Signal design should minimize the overlap of the ambiguity function of the target, displaced to its range and Doppler, and the scattering function.

FIGURE 23 Ambiguity function for several sonar signals: (a) rectangular pulse; (b) coded pulses; (c) chirped FM pulse.

B. Comparison of Processing for Detection, Communications, and Seabed Mapping

An underwater acoustic communication system is an active sonar system utilizing a channel which has many multipaths. In the communication problem, different signals, using some form of frequency or phase shift algorithm, are transmitted depending upon the message, and one wants to identify all the paths that were excited by the signal. In the detection problem, the same signal is transmitted, and one wants to identify just those (range-Doppler) cells that are associated with the reflected energy of the target.

Furthermore, sequences of active signals can be used in both the target detection and the communication problem. For detecting a target, one wants to produce a track by smoothing a sequence of range-Doppler estimates. In communication systems, messages are often encoded into a sequence of transmissions for reliability and cryptographic concealment. For telemetry, equalization to compensate for multipath effects is often done without modeling the channel, but rather by using a reference signal or predefined sequence of symbols. Time varying intersymbol interference must be dealt with by some synchronization technique.


Maps and charts of the seabed are generated by plotting estimates of depths obtained by a sequence of reflections from the seabed. A single sounding does not convey an appreciable amount of information; a grid of soundings is usually required to determine the topographic relief. This necessitates a variety of processing methods; some simply compensate for the finite beamwidth of the sounding system, others compile the grid of points, interpolate the data, and contour the relief.

C. Travel Time Tomography

Tomography generally refers to applying some form of inverse theory to observations in order to infer properties of the propagation medium. The received field from a source emitting a pulse will be spread in time as a result of multipath structures in which different paths have different arrival times (or group speeds). Hence the arrival times are related to the acoustic sampling of the medium. In contrast, medical tomography utilizes the different attenuation of the paths rather than arrival time for the inversion process. Since the sound speed of the ocean is a function of temperature and other oceanographic parameters, the arrival structure can ultimately be related to a map of these oceanographic parameters. In its most primitive form, the inversion can be described by an algorithm discretizing the ocean into cells, computing travel times from candidate acoustic paths through these cells, and solving a set of equations which equate these travel times to the measured travel times. Each cell is characterized by an unknown sound speed. Typically, some baseline oceanographic information is known so that one searches for departures from this baseline information. Tomographic experiments have been performed to greater than megameter ranges. Pulse compression methods using sequences of signals are often employed in ocean tomographic experiments to enhance bandwidth and received signal strength.

VIII. APPENDIX: UNITS

The decibel (dB) is the dominant unit in underwater acoustics and denotes a ratio of intensities (not pressures) expressed in terms of a logarithmic (base 10) scale. Two intensities, $I_1$ and $I_2$, have a ratio, $I_1/I_2$, in decibels of $10 \log (I_1/I_2)$ dB. Absolute intensities can therefore be expressed by using a reference intensity. The presently accepted reference intensity is based on a reference pressure of one micropascal (µPa): the intensity of a plane wave having an rms pressure equal to $10^{-5}$ dynes per square centimeter. Therefore, taking the intensity of a plane wave of rms pressure 1 µPa as $I_2$, a sound wave having an intensity of, say, one million times that of the reference plane wave has a level of $10 \log(10^6/1) \equiv 60$ dB re 1 µPa. Pressure ($p$) ratios are expressed in dB re 1 µPa by taking $20 \log (p_1/p_2)$, where it is understood that the reference originates from the intensity of a plane wave of pressure equal to 1 µPa.

The average intensity, $I$, of a plane wave with rms pressure $p$ in a medium of density $\rho$ and sound speed $c$ is $I = p^2/\rho c$. In seawater, $\rho c$ is $1.5 \times 10^5$ g cm$^{-2}$ s$^{-1}$, so that a plane wave of rms pressure 1 dyne/cm$^2$ has an intensity of $0.67 \times 10^{-12}$ W/cm$^2$. Substituting the value of a micropascal for the rms pressure in the plane wave intensity expression, we find that a plane wave pressure of 1 µPa corresponds to an intensity of $0.67 \times 10^{-22}$ W/cm$^2$ (i.e., 0 dB re 1 µPa).
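The arithmetic in this appendix can be reproduced with a few lines of code. This is a minimal sketch using the seawater value of $\rho c$ quoted above; the function names are illustrative, and the factor $10^{-7}$ converts erg/s to watts.

```python
import math

RHO_C_SEAWATER = 1.5e5  # g cm^-2 s^-1, the value quoted in the text

def intensity_w_per_cm2(p_rms_dyn_per_cm2, rho_c=RHO_C_SEAWATER):
    """Plane-wave intensity I = p^2 / (rho c), converted from erg s^-1 cm^-2 to W/cm^2."""
    return p_rms_dyn_per_cm2 ** 2 / rho_c * 1e-7

def level_from_intensity_ratio(i1, i2):
    """Level in dB of intensity i1 relative to intensity i2: 10 log10(i1 / i2)."""
    return 10 * math.log10(i1 / i2)

def level_db_re_1upa(p_rms_upa):
    """Level in dB re 1 uPa of an rms pressure given in micropascals: 20 log10(p / 1)."""
    return 20 * math.log10(p_rms_upa / 1.0)
```

A pressure ratio of $10^3$ and an intensity ratio of $10^6$ both correspond to 60 dB, which is why the factor 20 is used for pressures and 10 for intensities.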

SEE ALSO THE FOLLOWING ARTICLES

ACOUSTICAL MEASUREMENT • LIQUIDS, STRUCTURE AND DYNAMICS • PHYSICAL OCEANOGRAPHY • SIGNAL PROCESSING, ACOUSTIC • SIGNAL PROCESSING, DIGITAL • SIGNAL PROCESSING, GENERAL • WAVE PHENOMENA



P1: GPA/GRI P2: GRB Final pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN017B-810 August 2, 2001 17:44

Vibration, Mechanical

Marie Dillon Dahleh
University of California, Santa Barbara

William T. Thomson
University of California, Santa Barbara, deceased

I. Definitions
II. Systems with One Degree of Freedom
III. Systems with Multiple Degrees of Freedom
IV. Continuous Systems: Normal Modes
V. Lagrange’s Equation: Generalized Coordinates
VI. Approximate and Numerical Methods
VII. Conclusions

GLOSSARY

Characteristic equation Algebraic equation from which natural frequencies are calculated.

Circular frequency Frequency measured in radians per second.

Eigenvalue Quantity associated with natural frequencies.

Eigenvector Vector column of natural modes.

Free vibration Vibration in the absence of external excitation.

Modal matrix Matrix of eigenvectors.

Mode Deflection shape.

Node Point of zero deflection.

Shock spectrum Plot of maximum peak response versus period in terms of pulse time.

Transient vibration Vibration due to shock.

Viscous damping Damping proportional to velocity.

VIBRATION is a back-and-forth oscillation about an equilibrium position with a wavelike character of periodicity. For example, the swinging motion of a pendulum is a vibration that is visible to the eye. It has a period of oscillation that can be measured by a stopwatch, and its amplitude of oscillation can be observed from the extreme excursion of the swing. The number of oscillations in a unit of time is called the frequency. More often, the vibration is so small in amplitude that it is not observable by the naked eye and its motion must be measured by a sensitive instrument. Sound is a vibration of the air, and its oscillation may be quite complicated, with its wave profile containing many frequencies. Everything in nature vibrates and has a rate of vibration. Vibration is a universal phenomenon since all bodies possessing mass and elasticity are capable of it.

I. DEFINITIONS

Mass and elasticity are the elements of a vibrating system. Their distribution establishes the natural frequencies of the system. The mass in motion possesses kinetic energy, whereas potential energy is stored in the deformation of the elastic element. In a conservative system a continuous interchange of kinetic and potential energy takes place under constant total energy. Actually, some dissipation of energy into heat or sound is encountered, diminishing the motion of the system. Such action is represented by a damper, the third element of the system. Thus, the basic dynamic model of a simple vibratory system consists of a mass, a massless spring, and a damper.

Oscillatory motions are generally periodic: that is, they repeat themselves in equal intervals of time called the period $\tau$. The motion completed during the period is referred to as the cycle. The number of complete cycles of motion in a unit of time is called the frequency of vibration $f$, which is the reciprocal of the period, $f = 1/\tau$.

Vibrations fall into two general classes: free and forced. Free vibration takes place when a system vibrates under the action of forces inherent in the system itself and when external impressed forces are absent. A system under free vibration vibrates at one or more of its natural frequencies, which are properties established by its mass and stiffness distribution.

Vibration that takes place under the excitation of external forces is called forced vibration. When the excitation is oscillatory, the system is forced to vibrate at the excitation frequency. If the excitation frequency coincides with one of the natural frequencies of the system, a condition of resonance is encountered, and dangerously large oscillations may result. Thus, the calculation of the natural frequencies of a system is of major importance in the study of vibration.

For the analysis of a vibrating system a mathematical model is required. Such a model is either discrete or continuous, the motion of which is described by coordinates. A discrete model requires a finite number of coordinates, whereas a continuous system requires an infinite number of coordinates.

The number of independent coordinates required to describe the motion of the system is called the degrees of freedom (DOF) of the system. Thus, an elastic body has an infinite number of DOF. However, such a body is often discretized to one having a finite number of DOF. In fact, a surprising number of vibration problems can be treated with sufficient accuracy by reducing a system to one having a few DOF.

II. SYSTEMS WITH ONE DEGREE OF FREEDOM

A. Free Vibration

1. Natural Frequency

The spring–mass–damper model of Fig. 1 is representative of the simplest vibration system. With its motion assumed to be restricted along a single coordinate $x$, the system has one DOF.

FIGURE 1 Model of a single DOFS.

The behavior of a single degree of freedom system (single DOFS) is of basic importance since coordinate transformation, to be discussed in Section III.A.3, allows systems of higher DOF to be mathematically treated in terms of equations corresponding to those of a single DOFS.

Measuring $x$ from the equilibrium position of the mass, its differential equation of motion under excitation $F(t)$ is

$$M\ddot{x} + C\dot{x} + Kx = F(t) \tag{1}$$

where the overdots indicate time derivatives and the terms on the left side are the inertia force, the damping force, and the spring force, respectively.

$K$ is the stiffness of the spring and $M$ is the mass. The viscous damping force $C\dot{x}$, proportional to the velocity, has been assumed for mathematical convenience. In reality the damping force is not known with any degree of accuracy, and the viscous assumption enables one to find a simple solution of acceptable accuracy for small damping. Other types of damping are addressed in Section II.B.5.

For free vibration the right-hand term $F(t)$ is zero. When the natural frequency is of primary concern, the damping term is also made equal to zero, since the effect of damping on the natural frequency is generally negligible. Equation (1) then becomes

$$\ddot{x} + \omega^2 x = 0 \tag{2}$$

Here, $\omega^2 = K/M$, where $\omega$ is the circular frequency ($2\pi f$). This equation is that of simple harmonic motion with general solution

$$x(t) = A \sin \omega t + B \cos \omega t$$

The two arbitrary constants $A$ and $B$ are solved from the initial conditions $x(0)$ and $\dot{x}(0)$ for the displacement and velocity, which results in the equation

$$x(t) = x(0) \cos \omega t + \frac{\dot{x}(0)}{\omega} \sin \omega t \tag{3}$$

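In harmonic motion, a complete cycle takes place when $\omega\tau = 2\pi$, so that the period $\tau$ and the natural frequency of vibration $f$ become

$$\begin{aligned} \tau &= 2\pi/\omega = 2\pi\sqrt{M/K} = \text{period} \\ f &= 1/\tau = (1/2\pi)\sqrt{K/M} = \text{natural frequency} \end{aligned} \tag{4}$$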


FIGURE 2 Stiffness of common elastic elements.

It is seen from Eq. (4) that the natural frequency of the single DOFS depends only on the stiffness $K$ of the spring and the mass $M$.
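Equations (3) and (4) translate directly into code. The following is a minimal sketch with illustrative function names; any consistent set of units may be used (for example, N/m and kg, giving rad/s, Hz, and s).

```python
import math

def natural_frequency(K, M):
    """Eq. (4): return (omega, f, tau) for stiffness K and mass M."""
    omega = math.sqrt(K / M)   # circular frequency
    f = omega / (2 * math.pi)  # natural frequency, cycles per unit time
    tau = 1 / f                # period
    return omega, f, tau

def free_response(t, x0, v0, omega):
    """Eq. (3): undamped free vibration from initial displacement x0 and velocity v0."""
    return x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)
```

For instance, $K = 400$ and $M = 4$ give $\omega = 10$, and the displacement returns to its initial value after one period $\tau = 2\pi/10$.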

Although the single DOFS may appear in various configurations, including rotation as well as translation, the form of the differential equation of motion is the same: that of the second-order ordinary differential equation. Stiffness $K$ for various spring configurations is presented in Fig. 2.

2. Energy Method

It is often convenient to examine the vibration problem from energy considerations. In a conservative system the total energy must remain constant. The energies involved are the kinetic energy $T$, due to the velocity of the mass, and the potential energy $U$ stored in the spring.

Since the reference for $U$ is arbitrary, it is convenient to choose it to be zero at the equilibrium position of the system. Then $U = 0$ at $x = 0$ and is a maximum, $U_{max}$, at the extreme position $x = x_{max}$. The kinetic energy $T$, on the other hand, is zero at the extreme position $x = x_{max}$, where the velocity of the mass is zero, and is a maximum, $T_{max}$, as the mass passes through the equilibrium position $x = 0$. Thus, the principle of conservation of energy requires that

Tmax = Umax (5)

Assuming sinusoidal motion $x = A \sin \omega t$,

$$\begin{aligned} T_{max} &= \left(\tfrac{1}{2} M \dot{x}^2\right)_{max} = \tfrac{1}{2} M \omega^2 A^2 \\ U_{max} &= \left(\tfrac{1}{2} K x^2\right)_{max} = \tfrac{1}{2} K A^2 \end{aligned} \tag{6}$$

Equating the two, the natural frequency is obtained as

$$\omega^2 = K/M$$

If the differential equation of motion is also desired, it can be obtained from $(d/dt)(T + U) = 0$.

FIGURE 3 Equivalent mass of a spring.

The simple relationship $\omega = \sqrt{K/M}$ of the single-DOF lumped-mass system can sometimes be extended to account for the mass in the elastic element for a more accurate value of the natural frequency. This is accomplished by assuming some reasonable deflection distribution of the mass in the elastic element and calculating its kinetic energy to establish an equivalent mass to add to the lumped mass $M$.

EXAMPLE. If the deflection of the spring in Fig. 3 is assumed to vary linearly from zero at the fixed end to $x$ at the point of attachment to the mass $M$, the maximum kinetic energy of the spring can be calculated to be $T_s = \tfrac{1}{2}(m_s/3)\omega^2 A^2$, where $m_s$ is the total mass of the spring. The equivalent mass is one-third the mass of the spring.

The equation for the natural frequency of the spring–mass system including the mass of the spring then becomes

$$\omega = \sqrt{\frac{K}{M + \tfrac{1}{3} m_s}}$$
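The one-third factor can be checked with a short calculation, sketched here under the example's own assumption of a linear deflection profile. The coordinate $y$ measured along the spring from its fixed end and the spring length $l$ are symbols introduced only for this sketch. An element $dy$ of the spring has mass $(m_s/l)\,dy$ and velocity $\dot{x}\,y/l$, so that

$$T_s = \int_0^l \frac{1}{2}\,\frac{m_s}{l}\left(\dot{x}\,\frac{y}{l}\right)^2 dy = \frac{1}{2}\,\frac{m_s}{3}\,\dot{x}^2 .$$

With $\dot{x}_{max} = \omega A$ the total maximum kinetic energy is $\tfrac{1}{2}(M + m_s/3)\,\omega^2 A^2$, and equating it to $U_{max} = \tfrac{1}{2} K A^2$ gives $\omega^2 = K/(M + m_s/3)$.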

FIGURE 4 Equivalent mass of a simply supported beam.

Similarly, the equivalent mass of the simply supported beam of Fig. 4 can be shown to be $0.4857 m_b$. For its calculation the statical deflection of the beam has been assumed, and its kinetic energy was expressed in terms of the deflection at midspan, where the beam stiffness is $K = 48EI/l^3$. The equation for the natural frequency including the mass $m_b$ of the beam is then

$$\omega = \sqrt{\frac{48EI/l^3}{M + 0.4857\, m_b}}$$
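The factor 0.4857 can be reproduced numerically. The sketch below assumes that the statical deflection is that of a simply supported beam under a concentrated load at midspan, whose shape over the half-span $0 \le u \le \tfrac{1}{2}$ (with $u$ the position divided by $l$) is $3u - 4u^3$ when normalized to unit midspan deflection; this shape is an assumption consistent with the stiffness $48EI/l^3$ rather than something stated in the text, and the function names are illustrative.

```python
import math

def beam_equivalent_mass_fraction(n=10000):
    """Equivalent mass of the beam as a fraction of m_b.

    Midpoint-rule integral of the squared deflection shape (3u - 4u^3) over
    half the span, doubled by symmetry. The shape is the static deflection
    under a central point load, normalized to 1 at midspan (an assumption).
    """
    h = 0.5 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        shape = 3 * u - 4 * u ** 3
        total += shape ** 2 * h
    return 2 * total

def beam_natural_frequency(E, I, l, M, m_b):
    """Natural frequency of a mass M at midspan, including the beam mass m_b."""
    K = 48 * E * I / l ** 3
    return math.sqrt(K / (M + beam_equivalent_mass_fraction() * m_b))
```

The integral evaluates to $17/35 \approx 0.4857$, in agreement with the value quoted above.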

3. Time Response with Damping

The influence of damping on the free vibration can be studied by examining the homogeneous equation

$$M\ddot{x} + C\dot{x} + Kx = 0 \tag{7}$$

The usual treatment of this equation is to assume a solution of the form $x = e^{st}$. Its substitution into the differential equation results in the characteristic equation

$$Ms^2 + Cs + K = 0 \tag{8}$$

The two roots of this equation are

$$s_{1,2} = -\frac{C}{2M} \pm \sqrt{\left(\frac{C}{2M}\right)^2 - \frac{K}{M}} \tag{9}$$

and the behavior of the system is dependent on the numerical value of the radical, which can be zero, positive, or imaginary. Only if the radical is imaginary will the system be oscillatory.

When the radical is zero,

$$\left(\frac{C}{2M}\right)^2 = \frac{K}{M}$$

This value of $C$ is called critical damping,

$$C_c = 2\sqrt{KM} = 2M\omega \tag{10}$$

which represents the limiting case between oscillatory and nonoscillatory motion. It is convenient, then, to consider all cases in terms of the critical damping $C_c$ by introducing a nondimensional term $\zeta = C/C_c$, which is called the damping ratio or damping factor. The two roots of $s$ can then be written

$$s_{1,2} = \left(-\zeta \pm i\sqrt{1 - \zeta^2}\right)\omega \tag{11}$$

and the three cases of damping previously mentioned depend on whether $\zeta$ is greater than, less than, or equal to unity.

Of greatest interest is the oscillatory case in which $\zeta$ is less than 1. One form of the general solution for $\zeta < 1$ is

$$x(t) = A e^{-\zeta\omega t} \sin\left(\sqrt{1 - \zeta^2}\,\omega t + \phi\right) \tag{12}$$

where $A$ and $\phi$ are arbitrary constants. It is evident from this equation that the frequency of the damped oscillation is slightly lowered by damping and is equal to

$$\omega_d = \omega\sqrt{1 - \zeta^2}$$

FIGURE 5 Damped free vibration.

Figure 5 shows a typical plot of a damped oscillation. The decay of oscillation shown here leads to another convenient measure of damping. Defining the natural logarithm of the ratio of any two successive amplitudes as the logarithmic decrement $\delta$, it is easily shown from Eq. (12) that

$$\delta = \ln(x_i/x_{i+1}) = \zeta\omega\tau_d$$

Substituting for the damped period $\tau_d = 2\pi/\left(\omega\sqrt{1 - \zeta^2}\right)$, the expression for the logarithmic decrement becomes

$$\delta = \frac{2\pi\zeta}{\sqrt{1 - \zeta^2}} \cong 2\pi\zeta \tag{13}$$
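The relations for the underdamped case can be collected in a short numerical sketch. The function names are illustrative; `omega` denotes the undamped natural frequency $\sqrt{K/M}$, and the expressions are valid only for $\zeta < 1$.

```python
import math

def damping_ratio(C, K, M):
    """zeta = C / C_c with the critical damping C_c = 2 sqrt(K M) of Eq. (10)."""
    return C / (2 * math.sqrt(K * M))

def damped_period(zeta, omega):
    """tau_d = 2 pi / (omega sqrt(1 - zeta^2)), valid for zeta < 1."""
    return 2 * math.pi / (omega * math.sqrt(1 - zeta ** 2))

def log_decrement(zeta):
    """Eq. (13): delta = 2 pi zeta / sqrt(1 - zeta^2)."""
    return 2 * math.pi * zeta / math.sqrt(1 - zeta ** 2)

def damped_response(t, A, phi, zeta, omega):
    """Eq. (12): x(t) = A exp(-zeta omega t) sin(sqrt(1 - zeta^2) omega t + phi)."""
    omega_d = omega * math.sqrt(1 - zeta ** 2)
    return A * math.exp(-zeta * omega * t) * math.sin(omega_d * t + phi)
```

Evaluating Eq. (12) at two instants one damped period apart and taking the logarithm of their ratio recovers the logarithmic decrement; for $\zeta = 0.05$ the exact value differs from the approximation $2\pi\zeta$ by about 0.1%.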

B. Forced Vibration

1. Harmonic Excitation

When a system is subjected to forced harmonic excitation, it vibrates at the same frequency as that of the excitation. If the excitation frequency coincides with the natural frequency of the system, large amplitudes may result, and dampers and absorbers are often used to prevent dangerous conditions.

With the system of Fig. 1 excited by a harmonic force $F(t) = F_0 \sin \omega t$, the differential equation of motion is

$$M\ddot{x} + C\dot{x} + Kx = F_0 \sin \omega t \tag{14}$$

The steady-state solution to this equation can be written in the form

$$x(t) = X \sin(\omega t - \phi) \tag{15}$$

where the amplitude $X$ and the phase $\phi$ by which the displacement lags the force are given as

$$X = \frac{F_0}{\sqrt{(K - M\omega^2)^2 + (C\omega)^2}}, \qquad \phi = \tan^{-1}\frac{C\omega}{K - M\omega^2} \tag{16}$$

These results are shown graphically in Fig. 6. It is seen that the amplitude at resonance, $\omega = \omega_1$ (where $\omega_1 = \sqrt{K/M}$ is the natural frequency), increases with decreasing damping and becomes infinite when the damping ratio $\zeta = 0$.
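Equation (16) is easily evaluated numerically. In the sketch below (function name illustrative) the phase angle is computed with a two-argument arctangent so that it runs continuously from 0 through $\pi/2$ at resonance toward $\pi$ at high frequency; that choice of branch is an assumption of this sketch.

```python
import math

def harmonic_response(F0, K, M, C, omega):
    """Amplitude X and phase angle phi of Eq. (16) for excitation frequency omega."""
    real_part = K - M * omega ** 2  # stiffness minus inertia term
    imag_part = C * omega           # damping term
    X = F0 / math.hypot(real_part, imag_part)
    phi = math.atan2(imag_part, real_part)
    return X, phi
```

At resonance the bracket $K - M\omega^2$ vanishes, so the amplitude is limited by damping alone, $X = F_0/(C\omega_1)$, and the phase angle is $\pi/2$.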

2. Rotating Unbalance

Vibration excitation may be the result of unbalance in rotating machinery.

As seen in Fig. 7, the unbalance is represented by an eccentric mass $m$ with eccentricity $e$ that is rotating with angular velocity $\omega$. The exciting force will then be $F = me\omega^2 \sin \omega t$, so that one need only replace $F_0$ in the previous section by $me\omega^2$, or

$$X = \frac{me\omega^2}{\sqrt{(K - M\omega^2)^2 + (C\omega)^2}} \tag{17}$$

FIGURE 6 Response due to force excitation.

The phase angle is not altered, and the response can be plotted as in Fig. 8.

3. Support Excitation

In the case where the system is excited by the motion of the support point, as shown in Fig. 9, the equation of motion becomes

$$M\ddot{x} = -K(x - y) - C(\dot{x} - \dot{y})$$

Making the substitution $z = x - y$, the equation can be rewritten as

$$M\ddot{z} + C\dot{z} + Kz = -M\ddot{y} = M\omega^2 Y \sin \omega t \tag{18}$$

where $y = Y \sin \omega t$ has been assumed for the motion of the base. This equation is similar in form to that of the force excitation, where $z$ replaces $x$ and $M\omega^2 Y$ replaces $F_0$. Thus, a similar solution is expected for the relative displacement $z$. When $Z$ is expressed in terms of $X$ and $Y$, the result is a somewhat different equation,

$$\left|\frac{X}{Y}\right| = \sqrt{\frac{K^2 + (\omega C)^2}{(K - \omega^2 M)^2 + (\omega C)^2}} \tag{19}$$

which is plotted in Fig. 10. It should be noted that the amplitude curves for different damping all have $|X/Y| = 1$ at $\omega/\omega_1 = \sqrt{2}$.

FIGURE 7 Harmonic disturbing force resulting from rotating unbalance.

FIGURE 8 Response due to rotating unbalance.
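The common crossing point at $\sqrt{2}$ can be checked numerically. The sketch below rewrites Eq. (19) in terms of the frequency ratio $r = \omega/\omega_1$ and the damping ratio $\zeta = C/C_c$, using $\omega C/K = 2\zeta r$; the function name is illustrative.

```python
import math

def transmissibility(r, zeta):
    """|X/Y| of Eq. (19) expressed with r = omega/omega_1 and zeta = C/C_c."""
    damping_term = (2 * zeta * r) ** 2
    return math.sqrt((1 + damping_term) / ((1 - r ** 2) ** 2 + damping_term))
```

At $r = \sqrt{2}$ the numerator and denominator are equal for every $\zeta$, so all curves pass through $|X/Y| = 1$; below that ratio the motion is amplified and above it the motion is attenuated.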

4. Vibration Isolation

The results of the last section enable one to understand the basis of vibration isolation. Vibration isolation attempts either to protect a delicate object from excessive vibration transmitted to it from its supporting structure or to prevent vibratory forces generated by machines from being transmitted to their surroundings. The basic problem is the same for these two objectives: that of reducing the transmitted forces.

FIGURE 9 System under support motion.

FIGURE 10 Response due to support excitation.

Figure 10 shows that the ratio $|X/Y|$ for the motion transmitted from the vibrating floor to the supported mass is less than 1.0 when the frequency ratio $\omega/\omega_1$ is greater than $\sqrt{2}$. This means that the natural frequency $\omega_1 = \sqrt{K/M}$ of the suspended system must be low in comparison with the disturbing frequency $\omega$. This can be accomplished by a soft spring (small $K$).

The second problem, that of reducing the force transmitted by a machine to the floor, has the same requirement. The force to be isolated is transmitted by the spring–damper system, which has the value

$$F_T = \sqrt{(KX)^2 + (C\omega X)^2} = X\sqrt{K^2 + (C\omega)^2}$$

Substituting for the displacement $X$ produced by the exciting force $F$, which is

$$X = \frac{F}{\sqrt{(K - M\omega^2)^2 + (C\omega)^2}}$$

the ratio of the transmitted force $F_T$ to the exciting force $F$ is

$$\frac{F_T}{F} = \frac{\sqrt{K^2 + (\omega C)^2}}{\sqrt{(K - M\omega^2)^2 + (C\omega)^2}} \tag{20}$$

which is identical to the equation for $|X/Y|$ plotted as Fig. 10.
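The behavior of Eqs. (19) and (20) can be checked numerically. The following is a minimal sketch, not part of the original article: it assumes the usual definitions $\omega_1 = \sqrt{K/M}$ and $\zeta = C/(2\sqrt{KM})$, so that $C\omega/K = 2\zeta r$ with $r = \omega/\omega_1$. The function name and the damping values are illustrative only.

```python
import math


def transmissibility(r, zeta):
    """Return |X/Y| = F_T/F for frequency ratio r = omega/omega_1 and damping factor zeta."""
    # Eq. (19)/(20) divided through by K, using C*omega/K = 2*zeta*r
    # and M*omega^2/K = r^2 (assumed definitions, see the lead-in).
    numerator = 1.0 + (2.0 * zeta * r) ** 2
    denominator = (1.0 - r ** 2) ** 2 + (2.0 * zeta * r) ** 2
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    for zeta in (0.05, 0.2, 0.5):  # illustrative damping factors
        at_crossover = transmissibility(math.sqrt(2.0), zeta)
        isolated = transmissibility(3.0, zeta)
        print(zeta, at_crossover, isolated)
```

For every damping value the first printed ratio should equal 1 (the common crossing point at $r = \sqrt{2}$), and the second should be below 1, which is the isolation region described above.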

Another solution to either problem is to mount the system to be isolated on a heavy block supported on a cushion of soft material like spongy rubber or felt. In this way $M$ is increased to lower the natural frequency and increase the frequency ratio $\omega/\omega_1$.

Since in the general problem the mass to be isolated has six DOF (three translation and three rotation), the designer of the isolator system must use intuition and ingenuity. The results of the single-DOF analysis should, however, serve as a useful guide.

5. Equivalent Damping

Damping is present in all oscillatory systems. The decay of amplitude in free vibration is due to the loss of energy by damping. In forced steady-state vibration the loss of energy is balanced by the energy supplied by the excitation force.

There are many different kinds of damping forces, from internal molecular friction to sliding friction and fluid resistance. Their exact mathematical description is difficult, and a practical approach is to utilize the concept of equivalent viscous damping based on equal energy dissipation.

For this we need $W_d$, the energy dissipated per cycle by viscous damping under harmonic oscillation $x = X \sin(\omega t - \phi)$. Its substitution into the work equation results in

$$W_d = \oint (C\dot{x})\,dx = \oint C\dot{x}^2\,dt = \pi C\omega X^2$$

For the equivalent viscous damping, we then write

$$\pi C_{eq}\,\omega X^2 = W_d \tag{21}$$

where $W_d$ is now the energy dissipated by any damping force. Of course, a simple relationship for $W_d$ may not be available, but $C_{eq}$ found in this manner enables one to use all the elementary equations of forced vibration of the previous sections.

Mentioned here is just one form of damping, that of solid damping encountered by many structural materials such as steel and aluminum and often referred to as structural or hysteresis damping. For these materials the energy dissipated per cycle is independent of the frequency over a wide frequency range and proportional to the square of the amplitude of vibration:

$$W_d = \alpha X^2 \tag{22}$$


By comparison of $W_d$ in the two equations, the equivalent viscous damping coefficient is $C_{eq} = \alpha/\pi\omega$, and the damping force becomes

$$F_d = (\alpha/\pi)\,X \cos(\omega t - \phi)$$

Thus, for structural damping, the amplitude and phase under steady-state harmonic force become

$$X = \frac{F}{\sqrt{(K - M\omega^2)^2 + (\alpha/\pi)^2}}$$

$$\phi = \tan^{-1}\frac{\alpha/\pi}{K - M\omega^2}$$

By letting $\gamma = \alpha/\pi K$, the differential equation of motion can be written

$$M\ddot{x} + K(1 + i\gamma)x = Fe^{i\omega t} \tag{23}$$

The quantity $K(1 + i\gamma)$ is called the complex stiffness and $\gamma$ the structural damping factor. The relation between $\gamma$ and the viscous damping factor $\zeta$ is easily found by comparing the response at resonance:

$$X = F/2\zeta K, \quad \text{amplitude at resonance (viscous)}$$

$$X = F/\gamma K, \quad \text{amplitude at resonance (structural)}$$

Thus, it can be concluded that the structural damping factor $\gamma$ is equal to twice the viscous damping factor $\zeta$ on the basis of equal resonant amplitudes.
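The energy relations behind Eqs. (21) and (22) can be verified by direct numerical integration over one cycle. This is only a sketch under stated assumptions: the function names, the midpoint-rule integration, and all numerical values are illustrative and do not come from the article.

```python
import math


def energy_per_cycle(C, omega, X, n=2000):
    """Integrate C * xdot^2 dt over one period for x = X sin(omega t) (midpoint rule)."""
    period = 2.0 * math.pi / omega
    dt = period / n
    total = 0.0
    for step in range(n):
        t = (step + 0.5) * dt
        velocity = omega * X * math.cos(omega * t)
        total += C * velocity * velocity * dt
    return total


def equivalent_viscous_damping(alpha, omega):
    """C_eq = alpha / (pi * omega), from pi*C_eq*omega*X^2 = alpha*X^2."""
    return alpha / (math.pi * omega)


if __name__ == "__main__":
    C, omega, X, alpha = 3.0, 5.0, 0.2, 4.0  # hypothetical values
    print(energy_per_cycle(C, omega, X), math.pi * C * omega * X ** 2)
    C_eq = equivalent_viscous_damping(alpha, omega)
    print(energy_per_cycle(C_eq, omega, X), alpha * X ** 2)
```

Each printed pair should agree: the first confirms $W_d = \pi C\omega X^2$, and the second confirms that $C_{eq} = \alpha/\pi\omega$ dissipates the structural-damping energy $\alpha X^2$ per cycle.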

C. Transient Vibration

When a dynamic system is excited by a non-periodic force, such as a suddenly applied impact, steady-state oscillations are not produced and the resulting motion is called a transient response. The transient response of a simple spring–mass system of single DOF will be discussed in this section since its behavior is basic to the transient behavior of the more complex system.

1. Impulse Response

Impulse is the time integral of a force as expressed by the equation

$$\hat{F} = \int F(t)\,dt \tag{24}$$

When a force of very large magnitude acts for a very short time, its time integral can be finite. Such a force is described as impulsive. Letting the magnitude of the impulsive force be $\hat{F}/\varepsilon$ over a time duration of $\varepsilon$, its limiting case $\varepsilon \to 0$ with impulse value of unity is called the unit impulse or the Dirac delta function.

A delta function at $t = \xi$ is identified by the symbol $\delta(t - \xi)$ and has the following properties:

$$\delta(t - \xi) = 0 \quad \text{for all } t \neq \xi$$

$$\int_0^\infty \delta(t - \xi)\,dt = 1.0, \quad 0 < \xi < \infty$$

Thus, when $\delta(t - \xi)$ is multiplied by any time function $f(t)$, its product will be zero everywhere except at $t = \xi$, and its time integral will be

$$\int_0^\infty f(t)\,\delta(t - \xi)\,dt = f(\xi), \quad 0 < \xi < \infty$$

Since impulse is equal to the change in momentum, $\hat{F}$ acting on a mass will result in a sudden change in its velocity equal to $\hat{F}/M$ without an appreciable change in its displacement. From Eq. (3) for the free vibration of an undamped spring–mass system with initial conditions $x(0)$ and $\dot{x}(0)$, it is evident that the response of the spring–mass system initially at rest and excited by an impulse $\hat{F}$ is

$$x = (\hat{F}/M\omega_1) \sin \omega_1 t \tag{25}$$

The oscillation that takes place is at the natural frequency $\omega_1$.

Similarly, the response of a damped spring–mass system can be determined from Eq. (12) to be

$$x = \frac{\hat{F}}{M\omega_1\sqrt{1 - \zeta^2}}\, e^{-\zeta\omega_1 t} \sin\left(\sqrt{1 - \zeta^2}\,\omega_1 t\right) \tag{26}$$

2. Arbitrary Excitation

Letting $h(t)$ be the response to a unit impulse, the response to an impulse $\hat{F}$ becomes $x(t) = \hat{F}h(t)$. The response to an arbitrary force $f(t)$ can be established in terms of $h(t)$ by considering $f(t)$ to be a series of impulses of strength $\hat{F} = f(\xi)\,\Delta\xi$, as shown in Fig. 11. Clearly, the response to the impulse at $t = \xi$ is

$$f(\xi)\,\Delta\xi\, h(t - \xi)$$

where $(t - \xi)$ is the elapsed time after the impulse. For a linear system the principle of superposition holds, and the total response at time $t$ is found by summing all such contributions as

$$x(t) = \int_0^t f(\xi)\,h(t - \xi)\,d\xi \tag{27}$$

FIGURE 11 Response to arbitrary excitation.


This integral is known as the convolution integral. Since $f(\xi) = 0$ for times greater than the pulse duration $t_p$, the upper limit of the integral remains constant at $t_p$ for $t > t_p$. Another form of this integral can be written by substituting $\tau = t - \xi$, which leads to

$$x(t) = \int_0^t f(t - \tau)\,h(\tau)\,d\tau \tag{28}$$

EXAMPLE. Determine the response of an undamped single DOFS to a step function of magnitude $F_0$. The response to a unit impulse is $h(t) = (1/M\omega_1) \sin \omega_1 t$. Substituting into the convolution integral, the result is

$$x(t) = \frac{F_0}{M\omega_1}\int_0^t \sin \omega_1(t - \xi)\,d\xi = (F_0/K)(1 - \cos \omega_1 t)$$
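The step-response example can be reproduced by evaluating the convolution integral of Eq. (27) numerically. The sketch below is illustrative only: the midpoint-rule quadrature, the helper name `duhamel`, and the values of $M$, $K$, and $F_0$ are assumptions, not taken from the article.

```python
import math


def duhamel(f, h, t, n=2000):
    """Approximate x(t) = integral_0^t f(xi) h(t - xi) d(xi) with the midpoint rule."""
    if t <= 0.0:
        return 0.0
    d_xi = t / n
    total = 0.0
    for step in range(n):
        xi = (step + 0.5) * d_xi
        total += f(xi) * h(t - xi)
    return total * d_xi


M, K, F0 = 2.0, 50.0, 10.0          # hypothetical mass, stiffness, step magnitude
w1 = math.sqrt(K / M)                # undamped natural frequency


def unit_impulse_response(t):
    """h(t) for the undamped single-DOF system."""
    return math.sin(w1 * t) / (M * w1)


def step_force(t):
    """Step function of magnitude F0 applied at t = 0."""
    return F0


if __name__ == "__main__":
    t = 0.7
    print(duhamel(step_force, unit_impulse_response, t))
    print((F0 / K) * (1.0 - math.cos(w1 * t)))
```

The two printed values should agree closely, the first being the numerical convolution and the second the closed-form result $(F_0/K)(1 - \cos\omega_1 t)$. Swapping in the damped $h(t)$ gives the damped response by the same procedure.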

If the response of a damped system is desired, this procedure is repeated with

$$h(t) = \frac{e^{-\zeta\omega_1 t}}{M\omega_1\sqrt{1 - \zeta^2}} \sin\left(\sqrt{1 - \zeta^2}\,\omega_1 t\right) \tag{29}$$

III. SYSTEMS WITH MULTIPLE DEGREES OF FREEDOM

A multi-DOFS requires more than one coordinate to describe its motion. Such systems differ from the single DOFS in that an $n$ DOFS has $n$ natural frequencies, and for each of the natural frequencies there corresponds a natural state of vibration with a displacement configuration known as a normal mode. The mathematical terms for these quantities are eigenvalues and eigenvectors. They are established from the $n$ simultaneous differential equations of motion and possess certain dynamic properties associated with the system.

The normal mode vibrations are free vibrations that depend only on the mass and stiffness distribution of the system. They are of importance not only in establishing the spectrum of resonance but also in the fact that forced vibration can be analyzed in terms of normal modes.

For an $n$ DOFS there will be a set of $n$ simultaneous differential equations to solve. To carry out this task efficiently, matrix methods are employed. They provide a compact notation and an organized method for the solution of linear simultaneous equations.

A. Two Degrees of Freedom

Since the two DOFS is a special case of the multi-DOFS, all of the concepts of the multi-DOFS apply to the two DOFS. Its discussion at this point is warranted in that simple analytic solutions with numerical examples are easily obtained for the two DOFS, whereas this is not the case for the larger systems. Computers are mandatory for higher order multi-DOFS.

A two DOFS has two natural frequencies at which the motion displays two distinct modes of oscillation called normal modes. These normal modes can be produced by proper initial conditions. For a more general initial condition, the free vibration produced will be the superposition of the two normal modes. As in the single DOFS, forced harmonic motion will take place at the excitation frequency, and the amplitude will increase to a maximum at the two natural frequencies.

1. Natural Frequencies and Mode Shapes

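The system shown in Fig. 12 requires two coordinates to describe its motion and hence it has two DOF. Applying Newton's laws of motion, we can write the following two equations:

$$m_1\ddot{x}_1 = -k_1 x_1 + k_2(x_2 - x_1)$$

$$m_2\ddot{x}_2 = -k_2(x_2 - x_1) - k_3 x_2$$

These can be expressed in matrix notation as

$$\begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix}\begin{Bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{Bmatrix} + \begin{bmatrix} (k_1 + k_2) & -k_2 \\ -k_2 & (k_2 + k_3) \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \tag{30}$$

For the normal mode vibration, each mass undergoes harmonic motion of the same frequency $\omega$. To find the normal mode frequencies, substitute $x_j = A_j e^{i\omega t}$ for $j = 1, 2$ into Eq. (30), which then becomes

$$\begin{bmatrix} (k_1 + k_2 - \omega^2 m_1) & -k_2 \\ -k_2 & (k_2 + k_3 - \omega^2 m_2) \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} \tag{31}$$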

The natural frequencies $\omega_1$ and $\omega_2$ are found from the characteristic equation given by the determinant

$$\begin{vmatrix} (k_1 + k_2 - \omega^2 m_1) & -k_2 \\ -k_2 & (k_2 + k_3 - \omega^2 m_2) \end{vmatrix} = 0 \tag{32}$$

and the normal modes are found by solving for the ratio $x_1/x_2$ in any one of the equations of motion and presented as a column matrix $\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}$.

FIGURE 12 System with two DOF.

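EXAMPLE. Illustrating by numbers, assume all the stiffnesses and masses to be equal to $k$ and $m$, respectively. We then have

$$\begin{bmatrix} (2 - \omega^2 m/k) & -1 \\ -1 & (2 - \omega^2 m/k) \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$$

Letting $\lambda = \omega^2 m/k$, the characteristic equation is

$$\lambda^2 - 4\lambda + 3 = 0$$

which is satisfied by $\lambda = 1$ and $\lambda = 3$. The two natural frequencies are then obtained from

$$\omega_1^2 = k/m$$

$$\omega_2^2 = 3k/m$$

Substituting these back into the equations of motion, the normal modes are found to be

$$\text{Mode 1:}\quad \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}^{(1)} = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}$$

$$\text{Mode 2:}\quad \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}^{(2)} = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix}$$

In a linear system, the principle of superposition holds, and the sum of the normal modes will also be a solution. Thus, any free vibration for this system can be written

$$\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = A_1 \begin{Bmatrix} 1 \\ 1 \end{Bmatrix} \sin(\omega_1 t - \varphi_1) + A_2 \begin{Bmatrix} 1 \\ -1 \end{Bmatrix} \sin(\omega_2 t - \varphi_2)$$

where the arbitrary constants $A_i$ and $\varphi_i$ are determined by the initial conditions.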
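The worked example can be reproduced for arbitrary masses and springs by solving the characteristic equation (32) as a quadratic in $\lambda = \omega^2$. The following is a minimal sketch; the function name and the normalization of each mode to $x_1 = 1$ are illustrative choices, not the article's own procedure.

```python
import math


def two_dof_modes(m1, m2, k1, k2, k3):
    """Return ([lambda_1, lambda_2], [mode_1, mode_2]) for the system of Fig. 12.

    lambda = omega^2; each mode is (x1, x2) scaled so that x1 = 1.
    """
    k11, k12, k22 = k1 + k2, -k2, k2 + k3
    # det(K - lambda*M) = a*lambda^2 + b*lambda + c for a diagonal mass matrix
    a = m1 * m2
    b = -(m1 * k22 + m2 * k11)
    c = k11 * k22 - k12 * k12
    root = math.sqrt(b * b - 4.0 * a * c)
    lams = [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]
    # First row of Eq. (31): (k11 - lambda*m1)*x1 + k12*x2 = 0, with x1 = 1
    modes = [(1.0, (k11 - lam * m1) / (-k12)) for lam in lams]
    return lams, modes


if __name__ == "__main__":
    lams, modes = two_dof_modes(1.0, 1.0, 1.0, 1.0, 1.0)  # k = m = 1
    print(lams)   # omega^2 in units of k/m
    print(modes)
```

With equal masses and springs this should return $\lambda = 1$ and $\lambda = 3$ with mode shapes $(1, 1)$ and $(1, -1)$, matching the example above.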

2. Choice of Coordinates

In general, the equations of motion

$$[M]\{\ddot{q}\} + [K]\{q\} = \{0\} \tag{33}$$

are coupled. If both the $M$ and $K$ matrices in Eq. (33) are diagonal, the equations of motion are decoupled, and each equation can be solved independently, as in the single-DOF case. Thus, coupling of coordinates arises from the off-diagonal terms.

When off-diagonal terms appear in the mass matrix, the system is said to have dynamic coupling. This is equivalent to having cross-products of coordinates in the equation for the kinetic energy.

If off-diagonal terms appear in the stiffness matrix, the system is said to have static coupling. A system with static coupling will have cross-products of coordinates in the potential energy equation.

FIGURE 13 Choice of coordinates (c.g., center of gravity).

EXAMPLE. In this example coordinates are chosen in two different ways, leading to static coupling and dynamic coupling.

In Fig. 13a the displacement $x$ is chosen at the center of gravity of the bar. Its equation of motion,

$$\begin{bmatrix} m & 0 \\ 0 & J_{c.g.} \end{bmatrix}\begin{Bmatrix} \ddot{x} \\ \ddot{\theta} \end{Bmatrix} + \begin{bmatrix} (k_1 + k_2) & (k_2 l_2 - k_1 l_1) \\ (k_2 l_2 - k_1 l_1) & (k_1 l_1^2 + k_2 l_2^2) \end{bmatrix}\begin{Bmatrix} x \\ \theta \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$$

where $J_{c.g.}$ is the mass moment of inertia, shows static coupling.

In Fig. 13b a point $c$ along the bar is chosen where a force applied normal to the bar produces pure translation; that is, $k_1 l_3 = k_2 l_4$. Its equation of motion,

$$\begin{bmatrix} m & me \\ me & J_e \end{bmatrix}\begin{Bmatrix} \ddot{x} \\ \ddot{\theta} \end{Bmatrix} + \begin{bmatrix} (k_1 + k_2) & 0 \\ 0 & (k_1 l_3^2 + k_2 l_4^2) \end{bmatrix}\begin{Bmatrix} x \\ \theta \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$$

shows dynamic coupling.

If the coordinate $x$ is measured at the end of the bar, both the mass and stiffness matrices will be full, and dynamic and static coupling will appear simultaneously.

The choice of coordinates is arbitrary and does not affect the nature of the vibration. Regardless of the coordinates chosen, the two natural frequencies and their normal modes remain unchanged.

3. Normal Coordinates

It has been demonstrated that the elements of the mass and stiffness matrices depend on the choice of coordinates. It can be shown that there is a set of coordinates, called principal or normal coordinates, that will diagonalize the $M$ and $K$ matrices and thereby decouple the equations of motion.

EXAMPLE. For the system of Fig. 12 with $k_i = k$ and $m_i = m$, the equation of motion

$$m\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{Bmatrix} + k\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$$

shows static coupling. When written out, these are


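$$m\ddot{x}_1 + 2kx_1 - kx_2 = 0$$

$$m\ddot{x}_2 - kx_1 + 2kx_2 = 0$$

Adding and subtracting these equations, we obtain a new set of equations:

$$m(\ddot{x}_1 + \ddot{x}_2) + 2k(x_1 + x_2) - k(x_1 + x_2) = 0$$

$$m(\ddot{x}_1 - \ddot{x}_2) + 2k(x_1 - x_2) + k(x_1 - x_2) = 0$$

Thus, if we let

$$p_1 = x_1 + x_2$$

$$p_2 = x_1 - x_2$$

the above equations become

$$m\ddot{p}_1 + kp_1 = 0$$

$$m\ddot{p}_2 + 3kp_2 = 0$$

or in matrix notation

$$m\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} \ddot{p}_1 \\ \ddot{p}_2 \end{Bmatrix} + k\begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}\begin{Bmatrix} p_1 \\ p_2 \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}$$

The equations in the normal coordinates $p$ are now decoupled, with each equation corresponding to one of single DOF.

The transformation between the $x$ and the $p$ coordinates in matrix form is

$$\begin{Bmatrix} p_1 \\ p_2 \end{Bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}$$

and its inverse is

$$\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{Bmatrix} p_1 \\ p_2 \end{Bmatrix}$$

Note that each column of the transformation matrix corresponds to one of the normal modes of the system.

For a more complex set of equations, this technique for finding a set of normal coordinates would not be practical. In Section III.B.4, after a discussion of the orthogonality of normal modes, a more general method of normalizing the equations of motion will be presented.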

4. Forced Harmonic Motion

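The matrix equation for the two DOFS excited by a harmonic force acting on mass $m_1$ is

$$\begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix}\begin{Bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{Bmatrix} + \begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} F_1 \\ 0 \end{Bmatrix} \sin \omega t \tag{34}$$

Since in forced vibration the system responds at the same frequency as that of the excitation, we can assume the solution in the form

$$\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} X_1 \\ X_2 \end{Bmatrix} \sin \omega t$$

Substituting this equation into the equation of motion, we obtain

$$\begin{bmatrix} (k_{11} - m_1\omega^2) & k_{12} \\ k_{21} & (k_{22} - m_2\omega^2) \end{bmatrix}\begin{Bmatrix} X_1 \\ X_2 \end{Bmatrix} = \begin{Bmatrix} F_1 \\ 0 \end{Bmatrix} \tag{35}$$

which can be abbreviated as

$$[Z(\omega)]\begin{Bmatrix} X_1 \\ X_2 \end{Bmatrix} = \begin{Bmatrix} F_1 \\ 0 \end{Bmatrix}$$

Premultiplying by the inverse and noting that $[Z(\omega)]^{-1} = \mathrm{adj}[Z(\omega)]/|Z(\omega)|$, we obtain

$$\begin{Bmatrix} X_1 \\ X_2 \end{Bmatrix} = [Z(\omega)]^{-1}\begin{Bmatrix} F_1 \\ 0 \end{Bmatrix} = \frac{\mathrm{adj}[Z(\omega)]}{|Z(\omega)|}\begin{Bmatrix} F_1 \\ 0 \end{Bmatrix} \tag{36}$$

Here, adj[ ] denotes the adjoint matrix.

To express this equation in another form, we note that $|Z(\omega)| = 0$ is the characteristic equation yielding the roots $\omega_1$ and $\omega_2$. Thus, it is possible to rewrite $|Z(\omega)|$ as

$$|Z(\omega)| = m_1 m_2(\omega_1^2 - \omega^2)(\omega_2^2 - \omega^2)$$

The adjoint $\mathrm{adj}[Z(\omega)]$ is

$$\begin{bmatrix} (k_{22} - m_2\omega^2) & -k_{12} \\ -k_{21} & (k_{11} - m_1\omega^2) \end{bmatrix}$$

Thus, the equations for $X_1$ and $X_2$ become

$$\begin{Bmatrix} X_1 \\ X_2 \end{Bmatrix} = \frac{1}{m_1 m_2(\omega_1^2 - \omega^2)(\omega_2^2 - \omega^2)}\begin{bmatrix} (k_{22} - m_2\omega^2) & -k_{12} \\ -k_{21} & (k_{11} - m_1\omega^2) \end{bmatrix}\begin{Bmatrix} F_1 \\ 0 \end{Bmatrix} \tag{37}$$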

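EXAMPLE. The equation of motion for the system shown in Fig. 14 is

$$\begin{bmatrix} m & 0 \\ 0 & m \end{bmatrix}\begin{Bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{Bmatrix} + \begin{bmatrix} 2k & -k \\ -k & 2k \end{bmatrix}\begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} F_1 \\ 0 \end{Bmatrix} \sin \omega t$$

Thus, $k_{11} = k_{22} = 2k$; $k_{12} = k_{21} = -k$; $\omega_1^2 = k/m$ and $\omega_2^2 = 3k/m$. Then $X_1$ and $X_2$ become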

FIGURE 14 Forced vibration of two DOFS.


FIGURE 15 Forced vibration of system of Fig. 14.

$$X_1 = \frac{(2k - m\omega^2)F_1}{m^2(\omega_1^2 - \omega^2)(\omega_2^2 - \omega^2)}$$

$$X_2 = \frac{kF_1}{m^2(\omega_1^2 - \omega^2)(\omega_2^2 - \omega^2)}$$

These equations plotted in nondimensional form are shown in Fig. 15.
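As a check on the closed-form amplitudes, they can be substituted back into Eq. (35). The sketch below does this for the equal-mass, equal-spring system of Fig. 14; the function name and the numerical values are illustrative assumptions.

```python
def forced_amplitudes(m, k, F1, omega):
    """Steady-state amplitudes X1, X2 for the system of Fig. 14 (formulas from Eq. 37)."""
    w1_sq, w2_sq = k / m, 3.0 * k / m
    det = m * m * (w1_sq - omega ** 2) * (w2_sq - omega ** 2)  # |Z(omega)|
    X1 = (2.0 * k - m * omega ** 2) * F1 / det
    X2 = k * F1 / det
    return X1, X2


if __name__ == "__main__":
    m, k, F1, omega = 1.0, 1.0, 1.0, 0.5  # hypothetical values, omega below omega_1
    X1, X2 = forced_amplitudes(m, k, F1, omega)
    z = 2.0 * k - m * omega ** 2
    # Residuals of the two rows of Eq. (35); both should be close to zero
    print(z * X1 - k * X2 - F1, -k * X1 + z * X2)
```

Both residuals should vanish to rounding error. The formulas become unbounded as $\omega$ approaches $\omega_1$ or $\omega_2$, which are the two resonances visible in Fig. 15.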

B. Higher Degrees of Freedom

1. General Equations of Motion

We now introduce the equations of motion in a more general notation:

$$\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}\begin{Bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \\ \ddot{q}_3 \end{Bmatrix} + \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix}\begin{Bmatrix} \dot{q}_1 \\ \dot{q}_2 \\ \dot{q}_3 \end{Bmatrix} + \begin{bmatrix} k_{11} & k_{12} & k_{13} \\ k_{21} & k_{22} & k_{23} \\ k_{31} & k_{32} & k_{33} \end{bmatrix}\begin{Bmatrix} q_1 \\ q_2 \\ q_3 \end{Bmatrix} = \begin{Bmatrix} F_1 \\ F_2 \\ F_3 \end{Bmatrix} \tag{38}$$

In this equation, the square matrices are all symmetric about the diagonal, as was found for the previous example of two DOF; that is, $m_{ij} = m_{ji}$, $c_{ij} = c_{ji}$, and $k_{ij} = k_{ji}$. This property of symmetry results from Maxwell's reciprocal theorem, which states that the work done on any linear structure by loads applied at two different points is independent of the order of loading.

Examining the terms of the stiffness matrix, the following interpretation can be made. If $q_i$ is made equal to unity with all other $q$'s equal to zero, Eq. (38) states that the terms of the $i$th column $k_{1i}, k_{2i}, k_{3i}, \ldots$ are equal to the forces $f_1, f_2, f_3, \ldots$ required at stations 1, 2, 3, ... in order to maintain this displacement configuration. Thus, the stiffness term of any column can be determined by letting the displacement corresponding to that column be unity with all other displacements equal to zero and measuring the forces required at each station.

For concise presentation of the matrix equation, the form

$$[M]\{\ddot{q}\} + [C]\{\dot{q}\} + [K]\{q\} = \{F\}$$

is generally used, and when there is no ambiguity, brackets and braces are often dispensed with, that is,

$$M\ddot{q} + C\dot{q} + Kq = F \tag{39}$$

Similarity with the equation for the single DOF is strikingly evident.

2. Eigenvalues and Eigenvectors

For the calculation of the natural frequencies and mode shapes, the damping terms are deleted along with the forcing terms, the equation taking the form

$$M\ddot{q} + Kq = 0$$

Since the normal modes execute harmonic motion, $\ddot{q} = -\omega^2 q$. By letting $\lambda = \omega^2$, the equation to be solved becomes

$$[K - \lambda M]q = 0 \tag{40}$$

The characteristic equation is the determinant of the equation equated to zero, or

$$|K - \lambda M| = 0 \tag{41}$$

and the roots $\lambda_i$ of this equation are the eigenvalues of the system. The normal modes (eigenvectors) are then found by substituting the $\lambda$'s back into the matrix equation.

3. Orthogonality of Modes

The normal modes of the system can be shown to be orthogonal with respect to the mass and the stiffness matrices. Letting the normal mode for the $i$th mode be represented as $u_i$, the equation for the $i$th mode can be written

$$\lambda_i [M]u_i = [K]u_i \tag{42}$$

Premultiplying the $i$th equation by the transpose of $u_j$, we obtain

$$\lambda_i u_j^T [M]u_i = u_j^T [K]u_i$$

If we start with the equation for the $j$th mode and repeat the above operation, we obtain a similar equation with the $i$ and $j$ interchanged. Now by subtracting the two equations and noting that for symmetric matrices

$$u_j^T [\ ]u_i = u_i^T [\ ]u_j$$

we obtain

$$(\lambda_i - \lambda_j)\,u_i^T [M]u_j = 0$$


Since $\lambda_i$ differs from $\lambda_j$, the above equation requires that

$$u_i^T [M]u_j = 0 \quad \text{for } i \neq j \tag{43}$$

Examining the original equation for the $i$th mode multiplied by the transpose of the $j$th mode, we also conclude that

$$u_i^T [K]u_j = 0 \quad \text{for } i \neq j \tag{44}$$

These equations define the orthogonal property of normal modes.

Finally, if $i = j$, $(\lambda_i - \lambda_j) = 0$ and

$$u_j^T [M]u_j = M_j, \qquad u_j^T [K]u_j = K_j \tag{45}$$

where $M_j$ and $K_j$ can be any finite quantity. These are called the generalized mass and the generalized stiffness.

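EXAMPLE. The natural frequencies and normal modes for the system of Fig. 12 were found as

$$\omega_1^2 = \frac{k}{m},\ u_1 = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}; \qquad \omega_2^2 = \frac{3k}{m},\ u_2 = \begin{Bmatrix} 1 \\ -1 \end{Bmatrix}$$

The $M$ and $K$ matrices for the system are

$$M = m\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \qquad K = k\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$$

With the common factors $m$ and $k$ omitted, the orthogonality relations then have the following values:

$$(1\ \ {-1})\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix} = 0$$

$$(1\ \ {-1})\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix} = 0$$

The values of the generalized mass and the generalized stiffness, again in units of $m$ and $k$, are

$$M_1 = (1\ \ 1)\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix} = 2$$

$$M_2 = (1\ \ {-1})\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = 2$$

$$K_1 = (1\ \ 1)\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix} = 2$$

$$K_2 = (1\ \ {-1})\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}\begin{Bmatrix} 1 \\ -1 \end{Bmatrix} = 6$$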

4. Modal Matrix

There are several computer programs available to solve the undamped homogeneous equation,

$$M\ddot{q} + Kq = 0$$

for its natural frequencies and normal modes. By using these results and forming a modal matrix $P$, the general equation for the forced vibration can be decoupled and solved as a system of equations corresponding to those of the single DOFS.

The modal matrix is composed of columns of normal modes $u_i$ as follows:

$$P = [u_1\ u_2\ u_3] \tag{46}$$

If each of the modal columns is divided by the square root of the generalized mass $M_i$ of the mode, a weighted modal matrix $\tilde{P}$ is formed.

If we use the coordinate transformation $q = \tilde{P}p$ and premultiply by the transpose $\tilde{P}^T$, the general equation

$$M\ddot{q} + C\dot{q} + Kq = F$$

for the forced vibration becomes

$$\tilde{P}^T M\tilde{P}\,\ddot{p} + \tilde{P}^T C\tilde{P}\,\dot{p} + \tilde{P}^T K\tilde{P}\,p = \tilde{P}^T F$$

Due to the orthogonality of the normal modes, the mass and stiffness matrices are diagonalized:

$$\tilde{P}^T M\tilde{P} = \begin{bmatrix} 1 & & \\ & 1 & \\ & & 1 \end{bmatrix} = I \quad \text{(unit matrix)} \tag{47}$$

$$\tilde{P}^T K\tilde{P} = \begin{bmatrix} \omega_1^2 & & \\ & \omega_2^2 & \\ & & \omega_3^2 \end{bmatrix} = \Lambda \quad \text{(diagonal matrix of squared natural frequencies)} \tag{48}$$

The damping matrix, however, is in general not diagonalized, and the system of equations remains coupled by damping.

If the damping matrix is proportional to the mass or stiffness matrix, it is evident that the matrix $\tilde{P}^T C\tilde{P}$ is also diagonalized. The damping is then called proportional damping, and each of the decoupled equations will be of the form

$$\ddot{p}_i + 2\zeta_i\omega_i\dot{p}_i + \omega_i^2 p_i = f_i(t) \tag{49}$$

corresponding to that of the single DOFS.

When $C$ can be expressed in the form

$$C = \alpha M + \beta K \tag{50}$$

known as Rayleigh damping, where $\alpha$ and $\beta$ are constants, the forced vibration equation can again be decoupled by the transformation $q = \tilde{P}p$. In this case $\tilde{P}^T C\tilde{P}$ becomes

$$\tilde{P}^T C\tilde{P} = \alpha\tilde{P}^T M\tilde{P} + \beta\tilde{P}^T K\tilde{P} = \alpha I + \beta\Lambda$$

and each of the decoupled equations will have the form

$$\ddot{p}_i + (\alpha + \beta\omega_i^2)\dot{p}_i + \omega_i^2 p_i = f_i(t) \tag{51}$$
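The diagonalization in Eqs. (47), (48), and (51) can be illustrated on the two-DOF system of Fig. 12 with $k_i = k$, $m_i = m$. This is a minimal sketch with hand-written matrix helpers; the Rayleigh coefficients `alpha` and `beta` are hypothetical values chosen only for illustration, and each mode is scaled by the square root of its generalized mass so that the transformed mass matrix is the unit matrix.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def modal_transform(P, A):
    """Return P^T A P."""
    return matmul(transpose(P), matmul(A, P))


m, k = 1.0, 1.0
alpha, beta = 0.2, 0.05                      # hypothetical Rayleigh coefficients
M = [[m, 0.0], [0.0, m]]
K = [[2.0 * k, -k], [-k, 2.0 * k]]
C = [[alpha * M[i][j] + beta * K[i][j] for j in range(2)] for i in range(2)]

scale = math.sqrt(2.0 * m)                   # sqrt of generalized mass M_i = 2m
P = [[1.0 / scale, 1.0 / scale],             # columns are u1 = (1, 1), u2 = (1, -1)
     [1.0 / scale, -1.0 / scale]]

M_modal = modal_transform(P, M)              # expected: unit matrix
K_modal = modal_transform(P, K)              # expected: diag(k/m, 3k/m)
C_modal = modal_transform(P, C)              # expected: diag(alpha + beta*omega_i^2)

if __name__ == "__main__":
    print(M_modal)
    print(K_modal)
    print(C_modal)
```

The off-diagonal entries of all three transformed matrices should vanish to rounding error, and the diagonal of the damping matrix should equal $\alpha + \beta\omega_i^2$, the coefficient appearing in Eq. (51).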


IV. CONTINUOUS SYSTEMS: NORMAL MODES

Systems having continuously distributed mass and elasticity lead to partial differential equations. The general solution to these equations, including transient vibration, is beyond the scope of this article. Analytic solutions for moving boundary conditions would involve the use of Laplace transformation. Numerical solutions for the more general configurations would require discretization and the aid of computers. In this section only the normal mode free vibration of a few simple bodies with the usual boundary conditions will be discussed.

A. Strings and Rods

The flexible string with uniformly distributed mass, and the uniform rod in axial or torsional vibration, lead to the same equation of motion known as the wave equation. It has the form

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} \tag{52}$$

where $x$ is the continuous coordinate along the string or rod and $c$ the velocity of propagation of the disturbance along its length.

The propagation velocity for each case is as follows:

String under tension $T$:

$$c = \sqrt{T/\rho}, \quad \rho = \text{mass per unit length}$$

Axial vibration of rod:

$$c = \sqrt{E/\rho}, \quad E = \text{Young's modulus of elasticity}, \quad \rho = \text{mass per unit volume}$$

Torsional vibration of rod:

$$c = \sqrt{G/\rho}, \quad G = \text{shear modulus}, \quad \rho = \text{mass per unit volume}$$

The solution to the wave equation can be expressed as

$$u(x, t) = F_1(ct - x) + F_2(ct + x) \tag{53}$$

where $F_1$ and $F_2$ are arbitrary functions. This equation implies that a wave travels along the $x$ axis in the forward and backward directions with the propagation velocity $c$.

One method of solving the partial differential equation is that of separation of variables. In this method the solution is assumed in the form

$$u(x, t) = U(x)G(t) \tag{54}$$

Its substitution into the differential equation results in two ordinary differential equations with the solutions

$$U(x) = A \sin(\omega x/c) + B \cos(\omega x/c)$$

$$G(t) = C \sin \omega t + D \cos \omega t$$

The constants $A$ and $B$ are solved from the boundary conditions at the two ends, whereas the constants $C$ and $D$ are established from initial conditions.

EXAMPLE. For a uniform rod in longitudinal vibration with one end fixed and the other end free, the boundary conditions are as follows:

Fixed end $x = 0$; displacement $U(0) = 0$, therefore

$$B = 0$$

Free end $x = l$; stress $E\left(\dfrac{dU}{dx}\right)_{x=l} = 0$, therefore

$$A\,\frac{\omega}{c}\cos\frac{\omega l}{c} = 0$$

This is satisfied when

$$\cos(\omega l/c) = 0$$

or

$$\omega l/c = \pi/2,\ 3\pi/2,\ \ldots,\ (2n - 1)\pi/2, \qquad n = 1, 2, 3, \ldots$$
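The result of this example translates directly into the natural frequencies $\omega_n = (2n - 1)\pi c/2l$ with $c = \sqrt{E/\rho}$. The sketch below evaluates them; the function name and the material values (roughly those of steel) are illustrative assumptions, not data from the article.

```python
import math


def rod_fixed_free_frequencies(E, rho, length, n_modes=3):
    """Natural frequencies (rad/s) of a fixed-free rod in axial vibration."""
    c = math.sqrt(E / rho)  # propagation velocity for axial vibration
    return [(2 * n - 1) * math.pi * c / (2.0 * length) for n in range(1, n_modes + 1)]


if __name__ == "__main__":
    # Assumed values: E in Pa, rho in kg/m^3, length in m
    for omega in rod_fixed_free_frequencies(200e9, 7850.0, 1.0):
        print(omega, omega / (2.0 * math.pi))  # rad/s and Hz
```

The frequencies stand in the ratio 1 : 3 : 5, so only odd multiples of the fundamental appear for this pair of boundary conditions.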

B. Beams

For the lateral vibration of uniform beams, the following differential equation, known as Euler's equation, applies:

$$EI\,\frac{\partial^4 y}{\partial x^4} + \rho\,\frac{\partial^2 y}{\partial t^2} = 0 \tag{55}$$

where $\rho$ is the mass per unit length of beam. For the normal mode vibration, $\partial^2 y/\partial t^2 = -\omega^2 y$ and Eq. (55) becomes

$$\frac{d^4 y}{dx^4} - \beta^4 y = 0 \tag{56}$$

where $\beta^4 = \rho\omega^2/EI$.

Since this is a fourth-order differential equation, the general solution has four arbitrary constants:

$$y = A \cosh \beta x + B \sinh \beta x + C \cos \beta x + D \sin \beta x \tag{57}$$

These constants and the values of $\beta$ must be solved from the boundary conditions. The natural frequencies can then be established from the equation

$$\omega = (\beta l)^2\sqrt{EI/\rho l^4} \tag{58}$$

Values of $(\beta l)^2$ for the first three normal modes for various boundary conditions are given in Table I.


TABLE I Numerical Values for $(\beta l)^2$

Beam configuration    First mode    Second mode    Third mode
Simply supported      9.87          39.5           88.9
Cantilever            3.52          22.0           61.7
Free–free             22.4          61.7           121.0
Clamped–clamped       22.4          61.7           121.0
Clamped–hinged        15.4          50.0           104.0
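The cantilever row of Table I can be reproduced numerically. The article does not state the frequency equation; the sketch below assumes the standard result for a clamped–free beam, $\cos\beta l \cosh\beta l = -1$, which follows from applying the boundary conditions to Eq. (57). The bisection helper and the bracketing intervals are illustrative choices.

```python
import math


def cantilever_equation(x):
    """Frequency equation for a clamped-free beam (assumed standard form), x = beta*l."""
    return math.cos(x) * math.cosh(x) + 1.0


def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi], assuming f changes sign over the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo > 0.0) == (f_mid > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Brackets chosen by inspection so that each contains exactly one sign change
brackets = [(1.0, 3.0), (4.0, 6.0), (7.0, 9.0)]
beta_l_squared = [bisect(cantilever_equation, lo, hi) ** 2 for lo, hi in brackets]

if __name__ == "__main__":
    print(beta_l_squared)
```

The three values should round to the cantilever entries of Table I (3.52, 22.0, 61.7). Multiplying each by $\sqrt{EI/\rho l^4}$, as in Eq. (58), gives the natural frequencies of a particular beam.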

V. LAGRANGE'S EQUATION: GENERALIZED COORDINATES

For complex problems, the vector method based on Newton's laws becomes unwieldy, and the scalar method based on energy should be considered. The usefulness of the energy method has already been demonstrated for the single DOFS. However, the statement for the total energy, as used for the single DOFS, provides only one equation, which is insufficient for multi-DOFS. This limitation was overcome by Joseph L. C. Lagrange (1736–1813).

A. Equations of Motion

Lagrange developed a general treatment for dynamic systems formulated from the scalar quantities of kinetic energy $T$, potential energy $U$, and work $W$. The method is not confined to any specific coordinate system and results in a set of simultaneous differential equations of motion in terms of independent generalized coordinates.

Lagrange's equation is presented here as

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_i}\right) - \frac{\partial T}{\partial q_i} + \frac{\partial U}{\partial q_i} = Q_i \tag{59}$$

where $Q_i$ is the generalized force. The generalized coordinates $q_i$ are independent coordinates equal in number to the DOF of the system.

EXAMPLE. In Fig. 16, $q_1 = x$ and $q_2 = \theta$ are generalized coordinates completely defining the motion of the system. The velocities of the two masses are

$$v_1^2 = \dot{q}_1^2$$

$$v_2^2 = (\dot{q}_1 + l\dot{q}_2 \cos q_2)^2 + (l\dot{q}_2 \sin q_2)^2$$

and the kinetic energy becomes

$$T = \tfrac{1}{2}m_1\dot{q}_1^2 + \tfrac{1}{2}m_2\left[(\dot{q}_1 + l\dot{q}_2 \cos q_2)^2 + (l\dot{q}_2 \sin q_2)^2\right]$$

FIGURE 16 System with generalized coordinates q1 and q2.

Letting the potential energy be equal to zero along the horizontal line through $m_1$, the equation for $U$ becomes

$$U = \tfrac{1}{2}kq_1^2 - m_2 gl \cos q_2$$

It is seen from these equations that $T$ is a function of both $q_i$ and $\dot{q}_i$, whereas $U$ is a function only of the $q_i$. Their substitution into Lagrange's equation results in a set of nonlinear differential equations in $q_1$ and $q_2$:

$$(m_1 + m_2)\ddot{q}_1 + m_2 l(\ddot{q}_2 \cos q_2 - \dot{q}_2^2 \sin q_2) + kq_1 = 0$$

$$m_2 l^2\ddot{q}_2 + m_2 l\ddot{q}_1 \cos q_2 + m_2 gl \sin q_2 = 0$$

If small angles are assumed for $q_2$, the linearized equations of motion become

$$(m_1 + m_2)\ddot{q}_1 + m_2 l\ddot{q}_2 + kq_1 = 0$$

$$m_2 l(\ddot{q}_1 + l\ddot{q}_2) + m_2 glq_2 = 0$$
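The linearized equations have the form $M\ddot{q} + Kq = 0$ with dynamic coupling through the term $m_2 l$, so their natural frequencies follow from $|K - \lambda M| = 0$. Expanding the determinant and dividing by $m_2 l$ gives the quadratic $m_1 l\lambda^2 - [kl + (m_1 + m_2)g]\lambda + kg = 0$. The sketch below solves it; the function name and parameter values are illustrative and not part of the article.

```python
import math


def cart_pendulum_frequencies(m1, m2, k, l, g=9.81):
    """Natural frequencies (rad/s) of the linearized system of Fig. 16.

    Solves m1*l*lam^2 - (k*l + (m1 + m2)*g)*lam + k*g = 0 for lam = omega^2.
    """
    a = m1 * l
    b = -(k * l + (m1 + m2) * g)
    c = k * g
    root = math.sqrt(b * b - 4.0 * a * c)
    lams = sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
    return [math.sqrt(lam) for lam in lams]


if __name__ == "__main__":
    print(cart_pendulum_frequencies(2.0, 1.0, 50.0, 0.5))   # hypothetical values
    # With a negligible pendulum mass the modes uncouple into
    # sqrt(g/l) for the pendulum and sqrt(k/m1) for the spring-mass.
    print(cart_pendulum_frequencies(2.0, 1e-9, 50.0, 0.5))
```

The second call is a sanity check: as $m_2 \to 0$ the two frequencies should approach those of a simple pendulum and of the spring–mass alone.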

Generalized coordinates are independent coordinates equal in number to the DOF of the system. If, in the previous example, rectangular coordinates $x_1, y_1$ were chosen for the pendulum mass instead of the generalized coordinate $q_2 = \theta$, there would be one coordinate more than the required two coordinates for the problem. However, the rectangular coordinates $x_1, y_1$ are not independent and are related by the constraint equation $x_1^2 + y_1^2 = l^2$. Thus, one of the rectangular coordinates can be eliminated from the constraint equation, leaving only two independent coordinates as generalized coordinates.

In structural problems related coordinates are often encountered. However, the excess coordinates can be eliminated from the constraint equations, leaving the required number of coordinates as generalized coordinates with which to formulate the kinetic and potential energy expressions for Lagrange's equations.

B. Mode Summation

Engineering structures and machines are generally composed of beams, columns, plates, shells, and other continuously distributed elements, each with an infinite number of DOF. The mode summation method enables one to analyze such systems as systems with a finite number of DOF.

In this method, the displacement of each component of the structure is represented by the sum of mode shapes $\phi_i(x)$. If these modes are normal modes, considerable simplification results due to orthogonality. Writing the deflection as

$$y(x, t) = \sum_i \phi_i(x)\,q_i(t) \tag{60}$$

one can determine the kinetic energy, the potential energy, and the work done by external forces. These quantities are then substituted into Lagrange's equation to establish the equations of motion for the system.

In terms of the generalized coordinates $q_i$, the following quantities are needed:

1. Kinetic energy

$$T = \frac{1}{2}\int \dot{y}^2(x, t)\,m(x)\,dx = \frac{1}{2}\sum_i\sum_j \dot{q}_i\dot{q}_j\int \phi_i\phi_j\,m(x)\,dx = \frac{1}{2}\sum_i\sum_j M_{ij}\dot{q}_i\dot{q}_j = \frac{1}{2}\{\dot{q}\}^T[M_{ij}]\{\dot{q}\} \tag{61}$$

where $M_{ij} = \int \phi_i\phi_j\,m(x)\,dx$ is the generalized mass.

2. Potential energy

$$U = \frac{1}{2}\sum_i\sum_j K_{ij}q_i q_j = \frac{1}{2}\{q\}^T[K_{ij}]\{q\} \tag{62}$$

The generalized stiffness $K_{ij}$ depends on the type of elastic structure. For a uniform beam

$$K_{ij} = \int EI\,\phi_i''\phi_j''\,dx$$

3. Work term. The work term is established from the work done by the external forces due to a virtual displacement $\delta q_i$ of the generalized coordinates:

$$\delta W = \int f(x, t)\left(\sum_i \phi_i\,\delta q_i\right)dx = \sum_i \delta q_i\int f(x, t)\,\phi_i(x)\,dx = \sum_i Q_i\,\delta q_i$$

The term

$$Q_i = \int f(x, t)\,\phi_i(x)\,dx \tag{63}$$

is called the generalized force.

These quantities can now be substituted into Lagrange's equation to establish the finite set of equations for the motion of the system. When the $\phi_i(x)$ are normal modes, $M_{ij}$ and $K_{ij}$ become diagonal matrices, which leads to a set of uncoupled equations.

VI. APPROXIMATE AND NUMERICAL METHODS

Calculations for the natural frequencies and mode shapes of systems with many DOF are generally long and laborious, requiring the use of computers. In many cases, only a few of the lower modes are of interest, and simple and approximate methods can be used.

A. Rayleigh Quotient

When only the fundamental (lowest) frequency of a system is desired, the Rayleigh energy method offers a simple approach. The method can be applied to the multimass or distributed system.

We have found that, for a conservative system, the maximum values of the kinetic and potential energies can be equated. For a multimass or a distributed mass system this requires an assumption as to the displacement shape of the fundamental mode.

EXAMPLE. A continuously distributed system is oftenmodeled by several lumped masses m1, m2, m3, . . . . Toapproximate the dynamic deflection, assume that the dis-placements of the masses are y1, y2, y3, . . . , produced bythe static loads m1g, m2g, m3g, . . . . The maximum po-tential energy, stored as strain energy, is equal to the workdone by the static loads of the masses:

$$U_{\max} = \frac{1}{2}\, g\,(m_1 y_1 + m_2 y_2 + m_3 y_3 + \cdots)$$

With harmonic motion of frequency ω, the velocity of each mass is $\omega y_i$, and the maximum kinetic energy is

$$T_{\max} = \frac{1}{2}\, \omega^2 (m_1 y_1^2 + m_2 y_2^2 + m_3 y_3^2 + \cdots)$$

Equating the two energies, we obtain the equation for the natural frequency known as the Rayleigh quotient:

$$\omega^2 = \frac{g\,(m_1 y_1 + m_2 y_2 + m_3 y_3 + \cdots)}{(m_1 y_1^2 + m_2 y_2^2 + m_3 y_3^2 + \cdots)} \qquad (64)$$

Here we have assumed a deflection based on the static load of the masses. Since the deviation of the assumed curve from the exact dynamic curve can be thought of as constraints or added stiffness, the natural frequency calculated is slightly higher than the exact value. However, the method is somewhat insensitive to small deviations of the assumed curve and results in a fairly accurate value of the fundamental frequency.

Rayleigh’s quotient can also be expressed in matrix form. Letting $x$ represent the assumed deflection vector, the potential and kinetic energies are

$$U_{\max} = \frac{1}{2}\, x^T [K]\, x$$

$$T_{\max} = \frac{1}{2}\, \omega^2 x^T [M]\, x$$

Equating the two, the equation for the fundamental frequency in matrix form becomes

$$\omega^2 = \frac{x^T [K]\, x}{x^T [M]\, x} \qquad (65)$$
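As an illustration of Eqs. (64) and (65), the sketch below assumes a hypothetical chain of three unit masses joined by unit springs and fixed at one end (a system chosen here for convenience, not taken from the article). The assumed deflection is the static deflection under the weights of the masses, and the result is compared with the exact lowest eigenvalue.

```python
import numpy as np

def rayleigh_quotient(K, M, x):
    """Eq. (65): omega^2 = x^T K x / x^T M x for an assumed deflection x."""
    return float(x @ K @ x) / float(x @ M @ x)

# Hypothetical 3-DOF spring-mass chain, fixed at one end (unit springs and masses)
K = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
M = np.eye(3)
g = 9.81

# Assumed shape: static deflection produced by the loads m_i * g
masses = np.diag(M)
y_static = np.linalg.solve(K, masses * g)

omega2_matrix = rayleigh_quotient(K, M, y_static)                                  # Eq. (65)
omega2_lumped = g * np.sum(masses * y_static) / np.sum(masses * y_static ** 2)     # Eq. (64)

# Exact lowest eigenvalue of M^{-1} K for comparison
omega2_exact = float(np.min(np.linalg.eigvals(np.linalg.solve(M, K)).real))

print(omega2_matrix, omega2_lumped, omega2_exact)
```

For this assumed system both forms of the quotient give $\omega^2 = 0.2$, about 1% above the exact value, consistent with the statement that the estimate lies slightly above the true fundamental frequency.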


B. Rayleigh–Ritz Method

Ritz extended the Rayleigh method to give a more accurate value of the fundamental frequency in addition to providing an approximation to the higher natural frequencies and their mode shapes. The Rayleigh–Ritz method starts by computing the fundamental frequency using Rayleigh’s equation

$$\omega^2 = U_{\max} / T^*_{\max} \qquad (66)$$

where the kinetic energy is given by

$$T_{\max} = \omega^2\, T^*_{\max}$$

The assumed deflection in the Rayleigh equation is considered to be the sum of several mode functions multiplied by constants

$$y(x) = c_1 \phi_1(x) + c_2 \phi_2(x) + \cdots + c_n \phi_n(x) \qquad (67)$$

The functions $\phi_i(x)$ are any admissible functions satisfying the geometric boundary conditions of the problem.

With $U_{\max}$ and $T^*_{\max}$ expressed in terms of $\phi_i(x)$ and $c_i$, $\omega^2$ is then minimized by differentiating with respect to each of the $c_i$. The result is a set of $n$ algebraic equations in $\omega^2$, which in matrix notation is of the form

$$[f(\omega^2)]\{c_i\} = 0$$

Setting the determinant of the coefficient matrix to zero yields the $n$ natural frequencies of the system, and the corresponding mode shapes, determined by the values of the $c_i$, are found by substituting each $\omega^2$ into the above equation.

The success of this method depends on the choice of the shape functions $\phi_i(x)$, which calls for some experience on the part of the analyst.
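A minimal sketch of the procedure, assuming a hypothetical uniform cantilever with $EI = m = L = 1$ and the two admissible functions $x^2$ and $x^3$ (both satisfy the geometric conditions of zero deflection and slope at the clamped end). Minimizing $\omega^2$ with respect to the $c_i$ leads to the generalized eigenproblem $[K]\{c\} = \omega^2 [M]\{c\}$, which is solved here with NumPy; the function choice and all names are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical uniform cantilever, clamped at x = 0 (illustrative values)
EI, m, L = 1.0, 1.0, 1.0

# Assumed admissible functions: phi_1 = x^2, phi_2 = x^3
phis = [Polynomial([0.0, 0.0, 1.0]), Polynomial([0.0, 0.0, 0.0, 1.0])]

def integrate_0_to_L(p):
    """Definite integral of the polynomial p over [0, L]."""
    antiderivative = p.integ()
    return antiderivative(L) - antiderivative(0.0)

n = len(phis)
K = np.array([[EI * integrate_0_to_L(phis[i].deriv(2) * phis[j].deriv(2))
               for j in range(n)] for i in range(n)])
M = np.array([[m * integrate_0_to_L(phis[i] * phis[j])
               for j in range(n)] for i in range(n)])

# d(omega^2)/dc_i = 0 gives [K]{c} = omega^2 [M]{c}
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(M, K))
order = np.argsort(eigvals.real)
omega = np.sqrt(eigvals.real[order])       # approximate natural frequencies
coefficients = eigvecs.real[:, order]      # columns hold the c_i of each mode

print(omega)
```

With these two functions the first frequency comes out near $3.533\sqrt{EI/(mL^4)}$, close to the exact cantilever value of about 3.516, while the second estimate is far less accurate, which illustrates the dependence on the choice of shape functions.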

C. Method of Matrix Iteration

The equations of motion, previously formulated in terms of stiffness, can also be expressed in terms of flexibility. The flexibility influence coefficient $a_{ij}$ is defined as the deflection at $i$ due to a unit load at $j$. Thus, in Fig. 17, $a_{12}$ is the deflection at 1 due to a unit force at 2. Similarly, $a_{32}$ is the deflection at 3 due to the same loading.

With several forces acting, the principle of superposition enables one to write

$$y_1 = a_{11} f_1 + a_{12} f_2 + a_{13} f_3$$

$$y_2 = a_{21} f_1 + a_{22} f_2 + a_{23} f_3$$

$$y_3 = a_{31} f_1 + a_{32} f_2 + a_{33} f_3$$

FIGURE 17 Flexibility influence coefficients.

which in matrix notation becomes

$$\{y\} = [a_{ij}]\{f\} \qquad (68)$$

The square matrix $[a_{ij}]$ is here the flexibility matrix. Since $\{f\} = [k]\{y\}$, the flexibility matrix must be the inverse of the stiffness matrix.

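For a system vibrating in one of the normal modes, the loading is equal to the inertia loads,

$$f_i = -m_i \ddot{y}_i = m_i \omega^2 y_i$$

Thus, Eq. (68) can be written

$$\{y\} = \omega^2 [a][m]\{y\}$$

which written out becomes

$$\begin{Bmatrix} y_1 \\ y_2 \\ y_3 \end{Bmatrix} = \omega^2 \begin{bmatrix} a_{11} m_1 & a_{12} m_2 & a_{13} m_3 \\ a_{21} m_1 & a_{22} m_2 & a_{23} m_3 \\ a_{31} m_1 & a_{32} m_2 & a_{33} m_3 \end{bmatrix} \begin{Bmatrix} y_1 \\ y_2 \\ y_3 \end{Bmatrix} \qquad (69)$$

This equation is suitable for the matrix iteration procedure, which converges to the lowest mode corresponding to the fundamental frequency.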

Starting out with an assumed deflection for the right column and performing the indicated multiplication, a new deflection column is obtained. Normalizing by letting one of the deflections be unity, the procedure is repeated any number of times until the normalized deflection converges to a steady configuration, which is the fundamental mode of vibration. The normalization will also result in a value of $\omega^2$ for the fundamental frequency.

Higher modes can be found, provided that the lower modes are eliminated from the assumed deflection. This is accomplished by assuming the trial deflection to be the sum of normal modes multiplied by constants and systematically eliminating the lower modes by the use of orthogonality. The method, which will not be discussed here, is called sweeping because it results in a sweeping matrix to sweep out the lower modes.
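The iteration for the fundamental mode can be sketched as follows, assuming a hypothetical chain of three unit masses and unit springs fixed at one end, for which the flexibility coefficients are $a_{ij} = \min(i, j)$. The function name, tolerance, and starting vector are illustrative choices; sweeping for higher modes is not included.

```python
import numpy as np

def matrix_iteration(a, m, y0, tol=1e-12, max_iter=500):
    """Iterate {y} = omega^2 [a][m]{y}, Eq. (69), for the fundamental mode.

    a  : flexibility matrix [a_ij]
    m  : 1-D array of lumped masses
    y0 : assumed starting deflection (first entry must be nonzero)
    """
    D = a @ np.diag(m)                     # dynamic matrix [a][m]
    y = np.asarray(y0, dtype=float)
    y = y / y[0]                           # normalize so that y_1 = 1
    omega2 = None
    for _ in range(max_iter):
        z = D @ y                          # new deflection column
        omega2 = 1.0 / z[0]                # since y_1 = 1 = omega^2 * z_1
        y_new = z / z[0]                   # renormalize to y_1 = 1
        if np.max(np.abs(y_new - y)) < tol:
            return omega2, y_new
        y = y_new
    return omega2, y

# Hypothetical 3-DOF chain: a_ij = min(i, j) / k with k = 1
a = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 2.0, 3.0]])
m = np.array([1.0, 1.0, 1.0])

omega2, mode = matrix_iteration(a, m, y0=[1.0, 2.0, 3.0])
print(omega2, mode)
```

Because the iteration amplifies the component with the largest $1/\omega^2$, it settles on the lowest mode regardless of the (non-degenerate) starting shape.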

D. Holzer Method

Holzer proposed a simple numerical procedure for the calculation of the natural frequencies and mode shapes of any multi-DOF torsional system. The method can also be used for the translational vibration of a lumped-mass-spring system.

For the torsional system, the calculations are started by assuming a unit torsional amplitude at one end; after a trial frequency ω is chosen, the torques and amplitudes are progressively calculated to the other end of the system. Depending on the boundary conditions, the torque or the angular displacement at the boundary is plotted against the frequency ω. If the boundary is free, the frequencies that result in zero torque are the natural frequencies of the system. If the boundary is fixed, the frequencies that result in zero displacement are the natural frequencies.

FIGURE 18 Torsional system of three DOF.

EXAMPLE. The three-DOF torsional system of Fig. 18 is free at 1 and fixed at 4. A frequency ω having been chosen, the torque required on disk 1 for $\theta_1 = 1$ is $\omega^2 J_1$. This torque acting on shaft 1 will twist it by

$$\omega^2 J_1 / K_1 = 1 - \theta_2$$

or

$$\theta_2 = 1 - \omega^2 J_1 / K_1$$

The torque required on disk 2 to maintain the amplitude is $\omega^2 J_2 \theta_2$, and shaft 2, supporting the sum of the torques of disks 1 and 2, will twist by

$$\theta_2 - \theta_3 = (\omega^2 J_1 + \omega^2 J_2 \theta_2) / K_2$$

Again the torque of disk 3 to maintain the amplitude $\theta_3$ is $\omega^2 J_3 \theta_3$, and shaft 3 will twist by

$$\theta_3 - \theta_4 = (\omega^2 J_1 + \omega^2 J_2 \theta_2 + \omega^2 J_3 \theta_3) / K_3$$

Repeating the same calculation with another value of ω, $\theta_4$ is calculated and plotted against the new value of ω. A plot of $\theta_4$ against ω may then appear as in Fig. 19, with $\theta_4$ passing through zero at three values of ω. These are the three natural frequencies of the system.

The mode shapes can be determined by calculating the values of $\theta_i$ from the above equations, using the ω’s for the natural frequencies.
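The example above can be sketched in code. The values $J_i = K_i = 1$ are assumptions made only for illustration, and the scan-plus-bisection root search stands in for reading the zero crossings off a plot of $\theta_4$ against ω.

```python
import numpy as np

def holzer_end_angle(omega, J, K):
    """March from the free end (theta_1 = 1) and return the angle at the far end."""
    theta = 1.0
    torque = 0.0
    for J_i, K_i in zip(J, K):
        torque += omega ** 2 * J_i * theta   # accumulated inertia torque
        theta -= torque / K_i                # twist of the next shaft
    return theta                             # theta_4 for a three-disk system

def holzer_frequencies(J, K, omega_max, n_scan=2000):
    """Locate zero crossings of theta_4(omega) by scanning and bisection."""
    grid = np.linspace(1e-6, omega_max, n_scan)
    values = [holzer_end_angle(w, J, K) for w in grid]
    roots = []
    for i in range(n_scan - 1):
        lo, hi, f_lo = grid[i], grid[i + 1], values[i]
        if f_lo * values[i + 1] < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = holzer_end_angle(mid, J, K)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return roots

# Hypothetical disks and shafts (unit inertias and stiffnesses)
J = [1.0, 1.0, 1.0]
K = [1.0, 1.0, 1.0]
omegas = holzer_frequencies(J, K, omega_max=2.5)
print(omegas)
```

For these assumed values, $\theta_4$ changes sign three times in the scanned range, giving the three natural frequencies of the fixed-end system.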

1. Transfer Matrix

In the Holzer method, the state of the deflection and torque at one station is transferred to the neighboring station, and the procedure is numerically carried out from one end of the system to the other. The transfer matrix method is a matrix systematization of the Holzer method. The method can also be applied to the linear spring–mass system and to beams and branched systems.

FIGURE 19 Natural frequencies of torsional system of Fig. 18.

FIGURE 20 Element of transfer matrix.

The existing state at any station is first defined by the state vector, which is a column matrix of the deflection and force. For the spring–mass system of Fig. 20 the stations are numbered with the spring and the mass to the right as the structural element. The state vector for this system is the deflection and force at $n - 1$, $\begin{Bmatrix} x \\ F \end{Bmatrix}_{n-1}$, which is to be related to $\begin{Bmatrix} x \\ F \end{Bmatrix}_n$.

Considering the spring $k_n$, the displacements at the two ends are $x_n$ and $x_{n-1}$, and the force through it is $F_{n-1}$. The equation relating the two is

$$x_n = x_{n-1} + F_{n-1} / k_n$$

which can be written as the matrix equation

$$\{x\}_n = \begin{bmatrix} 1 & 1/k_n \end{bmatrix} \begin{Bmatrix} x \\ F \end{Bmatrix}_{n-1}$$

The forces on the two sides of $m_n$ are $F_n$ and $F_{n-1}$, and the equation is

$$F_n = F_{n-1} - \omega^2 m_n x_n$$

Substituting for $x_n$ from the first equation, we have

$$F_n = F_{n-1} - \omega^2 m_n (x_{n-1} + F_{n-1} / k_n)$$

or

$$\{F\}_n = \begin{bmatrix} -\omega^2 m_n & (1 - \omega^2 m_n / k_n) \end{bmatrix} \begin{Bmatrix} x \\ F \end{Bmatrix}_{n-1}$$

Putting the two matrix equations together, the desired result is

$$\begin{Bmatrix} x \\ F \end{Bmatrix}_n = \begin{bmatrix} 1 & 1/k \\ -\omega^2 m & (1 - \omega^2 m / k) \end{bmatrix}_n \begin{Bmatrix} x \\ F \end{Bmatrix}_{n-1}$$

which transfers the state vector at $n - 1$ to the state vector at $n$. The square matrix above is called the transfer matrix for the $n$th element.

Starting with a numerical value for $\omega^2$, the calculation can be progressively carried out from one end of the system to the other. Depending on the boundary conditions, either $x_n$ or $F_n$ at the far end can be plotted against $\omega^2$, and the natural frequencies of the system are found when the boundary conditions are satisfied.

FIGURE 21 Beam element.

The procedure to be followed for the torsional system is identical to that of the linear spring–mass system, the state vector being the angular displacement θ and the torque $T$.
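For the linear spring–mass system, the transfer-matrix calculation can be sketched as below. The chain (three unit springs and masses, fixed at station 0 and free at the far end) is a hypothetical example, so the boundary condition to satisfy is zero force at the last station; the function names and the scan-plus-bisection search are illustrative.

```python
import numpy as np

def transfer_matrix(omega2, k, m):
    """Transfer matrix of one element (spring k followed by mass m)."""
    return np.array([[1.0, 1.0 / k],
                     [-omega2 * m, 1.0 - omega2 * m / k]])

def end_force(omega2, springs, masses):
    """Force F_N at the far end for a chain fixed at station 0."""
    state = np.array([0.0, 1.0])      # x_0 = 0 (fixed end), arbitrary unit force F_0
    for k, m in zip(springs, masses):
        state = transfer_matrix(omega2, k, m) @ state
    return state[1]

def natural_frequencies_squared(springs, masses, omega2_max, n_scan=4000):
    """Scan omega^2, bracket sign changes of F_N, and refine by bisection."""
    grid = np.linspace(0.0, omega2_max, n_scan)
    values = [end_force(w2, springs, masses) for w2 in grid]
    roots = []
    for i in range(n_scan - 1):
        lo, hi, f_lo = grid[i], grid[i + 1], values[i]
        if f_lo * values[i + 1] < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = end_force(mid, springs, masses)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return roots

omega2 = natural_frequencies_squared([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], omega2_max=4.0)
print(omega2)
```

A single element serves as a quick check of the sign convention: with $x_0 = 0$ the end force is $F_1 = (1 - \omega^2 m/k)F_0$, which vanishes at $\omega^2 = k/m$.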

For the beam, the mass is again lumped at the right end, as shown in Fig. 21. The state vector will here contain four quantities, $\{-V \;\; M \;\; \theta \;\; y\}^T$, where $V$ is the shear, $M$ the bending moment, θ the slope of the beam, and $y$ its lateral deflection.

The transfer matrix can be developed in two parts, one called the field matrix, related to the elastic element, and the other called the point matrix, related to the quantities on the two sides of the mass. The two are then put together for the transfer matrix, which is a 4 × 4 matrix. Again the numerical procedure starts with a chosen value of $\omega^2$, and the boundary conditions must be satisfied for the determination of the natural frequencies.

E. Finite Difference Numerical Computation

When the differential equation cannot be integrated in closed form, numerical methods must be employed. This may occur when the system is nonlinear or if the system is excited by a force that cannot be expressed by simple analytic functions.

In the finite difference method for initial value problems, the continuous variable $t$ is replaced by the discrete variable $t_i$. The differential equation is solved progressively in time increments $h = \Delta t$ starting from known initial conditions. With a sufficiently small time increment, an approximate solution of acceptable accuracy is obtainable.

In this section we discuss two finite difference methods. A discussion of the merits of the different finite difference methods, such as accuracy, stability, and length of computation, is beyond the scope of this article.

In the first method, the second-order differential equation for the viscously damped single-DOF system

$$M\ddot{x} + C\dot{x} + Kx = F(t) \qquad (70)$$

is solved directly by discretizing the derivatives using the central difference method. This method is developed from the Taylor expansions of $x_{i+1}$ and $x_{i-1}$ about the point $i$:

$$x_{i+1} = x_i + h\dot{x}_i + \frac{h^2}{2}\ddot{x}_i + \frac{h^3}{6}\dddot{x}_i + \cdots$$

$$x_{i-1} = x_i - h\dot{x}_i + \frac{h^2}{2}\ddot{x}_i - \frac{h^3}{6}\dddot{x}_i + \cdots \qquad (71)$$

where the time interval is $h = \Delta t$. Subtracting and ignoring terms of order $h^2$ and higher, we obtain

$$\dot{x}_i = \frac{1}{2h}(x_{i+1} - x_{i-1}). \qquad (72)$$

Adding, we find

$$\ddot{x}_i = \frac{1}{h^2}(x_{i-1} - 2x_i + x_{i+1}). \qquad (73)$$

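Replacing the derivatives in Eq. (70) by the central differences, the finite difference equation is given by

$$\frac{M}{h^2}[x_{i-1} - 2x_i + x_{i+1}] + \frac{C}{2h}[x_{i+1} - x_{i-1}] + K x_i = F_i \qquad (74)$$

where $x_i = x(t_i)$ and $F_i = F(t_i)$. Rearranging this equation yields the recurrence formula

$$x_{i+1} = \left[\frac{1}{\dfrac{M}{h^2} + \dfrac{C}{2h}}\right] \left[ F_i + \left[\frac{2M}{h^2} - K\right] x_i + \left[\frac{C}{2h} - \frac{M}{h^2}\right] x_{i-1} \right]. \qquad (75)$$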

It allows us to compute the displacement of the mass at time $t_{i+1}$, $x_{i+1}$, if we know the displacements at times $t_i$ and $t_{i-1}$ and the external force $F_i$. This formula is not self-starting. In order to find $x_1$, we need $x_0$, which is given as an initial condition, and $x_{-1}$, which is not known but can be computed. The initial values $x_0$ and $\dot{x}_0$ are used to compute $\ddot{x}_0$ from the differential equation

$$\ddot{x}_0 = \frac{1}{M}[F(0) - C\dot{x}_0 - K x_0] \qquad (76)$$

The value of $x_{-1}$ is obtained by evaluating the backward Taylor expansion about $i = 0$ to get

$$x_{-1} = x_0 - h\dot{x}_0 + \frac{h^2}{2}\ddot{x}_0 \qquad (77)$$

Equations (75)–(77) constitute the central difference method for the viscously damped single-degree-of-freedom vibrating system.
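Equations (75)–(77) translate almost line for line into code. The sketch below is one possible arrangement; the function name, argument order, and the test case (free vibration with $M = K = 1$, $C = 0.2$) are assumptions chosen so that the result can be compared with the known closed-form solution of the damped oscillator.

```python
import numpy as np

def central_difference(M, C, K, F, x0, v0, h, n_steps):
    """Integrate M*x'' + C*x' + K*x = F(t) with the recurrence of Eq. (75)."""
    t = h * np.arange(n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    a0 = (F(0.0) - C * v0 - K * x0) / M            # Eq. (76): initial acceleration
    x_prev = x0 - h * v0 + 0.5 * h ** 2 * a0       # Eq. (77): fictitious value x_{-1}
    denom = M / h ** 2 + C / (2.0 * h)
    for i in range(n_steps):
        x[i + 1] = (F(t[i])
                    + (2.0 * M / h ** 2 - K) * x[i]
                    + (C / (2.0 * h) - M / h ** 2) * x_prev) / denom   # Eq. (75)
        x_prev = x[i]
    return t, x

# Hypothetical test case: free vibration from x0 = 1, v0 = 0
M_, C_, K_ = 1.0, 0.2, 1.0
t, x = central_difference(M_, C_, K_, lambda t: 0.0, x0=1.0, v0=0.0, h=0.01, n_steps=1000)

# Closed-form underdamped response for comparison
wn = np.sqrt(K_ / M_)
zeta = C_ / (2.0 * np.sqrt(K_ * M_))
wd = wn * np.sqrt(1.0 - zeta ** 2)
x_exact = np.exp(-zeta * wn * t) * (np.cos(wd * t) + zeta * wn / wd * np.sin(wd * t))
print(np.max(np.abs(x - x_exact)))
```

Halving `h` should reduce the reported error by roughly a factor of four, in line with the second-order accuracy of the central differences.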

The second finite difference method, known as the Runge-Kutta method, is also based on Taylor series expansions. The fourth-order Runge-Kutta method presented below matches the Taylor series expansion up to terms of order $h^4$ without explicitly computing derivatives beyond the first. It does this by judiciously combining four different evaluations of the first derivative. This method is popular because it is self-starting, i.e., it uses only the initial conditions to compute $x_1$, and results in good accuracy.

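Since the Runge-Kutta method approximates first derivatives, the second-order differential equation needs to be converted into a system of two first-order equations. This means that the differential equation for the single-degree-of-freedom viscously damped system

$$\ddot{x} = \frac{1}{M}[F(t) - Kx - C\dot{x}] \qquad (78)$$

becomes the system of equations

$$\dot{x} = y$$

$$\dot{y} = \frac{1}{M}[F(t) - Kx - Cy] \qquad (79)$$

By defining

$$\mathbf{X} = \begin{pmatrix} x \\ y \end{pmatrix} \qquad (80)$$

and

$$\mathbf{G} = \begin{pmatrix} y \\ \frac{1}{M}[F(t) - Kx - Cy] \end{pmatrix} \qquad (81)$$

the fourth-order Runge-Kutta method results in the following recurrence formula

$$\mathbf{X}_{i+1} = \mathbf{X}_i + \frac{1}{6}[\mathbf{K}_1 + 2\mathbf{K}_2 + 2\mathbf{K}_3 + \mathbf{K}_4] \qquad (82)$$

where the increments $\mathbf{K}_j$ (not to be confused with the stiffness $K$) are

$$\mathbf{K}_1 = h\,\mathbf{G}(\mathbf{X}_i, t_i)$$

$$\mathbf{K}_2 = h\,\mathbf{G}(\mathbf{X}_i + 0.5\,\mathbf{K}_1, t_i + 0.5\,h)$$

$$\mathbf{K}_3 = h\,\mathbf{G}(\mathbf{X}_i + 0.5\,\mathbf{K}_2, t_i + 0.5\,h)$$

$$\mathbf{K}_4 = h\,\mathbf{G}(\mathbf{X}_i + \mathbf{K}_3, t_{i+1}) \qquad (83)$$

Equations (82) and (83) constitute the fourth-order Runge-Kutta method.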
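A minimal sketch of Eqs. (80)–(83) follows. The function names and the test case (free vibration with $M = K = 1$, $C = 0.2$, unit initial displacement) are illustrative assumptions; the closed-form damped response is used only as a check.

```python
import numpy as np

def rk4(G, X0, t0, h, n_steps):
    """Advance X' = G(X, t) with the recurrence of Eqs. (82) and (83)."""
    X = np.array(X0, dtype=float)
    t = t0
    history = [X.copy()]
    for _ in range(n_steps):
        K1 = h * G(X, t)
        K2 = h * G(X + 0.5 * K1, t + 0.5 * h)
        K3 = h * G(X + 0.5 * K2, t + 0.5 * h)
        K4 = h * G(X + K3, t + h)
        X = X + (K1 + 2.0 * K2 + 2.0 * K3 + K4) / 6.0   # Eq. (82)
        t += h
        history.append(X.copy())
    return np.array(history)

# Hypothetical single-DOF system; G is the vector of Eq. (81)
M_, C_, K_ = 1.0, 0.2, 1.0

def F(t):
    return 0.0                                          # free vibration

def G(X, t):
    x, y = X
    return np.array([y, (F(t) - K_ * x - C_ * y) / M_])

h, n_steps = 0.01, 1000
states = rk4(G, X0=[1.0, 0.0], t0=0.0, h=h, n_steps=n_steps)
t = h * np.arange(n_steps + 1)

# Closed-form underdamped response for comparison
zeta = C_ / (2.0 * np.sqrt(K_ * M_))
wd = np.sqrt(K_ / M_) * np.sqrt(1.0 - zeta ** 2)
x_exact = np.exp(-zeta * t) * (np.cos(wd * t) + zeta / wd * np.sin(wd * t))
print(np.max(np.abs(states[:, 0] - x_exact)))
```

Only the initial state is needed to start the recurrence, which is the self-starting property noted above.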

VII. CONCLUSIONS

The subject of vibration covers a wide area with many interesting analytic techniques and methods of computation. Obviously many of these areas cannot be presented comprehensively in a short summary article such as this and have thus been omitted.

The digital computer has made possible the solution of problems that previously defied computation and has revolutionized our treatment of these problems. It has introduced new concepts of analysis such as the finite element approach, which is capable of solving very large structural problems.

Two general areas of vibration that differ markedly from the subjects presented here should be mentioned briefly. The first area is the vibration of nonlinear systems. Its most important difference arises from the fact that the principle of superposition, which plays a major role in the vibration theories of linear systems, no longer applies to the nonlinear system. Mathematical difficulties are encountered in solving nonlinear differential equations. However, there is no particular difficulty in obtaining numerical solutions with the digital computer.

The second area that requires a different approach is that of random vibrations. These are vibrations produced by forces varying in a random manner, which can be defined only by probability and statistical terms. For example, air gusts encountered by an airplane in flight can be defined only in terms of statistical averages and probability of encounter. Obviously, the response of a structure to such random excitation is also random and must be defined in terms of statistics and probability. Any one of these areas would require extensive study.

SEE ALSO THE FOLLOWING ARTICLES

ELASTICITY • MECHANICS, CLASSICAL • NONLINEAR DYNAMICS • NUMERICAL ANALYSIS • WAVE PHENOMENA



Encyclopedia of Physical Science and Technology EN017B-822 August 2, 2001 19:4

Wave Phenomena
Norman Bleistein
Colorado School of Mines

I. Waves in One Dimension: Fundamental Concepts
II. Waves in Higher Dimensions

GLOSSARY

Amplitude Local peak amplitude of a wave form; function $A$.

Dot product $\mathbf{k} \cdot \mathbf{x} = k_1 x_1 + k_2 x_2 + k_3 x_3$ in three dimensions, $k_1 x_1 + k_2 x_2$ in two dimensions.

Frequency Temporal (local) rate at which a wave repeats its fundamental form; ω in units of radians/second, $f = \omega/2\pi$ in units of cycles/second or hertz.

Group speed Speed at which energy propagates; in one dimension, $|d\omega/dk|$; in higher dimensions, $|\nabla_{\mathbf{k}}\, \omega(\mathbf{k})|$.

Group velocity Velocity vector that describes the magnitude and the direction of the propagation of energy; in one dimension, $d\omega/dk$; in higher dimensions, $\nabla_{\mathbf{k}}\, \omega(\mathbf{k})$.

Incident wave Wave on one side of a surface whose propagation is toward the surface.

Period Elapsed time for one cycle of a wave; $2\pi/\omega = 1/f$.

Phase The function $kx - \omega t$ or $\mathbf{k} \cdot \mathbf{x} - \omega t$ in the waveforms above.

Phase speed Speed at which crests of a wave propagate; in one dimension, $|\omega/k|$; in higher dimensions, $|\omega|/k$, with $k$ being the magnitude of the wave vector $\mathbf{k}$.

Phase velocity Both speed and direction at which wave crests propagate; $\mathbf{k}\, \omega(\mathbf{k})/k^2$.

Rays Trajectories along which the constituent components of a wave (wave vector, frequency, phase, and energy) propagate.

Reflected wave Wave arising at a surface of discontinuity (interface↔reflector) in the propagation parameters of a medium; this wave propagates on the same side of the interface as the incident wave.

Refracted wave Wave arising at a surface of discontinuity (interface) in the propagation parameters of a medium; this wave propagates on the opposite side of the interface from the incident wave, and the propagation direction of this wave and the propagation direction of the incident wave satisfy Snell’s law.

Snell’s law Law relating the directions of incidence and refraction of a wave at an interface. [See Eq. (79).]

Stationary phase Method for obtaining an approximation (asymptotic expansion) of an integral with an oscillatory factor, such as a Fourier superposition integral.

Wavelength Fundamental length scale over which a wave repeats itself; $2\pi/k$.

Wavenumber Spatial rate at which cycles of a wave occur; coefficient $k$; for the higher dimensional case, $k$ is the magnitude of $\mathbf{k}$.

Wave vector $\mathbf{k} = (k_1, k_2, k_3)$ in three dimensions; $(k_1, k_2)$ in two dimensions; the notation $(k_x, k_y, k_z)$ is also used.

THE PHENOMENON of wave motion is the primary mechanism by which a disturbance transfers energy over a distance in a medium. The propagation of this energy is thought of as being wavelike when it can be characterized by some feature (e.g., a crest) that is at least partially preserved as recognizable during the propagation over a distance or time interval. The most common wave phenomena are acoustic (sound), elastic (seismic), electromagnetic (light, radio, or television), or gravitational (surface water) waves; there are many others. Certain features of the propagation of waves are common to all wave phenomena, no matter what the medium.

This article is a partial description of the broad class of common features of wave phenomena as seen in their mathematical description. Where it is necessary to distinguish between linear and nonlinear waves, this discussion is further limited to the former. Even a textbook-sized discussion would inevitably omit some common features of waves, linear and nonlinear, because of the breadth of the subject. This, then, is one author’s choice of a fundamental subset of common features of linear wave phenomena.

The discussion starts with one-dimensional wave propagation. We start with definitions of the features of a single sinusoidal wave: amplitude, wavelength, wavenumber, period, and frequency. We then proceed to a simple superposition of two waves to introduce the distinction between phase speed/velocity and group speed/velocity.

These simple ideas then become a point of departure for the discussion of Fourier superposition. This is a powerful tool for deriving analytical representations of solutions of wave equations in homogeneous media. It further has application to provide exact representation in some cases of heterogeneous media and, beyond that, it provides approximate representations of wave fields in an even larger class of heterogeneous media.

However, when synthesizing waves over a continuum of wavenumbers, the identification of phase velocity and group velocity is obscured by the representation. In order to recapture those features of wave propagation, the method of stationary phase is introduced. It is shown that this approximation of the wave provides a conceptually simplified interpretation of the more complicated Fourier synthesis. In this simplified representation, the phase and group velocities of the individual elements of the Fourier synthesis again become apparent, but this representation is an approximation of the original integral. We present a numerical example to demonstrate the reliability of this approximation under appropriate dimensionless constraints on the physical parameters of the wave being represented.

The same development is repeated for higher dimensional wave propagation. In this case, there are additional features due to the dimensionality: The wavenumber is replaced by a wave vector; directionality plays an important role in the identification of phase and group velocities. Interestingly, these two velocities need not coalign.

Again, Fourier synthesis provides a means for describing more complicated waves and a multidimensional stationary phase provides a means of approximating those waves that admits simpler interpretation in terms of wave packets propagating with their own group velocity, while elements at specific wave vectors within the group travel with their own individual phase velocity.

The article closes with discussion of reflection and refraction of a three-dimensional plane wave by a planar reflector.

I. WAVES IN ONE DIMENSION: FUNDAMENTAL CONCEPTS

As a specific example to picture in our minds, let us suppose that we are describing the vertical displacement of points on a straight line (a string) as a function of transverse location (x) on the line and time (t). We shall denote the vertical displacement by u(x, t). As a simple example of that displacement, let us suppose that u is given by

u(x, t) = A cos(kx − ωt). (1)

In this equation, A, k, and ω are constants; for now, they are all positive constants. (Note that we could have as easily begun our discussion using a sine function instead of a cosine function.)

A. Amplitude, Phase, Wavelength, Wavenumber, Period, and Frequency

For each fixed value of t, the graph of u(x, t) in the (x, u)-plane is a cosine function of maximum height A called the amplitude of the wave. The argument of the cosine function [kx − ωt] is called the phase of the wave. The peaks or crests of the cosine function, that is, the points where u(x, t) = A, occur whenever

kx = 2nπ + ωt, n = . . . , −2, −1, 0, 1, 2, . . . (2)

The peaks are separated by a distance over which kx increases by 2π, namely a distance

λ = 2π/k (3)

called the wavelength of the wave represented by u(x, t) (Fig. 1). The constant k is called the wavenumber.

For fixed x and variable t, the graph of u(x, t) in a (t, u)-plane is analogous to what we have just described. The amplitude of u is again given by A, but now the peaks of the cosine function at fixed x occur at the times

ωt = 2mπ + kx, m = . . . , −2, −1, 0, 1, 2, . . . (4)


FIGURE 1 The wave of Eq. (1) for fixed t .

The elapsed time between peaks at fixed x is such that the increment in ωt is equal to 2π, given by a time

T = 2π/ω (5)

called the period of the wave motion. The constant ω is called the frequency (Fig. 2).

Of course, we can look at the wave as a function of x and t simultaneously; see Fig. 3. For this example, we have chosen ω = 2k. This manifests itself as an apparent compression of the wave crests in the t-direction as compared to the density of the wave crests in the x-direction.

Now think of a vertical plane parallel to the x–u-plane, that is, a constant-t plane. This provides a snapshot, such as the one in Fig. 1. Now consider moving that plane in the positive t-direction. From the figure, it should be apparent that each wave crest, each wave trough, and in fact every point of constant phase on the wave moves in the positive x-direction, increasing x. This is a manifestation of positive phase speed, a subject of the next section.

FIGURE 2 The wave of Eq. (1) for fixed x.

The units of the phase function in Eq. (1) are radians. Therefore, the units of the wavenumber k are radians per unit length, while the units of the frequency ω are radians per unit time. Because there are 2π radians per period or cycle, it is sometimes more convenient to use units of frequency and wavenumber that are scaled by 2π. Thus, the new variables have the dimensions of cycles per unit time or cycles per unit length. These variables are often denoted by $f$ and $f_x$, defined by

$$\omega = 2\pi f \quad \text{and} \quad k = 2\pi f_x, \qquad (6)$$

respectively. The units of $f$ are reciprocal time, often referred to as cycles per unit time, and the units of $f_x$ are reciprocal length, referred to as cycles per unit length. When the time unit is seconds, the units of $f$ are called hertz (Hz). In these units, the temporal period and the frequency are reciprocals of one another, as are the spatial period and wavenumber, now often referred to as the spatial frequency.

B. Phase Speed and Group Speed

Having examined the function u(x, t) for both fixed t and fixed x, we are now prepared to consider u when both x and t are allowed to vary. In particular, let us consider the graph in the (x, u)-plane. The peaks of u, as well as all the points of constant phase, and hence constant u, will move or propagate as time progresses. The rate ($v_\phi$) at which a point of constant phase will move is readily determined by setting the phase equal to a constant and differentiating that relationship with respect to t:

$$kx - \omega t = \text{const.}, \qquad v_\phi = dx/dt = \omega/k \qquad (7)$$

Thus, we see that the points of constant phase move with the speed ω/k, called the phase speed. When ω and k have the same sign, this motion is to the right; when ω and k have opposite signs, the motion is to the left.

The wave u(x, t) defined by Eq. (1) is periodic, having exactly the same shape in every interval whose length is given by the wavelength λ. It is also periodic in t, having the same shape in every temporal interval given by the period T. In reality, no wave can be periodic over all space and time. However, many wave phenomena are periodic on intervals of sufficient length (many multiples of λ) and/or for intervals of sufficient time (many multiples of T) to be considered periodic for all practical purposes. (A simple example would be alternating current in a transmission line or waveguide.) Indeed, the transmission of information in an otherwise periodic wave depends on local variations in amplitude (amplitude modulation) or phase (frequency modulation).

In many cases, the phase velocity $v_\phi$ varies with ω and k. Typically, the physics of a particular problem and its attendant mathematical model impose a relationship between k and ω, called a dispersion relation

ω = ω(k) (8)

Except in the special case in which ω = ck, with c independent of k, different frequencies will propagate at different speeds determined by the dispersion relation and the definition of $v_\phi$ in Eq. (7).

FIGURE 3 A space-time image of the wave of Eq. (1).

We next consider waves of two nearby frequencies and the same amplitude and ask how the composite wave, which is the sum of the two, will propagate. Thus, let us introduce the function

$$u(x, t) = A[\cos(k_+ x - \omega_+ t) + \cos(k_- x - \omega_- t)] \qquad (9)$$

In this equation, we have used $k_\pm$ and $\omega_\pm$ as shorthand notations for

$$k_\pm = k \pm \Delta k; \quad \omega_\pm = \omega \pm \Delta\omega \approx \omega \pm (d\omega/dk)\,\Delta k; \quad k = (k_+ + k_-)/2; \quad \omega = \omega(k) \qquad (10)$$

By using the appropriate trigonometric identity, we can rewrite the sum of cosine functions in Eq. (9) as a product of cosines:

$$u(x, t) = 2A \cos(\Delta k\, x - \Delta\omega\, t) \cos(kx - \omega t) \qquad (11)$$

Implicit in our notation is the assumption that Δk is much smaller than k, so that the wavelength 2π/Δk associated with the first cosine factor in this equation is much larger than the wavelength 2π/k associated with the second cosine factor. Thus, the first cosine factor acts as a slowly varying amplitude modulator, varying the amplitude 2A, which is the sum of the two amplitudes of the constituent waves of u(x, t). The wave of average wavenumber k and average frequency ω travels through the envelope at its phase speed $v_\phi = \omega/k$, while the envelope itself moves at its own speed, associated with the differentials Δk and Δω,

$$v_g = \Delta\omega/\Delta k \approx \left. d\omega/dk \right|_{k} \qquad (12)$$

known as the group speed.

Suppose that $v_\phi$ and $v_g$ are both positive. When $v_\phi > v_g$, the crests moving at the phase speed move forward through each wavelength of the packet created by the modulator of the amplitude; when $v_\phi < v_g$, the crests move backward through the packet. The former case (or, more precisely, $v_\phi \ge v_g$) is more typical, with $v_g$ having an upper bound, the characteristic speed of the medium (e.g., sound speed, light speed) through which the wave propagates, and $v_\phi$ having the characteristic speed as a lower bound.

In Fig. 4, we show a sum of the two waves of Eq. (9). They are of unit amplitude with k = π, Δk = 0.05k, and t = 1. Further, $\omega(k) = \sqrt{k^2 + \pi^2}$. The x-range here is 40 units in the given length scale. Thus, we see 20 cycles of the high-frequency wave over this range. On the other hand, the sum of the waves is equal to zero at x = 20, where Δk x = 0.05π × 20 = π, and the arguments of the two cosine functions are out of phase by π, making the sum equal to zero. In Fig. 5, we show the same wave at t = 4.353. The peaks and the zero of the envelope have moved forward and the peaks of the fast cycles do not occupy the same positions in the envelope. Actually, they have moved forward, as well.
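The two-wave superposition is easy to check numerically. The sketch below uses the quoted parameters (k = π, Δk = 0.05k, $\omega(k) = \sqrt{k^2 + \pi^2}$) and confirms that the sum of cosines equals the product form. One assumption is made explicit in the code: the product identity is exact when the carrier frequency and Δω are taken as the half-sum and half-difference of $\omega_\pm$, which differ slightly from ω(k) and $(d\omega/dk)\Delta k$. Variable names are illustrative.

```python
import numpy as np

def omega(k):
    """Dispersion relation quoted for the example: omega = sqrt(k^2 + pi^2)."""
    return np.sqrt(k ** 2 + np.pi ** 2)

A = 1.0
k_avg = np.pi
dk = 0.05 * k_avg
k_plus, k_minus = k_avg + dk, k_avg - dk
w_plus, w_minus = omega(k_plus), omega(k_minus)

# Half-sum and half-difference of the two frequencies (exact for the identity)
w_avg = 0.5 * (w_plus + w_minus)
dw = 0.5 * (w_plus - w_minus)

x = np.linspace(0.0, 40.0, 2001)
t = 4.353
u_sum = A * (np.cos(k_plus * x - w_plus * t) + np.cos(k_minus * x - w_minus * t))
u_product = 2.0 * A * np.cos(dk * x - dw * t) * np.cos(k_avg * x - w_avg * t)

v_phase = omega(k_avg) / k_avg          # omega/k at the average wavenumber
v_group = k_avg / omega(k_avg)          # d(omega)/dk = k/omega for this relation
v_envelope = dw / dk                    # finite-difference envelope speed

print(np.max(np.abs(u_sum - u_product)))
print(v_phase, v_group, v_envelope)
```

For this dispersion relation the phase speed is $\sqrt{2}$ and the group speed is $1/\sqrt{2}$ in the example's units, so the crests advance through the envelope, which is the case $v_\phi > v_g$ discussed above.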

FIGURE 4 An example of Eq. (9) at t = 0.


FIGURE 5 The same example of Eq. (9) as in Fig. 4 at t = 4.353.

An important feature of the group speed is that the energy of the wave residing in the wavenumbers near k will propagate at this speed. Thus, if a localized disturbance is created, it is the group speed that will determine how much time will elapse before this portion of the disturbance is observed at a distance.

C. Fourier Superposition

These ideas extend in a natural way to the Fourier superposition of waves expressed as

$$u(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} A(k) \exp\{i[kx - \omega(k)t]\}\, dk \tag{13}$$

In this equation, we think of A(k) dk/2π as the amplitude of a wave with wavenumber k and frequency ω(k). The integration (summation) is then a superposition over all values of k. The values of k for which A(k) is nonzero are called the spectrum of the wave u(x, t). The product A(k) dk must have the same dimensions as u itself. Thus, A(k) must have the dimensions of u/unit-length of k; that is, A(k) is a density, called the spectral density of the wave u(x, t).

We have used the complex exponential for our Fourier superposition, but we assume that the amplitude function A(k) is such that the resulting integral is real. For example, suppose that ω(k) were an odd function of k so that negative frequencies yielded an exponential function for negative k that is the complex conjugate of its values for positive k. Then, when the real part of A, Re A, is an even function of k and the imaginary part of A, Im A, is an odd function of k, u(x, t) would be real. Under other assumptions on ω(k), other constraints on A would make the resulting integral real. Alternatively, we could simply require that u be defined by the real part of the integral on the right.
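The reality condition just described can be illustrated with a discrete stand-in for Eq. (13). This is only a sketch: the odd dispersion relation ω = Ck, the amplitude A(k) = exp(−k²)(1 + ik), and the grid parameters are assumptions chosen for illustration, not taken from the article.

```python
import cmath
import math

C = 1.0  # assumed wave speed; omega(k) = C*k is an odd function of k

def omega(k):
    return C * k

def amplitude(k):
    """Hypothetical A(k): the real part exp(-k^2) is even, the imaginary part k*exp(-k^2) is odd."""
    return math.exp(-k * k) * complex(1.0, k)

def u_discrete(x, t, k_max=6.0, n=600):
    """Riemann-sum approximation of Eq. (13) on a grid symmetric about k = 0."""
    h = 2.0 * k_max / n
    total = 0j
    for j in range(n + 1):
        k = -k_max + j * h
        total += amplitude(k) * cmath.exp(1j * (k * x - omega(k) * t))
    return total * h / (2.0 * math.pi)

value = u_discrete(0.8, 0.3)
print("real part:", value.real)
print("imaginary part (should vanish to rounding error):", value.imag)
```

The contributions from k and −k are complex conjugates of one another, so the imaginary parts cancel pairwise and only the real part survives.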

D. Stationary Phase Formula

Let us suppose in Eq. (13) that A(k) is nonzero only for values of |k| larger than some minimum value, say k0. We define ω0 = |ω(k0)| as the associated frequency. We assume that |ω(k)| ≥ ω0 whenever |k| ≥ k0. We then rewrite the exponent in Eq. (13) as

kx − ω(k)t = ω0t [kx/ (ω0t) − ω(k)/ω0] (14)

In this form, we may think of ω0t as playing the role of a dimensionless parameter to be denoted by Λ (see below) and the expression kx/(ω0t) − ω(k)/ω0 as a dimensionless phase function with independent variable k. We could as well make the independent variable dimensionless by scaling k by k0; that is, k/k0 = η. Later, we will describe the analysis of integrals such as Eq. (13) in terms of such dimensionless variables.

In practice, the parameter Λ is often large. We offer the following interpretation of this requirement. Let us denote by T0 the period associated with the minimum frequency ω0; that is, T0 = 2π/ω0. Then 2πt/T0 must be large. That is, the observation time multiplied by 2π must be “many” periods at the minimum frequency. Most often, this requirement is stated in a form that puts the burden on the frequency rather than the time. That is, the frequency is such as to make ω0t large. Thus, we may think of large Λ as characterizing high frequency. Although we have described this as being “many” periods, note that the factor of 2π in the expression 2πt/T0 provides some help in this matter. In practice, one often finds that

2π t/T0 ≥ π ⇒ t ≥ T0/2

is good enough! That is, the asymptotic approximation that is described below provides a “reasonably” accurate description of the integral, Eq. (13), for times beyond a half period.

By scaling out the factor k0x, we could have obtained an interpretation in terms of propagation over many units of inverse wavenumber instead of many periods. In either case, we must only require that, after scaling, the dimensionless derivatives should be bounded and should not be comparable in magnitude to the dimensionless large parameter Λ. That is, the remaining phase function should be “slowly varying” when compared to Λ.

In this limit we can approximate the integral in Eq. (13) by the method of stationary phase. We state the basic result for one-dimensional integrals here. In the following section, the result for multidimensional integrals will be presented. Suppose that

$$I = \int f(\eta) \exp[i\Lambda\Phi(\eta)]\, d\eta \tag{15}$$

with Λ being a large parameter, in practice at least 3 or π, as was used in the earlier discussion. Then the value of the integral will be dominated by its contributions from the neighborhood of certain points, say ηj, j = 1, 2, . . . , n, called stationary points, where the first derivative vanishes:

$$\frac{d\Phi}{d\eta} = 0, \quad \eta = \eta_j, \quad j = 1, 2, \ldots, n \tag{16}$$

When the second derivative at the stationary point does not vanish, the point is called a simple stationary point. In practice, it is most often the case that the stationary points are simple. Of course, the case of higher order stationary points (where a higher order derivative is the first nonvanishing derivative at the stationary point) occurs as well and leads to a rich theory of wave phenomena beyond the scope of the present discussion. We proceed under the assumption that the stationary points are simple. In this case, the integral I defined by Eq. (15) is approximated by

$$I \sim \sum_{j=1}^{n} \sqrt{\frac{2\pi}{|\Lambda|\,|\Phi''(\eta_j)|}}\; f(\eta_j) \exp\left[i\Lambda\Phi(\eta_j) + i(\pi/4)\,\mathrm{sgn}(\Lambda)\,\mathrm{sgn}(\Phi''(\eta_j))\right] \tag{17}$$

This is the stationary phase formula for the case of a simple stationary point. In this equation, we have used prime (′) to denote differentiation with respect to η. The notation sgn(Λ) means “sign of Λ.” The symbol “∼” is to be read as “is asymptotically equal to.” It means that the error approaches zero more rapidly than the terms of the sum, that is, more rapidly than a constant over √|Λ|, as Λ → ∞. Usually the error is bounded by a constant over |Λ| or a constant over |Λ|^(3/2).

Despite the formal statement addressing the error only in the limit as |Λ| → ∞, we repeat that in practice |Λ| greater than 3 or π (use whichever is convenient) would seem to suffice. For example, when this asymptotic approximation is used to estimate the zeroth-order Hankel function of the first kind for its argument equal to 3, that is, H_0^(1)(3), the error turns out to be only about 6%, sufficiently small for a qualitative understanding of how the function in question behaves and even adequate for purposes of modeling of real-world wave phenomena.

The method of stationary phase quantifies the following qualitative ideas about the integration of a function with a “rapidly varying” kernel, that is, a multiplier such as the exponential function, with real and imaginary parts each having intervals of positive function values closely adjacent to intervals of negative function values. When the amplitude function does not vary as rapidly as the kernel, the integral over a positive lobe tends to cancel the integral over the adjacent negative lobe. The cancellation is slightly less when the rapid variation is diminished, that is, when the phase is stationary. The stationary phase formula then approximates the integral over an interval around such a stationary point. The resulting Eq. (17) states that the integral over the entire interval is dominated by contributions from the neighborhoods of the stationary points.

Next, we will apply the stationary phase formula to the integrals such as those in Eq. (13). We will not always bother to rescale that Fourier representation or to introduce a dimensionless variable of integration η. We shall proceed formally in our dimensional variables, with the understanding that a complete justification of our asymptotic approximation relies on an analysis such as the one presented here. Thus, we will apply the results of this section with η replaced by k and Λ set equal to unity.
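As a hedged illustration of Eq. (17), consider a test integral that is not taken from this article: f(η) = exp(−η²) and Φ(η) = η², integrated over the whole real line. It has a single simple stationary point at η = 0 and the closed form √(π/(1 − iΛ)), so the error of the stationary phase estimate can be tabulated directly. The function names below are illustrative.

```python
import cmath
import math

def exact_integral(lam):
    """Closed form of the integral of exp(-eta^2) * exp(i*lam*eta^2) over the real line."""
    return cmath.sqrt(math.pi / (1.0 - 1j * lam))

def stationary_phase(lam):
    """Eq. (17) for f = exp(-eta^2), Phi = eta^2: one stationary point at eta = 0."""
    f0, phi0, phi2 = 1.0, 0.0, 2.0   # f(0), Phi(0), Phi''(0)
    sign = math.copysign(1.0, lam) * math.copysign(1.0, phi2)
    magnitude = math.sqrt(2.0 * math.pi / (abs(lam) * abs(phi2))) * f0
    return magnitude * cmath.exp(1j * (lam * phi0 + 0.25 * math.pi * sign))

for lam in (3.0, 10.0, 30.0, 100.0):
    err = abs(stationary_phase(lam) - exact_integral(lam))
    # The last column levels off, consistent with an error bounded by a constant over lam^(3/2).
    print(lam, err, err * lam ** 1.5)
```

For this particular integrand the scaled error in the last column settles toward a constant as Λ grows, which is the behavior described above for the error term.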

E. Asymptotic Analysis of Fourier Superposition

We will now apply the method of stationary phase of the previous section to the integral in Eq. (13). To do so, we set

Φ(k) = kx − ω(k)t (18)

and differentiate

$$\frac{d\Phi}{dk} = x - \frac{d\omega}{dk}\,t; \qquad \frac{d^2\Phi}{dk^2} = -\frac{d^2\omega}{dk^2}\,t \tag{19}$$

In accordance with Eq. (16), we set the first derivative equal to zero to determine the stationary points:

x = (dω/dk)t (20)

The function dω/dk was defined earlier to be the group velocity (this derivative can be positive or negative) at the given value of k. We see here that, for a given value of x and t, the stationary points are those k values for which the corresponding wave component would propagate at the group velocity dω(k)/dk from the origin to the point x in time t. We remind the reader that the method of stationary phase provides an approximation to the integral over an interval around the stationary point. Thus, the condition of stationarity predicts that the packet of wavenumbers around the stationary value will propagate at the group velocity of the stationary value. This theme will repeat itself in higher dimensions.

We now write the asymptotic approximation of Eq. (17) to the integral of Eq. (13) as

$$u(x, t) \sim \sum_{j=1}^{n} \frac{A(k_j)}{\sqrt{2\pi\,|\omega''(k_j)|\,t}} \exp\left\{i[k_j x - \omega(k_j)t] - i(\pi/4)\,\mathrm{sgn}(\omega''(k_j))\right\} \tag{21}$$

In this equation, the summation is to be carried out over the solutions of the equation of stationarity [Eq. (20)]. We see here that each term of the sum has the structure of the fundamental waveform of Eq. (1), except that the real-valued amplitude and cosine functions have been replaced by a complex-valued amplitude and complex exponential. That is, asymptotically, the general Fourier superposition of elementary waves behaves locally like the elementary wave, except that the phase and group velocities of the elementary waves will vary with both position and time.

This observation suggests an alternative manner in which to interpret the result of Eq. (21). Let us fix the value of k. Then we think of the packet of wavenumbers in the neighborhood of that k value as propagating at the group velocity dω(k)/dk with amplitude and phase being given by the summand of Eq. (21) evaluated at k. For some applications, this interpretation is as useful as the actual evaluation at a given (x, t) as defined by the summation in Eq. (21). Indeed, this interpretation provides a quantification of the definition of a wave. We see here a phase function whose crests propagate as time progresses, while the amplitude of the wave, providing the height of the crests, also varies as time progresses.

It may not be apparent why the propagation originates from the origin for this example. To understand why this is so, let us consider the wave represented by Eq. (13) at t = 0:

$$u(x, 0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} A(k) \exp(ikx)\, dk \tag{22}$$

Let us rewrite this integral in terms of the dimensionless variable η = k/k0:

$$u(x, 0) = \frac{k_0}{2\pi} \int_{-\infty}^{\infty} A(\eta k_0) \exp(i k_0 x \eta)\, d\eta \tag{23}$$

As the product k0x approaches infinity, the integral will approach zero under relatively mild assumptions on the amplitude A. (The Riemann–Lebesgue lemma guarantees this result if |A(k)| is integrable.) Thus, we might expect that u(x, 0) will be small for large values of k0x and will be substantially different from zero only in some interval around the origin in which k0x is not large. Consequently, to the order of approximation consistent with our asymptotics, the propagation of u(x, t) initiates from the neighborhood of the origin in x. In application, the Fourier representation may well contain other terms in the phase that distribute the initiation point of different components of the wave u(x, t) over a range of x values. For example, we might replace A(k) in Eq. (13) by A(k) exp[iφ0(k)]. We would then add derivatives of φ0 to the right sides in Eq. (19). In particular, −φ0′(k) would replace the origin as the initial value of x in Eq. (20). However, even in those cases, the propagation of the constituent elements of u(x, t) would still be governed by the group velocity, as was the case for this simple example.

F. An Example of Dispersive Wave Propagation

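We will discuss a simple example of wave propagation that will exhibit some of the features we have described in the previous section.

Let us suppose that u(x, t) is a solution of the following initial value problem:

$$\begin{aligned} &\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2} + b^2 u = 0, && t > 0, \quad -\infty < x < \infty \\ &u = 0, \quad \frac{\partial u}{\partial t} = \delta(x), && t = 0 \end{aligned} \tag{24}$$

The function δ(x) is the Dirac delta function.

We will solve the problem for u by Fourier transform. Thus, we introduce

$$\tilde{u}(k, t) = \int_{-\infty}^{\infty} u(x, t) \exp(-ikx)\, dx \tag{25}$$

By applying the Fourier transform to the problem of Eq. (24), we obtain the following problem for ũ:

$$\begin{aligned} &\frac{d^2\tilde{u}}{dt^2} + (c^2k^2 + b^2)\,\tilde{u} = 0, && t > 0 \\ &\tilde{u} = 0, \quad \frac{d\tilde{u}}{dt} = 1, && t = 0 \end{aligned} \tag{26}$$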

We leave it to the reader to verify that the solution to this initial value problem is

$$\tilde{u}(k, t) = \frac{\exp[i\omega(k)t] - \exp[-i\omega(k)t]}{2i\,\omega(k)}, \qquad \omega(k) = \sqrt{c^2k^2 + b^2} \tag{27}$$

In this equation, we have allowed a slight abuse of notation. There are really two waves represented here: one with ω = ω(k) and the other with ω = −ω(k). Because the two dispersion relations define ω with only a difference in sign, we have introduced only one function ω(k).
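The verification left to the reader can also be spot-checked numerically. The sketch below writes Eq. (27) in its equivalent real form sin(ωt)/ω and tests the transformed problem of Eq. (26) with finite differences; the values c = 1 and b = 2π, the sample point, and the step sizes are assumptions made for illustration.

```python
import math

C, B = 1.0, 2.0 * math.pi   # assumed values of c and b

def omega(k):
    return math.sqrt(C * C * k * k + B * B)

def u_hat(k, t):
    """Eq. (27) in real form: [exp(i w t) - exp(-i w t)] / (2 i w) = sin(w t) / w."""
    w = omega(k)
    return math.sin(w * t) / w

def ode_residual(k, t, h=1e-4):
    """Left side of the differential equation in Eq. (26), with a central-difference second derivative."""
    second = (u_hat(k, t + h) - 2.0 * u_hat(k, t) + u_hat(k, t - h)) / (h * h)
    return second + omega(k) ** 2 * u_hat(k, t)

def initial_slope(k, h=1e-6):
    """Central-difference estimate of the time derivative at t = 0."""
    return (u_hat(k, h) - u_hat(k, -h)) / (2.0 * h)

print("residual of the ODE at k = 3, t = 0.7:", ode_residual(3.0, 0.7))
print("initial value:", u_hat(3.0, 0.0), " initial slope:", initial_slope(3.0))
```

The residual is small (limited by the finite-difference step), the initial value is zero, and the initial slope is one, as required by Eq. (26).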

We take the inverse Fourier transform of the solution in Eq. (27) to obtain an integral representation of the solution to the problem of Eq. (24):

$$u(x, t) = \frac{1}{4\pi i} \sum_{\pm} \pm \int_{-\infty}^{\infty} \frac{\exp[i\Phi_\pm(k, x, t)]}{\omega(k)}\, dk \tag{28}$$

where

$$\Phi_\pm(k, x, t) = kx \mp \omega(k)t = kx \mp \sqrt{c^2k^2 + b^2}\; t$$

Furthermore, the summation notation means that we add together the results for the upper and lower signs.

As a basis for comparison, it is worthwhile at this juncture to specialize the result here to the case in which there is no dispersion. That is, we consider the special case in which b = 0 and ω = ±ck. We then find that the solution of Eq. (28) becomes

$$u(x, t) = \frac{1}{4\pi i c} \sum_{\pm} \pm \int_{-\infty}^{\infty} \frac{\exp[ik(x \mp ct)]}{k}\, dk = \begin{cases} 1/2c, & |x| < ct \\ 0, & |x| > ct \end{cases} = \frac{1}{2c} H(ct - |x|) \tag{29}$$

In the final expression, H(x) is the Heaviside function, defined to be equal to zero for x < 0 and equal to one for x > 0. Its value at x = 0 is unimportant; however, if it is obtained by Fourier inversion, the value at x = 0 will be equal to one-half. The Fourier transform in this equation can be carried out by standard techniques of complex contour integration, or the result may be found in a standard table of Fourier transforms. We see here that the initial impulse has caused the value of u(x, t) to “jump” from zero to the value 1/2c everywhere on the “characteristic” interval (−ct, ct) and to remain there for all time. One can think of the initial data, nonzero only at the origin, as propagating to the right and left at speed c and affecting the value of u(x, t) everywhere inside the characteristic interval.

Let us now return to the dispersive wave represented by Eq. (28). This wave is by no means as easy to analyze because of the complicated form of the integrand. We will therefore resort to our asymptotic method in an attempt to reinterpret this solution, at least asymptotically, in terms of simpler functions.

This example provides us an excellent opportunity to consider the effects of scaling to dimensionless variables, as discussed in Section I.D. Thus, we introduce the new variable of integration η, defined by

η = ck/b (30)

As a check on dimensions, we note that c has the dimensions of length/time, whereas b must have the dimensions of 1/time for each term of the original Eq. (24) to have the same dimensions. Since k has the dimension of inverse length, η is indeed dimensionless.

In terms of η, the relevant functions of the integrand in Eq. (28) take the following form, with Ψ± denoting the dimensionless phase:

$$\begin{aligned} \omega(k)t &= bt\sqrt{\eta^2 + 1}, \qquad kx = \eta x b/c \\ \Phi_\pm(k, x, t) &= bt\,\Psi_\pm(\eta, x, t) \\ \Psi_\pm(\eta, x, t) &= \eta x/ct \mp \sqrt{\eta^2 + 1} \end{aligned} \tag{31}$$

We can see in this form that the large parameter emerges naturally as bt, that is, time measured in inverse units of a characteristic frequency of the original problem. Parenthetically, we note that this is also the minimum frequency of any Fourier component of the solution of Eq. (28). Furthermore, one can check that the maximum value of the η derivative of the dimensionless phase Ψ± is |x|/ct + 1. [It is more difficult to show from the representation of Eq. (28), but nonetheless true, that only values of |x| ≤ ct are of interest; otherwise the integral is identically zero.] At any distance, this bound on the derivative of Ψ± approaches unity as time increases.

The asymptotic analysis of each term in the sum in Eq. (28) proceeds as in the general case. The phase speeds [Eq. (7)] and the group speeds [Eq. (12)] for the two waves are given by

$$v_\phi = \pm\frac{\sqrt{c^2k^2 + b^2}}{k}; \qquad v_g = \pm\frac{c^2k}{\sqrt{c^2k^2 + b^2}} \tag{32}$$

We see here that the phase speeds are greater in magnitude than the speed c, while the group speeds are less in magnitude than c, for every finite value of k. Both have c as limit as |k| → ∞. The magnitude of the group velocity |vg| is a monotonically increasing function of |k|. Thus, wave packets centered around lower wavenumbers will propagate more slowly while wave packets centered around higher wavenumbers will propagate faster. On the other hand, if one could pick out waves at a particular frequency/wavenumber pair, those of lower wavenumber would have crests that propagate faster than those of higher wavenumber. In any case, we expect, then, that the shape of the initial data function will be distorted as time progresses.

We will carry out the stationary phase analysis on the phase functions Φ± defined in Eq. (28). Thus, following the method described in Section I.E, we calculate the first and second derivatives as in Eq. (19):

$$\frac{d\Phi_\pm}{dk} = x \mp v_g t; \qquad \frac{d^2\Phi_\pm}{dk^2} = \mp\frac{c^2b^2t}{(c^2k^2 + b^2)^{3/2}} \tag{33}$$

We now consider the condition of stationarity [Eq. (20)] for this example:

$$x = \pm\frac{c^2kt}{\sqrt{c^2k^2 + b^2}} \tag{34}$$

In many applications, it is not possible to invert this condition of stationarity to determine k as a function of x and t. In those cases, we content ourselves with a parametric solution of the form of Eq. (21) subject to the condition of Eq. (20). Indeed, in the discussion following those equations, we offered an interpretation of that representation of the solution. However, in this example it is possible to explicitly solve Eq. (34), and we now proceed to do so and thereby obtain an explicit asymptotic solution for this problem under the assumption that bt is large.

In these equations, we see that for the upper sign (+), x and k must have the same sign at the stationary point, while for the lower sign (−), x and k must be of opposite signs. With this observation, we solve Eq. (34) for the stationary values of k, namely, ±kstat:

$$k_{\mathrm{stat}} = \frac{bx}{c\sqrt{c^2t^2 - x^2}} \tag{35}$$

We see here that there are real solutions only for |x| ≤ ct. In the limit of equality, the stationary point moves off to infinity and the entire approximation technique breaks down. In fact, using (35) to compute the second derivative in (33), we find that

$$\Phi''_\pm(\pm k_{\mathrm{stat}}) = \mp\frac{(c^2t^2 - x^2)^{3/2}}{cbt^2} \tag{36}$$

In this dimensional form, the second derivative is seen to vanish in the limit, as x → ct. In such a limit, the stationary phase formula is invalid. If we had followed through on the dimensionless form, using (30), then

$$bt\,\frac{d^2\Psi_\pm}{d\eta^2} = \mp bt\left[1 - \frac{x^2}{(ct)^2}\right]^{3/2} \tag{37}$$

In this form, it is clear that our original guess at a large parameter, bt, must be tempered by the additional factor on the right. Thus, for x = 0, the second derivatives have magnitude bt, large enough to expect asymptotics to work by assumption. On the other hand, the method must break down near the front of propagation, where the last factor in this equation or the numerator of the previous equation is nearly equal to zero. There are more exotic asymptotic expansions that describe that region, as well, but the discussion of such techniques is beyond the scope of this article.
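The algebra leading from Eq. (34) to Eqs. (35) through (37) can be checked numerically for the upper sign. This is a sketch only: c = 1 and b = 2π are the values quoted for the figures, while the sample point (x, t) = (2, 5) and the function names are arbitrary illustrative choices.

```python
import math

C, B = 1.0, 2.0 * math.pi   # values of c and b quoted for the figures

def omega(k):
    return math.sqrt(C * C * k * k + B * B)

def k_stat(x, t):
    """Eq. (35); real only for |x| < C*t."""
    return B * x / (C * math.sqrt(C * C * t * t - x * x))

def stationarity_residual(x, t):
    """Eq. (34), upper sign, evaluated at k_stat: x - c^2 k t / omega(k)."""
    k = k_stat(x, t)
    return x - C * C * k * t / omega(k)

def second_derivative_general(k, t):
    """Second derivative of the phase from Eq. (33), upper sign."""
    return -C * C * B * B * t / (C * C * k * k + B * B) ** 1.5

def second_derivative_at_k_stat(x, t):
    """Eq. (36), upper sign."""
    return -(C * C * t * t - x * x) ** 1.5 / (C * B * t * t)

def dimensionless_second_derivative(x, t):
    """Right side of Eq. (37), upper sign."""
    return -B * t * (1.0 - x * x / (C * t) ** 2) ** 1.5

x, t = 2.0, 5.0
print("stationarity residual:", stationarity_residual(x, t))
print("Eq. (33) at k_stat vs. Eq. (36):",
      second_derivative_general(k_stat(x, t), t), second_derivative_at_k_stat(x, t))
# Since k = b*eta/c, the dimensional and dimensionless forms differ by the factor (c/b)^2.
print("(c/b)^2 times Eq. (37):", (C / B) ** 2 * dimensionless_second_derivative(x, t))
```

The last line uses the change of variable k = bη/c of Eq. (30), which relates the dimensional second derivative of Eq. (36) to the dimensionless one of Eq. (37) through the factor (c/b)².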

We now calculate the functions in the general formula of Eq. (21) for the specific example of Eq. (28) using Eq. (35). The result of that calculation is

$$u(x, t) \sim \frac{(c^2t^2 - x^2)^{-1/4}}{\sqrt{2\pi bc}} \cos\left(\frac{b}{c}\sqrt{c^2t^2 - x^2} - \frac{\pi}{4}\right) \tag{38}$$

This result should be compared to the exact solution,

$$u(x, t) = \frac{1}{2c} J_0\left[\frac{b}{c}\sqrt{c^2t^2 - x^2}\right], \qquad c^2t^2 \ge x^2. \tag{39}$$

Figure 6 shows the exact solution for t = 5, with c = 1, b = 2π, and 0 ≤ x ≤ 5; Fig. 7 shows the asymptotic solution for the same values, except that 0 ≤ x ≤ 4.98. We see here that the character of the solution to the dispersive problem is quite different from the solution, Eq. (29), to the nondispersive problem. At each fixed x, u(x, t) now oscillates in time (as described by the cosine factor) while it decays as 1/√t as time progresses. For this problem, these are the consequences of variable propagation speed for the elements of the Fourier decomposition of the initial data.

FIGURE 6 The exact solution for t = 5.

FIGURE 7 The asymptotic solution for t = 5.

Figure 8 shows an overlay of the two solutions. The agreement is apparent. Further, it can be seen that the wave slope increases with x. The reason is that the group velocity is a monotonic function of k. Thus, wave groups centered around large k-values propagate faster and therefore reside closer to the wave front. Larger k is related to more rapid variation and produces these larger slopes.

As noted earlier, we should expect good agreement even at bt = π. With b = 2π, that means t = 0.5. We show that agreement in Fig. 9. Here, again, the asymptotic solution is restricted, in this case, to an upper bound of 0.48. At least at this empirically claimed lower bound for asymptotics, some separation between the exact solution (solid curve) and the asymptotic solution (dashed curve) is visible. In fact, the difference between these two functions varies between −0.015 and +0.015 over the range displayed in this figure. We cannot speak of a global percentage error for these functions that pass through zero. However, the error at x = 0 is 4.6%; the error at x = 0.48 is 3.8%. Furthermore, the shift in the zero crossing between the exact and the asymptotic solution is only 0.009. In applications, the accuracy of observed data rarely matches the accuracy of the asymptotic expansion, even at this claimed lower limit of the range of validity of the asymptotic expansion. Thus, the valid use of asymptotic approximations in applications is not a factor in the overall accuracy of the analysis of data.

FIGURE 8 An overlay of the exact and asymptotic solutions for t = 5. The dashed curve is the asymptotic result, as in the previous figure.

FIGURE 9 An overlay of the exact and asymptotic solutions for bt = π. The dashed curve is the asymptotic result.
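The error figures quoted for bt = π can be reproduced with a short calculation. The sketch below evaluates the exact solution of Eq. (39) and the asymptotic solution of Eq. (38) with c = 1 and b = 2π; the Bessel function is computed from the standard integral representation J0(z) = (1/π)∫₀^π cos(z sin θ) dθ by a midpoint rule, which is an implementation choice made here rather than the method used to produce the figures.

```python
import math

C, B = 1.0, 2.0 * math.pi   # values of c and b quoted for Figs. 6 to 9

def bessel_j0(z, n=400):
    """J0(z) from (1/pi) * integral over [0, pi] of cos(z*sin(theta)), midpoint rule."""
    h = math.pi / n
    return sum(math.cos(z * math.sin((i + 0.5) * h)) for i in range(n)) / n

def exact_solution(x, t):
    """Eq. (39), valid for |x| <= C*t."""
    return bessel_j0(B / C * math.sqrt(C * C * t * t - x * x)) / (2.0 * C)

def asymptotic_solution(x, t):
    """Eq. (38), valid for |x| < C*t."""
    s2 = C * C * t * t - x * x
    amplitude = s2 ** -0.25 / math.sqrt(2.0 * math.pi * B * C)
    return amplitude * math.cos(B / C * math.sqrt(s2) - math.pi / 4.0)

def relative_error(x, t):
    exact = exact_solution(x, t)
    return abs(asymptotic_solution(x, t) - exact) / abs(exact)

t = 0.5   # bt = pi
for x in (0.0, 0.48):
    print(x, exact_solution(x, t), asymptotic_solution(x, t), relative_error(x, t))
```

At x = 0 and x = 0.48 the relative errors come out near 4.6% and 3.8%, in line with the values stated above.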

In summary, we have seen in this example how a fairly complicated solution, Eq. (28), to a dispersive wave equation [Eq. (24)] can be interpreted by asymptotic methods. In that interpretation [Eq. (38)] the distortion of the original waveform becomes more apparent and more easily recognized, especially when we compare this asymptotic solution to the exact dispersion-free solution [Eq. (29)]. Furthermore, we again see the structure of a wave as described in the introduction. The equiphase points, including the crests and troughs of the wave, are determined by setting the argument of the cosine function in Eq. (38) equal to a constant. The amplitude is seen to vary both spatially and temporally. As time progresses at a fixed point x, the amplitude decays algebraically to zero, while points of constant amplitude propagate outward from the origin as time progresses.

II. WAVES IN HIGHER DIMENSIONS

We will discuss here the extension of the concepts of the previous chapter to two and three dimensions. We remark, however, that in theory there is no reason to limit our discussion to three dimensions.

We will require a notation that allows us to refer to points in two- or three-dimensional space. Thus, let us introduce the boldface symbol x to denote a point or vector in two or three dimensions. For the two-dimensional case, the coordinates of the point or the components of the vector will be (x1, x2), whereas in three dimensions, x will denote the point or vector (x1, x2, x3). Many of the ideas we will express here will be independent of the number of dimensions.

Given two vectors x and k, we will denote by k · x the dot product of the two vectors, defined by

$$\mathbf{k} \cdot \mathbf{x} = \sum_{j=1}^{m} k_j x_j \tag{40}$$

with m being the dimension. We will denote by x (not boldface) the magnitude of the vector x, that is,

$$x = (\mathbf{x} \cdot \mathbf{x})^{1/2} \tag{41}$$

We will also use the notation x̂ to denote the unit vector in the direction of x, that is,

$$\hat{\mathbf{x}} = \mathbf{x}/x \tag{42}$$

With this notation in place, we can begin our discussion of waves in higher dimensions.

A. Plane Waves: Phase Velocity and Group Velocity

We will consider now the extension of the concepts of Section I to higher dimensions. Instead of considering the real periodic function in Eq. (1), or its alternate in which the cosine function is replaced by a sine function, we will consider here the complex exponential

u(x, t) = A exp[i(k · x − ωt)] (43)

It is to be understood that the wave we are considering is the real part of the function u(x, t) or a real superposition of such functions.

In two dimensions, the function u(x, t) might be thought of as the vertical displacement of a membrane or the vertical displacement of the surface of a pool of water. These are simple extensions of the concept of the vertical displacement of a string, suggested in the previous section.

For the three-dimensional case, such easily visualized wave phenomena are not available. Perhaps the easiest characterization in three dimensions might be the pressure variations or density variations of a compressible fluid, such as air. That is, one might think of sound waves. More generally, u(x, t) might represent one component of the motion of particles of an elastic medium or one component of the electric or magnetic vectors of electromagnetic propagation.

In any case, we will proceed to introduce the basic concepts of wave phenomena in higher dimensions in the context of the simple function given by Eq. (43) and its generalizations analogous to those introduced in the discussion of wave phenomena in one dimension.

Let us first consider the question of peaks of the real part of the wave of Eq. (43). These peaks are located at the positions

k · x = 2nπ + ωt, n = . . . , −2, −1, 0, 1, 2, . . . (44)

For fixed t, a specific peak (fixed n) occurs everywhere on a line in two dimensions or on a plane in three dimensions. In either two or three dimensions, the wave represented by Eq. (43) is called a plane wave. The inclination of this plane is given by the unit normal k̂. The normal distance of the plane from the origin is given by (2nπ + ωt)/k, with k now denoting the magnitude of k. Indeed, any constant value of the phase occurs on a plane with the same features, except that the distance from the origin is determined by the specific value of the phase rather than the value 2nπ. All of these planes are parallel (Fig. 10).

FIGURE 10 A snapshot at fixed time of a two-dimensional plane wave.

At fixed time, the normal distance between the planes of two peak values of u(x, t) is given by

λ = 2π/k (45)

Thus, we again denote by λ the wavelength of the wave represented by u(x, t). The scalar k is again called the wavenumber. The vector k is called the wave vector.

For fixed x, the elapsed time between two peaks of Re u in Eq. (43) is given by

T = 2π/ω (46)

As in the one-dimensional case, we call T the period of the wave and ω the frequency.

As time progresses, we can think of a plane of peak values of Re u(x, t) as defined by Eq. (43) (or any plane of constant phase) as propagating normal to itself. It will propagate in the direction of k when ω is positive or opposite to the direction of k when ω is negative. The speed at which the plane propagates can be determined by calculating how the point on the normal through the origin propagates. That is, we set

$$\mathbf{x} = \hat{\mathbf{k}}\, x\, \mathrm{sgn}(\omega) \tag{47}$$

and then replace the requirement of Eq. (44) by

$$k x\, \mathrm{sgn}(\omega) = 2n\pi + \omega t, \quad n = \ldots, -2, -1, 0, 1, 2, \ldots, \qquad \mathbf{x} = \hat{\mathbf{k}}\, x\, \mathrm{sgn}(\omega)$$

From this equation, we can see that the plane propagates normal to itself with a phase speed given by

vφ = |ω|/k (48)

The direction of this propagation is given by k̂ sgn(ω). Thus, we define the phase velocity by

$$\mathbf{v}_\phi = v_\phi\, \hat{\mathbf{k}}\, \mathrm{sgn}(\omega) = (\omega/k)\, \hat{\mathbf{k}} \tag{49}$$

This is the velocity with which planes of constant phase propagate.

In analogy with the one-dimensional case, let us now allow ω to be a function of k, that is, ω = ω(k). We will now consider how a wave composed of the sum of two plane waves of the form of Eq. (43) with nearby values of k might propagate. Thus, let us consider

$$u(\mathbf{x}, t) = A\left\{\exp[i(\mathbf{k}_+ \cdot \mathbf{x} - \omega_+ t)] + \exp[i(\mathbf{k}_- \cdot \mathbf{x} - \omega_- t)]\right\} \tag{50}$$

In this equation we have used k± and ω± as shorthand notations for

$$\begin{aligned} \mathbf{k}_\pm &= \mathbf{k} \pm \Delta\mathbf{k} \\ \omega_\pm &= \omega \pm \Delta\omega \approx \omega \pm \nabla_k\omega(\mathbf{k}) \cdot \Delta\mathbf{k} \\ \mathbf{k} &= (\mathbf{k}_+ + \mathbf{k}_-)/2, \qquad \omega = \omega(\mathbf{k}) \end{aligned} \tag{51}$$

We have denoted by ∇k the gradient of ω(k) with respect to k. The dot product occurring in the approximation of ω± is the extension to two or three dimensions of the two-term Taylor expansion appearing in Eq. (10).

By using these definitions in Eq. (50) and rewriting that sum in terms of the average k and Δk, we obtain the following representation of the superposition of two waves:

$$u(\mathbf{x}, t) = 2A \cos[\Delta\mathbf{k} \cdot (\mathbf{x} - \nabla_k\omega\, t)] \exp[i(\mathbf{k} \cdot \mathbf{x} - \omega t)] \tag{52}$$

As in the one-dimensional case, we see that the superposition of the two waves yields a wave at the average wave vector and frequency with an amplitude modulator provided by the perturbations in the average wave vector and frequency. The planes of constant phase of this modulator are of the form

$$\Delta\mathbf{k} \cdot (\mathbf{x} - \nabla_k\omega\, t) = \text{const} \tag{53}$$

These planes have normal direction given by Δk and propagate in the direction of ∇kω. Indeed, the velocity of propagation is given by

$$\mathbf{v}_g = \nabla_k\omega \tag{54}$$

which we define to be the group velocity. The magnitude of this vector, |∇kω|, is called the group speed. As in the one-dimensional case, we will see below that the group velocity will arise in a natural way when we consider Fourier superpositions of waves in the high-frequency limit.

We remark that the phase velocity and the group velocity need not be in the same direction. Indeed, they will only be in the same direction when ω = ω(k), that is, when ω is a function of the magnitude k rather than a function of the two or three independent components of k. We list some examples of both types:

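$$\begin{aligned} \omega &= ck, & \mathbf{v}_\phi &= c\,\hat{\mathbf{k}}, & \mathbf{v}_g &= c\,\hat{\mathbf{k}} \\ \omega &= \sqrt{c^2k^2 + b^2}, & \mathbf{v}_\phi &= \frac{\sqrt{c^2k^2 + b^2}}{k}\,\hat{\mathbf{k}}, & \mathbf{v}_g &= \frac{c^2k}{\sqrt{c^2k^2 + b^2}}\,\hat{\mathbf{k}} \\ \omega &= \omega_0 k_3/k, & \mathbf{v}_\phi &= \frac{\omega_0 k_3}{k^2}\,\hat{\mathbf{k}}, & \mathbf{v}_g &= \omega_0\left[-\frac{k_3}{k^2}\,\hat{\mathbf{k}} + \frac{(0, 0, 1)}{k}\right] \\ \omega &= ck + Uk_1, & \mathbf{v}_\phi &= \left[c + Uk_1/k\right]\hat{\mathbf{k}}, & \mathbf{v}_g &= c\,\hat{\mathbf{k}} + (U, 0, 0) \end{aligned} \tag{55}$$

The third example here arises in the modeling of waves in a rotating fluid, and the fourth example arises in the modeling of waves in a transversely moving medium.

As in the one-dimensional case, it is the group velocity, now a vector, that governs the propagation of energy over a distance.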
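The third entry of Eq. (55) gives a concrete case in which the two velocities point in different directions. The sketch below compares the listed group velocity with a finite-difference gradient of ω(k) = ω0 k3/k and evaluates its dot product with the phase velocity; the value ω0 = 1, the sample wave vector, and the helper names are assumptions made for illustration.

```python
import math

OMEGA0 = 1.0   # assumed value of omega_0

def magnitude(k):
    return math.sqrt(sum(c * c for c in k))

def omega(k):
    """Third dispersion relation of Eq. (55): omega = omega_0 * k3 / |k|."""
    return OMEGA0 * k[2] / magnitude(k)

def group_velocity_numeric(k, h=1e-6):
    """Central-difference gradient of omega with respect to the components of k."""
    grad = []
    for j in range(3):
        up, down = list(k), list(k)
        up[j] += h
        down[j] -= h
        grad.append((omega(up) - omega(down)) / (2.0 * h))
    return grad

def group_velocity_formula(k):
    """Group velocity as listed in Eq. (55): omega_0 * [-(k3/k^2) k_hat + (0, 0, 1)/k]."""
    mag = magnitude(k)
    k_hat = [c / mag for c in k]
    coeff = -k[2] / mag ** 2
    return [OMEGA0 * coeff * k_hat[0],
            OMEGA0 * coeff * k_hat[1],
            OMEGA0 * (coeff * k_hat[2] + 1.0 / mag)]

def phase_velocity(k):
    """Eq. (49): (omega/k) times the unit vector along k."""
    mag = magnitude(k)
    return [omega(k) / mag * c / mag for c in k]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

k = (1.0, 2.0, 2.0)
print("group velocity (numeric):", group_velocity_numeric(k))
print("group velocity (formula):", group_velocity_formula(k))
print("v_phase . v_group:", dot(phase_velocity(k), group_velocity_formula(k)))
```

Because this ω depends only on the direction of k and not on its magnitude, the group velocity has no component along k, so for this dispersion relation the group velocity is perpendicular to the phase velocity.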

B. Fourier Superposition

We now consider waves that are the Fourier superposition of plane waves of the type introduced above. Thus, let us set

$$u(\mathbf{x}, t) = \frac{1}{(2\pi)^m} \int_{-\infty}^{\infty} A(\mathbf{k}) \exp\{i[\mathbf{k} \cdot \mathbf{x} - \omega(\mathbf{k})t]\}\, d^m k \tag{56}$$

In this equation, the domain of integration is understood to be from −∞ to ∞ in all m independent k variables. For our purposes, m will be restricted to 2 or 3.

Such Fourier superpositions can be used to reconstruct a broad class of waves. Below, we describe three quite different types of waves and their corresponding Fourier transforms, A(k), along with the necessary dispersion relation. That is, we will provide the amplitudes of the integrand in (56), as well as the attendant function, ω(k), needed to complete the integrand in that equation. In all examples, m = 3.

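The first example is a periodic plane wave in three dimensions

$$
u(\mathbf{x}, t) = \cos[\mathbf{k}_0 \cdot \mathbf{x} - ck_0t]
$$

for which

$$
\begin{aligned}
A(\mathbf{k}) &= A_+(\mathbf{k}) + A_-(\mathbf{k}) \\
A_\pm(\mathbf{k}) &= 4\pi^3\, \delta(k_1 \mp k_{10})\, \delta(k_2 \mp k_{20})\, \delta(k_3 \mp k_{30}) \\
\omega &= \omega_\pm(\mathbf{k}) = \pm ck_0
\end{aligned}
\tag{57}
$$

Here, upper signs in the last two lines go together, as do the lower signs.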

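The second example is the Green’s function for the wave equation in a homogeneous medium:

$$
u(\mathbf{x}, t) = \frac{\delta(t - r/c)}{4\pi r}, \qquad r = \sqrt{x_1^2 + x_2^2 + x_3^2}
$$

for which

$$
A(\mathbf{k}) = A_+(\mathbf{k}) + A_-(\mathbf{k}), \qquad
A_\pm(\mathbf{k}) = \pm\frac{ic}{2k}, \qquad \omega_\pm(\mathbf{k}) = \pm ck.
\tag{58}
$$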

It should be noted that the singularity, $1/k$, in these amplitudes is actually quite mild in three dimensions, owing to the fact that the differential volume element written in spherical polar coordinates is $k^2 \sin\theta\, dk\, d\theta\, d\phi$. The multiplication by $k^2$ in the inverse transform assures that the volume integral will not be singular at $k = 0$.

Note also that if this representation is derived as the solution of a causal problem, that is, one for which $u = 0$ for $t < 0$, then it should only be used for $t > 0$. If not, it will actually yield a second wave, $\delta(t + r/c)/4\pi r$, propagating backwards in time! Use of a causal inverse Fourier transform in time will ensure that this wave does not arise. Discussion of causal Fourier transforms is beyond the scope of this article.

The last example is the distributional plane wave:

$$
u(\mathbf{x}, t) = \delta(x_1 - ct)
$$

for which

$$
A(\mathbf{k}) = (2\pi)^2\, \delta(k_2)\, \delta(k_3), \qquad \omega(\mathbf{k}) = ck_1
\tag{59}
$$
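As a quick consistency check (a worked step added here, not part of the original derivation), substituting the amplitude and dispersion relation of this last example into the Fourier superposition integral with m = 3 recovers the distributional plane wave. The two delta functions remove the integrals over k2 and k3, and the remaining integral over k1 is the standard Fourier representation of the delta function:

```latex
u(\mathbf{x}, t)
  = \frac{1}{(2\pi)^3} \int (2\pi)^2\, \delta(k_2)\, \delta(k_3)\,
    e^{i[\mathbf{k} \cdot \mathbf{x} - c k_1 t]}\, d^3k
  = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i k_1 (x_1 - ct)}\, dk_1
  = \delta(x_1 - ct)
```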

C. Multidimensional Stationary Phase Formula

In Eq. (56) we cannot, for arbitrary $A(\mathbf{k})$, as easily write down a closed-form recognizable function representing the wave propagating in space and time. In this case, we again resort to the method of stationary phase, this time multidimensional stationary phase, to approximate the multifold integral in terms of more familiar plane waves of the form of Eq. (43). First, we present the multidimensional stationary phase formula.

Let us suppose that the integral $I$ is defined by

$$
I(\lambda) = \int f(\boldsymbol{\eta}) \exp[i\lambda\Phi(\boldsymbol{\eta})]\, d^m\eta
\tag{60}
$$

In this equation, the single integral sign is understood to represent an $m$-fold integral over the $m$ variables $\eta_1, \eta_2, \ldots, \eta_m$. We are interested in an approximation of the integral for large values of $|\lambda|$.

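As in one dimension, the integral is dominated by contributions from the neighborhoods of certain critical points, $\boldsymbol{\eta}_1, \boldsymbol{\eta}_2, \ldots, \boldsymbol{\eta}_n$, called stationary points, where

$$
\frac{\partial\Phi(\boldsymbol{\eta})}{\partial\eta_p} = 0, \quad p = 1, 2, \ldots, m, \qquad \text{at } \boldsymbol{\eta} = \boldsymbol{\eta}_j, \quad j = 1, 2, \ldots, n
$$

This is the generalization of the condition of Eq. (15). A stationary point is called simple when the Hessian matrix, the matrix of second derivatives, has a nonzero determinant at that point. That is,

$$
\det[\Phi_{pq}(\boldsymbol{\eta}_j)] \neq 0; \qquad \Phi_{pq}(\boldsymbol{\eta}) = \frac{\partial^2\Phi(\boldsymbol{\eta})}{\partial\eta_p\, \partial\eta_q}, \qquad p, q = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n
\tag{61}
$$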

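The integral $I$ is then approximated by

$$
I(\lambda) \sim \sum_{j=1}^{n} \left(\frac{2\pi}{|\lambda|}\right)^{m/2} \frac{f(\boldsymbol{\eta}_j)}{\sqrt{|\det[\Phi_{pq}(\boldsymbol{\eta}_j)]|}} \exp\left[i\lambda\Phi(\boldsymbol{\eta}_j) + i(\pi/4)\, \mathrm{sgn}(\lambda)\, \mathrm{Sgn}(\Phi_{pq}(\boldsymbol{\eta}_j))\right]
\tag{62}
$$

In this equation, $\mathrm{Sgn}(\Phi_{pq})$ denotes the signature of the matrix $[\Phi_{pq}]$. The signature of a matrix is the number of positive eigenvalues minus the number of negative eigenvalues of the matrix. This result is the multidimensional stationary phase formula.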
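The formula can be tried on a case with a known closed form. The sketch below is illustrative only and rests on stated assumptions: it takes m = 2, the amplitude f = exp(-x² - y²), and the quadratic phase (s1 x² + s2 y²)/2, for which the integral is an elementary Gaussian integral with a single stationary point at the origin. The function names and the test case are not from the article.

```python
import cmath
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

def stationary_phase_term(lam, f_val, phi_val, hessian):
    """One term of the stationary phase sum for m = 2.

    f_val and phi_val are the amplitude and phase at the stationary point;
    hessian is the 2x2 matrix of second derivatives of the phase there.
    """
    (a, b), (_, d) = hessian
    det = a * d - b * b
    if det == 0.0:
        raise ValueError("stationary point is not simple (zero Hessian determinant)")
    eigs = symmetric_2x2_eigenvalues(a, b, d)
    # Signature: number of positive minus number of negative eigenvalues.
    signature = sum(1 for e in eigs if e > 0) - sum(1 for e in eigs if e < 0)
    m = 2
    prefactor = (2.0 * math.pi / abs(lam)) ** (m / 2) * f_val / math.sqrt(abs(det))
    sgn_lam = 1.0 if lam > 0 else -1.0
    return prefactor * cmath.exp(1j * (lam * phi_val + (math.pi / 4) * sgn_lam * signature))

def exact_gaussian_integral(lam, s1, s2):
    """Closed form of the integral of exp(-x^2 - y^2) exp(i lam (s1 x^2 + s2 y^2) / 2)
    over the whole plane (product of two one-dimensional Gaussian integrals)."""
    return math.pi / (cmath.sqrt(1 - 0.5j * lam * s1) * cmath.sqrt(1 - 0.5j * lam * s2))

if __name__ == "__main__":
    lam = 200.0
    for s1, s2 in [(1.0, 1.0), (1.0, -1.0)]:
        approx = stationary_phase_term(lam, 1.0, 0.0, [[s1, 0.0], [0.0, s2]])
        exact = exact_gaussian_integral(lam, s1, s2)
        print((s1, s2), "relative error:", abs(approx - exact) / abs(exact))
```

The two cases differ only in the signature of the Hessian (2 for the bowl-shaped phase, 0 for the saddle), which changes the phase of the leading term but not its magnitude.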

The qualitative description of the method of stationary phase is completely analogous to the discussion of the one-dimensional case. Each term in the sum in Eq. (62) is an approximation to the integral in a small domain around the stationary point.

As in the one-dimensional case discussed in Section I, a dimensionless large parameter can be identified for integrals of the type in Eq. (56) by recasting that integral in dimensional variables in terms of dimensionless variables. However, we will proceed formally to use this approximation in the dimensional integral of Eq. (56) with the formal large parameter equal to unity. As we demonstrated in Section I, this will produce an asymptotic approximation valid for large time measured in units of a characteristic time of the integral or large distance measured in units of a characteristic distance of the integral.

D. Asymptotic Analysis of Fourier Superposition

We will now apply the multidimensional stationary phase formula of Eq. (62) to the integral of Eq. (56). To do so, we introduce the phase function

$$
\Phi(\mathbf{k}) = \mathbf{k} \cdot \mathbf{x} - \omega(\mathbf{k})t
\tag{63}
$$

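In order to use this method, both the first and second derivatives of this phase function are needed. Those derivatives are

$$
\begin{aligned}
\frac{\partial\Phi(\mathbf{k})}{\partial k_p} &= x_p - \frac{\partial\omega(\mathbf{k})}{\partial k_p}\, t \\
\frac{\partial^2\Phi(\mathbf{k})}{\partial k_p\, \partial k_q} &= -\frac{\partial^2\omega(\mathbf{k})}{\partial k_p\, \partial k_q}\, t = -\omega_{pq}(\mathbf{k})\, t
\end{aligned}
\qquad p, q = 1, 2, \text{ or } 1, 2, 3
\tag{64}
$$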

The stationary points are determined by setting the first derivatives of $\Phi$ equal to zero. We write that result in the vector form,

$$
\mathbf{x} = \nabla_k\omega(\mathbf{k})\, t
\tag{65}
$$

The vector on the right side, $\nabla_k\omega(\mathbf{k})$, can be recognized as the group velocity vector introduced earlier. For a particular choice of $(\mathbf{x}, t)$, the stationary points in $\mathbf{k}$ are those points for which the group velocity is the velocity of propagation from the origin to $\mathbf{x}$ in the time $t$. We remark that with more structure in $A(\mathbf{k})$ (for example, some phase dependence), we could create examples in which the propagation is not from the origin but from other points in space. In any case, the velocity of propagation picked out by the condition of stationarity would remain the group velocity. Again, as in one dimension, the contribution from each stationary point, that is, each solution of Eq. (65), approximates the integral in a local domain around the stationary point. Thus, each such contribution represents the propagation of a packet of wave vectors in a neighborhood of the particular wave vector satisfying Eq. (65).

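The asymptotic approximation to Eq. (56) in the form of Eq. (62) is

$$
u(\mathbf{x}, t) \sim \sum_{j=1}^{n} \frac{1}{(2\pi t)^{3/2}} \frac{A(\mathbf{k}_j)}{\sqrt{|\det[\omega_{pq}(\mathbf{k}_j)]|}} \exp\left\{i[\mathbf{k}_j \cdot \mathbf{x} - \omega(\mathbf{k}_j)t] - i(\pi/4)\, \mathrm{Sgn}(\omega_{pq}(\mathbf{k}_j))\right\}, \qquad \mathbf{x} = \nabla_k\omega(\mathbf{k}_j)\, t
\tag{66}
$$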

We see here that asymptotically each term of this general wave form behaves locally as a plane wave propagating at a group velocity that, in general, will vary from point to point in space. This is an essential feature of high-frequency propagation of waves. Thus, the propagation of plane waves takes on an added significance as the local propagation of more complex wave structures.

As in the one-dimensional case, we see in the structure of this representation a wave with recognizable crests (the phase surfaces of the exponential) and slowly varying amplitude.

The propagation paths along which the solution propagates [Eq. (65)] turn out to be the rays of geometrical optics, a high-frequency technique based on the WKBJ method for ordinary differential equations. For continuous gradient functions $\nabla_k\omega(\mathbf{k})$, the rays for a packet of nearby $\mathbf{k}$ values will remain near to one another and will fill out a cone (not necessarily of circular cross section) as time progresses.
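To make the ray condition of Eq. (65) concrete, the following minimal sketch solves it in closed form for the second dispersion relation of Eq. (55), ω = √(c²k² + b²), whose group velocity has magnitude c²k/√(c²k² + b²) and points along k. The function names and the sample values of c, b, x, and t are assumptions for illustration only.

```python
import math

def norm(v):
    """Euclidean length of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))

def group_velocity(k, c, b):
    """Group velocity c^2 k / sqrt(c^2 k^2 + b^2), directed along k."""
    omega = math.sqrt(c * c * norm(k)**2 + b * b)
    return [c * c * ki / omega for ki in k]

def stationary_wave_vector(x, t, c, b):
    """Solve x = grad_k omega(k) * t for k when omega = sqrt(c^2 k^2 + b^2).

    With v = |x| / t, the condition v = c^2 k / sqrt(c^2 k^2 + b^2) gives
    k = b v / (c sqrt(c^2 - v^2)), directed along x. A real solution exists
    only for v < c, since the group speed never reaches c.
    """
    r = norm(x)
    v = r / t
    if v >= c:
        raise ValueError("no real stationary point: |x|/t must be less than c")
    if r == 0.0:
        return [0.0 for _ in x]
    kmag = b * v / (c * math.sqrt(c * c - v * v))
    return [kmag * xi / r for xi in x]

if __name__ == "__main__":
    c, b = 1.0, 2.0
    x, t = [3.0, 0.0, 4.0], 10.0
    k = stationary_wave_vector(x, t, c, b)
    arrival = [vg * t for vg in group_velocity(k, c, b)]
    print("stationary k:", k)
    print("group velocity times t:", arrival)  # should reproduce x
```

For this dispersion relation each observation point inside the sphere |x| < ct receives exactly one wave vector, so the sum in the asymptotic formula has a single term there.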

This observation leads to the interpretation of the solution representation as an example of energy conservation. Returning to the representation of Eq. (56) and setting $t = 0$, we see that $A(\mathbf{k})$ can be interpreted as the spectral density of the initial data. We then think of the square of this quantity, $|A(\mathbf{k})|^2$, as being the spectral density of the energy in the $\mathbf{k}$ domain or $|A(\mathbf{k})|^2 V_k$ as the energy in the packet of $\mathbf{k}$ values in the volume element $V_k$ around $\mathbf{k}$.

Let us define $|A(\mathbf{k}, t)|$ to be the amplitude of the wave $u(\mathbf{x}, t)$ at fixed $\mathbf{k}$ as time progresses. In this expression of the amplitude, we define the $\mathbf{x}$-coordinate associated with $\mathbf{k}$ by the ray equation (65). Thus, $|A(\mathbf{k}, 0)|$ is just the spectral density $|A(\mathbf{k})|$. In an energy-conserving system, we expect that as the wave propagates, $|A(\mathbf{k}, t)|^2 V_k(\mathbf{k}, t)$ will be preserved (that is, remain constant) while the volume element varies in accordance with the ray equation (65). The product $t^3|\det(\omega_{pq}(\mathbf{k}))|$ is the Jacobian of transformation via rays and is proportional to this volume element. Thus, for the energy to be preserved in a packet of $\mathbf{k}$ values, the energy density $|A(\mathbf{k}, t)|^2$ must vary inversely with this Jacobian, and the amplitude $|A(\mathbf{k}, t)|$ of the wave must therefore vary inversely with the square root of this Jacobian. This provides a physical interpretation of the division by the square root of the determinant of the Hessian matrix in the summand in Eq. (66) and supports our interpretation of the solution formula as a manifestation of conservation of energy. It is also consistent with our earlier claim that energy propagates at the group velocity.
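The Jacobian statement can be made explicit in one short step, added here for clarity. Differentiating the ray equation (65) with respect to the components of k at fixed t gives the matrix that maps a small packet of wave vectors to the region of space its rays occupy. The symbol $V_x$ for the volume of that region is introduced here and does not appear in the original:

```latex
\frac{\partial x_p}{\partial k_q} = \omega_{pq}(\mathbf{k})\, t,
\qquad
V_x = \left|\det\left[\frac{\partial x_p}{\partial k_q}\right]\right| V_k
    = t^3\, |\det[\omega_{pq}(\mathbf{k})]|\, V_k,
\qquad
|A(\mathbf{k}, t)| \propto \frac{1}{\sqrt{t^3\, |\det[\omega_{pq}(\mathbf{k})]|}}
```

The last proportionality reproduces the factor $t^{-3/2}\, |\det[\omega_{pq}]|^{-1/2}$ that multiplies each term of the asymptotic sum.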

E. Plane Waves: Reflection and Refraction

Fundamental to the set of concepts of how plane waves propagate is the interaction of such waves with a planar boundary across which some property of the medium of propagation [equivalently, some coefficient(s) of the modeling equation(s)] changes. We will describe this phenomenon in the context of a specific example and then discuss generalizations of the basic result.

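Let us suppose that we are considering plane waves that are solutions of the wave equation

$$
c^2\left[\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2}\right] - \frac{\partial^2 u}{\partial t^2} = 0, \qquad
c = \begin{cases} c_-, & x_1 < 0 \\ c_+, & x_1 > 0 \end{cases}
\tag{67}
$$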

We wish to consider the interaction of a plane wave at a fixed frequency, incident on the interface at $x_1 = 0$ from the left; that is, from the medium in which $c = c_-$. Thus, we anticipate an incident wave, which we will denote by $u_I$, of the form

$$
u_I(\mathbf{x}, t) = A_I \exp\{i[\mathbf{k}_I \cdot \mathbf{x} - \omega(\mathbf{k}_I)t]\}
\tag{68}
$$

In this equation, we must choose $\omega(\mathbf{k})$ so that the plane wave satisfies the governing equation (67). Thus,

$$
\omega^2 = c_-^2 k_I^2, \qquad \omega = \pm c_- k_I
\tag{69}
$$

Of the two choices, we will set $\omega = c_- k_I$. This was the first example of a dispersion relation in Eq. (55). With this choice, both the phase velocity and the group velocity have the same direction as $\mathbf{k}_I$; for the opposite choice, the two velocities would be directed opposite to $\mathbf{k}_I$. Thus, so that our plane wave is propagating from $x_1 < 0$ toward $x_1 = 0$, $\mathbf{k}_I$ must make an acute angle with the $x_1$ axis. That is,

$$
k_{1I} > 0, \qquad \omega = c_- k_I
\tag{70}
$$

We will conjecture that the total solution in $x_1 < 0$ is made up of the incident wave and another wave called the reflected wave ($u_R$). Furthermore, we will assume that another plane wave is transmitted ($u_T$) through the interface. Thus, we conjecture a total solution of the form

$$
\begin{aligned}
u(\mathbf{x}, t) &= \begin{cases} u_I(\mathbf{x}, t) + u_R(\mathbf{x}, t), & x_1 < 0 \\ u_T(\mathbf{x}, t), & x_1 > 0 \end{cases} \\
u_R(\mathbf{x}, t) &= A_R \exp\{i[\mathbf{k}_R \cdot \mathbf{x} - \omega_R t]\} \\
u_T(\mathbf{x}, t) &= A_T \exp\{i[\mathbf{k}_T \cdot \mathbf{x} - \omega_T t]\}
\end{aligned}
\tag{71}
$$

Our objective now is to express $\omega_R$, $\omega_T$, $\mathbf{k}_R$, $\mathbf{k}_T$, $A_R$, and $A_T$ in terms of $\mathbf{k}_I$ and $A_I$. That is, we seek to express the frequencies, the directions of propagation, and the amplitudes of the reflected and transmitted waves in terms of the same parameters for the incident wave and conditions imposed on the model as to how these waves are to interact at the boundary.

A typical requirement of such interactions is that the solution be continuous across the interface. That is, at $x_1 = 0$,

$$
A_I \exp[i(k_{2I} x_2 + k_{3I} x_3 - c_- k_I t)] + A_R \exp[i(k_{2R} x_2 + k_{3R} x_3 - \omega_R t)] = A_T \exp[i(k_{2T} x_2 + k_{3T} x_3 - \omega_T t)]
\tag{72}
$$

We take the Fourier transform of this equation with respect to $t$, that is, we multiply by $\exp(i\mu t)$ and integrate from $-\infty$ to $\infty$ with respect to $t$, and we find that all frequencies must agree. This follows from the fact that the first integral is proportional to $\delta(\mu - c_- k_I)$, whereas the second is proportional to $\delta(\mu - \omega_R)$ and the third is proportional to $\delta(\mu - \omega_T)$. Since each of these Dirac delta functions is nonzero only where its argument is zero, the two sides could not agree unless the frequencies were the same. Thus,

$$
\omega_R = c_- k_R = \omega_T = c_+ k_T = c_- k_I
\tag{73}
$$

By a completely analogous argument applied to the spatial transforms, we find also that

$$
k_{2R} = k_{2T} = k_{2I} \quad \text{and} \quad k_{3R} = k_{3T} = k_{3I}
\tag{74}
$$

These equations state that the projections of the three wave vectors on the planar interface must agree. The previous equation, in addition to equating the frequencies, states that the magnitude of the reflected wave vector must equal the magnitude of the incident wave vector, while the magnitude of the transmitted wave vector must equal these two up to a scale factor.

Let us first focus our attention on $\mathbf{k}_R$, the reflected wave vector. From Eqs. (73) and (74) it follows that $k_{1R} = \pm k_{1I}$. If these two components had the same sign, then $\mathbf{k}_R$ would equal $\mathbf{k}_I$ and the reflected wave would also be directed toward the interface. On physical grounds, we reject this; we expect $u_R$ to be a wave directed away from the interface. The mathematical basis for rejecting this case is equally strong. Were we to continue, we would find that $A_R$ would be the negative of $A_I$ and $A_T$ would be zero. That is, a total solution that is identically zero would result. This is not the solution of interest. Thus, whether on mathematical grounds or physical grounds, we set

$$
k_{1R} = -k_{1I}
\tag{75}
$$

We see then that the incident and reflected wave vectors differ only in the sign of the normal component. Thus, these two vectors must make equal angles with the normal vector to the interface. This is Snell’s law of reflection.

Let us now consider the parameters for the transmitted wave. We denote by $K_I$ and $K_T$, respectively, the magnitudes of the transverse components of the wave vectors $\mathbf{k}_I$ and $\mathbf{k}_T$:

$$
K_I = \sqrt{k_{2I}^2 + k_{3I}^2}, \qquad K_T = \sqrt{k_{2T}^2 + k_{3T}^2}
\tag{76}
$$

From Eq. (74), we see that these two magnitudes are equal. Furthermore, dividing this equality by the last part of Eq. (73) yields

$$
\frac{K_I}{c_- k_I} = \frac{K_T}{c_+ k_T}
\tag{77}
$$

This is Snell’s law of refraction, and the transmitted wave is, in fact, the refracted wave. The law is more often expressed in terms of the angles of incidence and refraction, these being the angles that the wave vectors $\mathbf{k}_I$ and $\mathbf{k}_T$ make with the normal to the interface. If we denote those angles by $\theta_I$ and $\theta_R$, respectively, then

$$
\sin\theta_I = K_I/k_I, \qquad \sin\theta_R = K_T/k_T
\tag{78}
$$

Thus, we conclude from Eqs. (77) and (78) that

$$
\sin\theta_R / \sin\theta_I = c_+/c_-
\tag{79}
$$

This is Snell’s law of refraction in more familiar form. In order that $\theta_R$ be a real angle, we must require that $\sin\theta_R$ be less than or equal to unity. Equivalently, we require that $(c_+/c_-)\sin\theta_I \le 1$. When this criterion is violated (only possible for $c_+ > c_-$), we do not have a wave of the form of Eq. (71) propagating in the second medium.
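Snell’s law of refraction, Eq. (79), and the accompanying reality criterion translate directly into a few lines of code. The following is a minimal sketch; the function names and the convention of returning `None` when no real angle exists are choices made here, not part of the original text.

```python
import math

def refraction_angle(theta_i, c_minus, c_plus):
    """Angle of refraction (radians) from sin(theta_R) = (c+/c-) sin(theta_I).

    Returns None when (c+/c-) sin(theta_I) > 1, in which case no real
    angle of refraction exists.
    """
    s = (c_plus / c_minus) * math.sin(theta_i)
    if s > 1.0:
        return None
    return math.asin(s)

def critical_angle(c_minus, c_plus):
    """Largest angle of incidence with a real refraction angle.

    Returns None when c+ <= c-, since the criterion can then never be violated.
    """
    if c_plus <= c_minus:
        return None
    return math.asin(c_minus / c_plus)

if __name__ == "__main__":
    c_minus, c_plus = 1.0, 1.5  # illustrative speeds (assumed values)
    for degrees in (10.0, 30.0, 60.0):
        theta_r = refraction_angle(math.radians(degrees), c_minus, c_plus)
        shown = None if theta_r is None else math.degrees(theta_r)
        print("incidence", degrees, "deg -> refraction", shown)
    print("critical angle (deg):", math.degrees(critical_angle(c_minus, c_plus)))
```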

We now determine $k_{1T}$. From Eq. (73), we can see that

$$
k_T^2 = k_{1T}^2 + k_{2T}^2 + k_{3T}^2 = c_-^2 k_I^2 / c_+^2
\tag{80}
$$

We know $k_{2T}$ and $k_{3T}$ from Eq. (74). Thus, we can determine $k_{1T}$ within a sign. We require that $u_T$ be a wave propagating away from the interface. Thus, $k_{1T}$ must be positive, and the solution for $k_{1T}$ is

$$
k_{1T} = \sqrt{k_I^2 c_-^2/c_+^2 - k_{2I}^2 - k_{3I}^2} = k_I \sqrt{c_-^2/c_+^2 - \sin^2\theta_I}
\tag{81}
$$

Our assumption that the angle of refraction be real assures us that $k_{1T}$ is real. We can now see that when this criterion is violated, $k_{1T}$ is imaginary and an attenuated or evanescent wave propagates in the second medium.
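Both regimes of Eq. (81) can be handled by one expression if the square root is taken in the complex plane. The sketch below is illustrative: the function names are assumptions, and the choice of the root with positive imaginary part in the evanescent case is an assumption made here so that exp(i k1T x1) decays, rather than grows, as x1 increases in the second medium.

```python
import cmath
import math

def transmitted_normal_component(k_i, theta_i, c_minus, c_plus):
    """k_1T = k_I sqrt(c-^2/c+^2 - sin^2(theta_I)), returned as a complex number.

    A real result corresponds to a propagating transmitted wave; a purely
    imaginary result (positive imaginary part) to an evanescent wave.
    """
    radicand = (c_minus / c_plus)**2 - math.sin(theta_i)**2
    return k_i * cmath.sqrt(radicand)

def decay_length(k_1t):
    """Distance over which an evanescent wave falls by a factor e, or None if propagating."""
    if k_1t.imag <= 0.0:
        return None
    return 1.0 / k_1t.imag

if __name__ == "__main__":
    k_i, c_minus, c_plus = 2.0, 1.0, 1.5  # illustrative values (assumed)
    for degrees in (30.0, 60.0):
        k_1t = transmitted_normal_component(k_i, math.radians(degrees), c_minus, c_plus)
        print("incidence", degrees, "deg: k_1T =", k_1t, "decay length =", decay_length(k_1t))
```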

In summary, determination of the direction of propagation of the reflected and refracted wave rests totally on the matching of the phases at the interface. Thus, even under conditions that require some multiple of $u(\mathbf{x}, t)$ on both sides of the interface to be equal, the same conclusion would be reached. Furthermore, we can state this result in more general terms. First, Eq. (73) tells us that the frequencies of all of the waves must agree at the interface. Since the frequency is related to the wave vectors through the dispersion relation, we obtain one equation relating the wave vector $\mathbf{k}_R$ to $\mathbf{k}_I$ and another relating $\mathbf{k}_T$ to $\mathbf{k}_I$. In general, these equations are nonlinear. Equation (74) may be viewed as prescribing that the projections of all of the wave vectors on the interface (i.e., the transverse part of the wave vectors) must agree. This provides another pair of equations for the components of $\mathbf{k}_R$ and another pair of equations for the determination of $\mathbf{k}_T$. Indeed, this determines the transverse components of the wave vectors, and only the normal component remains to be determined. It is in this normal component that all of the change from $\mathbf{k}_I$ in the structure of the wave vectors can occur. Finally, we observe that for our high-frequency approximation [Eq. (66)] to the general Fourier superposition, the same result obtains in a pointwise manner at the interface. These features are common to all linear wave phenomena.

To determine the amplitudes $A_R$ and $A_T$ in Eq. (71), we need a second relationship between the solutions on the two sides of the interface. We will impose the condition that the normal derivatives of the fields be equal at the interface. For our specific example of an interface at $x_1 = 0$, the normal derivative is the $x_1$ derivative. We will differentiate the two representations of $u(\mathbf{x}, t)$ in Eq. (71) and then set $x_1$ equal to zero. We exploit what we already know about the wave vectors and frequency to simplify this expression. We also use Eq. (72) with the same simplifications. This leads to a pair of equations in the two unknowns $A_R$ and $A_T$:

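$$
\begin{aligned}
A_I + A_R &= A_T \\
k_{1I} A_I - k_{1I} A_R &= k_I \sqrt{c_-^2/c_+^2 - \sin^2\theta_I}\; A_T
\end{aligned}
\tag{82}
$$

The right side of the second equation is $k_{1T} A_T$, with $k_{1T}$ taken from Eq. (81). The solution of this pair of equations is

$$
A_R = R A_I, \qquad A_T = T A_I
\tag{83}
$$

where $R$ and $T$ are, respectively, the reflection coefficient and transmission coefficient, which relate the amplitudes $A_R$ and $A_T$ to $A_I$. With $k_{1I} = k_I \cos\theta_I$, they are given by

$$
R = \frac{\cos\theta_I - \sqrt{c_-^2/c_+^2 - \sin^2\theta_I}}{\cos\theta_I + \sqrt{c_-^2/c_+^2 - \sin^2\theta_I}}, \qquad
T = \frac{2\cos\theta_I}{\cos\theta_I + \sqrt{c_-^2/c_+^2 - \sin^2\theta_I}}
\tag{84}
$$

The value of $\sin\theta_I$ in terms of $k_I$ is given by Eq. (78). At normal incidence, that is, when the angle $\theta_I$ is zero and $\sin\theta_I = 0$, these coefficients reduce to

$$
R = \frac{c_+ - c_-}{c_+ + c_-}, \qquad T = \frac{2c_+}{c_+ + c_-}
\tag{85}
$$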
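The coefficients can also be obtained numerically by solving the two matching conditions directly: continuity of u and of its x1 derivative at x1 = 0, with k1I = kI cos θI and k1T given by Eq. (81). The following minimal sketch does this; the function name is an assumption, and the complex square root (with positive imaginary part beyond the critical angle) is a choice made here so that the same expression covers the evanescent case.

```python
import cmath
import math

def reflection_transmission(theta_i, c_minus, c_plus):
    """Solve A_I + A_R = A_T and k_1I (A_I - A_R) = k_1T A_T for R = A_R/A_I, T = A_T/A_I.

    Both wave-vector components are divided by k_I, so that
    k_1I / k_I = cos(theta_I) and k_1T / k_I = sqrt(c-^2/c+^2 - sin^2(theta_I)).
    """
    cos_i = math.cos(theta_i)
    root = cmath.sqrt((c_minus / c_plus)**2 - math.sin(theta_i)**2)
    r = (cos_i - root) / (cos_i + root)
    t = 2.0 * cos_i / (cos_i + root)
    return r, t

if __name__ == "__main__":
    c_minus, c_plus = 1.0, 2.0  # illustrative speeds (assumed values)
    for degrees in (0.0, 20.0, 60.0):
        r, t = reflection_transmission(math.radians(degrees), c_minus, c_plus)
        print("incidence", degrees, "deg: R =", r, "T =", t, "|R| =", abs(r))
```

At normal incidence the result agrees with the closed forms (c+ - c-)/(c+ + c-) and 2c+/(c+ + c-). Beyond the critical angle the square root is imaginary and the reflection coefficient has unit magnitude, which corresponds to total reflection.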

SEE ALSO THE FOLLOWING ARTICLES

ACOUSTICS, LINEAR • ATMOSPHERIC TURBULENCE • ELECTROMAGNETICS • FOURIER SERIES • GREEN’S FUNCTIONS • PHYSICAL OCEANOGRAPHY, OCEANIC ADJUSTMENT • PLANETARY WAVES • RADIO PROPAGATION • SEISMOLOGY, THEORETICAL • TIME AND FREQUENCY
