The DFT: An Owner's Manual for the Discrete Fourier Transform

Facts, Definitions, and Conventions

Fourier Transform; Fourier Series; Discrete Fourier Transform; Spatial Grid; Frequency Grid; Reciprocity Relations; Critical Sampling Rate; Discrete Orthogonality; Notation


William L. Briggs
University of Colorado, Boulder

Van Emden Henson
Naval Postgraduate School

Society for Industrial and Applied Mathematics
Philadelphia


Copyright © 1995 by the Society for Industrial and Applied Mathematics.


All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688.

Library of Congress Cataloging-in-Publication Data

Briggs, William L.
The DFT: an owner's manual for the discrete Fourier transform / William L. Briggs, Van Emden Henson.
p. cm.
Includes bibliographical references (p. - ) and index.
ISBN 0-89871-342-0 (pbk.)
1. Fourier transformations. I. Henson, Van Emden. II. Title.
QA403.5.B75 1995
515'.723--dc20    95-3232

SIAM is a registered trademark.

to our parents

Muriel and Bill

and

Louise and Phill

Contents

Preface

List of (Frequently and Consistently Used) Notation

Chapter 1  Introduction
1.1. A Bit of History
1.2. An Application
1.3. Problems

Chapter 2  The Discrete Fourier Transform
2.1. Introduction
2.2. DFT Approximation to the Fourier Transform
2.3. The DFT-IDFT Pair
2.4. DFT Approximations to Fourier Series Coefficients
2.5. The DFT from Trigonometric Approximation
2.6. Transforming a Spike Train
2.7. Limiting Forms of the DFT-IDFT Pair
2.8. Problems

Chapter 3  Properties of the DFT
3.1. Alternate Forms for the DFT
3.2. Basic Properties of the DFT
3.3. Other Properties of the DFT
3.4. A Few Practical Considerations
3.5. Analytical DFTs
3.6. Problems

Chapter 4  Symmetric DFTs
4.1. Introduction
4.2. Real Sequences and the Real DFT
4.3. Even Sequences and the Discrete Cosine Transform
4.4. Odd Sequences and the Discrete Sine Transform
4.5. Computing Symmetric DFTs
4.6. Notes
4.7. Problems

Chapter 5  Multidimensional DFTs
5.1. Introduction
5.2. Two-Dimensional DFTs
5.3. Geometry of Two-Dimensional Modes
5.4. Computing Multidimensional DFTs
5.5. Symmetric DFTs in Two Dimensions
5.6. Problems

Chapter 6  Errors in the DFT
6.1. Introduction
6.2. Periodic, Band-Limited Input
6.3. Periodic, Non-Band-Limited Input
6.4. Replication and Poisson Summation
6.5. Input with Compact Support
6.6. General Band-Limited Functions
6.7. General Input
6.8. Errors in the Inverse DFT
6.9. DFT Interpolation; Mean Square Error
6.10. Notes and References
6.11. Problems

Chapter 7  A Few Applications of the DFT
7.1. Difference Equations; Boundary Value Problems
7.2. Digital Filtering of Signals
7.3. FK Migration of Seismic Data
7.4. Image Reconstruction from Projections
7.5. Problems

Chapter 8  Related Transforms
8.1. Introduction
8.2. The Laplace Transform
8.3. The z-Transform
8.4. The Chebyshev Transform
8.5. Orthogonal Polynomial Transforms
8.6. The Hartley Transform
8.7. Problems

Chapter 9  Quadrature and the DFT
9.1. Introduction
9.2. The DFT and the Trapezoid Rule
9.3. Higher-Order Quadrature Rules
9.4. Problems

Chapter 10  The Fast Fourier Transform
10.1. Introduction
10.2. Splitting Methods
10.3. Index Expansions
10.4. Matrix Factorizations
10.5. Prime Factor and Convolution Methods
10.6. FFT Performance
10.7. Notes
10.8. Problems

Appendix: Table of DFTs

Bibliography

Index

Preface

Let's begin by describing what this book is and what it is not. Fourier analysis is the study of how functions defined on a continuum (that is, at all points of an interval) can be represented and analyzed in terms of periodic functions like sines and cosines. While this is an immensely elegant and important subject, many practical problems involve doing Fourier analysis on a computer or doing Fourier analysis on samples of functions, or both. When carried out in these modes, Fourier analysis becomes discrete Fourier analysis (also called practical, computational, and finite Fourier analysis). All of these names convey a sense of usefulness and tangibility that is certainly one of the hallmarks of the subject. So the first claim is that this book is about discrete Fourier analysis, with the emphasis on discrete.

Fourier analysis in the continuous setting involves both Fourier series and Fourier transforms, but its discrete version has the good fortune of requiring only a single process which has come to be called the discrete Fourier transform (DFT). One might argue that discrete Fourier series or Fourier sums are more appropriate names, but this is a decision that we will concede to generations of accepted usage.

There are several aspects of DFTs that make the subject so compelling. One is its immediate applicability to a bewildering variety of problems, some of which we shall survey during the course of the book. Another tantalizing quality of the subject is the fact that it appears quite distinct from continuous Fourier analysis and at the same time is so closely related to it. Much of the book is devoted to exploring the constant interplay among the DFT, Fourier series, and Fourier transforms. For this reason, the subject can feel like a delicate dance on the borders between the continuous and the discrete, and the passage between these two realms never fails to be exciting. Therefore, in a few words, this book is a study of the DFT: its uses, its properties, its applications, and its relationship to Fourier series and Fourier transforms.

Equally important is what this book is not! Undoubtedly, one reason for the spectacular rise in the popularity of practical Fourier analysis is the invention of the algorithm known as the fast Fourier transform (FFT). For all of its magic and power, the FFT is only an extremely efficient way to compute the DFT. It must be said clearly at the outset that this book is not about the FFT, which has a vast literature of its own. We will devote only one chapter to the FFT, and it is a whirlwind survey of that complex enterprise. For the purposes of this book, we will assume that the FFT is the computational realization of the DFT; it is a black box that can be found, already programmed, in nearly every computer library in the world. For that reason, whenever the DFT is mentioned or used, we will assume that there is always a very efficient way to evaluate it, namely the FFT.

A reader might ask: why another book on Fourier analysis, discrete or otherwise? And that is the first question that the authors must answer. It is not as if this venerable subject has been neglected in its two centuries of existence. To the contrary, the subject in all of its guises and dialects undoubtedly receives more use and attention every year. It is difficult to imagine that one collection of mathematical ideas could bear in such a fundamental way on problems from crystallography to geophysical signal processing, from statistical analysis to remote sensing, from prime factorization to partial differential equations. Surely it is a statement, if not about the structure of the world around us, then about the structure that we impose upon that world. And therein lies one justification of the book: a field in which there is so much change and application deserves to be revisited periodically and reviewed, particularly at an introductory level.

These remarks should not imply that outstanding treatments of Fourier analysis do not already exist. Definitive books on Fourier analysis, such as those written by Carslaw [30], Lighthill [93], and Bracewell [13], have been followed by more recent books on discrete Fourier analysis, such as those by Brigham [20], [21], Cizek [36], and Van Loan [155]. The present book owes much to all of its predecessors; and indeed nearly every fact and result in the following pages appears somewhere in the literature of engineering and applied mathematics. This leads to a second reason for writing, namely to collect the many scattered and essential results about DFTs and place them in a single introductory book in which the DFT itself is on center stage.

Underlying this book is the realization that Fourier analysis has an enormous and increasingly diverse audience of students and users. That audience is the intended readership of the book: it has been written for students and practitioners, and every attempt has been made to write it as a tutorial. Hopefully it is already evident that the style is informal; it will also be detailed when necessary. Motivation and interpretation generally take precedence over rigor, although an occasional proof does appear when it might enhance understanding. The text is laced with numerical and graphical examples that illuminate the discussion at hand. Each chapter is supported by a healthy dose of instructive problems of both an analytical and computational nature. In short, the book is meant to be used for teaching and learning.

Although there are some mathematical prerequisites for this book, it has been made as self-contained as possible. A minimum prerequisite is a standard three-semester calculus course, provided it includes complex exponentials, power series, and geometric series. The book is not meant to provide a thorough exposition of Fourier series and Fourier transforms; therefore, some prior exposure to these topics would be beneficial. On occasion a particularly "heavy" theorem will be used without proof, and this practice seems justified when it allows a useful result to be reached.

FIG. 0.1. This is the first of 99 figures in this book. It shows the interdependence of its chapters and suggests some pathways through the book.

Unlike the DFT, this book is not entirely linear. The map in Figure 0.1 suggests possible pathways through the book. Chapter 1 uses two examples to provide some historical and practical motivation for the DFT. Readers who disdain appetizers before nine-course meals or overtures before operas can move directly to Chapter 2 with minimal loss.

The DFT edifice is built in Chapter 2 from several different perspectives, each of which has some historical or mathematical significance: first as an approximation to Fourier transforms, then as an approximation to Fourier series coefficients, then as a form of interpolation, and finally with the use of spike (or delta) functions. Any and all of these four paths lead to the same end: the definition of the DFT. The absolutely fundamental reciprocity relations which underlie all uses of the DFT are developed early in this chapter and must not be overlooked. Chapter 3 is the requisite chapter that deals with the many properties of the DFT which are used throughout the remainder of the book.

At this point the path splits and there are several options. In Chapter 4 special symmetric forms of the DFT are considered in detail. This collection of ideas and techniques is tremendously important in computations. Equally important is the subject of DFTs in more than one dimension since many computationally intensive applications of the DFT involve multidimensional transforms. This is the subject of Chapter 5.

The toughest passage may be encountered in Chapter 6, in which the issue of errors in the DFT is treated. Be forewarned: this chapter is meant to be methodical and exhaustive, perhaps to a fault. While all of the results in this chapter have undoubtedly appeared before, we know of no other place in which they have all been collected and organized. This chapter is both the most theoretical chapter of the book and the most important for practical applications of the DFT.

The final four chapters offer some choice and can be sampled in any order. The first of these chapters deals with a few of the many applications of the DFT, specifically, the solution of difference equations, digital signal processing, FK migration of seismic data, and the x-ray tomography problem. The next of these chapters is a survey of other transforms that are related in some way to the DFT. Another chapter delves into a host of issues surrounding the DFT, viewed as a quadrature rule. Not wishing to neglect or emphasize the fast Fourier transform, the final chapter provides a high-altitude survey of the FFT for those who wish to look "under the hood" of the DFT.

A book that began as a 200-page tutorial could not have grown to these proportions without some help. We are grateful to the following SIAM staffpeople for bringing the book to fruition: Nancy Abbott (designer), Susan Ciambrano (acquisitions editor), Beth Gallagher (editor), J. Corey Gray (production), and Vickie Kearn (publisher). Early versions of the book were read by Bob Palais, Ridgway Scott, Gil Strang, Roland Sweet, and Charles Van Loan. We appreciate the time they spent in this task, as well as the suggestions and the encouragement that they offered. Cleve Moler, using Matlab, showed us the mandala figures that grace the cover and frontmatter of the book (they should be displayed in color for best effect). Scanning of one of the book's figures was done by David Canright, and we also owe thanks to the writers of the freeware package SU (Seismic Unix) at the Colorado School of Mines. One of us (WLB) would like to thank the people at Dublin City University for nine months of Irish hospitality during the early stages of the writing. Finally, the completion of this book would have been impossible without the love and forbearance of our wives and daughters, who provided encouragement, chili dinners, waltzes with snowmen, and warm blankets in spare bedrooms: we thank Julie and Katie, Teri and Jennifer.

You probably agree that this preamble has gone on long enough; it's time for the fun (and the learning) to begin.

List of (Frequently and Consistently Used) Notation

$a \times 10^{-n}$
length of spatial domain
generic matrices (bold face)
Fourier coefficients
discrete cosine transform operator
space of m times continuously differentiable functions on [a, b]
DFT operator
Fourier transform operator
generic functions
convolution of two functions
Fourier transforms of f, g, h
generic sequences or sequences of samples of functions f, g, h
convolution of two sequences
array of samples of a function f(x, y)
output of inverse DFT
sequence of DFT coefficients
array of DFT coefficients
auxiliary array (not complex conjugate)
a spatial domain, usually [-A/2, A/2]
number of points in a DFT
the set of integers {-N/2 + 1, ..., N/2}
the consecutive integers between P and Q, inclusive
square pulse or boxcar function of width a
replication operator
real, imaginary part
discrete sine transform operator
length of the input (time) domain, Chapter 7
matrix of the N-point DFT
grid points in the spatial domain
generic vectors (bold face)
Dirac delta function, x a real number
Kronecker delta sequence, k an integer
modular Kronecker delta sequence, k an integer
grid spacing (sampling interval) in spatial domain
grid spacing (sampling interval) in frequency domain
wavelength of two-dimensional wave in y-direction
extrema of an orthogonal polynomial
wavelength of two-dimensional wave
wavelength of two-dimensional wave in x-direction
length of frequency vector of two-dimensional wave
variable in frequency domain
cut-off frequency
grid point in frequency domain
$e^{i2\pi/N}$
length of frequency domain
second variable in frequency domain
sum in which first term is weighted by 1/2
sum in which first and last terms are weighted by 1/2
zeros of an orthogonal polynomial

Chapter 1

Introduction

1.1 A Bit of History
1.2 An Application
1.3 Problems

The material essential for a student's mathematical laboratory is very simple. Each student should have a copy of Barlow's tables of squares, etc., a copy of Crelle's Calculation Tables, and a seven-place table of logarithms. Further it is necessary to provide a stock of computing paper, . . . and lastly, a stock of computing forms for practical Fourier analysis. With this modest apparatus nearly all of the computations hereafter described may be performed, although time and labour may often be saved by the use of multiplying and adding machines when these are available.

E. T. Whittaker and G. Robinson, The Calculus of Observations, 1924

1.1. A Bit of History

Rather than beginning with the definition of the discrete Fourier transform on the first page of this book, we thought that a few pages of historical and practical introduction might be useful. Some valuable perspective comes with the realization that the DFT was not discovered ten years ago, nor was it invented with the fast Fourier transform (FFT) thirty years ago. It has a fascinating history, spanning over two centuries, that is closely associated with the development of applied mathematics and numerical analysis. Therefore, this chapter will be devoted, in part, to the history of the DFT. However, before diving into the technicalities of the subject, there is insight to be gained from seeing the DFT in action on a specific problem. Therefore, a few pages of this chapter will also be spent extracting as much understanding as possible from a very practical example. With these excursions into history and applications behind us, the path to the DFT should be straight and clear.

Let's begin with some history. Fourier analysis is over 200 years old, and its history is filled with both controversy and prodigious feats. Interwoven throughout it is the thread of discrete or practical Fourier analysis which is most pertinent to this book. In order to appreciate the complete history, one must retreat some 60 years prior to the moment in 1807 when Jean Baptiste Joseph Fourier presented the first version of his paper on the theory of heat conduction to the Paris Academy. The year 1750 is a good starting point: George II was king of England and the American colonies were in the midst of the French and Indian War; Voltaire, Rousseau, and Kant were writing in Europe; Bach had just died, Mozart was soon to be born; and the calculus of Newton and Leibnitz, published 75 years earlier, was enabling the creation of powerful new theories of celestial and continuum mechanics.

There were two outstanding problems of the day that focused considerable mathematical energy, and formed the seeds that ultimately became Fourier analysis. The first problem was to describe the vibration of a taut string anchored at both ends (or equivalently the propagation of sound in an elastic medium). Remarkably, the wave equation as we know it today had already been formulated, and the mathematicians Jean d'Alembert, Leonhard Euler, Daniel Bernoulli, and Joseph-Louis Lagrange had proposed methods of solution around 1750. Bernoulli's solution took the form of a trigonometric series

$$y = A \sin x \cos at + B \sin 2x \cos 2at + \cdots,$$

in which x is the spatial coordinate and t is the time variable. This solution already anticipated the continuous form of a Fourier series. It appears that both Euler and Lagrange actually discretized the vibrating string problem by imagining the string to consist of a finite number of connected particles. The solution of this discrete problem required finding samples of the function that describes the displacement of the string. A page from Lagrange's work on this problem (see Figure 1.1), published in 1759 [90], contains all of the ingredients of what we would today call a discrete Fourier sine series.

The second problem which nourished the roots of Fourier analysis, particularly in its discrete form, was that of determining the orbits of celestial bodies. Euler, Lagrange, and Alexis Claude Clairaut made fundamental contributions by proposing that data taken from observations be approximated by linear combinations of periodic functions. The calculation of the coefficients in these trigonometric expansions led to a computation that we would call a discrete Fourier transform. In fact, a paper published in 1754 by Clairaut contains what has been described as the first explicit formula for the DFT [74].

FIG. 1.1. This page from The Nature and Propagation of Sound, written by Lagrange in 1759 [90], dealt with the solution of the vibrating string problem. Lagrange assumed that the string consists of $m - 1$ particles whose displacements, $y_n$, are sums of sine functions of various frequencies. This representation is essentially a discrete Fourier sine transform. Note that $\varpi$ means $\pi$.

The story follows two paths at the beginning of the nineteenth century. Not surprisingly, we might call one path continuous and the other discrete. On the continuous path, in 1807 Fourier presented his paper before the Paris Academy, in which he asserted that an arbitrary function can be represented as an infinite series of sines and cosines. The paper elicited only mild encouragement from the judges and the suggestion that Fourier refine his work and submit it for the grand prize in 1812. The Academy's panel of judges for the grand prize included Lagrange, Laplace, and Legendre, and they did award Fourier the grand prize in 1812, but not without reservations. Despite the fact that Euler and Bernoulli had introduced trigonometric representations of functions, and that Lagrange had already produced what we would call a Fourier series solution to the wave equation, Fourier's broader claim that an arbitrary function could be given such a representation aroused skepticism, if not outrage. The grand prize came with the deflating assessment that

the way in which the author arrives at his equations is not exempt from difficulties, and his analysis still leaves something to be desired, be it in generality, or be it even in rigor.

Historians are divided over how much credit is due to Lagrange for the discovery of Fourier series. One historian [154] has remarked that

certainly Lagrange could find nothing new in Fourier's theorem except the sweeping generality of its statement and the preposterous legerdemain advanced as a proof.

Fourier's work on heat conduction and the supporting theory of trigonometric series culminated in the Théorie analytique de la chaleur [60], which was published in 1822. Since that year, the work of Fourier has been given more generous assessments. Clerk Maxwell called it a "great mathematical poem." The entire career of William Thomson (Lord Kelvin) was impacted by Fourier's theory of heat, and he proclaimed that

it is difficult to say whether their uniquely original quality, or their transcendently intense mathematical interest, or their perennially important instructiveness for the physical sciences, is most to be praised.

Regardless of the originality and rigor of the work when it was first presented, there can be little doubt that for almost 200 years, the subject of Fourier analysis has changed the entire landscape of mathematics and its applications [156], [162].

The continuous path did not end with Fourier's work. The remainder of the nineteenth century was an incubator of mathematical thought in Europe. Some of the greatest mathematicians of the period such as Poisson, Dirichlet, and Riemann advanced the theory of trigonometric series and addressed the challenging questions of convergence. The campaign continued into the twentieth century when Lebesgue, armed with his new theory of integration, was able to produce even more general statements about the convergence of trigonometric series.

Let's return to the beginning of the nineteenth century and follow the second path, with all of its intrigue. As mentioned earlier, Clairaut and Lagrange had considered the problem of fitting astronomical data, and because those data had periodic patterns, it was natural to use approximating functions consisting of sines and cosines. Since the data represented discrete samples of an unknown function, and since the approximating functions were finite sums of trigonometric functions, this work also led to some of the earliest expressions of the discrete Fourier transform.

The work of Lagrange on interpolation was undoubtedly known to the German mathematician Carl Friedrich Gauss, whose prolific stream of mathematics originated in Göttingen. Almost a footnote to Gauss' vast output was his own contribution to trigonometric interpolation, which also contained the discrete Fourier transform. Equally significant is a small calculation buried in his treatise on interpolation [61] that appeared posthumously in 1866 as an unpublished paper. This work has been dated to 1805, and it contains the first clear and indisputable use of the fast Fourier transform (FFT), which is generally attributed to Cooley and Tukey in 1965. Ironically, Gauss' calculation was cited in 1904 in the mathematical encyclopedia of Burkhardt [25] and again in 1977 by Goldstine [67]. The entire history of the FFT was recorded yet again in 1985 in a fascinating piece of mathematical sleuthing by Heideman, Johnson, and Burrus [74], who remark that "Burkhardt's and Goldstine's works went almost as unnoticed as Gauss' work itself."

To introduce the discrete Fourier transform and to recognize a historical landmark, it would be worthwhile to have a brief look at the problem that Gauss was working on when he resorted to his own fast Fourier transform. Around 1800 Gauss became interested in astronomy because of the discovery of the asteroid Ceres, the orbit of which he was able to determine with great accuracy. At about the same time, Gauss obtained the data presented in Table 1.1 for the position of the asteroid Pallas.

TABLE 1.1. Data on the asteroid Pallas used by Gauss, reproduced by Goldstine from tables of Baron von Zach. Reprinted here, by permission, from [H. H. Goldstine, A History of Numerical Analysis from the 16th through the 19th Century, Springer-Verlag, Berlin, 1977]. ©1977, Springer-Verlag.

Ascension, $\theta$ (degrees)       0     30     60     90    120    150
Declination, X (minutes)      408     89    -66     10    338    807

Ascension, $\theta$ (degrees)     180    210    240    270    300    330
Declination, X (minutes)     1238   1511   1583   1462   1183    804

Departing from Gauss' notation, we will let $\theta$ and $X$ represent the ascension (still measured in degrees) and the declination, respectively. We will also let $(\theta_n, X_n)$ denote the actual data pairs where n = 0, ..., 11. Since the values of $X$ appear to have a periodic pattern as they vary with $\theta$, the goal is to fit these twelve data points with a trigonometric expression of the form

$$f(\theta) = a_0 + \sum_{k=1}^{5} \left[ a_k \cos\left(\frac{2\pi k\theta}{360}\right) + b_k \sin\left(\frac{2\pi k\theta}{360}\right) \right] + a_6 \cos\left(\frac{12\pi\theta}{360}\right).$$

Notice that there are twelve unknown coefficients (the $a_k$'s and $b_k$'s), each of which multiplies a sine or cosine function (called modes) with a certain period or frequency. The fundamental period (represented by the coefficients $a_1$ and $b_1$) is 360 degrees; the modes with k > 1 have smaller periods of 360/k degrees.

Twelve conditions are needed to determine the twelve coefficients. Those conditions are simply that the function $f$ match the data; that is,

$$f(\theta_n) = X_n$$

for n = 0 : 11 (throughout the book the notation $n = N_1 : N_2$ will be used to indicate that the integer index n runs through all consecutive integers between and including the integers $N_1$ and $N_2$). These conditions amount to a system of twelve linear equations that even in Gauss' day could have been solved without undue exertion. Nevertheless Gauss, either in search of a shortcut or intrigued by the symmetries of the sine and cosine functions, discovered a way to collect the coefficients and equations in three subproblems that were easily solved. The solutions to these subproblems were then combined into a final solution. Goldstine [67] has reproduced this remarkable computation, and it is evident that the splitting process discovered by Gauss lies at the heart of the modern fast Fourier transform method.
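The flavor of such a splitting can be sketched in a few lines of Python. This is a modern illustration, not a reconstruction of Gauss' own arithmetic: the function names `dft` and `dft_split3`, the unscaled complex-exponential convention, and the choice of three interleaved subsequences of length four are all assumptions made for the sketch. It computes a 12-point transform of the Pallas declinations directly, then again from three 4-point transforms, and the two results agree to rounding error.

```python
import cmath

# Declinations of Pallas (minutes) at ascensions 0, 30, ..., 330 degrees.
pallas = [408, 89, -66, 10, 338, 807, 1238, 1511, 1583, 1462, 1183, 804]

def dft(x):
    """Direct O(N^2) transform, unscaled, with kernel exp(-i 2 pi n k / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def dft_split3(x):
    """The same transform assembled from three transforms of length N/3."""
    N = len(x)
    assert N % 3 == 0, "this sketch only handles lengths divisible by 3"
    M = N // 3
    # Three interleaved subproblems: samples r, r+3, r+6, ... for r = 0, 1, 2.
    subs = [dft(x[r::3]) for r in range(3)]
    out = []
    for k in range(N):
        total = 0
        for r in range(3):
            # Phase factor that realigns subsequence r with the full grid.
            phase = cmath.exp(-2j * cmath.pi * r * k / N)
            total += phase * subs[r][k % M]  # short transforms repeat with period M
        out.append(total)
    return out

direct = dft(pallas)
split = dft_split3(pallas)
max_diff = max(abs(d - s) for d, s in zip(direct, split))
print(max_diff)
```

The saving comes from the fact that each short transform is reused for several values of k; applying the same idea recursively is what the splitting methods surveyed in Chapter 10 exploit.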

At the moment we are most interested in the outcome and interpretation of Gauss' calculation. Either by solving a 12 × 12 linear system or by doing a small FFT, the coefficients of the interpolation function are given in Table 1.2.
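The leading coefficients of Table 1.2 can be checked from the data of Table 1.1 with a short computation. The sketch below assumes the standard discrete orthogonality sums for twelve equally spaced points (a factor 2/N on each sum, except 1/N for the constant and highest modes); the helper names `cos_coeff`, `sin_coeff`, and `f` are ours and purely illustrative.

```python
import math

theta = [30 * n for n in range(12)]   # ascension in degrees
X = [408, 89, -66, 10, 338, 807, 1238, 1511, 1583, 1462, 1183, 804]
N = len(X)

def cos_coeff(k):
    """Coefficient a_k of cos(2 pi k theta / 360)."""
    s = sum(x * math.cos(2 * math.pi * k * n / N) for n, x in enumerate(X))
    # The constant mode (k = 0) and the highest mode (k = N/2) carry weight 1/N.
    return s / N if k in (0, N // 2) else 2 * s / N

def sin_coeff(k):
    """Coefficient b_k of sin(2 pi k theta / 360), for k = 1, ..., 5."""
    s = sum(x * math.sin(2 * math.pi * k * n / N) for n, x in enumerate(X))
    return 2 * s / N

a = [cos_coeff(k) for k in range(7)]               # a_0, ..., a_6
b = [0.0] + [sin_coeff(k) for k in range(1, 6)]    # b[0] is unused padding

def f(t):
    """Trigonometric interpolant evaluated at ascension t (degrees)."""
    val = a[0] + a[6] * math.cos(2 * math.pi * 6 * t / 360)
    for k in range(1, 6):
        arg = 2 * math.pi * k * t / 360
        val += a[k] * math.cos(arg) + b[k] * math.sin(arg)
    return val

print([round(c, 1) for c in a])
print([round(c, 1) for c in b[1:]])
# Largest mismatch between the interpolant and the twelve data points.
print(max(abs(f(t) - x) for t, x in zip(theta, X)))
```

The first three coefficients come out as approximately 780.6, -411.0, and -720.2 for $a_0$, $a_1$, and $b_1$, and the interpolant reproduces the twelve data values to rounding error.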

TABLE 1.2. Coefficients in Gauss' trigonometric interpolating function.

k          0        1        2       3       4      5      6
$a_k$    780.6   -411.0     43.4    -4.3    -1.1     .3     .1
$b_k$            -720.2     -2.2     5.5    -1.0    -.3

A picture is far more illuminating than the numbers. Figure 1.2 shows the data (marked as ×'s) and the function $f$ plotted as a smooth curve. Notice that the interpolation function does its job: it passes through the data points. Furthermore, the function and its coefficients exhibit the "frequency structure" of the data. The coefficients with the largest magnitude belong to the constant mode (corresponding to $a_0$), and to the fundamental mode (corresponding to $a_1$ and $b_1$). However, contributions from the higher frequency modes are needed to represent the data exactly.

His use of the FFT aside, Gauss' fitting of the data with a sum of trigonometric functions is an example of a DFT and is a prototype problem for this book. As we will see shortly, the "input" for such a problem may originate as data points (as in this example) or as a function defined on an interval. In the latter case, the process of finding the coefficients in the interpolating function is equivalent to approximating either the Fourier series coefficients or the Fourier transform of the given function. We will explore all of these uses of the DFT in the remainder of this book.

FIG. 1.2. The 12 data points (marked by ×'s) describe the position of the asteroid Pallas. The declination (in minutes) of Pallas is plotted along the vertical axis against the ascension (in degrees). A weighted sum of 12 sine and cosine functions with different periods is used to fit the data (smooth curve). The coefficients in that weighted sum are determined by a discrete Fourier transform.

1.2. An Application

We now turn to a practical application of the DFT to a problem of data analysis or, more specifically, time series analysis. Even with many details omitted, the example offers a vivid demonstration of some of the fundamental uses of the DFT, namely spectral decomposition and filtering. On a recent research expedition to the Trapridge Glacier in the Yukon Territory, Canada, data were collected by sensors in the bed of the glacier, 80 meters below the surface [132]. In particular, measurements of the turbidity (amount of suspended material) of the subglacial water were taken every $\Delta t = 10$ minutes $\approx .0069$ days. When plotted, these data produce the jagged curve shown in the upper left graph of Figure 1.3 (time increases to the right and turbidity values are plotted on the vertical axis). The curve actually consists of N = 368 data points, which represent $N\Delta t = 2.55$ days of data.

Notice that the data exhibit both patterns and irregularities. On the largest scale we see a wave-like pattern with a period of approximately one day; this is an explicable diurnal variation in the water quality variables of the glacier that has a relatively low frequency. On a smaller time scale the data appear to be infected with a high frequency oscillation which is attributable partly to instrument noise. Between these high and low frequency patterns, there may be other effects that occur on intermediate time scales.

One of the most revealing analyses that can be done on a set of data, or any "signal" for that matter, is to decompose the data according to its various frequencies. This procedure is often called spectral analysis or spectral decomposition, and it provides an alternative picture of the data in the frequency domain. This frequency picture tells how much of the variability of the data is comprised of low frequency waves and how much is due to high frequency patterns. Using the glacier data, here is how it works. The first question that needs to be answered concerns the range of frequencies that appear in a data set such as that shown in the top left graph of Figure 1.3. Just a few physical arguments provide the answer. We will let the total time interval spanned by the data set be denoted $A$; for the glacier data, we have $A = 2.55$ days. As mentioned earlier, we will let the spacing between data points be denoted $\Delta t$. These two parameters, the length of the time interval $A$, and the so-called sampling rate $\Delta t$ are absolutely crucial in the following arguments. Here are the key observations that give us the range of frequencies.

FIG. 1.3. The above graphs show how the DFT can be used to carry out a spectral analysis of a data set. The original data set consisting of $N = 368$ subglacial water turbidity measurements taken every $\Delta t = 10$ minutes is shown in the top left figure. The same data set with the "mean subtracted out" is shown in the top right figure. The center left figure shows the spectrum of the data after applying the DFT. The horizontal axis now represents frequency with values between $\omega_1 = .39$ and $\omega_{184} = 72$ days$^{-1}$. The relative weight of each frequency component in the overall structure of the turbidity data is plotted on the vertical axis. The center right figure is a close-up of the spectrum showing the low frequency contributions. By filtering the high frequency components, the data can be smoothed. The lower left and right figures show reconstructions of the data using the lowest 10 and lowest 50 frequencies, respectively.

If we collect data for 2.55 days, the longest complete oscillation that we can resolve lasts for precisely 2.55 days. Patterns with periods greater than 2.55 days cannot be fully resolved within a 2.55 day time frame. What is the frequency of this longest wave? Using the units of periods per day for frequency, we see that this wave has a frequency of

$$\omega_1 = \frac{1}{A} = \frac{1}{2.55 \text{ days}} \approx .39 \text{ days}^{-1}.$$

The subscript 1 on $\omega_1$ means that this is the first (and lowest) of an entire set of frequencies. Therefore, the length of the time interval $A$ days determines the lowest frequency in the system.

We now consider periodic patterns with higher frequencies. A wave with two full periods over $A$ days has a frequency of $\omega_2 = 2/A$ days$^{-1}$. Three full periods over the interval correspond to a frequency of $\omega_3 = 3/A$ days$^{-1}$. Clearly, the full set of frequencies is given by

$$\omega_k = \frac{k}{A} \text{ days}^{-1} \quad \text{for } k = 1, 2, 3, \ldots.$$

Having determined where the range of frequencies begins, an even more important consideration is where it ends. Not surprisingly, the sampling rate $\Delta t$ determines the maximum frequency. One might argue that there should be no maximum frequency, since oscillations continue on ever-decreasing time scales all the way to molecular and atomic levels. However, when we take samples every $\Delta t = 10$ minutes, there is a limit to the time scales that can be resolved. In pages to come, we will demonstrate in many ways the following fundamental fact: if a signal (or phenomenon of any kind) is sampled once every $\Delta t$ units, then a wave with a period less than $2\Delta t$ cannot be resolved accurately. The most detail that we can see with a "shutter speed" of $\Delta t$ is a wave that has a peak at one sample time, a valley at the next, and a peak at the next. This "up-down-up" pattern has a period of $2\Delta t$ units, and a frequency of

$$\omega_{\max} = \frac{1}{2\Delta t}.$$

Where does this maximum frequency fit into the set of frequencies that we denoted $\omega_k = k/A$? Notice that $\Delta t = A/N$, where $N$ is the total number of data points. Therefore

$$\omega_{\max} = \frac{1}{2\Delta t} = \frac{N}{2A} = \omega_{N/2}.$$

In other words, the highest frequency that can be resolved with a sample rate of $\Delta t = A/N$ is the $N/2$ frequency in the set of frequencies. (If $N$ is odd, a modified argument can be made, as shown in Chapter 3.) In summary, the $N/2$ frequencies that can be resolved over an interval of $A$ units with a sample rate of $\Delta t$ units are given by

$$\omega_k = \frac{k}{A} \quad \text{for } k = 1 : N/2.$$

For the glacier data set, the numerical values of these frequencies that can be resolved are

$$\omega_k = \frac{k}{2.55 \text{ days}} \quad \text{for } k = 1 : 184,$$

which run from $\omega_1 \approx .39$ days$^{-1}$ up to $\omega_{184} = 72$ days$^{-1}$.
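The arithmetic behind this range of frequencies is easy to check by machine. The following is only a minimal sketch in Python with NumPy (our choice of tool, not something the text prescribes); the variable names are ours, and the only inputs are the values $N = 368$ and $\Delta t = 10$ minutes quoted above.

```python
import numpy as np

N = 368                       # number of turbidity samples quoted in the text
dt = 10.0 / (60.0 * 24.0)     # sampling interval: 10 minutes expressed in days
A = N * dt                    # total length of the record in days

k = np.arange(1, N // 2 + 1)  # mode indices k = 1 : N/2
omega = k / A                 # resolvable frequencies, in periods per day

omega_min = omega[0]          # lowest frequency, 1/A
omega_max = omega[-1]         # highest frequency, N/(2A) = 1/(2 dt)

print(f"record length A = {A:.4f} days")
print(f"lowest frequency  = {omega_min:.2f} periods per day")
print(f"highest frequency = {omega_max:.2f} periods per day")
```

The last entry of `omega` coincides with $1/(2\Delta t)$, which is the "up-down-up" frequency described above.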

We can now turn to an analysis of the turbidity data. A common first step in data or signal analysis is to subtract out the mean. This maneuver amounts to computing the average of the turbidity values and subtracting this average from all of the turbidity values. The result of this rescaling is shown in the top right plot of Figure 1.3; the data values appear as fluctuations about the mean value of zero. With this adjusted data set, we can perform the frequency or spectral decomposition of the data. The details of this calculation will be omitted, since it involves the discrete Fourier transform, which is the subject of the remainder of the book! However, we can examine the output of this calculation and appreciate the result. The center left graph of Figure 1.3 shows what is commonly called the spectrum or power spectrum of the data set. The horizontal axis of this graph is no longer time, but rather frequency. Notice that $N/2 = 184$ frequencies are represented on this axis. On the vertical axis is a measure of the relative weight of each of the frequencies in the overall structure of the turbidity data. Clearly, most of the "energy" in the data resides in the lower frequencies, that is, the $\omega_k$'s with $k < 10$. The center right graph of Figure 1.3 is a close-up of the spectrum for the lower frequencies. In this picture it is evident that the dominant frequencies are $\omega_2$ and $\omega_3$, which correspond to periods of $A/2 \approx 1.28$ days and $A/3 = 0.85$ days. These periods are the nearest to the prominent diurnal (daily) oscillation that we observed in the data at the outset. Thus the spectral decomposition does capture both the obvious patterns in the data, as well as other hidden frequency patterns that are not so obvious visually.

What about the experimental "noise" that also appears in the data? This noise shows up as all of the high frequency ($k \gg 10$) contributions in the spectrum plot. This observation brings us to another strategy. If we believe that the high frequency oscillations really are spurious and contain no useful physical information, we might be tempted to filter these oscillations out of the data. Using the original data set, this could be a delicate task, since the high frequency part of the signal is spread throughout the data set. However, with the spectral decomposition plot in front of us, it is an easy task, since the high frequency noise is localized in the high frequency part of the plot.

The process of filtering is a science and an art that goes far beyond the meat-cleaver approach that we are about to adopt. Readers who wish to see the more refined aspects of filtering should consult one of many excellent sources [20], [70], [108]. With apologies aside, the idea of filtering can be demonstrated by simply removing all of the high frequency contributions in the spectrum above a chosen frequency. This method corresponds to using a sharp low-pass filter (low frequencies are allowed to pass) that truncates the spectrum.

With this new truncated spectrum, it is now possible to reconstruct the data set in an inverse operation that also requires the use of the DFT. Omitting details, we can observe the outcome of this reconstruction in the lower plots of Figure 1.3. The first of these two figures shows the result of removing all frequencies above $\omega_{10}$. Notice that all of the noise in the data has been eliminated, and indeed even the low frequency oscillations have been smoothed considerably. The second of the reconstructions uses all frequencies below $\omega_{50}$; this filter results in a slightly sharper replica of the low frequency oscillations, with more of the high frequency noise also evident.
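For readers who would like to try this meat-cleaver filter themselves, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the glacier measurements are not reproduced in this book, so the sketch manufactures a synthetic record (a one-day wave plus random noise) with the same $N$ and $\Delta t$, and it leans on NumPy's `rfft` and `irfft` routines rather than on any formula developed so far. The helper name `lowpass` is ours.

```python
import numpy as np

# Synthetic stand-in for the turbidity record (NOT the real glacier data).
rng = np.random.default_rng(0)
N = 368
dt = 10.0 / (60.0 * 24.0)                 # 10 minutes, in days
t = np.arange(N) * dt                     # sample times in days
signal = 5.0 + np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(N)

# Step 1: subtract out the mean.
centered = signal - signal.mean()

# Step 2: spectral decomposition; entry k belongs to the frequency k/A.
spectrum = np.abs(np.fft.rfft(centered)) / N

def lowpass(data, k_max):
    """Sharp low-pass filter: keep modes k <= k_max and discard the rest."""
    coeffs = np.fft.rfft(data)            # coefficients for k = 0 : N/2
    coeffs[k_max + 1:] = 0.0              # truncate the spectrum
    return np.fft.irfft(coeffs, n=len(data))   # path back to the time domain

# Step 3: reconstructions from the lowest 10 and lowest 50 frequencies.
smooth10 = lowpass(centered, 10)
smooth50 = lowpass(centered, 50)

print("index of largest spectral weight:", int(np.argmax(spectrum)))
```

Plotting `smooth10` and `smooth50` against `t` should give pictures in the spirit of the lower panels of Figure 1.3, although the details depend entirely on the synthetic record assumed here.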

This discussion of spectral decomposition and filtering has admittedly been rather sparse in details and technicalities. Hopefully it has illustrated a few fundamental concepts that will recur endlessly in the pages to come. In summary, given a function or data set defined in a time or spatial domain, the DFT serves as a tool that gives a frequency picture of that function or data set. The frequency picture depends on the length of the interval $A$ on which the original data set is defined, and it also depends on the rate $\Delta t$ that is used to sample the data. Once the data set is viewed in the frequency domain, there are operations such as filtering that can be applied if necessary. Equally important is the fact that if the problem calls for it, there is always a path back to the original domain; this passage also uses the DFT. All of these ideas will be elaborated and explored in the pages to come. It is now time to dispense with generalities and history, and look at the DFT in all of its power and glory.

1.3. Problems

1. Gauss' problem revisited. Consider the following simplified version of Gauss' data fitting problem. Assume that the following $N = 4$ data pairs $(x_n, y_n)$ are collected:

$$(0, y_0), \quad \left(\frac{\pi}{2}, y_1\right), \quad (\pi, y_2), \quad \left(\frac{3\pi}{2}, y_3\right),$$

where the values $y_0$, $y_1$, $y_2$, and $y_3$ are known, but will be left unspecified. The data set is also assumed to be periodic with period of $2\pi$, so one could imagine another data point $(2\pi, y_4) = (2\pi, y_0)$ if necessary. This data set is to be fit with a function of the form

$$f(x) = a_0 + a_1 \cos x + b_1 \sin x + a_2 \cos(2x), \tag{1.1}$$

where the four coefficients $a_0, a_1, b_1, a_2$ are unknown, but will be determined from the data.

(a) Verify that each of the individual functions in the representation (1.1), namely, $f_0(x) = 1$, $f_1(x) = \cos x$, $f_2(x) = \sin x$, and $f_3(x) = \cos(2x)$ has a period of $2\pi$ (that is, it satisfies $f(x) = f(x + 2\pi)$ for all $x$).

(b) Conclude that the entire function $f$ also has a period of $2\pi$.

(c) Plot each of the individual functions $f_0$, $f_1$, $f_2$, and $f_3$, and note the period and frequency of each function.

(d) To find the coefficients $a_0$, $a_1$, $b_1$, and $a_2$ in the representation (1.1), impose the interpolating conditions

$$f(x_n) = y_n$$

for $n = 0, 1, 2, 3$, where $x_n = n\pi/2$. Write down the resulting system of linear equations for the unknown coefficients (recalling that $y_0$, $y_1$, $y_2$, and $y_3$ are assumed to be known).

(e) Use any available means (pencil and paper will suffice!) to solve for the coefficients $a_0$, $a_1$, $b_1$, and $a_2$ given the data values

(f) Plot the resulting function $f$ and verify that it does pass through each of the four data points.

2. The complex exponential. Given the Euler relation

$$e^{i\theta} = \cos\theta + i \sin\theta,$$

verify that

(a)

(b)

(c)

(d)

(e)

if $m$ is any integer multiple of the integer $N$,

where $m$ and $N$ are any integers.

and (also called Euler relations),

3. Modes, frequencies, and periods. Consider the functions (or modes) $v_k(x) = \cos(\pi k x/2)$ and $w_k(x) = \sin(\pi k x/2)$ on the interval $[-2, 2]$.

(a) Plot on separate sets of axes (hand sketches will do) the functions $v_k$ and $w_k$ for $k = 0, 1, 2, 3, 4$. In each case, note the period of the mode and the frequency of the mode. How are the period and frequency of a given mode related to the index $k$?

(b) Now consider the grid with grid spacing (or sampling rate) $\Delta x = 1/2$ consisting of the points $x_n = n/2$ where $n = -4 : 4$. Mark this grid on the plots of part (a) and indicate the sampled values of the mode.

(c) What are the values of VQ and WQ at the grid points?

(d) What are the values of v$ and w^ at the grid points?

(e) Plot $v_5$ on the interval $[-2, 2]$. What is the effect of sampling $v_5$ at the grid points? Compare the values of $v_5$ and the values of $v_3$ at the sample points.

4. Sampling and frequencies. Throughout this book, we have, somewhat arbitrarily, used $\Delta x$ to denote a grid spacing or sampling rate. This notation suggests that the sampling takes place on a spatial domain. However, sampling can also take place on temporal domains, as illustrated by the glacier data example in the text. In this case $\Delta t$ is a more appropriate notation for the grid spacing.

(a) Imagine that you have collected $N = 140$ equally spaced data values (for example, temperatures or tidal heights) over a time interval of duration $A = 14$ days. What is the grid spacing (or sampling rate) $\Delta t$? What is the minimum frequency $\omega_1$ that is resolved by this sampling? What is the maximum frequency $\omega_{70}$ that is resolved by this sampling? What are the units of these frequencies?

(b) Imagine that you have collected $N = 120$ equally spaced data values (for example, material densities or beam displacements) over a spatial interval of length $A = 30$ centimeters. What is the grid spacing $\Delta x$? What is the minimum frequency $\omega_1$ that is resolved by this sampling? What is the maximum frequency $\omega_{60}$ that is resolved by this sampling? What are the units of these frequencies? Note that in spatial problems frequencies are often referred to as wavenumbers.

5. Complex forms. It will be extremely useful to generalize problem 1 and obtain a sneak preview of the DFT. There are some advantages to formulating the above interpolating conditions in terms of the complex exponential. Show that with $N = 4$ the real form of the interpolating conditions

$$y_n = a_0 + a_1 \cos x_n + b_1 \sin x_n + a_2 \cos(2x_n)$$

for $n = 0, 1, 2, 3$ may also be written in the form

$$y_n = c_0 + c_1 e^{i x_n} + c_2 e^{i 2 x_n} + c_3 e^{i 3 x_n}$$

for $n = 0, 1, 2, 3$, where the coefficients $c_0$, $c_1$, $c_2$, and $c_3$ may be complex-valued. Assuming that all of the data values are real, show that $c_0$ and $c_2$ are real-valued, and that $\mathrm{Re}\{c_1\} = \mathrm{Re}\{c_3\}$ and $\mathrm{Im}\{c_1\} = -\mathrm{Im}\{c_3\}$. Then the following relations between the coefficients $\{a_0, a_1, b_1, a_2\}$ and the coefficients $\{c_0, c_1, c_2, c_3\}$ result:

$$a_0 = c_0, \quad a_1 = 2\,\mathrm{Re}\{c_1\}, \quad b_1 = -2\,\mathrm{Im}\{c_1\}, \quad a_2 = c_2.$$

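6. More general complex forms. To sharpen your skills in working with complex exponentials, carry out the calculation of the previous problem for a data set with $N$ points in the following two cases.

(a) Assume that the real data pairs $(x_n, y_n)$ are given for $n = 0, \ldots, N - 1$ where $x_n = 2\pi n/N$. Show that the interpolating conditions take the real form

$$y_n = a_0 + \sum_{k=1}^{N/2-1} \left[ a_k \cos\left(\frac{2\pi nk}{N}\right) + b_k \sin\left(\frac{2\pi nk}{N}\right) \right] + a_{N/2} \cos(\pi n)$$

for $n = 0 : N - 1$. Show that these conditions are equivalent to the conditions in complex form

$$y_n = \sum_{k=0}^{N-1} c_k e^{i 2\pi nk/N}$$

for $n = 0 : N - 1$, where $c_0$ and $c_{N/2}$ are real, $\mathrm{Re}\{c_k\} = \mathrm{Re}\{c_{N-k}\}$, and $\mathrm{Im}\{c_k\} = -\mathrm{Im}\{c_{N-k}\}$. The relations between the real coefficients $\{a_k, b_k\}$ and the complex coefficients $c_k$ are $a_0 = c_0$, $a_{N/2} = c_{N/2}$, and

$$a_k = 2\,\mathrm{Re}\{c_k\}, \quad b_k = -2\,\mathrm{Im}\{c_k\}$$

for $k = 1 : N/2 - 1$.

(b) Now assume the $N$ real data pairs $(x_n, y_n)$ are given for $n = -N/2 + 1, \ldots, N/2$, where $x_n = 2\pi n/N$. Notice that the sampling interval is now $[-\pi, \pi]$. The data fitting conditions take the real form

$$y_n = a_0 + \sum_{k=1}^{N/2-1} \left[ a_k \cos\left(\frac{2\pi nk}{N}\right) + b_k \sin\left(\frac{2\pi nk}{N}\right) \right] + a_{N/2} \cos(\pi n),$$

where $n = -N/2 + 1 : N/2$. Show that these conditions are equivalent to the conditions in complex form

$$y_n = \sum_{k=-N/2+1}^{N/2} c_k e^{i 2\pi nk/N}$$

for $n = -N/2 + 1 : N/2$, where $c_0$ and $c_{N/2}$ are real, $\mathrm{Re}\{c_k\} = \mathrm{Re}\{c_{-k}\}$, and $\mathrm{Im}\{c_k\} = -\mathrm{Im}\{c_{-k}\}$. The relations between the real coefficients $\{a_k, b_k\}$ and the complex coefficients $c_k$ are $a_0 = c_0$, $a_{N/2} = c_{N/2}$, and

$$a_k = 2\,\mathrm{Re}\{c_k\}, \quad b_k = -2\,\mathrm{Im}\{c_k\}$$

for $k = 1 : N/2 - 1$.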


We will see shortly that all of the above procedures for determining the coefficients $c_k$ from the data values $y_n$ (or vice-versa) are versions of the DFT.

7. Explain those figures. The cover and frontmatter of this book display several polygonal, mandala-shaped figures which were generated using the DFT. With just a small preview of things to come, we can explain these figures. As will be shown early in the next chapter, given a sequence $\{f_n\}_{n=0}^{N-1}$, one form of the DFT of this sequence is the sequence $\{F_k\}_{k=0}^{N-1}$, where

$$F_k = \frac{1}{N} \sum_{n=0}^{N-1} f_n e^{-i 2\pi nk/N}.$$

(a) Let $N = 4$ for the moment and compute the DFT of the unit vectors $(1, 0, 0, 0)$, $(0, 1, 0, 0)$, $(0, 0, 1, 0)$, and $(0, 0, 0, 1)$. Verify that the corresponding DFTs are $\frac{1}{4}(1, 1, 1, 1)$, $\frac{1}{4}(1, -i, -1, i)$, $\frac{1}{4}(1, -1, 1, -1)$, and $\frac{1}{4}(1, i, -1, -i)$.

(b) Plot each of these DFT vectors by regarding each component of a given vector as a point in the complex plane and draw a connecting segment from each component to its predecessor, cyclically (i.e., $F_0 \to F_1 \to F_2 \to F_3 \to F_0$).

(c) Repeat this process with the eight unit vectors $e_j = (0, 0, \ldots, 1, 0, 0, \ldots, 0)$, where the 1 is in the $j$th component. Show that the 8-point DFT of $e_j$ is

$$F_k = \frac{1}{8} e^{-i \pi jk/4},$$

which holds for $k = 0 : 7$.

(d) Verify that plotting each of these eight DFTs in the complex plane (connecting end-to-end as described above) produces a figure similar to the ones shown on the cover and elsewhere.
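Readers with a computer at hand may wish to check parts (a) and (c) numerically before plotting. The sketch below is in Python with NumPy (our choice of tool); it assumes the form $F_k = \frac{1}{N}\sum_{n=0}^{N-1} f_n e^{-i2\pi nk/N}$, which is the form consistent with the four transforms quoted in part (a), and it counts the components of $e_j$ from $j = 0$. The function name `dft` is ours.

```python
import numpy as np

def dft(f):
    """F_k = (1/N) * sum_{n=0}^{N-1} f_n * exp(-i 2 pi n k / N), for k = 0 : N-1."""
    f = np.asarray(f, dtype=complex)
    N = len(f)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # W[k, n] = exp(-i 2 pi n k / N)
    return W @ f / N

# Part (a): the four unit vectors for N = 4, scaled by 4 for easier reading.
for j in range(4):
    e_j = np.eye(4)[j]
    print(j, np.round(4 * dft(e_j), 12))

# Part (c): compare the 8-point DFT of e_j with exp(-i pi j k / 4) / 8.
k = np.arange(8)
agree = all(
    np.allclose(dft(np.eye(8)[j]), np.exp(-1j * np.pi * j * k / 4) / 8)
    for j in range(8)
)
print("8-point formula agrees with the direct sums:", agree)
```

The real and imaginary parts of each output vector are the coordinates of the points to be joined in parts (b) and (d).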

Chapter 2

The Discrete Fourier Transform

2.1 Introduction

2.2 DFT Approximation to the Fourier Transform

2.3 The DFT-IDFT Pair

2.4 DFT Approximations to Fourier Series Coefficients

2.5 The DFT from Trigonometric Approximation

2.6 Transforming a Spike Train

2.7 Limiting Forms of the DFT-IDFT Pair

2.8 Problems

Our life is an apprenticeship to the truth that around every circle another can be drawn.
Ralph Waldo Emerson

2.1. Introduction

We must first agree on what the words discrete Fourier$^1$ transform mean. What, exactly, is the DFT? The simple response is to give a formula, such as

$$F_k = \frac{1}{N} \sum_{n=0}^{N-1} f_n e^{-i 2\pi nk/N}, \tag{2.1}$$

and state that this holds for $k$ equal to any $N$ consecutive integers. Equation (2.1) is, in fact, a definition that we shall use, but such a response sheds no light on the original question. What, then, is the DFT? Is it a Fourier transform, as its name might imply? If it is not a Fourier transform, does it approximate one? The adjective discrete suggests that it may be more closely related to the Fourier series than to the continuous Fourier transform. Is this the case? There are no simple answers to these questions. Viewed from certain perspectives, the DFT is each of these things. Yet from other vantage points it presents different faces altogether. Our intent is to arrive at an answer, but we shall not do so in the span of two or three pages. In fact, the remainder of this book will be devoted to formulating an answer.

We will begin with a derivation of a DFT formula. Notice that we did not say the derivation of the DFT. There are almost as many ways to derive DFTs as there are applications of them. In the course of this chapter, we will arrive at the DFT along the following four paths:

• approximation to the Fourier transform of a function,

• approximation to the Fourier coefficients of a function,

• trigonometric approximation, and

• the Fourier transform of a spike train.

This approach may seem a bit overzealous, but it does suggest the remarkable generality of the DFT and the way in which it underlies all of Fourier analysis. The first topic will be the DFT as an approximation to the Fourier transform. This choice of a starting point is somewhat arbitrary, but it does have the advantage that the absolutely fundamental reciprocity relations can be revealed as early as possible. This first derivation of the DFT is followed by the official presentation of the DFT, its orthogonality properties, and the essential inverse DFT relation. We then present three alternative, but illuminating derivations of the DFT within the frameworks of Fourier series, trigonometric approximation, and distribution theory. Those who feel that one derivation of the DFT suffices can pass over the remaining three and proceed directly to the last section of the chapter. However, we feel that all four paths provide important and complementary views of the DFT and deserve at least a quick reading. While there is certainly technical material throughout this chapter, our intent is to present some fairly qualitative features of the DFT, and return to technical matters, deeper properties, and applications in later chapters.

$^1$ Born in Auxerre, France in 1768, orphaned at the age of eight, JEAN BAPTISTE JOSEPH FOURIER was denied entrance to the artillery because he was not of noble birth, "although he were a second Newton." Fourier was an active supporter of the French Revolution and accompanied Napoleon to Egypt, where he served as secretary of the Institute of Egypt. He died of an aneurism on May 16, 1830.


2.2. DFT Approximation to the Fourier Transform

A natural problem to examine first is the approximation of the Fourier transform of a (possibly complex-valued) function $f$ of a real variable $x$. We should recognize that, in practice, $f$ may not appear explicitly as a function, but may be given as a set of discrete data values. However, for the moment let's assume that $f$ is defined on the interval $(-\infty, \infty)$ and has some known properties, one of which is that it is absolutely integrable on the real line. This means that

$$\int_{-\infty}^{\infty} |f(x)| \, dx < \infty.$$

Then we may define a function $\hat{f}$ by

$$\hat{f}(\omega) = \int_{-\infty}^{\infty} f(x) e^{-i 2\pi \omega x} \, dx, \tag{2.2}$$

where $-\infty < \omega < \infty$ and $i = \sqrt{-1}$. We have chosen to offend half of our audience by letting $i$ (the mathematician's choice), rather than $j$ (the engineer's preference), be the imaginary unit! The function $\hat{f}$ is called the Fourier transform of $f$ and is uniquely determined by (2.2). The transform $\hat{f}$ is said to be defined in the frequency domain or transform domain, while the input function $f$ is said to be defined in the spatial domain if $x$ is a spatial coordinate, or in the time domain if $f$ is a time-dependent function. Of tremendous importance is the fact that there is also an inverse relationship between $f$ and $\hat{f}$ [13], [55], [88], [111], [160] given by

$$f(x) = \int_{-\infty}^{\infty} \hat{f}(\omega) e^{i 2\pi \omega x} \, d\omega. \tag{2.3}$$

This relationship gives $f$ as the inverse Fourier transform of $\hat{f}(\omega)$.

Uninitiated readers should not feel disadvantaged with this sudden exposure to Fourier transforms. In fact, it can be argued that an understanding of the DFT can lead to a better understanding of the Fourier transform. Let's pause to give a brief physical interpretation of the Fourier transform, which will be applicable to the DFT. It all begins with a look at the kernel of the Fourier transform, which is the term $e^{-i 2\pi \omega x}$. Similarly, the kernel of the inverse Fourier transform is $e^{i 2\pi \omega x}$. Using the Euler$^2$ formula, these kernels may be written

$$e^{\pm i 2\pi \omega x} = \cos(2\pi \omega x) \pm i \sin(2\pi \omega x).$$

We see that for a fixed value of $\omega$, the kernel consists of waves (sines and cosines) with a period (or wavelength) of $1/\omega$, measured in the units of $x$ (either length or time). These waves are called modes. Equivalently, the modes corresponding to a fixed value of $\omega$ have a frequency of $\omega$ periods per unit length or cycles per unit time (the combination cycles per second is often called Hertz$^3$). Figure 2.1 shows several typical modes of the form $\cos(2\pi \omega x)$ or $\sin(2\pi \omega x)$ that might be used in the representation of a function.

$^2$ LEONHARD EULER (1707-1783), among the most prolific and accomplished mathematicians of his or any other time, has lent his name to equations, relations, theorems, methods, formulas, numbers, and curiosities in every branch of mathematics. Within the 886 books and papers that make up Euler's work (nearly half published posthumously) are found what we know today as Euler's relation, the Euler differential equation, Euler's method for solving quartics, the Euler method for numerical solution of ODEs; the list seems endless.

FIG. 2.1. All of Fourier analysis (the DFT, Fourier series, and Fourier transforms) concerns the representation of functions or data in terms of modes consisting of sines and cosines with various frequencies. The figure shows some typical modes of the form $A\cos(2\pi \omega x)$ or $A\sin(2\pi \omega x)$. The amplitude, $A$, varies between 0 and 1. Counterclockwise from the top left are $\cos(\pi x)$, with frequency $\omega = 1/2$ cycles/unit and period (wavelength) 2 units; $\sin(2\pi x)$, with frequency $\omega = 1$ and wavelength 1; $\cos(6\pi x)$, with frequency $\omega = 3$ and a wavelength 1/3; and $\sin(12\pi x)$ with frequency $\omega = 6$ and wavelength 1/6.

The inverse Fourier transform relationship (2.3) can be regarded as a recipe for assembling the function $f$ as a combination of modes of all frequencies $-\infty < \omega < \infty$. The mode associated with a particular frequency $\omega$ has a certain weight in this combination, and this weight is given by $\hat{f}(\omega)$. This process of assembling a function from all of the various modes is often called synthesis: given the weights of the modes $\hat{f}(\omega)$, the function $f$ can be constructed. The complete set of values of $\hat{f}$ is also called the spectrum of $f$ since it gives the entire "frequency content" of the function or signal $f$. Equally important is the opposite process, often called analysis: given the function $f$, we can find the amount, $\hat{f}(\omega)$, of the mode with frequency $\omega$ that is present in $f$. Analysis can be done by applying the forward transform (2.2). This interpretation is illustrated in Figure 2.2, in which a real-valued function

$$f(x) = e^{-|x|} \cos(\pi x)$$

is shown together with its (in this case) real-valued Fourier transform, $\hat{f}$.

$^3$ HEINRICH HERTZ was born in 1857 and studied under Helmholtz and Kirchhoff. His experimental work in electrodynamics led to the discovery of Hertzian waves, which Marconi later used in his design of the radio. Hertz was a professor of physics at the University of Bonn and died a month before his 37th birthday.

FIG. 2.2. A real-valued function $f(x) = e^{-|x|} \cos(\pi x)$, defined on $(-\infty, \infty)$, is shown (for $x \in [-6, 6]$) in the top figure. Its Fourier transform $\hat{f}$ is also real-valued, and is shown (for $\omega \in [-3, 3]$) in the bottom figure. The value of the Fourier transform $\hat{f}(\omega)$ gives the amount by which the modes $\cos(2\pi \omega x)$ and $\sin(2\pi \omega x)$ are weighted in the representation of $f$.

With this capsule account of the Fourier transform, let's see how the DFT emerges as a natural approximation. First a practical observation is needed: when a function is given it is either already limited in extent (for example, $f$ might represent an image that has well-defined boundaries) or, for the sake of computation, $f$ must be assumed to be zero outside some finite interval. Therefore, for the moment we will assume that $f(x) = 0$ for $|x| > A/2$. The Fourier transform of such a function with limited extent is given by

$$\hat{f}(\omega) = \int_{-A/2}^{A/2} f(x) e^{-i 2\pi \omega x} \, dx.$$

It is this integral that we wish to approximate numerically.

To devise a method of approximation, the interval of integration $[-A/2, A/2]$ is divided into $N$ subintervals of length $\Delta x = A/N$. Assuming for the moment that $N$ is even, a grid with $N + 1$ equally spaced points is defined by the points $x_n = n\Delta x$ for $n = -N/2 : N/2$. Thus the set of grid points is

$$x_{-N/2} = -\frac{A}{2}, \; \ldots, \; x_0 = 0, \; \ldots, \; x_{N/2} = \frac{A}{2}.$$

We now assume that the function $f$ is known at the grid points (in fact, $f$ might be known only at these points). Letting the integrand be

$$g(x) = f(x) e^{-i 2\pi \omega x},$$

we may apply the trapezoid rule [22], [80], [166] (see Figure 2.3) to this integral. This leads to the approximation

$$\int_{-A/2}^{A/2} g(x) \, dx \approx \frac{\Delta x}{2} \left\{ g\left(-\frac{A}{2}\right) + 2 \sum_{n=-N/2+1}^{N/2-1} g(x_n) + g\left(\frac{A}{2}\right) \right\}.$$

For now we will add the requirement that $g(-A/2) = g(A/2)$, an assumption that will be the subject of great scrutiny in the pages that follow. With this assumption the trapezoid rule approximation may be written

$$\hat{f}(\omega) \approx \Delta x \sum_{n=-N/2+1}^{N/2} g(x_n) = \frac{A}{N} \sum_{n=-N/2+1}^{N/2} f(x_n) e^{-i 2\pi \omega x_n}.$$

FIG. 2.3. In order to approximate the integral $\int_{-A/2}^{A/2} g(x) \, dx$, a grid is established on the interval $[-A/2, A/2]$ consisting of $N + 1$ equally spaced points $x_n = n\Delta x$, where $\Delta x = A/N$ and $n = -N/2 : N/2$. Here $A = 8$ while $N = 10$. The trapezoid rule results if the integrand is replaced by straight line segments over each subinterval, and the area of the region under the curve is approximated by the sum of the areas of the trapezoids.

At the moment, this approximation can be evaluated for any value of $\omega$. Looking ahead a bit, we anticipate approximating $\hat{f}$ only at selected values of $\omega$. Therefore, we must determine how many and which values of $\omega$ to use. For the purposes of the DFT, we need the sampled values $f(x_n)$ to determine the approximations to $\hat{f}(\omega)$ uniquely, and vice versa. Since $N$ values of $f(x_n)$ are used in the trapezoid rule approximation, it stands to reason that we should choose $N$ values for $\omega$ at which to approximate $\hat{f}$. That is the easier question to answer!

The question of which frequency values to use requires a discussion of fundamental importance to the DFT, because it leads to the reciprocity relations. It is impossible to overstate the significance of these relations, and indeed they will reappear often in the remainder of this book. The reciprocity relations are the keystone of the DFT that holds its entire structure in place.

FIG. 2.4. The spatial (or temporal) grid of the DFT is associated with the frequency grid through the reciprocity relations. With the grid spacings and grid lengths as shown in the figure, these relations state that $\Delta x \Delta\omega = 1/N$ and $A\Omega = N$.

Ironically, for all of their importance, the reciprocity relations can be exposed with very little effort and stated quite succinctly. We are presently working on a spatial or temporal domain $[-A/2, A/2]$ with grid spacing $\Delta x$ and grid points $x_n = n\Delta x$. Closely associated with that domain is a frequency domain that we will denote $[-\Omega/2, \Omega/2]$. This frequency domain will also be equipped with a grid consisting of $N$ equally spaced points separated by a distance $\Delta\omega$. We will denote these grid points $\omega_k = k\Delta\omega$, where $k = -N/2 + 1 : N/2$. The task is to relate the four grid parameters $\Delta x$, $\Delta\omega$, $A$, $\Omega$, assuming that both the spatial and frequency grids consist of $N$ points. The reciprocity relations serve precisely this purpose. Figure 2.4 shows the two grids and their relevant parameters.

Imagine all modes (sines and cosines) that have an integer number of periods on $[-A/2, A/2]$ and fit exactly on the interval. Of these waves, consider the wave with the largest possible period. This wave is often called the one-mode, or fundamental mode. Clearly, it has one full period on the interval $[-A/2, A/2]$ or a period of $A$ units. What is the frequency of this wave? This wave has a frequency of $1/A$ periods per unit length. This frequency is the lowest frequency associated with the interval $[-A/2, A/2]$. Therefore, we will denote this fundamental unit of frequency

$$\Delta\omega = \frac{1}{A},$$

and it will be the grid spacing in the frequency domain. All other frequencies recognized by the DFT will be integer multiples of $\Delta\omega$ corresponding to modes with an integer number of periods on $[-A/2, A/2]$. Since there are $N$ grid points on the frequency interval $[-\Omega/2, \Omega/2]$, and the grid points are separated by $\Delta\omega$, it follows that $\Omega = N\Delta\omega$. Combining these two expressions, we have the first reciprocity relation:

$$\Omega = N\Delta\omega = \frac{N}{A} \quad \text{or} \quad A\Omega = N.$$

This relation asserts that the lengths of the spatial (or temporal) domain and the frequency domain vary inversely with each other. We will examine the implications of this relation in a moment.

A second reciprocity relation can now be found quickly. Since the interval $[-A/2, A/2]$ is covered by $N$ equally spaced grid points separated by $\Delta x$, we know that $N\Delta x = A$. Combining this with the fact that $\Delta\omega = 1/A$, we see that

$$\Delta x \Delta\omega = \frac{1}{N}.$$

As in the first reciprocity relation, we conclude that the grid spacings in the two domains are also related inversely. Let us summarize with the following definition.

Reciprocity Relations

$$A\Omega = N, \qquad \Delta x \Delta\omega = \frac{1}{N}.$$

The two reciprocity relations are not independent, but they are both useful. The first relation tells us that if the number of grid points $N$ is held fixed, an increase in the length of the spatial domain comes at the expense of a decrease in the length of the frequency domain. (Remember how the first reciprocity relation was derived: if $A$ is increased, it means that longer periods are allowed on the spatial grid, which means that the fundamental frequency $\Delta\omega$ decreases, which means that the length of the frequency domain $\Omega = N\Delta\omega$ must decrease.) The second reciprocity relation can be interpreted in a similar way. Halving $\Delta x$ with $N$ fixed also halves the length of the spatial domain. The fundamental mode on the original grid has a frequency of $1/A$ cycles per unit length, while on the new shorter grid it has a frequency of $1/(A/2)$ or $2/A$ cycles per unit length. Thus $\Delta\omega$ is doubled in the process. All sorts of variations on these arguments, allowing $N$ to vary also, can be formulated. They are useful thought experiments that lead to a better understanding of the reciprocity relations (see problems 14 and 15).
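The thought experiments just described can also be carried out with a few lines of arithmetic. The sketch below is in Python; the function name `frequency_grid` and the sample values of $A$ and $N$ are ours, chosen only for illustration. It builds the frequency grid parameters from $A$ and $N$ exactly as in the derivation above and then checks both reciprocity relations.

```python
import math

def frequency_grid(A, N):
    """Return (dx, d_omega, Omega) for a domain of length A covered by N grid points."""
    dx = A / N             # spatial (or temporal) grid spacing
    d_omega = 1.0 / A      # fundamental frequency, which is the frequency grid spacing
    Omega = N * d_omega    # length of the frequency domain
    return dx, d_omega, Omega

N = 16
for A in (4.0, 8.0):       # doubling A with N held fixed
    dx, d_omega, Omega = frequency_grid(A, N)
    assert math.isclose(A * Omega, N)             # first reciprocity relation
    assert math.isclose(dx * d_omega, 1.0 / N)    # second reciprocity relation
    print(f"A = {A}: dx = {dx}, d_omega = {d_omega}, Omega = {Omega}")
```

With $N$ fixed, doubling $A$ in this sketch doubles $\Delta x$ and halves both $\Delta\omega$ and $\Omega$, which is the trade-off the two relations express.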

With the reciprocity relations established, we can now return to the trapezoid rule approximation and extract the DFT in short order. First we use $f_n$ to denote the sampled values $f(x_n)$ for $n = -N/2 + 1 : N/2$. Then, agreeing to approximate $\hat{f}$ at the frequency grid points $\omega_k = k\Delta\omega = k/A$, we note that

$$\omega_k x_n = (k\Delta\omega)(n\Delta x) = \frac{nk}{N}.$$

The sum in the trapezoid rule becomes

$$\frac{A}{N} \sum_{n=-N/2+1}^{N/2} f(x_n) e^{-i 2\pi \omega_k x_n} = \frac{A}{N} \sum_{n=-N/2+1}^{N/2} f_n e^{-i 2\pi nk/N}.$$

Therefore, our approximations to the Fourier transform $\hat{f}$ at the frequency grid points $\omega_k = k/A$ are given by

$$\hat{f}(\omega_k) \approx A \, \underbrace{\frac{1}{N} \sum_{n=-N/2+1}^{N/2} f_n e^{-i 2\pi nk/N}}_{F_k}$$

for $k = -N/2 + 1 : N/2$. The expression on the right above the brace is our chosen definition of the DFT. Given the set of $N$ sample values $f_n$, the DFT consists of the $N$ coefficients

$$F_k = \frac{1}{N} \sum_{n=-N/2+1}^{N/2} f_n e^{-i 2\pi nk/N}$$

for $k = -N/2 + 1 : N/2$. In addition to identifying the DFT, we can conclude that approximations to the Fourier transform $\hat{f}(\omega_k)$ are given by $\hat{f}(\omega_k) \approx A F_k$. This approximation and the errors it entails will be investigated in later chapters. Let us now officially introduce the DFT.

2.3. The DFT-IDFT Pair

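For convenience, we will adopt the notation

$$\omega_N = e^{i2\pi/N},$$

so that

$$\omega_N^{nk} = e^{i2\pi nk/N} \qquad\text{and}\qquad \omega_N^{-nk} = e^{-i2\pi nk/N}.$$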

With this bit of notation in hand, we may define the DFT in the following fashion.

► Discrete Fourier Transform ◄

Let $N$ be an even positive integer and let $f_n$ be a sequence⁴ of $N$ complex numbers where $n = -N/2+1 : N/2$. Then its discrete Fourier transform is another sequence of $N$ complex numbers given by

$$F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk} \tag{2.6}$$

for $k = -N/2+1 : N/2$.
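As an illustration, the definition for even $N$ can be evaluated directly. This is a minimal sketch rather than an efficient algorithm; the function name `dft_centered` and the cosine test sequence are hypothetical choices, and the storage convention (list position $0$ holds index $-N/2+1$) is an assumption of the sketch.

```python
import cmath
import math

def dft_centered(f):
    """DFT of an even-length list f holding f_n for n = -N/2+1 : N/2.

    Returns the list of F_k for k = -N/2+1 : N/2, in the same order.
    """
    N = len(f)
    if N % 2 != 0:
        raise ValueError("this sketch assumes N is even")
    idx = range(-N // 2 + 1, N // 2 + 1)   # the centered index set
    return [
        sum(fn * cmath.exp(-2j * cmath.pi * n * k / N) for n, fn in zip(idx, f)) / N
        for k in idx
    ]

N = 8
idx = list(range(-N // 2 + 1, N // 2 + 1))
f = [math.cos(2 * math.pi * n / N) for n in idx]   # one period of a cosine
F = dft_centered(f)
for k, Fk in zip(idx, F):
    print(k, round(Fk.real, 12), round(Fk.imag, 12))
```

For this input only the coefficients with $k = \pm 1$ are nonzero, and each equals $1/2$, since $\cos\theta = (e^{i\theta} + e^{-i\theta})/2$.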

Much of our work will be done with this form of the DFT in which $N$ is assumed to be even. However, there is an analogous version that applies when $N$ is odd and it is worth stating here.

If $N$ is an odd positive integer and $f_n$ is a sequence of $N$ complex numbers where $n = -(N-1)/2 : (N-1)/2$, then its discrete Fourier transform is another sequence of $N$ complex numbers given by

$$F_k = \frac{1}{N}\sum_{n=-(N-1)/2}^{(N-1)/2} f_n\,\omega_N^{-nk} \tag{2.7}$$

for $k = -(N-1)/2 : (N-1)/2$.

⁴ The terms sequence and vector will be used interchangeably when referring to the input to the DFT. Although vector may seem the more accurate term, sequence is also appropriate because the input is often viewed as an infinite set obtained by extending the original set periodically.

The implications of using even and odd values of $N$ will be explored in Chapter 3. We shall often employ the operator notation $\mathcal{D}\{f_n\}$ to mean the DFT of the sequence $f_n$, and $\mathcal{D}\{f_n\}_k$ to indicate the $k$th element of the transform, so that $\mathcal{D}\{f_n\}_k = F_k$.

Those readers who are familiar with the literature of DFTs and FFTs will recognize that our definition differs from those that use indices running from $0$ to $N-1$. We have not made this choice lightly. There are good arguments to be made in favor of (and against) the use of either definition. Certainly those who view the input to their DFTs as causal (time-dependent) sequences will be more comfortable with the indices $0 : N-1$. For other (often spatial) applications, such as image reconstruction from projections, it is more convenient to place the origin in the center of the image space, leading to definitions such as (2.6) and (2.7). Our choice is motivated by the fact that many theoretical explanations and considerations are simpler or more natural with indices running between $-N/2$ and $N/2$. It is important to note, however, that anything that can be done using one index set can also be done using the other, and usually with little alteration. For this reason we now present an alternate definition, and will occasionally use it.

► Discrete Fourier Transform (Alternate Form) ◄

Let $N$ be a positive integer and let $f_n$ be a sequence of $N$ complex numbers where $n = 0 : N-1$. Then its discrete Fourier transform is another sequence of $N$ complex numbers given by

$$F_k = \frac{1}{N}\sum_{n=0}^{N-1} f_n\,\omega_N^{-nk} \tag{2.8}$$

for $k = 0 : N-1$.

As shown in problem 17, the alternate form of the DFT (2.8) is entirely equivalent to the original form (2.6). One obvious advantage of the alternate definition is that it is independent of the parity (even/oddness) of $N$. There are many other alternate forms of the DFT which will be presented in detail in Chapter 3.

In general, the output of the DFT, $F_k$, is a complex-valued sequence. Of impending significance is the fact that it is also an $N$-periodic sequence satisfying $F_k = F_{k \pm N}$ (problem 9). It is frequently convenient to examine the real and imaginary parts of the DFT independently. These two sequences are denoted by $\mathrm{Re}\{F_k\}$ and $\mathrm{Im}\{F_k\}$, respectively. Using the same notation to denote the real and imaginary parts of the input sequence $f_n$, we observe that the real and imaginary parts of the DFT are given by

$$F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} \bigl(\mathrm{Re}\{f_n\} + i\,\mathrm{Im}\{f_n\}\bigr)\left(\cos\frac{2\pi nk}{N} - i\sin\frac{2\pi nk}{N}\right).$$


FIG. 2.5. The real (top left) and imaginary (top right) parts of the 12-point input sequence $f_n$ of Table 2.1 are shown. The real part of $F_k$ is shown on the bottom left, while the imaginary part of $F_k$ is shown on the bottom right. Observe that the real part of $F_k$ is an even sequence satisfying $\mathrm{Re}\{F_k\} = \mathrm{Re}\{F_{-k}\}$, while the imaginary part is odd, satisfying $\mathrm{Im}\{F_k\} = -\mathrm{Im}\{F_{-k}\}$.

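Therefore, the real and imaginary parts of the output $F_k$ are defined as follows.

► Real Form of the DFT ◄

$$\mathrm{Re}\{F_k\} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2}\left[\mathrm{Re}\{f_n\}\cos\frac{2\pi nk}{N} + \mathrm{Im}\{f_n\}\sin\frac{2\pi nk}{N}\right],$$

$$\mathrm{Im}\{F_k\} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2}\left[\mathrm{Im}\{f_n\}\cos\frac{2\pi nk}{N} - \mathrm{Re}\{f_n\}\sin\frac{2\pi nk}{N}\right].$$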

Example: Numerical evaluation of a DFT. Before proceeding further, it is useful to solidify these concepts with a simple example. Consider the 12-point real-valued sequence $f_n$ given in Table 2.1. Its DFT $F_k$ has been computed explicitly from the definition and is also shown in the table. Figure 2.5 shows both the input sequence $f_n$ and the DFT $F_k$ graphically.

The interpretation of the DFT coefficients is essentially the same as that given for the Fourier transform itself. The $k$th DFT coefficient $F_k$ gives the "amount" of the $k$th mode (with a frequency $\omega_k$) that is present in the input sequence $f_n$. The $N$ modes distinguished by the DFT, that is, the set of functions $e^{i2\pi\omega_k x}$, for $k = -N/2+1 : N/2$ (as well as their real-valued components $\cos(2\pi\omega_k x)$ and $\sin(2\pi\omega_k x)$), we refer to as the basic modes of the DFT.



TABLE 2.1
Numerical 12-point DFT.

  n, k    Re{f_n}    Im{f_n}    Re{F_k}    Im{F_k}
   -5      0.7630       0        0.0684    -0.1093
   -4     -0.1205       0       -0.1684     0.0685
   -3     -0.0649       0       -0.2143    -0.0381
   -2      0.6133       0       -0.0606     0.1194
   -1     -0.2697       0       -0.0418    -0.0548
    0     -0.7216       0        0.0052     0
    1     -0.0993       0       -0.0418     0.0548
    2      0.9787       0       -0.0606    -0.1194
    3     -0.5689       0       -0.2143     0.0381
    4     -0.1080       0       -0.1684    -0.0685
    5     -0.3685       0        0.0684     0.1093
    6      0.0293       0        0.1066     0.0000
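The entries of Table 2.1 can be checked by evaluating the definition (2.6) directly. The sketch below is illustrative only; the names `f_table`, `dft_centered`, and `coeff` are hypothetical. Because the tabulated input is rounded to four decimals, agreement with the printed output should only be expected to about four decimals.

```python
import cmath

# Re{f_n} from Table 2.1 for n = -5 : 6 (the imaginary parts are all zero).
f_table = [0.7630, -0.1205, -0.0649, 0.6133, -0.2697, -0.7216,
           -0.0993, 0.9787, -0.5689, -0.1080, -0.3685, 0.0293]

def dft_centered(f):
    """Evaluate F_k = (1/N) sum_n f_n exp(-i 2 pi n k / N) for n, k = -N/2+1 : N/2."""
    N = len(f)
    idx = range(-N // 2 + 1, N // 2 + 1)
    return [
        sum(fn * cmath.exp(-2j * cmath.pi * n * k / N) for n, fn in zip(idx, f)) / N
        for k in idx
    ]

N = len(f_table)
idx = list(range(-N // 2 + 1, N // 2 + 1))
coeff = dict(zip(idx, dft_centered(f_table)))   # maps k to F_k

for k in idx:
    print(f"{k:3d}  {coeff[k].real:8.4f}  {coeff[k].imag:8.4f}")
```

The printed columns can be compared line by line with the last two columns of the table; the even symmetry of the real part and the odd symmetry of the imaginary part are visible at once.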

Let's look at these modes more closely. In sharp distinction to the Fourier transform, which uses modes of all frequencies, there are only $N$ distinct modes in an $N$-point DFT, with roughly $N/2$ different frequencies. The modes can be labeled by the frequency index $k$, and each mode has a value at each grid point $x_n$ where $n = -N/2+1 : N/2$. Therefore, we will denote the $n$th component of the $k$th DFT mode as

$$v_n^k = e^{i2\pi\omega_k x_n} = e^{i2\pi nk/N}$$

for $n, k = -N/2+1 : N/2$. For the case at hand, the $N = 12$ modes can be grouped by their frequencies as follows:

• $k = 0$: frequency $0$ (the constant mode),
• $k = \pm m$ for $m = 1, 2, 3, 4, 5$: frequency $m/A$ cycles per unit length (two modes each),
• $k = 6$: frequency $6/A$ cycles per unit length.

The real (cosine) part of each of these modes is plotted in Figure 2.6, which clearly shows the distinct frequencies that are represented by the DFT. Notice that in counting the DFT modes we can list either

• $N$ complex modes $v^k$ for $k = -N/2+1 : N/2$, or

• a sine and cosine mode for $k = 1 : N/2-1$ plus a cosine mode for $k = 0$ and $k = N/2$.

Either way there are exactly $N$ distinct modes (see problem 10).

FIG. 2.6. An $N$-point DFT uses exactly $N$ distinct modes with roughly $N/2$ different frequencies. The figure shows the cosine modes $\cos(2\pi nk/N)$ for the DFT with $N = 12$. The mode with frequency index $k$ has $|k|$ periods on the domain where $k = -N/2+1 : N/2$. Shown are modes $k = 0, \pm 1$ (top row); $k = \pm 2, \pm 3$ (second row); $k = \pm 4, \pm 5$ (third row); and $k = 6$ (bottom). The "spikes" indicate the discrete values $\cos(2\pi nk/N)$, while the dotted curves indicate the corresponding continuum functions $\cos(2\pi kx)$ for $x \in (-1/2, 1/2)$.

Returning to Table 2.1, a brief examination of the data serves to introduce a number of the themes that will appear throughout the book. First, observe that the input is a real-valued sequence, while the output is complex-valued with the exception of $F_0$ and $F_6$, which are real-valued. The real part of the output is an even sequence with the property $\mathrm{Re}\{F_k\} = \mathrm{Re}\{F_{-k}\}$, while the imaginary part is an odd sequence with the property that $\mathrm{Im}\{F_k\} = -\mathrm{Im}\{F_{-k}\}$. The value of $F_0$ (often called the DC component) also equals the average of the elements of the input sequence, that is, their sum divided by $N$ (problem 19). These observations constitute our first look at the symmetries of the DFT, a topic which constitutes Chapter 4. At this point our intent is merely to observe that there often exist lovely and important relationships between the input and output sequences, and that by examining these relationships thoroughly we may develop a powerful set of tools for discrete Fourier analysis.

The utility of the DFT, like that of any transform, arises because a difficult problem in the spatial (or temporal) domain can be transformed into a simpler problem in another domain. Ultimately the solution in the second domain must be transformed back to the original domain. To accomplish this task, an inverse transform is required. We now define it.

► Inverse Discrete Fourier Transform (IDFT) ◄

Let $N$ be an even positive integer and let $F_k$ be a sequence of $N$ complex numbers where $k = -N/2+1 : N/2$. Then its inverse discrete Fourier transform is another sequence of $N$ complex numbers given by

$$f_n = \sum_{k=-N/2+1}^{N/2} F_k\,\omega_N^{nk} \tag{2.9}$$

for $n = -N/2+1 : N/2$. If $N$ is an odd positive integer and $F_k$ is a sequence of $N$ complex numbers, where $k = -(N-1)/2 : (N-1)/2$, then its inverse discrete Fourier transform is another sequence of $N$ complex numbers given by

$$f_n = \sum_{k=-(N-1)/2}^{(N-1)/2} F_k\,\omega_N^{nk} \tag{2.10}$$

for $n = -(N-1)/2 : (N-1)/2$.

Notice that this definition confers periodicity on the sequence $f_n$; it satisfies $f_n = f_{n+N}$ (problem 9). As in the case of the DFT, we shall often employ an operator notation, $\mathcal{D}^{-1}\{F_k\}$, to mean the IDFT of the sequence $F_k$, and $\mathcal{D}^{-1}\{F_k\}_n$ to indicate the $n$th element of the inverse transform; therefore $\mathcal{D}^{-1}\{F_k\}_n = f_n$.

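The notation and the discussion above suggest that the DFT and the IDFT really are inverses of each other, but this fact certainly has not been shown! Therefore, the next task is to show that the operators $\mathcal{D}$ and $\mathcal{D}^{-1}$ satisfy the inverse relations

$$\mathcal{D}^{-1}\{\mathcal{D}\{f_n\}_k\}_n = f_n \qquad\text{and}\qquad \mathcal{D}\{\mathcal{D}^{-1}\{F_k\}_n\}_k = F_k.$$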

To verify the inversion formula, we must first develop the orthogonality property of the complex exponential, a property that is crucial to the whole business of Fourier series, Fourier transforms, and DFTs.

Let's begin with the discrete orthogonality property for the complex exponential. To facilitate the discussion, we will introduce some notation known as the modular Kronecker⁵ delta. Letting $k$ be any integer, we define $\delta_N(k)$ by

$$\delta_N(k) = \begin{cases} 1 & \text{if } k = 0 \text{ or } k \text{ is a multiple of } N, \\ 0 & \text{otherwise.} \end{cases}$$

For example, $\delta_4(0) = 1$, $\delta_4(1) = 0$, $\delta_4(8) = 1$, $\delta_4(-3) = 0$. With this notation we can state the orthogonality property; it is so important that we exalt it as a theorem (with a proof!).
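The modular Kronecker delta is a one-line function in most languages. The sketch below is illustrative; the name `delta_mod` is a hypothetical choice, and it relies on Python's `%` operator returning a nonnegative remainder for a positive modulus, so negative arguments are handled correctly.

```python
def delta_mod(N, k):
    """Modular Kronecker delta: 1 if k is zero or a multiple of N, and 0 otherwise."""
    return 1 if k % N == 0 else 0

# The four examples quoted in the text, all with N = 4.
examples = [delta_mod(4, 0), delta_mod(4, 1), delta_mod(4, 8), delta_mod(4, -3)]
print(examples)
```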

⁵ LEOPOLD KRONECKER (1823-1891) specialized in the theory of equations, elliptic function theory, and algebraic number theory while at the University of Berlin. Kronecker held firm in his belief that all legitimate mathematics be based upon finite methods applied to integers, and to him is attributed the quote "God made the whole numbers, all the rest is the work of man."


THEOREM 2.1. ORTHOGONALITY. Let $j$ and $k$ be integers and let $N$ be a positive integer. Then

$$\sum_{n=0}^{N-1} e^{i2\pi nj/N}\,e^{-i2\pi nk/N} = N\,\delta_N(j-k). \tag{2.11}$$

Proof: As before, we let $\omega_N = e^{i2\pi/N}$. Consider the $N$ complex numbers $\omega_N^k$, for $k = 0 : N-1$. They are called the $N$th roots of unity because they satisfy

$$\bigl(\omega_N^k\bigr)^N = e^{i2\pi k} = 1$$

and therefore are zeros of the polynomial $z^N - 1$. In fact, $\omega_N^k$ is one of the $N$th roots of unity for any integer $k$, but it is easy to show that the sequence $\{\omega_N^k\}_{k=-\infty}^{\infty}$ is $N$-periodic (problem 9), so that the complete set of these roots may be specified by $\omega_N^k$ for any $N$ consecutive integers $k$. We first factor the polynomial $z^N - 1$ as

$$z^N - 1 = (z - 1)\bigl(z^{N-1} + z^{N-2} + \cdots + z + 1\bigr) = (z - 1)\sum_{n=0}^{N-1} z^n.$$

Noting that $\omega_N^{j-k}$ is a root of $z^N - 1 = 0$, there are two cases to consider. If we let $z = \omega_N^{j-k}$ where $j - k$ is not a multiple of $N$, then $z \neq 1$, and we have

$$\sum_{n=0}^{N-1} \omega_N^{n(j-k)} = 0.$$

On the other hand, if $j - k$ is a multiple of $N$ then $\omega_N^{j-k} = 1$ and

$$\sum_{n=0}^{N-1} \omega_N^{n(j-k)} = \sum_{n=0}^{N-1} 1 = N.$$

The orthogonality property follows from these two cases. ∎

Another proof of the orthogonality property can be constructed using the geometric series (see problem 12). Notice that since the sequence $\omega_N^k$ is $N$-periodic, the orthogonality property holds when the sum in (2.11) is computed over any $N$ consecutive values of $n$; in other words,

$$\sum_{n=P}^{P+N-1} \omega_N^{nj}\,\omega_N^{-nk} = N\,\delta_N(j-k)$$

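for any integer $P$.

The orthogonality property of the DFT is a relationship between vectors. Orthogonality of two vectors $\mathbf{x}$ and $\mathbf{y}$ in any vector space means that the inner product of the two vectors is zero; that is,

$$\langle \mathbf{x}, \mathbf{y} \rangle = 0.$$

For real vectors the inner product is simply

$$\langle \mathbf{x}, \mathbf{y} \rangle = \sum_{n=0}^{N-1} x_n y_n,$$

while for complex-valued vectors the inner product is

$$\langle \mathbf{x}, \mathbf{y} \rangle = \sum_{n=0}^{N-1} x_n y_n^*,$$

where $*$ denotes complex conjugation, that is, $(a + ib)^* = a - ib$. Thus, if we define the complex $N$-vector

$$\mathbf{w}^k = \bigl(\omega_N^{0}, \omega_N^{k}, \omega_N^{2k}, \ldots, \omega_N^{(N-1)k}\bigr)^T,$$

then

$$\langle \mathbf{w}^j, \mathbf{w}^k \rangle = \sum_{n=0}^{N-1} \omega_N^{nj}\,\omega_N^{-nk} = N\,\delta_N(j-k),$$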

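and any $N$ consecutive values of $k$ yield an orthogonal set of vectors.

With the puissance of orthogonality, we are ready to examine the inverse relation between the DFT and the IDFT. Here is the fundamental result.

THEOREM 2.2. INVERSION. Let $f_n$ be a sequence of $N$ complex numbers and let $F_k = \mathcal{D}\{f_n\}_k$ be the DFT of this sequence. Then $\mathcal{D}^{-1}\{\mathcal{D}\{f_n\}_k\}_n = f_n$.

Proof: Combining the definitions of $\mathcal{D}$ and $\mathcal{D}^{-1}$, we can write

$$\mathcal{D}^{-1}\{\mathcal{D}\{f_n\}_k\}_n = \sum_{k=-N/2+1}^{N/2}\left(\frac{1}{N}\sum_{j=-N/2+1}^{N/2} f_j\,\omega_N^{-jk}\right)\omega_N^{nk} = \frac{1}{N}\sum_{j=-N/2+1}^{N/2} f_j \sum_{k=-N/2+1}^{N/2} \omega_N^{k(n-j)} = \frac{1}{N}\sum_{j=-N/2+1}^{N/2} f_j\,N\,\delta_N(n-j),$$

where we have applied the orthogonality property. The inner sum in this equation is nonzero only when $j = n$ in the outer sum, yielding

$$\mathcal{D}^{-1}\{\mathcal{D}\{f_n\}_k\}_n = f_n. \qquad ∎$$

It is equally easy to show that $\mathcal{D}\{\mathcal{D}^{-1}\{F_k\}_n\}_k = F_k$, so that indeed we have a pair of operators that are inverses of each other (problem 16).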
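Both theorems lend themselves to a quick numerical check. The following is a minimal sketch for even $N$, not a substitute for the proofs; the helper names (`omega`, `centered`, `dft`, `idft`, `orth_sum`) and the test data are hypothetical.

```python
import cmath

def omega(N, m):
    """Return omega_N**m = exp(i 2 pi m / N)."""
    return cmath.exp(2j * cmath.pi * m / N)

def centered(N):
    """The index set -N/2+1 : N/2 (N even)."""
    return range(-N // 2 + 1, N // 2 + 1)

def dft(f):
    """Forward transform with the 1/N factor, centered indices."""
    N = len(f)
    idx = centered(N)
    return [sum(fn * omega(N, -n * k) for n, fn in zip(idx, f)) / N for k in idx]

def idft(F):
    """Inverse transform (no 1/N factor), centered indices."""
    N = len(F)
    idx = centered(N)
    return [sum(Fk * omega(N, n * k) for k, Fk in zip(idx, F)) for n in idx]

def orth_sum(j, k, N, P=0):
    """Sum of omega_N**(n j) * omega_N**(-n k) over N consecutive n starting at P."""
    return sum(omega(N, n * j) * omega(N, -n * k) for n in range(P, P + N))

N = 8
f = [complex(n * n - 3, 2 - n) for n in centered(N)]   # arbitrary complex data
roundtrip_error = max(abs(a - b) for a, b in zip(idft(dft(f)), f))
print(abs(orth_sum(3, 3, N)), abs(orth_sum(2, 5, N)), roundtrip_error)
```

The orthogonality sum returns $N$ when $j - k$ is a multiple of $N$ and (to rounding error) zero otherwise, regardless of the starting index $P$, and the round trip reproduces the input.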

Example revisited. Let's return to the example of Table 2.1 and Figure 2.5 in which a 12-point DFT of a real-valued sequence was computed, and now interpret it in light of the inverse DFT. We will begin by writing the inverse DFT using only real quantities. Recall that $F_k = \mathrm{Re}\{F_k\} + i\,\mathrm{Im}\{F_k\}$, and that in this case, in which $f_n$ is real-valued, $\mathrm{Re}\{F_k\}$ is an even sequence and $\mathrm{Im}\{F_k\}$ is an odd sequence. Furthermore, we saw that the DFT coefficients $F_0$ and $F_{N/2}$ are real. The IDFT now looks like

$$f_n = \sum_{k=-N/2+1}^{N/2} F_k\,\omega_N^{nk} = F_0 + (-1)^n F_{N/2} + \sum_{k=1}^{N/2-1}\bigl(F_k\,\omega_N^{nk} + F_{-k}\,\omega_N^{-nk}\bigr).$$

Let's pause for an explanation of a few maneuvers which we have used for the first of many times. As noted earlier, when the input sequence is real, the DFT coefficients $F_0$ and $F_{N/2}$ are real. Therefore, we have pulled these terms out of the IDFT sum, and used the fact that $\omega_N^{nk} = \cos(\pi n) = (-1)^n$ when $k = N/2$. We have also folded the sum so it runs over the indices $k = 1 : N/2-1$.

We may now continue and write

$$f_n = F_0 + (-1)^n F_{N/2} + \sum_{k=1}^{N/2-1}\Bigl[\mathrm{Re}\{F_k\}\bigl(\omega_N^{nk} + \omega_N^{-nk}\bigr) + i\,\mathrm{Im}\{F_k\}\bigl(\omega_N^{nk} - \omega_N^{-nk}\bigr)\Bigr].$$

A few more crucial facts have been used to get this far. Since the real part of $F_k$ is an even sequence, $\mathrm{Re}\{F_k\} = \mathrm{Re}\{F_{-k}\}$; and since the imaginary part of $F_k$ is an odd sequence, $\mathrm{Im}\{F_k\} = -\mathrm{Im}\{F_{-k}\}$ (see the DFT coefficients in Table 2.1). Furthermore, the Euler relations have been used to collect the complex exponentials and form sines and cosines as indicated. We are now almost there. The above argument continues to its completion as

$$f_n = F_0 + (-1)^n F_{N/2} + 2\sum_{k=1}^{N/2-1}\left[\mathrm{Re}\{F_k\}\cos\frac{2\pi nk}{N} - \mathrm{Im}\{F_k\}\sin\frac{2\pi nk}{N}\right],$$

where $n = -N/2+1 : N/2$. We see clearly that the sequence $f_n$ consists entirely of real quantities and is thus real-valued, as shown in Table 2.1. Furthermore, the meaning of the DFT coefficients is reiterated: the real part of $F_k$ is the weight of the cosine part of the $k$th mode, and the imaginary part of $F_k$ is the weight of the sine part of the $k$th mode in the recipe for the input sequence $f_n$.
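The real form of the inverse can be tested on the data of Table 2.1. The sketch below is illustrative and assumes a real-valued input of even length; the names `f_table`, `dft_centered`, and `real_idft` are hypothetical. The coefficients are recomputed from the input at full precision rather than read from the rounded table, so the reconstruction should agree with the input to rounding error.

```python
import cmath
import math

f_table = [0.7630, -0.1205, -0.0649, 0.6133, -0.2697, -0.7216,
           -0.0993, 0.9787, -0.5689, -0.1080, -0.3685, 0.0293]

def dft_centered(f):
    """F_k = (1/N) sum_n f_n exp(-i 2 pi n k / N), indices -N/2+1 : N/2."""
    N = len(f)
    idx = range(-N // 2 + 1, N // 2 + 1)
    return [
        sum(fn * cmath.exp(-2j * cmath.pi * n * k / N) for n, fn in zip(idx, f)) / N
        for k in idx
    ]

def real_idft(F):
    """Rebuild a real sequence from F_0, F_{N/2}, and F_k for k = 1 : N/2-1 only."""
    N = len(F)
    idx = list(range(-N // 2 + 1, N // 2 + 1))
    coeff = dict(zip(idx, F))
    out = []
    for n in idx:
        sign = 1 if n % 2 == 0 else -1                 # (-1)**n
        s = coeff[0].real + sign * coeff[N // 2].real
        for k in range(1, N // 2):
            t = 2 * math.pi * n * k / N
            s += 2 * (coeff[k].real * math.cos(t) - coeff[k].imag * math.sin(t))
        out.append(s)
    return out

rebuilt = real_idft(dft_centered(f_table))
max_error = max(abs(a - b) for a, b in zip(rebuilt, f_table))
print(max_error)
```

Only the coefficients with $k \geq 0$ are used, which reflects the fact that for real input the coefficients with negative index carry no new information.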

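Before closing this section, we will have a look at the DFT in the powerful and elegant language of matrices. The DFT maps a set of $N$ input values $f_n$ into $N$ output values $F_k$. We have shown that it is an invertible operator, and we will soon show that it is a linear operator. Therefore, the DFT can be represented as the product of an $N$-vector and an $N \times N$ matrix. In this setting it is more convenient to use the alternate definition of the DFT

$$F_k = \frac{1}{N}\sum_{n=0}^{N-1} f_n\,\omega_N^{-nk}$$

for $k = 0 : N-1$. If $\mathbf{f}$ represents the vector of input data,

$$\mathbf{f} = (f_0, f_1, \ldots, f_{N-1})^T,$$

and $\mathbf{F}$ represents the vector of output values,

$$\mathbf{F} = (F_0, F_1, \ldots, F_{N-1})^T,$$

then the DFT can be written as

$$\mathbf{F} = \mathbf{W}\mathbf{f}.$$

The DFT matrix $\mathbf{W}$ is the square, nonsingular matrix

$$\mathbf{W} = \frac{1}{N}\begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_N^{-1} & \omega_N^{-2} & \cdots & \omega_N^{-(N-1)} \\ 1 & \omega_N^{-2} & \omega_N^{-4} & \cdots & \omega_N^{-2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_N^{-(N-1)} & \omega_N^{-2(N-1)} & \cdots & \omega_N^{-(N-1)^2} \end{pmatrix}.$$

This matrix has many important properties [36], [99], some of which we list below.

• Since the DFT is invertible, we observe that a matrix representation for the IDFT exists and is $\mathbf{W}^{-1}$.

• The matrix $\mathbf{W}$ is symmetric, so that $\mathbf{W}^T = \mathbf{W}$.

• The inverse of $\mathbf{W}$ is a multiple of its complex conjugate:

$$\mathbf{W}^{-1} = N\,\mathbf{W}^*.$$

Therefore, $N\mathbf{W}\mathbf{W}^H = \mathbf{I}$ (where $H$ denotes the conjugate transpose and $\mathbf{I}$ is the identity matrix), and $\mathbf{W}$ is unitary up to a factor of $N$. The factor of $N$ can be shared equally, as a factor of $1/\sqrt{N}$ in the definitions of both the DFT and the IDFT, in which case $\mathbf{W}\mathbf{W}^H = \mathbf{I}$ (problem 18).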

• For $N > 4$ the matrix $\mathbf{W}$ has four distinct eigenvalues, namely

The multiplicities of the eigenvalues are $m_1$, $m_2$, $m_3$, and $m_4$, respectively, and are related to the order $N$ of the matrix as shown in Table 2.2 [6], [36], [99]. The value of the determinant of $\mathbf{W}$ is also related to $N$, and is included in the table.
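The first three properties in the list can be verified for a small matrix. The sketch below builds $\mathbf{W}$ from the alternate definition using plain nested lists; the helper names `dft_matrix`, `matmul`, and `conj` are hypothetical, and the size $N = 6$ is an arbitrary illustrative choice.

```python
import cmath

def dft_matrix(N):
    """W with entries omega_N**(-n k) / N, for n, k = 0 : N-1."""
    return [[cmath.exp(-2j * cmath.pi * n * k / N) / N for n in range(N)]
            for k in range(N)]

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    size = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(size)) for c in range(size)]
            for r in range(size)]

def conj(A):
    """Entrywise complex conjugate."""
    return [[z.conjugate() for z in row] for row in A]

N = 6
W = dft_matrix(N)

# Symmetry: W equals its transpose.
asymmetry = max(abs(W[r][c] - W[c][r]) for r in range(N) for c in range(N))

# Inverse: N * W * conj(W) should be the identity, i.e. W^{-1} = N W^*.
P = matmul(W, conj(W))
identity_error = max(abs(N * P[r][c] - (1 if r == c else 0))
                     for r in range(N) for c in range(N))
print(asymmetry, identity_error)
```

Because $\mathbf{W}$ is symmetric, its conjugate and its conjugate transpose coincide, which is why `conj(W)` alone suffices in the check of $N\mathbf{W}\mathbf{W}^H = \mathbf{I}$.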

While we will not spend a great deal of time considering the matrix formulation of the DFT, there are certain applications for which this is an extremely useful approach. For example, the multitude of FFT algorithms can be expressed in terms of factorizations of the matrix $\mathbf{W}$ [155]. For those interested in far deeper and more abstract algebraic and number theoretic properties of the DFT, we recommend the challenging paper by Auslander and Tolimieri [6].

TABLE 2.2
Multiplicity of the eigenvalues of the DFT matrix. Reprinted here, by permission, from [V. Čížek, Discrete Fourier Transforms and Their Applications, Adam Hilger, Bristol, England, 1986]. ©1986, Adam Hilger.

  N        m_1      m_2      m_3      m_4      det W
  4n       n + 1    n        n        n - 1    -i(-1)^n N^{N/2}
  4n + 1   n + 1    n        n        n        (-1)^n N^{N/2}
  4n + 2   n + 1    n + 1    n        n        -N^{N/2}
  4n + 3   n + 1    n + 1    n + 1    n        i(-1)^n N^{N/2}

2.4. DFT Approximations to Fourier Series Coefficients

As closely as the DFT is related to the Fourier transform, it may be argued that it holds even more kinship to the coefficients of the Fourier series. It is a simple matter to use the Fourier series to derive the DFT formula, and we will do so shortly. But first, it is worthwhile to spend a few pages highlighting some of the important features of the theory of Fourier series. We begin with the following definition.

► Fourier Series ◄

Let $f$ be a function that is periodic with period $A$ (also called $A$-periodic). Then the Fourier series associated with $f$ is the trigonometric series

$$f(x) \sim \sum_{k=-\infty}^{\infty} c_k\,e^{i2\pi kx/A}, \tag{2.12}$$

where the coefficients $c_k$ are given by

$$c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx. \tag{2.13}$$

The symbol $\sim$ means that the Fourier series is associated with the function $f$. We would prefer to make the stronger statement that the series equals the function at every point, but without imposing additional conditions on $f$, this cannot be said. The conditions that are required for convergence of the series to $f$ will be outlined shortly. For the moment, we will assume that the function is sufficiently well behaved to insure convergence, and will now show that expression (2.13) for the coefficients $c_k$ is correct. As with the DFT, our ability to find the coefficients $c_k$ depends on an orthogonality property.

THEOREM 2.3. ORTHOGONALITY OF COMPLEX EXPONENTIALS. Let $j$ and $k$ be any integers. Then the set of complex exponential functions $e^{i2\pi kx/A}$ satisfies the orthogonality relation

$$\int_{-A/2}^{A/2} e^{i2\pi jx/A}\,e^{-i2\pi kx/A}\,dx = A\,\delta(j-k),$$

where we have used the ordinary Kronecker delta

$$\delta(k) = \begin{cases} 1 & \text{if } k = 0, \\ 0 & \text{otherwise.} \end{cases}$$

Proof: The orthogonality property follows from a direct integration, the details of which are explored in problem 22. ∎

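As in the discrete case, continuous orthogonality can be viewed as orthogonality of "vectors" in a space of functions defined on an interval $I$. The inner product of $f$ and $g$ on $I$ is defined as

$$\langle f, g \rangle = \int_I f(x)\,g^*(x)\,dx.$$

For a vector space of complex-valued functions on the interval $[-A/2, A/2]$, the functions

$$e^{i2\pi kx/A}, \qquad k = 0, \pm 1, \pm 2, \ldots,$$

are orthogonal since

$$\bigl\langle e^{i2\pi jx/A}, e^{i2\pi kx/A} \bigr\rangle = \int_{-A/2}^{A/2} e^{i2\pi jx/A}\,e^{-i2\pi kx/A}\,dx = A\,\delta(j-k).$$

To find the coefficients $c_k$, we assume that the $A$-periodic function $f$ is the sum of its Fourier series, so that

$$f(x) = \sum_{j=-\infty}^{\infty} c_j\,e^{i2\pi jx/A}.$$

Multiplying both sides of this equation by $(1/A)e^{-i2\pi kx/A}$ and assuming that term-by-term integration over $[-A/2, A/2]$ is permitted, we find that

$$\frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx = \sum_{j=-\infty}^{\infty} c_j\,\frac{1}{A}\int_{-A/2}^{A/2} e^{i2\pi jx/A}\,e^{-i2\pi kx/A}\,dx.$$

By the orthogonality property, $1/A$ times the integral vanishes unless $k = j$, in which case it is unity. The only term that survives in the series on the right is the $k = j$ term, which gives

$$c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx.$$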

Since many applications involve real-valued functions $f$, it is often convenient to write the Fourier series in a form that involves no complex-valued quantities. For these situations we give the following definition.


► Fourier Series for Real-Valued $f$ ◄

Let $f$ be a real-valued function that is $A$-periodic. Then the Fourier series associated with $f$ is the trigonometric series

$$f(x) \sim \frac{a_0}{2} + \sum_{k=1}^{\infty}\left[a_k\cos\frac{2\pi kx}{A} + b_k\sin\frac{2\pi kx}{A}\right],$$

where the coefficients are given by

$$a_k = \frac{2}{A}\int_{-A/2}^{A/2} f(x)\cos\frac{2\pi kx}{A}\,dx$$

for $k = 0, 1, 2, \ldots$, and by

$$b_k = \frac{2}{A}\int_{-A/2}^{A/2} f(x)\sin\frac{2\pi kx}{A}\,dx$$

for $k = 1, 2, \ldots$.

The equivalence of the real and complex forms of the Fourier series for real-valued functions is easily established using the Euler relations, and is the subject of problem 23. The real form of the Fourier series, like the complex form, depends on an orthogonality property. In this case, we use the space of real-valued functions on the interval $[-A/2, A/2]$ with the inner product

$$\langle f, g \rangle = \int_{-A/2}^{A/2} f(x)\,g(x)\,dx.$$

It can be shown that the functions

$$\cos\frac{2\pi jx}{A} \qquad\text{and}\qquad \sin\frac{2\pi kx}{A},$$

where $j, k = 0, \pm 1, \pm 2, \pm 3, \ldots$, form orthogonal sets (problem 24).

But now we must ask about when a Fourier series converges. What conditions are sufficient for the Fourier series associated with $f$ to converge to $f$? The answer to this question is not simple. In fact, astonishing as it may seem considering the venerability of the topic, there are still open questions concerning Fourier series convergence. There is a vast literature on the issue of convergence ([30], [34], [36], [153], [158], among many others) and a treatment of this topic in any detail is far beyond the scope or purpose of this book. We will state here a widely used criterion for the convergence of the Fourier series.

First we need some definitions. A function $f$ is said to be piecewise continuous on an interval $[a, b]$ if it is continuous at all points in the interval except at a finite number of points $x_i$ at which it is either not defined or discontinuous, but at which the one-sided limits

$$f(x_i^+) = \lim_{x \to x_i^+} f(x) \qquad\text{and}\qquad f(x_i^-) = \lim_{x \to x_i^-} f(x)$$

exist and are finite. An alternative view is that $f$ is piecewise continuous on $[a, b]$ if that interval can be divided into a finite number of subintervals, on each of which $f$ is continuous. More generally, a function $f$ is said to be piecewise continuous (everywhere) if it is piecewise continuous on every finite interval. As an example, the function

is piecewise continuous, despite being undefined at x = 3 and having a jumpdiscontinuity at x = 2.

A function $f$ is said to be piecewise smooth on an interval $[a, b]$ if both $f$ and $f'$ are piecewise continuous on the interval, and is said to be piecewise smooth (everywhere) if it is piecewise smooth on every finite interval. Piecewise smoothness of a function implies the existence of the one-sided derivatives

$$f'(x^+) = \lim_{h \to 0^+}\frac{f(x+h) - f(x^+)}{h} \qquad\text{and}\qquad f'(x^-) = \lim_{h \to 0^-}\frac{f(x+h) - f(x^-)}{h}$$

at every point of discontinuity. Figure 2.7 illustrates these "piecewise properties." Shown in the upper two figures are the function $f(x)$ and its derivative $f'(x)$, which are defined by

The function $f(x)$ is piecewise continuous since it is continuous on each of the subintervals $[-1, 0]$, $(0, 1]$, and $(1, 2]$ and has both left- and right-hand limits at $x = 0$ and $x = 1$. The derivative $f'(x)$, however, does not have left-hand limits at either $x = 0$ or $x = 1$, but rather approaches $-\infty$. Hence $f(x)$ is piecewise continuous but not piecewise smooth. Shown in the lower figures are the function $g(x)$ and its derivative $g'(x)$, defined by

and


and

Unlike $f(x)$, the function $g(x)$ has a piecewise continuous derivative, and therefore is piecewise smooth. We may now state the main convergence result.

FIG. 2.7. The upper left figure shows the graph of the piecewise continuous function $f(x)$ on the interval $[-1, 2]$. The derivative $f'(x)$ is shown in the upper right. The function $f(x)$ is continuous on the subintervals $(-1, 0)$, $(0, 1)$, and $(1, 2)$ and has one-sided limits at 0 and 1. However, $f'(x)$, while continuous on the subintervals $(-1, 0)$, $(0, 1)$, and $(1, 2)$, has no left-hand limits at either 0 or 1. Hence $f(x)$ is not piecewise smooth. The lower left figure shows the graph of a function $g(x)$ that is piecewise smooth on $[-1, 2]$. The function is piecewise continuous, and one-sided derivatives exist at each of the points 0 and 1.

THEOREM 2.4. CONVERGENCE OF FOURIER SERIES. Let $f$ be a piecewise smooth $A$-periodic function. Then the Fourier series for $f$

$$\sum_{k=-\infty}^{\infty} c_k\,e^{i2\pi kx/A}$$

converges (pointwise) for every $x$ to the value

$$\frac{f(x^+) + f(x^-)}{2}.$$

We shall not prove this theorem here, but the interested reader can find a good discussion of the proof in [158]. However, some important observations about this result should be made. Since at a point of continuity the right- and left-hand limits of a function must be equal, and equal to the function value, it follows that at any point of continuity, the Fourier series converges to $f(x)$. At any point of discontinuity, the series converges to the average value of the right- and left-hand limits.
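The behavior at a discontinuity can be seen in a small experiment. The sketch below is an illustration only, under stated assumptions: it takes $f(x) = x$ on $[-1/2, 1/2]$ with $A = 1$ (a choice made here, not an example from the text), for which integration by parts gives $c_0 = 0$ and $c_k = i(-1)^k/(2\pi k)$ for $k \neq 0$. The periodic extension of this function jumps from $1/2$ to $-1/2$ at $x = 1/2$. The names `c` and `partial_sum` are hypothetical.

```python
import cmath
import math

def c(k):
    """Fourier coefficients of f(x) = x on [-1/2, 1/2] with A = 1 (assumed example)."""
    if k == 0:
        return 0.0
    sign = 1 if k % 2 == 0 else -1          # (-1)**k
    return 1j * sign / (2 * math.pi * k)

def partial_sum(x, M):
    """Symmetric partial sum of the Fourier series, k = -M : M."""
    return sum(c(k) * cmath.exp(2j * math.pi * k * x) for k in range(-M, M + 1))

at_jump = partial_sum(0.5, 50)        # the extension jumps here
interior = partial_sum(0.25, 2000)    # a point of continuity
print(abs(at_jump), interior.real)
```

At the jump every symmetric partial sum equals the average of the one-sided limits, which is $0$ here, while at the point of continuity the partial sums approach $f(0.25) = 0.25$, though only slowly.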

So far, the Fourier series has been defined only for periodic functions. However, an important case that arises often is that in which $f$ is defined and piecewise smooth only on the interval $[-A/2, A/2]$; perhaps $f$ is not defined outside of that interval, or perhaps it is not a periodic function at all. In order to handle this situation we need to know about the periodic extension of $f$, the function $h$ defined by

The periodic extension of $f$ is simply the repetition of $f$ every $A$ units on both sides of the interval $[-A/2, A/2]$. It should be verified that $h$ is an $A$-periodic function. Here is the important role of the periodic extension $h$: if the Fourier series for $f$ converges on $[-A/2, A/2]$⁶, then it converges

• to the value of $f$ at points of continuity on $(-A/2, A/2)$,

• to the average value of $f$ at points of discontinuity on $(-A/2, A/2)$,

• to the value of the periodic extension of $f$ at points of continuity outside of $(-A/2, A/2)$, and

• to the average value of the periodic extension at points of discontinuity outside of $(-A/2, A/2)$.

Figure 2.8 shows a function $f$ defined on an interval $[-1, 1]$ and its periodic extension beyond that interval. The periodic extension is the function to which the Fourier series of $f$ converges for all $x$ provided we use average values at points of discontinuity. In particular, if $f(-A/2) \neq f(A/2)$ then the Fourier series converges to the average of the function values at the right and left endpoints

$$\frac{f(-A/2) + f(A/2)}{2}.$$

These facts must be observed scrupulously when a function is sampled for input to the DFT.

With this prelude to Fourier series, we are now in a position to derive the DFT as an approximation to the integral that gives the Fourier series coefficients $c_k$. The process is similar to that used to derive the DFT as an approximation to the Fourier transform. Now we consider approximations to the integral

$$c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx.$$

⁶ There seems to be no agreement in the literature about whether the interval for defining Fourier series should be the closed interval $[-A/2, A/2]$, a half-open interval $(-A/2, A/2]$, or the open interval $(-A/2, A/2)$. Arguments can be made for or against any of these choices. We will use the closed interval $[-A/2, A/2]$ throughout the book to emphasize the point (the subject of sermons to come!) that in defining the input to the DFT, values of the sampled function at both endpoints contribute to the input.

FIG. 2.8. The piecewise smooth function $f$ shown here is defined on the interval $[-1, 1]$ (solid curve). Its Fourier series on that interval converges to the periodic extension of $f$ (solid and dashed curve) at all points $x$. Notice that at points at which the periodic extension is discontinuous (such as $x = \pm 1$) the Fourier series converges to the average value of the function at the discontinuity (marked by the small ovals).

As before, the interval of integration is divided into $N$ subintervals of equal length, and we let the grid spacing be $\Delta x = A/N$. A grid with $N + 1$ equally spaced points over the interval $[-A/2, A/2]$ is defined by the points $x_n = n\Delta x$ for $n = -N/2 : N/2$. Furthermore, we let

$$g(x) = f(x)\,e^{-i2\pi kx/A}$$

be the integrand in this expression. Applying the trapezoid rule gives the approximations

$$c_k \approx \frac{1}{A}\cdot\frac{\Delta x}{2}\left[g(x_{-N/2}) + 2\sum_{n=-N/2+1}^{N/2-1} g(x_n) + g(x_{N/2})\right].$$

Now the question of endpoint values enters in a critical way. Recall that in approximating the Fourier transform, we were able to associate the trapezoid rule with the DFT because of the assumption that $g(-A/2) = g(A/2)$. Now we must reason differently and actually avoid making an unnecessary assumption.

We have already seen that if the periodic extension of $f$ is discontinuous at the endpoints $x = \pm A/2$, then, when its Fourier series converges, it converges to the average value

$$\frac{f(-A/2) + f(A/2)}{2}.$$

Therefore, it is the average value of $f$ at the endpoints that must be used in the trapezoid rule. Noting that the kernel $e^{-i2\pi kx/A}$ has the value $(-1)^k$ at $x = \pm A/2$, we see that the function $g$ that must be used for the trapezoid rule is

$$g(x) = \begin{cases} f(x)\,e^{-i2\pi kx/A} & \text{for } -A/2 < x < A/2, \\[4pt] (-1)^k\,\dfrac{f(-A/2) + f(A/2)}{2} & \text{for } x = \pm A/2. \end{cases}$$

It should be verified that this choice of $g$, dictated by the convergence properties of the Fourier series, guarantees that

$$g(-A/2) = g(A/2).$$

In a similar way, an average value must be used at any grid points at which $f$ has discontinuities.

Using this definition of $g$, along with the observation that $2\pi kx_n/A = 2\pi nk/N$, reduces the trapezoid rule to

$$c_k \approx \frac{1}{N}\sum_{n=-N/2+1}^{N/2} g(x_n) = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f(x_n)\,e^{-i2\pi nk/N},$$

where $f(x_{N/2})$ is understood to be the average of the two endpoint values. Letting $f_n = f(x_n)$, we see that an approximation to the Fourier series coefficient $c_k$ is given by

$$c_k \approx F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk},$$

which, for $k = -N/2+1 : N/2$, is precisely the definition of the DFT. Thus we see that the DFT gives approximations to the first $N$ Fourier coefficients of a function $f$ on a given interval $[-A/2, A/2]$ in a very natural way. There are subtleties concerning the use of average values at the endpoints and discontinuities, but the importance of this issue will be emphasized many times in hopes of removing the subtlety!
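The effect of the endpoint average is easy to observe. The sketch below is illustrative and rests on assumptions made here rather than in the text: the test function is $f(x) = x$ on $[-1/2, 1/2]$ with $A = 1$, whose exact coefficients (by integration by parts) are $c_k = i(-1)^k/(2\pi k)$ for $k \neq 0$. The function name `coeff_dft` and its `average_endpoints` flag are hypothetical.

```python
import cmath
import math

def coeff_dft(f, A, N, k, average_endpoints=True):
    """Approximate the Fourier coefficient c_k of f on [-A/2, A/2] by an N-point DFT.

    With average_endpoints=True the sample at x = A/2 is replaced by the average
    of f(-A/2) and f(A/2); otherwise the raw value f(A/2) is used.
    """
    total = 0.0
    for n in range(-N // 2 + 1, N // 2 + 1):
        if n == N // 2 and average_endpoints:
            fn = 0.5 * (f(-A / 2) + f(A / 2))
        else:
            fn = f(n * A / N)
        total += fn * cmath.exp(-2j * cmath.pi * n * k / N)
    return total / N

def ramp(x):
    """Assumed test function: its periodic extension jumps at the endpoints."""
    return x

A, N, k = 1.0, 64, 1
exact = -1j / (2 * math.pi)            # c_1 for the ramp

err_avg = abs(coeff_dft(ramp, A, N, k, average_endpoints=True) - exact)
err_raw = abs(coeff_dft(ramp, A, N, k, average_endpoints=False) - exact)
print(err_avg, err_raw)
```

In this experiment the version that uses the raw endpoint value is markedly less accurate than the version that uses the average, which is the practical content of the warning about endpoint values.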

We have now shown that the DFT provides approximations to both the Fourier transform and the Fourier coefficients. The errors in these approximations will be investigated thoroughly in Chapter 6. Having related the DFT to both the Fourier transform and Fourier coefficients, we close this section by completing the circle and establishing a simple but important connection between Fourier transforms and Fourier series. When a function $f$ is spatially limited, meaning that $f(x) = 0$ for $|x| > A/2$, we see that the Fourier transform evaluated at the frequency $\omega_k = k/A$ is given by

$$\hat f(\omega_k) = \hat f\!\left(\frac{k}{A}\right) = \int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx.$$

Now compare this expression to the integral for the Fourier coefficient of $f$ on $[-A/2, A/2]$:

$$c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx.$$

The relationship is evident. In the case of a function that is zero outside of the interval $[-A/2, A/2]$, the Fourier transform and the Fourier coefficients are related by

$$\hat f(\omega_k) = \hat f\!\left(\frac{k}{A}\right) = A\,c_k$$

for $-\infty < k < \infty$. This relationship will prove to be quite useful in the pages to come.

2.5. The DFT from Trigonometric Approximation

The derivations shown so far have evolved from the problem of approximating either the Fourier series coefficients or the Fourier transform of a particular function. Another way to uncover the DFT follows by considering the problem of approximating (or fitting) a set of data with a function known as a trigonometric polynomial. The goal is to find a linear combination of sines and cosines that "best" approximates a given data set. One of the beautiful connections of mathematics is that the solution to this problem leads to the DFT. We have already seen a preview of this development in Chapter 1, with Gauss'⁷ interpolation of the orbit of Ceres.

Suppose that we are given $N$ data pairs that we will denote $(x_n, f_n)$, where $n = -(N-1)/2 : (N-1)/2$. (This derivation is one instance in which it is more convenient to work with an odd number of samples; everything said here can be done with minor modifications for even $N$.) The $x_n$'s are real and are assumed to be equally spaced points on an interval $[-A/2, A/2]$; that is, $x_n = n\Delta x$ where $\Delta x = A/N$. The $f_n$'s may be complex-valued. The data pairs may be viewed as samples of a continuous function $f$ that have been gathered at the grid points $x_n$. But equally likely is the instance in which the pairs originate as a discrete set of collected data.

We seek the best possible approximation to the data using the $N$-term trigonometric polynomial $\psi_N$, given by

$$\psi_N(x) = \sum_{k=-(N-1)/2}^{(N-1)/2} \alpha_k\,e^{i2\pi kx/A}.$$

We describe the function $\psi_N$ as a trigonometric polynomial because it is a polynomial in the quantity $e^{i2\pi x/A}$. There are many conditions that might be imposed to determine the "best" approximation to a data set. We will use the least squares criterion and require that the sum of the squares of the differences between the data values and the approximation function $\psi_N$ at the points $x_n$ be minimized. Said a little more concisely, we seek to choose the coefficients $\alpha_k$ to minimize the discrete least squares error

$$E = \sum_{n=-(N-1)/2}^{(N-1)/2} \bigl|f_n - \psi_N(x_n)\bigr|^2.$$

⁷ Born in 1777, in Brunswick, Germany, to uneducated parents, CARL FRIEDRICH GAUSS is universally regarded as one of the three greatest mathematicians of all time, the other two being Archimedes, Newton, Euler, Hilbert, or Euclid (pick two). He completed his doctoral thesis at age 20, and became Professor of Mathematics at the University of Göttingen in 1807, where he lived until his death in 1855. Gauss made lasting contributions to algebra, astronomy, geodesy, number theory, and physics.


The least squares error is a real- valued, nonnegative function of the N coefficients

Therefore, a necessary condition for the minimization of E is that the first partialderivatives of E with respect to each of the N coefficients vanish. Observing that

we arrive at the so-called normal equations for the problem (see problem 26):

where $k = -(N-1)/2 : (N-1)/2$. Rearranging the terms gives us the set of $N$ equations

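These expressions can be further simplified if we use our conventional notation that $\omega_N = e^{i2\pi/N}$. The normal equations now appear as

$$\sum_{n=-(N-1)/2}^{(N-1)/2} f_n\,\omega_N^{-nk} = \sum_{p=-(N-1)/2}^{(N-1)/2} \alpha_p \sum_{n=-(N-1)/2}^{(N-1)/2} \omega_N^{n(p-k)},$$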

where again $k = -(N-1)/2 : (N-1)/2$. As indicated, the inner sum on the right side begs to be treated by the orthogonality relation. Doing so, we find that

$$\sum_{n=-(N-1)/2}^{(N-1)/2} f_n\,\omega_N^{-nk} = N\alpha_k \qquad (2.20)$$

for $k = -(N-1)/2 : (N-1)/2$. Notice the minor miracle that has occurred! The rather dense set of normal equations that linked the coefficients $\alpha_p$ in an obscure way has been separated or "diagonalized" so that there is a single equation for each of the $N$ coefficients. From this last expression (2.20) it is evident that the least squares error is minimized when the coefficients in the approximating polynomial are given by the DFT of the data,

$$\alpha_k = \frac{1}{N} \sum_{n=-(N-1)/2}^{(N-1)/2} f_n\,\omega_N^{-nk} = \mathcal{D}\{f_n\}_k \qquad (2.21)$$

for $k = -(N-1)/2 : (N-1)/2$.

Having determined that the DFT coefficients give the trigonometric polynomial

$$\psi_N(x) = \sum_{k=-(N-1)/2}^{(N-1)/2} \alpha_k e^{i2\pi kx/A}$$

with minimum least squares error, it is natural to ask just how good this "best" polynomial is. What is the size of the error? Recall that the least squares error $E$ is given by

$$E = \sum_n |f_n - \psi_N(x_n)|^2 = \sum_n |f_n|^2 - \sum_n f_n\,\psi_N^*(x_n) - \sum_n f_n^*\,\psi_N(x_n) + \sum_n |\psi_N(x_n)|^2.$$

A direct calculation shows (problem 27) that each of the last three sums in this expression has the value $N\sum_n |\alpha_n|^2$, so that

$$E = \sum_n |f_n|^2 - N\sum_n |\alpha_n|^2. \qquad (2.22)$$

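Observing that $|\alpha_n|^2 = \alpha_n\alpha_n^*$, we may use (2.21) to obtain

$$E = \sum_n |f_n|^2 - \frac{1}{N}\sum_p \sum_m f_p f_m^* \sum_n \omega_N^{n(m-p)}.$$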

Since the last sum involves $N$ terms, we may invoke orthogonality and conclude that it is zero except when $p = m$, in which case it has the value $N$. Thus we observe, perhaps unexpectedly, that

$$E = \sum_n |f_n|^2 - \sum_p |f_p|^2 = 0.$$

Since the sum of the squares of the individual errors at the grid points is zero, it necessarily follows that the individual errors themselves must be zero. In other words, the approximation function $\psi_N$ must pass through each of the data points, or $\psi_N(x_n) = f_n$ at each grid point. This means that $\psi_N$ is both a least squares approximation and an interpolating function for the data. We have arrived at the fact that the least squares approximation is an interpolant in a rather circuitous manner. Indeed, the DFT can be derived directly by requiring that the polynomial $\psi_N$ interpolate the data; this approach is considered in problem 28.
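The conclusion that the least squares polynomial interpolates the data can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the sample values, the random seed, and the names `idx`, `M`, `alpha_ls`, and `alpha_dft` are illustrative assumptions and do not come from the text. It solves the least squares problem with a general-purpose solver and compares the result with the DFT coefficients.

```python
import numpy as np

# Hypothetical data: N = 7 complex samples (odd N, as in the derivation).
rng = np.random.default_rng(0)
N, A = 7, 2.0
idx = np.arange(-((N - 1) // 2), (N - 1) // 2 + 1)   # n, k = -(N-1)/2 : (N-1)/2
x = idx * A / N                                      # grid points x_n = n * A / N
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Column k of M holds the mode exp(i 2 pi k x / A) sampled at the grid points.
M = np.exp(2j * np.pi * np.outer(x, idx) / A)

# General least squares solution of M @ alpha ~= f.
alpha_ls, *_ = np.linalg.lstsq(M, f, rcond=None)

# DFT coefficients alpha_k = (1/N) sum_n f_n omega_N^{-nk}.
alpha_dft = np.exp(-2j * np.pi * np.outer(idx, idx) / N) @ f / N

# Least squares error of the polynomial built from the DFT coefficients.
E = np.sum(np.abs(f - M @ alpha_dft) ** 2)

print(np.allclose(alpha_ls, alpha_dft))   # the two sets of coefficients agree
print(E)                                  # zero up to rounding error
```

Under these assumptions the solver and the DFT should produce the same coefficients, and the error should vanish to rounding, which is the interpolation property in numerical form.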

One immediate consequence of this discovery is that with $E = 0$, (2.22) may now be written as

$$\sum_n |f_n|^2 = N\sum_n |\alpha_n|^2,$$

a fundamental property of the DFT known as Parseval's$^8$ relation.

Example: A least squares approximation. We would like to present an example of trigonometric approximation using the DFT. In order to provide some continuity with the earlier sections of the chapter, why not use the data set given in Table 2.1? We assume that those data are collected on the interval $[-1/2, 1/2]$, and tabulate them again in Table 2.3.

Notice that the first variable consists of equally spaced points $x_n$ on the interval $[-1/2, 1/2]$; therefore, we can take $A = 1$ in the above formulation. We have discovered that the coefficients in the least squares approximation polynomial

$$\psi_{12}(x) = \sum_{k=-5}^{6} \alpha_k e^{i2\pi kx}$$

are given by $\alpha_k = \mathcal{D}\{f_n\}_k$ and these values are also shown in Table 2.3. Recall that in this case, since the data $f_n$ are real, the coefficients $\alpha_k$ have the property

$$\alpha_{-k} = \alpha_k^*.$$

If we take the values of $\alpha_k$ given in Table 2.3 together with the symmetry in their real and imaginary parts, it can be shown (see problem 25) that

on the interval $[-1/2, 1/2]$. Note that this expression is essentially the real form of the inverse DFT. This trigonometric polynomial is plotted in Figure 2.9 together with the 12 data points. Clearly, the approximation function does its job: it passes through each of the data points.
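The example can be reproduced in a few lines. The sketch below is a hedged illustration in Python, assuming NumPy; the data values are the $f_n$ column of Table 2.3, while the helper `psi` and the variable names are assumptions made for this illustration only.

```python
import numpy as np

# Data values f_n from Table 2.3, for n = -5 : 6.
f = np.array([0.7630, -0.1205, -0.0649, 0.6133, -0.2697, -0.7216,
              -0.0993, 0.9787, -0.5689, -0.1080, -0.3685, 0.0293])
N = len(f)
idx = np.arange(-(N // 2) + 1, N // 2 + 1)   # n, k = -N/2+1 : N/2
x = idx / N                                  # grid on [-1/2, 1/2], so A = 1

# DFT coefficients alpha_k = (1/N) sum_n f_n omega_N^{-nk}.
alpha = np.exp(-2j * np.pi * np.outer(idx, idx) / N) @ f / N

def psi(t):
    """Evaluate the trigonometric polynomial sum_k alpha_k exp(i 2 pi k t)."""
    return np.exp(2j * np.pi * np.outer(np.atleast_1d(t), idx)) @ alpha

for k, a in zip(idx, alpha):
    print(f"{k:3d}  {a.real:8.4f}  {a.imag:8.4f}")

print(np.allclose(psi(x), f))   # does the polynomial pass through the data?
```

The printed coefficients can be compared with the last two columns of Table 2.3; for instance, $\alpha_0$ is the average of the data and $\alpha_6$ is the average of the data with alternating signs.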

TABLE 2.3
Coefficients of a least squares trigonometric polynomial.

n, k     x_n       f_n      Re{α_k}   Im{α_k}
 -5     -5/12     0.7630    0.0684   -0.1093
 -4     -1/3     -0.1205   -0.1684    0.0685
 -3     -1/4     -0.0649   -0.2143   -0.0381
 -2     -1/6      0.6133   -0.0606    0.1194
 -1     -1/12    -0.2697   -0.0418   -0.0548
  0      0       -0.7216    0.0052    0
  1      1/12    -0.0993   -0.0418    0.0548
  2      1/6      0.9787   -0.0606   -0.1194
  3      1/4     -0.5689   -0.2143    0.0381
  4      1/3     -0.1080   -0.1684   -0.0685
  5      5/12    -0.3685    0.0684    0.1093
  6      1/2      0.0293    0.1066    0.0000

FIG. 2.9. The figure shows $N = 12$ data points at equally spaced points of the interval $[-1/2, 1/2]$. Using the DFT, the least squares trigonometric polynomial for this data set can be found. This polynomial actually interpolates the data, meaning that it passes through each data point.

8 Little is known about the life of MARC ANTOINE PARSEVAL DES CHÊNES. He was born in 1755 and was forced to flee France in 1792 after writing poems critical of Napoleon. Nominated for membership in the Paris Academy of Sciences five times, he was never elected. The only enduring mathematical result published in his five memoirs is what we now call Parseval's relation.

2.6. Transforming a Spike Train

We have now derived the DFT as an approximation to the Fourier transform, as an approximation to the Fourier coefficients, and as the coefficients in an interpolating function. While this may seem like an exhaustive exhibition of the DFT, we will now take another approach by computing the Fourier transform of a sampled waveform, and showing that the DFT appears once again. This is a very natural and complementary approach since it requires applying the Fourier transform to a discretized function rather than approximating a Fourier integral. While this approach is stimulating, and also provides a quick survey of delta distributions, readers who feel that the DFT has already had sufficient introduction can move ahead to Chapter 3.

In order to proceed, we must quickly review the Dirac$^9$ delta function, often called the impulse function. This object, denoted $\delta(t)$, is not, in fact, a function; it is a distribution or a generalized function (these two terms actually have different meanings, but for our purposes, they are essentially the same [62], [93], [128]). However, we will abide by common practice and call $\delta$ the delta function. The first task is to avoid confusing the delta function with the Kronecker delta sequences $\hat{\delta}_N(n)$ and $\delta(n)$. Hopefully the notation and context will make the meaning clear.

9 PAUL ADRIEN MAURICE DIRAC (1902-1984) was one of the greatest theoretical physicists of the twentieth century. His groundbreaking work in quantum mechanics led to the discovery of quantum electrodynamics, the relativistic equations of the electron, and a Nobel Prize in 1933.

FIG. 2.10. Several members of the sequence $g_n(x)$ are shown, where $g_n(x) = n/2$ if $|x| < 1/n$, and $g_n(x) = 0$ otherwise. Note that each member of the sequence is nonzero over a smaller interval and has a greater amplitude than its predecessors. Furthermore, the area under the curve is unity for every member of the sequence.

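The delta function can be defined as the limit of a sequence of functions $\{g_n(x)\}$ having certain properties. A sequence $\{g_n(x)\}$ that leads to the $\delta$ function must satisfy

$$\int_{-\infty}^{\infty} g_n(x)\,dx = 1 \quad \text{for all } n$$

and

$$\lim_{n\to\infty} g_n(x) = 0 \quad \text{for } x \neq 0.$$

A simple example of a sequence possessing these desired properties is

$$g_n(x) = \begin{cases} n/2 & \text{if } |x| < 1/n, \\ 0 & \text{otherwise.} \end{cases}$$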

Successive members of this sequence have increasing amplitude over smaller and smaller intervals about the origin (see Figure 2.10): they get taller and skinnier, while maintaining unit area. For this reason the delta function is often said to have zero width, infinite height, with unit area under the curve. We may view the $\delta$ function as a spike.

It is tempting to define $\delta(x) = \lim_{n\to\infty} g_n(x)$, but this is not technically correct, since the sequence does not converge at $x = 0$. Nevertheless, we do take this limit as an informal definition, valid except at $x = 0$. The function is officially defined by its action when integrated against other regular functions. That is, we define properties of the delta function implicitly by requiring that, for any suitable function $f(x)$,

$$\int_{-\infty}^{\infty} \delta(x) f(x)\,dx = \lim_{n\to\infty} \int_{-\infty}^{\infty} g_n(x) f(x)\,dx.$$

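It should be verified that the sequence $\{g_n(x)\}$ (plus several more given in problem 29) satisfies the following properties.

1. Zero value. The $\delta$ function is zero almost everywhere; that is,

$$\delta(x) = 0 \quad \text{for } x \neq 0.$$

2. Scaling property. The $\delta$ function has the property that for $a$ a nonzero real number,

$$\delta(ax) = \frac{1}{|a|}\,\delta(x).$$

3. Unit area. The area under the $\delta$ function is unity, or

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1.$$

4. Product with regular functions. For every real number $y$ the product of a continuous function $f$ with the $\delta$ function satisfies

$$f(x)\,\delta(x - y) = f(y)\,\delta(x - y).$$

By this we mean that the two sides give the same result when integrated against any suitable function.

5. Sifting property. Integrating a function against the $\delta$ function has the effect of sifting (or testing for) the value of the function at the origin; that is,

$$\int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0).$$

6. General sifting property. Generalizing the previous property, the $\delta$ function has the property that

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - y)\,dx = f(y).$$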

The sifting property is extremely useful. Suppose, for example, that one wishes to shift a function, but not alter it in any other way (think of introducing a time delay in a signal). Then, by the sifting property, we may conclude that

$$\int_{-\infty}^{\infty} f(y)\,\delta(x - x_0 - y)\,dy = f(x - x_0).$$

This integral is a special example of an operation called convolution, which we will meet again. We see that the convolution of a function $f$ with a shifted $\delta$ function has the effect of translating the function to the location of the $\delta$ function (see Figure 2.11).

FIG. 2.11. The figure shows that the convolution of a function $f$ (top) with a spike $\delta(x - x_0)$ located at $x_0$ (middle) applies a shift to $f(x)$ so that $f(x) * \delta(x - x_0) = f(x - x_0)$ (bottom).
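The sifting property can be watched emerging from the box sequence $g_n(x) = n/2$ for $|x| < 1/n$. The following Python sketch (assuming NumPy) is only an illustration: the function name `integrate_against_g`, the midpoint rule, and the choice of $\cos$ as the test function are assumptions made here, not part of the text.

```python
import numpy as np

def integrate_against_g(f, n, y=0.0, m=2000):
    """Approximate the integral of f(x) * g_n(x - y) with the midpoint rule.

    g_n(x) = n/2 for |x| < 1/n and 0 otherwise, so only the interval
    [y - 1/n, y + 1/n] contributes to the integral.
    """
    width = 2.0 / n
    mid = y - 1.0 / n + (np.arange(m) + 0.5) * width / m   # m midpoints
    return np.sum(f(mid) * (n / 2.0)) * (width / m)

# As n grows, the integrals approach f(0) and f(y): the sifting property.
for n in (1, 10, 100, 1000):
    print(n,
          integrate_against_g(np.cos, n),
          integrate_against_g(np.cos, n, y=0.5))
```

The first column of results should approach $\cos(0)$ and the second $\cos(0.5)$, which is the general sifting property with the spike moved to $y = 0.5$.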

We will use the $\delta$ function to give us yet another derivation of the DFT shortly. Before we do, however, it is worthwhile to apply these properties to develop a few more important Fourier transform pairs.

Example: Fourier transform of a spike. In engineering and physics, we often idealize the response of a system to a brief pulse input by using a $\delta$ function, which is a spike of infinitesimal duration and infinite amplitude (integrating to unity). What frequencies make up a spike? Consider the Fourier transform of a spike. By the sifting property, we see that

$$\mathcal{F}\{\delta(x)\}(\omega) = \int_{-\infty}^{\infty} \delta(x)\,e^{-i2\pi\omega x}\,dx = e^{0} = 1.$$

The transform of a spike in the spatial domain is a flat spectrum; that is, all frequenciesare present at unit strength.

One might ask what the presence of a unit spike at the origin in the frequency domain implies about its spatial domain partner. In other words, what is the inverse Fourier transform of a delta function? Intuitively, a spike at zero frequency should correspond to a spatial function that is constant. Indeed, the inverse Fourier transform of a $\delta$ distribution is easily obtained, again by the sifting property:

$$\mathcal{F}^{-1}\{\delta(\omega)\}(x) = \int_{-\infty}^{\infty} \delta(\omega)\,e^{i2\pi\omega x}\,d\omega = 1.$$

Example: Fourier transform of the exponential function. What does the presence of a single spike in the frequency domain mean if it is not at the origin? Again, arguing by intuition, we expect a single spike at the frequency $\omega = \omega_0$ to represent a function that oscillates at precisely the frequency $\omega = \omega_0$. Consider the inverse transform of the shifted $\delta$ function. Once again the sifting property says that

$$\mathcal{F}^{-1}\{\delta(\omega - \omega_0)\}(x) = \int_{-\infty}^{\infty} \delta(\omega - \omega_0)\,e^{i2\pi\omega x}\,d\omega = e^{i2\pi\omega_0 x}.$$

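This result leads easily to the Fourier transform of a cosine, since by a similar argument

$$\mathcal{F}^{-1}\{\delta(\omega + \omega_0)\}(x) = e^{-i2\pi\omega_0 x}.$$

Adding and subtracting the inverse transforms of $\delta(\omega - \omega_0)$ and $\delta(\omega + \omega_0)$, we find that

$$\mathcal{F}^{-1}\left\{\tfrac{1}{2}\left[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)\right]\right\}(x) = \cos(2\pi\omega_0 x)$$

and

$$\mathcal{F}^{-1}\left\{\tfrac{1}{2i}\left[\delta(\omega - \omega_0) - \delta(\omega + \omega_0)\right]\right\}(x) = \sin(2\pi\omega_0 x).$$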

We now know that the Fourier transform of an exponential $e^{i2\pi\omega_0 x}$ is a single spike located at $\omega = \omega_0$, and that $\cos(2\pi\omega_0 x)$ has a strictly real Fourier transform consisting of a pair of spikes at $\omega = \pm\omega_0$, while the Fourier transform of $\sin(2\pi\omega_0 x)$ is purely imaginary, and consists of a pair of spikes at $\omega = \pm\omega_0$. Figure 2.12 displays the Fourier transform pairs developed using the delta function.

We can now use these results to derive the DFT from a different perspective. Instead of using a sampled version of a continuous function (as in previous approaches), this derivation will apply a continuous transform to a sampled function. Consider a "function" that consists of a finite sequence of regularly spaced spikes, each with some amplitude; we will refer to such a sequence as a spike train. (This object is sometimes referred to as a comb function, or a shah function.) Continuing to use the function notation, a spike train can be represented by a sum of shifted $\delta$ functions. If the spikes are separated by a distance $\Delta x$ and are located at the grid points $x_n = n\Delta x$, then the spike train can be written as

$$h(x) = \sum_n f_n\,\delta(x - x_n),$$

where fn is a set of amplitudes for the spikes. Note that h(x) has no meaning in theordinary sense of a function, being almost everywhere equal to zero.

Notice that we may regard this spike train as samples of an underlying continuum function $f$ where $f_n = f(x_n)$. This can be seen by using property 4 above and writing

$$f(x)\sum_n \delta(x - x_n) = \sum_n f(x)\,\delta(x - x_n) = \sum_n f(x_n)\,\delta(x - x_n).$$

In the last line we see that the delta functions $\delta(x - x_n)$ isolate the values of the function $f$ at the grid points $x = x_n$.

FIG. 2.12. Several Fourier transform pairs derived with the help of the delta function are shown here. Each row of figures consists of a real-valued function in the spatial domain (left), the real part of its Fourier transform in the frequency domain (center), and the imaginary part of its transform in the frequency domain (right). Shown are: the transform of a spike or impulse (top row), the inverse transform of a spike or impulse (second row), the transform of $\cos(2\pi\omega_0 x)$ (third row), and the transform of $\sin(2\pi\omega_0 x)$ (bottom row).

Suppose now that the spike train consists of precisely $N$ spikes located at the points $x = x_n = nA/N$ of the interval $[-A/2, A/2]$, where $n = -N/2 + 1 : N/2$. Let's see what happens if we formally take the Fourier transform of this spike train. It is given by

$$\mathcal{F}\{h(x)\}(\omega) = \int_{-\infty}^{\infty} \sum_{n=-N/2+1}^{N/2} f_n\,\delta(x - x_n)\,e^{-i2\pi\omega x}\,dx = \sum_{n=-N/2+1}^{N/2} f_n\,e^{-i2\pi\omega x_n}.$$

We have used the sifting property to arrive at the last line which tells us that theFourier transform of a spike train is just a linear combination of exponentials weightedby the amplitudes that appear in the spike train.


At this point there are two ways to proceed. Having determined the Fourier transform of the spike train, we wish to sample it, so that we have a discrete set of data in both domains. We are at liberty to choose any values of $\omega$ at which to sample the Fourier transform. We can rely on our previous experience with the reciprocity relations and choose the $N$ sample points $\omega_k = k\Delta\omega = k/A$, where $k = -N/2 + 1 : N/2$. The sampled version of the Fourier transform is then

$$\mathcal{F}\{h(x)\}(\omega_k) = \sum_{n=-N/2+1}^{N/2} f_n\,e^{-i2\pi\omega_k x_n} = \sum_{n=-N/2+1}^{N/2} f_n\,e^{-i2\pi nk/N} = N\,\mathcal{D}\{f_n\}_k$$

for $k = -N/2 + 1 : N/2$. In other words, if we are willing to appeal to the reciprocity relations, the DFT appears (up to the factor $N$) as the samples of the Fourier transform of the spike train.
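The sampling step can be sketched numerically. The Python fragment below (assuming NumPy) is an illustration under stated assumptions: the amplitudes are random, `spike_train_transform` is a hypothetical helper that evaluates the sum of exponentials found above, and the last lines cross-check against `numpy.fft.fft`, which indexes $n, k = 0 : N-1$ and omits the factor $1/N$.

```python
import numpy as np

rng = np.random.default_rng(1)               # hypothetical spike amplitudes
N, A = 8, 3.0
idx = np.arange(-(N // 2) + 1, N // 2 + 1)   # n, k = -N/2+1 : N/2
x = idx * A / N                              # spike locations x_n = n * A / N
f = rng.standard_normal(N)

def spike_train_transform(omega):
    """Fourier transform of h(x) = sum_n f_n delta(x - x_n) at frequencies omega."""
    return np.exp(-2j * np.pi * np.outer(np.atleast_1d(omega), x)) @ f

# Sample the continuous transform at omega_k = k / A.
samples = spike_train_transform(idx / A)

# DFT as defined in the text: (1/N) sum_n f_n omega_N^{-nk}.
dft = np.exp(-2j * np.pi * np.outer(idx, idx) / N) @ f / N

# Cross-check with NumPy's FFT by mapping indices modulo N (periodicity).
f0 = np.empty(N)
f0[idx % N] = f
dft_fft = np.fft.fft(f0)[idx % N] / N

print(np.allclose(samples, N * dft), np.allclose(dft, dft_fft))
```

The index mapping `idx % N` relies on the periodicity of the DFT sequences; it is a convenience for comparing the centered index set with the library's convention.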

However, there is a slightly more adventurous and independent way to capture the DFT. If we assume no familiarity with the reciprocity relations, then the choice of the sample points $\omega_k$ is not obvious. They can be determined if we require that the inverse Fourier transform applied to the spike train of samples $\mathcal{F}\{h(x)\}(\omega_k)$ brings us back to the input values $f_n$ with which we began. It is an interesting argument and is worth investigating. Before we present it, let's provide some notational relief by agreeing to let the set of indices

$$\{-N/2 + 1, \ldots, N/2\}$$

be denoted $\mathcal{N}$.

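Let $\omega_k = k\Delta\omega$, where $\Delta\omega$ is as yet unspecified, and $k \in \mathcal{N}$. The sampled version of the Fourier transform can be written as the spike train

$$\sum_{k\in\mathcal{N}} \mathcal{F}\{h(x)\}(\omega)\,\delta(\omega - \omega_k).$$

Another use of property 4 allows us to write

$$\sum_{k\in\mathcal{N}} \mathcal{F}\{h(x)\}(\omega_k)\,\delta(\omega - \omega_k) = \sum_{k\in\mathcal{N}} \Big(\sum_{n\in\mathcal{N}} f_n\,e^{-i2\pi\omega_k x_n}\Big)\,\delta(\omega - \omega_k). \qquad (2.31)$$

Notice that (2.31), the continuum Fourier transform of a spike train, is itself a spike train. The amplitudes of the spikes, that is, the coefficients of the $\delta(\omega - \omega_k)$ terms, are given by

$$\sum_{n\in\mathcal{N}} f_n\,e^{-i2\pi\omega_k x_n}. \qquad (2.32)$$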


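Forming the inverse Fourier transform of the spike train (2.31), we have

$$\int_{-\infty}^{\infty} \sum_{k\in\mathcal{N}} \Big(\sum_{n\in\mathcal{N}} f_n\,e^{-i2\pi\omega_k x_n}\Big)\,\delta(\omega - \omega_k)\,e^{i2\pi\omega x}\,d\omega = \sum_{k\in\mathcal{N}} \sum_{n\in\mathcal{N}} f_n\,e^{-i2\pi\omega_k x_n}\,e^{i2\pi\omega_k x},$$

which is a continuous function of $x$. At the peril of becoming strangled in our notation, we now sample this function by evaluating it at all of the original grid points $x_p = pA/N$, where $p \in \mathcal{N}$. This yields the sampled inverse Fourier transform

$$\sum_{k\in\mathcal{N}} \sum_{n\in\mathcal{N}} f_n\,e^{-i2\pi\omega_k x_n}\,e^{i2\pi\omega_k x_p}.$$

Rearranging the order of summation, replacing $x_n$ and $x_p$ by their definitions, and combining the exponentials we may write

$$\sum_{n\in\mathcal{N}} f_n \sum_{k\in\mathcal{N}} e^{i2\pi k\Delta\omega (p-n)A/N}.$$

We now see precisely which value for $\Delta\omega$ we must select to recover the original values of the input $f_n$. If we select $\Delta\omega = 1/A$, then $\omega_k = k/A$, and the previous equation becomes

$$\sum_{n\in\mathcal{N}} f_n \sum_{k\in\mathcal{N}} e^{i2\pi k(p-n)/N} = \sum_{n\in\mathcal{N}} f_n \sum_{k\in\mathcal{N}} \omega_N^{k(p-n)}.$$

The key to this passage is the orthogonality property. The summation over $k$ collapses to a single term, namely $N\hat{\delta}_N(p - n)$, and we obtain

$$\sum_{n\in\mathcal{N}} f_n\,N\hat{\delta}_N(p - n) = N f_p.$$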

Hence, with the choice $\Delta\omega = 1/A$, we find that the inverse Fourier transform of the spike train of Fourier transform values (2.31) returns the original input data, up to a multiple of $N$. For this reason, it seems natural to adopt $1/N$ times the sequence of amplitudes (2.32) of the spike train $\mathcal{F}\{h(x)\}(\omega)$ as the DFT. The DFT then becomes

$$F_k = \frac{1}{N} \sum_{n=-N/2+1}^{N/2} f_n\,e^{-i2\pi nk/N},$$

which is precisely the formula obtained in previous derivations.

The upshot of this argument (either using the reciprocity relations or orthogonality) is that samples of a continuous function $f$ can be expressed as a spike train of delta functions. The (continuous) Fourier transform of that spike train when sampled at the appropriate frequencies is precisely the DFT up to the scaling factor $N$. Although the demonstration of this fact relies on the use of the delta function, it does offer an alternative pathway to the DFT.
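The role of the frequency spacing in the orthogonality argument can be illustrated with a short experiment. This Python sketch (assuming NumPy) uses random amplitudes; the function name `round_trip` and the trial spacing $0.7/A$ are hypothetical choices made for the illustration. It forms the spike amplitudes at $\omega_k = k\Delta\omega$ and then evaluates the inverse transform back at the grid points $x_p$.

```python
import numpy as np

rng = np.random.default_rng(2)               # hypothetical input values f_n
N, A = 8, 3.0
idx = np.arange(-(N // 2) + 1, N // 2 + 1)   # index set -N/2+1 : N/2
x = idx * A / N                              # grid points x_n = n * A / N
f = rng.standard_normal(N)

def round_trip(d_omega):
    """Sample the spike-train transform at k * d_omega, then invert at x_p."""
    omega = idx * d_omega
    # Amplitudes of the frequency-domain spikes: sum_n f_n exp(-i 2 pi omega_k x_n).
    amplitudes = np.exp(-2j * np.pi * np.outer(omega, x)) @ f
    # Sampled inverse transform: sum_k amplitude_k exp(i 2 pi omega_k x_p).
    return np.exp(2j * np.pi * np.outer(x, omega)) @ amplitudes

print(np.allclose(round_trip(1.0 / A), N * f))   # spacing 1/A recovers N * f_p
print(np.allclose(round_trip(0.7 / A), N * f))   # another spacing does not
```

Only the reciprocity-compatible spacing makes the inner sum over $k$ collapse by orthogonality; with the other spacing the off-diagonal terms survive and the input is not recovered.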


2.7. Limiting Forms of the DFT-IDFT Pair

The remainder of this chapter will be devoted to a qualitative exploration of the relationships among the DFT, the Fourier series, and the Fourier transform. The apparent similarities among this trio are tantalizing; at the same time, there are differences that must be understood and reconciled. Of particular interest is the question: exactly what does the DFT approximate? We will try to answer this question conceptually by looking at the DFT in various limits, most notably when $N \to \infty$. While the discussion in this chapter will be incomplete, it does reach some salient qualitative conclusions which are valuable in their own right. It also sets the stage for the full analysis which will take place in Chapter 6.

Limiting Forms of the Forward DFT

Let's begin with a process that is hopefully becoming familiar. A function $f$ is sampled on the interval $[-A/2, A/2]$ at the $N$ grid points $x_n = n\Delta x$, where $\Delta x = A/N$. We will denote the sampled values of the function $f_n = f(x_n)$. Now consider the DFT as it has already been defined. It looks like

$$F_k = \frac{1}{N} \sum_{n=-N/2+1}^{N/2} f_n\,e^{-i2\pi nk/N}$$

for $k = -N/2 + 1 : N/2$. The discussion can be divided quite concisely into three situations for the forward DFT.

1. Let's assume first that $f$ is a function with period $A$ and that the goal is to approximate its Fourier coefficients. Since $x_n = n\Delta x = nA/N$, we can replace $n/N$ by $x_n/A$ in the DFT definition and write

$$F_k = \frac{1}{N} \sum_{n=-N/2+1}^{N/2} f(x_n)\,e^{-i2\pi k x_n/A}$$

for $k = -N/2 + 1 : N/2$. We now ask what happens if we hold $A$ and $k$ fixed and let $N \to \infty$. Since $\Delta x = A/N$, $\Delta x \to 0$ as $N \to \infty$, and the DFT sum approaches an integral. In fact,

$$\lim_{N\to\infty} F_k = \lim_{\Delta x\to 0} \frac{1}{A} \sum_{n=-N/2+1}^{N/2} f(x_n)\,e^{-i2\pi k x_n/A}\,\Delta x = \frac{1}{A} \int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx = c_k,$$

or simply

$$\lim_{N\to\infty} F_k = c_k.$$

We see that as $N \to \infty$ and $\Delta x \to 0$, the DFT approaches the Fourier coefficients of $f$ on the interval $[-A/2, A/2]$. For finite values of $N$ the DFT only approximates the $c_k$'s, and the error in this approximation is a discretization error that arises because $\Delta x$ is small, but nonzero. Letting $N \to \infty$ and $\Delta x \to 0$ reduces this error.

2. Consider the case of a spatially limited function $f$ which satisfies $f(x) = 0$ for $|x| > A/2$. Now the goal is to approximate the Fourier transform of $f$. We have already observed that in this case, the value of the Fourier transform at the frequency grid point $\omega_k = k/A$ is a multiple of the $k$th Fourier coefficient; that is,

$$\hat{f}(\omega_k) = A c_k.$$

Therefore, if we hold $A$ fixed (which means $\Delta\omega = 1/A$ is also fixed) and hold $k$ fixed (which means that $\omega_k = k\Delta\omega$ is also fixed), then we have that

$$\lim_{N\to\infty} A F_k = A c_k = \hat{f}(\omega_k).$$

It is interesting to note that by the reciprocity relation $A\Omega = N$, the length of the frequency domain $\Omega$ increases in this limit. In other words, as $N \to \infty$ and $\Delta x \to 0$, higher frequencies can be resolved by the grid, which is reflected in an increase in the length of the frequency domain. As in the previous case, the error in the DFT approximation to $\hat{f}(\omega_k)$ is a discretization error that decreases as $N \to \infty$.

It is misleading to think that the DFT becomes the entire Fourier transform as the number of data samples becomes infinite. Letting $N \to \infty$ allows the DFT sum to converge to the Fourier integral, but the transform is approximated only at certain isolated points in the frequency domain since $\Delta\omega$ remains constant as $N \to \infty$. The DFT does not provide any information whatsoever about $\hat{f}(\omega)$ if $\omega \neq k/A$. This distinction will prove to be of crucial importance in understanding the relationship between the DFT and the Fourier transform.

3. The third case is that in which $f$ does not vanish outside of a finite interval. As in the previous case we can let $N \to \infty$ and $\Delta x \to 0$ with $k$ held constant in the DFT. This limit gives us

$$\lim_{N\to\infty} A F_k = \int_{-A/2}^{A/2} f(x)\,e^{-i2\pi\omega_k x}\,dx.$$

However, since $f$ is nonzero outside of the interval $[-A/2, A/2]$, the integral in the previous line is necessarily an approximation to the Fourier transform

$$\hat{f}(\omega_k) = \int_{-\infty}^{\infty} f(x)\,e^{-i2\pi\omega_k x}\,dx.$$

The error in this approximation is a truncation error due to the fact that the interval of integration $(-\infty, \infty)$ has been truncated. Therefore, a second limit is required to eliminate this error and recover the exact value of $\hat{f}(\omega_k)$. Consider the effect of letting $A \to \infty$. By the reciprocity relations, this limit will also force $\Delta\omega = 1/A$ to zero. At this point we are involved in a very formal and conceptual exercise. However, the conclusion is quite instructive. Imagine that as $A \to \infty$ and $\Delta\omega \to 0$, we hold the combination $\omega_k = k\Delta\omega$ fixed. Then we can write formally that

$$\lim_{A\to\infty} \int_{-A/2}^{A/2} f(x)\,e^{-i2\pi\omega_k x}\,dx = \hat{f}(\omega_k).$$

We can now summarize this two-step limit process. The combined effect of the two limits might be summarized in the following manner:

$$\lim_{\substack{A\to\infty \\ \Delta\omega\to 0}} \Big( \lim_{\substack{N\to\infty \\ \Delta x\to 0}} A F_k \Big) = \hat{f}(\omega_k) \qquad (\omega_k \text{ held fixed}).$$

The outer limit eliminates the truncation error due to the fact that the DFT approximates an integral on the finite interval $[-A/2, A/2]$, while the inner limit eliminates the discretization error that arises because the DFT uses a discrete grid in the spatial domain. Of course, these limits cannot be carried out computationally, but they do tell us how DFT approximations can be improved by changing various parameters. These limits will be revisited and carried out explicitly in Chapter 6.
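The two kinds of error can be simulated by varying the grid parameters. The sketch below is a hedged illustration in Python (assuming NumPy). It uses the Gaussian $f(x) = e^{-\pi x^2}$, whose transform under the convention used here is $\hat{f}(\omega) = e^{-\pi\omega^2}$; the choice of test function and the names `scaled_dft` and `max_error` are assumptions made for this example.

```python
import numpy as np

def scaled_dft(f, A, N):
    """Return omega_k = k/A and A * D{f_n}_k for samples of f on [-A/2, A/2]."""
    idx = np.arange(-(N // 2) + 1, N // 2 + 1)
    x = idx * A / N
    F = np.exp(-2j * np.pi * np.outer(idx, idx) / N) @ f(x) / N
    return idx / A, A * F

def f(x):
    return np.exp(-np.pi * x**2)          # test function (an assumption)

def f_hat(w):
    return np.exp(-np.pi * w**2)          # its exact Fourier transform

def max_error(A, N):
    """Largest error of A * F_k as an approximation to f_hat(omega_k)."""
    omega, approx = scaled_dft(f, A, N)
    return np.max(np.abs(approx - f_hat(omega)))

# Truncation error: dx = 1/8 is fixed while the interval length A grows.
for A, N in [(2, 16), (4, 32), (8, 64)]:
    print("A =", A, "N =", N, "error =", max_error(A, N))

# Discretization error: A = 8 is fixed while dx = A/N shrinks.
for N in (8, 16, 32, 64):
    print("A = 8", "N =", N, "error =", max_error(8, N))
```

In the first loop the interval is too short to capture the tails of $f$ until $A$ is large; in the second the coarse grids cannot resolve $f$ until $\Delta x$ is small. Both effects must be controlled before the approximation is accurate.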

Refining the Frequency Grid

We have seen that with the interval $[-A/2, A/2]$ fixed, the quantity $A\,\mathcal{D}\{f_n\}_k$ approximates the Fourier transform at the discrete frequencies $\omega_k = k/A$. A common question is: what can be done if values of the Fourier transform are needed at points other than $k/A$? An approach to approximating $\hat{f}(\omega)$ at intermediate values of $\omega$ can be developed from the reciprocity relations. By way of example, suppose that values of $\hat{f}(\omega)$ are desired on a grid with spacing $\Delta\omega_{\text{new}} = 1/(2A)$ rather than $\Delta\omega = 1/A$. From the reciprocity relation (2.5),

$$\Delta x_{\text{new}}\,\Delta\omega_{\text{new}} = \frac{1}{N}.$$

Therefore,

$$\Delta x_{\text{new}} = \frac{1}{N\Delta\omega_{\text{new}}} = \frac{2A}{N} = 2\Delta x.$$

We see that if the number of points $N$ is held fixed and the grid spacing $\Delta x$ is doubled, then the grid spacing in the frequency domain $\Delta\omega$ is halved. Notice that this process also doubles the length of the spatial domain ($N\Delta x$) and halves the length of the frequency domain ($N\Delta\omega$). This operation clearly provides a refinement of the frequency grid, as shown in Figure 2.13.

If we are willing to double the number of grid points, then it is possible to refine the frequency grid without increasing $\Delta x$ or reducing the length of the frequency domain $\Omega$. The new grid lengths must also obey the reciprocity relations. Therefore, increasing $N$ to $2N$ with $\Omega$ constant has the effect of doubling the length of the spatial domain since the reciprocity relation with the new grid parameters,

$$A_{\text{new}}\,\Omega = 2N,$$

must still be obeyed. At the same time, the other reciprocity relation decrees that the frequency grid spacing must decrease if the reciprocity relations are to be satisfied:

$$\Delta x\,\Delta\omega_{\text{new}} = \frac{1}{2N}, \quad \text{so that} \quad \Delta\omega_{\text{new}} = \frac{\Delta\omega}{2}.$$

This manipulation provides the desired refinement in the frequency domain without losing representation of the high frequencies as shown in Figure 2.13. In either case, doubling $A$ has the effect of halving $\Delta\omega$, which refines the frequency grid. How does this refinement affect the sampling of the given function $f$, which must now be sampled on the interval $[-A, A]$? If $f$ is spatially limited and zero outside the interval $[-A/2, A/2]$, then it must be extended with zeros on the intervals $[-A, -A/2]$ and $[A/2, A]$. If $f$ is not zero outside the interval $[-A/2, A/2]$, then we simply sample $f$ at the appropriate grid points on the interval $[-A, A]$. We will return to the practice of padding input sequences with zeros in Chapter 3.

FIG. 2.13. The reciprocity relations determine how changes in the spatial grid affect the frequency grid. If the number of grid points $N$ is held fixed and the grid spacing $\Delta x$ is doubled, then the length of the spatial domain is doubled, and the frequency grid spacing is halved, as is the length of the frequency domain (top to middle). If the number of grid points $N$ is doubled with $\Delta x$ fixed, then $\Delta\omega$ is halved while the length of the frequency domain remains constant (top to bottom). Both of these processes result in a refinement of the frequency grid.
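Refinement by zero extension can be sketched as follows. This is an illustrative Python fragment (assuming NumPy); the triangle function used as the spatially limited $f$ and the helper name `scaled_dft` are assumptions of the sketch. Doubling $N$ with $\Delta x$ fixed means padding the samples with $N/2$ zeros on each side, after which every other point of the new frequency grid coincides with a point of the old one.

```python
import numpy as np

def scaled_dft(samples, A):
    """Return k/A and A * D{f_n}_k for samples indexed n = -N/2+1 : N/2."""
    N = len(samples)
    idx = np.arange(-(N // 2) + 1, N // 2 + 1)
    F = np.exp(-2j * np.pi * np.outer(idx, idx) / N) @ samples / N
    return idx / A, A * F

A, N = 2.0, 16
idx = np.arange(-(N // 2) + 1, N // 2 + 1)
x = idx * A / N
# Hypothetical spatially limited function: a triangle that is zero for |x| >= 1.
f = np.where(np.abs(x) < A / 2, 1.0 - np.abs(x), 0.0)

omega, F = scaled_dft(f, A)                    # frequency spacing 1/A

# Double N and A with dx fixed: extend the samples with N/2 zeros on each side.
f_padded = np.pad(f, N // 2)
omega2, F2 = scaled_dft(f_padded, 2 * A)       # frequency spacing 1/(2A)

print(omega[1] - omega[0], omega2[1] - omega2[0])   # spacing is halved
print(np.allclose(omega2[1::2], omega))             # old grid is a subset
print(np.allclose(F2[1::2], F))                     # values agree there
```

The values at the shared frequencies agree because the padded zeros contribute nothing to the sums; the new intermediate frequencies carry the additional information.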

Limiting Forms of the IDFT

We will close this section by asking: what does the inverse DFT approximate? Thequestion will be answered by looking at the limiting forms of the IDFT. As with theforward DFT the account can be given in three parts.

1. We first consider the case in which coefficients $F_k$ are given. These are presumed to be the Fourier coefficients of a function $f$ on an interval $[-A/2, A/2]$. The meaning of the coefficients $F_k$ determines how we interpret the output of the IDFT. The IDFT appears in its usual form as

$$f_n = \sum_{k=-N/2+1}^{N/2} F_k\,e^{i2\pi nk/N}$$

for $n = -N/2 + 1 : N/2$. We anticipate using a spatial grid on the interval $[-A/2, A/2]$ which has grid points $x_n = n\Delta x = nA/N$. Setting $n/N = x_n/A$ in the IDFT we have

$$f_n = \sum_{k=-N/2+1}^{N/2} F_k\,e^{i2\pi k x_n/A}$$

for $n = -N/2 + 1 : N/2$. We ask about the effect of letting $N \to \infty$ in this IDFT expression. Holding $x_n$ and $A$ fixed, we see that the IDFT becomes the Fourier series representation for $f$ evaluated at the point $x = x_n$:

$$\lim_{N\to\infty} f_n = \sum_{k=-\infty}^{\infty} F_k\,e^{i2\pi k x_n/A} = f(x_n).$$

So this is the first interpretation of the IDFT: if the input sequence $F_k$ is regarded as a set of Fourier coefficients for a function $f$ on a known interval $[-A/2, A/2]$, the IDFT approximates the Fourier series synthesis of $f$ at isolated points $x_n$. The error that arises in the IDFT approximations to the Fourier series is a truncation error, since the IDFT is a truncated Fourier series representation.

2. Now consider the case in which the input $F_k$ to the IDFT is regarded as a set of samples of the Fourier transform $\hat{f}$ of a function $f$. Assume furthermore that the function $f$ is band-limited, which means that $\hat{f}(\omega) = 0$ for $|\omega| > \Omega/2$ for some cut-off frequency $\Omega/2$. The samples of the Fourier transform, $F_k = \hat{f}(\omega_k)$, are taken at $N$ equally spaced points $\omega_k = k\Delta\omega$ of the interval $[-\Omega/2, \Omega/2]$, where $\Delta\omega = \Omega/N$. The grid that has been created in the frequency domain induces a grid in the spatial domain, and not surprisingly, the reciprocity relations come into play. As we have seen, the grid points in the spatial domain are $x_n = n\Delta x = nA/N$, where $\Delta x = 1/\Omega$.

Now let's have a look at the IDFT. It can be written

$$\sum_{k=-N/2+1}^{N/2} F_k\,e^{i2\pi nk/N} = \sum_{k=-N/2+1}^{N/2} \hat{f}(\omega_k)\,e^{i2\pi\omega_k x_n}$$

for $n = -N/2 + 1 : N/2$. We have used the relationships between the two grids to argue that

$$\frac{nk}{N} = (k\Delta\omega)(n\Delta x) = \omega_k x_n.$$

We can now see the effect of letting $N \to \infty$. Recall that the length of the frequency domain $\Omega$ is fixed, as is the spatial grid point $x_n$. Therefore, as $N \to \infty$, it follows that $\Delta\omega = \Omega/N \to 0$, and the length of the spatial domain $A$ increases. Therefore, in this limit the IDFT (multiplied by $\Delta\omega$) approaches an integral; specifically,

$$\lim_{\substack{N\to\infty \\ \Delta\omega\to 0}} \Delta\omega \sum_{k=-N/2+1}^{N/2} \hat{f}(\omega_k)\,e^{i2\pi\omega_k x_n} = \int_{-\Omega/2}^{\Omega/2} \hat{f}(\omega)\,e^{i2\pi\omega x_n}\,d\omega.$$

Note that the integral over the interval $[-\Omega/2, \Omega/2]$ is the inverse Fourier transform, since $\hat{f}$ is assumed to be zero outside of this interval. The interpretation of this chain of thought is that the IDFT approximates the inverse Fourier transform evaluated at $x = x_n$, which is just $f(x_n)$. The error that is reduced by letting $N \to \infty$ is a discretization error, since a discrete grid is used in the frequency domain.

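3. The assumption of band-limiting is a rather special case. More typically, the Fourier transform does not vanish outside of a finite interval. In this more general case, we might still expect the IDFT to approximate the inverse Fourier transform, and hence samples of $f$; however, there are both discretization errors and truncation errors to overcome. We may begin by assuming that $\Omega$ (and hence, by reciprocity, $\Delta x$) are held fixed while $N \to \infty$ and $\Delta\omega \to 0$. As in the previous case we have

$$\lim_{\substack{N\to\infty \\ \Delta\omega\to 0}} \Delta\omega \sum_{k=-N/2+1}^{N/2} \hat{f}(\omega_k)\,e^{i2\pi\omega_k x_n} = \int_{-\Omega/2}^{\Omega/2} \hat{f}(\omega)\,e^{i2\pi\omega x_n}\,d\omega.$$

However, this last integral is not the inverse Fourier transform

$$f(x_n) = \int_{-\infty}^{\infty} \hat{f}(\omega)\,e^{i2\pi\omega x_n}\,d\omega,$$

since $\hat{f}$ no longer vanishes outside $[-\Omega/2, \Omega/2]$. In order to recover the inverse Fourier transform, we must also (quite formally) let $\Omega \to \infty$. Note that by the reciprocity relations this also means that $\Delta x \to 0$, and so we must delicately keep the grid point $x_n$ fixed. The limit takes the form

$$\lim_{\Omega\to\infty} \int_{-\Omega/2}^{\Omega/2} \hat{f}(\omega)\,e^{i2\pi\omega x_n}\,d\omega = f(x_n).$$

Combining the two limit steps we may write collectively that for a fixed value of $x_n$

$$\lim_{\substack{\Omega\to\infty \\ \Delta x\to 0}} \Big( \lim_{\substack{N\to\infty \\ \Delta\omega\to 0}} \Delta\omega \sum_{k=-N/2+1}^{N/2} F_k\,e^{i2\pi nk/N} \Big) = f(x_n).$$

The outer limit accounts for the fact that the IDFT approximations use a finite interval of integration in the frequency domain. The inner limit recognizes the fact that the IDFT uses a finite grid to approximate an integral. This two-step limit process will also be examined more carefully and carried out explicitly in Chapter 6.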
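The inverse direction can be sketched in the same spirit. The Python fragment below (assuming NumPy) is an illustration under stated assumptions: it samples the transform $\hat{f}(\omega) = e^{-\pi\omega^2}$ of the Gaussian $f(x) = e^{-\pi x^2}$, which is not band-limited but is negligibly small beyond the chosen $\Omega/2$, and compares $\Delta\omega$ times the IDFT with $f(x_n)$. The parameter values and variable names are hypothetical.

```python
import numpy as np

Omega, N = 8.0, 64                           # length of the frequency domain, grid size
idx = np.arange(-(N // 2) + 1, N // 2 + 1)   # n, k = -N/2+1 : N/2
d_omega = Omega / N
omega = idx * d_omega                        # frequency grid on [-Omega/2, Omega/2]
x = idx / Omega                              # dx = 1/Omega by reciprocity

F = np.exp(-np.pi * omega**2)                # samples of the transform (assumed example)

# IDFT as defined in the text: sum_k F_k omega_N^{nk}.
idft = np.exp(2j * np.pi * np.outer(idx, idx) / N) @ F

# Scaling by d_omega turns the sum into an approximation of the inverse integral.
approx = d_omega * idft
error = np.max(np.abs(approx - np.exp(-np.pi * x**2)))
print(error)
```

With these parameters both the truncation of the frequency interval and the discretization of the frequency grid should contribute only negligible errors; shrinking `Omega` or `N` makes each effect visible.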

This concludes an extremely heuristic analysis of the relationships among theDFT/IDFT, Fourier series, and Fourier transforms. Hopefully even this qualitativetreatment illustrates a few fundamental lessons. First, it is essential to know what theinput sequence to either the DFT or the IDFT represents, since this will determinewhat the output sequence approximates. We have seen that the DFT may provideapproximations to either Fourier coefficients or samples of the Fourier transform,depending upon the origins of the input sequence. We also demonstrated that theIDFT can approximate either the value of a Fourier series or the inverse Fouriertransform at selected grid points, again depending upon the interpretation of theinput data. Beyond these observations, it is crucial to understand how the DFT/IDFTapproximations can be improved computationally. While the limiting processes justpresented are formal, they can be simulated in computation by varying the gridparameters in appropriate ways. Above all, we wish to emphasize that there aresubtleties when it comes to interpreting the DFT and the IDFT. We will mostassuredly return to these subtleties in the pages to come.

2.8. Problems

8. The roots of unity. Show that $\omega_N^{-k}$ satisfies the equation $z^N - 1 = 0$ for $k = 0 : N - 1$. Show that $\omega_N^{k}$ satisfies the equation $z^N - 1 = 0$ for $k = P : P + N - 1$, where $P$ is any integer.

9. Periodicity of $\omega_N^{nk}$.

(a) Show that the sequence $\{\omega_N^{k}\}_{k=-\infty}^{\infty}$ is periodic of length $N$.

(b) Show that the sequence $\{\omega_N^{nk}\}_{n,k=-\infty}^{\infty}$ is periodic in both $n$ and $k$ with period $N$.

(c) Show also that $\omega_N^{k} \neq \omega_N^{p}$ if $0 < |k - p| < N$.

(d) Show that $F_k$ and $f_n$ as given by the forward (2.6) and inverse (2.9) DFT are $N$-periodic sequences.

10. Modes of the DFT. Write out and sketch the real modes (cosine and sine modes) of the DFT in the case $N = 8$. For each mode note its period and frequency.

11. General orthogonality. Show that the discrete orthogonality relation may be defined on any set of $N$ consecutive integers; that is, for any integer $P$,

$$\sum_{n=P}^{P+N-1} \omega_N^{nj}\,\omega_N^{-nk} = N\hat{\delta}_N(j - k).$$

12. Another proof of orthogonality. An entirely different approach may be used to prove orthogonality without considering the polynomial $z^N - 1$. Use the partial sum of the geometric series

$$\sum_{n=0}^{N-1} a r^n = a\,\frac{1 - r^N}{1 - r}, \qquad r \neq 1,$$

to prove the orthogonality property of problem 11.

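13. Matrix representation. In the text, the matrix, $W$, for the alternate DFT

$$F_k = \frac{1}{N} \sum_{n=0}^{N-1} f_n\,\omega_N^{-nk}$$

for $k = 0 : N - 1$, was presented. Find the DFT matrix that corresponds to the DFT definition

$$F_k = \frac{1}{N} \sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk}$$

for $k = -N/2 + 1 : N/2$, when $N = 8$.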

14. The reciprocity relations. In each of the following cases assume that a function $f$ is sampled at $N$ points of the given interval in the spatial domain. Give the corresponding values of $\Delta\omega$ and $\Omega$ in the frequency domain. Then give the values of all grid parameters ($A$, $\Delta\omega$, and $\Omega$) if (i) $N$ is doubled and $\Delta x$ is halved, and (ii) $N$ is doubled leaving $\Delta x$ unchanged. Sketches of the spatial and frequency grids before and after each change would be most informative.

15. Using the reciprocity relations to design grids.

(a) The function $f$ has a period of two and is sampled with $N = 64$ grid points on an interval of length $A = 2$. These samples are used as input to the DFT in order to approximate the Fourier coefficients of $f$. What are the minimum and maximum frequencies that are represented by the DFT? Do these minimum and maximum frequencies change if the same function is sampled, with $N = 64$, on an interval of length $A = 4$?

(b) The function $f$ is zero for $|x| > 5$. The DFT will be used to approximate the Fourier transform of $f$ at discrete frequencies up to a maximum frequency of 100 periods per unit length. Use the reciprocity relations to select a grid in the spatial domain that will provide this resolution. Specify the minimum $N$ and the maximum $\Delta x$ that will do the job.

(c) The function $f$ is available for sampling on the interval $(-\infty, \infty)$. The DFT will be used to approximate the Fourier transform of $f$ at discrete frequencies up to a maximum frequency of 500 periods per unit length. Use the reciprocity relations to select a grid in the spatial domain that will provide this resolution. Specify the minimum $N$ and the maximum $\Delta x$ that will do the job.


16. Inverse relation. Follow the proof of Theorem 2.2 to show that

17. Equivalence of DFT forms. Assuming that the input sequence $f_n$ and the output sequence $F_k$ are periodic, show that the $N$-point DFT can be defined on any set of $N$ consecutive integers,

\[ F_k = \frac{1}{N}\sum_{n=P}^{P+N-1} f_n\,\omega_N^{-nk} \]

for $k = Q : Q+N-1$, where $P$ and $Q$ are any integers.

18. Matrix properties. Show that the DFT matrix $W$ is symmetric and that $W^{-1} = NW^*$. Furthermore, show that $NWW^H = I$, where $H$ denotes the conjugate transpose. Hence $W$ is a unitary matrix up to a factor of $N$.

19. Average value property. Show that the zeroth DFT coefficient is the average value of the input sequence,

\[ F_0 = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n. \]

20. Modes of the alternate DFT. Consider the alternate form of the DFT

\[ F_k = \frac{1}{N}\sum_{n=0}^{N-1} f_n\,\omega_N^{-nk}. \]

Now the modes $\omega_N^{nk}$ are indexed by $k = 0 : N-1$. For each value of $k$ in this set, find the period and frequency of the corresponding mode. In particular, verify that the modes with $k = 0$ and $k = N/2$ (for $N$ even) are real. Show that the modes with indices $k$ and $N-k$ have the same period, which means that the high frequency modes are clustered near $k = N/2$, while the low frequency modes are near $k = 0$ and $k = N$.

21. Sampling input for the DFT. Consider the functions

on the interval $[-1, 1]$. Sketch each function on $[-1, 1]$ and sketch its periodic extension on $[-4, 4]$. Show how each function should be sampled if the DFT is to be used to approximate its Fourier coefficients on $[-1, 1]$.

22. Continuous orthogonality. Show that the continuous Fourier modes on the interval $[-A/2, A/2]$ satisfy the orthogonality relation

\[ \int_{-A/2}^{A/2} e^{i2\pi jx/A}\,e^{-i2\pi kx/A}\,dx = A\,\delta(j-k) \]

for integers $j$ and $k$. (Hint: Consider $j = k$ separately and integrate directly when $j \neq k$.)

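23. Real form of the Fourier Series. Use the Euler relations

\[ \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} \]

to show that if $f$ is real-valued, then the real form of the Fourier series

\[ f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\left[a_k\cos\left(\frac{2\pi kx}{A}\right) + b_k\sin\left(\frac{2\pi kx}{A}\right)\right] \]

with coefficients given by

\[ a_k = \frac{2}{A}\int_{-A/2}^{A/2} f(x)\cos\left(\frac{2\pi kx}{A}\right)dx \]

for $k = 0, 1, 2, \ldots$, and by

\[ b_k = \frac{2}{A}\int_{-A/2}^{A/2} f(x)\sin\left(\frac{2\pi kx}{A}\right)dx \]

for $k = 1, 2, \ldots$, is equivalent to the complex form

\[ f(x) = \sum_{k=-\infty}^{\infty} c_k\,e^{i2\pi kx/A}, \]

where the coefficients $c_k$ are given by

\[ c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx \]

for $k = -\infty : \infty$. Find the relationship between $c_k$ and $\{a_k, b_k\}$.

24. Orthogonality of sines and cosines. Show that the functions $w_k$ and $v_j$, defined by

satisfy the following orthogonality properties on $[-A/2, A/2]$:

where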


25. Real form of the trigonometric polynomial. Assuming $N$ is odd, show that if the coefficients $\alpha_k$ in the $N$-term trigonometric polynomial

\[ \psi_N(x) = \sum_{k=-(N-1)/2}^{(N-1)/2} \alpha_k\,e^{i2\pi kx/A} \]

satisfy

\[ \alpha_{-k} = \alpha_k^* \]

(which will be the case if the data $f_n$ are real-valued), then the polynomial can be written in the form

where $k = -(N-1)/2 : (N-1)/2$.

26. The normal equations. Show that the necessary conditions that the least squares error be minimized,

for $k = -(N-1)/2 : (N-1)/2$, yield the system of normal equations (2.18).

27. Properties of the least squares approximation. Verify the following facts that were used to relate the DFT to the least squares trigonometric approximation. The least squares approximation to the $N$ data points $(x_n, f_n)$ is denoted $\psi_N$.

(a) Use the orthogonality property and the fact that $\alpha_k = \mathcal{D}\{f_n\}_k$ to show that

(b) Show that with the coefficients $\alpha_k$ given by the DFT, the error in the least squares approximation is zero; that is,


28. Trigonometric interpolation. The result of the least squares approximation question was that the trigonometric polynomial $\psi_N$ actually interpolates the given data. Show this conclusion directly: given a data set $(x_n, f_n)$ for $n = -N/2+1 : N/2$, where $x_n = nA/N$, assume an interpolating polynomial of the form

\[ \psi_N(x) = \sum_{k=-N/2+1}^{N/2} \alpha_k\,e^{i2\pi kx/A}. \]

Then impose the interpolation conditions $\psi_N(x_n) = f_n$ for $n = -N/2+1 : N/2$ to show that the coefficients are given by $\alpha_k = \mathcal{D}\{f_n\}_k$.

29. The Dirac $\delta$ function. Sketch and show that each of the following sequences of functions $\{g_n(x)\}_{n=1}^{\infty}$ satisfies the three properties

and that each sequence could thus be used as a defining sequence for the Dirac $\delta$ function.

Thus each of these sequences could be used to define the Dirac $\delta$ function.

Chapter 3

Properties of the DFT

3.1 Alternate Forms for the DFT

3.2 Basic Properties of the DFT

3.3 Other Properties of the DFT

3.4 A Few Practical Considerations

3.5 Analytical DFTs

3.6 Problems

All intelligent conversation is playing on words. The rest is definition or instruction.

Herman Wouk, The Caine Mutiny

3.1. Alternate Forms for the DFT

The DFT arises in so many different settings and is used by practitioners in so many different fields that, not surprisingly, it appears in many different disguises. In defining the DFT in the previous chapter, we issued the proviso that the definition used primarily in this book, namely

\[ F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk} \]

for $k = -N/2+1 : N/2$, is only one of many that appear in the literature. To underscore this point we will occasionally use different forms of the DFT even in this book. Given this state of affairs, it seems reasonable to devote just a few moments to other forms of the DFT that might be encountered in practice. Knowing the protracted deliberations that led to our choice of a DFT definition, it would be foolish to suggest that one form is superior among all others. There is no single DFT that has a clear advantage over all others. The best attitude is to accept the DFT's multiple personalities and to deal with them however they appear. This approach is particularly valuable in working with DFT (or FFT) software, an issue we will also discuss briefly.

Many of the variations on the DFT are truly superficial. Other authors have chosen virtually every combination of the following options:

1. placing the scaling factor $1/N$ on the forward or inverse transform (the option of using $1/\sqrt{N}$ on both transforms also occurs),

2. using $i$ or $j$ for $\sqrt{-1}$,

3. using $\pm$ in the exponent of the kernel on the forward or inverse transform, and

4. including in the notation the left-hand or right-hand endpoint of the sampling interval (we have seen that the endpoint value actually used must be the average of the endpoint values).

Having listed these options we will dispense with them as notational differences. There are other choices that are more substantial.

Much of the latitude in defining the DFT comes from the implied periodicity of both the input sequence $f_n$ and the transform sequence $F_k$. An $N$-point DFT can be defined on any $N$ consecutive terms of the periodic sequence $f_n$, and can be used to define any $N$ consecutive terms of the periodic transform sequence $F_k$. In other words, a general form of the forward $N$-point DFT is

\[ F_k = \frac{1}{N}\sum_{n=P+1}^{P+N} f_n\,\omega_N^{-nk} \]

for $k = Q+1 : Q+N$, where $P$ and $Q$ are any integers and $N$ is any positive integer. Of this infinitude of DFTs, only a few are useful and have been committed to practice. We will consider these select few in this section.
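The freedom in choosing $P$ and $Q$ can be checked numerically. The following is a minimal sketch, not a production routine: the helper name `dft_general`, the test data, and the choice of $N = 8$ are all illustrative assumptions. It evaluates the general form directly and confirms that different windows of $N$ consecutive indices describe the same periodic transform.

```python
import cmath

def dft_general(f, N, P, Q):
    """Forward N-point DFT summed over n = P+1 : P+N, evaluated for k = Q+1 : Q+N.

    `f` is a function returning the n-th term of an N-periodic sequence.
    Returns a dict mapping each k to F_k (1/N scaling on the forward transform).
    """
    return {
        k: sum(f(n) * cmath.exp(-2j * cmath.pi * n * k / N)
               for n in range(P + 1, P + N + 1)) / N
        for k in range(Q + 1, Q + N + 1)
    }

N = 8
one_period = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]  # arbitrary test data
f = lambda n: one_period[n % N]                            # periodic extension

# Centered form: n, k = -N/2+1 : N/2, that is, P = Q = -N/2.
centered = dft_general(f, N, P=-N // 2, Q=-N // 2)

# Summing over a different window of N consecutive inputs changes nothing.
other_window = dft_general(f, N, P=10, Q=-N // 2)
max_window_diff = max(abs(centered[k] - other_window[k]) for k in centered)

# Noncentered output k = 0 : N-1 is the periodic extension of the centered output.
noncentered = dft_general(f, N, P=-1, Q=-1)
max_index_diff = max(
    abs(noncentered[k] - centered[k if k <= N // 2 else k - N])
    for k in noncentered
)
print(max_window_diff, max_index_diff)  # both at rounding-error level
```

The direct $O(N^2)$ sum is used only because it mirrors the definition term by term; any FFT would give the same values.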

There appear to be three factors distinguishing the DFTs that are extant in the literature. We will base our DFT taxonomy upon these three factors:


FIG. 3.1. Using centered indices (top figure), the low frequencies correspond to $|k| < N/4$ for a single set of sample points, while the high frequencies correspond to $|k| > N/4$. With noncentered indices, the low frequencies correspond to $0 \le k < N/4$ and $3N/4 < k < N$, and the high frequencies correspond to $N/4 < k < 3N/4$ (for a single set).

1. Is the sampling interval centered about the origin or not? If the sampling takes place on an interval of length $A$, this is the choice between the intervals $[-A/2, A/2]$ and $[0, A]$. The centered forms of the DFT generally occur when the sample interval is a spatial domain (for example, an image or an object), whereas the noncentered forms are generally used for time-dependent (causal) sequences. Centered indices (for example, $n, k = -N/2 : N/2$) have the appeal that in the frequency domain, the low frequency indices are clustered about $k = 0$, while the high frequency indices are near the two ends of the index set for $|k| \approx N/2$. In the noncentered cases (for example, $n, k = 0 : N-1$), the low frequency indices are near the ends of the index set ($k \approx 0$ and $k \approx N$), and the high frequency indices are near the center ($k \approx N/2$). Some may find it disconcerting that for the noncentered cases, the low frequencies are associated with both low- and high-valued indices. The distribution of high and low frequencies in various cases is summarized in Figure 3.1.

2. Does the DFT use an even or odd number of sample points? This distinction is different from the issue of whether $N$ itself is even or odd, and some confusion is inevitable. For example, with $N$ even it is possible to define a DFT on $N$ points (an even number of points) by using the index ranges $n, k = -N/2+1 : N/2$ or on $N+1$ points (an odd number of points) by using the index ranges $n, k = -N/2 : N/2$. As another example, with $N$ even or odd, it is possible to define a DFT on $2N$ points (an even number of points) using the index ranges $n, k = 0 : 2N-1$ or on $2N+1$ points (an odd number of points) using the index ranges $n, k = 0 : 2N$. Examples of these various combinations will be given shortly.

3. Is the DFT defined on a single or double set of points? If the input and output sequences consist of either $N$, $N-1$, or $N+1$ sample points, we will say that the DFT is defined on a single set of points. If the DFT is defined on either $2N$ or $2N+1$ points, we will use the term double set of points. There seems to be no overriding reason to favor one option over the other. With double sets, the parity of $N$ is immaterial, and the choice of an even or odd number of sample points is reflected in the index range. This may be the only advantage, and it is not significant since the geometry of the single and double set modes is virtually identical. This distinction is included only because both forms appear in the literature.
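The way low and high frequencies are distributed over a noncentered index set can be made concrete with a few lines of code. This is a sketch under stated assumptions: the function names are hypothetical, $N$ is taken to be even, and the boundary modes with $|k| = N/4$ are assigned to the high-frequency group, a choice the text does not dictate.

```python
def centered_index(k, N):
    """Map a noncentered index k in 0 : N-1 to its centered equivalent in -N/2+1 : N/2 (N even)."""
    return k if k <= N // 2 else k - N

def is_low_frequency(k, N):
    """True when the mode with noncentered index k has |centered index| < N/4."""
    return abs(centered_index(k, N)) < N / 4

N = 16
low = [k for k in range(N) if is_low_frequency(k, N)]
high = [k for k in range(N) if not is_low_frequency(k, N)]
print(low)   # [0, 1, 2, 3, 13, 14, 15]
print(high)  # [4, 5, 6, 7, 8, 9, 10, 11, 12]
```

The low-frequency indices sit at both ends of the noncentered set and the high-frequency indices in the middle, which is the layout described for the noncentered case.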

The various combinations of these three binary choices are shown in Table 3.1, and account for most of the commonly occurring DFTs. The DFT used primarily in this book is the single/even/centered case in which $N$ is assumed to be even. The geometry of the modes of the various DFTs is intriguing and may be a factor in selecting a DFT. With an even number of points, the DFT always includes the highest frequency mode resolvable on the grid, namely $\cos(\pi n)$ with period $2\Delta x$. As elucidated in problems 32 and 33, a DFT on an odd number of points does not include the $\cos(\pi n)$ mode; its highest frequency modes will have a period slightly greater than $2\Delta x$ (typically something like $2(1 + 1/N)\Delta x$).

We will offer two examples of how Table 3.1 can be used to construct DFTs. The first is quite straightforward; the second presents some unexpected subtleties.

Example: Double/even/noncentered DFT. Assume that a function $f$ is sampled at $2N$ points of the interval $[0, A]$. Since $\Delta x = A/(2N)$, those grid points are given by $x_n = nA/(2N)$, where $n = 0 : 2N-1$. Notice that the resulting DFT is defined on an even number of grid points. Denoting the samples of $f$ by $f_n = f(x_n)$, the forward transform is given by

\[ F_k = \frac{1}{2N}\sum_{n=0}^{2N-1} f_n\,e^{-i\pi nk/N} \]

for $k = 0 : 2N-1$. Orthogonality relations can be established on this set of grid points, and the inverse DFT is easily found to be

\[ f_n = \sum_{k=0}^{2N-1} F_k\,e^{i\pi nk/N} \]

for $n = 0 : 2N-1$. A slightly modified set of reciprocity relations also holds for this DFT. Since the length of the spatial domain is $A$, it follows that $\Delta\omega = 1/A$; therefore, $\Delta x\,\Delta\omega = 1/(2N)$. The length of the frequency domain is $\Omega = 2N/A$, which leads to the second reciprocity relation, $A\Omega = 2N$. One would discover that this alternate DFT changes very few, if any, of the DFT properties.
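As a quick check of this example, the sketch below builds the double/even/noncentered pair directly from the sums and evaluates the two reciprocity relations. The function names, the values $N = 4$ and $A = 2$, and the smooth periodic test function are illustrative assumptions, chosen so that the expected coefficients are easy to read off.

```python
import cmath
import math

def dft_2n(f):
    """Forward double/even/noncentered DFT: 2N samples in, F_k for k = 0 : 2N-1 out."""
    M = len(f)  # M = 2N
    return [sum(f[n] * cmath.exp(-2j * math.pi * n * k / M) for n in range(M)) / M
            for k in range(M)]

def idft_2n(F):
    """Matching inverse: no scaling factor, positive exponent."""
    M = len(F)
    return [sum(F[k] * cmath.exp(2j * math.pi * n * k / M) for k in range(M))
            for n in range(M)]

N, A = 4, 2.0
M = 2 * N                 # number of sample points
dx, dw = A / M, 1.0 / A   # grid spacings in space and frequency
Omega = M * dw            # length of the frequency domain

x = [n * dx for n in range(M)]
# One cosine cycle and two sine cycles on [0, A]; A-periodic, so no endpoint averaging is needed.
f = [math.cos(2 * math.pi * xn / A) + 0.5 * math.sin(4 * math.pi * xn / A) for xn in x]

F = dft_2n(f)
f_back = idft_2n(F)
roundtrip_error = max(abs(a - b) for a, b in zip(f, f_back))

print(dx * dw, 1 / (2 * N))   # first reciprocity relation
print(A * Omega, 2 * N)       # second reciprocity relation
```

With this input the cosine contributes $1/2$ at $k = 1$ and $k = 2N-1$, and the sine contributes $\mp i/4$ at $k = 2$ and $k = 2N-2$; all other coefficients vanish to rounding error.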

Example: Single/odd/centered DFT. Now assume that the function $f$ is sampled at $N+1$ equally spaced points of the interval $[-A/2, A/2]$ (including the origin), where $N$ itself is even. This means that the grid points for the DFT are given by $x_n = nA/(N+1)$, where $n = -N/2 : N/2$. Letting $f_n = f(x_n)$, the resulting DFT is given by

\[ F_k = \frac{1}{N+1}\sum_{n=-N/2}^{N/2} f_n\,\omega_{N+1}^{-nk} \]

for $k = -N/2 : N/2$. It can be shown (problem 34) that the kernel $\omega_{N+1}^{nk}$ satisfies an orthogonality property on the indices $n = -N/2 : N/2$, and an inverse DFT can be defined in the expected manner. However, there are some curious properties of this DFT (and the double/odd/centered DFT) that are not shared by the other forms. Notice that the DFT points $x_n = nA/(N+1)$, while uniformly distributed on the interval $[-A/2, A/2]$, do not include either endpoint. However, the zeros of the highest frequency sine mode occur at the points $\xi_n = nA/N$ for $n = -N/2 : N/2$, which do include the endpoints $\pm A/2$. There is a mismatch between these two sets of points (as illustrated in Figure 3.2) which does not occur in the other DFTs of Table 3.1. The discrepancy between the $x_n$'s and the $\xi_n$'s decreases as $N$ increases. This quirk does not reduce the legitimacy of the odd/centered forms as full-fledged DFTs, but they should carry the warning that they sample the input differently than the other DFTs. There is still the question of how well the odd/centered DFTs approximate Fourier coefficients and Fourier integrals. While this is the subject of the next chapter, we should state here for the sake of completeness that these forms of the DFT appear to have the same error properties as the other DFTs. To the best of our knowledge, this question has not been fully addressed or resolved in the literature. One could verify that the odd/centered forms do possess the usual DFT properties that will be discussed in this chapter.

TABLE 3.1. Various forms of the DFT.

Type                       Comments                       Index sets n, k           Mode                       Highest frequency indices
single/even/centered       N even, N points               -N/2+1 : N/2              $e^{i2\pi nk/N}$           N/2
single/even/centered       N odd, N-1 points              -(N-1)/2+1 : (N-1)/2      $e^{i2\pi nk/(N-1)}$       (N-1)/2
single/even/noncentered    N even, N points               0 : N-1                   $e^{i2\pi nk/N}$           N/2
single/odd/centered        N even, N+1 points             -N/2 : N/2                $e^{i2\pi nk/(N+1)}$       ±N/2
single/odd/centered        N odd, N points                -(N-1)/2 : (N-1)/2        $e^{i2\pi nk/N}$           ±(N-1)/2
single/odd/noncentered     N odd, N points                0 : N-1                   $e^{i2\pi nk/N}$           (N±1)/2
double/even/centered       N even or odd, 2N points       -N+1 : N                  $e^{i\pi nk/N}$            N
double/even/noncentered    N even or odd, 2N points       0 : 2N-1                  $e^{i\pi nk/N}$            N
double/odd/centered        N even or odd, 2N+1 points     -N : N                    $e^{i2\pi nk/(2N+1)}$      ±N
double/odd/noncentered     N even or odd, 2N+1 points     0 : 2N                    $e^{i2\pi nk/(2N+1)}$      N, N+1

FIG. 3.2. When a DFT is defined on a centered interval with an odd number of sample points (single/odd/centered), the sample points $x_n = nA/(N+1)$ do not coincide with the zeros of the sine modes, $\xi_n = nA/N$. The two sets of points do coincide in the other forms of the DFT shown in Table 3.1. The figure shows the mismatch for a nine-point DFT ($N = 8$), with the $x_n$'s and the $\xi_n$'s marked by different symbols.

To conclude the discussion of the single/odd/centered DFT, we note that it carries its own reciprocity relations. Since the sampling interval has length $A$, the grid spacing in the frequency domain is $\Delta\omega = 1/A$, and hence $\Delta x\,\Delta\omega = 1/(N+1)$. The DFT is defined on $N+1$ points, so the extent of the frequency grid is $\Omega = (N+1)\Delta\omega = (N+1)/A$. This implies the second reciprocity relation, $A\Omega = N+1$. It should be noted that the frequency grid points are given by $\omega_k = k/A$. This says that the highest frequencies,

\[ \omega_{\pm N/2} = \pm\frac{N}{2A}, \]

do not (quite) coincide with either endpoint of the frequency domain, which have the values $\pm\Omega/2$. Thus there is a similar anomaly in the grid points of the frequency domain which should not present accuracy problems as long as one realizes precisely which frequencies are represented by the DFT.
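The grid quantities for the single/odd/centered form are simple enough to tabulate by hand, but a short script makes the two mismatches easy to see. This is only an illustrative sketch; $N = 8$ and $A = 1$ are assumed values matching the nine-point case of Figure 3.2.

```python
N, A = 8, 1.0          # nine-point DFT on [-A/2, A/2]
M = N + 1              # number of sample points
dx = A / M             # spatial grid spacing
dw = 1.0 / A           # frequency grid spacing
Omega = M * dw         # extent of the frequency grid

x = [n * A / M for n in range(-N // 2, N // 2 + 1)]    # sample points x_n
xi = [n * A / N for n in range(-N // 2, N // 2 + 1)]   # zeros of the highest sine mode
mismatch = [abs(a - b) for a, b in zip(x, xi)]

highest = (N // 2) * dw            # highest frequency on the grid
print(dx * dw, A * Omega)          # 1/(N+1) and N+1
print(highest, Omega / 2)          # 4.0 versus 4.5
print(max(mismatch))               # largest gap, at the ends of the interval
```

The largest spatial gap is $A/(2(N+1))$, which shrinks as $N$ grows, and the highest represented frequency falls short of $\Omega/2$ by $\Delta\omega/2$.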

The question of different forms of the DFT arises in a critical way when it comes to using software. Care and ingenuity must be used to ensure that the input is in the proper form and the output is interpreted correctly. We proceed by example and list several popular software packages that use different forms of the DFT; inclusion does not represent author endorsement!

1. Mathematica [165] uses the $N$-point DFT

for $k = 1 : N$, with the single proviso that "the zero frequency term appears at position 1 in the resulting list." This is essentially the single/even/noncentered or single/odd/noncentered cases of Table 3.1, except that the input and output sequences are defined for $n, k = 1 : N$.

2. The IMSL (International Mathematical and Statistical Libraries) routine FFTCC uses the following DFT:

for $k = 0 : N-1$. This is essentially the single/even/noncentered or single/odd/noncentered form of Table 3.1, depending on whether $N$ is even or odd. Again, the input and output sequences must be defined with positive (nonzero) indices.

3. The package Matlab [97] uses the DFT definition

for $k = 0 : N-1$, in its FFT routine, which agrees with the IMSL definition.

4. A widely used mainframe DFT software package FFTPACK [140] uses the definition

for $k = 1 : N$. Except for scaling factors, and the appearance of a minus sign in the exponential, this definition matches that of Mathematica.

5. The mathematical environment MAPLE [31] computes the DFT in the form

for $k = 0 : N-1$, which has the effect of using a different index set for the input and output.

6. MATHCAD [96] offers both real and complex DFTs. The real version has the form

where $f_n$ is a real sequence and $F_k$ is complex.

This small software survey confirms the assertion made earlier that DFTs appear in all sorts of costumes. Anyone who works regularly with the DFT will eventually encounter it in more than one form. We hope this section has helped prepare the reader for that eventuality.
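Moving between conventions is mostly bookkeeping. The sketch below shows one way to obtain the centered, $1/N$-scaled transform from a routine that works on indices $0 : N-1$ with no scaling and a negative exponent. The routine `library_dft` is a hypothetical stand-in written here for illustration; its convention is an assumption and should be checked against the documentation of whatever package is actually used.

```python
import cmath

def library_dft(g):
    """Hypothetical stand-in for a package routine (assumed convention):
    G_k = sum_{n=0}^{N-1} g_n exp(-i 2 pi n k / N), k = 0 : N-1, no scaling."""
    N = len(g)
    return [sum(g[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def book_dft_via_library(f_centered):
    """f_centered holds f_n for n = -N/2+1 : N/2 (N even); returns F_k for the same k range."""
    N = len(f_centered)
    half = N // 2
    # Wrap the centered samples onto n = 0 : N-1 using the periodicity f_{n-N} = f_n.
    g = [f_centered[(n if n <= half else n - N) + half - 1] for n in range(N)]
    G = library_dft(g)
    # Put the 1/N factor on the forward transform and read the index k modulo N.
    return [G[k % N] / N for k in range(-half + 1, half + 1)]

def book_dft_direct(f_centered):
    """Direct evaluation of the centered, 1/N-scaled definition, for comparison."""
    N = len(f_centered)
    half = N // 2
    idx = range(-half + 1, half + 1)
    return [sum(f_centered[n + half - 1] * cmath.exp(-2j * cmath.pi * n * k / N)
                for n in idx) / N
            for k in idx]

f = [0.5, -1.0, 2.0, 4.0, 1.0, 0.0, -3.0, 2.5]   # f_n for n = -3 : 4 (arbitrary data)
diff = max(abs(a - b) for a, b in zip(book_dft_via_library(f), book_dft_direct(f)))
print(diff)  # rounding-error level
```

Only three adjustments are needed here: reorder the input, rescale the output, and reinterpret the output index. A package with a different sign or scaling convention would need a different wrapper.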

3.2. Basic Properties of the DFT

At this stage, the DFT is like a town that we know by name, but have never visited. We have a definition of the DFT, derived in several ways, and we have discussed some alternate forms in which the DFT may appear. However, we still know very little about it. Now is the time to become acquainted with the DFT by exploring its properties. As with its cousins, the Fourier series and the Fourier transform, the DFT is useful precisely because it possesses special properties that enable it to solve many problems easily. Such simplification often occurs because a problem specified originally in one domain (a spatial or time domain) can be reformulated in a simpler form in the frequency domain. The bridge between these two domains is the DFT, and the properties of the DFT tell us how a given problem is modified when we pass from one domain to another.

Before launching this exploration, one cover statement might be made. In the previous section we observed that the DFT may take a bewildering variety of (ultimately equivalent) forms. The form of the DFT pair that will be used throughout this chapter is

\[ F_k = \mathcal{D}\{f_n\}_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk} \qquad (3.1) \]

for $k = -N/2+1 : N/2$, which is the forward transform, and

\[ f_n = \mathcal{D}^{-1}\{F_k\}_n = \sum_{k=-N/2+1}^{N/2} F_k\,\omega_N^{nk} \qquad (3.2) \]

for $n = -N/2+1 : N/2$, which is the inverse transform. However, every property that is discussed inevitably holds for any legitimate form of the DFT that one chooses to use. With that sweeping unproven statement underlying our thoughts, let's discuss properties of the DFT.

Periodicity

A natural point of departure is a property that we have already mentioned and used. The complex sequences $f_n$ and $F_k$ defined by the $N$-point DFT pair (3.1) and (3.2) have the property that they are $N$-periodic, which means that

\[ f_{n+N} = f_n \quad\text{and}\quad F_{k+N} = F_k \quad\text{for all integers } n \text{ and } k. \]

This property follows immediately from the fact that

\[ \omega_N^{(n+N)k} = \omega_N^{n(k+N)} = \omega_N^{nk} \quad\text{for all integers } n \text{ and } k, \]

which holds because $\omega_N^{N} = e^{i2\pi} = 1$.

Among the ramifications of this property is the fact that by extending either sequence $f_n$ or $F_k$ periodically beyond the set of indices $n, k = -N/2+1 : N/2$, the DFT can be defined on any two sets of $N$ consecutive integers. Another crucial consequence is that in sampling a function for input to the DFT, the sequence of samples must be periodic and satisfy $f_{-N/2} = f_{N/2}$.

Linearity

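One of the fundamental properties of the DFT is linearity: the DFT of a linear combination of input sequences is the same as a linear combination of their DFTs. Linearity enables us to separate signals into various components (the basis for spectral analysis) and to keep those of interest while discarding those of no interest (the principle underlying filtering theory). Linearity can be expressed in the following way. If $f_n$ and $g_n$ are two complex-valued sequences of length $N$, and $\alpha$ and $\beta$ are complex numbers, then

\[ \mathcal{D}\{\alpha f_n + \beta g_n\}_k = \alpha\,\mathcal{D}\{f_n\}_k + \beta\,\mathcal{D}\{g_n\}_k. \]

This follows from the definition (3.1), since the sum of a linear combination splits term by term:

\[ \frac{1}{N}\sum_{n=-N/2+1}^{N/2} (\alpha f_n + \beta g_n)\,\omega_N^{-nk} = \alpha\,\frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk} + \beta\,\frac{1}{N}\sum_{n=-N/2+1}^{N/2} g_n\,\omega_N^{-nk}. \]

Precisely the same argument can be used to show that the inverse DFT is also a linear operator.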

Shift and Modulation

Two closely related properties that have important implications are the shift and modulation properties. The shift property tells us the effect of taking the DFT of a sequence that has been shifted (or translated). A brief calculation using the DFT definition (3.1) can be used directly to derive the shift property (problem 43). Here is a slightly different approach. Consider the sequence $f_n$ that has been shifted $j$ units to the right. Using the IDFT, it can be written

\[ f_{n-j} = \sum_{k=-N/2+1}^{N/2} F_k\,\omega_N^{(n-j)k} = \sum_{k=-N/2+1}^{N/2} \left(F_k\,\omega_N^{-jk}\right)\omega_N^{nk}. \]

From this last statement it follows that

\[ \mathcal{D}\{f_{n-j}\}_k = \omega_N^{-jk}\,F_k. \]

In words, transforming a sequence that has been shifted $j$ units to the right has the effect of rotating the DFT coefficients of the original sequence in the complex plane; in fact, the original coefficient $F_k$ is rotated by an angle $-2\pi jk/N$. The magnitude of the DFT coefficients remains unchanged under these rotations. This effect is illustrated in Figure 3.3. A popular special case of this property is that in which the original sequence is shifted by half of a period ($N/2$ units). The resulting sequence of DFT coefficients has the property that

\[ \mathcal{D}\{f_{n-N/2}\}_k = (-1)^k F_k. \]

This property follows from the following argument:

\[ \mathcal{D}\{f_{n-N/2}\}_k = \omega_N^{-(N/2)k}\,F_k = e^{-i\pi k}\,F_k = (-1)^k F_k, \]

where $F_k$ is the DFT of the unshifted sequence.

The modulation or frequency shift property gives the effect of modulating the input sequence, that is, multiplying the elements of the input sequence $f_n$ by $\omega_N^{nj}$, where $j$ is a fixed integer. In a somewhat symmetrical and predictable way, it results in a DFT sequence that is shifted relative to the DFT of the unmodulated sequence $f_n$. A brief argument demonstrates the effect of modulation:

\[ \mathcal{D}\{f_n\,\omega_N^{nj}\}_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{nj}\,\omega_N^{-nk} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-n(k-j)} = F_{k-j}. \]

The property says that if an input sequence is modulated by modes of the form $\cos(2\pi nj/N)$ or $\sin(2\pi nj/N)$ with a frequency of $j$ cycles on the domain, the resulting DFT is shifted by $j$ units. This property can be visualized best when $f_n$ is real-valued and the modulation is done by a real mode (problem 61).
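The shift and modulation properties are easy to confirm numerically. The sketch below is illustrative only: it stores one period on the indices $0 : N-1$ and wraps indices modulo $N$, which by periodicity is equivalent to the centered index set used in this chapter. The helper name `dft`, the test data, and the values $N = 8$, $j = 2$ are assumptions.

```python
import cmath

def dft(f):
    """1/N-scaled forward DFT of one period f[0..N-1]; output index k is read modulo N."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

N, j = 8, 2
w = cmath.exp(2j * cmath.pi / N)                 # omega_N
f = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]   # arbitrary real test data
F = dft(f)

# Shift: g_n = f_{n-j}  ->  G_k = omega_N^{-jk} F_k (a rotation; magnitudes unchanged).
G = dft([f[(n - j) % N] for n in range(N)])
shift_error = max(abs(G[k] - w ** (-j * k) * F[k]) for k in range(N))
magnitude_error = max(abs(abs(G[k]) - abs(F[k])) for k in range(N))

# Half-period shift: D{f_{n-N/2}}_k = (-1)^k F_k.
H = dft([f[(n - N // 2) % N] for n in range(N)])
half_shift_error = max(abs(H[k] - (-1) ** k * F[k]) for k in range(N))

# Modulation: m_n = f_n omega_N^{nj}  ->  M_k = F_{k-j}.
Mk = dft([f[n] * w ** (n * j) for n in range(N)])
modulation_error = max(abs(Mk[k] - F[(k - j) % N]) for k in range(N))

print(shift_error, magnitude_error, half_shift_error, modulation_error)
```

All four printed values should be at rounding-error level.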

Hermitian Symmetry

The DFT and the IDFT are similar in form, as shown by their respective definitions, (3.1) and (3.2). Indeed, the relationship is sufficiently close that either transform can be used to compute the other, with a simple alteration to the input data. For example, the inverse transform $f_n = \mathcal{D}^{-1}\{F_k\}_n$ can be computed by taking the conjugate of the forward transform of a mildly modified version of the sequence $F_k$. A similar maneuver can be used to compute $\mathcal{D}\{f_n\}_k$. The two properties that capture the similarity between the DFT and the IDFT are usually called Hermitian$^1$ symmetry properties, and are given by

\[ \mathcal{D}^{-1}\{F_k\}_n = N\left(\mathcal{D}\{F_k^*\}_n\right)^* \quad\text{and}\quad \mathcal{D}\{f_n\}_k = \frac{1}{N}\left(\mathcal{D}^{-1}\{f_n^*\}_k\right)^*, \]

where $*$ indicates complex conjugation.

These relations are easy to establish using the definitions of the DFT and its inverse, which is best left as an exercise for the reader (see problem 42). In principle, the Hermitian symmetry properties suggest that only one algorithm is needed to compute both the forward and inverse DFTs. In practice, however, the FFT (discussed in Chapter 10) is used to compute DFTs, and literally hundreds of specialized FFT algorithms exist, each tuned to operate on a specific form of the input.
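The idea that one routine can serve for both directions can be sketched in a few lines. The code below assumes the $1/N$ scaling on the forward transform, as in (3.1) and (3.2), and stores one period on indices $0 : N-1$ for convenience; the function names and test data are hypothetical.

```python
import cmath

def dft(f):
    """Forward DFT with 1/N scaling and negative exponent."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

def idft(F):
    """Inverse DFT with no scaling and positive exponent."""
    N = len(F)
    return [sum(F[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N))
            for n in range(N)]

def idft_via_dft(F):
    """Inverse transform using only the forward routine: conjugate, transform, conjugate, scale by N."""
    N = len(F)
    return [N * c.conjugate() for c in dft([Fk.conjugate() for Fk in F])]

def dft_via_idft(f):
    """Forward transform using only the inverse routine: conjugate, invert, conjugate, divide by N."""
    N = len(f)
    return [c.conjugate() / N for c in idft([fn.conjugate() for fn in f])]

f = [1 + 2j, -0.5j, 3.0 + 0j, 2 - 1j, 0j, -1 + 1j, 0.25 + 0j, 4j]  # arbitrary complex data
F = dft(f)
inverse_error = max(abs(a - b) for a, b in zip(idft_via_dft(F), f))
forward_error = max(abs(a - b) for a, b in zip(dft_via_idft(f), F))
print(inverse_error, forward_error)  # both at rounding-error level
```

The factor of $N$ appears only because of where the scaling is placed; with a different scaling convention the constants would change but the conjugation trick would not.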

$^1$ Mathematically gifted from childhood, CHARLES HERMITE (1822-1901) did not receive a professorship at the Sorbonne until the age of 54. He made fundamental contributions to number theory and the theory of functions (particularly elliptic and theta functions).

FIG. 3.3. According to the DFT shift property, if an input sequence $f_n$ of length $N = 8$ (upper left) with DFT coefficients $F_k$ (upper right) is shifted to the right by two units to produce the sequence $f_{n-2}$ (lower left), the resulting DFT coefficients $F'_k$ (lower right) are rotated in the complex plane through angles of $-\pi k/2$ where $k = -3 : 4$. Note that the magnitude of individual DFT coefficients remains unchanged under the rotation. Note also that since both input sequences are real, both DFT sequences are conjugate symmetric.

DFT of a Reversed Sequence

Suppose we are given the input sequence $f_n$, where as usual $n = -N/2+1 : N/2$. Consider the input sequence $f_{-n}$ formed by reversing the terms of $f_n$. The new sequence is

\[ \{f_{N/2-1},\, f_{N/2-2},\, \ldots,\, f_{1},\, f_{0},\, f_{-1},\, \ldots,\, f_{-N/2}\}. \]

What can we say about the DFT of the altered sequence $\mathcal{D}\{f_{-n}\}_k$? Substituting the new sequence into the DFT definition (3.1), we obtain

\[ \mathcal{D}\{f_{-n}\}_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_{-n}\,\omega_N^{-nk} \]

for $k = -N/2+1 : N/2$. Now letting $p = -n$ the sum becomes

\[ \mathcal{D}\{f_{-n}\}_k = \frac{1}{N}\sum_{p=-N/2}^{N/2-1} f_{p}\,\omega_N^{pk} \]

for $k = -N/2+1 : N/2$. If we now let $k = -l$, we may write

\[ \mathcal{D}\{f_{-n}\}_{-l} = \frac{1}{N}\sum_{p=-N/2}^{N/2-1} f_{p}\,\omega_N^{-pl} \qquad (3.6) \]

where $l = -N/2 : N/2-1$. Finally, we invoke the periodicity of the DFT and recall that $f_{-N/2} = f_{N/2}$, which allows us to run the indices $p$ and $l$ from $-N/2+1$ up to $N/2$. Hence, (3.6) is the ordinary DFT of $f_p = f_{-n}$ with frequency index $l = -k$. We may therefore conclude that

\[ \mathcal{D}\{f_{-n}\}_k = F_{-k}, \]

where $F_k$ is the DFT of the original sequence. The fact that the DFT of a reversed input sequence is a reversed DFT is used in the derivations of numerous other symmetry properties.

We now take up a discussion of properties that emerge when the DFT is applied to input sequences that exhibit certain symmetries. These symmetries can take many forms, but they are all extremely important, because they ultimately lead to computational savings in evaluating the DFT. The resulting algorithms, known as symmetric DFTs, are the subject of Chapter 4. However, the actual symmetry properties can be discussed right now.

DFT of a Real Sequence

Although the term "symmetry" may seem somewhat inappropriate, by far the most prevalent symmetry that arises in practice is that in which the input sequence is real-valued. Therefore, suppose that the input sequence $f_n$ consists of real numbers, which means that $f_n^* = f_n$. Then we find that

\[ F_k^* = \left(\frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk}\right)^{*} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n^*\,\omega_N^{nk} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-n(-k)} = F_{-k}. \]

We have used the fact that $f_n = f_n^*$ and $(\omega_N^{p})^* = \omega_N^{-p}$ in reaching this conclusion. Any complex-valued sequence possessing the property $F_k^* = F_{-k}$ is said to be conjugate symmetric or conjugate even. Therefore, the DFT of a real sequence is conjugate symmetric. Notice that if we write $F_k$ in the form $F_k = \mathrm{Re}\{F_k\} + i\,\mathrm{Im}\{F_k\}$, then conjugate symmetry implies that the real part of the transform is an even sequence while the imaginary part of the transform is an odd sequence:

\[ \mathrm{Re}\{F_k\} = \mathrm{Re}\{F_{-k}\} \quad\text{and}\quad \mathrm{Im}\{F_k\} = -\mathrm{Im}\{F_{-k}\}. \qquad (3.8) \]

The fact that the DFT of a real sequence is conjugate symmetric also implies that the IDFT of a conjugate symmetric sequence is real (problem 46).

There are some important implications of this property, especially since real-valued input sequences are extremely common in applications. The first consequence is economy in storage. Clearly, a real-valued input sequence of length $N$ can be stored in $N$ real storage locations. The output sequence $F_k$ is complex-valued and might appear to require $2N$ real storage locations. However, since $F_k^* = F_{-k}$, it follows that $F_0$ and $F_{N/2}$ are real. Furthermore, we observe that the real parts of $F_k$ are needed only for $k = 0 : N/2$, and the imaginary parts are needed only for $k = 1 : N/2-1$, since the remaining values can be obtained from (3.8). A quick count shows that there are precisely $N$ independent real quantities in the output sequence $F_k$. If $N$ is odd, then the relations (3.8) hold for $k = 0 : (N-1)/2$, so knowledge of $F_k$ for these values of $k$ is sufficient to determine the entire sequence. This property (and analogous properties for other symmetric input sequences) can be used to achieve savings in both computation and storage in the DFT.
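The storage count can be made concrete. The sketch below packs the DFT of a real sequence of even length $N$ into exactly $N$ real numbers and then rebuilds the full complex sequence from them. The names `pack_real_dft` and `unpack_real_dft` and the packing order are illustrative choices, and indices are stored on $0 : N-1$ so that $F_{-k}$ lives at position $N-k$.

```python
import cmath

def dft(f):
    """1/N-scaled forward DFT of one period f[0..N-1]; index k is read modulo N."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

def pack_real_dft(F):
    """Keep Re F_k for k = 0 : N/2 and Im F_k for k = 1 : N/2-1 (N even): N reals in all."""
    N = len(F)
    return [F[k].real for k in range(N // 2 + 1)] + [F[k].imag for k in range(1, N // 2)]

def unpack_real_dft(packed):
    """Rebuild all N complex coefficients from the N packed reals."""
    N = len(packed)
    half = N // 2
    re, im = packed[:half + 1], packed[half + 1:]
    F = [0j] * N
    F[0] = complex(re[0], 0.0)          # F_0 is real
    F[half] = complex(re[half], 0.0)    # F_{N/2} is real
    for k in range(1, half):
        F[k] = complex(re[k], im[k - 1])
        F[N - k] = F[k].conjugate()     # F_{-k} = F_k^*, with -k stored at N-k
    return F

f = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]   # arbitrary real test data
F = dft(f)
N = len(f)
symmetry_error = max(abs(F[k].conjugate() - F[(-k) % N]) for k in range(N))
packed = pack_real_dft(F)
rebuild_error = max(abs(a - b) for a, b in zip(unpack_real_dft(packed), F))
print(len(packed), symmetry_error, rebuild_error)
```

The packed list has the same length as the real input, which is the storage economy described above.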

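DFT of a Conjugate Symmetric Sequence

Given the previous property, it should be no surprise that the DFT of a conjugate symmetric sequence is real. In fact, if $f_n^* = f_{-n}$, then

\[ F_k^* = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n^*\,\omega_N^{nk} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_{-n}\,\omega_N^{nk} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_{n}\,\omega_N^{-nk} = F_k, \]

where we have used the periodicity of $f_n$ to write the last sum. Since $F_k = F_k^*$, the DFT is real-valued.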

DFTs of Even or Odd Sequences

We now consider the transform of complex sequences with either even or odd symmetry. By even symmetry, we mean a sequence in which $f_{-n} = f_n$. Notice that the reversal of an even sequence is the original sequence. Having already established that the DFT of a reversed input sequence $f_{-n}$ is the reversed output sequence $F_{-k}$, it follows that the DFT of an even sequence is also even. Therefore, we conclude that $F_{-k} = F_k$. This property may also be shown directly using the definition of the DFT (3.1) (problem 44).

We define an odd sequence as one in which $f_{-n} = -f_n$. By the reversed sequence property and the linearity property we may conclude that the DFT of an odd sequence satisfies $F_{-k} = -F_k$; thus the DFT of an odd sequence is itself an odd sequence.

DFTs of Real Even and Real Odd Sequences

The majority of DFT applications involve real-valued input sequences, and these sequences often possess other symmetries. For example, a sequence that is both real and even, so that $f_n = f_{-n}$ and $f_n = f_n^*$, must have a DFT that is conjugate symmetric (because $f_n$ is real) and even (because $f_n$ is even). These two properties imply that $\mathrm{Re}\{F_k\}$ is an even sequence and $\mathrm{Im}\{F_k\} = 0$. Thus we conclude that the DFT of a real, even sequence is also real and even. We can analyze the DFT of an input sequence that is both real and odd in a similar way. Recall that the DFT of an odd sequence is odd ($F_{-k} = -F_k$), and the DFT of a real sequence is conjugate symmetric ($F_{-k} = F_k^*$). These facts together imply that $\mathrm{Re}\{F_k\} = 0$ and $\mathrm{Im}\{F_k\}$ is an odd sequence; or, stated in words, the DFT of a real, odd sequence is a purely imaginary, odd sequence. These properties can also be shown directly from the definition of the DFT (problem 44).
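A small numerical experiment illustrates both statements. In this sketch one period is stored on indices $0 : N-1$ with $f_{-n}$ read as `f[(-n) % N]`; the helper names and the two test sequences are illustrative assumptions. Note that a real odd sequence of even length must have $f_0 = f_{N/2} = 0$.

```python
import cmath

def dft(f):
    """1/N-scaled forward DFT of one period f[0..N-1]; index k is read modulo N."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

def is_even(seq, tol=1e-12):
    """True when seq_{-k} = seq_k for every k (indices modulo N)."""
    N = len(seq)
    return all(abs(seq[k] - seq[(-k) % N]) < tol for k in range(N))

def is_odd(seq, tol=1e-12):
    """True when seq_{-k} = -seq_k for every k (indices modulo N)."""
    N = len(seq)
    return all(abs(seq[k] + seq[(-k) % N]) < tol for k in range(N))

real_even = [4.0, 1.0, 2.0, 3.0, 5.0, 3.0, 2.0, 1.0]
real_odd = [0.0, 1.0, 2.0, 3.0, 0.0, -3.0, -2.0, -1.0]

E = dft(real_even)
O = dft(real_odd)
max_imag_E = max(abs(c.imag) for c in E)   # should vanish: DFT of real even input is real
max_real_O = max(abs(c.real) for c in O)   # should vanish: DFT of real odd input is imaginary
print(is_even(E), max_imag_E)
print(is_odd(O), max_real_O)
```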

There are many other symmetries that could be considered, and almost any symmetry will give a new property. Those we have outlined above are the most common and useful. Among the symmetries omitted (until Chapter 4) are those called quarter-wave symmetries, which are extremely important in the numerical solution of partial differential equations. The symmetry properties which were included in this discussion are summarized graphically in Figures 3.4 and 3.5.

Before leaving the topic of symmetries entirely, we mention the fact that, while some sequences exhibit symmetries and others do not, any sequence can be decomposed into a pair of symmetric sequences. We briefly outline how this is done, and relate the DFTs of the symmetric component sequences to the DFT of the original sequence.

Waveform Decomposition

An arbitrary sequence fn can always be decomposed into the sum of two sequences,one of which is even and the other odd. This is accomplished by defining

This property is called the waveform decomposition property and may be verifiedcarefully in problem 45. Notice the consistency in these relations:

At this point we turn our attention from the symmetry properties of the DFT,and examine what could be termed operational properties. That is, how does theDFT behave under the action of certain operations? While many operations could beof interest, we will restrict our attention to convolution and correlation, two of themost useful such creatures.

Discrete (Cyclic) Convolution

The discrete convolution theorem is among the most important properties of the DFT.It underlies essentially all of the Fourier transform-based signal processing done today,and by itself accounts for much of the utility of the DFT. We begin by defining anddiscussing the notion of discrete convolution. Given two iV-periodic sequences fn and

and noting that

But since £>{/_n}fc = F-k (reversal property), we may use the linearity of the DFTto show that


FIG. 3.4. Various symmetries of the DFT are summarized in this figure. In the left twocolumns, the real and imaginary parts of the input are shown, while the right two columnsshow the corresponding real and imaginary parts of the DFT. The symmetries shown are:arbitrary input (top row), reversed arbitrary input, i.e., reversed input sequence of top row(second row), real input (third row), and conjugate symmetric input (bottom row).


FIG. 3.5. More of the various symmetries of the DFT are summarized in this figure. In the left two columns, the real and imaginary parts of the input are shown, while the right two columns show the corresponding real and imaginary parts of the DFT. The symmetries shown are: even input (top row), odd input (second row), real even input (third row), and real odd input (bottom row).


$g_n$ defined for the indices $n = -N/2 + 1 : N/2$, their discrete (cyclic) convolution, denoted $f_n * g_n$, is another sequence $h_n$ defined by

$$h_n = \sum_{j=-N/2+1}^{N/2} f_j\, g_{n-j}$$

for $n = -N/2 + 1 : N/2$. Notice that $h_n$ is also an $N$-periodic sequence.

It is never enough merely to define the convolution operator; some interpretation is always needed. Convolution may be viewed in several different ways. Because of its importance, we will highlight some of its properties and try to instill a sense of its significance. We note first that the order of the sequences in convolution is immaterial, since if we let $p = n - j$ and invoke the periodicity of $f_n$ and $g_n$ we find that

$$f_n * g_n = \sum_{j=-N/2+1}^{N/2} f_j\, g_{n-j} = \sum_{p=-N/2+1}^{N/2} f_{n-p}\, g_p = g_n * f_n.$$

In addition, a scalar multiple can be "passed through" the convolution; that is,

$$a\,(f_n * g_n) = (a f_n) * g_n = f_n * (a g_n).$$

The convolution operator is developed graphically in Figure 3.6 and may be described in the following way. The $n$th term of the convolution $h_n$ results from reversing the order of the convolving sequence $g_j$ (forming the sequence $g_{-j}$), shifting this sequence to the right an amount $n$ (forming the sequence $g_{n-j}$), and then forming the scalar product of that sequence with the sequence $f_j$. This process is repeated for each $n = -N/2 + 1 : N/2$.
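The reverse, shift, and scalar-product description can be tried directly on a short sequence. The following is only a minimal sketch, assuming the sequences are stored as Python lists in which position `p` holds the entry with index `p` modulo `N` (by periodicity this is equivalent to the symmetric index range used in the text); the function name `cyclic_convolution` and the sample data are illustrative choices, not part of the text.

```python
def cyclic_convolution(f, g):
    """Cyclic convolution h_n = sum_j f_j * g_{n-j}, with indices taken modulo N."""
    N = len(f)
    return [sum(f[j] * g[(n - j) % N] for j in range(N)) for n in range(N)]

f = [1, 2, 0, -1]
g = [3, 0, 1, 0]

h = cyclic_convolution(f, g)
h_swapped = cyclic_convolution(g, f)                        # order of the sequences is immaterial
h_scaled = cyclic_convolution([2 * x for x in f], g)        # a scalar passes through

print(h)          # entries of f * g
print(h_swapped)  # same entries, illustrating f * g = g * f
```

Because each output entry needs a full pass over the inputs, this direct evaluation costs on the order of $N^2$ operations.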

Further insight into discrete convolution might be offered by considering a sifting property of the $\delta$ sequence. Let's look at the effect of convolving an arbitrary sequence $f_n$ with the sequence $\delta(n - n_0)$, where $n_0$ is a fixed integer. (Recall that $\delta(n - n_0)$ is zero unless $n = n_0$.) A short calculation reveals that

$$f_n * \delta(n - n_0) = \sum_{j=-N/2+1}^{N/2} f_j\, \delta(n - j - n_0) = f_{n - n_0}.$$

We see that if a sequence is convolved with a $\delta$ sequence centered at the index $n_0$, the effect is to shift the sequence $n_0$ units to the right (see problem 60).
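The shift produced by a $\delta$ sequence is easy to observe numerically. This is a small sketch under the same storage assumption as before (list position `p` holds index `p` modulo `N`); the helper names `cyclic_convolution` and `delta` are hypothetical.

```python
def cyclic_convolution(f, g):
    """Cyclic convolution h_n = sum_j f_j * g_{n-j}, with indices taken modulo N."""
    N = len(f)
    return [sum(f[j] * g[(n - j) % N] for j in range(N)) for n in range(N)]

def delta(n0, N):
    """Periodic spike: 1 where n is congruent to n0 modulo N, 0 elsewhere."""
    return [1 if n % N == n0 % N else 0 for n in range(N)]

f = [5, 7, 9, 11, 13, 15]
shifted = cyclic_convolution(f, delta(2, len(f)))

print(shifted)  # f moved two places to the right, wrapping around periodically
```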

We will illustrate convolution with one of its most frequent uses. As shown in Figure 3.7, the sequence $f_n$ is obtained by sampling a superposition of cosine waves with various frequencies, including several with high frequencies. The sequence $g_n$ includes only 21 nonzero (positive) entries, centered about $g_0$, which sum to unity. Convolution of $f$ with $g$ may be thought of as a running 21-point weighted average of $f$. As seen in the figure, the resulting convolution is a smoothed version of $f_n$, in which the higher frequency components have been removed. In this light, we may view convolution of two sequences as a filtering operation in which one of the sequences is input data and the other is a filter. Having developed the view of convolution as a


FIG. 3.6. Each term of the convolution sequence $h_n$ is the scalar product of one sequence $f_n$ with a shifted version of a second sequence $g_n$ in reversed order. Each $h_n$ may be viewed as a weighted sum of the input $f_n$, the weighting given by $g_{-n}$. The top row shows, left to right, the sequences $g_n$, $f_n$, and $h_n$. The second row shows the shifted, reversed sequence $g_{2-n}$ (left), the sequence $f_n$ (middle), and the output entry $h_2$ (right), which is the scalar product of $g_{2-n}$ and $f_n$. The third and bottom rows repeat this construction for two other shifts, each showing the shifted, reversed sequence (left), the sequence $f_n$ (middle), and the resulting output entry (right).


FIG. 3.7. The convolution operator acts as a filter. In the case shown, the input $f_n$ (top) is an oscillatory sequence of length $N = 64$, while the convolving sequence $g_n$ (middle) is a 21-point weighted average operator (whose entries sum to unity). The output $h_n = f_n * g_n$ (bottom) is a smoothed version of the input $f_n$, that also has damped amplitude.

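filtering operator, we now arrive at one of the most important of the DFT properties. We will dignify it with theorem status and state the convolution theorem.

THEOREM 3.1. DISCRETE CONVOLUTION THEOREM. Let $f_n$ and $g_n$ be periodic sequences of length $N$ whose DFTs are $F_k$ and $G_k$. Then the DFT of the convolution $h_n = f_n * g_n$ is

$$\mathcal{D}\{f_n * g_n\}_k = N F_k G_k.$$

That is, the DFT of the convolution is the pointwise product of the DFTs. This is often expressed by saying that convolution in the time (or spatial) domain corresponds to multiplication in the frequency domain.

Proof: We begin by writing $f_j$ and $g_{n-j}$ as the IDFTs of their DFTs. By the modulation property we know that $g_{n-j} = \mathcal{D}^{-1}\{G_k\, \omega_N^{-jk}\}_n$. Hence we can write

$$h_n = \sum_{j} f_j\, g_{n-j} = \sum_{j} \Big( \sum_{k} F_k\, \omega_N^{jk} \Big) \Big( \sum_{l} G_l\, \omega_N^{-jl}\, \omega_N^{nl} \Big) = \sum_{k} \sum_{l} F_k G_l\, \omega_N^{nl} \sum_{j} \omega_N^{j(k-l)},$$

where all sums run over $-N/2 + 1 : N/2$, the order of summation has been changed, and terms have been regrouped. The orthogonality property may now be applied to the innermost sum, which equals $N$ when $k = l$ and vanishes otherwise, to give

$$h_n = N \sum_{k=-N/2+1}^{N/2} F_k G_k\, \omega_N^{nk}.$$

This shows that $f_n * g_n$ is the IDFT of $N F_k G_k$. Stated differently,

$$\mathcal{D}\{f_n * g_n\}_k = N F_k G_k.$$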

We are now in a position to understand one of the mightiest applications of the DFT, that of digital filtering. Assume that $f_n$ is a signal that will be filtered, and $g_n$ represents a selected filter. Essentially, we may perform filtering by computing the DFT of the input signal ($f_n \to F_k$), multiplying it term-by-term with the (precomputed) set of weights $G_k$ to form the sequence $F_k G_k$, and computing the IDFT to obtain the filtered sequence. Here is the important observation. Computing a convolution of two $N$-point sequences by its definition requires on the order of $N^2$ multiplications and additions. However, if the DFTs and IDFTs are done using an FFT (which, as shown in Chapter 10, requires approximately $N \log N$ operations), the convolution can be done with approximately $2N \log N + N$ operations ($N \log N$ operations for each of the two FFTs and $N$ operations for the pointwise products of the DFTs). Particularly for large values of $N$, the use of the FFT and the convolution theorem results in tremendous computational savings.
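The filtering recipe (transform, multiply pointwise, transform back) can be checked against the definition of convolution on a small example. The sketch below assumes the scaling used in this chapter, with the factor $1/N$ on the forward DFT, so the product of the DFTs carries the extra factor $N$. The transforms are written as naive $O(N^2)$ sums purely for illustration; in practice an FFT routine would take their place. All function names and the data are illustrative.

```python
import cmath

def dft(f):
    """Forward DFT with 1/N scaling: F_k = (1/N) * sum_n f_n * exp(-2*pi*i*n*k/N)."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

def idft(F):
    """Inverse DFT (no scaling): f_n = sum_k F_k * exp(2*pi*i*n*k/N)."""
    N = len(F)
    return [sum(F[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N))
            for n in range(N)]

def cyclic_convolution(f, g):
    """Direct evaluation of h_n = sum_j f_j * g_{n-j}, indices modulo N."""
    N = len(f)
    return [sum(f[j] * g[(n - j) % N] for j in range(N)) for n in range(N)]

def convolve_via_dft(f, g):
    """Convolution theorem: f * g is the IDFT of N * F_k * G_k."""
    N = len(f)
    F, G = dft(f), dft(g)
    return idft([N * Fk * Gk for Fk, Gk in zip(F, G)])

f = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]    # input signal
g = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]    # 3-point weighted average centered on n = 0

direct = cyclic_convolution(f, g)
via_dft = convolve_via_dft(f, g)
max_err = max(abs(a - b) for a, b in zip(direct, via_dft))
print(max_err)  # agreement to rounding error
```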

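There is one other curious perspective on convolution that should be mentioned, since it has some important practical consequences. Consider the familiar exercise of multiplying two polynomials

$$p(x) = \sum_{k=0}^{m} a_k x^k$$

and

$$q(x) = \sum_{k=0}^{n} b_k x^k.$$

The product of these two polynomials looks like

$$p(x)\,q(x) = \sum_{k=0}^{m+n} c_k x^k, \qquad\text{where}\qquad c_k = \sum_{j=0}^{k} a_j b_{k-j}$$

(with $a_j = 0$ for $j > m$ and $b_j = 0$ for $j > n$). A close inspection of the coefficients of the product polynomial reveals that the coefficient $c_k$ has the form of a convolution. We can make this more precise as follows.

We will let $N = m + n + 1$ and then extend the two sets of coefficients $\{a_k\}$ and $\{b_k\}$ with zeros in the following way:

$$\hat{a}_k = \begin{cases} a_k & \text{for } k = 0 : m, \\ 0 & \text{for all other } k = -N+1 : N, \end{cases} \qquad \hat{b}_k = \begin{cases} b_k & \text{for } k = 0 : n, \\ 0 & \text{for all other } k = -N+1 : N. \end{cases}$$

The auxiliary sequences $\hat{a}_k$ and $\hat{b}_k$ have length $2N$, and now we can write

$$c_k = \sum_{j=-N+1}^{N} \hat{a}_j\, \hat{b}_{k-j}.$$

In other words, the coefficients of the product polynomial can be obtained by taking the $2N$-point convolution of the auxiliary sequences. Notice that, of the entire convolution sequence $c_k$, only the coefficients

$$c_0, c_1, \ldots, c_{m+n}$$

are needed to form the product polynomial. This association between convolution and products of polynomials has important implications in algorithms for high precision arithmetic, as outlined in problem 50.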
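A short numerical check makes the link between polynomial products and cyclic convolution concrete. The sketch below pads both coefficient lists with zeros to length $m + n + 1$, which is an assumption of this example: it is shorter than the $2N$-point sequences of the text but already long enough that the periodic wrap-around contributes nothing. The names `cyclic_convolution` and `poly_multiply` are illustrative.

```python
def cyclic_convolution(f, g):
    """Cyclic convolution h_n = sum_j f_j * g_{n-j}, with indices taken modulo N."""
    N = len(f)
    return [sum(f[j] * g[(n - j) % N] for j in range(N)) for n in range(N)]

def poly_multiply(a, b):
    """Coefficients (lowest degree first) of the product of two polynomials."""
    m, n = len(a) - 1, len(b) - 1          # degrees of the two factors
    length = m + n + 1                      # number of coefficients in the product
    a_pad = a + [0] * (length - len(a))     # zero extension prevents wrap-around terms
    b_pad = b + [0] * (length - len(b))
    return cyclic_convolution(a_pad, b_pad)

# (1 + 2x)(3 + x + x^2) = 3 + 7x + 3x^2 + 2x^3
c = poly_multiply([1, 2], [3, 1, 1])
print(c)
```

Replacing the direct convolution by the transform-based one is what makes this approach attractive for very long coefficient lists.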

We have scarcely touched upon the subject of convolution with all of its applications and algorithmic implications. For a far more detailed account of related issues (for example, noncyclic convolution, convolution of long sequences, and FFT/convolution algorithms) the interested reader is referred to any signal processing text or to [28], [73], or [107].

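Frequency Convolution

Just as the DFT of a convolution of two sequences is the product of their DFTs, it can be shown that the DFT of the product of two sequences is the convolution of their DFTs; that is,

$$\mathcal{D}\{f_n g_n\}_k = F_k * G_k.$$

To show this, we consider the inverse transform of the convolution and write

$$\begin{aligned}
\mathcal{D}^{-1}\{F_k * G_k\}_n &= \sum_{k} \Big( \sum_{j} F_j\, G_{k-j} \Big) \omega_N^{nk} \\
&= \sum_{j} F_j \sum_{k} G_{k-j}\, \omega_N^{nk} \\
&= \sum_{j} F_j\, \omega_N^{nj}\, g_n \\
&= f_n g_n,
\end{aligned}$$

where all sums run over $-N/2 + 1 : N/2$. The modulation property has been used in the third line of this argument. The frequency convolution theorem, $\mathcal{D}\{f_n g_n\} = F_k * G_k$, now follows immediately by applying $\mathcal{D}$ to each side of this equation.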
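The frequency convolution property can be confirmed numerically in a few lines. This sketch assumes the $1/N$ scaling on the forward DFT used in this chapter, in which case no extra factor of $N$ appears; the helper names and data are illustrative.

```python
import cmath

def dft(f):
    """Forward DFT with 1/N scaling: F_k = (1/N) * sum_n f_n * exp(-2*pi*i*n*k/N)."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

def cyclic_convolution(F, G):
    """Cyclic convolution of two N-periodic sequences (real or complex)."""
    N = len(F)
    return [sum(F[j] * G[(k - j) % N] for j in range(N)) for k in range(N)]

f = [1.0, -0.5, 2.0, 0.0, 1.5, -1.0]
g = [0.5, 1.0, -2.0, 3.0, 0.0, 1.0]

lhs = dft([fn * gn for fn, gn in zip(f, g)])   # DFT of the pointwise product
rhs = cyclic_convolution(dft(f), dft(g))       # convolution of the two DFTs
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_err)  # agreement to rounding error
```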

Discrete Correlation

Closely related to the convolution of two sequences is an operation known as correlation. Correlation is (perhaps) more intuitive than convolution, because its name describes what it does: it determines how much one sequence resembles (correlates with) another. Consider two real sequences that need to be compared. Intuitively, one might multiply the sequences pointwise and sum the results. If the sequences are identical then the result will be large and positive. If they are identical except for being of opposite sign, then the result will be large and negative. If they are entirely dissimilar, then we expect that the agreement or disagreement of the signs will be random, and the resulting sum should be small in magnitude. The discrete correlation operator embodies this idea, except that it may be applied to complex, as well as real-valued sequences, and it also computes a correlation for the two sequences when one has been retarded, or lagged, relative to the other.

Here is how discrete correlation works. Let $f_n$ and $g_n$ be $N$-periodic sequences. The discrete correlation of the sequences $f_n$ and $g_n$, denoted $f_n \otimes g_n$, is an $N$-periodic sequence $h_n$ defined by

for $n = -N/2 + 1 : N/2$. With this definition in hand, we state the discrete correlation theorem, namely that

As with discrete convolution, the discrete correlation of two sequences can be done efficiently by performing the pointwise product of their DFTs, and then transforming back to the original domain with an inverse DFT. There is also a corresponding relationship for correlation of two sequences in the frequency domain. It is given by

We leave the verification of these properties to problem 47.

Parseval's Relation

A well-known and important property is Parseval's relation, which we encountered previously in connection with trigonometric interpolation. It is an important property with some physical significance and its proof (problem 48) appeals again to orthogonality properties. Parseval's relation says that a sequence $f_n$ and its $N$-point DFT $F_k$ are related by

$$\frac{1}{N} \sum_{n=-N/2+1}^{N/2} |f_n|^2 = \sum_{k=-N/2+1}^{N/2} |F_k|^2.$$

Parseval's relation has analogous versions in terms of Fourier series and Fourier transforms. In all cases, the same physical interpretation may be given. The sum of the squares of the terms of a sequence, $\sum_n |f_n|^2$, is often associated with the energy of the sequence (it is also the square of the length of the vector whose components are $f_n$). Parseval's relation says that the energy of $f_n$ may be computed using either the original sequence or the sequence of DFT coefficients. Alternatively, the energy of the input sequence equals the energy of the DFT sequence (up to the scaling factor $1/N$).
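Parseval's relation lends itself to a quick numerical check. The sketch below assumes the $1/N$ scaling on the forward DFT used in this chapter, so that the factor $1/N$ sits on the side of the input sequence; the helper name `dft` and the data are illustrative.

```python
import cmath

def dft(f):
    """Forward DFT with 1/N scaling: F_k = (1/N) * sum_n f_n * exp(-2*pi*i*n*k/N)."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

f = [1.0, -2.0, 0.5, 3.0, 0.0, -1.5, 2.5, 1.0]
F = dft(f)
N = len(f)

energy_input = sum(abs(fn) ** 2 for fn in f) / N   # (1/N) * sum |f_n|^2
energy_dft = sum(abs(Fk) ** 2 for Fk in F)         # sum |F_k|^2
print(energy_input, energy_dft)                    # the two values agree to rounding error
```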


We close this section with a summary table of the basic DFT properties that have been presented up to this point. It is more than coincidence that the properties of the DFT have continuous analogs that can be expressed in terms of Fourier series and Fourier transforms. Because of this kinship, many of the most useful techniques for Fourier transforms and Fourier series can be used when the DFT is applied to discrete problems. To emphasize these remarkable similarities, Table 3.2 shows not only the DFT properties, but the corresponding continuous properties.

The Fourier series properties assume that the functions $f$ and $g$ are defined on the interval $[-A/2, A/2]$ and have the representations

$$f(x) = \sum_{k=-\infty}^{\infty} c_k\, e^{i 2\pi k x / A}$$

and

$$g(x) = \sum_{k=-\infty}^{\infty} d_k\, e^{i 2\pi k x / A},$$

where the coefficients are given by

$$c_k = \frac{1}{A} \int_{-A/2}^{A/2} f(x)\, e^{-i 2\pi k x / A}\, dx \qquad\text{and}\qquad d_k = \frac{1}{A} \int_{-A/2}^{A/2} g(x)\, e^{-i 2\pi k x / A}\, dx.$$

The Fourier transform properties assume that $f$ and $g$ are absolutely integrable functions on the real line ($\int_{-\infty}^{\infty} |f(x)|\, dx < \infty$) whose Fourier transforms are

$$\hat{f}(\omega) = \int_{-\infty}^{\infty} f(x)\, e^{-i 2\pi \omega x}\, dx \qquad\text{and}\qquad \hat{g}(\omega) = \int_{-\infty}^{\infty} g(x)\, e^{-i 2\pi \omega x}\, dx.$$

The table refers to the convolution and correlation properties for Fourier transforms. For completeness, we state the continuous form of the convolution theorem, which bears an expected similarity to the discrete convolution theorem. Given two functions $f$ and $g$, with Fourier transforms $\hat{f}$ and $\hat{g}$, their convolution and its transform are given by

$$h(x) = f(x) * g(x) = \int_{-\infty}^{\infty} f(y)\, g(x - y)\, dy \qquad\text{and}\qquad \hat{h}(\omega) = \hat{f}(\omega)\, \hat{g}(\omega),$$

while their correlation and its transform are

3.3. Other Properties of the DFT

There are many other properties of the DFT, a few of which we will list in this section. For the most part, we will merely state the properties, with some observations about their utility. Proofs of these properties are generally relegated to the problems.

Earlier in this chapter we examined the DFT of sequences that possess certain symmetries. We now consider sequences that have no special form originally, but are then altered to produce a sequence with a special pattern. In each case the goal is to relate the DFT of the new sequence to the DFT of the unaltered sequence. It is useful to know that these altered sequences are not mentioned frivolously; many of them arise in specific applications.

TABLE 3.2. Properties of the DFT and Fourier series; (continued) properties of the DFT and Fourier transforms.

Columns: property, input property, DFT property, and the corresponding Fourier series property (first part) or Fourier transform property (continued part).

Rows: periodicity, linearity, shift, modulation, Hermitian symmetry, reversal, real, conjugate symmetric, even, odd, real even, real odd, convolution, correlation.

The input is arbitrary for the periodicity, linearity, shift, modulation, Hermitian symmetry, reversal, convolution, and correlation properties. Periodicity has no Fourier transform analog (entry "None"). FC $\{f(x)\}$ means Fourier coefficients of $f$.


Padded Sequences

It is a popular misconception that the FFT works only on sequences whose length is a power of two. While we shall see in Chapter 10 that this is most assuredly not the case, it is true that some FFT packages handle only limited values of $N$. Given a data sequence with a length that is not amenable to computation by a particular FFT package, it is not uncommon to pad the data with zeros until it has a convenient length. The padded sequence is then used as input to the DFT, often without asking about the effect of the padding. We will investigate this practice now. Let $f_n$ be a sequence defined for $n = -M/2 + 1 : M/2$ and suppose that $N > M$, where $N$ and $M$ are even. Since we are working on the symmetric interval, we pad the sequence at both ends to create the augmented sequence

$$g_n = \begin{cases} f_n & \text{for } n = -M/2 + 1 : M/2, \\ 0 & \text{for } n = -N/2 + 1 : -M/2 \text{ and } n = M/2 + 1 : N/2. \end{cases}$$

Letting $G_k$ denote the $N$-point DFT of the padded sequence $g_n$, we see that

$$G_k = \frac{1}{N} \sum_{n=-N/2+1}^{N/2} g_n\, \omega_N^{-nk} = \frac{1}{N} \sum_{n=-M/2+1}^{M/2} f_n\, \omega_N^{-nk}$$

for $k = -N/2 + 1 : N/2$. Now a sleight-of-$\omega$ is needed to simplify this last sum. Using the fact that $\omega_N = \omega_M^{M/N}$, we may continue and write

$$G_k = \frac{M}{N} \cdot \frac{1}{M} \sum_{n=-M/2+1}^{M/2} f_n\, \omega_M^{-n(kM/N)} = \frac{M}{N}\, F_{kM/N}$$

for $k = -N/2 + 1 : N/2$. We have let $F_k$ denote the $M$-point DFT of the original sequence $f_n$. In other words, we have shown that the $k$th DFT coefficient of the padded sequence is equal to $M/N$ times the $(kM/N)$th DFT coefficient of the original sequence, whenever $kM/N$ is an integer! This interpretation is more meaningful if we invert it slightly; after all, the goal is to determine the coefficients $F_k$ in terms of the computed coefficients $G_k$. So we might also write that

$$F_k = \frac{N}{M}\, G_{kN/M}$$

for $k = -M/2 + 1 : M/2$, whenever $kN/M$ is an integer.

FIG. 3.8. The top row of figures shows the real (left) and imaginary (right) parts of the DFT of a sequence of length $M = 16$. The bottom row of figures shows the real (left) and imaginary (right) parts of the DFT of that same input sequence, now padded with zeros at each end to increase its length to $N = 32$. Observe that every other coefficient of the DFT on the bottom is half of the corresponding coefficient of the original DFT on top.

A specific example helps considerably in unraveling this relationship. A common situation is that in which $N$ is chosen as an integer multiple of $M$, making $N = pM$, where $p > 1$ is an integer. The above relationship then appears as $F_k = p\, G_{pk}$, and now it is clear that the $k$th DFT coefficient of the original sequence is $p$ times the $(pk)$th coefficient of the DFT of the padded sequence, as shown in Figure 3.8. The meaning of this property can be explained physically. By padding a sequence with zeros, we do not alter the spatial sampling rate $\Delta x$, but only increase the length of the spatial domain by increasing the number of grid points from $M$ to $N$. By the reciprocity relations for each problem, we have

$$\Delta x\, \Delta\omega_M = \frac{1}{M} \qquad\text{and}\qquad \Delta x\, \Delta\omega_N = \frac{1}{N},$$

where $\Delta\omega_M$ and $\Delta\omega_N$ denote the frequency grid spacings of the $M$-point and $N$-point problems. Since $N > M$ and $\Delta x$ is constant, we see that

$$\Delta\omega_N = \frac{M}{N}\, \Delta\omega_M = \frac{\Delta\omega_M}{p};$$

therefore, padding with zeros decreases the grid spacing $\Delta\omega$ in the frequency domain by a factor of $p$. At the same time, the length of the frequency domain $\Omega$ is unchanged since $\Omega = 1/\Delta x$. Thus padding with zeros can be used as a technique for refining the frequency grid. It is important to realize that this interpretation is valid only if the padded values are correct; that is, if the function represented by $f_n$ is indeed zero outside the interval $[-(M/2)\Delta x, (M/2)\Delta x]$. If the function is not zero over the region of zero-padding, then legitimate data is lost in the process, and errors may be introduced. An interesting effect occurs in the case when an $M$-periodic sequence is set to zero for $|n| > M/2$. The result is not increased resolution in frequency, but rather an interpolation in the frequency domain (see problem 53, as well as [21] and [36]).
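The relation between the DFT of a sequence and the DFT of its zero-padded version, $F_k = p\,G_{pk}$ when $N = pM$, can be verified on a small case. The sketch below assumes the symmetric index range $-N/2 + 1 : N/2$ and the $1/N$ scaling of this chapter; the helper `dft_sym`, the choice $M = 8$, $p = 2$, and the data are illustrative.

```python
import cmath

def dft_sym(f):
    """DFT with 1/N scaling on symmetric indices.

    f[p] holds f_n for n = -N/2+1, ..., N/2 (N even); the result maps k -> F_k.
    """
    N = len(f)
    idx = range(-N // 2 + 1, N // 2 + 1)
    return {k: sum(fn * cmath.exp(-2j * cmath.pi * n * k / N) for n, fn in zip(idx, f)) / N
            for k in idx}

M, p = 8, 2
N = p * M
f = [0.3, -1.0, 2.0, 0.5, 1.0, -0.7, 0.2, 1.1]   # f_n for n = -3, ..., 4

pad = (N - M) // 2
g = [0.0] * pad + f + [0.0] * pad                 # zeros added at both ends: n = -7, ..., 8

F = dft_sym(f)                                    # M-point DFT of the original sequence
G = dft_sym(g)                                    # N-point DFT of the padded sequence

max_err = max(abs(F[k] - p * G[p * k]) for k in F)
print(max_err)  # every p-th padded coefficient, times p, reproduces F_k
```

The coefficients $G_k$ with $k$ not a multiple of $p$ are the additional values on the refined frequency grid.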

Summed and Differenced Sequences

Let $g_n = f_n + f_{n+1}$ be the sequence obtained by adding each term of $f_n$ to its right neighbor. This summing process is a sort of averaging: if the terms of $f_n$ are constant or slowly varying, then the terms of the summed sequence will have approximately twice the magnitude of the terms of the original sequence. On the other hand, if $f_n$ is a highly oscillatory sequence, then some cancellation will take place and the magnitude of the terms of the summed sequence will be greatly diminished. In fact, it is not difficult to show (problem 55), using the DFT definition (3.1), that

$$G_k = (1 + \omega_N^{k})\, F_k.$$

Noting that

$$|1 + \omega_N^{k}| = 2\,|\cos(\pi k / N)|,$$

we see that summing a sequence magnifies all of the low frequency modes (3.14). While some of the high frequency modes are magnified slightly (by as much as $\sqrt{2}$), it is evident from (3.15) that the high frequency modes of the summed sequence tend to be suppressed.

Now define a differenced sequence $h_n = f_n - f_{n+1}$ by taking the difference between each term of $f_n$ and its right neighbor. In this case the roles are reversed: if $f_n$ is a fairly smooth sequence, then the differenced sequence $h_n$ will experience cancellation, and the low frequency modes will be suppressed. It is a direct calculation to show (problem 55) that

$$H_k = (1 - \omega_N^{k})\, F_k.$$

In contrast to the summing operator, the difference operator amplifies high frequency coefficients (3.18) and diminishes most of the low frequency coefficients (3.17).
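The effect of summing and differencing on each mode can be seen by comparing DFTs directly. The sketch below assumes the chapter's conventions ($\omega_N = e^{i2\pi/N}$ and a forward DFT scaled by $1/N$), under which the shift property gives the multipliers $1 + \omega_N^k$ and $1 - \omega_N^k$, with magnitudes $2|\cos(\pi k/N)|$ and $2|\sin(\pi k/N)|$. Helper names and data are illustrative.

```python
import cmath
import math

def dft(f):
    """Forward DFT with 1/N scaling: F_k = (1/N) * sum_n f_n * exp(-2*pi*i*n*k/N)."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

f = [1.0, 0.5, -2.0, 3.0, 0.0, -1.5, 2.0, 1.0]
N = len(f)
summed = [f[n] + f[(n + 1) % N] for n in range(N)]    # g_n = f_n + f_{n+1}
diffed = [f[n] - f[(n + 1) % N] for n in range(N)]    # h_n = f_n - f_{n+1}

F, G, H = dft(f), dft(summed), dft(diffed)
w = cmath.exp(2j * cmath.pi / N)                       # omega_N

err_sum = max(abs(G[k] - (1 + w ** k) * F[k]) for k in range(N))
err_diff = max(abs(H[k] - (1 - w ** k) * F[k]) for k in range(N))

# Gain applied to mode k by each operator
gain_sum = [2 * abs(math.cos(math.pi * k / N)) for k in range(N)]
gain_diff = [2 * abs(math.sin(math.pi * k / N)) for k in range(N)]
print(err_sum, err_diff)
```

The summing gain equals 2 at $k = 0$ and vanishes at $k = N/2$, while the differencing gain behaves in the opposite way.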

Folded Sum and Difference

Another interesting manipulation that may be applied to any sequence is folding. Let the folded sequence $g_n = f_n + f_{n+N/2}$ be formed by adding each term of $f_n$ to its partner halfway through the sequence (using the periodicity of $f_n$ when necessary). Using the definition of the DFT, it may be shown (problem 55) that

$$G_k = \big(1 + (-1)^k\big) F_k = \begin{cases} 2F_k & \text{if } k \text{ is even}, \\ 0 & \text{if } k \text{ is odd}. \end{cases}$$

We see that a mode with an odd frequency (for example, $f_n = \sin(2\pi n/N)$, which is an odd sequence) is exactly canceled by the folding, while a mode with an even frequency (which is an even sequence) is reinforced by the folded sum operation.

Once again, using the periodicity of $f_n$, let $h_n = f_n - f_{n+N/2}$ be the sequence formed by differencing each term of $f_n$ with its partner half a period away. It then follows that

$$H_k = \big(1 - (-1)^k\big) F_k = \begin{cases} 0 & \text{if } k \text{ is even}, \\ 2F_k & \text{if } k \text{ is odd}. \end{cases}$$

We see that the folded difference reinforces modes with an odd frequency index and annihilates modes with an even index.
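The even and odd selection performed by folding can be checked in the same way. This is a sketch under the chapter's $1/N$ forward scaling; by the shift property a shift of $N/2$ multiplies $F_k$ by $\omega_N^{kN/2} = (-1)^k$. Helper names and data are illustrative.

```python
import cmath

def dft(f):
    """Forward DFT with 1/N scaling: F_k = (1/N) * sum_n f_n * exp(-2*pi*i*n*k/N)."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N)) / N
            for k in range(N)]

f = [2.0, -1.0, 0.5, 3.0, 1.0, 0.0, -2.5, 1.5]
N = len(f)
half = N // 2
fold_sum = [f[n] + f[(n + half) % N] for n in range(N)]    # g_n = f_n + f_{n+N/2}
fold_diff = [f[n] - f[(n + half) % N] for n in range(N)]   # h_n = f_n - f_{n+N/2}

F, G, H = dft(f), dft(fold_sum), dft(fold_diff)

err_sum = max(abs(G[k] - (1 + (-1) ** k) * F[k]) for k in range(N))
err_diff = max(abs(H[k] - (1 - (-1) ** k) * F[k]) for k in range(N))
odd_modes_after_sum = max(abs(G[k]) for k in range(1, N, 2))    # annihilated by the folded sum
even_modes_after_diff = max(abs(H[k]) for k in range(0, N, 2))  # annihilated by the folded difference
print(err_sum, err_diff)
```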



3.4. A Few Practical Considerations

In this section we will introduce three very practical matters that arise in the day-to-day use of the DFT. They are

averaging at endpoints,

aliasing,

leakage.

Each of these issues will be presented in an empirical manner, as it might be observed in a typical calculation, and some observations will be made in each case. But we can promise that each of these phenomena will appear again in later chapters: averaging at endpoints is almost a campaign slogan in this book that is chanted repeatedly; and aliasing and leakage will figure in critical ways in the analysis of DFT errors in Chapter 6.

Averaging at Endpoints (AVED)

A penetrating example should suffice to make the essential points. Assume that we are approximating the Fourier coefficients of the function $f(x) = x$ on the interval $[-1/2, 1/2]$. As we can easily show, the exact values are given by

$$c_0 = 0, \qquad c_k = \frac{i\,(-1)^k}{2\pi k}$$

for $k = \pm 1, \pm 2, \ldots$. The Fourier coefficients are pure imaginary (since $f$ is odd and real-valued) and they decrease as $|k|^{-1}$. In order to use the DFT to approximate the $c_k$'s, we must sample $f$ at the $N$ equally spaced points $x_n = n/N$ of $[-1/2, 1/2]$. In keeping with the convention that we have established, we will take $n = -N/2 + 1 : N/2$. Our lesson can be communicated with a small value of $N$. With $N = 8$, let the samples of $f$ be $f_n = n/N$, as shown in Table 3.3. Also shown are the DFT coefficients $F_k = \mathcal{D}_8\{f_n\}_k$.

TABLE 3.3
DFT approximations to the Fourier coefficients, with and without averaging at the endpoints; $f(x) = x$ on $[-1/2, 1/2]$, with $N = 8$.

              Without averaging               With averaging
n or k     f_n       F_k                   f~_n      F~_k
  -3      -3/8     -.0625 + .0259i         -3/8      .0259i
  -2      -1/4      .0625 - .0625i         -1/4     -.0625i
  -1      -1/8     -.0625 + .1509i         -1/8      .1509i
   0        0       .0625                    0        0
   1       1/8     -.0625 - .1509i          1/8     -.1509i
   2       1/4      .0625 + .0625i          1/4      .0625i
   3       3/8     -.0625 - .0259i          3/8     -.0259i
   4       1/2      .0625                    0        0

Here f~_n and F~_k denote the input with averaging at the endpoints and its DFT.


A moment's glance should arouse some suspicion. Notice that the DFT coefficients $F_k$ are not pure imaginary, as the Fourier coefficients are. An astute observer might also notice that the real part of the $F_k$'s oscillates between $\pm .0625 = \pm 1/16$, which means that the DFT coefficients do not decrease with $|k|$ as the Fourier coefficients do. Is there a problem?

There is a problem, and it can be identified in several ways. In this particular case one can argue in terms of symmetry. The function $f(x) = x$ is an odd function ($f(-x) = -f(x)$) on the interval $(-1/2, 1/2)$. If extended periodically beyond this interval, it will remain odd only if we define $f(-1/2) = f(1/2) = 0$. In sampling $f$ for input to the DFT, the sequence of samples must also inherit this symmetry, which means $f_{-N/2} = f_{N/2} = 0$.

More generally, one must argue by analogy with the Fourier series. Given the Fourier coefficients $c_k$ of a well-behaved function $f$, the function $\sum_k c_k e^{i 2\pi k x}$ converges to $f(x)$ on the interval $(-1/2, 1/2)$, to the periodic extension of $f$ outside of that interval, and to the average value at points of discontinuity. As shown in Figure 3.9, the Fourier series converges to the value

$$\frac{1}{2}\Big[ f\big(-\tfrac{1}{2}^{+}\big) + f\big(\tfrac{1}{2}^{-}\big) \Big] = \frac{1}{2}\Big[ -\frac{1}{2} + \frac{1}{2} \Big] = 0$$

at the points $x = \pm 1/2$. Therefore, the input to the DFT must also be defined with average values at the endpoints.

FIG. 3.9. A function $f$ must be sampled so that the resulting sequence takes average values at endpoints and points of discontinuity. The figure shows a linear function on $[-1/2, 1/2]$ which, if it were extended periodically beyond this interval, would be discontinuous at $x = \pm 1/2$ (solid line). Appropriate samples of the function are shown (•) with $f_{\pm N/2} = 0$, which is the average value of $f$ at $x = \pm 1/2$.

In the example above, the sequence of samples that was fed to the DFT was not defined with an average value at the endpoints. The correct input to the DFT is shown in the third column of data in Table 3.3 as the sequence $\tilde{f}_n$, with $\tilde{f}_{N/2} = 0$. The resulting DFT coefficients $\tilde{F}_k$ are pure imaginary and they decrease with $|k|$, as they should.
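The effect of the endpoint value on the computed coefficients can be reproduced in a few lines. The sketch below assumes the setting of the example ($f(x) = x$ on $[-1/2, 1/2]$, $N = 8$, symmetric indices, forward DFT scaled by $1/N$); the helper `dft_sym` and the variable names are illustrative.

```python
import cmath

def dft_sym(f):
    """DFT with 1/N scaling on symmetric indices.

    f[p] holds f_n for n = -N/2+1, ..., N/2 (N even); the result maps k -> F_k.
    """
    N = len(f)
    idx = range(-N // 2 + 1, N // 2 + 1)
    return {k: sum(fn * cmath.exp(-2j * cmath.pi * n * k / N) for n, fn in zip(idx, f)) / N
            for k in idx}

N = 8
raw = [n / N for n in range(-N // 2 + 1, N // 2 + 1)]   # f_n = n/N, so the last sample is 1/2
averaged = raw[:-1] + [0.0]                              # endpoint replaced by the average value 0

F_raw = dft_sym(raw)
F_avg = dft_sym(averaged)

for k in sorted(F_raw):
    print(f"{k:2d}  {F_raw[k]:.4f}  {F_avg[k]:.4f}")
```

Without averaging, every coefficient carries a real part of magnitude $1/16$; with averaging, the coefficients are pure imaginary and their magnitudes decrease with $|k|$.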


The discrepancy between the two sets of DFTs in Table 3.3 can be explained precisely. The two input sequences $f_n$ and $\tilde{f}_n$ differ only in the $n = 4$ position. In fact,

$$f_n = \tilde{f}_n + \frac{1}{2}\, \delta(n - 4),$$

where $\delta(n - 4)$ represents a spike with a magnitude of one at $n = 4$. Taking eight-point DFTs of both sides of this equation, and using the fact that

$$\mathcal{D}_8\{\delta(n - 4)\}_k = \frac{1}{8}\, \omega_8^{-4k} = \frac{(-1)^k}{8},$$

we see that

$$F_k = \tilde{F}_k + \frac{(-1)^k}{16}.$$

Notice that $F_k$ has a spurious component, $(-1)^k/16$, as observed in Table 3.3. The lesson is now quite compelling: average values must be used for a function whose periodic extension is discontinuous at $x = \pm A/2$, or the resulting DFT will assuredly be infected with errors. The remedy is to be sure that in generating the input to the DFT, the function that is sampled on the interval $[-A/2, A/2]$ is not $f$, but the auxiliary function

$$g(x) = \begin{cases} f(x) & \text{for } -A/2 < x < A/2, \\ \frac{1}{2}\big[ f(-A/2^{+}) + f(A/2^{-}) \big] & \text{for } x = \pm A/2. \end{cases}$$

Clearly, $f$ and $g$ differ only at a point, but it makes all the difference to the DFT. The same argument applies in defining the input sequence at any point at which $f$ has a discontinuity. This state of affairs leads us to coin the watchword AVED: average values at endpoints and discontinuities. The importance of AVED will be emphasized throughout the remaining chapters; its importance is also demonstrated in problem 58.

Aliasing

Have you ever watched a western movie (especially an old one) in which a stagecoach is traveling forward rapidly, but it looks as though the wheels are turning very slowly or even backwards? This is an example of aliasing (or strobing), a very important phenomenon to those who work in the area of signal processing. The reason for the apparent motion of the stagecoach wheel is simple. The shutter on the movie camera opens and closes at a specific rate, while the wheel on the stagecoach rotates at another rate. Suppose the wheel is turning rapidly enough that it makes one full rotation plus a small fraction of a turn between consecutive openings of the camera shutter. Then in consecutive frames of the film the wheel appears to advance only that small fraction of a turn, when in fact it has traveled through more than one full rotation. This gives the rather disconcerting illusion on film that while the stagecoach is moving at a good clip, the wheels are turning too slowly. Notice that the wheels could also have actually turned through two or three full rotations plus the same fraction and still appeared to advance only that fraction of a turn. The same phenomenon can afflict any form of sampled data. Whenever a signal oscillates rapidly enough, a given sample rate becomes insufficient to resolve the signal. As occurred with the wagon wheel, the signal appears to be oscillating at a lower frequency than the one at which it actually oscillates.

We need to understand aliasing in order to know when it can cause difficulty with DFTs. We first introduce the important concept of a band-limited function. A function $f$ is called band-limited if its Fourier transform is zero outside a finite interval $[-\Omega/2, \Omega/2]$. In other words, the "frequency content" of the function (or signal) lies below the maximum frequency $\Omega/2$. How does band-limiting affect aliasing? We have already seen in Chapter 2 that if a function is known at a finite number of sample points, then the DFT can be used to interpolate that function with a trigonometric polynomial. It makes sense that by increasing the number of samples of a function more accurate interpolating functions can be obtained. This naturally leads us to wonder: when do we have enough samples? That is, when is it possible to construct an interpolating function that is exact? Intuition suggests that there would be no sufficiently small grid spacing (or sampling rate) that would allow us to reconstruct the function exactly. Surprisingly, this isn't the case, and the statement to the contrary requires the condition of band-limiting. The celebrated theorem that gives the conditions under which a function may be reconstructed from its samples was introduced by Claude Shannon² [124]. It is worth stating without proof at this time.

THEOREM 3.2. SHANNON SAMPLING THEOREM. Let $f$ be a band-limited function whose Fourier transform is zero outside of the interval $[-\Omega/2, \Omega/2]$. If $\Delta x$ is chosen so that

$$\Delta x \le \frac{1}{\Omega}, \qquad (3.21)$$

then $f$ may be reconstructed exactly from its samples $f_n = f(n\Delta x) = f(x_n)$ by

$$f(x) = \sum_{n=-\infty}^{\infty} f_n\, \mathrm{sinc}\!\left( \frac{\pi (x - x_n)}{\Delta x} \right). \qquad (3.22)$$

The sinc function is given by $\mathrm{sinc}(x) = \sin(x)/x$, and is graphed in Figure 3.10 (see problem 51).

Rigorous proofs and discussions of this theorem abound [20], [82], [131], [158], [159], and it will appear again in Chapter 6. We will, however, make a few observations regarding this important result. The theorem tells us that, in theory, we may reconstruct a function exactly from its samples, provided that the function is band-limited and that we sample it sufficiently often to resolve its highest frequencies. The sampling theorem gives a prescription for selecting a sufficiently small grid spacing $\Delta x$ once the band-limit $\Omega$ is known. It tells us that $\Delta x$ must be chosen to satisfy (3.21). The critical sampling rate $\Delta x = 1/\Omega$ is called the Nyquist³ sampling rate, and the frequency $\Omega/2$ is known as the Nyquist frequency. The Nyquist frequency is the highest frequency that can be resolved using a given sample spacing $\Delta x$. All higher frequencies will be aliased to lower frequencies. The Nyquist sampling rate is the largest grid spacing that can resolve the frequency $\Omega/2$. Observe that (3.21) also implies that in order to resolve a single wave, we must have at least two sample points per period of the wave.

² CLAUDE SHANNON was born in 1916 and educated at the University of Michigan and MIT. His far-reaching research, done primarily at Bell Labs and MIT, either created or advanced the fields of information and communication theory, cryptography, and the theory of computation.

³ The Swedish-born engineer HARRY NYQUIST (1889-1976) received his Ph.D. in physics from Yale University in 1917. He is best known for his contributions in telecommunications, including his discovery of the conditions necessary for maintaining stability in a feedback circuit, known as the Nyquist criterion. Nyquist held 138 patents, many of them fundamental to electronic transmission of data.

FIG. 3.10. The sinc function $\mathrm{sinc}(x) = \sin(x)/x$, shown in this figure, appears frequently in Fourier analysis and lies at the heart of the Shannon Sampling Theorem.

As a simple example, consider Figure 3.11, which shows a single wave with a frequency of $\omega = 6$ cycles per unit length. We can use the foregoing arguments to conclude that the wave is band-limited with a Nyquist frequency of $\Omega/2 = 6$ cycles per unit length. Therefore, a sampling rate or grid spacing of $\Delta x < 1/12$ must be used to avoid aliasing and resolve the wave. If the wave is sampled with $\Delta x = 1/5$ as shown, the wave that is actually seen by the grid has a frequency of one cycle per unit length. The entire issue of aliasing, and all that it implies, will arise again in a fundamental way in Chapter 6 when we explore errors in the DFT.
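The aliasing in this example is easy to confirm numerically. The sketch below uses the grid spacing $\Delta x = 1/5$ quoted in the caption of Figure 3.11 and compares samples of the six-cycle wave with samples of a one-cycle wave; the helper `alias_frequency` (which folds a frequency into the resolvable band for a given number of samples per unit length) is an illustrative construction, not notation from the text.

```python
import math

def alias_frequency(freq, samples_per_unit):
    """Lowest nonnegative frequency that matches a cosine of frequency `freq` on the sample grid."""
    r = freq % samples_per_unit
    return min(r, samples_per_unit - r)

samples_per_unit = 5                  # grid spacing dx = 1/5
dx = 1 / samples_per_unit
grid = [n * dx for n in range(-5, 6)]

fast = [math.cos(2 * math.pi * 6 * x) for x in grid]   # wave with 6 cycles per unit length
slow = [math.cos(2 * math.pi * 1 * x) for x in grid]   # wave with 1 cycle per unit length

max_diff = max(abs(a - b) for a, b in zip(fast, slow))
print(max_diff)                                  # the two waves agree at every sample point
print(alias_frequency(6, samples_per_unit))      # apparent frequency on this grid
print(alias_frequency(6, 13))                    # 13 samples per unit length resolve the wave
```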

In practice, it is rarely possible to meet the condition of band-limiting exactly. It is a fundamental fact that a function cannot be limited in both frequency (band-limited) and in space (or time). Since many functions (or signals) have finite duration in space or time, they cannot be strictly band-limited. However, many functions of practical use are essentially band-limited and have a rapid decay rate. A function $f$ is essentially band-limited if its Fourier transform satisfies a decay bound, involving two positive constants, which means that $|\hat{f}(\omega)|$ decays faster than $|\omega|^{-1}$ as $|\omega| \to \infty$. (A deeper and more precise statement about the rate of decay of Fourier transforms is given by the Paley-Wiener⁴ Theorem [55], [110].) For such functions it is possible to choose a grid spacing $\Delta x$ sufficiently small that the error in the sinc function representation

⁴ NORBERT WIENER (1894-1964) was a mathematical prodigy who spent many years at MIT as a mathematics professor. He made fundamental contributions to control theory, communication theory, analysis, and algebra, and he established the new subject of cybernetics.


FIG. 3.11. A single wave with a frequency of six cycles per unit length (solid curve) is sampled (as shown by the asterisks) at a rate of $\Delta x = 1/5$, which is insufficient to resolve the wave. The wave that is actually "seen" by the sampling process has a frequency of one cycle per unit length (dashed curve).

(3.22) is negligible. We also note that the interpolation in (3.22) requires infinitely many points. In practice, the series may be truncated, and the resulting errors can be estimated and controlled [82].

LeakageAnother phenomenon that DFT users must understand is called leakage. It is sonamed because its effect is to allow frequency components that are not present in theoriginal waveform to "leak" into the DFT. A simple example will demonstrate thephenomenon quite vividly. We have seen that the Fourier transform of the functionf(x] — cos(27R<;ozO is given by

and consists of two "spikes" in the frequency domain at the points ±u;o. Now considerthe task of approximating this Fourier transform by sampling the function / on theinterval [—1/2,1/2] at the N equally spaced points xn = n/N. This means thatAx = l/N and (by the reciprocity relations) the grid spacing in the frequency domainis Au; = 1. How well does the DFT approximate the Fourier transform if we use afixed value of TV and choose two different values of WQ?

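The input sequence for the DFT is given by

$$f_n = \cos(2\pi\omega_0 x_n) = \cos\!\left(\frac{2\pi\omega_0 n}{N}\right), \qquad n = -N/2+1 : N/2,$$

and the DFT can be evaluated directly from the definition (problem 59) or found in The Table of DFTs in the Appendix. Let's first consider the case in which $\omega_0 = 10$. The DFT is given by

$$F_k = \tfrac{1}{2}\,\delta_N(k - 10) + \tfrac{1}{2}\,\delta_N(k + 10).$$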


FIG. 3.12. The DFT of a sampled cosine wave $f(x) = \cos(2\pi\omega_0 x)$ for $N = 32$ behaves quite differently depending on the value of $\omega_0$. In the top figure, with $\omega_0 = 10$, the DFT exactly reproduces the Fourier transform $\delta(\omega - 10)/2 + \delta(\omega + 10)/2$. In the bottom figure, with $\omega_0 = 10.13$, the DFT exhibits leakage errors in approximating the Fourier transform, because the sampling interval does not contain a whole number of periods.

Noting that in this case $\omega_k = k\Delta\omega = k$, we see that the DFT approximates the Fourier transform exactly at the frequency grid points as shown in the upper graph of Figure 3.12. On the other hand, when $\omega_0 = 10.13$, the Fourier transform still takes the form of two spikes,

$$\hat{f}(\omega) = \tfrac{1}{2}\,\delta(\omega - 10.13) + \tfrac{1}{2}\,\delta(\omega + 10.13).$$

However, the lower graph of Figure 3.12 reveals that the DFT does a very poor job of approximating the Fourier transform. In fact, for $k_0 = 2\omega_0 = 20.26$, The Table of DFTs in the Appendix shows that the DFT is now given by

which is hardly the representation of two spikes! In the latter case the exact Fourier transform, which should still consist of two spikes, has been contaminated by errors in the DFT that have leaked in from neighboring frequencies (sidelobes). We will return to the matter of leakage in Chapter 6, when it arises in the analysis of DFT errors. For now it suffices to note that leakage occurs when a periodic function is truncated and sampled on an interval that is not an integer multiple of the period. In the above example, the function $f(x) = \cos(2\pi\omega_0 x)$ has period $1/\omega_0$ and therefore completes $\omega_0$ periods on the interval $[-1/2, 1/2]$. When $\omega_0 = 10$, an integer number of complete cycles is sampled, and the truncated function is continuous when extended periodically. In the second case (when $\omega_0 = 10.13$), the truncated wave, when extended periodically, has a discontinuity. As we will soon see, when the DFT sees discontinuities in the underlying function, it reacts with large errors.
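The leakage experiment of Figure 3.12 is easy to reproduce. The sketch below is one possible illustration, not the authors' code: it assumes the centered $N$-point DFT with the factor $1/N$ on the forward transform, and the function name `sampled_cosine_dft` is hypothetical. It evaluates the DFT sum directly for $\omega_0 = 10$ and $\omega_0 = 10.13$ with $N = 32$ and lists the indices whose coefficients are not negligible.

```python
import cmath
import math

def sampled_cosine_dft(omega0, N=32):
    """Centered N-point DFT (with the 1/N factor) of f_n = cos(2*pi*omega0*n/N)."""
    idx = range(-N // 2 + 1, N // 2 + 1)
    f = {n: math.cos(2 * math.pi * omega0 * n / N) for n in idx}
    return {k: sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in idx) / N
            for k in idx}

exact = sampled_cosine_dft(10.0)    # whole number of periods on [-1/2, 1/2]
leaky = sampled_cosine_dft(10.13)   # fractional number of periods

# Indices whose coefficients are not negligible in each case.
spikes_exact = [k for k, F in exact.items() if abs(F) > 1e-9]
spikes_leaky = [k for k, F in leaky.items() if abs(F) > 1e-3]
print(spikes_exact)
print(len(spikes_leaky), abs(leaky[10]))
```

With $\omega_0 = 10$ only the two coefficients at $k = \pm 10$ survive, each equal to $1/2$. With $\omega_0 = 10.13$ the peak near $k = 10$ drops below $1/2$ and its neighbors pick up the difference, which is the leakage described above.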


3.5. Analytical DFTs

One of the universal distinctions that cuts across all of mathematics is one of process: it is the distinction between analytical methods and numerical methods. In a few simple words, analytical methods are "pencil and paper" techniques that often result in a formula for an exact solution to the problem at hand. For example, using the quadratic formula to solve a second-degree equation is an analytical method. On the other hand, numerical methods involve approximations (often very good ones) that are usually implemented on a computer. A numerical solution is often expressed as an algorithm and often involves the process of convergence. For example, using an iterative method (such as Newton's method) to find the roots of an equation is a numerical approach. The distinction is usually clear, but there are some gray areas. For example, finding a solution of a differential equation in the form of a Fourier series is an analytical method, since the coefficients can be given in terms of a nice formula. However, the actual evaluation of the Fourier series at specific points is usually done on a computer and involves an approximation which makes the method numerical in nature.

Despite occasional ambiguities, the distinction is quite useful, and furthermore it pervades the topic of DFTs. Given a function or a set of data samples, the evaluation of the DFT is usually done on a computer and is regarded as a numerical procedure. However, there are cases in which the DFT can be evaluated analytically, and when this is possible, the result is a single expression that gives all of the DFT coefficients in one clean sweep. Thus, there is something very satisfying about generating a DFT analytically. However, this satisfaction may not be appreciated by everyone! The thought of evaluating DFTs with a pencil and paper, instead of a computer, may seem arcane to some, rather like writing a term paper in calligraphy or reading Euclid in Greek. But for those who enjoy analytical methods (and calligraphy and Greek), this section may have some appeal. At the same time it is recommended for everyone: analytical methods for DFTs rely on the entire repertoire of properties that we have just studied, and there are many practical lessons along the way.

We will try to dispense with most preliminaries and proceed by example. However, there are two techniques that are so pervasive in evaluating DFTs analytically that they are worth stating at the start. One pattern that arises endlessly in various forms in this business is the geometric sum. The reader should know and be able to show (problem 64) that the sum of consecutive powers of a fixed number $a \neq 1$ (called the ratio) is given by

$$\sum_{n=M}^{N} a^n = \frac{a^M - a^{N+1}}{1 - a}.$$

The second tool that is essential in more involved DFT calculations is summation by parts. Just as integration by parts is applied to the product of functions, summation by parts can be applied to the product of sequences. Let $u_n$ and $v_n$

be two sequences and let $M$ and $N$ be integers. To make the summation by parts rule resemble the familiar integration by parts formula ($\int u\,dv = uv - \int v\,du$), we will also

More familiar forms of the geometric sum are


let $\Delta u_n = u_{n+1} - u_n$ and $\Delta v_n = v_{n+1} - v_n$. Then the summation by parts rule says

This result is most easily proved by writing out the terms of the left-hand sum, regrouping terms, and identifying the terms on the right side. A special case that arises in DFT calculations is

Notice that if $u_n$ and $v_n$ are periodic sequences with period $N$, then $u_{N/2} = u_{-N/2}$ and $v_{-N/2+1} = v_{N/2+1}$, and the two "boundary terms" cancel. Thus, for periodic sequences we have

With these two handy tools at our disposal, let's do a couple of DFT calculations analytically.
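Both tools are easy to sanity-check numerically before they are put to work. The sketch below is an illustration under stated assumptions: the geometric sum is taken in the form $\sum_{n=M}^{N} a^n = (a^M - a^{N+1})/(1-a)$, and summation by parts is taken in the form $\sum_{n=M}^{N} u_n \Delta v_n = u_N v_{N+1} - u_{M-1} v_M - \sum_{n=M}^{N} v_n \Delta u_{n-1}$, which is the arrangement consistent with the $\Delta u_{n-1}$ notation used in the examples that follow. The function names and test sequences are hypothetical.

```python
import cmath

def geometric_sum(a, M, N):
    """Closed form for a**M + ... + a**N, valid for a != 1."""
    return (a**M - a**(N + 1)) / (1 - a)

def by_parts_sides(u, v, M, N):
    """Return both sides of the summation by parts identity for sequences u(n), v(n)."""
    lhs = sum(u(n) * (v(n + 1) - v(n)) for n in range(M, N + 1))
    boundary = u(N) * v(N + 1) - u(M - 1) * v(M)
    rhs = boundary - sum(v(n) * (u(n) - u(n - 1)) for n in range(M, N + 1))
    return lhs, rhs

# Geometric sum with a complex ratio on the unit circle, summed from n = -3 to n = 8.
a = cmath.exp(2j * cmath.pi / 7)
direct = sum(a**n for n in range(-3, 9))
closed = geometric_sum(a, -3, 8)

# Summation by parts with an arbitrary (non-periodic) pair of sequences.
lhs, rhs = by_parts_sides(lambda n: n * n + 1.0, lambda n: cmath.exp(0.3j * n), -4, 5)
print(abs(direct - closed), abs(lhs - rhs))
```

Both printed differences should be at the level of rounding error. For $N$-periodic sequences summed over a full period, the boundary term `u(N) * v(N + 1) - u(M - 1) * v(M)` vanishes, which is the special case used below.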

Example: DFT of the square pulse. An important transform that arises both in theory and in practice is the transform of the square pulse (often called a boxcar function). We will begin with a function defined on the interval $[-A/2, A/2]$, and then sample it to generate a sequence suitable as input to the DFT. The square pulse is given by

$$f(x) = \begin{cases} 1 & \text{if } |x| < A/4, \\ 1/2 & \text{if } |x| = A/4, \\ 0 & \text{if } A/4 < |x| \le A/2. \end{cases}$$

Notice that in anticipation of sampling this function for use in the DFT, we have defined $f$ to have the average of its values at $x = \pm A/4$, where it is discontinuous. Notice also that if $f$ is extended periodically beyond this basic interval, then it is continuous at $x = \pm A/2$, and we may rightfully assume that $f(\pm A/2) = 0$.

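When $f$ is sampled at the usual grid points $x_n = nA/N$ for $n = -N/2+1 : N/2$, the resulting sequence is

$$f_n = \begin{cases} 1 & \text{if } |n| < N/4, \\ 1/2 & \text{if } |n| = N/4, \\ 0 & \text{if } N/4 < |n| \le N/2. \end{cases}$$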

Again notice how $f_n$ is assigned its average value at the discontinuities. We may now proceed with the calculation of the $N$-point DFT of this sequence.

Using the familiar definition we have that for $k = -N/2+1 : N/2$,


Let's pause and comment on a few important maneuvers that were used. The $\pm N/4$ terms were separated in the first step and gathered as a single cosine term under the authority of the Euler relations. The remaining sum was then collapsed, so it runs over the indices $n = 0 : N/4 - 1$. It is best to include the $n = 0$ term in the sum, but notice that it is "double-counted," which is why it is necessary to subtract 1 from the sum. In the last step we noted that the summand $\omega_N^{-nk} + \omega_N^{nk}$ can be written

Now the sum can be recognized as a geometric sum with ratio $\omega_N^{k}$. Evaluating the geometric sum allows the calculation to continue:

At this point the problem has been "cracked" since the DFT sum has been evaluated. However, the work is not complete. A bit of algebraic alacrity is needed to collect and simplify terms, and this is best not done in public! No matter what form of thrashing is used, the result that finally emerges is

for $k = -N/2+1 : N/2$, but $k \neq 0$.

Generally, one arrives at this point without a clue as to whether the result is correct. How can we check the accuracy of our calculation? First, it does not hurt to graph the sequence $F_k$ for selected values of $N$. Figure 3.13 shows the alleged DFT coefficients plotted for $N = 16$ and $N = 32$. The coefficients oscillate and decay like $k^{-1}$, which is apparent from the analytic result (3.23). Notice that the length of the spatial interval $A$ does not appear in the DFT (in fact, it did not even appear in the input sequence).

There are several tests that can be used to instill some confidence in an analytical result. The first check is symmetry. In the case of the square pulse, the input sequence is real and even ($f_{-n} = f_n$). Therefore, we expect the DFT sequence to be real and even as well, and indeed it is. A second easily applied test is periodicity. The sequence $F_k$ generated by an $N$-point DFT must be $N$-periodic. The alleged DFT (3.23) has this property.

A separate calculation shows that $F_0 = 1/2$.


Another test might be called the $F_0$ test or average value test. It is usually fairly easy to check that the DFT property

$$F_0 = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n$$

is satisfied. In the case of the square pulse, we have already noted that indeed $F_0 = 1/2$ is the average value of the input sequence.
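The three tests (symmetry, periodicity, and the $F_0$ test) can also be run by machine. The sketch below is a hedged illustration rather than the authors' procedure. It assumes the centered DFT with the $1/N$ factor, and it compares the brute-force DFT with a candidate closed form, $F_k = \sin(\pi k/2)/(N\tan(\pi k/N))$ for $k \neq 0$ and $F_0 = 1/2$, which is what our own algebra gives for this input; the helper names are hypothetical.

```python
import cmath
import math

def centered_dft(f, N):
    """F_k = (1/N) * sum of f_n * exp(-2*pi*i*n*k/N) over n = -N/2+1 .. N/2."""
    idx = range(-N // 2 + 1, N // 2 + 1)
    return {k: sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in idx) / N
            for k in idx}

def square_pulse(N):
    """Samples with the average value 1/2 at the jumps n = +-N/4 (N divisible by 4)."""
    q = N // 4
    return {n: 1.0 if abs(n) < q else (0.5 if abs(n) == q else 0.0)
            for n in range(-N // 2 + 1, N // 2 + 1)}

def candidate_closed_form(k, N):
    """Assumed closed form from our own algebra, not quoted from the text."""
    if k % N == 0:
        return 0.5
    return math.sin(math.pi * k / 2) / (N * math.tan(math.pi * k / N))

N = 16
f = square_pulse(N)
F = centered_dft(f, N)

# Symmetry: a real, even input should give a real, even DFT.
max_imag = max(abs(F[k].imag) for k in F)
max_odd_part = max(abs(F[k] - F[-k]) for k in range(1, N // 2))
# Periodicity of the candidate formula: shifting k by N should change nothing.
max_period_gap = max(abs(candidate_closed_form(k, N) - candidate_closed_form(k + N, N))
                     for k in F)
# F_0 test: F_0 should equal the average of the input sequence.
average = sum(f.values()) / N
# Agreement between the brute-force DFT and the candidate formula.
max_formula_gap = max(abs(F[k] - candidate_closed_form(k, N)) for k in F)
print(max_imag, max_odd_part, max_period_gap, F[0].real, average, max_formula_gap)
```

Passing these checks does not prove a formula correct, but a failure of any one of them is a quick signal that the algebra has gone astray.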

FIG. 3.13. The analytical DFT of the square pulse is plotted at the points $k = -N/2 : N/2$ for $N = 16$ (marked with +) and $N = 32$ (marked with o). The exact values of the Fourier coefficients lie on the solid curve, which is superimposed for comparison. The convergence of the DFT coefficients to the Fourier coefficients as $N \to \infty$ can be observed. Note that where coefficients were computed with both sequence lengths, those using $N = 32$ are more accurate than those using $N = 16$. Note also that within either sequence, the coefficients for small $|k|$ are more accurate than those for large $|k|$.

These three tests are encouraging, but not conclusive. Perhaps the most convincing support one can find for an analytical DFT is the comparison with the Fourier series coefficients, $c_k$, and the Fourier transform, $\hat{f}$. In this case, two brief calculations demonstrate that


where the sinc function has been defined as $\operatorname{sinc}(x) = \sin x / x$. Notice that $c_k$ is independent of $A$, while $\hat{f}$ is not. As an aside, we may confirm the relationship, derived in Chapter 2, between the Fourier coefficients and the Fourier transform: evaluating $\hat{f}$ at the points $\omega_k = k/A$ of the frequency grid, we see that $\hat{f}(\omega_k) = A c_k$.

How do $c_k$ and $\hat{f}(\omega_k)$ compare to our analytical DFT? Visually the comparison is good, as seen in Figure 3.13, in which the Fourier coefficients $c_k$ are superimposed on the DFT coefficients. Furthermore, the agreement between the DFT and Fourier coefficients appears to improve as $N$ increases. The comparison becomes more than visual if we let $N$ become large in the DFT coefficients. A limit can actually be taken, and we find that (problem 65)

for $k = -N/2+1 : N/2$. In other words, in the limit $N \to \infty$, we see that the DFT coefficients approach the Fourier coefficients. This limiting property will be explored at great length in Chapter 6, but it is useful to see an early example of it.
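The convergence of the DFT coefficients to the Fourier coefficients can be observed numerically. The sketch below is illustrative only. It assumes the centered DFT with the $1/N$ factor and takes the Fourier coefficients of the square pulse to be $c_0 = 1/2$ and $c_k = \sin(\pi k/2)/(\pi k)$ for $k \neq 0$, which is what direct integration of the pulse gives; the function names are hypothetical.

```python
import cmath
import math

def square_pulse(n, N):
    """Sampled square pulse with the average value 1/2 at n = +-N/4."""
    q = N // 4
    return 1.0 if abs(n) < q else (0.5 if abs(n) == q else 0.0)

def dft_coefficient(k, N):
    """Single coefficient F_k of the centered N-point DFT (1/N on the forward sum)."""
    total = sum(square_pulse(n, N) * cmath.exp(-2j * cmath.pi * n * k / N)
                for n in range(-N // 2 + 1, N // 2 + 1))
    return total / N

def fourier_coefficient(k):
    """Assumed c_k of the square pulse: 1/2 for k = 0, sin(pi k/2)/(pi k) otherwise."""
    return 0.5 if k == 0 else math.sin(math.pi * k / 2) / (math.pi * k)

# Error in F_1 as N doubles; each doubling should cut the error by roughly 4.
errors = {N: abs(dft_coefficient(1, N) - fourier_coefficient(1)) for N in (16, 32, 64, 128)}
ratios = [errors[N] / errors[2 * N] for N in (16, 32, 64)]

# For fixed N, a low-frequency coefficient is more accurate than a high-frequency one.
error_small_k = abs(dft_coefficient(1, 16) - fourier_coefficient(1))
error_large_k = abs(dft_coefficient(7, 16) - fourier_coefficient(7))
print(ratios, error_small_k, error_large_k)
```

For this input and a fixed $k$, the observed ratios near 4 suggest an error that shrinks like $N^{-2}$, and the last two numbers reflect the remark in the caption of Figure 3.13 that small $|k|$ is approximated better than large $|k|$.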

Example: The DFT of a linear profile. We now consider an input sequence that requires another technique. Let $f(x) = x/A$ on the interval $[-A/2, A/2]$. If this function is sampled at the $N$ points $x_n = nA/N$, where $n = -N/2+1 : N/2$, the resulting sequence is given by

$$f_n = \begin{cases} n/N & \text{for } n = -N/2+1 : N/2-1, \\ 0 & \text{for } n = N/2. \end{cases}$$

Once again, we see (Figure 3.14) that in anticipation of the DFT, the average value of the function has been used at the points of discontinuity (AVED), which occur at the endpoints of the interval. Said differently, if the sequence $f_n$ defined above were extended periodically, it would be an $N$-periodic sequence.

Now we may proceed with the DFT. Applying the DFT definition to the sequence $f_n$ we have

Notice that the $n = N/2$ term has been kept in the sum even though $f_{N/2} = 0$. Since $f_n$ is linear in $n$, we no longer have a geometric sum, and another tool is needed. In analogy with the integration of a function like $f(x) = xe^{ax}$, we consider summation by parts. As shown in the DFT sum above, the term $u_n$ is identified with that part of the sum that is easily "differenced," whereas $\Delta v_n$ is chosen so that it is (hopefully) the difference of a sequence $v_n$. With $u_n$ and $\Delta v_n$ selected in this manner, the summation by parts formula says that


FIG. 3.14. The linear function $f(x) = x/A$ is sampled at $N = 20$ equally spaced points of the interval $[-A/2, A/2]$ to generate the input sequence $f_n = n/N$ for $n = -N/2+1 : N/2$. The periodic extension of this function beyond the interval is discontinuous at $x = \pm A/2$. Therefore, the average value of the function must be used at these points (AVED), and the value $f_{\pm N/2} = 0$ is assigned.

As with integration by parts, there are two subproblems that must be solved. The first is to compute $\Delta u_{n-1}$, where $u_n = f_n$ is given by

A short calculation shows that elements of the (periodic) sequence $\Delta u_{n-1} = u_n - u_{n-1}$ for $n = -N/2+1 : N/2$ are given by

The second subproblem is to find the sequence vn that satisfies

Those readers who are familiar with difference equations will recognize this as a first-order constant coefficient difference equation (a creature that will be examined in greater detail in Chapter 7). It can be solved by assuming a trial solution of the form $v_n = \alpha\omega_N^{-nk}$, substituting and solving for the undetermined coefficient $\alpha$. This problem is also equivalent to finding the "antidifference" of $\omega_N^{-nk}$ (analogous to antidifferentiation). By either method, the result (problem 67) is that the sequence $v_n$ is given by

An important fact is that $v_n$ is periodic with period $N$, as is $u_n$.


We may now invoke the summation by parts formula. We find that

The first important simplification now occurs. Notice that because of the periodicity of both $u_n$ and $v_n$, the first two "boundary terms" exactly cancel. We next recall that $\Delta u_{n-1} = 1/N$ except when $n = -N/2+1$ and $n = N/2$, which prompts us to split the above sum as follows:

The first two terms after the sum remove the extraneous first and last terms from the sum; the last two terms after the sum add in the correct first and last terms of the sum. The reason for splitting the sum in this awkward way is to maintain a sum over the full range of indices ($n = -N/2+1 : N/2$); furthermore, this sum is a geometric sum. Additional simplification leads to

where $k = -N/2+1 : N/2$. The important step is the evaluation of the geometric sum. As shown, it has a value of $N\delta_N(k)$, which is zero unless $k = 0$.

Now only some algebraic dexterity is required to reduce this expression to a manageable form. Once it is done, the result is remarkably simple; we find that the DFT of the linear sequence is given by

for $k = -N/2+1 : N/2$, with the special case that $F_0 = 0$. Most mortals emerge from such a calculation with at least a bit of skepticism about its accuracy. So let's try to verify the above DFT. First the symmetry looks good: the input sequence is real and odd; the DFT is pure imaginary and odd as it should be. Also, the periodicity test is satisfied, since $F_k$ as defined above is an $N$-periodic sequence. The $F_0$ test is also confirmed, since

Note that this test fails if the AVED warning is not heeded, that is, if the average value is not used at the point $n = N/2$.

A little more effort is needed to compare the DFT to the Fourier coefficients, but it offers further assurance. A short calculation (not surprisingly requiring integration by parts) shows that the Fourier coefficients of the function $f(x) = x/A$ on the interval $[-A/2, A/2]$ are given by

for $k = \ldots, -2, -1, 1, 2, \ldots$, with $c_0 = 0$. Clearly, we already have $c_0 = F_0$. A limiting argument (problem 66) can be used to show that

for $k = -N/2+1 : N/2$. At this point, we have used all of the immediate tests, and they give reasonable testimony to the accuracy of the DFT calculation just presented.
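The same checks can be automated for the linear profile, and the effect of ignoring the AVED warning can be seen directly. The sketch below is a hedged illustration. It assumes the centered DFT with the $1/N$ factor and compares the brute-force DFT with a candidate closed form, $F_k = i(-1)^k\cot(\pi k/N)/(2N)$ for $k \neq 0$ and $F_0 = 0$, obtained from our own algebra; the helper names are hypothetical.

```python
import cmath
import math

def centered_dft(f, N):
    """F_k = (1/N) * sum of f_n * exp(-2*pi*i*n*k/N) over n = -N/2+1 .. N/2."""
    idx = range(-N // 2 + 1, N // 2 + 1)
    return {k: sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in idx) / N
            for k in idx}

def linear_samples(N, averaged=True):
    """f_n = n/N, with f_{N/2} = 0 when the endpoint average (AVED) is used."""
    f = {n: n / N for n in range(-N // 2 + 1, N // 2 + 1)}
    if averaged:
        f[N // 2] = 0.0
    return f

def candidate_closed_form(k, N):
    """Assumed closed form from our own algebra: i (-1)^k cot(pi k/N) / (2N)."""
    if k % N == 0:
        return 0.0
    x = math.pi * k / N
    return 1j * (-1) ** k * math.cos(x) / (math.sin(x) * 2 * N)

N = 20
F = centered_dft(linear_samples(N), N)
F_naive = centered_dft(linear_samples(N, averaged=False), N)

max_real = max(abs(F[k].real) for k in F)                       # should be purely imaginary
max_even_part = max(abs(F[k] + F[-k]) for k in range(1, N // 2))  # should be odd
max_formula_gap = max(abs(F[k] - candidate_closed_form(k, N)) for k in F)
f0_averaged = abs(F[0])        # F_0 test with AVED
f0_naive = abs(F_naive[0])     # F_0 test when f_{N/2} = 1/2 is used instead
print(max_real, max_even_part, max_formula_gap, f0_averaged, f0_naive)
```

With the averaged endpoint, $F_0$ vanishes to rounding error. With the naive value $f_{N/2} = 1/2$, the $F_0$ test returns $1/(2N)$ instead of zero, and the DFT is no longer purely imaginary.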

Example: How one DFT begets another. Since the previous computation may have seemed arduous, it would be nice to know that additional DFTs can be squeezed from it with very little effort. The strategy relies upon the properties of the DFT and it finds frequent use. Consider the problem of approximating the Fourier coefficients of the function $g(x) = x$ on the interval $[0, A]$. If this function is sampled at the $N$ equally spaced points $x_n = nA/N$, where $n = 0 : N-1$, the resulting sequence is

Notice that in anticipation of using $g_n$ as input to the DFT, the average value of the function has been used at the discontinuity at $x = 0$. To compute the DFT, one could use the alternate definition of the DFT (on the index set $n = 0 : N-1$) and compute the coefficients directly. This exercise would resemble the previous example very closely. On the other hand, one could use the result of the previous example and save some labor. Let's follow the latter course.

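Looking at Figure 3.15, we see that the input sequence $g_n$ can be formed from the sequence $f_n = n/N$ of the previous example by (i) shifting it horizontally by $N/2$ index units (or $A/2$ physical units), (ii) shifting it vertically $A/2$ physical units, and (iii) scaling it by a factor of $A$. Each of these three operations can be accommodated by the DFT. In fact, we can relate the two sequences as follows:

$$g_n = A\left(f_{n-N/2} + \tfrac{1}{2}\right)$$

for $n = 0 : N-1$. (Check, for example, that $g_0 = A\left(f_{-N/2} + \tfrac{1}{2}\right) = A/2$.) Letting $G_k = \mathcal{D}\{g_n\}_k$ we can now appeal to the relevant DFT properties to write

$$\begin{aligned}
G_k &= \mathcal{D}\left\{A\left(f_{n-N/2} + \tfrac{1}{2}\right)\right\}_k \\
    &= A\,\mathcal{D}\{f_{n-N/2}\}_k + \tfrac{A}{2}\,\mathcal{D}\{1\}_k && \text{(by linearity)} \\
    &= A\,\omega_N^{-Nk/2}\,F_k + \tfrac{A}{2}\,\delta_N(k) && \text{(by the shift property)} \\
    &= A(-1)^k F_k + \tfrac{A}{2}\,\delta_N(k)
\end{aligned}$$

for $k = 0 : N-1$. In order of appearance, the linearity of the DFT, the shift property, and the fact that $\mathcal{D}\{1\}_k = \delta_N(k)$ have been used to arrive at this result.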

FIG. 3.15. Having computed the DFT of one sequence, the DFT of a related sequence can often be found easily. The figure shows two linear sequences: $f_n = n/N$ (shown as •) arises from sampling the function $f(x) = x/A$ on the interval $[-A/2, A/2]$, and $g_n = nA/N$ (shown as o) are the samples of $g(x) = x$ on the interval $[0, A]$. The sequences differ by a scaling, a horizontal shift, and a vertical shift. All of these modifications can be accommodated by the DFT. The figure shows the case in which $A = 1$ and $N = 20$.

Using the values of Fk found previously we have that

Notice that the sequence $G_k$ is not quite symmetric; although it appears to be imaginary and odd, it is not, since $G_0 = A/2$. The sequence $G_k$ lost its symmetry when the sequence $f_n$ was altered to form $g_n$.
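The claim that one DFT begets another can be tested without any closed forms at all. The sketch below is an illustration under stated assumptions: both DFTs carry the $1/N$ factor, the sequences use the averaged values $f_{N/2} = 0$ and $g_0 = A/2$, and the relation being tested is $G_k = A(-1)^k F_k + (A/2)\delta_N(k)$, which is what linearity and the shift property give when $g_n = A(f_{n-N/2} + 1/2)$. The names `dft` and `F_periodic` and the values of `N` and `A` are hypothetical.

```python
import cmath

def dft(values, indices, N):
    """(1/N) * sum of values[n] * exp(-2*pi*i*n*k/N), for each k in indices."""
    return {k: sum(values[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in indices) / N
            for k in indices}

N, A = 20, 3.0
centered = range(-N // 2 + 1, N // 2 + 1)
noncentered = range(N)

f = {n: (0.0 if n == N // 2 else n / N) for n in centered}       # f_n with AVED
g = {n: (A / 2 if n == 0 else n * A / N) for n in noncentered}   # g_n with AVED

F = dft(f, centered, N)
G_direct = dft(g, noncentered, N)

def F_periodic(k):
    """Read F_k for any integer k by using the N-periodicity of the DFT."""
    k = k % N
    return F[k] if k <= N // 2 else F[k - N]

# G_k assembled from F_k by scaling, shifting, and adding the constant mode.
G_from_F = {k: A * (-1) ** k * F_periodic(k) + (A / 2 if k == 0 else 0.0)
            for k in noncentered}
max_gap = max(abs(G_direct[k] - G_from_F[k]) for k in noncentered)
print(max_gap, G_direct[0])
```

The direct DFT of $g_n$ and the one assembled from $F_k$ agree to rounding error, and $G_0 = A/2$ is the coefficient that spoils the odd symmetry.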

Examples and exercises involving analytical DFTs abound and could continue endlessly (as will be demonstrated in the problem section). However, we must conclude, and will do so with a brief look at The Table of DFTs which appears in the Appendix. This table is a collection of as many analytical DFTs as time, energy, and sanity allowed. A few words of explanation are needed. Each entry in the Table of DFTs is arranged as follows.

Discrete input name        $f_n$, $n \in \mathcal{N}$             Graph of $F_k$
Graph of $f_n$             $F_k$, $k \in \mathcal{N}$
                           $|c_k - F_k|$, $k \in \mathcal{N}$
Continuum input name       $f(x)$, $x \in I$                      Graph of $|c_k - F_k|$
Graph of $f(x)$            $c_k$, $k \in \mathbf{Z}$              $\max|c_k - F_k|$
                           Comments

The first column has two boxes. The upper box gives the name of the input, below which are graphs of the real and imaginary parts of the discrete input sequence. The lower box contains the name of the continuum input, and the corresponding continuum input graphs. The middle column has six boxes containing, in order from top to bottom, the formula of the input sequence $f_n$; the analytic $N$-point DFT output $F_k$; a measure of the difference $|c_k - F_k|$; the formula of the continuum input function $f(x)$; the formula for the Fourier coefficients $c_k$; and an entry for comments, perhaps the most important of which is the AVED warning. This means that average values at endpoints and discontinuities must be used if the correct DFT is to be computed. The third column consists of two boxes. The upper box displays graphically the real and imaginary parts of the DFT. The lower box gives the maximum error $\max|c_k - F_k|$, and displays graphically the error $|c_k - F_k|$ for a small (24-point) example.

Unless otherwise noted, the function is assumed to be sampled on the interval $[-A/2, A/2]$. The difference $|c_k - F_k|$ is generally in the form $CN^{-p}$ for some constant $C$ and some positive integer $p$, which should be interpreted in an asymptotic sense for $N \to \infty$; in other words, if $|F_k - c_k| = CN^{-p}$, then

While this measure is different from the pointwise error that will be derived in Chapter 6, it does agree with those estimates in its dependence on $N$.

We close this section of esoterica with the request that readers who find challenge and joy in computing DFTs analytically submit any genuinely new DFTs that do not appear in The Table of DFTs to the authors.

3.6. Problems

Alternate Forms of the DFT

30. Geometry of even/noncentered DFT modes. Consider a 12-point DFT defined on the index sets $n, k = 0 : 11$. Sketch the 12 different modes corresponding to the indices $k = 0 : 11$ and note their frequency and period. What value of $k$ corresponds to the highest frequency mode? What is the frequency and period of that mode? Give the pairs of indices that correspond to modes with the same frequency.

31. Geometry of odd/noncentered DFT modes. Consider an 11-point DFT defined on the index sets $n, k = 0 : 10$. Sketch the 11 different modes corresponding to the indices $k = 0 : 10$ and note their frequency and period. What values of $k$ correspond to the highest frequency mode? What is the frequency and period of this mode? Give the pairs of indices that correspond to modes with the same frequency.

32. DFTs on an odd number of points. Show that in general a noncentered DFT on an odd number of points (single set or double set) does not include the highest frequency mode $\cos(\pi n)$. Show that the highest frequency modes that are present correspond to indices $k = (N \pm 1)/2$ for a single set of points and $k = N, N+1$ for a double set of points.

33. DFTs on an odd number of points. Show that in general a centered DFT on an odd number of points (single set or double set) does not include the highest frequency mode $\cos(\pi n)$. Show that the highest frequency modes that are present correspond to indices $k = \pm N/2$ for a single set of points with $N$ even, $k = \pm(N-1)/2$ for a single set of points with $N$ odd, and $k = \pm N$ for a double set of points.

34. Odd/centered DFTs. Consider the DFT defined on a centered interval with an odd number of transform points.

(a) Assuming N is even, verify the orthogonality property

(b) Given this orthogonality property and the forward DFT,

for $k = -N/2 : N/2$, derive the corresponding inverse DFT.

(c) Verify that the sample points for this DFT, $x_n = nA/(N+1)$, do not include either endpoint of the interval $[-A/2, A/2]$. Show that $x_{\pm N/2}$ approach $\pm A/2$ as $N$ increases.

35. Double/noncentered/odd DFT. Write out the complete DFT pair for a double set of an odd number of noncentered points. Indicate the ranges of the indices clearly. Write out the grid points in the spatial and frequency domains and find the reciprocity relations that apply.

36. Double/centered/odd DFT. Write out the complete DFT pair for a double set of an odd number of centered points. Indicate the ranges of the indices clearly. Write out the grid points in the spatial and frequency domains and find the reciprocity relations that apply.

37. Using a DFT. Assume that you have a program that computes a double/noncentered/odd DFT. Write out the forward DFT expression. Show how it can be used to approximate the Fourier coefficients of $f(x) = x$ on the interval $[0, 2]$ using 31 points. What are the spatial grid points $x_n$ and the frequency grid points $\omega_k$ that are used by this DFT? Give the values of the input sequence at each spatial grid point (with attention to the endpoints).

38. Using a DFT. Assume that you have a program that computes a double/centered/even DFT. Write out the forward DFT expression. Show how it can be used to approximate the Fourier coefficients of $f(x) = x$ on the interval $[-2, 2]$ using 32 points. In particular, what are the spatial grid points $x_n$ and the frequency grid points $\omega_k$ that are used by this DFT? Give the values of the input sequence at each spatial grid point (with attention to the endpoints).

39. Modifying a DFT. It is not uncommon to have a DFT program that does not fit the specifications of the problem at hand. Assume that you have a program that computes a single/centered/even DFT. Write out the forward transform and indicate the index ranges clearly. Show how it can be used to approximate the Fourier coefficients of $f(x) = x$ on the interval $[0, 2]$ using 32 points. Use the periodicity of the DFT and indicate how to define the input sequence and interpret the output.

40. Software and DFTs. Assume that the Fourier coefficients of $f(x) = \cos(\pi x/2)$ must be approximated on the interval $[-1, 1]$. Show how the $N$-point input sequence must be defined for each of the software packages discussed in the text. Equally important, show how the sequence of transform coefficients should be interpreted in each case. In particular, show where the coefficients of the constant mode and the highest frequency mode(s) appear in the output list.

41. Custom DFTs. Assume that you need a DFT program that approximates the Fourier coefficients of a given function $f$ on the interval $[-A/4, 3A/4]$. Describe how to "hand tool" a DFT for this problem with each of the following strategies.

(a) Design a special DFT from scratch based on the sampling interval $[-A/4, 3A/4]$. Describe how you would define the spatial and frequency grids. How would you interpret the transform coefficients $F_k$; specifically, with which frequency is each $F_k$ associated?

(b) Use the shift theorem for the DFT to transform this problem so that a standard DFT can be used.

(c) Use the shift theorem for Fourier coefficients to transform this problem so that a standard DFT can be used.

Can you think of any other strategies? Can you outline a general procedure for creating a new DFT or modifying an existing DFT so that it applies to an arbitrary sampling interval $[-pA, (1-p)A]$ of length $A$, where $0 < p < 1$?


42. Hermitian symmetry. Verify the Hermitian symmetry relations

using the definition of the DFT and its inverse.

43. Shift property. Apply the definition of the DFT (8.21) directly to the sequence $g_n = f_{n-j}$, where $n = -N/2+1 : N/2$ and $j$ is a fixed integer to prove the shift property $\mathcal{D}\{f_{n-j}\}_k = \omega_N^{-jk}\,\mathcal{D}\{f_n\}_k$.

44. Even and odd input. Use the definition of the DFT to show directly that (a) the DFT of an even sequence is even, (b) the DFT of a real and even sequence is real and even, (c) the DFT of an odd sequence is odd, and (d) the DFT of a real and odd sequence is imaginary and odd.

45. Waveform decomposition. Show that an arbitrary sequence $f_n$ can be decomposed in the form $f_n = f_n^{\text{even}} + f_n^{\text{odd}}$, where

$$f_n^{\text{even}} = \frac{f_n + f_{-n}}{2} \quad \text{and} \quad f_n^{\text{odd}} = \frac{f_n - f_{-n}}{2}.$$

Verify that $f_n^{\text{even}}$ is an even sequence and $f_n^{\text{odd}}$ is an odd sequence. Finally, show that

where

46. IDFT of a conjugate symmetric sequence. Show that if $F_k$ is any conjugate symmetric sequence then $f_n = \mathcal{D}^{-1}\{F_k\}_n$ is a real-valued sequence.

47. Discrete correlation theorem. Prove that

48. Parseval's relation. Prove Parseval's relation

using the definition of the DFT and the orthogonality property.

49. Alternate proof of the Discrete Convolution Theorem. Prove the Discrete Convolution Theorem by showing directly that $\mathcal{D}^{-1}\{N F_k G_k\}_n = f_n * g_n$. Is it a simpler proof than the one given in the text?

50. Convolution and high precision arithmetic. Note that the decimal representation of an $(n+1)$-digit integer $a$ has the form

where $a_n \neq 0$. Using a similar representation for the $(m+1)$-digit integer $b$, show how convolution can be used to compute the product of these two integers exactly. In particular, show how the sets of digits $a_k$ and $b_k$ must be extended to perform the necessary convolution.

51. The sinc function. Consider the sinc function, $\operatorname{sinc}(x) = \sin(x)/x$, and the square pulse (or boxcar) function


(a) Show that $B_a(bx) = B_{a/b}(x)$ for any positive real numbers $a$ and $b$.

(b) Make rough sketches of $\operatorname{sinc}(x)$, $\operatorname{sinc}(2x)$, and $\operatorname{sinc}(x/2)$. Note the location of the zeros and the "width" of each function.

(c) Verify that the inverse Fourier transform of $B_a(\omega)$ is

(d) From part (c), it follows that

Make some observations about how the shape of the sinc function and its transform vary as $a$ is increased and decreased.

(e) From part (d) deduce that $\mathcal{F}\{\operatorname{sinc}(x)\} = \pi B_{1/\pi}(\omega)$.

(f) Is the sinc function band-limited? What is the maximum frequency represented in the Fourier transform of $\operatorname{sinc}(ax)$? Based on the Shannon Sampling Theorem, what is the minimum sampling rate needed to resolve the function $\operatorname{sinc}(ax)$?

52. Padded sequences. Let $f_n$ be the sequence

and suppose that its 12-point DFT is $F_k$ for $k = -5 : 6$. Show that if $g_n$ is formed by padding $f_n$ with two zeros on either end, and $G_k$ is the 16-point DFT of $g_n$, then

53. DFT interpolation. Assume that $f$ is a function that is either zero outside of the interval $[-A/2, A/2]$ or is $A$-periodic. The sequence $f_n$ is formed by sampling $f$ at $N$ equally spaced points of $[-A/2, A/2]$. Let $F_k = \mathcal{D}_N\{f_n\}_k$ be the $N$-point DFT of $f_n$. Now extend the sequence $f_n$ to a length of $2N$ by padding with zeros on both ends and call this new sequence $g_n$ with a DFT $G_k = \mathcal{D}_{2N}\{g_n\}_k$. Show that the even terms of the sequence $G_k$ match the terms of $F_k$ and that the odd terms of $G_k$ can be interpreted as interpolated values of the sequence $F_k$. Extend this observation and conclude that it is possible to interpolate at $p - 1$ points between each term of $F_k$ by padding the original sequence with zeros to a length of $pN$.

54. Using the DFT to approximate integrals. Let $f$ be continuous on $[-A/2, A/2]$. Show how the definition of the DFT

can be used to approximate the integral

Take care to define the input sequence correctly at points where the periodic extension of $f$ might be discontinuous.

55. Summed and differenced sequences. Given a sequence $f_n$, let

and

Use the definition of the DFT along with linearity and the shift property to verify the properties (3.13), (3.16), (3.19), and (3.20):

56. Average value property. Show that the average value of an input sequence $f_n$ is given by

and hence, if $f_n$ is odd, $F_0 = 0$.

57. Rarefied and repeated sequences. An $N$-point sequence $f_n$ can be rarefied by preceding each of its elements by $p - 1$ zeros. For example, the three-fold rarefaction of $f_n$ is the sequence of length $3N$

Show that

for $k = -N/2+1 : N/2$. An $N$-point sequence $f_n$ can be repeated by preceding each of its elements by $p - 1$ copies of itself. For example, the two-fold repetition of $f_n$ is

Find a relationship between $\mathcal{D}_{pN}\{h_n\}$ and $\mathcal{D}\{f_n\}$.

58. Averaging at endpoints and discontinuities. The rectangular wave on the interval $[-1/2, 1/2]$ is defined by

Note that $f$ is real and odd, and therefore its Fourier coefficients are odd and imaginary. Assume that the function is sampled at the points $x_n = n/8$, where $n = -3 : 4$, to produce the input sequence

Compute the eight-point DFT of $f_n$. Is the DFT odd and imaginary? How do you explain the error? How should the input sequence be defined? Verify that when the input sequence is correctly defined, the DFT is odd and imaginary.

59. Cosine DFT. Verify that if $k_0$ is a fixed integer, then the $N$-point DFT of the sampled cosine wave is given by

Why must the modular delta $\delta_N(k \pm k_0)$ rather than the regular delta $\delta(k \pm k_0)$ be used?

60. Shift property. Let $f_n$ be an arbitrary sequence and let $n_0$ be a fixed integer.

(a) Show directly using the definition of discrete cyclic convolution that

(b) Show the same result in the following, less direct way. Let $F_k$ denote the DFT of $f_n$ and note that

Now use the convolution theorem in the form

together with the modulation property, to conclude that $f_n * \delta_N(n - n_0) = f_{n - n_0}$.

61. Modulation by real modes. Given a real sequence $f_n$ find the DFT of the modulated sequences $f_n\cos(2\pi n k_0/N)$ and $f_n\sin(2\pi n k_0/N)$, where $k_0$ is an integer.

62. Aliasing. Consider the following functions on the indicated intervals. In each case, determine the maximum grid spacing $\Delta x$ and the minimum number of grid points $N$ needed to ensure that the function is sufficiently resolved to avoid aliasing.

63. Aliasing (strobing). A spinning wheel with a light attached to the rim is photographed with a camera with a shutter speed of 16 frames per second.

(a) What is the greatest wheel speed (in revolutions per second (rps)) that can be accurately resolved by this camera without aliasing?

(b) If the wheel revolves at 20 rps, what is the apparent speed recorded by the camera?

(c) If the wheel appears to revolve at 1 rps, what are the possible true wheel speeds (in addition to 1 rps)?

(d) If the wheel appears to revolve backwards at 1 rps, what are the possible true wheel speeds?


Analytical DFTs

64. Geometric sums. Show that for a given real or complex number $a \neq 1$ and integers $M$ and $N$,

65. Square pulse Fourier coefficients and DFT. Compare the computed DFT of the square pulse with its Fourier coefficients by showing that

for $k = -N/2+1 : N/2$.

66. Convergence test for linear sequence DFT. Consider the linear sequence $f_n = n/N$. Compare the computed DFT with its Fourier coefficients by showing that

for $k = -N/2+1 : N/2$ (with $F_0 = 0$).

67. An "antidifference" calculation. The summation by parts formula requires that a difference equation of the form $\Delta u_n = f_n$ be solved for the sequence $u_n$ when $f_n$ is given. Solve the difference equation

where $k = -N/2+1 : N/2$ is fixed and the solution sequence $u_n$ is $N$-periodic. (Hint: Assume a solution of the form $u_n = a\,\omega_N^{nk}$ and determine $a$.)

68. New DFTs. Use the Table of DFTs in the Appendix to determine the DFT of the following sequences and functions on the indicated index sets or intervals. Try to minimize your efforts by using DFT properties! In each case sketch the input and be sure that average values are used when necessary.

69. Evaluating sums. Use the geometric series and/or summation by parts to evaluate the following sums:

Chapter 4

Symmetric DFTs

4.1 Introduction

4.2 Real Sequences and the Real DFT

4.3 Even Sequences and the Discrete Cosine Transform

4.4 Odd Sequences and the Discrete Sine Transform

4.5 Computing Symmetric DFTs

4.6 Notes

4.7 Problems

What immortal hand or eye
Could frame thy fearful symmetry?
- William Blake

4.1. Introduction

The preceding chapters have been devoted to the DFT in its fullest generality, as it applies to arbitrary complex-valued input sequences. It is now time to investigate some special forms that the DFT takes when the input possesses special properties. These special properties are often called symmetries, and the resulting DFTs are called symmetric DFTs. This exploration will prove to be very fruitful for two reasons. First, in practice, the input to the DFT very often does possess symmetries. Second, we will soon see that when symmetries are exploited, the resulting DFTs offer savings in both computational effort and storage over the general complex DFT. These economies can be significant in large computations, particularly in more than one dimension. Symmetric DFTs are also the discrete analogs of the real, sine, and cosine (and other) forms of the Fourier series. Not surprisingly, they are indispensable in applications in which these special forms of the Fourier series arise, most notably in solving boundary value problems (as we will see in Chapter 7). Therefore, the subject of symmetric DFTs is not an idle exercise; it has a tremendous impact on issues of practical importance.

In learning and writing about symmetric DFTs, we have found the subject filled with subtleties and pitfalls. There are several intertwined themes that underlie symmetric DFTs, and we will try to untangle and highlight them carefully. Here is the general four-step approach that we will follow.

Step 1: Symmetry in the input. The discussion always begins by assuming that the input sequence $f_n$ has a particular symmetry. Three of the many possible symmetries that we will consider are:

1. real symmetry: $f_n$ is real,

2. even symmetry: $f_n = f_{-n}$,

3. odd symmetry: $f_n = -f_{-n}$.

Examples of these three symmetries are shown in the simple sequences of Figure 4.1. We will see that in the presence of each of these symmetries, the DFT takes a special, simplified form. In each case, the cost (in terms of storage and computation) of evaluating the DFT of a symmetric sequence is less than in the case of a general complex sequence. We will note these savings for each symmetry.

Step 2: New symmetric transform. The next step in the development is subtle. For each symmetry, it is possible to define a new DFT that can be applied to an arbitrary real input sequence. The new DFT that arises in each case is called a symmetric DFT. The symmetric DFTs are generally analogs of special forms of the Fourier series. For the three cases we will consider, the symmetric DFTs and their Fourier series analogs are:

1. real symmetry → Real DFT (RDFT) ↔ Real Fourier Series,

2. even symmetry → Discrete Cosine Transform (DCT) ↔ Fourier Cosine Series,

3. odd symmetry → Discrete Sine Transform (DST) ↔ Fourier Sine Series.

If confusion arises at all, it is because a symmetric DFT can be applied to any real sequence in which the original symmetry is usually entirely absent. In other words, given a real input sequence, one could apply the RDFT, the DCT, or the DST, depending upon the problem at hand. In general, the results of the three transforms are different. In the opposite direction, given a particular problem, an appropriate solution might take the form of an inverse RDFT, an inverse DCT, or an inverse DST. This is analogous to the use of a Fourier series: given a fairly arbitrary real-valued function, one may represent it in terms of a real, cosine, or sine Fourier series.

FIG. 4.1. Periodic sequences with the three most commonly occurring symmetries are shown. The top sequence is arbitrary apart from the fact that its elements are real; the middle sequence also has the even symmetry $x_{-n} = x_n$; and the bottom sequence has the odd symmetry $x_{-n} = -x_n$.

Step 3: Pre- and postprocessing forms. With the new symmetric DFTs and their inverses defined, we will next address the question of how they can be computed efficiently. One can always use these definitions, which we will call explicit forms of the DFT; they are direct, but often inefficient. An improvement over the explicit forms can be found in pre- and postprocessing techniques. These algorithms take various forms, but the fundamental idea is always the same. The input sequence is modified in a simple preprocessing step to produce an auxiliary sequence; the auxiliary sequence is then used as input to the complex DFT (which is evaluated by a fast Fourier transform (FFT)); and the resulting sequence of coefficients is then modified in a simple postprocessing step to produce the desired DFT. In this way the complex DFT can be used to compute symmetric DFTs efficiently.

Step 4: Compact symmetric form. To complete the story and make it as current as possible, we will occasionally mention a more recent development that leads to very efficient symmetric DFT algorithms. This is the design of compact symmetric FFTs in which the pre- and postprocessing is avoided altogether. Instead the symmetry in both the input and output is built directly into the FFT. The resulting compact FFT offers savings in both computation and storage over the complex FFT. These symmetric FFTs appear to be the most efficient methods for computing symmetric DFTs.
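The three input symmetries of Step 1, and the symmetries they induce in the DFT, are easy to check numerically. The following is a minimal sketch, not part of the original development. It assumes the centered index set $n, k = -N/2+1 : N/2$ and a forward DFT scaled by $1/N$; the helper name `dft` and the test sequences are made up for illustration, and the direct sum stands in for an FFT.

```python
import cmath

def dft(f):
    """Centered DFT with 1/N scaling (assumed convention).

    f maps each index n = -N/2+1, ..., N/2 to f_n; the result maps k to F_k.
    """
    N = len(f)
    return {k: sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in f) / N
            for k in f}

N = 8
idx = range(-N // 2 + 1, N // 2 + 1)                  # -3, ..., 4

real_seq = {n: float((5 * n + 3) % 7) for n in idx}   # real, no other symmetry
even_seq = {n: float(n * n + 1) for n in idx}         # f_n = f_{-n}
odd_seq = {n: float(n) if abs(n) < N // 2 else 0.0    # f_n = -f_{-n}, f_{N/2} = 0
           for n in idx}

tol = 1e-12
R, E, O = dft(real_seq), dft(even_seq), dft(odd_seq)
inner = range(1, N // 2)                              # k = 1, ..., N/2 - 1

# real input  ->  conjugate even coefficients
conjugate_even = all(abs(R[-k] - R[k].conjugate()) < tol for k in inner)
# real even input  ->  real even coefficients
real_and_even = (all(abs(E[k].imag) < tol for k in idx)
                 and all(abs(E[-k] - E[k]) < tol for k in inner))
# real odd input  ->  imaginary odd coefficients
imag_and_odd = (all(abs(O[k].real) < tol for k in idx)
                and all(abs(O[-k] + O[k]) < tol for k in inner))

print(conjugate_even, real_and_even, imag_and_odd)
```

Note that the odd sequence is given a zero at $n = N/2$ as well as at $n = 0$; a periodic odd sequence cannot have any other value there.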


4.2. Real Sequences and the Real DFT

We will begin with the most common symmetry encountered in practice, that in which the input sequence consists of $N$ real numbers that we will label $f_n$, where $n = -N/2+1 : N/2$. In developing symmetric DFTs, it pays to have a bookkeeping outlook to keep track of savings in work and storage. For example, we can conclude immediately that the storage needed for a real input sequence is half of the storage needed for a complex sequence of the same length. While this is not a profound insight, it does suggest that we might expect the same savings in storing the DFT coefficients $F_k$. In fact, this is precisely the case, and we have already encountered the reason. Recall from Chapter 3 that if the input sequence $f_n$ is real, then the resulting DFT coefficients are complex, but possess the conjugate even symmetry $F_{-k} = F_k^*$. We can also deduce from this definition that when the $f_n$'s are real, $F_0$ and $F_{N/2}$ are also real.

Now let's do the bookkeeping. The sequence of DFT coefficients $F_k$ consists of exactly $N$ distinct real quantities: the real and imaginary parts of $F_1, \ldots, F_{N/2-1}$ plus $F_0$ and $F_{N/2}$, which are real; this adds up to $N$ real quantities. The remaining coefficients with $k = -N/2+1 : -1$ can be recovered using the conjugate even symmetry. Therefore, the $N$ real input data are matched exactly by $N$ distinct real values in the output. Using the fact that the input sequence is real we can rewrite the complex DFT as

$$F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n \cos\left(\frac{2\pi nk}{N}\right) - \frac{i}{N}\sum_{n=-N/2+1}^{N/2} f_n \sin\left(\frac{2\pi nk}{N}\right). \tag{4.1}$$

This expression is what we will call the explicit form of the real DFT. While it hardly looks like a simplification, there are a few important facts to be gleaned from it. First, it allows us to assess the computational cost of the real DFT and make a surprising comparison with the complex DFT. Recall that we need roughly $N^2$ complex multiplications and additions to compute the complex DFT from its definition. (In all that follows, the important lessons can be extracted if we make rough operation counts; it suffices to keep track of multiples of $N^2$.) In terms of real operations, this amounts to $4N^2$ real multiplications and $4N^2$ real additions. From the explicit form of the real DFT (4.1) a quick count shows that roughly $2N$ real multiplications and $2N$ additions are required to compute a single coefficient $F_k$, but because of the symmetry in the transform coefficients, only $F_0, \ldots, F_{N/2}$ need to be computed. Therefore, roughly $N^2$ real multiplications and $N^2$ additions are needed to compute the coefficients of the real DFT. There is a factor of four savings that arises for two reasons: the fact that $f_n$ is real means the complex multiplications of the DFT have only half of their usual cost, and the fact that $F_k$ is conjugate even means that only half of the coefficients need to be computed. It is worth mentioning that a bit more efficiency can be squeezed out of both the complex and real DFTs. As shown in problem 70, by "folding" the sums, the number of real multiplications in both DFTs can be reduced by another factor of two.
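The bookkeeping above can be made concrete with a short sketch of the explicit form of the real DFT. It is an illustration under stated assumptions rather than a definitive implementation: the forward transform is taken to carry the factor $1/N$ on the index set $n = -N/2+1 : N/2$, and the function names `complex_dft` and `explicit_rdft` are hypothetical. Only the coefficients for $k = 0 : N/2$ are formed, using real arithmetic throughout.

```python
import cmath
import math

def complex_dft(f, N):
    """Complex DFT from its definition, with 1/N scaling (assumed convention)."""
    idx = range(-N // 2 + 1, N // 2 + 1)
    return {k: sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in idx) / N
            for k in idx}

def explicit_rdft(f, N):
    """Real and imaginary parts of F_k for k = 0, ..., N/2 only.

    Each coefficient costs about 2N real multiplications and additions,
    so the whole transform costs about N^2 of each.
    """
    idx = range(-N // 2 + 1, N // 2 + 1)
    re, im = {}, {}
    for k in range(N // 2 + 1):
        re[k] = sum(f[n] * math.cos(2 * math.pi * n * k / N) for n in idx) / N
        im[k] = -sum(f[n] * math.sin(2 * math.pi * n * k / N) for n in idx) / N
    return re, im

N = 8
f = {n: float((3 * n * n + n) % 5) for n in range(-N // 2 + 1, N // 2 + 1)}

F = complex_dft(f, N)
re, im = explicit_rdft(f, N)

# The explicit real form reproduces the complex DFT on k = 0, ..., N/2.
max_err = max(abs(F[k] - complex(re[k], im[k])) for k in range(N // 2 + 1))
print(max_err)
```

The coefficients with negative index are not computed at all; they follow from $F_{-k} = F_k^*$.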

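The first stage of the discussion of the real symmetry is complete. We have shown how the DFT responds when it is presented with a real input sequence: the explicit form is more economical in both computation and storage. The next step is to show how a new DFT can be created from this explicit form. In the case of the real symmetry it is a short step. The explicit form (4.1) can be regarded as a transformation between an arbitrary real $N$-vector

$$(f_{-N/2+1}, \ldots, f_{N/2})$$

and another real $N$-vector

$$(\mathrm{Re}\{F_0\}, \ldots, \mathrm{Re}\{F_{N/2}\}, \mathrm{Im}\{F_1\}, \ldots, \mathrm{Im}\{F_{N/2-1}\}),$$

where we have denoted the real and imaginary parts of $F_k$ as $\mathrm{Re}\{F_k\}$ and $\mathrm{Im}\{F_k\}$, respectively. We call this transformation the real DFT.

Real DFT (RDFT)

$$F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk} \tag{4.2}$$

for $k = 0 : N/2$, or

$$\mathrm{Re}\{F_k\} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n \cos\left(\frac{2\pi nk}{N}\right)$$

for $k = 0 : N/2$, and

$$\mathrm{Im}\{F_k\} = -\frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n \sin\left(\frac{2\pi nk}{N}\right)$$

for $k = 1 : N/2 - 1$. This is the first of the symmetric DFTs. From a computational point of view, the RDFT is a new DFT. However, it produces the same coefficients as the complex DFT; it just does it more efficiently.

For each of the symmetries that we will consider there is always the question of an inverse transform. To determine the relevant inverse DFT it is necessary to use the known symmetries in the opposite direction. The real DFT provides a good first example. We must now imagine a sequence of DFT coefficients $F_k$ that has conjugate even symmetry; that is, $F_k = F_{-k}^*$. The goal is to recover the real sequence $f_n$ that is the inverse DFT of $F_k$. Beginning with the complex inverse DFT and then incorporating the conjugate even symmetry of $F_k$, we arrive at the inverse real DFT as follows.

$$f_n = \sum_{k=-N/2+1}^{N/2} F_k\,\omega_N^{nk} = F_0 + \sum_{k=1}^{N/2-1}\left(F_k\,\omega_N^{nk} + F_k^*\,\omega_N^{-nk}\right) + F_{N/2}\cos(\pi n)$$

Inverse Real DFT

$$f_n = F_0 + 2\sum_{k=1}^{N/2-1}\left[\mathrm{Re}\{F_k\}\cos\left(\frac{2\pi nk}{N}\right) - \mathrm{Im}\{F_k\}\sin\left(\frac{2\pi nk}{N}\right)\right] + F_{N/2}\cos(\pi n) \tag{4.3}$$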


where $n = -N/2+1 : N/2$. Notice how the conjugate even symmetry, $F_k = F_{-k}^*$, has been used to collect the complex exponentials and express them as sines and cosines. Notice also that in the end, only the independent values of $F_k$, for $k = 0 : N/2$, are used in computing the inverse. This explicit form of the inverse real DFT makes it clear that the sequence $f_n$ is real (recall that $F_0$ and $F_{N/2}$ are real). A quick count of operations also confirms that each term of the sequence $f_n$ requires roughly $N$ real multiplications and additions; therefore a total of roughly $N^2$ real multiplications and additions are needed to evaluate the inverse real DFT; this matches the operation count for the forward real DFT.
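A round trip through the forward and inverse real DFT makes a useful sanity check. The sketch below assumes the same conventions as before (centered indices, factor $1/N$ on the forward transform, none on the inverse) and evaluates the inverse as $f_n = F_0 + 2\sum_{k=1}^{N/2-1}[\mathrm{Re}\{F_k\}\cos(2\pi nk/N) - \mathrm{Im}\{F_k\}\sin(2\pi nk/N)] + F_{N/2}\cos(\pi n)$. The names `explicit_rdft` and `inverse_rdft` are illustrative only.

```python
import math

def explicit_rdft(f, N):
    """Re{F_k}, Im{F_k} for k = 0, ..., N/2, with 1/N scaling (assumed)."""
    idx = range(-N // 2 + 1, N // 2 + 1)
    re = {k: sum(f[n] * math.cos(2 * math.pi * n * k / N) for n in idx) / N
          for k in range(N // 2 + 1)}
    im = {k: -sum(f[n] * math.sin(2 * math.pi * n * k / N) for n in idx) / N
          for k in range(N // 2 + 1)}
    return re, im

def inverse_rdft(re, im, N):
    """Recover the real sequence f_n from the independent coefficients only."""
    f = {}
    for n in range(-N // 2 + 1, N // 2 + 1):
        # F_0 and F_{N/2} are real and each enters once.
        s = re[0] + re[N // 2] * math.cos(math.pi * n)
        # Every other coefficient is paired with its conjugate.
        for k in range(1, N // 2):
            theta = 2 * math.pi * n * k / N
            s += 2 * (re[k] * math.cos(theta) - im[k] * math.sin(theta))
        f[n] = s
    return f

N = 10
f = {n: float((7 * n + 2) % 6) - 2.5 for n in range(-N // 2 + 1, N // 2 + 1)}
re, im = explicit_rdft(f, N)
g = inverse_rdft(re, im, N)

round_trip_err = max(abs(f[n] - g[n]) for n in f)
print(round_trip_err)
```

Only real arithmetic appears in either direction, which is the point of the real form.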

In working with the inverse DFT many authors give the coefficients a distinct name and write

It is then an easy matter to make the association between the complex coefficients $F_k$ and the real coefficients $a_k$ and $b_k$; it is simply

for $k = 0 : N/2$, and

for $k = 1 : N/2 - 1$. The inverse RDFT defines a sequence $f_n$ that is real and periodic with period $N$. This property suggests the analogy between the inverse RDFT and the real form of the Fourier series. Therefore, this is an appropriate form for the solution of a problem (for example, a difference equation) in which the solution must be real and periodic.
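The displays that belong to the preceding paragraph are not reproduced here. As a hedged illustration, one common convention that is consistent with the index ranges quoted above ($k = 0 : N/2$ for $a_k$ and $k = 1 : N/2 - 1$ for $b_k$) is the following; it is an assumption about the intended form, not a quotation.

```latex
% Assumed trigonometric form of the inverse real DFT
f_n = \frac{a_0}{2}
    + \sum_{k=1}^{N/2-1}\left[a_k\cos\left(\frac{2\pi nk}{N}\right)
                             + b_k\sin\left(\frac{2\pi nk}{N}\right)\right]
    + \frac{a_{N/2}}{2}\cos(\pi n),

% with the association
a_k = 2\,\mathrm{Re}\{F_k\} \quad (k = 0 : N/2), \qquad
b_k = -2\,\mathrm{Im}\{F_k\} \quad (k = 1 : N/2 - 1).
```

With this choice each term matches the inverse real DFT directly: $a_0/2 = F_0$, $a_{N/2}/2 = F_{N/2}$, and $a_k\cos\theta + b_k\sin\theta = 2[\mathrm{Re}\{F_k\}\cos\theta - \mathrm{Im}\{F_k\}\sin\theta]$ for the interior indices.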

4.3. Even Sequences and the Discrete Cosine Transform

Another symmetry that appears frequently, for example in the solution of boundary value problems [79], [133] and in image processing techniques [37], is real even symmetry. This term describes a real sequence $f_n$ that satisfies the condition $f_n = f_{-n}$. While a sequence could be complex and even, we shall assume from here onwards that an even sequence is also real. Notice that if a sequence is known to have even symmetry, then it can be stored in half of the space required for a general real sequence and one-fourth of the space required for a general complex sequence. We should anticipate a similar savings in storing the DFT coefficients, and indeed this is the case. Recall from Chapter 3 that the DFT of a real even sequence is also real and even; that is, $F_k = F_{-k}$. This means that only half of the real DFT coefficients need to be computed.

Let's determine the form of the DFT when it is applied to an even sequence. Anticipating coming developments, it is best if we begin with a real even sequence of length $2N$ and then use the definition of the complex DFT. Appealing to the symmetry of the input sequence $f_n$, we have that the $2N$-point DFT is

$$F_k = \frac{1}{2N}\sum_{n=-N+1}^{N} f_n\,\omega_{2N}^{-nk} = \frac{1}{N}\left[\frac{f_0}{2} + \sum_{n=1}^{N-1} f_n\cos\left(\frac{\pi nk}{N}\right) + \frac{f_N}{2}\cos(\pi k)\right] \tag{4.4}$$

for $k = 0 : N$. Notice how the fact that $f_n$ is real and even allows the complex exponentials to be gathered into a single cosine term.

With this form of the DFT we see that only half of the input terms, $f_0, f_1, \ldots, f_N$, are needed to compute each $F_k$. We can also confirm that since the cosine is an even function of its arguments, the DFT coefficients have the property that $F_k = F_{-k}$. A quick tally shows that the cost of computing the DFT of an even sequence of length $2N$ is roughly $N^2$ real additions and $N^2$ real multiplications. Therefore, the cost of computing the DFT of an even sequence of length $N$ is $N^2/4$ real additions and multiplications. Thus the even symmetry provides additional savings over the DFT of real sequences.

Once again, we have completed the first stage of the discussion, namely to show the form of the DFT in the presence of even symmetry. The next step is to define a genuinely new DFT. Here is the thinking that leads us there. We have seen that if the input sequence to the DFT is real and even, then its DFT can be computed using only half of its elements. Furthermore, only half of the resulting DFT coefficients are independent. One might wonder why the "other half" of the input and output sequences are even needed. In fact they are not! Expression (4.4) can be regarded as a mapping between the two arbitrary real vectors

$$(f_0, f_1, \ldots, f_N) \quad\text{and}\quad (F_0, F_1, \ldots, F_N),$$

both of length $N + 1$. By omitting the redundant half of each sequence, we have created a new transform between two arbitrary real vectors. With no more delay we now give the explicit form of the discrete cosine transform.

Discrete Cosine Transform (DCT)

$$F_k = \frac{1}{N}{\sum_{n=0}^{N}}'' f_n \cos\left(\frac{\pi nk}{N}\right) \tag{4.5}$$

for $k = 0 : N$. The handy notation $\sum''$ has been introduced to denote a sum whose first and last terms are weighted by one-half. We emphasize the possibly confusing point that in using the $N$-point DCT, the input sequence $f_n$ is an arbitrary set of $N + 1$ real numbers that carries no particular symmetry. The DCT produces another set of $N + 1$ real transform coefficients $F_k$ that also possesses no particular symmetry. In other words, the DCT is a new and independent discrete transform!

It is worthwhile to pause for a moment and inspect the geometry of the DCT. Notice that the modes of an $N$-point DCT are different than the modes of the $N$-point real and complex DFTs. The $k$th DCT mode, $\cos(\pi nk/N)$, has $k$ half-periods on the interval $[0, N]$. Small values of $k$ correspond to low frequency modes with $k = 0$ being the constant mode; values of $k$ near $N$ correspond to high frequency modes with the $k = N$ mode oscillating between $\pm 1$ at every grid point. An important observation is that all of the modes have a period of $2N$ grid points, but only the even modes are periodic on the interval $[0, N]$. A few representative modes of the DCT are shown in Figure 4.2.

FIG. 4.2. A few representative modes of the discrete cosine transform, $\cos(\pi nk/N)$, are shown on a grid with $N = 24$ points. Shown are modes $k = 0, 1, 5, 11, 8$, and $4$ (clockwise from upper left). Notice that the odd modes (right column) have a period of $2N$, while the even modes (left column) have a period of $N$.
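The relation between the DCT and the DFT of an even sequence can be checked directly. The sketch below is an illustration under stated assumptions: the DCT is taken as $F_k = \frac{1}{N}\sum'' f_n\cos(\pi nk/N)$ with half weights on the first and last terms, the complex DFT of length $2N$ carries the factor $1/(2N)$, and the function names are hypothetical. It computes the DCT of an arbitrary sequence of $N + 1$ real numbers and compares it with the $2N$-point complex DFT of the even extension of that sequence.

```python
import cmath
import math

def dct(f):
    """DCT of f = [f_0, ..., f_N]; first and last terms get weight one-half."""
    N = len(f) - 1
    out = []
    for k in range(N + 1):
        s = 0.5 * f[0] + 0.5 * f[N] * math.cos(math.pi * k)
        s += sum(f[n] * math.cos(math.pi * n * k / N) for n in range(1, N))
        out.append(s / N)
    return out

def dft_of_even_extension(f):
    """2N-point complex DFT (1/(2N) scaling) of the even extension, k = 0..N."""
    N = len(f) - 1
    ext = {n: f[abs(n)] for n in range(-N + 1, N + 1)}   # f_{-n} = f_n
    return [sum(ext[n] * cmath.exp(-1j * cmath.pi * n * k / N) for n in ext) / (2 * N)
            for k in range(N + 1)]

f = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0, 4.0]     # N = 6, no symmetry assumed
F_dct = dct(f)
F_dft = dft_of_even_extension(f)

max_diff = max(abs(a - b) for a, b in zip(F_dct, F_dft))
max_imag = max(abs(b.imag) for b in F_dft)
print(max_diff, max_imag)
```

The complex DFT touches $2N$ input values and produces complex output, while the DCT uses $N + 1$ real inputs and real arithmetic; that is the redundancy the new transform removes.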


There are a couple of ways to arrive at the inverse of the DCT. Perhaps the simplest is to begin with the inverse real DFT (4.3) of length $2N$ and assume that the coefficients $F_k$ are real and even. This means $\mathrm{Im}\{F_k\} = 0$, and the inverse DCT follows immediately.

Inverse Discrete Cosine Transform (IDCT)

$$f_n = 2{\sum_{k=0}^{N}}'' F_k \cos\left(\frac{\pi nk}{N}\right) \tag{4.6}$$

for $n = 0 : N$. Since the $F_k$'s are real we have replaced the coefficients $\mathrm{Re}\{F_k\}$ by $F_k$ to simplify the notation. A notable conclusion about the DCT is that it is its own inverse up to a multiplicative factor. This fact can also be demonstrated (problems 78 and 79) by using the discrete orthogonality of the cosine terms $\cos(\pi nk/N)$.

In many problems, a solution $f_n$ is defined in terms of the inverse DCT (4.6), with the goal of finding the coefficients $F_k$. This is entirely analogous to assuming a solution to a continuous problem in the form of a Fourier cosine series. It is important to note the properties of a sequence defined by the IDCT. It is straightforward to check that when it is extended beyond the set $n = 0 : N$, the sequence $f_n$ given by (4.6) is periodic with a period of $2N$. Furthermore, when extended, the sequence $f_n$ has the property that

$$f_1 = f_{-1} \quad\text{and}\quad f_{N+1} = f_{N-1}.$$

Not surprisingly, this extended sequence is called the even extension of $f_n$. As we will see in Chapter 7, this property of the IDCT is essential in solving boundary value problems which require a solution with a zero "derivative" (or zero flux) at the boundaries.
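The properties of a sequence defined by the IDCT can be observed numerically. The following sketch assumes the pair $F_k = \frac{1}{N}\sum'' f_n\cos(\pi nk/N)$ and $f_n = 2\sum'' F_k\cos(\pi nk/N)$; the function `idct_at` is a hypothetical helper that evaluates the inverse at any integer $n$, inside or outside $0 : N$, so that the extension can be inspected.

```python
import math

def dct(f):
    """Forward DCT of f = [f_0, ..., f_N] with 1/N scaling (assumed)."""
    N = len(f) - 1
    return [(0.5 * f[0] + 0.5 * f[N] * math.cos(math.pi * k)
             + sum(f[n] * math.cos(math.pi * n * k / N) for n in range(1, N))) / N
            for k in range(N + 1)]

def idct_at(F, n):
    """Inverse DCT evaluated at an arbitrary integer n."""
    N = len(F) - 1
    s = 0.5 * F[0] + 0.5 * F[N] * math.cos(math.pi * n)
    s += sum(F[k] * math.cos(math.pi * n * k / N) for k in range(1, N))
    return 2 * s

f = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0, 4.0]     # N = 6
N = len(f) - 1
F = dct(f)

# The DCT is its own inverse up to the factor 2N relating the two scalings.
round_trip = max(abs(idct_at(F, n) - f[n]) for n in range(N + 1))

# Even extension: reflection about n = 0 and about n = N, period 2N.
left_reflection = abs(idct_at(F, -1) - idct_at(F, 1))
right_reflection = abs(idct_at(F, N + 1) - idct_at(F, N - 1))
periodicity = max(abs(idct_at(F, n + 2 * N) - idct_at(F, n)) for n in range(N + 1))

print(round_trip, left_reflection, right_reflection, periodicity)
```

The two reflection conditions are the discrete counterpart of a zero-derivative boundary condition at each end of the interval.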

4.4. Odd Sequences and the Discrete Sine Transform

The foregoing discussion of the discrete cosine transform is a pattern for all other symmetric DFTs. We will outline the highlights of one other important symmetry, leaving the reader to engage in the particulars in the problem section. Another common symmetry that arises in practice is that in which the real input sequence $f_n$ has the odd symmetry $f_n = -f_{-n}$. (We will assume that odd sequences are real unless otherwise stated.) The task is to determine the form the DFT takes in the presence of this symmetry, and then to define the new DFT that results from it. Before diving into the computations, a few crucial observations will ease our labors.

First, note that only half of a sequence with odd symmetry needs to be stored. This suggests that there should be similar savings in storing the transform coefficients $F_k$, and indeed this is true. We learned in Chapter 3 that the DFT of a real odd sequence is odd and pure imaginary; thus only half of the transform coefficients need to be stored. Here is another essential observation. We will begin with a periodic input sequence of length $2N$ that has the odd symmetry $f_n = -f_{-n}$. Since $f_0 = -f_0$, it follows that $f_0 = 0$. Furthermore, $f_N = f_{-N}$ by periodicity and $f_N = -f_{-N}$ by the odd symmetry; therefore $f_N = f_{-N} = 0$. In other words, an odd periodic sequence will always have predictable zero values (see Figure 4.1). We may now proceed.

As in the case of even sequences, we will begin with a real odd sequence of length $2N$ and apply the complex DFT. We have the following simplifications due to the odd symmetry:

$$F_k = \frac{1}{2N}\sum_{n=-N+1}^{N} f_n\,\omega_{2N}^{-nk} = -\frac{i}{N}\sum_{n=1}^{N-1} f_n \sin\left(\frac{\pi nk}{N}\right). \tag{4.7}$$

This expression holds for $k = -N+1 : N$, although we see immediately that $F_0 = F_{\pm N} = 0$. Furthermore, we may verify that $F_k = -F_{-k}$, which means that only the coefficients $F_1, \ldots, F_{N-1}$ are independent. Since only the input values $f_1, \ldots, f_{N-1}$ are required to compute the DFT of an odd sequence, the entire computation can be done with $N - 1$ storage locations for the input and output. This form of the DFT also confirms that the transform coefficients are all imaginary. A quick count reveals that roughly $N^2/4$ real additions and multiplications are required to compute the independent DFT coefficients of an odd sequence of length $N$. As with the even symmetry, this represents a factor of four savings over the real DFT.

We now argue as we did in the case of even symmetry. Expression (4.7) can be regarded as a mapping between the two arbitrary real vectors

$$(f_1, f_2, \ldots, f_{N-1}) \quad\text{and}\quad (F_1, F_2, \ldots, F_{N-1}),$$

both of length $N - 1$. The odd symmetry that led us to this point is nowhere to be found in either of these vectors. But we can still define this mapping and give it a name.

Discrete Sine Transform (DST)

$$F_k = \frac{1}{N}\sum_{n=1}^{N-1} f_n \sin\left(\frac{\pi nk}{N}\right) \tag{4.8}$$

for $k = 1 : N - 1$. This is the explicit form of a new DFT defined for an arbitrary real sequence $f_n$. The multiplicative factor $-i$ that appears in (4.7) is not needed; without it the DST involves only real arithmetic.

If the symmetries of the DST are used in the inverse real DFT (4.3), it is not difficult to show (problem 77) that the DST is its own inverse up to a multiplicative constant. This also follows from the orthogonality of the sine modes $\sin(\pi nk/N)$ (problems 78 and 79). In either case we have the inverse of the DST.

Inverse Discrete Sine Transform (IDST)

$$f_n = 2\sum_{k=1}^{N-1} F_k \sin\left(\frac{\pi nk}{N}\right) \tag{4.9}$$

for $n = 1 : N - 1$.

Notice that a sequence $f_n$ defined by the IDST has some special properties. First, $f_n$ is a sequence with period $2N$. Furthermore, if $f_n$ is extended beyond the indices $n = 0 : N$, the extended sequence is odd ($f_n = -f_{-n}$), and $f_0 = f_N = f_{pN} = 0$ for any integer $p$. This extended sequence is called the odd extension of $f_n$. There are instances in which the solution to a problem is required to satisfy "zero boundary conditions" ($f_0 = f_N = 0$). A trial solution given by the IDST (4.9) has this property, just as a Fourier sine series does for continuous problems.
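The sine transform pair and its connection with the complex DFT of an odd sequence can be illustrated with a short sketch. The assumptions are stated loudly: the pair is taken as $F_k = \frac{1}{N}\sum_{n=1}^{N-1} f_n\sin(\pi nk/N)$ and $f_n = 2\sum_{k=1}^{N-1} F_k\sin(\pi nk/N)$, the $2N$-point complex DFT carries the factor $1/(2N)$, and the names `dst` and `idst_at` are made up for the example.

```python
import cmath
import math

def dst(f, N):
    """DST of f = [f_1, ..., f_{N-1}]; returns [F_1, ..., F_{N-1}]."""
    return [sum(f[n - 1] * math.sin(math.pi * n * k / N) for n in range(1, N)) / N
            for k in range(1, N)]

def idst_at(F, N, n):
    """Inverse DST evaluated at an arbitrary integer n."""
    return 2 * sum(F[k - 1] * math.sin(math.pi * n * k / N) for k in range(1, N))

N = 6
f = [1.0, -2.0, 0.5, 3.0, -1.5]               # f_1, ..., f_5, no symmetry assumed
F = dst(f, N)

round_trip = max(abs(idst_at(F, N, n) - f[n - 1]) for n in range(1, N))

# Odd extension on n = -N+1, ..., N (zeros at n = 0 and n = N) and its DFT.
ext = {n: 0.0 for n in range(-N + 1, N + 1)}
for n in range(1, N):
    ext[n], ext[-n] = f[n - 1], -f[n - 1]
dft_ext = [sum(ext[n] * cmath.exp(-1j * cmath.pi * n * k / N) for n in ext) / (2 * N)
           for k in range(1, N)]

# The complex DFT of the odd extension equals -i times the DST.
dft_mismatch = max(abs(dft_ext[k - 1] + 1j * F[k - 1]) for k in range(1, N))

# Zero boundary values and odd symmetry of the extended IDST sequence.
zero_ends = max(abs(idst_at(F, N, 0)), abs(idst_at(F, N, N)), abs(idst_at(F, N, 3 * N)))
oddness = max(abs(idst_at(F, N, -n) + idst_at(F, N, n)) for n in range(1, N))

print(round_trip, dft_mismatch, zero_ends, oddness)
```

The vanishing values at $n = 0$ and $n = N$ are what make the IDST a natural trial solution for problems with zero boundary conditions.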

Some conclusions of this section are summarized in Table 4.1, which shows the computational and storage costs of evaluating the DFT when the input sequence has various symmetries. These are not the costs of computing symmetric DFTs (which will be discussed in the next section), but rather the costs of computing the DFT of symmetric sequences. For this reason the expected savings in storage and computation can be seen as we move from the complex to the real to the even/odd DFTs.

TABLE 4.1
Cost of computing the DFT of symmetric N-point input sequences using the explicit form.

Input sequence   Storage (real locations)   Real additions*   Real multiplications*
Complex          2N                         4N^2              4N^2**
Real             N                          N^2               N^2**
Real even        N/2                        N^2/4             N^2/4
Real odd         N/2                        N^2/4             N^2/4

* Computation costs shown up to multiples of N^2.
** An additional factor-of-two savings can be obtained by folding sums (see problem 70).

4.5. Computing Symmetric DFTs

We must now embark on an odyssey that will lead to more efficient methods for computing symmetric DFTs. Before doing so, a few historical thoughts will give the discussion some perspective. In the beginning the only way to compute the complex, real, cosine, and sine DFTs was by using their definitions (2.6), (4.1), (4.5), and (4.8), respectively (the explicit forms). One must now imagine the arrival of the FFT in 1965, but also realize that for (just) a few years the FFT was used to evaluate only the complex DFT. Given the remarkable efficiency of the FFT, it made sense to devise methods whereby the complex FFT could be used to evaluate the RDFT, DCT, and DST. These methods, which we will call pre- and postprocessing methods, were introduced in 1970 by Cooley, Lewis and Welch in a paper that is filled with ingenious tricks [41]. Their methods, when combined with the complex FFT, provided much faster ways to evaluate the RDFT, DCT, and DST. Other early pre- and postprocessing methods (with better stability properties) were introduced by Dollimore in 1973 [49]. Research on pre- and postprocessing algorithms for other symmetries and for advanced computer architectures has continued to the present day [33].

Soon after the pre- and postprocessing methods were established, it became evident that efficient symmetric DFTs could also be devised by building the symmetry of the input and output sequences directly into the FFT itself. The first of these methods, called compact symmetric FFTs, was designed around 1968. Attributed to Edson [9], it computes the RDFT and provides the template for all other compact symmetric FFTs. Another compact symmetric FFT for the DCT was given by Gentleman [65] in 1972. Further refinements on the compact symmetric FFT idea have appeared during the last 20 years [141], [19], [75], [16] for many different computer architectures. Compact symmetric FFTs have now replaced the pre- and postprocessing methods in many software packages and are generally deemed superior. Nevertheless, the pre- and postprocessing algorithms have historical significance and are still the best way to compute symmetric DFTs if only a complex FFT is available. For these reasons we will have a look at a few of the better known, easily implemented pre- and postprocessing methods for symmetric DFTs.

Let's begin with the problem of computing the DFT of real input sequences. Here is a motivating thought: the fact that the real DFT can be done with half of the storage of the complex DFT might lead an enterprising soul to surmise that it should be possible to do either

• two real $N$-point DFTs with a single complex $N$-point DFT or

• one real $2N$-point DFT with a single complex $N$-point DFT.

The first proposition is the easiest to demonstrate, and we will also be able to answer the second proposition in the affirmative.

Assume that we are given two real sequences $g_n$ and $h_n$, both of length $N$. The goal is to compute their DFTs $G_k$ and $H_k$ using a single $N$-point complex DFT. In order to make full use of the complex DFT, we form the complex sequence $f_n = g_n + ih_n$ to be used for input. The complex DFT applied to the sequence $f_n$ now takes the form

$$F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} (g_n + ih_n)\,\omega_N^{-nk}.$$

We need to compare this expression to the two desired RDFTs

$$G_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} g_n\,\omega_N^{-nk}$$

and

$$H_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} h_n\,\omega_N^{-nk}.$$

A brief calculation (problem 80) confirms that the computed DFT $F_k$ can be easily related to the desired DFTs $G_k$ and $H_k$. The relationships will be used again in this chapter, and it is worthwhile to summarize them as the first of several procedures.

Procedure 1: Two Real DFTs by One Complex DFT

$$G_k = \frac{1}{2}\left(F_k + F_{-k}^*\right), \qquad H_k = -\frac{i}{2}\left(F_k - F_{-k}^*\right) \tag{4.10}$$

for $k = -N/2+1 : N/2$.

This is the first example of what we call a pre- and postprocessing algorithm. In this case it is a scheme for computing two real DFTs for the price of a single complex DFT. The underlying idea will appear again: the given input sequences are modified (in this case combined) to form the input to the complex DFT; this is the preprocessing stage. A complex DFT is computed and then the output is modified in a simple way (4.10) to form the sequences $G_k$ and $H_k$; this is the postprocessing stage.

In assessing the computational costs of pre- and postprocessing algorithms, it is customary to neglect the cost of the pre- and postprocessing. These costs are always proportional to $N$ while the DFT step consumes a multiple of $N^2$ operations (or $N\log N$ operations if the FFT is used). When $N$ is large (which is when one worries most about computation and storage costs) the pre- and postprocessing costs become negligible compared to the DFT costs.

A perceptive reader might notice that it is more efficient to compute the DFT of two real sequences by using the explicit form of the RDFT (4.2) on each sequence rather than using Procedure 1, which involves an $N$-point complex DFT. This is true if one uses the explicit forms for all DFTs. However, if the FFT is used to compute the complex DFT, then Procedure 1 is more efficient than two explicit RDFTs.

The second idea mentioned above, computing a real DFT of length $2N$ with an $N$-point complex DFT, is also worth investigating. In order to do so we must introduce the splitting method, which will make several more appearances in this and later chapters. We begin by splitting the input sequence $f_n$ of length $2N$ into its even and odd subsequences,

$$g_n = f_{2n} \quad\text{and}\quad h_n = f_{2n-1}.$$

There is an unfortunate coincidence of terminology: these even and odd subsequences must not be confused with the even and odd symmetries. We may now write the complex $2N$-point DFT of $f_n$ as

$$\begin{aligned}
F_k &= \frac{1}{2N}\sum_{n=-N+1}^{N} f_n\,\omega_{2N}^{-nk} \\
    &= \frac{1}{2N}\sum_{n=-N/2+1}^{N/2} f_{2n}\,\omega_{2N}^{-2nk} + \frac{1}{2N}\sum_{n=-N/2+1}^{N/2} f_{2n-1}\,\omega_{2N}^{-(2n-1)k} \\
    &= \frac{1}{2}\left[\frac{1}{N}\sum_{n=-N/2+1}^{N/2} g_n\,\omega_N^{-nk} + \omega_{2N}^{k}\,\frac{1}{N}\sum_{n=-N/2+1}^{N/2} h_n\,\omega_N^{-nk}\right] \\
    &= \frac{1}{2}\left[G_k + \omega_{2N}^{k} H_k\right].
\end{aligned} \tag{4.11}$$

The fact that $\omega_{2N}^2 = \omega_N$ is the crux of the splitting method and has been used to obtain the third line of this argument.
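Both ideas met so far, the two-for-one trick of Procedure 1 and the even/odd splitting, can be checked numerically. The sketch below is hedged in the usual way: it assumes the centered index set, a forward DFT with factor $1/N$, the postprocessing $G_k = \frac{1}{2}(F_k + F_{-k}^*)$, $H_k = -\frac{i}{2}(F_k - F_{-k}^*)$, and the splitting identity $F_k = \frac{1}{2}[G_k + \omega_{2N}^k H_k]$ with $g_n = f_{2n}$, $h_n = f_{2n-1}$ and $\omega_{2N} = e^{i\pi/N}$. The function names are hypothetical, and the direct sum `dft` stands in for an FFT.

```python
import cmath

def dft(f):
    """Centered DFT with 1/N scaling; f maps n -> f_n, result maps k -> F_k."""
    N = len(f)
    return {k: sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in f) / N
            for k in f}

def two_real_dfts(g, h):
    """Procedure 1 sketch: DFTs of two real sequences from one complex DFT."""
    N = len(g)
    F = dft({n: complex(g[n], h[n]) for n in g})      # preprocessing + DFT

    def wrap(k):                                      # -N/2 is the same as N/2
        return k + N if k <= -N // 2 else k

    G = {k: 0.5 * (F[k] + F[wrap(-k)].conjugate()) for k in F}
    H = {k: -0.5j * (F[k] - F[wrap(-k)].conjugate()) for k in F}
    return G, H

N = 4
half = range(-N // 2 + 1, N // 2 + 1)                 # -1, 0, 1, 2
full = range(-N + 1, N + 1)                           # -3, ..., 4
f = {n: float((n * n + 2 * n) % 5) for n in full}     # real, length 2N
g = {n: f[2 * n] for n in half}                       # even-indexed subsequence
h = {n: f[2 * n - 1] for n in half}                   # odd-indexed subsequence

G, H = two_real_dfts(g, h)
G_ref, H_ref = dft(g), dft(h)
proc1_err = max(max(abs(G[k] - G_ref[k]), abs(H[k] - H_ref[k])) for k in half)

# Splitting: the 2N-point DFT of f from the N-point DFTs of g and h.
F = dft(f)
w = cmath.exp(1j * cmath.pi / N)                      # omega_{2N}
split_err = 0.0
for k in half:
    other = k + N if k <= 0 else k - N                # k +/- N, kept inside -N+1..N
    split_err = max(split_err,
                    abs(F[k] - 0.5 * (G[k] + w ** k * H[k])),
                    abs(F[other] - 0.5 * (G[k] - w ** k * H[k])))

print(proc1_err, split_err)
```

The sign change for the indices $k \pm N$ comes from $\omega_{2N}^{k \pm N} = -\omega_{2N}^{k}$, while $G_k$ and $H_k$ are unchanged by the shift because they are $N$-periodic.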

Recall that the goal is to determine $F_k$ from $G_k$ and $H_k$. The relationship (4.11) holds for $k = -N/2+1 : N/2$, which determines half of the $F_k$'s. Since the sequences $G_k$ and $H_k$ are $N$-periodic, we may replace $k$ by $k \pm N$ in the expression (4.11) to find that

$$F_{k \pm N} = \frac{1}{2}\left[G_k - \omega_{2N}^{k} H_k\right],$$

which can be used to compute the remaining $F_k$'s. We now summarize these relationships (often called combine or butterfly relations), which allow the DFT of a full-length sequence to be computed easily from the DFTs of the even and odd subsequences.

Procedure 2: Combine (Butterfly) Relations $N \to 2N$

$$F_k = \frac{1}{2}\left[G_k + \omega_{2N}^{k} H_k\right], \qquad F_{k \pm N} = \frac{1}{2}\left[G_k - \omega_{2N}^{k} H_k\right] \tag{4.12}$$

for $k = -N/2+1 : N/2$.

The butterfly relations take an ever-so-slightly different form if the alternate definition of the DFT on the sets $n, k = 0 : N - 1$ is used (problem 81).

We can now turn to the conjecture made earlier that it should be possible to compute a real DFT of length $2N$ using a single complex DFT of length $N$. With the two procedures already developed, we will see that it can be done. Assume that a real sequence of length $2N$ is given; call it $f_n$ where $n = -N+1 : N$. The goal is to compute its DFT $F_k$, and since the DFT is a conjugate even sequence, only $F_0, \ldots, F_N$ need to be computed. The overall strategy is this: we first compute the DFT of the even and odd subsequences of $f_n$ simultaneously using Procedure 1, and then use Procedure 2 to combine these two DFTs into the full DFT $F_k$.

Here is how it is actually done. Letting $g_n = f_{2n}$ and $h_n = f_{2n-1}$, we form the $N$-point complex sequence $z_n = g_n + ih_n$, where $n = -N/2+1 : N/2$. Once its complex DFT $Z_k$ has been computed, Procedure 1 can be used to form the transforms $G_k$ and $H_k$. According to (4.10) they are given by

$$G_k = \frac{1}{2}\left(Z_k + Z_{-k}^*\right), \qquad H_k = -\frac{i}{2}\left(Z_k - Z_{-k}^*\right)$$

for $k = -N/2+1 : N/2$, and are both conjugate even sequences. We now have the DFTs of the even and odd subsequences of the original sequence. Using Procedure 2, the sequences $G_k$ and $H_k$ can be combined to form the desired DFT sequence $F_k$. By the combine formulas (4.12) we have that

$$F_k = \frac{1}{2}\left[G_k + \omega_{2N}^{k} H_k\right]$$

for $k = 0 : N/2$. The second relation of Procedure 2 can be used to compute the remaining coefficients for $k = N/2+1 : N$. Collecting both sets of relations, we have the following procedure.

It should be verified (problem 82) that the coefficients Fk given by these relationsactually have the conjugate even symmetry that we would expect of the DFT of a realinput sequence, and that FQ and FN are real.

A quick operation count is revealing. The cost of the pre- and postprocessingmethod is the cost of a complex TV-point DFT or roughly 4TV2 real additions and 4TV2

real multiplications (neglecting the cost of the pre- and postprocessing). The cost ofa single real DFT of length 2TV is (2TV)2 = 4TV2 real additions and multiplications,so it appears that the pre- and postprocessing method is no better than using theexplicit form of the RDFT. As mentioned earlier, the pre- and postprocessing methodbecomes preferable when the FFT is used to compute the complex DFT. Anothermore efficient strategy for computing RDFTs (without FFTs) is explored in problem86. Finally, we mention in passing that Procedure 3 can be essentially "inverted," orapplied in reverse, to give an efficient method for computing the inverse RDFT of aconjugate even set of coefficients Fk [41].

We now come to the question of computing the DCT of an arbitrary real sequence. Let's start with a bad idea and refine it. Our discussion about DFTs of even sequences leads naturally to the following fact (problem 83):

The N-point DCT of an arbitrary real sequence f_n consists of the N + 1 (real) coefficients F_0, ..., F_N of the 2N-point complex DFT of the even extension of f_n.

This says that one way to compute an N-point DCT of a sequence f_n defined for n = 0 : N is to extend it evenly to form a new sequence f_n that is even over the index range n = -N + 1 : N. This new sequence can then be used as input to a complex DFT of length 2N. The output F_k will be real and even, and the desired DCT coefficients will be F_0, ..., F_N. Needless to say, there is tremendous redundancy and inefficiency in computing a DCT in this manner, since the symmetries in both the input and output sequences have been overlooked.
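A minimal sketch of this "safe but inefficient" route follows. It assumes that the DCT of (4.5) has the form F_k = (1/N) Σ'' f_n cos(πnk/N), where Σ'' means the n = 0 and n = N terms are halved; that form is an assumption here, chosen because it reproduces the values quoted in Table 4.2. The function names are hypothetical.

```python
import numpy as np

def dct_explicit(f):
    """Assumed form of (4.5): (1/N) * sum'' f_n cos(pi*n*k/N), k = 0 : N."""
    f = np.asarray(f, dtype=float)
    N = f.size - 1
    n = np.arange(N + 1)
    wts = np.ones(N + 1)
    wts[0] = wts[N] = 0.5                      # end terms weighted by 1/2
    C = np.cos(np.pi * np.outer(n, n) / N)     # C[k, n] = cos(pi*n*k/N)
    return C @ (wts * f) / N

def dct_by_even_extension(f):
    """DCT read off from the 2N-point complex DFT of the even extension."""
    f = np.asarray(f, dtype=float)
    N = f.size - 1
    ext = np.concatenate([f, f[N - 1:0:-1]])   # length 2N, ext[2N - n] = f_n
    return (np.fft.fft(ext) / (2 * N))[:N + 1].real

rng = np.random.default_rng(1)
f = rng.standard_normal(13)                    # n = 0 : 12
a = dct_explicit(f)
b = dct_by_even_extension(f)
```

The two results agree to rounding error, but the second computes 2N complex coefficients to deliver N + 1 real ones, which is the redundancy the text complains about.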

We will now describe a pre- and postprocessing algorithm for the DCT. Such methods are often difficult to motivate, since they were rarely discovered in the succinct manner in which they are presented. The underlying idea is always the same: we must find an auxiliary sequence, formed in the preprocessing step, that can be fed to a complex DFT. The output of that DFT may then be modified in a postprocessing step to produce the desired DCT. Some of the procedures already developed (particularly Procedures 1 and 2) will generally enter the picture.

Procedure 3: Length 2N RDFT from Length N Complex DFT

Given the real sequence f_n of length 2N, form the sequence z_n = f_{2n} + i f_{2n-1} and compute its N-point complex DFT Z_k, where k = -N/2 + 1 : N/2.

Then F_0, ..., F_N are given by

Given an arbitrary real sequence f_n defined for n = 0 : N, the goal is to compute the sequence of DCT coefficients F_k for k = 0 : N. We begin by forming the even extension of f_n so that it is defined for n = -N + 1 : N. It is important to note that the complex DFT of this even extension, which we will denote F_k, contains the desired DCT of the original sequence. The trick is to define the complex auxiliary sequence

$$ z_n = f_{2n} + i\,(f_{2n+1} - f_{2n-1}) = g_n + i\,h_n $$

for n = -N/2 + 1 : N/2. Notice that we have defined two subsequences g_n and h_n of length N that form the real and imaginary parts of z_n. If there is any rationale for this choice of a sequence, it is that g_n is the even subsequence of f_n, and h_n can be easily related to the odd subsequence of f_n. Furthermore, z_n is a conjugate even sequence (z_n = z*_{-n}), which means that its DFT Z_k is real.

Now the actual transform takes place. The sequence z_n is used as input to an N-point complex DFT to produce the coefficients Z_k. Procedure 1 can then be used to recover the DFT of the two real subsequences g_n and h_n. Recalling that Z_k is real (Z_k = Z_k^*), these DFTs are given by

$$ G_k = \tfrac{1}{2}\left(Z_k + Z_{-k}\right), \qquad H_k = -\tfrac{i}{2}\left(Z_k - Z_{-k}\right) $$

for k = -N/2 + 1 : N/2. The DFTs of the two real subsequences g_n and h_n are now available. We are maneuvering toward an application of Procedure 2 to compute F_k in terms of G_k and H_k. But it cannot be used yet since, while g_n is the even subsequence of f_n, h_n is not its odd subsequence. Let h'_n be the odd subsequence f_{2n-1}, and let its DFT be H'_k. Then we can write

$$ H_k = \mathcal{D}\{f_{2n+1}\}_k - \mathcal{D}\{f_{2n-1}\}_k = \left(\omega_N^{k} - 1\right) H'_k . $$

Notice that the shift property has been used to relate D{f_{2n+1}}_k to D{f_{2n-1}}_k. It is now possible to solve for H'_k, the DFT of the odd subsequence, in terms of the DFT H_k that has been computed. We have that

$$ H'_k = \frac{H_k}{\omega_N^{k} - 1} = \frac{-i\,(Z_k - Z_{-k})}{2\,(\omega_N^{k} - 1)} \qquad (k \neq 0). $$

Now we are set to use Procedure 2 since the DFTs of the even and odd subsequences of f_n, which we have labeled G_k and H'_k, are known. The combine formulas of Procedure 2 give us that

$$ F_k = \tfrac{1}{2}\left(G_k + \omega_{2N}^{k} H'_k\right) = \tfrac{1}{4}\left[ Z_k + Z_{-k} - \frac{Z_k - Z_{-k}}{2\sin(\pi k/N)} \right] $$

for k = 1 : N/2. Since Z_k is a real sequence, this expression involves only real arithmetic and provides the first half of the desired DFT F_k. A similar calculation produces the relation

for k = -N/2 + 1 : -1. This expression can be used to find the coefficients with k = N/2 + 1 : N - 1. Finally, a special case must be made for k = 0 and k = N. We can directly compute

and then it follows that

and

We may now collect all of these results in a single recipe for computing the DCT by pre- and postprocessing.

Procedure 4: DCT by Pre- and Postprocessing

Given an arbitrary real sequence {f_0, ..., f_N}, extend it evenly (f_{-n} = f_n) and periodically (f_{n+2N} = f_n) to form the sequence

$$ z_n = f_{2n} + i\,(f_{2n+1} - f_{2n-1}), \qquad n = -N/2 + 1 : N/2 . $$

Compute the N-point complex DFT Z_k, where k = -N/2 + 1 : N/2.

Then F_0, ..., F_N are given by

It should be noted that the computations given in Procedure 4 can be prone to numerical instability due to the division by sin(πk/N), which approaches zero for values of k near 0 and N. For small N this may not present a problem. Stable pre- and postprocessing methods have been devised essentially by inverting the relations given in Procedure 4 [49].
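The recipe can be tried out numerically with the sketch below. It is a reconstruction under assumptions, not a transcription of the boxed procedure: the postprocessing formula F_k = (1/4)[Z_k + Z_{-k} - (Z_k - Z_{-k}) / (2 sin(πk/N))] is obtained by combining Procedure 1, the shift property, and the butterfly relation F_k = (G_k + ω_{2N}^k H'_k)/2, and is applied here for all k = 1 : N-1 at once; the endpoints k = 0 and k = N use directly computed means of the even and odd samples. The DCT definition (1/N) Σ'' f_n cos(πnk/N) and the function names are likewise assumptions.

```python
import numpy as np

def dct_explicit(f):
    """Assumed form of (4.5), used only as a reference."""
    f = np.asarray(f, dtype=float)
    N = f.size - 1
    n = np.arange(N + 1)
    wts = np.ones(N + 1)
    wts[0] = wts[N] = 0.5
    return np.cos(np.pi * np.outer(n, n) / N) @ (wts * f) / N

def dct_prepost(f):
    """DCT of f_0..f_N from one N-point complex DFT (sketch)."""
    f = np.asarray(f, dtype=float)
    N = f.size - 1
    ext = np.concatenate([f, f[N - 1:0:-1]])        # even extension, period 2N
    n2 = 2 * np.arange(N)
    # preprocessing: z_n = f_{2n} + i (f_{2n+1} - f_{2n-1})
    z = ext[n2] + 1j * (ext[(n2 + 1) % (2 * N)] - ext[(n2 - 1) % (2 * N)])
    Z = (np.fft.fft(z) / N).real                    # real, since z is conjugate even
    F = np.empty(N + 1)
    k = np.arange(1, N)
    Zm = Z[(-k) % N]                                # Z_{-k}
    # postprocessing; note the division by sin(pi*k/N)
    F[k] = 0.25 * (Z[k] + Zm - (Z[k] - Zm) / (2 * np.sin(np.pi * k / N)))
    G0 = Z[0]                                       # mean of the even samples
    H0 = ext[1::2].mean()                           # mean of the odd samples
    F[0] = 0.5 * (G0 + H0)
    F[N] = 0.5 * (G0 - H0)
    return F

rng = np.random.default_rng(2)
f = rng.standard_normal(13)                         # N = 12
F_fast = dct_prepost(f)
F_ref = dct_explicit(f)
```

For moderate N the two results agree closely; the instability mentioned above would show up as growing error in the coefficients nearest k = 0 and k = N as N increases.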

What about computing the DST? As with the DCT, there is a safe but inefficient way. It follows from the following assertion (problem 84):

The N-point DST of an arbitrary real sequence f_n consists of the N - 1 coefficients iF_1, ..., iF_{N-1} of the 2N-point complex DFT of the odd extension of f_n. (Since the F_k's are imaginary, the DST coefficients are real.)

This says that one can always apply a 2N-point complex DFT to the odd extension of a given real sequence and then find the DST coefficients in the imaginary part of the output. However, there are (much) better ways to accomplish the same end. One of the preferred methods is a pre- and postprocessing algorithm. In the interest of brevity and reader involvement, we will give only the menu for this meal; the full feast can be savored in problem 85. Here is the key idea: the pre- and postprocessing algorithm for the DCT was launched by the observation that if f_n is an even sequence then the auxiliary sequence z_n = f_{2n} + i(f_{2n+1} - f_{2n-1}) is conjugate even. In a similar way, if f_n is an odd sequence, then i z_n = f_{2n-1} - f_{2n+1} + i f_{2n} is conjugate even. Therefore, in a few words, the pre- and postprocessing algorithm for the DST can be derived by applying the arguments leading to Procedure 4 to the auxiliary sequence i z_n. The resulting Procedure 5 is given in problem 85.
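The safe but inefficient DST route can be checked the same way. The sketch assumes that the DST of (4.8) is F_k = (1/N) Σ_{n=1}^{N-1} f_n sin(πnk/N), which is what the quoted assertion implies when the 2N-point DFT carries a 1/(2N) factor; the function names are hypothetical.

```python
import numpy as np

def dst_explicit(f_int):
    """Assumed form of (4.8) for the interior values f_1..f_{N-1}."""
    f_int = np.asarray(f_int, dtype=float)
    N = f_int.size + 1
    n = np.arange(1, N)
    S = np.sin(np.pi * np.outer(n, n) / N)      # S[k-1, n-1] = sin(pi*n*k/N)
    return S @ f_int / N

def dst_by_odd_extension(f_int):
    """DST taken from i*F_k of the 2N-point DFT of the odd extension."""
    f_int = np.asarray(f_int, dtype=float)
    N = f_int.size + 1
    # odd extension on n = 0 : 2N-1, with f_0 = f_N = 0 and ext[2N - n] = -f_n
    ext = np.concatenate([[0.0], f_int, [0.0], -f_int[::-1]])
    F = np.fft.fft(ext) / (2 * N)
    return (1j * F[1:N]).real

rng = np.random.default_rng(3)
f_int = rng.standard_normal(11)                 # N = 12, n = 1 : 11
s1 = dst_explicit(f_int)
s2 = dst_by_odd_extension(f_int)
```

The real part of the 2N-point DFT of the odd extension is zero to rounding error, which is why multiplying by i yields real DST coefficients.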

Example: Numerical RDFT, DCT, and DST. For the sake of illustration, we will take N = 12 and analyze a real sequence f_n. The RDFT, DCT, and DST of this sequence will be computed and discussed. Before proceeding, we must make one small adjustment to facilitate a comparison between the transforms. The RDFT given in (4.2) is defined for an input sequence f_n with n = -N/2 + 1 : N/2, whereas the DCT and DST given in (4.5) and (4.8) are defined for n = 0 : N and n = 1 : N - 1, respectively. To make these definitions compatible, we may use the periodicity of f_n to give an alternate (equivalent) definition of the N-point RDFT pair (problem 72).

Alternate RDFT (n, k = 0 : N - 1)

for k = 0 : N/2.

Alternate Inverse RDFT (n, k = 0 : N - 1)

for n = 0 : N - 1.

With these adjustments we can now turn to Table 4.2 and make a few pertinent comments. First look at the input column labeled f_n. Notice that values are given for f_0, ..., f_12, but they must be used carefully. To compute the RDFT using (4.13) with N = 12, we use the values of f_n with n = 0 : 11, with the important condition that the value used for f_0 is the average of f_0 and f_12. On the other hand, the DCT requires values for n = 0 : 12, and so an additional value for f_12 is given for use with the DCT only. The DST requires values for n = 1 : 11 and assumes that f_0 = f_12 = 0. With these essential observations, we can now proceed with the experiment. Table 4.2 also shows the computed values of the transform coefficients.

TABLE 4.2
RDFT, DCT, and DST of a real sequence with N = 12.

n, k    f_n     F_k for RDFT             F_k for DCT    F_k for DST
0       .11     2.15(-1)                 2.15(-1)       -
1       .44     2.87(-1) + 1.71(-2)i     -8.23(-2)      2.08(-2)
2       .55     -5.50(-2) + 1.03(-1)i    2.87(-1)       -1.71(-2)
3       .22     -5.04(-2) - 9.17(-3)i    -9.96(-2)      3.00(-1)
4       .00     -1.83(-2) + 5.56(-2)i    -5.50(-2)      -1.03(-1)
5       -.11    -3.04(-2) + 1.23(-3)i    2.73(-2)       1.01(-1)
6       -.33    1.38(-2)                 -5.04(-2)      9.17(-3)
7       -.66    -3.04(-2) - 1.23(-3)i    -5.05(-2)      3.06(-2)
8       -.11    -1.83(-2) - 5.56(-2)i    -1.83(-2)      -5.56(-2)
9       .33     -5.04(-2) + 9.17(-3)i    1.70(-2)       2.45(-2)
10      .77     -5.50(-2) - 1.03(-1)i    -3.04(-2)      -1.23(-3)
11      .99     2.87(-1) - 1.71(-2)i     -4.49(-3)      -1.83(-2)
(12)    .88     -                        1.38(-2)       -

Note: The RDFT uses (f_0 + f_12)/2 for f_0. The DCT uses both f_0 and f_12. The DST uses f_n with n = 1 : 11. The parenthetical entries are exponents, e.g., 1.38(-2) means 1.38 x 10^-2.

Simple as this example may seem, it does offer some valuable lessons. The first is that for an arbitrary real sequence with no symmetry, the sets of RDFT, DCT, and DST coefficients are different. There are 12, 13, and 11 independent real coefficients for the RDFT, DCT, and DST, respectively, matching the number of real input entries exactly. Notice that the RDFT transform coefficients possess conjugate even symmetry since the input sequence is real. However, the DCT and DST coefficients have no particular symmetry (other than being real), since the input has no additional symmetry.
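The experiment is easy to repeat. The sketch below recomputes the three transforms of the Table 4.2 input, assuming the 1/N-scaled forms used in the earlier remarks (RDFT as an N-point DFT with f_0 replaced by the average of f_0 and f_12; DCT with halved end terms; DST over n = 1 : 11). Those forms are assumptions about (4.13), (4.5), and (4.8), although they do reproduce tabulated entries such as F_0 = 2.15(-1) and the DCT value 1.38(-2) at k = 12.

```python
import numpy as np

N = 12
f = np.array([.11, .44, .55, .22, .00, -.11, -.33, -.66,
              -.11, .33, .77, .99, .88])          # f_0 .. f_12 from Table 4.2

# RDFT: n = 0 : 11, with f_0 replaced by the average of f_0 and f_12
f_rdft = f[:N].copy()
f_rdft[0] = 0.5 * (f[0] + f[N])
rdft = np.fft.fft(f_rdft) / N

# DCT: n = 0 : 12, end terms weighted by 1/2
n = np.arange(N + 1)
wts = np.ones(N + 1)
wts[[0, N]] = 0.5
dct = np.cos(np.pi * np.outer(n, n) / N) @ (wts * f) / N

# DST: n = 1 : 11, with f_0 = f_12 = 0 implied
m = np.arange(1, N)
dst = np.sin(np.pi * np.outer(m, m) / N) @ f[1:N] / N

for k in range(N):
    print(k, np.round(rdft[k], 4), round(dct[k], 4))
```

One pattern worth noticing in the output: the even-indexed DCT coefficients coincide with the real parts of the RDFT coefficients, and the even-indexed DST coefficients with the negated imaginary parts, because the modes with even index 2k have period N.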

A further revelation occurs if we now use the computed coefficients and evaluate the inverse transforms. Figure 4.3 shows the sequences that are generated by the inverse RDFT, DCT, and DST given by (4.14), (4.6), and (4.9). The important observation is that the values f_1, ..., f_11 match the original input values. However, the sequences differ at the endpoints and when they are extended beyond the interval n = 0 : 12. Specifically, the inverse RDFT extends to a sequence with period N, the inverse DCT extends to an even sequence with period 2N, and the inverse DST extends to an odd sequence with period 2N. This is in strict analogy with the real, cosine, and sine forms of the Fourier series.
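The reconstruction claim can also be verified directly. In the sketch below the inverse DCT and DST are taken to be f_n = 2 Σ'' F_k cos(πnk/N) and f_n = 2 Σ F_k sin(πnk/N); these are assumed forms of (4.6) and (4.9), derived from the discrete orthogonality of the cosine and sine modes and paired with the 1/N-scaled forward transforms assumed earlier.

```python
import numpy as np

N = 12
f = np.array([.11, .44, .55, .22, .00, -.11, -.33, -.66,
              -.11, .33, .77, .99, .88])          # f_0 .. f_12 from Table 4.2

n = np.arange(N + 1)
wts = np.ones(N + 1)
wts[[0, N]] = 0.5
C = np.cos(np.pi * np.outer(n, n) / N)            # symmetric cosine matrix
m = np.arange(1, N)
S = np.sin(np.pi * np.outer(m, m) / N)            # symmetric sine matrix

# forward transforms (assumed 1/N scaling)
f_rdft = f[:N].copy()
f_rdft[0] = 0.5 * (f[0] + f[N])
rdft = np.fft.fft(f_rdft) / N
dct = C @ (wts * f) / N
dst = S @ f[1:N] / N

# inverse transforms
rec_rdft = np.fft.ifft(rdft * N).real             # n = 0 : 11
rec_dct = 2 * C @ (wts * dct)                     # n = 0 : 12
rec_dst = 2 * S @ dst                             # n = 1 : 11

print(rec_rdft[0], rec_dct[0])                    # endpoint values differ
```

All three reconstructions return the input at n = 1 : 11. At n = 0 the inverse RDFT returns the averaged endpoint value, the inverse DCT returns f_0 itself, and the sine series is zero there by construction.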

We have described in detail how the explicit forms and the pre- and postprocessing methods can be used to compute three symmetric transforms. For a complete comparison, the compact symmetric FFTs mentioned earlier must also be brought into the discussion. Table 4.3 shows the cost of computing the RDFT, DCT, and DST of an arbitrary real sequence of length N by using the explicit form, a pre- and postprocessing method, and a compact symmetric FFT. It assumes that the pre- and postprocessing methods use a complex N-point FFT which has a computational cost of 3N log N - 2N real additions and 2N log N - 4N real multiplications. Only the two "leading order" terms in the pre- and postprocessing and compact FFT costs are included, which is sufficient for purposes of comparison. We note that costs shown are for arbitrary real sequences of length N (not symmetric sequences of length N).

FIG. 4.3. Having computed the RDFT, DCT, and DST of a given sequence with N = 12 (values given in Table 4.2), the transform coefficients are then used to reconstruct the sequence f_n according to the inverse RDFT (top figure), inverse DCT (middle figure), and inverse DST (lower figure). These sequences agree at the points n = 1 : 11, but differ when extended beyond this interval.

TABLE 4.3
Cost of RDFT, DCT, and DST by explicit definition, pre- and postprocessing, and compact symmetric FFT.

Transform   Operation     Explicit   Pre- and postprocessing   Compact FFT
RDFT        Real adds     ~N^2       (3N log N)/2 - 2N         (3N log N)/2 - 5N/2
            Real mults    ~N^2       N log N - 2N              N log N - 3N
DCT         Real adds     ~N^2       (3N log N)/2 + N          (3N log N)/2 - 3N
            Real mults    ~N^2       N log N - 2N              N log N - 4N
DST         Real adds     ~N^2       (3N log N)/2 - 3N/2       (3N log N)/2 - 3N
            Real mults    ~N^2       N log N - 3N              N log N - 4N

Operation counts for pre- and postprocessing and compact FFTs taken from Swarztrauber [141] modified for arbitrary sequences of length N.

A few comments are in order. First, note the savings that both the pre- and postprocessing and compact FFTs offer over the explicit forms. However, the pre- and postprocessing methods and the compact FFTs are identical in cost in the leading term. The differences appear in the terms that are linear in N and may be insignificant. The compact FFTs are often given the advantage because they require fewer passes through the data; for some computer architectures, this property could be far more important than operation counts. Finally, we emphasize that the DCT and DST show no computational advantage over the RDFT. This is because all three DFTs are applied to an arbitrary sequence of length N in which no symmetries are present. Of course, as claimed earlier, all three methods have roughly half the storage and computation costs of the complex DFT.

TABLE 4.4
Real symmetric sequences and their symmetric DFTs, original sequence of length 2N, DFT of length N.

Columns: Symmetry; Property; Example: N = 3; Boundary conditions.
Rows: Even; Odd; Quarter even*; Quarter odd**.
[Table entries not recoverable from the source.]

* See problem 87.
** See problem 88.

Much of the symmetric DFT territory has been covered in this section, certainly those regions that are most frequently traveled. However, there are many more symmetries that have been discovered and studied, often in rather specialized applications. We close this section with Table 4.4, which is a list (still far from exhaustive) of symmetries that arise in practice and that lead to other symmetric DFTs. Each line represents a different symmetric sequence and the associated symmetric DFT. The original sequence is assumed to be periodic with length 2N, which leads to a new symmetric DFT of length N. Since many symmetric DFTs are used in the solution of boundary value problems, the relevant boundary condition is also shown on a computational domain defined by n = 0 : N.

4.6. Notes

A matrix formulation of both pre- and postprocessing algorithms and compact symmetric FFTs can be found in Van Loan [155]. The symmetries discussed in this chapter arise on a regular set of grid points. Symmetric DFTs can also be defined on staggered grids using midpoints of subintervals [144]. DFTs have also been defined on more exotic two-dimensional grids; see [5] for DFTs on hexagonal grids. The most complete collection of compact symmetric FFTs that includes versions for standard, staggered, and mixed grids (together with codes) is given by Bradford [16]. Readers interested specifically in the DCT should see Rao and Yip [116].

4.7. Problems

70. Folded RDFT. It is interesting to note that further savings can be realized in both the real and complex DFTs by folding the DFT sum. Show that the real N-point DFT (4.2) can be written as

where k = 0 : N/2. In this form, how many real additions and multiplications are needed to compute the full set of coefficients (counting only multiples of N^2)?

71. Folded DFT. Show how the maneuver of the previous problem can also be applied to the complex DFT (2.6). What is the savings in real arithmetic in using the folded form of the DFT?

72. Alternate form of the RDFT. Show that if the real sequence f_n is defined for n = 0 : N - 1, then the transform sequence has the conjugate even property F_k = F*_{N-k}, and is given by

for k = 0 : N/2. What value of k corresponds to the highest frequency mode? Show that the associated inverse (again using the conjugate even symmetry of the F_k's) is

for n = 0 : N - 1.

73. DCT and DST modes. Make a rough sketch of the modes that appear in the DCT and DST of length N = 8. In each case note the frequency and period of each mode.

74. Aliasing of DST and DCT modes. Let

be the nth component of the kth DCT mode where k = 0 : N. Show that the k + 2N mode is aliased as the kth mode. What form does the k + N mode take relative to the kth mode? Do the same conclusions apply to the DST modes?

75. Properties of the DCT. Given a sequence f_n, let F_k = C{f_n}_k denote the N-point DCT of f_n as given by (4.5). Show that the following properties are true:

(a) Periodicity:

(b) Shift 1:

and

(c) Shift 2:

(d) Even input: If

then

(e) Odd input: If

then

76. Properties of the DST. Given a sequence f_n, let F_k = S{f_n}_k denote the N-point DST of f_n as given by (4.8). Show that the following properties are true:

(a) Periodicity:

(b) Shift 1:

and

(c) Shift 2:

(d) Even input: If

then

(e) Odd input: If

then

77. Inverse DST. Start with a sequence of coefficients F_k of length 2N that is odd and pure imaginary. Apply the inverse RDFT (4.3) of length 2N and use the symmetry of the coefficients to derive the inverse DST as given in (4.9).

78. Orthogonality. Show that the following orthogonality properties govern the modes of the DCT and DST.

for k = 0 : N, j = 0 : N, and

for k = 1 : N - 1, j = 1 : N - 1.

79. Inverses from orthogonality. Derive the inverse DCT and DST directly using the above orthogonality properties. For example, in the DCT case proceed as follows. Multiply both sides of the forward DCT

by an arbitrary mode cos(πnk/N) and sum (using Σ'') both sides over k = 0 : N. Use the orthogonality to solve for f_n, where n = 0 : N. Carry out the same procedure to derive the inverse DST.

80. Two real DFTs by one complex DFT. Verify that if g_n and h_n are two real sequences of length N and f_n = g_n + i h_n, then the respective DFTs are related as given by Procedure 1:

where k = -N/2 + 1 : N/2.

81. Alternate butterfly relations. Show that if the complex DFT is defined on the indices n, k = 0 : N - 1 by

for k = 0 : N - 1, then the butterfly relations of Procedure 2 take the form

for k = 0 : N/2 - 1. In light of this modification, how are the relations of Procedure 3 changed?

82. 2N-point real DFT from N-point complex DFT. Verify that the DFT of a real sequence of length 2N as given by Procedure 3 is conjugate even. Verify that F_0 and F_N are real.

83. An inefficient DCT. Show that the N-point DCT of an arbitrary real sequence f_n consists of the N + 1 (real) coefficients F_0, ..., F_N of the 2N-point complex DFT of the even extension of f_n.

84. An inefficient DST. Show that the N-point DST of an arbitrary real sequence f_n consists of the N - 1 (real) coefficients iF_1, ..., iF_{N-1} of the 2N-point complex DFT of the odd extension of f_n.

85. Pre- and postprocessing for the DST. The following steps lead to a pre- and postprocessing method for computing the DST.

(a) Given an arbitrary real sequence f_n for n = 1 : N, use its odd extension to define the auxiliary sequence

for n = -N/2 + 1 : N/2.

(b) Show that z_n is a conjugate even sequence and its DFT Z_k is real.

(c) Having computed Z_k, use Procedure 1 to show that

(d) Letting g'_n = f_{2n-1}, show that

(e) Having found the DFT of the even and odd subsequences of f_n, use the combine relations of Procedure 2 to write the following procedure.

Procedure 5: Pre- and postprocessing for the DST

86. A more efficient RDFT. The text offered three methods for computing the DFT of a real sequence of length 2N.

Use a 2N-point complex DFT (with a cost of roughly 4(2N)^2 = 16N^2 real additions and multiplications).

Use the explicit form of the RDFT (with a cost of roughly (2N)^2 = 4N^2 real additions and multiplications).

Use Procedure 3 (with a cost of roughly 4N^2 real additions and multiplications).

Here is a more efficient method (short of calling in the FFTs, which will surpass all of these suggestions). Procedure 2 can be streamlined to combine the RDFTs of two sequences of length N to form an RDFT of length 2N.

(a) Let f_n be the given sequence of length 2N, where n = -N + 1 : N. Note that its DFT F_k is conjugate even and hence only the coefficients for k = 0 : N need to be computed.

(b) Let g_n = f_{2n} and h_n = f_{2n-1} be the even and odd subsequences of f_n, where n = -N/2 + 1 : N/2. Note that the N-point RDFTs G_k and H_k are also conjugate even and only the coefficients k = 0 : N/2 need to be stored.

(c) With these symmetries in mind, show that the combine relations of Procedure 2 can be reduced to

and the full sequence F_k can be recovered.

(d) How many real additions and multiplications are required in this process (in multiples of N^2)?

(The recursive application of this strategy to the smaller RDFTs results in Edson's compact symmetric FFT.)

87. The quarter-wave even DFT. Assume that a real sequence f_n of length 2N is defined for n = -N + 1 : N and satisfies the quarter-wave even (QE) symmetry f_{-n-1} = f_n.

(a) Show that this symmetry together with periodicity (period 2N) implies that the sequence f_n satisfies the boundary conditions f_0 - f_{-1} = f_N - f_{N-1} = 0. Note that there are N independent values of f_n in this sequence of length 2N.

(b) Beginning with the complex DFT (2.6), reverse the order of summation to show that the QE symmetry implies that F_{-k} = F_k^* and F_k = ω_{2N}^{k} F_k^*. Show that F_k can be expressed in the form F_k = ω_{2N}^{k/2} F̃_k, where F̃_k is real.

(c) Show that the auxiliary sequence F̃_k has the properties F̃_N = 0, F̃_k = -F̃_{k±2N}, and F̃_{-k} = F̃_k.

(d) Now use the QE symmetry in the definition of the complex DFT (2.6) to derive the following symmetric DFT.

Quarter-Wave Even (QE) DFT

(e) Use either the inverse complex DFT or the inverse real DFT (4.3) and the symmetry F_k = ω_{2N}^{k/2} F̃_k to find the following inverse DFT.

Inverse Quarter-Wave Even DFT

Note that by using the auxiliary sequence F̃_k the transform pair can be computed entirely in terms of real quantities.

88. The quarter-wave odd DFT. Follow the pattern of the previous problem to develop the forward and inverse quarter-wave odd (QO) DFTs.

(a) Beginning with a sequence f_n of length 2N that possesses the QO symmetry f_{-n-1} = -f_n, show that f_n satisfies the boundary conditions f_0 + f_{-1} = f_N + f_{N-1} = 0.

(b) Show that with this symmetry the DFT satisfies F_k = -ω_{2N}^{k} F_k^*; this implies that F_k = i ω_{2N}^{k/2} F̃_k, where F̃_k is real.

(c) Show that the auxiliary sequence F̃_k satisfies F̃_0 = 0 and F̃_{-k} = -F̃_k.

(d) Derive the following symmetric transform pair.

Quarter-Wave Odd (QO) DFT

Inverse Quarter-Wave Odd DFT

Note that by using the auxiliary sequence F̃_k the transform pair can be computed entirely in terms of real quantities.

Chapter 5

Multidimensional DFTs

It is good to express a matter in two ways simultaneously so as to give it both a right foot and a left. Truth can stand on one leg, to be sure; but with two it can walk and get about.
- Friedrich Nietzsche

5.1 Introduction
5.2 Two-Dimensional DFTs
5.3 Geometry of Two-Dimensional Modes
5.4 Computing Multidimensional DFTs
5.5 Symmetric DFTs in Two Dimensions
5.6 Problems

5.1. Introduction

Up until this point every ounce of attention has been devoted to the uses and properties of the one-dimensional DFT. However, in terms of practical occurrences of the DFT and in terms of the overall computational effort spent on DFTs, the problem that arises even more frequently is that of computing multidimensional DFTs. We must now imagine input that consists of an array of data that may be arranged in a plane (for the two-dimensional case), a parallelepiped (for the three-dimensional case), or even higher-dimensional configurations. It is not difficult to appreciate how such higher-dimensional arrays of data might arise. The image on a television screen, the output of a video display device, or an aerial photograph may all be regarded as two-dimensional arrays of pixels (picture elements). A tomographic image from a medical scanning device is a two- or three-dimensional array of numbers. The solution of a differential equation in two or more spatial dimensions is generally approximated at discrete points of a multidimensional domain. A statistician doing a multivariate study of the causes of a particular disease may collect data in 20 or 30 variables and then look for correlations. All of these applications require the computation of DFTs in two or more dimensions. Fortunately, not one bit of the time spent on one-dimensional DFTs will be wasted: almost without exception, the properties of the one-dimensional DFT can be used and extended in our discussions of multidimensional DFTs. The primary task is to present the two-dimensional DFT and make sure that it is well understood. The generalization to three and more dimensions is then absolutely straightforward. It is a lovely excursion with analytical, geometrical, and computational sights in all directions. With that promise, let's begin this important and most relevant subject.

5.2. Two-Dimensional DFTs

As with the one-dimensional DFT, multidimensional DFTs begin with input. Depending on the origins of the particular problem, the input may be given in a discrete form (perhaps sampled data) or in continuous form (a function of several variables). For the moment assume that we are given a function f defined on the rectangular region¹

$$ \{(x, y) : -A/2 \le x \le A/2, \; -B/2 \le y \le B/2\}. $$

As in the one-dimensional case this function must be sampled in order to handle it numerically. To carry out this sampling a grid is established on the region with a uniform grid spacing of Δx = A/M in the x-direction and Δy = B/N in the y-direction. The resulting grid points (see Figure 5.1) are given by

$$ (x_m, y_n) = (m\,\Delta x, \; n\,\Delta y) $$

for m = -M/2 + 1 : M/2 and n = -N/2 + 1 : N/2. The input function f can now be sampled by recording its values at the grid points to produce the sequence (or array) of numbers f_mn = f(x_m, y_n). If the original problem had been discrete in nature, the input would already appear in this form. In anticipation of using the array f_mn as input to a DFT, we will already begin to think of it as being doubly periodic, meaning that

$$ f_{m+M,\,n} = f_{m,\,n+N} = f_{mn}. $$

Furthermore, all of the familiar exhortations about endpoints and discontinuities must also be repeated. At boundaries and discontinuities, f_mn must be assigned the appropriate average value (AVED). For discontinuities at the boundaries this presents no difficulties, since the boundaries are parallel to one of the coordinate directions. For example, if f(-A/2, y_n) ≠ f(A/2, y_n) then f_{M/2, n} should be assigned the value (1/2)(f(-A/2, y_n) + f(A/2, y_n)). For internal discontinuities, there can be some subtleties in determining average values which we will not belabor here.

¹ Following the convention of previous chapters, we will continue to define functions on closed sets (boundary points included) with the understanding that special attention must be given to boundary points.

FIG. 5.1. The two-dimensional DFT works on a spatial grid with grid spacings Δx = A/M and Δy = B/N in the x- and y-directions, respectively. A typical grid point is (x_m, y_n), where m = -M/2 + 1 : M/2 and n = -N/2 + 1 : N/2. Intimately related to the spatial grid is the frequency grid (to be introduced shortly) with grid spacings Δω = Ω/M and Δσ = Λ/N. A typical frequency grid point is (ω_m, σ_n), where m = -M/2 + 1 : M/2 and n = -N/2 + 1 : N/2.
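The sampling step, including the averaging at the periodic boundaries, might be coded as follows. This is a sketch under assumptions: M and N are even, the function f accepts NumPy arrays, and the corner point is assigned the average of the four corner values, which is one reasonable reading of AVED that the text does not spell out. The name `sample_on_grid` is hypothetical.

```python
import numpy as np

def sample_on_grid(f, A, B, M, N):
    """Sample f on m = -M/2+1 : M/2, n = -N/2+1 : N/2 with boundary averaging."""
    # include both boundaries first (m = -M/2 and m = M/2, likewise for n)
    m = np.arange(-M // 2, M // 2 + 1)
    n = np.arange(-N // 2, N // 2 + 1)
    X, Y = np.meshgrid(m * (A / M), n * (B / N), indexing="ij")
    g = np.asarray(f(X, Y), dtype=float)
    # average the values at x = -A/2 and x = A/2, then at y = -B/2 and y = B/2
    g[-1, :] = 0.5 * (g[0, :] + g[-1, :])
    g[:, -1] = 0.5 * (g[:, 0] + g[:, -1])
    # keep one copy of each boundary: rows m = -M/2+1 : M/2, columns likewise
    return g[1:, 1:]

fmn = sample_on_grid(lambda x, y: x + 0 * y, A=2.0, B=2.0, M=4, N=4)
print(fmn[:, 0])
```

For f(x, y) = x the periodic extension jumps from 1 to -1 at x = ±A/2, so the last row of the array (m = M/2) holds the average value 0 instead of f(A/2, y) = 1.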

In order to motivate the two-dimensional DFT, we will start by considering a slightly special case that will be generalized immediately. For the moment assume that the input sequence f_mn is separable, meaning that it has the form f_mn = g_m h_n, the product of an m-dependent (or x-dependent) term and an n-dependent (or y-dependent) term. With this small assumption, we now look for a representation of the input f_mn in terms of the basic sine and cosine modes in each of the coordinate directions. In this separable case, we know that the individual sequences g_m and h_n have representations given by the one-dimensional inverse DFTs

$$ g_m = \sum_{j=-M/2+1}^{M/2} G_j\, \omega_M^{mj}, \qquad h_n = \sum_{k=-N/2+1}^{N/2} H_k\, \omega_N^{nk}. $$

In these representations, m = -M/2 + 1 : M/2 and n = -N/2 + 1 : N/2. We should recognize G_j and H_k as the DFT coefficients of g_m and h_n, respectively, given by

$$ G_j = \frac{1}{M} \sum_{m=-M/2+1}^{M/2} g_m\, \omega_M^{-mj}, \qquad H_k = \frac{1}{N} \sum_{n=-N/2+1}^{N/2} h_n\, \omega_N^{-nk}, $$

where j = -M/2 + 1 : M/2 and k = -N/2 + 1 : N/2. It is now an easy matter to construct a representation for the two-dimensional sequence f_mn = g_m h_n. Multiplying the two individual representations for g_m and h_n, we have that

$$ f_{mn} = g_m h_n = \sum_{j=-M/2+1}^{M/2} \sum_{k=-N/2+1}^{N/2} G_j H_k\, \omega_M^{mj} \omega_N^{nk}. $$

If we now let the product G_j H_k be the new DFT coefficient F_jk, we have the following representation for the input sequence f_mn:

$$ f_{mn} = \sum_{j=-M/2+1}^{M/2} \sum_{k=-N/2+1}^{N/2} F_{jk}\, \omega_M^{mj} \omega_N^{nk}, $$

where m = -M/2 + 1 : M/2 and n = -N/2 + 1 : N/2. Furthermore, we can combine the expressions for the DFT coefficients G_j and H_k to write that

$$ F_{jk} = G_j H_k = \frac{1}{MN} \sum_{m=-M/2+1}^{M/2} \sum_{n=-N/2+1}^{N/2} f_{mn}\, \omega_M^{-mj} \omega_N^{-nk}, $$

which applies for j = -M/2 + 1 : M/2 and k = -N/2 + 1 : N/2. At this point we have presented little more than a conjecture, but here is the claim that needs to be verified: Given an M x N input array f_mn, the forward DFT is given by the following formula.

Forward Two-Dimensional DFT

$$ F_{jk} = \frac{1}{MN} \sum_{m=-M/2+1}^{M/2} \sum_{n=-N/2+1}^{N/2} f_{mn}\, \omega_M^{-mj} \omega_N^{-nk} $$

for j = -M/2 + 1 : M/2 and k = -N/2 + 1 : N/2. Meanwhile the inverse DFT is given by the following formula, although the validity of the inverse is still to be verified.

Inverse Two-Dimensional DFT

$$ f_{mn} = \sum_{j=-M/2+1}^{M/2} \sum_{k=-N/2+1}^{N/2} F_{jk}\, \omega_M^{mj} \omega_N^{nk} $$

for m = -M/2 + 1 : M/2 and n = -N/2 + 1 : N/2. It is this pair that we will use in all that follows. Having finally written down an alleged two-dimensional DFT, it should be said that several different paths could have been chosen to the same destination. Any of the derivations of the DFT presented in Chapter 2 (approximation of Fourier series, approximation of Fourier transform, interpolation, least squares approximation) could be used in a two-dimensional setting to arrive at the two-dimensional DFT. The derivation based on Fourier series is explored in problem 115.
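The transform pair can be exercised directly. The sketch below assumes the forward transform carries the factor 1/(MN) with negative exponents and the inverse carries no factor, and it uses indices 0 : M-1 and 0 : N-1 in place of the centered ranges, which periodicity makes equivalent up to relabeling. The names `dft2` and `idft2` are hypothetical.

```python
import numpy as np

def dft2(f):
    """F_jk = (1/MN) sum_m sum_n f_mn w_M^{-mj} w_N^{-nk} (assumed scaling)."""
    M, N = f.shape
    WM = np.exp(-2j * np.pi * np.outer(np.arange(M), np.arange(M)) / M)
    WN = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)
    return WM @ f @ WN / (M * N)        # one-dimensional DFTs along each axis

def idft2(F):
    """f_mn = sum_j sum_k F_jk w_M^{mj} w_N^{nk}."""
    M, N = F.shape
    WM = np.exp(2j * np.pi * np.outer(np.arange(M), np.arange(M)) / M)
    WN = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)
    return WM @ F @ WN

rng = np.random.default_rng(4)
f = rng.standard_normal((6, 4))
F = dft2(f)

# separable input: the 2-D coefficients are products of 1-D coefficients
g, h = rng.standard_normal(6), rng.standard_normal(4)
F_sep = dft2(np.outer(g, h))
G, H = np.fft.fft(g) / 6, np.fft.fft(h) / 4
```

With these conventions `F` equals `np.fft.fft2(f) / (M * N)`, `idft2(F)` recovers `f`, and `F_sep` equals the outer product of `G` and `H`, which is the separable case that motivated the definition.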

Now the task is to show that (5.1) and (5.2) really do form a DFT pair. Beforedoing so, it would help to interpret these expressions for fmn and Fjf.. Just as in theone-dimensional case, the aim of the DFT is to express the two-dimensional input fmn

as a linear combination of modes that consist of sines and cosines. We can organizethe various modes with the help of the two spatial indices j and k. For fixed valuesof j and k, these modes are functions of the two indices ra and n. It is essential tounderstand what these modes look like. The (j, k) mode has the form

We see that for a fixed pair (j,k), one complex mode UJ^N corresponds to one cosinemode and one sine mode.

We must now interpret the frequency indices j and A; and make the first of manyimportant geometrical observations. Consider either the sine or cosine mode givenabove for a fixed (j, k) pair. If we hold n fixed and sample this wave on a lineof increasing ra, then the resulting sequence is periodic in m with j periods (orwavelengths) every M grid points (see Figure 5.2). In other words, j determinesthe frequency of the mode as it varies in the m- (or #-) direction alone. Similarly, ifwe hold ra fixed and sample the mode on a line of increasing n, we obtain a slice ofthe wave that has a frequency of k periods (or wavelengths) every N grid points inthe n- (or y-) direction.

As in the one-dimensional case, values of j or k near zero correspond to lowfrequencies (long periods) in the respective directions, while if \j\ is near M/2 or \k\is near N/2, we expect to see a high frequency (small period) in that direction. Afew pictures will speak many words in showing how both high and low frequenciesin either direction can be mixed in all combinations. Figure 5.3 shows several of themodes that are needed for a full representation of the two-dimensional input fmn.

We will return to the fascinating geometry of the DFT modes shortly. But it'stime to verify that the pair of expressions (5.1) and (5.2) has properties that we mightexpect of a DFT pair. It is easy to see that both the input sequence fmn as given by(5.1) and the transform sequence Fjk as given by (5.2) are M-periodic with respect totheir first index and TV-periodic with respect to the second. We must also verify thatthe alleged forward and inverse transforms really are inverses of each other. As in theone-dimensional case, the inverse property relies on orthogonality. So let's first checkthe orthogonality of the vectors wjfc with components w-£n = ^A/^A^- Consider thediscrete inner product of two modes with frequencies (jf, A;) and (jo,ko). It looks like


FIG. 5.2. Two slices reveal the anatomy of a typical two-dimensional DFT mode. Thesemodes are fully two-dimensional waves (top figure) characterized by two frequency indices( j , k ) . When sliced along a line of constant n (or y), the mode is periodic in the m- orx-direction (middle figure) with a frequency of j periods every M grid points. Similarly, aslice through the mode along a line of constant m (or x) produces a wave with a frequencyof k periods every N grid points (bottom figure). Here j' = 1 and M = 32, while k = 1 andAT = 24.

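And now something happens that will recur many times in working with multidimensional DFTs. The problem splits into separate problems, each of which can be handled easily with one-dimensional thinking. Here is how it happens in the case of orthogonality. By the multiplicative property of the exponential ($\omega_N^a\omega_N^b = \omega_N^{a+b}$) we have that

$$\langle w^{jk}, w^{j_0k_0}\rangle = \left(\sum_{m=-M/2+1}^{M/2}\omega_M^{m(j-j_0)}\right)\left(\sum_{n=-N/2+1}^{N/2}\omega_N^{n(k-k_0)}\right) = MN\,\delta_M(j-j_0)\,\delta_N(k-k_0).$$

We have appealed to the orthogonality of the one-dimensional vectors and used the modular Kronecker delta notation

$$\delta_M(j) = \begin{cases} 1 & \text{if } j = 0 \text{ or a multiple of } M,\\ 0 & \text{otherwise,}\end{cases}$$

and

$$\delta_N(k) = \begin{cases} 1 & \text{if } k = 0 \text{ or a multiple of } N,\\ 0 & \text{otherwise.}\end{cases}$$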

Stated in words, the inner product of two modes vanishes unless the two modes are identical or unless the two modes after aliasing are identical. (We will return to the matter of aliasing in two dimensions shortly.) With this two-dimensional orthogonality property, it is a short and direct calculation (problem 89) to verify that the relations (5.1) and (5.2) really are inverses of each other.
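The orthogonality property can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `mode` and `inner` are illustrative and not from the text, and the grid indices run over $0 : M-1$ and $0 : N-1$ rather than the centered ranges, which gives the same sums by periodicity.

```python
import numpy as np

def mode(j, k, M, N):
    """Return the M x N array w[m, n] = omega_M^(m j) * omega_N^(n k)."""
    m = np.arange(M).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    return np.exp(2j * np.pi * (m * j / M + n * k / N))

def inner(u, v):
    """Discrete inner product: sum over the grid of u times conj(v)."""
    return np.sum(u * np.conj(v))

M, N = 8, 6
base = mode(2, 1, M, N)
same = inner(base, mode(2, 1, M, N))             # identical modes
aliased = inner(base, mode(2 + M, 1 - N, M, N))  # identical after aliasing
different = inner(base, mode(3, 1, M, N))        # distinct modes

print(abs(same), abs(aliased), abs(different))
```

The first two inner products equal $MN$ and the third vanishes to rounding error, as the modular Kronecker deltas predict.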


FIG. 5.3. The two-dimensional DFT uses modes (sines and cosines) that can have different frequencies in the two coordinate directions. The figure shows three typical cosine modes on a grid with $M = N = 24$ points in both directions: (top left) $j = 1$, $k = 6$ (low frequency in the x-direction, high frequency in the y-direction), (middle left) $j = 3$, $k = 3$, and (bottom left) $j = 4$, $k = 0$. The three figures on the right are the corresponding frequency domain representations. Since each function consists of a single real mode, the DFTs consist of two isolated spikes. They are displayed in map view, with the diamonds at the locations of the spikes.

There is a multitude of other one-dimensional DFT properties that carry over directly to the two-dimensional case. Most of these will be relegated to the problem section (problem 92). As a quick example, let's look at a shift property to demonstrate some common techniques. Using the operator notation, we will let $F_{jk} = \mathcal{D}\{f_{mn}\}_{jk}$, where $M$ and $N$ are understood. Now consider the input sequence obtained from $f_{mn}$ by shifting it $m_0$ units in the positive $m$ direction and $n_0$ units in the positive $n$ direction. Its DFT is given by

$$\mathcal{D}\{f_{m-m_0,\,n-n_0}\}_{jk} = \frac{1}{MN}\sum_{m}\sum_{n} f_{m-m_0,\,n-n_0}\,\omega_M^{-mj}\,\omega_N^{-nk} = \omega_M^{-m_0 j}\,\omega_N^{-n_0 k}\,F_{jk}.$$


Notice that the periodicity of $f_{mn}$ and the complex exponential have been used to adjust the summation limits back to their conventional values. We conclude that, as in one dimension, the DFT of a shifted sequence is the DFT of the original sequence times a "rotation" or change of phase in the frequency domain. In the special case that the input $f_{mn}$ is shifted by exactly half a period in both directions ($m_0 = \pm M/2$, $n_0 = \pm N/2$), we have that

$$\mathcal{D}\{f_{m\pm M/2,\,n\pm N/2}\}_{jk} = (-1)^{j+k}\,F_{jk}.$$
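The shift property is easy to confirm on random data. This is a sketch under stated assumptions: it uses NumPy, the helper `dft2` is a hypothetical name that applies the $1/(MN)$ scaling to the forward transform, and indices run over $0 : M-1$, $0 : N-1$.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 8, 6
f = rng.standard_normal((M, N))

def dft2(a):
    """Forward 2-D DFT with the 1/(MN) scaling on the forward transform."""
    return np.fft.fft2(a) / a.size

m0, n0 = 3, 2
# np.roll with a positive shift moves entries toward larger indices, so
# shifted[m, n] = f[m - m0, n - n0] with indices taken modulo M and N.
shifted = np.roll(f, shift=(m0, n0), axis=(0, 1))

jj = np.arange(M).reshape(-1, 1)
kk = np.arange(N).reshape(1, -1)
phase = np.exp(-2j * np.pi * (m0 * jj / M + n0 * kk / N))
shift_ok = np.allclose(dft2(shifted), phase * dft2(f))

# Half-period shift in both directions: the phase factor is (-1)^(j+k).
half = np.roll(f, shift=(M // 2, N // 2), axis=(0, 1))
half_ok = np.allclose(dft2(half), (-1.0) ** (jj + kk) * dft2(f))

print(shift_ok, half_ok)
```

Both comparisons hold to rounding error, which illustrates that a spatial shift changes only the phase of each coefficient and never its magnitude.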

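Examples: Two-dimensional DFTs. Computing two-dimensional DFTs analytically is difficult in all but a few special cases. One special case is that in which the input function has the separable form $f(x,y) = g(x)h(y)$. In this situation, the corresponding array of sampled values also has a separable form $f_{mn} = g_m h_n$. With input of this form, it is easy to show (problem 90) that $F_{jk} = G_j H_k$, where $G_j$ and $H_k$ are the one-dimensional DFTs of the sequences $g_m$ and $h_n$, respectively. Analytical DFTs can also be found without undue effort when the input consists of certain combinations of sines and cosines. The simplest example is a single complex exponential with frequency indices $j_0$ and $k_0$. Using the orthogonality property of the two-dimensional DFT, it follows that (problem 91)

$$\mathcal{D}\{\omega_M^{mj_0}\,\omega_N^{nk_0}\}_{jk} = \delta_M(j-j_0)\,\delta_N(k-k_0).$$

We see that the DFT of a single complex mode consists of one spike located at position $(j_0,k_0)$ in the frequency domain. Combining two complex exponential modes allows us to produce single real modes. For a fixed pair of integers $j_0$ and $k_0$, we have that

$$\cos 2\pi\!\left(\frac{mj_0}{M}+\frac{nk_0}{N}\right) = \frac{1}{2}\left(\omega_M^{mj_0}\omega_N^{nk_0} + \omega_M^{-mj_0}\omega_N^{-nk_0}\right), \qquad \sin 2\pi\!\left(\frac{mj_0}{M}+\frac{nk_0}{N}\right) = \frac{1}{2i}\left(\omega_M^{mj_0}\omega_N^{nk_0} - \omega_M^{-mj_0}\omega_N^{-nk_0}\right).$$

Therefore, it follows immediately that the DFT of a single real mode consists of two spikes:

$$\mathcal{D}\left\{\cos 2\pi\!\left(\frac{mj_0}{M}+\frac{nk_0}{N}\right)\right\}_{jk} = \frac{1}{2}\left[\delta_M(j-j_0)\,\delta_N(k-k_0) + \delta_M(j+j_0)\,\delta_N(k+k_0)\right]$$

and

$$\mathcal{D}\left\{\sin 2\pi\!\left(\frac{mj_0}{M}+\frac{nk_0}{N}\right)\right\}_{jk} = \frac{1}{2i}\left[\delta_M(j-j_0)\,\delta_N(k-k_0) - \delta_M(j+j_0)\,\delta_N(k+k_0)\right].$$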

The DFTs of three different single cosine modes are shown in the plots of Figure 5.3. The double spikes occur symmetrically with respect to the origin, and their position indicates the mode's frequency in both the x- and y-directions. We can also consider the product of two real modes. To change the setting a bit, assume that for a fixed pair of integers $j_0$ and $k_0$, a function of the form

is given on the domain with $-A/2 < x < A/2$ and $-B/2 < y < B/2$. We now sample this function with $M$ points in the x-direction and $N$ points in the y-direction and then apply the two-dimensional DFT. We can either note that the input function has the separable form mentioned above or use the identity

Either route leads to the same result. Letting

be the sampled form of the function $f$, we find that the DFT is given by

A careful inspection of this expression reveals that it amounts to four spikes (with imaginary amplitudes) located at the four pairs of frequency indices $(\pm j_0, \pm k_0)$. Other combinations of sines and cosines in product form can be handled in a similar manner. For example, the DFT of the product of two cosine modes or two sine modes will consist of four spikes with real amplitude. Problem 93 offers several more examples of analytical DFTs in two dimensions.
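The spike patterns described above can be observed directly. The sketch below assumes NumPy and uses $M = N = 24$ with $j_0 = k_0 = 3$, as in the middle panel of Figure 5.3. The product mode $\sin(2\pi j_0 m/M)\cos(2\pi k_0 n/N)$ is an assumed example of a sine-cosine product chosen for illustration; the helpers `dft2` and `spikes` are hypothetical names.

```python
import numpy as np

M = N = 24
m = np.arange(M).reshape(-1, 1)
n = np.arange(N).reshape(1, -1)
j0, k0 = 3, 3

def dft2(a):
    """Forward 2-D DFT with the 1/(MN) scaling on the forward transform."""
    return np.fft.fft2(a) / a.size

def spikes(F, tol=1e-10):
    """Map centered indices (j, k) to amplitudes for all nonzero coefficients."""
    out = {}
    for j, k in zip(*np.nonzero(np.abs(F) > tol)):
        jc = j - M if j > M // 2 else j   # fold 0:M-1 into -M/2+1:M/2
        kc = k - N if k > N // 2 else k
        out[(int(jc), int(kc))] = complex(np.round(F[j, k], 12))
    return out

cos_mode = np.cos(2 * np.pi * (j0 * m / M + k0 * n / N))
product = np.sin(2 * np.pi * j0 * m / M) * np.cos(2 * np.pi * k0 * n / N)

cos_spikes = spikes(dft2(cos_mode))    # two spikes with real amplitude 1/2
prod_spikes = spikes(dft2(product))    # four spikes with imaginary amplitude

print(cos_spikes)
print(prod_spikes)
```

The single cosine mode yields spikes of height $1/2$ at $(j_0,k_0)$ and $(-j_0,-k_0)$, while the product yields four purely imaginary spikes of magnitude $1/4$ at $(\pm j_0, \pm k_0)$.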

This would be a good place to note that we have defined the two-dimensional DFT pair on a symmetric domain with the origin (0,0) at the center. But just as in the one-dimensional case, there are often situations in which other domains are convenient. We can define the DFT pair on any domain with $M \times N$ adjacent grid points. In later chapters, we will use the two-dimensional DFT on domains with spatial indices $m = 0 : M-1$, $n = 0 : N-1$. The DFT pair for this set of indices is

$$F_{jk} = \frac{1}{MN}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} f_{mn}\,\omega_M^{-mj}\,\omega_N^{-nk}$$

for $j = 0 : M-1$, $k = 0 : N-1$, in the forward direction, and

$$f_{mn} = \sum_{j=0}^{M-1}\sum_{k=0}^{N-1} F_{jk}\,\omega_M^{mj}\,\omega_N^{nk}$$

for $m = 0 : M-1$, $n = 0 : N-1$ for the inverse DFT. This convention has some advantages; however, it should be noted that it does imply a rather peculiar arrangement of the frequencies. As shown in Figure 5.4, the indices for the low frequency modes are located near the corners of the frequency domain (indicated by "L"), while the high frequency (denoted "H") indices are located near the center (clustered around $j = M/2$ and $k = N/2$). Mixed modes, with a high frequency in one direction and a low frequency in the other direction, appear near the edges of the frequency domain, and are marked "M" in the figure.

FIG. 5.4. The two-dimensional DFT can also be defined for indices $m, j = 0 : M-1$ and $n, k = 0 : N-1$. With this choice, the high frequencies appear near the center (denoted "H"), the low frequencies near the corners (denoted "L"), and the frequencies near the central edges (denoted "M") are mixed frequencies, high in one direction and low in the other.
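The corner arrangement of low frequencies is what NumPy's FFT routines return, and `np.fft.fftshift` moves the zero frequency back to the center. The following sketch assumes NumPy; note that NumPy's centered range is $-M/2 : M/2-1$, which differs from the range $-M/2+1 : M/2$ used in this chapter only in where the index $M/2$ is placed.

```python
import numpy as np

M = N = 16
m = np.arange(M).reshape(-1, 1)
n = np.arange(N).reshape(1, -1)
# A low-frequency real mode: one period in x, two periods in y.
f = np.cos(2 * np.pi * (1 * m / M + 2 * n / N))

F = np.fft.fft2(f) / f.size   # indices j = 0:M-1, k = 0:N-1
corner_spikes = sorted((int(j), int(k))
                       for j, k in zip(*np.nonzero(np.abs(F) > 1e-10)))

# fftshift reorders each axis so that frequency 0 sits at index M//2 (or N//2).
Fc = np.fft.fftshift(F)
center_spikes = sorted((int(j) - M // 2, int(k) - N // 2)
                       for j, k in zip(*np.nonzero(np.abs(Fc) > 1e-10)))

print(corner_spikes)   # spikes near opposite corners of the array
print(center_spikes)   # the same spikes in centered indices
```

With the $0 : M-1$ convention the two spikes of this low-frequency mode sit at $(1,2)$ and $(15,14)$, near opposite corners; after the shift they appear at $(\pm 1, \pm 2)$ around the center.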

5.3. Geometry of Two-Dimensional Modes

We gave only passing attention to the geometry of the two-dimensional DFT modes. There is more to be learned and it will eventually bear on the important questions of reciprocity, aliasing, and sampling. For the sake of illustration, we will work on the domain

$$-A/2 \le x \le A/2, \qquad -B/2 \le y \le B/2$$

and consider the continuous mode

$$w(x,y) = \cos 2\pi\!\left(\frac{jx}{A} + \frac{ky}{B}\right) \qquad (5.3)$$

with a fixed pair of frequency indices $(j,k)$. The corresponding sine mode could be considered in the same way, and there is an analogous analysis for the discrete form of this mode (problem 96). Several typical modes of this form were graphed in Figure 5.3. But how would one graph such a mode in general? What is the frequency and wavelength of such a mode? These are the questions that we must investigate.


Before proceeding, let's agree on some terminology: since multiple-dimension DFTs are generally used in spatially dependent problems, we will use the more appropriate term wavelength instead of period throughout this discussion. This means that the units of $x$ and $y$ are length.

Remember that the frequency indices $j, k$ and the dimensions of the domain $A, B$ are assumed to be fixed in this discussion. We can take a first cut at analyzing this mode by using the slicing exercise that was applied earlier (Figure 5.2 still helps). Along a line of constant $y$, the mode given by (5.3) has a wavelength of $A/j$ (measured in units of length) and a frequency of $j/A$ (measured in cycles per unit length). Some additional notation is now needed: we will let $\mu_j$ and $\omega_j$ be the wavelength and frequency, respectively, of the $(j,k)$ mode when sliced in the x-direction. The same argument in the y-direction shows that this mode, when sliced along a line of constant $x$, has a wavelength of $B/k$ and a frequency of $k/B$ that we will denote $\eta_k$ and $\sigma_k$. To summarize this notation, we have that the wavelengths and frequencies in the coordinate directions are given by

$$\mu_j = \frac{A}{j}, \qquad \omega_j = \frac{j}{A}$$

in the x-direction and

$$\eta_k = \frac{B}{k}, \qquad \sigma_k = \frac{k}{B}$$

in the y-direction. However, the mode given in (5.3) is more than a slice in the coordinate directions; it is a fully two-dimensional wave. Here is the crucial insight: viewed as a function of $x$ and $y$, this mode (for fixed $j$ and $k$) has the property that it is constant along the parallel lines

$$\frac{jx}{A} + \frac{ky}{B} = c \qquad (5.4)$$

in the $(x,y)$ plane, where $c$ is any constant. These lines are called the lines of constant phase or simply phase lines of this particular mode. Envisioned as a wave, the crests and troughs of a mode are parallel to its phase lines. We see by writing equation (5.4) as $y = -(Bj/Ak)x + (Bc/k)$ that all of the phase lines of the $(j,k)$ mode have a slope of $-Bj/Ak$ (Figure 5.5); or equivalently, the phase lines have an angle of $\theta_{jk}$ with respect to the x-direction where

$$\tan\theta_{jk} = -\frac{Bj}{Ak}. \qquad (5.5)$$

We can now ask questions about wavelength and frequency of the full wave, not of slices. For the cosine mode (5.3), with a maximum value at $x = y = 0$, there is a crest along the line

$$\frac{jx}{A} + \frac{ky}{B} = 0$$

that passes through the origin. The adjacent crests lie along the phase lines for which

$$\frac{jx}{A} + \frac{ky}{B} = \pm 1$$

as shown in Figure 5.5. Therefore, the wavelength of this particular mode is just the perpendicular distance between pairs of adjacent phase lines. A short calculation (problem 95) shows that the wavelength of the $(j,k)$ mode (measured in units of length) is

$$\lambda_{jk} = \frac{1}{\sqrt{\dfrac{j^2}{A^2} + \dfrac{k^2}{B^2}}}.$$

FIG. 5.5. The modes $\cos 2\pi(jx/A + ky/B)$ and $\sin 2\pi(jx/A + ky/B)$ are constant along the phase lines $jx/A + ky/B = c$ (diagonal lines in the left figure). This means that the crests and troughs of these waves are aligned with the phase lines with an angle $\theta_{jk}$ to the horizontal. In the frequency domain the frequency vector $(\omega_j, \sigma_k) = (j/A, k/B)$ points in a direction orthogonal to the phase lines given by the angle $\psi_{jk}$.

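For example, the modes with $j = 0$ (constant in the x-direction) have wavelengths of

$$\lambda_{0k} = \frac{B}{k},$$

as we would expect from the slicing argument. Similarly, the modes with $k = 0$ (constant in the y-direction) have wavelengths of

$$\lambda_{j0} = \frac{A}{j}.$$

The high frequency modes ($j = 0$, $k = N/2$ and $j = M/2$, $k = 0$) have wavelengths

$$\lambda_{0,N/2} = \frac{2B}{N} = 2\Delta y \qquad\text{and}\qquad \lambda_{M/2,0} = \frac{2A}{M} = 2\Delta x.$$

But notice that the $(M/2, N/2)$ mode has an even smaller wavelength of

$$\lambda_{M/2,N/2} = \frac{2}{\sqrt{\dfrac{M^2}{A^2} + \dfrac{N^2}{B^2}}} = \frac{2}{\sqrt{\dfrac{1}{\Delta x^2} + \dfrac{1}{\Delta y^2}}}.$$

The pattern of wavelengths of two-dimensional waves is given in Table 5.1 for the case $B = 2A = 8$. The same table applies to either the sine or cosine modes. Because of symmetry, only one-quarter of the frequency domain corresponding to nonnegative indices is given.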


TABLE 5.1
Wavelengths of two-dimensional modes with B = 2A = 8.

         j = 0         j = 1           j = 2           j = 3              j = 4
k = 0    ∞             4               2               4/3 = 1.33         1
k = 1    8             8/√5 = 3.58     8/√17 = 1.94    8/√37 = 1.32       8/√65 = 0.99
k = 2    4             4/√2 = 2.83     4/√5 = 1.79     4/√10 = 1.26       4/√17 = 0.97
k = 3    8/3 = 2.67    8/√13 = 2.22    8/5 = 1.60      8/(3√5) = 1.19     8/√73 = 0.94
k = 4    2             4/√5 = 1.79     √2 = 1.41       4/√13 = 1.11       2/√5 = 0.89
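The entries of Table 5.1 follow from the wavelength formula $\lambda_{jk} = 1/\sqrt{j^2/A^2 + k^2/B^2}$. The short sketch below regenerates them; the function name `wavelength` is illustrative, and the $(0,0)$ mode is assigned an infinite wavelength.

```python
import math

A, B = 4.0, 8.0   # B = 2A = 8, as in Table 5.1

def wavelength(j, k, A, B):
    """Wavelength of the (j, k) mode: 1 / sqrt((j/A)^2 + (k/B)^2)."""
    nu = math.hypot(j / A, k / B)   # length of the frequency vector
    return math.inf if nu == 0 else 1.0 / nu

# Rows are k = 0:4 and columns are j = 0:4, matching the table layout.
table = [[wavelength(j, k, A, B) for j in range(5)] for k in range(5)]
for k, row in enumerate(table):
    print(f"k = {k}: " + "  ".join(f"{lam:5.2f}" for lam in row))
```

Every mode with both indices nonzero has a shorter wavelength than either of its coordinate slices, which is the observation the table is meant to convey.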

Having determined the wavelength of the $(j,k)$ mode, its frequency, measured in cycles per unit length, is simply the reciprocal of the wavelength. Therefore, the frequency of the $(j,k)$ modes, denoted $\nu_{jk}$, is given by

$$\nu_{jk} = \frac{1}{\lambda_{jk}} = \sqrt{\frac{j^2}{A^2} + \frac{k^2}{B^2}} = \sqrt{\omega_j^2 + \sigma_k^2}.$$

This expression has the valuable interpretation that $\nu_{jk}$ may be regarded as the length of a frequency vector with components $(\omega_j, \sigma_k)$. Furthermore, this frequency vector has an angle $\psi_{jk}$ with respect to the horizontal axis of the frequency domain where

$$\tan\psi_{jk} = \frac{\sigma_k}{\omega_j} = \frac{Ak}{Bj}.$$

Comparing this angle $\psi_{jk}$ to the angle of the phase lines $\theta_{jk}$ given by (5.5), it follows that $\theta_{jk}$ and $\psi_{jk}$ differ by $\pi/2$ radians. This leads to the important observation that the phase lines of the $(j,k)$ mode and the $(j,k)$ frequency vector, if plotted in the same domain, are orthogonal (problem 97).
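The orthogonality can be seen in one line. As a sketch of the argument (the direction vector $d$ is a name introduced here for illustration): a phase line with slope $-Bj/(Ak)$ has direction vector $d = (Ak, -Bj)$, and its dot product with the frequency vector vanishes.

```latex
d \cdot (\omega_j, \sigma_k)
  = (Ak)\,\frac{j}{A} + (-Bj)\,\frac{k}{B}
  = jk - jk
  = 0 ,
\qquad\text{equivalently}\qquad
\tan\theta_{jk}\,\tan\psi_{jk}
  = \left(-\frac{Bj}{Ak}\right)\left(\frac{Ak}{Bj}\right)
  = -1 .
```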

There is one final step to complete the geometrical picture of the two-dimensional DFT modes. We must now set up the actual grid in the frequency domain that is induced by the spatial grid. This will also bring us closer to the reciprocity relations. The spatial domain

$$-A/2 \le x \le A/2, \qquad -B/2 \le y \le B/2$$

is intimately associated with a frequency domain

$$-\Omega/2 \le \omega \le \Omega/2, \qquad -\Lambda/2 \le \sigma \le \Lambda/2,$$

where the lengths $\Omega$ and $\Lambda$ will be determined momentarily. Corresponding to the $M \times N$ spatial grid, there is an $M \times N$ grid in the frequency domain with grid spacings $\Delta\omega$ and $\Delta\sigma$. From the definitions of the component frequencies $\omega_j = j/A$ and $\sigma_k = k/B$, we can anticipate that the frequency grid spacings should be given by

$$\Delta\omega = \frac{1}{A} \qquad\text{and}\qquad \Delta\sigma = \frac{1}{B}.$$

This makes sense physically also. It says that in the x- and y-directions separately, the smallest units of frequency ($\Delta\omega$ and $\Delta\sigma$, respectively) correspond to waves with the largest possible (finite) wavelengths: a wavelength of $A$ in the x-direction has a frequency of $\Delta\omega = 1/A$ in that direction, and a wavelength of $B$ in the y-direction has a frequency of $\Delta\sigma = 1/B$ in that direction. The frequencies represented by the DFT lie on this frequency grid and they are given by

$$\omega_j = j\,\Delta\omega = \frac{j}{A} \qquad\text{and}\qquad \sigma_k = k\,\Delta\sigma = \frac{k}{B},$$

where $j = -M/2+1 : M/2$ and $k = -N/2+1 : N/2$. It would be worthwhile to ponder Figure 5.1 to become familiar with the grids in the spatial and frequency domain, and all of the attendant notation.

With these geometrical insights about the modes of the two-dimensional DFT, we can now say a few words about two important matters: the first is the question of aliasing and the choice of grid spacings (or sampling rates), and the second is reciprocity relations. Let's begin with aliasing. Recall that if a band-limited input in one dimension is to be fully resolved, then it is necessary to sample it with at least two grid points per wavelength of every mode that appears in the signal. In one dimension, this means that if $\omega_{\max}$ is the highest frequency that appears in the input and it has a wavelength $\lambda_{\min} = 1/\omega_{\max}$, then a grid spacing $\Delta x$ must be chosen such that

$$\Delta x \le \frac{\lambda_{\min}}{2} = \frac{1}{2\omega_{\max}}.$$

A similar, but subtly different analysis can be applied in two or more dimensions. We will actually present four different arguments, each with its own virtues, to reach a final conclusion about aliasing and sampling rates.

Here is the guiding question. Consider a single mode

$$w(x,y) = \cos 2\pi\!\left(\frac{jx}{A} + \frac{ky}{B}\right)$$

with frequency indices $(j,k)$. Then, how should the grid be chosen to insure that this mode is fully resolved so as to avoid aliasing? The goal is to find the minimum values of $M$ and $N$ (or alternatively, the maximum values of $\Delta x$ and $\Delta y$) that insure no aliasing of this mode. The same argument will also apply to the sine version of this mode. The relationship we seek is a condition on $M$ and $N$ in terms of $A, B, j, k$. Let's first review what we know about this mode.

From the foregoing discussion of the geometry of the DFT modes, we know that the mode $w(x,y)$ has crests and troughs along the phase lines

$$\frac{jx}{A} + \frac{ky}{B} = c$$

and these phase lines have an angle

$$\theta = \tan^{-1}\!\left(-\frac{Bj}{Ak}\right)$$

with respect to the horizontal. The wave has wavelengths of $\mu = A/j$ and $\eta = B/k$ in the x- and y-directions individually. The full wave has a wavelength of

$$\lambda = \frac{1}{\sqrt{\dfrac{j^2}{A^2} + \dfrac{k^2}{B^2}}}$$

in the direction orthogonal to the phase lines. (We will suppress the subscripts $j, k$ to simplify the notation.) Recall that the frequency vector with components $\omega = j/A$ and $\sigma = k/B$ has a magnitude and an angle to the horizontal given by

$$\nu = \sqrt{\omega^2 + \sigma^2} \qquad\text{and}\qquad \tan\psi = \frac{\sigma}{\omega}.$$

Furthermore, the angles $\theta$ and $\psi$ differ by $\pi/2$ radians. This geometry is shown in Figure 5.6, from which it is not difficult to show that

$$\lambda = \mu\cos\psi = \eta\sin\psi.$$

In other words, the wavelength of the full wave is less than the wavelengths in each of the coordinate directions.

FIG. 5.6. The sampling conditions that insure no aliasing of a two-dimensional wave can be derived from the geometry of the mode. The left figure shows the phase lines for the crests of a single mode (bold diagonal lines) with frequency indices $j = 2$, $k = 1$. One wavelength ($\lambda$) is the distance between two consecutive phase lines. The wavelengths in the x- and y-directions ($\mu$ and $\eta$) are also shown. A local coordinate $\xi$ is introduced along the frequency vector perpendicular to the phase lines. The right figure shows part of a spatial grid with spacings $\Delta x$ and $\Delta y$ that are half of the wavelengths $\mu$ and $\eta$. The local grid spacing $\Delta\xi$ is half of the wavelength $\lambda$.

With these preliminary remarks, we can now offer four viewpoints about aliasing and the selection of grids to avoid it.

1. Playing it safe. Since the wave we wish to resolve has a wavelength of $\lambda$ and a frequency $\nu = 1/\lambda$, one could reason that it is best to choose the grid spacings $\Delta x$ and $\Delta y$ so that they individually assure no aliasing; that is,

$$\Delta x \le \frac{\lambda}{2} \qquad\text{and}\qquad \Delta y \le \frac{\lambda}{2}.$$

This is a conservative choice that may use more grid points than absolutely necessary. For example, if the mode is aligned strongly with the y-axis ($j \gg k$), then the wave requires a small grid spacing in the x-direction, but a much coarser grid in the y-direction will suffice. Nevertheless, for an arbitrary input with frequencies of many different magnitudes and directions, this may be the safest choice.

2. By intuition. Arguing from a physical perspective, one might say that since the mode has wavelengths of $\mu$ and $\eta$ in the x- and y-directions, then the grid spacings in those directions should satisfy the one-dimensional sampling criterion individually; to wit,

$$\Delta x \le \frac{\mu}{2} = \frac{A}{2j} \qquad\text{and}\qquad \Delta y \le \frac{\eta}{2} = \frac{B}{2k}.$$

If this argument is convincing, then accept it, since it gives the correct conditions! If you desire more persuasion or rigor, then consider the following two lines of thought.

3. Geometry. As shown in Figure 5.6, the mode $w(x,y)$ is aligned with the phase lines. Imagine introducing a local one-dimensional grid with coordinate $\xi$ perpendicular to the phase lines (along the frequency vector). Relative to this axis the mode looks like a one-dimensional wave with wavelength $\lambda$. The one-dimensional sampling criterion says that a local grid spacing $\Delta\xi$ should be chosen such that $\Delta\xi \le \lambda/2$. Now it is just a matter of relating $\Delta\xi$ to $\Delta x$ and $\Delta y$. The geometry of Figure 5.6 shows that

$$\Delta\xi = \Delta x\cos\psi = \Delta y\sin\psi.$$

With these observations, the condition $\Delta\xi \le \lambda/2$ can be expressed in terms of $\Delta x$ and $\Delta y$ (problem 101). These individual conditions become

$$\Delta x \le \frac{\mu}{2} \qquad\text{and}\qquad \Delta y \le \frac{\eta}{2},$$

and the intuitive argument given above is confirmed.

4. Algebra. The same conclusion can be reached algebraically, but let's do it with a slightly different form of the question: Assume you are told that a particular two-dimensional input on a rectangular domain with dimensions $A$ and $B$ has a maximum frequency of $\nu$ in the direction $\psi$. What are the minimum values of $M$ and $N$ that should be used to avoid aliasing of the input? As before, let $\omega$ and $\sigma$ be the frequencies in the x- and y-directions and think of them as the maximum frequency components that need to be resolved by the grid. The maximum frequencies that appear on the frequency grid correspond to $j = M/2$ and $k = N/2$; they are given by $\omega = (M/2)\Delta\omega$ and $\sigma = (N/2)\Delta\sigma$. Therefore, given $\psi$ and the facts that $\Delta\omega = 1/A$ and $\Delta\sigma = 1/B$, we have that

$$\tan\psi = \frac{\sigma}{\omega} = \frac{N/2B}{M/2A} = \frac{AN}{BM}. \qquad (5.8)$$

This relates $\psi$ to $A, B, M, N$. What about $\nu$? Recall that $\nu^2 = \omega^2 + \sigma^2$. Again thinking of $\omega = M/2A$ and $\sigma = N/2B$ as the maximum frequencies that can be resolved by the grid, we see that

$$\nu^2 = \left(\frac{M}{2A}\right)^2 + \left(\frac{N}{2B}\right)^2. \qquad (5.9)$$

Now it is just a matter of algebra! In equations (5.8) and (5.9), $A, B, \nu, \psi$ are given, while the grid parameters $M$ and $N$ are to be determined. These two equations can be solved for $M$ and $N$ (problem 102) to find that

$$M = 2A\nu\cos\psi \qquad\text{and}\qquad N = 2B\nu\sin\psi$$

are the minimum values of $M$ and $N$ that avoid aliasing. To compare with the results of the earlier arguments, recall that $\Delta x = A/M$ and $\Delta y = B/N$. This gives the conditions that

$$\Delta x = \frac{1}{2\nu\cos\psi} = \frac{1}{2\omega} = \frac{\mu}{2} \qquad\text{and}\qquad \Delta y = \frac{1}{2\nu\sin\psi} = \frac{1}{2\sigma} = \frac{\eta}{2}$$

are the maximum values of $\Delta x$ and $\Delta y$ that insure no aliasing. We see once again that the one-dimensional sampling conditions applied to the components of the frequency or wavelength are the correct conditions in two dimensions.

Hopefully, at least one of the foregoing arguments demonstrates the following fact:

In order to avoid aliasing of a two-dimensional wave, grid spacings $\Delta x$ and $\Delta y$ must be chosen so that the one-dimensional waves obtained by slicing the full wave in the x- and y-directions are individually free of aliasing.

In applications, it is critical that the grid spacings be chosen carefully, in order to balance the need to avoid aliasing with the need to reduce the cost of complicated computations by reducing the size of the grids. Here is a simple graphical recipe for determining the maximum grid spacings that avoid aliasing of a single mode (see Figure 5.7).

Anti-Aliasing Recipe

1. Draw two adjacent phase lines $\ell_1$ and $\ell_2$ corresponding to the same phase of the mode (for example, two consecutive troughs or crests).

2. The maximum values of $\Delta x$ and $\Delta y$ that insure no aliasing are one-half the horizontal and vertical distances between $\ell_1$ and $\ell_2$, respectively.

It is an edifying exercise to verify (problem 103) that this recipe gives the same results as the above arguments.
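The sampling conditions translate into a few lines of code. The sketch below assumes NumPy; `max_spacings` and `min_grid` are hypothetical helper names. It also shows what goes wrong when the condition is violated: on a grid with $M = 8$, the mode with $j = 5$ is indistinguishable from the mode with $j = 5 - M = -3$.

```python
import numpy as np

def max_spacings(j, k, A, B):
    """Largest dx, dy that resolve the (j, k) mode: half the x- and y-wavelengths."""
    dx = np.inf if j == 0 else A / (2 * abs(j))
    dy = np.inf if k == 0 else B / (2 * abs(k))
    return dx, dy

def min_grid(j, k):
    """Smallest M, N such that dx = A/M <= A/(2|j|) and dy = B/N <= B/(2|k|)."""
    return 2 * abs(j), 2 * abs(k)

# The mode of Figure 5.6 (j = 2, k = 1) on a domain with A = 4, B = 8.
dx_max, dy_max = max_spacings(2, 1, 4.0, 8.0)
M_min, N_min = min_grid(2, 1)

# Undersampling: j = 5 needs M >= 10, but the grid has only M = 8 points.
M, N = 8, 8
m = np.arange(M).reshape(-1, 1)
n = np.arange(N).reshape(1, -1)
under = np.cos(2 * np.pi * (5 * m / M + 1 * n / N))
alias = np.cos(2 * np.pi * (-3 * m / M + 1 * n / N))
indistinguishable = np.allclose(under, alias)

print(dx_max, dy_max, M_min, N_min, indistinguishable)
```

Note that the y-direction is adequately sampled in this example; aliasing in the x-direction alone is enough to misidentify the mode, which is why the two directions must be checked individually.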

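We can now take the final steps to establish the reciprocity relations between the spatial and frequency domains. The fact that $\Delta\omega = 1/A$ and $\Delta\sigma = 1/B$ results in the one-dimensional relations

$$\Delta x\,\Delta\omega = \frac{1}{M} \qquad\text{and}\qquad \Delta y\,\Delta\sigma = \frac{1}{N}.$$

How are the extents of the spatial and frequency domains related? The maximum frequencies that can be represented in the x- and y-directions are given by

$$\frac{\Omega}{2} = \frac{M}{2}\,\Delta\omega = \frac{M}{2A} \qquad\text{and}\qquad \frac{\Lambda}{2} = \frac{N}{2}\,\Delta\sigma = \frac{N}{2B}.$$

This results in the one-dimensional reciprocity relationships between the extents of the spatial and frequency domains:

$$A\,\Omega = M \qquad\text{and}\qquad B\,\Lambda = N.$$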


FIG. 5.7. The figure shows consecutive phase lines for crests of a single mode (bold diagonal lines). The maximum grid spacings $\Delta x$ and $\Delta y$ that prevent aliasing of that mode are one-half of the horizontal and vertical distances between two consecutive phase lines.

We see that in two dimensions the one-dimensional reciprocity relations hold independently in each direction, and they can be used in much the same way. Holding $M$ and $N$ fixed, an increase in either grid spacing $\Delta x$ or $\Delta y$ results in a corresponding increase in the extents of the spatial domain $A$ or $B$, which produces a decrease in the extents $\Omega$ and $\Lambda$, which in turn leads to a decrease in the grid spacings $\Delta\omega$ and $\Delta\sigma$. Figure 5.8 is intended to be helpful in unraveling these important relationships.

These relations can also be used to form the two-dimensional reciprocity relations. Simply combining the two expressions (5.10) and (5.11), we can obtain relationships between grid ratios,

or between grid areas,

The area relationships (5.13) are the true analogs of the one-dimensional reciprocity relations. Like their one-dimensional cousins, they may also be interpreted as an uncertainty principle: Holding $M$ and $N$ fixed, an increase in the grid resolution in the spatial domain must result in a decrease in the grid resolution in the frequency domain.
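The reciprocity relations make it straightforward to compute the frequency grid induced by a given spatial grid. The sketch below uses the parameters of Figure 5.8 ($A = 4$, $B = 8$, $M = N = 16$); the function name `frequency_grid` and the dictionary keys are illustrative, and the area products at the end are obtained here simply by multiplying the one-dimensional relations.

```python
def frequency_grid(A, B, M, N):
    """Return the spatial and frequency grid parameters implied by A, B, M, N."""
    dx, dy = A / M, B / N                  # spatial grid spacings
    d_omega, d_sigma = 1.0 / A, 1.0 / B    # frequency grid spacings
    Omega, Lam = M * d_omega, N * d_sigma  # extents of the frequency domain
    return {"dx": dx, "dy": dy, "d_omega": d_omega, "d_sigma": d_sigma,
            "Omega": Omega, "Lambda": Lam}

A, B, M, N = 4.0, 8.0, 16, 16
g = frequency_grid(A, B, M, N)
print(g)

# One-dimensional reciprocity in each direction.
print(g["dx"] * g["d_omega"], 1 / M)
print(A * g["Omega"], M)

# Products of the one-dimensional relations: cell areas and domain areas.
cell_area_product = g["dx"] * g["dy"] * g["d_omega"] * g["d_sigma"]
domain_area_product = A * B * g["Omega"] * g["Lambda"]
print(cell_area_product, 1 / (M * N), domain_area_product, M * N)
```

Doubling $A$ with $M$ fixed doubles $\Delta x$ and halves both $\Delta\omega$ and $\Omega$, which is the trade-off between spatial and frequency resolution described above.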


FIG. 5.8. The figure shows a spatial domain with $A = 4$ and $B = 8$ overlaid by a grid with $M = N = 16$ grid points in each direction (implying $\Delta x = 1/4$ and $\Delta y = 1/2$). The reciprocity relations determine the parameters of the $M \times N$ frequency grid. In this case, $\Delta\omega = 1/A = 1/4$, $\Delta\sigma = 1/B = 1/8$, $\Omega = M\Delta\omega = 4$, and $\Lambda = N\Delta\sigma = 2$.

5.4. Computing Multidimensional DFTs

A few observations need to be made about the actual implementation of multidimensional DFTs. As with so much that has already transpired, we will see that it is possible to rely heavily on what we know about one-dimensional DFTs. For the purpose of computation it is useful to view the two-dimensional DFT as a two-step process. Given an $M \times N$ input array $f_{mn}$, we first write the forward DFT (5.1) in the form

$$F_{jk} = \frac{1}{M}\sum_{m=-M/2+1}^{M/2}\omega_M^{-mj}\left[\frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_{mn}\,\omega_N^{-nk}\right] = \frac{1}{M}\sum_{m=-M/2+1}^{M/2} F_{mk}\,\omega_M^{-mj}.$$

We have used $F_{mk}$ to label the intermediate array given by the inner sum. This simple device of splitting the double sum and defining the intermediate array $F_{mk}$ allows us to express the computation of $F_{jk}$ in the following two steps.

Procedure: Two-Dimensional Complex DFT

1. Compute the intermediate array $F_{mk}$. For each $m = -M/2+1 : M/2$, $k = -N/2+1 : N/2$, compute

$$F_{mk} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_{mn}\,\omega_N^{-nk}.$$

This is most easily visualized if we place the input $f_{mn}$ in a physical array in which $m$ and $j$ increase along rows of the array, and $n$ and $k$ increase along columns. Then this first step amounts to performing $M$ one-dimensional DFTs of length $N$ of each of the columns of the array. The resulting intermediate array $F_{mk}$ can be stored over the original array.

2. Compute the array $F_{jk}$. For each $j = -M/2+1 : M/2$, $k = -N/2+1 : N/2$, compute

$$F_{jk} = \frac{1}{M}\sum_{m=-M/2+1}^{M/2} F_{mk}\,\omega_M^{-mj}.$$

This step amounts to doing $N$ one-dimensional DFTs of length $M$ of each of the rows of the array $F_{mk}$. The result of this second step is just the array of DFT coefficients $F_{jk}$.

To summarize a fundamental theme that will be repeated often, it is possible to interpret and evaluate the two-dimensional DFT of an $M \times N$ array of data as a two-step process in which we

1. perform M DFTs of length N of the columns of the array, and then

2. perform N DFTs of length M of the rows of the array from step 1.
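The two sweeps can be sketched with a one-dimensional FFT routine applied along each axis in turn. This is a minimal illustration assuming NumPy, with the array stored as `f[m, n]` (so the sweep over $n$ runs along NumPy's axis 1, whatever one chooses to call rows and columns); the function names are hypothetical.

```python
import numpy as np

def dft2_two_step(f):
    """2-D DFT by two sweeps of 1-D DFTs, with 1/M and 1/N scaling."""
    M, N = f.shape
    # Step 1: for each fixed m, a length-N DFT over n gives the intermediate array.
    inter = np.fft.fft(f, axis=1) / N
    # Step 2: for each fixed k, a length-M DFT over m gives F[j, k].
    return np.fft.fft(inter, axis=0) / M

def dft2_two_step_swapped(f):
    """Same computation with the two sweeps interchanged."""
    M, N = f.shape
    inter = np.fft.fft(f, axis=0) / M
    return np.fft.fft(inter, axis=1) / N

rng = np.random.default_rng(2)
f = rng.standard_normal((8, 6)) + 1j * rng.standard_normal((8, 6))
reference = np.fft.fft2(f) / f.size

print(np.allclose(dft2_two_step(f), reference))
print(np.allclose(dft2_two_step_swapped(f), reference))
```

Both orderings agree with a direct two-dimensional transform, which confirms that the order of the sweeps is a matter of convenience (or of memory layout) rather than of correctness.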

It should be evident that the two steps could be interchanged: the two sweeps of DFTs could be done across the rows first and columns last. The multiplication by the scaling factors $1/M$ and $1/N$ is usually deferred until the end of the computation, if it is needed at all. The power of this formulation is that it reduces the entire two-dimensional DFT to one-dimensional DFTs. All that is needed is a driver program to organize the two steps and a one-dimensional DFT subprogram that can do DFTs of fairly arbitrary length. For maximum efficiency, an FFT should be employed in the DFT subprogram. A quick operation count is revealing. If the two-dimensional DFT is done explicitly (by the definition (5.1)) then it entails the product of an $MN \times MN$ matrix and an $MN$-vector, or roughly $M^2N^2$ operations. On the other hand, if the one-dimensional DFTs are done with FFTs, the two-dimensional DFT requires $N$ FFTs of length $M$ and $M$ FFTs of length $N$. Borrowing from Chapter 10 the fact that an $N$-point FFT requires on the order of $N \log N$ complex operations, we see that the two-dimensional DFT has a computational cost of roughly

$$N\,(M\log M) + M\,(N\log N) = MN\log(MN)$$

complex operations, which represents a significant savings over the explicit method.

The above formulation is precisely the overall strategy behind all multidimensional DFT computations. The extension to DFTs in three or more dimensions is immediate. In the three-dimensional case, three steps are needed, each requiring a set of two-dimensional DFTs along sets of parallel planes of data (problem 110). Each two-dimensional DFT can be done by the two-step procedure described above. There are many variations on this basic idea that are determined primarily by computer architecture. For single processor scalar computers, the implementation is straightforward. However, with vector computers, issues of data storage (for example, interleaving of data and stride length) determine how the arrays are stored and the order in which the two steps are performed [134]. For parallel or multiprocessing computers, additional issues of data availability and interprocessor communication arise [17], [18], [138]. Suffice it to say that when it comes to advanced computer architectures, there are still some open questions about the implementation of DFTs, and those questions will persist as long as there are new architectures.
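To get a feel for the size of the savings, the two operation counts can be compared for a typical grid. The sketch below is only an order-of-magnitude estimate: it assumes base-2 logarithms and ignores all constant factors, and the function names are illustrative.

```python
import math

def explicit_cost(M, N):
    """Matrix-vector product with an MN x MN matrix: roughly (MN)^2 operations."""
    return (M * N) ** 2

def fft_cost(M, N):
    """N FFTs of length M plus M FFTs of length N: roughly MN log2(MN) operations."""
    return N * M * math.log2(M) + M * N * math.log2(N)

M = N = 512
ratio = explicit_cost(M, N) / fft_cost(M, N)
print(explicit_cost(M, N), fft_cost(M, N), ratio)
```

For a $512 \times 512$ grid the estimate favors the FFT-based method by a factor of more than ten thousand.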

Example: A numerical two-dimensional DFT. A brief numerical example of a two-dimensional DFT may help to solidify the ideas of this section. Consider the function

$$f(x,y) = \sin\!\left(3\cos(y/2)\sin x\right)$$

on the domain $[-\pi,\pi] \times [-\pi,\pi]$. The function has a value of zero on the boundary of the domain and therefore has no discontinuities when extended periodically. It is sampled and the resulting array is used as input to the complex DFT implemented by the two-sweep method described above. (As we will see in the next section, the fact that $f$ is real-valued makes the complex DFT inefficient in both computation and storage.) The resulting DFT array is shown in Figure 5.9. Notice that the DFT coefficients $F_{jk}$ are even in the index $j$ ($F_{-j,k} = F_{jk}$), and odd in the index $k$ ($F_{j,-k} = -F_{jk}$). These symmetries will also be explained in the next section.

FIG. 5.9. The function $f(x,y) = \sin(3\cos(y/2)\sin x)$ is shown (top) on the domain $[-\pi,\pi] \times [-\pi,\pi]$. For $f$ sampled on a grid with $M = N = 24$, the two-dimensional DFT produces the arrays of transform coefficients shown below, where the real part is shown on the left and the imaginary part is on the right (with different vertical scales). Observe that as the function is primarily a sine, its DFT is dominated by the imaginary part, which has an amplitude approximately 10 times that of the real part.

5.5. Symmetric DFTs in Two Dimensions

The preceding section dealt with the properties of the DFT applied to the most general two-dimensional input array. We have already learned that when the input possesses certain symmetries, the one-dimensional DFT can be simplified both conceptually and computationally. The same economies can be realized in two (and more) dimensions.

The simplest and most important symmetry that arises in practice is simply that of a real input array. So let's assume that $f_{mn}$ is a real $M \times N$ array and then appeal to the definition of the two-dimensional complex DFT. Recall that for $j = -M/2+1 : M/2$ and $k = -N/2+1 : N/2$ we have

$$F_{jk} = \frac{1}{MN}\sum_{m=-M/2+1}^{M/2}\ \sum_{n=-N/2+1}^{N/2} f_{mn}\,\omega_M^{-mj}\,\omega_N^{-nk}.$$

The first step is to see what the real DFT in two dimensions actually looks like. It can be written out in an alternate form by using the real symmetry of the input array. Expanding the exponentials in the complex DFT in terms of sines and cosines, we have the following definition.

Two-Dimensional Real DFT (RDFT)

$$F_{jk} = \frac{1}{MN}\sum_{m}\sum_{n} f_{mn}\left[\cos\frac{2\pi mj}{M}\cos\frac{2\pi nk}{N} - \sin\frac{2\pi mj}{M}\sin\frac{2\pi nk}{N}\right] - \frac{i}{MN}\sum_{m}\sum_{n} f_{mn}\left[\sin\frac{2\pi mj}{M}\cos\frac{2\pi nk}{N} + \cos\frac{2\pi mj}{M}\sin\frac{2\pi nk}{N}\right]$$

This expression has been obtained by using the addition rules for the sine and cosine, plus the fact that the elements of $f_{mn}$ are real.

This representation immediately exhibits the periodicity of the transform coefficients: $F_{j\pm M,k} = F_{jk}$ and $F_{j,k\pm N} = F_{jk}$. It also has the advantage that the real and imaginary parts of $F_{jk}$ are quite evident. Notice that replacing $j$ and $k$ by $-j$ and $-k$, respectively, has the effect of negating the imaginary part of $F_{jk}$ and leaving the real part unchanged. Similarly, replacing $j$ and $-k$ by $-j$ and $k$, respectively, has the same effect. We can summarize this symmetry by writing

$$\mathrm{Re}\{F_{-j,-k}\} = \mathrm{Re}\{F_{jk}\} \qquad\text{and}\qquad \mathrm{Im}\{F_{-j,-k}\} = -\mathrm{Im}\{F_{jk}\},$$

or more simply $F_{-j,-k} = F_{jk}^*$.

In strict analogy with the one-dimensional case, we see that the DFT of a real sequence is conjugate symmetric. In practical terms, this means that only half of the DFT coefficients need to be computed. The remaining half is known immediately by symmetry. The particular half of the coefficients that one chooses to compute and store is somewhat arbitrary. Because of the conjugate symmetry, it is possible to work with the coefficients in any two adjacent quadrants of the frequency domain (Figure 5.10). The quadrant $j \ge 0$, $k \ge 0$ is familiar, and seems like a natural choice; we will also choose the quadrant $j \le 0$, $k \ge 0$. To be quite specific, we must compute the $F_{jk}$'s for $j = -M/2+1 : M/2$ and $k = 0 : N/2$.

Now let's do some bookkeeping to account for all of the transform coefficients. Since there are $MN$ real input data, we expect (indeed must have) exactly $MN$ independent real quantities in the output. Referring to Figure 5.10 and using the periodicity and symmetry of $F_{jk}$, the following observations are critical (problem 112).

(a) $F_{00}$, $F_{0,N/2}$, $F_{M/2,0}$, and $F_{M/2,N/2}$ are real.

(b) $F_{j0} = F_{-j,0}^*$ (which means $F_{j0}$ must be computed only for $j = 0 : M/2$).

(c) $F_{j,N/2} = F_{-j,N/2}^*$ (which means $F_{j,N/2}$ must be computed only for $j = 0 : M/2$).

(d) The remaining $F_{jk}$ for $j = -M/2+1 : M/2$ and $k = 1 : N/2-1$ are complex and have no symmetry (which means that their real and imaginary parts must be computed and stored).

FIG. 5.10. The DFT of a real $M \times N$ array $f_{mn}$ can be done in two stages. The result of computing $M$ real DFTs of length $N$ of the columns of $f_{mn}$ is a conjugate symmetric array $F_{mk}$ that can be stored as shown in the left figure. The result of computing $N$ DFTs of length $M$ of the rows of the $F_{mk}$ array is the conjugate symmetric DFT $F_{jk}$, which can be stored as shown in the right figure. The entire computation can be done in place, in the array originally occupied by the input array.

If we now count the number of real quantities that must be computed and stored, the tally looks like this:

$$\underbrace{4}_{(a)} + \underbrace{(M-2)}_{(b)} + \underbrace{(M-2)}_{(c)} + \underbrace{2M\left(\frac{N}{2}-1\right)}_{(d)} = MN,$$

where we have indicated the source of the four terms in the list above. The match is perfect: there are $MN$ real quantities on input and output. The practical consequence of this exercise is that since the input to the DFT consists of $MN$ real quantities, only half of the DFT coefficients, a total of $MN$ real quantities, needs to be computed and stored. Thus there is a factor-of-two savings in computation and storage over the complex DFT in the presence of the real symmetry in the input.
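The conjugate symmetry and the bookkeeping can both be checked numerically. The sketch below assumes NumPy, even $M$ and $N$, and indices $0 : M-1$, $0 : N-1$ (so that index $-j$ means $M - j$). It also shows that `np.fft.rfft2` returns exactly the half-array $k = 0 : N/2$ for all $j$.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 8, 6
f = rng.standard_normal((M, N))      # real input array
F = np.fft.fft2(f) / f.size

# Build the array G[j, k] = F[-j, -k]: reverse both axes, then roll by one
# so that index 0 stays at index 0.
F_neg = np.roll(F[::-1, ::-1], shift=(1, 1), axis=(0, 1))
conj_sym = np.allclose(F_neg, np.conj(F))

# Observation (a): four coefficients are real.
real_four = [F[0, 0], F[0, N // 2], F[M // 2, 0], F[M // 2, N // 2]]
four_real = all(abs(c.imag) < 1e-12 for c in real_four)

# The tally of independent real quantities, terms (a) through (d).
tally = 4 + (M - 2) + (M - 2) + 2 * M * (N // 2 - 1)

# rfft2 stores only k = 0 : N/2, i.e. M x (N/2 + 1) complex coefficients.
half = np.fft.rfft2(f) / f.size
print(conj_sym, four_real, tally == M * N, half.shape)
```

The half-array contains $M(N/2+1)$ complex numbers, slightly more than $MN$ reals, because it stores the redundant entries in the $k = 0$ and $k = N/2$ columns that observations (a) through (c) identify.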

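The actual computation of the real DFT can be formulated much as it was in the complex case. We start with the DFT definition (5.1)

$$F_{jk} = \frac{1}{MN}\sum_{m=-M/2+1}^{M/2}\ \sum_{n=-N/2+1}^{N/2} f_{mn}\,\omega_M^{-mj}\,\omega_N^{-nk}$$

for $j = -M/2+1 : M/2$ and $k = -N/2+1 : N/2$. Once again the evaluation of this double sum may be split into two stages:

$$F_{mk} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_{mn}\,\omega_N^{-nk}, \qquad F_{jk} = \frac{1}{M}\sum_{m=-M/2+1}^{M/2} F_{mk}\,\omega_M^{-mj}.$$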

Now the symmetries of the one-dimensional real DFT must be exploited to obtainan efficient two-step method Notice that the computation of the intermediate arrayFmk requires M real DFTs of length N of the columns of the array fmn. Therefore,the array Fmk has conjugate symmetry in its second argument: Fmk = Fm _k. Thismeans that the complex coefficients Fmk need to be computed only for k = 0 : N/2.

We would like the second stage of the procedure, the computation of $F_{jk}$, to involve only real DFTs as well. This can be done if we split the array $F_{mk}$ into its real and imaginary parts by writing $F_{mk} = \mathrm{Re}\{F_{mk}\} + i\,\mathrm{Im}\{F_{mk}\}$. We then have that the DFT coefficients $F_{jk}$ are given by

$$F_{jk} = \frac{1}{M}\sum_{m=-M/2+1}^{M/2}\mathrm{Re}\{F_{mk}\}\,\omega_M^{-mj} \;+\; \frac{i}{M}\sum_{m=-M/2+1}^{M/2}\mathrm{Im}\{F_{mk}\}\,\omega_M^{-mj}.$$

In this manner the second stage of the computation consists of real DFTs of length $M$ of the rows of the two arrays $\mathrm{Re}\{F_{mk}\}$ and $\mathrm{Im}\{F_{mk}\}$. But notice that these DFTs must be done only for the indices $j = -M/2+1 : M/2$ and $k = 0 : N/2$.

The data storage also works out exactly as one might hope provided the following observation is made: because of the conjugate symmetry of $F_{mk}$, $\mathrm{Im}\{F_{m0}\} = \mathrm{Im}\{F_{m,\pm N/2}\} = 0$, and therefore these terms do not need to be stored. The intermediate array $F_{mk}$ can be written over the input array by placing its real parts $\mathrm{Re}\{F_{mk}\}$ in the locations with $k = 0 : N/2$ and its imaginary parts in the locations with $k = -N/2+1 : -1$ (see Figure 5.10). The DFT coefficients $F_{jk}$ can also be stored with their real and imaginary parts in the same locations as the real and imaginary parts of $F_{mk}$. Therefore, the entire computation can be done in the $M \times N$ real array in which the input data arrive.

We can now summarize this two-step method for real DFTs given an $M \times N$ input array $f_{mn}$.

Procedure: Two-Dimensional Real DFT

1. For $m = -M/2+1 : M/2$, compute the $M$ real DFTs of length $N$ of the columns of the input array to form the arrays $\mathrm{Re}\{F_{mk}\}$ and $\mathrm{Im}\{F_{mk}\}$. For each $m$, these arrays are stored in rows $k = 0 : N/2$ and $k = -N/2+1 : -1$, respectively.

2. For $k = -N/2+1 : N/2$, compute the $N$ real DFTs of length $M$ of the rows of the arrays $\mathrm{Re}\{F_{mk}\}$ and $\mathrm{Im}\{F_{mk}\}$. The real parts of $F_{jk}$ will appear in the rows with $k = 0 : N/2$. The imaginary parts of $F_{jk}$ will appear in the rows with $k = -N/2+1 : -1$.



We see that, as in the complex case, the two-dimensional real DFT can be reduced to a sequence of one-dimensional DFTs; in this case, the one-dimensional DFTs are real DFTs. The entire calculation amounts to $M$ real DFTs of length $N$ and $N$ real DFTs of length $M$; in practice, these DFTs should be done with efficient symmetric forms of the FFT. The two steps can be interchanged, and one order may be preferable in light of computational constraints. The multiplication by the scaling factors $1/M$ and $1/N$ is generally deferred until the end of the calculation, if it is needed at all.
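As a concrete check of the two-stage idea, here is a minimal pure-Python sketch. It assumes the scaled DFT convention with $\omega_N = e^{i2\pi/N}$ and centered indices; the helper names (`centered`, `dft1`, `dft2_two_stage`, `wrap`) are illustrative, and plain $O(N^2)$ sums stand in for the symmetric FFTs that a practical code would use. The sketch does not exploit the real symmetry to halve the work; it only confirms that the two-stage result has the conjugate symmetry on which the storage scheme relies.

```python
import cmath

def centered(N):
    """Index set -N/2+1 : N/2 (N assumed even)."""
    return range(-N // 2 + 1, N // 2 + 1)

def dft1(x, N):
    """Scaled 1D DFT: F_k = (1/N) * sum_n x[n] * exp(-i 2 pi n k / N)."""
    return {k: sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                   for n in centered(N)) / N
            for k in centered(N)}

def dft2_two_stage(f, M, N):
    """2D DFT of f[m, n]: 1D DFTs over n first, then 1D DFTs over m."""
    inter = {}
    for m in centered(M):                    # stage 1: M DFTs of length N
        line = dft1({n: f[m, n] for n in centered(N)}, N)
        for k in centered(N):
            inter[m, k] = line[k]
    F = {}
    for k in centered(N):                    # stage 2: N DFTs of length M
        line = dft1({m: inter[m, k] for m in centered(M)}, M)
        for j in centered(M):
            F[j, k] = line[j]
    return F

def wrap(i, N):
    """Use periodicity to map any index into -N/2+1 : N/2."""
    i %= N
    return i - N if i > N // 2 else i

M, N = 4, 6
f = {(m, n): float((3 * m * m + 2 * n - m * n) % 7)    # arbitrary real data
     for m in centered(M) for n in centered(N)}
F = dft2_two_stage(f, M, N)

# Conjugate symmetry F[j,k] = conj(F[-j,-k]), with indices taken periodically.
worst = max(abs(F[j, k] - F[wrap(-j, M), wrap(-k, N)].conjugate())
            for j in centered(M) for k in centered(N))
print("largest symmetry violation:", worst)
```

Because $-M/2$ and $M/2$ name the same index under periodicity, the symmetry check also forces $F_{0,0}$, $F_{0,N/2}$, $F_{M/2,0}$, and $F_{M/2,N/2}$ to be real.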

What about the inverse of the forward two-dimensional real DFT? The inverse deserves attention and illustrates some fundamental tricks that can be used in other symmetric DFTs. Imagine that we are now given an array of DFT coefficients $F_{jk}$ with the conjugate symmetry $F_{jk} = F^*_{-j,-k}$. The goal is to reconstruct the real array $f_{mn}$ given by

$$f_{mn} = \sum_{j=-M/2+1}^{M/2}\;\sum_{k=-N/2+1}^{N/2} F_{jk}\,\omega_M^{mj}\,\omega_N^{nk}$$

for $n = -N/2+1 : N/2$ and $m = -M/2+1 : M/2$. We now look for ways to exploit the symmetry of $F_{jk}$, and we begin by separating the double sum of the inverse DFT as follows:

Matters will quickly become overly complicated without a notational device to prevent further proliferation of ink. We will continue to use the convention that $\Sigma''$ indicates a sum in which the first and last terms are weighted by one-half and all other terms are weighted by one. With this notation we can rewrite this last line as

We may now split the outer sum in the same way ($j = 0$, $j = 1 : M/2-1$, $j = M/2$). In a rather schematic form the result can be written as

These antics appear to be getting out of hand. Fortunately, it is now possible to use the $\Sigma''$ notation for the sum on $j$ as well. With the aid of this notation (and problem 113), the previous expression for $f_{mn}$ can miraculously be reduced to

Notice how the four terms of this sum can be grouped in pairs to take advantage of the conjugate symmetry. The final step is to express $f_{mn}$ in terms of purely real quantities. Recall that complex numbers $z$ satisfy $z + z^* = 2\,\mathrm{Re}\{z\}$. With this fact, we claim that the inverse real DFT in two dimensions is given by the following definition.

Two-Dimensional Inverse Real DFT (IRDFT)

We have called this expression the inverse real DFT, despite the fact that its input consists of conjugate symmetric (hence complex) data; it does produce real output. This expression appears rather cumbersome, but it has a few useful purposes. It demonstrates that the array $f_{mn}$ consists of real numbers (as anticipated) that are $M$-periodic in the first index and $N$-periodic in the second index. A careful counting also confirms that the $F_{jk}$'s actually used in the representation (5.16) consist of exactly $MN$ real quantities. This corroborates our previous tally, in which it was shown that there are $MN$ real data points for the input and output. This means that the inverse real DFT can also be done with one-half of the computation and storage of the complex DFT. It should be mentioned that the form given by (5.16) is never used for actual computation. Just as we did for the forward real DFT, the inverse real DFT can be reduced to a two-step procedure that requires only one-dimensional inverse real DFTs (implemented as FFTs) and $MN$ real storage locations (problem 106).

All of the foregoing ideas can be used to further advantage when we deal with other symmetries. For example, consider an input sequence $f_{mn}$ that is real and has the even symmetry:

$$f_{-m,n} = f_{m,-n} = f_{mn}.$$

Observe that if this symmetry is known to exist in the data, it is necessary to store only one-quarter of the actual array. In order to mimic the development that we used in the one-dimensional case and to anticipate the most useful form of the final result, we shall assume that the input array is indexed with $m = -M+1 : M$ and $n = -N+1 : N$; in other words, there are $2M \times 2N$ real data points in the input array. We will now proceed as we did in the real case: first working out the symmetries that appear in the transform coefficients because of the even symmetry of the input data, then describing the expected two-step computational process. What does the two-dimensional DFT of a real even array actually look like when it is written out? We can start with the complex DFT (5.1) in the form

$$F_{jk} = \frac{1}{2M}\sum_{m=-M+1}^{M}\omega_{2M}^{-mj}\left[\frac{1}{2N}\sum_{n=-N+1}^{N} f_{mn}\,\omega_{2N}^{-nk}\right]$$

and then use the symmetry of $f_{mn}$ in each component separately. With the symmetry that $f_{m,-n} = f_{mn}$, we can write

$$F_{jk} = \frac{1}{2M}\sum_{m=-M+1}^{M}\omega_{2M}^{-mj}\left[\frac{1}{N}\sideset{}{''}\sum_{n=0}^{N} f_{mn}\cos\frac{\pi nk}{N}\right],$$

where we have again appealed to the $\Sigma''$ notation to denote a sum whose first and last terms are weighted by one-half. The fact that $\omega_{2N}^{nk} + \omega_{2N}^{-nk} = 2\cos(\pi nk/N)$ has also been used. The symmetry $f_{-m,n} = f_{mn}$ may now be used on the outer sum to obtain

$$F_{jk} = \frac{1}{MN}\sideset{}{''}\sum_{m=0}^{M}\;\sideset{}{''}\sum_{n=0}^{N} f_{mn}\cos\frac{\pi mj}{M}\cos\frac{\pi nk}{N}. \tag{5.17}$$

We can now unravel the symmetries in the transform $F_{jk}$ that are induced by the symmetries of the input $f_{mn}$. First it is clear that all of the coefficients $F_{jk}$ are real since the expression (5.17) involves only real quantities. Furthermore, because the cosine is even in its argument, the $F_{jk}$'s are even in the indices $j$ and $k$; that is, $F_{-j,k} = F_{jk}$ and $F_{j,-k} = F_{jk}$. Thus, from the coefficients with $j = 0 : M$ and $k = 0 : N$, it is possible to construct the entire array $F_{jk}$. Notice that the bookkeeping works out as it should: with the even symmetry of the arrays $f_{mn}$ and $F_{jk}$, there are exactly $(M+1)(N+1)$ distinct real quantities in both arrays.

Let's stand back and see what we have done. First we noted that because of its symmetry, only one-quarter of the input array corresponding to $m = 0 : M$ and $n = 0 : N$ needs to be stored. Indeed, we see from expression (5.17) that only these input elements are required in the computation of the DFT. Furthermore, we see that because of the symmetry in the coefficients $F_{jk}$, only one-quarter of the transform array needs to be computed, corresponding to $j = 0 : M$ and $k = 0 : N$. We might ask why we even deal with the unused three-quarters of the data. This question prompts us to proceed just as we did in the one-dimensional case: we create a new transform that uses only the independent data points and produces only the coefficients that are actually needed. We have just derived the two-dimensional discrete cosine transform, the formula for which follows.

Two-Dimensional Discrete Cosine Transform (DCT)

$$F_{jk} = \frac{1}{MN}\sideset{}{''}\sum_{m=0}^{M}\;\sideset{}{''}\sum_{n=0}^{N} f_{mn}\cos\frac{\pi mj}{M}\cos\frac{\pi nk}{N}, \tag{5.18}$$

which is computed for $j = 0 : M$ and $k = 0 : N$. Notice that there is no longer any assumption about the symmetry of the input array $f_{mn}$, apart from the fact that its elements are real. And there is no particular symmetry in the transform coefficients, except that they are real also.
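The claim that the DCT is nothing more than the DFT of an even array, computed from one-quarter of the data, can be checked numerically. The sketch below is pure Python under stated assumptions: the DCT carries the $1/(MN)$ scaling and half-weighted end terms ($\Sigma''$), the full array is the even $2M \times 2N$ extension indexed by $m = -M+1 : M$ and $n = -N+1 : N$, and the function names are ours.

```python
import cmath
import math

def dct2(f, M, N):
    """2D DCT of f[m][n], m = 0:M, n = 0:N, with half-weighted end terms."""
    def w(i, L):                        # Sigma'' weight: 1/2 at both ends
        return 0.5 if i in (0, L) else 1.0
    return [[sum(w(m, M) * w(n, N) * f[m][n]
                 * math.cos(math.pi * m * j / M) * math.cos(math.pi * n * k / N)
                 for m in range(M + 1) for n in range(N + 1)) / (M * N)
             for k in range(N + 1)]
            for j in range(M + 1)]

def dft2_even_extension(f, M, N, j, k):
    """Coefficient (j, k) of the complex DFT of the even 2M x 2N extension."""
    total = 0j
    for m in range(-M + 1, M + 1):
        for n in range(-N + 1, N + 1):
            # even symmetry: f(-m, n) = f(m, -n) = f(m, n)
            total += f[abs(m)][abs(n)] * cmath.exp(
                -1j * math.pi * (m * j / M + n * k / N))
    return total / (4 * M * N)

M, N = 3, 4
f = [[1.0 / (1 + m + 2 * n) for n in range(N + 1)] for m in range(M + 1)]
C = dct2(f, M, N)
gap = max(abs(C[j][k] - dft2_even_extension(f, M, N, j, k))
          for j in range(M + 1) for k in range(N + 1))
print("largest difference between DCT and DFT of even extension:", gap)
```

The difference includes the imaginary part of the complex DFT, so a value at rounding level also confirms that the DFT of the even array is real.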

It is now possible (problem 108) to determine the inverse of the DCT given in (5.18). As in the one-dimensional case, the two-dimensional DCT is its own inverse up to multiplicative constants.

Two-Dimensional Inverse Discrete Cosine Transform (IDCT)

$$f_{mn} = 4\,\sideset{}{''}\sum_{j=0}^{M}\;\sideset{}{''}\sum_{k=0}^{N} F_{jk}\cos\frac{\pi mj}{M}\cos\frac{\pi nk}{N} \tag{5.19}$$

for $m = 0 : M$ and $n = 0 : N$.

A fact of importance in many applications of the DCT is that if either of the arrays $f_{mn}$ or $F_{jk}$ given by (5.18) or (5.19) is extended beyond the indices $m, j = 0 : M$ or $n, k = 0 : N$, then the resulting arrays are periodic with periods $2M$ and $2N$ in the first and second indices, respectively. Furthermore, both arrays are even in both directions: $f_{-m,n} = f_{mn}$, $f_{m,-n} = f_{mn}$ and $F_{-j,k} = F_{jk}$, $F_{j,-k} = F_{jk}$.

We can now look at the computational question and, not surprisingly, develop a two-step method for evaluating the two-dimensional DCT. Separating the two sums of the forward transform (5.18), we have that

$$F_{jk} = \frac{1}{M}\sideset{}{''}\sum_{m=0}^{M}\cos\frac{\pi mj}{M}\,\underbrace{\left[\frac{1}{N}\sideset{}{''}\sum_{n=0}^{N} f_{mn}\cos\frac{\pi nk}{N}\right]}_{F_{mk}}$$

for $j = 0 : M$ and $k = 0 : N$. Once again we have defined the intermediate array $F_{mk}$, which, for each fixed $m = 0 : M$, is simply the $(N+1)$-point DCT of the $m$th column of the input array $f_{mn}$. Thus the computation of the intermediate array $F_{mk}$ requires $M+1$ DCTs of length $N+1$. In a manner that should now be familiar, this intermediate array is used for another wave of DCTs in the opposite direction. For fixed $j$ and $k$, we compute

$$F_{jk} = \frac{1}{M}\sideset{}{''}\sum_{m=0}^{M} F_{mk}\cos\frac{\pi mj}{M},$$

which holds the one-dimensional $(M+1)$-point DCT of the $k$th row of the intermediate array $F_{mk}$. The cost of doing these transforms is $N+1$ DCTs of length $M+1$.

Thus the two-dimensional DCT of a real input array $f_{mn}$ can be formulated in a two-step process. Given the array $f_{mn}$ for $m = 0 : M$ and $n = 0 : N$, we proceed directly.

Procedure: Two-Dimensional DCT

1. perform $M+1$ one-dimensional DCTs of length $N+1$ along the columns of $f_{mn}$ and store (or overwrite) the results in an intermediate array $F_{mk}$, and then

2. perform $N+1$ one-dimensional DCTs of length $M+1$ along the rows of the array $F_{mk}$ and store (or overwrite) the DCT coefficients $F_{jk}$.


We remark as before that all of the one-dimensional DCTs should be done with symmetric FFTs, that the two steps may be done in either order, and that the normalization by the factors $1/M$ and $1/N$ can be postponed until the end of the calculation (if it is needed at all).
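A minimal sketch of the two-step procedure follows, again in pure Python. Assumptions to note: each one-dimensional DCT is written as a direct $O(N^2)$ sum rather than a symmetric FFT, the forward transform carries the $1/N$ scaling with half-weighted end terms, and the one-dimensional inverse is taken to be $f_n = 2\,\Sigma''_k F_k \cos(\pi nk/N)$, which is the scaling that makes the pair invert exactly. The function names are illustrative.

```python
import math

def dct1(x):
    """(N+1)-point DCT: F_k = (1/N) * sum'' x[n] cos(pi n k / N), k = 0:N."""
    N = len(x) - 1
    return [sum((0.5 if n in (0, N) else 1.0) * x[n] * math.cos(math.pi * n * k / N)
                for n in range(N + 1)) / N
            for k in range(N + 1)]

def idct1(F):
    """Inverse of dct1: the same sum, scaled by 2 instead of 1/N."""
    N = len(F) - 1
    return [2.0 * N * v for v in dct1(F)]

def transform_2d(f, line_transform):
    """Apply a 1D transform along the second index, then along the first."""
    stage1 = [line_transform(row) for row in f]            # M+1 transforms of length N+1
    stage2 = [line_transform(list(col)) for col in zip(*stage1)]  # N+1 of length M+1
    return [list(row) for row in zip(*stage2)]             # back to [first][second] order

M, N = 3, 5
f = [[math.sin(0.7 * m) + 0.3 * n * n for n in range(N + 1)] for m in range(M + 1)]

F = transform_2d(f, dct1)      # forward two-dimensional DCT
g = transform_2d(F, idct1)     # inverse two-dimensional DCT

err = max(abs(f[m][n] - g[m][n]) for m in range(M + 1) for n in range(N + 1))
print("round-trip error:", err)
```

Reusing `dct1` inside `idct1` is the computational form of the statement that the DCT is its own inverse up to multiplicative constants; the factor of 2 in each direction accounts for the overall factor of 4 in two dimensions.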

There is one other major symmetry that must be mentioned. We will do little more than record it and leave the derivation as a stimulating problem (problem 109). The case in which the input sequence $f_{mn}$ is real and has the odd symmetry

$$f_{-m,n} = f_{m,-n} = -f_{mn}$$

arises frequently. It leads to the two-dimensional discrete sine transform (DST). Given an arbitrary real $M \times N$ input array $f_{mn}$, this transform pair has the following form.

Two-Dimensional Discrete Sine Transform (DST)

$$F_{jk} = \frac{1}{MN}\sum_{m=1}^{M-1}\sum_{n=1}^{N-1} f_{mn}\sin\frac{\pi mj}{M}\sin\frac{\pi nk}{N} \tag{5.20}$$

for $j = 1 : M-1$, $k = 1 : N-1$.

Two-Dimensional Inverse Discrete Sine Transform (IDST)

$$f_{mn} = 4\sum_{j=1}^{M-1}\sum_{k=1}^{N-1} F_{jk}\sin\frac{\pi mj}{M}\sin\frac{\pi nk}{N} \tag{5.21}$$

for $m = 1 : M-1$, $n = 1 : N-1$. There are several important properties of this transform pair which are also stated in problem 109.

The picture may seem to be complete, and indeed it nearly is. However, we should not close without saying that the above symmetries may be combined to form hybrid transforms. For example, if an array must be even in one direction and odd in the other direction, then one could combine a DCT in one direction with a DST in the other (problem 114). There are also situations, particularly in the solution of differential equations, in which DFTs with other symmetries (for example, quarter-wave or staggered grid symmetries) must be used [134], [142], [144]. In all of these variations, in any number of dimensions, the theme is always the same: multidimensional DFTs can always be reduced to a sequence of one-dimensional DFTs, and symmetries always lead to savings in storage and computation.

5.6. Problems

In all of the following problems assume that $j, m = -M/2+1 : M/2$, and $k, n = -N/2+1 : N/2$, unless otherwise stated.

89. Two-dimensional orthogonality and inverse property. Use the two-dimensional orthogonality of the complex exponential,

to show that the forward DFT given by

has the inverse

(Hint: Substitute the expression for $f_{mn}$ into the expression for $F_{jk}$ (or vice versa), choose your indices carefully, and show that an identity results.)

90. DFTs with separable input. Show that if the input to a DFT has the form $f_{mn} = g_m h_n$, then the two-dimensional DFT is given by $F_{jk} = G_j H_k$, where $G_j$ and $H_k$ are the one-dimensional DFTs of $g_m$ and $h_n$, respectively.

91. DFTs of complex exponentials. Let $j_0$ and $k_0$ be a pair of fixed frequency indices. Show that

92. Two-dimensional DFT properties. Let $\mathcal{D}$, $\mathcal{C}$, and $\mathcal{S}$ denote the discrete two-dimensional DFT, DCT, and DST, respectively, of dimension $M \times N$. Verify the following properties, where $f_{mn}$ and $g_{mn}$ are input arrays and $F_{jk}$ and $G_{jk}$ are the corresponding arrays of transform coefficients.

(a) Periodicity of $\mathcal{D}$:

(b) Periodicity of $\mathcal{S}$ or $\mathcal{C}$:

(c) Linearity of $\mathcal{D}$, $\mathcal{S}$, and $\mathcal{C}$: $\mathcal{T}\{\alpha f_{mn} + \beta g_{mn}\} = \alpha\,\mathcal{T}\{f_{mn}\} + \beta\,\mathcal{T}\{g_{mn}\}$, where $\mathcal{T}$ stands for any of the three transforms and $\alpha$ and $\beta$ are constants.

(d) Shift:

(e) Rotation:

(f) Shift:

with the same property holding for the DST.

(g) Convolution: where

93. Analytical DFTs. Assume that the following functions $f$ are defined on the set $\mathcal{I} = \{(x, y) : -\pi < x < \pi,\ -\pi < y < \pi\}$. If $f$ is sampled with $M$ and $N$ points in the $x$- and $y$-directions, respectively, find the two-dimensional DFT of each function.

94. Two-dimensional DFT matrix. Assume that the input $f_{mn}$ to a two-dimensional DFT is ordered by rows ($m$ varying most quickly) to form a vector $\mathbf{f}$ of length $MN$. Assume the transform array $F_{jk}$ is ordered in the same way to form a vector $\mathbf{F}$ of length $MN$. Let $\mathbf{W}$ be the two-dimensional DFT matrix such that $\mathbf{F} = \mathbf{W}\mathbf{f}$. Write out several representative rows and columns of $\mathbf{W}$ for a grid with $M = N = 4$. What is the size of the matrix? Note the structure and symmetries of this matrix.

95. Wavelengths of continuous modes. Show that the continuous DFT mode on $\mathcal{I}$ given by

has a wavelength measured in the units of $x$ and $y$ of

Furthermore, show that the phase lines of this mode have an angle of

with respect to the $x$-axis.

96. Wavelengths of discrete modes. Verify that the discrete DFT mode

has a wavelength of

How do you interpret the units of the wavelength? Show also that the frequency of this mode is given by

and that the phase lines have an angle of

with respect to the direction indexed by $m$.

97. Phase lines and frequency vectors. As described in the text, let $\theta_{jk}$ be the angle that the phase lines of the $(j, k)$ mode make with the $x$-axis. Let $\psi_{jk}$ be the angle between the frequency vector $(\omega, \sigma)$ and the $\omega$-axis. Show that $\theta_{jk}$ and $\psi_{jk}$ differ by $\pi/2$ radians.

98. Geometry of two-dimensional modes. Consider the following modes on the domain $\mathcal{I}$ with $A = 2\pi$ and $B = 4\pi$.

For each mode,

i. Find the wavelength $\lambda$ and the wavelengths $\mu$ and $\eta$ in the $x$- and $y$-directions. Verify that $\lambda^2 = (\mu^{-2} + \eta^{-2})^{-1}$.

ii. Find the frequency $\nu$ and its components $\omega$ and $\sigma$.

iii. Draw the domain $\mathcal{I}$ (to scale) and the phase lines corresponding to the crests (maximum values) of the mode.

iv. Find and indicate the angle $\theta$ that the phase lines make with the $x$-axis.

v. On this same plot, indicate the direction of the frequency vector and the angle $\psi$ that it makes with the $x$-axis.

vi. Verify that $\theta$ and $\psi$ differ by $\pi/2$ radians.

99. More geometry. Assume that a particular DFT mode has a phase line corresponding to a crest that passes through the point $(-A/2, -B/2)$. Show that there must also be phase lines for crests passing through $(A/2, -B/2)$, $(-A/2, B/2)$, and $(A/2, B/2)$.

100. High frequency modes. Assume that an $M \times N$ grid is placed on a spatial domain. Draw a $3 \times 3$ block of points of this grid. Mark the points at which the mode

has the values $0, \pm 1$ for the modes with

Find the wavelengths and frequencies of these modes.

101. Geometry of the anti-aliasing condition. Refer to the spatial grid shown in Figure 5.6 in which $\Delta\xi$ is the local grid spacing in the direction perpendicular to the phase lines, and $\Delta x$ and $\Delta y$ are the grid spacings in the $x$- and $y$-directions.

(a) Show that $\Delta\xi = \Delta x\cos\psi$ and $\Delta\xi = \Delta y\sin\psi$.

(b) Show that the local anti-aliasing condition $\Delta\xi < \lambda/2$ implies that


102. Algebra of the anti-aliasing condition. Assume that a single continuous mode is given with a frequency that has magnitude $\nu$ and an angle $\psi$ with respect to the $\omega$-axis. Let $M$ and $N$ be the minimum number of grid points in the $x$- and $y$-directions needed to resolve this mode (and avoid aliasing). Show algebraically that the conditions

(equations (5.8) and (5.9) of the text) imply that $M = 2A\nu\cos\psi$, $N = 2B\nu\sin\psi$, and that

are maximum grid spacings that will resolve the mode.

103. Anti-aliasing recipe. Show that the simple graphical procedure given in the anti-aliasing recipe actually gives the maximum grid spacings $\Delta x$ and $\Delta y$ that insure no aliasing of a single mode.

104. Aliasing. Consider the modes

on the domain $\mathcal{I}$ with $A = 2$ and $B = 4$. What are the minimum number of grid points $M$ and $N$ and the maximum grid spacings $\Delta x$ and $\Delta y$ required to avoid aliasing of these modes?

105. Aliasing. You are told that on a domain $\mathcal{I}$ with $A = \pi$ and $B = 4\pi$ the highest frequency that appears in the input is $\nu = 500/\pi$ (cycles per unit length). How would you design a grid (with a minimal number of grid points) to avoid aliasing of this signal? If you are also told that this maximum frequency occurs in a direction $\psi = \tan^{-1}(4/3)$, what are the minimum values of $M$ and $N$ that insure no aliasing?

106. Inverse real DFT algorithm. Describe a two-step algorithm analogous to that used for the forward real DFT that takes an $M \times N$ conjugate symmetric input array and produces a real output array. Use the symmetries to minimize the storage and computation requirements. Indicate clearly how the $MN$ storage locations should be allocated.

107. Alternate index sets. Find the complex and real two-dimensional DFT pairs for the index sets $m, j = 0 : M-1$ and $n, k = 0 : N-1$.

108. Inverse DCT. Verify that the two-dimensional DCTs given by (5.18) and (5.19) form an inverse pair, and that the DCT is its own inverse up to a multiplicative constant.

109. Two-dimensional DST. Following the development of the two-dimensional DCT in the text, carry out the derivation and analysis of the two-dimensional DST.

(a) Assume that the input array has dimensions $2M \times 2N$. Show that the odd symmetry $f_{m,-n} = -f_{mn}$ and $f_{-m,n} = -f_{mn}$ together with the periodicity $f_{m\pm 2M,n} = f_{mn}$ and $f_{m,n\pm 2N} = f_{mn}$ imply that $f_{mn} = 0$ whenever $m = \pm M$, $m = 0$, $n = \pm N$, or $n = 0$.

(b) Starting with a real $2M \times 2N$ input array $f_{mn}$ with odd symmetry in both indices, use the complex DFT (5.1) to derive a transform that uses only sine modes and the input values for $m = 1 : M-1$ and $n = 1 : N-1$.

(c) Notice that $F_{-j,k} = -F_{jk}$ and $F_{j,-k} = -F_{jk}$, and conclude that only one-quarter of the coefficients $F_{jk}$ need to be computed.

(d) Discarding unused coefficients, write the forward two-dimensional DST for an arbitrary real $M \times N$ input sequence.

(e) Verify that up to a multiplicative constant the DST is its own inverse.

(f) Show that the array $f_{mn}$ given by the inverse DST (5.21) when extended is $2M$-periodic in the $m$-direction, $2N$-periodic in the $n$-direction. Furthermore, the array is odd in both directions.

110. Three-dimensional DFTs. Given an $M \times N \times P$ input array $f_{mnp}$, its three-dimensional complex DFT is defined as

(a) Describe how you would design an efficient three-step method for evaluating this DFT.

(b) Carefully itemize the cost of the computation in terms of the number of one-dimensional DFTs of length $M$, $N$, and $P$.

(c) How much does this computation cost in terms of complex arithmetic operations if the one-dimensional DFTs are done explicitly by matrix-vector multiplication?

(d) How much does the computation cost if the one-dimensional DFTs are done with FFTs? (Assume that an $N$-point FFT requires on the order of $N\log N$ operations.)

111. Three-dimensional aliasing. Consider the single mode

on a parallelepiped domain with physical dimensions $8 \times 4 \times 6$. What are the minimum numbers of grid points $M$, $N$, and $P$ that should be used in the $x$-, $y$-, and $z$-directions, respectively, to insure no aliasing of this mode?

112. Properties of the real DFT. Consider the real periodic transform pair given by (5.14) and (5.16).

(a) Show that the transform coefficients $F_{jk}$ have the following symmetries.

(b) Verify that there are exactly $MN$ distinct real quantities $F_{jk}$ defined by the forward transform (5.14) that correspond to the $MN$ real quantities of the input array $f_{mn}$.

113. $\Sigma''$ notation for the inverse real DFT. Show that the $\Sigma''$ notation allows the expression (5.15) for the inverse real DFT to be reduced to the form given in (5.16).

114. Other symmetries. What symmetry do you expect in the DFT of an input array that has the symmetry

Design an efficient DFT for input with this symmetry.

115. Fourier series to the DFT. Assume that a function $f$ is a continuous function on the rectangle $\mathcal{I}$ with

and

The two-dimensional Fourier series representation for $f$ on this region is given by

where

Use this Fourier series pair to derive (or at least motivate) the two-dimensional DFT pair (5.1) and (5.2).

Chapter 6

An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem.
- John Tukey

Errors in the DFT

6.1 Introduction

6.2 Periodic, Band-Limited Input

6.3 Periodic, Non-Band-Limited Input

6.4 Replication and Poisson Summation

6.5 Input with Compact Support

6.6 General Band-Limited Functions

6.7 General Input

6.8 Errors in the Inverse DFT

6.9 DFT Interpolation; Mean Square Error

6.10 Notes and References

6.11 Problems

6.1. Introduction

Chapters 2 and 3 were devoted to introducing the DFT and discovering some of its important properties. At that time, we began to establish the relationships among the DFT, Fourier series, and Fourier transforms in rather qualitative terms. Now it is time to make these relationships more precise. Our plan is to proceed systematically and investigate a sequence of cases that will ultimately cover all of the common uses of the DFT. As we will see, the form of the input sequence dictates how the DFT is used and how its output should be interpreted. In some cases, the DFT will provide approximations to the Fourier coefficients of the input; in other cases, the DFT will provide approximations to (samples of) the Fourier transform of the input. The goal of this chapter is to understand exactly what the DFT approximates in each case, and then to estimate the size of the errors in those approximations. Most of the chapter will be devoted to errors in the DFT. However, it is also illuminating to explore the inverse DFT (IDFT) and to appreciate the complementarity that exists between the two transforms and their approximations. For the most part, we will be concerned with pointwise errors. For example, if the DFT component $F_k$ is used to approximate the Fourier coefficient $c_k$, what is the error of that particular approximation? In the last section, however, we will look at errors in the DFT from a different perspective. When the DFT is viewed as an interpolating function for a given function, we are more interested in the error of the entire approximation, rather than the error at individual points. In this case, it makes sense to ask about the mean square error (or integrated error) in DFT approximations. Fortunately, the tools we will develop to analyze pointwise errors can also be applied to estimate the error in DFT interpolation.

This chapter walks a fine line between practical and theoretical realms. On one hand, understanding the uses of the DFT and the sources of error in the DFT is an immensely practical subject. No practitioner can use the DFT successfully without understanding what the input means, how errors arise, and how the output should be interpreted. On the other hand, it is impossible to estimate errors or even understand their sources without an occasional excursion into technical territory. This chapter presents the roughest terrain in the book. It is meant to be an exhaustive coverage of DFT errors in all of the various cases in which the DFT is used. No concessions have been made to brevity or terseness. The chapter does include formal theorems, and proofs will be sketched to the extent that interested readers can provide the details. Meaning and insight will be added to the theorems with detailed analytical and numerical examples. In this way a complete and balanced coverage of this important topic can hopefully be achieved.

Before embarking on this journey, we must pause and define a new semitechnical term that (to our knowledge) does not have currency in the literature. In working with DFTs, there are occasional technicalities that can arise unexpectedly. These "details" (for example, the treatment of endpoints or the issue of even/odd number of DFT points) occur often enough and in so many different forms that they deserve a name. We have coined the felicitous term pester to describe these various irritations. Despite their many forms, there are two properties that all pesters share: first, they cannot be ignored or they will cause trouble; second, once dealt with, we have never met a pester that did not ultimately hold a hidden lesson about DFTs. With that important definition, we will begin an exploration of errors in the DFT, pesters and all.

6.2. Periodic, Band-Limited Input

There is a very natural place to begin this discussion about errors in the DFT, and that is with periodic functions and sequences. As we saw in Chapter 2, if $f$ is a piecewise smooth function on the interval $[-A/2, A/2]$,¹ then it has a representation as a Fourier series of the form

$$f(x) = \sum_{k=-\infty}^{\infty} c_k\, e^{i2\pi kx/A}, \tag{6.1}$$

where the complex coefficients $c_k$ are given by the integrals

$$c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\, e^{-i2\pi kx/A}\,dx.$$

The series in (6.1) converges to the function $f$ on the interval $[-A/2, A/2]$, to its periodic extension outside that interval, and to its average value at points of discontinuity. This series describes how the function $f$ can be assembled as a linear combination of modes (sines and cosines), all of which have an integer number of periods on the interval $[-A/2, A/2]$. The $k$th mode has exactly $k$ periods (or wavelengths) on the interval $[-A/2, A/2]$, and thus, it has a frequency of $k/A$ cycles per unit length. The coefficient $c_k$ is simply the amount by which the $k$th mode is weighted in this representation of $f$.

¹ We repeat the convention adopted in previous chapters of using the closed interval $[-A/2, A/2]$ to denote the interval on which a function $f$ is sampled or expanded in its Fourier series. The sampled value assigned to $f$ at the right endpoint ($f_{N/2}$) is the average of its endpoint values.

We are now in a position to investigate our first question: how well does the DFT approximate the Fourier coefficients of $f$? Assume that the given function $f$ has period $A$; this includes the situation in which $f$ may be defined only on the interval $[-A/2, A/2]$ and is then extended periodically. Something rather remarkable can be discovered right away with one simple calculation. Imagine that the periodic function $f$ is sampled on the interval $[-A/2, A/2]$ at the uniformly spaced points

$$x_n = \frac{nA}{N}, \qquad n = -\frac{N}{2}+1 : \frac{N}{2}.$$

Denoting the sampled values $f_n = f(x_n)$, we can use the Fourier series for $f$ to write

$$f_n = \sum_{j=-\infty}^{\infty} c_j\, e^{i2\pi j x_n/A} = \sum_{j=-\infty}^{\infty} c_j\,\omega_N^{nj}.$$

Since $f_n$ is a periodic sequence of length $N$, we might contemplate taking its DFT. It is not difficult to determine the outcome:

$$F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n\,\omega_N^{-nk} = \frac{1}{N}\sum_{n=-N/2+1}^{N/2}\left(\sum_{j=-\infty}^{\infty} c_j\,\omega_N^{nj}\right)\omega_N^{-nk}$$

for $k = -N/2+1 : N/2$. If we now rearrange the order of summation and combine the exponentials, we have that

$$F_k = \sum_{j=-\infty}^{\infty} c_j\,\underbrace{\frac{1}{N}\sum_{n=-N/2+1}^{N/2}\omega_N^{n(j-k)}}_{1 \text{ if } j-k \text{ is a multiple of } N,\ 0 \text{ otherwise}}. \tag{6.3}$$

Look who appears! As indicated by the braces, we need only to use the discrete orthogonality relation for the DFT (see Chapter 2) to simplify the inner sum. The double sum in (6.3) then collapses and we are left with the following result.

Discrete Poisson Summation Formula

$$F_k = \sum_{j=-\infty}^{\infty} c_{k+jN} \qquad \text{for } k = -\frac{N}{2}+1 : \frac{N}{2}. \tag{6.4}$$

We have taken some liberties with the terminology. As we will see later in the chapter, the "official" Poisson² Summation Formula is a relationship between a function and its Fourier transform. The above relationship between the DFT of a function and its Fourier coefficients is analogous to the Poisson Summation Formula; therefore, a name reflecting that similarity seems appropriate. It may not be evident right now, but this relationship between the DFT and the Fourier coefficients has a lot to tell us. To make sense of it, we need a definition, and then two quite different cases appear.

² Encouraged by his father to be a doctor, SIMEON DENIS POISSON (1781-1840) attracted the attention of Lagrange and Laplace when he entered the Polytechnic School at the age of 17. He made lasting contributions to the theories of elasticity, heat transfer, capillary action, electricity, and magnetism. He was also regarded as one of the finest analysts of his time.

We need to review the property of band-limitedness, but now as it applies to periodic functions. If the $A$-periodic function $f$ has the property that the Fourier coefficients $c_k$ are zero for $|k| > M$, where $M$ is some natural number, then $f$ is said to be band-limited. It simply means that the "signal" $f$ has no frequency content above the $M$th frequency of $M/A$ cycles per unit length. Expression (6.4) tells us that if $f$ is periodic and band-limited with $M < N/2$, then $c_k = 0$ for $|k| > N/2$, and the $N$-point DFT exactly reproduces the $N$ Fourier coefficients of $f$. By this we mean that $F_k = c_k$ for $k = -N/2+1 : N/2$. If $f$ is not band-limited or is band-limited with $M > N/2$, we can expect to see errors in the DFT.

It is worth stating this important result in several different (but equivalent) ways:

If the Fourier coefficients $c_k$ of the function $f$ on the interval $[-A/2, A/2]$ are zero for $|k| > N/2$, then the DFT of length $N$ exactly reproduces the nonzero Fourier coefficients of $f$.

Assume the Fourier coefficients on the interval $[-A/2, A/2]$ vanish for frequencies $|k|/A > \Omega/2$. If the highest frequency resolvable by the DFT, which is $N/(2A)$, satisfies

$$\frac{N}{2A} \ge \frac{\Omega}{2},$$

then the $N$-point DFT exactly reproduces the nonzero Fourier coefficients of $f$. Notice how the reciprocity relation $A\Omega = N$ appears in this condition.

A maximum frequency of $\Omega/2$ corresponds to a minimum period (wavelength) of $2/\Omega$. Therefore, the above condition can also be written

$$\Delta x = \frac{A}{N} \le \frac{1}{\Omega}.$$

This condition means that if $f$ is sampled with at least two grid points per period, then the DFT exactly reproduces the nonzero Fourier coefficients of $f$.

The critical grid spacing (or sampling rate) $\Delta x = 1/\Omega$ should look familiar; we encountered it in Chapter 3 and called it the Nyquist sampling rate. It plays an essential role in analyzing signals and using the DFT, and we will see it again with nonperiodic band-limited functions.
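The sampling condition translates directly into a small calculation. The sketch below is only an illustration (the function name and the rounding up to an even length are our choices): given the interval length $A$ and the highest frequency present, in cycles per unit length, it returns the smallest even $N$ that places at least two grid points per period, together with the resulting grid spacing.

```python
import math

def min_dft_length(A, max_freq):
    """Smallest even N with at least two grid points per period.

    A        : length of the sampling interval
    max_freq : highest frequency in the input (cycles per unit length)
    The condition is dx = A/N <= 1/(2*max_freq), i.e. N >= 2*A*max_freq.
    """
    N = math.ceil(2 * A * max_freq)
    if N % 2:                  # the text assumes an even number of points
        N += 1
    return N, A / N

# Interval of length 2 with highest frequency 8 cycles per unit length.
print(min_dft_length(2, 8))    # (32, 0.0625)
```

With floating-point inputs the product `2 * A * max_freq` can land slightly above an integer, so exact rational arithmetic would be the safer choice when the inputs involve quantities such as $\pi$.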

The special case in which $M = N/2$ raises our first pester. Special care must be used with the $N/2$ mode since it is the highest frequency mode that the DFT can resolve. The Fourier series, using continuous modes, can distinguish the $-N/2$ and the $N/2$ modes. In fact, if $f$ is real-valued, then $c_{-N/2} = c^*_{N/2}$. On the other hand, the DFT cannot distinguish between the $\pm N/2$ (discrete) modes; in fact, it combines them into one real mode of the form $\cos(\pi n)$. Combining these observations, we conclude that in the special case in which $M = N/2$

$$F_{N/2} = c_{N/2} + c_{-N/2},$$

which equals $2\,\mathrm{Re}\{c_{N/2}\}$ when $f$ is real-valued.

This result also assumes that $N$ is even; an interesting variation arises with an odd number of DFT points (problem 126). Let's solidify these essential ideas surrounding sampling rates with a case study.
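The discrete Poisson summation relation, including the behavior of the $N/2$ mode, is easy to observe numerically. The following pure-Python sketch uses a small example of our own (not the case study that follows): a function with $c_k = 1$ for $|k| \le 4$, sampled at $x_n = nA/N$ and transformed with the scaled DFT $F_k = \frac{1}{N}\sum_n f_n e^{-i2\pi nk/N}$. The helper names are illustrative.

```python
import cmath

def centered(N):
    """Index set -N/2+1 : N/2 (N assumed even)."""
    return range(-N // 2 + 1, N // 2 + 1)

def sample(coeffs, N):
    """Samples f_n = sum_k c_k exp(i 2 pi k n / N) of the Fourier series."""
    return {n: sum(c * cmath.exp(2j * cmath.pi * k * n / N)
                   for k, c in coeffs.items())
            for n in centered(N)}

def dft(f, N):
    """Scaled DFT: F_k = (1/N) * sum_n f_n exp(-i 2 pi n k / N)."""
    return {k: sum(f[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                   for n in centered(N)) / N
            for k in centered(N)}

coeffs = {k: 1.0 for k in range(-4, 5)}     # band-limited with M = 4

for N in (16, 8, 4):
    F = dft(sample(coeffs, N), N)
    print(N, [round(F[k].real, 6) for k in centered(N)])
```

With $N = 16$ (so $M < N/2$) the DFT should return the nine nonzero coefficients and zeros elsewhere. With $N = 8$ (so $M = N/2$) the coefficient $F_4$ should come out as $c_4 + c_{-4} = 2$ while the others remain 1. With $N = 4$ the input is undersampled and every $F_k$ is a sum of several aliased coefficients.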

Case Study 1: Periodic, band-limited functions. Let $f$ be the function whose nonzero Fourier coefficients are given by $c_k = 1$ for $-16 \le k \le 16$ and assume it is sampled at $N$ equally spaced grid points on the interval $[-1,1]$. The sampled values of $f$ are given by

$$f(x_n) = \sum_{k=-16}^{16} e^{i\pi k x_n}.$$

Noting that the sample points are $x_n = 2n/N$, the input sequence to the DFT is given by

$$f_n = \sum_{k=-16}^{16} e^{i2\pi nk/N}.$$
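The input sequence of this case study is easy to examine numerically. The following is a minimal sketch, assuming the DFT is normalized by $1/N$ with both indices running over $-N/2+1 : N/2$; the helper names `centered_dft` and `case_study_input` are hypothetical and are introduced here only for illustration.

```python
import numpy as np

def centered_dft(f):
    """N-point DFT with a 1/N factor; n and k both run over -N/2+1 : N/2 (N even)."""
    N = len(f)
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    W = np.exp(-2j * np.pi * np.outer(idx, idx) / N)   # W[k, n] = exp(-i 2 pi n k / N)
    return idx, W @ f / N

def case_study_input(N, M=16):
    """f_n = sum over k = -M..M of exp(i 2 pi n k / N), i.e. c_k = 1 for |k| <= M."""
    n = np.arange(-N // 2 + 1, N // 2 + 1)
    k = np.arange(-M, M + 1)
    return np.exp(2j * np.pi * np.outer(n, k) / N).sum(axis=1)

for N in (64, 32, 16):
    k, F = centered_dft(case_study_input(N))
    # Print the real parts of the DFT coefficients, rounded for readability.
    print(N, np.round(F.real, 6))
```

Under these assumptions the length-64 DFT returns ones for $|k| \le 16$ and zeros elsewhere, the length-32 DFT returns 2 in the $k = 16$ slot, and the length-16 DFT returns values larger than 1 everywhere because several coefficients land in each slot.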

We can already anticipate the minimum number of sample points that the DFT needs to reproduce the Fourier coefficients exactly. The highest frequency mode that comprises the function $f$ has a frequency of 16 periods on the interval $[-1,1]$ or 8 periods per unit length. Using the above notation, this means that $\Omega/2 = 8$. This highest frequency mode has a period of 1/8 units of length (or time). Therefore, the sampling criterion that we place at least two grid points per period of every mode means that we must have $\Delta x \le 1/16$. Since $\Delta x = 2/N$, we see that we must take $N \ge 32$ to resolve the Fourier coefficients exactly.

Now let's do the experiments. The input sequence $f_n$ is fed to DFTs of various lengths, and the results are shown in Figure 6.1. The upper left graph shows the function $f$ itself; as expected, it is real and even, since its Fourier coefficients are real and even. The upper right and lower left plots of Figure 6.1 show the DFTs of length $N = 32$ and $N = 64$, which reproduce the nonzero Fourier coefficients exactly. However, note that with $N = 32$ (the case $M = N/2$ discussed above), there is the expected doubling of the $N/2$ coefficient. To anticipate coming events, the lower right graph of Figure 6.1 shows the DFT of length $N = 16$. In this case the input no longer appears band-limited to the DFT and the input sequence is "undersampled." This simply means that there are not enough DFT points to resolve all frequencies of the input at a rate of at least two grid points per period. Therefore, the DFT coefficients are in error (see problem 122). We had a glimpse of this effect in Chapter 3, and now it will be investigated in detail.

FIG. 6.1. Case Study 1. A periodic function f whose nonzero Fourier coefficients are given by $c_k = 1$ for $-16 \le k \le 16$ is sampled at N equally spaced points on the interval [-1,1], and the sampled function is used as input for DFTs of various lengths. The function f itself is shown in the upper left graph, while the output of DFTs of length N = 32 and N = 64 are shown in the upper right and lower left graphs. In these two cases, all of the modes of f can be resolved exactly, since there are at least two grid points per period of every mode. The result is that all of the Fourier coefficients are computed exactly by the DFT. However, when a DFT of length N = 16 is used (lower right), there are errors in the DFT, since the input is "undersampled."

6.3. Periodic, Non-Band-Limited Input

We now consider a more prevalent case in which $f$ is periodic, but not band-limited. This means that $f$ has nonzero Fourier coefficients $c_k$ for arbitrarily large values of $k$.


The Discrete Poisson Summation Formula

$$F_k = c_k + \sum_{j=1}^{\infty}\left(c_{k+jN} + c_{k-jN}\right)$$

is now considerably more complicated, but it has an extremely important interpretation. In this case, the DFT coefficient $F_k$ is equal to $c_k$ plus additional Fourier coefficients corresponding to higher frequencies. We see that the $k$th mode is linked with other modes whose index differs from $k$ by multiples of $N$, as shown in Figure 6.2. Why should these modes be associated with each other? There is a good and far-reaching explanation for this effect.

FIG. 6.2. A periodic function with high frequency modes ($\omega > N/(2A)$) will undergo aliasing when sampled at N grid points on the interval [-A/2, A/2]. When sampled at the grid points, the kth mode is indistinguishable from the k + mN mode for any integer m. The figure shows the $\omega$-axis and the modes that are coupled through aliasing.

For the sake of illustration assume that we are working on the interval $[-1,1]$ with $N = 10$ grid points. Figure 6.3 shows the $k = 2$ mode with a frequency of $\omega_2 = 1$ period per unit length. Also shown are the $k + N = 12$ and $k - N = -8$ modes, which have much higher frequencies of $\omega_{12} = 6$ and $|\omega_{-8}| = 4$ cycles per unit length. However, all three modes take on the same values at the grid points! In other words, to someone who sees these three modes only at the grid points, they look identical.

This phenomenon can be verified analytically as well. On the interval $[-A/2, A/2]$ with $N$ equally spaced grid points, the value of the $k$th mode, with frequency $\omega_k = k/A$, at the grid point $x_n = nA/N$ is given by

$$e^{i2\pi\omega_k x_n} = e^{i2\pi nk/N}.$$

The value of the $k + mN$ mode, with frequency $\omega_{k+mN} = (k + mN)/A$, at the grid point $x_n$ is given by

$$e^{i2\pi\omega_{k+mN} x_n} = e^{i2\pi n(k+mN)/N} = e^{i2\pi nk/N}\,e^{i2\pi mn} = e^{i2\pi nk/N},$$

where $e^{i2\pi mn} = 1$, since $m$ and $n$ are integers. In other words, the $k$th mode and the $k + mN$ mode (where $m$ is any integer) agree at the grid points. This is precisely the effect called aliasing that was observed in Chapter 3, in which higher frequency modes masquerade as low frequency modes (see problems 116 and 117). The conclusion is that the DFT cannot distinguish a basic mode ($-N/2 + 1 \le k \le N/2$) from higher frequency modes; hence the $k$th coefficient computed by the DFT includes contributions not only from the basic mode, but from all of the aliased modes as well. In the case of a band-limited function, there are no higher frequency modes to be aliased, assuming that $f$ is sampled at or above the critical rate.

FIG. 6.3. On a grid with N = 10 points, the k = 2 mode (dashed line) takes on the same values as the k = 12 mode (solid line) and the k = -8 mode (dotted line) at the grid points $x_n$ = -1, -.8, -.6, -.4, -.2, 0, .2, .4, .6, .8, 1.
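The agreement of aliased modes at the grid points can be checked directly. The sketch below uses the illustration from the text (interval $[-1,1]$, so $A = 2$, with $N = 10$); the function name `mode_on_grid` is a hypothetical helper, not something defined in the text.

```python
import numpy as np

def mode_on_grid(k, N, A=2.0):
    """Samples of exp(i 2 pi (k/A) x) at the grid points x_n = n A / N, n = -N/2+1 : N/2."""
    n = np.arange(-N // 2 + 1, N // 2 + 1)
    x = n * A / N
    return x, np.exp(2j * np.pi * (k / A) * x)

N = 10
x, base = mode_on_grid(2, N)
for k in (12, -8, 3):
    _, other = mode_on_grid(k, N)
    # Largest difference from the k = 2 mode, measured only at the grid points.
    print(k, np.max(np.abs(other - base)))
```

The $k = 12$ and $k = -8$ modes differ from the $k = 2$ mode only at round-off level on this grid, while the $k = 3$ mode, whose index does not differ from 2 by a multiple of $N$, is clearly distinguishable.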

The Discrete Poisson Summation Formula indicates how and when aliasing introduces errors to the DFT. But just how serious are these errors? This is the question that we now address. The goal is to estimate the magnitude of $|F_k - c_k|$, the error in the coefficients produced by the DFT. In order to do this we need more information about the function $f$, and it turns out that the most useful information about $f$ concerns its smoothness or, more precisely, the number of continuous derivatives it has. With this information, a well-known theorem can be used, which is of considerable interest in its own right. Here is a central result [30], [70], [115], [158].

THEOREM 6.1. RATE OF DECAY OF FOURIER COEFFICIENTS. Let $f$ and its first $p - 1$ derivatives be A-periodic and continuous on $[-A/2, A/2]$ for $p \ge 1$. Let $f^{(p)}$ be bounded with at most a finite number of discontinuities on $[-A/2, A/2]$. Then the Fourier coefficients of $f$ satisfy

$$|c_k| \le \frac{C}{|k|^{p}},$$

where $C$ is a constant independent of $k$.

Sketch of proof: We start with the definition of the $k$th Fourier coefficient and integrate it by parts:

$$c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx = -\frac{1}{i2\pi k}\Big[f(x)\,e^{-i2\pi kx/A}\Big]_{-A/2}^{A/2} + \frac{1}{i2\pi k}\int_{-A/2}^{A/2} f'(x)\,e^{-i2\pi kx/A}\,dx.$$

The integrated term (first term) on the right side vanishes because of the periodicity of $f$ and the complex exponential; the remaining term is integrated again by parts. If this step is performed a total of $p$ times and the periodicity of the derivatives is used each time, we find that

$$c_k = \frac{1}{A}\left(\frac{A}{i2\pi k}\right)^{p}\int_{-A/2}^{A/2} f^{(p)}(x)\,e^{-i2\pi kx/A}\,dx.$$

The magnitude of this integral may be bounded by $AM$ where $M = \max_{[-A/2,A/2]} |f^{(p)}(x)|$, which leads to the bound

$$|c_k| \le M\left(\frac{A}{2\pi}\right)^{p}\frac{1}{|k|^{p}}.$$

At this point, it would be easy to use this result and deduce an error bound for the DFT. However, at the risk of raising a few technicalities, it pays to work just a bit more and obtain a stronger version of the previous theorem. This will lead to a more accurate error bound for the DFT. It turns out that the last integral in the previous proof (which has $f^{(p)}$ in the integrand) can be bounded only by a constant if we assume only that $f^{(p)}$ has a finite number of discontinuities. By placing slightly tighter conditions on $f^{(p)}$, a stronger result can be obtained. Here is the additional condition that must be imposed. A function is said to be piecewise monotone on an interval $[-A/2, A/2]$ if the interval can be split into a finite number of subintervals on each of which $f$ is either nonincreasing or nondecreasing. Figure 6.4 shows examples of piecewise monotone functions with various degrees of smoothness; it suggests that this condition does not exclude most functions that arise in practice. With this definition we may now state the stronger result [30], [70], [158].

FIG. 6.4. A piecewise monotone function on the interval [-A/2, A/2] has the property that the interval can be subdivided into a finite number of subintervals, on which the function is either nonincreasing or nondecreasing. Several piecewise monotone functions on [-1,1] are shown, with smoothness corresponding to (top left) p = 0, (top right and middle left) p = 1, (middle right) p = 3, and (bottom left) p = 5 in Theorem 6.2.

THEOREM 6.2. RATE OF DECAY OF FOURIER COEFFICIENTS. Let $f$ and its first $p - 1$ derivatives be A-periodic and continuous on $[-A/2, A/2]$ for $p \ge 0$. Assume that $f^{(p)}$ is bounded and piecewise monotone on $[-A/2, A/2]$. (The case $p = 0$ means that only $f$ itself is bounded and piecewise monotone on the interval.) Then the Fourier coefficients of $f$ satisfy

$$|c_k| \le \frac{C}{|k|^{p+1}},$$

where $C$ is a constant independent of $k$.

Proof outline: As in the previous theorem, the proof begins with $p$ integrations by parts of the Fourier coefficients $c_k$. With the assumption of piecewise monotonicity and the second mean value theorem for integrals, the final integral can be bounded by a constant times $k^{-1}$, which provides the additional power of $k$ in the result.
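The decay rates predicted by Theorem 6.2 can be observed numerically. The sketch below is an illustration under stated assumptions: it takes $A = 1$, estimates $c_k$ with a midpoint rule (the quadrature and the names `fourier_coeffs` and `decay_exponent` are choices made here, not the text's), and fits the slope of $\log|c_k|$ against $\log k$ for two test functions on $[-1/2, 1/2]$: $f(x) = x$, whose periodic extension has a jump ($p = 0$), and $f(x) = |x|$, which is continuous with a kinked derivative ($p = 1$).

```python
import numpy as np

def fourier_coeffs(f, ks, A=1.0, M=2**14):
    """Midpoint-rule estimate of c_k = (1/A) * integral of f(x) exp(-i 2 pi k x / A) over [-A/2, A/2]."""
    x = -A / 2 + (np.arange(M) + 0.5) * A / M
    fx = f(x)
    return np.array([np.mean(fx * np.exp(-2j * np.pi * k * x / A)) for k in ks])

def decay_exponent(f, ks):
    """Least-squares slope of log|c_k| versus log k; close to -(p+1) if |c_k| ~ C / k^(p+1)."""
    c = fourier_coeffs(f, ks)
    return np.polyfit(np.log(ks), np.log(np.abs(c)), 1)[0]

ks = np.arange(1, 64, 2)                       # odd k only: |x| has c_k = 0 for even k != 0
print(decay_exponent(lambda x: x, ks))         # p = 0 case
print(decay_exponent(np.abs, ks))              # p = 1 case
```

With these choices the fitted slopes come out close to $-1$ and $-2$, matching the exponent $p + 1$ in the theorem.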

It should be mentioned that the conditions of this theorem can be extended to include functions with a finite number of discontinuities with infinite jumps. All of the conditions under which this result is true are often called Dirichlet's3 conditions [30]. As we noted, the condition of monotonicity is really not a significant restriction for functions that arise in most applications. This result may now be used to obtain a bound for the error in using the DFT to approximate the Fourier coefficients of a periodic, non-band-limited function. The result is given as follows [4], [76].

3 Born and educated in (present day) Germany, PETER GUSTAV LEJEUNE DIRICHLET (1805-1859) interacted closely with the French mathematicians of the day. He did fundamental work in number theory (showing that Fermat's last theorem is true for n = 5) and in the convergence theory of Fourier series. He succeeded Gauss in Göttingen for the last four years of his life.

THEOREM 6.3. ERROR IN THE DFT (PERIODIC NON-BAND-LIMITED CASE). Let $f$ and its first $p - 1$ derivatives be A-periodic and continuous on $[-A/2, A/2]$ for $p \ge 1$. Assume that $f^{(p)}$ is bounded and piecewise monotone on $[-A/2, A/2]$. Then the error in the N-point DFT as an approximation to the Fourier coefficients of $f$ satisfies

$$|F_k - c_k| \le \frac{C}{N^{p+1}} \quad \text{for } k = -N/2 + 1 : N/2,$$

where $C$ is a constant independent of $k$ and $N$.

Proof: From the Discrete Poisson Summation Formula we know that

$$F_k - c_k = \sum_{j=1}^{\infty}\left(c_{k+jN} + c_{k-jN}\right).$$

By Theorem 6.2, the Fourier coefficients satisfy the bound $|c_k| \le C'/|k|^{p+1}$ for some constant $C'$. Hence

$$|F_k - c_k| \le \sum_{j=1}^{\infty}\left(|c_{k+jN}| + |c_{k-jN}|\right) \le \frac{C'}{N^{p+1}}\sum_{j=1}^{\infty}\left(\frac{1}{|j + k/N|^{p+1}} + \frac{1}{|j - k/N|^{p+1}}\right)$$

for $k = -N/2 + 1 : N/2$. Both series can be shown to converge for $p \ge 1$ and we may conclude that

$$|F_k - c_k| \le \frac{C}{N^{p+1}}.$$

Before turning to a case study, it would be useful to elaborate on this theorem with an eye on The Table of DFTs of the Appendix, which demonstrates it quite convincingly. Theorem 6.3 gives the errors in the DFT in terms of a bound that holds for all $N$. On the other hand, The Table of DFTs shows the errors in the DFTs in an asymptotic sense for large values of $N$. While these are not the same measure of error, it can be shown that both estimates agree in the leading power of $N$. Therefore, the table can be used to check the results of Theorem 6.3. We will mention several specific examples.


1. If the periodic extension of $f$ is continuous on the interval $[-A/2, A/2]$ (implying that $f(-A/2) = f(A/2)$), but no higher derivatives are continuous, then Theorem 6.3 applies with $p = 1$. We may conclude that the error in the DFT is bounded by a multiple of $N^{-2}$. This situation is illustrated by cases 6a and 6b (real even harmonics) and case 8 (triangular wave), all of which are continuous, but have piecewise continuous derivatives.

2. DFTs of functions with smoothness $p > 1$ are difficult to compute analytically and do not appear in The Table of DFTs. Numerical examples of these cases will be shown in the next section.

3. Unfortunately, there is one case that occurs frequently in practice that Theorem 6.3 does not cover; this is the case of functions that are only piecewise continuous ($p = 0$). With a bit more work, Theorem 6.3 can be extended to this case [76], and the result is as expected: if the A-periodic extension of $f$ is bounded and piecewise monotone, the error in the DFT is bounded by $C/N$, where $C$ is a constant independent of $k$ and $N$. This situation is illustrated in The Table of DFTs by case 5 (complex harmonic), case 6c (real odd harmonic), case 7 (linear), case 9 (rectangular wave), cases 10 and 10a (square pulse), and case 11 (exponential), all of which have discontinuities either at an interior point or at the endpoints. The typical asymptotic behavior for these cases is

There is actually a little extra meaning in this bound. For low frequency coefficients ($|k| \ll N/2$), the error behaves like $CN^{-2}$; for high frequency coefficients ($|k| \approx N/2$) the error decreases more slowly, as $CN^{-1}$, as predicted by the theory. This says that the errors in the low frequency coefficients are generally smaller than in the high frequency coefficients, which is often observed in computations.

Case Study 2: Periodic, non-band-limited functions. The phenomenon of aliasing can be demonstrated quite convincingly by looking at a periodic function whose period is greater than the interval from which samples are taken. Consider the 2-periodic function $f(x) = \cos(\pi x)$ on the interval $[-1/2, 1/2]$ (which is case 6b in The Table of DFTs). The function $f$ is sampled at the $N$ grid points $x_n = n/N$ for various values of $N$, and these sequences are used as input to the DFT. Figure 6.5 shows the function $f$ and its periodic extension (top left). The errors $|F_k - c_k|$ in the DFTs of length $N = 16, 32, 64$ are shown in the remaining graphs. Regardless of how large $N$ is taken, the DFT never sees a complete period of the input function, and hence the input appears non-band-limited. Therefore, the coefficients that the DFT produces are in error because the coefficients of higher frequency modes are aliased onto the corresponding low frequency modes. Indeed, as $N$ increases, these errors decrease in magnitude in accordance with the $p = 1$ case of Theorem 6.3. Furthermore, errors in the low frequency coefficients are smaller than errors in the high frequency coefficients.
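The errors in this case study can be reproduced with a few lines of code. The sketch below assumes a $1/N$-normalized DFT on the index set $-N/2+1 : N/2$ and uses the closed form $c_k = 2(-1)^{k+1}/(\pi(4k^2 - 1))$, which was obtained here by evaluating the coefficient integral for $\cos(\pi x)$ on $[-1/2, 1/2]$ directly rather than quoted from The Table of DFTs; the function name `dft_errors` is hypothetical.

```python
import numpy as np

def dft_errors(N):
    """|F_k - c_k| for f(x) = cos(pi x) sampled at x_n = n/N, n = -N/2+1 : N/2."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    f = np.cos(np.pi * idx / N)              # f(1/2) = f(-1/2) = 0, so no endpoint averaging is needed
    W = np.exp(-2j * np.pi * np.outer(idx, idx) / N)
    F = W @ f / N
    sign = np.where(idx % 2 == 0, -1.0, 1.0)                 # (-1)^(k+1)
    c = 2.0 * sign / (np.pi * (4.0 * idx**2 - 1.0))          # exact Fourier coefficients
    return idx, np.abs(F - c)

for N in (16, 32, 64):
    k, err = dft_errors(N)
    # Maximum error, and the same quantity scaled by N^2 to expose the N^-2 behavior.
    print(N, err.max(), err.max() * N**2)
```

If the $p = 1$ bound of Theorem 6.3 is sharp here, the scaled quantity in the last column should stay roughly constant as $N$ doubles, and the error at $k = 0$ should be smaller than the error at $k = N/2$.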

The previous case study is a specific example of a more general rule that is often overlooked in practice: when a periodic function is sampled on an interval whose length is not a multiple of the period, an error is introduced in the DFT (in addition to a possible aliasing error). We have already seen an example of this error in Chapter 3, and it was attributed to leakage. We would like to explore the phenomenon of leakage again, both from a slightly different perspective and in greater detail.


FIG. 6.5. Case Study 2. The function $f(x) = \cos(\pi x)$ (upper left) is sampled at N uniformly spaced points on the interval [-1/2, 1/2], and extended periodically beyond that interval. The errors in the DFTs of lengths N = 16, 32, 64 are shown in the upper right, lower left, and lower right plots, respectively. Because of symmetry, only the $0 \le k \le N/2$ components are shown. Note that the vertical scale is different for each of these plots. Regardless of the value of N this function, when sampled on this interval, appears non-band-limited, and there are errors in the DFT.

Assume that the function $f$ is A-periodic and has Fourier coefficients $c_k$ on $[-A/2, A/2]$. We will be interested in computing the Fourier coefficients of $f$ on a second interval $[-pA/2, pA/2]$ where $p > 1$. Let us denote the coefficients on that interval $c'_k$. Here is an illuminating preliminary question: what happens if the length of the second interval is a multiple of the period; that is, $p$ is an integer? It is a significant result (problem 123) that in this case the two sets of coefficients $c_k$ and $c'_k$ are related by

$$c'_k = \begin{cases} c_{k/p} & \text{if } k/p \text{ is an integer},\\ 0 & \text{otherwise},\end{cases}$$

where $k$ is any integer. There is a clear meaning of this result: if $p > 1$ is an integer, the $k$th mode on the interval $[-pA/2, pA/2]$ appears as the $(k/p)$th mode on the interval $[-A/2, A/2]$. The remaining modes are not periodic on $[-A/2, A/2]$ and are not needed in the representation of $f$ on $[-pA/2, pA/2]$. For example, one full period on $[-1,1]$ looks like two full periods on $[-2,2]$. On the other hand, one full period on $[-2,2]$ looks like half a period on $[-1,1]$, and this mode is not used in the representation of $f$ on $[-2,2]$. To make this point quite clear consider the function $f(x) = \cos 2x$. It has the following Fourier coefficients on the given intervals:

Note that c'k,4 = c'k,2 = Ck reflecting the scaling of the indices as the interval is changed.

Equally important, the DFT has the same property: if $p > 1$ is an integer and the A-periodic function $f$ is sampled with $pN$ points on the interval $[-pA/2, pA/2]$ the resulting coefficients $F'_k$ are related to the original set $F_k$ by

$$F'_k = \begin{cases} F_{k/p} & \text{if } k/p \text{ is an integer},\\ 0 & \text{otherwise},\end{cases}$$

where $k = -pN/2 + 1 : pN/2$ (problem 124).

Those were preliminary observations. Now what happens if the A-periodic function $f$ is sampled on the interval $[-pA/2, pA/2]$ where $p > 1$ is not an integer? In this case the interval contains at least one full period of $f$ plus a fraction of a period. The crux of the issue is captured if we confine our attention to a single complex mode

$$f(x) = e^{i2\pi k_0 x/A}$$

on the interval $[-pA/2, pA/2]$, where $k_0$ is an integer, but $p$ is not. In this case the length of the interval is not a multiple of the period. A short calculation (or a peek at case 5 of The Table of DFTs) reveals that the Fourier coefficients of $f$ on $[-pA/2, pA/2]$ are given by

$$c'_k = \frac{\sin(\pi(k - pk_0))}{\pi(k - pk_0)}, \tag{6.5}$$

where $k$ is any integer. Since $p$ is not an integer, $k - pk_0$ is never an integer, and none of the coefficients $c'_k$ vanish (whether $f$ is band-limited or not). Typically the coefficients with index closest to $pk_0$ will have the largest magnitude. Nearby coefficients, in what are often called the sidelobes, decrease in magnitude and decay to zero by oscillation like $(k - pk_0)^{-1}$. The appearance of these characteristic sidelobes is often a symptom of poor sampling of a periodic signal.

Equally important is the fact that the DFT exhibits the same effect. If a periodic function is sampled on an interval that does not contain an integer number of periods, then the resulting DFT will show discrepancies when compared to the results of sampling on a complete period. This is clearest when the same single complex mode with integer frequency $k_0$ is sampled on the interval $[-pA/2, pA/2]$ with $pN$ points, where $p$ is not an integer. Assuming that $pN/2$ is an integer, a short calculation (problem 125 or The Table of DFTs in the Appendix) shows that the $pN$-point DFT is given by

$$F'_k = \frac{\sin(\pi(k - pk_0))}{pN\tan\left(\pi(k - pk_0)/(pN)\right)}, \tag{6.6}$$

where $k = -pN/2 + 1 : pN/2$. Both the Fourier coefficients and the DFT coefficients as given by (6.5) and (6.6) have the property that if $p$ is an integer, then they reduce to the expected result, namely that $c'_k = F'_k = \delta(k - pk_0)$. Formally letting $p \to 1$ in either expression (6.5) or (6.6) recovers the single spike at the central frequency (problem 125). This says that the DFT agrees with the Fourier series coefficients and returns all zero coefficients except at the single frequency corresponding to the index $k_0$. On the other hand, if $p$ is not an integer, then in general both sets $c'_k$ and $F'_k$ are nonzero and the sidelobes appear.

FIG. 6.6. The effect of sampling a periodic function on integer and noninteger multiples of its period are shown in these four graphs. The DFT coefficients of the single wave $f(x) = \cos(2\pi x)$ are computed on the interval [-p, p], where p = 1, 1.25, 1.75, 2 (left to right, top to bottom). As discussed in the text, when p is an integer, the Fourier coefficients consist of two clean spikes (although their location must be interpreted carefully). When p is not an integer, leakage occurs into the sidelobes near the central frequencies. The same behavior occurs in the DFT coefficients.

Let's interpret these expressions with the help of a few pictures. Figure 6.6 shows the DFT coefficients of the real mode $f(x) = \cos(2\pi x)$ computed on the interval $[-p, p]$ for several values of $p$. Observe that with $p = 1$ the set of coefficients consists of two nonzero coefficients at $k = \pm 2$. With $p = 1.25$, the wave is sampled on a fraction of a full period and nonzero coefficients appear in sidelobes around the two central frequencies. With $p = 1.75$, the wave is still sampled on a fraction of a full period and the sidelobes actually broaden. With a value of $p = 2$, the sidelobes disappear, but now the two central frequencies have moved out to $k = \pm 4$, as predicted by the above analysis. The same behavior could also be observed in the Fourier coefficients. Notice that this is essentially the same problem that was considered in Section 3.4 in our preliminary discussion of leakage. In that setting the interval $[-A/2, A/2]$ was held fixed while the frequency of the sampled function was varied.
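The experiment behind Figure 6.6 is simple to imitate. The following sketch assumes $N = 64$ samples and a $1/N$-normalized DFT on the index set $-N/2+1 : N/2$; both the value of $N$ and the helper name `dft_of_cosine` are assumptions made for illustration, since the text does not state the number of points used for the figure.

```python
import numpy as np

def dft_of_cosine(p, N=64):
    """DFT of f(x) = cos(2 pi x) sampled at N points x_n = 2 p n / N on [-p, p]."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    x = 2.0 * p * idx / N
    f = np.cos(2 * np.pi * x)        # f(-p) = f(p), so the endpoint average equals f(p)
    W = np.exp(-2j * np.pi * np.outer(idx, idx) / N)
    return idx, W @ f / N

for p in (1.0, 1.25, 1.75, 2.0):
    k, F = dft_of_cosine(p)
    significant = k[np.abs(F) > 1e-3]
    # For integer p only two indices survive; otherwise many sidelobe indices appear.
    print(p, significant)
```

With integer $p$ the only surviving indices are $k = \pm 2p$; with noninteger $p$ a whole band of coefficients exceeds the threshold, which is the leakage described above.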

As mentioned earlier, the effect that we have just exposed is generally called leakage, since the sidelobes drain "energy" from the central frequencies. The phenomenon also appears with other names in the literature, often resulting in confusion. Seen as an error, it is often called a truncation error, meaning that the periodic input function has been truncated in an ill-advised way. The effect is also called windowing, referring to the square "window" that is used to truncate the input. In computing Fourier coefficients, this kind of error can often be avoided with prior knowledge or estimates of the period of the input. These same ideas will recur in an unavoidable way when we consider the approximation of Fourier transforms. Let us now turn to that subject.


6.4. Replication and Poisson Summation

The next item on the agenda of DFT errors is the class of compactly supported (or spatially limited) functions. Before undertaking this mission, a small diversion is necessary. On the first pass, this excursion will seem unrelated to the discussion at hand but we will soon see that it bears heavily on everything that follows. We begin with a definition. Imagine a function $f$ that is defined on the entire real line $(-\infty, \infty)$ and let $A$ be any positive real number. We will now associate with the function $f$ a new function called its replication of period A (or simply replication if $A$ is understood). It is defined as

$$\mathcal{R}_A\{f(x)\} = \sum_{j=-\infty}^{\infty} f(x + jA).$$

The replication of period $A$ of $f$ is the superposition of copies of $f$, each displaced by multiples of $A$; this new function is also defined on the entire real line. The result of this operation can be seen in Figure 6.7, which shows the replication of period $A = 5$ of the function $f(x) = e^{-|x|}$. An important property is that the replication of period $A$ is a periodic function with period $A$. Take heed: the periodic replication of a function is not its periodic extension (unless $f(x) = 0$ outside of an interval of length $A$).
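The replication operator is straightforward to approximate by truncating the sum. The sketch below applies it to the example $f(x) = e^{-|x|}$ with $A = 5$; the function name `replicate` and the truncation level `J` are illustrative assumptions, and the closed form used for comparison follows from summing two geometric series for $0 \le x < A$.

```python
import numpy as np

def replicate(f, x, A, J=50):
    """Truncated replication of period A: sum of f(x + j*A) over |j| <= J."""
    x = np.asarray(x, dtype=float)
    j = np.arange(-J, J + 1)
    return f(x[..., None] + j * A).sum(axis=-1)

f = lambda x: np.exp(-np.abs(x))
A = 5.0
x = np.linspace(0.0, 4.5, 10)

r = replicate(f, x, A)
closed_form = (np.exp(-x) + np.exp(x - A)) / (1.0 - np.exp(-A))   # valid for 0 <= x < A

print(np.max(np.abs(r - closed_form)))            # truncation and round-off only
print(np.max(np.abs(replicate(f, x + A, A) - r))) # periodicity with period A
print(replicate(f, 0.0, A), f(0.0))               # replication exceeds f: not a periodic extension
```

The last line illustrates the warning in the text: because $e^{-|x|}$ does not vanish outside an interval of length $A$, the replication at $x = 0$ is larger than $f(0)$.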

There may be some solace in knowing that we have already seen the idea of replication in this chapter, although the name was not mentioned at the time. The replication operator can also be applied to a sequence in the following manner. If $\{c_n\}$ is a sequence defined for all integers $n$, then its replication of period N is

$$\mathcal{R}_N\{c_n\} = \sum_{j=-\infty}^{\infty} c_{n+jN}.$$

The idea is exactly the same. The replication of period $N$ of a sequence is the superposition of copies of that sequence, each translated by multiples of $N$. The periodic replication is a periodic sequence as shown in Figure 6.7. The replication of a sequence appeared in the Discrete Poisson Summation Formula (6.4), which we could now write more compactly as

$$F_k = \mathcal{R}_N\{c_k\}.$$

FIG. 6.7. The replication operator may be applied to functions or sequences. The function $f(x) = e^{-|x|}$ (solid curve), and its replication of period 5 (denoted $\mathcal{R}_5\{f(x)\}$), are shown in the top figure (dashed curve). Similarly, the replication of period 20 of the sequence $c_n = 1/(1 + |n|^2)$ (denoted $\mathcal{R}_{20}\{c_n\}$) is shown in the bottom figure.

The fact that the replication operator appears in the connection between the DFT and the Fourier series suggests that it might appear again in the relationship between the DFT and Fourier transforms. Here is how it all comes about. A short calculation will lead to the version of the Poisson Summation Formula that pertains to Fourier transforms. This result is of interest in its own right, but it also leads to greater truths.

Consider a function $f$ that has a Fourier transform $\hat{f}$. As before, we let $\Delta x$ be the grid spacing in the physical domain which determines a natural interval $[-1/(2\Delta x), 1/(2\Delta x)]$ in the frequency domain. Letting $\Omega = 1/\Delta x$, we now form the replication of period $\Omega$ of the transform $\hat{f}$. It is defined as

$$g(\omega) = \mathcal{R}_\Omega\{\hat{f}(\omega)\} = \sum_{j=-\infty}^{\infty} \hat{f}(\omega + j\Omega). \tag{6.7}$$

As discussed a moment ago, the function $g$ is simply a superposition of copies of $\hat{f}$, each shifted by multiples of $\Omega = 1/\Delta x$. This means that $g$ has a period of $1/\Delta x$, and as a periodic function it has a Fourier series of the form

$$g(\omega) = \sum_{n=-\infty}^{\infty} c_n\,e^{i2\pi n\omega/\Omega}. \tag{6.8}$$

The coefficients in this series are given by

$$c_n = \frac{1}{\Omega}\int_{-\Omega/2}^{\Omega/2} g(\omega)\,e^{-i2\pi n\omega/\Omega}\,d\omega.$$

We now substitute the definition (6.7) of $g$ into this expression for $c_n$ and find that

$$c_n = \frac{1}{\Omega}\sum_{j=-\infty}^{\infty}\int_{-\Omega/2}^{\Omega/2} \hat{f}(\omega + j\Omega)\,e^{-i2\pi n\omega/\Omega}\,d\omega.$$

Now notice that the summation and the integral in this expression conspire nicely to form a single integral over the interval $-\infty < \omega < \infty$. This allows us to write

$$c_n = \frac{1}{\Omega}\int_{-\infty}^{\infty}\hat{f}(\omega)\,e^{-i2\pi n\omega/\Omega}\,d\omega = \Delta x\int_{-\infty}^{\infty}\hat{f}(\omega)\,e^{i2\pi\omega x_{-n}}\,d\omega = \Delta x\,f(x_{-n}),$$

where we have used $x_n = n\Delta x$ and the definition of $f$ in terms of its Fourier transform. Using these $c_n$'s in the Fourier series for $g$ (6.8), we have that

$$g(\omega) = \Delta x\sum_{n=-\infty}^{\infty} f(x_{-n})\,e^{i2\pi n\omega/\Omega} = \Delta x\sum_{n=-\infty}^{\infty} f(x_n)\,e^{-i2\pi x_n\omega}. \tag{6.9}$$


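One more step brings us home. Now compare the two representations for $g$ given by (6.7) and (6.9). They imply that

$$\mathcal{R}_\Omega\{\hat{f}(\omega)\} = \sum_{j=-\infty}^{\infty}\hat{f}(\omega + j\Omega) = \Delta x\sum_{n=-\infty}^{\infty} f(x_n)\,e^{-i2\pi x_n\omega}.$$

It is the second of these two equalities that is of interest. In fact, it culminates in the proof of the following theorem that will serve as the central pillar of the remainder of the chapter.

THEOREM 6.4. POISSON SUMMATION FORMULA. Assume that $f$ is defined on the interval $-\infty < x < \infty$, and that its Fourier transform $\hat{f}$ is defined for $-\infty < \omega < \infty$. Given a grid spacing $\Delta x$, let the sample points be given by $x_n = n\Delta x$ for integers $-\infty < n < \infty$. Then

$$\Delta x\sum_{n=-\infty}^{\infty} f(x_n)\,e^{-i2\pi x_n\omega} = \sum_{j=-\infty}^{\infty}\hat{f}(\omega + j\Omega), \tag{6.10}$$

where $\Omega = 1/\Delta x$.

Actually we have parted with convention; what is generally called the Poisson Summation Formula results by setting $\omega = 0$ in (6.10). This rather remarkable relationship between samples of a function and its Fourier transform is

$$\Delta x\sum_{n=-\infty}^{\infty} f(x_n) = \sum_{j=-\infty}^{\infty}\hat{f}(j\Omega).$$

Quite unrelated to DFTs, this relationship says that the sum of the samples of $f$ is a constant times the sum of the samples of $\hat{f}$ at multiples of the cut-off frequency $\Omega = 1/\Delta x$.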
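A numerical check of the Poisson Summation Formula is possible whenever both $f$ and $\hat{f}$ are known in closed form. The sketch below uses the Gaussian $f(x) = e^{-\pi x^2}$, whose transform is $\hat{f}(\omega) = e^{-\pi\omega^2}$ under the convention $\hat{f}(\omega) = \int f(x)\,e^{-i2\pi\omega x}\,dx$ (assumed here to be the convention in force); the test function, the truncation level `J`, and the name `poisson_sides` are choices made for illustration.

```python
import numpy as np

def poisson_sides(omega, dx, J=200):
    """Left and right sides of the Poisson Summation Formula for f(x) = exp(-pi x^2),
    with both infinite sums truncated to indices |n|, |j| <= J."""
    n = np.arange(-J, J + 1)
    x = n * dx
    lhs = dx * np.sum(np.exp(-np.pi * x**2) * np.exp(-2j * np.pi * omega * x))
    Omega = 1.0 / dx
    rhs = np.sum(np.exp(-np.pi * (omega + n * Omega) ** 2))
    return lhs, rhs

for omega, dx in [(0.0, 1.0), (0.3, 0.5), (0.7, 2.0)]:
    lhs, rhs = poisson_sides(omega, dx)
    # The two sides should agree to round-off; the imaginary part of lhs should vanish.
    print(omega, dx, lhs, rhs)
```

The case `omega = 0.0` corresponds to the conventional form of the formula, in which the scaled sum of samples of $f$ equals the sum of samples of $\hat{f}$ at multiples of $\Omega$.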

With this versatile and powerful result in our hands let's begin by making a few valuable observations. Some will be of immediate use; some are of interest in their own right and might help illuminate the Poisson Summation Formula.

1. Using the replication operator, the Poisson Summation Formula appears as

$$\mathcal{R}_\Omega\{\hat{f}(\omega)\} = \sum_{n=-\infty}^{\infty} c_n\,e^{i2\pi n\omega/\Omega},$$

where $c_n = \Delta x\,f(x_{-n})$. We see that the periodic replication of $\hat{f}$ can be represented as a Fourier series on the interval $[-\Omega/2, \Omega/2]$. This says that samples of $f$ can be found by computing the Fourier series coefficients of the replication of $\hat{f}$. We will return to the implications of this observation shortly.

2. If $f$ vanishes outside of some finite interval and is sampled on that interval, then the sum on the left side of the Poisson Summation Formula is finite (in fact, we will see momentarily that it is the DFT!) and the replication of the Fourier transform $\hat{f}$ can be computed exactly. This raises the question of whether a function (e.g., $\hat{f}$) can be recovered uniquely from its replication (problem 128).

3. Recall from expression (6.8) that the periodic replication of $\hat{f}$ can be written

$$g(\omega) = \sum_{n=-\infty}^{\infty} c_n\,e^{i2\pi n\omega/\Omega}.$$


We now move towards a powerful connection among the Poisson Summation Formula, the DFT, and replication operators. A bit of rearranging on the left side of the Poisson Summation Formula leads to another perspective. Let the samples of $f$ be denoted $f_n = f(x_n)$, where $x_n = n\Delta x$. Evaluating the left side of (6.10) at $\omega = \omega_k = k/A$, we find that

$$\Delta x\sum_{n=-\infty}^{\infty} f_n\,e^{-i2\pi nk/N} = \frac{A}{N}\sum_{n=-N/2+1}^{N/2}\Big(\sum_{j=-\infty}^{\infty} f_{n+jN}\Big)e^{-i2\pi nk/N} = A\,\mathcal{D}\{\mathcal{R}_N\{f_n\}\}_k.$$

We have used the periodicity of the complex exponential and introduced $A = N\Delta x$. That takes care of the left side of the Poisson Summation Formula.

We now express the right side of the formula as $\mathcal{R}_\Omega\{\hat{f}(\omega)\}$ and evaluate it at $\omega = \omega_k$. An important observation is that sampling the replication of period $\Omega$ of $\hat{f}$ is the same as replicating the samples of $\hat{f}$ with period $N$. In other words, letting $\hat{f}_k = \hat{f}(\omega_k)$, we can write

$$\mathcal{R}_\Omega\{\hat{f}(\omega)\}\Big|_{\omega=\omega_k} = \sum_{j=-\infty}^{\infty}\hat{f}(\omega_k + j\Omega) = \sum_{j=-\infty}^{\infty}\hat{f}_{k+jN} = \mathcal{R}_N\{\hat{f}_k\}.$$

Combining the modified right and left sides of the Poisson Summation Formula leads to the following deceptively simple result.

Replication Form of the Poisson Summation Formula

$$A\,\mathcal{D}\{\mathcal{R}_N\{f_n\}\}_k = \mathcal{R}_N\{\hat{f}_k\} \tag{6.11}$$

Stand back and see what we have done! The result is the remarkable fact that (up to the constant $A$)

the N-point DFT of the sampled replication of f is the sampled replication of the Fourier transform of f.

The replications in the two domains have different physical periods: in the spatial domain the replication has a period of $A$, while in the frequency domain the replication has a period of $\Omega = N/A$. Not surprisingly, the two periods are related by the reciprocity relations. But as replications of sample sequences they both have periods of $N$.

Before moving ahead it might pay to indicate schematically what we have just learned. Given a function $f$ we may denote its relationship to its Fourier transform as

$$f(x) \;\overset{\mathcal{F}}{\longleftrightarrow}\; \hat{f}(\omega).$$

We now see that the DFT gives the analogous relationship between the replications of the samples of $f$ and $\hat{f}$; namely

$$\mathcal{R}_N\{f_n\} \;\overset{\mathcal{D}}{\longleftrightarrow}\; \frac{1}{A}\,\mathcal{R}_N\{\hat{f}_k\}.$$

It is impossible to unravel all of the implications of these observations at once; in fact, the remainder of this chapter will be devoted to that task. Occasionally we will pause and look at a particular result or problem from the replication perspective, and very often it will provide penetrating insights.
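The replication form of the Poisson Summation Formula can be tested numerically on a function that is not negligible at the ends of the sampling interval, so that the replication of the samples genuinely matters. The sketch below assumes the Gaussian pair $f(x) = e^{-\pi(x/s)^2}$, $\hat{f}(\omega) = s\,e^{-\pi(s\omega)^2}$ with the illustrative values $s = 2$, $A = 4$, $N = 8$; none of these values come from the text, and the DFT is taken with a $1/N$ factor on the index set $-N/2+1 : N/2$.

```python
import numpy as np

s, A, N = 2.0, 4.0, 8                               # illustrative values (assumptions)
f = lambda x: np.exp(-np.pi * (x / s) ** 2)
fhat = lambda w: s * np.exp(-np.pi * (s * w) ** 2)  # transform of f under the e^{-i 2 pi w x} convention

idx = np.arange(-N // 2 + 1, N // 2 + 1)            # n and k both run over -N/2+1 : N/2
j = np.arange(-60, 61)                              # truncation of the replication sums

rep_f = f((idx[:, None] + j * N) * A / N).sum(axis=1)     # R_N{f_n}:    samples f(x_{n+jN}) summed over j
rep_fhat = fhat((idx[:, None] + j * N) / A).sum(axis=1)   # R_N{fhat_k}: samples fhat(w_{k+jN}) summed over j

W = np.exp(-2j * np.pi * np.outer(idx, idx) / N)
lhs = A * (W @ rep_f) / N                           # A times the DFT of the replicated samples
plain = A * (W @ f(idx * A / N)) / N                # same, but without replicating the samples

print(np.max(np.abs(lhs - rep_fhat)))               # agreement to round-off
print(np.max(np.abs(plain - rep_fhat)))             # visible mismatch when replication is skipped
```

The second printed number shows why the replication of the samples cannot be dropped unless $f$ vanishes outside $[-A/2, A/2]$, which is the situation taken up in the next section.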


6.5. Input with Compact Support

With the Poisson Summation Formula and the replication perspective in our quiver, we may now turn to a more general class of functions that could be used as input to the DFT. This is the class of compactly supported functions (also called spatially limited functions or functions of finite duration). These functions have the property that

$$f(x) = 0 \quad \text{for} \quad |x| > \frac{A}{2},$$

where $A > 0$ is some real number. The interval $[-A/2, A/2]$ is called the interval of support, or simply the support, of $f$. Compactly supported functions are not uncommon in applications. In most problems with spatial dependence (for example, image processing or spectroscopy), an object of finite extent is represented by a function that vanishes outside of some region. Although a compactly supported function is not periodic, we can still compute its Fourier coefficients $c_k$ on its interval of support. The function

$$g(x) = \sum_{k=-\infty}^{\infty} c_k\,e^{i2\pi kx/A}, \quad \text{where} \quad c_k = \frac{1}{A}\int_{-A/2}^{A/2} f(x)\,e^{-i2\pi kx/A}\,dx, \tag{6.12}$$

is an A-periodic function (called the periodic extension of $f$) that is identical to $f$ on the interval $[-A/2, A/2]$ (using average values of $f$ at the endpoints of the interval if necessary). This says that when a compactly supported function is sampled on its interval of support, and the samples are used as input for a DFT, it is as if the periodic function $g$ had been sampled. And since the DFT "sees" samples of a periodic function, the results of the previous two sections are relevant. Notice that if the original function $f$ is spatially limited, then it cannot also be band-limited, and hence aliasing can be expected to occur.

However, we have not dispensed with this case completely. It is not entirely analogous to the case of periodic functions. Since $f$ is compactly supported, it also has a Fourier transform, and it is instructive to relate the Fourier transform of a compactly supported function to its Fourier coefficients. As was shown in Section 2.7, if $f(x) = 0$ when $|x| > A/2$, then its Fourier transform is given by

$$\hat{f}(\omega) = \int_{-A/2}^{A/2} f(x)\,e^{-i2\pi\omega x}\,dx. \tag{6.13}$$

Comparing the Fourier transform (6.13) evaluated at $\omega = \omega_k = k/A$ to the Fourier coefficients of $f$ (6.12), we discover a valuable result.

Relationship

The Fourier transform of a function compactly supported on $[-A/2, A/2]$ evaluated at the frequency $\omega_k = k/A$ is a constant multiple of the corresponding Fourier coefficient:

$$\hat{f}(\omega_k) = A\,c_k.$$

Hence the DFT of a compactly supported function will provide approximations to both its Fourier coefficients and its Fourier transform. The task is to estimate the errors in these approximations. Before stating the central theorem, it might be helpful to garner some qualitative understanding of DFT errors in the case of compactly supported functions.

The replication form of the Poisson Summation Formula (6.11) can tell us a lot. Recall that it says

$$A\,\mathcal{D}\{\mathcal{R}_N\{f_n\}\}_k = \mathcal{R}_N\{\hat{f}_k\}.$$

In the present case, in which $f$ has compact support on $[-A/2, A/2]$, replication of $f_n$ with period $N$ does not change the sequence $f_n$; hence $\mathcal{R}_N\{f_n\} = f_n$. Therefore, the Poisson Summation Formula takes the form

$$A\,F_k = \mathcal{R}_N\{\hat{f}_k\} = g(\omega_k)$$

for $k = -N/2 + 1 : N/2$, where we have used $g(\omega_k)$ to stand for samples of the replication of $\hat{f}$. Now, the error in the DFT, $F_k$, as an approximation to $\hat{f}(\omega_k)$ for $k = -N/2 + 1 : N/2$ is $|A F_k - \hat{f}(\omega_k)|$. With the help of the triangle inequality, it may be written as

$$|A F_k - \hat{f}(\omega_k)| \le |A F_k - g(\omega_k)| + |g(\omega_k) - \hat{f}(\omega_k)|.$$

The first term vanishes identically by virtue of the Poisson Summation Formula. The second term can be attributed to the sampling of the function $f$: sampling in the spatial domain essentially replaces $\hat{f}$ by the replication of $\hat{f}$ which we have called $g$. The outcome is rather surprising; we see that, in this special case of a compactly supported function, the error in the DFT is simply the difference between the transform $\hat{f}$ and its replication $g$. We need to learn more about the actual size of that error.

We will briefly mention another perspective on DFT errors. Approximating the Fourier transform of a compactly supported function takes place in two steps:

1. the function $f$ must be sampled on its interval of support $[-A/2, A/2]$ (or a larger interval);

2. the sampled function, now a sequence of length $N$, is used as input to the DFT.

Each of these steps has a tangible graphical interpretation. The reader is directed to the excellent discussion and figures of Brigham [20], [21] for the details of this perspective. Having presented these preliminary arguments for motivation, let's now turn to the central result that actually gives estimates of the size of errors in the DFT approximations to the Fourier transform.

THEOREM 6.5. ERROR IN THE DFT (COMPACTLY SUPPORTED FUNCTIONS). Let $f(x) = 0$ for $|x| \ge A/2$. Let the $A$-periodic extension of $f$ have $(p - 1)$ continuous derivatives for $p \ge 1$ and assume that $f^{(p)}$ is bounded and piecewise monotone on $[-A/2, A/2]$. If the $N$-point DFT is used to approximate $\hat f$ at the points $\omega_k = k/A$, then

\[ |A F_k - \hat f(\omega_k)| \le \frac{C}{N^{p+1}} \quad \text{for } k = -N/2 + 1 : N/2, \]

where $C$ is a constant independent of $k$ and $N$.


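Proof: The assumption that $f(x) = 0$ for $|x| \ge A/2$ reduces the Poisson Summation Formula to

\[ \Delta x \sum_{n=-N/2+1}^{N/2} f(x_n)\, e^{-i2\pi\omega_k x_n} = \sum_{j=-\infty}^{\infty} \hat f\!\left(\omega_k - \frac{j}{\Delta x}\right), \]

where we have evaluated both sums at $\omega = \omega_k = k/A$. The sum on the left is $A$ times the $N$-point DFT of $f$ sampled on the interval $[-A/2, A/2]$. We can now rearrange the previous expression and apply the triangle inequality to conclude that

\[ |A F_k - \hat f(\omega_k)| \le \sum_{j \ne 0} \left| \hat f\!\left(\omega_k - \frac{j}{\Delta x}\right) \right| = \sum_{j \ne 0} |\hat f(\omega_{k-jN})|. \]

The equality follows by recalling that $1/\Delta x = N/A$, so that $\omega_k - j/\Delta x = (k - jN)/A = \omega_{k-jN}$. We now call in the relationship between the Fourier coefficients and the Fourier transform, $\hat f(\omega_k) = A c_k$, which gives us that

\[ |A F_k - \hat f(\omega_k)| \le A \sum_{j \ne 0} |c_{k-jN}|. \]

Theorem 6.2 can now be used to bound the Fourier coefficients $c_k$ by a constant times $|k|^{-(p+1)}$, and then we proceed as in the proof of Theorem 6.3 to bound the series. The result is that

\[ |A F_k - \hat f(\omega_k)| \le \frac{C}{N^{p+1}}, \]

where $C$ is a constant that depends on neither $k$ nor $N$.

Two comments are in order. The first comment concerns another pester at the endpoints. The result of this theorem depends on the assumption that $f$ vanishes at the endpoints $\pm A/2$. If this requirement is relaxed, the periodic extension of $f$ may have a discontinuity at the endpoints, and a weaker bound results. This brings us to the second comment. If $f$ is bounded but discontinuous (which would correspond to $p = 0$), a separate proof is required. However, the result is as one might expect and hope: if the $A$-periodic extension of $f$ is bounded and piecewise monotone on $[-A/2, A/2]$, then the error in using the DFT to approximate the Fourier transform at the frequencies $\omega_k$ is bounded by $C/N$, where $C$ is independent of $k$ and $N$ [76]. Therefore, with this addendum, this theorem applies to all inputs with compact support that might be encountered in practice. We now present a case study to illustrate the conclusions of this theorem.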
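The rate promised by Theorem 6.5 can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the triangular pulse $f(x) = 1 - |x|$ on $[-1, 1]$ as test input (so $A = 2$ and $p = 1$), uses the known transform $\hat f(\omega) = (\sin(\pi\omega)/(\pi\omega))^2$, and measures the error as the maximum of $|A F_k - \hat f(\omega_k)|$ over $k$. The helper names `centered_dft` and `max_dft_error` are hypothetical.

```python
import numpy as np

def centered_dft(samples):
    """N-point DFT F_k = (1/N) sum_n f_n exp(-i 2 pi n k / N), with n and k
    both running over -N/2+1 : N/2 (direct O(N^2) evaluation, N even)."""
    N = len(samples)
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    phase = np.exp(-2j * np.pi * np.outer(idx, idx) / N)  # rows: k, columns: n
    return phase @ samples / N

def max_dft_error(N, A=2.0):
    """Max over k of |A F_k - fhat(omega_k)| for the triangular pulse (assumed test input)."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    x = idx * A / N                          # spatial grid x_n = n * dx
    f = np.maximum(1.0 - np.abs(x), 0.0)     # f(x) = 1 - |x| on [-1, 1], zero outside
    omega = idx / A                          # frequency grid omega_k = k / A
    fhat = np.sinc(omega) ** 2               # np.sinc(w) = sin(pi w) / (pi w)
    return np.max(np.abs(A * centered_dft(f) - fhat))

errors = {N: max_dft_error(N) for N in (32, 64, 128, 256)}
for N, err in errors.items():
    print(N, err)
```

With $p = 1$ the theorem predicts a bound of the form $C/N^2$, so each doubling of $N$ should reduce the printed error by a factor of roughly four.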

Case Study 3: Compactly supported functions. In this case study we consider a sequence of functions with increasing smoothness, each with compact support on the interval $[-1, 1]$. These functions are shown in Figure 6.4 and are given analytically as follows.

1. Square pulse. The square pulse on $[-1, 1]$ has a periodic extension that is not continuous at the endpoints $\pm 1$, and we expect that the error in the DFT should be bounded by $C/N$.

2. Triangular pulse. The function

\[ f(x) = 1 - |x| \quad \text{for } |x| \le 1 \]

has a periodic extension that is continuous, but its derivative is discontinuous at $x = 0, \pm 1$. Theorem 6.5 applies with $p = 1$, and we expect the error in the DFT to be bounded by $C/N^2$.

3. Quadratic with cusps. The function

\[ f(x) = 1 - x^2 \quad \text{for } |x| \le 1, \]

like the previous example, is continuous when extended periodically, but has a discontinuity in its derivative at $x = \pm 1$ ($p = 1$). The error in the DFT should decrease as $N^{-2}$ as $N$ increases.

4. Smooth quartic. The function

\[ f(x) = (1 - x^2)^2 \quad \text{for } |x| \le 1 \]

has continuous derivatives of orders 0, 1, and 2 when extended periodically, and Theorem 6.5 applies with $p = 3$. We expect to see errors bounded by $C/N^4$.

5. Smoother still! The periodic extension of the function

\[ f(x) = (1 - x^2)^4 \quad \text{for } |x| \le 1 \]

has continuous derivatives of orders 0, 1, 2, 3, and 4, and Theorem 6.5 applies with $p = 5$.

We define the error in the DFT approximations to the Fourier transform as

\[ E_N = \max_{k} |A F_k - \hat f(\omega_k)|. \]

Figure 6.8 shows how $\log E_N$ varies with $\log N$ for each of the five functions given above. The use of the logarithm identifies the rate of decrease of the errors since the graph of $\log(N^{-p})$ against $\log N$ is a straight line with slope $-p$. We see that in all five cases the errors decrease as $N^{-p-1}$, as predicted by Theorem 6.5.

FIG. 6.8. Case Study 3. The errors in DFT approximations to the Fourier transforms of the five functions of Case Study 3 are shown in this figure. The greater the degree of smoothness of $f$, the faster the decay rate of the errors. Each plot consists of the error curve (solid line) and the theoretical error bound $N^{-(p+1)}$ (dashed line), plotted on a log-log scale, so that a trend of $N^{-\alpha}$ appears as a line with slope $-\alpha$. The curves correspond to (top left) $p = 0$, the square pulse; (top right) $p = 1$, $f(x) = 1 - |x|$; (middle left) $p = 1$, $f(x) = 1 - x^2$; (middle right) $p = 3$, $f(x) = (1 - x^2)^2$; and (bottom left) $p = 5$, $f(x) = (1 - x^2)^4$. As predicted by the theory, the errors decrease very nearly as $N^{-p-1}$. It should be noted that only the slopes of the error curves are significant.

6.6. General Band-Limited Functions

In our systematic survey of the DFT landscape, the next class of functions that appears consists of functions that are nonperiodic, but are band-limited. The notion of band-limited functions was introduced earlier in the chapter with respect to periodic functions. We now give the analogous (and conventional) definition for nonperiodic functions. A function $f$ with a Fourier transform $\hat f$ is said to be band-limited if there exists some constant $\Omega$ such that

\[ \hat f(\omega) = 0 \quad \text{for } |\omega| > \frac{\Omega}{2}. \]

As in the periodic case, band-limited simply means that the function $f$ has no Fourier components with frequencies above the cut-off frequency $\Omega/2$. Note that the Fourier transform of a band-limited function has compact support. Two practical notes should be injected here. In most applications, it is impossible to know a priori whether a given function or "signal" is band-limited. The objective in computing the DFT is to determine the frequency structure of the function, and until it is known, a judgment about "band-limitedness" is unfounded. Perhaps more important is the fact that truly band-limited functions or signals are rare in practice. Fourier transforms may decrease rapidly, and at predictable rates, but functions whose Fourier transforms vanish beyond some fixed frequency are arguably idealizations. Nevertheless, with this caution now in the wind, let us proceed, because this case has several important lessons. The first path we will follow involves the Poisson Summation Formula, which has served us so well already. In contrast to the previous section in which $f$ itself vanished outside of some interval, thus simplifying the Poisson Summation Formula, we can now see a similar simplification because $\hat f$ vanishes outside of some interval.

First let's set some ground rules. As in the past, we will imagine that $f$ is sampled at $N$ equally spaced points on an interval $[-A/2, A/2]$. Notice that there is now no natural way to choose $A$ or $\Delta x$ in the spatial domain, since $f$ has no periodicity or compact support. For this reason it makes sense to choose grid parameters in the frequency domain first; then the reciprocity relations will determine the corresponding grid parameters in the spatial domain. The first requirement is that the frequency grid must cover the entire interval $[-\Omega/2, \Omega/2]$ on which $\hat f$ is nonzero (the pester at the endpoint will be discussed shortly). Having chosen $\Omega$ and the number of sample points $N$, the frequency grid spacing $\Delta\omega$ is given by $\Delta\omega = \Omega/N$. Whether one appeals to the reciprocity relations or the Nyquist sampling condition (they are ultimately equivalent), the spatial grid spacing $\Delta x$ must satisfy

\[ \Delta x \le \frac{1}{\Omega} \]

if aliasing is to be avoided. Finally, the length of the spatial domain is given by $A = N\Delta x$.

We are now ready to move towards an error estimate for the DFT as an approximation to $\hat f$. Not surprisingly, we begin with the Poisson Summation Formula

\[ \Delta x \sum_{n=-\infty}^{\infty} f(x_n)\, e^{-i2\pi\omega x_n} = \sum_{j=-\infty}^{\infty} \hat f\!\left(\omega - \frac{j}{\Delta x}\right). \]

With the assumption that $\hat f(\omega) = 0$ for $|\omega| > \Omega/2 = 1/(2\Delta x)$, the infinite sum on the right side of the formula collapses to the single term $\hat f(\omega)$. The left side contains the DFT when evaluated at $\omega = \omega_k$. Therefore, evaluating both sides at $\omega = \omega_k$, the Poisson Summation Formula now takes the form

\[ N\Delta x\, F_k + \Delta x\, {\sum_{|n| \ge N/2}}'' f(x_n)\, e^{-i2\pi\omega_k x_n} = \hat f(\omega_k) \]

for $k = -N/2 + 1 : N/2 - 1$. We have used the familiar definition of the DFT (using average values at the endpoints) and written $F_k$ on the left-hand side. Replacing $N\Delta x$ by $A$, a bound now follows by rearranging terms:

\[ |A F_k - \hat f(\omega_k)| \le \Delta x\, {\sum_{|n| \ge N/2}}'' |f(x_n)| \qquad (6.14) \]

for $k = -N/2 + 1 : N/2 - 1$. The notation $\sum''$ has been introduced to indicate that the $\pm N/2$ terms of the sum are weighted by $1/2$. Another pester arises here concerning endpoints. Notice that if $\hat f(\pm\Omega/2) \ne 0$, the above bound is not necessarily valid for $k = N/2$. It is a minor technicality which is avoided if $\Omega$ is chosen large enough that $\hat f(\pm\Omega/2) = 0$.

Now the situation becomes a bit sticky, since the task is to estimate the sum on the right side of inequality (6.14). Any statement on this matter requires additional assumptions on $f$ and its rate of decay for large $|x|$. For example, if a suitable algebraic decay rate of $|f(x)|$ for large $|x|$ is known, it is possible to approximate the sum on the right side of (6.14) by an integral which behaves asymptotically as $A^{-r}$ for large $A$, where $r$ is fixed by that decay rate (problem 135). Such results may not be of great practical use, and we will not attempt to be more specific (see [36] for more results concerning the truncation of Fourier integrals). The lesson to be extracted from (6.14) is that the DFT error for band-limited functions is an error due to truncation of $f$ in the spatial domain. It can be reduced by increasing $A$, which should be accomplished by increasing $N$ with $\Delta x$ fixed. Note that there is nothing to be gained by decreasing $\Delta x$ (equivalently increasing $\Omega$) since $f$ is band-limited. We will postpone a theorem on DFT errors in the band-limited case, as it will appear as a special case of a more general theorem in the next section.

We now consider another perspective provided by the replication formulation of the Poisson Summation Formula (6.11). In the present case of a band-limited function, the replication of the samples of $\hat f$ becomes very simple provided the sampling of the input is done correctly. The replication of $\hat f$ occurs with a period of $1/\Delta x$. Assuming that the sampling interval and the band-limit $\Omega$ satisfy the condition $\Delta x \le 1/\Omega$, the replication of $\hat f$ will produce no overlapping of $\hat f$ with itself. This means that the replication of $\hat f$ coincides with $\hat f$ itself at the grid frequencies $\omega_k$, and no aliasing occurs in the frequency domain. Therefore, we see that if the DFT is applied to the samples of the replicated input sequence $f_n$, then the samples of the Fourier transform are produced exactly. There are a couple of interesting consequences of this observation.

First, it suggests a method for improving the DFT approximation to the Fourier transform of a band-limited function: if the function $f$ is available over a large interval, one can compute an approximation to $\mathcal{R}_N\{f_n\}$ by taking several terms in the replication sum. This approximation can then be sampled at $N$ points of the interval $[-A/2, A/2]$ and the samples of the replication can be used as input to the DFT. In principle, the more closely $\mathcal{R}_N\{f_n\}$ can be approximated for the input, the more closely the samples of $\hat f$ can be approximated. The efficacy of this strategy is examined in Case Study 7 below. The second observation has already been made, but it is quite transparent within the replication framework. The difference between the samples of $f_n$ and the samples of $\mathcal{R}_N\{f_n\}$ can be minimized by taking the period of the replication operator as large as possible. The sampling of the input takes place over the spatial domain $[-A/2, A/2]$; therefore, increasing $A$ decreases the overlapping of the tails of $f$ in the replication process. If $A$ is increased with $N$ fixed, then $\Delta x$ also increases. The reciprocity relations ($\Omega = N/A = 1/\Delta x$) tell us that the extent of the frequency grid must decrease as $\Delta x$ increases, with the possibility that it will no longer cover the entire interval on which $\hat f$ is nonzero. This oversight would once again cause aliasing errors in the frequency domain. On the other hand, if $A$ is increased and $N$ is increased so that $\Delta x$ does not increase, then the frequency grid does not decrease in length, and the full support of $\hat f$ can still be sampled. As is so often the case, increasing $N$ is a remedy for many DFT errors.
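The first observation can be tried out directly. The sketch below is an illustration under stated assumptions, not the authors' experiment: the test function $f(x) = (\sin(\pi x)/(\pi x))^2$ is an assumed choice whose transform is the triangle $\hat f(\omega) = 1 - |\omega|$ on $[-1, 1]$, the grid ($N = 32$, $A = 8$, hence $\Omega = 4$) is also assumed, and the helper names are hypothetical. The input to the DFT is the partial replication sum $\sum_{j=-J}^{J} f(x_n + jA)$.

```python
import numpy as np

def centered_dft(samples):
    """N-point DFT with indices n, k = -N/2+1 : N/2 and a 1/N factor (direct evaluation)."""
    N = len(samples)
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    phase = np.exp(-2j * np.pi * np.outer(idx, idx) / N)  # rows: k, columns: n
    return phase @ samples / N

def replicated_input_error(N, A, J):
    """Max over k of |A G_k - fhat(omega_k)|, where G_k is the DFT of the
    partial replication sum_{j=-J}^{J} f(x_n + j A) of the assumed test function."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    x = idx * A / N
    g = sum(np.sinc(x + j * A) ** 2 for j in range(-J, J + 1))  # J = 0 is plain sampling
    fhat = np.maximum(1.0 - np.abs(idx / A), 0.0)               # exact transform at omega_k
    return np.max(np.abs(A * centered_dft(g) - fhat))

err_plain = replicated_input_error(32, 8.0, 0)
err_replicated = replicated_input_error(32, 8.0, 5)
print(err_plain, err_replicated)
```

Because $e^{-i2\pi\omega_k jA} = 1$ at the grid frequencies, feeding the partial replication to the DFT is equivalent to truncating $f$ on the longer interval of length $(2J + 1)A$, which is why the error shrinks as $J$ grows while $N$ stays fixed.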

Here is a third observation that results from the replication perspective. The error that we would like to estimate is

\[ |A F_k - \hat f(\omega_k)| \quad \text{for } k = -N/2 + 1 : N/2. \]

Using the triangle inequality we can write

\[ |A F_k - \hat f(\omega_k)| \le |A F_k - A G_k| + |A G_k - \hat f(\omega_k)|, \]

where $G_k$ denotes the DFT of the replicated sequence $\mathcal{R}_N\{f_n\}$. The source of the DFT error is the first term, which can be viewed as a truncation error since it is the difference between the DFT of $f_n$ and the DFT of the replication of $f_n$ (which includes values of $f$ outside the interval $[-A/2, A/2]$). The second term vanishes identically because of the replication form of the Poisson Summation Formula in this special case of a band-limited function.

We mention that there is also a graphical approach to understanding the DFT errors in this case. Let's begin by making the observation that in using the DFT to approximate the Fourier transform of a general band-limited function, three steps must be performed:

1. $f$ is truncated to restrict it to the interval of interest $[-A/2, A/2]$,

2. $f$ is sampled with a grid spacing $\Delta x$, which must satisfy $\Delta x \le 1/\Omega$ if aliasing is to be avoided, and

3. the truncated, sampled version of $f$, now a sequence of length $N$, is used as input to the DFT.

Each of these steps has a known effect on $f$ and its Fourier transform, which can be displayed graphically. Although this approach does not lead to an estimate of the error in the DFT, it does provide insight into how errors arise. As before, we cite the books of Brigham [20], [21] for the original presentation of this argument.

Finally, it would be inexcusable to discuss band-limited functions without returning to the Shannon Sampling Theorem, which was first encountered in Chapter 3. Given the omnipresence of the Poisson Summation Formula in this chapter, perhaps it is not surprising that it can also be used to derive the Sampling Theorem. The following presentation is not rigorous, but it is instructive nonetheless.

Moments ago we saw that the Poisson Summation Formula for a band-limited function $f$ is given by

\[ \Delta x \sum_{n=-\infty}^{\infty} f(x_n)\, e^{-i2\pi\omega x_n} = \sum_{j=-\infty}^{\infty} \hat f\!\left(\omega - \frac{j}{\Delta x}\right). \]

To avoid endpoint pesters, assume that $\hat f(\pm\Omega/2) = 0$. Then the spatial grid spacing should be chosen such that $\Delta x \le 1/\Omega$ to avoid aliasing. Since $\hat f$ vanishes outside of the interval $[-\Omega/2, \Omega/2]$, we can multiply both sides of the previous equation by the square pulse $p(\omega)$, which has a value of 1 on $(-\Omega/2, \Omega/2)$ and is zero elsewhere. In other words,

\[ \hat f(\omega) = \Delta x \sum_{n=-\infty}^{\infty} f(x_n)\, e^{-i2\pi\omega x_n}\, p(\omega). \]

The goal is to extract from this equation an expression for the original band-limited function $f$. It appears that we are not far from this goal since $\hat f$ appears on the left side of this last equation. Therefore, we take an inverse Fourier transform of this equation and try to make sense of the right-hand side. Doing this, we have

\[ f(x) = \Delta x \sum_{n=-\infty}^{\infty} f(x_n)\, \mathcal{F}^{-1}\{ e^{-i2\pi\omega x_n}\, p(\omega) \}(x). \]

Now recall two facts:

1. the inverse Fourier transform of the symmetric square pulse with width $\Omega$ is

\[ \mathcal{F}^{-1}\{ p(\omega) \}(x) = \frac{\sin(\pi\Omega x)}{\pi x}, \]

and

2. by the shift theorem for Fourier transforms

\[ \mathcal{F}^{-1}\{ e^{-i2\pi\omega x_n}\, p(\omega) \}(x) = \frac{\sin(\pi\Omega (x - x_n))}{\pi (x - x_n)}. \]

Assembling these observations, we arrive once again at the Shannon Sampling Theorem as it was presented in Chapter 3. If $f$ is band-limited with $\hat f(\omega) = 0$ for $|\omega| > \Omega/2$, and the grid spacing $\Delta x$ is chosen such that $\Delta x \le 1/\Omega$, then

\[ f(x) = \Delta x \sum_{n=-\infty}^{\infty} f_n\, \frac{\sin(\pi\Omega (x - x_n))}{\pi (x - x_n)}. \qquad (6.15) \]

The theorem claims that if a function $f$ is band-limited with a cut-off frequency $\Omega/2$, then it may be reconstructed from its samples $f_n$. Admittedly, the reconstruction requires an infinite number of samples and the use of the sinc function in formula (6.15) [131]. However, the theorem has variations and approximate versions that are of practical value. An excellent departure point for further reading on the Shannon Sampling Theorem is the tutorial review by Jerri [82], in which the author discusses the theorem and a host of related issues.
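The reconstruction can be approximated by truncating the sampling series. The sketch below assumes the series in the form $f(x) \approx \Delta x \sum_n f(x_n)\, \sin(\pi\Omega(x - x_n)) / (\pi(x - x_n))$, which is what the derivation above yields for a square pulse of width $\Omega$; the test function $f(x) = (\sin(\pi x)/(\pi x))^2$ (band-limited with $\Omega = 2$), the spacing $\Delta x = 0.4$, the truncation at $|n| \le 2000$, and the name `shannon_reconstruct` are all assumptions made for illustration.

```python
import numpy as np

def shannon_reconstruct(x, samples, x_n, dx, Omega):
    """Truncated sampling series:
    f(x) ~ dx * sum_n f(x_n) * sin(pi Omega (x - x_n)) / (pi (x - x_n))."""
    t = np.asarray(x)[:, None] - x_n[None, :]
    # Omega * np.sinc(Omega * t) equals sin(pi Omega t) / (pi t), including the limit at t = 0.
    kernel = Omega * np.sinc(Omega * t)
    return dx * (kernel @ samples)

def f(t):
    return np.sinc(t) ** 2      # assumed test function; its transform vanishes for |omega| >= 1

Omega = 2.0                     # full width of the band: fhat(omega) = 0 for |omega| > Omega/2
dx = 0.4                        # satisfies dx <= 1/Omega = 0.5
x_n = np.arange(-2000, 2001) * dx
x = np.linspace(-2.0, 2.0, 41)  # evaluation points, mostly between the samples

max_err = np.max(np.abs(shannon_reconstruct(x, f(x_n), x_n, dx, Omega) - f(x)))
print(max_err)
```

The residual error here comes only from dropping the samples with $|n| > 2000$; a function with slower decay would need many more terms for the same accuracy, which is the practical limitation the text alludes to.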

Case Study 4: Band-limited functions. As mentioned earlier, genuinely band-limited functions rarely arise in practice. In this case study, we will examine a clean but idealized band-limited function. The square pulse (or square wave) function was encountered in Chapter 3 along with its sinc function Fourier transform. We will turn this transform pair around and use the fact that

\[ f(x) = \frac{\sin(\pi x)}{\pi x} \quad \text{has the Fourier transform} \quad \hat f(\omega) = B_1(\omega). \]

Since $B_1$, the square pulse of width one, has compact support, the sinc function is band-limited. The use of the DFT to approximate the Fourier transform of the sinc function is an exquisite example of the reciprocity relations at work.

Two parameters will play leading roles in this numerical experiment. One is $A$, which determines the interval $[-A/2, A/2]$ from which samples of the sinc function are collected. The other parameter is the number of samples $N$. Once values of $A$ and $N$ are selected, the reciprocity relations determine the rest. The grid spacing in the frequency domain is $\Delta\omega = 1/A$, and the length of the frequency domain is $\Omega = N/A$.
TABLE 6.1
Grid parameters for Case Study 4. DFT of a band-limited function.

Figure          N      A     $\Delta\omega = 1/A$     $\Omega$
Upper right     32     8     .125 = 1/8               4
Middle left     64     8     .125 = 1/8               8
Middle right    64     16    .0625 = 1/16             4
Lower left      128    32    .03125 = 1/32            4
Lower right     128    64    .015625 = 1/64           2

Figure 6.9 shows just a few of the many possible DFT approximations that might be computed using different combinations of $A$ and $N$. The upper left figure shows the sinc function itself. The remaining five cases and the attendant grid parameters are summarized in Table 6.1.
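The entries of Table 6.1 follow mechanically from the reciprocity relations $\Delta\omega = 1/A$ and $\Omega = N/A = 1/\Delta x$. A minimal sketch (the function name `grid_parameters` and the dictionary keys are hypothetical):

```python
def grid_parameters(N, A):
    """Grid parameters implied by the reciprocity relations for N samples on [-A/2, A/2]."""
    return {
        "dx": A / N,          # spatial grid spacing
        "d_omega": 1.0 / A,   # frequency grid spacing
        "Omega": N / A,       # length of the frequency domain, equal to 1/dx
    }

# The five (N, A) pairs of Table 6.1.
for N, A in [(32, 8), (64, 8), (64, 16), (128, 32), (128, 64)]:
    p = grid_parameters(N, A)
    print(N, A, p["d_omega"], p["Omega"])
```

Since the sinc function of this case study has its transform confined to $|\omega| \le 1/2$, every row of the table has $\Omega \ge 1$ and so covers the full support of the square pulse.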

The numerical evidence is quite informative. The general shape of the square pulse is evident in all five DFT sequences shown and becomes more clearly defined as $N$ increases. All of the approximations show oscillations (overshoot) near the discontinuities in the square pulse, which is due to the well-known Gibbs$^4$ effect. These oscillations subside with increasing $N$. Caused by the nonuniform convergence of the Fourier series near discontinuities, the Gibbs effect has been analyzed intensively; accounts of the Gibbs effect and methods to correct and minimize it can be found in [30], [55], and [84].

Let's see how the reciprocity relations enter the picture. With a width of one unit, there are roughly $1/\Delta\omega$ grid points under the nonzero part of the square pulse. By virtue of the reciprocity relations, the only way to increase the resolution and put more points under the pulse is to increase the length of the sampling interval $A$, since $\Delta\omega = 1/A$. However, if $A$ is increased with $N$ fixed, the effect is to decrease the length of the frequency domain $\Omega$, as shown in moving from the middle left to the middle right figure. Note that the unit on the horizontal axes of the DFT plots is actual frequency (cycles per unit length). Therefore, if one wishes to increase the frequency resolution and maintain the same frequency domain, it is necessary to increase both $A$ and $N$ commensurately, as shown by moving from the middle right to the lower left figure. As the sequence of figures suggests, errors in the DFT decrease as both $A$ and $N$ are increased, and with $A = 64$ and $N = 128$ (lower right figure), the square pulse is fairly well resolved.

FIG. 6.9. Case Study 4. The Fourier transform of the band-limited sinc function (upper left) is approximated by the DFT with various values of $A$ and $N$ as given in Table 6.1. The horizontal axes on the DFT plots have units of frequency (cycles per unit length).

$^4$ JOSIAH WILLARD GIBBS (1839-1903) is generally regarded as one of the first and greatest American physicists. A professor at Yale University, he was the founder of chemical thermodynamics and modern physical chemistry.

6.7. General Input

We now come to the final case in which the input to the DFT does not have periodicity, compact support, or a band-limit. We will assume that these functions are defined on an interval of the real line $(a, b)$ where $a$ and/or $b$ is infinite. The only other assumption is that the function $f$ that provides the input is absolutely integrable, that is,

\[ \int_a^b |f(x)|\, dx < \infty; \]

this ensures that its Fourier transform exists. Our goal is to approximate the Fourier transform $\hat f$ using the DFT and then to estimate the errors in that approximation. Our conclusions may be less general than in earlier cases. Nevertheless, some insight can be gained and we will be able to make some remarks that unify all of the preceding sections.

Let's return to the Poisson Summation Formula one last time, and it will tell us immediately what this case entails. Recall that it may be written

\[ \Delta x \sum_{n=-\infty}^{\infty} f(x_n)\, e^{-i2\pi\omega x_n} = \sum_{j=-\infty}^{\infty} \hat f\!\left(\omega - \frac{j}{\Delta x}\right), \]

where $\Delta x$ is the grid spacing in the spatial domain and $\omega$ is an arbitrary frequency. In the absence of compact support for $f$ (which would make the left-hand sum finite) or band-limiting (which would make the right-hand sum finite), both of the sums are infinite in general. We can split both sums, evaluate them at $\omega = \omega_k$, and use the definition of the DFT to write

\[ N\Delta x\, F_k + \Delta x\, {\sum_{|n| \ge N/2}}'' f(x_n)\, e^{-i2\pi\omega_k x_n} = \hat f(\omega_k) + \sum_{j \ne 0} \hat f\!\left(\omega_k - \frac{j}{\Delta x}\right). \]

We have set $\omega = \omega_k$ to anticipate the fact that $A F_k$ will approximate $\hat f(\omega_k)$. A slight rearrangement with $N\Delta x = A$ gives a bound for the error in $A F_k$ as an approximation to $\hat f(\omega_k)$:

\[ |A F_k - \hat f(\omega_k)| \le \Delta x\, {\sum_{|n| \ge N/2}}'' |f(x_n)| + \sum_{j \ne 0} \left| \hat f\!\left(\omega_k - \frac{j}{\Delta x}\right) \right|. \qquad (6.16) \]

Again the notation $\sum''$ indicates that the $\pm N/2$ terms in the sum are weighted by $1/2$. We can now see qualitatively how errors enter the DFT. In a very real sense this general case is a linear combination of the two previous cases. The first term on the right side of this error bound is due to the fact that $f$ does not have compact support, and $f$ must be truncated on a finite interval; this term was encountered in Section 6.6. The second term on the right side of this inequality arises because $f$ is not band-limited, and hence some aliasing can be expected in the form of overlapping "tails" of $\hat f$. This was the same term that was handled in Section 6.5.

It turns out that a DFT error result can be stated in this case for specific classes of functions. We will offer such a result for functions with exponential decay for large $|x|$. While this hardly exhausts all functions of practical interest, it does suggest a general approach to estimating errors.

THEOREM 6.6. ERROR IN THE DFT (GENERAL FUNCTIONS WITH EXPONENTIAL DECAY). Let the $A$-periodic extension of $f$ have $(p - 1)$ continuous derivatives and assume that $f^{(p)}$ is integrable and piecewise monotone on $(-\infty, \infty)$ for $p \ge 1$. Furthermore, assume that $|f(x)| \le K e^{-a|x|}$ for $|x| > A/2$, for some $a > 0$. If the function $f$ is sampled on the interval $[-A/2, A/2]$ and the $N$-point DFT is used to approximate $\hat f$ at the points $\omega_k = k/A$, then

where $C$ is a constant independent of $k$ and $N$.

The proof follows that of Theorem 6.5 with additional arguments to handle the exponential decay and the noncompact support of $f$. A brief sketch of the proof is in order.

Proof: As shown above (6.16), the Poisson Summation Formula for this case leads to the bound

\[ |A F_k - \hat f(\omega_k)| \le \Delta x\, {\sum_{|n| \ge N/2}}'' |f(x_n)| + \sum_{j \ne 0} |\hat f(\omega_{k-jN})|. \]

The first sum can be bounded by

This expression attains a maximum over $k$ when $\omega_k = 0$, and a bound for the first sum is given by

The second sum cannot be treated as it was in the proof of Theorem 6.5, since $f$ is not compactly supported. We must first write the summand as

\[ \hat f(\omega_{k-jN}) = \left( \int_{-\infty}^{-A/2} + \int_{-A/2}^{A/2} + \int_{A/2}^{\infty} \right) f(x)\, e^{-i2\pi\omega_{k-jN} x}\, dx \quad \text{for } j \ne 0. \]

The second integral over $[-A/2, A/2]$ can be identified as the Fourier coefficient $c_{k-jN}$ and treated as in the proof of Theorem 6.5. The first and third integrals must be integrated by parts $p$ times with contributions at $x = \pm A/2$ canceling, as in the proof of Theorems 6.1 and 6.2. The outcome is that the entire second sum can also be bounded. Combining these two bounds we have that

for $k = -N/2 + 1 : N/2$. This proof has not been extended for the case $p = 0$, but it seems feasible.

Let's make a few observations. Although the goal is to approximate the Fourier transform $\hat f$, the DFT is still applied on a finite interval $[-A/2, A/2]$ with a finite number of points. The error in the DFT depends on the number of sample points, the length of the interval, and the smoothness of $f$ on that interval. Specifically, the error decreases as the length of the interval increases (approaching the interval of integration for the Fourier transform), as the number of points increases (as we have seen before), and as the smoothness of $f$ increases (including smoothness at the endpoints $\pm A/2$). These dependencies will become clear when we present a case study.

As usual, the replication perspective also offers insight. The Poisson Summation Formula in terms of replication operators is

In the absence of compact support or band-limiting, neither replication operation can be simplified. If the DFT could be applied to the full replication of the input, $\mathcal{R}_N\{f_n\}$, the best we could do is to produce samples of the replication of $\hat f$. Unfortunately, it is impossible to reconstruct a function or a sequence uniquely from its replication. Therefore, an error is introduced because the DFT produces samples of $\mathcal{R}_N\{\hat f\}$, not samples of $\hat f$ itself; this is the aliasing error mentioned above. But in practice the DFT cannot be applied to the exact replication of the input; therefore, a second source of error is introduced, corresponding to the truncation of $f$. Notice the tension that the reciprocity relations impose. The error in either of the two replication operations ($\mathcal{R}_N\{f_n\}$ or $\mathcal{R}_N\{\hat f_k\}$) can be reduced by increasing either $A$ or $\Omega$. However, unless $N$ is increased commensurately, the reduction in one error is achieved at the expense of the other.
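The tension between truncation and aliasing shows up clearly in a small experiment. The sketch below is illustrative only: it assumes the Gaussian $f(x) = e^{-\pi x^2}$, which is neither compactly supported nor band-limited and has the transform $\hat f(\omega) = e^{-\pi\omega^2}$, holds $N = 16$ fixed, and varies $A$ (so that $\Omega = N/A$ varies in the opposite direction). The helper names are hypothetical.

```python
import numpy as np

def centered_dft(samples):
    """N-point DFT with indices n, k = -N/2+1 : N/2 and a 1/N factor (direct evaluation)."""
    N = len(samples)
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    phase = np.exp(-2j * np.pi * np.outer(idx, idx) / N)  # rows: k, columns: n
    return phase @ samples / N

def gaussian_dft_error(N, A):
    """Max over k of |A F_k - fhat(omega_k)| for the assumed Gaussian test input."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    x = idx * A / N
    omega = idx / A
    f = np.exp(-np.pi * x ** 2)          # even function, so endpoint averaging is automatic
    fhat = np.exp(-np.pi * omega ** 2)   # exact Fourier transform
    return np.max(np.abs(A * centered_dft(f) - fhat))

N = 16
errs = {A: gaussian_dft_error(N, A) for A in (2.0, 4.0, 16.0)}
for A, err in errs.items():
    print(A, N / A, err)   # A, Omega = N / A, error
```

With $A = 2$ the interval is too short and truncation dominates; with $A = 16$ the frequency grid has shrunk to $\Omega = 1$ and aliasing dominates; the intermediate choice $A = 4$ balances the two and gives a far smaller error for the same $N$.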

Case Study 5: Asymmetric exponentials. We will consider the problem of approximating the Fourier transform of functions of the form

\[ f(x) = \begin{cases} e^{-ax} & \text{for } x \ge 0, \\ 0 & \text{for } x < 0, \end{cases} \]

where $a > 0$. Parting with the convention used in most of this book, we will consider the DFT applied to a set of $N$ equally spaced samples taken from the asymmetric interval $[0, A]$. The way in which $A$ and $N$ are chosen is of critical importance, and we will investigate how the error in the DFT varies with these two parameters. We will proceed analytically and give a lustrous exhibit of the limiting relations among the DFT, the Fourier coefficients, and the Fourier transform.

Clearly, $f$ is not compactly supported and there is no reason to suspect that it is band-limited. From a practical point of view, perhaps the first decision concerns the frequency range that needs to be resolved. Let's assume that the Fourier transform is required at frequencies in the range $[-\Omega/2, \Omega/2]$, where $\Omega/2$ is a specified maximum frequency. The reciprocity relationship now enters in a major way. In order to resolve components in this frequency range, the grid spacing $\Delta x$ must satisfy $\Delta x \le 1/\Omega$. Having chosen $\Delta x$, the choice of $N$, the number of sample points, determines the length of the spatial interval $A = N\Delta x$. It also determines the grid spacing in the frequency domain since $\Delta\omega = \Omega/N$. Finally, with $\Delta\omega$ specified, the actual frequencies $\omega_k = k/A$ are determined for $k = -N/2 + 1 : N/2$. It also follows from the reciprocity relationship that if a larger range of frequencies is desired, then $\Delta x$ must be decreased. If this decrease is made with $N$ fixed, then $A$ decreases and $\Delta\omega$ increases; that is, resolution is lost on the frequency grid. If $\Delta x$ is decreased and $N$ is increased proportionally, then $A$ and $\Delta\omega$ remain unchanged. With these qualitative remarks, let's now do some calculations.

It is possible to compute the DFT and Fourier series coefficients of $f$ on $[0, A]$ analytically. They are given in The Table of DFTs; in particular, the Fourier coefficients are

\[ c_k = \frac{1 - e^{-aA}}{aA + i2\pi k}. \]

A short calculation also reveals that the Fourier transform of $f$ is given by

\[ \hat f(\omega_k) = \frac{1}{a + i2\pi\omega_k}, \]

where $\hat f$ has been evaluated at $\omega = \omega_k$. We can now compare $F_k$, $c_k$, and $\hat f(\omega_k)$. Here is the first observation. Recall that for a compactly supported function, $A c_k = \hat f(\omega_k)$. For this noncompactly supported function, we see that $A c_k$ approaches $\hat f(\omega_k)$ as $A$ becomes large (problem 132); that is,

\[ \lim_{A \to \infty} A c_k = \hat f(\omega_k). \]

To get the DFT into the picture, it will be useful to express $F_k$ in an approximate form. Assume that $A$ is fixed and $N$ is large (hence $\Delta x$ is small). Using a Taylor$^5$ series to expand the exponential, $\sin\theta_k$, and $\cos\theta_k$, it can be shown (rather laboriously (problem 132)) that for $k = -N/2 + 1 : N/2$

\[ F_k = \frac{1 - e^{-aA}}{aA + i2\pi k}\, (1 + c\,\Delta x), \]

where $c$ is a constant that involves $k$, but not $A$ or $N$. The term $c\,\Delta x$ represents the error in the Taylor series. Comparing the DFT in this form to the expressions for $c_k$ and $\hat f(\omega_k)$, we can now write two most revealing relationships. We see that the DFT coefficients are related to the Fourier coefficients, for $k = -N/2 + 1 : N/2$, by

\[ F_k = c_k\, (1 + c\,\Delta x). \qquad (6.17) \]

Furthermore, the DFT coefficients and the Fourier transform are related according to

\[ A F_k = \hat f(\omega_k)\, (1 - e^{-aA})\, (1 + c\,\Delta x). \qquad (6.18) \]

$^5$ BROOK TAYLOR (1685-1731) was an English mathematician who published his famous expansion theorem in 1715. Educated at Cambridge University, he became secretary of the Royal Society at an early age, then resigned so he could write.

What does it mean? Looking at relation (6.17) first, we see that for fixed $A$ and $\omega_k$, the DFT coefficients approach the Fourier series coefficients as $N$ becomes large and $\Delta x$ approaches zero; that is,

\[ \lim_{N \to \infty,\ \Delta x \to 0} F_k = c_k \]

for $k = -N/2 + 1 : N/2$. Implicit in this limit (because of the reciprocity relations) is the fact that $\Delta\omega$ is fixed and $\Omega \to \infty$. In other words, letting $\Delta x$ decrease allows higher frequencies to be resolved, which means that the length of the frequency domain increases. However, since $A$ remains fixed, the actual resolution in the frequency domain $\Delta\omega$ does not change.

Having let $\Delta x \to 0$ and $N \to \infty$, expression (6.18) becomes $A F_k = (1 - e^{-aA})\, \hat f(\omega_k)$. Now letting $A \to \infty$ (which also means $\Delta\omega \to 0$) we have that

\[ \lim_{A \to \infty}\ \lim_{N \to \infty,\ \Delta x \to 0} A F_k = \hat f(\omega_k), \]

where $A$ is held fixed in the inner limit while $N \to \infty$ and $\Delta x \to 0$.

We may also reverse the two limits above and watch the DFT approach $\hat f$ along another path. Imagine sampling $f$ on larger intervals by holding $\Delta x$ fixed and increasing $A$; this means that $N$ increases as $A$ increases. Notice that as $A$ increases, $\Delta\omega$ decreases. Since we regard $A F_k$ as an approximation to $\hat f(\omega_k)$, it is necessary to let $\omega_k = k/A$ be fixed also. Look again at expression (6.18) and let $A$ increase. We see that

\[ \lim_{A,\, N \to \infty} A F_k = \hat f(\omega_k)\, (1 + c\,\Delta x). \]

In other words, if we hold the grid spacing $\Delta x$ fixed and increase the length of the interval $A$ by increasing $N$, then $A F_k$ approaches the value of the Fourier transform at $\omega = \omega_k$ to within a relative error of $c\,\Delta x$. Notice that implicit in this limit, $\Omega$ is fixed and $\Delta\omega \to 0$.

If we now let $\Delta x$ approach zero (meaning $\Omega$ becomes large), then $A F_k$ approaches $\hat f(\omega_k)$. This two-limit process can be summarized as

\[ \lim_{\Delta x \to 0}\ \lim_{A,\, N \to \infty} A F_k = \hat f(\omega_k) \]

for $k = -N/2 + 1 : N/2$, where in the inner limit $\Delta x$ is fixed while $A \to \infty$. In practice, these limits are never realized computationally. However, they do indicate the sources of error in the DFT and how quickly those errors subside. In this particular case study, the error due to truncation of the interval of integration decreases exponentially with $A$ (as predicted by Theorem 6.6). On the other hand, the error due to the lack of smoothness of $f$ decreases only as $\Delta x$ or $1/N$ (also by Theorem 6.6).
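The slow $1/N$ decay can be observed with a short computation. The sketch below rests on several assumptions that are not taken from the text: $a = 1$ and $A = 20$ (so the truncation error, of order $e^{-aA}$, is negligible), samples at $x_n = n\Delta x$ for $n = 0 : N - 1$, the first sample replaced by the endpoint average $(f(0) + f(A))/2$, and the error measured as the maximum over the whole frequency grid. NumPy's FFT is used because its convention $\sum_n f_n e^{-i2\pi nk/N}$ matches the DFT here up to the factor $1/N$.

```python
import numpy as np

def asym_exp_dft_error(a, A, N):
    """Max over k of |A F_k - fhat(omega_k)| for f(x) = exp(-a x), x >= 0, sampled on [0, A)."""
    dx = A / N
    x = np.arange(N) * dx
    f = np.exp(-a * x)
    f[0] = 0.5 * (1.0 + np.exp(-a * A))      # endpoint average (an assumed convention)
    F = np.fft.fft(f) / N                    # F_k = (1/N) sum_n f_n exp(-i 2 pi n k / N)
    omega = np.fft.fftfreq(N, d=dx)          # omega_k = k / A, in FFT ordering
    fhat = 1.0 / (a + 2j * np.pi * omega)    # exact Fourier transform at omega_k
    return np.max(np.abs(A * F - fhat))

err_256 = asym_exp_dft_error(1.0, 20.0, 256)
err_512 = asym_exp_dft_error(1.0, 20.0, 512)
print(err_256, err_512, err_256 / err_512)
```

Under these assumptions, doubling $N$ with $A$ fixed roughly halves the maximum error, in line with the $1/N$ behavior attributed above to the jump of $f$ at $x = 0$. Other endpoint conventions change the constants and the behavior at individual frequencies, so this is one illustration rather than a reproduction of relation (6.18).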

The relationships among the DFT, the Fourier series coefficients, and the Fourier transform, and the manner in which they approach each other in various limits are shown in Figure 6.10. These dependencies apply to any general (noncompactly supported, non-band-limited) input function $f$, although the rates of convergence in the various limits depend upon the properties of $f$. The previous analysis can also be carried out for the symmetric exponential function $f(x) = e^{-a|x|}$, with slightly different convergence rates (highly recommended: problem 134).

FIG. 6.10. For a general (noncompactly supported, non-band-limited) function, the DFT $F_k$ approximates the Fourier series coefficients $c_k$ and the Fourier transform $\hat f(\omega_k)$ in various limits. With the interval $[-A/2, A/2]$ fixed, $F_k$ approaches the Fourier coefficients $c_k$ as $N \to \infty$ and $\Delta x \to 0$ (which also implies that $\Omega \to \infty$ with $\Delta\omega$ fixed). With the sampling rate $\Delta x$ fixed, $A F_k$ approaches $\hat f$ up to small errors proportional to $(\Delta x)^p$ as $N, A \to \infty$ (which also implies that $\Delta\omega \to 0$ with $\Omega$ fixed). Letting $A \to \infty$ in the first case or $\Delta x \to 0$ in the second case allows the $A F_k$ to approach $\hat f(\omega_k)$. In order to make consistent comparisons, the frequency of interest $\omega_k$ must be held fixed in each of the limits.

6.8. Errors in the Inverse DFT

You may agree that a significant amount of effort has been devoted to the question of errors in the forward DFT. And yet, in one sense, only half of the work has been done. We still have the equally important inverse DFT (IDFT) to consider. Be assured that the discussion of errors in the IDFT can be streamlined considerably, partly by relying on the results of the previous sections. At the same time, there are some features of the IDFT that are genuinely new, and these properties need to be pointed out carefully. First we set the stage and review a few earlier remarks.

We must now imagine starting in the frequency domain with either a sequence of coefficients $c_k$ or a function $\hat f(\omega)$, either of which could be complex-valued. As the notation suggests, if we have a sequence $\{c_k\}$, it should be regarded as a set of Fourier coefficients of a function $f$ on an interval $[-A/2, A/2]$. If we have a function $\hat f(\omega)$, it should be regarded as the Fourier transform of a function $f$. In either case, the goal is to reconstruct $f$, or more realistically, $N$ samples of $f$ at the grid points $x = x_n$ on some interval $[-A/2, A/2]$. Let's deal with these two cases separately.

Fourier Series Synthesis

First consider the case in which the input to the IDFT is a sequence $\{c_k\}$, and the task is to reconstruct the function $f$ that has Fourier coefficients $\{c_k\}$. At this point, it is necessary to introduce some new notation. As said, we will let $f$ represent the function with Fourier coefficients $\{c_k\}$. This means that the samples of $f$ (let's call them $f(x_n)$) are the exact solution to the problem. On the other hand we will use the IDFT to compute approximations to the values of $f(x_n)$, and these approximations need a new name. We will let $f_n$ denote the sequence generated by the IDFT. In other words, using the definition of the IDFT,

$$f_n = \sum_{k=-N/2+1}^{N/2} c_k e^{i2\pi nk/N}$$

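for $n = -N/2+1 : N/2$. Once again the reciprocity relations enter in a crucial way. If the function $f$ is to be approximated at $N$ equally spaced grid points on the interval $[-A/2, A/2]$, then $c_k$ must be interpreted as the Fourier coefficient corresponding to the frequency $\omega_k = k/A$ on an interval $[-\Omega/2, \Omega/2]$, where $\Omega = N/A$. Therefore, with $x_n = nA/N$ and $\omega_k = k/A$, we have

$$f_n = \sum_{k=-N/2+1}^{N/2} c_k e^{i2\pi\omega_k x_n} \qquad (6.19)$$

for $n = -N/2+1 : N/2$. Now notice that the Fourier series for $f(x_n)$ is

$$f(x_n) = \sum_{k=-\infty}^{\infty} c_k e^{i2\pi\omega_k x_n} \qquad (6.20)$$

for $n = -N/2+1 : N/2$. If we now compare expressions (6.19) and (6.20), we can see how well the IDFT approximates the values of $f(x_n)$. Subtracting the two expressions, we have that

$$f(x_n) - f_n = \sum_{k \le -N/2} c_k e^{i2\pi\omega_k x_n} + \sum_{k > N/2} c_k e^{i2\pi\omega_k x_n}$$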

for $n = -N/2+1 : N/2$. In the unlikely case that the coefficients $c_k$ are nonzero only for $k = -N/2+1 : N/2$, we see that the IDFT exactly reproduces the values of $f(x_n)$; this is the case of a periodic band-limited function $f$. This simply says that if $f$ has a finite number of frequency components and the IDFT has enough terms to include all of them, then $f$ can be reconstructed exactly at the grid points.
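The exactness claim is easy to check numerically. The following is a minimal sketch in Python with NumPy, offered as an illustration rather than as part of the original discussion; the coefficient values and the helper name `centered_idft` are assumptions chosen for the example. It builds a band-limited coefficient set, applies the IDFT on the centered index set, and compares the result with the finite Fourier series evaluated at the grid points.

```python
import numpy as np

def centered_idft(coeffs):
    """Return f_n = sum_k c_k exp(i 2 pi n k / N) for n, k = -N/2+1 : N/2 (N even)."""
    N = len(coeffs)
    idx = np.arange(-N // 2 + 1, N // 2 + 1)        # centered index set
    kernel = np.exp(2j * np.pi * np.outer(idx, idx) / N)
    return idx, kernel @ coeffs

A, N = 2.0, 16
# Hypothetical band-limited coefficient set: only |k| <= 3 is nonzero, and
# c_{-k} = conj(c_k), so the underlying function is real-valued.
modes = {0: 0.5, 1: 0.25 - 0.1j, -1: 0.25 + 0.1j, 3: 0.05j, -3: -0.05j}

idx = np.arange(-N // 2 + 1, N // 2 + 1)
c = np.zeros(N, dtype=complex)
for k, value in modes.items():
    c[idx == k] = value

_, f_idft = centered_idft(c)

# Exact values f(x_n) of the (finite) Fourier series at x_n = n A / N.
x = idx * A / N
f_exact = sum(value * np.exp(2j * np.pi * k * x / A) for k, value in modes.items())

max_error = np.max(np.abs(f_idft - f_exact))   # rounding-level difference
```

Because every nonzero mode lies inside $k = -N/2+1 : N/2$, the two sets of values should agree to rounding error; adding a mode with $|k| > N/2$ to `modes` (but not to `c`) would expose the truncation error instead.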

More realistically, if the sequence $c_k$ has nonzero values for arbitrarily large $|k|$, then the IDFT will use only $N$ of those coefficients and the error can be bounded by

$$|f(x_n) - f_n| \le \sum_{k \le -N/2} |c_k| + \sum_{k > N/2} |c_k|$$

for $n = -N/2+1 : N/2$. In other words, the error in using the IDFT to approximate the Fourier series is the error in truncating the Fourier series. If additional information is available about the rate of decay of the coefficients (or, equivalently, about the smoothness of $f$), then this bound can be made more specific. For example, if it is known that the coefficients satisfy $|c_k| \le |k|^{-p}$ for $|k| \ge N/2$ with $p > 1$, then the error can be bounded by $CN^{1-p}$, where $C$ is a constant independent of $N$ (problem 136). Perhaps more important than a precise error statement are the following qualitative observations: the effect of increasing $N$ with $A$ fixed is to lengthen the frequency interval $[-\Omega/2, \Omega/2]$ while the grid spacing $\Delta\omega$ remains constant. The result is that higher and higher frequency components are included in the representation of $f$. In the limit as $N \to \infty$, the IDFT approaches the Fourier series of $f$ at the grid points $x_n$ of $[-A/2, A/2]$. The subtleties of using $c_{\pm N/2}$ in reconstructing $f$ are examined in problem 137.

FIG. 6.11. Case Study 6. The IDFT can be used to reconstruct a function from a given set of coefficients $c_k$. The graphs show the reconstructions with (clockwise from upper left) $N = 16, 32, 64, 128$. The result is increasingly accurate samples of a real-valued function on an arbitrary interval.

Case Study 6: Fourier series synthesis by the IDFT. This case study is a numerical demonstration of the use of the IDFT to reconstruct a function from its Fourier coefficients. We begin with the set of coefficients

for $k = -N/2 : N/2$, and use the IDFT to construct the sequence $f_n$ for various values of $N$. We might anticipate the outcome before we even look at the output. Since the sequence of coefficients is conjugate even ($c_k = c^*_{-k}$), we can expect that the sequence $f_n$ is real. As $N$ increases, we will see more samples of that function on the same interval in the spatial domain. With this bit of forethought, let's look at the numerical results. Figure 6.11 shows the output of the IDFT for $N = 16, 32, 64, 128$ plotted on a fixed interval. As $N$ increases we see the graph of a function $f$ filled in with more resolution and more smoothness until a ramp function emerges. Notice that the IDFT takes the average value of the function at the endpoint discontinuities. The oscillations that occur near the endpoints are another instance of the Gibbs effect that reflects the nonuniform convergence of the Fourier series near discontinuities. The length of the interval on which $f$ is reconstructed is arbitrary, since the coefficient $c_k$ simply gives the weighting of the $k$th mode $e^{i2\pi kx/A}$ on the interval $[-A/2, A/2]$ for any $A$.
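An experiment in the spirit of Case Study 6 can be sketched in a few lines of Python with NumPy. Everything specific here is an assumption made for illustration: we take the ramp $f(x) = x$ on $[-A/2, A/2]$, whose Fourier coefficients are $c_0 = 0$ and $c_k = i(-1)^k A/(2\pi k)$, and we place the average of $c_{N/2}$ and $c_{-N/2}$ in the $k = N/2$ slot. The coefficient set actually used in the case study may differ.

```python
import numpy as np

def ramp_coefficient(k, A):
    """Fourier coefficient of f(x) = x on [-A/2, A/2] (assumed test case)."""
    k = np.asarray(k, dtype=float)
    sign = np.where(k % 2 == 0, 1.0, -1.0)           # (-1)^k
    safe_k = np.where(k == 0, 1.0, k)                # avoid dividing by zero at k = 0
    return np.where(k == 0, 0.0, 1j * sign * A / (2 * np.pi * safe_k))

def synthesize(N, A):
    """Return grid points x_n and the IDFT of the ramp coefficients, n = -N/2+1 : N/2."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    c = ramp_coefficient(idx, A).astype(complex)
    # Assumption: the k = N/2 entry carries the average of c_{N/2} and c_{-N/2}.
    c[-1] = 0.5 * (ramp_coefficient(N // 2, A) + ramp_coefficient(-N // 2, A))
    kernel = np.exp(2j * np.pi * np.outer(idx, idx) / N)
    return idx * A / N, kernel @ c

A = 2.0
results, interior_errors = {}, {}
for N in (16, 32, 64, 128):
    x, f = synthesize(N, A)
    results[N] = f
    interior = np.abs(x) <= A / 4                    # stay away from the endpoint jump
    interior_errors[N] = np.max(np.abs(f.real - x)[interior])
```

The output is real up to rounding because the coefficients are conjugate even, the last sample (at $x = A/2$) is the average value 0 of the ramp at its endpoint discontinuity, and the interior error shrinks as $N$ grows.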


We would be remiss to overlook the replication perspective in the case of Fourier series synthesis; it has a compelling message that can be found after a brief calculation. Beginning with the Fourier series representation for $f(x_n)$ given by (6.20), we can write

$$f(x_n) = \sum_{k=-\infty}^{\infty} c_k e^{i2\pi nk/N} = \sum_{k=-N/2+1}^{N/2}\left(\sum_{j=-\infty}^{\infty} c_{k+jN}\right) e^{i2\pi nk/N} = \mathcal{D}^{-1}\{\mathcal{R}_N\{c_k\}\}_n$$

for $n = -N/2+1 : N/2$. In a familiar maneuver, a single infinite sum has been rewritten as a double sum to introduce the replication operator. The outcome is easily explained: the $N$ samples of the function $f$ can be produced exactly by applying the IDFT, not to the set of Fourier coefficients, but to the replication of this set. This corroborates the earlier observation that if only the first $N$ coefficients are nonzero, then the synthesis is exact. Otherwise, error is introduced because the sequence of coefficients is truncated. The only way to reduce the difference between the sequence $c_k$ and the sequence $\mathcal{R}_N\{c_k\}$ is to increase $N$. However, a computational strategy is also suggested by this result. It should be possible to improve the approximation to $f(x_n)$ by using approximations to $\mathcal{R}_N\{c_k\}$ as input to the IDFT. This idea is tested in the following example.

Case Study 7: Improved Fourier synthesis. The set $c_0 = 1/2$, $c_k = \sin(\pi k/2)/(\pi k)$ is the set of Fourier coefficients of the square pulse of width one centered at the origin. The graphs of Figure 6.12 show various attempts to reconstruct the pulse from its set of Fourier coefficients. In all cases the number of grid points is $N = 64$. The observed improvements are due, not to increasing $N$, but to using approximations to $\mathcal{R}_{64}\{c_k\}$, the replication of the Fourier coefficients, as input to the IDFT. The first figure shows the output of the IDFT using only the coefficients $c_k$ for $k = -32 : 32$. Clearly, the approximations to $f_n$ are good near the center of the interval, but they suffer from errors near the discontinuities. The remaining figures show the results when the set $c_k$ is replicated one, two, and three times before being used as input to the IDFT. The improvement in the reconstructions after replication is significant, and comes at little additional expense: the replication is quite inexpensive and in all cases the length of the IDFT remains constant. This is a strategy which does not seem to have received much use or attention.
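The replication strategy of Case Study 7 can be tried out with the following sketch (Python with NumPy). It is an illustration under stated assumptions, not the authors' program: we interpret "replicated $J$ times" as the partial replication $\sum_{|j| \le J} c_{k+jN}$, we take the pulse to live on $[-1, 1]$ (so $A = 2$, consistent with $c_0 = 1/2$), and the function names are our own.

```python
import numpy as np

def pulse_coefficient(k):
    """c_0 = 1/2, c_k = sin(pi k / 2) / (pi k), using np.sinc(x) = sin(pi x) / (pi x)."""
    return 0.5 * np.sinc(np.asarray(k, dtype=float) / 2)

def replicated_synthesis(N, copies):
    """IDFT of sum_{|j| <= copies} c_{k + jN}, a partial replication of {c_k}."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    shifts = N * np.arange(-copies, copies + 1)
    c_rep = pulse_coefficient(idx[:, None] + shifts[None, :]).sum(axis=1)
    kernel = np.exp(2j * np.pi * np.outer(idx, idx) / N)
    # The coefficients are real and even, so the imaginary part is rounding noise.
    return idx, (kernel @ c_rep).real

N, A = 64, 2.0
idx = np.arange(-N // 2 + 1, N // 2 + 1)
x = idx * A / N
# Exact samples of the unit-width pulse, with the average value 1/2 at the jumps.
exact = np.where(np.abs(x) < 0.5, 1.0, 0.0)
exact[np.isclose(np.abs(x), 0.5)] = 0.5

errors = {}
for copies in range(4):                      # 0 = plain truncation, then 1, 2, 3 copies
    _, f = replicated_synthesis(N, copies)
    errors[copies] = np.max(np.abs(f - exact))
```

The length of the IDFT stays at $N = 64$ throughout; only the input sequence changes. The largest errors sit at the grid points adjacent to the discontinuities, and they drop markedly as copies are added.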

FIG. 6.12. Case Study 7. The reconstruction of a square pulse from its Fourier coefficients can be improved by using a replication strategy. The IDFT, using the coefficients $c_k$ for $k = -32 : 32$ as input, is shown in the upper left figure. If the coefficients $c_k$ are replicated once (upper right), twice (lower left), and three times (lower right), the IDFT reconstruction shows marked improvement. In all cases the IDFT produces $N = 64$ points.

Inverse Fourier Transforms

We now turn to the other inversion problem, that in which a function $\hat f$ is given and the task is to approximate the function $f$ that has $\hat f$ as a Fourier transform. Before $\hat f$ can be used as input for the IDFT, it must be sampled on an interval $[-\Omega/2, \Omega/2]$ with a grid spacing of $\Delta\omega = \Omega/N$ to produce a sequence that we will denote $\hat f_k = \hat f(\omega_k)$, where $k = -N/2+1 : N/2$. This sequence of length $N$ can now be used as input to the IDFT, and we have that

$$f_n = \sum_{k=-N/2+1}^{N/2} \hat f_k e^{i2\pi nk/N}$$

for $n = -N/2+1 : N/2$. (Recall that the notation $f_n$ has been introduced to denote the output of the IDFT.) The exponential terms in the rightmost sum will look like the integrand of the inverse Fourier transform if we write

$$f_n = \sum_{k=-N/2+1}^{N/2} \hat f_k e^{i2\pi nk/N} = \sum_{k=-N/2+1}^{N/2} \hat f(\omega_k) e^{i2\pi\omega_k x_n}$$

for $n = -N/2+1 : N/2$. All we have done is to abide by the reciprocity relations and let $x_n = n\Delta x$ and $\omega_k = k\Delta\omega$, where $\Delta x = A/N$ and $\Delta\omega = 1/A$.

To move onward from here and estimate the error in the IDFT as an approximation to the inverse Fourier transform, we need a version of the Poisson Summation Formula that "works in the other direction," one that will allow us to compare values of the IDFT to values of $f$. Fortunately the arguments of Section 6.4 leading to the Poisson Summation Formula can be carried out analogously to yield what we will call the Inverse Poisson Summation Formula. We will only state it and use it, leaving the derivation as a worthwhile exercise (problem 138).

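Inverse Poisson Summation Formula

$$\Delta\omega \sum_{k=-\infty}^{\infty} \hat f(\omega_k) e^{i2\pi\omega_k x} = \sum_{j=-\infty}^{\infty} f(x - jA) \qquad (6.21)$$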

This result, quite analogous to the (forward) Poisson Summation Formula (6.10), relates the $A$-periodic replication of the function $f$ to samples of $\hat f$ at the points $\omega_k = k\Delta\omega = k/A$. Not surprisingly, this result is indispensable in reaching conclusions about errors in the IDFT. We will now proceed with brevity, since the arguments parallel those made in previous sections. The first case of interest is that in which $f$ is band-limited. We will assume that the function $\hat f$ is sampled on the interval $[-\Omega/2, \Omega/2]$ where $\hat f(\omega) = 0$ for $|\omega| > \Omega/2$. The left-hand sum of (6.21) can be simplified and we have that

$$\Delta\omega \sum_{k=-N/2+1}^{N/2} \hat f(\omega_k) e^{i2\pi\omega_k x} = \sum_{j=-\infty}^{\infty} f(x - jA).$$

Evaluating this expression at the grid points $x = x_n$, we see that the sum on the left is the IDFT, $f_n$. Rearranging the terms of this expression we discover that

$$f(x_n) - \Delta\omega f_n = -\sum_{j \ne 0} f(x_n - jA)$$

for $n = -N/2+1 : N/2$. The error in using the IDFT to reconstruct a band-limited function involves the values of $f$ outside of the interval $[-A/2, A/2]$. This is a sampling error, but now in the frequency domain. In other words, the choice of a sampling interval $\Delta\omega$ induces an interval $[-A/2, A/2]$ in the spatial domain; those parts of $f$ that do not lie in this interval are folded back onto the interval by the IDFT. It is another case of aliasing. (There is also a pester here that arose in the case of compact support for the DFT. The result requires that $\hat f(\omega_{-N/2}) = \hat f(\omega_{N/2}) = 0$; otherwise another term proportional to $\Delta\omega$ survives in the Inverse Poisson Summation Formula, and the error is bounded by $C\Delta\omega$ or $C/N$.)

We will not pursue this case further except to say that if additional information about $f$ were known, for instance, the rate at which it decays for large $|x|$, then it would be possible to give more precise bounds on the error. Since $f$ is band-limited, nothing can be gained by increasing the extent of the frequency domain $[-\Omega/2, \Omega/2]$. The way to reduce this error is to increase $N$ with $\Omega$ fixed, which increases $A$, which in turn reduces the overlap in the values of $f(x - jA)$.

The other case that is easily handled by the Inverse Poisson Summation Formula is that in which $f$ has compact support, but is necessarily not band-limited. We will assume that $f(x) = 0$ for $|x| > A/2$, and that $\Delta\omega$ is chosen to satisfy $\Delta\omega \le 1/A$. Then the right-hand sum in (6.21), when evaluated at the grid points $x = x_n$, is reduced to a single term, and we have

$$\Delta\omega \sum_{k=-\infty}^{\infty} \hat f(\omega_k) e^{i2\pi\omega_k x_n} = f(x_n)$$

for $n = -N/2+1 : N/2$. Rearranging this expression to isolate the IDFT results in the error bound

$$|f(x_n) - \Delta\omega f_n| \le \Delta\omega {\sum_{|k| \ge N/2}}'' |\hat f(\omega_k)|$$

for $n = -N/2+1 : N/2$. The $\Sigma''$ means that the $\pm N/2$ terms of the sum are weighted by $1/2$. The expected endpoint pester appears: if $f(-A/2^+) \ne 0$ or $f(A/2^-) \ne 0$ then the result does not apply for $n = N/2$; for this single coefficient, the error is bounded by a constant times $\Delta\omega$, reflecting the discontinuity in $f$ at the endpoints. This situation is easily avoided by taking a slightly larger value of $A$.

In contrast to the previous case, we see that the error in using the IDFT to approximate a compactly supported function is a truncation error: the transform $\hat f$ must be restricted to the finite interval $[-\Omega/2, \Omega/2]$ before it is sampled, and its values outside of that interval contribute to the error. Fortunately, there is no aliasing error in this case, provided that the sampling interval $\Delta\omega$ is chosen sufficiently small ($\Delta\omega \le 1/A$). If additional information about $\hat f$ were known, it would be possible to give more specific bounds on the error. Recall that the Fourier transform of a compactly supported function is closely related to its Fourier coefficients ($Ac_k = \hat f(\omega_k)$). Therefore, if smoothness properties of $f$ are known, then Theorem 6.2 can be used to describe the decay of $c_k$ and $\hat f(\omega_k)$.

The qualitative lesson is most important in this case. Since the error is due to the truncation of $\hat f$ to the interval $[-\Omega/2, \Omega/2]$, increasing $\Omega$ (by increasing $N$ with $A$ and $\Delta\omega$ fixed) will decrease the error. Increasing $\Omega$ has the effect of decreasing $\Delta x$, which places more grid points in the fixed interval $[-A/2, A/2]$. There is no gain in increasing $A$ since $f$ is compactly supported on $[-A/2, A/2]$ and does not need to be reconstructed on a larger interval. In summary, increasing $\Omega$ by increasing $N$ with $\Delta\omega$ fixed has the effect of increasing the resolution on the interval $[-A/2, A/2]$ on which $f$ is represented. Thus, in this limit, we see that the IDFT approaches the Fourier series representation for $f$ at the grid points $x_n$. This conclusion is mapped out in Figure 6.13, which will be discussed shortly.
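The truncation error in the compactly supported case can be watched directly. The sketch below (Python with NumPy) uses a test pair of our own choosing, not one taken from the text: the hat function $f(x) = \max(1 - |x|, 0)$, supported on $[-1, 1]$, whose Fourier transform is $(\sin(\pi\omega)/(\pi\omega))^2$. With $A = 2$ and $\Delta\omega = 1/A$ held fixed, increasing $N$ lengthens $[-\Omega/2, \Omega/2]$, and the scaled IDFT $\Delta\omega f_n$ should approach $f(x_n)$.

```python
import numpy as np

def idft_of_transform(f_hat, A, N):
    """Return x_n and delta_omega * f_n, where f_n is the IDFT of f_hat(omega_k)."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    d_omega = 1.0 / A                        # reciprocity relation: delta_omega = 1 / A
    samples = f_hat(idx * d_omega)
    kernel = np.exp(2j * np.pi * np.outer(idx, idx) / N)
    return idx * A / N, d_omega * (kernel @ samples)

# Assumed test pair: hat function and its transform, np.sinc(w) = sin(pi w) / (pi w).
hat = lambda x: np.maximum(1.0 - np.abs(x), 0.0)
hat_transform = lambda w: np.sinc(w) ** 2

A = 2.0                                      # f vanishes at x = -A/2 and x = A/2
errors = {}
for N in (16, 32, 64, 128):
    x, approx = idft_of_transform(hat_transform, A, N)
    errors[N] = np.max(np.abs(approx - hat(x)))
```

Since $f$ vanishes at $\pm A/2$, the endpoint pester does not arise here, and the maximum error is simply the neglected tail of $\Delta\omega \sum |\hat f(\omega_k)|$, which shrinks as $\Omega = N/A$ grows.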

There is one final case. As always, it is the most general case and perhaps the one that occurs most frequently. If $f$ (or, equivalently, $\hat f$) is known to be neither compactly supported nor band-limited, then the previous two cases can be taken together both qualitatively and quantitatively. Rather than try to be more precise, we will appeal to a case study.

Case Study 8: Errors in the IDFT. In this case study we will investigate the errors in the IDFT by considering a function $\hat f$ whose IDFT and inverse Fourier transform can be computed explicitly. Consider the function $\hat f$ and its samples at the $N$ points $\omega_k = k\Delta\omega$ on an interval $[-\Omega/2, \Omega/2]$ given by

$$\hat f(\omega) = e^{-a|\omega|} \quad \text{and} \quad \hat f_k = e^{-a|\omega_k|}$$

for $k = -N/2+1 : N/2$, with $a > 0$. It is not too difficult to show that the inverse Fourier transform of $\hat f$ is

$$f(x) = \frac{2a}{a^2 + 4\pi^2 x^2}.$$

We will determine how the IDFT approximates and approaches $f$ in various limits. As mentioned, it is possible to compute the IDFT $f_n$ of the sequence $\hat f_k$ explicitly. It can be gleaned from The Table of DFTs in the Appendix to be

We have used, among other facts, $x_n = n\Delta x$ to write $f_n$ in this form. At this point $f_n$ appears to bear little resemblance to $f(x_n)$. Recall from the Inverse Poisson Summation Formula that the quantity $\Delta\omega f_n$ approximates $f(x_n)$; therefore, we will look at $\Delta\omega f_n$ and see how it behaves in two different limits.

FIG. 6.13. The IDFT approaches $f(x_n)$ in various limits. With the sampling rate $\Delta\omega$ fixed while $\Omega$ and $N$ increase, $\Delta\omega f_n$ approaches $f$ on $[-A/2, A/2]$ up to relative errors that decrease as $(\Delta\omega)^p$. (This limit also implies that $A$ is fixed while $\Delta x \to 0$.) Alternatively, with $\Omega$ fixed while $\Delta\omega$ decreases (by letting $N$ increase), the sequence $\Delta\omega f_n$ approaches $f(x_n)$ on $(-\infty, \infty)$ up to errors (denoted $e(\Omega)$) that decrease with increasing $\Omega$. Implicit in this limit is that $\Delta x$ is fixed and $A \to \infty$. Letting $\Delta\omega \to 0$ in the first case or $\Omega \to \infty$ in the second case allows $\Delta\omega f_n$ to approach $f(x_n)$ on $(-\infty, \infty)$.

First, consider the effect of letting $\Omega$ and $N$ increase while holding $\Delta\omega$ fixed (the reciprocity relations tell us that $A$ is fixed and $\Delta x \to 0$ in this limit). This means that the range of frequencies used to reconstruct $f$ increases, and the grid spacing in the spatial domain $\Delta x$ decreases. At the same time, the interval of reconstruction in the spatial domain $[-A/2, A/2]$ as well as the grid point of interest $x_n$ remain fixed. If we formally let $\Omega \to \infty$ in (6.22) with $\Delta\omega$ fixed, we find that

We have denoted this limit $g(x_n)$; it is a periodic function in $x_n$ with a period of $1/\Delta\omega = A$.

We now need to relate this periodic function $g$ to the (nonperiodic) function $f$ which is the exact inverse Fourier transform of $\hat f$. This can be done if we use Taylor series to expand $g$ in powers of $\Delta\omega$. If we assume that $\Delta\omega$ is small, we discover that (problem 133)

where $-A/2 < x_n < A/2$ and $c$ is a constant. In other words, letting $N$ and $\Omega$ increase in the IDFT creates samples of a periodic function $g$ that differs from $f$ on $[-A/2, A/2]$ by an amount that decreases as $\Delta\omega$ decreases. Said differently, in this limit, the IDFT approaches samples of the Fourier series of $f$ on the interval $[-A/2, A/2]$. If we now let $\Delta\omega$ approach zero (implying that $A \to \infty$), then the samples $g(x_n)$ approach $f(x_n)$ on the interval $(-\infty, \infty)$. This process can be summarized by writing

$$\lim_{\Delta\omega \to 0}\left(\lim_{\Omega \to \infty} \Delta\omega f_n\right) = f(x_n)$$

where $\Delta\omega$ is held fixed in the inner limit.

Now consider the alternate limit. Imagine that the range of frequencies $[-\Omega/2, \Omega/2]$ is fixed, as is the grid spacing $\Delta x$ and the point of interest $x_n$. The effect of letting $N$ increase is to produce a finer grid spacing $\Delta\omega$ and a larger interval of reconstruction $[-A/2, A/2]$ in the spatial domain. As before, we may start with the analytical expression (6.22) and let $\Delta\omega \to 0$ with $\Omega$ fixed. A copious use of Taylor series leads us to

$$\lim_{\Delta\omega \to 0} \Delta\omega f_n = f(x_n)\left(1 - \cos(\pi n)e^{-a\Omega/2}\right). \qquad (6.24)$$

We see that the effect of letting $\Delta\omega$ decrease to zero while holding $\Omega$ fixed is to produce a sequence that differs from $f(x_n)$ by an amount which decreases (in this case) exponentially with $\Omega$. Subsequently letting $\Omega$ increase, which amounts to including higher and higher frequencies in the representation of $f$, recovers the sampled function $f$. This two-limit process may be written

$$\lim_{\Omega \to \infty}\left(\lim_{\Delta\omega \to 0} \Delta\omega f_n\right) = f(x_n)$$

for $-\infty < x_n < \infty$.

Figure 6.14 illustrates these limit paths and the manner in which $\Delta\omega f_n$ approaches $f(x_n)$ as $A$, $\Omega$, and $N$ change (when $a = 1/2$). First, consider the three error plots in the left column. If $A = 4$ is held fixed while $N$ and $\Omega$ are increased (abiding by the reciprocity relation $A\Omega = N$), then the IDFT produces increasingly accurate approximations to $f$ on the interval $[-2, 2]$. This sequence of plots follows the limit path leading to (6.23) in which, as $\Omega \to \infty$, the values of $f(x_n)$ are produced up to errors proportional to $(\Delta\omega)^2$. The errors are also proportional to $f(x_n)$ itself, which has a maximum at $x_n = 0$, explaining the maximum error at $x_n = 0$.

In the right column of Figure 6.14, we fix $\Omega = 4$ and increase $N$; this has the effect of decreasing $\Delta\omega$ and increasing the length of the interval $[-A/2, A/2]$ on which the samples $f(x_n)$ are generated. This sequence of plots follows the limit path leading to (6.24). Note that the errors (most noticeably in the center of the interval near $x_n = 0$) do not decrease with increasing values of $N$. A check of numerical values shows that the sequence $\Delta\omega f_n$ has reached the limit $f(x_n)(1 - \cos(\pi n)e^{-a\Omega/2})$, as predicted by (6.24); taking larger values of $N$ cannot improve the accuracy of this approximation. Further improvements can be made only by increasing $\Omega$. As before, since the errors are also proportional to $f(x_n)$ itself, which has a maximum at $x_n = 0$, the maximum error occurs at $x_n = 0$.

FIG. 6.14. Case Study 8. The above graphs demonstrate the limit paths of the IDFT described in Case Study 8. The goal is to approximate the inverse Fourier transform of $\hat f(\omega) = e^{-|\omega|/2}$ by applying the IDFT with various choices of $\Omega$ and $N$. In all cases, the absolute error $|f(x_n) - \Delta\omega f_n|$ is given at the sample points. In the left column $N$ and $\Omega$ are increased ($N = 16, 32, 64$ and $\Omega = 4, 8, 16$) so that $A = N/\Omega$ remains constant. In the right column, $N$ is increased ($N = 16, 32, 64$) and $\Omega = 4$ is held constant, which produces larger intervals of reconstruction in the spatial domain and decreasing grid spacings in the frequency domain ($\Delta\omega = 1/4, 1/8, 1/16$).
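The two limit paths of Case Study 8 can be retraced numerically. The sketch below (Python with NumPy) is our own illustration: it takes $\hat f(\omega) = e^{-a|\omega|}$ with $a = 1/2$ as in the caption of Figure 6.14, uses the standard inverse transform $f(x) = 2a/(a^2 + 4\pi^2x^2)$ of that function, and evaluates the IDFT by direct summation rather than from the closed form (6.22). The function names are assumptions.

```python
import numpy as np

def scaled_idft(a, omega_span, N):
    """Return x_n and delta_omega * f_n for samples exp(-a |omega_k|) on [-Omega/2, Omega/2]."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)
    d_omega = omega_span / N
    A = 1.0 / d_omega                        # reciprocity relation A * Omega = N
    samples = np.exp(-a * np.abs(idx * d_omega))
    kernel = np.exp(2j * np.pi * np.outer(idx, idx) / N)
    # The samples are real and even in k, so the sum is real up to rounding.
    return idx * A / N, d_omega * (kernel @ samples).real

def exact_inverse(a, x):
    """Inverse Fourier transform of exp(-a |omega|)."""
    return 2 * a / (a**2 + 4 * np.pi**2 * x**2)

a = 0.5
# Left column of Figure 6.14: A = 4 fixed, N and Omega increase together.
left = {}
for N, omega_span in ((16, 4.0), (32, 8.0), (64, 16.0)):
    x, approx = scaled_idft(a, omega_span, N)
    left[N] = np.max(np.abs(approx - exact_inverse(a, x)))

# Right column: Omega = 4 fixed, N increases. The error at x = 0 stalls near
# f(0) * exp(-a * Omega / 2), the truncation error implied by (6.24).
right = {}
for N in (16, 32, 64):
    x, approx = scaled_idft(a, 4.0, N)
    right[N] = np.abs(approx - exact_inverse(a, x))[x == 0.0][0]
stall = exact_inverse(a, 0.0) * np.exp(-a * 4.0 / 2)
```

Along the first path the maximum error falls steadily; along the second it settles at the value `stall` and no amount of additional sampling in frequency removes it.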

The limit paths of Case Study 8 are summarized in Figure 6.13. Like the DFT map of Figure 6.10, this figure shows the relationships between the IDFT and various forms of the function $f$ which is to be reconstructed. If the IDFT is computed with the sampling rate $\Delta\omega$ fixed while letting $N$ and $\Omega$ increase, the result is better approximations to the Fourier series of $f$ on the fixed interval $[-A/2, A/2]$. Subsequently letting $\Delta\omega \to 0$ (and $A \to \infty$) will produce increasingly accurate approximations to $f(x_n)$ on $(-\infty, \infty)$. On the other hand, if the IDFT is computed with $\Omega$ fixed, but with $\Delta\omega$ decreasing (by letting $N$ increase), the result is values of $f(x_n)$ up to errors that decrease as $\Omega$ increases. Subsequently letting $\Omega$ increase will reduce these errors, and the IDFT approaches $f(x_n)$ on $(-\infty, \infty)$.

We will close in a predictable manner by appealing to the replication perspective one last time. All of the discussion of this section could have centered around the replication form of the Inverse Poisson Summation Formula, and the same conclusions would have followed. We leave it as a worthy exercise (problem 139) to show that the Inverse Poisson Summation Formula (6.21) can be expressed using replication operators in the form

This says that in the absence of special cases such as band-limiting or compact support, the samples of the replication of $f$ can be obtained by applying the IDFT to the replication of $\hat f$. In the presence of compact support or band-limits, this result reduces to special cases in which certain errors vanish.

6.9. DFT Interpolation; Mean Square Error

In this section we take up one final issue related to errors in the DFT, but the problem is posed in a much different setting. In previous sections of this chapter we concerned ourselves with the question of how well the DFT approximates Fourier coefficients or Fourier transforms of a given function $f$. In this section we discuss the question of how the DFT can be used to approximate a given function itself. In particular, we will return to the problem of interpolation with trigonometric polynomials. This is an important question in its own right, but it also turns out to be rather easy to handle, given everything that we have learned in this chapter.

As before we will work on an interval $[-A/2, A/2]$ with $N$ equally spaced grid points $x_n = nA/N$, where $n = -N/2+1 : N/2$. We will also be given a function $f$ defined on that interval with known smoothness properties. We can now state the interpolation problem. Given the $N$ values of the function $f$ at the grid points $x_n$, find the coefficients $F_k$ such that the trigonometric polynomial

$$\phi(x) = \sum_{k=-N/2+1}^{N/2} F_k e^{i2\pi kx/A}$$

agrees with $f$ at the grid points; that is, $\phi(x_n) = f(x_n)$ for $n = -N/2+1 : N/2$. The term trigonometric polynomial may be confusing; it refers to the fact that $\phi$ consists of powers of (or is a polynomial in) $e^{i2\pi x/A}$.

The solution to this interpolation problem was actually carried out in Chapter 2 as a means of deriving the DFT. To summarize, we start with the interpolation conditions that

$$f(x_n) = \phi(x_n) = \sum_{k=-N/2+1}^{N/2} F_k e^{i2\pi nk/N}$$

for $n = -N/2+1 : N/2$. We will let $f_n = f(x_n)$ and also note that $f_{N/2}$ should be defined as the average of the values of $f$ at the endpoints, $x = \pm A/2$ (AVED). Using the discrete orthogonality of the exponential functions allows each of the coefficients $F_k$ to be isolated, and we find that

$$F_k = \frac{1}{N}\sum_{n=-N/2+1}^{N/2} f_n e^{-i2\pi nk/N}$$

for $k = -N/2+1 : N/2$. In other words, $F_k = \mathcal{D}\{f_n\}_k$, and we see that the coefficients of the interpolating polynomial are given by the DFT of the sequence $f_n$.

We now need to ask how well this function $\phi$ approximates $f$ on the entire interval $[-A/2, A/2]$. Up until now we have discussed the error in the DFT or IDFT at particular points. When we compare two functions $f$ and $\phi$ at all points of an interval, we need a new measure of error. The tools that facilitate this new measure are norms and inner products. A short review is worthwhile.


It turns out that a very convenient way to measure the difference between two functions on an interval is the mean square error. It is the integrated square of the difference between the two functions, and for the interval $[-A/2, A/2]$ is given by

$$\|f - \phi\|^2 = \int_{-A/2}^{A/2} |f(x) - \phi(x)|^2 \, dx.$$

The norm $\|\cdot\|$ that we have defined is called the mean square norm or often simply the $L^2$-norm. Recall that the inner product of two functions $f$ and $\phi$ on the interval $[-A/2, A/2]$ is defined by

$$\langle f, \phi \rangle = \int_{-A/2}^{A/2} f(x)\phi^*(x) \, dx,$$

where $\phi^*$ is the complex conjugate of $\phi$. Here is the important connection between inner products and mean square norms: it is easy to check that $\|f\|^2 = \langle f, f \rangle$. Therefore,

$$\|f - \phi\|^2 = \langle f - \phi, f - \phi \rangle.$$

There is one other fundamental property associated with inner products, and that is orthogonality. We say that two functions $g$ and $h$ are orthogonal on an interval if their inner product on that interval $\langle g, h \rangle$ vanishes. The important orthogonality property that we will need in this section concerns the trigonometric polynomials. One straightforward integral (see problem 120 and Chapter 2, problem 22) is all that is needed to show that

$$\langle e^{i2\pi jx/A}, e^{i2\pi kx/A} \rangle = \int_{-A/2}^{A/2} e^{i2\pi (j-k)x/A} \, dx = \begin{cases} A & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases}$$

This property of the complex exponentials is entirely analogous to the discrete orthogonality that lies at the heart of the DFT.

We are now ready to state and prove a result about the error in trigonometric interpolation [89].

THEOREM 6.7. ERROR IN TRIGONOMETRIC INTERPOLATION. Let the $A$-periodic extension of $f$ have $(p-1)$ continuous derivatives for $p \ge 1$ and assume that $f^{(p)}$ is bounded and piecewise monotonic on $[-A/2, A/2]$. Assume that $\phi$ is the trigonometric polynomial that interpolates $f$ at the points $x_n = nA/N$, where $n = -N/2+1 : N/2$. Then the mean square error in $\phi$ as an interpolant of $f$ satisfies

where $C$ is a constant independent of $N$.

The proof of this theorem is a pleasing collection of results and ideas that have already appeared in this chapter; it merits a succinct presentation with a few details left to the reader.

Proof: First notice that we have imposed the same conditions on $f$ that were used earlier in the chapter. This will allow us to use Theorem 6.2 to estimate the rate of decay of the Fourier coefficients of $f$; specifically, we know that $|c_k| \le C'/|k|^{p+1}$ for some constant $C'$. The proof is quite physical since it relies on the splitting of $f$ into its low and high frequency parts. We will let $f = f_L + f_H$ where

$$f_L(x) = \sum_{k=-N/2+1}^{N/2} c_k e^{i2\pi kx/A} \quad \text{and} \quad f_H(x) = \sum_{k \le -N/2} c_k e^{i2\pi kx/A} + \sum_{k > N/2} c_k e^{i2\pi kx/A}.$$

Notice that $f_L$ consists of the low frequency components that can be resolved by the DFT, whereas $f_H$ consists of the remaining high frequency components. Since $f_L$ and $f_H$ are themselves functions we can form their interpolating polynomials on the same $N$ grid points, which we will call $\phi_L$ and $\phi_H$, respectively. This means that

$$\phi_L(x_n) = f_L(x_n) \quad \text{and} \quad \phi_H(x_n) = f_H(x_n) \quad \text{for } n = -N/2+1 : N/2.$$

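Four observations will now be needed to complete the proof, and they should each be verified:

1. The interpolant of $f$ on the grid points $x_n$ is $\phi = \phi_L + \phi_H$.

2. $\phi_L = f_L$ (since both consist of the same $N$ modes with the same coefficients).

3. $f_H$ and $\phi_H$ are orthogonal on $[-A/2, A/2]$ since they share no common modes. This is a beautiful instance of aliasing: $f_H$ consists entirely of high frequency modes which are aliased onto the lower frequency modes of the approximating function $\phi_H$. This is a key element in the proof that follows.

4. The following Parseval relations hold:

$$\|f_H\|^2 = A\left(\sum_{k \le -N/2} |c_k|^2 + \sum_{k > N/2} |c_k|^2\right) \quad \text{and} \quad \|\phi_H\|^2 = A\sum_{k=-N/2+1}^{N/2} |F_k^H|^2$$

(by the orthogonality of the complex exponentials $e^{i2\pi kx/A}$ on $[-A/2, A/2]$), where $F_k^H$ denotes the DFT of the samples of $f_H$.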

We may now proceed. We start by writing the mean square difference between $f$ and $\phi$ and simplifying it a bit:

$$\|f - \phi\|^2 = \|(f_L + f_H) - (\phi_L + \phi_H)\|^2 = \|f_H - \phi_H\|^2 = \|f_H\|^2 + \|\phi_H\|^2.$$

Observations 1 and 2 have been used; and the last equality follows from the orthogonality of $f_H$ and $\phi_H$ (observation 3 above). The remaining work is now clear: we must find bounds for $\|f_H\|^2$ and $\|\phi_H\|^2$. Let's begin with $\|f_H\|^2$. Using observation 4 above and the fact that the Fourier coefficients $c_k$ satisfy $|c_k| \le C'|k|^{-p-1}$, we have that

The last series converges provided that $p > -1/2$, which for our purposes means $p \ge 0$. Therefore, we can conclude that, for some constant $C_1$,


The second term $\|\phi_H\|^2$ is a bit more recalcitrant. The sequence $F_k^H$ is the DFT of the function $f_H$, which itself has Fourier coefficients $c_k$ that are zero for $k = -N/2+1 : N/2$. It follows from the Discrete Poisson Summation Formula (one last time) that

$$F_k^H = \sum_{j \ne 0} c_{k+jN}$$

for $k = -N/2+1 : N/2$. We may combine this fact with observation 4 above to conclude that

Once again, we use the decay rate of the Fourier coefficients $|c_k| \le C'|k|^{-p-1}$, which leads to

where the index in the inner sum runs over all integers except $j = 0$. Now, for any sum, it is true that $\sum |a_n|^2 \le \left(\sum |a_n|\right)^2$. Applying this fact to the outer sum in the previous expression, we have that

Those who take a moment to write out a few terms of these two sums will discover that they can be condensed into a single sum of the form

and now we are just about there. As we did earlier in the proof, this series can be bounded by the following maneuver:

This series converges provided $p > 0$, which for our purposes means $p \ge 1$. This allows us to claim that

where $C_2$ is independent of $N$. Combining the bounds on $\|f_H\|^2$ and $\|\phi_H\|^2$, together with the stronger of the two conditions on $p$, we have that

where $C$ is a constant independent of $N$.

Having worked this hard, let's try to wrest some insight from this proof. The mean square error $\|f - \phi\|^2$ consists of two parts: one from $f_H$ and one from $\phi_H$. Notice that $f_H$ consists of high frequency components that are not resolved by the interpolating polynomial $\phi$. Therefore, the contribution to the error from $f_H$ represents the components of $f$ that are lost when only $N$ modes are used in the approximating function $\phi$. This is simply a truncation error. The second contribution to the error is more interesting, but it should be no stranger. The function $\phi_H$ is the interpolating polynomial (which uses low frequency modes) for the high frequency modes of $f$. It might seem that $\phi_H$ cannot possibly resolve these high frequency modes of $f$ and should be zero. Indeed, this would be the case were it not for aliasing. Because $f$ is sampled, its high frequency modes are disguised as low frequency modes that can be detected by $\phi_H$. These modes do not belong with the low frequency coefficients, and hence they contribute to the error. Thus, we see that there are two sources of error in trigonometric interpolation: truncation and sampling. This is precisely what we have observed throughout this chapter.

Case Study 9: Trigonometric interpolation. Graphical demonstrations of trigonometric interpolation can be made easily and convincingly. We will use a rather arbitrary polynomial

that has no special symmetries on the interval $[-1, 1]$. Following the procedure outlined above, the DFT can be used to compute the coefficients of the polynomial $\phi$ that interpolates $f$ at $N$ equally spaced points on $[-1, 1]$. Figure 6.15 shows the results using $N = 4, 8, 16, 32$ points. The plots show the original function $f$, the interpolating polynomial $\phi$, and the interpolating points $x_n$. First note that each interpolating polynomial does the required job: it passes through the interpolating points. Each interpolating polynomial is two-periodic, but since the periodic extension of $f$ has a discontinuity at $x = \pm 1$, the interpolating polynomial takes the average value $\frac{1}{2}(f(-1) + f(1))$ at those points. This discontinuity in $f$ clearly degrades the accuracy of the approximations near the endpoints. In the interior of the interval, the approximations are much better, and they improve with increasing $N$. (Theorem 6.7 does not apply directly to this case, since it requires at least continuity of $f$. However, it is reasonable to suspect that mean square errors in this case of a piecewise continuous function decrease as $N^{-1}$.)
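The interpolation procedure itself takes only a few lines to try. The following sketch (Python with NumPy) is an illustration under assumptions of our own: the polynomial $x^3 + x^2$ is a stand-in test function, not the one used in Case Study 9, and the function names are hypothetical. It samples $f$ with the endpoint average (AVED), computes $F_k$ by the DFT, and evaluates $\phi$ both at the grid points and on a finer set of interior points.

```python
import numpy as np

def interpolation_coefficients(f, A, N):
    """DFT coefficients F_k of the trigonometric interpolant of f on [-A/2, A/2]."""
    idx = np.arange(-N // 2 + 1, N // 2 + 1)        # n and k both run over -N/2+1 : N/2
    x = idx * A / N
    samples = np.asarray(f(x), dtype=complex)
    samples[-1] = 0.5 * (f(-A / 2) + f(A / 2))      # AVED: average the endpoint values
    F = np.exp(-2j * np.pi * np.outer(idx, idx) / N) @ samples / N
    return idx, x, samples, F

def interpolant(F, idx, A, x):
    """Evaluate phi(x) = sum_k F_k exp(i 2 pi k x / A) at the points x."""
    return np.exp(2j * np.pi * np.outer(np.atleast_1d(x), idx) / A) @ F

# Hypothetical test function with no special symmetry on [-1, 1].
f = lambda x: x**3 + x**2
A = 2.0

grid_mismatch, interior_error = {}, {}
x_fine = np.linspace(-0.5, 0.5, 201)
for N in (4, 8, 16, 32):
    idx, x, samples, F = interpolation_coefficients(f, A, N)
    grid_mismatch[N] = np.max(np.abs(interpolant(F, idx, A, x) - samples))
    # For real data the k = N/2 mode makes phi complex between grid points;
    # the real part still interpolates and is what one would plot.
    interior_error[N] = np.max(np.abs(interpolant(F, idx, A, x_fine).real - f(x_fine)))
```

The entries of `grid_mismatch` should be at rounding level for every $N$, confirming that $\phi$ passes through the (averaged) samples, while `interior_error` records how closely $\phi$ follows $f$ away from the endpoint discontinuity of the periodic extension.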

6.10. Notes and References

The subject of errors in the DFT is treated in a multitude of ways in a bewildering assortment of books and papers. The goal of this chapter is to collect and organize not only the results, but the frameworks in which those results are presented. Our conclusion in writing this chapter is that there are three frameworks that have emerged in the literature for analyzing DFT errors, and we have attempted to represent all three of them. In summary, they are

Poisson Summation Formula,

replication operators,

graphical presentation.

This chapter relies heavily on the Poisson Summation Formula, which is certainly the cornerstone of one framework in which DFT errors can be analyzed. The history of this remarkable result is rather difficult to trace. Most recent treatments of Fourier transforms cite the result and are consistent about its name. It appears in the 1934 book of Paley and Wiener [110] and the well-known 1924 testament of Courant and Hilbert [43]. It is the subject of two papers in 1928 by E. H. Linfoot [94] and L. J. Mordell [101], and Mordell remarks that "Poisson's formula has been ignored in the usual text-books." We were unable to trace the result back to the original work of Poisson, although it seems likely that it must appear in his treatise Théorie mathématique de la chaleur of 1835, in which solutions to the heat equation in many different (finite and infinite) domains and geometries are proposed.

FIG. 6.15. Case Study 9. The use of trigonometric interpolation is illustrated in these four figures. The function f ( x ) = (x+l)x^(x + 2) is interpolated at $N$ equally spaced points of the interval $[-1,1]$ where (reading left to right, top to bottom) $N = 4, 8, 16, 32$. The original function (dashed line), the interpolation function (solid line), and the interpolating points (*) are shown in each figure.

Not entirely unrelated to the Poisson Summation Formula, but sufficiently distinct to call it a different framework, is the replication perspective. This approach has some appealing notational advantages and allows many results to be stated quite succinctly. It appears that this approach received its first expression in the literature in the work of Cooley, Lewis, and Welch in the late 1960s [40], [41]. The third approach to understanding DFT errors is the graphical approach, which lacks rigor and does not lead to concise error bounds, but certainly has acclaimed visual appeal. It seems that this approach was at least popularized, and perhaps created, by Brigham in his well-known books [20], [21].

6.11. Problems

116. Aliasing. Consider a grid with $N = 6$ DFT points (a total of seven points including the endpoints). Find and sketch the imaginary part (the sine mode) of the $k = 2$ mode on this grid. Find and sketch the imaginary part of the $k = 8$ mode on this grid. Show that these two modes have identical values at the grid points. Find the frequency of all of the modes that are aliases for the $k = 2$ mode.

117. Aliasing. Show that in general the modes $e^{i2\pi kx/A}$ and $e^{i2\pi(k+pN)x/A}$ have the same values at the grid points $x_n = nA/N$, where $k$ and $p$ are any integers. Does the same conclusion hold for the real modes $\cos(2\pi kx/A)$ and $\sin(2\pi kx/A)$?

118. Grid parameters. Consider the following functions on the designated intervals. In each case, find the highest frequency mode that appears in the function (in units of periods per unit length), the maximum grid spacing $\Delta x$ that fully resolves the function without aliasing, and the minimum number of grid points that fully resolves the function without aliasing.

119. Modes on coarser grids. Show that if the $k$th mode on a grid with $2N$ points is viewed on a grid with $N$ points, then it still appears as the $k$th mode on this coarser grid, provided that $|k| < N/2$.

120. Inner products, orthogonality, and aliasing. The continuous inner product of two functions $f$ and $g$ on the interval $[-A/2, A/2]$ was defined in the text as

An analogous discrete inner product can be defined for two sequences $f_n$ and $g_n$.

Two functions $f$ and $g$ are orthogonal on $[-A/2, A/2]$ if $\langle f, g\rangle = 0$, while two sequences $f_n$ and $g_n$ are orthogonal on $N$ points if $\langle f, g\rangle_N = 0$.

(a) Verify the oft-used fact that

(b) Verify the equally oft-used fact that

(c) Argue the following claim: Aliasing can be attributed to the fact that the orthogonality of the functions is determined by $\delta(k)$ (the ordinary Kronecker delta), while orthogonality of the sequences is determined by $\delta_N(k)$ (the modular Kronecker delta).


where It is given by



121. DFT approximations to Fourier coefficients. Consider each of the following functions and their periodic extension outside of the indicated interval.

In each case carry out the following analysis:

(a) Make a sketch of $f$ and its periodic extension.

(b) Find the sequence $f_n$ that should be used as input to the DFT to approximate the Fourier coefficients of $f$ (with special attention to the endpoints).

(c) Assess the smoothness of $f$, particularly at the endpoints.

(d) Use Theorem 6.3 to determine how the error in the DFT as an approximation to the Fourier coefficients should decrease with $N$.

(e) Find the DFTs, $F_k$, of the sequences $f_n$, analytically (The Table of DFTs in the Appendix can be used in nearly all cases).

(f) Compute the DFT for various values of $N$ and comment on the numerical results.

122. DFT of a band-limited periodic sequence. The function considered in Case Study 1 offers a few more intriguing lessons that shed more light on the Discrete Poisson Summation Formula. Recall that the function consisted of the first 17 Fourier modes all equally weighted. When sampled at $N$ points the resulting input sequence is

for $n = -N/2 + 1 : N/2$. Evaluate the DFT coefficients of this sequence for $N = 64, 32, 16, 8$, and verify the conclusions of Case Study 1. In particular, note that if $N > 32$, then the DFT coefficients are exact ($F_k = c_k$). With $N = 32$, note that the DFT is exact except for the coefficient $F_{16} = 2$. Explain this result in light of the Discrete Poisson Summation Formula. Finally, describe the errors that arise in all of the DFT coefficients when $N < 32$.

123. Fourier coefficients on different intervals. Assume that $f$ is $A$-periodic with Fourier coefficients $c_k$ on the interval $[-A/2, A/2]$. If $p$ is a positive integer, show that the coefficients $c'_k$ of $f$ on the interval $[-pA/2, pA/2]$ are given by


124. DFT coefficients on different intervals. Assume that $f$ is $A$-periodic and is sampled with $N$ equally spaced points on the interval $[-A/2, A/2]$. Let the resulting DFT coefficients be $F_k$. If $p$ is a positive integer, and $f$ is sampled on the interval $[-pA/2, pA/2]$ with $pN$ points, show that the resulting DFT coefficients are given by

125. Sampling on nonmultiples of a period. Consider the single mode $e^{i2\pi k_0 x/A}$, where $k_0$ is an integer, and assume that it is expanded in a Fourier series on the interval $[-pA/2, pA/2]$, where $p > 1$ is not an integer.

(a) Show that the Fourier coefficients are given by

(b) Graph the coefficients for fixed $k_0$ and $|k| < 5k_0$ for several values of $p$ with $1 < p < 2$. Notice the behavior of the coefficients as $p \to 1$ and $p \to 2$ and note the presence and size of the sidelobes (as discussed in the text).

(c) Show analytically that as $p \to 1$, $c'_k$ approaches the expected value $c_k = \delta(k - k_0)$.

(d) Carry out the same analysis on the $pN$-point DFT of the samples of $f$ on the interval $[-pA/2, pA/2]$. Show that

where $k = -pN/2 + 1 : pN/2$ (The Table of DFTs may be helpful).

(e) Show that as $p \to 1$ the coefficients $F'_k$ approach their expected values $F_k = \delta(k - k_0)$.

126. Alternate Discrete Poisson Summation Formula. The Discrete Poisson Summation Formula takes slightly different forms depending on whether the index set is centered or whether $N$ is even or odd.

(a) Derive the Discrete Poisson Summation Formula for the DFT defined on the indices $n, k = 0 : N - 1$. Consider the cases in which $N$ is even and odd.

(b) Find the relationship (analogous to (6.4)) that relates the DFT coefficients to the Fourier coefficients. Consider the cases where $f$ is band-limited and not band-limited. Consider $N$ both even and odd.

127. Replication operator. Make sketches of the following replications of the sequence $c_n$ and the function $f$:


128. Inverse replication. Give an example to show a function that cannot be recovered uniquely from its replication.

129. DFT approximation to Fourier transforms. Consider the following functions with compact support on the indicated intervals (and value zero outside of the indicated intervals).

In each case carry out the following analysis:

(a) Make a sketch of $f$.

(b) Find the sequence $f_n$ that should be used as input to the DFT to approximate the Fourier transform of $f$ (with special consideration for the endpoints).

(c) Assess the smoothness of $f$ (noting endpoints).

(d) Use Theorem 6.5 to determine how the error in the DFT as an approximation to the Fourier transform should decrease with $N$.

(e) At what frequencies will the DFT provide approximations to the Fourier transform?

(f) Find the DFT $F_k$ analytically (The Table of DFTs can be used in nearly all cases).

(g) Compute the DFT for various values of $N$ and comment on the numerical results.

130. DFT approximations to Fourier transforms. Consider the following functions defined for $-\infty < x < \infty$.

In each case carry out the following analysis:

(a) Make a sketch of $f$.

(b) Find the sequence $f_n$ that should be used as input to the DFT to approximate the Fourier transform of $f$ (with special consideration for the endpoints).

(c) Assess the smoothness of $f$ (endpoints!).

(d) Use Theorem 6.6 to determine how the error in the DFT as an approximation to the Fourier transform should decrease with $N$.

(e) At what frequencies will the DFT provide approximations to the Fourier transform?

(f) Find the DFT $F_k$ analytically (The Table of DFTs can be used in nearly all cases).

(g) Compute the DFT for various values of $N$ and comment on the numerical results.

131. Reciprocity relations. Consider the following functions:

These functions are sampled on the interval $[-A/2, A/2]$, where $A > 1$, and the resulting sequence is used as input for the $N$-point DFT. In each case, explain the effect of (i) increasing $A$ by increasing $\Delta x$ with $N$ fixed, and (ii) increasing $A$ by increasing $N$ with $\Delta x$ fixed. Specifically, in each case draw the corresponding grids in the frequency domain and indicate the frequencies that are represented in the DFT.
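A small calculation can help organize the bookkeeping for this problem. The sketch below assumes the usual grid relations $\Delta x = A/N$, $\Delta\omega = 1/A$, and a frequency interval of length $\Omega = N/A$, so that $A\Omega = N$ and $\Delta x\,\Delta\omega = 1/N$; the function name `dft_grids` and the sample values of $A$ and $N$ are hypothetical.

```python
def dft_grids(A, N):
    """Grid parameters for an N-point DFT on an interval of length A.

    Returns (dx, dw, Omega): spatial spacing, frequency spacing, and the
    length of the frequency interval represented by the DFT.
    """
    dx = A / N          # spatial grid spacing
    dw = 1.0 / A        # frequency grid spacing
    Omega = N / A       # length of the frequency interval
    return dx, dw, Omega

base = dft_grids(A=2.0, N=16)
case_i = dft_grids(A=4.0, N=16)    # (i) A doubled by doubling dx, N fixed
case_ii = dft_grids(A=4.0, N=32)   # (ii) A doubled by doubling N, dx fixed

for label, (dx, dw, Omega) in [("base", base), ("(i)", case_i), ("(ii)", case_ii)]:
    print(f"{label:5s} dx = {dx:.4f}  dw = {dw:.4f}  Omega = {Omega:.4f}")
```

In both cases the frequency spacing is halved; in case (i) the price is a shorter frequency interval, while in case (ii) the frequency interval is unchanged and the price is a longer DFT.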

132. Details of Case Study 5. Case Study 5 explored the DFT of samples of the function $f(x) = e^{-ax}$ for $x > 0$ and $f(x) = 0$ for $x < 0$.

(a) Show that when $f$ is sampled on the interval $[0, A]$, the DFT and Fourier coefficients are given by

(b) Show that the Fourier transform of / is given by

where / has been evaluated at

(c) Verify that and that


(d) Use Taylor series to expand the exponential term, $\sin\theta_k$, and $\cos\theta_k$ for large $N$ and small values of $\Delta x$. Show that for $k = -N/2 + 1 : N/2$

where c is a constant independent of A and N.

133. Details of Case Study 8. Assume that the function

is sampled at the points $\omega_k = k\Delta\omega$ (where $a > 0$) to produce the sequence

for

(a) Show that the inverse Fourier transform of

(b) Verify that the IDFT of fk is given by

(c) Show that with $\Delta\omega$ and $x_n$ fixed

for

(d) Verify that the function $g$ that results from this limit is periodic in $x_n$ with period

(e) Use Taylor series to expand in powers of $\Delta\omega$ assuming

show that

134. DFT case studies. Carry out an analysis similar to Case Study 5 for the problem of approximating the Fourier transform of the functions

Specifically,

(a) Find the DFT and the Fourier coefficients of $f$ when it is restricted to the interval $[-A/2, A/2]$ (or use The Table of DFTs).

(b) Find the Fourier transform $\hat{f}$ of $f$.

(c) Expand the DFT in Taylor series assuming that $\Delta x$ is small and $A$ is large.

(d) Show that both of the limit paths of Figure 6.10 lead to the sequence $\hat{f}_k$.

135. Asymptotic bounds on integrals. In estimating the error in the DFT of a noncompactly supported function that satisfies $|f(x)| < C|x|^{-r}$ for $|x| > A/2$, $r > 1$, and $C$ a constant, the integral $\int_{A/2}^{\infty} x^{-r} e^{-i2\pi\omega x}\,dx$ must be bounded. Integrating by parts at least twice, show that this integral satisfies

for $A \to \infty$, where $C$ is a constant and $O(A^{-r-1})$ represents terms that are bounded by a constant times $A^{-r-1}$ as $A \to \infty$.

136. IDFT error. The IDFT is to be used to reconstruct a function $f$ from its Fourier coefficients $c_k$ for $k = -N/2 + 1 : N/2$. Show that if the Fourier coefficients satisfy $|c_k| < C'|k|^{-p}$ then the error in the $N$-point IDFT is bounded by $CN^{-p}$, where $C$ and $C'$ are constants independent of $k$ and $N$.

137. Fourier series synthesis. There are some subtleties in selecting the set of Fourier coefficients $\{c_k\}$ to be used for the IDFT. These requirements are the analogs of AVED (average values at endpoints and discontinuities) with respect to Fourier coefficients.

(a) Show that if $c_k = c_{-k}^*$, then the function

is real-valued, but the IDFT will produce real values of $f_n$ only if a symmetric set of coefficients, $c_k$, where $k = -N/2 : N/2$, is used as input.

(b) Show that if the $c_k$'s are real and $c_k = c_{-k}$, then $f$ (given above) is real and even. Furthermore, the IDFT returns a real and even sequence whether the index set $k = -N/2 : N/2$ (an odd number of points) or $k = -N/2 + 1 : N/2$ (an even number of points) is used.

138. Inverse Poisson Summation Formula. Switch the roles of the spatial and frequency domains and then mimic the derivation of the Poisson Summation Formula in the text to obtain the Inverse Poisson Summation Formula (6.21).

139. Replication form of the Inverse Poisson Summation Formula. Show that the Inverse Poisson Summation Formula can be expressed using replication operators as

Chapter 7

A Few Applications of the DFT

7.1 Difference Equations; Boundary Value Problems

7.2 Digital Filtering of Signals

7.3 FK Migration of Seismic Data

7.4 Image Reconstruction from Projections

7.5 Problems

In most cases of practice the number of given values $u_0, u_1, u_2, \ldots$ is either 12 or 24.

- E. T. Whittaker and G. Robinson, The Calculus of Observations, 1924


7.1. Difference Equations; Boundary Value Problems

Background

Difference equations have a history measured in centuries and they find frequent use today in subjects as diverse as numerical analysis, population modeling, probability, and combinatorics. And yet it is a curious fact that they are sadly neglected in the mathematics curriculum. Most students, if they encounter difference equations at all, do so after studying differential equations, a subject equally important, but arguably more advanced. We cannot correct this state of affairs in the confines of these few pages. However, we shall attempt to provide a qualitative survey of difference equations, and then investigate the wonderful connection between certain difference equations and the DFT. We will begin by standing back and looking at difference equations from afar, in order to supply a general map of the territory.

Like all equations, difference equations are intended to be solved, which means that they contain an unknown quantity. The unknown in a difference equation may be regarded as a sequence or a vector which we will denote either $u_n$ or

A difference equation gives a relationship between the components of this vector of the form

$$u_n = \Phi_n(u_{n-1}, u_{n-2}, \ldots, u_{n-m}) + f_n, \qquad (7.1)$$
where $m \geq 1$ is an integer. The function $\Phi_n$ relates one component of the unknown vector to the preceding $m$ components. The terms $f_n$ are components of a given vector that may be viewed as input to the "system" or, in other contexts, as external forcing of the system. With this general form of the difference equation, we may now define some standard terms of classification. The order of the difference equation is $m$. If $\Phi$ is a linear function of its arguments (that is, it does not involve terms such as $u_n^2$ or $u_n u_{n-1}$ or $\sin(u_n)$), then the difference equation is said to be linear; otherwise the difference equation is nonlinear. If the input terms $f_n$ are zero, the difference equation is said to be homogeneous; otherwise it is nonhomogeneous. For example, the difference equations

are fourth-order, linear, nonhomogeneous and first-order, nonlinear, homogeneous,respectively.

We will get much more specific in just a moment, but first it is necessary to make a major distinction between two types of difference equations that arise in practice. Consider the $m$th-order difference equation (7.1). If the first $m$ components of the solution $u_0, u_1, \ldots, u_{m-1}$ were given, then it would be possible to enumerate the remaining components by applying the difference equation explicitly. In other words, given values of $u_0, u_1, \ldots, u_{m-1}$ we could then evaluate


and the entire vector $u$ would be determined. This suggests that in order to find a single solution to an $m$th-order difference equation, $m$ additional conditions must be specified. The way in which these additional conditions are specified changes the character of the difference equation significantly. Here are the two cases that arise most often.

1. Given the $m$th-order difference equation (7.1), if the first $m$ components are specified as

where the $\alpha$'s are given real numbers, then the task of solving the difference equation is called an initial value problem (IVP). The easiest way to interpret this terminology is to imagine that the increasing index $n = 0, 1, 2, 3, \ldots$ represents the passing of time. The difference equation describes a particular system (for example, a bacteria culture or a bank account) as it evolves in time, and the unknown $u_n$ represents the state of the system (the population of bacteria or the balance in the bank account) at the $n$th time unit. In such a time-dependent system, it is reasonable that the initial state of the system, as represented by the components $u_0, u_1, \ldots, u_{m-1}$, should be specified in order to determine the future state of the system. For this reason, this formulation is called an initial value problem. There are very systematic ways to solve linear difference equations/initial value problems that have counterparts in the solution of linear differential equations/initial value problems.

2. The second class of difference equations is less obvious, but equally important. Now imagine that the index $n = 0 : N$ represents spatial position within a system (for example, distance along a heat-conducting bar or position on a beam that is anchored at both ends). The unknown $u_n$ now represents a particular property of the system at the position $n$ when the system has reached steady state (for example, the temperature in the bar or the displacement of the beam under a load). As before, a particular solution to an $m$th-order difference equation in this steady state case can be determined only if $m$ additional conditions are provided. It turns out that the conditions in a steady state problem must be specified at or near the two endpoints of the system (where the system touches the "outside world"). For this reason, a difference equation with such conditions is called a boundary value problem (BVP). A second-order difference equation BVP might carry boundary conditions such as

which would specify, say, the temperature at the end of a rod, or the displacement at the end of a beam. There are systematic methods for solving linear difference equations that appear as BVPs. These methods usually bear little resemblance to the corresponding methods for initial value problems, reflecting the very different nature of these two types of problems.
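The explicit enumeration that characterizes an initial value problem is easy to sketch. The code below assumes a linear, constant coefficient equation of the form $u_n = c_1 u_{n-1} + \cdots + c_m u_{n-m} + f_n$; the function name `march_ivp` and the two examples (a bank balance with 5% growth per period and the Fibonacci recurrence) are illustrations chosen here, not examples taken from the text.

```python
def march_ivp(coeffs, initial, forcing, n_last):
    """March u_n = sum_j coeffs[j] * u_{n-1-j} + forcing(n) forward in n.

    coeffs  : [c_1, ..., c_m], the constant coefficients (order m)
    initial : [u_0, ..., u_{m-1}], the m initial values
    forcing : function n -> f_n, the input to the system
    n_last  : index of the last component to compute
    Returns the list [u_0, ..., u_{n_last}].
    """
    m = len(coeffs)
    if len(initial) != m:
        raise ValueError("an m-th order equation needs exactly m initial values")
    u = list(initial)
    for n in range(m, n_last + 1):
        u.append(sum(c * u[n - 1 - j] for j, c in enumerate(coeffs)) + forcing(n))
    return u

# First order, homogeneous: a bank balance growing by 5% per time unit.
balance = march_ivp([1.05], [100.0], lambda n: 0.0, 2)

# Second order, homogeneous: the Fibonacci recurrence u_n = u_{n-1} + u_{n-2}.
fib = march_ivp([1, 1], [0, 1], lambda n: 0, 10)
print(balance, fib)
```

No such marching is available for a boundary value problem, because the conditions at the far end of the index range are not known when the march begins; this is one way to see why the two classes of problems call for different methods.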

Hopefully this brief introduction gives a sense of the lay of the land. With this background we can now state precisely that the goal of this section is to explore difference equations that take the form of BVPs. We will limit the discussion to second-order linear difference equations, and even then make additional restrictions, with signposts to more general problems. This road (as opposed to that of initial value problems) is far less trodden; it also offers the marvelous connection with the DFT.

BVPs with Dirichlet Boundary Conditions

In this section we will consider a general family of boundary value problems and indicate how the DFT can be used to obtain solutions. We will leave it to the following section to present some specific applications in which such BVPs arise. For the moment consider the difference equation

$$a u_{n-1} + b u_n + a u_{n+1} = f_n, \qquad n = 1 : N-1, \qquad (7.2)$$

where the coefficients $a$ and $b$ are given real numbers, and the terms $f_n$ are also given. According to the discussion of the previous section, this difference equation is second-order because the $(n+1)$st term is related to the two previous terms. It is linear since the unknowns are multiplied only by the constants $a$ and $b$. The equation is also nonhomogeneous because of the presence of the term $f_n$, which can be regarded as an external input to the system. This particular equation is even more specialized since the coefficients $a$ and $b$ do not vary with the index $n$ (this is the constant coefficient case), and furthermore the coefficients of $u_{n\pm 1}$ are equal. Nevertheless, this is a very important special case.

This difference equation could be associated with either an initial value problem or a boundary value problem. Therefore, to complete the specification of the problem we will give the boundary conditions

$$u_0 = u_N = 0$$

and quickly assert (problem 142) that the more general conditions $u_0 = \alpha$, $u_N = \beta$ can be handled by the same methods. Some helpful terminology can be injected at this point: a boundary condition that specifies the value of the solution at a boundary is called a Dirichlet boundary condition. There are actually several other types of admissible boundary conditions, two more of which we will consider a bit later.

The aim is to find an $(N+1)$-vector whose first and last components are zero and whose other components satisfy the difference equation (7.2). It pays to look at this problem in a couple of different ways. If the individual equations of (7.2) are listed sequentially we have the system of linear equations

$$\begin{aligned}
b u_1 + a u_2 &= f_1,\\
a u_1 + b u_2 + a u_3 &= f_2,\\
&\;\;\vdots\\
a u_{N-3} + b u_{N-2} + a u_{N-1} &= f_{N-2},\\
a u_{N-2} + b u_{N-1} &= f_{N-1}.
\end{aligned}$$

Notice that the Dirichlet boundary conditions $u_0 = u_N = 0$ have been used in the first and last equations, which results in a system of $N - 1$ equations. We could go one step further and write this system of equations in matrix form as follows:

$$\begin{pmatrix}
b & a & & & \\
a & b & a & & \\
 & \ddots & \ddots & \ddots & \\
 & & a & b & a \\
 & & & a & b
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_{N-2} \\ u_{N-1} \end{pmatrix}
=
\begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_{N-2} \\ f_{N-1} \end{pmatrix}.$$

In the language of matrices, we see that this particular boundary value problem takes the form of a symmetric tridiagonal system of linear equations (meaning that all of the matrix elements are zero except those on the three main diagonals). You now argue: why not use a method for solving systems of linear equations such as Gaussian elimination and be done with it? And your argument would be irrefutable! For this one-dimensional problem, there is no need to resort to DFTs, and indeed linear system solvers are preferable. However, for problems in two or more dimensions (coming soon), the tables are reversed and the DFT is the more economical method. So we will present the DFT solution with the assurance that the extra work will soon pay off.
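For comparison with the DFT approach, here is a minimal sketch of the direct route: Gaussian elimination specialized to a symmetric tridiagonal matrix with constant diagonal $b$ and constant off-diagonals $a$. It assumes that the difference equation has the form $a u_{n-1} + b u_n + a u_{n+1} = f_n$ with $u_0 = u_N = 0$, and that no pivoting is needed (for instance because the matrix is diagonally dominant or positive definite). The function name `solve_tridiagonal_bvp` and the test data are choices made for this sketch.

```python
import numpy as np

def solve_tridiagonal_bvp(a, b, f):
    """Solve a*u[n-1] + b*u[n] + a*u[n+1] = f[n] for n = 1..N-1, u[0] = u[N] = 0.

    f holds the N-1 interior values f_1, ..., f_{N-1}.
    Returns the full vector u_0, ..., u_N. No pivoting is performed.
    """
    f = np.asarray(f, dtype=float)
    M = f.size                          # number of unknowns, M = N - 1
    diag = np.full(M, float(b))
    rhs = f.copy()
    for i in range(1, M):               # forward elimination
        w = a / diag[i - 1]
        diag[i] -= w * a
        rhs[i] -= w * rhs[i - 1]
    u = np.zeros(M + 2)                 # u[0] and u[M+1] stay zero
    u[M] = rhs[M - 1] / diag[M - 1]     # back substitution
    for i in range(M - 2, -1, -1):
        u[i + 1] = (rhs[i] - a * u[i + 2]) / diag[i]
    return u

N = 16
f = np.full(N - 1, 8.0 / N**2)
u = solve_tridiagonal_bvp(-0.5, 1.0, f)
print(u.max())
```

The work grows only linearly with $N$, which is why a direct solver is hard to beat in one dimension.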

There are now three ways to proceed, each instructive and each leading to the same end. We will take them in the following order:

component perspective,

operational perspective,

matrix perspective.

Component Perspective

Notice that for any integer $k = 1 : N - 1$, the vector $u^k$ with components

$$u^k_n = \sin\left(\frac{\pi nk}{N}\right)$$

for $n = 0 : N$ satisfies the boundary conditions $u_0 = u_N = 0$. This observation motivates the idea of looking for a solution to the difference equation which is a linear combination of these $N - 1$ vectors. Thus, we will assume a trial solution to the difference equation that looks like

$$u_n = 2\sum_{k=1}^{N-1} U_k \sin\left(\frac{\pi nk}{N}\right) \qquad (7.3)$$

for $n = 0 : N$. This representation is simply the inverse discrete sine transform (DST) of the vector $U_k$ as defined in Chapter 4; the factor of 2 has been included to maintain consistency with that definition. This form of the solution contains the $N - 1$ unknown coefficients $U_k$, so we have replaced the problem of finding the $u_n$'s by a new problem of finding the $U_k$'s. However, if the coefficients $U_k$ could be found, then the solution $u_n$ could be reconstructed using the representation given in (7.3); furthermore, it can be done quickly by using the FFT.
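One way the FFT can be brought to bear is sketched below. It assumes the DST pair with a factor of $1/N$ in the forward transform and a factor of 2 in the inverse, and it obtains the sine sums from a single complex FFT of the odd extension of the data to length $2N$. This is only one of several possible strategies (and not the most economical one); the function names `dst` and `idst` are local to this sketch.

```python
import numpy as np

def dst(f):
    """Forward DST: F_k = (1/N) * sum_{n=1}^{N-1} f_n sin(pi*n*k/N), k = 1..N-1.

    f holds f_1, ..., f_{N-1}. The sums are read off from the imaginary part
    of the FFT of the odd extension of f to length 2N.
    """
    f = np.asarray(f, dtype=float)
    N = f.size + 1
    g = np.concatenate(([0.0], f, [0.0], -f[::-1]))   # odd extension, length 2N
    G = np.fft.fft(g)                                 # G_k = -2i * sum f_n sin(pi*n*k/N)
    return -G[1:N].imag / (2 * N)

def idst(F):
    """Inverse DST: u_n = 2 * sum_{k=1}^{N-1} F_k sin(pi*n*k/N), n = 1..N-1."""
    F = np.asarray(F, dtype=float)
    N = F.size + 1
    return 2 * N * dst(F)      # same sine sums, different normalization

rng = np.random.default_rng(0)
f = rng.standard_normal(7)     # N = 8
print(np.max(np.abs(idst(dst(f)) - f)))
```

Because the sine matrix is symmetric, the forward and inverse transforms differ only in their scaling, which is why `idst` can simply reuse `dst`.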

As with any trial solution, it must be substituted into the problem at hand and then followed wherever it may lead. Substituting the expression (7.3) for $u_n$ into the difference equation we find that

$$2\sum_{k=1}^{N-1} U_k \left[ a \sin\left(\frac{\pi (n-1)k}{N}\right) + b \sin\left(\frac{\pi nk}{N}\right) + a \sin\left(\frac{\pi (n+1)k}{N}\right) \right] = f_n,$$

where we understand that this relation must hold for $n = 1 : N - 1$. Now we need to combine terms with the aid of the sine addition rules. Recalling that

$$\sin(A + B) = \sin A \cos B + \cos A \sin B,$$

expanding and collecting terms simplifies this previous relationship significantly. It is now merely

$$2\sum_{k=1}^{N-1} U_k \left[ b + 2a \cos\left(\frac{\pi k}{N}\right) \right] \sin\left(\frac{\pi nk}{N}\right) = f_n \qquad (7.4)$$

for $n = 1 : N - 1$. Notice how the entire left-hand side of the difference equation has been reduced to a linear combination of the terms $\sin(\pi nk/N)$ in which the solution $u_n$ was expressed. This suggests that we ought to try to express the right-hand side, $f_n$, as a linear combination of the same terms. Toward this end we will give the vector $f_n$ the representation

$$f_n = 2\sum_{k=1}^{N-1} F_k \sin\left(\frac{\pi nk}{N}\right)$$

for $n = 1 : N - 1$. Take note that we may interpret the $F_k$'s as the DST coefficients of the vector $f_n$. Since the vector $f_n$ is known, its DST coefficients can be computed using

$$F_k = \frac{1}{N}\sum_{n=1}^{N-1} f_n \sin\left(\frac{\pi nk}{N}\right)$$

for $k = 1 : N - 1$.

We may now return to the computation and insert this representation for $f_n$ into the right-hand side of (7.4). If we do this and collect all terms on one side of the equation we find that the difference equation now appears as

$$2\sum_{k=1}^{N-1} \left\{ \left[ b + 2a \cos\left(\frac{\pi k}{N}\right) \right] U_k - F_k \right\} \sin\left(\frac{\pi nk}{N}\right) = 0 \qquad (7.5)$$

for $n = 1 : N - 1$.

Let's pause to collect some thoughts before forging ahead. The original players in the difference equation were the given vector $f_n$ and the unknown vector $u_n$. They are no longer in sight. They have been replaced by the DSTs of these vectors, $F_k$ and $U_k$, respectively. Recall that the $F_k$'s can be computed (since $f_n$ is known) and that the goal is to determine $U_k$, from which the solution $u_n$ can be reconstructed. With this in mind we resume the task of finding $U_k$ in terms of $F_k$. The last line of computation (7.5) must be valid at each of the indices $n = 1 : N - 1$. The only way in which this can happen in general is if each term of the sum vanishes identically. Since, for each $k$, $\sin(\pi nk/N)$ does not vanish for all $n = 1 : N - 1$, each term vanishes only if the coefficient of $\sin(\pi nk/N)$ vanishes. This observation leads to the $N - 1$ conditions

$$\left[ b + 2a \cos\left(\frac{\pi k}{N}\right) \right] U_k = F_k$$

for $k = 1 : N - 1$. Solving for the desired coefficients $U_k$, we have that

$$U_k = \frac{F_k}{b + 2a \cos(\pi k/N)}$$

for $k = 1 : N - 1$. An important note is that all of the $U_k$'s are well defined provided that $a$, $b$, and $k$ do not conspire to make the denominator $b + 2a\cos(\pi k/N)$ equal to zero. This can be assured if we impose a condition such as $|b| > |2a|$; this condition endows the problem with a property known as diagonal dominance. (Any matrix in which the magnitude of each diagonal term exceeds the sum of the magnitudes of the other terms of the same row is called diagonally dominant.)
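The reason that a condition of this kind keeps the denominator away from zero can be written out in one line. Assuming $|b| > |2a|$, the triangle inequality gives, for every $k = 1 : N - 1$,

$$\left| b + 2a\cos\left(\frac{\pi k}{N}\right) \right| \;\geq\; |b| - |2a|\left|\cos\left(\frac{\pi k}{N}\right)\right| \;\geq\; |b| - |2a| \;>\; 0.$$

As a side remark (an observation added here, not a claim made in the text), the borderline case $|b| = |2a|$ with $a \neq 0$ is also safe for these particular indices, because $|\cos(\pi k/N)| < 1$ strictly when $k = 1 : N - 1$, so the middle inequality becomes strict.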

The final step is to recover the actual solution $u_n$. Having determined the sequence $U_k$, it is now possible to find its inverse DST

$$u_n = 2\sum_{k=1}^{N-1} U_k \sin\left(\frac{\pi nk}{N}\right)$$

for $n = 1 : N - 1$. The resulting solution satisfies the original difference equation (7.2) and the boundary conditions $u_0 = u_N = 0$. We can summarize the entire solution process in three easy steps.

1. Apply the DST to the input vector $f_n$ to determine the coefficients $F_k$.

2. Solve for the coefficients $U_k$ of the solution.

3. Apply the inverse DST to the vector $U_k$ to recover the solution $u_n$.
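The three steps can be sketched as follows. The code assumes the difference equation $a u_{n-1} + b u_n + a u_{n+1} = f_n$ with $u_0 = u_N = 0$, a forward DST scaled by $1/N$, and an inverse DST scaled by 2. For clarity the transforms are applied as explicit matrix products rather than with an FFT; the name `solve_bvp_dst` and the test coefficients are choices made for this sketch.

```python
import numpy as np

def solve_bvp_dst(a, b, f):
    """Solve a*u[n-1] + b*u[n] + a*u[n+1] = f[n], n = 1..N-1, with u[0] = u[N] = 0.

    f holds f_1, ..., f_{N-1}; the full vector u_0, ..., u_N is returned.
    """
    f = np.asarray(f, dtype=float)
    N = f.size + 1
    idx = np.arange(1, N)
    S = np.sin(np.pi * np.outer(idx, idx) / N)   # S[k-1, n-1] = sin(pi*n*k/N)

    F = S @ f / N                                # step 1: forward DST of f
    denom = b + 2.0 * a * np.cos(np.pi * idx / N)
    if np.any(np.isclose(denom, 0.0)):
        raise ZeroDivisionError("b + 2a cos(pi k/N) vanishes for some k")
    U = F / denom                                # step 2: solve for U_k
    u = np.zeros(N + 1)
    u[1:N] = 2.0 * (S @ U)                       # step 3: inverse DST
    return u

rng = np.random.default_rng(0)
f = rng.standard_normal(15)                      # N = 16
u = solve_bvp_dst(1.0, -3.0, f)
residual = 1.0 * u[:-2] - 3.0 * u[1:-1] + 1.0 * u[2:] - f
print(np.max(np.abs(residual)))
```

The matrix products cost on the order of $N^2$ operations; replacing them with FFT-based sine transforms is what makes the method competitive for large problems.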

Note that the factor of $1/N$ in the forward DST and the factor of 2 in the inverse DST can be combined in a single step as long as care is used. Let's solidify these ideas with an example that also begins to move towards one of the applications of difference equations and BVPs.

Example: Towards diffusion. If you have ever opened a door to a warm room on a winter night or spilled a drop of colored dye in a basin of calm water you have experienced the ubiquitous phenomenon of diffusion. For present purposes, it suffices to say that diffusion is characterized by the property that some "substance" (for example, heat, dye, or pollutant) spreads in such a way as to smooth out differences in concentration. A diffusing substance moves from regions of high concentration to regions of low concentration. If a door is left open long enough, the room will eventually have the same temperature as the outdoors; if left undisturbed long enough, the entire basin of water will eventually have a uniform concentration of dye. Here is an idealized model of the diffusion process. Imagine a long thin cylindrical rod that has $N + 1$ uniformly spaced points along its length. As shown in Figure 7.1, the $n$th point has a coordinate $x_n$, for $n = 0 : N$. We will assume that the rod is a good conductor of heat (for example, copper) and has uniform material properties. Furthermore, we will assume that the ends of the rod ($n = 0$ and $n = N$) are held at a fixed temperature of zero; this assumption will form the boundary conditions for the problem. Finally, to make the problem interesting, we will assume that each of the points $n = 1 : N - 1$ along the interior of the rod may have a source (or sink) of heat of known intensity $f_n$. Our question is this: when this system consisting of a conducting rod heated externally along its length reaches equilibrium (steady state), what is the temperature at each of the points along the rod?

FIG. 7.1. The process of diffusion of heat along the length of a long thin conducting rod may be idealized by a discrete model. The rod lies along the interval $[0, A]$ which is sampled at $N + 1$ equally spaced points $x_0, \ldots, x_N$. The steady state temperature (heat content) at a particular point is assumed to be the average of the temperatures at the two neighboring points plus the contribution from a possible external source.

We will let $u_n$ be the temperature at the $n$th point along the rod where $n = 0 : N$. We may argue qualitatively as follows. The fact that diffusion tries to smooth (or average) variations in temperature can be described by requiring that the steady state temperature at a point along the rod be the average of the temperature at the two neighboring points. Since there is also a source of heat at each point of strength $f_n$, we can write that the steady state temperature at the $n$th point is

$$u_n = \frac{u_{n-1} + u_{n+1}}{2} + f_n$$

for $n = 1 : N - 1$. By the boundary conditions we also know that $u_0 = u_N = 0$.

It should be emphasized that this is a qualitative argument since we have introduced no length scales or temperature units, and have assumed that the continuous rod can be represented as $N + 1$ sample points. Nevertheless the averaging idea does capture the essence of diffusion and does lead to a plausible BVP. Rearranging the terms of the previous relationship, we find a difference equation that looks like

$$-\tfrac{1}{2} u_{n-1} + u_n - \tfrac{1}{2} u_{n+1} = f_n$$

for $n = 1 : N - 1$, together with the boundary conditions $u_0 = u_N = 0$. This boundary value problem has exactly the form required for use of the DST. In order to produce a specific solution, let's consider a particular input vector $f_n$. Assume that the external heat sources along the rod have a uniform strength of $8/N^2$ units; that is, $f_n = 8/N^2$ for $n = 1 : N - 1$. Following the three-step procedure just described, we must first transform the right-hand side vector $f_n$, then solve for the coefficients $U_k$ of the solution, then perform the inverse DST of $U_k$ to recover the solution $u_n$.

A short calculation (problem 144) reveals that the DST of the constant vector $f_n = 8/N^2$ is given by

$$F_k = \begin{cases} \dfrac{8}{N^3}\cot\left(\dfrac{\pi k}{2N}\right) & \text{if } k \text{ is odd},\\[2mm] 0 & \text{if } k \text{ is even}, \end{cases}$$

for $k = 1 : N - 1$. Noting that the coefficients of the difference equation are $a = -1/2$ and $b = 1$, we find (with the help of the identity $1 - \cos 2\theta = 2\sin^2\theta$) that the DST coefficients of the solution are

$$U_k = \frac{F_k}{1 - \cos(\pi k/N)} = \frac{F_k}{2\sin^2(\pi k/2N)}$$

for $k = 1 : N - 1$. The final step of taking the inverse DST cannot be done analytically. However, the numerical calculation is straightforward, and the results are shown in Figure 7.2 for $N = 16$ and $N = 32$. Because of the symmetric nature of the input $f_n$, the solution also finds a symmetric pattern that manages to satisfy the boundary conditions $u_0 = u_N = 0$. The "hottest" point on the bar is the midpoint, while heat diffuses out of both ends of the bar in order to maintain the zero boundary conditions. The choice of scaling for $f_n$ has the effect of making the maximum temperature independent of $N$.

FIG. 7.2. An idealized model of diffusion leads to a difference equation BVP that can be solved using the DST. The solution to this problem with a constant input vector $f_n = 8/N^2$ and boundary condition $u_0 = u_N = 0$ is shown for $N = 16$ (left) and $N = 32$ (right).
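The numerical calculation for this example can be sketched as follows. The code assumes $a = -1/2$, $b = 1$, $f_n = 8/N^2$, and the DST scaling used in this section. The closed form used for $F_k$ comes from summing the sines directly (it vanishes for even $k$), so it should be treated as something to verify, for instance as part of problem 144. The comparison values $8n(N-n)/N^2$ are obtained by substituting that quadratic into the difference equation, which it satisfies exactly; they serve only as a check here.

```python
import numpy as np

def diffusion_solution(N):
    """Steady state temperatures u_0, ..., u_N for the rod with f_n = 8/N^2."""
    k = np.arange(1, N)
    n = np.arange(0, N + 1)
    half = np.pi * k / (2 * N)
    # DST of the constant vector 8/N^2: nonzero only for odd k.
    F = np.where(k % 2 == 1, 8.0 / N**3 / np.tan(half), 0.0)
    # Denominator b + 2a cos(pi k/N) = 1 - cos(pi k/N) = 2 sin^2(pi k/(2N)).
    U = F / (2.0 * np.sin(half) ** 2)
    # Inverse DST, evaluated for n = 0..N (the endpoint values come out as zero).
    return 2.0 * np.sin(np.pi * np.outer(n, k) / N) @ U

for N in (16, 32):
    u = diffusion_solution(N)
    n = np.arange(N + 1)
    check = 8.0 * n * (N - n) / N**2
    print(N, u.max(), np.max(np.abs(u - check)))
```

For both values of $N$ the largest temperature occurs at the midpoint and has the same value, in line with the remark that the scaling of $f_n$ makes the maximum independent of $N$.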

We have now shown how the DST can be used to solve difference equations that occur as BVPs. The next move would ordinarily be to show how the discrete cosine transform and the full DFT can also be used to solve BVPs with different kinds of boundary conditions. However, before taking that step, we really should linger a bit longer with the sine transform and make two optional, but highly instructive excursions. The foregoing discussion can be presented in two other perspectives which are laden with beautiful mathematics and connections to other ideas.

Operational Perspective

It is possible to look at the solution of BVPs from an operational point of view. Indeed we have already used this perspective in previous chapters in discussing the DFT. We must first agree on some notation to avoid confusion. Assume that we are given a sequence u_n. Since u_n will be associated with the DST, we will assume that it has the periodicity and symmetry properties of that transform (Chapter 4), namely, u_0 = u_N = 0 and u_n = u_{n+2N}. We will denote the DST of u_n as

\[ S\{u_n\}_k = U_k. \]


The sequence u_{n+1} will be understood to be the sequence u_n shifted right by one unit, while u_{n-1} is u_n shifted left by one unit. (The terminology admittedly becomes somewhat slippery at times. When u_n is viewed as a list of numbers with a fixed length, it seems best to refer to it as a vector. However, when we imagine extending that list to use periodicity or shift properties, the term sequence seems more appropriate.) A short calculation (problem 145) demonstrates the shift property for the DST:

\[ S\{u_{n+1} + u_{n-1}\}_k = 2\cos\left(\frac{\pi k}{N}\right) U_k. \]

With this property in mind, we can turn to the difference equation

\[ a u_{n-1} + b u_n + a u_{n+1} = f_n \]

for n = 1 : N - 1. We now apply the DST operator S (hence the term operational) to both sides of the difference equation, regarding u_n and u_{n±1} as full vectors. Regrouping terms in anticipation of the shift property, we have that

\[ a\, S\{u_{n+1} + u_{n-1}\}_k + b\, S\{u_n\}_k = S\{f_n\}_k. \]

Now, using the shift theorem and the notation that U_k = S{u_n}_k and F_k = S{f_n}_k, we have that

\[ \left(2a\cos\frac{\pi k}{N} + b\right) U_k = F_k \]

for k = 1 : N - 1. As before, the terms F_k can be computed since the input vector f_n is specified. The goal is to determine the coefficients U_k, and clearly this is now possible. We find that

\[ U_k = \frac{F_k}{2a\cos(\pi k/N) + b} \]

for k = 1 : N - 1, which agrees with the expression for U_k found via the component perspective. The solution to the problem can then be expressed as u_n = S^{-1}{U_k}_n.

Thus, we see that an alternative way to solve a linear difference equation with Dirichlet boundary conditions is to apply the DST as an operator to the entire equation. Other types of boundary conditions require different transforms, as we will soon see. This approach certainly has analogs in the solution of differential equations using Fourier and Laplace transforms. In all of these situations, it is necessary to know the relevant shift (or derivative) properties of the particular transform.

Matrix Perspective

It was noted earlier that the difference equation BVP

\[ a u_{n-1} + b u_n + a u_{n+1} = f_n \]

for n = 1 : N - 1, with u_0 = u_N = 0, can be regarded as a system of linear equations. We will denote this system Au = f, where u and f are vectors of length N - 1 and A is an (N - 1) x (N - 1) symmetric tridiagonal matrix. As a system of linear equations this is not a difficult problem to solve; however, it could be even easier! We ask whether there is a change of coordinates (or change of basis) for the vectors u and f such that this system appears in an even simpler form. A change of coordinates can be represented by an invertible matrix P such that u and f are transformed into new vectors v and g by means of the relations u = Pv and f = Pg. If we make these substitutions into the system of equations Au = f we find that

\[ APv = Pg \quad\text{or}\quad P^{-1}APv = g. \]

Not only are the vectors u and f transformed, the system of equations is also transformed into a new system that we will write Dv = g, where D = P^{-1}AP. Here is the question: is there a choice of the transformation matrix P that makes this new system particularly simple? The simplest form for a system of linear equations is a diagonal system in which the equations are decoupled and may be solved independently. So the question becomes: can P be chosen such that D = P^{-1}AP is a diagonal matrix? This turns out to be one of the fundamental questions of linear algebra and it is worth looking at the solution for this particular case in which A is tridiagonal.

The condition D = P^{-1}AP can be rewritten as AP = PD, where the objective is to find the elements of the matrix P in terms of the elements of A. Let's denote the kth column of P as w^k and the kth diagonal element of D as λ_k, where k = 1 : N - 1. The condition AP = PD means that the columns of P must each satisfy the equation

\[ A w^k = \lambda_k w^k \quad\text{or}\quad (A - \lambda_k I)\, w^k = 0 \]

for k = 1 : N - 1, where I is the (N - 1) x (N - 1) identity matrix. For the moment, to simplify the notation a bit, let w represent any of the (N - 1) columns of the matrix P that we have denoted w^k, and let λ denote any of the diagonal elements λ_k. Then the problem at hand is to find the vectors w and scalars λ that satisfy (A - λI)w = 0. Any scalar λ that admits a nontrivial solution to this matrix equation is called an eigenvalue of the matrix A, and the corresponding nontrivial solution vector is called an eigenvector. Finding the eigenvalues and eigenvectors of general matrices A can be a challenging proposition; however, we will show that for the tridiagonal matrix that corresponds to our BVP, it can be done. One approach is to note that if the homogeneous system of equations (A - λI)w = 0 is to have nontrivial solutions, then the determinant |A - λI| must vanish. This fact provides a condition (a polynomial equation) that must be satisfied by the eigenvalues.

An alternative approach is to write out the equations of the system (A - λI)w = 0 in the form

\[ \begin{pmatrix} b-\lambda & a & & \\ a & b-\lambda & a & \\ & \ddots & \ddots & \ddots \\ & & a & b-\lambda \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_{N-1} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \]

A typical equation in this system is

\[ a w_{n-1} + (b - \lambda) w_n + a w_{n+1} = 0 \]

for n = 1 : N - 1, where w_0 = w_N = 0. You should sense that we are coming back full circle, since this system of equations for the eigenvalue problem is almost identical to the original set of difference equations that sent us down this road. The only difference is that the coefficient of the "center" term w_n is b - λ rather than b. Therefore, we can look for solutions in much the same way that we originally solved the difference equation. Solutions of the form w_n = sin(πnk/N) satisfy the condition w_0 = w_N = 0 for any k = 1 : N - 1. If we substitute and use the sine addition laws just as before, we find that for each k = 1 : N - 1

\[ \left(2a\cos\frac{\pi k}{N} + b - \lambda\right)\sin\frac{\pi n k}{N} = 0, \]

which must hold for n = 1 : N - 1. Since sin(πnk/N) cannot vanish for every n = 1 : N - 1, the coefficient of sin(πnk/N) must vanish identically, which says that the eigenvalues λ must satisfy

\[ \lambda_k = b + 2a\cos\frac{\pi k}{N} \]

for k = 1 : N - 1. Corresponding to the eigenvalue λ_k we have the solution vector with components w_n = sin(πnk/N). Resorting to our original notation and letting w^k denote the solution vector associated with λ_k, we discover that the nth component of the kth eigenvector is

\[ w^k_n = \sin\frac{\pi n k}{N}. \]

In other words, the (n, k) element of the matrix P is sin(πnk/N). Equivalently, the kth column of the matrix P is just the kth mode of the DST.

We have now accomplished the task of finding the elements of the transformation matrix P and the elements of the diagonal matrix D. In principle, we could complete the solution of the difference equation. The solution of the new system Dv = g has components v_n = g_n/λ_n. With the vector v determined, the solution to the original difference equation can be recovered, since u = Pv. In practice, there is usually no need to compute the eigenvectors and eigenvalues explicitly. However, this perspective does reveal the following important fact: the matrix P whose columns are the N - 1 modes of the DST is precisely the matrix that diagonalizes the tridiagonal matrix A. In fact, the matrix perspective leads to a three-step solution method that exactly parallels the solution method described under the component and operational perspectives. The parallels are extremely revealing. Here are the three steps for solving Au = f, seen from the matrix perspective.

1. Apply P^{-1} to the right-hand side vector: g = P^{-1}f. (This corresponds to taking the DST of the right-hand side sequence f_n.)

2. Solve the diagonal system of equations Dv = g. (This corresponds to solving for U_k from F_k by simple division.)

3. Recover the solution by computing u = Pv. (This corresponds to taking the inverse DST of U_k.)

We have now presented three different but equivalent perspectives on the use of the DST to solve a particular difference equation BVP. The component and operational perspectives provide actual methods of solution that can be carried out in practice, while the more formal matrix perspective illuminates the underlying structure of the DST and shows why it works so effectively. We will now turn to two other commonly occurring BVPs that can be handled by the discrete cosine transform (DCT) and the DFT. In each of these cases it would be possible to use any of the three perspectives just discussed. The following discussion will be less than exhaustive, with unfinished business left for the problems.
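The matrix perspective on the DST can be checked numerically. The sketch below (Python with NumPy, offered as an illustration and not as the authors' method) builds the tridiagonal matrix A for the assumed sample values N = 8, a = -1/2, b = 1, forms P from the DST modes, and carries out the three matrix steps. All variable names are hypothetical.

```python
import numpy as np

N, a, b = 8, -0.5, 1.0
k = np.arange(1, N)

# Tridiagonal matrix of the Dirichlet problem: b on the diagonal, a beside it.
A = b * np.eye(N - 1) + a * (np.eye(N - 1, k=1) + np.eye(N - 1, k=-1))

P = np.sin(np.pi * np.outer(k, k) / N)      # column k holds the kth DST mode
lam = b + 2.0 * a * np.cos(np.pi * k / N)   # eigenvalues from the sine substitution
D = np.linalg.inv(P) @ A @ P                # diagonal up to rounding error

f = np.full(N - 1, 8.0 / N**2)
g = np.linalg.solve(P, f)   # step 1: g = P^{-1} f
v = g / lam                 # step 2: solve the diagonal system D v = g
u = P @ v                   # step 3: u = P v
print(np.max(np.abs(A @ u - f)))  # residual of the original system
```

Forming the inverse of P explicitly is done here only to display D; in practice the transform is applied with a fast algorithm and P is never stored.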

BVPs with Neumann and Periodic Boundary Conditions

The second variety of difference equation BVP that we will consider has virtually the same form as difference equation (7.2), namely

\[ a u_{n-1} + b u_n + a u_{n+1} = f_n \tag{7.7} \]

for n = 0 : N. We consider an alternative form of the boundary conditions:

\[ u_1 - u_{-1} = 0 \quad\text{and}\quad u_{N+1} - u_{N-1} = 0. \]

Whereas the Dirichlet boundary conditions of the previous section required that the solution vanish (or be specified) at the boundaries n = 0 and n = N, this new boundary condition, called a Neumann1 boundary condition, stipulates the change in the solution at the two boundaries. (In differential equation BVPs, the Neumann condition specifies the value of the derivative of the solution. If the derivative at a boundary is zero, it describes the case of zero flux at that boundary.) The particular Neumann condition that we have given above requires that the solution have no change at the boundaries. Notice that with these boundary conditions u_0 and u_N are unknowns and we have N + 1 unknowns. If the corresponding N + 1 equations of (7.7) are listed sequentially we have

\[ \begin{aligned} b u_0 + 2a u_1 &= f_0, \\ a u_{n-1} + b u_n + a u_{n+1} &= f_n, \quad n = 1 : N - 1, \\ 2a u_{N-1} + b u_N &= f_N, \end{aligned} \]

where the boundary conditions have been used in the first and last equations. We could go one step further and write this system of equations in matrix form as follows:

\[ \begin{pmatrix} b & 2a & & & \\ a & b & a & & \\ & \ddots & \ddots & \ddots & \\ & & a & b & a \\ & & & 2a & b \end{pmatrix} \begin{pmatrix} u_0 \\ u_1 \\ \vdots \\ u_{N-1} \\ u_N \end{pmatrix} = \begin{pmatrix} f_0 \\ f_1 \\ \vdots \\ f_{N-1} \\ f_N \end{pmatrix}. \]

1 FRANZ ERNST NEUMANN (1798-1895) was professor of physics and mineralogy in Königsberg. He is credited with developing the mathematical theory of magneto-electric inductance. He also made important contributions to the subjects of spherical harmonics, hydrodynamics, and elasticity. His son Carl Neumann lived in Leipzig and is known for his work on integral equations, differential geometry, and potential theory. We believe the boundary condition was named for Carl!


The matrix associated with this BVP is tridiagonal, although not quite symmetric, and has N + 1 rows and columns. It would be possible to solve this system of linear equations (and hence the BVP) using Gaussian elimination. Instead, we will consider the use of another form of the DFT, since this becomes the more efficient method in higher-dimensional problems. The key observation is that a vector defined by the DCT

is 2N-periodic and has the properties u_1 - u_{-1} = 0 and u_{N+1} - u_{N-1} = 0. (Recall that Σ'' means the first and last terms of the sum are weighted by 1/2.) In other words, a trial solution of this form satisfies the boundary conditions of the problem. At this point we may proceed with any of the three perspectives discussed earlier. The most succinct solution is provided by the operational perspective. The component and matrix perspectives will be discussed in problems 141 and 146. As we saw earlier, the solution takes place in three steps:

1. the DCT coefficients, F_k, of the input vector, f_n, must be found,

2. the DCT coefficients, U_k, of the solution must be computed from F_k,

3. the inverse DCT must be used to recover the solution u_n.

Recall also that the operational approach requires the use of the shift properties of the relevant transform. Letting C represent the DCT, and letting u_{n+1} and u_{n-1} be the sequences obtained by shifting u_n right and left one unit, respectively, the critical shift property for the DCT is (problem 145)

\[ C\{u_{n+1} + u_{n-1}\}_k = 2\cos\left(\frac{\pi k}{N}\right) C\{u_n\}_k \]

for k = 0 : N. We now apply the DCT to both sides of the difference equation (7.7) to obtain

\[ a\, C\{u_{n+1} + u_{n-1}\}_k + b\, C\{u_n\}_k = C\{f_n\}_k \]

for k = 0 : N. Using the shift property and letting U_k = C{u_n}_k and F_k = C{f_n}_k, we find that

\[ \left(2a\cos\frac{\pi k}{N} + b\right) U_k = F_k \]

for k = 0 : N. Finally, the coefficients F_k may be computed as the DCT of the given vector f_n; that is, F_k = C{f_n}_k for k = 0 : N. It is an easy matter to solve for the unknown coefficients U_k. Doing so, we find that

\[ U_k = \frac{F_k}{2a\cos(\pi k/N) + b} \]

for k = 0 : N. The final step of the solution is to recover the solution u_n by applying the inverse DCT (7.8) to the vector U_k.
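The three DCT steps for the Neumann problem can be sketched in the same way. This is an illustrative Python/NumPy sketch under stated assumptions: the function name dct_solve_neumann is hypothetical, the transform uses a dense cosine matrix with the half-weighted end terms of the Σ'' sum, and the 2/N scaling of the forward transform is an assumption that may differ from the normalization of (7.8). It also assumes |b| > |2a|, so that no denominator vanishes.

```python
import numpy as np

def dct_solve_neumann(a, b, f):
    """Solve a*u[n-1] + b*u[n] + a*u[n+1] = f[n], n = 0..N,
    with u[-1] = u[1] and u[N+1] = u[N-1]."""
    f = np.asarray(f, dtype=float)
    N = f.size - 1
    k = np.arange(N + 1)
    C = np.cos(np.pi * np.outer(k, k) / N)   # C[n, k] = cos(pi*n*k/N)
    w = np.ones(N + 1)
    w[0] = w[-1] = 0.5                       # end weights of the double-primed sum
    F = (2.0 / N) * (C @ (w * f))            # step 1: DCT of the input
    U = F / (b + 2.0 * a * np.cos(np.pi * k / N))  # step 2: assumes nonzero denominators
    return C @ (w * U)                       # step 3: inverse DCT

rng = np.random.default_rng(0)
f = rng.standard_normal(9)                   # N = 8, arbitrary input vector
u = dct_solve_neumann(-1.0, 3.0, f)
```

The result can be compared against a direct solve of the (N + 1) x (N + 1) system whose first and last rows carry the coefficient 2a.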

There is an interesting condition that arises in this BVP. Notice that as long as |b| > |2a|, the coefficients U_k may be determined uniquely. However, absent this condition on a and b, it might happen that the denominator 2a cos(πk/N) + b vanishes for some value of k. If so, a solution fails to exist unless, for that same value of k, the numerator F_k also vanishes. In this case, U_k is arbitrary (since it satisfies the equation 0 · U_k = 0) and the solution is determined up to an arbitrary multiple of the kth mode. For example, the most common case in which this degeneracy arises is the difference equation with b = -2a,

\[ -u_{n-1} + 2u_n - u_{n+1} = f_n, \]

together with the zero boundary conditions u_1 - u_{-1} = u_{N+1} - u_{N-1} = 0. The coefficient U_0 is now undefined and a solution fails to exist, unless it happens that F_0 = 0 also. If F_0 = 0, then U_0 is arbitrary and the solution is determined up to an additive constant (since the k = 0 mode is a constant). This makes sense, since any constant vector satisfies -u_{n-1} + 2u_n - u_{n+1} = 0 and the boundary conditions. Thus, any constant vector can be added to a solution of the BVP, and the result is another solution. It also makes sense physically since the Neumann boundary conditions u_1 - u_{-1} = u_{N+1} - u_{N-1} = 0 are "zero flux" conditions that prevent the diffusing substance from leaving the domain. A steady state can exist only if the net input from external sources is zero; this means that

\[ {\sum_{n=0}^{N}}'' f_n = 0. \]

We should at least mention the underlying mathematical principle at work here. This is one of many forms of the Fredholm2 Alternative, which states that either the homogeneous problem has only the trivial (zero) solution, in which case the nonhomogeneous problem has a unique solution, or the homogeneous problem has nontrivial solutions, in which case the nonhomogeneous problem has either no solution or an infinity of solutions.

2 A native of Sweden, ERIC IVAR FREDHOLM worked at both the University of Stockholm and at the Imperial Bureau of Insurance during his lifetime. He is best known for his contributions to the theory of integral equations.

The third class of BVPs on our tour introduces yet another type of boundary condition to the same difference equation. We will now consider the difference equation

\[ a u_{n-1} + b u_n + a u_{n+1} = f_n \tag{7.9} \]

subject to the boundary condition that u_0 = u_N and u_{-1} = u_{N-1}. These boundary conditions stipulate that the solution should be N-periodic, and not surprisingly, they are called periodic boundary conditions. In this case u_0 (or u_N) becomes an unknown, and there is a total of N unknowns. If the corresponding N equations of (7.9) are listed sequentially we have the system of linear equations

\[ \begin{aligned} b u_0 + a u_1 + a u_{N-1} &= f_0, \\ a u_{n-1} + b u_n + a u_{n+1} &= f_n, \quad n = 1 : N - 2, \\ a u_0 + a u_{N-2} + b u_{N-1} &= f_{N-1}. \end{aligned} \]


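Notice that the boundary conditions have been used in the first and last equations. This set of equations can also be written in matrix form as follows:

\[ \begin{pmatrix} b & a & & & a \\ a & b & a & & \\ & \ddots & \ddots & \ddots & \\ & & a & b & a \\ a & & & a & b \end{pmatrix} \begin{pmatrix} u_0 \\ u_1 \\ \vdots \\ u_{N-2} \\ u_{N-1} \end{pmatrix} = \begin{pmatrix} f_0 \\ f_1 \\ \vdots \\ f_{N-2} \\ f_{N-1} \end{pmatrix}. \]

The matrix associated with this BVP is no longer tridiagonal, but circulant (notice the nonzero corner elements); it is symmetric and has N rows and columns.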

As before, we would like to consider a DFT-based method for solving this BVP. The boundary conditions now require a (real) periodic solution vector with period N. The vector that fits this bill is given by the inverse DFT in its real form. Therefore, we will consider solutions of the form

where ω_N = e^{i2π/N}. It is easiest to allow the coefficients U_k to be complex, and then take the real part to assure that the solution u_n is real-valued. Note that u_n given in this form satisfies the boundary conditions. Again, there are three possible perspectives that could be adopted. For variety and brevity, we will take the component perspective.

Substituting the solution u_n as given in (7.10) and a similar representation for the input vector f_n into the difference equation (7.9), we find that

where the coefficients F_k can be computed as the DFT of the input vector f_n. This equation must hold for n = 0 : N - 1. The shifts n ± 1 that appear in the complex exponentials are easily handled (no trigonometric addition laws are needed here) and a gathering of terms gives

for n = 0 : N - 1. Once again, we argue that if this equation is to hold in general for all n = 0 : N - 1, then each term of the sum must vanish identically. Noting that

\[ \omega_N^{k} + \omega_N^{-k} = 2\cos\left(\frac{2\pi k}{N}\right), \]

we have that

\[ \left(b + 2a\cos\frac{2\pi k}{N}\right) U_k = F_k \]

for k = 0 : N - 1. Solving for the unknown coefficients U_k gives

\[ U_k = \frac{F_k}{b + 2a\cos(2\pi k/N)} \]

for k = 0 : N - 1. The three-step job is completed by taking the inverse DFT of the vector U_k to recover the solution u_n according to (7.10). We pause to point out the degeneracy that can occur in this case also. If we assume |b| > |2a|, then all of the coefficients U_k can be determined without ambiguity. Otherwise, a, b, and k may combine to make one of the U_k's indeterminate. This can lead either to the nonexistence of a solution or to an infinity of solutions. We would like to illustrate the use of a BVP with periodic conditions with a short example.

Example: A model of gossip. The process of diffusion discussed earlier has many physical applications. However, it is also used in a nonfrivolous way by sociologists to describe the spread of information, known otherwise as gossip. Recall that the driving principle in diffusion is the flow of a "substance" (in this case, lurid stories) from regions of high concentration to regions of low concentration. This is certainly a plausible mechanism for the spread of gossip, since like heat, it tends to flow from "hot" (well-informed) people to "cold" people. In a steady state, large differences in the amount of gossip possessed by individuals would eventually be smoothed out, as the system tends toward equilibrium. We will idealize this process with a model that consists of N individuals (or houses or communication nodes) that have the configuration shown in Figure 7.3. Each individual can talk to two nearest neighbors and no one else. Furthermore, the group is arranged in a ring so that person n = 0 talks to persons n = 1 and n = N - 1. In order to capture this cyclic structure we will identify person n = 0 with person n = N. The variable (unknown) in the model will be u_n, the amount of gossip that person n has when the process has reached a steady state. (Note that, as with any diffusion process, gossip may still be transferred in the steady state, but in such a way that no individual's amount of gossip changes in time.)

FIG. 7.3. A model for the steady state distribution of gossip consists of a ring of people (or communication nodes), each with two nearest neighbors. The cyclic configuration (shown here with N = 8) is handled by periodic boundary conditions.

We will now model the actual diffusion process as we did before by requiring that the steady state amount of gossip possessed by person n be the average of the gossip


possessed by the two neighbors. This assumption embodies the smoothing effect of diffusion. Furthermore, we will assume that there are possible sources (or sinks) of gossip, whereby new information is introduced to the system or deliberately withheld. This input to the system will be denoted f_n. Combining these two effects, we see that in a steady state the gossip content for person n satisfies

\[ u_n = \frac{u_{n-1} + u_{n+1}}{2} + f_n \]

for n = 0 : N - 1. The cyclic geometry of the configuration makes the periodic boundary conditions u_0 = u_N and u_{-1} = u_{N-1} appropriate. Rearranging the difference equation we see that

\[ -\frac{1}{2} u_{n-1} + u_n - \frac{1}{2} u_{n+1} = f_n \]

for n = 0 : N - 1. To complete the statement of the problem we must specify an input vector to describe sources and sinks of gossip. Intuition alone suggests that since this is a closed system, a steady state cannot be reached unless the net flow of gossip in and out of the system is zero. In other words, Σ_{n=0}^{N-1} f_n = 0, which is equivalent to F_0 = 0. We will see shortly that this condition also appears naturally in the course of the solution. For this particular example we will assume that there is a single source of gossip located at n = 0 and a single sink of gossip located at n = N/2, both of unit strength. This means that

\[ f_n = \begin{cases} 1 & \text{if } n = 0, \\ -1 & \text{if } n = N/2, \\ 0 & \text{otherwise} \end{cases} \]

for n = 0 : N - 1, a vector whose DFT is given by

for k = 0 : N - 1 (see The Table of DFTs in the Appendix). Notice that the constraint F_0 = 0 is satisfied.

The foregoing method for periodic boundary conditions now applies directly with a = -1/2 and b = 1. The DFT coefficients of the solution are given by

\[ U_k = \frac{F_k}{1 - \cos(2\pi k/N)} \]

for k = 1 : N - 1, with U_0 arbitrary since it satisfies the equation 0 · U_0 = 0. Taking the inverse DFT of this vector, we have the solution

for n = 0 : N - 1. In the absence of a reference value for gossip content, we may take U_0 = 0. It doesn't appear possible to simplify this expression for u_n directly (but see problem 153). However, it is easily evaluated numerically, and the results for N = 16 are shown in Figure 7.4. Perhaps the solution could have been anticipated from physical intuition, but certainly not from the analytical solution. We see that the solution (for any N) consists of two linear segments with a maximum value located at the source of gossip (n = 0) and a minimum value located at the sink (n = N/2). Notice that the average value of the solution is zero. The smoothing effect of the difference equation creates the linear profiles between the maximum and minimum values, which is a common feature in many steady state diffusion problems.

FIG. 7.4. A diffusion model may be used to describe the steady state distribution of gossip in a cyclic configuration. The model takes the form of a difference equation with periodic boundary conditions that can be solved using the DFT. With one source and one sink of gossip, the solution with N = 16 participants (gossipers) consists of two linear segments as shown in the figure.
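The gossip model is easy to evaluate numerically. The sketch below uses NumPy's FFT routines as a stand-in for the DFT of the text; NumPy places the 1/N factor on the inverse transform, which may differ from the convention used in this book, but the recovered u_n does not depend on that choice. Setting U_0 = 0 selects the zero-mean solution. Variable names are illustrative.

```python
import numpy as np

N = 16
f = np.zeros(N)
f[0], f[N // 2] = 1.0, -1.0          # unit source at n = 0, unit sink at n = N/2

k = np.arange(N)
F = np.fft.fft(f)                    # step 1: DFT of the input (no 1/N factor here)
denom = 1.0 - np.cos(2.0 * np.pi * k / N)   # b + 2a*cos(2*pi*k/N) with a = -1/2, b = 1
U = np.zeros(N, dtype=complex)
U[1:] = F[1:] / denom[1:]            # step 2: skip k = 0, where the denominator is zero
u = np.fft.ifft(U).real              # step 3: inverse DFT

print(np.round(u, 6))
```

With N = 16 the computed solution falls by one unit per person from u_0 = 4 to u_8 = -4 and then rises symmetrically, which gives the two linear segments described above.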

Fast Poisson Solvers

In this section we consider arguably the most prevalent use of DFTs and difference equations, namely the solution of ordinary and partial differential equations that take the form of boundary value problems. This is a central problem of computational mathematics and has been the scene of vigorous activity for many years [79], [133], [134], [137]. We will content ourselves with exploring a few representative problems and demonstrating the vital part that the DFT plays in the solution process. It is best to begin with a one-dimensional problem to fix ideas, then turn to higher-dimensional problems in which the DFT becomes indispensable.

Consider the ordinary differential equation (ODE)

\[ -\phi''(x) + \sigma^2 \phi(x) = g(x) \quad\text{for } 0 < x < A, \tag{7.11} \]

subject to the Dirichlet boundary conditions φ(0) = φ(A) = 0. Notice that the additional conditions on the solution are given at the endpoints or boundaries of the domain, so this is called an ODE boundary value problem. The nonnegative coefficient σ^2 and the source function g are given. With σ^2 = 0 this equation describes steady state diffusion. (In fact, this continuous equation can be derived from the difference equation of the previous section by letting the spatial distance between the individual components decrease to zero, while letting the number of components N increase to infinity.)


FIG. 7.5. Numerical solutions of differential equations often take place on a computational domain or grid consisting of a finite number of points. The left figure shows the grid for the solution of a one-dimensional BVP, while the right figure shows a two-dimensional grid on a rectangular region. Second derivatives at a point (•) are approximated using function values at that point and at nearest neighbors (x).

Here is the kind of thinking that might precede solving this BVP. One can use a Fourier series method to arrive at an analytical solution to this problem. In all likelihood, that solution will be expressed in terms of an (infinite) Fourier series that will need to be truncated and evaluated numerically. Since numerical approximation will be inevitable even with an analytical solution, one might choose to use a numerical method from the start. As we will see shortly, one numerical method is based on the DFT and mimics the analytical methods very closely. With this rationale for using a numerical approximation, let's see how it works.

The exact solution of the continuous BVP (7.11) gives the value of the solution function φ at every point of the interval [0, A]. Since we cannot compute, tabulate, or graph such an infinity of values, we must resign ourselves at the outset to dealing with a discrete problem in which the solution is approximated at a finite number of points in the domain. Toward this end, we establish a grid on the interval [0, A] by defining the grid points x_n = nA/N where n = 0 : N (see Figure 7.5). Notice that the distance (or grid spacing) between these equally spaced points is Δx = A/N. The goal will be to approximate the solution φ at the grid points.

Our first task is to show how the continuous (differential) equation can be approximated by a discrete (difference) equation of the form studied in the previous section. This process is known as discretization, and the easiest way to do it is to approximate the derivative φ'' in the differential equation by a finite difference. A short exercise in Taylor series (problem 148) suffices to establish the following fact: if the function φ is at least four times differentiable, then at any of the grid points x_n

\[ \phi''(x_n) = \frac{\phi(x_{n+1}) - 2\phi(x_n) + \phi(x_{n-1})}{\Delta x^2} + c_n \Delta x^2 \tag{7.12} \]

for n = 1 : N - 1, where c_n is a constant. We see that the second derivative can be approximated by a simple combination of the function value at x_n and its nearest neighbors x_{n±1}. The truncation error in this approximation, denoted c_n Δx^2, decreases as the square of the grid spacing Δx. Thus, as N increases and the grid spacing decreases, we expect the error in our approximation to decrease.
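The second-order behavior of the truncation error can be observed directly. The following short Python sketch is an illustration only: the helper name second_difference is hypothetical, and the test function sin x and the point x = 1 are arbitrary choices. It halves the grid spacing twice and compares successive errors.

```python
import numpy as np

def second_difference(phi, x, dx):
    """Centered difference approximation to phi''(x) with grid spacing dx."""
    return (phi(x + dx) - 2.0 * phi(x) + phi(x - dx)) / dx**2

x = 1.0
exact = -np.sin(x)   # second derivative of sin at x
errors = [abs(second_difference(np.sin, x, dx) - exact) for dx in (0.1, 0.05, 0.025)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(ratios)        # each ratio should be close to 4
```

Halving the spacing reduces the error by a factor near 4, as expected of an error proportional to the square of the grid spacing.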

The whole strategy behind finite difference methods is to replace derivatives by finite differences and solve the resulting discrete equations for approximations to the exact solution. We do this by using the approximation (7.12) in the original ODE (7.11), dropping the truncation error c_n Δx^2, and letting u_n denote the resulting approximation to φ(x_n). This means that the components u_n satisfy

\[ -\frac{u_{n+1} - 2u_n + u_{n-1}}{\Delta x^2} + \sigma^2 u_n = g_n \]

for n = 1 : N - 1, where we have let g_n = g(x_n). Furthermore, since u_0 and u_N are implicated in this equation, we use the boundary conditions u_0 = u_N = 0. A bit of rearranging will cast this equation in a familiar form. Multiplying through by Δx^2 and collecting terms gives us the difference equation

\[ -u_{n-1} + (2 + \sigma^2 \Delta x^2)\, u_n - u_{n+1} = g_n \Delta x^2 \tag{7.13} \]

for n = 1 : N - 1, with u_0 = u_N = 0. This is just the second-order BVP with Dirichlet boundary conditions that was discussed in the previous section, with a = -1, b = 2 + σ^2 Δx^2, and f_n = g_n Δx^2. The three-step DST method can be applied directly to this difference equation, now with an eye on the question of convergence. The ultimate goal is to approximate the solution of the original BVP. In practice, one would solve the difference equation (7.13) for several increasing values of N until successive approximations reach a desired accuracy.

Example: Toward diffusion. Let's look at numerical solutions of the BVP

\[ -\phi'' + 16\phi = 4x\sin 4x - \cos 4x \]

with the boundary conditions φ(0) = φ(π) = 0. Knowing the exact solution (φ(x) = (1/8) x sin 4x), it is possible to compute the error E_N for various values of N. The evidence, shown in Table 7.1, demonstrates clearly that the errors not only decrease with increasing N, but decrease almost precisely as N^{-2} or Δx^2.

TABLE 7.1
Errors in N-point DST solutions to -φ'' + 16φ = 4x sin 4x - cos 4x, with φ(0) = φ(π) = 0.

N       E_N
8       3.4(-2)
16      8.3(-3)
32      2.2(-3)
64      5.4(-4)

The notation a(-n) means a x 10^{-n}.

Before moving on to two-dimensional problems, we will just mention that two other frequently arising BVPs can be handled using DFT-based methods. The same differential equation (7.11) accompanied by the Neumann conditions φ'(0) = φ'(A) = 0 can be discretized in the same way. The result is a difference equation of the form (7.13) that can be solved using the DCT. And finally, the same differential equation with the periodic conditions φ(0) = φ(A), φ'(0) = φ'(A) can also be solved in its discrete form using the full DFT. Hopefully these two assertions are plausible in light of the previous discussion. The details will be elaborated in problem 149.
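The convergence experiment for the Dirichlet example -φ'' + 16φ = 4x sin 4x - cos 4x on [0, π] can be repeated with a few lines of code. This is a hedged sketch in Python with NumPy: the function name dst_bvp is hypothetical, the DST is applied with a dense sine matrix, and the error is measured in the maximum norm, which is an assumption about how E_N is defined. The printed values therefore need not match Table 7.1 digit for digit.

```python
import numpy as np

def dst_bvp(g, A, sigma2, N):
    """DST solution of -phi'' + sigma2*phi = g on [0, A] with phi(0) = phi(A) = 0."""
    dx = A / N
    n = np.arange(1, N)
    x = n * dx
    S = np.sin(np.pi * np.outer(n, n) / N)
    F = (2.0 / N) * (S @ (g(x) * dx**2))     # DST of f_n = g_n * dx^2
    # a = -1 and b = 2 + sigma2*dx^2, so b + 2a*cos(pi*k/N) is:
    U = F / (2.0 + sigma2 * dx**2 - 2.0 * np.cos(np.pi * n / N))
    return x, S @ U

g = lambda x: 4.0 * x * np.sin(4.0 * x) - np.cos(4.0 * x)
exact = lambda x: x * np.sin(4.0 * x) / 8.0

errors = {}
for N in (8, 16, 32, 64):
    x, u = dst_bvp(g, np.pi, 16.0, N)
    errors[N] = np.max(np.abs(u - exact(x)))  # max-norm error (assumed norm)
    print(N, errors[N])
```

On the finer grids each doubling of N should reduce the error by a factor close to 4, consistent with an error proportional to Δx^2.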

Having spent considerable time on one-dimensional BVPs and having recognized that they can be solved more efficiently using a direct (Gaussian elimination) equation solver, we now have the groundwork to consider two-dimensional BVPs in which DFT-based methods really do have an advantage over direct equation solvers. The two-dimensional analog of the continuous BVP (7.11) is most easily posed on a rectangular domain

\[ \Omega = \{(x, y) : 0 < x < A,\ 0 < y < B\}. \]

We will denote the boundary of this domain ∂Ω.

The prototype BVP is one of the classical problems of applied mathematics. It is given by the partial differential equation (PDE)

\[ -\phi_{xx} - \phi_{yy} + \sigma^2 \phi = g(x, y), \]

where the nonnegative constant σ^2 and the input function g are given. The PDE will be accompanied by the boundary condition φ = 0 on ∂Ω. With σ^2 ≠ 0 the PDE is called the Helmholtz3 equation; it arises in many wave propagation problems. With σ^2 = 0, the equation is called the Poisson equation, and it governs the steady state in diffusion processes, electrostatics, and ideal fluid flow.

3 HERMANN VON HELMHOLTZ (1821-1894) was an eclectic man who lived in Berlin, Königsberg, Bonn, and Heidelberg during his lifetime. He was trained in medicine and served as a military physician and as a professor of physiology. As a physics professor in Berlin he did fundamental work in electrodynamics and hydrodynamics.

We will proceed, in analogy with the one-dimensional problem, converting the continuous PDE into a partial difference equation. As before, the first step is to establish a grid on the domain of the problem as shown in Figure 7.5. In general, there could be different grid spacings in the two coordinate directions; therefore, we will let Δx = A/M and Δy = B/N be the grid spacings in the x- and y-directions, respectively. The grid consists of (M - 1)(N - 1) interior points, where M and N are any positive integers. We must now use finite difference approximations to replace φ_xx and φ_yy in the PDE. Focusing on an interior point (x_m, y_n), the simplest approximations to the second partial derivatives are

\[ \phi_{xx}(x_m, y_n) = \frac{\phi(x_{m+1}, y_n) - 2\phi(x_m, y_n) + \phi(x_{m-1}, y_n)}{\Delta x^2} + c_{mn} \Delta x^2 \]

and

\[ \phi_{yy}(x_m, y_n) = \frac{\phi(x_m, y_{n+1}) - 2\phi(x_m, y_n) + \phi(x_m, y_{n-1})}{\Delta y^2} + d_{mn} \Delta y^2 \]

for m = 1 : M - 1, n = 1 : N - 1. Notice that these difference approximations are the analogs of the difference approximation to φ'' used earlier. We see that the second partial derivatives at (x_m, y_n) can be approximated by a simple combination of function values at that point and its nearest neighbors. The truncation error, involving the constants c_mn and d_mn, decreases as the square of the grid spacing.

We proceed just as before by defining u_mn as the approximation to φ(x_m, y_n). Partial derivatives in the PDE are replaced by finite differences, the truncation errors are neglected, and the approximate solution u_mn is found to satisfy the difference equation

\[ -\frac{u_{m+1,n} - 2u_{mn} + u_{m-1,n}}{\Delta x^2} - \frac{u_{m,n+1} - 2u_{mn} + u_{m,n-1}}{\Delta y^2} + \sigma^2 u_{mn} = g_{mn} \tag{7.14} \]

for m = 1 : M - 1, n = 1 : N - 1. We have also used g_mn to denote g(x_m, y_n). The boundary conditions for φ carry over to the discrete approximation in the form

\[ u_{0n} = u_{Mn} = u_{m0} = u_{mN} = 0 \]

for m = 0 : M and n = 0 : N.

There are several ways to proceed from here. One could write out the individual equations of (7.14) in matrix form and discover that the difference equation corresponds to a square matrix with (M - 1)(N - 1) rows and columns. The matrix consists mostly of zeros (called a sparse matrix) and has a block tridiagonal structure. The fact that the matrix is large and sparse makes a direct use of Gaussian elimination very impractical. Taking a lesson from the one-dimensional case, we will use a DFT-based method. The fact that the BVP has Dirichlet boundary conditions in both the x- and y-directions suggests that a discrete sine transform is appropriate. In fact, if we assume a solution of the form

for m = 0 : M and n = 0 : N, it is easy to see that the boundary conditions are satisfied immediately. This representation is the two-dimensional inverse DST of the vector U_jk as discussed in Chapter 5.

It helps to rearrange equation (7.14) into the form

before proceeding. We have used γ to denote the ratio Δx/Δy and let f_mn = γ^2 g_mn. Taking the component approach of the previous section we assume a representation similar to (7.15) for the vector f_mn with coefficients F_jk, and then substitute into the difference equation (7.16). Working carefully with the multitude of indices, we have that


which holds for m = 1 : M — 1 and n — 1 : N — I. Perhaps it is a testimony to thespecial power of the DST that it can render this monstrous equation benign. Thecoefficients Fjk are known (or computable) as the two-dimensional DST

for $j = 1 : M - 1$, $k = 1 : N - 1$, while the coefficients $U_{jk}$ are the objects of the quest. (The minus sign appears in the definition of the two-dimensional DST given in Chapter 5.) Using the sine addition laws as before and collecting terms, we find that

for $m = 1 : M - 1$ and $n = 1 : N - 1$. We argue in a familiar fashion that this equation can hold in general for all relevant values of $m$ and $n$ only if the coefficient of each of the modes vanishes identically. This leads to the condition

for all of the points $m = 1 : M - 1$, $n = 1 : N - 1$. Having determined the coefficients $U_{jk}$ of the solution, it is a straightforward task to perform the inverse DST according to the representation (7.15) to recover the solution $u_{mn}$. We see that in two dimensions the same basic three-step procedure applies.

1. The two-dimensional DST must be applied to the input vector $f_{mn}$ to find the coefficients $F_{jk}$.

2. A simple algebraic step gives the coefficients $U_{jk}$.

3. The two-dimensional inverse DST of the vector $U_{jk}$ must be done to produce the solution $u_{mn}$.

We hasten to point out that in practice all of the DSTs in this procedure are done using specialized versions of the FFT, tailored for both two dimensions and for the sine symmetries.
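The three-step procedure can be sketched in a few lines. The following is only an illustration under stated assumptions, not the specialized FFT-based implementation just mentioned: it assumes the difference equation has the form $u_{m-1,n} - 2u_{mn} + u_{m+1,n} + \gamma^2(u_{m,n-1} - 2u_{mn} + u_{m,n+1}) = f_{mn}$ with zero boundary values, it evaluates the sine transforms by direct summation, and it places a factor $4/(MN)$ on the forward transform, which may differ from the normalization of Chapter 5. All function names are hypothetical.

```python
import math

def dst2(f, M, N):
    """Forward 2-D sine transform by direct summation (assumed 4/(MN) scaling)."""
    return [[4.0 / (M * N) * sum(f[m - 1][n - 1]
                                 * math.sin(math.pi * j * m / M)
                                 * math.sin(math.pi * k * n / N)
                                 for m in range(1, M) for n in range(1, N))
             for k in range(1, N)] for j in range(1, M)]

def idst2(U, M, N):
    """Inverse transform: u_mn = sum_jk U_jk sin(pi j m / M) sin(pi k n / N)."""
    return [[sum(U[j - 1][k - 1]
                 * math.sin(math.pi * j * m / M)
                 * math.sin(math.pi * k * n / N)
                 for j in range(1, M) for k in range(1, N))
             for n in range(1, N)] for m in range(1, M)]

def solve_dirichlet(f, M, N, gamma):
    """Three-step solve of the assumed difference equation."""
    F = dst2(f, M, N)                                    # step 1: transform the data
    U = [[F[j - 1][k - 1] / (2 * math.cos(math.pi * j / M) - 2
                             + gamma ** 2 * (2 * math.cos(math.pi * k / N) - 2))
          for k in range(1, N)] for j in range(1, M)]    # step 2: divide mode by mode
    return idst2(U, M, N)                                # step 3: transform back

def apply_operator(u, M, N, gamma):
    """Left-hand side of the assumed difference equation with zero boundary values."""
    def at(m, n):
        return u[m - 1][n - 1] if 0 < m < M and 0 < n < N else 0.0
    return [[at(m - 1, n) - 2 * at(m, n) + at(m + 1, n)
             + gamma ** 2 * (at(m, n - 1) - 2 * at(m, n) + at(m, n + 1))
             for n in range(1, N)] for m in range(1, M)]

# Manufacture a right-hand side from a known grid function, then recover it.
M, N, gamma = 8, 6, 0.75
u_true = [[m * (M - m) * math.sin(n) for n in range(1, N)] for m in range(1, M)]
f = apply_operator(u_true, M, N, gamma)
u = solve_dirichlet(f, M, N, gamma)
max_err = max(abs(u[m][n] - u_true[m][n])
              for m in range(M - 1) for n in range(N - 1))
```

The algebraic step works because each sine mode is an eigenvector of the assumed difference operator, so dividing by the eigenvalue inverts the operator one mode at a time. The direct summations cost far more than an FFT would and are used only to keep the sketch free of dependencies.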

This approach generalizes in several directions. First, the same PDE with Neumann conditions ($\phi_n = 0$ on $\partial\Omega$, where $\phi_n$ is the derivative in the normal (orthogonal) direction to the boundary) can be discretized and solved using the two-dimensional DCT. Similarly, the same PDE with periodic boundary conditions in both directions is amenable to the full DFT. A little contemplation shows that the two coordinate directions are independent in this solution method. Therefore, with a different type of boundary condition on each pair of parallel sides of the domain, the relevant DFT can be applied in each coordinate direction. One step further, if different types of boundary conditions are specified on a single pair of parallel sides (for example, a Dirichlet condition for $x = 0$ and a Neumann condition for $x = A$), then there are additional quarter-wave DFTs to handle these cases. Finally, with just a bit more industry, the same ideas can be applied to the three-dimensional version of the BVP posed on cubical or parallelepiped domains. In this case, the three-dimensional versions of the relevant DFTs are applied in the three coordinate directions. Perhaps most important of all, the same basic three-step method still applies in all of these extended cases. There are limitations also. DFT-based methods do not work directly for problems posed on irregular (nonrectangular) domains. The symmetry of the matrix associated with the discrete problem is also crucial for the success of these methods; therefore, they do not apply to variable coefficient problems. However, other boundary value problems may be amenable to solution using other discrete transforms (see Chapter 8).

In closing, we must make an attempt to assess the performance of these DFT-based methods. The discrete problem that arises from the BVP (7.11) has roughly $MN$ unknowns (actually $(M - 1)(N - 1)$ unknowns for the pure Dirichlet problem, $MN$ for the pure periodic problem, and $(M + 1)(N + 1)$ for the pure Neumann problem). This means that the matrix of coefficients associated with the problem has approximately $MN$ rows and columns. If Gaussian elimination is used mindlessly to solve this system of equations (neglecting the fact that most of the matrix coefficients are zero), roughly $(MN)^3$ arithmetic operations are needed. If the regular zero-nonzero structure of the coefficient matrix is taken into account, there are methods that require roughly $(MN)^{3/2}$ arithmetic operations. What is the cost of the DFT-based methods? We need to anticipate the fact (confirmed in Chapter 10) that an $N$-point DFT (or DST or DCT) can be evaluated in roughly $N \log N$ arithmetic operations when implemented with the fast Fourier transform (FFT). With this in mind, the operation count for the three-step DFT method can be tallied as follows.

1. The forward DFT of $f_{mn}$ consists of $M$ DFTs of length $N$ plus $N$ DFTs of length $M$ for a total cost of roughly $MN \log N + NM \log M$ arithmetic operations.

2. The computation of $U_{jk}$ costs roughly $MN$ operations.

3. The inverse DFT of $U_{jk}$ consists of $M$ DFTs of length $N$ plus $N$ DFTs of length $M$ for a total cost of roughly $MN \log N + NM \log M$ arithmetic operations.

Adding these individual costs we see that the DFT method (using the FFT) requires on the order of

$$2(MN \log N + NM \log M) + MN$$

arithmetic operations.

The point is easiest to make if we consider a grid with $M = N$. Then the cost of finding the $N^2$ unknowns is on the order of $N^2 \log N$. An operation count of $N^2$ would be considered optimal since some work must be done to determine each of the $N^2$ unknowns. Therefore, we conclude that the DFT-based methods are very nearly optimal. For this reason these methods are often collectively called fast Poisson solvers. It is worth noting that an improvement can be made over the scheme just noted by combining DFTs and tridiagonal solvers. This hybrid method is generally used in software packages, and it is worth investigating, but that will have to be the subject of problem 155.
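To make the comparison concrete, the rough counts quoted above can be evaluated for a particular grid. The sketch below assumes base-2 logarithms and ignores all constant factors hidden by the word "roughly"; the function name is hypothetical.

```python
import math

def operation_counts(M, N):
    """Rough operation counts for an M-by-N grid (constants ignored, log base 2 assumed)."""
    unknowns = M * N
    dense = unknowns ** 3                 # Gaussian elimination ignoring sparsity
    sparse = unknowns ** 1.5              # methods exploiting the zero-nonzero structure
    fft = 2 * (M * N * math.log2(N) + N * M * math.log2(M)) + M * N  # three-step DFT method
    return dense, sparse, fft

dense, sparse, fft = operation_counts(1024, 1024)
ratio = sparse / fft   # how much cheaper the DFT method is than the sparse estimate
```

For $M = N = 1024$ the DFT-based count is $41 \cdot 2^{20}$, roughly 25 times smaller than the $(MN)^{3/2} = 2^{30}$ estimate and vastly smaller than the dense count of $2^{60}$.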


Notes

The preceding discussion of fast Poisson solvers is just a teaspoon from the sea of research that has been done on the subject. Those interested in the history and evolution of the field should begin with the landmark papers by Hockney [79] and Buzbee, Golub, and Nielson [29], then graduate to the papers by Swarztrauber [133], [134], [137]. Comprehensive software packages have been written for fast Poisson solvers; the most notable in supercomputer environments are FISHPACK [135] and CRAY-FISHPACK [143].

7.2. Digital Filtering of Signals

It is probably no exaggeration to say that digital signal processing occupies more computers, more of the time, than any other application. Furthermore, much of this computational effort is devoted to DFTs in the form of FFTs. We cannot hope to treat the DFT in signal processing in the few pages allotted here. Instead, we will be content to look at only one application, digital filtering, and even this is much too broad a topic to be treated fully in a short discussion. This introduction will give the reader a taste of how the DFT is used in signal processing, and those finding the topic interesting may be motivated to pursue other sources. Certainly, there is no shortage of literature on the topic; a visit to the signal processing section in any library will convince you of that.

We already encountered the basic concept behind digital filtering in the development of the circular convolution property of the DFT. It is worth the time to review that development here. For our purposes a digital signal is a sequence of numbers occurring at regular intervals, often obtained by recording some fluctuating voltage or current in an electronic device that measures some physical quantity. Such data generally represent values of a continuous function that is sampled regularly in time, although this need not be the case.

As a simple motivating example, consider the signal shown in the upper graph of Figure 7.6. The signal consists of $N = 24$ samples and is constructed from three oscillatory components with frequencies of one, five, and six periods on the sampling interval. In fact, the signal is given by

for $n = -11 : 12$. The sequence $g_n$ is plotted in the lower graph of Figure 7.6. It is much smoother than the original, and the running average appears to have "killed" most of the high frequency components, leaving a signal dominated by the low frequency component. We have, in fact, filtered the signal $f_n$ by a low-pass filter, an operator that eliminates high frequency components of a signal while allowing low frequency components to survive (pass) essentially intact.

for $n = -11 : 12$, where $\Delta x = 1/24$. Suppose that a new sequence $g_n$ is formed by computing a five-point weighted running average of $f_n$. Specifically, assume that $f_n$ is periodic ($f_{n \pm 24} = f_n$) and let


FIG. 7.6. A simple example of filtering is given, in which an input signal (top) consisting of three frequency components is filtered (bottom) by constructing a running weighted average of every five consecutive values. Observe that this has the effect of eliminating the higher frequency components of the signal while leaving the low frequency essentially unaltered.

Although this is a simple example, it is instructive to examine it more closely, in order to see precisely how the filtering was accomplished. Note first that the process of forming the five-point weighted running average may be expressed as a cyclic convolution

where hn is the sequence

whose entries are

Recall the cyclic convolution property of the DFT, which holds that the $N$-point DFT of the cyclic convolution of two sequences equals $N$ times the pointwise product of the DFTs of the two sequences:

$$\mathcal{D}\{f_n * h_n\}_k = N F_k H_k.$$

In this example both $f_n$ and $h_n$ are real even sequences, which implies that both $F_k$ and $H_k$ are also real even sequences. From The Table of DFTs of the Appendix, it may be seen that

Noting that $h_n$ is a square pulse with amplitude 1/4 (with averaged values at its endpoints, satisfying AVED), The Table of DFTs may again be used (see also problem 158) to determine that

Figure 7.7 displays the DFT sequences $F_k$, $H_k$, and the DFT of the convolution, $N F_k H_k$. It is now apparent how the filtering process works. The amplitudes of the high frequency components have been diminished greatly, while the amplitudes of the low frequency components have been left essentially unchanged.
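The attenuation can be quantified with a short worked calculation. It assumes that the five weights are $h_0 = h_{\pm 1} = 1/4$ and $h_{\pm 2} = 1/8$ (the reading suggested by a square pulse of amplitude 1/4 with averaged endpoints) and that the forward DFT carries the factor $1/N$; both are assumptions made here for illustration. Under them, the factor that multiplies the $k$th mode of the input is

$$N H_k = \sum_{n=-2}^{2} h_n e^{-i 2\pi nk/24} = \frac{1}{4}\left[1 + 2\cos\frac{2\pi k}{24} + \cos\frac{4\pi k}{24}\right],$$

so that

$$N H_{\pm 1} \approx 0.949, \qquad N H_{\pm 5} \approx 0.163, \qquad N H_{\pm 6} = 0.$$

With these weights the lowest mode passes almost unchanged, the fifth mode is reduced to about one sixth of its amplitude, and the sixth mode is removed entirely.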

FIG. 7.7. The frequency domain representation of a low-pass filter shows clearly how the filtering process works. The DFT $F_k$ of the input signal (top) shows that the signal is composed of the ±1, ±5, and ±6 modes. It may be observed that the DFT of the filter $H_k$ (middle) consists of low frequency modes with high amplitude, and higher frequency modes with decreasing amplitude. The product $N F_k H_k$ (bottom) gives the DFT of the filtered output. Comparison of the spectra of the input (top) and output (bottom) signals shows that the high frequency modes have been attenuated. Note that both input and output sequences are real and even, hence only the real parts of the DFTs are nonzero.

Based on this example, it seems reasonable to define filtering as the process of forming the convolution of a signal with a second sequence; the second sequence is called a filter. To filter a digital signal we can either perform the cyclic convolution of the signal with the filter in the time or spatial domain or compute the DFTs of both the filter and the signal, multiply them together pointwise, and compute the IDFT of the result. A simple observation leads to an understanding of the popularity of the latter approach. Computing the filtered output by convolution methods is an $O(N^2)$ operation, since two sequences of length $N$ are multiplied together pointwise for each element of the filtered output, of which there are $N$. Using FFTs, however, the two DFTs and the IDFT are computed in $O(N \log N)$ operations each, while the pointwise product of the DFTs $F_k$ and $H_k$ entails another $N$ operations. Hence, the cost of the filtering operation using FFTs is proportional to $N \log N$ while the convolution method has a cost proportional to $N^2$. (To be fair, we point out that time domain filtering can be made competitive with frequency domain methods if filters are selected that are much shorter than the data stream. Details can be obtained in any signal processing text, e.g., [70] and [108].)
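The two routes to the filtered output can be compared directly on the opening example. The sketch below makes several assumptions: the three cosine components are given unit amplitude, the filter weights are taken to be 1/8, 1/4, 1/4, 1/4, 1/8, the forward DFT carries the factor $1/N$, and the transforms are evaluated by direct $O(N^2)$ summation rather than by an FFT so that no external library is needed. Sequences are stored with indices $0 : N-1$, using periodicity to hold the negative indices at the end.

```python
import cmath
import math

N = 24

def dft(x):
    """Forward DFT with the 1/N factor: X_k = (1/N) sum_n x_n exp(-i 2 pi n k / N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / n_pts)
                for n in range(n_pts)) / n_pts for k in range(n_pts)]

def idft(X):
    """Inverse DFT (no scaling factor): x_n = sum_k X_k exp(i 2 pi n k / N)."""
    n_pts = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * n * k / n_pts)
                for k in range(n_pts)) for n in range(n_pts)]

# Input with one, five, and six periods on the interval (unit amplitudes assumed).
f = [math.cos(2 * math.pi * n / N) + math.cos(2 * math.pi * 5 * n / N)
     + math.cos(2 * math.pi * 6 * n / N) for n in range(N)]

# Five-point weighted average stored periodically (assumed weights).
weights = {-2: 0.125, -1: 0.25, 0: 0.25, 1: 0.25, 2: 0.125}
h = [weights.get(n if n <= N // 2 else n - N, 0.0) for n in range(N)]

# Route 1: cyclic convolution in the time domain, O(N^2) operations.
g_time = [sum(f[j] * h[(n - j) % N] for j in range(N)) for n in range(N)]

# Route 2: transform, multiply pointwise by N, and transform back.
F, H = dft(f), dft(h)
g_freq = [z.real for z in idft([N * Fk * Hk for Fk, Hk in zip(F, H)])]

max_diff = max(abs(a - b) for a, b in zip(g_time, g_freq))
```

The two outputs agree to rounding error. In practice the three transforms would be computed with an FFT, which is where the $N \log N$ cost comes from.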

We will now examine several types of filters that are commonly used, and how they are designed. It is convenient to discuss filters in the setting of functions, rather than discrete sequences, so that the basic concepts may be developed without the necessity of considering sampling, aliasing, and leakage. We may proceed in this manner because of the convolution property for functions (see Chapter 3), namely that if

The function $h(t)$ is called the filter, while its transform $\hat{h}(\omega)$ is often called the transfer function. If the input function is an impulse, that is, if $f(t) = \delta(t)$, then the output $g(t) = h(t)$, by the properties of the delta function. For this reason, the filter $h(t)$ is also called the impulse response of the system.

It is useful to recall that since $\hat{f}(\omega)$, $\hat{h}(\omega)$, and $\hat{g}(\omega)$ are, in general, complex-valued functions, we may write them in the amplitude-phase form

and the phase of $\hat{f}$ is given by

With this notation, we may observe that the frequency domain representation of the filtered output

then

where the amplitude of $\hat{f}$ (for example) is given by

may be written in amplitude-phase form as


The action of the filtering operation can be summed up by observing that the amplitude spectrum of the output is the product of the amplitudes of the input and the filter,

The same amplitude and phase properties of the output spectrum can be deduced(problem 159).
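For reference, the amplitude and phase rules follow in one line from the polar forms. The phase symbols $\phi_f$, $\phi_h$, and $\phi_g$ are notation assumed here for illustration:

$$\hat{g}(\omega) = \hat{f}(\omega)\,\hat{h}(\omega) = |\hat{f}(\omega)|\,e^{i\phi_f(\omega)}\,|\hat{h}(\omega)|\,e^{i\phi_h(\omega)} = |\hat{f}(\omega)|\,|\hat{h}(\omega)|\,e^{i(\phi_f(\omega) + \phi_h(\omega))},$$

so that $|\hat{g}| = |\hat{f}|\,|\hat{h}|$ and $\phi_g = \phi_f + \phi_h$ (up to a multiple of $2\pi$). In the discrete case the same argument applied to $G_k = N F_k H_k$ gives $|G_k| = N |F_k|\,|H_k|$, with the phases again adding.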

Filter Design

There are many different types of filters and we employ properties (7.22) and (7.23) to develop examples of some of the most common filter types and to give insight into the process of filter design. A few words regarding strategy of filter design are in order. As we have seen repeatedly, much of the power and beauty of Fourier transforms and DFTs arise because they allow us to work in either the time domain or the frequency domain. This means that we may consider either the action of the filter on the signal directly (time domain design) or the action of the filter on the spectrum of the signal (frequency domain design). An example of the former approach was used to open this section, where the filter operator was viewed as a moving average acting on the input sequence.

Another example of time domain design is that of distortionless filters, filters that do not distort the shape of the input, but are purely time-shifting filters. For an input signal $f(t)$, let's assume that the desired output is $f(t - t_0)$. Thus,

Comparison with (7.21) shows that the transfer function of the filter is

The amplitude spectrum of the output is unchanged from that of the input, corresponding to $|\hat{h}(\omega)| = 1$, while the phase spectrum has undergone a linear phase

which in turn implies that

while the phase spectrum of the output is the pointwise sum of the phases of the input and the filter,

Of course, these properties have discrete analogs. If $f_n$ is an input signal whose DFT is $F_k$, then we can write the DFT in amplitude-phase form as

where

from which the time-shifting property of the Fourier transform gives


shift by the amount $-2\pi\omega t_0$ (see problem 160). Recalling that the inverse Fourier transform of $e^{-i2\pi\omega t_0}$ is $\delta(t - t_0)$, we see that convolving a function $f$ with a shifted delta function yields

so that the DFT of the desired filter (the transfer function) is

An example of this filter is shown in Figure 7.8. The figure shows the amplitude and phase spectra for the filter, as well as the input and output signals. One feature should be noticed, as it can have disturbing consequences if no provision is made for it. This is the wrap around effect caused by the shift operator acting on signals that are assumed to be $N$-periodic. Portions of the signal originally occurring near the end of the finite-duration signal are shifted beyond the end of the signal, but the implicit periodicity of the DFT causes these portions to appear at the beginning of the signal. In the next section, we will see a problem in seismic migration in which the wrap around effect is a commonly encountered difficulty.
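The wrap around effect is easy to reproduce. The following sketch applies the discrete time-shifting filter by multiplying the DFT of a short sequence by the phase factor $e^{-i2\pi n_0 k/N}$; it assumes a forward DFT with the factor $1/N$, indexes the samples $0 : N-1$, and uses direct summation in place of an FFT. The function names and the test sequence are hypothetical.

```python
import cmath
import math

def dft(x):
    """Forward DFT with the 1/N factor."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / n_pts)
                for n in range(n_pts)) / n_pts for k in range(n_pts)]

def idft(X):
    """Inverse DFT (no scaling factor)."""
    n_pts = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * n * k / n_pts)
                for k in range(n_pts)) for n in range(n_pts)]

def shift_filter(x, n0):
    """Delay x by n0 samples by applying a linear phase shift to its DFT."""
    n_pts = len(x)
    X = dft(x)
    G = [X[k] * cmath.exp(-2j * math.pi * n0 * k / n_pts) for k in range(n_pts)]
    return [z.real for z in idft(G)]

x = [0, 0, 0, 0, 0, 1, 2, 3]      # activity near the end of the record
y = shift_filter(x, 3)            # approximately [1, 2, 3, 0, 0, 0, 0, 0]
```

The three nonzero samples, pushed past the end of the record by the delay, reappear at the beginning. A common remedy, not shown here, is to pad the record with zeros before filtering.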

Perhaps the most common type of filter is the amplitude distortion filter, which may be characterized by the fact that it affects only the amplitude spectrum of the signal, and does not alter the phase spectrum. Since the effect of convolution is to add the phase spectra of the signal and the filter, this implies that amplitude distortion filters must have a phase spectrum that vanishes identically. Such filters are called zero phase filters. Since the phase spectrum of the filter $h_n$ is

This confirms that the filter operator will indeed produce a shift of the input sequence.

The discrete form of this filter may be developed in an analogous fashion. Suppose the desired output is $g_n = f_{n - n_0}$, where the input is $f_n$ and $n_0$ is an integer. Again, applying the shift theorem, we find that

The amplitude and phase spectra of this filter are given by

and

it follows that $\mathrm{Im}\{H_k\} = 0$, and we conclude that amplitude distortion filters (zero phase filters) are real-valued sequences.

The most common of these filters are band-pass filters. As the name implies, a band-pass filter passes frequencies within a specified band, while distorting or eliminating all other frequencies of the signal. From this description it seems natural to perform the filter design in the frequency domain, and indeed, this is the approach we shall take.


FIG. 7.8. The amplitude spectrum of the discrete time-shifting filter is shown in the upper left graph, and the phase spectrum is shown in the upper right graph. The unfiltered input $f_n$ is shown by the solid curve in the graph at the bottom of the figure, while the filtered (time-shifted) output $g_n$ is shown by the dotted curve in the lower graph.

Low-Pass Filter

Let us begin by considering an idealized low-pass filter, that is, a filter which passes low frequency components of the signal while eliminating the high frequency components. Such a filter has a frequency domain representation given by

where $\omega_c$ is some specified cut-off frequency. The filter has an amplitude spectrum of unity for $|\omega| < \omega_c$ and an amplitude spectrum of zero for higher frequencies.

The time domain representation of this filter is

To apply the filter to an input signal, we have a choice of two methods: either the input signal may be convolved directly with $2\omega_c\,\mathrm{sinc}(2\pi\omega_c t)$ or (using the convolution theorem) the Fourier transform of the input signal may be multiplied by $\hat{h}(\omega)$ followed by the inverse Fourier transform of that product.

As with the shift filter, the discrete form of the low-pass filter may be obtained in


FIG. 7.9. The amplitude of the idealized low-pass filter in the frequency domain (top) and the time domain (bottom) are shown in this figure. Observe that in the frequency domain the filter values at the cut-off frequencies are 1/2, satisfying the AVED condition.

a straightforward fashion. It is given by

where $k_c$ is the index associated with the desired cut-off frequency. Note that the values of $H_{\pm k_c} = \frac{1}{2}$ must be used since they are the average values at the discontinuity (AVED).

It is a direct calculation (problem 156 or a fact from The Table of DFTs) to show that the time domain representation of the discrete filter is given by

Figure 7.9 displays a low-pass filter in both the time and frequency domains. The filter is generated using $N = 64$ and a time sample rate $\Delta t = 1/128$, so that the total length of the filter is $T = N\Delta t = 0.5$ seconds and the frequency sample rate (by the reciprocity relations) is $\Delta\omega = 2$ hertz (cycles per second). The low-pass filter passes all frequencies $|\omega| < \omega_c = 34$ hertz. A perfect all-pass filter has a frequency response of unity for all frequencies, and thus has a time domain representation consisting of a single spike at $t = 0$ with amplitude $N$. Hence, in this and the following figures, we divide all time domain representations by $N$, so that the amplitudes of the filters may be compared.

Examination of Figure 7.9 reveals that in the time domain the idealized low-pass filter is characterized by many small amplitude oscillations near the central lobe. These oscillations, known as sidelobes, are caused by the sinusoidal nature of the


FIG. 7.10. The amplitude of the modified low-pass filter is displayed in the frequency domain (top) and the time domain (bottom).

numerator of a sinc function. When applied to a signal, such filters often produce many spurious oscillations near large "impulses" in the output, a phenomenon known as ringing. This is caused largely by the abrupt cut-off of the filter in the frequency domain. One way to counteract ringing is to smooth the sharp cut-off of the low-pass filter edge in the frequency domain. There are many ways to accomplish this; for example, a modified filter of the form

can be used, where $k_c$ and $m$ are positive integers. The frequency and time domain representations of this filter are shown in Figure 7.10. The parameters are $N = 64$, $\Delta t = 1/128$, $T = 0.5$, and $\Delta\omega = 2$, as in Figure 7.9. The parameters for the frequency attenuation are $k_c = 20$ and $m = 12$, so that the frequencies are attenuated for $16 < |\omega| < 40$ hertz. While the ranges of frequencies passed by the filters in Figures 7.9 and 7.10 are substantially the same, it may be seen that the sidelobes of the latter filter are significantly diminished.

Band-Pass Filters

By judicious application of the linearity and shifting of the continuous and discrete Fourier transforms many other filters can be created. Among the more common filters is the band-pass filter, which, as its name implies, attenuates all frequencies except those within a prespecified band in the spectrum. Taking $\hat{h}(\omega)$ to be the idealized


FIG. 7.11. A typical idealized band-pass filter is shown in the frequency domain (top) and time domain (bottom). It is a symmetric filter centered at $\omega_0 = \pm 25$ hertz with half-bandwidth $\omega_c = 15$ hertz. Note that the AVED condition is applied at the edges of the band-pass and that the sharp cut-offs on both ends of the band result in a filter with very large sidelobes.

low-pass filter (7.24), we may construct an idealized band-pass filter $\hat{b}(\omega)$ centered at $\omega = \pm\omega_0$ of width $\omega_c$ by

Figure 7.11 shows a typical idealized band-pass filter in the frequency and time domains. As in Figures 7.9 and 7.10, the sampling rates in time and frequency and the extent of the time domain are given by the parameters $N = 64$, $\Delta t = 1/128$, $\Delta\omega = 2$, and $T = 0.5$. The band of unattenuated frequencies is centered at $\omega = \pm 25$ hertz and the half-bandwidth is given by $\omega_c = 15$ hertz.

Letting $h(t)$ be the time domain representation of the idealized low-pass filter (7.26), the time domain representation of the band-pass filter can be obtained with the help of the frequency shift property, which implies that

By linearity we obtain the desired time domain representation of the idealized band-pass filter,
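The steps of this construction can be written out as a short worked derivation. It assumes the Fourier transform is defined with the kernel $e^{-i2\pi\omega t}$ and that the band-pass filter is the sum of two shifted copies of the low-pass filter; the final result does not depend on the sign convention because the two shifts are symmetric. Starting from

$$\hat{b}(\omega) = \hat{h}(\omega - \omega_0) + \hat{h}(\omega + \omega_0),$$

the frequency shift property pairs $\hat{h}(\omega \mp \omega_0)$ with $h(t)\,e^{\pm i2\pi\omega_0 t}$, and linearity gives

$$b(t) = h(t)\left(e^{i2\pi\omega_0 t} + e^{-i2\pi\omega_0 t}\right) = 2\cos(2\pi\omega_0 t)\,h(t) = 4\omega_c \cos(2\pi\omega_0 t)\,\mathrm{sinc}(2\pi\omega_c t),$$

where the last step uses $h(t) = 2\omega_c\,\mathrm{sinc}(2\pi\omega_c t)$ for the idealized low-pass filter. The band-pass filter is therefore the low-pass impulse response modulated by a cosine at the center frequency.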


As was the case with the idealized low-pass filter, constructing the idealized band-pass filter with sharp frequency cut-offs produces a filter with large sidelobes, and therefore leads to a filtered output characterized by ringing. The same technique of "softening" the sharpness of the frequency cut-off used to alleviate the problem before may be applied again here.

Filter Design from the Generalized Low-Pass Filter

Rather than developing filters from scratch, it is beneficial to have a general filter design technique. We will begin with an idealized low-pass filter which is "softened" by applying a linear taper in the frequency domain to avoid ringing. The result is the generalized low-pass filter, or ramp filter, which has the frequency domain representation

The notation $\hat{h}_L(\omega; \omega_1, \omega_2)$ is used to remind us that the filter, a function of $\omega$, is dependent on the two parameters $\omega_1$ and $\omega_2 > \omega_1$. It is not difficult (problem 157 is recommended) to find that the time domain representation of the filter is given by

The generalized low-pass filter is illustrated in Figure 7.12, where both frequency and time domain representations are displayed for two different choices of the frequency parameters $\omega_1$ and $\omega_2$. Observe that the longer taper produces a filter with a narrower main lobe of greater amplitude and (importantly) greatly reduced sidelobes.
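The effect of the taper on the sidelobes can be checked numerically. The sketch below samples a sharp low-pass filter (with the AVED value 1/2 at the cut-off) and a ramp filter at the frequency indices $k = -N/2 + 1 : N/2$ and evaluates their time domain representations by direct summation. Sampling the continuous ramp filter at the grid frequencies, and the particular indices used (cut-off 17, taper from 8 to 20, with $N = 64$), are choices made here for illustration; the function names are hypothetical.

```python
import math

N = 64  # number of samples; with a time step of 1/128 s the frequency spacing is 2 Hz

def ideal_lowpass(k, kc):
    """Unity below the cut-off index, 1/2 at the cut-off (AVED), zero above."""
    a = abs(k)
    return 1.0 if a < kc else (0.5 if a == kc else 0.0)

def ramp_lowpass(k, k1, k2):
    """Unity for |k| <= k1, tapering linearly to zero at |k| = k2 (k1 < k2)."""
    a = abs(k)
    if a <= k1:
        return 1.0
    if a >= k2:
        return 0.0
    return (k2 - a) / (k2 - k1)

def time_domain(H):
    """h_n = sum_k H_k cos(2 pi n k / N) for a real, even H_k; returns n = 0 : N/2."""
    ks = range(-N // 2 + 1, N // 2 + 1)
    return [sum(H(k) * math.cos(2 * math.pi * n * k / N) for k in ks)
            for n in range(N // 2 + 1)]

def sidelobe_level(h, start=3):
    """Largest |h_n| for n >= start, relative to the central value h_0."""
    return max(abs(v) for v in h[start:]) / h[0]

h_ideal = time_domain(lambda k: ideal_lowpass(k, 17))
h_ramp = time_domain(lambda k: ramp_lowpass(k, 8, 20))
levels = (sidelobe_level(h_ideal), sidelobe_level(h_ramp))
```

For these parameters the main lobe of both filters ends before $n = 3$, so `sidelobe_level` measures only the sidelobes; the ramp filter's value is noticeably smaller than that of the sharp cut-off, and its sidelobes also decay more quickly away from the main lobe.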

Armed with the generalized low-pass filter, it is easy to produce several common types of filters, such as band-pass, high-pass, multiple-band, and notch (band-reject) filters. We will illustrate only three general filters, and those we design only as frequency domain representations in the continuous case. The process of converting the continuous frequency domain representation to a discrete filter should be familiar and will be exercised in problem 161.

1. To create a generalized band-pass filter tapering linearly from zero at $\pm\omega_1$ to unity at $\pm\omega_2$ and tapering linearly back from unity at $\pm\omega_3$ to zero at $\pm\omega_4$, we simply form the combination

where $0 < \omega_1 < \omega_2 < \omega_3 < \omega_4$. This filter is illustrated in Figure 7.13.

2. To create a high-pass filter that rejects all frequencies below $\omega_1$ and passes all frequencies above $\omega_2$ with a linear taper between these frequencies, we subtract the appropriate band-pass filter from the all-pass filter $\hat{h}_A(\omega) = 1$:



FIG. 7.12. The frequency domain response of the generalized low-pass filter is unity for $|\omega| < \omega_1$ and tapers linearly to zero at $\omega = \pm\omega_2$. The graph on the top left illustrates the choice of parameters $\omega_1 = 8$ hertz and $\omega_2 = 20$ hertz, while in the graph on the top right the choices are $\omega_1 = 8$ hertz and $\omega_2 = 40$ hertz. The corresponding time domain representations are shown in the two lower graphs.

3. To create a notch filter or band-reject filter that passes all frequencies except those within a specified band (with tapers) we form

These last two filters (and their component filters) are illustrated in Figure 7.14.
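The three constructions can be sketched as frequency domain functions built from a single ramp filter. The combinations below follow the verbal descriptions in the list and in the caption of Figure 7.13; in particular, the high-pass filter is taken here to be the all-pass filter minus a generalized low-pass filter, which is one reading of the description. The function names are hypothetical.

```python
def ramp_lowpass(w, w1, w2):
    """Generalized low-pass filter: unity for |w| <= w1, linear taper to zero at |w| = w2."""
    a = abs(w)
    if a <= w1:
        return 1.0
    if a >= w2:
        return 0.0
    return (w2 - a) / (w2 - w1)

def band_pass(w, w1, w2, w3, w4):
    """Wide low-pass minus narrow low-pass: rises on [w1, w2], falls on [w3, w4]."""
    return ramp_lowpass(w, w3, w4) - ramp_lowpass(w, w1, w2)

def high_pass(w, w1, w2):
    """All-pass minus low-pass: zero below w1, unity above w2."""
    return 1.0 - ramp_lowpass(w, w1, w2)

def notch(w, w1, w2, w3, w4):
    """All-pass minus band-pass: rejects the band between the two tapers."""
    return 1.0 - band_pass(w, w1, w2, w3, w4)

# Parameters (in hertz) taken from the captions of Figures 7.13 and 7.14.
bp_values = [band_pass(w, 6, 16, 36, 58) for w in (0, 11, 20, 47, 60)]
hp_mid = high_pass(43, 36, 50)
notch_values = [notch(w, 36, 46, 46, 56) for w in (30, 41, 46, 51, 60)]
```

Because every filter is a linear combination of ramp filters, its time domain representation is the same linear combination of the corresponding time domain ramp filters, with the all-pass filter contributing a single spike.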

Notes and References

We have illustrated only the simplest of filters, and have barely scratched the surface of this topic. Modifications (and improvements) to the filters presented here may be developed immediately, by replacing the linear tapers and their sharp corners with filters whose spectral cut-offs are "softer," and whose spectra are thus smoother.

A closely related topic is the application of windows to the input signal. A signal $f(t)$ of finite duration may be viewed as the result of multiplying an infinitely long signal $g(t)$ by a function $w_c(t)$ that has the value of unity if $|t| < t_c$, and the value of zero otherwise. The function $w_c(t)$ is a simple example of a window. This idealized window has difficulties that can be alleviated by using windows with a smooth transition from unity (in the center) to zero at the edges. Such windows often bear the names of the individuals who developed them, and a list reads like a Who's Who of the pioneers of digital signal processing. Those interested in further exploration of digital filters should see problem 162 and references such as [21], [70], and [108].


FIG. 7.13. The generalized band-pass filter can be constructed by subtracting one generalized low-pass filter from another. The upper and lower graphs at left show the frequency domain and time domain representations of a generalized low-pass filter whose frequency response tapers from unity at $\omega_1 = 6$ hertz to zero at $\omega_2 = 16$ hertz. The upper and lower graphs in the center are for a low-pass filter tapering from unity at $\omega_3 = 36$ hertz to zero at $\omega_4 = 58$ hertz. The generalized band-pass filter results from the subtraction of the first filter from the second, and is displayed in the graphs at the right. The frequency domain response tapers linearly from 0 to unity between $\pm\omega_1 = 6$ and $\pm\omega_2 = 16$ hertz, is unity between $\pm\omega_2 = 16$ and $\pm\omega_3 = 36$ hertz, and tapers linearly to zero at $\pm\omega_4 = 58$ hertz.

7.3. FK Migration of Seismic Data

A Crash Course in Seismic Exploration

The science of reflection seismology is used extensively for geological studies and in the exploration for oil and gas. The basic principle is quite simple. An incident sound wave is generated (by an explosion or by vibrations) at the surface of the earth, which propagates down through the rock layers. At the discontinuities between layers some fraction of the sound wave energy is reflected back toward the surface as an echo, while the remainder is transmitted further down through the rock layers as an incident wave. The echoes are recorded by sensitive geophones as they arrive back at the surface and the resulting data are used to analyze both the depth and composition of the subsurface layers.

We know that the earth is generally formed of layers of rock piled one on top of another. This is evident in every road-cut along the highway. While we see these rock layers violently folded and broken in the mountains, throughout most of the world they have suffered only minor deformation. The individual layers of rock are relatively homogeneous, and their properties may be assumed to be reasonably consistent throughout a given layer. Hence the discontinuities between layers, such as the change from sandstone to limestone, or an abrupt change in density, should be mappable. Indeed, the reflection seismograph is the primary tool for mapping the nature of the subsurface layering.

There are two important principles that underlie the propagation of seismic waves.


FIG. 7.14. The figure shows the frequency domain construction (top row) and time domain representation (second row) of a high-pass filter, with zero frequency response below the frequency $\omega_1 = \pm 36$ hertz and a linear taper to unity at $\omega_2 = \pm 50$ hertz. The third and fourth rows show the frequency and time domain representations of a notch filter, with a taper to zero between $\omega_1 = \pm 36$ hertz and $\omega_2 = \pm 46$ hertz, and a taper back up to unity between $\omega_3 = \pm 46$ hertz and $\omega_4 = \pm 56$ hertz.

The first, known as Huygens'⁴ principle, states that every point on a wavefront can be regarded as a new source of waves. Given the location of the wavefront at a time $t_0$, the position of the wavefront at time $t_0 + \Delta t$ can be determined by drawing arcs of radius $c\Delta t$ from many points on the wavefront, where $c$ is the velocity of the wavefront in the medium (in this case, the seismic velocity; that is, the speed of the pressure wave or sound, which may vary from point to point in the medium). Provided that sufficiently many arcs are drawn, the envelope of the arcs gives the position of the

⁴Born in the Hague, Holland in 1629, CHRISTIAN HUYGENS is regarded as one of the greatest scientists of the seventeenth century. He was a respected friend of Sir Isaac Newton and a member of the French Academy of Sciences. Huygens proposed the wave theory of light, wrote the first treatise on probability, and made fundamental contributions to geometry. He died in Holland in 1695.


wavefront at time $t_0 + \Delta t$ (see Figure 7.15). As a consequence of Huygens' principle, the wavefront may be accurately described by the use of raypaths, lines that originate at the source and are always normal to the wavefronts.

The second fundamental principle is Snell's⁵ law. When a wavefront impinges on an interface between layers, part of the energy is reflected, remaining in the same medium as the incoming wave, while the remainder is refracted into the neighboring layer with an abrupt change in direction of propagation. Snell's law dictates the angles of reflection and refraction, and states that the angle of incidence $\theta_1$, measured from the normal to the interface, equals the angle of reflection $\theta_{rfl}$ (see Figure 7.15). Furthermore, the angle of refraction $\theta_2$, measured from the normal on the opposite side of the interface, is related to the angle of incidence by

where $\rho_1$ and $c_1$ are the density and velocity of sound in the layer through which the incident wave travels, while $\rho_2$ and $c_2$ are the density and velocity of sound of the layer through which the refracted wave travels. Perhaps it is not startling to learn that these two fundamental laws are related in that Huygens' principle can be used to derive Snell's law (problem 163).
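Both relations are simple to evaluate. The sketch below computes the angle of refraction from Snell's law in the form $c_2 \sin\theta_1 = c_1 \sin\theta_2$ and a normal-incidence reflection coefficient. The reflection coefficient is written in the standard acoustic impedance form $(\rho_2 c_2 - \rho_1 c_1)/(\rho_2 c_2 + \rho_1 c_1)$, which is assumed here; the function names and the layer properties in the example are hypothetical values chosen only for illustration.

```python
import math

def refraction_angle(theta1, c1, c2):
    """Angle of refraction (radians) from c2 sin(theta1) = c1 sin(theta2).

    Returns None when no refracted wave exists (beyond the critical angle).
    """
    s = c2 * math.sin(theta1) / c1
    return math.asin(s) if abs(s) <= 1.0 else None

def reflection_coefficient(rho1, c1, rho2, c2):
    """Normal-incidence amplitude ratio in the assumed impedance-contrast form."""
    z1, z2 = rho1 * c1, rho2 * c2      # acoustic impedances of the two layers
    return (z2 - z1) / (z2 + z1)

# Hypothetical layers: a slower, lighter layer over a faster, denser one.
theta1 = math.radians(10.0)
theta2 = refraction_angle(theta1, 2000.0, 4000.0)
beyond_critical = refraction_angle(math.radians(45.0), 2000.0, 4000.0)
r = reflection_coefficient(2.0, 2000.0, 2.5, 4000.0)
```

When the wave passes into a faster layer the ray bends away from the normal, and for a sufficiently large angle of incidence no refracted ray exists, which the sketch signals by returning `None`.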

Ideally, the earth is composed of layers that are homogeneous and flat. If we place a source and a geophone at the same point on the surface, we would record reflections only from points directly below the source. This is because the angle of incidence is zero, so that the reflected wave returns to the surface along the same raypath taken by the incident wave, and the refracted wave continues downward without a change in direction. In this case the depth of the interface could be determined simply by multiplying one-half of the travel time (the elapsed time between activating the source and recording the echo) by the estimated velocity of sound in the subsurface layer. We do not generally acquire data in this fashion, however, for a number of reasons (not the least of which is the effect of exploding dynamite near sensitive recording equipment!). Instead, the data set is acquired using the common midpoint (CMP) method: the source and geophone are located equidistant from a common midpoint on the surface. In the ideal earth described above, the reflection received at the geophone originates from the same point on the subsurface reflector as though the source and geophone were located at the same point. Since the earth is not composed of flat, uniform layers this model is not exact, but it is sufficiently accurate for most situations (see Figure 7.16). For this reason, we may imagine that the arriving reflections were generated at points on the subsurface interface where the raypath from the CMP is normal to the interface. Such points are called the normal incidence points (NIP).

5Born in 1591, WlLLEBROD SNELL (actually Snel) was a Dutch astronomer, lawyer, andmathematician. He is best known for the triangulation method in geodesy and for Snell's Lawof Optics.

where c\ and 02 are the velocities of sound in the media through which incident andrefracted waves travel (see Figure 7.15). The effect of the interface on the amplitudesof the reflected and refracted waves is complicated. In the case of normal incidence(when 9\ = 0 radians) the ratio of amplitudes of the reflected and incident waves isgiven by a quantity known as the reflection coefficient, which is


FIG. 7.15. The left figure illustrates how Huygens' principle can be used to locate wavefronts. The wavefront at time $t + \Delta t$ may be found by drawing arcs of radius $c\Delta t$ around many points along the wavefront at time $t$. The envelope of all such arcs (dotted curve) gives the new position of the wavefront. Snell's law (illustrated on the right) governs the reflection and refraction of waves in two different media. The angle of incidence $\theta_1$ equals the angle of reflection $\theta_{rfl}$, while the relationship between the angle of incidence and the angle of refraction $\theta_2$ is $c_2 \sin\theta_1 = c_1 \sin\theta_2$, where $c_1$ and $c_2$ are the seismic velocities in the two media.

FIG. 7.16. The common midpoint (CMP) method assumes that the source and receiver are located so that the reflection is generated at the normal incidence point (NIP), the point where a raypath from the common midpoint is normal to the reflector. For flat layers (left) this is a good model, while for rocks that have been deformed by geological forces (right) the model is less accurate.
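The relations of this passage lend themselves to a short numerical sketch. The function names below are hypothetical, and the reflection coefficient is taken in the standard impedance-contrast form $(\rho_2 c_2 - \rho_1 c_1)/(\rho_2 c_2 + \rho_1 c_1)$, which is assumed here to be the form intended in the text. Snell's law is used as stated in the caption of Figure 7.15.

```python
import math

def refraction_angle(theta1, c1, c2):
    """Solve c2*sin(theta1) = c1*sin(theta2) for theta2 (radians).

    Returns None when no refracted ray exists (beyond the critical angle).
    """
    s = c2 * math.sin(theta1) / c1
    if abs(s) > 1.0:
        return None
    return math.asin(s)

def reflection_coefficient(rho1, c1, rho2, c2):
    """Normal-incidence amplitude ratio of reflected to incident wave (assumed form)."""
    z1 = rho1 * c1  # acoustic impedance of the incident layer
    z2 = rho2 * c2  # acoustic impedance of the refracting layer
    return (z2 - z1) / (z2 + z1)

def depth_from_travel_time(travel_time, c):
    """Depth of a flat reflector below a coincident source and geophone."""
    return 0.5 * travel_time * c
```

With a faster lower layer the refracted ray bends away from the normal, and identical layers produce no reflection at all, as one would expect.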


FIG. 7.17. A synthetic seismic section consists of many traces or curves representing movement of the geophones as a function of time. Time $t = 0$ corresponds to the initiation of the seismic disturbance, and is at the top of the figure. Increasing time is downward. Reflections are shown as large amplitude "wiggles" on the traces.

Typically, a seismic data set is presented in the form of a seismic section or record section. Recordings, called traces, are made that correspond to many CMPs that have been laid out along a line on the surface of the earth. Each trace, after suitable data processing, is plotted as a curve below each CMP surface location, with elapsed time forming the vertical axis. Arriving reflections are typically indicated by short pulses on each trace at the time of arrival. Figure 7.17 shows a synthetic seismic section with three gently dipping reflectors. Actual data sets are generally characterized by rather low signal-to-noise ratios, however, and are generally much more difficult to interpret.

If, at a given surface location, a record section could be obtained that contained only reflections originating directly beneath the source, some of the difficulty of seismology could be alleviated. Unfortunately, this isn't possible, for several reasons. Foremost is the fact that the NIP is not generally located directly beneath the CMP. Instead, the reflection comes from somewhere below and to one side of the CMP, as was illustrated in Figure 7.16. Frequently the reflections originate from subsurface locations that are out of the plane of the record section. An extreme example (and yet a common one) is shown in Figures 7.18 and 7.19. The geologic model is a syncline, a geologic phenomenon in which originally flat-lying rock strata are folded such that the fold is concave upward (the opposite fold, such that the fold is concave downward, is called an anticline, and is a common accumulation structure for oil and gas). A cross-section of the geologic structure is shown in Figure 7.18 (top). The syncline acts as a hemispherical "mirror," focusing the seismic raypaths on a focal point far below the surface (Figure 7.18, bottom). As a result, the reflections arriving at a given surface location are not generated directly below that location, but rather from points far to one side. Note that several CMP locations detect reflections generated at points on both sides of the structure. The resulting record section is shown in Figure 7.19. The focusing effect of the stream channel has the effect of inverting the apparent structure, so that a geologic object which is concave-up appears on the record section as being concave-down, like a dome! This "criss-crossed" pattern of reflections is a characteristic seismic response for this geology, and is referred to as a bowtie. The record section is further complicated by the fact that the sharp corners of the stream channel give rise to diffractions, which appear on the record section as secondary, smaller "domes." The bowtie phenomenon is especially pernicious because underground domes are favorable locations for the accumulation of oil and gas. It is difficult to imagine the amount of money that has been spent drilling for oil in bowties!

FIG. 7.18. A simple geological model of a syncline (top) is used to illustrate the difficulty of interpreting seismic data. The syncline has the effect of focusing the raypaths like a mirror, as may be seen from the idealized sketch below, where one reflector and some of the reflection and diffraction raypaths are shown. The resulting record section is shown in Figure 7.19.

We can now state the problem that seismic migration is designed to alleviate. The goal of seismic migration is to move the reflections on the seismic section so that they are located accurately in the subsurface. Properly corrected, a reflection will appear on the seismic trace corresponding to the CMP located vertically above the reflecting point, and at a travel time that would be correct were the travel path in fact vertical.

FIG. 7.19. The record section resulting from the geologic model in Figure 7.18 is shown. Because of the focusing effect and the diffractions, the record section is characterized by artificial ramp- and dome-like structures, which are highly misleading to the interpreter.

Frequency-Wavenumber (FK) Migration

In actual practice, a record section is produced by setting off many sources, in many locations, at many different times, and recording the results on many separate receivers, each at a different surface location. To form the record section, the data are subjected to a sequence of data processing steps entailing the application of filters, corrections, and adjustments. The details are too complicated (and numerous) to discuss here, but the net result is that after processing, the data set appears as though all sources were fired simultaneously while all receivers were active.

A mathematical model for seismic migration can be developed through the following thought experiment. Suppose that the receivers are located along the surface, but that the sources, instead of being at the surface, are located along each of the reflectors in the subsurface, with a charge proportional to the reflection coefficient of the interface. At time $t = 0$ all charges are fired simultaneously, and the resulting waves propagate toward the surface. Furthermore, suppose that in this model all the seismic velocities are precisely half of the actual seismic velocities. This is called the exploding reflector model, and the record section produced in this idealized way would be essentially the same as the section actually acquired in the field. Thus, we may use a mathematical model based on this thought experiment to design a method for migrating the reflections on the record section to their correct spatial locations.

Using this conceptual model, we will let $u(x, z, t)$ denote the pressure wavefield, where $x$ is the horizontal spatial variable, $z$ is the vertical spatial variable, and $t$ is the time variable. The earth is assumed to be flat, with $z = 0$ at the surface and increasing in the downward direction. (A correction for surface topography is one of many adjustments made during data processing, so the flat earth assumption is valid here.) The time $t = 0$ represents the instant at which the "exploding reflectors" are fired. Note that there is no dependence on $y$, the third spatial dimension. For conventional seismic data the earth is assumed to vary only with $x$ and $z$; that is, reflections are assumed to originate in the plane of the section. This assumption is fairly good where the geological deformation is mild. Treatment of fully three-dimensional seismology is a much more complicated problem and will not be discussed here.

Under the assumptions of the thought experiment, it is possible to derive a governing equation (excellent references are the books by Telford et al. [146] and Dobrin [48]) which is a partial differential equation called the wave equation; it has the form

where $c$ is the seismic velocity in the subsurface. In actuality, $c$ varies throughout the subsurface, so that $c = c(x, z)$, but we will assume a constant velocity. Framed in this way the record section, consisting of measured data, is $u(x, 0, t)$; it may be viewed as a boundary condition. The desired output is the reflector section or depth section, the distribution of the reflectors in the subsurface, and is given by $u(x, z, 0)$.

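We now proceed in a straightforward manner. Let $U(k_x, k_z, t)$ be the Fourier transform of $u(x, z, t)$ with respect to $x$ and $z$, where the frequency variables, or wavenumbers, in the spatial directions are denoted by $k_x$ and $k_z$. Thus,

$$U(k_x, k_z, t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x, z, t)\, e^{-i2\pi(k_x x + k_z z)}\,dx\,dz. \qquad (7.29)$$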

Upon taking the two-dimensional Fourier transform with respect to the spatial variables of both sides of the wave equation (7.28), applying the derivative property (see problem 165), and interchanging the order of the integration and the differentiation with respect to $t$, we find that (7.28) becomes the ordinary differential equation


At this point we could solve this ordinary differential equation directly. However, it proves to be advantageous to apply yet another Fourier transform, with respect to the variable $t$. Denoting the frequency variable associated with $t$ by $\omega$, we let $\hat{u}(k_x, k_z, \omega)$ be the Fourier transform of the function $U(k_x, k_z, t)$:

$$\hat{u}(k_x, k_z, \omega) = \int_{-\infty}^{\infty} U(k_x, k_z, t)\, e^{-i2\pi\omega t}\,dt.$$

Notice that $\hat{u}(k_x, k_z, \omega)$ is the full three-dimensional Fourier transform of $u(x, z, t)$. We may now take the Fourier transform of the ordinary differential equation (7.30) with respect to $t$ and use the derivative property once again to obtain

This differential equation has two linearly independent solutions,

where $P(k_x, k_z)$ and $Q(k_x, k_z)$ are arbitrary functions of the wavenumbers (independent of $t$) that must be chosen to satisfy initial or boundary conditions.

If we could find $P(k_x, k_z)$ and $Q(k_x, k_z)$, we could also find $U(k_x, k_z, t)$, and the solution to the wave equation could be constructed by taking the two-dimensional inverse Fourier transform of $U(k_x, k_z, t)$. Formally, the general solution to the wave equation would look like

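Without knowing the solution in detail, we can conclude from this general solution that it will consist of a linear combination of the functions

$$e^{i2\pi(k_x x + k_z z + \omega t)} \quad \text{and} \quad e^{-i2\pi(-k_x x - k_z z + \omega t)}.$$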

The first of these functions is constant along the plane in $(x, z, t)$ space given by

$$k_x x + k_z z + \omega t = \text{constant}.$$

Since we are not interested in a solution of the form $u(x, z, t) = 0$, the factor $\hat{u}(k_x, k_z, \omega)$ may be canceled from both sides to yield

This important relationship between the variables in the transform domain $(k_x, k_z, \omega)$ is known as the dispersion relation. Using it in (7.30) allows us to simplify that equation and write

$$\frac{d^2 U}{dt^2} + 4\pi^2\omega^2 U = 0. \qquad (7.32)$$

Thus, the general solution to (7.32) is given by

$$U(k_x, k_z, t) = P(k_x, k_z)\, e^{i2\pi\omega t} + Q(k_x, k_z)\, e^{-i2\pi\omega t}.$$

Thus (with the help of problem 166 and the geometry of two-dimensional modes of Chapter 5), we see that $e^{i2\pi(k_x x + k_z z + \omega t)}$ represents a plane wave propagating in the $xz$-plane in the direction of the vector $(-k_x, -k_z)$. Using the dispersion relation, the plane wave travels in the direction of the unit vector

By selecting the solution with the plus sign, we can ensure that $\omega$ has the same sign as $k_z$. In that case, the second component of the direction vector (7.33) is negative, and since we have defined $z$ to be positive in the downward direction, this tells us that $e^{i2\pi(k_x x + k_z z + \omega t)}$ represents an upgoing plane wave. A similar argument reveals that $e^{-i2\pi(-k_x x - k_z z + \omega t)}$ is a downgoing plane wave. Since we are interested only in the upgoing (reflected) waves, we will choose $Q(k_x, k_z) = 0$, and seek a solution to (7.32) of the form

$$U(k_x, k_z, t) = P(k_x, k_z)\, e^{i2\pi\omega t}. \qquad (7.35)$$

Here is the key to the whole calculation: we see that $U(k_x, k_z, 0) = P(k_x, k_z)$ is the Fourier transform of $u(x, z, 0)$, which is the reflector section that we seek. Therefore, the entire migration problem can be reduced to finding the function $P(k_x, k_z)$ and taking its two-dimensional inverse Fourier transform.

We have now worked the unknown reflector section into the discussion. We must now incorporate the known record section. Here is how it is done. Recall that the function $U(k_x, k_z, t) = P(k_x, k_z)\, e^{i2\pi\omega t}$ is the Fourier transform, with respect to the two spatial variables, of the wavefield $u(x, z, t)$. Therefore, its inverse Fourier transform is

$$u(x, z, t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(k_x, k_z)\, e^{i2\pi(k_x x + k_z z + \omega t)}\,dk_x\,dk_z.$$

It is now critical to determine which way (up or down) this wave propagates. The dispersion relation (7.31) may be written as

The initial condition for the ordinary differential equation (7.30) may now be used to advantage. By setting $t = 0$ in the previous line (7.35), it is clear that

$$U(k_x, k_z, 0) = P(k_x, k_z).$$

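On the other hand, setting $t = 0$ in (7.29) tells us that

$$U(k_x, k_z, 0) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x, z, 0)\, e^{-i2\pi(k_x x + k_z z)}\,dx\,dz.$$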

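Letting $z = 0$ in this expression (evaluating it at the earth's surface) yields a representation for the record section $u(x, 0, t)$. It is

$$u(x, 0, t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(k_x, k_z)\, e^{i2\pi(k_x x + \omega t)}\,dk_x\,dk_z. \qquad (7.37)$$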


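Another representation for the record section $u(x, 0, t)$ can be obtained by writing it in terms of its Fourier transform with respect to $x$ and $t$, which we shall denote

$$H(k_x, \omega) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x, 0, t)\, e^{-i2\pi(k_x x + \omega t)}\,dx\,dt.$$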

Now we can make the crucial comparison. From expressions (7.37) and (7.38) we see that

The dispersion relation (7.34) now enters in a fundamental way and produces the effect of migration. Differentiating the dispersion relation (problem 164) yields

Substituting for both $\omega$ and $d\omega/dk_z$ in the previous relation (7.39) we find

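With the function $P(k_x, k_z) = U(k_x, k_z, 0)$ in hand, the reflector section, $u(x, z, 0)$, may be recovered by taking one last inverse Fourier transform:

$$u(x, z, 0) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(k_x, k_z)\, e^{i2\pi(k_x x + k_z z)}\,dk_x\,dk_z.$$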

The path to this conclusion may have seemed somewhat convoluted. But it is now possible to stand back and summarize it rather succinctly. The process of converting the record section $u(x, 0, t)$ to the reflector section $u(x, z, 0)$ requires three basic steps.

1. Compute the function $H(k_x, \omega)$ by finding the two-dimensional Fourier transform, with respect to $x$ and $t$, of the record section $u(x, 0, t)$.

2. Compute the function $P(k_x, k_z)$ from $H(k_x, \omega)$ by equation (7.40).

3. Compute the reflector section $u(x, z, 0)$ by finding the two-dimensional inverse Fourier transform, with respect to $k_x$ and $k_z$, of $P(k_x, k_z)$.
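The three steps can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: `fk_migrate` is a hypothetical name, `v` denotes the constant velocity of the exploding reflector model, the mapping between $\omega$ and $k_z$ is taken to be $\omega = \mathrm{sign}(k_z)\, v \sqrt{k_x^2 + k_z^2}$, the depth spacing is chosen as $\Delta z = v\,\Delta t$, and linear interpolation in $\omega$ stands in for the more careful schemes used in practice.

```python
import numpy as np

def fk_migrate(section, dt, dx, v):
    """Constant-velocity FK migration sketch.

    section[n, j] holds sample n (time) of the trace at surface position j.
    Returns an array of the same shape whose first axis is depth, dz = v*dt.
    """
    nt, nx = section.shape
    dz = v * dt  # assumed depth spacing, so constant scale factors cancel

    # Step 1: two-dimensional DFT of the record section over (t, x).
    # Only the time-frequency axis is shifted so that omega is ascending.
    H = np.fft.fftshift(np.fft.fft2(section), axes=0)
    omega = np.fft.fftshift(np.fft.fftfreq(nt, d=dt))
    kx = np.fft.fftfreq(nx, d=dx)
    kz = np.fft.fftshift(np.fft.fftfreq(nt, d=dz))

    # Step 2: for each kx, resample H at omega(kz) and apply the
    # dimensionless factor |kz| / sqrt(kx^2 + kz^2) from d(omega)/d(kz).
    P = np.zeros_like(H)
    for l in range(nx):
        k = np.sqrt(kx[l] ** 2 + kz ** 2)
        w = np.sign(kz) * v * k
        safe_k = np.where(k > 0.0, k, 1.0)
        scale = np.where(k > 0.0, np.abs(kz) / safe_k, 1.0)
        re = np.interp(w, omega, H[:, l].real, left=0.0, right=0.0)
        im = np.interp(w, omega, H[:, l].imag, left=0.0, right=0.0)
        P[:, l] = scale * (re + 1j * im)

    # Step 3: two-dimensional inverse DFT over (kz, kx).
    return np.fft.ifft2(np.fft.ifftshift(P, axes=0)).real
```

A horizontal reflector is a useful sanity check: it contains only the $k_x = 0$ component, so the sketch should return it unchanged, with the time axis simply relabeled as depth. Frequencies that map outside the recorded band are set to zero.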

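Since the record section $u(x, 0, t)$ is known, $H(k_x, \omega)$ can be found by taking the Fourier transform of the record section with respect to $x$ and $t$. We can also write $u(x, 0, t)$ as the inverse Fourier transform of $H(k_x, \omega)$, which looks like

$$u(x, 0, t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} H(k_x, \omega)\, e^{i2\pi(k_x x + \omega t)}\,dk_x\,d\omega. \qquad (7.38)$$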

Therefore, $P(k_x, k_z)$ can be obtained from the Fourier transform (with respect to $x$ and $t$) of the record section by


Frequency-Wavenumber Migration with the DFT

How can we implement the method described above? Let us suppose that we have $M$ source and receiver locations at the surface, laid out along a straight line at regular intervals, with a spacing of $\Delta x$. We are at liberty to place the origin anywhere along the line, and to stay with our symmetric interval convention we select our origin so that the record section is nonzero over the interval $-A/2 < x < A/2$, with traces recorded at $j\Delta x$ for $j = -M/2 + 1 : M/2$. The total width of the interval in the $x$-direction is denoted $A$, so that $A = M\Delta x$. For each surface location we have a recording of the seismic trace over the interval $0 < t < T$ and there are $N$ samples of the trace, with a sample interval of $\Delta t$, so that $T = N\Delta t$.

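To perform the first step of the algorithm we must compute $H(k_x, \omega)$, which is given by

$$H(k_x, \omega) = \int_{0}^{T}\int_{-A/2}^{A/2} u(x, 0, t)\, e^{-i2\pi(k_x x + \omega t)}\,dx\,dt. \qquad (7.42)$$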

Both integrals may be approximated with a trapezoidal rule (with an appropriate adjustment to satisfy the AVED requirement). Therefore, we define a sequence $\{g_{jn}\}$ for $j = -M/2 + 1 : M/2$ and $n = 0 : N - 1$, made up of samples of $u$ given by

$$g_{jn} = u(j\Delta x, 0, n\Delta t).$$

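Using the sample intervals $\Delta x = A/M$ and $\Delta t = T/N$ we find by the reciprocity relations that we may take $\Delta k_x = 1/A$ and $\Delta\omega = 1/T$. Under these conditions, a trapezoidal rule approximation to the integral in (7.42) is given by

$$H(l\Delta k_x, m\Delta\omega) \approx \Delta x\,\Delta t \sum_{j=-M/2+1}^{M/2}\ \sum_{n=0}^{N-1} g_{jn}\, e^{-i2\pi jl/M}\, e^{-i2\pi nm/N},$$

where $l = -M/2 + 1 : M/2$ and $m = 0 : N - 1$. We will denote the approximate values of $H(l\Delta k_x, m\Delta\omega)$ by $H_{lm}$ and then make the enterprising observation that $H_{lm}$ is just $AT$ times the two-dimensional DFT of $g_{jn}$. It is easily written as

$$H_{lm} = AT\left[\frac{1}{MN}\sum_{j=-M/2+1}^{M/2}\ \sum_{n=0}^{N-1} g_{jn}\, e^{-i2\pi jl/M}\, e^{-i2\pi nm/N}\right].$$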
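The first step can be checked numerically with a short sketch. The function name `record_section_transform` is hypothetical, and NumPy's `fft2` is assumed as the DFT; since NumPy places no $1/(MN)$ factor on the forward transform, the factor $AT/(MN)$ appears here as $\Delta x\,\Delta t$. NumPy also treats the first sample as the origin, so the phases differ from those of the symmetric indexing used in the text, and only magnitudes are compared below.

```python
import numpy as np

def record_section_transform(g, dx, dt):
    """Approximate H(kx, omega) from samples g[j, n] of the record section.

    Returns the frequency grids (spacings 1/A and 1/T) and the array
    dx*dt*DFT2(g), which equals A*T/(M*N) times NumPy's unnormalized DFT.
    """
    M, N = g.shape
    kx = np.fft.fftfreq(M, d=dx)      # multiples of 1/A, where A = M*dx
    omega = np.fft.fftfreq(N, d=dt)   # multiples of 1/T, where T = N*dt
    H = dx * dt * np.fft.fft2(g)
    return kx, omega, H
```

For a test function such as a Gaussian pulse, whose transform has magnitude $e^{-\pi(k_x^2 + \omega^2)}$, the computed magnitudes should agree with the exact ones to within the small aliasing error of the grid.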

To perform the second step of the process, we must compute the function $P(k_x, k_z)$ from $H(k_x, \omega)$ by equation (7.40). Before doing this step it is necessary to determine the lengths of the domains and the sample rates for the depth variable $z$ and the wavenumber $k_z$. We are at liberty to select any depth interval, since $u(x, z, t)$ clearly exists to great depth. Since our model assumes that $c$, the velocity of sound in the earth, is constant, it seems reasonable to choose the depth interval to be $[0, cT/2]$. The reflection from any reflector deeper than $D = cT/2$ will arrive at the surface after the recorders have been shut off.

While in theory we could change the sampling so that the number of samples used for $k_z$ in $P(k_x, k_z)$ is not the same as the number used for $\omega$ in $H(k_x, \omega)$, it will certainly be simpler to choose the same number of samples, namely $N$. This in turn implies that $\Delta z = D/N$, from which the reciprocity relations give us $\Delta k_z = 1/D$. Now the migration step must be done in a discrete manner. Using the relationship (7.40),

we must form the array $P_{lm} = P(l\Delta k_x, m\Delta k_z)$ from the array $H(l\Delta k_x, m\Delta\omega) = H_{lm}$ that was computed in the first step. The identification of these two arrays must be done for $l = -M/2 + 1 : M/2$ and $m = 0 : N - 1$, but there are some subtleties involved. Ordinarily some interpolation will be needed to obtain the samples $P(l\Delta k_x, m\Delta k_z)$ on a regular grid with respect to the variable $k_z$. Many interpolation schemes are available, but a choice must be made with care. One danger arises from the fact that the sample intervals $\Delta x$ and $\Delta z$ generally differ greatly. As a result, the phenomenon of dip aliasing may occur because the dip of the migrated reflections cannot be resolved using the horizontal sample interval $\Delta x$. The reflections will have an apparent dip that is less than the true dip. We will not digress into the arcana of dip aliasing.

To complete the migration we must perform the final step in the algorithm, the calculation of the reflector section from the function $P(k_x, k_z)$. It will be no surprise to learn that an IDFT is used to perform this step. In fact, it should be predictable that, letting $u_{jn}$ represent the samples of $u(x, z, 0)$ and $P_{lm}$ represent the samples of $P(k_x, k_z)$, the inverse transform is given by

$$u_{jn} = \frac{1}{AD}\sum_{l=-M/2+1}^{M/2}\ \sum_{m=0}^{N-1} P_{lm}\, e^{i2\pi jl/M}\, e^{i2\pi nm/N},$$

where $j = -M/2 + 1 : M/2$ and $n = 0 : N - 1$. It should be verified (problem 167) that with the given domains and grid spacings, this IDFT does, in fact, provide approximations to the reflector section $u(x, z, 0)$ at the appropriate spatial grid points.

Example: FK migration. A simple example of the frequency-wavenumber migration process is shown in Figure 7.20. The synthetic seismic data from Figure 7.19 has been migrated using the algorithm described above. Recall that the geological model consisted of layers that had been folded into a syncline. The seismic section generated by this model is characterized by the bowtie pattern. Note that after migration, the data appears much more like the geological model, since the reflections shown on the original synthetic section have been moved so that they now appear on the trace overlying the reflecting points that generated them.

Of course, this example is highly idealistic. Real data, unlike the synthetic data used here, is generally characterized by a rather low signal-to-noise ratio, rendering the input data far more difficult to interpret. In particular, the record section is used to estimate the velocity to be used in the migration process, which was known in generating the example. Hence, reality generally falls short of the success portrayed here.


FIG. 7.20. The application of frequency-wavenumber migration to the synthetic seismic data in Figure 7.19 yields dramatic improvement in the data, as may be seen here. The migration process has moved the reflections on the section so that they appear on the trace overlying the correct physical location of the reflectors.


In reality a host of complications arise with the above procedure, and many highly sophisticated data processing routines have been developed to address them. We will mention two of these issues very briefly. The first, and most obvious, drawback to the scheme concerns the use of a constant seismic velocity $c$. Obviously, the velocity distribution of the subsurface is unknown. Additional information often exists that can be used to make good estimates of the subsurface velocity. Even without such information, one could perform the migration several times, using several different values of $c$, and choose the best image of the subsurface. A more serious concern is that the velocity of the subsurface is not constant. There are many local variations and even the regional average velocity varies systematically, generally increasing with depth. Without the constant velocity assumption Fourier-based migration cannot be applied, since $c$ cannot be moved outside the Fourier transform integrals. Approximate Fourier methods have been developed for variable velocity cases, but are much more difficult to apply. Current research addresses ways to implement migration schemes using variable velocity.

The second difficulty is the appearance of ghost reflections due to the use of the DFT. Ghost reflections are reflections that should occur at or near the bottom of the reflector section, but actually appear near the top after migration (and vice versa). Similarly, reflectors that should appear at the extreme right or left sides may show up at the opposite sides of the migrated section. The cause of this problem is simple, as is one method of curing it. Upon close examination of the effect of migration in the frequency domain (7.40), it will be observed that both the real and imaginary parts of $H(k_x, \omega)$ are moved the same amount for each $\omega$. The amount that the data are moved differs for each $\omega$, however. Hence the change is a phase shift, as well as an amplitude change. For this reason FK migration is often referred to as phase shift migration. We have seen in Section 7.2 that the effect of a linear phase shift in the frequency domain is a constant shift in the time domain, and that the periodicity of the DFT then produces the wrap around effect. While the phase shift given by (7.40) is certainly nonlinear, over the range of frequencies common to reflection seismology the effect is often a nearly linear phase shift, and the ghost reflections are indeed caused by the wrap around effect. A simple cure for the wrap around effect is to pad the data with zeros prior to the migration process. Padding is generally done at the bottom of the record section, and along one or both sides. A sufficient number of zeros must be used to ensure that only zeros are "wrapped around" into the data. Of course, the introduction of these zeros affects the accuracy of the DFT as an approximation to the Fourier transform. The seismic data processor must select the migration parameters carefully, to balance the benefits of zero padding with the degradation in accuracy.
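The wrap around effect and its cure by zero padding can be seen in a one-dimensional sketch. The function names are hypothetical, and a purely linear phase shift on a single trace is assumed in place of the nearly linear shift produced by migration.

```python
import numpy as np

def dft_shift(trace, shift):
    """Delay a trace by `shift` samples using a linear phase shift in the DFT domain."""
    n = len(trace)
    k = np.fft.fftfreq(n) * n                      # integer frequency indices
    phase = np.exp(-2j * np.pi * k * shift / n)    # linear phase in frequency
    return np.fft.ifft(np.fft.fft(trace) * phase).real

def padded_shift(trace, shift, pad):
    """Append `pad` zeros before shifting, then keep the original length."""
    padded = np.concatenate([trace, np.zeros(pad)])
    return dft_shift(padded, shift)[:len(trace)]
```

A pulse near the end of an unpadded trace reappears near the beginning after the shift, which is the ghost. With at least `shift` zeros appended, only zeros wrap around into the retained samples.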

7.4. Image Reconstruction from Projections

Suppose an x-ray is passed along a straight line through a homogeneous object. If the length of the travel path is $x$ and the x-ray has an initial intensity $I_0$, then the intensity, $I$, of the x-ray that emerges from the object satisfies

$$I = I_0\, e^{-ux},$$

where $u$, the linear attenuation coefficient, is dependent on the material making up the object. If the x-ray passes through one material along a path of length $x_1$, and then through another material along a path of length $x_2$, and so on through a number of layers, the emerging x-ray will be attenuated according to

$$I = I_0 \exp\left(-\sum_i u_i x_i\right),$$

where $u_i$ is the attenuation coefficient of the $i$th material. We see that passing an x-ray through a nonhomogeneous object may be modeled by letting the number of materials increase while the length of the travel path through each material decreases. Upon passage to the limit, the decay of the x-ray behaves according to

$$I = I_0 \exp\left(-\int_L u(x)\,dx\right),$$

where $u(x)$ is the linear attenuation function, and $L$ designates the line along which the x-ray passes through the object.
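The layered attenuation model is easy to sketch numerically. The function names below are hypothetical, and the exponential attenuation law is assumed in the form $I = I_0 \exp(-\sum_i u_i x_i)$.

```python
import math

def transmitted_intensity(i0, coefficients, lengths):
    """Intensity emerging from a stack of layers with the given
    attenuation coefficients u_i and path lengths x_i."""
    total = sum(u * x for u, x in zip(coefficients, lengths))
    return i0 * math.exp(-total)

def attenuation_line_integral(i0, i):
    """Recover the sum (or integral) of attenuation along the path from
    the measured intensities by taking logs."""
    return -math.log(i / i0)
```

Taking the log of the intensity ratio returns the accumulated attenuation along the path, which is the quantity a tomographic measurement supplies. Splitting a path into many thin layers turns the sum into an approximation of the line integral.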

Consider passing many x-rays through an object along lines that lie in the same plane. Suppose that the linear attenuation function in the plane of the x-rays can be described by a function of two variables $u(x, y)$. The attenuation function of an object may depend on many factors, but in most cases a principal factor is the density of the material [78]. Therefore, we will refer to $u(x, y)$ as the density function, or simply the image. For each x-ray the travel path can be parameterized by the distance, $s$, traveled along the path, $L$. The attenuation in the x-ray intensity for the path $L$ is

$$I = I_0 \exp\left(-\int_L u(x, y)\,ds\right),$$

and taking logs of both sides we have

$$\log\left(\frac{I}{I_0}\right) = -\int_L u(x, y)\,ds.$$

The basic problem of Computer Aided Tomography (CAT) is to reconstruct the density function $u(x, y)$ from measurements of the x-ray attenuation $\log(I/I_0)$ along many paths through the object.

The Radon Transform and Its Properties

We will now let $u(x, y)$ be an arbitrary function of two spatial variables that is nonzero over some region $D$ in the $xy$-plane. The $xy$-plane is often called the image or object plane. We will regard $u$ as the density of a planar object that occupies the region $D$. The Radon transform of $u$ is defined as the set of all line integrals of $u$,

$$[\mathcal{R}u](L) = \int_L u(x, y)\,ds,$$

where $L$ is any line passing through $D$. The transform is named for Johann Radon⁶ [114], who studied the transform and discovered inversion formulae by which the function $u(x, y)$ may be determined from $\mathcal{R}u$.

The Radon transform $\mathcal{R}u$ is also a function of two variables that must be interpreted carefully. The two variables in the transform domain can be any two parameters that uniquely specify a line in the plane. For example, let $\rho$ be a real number and $\phi$ be an angle measured from the positive $x$-axis. Then the condition

$$\rho = x\cos\phi + y\sin\phi$$

determines a line $L$ in the $xy$-plane, normal to the unit vector $(\cos\phi, \sin\phi)^T$ and a distance $\rho$ from the origin, measured along that vector (see Figure 7.21 and problem 168). Therefore, the coordinates $(\rho, \phi)$ determine a line uniquely and can be used as the variables of the Radon transform in the following way. Recall the sifting (or testing) property of the Dirac $\delta$ distribution,

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - a)\,dx = f(a)$$

in one dimension, and

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\mathbf{x})\,\delta(\mathbf{x} - \mathbf{a})\,d\mathbf{x} = f(\mathbf{a}),$$

where $\mathbf{x}$ and $\mathbf{a}$ are vectors, in two dimensions. Then the Radon transform can be specified by

$$[\mathcal{R}u](\rho, \phi) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x, y)\,\delta(\rho - x\cos\phi - y\sin\phi)\,dx\,dy.$$

FIG. 7.21. The figure shows the geometry of the Radon transform $[\mathcal{R}u](\rho, \phi)$. For given values of $\rho$ and $\phi$, the transform gives the value of the line integral of $u$ along the line orthogonal to the vector $(\cos\phi, \sin\phi)^T$ whose signed orthogonal distance from the origin is $\rho$. The figure shows the two lines corresponding to $\rho^2 = \cos^2(\pi/6) = 0.75$ and $\phi = \pi/6$, with equations $x\cos(\pi/6) + y\sin(\pi/6) = \pm\cos(\pi/6)$.

Figure 7.21 shows the geometry of the Radon transform, in which $\phi$ gives the angle of the ray that is normal to the line $L$ and is measured counterclockwise from the positive $x$-axis. Then $\rho$ is the signed distance from the origin to the point where the line $L$ meets the ray. A word of caution is needed: the variables $\rho$ and $\phi$ should not be confused with polar coordinates. A function of polar coordinates must be single-valued at the origin, but there is no reason to suppose that the line integrals of $u$ passing through the origin have the same value for all $\phi$.

The Radon transform may be thought of as a projection operator. Specifically, if the transform $[\mathcal{R}u]$ is considered as a function of $\rho$ with a parameter $\phi$, then the profile given by the set of all line integrals, for fixed $\phi$, as $\rho$ varies, is a projection of $u(x, y)$ onto the one-dimensional subspace in which $\rho$ varies between $-\infty$ and $\infty$. Henceforth we shall use the term projection to indicate the values of the Radon transform corresponding to a fixed value of $\phi$. Taken over all angles, the transform is referred to as the set of projections. This terminology reflects medical tomography techniques in which many parallel x-rays are passed through an object and collected on the other side, all at a fixed angle $\phi$. This single pass constitutes a projection. The entire apparatus is then rotated to a new angle $\phi$ and another projection is made.

⁶JOHANN RADON (1887-1956) was born in Bohemia (the former Czechoslovakia). He held professorships throughout Europe including the Universities of Hamburg and Vienna. He made lasting contributions to the calculus of variations, differential geometry, and integration theory.

Example: The characteristic function of a disk. Consider the density function consisting of the characteristic function of a disk of radius $R$, centered at the origin; that is,

$$u(x, y) = \begin{cases} 1 & \text{if } x^2 + y^2 \le R^2, \\ 0 & \text{otherwise.} \end{cases}$$

Since this object is symmetric with respect to the projection angle $\phi$, its Radon transform is independent of $\phi$ and a single projection will suffice to determine the Radon transform. Consider the projection angle $\phi = 0$, so that the $\rho$-axis of the projection corresponds to the $x$-axis of the image space. The line integrals are then taken along lines parallel to the $y$-axis, at a distance $\rho$ from the origin. For all values of $|\rho| > R$, the lines of integration do not intersect the disk, and the projection has zero value. For $|\rho| \le R$ the transform is

$$[\mathcal{R}u](\rho, 0) = \int_{-\sqrt{R^2 - \rho^2}}^{\sqrt{R^2 - \rho^2}} dy = 2\sqrt{R^2 - \rho^2}.$$

Invoking the symmetry of $u(x, y)$, the Radon transform is

$$[\mathcal{R}u](\rho, \phi) = \begin{cases} 2\sqrt{R^2 - \rho^2} & \text{if } |\rho| \le R, \\ 0 & \text{if } |\rho| > R. \end{cases}$$

This example is shown graphically in Figure 7.22. Notice that both the function $u(x, y)$ and the transform $[\mathcal{R}u](\rho, \phi)$ are displayed on Cartesian grids. For the transform $[\mathcal{R}u](\rho, \phi)$ the grid uses $\rho$ and $\phi$ axes, while values of the transform are displayed as height above the $\rho\phi$-plane. Another simple image whose Radon transform can be computed analytically is presented in problem 169.

The Radon transform has been well studied, largely because it has proven useful in many diverse areas. Rigorous treatments and exhaustive bibliographies can be found in numerous works [47], [62], [78], [104]. Certain properties of the Radon transform are essential in developing the DFT-based inversion methods of this study, so they are discussed briefly here. For a detailed treatment, see the books by Deans [47] or Natterer [104].

1. Existence and smoothness. The line specified by $\rho = x\cos\phi + y\sin\phi$ is, of course, of infinite extent. For the transform to exist at all, it is necessary that $u(x, y)$ be integrable along $L$. It is common to require that the function $u$ belong to some reasonable space of functions. The following discussion is motivated by the medical imaging problem, so we are concerned primarily with functions possessing the following characteristics. First, we assume compact support, specified by $u(x, y) = 0$ for $|x| > A$, $|y| > A$, or in polar coordinates by $u(r, \theta) = 0$ for $r > A$. Second, we impose no smoothness requirements. We do not, for example, insist that the density function $u(x, y)$ be continuous. It is easy to imagine objects imbedded within objects (for example, bone in flesh) with sharp discontinuities. The density function may or may not be differentiable, although it often will be differentiable almost everywhere. We will assume that the density function is bounded, and that in turn the projections are bounded.


FIG. 7.22. The characteristic function of a disk is shown on the left. The $x$-axis is parallel to the lower right edge, while the $y$-axis parallels the lower left edge. The Radon transform of the characteristic function of a disk is shown on the right. The $\rho$-axis is parallel to the lower right edge, while the $\phi$-axis parallels the lower left edge.

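2. Linearity. The Radon transform of a linear combination of functions may be expressed as

$$[\mathcal{R}(\alpha f + \beta g)](\rho, \phi) = \alpha\,[\mathcal{R}f](\rho, \phi) + \beta\,[\mathcal{R}g](\rho, \phi).$$

Therefore, the Radon transform is a linear operator.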

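3. Shift property. Given a function $u(x, y)$, consider the effect of transforming $u(x - a, y - b)$. We see that

$$[\mathcal{R}u(x - a, y - b)](\rho, \phi) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x - a, y - b)\,\delta(\rho - x\cos\phi - y\sin\phi)\,dx\,dy.$$

Letting $v = x - a$ and $w = y - b$ yields

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(v, w)\,\delta(\rho - a\cos\phi - b\sin\phi - v\cos\phi - w\sin\phi)\,dv\,dw = [\mathcal{R}u](\rho - a\cos\phi - b\sin\phi, \phi).$$

Thus, the effect of shifting the function $u(x, y)$ is to shift each projection a distance $a\cos\phi + b\sin\phi$ along the $\rho$ axis.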

4. Evenness. The Radon transform is an even function of $(p,\phi)$, in that

$$[\mathcal{R}u](-p,\ \phi+\pi) = [\mathcal{R}u](p,\phi).$$

There are several more properties of the Radon transform that are useful for numerous applications, but the foregoing serve as an adequate introduction for our purposes. The interested reader is referred to problem 170 for further properties.

The Central Slice Theorem

For our purposes, the most important property of the Radon transform is given by the Central Slice Theorem. This fundamental theorem relates the Fourier transform of a function to the Fourier transform of its Radon transform, and in so doing, provides the basis for several methods for inverting the Radon transform.

THEOREM 7.1. CENTRAL SLICE. Let the image $u(x,y)$ have a two-dimensional Fourier transform, $\hat{u}(\omega_x,\omega_y)$, and a Radon transform, $[\mathcal{R}u](p,\phi)$. If $\widehat{\mathcal{R}u}(\omega,\phi)$ is the one-dimensional Fourier transform, with respect to $p$, of the projection $[\mathcal{R}u](p,\phi)$, then

$$\widehat{\mathcal{R}u}(\omega,\phi) = \hat{u}(\omega_x,\omega_y),$$

where $\omega^2 = \omega_x^2 + \omega_y^2$ and $\phi = \tan^{-1}(\omega_y/\omega_x)$. That is, the Fourier transform of the projection of $u$ perpendicular to the unit vector $(\cos\phi,\sin\phi)^T$ is exactly a slice through the two-dimensional Fourier transform of $u(x,y)$ in the direction of the unit vector.

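Proof: Consider the Fourier transform of a projection. For fixed $\phi$, the one-dimensional Fourier transform of $[\mathcal{R}u](p,\phi)$, with respect to $p$, is

$$\widehat{\mathcal{R}u}(\omega,\phi) = \int_{-\infty}^{\infty} [\mathcal{R}u](p,\phi)\, e^{-i2\pi\omega p}\, dp = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x,y)\, e^{-i2\pi\omega(x\cos\phi + y\sin\phi)}\, dx\, dy.$$

Defining the frequency variables in the component directions by

$$\omega_x = \omega\cos\phi, \qquad \omega_y = \omega\sin\phi,$$

we have $\omega^2 = \omega_x^2 + \omega_y^2$ and $\phi = \tan^{-1}(\omega_y/\omega_x)$. Note that $(\omega,\phi)$ are genuine polar coordinates. The variables $(\omega_x,\omega_y)$ cover all of $\mathbf{R}^2$ as $\phi$ varies over the interval $[0,\pi)$ and $\omega$ varies over $(-\infty,\infty)$. But then the expression for $\widehat{\mathcal{R}u}(\omega,\phi)$, the one-dimensional Fourier transform of the Radon transform, becomes

$$\widehat{\mathcal{R}u}(\omega,\phi) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x,y)\, e^{-i2\pi(\omega_x x + \omega_y y)}\, dx\, dy,$$

which is precisely the two-dimensional Fourier transform of $u(x,y)$.

The power of this theorem is that it leads, in a very straightforward manner, to a simple, elegant method for inverting the Radon transform and recovering an image from its projections.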
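The theorem has an exact discrete analogue that is easy to check numerically. The following is a minimal sketch, assuming NumPy and a hypothetical Gaussian test image (neither appears in the text): for the single angle $\phi = 0$, the DFT of the projection is compared with the $\omega_y = 0$ slice of the two-dimensional DFT of the image.

```python
import numpy as np

# Sample a smooth test image (an illustrative choice) on an N x N grid
# covering [-1, 1) x [-1, 1).
N = 64
x = np.arange(-N // 2, N // 2) * (2.0 / N)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.exp(-20.0 * ((X - 0.1) ** 2 + (Y + 0.2) ** 2))

# Projection for phi = 0: integrate along y, so that p coincides with x.
dy = 2.0 / N
projection = u.sum(axis=1) * dy

# One-dimensional DFT of the projection (taken along p) ...
slice_from_projection = np.fft.fft(projection)

# ... compared with the omega_y = 0 slice of the two-dimensional DFT.
slice_from_image = np.fft.fft2(u)[:, 0] * dy

max_difference = np.max(np.abs(slice_from_projection - slice_from_image))
print(max_difference)
```

For other angles the slice falls between Cartesian grid points, so the discrete identity holds only approximately.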


The Fourier Transform Method of Image Reconstruction

Let us suppose that the Radon transform of an unknown image $u(x,y)$ is the function $g(p,\phi)$; that is,

$$g(p,\phi) = [\mathcal{R}u](p,\phi).$$

Image reconstruction is an inverse problem: given the collected projection data $g(p,\phi)$, we seek the function $u(x,y)$. If we can form the Fourier transforms of $g(p,\phi)$, for all values of $\phi$, then the assemblage of the Fourier transforms, $\hat{g}(\omega,\phi)$, defines a two-dimensional function (in polar coordinates) which, by the Central Slice Theorem, must equal $\hat{u}(\omega_x,\omega_y)$. In principle, the image $u(x,y)$ can be recovered from $\hat{g}(\omega_x,\omega_y)$ by a two-dimensional inverse Fourier transform

$$u(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \hat{g}(\omega_x,\omega_y)\, e^{i2\pi(\omega_x x + \omega_y y)}\, d\omega_x\, d\omega_y. \qquad (7.48)$$

Letting $\mathcal{F}_n$ and $\mathcal{F}_n^{-1}$ represent the n-dimensional Fourier transform and n-dimensional inverse Fourier transform operators, respectively, this procedure can be written more compactly as

$$u = \mathcal{F}_2^{-1}\,\mathcal{F}_1\, g, \qquad (7.49)$$

as long as we keep track of the appropriate transform variables. This formula can be used to develop a practical inversion method. The algorithm proceeds by reading (7.49) right to left. Let $g_\phi$ be the Radon transform data for a fixed angle $\phi$. Then:

1. For each $\phi$, compute $\hat{g}_\phi = \mathcal{F}_1 g_\phi$, the one-dimensional Fourier transform of each projection in the data set. Assemble the transforms $\hat{g}_\phi$ into a two-dimensional function $\hat{g}(\omega,\phi)$, which is the two-dimensional Fourier transform $\hat{u}(\omega_x,\omega_y)$, where $\omega_x = \omega\cos\phi$ and $\omega_y = \omega\sin\phi$.

2. From $\hat{u}(\omega_x,\omega_y)$, find the unknown image $u(x,y)$ by way of a two-dimensional inverse Fourier transform operator.

The DFT-Based Image Reconstruction Method

In practice, of course, there is only a finite number of projections and only a finite number of samples along each projection. Therefore, in order to apply (7.49), the problem must be discretized. Assume that there are $M$ equally spaced projection angles $\phi_j = j\pi/M$, where $j = 0 : M-1$. This means that the direction vector specifying a given ray is $(\cos\phi_j,\sin\phi_j)^T$. It will also be assumed that each projection is evenly sampled at the points $p_n = n\Delta p = 2n/N$ on the interval $[-1,1]$, for $n = -N/2+1 : N/2$. We will denote the set of projection data by

$$g_{nj} = g(p_n,\phi_j)$$

for $n = -N/2+1 : N/2$ and $j = 0 : M-1$. If we assume that the problem has been scaled so that $u(x,y) = 0$ for $x^2 + y^2 \ge 1$, then $g(p,\phi) = 0$ for $|p| \ge 1$. Therefore, $g_{N/2,j} = 0$. For convenience, we will assume $N$ to be an even integer. The algorithm can easily be modified to accommodate odd $N$ as well.


The first step of the algorithm is to calculate $\mathcal{F}_1 g_j$, where $g(p,\phi_j) = 0$ for $|p| \ge 1$. The integral

$$\hat{g}(\omega,\phi_j) = \int_{-1}^{1} g(p,\phi_j)\, e^{-i2\pi\omega p}\, dp$$

must be approximated for $j = 0, 1, \ldots, M-1$. The trapezoid rule can be called in for this approximation using the grid points $p_n = n\Delta p = 2n/N$. By the reciprocity relation $\Delta p\,\Delta\omega = 1/N$, we see that $\Delta\omega = 1/2$, and therefore the appropriate grid points in the frequency domain are $\omega_k = k\Delta\omega = k/2$. Combining these facts, the trapezoid rule approximation takes the form

$$\hat{g}(\omega_k,\phi_j) \approx \frac{2}{N}\sum_{n=-N/2+1}^{N/2} g_{nj}\, e^{-i2\pi nk/N}. \qquad (7.50)$$

Note that the requirement that $g(-1,\phi) = g(1,\phi) = 0$ allows us to set $g_{\pm N/2} = 0$ in place of the endpoint average $(g_{-N/2} + g_{N/2})/2$ in the trapezoid rule. It also satisfies the AVED requirement.

We recognize the sum in (7.50) as the DFT of the sequence $g_{nj}$, so that, for each fixed $j$,

$$G_{kj} = \frac{2}{N}\sum_{n=-N/2+1}^{N/2} g_{nj}\, e^{-i2\pi nk/N}.$$

This gives frequency samples $G_{kj}$ at $k\Delta\omega = k/2$ for $k = -N/2+1 : N/2$. Making this choice for approximating $\mathcal{F}_1$ is equivalent to approximating the Fourier coefficients $c_k$ of the two-periodic function $g(p,\phi_j)$ on $[-1,1]$. Equally important, computing the DFT for each of the $M$ projections $\phi_j$, for $j = 0 : M-1$, produces data on a polar coordinate grid in the frequency domain (see Figure 7.23).
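The index bookkeeping in this step is the part most easily gotten wrong in code, because library FFTs expect indices $0 : N-1$ rather than $-N/2+1 : N/2$. The following is a minimal sketch, assuming NumPy; the function name `projection_dft` and the test projection are hypothetical, and the scaling $\Delta p\sum_n g_n e^{-i2\pi nk/N}$ is taken from the trapezoid rule described above.

```python
import numpy as np

def projection_dft(g, dp):
    """Trapezoid-rule approximation to the Fourier transform of one projection.

    g holds samples g_n for n = -N/2+1 : N/2 (N even); the result holds
    G_k = dp * sum_n g_n exp(-i 2 pi n k / N) for k = -N/2+1 : N/2.
    """
    N = len(g)
    shift = N // 2 - 1                 # array position of the n = 0 sample
    g_wrapped = np.roll(g, -shift)     # order n = 0, ..., N/2, -N/2+1, ..., -1
    G_wrapped = np.fft.fft(g_wrapped)  # unnormalized sum over n
    return dp * np.roll(G_wrapped, shift)

# Illustrative projection that vanishes at p = +1 and p = -1.
N = 16
n = np.arange(-N // 2 + 1, N // 2 + 1)
p = 2.0 * n / N
g = (1.0 - p**2) ** 2
G = projection_dft(g, 2.0 / N)

# Direct O(N^2) evaluation of the same sum, for comparison.
G_direct = (2.0 / N) * np.array(
    [np.sum(g * np.exp(-2j * np.pi * n * k / N)) for k in n]
)
print(np.max(np.abs(G - G_direct)))
```

Applying the function to each of the $M$ projections in turn would fill one angular line of the polar frequency grid per call.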

The next step in the algorithm is to compute the inverse transform (7.48) by applying $\mathcal{F}_2^{-1}$ to the transformed projections $\hat{g}(\omega_k,\phi_j)$. It is possible to discretize (7.48) directly, using the data in polar coordinates. For example, we might write

where $\Delta\omega = 1/2$ and $\Delta\phi = \pi/M$, which uses the data produced in step 1. We are at liberty to reconstruct at any convenient sampling of $(x,y)$ within the region $\{(x,y) \mid x^2 + y^2 < 1\}$. It seems, however, to be almost universal to select $\Delta x = \Delta y = 2/N$. This is done so that the final reconstructed image has the same sampling along the x- and y-axes as $\Delta p$ in the projection set. It also means that image values for $N^2$ points must be computed.

Equation (7.51) is not a particularly good discretization for (7.48), because the summation requires $O(MN)$ operations per point. Since the reconstruction must be done for $N^2$ image points, this portion of the inversion would require a prohibitive $O(MN^3)$ operations. The difficulty stems from the fact that $G_{kj}$ is a data set on a polar coordinate grid in the frequency domain, for which an FFT algorithm is not available.

In order to reconstruct an image of $N^2$ pixels on $[-1,1]\times[-1,1]$, and do so efficiently (with an FFT), it is essential to have the transform data, $\hat{g}(\omega_x,\omega_y)$, on a Cartesian frequency grid, as shown in Figure 7.23. Such a grid gives transform data $G_{mp} = \hat{g}(m\Delta\omega_x, p\Delta\omega_y)$ for $m, p = -N/2+1 : N/2$. If the grid spacing on the image is to conform to the grid spacing of the projection data, then $\Delta x = \Delta y = 2/N$; this implies that $\Delta\omega_x = \Delta\omega_y = 1/2$ by the reciprocity relation.

FIG. 7.23. According to the Central Slice Theorem, the Fourier transform of the image can be computed on a polar grid in the frequency domain (solid lines). Before performing the inverse transform, the transform must be interpolated to a Cartesian grid (dashed lines). The left figure shows the two grids, while the relationship between the grids is detailed on the right. A simple interpolation uses a weighted average of the transform values at the four nearest polar grid points (o) to produce a value at the enclosed Cartesian grid point (•).

With the transform data on a Cartesian grid, the operator $\mathcal{F}_2^{-1}$ can be approximated using the two-dimensional IDFT

$$u_{qs} = \frac{1}{4}\sum_{m=-N/2+1}^{N/2}\ \sum_{p=-N/2+1}^{N/2} G_{mp}\, e^{i2\pi(mq+ps)/N}$$

for $q, s = -N/2+1 : N/2$, where the factor of 1/4 arises as the product $\Delta\omega_x\Delta\omega_y$. This double sum can be computed efficiently using a two-dimensional FFT, in $O(N^2\log_2 N)$ operations. In order to obtain $G_{mp} \approx \hat{g}(m\Delta\omega_x, p\Delta\omega_y)$ on a Cartesian grid it is necessary to interpolate the values $G_{kj}$ from the polar grid. This is the central issue of the Fourier reconstruction method. With this in mind, the symbolic description of the algorithm, (7.49), is modified to read

$$u = \mathcal{F}_2^{-1}\,\mathcal{I}\,\mathcal{F}_1\, g,$$

where $\mathcal{I}$ is an interpolation operator that maps the polar representation of $\hat{g}(\omega,\phi)$ to the Cartesian representation $\hat{g}(\omega_x,\omega_y)$. The details of the interpolation problem need not concern us here, as they have no bearing on the use of the DFT in the image reconstruction problem. The interested reader will find more than enough discussion of the problem in the references. For the sake of brevity and simplicity, we will assume that the interpolation is carried out by a simple process in which each value on the Cartesian grid is a weighted average of the four nearest values on the polar grid. The geometry of this interpolation is illustrated in Figure 7.23.
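The geometry of the four-point average can be sketched in a few lines. The following is only an illustration, not the scheme used for the figures: the function name `polar_neighbors` is hypothetical, the weights are assumed to be bilinear in $(\omega,\phi)$, and the polar grid is assumed to be $\omega_k = k\Delta\omega$ with signed $k$ and $\phi_j = j\pi/M$ for $j = 0 : M-1$.

```python
import math

def polar_neighbors(m, p, M):
    """Return [(k, j, weight), ...] for the four polar grid points that
    surround the Cartesian frequency point (m*dw, p*dw).

    Bilinear weights in (omega, phi) are an illustrative assumption."""
    radius = math.hypot(m, p)      # radial coordinate in units of dw
    angle = math.atan2(p, m)       # lies in (-pi, pi]
    if angle < 0.0:                # fold into [0, pi) with a signed radius
        angle += math.pi
        radius = -radius
    elif angle >= math.pi:
        angle -= math.pi
        radius = -radius
    kf = radius                    # fractional radial index
    jf = angle * M / math.pi       # fractional angular index
    k0, j0 = math.floor(kf), math.floor(jf)
    a, b = kf - k0, jf - j0
    neighbors = []
    for dk, wk in ((0, 1.0 - a), (1, a)):
        for dj, wj in ((0, 1.0 - b), (1, b)):
            k, j = k0 + dk, j0 + dj
            if j == M:             # phi = pi is phi = 0 with omega negated
                k, j = -k, 0
            neighbors.append((k, j, wk * wj))
    return neighbors

# Cartesian point (2*dw, 1*dw) on a polar grid with M = 8 angles.
for k, j, w in polar_neighbors(2, 1, 8):
    print(k, j, round(w, 4))
```

Cartesian points near the corners of the frequency square have radial indices beyond $N/2$, where no polar data exist; a complete implementation would have to decide how to treat them (for example, by using zero).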

The inversion algorithm may be summarized in the following steps, where we assume the discretized projection data $g_{nj} = g(p_n,\phi_j)$.

1. For each $j = 0 : M-1$, find $G_{kj} \approx \mathcal{F}_1 g(p,\phi_j)$ by applying forward DFTs:

$$G_{kj} = \frac{2}{N}\sum_{n=-N/2+1}^{N/2} g_{nj}\, e^{-i2\pi nk/N}$$

for $k = -N/2+1 : N/2$, using $M$ FFTs of length $N$.

2. Interpolate the polar frequency data $G_{kj}$ onto a Cartesian grid. A simple scheme for this has the form

$$G_{mp} = w_a G_a + w_b G_b + w_c G_c + w_d G_d,$$

where $G_a$, $G_b$, $G_c$, and $G_d$ are the four points on the polar grid nearest to the target $G_{mp}$. The weighting coefficients $(w_a, w_b, w_c, w_d)$ are unspecified here, but are selected according to the type of interpolation desired.

3. Approximate the image $u(x,y) = \mathcal{F}_2^{-1}\hat{u}(\omega_x,\omega_y)$ with a two-dimensional IDFT

$$u_{qs} = \frac{1}{4}\sum_{m=-N/2+1}^{N/2}\ \sum_{p=-N/2+1}^{N/2} G_{mp}\, e^{i2\pi(mq+ps)/N}$$

for $q, s = -N/2+1 : N/2$, using an $N^2$-point two-dimensional FFT for speed.

The entire algorithm is summarized in the map of Figure 7.24, which shows the relationships between the four computational arrays.

It is not difficult to estimate the computational cost of the algorithm, which consists of the cost of approximating the forward transforms, the cost of the interpolation, and the cost of the two-dimensional inverse transform. To simplify the argument we will assume that $M = N$ in the method just presented. As we shall see in Chapter 10, with the use of the FFT, the $N$ forward transforms entail $O(N\log N)$ operations each. The operation count can be reduced by one-half with specialized FFTs for real data. For most interpolation schemes, $O(1)$ operations are required for each of the $N^2$ points to be interpolated, hence we will use $O(N^2)$ as the cost of the interpolation. (It should be noted, however, that some of the extremely accurate interpolations based on the Shannon Sampling Theorem incur costs approaching $O(N^3)$.) The cost of computing the two-dimensional IDFT is the cost of $N$ one-dimensional IDFTs, each of length $N$. We will assume that an inverse fast Fourier transform (IFFT) is employed to perform this computation. Since the IFFT and the FFT have the same operation count, this means that the inverse transform step of the algorithm has a cost of $O(N^2\log_2 N)$. Putting the three phases of the algorithm together, the total cost of the DFT reconstruction method is

$$C_1 N^2\log_2 N + C_2 N^2 + C_3 N^2\log_2 N,$$

where the constants $C_1$ and $C_3$ depend on the specific FFT algorithms used and the constant $C_2$ depends on the method of interpolation.
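To get a feel for the gap between the two approaches, the operation counts can be tabulated. This is a rough sketch only: the function names are hypothetical and the constants are set to 1 as placeholders, since their actual values depend on the FFT and interpolation routines used.

```python
import math

def direct_polar_cost(N, M):
    """Rough count for summing O(M*N) polar samples at each of N^2 pixels."""
    return M * N**3

def fourier_method_cost(N, c1=1.0, c2=1.0, c3=1.0):
    """C1 N^2 log2 N + C2 N^2 + C3 N^2 log2 N with M = N.

    The constants c1, c2, c3 are placeholders, not measured values."""
    log_n = math.log2(N)
    return c1 * N**2 * log_n + c2 * N**2 + c3 * N**2 * log_n

N = 256
ratio = direct_polar_cost(N, N) / fourier_method_cost(N)
print(f"N = {N}: direct / Fourier operation ratio is about {ratio:.0f}")
```

With unit constants the ratio grows like $N^2/(2\log_2 N + 1)$, which is why the interpolation step is accepted despite the error it introduces.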

Example: A simple model of a brain. A simple example of image reconstruction from projections is shown in Figure 7.25. The desired image is shown on the left. It consists of a thin, hollow, high density ellipse filled with low density material. Within the low density material are several elliptical regions of various densities. This simple model, modified from [125], might represent the problem of imaging a human brain. The "data," or projections, are shown on the right. There are 64 projections, each sampled at 64 points in $[-1,1]$. They were obtained by sampling the analytic Radon transform of the model, which is easy to compute (see problem 171 for a similar, although simpler, model). Finally, the reconstructed image is shown (using a 64 x 64 grid) in Figure 7.26. The reconstruction has good overall quality, and all the features of the "exact" image have been essentially resolved. Evident in the reconstructed image are reconstruction artifacts, features that appear as negative-image reflections and curves. These phenomena are characteristic of reconstructions by the Fourier method, and are caused by wrap around (leakage) and aliasing. They may be minimized by padding the data with zeros prior to processing, similar to the techniques applied in the FK migration procedure [76], [104].

FIG. 7.24. A map showing the relationships of the four computational arrays of the image reconstruction problem. The input data is the set of projections, $g_{nj}$, shown in the upper left corner. The reconstructed image is $u_{qs}$, in the lower left. We arrive at the reconstruction by way of the frequency domain. First, the one-dimensional DFT is taken along each projection, resulting (according to the Central Slice Theorem) in the polar-grid representation in frequency, $G_{kj}$ (upper right). This is interpolated to a Cartesian grid, giving $G_{mp}$, the two-dimensional DFT of the desired image (lower right). Finally, a two-dimensional IDFT leads back from the frequency domain, and yields the reconstructed image.

FIG. 7.25. A simple example of the process of image reconstruction from projections is shown. The desired image is shown on the left. The "data," or projections, are shown on the right. Here the data are presented in image format, where the value of the Radon transform is represented by the shading, with white representing large values and black representing small values. The bottom edge of the projection data plot parallels the $p$ axis, while the left-hand edge parallels the $\phi$ axis. The reconstructed image is shown in Figure 7.26.

FIG. 7.26. The reconstructed image from the data shown in Figure 7.25 is displayed here, on a 64 x 64 grid. The reconstruction is generally good, and all the features of the "exact" image have been essentially resolved. Reconstruction "artifacts," features that appear as negative-image reflections and curves, are present in the reconstruction, although they are not prominent. These phenomena are characteristic of reconstructions by the Fourier method, and are caused by wrap around (leakage) and aliasing.



The Radon transform literature is confusing when it comes to the topic of error analysis. Much of the error analysis that is done is heuristic in nature, and different workers hold differing views about the types and importance of the errors in reconstruction algorithms. In the Fourier reconstruction literature, the most commonly cited sources of error are [92], [129], [130]:

1. undersampling of the projections, $g(p,\phi)$;

2. error in approximating the Fourier transforms of the projections, $\hat{g}(\omega,\phi)$;

3. truncation of the frequency domain;

4. interpolation error in $\mathcal{I}$;

5. undersampling of the frequency data, $\hat{u}(\omega_x,\omega_y)$;

6. error in approximating $u$ by $\mathcal{F}_2^{-1}$.

The general consensus among workers in this field is that the most significant source of error in the algorithm is the operator $\mathcal{I}$, which maps the frequency domain data from a polar grid to a Cartesian grid. Many interpolation schemes have been proposed in the literature, including nearest-neighbor interpolation [78], [104], a simple bilinear scheme [78], bi-Lagrangian interpolation [76], and schemes based on the Shannon Sampling Theorem [102], [103], [104], [129], [130]. The discussions tend to be qualitative, although careful error analysis is included in [76], [102], and [104]. Typically, the image reconstructions based on the Fourier transform are characterized by several types of reconstruction errors called artifacts. The more sophisticated interpolations based on the sampling theorem [131] generally produce much better images than those made using simpler interpolation methods; however, they also tend to be quite expensive to compute.

An alternate approach to the interpolation problem can be taken by devising a reconstruction algorithm based on discretizing (7.48) directly, that is, by performing the inverse Fourier transforms in polar coordinates. This can be done efficiently by treating the polar grid as an "irregular Cartesian grid," and employing methods for computing FFTs on irregular grids. Such methods are fairly recent additions to the FFT family. Irregular-grid FFTs were developed for the Radon transform problem in [76] and [77], while a more general and exhaustive treatment may be found in [54].

The sources of error listed above are generally treated individually in the literature. In [76], however, it is shown that a bound on the error of the reconstruction consists of the sum of three terms, one arising from each of the three steps in the algorithm: forward transform, interpolation, and inverse transform. As we have seen, the error in the transforms in either direction is bounded by $N^{-p}$ for some $p > 0$, depending on the smoothness of the data. We may expect, then, that the errors in these two steps can be controlled by acquiring more data, that is, by increasing $N$. Of more concern, however, is the error of interpolation. For simple polynomial interpolation [76], the error is $O(\Delta\omega^r)$, where $r$ is some positive number related to the order of the interpolation (e.g., linear, cubic, etc.). This error is also dependent on $\Delta\phi$; however, the transforms are typically quite smooth in the radial direction, and there tends to be little error generated by the angular interpolation. What makes the radial interpolation error so troublesome is that, unlike the transform error, we cannot expect that simply using more data will help. In fact, from the reciprocity relation

$$\Delta p\,\Delta\omega = \frac{1}{N}$$

we observe that doubling the number of samples on each projection by cutting the spatial sample rate does not alter the frequency sample rate at all. The quality of the radial interpolation cannot improve by this technique, as the same samples are used to interpolate frequencies even with smaller values of $\Delta p$. There are ways to improve the resolution in the frequency domain, but they involve using and understanding the reciprocity relations which appear yet again in a fundamental way.
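The arithmetic behind this observation is short enough to check directly. The following is a minimal sketch with a hypothetical helper `frequency_spacing`; the zero-padding case is included as one assumed illustration of how lengthening the sampled interval, rather than refining it, changes $\Delta\omega$.

```python
def frequency_spacing(num_samples, dp):
    """Reciprocity relation dp * dw = 1/N, solved for dw."""
    return 1.0 / (num_samples * dp)

# 64 samples on [-1, 1], then 128 samples on the same interval.
dw_64 = frequency_spacing(64, 2.0 / 64)
dw_128 = frequency_spacing(128, 2.0 / 128)

# 64 samples padded with zeros to 128 points: same dp, interval [-2, 2].
dw_padded = frequency_spacing(128, 2.0 / 64)

print(dw_64, dw_128, dw_padded)
```

Halving $\Delta p$ on a fixed interval leaves $N\Delta p = 2$ unchanged, so $\Delta\omega$ stays at 1/2; only a longer interval reduces it.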

7.5. Problems

Boundary Value Problems

140. Some difference equation BVPs. Consider the difference equation

subject to the boundary conditions

(six BVPs in all). In each case, use the appropriate form of the DFT with either the component perspective or the operational perspective. If necessary, express the solution in terms of the coefficients $U_k$.

141. Matrix perspective. Consider the three matrices A given by

that correspond to the difference equation $-u_{n-1} + 2u_n - u_{n+1} = f_n$, with $N = 4$, with Dirichlet, periodic, and Neumann boundary conditions, respectively. In each case find the matrix $P$ that diagonalizes the given matrix and the eigenvalues that lie on the diagonal of the matrix $D = P^{-1}AP$.

142. The trick for nonhomogeneous boundary conditions. Consider the BVP

Solve this equation with


subject to the boundary conditions $u_0 = \alpha$, $u_N = \beta$, where $\alpha$ and $\beta$ are given real numbers. Show that the vector

satisfies the same BVP with a different input vector $f_n$ and homogeneous boundary conditions ($v_0 = v_N = 0$). How would you apply this idea to a BVP with nonhomogeneous Neumann boundary conditions $u_1 - u_{-1} = \alpha$, $u_{N+1} - u_{N-1} = \beta$?

143. Solving BVPs with nonhomogeneous boundary conditions. Use the techniques of the previous problem to solve the difference equation

subject to the two sets of boundary conditions

144. A DST. Verify that the DST of the constant vector $f_n = 1$ is

for $k = 1 : N-1$, as claimed in this chapter.

145. Shift properties. Given the sequence $u_n$, let $u_{n+1}$ and $u_{n-1}$ denote the sequences obtained by shifting $u_n$ right and left one unit, respectively. Show that if $\mathcal{S}$, $\mathcal{C}$, and $\mathcal{D}$ represent the DST, the DCT, and the DFT, respectively, on $N$ points, then

146. Component perspective for the DCT. Use the component perspective described in the chapter to derive the method of solution for the BVP

for $n = 0 : N$, subject to the homogeneous Neumann boundary condition $u_1 - u_{-1} = 0$, $u_{N+1} - u_{N-1} = 0$.

147. Operational perspective for the DFT. Use the operational perspective described in this chapter to derive the method of solution for the BVP

for $n = 0 : N-1$, subject to periodic boundary conditions $u_0 = u_N$, $u_{-1} = u_{N-1}$.

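148. Finite difference approximations. Assume that $\phi$ is at least four times differentiable in the interval $(x - \Delta x, x + \Delta x)$. Show that $\phi'(x)$ and $\phi''(x)$ may be approximated by the following finite difference approximations with the indicated truncation errors:

$$\phi'(x) = \frac{\phi(x+\Delta x) - \phi(x-\Delta x)}{2\Delta x} + c\,\Delta x^2, \qquad \phi''(x) = \frac{\phi(x+\Delta x) - 2\phi(x) + \phi(x-\Delta x)}{\Delta x^2} + d\,\Delta x^2,$$

where $c$ and $d$ are constants independent of $\Delta x$. (Hint: Expand $\phi(x \pm \Delta x)$ in Taylor series.)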

149. Neumann and periodic boundary conditions. Following the development of the chapter, describe the DCT and DFT methods for approximating solutions of the BVP

subject to the Neumann and periodic boundary conditions

(Hint: Use the approximation from problem 148, $\phi'(0) \approx (u_1 - u_{-1})/(2\Delta x)$, with a similar expression for $\phi'(A)$.)

150. Two-dimensional shift properties. The solution of two-dimensional BVPs is usually easier using the operational perspective, which requires the relevant shift properties. Verify the following shift properties for the two-dimensional DST ($\mathcal{S}$), DCT ($\mathcal{C}$), and DFT ($\mathcal{D}$).

151. A two-dimensional BVP with periodic boundary conditions. Consider the two-dimensional BVP

subject to the periodic boundary conditions

for m = 0 : M - 1, n = 0 : N - 1.

(a) Show that a solution of the form

satisfies the boundary conditions.


(b) Show that if $f_{mn}$ is given a similar representation with coefficients $F_{jk}$, then the coefficients of the solution are given by

for j = 0 : M - 1, k = 0 : N - 1.

152. A fourth-order BVP. Consider the fourth-order BVP

for $n = 1 : N-1$, subject to the boundary conditions

(corresponding to $\phi(0) = \phi(A) = \phi''(0) = \phi''(A) = 0$ for an ODE). Find the appropriate discrete transform that satisfies the boundary conditions and use it to find the coefficients of the solution $u_n$ in terms of the coefficients of the input $f_n$.

153. Solution to the gossip problem. Consider the solution

for $n = 0 : N-1$, given to the gossip problem in the text. How would you analyze and simplify this solution analytically?

(a) Verify that the solution is $N$-periodic: $u_n = u_{n+N}$.

(b) Show that the solution is even on the interval $[0, N]$: $u_n = u_{N-n}$.

(c) Use the analytical DFT of the triangular wave given in The Table of DFTs of the Appendix to show that the solution $u_n$ consists of two linear segments and may be expressed as $u_n = |n - N/2| - N/4$. Plot the solution for various values of $N$.

154. A model of altruism. The following model suggests how diffusion might be used (or perhaps should not be used) to describe the distribution of wealth within a multicomponent system. Imagine a collection of $N+1$ economic units (people, families, towns, cartels) that may have supplies and demands of income. The underlying law of altruism is that each unit must at all times attempt to distribute its wealth equally among its two nearest neighbors (longer range interactions could also be considered), so that in a steady state the wealth of each unit is the average of the wealth of the two neighboring units.

(a) Argue that such a system would satisfy the following difference equation in a steady state:

where $u_n$ is the steady state wealth of the nth unit, and $f_n$ represents external sources or sinks of wealth for the nth unit.

(b) Interpret the boundary conditions $u_1 = u_{-1}$, $u_{N+1} = u_{N-1}$.

(c) Argue mathematically and physically that a solution can exist only if

(d) Find a solution for this problem for an arbitrary input $f_n$ and for the specific input $f_n = \sin(2\pi n/N)$.

155. Improvement to DFT fast Poisson solvers. Consider the partial difference equation (7.16)

subject to the boundary conditions

where $m = 1 : M-1$ and $n = 1 : N-1$. Rather than applying two sweeps of DFTs (one in each direction), an improvement can be realized if the DFT is applied in one direction only and tridiagonal systems of equations are solved in the remaining direction. Here is how it works.

(a) Assume a solution to the BVP of the form

for $m = 1 : M-1$ and $n = 1 : N-1$. Notice that this amounts to applying a DST in the m- (or x-) direction only. The right-hand sequence $f_{mn}$ is given a similar representation with coefficients

for $j = 1 : M-1$ and $n = 1 : N-1$. Now substitute these representations for $u_{mn}$ and $f_{mn}$ into the BVP. After simplifying and collecting terms, show that the BVP takes the form

where

and the coefficients $U_{jn}$ are the new unknowns.

(b) This equation must hold for all $m = 1 : M-1$ and $n = 1 : N-1$, which means that each term of the sum must vanish independently. Therefore, for each fixed $j = 1 : M-1$ the $U_{jn}$'s must satisfy the linear equations

for $n = 1 : N-1$. Notice that the work entailed in this step is the solution of $M-1$ tridiagonal systems each of which has the $(N-1)$ unknowns $U_{j1}, \ldots, U_{j,N-1}$.


(c) Once the coefficients $U_{jn}$ have been determined the solution can be recovered by performing inverse DSTs of the form

for $m = 1 : M-1$ and $n = 1 : N-1$. (The factors of 2 and $1/M$ in the forward and inverse transforms can be carefully combined.)

(d) Using the fact that the cost of solving an $N \times N$ symmetric system of tridiagonal equations is roughly $3N$ operations and that an $N$-point FFT costs approximately $N\log N$ operations, find the cost of this modified method when applied to a partial difference equation with $MN$ unknowns. In particular, show that the modified method offers savings over the DFT method proposed in the text. Show that the method can also be formulated by doing the DSTs in the $n$ direction and solving tridiagonal systems in the $m$ direction. If $M > N$ which strategy is more efficient?

Digital Filters

156. Discrete low-pass filter. Show that the discrete time domain representation of the low-pass filter

157. Discrete low-pass filter. Show that the generalized low-pass filter

has a time domain representation given by

(Hint: Use symmetry to convert the Fourier integral to a cosine integral and integrate by parts.)

158. Square pulse. Verify that the 24-point DFT of the square pulse filter hn

with components

is given by

is given by

159. Amplitude-phase relations. Assume that a discrete filter $h_n$ is applied to a signal $f_n$ in the form of a convolution to produce the filtered signal $g_n = f_n * h_n$. Letting $F_k$, $G_k$, and $H_k$ represent corresponding DFTs of these signals, show that

where $\phi_F$, $\phi_G$, and $\phi_H$ are the phases of $F_k$, $G_k$, and $H_k$, respectively.

160. Time-shifting filters. Show that the time-shifting filter $h(t)$ that takes the input $f(t)$ into the output $g(t) = f(t - t_0)$ has the properties $|\hat{h}(\omega)| = 1$ and

161. Continuous to discrete filters. Given the parameters $\omega_1$, $\omega_2$, $\omega_3$, and $\omega_4$, devise the discrete versions of the generalized band-pass filter, $\hat{h}_B(\omega; \omega_1, \omega_2, \omega_3, \omega_4)$, the high-pass filter, $\hat{h}_H(\omega; \omega_1, \omega_2)$, and the notch filter, $\hat{h}_N(\omega; \omega_1, \omega_2, \omega_3, \omega_4)$, given in the text. Is it possible to find the time domain representations of these filters?

162. Window functions. The process of filtering a signal is closely related to the process of truncating a signal. As shown in the text, the use of a square pulse (rectangular window) to truncate a signal has some unpleasant side-effects. Consider the following window functions on the interval $[-A/2, A/2]$.

(a) Bartlett (triangular) window:

(b) Hanning (cosine) window:

(c) Parzen window:

In each case plot the window function in the time domain. Then, either analytically, numerically, or with the help of The DFT Table in the Appendix, find the frequency domain representation of each window function. How do the properties of the frequency domain representations (such as the width of the central lobe and the amplitude of the side lobes) compare with each other and with the rectangular window?


FK Migration

163. Snell's law. Suppose an incident plane wave arrives at a flat interface between two media having densities $\rho_1$ and $\rho_2$ and sound velocities $c_1$ and $c_2$, respectively. Assume that the wavefront arrives with angle of incidence $0 < \theta_1 < \pi/2$. Using a straightedge, a compass, and a bit of trigonometry, use Huygens' principle to derive Snell's law, and to show that the angle of reflection equals the angle of refraction.

164. Dispersion relation. Given the dispersion relation $\omega^2 = c^2(k_x^2 + k_z^2)$, verify that

165. Derivative properties of the Fourier transform. Using the definition of the Fourier transform given in the text verify (using integration by parts) the following Fourier transform derivative properties.

166. Plane waves in space-time. Consider a function of the form

(or consider the real and imaginary parts of this function). Using the dispersion relation $\omega^2 = c^2(k_x^2 + k_z^2)$ show that this function is constant along the lines in the xz-plane

and may be interpreted as a wave propagating in the direction given by the unit vector

167. Inverse DFT. Verify that with the grid parameters, and $\Delta k_z = 1/D$, the IDFT given in (7.43)

gives an approximation to the function

at the spatial grid points $(x_j, z_n)$.

Image Reconstruction

168. X-ray paths. The Radon transform consists of line integrals in the xy-plane along lines given by

where $p$ is a real number and $0 \le \phi < \pi$ is an angle measured counterclockwise from the positive x-axis. For each of the following choices of $p$ and $\phi$, sketch the corresponding line and interpret $p$ and $\phi$.

169. Analytical Radon transforms. Let $u(x,y) = e^{-(x^2+y^2)}$. Show that the Radon transform of $u(x,y)$ is given by

(Hint: Use the definition of the Radon transform given in equation (7.45), and the change of variables

which is simply a rotation of axes. Then perform the integration, invoking the property of the delta function.)

170. Properties of the Radon transform. Let $\boldsymbol{\xi} = (\cos\phi, \sin\phi)^T$ be the unit vector specifying the $p$-axis for a given angle $\phi$. Then the line specified by $p = x\cos\phi + y\sin\phi$ is also specified by $p = \mathbf{x}\cdot\boldsymbol{\xi}$, where $\mathbf{x} = (x,y)^T$. The Radon transform may be written as

(a) Show that the Radon transform has a scaling property, that is, show that

where $a$ is a scalar. Show that the evenness property can be obtained from the scaling property by setting $a = -1$.

(b) Show that the Radon transform has a linear transformation property. Let

be any nonsingular matrix and let $B = A^{-1}$. Show that the Radon transform of a function $u(A\mathbf{x})$ is related to the transform of $u(\mathbf{x})$ by


(c) Let $w(\mathbf{x})$ be the characteristic function of an ellipse, given by

The ellipse can be generated from a circle by a linear change of variables, that is,

where $u(\mathbf{x})$ is the characteristic function of a disk given by (7.46), and

Use the scaling and linear transformation properties to show that the Radon transform of the characteristic function of an ellipse is given by

171. Model problem. Using linearity and the results of problem 170, determine the Radon transform of a "skull model" consisting of a thin, high density elliptical shell containing lower density material. That is, if $w(x,y)$ is the characteristic function of an ellipse and $z(x,y)$ is the characteristic function of a slightly smaller confocal ellipse, the Radon transform of the skull model may be computed by $[\mathcal{R}w](p,\phi) - 0.75[\mathcal{R}z](p,\phi)$. Sketch profiles of the resulting Radon transform corresponding to several different values of $\phi$.

where

Chapter 8

Related Transforms

8.1 Introduction

8.2 The Laplace Transform

8.3 The z-Transform

8.4 The Chebyshev Transform

8.5 Orthogonal Polynomial Transforms

8.6 The Hartley Transform

8.7 Problems

Training is everything. The peach was once a bitter almond; cauliflower is nothing but cabbage with a college education.

- Mark Twain

8.1. Introduction

Like the Zen image of a finger pointing at the moon, this chapter is intended to point the way toward other abundant lands that are related in some way to the DFT. But, as in the Zen lesson, woe to those who mistake this brief discussion for a complete treatment of the vast subject of related transforms. The most we can possibly do is open a few doors, suggest some guiding references, and offer a glimpse of what lies beyond. The transforms included in this section were chosen for one of two reasons. The Laplace and Chebyshev transforms appear because they can be reduced, in their discrete form, to the DFT (and hence the FFT). On the other hand, Legendre and other orthogonal polynomial transforms, while not reducible to the DFT, share many analogous properties and applications, and are also worthy of recognition at this time. So with apologies for brevity, but with hopes of providing a valuable reconnaissance of the subject, we shall begin.

8.2. The Laplace Transform

Used to solve initial value problems that arise in mechanics, electrodynamics, fluid dynamics, and engineering systems analysis, the Laplace transform is one of the fundamental tools of applied mathematics. It is a prototype for many transform methods and is often the first transform that students encounter. Laplace¹ enunciated the transform that bears his name in his 1820 treatise Théorie analytique des probabilités and used it to solve difference equations. With a few notational changes, here is the transform that Laplace devised. Given an integrable function $h$ on the interval $(0,\infty)$, its Laplace transform is

$$H(s) = \mathcal{L}\{h(t)\} = \int_0^\infty h(t)\,e^{-st}\,dt. \tag{8.1}$$

The choice of $t$ as the independent variable reflects the most common situation in which $h$ is a time-dependent or causal function. The transform variable $s$ is complex, and if the input $h$ decays such that $\lim_{t\to\infty} h(t)e^{-bt} = 0$ for some real number $b$, then $H$ is defined for all complex values of $s$ with $\mathrm{Re}\{s\} > b$. The forward transform (8.1) may be evaluated analytically for many commonly occurring functions; it has also been tabulated extensively [3], and it may be accurately approximated using numerical methods [119].

The inversion of the Laplace transform is a more challenging problem. Given a function $H(s)$ of a complex variable, the inverse Laplace transform, $h(t) = \mathcal{L}^{-1}\{H(s)\}$, is defined in terms of a contour integral in the complex plane. In this section we will present a method for inverting the Laplace transform that appeals to the Fourier transform. Not surprisingly, the actual implementation of the method implicates the DFT (and hence the FFT). The method, attributed to Dubner and Abate [50], is analyzed in the valuable paper of Cooley, Lewis, and Welch [41].

The key to most methods for the numerical inversion of the Laplace transform is to express the complex transform variable in the form $s = c + i2\pi\omega$, where $c$ and $\omega$ are real. In so doing, $e^{-st} = e^{-ct}e^{-i2\pi\omega t}$ is, for fixed $t$, periodic in $\omega$ with period $1/t$. The forward transform (8.1) then becomes

$$H(c + i2\pi\omega) = \int_0^\infty h(t)\,e^{-ct}\,e^{-i2\pi\omega t}\,dt.$$

For a fixed value of $c$, this expression defines a Fourier transform relation between the functions $F(\omega) = H(c + i2\pi\omega)$ and

$$f(t) = \begin{cases} h(t)\,e^{-ct} & \text{if } t \ge 0, \\ 0 & \text{if } t < 0. \end{cases}$$

¹ Born in poverty, PIERRE SIMON LAPLACE (1749-1827) spent most of his life in Parisian prosperity once d'Alembert recognized his talents and assured him a faculty position at École Militaire at a young age. His scientific output was prodigious; his most notable contributions were to the subjects of astronomy, celestial mechanics, and probability. Laplace's equation, which governs the steady state potential of many physical fields, was proposed in 1785.

To compute the inverse Laplace transform of a given function $H$, one could, in principle, proceed by

• choosing a value of $c$ and sampling $F$ at selected values of $\omega$,

• applying the inverse DFT to the samples of $F$ to approximate $f$ at selected grid points, and

• computing the values of $h$ at the sample points from $h(t) = f(t)\,e^{ct}$.

However, there are some subtleties, one of which is the choice of the parameter $c$ that has suddenly appeared. Therefore, a few more remarks are still in order.

Since the goal is to construct the function $h$ on an interval $[0, A]$ in the time domain, the first choice is the value of $A$. Once a value of $A$ and a number of grid points $N$ are selected, then the reciprocity relation $A\Omega = N$ determines the extent of the frequency interval $\Omega$. We will assume for the moment that an appropriate value of the parameter $c$ can be chosen. Then the function $F(\omega) = H(c + i2\pi\omega)$ is sampled at $N$ equally spaced points of the interval $[-\Omega/2, \Omega/2]$ to produce the samples $F_k = F(\omega_k)$. Notice that average values of $F$ must be used at endpoints and discontinuities (AVED). In particular, this means that

$$F_{N/2} = \frac{1}{2}\left[F\!\left(-\frac{\Omega}{2}\right) + F\!\left(\frac{\Omega}{2}\right)\right] = \mathrm{Re}\left\{F\!\left(\frac{\Omega}{2}\right)\right\},$$

where we have used the fact that if $h$ is a real-valued function (which is often the case), then $F$ is a conjugate symmetric function with $F(-\omega) = F^*(\omega)$.

With $N$ samples of $F$ properly generated, the IDFT can be used to produce approximations $f_n$ to $f(t_n)$. The grid points are $t_n = nA/N = n/\Omega$, where $n = 0 : N-1$. Most likely some rearrangement of the DFT output (or input) will be needed to reconcile the two index sets $k = -N/2 + 1 : N/2$ and $n = 0 : N-1$. As we have seen, the periodicity of the input and output sequences allows either sequence to be shifted. The final step is to calculate the approximations to $h(t_n)$ from the relationship $h(t_n) = f(t_n)\,e^{ct_n}$. We will denote these approximations $h_n = f_n e^{ct_n}$.

We can now comment qualitatively on the role of the parameter $c$. In theory, $c$ can be taken as any real number greater than the real part of the largest pole of $H$. But in practice, there are some numerical considerations. A large value of $c > 0$ has the beneficial effect of making the output sequence $f_n = h(t_n)\,e^{-ct_n}$ decay rapidly, thus reducing the overlapping in the replication of $f$ that inevitably takes place in the time domain. As we observed earlier, this error can also be reduced by decreasing $\Delta\omega$, which increases the period of the replication of $f$. On the other hand, if $c$ is large, errors in the computed sequence $f_n$ will be magnified when multiplied by $e^{ct_n}$ to produce the final sequence $h_n$. A nice analysis of this optimization problem with respect to the parameter $c$ can be found in the Cooley, Lewis, and Welch paper [41]. Rather than try to reproduce it here, we will resort to a numerical demonstration to show the effect of different choices of $c$.

Example: Numerical inversion of the Laplace transform. A family of convenient test problems for the Laplace transform inversion is given by $h(t) = t^{n-1}e^{at}/(n-1)!$, which has the Laplace transform $H(s) = (s - a)^{-n}$, where $a$ is any real number and $n \ge 1$ is an integer (we agree that $0! = 1$). We will exercise the method outlined above on this problem with $n = 2$ and $a = -2$. We have computed approximations to the inverse Laplace transform $h$ on two different intervals $[0,2]$ and $[0,4]$ using several different values of $c$ and $N$. In each case an error was determined using the exact values of $h$ and the error measure

The graphs of Figure 8.1 offer a concise summary of several pages of numerical output from this experiment. The four curves in this figure show how the errors in the approximations vary with respect to the parameter $c$ in the four cases $A = 2, 4$ and $N = 128, 256$. Several observations should be made. First note the sensitive dependence of the errors on the choice of $c$ for fixed values of $A$ and $N$. In each case there is a narrow interval of optimal values of $c$, and straying from this interval degrades the approximations significantly. Furthermore, the optimal value of $c$ varies considerably with the particular choice of $A$ and $N$. This relationship $c = c(A, N)$ is of great practical interest; however, it appears that in general it must be determined experimentally. It should be said that once an optimal value of $c$ is determined, good approximations to the inverse Laplace transform can be found with a predictable decrease in errors as $N$ increases. In this particular case, with a discontinuity in the function $F$, errors decrease approximately as $N^{-1}$, assuming that optimal values of $c$ are chosen for each $N$.
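The inversion procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program that produced Figure 8.1: it assumes NumPy is available, the function name `invert_laplace_dft` and its arguments are our own, and NumPy's `ifft` carries a factor of $1/N$ that is compensated by the factor $N/A = \Omega$ below.

```python
import numpy as np

def invert_laplace_dft(H, A, N, c):
    """Approximate h(t_n) on t_n = n*A/N, n = 0..N-1, from samples of H(s).

    H : callable accepting a complex NumPy array (hypothetical interface)
    A : length of the time interval [0, A]
    N : number of grid points (assumed even)
    c : real shift parameter, s = c + i*2*pi*omega
    """
    # Integer frequency indices in FFT order: 0, 1, ..., N/2-1, -N/2, ..., -1
    k = np.fft.fftfreq(N, d=1.0 / N)
    omega = k / A                          # omega_k = k * (1/A), since A * Omega = N
    F = H(c + 2j * np.pi * omega)          # F_k = H(c + i 2 pi omega_k)
    # AVED at the endpoint: average of F(-Omega/2) and F(Omega/2) is Re F(Omega/2)
    F[N // 2] = F[N // 2].real
    t = np.arange(N) * A / N               # t_n = n A / N
    # f_n ~ (1/A) * sum_k F_k exp(i 2 pi n k / N); ifft supplies 1/N, so scale by N/A
    f = (N / A) * np.fft.ifft(F)
    return t, f.real * np.exp(c * t)       # h_n = f_n * exp(c t_n)

# Test problem from the text: h(t) = t exp(-2t), H(s) = (s + 2)^(-2)
t, h_approx = invert_laplace_dft(lambda s: (s + 2.0) ** -2, A=2.0, N=128, c=1.0)
print(np.max(np.abs(h_approx - t * np.exp(-2.0 * t))))
```

Rerunning the last two lines for a range of values of `c` gives a rough feel for the sensitivity discussed above; the value `c=1.0` is an arbitrary choice for the demonstration, not an optimal one.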

In closing we mention another now-classical method for the numerical inversion of the Laplace transform that is also described in Cooley, Lewis, and Welch [41], [111], [161], [163]. It assumes that the function $h(t)$ can be expanded in a series of Laguerre² polynomials (suggested by the exponential kernel $e^{-st}$ of the Laplace transform). Perhaps surprisingly, the method ultimately leads to a Fourier series whose coefficients must be approximated. Hence the DFT makes another appearance in this method as well. There are also more recent methods for the inversion of the Laplace transform. The fractional Fourier transform can be applied to this computation [8] and wavelet (multipole) methods have also been proposed [10].

FIG. 8.1. The four curves show how the error (on the vertical axis) in approximations to the inverse Laplace transform varies with the parameter $c$ (on the horizontal axis) in four cases: $A = 2$, $N = 128$ (dashed line), $A = 2$, $N = 256$ (solid line), $A = 4$, $N = 128$ (dash-dot), and $A = 4$, $N = 256$ (dotted line). The test problem is the approximation of $h(t) = te^{-2t}$ from its Laplace transform $H(s) = (s + 2)^{-2}$. Note how the errors depend quite sensitively on the choice of $c$ for fixed values of $A$ and $N$.

² EDMOND NICOLAS LAGUERRE (1834-1886) published over 140 papers in his lifetime, over half of which were in geometry. After serving in the army for ten years, he was a tutor at the École Polytechnique and a professor in the Collège de France. He is best known for the family of orthogonal polynomials named in his honor.

8.3. The z-Transform

At this point, it would be tempting to move ahead into new waters; but we are at an exquisitely critical juncture in which several pieces of a large picture are about to come together. Throughout this book we have explored the relationship between the DFT and the Fourier transform in some detail. We have just introduced the Laplace transform, illustrated its relationship to the Fourier transform, and found the (perhaps surprising) manner in which the DFT may be used to approximate the Laplace transform. If we were to stand back and look at the DFT, the Fourier transform, and the Laplace transform from a distance, we might see an arrangement something like the following diagram:

    Fourier transform        Laplace transform

    DFT                      ???

The three double-headed arrows indicate pathways that we have already traveled, and it might be tempting to conclude that the picture is complete. However, the connection between the DFT and the Laplace transform that was presented in the previous section was made via the Fourier transform. One might wonder whether the Laplace transform has its own discrete transform, and, if it does, whether it can be related to the DFT. The answer to both questions is affirmative, and the missing piece (denoted ???) of the above diagram is called the z-transform. Besides the aesthetics of completing the above picture, there are many reasons to study the z-transform. One reason is its extreme utility in signal processing and systems analysis applications [108]. Equally important is the fact that it provides yet another link to the DFT.


The z-transform is different from the DFT in some fundamental ways. It might be called a semidiscrete transform since it is discrete in one direction and continuous in the other. To define the forward transform, we begin with a sequence of possibly complex numbers $u_n$, where $n = 0, 1, 2, 3, \ldots$. The z-transform of this sequence is given by

$$U(z) = \mathcal{Z}\{u_n\} = \sum_{n=0}^{\infty} u_n z^{-n}, \tag{8.2}$$

where $z$ is a complex variable, subject to some restrictions. The operator notation $\mathcal{Z}$ denotes the process of taking the z-transform. We see that the z-transform is the function that has the $u_n$'s as coefficients in its power series of negative powers of the variable $z$. As we will see momentarily, this power series is not defined at $z = 0$; therefore it will always converge in a region that excludes the origin. We cannot deny that the definition (8.2) has dropped from the sky with very little justification. However, it is a brief exercise (problem 181) to show that this definition can be derived as the discrete analog of the Laplace transform

$$H(s) = \int_0^\infty h(t)\,e^{-st}\,dt$$

after the change of variable $z = e^s$ is made. For the moment, let's take this definition as given and use it to compute some z-transforms.

Example: z-transform of a step sequence. Consider the sequence

$$u_n = 1 \quad \text{for } n = 0, 1, 2, 3, \ldots.$$

Applying the definition of the z-transform we have that

$$U(z) = \sum_{n=0}^{\infty} z^{-n} = \sum_{n=0}^{\infty} \left(\frac{1}{z}\right)^{n}. \tag{8.3}$$

At this point we appeal to a hopefully familiar result that gets plenty of use in the z-transform business: the geometric series. The general form of the geometric series that we will need repeatedly is

$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x} \quad \text{provided } |x| < 1.$$

Using this result in (8.3) gives us our first z-transform:

$$U(z) = \frac{1}{1 - 1/z} = \frac{z}{z - 1}.$$

This transform is valid provided that the convergence condition of the geometric series is met. Therefore, we must require $|1/z| < 1$ or $|z| > 1$, which means that this transform is valid outside of the unit circle centered at the origin in the complex plane.

Example: Geometric sequence. In this case consider the sequence $u_n = a^n$, where $a$ is any complex constant and $n = 0, 1, 2, 3, \ldots$. Applying the definition of the z-transform, we have that

$$U(z) = \sum_{n=0}^{\infty} a^n z^{-n} = \sum_{n=0}^{\infty} \left(\frac{a}{z}\right)^{n}.$$

Appealing to the geometric series again, we have that

$$U(z) = \frac{1}{1 - a/z} = \frac{z}{z - a}$$

for $|z| > |a|$. Notice that as in the previous example, this z-transform is valid only in a certain region of the complex plane: in this case, for all points outside of a circle of radius $|a|$ centered at the origin.
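The two closed forms just derived are easy to check numerically by summing many terms of the defining series at a point outside the circle of convergence. The short Python sketch below does this; the helper name `z_transform_partial_sum`, the test point $z = 2 + i$, and the choice of 200 terms are our own illustrative assumptions.

```python
def z_transform_partial_sum(u, z, terms):
    """Sum the first `terms` terms of sum_n u(n) * z**(-n)."""
    return sum(u(n) * z ** (-n) for n in range(terms))

z = 2.0 + 1.0j                      # |z| > 1, outside the unit circle

# Step sequence u_n = 1: the partial sums should approach z / (z - 1)
step = z_transform_partial_sum(lambda n: 1.0, z, 200)
print(abs(step - z / (z - 1)))

# Geometric sequence u_n = a**n with |a| < |z|: partial sums approach z / (z - a)
a = 0.5j
geom = z_transform_partial_sum(lambda n: a ** n, z, 200)
print(abs(geom - z / (z - a)))
```

Choosing a point with $|z| < |a|$ instead makes the partial sums grow without bound, which is the numerical face of the region-of-validity condition.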

Example: Sine and cosine sequences. The previous z-transform can be applied in the special case $a = e^{i\theta}$, where $\theta$ is a fixed real angle. We can then deduce that

$$\mathcal{Z}\{\sin n\theta\} = \frac{z\sin\theta}{z^2 - 2z\cos\theta + 1}.$$

This z-transform is valid provided $|z| > 1$. Similarly (problem 175),

$$\mathcal{Z}\{\cos n\theta\} = \frac{z(z - \cos\theta)}{z^2 - 2z\cos\theta + 1},$$

provided $|z| > 1$.

So far we have not mentioned anything about an inverse z-transform, which is the process of recovering the sequence $u_n$ from a given function $U(z)$. We will see shortly that it exists and what it looks like. However, there are several indirect approaches to finding the inverse z-transform. We will introduce the operator notation $\mathcal{Z}^{-1}$ to indicate the inverse z-transform. Therefore, if we are given a function $U(z)$, then its inverse z-transform is the sequence

$$u_n = \mathcal{Z}^{-1}\{U(z)\}$$

for $n = 0, 1, 2, 3, \ldots$. Here is an example to demonstrate one approach.

Example: An inverse z-transform. Not surprisingly, the geometric series can also be used to discover inverse z-transforms. Let's consider the function

where $a$ is a complex constant. The idea is to rewrite this function so that it can be identified as the sum of a geometric series. Notice that the z-transform involves negative powers of the variable $z$, that is, $(1/z)^n$. Looking for powers of $1/z$, a few steps of algebra lead us to

since a and


The geometric series we have written converges, provided $|z| > |a|$. It is now possible to pick out the coefficients of $z^{-n}$ and identify them as the terms of the sequence $u_n$. Doing this we see that

The strategy of using algebra to form a geometric series can be used endlessly to find inverse z-transforms. Here is a variation on the same theme.

Example: Inversion by long division. Very often, the power series in $z^{-1}$ needed to determine an inverse z-transform can be found simply by long division. For example, given the z-transform

it can be expanded in powers of $z^{-1}$ by long division to give

The coefficients in this power series give the inverse z-transform 3n + 1.
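Long division in powers of $z^{-1}$ is mechanical enough to hand to a computer. The following Python sketch is an illustration under our own assumptions: the function name `series_by_long_division` is hypothetical, and the rational function is supplied as numerator and denominator coefficients in ascending powers of $w = z^{-1}$. The two test cases use standard transform pairs, $z/(z - a) = 1/(1 - aw)$ with inverse $a^n$, and $z/(z-1)^2 = w/(1 - 2w + w^2)$ with inverse $n$; they are not the example from the text.

```python
def series_by_long_division(num, den, terms):
    """Expand num(w)/den(w) in ascending powers of w = 1/z by long division.

    num, den : coefficient lists in ascending powers of w, with den[0] != 0
    terms    : number of series coefficients u_0, u_1, ... to produce
    """
    # Working remainder, padded so every index touched below exists
    rem = list(num) + [0.0] * (terms + len(den))
    coeffs = []
    for n in range(terms):
        q = rem[n] / den[0]            # next quotient coefficient u_n
        coeffs.append(q)
        for j, d in enumerate(den):    # subtract q * w**n * den(w) from the remainder
            rem[n + j] -= q * d
    return coeffs

# z / (z - 1/2) = 1 / (1 - w/2): coefficients (1/2)**n
print(series_by_long_division([1.0], [1.0, -0.5], 4))
# z / (z - 1)**2 = w / (1 - 2w + w**2): coefficients n
print(series_by_long_division([0.0, 1.0], [1.0, -2.0, 1.0], 5))
```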

Example: Inversion by partial fractions. Another powerful tool for finding inverse z-transforms (just as it is for inverting Laplace transforms) is partial fraction decomposition. For example, the function

can be written in partial fractions as

Now the result of a previous example can be used to conclude that

Before turning to some important properties of the z-transform, we will provide a short table of z-transforms, some of which have already been derived as examples, others of which can be found in the problem section. Table 8.1 shows the input sequence $u_n$, the corresponding z-transform, and the region of the complex plane in which the transform is valid.

The z-transform, like all the transforms we have seen, has a variety of useful properties. Only two properties will concern us, and we enumerate those now.

TABLE 8.1. A short table of z-transforms.

1. Linearity. The z-transform is a linear operator, which means that if $a$ and $b$ are constants and $u_n$ and $v_n$ are sequences, then

$$\mathcal{Z}\{a u_n + b v_n\} = a\,\mathcal{Z}\{u_n\} + b\,\mathcal{Z}\{v_n\}.$$

In words, the z-transform of a sum of sequences is the sum of the z-transforms, and the z-transform of a constant times a sequence is that constant times the z-transform (problem 173).

2. Shift property. As with the DFT, the shift property of the z-transform is extremely important, particularly for the solution of difference equations. It is not difficult to derive this property, so let's do it. Let $u_{n+1}$ denote the sequence

$$\{u_1, u_2, u_3, \ldots\},$$

which is produced by shifting the sequence $u_n$ to the left one place. Then

$$\mathcal{Z}\{u_{n+1}\} = \sum_{n=0}^{\infty} u_{n+1} z^{-n} = z\sum_{n=1}^{\infty} u_n z^{-n} = z\left(\sum_{n=0}^{\infty} u_n z^{-n} - u_0\right) = zU(z) - zu_0.$$

We have used $U(z)$ to denote the z-transform of the original sequence $u_n$. We see that the z-transform of the shifted sequence is related directly to the z-transform of the original sequence and to the initial term of the original sequence $u_0$. This property should be compared to the property for the Laplace transform of the derivative of a function; the resemblance is not an accident!

A similar calculation can be used (problem 174) to show that the z-transform of the k-fold shifted sequence $u_{n+k}$ is given by

$$\mathcal{Z}\{u_{n+k}\} = z^k U(z) - \sum_{j=0}^{k-1} u_j z^{k-j},$$

where $k$ is any positive integer.

Solution of Initial Value Problems

In the first section of Chapter 7 the fundamental distinctions between boundary value problems and initial value problems were described, and then the DFT was used to solve boundary value problems. With the shift property in hand, we can now investigate how the z-transform is used to solve initial value problems. We will proceed by example and consider the second-order difference equation initial value problem

for $n \ge 2$, subject to the initial conditions $u_0 = 0$ and $u_1 = 3$. The goal is to find the sequence $u_n$ that satisfies both of the initial conditions (for $n = 0$ and $n = 1$) and the difference equation (for $n \ge 2$). The problem is called an initial value problem because the initial two terms of the unknown sequence are given, and the difference equation governs the evolution of the system for all other values of $n$.

The solution begins by taking the z-transform of both sides of the difference equation:

The fact that the z-transform is a linear operator is essential in this step. Now a liberal use of the shift property leads to

The two given initial conditions fit the needs of the shift property perfectly. Letting $u_0 = 0$, $u_1 = 3$ and using the z-transform of the sine sequence derived earlier, we can write

As with all transform techniques, the most immediate task is to solve for the transform of the unknown sequence; in this case, we must solve for $U(z)$. Notice that this is an algebraic problem, and its solution is

Having found the z-transform of the solution, the actual solution is only an inverse z-transform away. This looks like a grim task; however, the tricks learned in the previous examples will serve us well. First, a partial fraction decomposition of $U(z)$ leads us to

The first two terms in the partial fraction representation have inverses that we have already encountered. The third term is also a familiar z-transform whose inverse can be found in problem 176 and Table 8.1. Combining the three inverse z-transforms gives us the solution

While this expression is rather cumbersome, it can be verified that it produces the correct initial conditions ($u_0 = 0$ and $u_1 = 3$). Furthermore, it generates the sequence $\{0, 3, 3, 20, 48, \ldots\}$, which is precisely the sequence that results if the original difference equation is evaluated recursively. The benefit of the z-transform solution is that it provides a single analytical expression that can be used to find $u_n$ for any value of $n \ge 0$. The method of solution used for this initial value problem is perfectly general: it can be applied to any constant coefficient difference equation with initial conditions, of any order, homogeneous or nonhomogeneous, and it provides a solution at least up to the final inversion step. If the inverse z-transform step cannot be done analytically, then numerical methods must be used. This brings us to the point of this discussion: the relationship between the z-transform and the DFT.

The z-Transform and the DFT

We have spent several pages introducing the z-transform, its properties, and its use for the solution of initial value problems. All of this work has been analytical. It is now time to investigate numerical methods that must be used when the z-transform or its inverse cannot be determined exactly. This leads directly to the DFT.

Let's begin with an observation: for all of the examples of z-transforms done earlier, and for all of the z-transforms that appear in Table 8.1, there is a condition of the form $|z| > R_0$ that gives the region of validity of the z-transform in the complex plane. In other words, each z-transform is defined outside of a circle of radius $R_0$ in the complex plane. Therefore, we will take the definition of the z-transform

$$U(z) = \sum_{n=0}^{\infty} u_n z^{-n}$$

and evaluate it on a circle $C$ of radius $R > R_0$. Recall that a circle of radius $R$ in the complex plane can be parameterized as

$$z = Re^{i\theta}$$

for $0 \le \theta < 2\pi$, where $z$ is a point on the circle, and the angle $\theta$ is the parameter. This gives us the representation

$$U(Re^{i\theta}) = \sum_{n=0}^{\infty} u_n R^{-n} e^{-in\theta}.$$

We must now think about approximations to $U(z)$ that might be implemented on a computer. Said slightly differently, we must now imagine how this continuous z-transform can be made discrete. To begin with, we will be able to compute $U(z)$ only at a finite number of points. So, we will choose $N$ equally spaced sample points on the circle $C$ and denote them

$$z_k = Re^{i2\pi k/N} \quad \text{for } k = 0 : N-1.$$

Therefore,

$$U(z_k) = \sum_{n=0}^{\infty} u_n R^{-n} e^{-i2\pi nk/N}.$$

The other aspect of the z-transform that is not discrete is the infinite sum. A discrete form of the z-transform must involve a finite sum. Since we have chosen $N$ sample points $z_k$, it stands to reason that we should take $N$ terms of the z-transform sum. If we incorporate both of these discretizations (samples of $z$ and the finite sum) into the definition of the z-transform, we have

$$U(z_k) \approx \sum_{n=0}^{N-1} u_n R^{-n} e^{-i2\pi nk/N}.$$

Hopefully the conclusion of this little argument is clear. If we use our usual notation and let $U_k = U(z_k)$, we can write that

$$U_k \approx \sum_{n=0}^{N-1} \left(R^{-n}u_n\right) e^{-i2\pi nk/N} = N\,\mathcal{D}\{R^{-n}u_n\}_k \tag{8.4}$$

for $k = 0 : N-1$. In words, the z-transform of the (infinite) sequence $u_n$ can be approximated by applying the DFT to the auxiliary sequence $R^{-n}u_n$, where $n = 0 : N-1$. The only condition on $R$ is that it must exceed the radius of the critical circle $R_0$. Therefore, $R$ must be regarded as a parameter in this method precisely in the manner that the approximate inverse of the Laplace transform involved the parameter $c$.

The relationship between the z-transform and the DFT turns out to be remarkably straightforward. How do the respective inverse transforms come together? There are at least two ways to display the connection. The first is to formally invert the relationship (8.4). Taking inverse DFTs of both sides gives

$$u_n \approx \frac{R^n}{N}\,\mathcal{D}^{-1}\{U_k\}_n = \frac{R^n}{N}\sum_{k=0}^{N-1} U_k\, e^{i2\pi nk/N}.$$

This argument displays the relationship between the inverse z-transform and the DFT, but it does not quite give the entire account for a simple reason: we have yet to see the exact inverse z-transform! In all of the preceding examples, we conspired to find inverse z-transforms by devious means such as long division, geometric series, or partial fractions. Not once did we use an inversion "formula." Therefore, we will sketch the final scene in broad strokes since it requires an excursion into complex variables.

The z-transform definition

$$U(z) = \sum_{n=0}^{\infty} u_n z^{-n}$$

really says that the sequence $u_n$ consists of the coefficients of the Laurent³ series for $U(z)$ in the region $|z| > R_0$. These coefficients are readily found by an application of the theory of residues, and the result is that

$$u_n = \frac{1}{2\pi i}\oint_C U(z)\,z^{n-1}\,dz,$$

where $C$ can be taken as any circle with radius $R > R_0$. With this exact inversion formula, we can find a discrete version of the inverse z-transform. As outlined in problem 180, the contour integral can be approximated by summing the integrand at the same $N$ points on the circle $C$,

$$z_k = Re^{i2\pi k/N},$$

that we used earlier. If this discretization is carried out carefully, then indeed we discover that

$$u_n \approx \frac{R^n}{N}\sum_{k=0}^{N-1} U_k\, e^{i2\pi nk/N}$$

for $n = 0 : N-1$. The relationship between the inverse z-transform and the inverse DFT can then be expressed as

$$\mathcal{Z}^{-1}\{U(z)\}_n \approx \frac{R^n}{N}\,\mathcal{D}^{-1}\{U_k\}_n$$

for $n = 0 : N-1$. As in the inversion of the Laplace transform, the parameter $R$ appears and must be determined experimentally.
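The connection between the z-transform and the DFT can be tried out directly. The sketch below is a minimal illustration, assuming NumPy; the function names are hypothetical. Note that NumPy's `fft` computes the unnormalized sum $\sum_n a_n e^{-i2\pi nk/N}$ and its `ifft` includes the factor $1/N$, so no extra normalization is needed here, whatever convention is adopted for the DFT on paper. The test sequence is the geometric sequence $u_n = a^n$, whose transform $z/(z - a)$ was derived earlier.

```python
import numpy as np

def z_transform_on_circle(u, R):
    """Approximate U(z_k), z_k = R exp(i 2 pi k / N), from u_0..u_{N-1}."""
    n = np.arange(len(u))
    # DFT of the auxiliary sequence R**(-n) * u_n (unnormalized forward sum)
    return np.fft.fft(u * np.power(R, -n.astype(float)))

def inverse_z_transform_on_circle(U, R):
    """Approximate u_0..u_{N-1} from samples U_k = U(z_k) on the circle |z| = R."""
    n = np.arange(len(U))
    # ifft supplies (1/N) * sum_k U_k exp(i 2 pi n k / N); then undo the R**(-n) scaling
    return np.power(R, n.astype(float)) * np.fft.ifft(U)

a, R, N = 0.5, 1.5, 32                     # R must exceed the critical radius |a|
n = np.arange(N)
u = a ** n
z_k = R * np.exp(2j * np.pi * n / N)
U_exact = z_k / (z_k - a)

print(np.max(np.abs(z_transform_on_circle(u, R) - U_exact)))
print(np.max(np.abs(inverse_z_transform_on_circle(U_exact, R).real - u)))
```

Taking `R` much larger than necessary makes the factor $R^n$ amplify rounding errors in the inverse, while taking it too close to the critical radius slows the decay of $R^{-n}u_n$ and enlarges the truncation error; this is the same trade-off met with the parameter $c$.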

8.4. The Chebyshev Transform

Did you ever look at the multiple angle formulas for the cosine function and wonder about the patterns in the coefficients? If not, let's do it now. Here are the first few formulas:

$$\begin{aligned}
\cos 2\theta &= 2\cos^2\theta - 1, \\
\cos 3\theta &= 4\cos^3\theta - 3\cos\theta, \\
\cos 4\theta &= 8\cos^4\theta - 8\cos^2\theta + 1, \\
\cos 5\theta &= 16\cos^5\theta - 20\cos^3\theta + 5\cos\theta.
\end{aligned}$$

The list can be continued indefinitely, but the first few entries lead to some immediate observations. Notice that $\cos n\theta$ can be expressed as a polynomial of degree $n$ in $\cos\theta$. If $n$ is even, that polynomial consists entirely of even powers of $\cos\theta$; if $n$ is odd, the polynomial consists of odd powers only. In both cases, the coefficients of the polynomial alternate signs, and the leading coefficient in the polynomial for $\cos n\theta$ is $2^{n-1}$. These properties and many others hold for all positive integers $n$, and they have fascinated mathematicians for centuries. The most eminent person to study these polynomials was the nineteenth century Russian mathematician P. L. Chebyshev⁴ (variations on this name include Tchebycheff), whose name has been given to them.

³ French analyst PIERRE ALPHONSE LAURENT (1813-1854) is best known for his generalization of Taylor series in the complex plane.

Let's rewrite these polynomials with the notational simplification that $x = \cos\theta$. We will also denote the nth polynomial in the list $T_n(x)$ with the agreement that $T_0(x) = 1$. The list now reads

$$\begin{aligned}
T_0(x) &= 1, \\
T_1(x) &= x, \\
T_2(x) &= 2x^2 - 1, \\
T_3(x) &= 4x^3 - 3x, \\
T_4(x) &= 8x^4 - 8x^2 + 1, \\
T_5(x) &= 16x^5 - 20x^3 + 5x.
\end{aligned}$$

We have listed the first six Chebyshev polynomials. They may be characterized very simply for any nonnegative integer $n$ in the following way:

$$T_n(\cos\theta) = \cos n\theta,$$

or more simply as

$$T_n(x) = \cos(n\cos^{-1}x)$$

for $n = 0, 1, 2, \ldots$. This set of polynomials has been studied extensively for the past 150 years, and its properties have been uncovered and recorded. The polynomials arise in a startling variety of seemingly disparate subjects, from approximation theory to algebra and number theory, and undoubtedly there are connections with other disciplines that have yet to be discovered. Our goal is to relate the Chebyshev polynomials to the DFT, but along the way we will explore their properties and learn a bit more about them.

Let's survey some of the more useful and frequently encountered properties of the Chebyshev polynomials. The simpler demonstrations will be left as exercises with hints; the deeper results (and there are many more of them) are accompanied by references (a good general reference is [12]).

1. Polynomial properties. As mentioned, $T_n$ is an nth-degree polynomial; $T_n$ is an even function when $n$ is even and an odd function when $n$ is odd. The graphs of the first few Chebyshev polynomials are shown in Figure 8.2.

⁴ PAFNUTI LIWOWICH CHEBYSHEV (1821-1894) was associated with the University of Petrograd for much of his life. His name appears in many different branches of mathematics, perhaps most notably in number theory and probability. While everyone agrees upon his importance in mathematical history, almost nobody agrees on the spelling of his name, which appears in many different forms. Indeed, the controversy inspired Philip J. Davis to write a book, The Thread, a Mathematical Yarn, in which the name Pafnuti Liwowich Chebyshev is the central, unifying theme.

FIG. 8.2. The figure shows the graphs of the Chebyshev polynomials $T_1, \ldots, T_5$ on the interval $[-1,1]$. The polynomials can be identified by counting the zero crossings: $T_n$ has $n$ zero crossings. Note that the even- (odd-) order polynomials are even (odd) functions, that $T_n$ has all $n$ of its zeros on $(-1,1)$, and that $T_n$ has $n + 1$ extreme values on $[-1,1]$.

2. Zeros. The $n$ zeros of $T_n$ (points at which $T_n(x) = 0$) are real and lie in the interval $(-1,1)$. From the definition, $T_n(x) = \cos(n\cos^{-1}x)$, they are easily shown to be (problem 184)

$$x_j = \cos\left(\frac{(2j-1)\pi}{2n}\right)$$

for $j = 1 : n$.

3. Extreme values. On the interval $[-1,1]$, $|T_n(x)| \le 1$ and $T_n$ attains its extreme values of $\pm 1$ at the $n + 1$ points (problem 185)

$$\eta_j = \cos\left(\frac{j\pi}{n}\right)$$

for $j = 0 : n$.

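4. Multiplicative property. For $m \ge n \ge 0$

$$T_m(x)\,T_n(x) = \frac{1}{2}\left[T_{m+n}(x) + T_{m-n}(x)\right]$$

(problem 186).

5. Semigroup property. For $m \ge 0$ and $n \ge 0$

$$T_m(T_n(x)) = T_{mn}(x)$$

(problem 187).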


6. Minimax property. The normalized Chebyshev polynomials are formed by dividing through by the leading coefficient $2^{n-1}$, giving the monic polynomials $\tilde{T}_n(x) = 2^{1-n}T_n(x)$ with leading coefficient one. An extremely important property of the Chebyshev polynomials is that on the interval $[-1,1]$, among all monic nth-degree polynomials $p_n$,

$$\|\tilde{T}_n\|_\infty = 2^{1-n} \le \|p_n\|_\infty$$

for $n > 0$, where $\|f\|_\infty = \max_{-1 \le x \le 1}|f(x)|$. In other words, among all monic nth-degree polynomials, $\tilde{T}_n$ has the smallest maximum absolute value on $[-1,1]$. This "best minimax" property [118] is the basis of many approximation methods that involve Chebyshev polynomials.

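7. Recurrence relation. The trigonometric identity

$$\cos n\theta + \cos(n-2)\theta = 2\cos\theta\,\cos(n-1)\theta$$

can be used directly to show (problem 188) that the Chebyshev polynomials satisfy the recurrence relation

$$T_n(x) = 2xT_{n-1}(x) - T_{n-2}(x)$$

for $n = 2, 3, 4, \ldots$.

8. Differential equation. Computing $T_n'(x)$ and $T_n''(x)$ (problem 189) shows that $T_n$ satisfies the second-order differential equation

$$(1 - x^2)\,T_n''(x) - x\,T_n'(x) + n^2\,T_n(x) = 0$$

for $n = 0, 1, 2, \ldots$.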

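9. Orthogonality. Undoubtedly, much of the utility of Chebyshev polynomials arises from the fact that they comprise a set of orthogonal polynomials. Here is the crucial orthogonality property: For nonnegative integers $m$ and $k$

$$\int_{-1}^{1}\frac{T_m(x)\,T_k(x)}{\sqrt{1-x^2}}\,dx = \begin{cases} 0 & \text{if } m \ne k, \\ \pi/2 & \text{if } m = k \ne 0, \\ \pi & \text{if } m = k = 0. \end{cases}$$

This property says that the polynomials $T_n$ are orthogonal on the interval $[-1,1]$ with respect to the weight function $(1 - x^2)^{-1/2}$. The proof is direct and worth reviewing, since it reveals the important connection between the orthogonality of the Chebyshev polynomials and the orthogonality of the cosine functions that we have already studied. If we start with the orthogonality integral above and make the change of variables $x = \cos\theta$, we see that

$$\int_{-1}^{1}\frac{T_m(x)\,T_k(x)}{\sqrt{1-x^2}}\,dx = \int_0^\pi \cos m\theta\,\cos k\theta\,d\theta = \begin{cases} 0 & \text{if } m \ne k, \\ \pi/2 & \text{if } m = k \ne 0. \end{cases}$$

If $m = k = 0$ the value of the integral is $\pi$.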

10. Representation of polynomials. The orthogonality of the Chebyshev polynomials allows them to be used in the representation of other functions. This representation is exact and finite when the other functions are polynomials. The process is familiar and instructive and will be used again. Let $p_N$ be a polynomial of degree $N$. We seek coefficients $c_k$ such that

$$p_N(x) = {\sum_{k=0}^{N}}' c_k T_k(x) = \frac{c_0}{2} + c_1T_1(x) + \cdots + c_NT_N(x), \tag{8.5}$$

where we have used the notation $\sum'$ to designate a sum in which the first term is weighted by one-half. In a method analogous to the determination of Fourier coefficients, we multiply both sides of the representation (8.5) by an arbitrary $T_m$, where $0 \le m \le N$, and by the weight function $(1 - x^2)^{-1/2}$. We then integrate over $[-1,1]$ to discover that

$$\int_{-1}^{1}\frac{p_N(x)\,T_m(x)}{\sqrt{1-x^2}}\,dx = {\sum_{k=0}^{N}}' c_k\int_{-1}^{1}\frac{T_k(x)\,T_m(x)}{\sqrt{1-x^2}}\,dx.$$

As indicated, the orthogonality of the $T_n$'s means that only the $k = m$ term of the sum survives, leaving

$$c_k = \frac{2}{\pi}\int_{-1}^{1}\frac{p_N(x)\,T_k(x)}{\sqrt{1-x^2}}\,dx$$

for $k = 0 : N$. Note that the factor of 1/2 on the $c_0$ term accounts for the special case of $m = k = 0$ in the orthogonality property. (See problems 190 and 197.)

11. Minimum least squares property. The orthogonality can be used to show (problem 191) that of all monic nth-degree polynomials $p_n$, the normalized polynomial $\tilde{T}_n = 2^{1-n}T_n$ minimizes the quantity

$$\int_{-1}^{1}\frac{p_n^2(x)}{\sqrt{1-x^2}}\,dx.$$

12. Least squares representations of functions. We have seen that polynomials can be represented exactly by a finite linear combination of Chebyshev polynomials. What about the representation of arbitrary continuous functions $f$ on $[-1,1]$? This leads us to consider an expansion of the form

$$f(x) = {\sum_{k=0}^{\infty}}' c_k T_k(x), \tag{8.6}$$

consisting of an infinite series of Chebyshev polynomials. We may proceed formally as in the polynomial case, using the orthogonality property, to find that the coefficients $c_k$ in this expansion are given by

$$c_k = \frac{2}{\pi}\int_{-1}^{1}\frac{f(x)\,T_k(x)}{\sqrt{1-x^2}}\,dx \tag{8.7}$$

for $k = 0, 1, 2, \ldots$. Having determined the coefficients $c_k$, the critical questions are whether the series $\sum_k c_kT_k$ converges to $f$ and in what sense. The answers are known and we can investigate further.


In practice one would compute only a partial sum of the expansion (8.6). Therefore, we will denote the Nth partial sum

$$s_N(x) = {\sum_{k=0}^{N}}' c_k T_k(x).$$

Note that $s_N$ is a polynomial of degree $N$. It can be shown [118] that of all Nth-degree polynomials, $s_N$ (with coefficients given by (8.7)) is the least squares approximation to $f$ on $[-1,1]$ with respect to the weight function $(1 - x^2)^{-1/2}$. To state this carefully, we must define the weighted two-norm of a function $f$ as

$$\|f\|_2 = \left(\int_{-1}^{1}\frac{f^2(x)}{\sqrt{1-x^2}}\,dx\right)^{1/2}.$$

Then $s_N$ satisfies

$$\|f - s_N\|_2 \le \|f - p_N\|_2$$

for every Nth-degree polynomial $p_N$ (with equality only if $p_N = s_N$).

The convergence theory for Chebyshev expansions is well developed [12], [69], [118]. Mean square convergence (convergence in the two-norm) can be assessed by first using the orthogonality to deduce that (problem 192)

$$\|f - s_N\|_2^2 = \frac{\pi}{2}\sum_{k=N+1}^{\infty} c_k^2.$$

Thus, the rate of convergence in the two-norm can be estimated knowing the rate of decay of the coefficients $c_k$, which is determined by the smoothness of $f$ on $[-1,1]$ (in much the same way that the decay rate of Fourier coefficients was determined in Chapter 6).

There are also rather deep and recent results on the pointwise error in Chebyshev expansions. Rivlin [118] has shown that if $p_N^*(x)$ is the best uniform (or minimax) approximation to $f$ on $[-1,1]$ (minimizing $\|f - p_N\|_\infty$ over all Nth-degree polynomials) then

$$\|f - s_N\|_\infty \le 4\left(1 + \frac{\ln N}{\pi^2}\right)\|f - p_N^*\|_\infty,$$

where $s_N$ is the least squares approximation to $f$. A practical interpretation of this result is that "since $4(1 + \ln N/\pi^2) < 10$ for $N < 2{,}688{,}000$, the Chebyshev series is within a decimal place of the minimax approximation for all such polynomial approximations" [69]. This confirms the experience of many practitioners that Chebyshev expansions give very accurate and rapidly converging approximations to functions.
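Several of the properties surveyed above can be confirmed numerically with very little code. The Python sketch below is an illustration only, assuming NumPy; the function name `chebyshev_T` is our own. It evaluates $T_n$ with the three-term recurrence and compares the result with the definition $T_n(x) = \cos(n\cos^{-1}x)$, then checks the stated locations of the zeros and extreme values.

```python
import numpy as np

def chebyshev_T(n, x):
    """Evaluate T_n(x) using T_n = 2 x T_{n-1} - T_{n-2}, with T_0 = 1 and T_1 = x."""
    x = np.asarray(x, dtype=float)
    t_prev, t_curr = np.ones_like(x), x.copy()
    if n == 0:
        return t_prev
    for _ in range(2, n + 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

x = np.linspace(-1.0, 1.0, 201)
n = 6

# Recurrence versus the trigonometric definition
print(np.max(np.abs(chebyshev_T(n, x) - np.cos(n * np.arccos(x)))))

# Zeros: x_j = cos((2j - 1) pi / (2n)), j = 1..n
zeros = np.cos((2 * np.arange(1, n + 1) - 1) * np.pi / (2 * n))
print(np.max(np.abs(chebyshev_T(n, zeros))))

# Extreme values: |T_n| = 1 at eta_j = cos(j pi / n), j = 0..n
extrema = np.cos(np.arange(n + 1) * np.pi / n)
print(np.max(np.abs(np.abs(chebyshev_T(n, extrema)) - 1.0)))
```

All three printed quantities should be at the level of rounding error.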

A few of the important properties of the Chebyshev polynomials have been displayed in this brief tour. We now move toward the practical matter of computing the coefficients $c_k$ in Chebyshev expansions and showing how the DFT appears rather miraculously. We will proceed adroitly on two avenues: the first is continuous, the second is discrete.


Consider once again the representation

$$f(x) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k T_k(x)$$

for a continuous function $f$ on the interval $[-1,1]$. If we let $x = \cos\theta$, let $g(\theta) = f(\cos\theta)$, and recall that $T_k(x) = T_k(\cos\theta) = \cos k\theta$, then we may write

$$g(\theta) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k \cos k\theta,$$

where $0 \le \theta \le \pi$. We recognize this representation as the Fourier cosine series for $g(\theta)$ on the interval $[0,\pi]$. This suggests a computational strategy for approximating the coefficients $c_0, c_1, \ldots, c_N$; it consists of the following two steps:

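1. Sample the given function $f$ on the interval $[-1,1]$ at the $N + 1$ points $x_n = \cos(n\pi/N)$, where $n = 0 : N$. This gives the samples $g_n = f(x_n) = f(\cos(n\pi/N))$ for $n = 0 : N$.

2. Apply the discrete cosine transform (DCT) (preferably in the form of an FFT) to the sequence $g_n$ to obtain the coefficients $F_k$, which are approximations to the exact coefficients $c_k$. They are given explicitly by

$$F_k = \frac{2}{N}{\sum_{n=0}^{N}}'' g_n \cos\left(\frac{\pi n k}{N}\right)$$

for $k = 0 : N$, where $\Sigma''$ indicates a sum whose first and last terms are weighted by one-half.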

This argument exhibits the connection between the Chebyshev coefficients and theDCT quite clearly.
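The two steps above can be sketched in a few lines. This is a minimal illustration, assuming the normalization $F_k = (2/N)\Sigma'' g_n \cos(\pi nk/N)$ with half weights on the first and last terms; the function name `chebyshev_dct` is hypothetical, and the sum is formed as a dense matrix-vector product rather than with an FFT.

```python
import numpy as np

def chebyshev_dct(f, N):
    """Approximate the Chebyshev coefficients F_0..F_N of f from N+1 samples."""
    n = np.arange(N + 1)
    g = f(np.cos(np.pi * n / N))            # step 1: g_n = f(cos(n*pi/N))
    w = np.ones(N + 1)
    w[0] = w[-1] = 0.5                      # first and last terms weighted by 1/2
    C = np.cos(np.pi * np.outer(n, n) / N)  # C[k, n] = cos(pi*n*k/N)
    return (2.0 / N) * (C @ (w * g))        # step 2: the weighted cosine sums

# Coefficients for f(x) = |x| with N = 8; odd-index coefficients vanish by symmetry.
F = chebyshev_dct(np.abs, 8)
print(F)
```

The dense product costs on the order of $N^2$ operations. In practice one would use a fast DCT; for example, SciPy's `scipy.fft.dct` with `type=1` computes the same cosine sums up to a scale factor.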

Let's now reach the same destination along the discrete pathway. It turns out that like the complex exponential, the Chebyshev polynomials have both continuous and discrete orthogonality properties. However, there are some unexpected developments that arise with the discrete orthogonality. The first twist is that there are actually two different discrete orthogonality properties. Furthermore, the relevant discrete orthogonality properties use the extreme points of the polynomials. Recall that we used $\eta_n = \cos(\pi n/N)$ to denote the extreme points of $T_N$, where $n = 0 : N$. With these two clues in mind it is not difficult to use the orthogonality of the DCT to show the following two discrete orthogonality properties (problem 193), where $0 \le j, k, n \le N$.

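Discrete Orthogonality Property 1 (with respect to grid points)

$${\sum_{k=0}^{N}}'' T_k(\eta_j)\,T_k(\eta_n) = \begin{cases} 0 & \text{if } j \ne n, \\ N/2 & \text{if } j = n \ne 0, N, \\ N & \text{if } j = n = 0 \text{ or } N. \end{cases}$$

Discrete Orthogonality Property 2 (with respect to degree)

$${\sum_{n=0}^{N}}'' T_j(\eta_n)\,T_k(\eta_n) = \begin{cases} 0 & \text{if } j \ne k, \\ N/2 & \text{if } j = k \ne 0, N, \\ N & \text{if } j = k = 0 \text{ or } N. \end{cases}$$

The notation $\Sigma''$ indicates a sum whose first and last terms are weighted by one-half. The terminology is critical: in the first orthogonality property, the value of the sum depends on the indices of the grid points ($j$ and $n$), whereas in the second property the value of the sum depends on the degree of the polynomials ($j$ and $k$). We might add that the surprises with discrete orthogonality do not end here. There are actually two more orthogonality relations for the Chebyshev polynomials that use the zeros, rather than the extreme points, as grid points (problems 195 and 208).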

We may now proceed in a way that mimics the computation of the coefficients in the continuous Chebyshev expansion. We look for a representation of a given function $f$ at the points $\eta_n$ that has the form

$$f(\eta_n) = {\sum_{j=0}^{N}}'' F_j\,T_j(\eta_n) \qquad (8.8)$$

for $n = 0 : N$. Multiplying both sides of this representation by the arbitrary polynomial $T_k(\eta_n)$, where $k = 0 : N$, and summing over the points $\eta_n$ for $n = 0 : N$ we find that

$${\sum_{n=0}^{N}}'' f(\eta_n)\,T_k(\eta_n) = {\sum_{j=0}^{N}}'' F_j {\sum_{n=0}^{N}}'' T_j(\eta_n)\,T_k(\eta_n)$$

for $k = 0 : N$. Now the discrete orthogonality enters in a predictable way. Using the second discrete orthogonality property (with respect to degree), we see that if $k \ne 0$ or $N$, then the inner sum has a value of $N/2$ when $j = k$ and vanishes otherwise. This says that

$${\sum_{n=0}^{N}}'' f(\eta_n)\,T_k(\eta_n) = \frac{N}{2}F_k$$

for $k = 1 : N - 1$. However, $T_k(\eta_n) = T_k(\cos(\pi n/N)) = \cos(\pi nk/N)$. Therefore, letting $g_n = f(\eta_n)$, the coefficients $F_k$ are given by

$$F_k = \frac{2}{N}{\sum_{n=0}^{N}}'' g_n \cos\left(\frac{\pi n k}{N}\right)$$

for $k = 1 : N - 1$. A similar argument reveals that for $k = 0$ and $k = N$

$${\sum_{n=0}^{N}}'' T_j(\eta_n)\,T_k(\eta_n) = N\,\delta(j - k).$$

However, since $F_0$ and $F_N$ are weighted by 1/2 in the representation (8.8), we can use a single definition for all of the $F_k$'s; it is

$$F_k = \frac{2}{N}{\sum_{n=0}^{N}}'' g_n \cos\left(\frac{\pi n k}{N}\right)$$

for $k = 0 : N$. Once again, this expression should be recognized as the discrete cosine transform of the sequence $g_n = f(\eta_n)$. This leads us to the conclusion that was reached earlier on the continuous pathway:


The coefficients of the $N$-term Chebyshev expansion for a function $f$ on the interval $[-1,1]$ can be approximated by applying the $N$-point DCT to the samples

$$g_n = f\left(\cos\left(\frac{\pi n}{N}\right)\right)$$

for $n = 0 : N$.

This argument gives us the forward discrete Chebyshev transform.

Forward Discrete Chebyshev Transform

$$F_k = \frac{2}{N}{\sum_{n=0}^{N}}'' g_n \cos\left(\frac{\pi n k}{N}\right)$$

for $k = 0 : N$. We can now apply a similar argument and use the first discrete orthogonality property (with respect to the grid points) to establish the inverse transform (problem 194).

Inverse Discrete Chebyshev Transform

$$g_n = {\sum_{k=0}^{N}}'' F_k \cos\left(\frac{\pi n k}{N}\right)$$

for $n = 0 : N$. Despite all of the meandering that we have done through properties of the Chebyshev polynomials, this final result is the intended destination. It shows clearly the relationship among Chebyshev polynomials, the Fourier cosine series, and the DCT. It also confirms that there is a fast method for computing Chebyshev coefficients, namely the FFT implementation of the DCT. Let's solidify these ideas with a numerical example.
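A quick numerical check that the forward and inverse discrete Chebyshev transforms undo one another can be sketched as follows. The scaling ($2/N$ on the forward transform, none on the inverse, half weights on the end terms of both sums) is an assumption consistent with the discussion above, and the helper names are hypothetical.

```python
import numpy as np

def cheb_matrix(N):
    """Symmetric matrix with entries T_k(eta_n) = cos(pi*n*k/N) for k, n = 0..N."""
    n = np.arange(N + 1)
    return np.cos(np.pi * np.outer(n, n) / N)

def half_weights(N):
    """Weights of the double-primed sum: 1/2 on the first and last terms."""
    w = np.ones(N + 1)
    w[0] = w[-1] = 0.5
    return w

def forward_dcht(g):
    N = len(g) - 1
    return (2.0 / N) * (cheb_matrix(N) @ (half_weights(N) * g))

def inverse_dcht(F):
    N = len(F) - 1
    return cheb_matrix(N) @ (half_weights(N) * F)

# Samples of exp(x) at the extreme points eta_n = cos(pi*n/N), N = 8.
g = np.exp(np.cos(np.pi * np.arange(9) / 8))
roundtrip_error = np.max(np.abs(inverse_dcht(forward_dcht(g)) - g))
print(roundtrip_error)
```

Because the cosine matrix is symmetric, the same matrix serves both directions; only the scaling differs.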

Example: Chebyshev coefficients and expansion. Consider the function $f(x) = |x|$ on the interval $[-1,1]$. While this is not a profound choice of a test function, it does have the virtue that the coefficients in its Chebyshev expansion can be found analytically. Verify that (problem 198) this function has the expansion

$$|x| = \frac{2}{\pi} + \frac{4}{\pi}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{4k^2 - 1}\,T_{2k}(x).$$

Note that $c_{2k+1} = 0$ because $f$ is an even function. If we sample $f$ at the points $x_n = \cos(n\pi/N)$ and apply the DCT, approximations to the coefficients $c_k$ can be computed. As a simple example, approximations to $c_k$, computed for $N = 8$ and $N = 16$, are shown in Table 8.2.


FIG. 8.3. Given the exact coefficients in the Chebyshev expansion of $f(x) = |x|$, the function can be reconstructed from its partial sums. The figure shows the function $f$ (solid curve) and the partial sums of the Chebyshev expansion for $N = 4$ (dotted curve), $N = 8$ (dashed and dotted curve), and $N = 16$ (dashed curve).

TABLE 8.2
Approximations to Chebyshev coefficients for f(x) = |x|.

  k     c_k (Exact)    F_k for N = 8    F_k for N = 16
  0      1.27(0)        1.26(0)          1.27(0)
  2      4.24(-1)       4.41(-1)         4.29(-1)
  4     -8.49(-2)      -1.04(-1)        -8.91(-2)
  6      3.64(-2)       5.87(-2)         4.08(-2)
  8     -2.02(-2)      -4.97(-2)        -2.49(-2)
 10      1.29(-2)         -              1.79(-2)
 12     -8.90(-3)         -             -1.44(-2)
 14      6.53(-3)         -              1.28(-2)
 16     -4.99(-3)         -             -1.23(-2)

The numerical results show that the approximations converge to the exact values of $c_k$ with increasing $N$ (the continuity of $f$ and its periodic extension might suggest that the errors should decrease as $N^{-2}$, and indeed they do). In keeping with approximations related to the DFT, we also see a typical degradation of the approximations for the higher frequency coefficients.

The reconstruction of $f$ from its coefficients can also be observed. Figure 8.3 shows the partial sums of the Chebyshev expansion of $f(x) = |x|$ (using the exact values of $c_k$) for $N = 4, 8, 16$. Clearly, very few terms are needed to obtain a good representation of this function. Not surprisingly, the poorest fit occurs near the discontinuity in the derivative of $f$.

Theory and applications of Chebyshev polynomials extend far beyond the brief glimpse we have provided; see the list of references for additional excitement. We close on a note of intrigue with a picture (Figure 8.4) of the so-called white curves of the Chebyshev polynomials, the geometry of which is elaborated in Rivlin [118].

identically for all N.


FIG. 8.4. The first 30 Chebyshev polynomials, when graphed on the square $-1 \le x \le 1$, $-1 \le y \le 1$, reveal striking and unexpected patterns known as white curves.

8.5. Orthogonal Polynomial Transforms

The subject of orthogonal polynomials, which carries many august names, is one of the most elegant and tantalizing areas of mathematics. Orthogonal polynomials arise in a variety of different applications, including approximation theory and the solution of differential equations. Not surprisingly, there are some important computational issues associated with their use. In this section, we will show how any system of orthogonal polynomials can generate a discrete transform pair. The discussion will use the Legendre⁵ polynomials as a specific case to illustrate the procedure that can be used for any set of orthogonal polynomials. We will begin with a few essential words about orthogonal polynomials. The wealth of rich classical theory that will unfortunately, but necessarily, be omitted from this discussion can be found in many sources [32], [81], [106], [145].

⁵ ADRIEN MARIE LEGENDRE (1752-1833) is ranked with Laplace and Lagrange as one of the greatest analysts of eighteenth and nineteenth century Europe. He held university positions in Paris, as well as several minor governmental posts. Legendre made significant contributions to the theory of elliptic integrals and number theory. His study of spherical harmonics led to Legendre's differential equation.

We will let $p_n$ denote the $n$th polynomial of a system of orthogonal polynomials where $n = 0, 1, 2, \ldots$. The degree of $p_n$ is $n$, and we will always use $a_n$ to denote the leading coefficient (coefficient of $x^n$) of $p_n$. Every system of polynomials is defined on a specific interval $[a, b]$ with respect to a positive weight function $w$. The orthogonality property satisfied by the set $\{p_n\}$ has the form

$$\int_a^b p_j(x)\,p_k(x)\,w(x)\,dx = d_k^2\,\delta(j - k)$$

for $j, k = 0, 1, 2, \ldots$, where

$$d_k^2 = \int_a^b p_k^2(x)\,w(x)\,dx.$$

For example, the set of Chebyshev polynomials just studied satisfies the orthogonality property

$$\int_{-1}^{1}\frac{T_j(x)\,T_k(x)}{\sqrt{1-x^2}}\,dx = d_k^2\,\delta(j - k)$$

with $[a,b] = [-1,1]$, $w(x) = (1 - x^2)^{-1/2}$, $d_k^2 = \pi/2$ for $k \ge 1$, and $d_0^2 = \pi$. The Legendre polynomials, which we will investigate shortly, satisfy the orthogonality property

$$\int_{-1}^{1} P_j(x)\,P_k(x)\,dx = \frac{2}{2k+1}\,\delta(j - k)$$

with $[a,b] = [-1,1]$, $w(x) = 1$, and $d_k^2 = 2/(2k + 1)$. As a slightly different example, the Laguerre polynomials have the orthogonality property

$$\int_0^{\infty} L_j(x)\,L_k(x)\,e^{-x}\,dx = \delta(j - k),$$

where $[a, b) = [0, \infty)$, $w(x) = e^{-x}$, and $d_k^2 = 1$. The literally endless properties (for example, recurrence relations, associated differential equations, generating functions, properties of zeros) of these and many other systems of orthogonal polynomials are tabulated in a variety of handbooks [3].

Of particular relevance to this discussion is the problem of representing fairly arbitrary functions as expansions of orthogonal polynomials. Given a function $f$ defined on an interval $[a, b]$, we ask if coefficients $c_k$ can be found such that

$$f(x) = \sum_{k=0}^{\infty} c_k\,p_k(x),$$

where $\{p_k\}$ is a system of orthogonal polynomials defined on the interval $[a, b]$. We can proceed formally, just as we have done many times already. Multiplying both sides of this representation by $w(x)p_j(x)$, integrating over $[a, b]$, and appealing to the orthogonality property leads immediately to the coefficients in the form (problem 199)

$$c_k = \frac{1}{d_k^2}\int_a^b f(x)\,p_k(x)\,w(x)\,dx$$

for $k = 0, 1, 2, \ldots$. The question of convergence of expansions of this form is of course quite important. The theory is well developed and, not surprisingly, parallels the convergence theory for Fourier series [81], [145].


We can already anticipate how discrete transforms based on orthogonal polynomials will be formulated. The process of computing the coefficients $c_k$ of a given function $f$ can be regarded as a continuous forward transform that might be denoted $c_k = \mathcal{P}\{f(x)\}$. The process of reconstructing a function from a set of coefficients can be viewed as an inverse transform that we might denote $f(x) = \mathcal{P}^{-1}\{c_k\}$. For example, the continuous Legendre transform pair for a function $f$ has the form

$$c_k = \frac{2k+1}{2}\int_{-1}^{1} f(x)\,P_k(x)\,dx \quad \text{(forward transform)} \qquad \text{and} \qquad f(x) = \sum_{k=0}^{\infty} c_k\,P_k(x) \quad \text{(inverse transform)}. \qquad (8.10)$$

The task is to find discrete versions of this continuous transform pair. However, some properties of orthogonal polynomials will be needed first. A fact of great importance in this business is that systems of orthogonal polynomials have two discrete orthogonality properties. We will need to develop both properties and avoid confusing them.

In order to proceed, a few properties of the zeros of orthogonal polynomials need to be stated. Any system of orthogonal polynomials $\{p_n\}$ on the interval $[a, b]$ has the property that the $n$th polynomial has precisely $n$ zeros on $(a, b)$. We will denote the $n$ zeros of $p_n$ by $\xi_m$ where $m = 1, \ldots, n$. An interesting property (not of immediate use for our present purposes) is that the zeros of any two consecutive polynomials in a set of orthogonal polynomials are interlaced. This can be seen in the graphs of several Legendre polynomials shown in Figure 8.5. The zeros of the orthogonal polynomials will play a fundamental role in all that follows.

We may now begin the quest for discrete orthogonal polynomial transforms. Having seen how the DFT and the discrete Chebyshev transform were developed, we can now make a reasonable conjecture that will also prove to be correct. To make a discrete forward transform from a continuous forward transform (8.10), we might anticipate that the integral defining the coefficients $c_k$ will need to be approximated by a sum:

$$c_k \approx \frac{1}{d_k^2}\sum_{n=1}^{N} \alpha_n\,f(x_n)\,p_k(x_n),$$

where the $\alpha_n$'s and $x_n$'s are weights and nodes in a chosen quadrature rule. To make a discrete inverse transform from (8.10), we might anticipate that the series representation for $f$ will need to be truncated and evaluated at certain grid points:

$$f(x_n) \approx \sum_{k=0}^{N-1} c_k\,p_k(x_n).$$

FIG. 8.5. The first six Legendre polynomials $P_0, \ldots, P_5$ are shown in the figure on their natural interval $[-1,1]$. Note that $P_n$ has $n$ zeros and that the zeros of two consecutive polynomials are interlaced.

Let's first consider the forward transform and the approximation of the integral for the expansion coefficients $c_k$. One might appeal to any number of quadrature (integration) rules to approximate this integral. But if there are orthogonal polynomials in the picture (as there are here), then there is one technique that begs to be used, and that is Gaussian quadrature. A quick synopsis of Gaussian quadrature goes something like this. The integral $\int_a^b g(x)w(x)\,dx$ can be approximated by a rule of the form

$$\int_a^b g(x)\,w(x)\,dx \approx \sum_{n=1}^{N} \alpha_n\,g(\xi_n),$$

where the $\xi_n$'s are the zeros of the orthogonal polynomial $p_N$ on the interval $[a, b]$ with respect to the weight function $w$. The weights $\alpha_n$ are generally tabulated or can be determined. This rule has the remarkable property that it is exact for integrands $g(x) = x^p$ that are polynomials of degree $p = 0 : 2N - 1$. It achieves this super-accuracy by allowing both the weights $\alpha_n$ and the nodes $\xi_n$ (a total of $2N$ parameters) to be chosen. It is a direct, but rather cumbersome task to show that the weights of any Gaussian quadrature rule are given by [80], [106]

$$\alpha_n = \frac{a_N\,d_{N-1}^2}{a_{N-1}\,p_N'(\xi_n)\,p_{N-1}(\xi_n)}$$

for $n = 1, 2, \ldots, N$, where $a_N$ and $a_{N-1}$ are the leading coefficients of $p_N$ and $p_{N-1}$ respectively.
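The exactness claim is easy to observe numerically in the Legendre case ($w(x) = 1$ on $[-1,1]$). The sketch below uses NumPy's `leggauss` to obtain the nodes and weights; the helper `exact_monomial_integral` is a hypothetical name introduced here.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

N = 4
xi, alpha = leggauss(N)          # zeros of P_N and the Gauss-Legendre weights

def exact_monomial_integral(p):
    """Integral of x**p over [-1, 1]."""
    return 0.0 if p % 2 else 2.0 / (p + 1)

# Quadrature error for x**p, p = 0 .. 2N. The rule is exact through degree 2N - 1.
errors = [abs(np.sum(alpha * xi**p) - exact_monomial_integral(p))
          for p in range(2 * N + 1)]
print(errors)
```

With only $N = 4$ nodes the rule integrates every monomial through degree 7 to rounding error, and the first visible error appears at degree $2N = 8$.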

If we now use this quadrature rule to approximate the expansion coefficients $c_k$, we find that

$$c_k \approx \frac{1}{d_k^2}\sum_{n=1}^{N} \alpha_n\,f(\xi_n)\,p_k(\xi_n)$$

for $k = 0, 1, 2, \ldots$. With these thoughts in mind, it is now possible to propose a discrete transform pair associated with the orthogonal polynomials $\{p_k\}$. Let's adopt our usual notation and let $f_n = f(\xi_n)$ be the samples of the function $f$ that comprise the input sequence where $n = 1 : N$. We will also let $F_k$ denote the sequence of transform coefficients that approximate $c_0, \ldots, c_{N-1}$. Then here is our proposed discrete transform pair.

Discrete Orthogonal Polynomial Transform Pair

$$F_k = \frac{1}{d_k^2}\sum_{n=1}^{N} \alpha_n\,f_n\,p_k(\xi_n) \quad \text{(forward transform)} \qquad (8.11)$$

and

$$f_n = \sum_{k=0}^{N-1} F_k\,p_k(\xi_n) \quad \text{(inverse transform)}, \qquad (8.12)$$

where $k = 0 : N - 1$ in the forward transform and $n = 1 : N$ in the inverse transform.

So far this is all a conjecture, but hopefully a rather convincing one. Now the task is to show that these two relations between the sequences $f_n$ and $F_k$ really do constitute a transform pair. In principle, this should be an easy matter: one could substitute (8.11) into (8.12) (or vice versa) and check for an identity. Unfortunately, as we have seen repeatedly in the past, this procedure ultimately relies on a (discrete) orthogonality property. Therefore, in order to establish this transform pair, we must ask about the discrete orthogonality of this system. The answer is both curious and unexpected.

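The investigation into discrete orthogonality of orthogonal polynomials requires a special result called the Christoffel⁶-Darboux⁷ formula. This formula is not that difficult to derive, but it would take us further afield than necessary [81], [145]. Therefore, we will simply state the fact that if $\{p_k\}$ is a system of orthogonal polynomials on $(a, b)$, then

$$\sum_{k=0}^{N-1}\frac{p_k(x)\,p_k(y)}{d_k^2} = \frac{a_{N-1}}{a_N\,d_{N-1}^2}\,\frac{p_N(x)\,p_{N-1}(y) - p_{N-1}(x)\,p_N(y)}{x - y},$$

where $x \ne y$ are any two points on the interval $(a, b)$. When $x = y$, it follows (by L'Hôpital's rule) that

$$\sum_{k=0}^{N-1}\frac{p_k^2(x)}{d_k^2} = \frac{a_{N-1}}{a_N\,d_{N-1}^2}\left[p_N'(x)\,p_{N-1}(x) - p_{N-1}'(x)\,p_N(x)\right].$$

Consider what happens if we let $x$ and $y$ be two zeros of $p_N$ in the Christoffel-Darboux formula. With $x = \xi_n$ and $y = \xi_m$, noting that $p_N(\xi_m) = p_N(\xi_n) = 0$, we find the first discrete orthogonality property.

Discrete Orthogonality Property 1 (with respect to grid points)

$$\sum_{k=0}^{N-1}\frac{p_k(\xi_m)\,p_k(\xi_n)}{d_k^2} = \frac{a_{N-1}}{a_N\,d_{N-1}^2}\,p_N'(\xi_n)\,p_{N-1}(\xi_n)\,\delta(m - n) \qquad (8.13)$$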

⁶ ELWIN BRUNO CHRISTOFFEL (1829-1900) was a professor of mathematics in both Zurich and Strasbourg. He is known for his research on potential theory, minimal surfaces, and general curved surfaces.

⁷ The work of JEAN GASTON DARBOUX (1842-1917) in geometry and analysis was well known through both his teaching and his books. He was the secretary of the Paris Academy of Sciences and co-founder of the Bulletin des sciences mathématiques et astronomiques.



for $m, n = 1 : N$. The sum of products of the first $N$ polynomials evaluated at zeros of $p_N$ vanishes unless the polynomials are evaluated at the same zero of $p_N$. This is an orthogonality property, although it is not analogous to the continuous orthogonality properties that we have already seen. The sum is on the order of the polynomials, not on the grid points, as is usually the case. We will call this property orthogonality with respect to the grid points since the $\delta(m - n)$ term operates on the indices of the grid points.

We can put this property to use immediately in verifying the inverse transform of our alleged transform pair. We begin with the forward transform (8.11)

$$F_k = \frac{1}{d_k^2}\sum_{m=1}^{N} \alpha_m\,f_m\,p_k(\xi_m)$$

and solve for one of the terms of the input sequence $f_n$. Multiplying both sides of this relation by $p_k(\xi_n)$ and summing over the degrees of the polynomials ($k = 0 : N - 1$) results in

$$\begin{aligned}\sum_{k=0}^{N-1} F_k\,p_k(\xi_n) &= \sum_{k=0}^{N-1}\frac{1}{d_k^2}\sum_{m=1}^{N}\alpha_m\,f_m\,p_k(\xi_m)\,p_k(\xi_n) \\ &= \sum_{m=1}^{N}\alpha_m\,f_m\underbrace{\sum_{k=0}^{N-1}\frac{p_k(\xi_m)\,p_k(\xi_n)}{d_k^2}}_{\delta(m-n)/\alpha_n} = f_n\end{aligned}$$

for $n = 1 : N$. As shown by the underbrace in the second line, the inner sum (on $k$) can be simplified by the first orthogonality relation (whose constant, by the formula for the Gaussian weights, is $1/\alpha_n$), which in turn allows a single term of the outer sum (on $m$) to survive. The outcome is the inverse transform of our alleged transform pair.

Recall that with the DFT a single orthogonality property sufficed to derive both the forward and inverse transform. In the orthogonal polynomial case, an attempt to use the same orthogonality property to "go in the opposite direction" (start with the inverse transform and derive the forward transform) leads nowhere. A second (dual) orthogonality property is needed. We could resort to another long-winded derivation of this second property or make a very astute observation.

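Let's write the first orthogonality property in the form

$$\sum_{k=0}^{N-1}\frac{p_k(\xi_m)\,p_k(\xi_n)}{d_k^2} = A_n\,\delta(m - n),$$

where

$$A_n = \frac{a_{N-1}}{a_N\,d_{N-1}^2}\,p_N'(\xi_n)\,p_{N-1}(\xi_n)$$

just stands for the constants multiplying $\delta(m - n)$. This sum can now be written

$$\sum_{k=0}^{N-1} c_{km}\,c_{kn} = \delta(m - n),$$

where we have let

$$c_{kn} = \frac{p_k(\xi_n)}{d_k\sqrt{A_n}}.$$

If we now view the $c_{kn}$'s as the elements of an $N \times N$ matrix $C$, then this equation can be written in matrix form as $C^TC = I$. Two important consequences follow from this fact: first, $C^T = C^{-1}$, which says that $C$ is an orthogonal matrix (reflecting the first orthogonality property), and second, $CC^T = I$ (problem 196). If we now write out the components of the matrix equation $CC^T = I$, we see that

$$\sum_{n=1}^{N} c_{jn}\,c_{kn} = \delta(j - k).$$

Replacing the $c_{kn}$'s by what they really represent, we have

$$\sum_{n=1}^{N}\frac{p_j(\xi_n)\,p_k(\xi_n)}{d_j\,d_k\,A_n} = \delta(j - k).$$

And finally, replacing $A_n$, a brand new orthogonality property appears.

Discrete Orthogonality Property 2 (with respect to degree)

$$\sum_{n=1}^{N}\frac{p_j(\xi_n)\,p_k(\xi_n)}{p_N'(\xi_n)\,p_{N-1}(\xi_n)} = \frac{a_{N-1}}{a_N}\,\frac{d_k^2}{d_{N-1}^2}\,\delta(j - k) \qquad (8.14)$$

for $j, k = 0 : N - 1$. This orthogonality property does parallel the continuous orthogonality properties since the sum is over the grid points and the "zero-nonzero switch" is with respect to the degree of the polynomials.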
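The matrix argument can be checked numerically for the Legendre polynomials. The sketch below assumes that the constants multiplying $\delta(m-n)$ in the first property are the reciprocals of the Gaussian weights (which follows from the weight formula quoted earlier), so that the matrix entries are $c_{kn} = p_k(\xi_n)\sqrt{\alpha_n}/d_k$; all variable names are illustrative.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legvander

N = 6
xi, alpha = leggauss(N)                 # zeros of P_N and Gaussian weights
P = legvander(xi, N - 1).T              # P[k, n] = P_k(xi_n), k = 0..N-1
d2 = 2.0 / (2 * np.arange(N) + 1)       # d_k^2 = 2/(2k+1) for Legendre

# c_kn = P_k(xi_n) * sqrt(alpha_n) / d_k
C = P * np.sqrt(alpha)[None, :] / np.sqrt(d2)[:, None]

err_grid = np.max(np.abs(C.T @ C - np.eye(N)))    # property 1: sum over degree k
err_degree = np.max(np.abs(C @ C.T - np.eye(N)))  # property 2: sum over grid index n
print(err_grid, err_degree)
```

Both products reduce to the identity to rounding error, which is the content of the two orthogonality properties: one is the statement $C^TC = I$ and the other is $CC^T = I$.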

We may now take the final step and verify the forward transform of the proposed transform pair. Beginning with the alleged inverse transform (8.12)

$$f_n = \sum_{k=0}^{N-1} F_k\,p_k(\xi_n)$$

for $n = 1 : N$, we multiply both sides by $\alpha_n\,p_j(\xi_n)$ and sum over the grid points $\xi_n$ for $n = 1 : N$. This leads to the following line of thought, which inevitably uses the second discrete orthogonality property (written in terms of the Gaussian weights as $\sum_n \alpha_n\,p_j(\xi_n)\,p_k(\xi_n) = d_k^2\,\delta(j - k)$):

$$\sum_{n=1}^{N}\alpha_n\,f_n\,p_j(\xi_n) = \sum_{k=0}^{N-1} F_k\sum_{n=1}^{N}\alpha_n\,p_j(\xi_n)\,p_k(\xi_n) = \sum_{k=0}^{N-1} F_k\,d_k^2\,\delta(j - k) = d_j^2\,F_j.$$

Therefore,

$$F_k = \frac{1}{d_k^2}\sum_{n=1}^{N}\alpha_n\,f_n\,p_k(\xi_n)$$

for $k = 0 : N - 1$, which is the proposed forward transform (8.11) given above.

In summary, we have established a discrete transform pair based on a system of orthogonal polynomials. It is a curious state of affairs that two different orthogonality properties are needed to establish a complete transform pair. At this point we could proceed to derive transform pairs for any of the many orthogonal polynomial systems that populate the mathematical universe. We will be content with just one such exercise, and it is a transform that arises frequently in practice. Let's consider the set of Legendre polynomials and its discrete transform.

As mentioned earlier, the Legendre polynomials $\{P_n\}$ reside on the interval $[-1,1]$ and use the weight function $w(x) = 1$. Here are the first few Legendre polynomials with their graphs shown in Figure 8.5:

$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac{1}{2}(3x^2 - 1), \quad P_3(x) = \tfrac{1}{2}(5x^3 - 3x), \quad P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3).$$

We will cite some of the more interesting and useful properties of the Legendre polynomials, some of which are needed to develop the discrete transform. The full development of these properties can be found in many sources [81], [145]; some will be elaborated in the problems.

1. Degree. The polynomial Pn has degree n and is even/odd when n is even/odd.

2. Zeros. The polynomial $P_n$ has $n$ (real) zeros on the interval $(-1,1)$. If $n$ is odd, then $P_n(0) = 0$.

3. Orthogonality. We reiterate the (continuous) orthogonality property of the Legendre polynomials:

$$\int_{-1}^{1} P_j(x)\,P_k(x)\,dx = \frac{2}{2k+1}\,\delta(j - k)$$

for $j, k = 0, 1, 2, \ldots$. This means that the scaling factors $d_k^2$ are given by

$$d_k^2 = \frac{2}{2k+1}.$$

4. Rodrigues' formula. There are several ways to define and generate Legendre polynomials. Many treatments begin with the definition known as Rodrigues'⁸ formula and all other properties follow (problem 201):

$$P_n(x) = \frac{1}{2^n\,n!}\,\frac{d^n}{dx^n}\left(x^2 - 1\right)^n.$$

⁸ We know only that OLINDE RODRIGUES (1794-1851) was an economist and reformer who evidently was sufficiently familiar with the work of Legendre to derive the formula for the Legendre polynomials that bears his name.

5. Recurrence relations. The Legendre polynomials satisfy several recurrence relations that relate two or three consecutive polynomials and/or their derivatives. The fundamental three-term recurrence relation is

$$(n + 1)\,P_{n+1}(x) = (2n + 1)\,x\,P_n(x) - n\,P_{n-1}(x)$$

for $n = 1, 2, 3, \ldots$. Given $P_0$ and $P_1$, this relation can be used to generate the Legendre polynomials. Two other recurrence relations that will be needed are

$$x\,P_n'(x) - P_{n-1}'(x) = n\,P_n(x) \qquad (8.16)$$

and

$$P_n'(x) - x\,P_{n-1}'(x) = n\,P_{n-1}(x). \qquad (8.17)$$

The relation that we will use for the discrete Legendre transform results if we multiply (8.16) by $x$ and subtract the resulting equation from (8.17). We find (problem 202) that

$$(1 - x^2)\,P_n'(x) = n\,P_{n-1}(x) - n\,x\,P_n(x) \qquad (8.18)$$

for $n = 1, 2, 3, \ldots$.

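6. Differential equation. The $n$th Legendre polynomial $P_n$ satisfies the differential equation (problem 203)

$$(1 - x^2)\,P_n''(x) - 2x\,P_n'(x) + n(n + 1)\,P_n(x) = 0.$$

7. Ratio of leading coefficients. Letting $a_n$ be the leading coefficient in $P_n$, it can be shown that

$$\frac{a_{n-1}}{a_n} = \frac{n}{2n - 1}.$$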

With these properties in hand, we may now deduce the discrete orthogonality properties and the discrete transform pair for Legendre polynomials. Let's begin with the orthogonality properties. The first discrete orthogonality property on $N$ points (8.13) was found to be

$$\sum_{k=0}^{N-1}\frac{p_k(\xi_m)\,p_k(\xi_n)}{d_k^2} = \frac{a_{N-1}}{a_N\,d_{N-1}^2}\,p_N'(\xi_n)\,p_{N-1}(\xi_n)\,\delta(m - n),$$

where the $\xi_n$'s are the zeros of $p_N$. We can now tailor this general property to the Legendre polynomials. The scaling factors $d_k^2$ and $d_{N-1}^2$ are given in property 3 above, and the ratio of leading coefficients, $a_{N-1}/a_N$, is given in property 7. The term $P_N'(\xi_n)$ can also be replaced by a nonderivative term by letting $n = N$ and $x = \xi_n$ in (8.18) and noting that $P_N(\xi_n) = 0$. After some rearranging we see that

$$P_N'(\xi_n) = \frac{N\,P_{N-1}(\xi_n)}{1 - \xi_n^2}.$$

Assembling all of these pieces, we arrive at the first orthogonality property.

Discrete Orthogonality Property 1 for Legendre Polynomials

$$\sum_{k=0}^{N-1}(2k + 1)\,P_k(\xi_m)\,P_k(\xi_n) = \frac{N^2\,P_{N-1}^2(\xi_n)}{1 - \xi_n^2}\,\delta(m - n)$$

for $m, n = 1 : N$. Notice that this is an orthogonality property with respect to the grid points, which are the zeros of $P_N$.

We can carry out a very similar substitution process (problem 205) with the second discrete orthogonality property (8.14) and deduce the corresponding property for Legendre polynomials.

Discrete Orthogonality Property 2 for Legendre Polynomials

$$\sum_{n=1}^{N}\frac{1 - \xi_n^2}{P_{N-1}^2(\xi_n)}\,P_j(\xi_n)\,P_k(\xi_n) = \frac{N^2}{2k + 1}\,\delta(j - k)$$

for $j, k = 0 : N - 1$. This second orthogonality property is with respect to the degree of the polynomials. These orthogonality properties can now be used to derive the discrete Legendre transform pair or we can appeal directly to the general transform pair (8.11) and (8.12). By either path, we are led to the forward $N$-point transform (problem 206).

Forward Discrete Legendre Transform

$$F_k = \frac{2k + 1}{N^2}\sum_{n=1}^{N} f_n\,P_k(\xi_n)\,\frac{1 - \xi_n^2}{P_{N-1}^2(\xi_n)}$$

for $k = 0 : N - 1$, where the input sequence $f_n = f(\xi_n)$ is the given function $f$ sampled at the zeros of $P_N$. Going in the opposite direction, given a set of coefficients $F_k$, the function $f$ can be reconstructed at the zeros of $P_N$ using the inverse discrete transform.

Inverse Discrete Legendre Transform

$$f_n = \sum_{k=0}^{N-1} F_k\,P_k(\xi_n)$$

for $n = 1 : N$.
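A minimal sketch of the discrete Legendre transform pair follows. It is written in terms of the Gaussian weights $\alpha_n$ returned by NumPy's `leggauss`, using the general pair (8.11) and (8.12) with $d_k^2 = 2/(2k+1)$; the function names `dlt_setup`, `forward_dlt`, and `inverse_dlt` are hypothetical, and the products are dense (order $N^2$) rather than fast.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legvander

def dlt_setup(N):
    """Grid (zeros of P_N), Gaussian weights, and the matrix P[k, n] = P_k(xi_n)."""
    xi, alpha = leggauss(N)
    P = legvander(xi, N - 1).T
    return xi, alpha, P

def forward_dlt(f_samples, alpha, P):
    """F_k = (2k+1)/2 * sum_n alpha_n f_n P_k(xi_n), k = 0..N-1."""
    k = np.arange(P.shape[0])
    return (2 * k + 1) / 2.0 * (P @ (alpha * f_samples))

def inverse_dlt(F, P):
    """f_n = sum_k F_k P_k(xi_n), n = 1..N."""
    return P.T @ F

N = 5
xi, alpha, P = dlt_setup(N)
f_samples = xi**3 + xi                 # x^3 + x = (8/5) P_1 + (2/5) P_3
F = forward_dlt(f_samples, alpha, P)
f_back = inverse_dlt(F, P)
print(F)
```

For a polynomial input of degree less than $N$ the transform returns the Legendre coefficients exactly (up to rounding), because every product $f\,P_k$ has degree at most $2N - 1$ and is integrated exactly by the quadrature rule.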

Since it is a somewhat tangential topic, this excursion into the realm of other discrete transforms has probably lasted long enough. In closing, we mention that the account of orthogonal polynomial transforms given in this chapter is incomplete in more than one aspect. There are still many discrete transforms that are not included in the orthogonal polynomial framework given here. We mention only the Walsh-Hadamard⁹, the Hilbert¹⁰, the discrete Bessel¹¹ (or Hankel¹²) [83], [84], and the Karhunen-Loève transforms at the head of the list of omissions. The book by Elliott and Rao [57] gives a good account of these and many more discrete transforms.

Another issue of extreme importance that is under active investigation is the matter of fast discrete transforms. For all of the discrete transforms mentioned, there is the attendant question of whether the $N$-point transform can be computed in FFT time (roughly $N \log N$ operations) rather than matrix-vector multiply time (roughly $N^2$ operations). In addition to the DFT, the discrete Chebyshev transform and the Walsh-Hadamard transforms have fast versions because of their kinship with the DFT. However, the search for fast algorithms for other discrete transforms is a wide open and tempting quest. It appears that wavelet methods have recently led to a fast Legendre transform [10], and there may be extensions to other discrete transforms.

8.6. The Hartley Transform

Some Background

It seems appropriate to include the Hartley transform in this chapter because it is intertwined so closely with the Fourier transform both mathematically and historically. It is of mathematical interest because it resembles the Fourier transform and shares many analogous properties. Computationally, it is also compelling since, in its discrete form, it may be a legitimate alternative to the DFT, with alleged advantages. Historically, the story of the Hartley transform is relatively brief, but it parallels that of the Fourier transform on a much-compressed scale. It made its first official appearance in a 1942 paper by Ralph V. L. Hartley¹³ in the Proceedings of the Institute of Radio Engineers [72]. The need to sample signals and approximate the continuous transform on computers led inevitably to the discrete form of the Hartley transform (DHT). Like the DFT, the DHT is a matrix-vector product, and it requires laborious calculations, particularly for long input sequences. The final step occurred in 1984 when Ronald N. Bracewell announced (and patented) the fast Hartley transform (FHT). This algorithm achieves its speed in much the same way as the FFT and computes the DHT in FFT time (order $N \log N$). It also led to claims that "for every application of the Fourier transform there is an application of the Hartley transform," and that the FFT has been "made obsolete by the Hartley formalism" [14].

In the last ten years there has been much animated discussion both on and off the record about the relative merits of the FFT and the FHT, with claims of superiority on both sides. In this section we will stop short of jumping into the FFT/FHT fray and try to keep the discussion as nonpartisan as possible. There is plenty to say if we just present the essentials of the DHT and point out its remarkable similarities to the DFT. We will collect the arguments for and against the FHT as an alternative to the FFT. If a final judgment is necessary, we will leave it to the reader!

⁹ JACQUES HADAMARD (1865-1963) was a French mathematician who is best known for his proof of the prime number theorem (on the density of primes) in 1896. He also wrote on the psychology of mathematical creativity.

¹⁰ Born in Königsberg in 1862, DAVID HILBERT founded the formalist school of mathematical thought in an attempt to axiomatize mathematics. His famous 23 problems, proposed at the International Mathematical Congress in Paris in 1900, set the agenda for twentieth century mathematics. He was a professor of mathematics at Göttingen from 1895 to 1930, and died there in 1943.

¹¹ F. W. BESSEL (1784-1846) was a friend of Gauss and a well-known astronomer. He is best known among mathematicians and engineers for the second-order differential equation, and the functions satisfying it, that bear his name.

¹² HERMANN HANKEL (1839-1873) was a German historian of mathematics. Although his name is associated with an integral transform, he appears to have done most of his work in algebra.

¹³ As a researcher at Western Electric, RALPH V. L. HARTLEY worked on the design of receivers for transatlantic radiotelephones. During World War I, he proposed a theory for the perception of sound by the human ear and brain. His early work in information theory led to Hartley's Law and was recognized when the fundamental unit of information was named the Hartley.

The Hartley Transform

We will begin with the definition of the continuous Hartley transform, essentially in the form given by Hartley in his 1942 paper. The input to the Hartley transform is a real-valued function $h$ defined on $(-\infty, \infty)$ that satisfies the condition $\int_{-\infty}^{\infty}|h(x)|\,dx < \infty$. Although the input may be a time-dependent signal, we will maintain our convention of using $x$ as the independent variable. The Hartley transform operates on the input $h$ and returns another function $H$ of the frequency variable $\omega$. The kernel of the Hartley transform is the combination

$$\operatorname{cas} x = \cos x + \sin x,$$

which clearly resembles the kernel of the Fourier transform ($\cos x + i\sin x$) except that it is real-valued. In its most commodious form, the forward Hartley transform is given as follows.

Forward Hartley Transform

$$H(\omega) = \int_{-\infty}^{\infty} h(x)\,\operatorname{cas}(2\pi\omega x)\,dx \qquad (8.21)$$

Given the transform $H$, the original input can be recovered using the inverse Hartley transform.

Inverse Hartley Transform

$$h(x) = \int_{-\infty}^{\infty} H(\omega)\,\operatorname{cas}(2\pi\omega x)\,d\omega \qquad (8.22)$$

Expressions (8.21) and (8.22) form the continuous Hartley transform pair. Two important properties can be gleaned immediately: the transform involves only real-valued quantities, and the Hartley transform is its own inverse.

Given the definition of the transform pair, one may deduce the relationship between the Fourier and Hartley transforms. As with any function, the even and odd parts of the Hartley transform are given by

$$E(\omega) = \frac{H(\omega) + H(-\omega)}{2}$$

and

$$O(\omega) = \frac{H(\omega) - H(-\omega)}{2}.$$

It is now easily shown (problem 209) that the Fourier transform of $h$,


is related to the Hartley transform by the following identities.

Hartley —> Fourier

Conversely, the Hartley transform can be obtained from the Fourier transform with the following relationship.

Fourier —> Hartley

Before turning to the discrete transform, there is another Fourier-Hartley link that should be mentioned since it bears on the controversy between the two formalisms. One of the most common reasons for computing the Fourier transform of an input signal is to determine two quantities: its power spectrum

and its phase

A point frequently cited by Hartley proponents is that these two important quantities are easily obtained from the Hartley transform by the relations (problem 210)

The Discrete Hartley Transform (DHT)

As mentioned earlier, the necessity of sampling input signals and approximating the Hartley transform numerically led naturally to the discrete version of the Hartley transform. The framework for the DHT is absolutely identical to that used throughout this book for the DFT. An input signal $h$ is sampled at $N$ equally spaced points of an interval $[0, A]$ in the spatial (or time) domain. We will denote these samples $h_n$, where $n = 0 : N - 1$. We have followed the most common convention of using indices in the range $0 : N - 1$, but the periodicity of the Hartley kernel allows other choices such as $1 : N$ or $-N/2 + 1 : N/2$.

The output of the DHT is a real-valued sequence $H_k$ given by the following relation.

Forward Discrete Hartley Transform

for k = 0 : N — 1. Notice that the sequence Hk denned in this way is periodic withperiod N. Furthermore, the kth Hartley coefficient should be interpreted as the weightassociated with the kth mode. Several representative modes of the DHT are shownin Figure 8.6. Notice that each mode is a linear combination of a sine and cosine,


and can be viewed as a single shifted sine or cosine mode. With this choice of indices(n, k = 0 : N — 1), the highest frequency occurs at k = AT/2, while the kih and(N — k)ih frequencies are the same. The reciprocity relations of the DFT carry overentirely to the DHT.
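The forward transform can be sketched directly from its definition. The following is a minimal O(N^2) illustration, not an efficient FHT; it assumes the 1/N factor sits on the forward transform (matching the DFT convention of this book), and the function name `dht` and the sample data are hypothetical.

```python
import numpy as np

def dht(h):
    """Direct O(N^2) discrete Hartley transform of a real sequence.

    Assumed convention: H_k = (1/N) * sum_n h_n * cas(2*pi*n*k/N),
    where cas(x) = cos(x) + sin(x).
    """
    h = np.asarray(h, dtype=float)
    N = h.size
    n = np.arange(N)
    arg = 2.0 * np.pi * np.outer(n, n) / N   # arg[k, n] = 2*pi*n*k/N
    cas = np.cos(arg) + np.sin(arg)          # Hartley kernel matrix
    return cas @ h / N

# Illustrative (made-up) real input of length N = 8.
h = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0])
H = dht(h)
print(H)
```

The output is real for real input, and H[0] is the mean of the samples, which is the sense in which the zeroth coefficient is the DC component.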

As with the DFT, the inverse DHT follows with the aid of orthogonality relations. The usual arsenal of trigonometric identities and/or geometric series can be used (problem 211) to show that

\[ \sum_{n=0}^{N-1} \mathrm{cas}\!\left(\frac{2\pi nj}{N}\right) \mathrm{cas}\!\left(\frac{2\pi nk}{N}\right) = \begin{cases} N & \text{if } j - k \text{ is a multiple of } N, \\ 0 & \text{otherwise.} \end{cases} \]

It is now possible (problem 213) to deduce the inverse of the relationship given in (8.23).

Inverse Discrete Hartley Transform

\[ h_n = \sum_{k=0}^{N-1} H_k\, \mathrm{cas}\!\left(\frac{2\pi nk}{N}\right) \tag{8.24} \]

for $n = 0 : N-1$. The sequence $h_n$ defined by the inverse DHT is real and periodic with period $N$. However, the most significant property of this transform pair (particularly in the proselytism of the DHT) is the fact that the DHT is its own inverse, up to the multiplicative constant $1/N$. In other words, the same algorithm (computer program) can be used for both the forward and inverse transform. This cannot be said of the DFT, in which the forward transform of a real sequence is a conjugate even sequence, which necessitates a distinct algorithm for the inverse transform.
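The self-inverse property is easy to check numerically. In this sketch a single routine, here called `cas_sum` (a hypothetical name), performs the kernel sum; the forward transform divides by N (an assumed placement of the scaling factor) and the inverse reuses the same routine with no scaling.

```python
import numpy as np

def cas_sum(x):
    """Return y_k = sum_n x_n * cas(2*pi*n*k/N) for a real sequence x."""
    x = np.asarray(x, dtype=float)
    N = x.size
    n = np.arange(N)
    arg = 2.0 * np.pi * np.outer(n, n) / N
    return (np.cos(arg) + np.sin(arg)) @ x

# Illustrative (made-up) data.
h = np.array([0.5, -1.0, 2.0, 4.0, 0.0, 1.5, -3.0, 2.5])

H = cas_sum(h) / h.size   # forward DHT (assumed 1/N scaling)
h_back = cas_sum(H)       # inverse DHT: same code, no scaling

print(np.max(np.abs(h_back - h)))
```

The only difference between the two directions is the constant factor, which is why a single program (or chip) can serve for both.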

FIG. 8.6. Modes of the discrete Hartley transform, $\mathrm{cas}(2\pi nk/N)$, with frequencies (clockwise from upper left) $k = 1, 3, 9$, and $5$ on a grid with $N = 16$ points are shown. Each mode is a single shifted sine (or cosine) wave.


Now the properties of the DHT follow in great profusion, most of them with palpable similarity to the familiar DFT properties. We will highlight a few of these features and leave many more to the exercises. The relationships between the DFT and DHT are of particular interest. Arguing as we did with the continuous transforms, the even and odd parts of the DHT are given by

\[ E_k = \frac{H_k + H_{-k}}{2}, \qquad O_k = \frac{H_k - H_{-k}}{2}. \]

Notice that the periodicity of the sequence $H_k$ allows us to replace $H_{-k}$ by $H_{N-k}$. If we let $F_k$ denote the DFT of the real input sequence $h_n$, it is a straightforward calculation to show that

\[ F_k = E_k - iO_k \qquad \text{and} \qquad H_k = \mathrm{Re}\{F_k\} - \mathrm{Im}\{F_k\}, \]

where in both relations $k = 0 : N-1$. It is useful and reassuring to write these relations in a slightly different way. Recall that since the input sequence $h_n$ is assumed to be real, the DFT coefficients have the conjugate even symmetry $F_k = F^*_{N-k}$. This says that there are $N$ independent real quantities in the DFT sequence: the real and imaginary parts of $F_1, \ldots, F_{N/2-1}$ plus $F_0$ and $F_{N/2}$, which are both real. Therefore, the DHT → DFT relations can be found easily (problem 212).

DHT → DFT

\[ \mathrm{Re}\{F_k\} = \frac{H_k + H_{N-k}}{2}, \qquad \mathrm{Im}\{F_k\} = -\frac{H_k - H_{N-k}}{2} \]

for $k = 0 : N/2$. We can also use the conjugate even symmetry of the DFT sequence to convert the DFT to the DHT.

DFT → DHT

\[ H_k = \mathrm{Re}\{F_k\} - \mathrm{Im}\{F_k\}, \qquad H_{N-k} = \mathrm{Re}\{F_k\} + \mathrm{Im}\{F_k\} \]

for $k = 0 : N/2$. Note that these relations imply $F_0 = H_0$ and $F_{N/2} = H_{N/2}$. The second pair shows clearly that given a real input sequence of length $N$, the DFT and the DHT are both sets of $N$ distinct real quantities that are intimately related.
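The conversions between the two sets of coefficients amount to one pass through the data. The sketch below assumes both transforms carry the 1/N factor on the forward transform and that the DFT uses the kernel $e^{-i2\pi nk/N}$; the function names are hypothetical, and NumPy's `fft` (which is unscaled) is used only as an independent check.

```python
import numpy as np

def dht(h):
    """Direct DHT with assumed 1/N forward scaling."""
    h = np.asarray(h, dtype=float)
    N = h.size
    n = np.arange(N)
    arg = 2.0 * np.pi * np.outer(n, n) / N
    return (np.cos(arg) + np.sin(arg)) @ h / N

def dft_from_dht(H):
    """F_k = (H_k + H_{N-k})/2 - i (H_k - H_{N-k})/2, indices taken mod N."""
    H = np.asarray(H, dtype=float)
    H_rev = np.roll(H[::-1], 1)          # H_rev[k] = H[(N - k) mod N]
    return 0.5 * (H + H_rev) - 0.5j * (H - H_rev)

def dht_from_dft(F):
    """H_k = Re{F_k} - Im{F_k}."""
    F = np.asarray(F, dtype=complex)
    return F.real - F.imag

# Illustrative (made-up) real input.
h = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0])
H = dht(h)
F = dft_from_dht(H)
print(np.max(np.abs(F - np.fft.fft(h) / h.size)))
```

Here the relations are applied for all k = 0 : N-1 rather than only k = 0 : N/2; the extra half is redundant because of the conjugate even symmetry.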

A few properties of the DHT are worth listing, particularly those with obvious DFT parallels. In all cases the proofs of these properties follow directly from the definition (8.23) and are relegated to the problem section. The operational notation $\mathcal{H}$ is used to denote the DHT; that is, $H_k = \mathcal{H}\{h_n\}_k$.

1. Periodicity. If the $N$-point sequences $h_n$ and $H_k$ are related by the DHT transform pair (8.23) and (8.24), then $h_n = h_{n \pm N}$ and $H_k = H_{k \pm N}$.

2. Reversal. $\mathcal{H}\{h_{-n}\}_k = H_{-k}$.

3. Even sequences. If $h_n$ is an even sequence ($h_{N-n} = h_n$), then $H_k$ is even.

4. Odd sequences. If $h_n$ is an odd sequence ($h_{N-n} = -h_n$), then $H_k$ is odd.

5. Shift property.


6. Convolution. Letting

where $(h * g)_n$ is the cyclic convolution of the two sequences $h_n$ and $g_n$.

For the details on these and other properties, plus an excursion into two-dimensional DHTs, filtering with the DHT, and matrix formulations of the DHT, the reader is directed to Bracewell's testament, The Hartley Transform [14].

In closing we shall attempt to summarize the current state of affairs regarding the DFT and the DHT. We must look at the issue from mathematical, physical, and computational perspectives (and even then we will have undoubtedly committed some oversimplifications).

Mathematically, there is a near equivalence between the two discrete transforms when the input consists of $N$ real numbers. The two sets of transform coefficients consist of $N$ distinct real numbers that are simply related. Furthermore, quantities of physical interest, such as the power and phase, are easily computed from either set of coefficients. From this perspective, there is little reason to prefer one transform over another.
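As an illustration of this near equivalence, the power and phase at each frequency can be computed from the Hartley coefficients alone. The sketch below uses the discrete analogs of the continuous relations, $|F_k|^2 = (H_k^2 + H_{N-k}^2)/2$ and phase $= \mathrm{atan2}(-O_k, E_k)$; it assumes 1/N forward scaling for both transforms, and all names and data are hypothetical.

```python
import numpy as np

def dht(h):
    """Direct DHT with assumed 1/N forward scaling."""
    h = np.asarray(h, dtype=float)
    N = h.size
    n = np.arange(N)
    arg = 2.0 * np.pi * np.outer(n, n) / N
    return (np.cos(arg) + np.sin(arg)) @ h / N

# Illustrative (made-up) real input.
h = np.array([2.0, -1.0, 0.5, 3.0, 1.0, -2.5, 0.0, 1.5])
H = dht(h)
H_rev = np.roll(H[::-1], 1)        # H_rev[k] = H[(N - k) mod N]

E = 0.5 * (H + H_rev)              # even part of the DHT
O = 0.5 * (H - H_rev)              # odd part of the DHT

power = 0.5 * (H**2 + H_rev**2)    # equals |F_k|^2
phase = np.arctan2(-O, E)          # equals arg(F_k) where F_k != 0

F = np.fft.fft(h) / h.size         # DFT, used only as a check
print(np.max(np.abs(power - np.abs(F)**2)))
```

No complex arithmetic is needed on the Hartley side; the complex DFT appears only in the final comparison.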

In terms of physical applicability, Nature appears to have a predilection for the Fourier transform. It is the Fourier transform that appears as the analog output of optical and diffraction devices; it is the Fourier transform that arises most naturally in classical models of physical phenomena such as diffusion and wave propagation. From this perspective, a reasonable conclusion is that if a particular problem actually calls for the Hartley transform (or a set of Hartley coefficients), then the DHT should be used. In the vast preponderance of problems in which the Fourier transform (or a set of DFT coefficients) is needed, the DFT should be used. So far, these conclusions are hardly profound or unexpected.

The question remains: is there anything to be gained by using the DHT to compute the DFT? (The opposite question does not seem to attract as much attention.) It is this computational issue that has drawn the lines between the DFT and the DHT camps. As mentioned earlier, there are now fast versions of both the DFT and the DHT, which reduce the computation of these transforms from an $O(N^2)$ to an $O(N \log N)$ chore. If a comparison is made between the FHT and the complex form of the FFT (which is occasionally done), then there is a factor-of-two advantage in favor of the FHT (in arithmetic operations). As is well known (see Chapter 4), there are compact symmetric versions of the FFT that exploit the symmetries of a real input sequence and reduce the computational effort by that same factor of two. Therefore, if the appropriate comparison between the DHT and the real FFT is made, the two algorithms have virtually identical operation counts (one count favors the compact FFT by $N - 2$ additions). The difference in operation counts is far outweighed by practical considerations such as hardware features, data handling, overall architecture (serial vs. vector vs. multiprocessor), as well as the efficiency of software implementation.

Let's prolong the discussion a bit more and agree to call the performance issue a toss-up; that is, in a fixed computer environment, the computation of the Hartley coefficients by the FHT is as efficient as the computation of Fourier coefficients by the FFT. Then two additional factors should be cited. If the FHT is used to compute Fourier coefficients, then an extra postprocessing step is needed (given by (8.25)). While there is minimal arithmetic in this step, it does require an additional pass through the data, which will almost certainly give the advantage to the FFT. On the other hand, the FHT has the clear advantage that only one code (program or microprocessor) is needed to do both the forward and inverse DHT. The real FFT needs one code for the forward transform and a separate (but equally efficient) code for the inverse transform. If one is designing chips for satellite guidance systems, this may be a deciding factor; for supercomputer software packages, it probably is not. On that note of equivocation we will close this brief glance at the Hartley transform, and indeed this chapter on related transforms.

Notes

One of the most ardent champions of the fast Hartley transform was the late Oscar Buneman. His papers [23], [24] contain valuable reading on further refinements and implementations of the FHT, as well as the extension to multidimensional DHTs. Further deliberations on the uses and abuses of the DHT, as well as comments on the FFT vs. FHT issue, can be found in [15], [53], [71], [100], [123], and [127]. Attempts to determine the current status of FHT codes were unsuccessful. Bracewell's book [14] contains BASIC programs for various versions of the FHT. Each program carries the citation Copyright 1985 The Board of Trustees of the Leland Stanford Junior University. The extent to which this caption limits the use and reproduction of FHT codes is not known.

8.7. Problems

Laplace Transforms

172. Inverse Laplace transforms. Use the inversion method described in the text to approximate the inverse Laplace transform of the following functions of $s = c + i\omega$. The exact inverse is also given.

(a) $F(s) = 1/(s - 3)$ where $c = \mathrm{Re}\{s\} > 3$ has the inverse $f(t) = e^{3t}$.

(b) $F(s) = s/(s^2 + 16)$ where $c = \mathrm{Re}\{s\} > 0$ has the inverse $f(t) = \cos 4t$.

(c) $F(s) = 1/s^2$ where $c = \mathrm{Re}\{s\} > 0$ has the inverse $f(t) = t$.

In each case experiment with the number of sample points $N$, the length of the sampling interval $\Omega$, and the free parameter $c$ to determine how the errors behave. For fixed values of $N$ and $\Omega$, try to determine experimentally the optimal value of $c$.

z-Transforms

173. Linearity of the z-transform. Show that if $a$ and $b$ are constants and if $u_n$ and $v_n$ are sequences, then

\[ \mathcal{Z}\{a u_n + b v_n\} = a\,\mathcal{Z}\{u_n\} + b\,\mathcal{Z}\{v_n\}. \]

174. Shift property. Show that if $u_{n+k}$ is the sequence obtained from $u_n$ by shifting it to the left by $k$ places, then

where $k$ is any positive integer.

175. Finding z-transforms. Find the z-transform of the following sequences $u_n$, where $n = 0, 1, 2, 3, \ldots$. Indicate the values of $z$ for which the transforms are valid.

where $a > 0$ is any real number.

if $n$ is even, if $n$ is odd.

$\sin n\theta$, where $\theta$ is any real number.

176. Two special inverses. Verify the last two transform pairs of Table 8.1:

and

What are the regions of validity for these transforms?

177. Inverse z-transforms. Find the sequences that are the inverse z-transforms of the following functions.

where $p$ is a positive integer.

178. Solving initial value problems. Use the z-transform to solve the following initial value problems.

179. Fibonacci sequence. One of the most celebrated sequences in mathematics is the Fibonacci sequence, which arises in many diverse applications. It can be generated from the initial value problem

\[ u_{n+2} = u_{n+1} + u_n \]

for $n = 0, 1, 2, 3, \ldots$, with $u_0 = 1$ and $u_1 = 2$. Use the z-transform to solve this initial value problem and generate the Fibonacci sequence.

180. Inverse z-transform from the definition. Begin with the inversion formula for the z-transform

\[ u_n = \frac{1}{2\pi i}\oint_C U(z)\, z^{n-1}\, dz \]

for $n = 0, 1, 2, 3, \ldots$, where $C$ is a circle of radius $R$ centered at the origin. Approximate this contour integral by a sum of integrand values at the points $z_k = Re^{i2\pi k/N}$, where $k = 0 : N-1$, and show that

\[ u_n \approx \frac{R^n}{N}\sum_{k=0}^{N-1} U(z_k)\, e^{i2\pi nk/N} \]

for $n = 0 : N-1$.

181. Laplace transform to z-transform. Begin with the definition of the Laplace transform

\[ U(s) = \int_0^{\infty} u(t)\, e^{-st}\, dt \]

and make the change of variables $z = e^s$ (where $z$ and $s$ are complex). Let $u_n$ be samples of $u(t)$ at the points $t_n = n$ for $n = 0, 1, 2, 3, \ldots$ to derive the z-transform $U(z)$ as an approximation to the Laplace transform $U(s)$. If the Laplace transform is valid for $\mathrm{Re}\{s\} > s_0$, for what values of $z$ is the corresponding z-transform valid?

182. The convolution theorem. It should come as no surprise that the z-transform has a convolution theorem. Prove that

183. Inverse z-transforms with nonsimple poles. The problem of inverting a z-transform $U(z)$ that has multiple roots in the denominator is more challenging. It can be done either by long division or by using the convolution theorem. For example, to invert $U(z) = (z - 2)^{-2}$, one may let $F(z) = G(z) = (z - 2)^{-1}$ and note that $f_n = g_n = 2^{n-1}$ for $n \ge 1$ and $f_0 = g_0 = 0$. The convolution theorem can then be used. Find the inverse z-transforms of the following functions.


Chebyshev Transforms

184. Zeros of the Chebyshev polynomials. Show that the zeros of the $n$th Chebyshev polynomial $T_n$ are given by

\[ \xi_j = \cos\!\left(\frac{(2j-1)\pi}{2n}\right) \]

for $j = 1 : n$.

185. Extreme values. Show that $|T_n(x)| \le 1$ on $[-1, 1]$ for $n = 0, 1, 2, \ldots$, and that the extreme values of $T_n$ occur at the points

\[ \eta_j = \cos\!\left(\frac{j\pi}{n}\right) \]

for $j = 0 : n$.

186. Multiplicative property. Use the definition of the Chebyshev polynomials to show that

\[ T_m(x)\,T_n(x) = \tfrac{1}{2}\left[T_{m+n}(x) + T_{|m-n|}(x)\right]. \]

187. Semigroup property. Use the definition of the Chebyshev polynomials to show that

\[ T_m(T_n(x)) = T_{mn}(x). \]

188. Recurrence relation. Prove the recurrence relation

\[ T_n(x) = 2x\,T_{n-1}(x) - T_{n-2}(x) \]

for $n = 2, 3, 4, \ldots$.

189. Differential equation. Compute $T_n'(x)$ and $T_n''(x)$ in terms of the variable $\theta = \cos^{-1} x$ to show that $T_n$ satisfies the differential equation

\[ (1 - x^2)\,T_n''(x) - x\,T_n'(x) + n^2\,T_n(x) = 0 \]

for $n = 0, 1, 2, \ldots$.

190. Representation of polynomials. Show that the polynomial

has the representation

Show further (in a process called economization) that the truncated polynomials

and

give approximations to p that satisfy

for

191. Least squares property. Show that of all $n$th-degree polynomials $p_n$, the normalized Chebyshev polynomial $T_n$ minimizes

(Hint: Expand an arbitrary $p_n$ as a sum of Chebyshev polynomials.)

192. Error in truncating a Chebyshev expansion. Use the orthogonality of the Chebyshev polynomials to show that if the expansion

is truncated after $N + 1$ terms, the resulting partial sum $S_N$ satisfies

193. Discrete orthogonality. Assuming that $0 \le j, k \le N$, prove the discrete orthogonality properties with respect to both the grid points and the degree of the polynomials

and

where $\eta_n = \cos(\pi n/N)$ are the $x$-coordinates of the extreme points of $T_N$. (Hint: Use the discrete orthogonality of the cosine.)

194. Chebyshev series synthesis. Assume that the coefficients $c_0, c_1, \ldots, c_N$ of the expansion

are given. Show how the inverse DCT can be used to approximate $f(x)$ at selected points of $[-1, 1]$.

195. Discrete orthogonality on the zeros. The Chebyshev polynomials also have a discrete orthogonality property that uses the zeros of the polynomials as grid points. Let $\xi_n$ be the zeros of $T_N$ for $n = 1 : N$. Show that for $0 \le j, k \le N-1$ the following relations hold:

(Hint: Use the discrete orthogonality of the cosine.) Note that this property is with respect to the degree of the polynomials in analogy to the discrete orthogonality property 2 of the text.

196. A matrix property. Recall that a matrix identity was used to derive the second discrete orthogonality property from the first. Show that if an $N \times N$ matrix $C$ satisfies $C^T C = I$, then $C C^T = I$.

197. Representation of polynomials. Use the discrete orthogonality on zeros (problem 195) to show that any polynomial $p$ of degree $N - 1$ can be expressed in the form

where

198. Representations of functions. Verify the following Chebyshev expansions analytically or with a symbolic algebra package:

for $k = 0 : N-1$.

In each case

(a) Approximate the coefficients in the expansion using the forward discrete Chebyshev transform and compare the computed results to the exact coefficients. How do the errors vary with $N$?

(b) Use the exact coefficients $c_0, \ldots, c_N$ as input to the inverse discrete Chebyshev transform and compute approximations to the function values at selected points of $[-1, 1]$. Monitor the errors as they vary with $N$.

Legendre Transforms

199. Legendre coefficients. Use the orthogonality of the Legendre polynomials to show that the coefficients in the expansion

on $[-1, 1]$ are given by

for $k = 0, 1, 2, \ldots$.

200. Legendre expansions for polynomials.

(a) Find the representation of $f(x) = 3x^3 - 1x$ in terms of Legendre polynomials.

(b) Show that any polynomial of degree $n$ can be represented exactly by a linear combination of $P_0, \ldots, P_n$.

(c) Show that $P_n$ is orthogonal to all polynomials of degree less than $n$.

201. Rodrigues' formula. Use Rodrigues' formula

\[ P_n(x) = \frac{1}{2^n\, n!}\,\frac{d^n}{dx^n}\left(x^2 - 1\right)^n \]

for $n = 0, 1, 2, \ldots$ to determine $P_0$, $P_1$, $P_2$, and $P_3$.

202. Recurrence relations. Combine recurrence relations (8.16) and (8.17) to obtain the relation

for $n = 1, 2, 3, \ldots$, which is needed for the discrete orthogonality properties. Simplify this relation when $x = \xi_k$, a zero of $P_n$.

203. Legendre's differential equation. Eliminate the term $P'_{n-1}$ between the recurrence relations (8.16) and (8.17) to derive the differential equation

satisfied by Pn.

204. Some discrete Legendre transforms. Let $\mathcal{P}\{f_n\}_k$ denote the $k$th component of the $N$-point discrete Legendre transform of the sequence $f_n$. Show that

where $g$ is an arbitrary function

where $\xi_n$ are the zeros of $P_N$.

(Hint: Use the recurrence relation (8.15).)

205. Discrete orthogonality property 2 for Legendre polynomials. Derive the second discrete orthogonality property for Legendre polynomials (8.20) from the general property (8.14).

206. Discrete Legendre transform pair. Derive the discrete Legendre transform pair

for $k = 0 : N-1$ and

for $n = 1 : N$ by


(a) appealing to the general transform pair (8.11) and (8.12), and

(b) using the discrete orthogonality properties of the Legendre polynomials (noting that a different property must be used for the forward and inverse transforms).

207. Legendre expansions. For each of the following functions on $[-1, 1]$ compute as many coefficients $c_k$ in the expansion as possible (using either analytical methods or a symbolic algebra package).

Then

(a) Approximate these same coefficients using an $N$-point discrete Legendre transform and compare the results to the exact coefficients $c_k$ for various values of $N$.

(b) Use the exact coefficients $c_0, \ldots, c_{N-1}$ as input to the inverse $N$-point discrete Legendre transform to approximate $f(\xi_n)$ and compare the results to the original functions.

208. Open questions about discrete orthogonality. The discrete Chebyshev transform was presented separately in this chapter because it is related so closely to the discrete cosine transform. However, it could also be developed within the general framework of discrete orthogonal polynomial transforms.

(a) Find the two discrete orthogonality properties of the Chebyshev polynomials from the general orthogonality properties (8.13) and (8.14). Note that these two properties use the zeros, not the extreme points, of the polynomials as grid points. Conclude that there are now four different discrete orthogonality relations for the Chebyshev polynomials: two that use the extreme points as grid points and two that use the zeros as grid points; in each case, we have property 1 (with respect to the grid points), and property 2 (with respect to the degree of the polynomials).

(b) Show that the discrete orthogonality property with respect to degree derived in part (a) of this problem (using the zeros as grid points) corresponds to the orthogonality property derived above in problem 195. (Hint: You will need the identity $T'_N(\xi_n) = N U_{N-1}(\xi_n)$, where $U_n = \sin(n\theta)/\sin\theta$ is the $n$th Chebyshev polynomial of the second kind.)

(c) Derive the discrete Chebyshev transform pair that uses the zeros of the polynomials as grid points. How does it compare to the first transform pair?

(d) Since the Chebyshev polynomials have four different discrete orthogonality properties, do all orthogonal polynomials have orthogonality properties that use the extreme points (instead of the zeros) as grid points? If so, are there two such properties: property 1 with respect to the grid points, and property 2 with respect to the degree of the polynomials?

The Hartley Transform

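209. Continuous Hartley and Fourier transforms. Given a function defined on $(-\infty, \infty)$, show that its Fourier transform $F$ is related to its Hartley transform $H$ by

\[ F(\omega) = E(\omega) - iO(\omega), \]

where $E$ and $O$ are the even and odd parts of the Hartley transform, respectively. Show from this representation of $F$ that it has the conjugate even symmetry $F(-\omega) = F^*(\omega)$.

210. Power spectrum and phase. Show that the power spectrum and phase of a function can be obtained from the continuous Hartley transform by the relations

\[ |F(\omega)|^2 = \frac{H(\omega)^2 + H(-\omega)^2}{2}, \qquad \phi(\omega) = \tan^{-1}\!\left(\frac{-O(\omega)}{E(\omega)}\right). \]

211. Orthogonality. Prove the orthogonality property of the DHT kernel

\[ \sum_{n=0}^{N-1} \mathrm{cas}\!\left(\frac{2\pi nj}{N}\right) \mathrm{cas}\!\left(\frac{2\pi nk}{N}\right) = \begin{cases} N & \text{if } j - k \text{ is a multiple of } N, \\ 0 & \text{otherwise.} \end{cases} \]

212. DFT to DHT and back. Verify the DHT → DFT relationships

\[ \mathrm{Re}\{F_k\} = \frac{H_k + H_{N-k}}{2}, \qquad \mathrm{Im}\{F_k\} = -\frac{H_k - H_{N-k}}{2} \]

for $k = 0 : N/2$, and the DFT → DHT relationships

\[ H_k = \mathrm{Re}\{F_k\} - \mathrm{Im}\{F_k\}, \qquad H_{N-k} = \mathrm{Re}\{F_k\} + \mathrm{Im}\{F_k\} \]

for $k = 0 : N/2$.

213. Inverse DHT. Using the orthogonality of the cas functions, prove that the DHT is its own inverse up to a multiplicative constant.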

214. DHT properties. Verify the following properties of the DHT.

(a) Reversal. $\mathcal{H}\{h_{-n}\}_k = H_{-k}$.

(b) Even sequences. If $h_n$ is an even sequence ($h_{N-n} = h_n$), then $H_k$ is even.

(c) Odd sequences. If $h_n$ is an odd sequence ($h_{N-n} = -h_n$), then $H_k$ is odd.

(d) Shift property.

(e) Convolution. Letting

where $(h * g)_n$ is the cyclic convolution of the two sequences $h_n$ and $g_n$.

215. Further properties. Given an input sequence $h_n$, let $F_k$ and $H_k$ be its DFT and DHT, respectively. Show that the following relationships are true.

(a) Sum of sequence (DC component):

(b) First value:

(c) Parseval's theorem:

216. Convolution and symmetry. Show that if either of the two sequences $h_n$ or $g_n$ in a cyclic convolution is even, then the convolution theorem for the DHT simplifies considerably and becomes

217. Generalized Hartley kernels. The function $\mathrm{cas}\, x = \cos x + \sin x$ is a sine function shifted by $\pi/4$ radians. A more general transform may be devised by using the kernel $\sqrt{2}\sin(x + \phi)$, where $\phi$ is any angle other than a multiple of $\pi/2$. Show that if this kernel is used in the forward transform, the kernel for the inverse transform is $\sqrt{\cot\phi}\,\sin x + \sqrt{\tan\phi}\,\cos x$ [14].

Chapter 9

Quadrature and the DFT

9.1 Introduction

9.2 The DFT and the Trapezoid Rule

9.3 Higher-Order Quadrature Rules

9.4 Problems

It [Fourier series] reveals the transcendence of analysis over geometrical perception. It signalizes the flight of human intellect beyond the bounds of the senses.
- Edward B. Van Vleck, 1914

9.1. Introduction

The theory and techniques of numerical integration, or quadrature, comprise one of the truly venerable areas of numerical analysis. Its history reflects contributions from some of the greatest mathematicians of the past three hundred years. The subject is very tangible since most methods are ultimately designed for the practical problem of approximating the area of regions under curves. And yet, as we will see, there are some subtleties in the subject as well. It is not surprising that eventually the DFT and quadrature should meet. After all, the DFT is an approximation to Fourier coefficients or Fourier transforms, both of which are defined in terms of integrals. The design of quadrature rules for oscillatory integrands (particularly Fourier integrals) goes back at least to the work of Filon in 1928 [58], and the literature is filled with contributions to the subject throughout the intervening years [22], [45], [80]. During the 1970s, the subject of the DFT as a quadrature rule was revisited in a flurry of correspondences [1], [2], [56], [105] that provided provocative numerical evidence. The subject remains important today, as practitioners look for the most accurate and efficient methods for approximating Fourier integrals [95]. The goal of this section is to explore the fundamental connections between the DFT and quadrature rules, to resolve a few subtle points, and to provide a survey of the work that has been done over many years.

In previous chapters we considered how the DFT can be used to approximate both Fourier coefficients and Fourier transforms. Relying heavily on the Poisson Summation Formula, it was possible to develop accurate error bounds for these approximations. In the process, we were actually using the DFT as a quadrature rule to approximate definite integrals; yet, ironically, the notion of quadrature errors was never mentioned. In this chapter we will establish the DFT as a bona fide quadrature rule; in fact, as we have seen, it is essentially the familiar trapezoid rule. From this vantage point, it will be possible to gain complementary insights into the DFT and its errors. The DFT error bounds that arise in the quadrature setting rely on the powerful Euler-Maclaurin Summation Formula. These error bounds are stronger than the results of the previous chapters in some cases, and weaker in others. We will also look at how higher-order (more accurate) quadrature rules can be related to the DFT. There are cases in which higher-order methods provide better approximations to Fourier coefficients, and other situations in which the DFT appears to be nearly optimal. As usual, the journey is as rewarding as the final results, so let us begin.

9.2. The DFT and the Trapezoid Rule

For ease of exposition we use the interval $[-\pi, \pi]$, although everything that is said and done can be applied to any finite interval. We will consider the problem of computing the Fourier coefficients of a function $f$ that is defined on the interval $[-\pi, \pi]$. As we have already seen, these coefficients are given by

\[ c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\, e^{-ikx}\, dx \]

(corresponding to $A = 2\pi$). Note that the problem of approximating $\int_{-\pi}^{\pi} f(x)\,dx$ is just the special case of approximating $2\pi c_0$. Recall how the trapezoid rule is used to approximate such an integral. The interval $[-\pi, \pi]$ is first divided into $N$ subintervals of width $2\pi/N$. To use fairly standard notation, we will now let $h$ rather than $\Delta x$ denote the width of these subintervals. The resulting grid points will be denoted $x_n = nh$, where $n = -N/2 : N/2$, with $x_{\pm N/2} = \pm\pi$. For the moment, let the integrand be denoted $g(x) = f(x)e^{-ikx}$. Then the trapezoid rule approximation to $\int_{-\pi}^{\pi} g(x)\,dx$ with $N$ subintervals (Figure 9.1) results when the integrand $g$ is approximated by a piecewise linear function (connected straight line segments). The area of the resulting $N$ trapezoids is

\[ T_N\{g\} = \frac{h}{2}\left[ g(x_{-N/2}) + 2\sum_{n=-N/2+1}^{N/2-1} g(x_n) + g(x_{N/2}) \right], \]

where we have introduced the notation $T_N\{g\}$ to stand for the $N$-point trapezoid rule approximation with $g$ as an integrand.

FIG. 9.1. The trapezoid rule for $\int_{-\pi}^{\pi} g(x)\,dx$ results when the integrand $g$ is replaced by a piecewise linear function over $N$ subintervals of $[-\pi, \pi]$. The area under the piecewise linear curve is the sum of areas of trapezoids.

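One small rearrangement of this expression will be very instructive. Letting $f_n = f(x_n)$ and using $h = 2\pi/N$, so that $e^{-ikx_n} = e^{-i2\pi nk/N}$, we have

\[ T_N\{g\} = \frac{2\pi}{N}\left[ \tfrac{1}{2} f(-\pi)\, e^{i\pi k} + \sum_{n=-N/2+1}^{N/2-1} f_n\, e^{-i2\pi nk/N} + \tfrac{1}{2} f(\pi)\, e^{-i\pi k} \right]. \]

This last expression begins to resemble the DFT. Two cases now present themselves.

1. If the given function $f$ satisfies the condition that $f(-\pi) = f(\pi)$, then the first and last terms of the rule may be combined (recall that $e^{i\pi k} = e^{-i\pi k}$ for integer $k$). Letting $f_{N/2} = f(\pi) = f(-\pi)$ and letting $F_k$ denote the DFT of the sequence $f_n$, we have

\[ T_N\{g\} = \frac{2\pi}{N}\sum_{n=-N/2+1}^{N/2} f_n\, e^{-i2\pi nk/N} = 2\pi F_k, \]

and we can write

\[ c_k \approx \frac{1}{2\pi}\, T_N\{g\} = F_k \]

for $k = -N/2+1 : N/2$. In other words, if $f$ has the same value at the endpoints, then the trapezoid rule and the DFT are identical.

2. What happens if $f(-\pi) \ne f(\pi)$? If we are interested in approximating the Fourier coefficients $c_k$, then we can agree to define

\[ f_{N/2} = \tfrac{1}{2}\left[ f(-\pi) + f(\pi) \right], \]

and we still have

\[ T_N\{g\} = 2\pi F_k \]

and

\[ c_k \approx F_k \]

for $k = -N/2+1 : N/2$. Notice that this "agreement" about $f_{N/2}$ is not arbitrary. Letting $f_{N/2}$ be the average of the endpoint values is precisely the requirement (AVED) that we have used consistently to define the input to the DFT.

In either case, we see that the trapezoid rule applied to $f(x)e^{-ikx}$ gives the same result as the DFT of the sequence $f_n$, provided that average values are used at endpoints and discontinuities in defining $f_n$. For this reason we will identify the trapezoid rule with the DFT for the remainder of this chapter.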
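This identification can be checked numerically. The sketch below compares the trapezoid rule applied to $f(x)e^{-ikx}$, divided by $2\pi$, with the DFT of the samples $f_n$ after the endpoint average has been imposed. It assumes the DFT carries the factor 1/N and the kernel $e^{-i2\pi nk/N}$ with indices $n = -N/2+1 : N/2$; the test function $e^x$ and all function names are hypothetical choices for illustration, and NumPy's unscaled `fft` stands in for the DFT.

```python
import numpy as np

def trapezoid_coeff(f, k, N):
    """(1/(2*pi)) * trapezoid rule for the integral of f(x) e^{-ikx} on [-pi, pi]."""
    h = 2.0 * np.pi / N
    x = h * np.arange(-N // 2, N // 2 + 1)        # N + 1 grid points
    g = f(x) * np.exp(-1j * k * x)
    T = h * (0.5 * g[0] + g[1:-1].sum() + 0.5 * g[-1])
    return T / (2.0 * np.pi)

def dft_with_aved(f, N):
    """DFT (with 1/N) of f_n, n = -N/2+1 : N/2, using the endpoint average."""
    n = np.arange(-N // 2 + 1, N // 2 + 1)
    fn = np.array(f(2.0 * np.pi * n / N), dtype=float)
    fn[-1] = 0.5 * (f(-np.pi) + f(np.pi))          # AVED at x = pi
    a = np.empty(N)
    a[n % N] = fn                                  # reorder to indices 0 : N-1
    return np.fft.fft(a) / N                       # entry k % N holds F_k

f = np.exp          # a smooth function with f(-pi) != f(pi)
N = 16
F = dft_with_aved(f, N)
diffs = [abs(trapezoid_coeff(f, k, N) - F[k % N])
         for k in range(-N // 2 + 1, N // 2 + 1)]
print(max(diffs))
```

The reordering step relies only on the N-periodicity of the kernel, so the same check works for any even N.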

There is a wealth of theory about the error in quadrature rules such as the trapezoid rule. It seems reasonable to appeal to those results to estimate the error in the trapezoid rule (DFT) approximation to Fourier coefficients. The most familiar statement about the error in the trapezoid rule can be found in every numerical analysis book (e.g., [22], [45], [80], [166]), and for our particular problem it looks like this.

THEOREM 9.1. ERROR IN THE TRAPEZOID RULE. Let $g \in C^2[-\pi, \pi]$ (that is, $g$ has at least two continuous derivatives on $[-\pi, \pi]$). Let $T_N\{g\}$ be the trapezoid rule approximation to $\int_{-\pi}^{\pi} g(x)\,dx$ using $N$ uniform subintervals. Then the error in the trapezoid rule approximation is

\[ E_N = \int_{-\pi}^{\pi} g(x)\,dx - T_N\{g\} = -\frac{\pi h^2}{6}\, g''(\xi), \]

where $-\pi < \xi < \pi$.

The most significant message in this result is that the error in the trapezoid rule decreases as $h^2$ (equivalently as $N^{-2}$), meaning that if $h$ is halved (or $N$ is doubled) we can expect a roughly fourfold decrease in the error. We will observe this pattern in many examples. This theorem will now be used to estimate the error in trapezoid rule approximations to the Fourier coefficients of $f$. On the face of it, this theorem seems to say that if $f \in C^2[-\pi, \pi]$, then the error should decrease as $N^{-2}$ as the number of grid points $N$ increases. We might suspect already that this result does not tell the whole story, since we know that there are situations in which the error in the DFT decreases more rapidly than this; indeed, there are instances in which the DFT is exact. Let's investigate a little further. In our particular case, $g(x) = f(x)e^{-ikx}$, and it follows that

\[ g''(x) = \left[ f''(x) - 2ik f'(x) - k^2 f(x) \right] e^{-ikx}. \]

Since the exact location of the mystery point $\xi$ is not known, it is difficult to evaluate $g''(\xi)$. It is customary at this point to settle for an error bound which takes the form

\[ |E_N| \le \frac{\pi h^2}{6} \max_{-\pi \le x \le \pi} \left| f''(x) - 2ik f'(x) - k^2 f(x) \right|. \]

In practice, the Fourier coefficients $c_k$ would be approximated for $k = -N/2+1 : N/2$. Notice that for values of $k$ near $\pm N/2$ this bound decreases very slowly with respect to $N$. In fact, for $k = N/2$ the bound is essentially independent of $N$, since then $k^2 h^2 = \pi^2$. This is not consistent with observations that were made in Chapter 6, in which we found uniform bounds for the error for all $k = -N/2+1 : N/2$. The conclusion is that the standard trapezoid rule error theorem, when applied to the DFT, can be misleading. As we will soon see, it does not account for the possible continuity of higher derivatives of $f$; equally important, it does not reflect the periodicity of $f$ and its derivatives which is often present in the calculation of Fourier coefficients.

In order to forge ahead with this question, we need a more powerful tool, and fortunately it exists in the form of the celebrated Euler-Maclaurin¹ Summation Formula. This remarkable result can be regarded, among other things, as a generalization of the trapezoid rule error theorem given above. In a form best suited to our purposes, it can be stated in the following theorem.

THEOREM 9.2. EULER-MACLAURIN SUMMATION FORMULA. Let $g \in C^{2p+2}[-\pi, \pi]$ for $p \ge 1$, and let $T_N\{g\}$ be the trapezoid rule approximation to $\int_{-\pi}^{\pi} g(x)\,dx$ with $N$ uniform subintervals. Then the error in this approximation is

\[ E_N = \int_{-\pi}^{\pi} g(x)\,dx - T_N\{g\} = -\sum_{m=1}^{p} \frac{h^{2m} B_{2m}}{(2m)!}\left[ g^{(2m-1)}(\pi) - g^{(2m-1)}(-\pi) \right] - 2\pi h^{2p+2}\, \frac{B_{2p+2}}{(2p+2)!}\, g^{(2p+2)}(\xi), \]

where $-\pi < \xi < \pi$, and $B_n$ is the $n$th Bernoulli number (to be discussed shortly).

A proof of this theorem is not central to our present purposes, and a variety of proofs are easily found [22], [45], [166]. First let's interpret the result and then put it to use. We see that the theorem gives a representation for $E_N$, the error in the trapezoid rule approximation on the interval $[-\pi, \pi]$. In contrast to Theorem 9.1, the Euler-Maclaurin Formula expresses the error as a sum of terms plus one final term that involves the "last" continuous derivative of the integrand. Each successive term reflects additional smoothness of $g$ and can be regarded as another correction to the error given in Theorem 9.1. We see that this error expansion accounts for the continuity of the higher derivatives of $g$ as well as the endpoint values of $g$ and its derivatives. The constants $B_{2m}$ that appear are the Bernoulli² numbers, a few of which appear in Table 9.1. Note that although the terms of this sequence initially decrease in magnitude, they ultimately increase in magnitude very swiftly, a fact of impending importance (see problem 231).

¹ COLIN MACLAURIN (1698-1746) became a professor in Aberdeen, Scotland at the age of nineteen and succeeded James Gregory at the University of Edinburgh eight years later. He met Newton in 1719 and published the Theory of Fluxions elaborating Newton's calculus. The special case of Taylor's series known today as the Maclaurin series was never claimed by Maclaurin, who quite properly cited Brook Taylor and James Stirling for its discovery. Maclaurin and Euler independently published the Euler-Maclaurin Formula in about 1737 by generalizing a previous result of Gregory.
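The behavior of the Bernoulli numbers can be seen by generating a few of them exactly. This sketch uses the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \ge 1$ with $B_0 = 1$ (so that $B_1 = -1/2$); the recurrence is a well-known identity rather than something taken from this chapter, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m_max):
    """Return [B_0, ..., B_{m_max}] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        # Solve sum_{j=0}^{m} C(m+1, j) B_j = 0 for B_m; note C(m+1, m) = m + 1.
        s = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B

B = bernoulli_numbers(20)
for m in range(2, 21, 2):
    print(m, B[m], float(B[m]))
```

The even-indexed values shrink at first (1/6, -1/30, 1/42, ...) and then grow rapidly in magnitude, while the odd-indexed values beyond $B_1$ are zero.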

It is instructive to consider some specific cases of the Euler-Maclaurin errorexpansion, and then generalizations will follow easily. Assume for the moment thatwe wish to approximate the Fourier coefficients of a function / G C4[—7r,7r] using thetrapezoid rule (DFT). Once again, let the full integrand be g(x) — f(x)e~lkx, andconsider the case in which /(—TT) ^ /(TT). This condition will generally imply thatP'l"71") T^ ff'M (problem 221). As a consequence, the Euler-Maclaurin formula withp = 1 tells us that

This is essentially the result given by Theorem 9.1, since by the mean value theorem(/(TT) — g'(—ir) = 27rg"(£). Therefore, in this particular case, we expect the errors inthe trapezoid rule to decrease as N~2. Let's see how well thi« rrror estimate works.

Example: No endpoint agreement. Consider the problem of approximating the Fourier coefficients $c_k$ of the function $f(x) = \sin((3x - \pi)/4)$ on $[-\pi, \pi]$. Although $f$ is infinitely differentiable, $f(-\pi) \neq f(\pi)$, and we can expect Theorem 9.1 to apply to the trapezoid rule (DFT) errors. Table 9.2 shows the errors in the approximations to $c_2$, $c_8$, and $c_{16}$ for several values of $N$. In all three cases, both the real and imaginary parts of the errors decrease almost precisely by a factor of four when $N$ is doubled, once $N$ is sufficiently large. The error can decrease more slowly when $k \approx N/2$; therefore, the coefficients $c_8$ and $c_{16}$ require larger values of $N$ before the $N^{-2}$ behavior is observed. This behavior should be seen in light of the discussion in Chapter 6: as we saw there, the errors in approximating the Fourier coefficients of a function with different endpoint values decrease as $N^{-1}$ for all $k = -N/2 + 1 : N/2$.

2 JAKOB BERNOULLI (1654-1705) was one member (together with his brother Johann and nephew Daniel) of a family of prolific Swiss mathematicians. Jakob occupied the chair at the University of Basel from 1687 until his death and did fundamental work in the calculus of variations, combinatorics, and probability.

TABLE 9.1. A few Bernoulli numbers.

TABLE 9.2. Errors in trapezoid rule (DFT) approximations to Fourier coefficients $c_2$, $c_8$, and $c_{16}$ of $f(x) = \sin((3x - \pi)/4)$.

  N      E_N for c_2             E_N for c_8              E_N for c_16
  8      7.0(-3) + 1.7(-2)i      -2.1(-1) + 2.0(-2)i      -2.1(-1) + 1.0(-2)i
  16     1.6(-3) + 4.1(-3)i      2.8(-3) + 2.0(-2)i       -2.1(-1) + 1.0(-2)i
  32     3.9(-4) + 1.0(-3)i      4.4(-4) + 4.3(-3)i       6.9(-4) + 1.0(-2)i
  64     9.6(-5) + 2.6(-4)i      1.0(-4) + 1.0(-3)i       1.0(-4) + 2.1(-3)i
  128    2.4(-5) + 6.4(-5)i      2.4(-5) + 2.6(-4)i       2.5(-5) + 5.1(-4)i
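The factor-of-four reduction can be reproduced in a few lines. The following is a minimal sketch, not the authors' code: it assumes the normalization $c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-ikx}\,dx$, uses a closed form for $c_k$ obtained by writing the sine as complex exponentials, and reports only error magnitudes, since sign conventions for $E_N$ may differ from those of the table. All function names are illustrative.

```python
import numpy as np

A, B = 0.75, -np.pi / 4   # f(x) = sin(A*x + B) = sin((3x - pi)/4)

def f(x):
    return np.sin(A * x + B)

def exact_coeff(k):
    """Closed form of c_k = (1/2pi) * int_{-pi}^{pi} f(x) exp(-ikx) dx."""
    t1 = np.exp(1j * B) * np.sin((A - k) * np.pi) / (A - k)
    t2 = np.exp(-1j * B) * np.sin((A + k) * np.pi) / (A + k)
    return (t1 - t2) / (2j * np.pi)

def trapezoid_coeff(func, k, n):
    """Trapezoid rule with n subintervals on [-pi, pi], normalized by 2*pi."""
    x = np.linspace(-np.pi, np.pi, n + 1)
    g = func(x) * np.exp(-1j * k * x)
    h = 2 * np.pi / n
    return h * (g[0] / 2 + g[1:-1].sum() + g[-1] / 2) / (2 * np.pi)

def errors(k, n_values=(8, 16, 32, 64, 128)):
    return [trapezoid_coeff(f, k, n) - exact_coeff(k) for n in n_values]

for k in (2, 8, 16):
    e = errors(k)
    # Ratio of successive error magnitudes; it should approach 4 as N grows.
    ratios = [abs(e[i] / e[i + 1]) for i in range(len(e) - 1)]
    print(k, ["%.1e" % abs(v) for v in e], ["%.1f" % r for r in ratios])
```

For $k = 8$ and $k = 16$ the first one or two ratios are far from 4, because the coefficient is not yet resolved on the coarsest grids.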


Now let's assume that $f \in C^4[-\pi, \pi]$ has the additional properties $f(-\pi) = f(\pi)$ and $f'(-\pi) = f'(\pi)$. It then follows (problem 221) that $g'(-\pi) = g'(\pi)$ and the first term in the Euler-Maclaurin error expression (9.1) vanishes. This leaves us with an error

At first glance we seem to have an error term that decreases as $N^{-4}$ with increasing $N$. Therefore, the Euler-Maclaurin formula predicts a faster convergence rate than Theorem 9.1 because the smoothness and endpoint properties of $f$ have been recognized. And this faster convergence rate can be observed, provided that the frequency $k$ is small. However, recalling that $g(x) = f(x)e^{-ikx}$, the derivative $g^{(4)}$ has a term proportional to $k^4$. This means that for frequencies with $k$ approaching $\pm N/2$, the error may decrease very slowly with $N$, and for the $k = N/2$ coefficient, the error may be insensitive to increases in $N$. These predictions should be compared to the error results that were derived using the Poisson Summation Formula. For the particular case at hand, in which $f^{(p)}(-\pi) = f^{(p)}(\pi)$ for $p = 0, 1$, Theorem 6.3 for periodic, non-band-limited functions asserts that the error in the DFT decreases as $N^{-3}$ for all $k = -N/2 : N/2 - 1$. Therefore, the Euler-Maclaurin error bound predicts a faster rate of convergence as $N$ increases, provided $k \ll N/2$. For $k \approx \pm N/2$, the error bound given in Chapter 6 is stronger. A short numerical experiment will be useful.

Example: Endpoint agreement of $f$ and $f'$. Consider the function $f(x) = (x - 1)(\pi^2 - x^2)^2$, which has the property $f(-\pi) = f(\pi)$ and $f'(-\pi) = f'(\pi)$. Table 9.3 shows the trapezoid rule (DFT) errors in approximating $c_2$ and $c_8$ on $[-\pi, \pi]$. To highlight the rates at which the errors decrease, the "Factor" columns give the amount by which the real and imaginary parts of the error are reduced from their values in the previous row.

The pattern in the error reduction as $N$ increases is quite striking. For the approximations to $c_2$, the errors decrease by a factor of 16 almost immediately each time $N$ is doubled. For the higher frequency coefficient $c_8$, larger values of $N$ are needed before that same reduction rate is observed, all of which confirms the error bounds of Theorem 9.2.

Clearly, this pattern can be extended to higher orders of smoothness ($p > 1$). In general, if $g \in C^{2p+2}[-\pi, \pi]$ and $g$ and its odd derivatives through order $2p - 1$ have equal values at the endpoints, then the error in using the trapezoid rule with $N$ subintervals on $[-\pi, \pi]$ is given by


TABLE 9.3. Errors in trapezoid rule (DFT) approximations to Fourier coefficients $c_2$ and $c_8$ of $f(x) = (x - 1)(\pi^2 - x^2)^2$.

  N      E_N for k = 2           Factor      E_N for k = 8           Factor
  8      2.2(-2) - 2.9(-1)i                  6.0(-3) - 1.5(-1)i
  16     9.2(-4) - 1.6(-2)i      2.3, 18     6.0(-3) - 1.5(-1)i      8.5, 1.0
  32     5.1(-5) - 1.0(-3)i      18, 16      8.6(-5) - 4.8(-3)i      7.0, 3.1
  64     3.1(-6) - 6.1(-5)i      16, 16      3.6(-6) - 2.6(-4)i      24, 18
  128    1.9(-7) - 3.8(-6)i      16, 16      2.0(-7) - 1.5(-5)i      18, 17


where $-\pi < \xi < \pi$. For the problem of computing Fourier coefficients ($g(x) = f(x)e^{-ikx}$), the Euler-Maclaurin formula suggests that the error decreases as $N^{-2p-2}$ as $N$ increases. This rate of error reduction will generally be observed for the low frequency coefficients in which the term $k^{2p+2}$, arising from $g^{(2p+2)}$, does not compete with $N^{2p+2}$. This estimate is a stronger bound than that given by Theorem 6.3. However, for the high frequency coefficients ($k$ near $\pm N/2$), the Euler-Maclaurin bound allows the possibility that the error could decrease very slowly with increasing $N$, and the error bound of Theorem 6.3 could become stronger. The way in which the two error results, one based on the Poisson Summation Formula and the other based on the Euler-Maclaurin Summation Formula, complement each other is summarized in Table 9.4. In all cases, the function $f$ is assumed to have as many continuous derivatives on $[-\pi, \pi]$ as needed. The Fourier coefficients $c_k$ for $k = -N/2 + 1 : N/2$ are assumed to be approximated by an $N$-point DFT (trapezoid rule). The table shows how the $k$ and $N$ dependence of the error bound, as predicted by the two theories, varies with the behavior of $f$ at the endpoints.

The Euler-Maclaurin error expansion also exposes an interesting and often-observed difference between approximating sine and cosine coefficients. Assume that $f \in C^4[-\pi, \pi]$ and that $f(-\pi) = f(\pi)$. Then the Euler-Maclaurin formula (9.1) predicts that the error in approximating the sine coefficients is

where $-\pi < \xi < \pi$, and where $g(x) = f(x)\sin(kx)$. It is easily shown (problem 221) that the difference $g'(\pi) - g'(-\pi)$ vanishes for the sine coefficient, but not for the cosine coefficient. This means that one could expect errors in the approximations to the sine coefficients to decrease as $N^{-4}$, while the usual $N^{-2}$ convergence rate still governs the cosine coefficients. Of course, this rule extends to higher orders of smoothness, as shown in Table 9.4. This effect is easily observed.

TABLE 9.4. Comparison of error bounds: Poisson summation (Theorem 6.3) versus Euler-Maclaurin summation (Theorem 9.2).

Columns: Endpoint properties; Poisson Summation Formula (Theorem 6.3); Euler-Maclaurin Formula (Theorem 9.2). Rows are given separately for cosine coefficients and for sine coefficients.


Example: Improved sine coefficients. We will approximate the Fourier coefficients $c_2$ and $c_8$ of $f(x) = (x - 1)(x^2 - \pi^2)$ on the interval $[-\pi, \pi]$ with $N$-point DFTs. This function satisfies $f(-\pi) = f(\pi)$. The results tabulated in Table 9.5 show the real and imaginary parts of the errors together with the factor by which they are reduced from the previous approximation. With a few anomalies along the way, the real part of the error (corresponding to the cosine coefficients) decreases by a factor of roughly four each time that $N$ is doubled, according to the $N^{-2}$ dependence. The imaginary part of the errors (corresponding to the sine coefficients) decreases 16-fold when $N$ is doubled, reflecting an $N^{-4}$ dependence. In both cases, $k$ is sufficiently small that the predicted dependence on $N$ is actually observed for relatively small values of $N$. As expected, the approximations to the higher frequency coefficient are slightly less accurate.


TABLE 9.5. Errors in trapezoid rule (DFT) approximations to Fourier coefficients $c_2$ and $c_8$ of $f(x) = (x - 1)(x^2 - \pi^2)$.

  N      E_N for k = 2             Factor      E_N for k = 8             Factor
  8      -1.2(-1) - 2.3(-2)i                   6.5(0) - 1.2(-2)i
  16     -2.7(-2) - 1.2(-3)i       4.4, 19     -4.6(-2) - 1.2(-2)i       14, 1.0
  32     -6.4(-3) - 7.5(-5)i       4.2, 16     -7.3(-3) - 3.6(-4)i       6.3, 33
  64     -1.6(-3) - 4.7(-6)i       4.0, 16     -1.7(-3) - 1.9(-5)i       4.3, 19
  128    -4.0(-4) - 2.9(-7)i       4.0, 16     -4.0(-4) - 1.1(-6)i       4.2, 17
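The different behavior of the real (cosine) and imaginary (sine) parts of the error can be checked directly. The following is a minimal sketch, not the authors' code: it assumes the normalization $c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-ikx}\,dx$, for which integration by parts gives $c_k = -2(-1)^k/k^2 - 6i(-1)^k/k^3$ for integer $k \neq 0$. Signs of the imaginary parts may differ from the table depending on the convention used there; the reduction factors do not.

```python
import numpy as np

def f(x):
    return (x - 1) * (x**2 - np.pi**2)

def exact_coeff(k):
    """Closed form of c_k for integer k != 0 (assumed normalization 1/(2*pi))."""
    s = (-1) ** k
    return -2 * s / k**2 - 6j * s / k**3

def trapezoid_coeff(func, k, n):
    """Trapezoid rule with n subintervals on [-pi, pi], normalized by 2*pi."""
    x = np.linspace(-np.pi, np.pi, n + 1)
    g = func(x) * np.exp(-1j * k * x)
    h = 2 * np.pi / n
    return h * (g[0] / 2 + g[1:-1].sum() + g[-1] / 2) / (2 * np.pi)

def error(k, n):
    return trapezoid_coeff(f, k, n) - exact_coeff(k)

for k in (2, 8):
    prev = None
    for n in (8, 16, 32, 64, 128):
        e = error(k, n)
        line = "k=%d N=%3d  re=% .1e  im=% .1e" % (k, n, e.real, e.imag)
        if prev is not None:
            # Reduction factors of the real and imaginary parts from the previous row.
            line += "  factors: %.1f, %.1f" % (prev.real / e.real, prev.imag / e.imag)
        print(line)
        prev = e
```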

The Euler-Maclaurin formula leads to some intriguing and subtle questions when we consider functions that are infinitely differentiable ($C^\infty[-\pi, \pi]$) with all derivatives $2\pi$-periodic. One might be tempted to argue that in this case, all of the terms of the error expansion vanish and the trapezoid rule (DFT) should be exact for computing the Fourier coefficients of such functions. One need look no further than the following simple example [166] to see that there is a fallacy lurking. Surely $f(x) = \cos 4x$ is infinitely differentiable and all of its derivatives are $2\pi$-periodic. Yet, in computing its Fourier coefficient $c_0$ exactly and by a trapezoid rule with $N = 4$ points, we see that

while

Clearly, the trapezoid rule is not exact. You are quite right if you claim that aliasing is the explanation for this error and argue that the error vanishes provided that $N$ is large enough to resolve all of the modes of the integrand. Indeed, $T_N\{\cos 4x\} = 0$, provided that $N \geq 5$. But how does this failure of the trapezoid rule show up in the Euler-Maclaurin error bound, which seems to predict that the trapezoid rule should be exact? The answer is subtle and worth investigating.

Suppose that $g \in C^\infty[-\pi, \pi]$, and all of its derivatives are $2\pi$-periodic. Let's look at the trapezoid rule approximations to $\int_{-\pi}^{\pi} g(x)\,dx$. Since all derivatives of $g$ have equal values at $\pm\pi$, the Euler-Maclaurin error expansion says that


for arbitrarily large values of $p$, where $-\pi < \xi_p < \pi$. The critical question in determining whether or not the trapezoid rule is exact becomes the question of how this expression for $E_N$ behaves as $p \to \infty$ (for fixed values of $N$). We will need some facts about the asymptotic behavior of the Bernoulli numbers $B_p$ as $p \to \infty$. It is known [3] that

where we write $A \sim B$ as $p \to \infty$ if the ratio $A/B$ approaches 1 as $p \to \infty$. Using the first of these relations in (9.2), we find that

Of course, this relationship is not too useful without an estimate of how $g^{(p)}$ behaves as $p$ increases. Therefore, let us consider the example above, in which $g(x) = \cos kx$ and $k$ is a positive integer. It follows that $|g^{(p)}(x)| \leq k^p$, and we can use (9.3) to deduce that

From this we may conclude that if $|k| < N$ then $\lim_{p \to \infty} |E_N| = 0$ and the trapezoid rule is exact. In the above example with $k = N = 4$, $\lim_{p \to \infty} |E_N| \neq 0$, and we expect the trapezoid rule to be in error. This result does not preclude the possibility that the trapezoid rule may be exact for some values of $k > N$. In fact (problem 222), if $N$ is fixed, then $T_N\{\cos kx\}$ is exact as long as $k$ is not a multiple of $N$. For example, $T_4\{\cos 7x\} = 0$, which is exact.

This may appear to contradict what was learned in Chapter 6 about the DFT for periodic band-limited functions. But notice that the integral in question can be viewed as the Fourier coefficient $c_k$ for $f(x) = 1$. Chapter 6 taught us that the $N$-point DFT is exact in computing $c_k$, provided that $|k| < N/2$, and indeed it is in this case.
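The statement that $T_N\{\cos kx\}$ fails only when $k$ is a multiple of $N$ is simple to test. The following is a minimal sketch in which the integral is left unnormalized (an assumption; dividing by $2\pi$ gives the corresponding Fourier coefficient), so the exact value is 0 for every nonzero integer $k$. The function name `trapezoid_cos` is illustrative.

```python
import numpy as np

def trapezoid_cos(k, n):
    """Trapezoid rule with n subintervals for the integral of cos(kx) over [-pi, pi]."""
    x = np.linspace(-np.pi, np.pi, n + 1)
    y = np.cos(k * x)
    h = 2 * np.pi / n
    return h * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

# With N = 4 the rule is wrong for k = 4, 8, 12 (multiples of N) and exact otherwise.
for k in range(1, 13):
    print(k, round(trapezoid_cos(k, 4), 12))
```

For $k$ a multiple of 4 every sample of $\cos kx$ equals 1, so the rule returns $2\pi$ instead of 0.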

Example: Periodic $C^\infty$ functions. The plot thickens when we consider $2\pi$-periodic, $C^\infty$ functions other than simple sines and cosines. Davis and Rabinowitz [45] suggest the function $g(x) = (1 + \sigma \sin mx)^{-1}$, which belongs to $C^\infty[-\pi, \pi]$ and has $2\pi$-periodic derivatives of all orders when $m$ is an integer (Figure 9.2). It can be shown that if $|\sigma| < 1$, then

independent of $m$. (This result is most easily obtained using complex variables and the theory of residues.)

In order to form an error bound from the Euler-Maclaurin expansion, we must estimate $g^{(p)}$, particularly for large values of $p$. This is a rather cumbersome calculation (although $g^{(p)}$ appears not to be bounded as $p \to \infty$), so we turn directly to the numerical evidence as given in Table 9.6.


FIG. 9.2. The function $g(x) = (1 + \sigma \sin mx)^{-1}$ is shown using $\sigma = .5$, with $m = 3$ (left) and $m = 5$ (right). It is infinitely differentiable and all of its derivatives are $2\pi$-periodic. The trapezoid rule applied to $\int_{-\pi}^{\pi} g(x)\,dx$ is extremely accurate for even moderate values of $N$.

TABLE 9.6. Errors in trapezoid rule approximations to $\int_{-\pi}^{\pi} (1 + \sigma \sin mx)^{-1}\,dx$.

            N = 8       N = 16      N = 32      N = 64      N = 128
  m = 3     3.9(-4)     1.0(-8)     0           0           0
  m = 5     3.9(-4)     1.0(-8)     0           0           0
  m = 8     -9.7(-1)    -9.7(-1)    7.5(-2)     3.9(-4)     1.0(-8)
  m = 17    3.9(-4)     1.0(-8)     0           0           0
  m = 25    3.9(-4)     1.0(-8)     0           0           0
  m = 40    -9.7(-1)    -9.7(-1)    7.5(-2)     3.8(-4)     1.0(-8)

$a(-n)$ means $a \times 10^{-n}$, while 0 means the error is $< 10^{-10}$.

There are patterns and irregularities in these errors that are rather mysterious. Most importantly, note that for many values of $m$ the trapezoid rule is exact (error denoted 0) even for small values of $N$. Surely this reflects the fact that derivatives of $g$ of all orders are periodic. However, two choices of $m$ (and undoubtedly there are others) show relatively slow decay of errors as $N$ increases. This fact requires further investigation, and we cannot offer a full explanation at this time.
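The entries of Table 9.6 can be regenerated with a short experiment. The following is a minimal sketch under two assumptions: $\sigma = 0.5$ (the value used in Figure 9.2), and the exact value of the integral taken as $2\pi/\sqrt{1 - \sigma^2}$, the standard closed form for $|\sigma| < 1$. The error is reported as approximation minus exact value; the function name `trapezoid_error` is illustrative.

```python
import numpy as np

def trapezoid_error(m, n, sigma=0.5):
    """Trapezoid error (approximation minus exact) for the integral of
    1 / (1 + sigma * sin(m x)) over [-pi, pi] with n subintervals."""
    # The integrand is 2*pi-periodic, so the two endpoint half-weights
    # combine into one full-weight sample at x = -pi.
    x = -np.pi + 2 * np.pi * np.arange(n) / n
    g = 1.0 / (1.0 + sigma * np.sin(m * x))
    approx = (2 * np.pi / n) * g.sum()
    exact = 2 * np.pi / np.sqrt(1 - sigma**2)
    return approx - exact

for m in (3, 5, 8, 17, 25, 40):
    row = ["% .1e" % trapezoid_error(m, n) for n in (8, 16, 32, 64, 128)]
    print("m = %2d:" % m, "  ".join(row))
```

For $m = 8$ with $N = 8$, every grid point is a zero of $\sin mx$, so the rule sees the constant function 1 and returns $2\pi$.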

As the two previous examples indicate, the trapezoid rule can behave in unexpected ways for $C^\infty$ functions: occasionally, it is more accurate than the theory predicts, which is always a welcome occurrence. It would be nice to say something conclusive about when the trapezoid rule is exact, but a general statement is elusive. This is partly due to the presence of the intermediate point $\xi$ in the error term of the Euler-Maclaurin expansion, which can always conspire to make the error much smaller than the bounds suggest. Nevertheless, here is a modest statement about when the trapezoid rule is exact for $C^\infty$ functions.

THEOREM 9.3. ZERO ERROR IN THE TRAPEZOID RULE. Assume $f \in C^\infty[-\pi, \pi]$ and that $f$ and all of its derivatives are $2\pi$-periodic. Let $|f^{(p)}(x)| \leq \alpha^p$ on $[-\pi, \pi]$ for some $\alpha > 0$ and for all $p > 0$. Then the trapezoid rule (DFT) is exact when used to approximate the Fourier coefficient

provided that $\alpha + |k| < N$.

This theorem gives sufficient conditions for the trapezoid rule to be exact, but the conditions appear to be rather weak. As the above examples illustrate, the trapezoid rule can be unexpectedly accurate or virtually exact in situations that do not meet the conditions of this theorem. The proof of the theorem collects several of the observations already made in the chapter, and is left as an exercise (problem 225).

Before parting with the relationship between the trapezoid rule and the DFT, we will consider one final matter. It merits attention because it pertains to the persistent issue of proper treatment of endpoints. The endpoint question appears in a slightly different form as a technique known as subtracting the linear trend. This idea, attributed to Lanczos [1], [91], is developed fully in problems 219 and 220; we will sketch the essentials here. Assume that the Fourier coefficients of $f$ are to be approximated on $[-\pi, \pi]$, and that although $f$ is continuous on $[-\pi, \pi]$, it does not satisfy $f(-\pi) = f(\pi)$. Then the DFT will see a discontinuity in the input, and by the error results of Chapter 6, we expect the error to decrease rather slowly as $N^{-1}$. Alternatively, by the Euler-Maclaurin error bound we expect errors that decrease as $(k/N)^2$. We now define the linear function

which satisfies $\ell(-\pi) = f(-\pi)$ and $\ell(\pi) = f(\pi)$. It follows that the auxiliary function $\phi(x) = f(x) - \ell(x)$ has the property that $\phi(-\pi) = \phi(\pi) = 0$, and hence the DFT applied to $\phi$ should have errors that are smaller than when the DFT is applied directly to $f$. Furthermore, the Fourier coefficients of $\ell$ can be determined exactly, which allows the DFT of $f$ to be recovered. This technique results in approximations that converge more rapidly to the Fourier coefficients of $f$ as $N$ increases. Let's see how well it works.

Example: Subtracting the linear trend. We return to the function $f(x) = \sin((3x - \pi)/4)$ considered in an earlier example (see Table 9.2). This function has the property that $f(-\pi) \neq f(\pi)$, and as we saw, the DFT approximations converge to the Fourier coefficients $c_k$ at a rate proportional to $(k/N)^2$. As shown in Figure 9.3, if the function $\ell(x) = (x + \pi)/2\pi$ is subtracted from $f$, the resulting function $\phi = f - \ell$ satisfies $\phi(-\pi) = \phi(\pi) = 0$. Table 9.7 shows the errors that result when the DFT with the linear trend subtracted is used to approximate three Fourier coefficients of $f$.

TABLE 9.7. Errors in approximations to Fourier coefficients $c_2$, $c_8$, and $c_{16}$ of $f(x) = \sin((3x - \pi)/4)$ with linear trend subtracted.

  N      E_N for c_2             E_N for c_8              E_N for c_16
  8      7.0(-3) - 3.5(-4)i      -2.1(-1) - 1.8(-4)i      -2.1(-1) - 2.2(-5)i
  16     1.6(-3) - 1.9(-5)i      2.8(-3) - 1.8(-4)i       -2.1(-1) - 2.2(-5)i
  32     3.9(-4) - 1.1(-6)i      4.4(-4) - 5.4(-6)i       6.9(-4) - 2.2(-5)i
  64     9.6(-5) - 7.0(-8)i      1.0(-4) - 2.9(-7)i       1.0(-4) - 6.8(-7)i
  128    2.4(-5) - 4.3(-9)i      2.4(-5) - 1.8(-8)i       2.5(-5) - 3.6(-8)i
  RF     4.0, 16                 4.2, 16                  4.0, 19

RF = Reduction Factor = ratio of errors in last two rows.


FIG. 9.3. The top figure is a graph of the function $f(x) = \sin((3x - \pi)/4)$ on the interval $[-\pi, \pi]$. If the function $\ell(x) = (x + \pi)/2\pi$ (dotted line in top figure) is subtracted from $f$, the resulting function $\phi$ satisfies $\phi(-\pi) = \phi(\pi) = 0$ (bottom figure). The DFT converges more quickly as $N$ increases when applied to $\phi$, rather than $f$, and produces better approximations to the sine coefficients.

This table should be compared to Table 9.2, in which the DFT errors without subtracting the linear trend are recorded. The approximations to the real part (cosine coefficients) are identical in both tables; in other words, subtracting the linear trend offers no improvement in the cosine coefficients, and the errors in these approximations decrease as $(k/N)^2$, as indicated by the reduction factor. However, the errors in the approximations to the sine coefficients (the imaginary parts of the errors) are improved significantly when the linear trend is subtracted, showing a convergence rate proportional to $(k/N)^4$. We also see the typically slower convergence for the larger values of $k$. The explanation for this occurrence is pursued in problem 221.
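The procedure can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: it assumes the normalization $c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-ikx}\,dx$, for which the linear function $\ell(x) = (x + \pi)/2\pi$ has exact coefficients $c_0 = 1/2$ and $c_k = i(-1)^k/(2\pi k)$ for $k \neq 0$. The function names are illustrative.

```python
import numpy as np

A, B = 0.75, -np.pi / 4   # f(x) = sin(A*x + B) = sin((3x - pi)/4)

def f(x):
    return np.sin(A * x + B)

def ell(x):
    """Linear trend with ell(-pi) = f(-pi) = 0 and ell(pi) = f(pi) = 1."""
    return (x + np.pi) / (2 * np.pi)

def ell_coeff(k):
    """Exact Fourier coefficient of ell."""
    return 0.5 if k == 0 else 1j * (-1) ** k / (2 * np.pi * k)

def exact_coeff(k):
    """Closed form of c_k for f, from writing the sine as complex exponentials."""
    t1 = np.exp(1j * B) * np.sin((A - k) * np.pi) / (A - k)
    t2 = np.exp(-1j * B) * np.sin((A + k) * np.pi) / (A + k)
    return (t1 - t2) / (2j * np.pi)

def trapezoid_coeff(func, k, n):
    x = np.linspace(-np.pi, np.pi, n + 1)
    g = func(x) * np.exp(-1j * k * x)
    h = 2 * np.pi / n
    return h * (g[0] / 2 + g[1:-1].sum() + g[-1] / 2) / (2 * np.pi)

def detrended_coeff(k, n):
    """Apply the rule to phi = f - ell, then add back the exact coefficient of ell."""
    return trapezoid_coeff(lambda x: f(x) - ell(x), k, n) + ell_coeff(k)

for n in (8, 16, 32, 64, 128):
    plain = trapezoid_coeff(f, 2, n) - exact_coeff(2)
    fixed = detrended_coeff(2, n) - exact_coeff(2)
    print("N=%3d  plain im=% .1e  detrended re=% .1e im=% .1e"
          % (n, plain.imag, fixed.real, fixed.imag))
```

The real parts of the two errors agree, while the imaginary part of the detrended error falls by a factor close to 16 per doubling of $N$.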

9.3. Higher-Order Quadrature Rules

Up until now we have focused on the trapezoid rule because of its virtual identity with the DFT. However, a few words should be offered about the use of other quadrature rules to approximate Fourier integrals. As mentioned earlier, Filon's rule [45], [5 dates from 1928 and is designed for Fourier integrals. The method is most easily described as it applies to the integrals

The interval $[-\pi, \pi]$ is divided into $2N$ subintervals of equal width $h$, and over each pair of subintervals (taking groups of three points) the function $f$ is approximated by a quadratic interpolating polynomial. The products of these quadratics with $\sin kx$


and cos kx can be integrated analytically to give a rule of the form

and

The constants $\alpha, \beta, \gamma$ are easily computed, and the terms $S_N$, $S'_N$, $C_N$, and $C'_N$ are simple sums, with $N \pm 1$ terms, involving $f(x)\sin kx$ and $f(x)\cos kx$ evaluated at the grid points. The method is easy to implement and has an error proportional to $h^4$ provided $kh \ll 1$. It can also be generalized to higher-order interpolating polynomials [56]. Not surprisingly, the method performs best when $f$ is smooth and when $k$ is small. In practice, $N \approx 10k$ (ten points per period) seems to give good results, although larger values of $N$ must be used if $f$ is also oscillatory. However, there is an interesting connection between Filon's rule and the DFT. The sums $S_N$, $S'_N$, $C_N$, and $C'_N$ are essentially trapezoid rule sums on either the even or odd points. Therefore, if a full set of $N$ Fourier coefficients is needed, these sums (which comprise most of the computational effort) can be done using the DFT, which, in practice, means the FFT. As we will show momentarily, Filon's rule gives approximations to Fourier integrals which are generally as accurate as DFT approximations. Thus, when used in conjunction with the FFT, it presents a very efficient method.

The other direction in which one might look for more accurate approximations to Fourier integrals is toward even higher-order quadrature rules. In fact, there is a temptingly simple way in which higher-order quadrature rules might be married to the DFT (and the FFT). Suppose that a particular quadrature rule has the form

where the $x_n$'s are uniformly spaced points on $[-\pi, \pi]$, $h$ is the grid spacing, and the $\alpha_n$'s are known weights. To approximate the Fourier coefficients of a function $f$ using this rule, we have that

It appears possible to obtain more accurate (higher-order) approximations simply by applying the DFT to a slightly modified input sequence ($\alpha_n f_n$ rather than $f_n$), at virtually no extra cost (problem 224). Indeed, this approach has been used in practice with many well-known quadrature rules including Filon's rule. When this maneuver works, it provides an effective method for computing Fourier coefficients. However, it must be used with some care: as is well known, some families of quadrature rules (for example, Newton³-Cotes⁴ rules) actually become less accurate as the order increases.
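The idea of feeding a weighted sequence to the FFT can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the grid is $x_j = -\pi + jh$ with $h = 2\pi/N$ and $N$ even, the coefficients are normalized as $c_k \approx \frac{h}{2\pi}\sum_j \alpha_j f(x_j) e^{-ikx_j}$, and the weights shown are the trapezoid and composite Simpson weights. Because $e^{-ikx_j} = (-1)^k e^{-2\pi i jk/N}$, the two endpoint samples share one exponential and are added together before the transform. The helper names are illustrative; `numpy.fft.fft` is the only library routine used.

```python
import numpy as np

def trapezoid_weights(n):
    a = np.ones(n + 1)
    a[0] = a[-1] = 0.5
    return a

def simpson_weights(n):
    """Composite Simpson weights 1/3, 4/3, 2/3, ..., 4/3, 1/3 (n must be even)."""
    a = np.ones(n + 1)
    a[1:-1:2] = 4.0 / 3.0
    a[2:-1:2] = 2.0 / 3.0
    a[0] = a[-1] = 1.0 / 3.0
    return a

def coeffs_via_fft(f, weights, n):
    """Approximate c_k for k = 0, ..., n-1 (indices taken modulo n) with one FFT."""
    x = np.linspace(-np.pi, np.pi, n + 1)
    w = weights(n) * f(x)          # modified input sequence alpha_j * f_j
    seq = w[:-1].copy()
    seq[0] += w[-1]                # x_0 and x_n carry the same exponential factor
    k = np.arange(n)
    return (-1.0) ** k * np.fft.fft(seq) / n

f = lambda x: x**2 - np.pi**2      # test function with exact c_k = 2(-1)^k / k^2, k != 0
n = 64
c_trap = coeffs_via_fft(f, trapezoid_weights, n)
c_simp = coeffs_via_fft(f, simpson_weights, n)
print("trapezoid error for c_2:", abs(c_trap[2] - 0.5))
print("Simpson   error for c_2:", abs(c_simp[2] - 0.5))
```

Only low-frequency entries of the output are meaningful approximations; entries with $k$ near $N/2$ suffer from the aliasing discussed throughout the chapter.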

3 ISAAC NEWTON was born in Lincolnshire, England in 1642. At the age of 18, he attended Cambridge University, where he met the well-known professor of Greek and mathematics Isaac Barrow. Newton conceived of his method of fluxions, which led to the calculus in 1665, while working on the quadrature of curves. The Philosophiae Naturalis Principia Mathematica, which contains his calculus and the theory of gravitation, appeared in 1687. Newton was 85 years old when he died.

4 ROGER COTES was born in 1682 and, as a professor of astronomy and natural philosophy


We will illustrate some general conclusions about higher-order methods with the well-known Simpson's⁵ rule. There are several ways to derive Simpson's rule; the most common approach replaces the integrand $f$ by quadratic interpolating polynomials on pairs of subintervals. However, there is another way to arrive at Simpson's rule, and it is so useful that it merits a quick glimpse. We begin with the trapezoid rule and assume that it is applied to the integral

with a grid spacing of $h = 2\pi/N$. We will assume that the function $g$ has several continuous derivatives, but for the moment no special endpoint conditions. As we have seen, the error in this trapezoid rule approximation can be expressed as

where the constants $c_i$ are known from the Euler-Maclaurin formula. In other words, the error has an expansion in even powers of $h$ that can be continued to $h^{2p}$ provided $g^{(2p)}$ is continuous on $[-\pi, \pi]$.

The goal is to obtain a better approximation: one whose error is proportional not to $h^2$, but to a higher power of $h$. Toward this end, imagine that a second trapezoid rule approximation to $I$ is computed with $2N$ subintervals and a grid spacing of $h/2$. We could then write that

The question is whether these two trapezoid rule approximations (and their error expansions) can be combined to give a higher-order method. Indeed they can: if the first trapezoid rule (9.4) is subtracted from four times the second (9.5), we find that

We see that a simple combination of the two trapezoid rules, which we have denoted $S_{2N}$, differs from the exact value of the integral $I$ by an amount that depends upon $h^4$. The goal has been accomplished: we have a new higher-order rule called $S_{2N}$, and it can be obtained with virtually no work once the two trapezoid rule approximations have been computed. It may not be obvious, but the new rule $S_{2N}$ is just Simpson's rule based on $2N$ uniform subintervals (problem 223). This is the first step of a process called Richardson⁶ extrapolation [117] or simply extrapolation.

But why stop here? We see that the error in the rule $S_{2N}$ also has an expansion in powers of $h$, assuming that $g$ is sufficiently differentiable. Therefore, another

at Cambridge, assisted Newton with the second edition of his Principia. Cotes created many computational tables and died at the age of 33.

5 THOMAS SIMPSON (1710-1761) was a self-taught English mathematician who is also known for discoveries in plane geometry and trigonometry. There is evidence suggesting that Simpson's rule was known to Cavalieri in 1639 and to Gregory in about 1670.

6 LEWIS FRY RICHARDSON (1881-1953) was an eccentric and visionary English meteorologist who conceived of numerical weather forecasting in the 1920s. He also carried out experimental work (measuring coastlines) that led to the formulation of fractal geometry.


extrapolation step may be done to eliminate the leading $h^4$ error term. The result is a method that involves two Simpson's rule approximations with a leading error term proportional to $h^6$. We will not carry this entire procedure to its conclusion (problem 223) except to say that the repeated use of extrapolation leads to a systematic quadrature method called Romberg integration [22], [80], [120]. Perhaps even more important is the fact that the idea of extrapolation needn't be confined to the trapezoid rule. It can be used with other quadrature rules (for example, Filon's rule [56]), it can be used to improve approximations to derivatives, and it finds use in the numerical solution of differential equations.
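Repeated extrapolation is compact to express in code. The following is a minimal sketch of the usual Romberg table, not the authors' implementation: column 0 holds trapezoid values with $N_0, 2N_0, 4N_0, \ldots$ subintervals, and column $j$ is formed from column $j - 1$ by $(4^j R_{i,j-1} - R_{i-1,j-1})/(4^j - 1)$. As a test integrand it uses $g(x) = (x^2 - \pi^2)e^{-ikx}/2\pi$ with $k = 2$, whose integral over $[-\pi, \pi]$ is $2(-1)^k/k^2 = 0.5$ under the assumed normalization. Function names are illustrative.

```python
import numpy as np

def trapezoid(g, a, b, n):
    x = np.linspace(a, b, n + 1)
    y = g(x)
    h = (b - a) / n
    return h * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

def simpson(g, a, b, n):
    """Composite Simpson's rule with n (even) subintervals, for comparison."""
    x = np.linspace(a, b, n + 1)
    y = g(x)
    h = (b - a) / n
    return h / 3 * (y[0] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum() + y[-1])

def romberg_table(g, a, b, n0, levels):
    """Row i starts from the trapezoid rule with n0 * 2**i subintervals."""
    R = [[trapezoid(g, a, b, n0 * 2**i)] for i in range(levels)]
    for i in range(1, levels):
        for j in range(1, i + 1):
            # Eliminate the leading h^(2j) error term of column j - 1.
            R[i].append((4**j * R[i][j - 1] - R[i - 1][j - 1]) / (4**j - 1))
    return R

k = 2
g = lambda x: (x**2 - np.pi**2) * np.exp(-1j * k * x) / (2 * np.pi)
R = romberg_table(g, -np.pi, np.pi, 16, 6)   # N = 16, 32, ..., 512
for i, row in enumerate(R):
    print(16 * 2**i, ["%.1e" % abs(v - 0.5) for v in row])
```

The first extrapolated column coincides with Simpson's rule on the finer grid, which is the identity stated in the text.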

But we have diverged a bit. What does this say about DFT approximations to Fourier coefficients? There are two situations to discuss. Letting $g(x) = f(x)e^{-ikx}$, we can apply the above remarks to the computation of the Fourier coefficients of $f$ on $[-\pi, \pi]$. If $f$ has $2p + 2$ continuous derivatives on $[-\pi, \pi]$, but no special endpoint properties, then extrapolation may be carried out ($p$ times in fact) to provide improved approximations over the original trapezoid rule (DFT) approximations. Practical experience suggests that the work required to do one or two extrapolations is justified by the improvement in accuracy. Often, lack of smoothness of $f$ obviates the value of further extrapolations. A simple example demonstrates the effectiveness of extrapolation.

Example: Extrapolation in quadrature. The function $f(x) = x^2 - \pi^2$ is real and even, therefore its Fourier coefficients are also real and even. Since $f(-\pi) = f(\pi)$, but $f'(-\pi) \neq f'(\pi)$, we expect the errors in the DFT approximations to the Fourier coefficients $c_k$ to decrease as $N^{-2}$ for $k \ll N/2$. Therefore, extrapolation should offer some improvements at little additional cost. Table 9.8 shows the errors in the approximation to one of these coefficients.

TABLE 9.8. Errors in approximations to a Fourier coefficient of $f(x) = x^2 - \pi^2$ by extrapolation from the DFT.

  N      E_N for T_N    E_N for          E_N for           E_N for
                        first extrap.    second extrap.    third extrap.
  16     9.3(-1)
  32     1.1(-1)        -1.6(-1)
  64     1.2(-2)        -2.0(-2)         -1.0(-2)
  128    2.6(-3)        -5.9(-4)         -7.2(-4)          9.0(-4)
  256    6.4(-4)        -3.2(-5)         -5.6(-6)          -5.7(-7)
  512    1.6(-4)        -1.9(-6)         -7.6(-8)          -1.2(-8)
  RF     4              16               73                475

RF = Reduction Factor = ratio of errors in last two rows.

The errors in the trapezoid rule (DFT) approximations (second column) decrease by a factor of four each time $N$ is doubled, abiding by the expected $N^{-2}$ convergence rate. The errors in the first extrapolation (Simpson's rule in the third column) show a strict reduction by a factor of 16 each time $N$ is doubled, conforming to the $N^{-4}$ convergence rate. The second and third extrapolations show reductions of factors exceeding 64 and 256 each time $N$ is doubled, according to the expected $N^{-6}$ and $N^{-8}$ convergence rates. The RF row gives the actual factor by which the error is reduced between $N = 256$ and $N = 512$. This is a convincing demonstration of the remarkable effectiveness of extrapolation applied to quadrature.


The other situation to consider is that in which $f$ does have some special periodicity or endpoint properties. We will now assume that $f$ and its first $2p + 2$ derivatives are $2\pi$-periodic and continuous. This means that $g(x) = f(x)e^{-ikx}$ also has this degree of smoothness and periodicity. The following conclusions also apply to nonperiodic functions whose odd derivatives happen to agree at the endpoints (problem 230). If the trapezoid rule (DFT) with $N$ subintervals is used to approximate the Fourier coefficients of $f$, then as we have seen from the Euler-Maclaurin Summation Formula, the error in the approximation is given by

where $-\pi < \xi < \pi$, and $c_p$ is a known constant given by the Euler-Maclaurin remainder term. Suppose we now wish to obtain a better approximation by extrapolation (which is equivalent to using a higher-order method). The error in the DFT no longer has an expansion in powers of $h$, since the periodicity of $f$ and its derivatives has reduced this expansion to a single term. Without an error expansion, extrapolation cannot be done, or more precisely, extrapolation will not lead to more accurate approximations. Of course, Simpson's rule or extrapolation may still be applied, but in general, we should expect no improvement over the "superconvergent" approximations given by the trapezoid rule (DFT). The trapezoid rule applied to periodic integrands is exceptionally accurate, and the resulting approximations have as much accuracy as the smoothness of the integrand allows.

One last numerical experiment will allow us to demonstrate many of the foregoing remarks. We close this section with a case study of a family of functions that has been used often to demonstrate the performance of the DFT as a quadrature rule [1], [45].

Example: A comparison of all methods. We will consider the problem of approximating the Fourier coefficients $c_k$ of the function $f(x) = x\cos k_0 x$ on the interval $[0, 2\pi]$ where $k_0$ is an integer. The exact values of the coefficients $c_k$ can be determined for any integers $k$ and $k_0$; note that neither the real nor the imaginary part of $c_k$ vanishes identically. With these exact values of the coefficients, the errors in various approximations can be computed easily. Specifically, we will investigate the performance of the trapezoid rule (DFT) with and without the linear trend subtracted (denoted $\mathcal{D}^*_N$ and $\mathcal{D}_N$, respectively), Simpson's rule (denoted $S_N$), and Filon's rule (denoted $F_N$) for various values of $k$, $k_0$, and $N$. The errors in these approximations are given in Table 9.9. When errors are less than $10^{-15}$, as is the case with many of the approximations to the cosine coefficient, the error is given a value of zero.

Many of the conclusions of this section are demonstrated quite convincingly in this table. As mentioned, the first three methods produce very accurate approximations to the cosine coefficients

A quick calculation (problem 226) reveals that the integrand $g(x) = x\cos(k_0 x)\cos(kx)$, while not periodic, satisfies the condition that $g^{(2p-1)}(0) = g^{(2p-1)}(2\pi)$ for $p \geq 1$ provided $k$ and $k_0$ are integers. The Euler-Maclaurin formula suggests that the trapezoid rule may be extraordinarily accurate in this case, and indeed it is virtually exact. We see that $\mathcal{D}^*_N$ and $S_N$ also inherit this superaccuracy, but there is no reason to incur the extra expense of these rules. Depending on the choice of $k$ and $k_0$, $F_N$ can also be extremely accurate.
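The near exactness of the trapezoid rule for the cosine integrand is easy to confirm. The following is a minimal sketch in which the integral is left unnormalized (an assumption, since the normalization of $c_k$ on $[0, 2\pi]$ is not restated here). For positive integers the exact value of $\int_0^{2\pi} x\cos(k_0 x)\cos(kx)\,dx$ is $\pi^2$ when $k = k_0$ and 0 otherwise. Function names are illustrative.

```python
import numpy as np

def trapezoid_cosine_integral(k0, k, n):
    """Trapezoid rule with n subintervals for x*cos(k0 x)*cos(k x) on [0, 2*pi]."""
    x = np.linspace(0.0, 2 * np.pi, n + 1)
    g = x * np.cos(k0 * x) * np.cos(k * x)
    h = 2 * np.pi / n
    return h * (g[0] / 2 + g[1:-1].sum() + g[-1] / 2)

def exact_cosine_integral(k0, k):
    """Exact value for positive integers k0 and k."""
    return np.pi**2 if k0 == k else 0.0

for k0, k in ((1, 1), (1, 20), (20, 1), (20, 20)):
    err = trapezoid_cosine_integral(k0, k, 32) - exact_cosine_integral(k0, k)
    print("k0=%2d k=%2d  error with N=32: %.1e" % (k0, k, err))
```

All four errors are at the level of rounding, even though the integrand is not periodic.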


TABLE 9.9. Errors in approximations to Fourier coefficients of $f(x) = x\cos k_0 x$; $\mathcal{D}^*_N$ and $\mathcal{D}_N$ = DFT with and without linear trends subtracted, respectively; $S_N$ = Simpson's rule, while $F_N$ = Filon's rule.

$k_0 = 1$, $k = 1$
  N      Error in D_N    Error in D*_N    Error in S_N    Error in F_N
  32     2.0(-2)i        3.9(-5)i         -2.1(-4)i       2.4(-4) - 2.7(-4)i
  64     5.0(-3)i        2.4(-6)i         -1.3(-5)i       1.5(-5) - 1.7(-5)i
  128    1.3(-3)i        1.5(-7)i         -8.1(-7)i       9.5(-7) - 1.0(-6)i
  256    3.2(-4)i        9.5(-9)i         -5.1(-8)i       6.0(-8) - 6.6(-8)i
  512    7.9(-5)i        5.9(-10)i        -3.2(-9)i       3.7(-9) - 4.1(-9)i

$k_0 = 1$, $k = 20$
  N      Error in D_N    Error in D*_N    Error in S_N    Error in F_N
  32     5.7(-1)i        3.7(-3)i         1.1(0)i         7.5(-5)i
  64     1.1(-1)i        6.7(-5)i         -4.7(-2)i       9.6(-7)i
  128    2.6(-2)i        3.3(-6)i         -1.8(-3)i       2.0(-8)i
  256    6.3(-3)i        1.9(-7)i         -1.0(-4)i       -1.8(-0)i
  512    1.6(-3)i        1.2(-8)i         -6.4(-6)i       -1.2(-10)i

$k_0 = 20$, $k = 1$
  N      Error in D_N    Error in D*_N    Error in S_N    Error in F_N
  32     5.6(-2)i        3.5(-2)i         -9.1(-2)i       8.9(-2)i
  64     6.2(-3)i        1.1(-3)i         -1.0(-2)i       1.0(-2)i
  128    1.3(-3)i        6.3(-5)i         -3.0(-4)i       3.0(-4)i
  256    3.2(-4)i        3.8(-6)i         -1.6(-5)i       1.6(-5)i
  512    7.9(-5)i        2.4(-7)i         -9.6(-7)i       9.6(-7)i

$k_0 = 20$, $k = 20$
  N      Error in D_N    Error in D*_N    Error in S_N    Error in F_N
  32     -2.3(-1)i       8.1(-1)i         -3.3(-1)i       8.3(0) + 3.0(-1)i
  64     1.4(-1)i        3.4(-2)i         2.6(-1)i        1.6(0) - 3.3(-2)i
  128    2.7(-2)i        1.4(-3)i         -1.1(-2)i       1.4(-1) + 9.3(-2)i
  256    6.4(-3)i        7.8(-5)i         -4.6(-4)i       9.3(-3) + 5.4(-4)i
  512    1.6(-3)i        4.8(-6)i         -2.6(-5)i       5.9(-4) + 3.3(-5)i

No error shown means the error is $< 10^{-15}$, and $a(-n)$ means $a \times 10^{-n}$.

Turning to the sine coefficients

we see the errors in the DFT approximations (indicated by $\mathcal{D}_N$) decreasing at the predictable rate of $N^{-2}$ as $N$ is increased. When the linear trend is subtracted (indicated by $\mathcal{D}_N^*$), the accuracy of the approximations is improved momentously, exceeding that of the Simpson's rule approximations ($S_N$). The errors in both $\mathcal{D}_N^*$ and $S_N$ decrease as $N^{-4}$ once $N$ is sufficiently large relative to $k$ and $k_0$. Filon's rule performs extremely well in all cases. The errors in all of its approximations decrease strictly in an $N^{-4}$ fashion except in the two cases in which the method is exact in approximating cosine coefficients. With all of the methods, less accurate approximations are obtained (for the same values of $N$) for both higher frequency coefficients ($k$ large) and for more oscillatory integrands ($k_0$ large). It is remarkable that overall the DFT with the linear trend subtracted ($\mathcal{D}_N^*$) is comparable, if not superior, to the higher-order and more expensive methods.

It would be nice to conclude with some sweeping statements about the best quadrature method for computing Fourier integrals. Unfortunately, such a statement most likely does not exist in any generality, in part because of the capricious nature of the error terms. We have seen that, for the problem of approximating a set of $N$ Fourier coefficients of a function $f$, the DFT can be identified with the trapezoid rule. If $f$ has both periodicity and smoothness of some of its derivatives, then the DFT gives approximations that are extremely accurate. In the absence of special endpoint conditions, the techniques of subtracting the linear trend and extrapolation can be used to improve DFT approximations. Since the DFT is generally implemented via the FFT, any methods based on the trapezoid rule will always be very efficient. Numerical evidence confirms that Filon's rule, particularly with extrapolation, also provides accurate approximations; when joined with the FFT it also becomes a competitive method.
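The identification of the DFT with the trapezoid rule, and the role of the endpoint conditions, can be checked numerically. The following is only a sketch under stated assumptions: the interval $[-\pi, \pi]$, the normalization $c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-ikx}\,dx$, and the two test functions $f(x) = x$ and $f(x) = e^{\cos x}$ are chosen here for illustration and are not the examples tabulated above. The function name `trapezoid_coeff` is hypothetical.

```python
import cmath
import math


def trapezoid_coeff(f, k, N):
    """Trapezoid-rule approximation (N panels) to
    c_k = (1 / (2 pi)) * integral over [-pi, pi] of f(x) exp(-i k x) dx."""
    h = 2 * math.pi / N
    # The two endpoint samples carry weight 1/2 in the trapezoid rule.
    total = 0.5 * (f(-math.pi) * cmath.exp(1j * k * math.pi)
                   + f(math.pi) * cmath.exp(-1j * k * math.pi))
    for n in range(1, N):
        x = -math.pi + n * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / (2 * math.pi)


def ramp(x):
    """Non-periodic test function: f(pi) != f(-pi)."""
    return x


def smooth_periodic(x):
    """Test function that is periodic together with all its derivatives."""
    return math.exp(math.cos(x))


# The exact coefficient c_1 of the ramp is -i (integrate by parts).
err_64 = abs(trapezoid_coeff(ramp, 1, 64) - (-1j))
err_128 = abs(trapezoid_coeff(ramp, 1, 128) - (-1j))
ratio = err_64 / err_128

# For the smooth periodic function, doubling N barely changes the result.
change = abs(trapezoid_coeff(smooth_periodic, 1, 16)
             - trapezoid_coeff(smooth_periodic, 1, 32))
```

Under these assumptions, `ratio` should be close to 4, which is the $N^{-2}$ behavior expected when $f(\pi) \neq f(-\pi)$, while `change` should be at the level of rounding error.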

9.4. Problems

218. Finite sums from the Euler-Maclaurin Summation Formula. The first two terms ($p = 1$) of the Euler-Maclaurin Summation Formula for an arbitrary finite interval $[a, b]$ are given by

for $a < \xi < b$, where $h = (b - a)/N$ and $g$ is at least four times continuously differentiable. Use this expression with $a = 0$, $b = N$, and special choices of $g$ to evaluate the following finite sums.

Find a closed form expression for

219. Subtracting the linear trend. Carry out the full derivation of the method of subtracting out the linear trend.

(a) Let $f$ be a continuous function on $[-\pi, \pi]$, define

and let $\phi(x) = f(x) - \ell(x)$. Also let $f_n$ and $\phi_n$ be samples of $f$ and $\phi$ at the $N$ equally spaced grid points of $[-\pi, \pi]$. Then, for $k = -N/2 + 1 : N/2$, show that approximations to the Fourier coefficients $c_k$ are given by

n

for $k \neq 0$ and

for k = 0.


(b) Argue that since $\phi$ is at least continuous on $[-\pi, \pi]$ and $\phi(-\pi) = \phi(\pi)$, the error in the DFT of $\phi_n$ should decrease at least as rapidly as the error in the DFT of $f_n$.

(c) Why does this technique improve approximations to the sine coefficients (imaginary part of $c_k$), but not to the cosine coefficients?

220. Subtracting the linear trend calculation. Carry out the strategy of the previous problem to approximate the Fourier coefficients $c_k$ of the function

Compute approximations to the same coefficients using the DFT without the linear trend technique with several increasing values of $N$. Comment on the improvements that you observe in the approximations to both the cosine coefficients ($\mathrm{Re}\{c_k\}$) and the sine coefficients ($\mathrm{Im}\{c_k\}$).

221. Effect of endpoint conditions. Assume that $f$ has as many continuous derivatives as necessary. Investigate the effect of various endpoint conditions on the error in the trapezoid rule approximations to the Fourier coefficients of $f$ on $[-\pi, \pi]$. Use the Euler-Maclaurin error term and let $g(x) = f(x)e^{-ikx}$.

(a) Show that if $f(\pi) \neq f(-\pi)$ then in general $g'(\pi) \neq g'(-\pi)$ and the errors in the trapezoid rule approximations decrease as $N^{-2}$ with increasing $N$, provided $k \ll N$.

(b) Verify the claim made in the text that if $f(\pi) = f(-\pi)$ then the error in the DFT approximations to the sine coefficients of $f$ decreases as $N^{-4}$ while the error in the approximations to the cosine coefficients decreases as $N^{-2}$, provided $k \ll N$.

(c) Show that if $f(\pi) = f(-\pi)$ and $f'(\pi) = f'(-\pi)$, then the errors in the trapezoid rule approximations decrease as $N^{-4}$ with increasing $N$, provided $k \ll N$.

222. Trapezoid rule and harmonics. Consider using the trapezoid rule (DFT) to approximate the following integrals that involve trigonometric functions on $[-\pi, \pi]$.

(a) Verify that the $N$-point trapezoid rule approximation to $\int_{-\pi}^{\pi} \cos kx \, dx$ is exact, provided that $|k| < N$. Assuming that $N$ is fixed, is it exact for any values of $|k| > N$?

(b) Verify that for approximations to $\int_{-\pi}^{\pi} \cos kx \, dx$, the error term

approaches zero as $p \to \infty$, provided $|k| < N$. Use the properties of $B_p$ given in the text.

(c) Suppose that the trapezoid rule (DFT) is used to approximate the Fourier cosine coefficients of the function $f(x) = \cos k_0 x$ on the interval $[-\pi, \pi]$ where $k_0 \neq 0$ is an integer:


For fixed values of $k_0$ and $N$, for what values of $k$ is the trapezoid rule exact?

223. Simpson's rule by extrapolation. Using $N = 4$, show that Simpson's rule using $2N$ points is given by

where $T_{2N}$ and $T_N$ are the trapezoid rules using $2N$ and $N$ points, respectively. Show that if another extrapolation is performed using $S_N$ and $S_{2N}$, then another quadrature rule,

is obtained with an error proportional to $h^6$.

224. Simpson's rule directly. As it is usually applied directly, Simpson's rule using $2N$ points is given by

where $h = 2\pi/N$ and the uniformly spaced grid points are given by $x_n = nh$ for $n = -N : N$.

(a) Describe how Simpson's rule could be implemented directly (not through extrapolation) on the integrand $g(x) = f(x)e^{-ikx}$ by expressing it in terms of DFTs of modified input sequences.

(b) Assume that the DFT is implemented via the FFT. For the sake of simplicity assume that the FFT of a sequence of length $N$ costs $N \log N$ arithmetic operations (the base of the logarithm is not important here). Compare the cost of implementing Simpson's rule with $2N$ points (i) directly as described in part (a), and (ii) by computing the trapezoid rule approximations $T_N$ and $T_{2N}$ and using extrapolation. Assume that FFTs are used for all of the quadrature rules.

225. Zero error DFT. Work out the details of the proof of Theorem 9.3, stating that if $f \in C^\infty[-\pi, \pi]$ and has $2\pi$-periodic derivatives with $|f^{(p)}(x)| < \alpha^p$, then the trapezoid rule (DFT) is exact in computing the Fourier coefficients $c_k$ provided that $\alpha < |k| < N$. (Suggestion: Start with the general error expression (9.3) and compute $g^{(p)}(x)$ where $g(x) = f(x)e^{-ikx}$. Use the geometric series to bound the sum that arises in the expression for $g^{(p)}$. Note the conditions on $N$, $k$, $\alpha$ needed for $\lim_{p \to \infty} |E_N| = 0$.)

226. Endpoint conditions. Verify that the function

satisfies $g^{(2p-1)}(2\pi) - g^{(2p-1)}(0) = 0$ when $k$ and $k_0$ are integers, and thus the leading terms of the Euler-Maclaurin error expansion vanish. (Suggestion: Use


Then compute

227. Comparing theories. Consider the function $f(x) = (\pi^2 - x^2)^2$ on the interval $[-\pi, \pi]$. The DFT is to be used to approximate the Fourier coefficients of $f$. Using the theory of Chapter 6 (based on the Poisson Summation Formula) and the theory of this chapter (based on the Euler-Maclaurin Summation Formula), how do you predict that the error in these approximations should decrease as $N$ is increased?

228. What does it cost? Assume that for a particular integral, the error in the trapezoid rule, Filon's rule, and Simpson's rule decreases as $N^{-2}$, $N^{-4}$, and $N^{-4}$, respectively (which is the general behavior of these methods).

(a) By what factor do you need to increase the number of grid points in each method to achieve a reduction in the error by a factor of 64?

(b) Neglecting "lower-order amounts of work," assume that each of the three methods above can be implemented on $N$ points with a single FFT that costs $N \log N$ arithmetic operations. What is the increase in computational effort that is needed to achieve a 64-fold reduction in the error in each method?

229. Is $T_N$ exact? The functions $f(x) = e^{\pm \sin x}$ and $f(x) = e^{\pm \cos x}$ are infinitely differentiable, and they and their derivatives are $2\pi$-periodic. Furthermore,

for all four functions! Investigate the performance of the trapezoid rule in approximating the value of $I$. Does the theory predict that the trapezoid rule is exact? Does the numerical evidence support the theory? How well does the trapezoid rule (DFT) perform in computing the Fourier coefficients of $f$?

230. More vanishing derivatives. The function f ( x ) = ex (l~x) , while not periodic, has the property that its odd derivatives vanish at $x = 0$ and $x = 1$. Does the theory predict that the trapezoid rule is exact in approximating $\int_0^1 f(x)\,dx$? Carry out the numerical experiments for $N = 2^m$, $m = 3, \ldots, 10$ and compare the numerical results to the theory. Other functions with the property that their odd derivatives vanish at the endpoints are [105]

where $a > 0$, $k \in \mathbb{Z}$, all with respect to the interval $[0, 2\pi]$. Investigate the performance of the trapezoid rule with these integrands.

231. Bernoulli numbers. Show that the Bernoulli numbers are the coefficients in the Taylor series for the function

about $x = 0$. Confirm that the series converges for $|x| < 2\pi$.

Chapter 10

The Fast Fourier Transform

10.1 Introduction

10.2 Splitting Methods

10.3 Index Expansions

10.4 Matrix Factorizations

10.5 Prime Factor and Convolution Methods

10.6 FFT Performance

10.7 Notes

10.8 Problems

Truly that method greatly reduces the tediousness of mechanical calculations; success will teach the one who tries it.
- Carl Friedrich Gauss, ca. 1805


10.1. Introduction

Until now we have explored the uses and applications of the DFT assuming that its actual calculation could always be carried out in a handy black box. If nothing else, that black box can contain the explicit definition of the DFT in terms of a matrix-vector product. However, we have occasionally hinted that there are better ways to do it. Therefore, in this chapter we will look inside of that particular black box known as the fast Fourier transform (FFT). For the initiated, this treatment will seem rather cursory; for those who view the FFT with awe and wonder (as we all should), the chapter may provide some guidance for further study; and for those who would prefer to leave the FFT inside of the black box, you have reached the end of this book!

This chapter should be regarded as a high altitude reconnaissance of a broad and complex landscape. One quickly realizes that the FFT is not a single algorithm, but a large and still proliferating family of methods, all designed to compute the various forms of the DFT with remarkable efficiency.

One reason that the FFT literature is both rich and occasionally impermeable is that there are several different frameworks in which FFTs may be developed and presented. Some FFTs arise quite naturally in one setting, but are difficult to formulate in another. Other FFTs have clearly equivalent expressions in several frameworks. As we all know, many attempts to understand the FFT have ended in a maze of signal flow diagrams or in a cloud of subscripted subscripts. However, we are also fortunate that the subject abounds with excellent presentations, in many mathematical languages and with many different perspectives. It would be foolish to duplicate and impossible to improve upon these existing treatments. We mention specifically the superb books of Brigham [20], [21], which for 20 years have surely offered revelations about the FFT to more people than any other source.

Let's begin with some history. The FFT was unveiled by a pair of IBM researchers named John Tukey and James Cooley in 1965. One might be tempted to think that their four-page paper "An Algorithm for the Machine Calculation of Complex Fourier Series" [42] was the first and last word on the subject, but it was neither. It certainly was not the last word: in the years since 1965, the original Cooley-Tukey method has been seized by practitioners from many different fields, refined for countless applications, and modified for a variety of computer architectures. The size of the "industry" spawned by that single paper is reflected by the fact that it has been cited 2047 times between 1965 and 1991 in scientific journals from Psychophysiology to Econometrics to X-Ray Spectroscopy. There can be little doubt that the idea presented by Cooley and Tukey in 1965 changed the face of signal processing, geophysics, statistics, computational mathematics, and the world around us forever [35].

On the other hand, the 1965 paper was not the first word on the subject either. In that publication, Cooley and Tukey cited the 1958 paper by I. Good [68], which is generally regarded as the origin of the prime factor algorithm (PFA) FFT. However, there is a much longer FFT lineage, not known to Cooley and Tukey in 1965, that was revealed shortly thereafter by Rudnick [121] and Cooley, Lewis, and Welch [40]. Those studies uncovered the crux of the FFT in a 1942 paper by Danielson and Lanczos [44], which itself was inspired by work of C. D. T. Runge$^1$ in 1903 and 1905 [122]. Those interested in tracing this thread further into the past must read the fascinating investigation by Heideman, Johnson, and Burrus [74], in which the authors document several nineteenth century methods for the efficient computation of Fourier coefficients. Of these methods, only the one proposed by Gauss in 1805 [61] (published posthumously in 1866) is a genuine FFT. Ironically, references to Gauss' method have appeared at least twice in the literature in the past 100 years: once in 1904 in H. Burkhardt's encyclopedia of mathematics [25] and once again in 1977 in H. H. Goldstine's history of numerical analysis [67].

$^1$Born in Bremen, Germany in 1856, CARL DAVID TOLMÉ RUNGE studied mathematics in Munich and Berlin. His interests turned to physics, and his most notable work was in the field of spectroscopy. He was a professor at Göttingen until he died in 1927.

The goal of this chapter is to describe several different avenues along which the FFT may be approached. The various FFT paths we will travel in this chapter are

• splitting methods,

• index expansions (one $\to$ multi-dimensional transforms),

• matrix factorizations,

• prime factor and convolution methods.

The chapter will conclude with some remarks about related issues such as FFT performance, computing symmetric DFTs, and implementation of FFTs on advanced architectures. There is a lot to say, and we have given ourselves little space, so let's begin.

10.2. Splitting Methods

If you had ten minutes and a single page to explain how the FFT works, the splitting method would be the best approach to use. Let us recall that the goal is to compute the DFT of a (possibly complex) sequence $x_n$ which has length $N$ and is assumed to have a period of $N$. As we have noted many times, there are several commonly used forms of the DFT. The FFT can be developed for any of these forms, but a specific choice must be made for the sake of exposition. For the duration of this chapter we will use the definition in which both indices $n$ and $k$ belong to the set $\{0, \ldots, N-1\}$. We will use the definition

$$X_k = \sum_{n=0}^{N-1} x_n \omega_N^{-nk} \tag{10.1}$$

for $k = 0 : N-1$, where $\omega_N = e^{i2\pi/N}$. We will dispense with the scaling factor of $1/N$, which can always be included at the end of the calculation if it is needed at all.

Both history and pedagogy agree that in developing the FFT, it is best to begin with the case $N = 2^M$, where $M$ is a natural number; this is called the radix-2 case. If we now split $x_n$ into its even and odd subsequences by letting $y_n = x_{2n}$ and $z_n = x_{2n+1}$, and then substitute these subsequences into the definition (10.1), we find that

$$X_k = \sum_{n=0}^{N/2-1} y_n \omega_N^{-2nk} + \omega_N^{-k} \sum_{n=0}^{N/2-1} z_n \omega_N^{-2nk} \tag{10.2}$$

for $k = 0 : N-1$. We now use a simple but crucial symmetry property of the complex exponential, which might be identified as the linchpin of the whole FFT enterprise: it is that

$$\omega_N^{2} = \omega_{N/2},$$

or equivalently

$$\omega_N^{-2nk} = \omega_{N/2}^{-nk}.$$

(More generally, we can always replace $\omega_N^{pq}$ by $\omega_{N/p}^{q}$, where $p$ and $q$ are real numbers (see problem 232).) Using this property, expression (10.2) can be written as

$$X_k = \sum_{n=0}^{N/2-1} y_n \omega_{N/2}^{-nk} + \omega_N^{-k} \sum_{n=0}^{N/2-1} z_n \omega_{N/2}^{-nk}.$$

And now, if we stand back, we see that the original DFT has been expressed as a simple combination of the DFT of the sequence $y_n$ and the DFT of the sequence $z_n$, both of length $N/2$. Keeping with our convention, we will call these half-length DFTs $Y_k$ and $Z_k$, respectively. By letting $k = 0 : N/2 - 1$, we can write

$$X_k = Y_k + \omega_N^{-k} Z_k. \tag{10.3}$$

We now note that $\omega_N^{-N/2} = -1$, and that the sequences $Y_k$ and $Z_k$ have a period of $N/2$, to conclude that

$$X_{k+N/2} = Y_k - \omega_N^{-k} Z_k \tag{10.4}$$

for $k = 0 : N/2 - 1$, which accounts for all $N$ DFT coefficients $X_k$. Expressions (10.3) and (10.4) are often called combine formulas or butterfly relations. They give a recipe for combining two DFTs of length $N/2$ (corresponding to the even and odd subsequences of the original sequence) to form the DFT of the original sequence.
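A single application of the splitting idea can be checked directly. The sketch below assumes the unscaled DFT convention $X_k = \sum_n x_n \omega_N^{-nk}$ with $\omega_N = e^{i2\pi/N}$; the function names `dft` and `dft_one_split` are hypothetical, and the two half-length DFTs are deliberately computed from the definition rather than by any fast method.

```python
import cmath


def dft(x):
    """Direct evaluation of X_k = sum_n x_n * w^(-n k), with w = exp(2 pi i / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def dft_one_split(x):
    """DFT of an even-length sequence built from two direct DFTs of half the length."""
    N = len(x)
    if N % 2:
        raise ValueError("length must be even")
    Y = dft(x[0::2])   # DFT of the even-indexed subsequence y_n = x_{2n}
    Z = dft(x[1::2])   # DFT of the odd-indexed subsequence z_n = x_{2n+1}
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * Z[k]   # w_N^(-k) * Z_k
        X[k] = Y[k] + t              # first half of the coefficients
        X[k + N // 2] = Y[k] - t     # second half, using w_N^(-N/2) = -1
    return X


x = [complex(n * n - 3 * n, n % 3) for n in range(8)]
difference = max(abs(a - b) for a, b in zip(dft(x), dft_one_split(x)))
```

The quantity `difference` measures the disagreement between the direct DFT and the one-split version and should be at the level of rounding error.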

It is worthwhile to pause and note the savings that have already been achieved. Computing the sequence $X_k$ explicitly from its definition (10.1) requires approximately $N^2$ complex multiplications and $N^2$ complex additions. On the other hand, if the splitting method is used and if the sequences $Y_k$ and $Z_k$ are computed as matrix-vector products, then the DFT requires approximately $2(N/2)^2 = N^2/2$ multiplications and $N^2/2$ additions to compute $Y_k$ and $Z_k$ plus an extra $N/2$ multiplications and $N$ additions for the butterfly relations. We see that a DFT computed by one splitting of the input sequence requires roughly $N^2/2$ multiplications and $N^2/2$ additions, which is a factor-of-two savings. It is also worth noting that this single application of the splitting method is Procedure 2 of Chapter 4.

However, we are not finished. We just assumed that the DFT sequences $Y_k$ and $Z_k$ would be computed as matrix-vector products; of course, they needn't be. A full FFT algorithm results when the splitting idea is applied to the computation of $Y_k$ and $Z_k$. In a divide-and-conquer spirit, it is then repeated on their subsequences, and so on, for $M = \log_2 N$ steps. Eventually, the original problem of computing a DFT of length $N$ has been replaced by the problem of computing $N$ DFTs of length 1. At this point there are really no DFTs left to be done, since the DFT of a sequence of length 1 is itself.
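The divide-and-conquer recursion just described can be written in a few lines. This is a minimal recursive sketch, assuming $N$ is a power of two and the unscaled convention $X_k = \sum_n x_n \omega_N^{-nk}$; the name `fft_dit` is hypothetical, and the recursion allocates new lists at every level, so it is not the in-place organization used in practice.

```python
import cmath


def fft_dit(x):
    """Radix-2 splitting applied recursively until only length-1 DFTs remain."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]       # the DFT of a length-1 sequence is itself
    Y = fft_dit(x[0::2])             # DFT of the even-indexed subsequence
    Z = fft_dit(x[1::2])             # DFT of the odd-indexed subsequence
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * Z[k]
        X[k] = Y[k] + t
        X[k + N // 2] = Y[k] - t
    return X


# A single complex exponential of frequency 3 on 8 points.
mode = [cmath.exp(2j * cmath.pi * 3 * n / 8) for n in range(8)]
spectrum = fft_dit(mode)
```

With this input, `spectrum` should vanish (to rounding error) except at index 3, where it should equal 8, because the DFT is left unscaled.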

The procedure we have just described is shown schematically in Figure 10.1 and is one of the fundamental FFT algorithms. It is often described as decimation-in-time, since it involves the splitting of the input (or time) sequence. This FFT is very similar to the one described by Cooley and Tukey in their 1965 paper (although it was derived differently there). The method consists of two stages:

• a reordering stage (often called bit-reversal), in which the input sequence is successively split into even and odd subsequences; and

• a combine stage, in which the butterfly relations (10.3) and (10.4) are used to combine sequences of length 1 into sequences of length 2, then sequences of length 2 into sequences of length 4, and so on, until the final transform sequence $X_k$ is formed from two sequences of length $N/2$.

[Figure 10.1: two panels, "Reordering Phase" and "Combine Phase".]

FIG. 10.1. The Cooley-Tukey FFT, shown here for a sequence of length $N = 8$, takes place in two phases. In the reordering phase, the input sequence $x_n$ is successively split in an odd/even fashion until $N$ sequences of length 1 remain (shown inside the solid boxes). The combine phase begins with $N$ trivial DFTs of length 1 (also shown inside the solid boxes). Then the DFT sequences of length 1 are combined in pairs to form DFTs of length 2 (shown inside the coarse dashed boxes). Then DFT sequences of length 2 are combined in pairs to form two DFTs of length 4 (shown inside the fine dashed boxes and denoted $Y_k$ and $Z_k$). Finally, the two DFTs of length 4 are combined to form the desired DFT $X_k$, of length 8. All of the combine steps use the Cooley-Tukey butterfly relations and are shown as webs of arrows.
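The two stages (reordering, then combining) map naturally onto an iterative program that overwrites its input. The sketch below is one common way to organize them, assuming $N$ is a power of two and the unscaled convention $X_k = \sum_n x_n \omega_N^{-nk}$. The names `bit_reverse_permute` and `fft_in_place` are hypothetical, and the incremental update of the twiddle factor is a simplification that trades a little accuracy for brevity.

```python
import cmath


def bit_reverse_permute(a):
    """Reorder the list a (length a power of two) into bit-reversed order, in place."""
    N = len(a)
    j = 0
    for i in range(1, N):
        bit = N >> 1
        while j & bit:          # add 1 to j, working from the high-order bit down
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:               # swap each pair only once
            a[i], a[j] = a[j], a[i]


def fft_in_place(a):
    """Reordering stage followed by log2(N) combine steps; a is overwritten."""
    N = len(a)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    bit_reverse_permute(a)
    size = 2
    while size <= N:            # combine DFTs of length size/2 into DFTs of length size
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, N, size):
            w = 1 + 0j
            for k in range(half):
                t = w * a[start + k + half]
                u = a[start + k]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2
    return a


order = list(range(8))
bit_reverse_permute(order)      # the reordering pattern for N = 8
shifted_spike = fft_in_place([0, 1, 0, 0, 0, 0, 0, 0])
```

For $N = 8$ the reordering stage alone produces the index pattern 0, 4, 2, 6, 1, 5, 3, 7, and the transform of a unit spike at $n = 1$ should be $\omega_8^{-k}$.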

Now let's count complex arithmetic operations with the help of Figure 10.1. The combine stage consists of $M = \log_2 N$ steps, each of which involves $N/2$ butterflies, each of which requires one complex multiplication and two complex additions. This gives a total operation count for an $N$-point complex DFT of

$$N \log_2 N$$

complex additions and

$$\frac{N}{2} \log_2 N$$

complex multiplications. Even this tally can be improved if one accounts for multiplications by $\pm 1$ and $\pm i$ (which needn't be done). The point of overwhelming practical importance is that the FFT provides a factor of $N/\log N$ in computational savings over the evaluation of the DFT by its definition. As shown in Figure 10.2, this factor itself increases with $N$.

FIG. 10.2. A comparison of the computational complexities of the DFT (solid line) and the FFT (dashed line) is shown graphically. Plotted is the floating-point operation count as a function of $N$. The DFT is an $O(N^2)$ operation, while the FFT has a complexity of $O(N \log_2 N)$, assuming that $N = 2^p$. Similar savings are produced for other values of $N$ as well.

Note that the FFT just described can be done in-place, meaning that the computation can be done without additional storage arrays. The method does have the bothersome reordering stage, but, as we will see, this is often unnecessary or FFTs can be found that avoid it. It should also be noted that the splitting algorithm can be generalized to the radix-$r$ case for sequences of length $N = r^M$, where $r > 1$ and $M$ are natural numbers, provided one has efficient ways to evaluate $r$-point DFTs (see problems 233 and 234). With a bit more ingenuity, the method can be extended to the mixed radix case for sequences of length $N = r_1^{M_1} \cdots r_L^{M_L}$, where $r_j > 1$ and $M_j$ and $L$ are natural numbers.

A quick physical interpretation of the splitting method might go as follows. The even subsequence $y_n$ is a sampling of the original sequence $x_n$. The odd subsequence $z_n$ is also a sampling of $x_n$, shifted by one grid point with respect to the first sampling. The DFT of the original sequence is just an average of the DFTs of the two subsequences. Since the second sample is shifted relative to the first, the shift theorem of Chapter 3 requires that the rotation $\omega_N^{-k}$ multiply the DFT of the odd subsequence [19].
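The radix-2 operation counts quoted above can be recovered from the recursion itself: a transform of length $N$ costs two transforms of length $N/2$ plus $N/2$ butterflies. The sketch below simply unrolls that recurrence, assuming one complex multiplication and two complex additions per butterfly and no special treatment of multiplications by $\pm 1$ or $\pm i$. The function name `butterfly_counts` is hypothetical.

```python
def butterfly_counts(N):
    """Return (complex multiplications, complex additions) for a radix-2 FFT
    of length N = 2^M, obtained from the splitting recurrence."""
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return (0, 0)                      # a length-1 DFT costs nothing
    mults, adds = butterfly_counts(N // 2)
    # Two half-length transforms plus N/2 butterflies at this level.
    return (2 * mults + N // 2, 2 * adds + N)


mults_1024, adds_1024 = butterfly_counts(1024)
direct_1024 = 1024 * 1024                  # roughly N^2 multiplications by the definition
savings = direct_1024 / mults_1024
```

For $N = 1024$ the recurrence gives $(N/2)\log_2 N = 5120$ multiplications and $N \log_2 N = 10240$ additions, so the multiplication count is reduced by a factor of about 205 relative to $N^2$.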

With more brevity, we will now outline how another canonical FFT can be derived by the splitting method. As before, we begin with an input sequence $x_n$ of length $N$, but this time it is split into its first-half and last-half subsequences $x_n$ and $x_{n+N/2}$. We make these replacements in the DFT definition (10.1) to find that

$$X_k = \sum_{n=0}^{N/2-1} x_n \omega_N^{-nk} + \sum_{n=0}^{N/2-1} x_{n+N/2}\, \omega_N^{-(n+N/2)k}. \tag{10.5}$$

We are now headed down a different road, and a new orientation is needed. Replacing $k$ by $2k$ in (10.5) tells us that the even coefficients of the DFT are given by

$$X_{2k} = \sum_{n=0}^{N/2-1} \left( x_n + x_{n+N/2} \right) \omega_{N/2}^{-nk}, \tag{10.6}$$

whereas if $k$ is replaced by $2k+1$, we find the odd coefficients given by

$$X_{2k+1} = \sum_{n=0}^{N/2-1} \left( x_n - x_{n+N/2} \right) \omega_N^{-n}\, \omega_{N/2}^{-nk}. \tag{10.7}$$

Once again, the crucial properties $\omega_N^{2} = \omega_{N/2}$ and $\omega_N^{-N/2} = -1$ have been used. The expressions (10.6) and (10.7) can be used for $k = 0 : N/2 - 1$ to generate all $N$ values $X_0, \ldots, X_{N-1}$.

We see that both the even and odd terms of $X_k$ can be found by performing DFTs of length $N/2$ on simple combinations of the subsequences $x_n$ and $x_{n+N/2}$. These simple combinations are both subsequences of length $N/2$, and we call them $y_n$ and $z_n$, respectively. They are defined by another set of butterfly relations, namely

$$y_n = x_n + x_{n+N/2}, \tag{10.8}$$

$$z_n = \left( x_n - x_{n+N/2} \right) \omega_N^{-n} \tag{10.9}$$

for $n = 0 : N/2 - 1$. Once again, the task of computing a DFT of length $N$ has been replaced by the task of computing the DFTs of shorter sequences, in this case $y_n$ and $z_n$, both of length $N/2$. As with the previous FFT we do not stop here. The same strategy is applied to compute the DFTs of $y_n$ and $z_n$, then again to compute the resulting DFTs of length $N/4$, until eventually (recalling that $N$ is a power of two), the job has been reduced to computing $N$ DFTs of length 1, which is trivial. Therefore, the only work involves preparing the input sequences for each successive DFT, which can be done by applying the butterfly relations (10.8) and (10.9) with the appropriate value of $N$.

This procedure leads to another quite different FFT which is illustrated schematically in Figure 10.3. It is the algorithm proposed by Gentleman and Sande in 1966 [64]. Notice that in contrast to the Cooley-Tukey FFT (which requires a reordering of the input sequence, but computes the transform coefficients in natural order), this FFT takes the input sequence in natural order, but produces the transform coefficients in scrambled (bit-reversed) order. For this reason, this FFT is often described as decimation-in-frequency. In contrast to the Cooley-Tukey FFT, the combine stage occurs first and the reordering stage occurs last. This FFT can also be done in-place, it has the same computational cost as the Cooley-Tukey FFT, and it can be extended to more general values of $N$ (see problem 236).

In both of these FFTs, the scrambling of the input or output sequence is an annoyance that can be avoided. By a clever rearrangement of the intermediate sequences, it is actually possible to devise "self-sorting" FFTs that accept a naturally ordered input sequence and produce a naturally ordered output sequence: unfortunately, the price is an extra storage array. These FFTs are generally associated with the name Stockham [38], although they are special cases of a family of FFTs proposed by Glassman [66]. Another rearrangement of the Cooley-Tukey FFT produces an FFT in which the length of the inner loops remains constant ($N/2$). The advantage of this FFT for parallel computation was realized in a prescient 1968 paper by Pease [112]. Much more recently, the seemingly inescapable trade-off between reordering and extra storage has been overcome, but only within the prime factor setting (to be discussed shortly).

FIG. 10.3. The Gentleman-Sande FFT, shown here for a sequence of length $N = 8$, also takes place in two phases. The input sequence $x_n$ goes into the combine phase in natural order. After the first combine step two subsequences of length 4, $y_n$ and $z_n$ (shown inside the fine dashed boxes), are produced whose DFTs are the even and odd terms of the desired DFT, $X_{2k}$ and $X_{2k+1}$, respectively. After the second combine step, four subsequences of length 2 are produced (shown inside the coarse dashed boxes) whose DFTs are $\{X_0, X_4\}$, $\{X_2, X_6\}$, $\{X_1, X_5\}$, $\{X_3, X_7\}$, respectively. After the third combine step, eight subsequences of length 1 are produced whose DFTs can be found trivially (solid boxes). These three combine steps result in the desired DFT $X_k$ in scrambled (bit-reversed) order. The reordering phase recovers the DFT in its natural order. All combine steps use the Gentleman-Sande butterfly relations (shown as webs of arrows).
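The first-half/last-half splitting can also be sketched recursively. The version below assumes $N$ is a power of two and the unscaled convention $X_k = \sum_n x_n \omega_N^{-nk}$; the name `fft_dif` is hypothetical. Note one deliberate difference from the in-place organization described in the text: instead of leaving the coefficients in bit-reversed order, this sketch interleaves the two half-length results at each level, so the output comes back in natural order at the cost of extra storage.

```python
import cmath


def fft_dif(x):
    """Decimation-in-frequency splitting: combine first, then transform the halves."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    half = N // 2
    # Butterfly relations: y_n = x_n + x_{n+N/2},  z_n = (x_n - x_{n+N/2}) * w_N^(-n)
    y = [x[n] + x[n + half] for n in range(half)]
    z = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / N)
         for n in range(half)]
    X = [0j] * N
    X[0::2] = fft_dif(y)     # even-indexed coefficients X_{2k}
    X[1::2] = fft_dif(z)     # odd-indexed coefficients X_{2k+1}
    return X


shifted_spike = fft_dif([0, 1, 0, 0, 0, 0, 0, 0])
```

As a check, the transform of a unit spike at $n = 1$ should equal $\omega_8^{-k}$ for $k = 0 : 7$.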

It should also be mentioned that in many important FFT applications (such as computing convolutions and solving boundary value problems), forward DFTs are ultimately followed by inverse DFTs. Therefore, if one is willing to do computations in the transform domain with respect to bit-reversed indices (and these are often very simple computations), it is possible to use a decimation-in-frequency FFT for the forward transforms and a decimation-in-time FFT for the inverse transforms, thereby avoiding a reordering of either the input or output data.
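Cyclic convolution is a convenient illustration of this pairing, because the work in the transform domain is a pointwise product that does not care how the indices are ordered. The sketch below assumes $N$ is a power of two; the names `dif_no_reorder`, `dit_no_reorder`, and `cyclic_convolution` are hypothetical. The forward routine performs only the Gentleman-Sande combine steps (output left bit-reversed) and the inverse routine performs only the Cooley-Tukey combine steps with the angles negated (input expected bit-reversed), so no reordering pass appears anywhere.

```python
import cmath


def dif_no_reorder(a, sign=-1):
    """Gentleman-Sande combine steps in place: natural-order input,
    bit-reversed output. sign=-1 gives the forward transform."""
    N = len(a)
    size = N
    while size >= 2:
        half = size // 2
        w_step = cmath.exp(sign * 2j * cmath.pi / size)
        for start in range(0, N, size):
            w = 1 + 0j
            for n in range(half):
                u, v = a[start + n], a[start + n + half]
                a[start + n] = u + v
                a[start + n + half] = (u - v) * w
                w *= w_step
        size //= 2
    return a


def dit_no_reorder(a, sign=-1):
    """Cooley-Tukey combine steps in place: bit-reversed input,
    natural-order output. sign=+1 gives the (unscaled) inverse transform."""
    N = len(a)
    size = 2
    while size <= N:
        half = size // 2
        w_step = cmath.exp(sign * 2j * cmath.pi / size)
        for start in range(0, N, size):
            w = 1 + 0j
            for k in range(half):
                t = w * a[start + k + half]
                u = a[start + k]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2
    return a


def cyclic_convolution(f, g):
    """h_n = sum_m f_m g_{n-m} (indices mod N) with no reordering pass."""
    N = len(f)
    if N < 1 or N & (N - 1) or len(g) != N:
        raise ValueError("both lengths must be the same power of two")
    F = dif_no_reorder(list(f))
    G = dif_no_reorder(list(g))
    # Both spectra are scrambled identically, so the product lines up.
    H = [F[j] * G[j] for j in range(N)]
    h = dit_no_reorder(H, sign=+1)
    return [value / N for value in h]     # scaling by 1/N applied at the end


shifted = cyclic_convolution([1, 2, 3, 4], [0, 1, 0, 0])
```

Convolving with a unit spike at $n = 1$ should cyclically shift the first sequence by one place, giving 4, 1, 2, 3 up to rounding error.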

Finally, we mention that all FFTs, including those presented in this section, can be adapted to compute inverse DFTs (IDFTs). Recall that the DFT and the IDFT differ only by a multiplicative factor of $1/N$ and the replacement of $\omega$ by $\omega^{-1}$. Therefore, the simplest way to produce an inverse FFT is to replace all references to $\omega$ in an FFT program by $\omega^{-1}$ (equivalently, negate all references to angles). The scaling by $1/N$ can be handled after the transform is done, if necessary. However, inverse FFTs can be created in other ways as well (see problems 237 and 238).
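In a program, "negating all references to angles" can amount to a single sign parameter. The sketch below assumes $N$ is a power of two and the convention used in this chapter, in which the forward transform carries no scaling, so the factor $1/N$ is applied on the inverse. The names `fft` and `ifft` and the `sign` parameter are hypothetical.

```python
import cmath


def fft(x, sign=-1):
    """Recursive radix-2 transform; sign=-1 is the forward DFT,
    sign=+1 replaces every angle by its negative."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    Y = fft(x[0::2], sign)
    Z = fft(x[1::2], sign)
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / N) * Z[k]
        X[k] = Y[k] + t
        X[k + N // 2] = Y[k] - t
    return X


def ifft(X):
    """Inverse transform: same program with negated angles, then scale by 1/N."""
    N = len(X)
    return [value / N for value in fft(X, sign=+1)]


x = [1, 2 - 1j, 0, -3j, 4, 0.5, -2, 1j]
round_trip_error = max(abs(a - b) for a, b in zip(x, ifft(fft(x))))
```

Applying `ifft` to the output of `fft` should reproduce the original sequence to rounding error.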


10.3. Index Expansions

In this section we give just a hint of another approach that can be used to derive both of the FFTs of the previous section plus many more. It also has historical significance because it is the technique used by Cooley and Tukey. We assume that the length of the input sequence $N$ can be factored as $N = pq$. The indices $n$ and $k$ that appear in the definition of the DFT may now be represented in the forms

$$n = n_1 p + n_0 \quad \text{for } n_0 = 0 : p-1, \ n_1 = 0 : q-1,$$

$$k = k_1 q + k_0 \quad \text{for } k_0 = 0 : q-1, \ k_1 = 0 : p-1.$$

To reflect these index expansions, we now write $x_n$ and $X_k$ as $x(n_1, n_0)$ and $X(k_1, k_0)$, and substitute into the DFT definition (10.1). We immediately find that

$$X(k_1, k_0) = \sum_{n_0=0}^{p-1} \sum_{n_1=0}^{q-1} x(n_1, n_0)\, \omega_N^{-(n_1 p + n_0)(k_1 q + k_0)}$$

for $k_0 = 0 : q-1$, $k_1 = 0 : p-1$. The exponent can now be expanded and simplified noting that $\omega_N^{-pq} = \omega_N^{-N} = 1$ and $\omega_N^{-n_1 k_0 p} = \omega_q^{-n_1 k_0}$. This leads to

$$X(k_1, k_0) = \sum_{n_0=0}^{p-1} \left[ \sum_{n_1=0}^{q-1} x(n_1, n_0)\, \omega_q^{-n_1 k_0} \right] \omega_N^{-n_0 (k_1 q + k_0)}$$

for $k_0 = 0 : q-1$, $k_1 = 0 : p-1$. This expression requires some inspection. The inner sum on $n_1$ can be identified as $p$ DFTs of length $q$ of sequences consisting of every $p$th term of the input sequence $x_n$. Since this inner sum depends directly on $k_0$ and $n_0$, we will denote it $Y(k_0, n_0)$. This allows us to write that

$$X(k_1, k_0) = \sum_{n_0=0}^{p-1} Y(k_0, n_0)\, \omega_N^{-n_0 (k_1 q + k_0)}$$

for $k_0 = 0 : q-1$, and $k_1 = 0 : p-1$. We must now simplify the exponent of $\omega_N$ by noting that

$$\omega_N^{-n_0 (k_1 q + k_0)} = \omega_p^{-n_0 k_1}\, \omega_N^{-n_0 k_0}.$$

With this observation we have that

$$X(k_1, k_0) = \sum_{n_0=0}^{p-1} \left[ Y(k_0, n_0)\, \omega_N^{-n_0 k_0} \right] \omega_p^{-n_0 k_1}$$

for $k_0 = 0 : q-1$ and $k_1 = 0 : p-1$. This last expression may be interpreted as $q$ DFTs of length $p$ of sequences derived from the intermediate sequences $Y(k_0, n_0)$.

Now let's step away from the forest far enough to make an extremely valuable observation, one that will lead to many more FFTs. At least conceptually, we see that a single DFT of length $N = pq$ may be cast as a two-dimensional DFT of size $p \times q$. To be sure, there are some complications, such as the appearance of the "twiddle factors" $\omega_N^{-n_0 k_0}$ in the second set of DFTs. Nevertheless, this interpretation is very important. It may be illuminated by Figure 10.4, which shows the two-dimensional computational array in which the FFT takes place. Note the arrangement of the four indices. The input sequence $x_n$ is ordered by rows in this array. The inner sum ($p$ DFTs of length $q$) corresponds to transforming the columns of the array, since with $n_0$ fixed, the input sequences for these DFTs consist of every $p$th element of the original sequence. The outer sum corresponds to transforming in the direction of the rows of the array (with $k_0$ fixed).
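The two waves of short DFTs, with the twiddle factors applied between them, can be written out almost literally. The sketch below assumes the index maps $n = n_1 p + n_0$ and $k = k_1 q + k_0$ and the unscaled convention $X_k = \sum_n x_n \omega_N^{-nk}$; the names `dft` and `dft_pq` are hypothetical, and the short DFTs are evaluated from the definition so that only the index bookkeeping is on display.

```python
import cmath


def dft(x):
    """Direct DFT from the definition, X_k = sum_n x_n * w^(-n k)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def dft_pq(x, p, q):
    """DFT of length N = p*q as p DFTs of length q followed by q DFTs of length p."""
    N = len(x)
    if N != p * q:
        raise ValueError("len(x) must equal p * q")
    # First wave: for each n0, a length-q DFT over n1 of every p-th term.
    Y = [[0j] * p for _ in range(q)]           # Y[k0][n0]
    for n0 in range(p):
        column = dft(x[n0::p])                 # x(n1, n0) = x[n1 * p + n0]
        for k0 in range(q):
            Y[k0][n0] = column[k0]
    # Second wave: twiddle by w_N^(-n0 k0), then a length-p DFT over n0.
    X = [0j] * N
    for k0 in range(q):
        row = [Y[k0][n0] * cmath.exp(-2j * cmath.pi * n0 * k0 / N)
               for n0 in range(p)]
        transformed = dft(row)
        for k1 in range(p):
            X[k1 * q + k0] = transformed[k1]   # k = k1 * q + k0
    return X


x = [complex(n % 5, (3 * n) % 7) for n in range(12)]
difference = max(abs(a - b) for a, b in zip(dft(x), dft_pq(x, 3, 4)))
```

For the length-12 test sequence with $p = 3$ and $q = 4$, `difference` should be at the level of rounding error; swapping the roles to $p = 4$, $q = 3$ should work equally well.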

If FFTs for sequences of length $p$ and $q$ are already available, then this single factorization would result in a complete FFT for sequences of length $N$. Otherwise, the procedure can be repeated for subsequent factorizations of $p$ and $q$, which results in the conversion of a single DFT into a multidimensional DFT. For example, if $N = pqr$, the $N$-point DFT could be expressed as a three-dimensional ($p \times q \times r$) DFT. If $p$ and $q$ are themselves prime or relatively prime, there are still other possibilities that will be mentioned later.

The computational savings with this single factoring of $N$ should be noted. If the DFTs of length $p$ and $q$ are done as matrix-vector products, then $q$ DFTs of length $p$ require roughly $qp^2$ complex multiplications and additions, while $p$ DFTs of length $q$ require roughly $pq^2$ complex multiplications and additions. Thus, the DFT of length $N$ requires

$$qp^2 + pq^2 = pq(p + q) = N(p + q)$$

complex multiplications and additions compared to $N^2$ operations to compute the DFT by the definition. If this process is repeated for the DFTs of length $p$ and $q$, then it can be shown that an $N \log N$ operation count results for the entire DFT. We mention an interesting variation on index expansions due to de Boor [46], who interprets the FFT as nested polynomials "with a twist."
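These rough counts are easy to tabulate. The sketch below uses the same simplifications as the text: a DFT of length $m$ done as a matrix-vector product is charged $m^2$ operations, and the twiddle-factor multiplications are neglected. The function names are hypothetical, and `full_factoring_cost` assumes the caller supplies the complete list of factors of $N$.

```python
import math


def direct_cost(N):
    """Rough cost of the DFT by its definition."""
    return N * N


def one_factoring_cost(p, q):
    """q DFTs of length p plus p DFTs of length q, each as a matrix-vector product."""
    return q * p * p + p * q * q          # equals N * (p + q) with N = p * q


def full_factoring_cost(factors):
    """Apply the factoring repeatedly until only DFTs of the listed lengths remain."""
    if len(factors) == 1:
        return factors[0] ** 2
    p, rest = factors[0], factors[1:]
    q = math.prod(rest)
    # q DFTs of length p done directly, p DFTs of length q factored further.
    return q * p * p + p * full_factoring_cost(rest)


single = one_factoring_cost(32, 32)       # N = 1024 split once
repeated = full_factoring_cost([2] * 10)  # N = 1024 split all the way down
definition = direct_cost(1024)
```

For $N = 1024$ this gives $N^2 = 1048576$ by the definition, $N(p + q) = 65536$ for the single factoring $32 \times 32$, and $2N \log_2 N = 20480$ when the factoring is carried all the way down to factors of 2, which illustrates the $N \log N$ behavior under these assumptions.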

The decimation-in-time FFT presented in the previous section may be viewed in this framework by letting $p = 2$ and $q = N/2$. We then have that

$$X(k_1, k_0) = \sum_{n_0=0}^{1} \left[ \sum_{n_1=0}^{N/2-1} x(n_1, n_0)\, \omega_{N/2}^{-n_1 k_0} \right] \omega_N^{-n_0 k_0}\, \omega_2^{-n_0 k_1}$$

for $k_0 = 0 : N/2 - 1$, $k_1 = 0 : 1$. The inner sum consists of two DFTs of length $N/2$, one transforming the even terms of the input sequence, the other transforming the odd terms of the input sequence. This reflects the odd/even rearrangement of the input sequence that arose in the splitting method. The outer sum can be identified as the two-term butterfly relations (10.3) and (10.4) that appeared in the splitting approach. Thus, one factoring of $N$ ($N = 2 \times N/2$) corresponds to one step of the splitting method. As with the splitting method, a full FFT results when the factoring is next applied to the DFTs of length $N/2$.

Hopefully this glimpse of the index expansion idea at least suggests how it can lead to many more FFT algorithms. Just within the example presented above, it is possible to reverse the order of the two sums, which leads to a decimation-in-frequency FFT (see problem 239). However, there is much more generality: if $N$ is composite and consists of $M$ different factors (counting multiplicity), then the DFT can be expressed in terms of $M$ nested sums, and each index, $n$ and $k$, has $M$ terms in its expansion. Now the $M$ nested sums can be rearranged in $M!$ different ways, leading to as many different FFTs, some of which are useful, most of which are not. Fortunately this sudden proliferation of FFTs can be organized greatly by the next perspective.


FIG. 10.4. A single DFT of length $N = pq$ may be viewed as $p$ DFTs of length $q$ followed by $q$ DFTs of length $p$ that take place in the two-dimensional ($p \times q$) array shown above. The first wave of DFTs transforms along the columns of the array; the second wave of DFTs transforms along the rows of the array. The indices $n_0$ and $k_1$ increase along the rows of the array, while the indices $n_1$ and $k_0$ increase along the columns of the array.

10.4. Matrix Factorizations

We will now survey a very powerful and encompassing setting in which FFTs can be derived and understood. This approach has been promoted very convincingly in a recent book by Van Loan [155], which deserves to be recognized as the next in a series of definitive books about the FFT. The book succeeds in expressing most known FFTs as matrix factorizations of the DFT matrix $W_N$, thus unifying and organizing the entire FFT menagerie. The idea of expressing the FFT as a factorization of $W_N$ first appears in the late 1960s in papers by W. M. Gentleman [63] and F. Theilheimer [151], and was elaborated upon frequently thereafter. Hopefully our glancing coverage of this approach will suggest its generality.

In order to interpret the DFT as a matrix-vector product, we will think of the periodic input and output sequences of the DFT as the column $N$-vectors

This enables us to write the DFT in the form $X = W_N x$. To be quite specific, consider the case of a DFT of length $N = 16$ which is to be computed by the Cooley-Tukey (decimation-in-time) FFT. Recall the butterfly relations (10.3) and (10.4) that indicate how two DFTs, $Y_k$ and $Z_k$, of length $N/2 = 8$ can be combined to form $X_k$, the DFT of length $N = 16$:


Letting $Y$ and $Z$ be the eight-vectors corresponding to the sequences $Y_k$ and $Z_k$, we can write the butterfly relations (10.12) and (10.13) in the form

where in all that follows $I_m$ is the identity matrix of order $m$ and $\Omega_m$ is the ($m \times m$) diagonal matrix

The matrix used for this combine step is called a butterfly matrix and we have denoted it $B_{16}$. It should be verified (problem 241) that this product is exactly the eight pairs of butterfly relations given in (10.12) and (10.13).

Now note that $Y$ and $Z$ are themselves DFTs (of length $N = 8$) of the even and odd subsequences of $x$, which we will call $x_{\text{even}}$ and $x_{\text{odd}}$. Therefore, similar butterfly matrices may be used to compute them. For example,

where $Y'$ and $Z'$ denote the DFTs (of length $N = 4$) of the even and odd subsequences of $x_{\text{even}}$. A similar expression may be written for the transform of $x_{\text{odd}}$:

where $Y''$ and $Z''$ are the DFTs (of length $N = 4$) of the even and odd subsequences of $x_{\text{odd}}$.

Before we become strangled by our notation, we should find a pattern. So far we have that

If we were to continue in this manner and express the remaining DFTs of length $N = 4$ and $N = 2$ in terms of the butterfly matrices $B_4$ and $B_2$, the result would be

We see that the DFT matrix $W_{16}$ has been expressed as a product of four matrices consisting of the butterfly matrices $B_m$ for $m = 2, 4, 8, 16$. At each step, the matrix $B_m$ holds the butterfly relations for combining two DFTs of length $m/2$ into one DFT of length $m$. Notice that as this factorization progresses, the rearrangements of the input sequence "pile up" at the right end of the product. Therefore, this product of matrices operates not on the input sequence $x$, but on a permuted input sequence $P^T x$, which is the scrambled sequence that we have already encountered in the decimation-in-time FFT. (The matrix $P$ is a permutation matrix created by performing successive odd/even rearrangements of the columns of the identity matrix $I_{16}$; premultiplying $x$ by $P^T$ rearranges the components of $x$ in the same manner.) This formulation clearly shows the two stages of this particular FFT: the input sequence is first reordered by the permutation matrix $P^T$, and then it passes through $\log_2 N$ combine stages.
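The two stages (a reordering followed by $\log_2 N$ combine stages) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the forward convention $X_k = \sum_n x_n \omega_N^{-nk}$ with $\omega_N = e^{i2\pi/N}$, uses the fact that successive odd/even rearrangements amount to reversing the bits of each index, and applies each butterfly in place rather than forming the matrices explicitly. The function names are hypothetical.

```python
import cmath

def dft(x):
    """Direct DFT, used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def bit_reverse(n, bits):
    """Reverse the lowest `bits` binary digits of n."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)
        n >>= 1
    return r

def fft_dit(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    bits = N.bit_length() - 1
    assert N == 1 << bits, "length must be a power of two"
    # Reordering stage: the scrambled (bit-reversed) input sequence.
    a = [complex(x[bit_reverse(n, bits)]) for n in range(N)]
    # Combine stages for m = 2, 4, ..., N: each stage merges pairs of
    # DFTs of length m/2 into DFTs of length m.
    m = 2
    while m <= N:
        half = m // 2
        for start in range(0, N, m):            # N/m independent blocks
            for k in range(half):
                t = cmath.exp(-2j * cmath.pi * k / m) * a[start + k + half]
                u = a[start + k]
                a[start + k] = u + t            # butterfly, upper output
                a[start + k + half] = u - t     # butterfly, lower output
        m *= 2
    return a
```

For $N = 16$ the reordering sends the input indices $0, 1, 2, \ldots, 15$ to the order $0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15$.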

We have written the matrix factorization in this rather cumbersome way to motivate the next step, which expresses it much more compactly. Recall that the Kronecker product of a $p \times q$ matrix $A$ and an $r \times s$ matrix $B$ is the $pr \times qs$ matrix

$$A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1q}B \\ \vdots & & \vdots \\ a_{p1}B & \cdots & a_{pq}B \end{pmatrix},$$

where the $a_{ij}$'s are the elements of $A$. We see that the factorization in (10.14) can be simplified nicely by the Kronecker product to give

In general, the Cooley-Tukey FFT for $N = 2^M$ can be expressed as

where $A_k = I_r \otimes B_m$, $m = 2^k$, and $rm = N$ for $k = 1 : M$. As before, we have let

The matrix factorization approach gives a concise representation of the Cooley-Tukey FFT. Now, how does it generalize? First, the matrix factorization idea can be applied directly to the radix-$r$ case (see problem 245). A greater challenge is to extend it to the mixed-radix case in which $N$ is a product of powers of different integers; but it can be done. Furthermore, noting that the DFT matrix $W_N$ is symmetric, it follows that

is also a factorization that corresponds to an FFT. This transpose factorization, in which the combine stage appears first and the reordering stage appears last, is the decimation-in-frequency FFT of Gentleman and Sande that we have already encountered. Notice that inverse DFT algorithms could also be obtained by taking the inverse of the above factorizations.
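A sketch of the transpose ordering, with the combine stages first and the reordering last, might look like the following. It is an illustration under the same assumptions as the rest of this discussion (forward convention $X_k = \sum_n x_n \omega_N^{-nk}$, $N$ a power of two), with hypothetical function names; each stage applies the transposed butterfly, which forms a sum and a twiddled difference.

```python
import cmath

def dft(x):
    """Direct DFT, used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def bit_reverse(n, bits):
    """Reverse the lowest `bits` binary digits of n."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)
        n >>= 1
    return r

def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    N = len(x)
    bits = N.bit_length() - 1
    assert N == 1 << bits, "length must be a power of two"
    a = [complex(v) for v in x]
    # Combine stages first, for m = N, N/2, ..., 2.
    m = N
    while m >= 2:
        half = m // 2
        for start in range(0, N, m):
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half]
                a[start + k] = u + v
                a[start + k + half] = (u - v) * cmath.exp(-2j * cmath.pi * k / m)
        m //= 2
    # Reordering stage last: position j now holds X at the bit-reversed index.
    return [a[bit_reverse(k, bits)] for k in range(N)]
```

Here the input is taken in natural order and the output emerges scrambled, which is the mirror image of the decimation-in-time arrangement.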

But there is yet another level of generality. Instead of applying the permutation matrix $P$ before or after the combine phase, it is possible to distribute it throughout the combine phase. The factorization

where the $P_k$'s are permutation matrices, is an FFT in which a (usually simple) reordering of the data occurs before each butterfly step. Clearly, many possible FFTs could result from this strategy. Among the few useful ones are the Stockham self-sorting FFTs and the Pease FFT mentioned earlier. We have still not exhausted the scope of the matrix factorization approach, and it will reappear to simplify the discussion of the next section.

We close the discussion on FFTs in the general Cooley-Tukey mode by mentioning a clever and effective variation on the methods discussed heretofore. Revealed first in a paper by Duhamel and Hollmann [52] in 1984, the split-radix FFT can be derived within any of the above frameworks and it is fairly easy to describe. In the radix-2 case, the split-radix method proceeds by splitting the original sequence of length $N$ into one sequence of length $N/2$ and two more sequences of length $N/4$. It turns out that a splitting of this form requires less arithmetic than either a regular radix-2 or radix-4 splitting. The strategy is then repeated on subsequences to produce a competitive, if not superior, FFT. The split-radix idea can also be generalized to other radices [157] and to symmetric sequences [51], [126].

10.5. Prime Factor and Convolution Methods

The FFTs discussed so far represent conceptually about half of all known FFT algorithms. In this section we attempt to summarize briefly the ideas that lead to the "other half." This task is more difficult since the foundations of prime factor algorithms (PFAs) are closely related to a strategy of evaluating DFTs through convolution. In fact, the two approaches can be combined within the same FFT, which further blurs the distinction between them. Nevertheless, a coarse survey is worth attempting in order to complete the FFT picture.

The idea underlying prime factor FFTs is generally traced to the 1958 paper by Good [68]. The essential ideas were also published by L. H. Thomas [152] in 1963. Good's 1958 paper is concerned primarily with a problem in the design of experiments. Almost as an afterthought, the author appends "some analogous short cuts in practical Fourier analysis" with the hope that the work "may be of some interest and practical value!" The prime factor idea was dormant for almost 20 years after the publication of Good's paper. Its revival is usually attributed to the 1977 paper by Kolba and Parks [86], with subsequent work by Winograd [164], Burrus [26], Burrus and Eschenbacher [27], and Temperton [147] leading to the ultimate acceptance of prime factor algorithms as competitive, and in some cases, preferable, FFTs [85], [148], [150].

Perhaps the simplest way to reach the crux of the prime factor idea is to return to the decimation-in-time FFT of length $N = pq$ as it was presented in the index expansion setting. Recall that the $N$-point DFT can be viewed as two waves of DFTs:

1. p DFTs of length q:

for $n_0 = 0 : p - 1$, $k_0 = 0 : q - 1$, and $k_1 = 0 : p - 1$.

2. q DFTs of length p:


for $k_0 = 0 : q - 1$ and $k_1 = 0 : p - 1$.

With just a bit of a leap, it is possible to rewrite this two-step process in the matrix form

where $P^T$ represents the reordering of the input sequence, $(I_p \otimes W_q)$ represents the $p$ transforms of length $q$, and $(W_p \otimes I_q)$ represents the $q$ transforms of length $p$. The important issue is the diagonal $N \times N$ matrix $\Omega$ that holds the "twiddle factors" (powers of $\omega$) that must be applied between the two waves of transforms. If these twiddle factors could be eliminated, then we would have, by a property of the Kronecker product ($(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$), that

(We have also used the fact that a permutation matrix $P$ satisfies $P^{-1} = P^T$.) This would result in the reduction of a DFT of length $N$ to a genuine two-dimensional DFT of size $p \times q$ with no intermediate "twiddling" required. And this is the motivation of the prime factor movement.
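The Kronecker product and the mixed-product property used above are easy to check numerically on small matrices. The following is a minimal sketch with matrices stored as lists of rows; the helper names `kron`, `matmul`, and `identity` are hypothetical, and the particular integer matrices are arbitrary test data.

```python
def kron(A, B):
    """Kronecker product of A (p x q) and B (r x s), returned as a (pr x qs) list of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product of two compatible matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    """Identity matrix of order n."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Arbitrary small integer matrices for the check.
A = [[1, 2], [3, 4]]
B = [[2, 0], [1, 3]]
C = [[0, 1], [1, 1]]
D = [[1, -1], [2, 0]]

# Mixed-product property: (A kron B)(C kron D) = (AC) kron (BD).
lhs = matmul(kron(A, B), kron(C, D))
rhs = kron(matmul(A, C), matmul(B, D))

# I_2 kron B is block diagonal: B is applied to each contiguous block of a vector.
block_diagonal = kron(identity(2), B)
```

The block-diagonal form of `kron(identity(2), B)` is what makes a factor such as $(I_p \otimes W_q)$ correspond to $p$ independent transforms of length $q$.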

The underlying task becomes one of finding orderings of the input and output sequences, represented by permutation matrices $C$ and $R$, respectively, such that

If this task can be accomplished, then the DFT can be written as

The quest for these orderings requires a refreshing excursion into elementary number theory, and while it is entirely worthwhile, we can only summarize the salient results [98], [107]. The conclusion is that if $p$ and $q$ are relatively prime (have no common divisors other than 1), then such orderings can be found. The orderings are accomplished through mappings $\mathcal{M}$ that take a single index array of length $N = pq$ into a $p \times q$ index array:

In the original prime factor scheme proposed by Good, the ordering of the input sequence, $R^T$, was given by the Ruritanian map, while the reordering of the output sequence, $C^T$, was given by the Chinese Remainder Theorem (CRT) map.

As an example, consider the case in which $N = 3 \cdot 5$, for which we need reorderings $C^T$ and $R^T$ such that

The reordered vectors $R^T x$ and $C^T X$ may be represented as $3 \times 5$ arrays. For this particular case, the appropriate reorderings are given by


and

The prime factor FFT now proceeds in the following steps.

1. Form the array $R^T x$ from the input sequence.

2. Apply $(I_5 \otimes W_3)$, which means performing three-point DFTs of the columns of $R^T x$ (overwriting the original array).

3. Apply $(W_5 \otimes I_3)$, which means performing five-point DFTs of the rows of the array produced in step 2 (overwriting the previous array).

4. Recover the components of $X$ from the array $C^T X$.

The prime factor algorithm can seem like an improbable sleight-of-hand until it is actually worked out in detail. This opportunity is provided in problems 243 and 244.
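The four steps can be sketched in Python for any relatively prime $p$ and $q$. This is an illustration rather than the scheme exactly as tabulated in the text: it assumes the forward convention $X_k = \sum_n x_n \omega_N^{-nk}$, takes the Ruritanian map in the form $n = (q n_1 + p n_2) \bmod N$, and takes the CRT map in the form $k \mapsto (k \bmod p, k \bmod q)$. These are common forms of Good's mappings and may differ in detail from the arrays used in the text. The function names are hypothetical.

```python
import cmath
from math import gcd

def dft(x):
    """Direct DFT, used for the short transforms and as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def prime_factor_dft(x, p, q):
    """DFT of length N = p*q with gcd(p, q) = 1 and no twiddle factors."""
    N = p * q
    assert len(x) == N and gcd(p, q) == 1
    # Step 1: reorder the input into a p x q array (assumed Ruritanian map).
    A = [[x[(q * n1 + p * n2) % N] for n2 in range(q)] for n1 in range(p)]
    # Step 2: p-point DFTs of the columns, overwriting the array.
    for n2 in range(q):
        column = dft([A[n1][n2] for n1 in range(p)])
        for k1 in range(p):
            A[k1][n2] = column[k1]
    # Step 3: q-point DFTs of the rows, overwriting the array.
    for k1 in range(p):
        A[k1] = dft(A[k1])
    # Step 4: recover X_k from position (k mod p, k mod q) (assumed CRT map).
    return [A[k % p][k % q] for k in range(N)]
```

With $p = 3$ and $q = 5$ this performs three-point DFTs of the columns and five-point DFTs of the rows of a $3 \times 5$ array, and no multiplication by twiddle factors occurs between the two waves.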

We hasten to add that the prime factor idea can be extended to cases in which $N$ has several factors which are mutually prime. Prime factor FFTs are competitive because they avoid computations with "twiddle factors," because there are no index calculations or data movement during the actual combine phase, and because very efficient "small $N$ DFT modules" can be designed to perform the smaller DFTs that arise. A selection of these specially crafted and streamlined modules is usually included with prime factor programs. More recently, by using more general maps or "rotated DFTs" [27], [149], PFAs have been discovered that are both in-place (require no additional storage arrays) and in-order (avoid reordering of the input and output sequences).

We now turn to the question of designing DFT modules for small values of $N$, such as those required by the prime factor FFTs. This leads to at least two more interesting paths. The first is a variation that has proven to be extremely effective in the design of FFTs. In his work on the complexity of FFT and convolution algorithms, Winograd [164] realized the possibility of expressing a DFT matrix as a product of three matrices, $W_N = ABC$, such that the elements of $A$ and $C$ are either zero or $\pm 1$, and the diagonal matrix $B$ has elements that are either real or purely imaginary. It can be proved that such a factoring minimizes the number of multiplications required in the FFT. Winograd factorizations exist for many small prime-order DFTs and can be incorporated into prime factor algorithms to evaluate the small $N$ DFTs that arise.

Another path takes us a bit further afield, but it is worth the diversion. As we have seen, there is an intimate connection between DFTs and the convolution of two sequences. The connection is usually considered to be a one-way street: if an FFT is used to evaluate DFTs, then convolutions may be evaluated very efficiently by way of the convolution theorem. However, there is a history of ideas that "go in the opposite direction." In a 1970 paper, Bluestein [11] made the observation that writing

allows us to rewrite the DFT of order N as


The sum on the right side may be recognized as the cyclic convolution of two sequences $\{x_n \omega_N^{-n^2/2}\}$ and $\{\omega_N^{n^2/2}\}$. In other words,

While this maneuver does not lead directly to an efficient FFT algorithm, it does prove that a DFT of any order can be computed in $O(N \log N)$ operations by evaluating Bluestein's convolution (possibly embedded in a larger convolution) by FFTs.
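The idea can be sketched as follows, under stated assumptions: the forward convention $X_k = \sum_n x_n \omega_N^{-nk}$ with $\omega_N = e^{i2\pi/N}$, the identity $nk = \tfrac{1}{2}\left(n^2 + k^2 - (k - n)^2\right)$, and an embedding of the convolution in a cyclic convolution whose length $L \geq 2N - 1$ is a power of two, so that a simple radix-2 FFT can evaluate it. The function names are hypothetical, and this is an illustration rather than a tuned implementation.

```python
import cmath

def dft(x):
    """Direct DFT, used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def fft_pow2(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    N = len(a)
    if N == 1:
        return list(a)
    even, odd = fft_pow2(a[0::2]), fft_pow2(a[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k], out[k + N // 2] = even[k] + t, even[k] - t
    return out

def ifft_pow2(A):
    """Inverse of fft_pow2, obtained by conjugation and scaling."""
    N = len(A)
    return [z.conjugate() / N for z in fft_pow2([z.conjugate() for z in A])]

def bluestein_dft(x):
    """DFT of arbitrary length N evaluated through a convolution."""
    N = len(x)
    chirp = [cmath.exp(1j * cmath.pi * m * m / N) for m in range(N)]  # w_N^(m^2/2)
    L = 1
    while L < 2 * N - 1:                 # length of the larger cyclic convolution
        L *= 2
    a = [x[n] * chirp[n].conjugate() for n in range(N)] + [0j] * (L - N)
    b = [0j] * L
    for m in range(N):                   # chirp at lags m and -m (stored at L - m)
        b[m] = chirp[m]
        b[-m] = chirp[m]
    conv = ifft_pow2([s * t for s, t in zip(fft_pow2(a), fft_pow2(b))])
    return [chirp[k].conjugate() * conv[k] for k in range(N)]
```

The zero padding is needed because the sequence $\omega_N^{m^2/2}$ is not $N$-periodic when $N$ is odd; embedding it in a longer cyclic convolution avoids any wraparound for the lags $-(N-1), \ldots, N-1$ that actually occur.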

The Bluestein approach is actually predated by the quite different convolution method of Rader [113], who observed that if $N = p$ is an odd prime, then there is a number-theoretic mapping that prescribes a row and column permutation of the DFT matrix that puts it in circulant form. Since a circulant matrix may be identified with convolution, well-known fast convolution methods may be applied. The critical mapping in the Rader FFT relies upon the notion of primitive roots of prime numbers, and it can be extended to cases in which $N = p^k$, where $p$ is an odd prime and $k > 1$ is a natural number. The Rader factorization, when combined with fast convolution, has been used to design very efficient small prime-order DFT modules that are often integrated with prime factor FFTs.
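A small sketch may make the Rader mapping concrete. It assumes the forward convention $X_k = \sum_n x_n \omega_p^{-nk}$ and a primitive root $g$ of the odd prime $p$. The nonzero input indices are ordered as powers $g^a$ and the nonzero output indices as powers $g^{-b}$, which turns the nontrivial part of the DFT into a cyclic convolution of length $p - 1$. For clarity the convolution is evaluated directly here; a fast convolution would replace the inner sum. The function names are hypothetical.

```python
import cmath

def dft(x):
    """Direct DFT, used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def primitive_root(p):
    """Smallest primitive root of an odd prime p, found by brute force."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")

def rader_dft(x):
    """DFT of odd prime length p written as a cyclic convolution of length p - 1."""
    p = len(x)
    g = primitive_root(p)
    g_inv = pow(g, p - 2, p)             # inverse of g modulo p
    L = p - 1
    u = [x[pow(g, a, p)] for a in range(L)]                       # permuted input
    v = [cmath.exp(-2j * cmath.pi * pow(g_inv, c, p) / p) for c in range(L)]
    X = [0j] * p
    X[0] = sum(x)                        # the zero-frequency term is a plain sum
    for b in range(L):
        conv = sum(u[a] * v[(b - a) % L] for a in range(L))       # cyclic convolution
        X[pow(g_inv, b, p)] = x[0] + conv
    return X
```

The sequence `v` depends only on $p$, so in a small-$N$ module it (or its transform) can be precomputed once.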

10.6. FFT Performance

The preceding tour of the FFT has been breathless and brief, but perhaps it provides an impression of the richness and breadth of the subject. In this final section, we will discuss FFT performance and address the inevitable question of which is the best FFT. Not surprisingly, the answer to this question is "it depends!" In defending this nonanswer, we will cite several factors that determine FFT performance and ensure that there will always be open problems for FFT users and architects.

1. Symmetries. It was recognized from the start (even in Good's 1958 paper) that if the input sequence has certain symmetries (such as real, real/even, or real/odd), then corresponding savings in storage and computation can be realized in the FFT. As mentioned in Chapter 4, sequences with symmetries can be transformed with savings by pre- and postprocessing algorithms. An alternative approach is to incorporate the symmetry directly into the FFT itself. This idea is generally attributed to Edson [9], who proposed an FFT for real sequences that has roughly half of the computational and storage costs of the complex FFT. The same strategy has been extended via the splitting method to design compact symmetric FFTs for even, odd, quarter-wave even and quarter-wave odd sequences that have one-fourth the computation and storage costs of complex FFTs [18], [141]. Symmetric FFTs have also been devised for prime factor FFTs [109] (an unfinished task). Clearly, the exploitation of symmetries improves FFT performance and leads to new and specialized FFT variations.

2. Multiple transforms. In many applications, FFTs do not come one at a time, but rather in herds. This generally occurs when problems are posed in more than one spatial dimension (for example, filtering of two-dimensional images or solutions of boundary value problems in a three-dimensional region). As we saw in Chapter 4, the DFT of a two-dimensional array of data is done by overwriting each row of the array with its DFT, and then computing the DFT of each resulting column. Clearly, a wave of multiple DFTs must be done in each direction. There is a wealth of literature about how new FFTs can be designed and how existing FFTs can be adapted to economize the computation of multiple FFTs. There are situations in which it is advantageous to convert $M$ sequences of length $N$ into a supersequence of length $MN$ and then perform $N$ steps of a "truncated FFT" on the supersequence [136]. Regardless of how multiple FFTs are done, very often the average performance, measured in computation time per FFT, can be significantly improved with multiple FFTs. Needless to say, computer architecture bears heavily on this issue.

3. Overhead. By overhead we mean less tangible, and often neglected, costs that affect the comparison of FFTs. These include all computational factors beyond the actual execution of the butterfly relations. Examples of overhead are additional storage arrays, computation and storage of powers of $\omega$ (see Van Loan [155] for excellent discussions of different ways to compute powers of $\omega$), permutations and movement of data, and the arithmetic and storage necessitated by index mappings. It is difficult to compare FFTs accurately unless the playing field is leveled with respect to these factors.

4. Hybrid strategies. The pigeon-hole approach of this chapter suggests that different FFTs must never be seen together. In fact, the mixing of FFT methods within larger FFTs can often be very effective and leads to a nearly countless family of hybrid FFTs. One example may suffice. Assume that an FFT is to be applied to a sequence of length $N = 196000 = 2^5 \cdot 5^3 \cdot 7^2$. One approach is to let $N = r_1 r_2 r_3$ where $r_1 = 2^5$, $r_2 = 5^3$, and $r_3 = 7^2$, and call on a mixed-radix Cooley-Tukey FFT. On the other hand, it would be possible to let $r_1 = 5 \cdot 7 \cdot 8$, $r_2 = 4 \cdot 5 \cdot 7$, and $r_3 = 5$ (each $r_i$ consists of mutually prime factors), use prime factor FFTs on each of the transforms of length $r_i$, and then combine them in a mixed-radix Cooley-Tukey FFT. It would be difficult to enumerate, analyze, or compare the various hybrid FFT options for even a single large value of $N$.

5. Architecture. FFTs reside on "hard-wired" microchips inside the navigational systems of spacecraft; they are performed by experimental codes for massively parallel distributed memory computers; they can be called "on-line" in microcomputer/workstation environments; and they can be found in software libraries for highly vectorized supercomputers [7], [59], [87], [136] and parallel computers [18]. More recently, symmetric FFTs have been developed for parallel computers [16], [75]. Therefore, it goes without saying that FFT performance depends on hardware and architecture issues. With the explosive progress of computing technology, there is an attendant need to tailor FFTs to new advanced architectures. Some existing FFTs fit new architectures ideally (for example, the radix-2 Cooley-Tukey FFT on a hypercube architecture [18]); other architectures may demand entirely new FFTs. (Occasionally these "new" FFTs may actually be old FFTs reincarnated; for example, Bluestein's algorithm, which is not efficient on scalar computers, may be the only efficient way of computing certain FFTs on a hypercube [139].) This factor alone ensures that there will never be a single best FFT; the answer depends upon the arena in which the race is run.

6. Software. FFTs abound in software for everything from microcomputers to supercomputers. No single package will conform to all applications and computing environments. Those interested in reliable and widely used mainframe FFT software might begin with FFTPACK [140] and its vectorized version VFFTPACK.
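The arithmetic behind the hybrid-strategy example in item 4 is easy to check. The short sketch below (with hypothetical helper names) confirms that $2^5 \cdot 5^3 \cdot 7^2 = 196000$, that the alternative lengths $5 \cdot 7 \cdot 8$, $4 \cdot 5 \cdot 7$, and $5$ have the same product, and that the factors within each of those lengths are pairwise relatively prime, as a prime factor FFT requires.

```python
from itertools import combinations
from math import gcd

def product(values):
    """Product of a list of integers."""
    result = 1
    for v in values:
        result *= v
    return result

def pairwise_coprime(factors):
    """True if every pair of factors has greatest common divisor 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(factors, 2))

# Mixed-radix Cooley-Tukey choice: N = r1 * r2 * r3 with prime-power lengths.
mixed_radix = [2**5, 5**3, 7**2]
N = product(mixed_radix)

# Hybrid choice: each length is itself a product of mutually prime factors.
hybrid = [[5, 7, 8], [4, 5, 7], [5]]
hybrid_lengths = [product(group) for group in hybrid]   # 280, 140, 5
```

Note that the lengths 280, 140, and 5 are not relatively prime to one another, which is why they are combined by a mixed-radix Cooley-Tukey step rather than by a further prime factor step.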

10.7. Notes

Those readers interested in the historical development of the FFT should locate a 1967 volume (AU-15) of the IEEE Transactions on Audio and Electroacoustics, which is a collection of early papers on the recently revealed FFT. Volume AU-17 of the same journal (1969) contains the proceedings of the Arden House Workshop on FFT Processing, which also is of historical interest [39]. Extensive FFT bibliographies can be found in Brigham [21], Van Loan [155], and Heideman and Burrus [73].

10.8. Problems

232. Crucial properties of $\omega_N$. Letting $\omega_N = e^{i2\pi/N}$ for any natural number $N$, prove that

233. Radix-3 FFT. With $N = 3^M$, derive the butterfly relations for the Cooley-Tukey decimation-in-time FFT using the splitting method.

234. Radix-4 FFT. With $N = 4^M$, derive the butterfly relations for the Cooley-Tukey decimation-in-time FFT using the splitting method.

235. Real operation counts. Use the results of the chapter and the previous two problems to find the real operation counts for $N$-point decimation-in-time radix-2, radix-3, and radix-4 FFTs. Assume that a complex multiplication requires two real additions and four real multiplications. Assume that a radix-$r$ FFT consists of $\log_r N$ stages, each of which has $N/r$ butterflies. Neglect all multiplications by $\pm 1$ and avoid all duplicate multiplications. Express all operation counts in the form $CN \log_2 N$, where terms proportional to $N$ are neglected. Verify that the entries in Table 10.1 give the appropriate values of $C$.

TABLE 10.1
Real operation counts for radix-2, 3, 4 FFTs of length $N$; values of $C$ in $CN \log_2 N$.

            N = 2^M    N = 3^M                     N = 4^M
  Real +    3.00       4 log_3 2 ≈ 2.52            11/4 = 2.75
  Real ×    2.00       (16/3) log_3 2 ≈ 3.26       3/2 = 1.5
  Total     5.00       5.78                        4.25

Comment on the relative costs of these FFTs. How would you design an FFT for a sequence of length $N = 256$?

236. Another radix-3 FFT. Use the splitting method to find the butterfly relations for the Gentleman-Sande decimation-in-frequency FFT for the case $N = 3^M$.

237. Inverse FFTs. Here is a way to devise inverse FFTs. Begin with the decimation-in-frequency butterfly relations (10.8) and (10.9) in the form

for $k = 0 : N/2 - 1$, where $\mathcal{D}_p$ represents the DFT of length $p$. Equivalently, we have

for $n = 0 : N/2 - 1$, where $\mathcal{D}_p^{-1}$ represents the inverse DFT of length $p$. Now formally invert the second pair of relations (solve for $x_n$ and $x_{n+N/2}$) and show that the result is the decimation-in-time butterfly relations (10.3) and (10.4) with $\omega$ replaced by $\omega^{-1}$ and with a scaling factor of 1/2. Argue that these relations describe how to combine two IDFTs of length $N/2$ to form an IDFT of length $N$.

238. Another inverse FFT. Beginning with the butterfly relations for the decimation-in-time FFT (10.3) and (10.4), follow the procedure of the preceding problem to obtain a decimation-in-frequency inverse FFT.

239. The Gentleman-Sande FFT by index expansions. Follow the index expansion method outlined in the text with $p = N/2$ and $q = 2$ to derive the Gentleman-Sande butterfly relations (10.8) and (10.9) for $N = 2^M$. The outcome depends on the ordering of the two nested sums.

240. An $N = 16$ FFT by index expansions. Letting $N = 4 \cdot 4$, express the 16-point DFT as two nested 4-point DFTs. Arrange the two nested sums so that both sets of 4-point DFTs require no multiplications.

241. Matrix factorization. Consider the $N = 16$ decimation-in-time FFT. Show that the butterfly relations for the first splitting can be expressed in the form

where $I_m$ is the identity matrix of order $m$ and $\Omega_m$ is the ($m \times m$) diagonal matrix

242. An $N = 6$ FFT by index expansions. Letting $N = 3 \cdot 2$, use the index expansion method to derive an FFT for $N = 6$. Note that depending upon how the two nested sums are arranged, it is possible to produce either a decimation-in-time or decimation-in-frequency FFT.

243. An $N = 6$ prime factor FFT. In the previous problem, you most likely encountered "twiddle factors" (extra multiplications by powers of $\omega$). These terms can be eliminated by a careful reordering of the input and output sequences. Consider the case in which $N = pq$, $p = 3$, and $q = 2$ and proceed as follows.

(a) Let $(m)_p$ (said "$m$ mod $p$") denote the remainder in dividing $m$ by $p$. Check that the Chinese Remainder Theorem map $\mathcal{C}$ and the Ruritanian map $\mathcal{R}$ given by

when applied to the terms of the sequence $s = (0, 1, 2, 3, 4, 5)^T$ produce the following reorderings of $s$:

respectively. (Note that row indices vary from $0 \to 2$ and column indices vary from $0 \to 1$.)

(b) Now reorder the input sequence $x$ so it has the form $R^T x$ and reorder the output sequence $X$ so it has the form $C^T X$. Verify that by taking three-point DFTs of the columns of $R^T x$ (overwriting the original array) and then taking two-point DFTs of the rows of the resulting array, the transform coefficients $X_k$ appear in the array $C^T X$.

(c) Why does it work? Note that $\mathcal{C}$ and $\mathcal{R}$ are one-to-one and onto mappings. Verify that the mappings

are the inverses of $\mathcal{C}$ and $\mathcal{R}$.

(d) Write out the definition of the six-point DFT as a pair of nested sums, replacing the $x_n$ by $x(\mathcal{R}(n))$, $X_k$ by $X(\mathcal{C}(k))$, $n$ by $n$, and $k$ by $k$. Verify that the "twiddle factors" do not appear, and that the six-point DFT becomes a genuine two-dimensional ($3 \times 2$) DFT.

244. An $N = 15$ prime factor FFT. Carry out the procedure of the preceding problem in the case $N = pq$, $p = 3$, and $q = 5$. Show that the maps

produce the orderings $C^T X$ and $R^T x$ given in Section 10.5. Show that when the inverse maps

are used in the definition of the DFT, a genuine two-dimensional ($3 \times 5$) DFT results.

245. Matrix factorization for radix-3. Show that the Cooley-Tukey decimation-in-time FFT for $N = 3^4 = 81$ can be expressed as

In analogy with the radix-2 case, find the block entries of the butterfly matrices $B_m$ in terms of the diagonal matrices $\Omega_m$ and the identity matrices $I_m$.

Appendix

Table of DFTs

In the following table we have collected several analytical DFTs. A few words of explanation are needed. Each DFT entry is arranged as follows.

The first column has two boxes. The upper box gives the name of the input, below which are graphs of the real and imaginary parts of the discrete input sequence. The lower box contains the name of the continuum input, and the corresponding continuum input graphs. The middle column has six boxes containing, in order from top to bottom, the formula of the input sequence $f_n$; the analytical $N$-point DFT output $F_k$; a measure of the difference $|c_k - F_k|$, where $c_k$ is the $k$th Fourier coefficient; the formula of the continuum input function $f(x)$; the formula for $c_k$; an entry for comments, perhaps the most important of which is the AVED warning. This means that average values at endpoints and discontinuities must be used if the correct DFT is to be computed. The third column consists of two boxes. The upper box displays graphically the real and imaginary parts of the DFT. The lower box gives the maximum error $\max |c_k - F_k|$, and displays graphically the error $|c_k - F_k|$ for a small (24-point) example.

Unless otherwise noted, the function is assumed to be sampled on the interval $[-A/2, A/2]$. The difference $|c_k - F_k|$ is generally in the form $CN^{-p}$ for some constant $C$ and some positive integer $p$, which should be interpreted in an asymptotic sense for $N \to \infty$; in other words, if $|F_k - c_k| = CN^{-p}$, then $\lim_{N \to \infty} |F_k - c_k| N^{p} = C$. While this measure is different from the pointwise errors that were derived in Chapter 6, it does agree with those estimates in its dependence on $N$.


[Layout of a table entry, with panels: discrete input name; graph of $f_n$; graph of $f(x)$; graph of $F_k$; graph of $|c_k - F_k|$; $\max |c_k - F_k|$; comments.]


The following notational conventions hold throughout the Table of DFTs:

AVED = average values at endpoints and discontinuities,
$C$ is a constant independent of $k$ and $N$.

TABLE OF DFTs

[Only the entry names survive; the formulas, graphs, and error columns of the table are not reproduced. Each entry gives the discrete input name, with the continuum input name in parentheses.]

1. Impulse (continuum input: none)
2a. Paired impulses (continuum input: none)
2b. Paired impulses (continuum input: none)
3. Complex harmonic (continuum input: complex harmonic)
3a. Constant (continuum input: constant)
4a. Cosine harmonic (continuum input: cosine harmonic)
4b. Critical mode (continuum input: critical mode)
4c. Sine harmonic (continuum input: sine harmonic)
5. Complex wave (continuum input: complex wave)
6a. Cosine (continuum input: cosine)
6b. Half cosine (continuum input: half cosine)
6c. Sine (continuum input: sine)
6d. Even sine (continuum input: even sine)
7. Linear (continuum input: linear)
8. Triangular wave (continuum input: triangular wave)
9. Rectangular wave (continuum input: rectangular wave)
10. Square pulse (continuum input: square pulse)
10a. Square pulse (continuum input: square pulse)
11. Exponential (continuum input: exponential)
12. Even exponential (continuum input: even exponential)
13. Odd exponential (continuum input: odd exponential)
14. Linear/exponential (continuum input: linear/exponential)
15a. Cosine/exponential (continuum input: cosine/exponential)
15b. Sine/exponential (continuum input: sine/exponential)

Bibliography

[1] F. ABRAMOVICI, The accurate calculation of Fourier integrals by the fast Fourier transform technique, J. Comp. Phys., 11 (1973), pp. 28-37.

[2] F. ABRAMOVICI, Letter: The accuracy of finite Fourier transforms, J. Comp. Phys., 17 (1975), pp. 446-449.

[3] M. ABRAMOWITZ AND I. STEGUN, Handbook of Mathematical Functions, Dover, New York, 1972.

[4] J. ARSAC, Fourier Transforms and the Theory of Distributions, Prentice-Hall, Englewood Cliffs, NJ, 1966.

[5] L. AUSLANDER AND M. SCHENEFELT, Fourier transforms that respect crystallographic symmetries, IBM J. Res. and Dev., 31 (1987), pp. 213-223.

[6] L. AUSLANDER AND R. TOLIMIERI, Is computing with the finite Fourier transform pure or applied mathematics?, Bull. Amer. Math. Soc., 1 (1979), pp. 847-897.

[7] D. BAILEY, A high performance FFT algorithm for vector supercomputers, Internat. J. Supercomputer Applications, 2 (1988), pp. 82-87.

[8] D. BAILEY AND P. SWARZTRAUBER, A fast method for the numerical evaluation of continuous Fourier and Laplace transforms, SIAM J. Sci. Comput., 15 (1994), pp. 1105-1110.

[9] G. BERGLAND, A fast Fourier transform algorithm for real-valued series, Comm. ACM, 11 (1968), pp. 703-710.

[10] G. BEYLKIN AND M. BREWSTER, Fast numerical algorithms using wavelet bases on the interval, 1993, in progress.

[11] L. BLUESTEIN, A linear filtering approach to the computation of the discrete Fourier transform, IEEE Trans. Audio and Electroacoustics, AU-18 (1970), pp. 451-455.

[12] J. BOYD, Chebyshev and Fourier Spectral Methods, Springer-Verlag, Berlin, 1989.

[13] R. BRACEWELL, The Fourier Transform and Its Applications, McGraw-Hill, New York, 1978.

[14] R. BRACEWELL, The Hartley Transform, Oxford University Press, U.K., 1986.


[15] R. BRACEWELL, Assessing the Hartley transform, IEEE Trans. Acoust. Speech Signal Process., ASSP-38 (1990), pp. 2174-2176.

[16] B. BRADFORD, Fast Fourier Transforms for Direct Solution of Poisson's Equation, PhD thesis, University of Colorado at Denver, 1991.

[17] A. BRASS AND G. PAWLEY, Two and three dimensional FFTs on highly parallel computers, Parallel Computing, 3 (1986), pp. 167-184.

[18] W. BRIGGS, L. HART, R. SWEET, AND A. O'GALLAGHER, Multiprocessor FFT methods, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 27-42.

[19] W. L. BRIGGS, Further symmetries of in-place FFTs, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 644-655.

[20] E. O. BRIGHAM, The Fast Fourier Transform, Prentice-Hall, Englewood Cliffs, NJ, 1974.

[21] E. O. BRIGHAM, The Fast Fourier Transform and Its Applications, Prentice-Hall, Englewood Cliffs, NJ, 1988.

[22] R. BULIRSCH AND J. STOER, Introduction to Numerical Analysis, Springer-Verlag, Berlin, 1980.

[23] O. BUNEMAN, Conversion of FFT to fast Hartley transforms, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 624-638.

[24] O. BUNEMAN, Multidimensional Hartley transforms, Proc. IEEE, 75 (1987), p. 267.

[25] H. BURKHARDT, Encyklopädie der mathematischen Wissenschaften, 1899-1916.

[26] C. BURRUS, A new prime factor FFT algorithm, in Proc. 1981 IEEE ICASSP, 1981, pp. 335-338.

[27] C. BURRUS AND P. ESCHENBACHER, An in-place in-order prime factor FFT algorithm, IEEE Trans. Acoust. Speech Signal Process., 29 (1981), pp. 806-817.

[28] C. BURRUS AND T. PARKS, DFT/FFT and Convolution Algorithms, John Wiley, New York, 1985.

[29] B. BUZBEE, G. GOLUB, AND C. NIELSON, On direct methods for solving Poisson's equation, SIAM J. Numer. Anal., 7 (1971), pp. 627-656.

[30] H. S. CARSLAW, Introduction to the Theory of Fourier's Series and Integrals, Third Ed., Dover, New York, 1930.

[31] B. CHAR, MAPLE V Library Reference Manual, Springer-Verlag, Berlin, 1991.

[32] T. CHIHARA, An Introduction to Orthogonal Polynomials, Gordon and Breach, New York, 1975.

[33] C. CHU, The Fast Fourier Transform on Hypercube Parallel Computers, PhD thesis, Cornell University, Ithaca, NY, 1988.

[34] R. V. CHURCHILL, Fourier Series and Boundary Value Problems, McGraw-Hill, New York, 1941.

[35] B. CIPRA, The FFT: Making technology fly, SIAM News, May 1993.

[36] V. CIZEK, Discrete Fourier Transforms and Their Applications, Adam Hilger, Bristol, England, 1986.

[37] R. CLARKE, Transform Coding of Images, Academic Press, New York, 1985.

[38] W. COCHRANE, What is the fast Fourier transform?, IEEE Trans. Audio and Electroacoustics, AU-15 (1967), pp. 45-55.

[39] J. COOLEY, R. GARWIN, C. RADER, B. BOGERT, AND T. STOCKHAM, The 1968 Arden House workshop on fast Fourier transform processing, IEEE Trans. Audio and Electroacoustics, AU-17 (1969), pp. 66-76.

[40] J. COOLEY, P. LEWIS, AND P. WELCH, Historical notes on the fast Fourier transform, IEEE Trans. Audio and Electroacoustics, AU-15 (1967), pp. 76-79.

[41] J. COOLEY, P. LEWIS, AND P. WELCH, The fast Fourier transform algorithm: Programming considerations in the calculation of sine, cosine and Laplace transforms, J. Sound Vibration, 12 (1970), pp. 315-337.

[42] J. COOLEY AND J. TUKEY, An algorithm for the machine calculation of complex Fourier series, Math. Comp., 19 (1965), pp. 297-301.

[43] R. COURANT AND D. HILBERT, Methods of Mathematical Physics, Vol. 1, Interscience Publishers, New York, 1937. Original in German in 1924.

[44] G. DANIELSON AND C. LANCZOS, Some improvements in practical Fourier analysis and their application to x-ray scattering from liquids, J. Franklin Inst., 233 (1942), pp. 365-380, 435-452.

[45] P. DAVIS AND P. RABINOWITZ, Numerical Integration, Blaisdell, New York, 1967.

[46] C. DE BOOR, FFT as nested multiplication with a twist, SIAM J. Sci. Stat. Comput., 1 (1980), pp. 173-178.

[47] S. R. DEANS, The Radon Transform and Some of Its Applications, John Wiley, New York, 1983.

[48] M. B. DOBRIN, Introduction to Geophysical Prospecting, Third ed., McGraw-Hill, New York, 1976.

[49] J. DOLLIMORE, Some algorithms for use with the fast Fourier transform, J. Inst. Math. Appl., 12 (1973), pp. 115-117.

[50] H. DUBNER AND J. ABATE, Numerical inversion of Laplace transforms, J. Assoc. Comput. Mach., 15 (1968), p. 115.

[51] P. DUHAMEL, Implementation of the split radix FFT algorithms for complex, real, and real-symmetric data, IEEE Trans. Acoust. Speech Signal Process., ASSP-34 (1986), pp. 285-295.

[52] P. DUHAMEL AND H. HOLLMANN, Split radix FFT algorithms, Electron. Lett., 20 (1984), pp. 14-16.

[53] P. DUHAMEL AND M. VETTERLI, Improved Fourier and Hartley algorithms: Application to cyclic convolution of real data, IEEE Trans. Acoust. Speech Signal Process., ASSP-35 (1987), pp. 818-824.

[54] A. DUTT AND V. ROKHLIN, Fast Fourier transforms for nonequispaced data, SIAM J. Sci. Comput., 14 (1993), pp. 1368-1393.

[55] H. DYM AND H. McKEAN, Fourier Series and Integrals, Academic Press, Orlando, FL, 1972.

[56] B. EINARSSON, Letter: Use of Richardson extrapolation for the numerical calculation of Fourier transforms, J. Comp. Phys., 21 (1976), pp. 365-370.

[57] D. F. ELLIOT AND K. R. RAO, Fast Transforms, Algorithms, Analysis and Applications, Academic Press, Orlando, FL, 1982.

[58] L. FILON, On a quadrature formula for trigonometric integrals, Proc. Royal Soc. Edinburgh, 49 (1928), pp. 38-47.

[59] B. FORNBERG, A vector implementation of the fast Fourier transform, Math. Comp., 36 (1981), pp. 189-191.

[60] J. FOURIER, Œuvres de Fourier: Théorie analytique de la chaleur, Vol. 1, Gauthier-Villars et Fils, Paris, 1888.

[61] C. GAUSS, Theoria interpolationis methodo nova tractata. Carl Friedrich Gauss Werke, Band 3, Königlichen Gesellschaft der Wissenschaften, Göttingen, 1866.

[62] I. M. GEL'FAND, G. E. SHILOV, AND N. Y. VILENKIN, Generalized Functions, Vol. 5, Academic Press, New York, 1966.

[63] W. GENTLEMAN, Matrix multiplication and fast Fourier transforms, Bell Systems Tech. J., 47 (1968), pp. 1099-1103.

[64] W. GENTLEMAN AND G. SANDE, Fast Fourier transforms for fun and profit, Proc. 1966 Fall Joint Computer Conf. AFIPS, 29 (1966), pp. 563-578.

[65] W. M. GENTLEMAN, Implementing Clenshaw-Curtis quadrature II: Computing the cosine transform, Comm. ACM, 15 (1972), pp. 343-346.

[66] J. GLASSMAN, A generalization of the fast Fourier transform, IEEE Trans. Comp., C-19 (1970), pp. 105-116.

[67] H. H. GOLDSTINE, A History of Numerical Analysis from the 16th through the 19th Century, Springer-Verlag, Berlin, 1977.

[68] I. GOOD, The interaction algorithm and practical Fourier analysis, J. Royal Stat. Soc. Series B, 20 (1958), pp. 361-372.

[69] D. GOTTLIEB AND S. A. ORSZAG, Numerical Analysis of Spectral Methods, Society for Industrial and Applied Mathematics, Philadelphia, 1977.

[70] R. HAMMING, Digital Filters, Second ed., Prentice-Hall, Englewood Cliffs, NJ, 1983.

[71] H. HAO AND R. BRACEWELL, A three-dimensional DFT algorithm using the fast Hartley transform, Proc. IEEE, 75 (1987), p. 264.

[72] R. HARTLEY, A more symmetrical Fourier analysis applied to transmission problems, Proc. Inst. Radio Engrg., 30 (1942), pp. 144-150.

[73] M. HEIDEMAN AND C. BURRUS, A bibliography of fast transform and convolution algorithms, Department of Electrical Engineering Technical Report 8402, Rice University, Houston, TX, 1984.

[74] M. HEIDEMAN, D. JOHNSON, AND C. BURRUS, Gauss and the history of the fast Fourier transform, Arch. Hist. Exact Sciences, 34 (1985), pp. 265-277.

[75] V. E. HENSON, Parallel compact symmetric FFTs, in Vector and Parallel Computing: Issues in Applied Research and Development, J. Dongarra, I. Duff, and S. M. Patrick Gaffney, eds., Chichester, England, 1989, Ellis Horwood.

[76] V. E. HENSON, Fourier Methods of Image Reconstruction, PhD thesis, University of Colorado at Denver, 1990.

[77] V. E. HENSON, DFTs on irregular grids: The anterpolated DFT. Technical Report NPS-MA-92-006, Naval Postgraduate School, Monterey, CA, 1992.

[78] G. T. HERMAN, Image Reconstruction from Projections, Academic Press, Orlando, FL, 1980.

[79] R. HOCKNEY, A fast direct solution of Poisson's equation using Fourier analysis, J. Assoc. Comput. Mach., 12 (1965), pp. 95-113.

[80] E. ISAACSON AND H. KELLER, Analysis of Numerical Methods, John Wiley, New York, 1966.

[81] D. JACKSON, Fourier Series and Orthogonal Polynomials, Mathematical Association of America, Washington, D.C., 1941.

[82] A. JERRI, The Shannon sampling theorem: Its various extensions and applications: A tutorial review, Proc. IEEE, 65 (1977), pp. 1565-1596.

[83] A. JERRI, An extended Poisson sum formula for the generalized integral transforms and aliasing error bound for the sampling theorem, J. Appl. Anal., 26 (1988), pp. 199-221.

[84] A. JERRI, Integral and Discrete Transforms with Applications and Error Analysis, Marcel Dekker, New York, 1992.

[85] H. JOHNSON AND C. BURRUS, An in-place in-order radix-2 FFT, Proc. IEEE ICASSP, San Diego, 1984, p. 28A.2.

[86] D. KOLBA AND T. PARKS, A prime factor FFT algorithm using high speed convolution, IEEE Trans. Acoust. Speech Signal Process., ASSP-25 (1977), pp. 281-294.

[87] D. KORN AND J. LAMBIOTTE, Computing the fast Fourier transform on a vector computer, Math. Comp., 33 (1979), pp. 977-992.

[88] T. KÖRNER, Fourier Analysis, Cambridge University Press, U.K., 1988.

[89] H. KREISS AND J. OLIGER, Stability of the Fourier method, SIAM J. Numer. Anal., 16 (1979), pp. 421-433.

[90] J. LAGRANGE, Recherches sur la nature et la propagation du son. Miscellanea Taurinensia (Mélanges de Turin), Vol. I, Nos. I-X, 1759, pp. 1-112. Reprinted in Oeuvres de Lagrange, Vol. I, J. A. Serret, ed., Paris, 1876, pp. 39-148.

[91] C. LANCZOS, Applied Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1956.

[92] R. M. LEWITT, Reconstruction algorithms: Transform methods, Proc. IEEE, 71 (1983), pp. 390-408.

[93] M. LIGHTHILL, Introduction to Fourier Analysis and Generalised Functions, Cambridge University Press, U.K., 1958.

[94] E. LINFOOT, A sufficiency condition for Poisson's formula, J. London Math. Soc., 4 (1928), pp. 54-61.

[95] J. LYNESS, Some quadrature rules for finite trigonometric and related integrals, in Numerical Integration: Recent Developments and Software Applications, P. Keast and G. Fairweather, eds., D. Reidel, Dordrecht, the Netherlands, 1987, pp. 17-34.

[96] MATHCAD User's Guide, MathSoft, Cambridge, MA, 1991.

[97] MATLAB Reference Guide, The MathWorks, Natick, MA, 1992.

[98] J. McCLELLAN AND C. RADER, Number Theory in Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1979.

[99] J. H. McCLELLAN AND T. W. PARKS, Eigenvalue and eigenvector decomposition of the discrete Fourier transform, IEEE Trans. Audio and Electroacoustics, 20 (1972), pp. 66-74.

[100] H. MECKELBURG AND D. LIPKA, Fast Hartley transform algorithm, Electron. Lett., 21 (1985), pp. 341-343.

[101] L. MORDELL, Poisson's summation formula and the Riemann zeta function, J. London Math. Soc., 4 (1928), pp. 285-291.

[102] F. NATTERER, Fourier reconstruction in tomography, Numer. Math., 47 (1985), pp. 343-353.

[103] F. NATTERER, Efficient evaluation of oversampled functions, J. Comp. Appl. Math., 14 (1986), pp. 303-309.

[104] F. NATTERER, The Mathematics of Computerized Tomography, John Wiley, New York, 1986.

[105] K.-C. NG, Letter: On the accuracy of numerical Fourier transforms, J. Comp. Phys., 16 (1973), pp. 396-400.

[106] A. NIKIFOROV, S. SUSLOV, AND V. UVAROV, Classical Orthogonal Polynomials of a Discrete Variable, Springer-Verlag, Berlin, 1991.

[107] H. NUSSBAUMER, FFT and Convolution Algorithms, Springer-Verlag, Berlin, 1982.

[108] A. OPPENHEIMER AND R. SCHAFER, Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1975.

[109] J. OTTO, Symmetric prime factor fast Fourier transform algorithms, SIAM J. Sci. Stat. Comput., 10 (1989), pp. 419-431.

[110] R. PALEY AND N. WIENER, Fourier Transforms in the Complex Plane, American Mathematical Society, Providence, RI, 1934.

[111] A. PAPOULIS, The Fourier Integral and Its Applications, McGraw-Hill, New York, 1962.

[112] M. PEASE, An adaptation of the fast Fourier transform for parallel processing, J. Assoc. Comput. Mach., 15 (1968), pp. 252-264.

[113] C. RADER, Discrete Fourier transforms when the number of data samples is prime, Proc. IEEE, 5 (1968), pp. 1107-1108.

[114] J. RADON, Über die bestimmung von funktionen durch ihre integralwerte längs gewisser mannigfaltigkeiten, Berichte Sächsische Akademie der Wissenschaften, Leipzig, Math.-Phys., Kl., 69 (1917), pp. 262-267.

[115] G. RAISBECK, The order of magnitude of the Fourier coefficients in functions having isolated singularities, Amer. Math. Monthly, (1955), pp. 149-154.

[116] K. RAO AND P. YIP, The Discrete Cosine Transform: Algorithms, Advantages and Applications, Academic Press, Orlando, FL, 1990.

[117] L. RICHARDSON, The deferred approach to the limit, Phil. Trans. Royal Soc., 226 (1927), p. 300.

[118] T. RIVLIN, Chebyshev Polynomials, John Wiley, New York, 1990.

[119] V. ROKHLIN, A fast algorithm for the discrete Laplace transformation, Research Report YALEU/DCS/RR-509, Yale University, New Haven, CT, January 1987.

[120] W. ROMBERG, Vereinfachte numerische integration, Norske Vid. Selsk. Forh. Trondheim, 28 (1955), pp. 30-36.

[121] P. RUDNICK, Note on the calculation of Fourier series, Math. Comp., 20 (1966), pp. 429-430.

[122] C. RUNGE, Über die zerlegung einer empirischen funktion in sinuswellen, Z. Math. Phys., 52 (1905), pp. 117-123.

[123] R. SAATCILAR, S. ERGINTAV, AND N. CANITEZ, The use of the Hartley transform in geophysical applications, Geophysics, 55 (1990), pp. 1488-1495.

[124] C. E. SHANNON, Communication in the presence of noise, Proc. IRE, 37 (1949), pp. 10-21.

[125] L. A. SHEPP AND B. F. LOGAN, The Fourier reconstruction of a head section, IEEE Trans. Nucl. Sci., NS-21 (1974), pp. 21-43.

[126] H. SORENSEN, M. HEIDEMAN, AND C. BURRUS, On calculating the split-radix FFT, IEEE Trans. Acoust. Speech Signal Process., ASSP-34 (1986), pp. 152-156.

[127] H. SORENSEN, D. JONES, C. BURRUS, AND M. HEIDEMAN, On computing the discrete Hartley transform, IEEE Trans. Acoust. Speech Signal Process., ASSP-33 (1985), pp. 1231-1238.

[128] I. STAKGOLD, Green's functions and boundary value problems, John Wiley, New York, 1979.

[129] H. STARK, J. WOODS, I. PAUL, AND R. HINGORANI, Direct Fourier reconstruction in computer tomography, IEEE Trans. Acoust. Speech Signal Process., ASSP-29 (1981), pp. 237-244.

[130] H. STARK, J. WOODS, I. PAUL, AND R. HINGORANI, An investigation of computerized tomography by direct Fourier reconstruction and optimum interpolation, IEEE Trans. Biomedical Engrg., BME-28 (1981), pp. 496-505.

[131] F. STENGER, Numerical methods based on sinc and analytic functions, Springer-Verlag, New York, 1993.

[132] D. STONE AND G. CLARKE, In situ measurements of basal water quality as an indicator of the character of subglacial drainage systems, Hydrological Processes, (1994), to appear.

[133] P. SWARZTRAUBER, A direct method for the discrete solution of separable elliptic equations, SIAM J. Numer. Anal., 11 (1974), pp. 1136-1150.

[134] P. SWARZTRAUBER, The methods of cyclic reduction, Fourier analysis, and cyclic reduction Fourier analysis for the discrete solution of Poisson's equation on a rectangle, SIAM Review, 19 (1977), pp. 490-501.

[135] P. SWARZTRAUBER, Algorithm 541: Efficient fortran subprograms for the solution of separable elliptic partial differential equations, ACM Trans. Math. Software, 5 (1979), pp. 352-364.

[136] P. SWARZTRAUBER, Vectorizing the FFTs, in Parallel Computations, G. Rodrigue, ed., New York, 1982, Academic Press.

[137] P. SWARZTRAUBER, Fast Poisson solvers, in Studies in Numerical Analysis, G. Golub, ed., Washington, D.C., 1984, Mathematical Association of America.

[138] P. SWARZTRAUBER, Multiprocessor FFTs, Parallel Comput., 5 (1987), pp. 197-210.

[139] P. SWARZTRAUBER, R. A. SWEET, W. L. BRIGGS, V. E. HENSON, AND J. OTTO, Bluestein's FFT for arbitrary n on the hypercube, Parallel Comput., 17 (1991), pp. 607-617.

[140] P. N. SWARZTRAUBER, FFTPACK, a package of fortran subprograms for the fast Fourier transform of periodic and other symmetric sequences, 1985. Available from NETLIB. Send email to [email protected].

[141] P. N. SWARZTRAUBER, Symmetric FFTs, Math. Comp., 47 (1986), pp. 323-346.

[142] R. SWEET, Direct methods for the solution of Poisson's equation on a staggered grid, J. Comp. Phys., 12 (1973), pp. 422-428.

[143] R. SWEET, Crayfishpak: A vectorized fortran package to solve Helmholtz equations, in Recent Developments in Numerical Methods and Software for ODE/ADE/PDEs, G. Byrne and W. Schiesser, eds., World Scientific, Singapore, 1992, pp. 37-54.

[144] R. SWEET AND U. SCHUMANN, Fast Fourier transforms for direct solution of Poisson's equation with staggered boundary conditions, J. Comp. Phys., 75 (1988), pp. 123-137.

[145] G. SZEGÖ, Orthogonal Polynomials, Fourth ed., American Mathematical Society, Providence, RI, 1975.

[146] W. M. TELFORD, L. P. GELDART, R. E. SHERIFF, AND D. A. KEYS, Applied Geophysics, Cambridge University Press, U.K., 1976.

[147] C. TEMPERTON, A note on prime factor FFT algorithms, J. Comp. Phys., 52 (1983), pp. 198-204.

[148] C. TEMPERTON, Self-sorting mixed radix fast Fourier transforms, J. Comp. Phys., 52 (1983), pp. 1-23.

[149] C. TEMPERTON, Implementation of a self-sorting in-place prime factor FFT algorithm, J. Comp. Phys., 58 (1985), pp. 283-299.

[150] C. TEMPERTON, Self-sorting in-place fast Fourier transforms, SIAM J. Sci. Comput., 12 (1991), pp. 808-823.

[151] F. THEILHEIMER, A matrix version of the fast Fourier transform, IEEE Trans. Audio and Electroacoustics, AU-17 (1969), pp. 158-161.

[152] L. THOMAS, Using a computer to solve problems in physics, in Applications of Digital Computers, Ginn, Boston, MA, 1963.

[153] G. P. TOLSTOV, Fourier Series, Prentice-Hall, Englewood Cliffs, NJ, 1962. Reprinted by Dover, New York, 1976.

[154] C. TRUESDELL, The Tragicomical History of Thermodynamics: 1822-1854, Springer-Verlag, Berlin, 1980.

[155] C. VAN LOAN, Computational Frameworks for the Fast Fourier Transform, Society for Industrial and Applied Mathematics, Philadelphia, 1992.

[156] E. B. VAN VLECK, The influence of Fourier series upon the development of mathematics, Science, 39 (1914), pp. 113-124.

[157] M. VETTERLI AND P. DUHAMEL, Split radix algorithms for length $p^m$ DFTs, IEEE Trans. Acoust. Speech Signal Process., ASSP-34 (1989), pp. 57-64.

[158] J. S. WALKER, Fourier Analysis, Oxford University Press, New York, 1988.

[159] J. S. WALKER, Fast Fourier Transforms, CRC Press, Boca Raton, FL, 1991.

[160] P. WALKER, The Theory of Fourier Series and Integrals, John Wiley, New York, 1986.

[161] W. WEEKS, Numerical inversion of Laplace transforms, J. Assoc. Comput. Mach., 13 (1966), p. 419.

[162] E. WHITTAKER AND G. ROBINSON, The Calculus of Observation, Blackie and Sons, London, 1924.

[163] O. WING, An efficient method of numerical inversion of Laplace transforms, Arch. Elect. Comp., 2 (1967), p. 153.

[164] S. WINOGRAD, On computing the discrete Fourier transform, Math. Comp., 32 (1978), pp. 175-199.

[165] S. WOLFRAM, Mathematica: A System for Doing Mathematics by Computer, Addison-Wesley, Boston, 1988.

[166] D. M. YOUNG AND R. T. GREGORY, A Survey of Numerical Mathematics, Addison-Wesley, Boston, 1972.

Index

Absolute integrability, 17, 87, 206
Aliasing, 95-98, 185
    a beautiful instance, 224
    and the Nyquist frequency, 97
    and the stagecoach wheel, 95
    and trapezoid rule error, 365
    anti-aliasing recipe, 159
    by extending spatial domain, 203
    defined, 95
    DFT interpolation error, 224
    dip aliasing of depth section, 284
    for $f$ of compact support, 197
    in image reconstruction, 297
    in spatial domain, 217
    in two dimensions, 156-159
    of periodic, non-band-limited input, 185
Amplitude distortion filter, 265
Amplitude spectrum, 264
    of distortionless filter, 265
    of low-pass filter, 266
Amplitude-phase form
    continuous, 263
    discrete, 264
Analysis
    of $f$ into its modes, 18
    time series, 6
Analytical DFTs, 100-109, see also Table of DFTs
    compared to $c_k$ and $\hat{f}(\omega_k)$, 104
    how can we check accuracy?, 102
    in two dimensions, 150-152
    like reading Euclid in Greek, 100
Analytical methods, 100
Angles
    incidence, reflection, refraction, 274
Anticline, 276
Approximation
    by DFT interpolation, 222-226
    by trigonometric polynomial, 41-44
    Fourier transform
        improved using replication, 203
    least squares
        minimized by Chebyshev series, 326
    least squares polynomial, 44
    of Fourier coefficients, 33-40
        with periodic, band-limited input, 181-184
        with periodic, non-band-limited input, 184-192
    of Fourier integrals
        by higher-order quadrature, 370-375
    of Fourier transform, 17-23
        with general band-limited input, 200-206
        with general input, 206-211
        with periodic, compactly supported input, 197-200
    of orthogonal expansion coefficients, 334
Array
    data, 144
    doubly periodic, 145
    separable, 145
Artifacts, 298
Asymmetric exponentials
    case study of DFT errors, 209-211
Attenuation coefficient, 286
AVED
    and convergence of Fourier series, 94
    and the trapezoid rule, 360
    averaging at endpoints, 93-95
    defined, 95
    in DFT image reconstruction, 293
    in DFT interpolation, 222
    in digital filtering, 262
    in FK migration, 283
    in inverting the Laplace transform, 311
    in low-pass filter design, 267
    two-dimensional arrays, 145
    used before it's named, 38
    used in analytical DFT, 104, 107
Average
    at endpoints, see AVED, 95
    diffusion modeled as temperature average, 242
    weighted average is a filter, 81, 260
Average value test, 103

Band-limited input, 57, 96, 182
    approximating Fourier coefficients of, 181-184
    case study of DFT errors, 205-206
    essentially, 97
    example: DFT of a sine, 205
    general or nonperiodic, 200
    IDFT error is sampling error, 217
    right side of summation formula finite, 207
Band-pass filter, 265, 268
    generalized, 270
    time domain representation, 269
Band-reject filter, 271
Basis
    solving BVPs by change of, 244
Bernoulli, Daniel, 2
Bernoulli, Jakob, 362
Bernoulli numbers, 362
    asymptotic behavior of, 366
Bessel, F. W., 341
Bessel transform, 341
Bit reversal, 383
Bluestein method
    convolutional FFT, 394
Boundary conditions, 238
    Dirichlet, 238
    Neumann, 247
    periodic, 249
Boundary value problems, 236-259
    as a system of equations, 238, 247, 249
    diffusion of gossip, 251
    diffusion of heat, 241
    for ordinary differential equations, 253
    for partial differential equations, 256-259
    Poisson BVP solvers nearly optimal, 259
    three-dimensional, 259
    two-dimensional, 256-259
        three steps to solving, 258
    with Dirichlet boundary conditions, 238-247
    with Neumann boundary conditions, 247-249
    with periodic boundary conditions, 249-253
Bowtie on seismic data, 277
Boxcar function, 101
Bracewell, Ronald N.
    patented FHT, 341
    The Hartley Transform, 346
Brigham, E. Oran, 227, 380
Buneman, Oscar
    champion of the DHT, 347
Burkhardt, Heinrich, 5, 381
Burrus, C., 5
Butterfly
    matrix, 390
    relations, 130, 382
        as matrix-vector product, 390

Cas(x)
    Hartley transform kernel, 342
    orthogonality of, 344
Case study
    compactly supported functions, 199-200
CAT (computer aided tomography), 287
Causal function, 310
Central Slice Theorem, 291
Ceres
    Gauss interpolates the orbit, 5, 41
Chebyshev, Pafnuti Liwowich, 322
Chebyshev polynomials, see Polynomials
Chebyshev series
    of absolute value of x, 329
    written as a Fourier cosine series, 327
Chebyshev transform, 321-330
    discrete, 329
    inverse discrete, 329
    numerical example, 329
Chinese Remainder Theorem
    ordering for FFTs, 393
Christoffel, Elwin Bruno, 335
Christoffel-Darboux formula, 335
Circulant matrix
    for BVPs with periodic boundaries, 250
Clairaut, Alexis Claude, 2, 4
    first explicit DFT, 4
Comb function, 49
Combine relations, 130, 382
Combine stage, 383
Common midpoint (CMP), 274
Compact support, 197-200
    makes left side of summation formula finite, 207
    means DFT error is replication error, 198
    means IDFT error is truncation error, 218
    of $f$ can't be band-limited, 197
Compact symmetric FFT, 119, 128
    for solving two-dimensional BVPs, 258
    requires fewer passes through the data, 137
Conjugate symmetric
    DFT of, 77
Conjugate symmetric data, 76
    in two dimensions, 165
    is conjugate even, 76, 120
Conjugation, complex, 30
Contour integral to define inverse Laplace transform, 310
Convergence
    and numerical methods, 100
    of Chebyshev series expansions, 326
    of finite difference methods, 255
    of Fourier series, 35-38
    of Fourier series with AVED, 94
    of geometric series, 314
    of orthogonal polynomial expansions, 332
    of z-transform excludes origin, 314
    to average at discontinuities, 38
Convolution, 48
    as a filtering operator, 83, 84, 261
    Bluestein's FFT, 394
    discrete or cyclic, 78-85, 261
    FFTs based on, 392-395
    for multiplying polynomials, 84
    frequency, 85
    graphical development, 82
    of a function with a spike, 48
    Rader FFT, 395
Convolution theorem
    continuous form, 87, 263
    discrete, 83
    discrete form, 261
    for the DHT, 346
Cooley, James W., 5
    seminal FFT paper, 380
Cooley-Tukey FFT, 380
    schematic, 383
Coordinates
    solving BVPs by change of, 244
Correlation
    continuous, 87
    discrete, 85
    in the frequency domain, 86
Correlation theorem, 86
Cosine series
    from a Chebyshev series, 327
Cosine transform, discrete (DCT), 122-125
Cotes, Roger, 370
Cut-off frequency, 266

D'Alembert, Jean, 2
Danielson, G.
    FFT precursor, 380
Darboux, Jean Gaston, 335
Data analysis, 6
DC component, 27
DCT, 122-125
    a new and independent transform, 124
    by pre- and postprocessing, 133
    computational cost of, 123
    computing in two dimensions, 171
    defined, 124
    for solving BVPs with Neumann boundary conditions, 248-249
    inverse, 125
    shift property, 248
    solving differential BVPs, 256
    to find Chebyshev coefficients, 327
    to compute the discrete Chebyshev transform, 328
    two-dimensional, 170
    two-dimensional for Poisson equation, 258
    two-dimensional inverse, 171
Deans, Stanley, 289
Decay rate
    of Chebyshev coefficients, 326
    of Fourier coefficients, 186
        strong result, 187
Decomposition
    of $f_n$ into even and odd, 78
    of Hartley transform into even and odd, 342
    spectral, 7
Delta function, 45, 263
    properties, 47
    sifting property, 288
Delta sequences, 46
Depth section, 279
Derivative property
    of the Fourier transform, 279
Derivatives
    one-sided, 36
    replaced by differences, 255, 257
Determinant
    of $A - \lambda I$, 245
DFT, 1-434
    amplitude-phase form, 264
    and quadrature, 358-375
    and the z-transform, 319-321
    and the trapezoid rule, 358-369
    basic modes, 25
    basic properties, 72-87
    defined, 23
    defined on double set of points, 67
    defined on single set of points, 67
    error approximating Fourier coefficients, 188
    error approximating Fourier transform, 198
        with general band-limited input, 203
    error without AVED, 93
    errors in the inverse, 212-222
    for approximating Fourier coefficients, 33-40
        with periodic, band-limited input, 181-184
        with periodic, non-band-limited input, 184-192
    for approximating Fourier transforms, 17-23
        with compactly supported input, 197-200
        with general band-limited input, 200-206
        with general input, 206-211
    for causal sequences, 24
    for even length sequences, 23
    for odd length sequences, 23
    general form, 66
    indexed 0 to $N - 1$, 24
    interpolation, 222-226
    inverse, 28
    limit as $A \to \infty$ and $\Delta x \to 0$, 211
    limit as $N \to \infty$, 53-55, 104
    limit as $N \to \infty$ and $A \to \infty$, 211
    matrix factorizations of, 389-392
    matrix properties of, 32
    multidimensional, 144-172
        from two-dimensional, 163
    of a correlation, 86
    of a linear profile, 104
    of a padded sequence, 90
    of a real even sequence, 124
    of a reversed sequence, 74
    of a square pulse, 101
    of asymmetric exponential, 210
    of convolution is product of DFTs, 83
    of cyclic convolution, 78-85
    of symmetric sequences, 76-78
    on irregular grids, 298
    quarter-wave, for BVPs with mixed boundaries, 259
    real form, 25
    relating replications of samples of $f$, $\hat{f}$, 196
    relation to $c_k$ and $\hat{f}(\omega_k)$, 211
    relation to DHT, 345
    should we use the DHT instead?, 346-347
    symmetric, 76
    table of properties, 88
    table of various forms, 69
    two-dimensional, 144-152
    used to approximate $f$, 222
    using the IDFT to compute, 74
    what does it approximate?, 53-55
DHT, 341-347
    relation to DFT, 345
    should it replace the DFT?, 346-347
Diagonal dominance, 241, 249, 251
Diagonal matrix, 245
Difference equation, 236-259
    boundary value problem (BVP), 237
    BVPs with Dirichlet boundaries, 238-247
    BVPs with Neumann boundaries, 247-249
    BVPs with periodic boundaries, 249-253
    homogeneous and nonhomogeneous, 236
    initial value problem (IVP), 237
    IVP solution by z-transform, 318-319
    linear and nonlinear, 236
    order of, 236
    time-dependent, 237
    trial solution of, 239
    used in analytical DFT, 105
Difference, finite, 254
Differential equation
    satisfied by Chebyshev polynomials, 324
    satisfied by Legendre polynomial, 339
Diffractions, 277
Diffusion
    of gossip, 251
    of heat: an example BVP, 241
    steady state modeled by Poisson equation, 256
Digital filtering, 260-271
    and convolution, 84
    and ringing, 268
    filter design, 264-271
Digital signal, 260
Dip aliasing, on depth section, 284
Dirac, Paul Adrien, 45
Dirac delta function, 45
Dirichlet
    boundary conditions, 238
        two-dimensional, 256
    smoothness conditions, 188
Dirichlet, P. G. Lejeune, 4, 188
Discrete Chebyshev transform, 329
    inverse, 329
    numerical example, 329
Discrete Hartley transform, see DHT
Discretization
    error, 54
        and the IDFT, 58
    image reconstruction problem, 292
    of differential equation, 254
    of the continuous z-transform, 320
Dispersion relation, 280
    differentiated, 282
Distortionless filter
    continuous, 264
    discrete, 265
Distribution, 46
Double set, samples in the DFT, 67
DST (discrete sine transform), 125-127
    a safe but inefficient method for computing, 134
    BVP matrix diagonalized by modes of, 246
    computational cost of, 126
    defined, 126
    for solving BVPs with Dirichlet boundaries, 239-247
    for solving differential BVPs, 255
    inverse, 127
    renders a monstrous equation benign, 258
    shift property, 244
    two-dimensional, 172
    two-dimensional for Poisson equation, 257
    two-dimensional inverse, 172
Duhamel
    introduces split-radix FFT, 392

Economization, 350
Edson, J. O.
    first compact symmetric FFT, 128
Eigenvalues, 245
    found by solving difference equation, 245
    of the DFT matrix, 32
Eigenvectors, 245
    found by solving difference equation, 245
    of Dirichlet BVP matrix are DST modes, 246
Endpoints, 368
Energy, and Parseval's relation, 86
Error
    discrete least squares, 41
    discretization, 54
    due to leakage, 99
    in DFT of piecewise continuous $f$, 189
    in quadrature
        extrapolation, 372-375
    in the DFT
        case study of periodic, band-limited input, 183-184
        case study of periodic, non-band-limited input, 189
        of compactly supported functions, 198
        of general band-limited input, 203
        of periodic, non-band-limited input, 188
    in the IDFT
        computing Fourier series synthesis, 213
        for band-limited $f$, 217
        with compactly supported $f$, 217-218
    in the trapezoid rule, 360-369
        when is it exact?, 365-368
    in trigonometric interpolation, 223-225
    interpolation is source for image reconstruction, 298
    mean square, 180, 223
    truncation, 54
Essentially band-limited input, 97
Euler, Leonhard, 2, 17
    and Lagrange discretized wave equation, 2
Euler relations, 17
    in the DFT of a square pulse, 102
Euler-Maclaurin summation formula, 358, 361
Even function
    Chebyshev polynomial $T_n$ is, for $n$ even, 322
    decomposition of Hartley transform into, 342
    Legendre polynomial $P_n$ is, if $n$ is even, 338
    the Radon transform is an, 290
Even sequence, see also Waveform decomposition
    DFT of, 77
    DHT of is even, 345
    or array in two dimensions, 169-172
Evolution of system
    governed by an IVP, 318
Exploding reflector
    seismic exploration model, 279
Extension
    even, 125
    odd, 127
    periodic, 38, 181, 197
Extrapolation, 371
Extreme points of Chebyshev polynomials, 323

Fast Hartley transform, 341faster than FFT?, 346

Fast Poisson solvers, 253-259computational cost of, 259

FFT, 380-39710-minute, 1-page derivation, 381a large, proliferating family, 380Bluestein method, 394compact symmetric, 119, 128

for solving BVPs, 258computational savings with, 388convolution methods for, 392-395Cooley-Tukey algorithm, 380

matrix factorization of, 391schematic, 383

cost offor digital filtering, 84for fast Poisson solvers, 259

decimation-in-frequency, 385index expansions, 388

decimation-in-timeindex expansions, 387-388

for image reconstruction, 295for vector and parallel computers,

396Gauss used the first, 5Gentleman-Sande algorithm, 385

schematic, 386hybrid strategies for, 396

index expansions, 387-388in-place algorithm, 383in-place, in-order PFA methods, 394inside the black box, 380-397is the FHT faster?, 346length N = pq as two-dimensional

FFT, 387length N = pqr as three-dimensional

FFT, 388made obsolete by FHT?, 341matrix factorizations of, 32, 389-392

inverse, 391mixed-radix, 384multiple transforms, 395nested polynomials "with a twist",

388not known for polar coordinates, 293often come in herds, 395on irregular grids, 298operation count, 383, 388overhead, 396Pease algorithm, 386, 392performance, 395-397prime factor algorithms (PFA), 392-

395Rader algorithm, 395self-sorting, 385split-radix, 392Stockham self-sorting, 385, 392symmetric, 395to compute symmetric DFTs, 119two-dimensional for image recon-

struction, 294Winograd factorizations of, 394

FFTPACK, 71FHT

makes FFT obsolete?, 341Filon, L.

quadrature rules, 358Filon's rule, 369-370

    O(h^4) accuracy, 370
Filter, 263

amplitude distortion, 265band-pass, 265, 268continuous distortionless, 264discrete distortionless, 265generalized band-pass, 270generalized low-pass, 270high-pass, 270low-pass, 260, 266modified low-pass, 268notch, or band-reject, 271time-shifting, 264


zero phase, 265Filter design

from generalized low-pass filter, 270-271

in the frequency domain, 264in the time domain, 264of band-pass filter, 268-270of low-pass filter, 266-268

Filtering, see also Digital filteringand convolution, 83, 84glacial turbidity example, 10

Finite difference, 254FK migration, 272-284

a numerical example, 284a three-step process, 282with the DFT, 283-284

Folded difference, 92Folded sum, 92Fourier, Jean Baptiste Joseph, 2, 16

Theorie analytique de la chaleur, 4Fourier coefficients

    approximation by DFT, 33-40
        for periodic, band-limited input, 181-184
        for periodic, non-band-limited input, 184-192
        of A-periodic f on [-pA/2, pA/2], 190
    of asymmetric exponential, 210
    relation to Fourier transform, 104

Fourier series
    convergence of, 35-38
    decay rate of coefficients, 186
        strong result for, 187
    defined, 33
    for real-valued f, 34
    letting A → ∞, 55
    orthogonality of modes, 35
    relation to Fk and f̂(ωk), 211
    synthesis, 212-215
        f from IDFT on replicated ck, 215
    table of properties, 88
    used to check analytical DFTs, 103
    used to invert Laplace transform, 312

Fourier synthesiscase study of IDFT errors, 214case study of improved, 215

Fourier transform
    approximation by DFT, 17-23
        general band-limited input, 200-206
        general input, 206-211
        periodic, compactly supported input, 197-200
    converts a PDE to an ODE, 279
    defined, 17
    fractional, 312
    inverse, 17
    inverse of symmetric exponential, 218
    of a Radon transform
        Central Slice Theorem, 291
    of a spike, 48
    of a spike train, 50
    of asymmetric exponential, 210
    of band-limited f has compact support, 201
    of the exponential function, 48
    of the wave equation, 279
    relation to Fk and ck, 211
    relation to Fourier coefficients, 104
    relation to Hartley transform, 343
    table of properties, 88
    three-dimensional, 280
    two-dimensional, 279
    used to check analytical DFTs, 103
    within a Laplace transform, 311

Fractional Fourier transform, 312Fredholm, Eric Ivar, 249Fredholm Alternative, 249Frequency, 17

cut-off, 201, 266distribution of high and low, 67

in two dimensions, 147, 152Nyquist, 97of two-dimensional mode, 155shift property, 74vector, 154-155

Frequency convolution, 85Frequency correlation, 86Frequency domain, 7, 17

two-dimensional, 155Function

    absolutely integrable, 17
    amplitude-phase form, 263
    A-periodic, 33
    auxiliary to satisfy AVED, 95
    band-limited, 57, 96, 182
    boxcar, 101
    cas(x), kernel of Hartley transform, 342
    causal, 310
    C^∞, 365
    comb, 49
    compactly supported, 197
    delta, 45

    density, 287
    generalized, 46
    interpolating, 44
    not limited in space and frequency, 97
    odd, 94
    of finite duration, 197
    piecewise monotone, 187
    quadratic with cusps, 200
    reconstructed from its samples fn, 96
    replication, 193
    sinc, 96, 104
    smooth quartic, 200
    spatially limited, 54, 197
    square pulse, 101, 199
    triangular pulse or hat, 200

FQ test, 103, 107

Gauss, Carl Friedrich, 5, 41
    interpolates the orbit of Ceres, 5, 41
    the first FFT, 5

Gaussian elimination, 239, 248computational cost of, for BVPs, 259the best method for one-dimensional

BVPs, 239Gaussian quadrature, see QuadratureGeneralized function, 46Gentleman, W. M.

first FFTs as factorizations, 389Gentleman-Sande FFT, 386Geologic structures

anticline, 276syncline, 276

Geometric seriescomputing z-transforms using, 314

Geometric sum, 100computing analytical DFTs, 106computing analytical DFTs using,

102Geophones, 272Ghost reflections, 286Gibbs, Josiah, 206Gibbs effect, 206, 214Goldstine, Herman Heine, 5, 6, 381Good, I.

FFT precursor, 380introduces idea behind PFAs, 392

Gossip modeled by a BVP with periodicboundary conditions, 251

Grid spacing, see Sampling rate, 144Grid spacing ratios and two-dimensional

reciprocity relations, 160

Hadamard, Jacques, 341Hankel, Hermann, 341Hankel transform, 341Hartley, Ralph V. L., 341Hartley transform, 342

discrete, 343inverse discrete, 344is its own inverse, 342kernel of, 342properties, 345-346relation to Fourier transform, 343

Heideman, M., 5Helmholtz, Hermann von, 256Helmholtz equation, 256Hermite, Charles, 74Hermitian symmetry, 74Hertz, Heinrich, 18High-pass filter, 270Hilbert, David, 341Hilbert transform, 341Huygens, Christian, 273Huygens' principle, 273

IDFT
    approximating Fourier series with, 212-215
        f can be recovered from replicated ck, 215
        for periodic, band-limited f, 213
        the error is truncation, 213
    approximating inverse Fourier transforms with, 215-222
        for band-limited f, 217
        for compactly supported f, 217-218
    case study of errors in, 218-221
    defined, 28
    effect of letting N → ∞, 57
        on the error, 214
    limit as N → ∞ and Δω → 0, 221
    using the DFT to compute, 74
    what does it approximate?, 56-59

Image, 287Image reconstruction, 286-297

a simple example, 295an inverse problem, 292artifacts, 298DFT method for, 292-297Fourier transform method for, 292geometry of interpolation in, 294interpolation is main source of error

in, 298operation count, 295

Imaginary unit, 17, 66


Impulse, 263function, 45response, 263

IMSL, 70Incidence

normal incidence points, 274Incidence angle, 274Initial value problems (IVPs), 237, 318-

319Inner product

continuum orthogonality of, 34discrete orthogonality of, 29of continuum functions, 34, 223of two-dimensional arrays, 147

Integrationby parts, 100numerical, 358

Interpolation, see also Trigonometric interpolation

by the DFT, 222-226error in, 223-225

by trigonometric polynomial, 44, 222for image reconstruction problem,

294for seismic migration, 284in the frequency domain, 91of polar grid data to Cartesian grid,

294Interval of support, 197Inversion of the DFT, 30

Johnson, D., 5

Kelvin, Lord, 4Kernel

of Fourier transform, 17of Hartley transform, 342

Kohonen-Loewe transform, 341Kronecker, Leopold, 28Kronecker delta, 28, 34, 148

modular, 28sequences, 46

Kronecker product, 391use in PFAs, 393

Lagrange, Joseph-Louis, 2, 4and Euler discretized wave equation,

2discoverer of Fourier series?, 4early Fourier sine series, 3

Laguerre, Edmond Nicola, 312Lanczos, Cornelius

FFT precursor, 380Laplace, Pierre Simon, 4, 310

    Theorie analytique des probabilites, 310
Laplace transform, 310-312
    defined if input decays as t → ∞, 310
    fixed parameter of, gives a Fourier transform, 311
    inverse defined by a contour integral, 310
    numerical inversion, 310
    numerical inversion example, 312
    z-transform is a discrete form of, 313

Laurent, Pierre Alphonse, 321Laurent series, 321Leakage, 98-99, 189

    and periodic, non-band-limited input, 189

graphical example, 192graphically displayed, 99

Least squareserror, 41minimized by Chebyshev polynomi-

als, 325Lebesgue, H., 4Legendre, Adrien Marie, 4, 331Legendre polynomials, see PolynomialsLegendre transform, 333

derivation of discrete transform pair,338-341

L'Hopital's rule, 335Line integral, 287Linear trend

subtract to improve quadrature accu-racy, 368-369

Linearityof the DFT, 72of the Radon transform, 290of the z-transform, 317

Long divisionfor inverting z-transforms, 316

Low-pass filter, 266amplitude spectrum of, 266discrete form of, 266generalized, 270modified to reduce ringing, 268time domain representation, 266with linear taper applied to reduce

ringing, 270

Maclaurin, Colin, 361Theory of Fluxions, 361

MAPLE, 71MATHCAD, 71MATHEMATICA, 70MATLAB, 71


Matrix
    block tridiagonal, from BVPs, 257
    butterfly, 390
    circulant, 395
    circulant from BVPs, 250
    factorization of Cooley-Tukey FFT, 391
    factorization of inverse DFT, 391
    Kronecker product of, 391
    of DFT is unitary up to a factor of N, 32
    orthogonal, 337
    permutation, 391
    sparse, from BVPs, 257
    theory for the DFT, 31-32
    theory useful for FFTs, 32
    tridiagonal, from BVPs, 239, 248

Matrix perspectivethe DST and BVPs, 244-247

Maxwell, Clerk, 4Mean square error, 223Migration

of seismic data, 272-284phase shift method for, 286

Minimax propertyof Chebyshev polynomials, 324

Modes, 5, 17basic, 25, 145

aliasing, 185fundamental, 6, 21two-dimensional geometry of, 147,

152-160, 281Modulation, 73Monotone, piecewise, 187Multidimensional DFT

by two-dimensional methods, 163

Natterer, Frank, 289Neumann

boundary conditions, 247two-dimensional, 258

Neumann, Carlboundary conditions named for, 247

Neumann, Franz Ernst, 247Newton, Isaac, 370

    Philosophiae Naturalis Principia Mathematica, 370

Newton-Cotes quadrature, 370Newton's method, 100Norm, 223

mean square, 223weighted two-norm, 326

Normal equations, 42Normal incidence points, 274

Notch filter, 271Numerical methods, 100

for inverting the Laplace transform,310

of integration, 358Nyquist

frequency, 97sampling rate, 96, 183, 202

Nyquist, Harry, 96

Odd function, 94Chebyshev polynomial Tn is, for n

odd, 322decomposition of Hartley transform

into, 342Legendre polynomial Pn is, if n is

odd, 338Odd sequence, see also Waveform decom-

positionarray in two dimensions, 172DFT of, 77DFT of real, 125DHT of, is odd, 345

One-mode, 21Operational perspective

the DST and BVPs, 243-244Operator

    DCT, 248
    DFT, 24
    DHT, 345
    DST, 244
    Fourier transform, 292
    IDFT, 28
    interpolation (image reconstruction), 294
    inverse z-transform, 315
    projection, 288
    z-transform, 314

Orthogonal matrix, 337Orthogonal polynomials, 331-341

Chebyshev, see also Polynomialsfrom expansions to discrete trans-

forms, 333-338Laguerre, see also Polynomialsorthogonal with respect to degree,

337orthogonal with respect to grid

points, 335representing functions as expansions,

324, 332Orthogonality

a function orthogonal to its own in-terpolation, 224

and the convolution theorem, 84


and the normal equations, 42and the transform of a spike train, 52Chebyshev polynomials

two discrete properties, 327continuum property, 33Discrete Poisson Summation For-

mula, 182discrete property, 28-30in DFT interpolation, 222least squares error, 43of Chebyshev polynomials

continuum, 324, 332of continuum functions, 223of cosine modes, 125

and Chebyshev polynomials, 324of Fourier series modes, 35of Laguerre polynomials

continuum, 332of Legendre polynomials

continuum, 332, 338two discrete properties, 340

of sine modes, 126of the Hartley transform kernel, 344of two-dimensional modes, 147of two vectors, 29polynomials

two discrete properties, 333with respect to a weight function,

324, 332with respect to degree, 337

Chebyshev polynomials, 327Legendre polynomials, 340

with respect to grid points, 335Chebyshev polynomials, 327Legendre polynomials, 340

Paley-Wiener theorem, 97Parseval, Marc Antoine, 44Parseval's relation, 44

DFT interpolation error, 224energy of input function is also en-

ergy of the DFT, 86Partial derivatives

replaced by differences, 257Partial differential equation, 256

    the wave equation, 279
Partial fractions may be used for inverting z-transforms, 316
Period, 17
Periodic array

doubly periodic, 145Periodic boundary conditions, 249Periodic extension, 38, 181, 197

of odd function, 94

Periodic function
    A-periodic, 33
    truncation leads to leakage, 99

Periodic replication, 193
    and decay of the Laplace transform kernel, 311
    in the frequency domain, 193
    is not periodic extension, 193
    of ck used to find f by IDFT, 215
    to improve DFT approximation, 203

Periodicityand the definition of the DFT, 66and wrap around effect, 265of cyclic convolution, 81of DHT, 345of the DFT, 72of two-dimensional coefficients, 165used to check analytical DFTs, 102,

107Permutation matrix, 391Pester

    defined, 180
    of the IDFT
        when f is nonzero at band limits, 217
        when f is nonzero at endpoints, 218
    when a space-limited f is nonzero at endpoints, 199
    when f̂(ω) is nonzero at the cut-off frequency, 202
    when the band-limit equals N/2, 183
    yields hidden lessons about the DFT, 180
Phase lines, 153, 156

orthogonal to the frequency vector,155

Phase shiftlinear in distortionless filter, 264migration, 286

Phase spectrum, 264obtained from Hartley transform, 343of distortionless filter, 265

Piecewise continuous, 35Piecewise monotone, 187Piecewise smooth, 36Plane wave, 281Poisson, Simeon Denis, 4, 182

Theorie mathematique de la chaleur,227

Poisson equationfast solvers for, 253-259steady state diffusion, 256


Poisson Summation Formulacontinuum, 195

replication form, 196discrete, 182, 185

replication form, 193Inverse Summation Formula, 217,

219replication form, 221

replication form, 203yields DFT interpolation error, 225

Polar coordinatesno FFT for, 293

PolynomialsChebyshev, 322

minimum least squares, 325multiple angle formulas for cosine

lead to, 321multiplicative property of, 323orthogonality, 324, 332properties, 322-326two discrete orthogonality proper-

ties, 327white curves of, 330

discrete orthogonality, with respectto degree, 337

discrete orthogonality, with respectto grid points, 335

Laguerreorthogonality, 332used to invert Laplace transform,

312least squares approximation by, 44Legendre, 331

derivation of discrete transform,338-341

orthogonality, 332properties, 338-339two discrete orthogonality proper-

ties, 340multiplication by convolution, 84orthogonal, 331-341

        expansions of f in, 324, 332
        transforms, 331-341
        two discrete orthogonality properties, 333
Power spectrum, 9

obtained from Hartley transform, 343Pre- and postprocessing, 127-137

computational cost, 131computing the DCT from refining a

bad idea, 131-134computing the DST by, 134cost compared to compact FFT, 136

to compute symmetric DFTs, 119to get a length 2N RDFT from a

length N DFT, 131to get two real DFTs by one complex

DFT, 129Prime factor algorithm, 392-395

four-step method, 394Projection operator, 288

Quadrature, 358
    and the DFT, 358-375
    Gaussian, 333
        exact for x^p where p = 0 : 2N - 1, 334
        to get polynomial expansion coefficients, 334
    higher-order rules, 369-375
        Filon's rule, 369-370
        Newton-Cotes rules, 370
            Simpson's rule, 371
    to derive discrete polynomial transforms, 333
    trapezoid rule, 358-369

Quarter-wave symmetries, 78, 137BVPs with mixed boundary condi-

tions, 259

Radon, Johann, 287Radon transform, 287-297

    basic properties, 289-291
    defined, 287
    Fourier image reconstruction, 292-297
    geometry, 288
    is a set of projections, 288
    of characteristic function of a disk, 289
    ρ and φ are not polar coordinates, 288
Ramp filter, 270

time domain representation, 270Ratio

in a geometric sum, 100of grids for reciprocity relations, 160of leading coefficients, Legendre poly-

nomials, 339signal-to-noise, 276

Raypath, 274
Real DFT
    coefficients related to complex DFT, 122
    computational cost of, 120
    defined, 121
    defined for n, k = 0 : N - 1, 134
    for solving BVPs with periodic boundaries, 250-253
    in two dimensions, 165
        using one-dimensional RDFTs, 168
    inverse, 121
        in two dimensions, 169
    solving differential BVPs, 256

Real sequenceDFT of, 76, 120-122DFT of even and odd, 77even, 122inverse DFT of

even, 125odd, 127

odd, 125zero phase filter, 265

Reciprocity relations, 20-22
    an exquisite example, 205-206
    and filter design, 267
    and interpolation error in image reconstruction, 299
    and leakage, 98
    and refining the frequency grid, 55
    and the transform of a spike train, 51
    and zero padding, 90
    approximating ck for band-limited input, 183
    carry over to the Hartley transform, 344
    choosing IDFT parameters, 213, 216
    choosing sample rates for FK migration, 283
    for double/even/noncentered DFT, 68
    for single/odd/centered DFT, 70
    inverting the Laplace transform, 311
    link replication periods in space, frequency, 196
    reducing one error increases the other, 209
    two-dimensional, 159-160

Record section, 276Recurrence relation

of Chebyshev polynomials, 324of Legendre polynomials, 339

Reflectionangle of, 274coefficient, 274

Reflector section, 279Refraction, angle of, 274Reordering stage, 383Replication, see Periodic replication

Residue theory, 321Reversed sequence

DFT of, 74Richardson, Lewis Fry, 371Richardson extrapolation, 371Riemann, Georg Friedrich, 4Ringing, 268Rodrigues, Olinde, 338Rodrigues' formula, 338Romberg integration, 372Roots, of unity, 29Rotation

of DFT by shift, 73shift property in two dimensions, 150

Runge, Carl Davis Tolme, 380in 1903 developed near-FFTs, 380

Running averageacts as a low-pass filter, 260

Ruritanian map, 393

Sample pointseven or odd number of, 67using extreme points for Chebyshev

series, 327Sampling

    in space replaces f with its replication, 198
    when do we have enough samples?, 96

Sampling errorin DFT interpolation, 226in the frequency domain, 217

Sampling intervalcentered vs. noncentered, 67

Sampling rate, 7and highest resolvable frequency, 9Nyquist, 96, 183twice per period needed, 97, 156two-dimensional, 144

in frequency domain, 155Sampling theorem, 96

from Poisson Summation Formula,204

Scaling factor, 66Seismic data

bowtie, 277diffractions, 277FK migration, 272-284

Seismic explorationa crash course, 272-278

Seismic migration, 277
Seismic section, 276
Semigroup property of Chebyshev polynomials, 323
Separable, array fmn, 145
Sequence

causal, 67DFT of even and odd, 77DFT of reversed, 74DFT of symmetric, 76-78differenced, 92folded, 92Kronecker delta, 46of functions, 46padded, 90periodic replication, 193summed, 91the unknown of a difference equation

is a, 236Shah function, 49Shannon, Claude, 96Shannon Sampling Theorem, 96Shift property, 73

    and rotation of DFT, 73
    design of distortionless filters, 264
    in the frequency domain, 74, 269
    in two dimensions, 149
    of the DCT, 248
    of the DHT, 345
    of the DST, 244
    of the Radon transform, 290
    of the z-transform, 317
        is related to the Laplace transform of derivatives, 318
    used to interpret the splitting method, 384

Sidelobes, 99, 267poor sampling of periodic input, 191

Sifting propertyof the delta function, 47, 288

Signal, digital, 260Signal processing, see Digital filteringSignal-to-noise ratio

    generally low in seismology, 276
Simpson, Thomas, 371
Simpson's rule, 371
Sinc function, 96, 104
    band-limited, 205
Sine transform, discrete (DST), 125-127
Single set, samples in the DFT, 67
Smoothness
    defined by number of continuous derivatives, 186
    reflected in Euler-Maclaurin summation, 361

Snell, Willebrord, 274
Snell's law

derived from Huygens' principle, 274links angles of incidence, reflection,

refraction, 274Software

DFTs used in, 70-71Sparse matrix, 257Spatial domain, 10, 17

two-dimensional, 152Spatially limited, see Compact supportSpectral analysis, 7

and linearity, 72Spectrum, 9, 18

amplitude, 264phase, 264

Spikelimit of a delta sequence, 46used to explain AVED error, 95

Spike train, 49Splitting method, 381-386

decimation-in-time, 382discovered by Gauss, 6gives a DFT from two half-length

DFTs, 129-131Square pulse, 101, 199

    transform of a sinc, 205
Stanford University

copyright on FHT codes, 347Storage

issues for vector and parallel comput-ing, 163

savingsof a real even sequence, 123of real odd symmetry, 125of real sequence, 120

Strobing, 95, see also AliasingSummation

by parts, 100used in analytical DFT, 104

Summation formulaEuler-Maclaurin, 358, 361Poisson, 182, 358

versus Euler-Maclaurin, results,363-364

Support, 197Symmetric DFTs, 76, 118-137

applied to any real sequence, 118computational savings of, 120, 123,

126in two dimensions, 167-169, 171

cost comparison of methods for, 136explicit forms, 119in two dimensions, 163-172methods for computing, 127-137


pre- and postprocessing methods for,119

steps in developing, 118table of costs, 127table of numerical examples, 135two-dimensional hybrid symmetries,

172Symmetry

and economy of storage, 77, 120, 123,125

properties, 76-78quarter-wave, 78quarter-wave even and odd, 137real even, 122real odd, 125storage economy from, in two dimen-

sions, 165-167, 169, 170two-dimensional even, 169-172two-dimensional odd, 172used to check analytical DFTs, 102,

107Syncline, 276

focuses seismic raypaths, 276Synthesis

    Fourier, 212
    of f from its modes, 18

Table of DFTs, 400-408
    examples from, 98, 189, 191, 210, 219, 252, 261, 262, 267
    explanation of, 109
Table of z-transforms, 317
Taylor, Brook, 210
Taylor series, 210, 220
    used to derive difference operators, 254

Tchebycheff, see ChebyshevTheilheimer, F.

first FFTs as factorizations, 389Theory of residues, 321Thompson, William (Lord Kelvin), 4Time domain, 10, 17Time series analysis, 6Time-dependent difference equation, 237Time-shifting filter, 264Tomography

computer aided (CAT), 287Traces on seismic section, 276Transfer function, 263Transform

Bessel, 341discrete Chebyshev, 329discrete Fourier, 23discrete Hartley, 343

properties, 345-346relation to DFT, 345

discrete Legendre, 340derivation, 338-341

discrete orthogonal polynomial, 333-338

fractional Fourier, 312Hankel, 341Hartley, 341-347

        relation to Fourier transform, 343
    Hilbert, 341
    inverse discrete, 340
    inverse discrete Chebyshev, 329
    inverse discrete Hartley, 344
    inverse discrete orthogonal polynomial, 335
    inverse Hartley, 342
    inverse Legendre, 333
    Kohonen-Loewe, 341
    Laplace, 310
    Legendre, 333
    orthogonal polynomial, 333-338
    Radon, 287
    Walsh-Hadamard, 341
    z, see z-transform

Transform domain, 17Trapezoid rule, 19

    aliasing and error, 365
    and the DFT, 358-369
    error of, 360-369
        differs for sines and cosines, 364
        endpoints agree for f, f', 363
        for infinitely differentiable functions, 365
        for periodic, C^∞ functions, 366
        no endpoint agreement, 362
    for image reconstruction, 293
    in FK migration, 283
    used for approximating Fourier coefficients, 39
    when is it exact?, 365-368

Travel time of seismic wave, 274Trial solution, 105, 239Triangle inequality, 198, 204Tridiagonal matrix, 244

block, 257for BVP diagonalized by DST modes,

246for BVP with Dirichlet boundaries,

239for BVP with Neumann boundaries,

248Tridiagonal system, 239


Trigonometric interpolationby DFT interpolation, 222-226by Gauss for the orbit of Ceres, 5-6by the DFT

error in, 223-225case study of DFT errors, 226

Trigonometric polynomial, 41
    as an interpolating function, 44
    used to approximate f, 41-44, 222

Trigonometric seriesBernoulli's solution to wave equation,

2Truncation error, 54, 204, 218

and the IDFT, 57leakage, 192of DFT interpolation, 226of finite difference operator, 255

Tukey, John W., 5seminal FFT paper, 380

TurbidityDFT used to analyze, 7-10

Twiddle factors, 388, 393, 394Two-dimensional DFT, 144-152

    by successive one-dimensional DFTs, 162
    computational cost, 162
    computing methods, 161-163
    defined, 146
    extending methods to more dimensions, 163
    for indices 0 : M - 1, 0 : N - 1, 151
    hybrid symmetries, 172
    inverse, 146

Two-dimensional modesfour viewpoints about aliasing, 157-

159frequency, 155geometry of, 147, 152-160phase lines of, 153table of wavelengths, 155wavelength of, 154

Unitary matrix, 32

Von Helmholtz, Hermann, 256

Walsh-Hadamard transform, 341Wave equation, 2, 279

formal solution by Fourier transform,280

Waveform decomposition, 78Wavelength, 17, 147, 153

of two-dimensional mode, 154Wavelets

leading to fast Legendre transforms,341

Wavenumber, 279Weight function, 332

for Chebyshev polynomials, 324Weighted average

a simple filter, 81Weights

Gaussian quadrature, 334White curves, 330Wiener, Norbert, 97Window, 192, 271Winograd factorizations, 394Wrap around effect, 265

ghost reflections in seismic migration,286

in image reconstruction, 297

Zero errorin the trapezoid rule, 367

Zero padding, 90and refining the frequency grid, 91to eliminate ghost reflections, 286to minimize image reconstruction ar-

tifacts, 297Zero phase filter, 265

is a real-valued sequence, 265Zeros

of Chebyshev polynomials, 323of Legendre polynomials, 338

z-transform, 312-321
    a discrete Laplace transform, 313
    a semi-discrete transform, 314
    and the DFT, 319-321
        the inverse transforms, 320-321
    approximated by DFT on auxiliary sequence, 320
    basic properties, 317-318
    computed analytically with geometric series, 314
    defined, 314
    defined outside a circle in complex plane, 319
    inversion examples, 315-316
    not defined at z = 0, 314
    of a geometric sequence, 314
    of a step sequence, 314
    of sine and cosine sequences, 315
    solving IVP difference equations with, 318-319
    table of z-transforms, 317
