harmonicnotes.pdf

8/11/2019 harmonicnotes.pdf

http://slidepdf.com/reader/full/harmonicnotespdf 1/167

Harmonic Analysis

W. Schlag



Contents

Chapter 1. Fourier Series: Convergence and Summability 5

Chapter 2. Harmonic Functions onD and the Poisson Kernel 25

Chapter 3. Analytic h1 D

Functions, F. & M. Riesz Theorem 35

Chapter 4. The Conjugate Harmonic Function 43

Chapter 5. Fourier Series on L1 T

: Pointwise Questions 53

Chapter 6. Fourier Analysis on Rd 63

Chapter 7. Calderon-Zygmund Theory of Singular Integrals 79

Chapter 8. Almost Orthogonality 101

Chapter 9. The Uncertainty Principle 111

Chapter 10. Littlewood-Paley theory 125

Chapter 11. Fourier Restriction and Applications 145

Bibliography 165

3



CHAPTER 1

Fourier Series: Convergence and Summability

We begin this book with one of the most basic objects in analysis, namely the

Fourier series associated with a function or a measure on the circle. To be specific,

let T R Z be the one-dimensional torus (in other words, the circle). We consider

various function spaces on the torus T, namely the space of continuous functions

C T , the space of Holder continuous functions C α T where 0 α 1, and the

Lebesgue spaces L p

T

where 1

p

. The space of complex Borel measureson T will be denoted by M T

. Any µ M T has associated with it a Fourier

series

µ

n

ˆ µ n e nx (1.1)

where we let e x

:

e2πix and

ˆ µ n :

1

0

e nx µ dx

T

e nx µ dx

The symbol in (1.1) is formal, and simply means that the series on the right-hand

side is associated with µ. If µ dx f x dx where f L1 T , then we also write

ˆ f

n

instead of ˆ µ

n

.Of course, the central question which we wish to explore in this chapter is

when µ equals the right-hand side in (1.1) or represents f in a suitable sense. Note

that if we start from a trigonometric polynomial

f x

n

ane nx

where all but finitely many an are zero, then we see that

ˆ f n an n Z (1.2)

In other words, we have the pointwise equality

f x

n

ˆ f n en x

with en x : e nx . What enters (1.2) is of course the basic orthogonality relation

T

en x em x dx δ0 n m (1.3)

where δ0 j

1 if j

0 and δ0

j

0 otherwise.

5



6 1. FOURIER SERIES: CONVERGENCE AND SUMMABILITY

It is therefore natural to explore the question of convergence of the Fourier se-

ries for more general functions. Of course, the sense of convergence of the infinite

series needs to be specified before this fundamental question can be answered. Itis fair to say that much of modern analysis (including functional analysis) arose

out of the struggle with this question. For example, the notion of Lebesgue integral

was developed in order to overcome deficiencies of the older Riemannian definition

of the integral which had been revealed in the study of Fourier series. The reader

will also note the recurring theme of convergence of Fourier series throughout this

book.

It is natural to start from the most classical notion of convergence, namely that

of pointwise convergence in case the measure µ is of the form µ dx f x dx

with f x continuous or better. The partial sums of f L1

T are defined as

S N

f x

N

n N

ˆ f n e nx

N

n N

T

e ny f y dy e nx

T

N

n N

e n x y f y dy

T

D N x y f y dy

where D N x

N n N e

nx is the Dirichlet kernel. In other words, we have

shown that the partial sum operator S N is given by convolution with the Dirichlet

kernel D N :

S N f x D N f x (1.4)

In order to understand basic properties of this convolution, we first sum the geo-

metric series defining D N x

so as to have an explicit expression for the Dirichlet

kernel.

Exercise 1.1. Verify that for each integer N 0

D N x

sin 2 N

1 π x

sin π x

(1.5)

and draw the graph of D N for several diff erent values of N , say N 2 and N

5.

Prove the bound

D N x C min

N ,1

x

(1.6)

for all N 1 and some absolute constant C . Finally, prove the bound

C 1

log N

D N L1 T

C log N (1.7)

for all N 2 where C is another absolute constant.

The growth of the bound in (1.7) as well as the oscillatory nature of D N as given

by (1.5) may indicate that it is in general very delicate to understand pointwise or

almost everywhere convergence properties of S N f . This will become more clear

as we develop the theory.



1. FOURIER SERIES: CONVERGENCE AND SUMMABILITY 7

In order to study (1.4) we need to establish some basic properties of the con-

volution of two functions f , g on T. If f and g are continuous, say, then define

f

g

x

:

T

f x

y

g

y

dy

T

g x

y

f

y

dy (1.8)

It is helpful to think of f g as an average of translates of f by the measure g y dy

or vice versa. In particular, convolution commutes with the translation operator τ z

which is defined for any z T by the action on functions: τ z f x f x z

.

Indeed, one immediately verifies that

τ z f g τ z f g f τ zg (1.9)

In passing, we mention the important relation between the Fourier transform and

translations:

τ z µ n e zn ˆ µ n n Z

In what follows, we abbreviate almost everywhere or almost every by a.e.

Lemma 1.1. The operation of convolution as defined above satisfies the follow-

ing properties:

(1) Let f , g L1 T . Then for a.e. x T one has that f x y g y is L1

in y. Thus, the integral in (1.8) is well-defined for a.e. x T (but not

necessarily for every x), and the bound f g 1 f 1 g 1 holds.

(2) More generally, f g p f p g 1 for all 1 p , f L p , g L1.

This is called Young’s inequality.

(3) If f C T , µ M T then f µ is well-defined. Show that, for 1 p

,

f

µ p f

p µ

which allows one to extend f µ to arbitrary f L p.(4) If f

L p

T and g

L p

T where 1

p

, and 1

p

1 p

1 then

f g, originally defined only a.e., extends to a continuous function on T

and

f g

f p g p (1.10)

(5) For f , g L1 T show that for all n Z

f g n

ˆ f n g n

Proof. (1) is an immediate consequence of Fubini’s theorem since f x

y

g

y

is jointly measurable on T T and belongs to L1 T T

. (2) can be obtained by

interpolating between the cases p 1 just convered and the easy bound for p

.

Alternatively, one can use Minkowski’s inequality:

f g p

T

f y p g y dy f p g 1

which also implies (3). The bound (1.10) is just Holder’s inequality. Part (4)

follows from the density of C T

in L p T

for 1 p

and the translation

invariance (1.9). Indeed, one verifies from the uniform continuity of functions

in C T

and (1.9) that f µ

C T

for any f C

T and µ

M T

. Since




uniform limits of continuous functions are continuous, (4) now follows from the

aforementioned density of C T and (1.10).

Finally, (5) is a consequence of Fubini’s theorem and the character property of the exponentials e n x y e nx e ny

.

The following exercise introduces the convolution on the Fourier side in the

context of the largest class of functions where the respective series are absolutely

convergent. This class of functions is necessarily a subalgebra of C T called the

Wiener algebra.

Exercise 1.2. Let µ M

T have the property that

n Z

ˆ µ n (1.11)

Show that µ dx f x dx where f C T . Denote the space of all measures with

this property by A

T and identify these measure with their respective densities.

Show that A T is an algebra under multiplication, and that

f g n

m Z

ˆ f m g n m n Z

where the sum on the right-hand side is absolutely convergent for every n Z, and

itself is absolutely convergent over all n. Moreover, f g A f A g A where

f A :

ˆ f 1 . Finally, verify that if f , g L2 T , then f g A T .

Note that the Wiener algebra has a unit, namely the constant function 1. It is

clear that if f has an inverse in A T , then it is 1

f which in particular require that

f 0 everywhere on T. It is a remarkable theorem due to Norbert Wiener that the

converse here holds, too. I.e., if f A T

does not vanish anywhere on T

, then1 f

A T .

One of the most basic as well as oldest results on the pointwise convergence of

Fourier series is the following theorem. We shall see later that if fails for functions

which are merely continuous.

Theorem 1.2. If f C α

T with 0

α 1 , then

S N f

f

0 as N

.

Proof. One has, with δ 0, 1

2 to be determined,

S N f x f x

1

0

f x y f x D N y dy

y δ

f

x

y

f

x

D N

y

dy

12 y

δ

f x y f x D N y dy

(1.12)

Here one exploits the fact that

T

D N y dy 1




We now use the bound from (1.6), i.e.,

D N y

C min

N , 1

y

Here and in what follows, C will denote a numerical constant that can change from

line to line. The first integral in (1.12) can be estimated as follows

y δ

f x f x y

1

y

dy f α

y δ

y

α 1 dy C f αδα (1.13)

with the usual C α semi-norm:

f α sup

x, y

f x f x y

y

α

To bound the second term in (1.12) one needs to invoke the oscillation of D N y

.

In fact,

B :

12 y

δ

f x y f x D N y dy

12 y

δ

f x y f x

sin π y

sin 2 N

1

π y dy

12 y

δ

h x y sin

2 N

1 π y

1

2 N 1

dy

where h x y

:

f x y f x

sin π y

.

Therefore, with all integrals being understood to be in the interval

12

, 12

,

2 B

y δh x y sin 2 N 1 π y dy

y

12 N 1

δ

h x

y

1

2 N 1

sin 2 N

1

π y dy

y δ

h x y h x y

1

2 N 1

sin 2 N

1 π y dy

δ,

δ

12 N 1

h x

y

1

2 N 1

sin 2 N 1 π y dy

δ,δ

12 N 1

h x

y

1

2 N 1

sin 2 N

1

π y dy

These integrals are estimated by putting absolute values inside. To do so we usethe bounds

h x y C f

δ

h x

y

h x y

τ C

τ

α f α

δ

f

δ2 τ

if y

δ 2τ.




In view of the preceding one checks

B

C

N α

f α

δ

N 1

f

δ2

(1.14)

provided δ

1 N

. Choosing δ N

α 2 one concludes from (1.12), (1.13), and

(1.14) that

S N f x f x C

N α2 2

N α 2 N 1 α

(1.15)

for any function with f

f

α 1, which proves the theorem.

The reader is invited to optimize the rate of decay that was derived in (1.15).

In other words, the challenge is to obtain the largest β 0 in terms of α so that the

bound in (1.15) becomes C N β for any f with

f

f α 1.

The difficulty with the Dirichlet kernel, such as its slow 1 x

-decay, can be re-

garded as a result of the “discontinuity” of D N χ N , N

; in fact, this indicatorfunction on the lattice Z jumps at

N . Therefore, we may hope to obtain a kernel

which is easier to analyze — in a sense that will be made precise by means of the

notion of approximate identity below — by substituting D N with a suitable average

whose Fourier transform does not exhibit such jumps.

One elementary way of carrying this out is given by the Cesaro means, i.e.,

σ N f :

1

N

N 1

n 0

S n f

Setting K N :

1 N

N 1

n 0 Dn, which is called the Fej´ er kernel, one therefore has

σ N f K N f .

Exercise 1.3. Let K N be the Fejer kernel with N a positive integer.

Verify that K N looks like a triangle, i.e., for all n Z

K N n

1

n

N

(1.16)

Show that

K N x

1

N

sin N π x

sin π x

2

(1.17)

Conclude that

0 K N x C N

1 min

N 2, x 2

(1.18)

We remark that the square and thus the positivity in (1.17) are not entirelysurprising, since the triangle in (1.16) can be written as the convolution of two

rectangles (convolution is now on the level of the Fourier coefficients on the lat-

tice Z). Therefore, we should expect K N to look like the square of a version of D M

with M about half the size of N .

The properties established in Exercise 1.3 ensure that K N are what one calls an

approximate identity.




Definition 1.3. Φ N

N 1 L

T form an approximate identity provided

A1) 1

0

Φ N x dx 1 for all N

A2) sup N

1

0 Φ N x dx

A3) for all δ 0 one has

x δ

Φ N x dx 0 as N .

The name here derives from the fact that Φ N f f as N in any

reasonable sense, see Proposition 1.5. In other words, Φ N δ0 in the weak-

sense of measures. Clearly, the so-called box kernels

Φ N x N χ x N

12

, N 1

satisfy A1)–A3) and can be considered as the most basic example of an approx-

imate identity. Note that D N N 1 are not an approximate identity. Finally, we

remark that Definition 1.3 has nothing to do with the torus T, it applies equally

well to the line R, tori Td

, or Euclidean spaces Rd

.Next, we verify that K N belong to this class.

Lemma 1.4. The Fej´ er kernels K N

N 1 form an approximate identity.

Proof. We clearly have 1

0 K N x dx

1 and K N x 0 so that A1) and A2)

hold. A3) follows from the bound (1.18).

Now we establish the basic convergence property of such families.

Proposition 1.5. For any approximate identity Φ N

N 1 one has

(1) If f C T , then Φ N f f

0 as N

(2) If f L p T where 1 p , then Φ N f f p 0 as N

(3) For any measure µ M T , one has

Φ N µ µ N

in the weak- sense.

Proof. We begin with the uniform convergence. Since T is compact, f is uni-

formly continuous. Given ε 0, let δ 0 be such that

sup x

sup y

δ

f x y f x ε

Then, by A1)–A3),

Φ N f x f x

T

f x y f x Φ N y dy

sup x T

sup y

δ

f x y f x

T

Φ N t dt

y δ

Φ N y 2 f

dy

C ε

provided N is large.




Fix any f L p T and let g C T be such that f g p ε. Then

Φ N f f p Φ N f g p f g p Φ N g g p

sup N

Φ N 1 1

f g p Φ N g g

where we have used Young’s inequality, see Lemma 1.1, to obtain the first term

on the right-hand side. Using A2), the assumption on g, and Young’s inequality

finishes the proof.

The final statement is immediate from (1) and duality.

This simple convergence result applied to the Fejer kernels implies some basic

analytical properties. A trigonometric polynomial is a series

n ane nx

in which

only finitely many an are non-zero. Clearly, this class forms a vector space over

the complex numbers. In fact, they form an algebra under both multiplication and

convolution.

Corollary 1.6. The exponential family e nx n Z satisfies the following prop-

erties:

(1) The trigonometric polynomials are dense in C T in the uniform topology,

and in L p T for any 1 p .

(2) For any f L2 T

f

22

n Z

ˆ f n

2

This is called the Plancherel theorem.

(3) The exponentials e nx n Z form a complete orthonormal basis in L2 T .

(4) For all f , g L2 T one has Parseval’s identity

T

f x

g

x

dx

n Z

ˆ f n

g

n

Proof. By Lemma 1.4, K N

N 1

form an approximate identity and Proposi-

tion 1.5 applies. Since σ N f K N f is a trigonometric polynomial, we are done

with (1).

Properties (2)–(4) are equivalent by basic Hilbert space theory. We begin from

the orthogonality relations (1.3). One thus has Bessel’s inequality

ˆ f n

2 f

22

and equality is equivalent to the linear span of en

being dense in L2 T

. That,

however, is guaranteed by part (1).

We remark that the previous corollary is only possible with the Lebesgue inte-

gral, since unlike the Riemann integral it guarantees that L2 T

is a complete space

and thus a Hilbert space. We record two further basic facts which are immediate

consequences of the approximate identity property of the Fejer kernels.




Corollary 1.7. One has the following uniqueness property: If f L1 T and

ˆ f n 0 for all n Z , then f

0. More generally, if µ M T satisfies ˆ µ n 0

for all n Z

, then µ

0.

Proof. One has σ N f 0 for all N by assumption and now apply the conver-

gence property of Proposition 1.5.

The following is the Riemann-Lebesgue lemma.

Corollary 1.8. If f L1 T , then ˆ f n

0 as n

Proof. Given ε 0, let N be such that

σ N f f 1 ε. Then

ˆ f n

σ N f n

ˆ f n

σ N f f

1 ε for all n

N .

We now turn our attention to the issue of convergence of the partial sums S N f in the sense of L p T or C T . Observe that it makes no sense to ask about uniform

convergence of S N f for general f L

T because uniform limits of continuous

functions are continuous.

Proposition 1.9. The following statements are equivalent for any 1 p

:

a) For every f L p

T or f

C

T if p

one has

S N f f p 0 as N

b) sup N S N p p

Proof. The implication b) a) follows from the fact that trigonometric poly-

nomials are dense in the respective norms, see Corollary 1.6. The implication a) b) can be deduced immediately from the uniform boundedness principle of

functional analysis. Alternatively, can also prove this directly by the method of the

“gliding hump”: suppose sup N S N p p . For every positive integer one can

therefore find a large integer N such that

S N f p 2

where f is a trigonometric polynomial with f p

1. Now let

f x

1

1

2e M x f x

with some integers M to be specified. Notice that

f

p

1

1

2 f p

Now choose M

tending to infinity so rapidly that the Fourier support of

e M j x

f j x




lies to the right of the Fourier support of

j 1

1

1

2 e

M x

f

x

for every j 2 (here Fourier support means those integers for which the corre-

sponding Fourier coefficients are non-zero). We also demand that N M

as . Then

S M N S M N

1 f p

1

2 S N

f p

2

2

which as . On the other hand, since N M and

M N

1

, the left-hand side

0 as . This contradiction fin-

ishes the proof.

Exercise 1.4. Let cn n Z be an arbitrary sequence of complex numbers and

associate to it formally a Fourier series f x

n

cn e nx

Show that there exists µ M T with the property that ˆ µ n cn for all n Z

if and only if σn f n

1 is bounded in M T

. Discuss the case of L p T

with

1 p and C T as well.

We can now settle the question of uniform convergence of S N on functions

in C T .

Corollary 1.10. Fourier series do not converge on C T and L1 T , i.e., there

exists f C

T so that

S N f

f

0 and g L1

T so that

S N g

g 1 0

as n .

Proof. By Proposition 1.9 it suffices to verify the limits

sup N

S N

sup N

S N 1

1

(1.19)

Both properties follow from the fact that

D N 1

as N

, see (1.7). To deduce (1.19) from this, notice that

S N

sup f

1

D N f

sup f

1

D N f 0 D N 1

We remark that the inequality sign here can be replaced with an equality in view

of the translation invariance (1.9). Furthermore, with K M

M 1 being the Fejer

kernels,

S N 1 1 D N K M 1 D N 1

as M

.




Exercise 1.5. The previous proof was indirect. Construct f C T and g

L1 T so that S N f does not converge to f uniformly as N , and such that S N g

does not converge to g in L

1 T

.

In the positive direction, we shall see below that for 1 p

sup N

S N p p

so that by Corollary 1.10 for any f L p T with 1 p

S N f

f

p 0 as N

(1.20)

The case p 2 is clear, see Corollary 1.6, but p 2 is a somewhat deeper result.

We will develop the theory of the conjugate function to obtain it, see Chapter 4.

Note that unlike Theorem 1.2 there is no explicit rate of convergence here in terms

of some expression involving N . Clearly, this cannot be expected given only the

size of

f

p since g

x

:

e

mx

f

x

has the same L p

-norm as f but g

n

ˆ f n m for all n Z. This suggests that we may hope to obtain such a rate

provided we remove this freedom of translation on the Fourier side. One way of

accomplishing this is by imposing a little regularity, as expressed for example in

terms of the standard Sobolev spaces. Since we do not yet have the L p-convergence

in (1.20) for general p at our disposal, we need to restrict ourselves to p 2 which

is particularly simple.

Exercise 1.6. For any s R define the Hilbert space H s T by means of the

norm

f

2 H s :

ˆ f 0

2

n Z

n

2 s

ˆ f n

2 (1.21)

Derive a rate of convergence for

S N f

f

2 in terms of N alone assuming that f H s 1 where s 0 is fixed.

We now investigate the connection between regularity of a function and its

associated Fourier series somewhat further. The following estimate, which builds

upon the fact that d dx

e nx 2πine nx , goes by the name of Bernstein’s inequal-

ity.

Proposition 1.11. Let f be a trigonometric polynomial with ˆ f k 0 for all

k n. Then

f

p Cn f p

for any 1 p

. The constant C is absolute.

Proof. Let

V n x

:

1

e

nx

e

nx

K n

x

(1.22)

be de la Vallee Poussin’s kernel. We leave it to the reader to check that

V n j 1 if

j n

as well as

V

n 1 Cn




Then f V n f and thus f

V

n f so that by Young’s inequality

f

p V

n 1 f

p Cn

f

p

as claimed.

It is interesting to note that the following converse holds, due to Bohr.

Exercise 1.7. Suppose that f C 1 T satisfies ˆ f j

0 for all j with j n.

Then

f

p Cn f p

for all 1 p

, where C is independent of n Z , the choice of f and p.

Now we will turn to the question how smoothness of a function reflects itself

in the decay of its Fourier coefficients. We begin with the easy observation, based

on integration by parts, that for any f C 1 T one has

f

n 2πin ˆ f n n Z (1.23)

This not only shows that ˆ f n O n 1

but also that C 1 T H 1 T where H 1

is the Sobolev space defined in (1.21). If f possesses more derivatives, say k 2,

then we may iterate this relation to obtain decay of the form O n 2

. The following

exercise establishes the connection between rapid decay of the Fourier coefficients

and infinite regularity of the function.

Exercise 1.8. Let f L1

T .

Show that f C

T if and only if ˆ f decays rapidly, i.e., for every M 1

one has ˆ f n

O

n

M

as n

.

Show that f x F e x

where F is analytic on some neighborhood of

z 1 if and only if ˆ f n decays exponentially, i.e., ˆ f n O e

ε

n

as n for some ε 0.

As far as less than one derivative is concerned, i.e., for H older continuous

functions, one has the following facts. Starting from

ˆ f n

T

e n y

1

2n f y dy

T

e ny f

y

1

2n

dy

whence

ˆ f n

1

2

T

e ny

f y f y

1

2n

dy

which implies that if f C α T , then

ˆ f n O n

α as n (1.24)

Note that (1.23) is strictly stronger than this bound, since it allows for square sum-

mation and thus the embedding into H 1 T , whereas (1.24) does not. However,

since we already observed that C 1 T H 1 T and since clearly C 0 T

H 0 T

L2 T

, one can invoke some interpolation machinery at this point to

conclude that C α T H α T for all 0 α 1; however, we shall not make use

of this fact.




Next, we ask how much regularity on f it takes so that

n

ˆ f

n

(1.25)

In other words, we ask for a sufficient condition for f to lie in the Wiener algebra

A T of Exercise 1.2. There are a number of ways to interpret the requirement of

“regularity”. It is easy to settle this imbedding question on the level of the spaces

H s T

. In fact, applying Cauchy-Schwarz yields

n0

ˆ f n

n0

ˆ f n

2 n

1 ε

1

2

n0

n

1 ε

1

2

so that

n

ˆ f n

2 n

1 ε

(1.26)

is a sufficient condition for (1.25) to hold. In other words, we have shown that

H s T

A T for any s

12

. We leave it to the reader to check that this fails for

s

12

.

Next, we would like to address the more challenging question as to the minimal

α 0 so that C α T A T . In fact, we first ask which α have the property that

C α T H s T for some s

12

. We already observed that α

12

suffices for

this, but it required some interpolation theory.

Instead, we prefer to give a direct proof of this fact in the following theorem.

It introduces an important idea that we shall see repeatedly throughout this book,

namely the grouping of the Fourier coefficients ˆ f n into blocks of the same size

of n; more precisely, we introduce the partial sums

P j f

x

:

2 j

1 n

2 j

ˆ f n

e

nx

(1.27)

for all j 1.

Theorem 1.12. For any 1 α

0 one has C α T H β T for arbitrary

0 β α. In particular, for any f C α T with α

12

one has

n Z

ˆ f n

and thus C α T A T for any α

12

.

Proof. Let f α

1. We claim that for every j 0,

2 j n 2 j

1

ˆ f n

2 C 2 2 jα (1.28)

The reader should note the similarity between this bound and the estimate (1.24),

with (1.28) being of course strictly stronger. What allows for this improvement is




the use of orthogonality or in other words, the Plancherel theorem. If (1.28) is true,

then for any β α,

n0

ˆ f n

2 n

2 β C

j 0

22 j β

2 j

n

2 j

1

ˆ f n

2 C

j 0

2 2 j

α β

To prove (1.28) we choose a kernel ϕ j so that

ϕ j n 1 if 2 j

n 2 j

1 (1.29)

and

ϕ j n 0 if

n 2 j or

n 2 j 1 (1.30)

where and

mean “much smaller” and “much bigger”, respectively. The point

is of course that (1.29) implies that

2 j n 2 j 1

ˆ f n

2 ϕ j f

22 (1.31)

so that it remains to bound the right-hand side for which (1.30) will be decisive

— at least implicitly. The explicit properties of ϕ j that will allow us to bound the

right-hand side are cancellation meaning that ϕ j 0 0 and bounds on the kernel

ϕ j x

which resemble those of the Fejer kernel K N with N

2 j.

There are various ways to construct ϕ j. We use de la Vallee Poussin’s kernel

from above for this purpose. Set

ϕ j x :

V 2 j

1 x e 3

2 j 1

1

x e 3

2 j 1

1

x (1.32)

We leave it to the reader to check that

ϕ j n 1 for 2 j

n 2 j 1

and that ϕ j 0 0 which is the same as

T

ϕ j x dx 0

Moreover, since the ϕ j are constructed from Fejer kernels one has the bounds

ϕ j x

C 2 j min

22 j , x

2

Therefore,

ϕ j f x

T

ϕ j y f x y f x dy

T

ϕ j y f x y f x dy

C

1

0

ϕ j y

y

α dy

C 2 j

y 2 j

y

α 2 dy C 2 j

y 2 j

y

α dy

C 2 α j




as claimed. In summary, we see that for any 0 β α 1

f H β C α, β f C α

and the theorem follows.

Next, we show that Theorem 1.12 is optimal up to the fact that one can take

α β in the first statement.

Exercise 1.9. Consider lacunary series of the form

f x :

k 1

k 22 αk e

2k x

to show that C α T does not embed into H β for any β α.

The second statement of Theorem 1.12 concerning A T is more difficult.

Proposition 1.13. There is no embedding from C 1

2

T into A

T . In fact, there

exists a function f C

12

T A T .

Proof. We claim that there exists a sequence of trigonometric polynomials

Pn x

2n

1

0 an, e x such that with N 2n,

Pn

N

Pn 1 N

(1.33)

for each n 1. Here a

b means C 1a

b

Ca where C is an absolute

constant, and Pn refers to the sequence of Fourier coefficients.

Assuming such a sequence for now, we set

T n x : 2

ne 2n x Pn x

f :

n 1

T n

Note that this series converges uniformly to f C T since T n

C 2

n2 .

Moreover, the Fourier supports of the T n are pairwise disjoint for distinct n. Thus,

f A T

and f A T . Finally, let h 2 m for some positive integer m.

Then

f x h f x

m

n 1

T n x h T n x

n m

T n

m

n 1

C h T

n

n m

C 2 n2

C

m

n 1

2n2

h 2

m2

C h

12

and thus f C 12

T as desired. To pass to the last line, we used Bernstein’s

inequality, i.e., Proposition 1.11.




It thus remains to establish the existence of polynomials as in (1.33). Define

them inductively by P0 x Q0 x 1 and

Pn 1 x Pn x e 2n x Qn x

Qn 1 x Pn x e

2n x Qn x

for each n 0. Since

Pn 1 x

2 Qn 1 x

2 2 Pn x

2 Qn x

2 2n

1

we see that Pn 1

is of the desired size. Furthermore, all coefficients of both Pn

and Qn are either 1 or

1, and the only exponentials with nonzero coefficients are

e x with 1 2n 1. Hence Pn

n 0 is the desired family of polynomials.

Exercise 1.10. Show that any trigonometric polynomial P with

P 1 N

and the property that the cardinality of its Fourier support is at most N satisfies

N

P 2

P

N . Hence, the polynomials constructed in the previous

proof are “extremal” for the lower bound on P

, whereas the Dirichlet kernel

(or Fejer kernel) are extremal for the upper bound.

We conclude this chapter with a brief discussion of Fourier series associated

with functions on Td R

d Z

d with d 2. The exponential basis in this case is

e

ν x

ν Zd , and this is an orthonormal family in the usual L2

Td

sense. Thus,

to every measure µ M Td

we associate a Fourier series

ν Zd

ˆ µ ν e ν x , ˆ µ ν :

Td

e ν x µ dx

As before, a special role is played by the trigonometric polynomials

ν Zd

aν e ν x

where all but finitely many aν vanish. In contrast to the one-dimensional torus, in

higher dimensions we face a very nontrivial ambiguity in the definition of partial

sums. In fact, we can pose the convergence problem relative to any exhaustion of

Zd by finite sets Ak , which are increasing and whose union is all of Zd . Possible

choices here are of course the squares k , k d , but one can also take more general

rectangles, or balls relative to the Euclidean metric or other shapes. Even though

this distinction may seem innocuous, it gave rise to very important developments

in harmonic such as Feff erman’s ball multiplier theorem, and the still unresolved

Bochner-Riesz conjecture. We will return to these matters in a later chapter.

For now, we merely content ourselves with some very basic results.

Proposition 1.14. The space of trigonometric polynomials is dense in C Td

,

and one has the Plancherel theorem for L2 T

d , i.e.,

f

22

ν Zd

ˆ f ν

2 f L2

Td




If f C Td

, then the Fourier series associated to f converges uniformly to f

irrespective of the way in which the partial sums are formed.

Proof. We base the proof on the fact that the products of Fejer kernels with

respect to the individual coordinate axes form an approximate identity. In other

words, the family

K N ,d x :

d

j 1

K N x j N 1

forms an approximate identity on Td . Here x x1, . . . , xd . This follows imme-

diately from the one-dimensional analysis above. Hence, for any f C Td one

has

K N ,d f f

0 as N

By inspection, K N ,d f is a trigonometric polynomial which implies the claimed

density. This implies all assertions about the L2

case.Suppose that f

C

Td

. Repeated integration by parts yields the estimate

ˆ f ν O ν

m

as ν

for arbitrary but fixed m 1. The convergence

statement now follows from K N ,d f f in C Td

as N , and the rapid

decay of the coefficients.

A useful corollary to the previous result is the density of tensor functions in

C Td

. A tensor function on Td is a linear combination of functions of the form d

j 1 f j x j where f j C T

. In particular, trigonometric polynomials are tensor

functions, whence the claim.

Notes

An encyclopedic treatment of Fourier series is the classic reference by Zygmund [ 58].

A less formidable but still comprehensive account of many classical results is Katznel-

son [30]. A standard reference for interpolation theory is the book by Bergh and Lofstrom [4].

For the construction in Proposition 1.13, see [30, page 36, Exercise 6.6]. For gap series

see the timeless paper by Rudin [41] which introduced the Λ p-set problem, later solved by

Bourgain [5].

Problems

Problem 1.1. Suppose that f L1 T and that S n f

n 1

(the sequence of partial sums

of the Fourier series) converges in L p T to g for some p 1, and some g L p. Prove

that f g. If p conclude that f is continuous.

Problem 1.2. Let T x

N n

0 an cos 2πnx bn sin 2πnx be an arbitrary trigono-

metric polynomial with real coefficients a0, . . . , a N , b0, . . . , b N . Show that there is a poly-

nomial P z

2 N 0 c z

C z so that

T x e 2πiN x P e2πix

and such that P z z2 N P ¯ z

1 . How are the zeros of P distributed in the complex plane?




Problem 1.3. Suppose T x

N n

0

an cos 2πnx bn sin 2πnx

is such that T 0

everywhere on T and an, bn R for all n 0, 1, . . . , N . Show that there are c0, . . . , c N C

such that

T x

N

n 0

cne2πinx

2

(1.34)

Find the cn for the Fejer kernel.

Problem 1.4. Suppose that T x a0

H h

1 ah cos 2πhx satisfies T x 0 for all

x T and T 0 1. Show that for any complex numbers y1, y2, . . . , y N ,

N

n

1

yn

2 N H

a0

N

n

1

yn

2

H

h 1

ah

N

h

n

1

yn

h ¯ yn

Hint: Write (1.34) with

n cn 1. The apply Cauchy-Schwartz to

n yn

m,n yn cm.

Problem 1.5. Suppose

n 1 n an

2 and

1 an is Cesaro summable. Show

that

n 1 an converges. Use this to prove that any f C T H

12

T satisfies S n f f uniformly. Note that H

12

T does not embed intoA T , so this convergence does not follow

by trivial means.

Problem 1.6. Show that there exists an absolute constant C so that

C 1

n0

n

ˆ f n

2

T2

f x f y

2

sin2 π x y

dxdy C

n0

n

ˆ f n

2

for any f H 12

T .

Problem 1.7. Use the previous two problems to prove the following theorem of Pal-

Bohr: For any real function f C T there exists a homeomorphism ϕ : T T such

that

S n

f

ϕ

f

ϕuniformly. Hint: Without loss of generality let f 0. Consider the domain defined in

terms of polar coordinates by means of r θ f θ 2π . Then apply the Riemann mapping

theorem to the unit disc.

Problem 1.8. Show that

f g

2 L2

T

f f L2

T

g g L2

T

for all f , g L2 T .

Problem 1.9. Given N disjoint arcs I α

N α

1 T, set f

N α 1 χ I α . Show that

ν

k

ˆ f ν

2 N

k .

Problem 1.10. Given any function ψ : Z

R so that ψ n 0 as n , show

that you can find a measurable set E T for which

limsupn

χ E n

ψ n

.

Problem 1.11. The problem discusses some well-known partial diff erential equations

on the level of Fourier series.




Solve the heat equation ut uθθ 0, u 0 u0 (the data at time t 0) on T

via Fourier series. Show that if u0 L2 T , then u t , θ is analytic in θ for every

t 0 and solves the heat equation. Prove that u t u0

2 0 as t 0. Write

u t Gt u0 and show that G t is an approximate identity for t 0. Conclude

that u t 0 as t 0 in the L p or C T sense. Repeat for higher-dimensional

tori.

Solve the Schrodinger equation iut uθθ 0, u 0 u0 on T with u0 L2 T .

In what sense can you say that this Fourier series “solves” the equation? Show

that u t 2 u0 2 for all t . Discuss the limit u t as t 0. Repeat for

higher-dimensional tori.

Solve the wave equation utt uθθ 0 on T by Fourier series. Discuss the Cauchy

problem as in the previous items. Show that if ut 0 0, then with u 0 f ,

u t , θ

1

2 f θ t f θ t



CHAPTER 2

Harmonic Functions on D and the Poisson Kernel

There is a close connection between Fourier series and analytic or harmonic

functions on the disc D : z C z

1 . In fact, at least formally, Fourier

series can be viewed as the “boundary values” of a Laurent series

n

an zn (2.1)

which can be seen by setting z x iy e θ . Alternatively, suppose we are

given a continuous function f on T and wish to find the harmonic extension u of f

into D, i.e., a solution tou

0 in D

u f on D T

(2.2)

“Solution” here refers to a classical one, i.e., a function f C 2 D C D . How-

ever, as we shall see it is also very important to investigate other notions of solu-

tions of (2.2) with rougher f .

Note that we cannot use negative powers of z as in (2.1) in order to write an

ansatz for u. However, we can use complex conjugates instead. Indeed, since

zn

0 and ¯ zn

0 for every integer n

0, we are lead to defining

u z

n 0

ˆ f n

zn

1

n

ˆ f n

¯ z n (2.3)

which at least formally satisfies u e θ

n

ˆ f n e nθ f θ . Inserting

z re θ and

ˆ f n

T

e

nϕ f

ϕ d ϕ

into (2.3) yields

u re

θ

T

n Z

r n e n

θ ϕ f

ϕ d ϕ

This resembles the derivation of the Dirichlet kernel in the previous chapter, andwe now ask the reader to find a closed form for the sum.

Exercise 2.1. Check that, for 0 r

1,

Pr θ :

n Z

r n e nθ

1 r 2

1 2r cos 2πθ r 2

This is the Poisson kernel.

25



26 2. HARMONIC FUNCTIONS ON D AND THE POISSON KERNEL

Based on our formal calculation above, we therefore expect to obtain the har-

monic extension of a “nice enough” function f on T by means of the convolution

u re θ

T

Pr θ ϕ f ϕ dy Pr f θ

for 0 r

1.

Note that Pr θ , for 0 r 1, is a harmonic function of the variables x iy

re θ . Moreover, for any finite measure µ M T the expression

Pr µ θ is not

only well-defined, but defines a harmonic function on D.

The remainder of this chapter will therefore be devoted to analyzing the bound-

ary behavior of Pr µ. Clearly, Proposition 1.5 will play an important role in this

investigation, but the fact that we are dealing with harmonic functions will enter in

a crucial way (such as through the maximum principle).

In the following, we use the notion of approximate identity in a more general

form than in the previous chapter. However, the reader will have no difficultytransferring this notion to the present context, including Proposition 1.5.

Exercise 2.2. Check that Pr 0 r 1 is an approximate identity. The role of

N Z in Definition 1.3 is played here by 0

r 1 and N

is replaced with

r 1.

An important role is played by the kernel Qr θ which is the harmonic conju-

gate of Pr θ . Recall that this means that Pr θ iQr θ is analytic in z re θ

and Q0 0. In this case it is easy to find Qr θ since

Pr θ Re

1 z

1 z

and therefore

Qr θ Im

1 z

1 z

2r sin 2πθ

1 2r cos

2πθ r 2

.

Exercise 2.3.

(1) Show that Qr 0 r 1 is not an approximate identity.

(2) Check that Q1 θ cot

πθ . Draw the graph of Q1 θ . What is the

asymptotic behavior of Q1 θ for θ close to zero?

We will study conjugate harmonic functions later. First, we clarify in what

sense the harmonic extension Pr f of f attains f as its boundary values.

Definition 2.1. For any 1 p

define

h p D

:

u : D C harmonic

sup

0 r 1

T

u re θ

p d θ

These are the “little” Hardy spaces with norm

u

p :

sup0 r 1

u

re

L p T



2. HARMONIC FUNCTIONS ON D AND THE POISSON KERNEL 27

It is important to observe that Pr θ h1 D . This function has “boundary

values” δ0 (the Dirac mass at θ 0

since Pr Pr δ0.

Theorem 2.2. There is a one-to-one correspondence between h1 D and M T ,

given by µ M T F r θ : Pr µ θ . Furthermore,

µ sup

0 r 1

F r 1

limr 1

F r 1 , (2.4)

and

(1) µ is absolutely continuous with respect to Lebesgue measure µ

d θ

if and only if F r

converges in L1 T

. If so, then d µ f d θ where

f L1-limit of F r .

(2) The following are equivalent for 1 p : d µ f d θ with f L p T

F r 0 r 1 is L p- bounded

F r converges in L p if 1 p and in the weak- sense in L

if

p

as r

1

(3) f is continuous

F extends to a continuous function on D F r con-

verges uniformly as r 1

.

This theorem identifies h1 D

with M T

, and h p D

with L p T

for 1 p

. Moreover, h

D contains the subclass of harmonic functions that can be ex-

tended continuously onto D; this subclass is the same as C T . Before proving

the theorem we present two simple lemmas. In what follows we use the notation

F r θ : F

re θ

.

Lemma

2.3.a) If F

C

D and F

0 in D , then F r

Pr F 1 for any 0

r

1.

b) If F 0 in D , then F rs Pr F s for any 0

r , s 1.

c) As a function of r 0, 1 the norms F r p are non-decreasing for any

1 p .

Proof.

a) Let u re θ : Pr F 1 θ for any 0 r 1, θ . Then u 0 in D.

By Proposition 1.5 and Exercise 2.2, ur

F 1

0 as r

1. Hence,

u extends to a continuous function on D with the same boundary values

as F . By the maximum principle, u F as claimed.

b) Rescaling the disc sD to D reduces b) to a).c) By b) and Young’s inequality

F rs p Pr 1 F s p F s p

as claimed.

Lemma 2.4. Let F h1

D . Then there exists a unique measure µ

M T

such that F r Pr µ.




Proof. Since the unit ball of M T is weak- compact there exists a subse-

quence r j 1 with F r j µ in weak-

sense to some µ M T

. Then, for any

0

r

1, Pr µ lim

j

F r j Pr lim

j

F rr j F r

by Lemma 2.3, b). Let f C

T . Then

F r , f

Pr µ, f

µ, Pr f

µ, f

as r 1 where we again use Proposition 1.5. This shows that in the weak-

sense

µ limr 1

F r (2.5)

which implies uniqueness of µ.

Proof of Theorem 2.2. If µ M T , then Pr µ h1

D . Conversely, given

F h1 D

then by Lemma 2.4 there is a unique µ so that F r Pr µ. This gives

the one-to-one correspondence. Moreover, (2.5) and Lemma 2.3 c) show that

µ

lim supr 1

F r

1

sup0 r 1

F r

1

limr 1

F r

1

Since clearly also

sup0 r 1

F r 1 sup

0 r 1

Pr 1 µ µ

(2.4) follows. If f L1 T

and µ d θ f θ d θ , then Proposition 1.5 shows that

F r f in L1 T

. Conversely, if F r f in the sense of L1 T

, then because of

(2.5) necessarily µ d θ f θ d θ which proves 1 . 2 and 3 are equally easy

and we skip the details (simply invoke Proposition 1.5 again).

Next, we turn to the issue of almost everywhere convergence of Pr f to f as

r 1. By the previous theorem, we have pointwise convergence if f is continuous.

If f L1 T

, then we write f limn gn as a limit in L1 with gn C T

. But then

we face two limits, namely then one as n and that as r 1 , and wewish to interchange them. It is a common feature that such an interchange requires

some form of uniform control. To be specific, we will use the Hardy-Littlewood

maximal function M f associated to f in order to dominate the convolution with the

Poisson kernel Pr f uniformly in 0 r

1. This turns out to be an instance of

the general fact that radially bounded approximate identities are dominated by the

Hardy-Littlewood maximal function, see condition A4) below.

The Hardy-Littlewood maximal function is defined as

M f θ sup

θ I T

1

I

I

f ϕ d ϕ

where I T is an open interval and

I

denotes the length of I . The basic fact about

M f that we shall need in this context is given by the following result. We say that

g belongs to weak- L1 if

θ T g

θ λ C λ 1

g

1

for all λ 0. Henceforth,

applied to sets denotes Lebesgue measure. Note that

any g L1 belongs to weak- L1. This is an instance of Markov’s inequality

θ T f

θ λ λ p

f

p p




for any 1 p and any λ 0.

Proposition 2.5. The Hardy-Littlewood maximal function satisfies the follow-

ing bounds:

a) M is bounded from L1 to weak-L1 , i.e.,

θ T M f θ λ

3

λ f 1

for all λ 0.

b) For any 1 p , M is bounded on L p.

Proof. Fix some λ 0 and any compact

K θ T

M f θ λ

(2.6)

There exists a finite cover I j

N j 1

of T by open arcs I j such that

I j

f ϕ d ϕ λ I j (2.7)

for each j. We now apply Wiener’s covering lemma to pass to a more convenient

sub-cover: Select an arc of maximal length from I j

; call it J 1. Observe that any

I j such that I j J 1 satisfies I j 3 J 1 where 3 J 1 is the arc with the same

center as J 1 and three times the length (if 3 J 1 has length larger than 1, then set

3 J 1 T). Now remove all arcs from

I j

N j

1 that intersect J 1. Let J 2 be one of

the remaining ones with maximal length. Continuing in this fashion we obtain arcs

J

L

1 which are pair-wise disjoint and so that

N

j 1

I j

L

1

3 J

In view of (2.6) and (2.7) therefore

K

L

1

3 J

3

L

1

J

3

λ

L

1

J

f ϕ d ϕ

3

λ f 1

as claimed.

To prove part b), one interpolates the bound from a) with the trivial L bound

M f

f

by means of Marcinkiewicz’s interpolation theorem.

We remark that the same argument based on Wiener’s covering lemma yields

the corresponding statement for M µ when µ M T . In other words, if we define

M µ θ

supθ

I T

µ I

I




then for all λ 0

θ T M µ θ λ

3

λ µ

where µ

is the total variation of µ.

Exercise 2.4. Find within a multiplicative constant M δ0, where δ0 is the Dirac

measure at 0. Use this to prove that the weak- L1 bound on M µ cannot be improved

in general, even when µ is absolutely continuous.

For a considerable strengthening of the previous exercise, the reader should

consult Problem 3.4 in the next chapter. We now introduce the class of approximate

identities which are dominated by the maximal function.

Definition 2.6. Let Φn

n 1 be an approximate identity as in Definition 1.3.

We say that it is radially bounded if there exist functions Ψn

n 1 on T so that the

following additional property holds:

A4) Φn Ψn, Ψn is even and decreasing, i.e., Ψn θ Ψn ϕ for 0 ϕ

θ

12

, for all n 1. Finally, we require that supn Ψn 1

.

All basic examples of approximate identities which we have encountered so

far (Fejer, Poisson, and box kernels) are of this type. The following lemma estab-

lishes the aforementioned uniform control of the convolution with radially bounded

approximate identities in terms of the Hardy-Littlewood maximal function.

Lemma 2.7. If Φn

n 1

satisfies A4 , then for any f L1

T one has

supn

Φn f θ supn

Ψn 1 M f θ

for all θ T.

Proof. It clearly suffices to show the following statement: let

K :

1

2,

1

2 R

0

be even and decreasing. Then for any f L1

T we have

K

f

θ K

1 M f θ (2.8)

Indeed, assume that (2.8) holds. Then

supn

Φn f θ supn

Ψn f θ supn

Ψn 1 M f θ

and the lemma follows. The idea behind (2.8) is to show that K can be written as

an average of box kernels, i.e., for some positive measure µ

K θ

12

0

χ

ϕ,ϕ

θ µ d ϕ

(2.9)

We leave it to the reader to check that d µ dK

K

12

δ 12

is a suitable choice.

Notice that (2.9) implies that 1

0

K θ d θ

12

0

2ϕ d µ ϕ




Moreover, by (2.9),

K f θ

12

0

1

2ϕ χ ϕ,ϕ

f

θ 2ϕ µ d ϕ

M f θ

12

0

2ϕ µ d ϕ

M f θ K 1

which is (2.8).

The following theorem establishes the main almost everywhere convergence

result of this chapter. The reader should note how the maximal function enters

precisely in order to interchange the aforementioned double limit.

Theorem 2.8. If

Φn

n 1 satisfies A1)–A4), then for any f

L1

T

one hasΦn f f almost everywhere as n .

Proof. Pick ε 0 and let g

C

T with

f

g

1 ε. By Proposition 1.5,

with h f g one has

θ T limsupn

Φn f θ f θ

ε

θ T limsupn

Φn h θ

ε 2 θ T h θ

ε 2

θ T sup

n Φn

h θ

ε 2

θ T h

θ

ε 2

θ T C Mh

θ

ε 2

θ T h

θ

ε 2

C

ε

To pass to the final inequality we used Proposition 2.5 as well as Markov’s inequal-

ity (recall h

1 ε).

As a corollary we not only obtain the classical Lebesgue diff erentiation theo-

rem (by considering the box kernel), but also almost everywhere convergence of

the Cesaro means σ N f (via the Fejer kernel), as well as of the Poisson integrals

Pr f to f for any f L1 T . A theorem of Kolmogoroff states that this fails for

the partial sums S N f on L1 T . We will present this example in Chapter 5.

Exercise 2.5. It is natural to ask whether there is an analogue of Theorem 2.8

for measures µ M T . Prove the following:

a) If µ M T is a positive measure which is singular with respect to

Lebesgue measure m (in symbols, µ m), then for a.e. θ T with respect

to Lebesgue measure we have

µ θ ε, θ ε

2ε 0 as ε 0




b) Let Φn

n 1 satisfy A1)–A4), and assume that the Ψn

n 1 from Defini-

tion 2.1 also satisfy

supδ

θ

12

Ψn θ 0 as n

for all δ 0. Under these assumptions show that for any µ

M T

Φn µ f a.e. as n

where µ d θ f θ d θ νs d θ is the Lebesgue decomposition, i.e.,

f L1 T and νs m.

Notes

Standard references on everything here are Hoff man [25], Garnett [20], and Koosis [34].

Problems

Problem 2.1. Let X , µ be a general measure space. We say a sequence f n

n 1

L1 µ is uniformly integrable if for every ε 0 there exists δ 0 such that

µ E δ supn

E

f n du

ε

Suppose µ is a finite measure. Let φ : 0, 0, be a continuous increasing function

with limt

φ

t

t . Prove that

supn

φ f n x µ dx

implies that f n is uniformly integrable.

Problem 2.2. Suppose f n

n 1

is a sequence in L1 T . Show that there is a subse-

quence f n j

j 1

and a measure µ with f n j

σ

µ provided supn f n 1 . Here σ is

the weak-star convergence of measures. Show that in general µ L1 T . However, if we

assume that, in addition, f n

1 is uniformly integrable, then µ d θ f θ d θ for some

f L1 T . Can we conclude anything about strong convergence (i.e., in the L1-norm) of

f n ? Consider the analogous question on L p T , p 1.

Problem 2.3. Let µ be a positive finite Borel measure on Rd or Td . Set

M µ x : supr

0

µ B x, r

m B x, r

where m is the Lebesgue measure. This problem examines the behavior of M µ when µ is asingular measure, cf. Problem 3.4 for more on this case.

(1) Show that µ m implies µ x M µ x 0

(2) Show that if µ m, then

limsupr

0

µ B x, r

m B x, r

for µ a.e. x. Also show that this limit vanishes m-a.e.




Problem 2.4. For any f L1 R

d and 1 k d let

M k f x supr 0

r k

B x,r

f y dy

Show that for every λ 0

m L x L M k f x λ

C

λ f L1

Rd

where L is an arbitrary affine k -dimensional subspace and “m L” stands for Lebesgue mea-

sure (i.e., k -dimensional measure) on this subspace. C is an absolute constant.

Problem 2.5. Prove the Besicovitch covering lemma on T: Suppose I j are finitely

many arcs with I j 1. Then there is a sub-collection I jk such that the following

properties hold:

(1) k I jk j I j

(2) No point belongs to more than C of the I jk ’s where C is an absolute constant.

What is the best value of C ?What can you say about higher dimensions?

Problem 2.6. Let F 0 be a harmonic function onD. Show that there exists a positive

measure µ on T with F re θ Pr µ θ for 0 r 1 and with µ F 0 .

Problem 2.7. A sequence of complex numbers an n Z is called positive definite if

an a

n for all n Z

n,k an

k zn zk 0 for all complex sequences zn n Z

Show that any positive definite sequence satisfies an a0 for all n Z. Show that if

µ is a positive measure, then an

T e nθ µ d θ is such a sequence. Now prove that

every positive definite sequence is of this form. Hint: apply the previous problem to the

harmonic function

n an r ne nθ .



CHAPTER 3

Analytic h1 D

Functions, F. & M. Riesz Theorem

In the previous chapter, we mainly dealt with functions harmonic in the disk

satisfying various boundedness properties. We now turn to functions F u iv

h1 D which are analytic in D. This is well-defined since analytic functions are

complex-valued harmonic ones. These functions form the class H1 D , the “big”

Hardy space. We have shown there that F r Pr µ for some µ

M T

. It

is important to note that by analyticity ˆ µ

n

0 if n

0. A theorem by F. &M. Riesz asserts that such measures are absolutely continuous. From the example

F z :

1 z

1 z

Pr θ iQr θ , z re θ

one sees an important diff erence between the analytic and the harmonic cases. In-

deed, while Pr h1

D , clearly F h1

D . The boundary measure associated with

Pr is δ0, whereas F is not associated with any boundary measure in the sense of

the previous chapter.

An important technical device that will allow us to obtain more information in

the analytic case is given by subharmonic functions. Loosely speaking, this device

will allow us to exploit algebraic properties of analytic functions that harmonic

ones do not have (such as the fact that F 2

is again analytic if F is analytic, whichdoes not hold in the harmonic category).

Definition 3.1. Let Ω R2 be a region (i.e., open and connected) and let

f : Ω R where we extend the topology to R in the obvious

way. We say that f is subharmonic if

(1) f is continuous

(2) for all z Ω there exists r z 0 so that

f z

1

0

f

z re θ

d θ

for all 0 r r z. We refer to this as the (local) “submean value prop-

erty”.

It is helpful to keep in mind that in one dimension “harmonic = linear” and

“subharmonic = convex”. Subharmonic functions derive their name from the fact

that they lie below harmonic ones, in the same way that convex functions lie be-

low linear ones. We will make this precise by means of the important maximum

principle which subharmonic functions obey. We begin by deriving some basic

properties of this class.

35



36 3. ANALYTIC h1 D

FUNCTIONS, F. & M. RIESZ THEOREM

Lemma 3.2. Subharmonic functions satisfy the following properties:

1) If f and g are subharmonic, then f g :

max

f , g

is subharmonic.

2) If f C 2 Ω then f is subharmonic f 0 in Ω3) F analytic

log F and F

α with α 0 are subharmonic

4) If f is subharmonic and ϕ is increasing and convex, then ϕ f is subhar-

monic we set ϕ :

lim x

ϕ x .

Proof. 1) is immediate. For 2) use Jensen’s formula

T

f z re θ d θ f z

D z,r

log r

w z

f w m dw (3.1)

where m stands for two-dimensional Lebesgue measure and

D z, r

w

C w

z

r

The reader is asked to verify this formula in the following exercise. If f 0, then

the submean value property holds. If f z0 0, then let r 0 0 be sufficiently

small so that f 0 on D z0, r 0 Since log r 0 w z0

0 on this disk, Jensen’s

formula implies that the submean value property fails. Next, we verify 4) by means

of Jensen’s inequality for convex functions:

ϕ f z ϕ

T

f z re θ d θ

T

ϕ f z re θ d θ

The first inequality sign uses that ϕ is increasing, whereas the second uses convex-

ity of ϕ (this second inequality is called Jensen’s inequality). If F is analytic, then

log F

is continuous with values in R

. If F z0 0, then log

F

z

is

harmonic on some disk D z0, r 0 . Thus, one has the stronger mean value property

on this disk. If F z0 0, then log F z0 , and there is nothing to prove.

To see that F α is subharmonic, apply 4) to log F z with ϕ x exp α x .

Exercise 3.1. Prove Jensen’s formula (3.1) for C 2 functions.

Now we can derive the aforementioned domination of subharmonic functions

by harmonic ones.

Lemma 3.3. Let Ω be a bounded region. Suppose f is subharmonic on Ω ,

f C Ω and let u be harmonic onΩ , u C

Ω . If f u on Ω , then f u on

Ω.

Proof. We may take u 0, so f 0 on Ω. Let M maxΩ f and assume

that M 0. Set

S z

Ω f z M

Then S Ω and S is closed in Ω. If z S , then by the submean value property

there exists r z 0 so that D z, r z Ω. Hence S is also open. Since Ω is assumed

to be connected, one obtains S Ω. This is a contradiction.

The following lemma shows that the submean value property holds on any disk

in Ω. The point here is that we upgrade the local submean value property to a true

submean value property using the largest possible disks.



3. ANALYTIC h1 D

FUNCTIONS, F. & M. RIESZ THEOREM 37

Lemma 3.4. Let f be subharmonic in Ω , z0 Ω , D z0, r Ω. Then

f

z0

T f

z0

re

θ

d θ

Proof. Let gn max

f ,

n , where n

1. Without loss of generality z0

0.

Define un z to be the harmonic extension of gn restricted to

D z0, r where r 0

is as in the statement of the lemma. By the previous lemma,

f 0 gn 0 un 0

T

un re θ d θ

the last equality being the mean value property of harmonic functions. Since

max z r

un z max

z r f z

the monotone convergence theorem for decreasing sequences yields

f 0

T

f re θ d θ

as claimed.

Corollary 3.5. If g is subharmonic on D , then for all θ

g rse θ Pr gs θ

for any 0 r , s 1.

Proof. If g

everywhere on D, then this follows from Lemma 3.3. If

not, then set gn g n. Thus

g rse

θ gn

rse θ

Pr gn s θ

and consequently

g rse

θ limsup

n

Pr

gn s θ Pr

gs θ

where the final inequality follows from Fatou’s lemma (which can be applied in the

“reverse form” here since the gn have a uniform upper bound).

Note that if g s L1 T

, then g on D

0, s and so g

on D 0, 1

.

We now introduce the radial maximal function associated with any function on thedisk. It, and the more general nontangential maximal function where the supremum

is taken over a cone in D, are of central importance in the analysis of this chapter.

Definition 3.6. Let F be any complex-valued function on D then F : T R

is defined as

F

θ sup

0 r 1

F

re

θ



38 3. ANALYTIC h1 D


We showed in the previous chapter that any u h1 D satisfies u

C M µ

where µ is the boundary measure of u, i.e., ur Pr µ. In particular, one has

θ T u

θ λ C λ

1 u 1

We shall prove the analogous bound for subharmonic functions which are L1-

bounded.

Proposition 3.7. Suppose g is subharmonic on D, g 0 and g is L1-bounded,

i.e.,

g 1 : sup0 r 1

T

g re θ d θ

Then

(1) θ T g

θ λ

3λ

g 1 for λ

0.

(2) If g is L p bounded, i.e.,

g

p p :

sup

0 r 1

T

g re

θ

p d θ

with 1 p , then

g

L p T

C p g p

Proof.

(1) Let gr n µ M

T in the weak-

sense. Then

µ g

1 and

gs gr n s gr n Ps Ps µ

Thus, by Lemma 2.7,

g

sup

0 s 1

Ps µ M µ

and the desired bound now follows from Proposition 2.5.

(2) If g p

, then d µ

d θ L p

T with

d µ

d θ p g p and thus

g

C M d µ

d θ

L p T

by Proposition 2.5, as claimed.

We now present three versions of a theorem due to F. & M. Riesz. It is impor-tant to note that the following result fails without analyticity.

Theorem 3.8 (First Version of the F. & M. Riesz Theorem). Suppose F h1 D

is analytic. Then F L1

T .

Proof. F

12 is subharmonic and L2-bounded. By Proposition 3.7 therefore

F

12

L2

T . But

F

12

F

12 and thus F

L1

T .



3. ANALYTIC h1 D


Let F h1 D . By Theorem 2.2, F r Pr µ where µ M T has a Lebesgue

decomposition µ d θ f θ d θ νs d θ , νs singular and f L1 T . By Exer-

cise 2.5 b) one has Pr

µ

f a.e. as r

1. Thus, limr

1 F

re

θ

f

θ

existsfor a.e. θ T. This justifies the statement of the following theorem.

Theorem 3.8 (Second Version). Assume F h1

D and F analytic. Let f

θ

limr 1 F

re θ

. Then F r Pr f for all 0 r 1.

Proof. We have F r f a.e. and F r F

L1 by the previous theorem.

Therefore, F r f in L1 T

and Theorem 2.2 finishes the proof.

Theorem 3.8 (Third Version). Suppose µ M T , ˆ µ n 0 for all n 0.

Then µ is absolutely continuous with respect to Lebesgue measure.

Proof. Since ˆ µ n 0 for n Z

one has that F r Pr µ is analytic

on D. By the second version above and the remark preceding it, one concludes that

µ d θ f θ d θ with f limr 1

F

re θ

L1 T

, as claimed.

The logic of this argument shows that if µ m (where m is Lebesgue measure),

then the harmonic extension u µ of µ satisfies u

µ L1 T

. It is possible to give a

more quantitative version of this fact. Indeed, suppose that µ is a positive measure.

Then for some absolute constant C ,

C 1 M µ u

µ C M µ (3.2)

where the upper bound is Lemma 2.7 (applied to the Poisson kernel) and the lower

bound follows from the assumption µ 0 and the fact that the Poisson kernel

dominates the box kernel. Part a) of Problem 3.4 below therefore implies the quan-

titative non- L1 statement

θ T u

µ θ λ

C

λ µ

for µ m , where µ is a positive measure.

The F. & M. Riesz theorem raises the following question: Given f L1 T ,

how can one decide if

Pr f iQr f h1 D ?

We know that necessarily u

f Pr f

L1 T

. A theorem by Burkholder,

Gundy, and Silverstein states that this is also sufficient (they proved this for the

non-tangential maximal function). It is important to note the diff erence from (3.2),

i.e., that this is not the same as the Hardy-Littlewood maximal function satisfying

M f L1 T

due to possible cancellation in f . In fact, it is known that

M f L1 T f log 2 f L1

, see Problem 7.6.

We conclude this chapter with another theorem due to the Riesz brothers,

which can be seen as a generalization of the uniqueness theorem for analytic func-

tions. In fact, it says that if an analytic function F on the disk does not have too wild

growth as one approaches the boundary (expressed through the L1-boundedness



40 3. ANALYTIC h1 D


condition), then the boundary values are well-defined and cannot vanish on a set of

positive measure (unless F vanishes identically).

Theorem 3.9 (Second F. & M. Riesz Theorem). Let F be analytic on D and

L1-bounded, i.e., F h1

D . Assume F

0 and set f

limr 1

F r . Then

log f

L1

T . In particular, f does not vanish on a set of positive measure.

Proof. The idea is that if F 0 0, then

T

log f θ d θ log F 0

Since log

f f L1 T by Theorem 3.8, we should be done. Some care

needs to be taken, though, as F attains the boundary values f only in the almost

everywhere sense. This issue can easily be handled by means of Fatou’s lemma.

First, F

L1

T , so log

F r

F implies that log

f

L1

T by Lebesgue

dominated convergence. Second, by subharmonicity,

T

log F r θ d θ log F 0

so that

T

log f θ d θ

T

limr 1

log F r θ d θ

lim sup

r 1

T

log F r θ

d θ log

F

0

If F 0

0, then we are done. If F 0

0, then choose another point z0 D

for which F z0

0. Now one either repeats the previous argument with the

Poisson kernel instead of the submean value property, or one composes F with an

automorphism of the unit disk that moves 0 to z0. Then the previous argument

applies.

Theorem 3.9 generalizes the following fact from complex analysis, which is

proved by Schwarz reflection and Riemann mapping: Let F be analytic in Ω and

continuous up to Γ Ω which is an open Lipschitz arc. If F 0 on Γ, then

F 0 in Ω.

Notes

For all of this see Hoff

man’s classic [25], Garnett [20], and especially Koosis [34]. TheBurkholder-Gundy-Silverstein theorem, which is proved in [34], can be seen as a precursor

to the Feff erman-Stein grand maximal function which leads to the real-variable theory of

Hardy space, see [46]. There are some glaring omissions in this chapter such as inner and

outer functions. However, we shall not require that aspect of H p theory in this book.

Problems



3. ANALYTIC h1 D


Problem 3.1. Establish the maximum principle for subharmonic functions:

a) Show that if u is subharmonic on Ω and satisfies u z u z0 for all z Ω and

some fixed z0

Ω, then u is a constant.b) Show that if Ω is bounded and u is continuous on Ω, then u z max

Ω u for all

z Ω with equality being possibly only if u is a constant.

Problem 3.2. Let u be subharmonic on a domain Ω. Show that there exist a unique

measure µ on Ω such that µ K for every K Ω (i.e., K is a compact subset of Ω)

and so that

u z

log z ζ d µ ζ h z

where h is harmonic on Ω. This is called “Riesz’s representation of subharmonic func-

tions”.

Problem 3.3. With u and µ as in the previous exercise, show that

Tu

z

re

θ

d θ

u

z

r

0

µ D z, t

t dt

for all D z, r Ω (this generalizes “Jensen’s formula” (3.1) to functions which are not

twice continuously diff erentiable). Recover the classical Jensen’s formula from complex

analysis by setting u z : log f z where f is analytic.

Problem 3.4. This problem explores the behavior of the Hardy-Littlewood maximal

function of measures which are singular relative to Lebesgue measure.

a) Prove that if µ M T 0 satisfies µ m, then M µ L1. In fact, show that

θ

T M µ θ

λ

c

λ µ

provided λ µ with an absolute constant c 0.

b) Prove that there is a numerical constant C such that if µ M T is a positive

measure and F the associated harmonic function, then M µ CF

. Concludethat if µ is singular, then F L1.

Problem 3.5. Show that the class of analytic function F z in the unit disk with posi-

tive real part is in one-to-one relation with the class of functions of the form

F ζ

T

1 ζ e θ

1 ζ e θ µ d θ ic, ζ 1

where µ is a positive Borel measure on T and c is a real constant.



CHAPTER 4

The Conjugate Harmonic Function

While the previous two chapters focused on the Poisson kernel Pr , this chapter

is devoted to the study of the kernel Qr conjugate to Pr which we introduced in

Chapter 2. This turns out to have far-reaching consequences, for example it will

allow us to answer the question of L p-convergence of Fourier series with 1 p

which was raised in Chapter 1. We begin by recalling the definition of conjugate

function.

Definition 4.1. Let u be real-valued and harmonic in D. Then we define u to be

that unique real-valued and harmonic function in D for which u iu is analytic and

u 0 0. If u is complex-valued and harmonic, then we set u : Re u

i Im u

.

The following lemma presents some simple but useful properties of the har-

monic conjugate u.

Lemma 4.2. Let u be the harmonic conjugate as above.

1) If u is constant, then u 0.

2) If u is analytic in D and u 0 0 , then u iu. If u is co-analytic

(meaning that u is analytic), then u iu.

3) Any harmonic function u can be written uniquely as u c f g with

c constant, f , g analytic, and f

0

g

0

0.

Proof. 1) and 2) follow immediately from the definition, whereas 3) is given

by

u u 0

1

2 u u

0 iu

1

2 u u

0 iu

Uniqueness of c, f , g is also clear.

There is a simple relation between the Fourier coefficients of ur and that of

its harmonic conjugate. This relation will be the key to expressing the partial sum

operator for Fourier series in terms of the harmonic conjugate. Henceforth, we

shall occasionally use F to denote the Fourier transform for notational reasons.

Lemma 4.3. Suppose u is harmonic on D. Then for all n Z , n 0

F ur

n

i sign n

ur

n

(4.1)

Proof. By Lemma 4.2 it suffices to consider u constant, analytic, co-analytic.

We present the case u analytic, u 0 0. Then u iu so that F ur n

i ur

n for all u

Z. But ur n

0 for u

0 and thus (4.1) holds in this case.

43



44 4. THE CONJUGATE HARMONIC FUNCTION

Together with the Plancherel theorem, the previous lemma immediately allows

us to conclude the following relation between the L2-norms.

Proposition 4.4. Let u h2 D be real-valued. Then u h2

D . In fact,

ur

22

ur

22

u 0

2

Proof. As already mentioned, this follows from Lemma 4.3 and the Plancherel

theorem. Alternatively, by Cauchy’s theorem,

T

ur iur

2 θ d θ u iu

2

0 u2

0

Since the right-hand side is real-valued, the left-hand side is also necessarily real

and thus

u2 0

Tu2r θ d θ

Tu2r θ d θ

as claimed.

The previous L2-boundedness result allows us to determine the boundary val-

ues of the conjugates of h2 functions.

Corollary 4.5. If u h2

D , then limr 1 u

re

θ exists for a.e. θ T as an

L2 T

function, and the convergence is also in the sense of L2.

Proof. By Proposition 4.4 we know that u h2 D and the result of Chapter 2

lead to the desired conclusion.

We now consider the case of h1 D

. From the example of the Poisson kernel

and its conjugate we see that there is no hope to obtain a result as strong as in

the case of h2. However, we shall now show that the radial maximal function (as

defined in the previous chapter) of the conjugate to any h1 D -function is bounded

in weak- L1 T .

We emphasize, however, that this has nothing to do with the weak- L1 bound-

edness of the Hardy-Littlewood maximal function. In fact, it is not possible to

control the convolutions with Qr by means of that maximal function uniformly

in 0 r

1.

Nevertheless, the following theorem due to Besicovitch and Kolmogoroff shows

that the weak- L1

property still holds. The proof presented here is based on the no-tion of harmonic measure.

Theorem 4.6. Let u h1 D . Then

θ T u

θ λ

C

λ u 1

with some absolute constant C.



4. THE CONJUGATE HARMONIC FUNCTION 45

Proof. By Theorem 2.2, ur Pr µ. Splitting µ into real and imaginary parts,

and then each piece into its positive and negative parts, we reduce ourselves to the

case u

0. Let E λ θ T u

θ λ

and set F u iu. Then F is analytic and F

0 iu

0 . Define a function

ωλ x, y :

1

π

, λ λ,

y

x t 2 y2

dt

which is harmonic for y 0 and non negative. The following two properties of ωλ

will be important:

(1) ωλ x, y

12

if x λ

(2) ωλ 0, y

2 y

πλ.

For the first property it suffices to note that

0

1

π

y

x2 y2

dx

1

2 y

0

For the second property compute

ωλ 0, y

1

π

,

λ

λ,

y

t 2 y2 dt

2

π

λ y

dt

1 t 2

2 y

πλ,

as claimed.

Observe now that ωλ F is harmonic and that θ

E λ implies

ωλ F

re

θ

1

2

for some 0 r

1. Thus

E λ θ T ωλ F

θ 1 2

3

1 2

ωλ F 1 (4.2)

by Proposition 3.7. Since ωλ F 0, the mean value property implies that

ωλ F 1 ωλ F 0 ωλ iu 0

2

π

u 0

λ

2

π

u 1

λ.

Combining this with (4.2) yields

E λ

12

π

u

1

λ

and we are done.

Exercise 4.1. Show that ωλ x, y

equals an angle measured from z x

iy.

Which angle? Use this to show that ωλ x, y

12

provided x, y

lies outside the

semi-circle with radius λ and center 0. Furthermore, show that ωλ is the unique

harmonic function in the upper half plane which equals 1 on Γ : , λ

λ, , equals 0 on λ, λ R

Γ, and which is globally bounded. It is called the

harmonic measure of , λ λ,

, and can be defined in the same fashion

on general domains Ω and any open Γ

Ω (in fact, more general Γ). It turns out

that the harmonic measure of Γ relative to Ω equals the probability that Brownian




motion starting at z Ω hits the boundary for the first time at Γ rather than in Ω Γ.

Of particular importance is the conformal invariance of this notion, but we make

no use of this probabilistic connection in this book.

The following result introduces the Hilbert transform and establishes a weak-

L1 bound for it. Formally speaking, the Hilbert transform H µ of a measure µ

M T is defined by

µ u µ u µ

limr 1

u µ r

: H µ

i.e., the Hilbert transform of a function on T equals the boundary values of the

conjugate function of its harmonic extension. By Corollary 4.5 this is well defined

if µ d θ f θ d θ , f L2 T . We now consider the case f L1

T .

Corollary 4.7. Given u h1

D the limit limr 1 u

re

θ exists for a.e. θ .

With u Pr µ , µ

M T

, this limit is denoted by H µ. There is the weak-L1

bound

θ T H µ θ λ

C

λ µ

Proof. If µ d θ

f θ d θ with f

L2

T, then limr 1

u f re

θ exists for a.e. θ

by Corollary 4.5. If f L1 T and ε 0, then let g L2

T such that f g 1 ε.

Denote, for any δ 0,

E δ : θ T

lim supr ,s 1

u f

re θ

u f se

θ δ

and

F δ : θ T limsup

r ,s 1

uh re θ

uh se θ δ

where h f g. In view of the preceding theorem and the L2-case,

E δ F δ θ T uh

θ δ 2

C

δ

uh 1

C

δ

f g

1 0

as ε 0. This finishes the case where µ is absolutely continuous relative to

Lebesgue measure.

To treat singular measures, we first consider measures µ ν which satisfy

supp ν 0. Here

supp ν

: T I T ν I

0

I being an arc. Observe that for any θ supp ν

the limit limr 1 ur θ exists

since the analytic function u iu can be continued across that interval J on T

for which µ J 0 and which contains θ . Hence limr

1 ur exist a.e. by the

assumption supp ν 0. If µ M T is an arbitrary singular measure, then use

inner regularity to say that for every ε 0 there is ν M T

with µ ν ε

and supp ν 0. Indeed, set ν A : µ A K for all Borel sets A where

K is compact and µ T

K ε. The theorem now follows by passing from the




statement for ν to that for µ by means of the same argument that was used in the

absolutely continuous case above.

It is now easy to obtain the L p boundedness of the Hilbert transform on 1

p . This result is due to Marcel Riesz, who gave a diff erent proof, see the

following exercise for his original argument.

Theorem 4.8. If 1 p , then H f p C p f p. Consequently, if u

h p D with 1 p , then u h p

D and u p C p u p

Proof. By Proposition 4.4, Hu 2 u 2 with equality if and only if

T

u θ d θ 0

Interpolating this with the weak- L1 bound from Corollary 4.7 by means of the

Marcinkiewicz interpolation theorem finishes the case 1

p

2. If 2

p

,then we use duality. More precisely, if f , g L2

T , then

f , Hg

n Z

ˆ f n

Hg n

n Z

isign n

ˆ f n g n

n Z

H f n

g

n

H f , g

This shows that H

H . Hence, if f L p T L2

T and g L2 T L p

T ,

then

H f , g

f , Hg

f

p Hg

p

C p

f

p g

p

and thus H f p C p

f p as claimed.

Exercise 4.2. Give a complex variable proof of the L2m

T

-boundedness of theHilbert transform by applying the Cauchy integral formula to u iu

2m when m

is a positive integer, cf. the proof of Proposition 4.4. Obtain the previous theorem

from this by interpolation and duality.

Consider the analytic mapping F u iv that takes D onto the strip

z Re z

1

Then u h

D but clearly v h

D so that Theorem 4.8 fails on L

T . By

duality, it also fails on L1 T . The correct substitute for L1 in this context is the

space of real parts of functions in H1 D

. This is a deep result that goes much

further than the F. & M. Riesz theorem. The statement is that

H f 1

C

u

f 1 (4.3)

where u

f is the non-tangential maximal function of the harmonic extension u f of

f . Recall that by the Burkholder-Gundy-Silverstein theorem the right-hand side in

(4.3) is finite if and only if f is the real part of an analytic L1-bounded function.

A more modern approach to (4.3) is given by the real-variable theory of Hardy

spaces, which subsume statements such as (4.3) by the boundedness of singular

integral operators on such spaces.




Next, we turn to the problem of expressing H f in terms of a kernel. By Exer-

cise 2.3, it is clear that one would expect that

H µ θ

T

cot π θ ϕ µ d ϕ

(4.4)

for any µ M

T . This, however, requires justification as the integral on the

right-hand side is not necessarily convergent.

Proposition 4.9. If µ M T , then

lim

0

θ

ϕ

cot π θ ϕ µ

d ϕ H µ θ (4.5)

for a.e. θ T. In other words, (4.4) holds in the principal value sense.

Exercise 4.3. Verify that the limit in (4.5) exists for all d µ f d θ d ν where

f

C

1 T

and

supp

ν

0. Furthermore, show that these measures are dense inM T .

Proof of Proposition 4.9. The idea is to represent a general measure as a limit

of measures of the kind given by the previous exercise. As we have seen in the

proof of the a.e. convergence result Theorem 2.8, the double limit appearing in

such an argument requires a bound on an appropriate maximal function. In this

case the natural bound is of the form

mes

θ T sup

0

12

θ ϕ


λ

C

λ µ

(4.6)

for all λ 0. We leave it to the reader to check that (4.6) implies the theorem.

In order to prove (4.6) we invoke our strongest result on the conjugate function,namely Theorem 4.6. More precisely, we claim that

sup0 r 1

Qr µ θ

θ

ϕ

1

r


C M µ θ (4.7)

where M µ is the Hardy-Littlewood maximal function. Since

sup0

r 1

Qr µ θ u µ

θ

(4.6) follows from (4.7) by means of Theorem 4.6 and Proposition 2.5. To verify

(4.7) write the diff erence inside the absolute value signs as K r µ θ , where

K r θ

Qr θ cot πθ if 1 r θ

12

Qr

θ

if

θ

1

r By means of calculus one checks that

K r θ

C 1 r 2

θ

3 if θ

1 r

C 1 r 1 if

θ 1

r

This proves that K r 0 r 1 form a radially bounded approximate identity and (4.7)

therefore follows from Lemma 2.7.




Proposition 4.9 raises the question whether Theorem 4.8 can be proved based

on the representation (4.5) alone, i.e., without using complex variables or harmonic

functions. Going even further, we may ask if the L

2

-boundedness of the Hilberttransform can be proved without using the Fourier transform. We shall answer

all of these questions later in the context of Calderon-Zygmund theory where they

play an important role.

Exercise 4.4. Show by means of (4.5) that H is not bounded on L

T . For

example, consider H χ 0, 1

2

. Also, deduce from the fact that cot πθ L1

T that H

is not bounded on L1.

The following proposition shows that the Hilbert transform H f is exponen-

tially integrable for bounded functions f . In Exercise 4.4, you should find that

H χ 0, 1

2

has logarithmic behavior at 0 and 12

, which shows that the following result

is best possible.

Proposition 4.10. Let f be a real-valued function on T with f 1. Then for

any 0 α

π2

T

eα H f θ d θ

2

cos α

Proof. Let u u f be the harmonic extension of f to D and set F u iu.

Then u 1 by the maximum principle and hence cos αu cos α. Therefore,

Re eαF Re

eαu e iαu

cos αu eαu cos α eαu (4.8)

By the mean value property,

T

Re eαF r θ

d θ Re eαF 0

Re e iα u

0

cos αu 0 1

Combining this with (4.8) yields

T

eαur θ d θ

1

cos α

and by Fatou’s lemma therefore

T

eα H f

θ d θ

1

cos α

Since this inequality also holds for f , the proposition follows.

In later chapters we will develop the real-variable theory of singular integrals

which contains the results on the Hilbert transform obtained above. The basic

theorem, due to Calderon and Zygmund, states that singular integrals are boundedon L p

Rn

for 1 p thus generalizing Theorem 4.8.

The analogue of Proposition 4.10 for singular integrals is given by the fact

that they are bounded from L to BMO (the space of bounded mean oscillation)

and that BMO functions are exponentially integrable (this is called John-Nirenberg

inequality). The dual question of what happens on L1 leads into the real variable

theory of Hardy spaces. The analogue of (4.3) is then that singular integrals are




Conclude that S N f converges in the L p sense for any f L p T

d and 1 p

provided each component of N .

We remark that the corresponding result in which expanding rectangles are

replaced with Euclidean balls in Zd is false; in fact, Charlie Feff erman’s ball multi-

plier theorem states that the analogue of the previous exercise only holds for p 2

in that case. This is delicate, and relies on the existence of Kakeya sets. We shall

return to these matters in a later chapter.

Notes

The conjugate function and the Hilbert transform provide the link between the complex-

variable based theory of Hardy spaces and the real-variable Calderon-Zygmund theory

which we turn to in a later chapter. Koosis [ 34] proves the Burkholder-Gundy-Silverstein

theorem. This reference also presents some elements of the modern real-variable theory of

H 1 spaces, such as the atomic decomposition of H 1. The full real-variable H p theory in

higher dimensional is developed in Stein’s book [46]. We shall discuss some of the basic

ideas of that theory by means of a dyadic model in a later chapter.

Problems

Problem 4.1. Find an expression for the harmonic measure of an arc on T relative to

the disk D. Relate it to the harmonic measure of an interval relative to the upper half plane

by means of a conformal mapping.

Problem 4.2. This problem provides a weak form of a.e. convergence of Fourier series

on L1 T . In the following chapter we will see via Kolmogoroff ’s construction that the full

form of the a.e. convergence fails for L1 functions.

a) Show that, for any λ 0

sup N

θ T S N f θ λ

C

λ f 1

with some absolute constant C .

b) What would such an inequality mean with the sup N inside, i.e.,

θ T sup N

S N f θ λ

C

λ f 1

for all λ 0? Can this be true?

c) Using a) show that for every f L1 T there exists a subsequence N j

depending on f such that

S N j f f a.e.

as j .

Problem 4.3. Prove the following multiplier estimate. Let m : mn n

Z be a se-

quence in C satisfying

n

Z

mn mn 1 B, lim

n

mn 0




Define the multiplier operator T m by the rule

T m f θ

n

mn ˆ f n e nθ

for trigonometric polynomials f . Prove that for any such f one has

T m f p C p B f p

for any 1 p .

Problem 4.4. Let ϕ C

T and denote by Aϕ the multiplication operator by ϕ. With

H the Hilbert transform on T, show that

Aϕ, H Aϕ H H Aϕ

is a smoothing operator, i.e., if µ M T is an arbitrary measure, then

Aϕ, H µ C T

Problem 4.5. The complex variable methods developed in Chapters 2, 3, as well as

this one equally well apply to the upper half plane instead of the disk. For example, thePoisson kernel is

Pt x

1

π

t

x2 t 2

x R, t 0

and its harmonic conjugate is

Qt x

1

π

x

x2 t 2

Observe that Pt x iQt x

iπ

1 x

it

iπ z

with z x it , which should remind the reader

of Cauchy’s formula. The Hilbert transform now reads

H f x

1

π

f y

x ydy

in the principal value sense, the precise statement being just as in Proposition 4.9.

In this problem, the reader is asked to develop the theory of the upper half-plane in thecontext of the previous chapters. The relation between the disk and upper half-plane given

by conformal mapping should be considered as well.

Problem 4.6. By considering F u iu log 1 u iu show that if u 0 and

u h1 D , then

T u log 1 u d θ

.



CHAPTER 5

Fourier Series on L1 T

: Pointwise Questions

This chapter is devoted to the question of almost everywhere convergence of

Fourier series for L1 T

functions. From the previous chapters we know that this

is not an elementary question, since the partial sums S N are given by convolution

with the Dirichlet kernels D N which do not form an approximate identity. Note also

that we encountered an a.e. convergence question which was not associated with

an approximate identity, namely for the Hilbert transform, see Proposition 4.9.However, the needed uniform control was provided by the weak- L1 bound on the

maximal function u , see Theorem 4.6.

By a natural analogy, one sees that a similar bound on the maximal function

associated with S N f , i.e., sup N S N f , would lead to an a.e. convergence result.

What Kolmogoroff realized is essentially that no bound on this maximal func-

tion is possible for f L1 T . This has to do with the oscillatory nature of the

Dirichlet kernel D N and the growth of D N L1 . In fact, by a careful choice of

f L1 we may arrange the convolution

D N

f x

so that only the positive peaks

of D N contribute at many points x.

What Stein [44] realized is that a weak-type bound on sup N S N f

x

is not

only sufficient but also necessary for the a.e. convergence. In view of this factwe will conclude indirectly that S N f fails to converge a.e. for some f

L1

T .

Kolmogoroff ’s original argument involved the construction of a function f L1 T

for which a.e. convergence failed. However, what we do here is very close to his

original approach even though it is somewhat more streamlined.

Later in this book we will take up to more challenging question of a.e. conver-

gence of S N f for f L2 T or L p

T for 1 p . This is Carleson’s theorem,

and the proof introduces some essentially new and far-reaching ideas.

Next to presenting Kolomogoroff ’s theorem, this chapter also serves to intro-

duce ideas based on the use of randomness. This turns out to be a very powerful

device for many purposes. We begin the actual argument with such a probabilis-

tic construction, which is a version of the Borel-Cantelli lemma (but the following

proof is self-contained).

Lemma 5.1. Let E n

n 1 be a sequence of measurable subsets of T such that

n 1

E n

53



54 5. FOURIER SERIES ON L1 T

: POINTWISE QUESTIONS

Then there exists a sequence xn

n 1 T so that

n 1

χ E n x xn (5.1)

for almost every x T.

Proof. View Ω :

n 1 T as a probability space equipped with the infinite

product measure. Given x T, let A x Ω be the event characterized by (5.1). We

claim that

P A x 1 x T (5.2)

By Fubini, it then follows that for almost every xn

n 1 Ω, the event (5.1) holds

for almost every x T. Hence fix an arbitrary x T. Then

A x

xn

n

1

x belongs to infinitely many E n

xn

xn

n 1 x

N 1

m N

xm E m

N 1

A N

where

Ac N :

xn

n 1 x

m N

xm E cm

xn

n 1 xm

x E cm

m N

By definition of the product measure on Ω, it therefore follows that

P Ac N

m N

1 E m 0

by assumption (5.1). Hence (5.2) holds, and the lemma follows.

Lemma 5.1 will play an important role in the proof of the following theo-

rem, which establishes the necessity of a weak-type maximal function bound for

a.e. convergence. In addition to the previous probabilistic lemma, we shall also

require the basic Kolmogoro ff 0, 1 law which we now recall: let X k

k 1 be inde-

pendent random variables on a probability space Ω,P

. Assume that E Ω is

measurable with respect to the tail σ-algebra. This means that E is measurable

with respect to the σ-algebra generated by X k

k K for every positive integer K .

Then either P E 0 or P E

1. Such an E is called a tail event . The proof of

this law proceeds by showing that E is independent of itself whence

0 P

E E c

P E

1

P E

yielding the desired conclusion.



5. FOURIER SERIES ON L1 T

: POINTWISE QUESTIONS 55

In our case, the Carleson maximal operator

C f x : sup

N

S N f x (5.3)

is the main object that needs to be bounded. However, we shall formulate the

following “abstract” result in much greater generality. In fact, the reader will easily

see that it has nothing to do with any special properties of T, either, although we

restrict ourselves to that case for simplicity.

Theorem 5.2. Let T n be a sequence of linear operators, bounded on L 1 T ,

and translation invariant. Define

M f x

:

sup

n 1

T n f

x

and assume that M f

for every trigonometric polynomial f on T.

If for any f L1 T ,

x T M f x 0

then there exists a constant A so that

x

T M f

x

λ

A

λ

f 1

for any f L1 T and λ

0.

Proof. We will prove this by contradiction. Thus, assume that there exists a

sequence f j

j 1

L1 T

with f j 1

1 for all j 1, as well as λ j

0 so that

E j : x T M f j x λ j

satisfies

E j

2 j

λ j

for each j 1. Be definition of M we then also have

limm

x T sup1 k m

T k f j x λ j

2 j

λ j

for each j 1. Hence there are M j with the property that

x T sup1

k M j

T k f j x λ j

2 j

λ j

for each j 1. Let σ N denote the N th Cesaro sum, i.e., σ N f

K N

f , where K N

is the Fejer kernel. Since each T j is bounded on L1, we conclude that

lim N

x T sup1 k M j

T k σ N f j x λ j

2 j

λ j

Hence, we assume from now on that each f j is a trigonometric polynomial. Let m j

be a positive integer with the property that

m j

λ j

2 j m j 1





Then

j 1

m j E j

by construction. Counting each of the sets E j with multiplicity m j, the previous

lemma implies that there exists a sequence of points x j, , j 1, 1

m j, so

that

j 1

m j

1

χ E j x x j, (5.4)

for almost every x T. Let

δ j :

1

j2m j

and define

f x

:

j 1

m j

1

δ j f j x

x j,

where the signs will be chosen randomly. First note that irrespective of the

choice of these signs,

f

1

j 1

m jδ j

The point is now to choose the signs so that

M f x

for almost every x T. For this purpose, select x T such that x E j x j, for

infinitely many j and j (just pick one such j if there are more than one).Denote the set of these j by J . Since the T n are translation invariant, so is M.

Hence

M f j x x j, λ j

for all j J . We conclude that for those j there is a positive integer n j, x so that

T n j, x

f j x x j, λ j

At the cost of removing another set of measure zero we may assume that

T n f x

j 1

m j

1

δ jT n f x x j,

for all positive integers n. In particular, we have that for all j J

T n j, x

f x

k 1

mk

1

δk T n j, x

f x xk ,

which implies that

P T n j, x

f x δ jλ j

1

2





where the probability measure is with respect to the choice of signs . Since

δ jλ j , we obtain that

P M f x

12

We claim that the event on the left-hand side is a tail event. Indeed, this holds since

M f x

j 1

M f j x

and each summand on the right-hand side here is finite ( f j is a trigonometric poly-

nomial and we are assuming that M is uniformly bounded on trigonometric poly-

nomials). By Kolmogoroff ’s zero-one law we therefore have

P M f x 1

It follows from Fubini’s theorem that almost surely (in the choice of

)M f x for a.e. x

This would contradict our main hypothesis, and we are done.

The previous lemma reduces Kolmogoroff ’s theorem on the failure of almost

everywhere convergence of Fourier series of L1 functions to disproving a weak-

L1 bound for the Carleson maximal operator. More precisely, we arrive at the

following corollary. It will be technically more convenient later on to formulate

this result in terms of measures rather than on L1 T

.

Corollary 5.3. Suppose S N f

N 1 converges almost everywhere for every

f L1

T . Then there exists a constant A such that

x

T C µ

x λ

A

λ µ

(5.5)

for any complex Borel measure µ on T and λ 0 , where C is as in (5.3).

Proof. By the previous theorem, our assumption implies that there exists a

constant A such that

x T C f x λ

A

λ f 1

for all λ 0. This is the same as

x

T max

1 n N

S n f

x

λ

A

λ

f

1

for all N 1 and λ

0. If µ is a complex measure, then we set f V m µ, where

V m is de la Vallee-Poussin’s kernel, see (1.22). It follows that for all N 1

x T max

1 n N S n V m µ x λ

A

λ µ

for all m 1 and λ 0. Note that for m N the kernel V m can be removed from

the left-hand side, and we obtain our desired conclusion.





The idea behind Kolmogoroff ’s theorem is to find a measure µ which would

violate (5.5). This measure will be chosen to create “resonances”, i.e., so that the

peaks of the Dirichlet kernel all appear with the same sign. More precisely, forevery positive integer N we will choose

µ N :

1

N

N

j 1

δ x j, N (5.6)

where the x j, N are close to j

N . Then

S n µ N

x

1

N

N

j 1

sin 2n

1 π x x j, N

sin π x x j, N

. (5.7)

If x is fixed, we will then argue that there exists n so that the summands on the

right-hand have the same sign (for this, we will need to make a careful choice of

the x j, N ). Thus, the size of the entire sum will be about

N

j 1

1

j

log N

because of the denominators in (5.7), which clearly contradicts (5.5).

The choice of the points x j, N is based on the following lemma due to Kro-

necker.

Lemma 5.4. Assume that θ 1, . . . , θ d T

d is incommensurate , i.e., that for any

n1, . . . , nd Zd

0

one has that

n1

θ 1

. . . nd

θ d Z.

Then the orbit

nθ 1, . . . , nθ d mod Zd

n Z

Td (5.8)

is dense in Td .

Proof. It will suffice to show that for any smooth function f on Td

1

N

N

n 1

f nθ 1, . . . , nθ d

Td

f x dx. (5.9)

Indeed, if the orbit (5.8) is not dense, then we could find f 0 so that the set

Td

f 0

does not intersect it. Cleary, this would contradict (5.9). To prove (5.9),

expand f into a Fourier series, cf. Proposition 1.14. Then

1

N

N

n 1

f nθ 1, . . . , nθ d

1

N

N

n 1

ν Zd

ˆ f ν

e2πinθ ν

ˆ f 0

ν Zd 0

ˆ f ν

1

N

1 e2πi N 1

θ ν

1 e2πiθ

ν





Theorem 5.6. There exists f L1 T so that S n f does not converge almost

everywhere. In fact, there exists f L1 T such that

x T C f x

has measure 0.

It is known that this statement also holds with everywhere divergence. In fact,

Konyagin [35] showed the following stronger result: Let φ : 0,

0, be

non-decreasing and satisfy φ u

o

u

log u

log log u as u

. Then there

exists f L1 T

so that

T

φ f x dx

and lim supm

S m f

x

for every x

T.

Notes

Kolmogoroff ’s original reference is [33]. Much more on pointwise and a.e. conver-

gence questions can be found in Zygmund [58] and Katznelson [30].

Problems

Problem 5.1. Consider the following variants of Theorem 5.2:

1) Let µn

n 1

be a sequence of positive measures on Rd supported in a common

compact set. Define

M f x : supn 1

f µn x

Let 1 p and assume that for each f L p Rd

M f x on a set of positive measure.

Then show that f M f is of weak-type p, p .

2) Now suppose µn are complex measures of the form µn dx K n x dx, but again

with common compact support. Show that the conclusion of part 1) holds, but

only for 1 p 2.

Problem 5.2. If ω is an irrational number, show that

1

N

N

n 1

f nω

T

f θ

d θ

L2 0

for any f L2 T . In particular, if f L2 is such that f x ω f x for a.e. x, then f

= const.

Problem 5.3. Let xn

n 1

be an infinite sequence of real numbers. Show that the

following three conditions are equivalent:

a) For any f C T ,

1

N

N

n 1

f xn

T

f x dx





b) 1 N

N n

1 e kxn 0 for all k Z

c) lim N

sup I

T

1 N

# 1 j N x j I mod 1 I

0

If these conditions hold we say that xn

n 1

is uniformly distributed modulo 1 (abbreviated

by u.d.). The quantity

D

x j

N j

1

: sup I

T

# 1 j N x j I mod 1 N I

is called the discrepancy of the finite sequence x j

N j

1.

Problem 5.4. Using Problem 1.4 with a suitable choice of T , prove the following: if

xn

n 1 is a sequence for which xn

k xn

n 1 is u.d. modulo 1 for any k Z

, then

xn

1 is also u.d. mod 1. In particular, show that nd ω

n 1

is u.d. mod 1 for any irrational

ω and d Z .

Problem 5.5. Let p 2 be a positive integer. Show that for a.e. x T the sequence

pn x

n

1 is u.d. modulo 1. Can you characterize those x which have this property? Hint:

Consider expansions to power p.

Problem 5.6. Let θ j

N j

1 T be N distinct points and consider

P z

N

j 1

z e θ j

Assume that sup

z 1 P z eτ. We may assume that 1 τ N log 2. Prove the

following bound on the discrepancy of θ j

N j

1 as defined above:

∆ N : D

θ j

N j

1

C

τ N

Conclude that

θ T P e θ e

AN Ce

AN ∆ N

Ce

A

N τ

for any A

0. The constants C are absolute but may diff er. Hint: Consider the functionF θ µ θ 0, θ N θ θ 0 with µ

N j

1 δθ j where θ 0

12

, 12

is chosen such that

12

12

F θ

d θ 0. Relate F to to u

θ : log P e

θ via the Hilbert transform. Then

write F K µ with a suitable kernel K .



CHAPTER 6

Fourier Analysis on Rd

It is natural to extend the definition of the Fourier transform for the circle given

in Chapter 1 to Euclidean spaces. Thus, let µ M Rd

be a complex-valued Borel

measure and set

ˆ µ ξ

Rd

e 2πix

ξ µ dx ξ Rd

which is called the Fourier transform of µ. Clearly, ˆ µ

µ , the total variationof µ. Moreover, it follows immediately from the dominated convergence theorem

that ˆ µ is continuous. If µ dx f x dx, then we also write ˆ f ξ . In particular, the

Fourier transform is well-defined on L1 R

d

and defines a contraction into C R

d

,

the space of continuous functions. Many elementary properties from Chapter 1

carry over to this setting, such as convolutions, Young’s inequality, and the action

of the Fourier transform on convolutions and translations. To be precise, we collect

a few facts in the following lemma.

Lemma 6.1. Let µ M Rd

and let τ y denote translation by y Rd : τ y µ E

µ E y . Then

τ y µ ξ e 2πi ξ y ˆ µ ξ ξ Rd

Let eη x : e2πix η . Then eη µ ξ ˆ µ ξ η . If f , g L1 Rd

, then the integral

Rd

f x y g y dy

is absolutely convergent for a.e. x R

d and is denoted by f

g

x

. One has

f g L1 R

d , and f g

ˆ f g.

Finally, if g L p for 1 p and f L1 , then f g L p and f g p

f 1 g p.

We leave the elementary proof to the reader. We record one more very impor-

tant fact, namely the change of variables formula for the Fourier transform: let A

be an invertible d

d -matrix with real entries, and denote by f

A the function f Ax

. Then for any f

L1

Rd

one has

f A det A

1 ˆ f A t (6.1)

where A t is the transpose of the inverse of A. A special case of this is the dilation

identity. For any λ 0 let f λ x f λ x . Then

f λ ξ λ d ˆ f ξ λ

(6.2)

63



64 6. FOURIER ANALYSIS ON Rd

In analogy with the Fourier transform on the circle, we expect that L2 R

d

plays a special role. The be more precise, we expect the Plancherel theorem

ˆ f 2

f

2 to hold. Notice that as it stands this is only meaningful for f

L1

L2

Rd

,since otherwise the defining integral is not absolutely convergent. In fact, it is

convenient to work with a much smaller space called Schwartz space. In what

follows, we use the standard multi-index notation: for any α α1, α2, . . . , αd

Zd 0

(where Z0 are the nonnegative integers)

xα :

d

j 1

xα j

j , x x1, x2, . . . , xd

α :

d

j 1

α j

xα j

j

Moreover,

α

d

j 1 α j denotes the length of the multi-index.

Definition 6.2. The Schwartz spaceS Rd

is defined as all functions in C

Rd

which decay rapidly together with all derivatives. In other words, f S Rd

if and

only if f C

Rd

and

xα

β f x

L

Rd

α, β

where α, β are arbitrary multi-indices. We introduce the following notion of con-

vergence in S Rd

: a sequence f n S Rd

converges to g S Rd

if and only

if

xα

β f n g

0 n

for all α, β.

For readers familiar with Frechet spaces, we remark that the semi-norms

pαβ f : xα

β f

α, β Zd 0

turn S Rd

into such a space. Since Frechet spaces are metrizable, it follows that

the notion of convergence introduced in Definition 6.2 completely characterizes

the topology in S Rd

. However, we make no use of this fact, and only work with

the sequential characterization.

One has that S Rd

is dense in L p

Rd

for all 1

p since this is already

the case for smooth functions with compact support. It is evident that the map

f xα

β f x is continuous onS R

d

. Somewhat less obvious is that the Fourier

transform has the same property.

Proposition 6.3. The Fourier transform is a continuous operation from theSchwartz space into itself.

Proof. Fix any f S Rd

. The proof hinges on the pointwise identities

α f ξ

2πi

α

ξ α ˆ f ξ

β ˆ f ξ 2πi

β

x β f ξ



6. FOURIER ANALYSIS ON Rd 65

valid for all multi-indices α, β. The first one follows by integration by parts,

whereas the second is obtained by diff erentiating under the integral sign. We leave

the formal verification of these identities to the reader (introduce diff

erence quo-tients, justify the limits by the dominated convergence theorem etc.). This is easy,

due to the rapid decay and smoothness of f .

We first note from these identities that ˆ f C and, moreover, that

ξ α

β

ξ ˆ f L

Rd

α, β Zd 0

which implies by definition that ˆ f S Rd

. For the continuity, let f n be a sequence

in S Rd

so that f n 0 as n . Then

ξ α

β

ξ ˆ f n

C α,β

α x x β f n 1

0

as n .

In view of the preceding fact, it is most natural to ask if the Fourier transformtakes S

Rd

onto itself. We will now prove that this is indeed the case, which is a

special case of the Fourier inversion theorem. The latter is the analogue the Fourier

summation problem for Fourier series. The Schwartz space is a most convenient

setting in which to formulate the inversion.

Proposition 6.4. The Fourier transform takes the Schwartz space onto itself.

In fact, for any f S Rd

one has

f x

Rd

e2πix ξ ˆ f

ξ d ξ x

Rd (6.3)

Proof. We would like to proceed by inserting the expression

ˆ f ξ

Rd

e 2πiy ξ f y dy

into (6.3) and show that this recovers f x

. The difficulty here is of course that

interchanging the order of integration leads to the formal expression

Rd

e2πi x y ξ d ξ

which needs to equal the δ0 x y measure for the inversion formula (6.3) to

hold. Although this is “essentially” the case, the problem here is of course that the

previous integral is not well-defined. We shall therefore need to be more careful.

The standard procedure is to introduce a Gaussian weight since the Fourier

transform has an explicit form on such weights. In fact, in Exercise 6.1 below we

ask the reader to check the following identity

e π x

2 ξ e

π ξ

2

(6.4)

Taking this for granted, we conclude from this and (6.2) that

e πε2

x

2 ξ ε

d e

π ξ

2

ε2

for any ε 0.




Now fix f S Rd

. We commence the proof of (6.3) with the observation that

Rd

e2πix ξ ˆ f ξ d ξ lim

ε 0

Rd

e2πix ξ e πε2 ξ 2 ˆ f ξ d ξ

for each x Rd . This follows from the dominated convergence theorem since

ˆ f L1. Next, one has

Rd

e2πix ξ e

πε2 ξ

2ˆ f ξ d ξ

Rd

Rd

e2πi x y ξ e

πε2 ξ

2

d ξ f y dy

Rd

ε d e πε

2 x y

2

f y dy

The final observation is that

ε

d e

πε

2

x

2

is an approximate identity in the sense of Chapter 1 with respect to the limit ε 0.

Hence, since f C

Rd

L

Rd

one sees that

limε

0

Rd

ε d e πε

2 x y

2

f y dy f x

for all x Rd , which concludes the proof.

Exercise 6.1. Prove the identity (6.4). Hint: Use contour integration in the

complex plane.

Let us now return to the dilation identity (6.2) with f a smooth bump function.

We note that this identity expresses an important normalization principle (at leastheuristically), namely that the Fourier transforms maps an L1-normalized bump

function onto an L -normalized one, and vice versa.

The dual of S, denoted by S is the space of tempered distributions. In other

words, any u S is a continuous functional on S, and we write u, φ

for u

applied to φ S. The space S is equipped with the weak- topology. Thus,

un u in S if and only if

un, φ

u, φ as n

for every φ

S. Naturally,

L p R

d S

Rd

by the rule

f , φ

Rd

f x φ x dx, f L p R

d , φ S R

d

By a little functional analysis we may see that a linear functional u on S is contin-

uous if and only if there exists N such that

u, φ C

α

,

β

N

xα

βφ

φ S

It follows that tempered distributions can be diff erentiated any number of times,

and multiplied by any element of C

Rd

of at most polynomial growth. One has

αu, φ 1

α

u,

αφ φ S

Rd




Any measure of finite total variation is an element of S , in particular the Dirac

measure δ0 is.

The Fourier transform of u

S

is defined by the relation

u, φ

:

u, φ

for all φ S

Rd

. Since the Fourier transform is an isomorphism on S

Rd

, this

definition is meaningful and establishes the distributional Fourier transform as an

isomorphism on S . For example, 1 δ0 since

1, φ

Rd

φ ξ d ξ φ 0

In particular, we can meaningfully speak of the Fourier transform of any function

in L p R

d

with 1 p

.

The following exercise presents a very useful fact, namely that an element of

S

Rd

which is supported at a point, say

0

is necessarily a linear combination of

δ0 and finitely many derivatives thereof. “Support” here refers to the property that

u, ϕ 0 for all ϕ

S R

d

that vanish near 0.

Exercise 6.2. Let u, v S with u, ϕ v, ϕ

for all ϕ S with supp ϕ

Rd

0

. Then

u v

P

for some polynomial P.

The following exercise introduces an important example of a distributional

Fourier transform. The functions x

α go by the name of Riesz potentials.

Exercise 6.3. For any 0

α

d compute the distributional Fourier transformof x

α in Rd . Hint: Show that the Fourier transform equals a smooth function

of ξ 0. Use dilation and rotational symmetry to determine this smooth function.

Nonzero multiplicative constants do not need to be determined. Finally, exclude

any contributions coming from ξ 0, see the previous exercise.

An interesting example of a distributional identity is given by the Poisson sum-

mation formula.

Proposition 6.5. For any f S

Rd

one has

n Zd

f n

ν Zd

ˆ f ν

Equivalently, with the convergence being in the sense of S ,

ν Zd

e2πix ν

n Zd

δn x

where δn denotes the Dirac distribution at n.




Proof. Given f S Rd

, define F x :

n Zd f x n which lies in C T

d .

The Fourier coefficients of F are with e θ e2πiθ as usual,

F ν

n Zd

0,1

d

f x n e x ν dx

n Zd

0,1

d n

f x e x ν dx

Rd

f x e x ν dx

ˆ f ν

From Proposition 1.14 we therefore conclude that

F x

ν Zd

ˆ f ν

e x

ν

which implies the first identity. The second simply follows by chasing definitions.

Henceforth, we shall use the notation ˇ f for the integral in (6.3). It is now easy

to establish the Plancherel theorem.

Corollary 6.6. For any f S Rd

one has

ˆ f 2 f 2. In particular, the

Fourier transform extends unitarily to L2 R

d .

Proof. We start from the following identity which is of interest in its own right:

for any f , g S

Rd

one has

Rd

f ξ g ξ d ξ

Rd

ˆ f x g x dx

The proof is an immediate consequence of the definition of the Fourier transformand Fubini’s theorem. Therefore, one also has

Rd

ˆ f ξ g ξ d ξ

Rd

f x

ˆg x dx

However,ˆg

ˇg g

by the previous proposition, whence we obtain Parseval’s indentity

Rd

ˆ f ξ g ξ d ξ

Rd

f x g x dx

which is equivalent to the Plancherel theorem and the unitarity of the Fourier trans-

form.

Exercise 6.4. Show that the Schwartz space is an algebra under both multipli-

cation and convolution. In particular, prove that f g

ˆ f g for any f , g S R

d

.

We now take another look at the proof of Proposition 6.4. Denote

Γε x

:

ε d e πε

2 x

2




Then the proof makes use of the identity

Rd

e2πix ξ e

πε2 ξ

2 ˆ f ξ d ξ

f Γε

x

(6.5)

valid pointwise for all Schwartz functions f and any ε 0. Let I ε

f be the oper-

ator defined by the previous line. Since Γε is an approximate identity as ε 0, we

conclude the following useful observation which parallels results we encountered

for Fourier series on the circle.

Corollary 6.7. For any 1 p and f L p

Rd

one has I ε f f in

L p R

d . If f C Rd

and vanishes at infinity, then I ε f f uniformly.

In particular, one has uniqueness.

Corollary 6.8. Suppose f L1

Rd

satisfies ˆ f

0 everywhere. Then f

0.

Exercise

6.5. Let µ be a compactly supported measure in R

d

. Show that ˆ µextends to an entire function in Cd . Conclude that ˆ µ cannot vanish on an open set

in Rd .

Finally, we note the following basic estimate which is called Hausdor ff -Young

inequality.

Lemma 6.9. Fix 1 p 2. Then one has the property that ˆ f L p

Rd

for

any f L p R

d . Moreover, there is the bound

ˆ f p

f p.

Proof. Note that f L p

Rd

can be written as f

f 1

f 2 where

f 1 f χ f

1

L2 R

d , f 2 f χ

f

1

L1 R

d

Hence ˆ f 1

L2

is well-defined by the L2

-extension of the Fourier transform, andˆ f 2 L by integration. The estimate follows by interpolating between p 1 and

p 2.

As in the case of the circle, one can easily express all degrees of smoothness

via the Fourier transform. For example, one can do this on the level of the Sobolev

spaces H s and H s which are defined in terms of the norms

f

H s ξ s ˆ f 2,

f

H s ξ s ˆ f 2 (6.6)

for any s R where ξ

1 ξ 2. These spaces are the completion of S Rd

under the norms in (6.6). For the homogeneous space H s Rd

one needs the re-

striction s

d 2

since otherwise S is not dense in it.

Perhaps the most basic question relating to the Sobolev spaces concerns theirembedding properties. An example is given by the following result.

Lemma 6.10. For any f H s Rd

one has

f p C s f H s Rd

2

p

provided s

d 2

. In fact H s Rd

C Rd L p

Rd

for any such s and all

2 p

.




number of ways, for example on the L p-scale. We remark that (6.10) is scaling

invariant.

Lemma 6.12. Let f S Rd satisfy supp ˆ f ξ R . Then

f q C Rd 1

p

1q

f p (6.10)

1 p q .

Proof. Since (6.10) is scaling invariant, it suffices to consider R 1. Then

one has ˆ f χ ˆ f where χ is smooth, compactly supported and χ 1 on the unit

ball. It follows that f ˇ χ

f and by Young’s inequality therefore

f q ˇ χ r f p

with 1

1q

1r

1 p

, which is the same as 1 p

1q

1r

. Note that 1 p

q

guarantees that 1 r p .

Exercise 6.7. Formulate and prove the general Young’s inequality that was

used in the previous proof.

In particular, if ˆ f is supported on R ξ 2 R , then

f

p CR

d 12

1 p

f

2 C

f

H s Rd

with s as in (6.9). This proves (6.8) for such functions, and the challenge is now

to obtain the general case which is indeed true. The approach is to write f

j Z

P j f where P j f has Fourier support roughly on dyadic shells of size 2 j. While

j Z

P j f

2 H s

f

2 H s

by definition, the problem is to show that

f

2 p C

j

P j f

2 p (6.11)

This is indeed true for finite p 2, but requires some machinery. We shall develop

the necessary tools in the Littlewood-Paley chapter below. There are alternative

routes leading to (6.8), such as fractional integration.

We now take up a topic that has proven to be very important for various ap-

plications, especially in partial diff erential equations: the decay properties of mea-

sures with smooth compactly supported densities that live on smooth submanifolds

of Rd . Let us start with a concrete example, namely the Cauchy problem for the

Schrodinger equation

i t ψ ∆ψ 0, ψ 0 ψ0 (6.12)

where ψ0 is a fixed function, say a Schwartz function, and ψ 0

refers to the func-

tion ψ t , x evaluated at t

0. Applying the Fourier transform converts (6.12)

into

i t

ψ ξ 4π2

ξ 2ψ ξ 0, ψ

0

ψ0




where ψ is the Fourier transform with respect to the space variable alone. The

solution to the previous ordinary diff erential equation with respect to time is given

byψ t , ξ e 4it π2

ξ 2 ψ0 ξ

whence the actual solution is after Fourier inversion

ψ t , x

Rd

e2πix ξ e 4it π2

ξ

2 ψ0 ξ d ξ (6.13)

This formula is of course very relevant as it gives a solution to (6.12). For our

purposes, we would like to re-interpret the integral on the right-hand side in the

following form. Consider the paraboloid

P :

ξ, 4π2

ξ 2 ξ R

d

Rd 1

and place the measure µ onto P by the formula (for given ψ0 as above)

µ

d ξ, d τ

ψ0

ξ

d ξ via the relation

Rd 1

F ξ, τ µ d ξ, d τ

Rd

F ξ, 4π2

ξ 2

ψ0 ξ d ξ

valid for any continuous and bounded F . Then we see that (6.13) can be interpreted

as

ψ t , x

Rd 1

e2πi x ξ t τ µ d ξ, d τ

In other words, the solution is given by the inverse Fourier transform of the measure

µ which lives on the paraboloid P. For a variety of reasons it has turned out to

be of fundamental importance to understand the decay properties of such Fourier

transforms. Before investigating this question, we ask the reader to verify a few

properties of (6.13).

Exercise 6.8. Let ψ0 S Rd

. Prove that (6.13) defines a function ψ t , x

which is smooth in t and belongs to the Schwartz class for all fixed times. More-

over, show that ψ solves (6.12) and satisfies

ψ t L2 ψ0 2 t R

Finally, verify that for every t 0

ψ t , x cd t

d 2

Rd

eiπ2

x y

2

4t ψ0 y dy (6.14)

with a suitable constant cd . Conclude that

ψ

t

cd t

d

2

ψ0 1

for all t 0.

Now let M Rd be a smooth manifold of dimension k d with Riemannian

measure µ. Fix some smooth, compactly supported function φ on M. The problem

is to describe the asymptotic behavior of the Fourier transform φ µ ξ for large ξ .

For example, consider M to be a plane of dimension d 1. By translation and




rotational symmetry we may take M Rd 1, the subspace given by the first d 1

coordinates and φ is a Schwartz function of x : x1, . . . , xd 1

, whereas µ is

simply Lebesgue measure onR

d 1

. Therefore, with ξ

ξ

, ξ d

,

φµ ξ

Rd 1

e 2πix

ξ φ x

dx

which evidently is a Schwartz function in ξ but does not depend at all on ξ d whence

also does not decay in that direction. What we have shown is that the Fourier trans-

form of a measure that lives on a plane does not decay in the direction normal to the

plane. In contrast to this case, we shall demonstrate that nonvanishing Gaussian

curvature of a hypersurface M always guarantees decay at a universal rate.

To facilitate computations, fix some M and a measure on it with smooth com-

pactly supported density. Working in local coordinates and fixing a direction in

which we wish to describe the decay reduces matters to oscillatory integrals of the

form

I λ

Rd

eiλφ ξ

a ξ d ξ , (6.15)

with a compactly supported function a C

0 and smooth, real-valued phase φ

C . The asymptotic behavior of the integral I λ

as λ is described by the

method of (non)stationary phase. This refers to the distinction between φ having

a critical point on supp a or not. We begin with the easier case of non-stationary

phase. The reader should note that the standard Fourier transform is of this type.

Lemma 6.13. If ∇φ 0 on supp a then the integral (6.15) decays as

Rd

eiλφ ξ

a ξ d ξ

C N , a, φ λ

N , λ

for arbitrary N 1.

Proof. Note that the exponential eiλφ is an eigenfunction for the operator

L :

1

iλ

∇φ

∇φ

2∇ (6.16)

Therefore we can apply L to the exponential inside of (6.15) as often as we wish.

The adjoint of L is

L

i

λ∇

∇φ

∇φ

2

Then for any positive integer N , one has

Rd eiλφ

ξ

a ξ d ξ

Rd L N eiλφ

ξ

a ξ d ξ

Rd

eiλφ ξ L

N a ξ d ξ

Rd

L

N a ξ d ξ

C

N , a, φ λ N




as claimed.

The situation changes when φ has a critical point inside supp

a

. In that caseone no longer has arbitrary decay and it can be very difficult to determine the exact

rate. However, if the critical point is nondegenerate, i.e., φ has a nondegenerate

Hessian at that point, then there is a precise answer as we now show.

Lemma 6.14. If ∇φ ξ 0 0 for some ξ 0

supp a , ∇φ 0 away from ξ 0 and

the Hessian of φ at the stationary point ξ 0 is non-degenerate, i.e., det D2φ ξ 0 0 ,

then for all λ 1 ,

Rd

eiλφ ξ

a ξ d ξ

C d , a, φ λ

d 2 (6.17)

In fact,

k λ

e iλφ 0

Rd

eiλφ ξ

a ξ d ξ

C d , a, φ, k λ

d 2 k (6.18)

for any integer k 1.

Proof. Without loss of generality we assume ξ 0 0. By Taylor expansion

φ ξ φ 0

1

2 D2φ 0 ξ, ξ O ξ 3

Since D2φ 0

is non-degenerate we have

∇φ ξ ξ on supp

a

. We split the

integral into two parts,

Rd

eiλφ ξ a ξ d ξ I II

where the first part localizes the contribution near ξ 0,

I :

Rd

eiλφ ξ a ξ χ0 λ12 ξ d ξ

and the second part restores the original integral, viz.

II

Rd

eiλφ ξ

a ξ 1 χ0 λ12 ξ d ξ

The first term has the desired decay

I C

Rd

χ0 λ12 ξ d ξ C λ

d 2

To bound the second term, we proceed as in the proof of the nonstationary case.

Thus, let L be as in (6.16) and note that

Dα

∇φ

∇φ

2

ξ

C α ξ

1

α




for any multi-index α. Thus, let N d be an integer. Then the Leibnitz rule yields

II C

ξ

λ

12

L

N 1 χ

0 λ

12 ξ a ξ d ξ

C λ N

λ

12

λ N 2 r N

r 2 N r d 1 dr C λ

d 2

as claimed. The stronger bound (6.18) follows from the observation that

φ ξ φ 0

C ξ 2

and the same argument as before (with N d 2k ) concludes the proof.

It is possible to obtain asymptotic expansions of I λ in inverse powers of

large λ. In fact, the estimate provided by the preceding stationary phase lemma isoptimal.

Let us now return to the Fourier transform of d µ φ d σ where σ is the surface

measure of a hypersurface M in Rd and φ is compactly supported and smooth.

Denote the unit normal vector to M at x M by N x .

Corollary 6.15. Let ξ λν where ν S d 1 and λ 1. If ν is not parallel to

N x for any x supp

φ , then ˆ µ O λ k as λ for any positive integer k.

Assume that M has nonvanishing Gaussian curvature on supp φ . Then

ˆ µ ξ C µ,M ξ

d 12

for all λ 1 and all ν. This is optimal if ν does belong to the normal bundle of M

on supp φ .

Proof. One has

ˆ µ ξ

M

eiλ ν, x φ

x σ

dx

Note that the function x ν, x as a function on M has a critical point at x0 M

if and only if ν is perpendicular to T x0M, the tangent space at x0 to M. Lemma 6.13

implies the first statement. For the second statement we note that the critical point

at x0 is nondegenerate if and only if the Gaussian curvature at x0 does not vanish,

and thus the bound follows from stationary phase.

Lemma 6.14 in fact allows for a more precise description of ˆ µ. We illustrate

this by means of σS d 1 x for large x, where σS d 1 is the surface measure of theunit sphere in Rd . This representation is very useful in certain applications where

the oscillatory nature of the Fourier transform of the surface measure is needed,

and not just its decay.

Note that the function σS d 1 x

is smooth and bounded, so we are interested in

its asymptotic behavior for large x , both in terms of oscillations and size. We need

to understand the oscillations in order to bound derivatives of σS d 1 x

. We remark




that the following result can also be obtained from the asymptotics of Bessel func-

tions, but we make no use of special functions in this book.

Corollary 6.16. One has the representation

σS d 1 x ei x ω

x e i x ω

x x 1 (6.19)

where ω

are smooth and satisfy

k r ω

r

C k r

d 12 k

r

1

and all k 0.

Proof. By definition,

σS d 1 x

S d 1

eix ξ σS d 1

d ξ

which is rotationally invariant. Therefore we choose x 0, . . . , 0, x

and denote

the integral on the right-hand side by I x . Thus,

I x

S d 1

ei x ξ d σS d 1 d ξ

Note that ξ d , viewed as a function on S d 1, has exactly two critical points, namely

the north and south poles

0, . . . , 0,

1

which are, moreover, nondegenerate. We therefore partition the sphere into three

parts: a small neighborhood around each of these poles and the remainder. The

Fourier integral I x then splits into three integrals by means of a smooth partition

of unity adapted to these three open sets, and we write accordingly

I x

I

x I eq x

The latter integral has non-stationary phase and decays like x

N for any N , see

Lemma 6.13. On the other hand, the integrals I

exactly fit into the frame work of

Lemma 6.14 which yields the desired result. Note that one can absorb the equato-

rial piece I eq x into either e i x ω

x without changing the stated properties

of ω

, and (6.19) follows.

Notes

Many important topics are omitted here, such as spherical harmomics. A classical

as well as comprehensive discussion of the Euclidean Fourier transform is the book by

Stein and Weiss [49]. Folland [17] and especially Rudin [40] contain more information on

the topic of tempered distributions. Hormander [26] and Wolff [56] show that the Fourier

transform on L p with p 2 take values in the distributions outside of the Lebesgue spaces

(see Theorem 7.6.6 in [26]). For more on the topic of Sobolev embeddings and trace

estimates see for example the comprehensive treatments in Gilbarg and Trudinger [ 22],

as well as Evans [13], which do not invoke the Fourier transform. Stein [45] develops




Sobolev spaces in the context of the Fourier transform, but also uses the “fundamental the-

orem of calculus” approach. A standard reference for asymptotic expansions in stationary

phase is the classical reference by Hormander [26]. For Exercise 6.2 see Theorem 6.25 inRudin [40].

Problems

Problem 6.1. Compute the Fourier transform of P.V. 1 x

. In other words, determine

limε

0

x

ε

e 2πix

ξ dx

x

for every ξ R. Conclude that (up to an irrelevant normalization) the Hilbert transform is

an isometry on L2 R . This is the analogue for the line of Lemma 4.3.

Problem 6.2. Solve the boundary value problem in the upper half plane for the Laplace

equation via the Fourier transform. Compare your findings with Problem 4.5. Generalizeto higher dimensions.

Problem 6.3. Carry out the analogue Problem 1.11 with R instead of T and Rd instead

of Td (the case of the Schrodinger equation was already addressed in Exercise 6.8 above).

Compute the explicit form of the heat kernel, i.e., the kernel Gt so that Gt u0 with t 0 is

the solution to the Cauchy problem for the heat equation ut ∆u 0, u 0 u0. Discuss

the limit t 0 .

Problem 6.4. Prove the following generalization of Bernstein’s inequality, cf. Lemma 6.12:

if supp

ˆ f E Rd where E is measurable and f is Schwartz, say, prove that

f q E

1 p

1q

f p 1 p q

(6.20)

where E stands for the Lebesgue measure. Hint: First handle the case q , p 2, by

Plancherel and Cauchy-Schwarz, and then dualize and interpolate.

Problem 6.5. Show that if f L2 R is supported in R, R , then ˆ f extends to C as an

entire function of exponential growth 2π R. The Paley-Wiener theorem states the converse:

if F is an entire function of exponential growth 2π R and such that F L2 R , then F

ˆ f

where f L2 R is supported in R, R . Prove this via a deformation of contour, and

extend to higher dimensions.

Problem 6.6. Consider the kernel K x, y

1 x

y on R2

and define the operator

T f x

0

f y

x ydy x 0

Show that T is bounded on L p R

for every 1 p , but not at p 1 or p .

Problem 6.7. The Laplace transform is defined as

L f s :

0

e

sx f x dx, s 0 (6.21)

Show that L is bounded on L2 R

but not on any L p R

with p 2. Verify that

L 2

2

π. Use this to show that T 2

2 π where T is as in (6.21). Hint: For

the L2 boundedness, write e

sx e

sx2 x

14 e

sx2 x

14 and apply Cauchy-Schwarz. For T

considerL2.




Problem 6.8. With L the Laplace transform on L2 R

as in the previous problem,

find an inversion formula.

Problem 6.9. Establish a representation as in Corollary 6.16 for the ball instead of thesphere. I.e., find the asymptotics as in that corollary for χ B where B R

d is the unit-ball

with d 2. Generalize to the case of ellipsoids, i.e.,

x x1, . . . , xd Rd

d

j 1

λ

2 j

x2 j 1

where λ j 0 are constants. Contrast this to the Fourier transform of the indicator function

of the cube.

Problem 6.10. Let A be a positive d d -matrix. Compute the Fourier transform of the

Gaussian e

Ax, x .

Problem 6.11. Compute the inverse Fourier transform of 4π2 ξ 2

k 2

1 in R3 where

k 0 is a constant. Denote this function by G x; k . Show that the operator

R k f x

Rd

G x y; k f y dy, f S R3

is bounded on L2 R

d and that R k f C

R3

as well as

∆ k 2 R k f f f S

R3

Conclude that R k ∆ k 2

1 for k 0, i.e., R k is the resolvent of the Laplacian.



CHAPTER 7

Calderon-Zygmund Theory of Singular Integrals

In this chapter we take up the problem of developing a real-variable theory of

the Hilbert transform. In fact, we will subsume the Hilbert transform in a class

of operators defined in any dimension. We shall make no reference to harmonic

functions as we did in Chapter 4, but rather rely only the properties of the kernel

alone. For simplicity, we shall mostly restrict ourselves to the translation invariant

setting.We begin with the following class of kernels which define operators by con-

volution. As for the Hilbert transform, we include a cancellation condition, see

part iii in the following definition. For example, this condition guarantees that

the principal value as in (7.1) exists (contrast this to the case K x x

d ). In

addition, this condition will be convenient in proving the L2-boundedness of the

associated operators.

This being said, it is often preferable to assume the L2-boundedness of the

Calderon-Zygmund operator instead of the cancellation condition. The logic is

then to develop the L2-boundedness theory separately from the L p-theory, with the

goal of finding necessary and sufficient conditions for L2-boundedness while the

L p theory requires only condition ii in the following definition. In many ways, ii

,

which is called H ¨ ormander condition, is therefore the most important one.

Definition 7.1. Let K : Rd

0 C satisfy, for some constant B,

i) K x B x

d for all x Rd

ii)

x 2 y

K x K x y dx B for all y 0

iii)

r x s K x dx 0 for all 0 r s

Then K is called a Calder´ on-Zygmund kernel.

Our first goal is With such a kernel we associate a translation invariant operator

by means of the principal value. Thus, the singular integral operator (or Calderon-

Zygmund operator) with kernel K is defined as

T f x :

lim

ε 0

x y ε

K x y f y dy (7.1)

for all f S Rd

.

Exercise 7.1.

79



80 7. CALDERON-ZYGMUND THEORY OF SINGULAR INTEGRALS

a) Check that the limit exists in (7.1) for all f S Rd

. How much reg-

ularity on f suffices for this property (you may take f to be compactly

supported if you wish)?b) Check that the Hilbert transform on the line with kernel 1 x

is a singular

integral operator.

There is the following simple condition that guarantees ii .

Lemma 7.2. Suppose ∇K x B x

d 1 for all x 0 and some constant B.

Then

x 2

y

K

x

K

x

y

dx

C B (7.2)

with C C

d

.

Proof. Fix x, y Rd with x 2 y . Connect x and x y by the line segment x ty, 0 t 1. This line segment lies entirely inside the ball B x, x 2). Hence

K x K x y

1

0

∇K x ty y dt

1

0

∇K x ty y dt B2d 1 x

d 1 y

Inserting this bound into the left-hand side of (7.2) yields the desired bound.

Exercise 7.2. Check that, for any fixed 0 α

1, and all x 0,

sup

y

x

2

K x K x y

y

α

B x

d α

also implies (7.2).

We now prove L2-boundedness of T , by computing K and applying Plancherel.

This is an instance where the full force of Definition 7.1 comes into play. We shall

use all three conditions, in particular the cancellation condition.

Proposition 7.3. Let K be as in Definition 7.1. Then T 2 2 CB with

C C d .

Proof. Fix 0 r s

and consider

T r ,s f

x

Rd

K y

χ r y s

y

f

x

y

dy

Let

mr ,s ξ :

Rd

e 2πix

ξ χ

r x s

K x dx

be the Fourier transform of the restricted kernel. By Plancherel’s theorem it suffices

to prove that

sup0

r s

mr ,s

CB (7.3)



7. CALDERON-ZYGMUND THEORY OF SINGULAR INTEGRALS 81

Indeed, if (7.3) holds, then

T r ,s 2 2 mr ,s

CB

uniformly in r , s. Moreover, for any f S Rd

one has

T f x lim

r 0s

T r ,s f x

pointwise in x R

d . Fatou’s lemma therefore implies that T f

2 CB

f

2 for

any f S Rd

. To verify (7.3) we split the integration in the Fourier transform

into the regions x ξ 1 and

x ξ 1. In the former case

r x ξ

1

e 2πix ξ K

x

dx

r x ξ

1

e 2πix

ξ

1 K

x

dx

x ξ

1 2π

x

ξ

K

x

dx

2π ξ

x ξ

1

B x

d 1 dx

C B ξ ξ 1 CB

as desired. Note that we used the cancellation condition iii in the first equality

sign. To deal with the case x ξ 1 one uses the cancellation in e 2πix ξ which

in turn requires smoothness of K , i.e., condition ii) (but compare Lemma 7.2).

Firstly, observe that

s x ξ

1

K x

e 2πix

ξ dx

s x ξ

1

K x

e

2πi

x

ξ

2 ξ

2

ξ

dx (7.4)

s x

ξ

2 ξ

2 ξ

1

K

x

ξ

2 ξ

2

e 2πix

ξ dx (7.5)

Denoting the expression on the left-hand side of (7.4) by F one thus has

2F

s x ξ

1

K x K x

ξ

2 ξ

2

e 2πix

ξ dx O

1

(7.6)

The O 1

term here stands for a term bounded by CB. Its origin is of course the

diff erence between the regions of integration in (7.4) and (7.5). We leave it to the

reader to check that condition i) implies that this error term is really no larger than

CB. Estimating the integral in (7.6) by means of ii) now yields

2F

x ξ

1

K x K

x

ξ

2 ξ

2

dx CB C B

as claimed. We have shown (7.3) and the proposition follows.

Exercise 7.3. Supply the omitted details concerning the O 1 term in the pre-

vious proof.




Our next goal is to show that singular integrals are bounded as operators from

L1 R

d

into weak- L1. This requires the following basic decomposition lemma due

to Calderon-Zygmund for L

1

functions. The proof is an example of a stopping timeargument .

Lemma 7.4. Let f L1 R

d and λ

0. Then one can write f g b where

g

λ and b

Q χQ f where the sum runs over a collection B

Q of disjoint

cubes such that for each Q one has

λ

1

Q

Q

f

2d λ (7.7)

Furthermore,

Q B

Q

1

λ f 1 (7.8)

Proof. For each Z we define a collection D of dyadic cubes by means of

D

Πd i

1 2 mi, 2

mi 1

m1, . . . , md Z

Notice that if Q D and Q

D , then either Q Q

or Q Q or Q

Q.

Now pick 0 so large that

1

Q

Q

f

dx

λ

for every Q D 0 . For each such cube consider its 2d “children” of size 2 0 1.

Any such cube Q will then have the property that either

1

Q

Q

f

x

dx

λ or

1

Q

Q

f

x

dx

λ (7.9)

In the latter case we stop, and include Q in the family B of “bad cubes”. Observe

that in this case

1

Q

Q

f x dx

2d

Q

Q

f x dx 2d λ

where Q denotes the parent of Q . Thus (7.7) holds. If, however, the first inequality

in (7.9) holds, then subdivide Q again into its children of half the size. Continuing

in this fashion produces a collection of disjoint (dyadic) cubes B satisfying (7.7).

Consequently, (7.8) also holds, since

B

Q

B

Q

B

1λ

Q

f x dx 1λ

Rd

f x dx

Now let x0 Rd

B Q. Then x0 is contained in a decreasing sequence Q j of

dyadic cubes each of which satisfies

1

Q j

Q j

f

x

dx

λ




By Lebesgue’s theorem f x0 λ for a.e. such x0. Since moreover Rd B Q

and Rd B Q diff er only by a set of measure zero, we can set

g f

Q

B

χQ f

so that g

λ a.e. as desired.

We can now state the crucial weak- L1 bound for singular integrals. We shall

do this under more general hypotheses than those of Definition 7.1.

Proposition 7.5. Suppose T is a linear operator bounded on L2 R

d

such that

T f

x

Rd

K x

y

f

y

dy

f

L2

comp Rd

and all x supp f . Assume that K satisfies condition iii of Definition 7.1, but

not necessarily any of the other conditions. Then for every f

S R

d

there is theweak-L1 bound

x

Rd

T f

x

λ

CB

λ

f 1 λ

0

where C C d .

Proof. Dividing by B if necessary, we may assume that B 1. Now fix

f S Rd

and let λ

0 be arbitrary. By Lemma 7.4 one can write f g b with

this value of λ. We now set

f 1 g

Q B

χQ

1

Q

Q

f x dx

f 2 b

Q B

χQ 1 Q

Q f x dx

Q B

f Q

where we have set

f Q : χQ

f

1

Q

Q

f x dx

Notice that f f 1 f 2, f 1

C λ f 1, f 2 1 2 f 1, f 1 1 2 f 1, and

Q

f Q x dx 0

for all Q B. We now proceed as follows:

x Rd

T f x λ

x R

d

T f 1 x

λ2

x R

d

T f 2 x

λ2

C

λ2 T f 1

22

x Rd

T f 2 x

λ

2

(7.10)

The first term in (7.10) is controlled by Proposition 7.3:

C

λ2 T f 1

22

C

λ2 f 1

22

C

λ2 f 1

f 1 1

C

λ f 1




To estimate the second term in (7.10) we define, for any Q B, the cube Q to be

the dilate of Q by the fixed factor 2

n (i.e., Q has the same center as Q but side

length equal to 2

n times that of Q). Thus

x Rd

T f 2 x

λ

2

B Q

x R

d B Q

T f 2

x

λ

2

C

Q

Q

2

λ

Rd Q

T f 2 x dx

C

λ f 1

2

λ

Q B

Rd Q

T f Q x dx

Perhaps the most crucial point of this proof is the fact that f Q has mean zero which

allows one to exploit the smoothness of the kernel K . More precisely, for any

x Rd

Q ,

T f Q x

Q

K x y f Q y dy

Q

K x y K x yQ f Q y dy

(7.11)

where yQ denotes the center of Q. Thus

Rd Q

T f Q x dx

Rd Q

Q

K x y K x yQ f Q y dy dx

Q

f Q y dy 2

Q

f y dy

To pass to the second inequality sign we used condition ii) in Definition 7.1 and

our choice of Q . Hence the second term on the right bound side of (7.11) is no

larger thanC

λ

Q

Q

f y dy

C

λ f 1

and we are done.

The assumption f S Rd

in the previous proposition was for convenience

only. It ensured that one could define T f by means of the principal value (7.1).

However, observe that Proposition 7.3 allows one to extend T to a bounded opera-

tor on L2. This in turn implies that the weak- L1 bound in Proposition 7.5 holds for

all f

L

1

L

2 R

d

. Indeed, inspection of the proof reveals that apart from the L

2

boundedness of T , see (7.10), the definition of T in terms of K was only used in

(7.11) where x and y are assumed to be sufficiently separated so that the integrals

are absolutely convergent.

Theorem 7.6 (Calderon-Zygmund). Let T be a singular integral operator as in

Definition 7.1. Then for every 1 p one can extend T to a bounded operator

on L p R

d with the bound T p p CB with C C p, d .




Proof. By Proposition 7.3 and 7.5 (and the previous remark) one obtains this

statement for the range 1 p

2 from the Marcinkiewicz interpolation theorem.

The range 2

p

now follows by duality. Indeed, for f , g

S R

d

one has

T f , g f , T g where T g x

limε 0

x y ε

K

x y g y dy

and K

x

:

K

x

. Since K clearly verifies conditions i)–iii) in Definition 77.1,

we are done.

Remark. It is important to realize that the cancellation condition iii) was only

used to prove L2 boundedness, but did not appear in the proof of Proposition 7.5

explicitly. Therefore, T is bounded on L p R

d provided it is bounded for p 2

and conditions i) and ii) of Definition 7.1 hold.

We now present some of the most basic examples of singular integrals. Con-

sider the Poisson equation u f in Rd where f S Rd , d 2. In the following

exercise we ask the reader to show that in dimensions d 3

u x C d

Rd

x y

2 d f y dy (7.12)

with some dimensional constant C d is a solution to this equation. In fact, one can

show that (7.12) gives the unique bounded solution but we shall not need to know

that here. In two dimensions, the formula reads

u x C 2

R2

log x y f y dy (7.13)

The kernels x

2 d and log x

are called Newton potentials. It turns out that the

second derivatives

2u xi x j with u defined by these convolutions can be expressed

as Calderon-Zygmund operators acting on f (the so-called “double Riesz trans-

forms”).

Exercise 7.4.

a) With f S and a suitable constant C d , show that (7.12) satisfies

∆u f

Hint: pass the Laplacian onto f and introduce a cut-off to the set x y

ε in (7.12). Apply Green’s formula, and let ε 0 to recover f x .

b) With u as in (7.12), show that for f S

Rd

with d

3

2u

xi x j

x C d d 2 d 1

Rd K i j x y f y dy

in the principal value sense, where

K i j x

xi x j

x

d

2 if i j

x2i

1d

x

2

x

d

2 if i

j

Find the analogue for d 2.




The convolution in (7.12) is an example of what one calls fractional integra-

tion. More generally, for any 0 α d define

I α f x :

Rd

x y

α d f y dy, f S Rd (7.16)

We cannot use Young’s inequality since the kernel is not in any Lr R

d

space, but

it is clearly in weak- Lr with r

d d α

. The following proposition shows that never-

theless one still has (up to an endpoint) the same conclusion as Young’s inequality

would imply.

Proposition 7.8. For any 1 p q and any 0

α d there is the

bound

I α f q C p, q, d f p provided 1

q

1

p

α

d

for any f S R

d

.

Proof. By Marcinkiewicz’s interpolation theorem it suffices for this bound to

hold with weak Lq. Thus, we need to show that

x Rd

I α f x λ C λ q f

q p (7.17)

Fix some f with f p 1 and an arbitary λ 0. With r 0 to be determined,

break up the kernel as follows

k α x : x

α d

k α x χ x

r

k α x χ x

r

: k 1

α x k 2

α x

Then

k 2

α f

k 2

α p

f p Cr α

d p

Determine r so that the right-hand side equals λ2

. Then

x

Rd

I α f

x

λ x

Rd

k

1

α f

x

λ 2

C λ p k

1

α f

p p C λ p

k 1

α

p

1

C λ pr α p C λ q

which is (7.17).

The condition on p and q is of course dictated by scaling. By Exercise 6.3, one

has

∆

s2 f C s, d I s f f S R

d

(7.18)

Thus, Proposition 7.8 implies the following Sobolev imbedding estimate.

Corollary 7.9. Let 0 s d

2 . Then for all 12

1q

sd one has

f Lq Rd

C s, d f

H s Rd

f S Rd

(7.19)

Proof. Fix f S Rd

and set g :

∆

s2 f . Then by (7.18) one has

f C s, d I sg

and the estimate (7.19) follows from the previous proposition.




We now address the question whether a singular integral operator can be de-

fined by means of formula (7.1) even if f L p R

d

rather than f S Rd

, say.

This question is the analogue of Proposition 4.9 and should be understood as fol-lows: for 1 p

we defined T “abstractly” as an operator on L p R

d

by exten-

sion from S Rd

via the apriori bounds

T f p C p,d f p for all f S Rd

(the

latter space is dense in L p R

d , cf. Proposition 1.5). We now ask if the principal

value (7.1) converges almost everywhere to this extension T for any f L p

Rd

.

Based on our previous experience with such questions, we introduce the associated

maximal operator

T

f x

sup

ε 0

y ε

K y

f

x

y

dy

and seek weak- L1 bounds for it. As in previous cases where we studied almost

everywhere convergence, this will suffice to conclude the a.e. convergence on L p.

In what follows we shall need the Hardy-Littlewood maximal function

M f x sup x B

1

B

B

f y dy

where the supremum runs over all balls B containing x. M satisfies the same type

of bounds as those stated in Proposition 2.5.

We prove the desired bounds on T

only for a subclass of kernels, namely the

homogeneous ones. This class includes the Riesz transforms from above. More

precisely, let

K x :

Ω

x x

x

d for x 0 (7.20)

where Ω : S d 1 C, Ω C 1 S d 1

and

S d 1 Ω x σ dx 0 where σ is the

surface measure on S d 1. Observe the homogeneity K tx t d K x for all t

0.

Exercise 7.6. Check that any such K satisfies the conditions in Definition 7.1.

Proposition 7.10. Suppose K is of the form (7.20). Then T

satisfies

T

f x C M T f x M f x

with some absolute constant C. In particular, T

is bounded on L p R

d

for 1

p . Furthermore, T

is also weak-L1 bounded.

Proof. Let K x :

K x χ x 1

and more generally

K ε x ε d K

x

ε K x χ

x ε

Pick a smooth bump function ϕ C

0 Rd

, ϕ 0,

Rd ϕ dx 1. Set

Φ : ϕ

K

K

and observe that ϕ K is well-defined in the principal value sense. For any function

F on Rd let F ε x :

ε d F

xε

be its L1-normalized rescaling. Then K ε K ,

K ε

K ε, and thus

Φε ϕ K

ε

K ε ϕε K ε

K ε ϕε K

K ε




Hence, for any f S Rd

, one has

K ε f ϕε K f Φε f

We now invoke the analogue of Lemma 2.7 for radially bounded approximate iden-

tities in Rd . This of course requires that Φε ε 0 from such a radially bounded

approximate identity, the verification of which we leave to the reader (the case of

ϕε is obvious). Indeed, one checks that

Φ x C min 1, x

d 1

which implies the desired property. Therefore,

T

f C M T f M f

as claimed. The boundedness of T

on L p R

d

for 1 p

now follows from

that of T and M . The proof of the weak- L1 boundedness of T

is a variation of the

same property of T , of Proposition 7.5.

Exercise 7.7. Show that Φε ε

0 as it appears in the previous proof forms a

radially bounded approximate identity.

Corollary 7.11. Let K be a homogeneous singular integral kernel as in (7.20).

Then for any f L p R

d , 1

p , the limit in (7.1) exists almost everywhere.

Proof. Let

Λ f x lim supε 0

T ε f x liminf ε

0

T ε f x

Observe that Λ f

2T

f . Fix f L p and let g

S

Rd

so that

f g p δ

for a given small δ 0. Then Λ

f Λ

f g

and therefore

Λ f p C Λ f g p C δ

if 1 p . Hence, Λ f 0. The case p 1 is similar.

Operators of the form (7.20) arise very naturally by considering multiplier op-

erators of the form T m f : m ˆ f

where m is homogeneous of degree zero and

belongs to C R

d

0

(for simplicity we assume infinite regularity). Indeed, with

f S Rd

one has over the tempered distributions

T f

K

f , K

mwhere m

S

Rd

is the distributional inverse Fourier transform. From the dilation

law 6.2 one sees that K is homogeneous of degree d which means that for any

λ 0

m λ , ϕ m, λ d ϕ λ 1 m, ϕ λ

m, ϕ λ λ d m λ 1

, ϕ




for all ϕ S Rd

. Observer that constants are homogeneous of degree zero,

whence 1 δ0 is homogeneous of degree d . This might seem less strange if

one remembers that as ε

0 one has1

B 0, ε

χ B 0,ε

δ0 in S

Rd

since the left-hand scales like a function that is homogeneous of degree d . We

now prove the following general statement which is very natural in view of the fact

that the representation of m needs to be invariant under adding constants to m.

Lemma 7.12. Let m C

Rd

0

be homogeneous of degree zero. Let

m

S d 1

denote the mean of m on S d 1. Then m S

Rd

satisfies

m m S d 1 δ0 P.V. K Ω

where K Ω is of the form (7.20) with Ω S d 1

0 and Ω C

S d 1

.

Proof. We may assume that m has mean zero on the sphere, i.e.,

S d 1

m ω σ

d ω 0

Observe that u j :

d j

m S

Rd

is homogeneous of degree

d and has mean

zero on S d 1. This implies that P.V. u j is well-defined in the usual integral sense,

and moreover, one has

u j P.V. u j

α

cα

αδ0

by Exercise 6.2 where the sum is finite. In fact, by homogeneity, the sum is neces-

sarily of the form c j δ0. Passing to the Fourier transforms in S one obtains

u j

P.V. u j c j

Next, one verifies that P.V. u j x is a smooth function on Rd

0

which is homo-

geneous of degree zero. Indeed, with χ ξ a smooth radial, compactly supported

function which equals 1 on a neighborhood of the origin one has for any x 0

P.V. u j x

Rd

d j m ξ χ ξ e

2πix ξ

1

d ξ

d

j 1

x j

2πi x

2

Rd

j

d j m ξ

1 χ ξ

e 2πix ξ d ξ

where p j

ξ j. Both integrals are absolutely convergent, and the first integral

clearly defines a smooth function in x. We may repeat the integration by parts in

the second integral any number of times, which implies that the second integral

also defines a smooth function. The homogeneity of degree zero follows from the

homogeneity of P.V. u j. We may therefore write

m x

Ω

x x

x

d x 0




where Ω C

S d 1 . Let ϕ S R

d be radial. Then

m, ϕ m, ϕ 0

since m has mean zero on spheres. Thus, Ω also has mean zero on S d 1. By

construction, m P.V. K Ω is supported at

0

. By Exercise 6.2 we conclude that

this diff erence is therefore a polynomial in derivatives of δ0. By homogeneity,

it must be a constant. By the vanishing means, this constant is zero and we are

done.

As a consequence, we see that singular integral operators of the type T m form

an algebra, with T m being invertible if and only if m 0 on S d 1. In Chapter 10 we

shall see that one can still obtain L p boundedness of T m without the strong assump-

tion of m being homogeneous of degree 0. Instead, one requires that ξ k

αm ξ

is bounded for all α

k for finitely many k . While these conditions are satisfied

if m is homogeneous of degree zero, they are much weaker than that property.

We now wish to quantify the failure of the boundedness of singular integral

operators on L . It turns out that the correct extension of L which contains

T L

for any singular integral operator T is the space of functions of bounded

mean oscillation on Rd , denoted by BMO R

d

. In what follows, Q

f y dy Q

1

Q

f y dy

Definition 7.13. Let f L1loc

Rd

. Define

f

x :

sup

Q x

Q

f y f Q dy (7.21)

where the supremum runs over cubes Q and f Q is the mean of f over Q. Then

f BMO Rd

if and only if f

L

Rd

. Then f BMO : f

. If Q0 is

a fixed cube, then BMO Q0 is obtained by defining f in terms of all subcubes

of Q0.

Of course one may replace cubes here with balls without changing anything.

Moreover, BMO is a norm only after factoring out the constants. Clearly,

L

BMO but BMO is strictly larger. In addition, BMO scales like L , i.e.,

f

λ BMO f

BMO. Suppose Q2 Q1 are two cubes with Q2 being half the

size of Q1. Then

f Q1

f Q2

Q1

Q2

Q1

f

y

f Q1

dy

2d

f

BMO (7.22)

Iterating this relation leads to the following: suppose Qn Qn 1 . . . Q1 Q0

is a sequence of nested cubes so that the diameters decrease by a factor of 2 at each

step. Then

f Qn f Q0

n 2d f BMO (7.23)




The important feature here is that while the size of Qn is by a factor of 2 n smaller

than Q0, the averages increase only linearly in n. As an example, consider the

function f defined on the interval

0, 1

as follows

g

n 0

an χ 0 ,2 n

(7.24)

with 1 an for all n 0.

Exercise 7.8.

Show that g

BMO

0, 1

if and only if an

O 1

. In that case, verify

that g x log x

for 0 x

12

.

Show that χ x

1

log x BMO 1, 1 , but

χ x 1

sign x

log x BMO

1, 1

Show that for any f as in Definition 7.13

f

x : sup x Q

inf c

Q

f y c dy

satisfies f

f

2 f .

We shall not distinguish between f and f . It is easy to show that singular

integrals take L

BMO.

Theorem 7.14. Let T be a singular integral operator as in Definition 7.1. Then

T f

BMO

CB

f

f

L

2 R

d

L

Rd

We are assuming here that f L2 R

d so that T f is well-defined.

Proof. We may assume that B 1 in Definition 7.1. Fix some f as above, a

ball B0 : B x0, R R

d and define

c B0 :

y x0 2 R

K x0 y f y dy

Then, with B

0 :

B x0, 2 R , one estimates

B0

T f

x

c B0

dx

B0

y

x0

2 R

K

x

y

K

x0

y

f y

dydx

B0

T

χ B

0 f

x dx

C B0 f

B0

12

T 2 2 χ B

0 f 2

C B0 f

and we are done.




In fact, it turns out that BMO is the smallest space with contains T L

Rd

for

every singular integral operator T as in Definition 7.1, see the notes to this chapter.

If BMO is supposed to be a useful substitute to L

Rd

, then we would like tobe able to use it in interpolation theory. This is indeed possible, as the following

theorem shows.

Theorem 7.15. Let T be a linear operator bonded on L p0 R

d for some 1

p0 and bounded from L

BMO. Then T is bounded on L p R

d for any

p0 p

.

The proof of this result requires some machinery, more specifically a com-

parison between the Hardy-Littlewood maximal function and the sharp maximal

function as in (7.21). For this purpose it is convenient to use the dyadic maximal

function, which we denote by M dyad and which is defined as

M dyad f x : sup x Q

Q

f y dy

where the supremum now runs over dyadic cubes which are defined as the collec-

tion

Qdyad :

2k 0, 1

d

2k Z

d k Z

Note that any two dyadic cubes are either disjoint, or one contains the other. Sup-

pose Q0 Qdyad, and let f

L1

Q0

. Then for any

λ

Q0

f y dy (7.25)

we can perform a Calderon-Zygmund decomposition as in Lemma 7.4 to conclude

that

x Q0 M dyad f x λ

Q

B

Q

whence in particular the measure of the left-hand side is λ 1

f L1 Q0

. In other

words, we have reestablished the weak- L1 bound for the maximal function. The

condition (7.25) serves as a “starting condition” for the stopping-time construction

in Lemma 7.4. The other values of λ are of no interest, as in that case

λ 1 f L1

Q0

Q0

so the weak- L1 bound becomes trivial. The construction of course does not neces-

sarily need to be localized to any Q0, but can also be carried out on Rd globally.

We now come to the aforementioned comparison between M dyad f and f .

While it is clear that f

2 M f , the usual Hardy-Littlewood maximal function,

we require a lower bound. In essence, the following theorem says that on L p with

finite p, the sharp function contains no new information.

Theorem 7.16. Let 1 p0

p

and suppose f L p0

Rd

. Then

Rd

M dyad f

p x dx C p, d

Rd

f

x

pdx (7.26)




We now take a closer look at the property (7.23). In fact, as suggested by

Exercise 7.8 we shall now show that any BMO function is exponentially integrable.

The precise statement is given by the following John-Nirenberg inequality.Theorem 7.17. Let f

BMO R

d . Then for any cube Q one has

x Q f x f Q λ C Q exp

c λ

f

BMO

for any λ 0. The constants c, C are absolute.

Proof. We may take Q 0, 1

d . Assume f BMO 1 with f Q 0, and

let λ 10. Perform a Calderon-Zygmund decomposition of f on the cube Q at a

level µ 2 that we shall fix later. This yields f g b, g 2d µ and

b

Q

B

χQ

f

f Q

,

Q

B

Q

µ 1

Define

E λ

:

sup g BMO

Q

1

x

Q

g

x

gQ λ

Then

E λ

x Q

b

x

λ 2d µ

Q

B

x Q

f x f Q

λ 2d µ

Q

B

E λ 2d µ Q

µ 1 E λ 2d µ

(7.28)

Iterating this relation implies

E λ µ n E λ 2d µn

Setting for example µ 2 leads to the desired bound.

Exercise 7.9. Show that one can characterize BMO also by the following con-

dition, where 1 r

is fixed but arbitrary:

supQ

Q

f y f Q

r dy

1r

(7.29)

In fact, show that if the left-hand side is 1, then

f

BMO C

d , r

. Conversely,

by Holder’s f BMO

1 implies that the left-hand side of (7.29) is 1.

A remarkable characterization of BMO was found by Coifman, Rochberg and

Weiss. The considered commutators between Calderon-Zygmund operators and

operators given by multiplication by a function, in other words, T , b

T b

bT .

While it is clear that T , b

is bounded on L p for 1 p

if b L , these

authors discovered that this property remains true for b BMO Rd

. What is more,

if this property holds for a single Riesz-transform (or a single nonzero operator of

the form (7.20), then necessarily b BMO. We shall now prove one direction of

this statement.




Theorem 7.18. Let T be a linear operator as in Proposition 7.5 with the added

property that for some fixed 0 δ 1 one has

K x y K x0 y A x

x0

δ

y x0

d δ

y x0 2 x x0

(7.30)

as well as T

2 2 A. Then for any b

BMO

Rd

one has

T , b

p p C

d , p

A

b

BMO

for all 1 p .

Proof. Since (7.30) implies the Hormander condition, we see that any operator

T as in the theorem is a Calderon-Zygmund operator. We can also assume that

b

BMO

1 and A

1. Fix f

S

R

d

, say. We shall show that for any 1

r

T , b f

C r , d M r T f M r f (7.31)

where

M r f x : supQ x

Q

f y

r dy

1r

Letting 1 r p

and noting that M r is L p-bounded, we conclude from (7.31)

that T , b

f

p C

d , p

f

p and Theorem 7.16 implies the desired estimate

(the p0-condition in that theorem is clear by Theorem 7.17 and f S Rd

).

To establish (7.31) fix any cube Q, which by scaling and translation invariance

may be taken to be the unit cube centered at 0. Denote by Q the cube of side-length

2

d , and write

T , b

f

T

b

b Q

f 1

T

b

b Q

f 2

b

b Q

T f : g1

g2 g3

where f 1 : χ Q f , f 2 f f 1. First, by Exercise 7.9

Q

g3 x dx

Q

b bQ

r dx

1

r

Q

T f

r x dx

1r

C inf Q

M r T f

Second, choose some 1 s r . Then by boundedness of T on L s one obtains

Q

g1 x dx

Q

g1 x

s dx

1

s

Q

b bQ

s f x

s dx

1

s

C inf

Q M r f

where we used (7.22) to compare bQ with b Q. Finally, set

cQ : T

b b Q f 2

0




Then by (7.30) one has the estimate

Q

g2

x

cQ dx

C

Q

Qc

K x y K 0 y b y b Q f y dydx

C

0

2 1 Q 2 Q

2

d δ

b y bQ f y dy

C

0

2 δ

2 1 Q

b y bQ

r dy

1

r

2 1 Q

f y

r dy

1r

C

0

2 δ inf Q

M r f C inf Q

M r f

where we used (7.23) to pass to the final estimate.

We have merely scratched the surface of BMO theory. Of course one would

like to determine a duality relation analogous to the basic L1 R

d

L

Rd

.

This turns out to be H 1

Rd

BMO

Rd

where H 1

Rd

is the completion

of S Rd

under the norm

f H 1 : f 1

d

j 1

R j f 1 (7.32)

Dually, one can then write

BMO L

d

j 1

R j L (7.33)

where the action of the Riesz transforms on L needs to be defined suitably. In

particular, this shows that BMO is the smallest space into which singular integral

operators are bounded from L . We shall establish some of these facts in a one-

dimensional dyadic setting in a later chapter.

We close this introductory chapter by stating a number of natural questions.

First, we may ask if a Calderon-Zygmund operator is bounded on Holder spaces

C α Rd

where 0 α 1. As we shall see, this turns out to be the case. The

analogue of Corollary 7.7 in this case is known as Schauder estimates which play

a fundamental role in elliptic equations. There are a number of proofs known of

this classical fact, and we shall develop one possible approach in Chapter 10.

Another question that one may ask, and which turned out to have far-reaching

consequences, is as follows: is there a proof of the L2-boundedness of the Hilbert

transform which does not rely on the Fourier transform? In other words, we wish

to find a proof of the L2-boundedness which relies exclusively on the properties

of the kernel itself. We shall answer this question in the affirmative in the larger

context of Calderon-Zygmund operators, and introduce the important method of

almost orthogonality in the process, see Chapter 8.




And finally, one should of course ask if there is a version of Calderon-Zygmund

theory for kernels which are not translation invariant. This question has far-reaching

consequences especially with respect to the L

2

boundedness. While it is true thatthe L p-theory is quite similar to the translation invariant case (see Problem 7.3),

the L2 theory leads to the T 1 theorem. We shall develop the basic ideas for that

theorem by means of a dyadic model on the line.

Notes

For most of the basic material of this chapter, excluding BMO, see Stein’s 1970

book [45]. For a systematic development of H 1 and BMO see [46] which gives a systematic

account of many aspects of Calderon-Zygmund theory that appeared from 1970 to roughly

1992. A classic reference for H 1 and the duality with BMO is the paper by Feff erman and

Stein [16], which contains (7.32). Uchiyama [55] gives a constructive proof of the decom-

position of a BMO function as a sum of L and images thereof under the Riesz transforms,

see (7.33). Coifman and Meyer [10] off er another perspective of Calderon-Zygmund the-

ory with many applications and connections with wavelets. The proof of the Coifman,

Rochberg, Weiss commutator estimate is from Janson [28] (credited to Stromberg), see

also Torchinsky [54]. One important aspect of Calderon-Zygmund theory that we do not

discuss in this text is that of A p-weights, which is closely related to BMO. For this material

see Stein [46], Garcıa-Cuerva, Rubio de Francia [7], as well as Duoandikoetxea [12]. The

classical “method or rotations” which can be used to reduce higher-dimensional results to

one-dimensional ones, is discussed in [12].

Problems

Problem 7.1. Let ai j x

d i, j

1 be continuous in some domain Ω R

d , d 2, and

suppose that these matrices are elliptic in the sense that for all x Ω

θ ξ

2

d

i, j 1

ai j x ξ i ξ j Θ ξ 2 ξ R

d

for some fixed 0 θ Θ. Assume that u C 2 Ω satisfies

d

i, j 1

ai j x

xi

x j

u x f x

Then show that for any compact K Ω and any 1 p one has the bound

D2u

L p

K

C u

L p

Ω

f

L p

Ω

where C only depends on p, K ,Ω and the ai j.

Problem 7.2. This problem shows how symmetries limit the boundedness properties

of operators which commute with these symmetries. We shall consider translation and

dilation symmetries, respectively.

a) Show that for any translation invariant non-zero operator T : L p R

d Lq

Rd

one necessarily has q p.




b) Show that a homogeneous kernel as that in (7.20) can only be bounded from

L p R

d Lq

Rd

if p q.

Problem 7.3. Let T be a bounded linear operator on L2 R

d so that for some function

K x, y measurable and locally bounded on Rd R

d ∆ with ∆ x y being the

diagonal, one has for all f L2comp R

d

T f x

Rd

K x, y f y dy, x supp f

Furthermore, assume the Hormander conditions

x

y 2

y

y

K x, y K x, y

dx A y y

x

y 2

x

x

K

x, y

K

x

, y

dy

A

x x

(7.34)

Then T is bounded from L1 to weak- L1 and strongly on L p for every 1 p .

Problem 7.4. This is the continuum analogue of Problem 4.3. Let m be a function of

bounded variation on R. Show that T m f m ˆ f

is bounded on L p R for any 1 p

. Find a suitable generalization to higher dimensions, starting with R2.

Problem 7.5. Let f BMO Rd

. Show that for any σ d and 1 r one has

Rd

f x f Q0

r

1 x

σ dx C d , σ, r f BMO

where Q0 is any cube of side-length 1. What does one obtain by rescaling this inequality?

Problem 7.6. Let f L1 Rd have compact support, say B : B 0, 1 . Let M be theHardy-Littlewood maximal function. Show that M f L1

B if and only if f log 2 f

L1 B . Now redo Problem 4.6 based on this fact and the F.&M. Riesz theorem. Hint:

Reprove the weak- L1 bound on M by means of the Calderon-Zygmund decomposition. In

particular, show that it in order to bound M f λ

it is enough to control the L1 norm

of f on f c λ where c 0 is some constant. See Stein [45] page 23 for more details.

Problem 7.7. Show that the Hilbert transform H with kernel 1 x

is bounded on the

Holder spaces C α R , 0 α 1. In other words, prove that for any 0 α 1

H f α C α f α

for all f C 10

R , say. Here α is the C α-seminorm. Use only properties of the kernel for

this problem.

Problem 7.8. Let A, B be reflexive Banach spaces (a reader not familiar with this

may take them to be Hilbert spaces). Let K x, y be defined, measurable, and locally

bounded on Rd R

d ∆ where ∆ x y is the diagonal, and take its values inL A, B ,

the bounded linear operators from A to B. Define an operator T on C 0comp Rd ; A , the

continuous compactly supported functions taking their values in A, such that

T f x

K x, y f y dy, x supp f




for all f C 0comp Rd ; A . Assume that for some finite constant C 0

x

y 2

x

x

K x, y K x , y L

A, B

dy C 0 x x

x

y 2

y

y

K x, y K x, y

L

A, B

dx C 0 y y

Further, assume that T : Lr R

d ; A Lr R

d ; B for some 1 r with norm bounded

by C 0. Show that then T : L p R

d ; A L p R

d ; B for all 1 p with norm bounded

by C 0 C 1 where C 1 C 1 d , r .

Problem 7.9. Investigate the endpoint p

d α

of Proposition 7.8. More precisely, find

examples which show that fractional integration in that case is not bounded into L . Verify

that your examples get mapped into BMO L

Rd

. For the positive result which proves

boundedness L p BMO see Adams [1].

Problem 7.10. Let f L1 T . Given λ 1, show that there exists E T (depending

on λ and f ) so that E λ

1 and for all N

Z

1

N

T

E

N

n 1

S n f θ

2 d θ C λ f

21

C is a constant independent of f , N , λ. Hint: First apply the Plancherel theorem in n, and

then use a Calderon-Zygmund decomposition.



CHAPTER 8

Almost Orthogonality

The proof of L2 boundedness of Calderon-Zygmund operators in the previous

chapter was based on the Fourier transform. This is quite restrictive as it requires

the operator to be translation invariant. We now present a device that allows one to

avoid the Fourier transform in the context of L2-theory in many instances.

Let us start from the basic observation that the operator norm of an infinite

diagonal matrix viewed as an operator on 2 Z is as large as its largest entry. Tobe specific, let

T

ξ j j Z

λ j ξ j j Z ξ j j

Z 2 Z

where λ j j Z is a fixed sequence of complex numbers. Then

T 2

2 sup

j

λ j

More generally, suppose a Hilbert space H can be written as an infinite orthogonal

sum

H

j

H j, H f

j

f j, f j H j

and suppose that T j are operators on H with T jH k 0

if k j. Furthermore,

assume that the range of T j is a subspace of H j. Then

T f

2H

j,k

T j f j, T k f k

j

T j f j

2H j

sup

j

T j

2H j

H j

j

f j

2H j

M 2

f

2H

where M sup j

T j H j H j .

Generalizing from the previous example, assume that T is a bounded operator

on the Hilbert space H which admits the representation T

j T j such that

Ran T j Ran T k as well as Ran T

j Ran T

k for j k . In other words,

we assume that T

k T j 0 and T k T

j 0 for j k . Now recall that Ran T

j

ker T j

and denote by P j the orthogonal projection onto that subspace. Then for

any f H one has

T f

j

T jP j f

j

T j f j, f j P j f

101



102 8. ALMOST ORTHOGONALITY

which implies that

T f

2

H

j

T j f j

2

H

sup

j

T j

2

j

f j

2

H

M 2

f

2

H

with M : sup j T j .

As a final step, we shall now relax the orthogonality conditions T

k T j

0 and

T k T

j 0 for j k from above, which turn out to be too restrictive for most

applications. The idea is simply to replace this strong vanishing requirement by

a condition which ensures sufficient decay in j k . One can think of this as

replacing diagonal matrices with matrices whose entries decay in a controllable

fashion away from the diagonal. The following lemma, known as Cotlar-Stein

lemma, carries this idea out.

Lemma 8.1. Let T j

N j 1

be finitely many operators on L2 such that for some

function γ : Z R , one has

T

j T k γ 2 j k , T jT

k γ 2 j k

for any 1 j, k N. Let

γ : A

Then

N j

1 T j A.

Proof. For any positive integer n,

T

T

n

N

j1,..., jn 1k 1,...,k n

1

T

j1T k 1 T

j2T k 2 T

jnT k n

We now take the operator norm of both sides of this identity, and apply the triangle

inequality on the right-hand side. Furthermore, we bound the norm of the products

of operators by the products of the norms in two diff erent ways, followed by taking

the square root of the product of the resulting estimate.

Therefore, with sup1 j N

T j : B

T T

n

N

j1,..., jn 1

k 1,...k n 1

T j1

12

T

j1 T k 1

12

T k 1 T

j2

12

T k n

1 T

jn

12

T

jn T k n

12

T k n

12

N

j1,..., jn 1k 1,...,k n 1

B γ j1 k 1 γ k 1

j2 γ j2 k 2 γ k n 1

jn γ jn k n

B

N BA2n 1



8. ALMOST ORTHOGONALITY 103

Since T T is self-adjoint, the spectral theorem implies that T T

n T T

n

T

2n. Hence,

T

N BA 1

1

2n

ALetting n

yields the desired bound.

Note that one needs both smallness of T

j T k and T jT

k in the Cotlar-Stein

lemma, as can be seen from the examples preceding the lemma.

In order to apply Lemma 8.1 one often invokes the following simple device.

Lemma 8.2. Define an integral operator on a measure space X Y with the

positive product measure µ ν via

T f

x

Y

K x, y

f y

ν dy

where K is a measurable kernel. One has the following bounds (the first three items

are known as Schur’s lemma):

T

1 1 sup y

Y

X K

x, y µ

dx

: A

T

sup x X

Y K

x, y ν

dy

: B

T p

p A1 p B

1

p where 1 p .

T 1

K L

X Y

Proof. The first two items are immediate from the definitions, and the third

then follows by interpolation. Alternatively, one can use Holder’s inequality. The

fourth item is again evident from the definitions.

Henceforth, we shall simply refer to this lemma as “Schur’s test”. We will now

give an alternative proof of L2 boundedness of singular integrals for kernels as inDefinition 7.1 which satisfy the stronger condition

∇K

x

B

x

d 1

cf. Lemma 7.2. This requires a certain standard partition of unity over a geometric

scale which we now present.

Lemma 8.3. There exists ψ C

Rd

with the property that supp ψ R

d

0

is compact and such that

j

ψ 2 j x

1 x 0 (8.1)

For any given x 0 at most two terms in this sum are nonzero. Moreover, ψ canbe chosen to be a radial nonnegative function.

Proof. Let χ C

Rd

satisfy χ x

1 for all x

1 and χ x 0 for

x

2. Set ψ

x :

χ x

χ 2 x

. For any positive N one has

N

j N

ψ 2 j x χ 2 N x χ 2 N 1 x




If x 0 is given, then we take N so large that χ 2 N x 1 and χ 2 N 1 x 0.

This implies (8.1), and the other properties are immediate as well.

The point of the following corollary is the method of proof rather than the

statement (which is weaker than the one in the previous chapter).

Corollary 8.4. Let K be as in Definition 7.1 with the additional assumption

that ∇K x B x

d 1. Then

T 2 2 CB

with C C d .

Proof. We may take B 1. Let ψ be a radial function as in Lemma 8.3 and

set K j x K x ψ 2 j x . One now easily verifies that these kernels have the

following properties: for all j Z one has

K j x dx 0,

∇K j

C 2 j2 jd

In addition, one has the estimates

K j x dx C

x K j x dx C 2 j

with some absolute constant C . Define

T j f

x

Rd

K j x

y

f

y

dy

Observe that this integral is absolutely convergent for any f L1 oc

Rd

. We shall

now check the conditions in Lemma 8.1. Let K j x

:

K j

x . Then it is easy to

see that

T

j T k f x

:

Rd

K j K k

y f

x

y

dy

and

T jT

k f

x

Rd

K j

K k

y

f

x

y

dy

Hence, by Young’s inequality,

T

j T k 2

2

K j K k 1

and

T jT

k 2 2 K j

K k 1




It suffices to consider the case j k . Then, using the cancellation condition

K k y dy 0, one obtains

K j K k x

Rd

K j y x K k y dy

Rd

K j y x K j x

K k y dy

Rd

∇K j

y K k y dy

C 2 j2 jd 2k

Since

supp

K j K k supp

K j supp K k B 0, C 2 j

we further conclude that

K j K k 1 C 2k

j C 2

j

k

Therefore, Lemma 8.1 applies with

γ 2 C 2

and the corollary follows.

Exercise 8.1.

a) In the previous proof it suffices to consider the case j k 0. Provide

the details of this reduction.

b) Observe that Corollary 8.4 covers the Hilbert transform. In that case,

draw the graph of K x

1 x

and also K j x

1 x

ψ 2 j x

and explain the

previous argument by means of pictures.

Another simple application of these almost orthogonality ideas is the Calderon-

Vaillancourt theorem. This result concerns the L2-boundedness of pseudo-di ff erential

operators of the form

T f x

Rd

eix ξ a x, ξ ˆ f ξ d ξ, f S Rd

(8.2)

where a C Rd

Rd

is such that

sup x,ξ Rd

α x a x, ξ

α ξ a x, ξ

B (8.3)

for all α d 1.

Proposition 8.5. Under the conditions (8.3) the operators in (8.2) are bounded

on L2 R

d

.

Proof. We may assume that B 1. Let χ be smooth and compactly supported

such that χ k k Zd forms a partition of unity in Rd :

k Zd

χ ξ k

1

ξ Rd




To construct such a χ, start from a compactly supported smooth η 0 with the

property that

ψ

x

:

k Zd η

x

k

0

x

R

d

Note that ψ is periodic with respect to Zd and therefore uniformly lower bounded.

Now define χ :

χ

ψ.

Set

χk x, ξ : χ x k χ ξ

ak , x, ξ : a x, ξ χ x k χ ξ

and define

T k , f x :

Rd

eix ξ ak , x, ξ f ξ d ξ

for f

S R

d

. Note that we have dropped the Fourier transform of f , which isadmissible by Plancherel’s theorem. First, Lemma 8.2 implies that

supk ,

T k , 2 2 C 1

Furthermore, we claim that

T

k ,

T k , 2

2 C k k d 1 d 1

T k , T

k , 2

2 C k k d 1 d 1

(8.4)

for all k , k , ,

Zd . Since the bounds on the right-hand side are summable over

k , Zd , Lemma 8.1 implies that

k N

N

T k , f

2 C f 2 f S R

d

for any positive integer N with some absolute constant C . Passing to the limit

N concludes the proof. To prove (8.4) we start from

T

k , g ξ

Rd

e ix ξ a x, ξ g x dx

which shows that T

k ,

T k , 0 requires that k k C and similarly, T k , T

k , 0

requires that C . The kernel of T

k ,

T k , is

K k , ;k ,

ξ, η :

Rd

e ix ξ η ak , x, ξ ak ,

x, η dx (8.5)

To prove the decay in

we may assume that this diff erence exceeds some

constant which is chosen such that ξ η

1 on the support of ak , x, ξ ak ,

x, η .

We now use the relation

i ξ η

ξ η

2∇ xe ix ξ η

e ix ξ η

in order to integrate by parts in (8.5) repeatedly, to precise d 1 times. This yields

K k , ;k ,

ξ, η C

d 1




Since the support of the kernel with respect to either variable is of size C 1, we

conclude from Schur’s test that

T

k , T k , 2 2 C

d

1

where may assume that k k C 1. The same type of argument now implies

the second relation of (8.4) and we are done.

We now turn to Hardy’s inequality, which provides a natural transition between

this chapter and next one devoted to the uncertainty principle.

Theorem 8.6. For any 0 s

d 2

there is a constant C s, d with the property

that

x

s f 2 C s, d f

H s Rd

(8.6)

for all f

H s

R

d

.

Proof. The stated bound is evident for s 0, so we may assume that 0 s

d 2

by interpolation (any reader uncomfortable with this interpolation may simply

assume that condition on s without missing anything from the following argument).

By density, it suffices to prove this estimate for f S Rd

. We use the partition

of unity from Lemma 8.3 and write Pk f ψ

2 k ξ ˆ f

for any k Z. By

construction, f

k Pk f where the series converges in several ways, for example

in L2. Furthermore, we set χ x ψ 2

x with Z. Then

x

s f 2

k

x

sPk f 2

C

k

k 0

2 2 s χ Pk f

22 22sk

Pk f

22

1

2

(8.7)

where the final term is the contribution from the region x 2 k . The idea

here is that the main contribution should come from k 0 which immediately

yields (8.6). The mechanism here derives from the uncertainty principle which we

will discuss in the following chapter; an indication of this is given by the trans-

formation law (6.1) and in particular the special case of this, namely the dilation

identity (6.2).

To implement this idea, we let ψ be smooth and compactly supported in Rd

0

such that ψ 1 on supp ψ . This implies that Pk f :

ψ 2 k ξ ˆ f

for any k Z

satisfies Pk Pk Pk Pk Pk whence

χ Pk f 2 χ Pk Pk f 2 χ Pk 2 2

Pk f 2

We now claim that

χ Pk

22 2

C 2d k

k 0 (8.8)

which is simple consequence of Schur’s test. We shall provide the details for this

estimate at the end of the proof. If this holds, then we can continue from (8.7) as




follows:

k

k 0

2 2 s

χ Pk f

2

2

12

C

k

k 0

2 2 s

22d

k

Pk f

2

2

12

C

k

k 0

2 d

2s

k

22sk Pk f

22

12

C

k

2sk Pk f 2

In summary we arrive at (it is easy to see that we my drop the tilde on Pk )

x

s f 2

k

22sk Pk f 2

But this is strictly weaker than what we would like which shows that one cannotgive away as much of the (almost) orthogonality of the Pk f as we did.

Hence, we should go about the first step (8.7) a bit diff erently:

x

s f

22

C

2 2 s χ f

22

C

2 2 s

k 0

χ Pk f 2

2

2 2 s χ P

f

22

(8.9)

where P :

j P j. For the sum involving both k , we invoke (8.8) to wit

2 2 s

k 0

χ

Pk f

2

2

C

k 0

2

d 2 s

k 2sk

Pk f

2

2

C

22ks Pk f

22 C f

2 H s

since d 2

s 0. To pass to the last line one can use Schur’s test for sums, for

example. The final sum in (8.9) uses that s 0:

2 2 s χ P f

22

2 2 s

P f

22

C

2 2

k s

k

0

22sk Pk f

22

C

k

22sk Pk f 22

k 0

2

2

k

s C f 2 H s

which is (8.6).

It remains to prove (8.8). For this write

χ Pk f x

Rd

ψ 2 x 2kd ψ 2k

x y f y dy




and observe that

sup x

Rd

ψ 2 x

2kd ψ 2k

x y dy C 1

sup y

Rd

ψ 2 x 2kd ψ 2k

x y dx C 2 k d

while implies the claim by Schur’s test.

Exercise 8.2. Show that one may “drop the tilde” as was done in the previous

proof, i.e., prove that

k

22ks Pk f

22

k

22ks Pk f

22

for any s.

Theorem 8.6 is sharp in the following sense: clearly, both sides scale the sameso it is impossible to use any other Sobolev space, for example. The condition

s 0 is of course needed as we cannot allow for growing weights. And s

d 2

is

not possible since in that case the estimate fails for Schwartz functions which do

not vanish at the origin (the norm on the left-hand side is infinite in that case).

Several questions pose themselves naturally now: (i) is there a substitute for

s

d 2

in the previous theorem? (ii) Is there a version of Theorem 8.6 with L p

instead of L2?

In case of s 1, we shall answer these questions in the problems.

Notes

For more on the method of almost orthogonality, in particular the use of Cotlar’s

lemma, see Stein [46]. Hardy’s inequalities are basic tools that have been generalized

in many diff erent directions, see for example Kufner and Opic [39]. For fractional s one

needs more tools, namely Littlewood-Paley theory.

Problems

Problem 8.1. Let T be of the form (8.2) where a C Rd

Rd

is such that

sup x

Rd

α x a x, ξ

α ξ a x, ξ

B ξ s

for all α d 1 and all ξ Rd . Here s 0 is arbitrary but fixed. Show that T is bounded

form H s

R

d

to L2

R

d

.

Problem 8.2. Suppose K x, y is defined on 0,

2 and is homogeneous of degree

1, i.e., K λ x, λ y λ

1K x, y for all λ 0. Further, assume that for an arbitrary but

fixed 1 p one has

0

K x, y y

1 p dy A

Show that T f x :

0 K x, y f y dy satisfies T f p A f p.




Problem 8.3. Prove Hardy’s original inequality, valid for every measurable f 0

0

x

0

f t dt

p

x

r 1 dx

p

r

p

0

x f x

p x

r 1 dx

0

x

f t dt

p

xr 1 dx

p

r

p

0

x f x

p xr 1 dx

for every 1 p , r 0. Equality holds if and only if f 0 a.e. Hint: use the

previous problem.

Problem 8.4. Show the following Hardy’s inequality in Rd

Rd

∇u x

p dx

d p

p

p

Rd

u x

p

x

p dx (8.10)

for all u S Rd

if 1 p d and for all such u which also vanish on a neighborhood of

the origin if p d . The constants here are sharp. Hint: use the previous problem.

Problem 8.5. Show that if 1 p d and 1 p

1 p

1d

, then

Rd

u x

p

x

p dx C p, d u

p

p

provided u S Rd

is radial and decreasing. Conclude that for such functions the Sobolev

embedding inequality u p

C p, d ∇u p. By means of the device of nonincreas-

ing rearrangement this special case implies the general one. In other words, the Sobolev

embedding holds without the radial decreasing assumption. The point here is that the re-

arrangement does not change u p and at most decreases ∇u p, see [36]. This is useful

in order to find the optimal constants, see Frank, Seiringer [18].



CHAPTER 9

The Uncertainty Principle

This chapter deal with various manifestations of the heuristic principle that

it is not possible for both f and ˆ f the be localized on small sets. We already

encountered a rigorous, albeit qualitative and rather weak version of this principle:

it is impossible for both f L2 R

d and ˆ f to be compactly supported.

Here we are more interested in quantitative versions, and moreover, the tight-

ness of these quantitative versions. What is meant by that can be seen from thetransformation law (6.1): let χ be a smooth bump function, and A be an invertible

affine transformation that takes the unit ball onto an ellipsoid

E :

x Rd

d

j 1

x j y j

2r 2 j

1

where y j are arbitrary constants and r j 0. Then χ A :

χ A 1 can be thought of

as a smoothed out indicator function of E. By (6.1), χ A then essentially “lives” on

the dual ellipsoid

E :

ξ Rd

d

j 1

ξ 2 j r 2 j 1

Moreover, while χ A is L -normalized, χ A is L1-normalized. To “essentially live

on” E means that

χ A ξ

C N E

1 ξ 2

E

N

where

ξ 2E

:

d

j 1

ξ 2 j r 2 j ξ

is the Euclidean norm relative to E . What we can see from this is another, still

heuristic but more quantitative form of the uncertainty principle, namely the fol-

lowing: if f L2 R

d is supported in a ball of size R, then it is not possible for ˆ f

to be “concentrated” on a scale much less than R 1.

We begin the rigorous discussion with another form of Bernstein’s estimate.In what follows B x0, r denotes the Euclidean ball in Rd which is of radius r and

centered at x0.

Lemma 9.1. Suppose f L2 R

d satisfies supp

f B 0, r . Then

α ˆ f 2

2πr

α

ˆ f 2 (9.1)

for all multi-indices α.

111



112 9. THE UNCERTAINTY PRINCIPLE

Proof. This follows from ˆ f C

Rd

, the relation

α ˆ f ξ

2πi

α

xα f ξ

and Plancherel’s theorem.

Heurstically speaking, this can be seen as a version of the non-concentration

statement from before. Indeed, if ˆ f were to be concentrated on a scale of size

r 1, then we may expect the left-hand side of (9.1) to be much larger than r α

.

Later in this chapter, we shall see a better reason why Bernstein’s estimate should

be viewed in the context of the uncertainty principle.

Next, we prove Heisenberg’s uncertainty principle.

Proposition 9.2. For any f S R one has

f

2

2

4π

x

x0

f

2

ξ

ξ 0

ˆ f

2 (9.2)

for all x0, ξ 0 R. This inequality is sharp, with extremizers being precisely given

by the modulated Gaussians

f x C e2πi ξ 0 x e πδ x x0

2

where C and δ 0 are arbitrary.

Proof. Without loss of generality x0 ξ 0 0. Define D

12πi

d dx

, and let

X f x x f x . Then the commutator

D, X

DX

XD

1

2πi

whence for any f S R ,

f

22

2πi D, X

f , f

2πi

X f , D f

D f , X f

4πIm D f , X f 4π D f 2 X f 2

(9.3)

as claimed.

For the sharpness, note that if equality is achieved in (9.3) then from the con-

dition on equality in Cauchy-Schwarz one has D f λ X f for some λ C. Then

D f , X f λ X f

22

which means that λ iδ with some δ 0. Finally, this

implies that

f

x

2πδ f

x

, f x

Ce

πδ x2

The general case follows by translation in x and ξ (the latter being the same as

multiplication by eix ξ 0 .

There is an analogous statement in higher dimension which we leave to the

reader. Once again, we can see Proposition 9.2 as a manifestation of the principle

that if f is concentrated on scale R 0, then ˆ f cannot be concentrated on a scale

R 1.



9. THE UNCERTAINTY PRINCIPLE 113

Hardy’s inequality from the previous chapter can be seen as another instance

of this principle. Indeed, in R3 it implies that

x x0

1 f 2 C ∇ f 2 C ξ ˆ f 2 f S R3 (9.4)

for any x0 R3. Thus, if ˆ f is supported on B 0, R , then

sup x0 R3

f L2 B x0,ρ

C ρ R f 2 0 ρ R 1

We leave it to the reader to derive similar statements in all dimensions. In fact, the

proof of Hardy’s inequality which we gave above clearly shows the dominant role

of the dual scales R in x, and R 1 in ξ , respectively.

Let us reformulate (9.4) with x0 0 as follows:

x

2 f , f C ∆ f , f f S R

3

which can be rewritten as ∆

c x 2 for some c

0 as operators (in other words, ∆

c x

2 is a nonnegative symmetric operator).

It is interesting to note that we can now give a proof of this special case of

Hardy’s inequality using commutator arguments, similar to those in the proof of

Proposition 9.2. In addition, this will yield the optimal value of the constant.

Proposition 9.3. One has ∆

d 2

2

4 x

2 in Rd with d 3 , which means that

for any f S Rd

∆ f , f

d 2

2

4 x

2 f , f x

1 f 2

d 2

2

4 ∇ f 2

Proof. Let p i∇. The relevant commutator here is, with r x ,

i r 1 pr 1, X dr 2

This implies that

d r 1 f

22

2Im r 1 pr 1 f , X f f S R

d

Since

p x

1

x

1 p i

x

x

3

one obtains

d 2

r 1 f

22

2Im p f , Xr 2 f f S R

d

Cauchy-Schwartz concludes the proof.

The bound in the previous proposition is optimal, but not attained in the largest

admissible class, i.e., H 1 Rd

. We shall not dwell on this issue here.

Next, we investigate the following question: if E , F Rd are of finite measure,

can there be a nonzero f L2 R

d with supp

f E and supp f F ?

Note that if there were such an f , then T f : χ E χF ˆ f

satisfies T f f .

In particular, this would imply T

2 2 1. We might expect that T has small




norm if E and F are small, leading to a contradiction and negative answer to our

question. Now note that

T f x

Rd

χ E x ˇ χF x y f y dy

In other words, T has kernel K x, y χ E x ˇ χF x y with Hilbert-Schmid norm

R2d

K x, y

2 dxdy E F : σ2

We remark that σ is a scaling invariant quantity. Hence, T min σ, 1 in oper-

ator norm and T is a compact operator. So if σ 1, then we see that there can be

no f 0 as in the question. Interestingly, the answer is “no” in all cases and one

has the following quantitative expression of this property.

Theorem 9.4. Let E and F be sets of finite measure in Rd . Then

f L2 Rd

C f L2 E c

ˆ f L2 F c

(9.5)

for some constant C A, B, d .

The reader will easily prove the estimate in case σ 1. For the general case,

we formulate the following equivalencies with the goal of showing that T

1

for any E , F as in the theorem. In the lemma supp refers to the essential support.

Lemma 9.5. Let E , F be measurable subsets of Rd . Then the following are

equivalent (C 1 and C 2 denote constants):

a) f L2 Rd

C 1 f L2 E c

ˆ f L2 F c

for all f L2 R

d

b) there exists ε

0 so that

f

2

L2 E

f

2

L2 F

2

ε

f

2

2 for all f

L2

Rd

c) if supp

ˆ f F, then f 2 C 2 f L2 E c

d ) if supp f E, then

ˆ f 2 C 2

ˆ f L2 F c

e) there exists 0 ρ

1 so that χ E χF ˆ f

2 ρ f 2 for all f L2 R

d .

Proof. a b : one has

f

2 L2

E

ˆ f

2 L2

F

2

f

22 f

2 L2

E c

ˆ f

2 L2

F c

2

2C 21

1

f

22

b

c : for f as in c

,

f

2

L2

E

1

ε f

22

or f

22

ε 1

f

2

L2

E

. c

a :

write f PF f PF c f where PF f χF ˆ f

and set Q E c f χ E c f . Then

f

2 PF f

2 PF c f

2 C 2

Q E c PF f 2

PF c f 2

C 2 Q E c f 2 C 2 Q E c PF c f 2 PF c f 2

C 2 Q E c f 2 C 2 1

PF c f 2

Interchanging f with ˆ f one sees that property d to the first three.




c e : with the projection P, Q as above one has

PF f

22 Q E PF f

22 Q E c PF f

22

Q E PF f

22 C

22

PF

22

which implies the desired bound with ρ 1

C 2

2

12 .

e c : clear.

Proof of Theorem 9.4. We already observed that T Q E PF is compact with

Hilbert-Schmid norm T

HS σ. In particular,

dim f L2 R

d T f λ f λ

2σ2 (9.6)

To prove this bound, let f j

m j 1

be an orthonormal sequence in the eigenspace on

the left. Then, with K being the kernel of K , one has

mλ2

j

R2d

K x, y

f j x

f j

y dxdy

2

K

2 L2

R2d (9.7)

by Bessel’s inequality. Furthermore, as product of projections, T

1 in operator

norm. If T 1, then it follows from compactness of T that there exists f

L2 R

d

with f 0 and T f

2 f

2.

Thus, there exists f with supp f E and supp

ˆ f F (as essential sup-

ports). We shall now obtain a contradiction to (9.6) for λ 1. Inductively, define

S 0 :

supp f

, S 1 :

S 0 S 0

x0

S k 1 : S k S 0 xk k 1

where the translations xk are chosen such that

S k S k 1 S k 2 k

If follows that the collection f k :

f

xk with k

0 is linearly independent

with

supp f k

S

:

0

S k

0

where S

, while maintaining supp

ˆ f k F . This gives the desired contra-

diction.

Next, we formulate some results that provide further evidence of the non-

concentration property of functions with Fourier support in B 0, 1 .

Proposition 9.6. Let α 0 and suppose that E Rd satisfies

E

B

α B

for all balls B or radius 1

Suppose f L2 R

d satisfies supp

ˆ f B 0, 1

. Then

f L2 E

δ α f 2 (9.8)

where δ α 0 as α

0.




Proof. Let ψ be a smooth compactly supported function with ψ 1 on B 0, 1 .

Define T g : χ E ψg

. The kernel of this operator is

K x, y χ E x ψ x y

and satisfies the bound, for any N 1,

K x, y C N χ E x 1 x y

N

We apply Schur’s lemma to this kernel. To this end let ε 0 be arbitrary but fixed.

First, K

x, L1

y

C 1. Second, take N d and choose R so large that

sup y

x y R

C N 1 x y

N dx ε

Then we see by covering B x, R

by finitely many balls of radius 1 that

x

y

R

C N χ E x 1 x y

N dx ε

provided α 0 is small enough. Therefore, by Schur’s lemma 8.2 we conclude

that T C

ε. Hence, if f is as in the hypothesis, then χ E f T f whence

f

L2 E

C

ε f

2

as claimed.

We now establish a somewhat more refined version of the previous result,

which goes by the name of Logvinenko-Sereda theorem.

Theorem 9.7. Suppose a measurable set E Rd satisfies the following “thick-

ness” condition: there exists γ 0, 1 such that

E

B

γ B for all balls B of radius R 1

where R 0 is arbitrary but fixed. Assume that supp

ˆ f B 0, R . Then

f L2 Rd

C f L2 E

(9.9)

where the constant C only depends on d and γ .

By the same argument as in Proposition 9.6 (or directly from that proposition)

one proves this result for γ near 1.

Exercise 9.1. Prove Theorem 9.7 if γ is close to 1.

We shall base the proof of Theorem 9.7 on arguments involving analytic func-

tions of several complex variables. This is natural in view of the Paley-Wiener

theorem, cf. 6.5 which we shall state below. However, we shall not obtain explicitconstants C d , γ from our argument.

In order to keep this discussion self-contained, we briefly review the few facts

about analytic functions of several complex variables that will be needed. Let

Ω Cd be open and connected (a “domain”), and suppose G C Ω is complex-

valued. We say that G is analytic in Ω if and only if

z j G

z1, z2, . . . , z j 1, z j, z j 1, . . . , zd




is analytic on the corresponding sections of Ω as all other z j are kept fixed. We

denote this class by A Ω . The Cauchy integral formula applies via integration

along contours in the individual variables. Examples of such functions are powerseries

f z

α

aα z z0

α, z ∆ : z max1

j d z j z0

j M

provided aα M α

for all multi-indices α and a constant M 0. Indeed, it

is easy to see that the power series converges uniformly in ∆ and the sum f z is

analytic in ∆. In particular, if

aα M B

α

α! α

then the power series represents an entire function in Cd .

Conversely, if f A Ω , then

f z

α

α z f z0

α! z z0

α z z0 r

for some r 0. The uniqueness theorem therefore also holds in this setting: if

α z f

z0

0 for all α, then f 0 in Ω. Moreover, if Ω R

d contains a set

of positive measure on which f vanishes, then f 0 in Ω. Due to the Cauchy

estimates, one has Montel’s theorem on normal families: if F A Ω is a locally

bounded family, then F is precompact, i.e., every sequence in F has a subsequence

which converges locally uniformly on Ω to an element of A Ω .

Finally, we recall the Paley-Wiener theorem: suppose f L2

Rd

satisfies

supp ˆ f B 0, R . Then f extends to an entire function in Cd with exponentialgrowth:

f z C Rd 2 e2π

R z

f L2 Rd

which is proved by placing absolute values inside the integral defining ˆ f z and

applying Cauchy-Schwarz.

Proof of Theorem 9.7. We may assume that R 1. Let A

2 be a constant

that we shall fix later. Partition Rd into congruent cubes of sidelength 2, which we

shall denote by Q. We say that one such Q is good if

α f L2

Q

A

α

f

L2 Q

α

We now claim that

f L2

bad Q

CA 1 f 2 (9.10)

To prove it, consider the sum

α0

Q bad

A 2

α

Dα f

2 L2

Q (9.11)




Since Q is bad, there is a lower bound of f

2 L2

bad Q

. On the other hand, by

Lemma 9.1, one has an upper bound of the form

α0

Q bad

A 2 α

2π

α

f

22

k 1

k

1

d

2π A 2

k

f

22

CA 2

f

22

which proves the claim.

Next, we claim that if Q is good, then there exists x0 Q such that

α f x0

C α A2 α

1

f L2

Q

, α (9.12)

Indeed, if not, then

A2 Q f 2 L2 Q

Q

α

A

3

α

A4

α

2 f 2 L2 Q

dx

Q

α

A 3

α

α f x

2 dx

α

A 3 α

A2 α

f

2 L2

Q

k 0

k

1

d 2 k

f

2 L2

Q

which is a contradiction for A 2 large.

And finally, we claim that there exists η 0 which depends only on d , γ, A so

that

f

L2

E

Q

η f

L2

Q

for all good Q (9.13)

If not, there exist f n

n 1

in L2 R

d

with supp

ˆ f n B 0, 1

and measurable sets

E n Rd with

E n B x, 1 γ B x, 1 x Rd

and good cubes Qn such that

f n L2

Qn

1, f n L2

E n

Qn

0

as n

. After translation, Qn 1, 1

d

: Q0. Denote the entire extension of

f n given by the Paley-Wiener theorem by F n. We expand F n around a good xn as

in (9.12) to conclude that for any z R one has

F n z

α

A2

α 1

α! z xn

α

0

A2 1

!

2

z

d

C

A, d , R

So F n

n 1

is a normal family in Cd and we may extract a locally uniform limit

F n F

(up to passing to a subsequence) whence F

L2 Q0

1. On the other




hand,

x Q0

E n f n

x λn

λ 2n f n

2 L2

E n

λ 2n η2

n

γ

2 B 0, 1

if ηn λn

γ

2 B

0, 1 . In particular, λn

0. By the thickness condition,

X n : x Q0 E n f n x λn

satisfies X n

γ

2. But then

X

:

lim supn

X n

satisfies X

γ

2. By construction, λn

0 whence F

0 on X

. By the

uniqueness theorem for analytic functions of several variables, F

0 on Q0.

This contradiction establishes our final claim. We can now finish the proof of

the theorem. Indeed, one has from (9.13) and (9.10)

f

2 L2

E

Q good Q

Q good

f

2 L2

E Q

Q good

η2 f

2 L2

Q

η2 f

2 L2

good Q

η2

f

2 L2

C 2 A 2

f

22

1

2 η2 f 22

if A was chosen large enough. This implies the desired estimate.

Exercise 9.2. Prove that Theorem 9.7 with R 1 can also be proved under the

following weaker hypothesis: there exists γ 0 and r 0 such that

E B x, r γ x Rd

The constant in (9.9) then of course also depends on r .

We now formulate a second version of the Logvinenko-Sereda theorem. We

begin with a definition.

Definition 9.8. A set F R

d is called B 0, 1

-negligible if and only if there

exists ε 0 so that for all f L2

Rd

one has either

ˆ f L2

Rd

B 0,1

ε f

2 or

f L2 Rd

F

ε f 2

(9.14)

By means of Theorem 9.7 we can characterize negligible sets.




Theorem 9.9. A set F is B 0, 1 -negligible if and only if there exist r 0 and

β 0, 1

such that for all x

Rd

F B x, r β B x, r (9.15)

Proof. First, we note that the condition (9.15) is equivalent to the following:

there exist r 0 and γ

0 so that

F c B x, r γ x Rd (9.16)

If (9.16) fails, then there exists x j Rd and r j with

F c B x j, r j 0 j

Now take ϕ Schwartz with supp ϕ B

0, 1 and set ϕ j ξ :

e2πix j ξ ϕ. Then ϕ j

is still supported in B 0, 1

and with B j :

B

x j, r j

, we have

ϕ j

L2 Rd F

ϕ j

L2 B j F

ϕ j

L2 Rd B j

ϕ

B j F

x x j r j

C x x j

d 1 dx

12 (9.17)

which vanishes in the limit j

. But this contradicts (9.14) which finishes the

necessity.

For the sufficiency, let E :

F c. Applying Theorem 9.7, see also Exercise 9.2,

we infer that for any f L2 R

d with supp

ˆ f B 0, 1 the estimate

f 2 C f L2 E

with C C γ, r , d . Now split f f 1 f 2 with f 1 χ B 0,1

ˆ f

. If f 2 2

ε f 2, then

f

L2 E

f 1 L2

E

f 2 2

C 1 f 2 ε f 2 ε f 2

provided ε 0 is sufficiently small and we are done.

We shall now apply these uncertainty principle ideas to the local solvability of

partial diff erential equations with constant coefficients. To be specific, suppose we

are given a polynomial

p ξ

α

aα ξ α

Then with D :

1

2πi x one has

p D :

α

aα 2πi

α Dα

This definition is chosen so that for all f S Rd

p D f ξ p ξ ˆ f ξ ξ Rd

The following theorem is known as Malgrange-Ehrenpreis theorem.




Theorem 9.10. Let Ω be a bounded domain in Rd and p 0 be a polynomial.

Then for all g L2 Ω there exists f L2

Ω such that p D f g in the sense of

distributions.

The “distributional sense” means that for all ϕ C

Ω with compact support

supp ϕ Ω one has

g, ϕ f , p D ϕ

where p ξ :

α aα ξ α. The pairing ,

is the L2 pairing.

We emphasize that we are not making any ellipticity assumptions in the theo-

rem. Even so, one can show that f can be taken to be C

Ω , see the Problem 9.6

below. This is very diff erent from the assertion that every solution f is C which

is known as hypoellipticity. This is not only a large area of research onto itself, but

also false in the generality of the Malgrange-Ehrenpreis theorem.

The proof follows the common principle in functional analysis that uniqueness

(for the adjoint operator) implies existence (for the operator). Recall that this is the

essence of Fredholm theory which establishes this fact for compact perturbations

of the identity operator. Of course, we cannot expect uniqueness in our case with-

out any boundary conditions, so we introduce a strong vanishing condition at the

boundary. Henceforth we take Ω to be a ball, which we may do since Ω bounded.

By dilation and translation, the ball may furthermore be taken to be B 0, 1 .

Proposition 9.11. One has the bound

p D ϕ 2 C 1 ϕ 2

for all ϕ C B 0, 1 which are compactly supported in B 0, 1 . The constant C

depends on p and the dimension.

Proof of Theorem 9.10. We show how to deduce the theorem from the propo-

sition. Let

X :

p D ϕ ϕ C comp B 0, 1

viewed as a linear subspace of L2 B

0, 1 . Define a linear functional on X by

p D

ϕ g, ϕ

where the right-hand side is the L2-pairing. Proposition 9.11 implies that this is

well-defined and in fact one has a bound

p D

ϕ g

2 ϕ 2 C

g

2 p

D

ϕ 2

By the Hahn-Banach theorem we may extend as a bounded linear functional to

L2

B

0, 1

with the same norm. So by the Riesz representation of such functionalsthere exists f L2

B 0, 1 with the property that

p D ϕ f , p D ϕ g, ϕ

for all ϕ as above which proves the theorem.

The proof of Proposition 9.11 will be based on Theorem 9.7, which requires

some preparatory lemmas on polynomials.




Lemma 9.12. Let p be a nonzero polynomial in Rd . There exists β 0, 1

which only depends on the degree of p and the dimension so that

B p x

12

max B

p β B

for all balls B of radius 1.

Proof. Denote the degree of p by N . The proof exploits the fact that any

two norms on a finite-dimensional space are comparable (in our case the space of

polynomials of degree at most N ). Fix a unit ball B. Then there exists a constant

C N

such that

max x B

p x ∇ p x C N max

x B p x

whence

max x B

∇ p x C N max x B

p x

By the mean value theorem,

p x p y max B

∇ p x y C N x y max B

p

from which we infer that

p

y

p

xmax

C N

max

B p

xmax

y

1

2 p

xmax

for any y B xmax, 2C N

1 B. In particular,

y B 2 p y max B

p c N , d 0

Finally, we may translate B without aff ecting any of the constants which concludes

the proof.

The following lemma is formulated so as to fit Theorem 9.9.

Lemma 9.13. Let p be any nonzero polynomial in Rd . Then there exists ε0 0

depending on p such that

x B p x ε0 β B

for all unit balls B. Here β 0, 1

is from Lemma 9.12.

Proof. By the previous lemma it suffices to show that

inf B

max x B

p x ε0

where the infimum is taken over all unit balls. First, with N the degree of p,

max B 0,1

p x C N

1

α

aα

where aα are the coefficients of p. Furthermore, one has

max B

p x C N , p

1

α

N

aα : ε0




uniformly in B. This follows form the previous inequality and the property that

the highest order coefficients are invariant under translation. This concludes the

proof.

Proof of Proposition 9.11. Lemma 9.13 guarantees that

F : x Rd

p x ε0

is B 0, 1

-negligible. Thus, by Theorem 9.9,

¯ p

D

ϕ 2 ¯ p

ξ ϕ 2 ¯ pϕ L2

F c

ε0 ϕ L2

F c

ε1 ϕ L2

Rd

with some constant ε1 that depends on p and d .

There is no analogue of Theorem 9.10 for partial diff erential operators with

nonconstant smooth coefficients. This is known as Lewy’s example, see for exam-

ple [17].

Notes

A standard reference for the material in this chapter is Havin and Joricke’s book [24].

Another basic resource in this area is Charles Feff erman’s survey [15]. An explicit constant

in Theorem 9.4 of the form eC

E

F in dimension 1 was obtained by Nazarov [38], and in

higher dimensions by Jaming [27]. For the solvability of linear constant coefficient PDEs

see Jerison [29].

Problems

Problem

9.1. Does there exist a nonzero function f

L

2 R

d

so that both f

0 andˆ f 0 on nonempty open sets?

Problem 9.2. Does there exist a nonzero f L2 R with f 0 on 1, 1 and ˆ f 0

on a half-line? What about f 0 on a set of positive measure and ˆ f 0 on a half-line?

Problem 9.3. Suppose E , F Rd have finite measure. Show that for any g1, g2

L2 R

d there exists f L2

Rd

with

f g1 on E , ˆ f g2 on F

This can be seen as the statement that f E and ˆ f F are independent.

Problem 9.4. Prove that for E , F Rd of finite measure one has

dim f L2 R

d f 0 on E , ˆ f 0 on F

Problem 9.5. There are several ways of proving Proposition 9.6. As an alternative tothe Schur lemma proof we gave, one can use Sobolev embedding on cubes together with

Bernstein’s theorem. This problem suggests yet another route.

Prove Poincare’s inequality: for any f H 1 Rd

one has for every R 0

B 0, R

f x f R

2 dx CR2

B 0, R

∇ f x

2 dx

with an absolute constant. Here f R is the average of f on B 0, R .



CHAPTER 10

Littlewood-Paley theory

One of the central concerns of harmonic analysis is the study of multiplier

operators, i.e., operators T which are of the form

T m f x

Rd

e2πix ξ m ξ ˆ f ξ d ξ (10.1)

where m : Rd C is bounded. By Plancherel’s theorem T

m

2

2 m

. In

fact, it is easy to see that one has equality here. Over the distributions we may write

T m f K f for any f S Rd

with K :

m S . Then T m is bounded on L1 if

and only if the associated kernel is in L1, in which case

T m 1 1

K 1

There are many cases, though, where K L1 R

d

but T is bounded on L p for

some (or all) 1 p

. The Hilbert transform is one such example. We shall

now discuss one of the basic results in the field, which describes a large class of

multipliers m that give rise to L p bounded operators for 1 p .

In Chapter 7 we previously studied the case where m is homogeneous of de-

gree 0 and we showed that the T m is a Calderon-Zygmund operator with homo-

geneous kernel, and thus L p

-bounded for 1

p

. In the following theorem,which goes by the name of Mikhlin multiplier theorem, we shall allow for more

general functions m which satisfy finitely many derivative conditions of the zero-

degree type..

Theorem 10.1. Let m : Rd

0 C satisfy, for any multi-index γ of length

γ d 2

γ m ξ B ξ

γ

for all ξ 0. Then for any 1 p there is a constant C C d , p so that

m ˆ f

p CB f p

for all f S.

Proof. Let ψ give rise to a dyadic partition of unity as in Lemma 8.3. Definefor any j Z

m j ξ ψ 2 j ξ m ξ

and set K j m j. Now fix some large positive integer N and set

K x

N

j N

K j x

125



126 10. LITTLEWOOD-PALEY THEORY

We claim that under our smoothness assumption on m one has the pointwise esti-

mates

K

x

C B

x

d

,

∇K

x

CB

x

d 1

(10.2)where C C d . One then applies the Calderon-Zygmund theorem and lets N

.

We will verify the first inequality in (10.2). The second one is similar, and will

be only sketched. By assumption, Dγ m j

CB2 j γ

and thus

Dγ m j 1 CB2 j γ

2 jd

for any multi-index γ d

2. Similarly,

Dγ ξ im j 1 CB2

j γ

1

2 jd

for the same γ . Hence,

x

γ

m j

x

CB2

j d γ

and

xγ Dm j x

CB2 j d 1

γ .

Since x

k C k , d

γ k xγ one concludes that

m j

x

CB2 j d k x

k (10.3)

and

Dm j

x

CB2 j d 1 k x

k (10.4)

for any 0 k d 2 and all j Z, x Rd

0 . We shall use this with k 0 and

k d 2. Indeed,

K x

j

m j x

2 j x

1

m j x

2 j x

1

m j x

CB

2 j x

1

2 jd CB

2 j x

1

2 jd 2 j

x

d 2

CB

x

d

CB x

2 x

d 2

CB x

d ,

as claimed. To obtain the second inequality in (10.2) one uses (10.4) instead of

(10.3). Otherwise the argument is unchanged. Thus we have verified that K sat-

isfies the conditions i) and iii) of Definition 7.1, see Lemma 7.2. Furthermore,

m

B so that

m ˆ f

2 B

f

2. By Theorem 7.6, and the remark following

it, one concludes that

m ˆ f

p C p, d f p

for all f S and 1 p , as claimed.

The number of derivatives entering into the hypothesis can be decreased, see

the problem section. Theorem 10.1 allows one to give proofs of Corollaries 7.7

without going through Exercise 7.4. Indeed, one has

2u

xi x j

ξ

ξ i ξ j

ξ 2 u

ξ



10. LITTLEWOOD-PALEY THEORY 127

Since m ξ

ξ i ξ j

ξ

2 satisfies the conditions of Theorem 10.1 (this is obvious, as m

is homogeneous of degree 0 and smooth away from ξ 0

, we are done. Observe

that this example also shows that Theorem 10.1 fails if p 1 or p .Theorem 10.1 is formulated as an apriori inequality for f

S

Rd

. In that

case m ˆ f S so that m ˆ f

S . The question arises whether or not m ˆ f

is

meaningful for f L p

Rd

. If 1

p

2, then ˆ f

L2

L

Rd

(in fact ˆ f

L p

by Hausdorff -Young), so that m ˆ f S and therefore m ˆ f

S is well-defined. If

p 2, however, it is known that there are f L p R

d such that ˆ f S has positive

order (see the Notes to Chapter 6). For such f it is in general not possible to define

m ˆ f

in S .

We now present an important application of Mikhlin’s theorem to Littlewood-

Paley theory. With ψ as in Lemma 8.3 we define P j f ψ j ˆ f

where ψ j ξ

ψ 2 j ξ (it is customary to assume ψ

0). Then by Plancherel,

C 1

f

22

j Z

P j f

22

f

22 (10.5)

for any f L2

Rd

. Observe that the middle expression is equal to

S f

22

with

S f

j

P j f

2

12

This is called the Littlewood-Paley square-function. It is a basic result of harmonic

analysis that (10.5) generalizes to

C 1 f p S f p C f p (10.6)

for any f L p Rd provided 1 p and with C C p, d . It is of course im-portant to understand why such a result might hold, and so we off er some heuristic

explanations.

Consider a lacunary trigonometric polynomial of the form

T θ

M

n 1

cn e 2nπθ

where θ T. Let p 1 be an integer and estimate by Plancherel’s theorem

1

0

M

n 1

cn e 2nθ

2 p

d θ

1

0

M

n1,...,n p 1

cn1 cn p

e 2nθ

2

d θ

C p

M

n1,...,n p 1

cn1 cn p

2

n

cn

2 p

We used here that for every positive integer N the number of ways in which it can

be written in the form

N 2n1

2n p , n j

0




is bounded by a constant that depends only on p. In other words, given a sequence

ck 2 Z so that ck

0 unless k is of the form 2n with n

0, we have the

property

k

ck

2

12

k

ck e k πθ

p

k

ck

2

12

(10.7)

for every p 2 (and by Holder’s inequality one can lower this to p

1, see the

proof of Lemma 10.3 below). Clearly, (10.7) is remarkable as it says that any L2 T

function f so that ˆ f k 0 unless k is a power of 2 has finite L p-norm for every

p (but of course not for p

). We remark in passing that this applies to

any lacunary or gap Fourier series, but this is harder to see (the standard argument

involves Riesz products).

The rough idea behind (10.6) can be construed as follows: if we consider an ar-

bitrary trigonometric polynomial

n ane nθ , then (10.7) is of course false. How-

ever, we may try to salvage something of (10.7) by setting

ck θ :

2k 1 n 2k

ane nθ , k 1

We might then expect that the square function

S f θ :

k

ck θ

2

12

satisfies S f

p f

p.

It is insightful to view lacunary characters such as e

2nθ n 1 from a proba-

bilistic angle. Indeed, we can view these functions as being close to independent

random variables on the interval 0, 1 . For example, on any interval k 2

n, k

1 2 n 0, 1 with integer k , on which sin 2nπθ has a fixed sign, the func-

tion sin 2n 1πθ is equally likely to be positive or negative. The reader should

also compare the graphs of Re e 2nθ with those of the Rademacher functions r n

defined below.

Taking another leap of faith, we are thus lead to viewing lacunary series from

the point of view of the central limit theorem in probability theory, which at least

heuristically implies that a lacunary Fourier series f as above with f

2 1 satis-

fies

θ T f

θ λ

Ce

λ2

2

with a suitable constant C 0. In particular, this ensures that f satisfies an estimate

of the type (10.7).

It should therefore come as no surprise to the reader that some probabilistic

elements enter into the proof of (10.6). In fact, we shall now derive (10.6) by means

of a standard randomization technique. Let r j be a sequence of independent

random variables with P r j

1 P

r j 1

12

, for all j (in other words,

the r j are a coin tossing sequence). Readers familiar with the central limit theorem

will not be surprised by the following sub-Gaussian bound.




To pass to the final inequality one uses that only an absolutely bounded number

of terms is non-zero in the sum preceding it for any ξ 0. Hence, in view of

Lemma 10.3,

Rd

S f x

p

dx C lim sup N

E

Rd

N

j N

r j P j f x

p

dx C f

p p

as desired.

To prove the lower bound we use duality: choose a function ψ so that ψ 1 on

supp ψ

and ψ is compactly supported with supp ψ Rd

0

. Defining P j like P j

with ψ instead of ψ yields

P j satisfying P jP j P j. Therefore, for any f , g S,

and any 1 p

,

f , g

j

P j f , P jg

Rd

j

P j f x

2

12

k

Pk g x

2

12

dx

S f p

S g p

C S f p g p

For the final bound we use that the argument for the upper bound equally well

applies to S instead of S . Thus, f p C S f p, as claimed.

It is desirable to formulate Theorem 10.4 on L p R

d rather than on S R

d .

This is done in the following corollary. Observe that S f is defined pointwise if

f S

Rd

since P j f is a smooth function.

Corollary 10.5.

a) Let 1 p

. Then for any f

L p

Rd

one has S f

L p and

C 1

p,d

f

p

S f

p

C p,d

f

p

b) Suppose that f S and that S f

L p

Rd

with some 1

p

. Then

f g P where P is a polynomial and g L p R

d . Moreover, S f S g

and

C 1 p,d

S f p g p C p,d S f p

Proof. To prove part a , let f k

S so that f k

f p

0 as k

. We claim

that

limk

S f k S f p 0 (10.10)

If (10.10) holds, then passing to the limit k

in

C

1 p,d f k p S f k p C p,d f k p

implies part a . To prove (10.10) one applies Fatou repeatedly. Fix x R

d . Then

S f k x S f x P j f k x

k

2 P j f x

j

2

P j f k x P j f x

j

2

S f k f x lim inf

m

S f k f m x




Therefore,

S f k S f p lim inf

m

S f k f m p C p,d liminf

m

f k f m p

The claim (10.10) now follows by letting k .

For the second part we argue as in the proof of the lower bound in Theo-

rem 10.4: Let f S and h

S with supp

h R

d

0 . Then

f , h C p S f p

S h p

C p S f p h p

where S is the modified square function from the proof of Theorem 10.4. By the

Hahn-Banach theorem there exists g L p so that f , h g, h

for all h as above

satisfying g p C p S f p. By Exercise (6.2) one has f g P. Moreover,

S f S g and

S f p S g p C g p

by part i).

The following exercise shows that the Littlewood-Paley theorem fails at the

endpoints p 1 and p

.

Exercise 10.2.

a) Show that Theorem 10.4 fails at p 1. The intuition is of course to take

f δ0. For that case verify that that

S f x x

d

so that S f L1 R

d

. Next, transfer this logic to L1 R

d

by means of

approximate identities.

b) Show that Theorem 10.4 fails for p

.

It is natural to ask at this point whether the Littlewood-Paley theorem holds

for square functions defined in terms of sharp cut-o ff s rather than smooth ones as

above. More precisely, suppose we set

S new f

2

j Z

χ 2 j

1 ξ 2 j

ˆ f ξ

2

Is it true that

C 1

p f p S new f p C p f p (10.11)

for 1 p ? Due the boundedness of the Hilbert transform the answer is “yes”

in dimension 1, but it is rather deep fact that it is “no” in dimensions d 2, see the

notes and problems to this chapter.

As a first application of (10.6) we now finish the proof of (6.8).

Proposition 10.6. For any 0 s d and with 1

2

1 p

sd

one has

f L p Rd

C s, d f

H s Rd

for all f S

Rd

.




Proof. In Chapter 6 we already settled the case where ˆ f is localized to a dyadic

shell ξ

2 j

. Thus, it remains to prove (6.11). In fact, we shall now prove that

for any finite p

2 f p C

j Z

P j f

2 p

12

with a constant that only depends on p and d . By (10.6)

f p C p, d S f p C p, d P j f j Z 2

Z

p

C p, d

P j f p j Z

2 Z

where we used the triangle inequality in the final step (see the following exercise).

Exercise 10.3.

Show that for any

p

q

1 and on any measure space

X , µ

,

f j j Z q

Z

L p µ

f j L p µ

j Z

q Z

Also show that for q p

1 one has the reverse inequality.

Show that (6.11) fails for every 1

p

2.

As evidenced by the proof of Proposition 10.6, Exercise 10.3 is a useful (and

sharp) tool when combined with (10.6). In particular, it is used in comparing

Sobolev with Besov spaces, see the problem section below.

As final general comment about Littlewood-Paley theory, we ask the reader to

verify that there can be no version of 10.6 in which the square function is replaced

by some q-based norm.

Exercise 10.4. Suppose that for some 1 q and 1 p one has

P j f j Z q

Z

L p Rd

f p f S Rd

with constants that do not depend on f . Show that q 2.

We now present a connection between BMO and Littlewood-Paley theory. This

will give us the opportunity to introduce some notions that historically played an

important role in the development of the theory, such as Carleson measure and the

Calder´ on reproducing formula, which can be seen as a continuum version of the

discrete Littlewood-Paley decomposition.

Definition 10.7. Let µ be a (complex) measure on the upper half-space Rd 1

.

Denote a general dyadic cube in Rd by I and the cube in Rd 1

with base I and

of height I where is the side-length of I , by Q I

. In other words, Q I :

x, t x I , 0 t I . One says that µ is a Carleson measure if

µ C :

sup I

µ Q I

I

The following result may seem unmotivated for now, but it is a standard device

by which a BMO function can be broken up into basic constituents.




Proposition 10.8. Let f BMO Rd

with f BMO 1 have compact support

and mean zero. There exist functions b I I indexed by the dyadic cubes in Rd and

scalars λ I such that f

I λ I b I

supp

b I

3 I

Rd b I x dx 0

∇b I

I 1

I J λ I

2 I J for all dyadic cubes J

The final property means that

I λ I

2 I δ

x I , I

is a Carleson measure where x I

is the center of I.

Proof. Let ϕ S Rd

be real-valued, radial, and such that supp

ϕ B 0, 1

and with

Rd

ϕ t ξ 2 dt

t 1 ξ 0 (10.12)

Note that the left-hand side is a radial function and by scaling does not even depend

on ξ . In other words, it does not depend on the choice of ξ 0 at all. The only

requirement is that ϕ 0 0 or equivalently,

ϕ 0. Moreover, any ϕ 0 which

satisfies this condition can be normalized so as to fulfil (10.12).

Clearly, (10.12) is a continuum analogue of the discrete partition of unity in

Lemma 8.3. The Calderon reproducing formula is now the following statement,

which is implied by (10.12) and the vanishing

f 0:

f x

0

ϕt ϕt f

x

dt

t

I

T I

ϕt x y ϕt f y

dt

t dy

I

b I x

(10.13)

where T I :

x, t x I , t I 2, I

and ϕt x t d ϕ x t . By

construction,

b I 0 and supp

b I 3 I . Moreover, if we define with a suitable

constant C

λ I : C

I

1

T I

ϕt f y

2t 1 dtdy

12

then one checks by means of Cauchy-Schwarz that

∇˜b I

x

T I

∇ϕt

x

y

ϕt

f

y

dt

t dy

λ I

I

1

Define b I : λ 1

I b I . We have verified all properties up to the Carleson measure. To

obtain this final claim, we first show that for any f BMO

Rd

and any (dyadic)

cube I Rd

Q I

ϕt f

y

2 t 1 dtdy C

f

2BMO

I (10.14)




By the scaling and translation symmetries, we may assume that I is the cube

1, 1

d . Define I :

2, 2

d and split

f f I f f I χ I f f I χRd I : f 1 f 2 f 3

The constant f 1 does not contribute to (10.14) and f 2 is estimated by

Rd 1

ϕt f 2 y

2 t 1 dtdy f 2

22 C f

2BMO

Finally,

Q I

ϕt f 3 x

2 t 1 dtdx

2

0

dt

t

1,1

d

dx

Rd 2,2

d

f

y

f I

t d ϕ

x y

t

dy

2

C

2

0

dt t

1,1

d

dx

Rd 2,2

d

f y f I

2 t d ϕ

x yt

dy

C

Rd

f y f I

2

1 y

d 1 dy C f

2BMO

as desired. For the final step, see Problem 7.5. Hence (10.14) holds and therefore

J I

λ J

2 J

J I

T J

ϕt f

x

2 t 1 dtdx

C

Q I

ϕt f x

2 t 1 dtdx C f

2BMO I

and we are done. Exercise 10.5. Give a detailed proof of (10.13).

To conclude this chapter, we shall show that singular integrals as in Defini-

tion 7.1 are bounded on C α R

d

, 0 α

1. Problem 7.7 of the previous chapter

deals with the C α-boundedness of the Hilbert transform, which can be proved on

the level of the kernel alone. Here we shall follow a diff erent route based on the

projections P j. We begin with a characterization of C α Rd

for 0

α 1 in terms

of the projections P j. The reader should realize that the proof of the following

lemma is reminiscent of the proof of Bernstein’s Theorem 1.12.

Lemma 10.9. Let f 1. Then f C α Rd

for 0 α 1 if and only if

sup j Z

2 jα P j f

A (10.15)

Moreover, the smallest A for which (10.15) holds is comparable to f α.

Proof. Set ψ j x

2 jnψ

2 j x

: ϕ j x

. Hence

ϕ j 1

ψ 1 for all j Z and

Rd

ϕ j x xγ dx 0




for all multi-indices γ . First assume f C α. Then

P j f

x

Rd

f

x

y

f

x

ϕ j

y

dy

Rd

f α y

α ϕ j y dy

2 jα f α

Rd

y

α ψ y dy

Hence A C f α, as claimed. Conversely, define

g x

j

P j f x

for any integer 0. We need to show that, for all y R

d ,

sup

g x y g x CA y

α

with some constant C C

d

. Fix y 0 and estimate

y

1 2 j

2

P j f

x

2 j y

1

A2 jα

CA y

α (10.16)

Secondly, observe that

P j f x y P j f x ∇P j f

y C 2 j P j f

y

C 2 j 1 α A y

(10.17)

where we invoked Bernstein’s inequality to pass to the second inequality sign.

Combining (10.16) and (10.17) yields

g x y g x

2

2 j

y

1

P j f x y P j f x

y

1 2 j

2

2 P j f

2 j y

1

CA2 j 1 α

y

y

1

2 j

2 A2 jα

CA y

α

(10.18)

uniformly in 1.

So g g 0

1 are uniformly bounded on C α K for any compact K . By

the Arzela-Ascoli theorem one concludes that

g g

0

g

uniformly on any compact set and therefore g α CA by (10.18) (up to passing

to a subsequence i ). It remains to show that f has the same property. This will

follow from the following claim: f g

const. To verify this property, note that

g g 0

g in S . Thus, also g δ0g 0

g in S which is the same as

j

ψ 2

j ξ ˆ f ξ δ0g 0

g in S




Soif h Swith supp h Rd

0 , then

ˆ f g, h 0, i.e., we have supp

ˆ f g

0

. By Exercise 6.2 therefore

f g x

γ

M

C γ xγ (10.19)

On the other hand, since g 0

0,

f g x f

g x g 0 1 CA x

α

Since α 1, comparing this bound with (10.19) shows that the polynomial in

(10.19) is of degree zero. Thus, f g

const, as claimed.

Next, we need the following result which gives an example of a situation where

T maps into L1 R

d

. We postpone the proof.

Proposition 10.10. Let K be a singular integral kernel as in Definition 7.1. For

any η

S

R

d

with

Rd η

x

dx

0 one has T η 1 C η B

with a constant depending on η.

We can now formulate and prove the C α bound for singular integrals.

Theorem 10.11. Let K be as in Definition 7.1 and 0 α

1. Then for any

f L2 C α R

d one has T f C α R

d and

T f

α C α B f α with C α C α, d .

Proof. We use Lemma 10.9. In order to do so, notice first that

T f

CB f α f 2

which we leave to the reader to check. Therefore, it suffices to show that

sup j

2 jα P jT f

CB f α (10.20)

Let P j be defined as

P ju

ψ 2 j

u

ξ

where ψ C

0 R

d 0 and ψψ ψ. Thus, P jP j P j. Hence,

P jT f

P jP jT f

P jT P j f

P jT

P j f

P jT

C f α2 jα

It remains to show that sup j

P jT

CB, see (10.20). Clearly, the kernel of

P jT is 2 jd ˇ ψ 2 j

K so that

P jT

2 jd ˇ

ψ 2 j

K 1

ˇ ψ

2 jd K 2 j

1 CB (10.21)

by Proposition 10.10. Indeed, set η

ˇ ψ in that proposition so that

Rd

η x dx

ψ 0 0




as required. Furthermore, we apply Proposition 10.10 with the rescaled kernel

2 jd K 2 j

. As this kernel satisfies the conditions in Definition 7.1 uniformly in

j, one obtains (10.21) and the theorem is proved.

It remains to show Proposition 10.10. We will build up from the case of an

“atom” as defined by the following lemma.

Lemma 10.12. Let f L

Rd

with

f x dx 0 , supp f B 0, R and

f

R d . Then

T f 1 CB

Proof. By Cauchy-Schwarz,

x 2 R

T f x

dx T f 2CRd 2

C B f 2 Rd 2

CBR d

R

d

2 R

d

2

CB

Furthermore,

x 2 R

T f x dx

Rd

x 2 y

K x y K x dx f y dy B f 1 C B

as desired.

In passing, we point out a simple relation between these atoms and BMO. In

fact, let ϕ BMO and a

f as defined in Lemma 10.12. Then

a, ϕ

B

a x ϕ x ϕ B dx

a

B

ϕ x ϕ B dx

B

ϕ x ϕ B dx ϕ BMO

Moreover, it is easy to see from this that supa a, ϕ ϕ BMO. It is there-

fore natural to define the atomic space H 1atom Rd

as all sums f

j c j a j with

j c j and to define the norm

f H 1 as the smallest possible value of the sum

j c j for a given f .

It is a remarkable fact that this atomic space coincides with H 1 as defined

in (7.32). We will establish this in a later chapter in a one-dimensional dyadic

setting. In the following lemma we show that one can write a general Schwartz

function with mean zero as a superposition of infinitely many atoms.

Lemma 10.13. Let η S Rn

,

Rd η x dx 0. Then one can write

η

1

c a

with

Rn a x dx 0 , a

d , supp a B

0, for all 1 and

1

c

C η




Property (10.27) implies that

η

B 1

1

B 1

Rd B

1

η

x

dx

1

B 1

Rd B

1

η x

dx

C 20d

(10.28)

since η η on Rd B , see (10.26), and since η has rapid decay. (10.28) implies

that

η 1

C 20d ,

see (10.26), and also

f 1

C 20d

see (10.22). We now conclude from (10.23)–(10.25) that

η

1

f

1

c a

with, see (10.25) and (10.24),

1

c

1

d f

1

C 19d

and the lemma follows.

Proof of Proposition 8. Let η be as in the statement of the proposition. By

Lemma 10.13,

η

1

c a

with c and a as stated there. By Lemma 10.12,

T c a 1 c T a 1 C B c

for 1. Hence,

1

T c a 1 C η B

and this easily implies that T η 1 C η B, is claimed.

A typical application of Theorem 10.11 is to the so-called “Schauder estimate”.

Corollary 10.14. Let f C 2,α

0 R

d . Then

sup1

i, j d

2 f

xi

x j

α

C

α, d f

α (10.29)

for any 0 α

1.




Proof. As in the L p case, see Corollary 7.7, this follows from the fact that

2 f

xi x j

Ri j

f

for 1

i, j

d

where Ri j are the double Riesz transforms. Now apply Theorem 10.11 to the sin-

gular integral operators Ri j.

For an extension of this corollary to variable coefficients, see Problem 10.6

below.

Exercise 10.6. Show that Theorem 10.11 fails at α 0 and α 1.

As a final application of Lemma 10.9 we remark that it easily implies Sobolev’s

imbedding W 1, p R

d

C α R

d

in the range d p

with α

1

d p

. See

Problem 10.12.

Notes

The Littlewood-Paley theorem can be proved in a number of diff erent ways, see for ex-

ample Stein [47], which also contains a martingale diff erence version of this theorem. For

many applications of Littlewood-Paley theory (such as to Besov spaces) and connections

with wavelet theory, see Frazier, Jawerth and Weiss [19].

Concerning (10.11), it is a relatively easy consequence of the L p-boundedness of the

Hilbert transform that the answer is “yes” in one dimension. On the other hand, it is a very

remarkable result of Charles Feff erman that the answer is “no” for dimensions d 2. In

fact, Feff erman [14] showed that the ball multiplier χ B where B is any ball in Rd is bounded

on L p R

d only for p 2. This latter result is based on the existence of Kakeya sets and

will shall return to it in a later chapter. The positive answer for d 1 is presented as

Problem 10.4 below.The “atoms” of Lemma 10.12 are exactly what one calls an H 1-atom where H 1 R

d

is the real-variable Hardy space. Lemma 10.13 is an instance of an atomic decomposition

which exists for any f H 1, see Stein [46] and Koosis [34].

Proposition 10.8 is from [55].

Problems

Problem 10.1. The conditions in Theorem 10.1 can be relaxed. Indeed, it suffices to

assume the following:

sup R

0

R2

α

R

ξ 2 R

α ξ m

ξ

2 d ξ A2

for all α k where k is the smallest integer

d 2

.

Hint: Verify the Hormander condition ii of Definition 7.1 directly rather than going

through Lemma 7.2.

Problem 10.2. Let T by a bounded linear operator on L p X , µ where X , µ is some

measure space and 1 p . Show the following vector-valued L p estimate:

T f j 2

p C p T p

p

f j 2

p




for any sequence f j of measurable functions for which the right-hand side is finite. Such

vector-valued extensions are very useful. A challenging question is how to extend this

property to sub-linear operators such as the Hardy-Littlewood maximal function, see [46],

page 51.

Problem 10.3. Let 1 p 2. By means of Khintchin’s inequality show that for

every ε 0 there exists f S Rd

such that

ˆ f p

ε f p, cf. Corollary 6.8.

Problem 10.4. In this problem the dimension is d 1.

a) Deduce (10.11) from Theorem 10.4 by means of the following “vector-valued”

inequality: Let I j j

Z be an arbitrary collection of intervals. Then for any 1

p

χ I jˆ f j

j

L p 2

:

j

χ I jˆ f j

2

12

p

C p

f j

L p

2

(10.30)

where f j j

Z are an arbitrary collection of functions in S R1

, say.

b) Now deduce (10.30) from Theorem 4.8 by means of Khinchin’s inequality Hint:

express the operator f χ I ˆ f

by means of the Hilbert transform via the same

procedure which was used in the proof of Theorem 4.11.

c) By similar means prove the following Littlewood-Paley theorem for functions in

L p T : For any f L1

T let

S f

j

0

P j f

2

12

where

P j f θ

2 j 1 n 2 j

ˆ f n e nθ

for j 1 and 0 f θ

ˆ f 0 . Show that, for any 1 p

C

1 p f L p

T

S f L p

T

C p f L p

T

(10.31)

for all f L p T .

d ) As a consequence of (10.11) with d 1 show the following multiplier theorem:

Let m : R 0 C have the property that, for each j Z

m ξ m j constant

for all 2 j 1

ξ

2 j. Then, for any 1 p

m ˆ f

L p

R

C p sup j

Z

m j f p

for all f S

R . Prove a similar theorem for L p T using (10.31).

Problem 10.5. Prove the following stronger version of Problem 7.4. By a dyadic

interval we mean an interval of the form 2k , 2k 1 where k Z. Suppose m is a bounded

function on R which is locally of bounded variation. In fact, assume that for some finite B

one has m

B and

I

dm B




for all dyadic intervals I . Show that m is bounded on L p R as a Fourier multiplier for

any 1 p with m ˆ f

p C p B f p. For an extension to higher dimensions

see [45].

Problem 10.6. We extend Corollary 10.14 to estimates for elliptic equations on a re-

gion Ω Rd with variable coefficients, i.e.,

n

i, j 1

ai j x

2u

xi x j

x f x in Ω

where

i, j ai j ξ i ξ j λ ξ

2 and ai j C α Ω . By “freezing x”, prove the following apriori

estimate from (10.29) for any f C 2,α Ω :

sup1

i, j

d

2 f

xi x j

C α K

C α, d , K

f C α

Ω

f L

Ω

for any compact K Ω. Hint: See Gilbarg-Trudinger [22] for this estimate and much

more.

Problem 10.7. Under the same assumptions as in Theorem 10.1 one has

m ˆ f

α C α B f α

for any 0 α 1 and f C α Rd

L2 R

d . Hint: This is a corollary of the proof of

Theorem 10.1. Analyze which properties of K it establishes and what is needed in order

for the proof of Theorem 10.11 to go through.

Problem 10.8. For this problem we need to recall the definition of a martingale diff er-

ence sequence: let F n

n

1 be an increasing sequence of σ-algebras on some probability

space Ω,P (this is called a filtration). Assume that the function X n : Ω R is mea-

surable with respect to F n and E X n F n 1 0. Then X n

n 1

is called a martingale

diff

erence sequence. Suppose

X n

n 1 is such a martingale diff

erence sequence adopted tosome filtration F n

n 1

. Show that

P

N

n

1

X n

λ

N

n

1

X n

2

1

2

Ce

cλ2

for any N 1, 2, . . ., and λ 0. Here C , c 0 are absolute constants.

Problem 10.9. Let r j

1 be a coin tossing sequence as before. Show that for any

a j C, and N Z

P

sup0

θ

1

N

j 1

r j a je2πi jθ

C 0

N

j 1

a j

2

12

log N

C 0 N

2

provided C 0 is a sufficiently large absolute constant.

Problem 10.10. Using the ideas of this chapter as well as the proof strategy of The-

orem 8.6, try to prove (8.10) (but with an arbitrary multiplicative constant). For which p

does this approach succeed?

Problem 10.11. This problem introduces Sobolev and Besov spaces and studies some

embeddings. We remark that the spaces with dots are called “homogeneous”, whereas

those without are called “inhomogeneous”.




Let 1 p and s 0. Define the Sobolev spaces W s, p R

d and W s, p

Rd

as completions of S Rd

under the norms

∆

s

2 f p respectively

∆

s

2 f p

where a

1 a2. The operators here are interpreted as Fourier multipliers.

Show that for any 1 p

f

W s, p

j

Z

22 js P j f

2

12

p

with implicit constants that only depend on s, p, d . Formulate and prove an

analogous statement for W s, p R

d . Does anything change for s 0? Note that

s 0 naturally arises by the duality relation

W s, p

W

s, p. For which s is

this valid? What is the analogue for the inhomogeneous spaces?

Define the Besov spaces Bs p,q R

d and Bs

p,q Rd

by means of the norms (for q

finite)

j

Z

2q js P j f

q p

1

qrespectively

j 0

2q js P j f

q p

1

q

where in the second sum P0 is interpreted as the projection onto all frequencies

C 1. By means of the results of this chapter obtain embedding theorems

between Besov and Sobolev spaces.

Prove the embedding Bσq,2

Rd

L p R

d for 2 q p , σ 0, and

1q

1 p

σd

. Does the same hold for the inhomogeneous spaces? What if

anything can one change in that case? By means of the previous item, deduce

embedding theorems for the Sobolev spaces from this.

Problem 10.12. Prove that f α C α, d f

W 1, p Rd

with d p , α 1

d p

and f S

Rd

. This is called Morrey’s estimate. For the case p d see Problem 9.7.

Problem 10.13. Show that Bd 2

2,1 R

d L

Rd

but Bd 2

2,q R

d L

Rd

for any 1

q . Show the embedding Bd 2

2,

Rd

BMO Rd

. Conclude that if f

Bd 2

2,q

Rd

A for

some 1 q , then w : e f satisfies Q

w x dx

Q

w 1 x dx C A, d , q

uniformly for all cubes Q Rd .



CHAPTER 11

Fourier Restriction and Applications

In the late 1960’s, Elias M. Stein posed the following question: is it possible to

restrict the Fourier transform ˆ f of a function f Rd

with 1

p 2 to the sphere

S d 1 as a function in Lq S d 1

for some 1 q ? By the uniform boundedness

principle this is the same as asking about the validity of the inequality

ˆ f S d 1 Lq

S d 1

C f L p Rd

(11.1)

for all f S

Rd

, with a constant C

C

d , p, q

? As an example, take p

1,

q and C

1. On the other hand, p 2 is impossible, as ˆ f is no better

than a general L2-function by Plancherel. Stein asked whether it is possible to find

1 p

2 so that for some finite q one has the estimate (11.1).

We have encountered an estimate of this type in Chapter 6, under the name of

trace estimate. In fact, by Lemma 6.11 one has

ˆ f S d 1

C x

σ f 2

as long as σ

12

(some work is required to pass from the planar case to the curved

one). However, (11.1) is translation invariant and therefore much more useful for

applications to nonlinear partial diff erential equations. In addition, as we shall see,

the answers to (11.1) crucially involve curvature, whereas the trace lemmas do notdistinguish between flat and curved surfaces.

Exercise 11.1. Suppose S is a bounded subset of a hyper-plane in Rd . Prove

that if

ˆ f S 1 C f L p Rd

for all f S Rd

, then necessarily p 1. In other

words, there can be no nontrivial restriction theorem for flat surfaces.

The following Tomas-Stein theorem settles the important case q 2 of Stein’s

question. For simplicity, we state it for the sphere. But it equally well applies to

any bounded subset of any smooth hyper-surface inRd with nonvanishing Gaussian

curvature.

Theorem 11.1. For every dimension d 2 there is a constant C d such that

for all f L p

R

d

ˆ f S d 1 L2

S d 1

C d f L p Rd

(11.2)

with p pd :

2d 2d 3

. Moreover, this bound fails for p pd .

The left-hand side in (11.2) is

S d 1

ˆ f w

2 σ dw

12

145



146 11. FOURIER RESTRICTION AND APPLICATIONS

where σ is the surface measure on S d 1.

We shall prove this theorem below, after making some initial remarks. Firstly,

there is nothing special about the sphere. In fact, if S 0 is a compact subset of ahypersurface S with nonvanishing Gaussian curvature, then

ˆ f S 0 L2 S 0

C d , S 0 f

L2d 2d 3

Rd

(11.3)

for any f S Rd

. For example, take the truncated paraboloid

S 0 : ξ , ξ

2 ξ R

d 1, ξ 1

which is important for the Schrodinger equation. On the other hand, (11.3) fails

for

S 0 : ξ , ξ ξ R

d 1 , 1 ξ

2

since this piece of the cone has exactly one vanishing principal curvature, namely

the one along a generator of the one. This latter example is relevant for the wave

equation as the characteristic surface of the wave equation is a cone, and so we

shall need to find a substitute of (11.3) for the wave equation. A much simpler

remark concerns the range 1 p pd in Theorem 11.1: for p 1, one has

ˆ f S d 1 L2

S d 1

ˆ f S d 1

S d 1

12

f

L1 Rd

S d 1

12 . (11.4)

Hence, it suffices to prove Theorem 11.1 for p pd since the cases 1 p pd

follows by interpolation with (11.4). The Stein-Tomas theorem is more accessible

than the general restriction problem with q 2; in fact, we shall see that the

appearance of L2 S d 1

allows one to use duality in the proof. To do so, we need

to identify the adjoint of the restriction operator

R : f

ˆ f S d 1

Lemma 11.2. For any finite measure µ in Rd , and any f , g S Rd one has the

identity

Rd

ˆ f ξ g ξ µ d ξ

Rd

f x g ˆ µ x dx

Proof. We use the following elementary identity for tempered distributions: If

µ is a finite measure, and φ S, then

φµ

φ ˆ µ

Therefore (and occasionally using F f

synonymously with ˆ f for notational rea-

sons),

Rd

ˆ f ξ g ξ µ

d ξ

Rd

f x

F

g µ

x

dx

Rd

f x

g ˆ µ x dx

Rd

f x g ˆ µ x dx

since

g

g g.

Lemma 11.3. Let µ be a finite measure on Rd , and d 2. Then the following

are equivalent:



11. FOURIER RESTRICTION AND APPLICATIONS 147

a)

f µ Lq Rd

C f L2 µ

for all f S Rd

b) g L2 µ

C g

Lq

Rd

for all g S Rd

c) ˆ µ f Lq Rd

C 2 f

Lq

Rd

for all f S Rd .

Proof. By the previous lemma, for any g S

Rd

,

g L2 µ

sup f

S, f

L2

µ

1

Rd

g ξ f ξ µ d ξ

sup f S,

f

L2 µ

1

Rd

g x

f µ x dx

(11.5)

Hence, if a) holds, then the right-hand side of (11.5) is no larger than g

Lq

Rd

and b) follows. Conversely, if b) holds, then the entire expression in (11.5) is no

larger than C

g

Lq

Rd

, which implies a). Thus a) and b) are equivalent with thesame choice of C . Clearly, applying first b) and then a) with f g yields

F

g µ Lq

Rd

C

g

Lq

Rd

for all g S Rd

. Since F g µ g ˆ µ, part c) follows. Finally, we note the

relation

g x ˆ µ f x dx

g x F µ ˇ f x dx

g ξ ˇ f ξ µ d ξ

for any f , g S Rd

. Hence, if c) holds then

Rd

g ξ ˇ f

ξ µ d ξ

C 2

g

Lq

Rd

f

Lq

Rd

Now set f x g x . Then

Rd

g ξ

2 µ d ξ C 2 g

2

Lq

Rd

which is b).

Setting µ σ σS d 1 , the surface measure of the unit sphere in Rd , one now

obtains the following:

Corollary 11.4. The following assertions are equivalent

a) The Stein-Tomas theorem in the “restriction form”:

ˆ f S d 1 L2

σ

C f

Lq

Rd

for q

2d 2d

3

and all f S Rd

b) the “extension form” of the Stein-Tomas theorem

gσS d 1 Lq

Rd

C g L2

σ

for q

2d 2d 1

and all g S Rd

.




c) The composition of a) and b): for all f S Rd

f σS d 1 Lq

Rd

C 2 f Lq

Rd

with q

2d 2d 1

.

Proof. Set µ σS d 1 σ in Lemma 10.2.

Exercise 11.2. In general, a) and b) above remain true, whereas c) requires

L2 σ

. More precisely, show the following: The restriction estimate

ˆ f S d 1 L p

σ

C

f

Lq

Rd

g

S

is equivalent to the extension estimate

gσS d 1 Lq

Rd

C g

L p

σ

g S

We shall now show via part b) of Corollary 11.4 that the Stein-Tomas theoremis optimal. This is the well-known Knapp example.

Lemma 11.5. The exponent pd

2d 2d 3

in Theorem 11.1 is optimal.

Proof. This is equivalent to saying that the exponent q

2d 2d 1

in part b) of

the previous corollary is optimal. Fix a small δ 0 and let g S such that g 1

on B ed ;

δ , g 0, and supp g B en; 2

δ where ed 0, . . . , 0, 1 is a unit

vector.

Then

gσ ξ

e 2πi x

ξ

ξ d

1 x

2

1

g x ,

1 x

2

1 x

2dx

cos 2π

x

ξ ξ d

1

x

2

1

g x ,

1 x

2

1

x

2dx

cos

π

4

g d σ C 1δd 1

2

(11.6)

provided ξ

δ

1

100 ξ d

δ

1

100. Indeed, under these assumptions, and for δ 0

small,

x

ξ ξ d

1

x

2

1

δ

δ

1

100

δ 1

100

δ

2

1

50 ,

so that the argument of the cosine in (11.6) is smaller than 2π50 π4 in absolute

value, as claimed.

Hence,

gσS d 1 Lq

Rd

C 1δ d 1

2

δ

d 1

2 δ 1

1q

C 1δd 1

2 δ

d 1

2q




whereas g L2

σ

C δd 1

4 . It is therefore necessary that

d 1

4

d 1

2

d 1

2q

or q

2d 2d 1

, as claimed.

For the proof of Theorem 11.1 we need the following decay estimate for the

Fourier transform of the surface measure σS d 1 , see Corollary 6.16:

σS d 1 ξ

C

1

ξ

d 12 (11.7)

It is easy to see that (11.7) imposes a restriction on the possible exponents for an

extension theorem of the form

f σS d 1 Lq Rd

C

f

L p S d 1

(11.8)

Indeed, setting f 1 implies that one needs

q

2d

d 1

by (11.7). On the other hand, one has

Exercise 11.3. Check by means of Knapp’s example from Lemma 10.3 that

(11.8) can only hold for

q

d 1

d 1

p (11.9)

The still unproved (in dimensions d 3) restriction conjecture states that (11.8)

holds under these conditions, i.e., provided

q

2d

d 1

and q

d 1

d 1

p

Observe that the Stein-Tomas theorem with q

2d 2d 1

and p 2 is a partial result

in this direction. In dimension d 2 the conjecture is true and can be proved by

elementary means, see the following chapter.

It is known, see Wolff ’s notes, that the full restriction conjecture implies the

Kakeya conjecture which states that Kakeya sets in dimension d 3 have (Haus-

dorff ) dimension d . This latter conjecture appears to be also very difficult.

Proof of the

Stein

-Tomas theorem for p

2d 2

d

3 . Let

j Z

ψ 2 j x

1

for all x 0 be the usual Littlewood-Paley partition of unity. By Lemma 10.2 and

Corollary 11.4 it is necessary and sufficient to prove

f σS d 1

L p

Rd

C f L p Rd




for all f S Rd

. Firstly, let

ϕ x 1

j 0

ψ 2 j x

Clearly, ϕ C

0 R

d

, and

1 ϕ

x

j 0

ψ 2 j x

for all x

Rd

Now observe that ϕ σS d 1 C

0 so that

f ϕ σS d 1 L p

Rd

C f L p Rd

(11.10)

with C ϕ σS d 1 Lr where 1

1 p

1r

1 p

, i.e., 2 p

1r

. It therefore remains to

control

K j : ψ

2 j x

σS d 1 x

in the sense that we need to prove an estimate of the form

f

K j L p

Rd

C 2 jε

f

L p Rd

f

S Rd

(11.11)

for all j 0 and some small ε

0. It is clear that the desired bound follows by

summing (11.10) and (11.11) over j 0. To prove (11.11) we interpolate a 2

2

and 1

bound as follows:

f

K j L2

ˆ f L2

K j L

f

L2 2n jψ

2 j σS d 1 L

C

f

L2 2d j

2 j d 1

C 2 j

f

L2

(11.12)

To pass to the estimate (11.12) one uses that

sup x

σS d 1 B x, r Cr d 1

as well as the fact that ψ has rapidly decaying tails, which implies that the estimate

is the same as for compactly supported ψ.

On the other hand,

f

K j

K j

f

1 C 2 j

d 1

2

f 1 (11.13)

since the size of K j is controlled by (11.7). Interpolating (11.12) with (11.13)

yields

f

K j p

C 2 j d 1

2 θ 2 j 1

θ

f

p

where 1 p

θ

1 θ

2

1 θ

2 . We thus obtain (11.11) provided

0

d 1

2 θ

1 θ

d 1

2 θ

1

d 1

1

2

1

p

1

d 1

2

d 1

p

This is the same as p

2d 2d 1

or p

2d 2d 3

, as claimed.




Exercise 11.4. Provide the details concerning the “rapidly decaying tails” in

the previous proof.

The previous proof strategy of the Tomas-Stein theorem does not achieve the

sharp exponent p

2d 2d 3

because (11.11) leads to a divergent series with ε 0. In

order to achieve this endpoint exponent, one therefore has to avoid interpolating the

operator bounds on each dyadic piece separately. The idea is basically to sum first

and then to interpolate, rather than interpolating first and then summing. Of course,

one needs to explain what it means to sum first: recall that the proof of the Riesz-

Thorin interpolation theorem is based on the three lines theorem from complex

analysis. The key idea in our context is to sum the dyadic pieces T j : f f K jtogether with complex weights w j z in such a way that

T z :

j 0

w j z T j

converges on the strip 0 Re z

1 to an analytic operator-valued function with

the property that

T z : L1 L for Re z

1 and

T z : L2 L2 for Re z

0

It then follows that T θ L p R

d L p

Rd

for 1 p

1 θ

2 . The “art” is then to

choose the weights w j z so that T θ at a specific θ is a prescribed operator such as

convolution by σS d 1 in our case.

While it is conceptually correct and helpful to describe complex interpolation

as a method of summing divergent series, it is rarely implemented in this fashion.

Rather, one tries to embed the desired operator under consideration into an analytic

family from the start without ever breaking it up into dyadic pieces. There are

certain standard ways of doing this which involve a little complex analysis and

distribution theory (on the level of integration by parts and analytic continuation).

We shall present such a standard approach now.

Proof of the endpoint for Tomas-Stein. We consider a surface of non-zero cur-

vature which can be written locally as a graph: ξ d h ξ , ξ

Rd 1. Define

M z ξ

1

Γ z

ξ d h ξ

z 1

χ1 ξ χ2

ξ d h ξ

(11.14)

where χ1 C 0

Rd 1

, χ2 C

0 R are smooth cut-off -functions, Γ is the Gamma

function, and Re z 0. Moreover,

refers to the positive part. We will show

thatT z f :

M z ˆ f

(11.15)

can be defined by means of analytic continuation to Re z 0. The main estimates

are now

T z 2 2 B z for Re z 1 (11.16)

T z 1

A z for Re z

d 1

2 (11.17)




where A z , B z

grow now faster than eC z

2

as Im z

. Moreover, we will

see that the singularity of

ξ d h ξ

z 1

at z 0 “cancels” out against the simple

zero of 1Γ

z

at z 0 so as to produce

M 0 ξ χ1 ξ δ0 ξ d h

ξ d ξ (11.18)

see (11.21); this means that M 0 ξ is proportional to surface measure on the graph.

It then follows from Stein’s complex interpolation theorem that

f

M 0 f

is bounded from L p L p

where

1

p

θ

1 θ

2 , 0

θ d 1

2

1 θ

which implies that

1

p

d 1

2d 2

as desired. It remains to check (11.16)–(11.18). To do so, recall first that 1Γ

z

is

an entire function with simple zeros at z 0,

1,

2 , . . .. It has the product

representation

1

Γ z

zeγ z

ν 1

1

z

ν

e z ν, z x iy

which converges everywhere in C. Thus,

1

Γ z

2

z

2 e2γ x

ν 1

1

x

ν

2

y2

ν2

e

2ν x

z

2 e2γ x

ν 1

e2 xν

z

2

ν2 e

2 xν

z

2e2γ xe z

2 π2

6

(11.19)

In particular, if Re z 1, then for all ξ R

d ,

M z ξ 1

y2 e2γ e 1 y2

π2

6 χ1 ξ χ2

ξ d h ξ

Cecy2

(11.16) holds with the stated bound on B z

. We remark that the bound we just

obtained is very wasteful. The true growth, as given by the Stirling formula, is of

the form y

12 e

π2 y as

y

but this makes no diff erence in our particular case.

Now let ϕ S Rd

. Thus, for Re z

0,

Rd

M z ξ ϕ ξ d ξ

1

Γ z

Rd 1

0

χ2 t ϕ ξ , t h ξ t z 1 dt χ1 ξ d ξ

1

zΓ z

Rd 1

0

d

dt

χ2 t

ϕ

ξ , t h

ξ

t z dt χ1 ξ d ξ

(11.20)




Observe that the right-hand side is well-defined for Re z 1. Furthermore, at

z 0, using zΓ z

z 0 1

Rd

M 0 ξ ϕ ξ d ξ

Rd 1

χ2 0

ϕ ξ , h ξ χ1 ξ d ξ

which shows that the analytic continuation of M z to z 0 is equal to (setting

χ2 0

1)

M 0 ξ χ1 ξ d ξ δ0 ξ d h ξ (11.21)

Clearly, M 0 is proportional to surface measure on a piece of the surface

S

ξ , h ξ ξ Rd 1

This is exactly what we want, since we need to bound σS f .

Observe that (11.20) defines the analytic continuation to Re z 1. Integrat-ing by parts again extends this to Re z 2 and so forth. Indeed, the right-hand

side of

Rd

M z ξ ϕ ξ d ξ

1

k

z z 1

. . . z k 1

Γ z

Rd 1

χ1 ξ

0

t z k 1

d k

dt k

χ2 t

ϕ ξ , ξ d k

ξ

dt d ξ

is well-defined for all Re z

k . Next we prove (11.17) by means of an estimate

on

M z

. This requires the following preliminary calculation: let N be a positive

integer such that N Re z 1 0. Then we claim that

0

e 2πit τt z χ2 t

dt

C N 1

z

N

1 Re z

1

τ

Re z 1 (11.22)

To prove (11.22) we will distinguish large and small t τ. Let ψ C 0

R be

such that ψ t

1 for

t

1 and ψ

t

0 for t

2. Then, since 0

χ2 1,

we have

0

e 2πit τt zψ t τ χ2 t dt

0

t Re zψ t τ dt

τ

Re z 1

2

0

t Re z dt

C

Re z 1

τ

Re z 1

(11.23)

If τ 1, then (11.23) is no larger than

0

t Re z χ2 t dt

C

Re z 1

Hence

(11.23)

C

1 Re z 1 τ

Re z

1 (11.24)




Observe that the growth in xd for Re z

d 12

is exactly balanced by the decay

in (11.28). One concludes that for Re z

d 1

2 and with k as in (11.27)

M z

C d

1

Γ z z z 1

. . . z k 1

1

z

2

see (11.26)–(11.28). Thus (11.17) follows from the growth estimate (11.19) or

Stirling’s formula, and we are done.

The endpoint of the Tomas-Stein theorem is so important because of its scaling

invariance. To see this, let u t , x

be a smooth solution of the Schrodinger equation

1i

t u

12π Rd u F

u t 0 f

(11.29)

where f S Rd

, say. The constant 12π

is rather arbitrary and chosen for cosmetic

reasons having to do with our choice of normalization of the Fourier transform. It

could be replaced by any other positive constant by rescaling; one can even change

the sign of the constant by passing to complex conjugates. We first treat the case

F 0. Then

u t , x

Rd

e2πi x ξ t ξ

2 ˆ f ξ d ξ

ˆ f µ

t , x (11.30)

where µ is the measure in Rd 1 defined by the integration

Rd 1

F ξ, τ µ d ξ, τ

Rd

F ξ, ξ 2 d ξ

for all F C 0 Rd 1

. Now let ϕ C 0

Rd 1

, ϕ ξ, τ 1 if ξ τ 1. Then

the endpoint of Stein-Tomas applies and one concludes that

ˆ f ϕµ

Lq Rd 1

C

ˆ f L2

ϕµ

where q

2d 4d

2

4d

. In other words, if supp

ˆ f B 0, 1

, then

ˆ f µ

L2

4d

Rd 1

C ˆ f L2 Rd

C f L2 Rd

(11.31)

To remove the condition supp

ˆ f B 0, 1 we rescale. Indeed, the rescaled func-

tions

f λ x f x λ

uλ x, t u x λ, t λ2

(11.32)

satisfy the equation

1

i

t uλ

1

2π uλ

0uλ t 0 f λ

If supp

ˆ f is compact, then supp

ˆ f λ B 0, 1

if λ is large. Hence, in view of

(11.30) and (11.31),

uλ Lq Rd 1

C f λ L2 Rd

(11.33)

However,

f λ L2 Rd

λd 2

f L2 Rd




and

uλ Lq Rd 1

λd 2

q u Lq

λd 2

u Lq

Thus, (11.33) is the same as

u Lq Rd 1

C f L2 Rd

(11.34)

for all f S will supp

ˆ f compact. These functions are dense in L2

Rd

. There-

fore, (11.34), which is the original Strichartz bound for the Schrodinger equation in

d spatial dimensions, follows for all f L2 R

d . Note the main diff erence between

the Strichartz estimate (11.33) and the Stein-Tomas endpoint estimate: while the

latter is confined to compact surfaces, the former applies to functions which live on

the non-compact characteristic variety of the Schrodinger equation, i.e., the parab-

oloid ξ, ξ 2 ξ R

d . Therefore, the Strichartz estimate necessarily obeys the

scaling symmetry of that variety which is also all we needed to pass from compact

surfaces to this specific non-compact surface.

While the approach we followed to derive the estimate (11.34) for solutions

of (11.29) with F 0 is the original one used by Strichartz and instructive due

to its reliance on the restriction property of the Fourier transform, it is technically

advantageous for many reasons to use a shorter and somewhat diff erent argument

which was introduced and developed by Ginibre and Velo in a series of papers.

We will present this argument now to derive the following theorem. We call a pair

p, q Strichartz admissible if and only if

2

p

d

q

d

2 (11.35)

and 2

p

with

p, q

2,

.Theorem 11.6. Let F be a space-time Schwartz function in d

1 dimensions,

and f a spatial Schwartz function. Let u t , x solve (11.29). Then

u L pt L

q x Rd 1

C

f L2 Rd

F

La

t Lb

x Rd 1

(11.36)

where p, q

and

a, b

are Strichartz admissible with a

2 and p

2. Finally,

these estimates localize in time: if u is restricted to some time-interval I on the

left-hand side, then F can be restricted to I on the right-hand side.

Some remarks are in order: first, the estimate (11.36) is scaling invariant under

the law (11.32) if and only if p, q

and

a, b

obey the relation (11.35). Hence the

latter is necessary for the Strichartz estimates (11.36) to hold. Note that the pair p q

2

4d

which we derived above for F 0 is of this type. On the other

hand, the restriction p 2 is technical and the important endpoint estimate for

p 2 and d

3 was proved by Keel and Tao. Finally, under the strong conditions

on F and f it is easy to actually solve (11.29) by means of an explicit expression:

u t

e

i2πt f

t

0

e

i2π

t s F s

ds (11.37)




with the understanding that the operator needs to be understood as a Fourier multi-

plier:

e

i

2πt f

x :

e2πi

x

ξ

ξ

2

t

ˆ f ξ d ξ

which is well-defined for f S Rd

, say.

Proof of Theorem 11.6. We again start with F 0 and let U

t

denote the

propagator, i.e., U t

f is the solution of (11.29) as give by (11.37): U

t

e

i2πt .

By (6.14) we have the bound

U t f Lq Rd

C t

d 2

1

q

1q

f

Lq

Rd

(11.38)

for 1 p

2. Indeed, for p 1 this follows by placing absolute values in-

side (6.14), whereas p 2 is Plancherel’s theorem. The intermediate p values

then follow by interpolation. Our goal for now is to prove that for any p, q

as in

the theorem U f L

pt L

q x

C f 2

where U f t , x U t f x

, slightly abusing notation. In analogy with the

“duality equivalence” Lemma 11.3 one now has that each of the following estimate

implies the other two:

U f L pt L

q x

C f L2 x

U F L2 x

C F

L p

t Lq

x

U U

L pt L

q x

C 2 F

L p

t Lq

x

As in the case of the Tomas-Stein theorem, the key is to prove the third property.

Note that one has

U F

U s F s ds

whence by the group property of U t

U U F t

U t s F s ds

By (11.38),

U U F t Lq x

C

t s

d 2

1

q

1q

F s

Lq

x

ds

whence by fractional integration in time with

1

1

p

1

p

d

2

1

q

1

q

(11.35)

provided

0

d

2

1

q

1

q

1 p

2 (11.39)

one has

U U F L pt L

q x

C F

L p

t Lq

x




which is the desired estimate for the case F 0. Now suppose f 0, and write

the solution of (11.29) as

u t

t

0

U t

s

F

s

ds

χ 0 s t U

t

s

F

s

ds

We now claim two estimates on u as a space-time function:

u L pt L

q x

C F

L p

t Lq

x

u L pt L

q x

C F L1t L

2 x

The first one is proved by the same argument as before, whereas the second one is

proved as follows:

u

t

Lq x

χ 0 s t

U t

s

F

s

Lq x

ds

U t s F s Lq x

ds

whence

u L pt L

q x

C

U t s F s L pt L

q x

ds

C

F

s

L2 x

ds

F L1

s L2 x

By duality and interpolation one now concludes that

u

L pt L

q x

C

F

La

s Lb

x

for any admissible pair a, b

and

p, q

with a

2 and p

2. This concludes

the argument. The statement about time-localizations follows from the solution

formula (11.37).

Exercise 11.5. The previous argument of course off ers an alternative to the

complex interpolation approach that we used to obtain the Stein-Tomas endpoint.

Provide the details of this alternative proof to Theorem 11.1 for general bounded

subsets of surfaces of nonvanishing Gaussian curvature.

As an application of Theorem 11.6 we now solve the nonlinear equation

i t ψ ∆ψ λ ψ

4d ψ (11.40)

in d spatial dimensions and with an arbitrary real-valued constant λ for small L2-

data ψ t 0 ψ0. First, it is not apriori clear what we mean by “solve” since the

data are L2, nor is it immediately clear why we do not consider smoother data.

Second, we remark that the choice of power in (11.40) is not accidental. In fact, it

is the unique power for which the rescaled functions

λd 2 ψ λ x, λ2t (11.41)




are again solutions. The relevance of this scaling law is the fact that the L2-invariant

scaling of the data is λd 2 ψ0 λ x , which therefore leaves the smallness condition un-

changed. Moreover, the induced rescaling of the solution is then given by (11.41).Hence, the only setting in which we can hope for a meaningful global small L2-data

theory is (11.40).

As in the existence and uniqueness theory of ordinary diff erential equations,

one reformulates (11.40) as an integral equation as follows (this is called Duhamel

formula):

ψ t eit ∆ψ0 iλ

t

0

ei t s ∆

ψ s

4d ψ s ds (11.42)

and then seeks a solution of this integral equation by contraction or Picard iteration

belonging to the space C t 0, ; L2 x R

d . This is very natural, as (11.40) con-

serves the L2-norm for real λ (see the problem section). However, this space is still

too large to formulate a meaningful theory in, and so we restrict it further in a waythat allows us to invoke Theorem 11.6.

Definition 11.7. By a global weak solution of (11.40) with data ψ0 L2 R

d

we mean a solution to the integral equation (11.42) which lies in the space

X : C t L

t 0,

; L2 x R

d L

p0

t 0,

; L2 p0 x R

d

(11.43)

with p0 :

1

4d

.

We remark that the power p0 in the definition is very natural. In fact, passing

L2-norms onto (11.42) yields

ψ

t

2

C

ψ0

2

λ

t

0

ψ

s

p0

2 p0 ds

which is at least finite for ψ X . Now we will show much more.

Corollary 11.8. The nonlinear equation (11.40) in R1 d t , x admits a unique

weak solution in the sense of Definition 11.7 for any data ψ0 L2 R

d provided

ψ0 2 ε0 λ, d .

Proof. For the sake of simplicity, we set d 2 which implies that the nonlin-

earity takes the form λ ψ

2 ψ and p0 3. We leave the case of general dimensions

to the reader. Let us first prove uniqueness. Thus, let ψ and ϕ be two weak solutions

to (11.42). Taking diff erences yields

ψ ϕ t iλ

t

0

ei t s ∆ ψ s

2ψ s ϕ s

2ϕ s ds

Therefore, applying the Strichartz estimates of the previous theorem locally in time

on I 0, T

yields, with X I being as in (11.43) localized to I ,

ψ ϕ X I C λ ψ s

2ψ s ϕ s

2ϕ s L1 0,T ; L2

R2

C λ ψ X I ϕ X I

2 ψ ϕ X I




However, if I is small enough then C λ ψ X I ϕ X I

2 1 whence ψ

ϕ X I 0. Thus, ψ ϕ on I . This argument shows that the set where ψ ϕ is

both open and closed and since it is nonempty (it contains t

0) therefore equalsthe whole time-line.

To prove existence, we set up a contraction argument in the space X for small

data. Thus, we define and operator A on X by the formula

Aψ

t :

eit ∆ψ0

iλ

t

0

ei t s ∆

ψ s

2ψ s

ds

and note that by Theorem 11.6

Aψ X C ψ0 2 λ ψ

3 X

Therefore, if ψ X R, and if

C ψ0 2 λ R3 R (11.44)

we conclude that A takes and R-ball in X into itself. To satisfy (11.44) we simply

choose ε so small that R C ε will verify (11.44) for any ψ0 2 ε. Taking

diff erences as in the uniqueness proof then shows that A is in fact a contraction.

Therefore, there exists a unique fixed-point ψ in the R-ball in X . But Aψ ψ is

precisely the definition of a weak solution and we are done.

Corollary 11.8 is only one of many results which can be obtained in this area.

It is also clear that many interesting questions pose themselves now, such as the

question of global existence for large data (which depends on the sign of λ) which

is relatively easy for data in H 1

but hard for data in L2

, persistence of regularity,spatial decay of solutions etc. We present some of the easier results in this direction

in the problem section. Showing that an equation with a derivative nonlinearity

such as

i t ψ ∆ψ ψ

2 1ψ (11.45)

has global solutions for small data in S, say, cannot be accomplished by means of

Strichartz estimates alone. Indeed, the Strichartz estimates are not able to make up

for the loss of a derivative on the right-hand side.

We conclude this chapter with a sketch of Strichartz estimates for the wave

equation in R1 d t , x :

u tt u ∆u F

u

t 0 f , t u

t 0 g

(11.46)

The solution to this Cauchy problem is

u t

cos

t

∇

f

sin t ∇

∇

g

t

0

sin t s ∇

∇

F s

ds




at least for Schwartz data, say. The meaning of the operators appearing here is of

course based on the Fourier transform. In other words,

cos t ∇ f x

12

Rd

e2πi x ξ

t ξ ˆ f ξ d ξ

Proceeding analogously to the Schrodinger equation were are therefore lead to

interpreting these integrals as Fourier transforms of measures living on the double

light-cone

Γ :

ξ, τ Rd

1

τ ξ

This is diff erent from the paraboloid in the Schrodinger equation in two important

ways: first, Γ has one vanishing principal curvature namely along the generators

of the light cone. Second, Γ is singular at the vertex of the cone. To overcome the

latter difficulty, we define

Γ0 : τ, ξ

R1

d 1 ξ τ 2

to be a section of the light cone in R1 d . Since the cone has one vanishing principal

curvature, the Fourier transform of the surface measure on Γ0 exhibits less decay

as for the paraboloid which reflects itself in a diff erent Stein-Tomas exponent (in

fact, the numerology here is such that the dimension d is reduced to d 1 reflecting

the loss of one principal curvature).

Exercise 11.6.

By means of the method of stationary phase from Chapter 6 show that

ϕσΓ0

t , x

C

1

t , x

d 1

2 (11.47)

for all t , x R1 d . Moreover, this is optimal for all directions

t , x

belonging to the dual cone Γ (which is equal to Γ if the opening angle is

90 ).

Check that the complex interpolation method from above therefore im-

plies that there is the following restriction estimate for Γ0:

ˆ f Γ0 L2

σΓ

C f L p Rd

where p

2d 2

d 3

and d 2 (for d 1 this amounts to a trivial bound).

Via a rescaling and Littlewood-Paley theory we now arrive at the following

analogue of the argument leading to (11.34) for the Schrodinger equation.

Theorem 11.9. Let Γ be the double light-cone in R1 d τ,ξ

with d 2 , equipped

with the measure µ d

τ, ξ

d ξ

ξ . Then

Γ

ˆ f τ, ξ

2 µ d

τ, ξ

12

C

f

L p Rd

(11.48)

with p

2d 2d 3

.




Proof. Let Γ0 be the cone restricted to 1 τ 2 as above. Then

λ τ,ξ 2λ

ˆ f

τ, ξ

2

µ

d

τ, ξ

12

Γ0

ˆ f λ τ, ξ

2 µ d

λ τ, ξ

12

C λd 1

2

1

λd 1 f

1

λ

L p R1

d

C λ

d 12

λ d 1 λ

d 1 p

f

L p Rd 1

C

f

L p Rd 1

(11.49)

by choice of p. To sum up (11.49) for λ 2 j we use the Littlewood-Paley theorem

together with Exercise 10.3. In fact, since p 2, we conclude that

ˆ f

L2

µ

C

j Z

P j f

2

L p

12

C

j Z

P j f

2

12

L p

C f

L p

Rd 1

which is (11.48).

For the wave equation this means the following.

Corollary 11.10. Let u t , x be a solution of (11.46) with f , g S Rd

, d 2.

Then

u

L2d 2d 1

Rd 1

C f

H 12

Rd

g

H

12

Rd

with a constant C C d .

Proof. For simplicity, set f 0. Then

u t , x

Rd

e2πi t ξ x ξ

g ξ d ξ

4πi ξ

Rd

e2πi t ξ x

ξ g ξ

d ξ

4πi ξ

F µ

x, t

where F ξ, ξ

ˆ f ξ and d µ ξ, ξ

d ξ

ξ

. By the dual to Theorem 11.9,

F µ

L p

Rd 1

C F F 2 µ

(11.50)

where p

2d 2d 1

. Clearly,

F L2 µ

Rd

g ξ

2 d ξ

ξ

12

g

H

12

so that (11.50) implies that

u

L2d 2d 1

Rd 1

C g

H

12

Rd

as claimed.

Exercise 11.7. Derive estimates for the wave equation as in the previous corol-

lary but with f

H 1 g

L2 on the right-hand side.




Notes

A standard introductory reference in this area are Stein’s Beijing lectures [48] and

Wolff ’s course notes [56]. The original reference by Strichartz is [50]. For an account

summarizing more recent developments see Tao [53]. Strichartz estimates are a vast area,

and continue to be developed, especially in the variable coefficient setting, see Bahouri,

Chemin, and Danchin [3] for that aspect. A standard references on dispersive evolution

equations is Tao [52]. Keel, Tao [31] prove the endpoint for the Strichartz estimates, i.e.,

L2t -type estimates. For more on the Schrodinger equation see Cazenave [8], as well as

Sulem and Sulem [51]. For the wave equation, see Shatah and Struwe [42]. Strichartz

esimates for the Klein-Gordon equation, which behaves like the wave equation for large

frequencies and the Schrodinger equation for small frequencies, respectively, are derived

in [37]. Some of the original work by Ginibre and Velo on Strichartz estimates for various

dispersive equations can be found here [23]. Finally, for the solution theory of equations

of the type (11.45) which require smoothing estimates for the Schr¨ odinger equation, seeKenig, Ponce and Vega [32].

Problems

Problem 11.1. Suppose that φ is a smooth function on 1, 1 with φ

t 0 if and

only if t 0. Assume further that φ

0 0 and φ

0 0. Let a C 1, 1 with

supp a 1, 1 . Determine the sharp decay of

1

1

eiλφ

t a t dt

as λ .

Problem 11.2. Investigate the restriction of the Fourier transform of a function in R3

to a (section of a) smooth curve in R3. Assume that the curve has nonvanishing curvature

and torsion. Try to generalize to other dimensions and codimensions.

Problem 11.3. This problem expands on Corollary 11.8.

Show that if ψ0 H k R

d (one can again take d 1 or d 2 for simplicity)

where k is a positive integer, then ψ C t 0, ; H k R

d . Conclude via Sobolev

imbedding that if ψ0 S Rd

, say, then ψ t , x is smooth in all variables and

satisfies the nonlinear equation (11.40) in a pointwise sense.

Show that if ψ t , x is any solution (not necessarily small) of (11.40) with t ψ

C 0, T ; L2 R

d and ψ

C 0, T ; H 2 Rd

, then one has conservation of mass

(recall λ is real)

M ψ :

1

2 ψ t

22

1

2 ψ0

22 0 t T

and energy

E ψ :

Rd

1

2 ∇ψ t , x

2

λ

2

4d

ψ t , x

2

4d

dx

Hint: diff erentiate M and E with respect to time, and integrate by parts.




Prove the relation

eit ∆ x x 2i∇ eit ∆

as operators acting on Schwartz functions and use this to show that if ψ0

S Rd

, then the solution constructed in Corollary 11.8 has the property that for

any time t one has ψ t S Rd

.

Show that the solution of Corollary 11.8 with ψ0 L2 R

d preserves the mass

M ψ and the energy provided the data satisfy ψ0 H 1 R

d .

Problem 11.4.

For any power p 2 show that the equation

i t ψ xx ψ λ ψ

p 1ψ (11.51)

on the line R, has local in-time unique weak solutions for any data ψ0 H 1 R .

These solutions again need to be interpreted in the Duhamel integral equation

sense, and the space in which to contract is C 0, T ; H 1 R where T 0 can

be chosen so that it is bounded below by a function of

ψ0

H 1 R alone. Showthat these solutions conserve mass and energy.

Show that if T

is the maximal time so that a weak solution exists on 0, T

,

and if T

, then ψ t H 1

R

as t T

. Conclude that for λ 0 one

has global solutions, i.e., T

. Note: for λ 0 this is false, for example

solutions of negative energy blow up in finite time in the sense that T

,

see [8] or [51].

For small data in H 1 R and powers p 5 show that one has global solutions

irrespective of the sign of λ.

Problem 11.5. For the one-dimensional wave equation utt u xx 0 with smooth data

u

t 0

f and t u

t 0

g show that the solution is given by

u t , x

1

2 f x t f x t

1

2

x

t

x

t

g y dy

for all times. Conclude that such waves do not decay, precluding any possibility of a

Strichartz estimate other than one involving L

t .

Problem 11.6. Carry out the Ginibre-Velo approach to Strichartz estimates for the

wave equation, in analogy with the Schrodinger equation as in Theorem 11.6. This needs to

be done for a fixed Littlewood-Paley piece, say P0, and then rescaled and summed up using

the results of the previous chapter. This approach leads to Besov-space estimates which

are stronger than the Sobolev ones (one can switch to the latter by means of Exercise 10.3

as we did in the proof of Corollary 11.10). Hint: See [42] for details.



Bibliography

[1] Adams, D. R. A note on Riesz potentials. Duke Math. J. 42 (1975), no. 4, 765–778.

[2] Antonov, N. Yu. Convergence of Fourier series. Proceedings of the XX Workshop on Function

Theory (Moscow, 1995). East J. Approx. 2 (1996), no. 2, 187–196.

[3] Bahouri, H., Chemin, J.-Y., Danchin, R. Fourier analysis and nonlinear partial di ff erential

equations. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Math-

ematical Sciences], 343. Springer, Heidelberg, 2011.

[4] Bergh, J., Lofstrom, J. Interpolation spaces. An introduction. Grundlehren der Mathematischen

Wissenschaften, No. 223. Springer-Verlag, Berlin-New York, 1976.

[5] Bourgain, J. Bounded orthogonal systems and the Λ

p

?(p)-set problem. Acta Math. 162

(1989), no. 3-4, 227–245.

[6] Carleson, L. On convergence and growth of partial sums of Fourier series. Acta Math. 116,

135–157 (1966).

[7] Garca-Cuerva, J., Rubio de Francia, J., Weighted norm inequalities and related topics. North-

Holland Mathematics Studies, 116. Notas de Matematica [Mathematical Notes], 104. North-

Holland Publishing Co., Amsterdam, 1985.

[8] Cazenave, T. Semilinear Schr¨ odinger equations. Courant Lecture Notes in Mathematics, 10.

New York University, Courant Institute of Mathematical Sciences, New York; American Math-

ematical Society, Providence, RI, 2003.

[9] Christ, M. Lectures on singular integral operators. CBMS Regional Conference Series in Math-

ematics, 77. Published for the Conference Board of the Mathematical Sciences, Washington,

DC; by the American Mathematical Society, Providence, RI, 1990.[10] Coifman, R., Meyer, Y. Wavelets. Calder´ on-Zygmund and multilinear operators. Translated

from the 1990 and 1991 French originals by David Salinger. Cambridge Studies in Advanced

Mathematics, 48. Cambridge University Press, Cambridge, 1997.

[11] Davis, K. M., Chang Lectures on Bochner Riesz means. London Mathematical Society, Lecture

Note Series 114, 1987.

[12] Duoandikoetxea, J. Fourier analysis. (English summary) Translated and revised from the 1995

Spanish original by David Cruz-Uribe. Graduate Studies in Mathematics, 29. American Math-

ematical Society, Providence, RI, 2001.

[13] Evans, L. C. Partial di ff erential equations. Second edition. Graduate Studies in Mathematics,

19. American Mathematical Society, Providence, RI, 2010.

[14] Feff erman, C. The multiplier problem for the ball. Ann. of Math. (2) 94 (1971), 330–336.

[15] Feff erman, C The uncertainty principle. Bull. Amer. Math. Soc. (N.S.) 9 (1983), no. 2, 129–

206.

[16] Feff

erman, C., Stein, E. M. H

p

Hp spaces of several variables. Acta Math. 129 (1972), no. 3-4,137–193.

[17] Folland, Gerald B. Introduction to partial di ff erential equations. Second edition. Princeton Uni-

versity Press, Princeton, NJ, 1995.

[18] Frank, R., Seiringer, R. Non-linear ground state representations and sharp Hardy inequalities.

J. Funct. Anal. 255 (2008), no. 12, 3407–3430.

[19] Frazier, M., Jawerth, B., Weiss, G. Littlewood-Paley theory and the study of function spaces.

CBMS Regional Conference Series in Mathematics, 79. Published for the Conference Board

165



166 BIBLIOGRAPHY

of the Mathematical Sciences, Washington, DC; by the American Mathematical Society, Prov-

idence, RI, 1991.

[20] Garnett, J. Bounded analytic functions. Revised first edition. Graduate Texts in Mathematics,

236. Springer, New York, 2007.

[21] Garnett, J. B., Marshall, D. E. Harmonic measure. Reprint of the 2005 original. New Mathe-

matical Monographs, 2. Cambridge University Press, Cambridge, 2008.

[22] Gilbarg, D., Trudinger, N. Elliptic partial di ff erential equations of second order. Second edition.

Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical

Sciences], 224. Springer-Verlag, Berlin, 1983.

[23] Ginibre, J., Velo, G. On a class of nonlinear Schr¨ odinger equation. I. The Cauchy problems; II.

Scattering theory, general case, J. Func. Anal. 32 (1979), 1-32, pp. 33–71; Scattering theory

in the energy space for a class of nonlinear Schr¨ odinger equations, J. Math. Pures Appl. (9)

64 (1985), no. 4, pp. 363–401; The global Cauchy problem for the nonlinear Klein-Gordon

equation. Math. Z. 189 (1985), no. 4, 487–505.; Time decay of finite energy solutions of the

nonlinear Klein-Gordon and Schr¨ odinger equations. Ann. Inst. H. Poincare Phys. Theor. 43

(1985), no. 4, 399–442.

[24] Havin, V., Joricke, B. The uncertainty principle in harmonic analysis. Ergebnisse der Mathe-

matik und ihrer Grenzgebiete (3), 28. Springer-Verlag, Berlin, 1994.

[25] Hoff man, K. Banach spaces of analytic functions. Reprint of the 1962 original. Dover Publica-

tions, Inc., New York, 1988.

[26] Hormander, L. The analysis of linear partial di ff erential operators. I. Distribution theory and

Fourier analysis. Second edition. Springer Study Edition. Springer-Verlag, Berlin, 1990.

[27] Jaming, P. Nazarov’s uncertainty principles in higher dimension. J. Approx. Theory 149 (2007),

no. 1, 3041.

[28] Janson, S. Mean oscillation and commutators of singular integral operators. Ark. Mat. 16

(1978), no. 2, 263–270.

[29] Jerison, D. An elementary approach to local solvability for constant coe fficient partial di ff eren-

tial equations. Forum Math. 2 (1990), no. 1, 45–50.

[30] Katznelson, I. An introduction to harmonic analysis. Dover, New York, 1976.

[31] Keel, M., Tao, T. Endpoint Strichartz estimates, Amer. J. Math., 120 (1998), pp. 955–980.

[32] Kenig, C. E., Ponce, G., Vega, L. The initial value problem for the general quasi-linear

Schr¨ odinger equation. Recent developments in nonlinear partial diff erential equations, 101–

115, Contemp. Math., 439, Amer. Math. Soc., Providence, RI, 2007.

[33] Kolmogorov, A. N. Une s´ erie de Fourier-Lebesgue divergente presque partout. Fundamenta

Mathematicae (4), 1923, 324–328.

[34] Koosis, P. Introduction to H p spaces. Second edition. With two appendices by V. P. Havin [Vik-

tor Petrovich Khavin]. Cambridge Tracts in Mathematics, 115. Cambridge University Press,

Cambridge, 1998.

[35] Konyagin, S. V. On the divergence everywhere of trigonometric Fourier series. (Russian) Mat.

Sb. 191 (2000), no. 1, 103–126; translation in Sb. Math. 191 (2000), no. 1-2, 97–120.

[36] Lieb, E. H., Loss, M. Analysis. Second edition. Graduate Studies in Mathematics, 14. American

Mathematical Society, Providence, RI, 2001.

[37] Nakanishi, K., Schlag, W. Invariant Manifolds and Dispersive Hamiltonian Evolution Equa-

tions., EMS, 2011.

[38] Nazarov, F. L. Local estimates for exponential polynomials and their applications to inequali-

ties of the uncertainty principle type. Algebra i Analiz 5 (1993), no. 4, 3–66; translation in St.

Petersburg Math. J. 5 (1994), no. 4, 663–717.

[39] Opic, B., Kufner, A. Hardy-type inequalities. Pitman Research Notes in Mathematics Series,

219. Longman Scientific & Technical,

[40] Rudin, W. Functional analysis. Second edition. International Series in Pure and Applied Math-

ematics. McGraw-Hill, Inc., New York, 1991.

[41] Rudin, W. Trigonometric series with gaps. J. Math. Mech. 9 1960, 203–227.



BIBLIOGRAPHY 167

[42] Shatah, J., Struwe, M. Geometric wave equations. Courant Lecture Notes in Mathematics, 2.

American Mathematical Society, Providence, RI, 1998.

[43] Sjolin, P. An inequality of Paley and convergence a.e. of Walsh-Fourier series. Ark. Mat. 7,

551–570 (1969).

[44] Stein, E. M. On limits of seqences of operators. Ann. of Math. (2) 74 (1961), 140–170.

[45] Stein, E. M. Singular integrals and di ff erentiability properties of functions. Princeton Mathe-

matical Series, No. 30 Princeton University Press, Princeton, N.J., 1970.

[46] Stein, E. M. Harmonic analysis: real-variable methods, orthogonality, and oscillatory inte-

grals. With the assistance of Timothy S. Murphy. Princeton Mathematical Series, 43. Mono-

graphs in Harmonic Analysis, III. Princeton University Press, Princeton, N.J., 1993.

[47] Stein, E. M. Topics in harmonic analysis related to the Littlewood-Paley theory. Annals of

Mathematics Studies, No. 63 Princeton University Press, Princeton, N.J.; University of Tokyo

Press, Tokyo 1970.

[48] Stein, E. M. Oscillatory integrals in Fourier analysis. Beijing lectures in harmonic analysis

(Beijing, 1984), 307355, Ann. of Math. Stud., 112, Princeton Univ. Press, Princeton, NJ, 1986.

[49] Stein, E. M., Weiss, G. Introduction to Fourier analysis on Euclidean spaces. Princeton Math-

ematical Series, No. 32. Princeton University Press, Princeton, N.J., 1971.

[50] Strichartz, R. S. Restrictions of Fourier transforms to quadratic surfaces and decay of solutions

of wave equations. Duke Math. J. 44 (1977), no. 3, 705–714.

[51] Sulem, C., Sulem, P-L. The nonlinear Schr¨ odinger equation. Self-focusing and wave collapse,

Applied Mathematical Sciences, 139. Springer-Verlag, New York, 1999.

[52] Tao, T. Nonlinear dispersive equations. Local and global analysis. CBMS Regional Conference

Series in Mathematics, 106. AMS Providence, RI, 2006.

[53] Tao, T. Some recent progress on the restriction conjecture. Fourier analysis and convexity, 217–

harmonicnotes.pdf

Documents