Università degli Studi dell’Insubria
Dipartimento di Scienze ed Alta Tecnologia

Lecture Notes for the Course of Mathematical Methods of Physics

Italo Guarneri

Academic Year 2012-13

March 3, 2015


CONTENTS

1. Functional Spaces.
  1.1 Generalities.
  1.2 Normed Spaces.
    1.2.1 Topological Vector Spaces.
    1.2.2 Normed Spaces.
    1.2.3 Pre-Hilbert spaces.
    1.2.4 ℓ²-spaces.
    1.2.5 Spaces of Continuous Functions.
  1.3 Banach spaces, and Hilbert spaces.
    1.3.1 Need of a new concept of Integral.

2. Rudiments of Measure Theory.
  2.1 Elementary measures.
    2.1.1 Quadrature of plane sets.
  2.2 Measures.
    2.2.1 Lebesgue measure.
    2.2.2 Measurable Functions.
    2.2.3 Sets of zero measure.
  2.3 Integrals.
    2.3.1 Definition.
    2.3.2 Elementary properties.
    2.3.3 Special Cases.
    2.3.4 Exchanging lim and ∫.
  2.4 Square summable functions.
    2.4.1 L²-spaces.
    2.4.2 Convergence in the Mean.

3. Elementary Theory of Hilbert Spaces.
  3.1 Orthogonal projections.
    3.1.1 Hilbert subspaces.
    3.1.2 The Projection theorem.
    3.1.3 Decomposition theorem.
  3.2 Hilbert bases.
    3.2.1 Orthonormal systems.
    3.2.2 Best Approximation.
    3.2.3 Generalized Fourier series.
    3.2.4 Completeness.
    3.2.5 Hilbert bases: examples.
    3.2.6 Separability.
  3.3 Linear Maps.
    3.3.1 Isomorphic Hilbert spaces.
    3.3.2 Bounded linear maps.
    3.3.3 A theorem of Riesz.
  3.4 The Algebra of bounded operators.
    3.4.1 Adjoint Operators.
    3.4.2 Inverse operators.
    3.4.3 Unitary Operators.
    3.4.4 Projectors.
    3.4.5 Convergence of operator sequences.

4. Fourier Analysis.
  4.1 Fourier Series.
    4.1.1 Finite Fourier transform.
    4.1.2 Periodic Functions.
    4.1.3 Square-summable functions.
    4.1.4 Fast Convergence of Fourier series.
    4.1.5 Trigonometric series.
    4.1.6 Multiple Fourier series.
  4.2 The Fourier Integral.
    4.2.1 Examples.
    4.2.2 Elementary Properties.
    4.2.3 Fast-decreasing test-functions.
    4.2.4 The Harmonic Oscillator basis.
    4.2.5 Fourier transform in L²(R^N).

5. Distributions.
  5.1 Tempered distributions.
    5.1.1 Regular Distributions.
    5.1.2 Singular Distributions.
    5.1.3 The space of Tempered Distributions.
  5.2 Differential Calculus.
    5.2.1 Operations with Distributions.
    5.2.2 The Derivatives of a Distribution.
  5.3 Other Operations.
    5.3.1 Change of variables.
    5.3.2 Product.
    5.3.3 Tensor Product.
  5.4 Fourier transform.
    5.4.1 Explicit Calculation of some Transforms.
    5.4.2 Convolution of distributions.
  5.5 Fundamental Solutions.
    5.5.1 The Poisson equation.
    5.5.2 Fundamental solutions, and the Cauchy problem.
    5.5.3 The Diffusion, or Heat, equation.
    5.5.4 The Schrödinger equation for a free particle.
    5.5.5 The Wave equation.


1. FUNCTIONAL SPACES.

Paragraphs and proofs marked by * are optional. Some basic definitions and constructions of the theory of Vector Spaces are assumed to be known.

1.1 Generalities.

Let E be an arbitrary non-empty set, and let F(E,R) and F(E,C) denote the sets of all functions defined on E with values in R and in C, respectively. In F(E,R) and in F(E,C) one can introduce the structure of a vector space (real and complex, respectively) by defining the operations of sum and of multiplication by a scalar (a real or complex number, respectively) in the following natural way. For x ∈ F(E,K) and y ∈ F(E,K), where K = R or K = C, the sum x + y of x and y is defined as the function on E that at any point t ∈ E takes the value:

(x + y)(t) := x(t) + y(t) . (1.1)

The product αx of a function x and a number α is defined as the function on E that at every point t ∈ E takes the value:

(αx)(t) := αx(t) .

It is immediately verified that, with these operations, F(E,K) is a complex vector space whenever K = C, and a real vector space whenever K = R.

If E is a finite set of n elements then, with no loss of generality, its elements, or "points", can be identified with the first n natural numbers 1, 2, . . . , n. Every function x : E → K is fully specified by the values it takes at these points, that is, by the n numbers x(1), x(2), . . . , x(n). One may denote them by x_1, . . . , x_n, and then it is immediately seen that, whenever E consists of a finite number n of elements, the vector spaces F(E,C) and F(E,R) have finite dimension n and can be identified with C^n and R^n, respectively.

If E is not finite, the vector spaces just described have infinite dimension and are termed functional spaces. Even in such cases a function in a functional space can be thought of as a vector, the components of which are given by the values that the function takes at the various points in E; the points of E then play the role of labels for the components of the vector. Properly speaking, a functional space is typically not the whole vector space F(E,K); rather, it is a vector subspace of F(E,K), which does not contain all possible functions that may be defined on E, but only those which enjoy some distinctive property, such as continuity. Explicit examples will be studied in some detail in the following.
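The identification of F(E,K) with K^n for a finite set E can be made concrete with a minimal sketch. It assumes that functions on E are stored as Python dictionaries; the helper names add, scale and as_tuple are illustrative and do not come from the notes.

```python
# Functions on a finite set E = {1, ..., n}, stored as dicts {point: value}.
# Helper names (add, scale, as_tuple) are hypothetical, chosen for illustration.

def add(x, y):
    """Pointwise sum: (x + y)(t) := x(t) + y(t), as in (1.1)."""
    return {t: x[t] + y[t] for t in x}

def scale(alpha, x):
    """Pointwise product by a scalar: (alpha x)(t) := alpha * x(t)."""
    return {t: alpha * x[t] for t in x}

def as_tuple(x):
    """Identify the function x with the vector (x(1), ..., x(n))."""
    return tuple(x[t] for t in sorted(x))

E = [1, 2, 3]
x = {t: t * t for t in E}     # x(t) = t^2
y = {t: 1j * t for t in E}    # y(t) = i t, a complex-valued function

z = add(x, scale(2, y))       # the function x + 2y
print(as_tuple(z))            # its components, labelled by the points of E
```

Here the points of E act as labels for the components, which is the picture that carries over to infinite E.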


1.2 Normed Spaces.

1.2.1 Topological Vector Spaces.

In rough terms, and for the restricted purposes of this course, a Topological Vector Space (TVS) may be defined as a vector space in which a notion of convergence has been introduced for sequences of vectors, in such a way that the vector operations are continuous. This means the following. If a sequence of scalars (real or complex numbers) {α_j} converges to a limit α_∞, and {x_k}, {y_l} are sequences of vectors that converge to respective limits x_∞, y_∞ according to the given definition of convergence, then it must be true that

\[ \lim_{j,k\to\infty} \alpha_j x_k = \alpha_\infty x_\infty \tag{1.2} \]

\[ \lim_{k,l\to\infty} (x_k + y_l) = x_\infty + y_\infty \tag{1.3} \]

according to the same definition of convergence. Obvious examples of TVS are R in the real case and C in the complex case.

A quite general (though not the most general) method of defining convergence for sequences of elements of a given set X consists in defining a distance in X, that is, in endowing it with the structure of a metric space. Given such a structure, if d(x, y) denotes the distance of the elements x and y, then a sequence {x_k} converges to the limit x_∞ if lim_{k→∞} d(x_k, x_∞) = 0. However, if the set X already has the structure of a vector space, it is by no means guaranteed that an arbitrary metric introduced in X will make (1.2) and (1.3) true. In other words, an arbitrary metric introduced in a vector space may not turn it into a TVS. A way of introducing a metric in a vector space, so that vector operations are automatically continuous, is to derive it from a norm, as described in the next section.

Problem 1: The discrete metric in an arbitrary nonempty set X is defined by d(x, y) = 1 for x ≠ y and d(x, y) = 0 for x = y. Show that a sequence of elements is convergent in this metric if and only if it is eventually constant. Then show that if X is a real or complex vector space with this metric, then the external product (multiplication of a vector by a scalar) is not continuous (that is, (1.2) is false in general).
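The second claim of Problem 1 can be illustrated numerically (this is not a proof). The sketch below assumes X = R with the discrete metric and the scalar sequence α_j = 1/j; the helper name d_discrete is hypothetical.

```python
def d_discrete(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

x = 1.0                                    # a fixed non-null vector in X = R
alphas = [1.0 / j for j in range(1, 50)]   # scalars alpha_j -> alpha_inf = 0

# Distance of the scalars from their limit, in the usual metric of R.
dist_scalars = [abs(a - 0.0) for a in alphas]

# Distance of alpha_j * x from alpha_inf * x = 0, in the discrete metric.
dist_vectors = [d_discrete(a * x, 0.0 * x) for a in alphas]

print(dist_scalars[-1])    # small: the scalars do converge
print(set(dist_vectors))   # the vectors alpha_j * x never get close to 0
```

Since α_j x is never equal to the null vector, its discrete distance from α_∞ x stays equal to 1, so (1.2) fails for this choice of sequences.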

1.2.2 Normed Spaces.

Definition 1: A normed vector space is a vector space in which a norm is defined. A norm in a vector space X is a function ∥.∥ : X → R+ with the following properties:
1) positivity: ∀x ∈ X, ∥x∥ ≥ 0; moreover, ∥x∥ = 0 if and only if x = 0 (the null vector).
2) homogeneity: ∀x ∈ X and for all scalars α, ∥αx∥ = |α| ∥x∥.
3) subadditivity: ∀x ∈ X, ∀y ∈ X, ∥x + y∥ ≤ ∥x∥ + ∥y∥.

The absolute value of a real number and the modulus of a complex number are norms in the vector spaces R and C, respectively. In general, one may define different norms in the same vector space.

Problem 2: Show that ∥x∥ := |x_1| + |x_2| is a norm in R².
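Before attempting the proof, the properties of Definition 1 can be spot-checked on random samples. The sketch below is only a sanity check under floating-point tolerances, not a substitute for the argument asked for in Problem 2; the names norm1 and rand_vec are hypothetical.

```python
import random

def norm1(x):
    """Candidate norm of Problem 2 on R^2: |x_1| + |x_2|."""
    return abs(x[0]) + abs(x[1])

def rand_vec():
    return (random.uniform(-5, 5), random.uniform(-5, 5))

random.seed(0)
ok_homogeneity = True
ok_subadditivity = True
for _ in range(1000):
    x, y = rand_vec(), rand_vec()
    a = random.uniform(-3, 3)
    ax = (a * x[0], a * x[1])
    s = (x[0] + y[0], x[1] + y[1])
    # homogeneity: ||a x|| = |a| ||x||  (up to rounding)
    ok_homogeneity &= abs(norm1(ax) - abs(a) * norm1(x)) < 1e-12
    # subadditivity: ||x + y|| <= ||x|| + ||y||  (up to rounding)
    ok_subadditivity &= norm1(s) <= norm1(x) + norm1(y) + 1e-12

print(ok_homogeneity, ok_subadditivity)
```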


Theorem 1: Let X be a normed vector space, and for arbitrary x, y ∈ X define d(x, y) := ∥x − y∥. Then:
1) the function d : X × X → R+ is a distance in X, which is thereby a metric space;
2) using this metric, the vector operations in X are continuous, so X is a TVS.

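Proof: (1) Problem 4. (2) If x_n → x_∞ and y_n → y_∞ in this metric, then

\[ \lim_{n\to\infty} \|x_n - x_\infty\| = 0 , \qquad \lim_{n\to\infty} \|y_n - y_\infty\| = 0 , \]

and so

\[ \|(x_n + y_m) - (x_\infty + y_\infty)\| = \|(x_n - x_\infty) + (y_m - y_\infty)\| \le \|x_n - x_\infty\| + \|y_m - y_\infty\| \to 0 \quad \text{as } n, m \to \infty ; \]

therefore lim (x_n + y_m) = x_∞ + y_∞ = lim x_n + lim y_m, so the sum of vectors is a continuous operation. If {α_m} is a sequence of scalars that converges to a limit α_∞, then

\begin{align}
\|\alpha_m x_n - \alpha_\infty x_\infty\| &= \|\alpha_m (x_n - x_\infty) + (\alpha_m - \alpha_\infty) x_\infty\| \tag{1.4} \\
&\le \|\alpha_m (x_n - x_\infty)\| + \|(\alpha_m - \alpha_\infty) x_\infty\| \tag{1.5} \\
&= |\alpha_m| \, \|x_n - x_\infty\| + |\alpha_m - \alpha_\infty| \, \|x_\infty\| . \tag{1.6}
\end{align}

The last expression tends to zero in the limit n, m → ∞, because x_n → x_∞ and α_m → α_∞ by assumption, and the convergent sequence {|α_m|} is bounded. This proves continuity of the external product. □

Problem 3: Prove that the following inequality is true for all x, y in X: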

∥x− y∥ ≥ | ∥x∥ − ∥y∥ |

Problem 4: Verify that d(x, y) := ∥x− y∥ enjoys the distinctive properties of a distance function.

Theorem 2: Every norm in X is a continuous function on X with respect to the metric it defines.

Proof: If x_n → x_∞ then | ∥x_n∥ − ∥x_∞∥ | ≤ ∥x_n − x_∞∥ → 0, thanks to the inequality in Problem 3. □

1.2.3 Pre-Hilbert spaces.

Definition 2: A scalar, or inner, product in a vector space X is a function h : X × X → K, where K is the set of scalars (i.e. K = R if X is a real VS, and K = C if X is a complex VS), endowed with the following distinctive properties:

1. ∀x, y, z ∈ X, ∀α, β ∈ K,

h(x, αy + βz) = αh(x, y) + βh(x, z)

that is, the inner product is linear in the 2nd factor,


2. ∀x, y ∈ X, h(x, y) = h(y, x)∗

3. ∀x ∈ X, h(x, x) ≥ 0; moreover, h(x, x) = 0 if, and only if, x = 0.

Notes:
1. The star ∗ in property 2 denotes the complex conjugate, and is to be ignored whenever X is a real VS.
2. From (2) and (1) it follows that ∀x, y, z ∈ X, ∀α, β ∈ K,

h(αy + βz, x) = α∗h(y, x) + β∗h(z, x)

that is, in the complex case, the inner product is anti-linear in the 1st factor.
3. h(x, 0) = 0 ∀x ∈ X follows from linearity (1).
4. If X is a complex VS and a function h : X × X → C has properties 1 and 2, then, ∀x, y ∈ X:

\[ h(x,y) = \tfrac{1}{4} h(x+y, x+y) - \tfrac{1}{4} h(x-y, x-y) - \tfrac{i}{4} h(x+iy, x+iy) + \tfrac{i}{4} h(x-iy, x-iy) , \tag{1.7} \]

as follows from straightforward calculation.

Definition 3: A vector space endowed with an inner product is called a pre-Hilbert space.

Theorem 3: For two vectors x, y in a pre-Hilbert space the Cauchy-Schwarz inequality holds:

\[ |h(x,y)| \le \sqrt{h(x,x)} \, \sqrt{h(y,y)} . \]

Proof: Consider the case of a complex space. We assume that neither vector is null, for otherwise the inequality is trivially true. For arbitrary α ∈ C, h(x + αy, x + αy) ≥ 0 thanks to property 3, so, using the other properties:

\begin{align}
0 &\le h(x,x) + |\alpha|^2 h(y,y) + \alpha h(x,y) + \alpha^* h(x,y)^* \tag{1.8} \\
&= h(x,x) + |\alpha|^2 h(y,y) + 2 \Re\big(\alpha h(x,y)\big) . \tag{1.9}
\end{align}

Now choose α = λ h(y, x)/h(y, y), with λ an arbitrary real number:

\[ 0 \le h(x,x) + 2\lambda \frac{|h(x,y)|^2}{h(y,y)} + \lambda^2 \frac{|h(x,y)|^2}{h(y,y)} . \]

The rhs is never negative, and is a trinomial of degree 2 in the real variable λ (if h(x, y) = 0 there is nothing to prove). By elementary algebra, its discriminant cannot be positive:

\[ 4 \frac{|h(x,y)|^4}{h(y,y)^2} - 4 \, h(x,x) \frac{|h(x,y)|^2}{h(y,y)} \le 0 . \]

Dividing by 4|h(x, y)|²/h(y, y)² > 0, this condition yields |h(x, y)|² ≤ h(x, x) h(y, y), which is the Cauchy-Schwarz inequality. □

In the following, the inner product of vectors x, y in a pre-Hilbert space will be denoted ⟨x|y⟩. This notation is not universally adopted; in the mathematical literature, the notation (x, y) is prevalent.

Examples: The abstract notion of an inner product is modeled after the scalar product of elementary vector calculus. For vectors x, y in R³, this is defined by h(x, y) = x_1 y_1 + x_2 y_2 + x_3 y_3. The "canonical" scalar product in C^n is defined by:

\[ \langle x | y \rangle := \sum_{j=1}^{n} x_j^* \, y_j . \tag{1.10} \]
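The Cauchy-Schwarz inequality can be spot-checked for the canonical product (1.10) on random vectors of C^4. This is a numerical sketch under floating-point tolerances; the function names inner, norm and random_vector are hypothetical. Note that the first factor is the one that gets conjugated, in agreement with the convention of Definition 2.

```python
import random

def inner(x, y):
    """Canonical inner product (1.10) on C^n: conjugate the first factor."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    """Square root of <x|x>, which is real and non-negative."""
    return inner(x, x).real ** 0.5

def random_vector(n):
    return [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

random.seed(1)
pairs = [(random_vector(4), random_vector(4)) for _ in range(1000)]
cs_holds = all(abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12 for x, y in pairs)

# Equality case: y proportional to x, so the two sides should coincide.
x = random_vector(4)
y = [(2 - 3j) * a for a in x]
equality_gap = norm(x) * norm(y) - abs(inner(x, y))

print(cs_holds, equality_gap)
```

The equality case corresponds to a vanishing discriminant in the proof of Theorem 3.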

Theorem 4: In a pre-Hilbert space X define, for arbitrary x ∈ X, ∥x∥ := √⟨x|x⟩ (arithmetic square root). The function ∥.∥ : X → R+ thus defined on X is a norm in X. It is termed the canonical norm associated with the inner product in X.

Proof: We have to prove properties (1), (2), (3) in Def. 1. (1) and (2) are left as an easy exercise. As to (3),

\begin{align}
\|x + y\|^2 = \langle x + y | x + y \rangle &= \langle x|x \rangle + \langle y|y \rangle + 2 \Re \langle x|y \rangle \tag{1.11} \\
&\le \langle x|x \rangle + \langle y|y \rangle + 2 |\langle x|y \rangle| \tag{1.12} \\
&\le \langle x|x \rangle + \langle y|y \rangle + 2 \sqrt{\langle x|x \rangle} \sqrt{\langle y|y \rangle} \tag{1.13} \\
&= \|x\|^2 + \|y\|^2 + 2 \|x\| \|y\| = (\|x\| + \|y\|)^2 , \tag{1.14}
\end{align}

where the Cauchy-Schwarz inequality has been used in passing from (1.12) to (1.13). □

Every pre-Hilbert space is therefore automatically endowed with a metric, which is induced by the inner product.

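Theorem 5: (Continuity of the Inner product): Let X be a pre-Hilbert space, with the metric that is induced by the inner product. Let sequences {x_n} and {y_n} of vectors in X converge in this metric to limits x_∞ and y_∞, respectively. Then

\[ \lim_{n,m\to+\infty} \langle x_n | y_m \rangle = \langle x_\infty | y_\infty \rangle . \]

Proof:

\begin{align}
|\langle x_n|y_m \rangle - \langle x_\infty|y_\infty \rangle| &\le |\langle x_n|y_m \rangle - \langle x_\infty|y_m \rangle| + |\langle x_\infty|y_m \rangle - \langle x_\infty|y_\infty \rangle| \tag{1.15} \\
&= |\langle x_n - x_\infty | y_m \rangle| + |\langle x_\infty | y_m - y_\infty \rangle| \tag{1.16} \\
&\le \|x_n - x_\infty\| \, \|y_m\| + \|x_\infty\| \, \|y_m - y_\infty\| , \tag{1.17}
\end{align}

where the last step uses the Cauchy-Schwarz inequality. The claim now follows from x_n → x_∞ and y_m → y_∞ (which also implies that ∥y_m∥ → ∥y_∞∥, by Theorem 2). □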

Proposition 1: (Parallelogram Identity) The following identity is true for two arbitrary vectors x, y in a pre-Hilbert space:

∥x+ y∥2 + ∥x− y∥2 = 2∥x∥2 + 2∥y∥2 . (1.18)

Problem 5: Prove Proposition 1, and explain why (1.18) is called the Parallelogram identity.

Theorem 6: A normed VS is a pre-Hilbert space, that is, its norm is induced by an inner product, if, and only if, the Parallelogram Identity is valid.

The "only if" part of the statement is just Proposition 1; the "if" part will not be proven here.
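Theorem 6 gives a practical test for whether a norm can come from an inner product. The sketch below evaluates the defect of identity (1.18) for three norms on R²; the names defect, norm2, norm1 and norm_max are hypothetical, and the vectors are chosen only for illustration.

```python
def defect(norm, x, y):
    """Left side minus right side of the Parallelogram Identity (1.18)."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

def norm2(v):
    """Euclidean norm, induced by the canonical inner product."""
    return sum(abs(a) ** 2 for a in v) ** 0.5

def norm1(v):
    """Sum of absolute values (the norm of Problem 2)."""
    return sum(abs(a) for a in v)

def norm_max(v):
    """Largest absolute value of a component."""
    return max(abs(a) for a in v)

x, y = (1.0, 0.0), (0.0, 1.0)
for name, nrm in (("norm2", norm2), ("norm1", norm1), ("norm_max", norm_max)):
    print(name, defect(nrm, x, y))
```

A non-zero defect for a single pair of vectors is enough to conclude that the norm in question is not induced by any inner product.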

Proposition 2: (Polarization Identity) For all x, y in a real pre-Hilbert space:

\[ \langle x|y \rangle = \tfrac{1}{4} \|x + y\|^2 - \tfrac{1}{4} \|x - y\|^2 , \]

and for all x, y in a complex pre-Hilbert space:

\[ \langle x|y \rangle = \tfrac{1}{4} \|x + y\|^2 - \tfrac{1}{4} \|x - y\|^2 - \tfrac{i}{4} \|x + iy\|^2 + \tfrac{i}{4} \|x - iy\|^2 . \]

Proof: By direct calculation. In the complex case, this is property (1.7) written for the case when h is a scalar product. □
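The complex Polarization Identity can be checked numerically in C^3 with the canonical inner product, conjugate-linear in the first factor. The vectors below are arbitrary sample data and the helper names (inner, norm_sq, combine) are hypothetical; the sketch only compares the two sides of the identity.

```python
def inner(x, y):
    """Canonical inner product on C^n, conjugate-linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm_sq(v):
    """Squared canonical norm: sum of squared moduli of the components."""
    return sum(abs(a) ** 2 for a in v)

def combine(x, c, y):
    """Return the vector x + c*y."""
    return [a + c * b for a, b in zip(x, y)]

x = [1 + 2j, -1j, 3 + 0j]
y = [2 + 0j, 1 - 1j, -1 + 0.5j]

lhs = inner(x, y)
rhs = (
    norm_sq(combine(x, 1, y)) / 4
    - norm_sq(combine(x, -1, y)) / 4
    - 1j * norm_sq(combine(x, 1j, y)) / 4
    + 1j * norm_sq(combine(x, -1j, y)) / 4
)
print(lhs, rhs)
```

The signs of the two imaginary terms depend on which factor is anti-linear; with the opposite convention they would be exchanged.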

1.2.4 ℓ²-spaces.

The pre-Hilbert space C^n has finite dimension n. A vector in C^n is defined by n complex components. The norm, or length, of such a vector is obtained by summing the squared moduli of all its components, and by taking the square root of the result. A direct infinite-dimensional generalization is obtained by allowing vectors to have infinitely many components, but still requiring their lengths to remain finite. Any such vector is thus a sequence of complex numbers, with the property that the sum of the squared moduli of all such numbers is finite.

To formalize this idea, let us consider the vector space F(E;K) which was introduced in Section 1.1, and take E = N. Let ℓ²(N) denote the (strict) subset of F(N,K) that consists of all those functions x : N → K that satisfy the condition:

\[ \sum_{n=1}^{+\infty} |x(n)|^2 < +\infty . \tag{1.19} \]

Theorem 7: ℓ²(N) is a vector subspace of F(N,K).

Proof: We have to prove that if x, y ∈ ℓ²(N) and α, β ∈ K then αx + βy ∈ ℓ²(N), that is, \( \sum_{n=1}^{\infty} |(\alpha x + \beta y)(n)|^2 < +\infty \). This is easily proven by using two elementary inequalities:

Lemma 1: For arbitrary a ∈ C and b ∈ C: (i) 2|ab| ≤ |a|² + |b|², (ii) |a + b|² ≤ 2|a|² + 2|b|².

Proof of the Lemma: (i) immediately follows from 0 ≤ (|a| − |b|)² = |a|² + |b|² − 2|ab|. As to (ii): |a + b|² = |a|² + |b|² + 2ℜ(a∗b) ≤ |a|² + |b|² + 2|ab|, whence (ii) follows thanks to (i).

The proof of the Theorem is then as follows:

\begin{align}
\sum_{n=1}^{\infty} |(\alpha x + \beta y)(n)|^2 &= \sum_{n=1}^{\infty} |\alpha x(n) + \beta y(n)|^2 \tag{1.20} \\
&\le 2 |\alpha|^2 \sum_{n=1}^{+\infty} |x(n)|^2 + 2 |\beta|^2 \sum_{n=1}^{+\infty} |y(n)|^2 . \tag{1.21}
\end{align}

The last expression is finite thanks to (1.19), because x, y ∈ ℓ²(N) by assumption. □


Theorem 8: ℓ²(N) is a pre-Hilbert space with the inner product defined as follows:

\[ \langle x|y \rangle := \sum_{n=1}^{+\infty} x(n)^* \, y(n) . \tag{1.22} \]

Proof: Since ℓ²(N) is a VS thanks to Thm. 7, one has only to prove that (1.22) satisfies the properties that are required of a scalar product. First we note that it is well defined for all x, y ∈ ℓ²(N), because:

\[ \left| \sum_{n=1}^{+\infty} x(n)^* y(n) \right| \le \sum_{n=1}^{+\infty} |x(n)^* y(n)| \le \frac{1}{2} \sum_{n=1}^{\infty} |x(n)|^2 + \frac{1}{2} \sum_{n=1}^{\infty} |y(n)|^2 < +\infty \]

thanks to inequality (i) in Lemma 1. Checking that it has the inner product properties is left as an elementary exercise. □

The canonical norm in ℓ²(N) is therefore defined as follows:

\[ \|x\|^2 = \langle x|x \rangle = \sum_{n=1}^{+\infty} |x(n)|^2 . \tag{1.23} \]

The space ℓ²(N) is also known as the space of square-summable functions on N. In a completely similar way one introduces the space ℓ²(Z) of the square-summable functions on Z; this is a vector subspace of F(Z;K), and vectors in it are two-sided sequences. All definitions and proofs shown for ℓ²(N) are carried through to ℓ²(Z) by just replacing one-sided sums \( \sum_{n=1}^{+\infty} \) with two-sided sums \( \sum_{n=-\infty}^{+\infty} \). The space ℓ²(Z) is a special case of a space ℓ²(Z^n), that is, the space of square-summable functions on the lattice Z^n.
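Condition (1.19) can be explored numerically by looking at partial sums. The sketch below compares x(n) = 1/n, whose squared sum is known to converge to π²/6, with y(n) = 1/√n, whose squared sum is the divergent harmonic series. Partial sums only suggest the behaviour and do not prove it; the name partial_sq_sum is hypothetical.

```python
import math

def partial_sq_sum(x, N):
    """Partial sum of |x(n)|^2 for n = 1, ..., N, cf. condition (1.19)."""
    return sum(abs(x(n)) ** 2 for n in range(1, N + 1))

def x(n):
    return 1.0 / n               # square-summable: the sum tends to pi^2 / 6

def y(n):
    return 1.0 / math.sqrt(n)    # not square-summable: harmonic series

for N in (10, 1000, 100000):
    print(N, partial_sq_sum(x, N), partial_sq_sum(y, N))
```

The first column of sums stays below π²/6 ≈ 1.645, while the second keeps growing, roughly like log N.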

1.2.5 Spaces of Continuous Functions.

Let us consider the vector space F(K; 𝕂) (cp. Sect. 1.1), where 𝕂 = R or 𝕂 = C is the field of scalars (written 𝕂 in this section, to distinguish it from the set K), in the case when K is a compact metric space. Let x : K → 𝕂 and y : K → 𝕂 be continuous functions, and let α, β ∈ 𝕂 be arbitrary scalars. The function αx + βy : K → 𝕂, which is defined by (1.1), is a continuous function, thanks to elementary properties of continuous functions. Therefore, the set C(K) of all real- or complex-valued continuous functions on K is a vector subspace of F(K; 𝕂); hence, it is a (real or complex) VS itself. Using the fundamental property that continuous functions on compact metric spaces have maxima, one can define, for all x ∈ C(K):

∥x∥ := max{|x(t)| | t ∈ K} . (1.24)

It is straightforward to check that this definition satisfies the properties that are required of a norm (to this end, note that the null vector in the vector space C(K) is the function that vanishes at all points in K). This particular norm is usually denoted by ∥.∥_∞. Thanks to this definition, C(K) is a normed VS. One immediately verifies that the corresponding notion of convergence is the same as uniform convergence for sequences of functions on K; that is, a sequence x_n ∈ C(K) converges to the limit x_∞ if, and only if, lim_{n→∞} x_n(t) = x_∞(t) for all t ∈ K, uniformly in t ∈ K.

A special case is when K = [a, b] (a bounded closed interval in the real line) with a < b.

Problem 6: Show that C([a, b]) is not a pre-Hilbert space (it suffices to find two continuous functions f, g on [a, b] that violate the parallelogram identity if the norm (1.24) is used).

Since a vector x ∈ C([a, b]) is a continuous real- or complex-valued function on [a, b], and its components are the values x(t) of the function at the points t ∈ [a, b], analogy to the finite-dimensional case suggests that a scalar product may be introduced in C([a, b]) by using the following definition:

\[ \langle x|y \rangle := \int_a^b dt \, x(t)^* \, y(t) , \tag{1.25} \]

where the integral is meant in the sense of Riemann.

Proposition 3: (1.25) defines an inner product in C([a, b]).

Proof: (1.25) is well defined for two continuous functions x(t) and y(t), because all continuous functions, and hence also the function x(t)∗y(t), are integrable on [a, b] in Riemann's sense. Properties (1) and (2) of the inner product are then straightforward. Property (3) follows from the Lemma below. □

Lemma 2: Let f : [a, b] → R be a continuous, non-negative function on [a, b] (that is, f(t) ≥ 0, ∀t ∈ [a, b]), and let \( \int_a^b dt \, f(t) = 0 \). Then f(t) = 0, ∀t ∈ [a, b].

Proof: By contradiction: let f(t_0) > 0 at a point t_0 ∈ [a, b]. Denote ξ := f(t_0). Thanks to a well-known property of continuous functions, there is an interval I of length L > 0, such that t_0 ∈ I ⊂ [a, b] and moreover f(t) > ξ/2, ∀t ∈ I. But then:

\[ \int_a^b dt \, f(t) \ge \int_I dt \, f(t) \ge L \, \frac{\xi}{2} > 0 , \]

contrary to assumptions. □

Let C_2([a, b]) denote the pre-Hilbert space that is obtained by introducing in the VS C([a, b]) the inner product (1.25). In C_2([a, b]) the norm is defined by:

\[ \|x\|_2^2 := \int_a^b dt \, |x(t)|^2 , \]

so a sequence x_n converges to a limit x_∞ if, and only if,

\[ \lim_{n\to+\infty} \int_a^b dt \, |x_n(t) - x_\infty(t)|^2 = 0 . \]

Note that C([a, b]) and C_2([a, b]), in spite of being the same as vector spaces, are sharply different as normed vector spaces.

Problem 7: For all n ∈ N let x_n : [0, 1] → R be the function defined by x_n(t) = 1 − |2nt − 1| for 0 ≤ t ≤ 1/n, and x_n(t) = 0 for 1/n ≤ t ≤ 1 (draw the graph!). Each such function x_n is continuous. Show that the sequence x_n converges to 0 (the null function in [0, 1]) in C_2([0, 1]), but not in C([0, 1]).
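The two norms in Problem 7 can be compared numerically. The sketch below approximates the maximum norm (1.24) by sampling on a grid and the squared C_2 norm by a midpoint rule; both are crude approximations with an assumed grid size m, and the function names are hypothetical. A direct calculation suggests that the squared C_2 norm equals 1/(3n), which the numbers can be compared with.

```python
def x_n(n, t):
    """Tent function of Problem 7: height 1, supported on [0, 1/n]."""
    return 1.0 - abs(2 * n * t - 1) if 0 <= t <= 1.0 / n else 0.0

def sup_norm(f, a=0.0, b=1.0, m=20000):
    """Approximate max |f| on [a, b] by sampling m + 1 grid points."""
    return max(abs(f(a + (b - a) * k / m)) for k in range(m + 1))

def l2_norm_sq(f, a=0.0, b=1.0, m=20000):
    """Approximate the integral of |f|^2 on [a, b] by the midpoint rule."""
    h = (b - a) / m
    return h * sum(abs(f(a + (k + 0.5) * h)) ** 2 for k in range(m))

results = {}
for n in (1, 10, 100):
    f = lambda t, n=n: x_n(n, t)
    results[n] = (sup_norm(f), l2_norm_sq(f))
    print(n, results[n])
```

The maximum norm stays at 1 for every n, while the squared C_2 norm shrinks as n grows, which is the behaviour the problem asks to establish.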


1.3 Banach spaces, and Hilbert spaces.

This section is based on the concept of completeness for metric spaces.

Definition 4: If a normed (real or complex) VS is complete in the metric that is induced by the norm, then it is called a (real or complex) Banach space. If a (real or complex) pre-Hilbert space is complete in the metric that is induced by the scalar product, then it is called a (real or complex) Hilbert space.

A Hilbert space is thus a Banach space, where the norm is induced by a scalar product.

Problem 8: Show that if a sequence x(n) of vectors in a Banach space satisfies:

\[ \sum_{n=1}^{\infty} \|x(n)\| < +\infty \]

then the series of vectors \( \sum_n x(n) \) is convergent (using subadditivity of norms, show that the partial sums of the series make a Cauchy sequence...).

Completeness of the spaces C and R is assumed as a known fact. The spaces C^n and R^n with n > 1 are also complete (with their canonical inner products), and this will be proved later. The spaces C^n and R^n provide the simplest examples of Hilbert spaces (complex and real, respectively).

Theorem 9: The space C(K) (see Sect. 1.2.5) is a Banach space.

*Proof: Let x_n ∈ C(K) be a Cauchy sequence. This means that, for every ϵ > 0, however small, one can find an integer N_ϵ ∈ N so that:

∥x_n − x_m∥_∞ < ϵ , ∀n, m > N_ϵ.

From the definition of the norm in C(K) it follows that, at all points t ∈ K:

|x_n(t) − x_m(t)| ≤ ∥x_n − x_m∥_∞ < ϵ (1.26)

whenever n, m > N_ϵ. Therefore, for any fixed t the sequence of the (real or complex) numbers x_n(t) is a Cauchy sequence, so it has a limit x_∞(t). Then, taking the m → +∞ limit in (1.26), we find that:

|x_n(t) − x_∞(t)| ≤ ϵ , ∀t ∈ K , ∀n > N_ϵ .

This shows that the functions x_n(t) uniformly converge to the function x_∞(t). Therefore, thanks to a well-known property of uniform convergence, the latter function is continuous, so it is the limit of the sequence in C(K). □

Theorem 10: The spaces ℓ² defined in Sect. 1.2.4 are Hilbert spaces.

Proof: We consider the case of ℓ²(N); the proof for the other cases differs only in notation. For this reason the VS will be simply denoted by ℓ². Let x_n ∈ ℓ² be a Cauchy sequence. Given an arbitrary ϵ > 0, we can find an integer N_ϵ ∈ N so that ∥x_n − x_m∥² < ϵ whenever n, m > N_ϵ, the norm being defined as in (1.23). It follows that, for every integer M > 0:

\[ \sum_{k=1}^{M} |x_n(k) - x_m(k)|^2 \le \|x_n - x_m\|^2 < \epsilon \tag{1.27} \]

whenever n, m > N_ϵ. In particular, for any fixed k, the sequence (indexed by n) of the numbers x_n(k) is a Cauchy sequence in K, so it has a limit x_∞(k). Therefore, taking the m → +∞ limit in (1.27), we find that:

\[ \sum_{k=1}^{M} |x_n(k) - x_\infty(k)|^2 \le \epsilon \tag{1.28} \]

whenever n > N_ϵ. As this is true for all M, it is also true that

\[ \sum_{k=1}^{\infty} |x_n(k) - x_\infty(k)|^2 \le \epsilon \]

whenever n > N_ϵ. It follows, using inequality (ii) in Lemma 1, that

\begin{align}
\sum_{k=1}^{\infty} |x_\infty(k)|^2 &= \sum_{k=1}^{\infty} |x_\infty(k) - x_n(k) + x_n(k)|^2 \notag \\
&\le 2 \sum_{k=1}^{\infty} |x_\infty(k) - x_n(k)|^2 + 2 \sum_{k=1}^{+\infty} |x_n(k)|^2 < +\infty , \tag{1.29}
\end{align}

so the sequence x_∞ is a vector in ℓ². Finally, from (1.28) we obtain ∥x_n − x_∞∥² ≤ ϵ for n > N_ϵ, so x_∞ is the limit in ℓ² of the sequence x_n. □

Remark: The above proof rests on the single fact that K is a complete metric space. It works unaltered if the series which define the ℓ² norms are replaced by finite sums up to a fixed n. It is thus proved that K^n is a complete metric space for all n > 1.

Proposition 4: The pre-Hilbert space C_2([a, b]) with b > a (see Sect. 1.2.5 for the definition) is not complete.

Proof: : with no limitation of generality we choose a = 0 e b = 2. For arbitrary integer ndefine xn(t) = tn for 0 ≤ t ≤ 1 and xn(t) = 1 for 1 ≤ t ≤ 2. Then xn ∈ C2([0, 2]) , ∀n andthe xn make a Cauchy sequence in C2([0, 2]) (Exercise 9). It is on the other hand obvious that∀t ∈ [0, 2], limn→∞ xn(t) = w(t), where w(t) = 0 for 0 ≤ t < 1 and w(t) = 1 for 1 ≤ t ≤ 2.Direct calculation shows that:

limn→+∞

∫ 2

0

dt |xn(t)− w(t)|2 = 0 (1.30)

Page 15: Universit a degli Studi dell’Insubria Dipartimento di ... notes/met_14e… · 1. Functional Spaces. 6 1.2 Normed Spaces. 1.2.1 Topological Vector Spaces. In rough terms, and for

1. Functional Spaces. 15

By contradiction, let us assume that the sequence {x_n} has a limit x_∞ in C2([0, 2]); i.e., there is a continuous function x_∞(t) on [0, 2] such that:

\[
\lim_{n\to+\infty} \int_0^2 dt\, |x_n(t) - x_\infty(t)|^2 = 0 . \tag{1.31}
\]

From this, using the 2nd inequality in Lemma 1, it follows that for an arbitrary integer n:

\[
\int_0^2 dt\, |x_\infty(t) - w(t)|^2 \le 2\int_0^2 dt\, |x_\infty(t) - x_n(t)|^2 + 2\int_0^2 dt\, |x_n(t) - w(t)|^2
\]

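Thanks to (1.30) and (1.31), the rhs tends to 0 as n → +∞, whereas the lhs does not depend on n. This implies that the integral on the lhs is 0, and so

\[
\int_0^1 dt\, |x_\infty(t) - w(t)|^2 = 0 , \qquad \int_1^2 dt\, |x_\infty(t) - w(t)|^2 = 0 .
\]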

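In the second integral the integrand is a continuous non-negative function in the whole integration interval; in the first one it is such except at the endpoint t = 1, which does not affect the value of the integral. So Lemma 1.2.5 entails x_∞(t) = w(t) for all t ≠ 1 in [0, 2]. As w(t) jumps from 0 to 1 at t = 1, no continuous function can coincide with it on both sides of that point; this is contrary to the assumption that x_∞ is continuous. □

Problem 9: (1) Show that the functions x_n(t) which were defined in the proof of Prop. 4 satisfy:

\[
\lim_{n,m\to\infty} \int_0^2 dt\, |x_n(t) - x_m(t)|^2 = 0
\]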

(2) Show that:

\[
\lim_{n\to\infty} \int_0^2 dt\, |x_n(t) - w(t)|^2 = 0
\]

where w(t) is the function that is defined in the proof of Prop. 4.
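As a numerical companion to Problem 9 (not a substitute for the proof), the two integrals can be evaluated in closed form, because on [1, 2] all the functions involved are equal to 1 and on [0, 1] only powers of t have to be integrated. The function names below are illustrative and are not part of the notes; exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def dist2_to_w(n):
    """Exact integral over [0, 2] of |x_n(t) - w(t)|^2.

    Only [0, 1) contributes, where the integrand is t^(2n);
    its integral is 1 / (2n + 1).
    """
    return Fraction(1, 2 * n + 1)

def dist2(n, m):
    """Exact integral over [0, 2] of |x_n(t) - x_m(t)|^2.

    Only [0, 1] contributes: the integral of (t^n - t^m)^2 is
    1/(2n+1) - 2/(n+m+1) + 1/(2m+1).
    """
    return (Fraction(1, 2 * n + 1)
            - Fraction(2, n + m + 1)
            + Fraction(1, 2 * m + 1))

# Both quantities shrink as the indices grow.
for n in (1, 10, 100, 1000):
    print(n, float(dist2_to_w(n)), float(dist2(n, 2 * n)))
```

Both columns decrease towards 0, in agreement with parts (2) and (1) of the Problem respectively.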

1.3.1 Need of a new concept of Integral.

Proposition 4 shows that the Cauchy sequence x_n(t) is not convergent within the space C2([0, 2]), if convergence is meant in the sense of eq. (1.31). However, looking at eq. (1.30), one may argue that the sequence does have a limit, which coincides with the function w(t); and that the trouble arises because this function is not continuous, so it does not belong to the space C2([0, 2]). In other words, the latter space appears to be too small, as it cannot accommodate the limits of all its Cauchy sequences. If one wants to build a Hilbert space of functions, using the integral (1.25) as scalar product, then one cannot restrict to continuous functions. The more so, because functions like w(t), although discontinuous, are still Riemann-integrable, so the scalar product as defined by (1.25) is still fully meaningful for them.

The theory of metric spaces has an abstract result, that any non-complete metric space can be made complete by suitably "extending" it. This "completion" process is modeled after the process that produces the set R of the reals as a "completion" of the set Q of all rationals. A similar completion allows for constructing Hilbert spaces of functions, where the scalar product is defined by an integral that formally resembles (1.25). However, to this end nothing less is necessary than a thorough revision of the very concept of an integral.


2. RUDIMENTS OF MEASURE THEORY.

2.1 Elementary measures.

Let E be an arbitrary non empty set.

Definition 5: An Algebra of sets in E is a family ℰ of subsets of E, with the following properties:

1. E ∈ ℰ;

2. if A ∈ ℰ then the complement A^c ∈ ℰ;

3. if A ∈ ℰ and B ∈ ℰ then A ∪ B ∈ ℰ.

Property (3) is equivalent to:
3'. If A_1 ∈ ℰ, A_2 ∈ ℰ, . . . , A_n ∈ ℰ, with n an arbitrary integer, then also A_1 ∪ A_2 ∪ . . . ∪ A_n ∈ ℰ.
Indeed, property 3 is the n = 2 case of 3', which is in turn obtained from 3 by means of an elementary induction over n. This definition has some immediate consequences:

Proposition 5: If ℰ is an algebra of sets in E, then:
1. ∅ ∈ ℰ;
2. if A ∈ ℰ and B ∈ ℰ then also A ∩ B ∈ ℰ, and A \ B ∈ ℰ.

Proof: 1 follows from ∅ = E^c and from properties 1 and 2 of ℰ. 2 follows from A ∩ B = (A^c ∪ B^c)^c, from properties 2 and 3, and from A \ B = A ∩ B^c. □

An algebra of sets in E may thus be described as a family of subsets of E that is closed with respect to any finite number of set-theoretic operations.

Immediate examples of algebras in an arbitrary set E are the so-called trivial algebras. These are: the algebra ℰ_0 := {∅, E}, and the "total algebra" T(E), that consists of all the subsets of E.
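For a finite set E, the three properties of Definition 5 can be checked mechanically. The following is only a sketch under that finiteness assumption; the function name `is_algebra` and the sample families are illustrative and do not come from the notes.

```python
from itertools import combinations

def is_algebra(E, family):
    """Check properties 1-3 of Definition 5 for a finite family of subsets of E."""
    E = frozenset(E)
    fam = {frozenset(A) for A in family}
    if E not in fam:                          # property 1: E belongs to the family
        return False
    if any(E - A not in fam for A in fam):    # property 2: closed under complements
        return False
    # property 3: closed under unions of two sets
    return all(A | B in fam for A, B in combinations(fam, 2))

E = {1, 2, 3, 4}
trivial = [set(), E]                      # the trivial algebra {∅, E}
generated = [set(), {1, 2}, {3, 4}, E]    # an algebra with four sets
not_algebra = [set(), {1}, E]             # the complement of {1} is missing

print(is_algebra(E, trivial), is_algebra(E, generated), is_algebra(E, not_algebra))
```

Closure under intersections and differences need not be tested separately, since by Proposition 5 it follows from the three properties that are checked.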

Problem 10: Let E = N and define ℰ as the family of all the subsets A of N such that either A or its complement A^c is finite. Show that ℰ is an algebra.

Definition 6: A real set function in E is a map f : ℱ → R that to every set F in a family ℱ of subsets of E attaches a real number f(F), possibly ±∞. The family of sets ℱ is called the domain of the set function.

Definition 7: An elementary measure in E is a real set function m in E such that :


1. the domain M of m is an algebra of sets in E,

2. ∀A ∈ M, m(A) ≥ 0; moreover, m(∅) = 0;

3. if A ∈ M, B ∈ M, and A, B are disjoint (i.e., A ∩ B = ∅), then m(A ∪ B) = m(A) + m(B).

The sets which belong to M are called m-measurable sets, and the algebra M is called the algebra of m-measurable sets. The value m(A) of the measure m on a measurable set A is simply called the m-measure of A. Property (3) is called additivity of the measure, and is equivalent to:
3'. If A_1, A_2, . . . , A_n are m-measurable sets, and are pairwise disjoint in the sense that A_j ∩ A_k = ∅ whenever j ≠ k, then

\[
m(A_1 \cup A_2 \cup \ldots \cup A_n) = m(A_1) + m(A_2) + \ldots + m(A_n) . \tag{2.1}
\]

Proposition 6: (Monotonicity): if A ∈ M, B ∈ M, and A ⊆ B, then m(A) ≤ m(B).

Proof: A ⊆ B entails B = A ∪ (B \ A). Moreover A ∩ (B \ A) = ∅, so thanks to additivity (property 3) m(B) = m(A) + m(B \ A), and then m(B) ≥ m(A) because m(B \ A) ≥ 0 (property 2). □

Example 1: The counting measure can be defined on the total algebra T(E) of an arbitrary nonempty set E. It is often denoted by #; if A ⊆ E is a finite set, then #(A) = the number of elements in A; if A is not finite, then #(A) = +∞. Properties 1-2-3 are immediate.

Definition 8: Let A ⊆ E. The characteristic function of A is the function χ_A : E → {0, 1} which is equal to 1 at all points in A and is equal to 0 at all points in A^c.

Example 2: Let E = R^n, and let a ∈ R^n be an arbitrary point. The Dirac measure with support a has domain T(R^n) and its value on an arbitrary set A ⊆ R^n is given by χ_A(a). It will be denoted ∆_a. Thus ∆_a(A) = 1 if a ∈ A, and ∆_a(A) = 0 if a ∈ A^c. Other measures can be generated by using the Dirac measures. For instance let a_1, a_2, . . . , a_N be fixed points in R^n and let p_1, p_2, . . . , p_N be arbitrarily chosen positive numbers, or "weights". Attributing to A ⊆ R^n the number p_1∆_{a_1}(A) + p_2∆_{a_2}(A) + . . . + p_N∆_{a_N}(A), a measure is defined. It can be denoted by Σ_{i=1}^{N} p_i ∆_{a_i}. This construction also works in the case N = +∞.
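The measures of Examples 1 and 2 are easy to model on finite sets, which makes it possible to test additivity (property 3) on concrete data. This is a minimal sketch; the names `counting_measure`, `dirac` and `weighted_dirac` are illustrative, points are represented by plain numbers, and infinite sets are not representable here.

```python
def counting_measure(A):
    """#(A) for a finite Python set A."""
    return len(A)

def dirac(a):
    """Return the Dirac measure with support a, as a function of a set A."""
    return lambda A: 1 if a in A else 0

def weighted_dirac(points, weights):
    """Return the measure sum_i p_i * Delta_{a_i} for finitely many points a_i."""
    return lambda A: sum(p for a, p in zip(points, weights) if a in A)

A, B = {1, 2, 3}, {7, 8}                       # two disjoint sets
m = weighted_dirac([1, 7, 9], [0.5, 0.25, 0.25])

# Additivity on the disjoint pair A, B for the three measures.
print(counting_measure(A | B), counting_measure(A) + counting_measure(B))
print(dirac(2)(A | B), dirac(2)(A) + dirac(2)(B))
print(m(A | B), m(A) + m(B))
```

The weights were chosen as dyadic fractions only so that the floating-point sums are exact.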

2.1.1 Quadrature of plane sets.

The counting measure is perhaps the most intuitive means of attaching numbers to sets, which describe 'how big' the sets are. However, the most familiar example is to be found in Geometry, notably in the definition of areas and volumes. Let us briefly recall this theory, for the case of bounded sets, or "figures", in the plane. The process whereby areas of plane figures are defined is known as "quadrature" ("squaring"). Rectangles can be squared by definition, and their area is given by the rule "base times height". Let Q be an arbitrary bounded figure in the plane. It is contained in a sufficiently large square, which with no loss of generality we assume to have unit side. Let us construct a netting of this square, by drawing a finite number of straight lines, parallel to the sides; the number and positions of such lines are arbitrary. Given such a netting, the square is divided into rectangles, which we assume to be closed (i.e., inclusive of their sides). Of all such rectangles let us select only those that are fully contained in the figure Q; and let a denote the sum of the areas of all such rectangles. Next let us select a wider family of rectangles, namely those which have a non-empty intersection with Q; and denote by A the sum of their areas. On repeating this construction for all possible nettings of the square, two sets I− and I+ of non-negative real numbers are generated, which contain all the a numbers, and all the A numbers, respectively. Both I+ and I− are subsets of [0, 1] by construction. Let s+ denote the infimum (greatest lower bound) of I+ and let s− denote the supremum (least upper bound) of I−. It can be proven that no number in I− is larger than any number in I+, so s+ ≥ s−. If it happens that s+ = s−, then the figure Q is said to be quadrable, and its area is given by the common value of s+ and s−. With this familiar construction, all figures of elementary geometry turn out to be quadrable. Moreover, one can show that the family 𝒬 of all quadrable figures which are contained in the square is an algebra of sets; and that the real set function which is defined on 𝒬 by the area is an elementary measure.

Nevertheless, not all subsets of the plane are quadrable and, in particular, one can construct subsets of the square that cannot be squared. A classic example is the following. Let us introduce Cartesian coordinates, with axes along two sides of the square. Let Q0 be the set of all those points in the square which have both coordinates x and y given by rational numbers. That this is not a quadrable set is immediately seen as follows. For any netting of the square, all rectangles contain points of Q0 because Q0 is a dense set of points, so the number A = 1; on the other hand, no rectangle can consist exclusively of points in Q0, so the number a = 0. As this is true for all possible nettings, 0 = s− ≠ s+ = 1.

The theory of Quadrature just summarized is the n = 2 case of the Elementary Measure Theory for subsets of R^n, with n an arbitrary integer.
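The numbers a and A of the quadrature construction can be computed explicitly for a simple figure. The sketch below takes for Q a closed disk inside the unit square and uses only nettings made of n × n equal squares, which is a special family of nettings chosen here for convenience; the function name, the centre and the radius are illustrative assumptions. Exact rational arithmetic avoids rounding problems for squares that touch the boundary of the disk.

```python
from fractions import Fraction
import math

def inner_outer_area(n, cx=Fraction(1, 2), cy=Fraction(1, 2), r=Fraction(2, 5)):
    """Return (a, A) for the closed disk of centre (cx, cy) and radius r,
    using the netting of the unit square into n*n equal closed squares."""
    h = Fraction(1, n)
    r2 = r * r
    inner = outer = 0
    for i in range(n):
        x0, x1 = i * h, (i + 1) * h
        # squared horizontal distance from the centre: farthest and nearest point
        far_x = max((x0 - cx) ** 2, (x1 - cx) ** 2)
        near_x = max(x0 - cx, cx - x1, 0) ** 2
        for j in range(n):
            y0, y1 = j * h, (j + 1) * h
            far_y = max((y0 - cy) ** 2, (y1 - cy) ** 2)
            near_y = max(y0 - cy, cy - y1, 0) ** 2
            if far_x + far_y <= r2:     # farthest corner in Q: square contained in Q
                inner += 1
            if near_x + near_y <= r2:   # nearest point in Q: square meets Q
                outer += 1
    return inner * h * h, outer * h * h

a, A = inner_outer_area(20)
print(float(a), float(A), math.pi * 0.16)   # a <= area of the disk <= A
```

Refining the netting raises a and lowers A, so the two numbers squeeze the area of the disk, as expected for a quadrable figure.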

2.2 Measures.

Although every algebra of sets is closed with respect to unions of finitely many sets, it may not be closed with respect to unions of countable families of sets. A simple counterexample is provided by Problem 11. Another very important counterexample is provided by the algebra of quadrable figures, which was defined in the preceding section. To see this, let us call a singlet every set which consists of one element, and note that the set Q0 is a countable set and so can be pictured as the union of a countable family of singlets, each containing a single point of Q0. Every such singlet is obviously quadrable, with zero area; and yet Q0 is not quadrable, as we have seen.

Definition 9: A σ-algebra of sets in E is an algebra of sets in E that is closed with respect to unions of countably many sets.

The trivial algebras ℰ_0 and T(E) are obviously σ-algebras. Every algebra with only a finite number of sets in it is automatically a σ-algebra; so the difference between algebras and σ-algebras is immaterial whenever E is a finite set. It is instead substantial whenever E is infinite.


Problem 11: Show that the algebra in Problem 10 is not a σ-algebra. (Singlets are elements of the algebra; however, the set of the even numbers is not an element of the algebra, in spite of being a countable union of singlets.)

Definition 10: A measure in a set E is an elementary measure µ such that :

1. the family of µ-measurable sets is a σ-algebra,

2. µ is countably additive, that is: if {A_j}_{j∈N} is a sequence of pairwise disjoint µ-measurable sets (i.e., j ≠ k ⇒ A_j ∩ A_k = ∅), then

\[
\mu\Big(\bigcup_{j=1}^{\infty} A_j\Big) = \sum_{j=1}^{+\infty} \mu(A_j) .
\]

Note that the above properties (1) and (2) are a strengthening of properties (1) and (3) (or (3')) of an elementary measure.

Problem 12: Verify that the counting measure (Example 1) and the Dirac measures (Example 2) are measures according to Def. 10.

Proposition 7: (Subadditivity) If A_1, A_2, . . . , A_n, . . . are µ-measurable sets, then

\[
\mu\Big(\bigcup_{n=1}^{+\infty} A_n\Big) \le \sum_{n=1}^{+\infty} \mu(A_n) .
\]

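Proof: define D_0 = ∅ and, for n ≥ 1, D_n := A_1 ∪ A_2 ∪ . . . ∪ A_n and C_n := A_n \ D_{n−1}. Then:

\[
\bigcup_{n=1}^{+\infty} A_n = \bigcup_{n=1}^{+\infty} C_n .
\]

Each set C_n is a subset of A_n and is disjoint from A_1, . . . , A_{n−1}. Hence the C_n are pairwise disjoint sets, and so, by countable additivity:

\[
\mu\Big(\bigcup_{n=1}^{+\infty} A_n\Big) = \sum_{n=1}^{+\infty} \mu(C_n) \le \sum_{n=1}^{+\infty} \mu(A_n)
\]

because ∀n, C_n ⊆ A_n, so that µ(C_n) ≤ µ(A_n) by monotonicity (Prop. 6). □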
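The "disjointification" step of the above proof can be tried out on finitely many finite sets, with the counting measure playing the role of µ. This is only an illustration on sample data; the name `disjointify` and the sets used are not part of the notes.

```python
def disjointify(sets):
    """Given A_1, ..., A_N, return the sets C_n = A_n minus (A_1 ∪ ... ∪ A_{n-1})."""
    union_so_far = set()          # plays the role of D_{n-1}, starting from D_0 = ∅
    pieces = []
    for A in sets:
        pieces.append(set(A) - union_so_far)
        union_so_far |= set(A)
    return pieces

A = [{1, 2, 3}, {3, 4}, {1, 4, 5}, {6}]
C = disjointify(A)
total = set().union(*A)

print(C)                                  # pairwise disjoint pieces
print(len(total), sum(len(c) for c in C), sum(len(a) for a in A))
```

The first two printed numbers coincide (additivity over the disjoint C_n), and neither exceeds the third (subadditivity over the A_n).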

2.2.1 Lebesgue measure.

The elementary measure for subsets of R^n is not a measure in the technical sense of Def. 10 because the algebra of elementarily measurable sets is not a σ-algebra, as we have seen. It is nevertheless possible to define a measure in R^n such that:
- the σ-algebra of measurable sets contains all elementarily measurable sets, and
- the measure of every such set has the same value as its elementary measure.


This measure is called the Lebesgue measure in R^n, and plays a fundamental role in mathematical analysis. Only a crude overview of the definition and the main properties of the Lebesgue measure will be given here.

First note that a σ-algebra of sets, which contains all elementarily measurable sets, must of necessity contain all rectangles¹. As a consequence, it must also contain every set that can be generated by operating on rectangles by means of at most countably many set operations. Such sets are called Borel sets in R^n.

Proposition 8: The family of all Borel sets in R^n is a σ-algebra. It is called the Borel σ-algebra in R^n.

This statement is more or less intuitive and no details of its proof are given here. The Borel σ-algebra in R^n will be denoted B_n.

Proposition 9: Every open set in Rn is a Borel set.

*Proof: we refer to n = 2 for notational convenience; however, the argument is independent of the dimension. Let A be an open set in R². Consider the set A ∩ Q² of the points in A which have rational coordinates. This is a countable set; let r_n (n ∈ N) be its points (with arbitrary numbering). Because A is open, for all n we can find an integer N so that the open square of side 2^{−N} centered at r_n is fully contained in A. Let N_n be the smallest such integer and let Q_n be the corresponding square. Denoting Q_∞ := ∪_{n=1}^{∞} Q_n, we note that Q_∞ is a Borel set by construction; moreover, Q_∞ ⊆ A because Q_n ⊆ A by construction. Now let w be an arbitrary point in A. There is an integer M so that the square C_M with center in w and side 2^{−M} is fully contained in A. As the points r_n densely fill A, we can choose an r_n so that ∥r_n − w∥ < 2^{−M−2}. Then the square with center in r_n and side 2^{−M−1} (hence half-side 2^{−M−2}) contains w and is in turn contained in C_M, and hence in A; so it is also contained in Q_n. It follows that w ∈ Q_n ⊆ Q_∞ and so A ⊆ Q_∞. As the opposite inclusion has already been proved, we conclude that A = Q_∞ and hence A is a Borel set. □

As a consequence, the family of all Borel sets also contains all closed sets. As a matter of fact, it is an extremely vast family; however it does not include all the sets in R^n. It can be proven that R^n has non-Borel subsets; however this proof rests on the set-theoretic "axiom of choice", so, as is always the case when recourse to this axiom is made, it is a non-constructive proof. That means, existence of non-Borel sets is proven, but no explicit example is known.

Let A be an arbitrary Borel set in R^n. Let {I_1, I_2, . . . , I_n, . . .} be a family, at most countable, of rectangles, not necessarily disjoint, such that A ⊆ ∪_n I_n. Let such a family be called a rectangular covering of A. For each rectangle I let us denote by |I| its elementary measure, given by the product of the lengths of its edges. Consider all possible rectangular coverings of A, and for each such covering compute the sum Σ_n |I_n|. Then define:

\[
|A| := \inf\Big\{ \sum_n |I_n| \;:\; \{I_1, I_2, \ldots, I_n, \ldots\} \text{ a rectangular covering of } A \Big\} . \tag{2.2}
\]

¹ By a "rectangle" we here mean any set in R^n which can be obtained as a Cartesian product of intervals (closed, open, or half-open).


We immediately note that whenever A is a set with at most countably many points in it (like Q0 in Sect. 2.1.1) the |A| thus defined is 0; indeed, singlets are just a special type of closed rectangles, so the family of the singlets of the points of A is a covering of A by means of rectangles of zero measure.
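The same conclusion is also reached with coverings by intervals of positive length, which is a different covering from the degenerate one used in the text: if the k-th point of a countable set is placed at the centre of an interval of length ϵ/2^k, the total length of the covering stays below ϵ however many points are covered. The sketch below works in R (n = 1) with finitely many rationals; the function names and the bound on denominators are illustrative assumptions.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """List the distinct rationals p/q in [0, 1] with q <= max_den."""
    return sorted({Fraction(p, q) for q in range(1, max_den + 1) for p in range(q + 1)})

def covering_length(points, eps):
    """Total length of a covering in which the k-th point (k = 1, 2, ...)
    sits at the centre of an interval of length eps / 2**k."""
    return sum(Fraction(eps) / 2 ** k for k in range(1, len(points) + 1))

pts = rationals_in_unit_interval(20)
eps = Fraction(1, 1000)
print(len(pts), float(covering_length(pts, eps)), float(eps))
```

Since ϵ is arbitrary, the infimum in (2.2) is 0 for this kind of set.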

Theorem 11: The real set function defined in R^n by |.| : B_n → R is a measure on the Borel σ-algebra.

This is the central result of the theory, but no hint of its proof can be given here.

Definition 11: Let µ be a measure in a set E, and let 𝒜 be the σ-algebra of µ-measurable sets. A set Z ⊂ E is said to be µ-negligible if it is a subset of a µ-measurable set of zero measure; that is, if ∃N ∈ 𝒜 so that Z ⊆ N and µ(N) = 0. The measure µ is said to be complete if every µ-negligible set is a µ-measurable set, and so has zero measure.

Proposition 10: Let Ā be the family of all subsets of E that can be written as A ∪ Z where A is µ-measurable (i.e., A ∈ 𝒜) and Z is µ-negligible. Then Ā is a σ-algebra, and the real set function µ̄ defined on Ā by µ̄(A ∪ Z) = µ(A) is a measure, which is called the Completion of µ.

Definition 12: The Lebesgue measure in R^n is the completion of the measure that is defined on B_n as in (2.2).

The Lebesgue measure will still be denoted by |.|.

Problem 13: Prove Proposition (10).

Proposition 11: Every elementarily measurable set Q ⊂ R^n is also Lebesgue-measurable, with the same value of the measure.

*Proof: we give a sketch of the proof, for the case of a plane, bounded, quadrable figure Q. First of all, it is not difficult to prove the claim when Q is a plurirectangle, i.e., a union of a finite number of rectangles. Then let Q be quadrable and bounded, and otherwise arbitrary. Let s(Q) be its area. Thanks to the process by which the area is defined, for every integer n two plurirectangles Q_n^− and Q_n^+ are found, such that Q_n^− ⊆ Q ⊆ Q_n^+, and moreover s(Q_n^−) > s(Q) − 1/n and s(Q_n^+) < s(Q) + 1/n. Define Q_∞^− = ∪_n Q_n^− and Q_∞^+ = ∩_n Q_n^+. Both sets Q_∞^± are Borel sets and satisfy Q_∞^− ⊆ Q ⊆ Q_∞^+; moreover, denoting N := Q_∞^+ \ Q_∞^−, it is true that N ⊆ Q_n^+ \ Q_n^− ∀n; and so

\[
|N| \le |Q_n^+ \setminus Q_n^-| = s(Q_n^+ \setminus Q_n^-) = s(Q_n^+) - s(Q_n^-) \le \frac{2}{n}
\]

As this inequality holds ∀n, it implies |N| = 0. Therefore Q \ Q_∞^− is a negligible set because it is contained in N. Hence it has measure zero, Q is measurable, and |Q| = |Q_∞^−| = s(Q). □

2.2.2 Measurable Functions.

Let µ be a measure on a σ-algebra 𝒜 in a set E; and let f : E → R be a real-valued function on E.

Definition 13: The function f is said to be µ-measurable if, ∀a ∈ R, the subset of E defined by {x ∈ E : f(x) > a} is µ-measurable. In symbols:

\[
f^{-1}(\,]a, +\infty[\,) \in \mathcal{A} , \quad \forall a \in \mathbb{R} . \tag{2.3}
\]


Problem 14: Show that the characteristic function χ_A (cf. Def. 8) of a set A ⊆ E is µ-measurable if and only if A is a µ-measurable set.

Proposition 12: If f : E → R is µ-measurable then f^{−1}(I) ∈ 𝒜 whenever I is a half-line, right or left, closed or open, or an arbitrary real interval, closed, open or half-open.

Proof: let e.g. I = [a, +∞[ and define, for every integer n ≥ 1, I_n = ]a − 1/n, +∞[. Then I = ∩_n I_n. Therefore f^{−1}(I) = ∩_n f^{−1}(I_n), so it is measurable because every f^{−1}(I_n) ∈ 𝒜 due to Def. 13. All remaining cases are worked out by taking complements and/or intersections.

Definition 14: A function f : E → C is said to be µ-measurable if such are its real and imaginary parts.

Theorem 12: If f and g are measurable functions on E, and α, β ∈ C, then :

• the function αf + βg is µ-measurable;

• the function f · g is µ-measurable;

• the functions f^n (n a positive integer) and |f| are µ-measurable.

The proof is omitted.

A predicate P(x) about a point x ∈ E is said to be µ-almost everywhere (in short: µ-a.e.) true, or, equivalently, to be true at µ-almost all x ∈ E, if the set of points x ∈ E where the predicate is false is µ-measurable, and has measure 0.

Examples:
- if µ is Lebesgue measure in R, then µ-almost all real numbers are irrational.
- if µ is the Dirac measure in R with support a = 1, then µ-almost all real numbers are positive.
- if µ is Lebesgue measure in R, for integer n define f_n(x) := n exp(−n²x²), and also f_∞(x) := 0 ∀x. Then:

\[
\lim_{n\to+\infty} f_n(x) = f_\infty(x) , \quad \mu\text{-a.e.}
\]

that is: the sequence of functions f_n converges almost everywhere in R to the function f_∞.

In the case of R^n, the Lebesgue measure is understood whenever "almost everywhere" or "almost all" are used with no specification of a measure µ.
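The third example above can be explored numerically: at every x ≠ 0 the values f_n(x) collapse to 0, whereas at x = 0 one has f_n(0) = n, which diverges. The exceptional set {0} has zero Lebesgue measure, which is why the convergence holds almost everywhere and not everywhere. The sample points below are arbitrary.

```python
import math

def f(n, x):
    """f_n(x) = n * exp(-n^2 x^2), as in the third example."""
    return n * math.exp(-(n * x) ** 2)

for x in (0.0, 0.01, 0.1, 1.0):
    print(x, [f(n, x) for n in (1, 10, 100, 1000)])
```

For small x ≠ 0 the values first grow with n and only later decay, so the convergence is far from uniform near the origin.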

Theorem 13: If µ is a complete measure, and the µ-measurable functions f_n : E → C converge µ-a.e. to a function f_∞, then f_∞ is a µ-measurable function.

The proof is omitted. The abundance (or scarcity) of measurable functions is directly related to the abundance of measurable sets. Let for instance the σ-algebra of measurable sets be the smallest possible: 𝒜 = {∅, E}; then in order for (2.3) to be true, f(x) > a has to hold either for all x, or for no x; so f must be a constant function. The class of measurable functions is quite small in this case. In contrast, if 𝒜 = T(E), as in the case of the counting measure or of the Dirac measures, then all sets are measurable, so (2.3) is true for all a and for all functions; therefore, all functions are measurable. In the case of the Lebesgue measure the class of measurable functions is quite a large one. It includes all continuous functions (because in that case f^{−1}(I) is open for every open half-line I, hence a measurable set), but it is much bigger. Non-measurability of a function is tied up with non-measurability of a set f^{−1}(I), so the non-Lebesgue-measurable functions are as rare as the non-Lebesgue-measurable sets; such functions do exist, but no explicit example can be shown.

Problem 15: (1) Show that |A| > 0 for every non-empty open set A ⊆ R; (2) show that if f and g are continuous functions on R such that f(x) = g(x) a.e., then f(x) = g(x) ∀x ∈ R.

2.2.3 Sets of zero measure.

Finite or countable sets in R^n have zero Lebesgue measure; however, the class of the zero-measure sets is much larger than that; indeed, countable sets are in a sense exceptional in that class. The best known counterexample is perhaps the so-called ternary Cantor set. Its construction is as follows. Let the interval [0, 1] be divided into three equal subintervals; then remove the middle (open) one. Divide each of the two remaining closed intervals into three equal intervals; and, again, from each remove the middle (open) one. Now 4 closed intervals are left; on each of them the construction is repeated, again and again. The Cantor set is the set of the points that survive when this process is repeated ad infinitum. It is not empty, as it contains, at least, the endpoints of all the closed intervals that are left at each step; but it can actually be proven to be an uncountable set.

*A ternary sequence is a sequence of digits taken from {0, 1, 2}. To any such sequence {t(n)} one may uniquely associate a real number in [0, 1] as follows. Divide [0, 1] into 3 equal intervals, and choose the 1st, the 2nd, or the 3rd interval according to whether t(1) = 0, 1, or 2. Denote this interval by I_1. Next divide I_1 into thirds, and again choose the 1st, the 2nd, or the 3rd of them, according to whether t(2) = 0, 1, or 2. Denote this interval by I_2. Continuing in this way, a sequence of closed intervals I_1 ⊃ I_2 ⊃ . . . ⊃ I_n ⊃ . . . is generated with |I_n| = 3^{−n}. Thanks to a fundamental property of the real numbers, there is a unique real number x that belongs to all such intervals. The sequence t is just a ternary ("base 3") expansion of the number x. It is immediately seen that the points c in the Cantor set are in one-to-one correspondence with the ternary sequences in which the digit 1 never appears. This class of ternary sequences can obviously be identified with the binary sequences. Every binary sequence can in turn be read as the binary expansion of a unique real number in [0, 1], so there are as many binary sequences as there are numbers in [0, 1]. Thanks to the above construction, the same is true of the points in the Cantor set.

In spite of that, the ternary Cantor set has zero Lebesgue measure .

Problem 16: Prove the last statement. (Compute the total measure of the removed intervals...)
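The hint of Problem 16 can be checked with exact arithmetic before attempting the proof. At step k of the construction, 2^{k−1} open intervals of length 3^{−k} are removed, while 2^k closed intervals of length 3^{−k} are left. The sketch below only tabulates these finite-step quantities; the function names are illustrative, and the passage to the limit is left to the reader.

```python
from fractions import Fraction

def removed_length(steps):
    """Total length removed from [0, 1] after the given number of steps:
    step k removes 2**(k-1) open intervals of length 3**(-k)."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, steps + 1))

def remaining_length(steps):
    """Total length of the 2**steps closed intervals left after `steps` steps."""
    return Fraction(2, 3) ** steps

for n in (1, 2, 5, 10, 50):
    print(n, float(removed_length(n)), float(remaining_length(n)))
```

At every step the two quantities add up to 1, and the remaining length (2/3)^n is an upper bound for the measure of the Cantor set, since the set is contained in what is left after n steps.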

2.3 Integrals.

In this section we define the integral, denoted ∫_E dµ(x) f(x), of a µ-measurable real or complex function defined in E.


2.3.1 Definition.

Definition 15: If A is a µ-measurable set (A ∈ 𝒜), then the integral of the characteristic function χ_A of A is defined as:

\[
\int_E d\mu(x)\, \chi_A(x) = \mu(A) ,
\]

(possibly infinite).

The "Dirichlet function" in [0, 1] is the characteristic function of the set Q ∩ [0, 1]. It is a classic example of a function that is not Riemann integrable in [0, 1]. In contrast, it is immediately integrated using the Lebesgue measure.

Problem 17: Compute the integral over E = [0, 1] of the Dirichlet function with µ = the Lebesgue measure.

Definition 16: A µ-simple function on E is a µ-measurable function s on E that takes finitely many values.

Let s_1, s_2, . . . , s_n be the values of s and let A_j := {x ∈ E : s(x) = s_j}, (1 ≤ j ≤ n). Each of these sets is µ-measurable thanks to Prop. 12. ∀x ∈ E one may write:

\[
s(x) = \sum_{j=1}^{n} s_j\, \chi_{A_j}(x) .
\]

Definition 17: The integral of a real, µ-simple, non-negative function (s(x) ≥ 0, ∀x ∈ E) is defined by:

\[
\int_E d\mu(x)\, s(x) = \sum_{j=1}^{n} s_j\, \mu(A_j) ,
\]

with the proviso that if s_j = 0 and µ(A_j) = +∞ then s_j µ(A_j) = 0.
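Definition 17 translates directly into a finite sum, once a simple function is described by the list of its values s_j together with the measures µ(A_j) of the sets where they are taken. The sketch below assumes that these measures are already known; the function name and the input format are illustrative. The proviso about 0 · (+∞) has to be handled explicitly, because in floating-point arithmetic that product is not a number.

```python
import math

def integrate_simple(values_and_measures):
    """Integral of a non-negative simple function s = sum_j s_j * chi_{A_j},
    given as pairs (s_j, mu(A_j)); uses the convention 0 * (+inf) = 0."""
    total = 0.0
    for s_j, mu_j in values_and_measures:
        if s_j < 0 or mu_j < 0:
            raise ValueError("values and measures must be non-negative")
        if s_j == 0:
            continue      # proviso: s_j = 0 contributes 0 even if mu(A_j) = +inf
        total += s_j * mu_j
    return total

# A function equal to 2 on a set of measure 0.5 and to 3 on a set of measure 0.25.
print(integrate_simple([(2, 0.5), (3, 0.25)]))
# A function equal to 1 on a set of measure 2 and to 0 on a set of infinite measure.
print(integrate_simple([(0, math.inf), (1, 2)]))
```

A strictly positive value on a set of infinite measure gives an infinite integral, which the definition allows.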

Definition 18: Let f : E → R be an arbitrary µ-measurable, real, non-negative function. Its integral is defined as:

\[
\int_E d\mu(x)\, f(x) = \sup\Big\{ \int_E d\mu(x)\, s(x) \;:\; s \ \mu\text{-simple, real, such that } 0 \le s(x) \le f(x) \ \forall x \in E \Big\} .
\]

The set of numbers, of which the supremum is taken in this definition, is generated by varying s(x) within the class of all functions that satisfy the specified conditions.

Finally let f be µ-measurable, real, with arbitrary sign. One may write:

\[
f(x) = f_+(x) - f_-(x) ; \qquad f_+(x) := |f(x)| , \quad f_-(x) := |f(x)| - f(x) ,
\]

and then f_± are by construction µ-measurable and non-negative. Therefore:


Definition 19: A µ-measurable real function f is said to be µ-integrable if at least one of the functions f_± has a finite integral (as defined in Def. 18), and then one defines:

\[
\int_E d\mu(x)\, f(x) = \int_E d\mu(x)\, f_+(x) - \int_E d\mu(x)\, f_-(x) .
\]

A complex-valued µ-measurable function is integrable if its real part and its imaginary part are integrable, and one defines:

\[
\int_E d\mu(x)\, f(x) = \int_E d\mu(x)\, \Re f(x) + i \int_E d\mu(x)\, \Im f(x) .
\]

Definition 20: For a µ-measurable f : E → C and a µ-measurable set A (A ∈ 𝒜) one defines:

\[
\int_A d\mu(x)\, f(x) := \int_E d\mu(x)\, f(x)\, \chi_A(x) .
\]

The function f is said to be µ-summable over A if it is integrable and its integral is finite. The set of all µ-summable functions over A is denoted L^1(A, µ).

Problem 18: Let E = {1, 2, . . . , n} with the counting measure. Let f : E → C. What is the integral of f over E? (f is a #-simple function...)

2.3.2 Elementary properties.

Theorem 14: Let E be an arbitrary nonempty set, µ an arbitrary complete measure in E, 𝒜 the σ-algebra of µ-measurable sets, and A ∈ 𝒜 a measurable set.

1. If f ∈ L^1(A, µ), g ∈ L^1(A, µ), and α, β ∈ C, then also αf + βg ∈ L^1(A, µ), and

\[
\int_A d\mu(x)\, [\alpha f(x) + \beta g(x)] = \alpha \int_A d\mu(x)\, f(x) + \beta \int_A d\mu(x)\, g(x) .
\]

2. f is µ-summable over A if, and only if, |f| is µ-summable over A, and:

\[
\Big| \int_A d\mu(x)\, f(x) \Big| \le \int_A d\mu(x)\, |f(x)| .
\]

3. Let A_1, A_2, . . . , A_n, . . . be a countable family of pairwise disjoint µ-measurable sets; and let A_∞ = ∪_n A_n. If f is µ-summable over A_∞, then it is µ-summable over every A_n, and:

\[
\int_{A_\infty} d\mu(x)\, f(x) = \sum_{n=1}^{+\infty} \int_{A_n} d\mu(x)\, f(x) .
\]

4. If N ∈ 𝒜 and µ(N) = 0 then ∫_N dµ(x) f(x) = 0 for all µ-measurable functions f. If f, g ∈ L^1(E, µ) and f(x) = g(x) µ-a.e., then ∫_E dµ(x) f(x) = ∫_E dµ(x) g(x).


5. If f, g are real functions in L^1(E, µ) and f(x) ≤ g(x) µ-a.e., then ∫_E dµ(x) f(x) ≤ ∫_E dµ(x) g(x).

Problem 19: Prove Property 2. (Use Def. 19.)

Problem 20: Prove Property 4. (Thanks to Property 2, it is enough to prove it for non-negative functions. Thanks to Def. 18, it is then enough to prove it for the µ-simple non-negative functions ...)

Problem 21: Prove Property 5. (Thanks to Defs. 17 and 18, the integral of a non-negative function is never negative...)

No proof will be given of the remaining properties.

2.3.3 Special Cases.

a. Series Summation.

Let E = N, and µ = #, the counting measure. Let points in N be denoted by n as usual (instead of the letter x used in the above general theory). The functions f : N → C are the complex-valued sequences. Every such function is measurable, because every subset of N is measurable for the counting measure. It will be shown that: the #-summable functions are the sequences f such that the series Σ_{n=1}^{∞} f(n) is absolutely convergent, and for such sequences:

\[
\int_{\mathbb{N}} d\#(n)\, f(n) = \sum_{n=1}^{\infty} f(n) \tag{2.4}
\]

that is: in this case, the general notion of integral coincides with the elementary notion of sum of a series.

Let f be summable. Noting that

\[
\mathbb{N} = \bigcup_{m=1}^{+\infty} \{m\} ,
\]

and using property 3 in Thm. 14 one finds that:

\[
\int_{\mathbb{N}} d\#(n)\, f(n) = \sum_{m=1}^{+\infty} \int_{\{m\}} d\#(n)\, f(n) .
\]

On the other hand, ∫_{{m}} d#(n) f(n) = ∫_N d#(n) f(n) χ_{{m}}(n). The function in the last integrand only takes the values 0 and f(m), so it is a #-simple function. Definition 17 immediately yields the value f(m) of its integral, because #({m}) = 1. So if f is summable then (2.4) holds. However, due to Thm. 14, property 2, f is summable if and only if such is |f|, so Σ_{n=1}^{∞} |f(n)| < +∞ must hold; that means, the series Σ_{n=1}^{∞} f(n) has to be absolutely convergent.


b. Dirac measures.

Let E = R^n and µ = ∆_a, the Dirac measure supported in the point a ∈ R^n. As all sets are measurable in this case, all functions f : R^n → C are measurable. It will be presently shown that every f is ∆_a-summable, and that:

\[
\int_{\mathbb{R}^n} d\Delta_a(x)\, f(x) = f(a) . \tag{2.5}
\]

Let us consider the function f(x)χ_{{a}}(x). It is a simple function, because it takes only two values: 0 and f(a). Thanks to Def. 17, its integral is f(a)∆_a({a}) = f(a). On the other hand this function is ∆_a-a.e. equal to f(x), so the integral of f(x) is the same due to property 4 in Thm. 14.

c. The Lebesgue Integral.

The integral of functions in R^n with the Lebesgue measure is called the Lebesgue integral. It is roughly correct to say that the Lebesgue integral is to the Riemann integral what the Lebesgue measure is to the elementary measure. From a practical viewpoint, it does not complicate the computation of integrals; on the contrary, it greatly simplifies calculations of certain limits, as will be shown in the next Section. From a theoretical viewpoint, the class of integrable functions is much larger.

Theorem 15: Let A ⊆ Rn be an arbitrary rectangle: A = [a1, b1] × [a2, b2] × . . . × [an, bn]with −∞ < aj < bj < +∞ for 1 ≤ j ≤ n. Let f : A → C be a bounded function, that isRiemann-integrable over A. Then f is Lebesgue-integrable , and∫

A

dx f(x) =

∫ b1

a1

dx1

∫ b2

a2

dx2 . . .

∫ bn

an

dxn f(x1, x2, . . . , xn) .

The integral on the lhs is meant with respect to the Lebesgue measure. Conforming to prevalentuse, whenever µ is the Lebesgue measure dµ(x) is replaced just by dx; or else, and even moregenerally, the same notation is used as for the Riemann integral. The Lebesgue-summable func-tions are a much wider class than the Riemann-integrable ones. One such function is χQn∩A,the integral of which is 0.In the case of unbounded functions, or of functions which are integrated over unbounded do-mains, reference has to be made to the generalized Riemann integral as a term of comparison.Then a function may be integrable in the latter sense, but not in the sense of Lebesgue.

Theorem 16: If $f : A \to \mathbb{C}$ is defined in a domain $A \subseteq \mathbb{R}^n$ and is integrable over $A$ in the generalized sense of Riemann, and moreover $|f|$ is integrable in that sense, then $f$ is summable over $A$ in the sense of Lebesgue, with the same result.

Note that the assumption that $|f|$, and not only $f$, be integrable is crucial, because the Riemann integral does not enjoy property 2 in Thm.14. As an example: in $[0, 1] \subset \mathbb{R}$ the function $f(x) := 2\chi_{\mathbb{Q}}(x) - 1$ is equal to $\pm 1$ depending on whether $x$ is rational, or not. It is not Riemann-integrable, however $|f(x)|$ is Riemann-integrable, as it is the constant function = 1.


The functions that are integrable in the generalized sense of Riemann, and yet are not Lebesgue-summable, are exactly those functions that are not absolutely integrable. With respect to the Lebesgue integral they play the same role that the convergent yet not absolutely convergent series play with respect to the integral which was described in subsec. "a" of this Section.

The function $e^{ix^2}$ in $\mathbb{R}$ is an example of a function that is integrable in the generalized sense of Riemann, but is not Lebesgue-summable. Its generalized integral is known as the Fresnel integral.

The following Thm is the basic result about calculation of multiple integrals. Every $x = (x_1, \ldots, x_{n+m}) \in \mathbb{R}^{n+m}$ can be written as $x = (u, v)$ where $u = (x_1, \ldots, x_n) \in \mathbb{R}^n$, and $v = (x_{n+1}, \ldots, x_{n+m}) \in \mathbb{R}^m$. So, every $f(x)$ on $\mathbb{R}^{n+m}$ may be written as $f(u, v)$.

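Theorem 17: (the Fubini theorem.) Let $f(x)$ be a summable function over $\mathbb{R}^{n+m}$. Then:

• for almost all $u \in \mathbb{R}^n$, the function $F_u$ that is defined in $\mathbb{R}^m$ by $F_u : v \mapsto f(u, v)$ is summable over $\mathbb{R}^m$, and its integral $\int_{\mathbb{R}^m} dv\, F_u(v)$ is summable over $\mathbb{R}^n$;

• for almost all $v \in \mathbb{R}^m$, the function $F_v$ that is defined in $\mathbb{R}^n$ by $F_v : u \mapsto f(u, v)$ is summable over $\mathbb{R}^n$, and its integral $\int_{\mathbb{R}^n} du\, F_v(u)$ is summable over $\mathbb{R}^m$;

• $\displaystyle \int_{\mathbb{R}^{n+m}} dx\, f(x) = \int_{\mathbb{R}^m} dv \left( \int_{\mathbb{R}^n} du\, F_v(u) \right) = \int_{\mathbb{R}^n} du \left( \int_{\mathbb{R}^m} dv\, F_u(v) \right) .$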

2.3.4 Exchanging lim and ∫.

Let $f_\infty$ be a µ-measurable function in $E$, and let a sequence $f_n : E \to \mathbb{C}$ of µ-summable functions over $E$ converge µ-a.e. to $f_\infty$. Consider the following questions:

I: is $f_\infty$ a µ-summable function?

II: if $f_\infty$ is µ-summable, is it true that $\lim_{n\to\infty} \int_E d\mu(x)\, f_n(x) = \int_E d\mu(x)\, f_\infty(x)$?

The following examples show that such questions have no unique answer in the absence of further specifications.

About question I: Let $E = \mathbb{R}$ with the Lebesgue measure and let $f_n$ be the characteristic function of the interval $[-n, n]$. Every $f_n$ is summable, but $f_\infty$ is the constant function = 1, and is not summable (as its integral is $+\infty$). In contrast, if $f_n(x) = e^{-n|x|}$, then $f_\infty(x) = 0$ for all $x \ne 0$, which is summable.

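About question II: in the last example:

$$\int_{\mathbb{R}} dx\, e^{-n|x|} = \frac{2}{n} , \qquad \int_{\mathbb{R}} dx\, f_\infty(x) = 0 = \lim_{n\to\infty} \int_{\mathbb{R}} dx\, e^{-n|x|} .$$

In contrast, if:

$$f_n(x) := \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2} , \qquad (2.6)$$

then:

$$\lim_{n\to\infty} f_n(x) = f_\infty(x) := 0 \quad \text{a.e.} ,$$

but $\int_{\mathbb{R}} dx\, f_n(x) = 1$, so:

$$1 = \lim_{n\to\infty} \int_{\mathbb{R}} dx\, f_n(x) \ne 0 = \int_{\mathbb{R}} dx\, f_\infty(x) .$$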
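The two examples about question II can be checked numerically. The following is only a minimal sketch, assuming a plain midpoint rule on the truncated interval $[-20, 20]$; the helper names (`integrate`, `exp_family`, `gauss_family`) and the cutoff are illustrative choices, not part of the notes.

```python
import math

def integrate(f, a, b, steps=80_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def exp_family(n):
    """f_n(x) = exp(-n|x|): the integrals 2/n tend to 0."""
    return lambda x: math.exp(-n * abs(x))

def gauss_family(n):
    """f_n(x) of (2.6): every integral equals 1."""
    return lambda x: n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

CUTOFF = 20.0  # assumption: the tails beyond |x| = 20 are negligible here
ns = [1, 2, 4, 8, 16]
exp_integrals = [integrate(exp_family(n), -CUTOFF, CUTOFF) for n in ns]
gauss_integrals = [integrate(gauss_family(n), -CUTOFF, CUTOFF) for n in ns]

for n, i_exp, i_gauss in zip(ns, exp_integrals, gauss_integrals):
    print(f"n={n:2d}  exp: {i_exp:.5f}  gauss: {i_gauss:.5f}")
```

In the first column the integrals shrink like $2/n$, in agreement with the integral of the limit function; in the second they stay at 1 although the pointwise limit is 0 almost everywhere.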


[Figure: plot of $f_n(x)$ against $x$ for $n = 1, 2, 3$.]

Fig. 2.1: The 1st three functions (2.6).

Theorem 18: (Dominated Convergence Theorem) Let $f_\infty$ be a µ-measurable function in $E$, and let $f_n : E \to \mathbb{C}$ be a sequence of µ-summable functions in $E$, such that

$$\lim_{n\to\infty} f_n(x) = f_\infty(x) \quad \mu\text{-a.e.}$$

If a non-negative µ-summable function $\varphi : E \to \mathbb{R}$ exists such that $|f_n(x)| \le \varphi(x)$ for all $n$ and for µ-almost all $x$, then $f_\infty$ is µ-summable, and

$$\lim_{n\to\infty} \int_E d\mu(x)\, f_n(x) = \int_E d\mu(x)\, f_\infty(x) .$$

Theorem 19: (Monotone Convergence Theorem) Let $f_\infty$ be a non-negative µ-measurable function in $E$, and $f_n : E \to \mathbb{R}$ a sequence of µ-summable functions in $E$, such that $0 \le f_1(x) \le f_2(x) \le \ldots \le f_n(x) \le \ldots$ µ-a.e., and:

$$\lim_{n\to\infty} f_n(x) = f_\infty(x) \quad \mu\text{-a.e.}$$

Then

$$\lim_{n\to\infty} \int_E d\mu(x)\, f_n(x) = \int_E d\mu(x)\, f_\infty(x) .$$

This holds also in the case when the limit is +∞.

2.4 Square summable functions.

Let $E$ be an arbitrary set, µ a measure in $E$, and $\mathcal{A}$ the σ-algebra of measurable sets. The set of all µ-measurable functions $f : E \to \mathbb{C}$ is a functional space thanks to prop.1 in Thm.12; i.e.,


it is a vector subspace of the space $F(E, \mathbb{C})$ which was defined in Sect.1.1. Let $\mathcal{L}^2(E, \mu)$ denote the class of all µ-square-summable functions $f : E \to \mathbb{C}$, i.e. of the functions $f : E \to \mathbb{C}$ such that:

$$\int_E d\mu(x)\, |f(x)|^2 < +\infty . \qquad (2.7)$$

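Theorem 20: The class $\mathcal{L}^2(E, \mu)$ of µ-square-summable functions is a vector space.

Proof: Let $\alpha, \beta \in \mathbb{C}$ and $f, g \in \mathcal{L}^2(E, \mu)$. The function $|\alpha f + \beta g|$ is µ-measurable thanks to Thm.12, and, moreover:

$$\int_E d\mu(x)\, |(\alpha f + \beta g)(x)|^2 = \int_E d\mu(x)\, |\alpha f(x) + \beta g(x)|^2 \le 2|\alpha|^2 \int_E d\mu(x)\, |f(x)|^2 + 2|\beta|^2 \int_E d\mu(x)\, |g(x)|^2 < +\infty , \qquad (2.8)$$

thanks to the 2nd inequality in Lemma 1, and to the properties of the integral that were stated in Thm.14. □

For $f, g \in \mathcal{L}^2(E, \mu)$ let us define:

$$h(f, g) := \int_E d\mu(x)\, f(x)^* g(x) . \qquad (2.9)$$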

This is a well defined expression, thanks to the 1st inequality in Lemma 1. It is immediately seen that it satisfies all properties of a scalar product, except for one; notably, $h(f, f) = 0$ does not of necessity imply that $f$ is the null vector in $\mathcal{L}^2(E, \mu)$ (that is, the function = 0 at all points in $E$). For instance, if $E = \mathbb{R}$, µ is the Lebesgue measure, and $f(x) = \chi_{\mathbb{Q}}(x)$ is the Dirichlet function, then $h(f, f) = 0$, and yet $f$ is not identically 0; instead, it is 0 almost everywhere. In general, the following is true:

Proposition 13: Let $\varphi : E \to \mathbb{R}$ be µ-measurable, and non-negative. Then $\int_E d\mu(x)\, \varphi(x) = 0$ if, and only if, $\varphi(x) = 0$ µ-almost everywhere.

Proof: If $\varphi(x) = 0$ µ-a.e. then $\int_E d\mu(x)\, \varphi(x) = 0$ due to property 4 in Thm.14. Conversely, for integer $n \ge 1$ define $E_n = \{x \in E : \varphi(x) > \frac{1}{n}\}$. Each $E_n$ is µ-measurable and moreover the union $E_\infty$ of all the sets $E_n$ is the set of all points $x \in E$ where $\varphi(x) > 0$ strictly. Noting that $\varphi(x) \ge \frac{1}{n}\chi_{E_n}(x)$ holds $\forall x$, from prop.5 in Thm.14 it follows that, $\forall n$,

$$\int_E d\mu(x)\, \varphi(x) \ge \frac{1}{n}\, \mu(E_n) .$$

If the integral on the lhs is 0 then $\mu(E_n) = 0$ $\forall n$, and so

$$\mu(E_\infty) \le \sum_{n=1}^{\infty} \mu(E_n) = 0$$

due to subadditivity of the measure (Prop.7); that is, $\varphi(x) = 0$ µ-almost everywhere. □

Therefore (2.9) does not define a scalar product in $\mathcal{L}^2(E, \mu)$, except in the case when the


measure is such that there is no set of 0 measure other than the empty set, because then the zero function is the unique function that is 0 µ-a.e. This in particular happens whenever µ is the counting measure. In particular, recalling subsect. (a) in Section 2.3.3, it is immediately seen that the spaces $\mathcal{L}^2(\mathbb{N}, \#)$ and $\mathcal{L}^2(\mathbb{Z}, \#)$ are just the $\ell^2(\mathbb{N})$, $\ell^2(\mathbb{Z})$ spaces, and (2.9) is the scalar product that was already introduced in such spaces.

2.4.1 $L^2$-spaces.

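Proposition 14: In the class $\mathcal{L}^2(E, \mu)$ of µ-square-summable functions define a binary relation ℜ as follows:

$$f \mathrel{\Re} g \;\Leftrightarrow\; f(x) = g(x) \ \ \mu\text{-almost everywhere.}$$

Then:

• ℜ is an equivalence relation,

• if $f \mathrel{\Re} f'$ and $g \mathrel{\Re} g'$ then, for all $\alpha, \beta \in \mathbb{C}$, $(\alpha f + \beta g) \mathrel{\Re} (\alpha f' + \beta g')$.

Problem 22: Prove the above Proposition.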

The quotient set $\mathcal{L}^2(E, \mu)/\Re$, where $\mathcal{L}^2(E, \mu)$ is the class of square-summable functions defined by (2.7), is denoted $L^2(E, \mu)$. The elements of $L^2(E, \mu)$ are equivalence classes of square-summable functions on $E$, and all functions in one class are µ-almost everywhere equal to one another. Let us (temporarily) denote by $[f]$ the equivalence class to which a function $f$ belongs; then $[f] = [g]$ if and only if $f \mathrel{\Re} g$. Let us define the sum of classes, and the product of a class by a complex number, as follows:

$$\alpha[f] + \beta[g] := [\alpha f + \beta g] .$$

This definition makes sense, because the rhs does not depend on which particular $f$ and $g$ are chosen in their respective classes $[f]$ and $[g]$, thanks to Proposition 14. $L^2(E, \mu)$ is then a vector space. The null vector 0 in this space is the class of functions which are equivalent to the identically vanishing function; in other words, it is the class of all functions which vanish µ-almost everywhere. Finally, let us define:

$$h([f], [g]) := \int_E d\mu(x)\, f(x)^* g(x) .$$

This is a good definition, because it does not depend on how $f$ and $g$ are chosen in their respective classes $[f]$ and $[g]$. Indeed, let for instance $f$ in the rhs be replaced by $f'$ such that $f' \mathrel{\Re} f$; doing so, the integrand is changed only at points in a set of 0 measure, so the integral does not change. Now $h$ is a scalar product, because, if $h([f], [f]) = 0$ then due to Proposition 13 $f$ is almost everywhere 0 and so $[f]$ is the null vector in $L^2(E, \mu)$. Therefore, $L^2(E, \mu)$ is a pre-Hilbert space, and the norm is defined by:

$$\|[f]\|^2 = \int_E d\mu(x)\, |f(x)|^2 . \qquad (2.10)$$

Note that if µ is the counting measure then each equivalence class is reduced to a single function and so there is no difference between $\mathcal{L}^2(E, \#)$ and $L^2(E, \#)$.


Theorem 21: $L^2(E, \mu)$ is a Hilbert space.

In the special case of $L^2(\mathbb{N}, \#) = \ell^2(\mathbb{N})$, this result has already been proven (Thm.10). The proof in the general case will not be given here. It rests in crucial ways on the theorems about interchanging limits and integrals that were presented in Sect.2.3.4.

The special cases that will be considered in the following are - besides the already mentioned spaces $\ell^2$ - the spaces $L^2(I)$ where $I \subseteq \mathbb{R}^n$ is a measurable set of positive measure, and the measure is the Lebesgue measure (and is left understood as usual). In such spaces vectors are not functions, but classes of functions. Nevertheless, to avoid cumbersome notations, no distinction will be made between vectors in $L^2(I)$ and functions in $\mathcal{L}^2(I)$ by which such vectors are represented - whenever not strictly necessary.

2.4.2 Convergence in the Mean.

In a Hilbert space $L^2(I)$, where $I \subseteq \mathbb{R}^n$, convergence is defined by the norm (2.10). A sequence $f_n$ of square-summable functions in $I$ is said to converge in quadratic mean to a function $f_\infty$ if

$$\lim_{n\to\infty} \int_I dx\, |f_n(x) - f_\infty(x)|^2 = 0 . \qquad (2.11)$$

Let us compare this type of convergence with the more familiar "convergence almost everywhere". Consider the functions $f_n(x)$ which were introduced in (2.6). As we have seen, they converge almost everywhere to the function $f_\infty(x) \equiv 0$. Every one of them is square-summable over $I \equiv \mathbb{R}$ and the lhs in (2.11) is in this case given by:

$$\int_{\mathbb{R}} dx\, |f_n(x) - f_\infty(x)|^2 = \int_{\mathbb{R}} dx\, f_n(x)^2 = \frac{n}{\sqrt{2\pi}} ,$$

so instead of tending to 0 in the limit $n \to \infty$, it diverges to $+\infty$. Hence this example shows that convergence almost-everywhere does not imply convergence in quadratic mean.

Now let $I = [0, 1]$ and define a sequence of intervals $J_0, J_1, \ldots$ as follows. First set $J_0 = I$; then divide $J_0$ in 2 equal parts and set $J_1 = [0, \frac{1}{2}]$, and $J_2 = [\frac{1}{2}, 1]$. Next divide again both $J_1$ and $J_2$ in two equal parts, and set $J_3 = [0, \frac{1}{4}]$, $J_4 = [\frac{1}{4}, \frac{1}{2}]$, $J_5 = [\frac{1}{2}, \frac{3}{4}]$, $J_6 = [\frac{3}{4}, 1]$. Continuing in this way by successive bisection we construct a sequence of intervals $J_n$. They come in "generations": the intervals in the $k$-th generation are the intervals that are obtained on dividing $[0, 1]$ in $2^k$ equal parts of width $2^{-k}$, and are numbered from left to right from $n = 2^k - 1$ to $n = 2^{k+1} - 2$. Let, for all $n$, $f_n(x) = \chi_{J_n}(x)$. Then:

$$\int_I dx\, |f_n(x)|^2 = \int_{J_n} dx\, 1 = |J_n| .$$

The measure $|J_n|$ of $J_n$ tends to 0 as $n \to \infty$, so the last equation shows that the sequence $f_n$ converges in quadratic mean to $f_\infty(x) \equiv 0$. Nevertheless, for no $x \in [0, 1]$ does the sequence $f_n(x)$ have a limit for $n \to \infty$. To see this, fix an $x$ and note that in every generation there is one interval that contains the chosen $x$, so the corresponding $f_n(x) = 1$. The $f_n$ that follow will vanish at $x$, but eventually, as $n$ increases, the next generation will be entered, and there


again an interval will be met that contains $x$. It is thus seen that $f_n(x)$ takes both values 0 and 1 infinitely often as $n \to \infty$. This sequence is sometimes called "the typewriter sequence". It shows that convergence in quadratic mean does not imply pointwise convergence almost everywhere. This conclusion is somewhat softened by the following theorem of Weyl:

Theorem 22: If a sequence $f_n$ of square-summable functions on $I \subseteq \mathbb{R}^n$ converges in quadratic mean in $I$ to a function $f_\infty$, then it has a subsequence that converges pointwise to the same limit almost everywhere.

The proof is omitted.
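The behaviour of the typewriter sequence can be made concrete with a short computation. This is a sketch under stated assumptions: the function names are hypothetical, the intervals are numbered as in the text (generation $k$ holds $n = 2^k - 1, \ldots, 2^{k+1} - 2$), and exact rational arithmetic is used so that membership in $J_n$ is decided without rounding.

```python
from fractions import Fraction

def typewriter_interval(n):
    """Endpoints (left, right) of J_n, numbered as in the text."""
    k = (n + 1).bit_length() - 1      # generation index
    j = n - (2 ** k - 1)              # position inside generation k
    width = Fraction(1, 2 ** k)
    return j * width, (j + 1) * width

def f(n, x):
    """Characteristic function of J_n evaluated at x."""
    left, right = typewriter_interval(n)
    return 1 if left <= x <= right else 0

x = Fraction(1, 3)                    # sample point, chosen for illustration
generations = 10
values = [f(n, x) for n in range(2 ** generations - 1)]

# Number of indices n in each generation for which f_n(x) = 1.
ones_per_generation = [
    sum(values[2 ** k - 1 : 2 ** (k + 1) - 1]) for k in range(generations)
]
last_left, last_right = typewriter_interval(2 ** generations - 2)

print(ones_per_generation)
print(float(last_right - last_left))  # squared norm |J_n| of the last f_n
```

Every generation contributes an index with $f_n(x) = 1$, while all other indices in that generation give 0; at the same time the squared norm $|J_n|$ keeps halving from one generation to the next.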

Problem 23: Find subsequences of the typewriter sequence, that converge almost everywhere.


3. ELEMENTARY THEORY OF HILBERT SPACES.

Throughout the following, the letter H will denote a generic Hilbert space.

3.1 Orthogonal projections.

Reminder from Linear Algebra.

A vector subspace of a vector space $X$ is a subset $V \subseteq X$ such that $x, y \in V$ and $\alpha, \beta \in \mathbb{K}$ imply that $\alpha x + \beta y \in V$. A vector subspace of $X$ is itself a vector space, with the same operations as in $X$. If $A$ is a subset of $X$, the subspace spanned by $A$ in $X$ is the set of all the vectors in $X$ that can be obtained by linear combinations of vectors in $A$. It will be denoted by $\mathcal{V}(A)$. It is the smallest subspace of $X$ that contains $A$ as a subset.

3.1.1 Hilbert subspaces.

Every Hilbert space $H$ is a vector space. Let $V$ be a vector subspace of $H$. It is a vector space itself, and it has a scalar product. So, is it a Hilbert space? Not necessarily. For that, it ought to be complete, that is, all Cauchy sequences of vectors in $V$ should have a limit in $V$. Now, every such sequence certainly has a limit in $H$, because $H$ is complete; however, this limit may not be in $V$, because a subspace $V$ may not be closed.¹ Here is a counter-example. Let $H = \ell^2(\mathbb{N})$, and let $V$ be the subset that consists of all those vectors $x \in \ell^2(\mathbb{N})$, the $n$-th component $x(n)$ of which is eventually 0 as $n \to \infty$. For instance, the vector with $x(n) = 1$ for $1 \le n \le 10$, and $x(n) = 0$ for all $n > 10$ is in $V$; in contrast, the vector with components $x(n) = 1/n$ is not in $V$. The set $V$ is clearly a vector subspace of $\ell^2(\mathbb{N})$. Let $z$ be an arbitrary vector in $\ell^2(\mathbb{N})$, and for every integer $N$ let $x_N$ be the vector that has components given by $x_N(n) = z(n)$ if $1 \le n \le N$, and by $x_N(n) = 0$ otherwise. Every $x_N$ is obviously in $V$. Then compute:

$$\|x_N - z\|^2 = \sum_{n=1}^{+\infty} |x_N(n) - z(n)|^2 = \sum_{n=N+1}^{+\infty} |z(n)|^2 .$$

The last series is the so-called $N$-th remainder of the convergent series $\sum_{n=1}^{\infty} |z(n)|^2$. It is an elementary fact that it tends to 0 as $N \to \infty$. So, by choosing a suitably large integer $N$, we can make the distance between $x_N$ and $z$ as small as we like. Since $x_N \in V$, but $z$ is arbitrary in $\ell^2(\mathbb{N})$, this means that $V$ is a dense subset of $\ell^2(\mathbb{N})$. Therefore, its closure $\overline{V}$ is the whole of $\ell^2(\mathbb{N})$, so it is strictly larger than $V$. So $V$ is not a closed set.
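As a numerical illustration of the counter-example, take the vector $z(n) = 1/n$, which the text mentions as not belonging to $V$. The sketch below assumes the known value $\sum_{n \ge 1} 1/n^2 = \pi^2/6$ to evaluate the $N$-th remainder; the function name is a hypothetical choice.

```python
import math

def distance_squared(N):
    """||x_N - z||^2 for z(n) = 1/n, i.e. the N-th remainder of sum 1/n^2."""
    return math.pi ** 2 / 6 - sum(1.0 / n ** 2 for n in range(1, N + 1))

cutoffs = [1, 10, 100, 1000, 10000]
distances = [distance_squared(N) for N in cutoffs]

for N, d2 in zip(cutoffs, distances):
    print(N, d2)
```

The squared distances decrease roughly like $1/N$: the truncated vectors $x_N$, all of which lie in $V$, approach a vector that is not in $V$.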

¹ The notion of closed set is assumed known from the elementary theory of metric spaces.


Definition 21: A Hilbert subspace of a Hilbert space $H$ is a vector subspace of $H$ that is a closed set in the metric of $H$.

Proposition 15: Let $V$ be a subspace in $H$. Its closure $\overline{V}$ is still a subspace, so it is a closed subspace.

Proof: let $\alpha, \beta \in \mathbb{K}$, and $x, y \in \overline{V}$. Then $x$ and $y$ are either vectors in $V$, or else they are limit points of $V$. In both cases, sequences $\{x_n\}$ and $\{y_n\}$ of vectors in $V$ exist, such that $\lim_{n\to\infty} x_n = x$ and $\lim_{n\to\infty} y_n = y$; so $\alpha x + \beta y = \lim_{n\to\infty} (\alpha x_n + \beta y_n)$ because vector operations are continuous. As $V$ is a vector subspace, $\alpha x_n + \beta y_n$ is a vector in $V$, $\forall n$, so the limit $\alpha x + \beta y$ is in $\overline{V}$. □

In particular, if $A \subseteq H$ then $\overline{\mathcal{V}(A)}$ is a closed subspace. It is called the closed subspace spanned by $A$.

Definition 22: Two vectors in $H$ are orthogonal, $x \perp y$, if $\langle x | y \rangle = 0$. A vector $x$ is orthogonal to a subset $A \subseteq H$, $x \perp A$, if $x$ is orthogonal to all $y \in A$. Two subsets $A$ and $B$ in $H$ are orthogonal, $A \perp B$, if $x \in A$ and $y \in B$ entails $x \perp y$. The set of all vectors $x$ that are orthogonal to a set $A$ is called the orthogonal complement $A^\perp$ of $A$.

Problem 24: Let $I \subseteq \mathbb{R}^n$ be a measurable set and let $S_I$ be the set of those vectors in $L^2(\mathbb{R}^n)$ which are represented by functions that vanish outside $I$. Show that $S_I$ is a closed subspace. (Use Thm.22 to prove closure.)

Problem 25: Show that if $x \perp A$ then $x \perp \overline{\mathcal{V}(A)}$. (By linearity and continuity of scalar products...)

Proposition 16: For every $A \subseteq H$, $A^\perp$ is a closed subspace in $H$.

Problem 26: Prove the above Proposition.

3.1.2 The Projection theorem.

Definition 23: Let $X$ be a metric space, $A \subseteq X$, and $x \in X$. The distance $d(x, A)$ of $x$ from $A$ is defined by:

$$d(x, A) := \inf\{d(x, y) : y \in A\} . \qquad (3.1)$$

Note that $d(x, A)$ may not be equal to the distance of $x$ from a point in $A$; e.g., if $X = \mathbb{R}$, then $d(x, \mathbb{Q}) = 0$ even when $x \notin \mathbb{Q}$.

Theorem 23: Let $S$ be a closed subspace in a Hilbert space $H$, and let $x$ be an arbitrary vector in $H$. There is one (and just one) vector $x_0 \in S$ such that $d(x, x_0) = d(x, S)$. It is the unique vector in $S$ such that $x - x_0 \perp S$. It is called the orthogonal projection of $x$ on $S$, and $\|x_0\| \le \|x\|$.


[Figure: the vector $x$, its projection $x_0$, the subspace $S$, and the origin 0.]

Fig. 3.1: The vector $x_0$ is the orthogonal projection of the vector $x$ on the closed subspace $S$ in $\mathbb{R}^2$.

Proof: for simplicity let us denote $d \equiv d(x, S)$. By definition of infimum, for every integer $n$ there is $x_n \in S$ such that:

$$d \le d(x, x_n) \equiv \|x - x_n\| < d + \frac{1}{n} . \qquad (3.2)$$

It will be shown that the sequence $\{x_n\}$ is a Cauchy sequence. First, using the Parallelogram Identity, one can write:

$$\|x_n - x_m\|^2 = \|(x_n - x) + (x - x_m)\|^2 = 2\|x_n - x\|^2 + 2\|x_m - x\|^2 - \|(x_n - x) + (x_m - x)\|^2 . \qquad (3.3)$$

Next note that:

$$\|(x_n - x) + (x_m - x)\|^2 = 4\left\|x - \tfrac{1}{2}(x_n + x_m)\right\|^2 \ge 4d^2 ,$$

because $S$ is a subspace, and so $\frac{1}{2}(x_n + x_m) \in S$. Hence,

$$\|x_n - x_m\|^2 \le 2\left(d + \frac{1}{n}\right)^2 + 2\left(d + \frac{1}{m}\right)^2 - 4d^2 = \frac{4d}{n} + \frac{4d}{m} + \frac{2}{n^2} + \frac{2}{m^2} ,$$

and the rhs tends to 0 when $n, m \to +\infty$. Therefore $\{x_n\}$ is a Cauchy sequence so it has a limit $x_0$ in $H$, because $H$ is a Hilbert, hence complete, space. As $x_n \in S$ for all $n$, $x_0 \in S$ because $S$ is closed. Taking the limit $n \to \infty$ in (3.2) and using that norms are continuous we find that $d(x, x_0) = d$.

Let us show that $x - x_0 \perp S$. If $\lambda \in \mathbb{C}$ and $y \in S$ then $x_0 + \lambda y \in S$, and by definition of distance:

$$\|(x - x_0) - \lambda y\|^2 = d(x, x_0 + \lambda y)^2 \ge d(x, x_0)^2 = d^2 .$$


Expanding the lhs, we find:

$$\|x - x_0\|^2 + |\lambda|^2 \|y\|^2 - 2\,\mathrm{Re}\{\lambda \langle x - x_0 | y \rangle\} \ge d^2 ,$$

and so, recalling that $\|x - x_0\| = d$,

$$|\lambda|^2 \|y\|^2 - 2\,\mathrm{Re}\{\lambda \langle x - x_0 | y \rangle\} \ge 0 .$$

As this is true for arbitrarily chosen $\lambda \in \mathbb{C}$, we can in particular (for $y \ne 0$) choose $\lambda = \langle y | x - x_0 \rangle / \|y\|^2$ and then we find:

$$|\langle y | x - x_0 \rangle|^2 \le 0 .$$

The lhs cannot be negative, so the latter inequality imposes $\langle y | x - x_0 \rangle = 0$. Hence $x - x_0 \perp S$ because $y$ is arbitrary in $S$.

Next we show that given $x$ there is only one such $x_0$. From $x - x_0 \perp S$ and $x - x_0' \perp S$ it follows that $x_0 - x_0' = (x - x_0') - (x - x_0) \perp S$ and so, being itself a vector in $S$, $x_0 - x_0'$ must be self-orthogonal. Thus $x_0 - x_0' = 0$.

Finally, for arbitrary $x$ we can write:

$$\|x\|^2 = \|(x - x_0) + x_0\|^2 = \|x - x_0\|^2 + \|x_0\|^2$$

because $x - x_0 \perp x_0$; so $\|x\|^2 \ge \|x_0\|^2$. □
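The statements of Theorem 23 can be checked by hand in the simplest case, where $S$ is the line spanned by a single vector $y$. The following sketch assumes $H = \mathbb{C}^3$ with the scalar product antilinear in the first argument, and uses the coefficient $\langle y | x \rangle / \|y\|^2$ that appears in the proof; the sample vectors and function names are hypothetical.

```python
def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

def project_on_line(x, y):
    """Orthogonal projection of x on the subspace S = {lam * y}."""
    coeff = inner(y, x) / inner(y, y)
    return [coeff * c for c in y]

x = [2 + 1j, -1 + 0j, 0.5j]
y = [1 + 0j, 1j, 1 - 1j]

x0 = project_on_line(x, y)
residual = [a - b for a, b in zip(x, x0)]
d = norm(residual)

# Distances from x to a few other points lam * y of S (hypothetical sample).
trial_lams = [0, 1, -1, 1j, 0.3 - 0.2j, 2 + 2j]
trial_distances = [norm([a - lam * c for a, c in zip(x, y)]) for lam in trial_lams]

print(abs(inner(y, residual)))    # orthogonality of x - x0 to S
print(d, min(trial_distances))    # d is not exceeded from below by the trials
print(norm(x0), norm(x))          # the projection is not longer than x
```

The residual $x - x_0$ is orthogonal to $y$ up to rounding, no trial point of $S$ is closer to $x$ than $x_0$, and $\|x\|^2 = \|x - x_0\|^2 + \|x_0\|^2$ holds as in the last step of the proof.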

3.1.3 Decomposition theorem.

Theorem 24: Let $S$ be a closed subspace of $H$, and let $S^\perp$ be the orthocomplement of $S$. For every $x \in H$ there are a unique $u \in S$ and a unique $v \in S^\perp$, such that $x = u + v$. The vector $u$ is the orthogonal projection of $x$ on $S$, and the vector $v$ is the orthogonal projection of $x$ on $S^\perp$.

Proof: let $u$ be the orthogonal projection of $x$ on $S$ and let $v := x - u$. Then $v \perp S$, and if $y \in S^\perp$ then:

$$\langle x - v | y \rangle = \langle u | y \rangle = 0 ,$$

so $v$ is the orthogonal projection of $x$ on $S^\perp$, and $x = u + v$. Let at the same time $x = u' + v'$ with $u' \in S$ and $v' \in S^\perp$; then $u - u' = v' - v$; however $u - u' \in S$ and $v - v' \in S^\perp$, so both $u - u'$ and $v - v'$ are vectors in $S \cap S^\perp$. The only such vector is 0. □

3.2 Hilbert bases.

Reminder from Linear Algebra.

- $n$ vectors $x_1, x_2, \ldots, x_n$ in a vector space $X$ are linearly independent if the unique $n$-tuple of scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that $\alpha_1 x_1 + \alpha_2 x_2 + \ldots + \alpha_n x_n = 0$ is the $n$-tuple $\alpha_1 = \alpha_2 = \ldots = \alpha_n = 0$. An arbitrary set $A$ of vectors in $X$ is said to be a linearly independent set if every finite family of distinct vectors taken from $A$ is linearly independent.


- $X$ has finite dimension $N$ if $N$ is the maximum number of linearly independent vectors that can be found in $X$. $X$ is said to have infinite dimension if it doesn't have finite dimension, that is, one can find $N$-tuples of linearly independent vectors for every integer $N$.

- a vector basis of $X$ is a linearly independent set $B \subset X$, such that $X = \mathcal{V}(B)$. If $X$ has finite dimension $N$, then it has a vector basis which consists of exactly $N$ vectors. If $X$ has infinite dimension, existence of a vector basis is a highly nontrivial fact, that is proven by using the axiom of choice, and so in a non-constructive way.

In a space of dimension ∞, the notion of a vector basis is thus of little practical utility. Therefore a different notion of a basis is introduced, which makes use of linear combinations of infinitely many vectors. To this end, the sum of infinitely many vectors must be given a meaning, so one has to introduce in the vector space a notion of convergence. In the case of Hilbert spaces, this leads to the concept of a Hilbert basis.

3.2.1 Orthonormal systems.

An orthonormal set of vectors in a Hilbert space $H$ is a set $T$ of vectors, such that if $x, y \in T$ then $\langle x | y \rangle = 0$ whenever $x \ne y$ and $\langle x | y \rangle = 1$ whenever $x = y$; hence, $\|x\| = 1$ for all $x \in T$.

Proposition 17: Every orthonormal set is a linearly independent set.

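Proof: let $T$ be an orthonormal set, $x_1, x_2, \ldots, x_n \in T$, and $\alpha_1 x_1 + \ldots + \alpha_n x_n = 0$. Then:

$$0 = \Big\langle x_j \Big| \sum_{k=1}^{n} \alpha_k x_k \Big\rangle = \sum_{k=1}^{n} \alpha_k \langle x_j | x_k \rangle = \alpha_j$$

for all $1 \le j \le n$. □

The converse is not true in general. However: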

Theorem 25: Let $F = \{x_1, x_2, \ldots\}$ be a finite or countable linearly independent family of vectors. Define:

$$e_1 = \frac{x_1}{\|x_1\|} , \qquad e_2 = \frac{x_2 - \langle e_1 | x_2 \rangle e_1}{\|x_2 - \langle e_1 | x_2 \rangle e_1\|} , \qquad \ldots , \qquad e_{n+1} = \frac{x_{n+1} - \sum_{j=1}^{n} \langle e_j | x_{n+1} \rangle e_j}{\left\|x_{n+1} - \sum_{j=1}^{n} \langle e_j | x_{n+1} \rangle e_j\right\|} , \qquad \ldots \qquad (3.4)$$

Then, $\forall n$, $x_1, \ldots, x_n$ and $e_1, \ldots, e_n$ generate the same subspace; the set $\{e_j\}$ is an orthonormal set; and the subspace that is generated by all the vectors $x_n$ coincides with the subspace that is generated by all the vectors $e_n$.


[Figure: the vectors $x$, $x'$, $e_1$, $e_2$ and $e_3$.]

Fig. 3.2: The vector $x'$ is the best approximation of the vector $x$ by means of the orthonormal system $\{e_1, e_2\}$.

Proof: that, $\forall n$, $\mathcal{V}(\{e_1, \ldots, e_n\}) = \mathcal{V}(\{x_1, \ldots, x_n\})$ will be proven by induction over $n$. First, $\|x_1\| \ne 0$, because no set that contains 0 can be linearly independent, whereas the set of the $x_n$ is a linearly independent set. Then $\mathcal{V}(x_1) = \mathcal{V}(e_1)$ is obvious. Assume that $\mathcal{V}(\{e_1, \ldots, e_n\}) = \mathcal{V}(\{x_1, \ldots, x_n\})$; then the numerator on the rhs in the $(n+1)$-th equation cannot be in $\mathcal{V}(\{e_1, \ldots, e_n\}) = \mathcal{V}(\{x_1, \ldots, x_n\})$, otherwise also $x_{n+1}$ would be in $\mathcal{V}(\{e_1, \ldots, e_n\}) = \mathcal{V}(\{x_1, \ldots, x_n\})$. In particular it cannot be 0 so the divisor cannot be 0. Then $e_{n+1}$ is in $\mathcal{V}(\{x_1, \ldots, x_{n+1}\})$ because it is a combination of $x_{n+1}$ and of $e_1, \ldots, e_n$, and thanks to the inductive assumption every $e_j$ with $j \le n$ is in $\mathcal{V}(\{x_1, \ldots, x_n\})$. Orthonormality is proven by a direct calculation. □

This method of constructing an orthonormal set starting with a linearly independent set, finite or countable, is called Schmidt orthonormalization.

Throughout the following, an "orthonormal system" is a finite or countable orthonormal set.
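The recursion (3.4) translates almost literally into code. The sketch below is one possible transcription for vectors in $\mathbb{C}^n$ stored as Python lists, with the scalar product antilinear in the first argument; the function names, the tolerance `1e-12`, and the sample vectors are assumptions made for illustration.

```python
def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

def schmidt(vectors):
    """Orthonormalize a linearly independent family, following (3.4)."""
    basis = []
    for x in vectors:
        r = list(x)
        for e in basis:
            c = inner(e, x)                      # coefficient <e_j|x_{n+1}>
            r = [ri - c * ei for ri, ei in zip(r, e)]
        length = norm(r)
        if length < 1e-12:                       # numerator of (3.4) vanishes
            raise ValueError("the family is not linearly independent")
        basis.append([ri / length for ri in r])
    return basis

family = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]      # hypothetical sample in C^3
e = schmidt(family)
gram = [[inner(u, v) for v in e] for u in e]
for row in gram:
    print([round(abs(entry), 12) for entry in row])
```

The printed Gram matrix $\langle e_j | e_k \rangle$ should be the identity up to rounding, which is the "direct calculation" mentioned at the end of the proof. For large or nearly dependent families this plain form of the recursion is known to lose accuracy in floating point, so it should be read as an illustration of (3.4) rather than as a robust numerical routine.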

3.2.2 Best Approximation.

The Problem of the best approximation is stated as follows:

Given:

• an orthonormal system $T = \{e_1, e_2, \ldots\}$ in $H$,

• a vector $x \in H$,

• an integer $n$,

find:


• $n$ scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$, so that:

$$\Big\|x - \sum_{j=1}^{n} \alpha_j e_j\Big\| \qquad (3.5)$$

has the least possible value.

It is understood that $n$ cannot exceed the number of vectors in $T$ if $T$ is finite. Once all data have been specified, the quantity to be made a minimum is a function of the $n$ scalars $\alpha_1, \ldots, \alpha_n$, so the problem is that of minimizing a function of $n$ real or complex variables.

Theorem 26: The problem of the best approximation is solved by:

$$\alpha_j = \langle e_j | x \rangle , \qquad 1 \le j \le n ,$$

and the minimum value of (3.5) is given by:

$$\sqrt{\|x\|^2 - \sum_{j=1}^{n} |\langle e_j | x \rangle|^2} .$$

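Proof:

$$\Big\|x - \sum_{j=1}^{n} \alpha_j e_j\Big\|^2 = \Big\langle x - \sum_{j=1}^{n} \alpha_j e_j \Big| x - \sum_{j=1}^{n} \alpha_j e_j \Big\rangle = \|x\|^2 + \sum_{j=1}^{n} \sum_{k=1}^{n} \alpha_j^* \alpha_k \langle e_j | e_k \rangle - \sum_{j=1}^{n} \{\alpha_j^* \langle e_j | x \rangle + \alpha_j \langle e_j | x \rangle^*\}$$

$$= \|x\|^2 + \sum_{j=1}^{n} \{|\alpha_j|^2 - \alpha_j^* \langle e_j | x \rangle - \alpha_j \langle e_j | x \rangle^*\} . \qquad (3.6)$$

Next note that, for two complex numbers $a, b$:

$$a^* a - b^* a - b a^* = |a - b|^2 - |b|^2 . \qquad (3.7)$$

Letting $a = \alpha_j$, $b = \langle e_j | x \rangle$ in this identity, and substituting in (3.6):

$$\Big\|x - \sum_{j=1}^{n} \alpha_j e_j\Big\|^2 = \|x\|^2 - \sum_{j=1}^{n} |\langle e_j | x \rangle|^2 + \sum_{j=1}^{n} |\alpha_j - \langle e_j | x \rangle|^2 .$$

This expression depends on the $\alpha_j$ only through the last addendum on the rhs, which is never negative. So the minimum of this expression is attained when this addendum vanishes. This directly yields the claimed result. □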
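Theorem 26 can be tried out on a small case. The sketch below assumes $H = \mathbb{C}^3$, a hypothetical orthonormal system of two vectors, and a hypothetical vector $x$; it compares the error at the coefficients $\alpha_j = \langle e_j | x \rangle$ with the value predicted by the theorem and with the error at a few perturbed coefficients.

```python
import math

def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def error(x, system, alphas):
    """The quantity (3.5): || x - sum_j alpha_j e_j ||."""
    approx = [sum(a * e[i] for a, e in zip(alphas, system)) for i in range(len(x))]
    diff = [xi - ai for xi, ai in zip(x, approx)]
    return math.sqrt(abs(inner(diff, diff)))

s = 1 / math.sqrt(2)
system = [[s, s, 0], [s, -s, 0]]     # hypothetical orthonormal system {e_1, e_2}
x = [1, 2j, 3]                       # hypothetical vector to approximate

fourier = [inner(e, x) for e in system]
best_error = error(x, system, fourier)
predicted = math.sqrt(abs(inner(x, x)) - sum(abs(c) ** 2 for c in fourier))

# Any other choice of coefficients should do worse.
perturbed = [
    error(x, system, [fourier[0] + delta, fourier[1]])
    for delta in (0.1, -0.1, 0.1j)
]

print(best_error, predicted)
print(perturbed)
```

In this example the two vectors of the system span the first two coordinates, so the best approximation simply drops the third component of $x$ and the minimal error equals its modulus.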


Proposition 18: Let $T$ be an orthonormal system in a Hilbert space $H$. Then $\forall x \in H$ the Bessel inequality holds:

$$\sum_{n=1}^{\#(T)} |\langle e_n | x \rangle|^2 \le \|x\|^2 . \qquad (3.8)$$

Proof: if $\#(T) < +\infty$, then Thm.26 used with $n = \#(T)$ shows that the difference between the rhs and the lhs cannot be negative, as it is the minimum value of a non-negative quantity. If $\#(T) = +\infty$ the inequality is true for the same reason, as soon as $\#(T)$ is replaced by any integer $N$. It follows that the series with non-negative terms in the lhs of (3.8) converges, and its sum satisfies (3.8). □

3.2.3 Generalized Fourier series.

Proposition 19: Let $T = \{e_1, e_2, \ldots\}$ be an infinite orthonormal system in a Hilbert space $H$, and let $\{\alpha_j\}$ be a sequence of scalars. The series (of vectors) $\sum_{j=1}^{\infty} \alpha_j e_j$ converges in $H$ if, and only if, $\sum_{j=1}^{\infty} |\alpha_j|^2 < +\infty$.

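Proof: for $n$ integer let $\sigma_n = \sum_{j=1}^{n} \alpha_j e_j$. By definition, the series $\sum_{j=1}^{\infty} \alpha_j e_j$ converges if and only if the sequence $\{\sigma_n\}$ converges in $H$, hence if and only if this sequence is a Cauchy sequence. For $m > n$ integers:

$$\|\sigma_m - \sigma_n\|^2 = \Big\langle \sum_{j=n+1}^{m} \alpha_j e_j \Big| \sum_{j=n+1}^{m} \alpha_j e_j \Big\rangle = \sum_{j=n+1}^{m} \sum_{k=n+1}^{m} \alpha_j^* \alpha_k \langle e_j | e_k \rangle = \sum_{j=n+1}^{m} |\alpha_j|^2 = \sum_{j=1}^{m} |\alpha_j|^2 - \sum_{j=1}^{n} |\alpha_j|^2 ; \qquad (3.9)$$

so the sequence of the vectors $\sigma_j \in H$ is Cauchy if and only if the sequence of the partial sums of the numerical series $\sum_{j=1}^{\infty} |\alpha_j|^2$ is Cauchy. □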
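The criterion of Proposition 19 can be seen at work on two sequences of coefficients. This is a minimal sketch, assuming that the $e_j$ are the canonical vectors of $\ell^2(\mathbb{N})$, truncated to finitely many components; the names `partial_sum`, `cauchy_gap` and the two coefficient sequences are illustrative choices.

```python
def partial_sum(alpha, n, dim):
    """sigma_n = sum_{j<=n} alpha(j) e_j, stored as a list of dim components."""
    return [alpha(j) if j <= n else 0.0 for j in range(1, dim + 1)]

def dist2(u, v):
    """Squared distance ||u - v||^2."""
    return sum(abs(a - b) ** 2 for a, b in zip(u, v))

def cauchy_gap(alpha, n):
    """||sigma_{2n} - sigma_n||^2, computed from the vectors themselves."""
    dim = 2 * n
    return dist2(partial_sum(alpha, 2 * n, dim), partial_sum(alpha, n, dim))

def square_summable(j):
    return 1.0 / j               # sum of |alpha_j|^2 converges

def not_square_summable(j):
    return 1.0 / j ** 0.5        # sum of |alpha_j|^2 is the harmonic series

sizes = [10, 100, 1000]
gaps_good = [cauchy_gap(square_summable, n) for n in sizes]
gaps_bad = [cauchy_gap(not_square_summable, n) for n in sizes]

print(gaps_good)
print(gaps_bad)
```

For $\alpha_j = 1/j$ the gaps $\|\sigma_{2n} - \sigma_n\|^2$ shrink to 0, as (3.9) predicts; for $\alpha_j = 1/\sqrt{j}$ they stay above $1/2$, so the partial sums are not a Cauchy sequence and the series of vectors does not converge.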

Theorem 27: Let $T = \{e_1, e_2, \ldots\}$ be an infinite orthonormal system in a Hilbert space $H$ and let $x$ be an arbitrary vector in $H$. The series of vectors in $H$:

$$\sum_{n=1}^{+\infty} \langle e_n | x \rangle e_n \qquad (3.10)$$

always converges in $H$. Its sum is the orthogonal projection of $x$ on the closed subspace $\overline{\mathcal{V}(T)}$.

Proof: convergence of the series immediately follows from Propositions 19 and 18. Let $x'$ be its sum, and $\sigma_N$ its $N$-th partial sum. Then $\sigma_N \in \mathcal{V}(T)$ and $x' = \lim_{N\to\infty} \sigma_N$ so $x' \in \overline{\mathcal{V}(T)}$.


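$\forall n$,

$$\langle x - x' | e_n \rangle = \langle x | e_n \rangle - \langle \lim_{N\to\infty} \sigma_N | e_n \rangle = \langle x | e_n \rangle - \lim_{N\to\infty} \langle \sigma_N | e_n \rangle = \langle x | e_n \rangle - \lim_{N\to\infty} \sum_{j=1}^{N} \langle x | e_j \rangle \langle e_j | e_n \rangle = 0 , \qquad (3.11)$$

where continuity of scalar products has been used to move $\lim_{N\to\infty}$ out of the product. Therefore $x - x' \perp T$ and so $x - x' \perp \overline{\mathcal{V}(T)}$ (Problem 25). This proves that $x'$ is the orthogonal projection of $x$ on $\overline{\mathcal{V}(T)}$. □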

3.2.4 Completeness.

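Theorem 28: The following properties of an orthonormal system $T = \{e_1, e_2, \ldots\}$ are equivalent, in the sense that every one of them implies all the others:

1. $\forall x \in H$,

$$x = \sum_{n=1}^{\#(T)} \langle e_n | x \rangle e_n ,$$

2. the Parseval identity: for all $x, y \in H$,

$$\langle x | y \rangle = \sum_{n=1}^{\#(T)} \langle x | e_n \rangle \langle e_n | y \rangle ,$$

3. the Bessel identity: $\forall x \in H$

$$\|x\|^2 = \sum_{n=1}^{\#(T)} |\langle e_n | x \rangle|^2 ,$$

4. if $z \in H$ and $z \perp T$ then $z = 0$;

5. $\overline{\mathcal{V}(T)} = H$.

Proof: It is sufficient to prove that 1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 5 ⇒ 1.

(1) ⇒ (2):

$$\langle x | y \rangle = \Big\langle x \Big| \sum_{n=1}^{\#(T)} \langle e_n | y \rangle e_n \Big\rangle = \sum_{n=1}^{\#(T)} \langle x | e_n \rangle \langle e_n | y \rangle . \qquad (3.12)$$

If $\#(T) = +\infty$, the 2nd step uses continuity besides linearity of scalar products.

(2) ⇒ (3): just take $x = y$.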


(3) ⇒ (4): if $z \perp T$ then $\langle e_n | z \rangle = 0$, $\forall n$, so from (3) it follows that $\|z\|^2 = 0$.

(4) ⇒ (5): thanks to the decomposition theorem (Thm.24) every vector $x$ is the sum of a vector in $\overline{\mathcal{V}(T)}$ and a vector in $\overline{\mathcal{V}(T)}^\perp$. Due to (4), $\overline{\mathcal{V}(T)}^\perp$ contains only the 0 vector, hence $\overline{\mathcal{V}(T)} = H$;

(5) ⇒ (1): if (5) holds, then every $x \in H$ coincides with its own projection on $\overline{\mathcal{V}(T)}$ so (1) is true thanks to Thm.27. □
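The equivalent properties can be watched failing and holding together in a finite-dimensional toy case. The sketch below assumes $H = \mathbb{C}^3$ and compares a hypothetical orthonormal system of two vectors with the system obtained by adding a third one; names and sample data are illustrative.

```python
import math

def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def bessel_sum(x, system):
    """Left-hand side of the Bessel identity: sum_n |<e_n|x>|^2."""
    return sum(abs(inner(e, x)) ** 2 for e in system)

def expansion(x, system):
    """The sum of <e_n|x> e_n over the system, as in property 1."""
    return [sum(inner(e, x) * e[i] for e in system) for i in range(len(x))]

s = 1 / math.sqrt(2)
e1, e2, e3 = [s, s, 0], [s, -s, 0], [0, 0, 1]
incomplete = [e1, e2]
complete = [e1, e2, e3]

x = [1, 2j, 3]
norm2 = abs(inner(x, x))

print(bessel_sum(x, incomplete), bessel_sum(x, complete), norm2)
print(expansion(x, incomplete))
print(expansion(x, complete))
```

With the incomplete system the Bessel sum stays strictly below $\|x\|^2$, the expansion misses the third component of $x$, and $e_3$ is a non-zero vector orthogonal to the system (property 4 fails); with the complete system all three defects disappear at once.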

Definition 24: An orthonormal system $T$ in $H$ is complete if it satisfies one, and hence all, of the properties 1-5 in Thm.28.

A complete orthonormal system (cons) is also called a Hilbert basis.

Definition 25: Let $T$ be a cons in $H$. For $x \in H$, the series (3.10) is called the generalized Fourier expansion of $x$ on the basis $T$, and the coefficients $\langle e_j | x \rangle$ are called the generalized Fourier coefficients of $x$ on the basis $T$.

3.2.5 Hilbert bases: examples.

Finite-dimensional spaces.

In every Hilbert space of finite dimension, orthonormalization of a vector basis always yields a Hilbert basis. In the space $\mathbb{C}^n$ (and in $\mathbb{R}^n$ as well) the "canonical basis" $\{e_1, \ldots, e_n\}$ is defined by $e_j(k) = 0$ if $j \ne k$ and $e_j(k) = 1$ if $j = k$. It is a cons.

ℓ2- spaces.

In $\ell^2(\mathbb{N})$, and in $\ell^2(\mathbb{Z}^d)$, the infinitely many vectors $e_j$ that are defined (for $j \in \mathbb{N}$ and for $j \in \mathbb{Z}^d$ respectively) by $e_j(n) = 0$ if $j \ne n$ and $e_j(j) = 1$ are a cons.

Problem 27: Prove the last statement (to prove completeness use (4) in Thm.28).

The Fourier basis.

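Let $I = [a, b]$ be a bounded interval in $\mathbb{R}$, of length $L = b - a$. For $n \in \mathbb{Z}$ define a function $e_n : I \to \mathbb{C}$ as follows:

$$e_n(x) = \frac{1}{\sqrt{L}}\, e^{in\omega x} , \qquad \omega = \frac{2\pi}{L} . \qquad (3.13)$$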

Theorem 29: The vectors in $L^2(I)$ which are represented by the functions $e_n(x)$ are a cons in $L^2(I)$.

Proof: only orthonormality (and not completeness) will be proven here:

$$\langle e_n | e_m \rangle = \int_I dx\, e_n(x)^* e_m(x) = \frac{1}{L} \int_a^b dx\, e^{i(m-n)\omega x} .$$

If $n = m$, the integrand is constant = 1 so the integral is $b - a = L$, and $\langle e_n | e_n \rangle = 1$. If $n \ne m$, note that every function $e_n(x)$ is periodic with period $L = b - a$ and so:

$$\langle e_n | e_m \rangle = \frac{e^{i(m-n)\omega b} - e^{i(m-n)\omega a}}{i(m - n)L\omega} = 0 .$$

□
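The orthonormality relations can also be confirmed numerically. The following sketch assumes a hypothetical interval $[a, b] = [-1, 2]$ and approximates the scalar product of $L^2(I)$ by a midpoint rule; the function names and the number of quadrature points are illustrative choices.

```python
import cmath
import math

def fourier_vector(n, a, b):
    """e_n(x) = exp(i n omega x) / sqrt(L), with L = b - a, omega = 2 pi / L."""
    length = b - a
    omega = 2 * math.pi / length
    return lambda x: cmath.exp(1j * n * omega * x) / math.sqrt(length)

def inner(f, g, a, b, steps=2000):
    """Midpoint-rule approximation of <f|g> = integral over [a, b] of f(x)* g(x)."""
    h = (b - a) / steps
    points = (a + (k + 0.5) * h for k in range(steps))
    return h * sum(f(x).conjugate() * g(x) for x in points)

a, b = -1.0, 2.0                     # hypothetical interval, L = 3
indices = range(-3, 4)
gram = {
    (n, m): inner(fourier_vector(n, a, b), fourier_vector(m, a, b), a, b)
    for n in indices
    for m in indices
}
max_deviation = max(
    abs(gram[n, m] - (1 if n == m else 0)) for n in indices for m in indices
)
print(max_deviation)
```

The largest deviation of the computed Gram matrix from the identity is at the level of rounding errors. This only illustrates orthonormality; like the proof above, it says nothing about completeness.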


This cons is called the Fourier basis. The expansion of a function f : I → C over the Fourier basis is called the Fourier series expansion (in the strict sense), and that's why the expansion over an arbitrary cons in an infinite-dimensional Hilbert space is called a generalized Fourier expansion. The theory of Fourier series expansions in the strict sense is a classic chapter of mathematical analysis, to which we shall return later.

An immediate consequence of Thm.29 is:

Proposition 20: Let I ⊂ R^n be defined by I = I_1 × I_2 × . . . × I_n where I_j = [a_j, b_j] for 1 ≤ j ≤ n. The vectors in L²(I) which are represented by the functions e_r : I → C that are defined for all r = (r_1, r_2, . . . , r_n) ∈ Z^n by:

\[ e_r(x) = \Big( \prod_{j=1}^{n} |I_j| \Big)^{-1/2} e^{i (r_1 \omega_1 x_1 + r_2 \omega_2 x_2 + \ldots + r_n \omega_n x_n)} , \]

where ω_j = 2π/|I_j|, are a cons.

Note: no function of the form e^{iλx} with real λ is square summable over R (because its modulus is constant = 1). Therefore,

There is NO Fourier basis in L²(R).

The Hermite basis.

L²(R) nevertheless has conss, and a frequently used one is the Hermite basis, also known as the "harmonic oscillator basis". It will be described in detail in sect. 4.2.4.

Problem 28: For n ∈ Z let χ_n(x) denote the characteristic function of the interval [n, n+1]. Show that the family of functions:

f_{n,m}(x) = χ_n(x) exp(2πimx) , n, m ∈ Z

defines a Hilbert basis in L²(R).

3.2.6 Separability.

Definition 26: A Hilbert space in which there is a cons is said to be separable.

To avoid misunderstandings about this definition it is necessary to emphasize that in the present notes a cons is, by definition, a countable set at most.

Every finite-dimensional space is therefore separable, the ℓ² spaces are separable, and so are the L² spaces that were considered in the last section. Every vector in a separable Hilbert space can be represented by a countable infinity of components, given by the coefficients of the generalized Fourier expansion on a cons. For this reason, separable Hilbert spaces are sometimes called spaces of countable dimension.

To this intuitive illustration of the concept of separability one may object that L²(R) is separable, and yet vectors in L²(R) are (classes of) functions f(x) in R, so they appear to have a non-countable infinity of components, one for each x ∈ R. This seeming contradiction has a subtle solution: the latter 'components' are not independent, because the measurability of f imposes a constraint on them. In the following example, all functions are measurable, so this constraint is lifted.

Proposition 21: The Hilbert space L2(R,#) is not separable.

*Proof: a function whose squared modulus is #-summable on R may be ≠ 0 at a countable set of points at most. To see this, let I_n := {x ∈ R : |f(x)|² > 1/n}; then

\[ \int_{\mathbb{R}} d\#(x)\, |f(x)|^2 \ge \frac{\#(I_n)}{n} , \]

so if |f|² is #-summable then, for all n, I_n consists of a finite number of points. On the other hand, σ(f) := ∪_n I_n is the set of all points where f(x) is not 0; so it is a countable set, because it is a countable union of finite sets. Then let T be an orthonormal system, defined by functions e_n(x). For all n, σ(e_n) is at most countable, so the set σ_∞ := ∪_n σ(e_n) is at most countable; therefore, one can choose in R a point x_0 ∉ σ_∞. Let z(x) be a function that is 1 for x = x_0 and is 0 for x ≠ x_0. This is a square-#-summable function, and ∫_R d#(x) z(x)² = 1. However, for every n, z(x)e_n(x) is identically 0 by construction. So z is orthogonal to T, and T cannot be complete. □

3.3 Linear Maps.

Reminder from Linear Algebra.

Let X and X′ be vector spaces, both real or both complex. A linear map from X into X′ is a map τ : D(τ) → X′, where D(τ) is a subspace of X, and τ(αx + βy) = ατ(x) + βτ(y) for all scalars α, β, and for all x and y in D(τ). The subspace D(τ) is the domain of the map τ. The subset of X′ that is defined by:

R(τ) = {τ(x), x ∈ D(τ)}

is called the range of τ. If D(τ) = X and a linear map is a bijection of X onto X′, then it is a vector isomorphism of the spaces X, X′, which are thereby said to be isomorphic.

3.3.1 Isomorphic Hilbert spaces.

Definition 27: Two Hilbert spaces H and H′, with respective scalar products ⟨.|.⟩ and ⟨.|.⟩′, are said to be isomorphic, or unitarily equivalent, if there is a vector isomorphism ϕ : H → H′ that preserves scalar products; that is, ∀x, y ∈ H, ⟨x|y⟩ = ⟨ϕ(x)|ϕ(y)⟩′. Then ϕ is called a Hilbert isomorphism.

Proposition 22: ϕ : H → H′ is a Hilbert isomorphism if, and only if, it is a vector isomorphism and moreover ∥ϕ(x)∥′ = ∥x∥ is true ∀x ∈ H.

Proof: if ϕ preserves scalar products then it also preserves norms, because norms are defined by scalar products. Conversely, all scalar products may be retrieved using norms, thanks to the Polarization Identities (Prop.2). □

Theorem 30: (Riesz-Fischer) Every complex Hilbert space of finite dimension n is isomorphic to C^n. Every infinite dimensional, complex, separable Hilbert space is isomorphic to ℓ²(N).


Proof: let H be separable. Denote by d its dimension, and let C^d denote the space C^n if d = n, or the Hilbert space ℓ²(N) if d = +∞. In H there is a cons T = {e_1, e_2, . . .} with #(T) = d. The generalized Fourier coefficients ⟨e_j|x⟩ may be thought of as the components of a vector in C^d (in the case d = ∞, this is true thanks to the Bessel equality). This allows one to define a map ϕ : H → C^d such that the j-th component of the vector ϕ(x) is given by ⟨e_j|x⟩. It will be proven that this map ϕ is a Hilbert isomorphism. Linearity of ϕ follows from ⟨e_j|αx + βy⟩ = α⟨e_j|x⟩ + β⟨e_j|y⟩. The norm of ϕ(x) is equal to the norm of x thanks to the Bessel identity, so the only vector x such that ϕ(x) = 0 is x = 0; therefore, ϕ is an injective map. Finally, if {ξ(1), ξ(2), . . .} are the components of a vector ξ ∈ C^d, then ξ = ϕ(x) where x = ∑_{j=1}^{d} ξ(j) e_j ∈ H; in the case when d = +∞, convergence of the latter series in H is ensured by Thm.27, because convergence of ∑_{j=1}^{∞} |ξ(j)|² is ensured by ξ ∈ ℓ²(N). So ϕ is surjective. All conditions in Prop.22 are thus satisfied. □
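In finite dimension the map ϕ of the proof can be written out explicitly. The sketch below is only an illustration under stated assumptions: the space is C² with its standard scalar product (antilinear in the first argument, as in these notes), and the orthonormal basis `basis` and the vectors `x`, `y` are arbitrary choices that do not come from the notes.

```python
import math

def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(xj.conjugate() * yj for xj, yj in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

# An orthonormal basis of C^2 (illustrative choice).
s = 1.0 / math.sqrt(2.0)
basis = [[s, s * 1j], [s, -s * 1j]]

def phi(x):
    """Map x to the vector of its generalized Fourier coefficients <e_j|x>."""
    return [inner(e, x) for e in basis]

def phi_inverse(xi):
    """Rebuild x = sum_j xi(j) e_j from the coefficients."""
    return [sum(c * e[k] for c, e in zip(xi, basis)) for k in range(len(basis[0]))]

x = [2.0 - 1.0j, 0.5 + 3.0j]
y = [-1.0 + 0.0j, 1.0j]
norm_gap = abs(norm(phi(x)) - norm(x))                    # Bessel identity
product_gap = abs(inner(phi(x), phi(y)) - inner(x, y))    # scalar products preserved
```

Both gaps vanish up to rounding, and `phi_inverse(phi(x))` returns `x`, which is the surjectivity argument of the proof in miniature.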

3.3.2 Bounded linear maps.

Let H and H′ be Hilbert spaces and let τ be a linear map with domain D(τ) ⊆ H and with values in H′. If both spaces have finite dimensions n and m respectively, then linear maps are naturally associated with matrices. We can assume that D(τ) = H; indeed, were it not so, one might re-define H = D(τ). One can introduce conss {e_1, . . . , e_n} and {e′_1, . . . , e′_m} in H and in H′ respectively, and thereafter one can write x = ∑_{j=1}^{n} x_j e_j, x′ := τ(x) = ∑_{j=1}^{m} x′_j e′_j, where x_j = ⟨e_j|x⟩ and x′_j = ⟨e′_j|x′⟩′; so

\[ x'_j = \Big\langle e'_j \Big| \tau\Big( \sum_{k=1}^{n} x_k e_k \Big) \Big\rangle' = \sum_{k=1}^{n} \langle e'_j | \tau(e_k) \rangle' \, x_k \tag{3.14} \]

that is:

\[ x'_j = \sum_{k=1}^{n} \tau_{jk}\, x_k , \qquad \tau_{jk} = \langle e'_j | \tau(e_k) \rangle' . \tag{3.15} \]
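Equation (3.15) is a recipe that can be followed literally. The sketch below assumes, for illustration only, a hypothetical linear map `tau` from C³ to C² and the canonical bases in both spaces; it builds the matrix τ_jk = ⟨e′_j|τ(e_k)⟩′ and checks that it reproduces the action of the map.

```python
def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(xj.conjugate() * yj for xj, yj in zip(x, y))

def canonical_basis(dim):
    return [[1.0 if j == k else 0.0 for k in range(dim)] for j in range(dim)]

def tau(x):
    """A hypothetical linear map from C^3 to C^2, chosen only for illustration."""
    return [x[0] + 2j * x[1], 3.0 * x[2] - x[0]]

n, m = 3, 2
e, e_prime = canonical_basis(n), canonical_basis(m)

# tau_jk = <e'_j | tau(e_k)>, as in eq. (3.15): m rows, n columns.
matrix = [[inner(e_prime[j], tau(e[k])) for k in range(n)] for j in range(m)]

def apply_matrix(mat, x):
    """x'_j = sum_k tau_jk x_k."""
    return [sum(mat[j][k] * x[k] for k in range(len(x))) for j in range(len(mat))]

x = [1.0 + 1.0j, -2.0, 0.5j]
gap = max(abs(u - v) for u, v in zip(apply_matrix(matrix, x), tau(x)))
```

The exchange of τ with the sum over k is harmless here because the sum is finite; this is exactly the step that needs continuity in infinite dimension.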

Once conss have been chosen in H and in H′, the matrix {τ_jk} with m rows and n columns yields a complete representation of the map τ. The situation is much more complicated if the spaces have infinite dimension. Assuming that they are separable, one may still choose conss in both spaces, but the first step in (3.14) is not justified any more; indeed, the sum over k is actually a series, so it involves taking a limit. Then linearity of the map, alone, is no longer enough to exchange τ with the sum of the series, because continuity is also required.

Continuity of linear maps may be discussed in the broader context where X and X′ are normed spaces, with respective norms ∥.∥ and ∥.∥′. Then D(τ) and X′ are metric spaces, so one may use for τ the general definition of continuity for maps between metric spaces (Mat3, Sect. 2.5). The map τ is said to be continuous if it is continuous at every point x ∈ D(τ).

Proposition 23: τ is continuous at all x ∈ D(τ) if and only if it is continuous at 0.

Proof: "only if" is trivial, because 0 is in all subspaces of X, hence also in D(τ). Conversely, if x_n ∈ D(τ) and lim_{n→∞} x_n = x_∞ ∈ D(τ), then x_n − x_∞ tends to 0, and so, if τ is continuous at 0, then τ(x_n − x_∞) = τ(x_n) − τ(x_∞) tends to 0. □


A set A ⊂ X is bounded if it is contained in B_R(0) (the ball of radius R centered at 0) for a suitable R > 0. The unit sphere Σ_1 is the set of vectors in X that have unit norm.

Proposition 24: The following properties of a linear map τ are equivalent:

1. τ maps bounded sets onto bounded sets;

2. the real function ∥τ(x)∥′ is bounded on the unit sphere, that is:

\[ \sup \{ \|\tau(x)\|' : x \in \Sigma_1 \cap D(\tau) \} < +\infty ; \tag{3.16} \]

3. there is a C ≥ 0 such that:

\[ \|\tau(x)\|' \le C \|x\| \tag{3.17} \]

holds ∀x ∈ D(τ).

Proof: 1 ⇒ 2: Σ_1 is a bounded set, so Σ_1 ∩ D(τ) is also bounded. Hence τ(Σ_1 ∩ D(τ)) = {τ(x) : x ∈ D(τ) ∩ Σ_1} is bounded, and this is equivalent to 2;
2 ⇒ 3: let R denote the supremum in 2. If x ∈ D(τ) and x ≠ 0 then x_1 := x∥x∥⁻¹ ∈ D(τ) ∩ Σ_1, so ∥τ(x_1)∥′ ≤ R whence, by linearity, ∥τ(x)∥′ ≤ R∥x∥ (which holds trivially for x = 0);
3 ⇒ 1: let A ⊂ D(τ) be bounded, so that ∥x∥ < R_1 for some R_1 > 0 and for all x ∈ A. If x′ ∈ τ(A) then x′ = τ(x) for some x ∈ A, so ∥x′∥′ ≤ C∥x∥ ≤ CR_1; thus τ(A) is contained in B_{CR_1}(0), and so it is bounded. □

Definition 28: A linear map τ is said to be bounded if it enjoys one, and hence all, of the properties 1, 2, 3. The supremum in (3.16) is called (for reasons to be clarified later) the norm of τ and is denoted ∥τ∥.

Note that the norm of a bounded linear map satisfies:

∥τ(x)∥′ ≤ ∥τ∥ ∥x∥ , ∀x ∈ D(τ) , (3.18)

and is actually the smallest number C that can be used in property 3.

Theorem 31: A linear map is continuous if, and only if, it is bounded.

Proof: Assume that τ is not bounded. Then for every integer n there exists an x_n ∈ D(τ) such that ∥x_n∥ = 1 and ∥τ(x_n)∥′ > n. Define y_n := x_n/n. Then ∥y_n∥ = 1/n, y_n ∈ D(τ), and lim_{n→∞} y_n = 0. Nevertheless, τ(y_n) does not tend to 0; this is seen from

\[ \|\tau(y_n)\|' = \frac{1}{n} \|\tau(x_n)\|' > 1 , \]

and shows that τ is not continuous at 0. Hence the "only if" part of the thesis is proved.
Conversely, let τ be bounded. If x_n ∈ D(τ) and lim_{n→∞} x_n = 0, then from ∥τ(x_n)∥′ ≤ ∥τ∥∥x_n∥ it follows that τ(x_n) tends to 0 in X′. Hence τ is continuous at 0, and therefore at every point of D(τ) by Prop.23. □


Proposition 25: If H and H′ have finite dimension, then all linear maps from H to H′ are continuous.

Proof: with reference to (3.14) and (3.15), let us denote by C the largest of the numbers |τ_jk| (1 ≤ j ≤ m, 1 ≤ k ≤ n). From ∥x∥ = 1 it follows that |x_k| ≤ 1; from (3.15), |x′_j| ≤ Cn; and then ∥x′∥ ≤ Cn√m. □
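The (rather crude) bound ∥x′∥ ≤ Cn√m of this proof can be observed on a sample matrix. The sketch below is an illustration only: the matrix entries and the test vectors are pseudo-random, the dimensions are arbitrary, and sampling the unit sphere does not compute the norm of the map, it only confirms that no sampled point violates the bound.

```python
import math
import random

random.seed(0)                      # fixed seed, only for reproducibility
m, n = 3, 4                         # illustrative dimensions of H' and H
tau = [[complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
       for _ in range(m)]
C = max(abs(c) for row in tau for c in row)   # largest |tau_jk|

def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

worst = 0.0
for _ in range(1000):
    x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
    scale = norm(x)
    x = [c / scale for c in x]      # a point of the unit sphere
    image = [sum(tau[j][k] * x[k] for k in range(n)) for j in range(m)]
    worst = max(worst, norm(image))

bound = C * n * math.sqrt(m)        # the constant of Prop.25
```

In practice `worst` is well below `bound`: the proposition only needs some finite constant, not the best one.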

In the following, only cases with H′ = H or H′ = K will be considered. In the former case, the map is commonly called a linear operator in H. In the latter case, it is called a linear functional in H. Linear operators will be usually (though not always) denoted by upper case roman letters. Furthermore, it is customary to denote the image of a vector x under a linear operator T by Tx rather than by T(x). The identity operator will always be denoted by I; so Ix = x, ∀x.

In spaces of infinite dimension, it is easy to find examples of linear functionals or operators that are not continuous. Let for instance H = ℓ²(N); and let V be the subspace that was defined in Sect.3.1.1. On the domain V define a functional f by f(x) = ∑_n n x(n). Whenever x ∈ V the sum is well defined, because it has only a finite number of nonzero terms. Linearity of f is obvious. The canonical basis {e_n} is a bounded set, e_n ∈ V ∩ Σ_1 for all n, and f(e_n) = n; so f is not bounded, because it maps a bounded set onto an unbounded one. In a similar way one may define a linear unbounded operator T on the domain V, by means of Tx = ∑_n n x(n) e_n.

A similar, yet more important example is as follows. In L²(R) let D be the set of vectors that are represented by square-summable functions f(x), which are eventually 0 for |x| → ∞; in other words, by the f ∈ L²(R) such that f(x) = 0 whenever |x| is larger than some L_f > 0 (which depends on f). It is easily seen that D is a subspace in L²(R). Define a linear operator X_0 with domain D as follows: (X_0 f)(x) = x f(x). This operator is not bounded. To see this, for arbitrary R > 0 let I_R := [0, R] and define a function f_R(x) := R^{−1/2} χ_{I_R}(x). Then

\[ \|f_R\|^2 = \int_{\mathbb{R}} dx\, |f_R(x)|^2 = \frac{1}{R} \int_0^R dx = 1 , \]

so f_R ∈ Σ_1 ∩ D; however,

\[ \|X_0 f_R\|^2 = \int_{\mathbb{R}} dx\, x^2 |f_R(x)|^2 = \frac{1}{R} \int_0^R dx\, x^2 = \frac{R^2}{3} , \]

which can be arbitrarily large depending on R; so X_0 is not bounded.
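The two integrals above are easy to reproduce numerically. The following sketch uses a plain midpoint rule; the values of R and the number of quadrature steps are arbitrary illustrative choices, and the helper names are hypothetical.

```python
def norm_sq(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of |f(x)|^2 over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(abs(f(lo + (k + 0.5) * h)) ** 2 for k in range(steps)) * h

def f_R(R):
    """f_R = R^(-1/2) times the characteristic function of [0, R]."""
    return lambda x: R ** -0.5 if 0.0 <= x <= R else 0.0

def X0(f):
    """(X0 f)(x) = x f(x)."""
    return lambda x: x * f(x)

results = {}
for R in (1.0, 10.0, 100.0):
    f = f_R(R)
    # Both functions vanish outside [0, R], so integrating there is enough.
    results[R] = (norm_sq(f, 0.0, R), norm_sq(X0(f), 0.0, R))
```

Each entry of `results` is a pair (∥f_R∥², ∥X_0 f_R∥²): the first stays equal to 1 while the second grows like R²/3, so no constant C can satisfy (3.17).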

Problem 29: Prove that the domain D of the operator X_0 is a dense subspace in L²(R). (For f ∈ L²(R) define f_R(x) = f(x)χ_R(x), where χ_R is the characteristic function of the interval [−R, +R]. Then f_R is in D; use dominated convergence to show that ∫dx |f(x) − f_R(x)|² tends to 0 as R → +∞...)

In the above examples of non-continuous linear maps, the domain was never a closed subspace. In contrast, in the case of continuous maps, one may always assume the domain to be closed, thanks to the following result:

Theorem 32: Let τ be a bounded linear map from H to H′ with domain D(τ). There is one and only one bounded linear map τ̄ whose domain D(τ̄) is the closure of D(τ) (a closed subspace), and that satisfies τ̄(x) = τ(x) for all x ∈ D(τ). Moreover, ∥τ̄∥ = ∥τ∥.


*Proof: if x is in the closure of D(τ), then there is a sequence x_n ∈ D(τ) such that x_n → x for n → +∞. From:

∥τ(x_n) − τ(x_m)∥ = ∥τ(x_n − x_m)∥ ≤ ∥τ∥∥x_n − x_m∥

it is seen that y_n := τ(x_n) is Cauchy, and so it has a limit y_∞. Let us show that this limit does not depend on how the sequence x_n is chosen. If x′_n is a different sequence in D(τ) that tends to x, and y′_n = τ(x′_n), with limit y′_∞, then, using that norms are continuous:

\[ \|y'_\infty - y_\infty\| = \lim_{n \to +\infty} \|y_n - y'_n\| , \tag{3.19} \]

and because y_n − y′_n = τ(x_n − x′_n),

\[ \|y'_\infty - y_\infty\| \le \|\tau\| \lim_{n \to +\infty} \|x_n - x'_n\| = 0 . \tag{3.20} \]

For arbitrary x in the closure of D(τ) we can then define the extension, denoted τ̄, by τ̄(x) := y_∞. If x ∈ D(τ), we will again find τ̄(x) = τ(x), because τ is continuous. From ∥τ̄(x)∥ = ∥lim_{n→+∞} τ(x_n)∥ it follows that ∥τ̄(x)∥ ≤ ∥τ∥∥x∥, so τ̄ is continuous, and ∥τ̄∥ ≤ ∥τ∥. The reverse inequality comes directly from the definition, noting that ∥τ∥ is the supremum of a set of numbers which is a subset of another set, the supremum of which is ∥τ̄∥. □

So one may always assume that the domain of a bounded map is a closed subspace. Later (Problem 45) it will be shown that one may further extend the map so that its domain is the whole of H. For this reason, in the following, continuous maps will always be assumed to be defined on the whole Hilbert space.

Integral Operators.

It was already recalled that in a space of finite dimension n every linear operator T is represented by an n × n matrix: notably, if x′ = Tx then the components of x′ on a given cons are related to the components of x by:

\[ x'_j = \sum_{k=1}^{n} T_{jk}\, x_k . \]

This seems to suggest that in a space L²(I) (I ⊆ R^n) a generic linear operator T may be represented by a function K : I × I → C, through the equation:

\[ (Tf)(x) = \int_I dy\, K(x, y) f(y) . \tag{3.21} \]

However this is false in general, and indeed the class of operators that can be represented by equations like (3.21) is quite a special one. Every such operator is called an integral operator, and the corresponding function K is the integral kernel of the operator. Here we describe two important examples.

Theorem 33: Let K be a square summable function in I × I, i.e. a function on I × I such that

\[ \|K\|^2 = \int_I dx \int_I dy\, |K(x, y)|^2 < +\infty . \]

Then equation (3.21) defines a continuous linear operator in L²(I).


Proof: using the Cauchy-Schwarz inequality in eqn. (3.21):

\[ |(Tf)(x)|^2 \le \Big( \int_I dy\, |K(x, y)|^2 \Big) \int_I dy\, |f(y)|^2 = \Big( \int_I dy\, |K(x, y)|^2 \Big) \|f\|^2 , \]

and so

\[ \|Tf\| = \Big( \int_I dx\, |(Tf)(x)|^2 \Big)^{1/2} \le \|K\| \|f\| . \]

This shows that T is well defined for all vectors in L²(I), and is continuous, with norm ∥T∥ ≤ ∥K∥. □

The operators T which are defined in this way are called integral operators of the Hilbert-Schmidt type. The Hilbert norm ∥K∥ of the function K is called the Hilbert-Schmidt norm of the operator T of which K is the integral kernel.
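The inequality ∥Tf∥ ≤ ∥K∥∥f∥ survives discretization, because the Cauchy-Schwarz step of the proof applies verbatim to Riemann sums. The sketch below assumes I = [0, 1], a midpoint grid, and a hypothetical kernel `K` and function `f` chosen only for illustration.

```python
import math

N = 200                      # grid points on I = [0, 1] (illustrative choice)
h = 1.0 / N
grid = [(k + 0.5) * h for k in range(N)]

def K(x, y):
    """Hypothetical square-summable kernel, chosen only for illustration."""
    return math.exp(-abs(x - y)) * (1.0 + x * y)

def f(y):
    return math.sin(3.0 * y) + y

# Discretized (Tf)(x) = integral over I of K(x, y) f(y) dy, eq. (3.21).
Tf = [sum(K(x, y) * f(y) for y in grid) * h for x in grid]

norm_f = math.sqrt(sum(f(y) ** 2 for y in grid) * h)
norm_Tf = math.sqrt(sum(v ** 2 for v in Tf) * h)
norm_K = math.sqrt(sum(K(x, y) ** 2 for x in grid for y in grid) * h * h)
```

Here `norm_Tf` never exceeds `norm_K * norm_f`. The Hilbert-Schmidt norm is only an upper bound for ∥T∥, so the inequality is typically strict.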

Theorem 34: Let G : R^n → C be summable and bounded. The equation

\[ (Tf)(x) = \int_{\mathbb{R}^n} dy\, G(x - y) f(y) \tag{3.22} \]

defines a bounded linear operator T in L²(R^n).

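*Proof:

\[ |(Tf)(x)| \le \int_{\mathbb{R}^n} dy\, |G(x - y)| |f(y)| = \int_{\mathbb{R}^n} dy\, |G(x - y)|^{1/2} |G(x - y)|^{1/2} |f(y)| , \tag{3.23} \]

whence, using Cauchy-Schwarz:

\[ |(Tf)(x)|^2 \le \Big( \int_{\mathbb{R}^n} dy\, |G(x - y)| \Big) \Big( \int_{\mathbb{R}^n} dy\, |G(x - y)| |f(y)|^2 \Big) . \]

The integral ∫dy |G(x − y)| is finite by assumption; moreover, it does not depend on x; let it be denoted by C_T. The rightmost integral is also finite, because G is bounded by assumption, and f is square-summable. So, integrating with respect to x and exchanging the order of integration, we find ∥Tf∥² ≤ C_T ∫dy |f(y)|² ∫dx |G(x − y)| = C_T² ∥f∥², that is:

\[ \|Tf\| \le C_T \|f\| . \qquad \square \]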

Definition 29: For t > 0 the Heat Kernel is defined in R^n by:

\[ G_t(x) = (2\pi t)^{-n/2}\, e^{-\frac{\|x\|^2}{2t}} . \tag{3.24} \]

It is summable, and its integral is = 1.

The name of such kernels will be explained later, along with some remarkable properties of the family of operators T_t which they define via eq. (3.22).
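The normalization of the Heat Kernel can be confirmed numerically in dimension n = 1. This is a minimal sketch under stated assumptions: the kernel is read as (2πt)^(−1/2) exp(−x²/(2t)), the integration range is cut at 12 standard deviations, and the sample values of t are arbitrary.

```python
import math

def heat_kernel(x, t):
    """G_t(x) = (2 pi t)^(-1/2) exp(-x^2 / (2 t)), i.e. eq. (3.24) with n = 1."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def integral(t, half_width=12.0, steps=24000):
    """Midpoint rule on [-a, a] with a = half_width * sqrt(t); the tails are negligible."""
    a = half_width * math.sqrt(t)
    h = 2.0 * a / steps
    return sum(heat_kernel(-a + (k + 0.5) * h, t) for k in range(steps)) * h

totals = {t: integral(t) for t in (0.1, 1.0, 5.0)}
```

All the values in `totals` agree with 1 to high accuracy, so for these kernels the constant C_T in the proof of Thm.34 equals 1.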


3.3.3 A theorem of Riesz.

Given z ∈ H, for all x ∈ H define f(x) = ⟨z|x⟩. In this way a functional f : H → C is obtained. Thanks to linearity and continuity of scalar products, this functional is linear and continuous. The following theorem of Riesz says that the functionals of this type are all the linear continuous functionals that may be defined in a Hilbert space.

Theorem 35: Let f : H → C be a continuous linear functional. There is a z ∈ H such that f(x) = ⟨z|x⟩ for all x ∈ H, and this z is unique.

*Proof: Denote Ker(f) := {x ∈ H : f(x) = 0}. Using linearity and continuity of f, one immediately verifies that this set of vectors is a closed subspace of H. It is called the kernel of f. Consider its orthocomplement Ker(f)⊥. If it contains the 0 vector alone, then every vector in H is in Ker(f), so f is identically zero; then the claim is true with z = 0. So let us assume that there is a nonzero v in Ker(f)⊥. By construction, f(v) ≠ 0; it will be shown that Ker(f)⊥ ≡ {λv : λ ∈ C}, that is, the 1-dimensional subspace that is spanned by the vector v. To this end, for u ∈ Ker(f)⊥ define u′ = λv with λ = f(u)/f(v). Then f(u′) = f(u), so u′ − u ∈ Ker(f). At the same time, u′ − u ∈ Ker(f)⊥, because both u′ and u are in Ker(f)⊥. Hence u′ − u = 0, that is, u = u′ = λv. Then from the decomposition theorem it follows that an arbitrary x may be written as x = w + αv, with w ∈ Ker(f); and so

f(x) = f(w) + αf(v) = αf(v) , ⟨v|x⟩ = α∥v∥² .

Extracting α from the 2nd equation and replacing it in the 1st yields f(x) = ⟨z|x⟩ with z = v f(v)* ∥v∥⁻².

About uniqueness: if f(x) = ⟨z′|x⟩ for all x, then z − z′ ⊥ x for all x, so z = z′. □
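The formula z = v f(v)* ∥v∥⁻² can be tried out in C³. The sketch below is illustrative only: the functional is given by hypothetical coefficients `coeffs`, and v is deliberately taken as an arbitrary nonzero multiple of the obvious candidate, to show that the normalization in the formula compensates for it.

```python
def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(xj.conjugate() * yj for xj, yj in zip(x, y))

coeffs = [2.0 - 1.0j, 0.5j, -3.0 + 0.0j]   # hypothetical coefficients

def f(x):
    """A linear functional on C^3: f(x) = sum_j c_j x_j."""
    return sum(c * xj for c, xj in zip(coeffs, x))

# Any nonzero multiple of conj(c) is orthogonal to Ker(f); the factor is arbitrary.
v = [(1.0 + 2.0j) * c.conjugate() for c in coeffs]

# z = v f(v)^* / ||v||^2, as at the end of the proof.
z = [vj * f(v).conjugate() / inner(v, v).real for vj in v]

x = [1.0 + 2.0j, -1.0j, 0.25 + 0.0j]
gap = abs(f(x) - inner(z, x))
```

Whatever multiple is chosen for `v`, the resulting `z` is the vector of complex conjugates of `coeffs`, in agreement with the uniqueness statement.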

3.4 The Algebra of bounded operators.

The set of all bounded linear operators in H will be denoted L(H). If A ∈ L(H) and B ∈ L(H), then define a third operator, to be denoted A + B, by means of (A + B)x := Ax + Bx. This operator is still in L(H), because

∥(A + B)x∥ = ∥Ax + Bx∥ ≤ ∥Ax∥ + ∥Bx∥ ≤ (∥A∥ + ∥B∥)∥x∥ ,

which shows that A + B is bounded, and moreover ∥A + B∥ ≤ ∥A∥ + ∥B∥. Next, let α be an arbitrary scalar. If A ∈ L(H), define (αA)x := αAx. Then ∥(αA)x∥ = |α|∥Ax∥ ≤ |α|∥A∥∥x∥, hence αA ∈ L(H) and ∥αA∥ ≤ |α|∥A∥. Conversely, for α ≠ 0, |α|∥A∥ = |α|∥α⁻¹αA∥ ≤ |α||α|⁻¹∥αA∥ = ∥αA∥, and so ∥αA∥ = |α|∥A∥ (which is obvious for α = 0). Using such remarks, it is easy to see that L(H) is a normed vector space (real or complex, according to whether H is real or complex), with the norm ∥.∥ defined as in Def.28.

Theorem 36: L(H) is a Banach space.

The proof is omitted.

The product AB of two operators A and B is defined as follows:

(AB)x := A(Bx) .


From ∥(AB)x∥ = ∥A(Bx)∥ ≤ ∥A∥∥Bx∥ ≤ ∥A∥∥B∥∥x∥ it follows that AB ∈ L(H) and that

∥AB∥ ≤ ∥A∥ ∥B∥ . (3.25)

Further properties of the product of operators are:

(A + B)C = AC + BC , C(A + B) = CA + CB , α(AB) = (αA)B = A(αB) .

However, the product is not commutative in general, that is: AB ≠ BA in general.

The space L(H) with the just described operations is called the Algebra of bounded operators in H.

Problem 30: Show that the Canonical Commutation Relation: AB − BA = iI cannot be satisfied by any two operators A, B ∈ L(H), so if two linear operators satisfy the CCR then at least one of them is not bounded. (Show that if A, B ∈ L(H) satisfy the CCR, then AB^n − B^nA = i n B^{n−1} for all n ≥ 1; then show that this entails 2∥A∥∥B∥ ≥ n for all n...)

3.4.1 Adjoint Operators .

Theorem 37: Let T ∈ L(H). There is a unique T ∗ ∈ L(H) such that

⟨T ∗y|x⟩ = ⟨y|Tx⟩ , ∀x, y ∈ H . (3.26)

It is called the adjoint operator of T , and ∥T ∗∥ = ∥T∥.

*Proof: if y is fixed, then the rhs in (3.26) defines a function f of x which is at once linear (because T is linear, and scalar products are linear) and continuous (because it is the composition f_1 ◦ T of the continuous maps T and f_1 : x ↦ ⟨y|x⟩). Thanks to the Thm. of Riesz, there is one and only one y* ∈ H such that ⟨y*|x⟩ = f(x) for all x. Define T*y = y*. This T* is a linear map, hence an operator. Moreover

∥T*y∥² = ⟨T*y|T*y⟩ = ⟨y|T(T*y)⟩ = |⟨y|T(T*y)⟩| ≤ ∥y∥∥T(T*y)∥ ≤ ∥y∥∥T∥∥T*y∥ ,

whence ∥T*y∥ ≤ ∥T∥∥y∥, and so T* ∈ L(H) and ∥T*∥ ≤ ∥T∥. Replacing T by T* in this inequality, one finds ∥(T*)*∥ ≤ ∥T*∥; however (T*)* = T (see the next Proposition), and then ∥T∥ ≤ ∥T*∥. As the reverse inequality was already proven, ∥T∥ = ∥T*∥ follows. □

Proposition 26: (Properties of the adjoint operator )

1. (A+B)∗ = A∗ +B∗ ,

2. (αA)∗ = α∗A∗ ,

3. (A∗)∗ = A ,

4. (AB)∗ = B∗A∗ ,

5. ∥T*T∥ = ∥T∥² .


Proof: 1...4 are left for an easy exercise. As to 5: since ∥T∥ = sup{∥Tx∥ , ∥x∥ = 1}, one may choose a sequence of vectors x_n with ∥x_n∥ = 1 so that lim ∥Tx_n∥ = ∥T∥. Then:

∥T*T∥ ≥ |⟨T*Tx_n|x_n⟩| = ∥Tx_n∥² → ∥T∥² ,

so ∥T*T∥ ≥ ∥T∥². Conversely, ∥T*T∥ ≤ ∥T∥∥T*∥ = ∥T∥². □

Problem 31: Prove properties 1-2-3-4.

Problem 32: If H has finite dimension then, once a cons has been chosen, every operator T in L(H) is

represented by a matrix n× n (cf. eq.(3.14)). If {Tjk} is the matrix that represents T , what is the matrix that

represents T ∗ ? (use eqs. (3.26) and (3.15)).

Definition 30: T in L(H) is selfadjoint if T = T ∗.

Theorem 38: T ∈ L(H) is selfadjoint if, and only if, ⟨Tx|x⟩ is real ∀x ∈ H.

Proof: If T is selfadjoint then ⟨Tx|x⟩* = ⟨x|Tx⟩ = ⟨T*x|x⟩ = ⟨Tx|x⟩, so ⟨Tx|x⟩ is real. Conversely, define h(x, y) = ⟨Tx|y⟩; a simple calculation using eqn.(1.7) shows that if h(x, x) is real ∀x ∈ H then ⟨Tx|y⟩* = ⟨Ty|x⟩, so that ⟨Ty|x⟩ = ⟨T*y|x⟩, whence T = T* follows. □

3.4.2 Inverse operators.

If T ∈ L(H) is bijective, then it has an inverse; that is the operator T⁻¹ that satisfies T⁻¹T = TT⁻¹ = I. It is easy to see that T⁻¹ is a linear operator. It is also a bounded operator, but this is a far less trivial fact, that will not be proven here.

Theorem 39: If T ∈ L(H) is invertible, then T−1 ∈ L(H).

3.4.3 Unitary Operators.

An isometry, or isometric operator of H in H is a linear operator T in H of domain H, whichsatisfies ∥Tx∥ = ∥x∥ for all x. It is clearly a bounded operator, and it preserves scalar products(cp. Prop.22). Moreover it is injective, because Tx = 0 ⇒ ∥x∥ = ∥Tx∥ = 0.

Definition 31: A unitary operator in H is a surjective isometry in H. Equivalently: it is aHilbert isomorphism of H in itself.

Problem 33: Show that in a finite dimensional space every isometry is surjective and so it is a unitary

operator (the image of a cons is another cons...)

Problem 34: (Change of basis) Let H be a separable space and let {e′_j} and {e″_j} be two conss in H. Let an operator U be defined in H by

\[ Ux = \sum_{j=1}^{d} \langle e'_j | x \rangle\, e''_j \]

where d = n if H has finite dimension n and d = +∞ if H has infinite dimension.

(1) show that U is unitary,

(2) show that U∗ has the same form, with e′j and e′′j exchanged,

(3) show that all unitary operators in a separable Hilbert space can be written in this form.


Problem 35: In ℓ²(N), let operators S± be defined by (S+x)(n) = x(n + 1) for all n ≥ 1, and by (S−x)(n) = x(n − 1) if n > 1, (S−x)(1) = 0, respectively. They are called the left and right shift operators respectively. In ℓ²(Z), shift operators are defined by (S±x)(n) = x(n ± 1) ∀n ∈ Z.

(1) find the adjoint operators of S± in ℓ²(Z), and in ℓ²(N);

(2) show that in ℓ²(Z) the operators S± are unitary;

(3) show that in ℓ²(N) the left shift S+ is continuous, but not isometric, and that the right shift S− is isometric, but not unitary.

Problem 36: Let θ : Rn → R be a measurable function. Show that the operator U that is defined in L2(Rn)

by (Uf)(x) = eiθ(x)f(x) is unitary.

Problem 37: The Parity operator R is defined in L2(Rn) by (Rf)(x) = f(−x). Show that it is unitary.

A unitary operator U is bijective by definition. Hence it has a bounded inverse U⁻¹. The next result says that this inverse is just the adjoint operator U*.

Theorem 40: U ∈ L(H) is unitary if and only if

U*U = UU* = I .

Proof: Let U be unitary: then ⟨x|y⟩ = ⟨Ux|Uy⟩ = ⟨U*Ux|y⟩ for all x, y ∈ H. Hence x − U*Ux ⊥ y, ∀y, and so U*U = I. As U is surjective by definition, for all x, y there are x′, y′ such that x = Ux′, y = Uy′. Then ⟨U*x|U*y⟩ = ⟨U*Ux′|U*Uy′⟩ = ⟨x′|y′⟩ = ⟨x|y⟩; on the other hand, ⟨U*x|U*y⟩ = ⟨UU*x|y⟩ thanks to U** = U. It follows that UU* = I. Conversely: ⟨x|y⟩ = ⟨U*Ux|y⟩ = ⟨Ux|Uy⟩, so U is isometric. Thanks to UU* = I, every x ∈ H is the image under U of U*x, so U is surjective. □

Problem 38: Find S±* S± and S± S±* in ℓ²(N).

Problem 39: Let θ ∈ R. Find α ∈ C so that the matrix:

\[ \begin{pmatrix} \cos(\theta) & \alpha \\ \alpha & \cos(\theta) \end{pmatrix} \]

is unitary (i.e. represents a unitary operator in C²).

3.4.4 Projectors.

Let S be a closed subspace in H. Define a map P : H → S so that, for all x, P(x) is the orthogonal projection of x in S.

Proposition 27: P is linear and continuous, and ∥P∥ = 1. It is called a Projection operator, or simply a Projector.


Proof: let x, y ∈ H, α, β arbitrary scalars, and z ∈ S.

⟨z|αx + βy − (αP(x) + βP(y))⟩ = α⟨z|x − P(x)⟩ + β⟨z|y − P(y)⟩ = 0 ,

by definition of orthogonal projection. Since αP(x) + βP(y) ∈ S, this shows that αP(x) + βP(y) = P(αx + βy), so P is a linear operator. Thm. 23 entails ∥Px∥ ≤ ∥x∥, so P is continuous and ∥P∥ ≤ 1; as Px = x whenever x ∈ S, ∥P∥ = 1 follows (provided S contains nonzero vectors). □

Note that the range R(P) of a projector P is the closed subspace S whereupon it projects.

Theorem 41: P ∈ L(H) is a projector if and only if it is selfadjoint and idempotent, that is: P² = P.

Proof: Let P be a projector and let S = R(P). Denote by P⊥ the projector onto the orthogonal subspace S⊥. Then for arbitrary x, y:

⟨x|Py⟩ = ⟨Px + P⊥x|Py⟩ = ⟨Px|Py⟩ = ⟨Px|Py + P⊥y⟩ = ⟨Px|y⟩ , (3.27)

and so P = P*. Moreover, ∀x, Px ∈ S, so P²x = P(Px) = Px. Conversely: let P ∈ L(H) be selfadjoint and idempotent. Define S = {x ∈ H : Px = x}. It is easily seen, using linearity and continuity of P, that S is a closed subspace. Idempotence entails that ∀x, Px ∈ S. Moreover, ∀y ∈ S:

⟨y|x − Px⟩ = ⟨y|x⟩ − ⟨y|Px⟩ = ⟨y|x⟩ − ⟨Py|x⟩ = ⟨y − Py|x⟩ = 0 ,

because Py = y due to y ∈ S. It follows that Px is the orthogonal projection of x on S. □
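Both characteristic properties of a projector can be checked on a small example. The sketch below assumes a two-dimensional subspace S of C³ spanned by an orthonormal pair `cons_S` (an illustrative choice, not taken from the notes) and tests selfadjointness through scalar products, so that no matrix of P* is needed.

```python
import math

def inner(x, y):
    """Scalar product <x|y>, antilinear in the first argument."""
    return sum(xj.conjugate() * yj for xj, yj in zip(x, y))

# Orthonormal pair spanning a closed subspace S of C^3 (illustrative choice).
s = 1.0 / math.sqrt(2.0)
cons_S = [[s, s * 1j, 0.0], [0.0, 0.0, 1.0]]

def P(x):
    """Orthogonal projection on S: P x = sum_j <e_j|x> e_j."""
    out = [0.0] * len(x)
    for e in cons_S:
        c = inner(e, x)
        out = [o + c * ek for o, ek in zip(out, e)]
    return out

x = [1.0 - 2.0j, 3.0 + 0.5j, -1.0j]
y = [0.5j, 2.0, 1.0 + 1.0j]
Px, Py = P(x), P(y)

idempotence_gap = max(abs(u - w) for u, w in zip(P(Px), Px))   # P^2 = P
selfadjoint_gap = abs(inner(Px, y) - inner(x, Py))             # <Px|y> = <x|Py>
contraction = inner(Px, Px).real <= inner(x, x).real + 1e-12   # ||Px|| <= ||x||
```

Both gaps vanish up to rounding and `contraction` is true, in line with Prop.27 and Thm.41.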

Problem 40: Let S be the closed subspace in L²(R^n) that was defined in Problem 24. Show that the associated projector acts according to (Pf)(x) = χ_I(x)f(x).

Problem 41: A vector f in L²(R^n) is even (resp., odd) if Rf = f (resp., Rf = −f), where R is the Parity operator (see Problem 37). Show that the set of all even vectors is a closed subspace, and its orthocomplement is the set of all odd vectors. Show that the corresponding projector is the operator P_e := (1/2)R + (1/2)I.

Problem 42: T ∈ L(H) is positive if ⟨Tx|x⟩ ≥ 0 holds for all x. Show that every projector is positive. Deduce that if P is a projector, and P ≠ 0, then Px = −x implies x = 0.

Problem 43: Let P′ and P″ be projectors. Show that P′ + P″ is a projector if and only if P′ and P″ are orthogonal, i.e. their ranges are mutually orthogonal subspaces; or, equivalently, P′P″ = P″P′ = 0. (Verify that idempotence of P′ + P″ requires P′P″ = −P″P′. Apply this eq. to a vector x in the range of P′, and let P″x = y. It will be found that P′y = −y...)

Problem 44: Let I ⊆ R^n be a set of positive measure, S a closed subspace of finite dimension in L²(I), and P_S the corresponding projector. Show that P_S is an integral operator of the type of Hilbert-Schmidt (cp. Thm.33). (S has a finite cons e_1(x), . . . , e_n(x). Define a kernel K(x, y) = ∑_{j=1}^{n} e_j(x) e_j(y)*...)


Problem 45: Let T be a bounded linear operator whose domain D(T) is a closed proper subspace of H. Find a bounded linear operator T̃, whose domain is the whole of H, so that T̃x = Tx ∀x ∈ D(T). (Use the projector on the domain of T...)

Problem 46: Let g(x) = π⁻¹ cos(x). Show that the operator in L²([0, 2π]) defined by

\[ (Gf)(x) = \int_0^{2\pi} dx'\, g(x - x') f(x') \]

is a projection operator.

3.4.5 Convergence of operator sequences.

Many different types of convergence are known for sequences of operators in L(H). Here we restrict ourselves to the most elementary ones.

Definition 32: A sequence of operators Tn ∈ L(H) converges strongly to a limit operator T∞,if, for all x ∈ H, the sequence of vectors Tnx converges, in the norm of H, to the vector T∞x.

A necessary supplement to this Definition is the following Theorem, which is not proven here:

Theorem 42: (Banach-Steinhaus) If Tn ∈ L(H) strongly converges to T∞, then T∞ ∈ L(H).

Example: if H is separable with infinite dimension and {e_1, e_2, . . .} is a cons in H, then for all n let S_n be the closed subspace of dimension n that is spanned by the first n vectors of the cons. Let P_n be the corresponding projector. Then ∀x ∈ H,

\[ P_n x = \sum_{j=1}^{n} \langle e_j | x \rangle\, e_j , \]

so lim_{n→∞} P_n x = x thanks to (1) in Thm. 28. Therefore, the sequence P_n is strongly convergent to I (the identity operator).

L(H) has a norm, and hence a natural definition of convergence. However this convergence is not the same as the just defined strong convergence.

Definition 33: A sequence T_n ∈ L(H) converges uniformly, or in norm, to T_∞ ∈ L(H), if it converges to T_∞ in the norm of L(H).

It is obvious from ∥T_n x − T_∞ x∥ = ∥(T_n − T_∞)x∥ ≤ ∥T_n − T_∞∥∥x∥ that norm convergence entails strong convergence. The converse is not true, and to show this one may use the previous example. In fact, for every n > 1, I − P_n is the projector P_n^⊥ onto the nonzero subspace S_n^⊥, and so ∥P_n − I∥ = 1. Thus this particular operator sequence converges strongly, yet it doesn't converge in the (still stronger!) norm sense.

Problem 47: Show that if $T_n \to T_\infty$ in norm then $T_n^* \to T_\infty^*$ in norm.

Problem 48: Show that the product of operators in L(H) is a continuous operation with respect to both the strong and the uniform convergence.

Problem 49: In $\ell^2(\mathbb{N})$ consider the shift operators $S_\pm$ (Problem 35).

(1) Show that $T_n := S_+^n$ strongly converges to 0 (the null operator) for n → ∞.

(2) Show that $T_n^*$ does not tend to 0 in the strong sense for n → ∞.


Functions of a bounded operator.

Theorem 43: Let f(z) be an analytic function in the open disc $B_R(0)$. If A ∈ L(H) and $\|A\| < R$, then:

(1) the series

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, A^n$$

converges in the norm sense. Its sum is a bounded linear operator that is denoted f(A).

(2) The adjoint $f(A)^*$ of f(A) is:

$$f(A)^* = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)^*}{n!}\, (A^*)^n = \bar f(A^*) ,$$

where $\bar f(z) := f(z^*)^*$.

(3) If g(z) is also analytic in $B_R(0)$, then

$$(f \cdot g)(A) = f(A)\, g(A) .$$

Proof: (1) the Taylor expansion at z = 0 of the function f(z):

$$f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, z^n$$

is absolutely convergent in $B_R(0)$. Hence the series

$$\sum_{n=0}^{\infty} \frac{|f^{(n)}(0)|}{n!}\, \|A\|^n$$

converges; the thesis then follows from $\|A^n\| \le \|A\|^n$, and Thm. 8.

(2): the adjoint of the sum of the series is the sum of the series of the adjoints; see Problem 47. The thesis easily follows.

(3): f · g is analytic in $B_R(0)$, and its Taylor series coincides with the term-by-term product of the Taylor series of f and g. □

Corollary 1: If A ∈ L(H) is self-adjoint, then ∀t ∈ R the operator exp(itA) is a unitary operator.

Proof: Problem 50. □
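As a numerical illustration of Corollary 1 (not a proof, which is left to Problem 50), the following minimal Python sketch sums the series of Thm. 43 for f(z) = exp(z) and the operator itA, with A a self-adjoint 2×2 matrix. The matrix `A`, the value of `t`, the number of terms and all function names are assumptions chosen only for illustration.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def exp_series(M, terms=60):
    """Partial sum of sum_n M^n / n!, i.e. the series of Thm. 43 for f = exp."""
    n = len(M)
    total = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]  # current term M^m / m!, starting at m = 0
    for m in range(1, terms):
        term = [[c / m for c in row] for row in mat_mul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical self-adjoint matrix and time, chosen only for illustration.
A = [[1.0, 2 - 1j], [2 + 1j, -0.5]]
t = 0.8
U = exp_series([[1j * t * a for a in row] for row in A])
UUstar = mat_mul(U, adjoint(U))  # expected to be close to the identity
```

Within rounding errors `UUstar` should be the identity matrix, as unitarity of exp(itA) requires.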

Definition 34: The resolvent set of T ∈ L(H) is the set of all complex numbers a such that the operator T − aI is invertible. The operator $(T - aI)^{-1}$ is then called the resolvent operator of T at a. The spectrum of T is the complement of the resolvent set.


Theorem 44: All a ∈ C \ {0} such that $\|T\| < |a|$ are in the resolvent set of T.

Proof: define $f(z) = (z - a)^{-1}$. The radius of convergence of the Taylor expansion of f(z) at z = 0 is |a|, so if $\|T\| < |a|$ then Thm. 43 can be used to define the operator f(T); thanks to (3) in that Thm., $(T - aI) f(T) = f(T)(T - aI) = I$, so $f(T) = (T - aI)^{-1}$. □
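The proof can be made concrete. The Taylor coefficients of $f(z) = (z-a)^{-1}$ at z = 0 are $-a^{-(n+1)}$, so that $f(T) = -\sum_{n \ge 0} T^n / a^{n+1}$. The following Python sketch sums this series for a 2×2 matrix and checks that the result inverts T − aI. The matrix `T`, the number `a`, the truncation order and the function names are illustrative assumptions.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def resolvent_series(T, a, terms=200):
    """Partial sum of -sum_m T^m / a^(m+1), the series of Thm. 43 for f(z) = 1/(z - a)."""
    n = len(T)
    total = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # T^0
    coeff = -1.0 / a                                                        # -1/a^(m+1) for m = 0
    for _ in range(terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * power[i][j]
        power = mat_mul(power, T)
        coeff /= a
    return total

# Hypothetical example with ||T|| < |a|, chosen only for illustration.
T = [[0.5, 0.2], [0.1, -0.3]]
a = 2.0
R = resolvent_series(T, a)
T_minus_a = [[T[i][j] - (a if i == j else 0.0) for j in range(2)] for i in range(2)]
check = mat_mul(T_minus_a, R)  # expected to be close to the identity
```

The condition $\|T\| < |a|$ is what makes the truncated sum converge; for |a| smaller than the norm of T the same loop would in general diverge.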

Theorem 45: If T ∈ L(H) is invertible, then all T′ ∈ L(H) which are sufficiently close to T in norm are also invertible.

Proof: Denote $A = T^{-1}(T - T')$. If $\|T - T'\| < \|T^{-1}\|^{-1}$, then $\|A\| < 1$ so A − I is invertible thanks to Thm. 44. Noting that $T' = -T(A - I)$, it follows that T′ is also invertible, with inverse $(I - A)^{-1} T^{-1}$. □

Corollary 2: The resolvent set of T ∈ L(H) is an open set in C, and the spectrum of T is a closed set.

Problem 50: Prove Corollary 1 (Use (2), (3) in Thm. 43, and Thm. 40.).

Problem 51: Using Thm. 43, compute exp(itP), for t ∈ R and P ≠ 0 a projection operator.


4. FOURIER ANALYSIS.

4.1 Fourier Series.

4.1.1 Finite Fourier transform.

Define a linear operator F in $\mathbb{C}^n$ as follows:

$$(Fx)(j) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} x(k)\, e^{-\frac{2\pi i}{n}(j-1)(k-1)} .$$

Proposition 28: the operator F is unitary. It is called the Fourier transform in $\mathbb{C}^n$. If y = Fx then

$$x(k) = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} y(j)\, e^{\frac{2\pi i}{n}(j-1)(k-1)} .$$

This is called the inverse Fourier transform in $\mathbb{C}^n$.

Proof: denote $\xi = e^{-2\pi i/n}$. For 1 ≤ k ≤ n define a vector $\varphi_k \in \mathbb{C}^n$ with components $\varphi_k(j) = \frac{1}{\sqrt{n}}\, \xi^{(k-1)(j-1)}$. The system of the n vectors $\varphi_k$ is a cons in $\mathbb{C}^n$ (Problem 52). Using the canonical basis $e_j$ of $\mathbb{C}^n$, the transform defined above can be rewritten in the form:

$$Fx = \sum_{k=1}^{n} \langle e_k | x\rangle\, \varphi_k ,$$

whence unitarity immediately follows (see Problem 34). The rest comes from Thm. 40 and Problem 32. □
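Proposition 28 is easily checked numerically. The sketch below implements the two sums literally, using 0-based indices j, k in place of (j−1), (k−1), and compares norms and the reconstructed vector. The test vector `x` and the function names are arbitrary choices made for illustration.

```python
import cmath
import math

def finite_fourier(x):
    """Finite Fourier transform of Prop. 28 (indices shifted to start from 0)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n)) / math.sqrt(n)
            for j in range(n)]

def inverse_finite_fourier(y):
    """Inverse transform of Prop. 28 (indices shifted to start from 0)."""
    n = len(y)
    return [sum(y[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)) / math.sqrt(n)
            for k in range(n)]

def norm(v):
    """Euclidean norm in C^n."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

# Arbitrary test vector in C^5.
x = [1.0, 2.0 - 1.0j, 0.5j, -3.0, 4.0]
y = finite_fourier(x)
x_back = inverse_finite_fourier(y)
```

Unitarity shows up as `norm(y)` being equal to `norm(x)` up to rounding, and the inversion formula as `x_back` reproducing `x`.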

Problem 52: Show that the vectors $\varphi_k$ are a cons (by direct calculation of scalar products, using that if α ≠ 1 is an n-th root of unity then $\sum_{r=1}^{n} \alpha^{r-1} = 0$).

4.1.2 Periodic Functions.

A function f : R → C is periodic if there is L > 0 so that, ∀x ∈ R, f(x + L) = f(x); L is called a period of f. If L is a period of f then clearly so is nL, for all n ∈ N. A trivial example of a periodic function is a constant function, and in this case every L > 0 is a period. In order to completely define a periodic function f(x), it is sufficient to specify it in an arbitrarily chosen period-interval, i.e. in any interval I ⊂ R of length |I| = L. Any periodic function actually depends on x only through its phase, that is the angle ωx mod 2π (where ω := 2π/L); so


it can be identified with a function that is defined on a circle. Indeed, if $S^1$ is the unit circle in the complex plane, and F(z) is a function on $S^1$, then the function which is defined in R by:

$$f(x) = F(e^{i\omega x}) \qquad (4.1)$$

is periodic with period L = 2π/ω. Eqn. (4.1) establishes a one-to-one correspondence between functions f : R → C of period L and functions F : $S^1$ → C.

Given L > 0, for every integer n the function $e^{in\omega x}$, where ω = 2π/L, is periodic with period L. The classical problem of the theory of Fourier expansions is to reproduce a general function f(x) of period L by means of a superposition of such "elementary oscillations", i.e. to write

$$f(x) = \sum_{n \in \mathbb{Z}} c(n)\, e^{in\omega x} ,$$

by suitably choosing the coefficients c(n), and with the series on the rhs converging in some appropriate sense.

4.1.3 Square-summable functions.

The problem of the Fourier series expansion can be studied by restricting to a period-interval I. Then, whenever the periodic function of interest is square-summable over I, an answer to the problem is provided by Thm. 29. If f denotes the vector in $L^2(I)$ that is represented by a function f(x), square-summable over I, and if $e_n$ are the vectors that are represented by the functions (3.13), then:

$$f = \sum_{n \in \mathbb{Z}} \hat f(n)\, e_n , \qquad (4.2)$$

where:

$$\hat f(n) := \langle e_n | f\rangle = \frac{1}{\sqrt{L}} \int_I dx\, e^{-in\omega x} f(x) . \qquad (4.3)$$

Note that the coefficients $\hat f(n)$ calculated that way do not depend on a particular choice of a period-interval, thanks to the following general fact:

Proposition 29: If G : R → C is periodic with period L, and is summable on all intervals, then

$$\int_a^{a+L} dx\, G(x)$$

does not depend on a.

Proof: changing a by an integer multiple of L does not change the integral, due to periodicity of G. Using this, one can always resort to the case when 0 ≤ a < L. But then:

$$\int_a^{a+L} dx\, G(x) = \Big\{ \int_0^{L} - \int_0^{a} + \int_L^{a+L} \Big\}\, dx\, G(x) = \int_0^{L} dx\, G(x)$$

because the 2nd and the 3rd integral are equal, thanks to periodicity of G. □

From (2), (3) in Thm. 28:


[Figure: plot of the sawtooth function for −10 ≤ x ≤ 10, vertical axis from −2 to 2, with the points −π and π marked.]

Fig. 4.1: Sawtooth function.

Proposition 30: if f and g are periodic with period L and $\hat f(n)$, $\hat g(n)$ are their Fourier coefficients then for any period interval I:

$$\int_I dx\, f(x)^* g(x) = \sum_{n \in \mathbb{Z}} \hat f(n)^*\, \hat g(n) ,$$

and in particular:

$$\int_I dx\, |f(x)|^2 = \sum_{n \in \mathbb{Z}} |\hat f(n)|^2 .$$

According to the general theory of Hilbert bases, convergence of the series in (4.2) is meant in the sense of the Hilbert space $L^2(I)$. In terms of functions in $L^2(I)$, equality in (4.2) is to be understood almost everywhere, and the series of functions on the rhs is convergent in quadratic mean. That is:

$$\lim_{\substack{m \to +\infty \\ n \to -\infty}} \int_I dx\, \Big| f(x) - \frac{1}{\sqrt{L}} \sum_{j=n}^{m} \hat f(j)\, e^{ij\omega x} \Big|^2 = 0 .$$

It is a remarkable result (not proven here) that, in spite of warnings in Sect. 2.4.2 (which remain valid in general), in this special case:

Theorem 46: If f is periodic with period L and is square-summable over period-intervals, then

$$f(x) = \frac{1}{\sqrt{L}} \sum_{n=-\infty}^{+\infty} \hat f(n)\, e^{in\omega x} \qquad (4.4)$$

in quadratic mean, and also almost everywhere.


Problem 53: (1) Expand in Fourier series the "sawtooth function" (Fig. 4.1)¹:

$$f(x) = \frac{1}{\pi}\, \mathrm{mod}(x - \pi, 2\pi) - 1 .$$

(2) Find the sum of the series $\sum_{n=1}^{+\infty} \frac{1}{n^2}$.

(The function has period 2π. It is convenient to choose the period-interval [−π, π] because the function is continuous in its interior, where f(x) = x/π. Direct calculation of (4.3) yields $\hat f(0) = 0$, and for n ≠ 0: $\hat f(n) = i(-1)^{n}\, 2^{1/2} \pi^{-1/2} n^{-1}$. For (2) use the Bessel identity (Prop. 30).)
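The hint can be checked numerically before attempting the calculation by hand. The sketch below approximates the integral (4.3) for the sawtooth by a midpoint rule on [−π, π] (so L = 2π and ω = 1) and compares it with the closed form obtained by integrating by parts; it then evaluates both sides of the Bessel identity. The number of sample points, the truncation of the sum and the function names are assumptions made for illustration.

```python
import cmath
import math

def sawtooth(x):
    """f(x) = mod(x - pi, 2 pi)/pi - 1, which equals x/pi on [-pi, pi)."""
    return ((x - math.pi) % (2 * math.pi)) / math.pi - 1.0

def fourier_coefficient(n, samples=20000):
    """Midpoint-rule approximation of (4.3) on I = [-pi, pi]."""
    length = 2 * math.pi
    h = length / samples
    total = 0j
    for m in range(samples):
        x = -math.pi + (m + 0.5) * h
        total += cmath.exp(-1j * n * x) * sawtooth(x)
    return total * h / math.sqrt(length)

def closed_form(n):
    """Coefficient found by integrating by parts, valid for n != 0."""
    return 1j * (-1) ** n * math.sqrt(2 / math.pi) / n

# Both sides of the Bessel identity (Prop. 30) for the sawtooth.
lhs = 2 * math.pi / 3  # integral of (x/pi)^2 over [-pi, pi]
rhs = 2 * sum(abs(closed_form(n)) ** 2 for n in range(1, 100001))  # n and -n contribute equally
```

Since `rhs` is (4/π) times a partial sum of the series in part (2), comparing it with `lhs` indicates what the sum of that series must be.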

4.1.4 Fast Convergence of Fourier series.

The possibility of performing various operations on Fourier series, for instance of differentiating them term by term, is very important in applications. Legitimacy of such operations rests on fast convergence of the series. The mean-square convergence of the Fourier series, as it is guaranteed by Thm. 46, is a weak one in this respect. A basic fact about Fourier series is that how fast they converge depends on how regular the expanded functions are. Here we mean a function to be the more regular, the more derivatives it has. A square-summable function may be quite rough in this sense, because such functions need not even be continuous. In qualitative terms, the result to be illustrated here is that:

the Fourier series of f(x) converges the faster, the more regular f(x) is.

Proposition 31: If the Fourier series of f(x) converges absolutely, then it converges uniformly, and f(x) is a.e. equal to a continuous function.

Proof: taking the moduli of the terms in a Fourier series removes dependence on x, so if the series of moduli is convergent then it is automatically uniformly convergent and this in turn entails uniform convergence of the original series. As all terms in the series are continuous functions, uniform convergence entails continuity of the sum. On the other hand the sum is a.e. equal to f(x). □

Note that the converse is not true.

Proposition 32: If f(x) is of class $C^k(\mathbb{R})$ with k ≥ 1 (and periodic with period L) then:

1. $|\hat f(n)| = o(|n|^{-k})$ for n → ±∞.

2. the Fourier series of f(x) converges to f(x), ∀x, uniformly in R.

3. the Fourier series can be differentiated term by term k times, i.e. the Fourier amplitudes of the k-th derivative are given by $\widehat{f^{(k)}}(n) = (in\omega)^k \hat f(n)$.

Proof: let I = [0, L]. The k-th derivative $f^{(k)}(x)$ of f is continuous, hence square-summable on I. Its Fourier coefficients are given by (4.3):

$$\widehat{f^{(k)}}(n) = \frac{1}{\sqrt{L}} \int_0^L dx\, f^{(k)}(x)\, e^{-in\omega x} \qquad (4.5)$$

$$= \frac{1}{\sqrt{L}} \int_0^L dx\, e^{-in\omega x}\, \frac{d}{dx} f^{(k-1)}(x) , \qquad (4.6)$$

1 The function mod (x, a) yields the real number x′ ∈ [0, a[ that differs from x by a multiple of a.


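so, integrating by parts:

$$\widehat{f^{(k)}}(n) = \frac{1}{\sqrt{L}}\, e^{-in\omega x} f^{(k-1)}(x) \Big|_{x=0}^{x=L} + in\omega\, \widehat{f^{(k-1)}}(n) = in\omega\, \widehat{f^{(k-1)}}(n) . \qquad (4.7)$$

Repeating this calculation k times yields:

$$\widehat{f^{(k)}}(n) = (in\omega)^k\, \hat f(n) , \qquad (4.8)$$

which proves (3) in the thesis. As $f^{(k)}$ is square-summable, from the Bessel identity we find that $\sum_n |\widehat{f^{(k)}}(n)|^2 < +\infty$. Thanks to (4.8), this implies:

$$\sum_n n^{2k}\, |\hat f(n)|^2 < +\infty$$

that directly yields (1) in the thesis. Finally, the following calculation:

$$\sum_{n \in \mathbb{Z} \setminus \{0\}} |\hat f(n)| = \sum_{n \in \mathbb{Z} \setminus \{0\}} |\hat f(n)|\, |n|\, \frac{1}{|n|} \qquad (4.9)$$

$$\le \Big( \sum_{n \in \mathbb{Z} \setminus \{0\}} |\hat f(n)|^2 n^2 \Big)^{1/2} \Big( \sum_{n \in \mathbb{Z} \setminus \{0\}} n^{-2} \Big)^{1/2} < +\infty \qquad (4.10)$$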

shows that the Fourier series is absolutely and hence uniformly convergent. □

An extreme form of regularity is that of analytic functions. Let f(x) be analytic in the sense that the function F(z) which is defined for $z := e^{i\omega x} \in S^1$ by (4.1) can be analytically continued in a domain $B \supset S^1$. In the special case when B = C \ {0} it is known from the theory of Laurent expansions (Mat3, Thm. 46) that

$$F(z) = \sum_{n \in \mathbb{Z}} c(n)\, z^n , \qquad c(n) = \frac{1}{2\pi i} \int_\gamma dz\, \frac{F(z)}{z^{n+1}} ,$$

where γ is a simple closed regular path on $S^1$. Substituting $z = e^{i\omega x}$ and definition (4.1), the Laurent expansion of F is seen to coincide with the Fourier expansion of f. The following is true:

Proposition 33: f(x) is analytic in the sense just specified if, and only if, $|\hat f(n)|$ tends to 0 for |n| → +∞ at least exponentially fast, and then its Fourier series can be differentiated term by term an arbitrary number of times.

Proof: omitted. □


4.1.5 Trigonometric series.

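In the series (4.4) let us replace $e^{in\omega x}$ by cos(nωx) + i sin(nωx). By re-ordering the terms the series may be rewritten as follows:

$$\frac{1}{\sqrt{L}} \Big[ \hat f(0) + \sum_{n=1}^{+\infty} \big(\hat f(n) + \hat f(-n)\big) \cos(n\omega x) + i \sum_{n=1}^{+\infty} \big(\hat f(n) - \hat f(-n)\big) \sin(n\omega x) \Big] ;$$

and then, using eqs. (4.3), the trigonometric form of the Fourier series is obtained:

$$f(x) = c + \sum_{n=1}^{+\infty} a(n) \cos(n\omega x) + \sum_{n=1}^{+\infty} b(n) \sin(n\omega x) , \qquad (4.11)$$

where:

$$c = \frac{1}{L} \int_I dx\, f(x) , \qquad a(n) = \frac{2}{L} \int_I dx\, \cos(n\omega x) f(x) , \qquad b(n) = \frac{2}{L} \int_I dx\, \sin(n\omega x) f(x) .$$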

Problem 54: Write a trigonometric series expansion for the sawtooth function of Problem 53.

Observe the case of the sawtooth function (Problem 53). The function f(x) is discontinuous at all points x = odd multiple of π; the series of the moduli does not converge (it behaves like the harmonic series); the trigonometric Fourier series converges to 0 at the points of discontinuity. This is an instance of the more general behaviour that is described by the following result (not proven here):

Proposition 34: If a periodic function f(x) is bounded, with at most a finite number of jump discontinuities in every period interval, and is of class $C^1$ in every closed interval from one point of discontinuity to the next², then its trigonometric Fourier series converges at all points x ∈ R. Its sum is f(x) if f(x) is continuous at x, otherwise it is $\frac{1}{2}\big(f(x+) + f(x-)\big)$, where $f(x\pm) = \lim_{h \searrow 0} f(x \pm h)$. Convergence is uniform in every closed subset of the set where f(x) is continuous.

4.1.6 Multiple Fourier series .

A function f : $\mathbb{R}^N$ → C is periodic in the N variables $x_1, \ldots, x_N$ if there is a "vector of periods" $(L_1, \ldots, L_N)$ so that for every integer vector $l \equiv (l_1, \ldots, l_N) \in \mathbb{Z}^N$, and for all $x_1, \ldots, x_N$:

$$f(x_1 + l_1 L_1, \ldots, x_N + l_N L_N) = f(x_1, \ldots, x_N) .$$

Such a function depends on $x \equiv (x_1, \ldots, x_N) \in \mathbb{R}^N$ only through the N phases $\phi_j = \mathrm{mod}(\omega_j x_j, 2\pi)$, where $\omega_j := 2\pi/L_j$, so it may be thought of as a function on the Cartesian product of N circles: $S^N = S^1 \times S^1 \times \ldots \times S^1$. This geometrical entity $S^N$ is called an N-dimensional torus. This name comes from $S^2$, which is shown in Fig. 4.2. The Fourier series reads:

2 In such points right- and left-derivatives are considered.


Fig. 4.2: The torus S2.

$$f(x) = \Big( \prod_{j=1}^{N} L_j \Big)^{-1/2} \sum_{l \in \mathbb{Z}^N} \hat f(l)\, e^{i(l_1 \omega_1 x_1 + \ldots + l_N \omega_N x_N)} ,$$

with coefficients given by:

$$\hat f(l) = \Big( \prod_{j=1}^{N} L_j \Big)^{-1/2} \int_I dx\, e^{-i(l_1 \omega_1 x_1 + \ldots + l_N \omega_N x_N)} f(x) ,$$

where I is a "period cell", defined as a Cartesian product of N period-intervals, one for each of the variables $x_1, \ldots, x_N$. Some of the properties which were illustrated about convergence in the case N = 1 carry over more or less directly to the multidimensional case. However some do not hold any more. It is for instance false that the Fourier series of a generic square-summable function converges almost everywhere.

4.2 The Fourier Integral.

Definition 35: Let f : $\mathbb{R}^N$ → C be summable. The functions which are defined in $\mathbb{R}^N$ by:

$$\hat f(k) = (2\pi)^{-N/2} \int_{\mathbb{R}^N} dx\, e^{-i\langle k|x\rangle} f(x) , \qquad \check f(x) = (2\pi)^{-N/2} \int_{\mathbb{R}^N} dk\, e^{i\langle k|x\rangle} f(k) \qquad (4.12)$$

are the Fourier transforms, respectively direct (FT) and inverse (IFT), of the function f.

The integrals on the right hand sides exist for all k ∈ $\mathbb{R}^N$ and for all x ∈ $\mathbb{R}^N$ respectively thanks to property 2 in Thm. 14. Therefore, both transforms are well defined for all k and x in $\mathbb{R}^N$.

The inverse transform owes its name to the following theorem:


Theorem 47: (Inversion Theorem) If f is summable, and also $\hat f$ is summable, then

$$(\hat f)^\vee(x) = f(x) \quad \text{a.e.}$$

If f is summable, and also $\check f$ is summable, then

$$(\check f)^\wedge(k) = f(k) \quad \text{a.e.}$$

No proof of the inversion theorem will be given here. Some intuitive support may be provided by the case of the finite Fourier transforms, see Prop. 28.³

It must be noted that summability of f does not of necessity entail summability of $\hat f$.

Problem 55: Let f(x) = 0 for x < 0 and $f(x) = e^{-x}$ for x ≥ 0. Show that $\hat f$ is not summable.

4.2.1 Examples.

The following examples have wide application:

Gaussian function.

The normalized Gaussian, of standard deviation σ, is defined in R by:

$$g_\sigma(x) := \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \qquad (4.13)$$

The Heat kernel (eq. (3.24)) is of this type (with $\sigma^2 = t$). The Fourier transform of the Gaussian (4.13) is:

$$\hat g_\sigma(k) = \sigma^{-1} g_{\sigma^{-1}}(k) . \qquad (4.14)$$

Thus the FT of a Gaussian function is another Gaussian function, and the product of the standard deviations is 1. However the transform of a normalized Gaussian is not normalized, except in the case when σ = 1; indeed, the FT of the function $g_1$ is the function $g_1$ itself. It may be useful to remember that the same is true of the "multivariate" Gaussian defined in $\mathbb{R}^N$ by $e^{-\|x\|^2/2}$.

The FT of the Gaussian function can be computed in several ways. Here we differentiate $\hat g_\sigma$ under the integral sign:

$$\hat g_\sigma'(k) = \frac{1}{2\pi\sigma} \int (-ix)\, e^{-\frac{x^2}{2\sigma^2}} e^{-ikx}\, dx = \frac{i\sigma}{2\pi} \int e^{-ikx}\, \frac{d}{dx} e^{-\frac{x^2}{2\sigma^2}}\, dx .$$

Next we integrate by parts and find:

$$\hat g_\sigma'(k) = -\sigma^2 k\, \hat g_\sigma(k) .$$

This is a simple differential equation for the function $\hat g_\sigma(k)$. Solving it with the initial condition:

$$\hat g_\sigma(0) = \frac{1}{2\pi\sigma} \int e^{-\frac{x^2}{2\sigma^2}}\, dx = \frac{1}{\sqrt{2\pi}}$$

the result (4.14) is immediately found.
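Formula (4.14) can also be checked by brute force. The following sketch approximates the integral in (4.12) for N = 1 with a midpoint rule on a large but finite interval, and compares the result with $\sigma^{-1} g_{\sigma^{-1}}(k)$. The integration window, the number of sample points and the function names are assumptions; the approach works here only because the Gaussian decays very fast.

```python
import cmath
import math

def gaussian(x, sigma):
    """Normalized Gaussian (4.13)."""
    return math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fourier_transform(f, k, half_width=40.0, samples=40000):
    """Midpoint-rule approximation of (4.12) for N = 1, truncated to [-half_width, half_width]."""
    h = 2 * half_width / samples
    total = 0j
    for m in range(samples):
        x = -half_width + (m + 0.5) * h
        total += cmath.exp(-1j * k * x) * f(x)
    return total * h / math.sqrt(2 * math.pi)

def deviation(sigma, k):
    """Difference between the numerical FT of g_sigma and the rhs of (4.14)."""
    numeric = fourier_transform(lambda x: gaussian(x, sigma), k)
    return abs(numeric - gaussian(k, 1.0 / sigma) / sigma)
```

For example, `deviation(2.0, 0.7)` should be zero to within rounding errors, and `deviation(1.0, k)` illustrates that $g_1$ is its own transform.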

3 The choice of the prefactors $(2\pi)^{-N/2}$ in the definitions (4.12) is not universal, however it is prevalent in the mathematical literature. It is the one choice that yields the same prefactor for both transforms, without breaking the inversion theorem.


Lorentz function.

The function in R:

$$\ell_a(x) := \frac{1}{\pi}\, \frac{a}{a^2 + x^2} \qquad (4.15)$$

is known as a Lorentzian function (a > 0). The numerical prefactor is chosen so that $\int \ell_a\, dx = 1$. The FT of this function was computed in Mat3, p. 70 by means of path integration:

$$\hat\ell_a(k) = \frac{1}{\sqrt{2\pi}}\, e^{-a|k|} . \qquad (4.16)$$

Problem 56: Verify the Inversion theorem in the above examples.

4.2.2 Elementary Properties.

Some important properties of FT and IFT are directly obtained from the definitions, by means of elementary manipulations that are left for the Exercises. They are summarized below:

Theorem 48: Let f : $\mathbb{R}^N$ → C be summable, a ∈ $\mathbb{R}^N$, α ∈ R:

1. if $g(x) = f(x) \exp(i\langle a|x\rangle)$, then $\hat g(k) = \hat f(k - a)$;

2. if $g(x) = f(x - a)$, then $\hat g(k) = \hat f(k) \exp(-i\langle a|k\rangle)$;

3. if $g(x) = f(-x)$, then $\hat g(k) = \hat f(-k) = \check f(k)$;

4. if $g(x) = f(\alpha x)$ (α > 0), then $\hat g(k) = \alpha^{-N} \hat f(\alpha^{-1} k)$.

Next we state two more properties, without worrying for the time being about precise conditions of validity. To simplify notations, we restrict to the case of functions in R:

$$\frac{d\hat f}{dk} = (-ixf)^\wedge ; \qquad \Big( \frac{df}{dx} \Big)^\wedge = ik \hat f \qquad (4.17)$$

They are easily obtained by a formal use of integration by parts, and of differentiation under the integral sign. For instance:

$$\frac{d\hat f}{dk} = (2\pi)^{-1/2}\, \frac{d}{dk} \int e^{-ikx} f(x)\, dx = (2\pi)^{-1/2} \int (-ix)\, e^{-ikx} f(x)\, dx = (-ixf)^\wedge$$

In the first step it was assumed that $\hat f$ has a derivative, and that this derivative can be taken under the integral sign. Neither assumption is necessarily true. For instance, the FT of the Lorentzian has no derivative at k = 0; moreover, if the derivative is taken under the integral then the resulting integral does not converge. So now we give a few exact results.


Proposition 35: (Riemann-Lebesgue): If f : $\mathbb{R}^N$ → C is summable then $\hat f$ is continuous and bounded and moreover:

1. $\sup\{ |\hat f(k)| : k \in \mathbb{R}^N \} \le (2\pi)^{-N/2} \int_{\mathbb{R}^N} dx\, |f(x)|$,

2. $\lim_{\|k\| \to +\infty} |\hat f(k)| = 0$.

Proof: for h ∈ $\mathbb{R}^N$,

$$\hat f(k + h) - \hat f(k) = (2\pi)^{-N/2} \int f(x)\, e^{-i\langle k|x\rangle} \big( e^{-i\langle x|h\rangle} - 1 \big)\, dx \qquad (4.18)$$

As the modulus of the integrand is bounded by 2|f(x)|, which is summable, thanks to the Dominated Convergence Thm the limit of the above expression for h → 0 can be taken under the integral sign and so it is 0. Hence $\hat f$ is continuous. Directly from Def. 35:

$$\big| \hat f(k) \big| \le (2\pi)^{-N/2} \int |f(x)|\, dx ;$$

as this holds ∀k ∈ $\mathbb{R}^N$ the 1st claim in the thesis immediately follows. The 2nd will not be proven here. □

Let us consider (4.17) in a more precise way.

Proposition 36: If f : $\mathbb{R}^N$ → C is summable and also $x_j f(x)$ is summable over $\mathbb{R}^N$, then $\hat f$ has a continuous partial derivative with respect to $k_j$, and:

$$\frac{\partial \hat f}{\partial k_j} = \big( -i x_j f \big)^\wedge(k) . \qquad (4.19)$$

Proof: In eq. (4.18) let us assume that only the j-th component of the vector h is different from 0. Then we can write:

$$\frac{\hat f(k + h) - \hat f(k)}{h_j} = (2\pi)^{-N/2} \int dx\, f(x)\, e^{-i\langle k|x\rangle}\, \frac{e^{-i\langle x|h\rangle} - 1}{h_j} . \qquad (4.20)$$

Next note that:

$$|h_j|^{-1} \big| e^{-i\langle x|h\rangle} - 1 \big| \le |h_j|^{-1} |\langle h|x\rangle| = |x_j| ,$$

hence under the stated assumptions, the $h_j \to 0$ limit of (4.20) can be taken under the integral sign by Dominated Convergence. Eq. (4.19) then follows. Having thus found that the derivative of $\hat f$ is the FT of a summable function, its continuity follows from Proposition 35. □

This result is easily generalized. First let us introduce convenient notations. A multi-index $\mathbf{p}$ is an N-vector with integer non-negative components $(p_1, \ldots, p_N)$. The sum of such components will be denoted by $|\mathbf{p}|$. For $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$ denote:

$$x^{\mathbf{p}} := x_1^{p_1} x_2^{p_2} \ldots x_N^{p_N} , \qquad \partial^{\mathbf{p}} = \frac{\partial^{|\mathbf{p}|}}{\partial x_1^{p_1} \partial x_2^{p_2} \ldots \partial x_N^{p_N}} .$$


Corollary 3: If $\|x\|^m f(x)$ is summable for all 0 ≤ m ≤ n, then $\hat f(k)$ is a function of class $C^n(\mathbb{R}^N)$, and:

$$\partial^{\mathbf{q}} \hat f(k) = \big( (-ix)^{\mathbf{q}} f \big)^\wedge(k)$$

for all $\mathbf{q}$ such that $|\mathbf{q}| \le n$.

The proof is a simple induction over n, based on the previous Proposition.

A basic problem in the theory of Fourier transforms is to establish general relations between properties of a function, and properties of its transform. The above result is the simplest example. The following qualitative principle is valid:

The faster f vanishes at infinity, the more regular $\hat f$ is, and conversely.

This principle may be reminiscent of a remark about Fourier series in sect. 4.1.4.

4.2.3 Fast-decreasing test-functions.

It is important to identify functional spaces that are stable under FT and IFT, in the sense that FTs and IFTs of functions in such spaces are still functions in the same spaces. It was already found that the space $L^1(\mathbb{R}^N)$ of all summable functions is not stable in this sense (see Problem 55). A clue to constructing FT-stable spaces is provided by the above stated principle, that functions which are well-behaved at infinity are transformed to functions that are locally well-behaved (in the sense of differentiability), and vice-versa. This suggests that a FT-stable space may be constructed by using functions that are well-behaved in both senses at once. Such a space is indeed described in this section.⁴

A function φ : $\mathbb{R}^N$ → C is a rapidly decreasing test-function, if it is of class $C^\infty$, and for all multi-indices $\mathbf{q}$ it satisfies:

$$(\partial^{\mathbf{q}} \varphi)(x) = o(\|x\|^{-m}) \ \text{ for } \|x\| \to \infty , \ \text{ for all integer } m . \qquad (4.21)$$

Simplest examples are Gaussian functions, e.g. $\exp(-\|x\|^2)$. The family of all rapidly decreasing test functions is clearly a complex vector space⁵, which is denoted $\mathcal{S}_N$.

Lemma 3: If φ ∈ $\mathcal{S}_N$ then for all $\mathbf{p}$:

$$(\partial^{\mathbf{p}} \varphi)^\wedge(k) = (ik)^{\mathbf{p}}\, \hat\varphi(k) . \qquad (4.22)$$

Proof: replace f in definition (4.12) by $\partial^{\mathbf{p}} \varphi$, and integrate by parts $|\mathbf{p}|$ times. □

Proposition 37: If φ ∈ $\mathcal{S}_N$ then also $\hat\varphi \in \mathcal{S}_N$ and $\check\varphi \in \mathcal{S}_N$. For all $\mathbf{p}$ and $\mathbf{q}$:

$$(ik)^{\mathbf{p}}\, \partial^{\mathbf{q}} \hat\varphi(k) = \big[ \partial^{\mathbf{p}} \big( (-ix)^{\mathbf{q}} \varphi \big) \big]^\wedge(k) . \qquad (4.23)$$

4 Two more FT-stable spaces will be described in the following: the space $L^2(\mathbb{R}^N)$, and the space of tempered distributions. In both cases, generalized definitions of FT will be introduced.

5 It is often termed the Schwartz space (from the French mathematician Laurent Schwartz, who should not be confused with the German mathematician Hermann Schwarz of the famous inequality).


Proof: for φ ∈ $\mathcal{S}_N$ the hypotheses of Corollary 3 are verified, for all multi-indices $\mathbf{q}$, so (4.23) is immediately obtained on combining that Corollary and the above Lemma. Moreover, ∀$\mathbf{p}$ and ∀$\mathbf{q}$ the function on the rhs in (4.23) is bounded thanks to (1) in Proposition 35, and this is equivalent to condition (4.21) on $\hat\varphi$. □

Corollary 4: The Fourier transform $\varphi \mapsto \hat\varphi$ is a vector isomorphism of the space $\mathcal{S}_N$ in itself, and its inverse is given by $\varphi \mapsto \check\varphi$.

Proof: linearity of the map is obvious. It is bijective, because $(\hat\varphi)^\vee(x) = \varphi(x)$, ∀φ ∈ $\mathcal{S}_N$.⁶ □

4.2.4 The Harmonic Oscillator basis.

For all integer n ≥ 0, the function $f_n(x) = x^n e^{-x^2/2}$ is square-summable over R. The set of those vectors in $L^2(\mathbb{R})$ which are represented by the functions $f_n$ is a linearly independent set. Let us consider the orthonormal system of vectors $u_n$ which is obtained by Schmidt orthonormalization of this set of vectors. For all n, $u_n(x)$ clearly has the form $u_n(x) = h_n(x) e^{-x^2/2}$, where $h_n(x)$ is a polynomial of degree n. The function $u_n(x)$ is called the n-th Hermite function.

Problem 57: Show that the set $\{f_n\}$ is linearly independent, and compute the first two Hermite functions.

Lemma 4: Let z(x) be a summable function, such that:

(i) $\int dx\, x^n z(x) = 0$ ∀n,

(ii) $|z(x)| \exp(|kx|)$ is summable, ∀k ∈ R.

Then z(x) = 0 a.e.

Proof:

$$\int_{\mathbb{R}} dx\, e^{-ikx} z(x) = \int_{\mathbb{R}} dx \sum_{n=0}^{\infty} \frac{(-ik)^n}{n!}\, x^n z(x) .$$

The n-sum can be taken out of the integral thanks to Dominated Convergence, because the moduli of all its partial sums are bounded from above by $|z(x)| \exp(|kx|)$, that is summable by assumption. So,

$$\int_{\mathbb{R}} dx\, e^{-ikx} z(x) = \sum_{n=0}^{\infty} \frac{(-ik)^n}{n!} \int_{\mathbb{R}} dx\, x^n z(x) = 0 .$$

The last eqn. says that the Fourier transform of the summable function z(x) vanishes identically. From the Inversion theorem it follows that this function vanishes almost everywhere, and so z(x) = 0 a.e. □

Theorem 49: The Hermite functions are a cons in $L^2(\mathbb{R})$.

6 According to the Inversion theorem, this equality holds almost everywhere; however, in the present case both the lhs and the rhs are in $\mathcal{S}_N$, so they are continuous functions; and if two continuous functions are a.e. equal, then they are the same function (Problem 15).


Proof: The Hermite functions generate the same subspace as the functions $f_n(x)$, so, if w(x) is square-summable and orthogonal to all Hermite functions, then it is also orthogonal to $f_n(x)$, ∀n: that is,

$$\int_{\mathbb{R}} dx\, x^n e^{-x^2/2} w(x) = 0 .$$

The function $|w(x)| \exp(|kx| - x^2/2)$ is summable ∀k ∈ R because it is the product of two square-summable functions. The above Lemma, applied to $z(x) = w(x) \exp(-x^2/2)$, then implies $w(x) \exp(-x^2/2) = 0$ a.e., hence w(x) = 0 a.e. □

Corollary 5: The vector space $\mathcal{S}_N$ is a dense subset in $L^2(\mathbb{R}^N)$.

Proof: thanks to the above Theorem, and to 5 in Thm. 28, the Hermite functions generate a dense subspace in $L^2(\mathbb{R}^N)$. The same is then true of $\mathcal{S}_N$, because it contains all the Hermite functions, hence it also contains the subspace they generate. □

Proposition 38: The n-th Hermite function is:

$$u_n(x) = c_n v_n(x) , \qquad v_n(x) := H_n(x)\, e^{-x^2/2} , \qquad c_n := \pi^{-1/4}\, 2^{-n/2}\, (n!)^{-1/2} ,$$

where $H_n(x)$ is the n-th Hermite polynomial and is defined by:

$$H_n(x) := (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2} .$$

*Proof: $u_0(x) = c_0 v_0(x)$ is immediate. For n > m, after n integrations by parts we find:

$$\langle v_n | v_m \rangle = (-1)^{n+m} \int_{\mathbb{R}} dx\, e^{x^2} \Big( \frac{d^n}{dx^n} e^{-x^2} \Big) \Big( \frac{d^m}{dx^m} e^{-x^2} \Big) = (-1)^m \int_{\mathbb{R}} dx\, e^{-x^2} \frac{d^n}{dx^n} \Big( e^{x^2} \frac{d^m}{dx^m} e^{-x^2} \Big) = 0 , \qquad (4.24)$$

because the n-th derivative of a polynomial of degree m < n is 0. It follows that: the $v_n$ are an orthogonal system; every one of them is an n-th degree polynomial multiplied by $e^{-x^2/2}$; the same is true of the Hermite functions, and $u_0 = c_0 v_0$. All this implies that, for all n, $c_n v_n = u_n$ with $c_n = \|v_n\|^{-1}$. To compute $\|v_n\|$, we first write the Taylor expansion of the function $e^{-(x-t)^2}$ near t = 0:

$$\sum_{n=0}^{+\infty} \frac{(-1)^n t^n}{n!} \frac{d^n}{dx^n} e^{-x^2} = e^{-(x-t)^2} .$$

Using the definition of the Hermite polynomials this can be rewritten as:

$$\sum_{n=0}^{+\infty} \frac{t^n}{n!}\, v_n(x) = e^{t^2} e^{-(x-2t)^2/2} .$$

Next we integrate over x the squares of both sides. Thanks to orthogonality of the functions $H_n e^{-x^2/2}$, we find that:

$$\sum_{n=0}^{+\infty} \frac{t^{2n}}{(n!)^2}\, \|v_n\|^2 = \sqrt{\pi} \sum_{n=0}^{+\infty} \frac{2^n t^{2n}}{n!} ,$$


holds for all values of t. The thesis follows, on equating coefficients of $t^{2n}$ on both sides. □
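The normalization constants of Proposition 38 can be tested numerically. The sketch below evaluates $u_n(x) = c_n H_n(x) e^{-x^2/2}$, generating the Hermite polynomials from the three-term recurrence $H_{m+1}(x) = 2x H_m(x) - 2m H_{m-1}(x)$ (a standard property of these polynomials, used here as an assumption since it is not derived at this point of the notes), and approximates the scalar products $\langle u_n | u_m \rangle$ by a midpoint rule. The integration window, the number of samples and the function names are illustrative choices.

```python
import math

def hermite_function(n, x):
    """u_n(x) = c_n H_n(x) exp(-x^2/2), with c_n as in Proposition 38."""
    h_prev, h = 0.0, 1.0  # placeholder for H_{-1}, and H_0
    for m in range(n):
        # Recurrence H_{m+1} = 2x H_m - 2m H_{m-1} (assumed, standard).
        h_prev, h = h, 2 * x * h - 2 * m * h_prev
    c_n = math.pi ** -0.25 / math.sqrt(2 ** n * math.factorial(n))
    return c_n * h * math.exp(-x * x / 2)

def inner_product(n, m, half_width=15.0, samples=6000):
    """Midpoint-rule approximation of <u_n|u_m> on [-half_width, half_width]."""
    h = 2 * half_width / samples
    return h * sum(
        hermite_function(n, -half_width + (j + 0.5) * h)
        * hermite_function(m, -half_width + (j + 0.5) * h)
        for j in range(samples)
    )
```

With these definitions `inner_product(3, 3)` should be close to 1 and `inner_product(1, 3)` close to 0, in agreement with orthonormality.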

All Hermite functions $u_n(x)$ are in $\mathcal{S}_1$. Let us introduce the following operators⁷ in the vector space $\mathcal{S}_1$:

$$A^\dagger = x - \frac{d}{dx} , \qquad A = x + \frac{d}{dx} . \qquad (4.25)$$

It is immediately seen that:

$$A^\dagger A - A A^\dagger = -2I , \qquad (4.26)$$

and also that ∀n,

$$A^\dagger v_n = v_{n+1} , \qquad A v_{n+1} = 2(n+1)\, v_n , \qquad A v_0 = 0 . \qquad (4.27)$$

The 1st of these equations is verified by direct calculation:

$$\Big( x - \frac{d}{dx} \Big) H_n(x)\, e^{-x^2/2} = H_{n+1}(x)\, e^{-x^2/2} , \qquad (4.28)$$

and similarly the 3rd, and the 2nd for n = 0. The 2nd is proven for all n > 0 by induction. We have

$$A v_{r+2} = A A^\dagger v_{r+1} = (A^\dagger A + 2I)\, v_{r+1} ,$$

so, using the inductive assumption that the 2nd eq. (4.27) holds for r = n:

$$A v_{n+2} = 2(n+1) A^\dagger v_n + 2 v_{n+1} = 2(n+2)\, v_{n+1} ,$$

that is, the equation also holds for r = n + 1.

From eqs. (4.27) it follows that:

$$\frac{1}{2} (A^\dagger A + 1)\, v_n = -\frac{1}{2} \frac{d^2 v_n}{dx^2} + \frac{1}{2} x^2 v_n = \Big( n + \frac{1}{2} \Big) v_n . \qquad (4.29)$$
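Relations (4.26) and (4.27) lend themselves to an exact check with integer polynomial arithmetic. On a function of the form $p(x) e^{-x^2/2}$, with p a polynomial, $A^\dagger$ acts as $p \mapsto 2xp - p'$ and A acts as $p \mapsto p'$. The sketch below stores p as a list of coefficients (entry i is the coefficient of $x^i$); this representation and all function names are assumptions made for illustration.

```python
def derivative(p):
    """Coefficients of p'(x), where p[i] is the coefficient of x**i."""
    return [i * c for i, c in enumerate(p)][1:] or [0]

def raise_poly(p):
    """Polynomial part of A†(p e^{-x^2/2}), i.e. 2x p - p'."""
    two_x_p = [0] + [2 * c for c in p]
    dp = derivative(p)
    dp = dp + [0] * (len(two_x_p) - len(dp))
    return [a - b for a, b in zip(two_x_p, dp)]

def lower_poly(p):
    """Polynomial part of A(p e^{-x^2/2}), i.e. p'."""
    return derivative(p)

def hermite(n):
    """H_n obtained by applying A† n times to v_0, as in the 1st eq. (4.27)."""
    p = [1]
    for _ in range(n):
        p = raise_poly(p)
    return p

def commutator(p):
    """Polynomial part of (A†A - AA†)(p e^{-x^2/2}), padded to a common length."""
    left = raise_poly(lower_poly(p))
    right = lower_poly(raise_poly(p))
    size = max(len(left), len(right))
    left = left + [0] * (size - len(left))
    right = right + [0] * (size - len(right))
    return [a - b for a, b in zip(left, right)]
```

For instance `hermite(3)` returns the coefficients of $8x^3 - 12x$, and `lower_poly(hermite(3))` is 6 times `hermite(2)`, as the 2nd eq. (4.27) requires for n = 2.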

Proposition 39: ∀n, $\hat u_n = (-i)^n u_n$.

Proof: it suffices to show that $\hat v_n = (-i)^n v_n$. The 1st eq. (4.27) yields $v_n = (A^\dagger)^n v_0$. Fourier-transforming both sides of the 1st eq. (4.27) and using the definition of $A^\dagger$ together with eqs. (4.23), we find:

$$\hat v_{n+1} = -i A^\dagger \hat v_n ,$$

so $\hat v_n = (-i)^n (A^\dagger)^n \hat v_0 = (-i)^n (A^\dagger)^n v_0 = (-i)^n v_n$. □

The Hermite basis can be defined also in $L^2(\mathbb{R}^N)$ with N > 1. Denote $\mathbb{Z}_+^N$ the set of multi-indices with N components. For all $\mathbf{r} = (r_1, \ldots, r_N) \in \mathbb{Z}_+^N$ define

$$u_{\mathbf{r}}(x) := u_{r_1}(x_1)\, u_{r_2}(x_2) \ldots u_{r_N}(x_N)$$

Such functions represent a cons in $L^2(\mathbb{R}^N)$; moreover, $\hat u_{\mathbf{r}} = (-i)^{|\mathbf{r}|} u_{\mathbf{r}}$.

7 Apart from some constant factors these are the creation and annihilation operators of Quantum Mechanics.
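Proposition 39 can be illustrated numerically for N = 1. The sketch below approximates the Fourier transform (4.12) of the first few Hermite functions by a midpoint rule and measures the largest deviation from $(-i)^n u_n(k)$ over a few sample values of k. The recurrence used to generate the Hermite polynomials, the integration window, the sample values of k and the function names are all assumptions made for this illustration.

```python
import cmath
import math

def hermite_function(n, x):
    """u_n(x) = c_n H_n(x) exp(-x^2/2), with c_n as in Proposition 38."""
    h_prev, h = 0.0, 1.0  # placeholder for H_{-1}, and H_0
    for m in range(n):
        # Recurrence H_{m+1} = 2x H_m - 2m H_{m-1} (assumed, standard).
        h_prev, h = h, 2 * x * h - 2 * m * h_prev
    c_n = math.pi ** -0.25 / math.sqrt(2 ** n * math.factorial(n))
    return c_n * h * math.exp(-x * x / 2)

def fourier_transform_at(n, k, half_width=15.0, samples=6000):
    """Midpoint-rule approximation of the FT (4.12) of u_n at the point k."""
    h = 2 * half_width / samples
    total = 0j
    for m in range(samples):
        x = -half_width + (m + 0.5) * h
        total += cmath.exp(-1j * k * x) * hermite_function(n, x)
    return total * h / math.sqrt(2 * math.pi)

def max_deviation(n_max=3, ks=(0.3, 1.1, -2.0)):
    """Largest |FT(u_n)(k) - (-i)^n u_n(k)| over n <= n_max and the sample points ks."""
    return max(
        abs(fourier_transform_at(n, k) - (-1j) ** n * hermite_function(n, k))
        for n in range(n_max + 1)
        for k in ks
    )
```

A call to `max_deviation()` should return a number at the level of rounding errors.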


* More about the Hermite basis (supplementary material).

For N = 1 denote N = 12A

†A. The operator N thus defined on the functions in S(R) is the Numberoperator and satisfies: N vn = nvn, and hence Nun = nun. For N > 1, the number operator is

defined by N =N∑

j=1Nj where Nj = A†

jAj , A†j = xj − ∂/∂xj , Aj = xj + ∂/∂xj . It follows that

Nur = |rrr|urrr.The following equations are easily obtained:∫

RN

dx ∥x∥2urrr(x)2 =

∫RN

dx ∥∇urrr(x)∥2 = |rrr| + n2 . (4.30)

(∇ denotes the gradient, and the norms are those of N -dimensional vectors). To derive them forN = 1, replace vn by un in eq.(4.2.4) (this is allowed because the two functions differ by a constantfactor), then multiply both sides by un(x) and integrate over x by parts. Doing so, the sum of thetwo integrals in (4.30) is found to be equal to twice the rhs. The two integrals are found to be equalusing eqs. (4.23) along with Prop.39. Extension to the case when N > 1 is straightforward .

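Proposition 40: If $\varphi \in \mathcal{S}_N$, then the amplitudes $s_{\mathbf r} = \langle u_{\mathbf r}|\varphi\rangle$ in the expansion of $\varphi$ on the Hermite basis satisfy $s_{\mathbf r} = o(|\mathbf r|^{-m})$ for all integer m > 0.

Proof: for simplicity we restrict to N = 1. $\varphi \in \mathcal{S}(\mathbb{R})$ implies $\mathcal{N}^m \varphi \in \mathcal{S}(\mathbb{R})$ for all integer m. On the other hand, for $\psi$ and $\varphi$ in $\mathcal{S}(\mathbb{R})$, $\langle \psi|\mathcal{N}^m\varphi\rangle = \langle \mathcal{N}^m\psi|\varphi\rangle$ (it is sufficient to check this for m = 1, and to this end integrate by parts twice); then $\langle u_r|\mathcal{N}^m\varphi\rangle = \langle \mathcal{N}^m u_r|\varphi\rangle = r^m s_r$. Then, thanks to the Bessel identity, $\sum_r |s_r|^2 r^{2m} < +\infty$, and so $|s_r|\, r^m \to 0$ for $r \to \infty$. □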

4.2.5 Fourier transform in L2(RN).

The FT and the IFT defined in eqs. (28) can be used only for summable functions, so they may not be applicable to square-summable functions, because such functions are not necessarily summable. For instance $(1+x^2)^{-1/2}$ is square-summable, yet it is not summable. FT and IFT as defined in (28) do not act in the whole $L^2(\mathbb{R}^N)$ space, but only in the subspace that corresponds to functions that besides being square-summable are also summable. This subspace will be denoted $L^2 \cap L^1$. It is a dense subspace, because it contains as a subset the dense subspace $\mathcal{S}_N$. In this section we extend the definition of FT and IFT to all square-summable functions. An arbitrary $f \in L^2(\mathbb{R}^N)$ can be expanded on the Hermite basis in the form

\[
f = \sum_{\mathbf r \in \mathbb{Z}^N_+} \langle u_{\mathbf r}|f\rangle\, u_{\mathbf r} , \tag{4.31}
\]

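so Proposition 39 suggests the following definition:

Definition 36: The Fourier-Plancherel transform is the linear operator $\mathcal{F}$ that is defined in $L^2(\mathbb{R}^N)$ by:

\[
\mathcal{F} f = \sum_{\mathbf r \in \mathbb{Z}^N_+} (-i)^{|\mathbf r|}\, \langle u_{\mathbf r}|f\rangle\, u_{\mathbf r} .
\]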
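Definition 36 can be tried out numerically for N = 1 by truncating the Hermite expansion. The following is a minimal sketch under stated assumptions: the $u_n$ are taken to be the standard orthonormal Hermite functions, the sum is cut at a hypothetical `n_max`, and the scalar products are replaced by Riemann sums. The test function $f(x) = e^{-(x-1)^2/2}$ is chosen because its ordinary FT, $e^{-ik}e^{-k^2/2}$, is known in closed form.

```python
import numpy as np

def hermite_functions(n_max, x):
    """Orthonormal Hermite functions u_0, ..., u_{n_max} sampled on the grid x."""
    u = np.zeros((n_max + 1, x.size))
    u[0] = np.pi ** -0.25 * np.exp(-x ** 2 / 2)
    if n_max >= 1:
        u[1] = np.sqrt(2.0) * x * u[0]
    for n in range(1, n_max):
        u[n + 1] = np.sqrt(2.0 / (n + 1)) * x * u[n] - np.sqrt(n / (n + 1)) * u[n - 1]
    return u

def fourier_plancherel(f_vals, x, n_max=40):
    """Truncated version of Def. 36: sum over n <= n_max of (-i)^n <u_n|f> u_n."""
    dx = x[1] - x[0]
    u = hermite_functions(n_max, x)
    coeffs = u @ f_vals * dx                  # <u_n|f>; the u_n are real
    phases = (-1j) ** np.arange(n_max + 1)    # (-i)^n
    return (phases * coeffs) @ u

x = np.linspace(-15.0, 15.0, 1501)
f_vals = np.exp(-(x - 1.0) ** 2 / 2)
Ff = fourier_plancherel(f_vals, x)
exact = np.exp(-1j * x) * np.exp(-x ** 2 / 2)   # ordinary FT of f, evaluated on the same grid
err = np.max(np.abs(Ff - exact))
print(f"max deviation from the ordinary FT: {err:.2e}")
```

For this particular f the Hermite coefficients decay faster than geometrically, so a few dozen terms already reproduce the ordinary FT to rounding accuracy; slowly decaying functions would need many more terms.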


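Theorem 50: (1) $\mathcal{F}$ is a unitary operator; (2) its adjoint $\mathcal{F}^*$ is obtained by replacing $-i$ by $i$ in Def. 36; (3) for $f \in L^2 \cap L^1$, $\mathcal{F}f = \hat f$ and $\mathcal{F}^* f = \check f$.

Proof: (1): $\mathcal{F}$ is unitary because it represents a "change of basis" (see Problem 34): indeed, the vectors $(-i)^{|\mathbf r|} u_{\mathbf r}$ are a cons, as are the $u_{\mathbf r}$. (2): follows from (2) in Problem 34. (3): For $f = u_n$, a Hermite function, (4.31) yields $\mathcal{F}f = \hat f$ and $\mathcal{F}^* f = \check f$. So (3) is true whenever f is a Hermite function. To show that it is true for all $f \in L^2 \cap L^1$, note that for all such f:

\[
\langle f | u_{\mathbf r}\rangle = \langle f | \mathcal{F}^* \hat u_{\mathbf r}\rangle = \langle \mathcal{F}f | \hat u_{\mathbf r}\rangle . \tag{4.32}
\]

On the other hand, the leftmost scalar product can be rewritten as:

\[
\langle f | u_{\mathbf r}\rangle = (2\pi)^{-N/2} \int_{\mathbb{R}^N} dx\, f(x)^* \int_{\mathbb{R}^N} dk\, e^{i\langle k|x\rangle}\, \hat u_{\mathbf r}(k) . \tag{4.33}
\]

As $f(x)$ and $\hat u_{\mathbf r}$ are summable functions over $\mathbb{R}^N$, $f(x)^* e^{i\langle k|x\rangle} \hat u_{\mathbf r}(k)$ is a summable function over $\mathbb{R}^{2N}$, so thanks to the Fubini theorem integrals can be interchanged, leading to:

\[
\langle f | u_{\mathbf r}\rangle = (2\pi)^{-N/2} \int_{\mathbb{R}^N} dk\, \hat u_{\mathbf r}(k) \int_{\mathbb{R}^N} dx\, e^{i\langle k|x\rangle} f(x)^* = \int_{\mathbb{R}^N} dk\, \hat u_{\mathbf r}(k)\, \hat f(k)^* . \tag{4.34}
\]

Together with eqn. (4.32), this says that, $\forall f \in L^2 \cap L^1$,

\[
\int dx\, u_{\mathbf r}(x) \left( \mathcal{F}f - \hat f \right)(x) = 0 , \quad \forall \mathbf r . \tag{4.35}
\]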

From this one concludes $\mathcal{F}f = \hat f$ a.e., by an argument that here is given only for the case N = 1.⁸ From eqn. (4.35) it easily follows that the function $(\mathcal{F}f - \hat f)\exp(-x^2/2)$ fulfills assumption (i) of Lemma 4. It satisfies assumption (ii) as well,⁹ so the Lemma entails $\mathcal{F}f = \hat f$ a.e. □

Though logically convenient, the above definition of the FT of square-summable functions is almost useless for practical calculations of FT. Using unitarity of $\mathcal{F}$, a more easily implementable result is derived. For $\alpha > 0$ let $I_\alpha := \{x \in \mathbb{R}^N : |x_j| \le \alpha,\ 1 \le j \le N\}$. Define:

\[
\hat f_\alpha(k) := (2\pi)^{-N/2} \int_{I_\alpha} dx\, e^{-i\langle k|x\rangle} f(x) .
\]

Proposition 41: In the limit $\alpha \to +\infty$, the functions $\hat f_\alpha(k)$ tend to the function $(\mathcal{F}f)(k)$ in quadratic mean.

8 The general case requires a straightforward multi-dimensional generalization of Lemma 4.

9 $\exp(-x^2/2 + |kx|)$ is summable and square-summable, $\forall k \in \mathbb{R}$. Its product with $\mathcal{F}f$ is summable, because $\mathcal{F}f$ is square-summable, and a product of square-summable functions yields a summable function. Its product with $\hat f$ is summable, because $\hat f$ is bounded due to Thm. 35.


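Proof: let $\chi_\alpha(x)$ be the characteristic function of $I_\alpha$. From Cauchy-Schwarz:

\[
\left| \int_{\mathbb{R}^N} dx\, \chi_\alpha(x)\, |f(x)| \right|^2 \le \left( \int_{\mathbb{R}^N} dx\, \chi_\alpha(x) \right) \left( \int_{\mathbb{R}^N} dx\, |f(x)|^2 \right) = (2\alpha)^N \|f\|^2 ,
\]

so $f_\alpha(x) := \chi_\alpha(x) f(x)$ is a summable function, and $\mathcal{F}f_\alpha = \hat f_\alpha$. On the other hand:

\[
\lim_{\alpha\to+\infty} \|f_\alpha - f\|^2 = \lim_{\alpha\to+\infty} \int_{\mathbb{R}^N} dx\, |1 - \chi_\alpha(x)|^2\, |f(x)|^2 = 0 ,
\]

thanks to the dominated convergence theorem, because the integrand tends to 0 and is dominated by the summable function $|f(x)|^2$. As $\mathcal{F}$ is unitary, $\mathcal{F}f_\alpha = \hat f_\alpha$ tends to $\mathcal{F}f$ in $L^2(\mathbb{R}^N)$, which is equivalent to the thesis. □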
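Proposition 41 can be illustrated for $f(x) = (x+i)^{-1}$, which is square-summable but not summable. The sketch below is a rough numerical check, not a proof. It takes $\mathcal{F}f(k) = -i\sqrt{2\pi}\, s(k)\, e^{-k}$ from the formula for $(x+i\epsilon)^{-1}$ quoted in Sec. 5.4.1 with $\epsilon = 1$, and compares the $L^2$ distance between $\hat f_\alpha$ and $\mathcal{F}f$ with the norm of the discarded tail, $\|f - f_\alpha\| = [2(\pi/2 - \arctan\alpha)]^{1/2}$, to which it must be equal by unitarity. Grid sizes and the helper name `truncated_ft` are arbitrary choices.

```python
import numpy as np

def truncated_ft(f, alpha, k, dx=0.02):
    """(2 pi)^(-1/2) * integral over [-alpha, alpha] of exp(-i k x) f(x) dx (midpoint rule)."""
    n = int(round(2 * alpha / dx))
    h = 2 * alpha / n
    x = -alpha + (np.arange(n) + 0.5) * h
    return np.exp(-1j * np.outer(k, x)) @ f(x) * h / np.sqrt(2 * np.pi)

f = lambda x: 1.0 / (x + 1j)
dk = 0.02
k = -20.0 + (np.arange(2000) + 0.5) * dk       # midpoint grid in k, avoids k = 0
Ff = -1j * np.sqrt(2 * np.pi) * (k > 0) * np.exp(-np.abs(k))

alphas = [1.0, 2.0, 4.0, 8.0]
# L2 distance on the finite k window (the part beyond |k| = 20 is neglected)
errors = [np.sqrt(np.sum(np.abs(truncated_ft(f, a, k) - Ff) ** 2) * dk) for a in alphas]
# exact norm of the discarded tail of f
tails = [np.sqrt(2 * (np.pi / 2 - np.arctan(a))) for a in alphas]
for a, e, t in zip(alphas, errors, tails):
    print(f"alpha = {a:4.1f}   ||f_alpha^ - Ff|| ~ {e:.3f}   ||f - f_alpha|| = {t:.3f}")
```

The two columns should agree up to the small error made by restricting k to a finite window, and both decrease slowly (like $\alpha^{-1/2}$), which reflects the slow decay of f.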

Problem 58: Let R denote the Parity operator (see Problem 37). Show that $\mathcal{F}R = \mathcal{F}^*$, that $\mathcal{F}^2 = R$, and also that $\mathcal{F}^4 = I$.

Problem 59: The conjugation operator C is defined in $L^2(\mathbb{R}^N)$ by $Cf(x) = f(x)^*$. It is an antilinear operator, because, for $\alpha \in \mathbb{C}$, $C(\alpha f) = \alpha^* Cf$. Show that $C\mathcal{F} = \mathcal{F}^* C$.

Problem 60: Show that if $f, g \in L^2(\mathbb{R}^N)$ then $\int_{\mathbb{R}^N} dx\, f(x)(\mathcal{F}g)(x) = \int_{\mathbb{R}^N} dx\, (\mathcal{F}f)(x)\, g(x)$.

(Sol.: $\int_{\mathbb{R}^N} dx\, f(x)(\mathcal{F}g)(x) = \langle Cf|\mathcal{F}g\rangle = \langle \mathcal{F}^* Cf|g\rangle$ ... then use Problem 59.)

Problem 61: Let $f(x) = (x + ia)^{-1}$, (a > 0); f(x) is square-summable, but not summable. Find the Fourier-Plancherel transform. (The result comes in the form of principal part of a generalized Riemann integral, that is computed by means of Jordan's Lemma (Mat3 Teor. 52).)

Problem 62: Show that the subspace that is generated in $L^2(\mathbb{R})$ by the family of functions $\exp(-(x-a)^2)$, ($a \in \mathbb{R}$) is dense in $L^2(\mathbb{R})$ (take Fourier transforms, and then use property 2 in Thm. 48...)


5. DISTRIBUTIONS.

5.1 Tempered distributions.

Let f be a square-summable function on $\mathbb{R}^N$. As every $\varphi \in \mathcal{S}_N$ is square-summable, the integral $\int_{\mathbb{R}^N} dx\, f(x)\varphi(x)$ is well defined; so one can define a map $T_f : \mathcal{S}_N \to \mathbb{C}$ by means of:

\[
T_f(\varphi) = \int_{\mathbb{R}^N} dx\, f(x)\,\varphi(x) .
\]

This map is a linear functional on the vector space $\mathcal{S}_N$. Thus, square-summable functions on $\mathbb{R}^N$ may be alternatively viewed as linear functionals on $\mathcal{S}_N$. As such, they are but a small subclass in the class of all the linear functionals that can be defined on $\mathcal{S}_N$. A much wider class includes those linear functionals that are not square-summable functions, and yet can be arbitrarily well approximated by square-summable functions, in a sense to be presently clarified. Such functionals are called tempered distributions in $\mathbb{R}^N$.

Definition 37: A linear functional T on $\mathcal{S}_N$ is a tempered distribution in $\mathbb{R}^N$ if there is a sequence $f_n(x)$ of square-summable functions on $\mathbb{R}^N$ such that, for all $\varphi \in \mathcal{S}_N$,

\[
T(\varphi) = \lim_{n\to\infty} \int_{\mathbb{R}^N} dx\, f_n(x)\,\varphi(x) . \tag{5.1}
\]

In particular, every square-summable function is a tempered distribution. However the tempered distributions are a much larger class, which includes objects that in no way can be assimilated to functions on $\mathbb{R}^N$.

5.1.1 Regular Distributions.

Proposition 42: Let g(x) be a measurable function on $\mathbb{R}^N$ such that $g(x)\varphi(x)$ is summable, $\forall \varphi \in \mathcal{S}_N$. The linear functional

\[
T_g(\varphi) = \int_{\mathbb{R}^N} dx\, g(x)\,\varphi(x)
\]

is a tempered distribution.

Proof: For all integers n denote $\chi_n(x)$ the characteristic function of the measurable set $\{x \in \mathbb{R}^N : \|x\| < n,\ |g(x)| < n\}$. Let $g_n(x) = \chi_n(x) g(x)$. This function is square-summable; moreover (1) $\lim_{n\to\infty} g_n(x)\varphi(x) = g(x)\varphi(x)$ a.e., and (2) $|g_n(x)\varphi(x)|$ is dominated by $|g(x)\varphi(x)|$, that is summable by assumption. By dominated convergence, $T_g(\varphi) = \lim_{n\to\infty} \int_{\mathbb{R}^N} dx\, g_n(x)\varphi(x)$. □

Definition 38: The tempered distributions which are defined in Prop. 42 are called regular distributions.

Regular distributions are usually identified with the functions to which they correspond via Prop. 42, so the distribution $T_g$ will be usually (though not always) denoted g(x). All square-summable functions are regular distributions, and such are summable functions, and also the functions to be defined below:

A function g(x) on $\mathbb{R}^N$ is locally summable if it is summable on every compact subset of $\mathbb{R}^N$. It is said to be of algebraic growth if there are m integer, C and R positive constants such that:

\[
|g(x)| < C\,(1 + \|x\|^m)
\]

for all $x \in \mathbb{R}^N$ such that $\|x\| > R$.

Proposition 43: Every locally summable g(x) of algebraic growth is a regular distribution.

Proof: g(x) verifies the assumption of Prop. 42, because $|g(x)\varphi(x)|$ is summable: indeed, it is dominated by the summable function $|g_R(x)\varphi(x)|$, where $g_R(x) = |g(x)|$ for $\|x\| \le R$ and $g_R(x) = C(1 + \|x\|^m)$ for $\|x\| > R$. □

5.1.2 Singular Distributions.

Definition 39: A singular distribution is a distribution that is not regular (so "it is not a function").

Definition 40: The "Dirac delta" with support $a \in \mathbb{R}^N$ is the linear functional $\delta_a$ that is defined on $\mathcal{S}_N$ by $\delta_a(\varphi) = \varphi(a)$.¹

Proposition 44: δa is a singular tempered distribution.

Proof: let $f_n(x) = G_{1/n}(x - a)$ where $G_{1/n}(x)$ is the Heat kernel (with t = 1/n) (v. (3.24)). Changing variables from $x = (x_1, \ldots, x_N)$ to $x' = (\sqrt{n}(x_1 - a_1), \ldots, \sqrt{n}(x_N - a_N))$:

\[
\int_{\mathbb{R}^N} dx\, f_n(x)\,\varphi(x) = \int_{\mathbb{R}^N} dx'\, G_1(x')\,\varphi(a + x'/\sqrt{n}) .
\]

For $n \to \infty$ the integrand tends to $G_1(x')\varphi(a)$ and is dominated by the summable function $G_1(x')\|\varphi\|_\infty$. By dominated convergence, using that $\int dx'\, G_1(x') = 1$:

\[
\varphi(a) = \delta_a(\varphi) = \lim_{n\to+\infty} \int_{\mathbb{R}^N} dx\, f_n(x)\,\varphi(x) ,
\]

1 In the case a = 0, it is denoted simply by δ.


Fig. 5.1: The cap-shaped function in $\mathbb{R}^1$ (Def. 41).

so Def. 37 is verified because the $f_n$ are square-summable.

Next we show that no locally summable h(x) exists such that $\delta_a(\varphi) = \int dx\, h(x)\varphi(x)$. Assume that such a h(x) exists; for all integer k define functions $s_k(x) := \theta(k(x - a))$, where $\theta$ is the cap-shaped test function that is defined in Def. 41 and is drawn in Fig. 5.1. Then $s_k \in \mathcal{S}_N$, and $s_k(a) = e^{-1}$ for all k. Therefore:

\[
e^{-1} = \delta_a(s_k) = \int_{\mathbb{R}^N} dx\, h(x)\, s_k(x) = \int_{B_1(a)} dx\, h(x)\,\theta(k(x - a)) .
\]

Integrands tend to 0 a.e. for $k \to \infty$, and are dominated by $e\,|h(x)|$, which is summable on $B_1(a)$. By dominated convergence the integral ought to tend to 0 and not to $e^{-1}$. □
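The first half of the proof can be illustrated numerically for N = 1. The sketch below assumes that the Heat kernel of (3.24) has the usual form $G_t(x) = (4\pi t)^{-1/2}\exp(-x^2/4t)$ (the normalization is an assumption, since (3.24) is not reproduced here), and uses the same change of variables as the proof. The test function and the point a are arbitrary choices.

```python
import numpy as np

def heat_kernel_pairing(phi, a, n, half_width=12.0, points=4801):
    """Integral of G_{1/n}(x - a) phi(x) dx, computed after the substitution x' = sqrt(n) (x - a)."""
    xp = np.linspace(-half_width, half_width, points)
    dxp = xp[1] - xp[0]
    g1 = np.exp(-xp ** 2 / 4) / np.sqrt(4 * np.pi)   # assumed form of the heat kernel at t = 1
    return np.sum(g1 * phi(a + xp / np.sqrt(n))) * dxp

phi = lambda x: (1 + x) * np.exp(-x ** 2)   # a test function in S(R)
a = 0.5
ns = [1, 100, 10_000]
errs = [abs(heat_kernel_pairing(phi, a, n) - phi(a)) for n in ns]
for n, e in zip(ns, errs):
    print(f"n = {n:6d}   |integral - phi(a)| = {e:.2e}")
```

The deviation from $\varphi(a)$ shrinks roughly like 1/n, in agreement with the limit stated in the proof.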

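Definition 41: The cap-shaped function $\theta(x)$ is defined as follows:

\[
\theta(x) =
\begin{cases}
\exp\left( -\dfrac{1}{1 - \|x\|^2} \right), & \text{if } \|x\| < 1, \\[1ex]
0, & \text{if } \|x\| \ge 1 .
\end{cases}
\]

It is a $C^\infty$ function, because it tends to 0 with all its derivatives when $\|x\|$ tends to 1 from the left.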

The Dirac $\delta_a$ is a singular distribution, because it "reads" the test functions only in a set of zero measure. Other distributions, that are obtained by integrating the test functions along curves, or (hyper-)surfaces, are likewise singular. For instance, the spherical layer distribution, denoted $\delta_{\Sigma_R(0)}$, is the distribution in $\mathbb{R}^3$ that is defined by $\delta_{\Sigma_R(0)}(\varphi) = \int_{\Sigma_R(0)} dS\, \varphi$, where $\Sigma_R(0)$ is the spherical surface of radius R centered in 0, and $\int_{\Sigma_R(0)} dS$ is the surface integral. Using spherical coordinates $r, \theta, \phi$:

\[
\delta_{\Sigma_R(0)}(\varphi) = R^2 \int_0^{\pi} d\theta\, \sin(\theta) \int_0^{2\pi} d\phi\; \varphi\big(R\sin(\theta)\cos(\phi),\, R\sin(\theta)\sin(\phi),\, R\cos(\theta)\big) . \tag{5.2}
\]
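Formula (5.2) translates directly into a quadrature rule. The sketch below is a plain midpoint rule in both angles; the function name `spherical_layer` and the grid sizes are illustrative choices. It is checked on two test functions for which the surface integral is elementary: $e^{-\|x\|^2}$ gives $4\pi R^2 e^{-R^2}$, and $x_3^2\, e^{-\|x\|^2}$ gives $\tfrac{4\pi}{3} R^4 e^{-R^2}$.

```python
import numpy as np

def spherical_layer(phi, R, n_theta=400, n_az=800):
    """Approximate delta_{Sigma_R(0)}(phi) via eq. (5.2) with a midpoint rule in theta and phi."""
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    az = (np.arange(n_az) + 0.5) * 2 * np.pi / n_az
    T, P = np.meshgrid(theta, az, indexing="ij")
    x = R * np.sin(T) * np.cos(P)
    y = R * np.sin(T) * np.sin(P)
    z = R * np.cos(T)
    weights = np.sin(T) * (np.pi / n_theta) * (2 * np.pi / n_az)
    return R ** 2 * np.sum(phi(x, y, z) * weights)

R = 1.5
gauss = lambda x, y, z: np.exp(-(x ** 2 + y ** 2 + z ** 2))
z2_gauss = lambda x, y, z: z ** 2 * np.exp(-(x ** 2 + y ** 2 + z ** 2))

val_gauss = spherical_layer(gauss, R)
val_z2 = spherical_layer(z2_gauss, R)
print(val_gauss, 4 * np.pi * R ** 2 * np.exp(-R ** 2))
print(val_z2, 4 * np.pi / 3 * R ** 4 * np.exp(-R ** 2))
```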


Singular distributions are sometimes called improper, or generalized functions, and are often denoted as if they were indeed functions in the ordinary sense. For instance $\delta_a$ is written $\delta(x - a)$, and instead of $\delta_a(\varphi)$ one writes $\int \delta(x - a)\varphi(x)\,dx$. This will be termed the improper notation. When using the improper notation it must be kept in mind that the expression $\delta(x)$ has a sense only if it is associated to a sign $\int dx$, and that, contrary to appearance, it does not imply that $\delta$ has a value at any point x.

That being said, the improper notation is nevertheless of great practical usefulness because it yields intuitive support to several manipulations on distributions, that the rigorous theory fully justifies at the cost of cumbersome notations. In the following frequent use will be made of an "assisted" improper notation ''T(x)'' to denote a distribution T, using quotation marks to highlight its improper nature.

5.1.3 The space of Tempered Distributions.

The set of all tempered distributions in $\mathbb{R}^N$ is denoted $\mathcal{S}'_N$. It is a complex vector space, with vector operations defined as follows: if $\alpha, \beta \in \mathbb{C}$ and $T_1, T_2 \in \mathcal{S}'_N$,

\[
(\alpha T_1 + \beta T_2)(\varphi) := \alpha T_1(\varphi) + \beta T_2(\varphi) . \tag{5.3}
\]

It is a topological vector space (cp. Sec. 1.2.1). Convergence is introduced thanks to the following fundamental fact, that will not be proven here:

Theorem 51: Let $T_n$ be a sequence of tempered distributions such that, $\forall \varphi \in \mathcal{S}_N$, the values $T_n(\varphi)$ are a convergent sequence of complex numbers. Then the linear functional $T_\infty$ that is defined on $\mathcal{S}_N$ by:

\[
T_\infty(\varphi) := \lim_{n\to+\infty} T_n(\varphi)
\]

is a tempered distribution. The sequence $T_n$ is said to converge to $T_\infty$ in the weak-* sense.

One may also say that the sequence $T_n$ tends to $T_\infty$ in the sense of distributions. The following notations will be used for weak-* convergence: $T_n \xrightarrow{w} T_\infty$, and also $T_\infty = \text{w-}\lim T_n$. Using this terminology, the definition of a tempered distribution can be paraphrased as follows:

Every tempered distribution is the weak-* limit of a sequence of square-summable functions,

because, as we have seen, every square-summable function may be seen as a (regular) distribution.

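Proposition 45: If a sequence of square-summable functions $f_n \in L^2(\mathbb{R}^N)$ converges in quadratic mean to a function $f_\infty$, then $T_{f_n}$ converges to $T_{f_\infty}$ in the weak-* sense.

Proof:

\[
T_{f_n}(\varphi) = \int_{\mathbb{R}^N} dx\, f_n(x)\,\varphi(x) = \langle \varphi^* | f_n \rangle , \tag{5.4}
\]

so, using continuity of the scalar product,

\[
\lim_{n\to\infty} T_{f_n}(\varphi) = \langle \varphi^* | f_\infty \rangle = T_{f_\infty}(\varphi) . \quad \square
\]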


Problem 63: Let f(x) be summable over $\mathbb{R}$ and $\int_{\mathbb{R}} dx\, f(x) = 1$. For integer n define $f_n(x) = n f(nx)$. Show that $\text{w-}\lim f_n = \delta$. In particular, the Heat kernel (3.24) tends to $\delta$ for $t \searrow 0$ (follow the proof of Prop. 44, replacing the $f_n$ used there by the just defined ones).

Problem 64: In $\mathbb{R}$ let $f_n(x) := e^{inx}$. Show that $\text{w-}\lim f_n = 0$. (Use Thm. 35.)
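The statement of Problem 64 can be observed numerically, although this is of course not a proof. The sketch below pairs $e^{inx}$ with the test function $\varphi(x) = e^{-x^2}$, for which the integral is known in closed form: $\int e^{inx} e^{-x^2} dx = \sqrt{\pi}\, e^{-n^2/4}$. The function name is a hypothetical helper.

```python
import numpy as np

def pair_with_plane_wave(phi, n, half_width=10.0, points=20001):
    """Riemann-sum approximation of the integral of exp(i n x) phi(x) dx."""
    x = np.linspace(-half_width, half_width, points)
    dx = x[1] - x[0]
    return np.sum(np.exp(1j * n * x) * phi(x)) * dx

phi = lambda x: np.exp(-x ** 2)
values = {n: pair_with_plane_wave(phi, n) for n in (0, 2, 4, 8)}
for n, v in values.items():
    print(f"n = {n}:  |T_fn(phi)| = {abs(v):.3e}   exact = {np.sqrt(np.pi) * np.exp(-n ** 2 / 4):.3e}")
```

The pairings decay very quickly with n, although $|e^{inx}| = 1$ everywhere: the convergence to 0 holds only in the weak-* sense, not pointwise.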

5.2 Differential Calculus.

5.2.1 Operations with Distributions.

Several operations that can be done on functions can be extended to distributions. A method of operating this extension is based on the following Lemma.

Lemma 5: $\forall T \in \mathcal{S}'_N$, the functions $f_n(x)$ in (5.1) can be chosen in $\mathcal{S}_N$.

Proof: whatever the choice of the $f_n(x)$ in $L^2(\mathbb{R}^N)$, for every one of them a $\phi_n \in \mathcal{S}_N$ can be chosen such that $\|f_n - \phi_n\| < \frac{1}{n}$, thanks to 5 in Thm. 49. Therefore,

\[
\left| \int_{\mathbb{R}^N} dx\, f_n(x)\varphi(x) - \int_{\mathbb{R}^N} dx\, \phi_n(x)\varphi(x) \right| \le \int_{\mathbb{R}^N} dx\, |f_n(x) - \phi_n(x)|\, |\varphi(x)| \le \frac{1}{n}\, \|\varphi\|
\]

thanks to Schwarz-Hölder. The thesis follows on taking the limit $n \to +\infty$. □

Hence every tempered distribution is a weak-* limit of a sequence of test functions: $T = \text{w-}\lim \phi_n$. Suppose we can perform a certain operation A on all test functions, so that the result of this operation on a test function $\varphi$ is another test function $A\varphi$. On account of the above Lemma, a natural way of generalizing this operation to a distribution T, which is the weak-* limit of test functions $\phi_n$, is to define $AT := \text{w-}\lim A\phi_n$, provided, of course, that the weak-* limit exists and does not depend on how the approximating sequence $\phi_n$ is chosen. In the next section this method is used to define the derivatives of a distribution.

5.2.2 The Derivatives of a Distribution.

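Let $T \in \mathcal{S}'_N$, and $T(\varphi) = \lim_{n\to+\infty} \int dx\, \phi_n(x)\varphi(x)$ for all $\varphi \in \mathcal{S}_N$, with $\phi_n \in \mathcal{S}_N$. Following the idea exposed above, the derivative $\partial^{\mathbf q} T$ should be defined so that:

\[
\partial^{\mathbf q} T(\varphi) = \lim_{n\to+\infty} \int dx\, \partial^{\mathbf q}\phi_n(x)\, \varphi(x) . \tag{5.5}
\]

To see that this makes sense, first note that the $\partial^{\mathbf q}\phi_n$ are test functions, so $\partial^{\mathbf q} T$ thus defined is really a distribution thanks to Def. 37, if we can show that the limit exists. To this end integrate by parts $|\mathbf q|$ times: then the limit rewrites as

\[
(-1)^{|\mathbf q|} \lim_{n\to+\infty} \int dx\, \phi_n(x)\, \partial^{\mathbf q}\varphi(x) .
\]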


Comparing this with Def. 37 we see that the latter limit indeed exists, and is equal to $(-1)^{|\mathbf q|} T(\partial^{\mathbf q}\varphi)$. So definition (5.5) takes the simple form:

\[
\partial^{\mathbf q} T(\varphi) := (-1)^{|\mathbf q|}\, T(\partial^{\mathbf q}\varphi) . \tag{5.6}
\]

This definition has the following immediate consequences:

• any distribution is indefinitely differentiable;

• multiple partial derivatives can be calculated in any order, because this is true of test functions, thanks to their extreme regularity;

• differentiation is a continuous operation, or "the derivative of the limit is equal to the limit of the derivatives", if both the limit and the derivatives are understood in the sense of distributions.

Problem 65: Show that differentiation is a continuous operation in the sense of distributions.

Problem 66: Show that the derivatives of $\delta_a$ are given by:

\[
\partial^{\mathbf q}\delta_a(\varphi) = (-1)^{|\mathbf q|}\, \partial^{\mathbf q}\varphi(a) .
\]

The distributional derivative of a function.

Any regular distribution is at the same time a function, so two different kinds of derivative can be defined for it: the distributional derivative, and the ordinary one of Calculus. As a distribution, it always has a derivative; as a function, it may not have one. To compare the distributional derivative of a function to the ordinary one, let us consider the case of a function $f : \mathbb{R} \to \mathbb{R}$ such that (i) it defines a regular distribution, (ii) it is of class $C^1$ in $\mathbb{R} \setminus \{a\}$, (iii) the one-sided limits $\lim_{x\to a^\pm} f(x) = f(a\pm)$ exist (not necessarily equal), (iv) $f'(x)$ is itself a regular distribution.² Let us find the derivative of the regular distribution $T_f$ using def. (5.6):

\[
T'_f(\phi) = -T_f(\phi') = -\left( \int_{-\infty}^{a} + \int_{a}^{+\infty} \right) f\,\phi'\, dx .
\]

In each of the two integrals integration by parts is legitimate. It yields:

\[
T'_f(\phi) = [f(a+) - f(a-)]\,\phi(a) + \int_{-\infty}^{+\infty} f'\,\phi\, dx .
\]

This can be rewritten as follows:

\[
T'_f = [f(a+) - f(a-)]\,\delta_a + T_{f'} . \tag{5.7}
\]

2 $f'(x)$ may not exist in x = a, but this is irrelevant in the definition of $f'$ as a regular distribution, because $f'(x)$ is under an integral sign.


In the special case when f is $C^1$ with no exception in x = a, (5.7) shows that $T'_f = T_{f'}$. In other words, the distributional derivative and the ordinary derivative of a $C^1$ function coincide.³ However this may be blatantly false if f is not sufficiently regular.

Let us consider the following example: define f(x) = 1 for $x \ge 0$, and f(x) = 0 for x < 0. This is called the unit step function and will be denoted s(x). The ordinary derivative is 0 a.e., and it does not exist in x = 0. Instead, using the above result, and using improper notations, the distributional derivative is:

\[
s'(x) = \delta(x) . \tag{5.8}
\]
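Both (5.7) and (5.8) can be tested numerically by evaluating the two sides on a concrete test function. The sketch below is a hedged illustration: it uses a midpoint rule on each side of the jump, the test function $\phi(x) = e^{-x^2}$, and an arbitrary piecewise-smooth f with a jump at a = 0.3; all names are hypothetical helpers.

```python
import numpy as np

def midpoint(g, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    x = lo + (np.arange(n) + 0.5) * h
    return np.sum(g(x)) * h

def both_sides(f_left, df_left, f_right, df_right, a, phi, dphi, L=10.0):
    """Return (-T_f(phi'), jump * phi(a) + T_{f'}(phi)) for f = f_left on x < a, f_right on x > a."""
    lhs = -(midpoint(lambda x: f_left(x) * dphi(x), -L, a)
            + midpoint(lambda x: f_right(x) * dphi(x), a, L))
    jump = f_right(a) - f_left(a)
    rhs = (jump * phi(a)
           + midpoint(lambda x: df_left(x) * phi(x), -L, a)
           + midpoint(lambda x: df_right(x) * phi(x), a, L))
    return lhs, rhs

phi = lambda x: np.exp(-x ** 2)
dphi = lambda x: -2 * x * np.exp(-x ** 2)

# a function with a jump at a = 0.3: sin(x) on the left, 2 + cos(x) on the right
lhs, rhs = both_sides(np.sin, np.cos,
                      lambda x: 2 + np.cos(x), lambda x: -np.sin(x),
                      0.3, phi, dphi)
# the unit step function: eq. (5.8) predicts -T_s(phi') = phi(0) = 1
step_lhs, step_rhs = both_sides(lambda x: 0 * x, lambda x: 0 * x,
                                lambda x: 1 + 0 * x, lambda x: 0 * x,
                                0.0, phi, dphi)
print(lhs, rhs)
print(step_lhs, step_rhs)
```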

Problem 67: Derive eqn.(5.8) directly from the definition of the derivative of a distribution.

The next example introduces a noteworthy singular distribution.

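The distribution $P\frac{1}{x-a}$.

Consider $f(x) = \log|x - a|$, ($a \in \mathbb{R}$). Its ordinary derivative is a.e. the function $\frac{1}{x-a}$, which is not a distribution, because it has a non-integrable singularity at x = a. So the distributional derivative of f(x) has to be different. For the derivative of the regular distribution $T_f$, the definition yields:

\[
T'_f(\varphi) = -T_f(\varphi') = -\int \varphi'(x)\, \log|x - a|\, dx = -\lim_{\epsilon\searrow 0} \int_{\{|x-a|\ge\epsilon\}} \varphi'(x)\, \log|x - a|\, dx
\]

(the last step, by dominated convergence). Integrating by parts in $]-\infty, a-\epsilon]$ and in $[a+\epsilon, +\infty[$ we find:

\[
T'_f(\varphi) = \lim_{\epsilon\searrow 0} \left\{ \int_{\{|x-a|\ge\epsilon\}} \frac{\varphi(x)}{x-a}\, dx + [\varphi(a+\epsilon) - \varphi(a-\epsilon)]\, \log\epsilon \right\} = \lim_{\epsilon\searrow 0} \int_{\{|x-a|\ge\epsilon\}} \frac{\varphi(x)}{x-a}\, dx ,
\]

because $[\varphi(a+\epsilon) - \varphi(a-\epsilon)]\log\epsilon = O(\epsilon\log\epsilon) \to 0$.

Definition 42: The distribution defined by:

\[
P\frac{1}{x-a}(\varphi) = \lim_{\epsilon\searrow 0} \int_{\{|x-a|\ge\epsilon\}} \frac{\varphi(x)}{x-a}\, dx \tag{5.9}
\]

is called Principal part of $\frac{1}{x-a}$, and is denoted by $P\frac{1}{x-a}$.

The above calculation shows that this distribution is the distributional derivative of $\log|x - a|$. It is a singular distribution.⁴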
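The identity between the principal part and the distributional derivative of $\log|x-a|$ lends itself to a numerical sanity check. The sketch below takes a = 0 and the test function $\varphi(x) = e^{-(x-1)^2}$ (both arbitrary choices), computes the symmetric cut-off integral of (5.9) with a small $\epsilon$, and compares it with $-\int \varphi'(x)\log|x|\,dx$. Function names are hypothetical.

```python
import numpy as np

def principal_value(phi, eps, L=12.0, n=400_000):
    """Integral of phi(x)/x over eps <= |x| <= L, folded onto x > 0 (midpoint rule)."""
    h = (L - eps) / n
    x = eps + (np.arange(n) + 0.5) * h
    return np.sum((phi(x) - phi(-x)) / x) * h

def minus_log_pairing(dphi, L=12.0, n=400_000):
    """-integral of phi'(x) log|x| dx on a midpoint grid that never hits x = 0."""
    h = 2 * L / n
    x = -L + (np.arange(n) + 0.5) * h
    return -np.sum(dphi(x) * np.log(np.abs(x))) * h

phi = lambda x: np.exp(-(x - 1.0) ** 2)
dphi = lambda x: -2 * (x - 1.0) * np.exp(-(x - 1.0) ** 2)

pv = principal_value(phi, eps=1e-6)
log_side = minus_log_pairing(dphi)
print(pv, log_side)
```

The two numbers should agree to several digits; the residual difference comes from the logarithmic singularity, which the midpoint rule handles only to first order in the step size.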

The function $\log(x + i\epsilon)$ ($-\pi < \arg(x + i\epsilon) \le \pi$) is a tempered distribution (why?). Explicitly:

\[
\log(x + i\epsilon) = \log|x + i\epsilon| + i \arg(x + i\epsilon) .
\]

3 This is true more in general for all absolutely continuous functions. An absolutely continuous function is a function which has a derivative a.e., and is equal to the integral of that derivative, that is, $\int_{x_1}^{x_2} dx\, f'(x) = f(x_2) - f(x_1)$ for all $x_1 < x_2$.

4 Existence of the limit in Def. 42 is an automatic consequence of the existence of the distributional derivative.


As $\epsilon \searrow 0$ the function $\log|x + i\epsilon|$ pointwise converges a.e. to $\log|x|$ and $\arg(x + i\epsilon)$ tends to $\pi s(-x)$. The same is true in the weak-* sense.

Problem 68: Prove the last statement (Dominated convergence ...).

As differentiation is a continuous operation in $\mathcal{S}'_N$, it follows that:

\[
\text{w-}\lim_{\epsilon\searrow 0} \frac{1}{x + i\epsilon} = P\frac{1}{x} - i\pi\delta .
\]

In a totally similar way one finds the more general formula:

\[
\frac{1}{x \pm i0} := \text{w-}\lim_{\epsilon\searrow 0} \frac{1}{x \pm i\epsilon} = P\frac{1}{x} \mp i\pi\delta . \tag{5.10}
\]
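Formula (5.10) can be observed numerically on a single test function. In the sketch below, $\varphi(x) = (1+x)e^{-x^2}$ is chosen because both terms on the right are elementary: $P\frac{1}{x}(\varphi) = \int e^{-x^2} dx = \sqrt{\pi}$ (the odd part $e^{-x^2}/x$ drops out of the symmetric limit) and $\delta(\varphi) = 1$, so the predicted limit for the upper sign is $\sqrt{\pi} - i\pi$. Grid parameters are arbitrary, chosen so that the step is much smaller than the smallest $\epsilon$.

```python
import numpy as np

def pair_with_shifted_pole(phi, eps, L=10.0, n=400_000):
    """Midpoint-rule approximation of the integral of phi(x) / (x + i eps) dx."""
    h = 2 * L / n
    x = -L + (np.arange(n) + 0.5) * h
    return np.sum(phi(x) / (x + 1j * eps)) * h

phi = lambda x: (1 + x) * np.exp(-x ** 2)
limit = np.sqrt(np.pi) - 1j * np.pi * phi(0.0)   # P(1/x)(phi) - i pi delta(phi)
eps_values = (1e-1, 1e-2, 1e-3)
errs = [abs(pair_with_shifted_pole(phi, eps) - limit) for eps in eps_values]
for eps, e in zip(eps_values, errs):
    print(f"eps = {eps:.0e}   |integral - limit| = {e:.3e}")
```

The distance from the predicted limit shrinks roughly in proportion to $\epsilon$.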

Potential of a point charge.

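The function $\frac{1}{r}$, where $r = \|x\|$, is locally summable in $\mathbb{R}^3$. Moreover it is harmonic in $\mathbb{R}^3 \setminus \{0\}$ (that is, $\triangle\frac{1}{r} = 0$ at all points $\ne 0$, as it is easily verified by direct calculation). We shall compute its distributional Laplacean (i.e. the Laplacean of the corresponding regular distribution $T_{1/r}$). Let $\varphi$ be an arbitrary test function, and let $\sigma_\epsilon$, $\Sigma_R$ be spheres centered in 0, with respective radii $\epsilon$ and R, with $0 < \epsilon < R$. As $\frac{1}{r}$ is harmonic in $\mathbb{R}^3 \setminus \{0\}$,

\[
\begin{aligned}
\triangle T_{1/r}(\varphi) = T_{1/r}(\triangle\varphi) &= \int_{\mathbb{R}^3} \frac{1}{r}\, \triangle\varphi\, dx \\
&= \lim_{\substack{\epsilon\to 0 \\ R\to+\infty}} \int_{\Sigma_R\setminus\sigma_\epsilon} \frac{1}{r}\, \triangle\varphi\, dx \\
&= \lim_{\substack{\epsilon\to 0 \\ R\to+\infty}} \int_{\Sigma_R\setminus\sigma_\epsilon} \left[ \frac{1}{r}\, \triangle\varphi - \varphi\, \triangle\frac{1}{r} \right] dx .
\end{aligned} \tag{5.11}
\]

Next we use the Gauss-Green formula:

\[
\triangle\frac{1}{r}(\varphi) = \lim_{\epsilon\to 0} \int_{\partial\sigma_\epsilon} \left[ -\frac{1}{\epsilon}\frac{\partial\varphi}{\partial r} - \frac{\varphi}{\epsilon^2} \right] dS + \lim_{R\to+\infty} \int_{\partial\Sigma_R} \left[ \frac{1}{R}\frac{\partial\varphi}{\partial r} + \frac{\varphi}{R^2} \right] dS ,
\]

where integrals are calculated on the surfaces of the spheres. The $R \to +\infty$ limit vanishes because $\varphi$ and its derivatives on $\Sigma_R$ vanish faster than any power of 1/R. As to the 1st integral: when $\epsilon \to 0$, the 1st term within square brackets yields an infinitesimal contribution:

\[
\epsilon^{-1} \left| \int_{\partial\sigma_\epsilon} \frac{\partial\varphi}{\partial r}\, dS \right| \le 4\pi\epsilon\, \|\nabla\varphi\|_\infty ,
\]

while the 2nd term tends to $-4\pi\varphi(0)$:

\[
\left| \epsilon^{-2} \int_{\partial\sigma_\epsilon} \varphi\, dS - 4\pi\varphi(0) \right| = \epsilon^{-2} \left| \int_{\partial\sigma_\epsilon} (\varphi - \varphi(0))\, dS \right| \le 4\pi\epsilon\, \|\nabla\varphi\|_\infty .
\]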


In conclusion:

\[
\triangle\frac{1}{r} = -4\pi\delta . \tag{5.12}
\]

This is a well known result from physics. It says that the electrostatic (or gravitational) potential that is produced by a point charge (or mass) located in 0 is $\frac{1}{r}$ (apart from a constant factor).
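Result (5.12) can be verified numerically on radial test functions $\varphi(x) = g(\|x\|)$, for which $\triangle\varphi = g'' + 2g'/r$ and the pairing $T_{1/r}(\triangle\varphi)$ reduces to a one-dimensional integral. The following sketch is an illustration under these assumptions; it approximates the derivatives of g by central differences, and the particular g is an arbitrary choice with $\varphi(0) = 2$, so the expected value is $-8\pi$.

```python
import numpy as np

def laplacian_pairing(g, r_max=12.0, n=200_000, h_fd=1e-4):
    """T_{1/r}(Laplacian phi) for a radial test function phi(x) = g(|x|), with g even in r."""
    h = r_max / n
    r = (np.arange(n) + 0.5) * h                           # midpoint grid, avoids r = 0
    d1 = (g(r + h_fd) - g(r - h_fd)) / (2 * h_fd)          # g'
    d2 = (g(r + h_fd) - 2 * g(r) + g(r - h_fd)) / h_fd ** 2  # g''
    lap = d2 + 2 * d1 / r                                  # radial Laplacian in R^3
    return np.sum(lap / r * 4 * np.pi * r ** 2) * h        # integral of (1/r) * lap over R^3

g = lambda r: (2 + r ** 2) * np.exp(-r ** 2)   # phi(0) = g(0) = 2
value = laplacian_pairing(g)
print(value, -4 * np.pi * g(0.0))
```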

Problem 69: Find the distributional derivative $\frac{\partial^2}{\partial x_1 \partial x_2} F(x_1, x_2)$, where F is the characteristic function of the 1st quadrant in $\mathbb{R}^2$ (Ans.: $\delta$).

Problem 70: In $\mathbb{R}^3$ denote r the radial coordinate. For R > 0 find the distributional derivative $\frac{\partial}{\partial r}$ of the function $r^{-2}\chi_R$, where $\chi_R$ is the characteristic function of the region where r > R (Ans.: $R^{-2}\delta_{\Sigma_R(0)}$).

5.3 Other Operations.

5.3.1 Change of variables.

Let A denote a linear non-singular transformation in $\mathbb{R}^N$ and $a \in \mathbb{R}^N$ an arbitrarily fixed point. Let $L : x \mapsto Ax + a$, and let $L'$ be the operator that turns a function f(x) into $L'f(x) := f(Lx)$. Particular cases are: (left) translation $\tau_a$, which corresponds to A = I:

\[
\tau_a f(x) = f(x + a) ,
\]

and parity R, which corresponds to a = 0 and A = −I. If $T = \text{w-}\lim \phi_n$ then let $L'T := \text{w-}\lim L'\phi_n$, that is: $L'T(\varphi) = \lim \int dx\, \phi_n(Lx)\varphi(x)$. Changing integration variables from x to $x' = Lx$,

\[
(L'T)(\varphi) = \frac{1}{|\det(A)|} \lim_{n\to+\infty} \int_{\mathbb{R}^N} dx'\, \phi_n(x')\, \varphi(L^{-1}x') = \frac{1}{|\det(A)|}\, T\big( (L^{-1})'\varphi \big) .
\]

The improper notation for $L'T$ is ''T(Lx)''.

Problem 71: Show that: (1) $\delta(2x) = 2^{-N}\delta(x)$, (2) $\tau_{-a}\delta = \delta_a$, (3) $R\delta_a = \delta_{-a}$.

5.3.2 Product.

The product of distributions cannot be defined in general. As a matter of fact, $f_n \xrightarrow{w} T$ and $g_n \xrightarrow{w} S$ do not ensure existence of the weak-* limit $\text{w-}\lim f_n g_n$, which would be the natural definition of the product ST: see Problem 72. As a further counterexample: every square-summable f(x) is a regular distribution; however $f(x)^2 = f(x)f(x)$ may not be such; take, for instance, $f(x) = |1 - x|^{-1/2}$.

The product can be nevertheless defined in several particular cases. The simplest is when one of the distributions is regular, and is associated with a function F(x) of class $C^\infty(\mathbb{R}^N)$, with algebraic growth. The product FT is then defined for any $T \in \mathcal{S}'(\mathbb{R}^N)$ as follows:

\[
(FT)(\varphi) = T(F\varphi) .
\]


This is justified on noting that: (1) $F\varphi$ is always a test function, so the rhs is well defined, and (2) if $\phi_n \xrightarrow{w} T$ then $F\phi_n \xrightarrow{w} FT$. The product thus defined is a weak-* continuous operation, that is, if $T_n \xrightarrow{w} T_\infty$ then $FT_n \xrightarrow{w} FT_\infty$.

Problem 72: Let $f_n(x)$ be the sequence in Problem 63: $f_n \xrightarrow{w} \delta$. Show that the weak-* limit of the sequence $f_n(x)^2$ does not exist.

Problem 73: Find the product $F\delta_a$.

Problem 74: Find the product $x\, P\frac{1}{x}$.

Problem 75: Find the products $x\, \frac{1}{x \pm i0}$ (see eqn. (5.10)).

5.3.3 Tensor Product.

Let f and g be functions defined on $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively. The tensor product of f and g is a function on $\mathbb{R}^{n+m}$ which is denoted $f \otimes g$ and is defined as follows:

\[
(f \otimes g)(x, y) = f(x)\, g(y) , \quad x \in \mathbb{R}^n ,\ y \in \mathbb{R}^m . \tag{5.13}
\]

We would like to generalize this operation to distributions. To this end we note that, for consistency, the tensor product of regular distributions $T_f$ and $T_g$ must be defined such that:

\[
T_f \otimes T_g = T_{f\otimes g} . \tag{5.14}
\]

Applying this to a test function in $\mathcal{S}(\mathbb{R}^{n+m})$ of the form $\phi \otimes \psi$, where $\phi \in \mathcal{S}(\mathbb{R}^n)$, $\psi \in \mathcal{S}(\mathbb{R}^m)$:

\[
T_{f\otimes g}(\phi \otimes \psi) = \int\!\!\int dx\, dy\, f(x)\, g(y)\, \phi(x)\, \psi(y) = \int dx\, f(x)\phi(x) \int dy\, g(y)\psi(y) = T_f(\phi)\, T_g(\psi) . \tag{5.15}
\]

Using an arbitrary test function $\varphi \in \mathcal{S}(\mathbb{R}^{m+n})$ we formally obtain:

\[
T_{f\otimes g}(\varphi) = \int dx\, f(x) \int dy\, g(y)\, \varphi(x, y) = T_{f,x}\big( T_{g,y}(\varphi(x, y)) \big) . \tag{5.16}
\]

In the last line, the notation $T_{f,x}$ means that the distribution $T_f$ operates on test functions of the variable x; and similarly $T_{g,y}$ means that $T_g$ works on $\varphi(x, y)$ as a function of y. Now note that (5.16) may be taken as the definition of $T_f \otimes T_g$ and that, moreover, the last line remains meaningful even for arbitrary distributions, and not only for regular ones; so it may be taken as the definition of the tensor product of two arbitrary distributions. The following theorem, of which no proof will be given here, justifies this idea.


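Theorem 52: Let $T \in \mathcal{S}'(\mathbb{R}^n)$, and $S \in \mathcal{S}'(\mathbb{R}^m)$.

(1) There is one and only one $W \in \mathcal{S}'(\mathbb{R}^{n+m})$ such that, $\forall \phi \in \mathcal{S}(\mathbb{R}^n)$, $\forall \psi \in \mathcal{S}(\mathbb{R}^m)$:

\[
W(\phi \otimes \psi) = T(\phi)\, S(\psi) ;
\]

the distribution W is termed the tensor product of T and S and is denoted $T \otimes S$.

(2) $\forall \varphi \in \mathcal{S}(\mathbb{R}^{n+m})$, the function of $x \in \mathbb{R}^n$ defined by $S_y(\varphi(x, y))$ is a test function in $\mathcal{S}(\mathbb{R}^n)$; and the function of $y \in \mathbb{R}^m$ defined by $T_x(\varphi(x, y))$ is in $\mathcal{S}(\mathbb{R}^m)$.

(3) $(T \otimes S)(\varphi) = T_x(S_y(\varphi(x, y))) = S_y(T_x(\varphi(x, y)))$.

(4) $T \otimes S$ is continuous with respect to T and S, in the sense of weak-* convergence.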

5.4 Fourier transform.

The Fourier transform of a tempered distribution T is defined by the same method which was used for derivatives. Starting from $T(\varphi) = \lim \int dx\, \phi_n(x)\varphi(x)$ one sets $\hat T(\varphi) := \lim \int dx\, \hat\phi_n(x)\varphi(x)$, whence, on account of the result in Problem 60, $\hat T(\varphi) = \lim \int dx\, \phi_n(x)\hat\varphi(x)$, and this finally leads to:

Definition 43: The Fourier transform of a tempered distribution T is defined by:

\[
\hat T(\varphi) := T(\hat\varphi) .
\]

It should be emphasized that this definition rests on the key fact that the FT of a test function is still a test function (Prop. 37).

Theorem 53: The Fourier transform $\mathcal{F} : T \mapsto \hat T$ is a vector isomorphism of the space $\mathcal{S}'(\mathbb{R}^N)$ in itself. Its inverse is the transform $\mathcal{F}^{-1} : T \mapsto \check T$, where

\[
\check T(\varphi) := T(\check\varphi) .
\]

Both $\mathcal{F}$ and $\mathcal{F}^{-1}$ are continuous maps of $\mathcal{S}'(\mathbb{R}^N)$ in itself.

Proof: if $T_n \xrightarrow{w} T_\infty$ then $\hat T_\infty(\varphi) = T_\infty(\hat\varphi) = \lim_{n\to+\infty} T_n(\hat\varphi) = \lim_{n\to+\infty} \hat T_n(\varphi)$, so $\hat T_n \xrightarrow{w} \hat T_\infty$, and this proves continuity. □

Problem 76: Show that $\mathcal{F}$ and $\mathcal{F}^{-1}$ are vector isomorphisms. (From their definitions and from Corollary 4 to Prop. 37.)

A regular distribution always has a FT in the sense of distributions. If at the same time it also has a FT in the ordinary sense, then the two FT coincide (a.e. at least). In the special case of a test function this is immediate from Problem 60. More generally:

Proposition 46: The distributional FT of $f \in L^2(\mathbb{R}^N)$ coincides (a.e.) with the Fourier-Plancherel transform of f.


Proof: one can find a sequence $\phi_n$ of test functions such that $\phi_n \to f$ in the sense of $L^2$. Therefore, $\hat\phi_n$ tends to $\mathcal{F}f$ in the same sense, because $\mathcal{F}$ is unitary, and $\mathcal{F}\phi_n = \hat\phi_n$. Thanks to Prop. 45, $\phi_n$ converges to f also in the sense of distributions, and $\hat\phi_n$ converges to $\mathcal{F}f$ in the same sense. On the other hand $\hat\phi_n$ converges in the weak-* sense to the distributional FT of f, due to weak-* continuity of distributional FT; so the thesis follows. □

Problem 77: Show that the differential equation $T' = T$ has no solutions $T \in \mathcal{S}'_1$ other than T = 0.

5.4.1 Explicit Calculation of some Transforms.

The following properties of FT are of frequent use.

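Theorem 54: For $\mathbf p$, $\mathbf q$ arbitrary multi-indices, and $T \in \mathcal{S}'(\mathbb{R}^N)$:

1. $(ik)^{\mathbf q}\, \partial^{\mathbf p} \hat T = \left( \partial^{\mathbf q} [(-ix)^{\mathbf p} T] \right)^\wedge$;

2. $\hat T = R\check T$, where R is the parity operator;

3. $(\tau_a T)^\wedge = e^{i\langle a|k\rangle}\, \hat T$, where $a \in \mathbb{R}^N$ and $\tau_a$ is the corresponding translation.

Proof: such properties are already known for test functions. Their extension to distributions is a direct application of definitions. Only property 1 with $\mathbf q = 0$ will be proven here, because the proofs of the other properties are essentially identical.

\[
\begin{aligned}
\partial^{\mathbf p} \hat T(\varphi) &= (-1)^{|\mathbf p|}\, \hat T(\partial^{\mathbf p}\varphi) \\
&= (-1)^{|\mathbf p|}\, T\big( (\partial^{\mathbf p}\varphi)^\wedge \big) \\
&= (-1)^{|\mathbf p|}\, T\big( (ix)^{\mathbf p} \hat\varphi \big) \\
&= \big( (-ix)^{\mathbf p}\, T \big)(\hat\varphi) = \big( (-ix)^{\mathbf p}\, T \big)^\wedge (\varphi) .
\end{aligned} \tag{5.17}
\]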

The Fourier transform of δ: the formula of Dirac.

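From definitions:

\[
\hat\delta(\varphi) = \delta(\hat\varphi) = \hat\varphi(0) = (2\pi)^{-N/2} \int \varphi\, dx ,
\]

so $\hat\delta$ is the regular distribution that is associated with the constant function:

\[
\hat\delta = (2\pi)^{-N/2} . \tag{5.18}
\]

Using this result it is easy to find the FT of the constant function = 1:

\[
\hat 1 = (R1)^\vee = \check 1 = (2\pi)^{N/2} \left( (2\pi)^{-N/2} \right)^\vee = (2\pi)^{N/2}\, \delta . \tag{5.19}
\]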


In improper notation and for N = 1, the just found result can be written in the form:

\[
\delta(x) = (2\pi)^{-1} \int_{-\infty}^{+\infty} dk\, e^{-ikx} . \tag{5.20}
\]

This is known as the Dirac formula. Such a representation of $\delta$ in the form of an integral is quite common and not just formal. Indeed, denoting $\chi_a(x)$ the characteristic function of the interval $[-a, +a]$, it is easy to see that for $a \to +\infty$ the function $\chi_a$ tends pointwise, and also in the weak-* sense, to the constant function = 1. Thanks to weak-* continuity of FT, it follows that $\hat\chi_a$ tends (weak-*) to $\hat 1$, so, using (5.19):

\[
\delta(x) = (2\pi)^{-1/2}\, \hat 1 = (2\pi)^{-1/2}\, \text{w-}\lim_{a\to+\infty} \hat\chi_a = \frac{1}{2\pi}\, \text{w-}\lim_{a\to\infty} \int_{-a}^{a} dk\, e^{-ikx} . \tag{5.21}
\]

Thus, eqn. (5.20) is fully meaningful, provided that convergence of the generalized integral is understood in the weak-* sense.

The integral in the bottom line in eqn. (5.21) can be explicitly computed. Then another widely used representation of the Dirac $\delta$ is obtained:

\[
\delta(x) = \text{w-}\lim_{a\to\infty} \frac{\sin(ax)}{\pi x} .
\]
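The last representation can be checked on a test function for which the pairing is known exactly. For $\varphi(x) = e^{-x^2}$ one has $\int \frac{\sin(ax)}{\pi x} e^{-x^2} dx = \operatorname{erf}(a/2)$, which tends to $\varphi(0) = 1$ as $a \to \infty$. The sketch below is a numerical illustration only; it relies on NumPy's normalized `np.sinc(t) = sin(pi t)/(pi t)` to evaluate the kernel safely at x = 0, and the helper name is hypothetical.

```python
import numpy as np
from math import erf

def pair_with_dirichlet_kernel(phi, a, half_width=10.0, points=200_001):
    """Riemann-sum approximation of the integral of sin(a x)/(pi x) * phi(x) dx."""
    x = np.linspace(-half_width, half_width, points)
    dx = x[1] - x[0]
    kernel = (a / np.pi) * np.sinc(a * x / np.pi)   # equals sin(a x)/(pi x), finite at x = 0
    return np.sum(kernel * phi(x)) * dx

phi = lambda x: np.exp(-x ** 2)
results = {a: pair_with_dirichlet_kernel(phi, a) for a in (1.0, 5.0, 40.0)}
for a, v in results.items():
    print(f"a = {a:5.1f}   pairing = {v:.10f}   erf(a/2) = {erf(a / 2):.10f}")
```

Note that the kernel itself does not become small away from x = 0; it oscillates faster and faster, and only its pairing with a test function converges.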

Fourier transforms of derivatives of $\delta$ are immediately found using the definitions:

\[
(\partial^{\mathbf q}\delta)^\wedge = (2\pi)^{-N/2}\, (ik)^{\mathbf q} . \tag{5.22}
\]

From this, by inverse FT, we infer that

The FT of a polynomial is a combination of a finite number of derivatives of the Dirac $\delta$ with support in the origin.

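Fourier Transform of $P\frac{1}{x}$.

Formula (5.10) will be used. From Problem 61 we know that, $\forall\epsilon > 0$,

\[
\left(\frac{1}{x+i\epsilon}\right)^{\wedge}(k) = -i\sqrt{2\pi}\; s(k)\, e^{-\epsilon k},
\]

where $s$ is the unit step function; moreover we know that $\hat\delta = (2\pi)^{-1/2}$. Taking the FT of both members in (5.10), substituting such results, and taking the $\epsilon \searrow 0$ limit, one finds that:

\[
\left(P\frac{1}{x}\right)^{\wedge}(k) = -i\sqrt{\frac{\pi}{2}}\;\mathrm{sgn}(k)\,. \tag{5.23}
\]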


Transform of the spherical layer distribution.

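Let $\Sigma_R(0)$ be the sphere of radius $R$ in $\mathbb R^3$ with center at $0$, and let $\delta_{\Sigma_R(0)}$ be the corresponding surface distribution. From the definitions:

\[
\hat\delta_{\Sigma_R(0)}(\varphi) = \delta_{\Sigma_R(0)}(\hat\varphi) = \int_{\Sigma_R(0)} \hat\varphi(x)\,dS
= (2\pi)^{-3/2}\int_{\mathbb R^3} dk\;\varphi(k)\int_{\Sigma_R(0)} e^{-i\langle x|k\rangle}\,dS_x\,.
\]

The inner integral can be computed using spherical coordinates $\phi, \theta$ on $\Sigma_R$, with the polar axis aligned with the vector $k$:

\[
\int_{\Sigma_R(0)} e^{-i\langle x|k\rangle}\,dS_x
= R^2\int_0^{2\pi} d\phi\int_0^{\pi} e^{-iR\|k\|\cos\theta}\,\sin\theta\,d\theta
= 4\pi R\,\frac{\sin(R\|k\|)}{\|k\|}\,,
\]

and so:

\[
\hat\delta_{\Sigma_R(0)} = \sqrt{\frac{2}{\pi}}\;R\;\frac{\sin(R\|k\|)}{\|k\|}\,. \tag{5.24}
\]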
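As a sanity check of the surface integral used to obtain (5.24), the $\theta$-integral can be evaluated numerically and compared with the closed form $4\pi R\,\sin(R\|k\|)/\|k\|$. This is a minimal sketch; the function names and the sample values of $R$ and $\|k\|$ are arbitrary choices made here for illustration.

```python
import cmath
import math

def sphere_integral(R, k, n=20000):
    """Midpoint rule for the integral of exp(-i<x|k>) over the sphere of radius R.
    With the polar axis along k, <x|k> = R*k*cos(theta) and the phi-integral gives 2*pi."""
    h = math.pi / n
    total = 0j
    for j in range(n):
        theta = (j + 0.5) * h
        total += cmath.exp(-1j * R * k * math.cos(theta)) * math.sin(theta)
    return 2.0 * math.pi * R * R * total * h

def closed_form(R, k):
    """Right-hand side of the surface integral: 4*pi*R*sin(R*k)/k."""
    return 4.0 * math.pi * R * math.sin(R * k) / k

print(sphere_integral(1.5, 2.0), closed_form(1.5, 2.0))
```

Multiplying the closed form by $(2\pi)^{-3/2}$ gives the prefactor $\sqrt{2/\pi}\,R$ of (5.24).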

Problem 78: Find the FT of the regular distribution cos(ωx) .

Problem 79: Find the Fourier transform of $f(x) = |x|$ in $\mathcal S'_1$. ($|x| = x\cdot\mathrm{sign}(x)$; use property 1 in Thm. 54, and (5.23).)

5.4.2 Convolution of distributions.

The convolution product, or simply the convolution, of two functions $f, g$ in $\mathbb R^N$ is the operation that is formally defined by:

\[
(f * g)(x) := \int_{\mathbb R^N} dx'\, f(x - x')\,g(x')\,. \tag{5.25}
\]

It is an important operation with several applications; however, it is quite a delicate one. As was the case with the ordinary product, the convolution product of two distributions cannot be defined in general, but only in some special cases. Here only the simplest are presented.

Theorem 55: Let $f$ and $g$ be summable over $\mathbb R^N$; then, for a.e. $x \in \mathbb R^N$, $f(x - x')g(x')$ is a summable function of $x'$, so (5.25) a.e. defines a function $f * g$. This function is summable, and moreover $f * g = g * f$.

Proof: omitted. □

Corollary 6: Under the same assumptions as in Thm. 55,

\[
(f * g)^{\wedge} = (2\pi)^{N/2}\,\hat f\cdot\hat g\,.
\]


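Proof:

\[
\begin{aligned}
(f * g)^{\wedge}(k) &= (2\pi)^{-N/2}\int_{\mathbb R^N} dx\, e^{-i\langle k|x\rangle}\int_{\mathbb R^N} dx'\, f(x - x')\,g(x') \\
&= (2\pi)^{-N/2}\int_{\mathbb R^N} dx'\, g(x')\,e^{-i\langle k|x'\rangle}\int_{\mathbb R^N} dx\, e^{-i\langle k|x - x'\rangle} f(x - x')\,,
\end{aligned}
\tag{5.26}
\]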

and now the claim follows on changing variables in the second integral, from $x$ to $x'' := x - x'$. □

Problem 80: (1) Show, using the above corollary and eqn. (4.14), that the convolution $g_\sigma * g_{\sigma'}$ of normalized Gaussians (as defined in (4.13)) is $g_{\sigma''}$, where $\sigma''^2 = \sigma^2 + \sigma'^2$. (2) Conclude that the operators $T_t$ that are defined by the "Heat kernel" (3.24) have the Semigroup Property, that is:

\[
T_{t+s} = T_t T_s
\]

holds true for all $t, s > 0$. (From (3.22) and (3.24) deduce that $T_tT_s$ has an integral kernel given by $G_t * G_s$. Then use the result at point (1) in this problem.)
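The statement of Problem 80 (1) can be checked numerically before attempting the proof. The sketch below assumes that the normalized Gaussian of (4.13) has the usual form $g_\sigma(x) = (2\pi\sigma^2)^{-1/2}e^{-x^2/2\sigma^2}$ in one dimension; that formula is not reproduced in this section, so it is an assumption, as are the function names and the sample widths.

```python
import math

def gauss(x, sigma):
    """Normalized 1D Gaussian of width sigma (assumed form of eqn. (4.13))."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def convolve_at(x, s1, s2, n=8000, half_width=20.0):
    """Midpoint-rule value of (g_s1 * g_s2)(x) as defined in (5.25)."""
    h = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        xp = -half_width + (j + 0.5) * h
        total += gauss(x - xp, s1) * gauss(xp, s2)
    return total * h

s1, s2 = 0.6, 0.8
s_sum = math.sqrt(s1 * s1 + s2 * s2)  # sigma'' with sigma''^2 = sigma^2 + sigma'^2
for x in (0.0, 0.7, 2.0):
    print(x, convolve_at(x, s1, s2), gauss(x, s_sum))
```

The agreement of the two columns is the one-dimensional content of the semigroup property: widths add in quadrature, so the kernel parameters add.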

Problem 81: Show that the convolution of two Lorentzian functions (4.15) is a Lorentzian .

Problem 82: Show that $\int dx\,(f * g)(x) = \left(\int dx\, f(x)\right)\left(\int dx\, g(x)\right)$. (Use Corollary 6, noting that for all summable $f$ one has that $\int dx\, f(x) = (2\pi)^{N/2}\hat f(0)$.)

A natural way to define the convolution $S * T$ of two tempered distributions is then:

\[
S * T = (2\pi)^{N/2}\,(\hat S\cdot\hat T)^{\vee}\,, \tag{5.27}
\]

provided that the product of distributions on the rhs has a meaning. Now recall that this is always the case whenever at least one of the distributions is a function of class $C^\infty$ with algebraic growth (cp. sec. 5.3.2). As a consequence:

The convolution of two distributions is well defined by (5.27) whenever the Fourier transform of at least one of them is $C^\infty$ with algebraic growth.

In particular, on account of eqs. (5.22), $\delta$ and its derivatives have a convolution with any tempered distribution. From definition (5.27), using (5.18):

\[
\delta * T = T \tag{5.28}
\]

$\forall T \in \mathcal S'(\mathbb R^N)$; in other words, $\delta$ acts as a unit element for the convolution product. Moreover, using $\partial^{\mathbf q}\delta * T = \big((ik)^{\mathbf q}\cdot\hat T\big)^{\vee}$ and property 1 in Thm. 54:

\[
\partial^{\mathbf q}\delta * T = \partial^{\mathbf q}T\,. \tag{5.29}
\]

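From this, (5.22), and (1) in Thm. 54, we deduce the rule of differentiating convolution products:

\[
\begin{aligned}
\partial^{\mathbf q}(S * T) &= (\partial^{\mathbf q}\delta) * (S * T) = \big((ik)^{\mathbf q}(S * T)^{\wedge}\big)^{\vee} \\
&= (2\pi)^{N/2}\big((ik)^{\mathbf q}\,\hat S\cdot\hat T\big)^{\vee}
= (2\pi)^{N/2}\big((\partial^{\mathbf q}S)^{\wedge}\cdot\hat T\big)^{\vee} = (\partial^{\mathbf q}S) * T\,;
\end{aligned}
\tag{5.30}
\]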


and so,

\[
\partial^{\mathbf q}(S * T) = (\partial^{\mathbf q}S) * T = S * \partial^{\mathbf q}T\,, \tag{5.31}
\]

because convolution is commutative.

Problem 83: Show that, $\forall T \in \mathcal S'_N$, $\delta_a * T = \tau_a T$. (Use Def. (5.27) and property 3 in Thm. 54.)

Convolution of more than two distributions requires caution, because convolution is not associative in general - that is, $(S * T) * U$ may not be the same as $S * (T * U)$; for instance $(s * \delta') * 1 = s' * 1 = \delta * 1 = 1$, whereas $s * (\delta' * 1) = s * 1' = s * 0 = 0$. This difficulty is absent under special conditions that will not be described here.

The Heat kernel $G_t(x)$ (3.24) is a test function, so it has a convolution with all distributions.

Proposition 47: $\forall T \in \mathcal S'(\mathbb R^N)$, and $\forall t > 0$, $G_t * T$ is a function (i.e., a regular distribution) and:

\[
T = \operatorname*{w-lim}_{t\to 0}\,(G_t * T)\,. \tag{5.32}
\]

Proof: omitted. Eqn. (5.32) may be made plausible by $\operatorname{w-lim} G_t = \delta$ (Problem 63) together with (5.28). □

Convolution with a heat kernel turns any distribution into a regular distribution, and this process is called regularization. Regularization can be performed using kernels other than the Heat ones, and is sometimes called local average, or moving average, or (in the heat kernel case) Gaussian smoothing. Not only does it turn singular distributions into regular distributions; if performed on regular distributions, it turns them into extremely regular functions. The following is the easiest result of this kind.

Proposition 48: If f is a summable function, then ∀t > 0 Gt ∗ f is a C∞ function.

Problem 84: Prove the above Proposition. (The FT of $f$ is continuous and bounded, the FT of $G_t$ is a test function; use Corollary 3.)
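Gaussian smoothing is easy to watch in one dimension on a discontinuous function. The sketch below assumes that the Heat kernel of Def. 29 reads $G_t(x) = (2\pi t)^{-1/2}e^{-x^2/2t}$ for $N = 1$ (this normalization is inferred from eqn. (5.47) and is an assumption here), and takes $f$ to be the characteristic function of $[-1, 1]$, an example chosen only for illustration. For this $f$ the convolution has a closed form in terms of the error function.

```python
import math

def heat_kernel(x, t):
    """1D heat kernel, assumed normalization G_t(x) = exp(-x^2/(2t)) / sqrt(2*pi*t)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def smoothed_box(x, t):
    """Closed form of (G_t * chi)(x), chi = characteristic function of [-1, 1]."""
    r = math.sqrt(2.0 * t)
    return 0.5 * (math.erf((x + 1.0) / r) - math.erf((x - 1.0) / r))

def smoothed_box_numeric(x, t, n=20000):
    """Same quantity from definition (5.25), by the midpoint rule on [-1, 1]."""
    h = 2.0 / n
    return h * sum(heat_kernel(x - (-1.0 + (j + 0.5) * h), t) for j in range(n))

print(smoothed_box(0.3, 0.1), smoothed_box_numeric(0.3, 0.1))
# As t -> 0 the smooth function approaches the box away from the jumps.
for t in (1.0, 1e-2, 1e-4):
    print(t, smoothed_box(0.0, t), smoothed_box(1.0, t), smoothed_box(2.0, t))
```

For every $t > 0$ the result is a $C^\infty$ function of $x$, while for $t \to 0$ it tends to $1$ inside the interval, to $0$ outside, and to $1/2$ at the jump points, which is consistent with (5.32) since single points do not matter in the weak sense.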

Hilbert transform.

The Fourier transform of the distribution $P\frac{1}{x}$ is given in (5.23). It is not a $C^\infty$ function, so the above sufficient condition for existence of convolutions is not satisfied and $P\frac{1}{x} * T$ cannot be defined for all distributions $T$. In spite of that, it can be defined at least in the case when $T$ is a square-integrable function (i.e., $T = T_f$, $f \in L^2(\mathbb R)$).

Theorem 56: Let $f(x)$ be an arbitrary square integrable function.
(1) $P\frac{1}{x} * T_f$ as defined in (5.27) exists, and is a regular distribution $T_g$, where $g(x)$ is again a square integrable function. One can write $g = P\frac{1}{x} * f$;
(2)

\[
g(x) = P\!\int_{-\infty}^{+\infty} dx'\,\frac{1}{x - x'}\,f(x')\,;
\]


(3) the operator $H$ that is defined in $L^2(\mathbb R)$ by:

\[
Hf = \frac{1}{\pi}\,P\frac{1}{x} * f
\]

is a unitary operator, which is known as the Hilbert transform.

Proof: (1) $\hat T_f = T_{\mathcal F f}$, where $\mathcal F$ is the Fourier-Plancherel transform (Thm. 46), and $\hat f := \mathcal F f$ is a square integrable function. Eqn. (5.23) shows that $\widehat{P\frac{1}{x}}$ is a bounded function, so the product of this function with $\hat f$ is still a square integrable function; hence $\hat T_f\,\widehat{P\frac{1}{x}}$ is a well defined regular distribution. Using eqn. (5.23), and including the factor $(2\pi)^{1/2}$ that appears in (5.27), it can be written as $\pi V\hat f$, where $V$ is the linear operator defined in $L^2(\mathbb R)$ by:

\[
(Vf)(k) = -i\,\mathrm{sign}(k)\,f(k)\,. \tag{5.33}
\]

The inverse Fourier transform of $(2\pi)^{1/2}\,\hat T_f\,\widehat{P\frac{1}{x}}$ is thus the regular distribution $T_g$, where $g$ is the inverse Fourier-Plancherel transform of $\pi V\hat f$, and so is a square-integrable function. We have thus found that the convolution $P\frac{1}{x} * T_f$ defined in (5.27), where $f(x)$ is an arbitrary square-integrable function, is the distribution $T_g$ where $g \in L^2(\mathbb R)$ is given by:

\[
g = \pi\,\mathcal F^{-1}(V\mathcal F f)\,. \tag{5.34}
\]

(2) The proof is omitted.

(3) Using (5.33), (5.34), the linear operator $H$ can be written as:

\[
H = \mathcal F^{-1}\,V\,\mathcal F\,, \tag{5.35}
\]

whence (3) follows, because both $\mathcal F$ and $V$ are unitary operators. □

Problem 85: (1) Show that $H^2 = -I$; (2) show that $iH$ is a self-adjoint operator.
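The structure $H = \mathcal F^{-1} V \mathcal F$ of (5.35) has a simple discrete, periodic analogue that can be run on a computer. The sketch below is only that analogue, not the operator on $L^2(\mathbb R)$ itself: it uses a naive discrete Fourier transform on $n$ samples of a periodic function, multiplies the positive-frequency coefficients by $-i$ and the negative-frequency ones by $+i$ as in (5.33), and sets the zero-frequency coefficient to zero. All names and the sample function are illustrative assumptions.

```python
import cmath
import math

def dft(values, sign):
    """Naive O(n^2) discrete Fourier transform (unnormalized).
    sign = -1 gives the forward transform, sign = +1 the inverse up to a factor 1/n."""
    n = len(values)
    return [sum(values[r] * cmath.exp(sign * 2j * math.pi * r * m / n) for r in range(n))
            for m in range(n)]

def discrete_hilbert(samples):
    """Discrete analogue of H = F^{-1} V F with (V f)(k) = -i sign(k) f(k)."""
    n = len(samples)
    spectrum = dft(samples, -1)
    for m in range(n):
        if m == 0 or 2 * m == n:
            spectrum[m] = 0j        # sign(0) = 0; the Nyquist index is dropped as well
        elif 2 * m < n:
            spectrum[m] *= -1j      # positive frequencies
        else:
            spectrum[m] *= 1j       # negative frequencies
    return [v / n for v in dft(spectrum, +1)]

n = 32
grid = [2.0 * math.pi * r / n for r in range(n)]
f = [math.cos(3.0 * x) for x in grid]
Hf = discrete_hilbert(f)
HHf = discrete_hilbert(Hf)
print(max(abs(Hf[r] - math.sin(3.0 * grid[r])) for r in range(n)))  # H cos = sin
print(max(abs(HHf[r] + f[r]) for r in range(n)))                    # H^2 = -I
```

On functions with zero mean the discrete multiplier squares to $-1$, which mirrors point (1) of Problem 85.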

5.5 Fundamental Solutions.

Every polynomial $P(x)$ in $\mathbb R^N$ can be written in the form:

\[
P(x) = \sum_{n=1}^{M} c_n\, x^{\mathbf p_n}
\]

where $c_n$, $M$, and $\mathbf p_n$ are fixed constants and multi-indexes. Given such a polynomial, we shall denote by $P(\partial)$ the linear differential operator that is defined by:

\[
P(\partial) = \sum_{n=1}^{M} c_n\, \partial^{\mathbf p_n}\,. \tag{5.36}
\]

A fundamental solution of the operator $P(\partial)$ is any tempered distribution $G$ which satisfies

\[
P(\partial)G = \delta\,. \tag{5.37}
\]

The importance of this notion lies in the following fact:


Proposition 49: Let $G$ be a fundamental solution for $P(\partial)$, and $W$ a tempered distribution. If the convolution $G * W$ exists, then it satisfies the non-homogeneous differential equation:

\[
P(\partial)(G * W) = W\,.
\]

Proof: immediate from (5.28) and (5.31): $P(\partial)(G * W) = (P(\partial)G) * W = \delta * W = W$. □

Therefore, fundamental solutions of $P(\partial)$ allow one to solve the equation $P(\partial)T = W$. Existence of fundamental solutions for differential operators of the form (5.36) is a theorem⁵. However, such solutions are not unique in general. This can be understood on noting that if $G$ is a solution of eqn. (5.37), then $G + U$ is also a solution, whenever $U$ is a solution of the homogeneous equation $P(\partial)U = 0$.
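Proposition 49 can be tried out on a one-dimensional example. The operator $P(\partial) = d^2/dx^2 - 1$ on $\mathbb R$, its fundamental solution $G(x) = -\tfrac12 e^{-|x|}$, and the Gaussian source $W(x) = e^{-x^2}$ are all illustrative choices made here; none of them appears in these notes. The sketch computes $u = G * W$ by quadrature and checks by finite differences that $u'' - u \approx W$.

```python
import math

def source(x):
    """Illustrative right-hand side W(x) = exp(-x^2)."""
    return math.exp(-x * x)

def solution(x, n=6000, cutoff=30.0):
    """u = G * W with G(x) = -exp(-|x|)/2, rewritten with y = |x - x'| as
    u(x) = -1/2 * integral_0^cutoff exp(-y) [W(x - y) + W(x + y)] dy  (midpoint rule)."""
    h = cutoff / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * h
        total += math.exp(-y) * (source(x - y) + source(x + y))
    return -0.5 * total * h

def residual(x, step=0.02):
    """Finite-difference estimate of u'' - u - W at x; it should be close to zero."""
    second = (solution(x + step) - 2.0 * solution(x) + solution(x - step)) / step ** 2
    return second - solution(x) - source(x)

for x in (-1.3, 0.0, 0.4):
    print(x, solution(x), residual(x))
```

Adding to $G$ any solution of $U'' - U = 0$, such as $e^{x}$, would give another fundamental solution; the choice above is the one that decays at infinity.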

5.5.1 The Poisson equation.

Consider the following equation:

\[
\triangle V = W
\]

where $W$ is a known distribution. A fundamental solution is provided by eqn. (5.12) in the form of the regular distribution $G(x) = -(4\pi)^{-1}\|x\|^{-1}$. Therefore, the Poisson equation has the solution:

\[
V(x) = -\frac{1}{4\pi\|x\|} * W
\]

(provided the convolution exists). In improper notation:

\[
V(x) = -\frac{1}{4\pi}\int_{\mathbb R^3} dx'\,\frac{1}{\|x - x'\|}\;\text{``}W(x')\text{''}\,. \tag{5.38}
\]

Apart from physical constants, this is the well known formula for the electrostatic potential generated by a static charge distribution "$W(x)$". Every function of the form $U(x) - (4\pi)^{-1}\|x\|^{-1}$, where $U(x)$ is an arbitrary harmonic function⁶, is still a fundamental solution. The particular choice $U(x) = 0$ that leads to the solution (5.38) is not forced by mathematical reasons, but by physical reasons instead, which impose boundary conditions on the solutions of the Poisson equation - namely, they should vanish at infinity. The same happens with other differential equations of mathematical physics (see below the example of the wave equation).

5.5.2 Fundamental solutions, and the Cauchy problem.

In the following sections two physically important differential operators in $\mathbb R^4$ are considered. The first three coordinates $x_1, x_2, x_3$ of a point in $\mathbb R^4$ will have the meaning of space coordinates, and the fourth one of time, so it will be denoted by $t$ (and not by $x_4$). Moreover, $x$ will denote the position vector: $x = (x_1, x_2, x_3) \in \mathbb R^3$. The operators will have the following form:

\[
L = \frac{\partial}{\partial t} - L_0\,, \tag{5.39}
\]

5 The Malgrange-Ehrenpreis theorem: it will not be proven here.
6 i.e., a function such that $\triangle U(x) = 0$ $\forall x \in \mathbb R^3$.


where $L_0$ is a linear differential operator in $\mathbb R^3$.

In loose terms, the Cauchy problem associated with operators in this class is: find a function $F(x, t)$ in $\mathbb R^3 \times [0, +\infty)$ which satisfies $LF = 0$ in $\mathbb R^3 \times (0, +\infty)$, and moreover satisfies $\lim_{t\to 0+} F(x, t) = f(x)$, where $f(x)$ is a known function ("initial condition"). This may be termed the forward Cauchy problem, the backward problem being that of finding $F(x, t)$ that solves $LF = 0$ in the past ($t < 0$) and satisfies $\lim_{t\to 0-} F(x, t) = f(x)$. The important issue of the conditions under which the Cauchy problems can be solved will not be entered here, as the goal of this section is to illustrate the role of fundamental solutions. So we simply assume that a unique solution $F_+(x, t)$ as described above exists for the forward Cauchy problem. As it is defined only at times $t \ge 0$, it will be defined also for $t < 0$ by setting $F_+(x, t) = 0$. The function thus defined over $\mathbb R^4$ solves the equation at all points with $t \ne 0$, but not at $t = 0$, because it is discontinuous there. Due to the discontinuity at $t = 0$, its distributional derivative with respect to time is (see subsec. 5.2.2):

\[
\partial_t F_+(x, t) = \text{``}\delta(t)\text{''}\,f(x) + \partial_t F_+(x, t)
\]

where the rightmost term denotes the ordinary time-derivative, which exists for all $t \ne 0$. It follows that $F_+(x, t)$ solves the following distributional equation:

\[
LF_+(x, t) = \text{``}\delta(t)\text{''}\,f(x)\,. \tag{5.40}
\]

This is a non-homogeneous equation of the form $LF_+ = W$ where $W = \text{``}\delta(t)\text{''}f(x)$, so its solution can be sought in the form $G * W$ using a fundamental solution $G$:

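\[
\begin{aligned}
F_+(x, t) &= \int dt' \int_{\mathbb R^3} dx'\;\text{``}G(x - x', t - t')\text{''}\;\text{``}\delta(t')\text{''}\; f(x') \\
&= \int_{\mathbb R^3} dx'\;\text{``}G(x - x', t)\text{''}\, f(x')\,.
\end{aligned}
\tag{5.41}
\]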

However, as noted in the previous section, neither the fundamental solution, nor the solution of $LF_+ = W$ is unique in general. In contrast, the solution $F_+(x, t)$ is unique, once the initial condition $f(x)$ is specified. Therefore, if (5.41) is to solve the Cauchy problem, then $G_+(x, t)$ cannot be an arbitrary fundamental solution, but has to be suitably selected somehow. To this end, note that $F_+(x, t)$ vanishes identically for $t < 0$, for any choice of $f(x)$, and so $G_+(x, t)$ must have the same property. Therefore, of all the fundamental solutions of the operator $L$, the one that has to be used to solve the forward Cauchy problem must satisfy $G_+(x, t) = 0$ for $t < 0$. It is called the retarded Green function, and will be denoted $G_{ret}(x, t)$.

If the backward Cauchy problem also has a solution $F_-(x, t)$ for $t \le 0$, then essentially identical considerations apply. Defining $F_-(x, t)$ to be $0$ for $t > 0$, it satisfies $LF_- = -W$ with $W = f(x)\,\text{``}\delta(t)\text{''}$ as before. Hence, it can be found via:

\[
F_-(x, t) = -\int_{\mathbb R^3} dx'\;\text{``}G_-(x - x', t)\text{''}\, f(x')\,, \tag{5.42}
\]

where $G_-(x, t)$ is now the fundamental solution of the operator $L$ that vanishes for $t > 0$; and then $G_{adv} = -G_-(x, t)$ can be called the advanced Green function.


5.5.3 The Diffusion, or Heat, equation.

We shall find a fundamental solution of the differential operator⁷

\[
L := \frac{\partial}{\partial t} - D\triangle\,,
\]

where the parameter $D > 0$ is called the diffusion coefficient. The Laplacian is with respect to the spatial coordinates $x_1, x_2, x_3$. The diffusion equation in $\mathbb R^3$ (also known as the Heat equation) is the differential equation $Lu = 0$. To find a fundamental solution $G$ for the operator $L$, let us Fourier-transform both sides of the equation $LG = \delta$. The FT of $G$ is a distribution in the space of variables $k, \omega$, where $k = (k_1, k_2, k_3) \in \mathbb R^3$ is the wave vector and $\omega$ is the frequency. Using the formulae in Thm. 54, we obtain:

\[
(i\omega + D\|k\|^2)\,\hat G = (4\pi^2)^{-1}\,. \tag{5.43}
\]

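This equation is solved by the regular distribution:

\[
\hat G(k, \omega) = \frac{1}{4\pi^2}\,\frac{1}{i\omega + D\|k\|^2}\,; \tag{5.44}
\]

As a matter of fact, this function is locally summable, with finite algebraic growth. The second statement is obvious, and the first will be proven in a Note at the end of this section. One has then to calculate the inverse Fourier transform of (5.44):

\[
G(x, t) = \frac{1}{4\pi^2}\int_{\mathbb R^3} dk \int d\omega\; e^{i\omega t}\, e^{i\langle k|x\rangle}\,\hat G(k, \omega)
= \frac{1}{16\pi^4}\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\int d\omega\;\frac{e^{i\omega t}}{D\|k\|^2 + i\omega}\,.
\]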

The inner integral is almost identical to that in Problem 61 and, like that one, is easily calculated using the lemma of Jordan. This leads to:

\[
G(x, t) = \frac{1}{8\pi^3}\, s(t)\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\, e^{-Dt\|k\|^2}\,.
\]

The latter integral is the inverse FT of a Gaussian, and using (4.14) with a simple change of variables one finally finds that:

\[
G(x, t) = s(t)\,\frac{1}{(4\pi Dt)^{3/2}}\; e^{-\frac{\|x\|^2}{4Dt}}\,. \tag{5.45}
\]
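A quick numerical check of (5.45): for $t > 0$ the function should satisfy $\partial_t G - D\triangle G = 0$ away from the singular point $(x, t) = (0, 0)$. The sketch below uses centred finite differences; the function names, the sample point and the value of $D$ are arbitrary illustrative choices.

```python
import math

def green(x, t, D):
    """Fundamental solution (5.45); x is a sequence of three coordinates."""
    if t <= 0.0:
        return 0.0  # unit step function s(t)
    r2 = sum(c * c for c in x)
    return math.exp(-r2 / (4.0 * D * t)) / (4.0 * math.pi * D * t) ** 1.5

def residual(x, t, D, h=1e-3):
    """Finite-difference estimate of dG/dt - D * Laplacian(G) at (x, t), t > h."""
    dt = (green(x, t + h, D) - green(x, t - h, D)) / (2.0 * h)
    lap = 0.0
    for i in range(3):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        lap += (green(up, t, D) - 2.0 * green(x, t, D) + green(dn, t, D)) / h ** 2
    return dt - D * lap

point = (0.3, -0.2, 0.5)
print(green(point, 0.7, 0.4), residual(point, 0.7, 0.4))
print(green(point, -0.7, 0.4))  # vanishes for t < 0
```

The residual is zero up to discretization error, while the value for negative time is exactly zero because of the step function.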

Thanks to the step function of time, this fundamental solution is the retarded Green function. Hence, substituting it in (5.41), one finds that for all times $t > 0$, the solution of the Cauchy problem for the diffusion equation, with initial condition $f(x)$, is given by:

\[
F(x, t) = G_t * f\,, \tag{5.46}
\]

7 In the notation of eqn. (5.36) this operator corresponds to $M = 4$, $c_4 = 1$, $c_1 = c_2 = c_3 = -D$, $\mathbf p_4 = (0, 0, 0, 1)$, $\mathbf p_1 = (2, 0, 0, 0)$, $\mathbf p_2 = (0, 2, 0, 0)$, $\mathbf p_3 = (0, 0, 2, 0)$.


where the convolution is in $\mathbb R^3$, and, moreover:

\[
G_t(x) = \frac{1}{(4\pi Dt)^{3/2}}\; e^{-\frac{\|x\|^2}{4Dt}}\,, \tag{5.47}
\]

which is the Heat kernel (cp. Def. 29) evaluated at the parameter value $2Dt$.

It is convenient to consider $t$ in $F(x, t)$ as a time parameter, and to think of $F(x, t)$ as a family of functions $f_t(x) = F(x, t)$ in $\mathbb R^3$, indexed by this parameter. Denoting $f_0 \equiv f$, from (5.46), (5.47), one can write that, for all $t > 0$:

\[
f_t(x) = (T_t f_0)(x)\,,
\]

where, for all $t$, $T_t$ is the integral operator associated with the kernel (5.47) (as described in subsect. 3.3.2). It follows that:

Proposition 50: 1. If $f_0$ is square-summable, so is $f_t$, $\forall t > 0$, and moreover $\|f_t\| \le \|f_0\|$.

2. If $f_0$ is summable, so is $f_t$, $\forall t > 0$; moreover $\int dx\, f_t(x) = \int dx\, f_0(x)$;

3. If $f_0$ is summable, then $f_t$ is a function of class $C^\infty(\mathbb R^3)$, $\forall t > 0$;

4. If $f_0(x) \ge 0$ for a.e. $x$, the same holds for $f_t(x)$, $\forall t \ge 0$;

5. For all $t, s > 0$, $f_{t+s} = T_s f_t$, or, equivalently, $T_{t+s} = T_t T_s$.

Proof: (1) because $T_t$ is a continuous operator in $L^2(\mathbb R^3)$ of unit norm, as shown in subsec. 3.3.2. (2) directly from (5.46), and Problem 82. (3) follows from (5.45) and Prop. 48. (4) directly from (5.46). Finally (5) is the semigroup property (Problem 80). □

Advanced Green functions for the diffusion equation would correspond to (5.45) with $t < 0$. However, the exponential would then diverge at $\infty$ too fast to be a tempered distribution. The diffusion equation cannot be solved backwards in time within the setting of tempered distributions.

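Note: it is sufficient to show that $|\hat G|$ is summable on domains $B$ in $\mathbb R^4$ where $\omega$ is in an interval $[-a, +a]$ and $k$ is in a 3-dimensional ball $B_R(0)$. Then:

\[
\int_B dk\, d\omega\, |\hat G(k, \omega)| \le \frac{1}{4\pi^2}\int_{B_R(0)} dk \int_{-a}^{+a} d\omega\,\frac{1}{\sqrt{\omega^2 + D^2\|k\|^4}} \tag{5.48}
\]

where $dk$ is the volume element in $\mathbb R^3$. Using spherical coordinates, and denoting $\chi = \|k\|$,

\[
\int_{B_R(0)} dk \int_{-a}^{+a} d\omega\, |\hat G(k, \omega)| \le \frac{2a}{\pi}\int_0^R d\chi\,\frac{\chi^2}{\sqrt{D^2\chi^4}} \le \frac{2a}{\pi}\int_0^R d\chi\; D^{-1} < +\infty\,. \tag{5.49}
\]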


5.5.4 The Schrödinger equation for a free particle.

Dispensing with physical constants, this equation can be written in the form (5.39), with the operator $L_0$ defined in $\mathbb R^3$ by:

\[
L_0 = i\triangle\,.
\]

Fundamental solutions $G$ are solutions of $LG = \delta$, so, after Fourier-transforming:

\[
(\|k\|^2 + \omega)\,\hat G = -i(2\pi)^{-2}\,. \tag{5.50}
\]

Now $\hat G$ cannot be found by just dividing both sides by $(\|k\|^2 + \omega)$, because this term vanishes for $\omega = -\|k\|^2$ and this would produce a non-integrable singularity. In spite of that, eqn. (5.50) has a whole family of solutions⁸. Among these we choose:

\[
\hat G_+ = -\frac{i}{4\pi^2}\,\frac{1}{\omega + \|k\|^2 - i0} := -\frac{i}{4\pi^2}\operatorname*{w-lim}_{\epsilon\to 0+}\frac{1}{\omega + \|k\|^2 - i\epsilon}\,. \tag{5.51}
\]

The explicit calculation below will show that this is exactly the retarded Green function. Let us directly find the inverse transform of (5.51):

\[
G_+(x, t) = \lim_{\epsilon\to 0+}\frac{-i}{16\pi^4}\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\int d\omega\; e^{i\omega t}\,\frac{1}{\omega + \|k\|^2 - i\epsilon}\,. \tag{5.52}
\]

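Changing variables in the inner integral, a calculation we have already seen many times shows that:

\[
\int d\omega\; e^{i\omega t}\,\frac{1}{\omega + \|k\|^2 - i\epsilon}
= e^{-i\|k\|^2 t}\int d\omega'\; e^{i\omega' t}\,\frac{1}{\omega' - i\epsilon}
= 2\pi i\; s(t)\; e^{-i\|k\|^2 t}\, e^{-\epsilon t}\,,
\]

and substituting this in (5.52):

\[
G_+(x, t) = \frac{1}{8\pi^3}\, s(t)\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\, e^{-i\|k\|^2 t}\,. \tag{5.53}
\]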

Recall that the integrand, written in extended form, is the function:

\[
e^{ik_1x_1}e^{-ik_1^2 t}\; e^{ik_2x_2}e^{-ik_2^2 t}\; e^{ik_3x_3}e^{-ik_3^2 t}\,;
\]

so that (5.53) factorizes in a product $I_1I_2I_3$ of three integrals:

\[
\begin{aligned}
I_j &= \int_{-\infty}^{+\infty} dk_j\; e^{ik_jx_j}\, e^{-ik_j^2 t} = e^{ix_j^2/4t}\int_{-\infty}^{+\infty} dk_j\; e^{-it(k_j - x_j/2t)^2} \\
&= e^{ix_j^2/4t}\;\pi^{1/2}\, t^{-1/2}\, e^{-i\pi/4}\,, \qquad j = 1, 2, 3\,.
\end{aligned}
\tag{5.54}
\]

The last integral was reduced by an obvious change of variables to the Fresnel integral (MM1, eq. 5.4), taking into account that in (5.53) $t$ is always positive. Giving $j$ the values $1, 2, 3$, multiplying the results, and replacing in (5.53), we finally find:

\[
G_+(x, t) = \frac{1}{(4\pi t)^{3/2}}\; s(t)\; e^{-3\pi i/4}\; e^{i\|x\|^2/4t}\,. \tag{5.55}
\]

8 See Problem 75


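A fully similar calculation shows that the function:

\[
\hat G_- = -\frac{i}{4\pi^2}\,\frac{1}{\omega + \|k\|^2 + i0} := -\frac{i}{4\pi^2}\operatorname*{w-lim}_{\epsilon\to 0+}\frac{1}{\omega + \|k\|^2 + i\epsilon}\,,
\]

is the advanced Green function. One finds:

\[
G_-(x, t) = -\frac{1}{(-4\pi t)^{3/2}}\; s(-t)\; e^{3\pi i/4}\; e^{i\|x\|^2/4t}\,. \tag{5.56}
\]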

Recalling (5.41) and (5.42) we conclude that the solution $\psi(x, t)$ of the Schrödinger equation for one free particle satisfies, $\forall t \ne 0$:

\[
\psi(x, t) = \int_{\mathbb R^3} dx'\; G(x - x', t)\,\psi(x', 0)\,,
\]

where

\[
G(x, t) = (4\pi|t|)^{-3/2}\; e^{i\|x\|^2/4t}\; e^{-i3\,\mathrm{sign}(t)\pi/4}\,. \tag{5.57}
\]

The function $G(x, t)$ is the free propagator of the Schrödinger equation. Apart from physical constants, it is usually written in the form:

\[
G(x, t) = (4\pi i t)^{-3/2}\; e^{i\|x\|^2/4t}\,, \tag{5.58}
\]

which is equivalent to the one written above, provided that the correct branch of the multi-valued function $(\cdot)^{-3/2}$ is used. As is seen in (5.57), this choice entails a discontinuous jump of the phase across the singularity at $t = 0$. This is clarified by a simple link between the free propagator and the fundamental solution of the diffusion equation: notably, the former is the analytic continuation of the latter to "imaginary time". To see this, denote by $G_d(x, t)$ the Green function of the diffusion equation (with $D = 1$), as it is given by (5.45) for real $t > 0$. If $t$ is considered as a complex variable, then $G_d(x, t)$ can be analytically continued in $\mathbb C$, except for a branch cut along the negative real axis ($0$ included); see Fig. 5.2 ($x$ is now a parameter; the complex variable is $t$). Analytic continuation demands $\arg(t) = \pi/2$ on the upper imaginary half-line (i.e. for $t = i\tau$ with $\tau > 0$), and $\arg(t) = -\pi/2$ for $\tau < 0$, so we immediately find that:

G(x, t) := Gd(x, it) ,

where G is given by (5.57).
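The relation between (5.57) and the continued diffusion kernel can be verified with complex arithmetic. In the sketch below, `cmath.log` supplies the principal branch of the logarithm, whose cut lies along the negative real axis, as required above; the function names and sample values are illustrative assumptions, and $D = 1$ is used since physical constants have been dropped.

```python
import cmath
import math

def diffusion_green(r2, t, D=1.0):
    """Eqn. (5.45) without the step function, continued to complex t with the
    principal branch of t^(-3/2) (branch cut along the negative real axis).
    r2 stands for the squared norm ||x||^2."""
    z = 4.0 * math.pi * D * t
    return cmath.exp(-1.5 * cmath.log(z)) * cmath.exp(-r2 / (4.0 * D * t))

def free_propagator(r2, t):
    """Eqn. (5.57), for real t != 0."""
    sign = 1.0 if t > 0 else -1.0
    return ((4.0 * math.pi * abs(t)) ** -1.5
            * cmath.exp(1j * r2 / (4.0 * t))
            * cmath.exp(-1j * 3.0 * sign * math.pi / 4.0))

for t in (0.5, -0.5, 2.0, -3.0):
    continued = diffusion_green(1.7, complex(0.0, t))  # G_d evaluated at "imaginary time" i*t
    print(t, abs(continued - free_propagator(1.7, t)))
```

The phase jump of (5.57) between $t > 0$ and $t < 0$ is produced automatically by the two values $\pm\pi/2$ of $\arg(it)$.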

5.5.5 The Wave equation.

The operator which figures in the wave equation is the ”D’Alembertian” operator:

\[
L = \Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \triangle\,.
\]

Fig. 5.2: Complex time for the Green function of the diffusion equation. (Labels in the figure: $0$, $t$, $it$, $-it$.)

Unlike the equations we have considered earlier, the wave equation is of second order with respect to the time variable $t$, so the Cauchy problem has to be formulated in different terms, which will not be discussed here. In any case, fundamental solutions can be computed by a quite similar method. Their Fourier transforms satisfy:

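\[
(\omega^2 - c^2\|k\|^2)\,\hat G = -(2\pi)^{-2}c^2\,. \tag{5.59}
\]

As in the case of the Schrödinger equation, one cannot directly divide by $(\omega^2 - c^2\|k\|^2)$. Among the solutions of (5.59), let us choose the one that is defined by:

\[
\hat G_+ = -\operatorname*{w-lim}_{\epsilon\to 0+}\frac{c^2}{4\pi^2}\,\frac{1}{(\omega - c\|k\| - i\epsilon)(\omega + c\|k\| - i\epsilon)}\,, \tag{5.60}
\]

and then let us calculate the inverse Fourier transform:

\[
G_+(x, t) = -\frac{c^2}{16\pi^4}\lim_{\epsilon\to 0+}\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\int d\omega\; e^{i\omega t}\,\frac{1}{(\omega - c\|k\| - i\epsilon)(\omega + c\|k\| - i\epsilon)}\,.
\]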

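The inner integral is easily computed by using Jordan's lemma:

\[
\begin{aligned}
G_+(x, t) &= -s(t)\,\frac{ic^2}{8\pi^3}\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\left(\frac{1}{2c\|k\|}\,e^{itc\|k\|} - \frac{1}{2c\|k\|}\,e^{-itc\|k\|}\right) \\
&= s(t)\,\frac{c^2}{8\pi^3}\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\,\frac{\sin(tc\|k\|)}{c\|k\|}\,.
\end{aligned}
\tag{5.61}
\]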

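The $k$-integral is solved thanks to eqn. (5.24) with $R = ct$:

\[
\int_{\mathbb R^3} dk\; e^{i\langle k|x\rangle}\,\frac{\sin(tc\|k\|)}{c\|k\|}
= (2\pi)^{3/2}\,\mathcal F^{-1}\!\left(\frac{\sin(tc\|k\|)}{c\|k\|}\right)
= (2\pi)^{3/2}\,(\pi/2)^{1/2}\,\frac{1}{c^2 t}\,\big(\hat\delta_{\Sigma_{ct}(0)}\big)^{\vee}\,, \tag{5.62}
\]

where $\Sigma_{ct}(0)$ is the surface of the sphere of radius $R = ct$ and center at $0$. This leads to:

\[
G_+(x, t) = \frac{1}{4\pi t}\; s(t)\;\delta_{\Sigma_{ct}(0)}\,, \tag{5.63}
\]

which can be rewritten as:

\[
G_+(x, t) = \frac{c}{4\pi\|x\|}\; s(t)\;\delta_{\Sigma_{ct}(0)}\,, \tag{5.64}
\]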


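because $ct = \|x\|$ whenever $x \in \Sigma_{ct}$.

Let us use the Green function just found to solve the non-homogeneous equation:

\[
\Box V = \phi\,. \tag{5.65}
\]

Using improper notation:

\[
\begin{aligned}
V(x, t) = (G_+ * \phi)(x, t) &= \int_{\mathbb R^3} dx' \int dt'\;\phi(x', t')\;\text{``}G_+(x - x', t - t')\text{''} \\
&= \frac{c}{4\pi}\int_{-\infty}^{t} dt' \int_{\Sigma_{c(t - t')}(x)} dS_{x'}\;\frac{1}{\|x - x'\|}\,\phi(x', t')\,,
\end{aligned}
\tag{5.66}
\]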

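and changing variables from $t'$ to $r = c(t - t')$:

\[
\begin{aligned}
V(x, t) &= \frac{1}{4\pi}\int_0^{+\infty} dr \int_{\Sigma_r(x)} dS_{x'}\;\frac{1}{\|x - x'\|}\,\phi(x', t - r/c) \\
&= \frac{1}{4\pi}\int_{\mathbb R^3} dx'\;\frac{1}{\|x - x'\|}\,\phi(x', t - \|x - x'\|/c)\,.
\end{aligned}
\tag{5.67}
\]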

This is known as the retarded potentials formula, because it entails that the "potential" $V$ that is generated at a point $x$ at a time $t$ by a charge (or current) distribution $\phi(x, t)$ only depends on the latter distribution at earlier times $t' < t$. Though physically intuitive, this result is arbitrary on purely mathematical grounds, as it depends on having selected $\hat G_+$ out of all possible solutions of (5.59). If in eqn. (5.60) $0+$ is changed into $0-$, the advanced Green function is obtained and (5.67) is replaced by the advanced potentials formula. Yet other choices are possible.⁹
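For a point source, the retarded potentials formula reduces to an outgoing spherical wave, which can be checked directly. The sketch below assumes the improper source $\phi(x, t) = \text{``}\delta(x)\text{''}\,f(t)$ with a Gaussian pulse $f$, so that (5.67) gives $V = f(t - r/c)/(4\pi r)$ with $r = \|x\|$; the pulse shape, the value of $c$ and the function names are illustrative choices. Away from the origin $V$ should satisfy $\Box V = 0$, which for a radial function reads $c^{-2}V_{tt} - (V_{rr} + 2V_r/r) = 0$.

```python
import math

C = 2.0  # wave speed; an arbitrary illustrative value

def pulse(t):
    """Assumed time profile f(t) of the point source."""
    return math.exp(-t * t)

def potential(r, t):
    """Eqn. (5.67) for the improper point source phi(x, t) = delta(x) * pulse(t):
    the source is sampled at the retarded time t - r/C."""
    return pulse(t - r / C) / (4.0 * math.pi * r)

def dalembertian(r, t, h=1e-3):
    """Finite-difference estimate of (1/C^2) V_tt - (V_rr + 2 V_r / r) at r > 0."""
    v_tt = (potential(r, t + h) - 2.0 * potential(r, t) + potential(r, t - h)) / h ** 2
    v_rr = (potential(r + h, t) - 2.0 * potential(r, t) + potential(r - h, t)) / h ** 2
    v_r = (potential(r + h, t) - potential(r - h, t)) / (2.0 * h)
    return v_tt / C ** 2 - (v_rr + 2.0 * v_r / r)

print(potential(1.5, 0.9), dalembertian(1.5, 0.9))
```

Replacing `t - r / C` by `t + r / C` gives the advanced counterpart, which solves the same equation away from the origin; this is the numerical face of the non-uniqueness discussed above.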

9 It should be stressed that this is so because eq. (5.65) has no unique solution in the absence of boundary conditions. Such conditions are decided by physics and not by mathematics, and in the case of electrodynamics they are not implicit in the Maxwell-Lorentz equations, which are fully symmetric with respect to the direction of time.