the atiyah–singer index theorem


THE ATIYAH–SINGER INDEX THEOREM
AJW, LENT 2010

Introduction and overview.

Lecture notes.

1. Compact operators.
2. Fredholm operators.
3. An index theorem for Toeplitz operators.
4. Sobolev spaces and eigenfunction expansions for elliptic operators on $T^n$.
5. Hilbert–Schmidt operators.
6. Trace–class operators.
7. Nash’s embedding theorem for the N–torus.
8. Tensor, symmetric and exterior algebras.
9. The double commutant theorem.
10. Fermions and Clifford algebras.
11. Quantisation: the spin group and its Lie algebra.
12. Bosons: the harmonic oscillator and Mehler’s formula.
13. Manifolds, tangent vectors and metrics.
14. Geodesics and normal coordinates.
15. Hermitian vector bundles and projections.
16. Cohomology, connections and curvature.
17. Clifford bundles and Dirac operators.
18. Sobolev spaces on a Clifford bundle.
19. The Hodge theorem.
20. Fredholm properties and the McKean–Singer index formula.
21. The spectrum of the Dirac Laplacian.
22. Global Sobolev construction of heat kernel.
23. Local Hadamard construction of heat kernel.
24. Lichnerowicz’s formula for the square of the Dirac operator.
25. Supersymmetric proof of the Atiyah–Singer index theorem.
Appendix: Solutions of ordinary differential equations.

BOOKS.

1. *J. Roe, “Elliptic operators, asymptotic methods and topology”.
2. *N. Berline, E. Getzler & M. Vergne, “Heat kernels and Dirac operators”.
3. B. Lawson & M.-L. Michelsohn, “Spin geometry”.
4. P. Gilkey, “Invariant theory, the heat equation and the Atiyah–Singer index theorem”.
5. L. Hörmander, “Analysis of linear partial differential operators”, Vol. III, Chapter 19.
6. R. Melrose, “The APS theorem”.
7. M. Taylor, PDEs, Vol. 2, Chapter 8.


INTRODUCTION AND OVERVIEW.

We start by recalling Weyl’s asymptotic formula. Let $D$ be a bounded domain in $\mathbb{R}^n$ with smooth boundary $\partial D$. The Dirichlet eigenvalue problem consists in finding solutions of $\Delta f = \lambda f$ in $D$ with $f = 0$ on $\partial D$. The eigenvalues $\lambda_1 \le \lambda_2 \le \cdots$ satisfy Weyl’s formula

\[ \lambda_k^{n/2} \sim \frac{(2\pi)^n k}{\mathrm{vol}(D)\cdot \mathrm{vol}(B^n)}, \]

where $B^n$ is the unit ball in $\mathbb{R}^n$ and $\Delta = -\sum \partial^2/\partial x_i^2$ is the Laplacian on $\mathbb{R}^n$. The same result holds for the Neumann eigenvalue problem (where the boundary condition is that the normal derivative of $f$ vanishes) or when $D$ is replaced by a compact Riemannian manifold and $\Delta$ by the Laplacian. The left hand side above is an analytic expression depending on the spectrum of $\Delta$ while the right hand side is a geometric or topological expression. This course explores the links between analysis and geometry in the same spirit. It is an example of what Atiyah calls “elliptic topology”, where elliptic differential operators on a manifold are used to study its geometry and topology.
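Weyl’s formula can be checked numerically in a case where the spectrum is known explicitly. The sketch below assumes the square $D = [0,\pi]^2$ (a choice made here for illustration, not taken from the notes), whose Dirichlet eigenfunctions are $\sin(mx)\sin(ny)$ with eigenvalues $m^2 + n^2$, $m, n \ge 1$. The function names are hypothetical. It compares the number of eigenvalues up to $\lambda$ with the leading term $(2\pi)^{-2}\,\mathrm{vol}(B^2)\,\mathrm{vol}(D)\,\lambda$.

```python
import math

def dirichlet_count(lam: float) -> int:
    """Number of Dirichlet eigenvalues m^2 + n^2 <= lam on [0, pi]^2 (m, n >= 1)."""
    count = 0
    m = 1
    while m * m + 1 <= lam:
        # largest n >= 1 with n^2 <= lam - m^2
        count += math.isqrt(int(lam - m * m))
        m += 1
    return count

def weyl_prediction(lam: float) -> float:
    """Leading Weyl term (2 pi)^(-n) vol(B^n) vol(D) lam^(n/2) for n = 2."""
    vol_ball = math.pi          # area of the unit disc
    vol_domain = math.pi ** 2   # area of the square [0, pi]^2
    return (2 * math.pi) ** -2 * vol_ball * vol_domain * lam

for lam in (100.0, 1_000.0, 10_000.0):
    print(lam, dirichlet_count(lam) / weyl_prediction(lam))
```

The ratio stays below 1 and approaches it slowly as $\lambda$ grows; the gap is a boundary effect that the leading term does not capture.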

An operator $T : H_1 \to H_2$ between Hilbert spaces is said to be Fredholm if $\ker(T)$ is finite–dimensional and the image $\mathrm{im}(T)$ has finite codimension. The index $\mathrm{ind}(T)$ is defined to be the difference of these quantities. If $H_1$ and $H_2$ are finite–dimensional, the index is always $\dim(H_1) - \dim(H_2)$, so zero if they have the same dimension. In infinite dimensions all sorts of strange and interesting things can happen. The index is well behaved, however, and for example satisfies $\mathrm{ind}(ST) = \mathrm{ind}(S) + \mathrm{ind}(T)$. Every elliptic partial differential operator between vector bundles over a compact manifold gives rise to a Fredholm operator; the abstract properties of the index imply that the index of this operator must be determined by purely topological invariants determined by the principal symbol of the operator and the underlying vector bundles. The index gives a measure of the dimensions of solution spaces of the operator and its adjoint. The easiest example is that of Toeplitz operators. Let $P$ be the projection onto Hardy space $H^2(S^1) \subset L^2(S^1)$ and define the Toeplitz operator $T(f) = Pm(f)P$ on $H^2(S^1)$, where $m(f)$ is multiplication by $f \in C(S^1)$; then, if $f$ is nowhere vanishing, $T(f)$ is Fredholm with index minus the winding number of $f$. Motivated by Grothendieck’s Riemann–Roch theorem in algebraic geometry, Atiyah and Singer solved the problem of determining the index for any elliptic differential operator. We shall explain how to solve the problem for a particular class (Dirac operators with coefficients in a vector bundle) and then explain how this solves the general problem.
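The finite–dimensional statement is just rank–nullity, and a short computation makes it concrete. The sketch below is illustrative only (the helper names `rank` and `index` are hypothetical): it computes $\dim\ker - \mathrm{codim}\,\mathrm{im}$ for a matrix viewed as a map $\mathbb{R}^n \to \mathbb{R}^m$ and shows that the answer is $n - m$ whatever the rank.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a rectangular matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def index(matrix):
    """dim ker - codim im for a matrix with m rows and n columns (a map R^n -> R^m)."""
    m, n = len(matrix), len(matrix[0])
    r = rank(matrix)
    return (n - r) - (m - r)

print(index([[1, 2, 3], [2, 4, 6]]))    # rank 1 map R^3 -> R^2
print(index([[1, 0, 0], [0, 1, 0]]))    # rank 2 map R^3 -> R^2
```

Both calls give the same index even though the kernels and cokernels differ, which is the finite–dimensional shadow of the stability of the index.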

Vector bundles, connections and curvature. A vector bundle $V$ over a manifold $M$ is the smooth assignment of a vector space to each point $x \in M$. Usually $V$ will be a hermitian vector bundle, so that there is a smoothly varying inner product on each fibre. Vector bundles can always be realised as subbundles of trivial bundles $M \times \mathbb{R}^N$ and thus by a smooth map from $M$ into the projections of a certain rank $m$ in $\mathbb{R}^N$. This gives a projection $p$ in $C^\infty(M, M_N)$. The sections of the bundle, i.e. smooth choices of vector from each fibre, are then given by $C^\infty(V) = pC^\infty(M, \mathbb{R}^N)$. They form a finitely generated module over $C^\infty(M)$. This gives a correspondence between vector bundles, projections and projective modules; yet another way of specifying a vector bundle is through local transition functions $f_{ij} : U_i \cap U_j \to GL(m)$ where $m$ is the dimension of the fibre. Any vector field $X = \sum a_i\,\partial/\partial x_i$ on $M$ acts by differentiation on $C^\infty(M)$ and hence on $C^\infty(M, \mathbb{R}^N)$. It is a derivation, i.e. satisfies the Leibniz rule $X(fg) = (Xf)g + f(Xg)$ for $f, g \in C^\infty(M)$. The covariant derivative or connection is $\nabla_X = pXp$, so that $\nabla_X\xi = p(X\xi) \in C^\infty(V)$ and $\nabla_X$ satisfies the Leibniz rule $\nabla_X(f\xi) = (Xf)\xi + f(\nabla_X\xi)$ for $f \in C^\infty(M)$, $\xi \in C^\infty(V)$. If $V$ is hermitian, $p$ will be an orthogonal projection and $\nabla_X$ will satisfy the compatibility condition $X(\xi, \eta) = (\nabla_X\xi, \eta) + (\xi, \nabla_X\eta)$ with respect to the inner product $(\xi, \eta) \in C^\infty(M)$. The curvature is the matrix–valued two form $p\,dp\,dp$, which is the same as $[\nabla_X, \nabla_Y] - \nabla_{[X,Y]}$ when the two form is paired with vector fields $X$ and $Y$. Note that $p$ gives a map of $M$ into the Grassmannian $G_{m,N}$ of $m$–dimensional subspaces of $\mathbb{R}^N$, so that the curvature is the pullback of the canonical 2–form $p\,dp\,dp$ on $G_{m,N}$.

Riemannian geometry. Let $M$ be a Riemannian manifold, so that there is an inner product $g(X, Y)$ on each tangent space, or equivalently the tangent bundle is hermitian. Nash’s embedding theorem states that $M$ can be embedded in a Euclidean space $\mathbb{R}^N$ (for large $N$) so that the metric on $M$ is the induced metric. This means that the length of a tangent vector to $M$ is just its perceived length in $\mathbb{R}^N$. This embedding displays the tangent bundle $TM$ as a subbundle of the trivial bundle $M \times \mathbb{R}^N$. The corresponding orthogonal projection determines the Levi–Civita or Riemannian connection $\nabla_X$. This acts on tangent vectors (i.e. vector fields) and is uniquely characterised by the properties $X(\xi, \eta) = (\nabla_X\xi, \eta) + (\xi, \nabla_X\eta)$ and $\nabla_XY - \nabla_YX = [X, Y]$. The Euler differential equation for geodesics on $M$ (shortest paths) involves the connection $\nabla_X$. Around any point on $M$ we can take outwards pointing geodesics parametrised by length. These sweep out normal coordinates around the point and in this way we get the exponential map at the point. Gauss showed that geodesics from the point meet the concentric spheres orthogonally.

Dirac operators and Clifford bundles. If $M$ is a Riemannian manifold, the space of $k$–forms $\Omega^kM$ is naturally a hermitian vector bundle with a natural extension of the Riemannian connection $\nabla_X$. The Clifford algebra $\mathrm{Cliff}(M)$ is defined as $\Omega(M)$, the space of all forms, with Clifford multiplication by 1–forms given by $c(\omega_1)\omega = (e(\omega_1) + e(\omega_1)^*)\omega$, where $\omega_1$ is a 1–form and $e(\omega_1)\omega = \omega_1 \wedge \omega$. Note that $c(\omega)c(\omega') + c(\omega')c(\omega) = 2(\omega, \omega')$: these are the Clifford algebra relations. The Riemannian connection $\nabla_X$ satisfies the compatibility condition $[\nabla_X, c(\omega_1)] = c(\nabla_X\omega_1)$ with Clifford multiplication. A Clifford bundle is by definition a hermitian vector bundle with a connection $\nabla_X$ and compatible action $c(\omega_1)$ of $\mathrm{Cliff}(M)$. The Dirac operator of a Clifford bundle is defined as $D = \sum c(\omega_i)\nabla_{X_i}$ where $(\omega_i, X_j) = \delta_{ij}$; it is independent of the choice of dual bases of vectors and covectors, and defines an elliptic partial differential operator. The Sobolev spaces $H^k(V)$ are the Hilbert space completions of $C^\infty(V)$ for the norms $\|\xi\|_{(k)} = ((I + D_V^2)^k\xi, \xi)^{1/2}$.

We shall be interested mainly in the case when $M$ is even–dimensional and a spin manifold. This means that there is a Clifford bundle $S = S^+ \oplus S^-$ such that $S^+ \otimes S^- = \Omega M$. The Dirac operator $D$ on $S$ is self–adjoint and carries $S^\pm$ into $S^\mp$. When $M$ is spin, all Clifford bundles have the form $S \otimes V$ where $V$ is a hermitian vector bundle. The corresponding Dirac operator $D_V$ is called the Dirac operator with coefficients in $V$.

Atiyah–Singer Index Theorem.

\[ \mathrm{ind}\,D_V = \int_M \hat{A}(M) \wedge \mathrm{ch}(V), \]

where $\hat{A}(M) = (2\pi)^{-\dim M/2} \det\!\left(\frac{R/2}{\sinh R/2}\right)^{1/2}$ (“A-hat genus”) and $\mathrm{ch}(V) = \mathrm{tr}\,\exp(F)$ (“Chern character”), with $R$ the Riemannian curvature and $F$ the curvature of $V$.

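Sketch of proof. Splitting $S \otimes V$ as $S^+ \otimes V \oplus S^- \otimes V$, we get

\[ D_V = \begin{pmatrix} 0 & \partial^* \\ \partial & 0 \end{pmatrix}, \]

where $\partial : S^+ \otimes V \to S^- \otimes V$. We then have

\[ \mathrm{ind}\,D_V = \mathrm{Tr}(e^{-\partial^*\partial t}) - \mathrm{Tr}(e^{-\partial\partial^* t}) \equiv \mathrm{Tr}_s(e^{-D^2t}) \quad \text{(“supertrace”)}, \]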

where the first equality follows because the non–zero eigenspaces of $\partial^*\partial$ and $\partial\partial^*$ are isomorphic, so that there is only a contribution from the 0 eigenspaces. The heat operator $e^{-tD^2}$ is constructed by Sobolev theory as a Clifford algebra valued kernel $K_t(x, y)$; Mercer’s theorem implies that

\[ \mathrm{Tr}_s\,e^{-tD^2} = \int_M \mathrm{tr}_s K_t(x, x)\,dx. \]

But $K_t$ has a purely local asymptotic expansion given by

\[ K_t(x, y) = (4\pi t)^{-n/2} e^{-d(x,y)^2/4t}(F_0 + tF_1 + \cdots) \]

where $F_0, F_1, F_2, \ldots$ can be computed locally in terms of the metric and curvature. The problem is to determine $\mathrm{tr}_s K_t(x, x)$, which a priori involves $F_0, \ldots, F_n$ and therefore seems quite complex.

This was solved by Getzler by introducing a scaling, inspired by Witten’s ideas on supersymmetry. This capitalises on the fact that the index is independent of the free parameter $t$. Take $\varepsilon > 0$, morally Planck’s constant. Recall that $K_t$ takes values in $\Lambda\mathbb{R}^n$ (regarded as a Clifford algebra). We fix $x$ and write $y = \exp_x(X)$, with $X$ a tangent vector at $x$; thus we write $K_t(X)$ in place of $K_t(x, y)$. Taking the scaling $s_\varepsilon$ on $k$–forms as multiplication by $\varepsilon$, we define the Getzler scaling by

\[ K^\varepsilon_t(X) = s_\varepsilon K_{\varepsilon^2t}(\varepsilon X). \]

The limit $K^0_t(X)$ exists and satisfies $\partial_tK^0 = -\Delta K^0$, with

\[ \Delta = -\sum_i \Bigl(\frac{\partial}{\partial x_i} - \frac{1}{4}\sum_j R_{ij}x_j\Bigr)^2 - F. \]

Thus $K^0$ is the kernel of the heat operator $e^{-\Delta t}$. The computation is completed by observing that $F$ commutes with the first operator; and the first operator is essentially a matrix version of the harmonic oscillator $\Delta_0 = -d^2/dx^2 + a^2x^2$. The kernel of $e^{-\Delta_0t}$ is given by Mehler’s formula

\[ (4\pi t)^{-1/2}\Bigl(\frac{2at}{\sinh 2at}\Bigr)^{1/2} \exp\Bigl(-\frac{1}{4t}\,2at\bigl[\coth(2at)(x^2 + y^2) - 2\,\mathrm{cosech}(2at)\,xy\bigr]\Bigr). \]

Hence

\[ K^0_t(X) = (4\pi t)^{-n/2} \det\Bigl(\frac{tR/2}{\sinh tR/2}\Bigr)^{1/2} \exp\Bigl(-\bigl(x \,\big|\, \tfrac{tR}{2}\coth\tfrac{tR}{2} \,\big|\, x\bigr)/4t\Bigr) \exp tF. \]

The result follows on setting t = 1 and x = 0.

Comments.

1. The general index problem. Using vector bundles $V$ over $M$ or projections $p \in M_n(C^\infty(M))$, Atiyah introduced the Grothendieck group $K^0(M) = K_0(C^\infty(M))$; it is a cohomology theory. Atiyah also introduced the group $\mathrm{Ell}(M)$ of abstract elliptic operators. This group is formed from Fredholm operators $T : H_+ \to H_-$ where $H_\pm$ are $C^\infty(M)$–modules with $[T, f]$ compact. Any elliptic partial differential operator $D$ of order $m$ gives such a $T$ by taking appropriate Sobolev spaces for $H_\pm$ (equivalently by taking $T = (I + \Delta)^{-m/2}D$, a zeroth order pseudodifferential operator). There is a natural bilinear pairing $\mathrm{Ell}(M) \times K^0(M) \to \mathbb{Z}$ given by $(T, p) \mapsto \mathrm{ind}(pTp) = \mathrm{ind}\,D_V$. If $M$ is a spin manifold and $D$ is the Dirac operator, $V \mapsto D_V$ gives an isomorphism $K^0(M) \to \mathrm{Ell}(M)$. Moreover the Chern character gives an isomorphism $K^0(M) \otimes_{\mathbb{Z}} \mathbb{R} \to H^{ev}_{dR}(M)$, $[V] \mapsto \mathrm{Ch}(V)$. The index problem is to compute the index map $\mathrm{Ell}(M) \to \mathbb{Z}$. Since the twisted Dirac operators $D_V$ generate $\mathrm{Ell}(M)$, it is in principle solved once it has been solved for these operators. The isomorphisms imply that each elliptic operator gives rise to an even cohomology class which is inserted into the integral to compute the index. This cohomology class can in fact be computed directly from the principal symbol of the operator.

Slightly more concretely, if $S^*M$ denotes the cosphere bundle (of unit cotangent vectors), any elliptic partial differential operator between vector bundles on $M$ defines a principal symbol. If the vector bundles correspond to projections $p$ and $q$ in $M_N(C^\infty(M)) \subset M_N(C^\infty(S^*M))$, then this symbol leads to a partial isometry $u \in M_N(C^\infty(S^*M))$ such that $u^*u = p$ and $uu^* = q$. Each such triple can be used to define an element of $K^0(T^*M)$. If $M$ is a spin manifold, Bott periodicity implies that this group is in turn isomorphic to $K^0(M)$. In this way the principal symbol of an elliptic operator defines a class in $K^0(M)$.

It is straightforward to see that the index of the original operator depends only on the class of its principal symbol in $K^0(M)$. In fact this can be done quite explicitly using a construction of Friedrichs: it is possible to define an operator $V$ from $L^2(S^*M)$ onto $L^2(M)$ with $VV^* = I$ such that the projection $P = V^*V$ commutes with multiplication by $C(S^*M)$ modulo the compact operators. It follows that $Vm(u)V^*$ defines a Fredholm operator for any unitary $u \in M_N(C(S^*M))$ (or more generally for any triple $(u, p, q)$). Using $V$ to identify $L^2(M)$ and $PL^2(S^*M)$, this operator is unitarily equivalent to the Toeplitz type operator $Pm(u)P$.

The symbol class of the twisted Dirac operator $D_V$ in $K^0(M)$ turns out to be $[V]$, so the twisted Dirac operators generate the symbol classes. On the other hand the Chern character gives a natural homomorphism of $K^0(M)$ into $H^*_{dR}(M)$. The even cohomology class defined by an arbitrary elliptic differential operator can be computed explicitly; comparison with the index theorem for the operators $D_V$ then yields the general cohomological index formula of Atiyah–Singer.


2. Nonlinear problems. In this course we will consider several non–linear PDEs: the Euler equations for a geodesic; the Nash embedding equations (following Günther); the ODE for parallel transport on a matrix group. Further related examples are the Seiberg–Witten equations for $\mathrm{Spin}^c$ structures on a 4–manifold and the non–linear equations arising in the uniformisation of Riemannian 2–manifolds (see the 2009 course notes on “Analysis of Operators”).

3. Supersymmetry. Although we do not push the similarity too far, Getzler’s proof of the Atiyah–Singer theorem relies on a complete democracy between bosonic and fermionic variables. The bosons are differential operators while the fermions represent spinors. Analytically, differential operators are treated using the so–called pseudodifferential calculus. Getzler introduced a “super” version of this which treated bosons and fermions on an equal footing. As Connes has pointed out, the rule for composing principal symbols can be viewed as a convolution operation in the “super–Heisenberg group” consisting of kernels $K_x(X)$ with values in the exterior algebra with composition rule

\[ (A_x \star B_x)(Z) = \int_{X+Y=Z} A_x(X)B_x(Y)e^{-R_x(X,Y)/4}, \]

where the 2–form $R_x(X, Y)$ is the Riemannian curvature tensor at $x$. This is the approach taken in the first of Getzler’s papers on the index theorem and is very useful for generalisations. In his second, much shorter paper, Getzler showed how to avoid the use of pseudodifferential operators.


1. COMPACT OPERATORS.

Let $H$ be a Hilbert space. An operator $T \in B(H)$ is compact iff $\overline{T(B)}$ is compact, where $B = \{x : \|x\| \le 1\}$ is the closed unit ball. This means that if $(x_n)$ satisfies $\|x_n\| \le 1$, then $(Tx_n)$ has a convergent subsequence. Recall that a finite rank operator $S$ is one such that $S(H)$ is finite–dimensional. Since any closed ball in a finite–dimensional subspace is compact, any such $S$ is compact.

Proposition. T ∈ B(H) is compact iff T is a norm limit of finite rank operators.

Proof. Suppose first that $T$ is compact. Then $\overline{T(B)}$ is compact so given $n$ we can find $x_1, x_2, \ldots, x_m \in B$ such that if $y \in T(B)$, then $\|y - Tx_i\| \le n^{-1}$ for some $x_i$. Let $P_n$ be the orthogonal projection onto $\mathrm{lin}(Tx_1, \ldots, Tx_m)$. From the above $\|P_nT - T\| \le n^{-1}$. Hence $T$ is the norm limit of the finite rank operators $P_nT$.

Conversely suppose that $T_n \to T$ with $T_n$ finite rank. If $(x_n)$ satisfies $\|x_n\| \le 1$, then choose a subsequence $(x^{(1)}_n)$ such that $(T_1x^{(1)}_n)$ is convergent. Then inductively choose subsequences $(x^{(i)}_n)$ of $(x^{(i-1)}_n)$ such that $(T_ix^{(i)}_n)$ is convergent. Let $(y_n)$ be the diagonal subsequence $y_n = x^{(n)}_n$. Thus $(y_n)$ is a subsequence of $(x_n)$ such that $(T_iy_n)$ is convergent for each $i$. So each $(T_iy_n)$ is a Cauchy sequence. Since $\|T_i - T\| \to 0$, it follows that $(Ty_n)$ is a Cauchy sequence and hence $T$ is compact.

Corollary. The compact operators form a linear subspace K(H) of B(H).

Corollary. If T is compact and S is bounded, then ST and TS are compact.

Corollary. $T$ is compact iff $T^*$ is compact. In particular $\mathrm{Re}(T) = (T + T^*)/2$ and $\mathrm{Im}(T) = (T - T^*)/2i$ are compact and self–adjoint with $T = \mathrm{Re}(T) + i\,\mathrm{Im}(T)$.

Corollary. A norm limit of compact operators is compact, i.e. K(H) is closed.

Remark. This also applies mutatis mutandis to operators between different Hilbert spaces, by the usual $2 \times 2$ matrix trick $B(H_1, H_2) \subset B(H_1 \oplus H_2)$. The space of compact operators from $H_1$ to $H_2$ is denoted by $K(H_1, H_2)$.

The spectral theorem for compact self–adjoint operators. Let $T \in B(H)$ be a compact self–adjoint operator, so that $T = T^*$. Then $H$ admits an orthonormal basis $(e_n)$ consisting of eigenvectors of $T$. Thus $Te_n = \lambda_ne_n$ with $\lambda_n \in \mathbb{R}$ and $\lambda_n \to 0$.

Proof. We start by showing that $\pm\|T\|$ is an eigenvalue of $T$. It suffices to show that $\|T\|^2$ is an eigenvalue of $T^2$, since if $T^2x = \|T\|^2x$ and $y = (T - \|T\|)x$, then either $y = 0$, so that $\|T\|$ is an eigenvalue, or $(T + \|T\|)y = 0$, so that $-\|T\|$ is an eigenvalue. Take $x_n$ with $\|x_n\| = 1$ such that $\|Tx_n\| \to \|T\|$. Then

\[ \|(T^2 - \|T\|^2)x_n\|^2 \le 2\|T\|^4\|x_n\|^2 - 2\|T\|^2\|Tx_n\|^2 \to 0. \]

Since $T$ is compact, passing to a subsequence if necessary, we may assume that $(Tx_n)$ is convergent. Hence $(T^2x_n)$ is convergent and from the above $(x_n)$ is convergent to $x$, say, with $\|x\| = 1$ and $T^2x = \|T\|^2x$.

Now pick $e_0$ with $\|e_0\| = 1$ such that $Te_0 = \lambda_0e_0$ with $|\lambda_0| = \|T\|$. Set $T_0 = T$. Define $H_1 = \{e_0\}^\perp$. Since $Te_0 = \lambda_0e_0$, $T(H_1) \subseteq H_1$. Let $T_1 = T|_{H_1}$. Clearly $T_1$ is compact and self–adjoint and $\|T_1\| \le \|T\|$. Now inductively pick $e_n \in H_n$ with $\|e_n\| = 1$ such that $Te_n = \lambda_ne_n$ with $|\lambda_n| = \|T_n\|$. Define $H_{n+1} = \{e_0, \ldots, e_n\}^\perp$. Since $Te_i = \lambda_ie_i$ ($i = 0, \ldots, n$), $T(H_{n+1}) \subseteq H_{n+1}$, so we may set $T_{n+1} = T|_{H_{n+1}}$. Thus $\|T\| = \|T_0\| \ge \|T_1\| \ge \cdots$.

If this process does not terminate, we get infinitely many non–zero eigenvalues with $|\lambda_n| = \|T_n\| \downarrow \delta$. If $\delta > 0$, then $x_n = (\delta/\lambda_n) \cdot e_n$ satisfies $\|x_n\| \le 1$ and $Tx_n = \delta e_n$ has no Cauchy subsequence, contradicting the compactness of $T$. Hence $\lambda_n \to 0$.

Finally suppose that $x \perp e_n$ for all $n$. Then $x \in H_n$ for all $n$, so that $Tx = T_nx$. Since $\|T_n\| \to 0$, we get $Tx = 0$, i.e. $x \in \ker(T)$. By taking an orthonormal basis $(f_i)$ of $\ker(T)$ and combining it with $(e_n)$ we get the required orthonormal basis of $H$.

Rayleigh–Ritz minimax principle. Suppose that $T$ is a self–adjoint operator on the inner product space $E$ and that $Te_i = \lambda_ie_i$, with $(e_i)$ an orthonormal basis of $E$ and $\lambda_1 \ge \lambda_2 \ge \cdots$. Suppose moreover that if $x = \sum a_ie_i$ in $E$, then $(Tx, x) = \sum \lambda_i|a_i|^2$. Then the $k$th eigenvalue of $T$ is given by

\[ \lambda_k = \min_{\dim G = k-1}\ \max_{x \perp G} \frac{\langle Tx, x\rangle}{\langle x, x\rangle}. \]

Proof. Let $E_n$ be the subspace of $E$ spanned by $e_1, \ldots, e_n$. For any $(k - 1)$–dimensional subspace $G$ of $E$ define

\[ \mu(G) = \max_{x \perp G} \frac{\langle Tx, x\rangle}{\langle x, x\rangle}. \]

Then we clearly have $\mu(E_{k-1}) = \lambda_k$, since $E_{k-1}^\perp = \mathrm{lin}(e_k, e_{k+1}, \ldots)$. So it suffices to show that in general $\mu(G) \ge \lambda_k$ to prove the result.

Consider the map $E_k \to G^*$, $x \mapsto \langle x, \cdot\rangle$. This is a linear map of a $k$–dimensional space into a $(k - 1)$–dimensional space so must have some non–zero element in its kernel. Thus we can find $x \ne 0$ in $E_k$ with $x \perp G$. But then $\langle Tx, x\rangle/\langle x, x\rangle \ge \lambda_k$. So $\mu(G) \ge \lambda_k$ as required.

Examples.

(a) Multiplication operators on $\ell^2$. $T(x_n) = (a_nx_n)$ with $\sup |a_n| < \infty$. Thus $T$ is a diagonal operator with $\|T\| = \sup |a_n|$. $T$ is compact iff $|a_n| \to 0$, because it can then be approximated in norm by diagonal finite rank operators.

(b) Hilbert–Schmidt operators. On $\ell^2$ take $A = (a_{nm})$ with $\sum |a_{nm}|^2 < \infty$. Then $\|A\| \le (\sum |a_{ij}|^2)^{1/2}$. Let $A_n$ be the finite rank $n \times n$ matrix given by the principal minor of $A$. Then

\[ \|A - A_n\| \le \Bigl(\sum_{i \text{ or } j > n} |a_{ij}|^2\Bigr)^{1/2} \to 0, \]

so $A$ is compact.
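The tail estimate in (b) can be evaluated explicitly in a simple case. The sketch below assumes the matrix $a_{ij} = 1/(ij)$ for $i, j \ge 1$, an example chosen here for illustration, for which $\sum |a_{ij}|^2 = (\pi^2/6)^2$. The function name is hypothetical. It returns the bound on $\|A - A_n\|$ obtained by subtracting the contribution of the $n \times n$ principal block.

```python
import math

def hs_tail_bound(n: int) -> float:
    """Upper bound for ||A - A_n|| when a_ij = 1/(i*j), i, j >= 1.

    The bound is the square root of the sum of |a_ij|^2 over all (i, j)
    with i > n or j > n, i.e. the full sum minus the n x n principal block.
    """
    total = (math.pi ** 2 / 6) ** 2                        # sum over all i, j
    block = sum(1.0 / i ** 2 for i in range(1, n + 1)) ** 2  # sum over i, j <= n
    return math.sqrt(total - block)

for n in (1, 10, 100, 1000):
    print(n, hs_tail_bound(n))
```

The bound decays like $n^{-1/2}$ for this matrix, so the finite rank truncations converge in norm, though slowly.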

(c) Continuous kernels on $L^2([a, b])$. Take $K(x, y)$ continuous on $[a, b] \times [a, b]$ and set

\[ Tf(x) = \int_a^b K(x, y)f(y)\,dy. \]

Then $\|T\| \le (\int\!\!\int |K|^2)^{1/2} \le (b - a)\sup |K|$. Since $K$ can be uniformly approximated by finite sums $K'(x, y) = \sum f_i(x)g_i(y)$ that correspond to finite rank operators, $T$ is compact. It is also true that, if $(e_n)$ is an orthonormal basis of $L^2([a, b])$, then $e_{nm}(x, y) = e_n(x)e_m(y)$ is an orthonormal basis of $L^2([a, b] \times [a, b])$. Writing $K = \sum a_{nm}e_{nm}$, we get $\int\!\!\int |K|^2 = \sum |a_{nm}|^2$, so that $T$ is actually Hilbert–Schmidt.

2. FREDHOLM OPERATORS.

Let $H$ be a Hilbert space. $T \in B(H)$ is called a Fredholm operator iff $\ker T$ and $H/\mathrm{im}\,T$ are finite–dimensional. The index of $T$ is defined to be $\mathrm{ind}\,T = \dim\ker T - \mathrm{codim}\,\mathrm{im}\,T \in \mathbb{Z}$. Note that the unilateral shift $T(\xi_1, \xi_2, \ldots) = (0, \xi_1, \xi_2, \ldots)$ is Fredholm with index $-1$.

Lemma 1 (products). $S$, $T$ Fredholm iff $ST$, $TS$ Fredholm. If $S$ is invertible and $T$ Fredholm, then $\mathrm{ind}(ST) = \mathrm{ind}(T)$.

Proof. Clearly $\ker(S) \subset \ker(TS)$ and $\mathrm{im}(S) \supset \mathrm{im}(ST)$, so $S$ (and $T$) are Fredholm if $ST$ and $TS$ are. Conversely, $\ker(ST) = \{x : Tx \in \ker(S)\} = T^{-1}(\ker(S) \cap \mathrm{im}(T))$ is finite–dimensional because $\ker(S)$ and $\ker(T)$ are; and if $V$ and $W$ are finite–dimensional subspaces complementing $\mathrm{im}(S)$ and $\mathrm{im}(T)$, then $H = V + S(H) = V + S(W + TH) = V + SW + ST(H)$, so that $V + SW$ complements $ST(H)$. Finally for $S$ invertible $\ker ST = \ker T$ and $\mathrm{im}(ST) = S(\mathrm{im}\,T)$, which proves the second assertion.

Lemma 2 (closure of image). If $T$ is Fredholm, $\mathrm{im}\,T$ is closed and $T$ restricts to an isomorphism of $\ker(T)^\perp$ onto $\mathrm{im}(T)$.

Proof. Let $V$ be a finite–dimensional subspace complementing $T(H)$. By the Banach isomorphism theorem, the map $\ker(T)^\perp \oplus V \to H$, $x \oplus y \mapsto Tx + y$ is an isomorphism. Thus $T$ must restrict to an isomorphism of $\ker(T)^\perp$ onto the closed subspace $\mathrm{im}(T)$.

Theorem 1 (index 0 operators). ind(T ) = 0 iff T is the sum of an invertible and a compact operator.

Proof. Suppose that $\mathrm{ind}(T) = 0$. We know that $T$ gives an isomorphism between $\ker(T)^\perp$ and $\mathrm{im}(T)$. On the other hand $\dim\ker(T^*) = \mathrm{codim}\,\mathrm{im}(T) = \dim\ker T$. Since $H = \ker(T)^\perp \oplus \ker(T) = \mathrm{im}(T) \oplus \ker(T^*)$, we can add any isomorphism $A : \ker(T) \to \ker(T^*)$ onto $T$ to get an invertible operator on $H$. Clearly $A$ is finite rank.

Conversely by Lemma 1, we have to show $I + K$ has index 0. But $I + K = I + A + F$, where $F$ has finite rank and $\|A\|$ is small, $< 1$. Thus $I + A$ is invertible so that $I + K = (I + A)(I + (I + A)^{-1}F)$. Thus $\mathrm{ind}(I + K) = \mathrm{ind}(I + B)$ where $B = (I + A)^{-1}F$ has finite rank. Let $H_1 = \ker B \cap \ker B^*$. So $\dim H_1^\perp < \infty$ and $B = 0 = B^*$ on $H_1$. So $BH_1^\perp \subset H_1^\perp$. Thus on $H = H_1^\perp \oplus H_1$, $I + B = (I + B_1) \oplus I$ where $B_1 = B|_{H_1^\perp}$. So $\mathrm{ind}(I + B) = 0$, from the finite–dimensional case.

Corollary (Fredholm alternative). $I + K$ is Fredholm of index 0 if $K$ is compact. In particular if $K$ is compact and $\lambda \ne 0$, $K - \lambda I$ either has non–zero kernel or is invertible.

Theorem 2 (Atkinson’s parametrix criterion). The following are equivalent:

(1) T is Fredholm.

(2) ST − I and TS − I are finite rank for some S.

(3) ST − I and TS − I are compact for some S.

Remark. $S$ is called a parametrix or inverse modulo the finite ranks or compacts. It is unique up to a finite rank or compact; for if $S_1T = I + A$ and $TS_2 = I + B$ with $A, B$ finite rank or compact, then $S_2 + AS_2 = (S_1T)S_2 = S_1(TS_2) = S_1 + S_1B$, so that $S_2 - S_1 = S_1B - AS_2$. Thus the Fredholms form a multiplicative group modulo the compacts.

Proof. (1) → (2): $T$ gives an isomorphism $\ker(T)^\perp \to \mathrm{im}(T)$. Let $S$ be the inverse of $T$ on $\mathrm{im}(T)$ and 0 on $\ker(T^*)$. Since $TS - I = 0$ on $\mathrm{im}(T)$ and $ST - I = 0$ on $\ker(T)^\perp$, both $ST - I$ and $TS - I$ have finite rank.

(2) → (3): trivially, because finite rank implies compact.

(3) → (1): We have $ST = I + K_1$ and $TS = I + K_2$ with $K_i$ compact. So $ST$ and $TS$ are Fredholm by the Fredholm alternative. So $T$ is Fredholm by Lemma 1.

Corollary 1 (adjoints). $T$ is Fredholm iff $T^*$ is Fredholm and $\mathrm{ind}(T) = \dim\ker T - \dim\ker T^* = -\mathrm{ind}(T^*)$.

Corollary 2 (compact perturbations). If T is Fredholm and K compact, then T +K is Fredholm.

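To get the main properties of the index we need a construction $S \oplus T$, the direct sum of $S$ and $T$ on $H \oplus H$. This is given by the matrix

\[ \begin{pmatrix} S & 0 \\ 0 & T \end{pmatrix} = \begin{pmatrix} S & 0 \\ 0 & I \end{pmatrix}\begin{pmatrix} I & 0 \\ 0 & T \end{pmatrix}. \]

Clearly $S \oplus T$ is Fredholm iff $S$ and $T$ are Fredholm, and $\mathrm{ind}(S \oplus T) = \mathrm{ind}(S) + \mathrm{ind}(T)$ (since $\ker(S \oplus T) = \ker(S) \oplus \ker(T)$ and $\mathrm{im}(S \oplus T) = \mathrm{im}(S) \oplus \mathrm{im}(T)$).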

Theorem 3 (stability properties). The Fredholms are open in B(H) and ind is norm continuous.

Proof. Take $S = T \oplus T^*$, a Fredholm of index 0. Thus $S$ is an invertible plus a compact. Since the invertibles are open, $S + A$ is Fredholm of index 0 for $\|A\|$ sufficiently small. In particular, taking $A = B \oplus 0$, $(T + B) \oplus T^*$ is Fredholm of index 0 for $\|B\|$ sufficiently small. Hence $T + B$ is Fredholm of index $\mathrm{ind}(T)$ for $\|B\|$ sufficiently small.

Corollary 1 (compact perturbations). If T is Fredholm and K compact, then ind(T +K) = ind(T ).

Proof. ind(T + tK) is a continuous function R → Z, so is constant.

Corollary 2 (index of product). If S and T are Fredholm, ind(ST ) = ind(S) + ind(T ).

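Proof. We have $\mathrm{ind}(ST) = \mathrm{ind}\begin{pmatrix} ST & 0 \\ 0 & I \end{pmatrix}$ while $\mathrm{ind}(S) + \mathrm{ind}(T) = \mathrm{ind}\begin{pmatrix} S & 0 \\ 0 & T \end{pmatrix}$. Let

\[ F(t) = \begin{pmatrix} S & 0 \\ 0 & I \end{pmatrix}\begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}\begin{pmatrix} T & 0 \\ 0 & I \end{pmatrix}\begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}, \]

a continuous path of Fredholms with $F(0) = ST \oplus I$ and $F(\pi/2) = S \oplus T$. The result follows from $\mathrm{ind}\,F(0) = \mathrm{ind}\,F(\pi/2)$.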
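The two endpoint identities for $F(t)$ are pure matrix algebra, so they can be checked with finite matrices in place of $S$ and $T$. The sketch below is illustrative only: the helper names and the particular $2 \times 2$ matrices are assumptions made here, and a finite check says nothing about the Fredholm property itself, only about the block computation.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def block_diag(a, b):
    """Matrix of a (+) b: a in the top-left block, b in the bottom-right block."""
    n, m = len(a), len(b)
    return [list(row) + [0.0] * m for row in a] + [[0.0] * n + list(row) for row in b]

def rotation(t, n):
    """Block matrix [[cos t I, sin t I], [-sin t I, cos t I]] with n x n blocks."""
    c, s = math.cos(t), math.sin(t)
    eye = identity(n)
    top = [[c * e for e in row] + [s * e for e in row] for row in eye]
    bottom = [[-s * e for e in row] + [c * e for e in row] for row in eye]
    return top + bottom

def path(t, S, T):
    """F(t) = (S (+) I) R(t) (T (+) I) R(t)^(-1), using R(t)^(-1) = R(-t)."""
    eye = identity(len(S))
    left = matmul(block_diag(S, eye), rotation(t, len(S)))
    right = matmul(block_diag(T, eye), rotation(-t, len(S)))
    return matmul(left, right)

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

S = [[1.0, 2.0], [3.0, 4.0]]   # two non-commuting sample matrices
T = [[0.0, 1.0], [1.0, 1.0]]
print(close(path(0.0, S, T), block_diag(matmul(S, T), identity(2))))
print(close(path(math.pi / 2, S, T), block_diag(S, T)))
```

Non-commuting sample matrices are used so that the check does not pass merely because $ST = TS$.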

SUMMARY. Under multiplication, the Fredholms form a group modulo the compact operators. The index gives a homomorphism onto $\mathbb{Z}$ with kernel $\{T + K : T \text{ invertible}, K \text{ compact}\}$. The Fredholms are open in $B(H)$ and ind is norm continuous.

Remark. The whole theory of Fredholm operators works equally well for operators between different Hilbert spaces, since all separable infinite–dimensional spaces are isomorphic. This can be seen by using unitaries to identify the different spaces. Alternatively if $T : H_1 \to H_2$ is Fredholm (finite–dimensional kernel and cokernel) and $U : H_2 \to H_1$ is unitary, then $\begin{pmatrix} 0 & U \\ T & 0 \end{pmatrix}$ is a Fredholm operator on $H = H_1 \oplus H_2$ with the same index.

3. AN INDEX THEOREM FOR TOEPLITZ OPERATORS.

Let $H = L^2(S^1) = \{\sum_{n \in \mathbb{Z}} a_ne_n \mid \sum |a_n|^2 < \infty\}$, where $e_n(z) = z^n$ for $|z| = 1$. Let $H^2(S^1)$ be the Hardy space $\{\sum_{n \ge 0} a_ne_n \mid \sum |a_n|^2 < \infty\}$ and let $P : L^2 \to H^2$ be the orthogonal projection $P(\sum a_ne_n) = \sum_{n \ge 0} a_ne_n$. For $f \in C(S^1)$, let $m(f)$ denote the multiplication operator on $L^2(S^1)$. The Toeplitz operator corresponding to $f$ is $T(f) = Pm(f)P$, acting on $H^2(S^1)$. Clearly if $f \in C(S^1)$, then $\|T(f)\| \le \|f\|_\infty$ and $\|[m(f), P]\| \le 2\|f\|_\infty$.

Lemma. If $f$ is a trigonometric polynomial then $[m(f), P]$ is of finite rank. Hence $[m(f), P]$ is compact for any $f \in C(S^1)$.

Proof. Let $V$ be multiplication by $z$. Then $P - V^nPV^{-n}$ is finite rank if $n \ge 1$, since it is the projection onto the subspace spanned by $e_0, e_1, \ldots, e_{n-1}$. Hence $[V^n, P] = (V^nPV^{-n} - P)V^n$ is finite rank for $n \ge 0$. Taking adjoints, this is also true for $n \le 0$, so the first assertion follows. Approximating $f \in C(S^1)$ by trigonometric polynomials, we see that $[m(f), P]$ is a norm limit of finite rank operators, so compact.

Corollary. If $f, g \in C(S^1)$, then $T(fg) - T(f)T(g)$ is compact. Hence if $f(S^1) \subset \mathbb{C}^*$, then $T(f)$ is a Fredholm operator with inverse $T(f^{-1})$ modulo the compacts.

Proof. $T(fg) - T(f)T(g) = Pm(fg)P - Pm(f)Pm(g)P = P[P, m(f)]m(g)P$ is clearly compact, since $[P, m(f)]$ is.

Examples. The index of $T(e_n)$ where $e_n(z) = z^n$ ($n \in \mathbb{Z}$) is given by $-n$.

Proof. For $n > 0$, $T(z^n) = T(z)^n$. Since $T(z)$ has index $-1$, this shows $T(z^n)$ has index $-n$. Since $T(z^{-n}) = T(z^n)^*$, it follows that $T(z^{-n})$ has index $n$.

Definition. For any continuous map $f : S^1 \to \mathbb{C}^*$ there is a unique $n$, called the winding number of $f$, such that $f$ and $e_n$ can be joined by a continuous path in $C(S^1, \mathbb{C}^*)$: to see this, just take logarithms.

Noether’s Theorem (index of Toeplitz operators). For $f \in C(S^1, \mathbb{C}^*)$, $\mathrm{ind}\,T(f) = -\,$winding number of $f$.

Proof. Obvious since the index is invariant under norm–continuous paths in the Fredholms.
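Since the theorem reduces the index to a winding number, the index of $T(f)$ can be read off numerically from samples of $f$. The sketch below is one way to do this, assuming $f$ is given as a Python callable on the unit circle that never vanishes and is sampled finely enough that consecutive phase increments stay below $\pi$; the function names are hypothetical.

```python
import cmath
import math

def winding_number(f, samples=2048):
    """Winding number about 0 of theta -> f(e^{i theta}), by summing phase increments."""
    total = 0.0
    prev = f(1 + 0j)
    for k in range(1, samples + 1):
        cur = f(cmath.exp(2j * math.pi * k / samples))
        total += cmath.phase(cur / prev)   # increment lies in (-pi, pi]
        prev = cur
    return round(total / (2 * math.pi))

def toeplitz_index(f, samples=2048):
    """Index of T(f) according to Noether's theorem: minus the winding number."""
    return -winding_number(f, samples)

print(toeplitz_index(lambda z: z ** 3))      # symbol e_3
print(toeplitz_index(lambda z: 2 + z))       # symbol homotopic to a constant
print(toeplitz_index(lambda z: z ** -2))     # symbol e_{-2}
```

The middle symbol $2 + z$ never winds around the origin, so its Toeplitz operator has index 0, consistent with $T(2 + z) = 2I + T(z)$ being a small perturbation of an invertible operator.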

4. SOBOLEV SPACES AND EIGENFUNCTION EXPANSIONS FOR ELLIPTIC OPERATORS ON $T^n$.

A. Statement of problem. Let $L = \Delta + V(x)$ be an elliptic operator on $T = T^n = \mathbb{R}^n/\mathbb{Z}^n$ where $\Delta = -\sum \partial^2/\partial x_i^2$ and $V \in C^\infty(T)$ or $L^\infty(T)$. The fundamental problem is to solve $Lu = f$ when $f$ is given.

The Hilbert space method of solving this problem is to solve the associated eigenvalue/eigenfunction problem.

Show that the eigenfunctions $\psi_n$ of $L$ form a complete orthonormal basis in $L^2(T)$ and that $\psi_n$ is as smooth as permitted by $V$ (for example $C^\infty$ if $V$ is).

Then if $f \in C(T)$ or $L^2(T)$, we write $f = \sum a_n\psi_n$ and solve $Lu = f$ by $u = \sum b_n\psi_n$ with $b_n = a_n/\lambda_n$, where $L\psi_n = \lambda_n\psi_n$.
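The recipe $b_n = a_n/\lambda_n$ is easiest to see in the one case where the eigenfunctions are known in advance. The sketch below assumes the circle $T^1$ and a constant potential $V = c > 0$ (a simplification made here, not the general situation of the notes), so that $\psi_m = e_m$ and $\lambda_m = m^2 + c$. Functions are represented by dictionaries of Fourier coefficients; all names are hypothetical.

```python
def solve_on_circle(f_coeffs, c):
    """Fourier coefficients of u solving (Delta + c) u = f on the circle, c > 0.

    f_coeffs maps each integer frequency m to the coefficient a_m of e_m;
    e_m is an eigenfunction of Delta + c with eigenvalue m^2 + c.
    """
    if c <= 0:
        raise ValueError("c must be positive so that every eigenvalue is non-zero")
    return {m: a / (m * m + c) for m, a in f_coeffs.items()}

def apply_operator(u_coeffs, c):
    """Fourier coefficients of (Delta + c) u."""
    return {m: (m * m + c) * b for m, b in u_coeffs.items()}

f = {3: 0.5, -3: 0.5}            # f(theta) = cos(3 theta)
u = solve_on_circle(f, c=1.0)    # each coefficient is divided by 3^2 + 1 = 10
print(u)
print(apply_operator(u, c=1.0))  # recovers the coefficients of f
```

For a non–constant $V$ the exponentials are no longer eigenfunctions, which is why the eigenfunction problem has to be solved first.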

The key tool, now one of the main techniques in the modern theory of linear partial differential equations, is the use of Sobolev norms and the associated Sobolev spaces, introduced above.

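B. Sobolev norms and spaces. Let $T^n = \{(z_1, \ldots, z_n) : |z_i| = 1\} = \{(e^{i\theta_1}, \ldots, e^{i\theta_n}) : \theta_i \in [0, 2\pi)\}$. We set $e_m(z) = z^m = z_1^{m_1} \cdots z_n^{m_n}$ (multi–index notation). Furthermore $D_j = -i\,\partial/\partial\theta_j = z_j\,\partial/\partial z_j$, so that $D_je_m = m_je_m$. We write $\|m\| = (\sum |m_i|^2)^{1/2}$. For $\alpha = (\alpha_1, \ldots, \alpha_n)$ with $\alpha_i \in \mathbb{N}$, set $D^\alpha f = D_1^{\alpha_1} \cdots D_n^{\alpha_n}f$. This is a differential operator of total degree $|\alpha| = \sum \alpha_i$. We know that $(e_m)$ form an orthonormal basis in $C(T^n)$ for the inner product $(f, g) = (2\pi)^{-n}\int f(x)\overline{g(x)}\,dx$. Moreover $(\sum a_me_m, \sum b_pe_p) = \sum a_m\overline{b_m}$.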

If $f \in C^\infty(T)$, we define the $s$th Sobolev norm of $f$ by

\[ \|f\|_{(s)} = \Bigl(\sum |\hat{f}(n)|^2(1 + |n|^2)^s\Bigr)^{1/2}. \]

There is an associated inner product $\langle f, g\rangle_{(s)} = \sum (1 + |n|^2)^s\hat{f}(n)\overline{\hat{g}(n)}$, making $C^\infty(T)$ into an inner product space. We define $H^s(T)$ to be the Hilbert space completion of $C^\infty(T)$ with respect to $\|\cdot\|_{(s)}$. Equivalently, since the $e_n$’s are trigonometric polynomials, $H^s(T)$ may also be regarded as the Hilbert space completion of the space of trigonometric polynomials on $T^n$ with respect to $\|f\|_{(s)} = (\sum |\hat{f}(m)|^2(1 + \|m\|^2)^s)^{1/2}$. Thus $H^s(T) = \{\sum a_mz^m : \sum |a_m|^2(1 + \|m\|^2)^s < \infty\}$ with inner product $(\sum a_mz^m, \sum b_mz^m) = \sum a_m\overline{b_m}(1 + \|m\|^2)^s$. Note that for $m \ge 0$ an integer,

\[ \|f\|_{(m)} = \Bigl(\sum_{|\alpha| \le m} \binom{m}{\alpha}\int |D^\alpha f|^2\Bigr)^{1/2} \le C\sup_{|\alpha| \le m} |D^\alpha f|, \]

using the binomial expansion for $(1 + |n|^2)^m$ plus the fact that $\widehat{D_jf}(n) = n_j\hat{f}(n)$. Thus $C^k(T) \subset H^k(T)$. We now prove an inclusion in the opposite direction.

Sobolev’s Embedding Theorem. If $s > n/2$, then $H^{s+k}(T) \subset C^k(T)$ and $\sum_{|\alpha| \le k} \sup |D^\alpha f| \le C\|f\|_{(s+k)}$ for some constant $C > 0$.

Proof. We prove two lemmas:

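Lemma 1. If $\sum |a_m| < \infty$, then $\sum a_m z^m$ is absolutely convergent. If $s > n/2$ and $\sum |a_m|^2 (\|m\|^2 + 1)^s < \infty$, then

$$\Big|\sum a_m z^m\Big| \le \sum |a_m| \le \Big(\sum |a_m|^2 (\|m\|^2 + 1)^s\Big)^{1/2} \Big(\sum (\|m\|^2 + 1)^{-s}\Big)^{1/2}.$$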

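Proof. The inequality follows from the Cauchy–Schwarz inequality. The sum $\sum (1 + |n|^2)^{-s}$ is finite for $s > n/2$ by the integral test on $\mathbb{R}^n$. In fact, converting to radial coordinates, this reduces to $\int_1^R r^{-2s} r^{n-1}\,dr \sim R^{n-2s}$, which stays bounded as $R \to \infty$ when $n - 2s < 0$.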
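A small numerical sketch of Lemma 1 for n = 1 and s = 1 follows. The closed form $\sum_{m \in \mathbb{Z}} (1 + m^2)^{-1} = \pi\coth\pi$ is a standard identity that is not taken from these notes, and the random coefficients are purely illustrative.

```python
import numpy as np

def weight_sum(s, cutoff):
    """Partial sum of (1 + m^2)^(-s) over |m| <= cutoff (the case n = 1)."""
    m = np.arange(-cutoff, cutoff + 1, dtype=float)
    return float(np.sum((1.0 + m ** 2) ** (-s)))

# Convergence for s = 1 > n/2 = 1/2, compared with the assumed closed form.
partial = weight_sum(1.0, 100_000)
closed_form = np.pi / np.tanh(np.pi)

# Cauchy-Schwarz step of Lemma 1 for some rapidly decaying coefficients.
rng = np.random.default_rng(1)
m = np.arange(-200, 201, dtype=float)
a = rng.normal(size=m.size) / (1.0 + m ** 2)
s = 1.0
lhs = float(np.sum(np.abs(a)))
rhs = float(
    np.sqrt(np.sum(a ** 2 * (1.0 + m ** 2) ** s))
    * np.sqrt(np.sum((1.0 + m ** 2) ** (-s)))
)
print(partial, closed_form, lhs, rhs)
```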

Lemma 2. If $\sum |b_m| < \infty$ and $\sum |m_i| \cdot |b_m| < \infty$ for some $i$, then $f(z) = \sum b_m z^m$ is continuous with continuous derivative $D_i f(z) = \sum m_i b_m z^m$.

Proof. Set $g(z) = \sum m_i b_m z^m$. Then $\int_0^x g(s)\,ds = f(x) - f(0)$ by uniform convergence of the Fourier series for $g(z)$. The result follows.

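We now prove Sobolev’s embedding theorem. For $k = 0$, it follows from Lemma 1. In general by Lemma 2, $D^\alpha f = \sum m^\alpha a_m z^m$ for $|\alpha| \le k$. Let $b_m = m^\alpha a_m$. Applying Lemma 1 to $D^\alpha f$, we get $\|D^\alpha f\|_\infty \le K_s (\sum |b_m|^2 (\|m\|^2 + 1)^s)^{1/2}$ for $s > n/2$. But $|m^\alpha| \le \|m\|^{|\alpha|} \le (1 + \|m\|^2)^{k/2}$, so $\sup_{|\alpha| \le k} \|D^\alpha f\|_\infty \le K_s \|f\|_{(s+k)}$ as required.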

In particular $C^\infty(T) = \bigcap_{s \ge 0} H^s(T)$ and can be identified with the Fourier series $\sum a_m z^m$ with $(1 + \|m\|^2)^k |a_m| \to 0$ as $\|m\| \to \infty$ for every $k$ (“rapid decay”). Recall that $D_i = z_i \partial_{z_i} = -i\,d/d\theta_i$, so that $D_i z^m = m_i z^m$ or $D_i e_m = m_i e_m$.


C. Rellich’s compactness lemma. If s > t, the inclusion Hs → Ht is compact.

Proof. In the natural orthonormal bases, the inclusion becomes multiplication by (1 + ‖m‖2)(t−s)/2, which is clearly compact [some power is even Hilbert–Schmidt].

D. Differential operators.

Lemma. (a) DαHs ⊂ Hs−|α| and Dα : Hs → Hs−|α| is bounded.

(b) I + ∆ : Hs+2 → Hs is a unitary map. In particular (I + ∆)k : Hk → H−k is a unitary.

Proof. (a) This is obvious because in Fourier Dα is multiplication by mα and |mα| ≤ ‖m‖|α|. (b) I + ∆ is multiplication by 1 + ‖m‖2 in Fourier, so the result is immediate.

E. The differential operator L = ∆ + V . Suppose that V ∈ L∞(T ) with V ≥ ε > 0. Let L = ∆ + V . The operator L maps H2 to H0, because ∆ does and V may be regarded as the composition of the inclusion H2 → H0 with multiplication by V .

Lemma (Fredholm argument). If V ∈ L∞(T ) satisfies V ≥ ε > 0, then the map L : H2 → H0 is an isomorphism.

Proof. L has trivial kernel because

$$(Lf, f) = (\Delta f, f) + (Vf, f) \ge \min(1, \varepsilon)[(\Delta f, f) + (f, f)] = \min(1, \varepsilon)\|f\|_{(1)}^2,$$

so that $Lf = 0$ forces $f = 0$. Composing $L$ with the unitary $(I + \Delta)^{-1} : H^0 \to H^2$, we get $L(I + \Delta)^{-1} = I + K$ where $K = (V - I)(I + \Delta)^{-1} \in K(H^0)$, by Rellich’s lemma. Thus $L(I + \Delta)^{-1} \in B(H^0)$ is Fredholm of index 0 (by the Fredholm alternative). Since it is also injective, it must be an isomorphism, so $L$ must be an isomorphism.

F. Eigenfunction expansion for L = ∆ + V .

Theorem. If V ∈ L∞(T ), L2(T ) admits an orthonormal basis consisting of functions ψn ∈ H2 with Lψn = λnψn and λn → ∞. If V ≥ ε > 0, λj > 0 for all j.

Proof. Adjusting by a scalar if necessary, we may assume that V ≥ ε > 0. Thus L : H2 → H0 is an isomorphism. Let T ∈ B(H0) denote the composition of the inverse of L with the inclusion H2 → H0. By Rellich’s lemma, T is compact. Note that T (H0) ⊂ H2 and T is injective, since both the inverse of L and the inclusion are. T is also self–adjoint, since if f = Lu, g = Lv in H0 with u, v ∈ H2, then

(Tf, g) = (TLu,Lv) = (u, Lv) = (Lu, v) = (Lu, TLv) = (f, T g).

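So we can apply the spectral theorem for compact self–adjoint operators. Hence we can find an orthonormal basis $(\psi_j)$ in $H^0$ such that $T\psi_j = \mu_j \psi_j$ with $\mu_j \to 0$. Since $T$ is injective, $\mu_j \ne 0$, so $\psi_j \in T(H^0) \subset H^2$. Since $T\psi_j = \mu_j \psi_j$, we get $LT\psi_j = \mu_j L\psi_j$. So $L\psi_j = \lambda_j \psi_j$ with $\lambda_j = \mu_j^{-1}$. Clearly $\lambda_j \to \infty$. Finally, since $(L\psi, \psi) = (\Delta\psi, \psi) + (V\psi, \psi) \ge \varepsilon(\psi, \psi)$, we get $\lambda_j > 0$.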
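The eigenfunction method can be sketched numerically on the circle by truncating to finitely many Fourier modes. The potential $V(\theta) = 2 + \cos\theta$ (so $V \ge 1$), the right hand side $f = 1 + \cos\theta$ and the cutoff are assumptions chosen only for illustration; truncation gives an approximation to the operator, not the operator itself.

```python
import numpy as np

M = 20                                   # keep Fourier modes -M, ..., M
freqs = np.arange(-M, M + 1)
size = freqs.size

# Matrix of L = Delta + V in the basis e_m: Delta is diag(m^2), and
# multiplication by V(theta) = 2 + cos(theta) has entries V^(m - n),
# with V^(0) = 2 and V^(+1) = V^(-1) = 1/2.
L = np.diag(freqs.astype(float) ** 2 + 2.0)
L = L + 0.5 * (np.eye(size, k=1) + np.eye(size, k=-1))

# Orthonormal eigenvectors psi_n (columns) and eigenvalues lambda_n (ascending).
eigenvalues, eigenvectors = np.linalg.eigh(L)

# Solve L u = f for f = 1 + cos(theta) via u = sum (a_n / lambda_n) psi_n.
f = np.zeros(size)
f[M] = 1.0
f[M + 1] = 0.5
f[M - 1] = 0.5
a = eigenvectors.T @ f                   # coefficients a_n = (f, psi_n)
u = eigenvectors @ (a / eigenvalues)     # b_n = a_n / lambda_n

residual = float(np.linalg.norm(L @ u - f))
print(eigenvalues[:4], residual)
```

Since $V \ge 1$ here, all computed eigenvalues should be at least 1, matching the positivity statement in the theorem.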

G. Eigenfunction expansions for Sturm–Liouville problems on [0, 1]. When n = 1, the eigenfunction expansion for ∆ + V reduces to the Sturm–Liouville problem on [−1, 1] (say) with periodic boundary conditions f(1) = f(−1), f ′(1) = f ′(−1). By a doubling trick (also called the method of reflection), this periodic problem can be used to solve the corresponding eigenvalue problem on [0, 1] with Dirichlet or Neumann boundary conditions.

Theorem. Let $V \in L^\infty([0, 1])$ and $L = -d^2/dx^2 + V$. $L^2([0, 1])$ has an orthonormal basis $(\psi_n)$ of eigenfunctions for the operator $L = -d^2/dx^2 + V(x)$ satisfying either the Dirichlet boundary conditions $\psi_n(0) = \psi_n(1) = 0$ or the Neumann boundary conditions $\psi_n'(0) = \psi_n'(1) = 0$. If $L\psi_n = \lambda_n \psi_n$, then $\lambda_n \to \infty$. If $V$ is piecewise continuous, then $\psi_n, \psi_n'$ are continuous and $\psi_n''$ is piecewise continuous.

Proof. To use the doubling trick, extend V to V (x) on [−1, 1] by V (x) = V (−x) for x ≤ 0. So V is even and periodic on [−1, 1]. Let σf(x) = f(−x) on L2([−1, 1]) and L = −d2/dx2 + V . Let ψn be the eigenfunctions of L in L2(T) = L2([−1, 1]). Thus ψn ∈ H2(T), so that by Sobolev’s lemma ψn ∈ C1(T).


Now $\psi_n'$ lies in $H^1(\mathbb{T}) \subset C(\mathbb{T})$ and its formal derivative as a Fourier series, $f = \psi_n'' = (V - \lambda_n)\psi_n$, is piecewise continuous. Thus $g(t) = \int_0^t f(s)\,ds$ satisfies $g' = f$ (in the usual sense). Moreover $g(2\pi) - g(0) = (f, e_0) = (\psi_n'', e_0) = 0$ (integrate by parts) and, for $m \ne 0$, $-im(g, e_m) = (g, e_m') = -(g', e_m) = -(f, e_m) = -(\psi_n'', e_m) = -im(\psi_n', e_m)$. So the non–zero Fourier coefficients of $g$ and $\psi_n'$ coincide and therefore $g$ and $\psi_n'$ differ by a constant. Hence the usual derivative of $\psi_n'$ exists and equals $f = (V - \lambda_n)\psi_n$.

Since L commutes with σ, each of its eigenspaces is a sum of even and odd parts (i.e. the eigenspaces of σ). The even functions correspond to the eigenfunctions of L with Neumann boundary conditions while the odd functions correspond to those with Dirichlet boundary conditions. Since restricting even or odd functions to [0, 1] gives all of L2([0, 1]), each set of eigenfunctions is complete.

H. Multiplication operators.

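Lemma (Peetre’s inequality). For $s \in \mathbb{R}$ and $\xi, \eta \in \mathbb{R}^n$, $(1 + \|\xi\|^2)^s (1 + \|\eta\|^2)^{-s} \le 2^{|s|} (1 + \|\xi - \eta\|^2)^{|s|}$.

Proof. Swapping $\xi$ and $\eta$ if necessary, we may assume that $s \ge 0$. Now $\|\xi\| \le \|\xi - \eta\| + \|\eta\|$, so $\|\xi\|^2 \le 2(\|\xi - \eta\|^2 + \|\eta\|^2)$. So $1 + \|\xi\|^2 \le 1 + 2(\|\xi - \eta\|^2 + \|\eta\|^2) \le 2(1 + \|\eta\|^2)(1 + \|\xi - \eta\|^2)$. Raising both sides to the power $s$ gives the result.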
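Peetre's inequality is easy to spot-check on random vectors. The sketch below does this for a few positive and negative exponents; the dimension, sample size and exponents are arbitrary choices for illustration, and a passing check is of course not a proof.

```python
import numpy as np

def peetre_ratio(xi, eta, s):
    """Left side divided by right side of Peetre's inequality (should be <= 1)."""
    lhs = (1.0 + np.sum(xi ** 2, axis=-1)) ** s * (1.0 + np.sum(eta ** 2, axis=-1)) ** (-s)
    rhs = 2.0 ** abs(s) * (1.0 + np.sum((xi - eta) ** 2, axis=-1)) ** abs(s)
    return lhs / rhs

rng = np.random.default_rng(2)
xi = rng.normal(scale=10.0, size=(10_000, 3))
eta = rng.normal(scale=10.0, size=(10_000, 3))

worst = {s: float(np.max(peetre_ratio(xi, eta, s))) for s in (-2.5, -1.0, 0.5, 3.0)}
print(worst)
```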

First Sobolev multiplication theorem. $H^s$ is invariant under multiplication by smooth functions. If $f = \sum a_m z^m$ in $H^s$ and $g = \sum b_m z^m$ in $C^\infty(T)$, then $gf \in H^s$ and $\|gf\|_{(s)} \le 2^{|s|} (\sum |b_m| (1 + \|m\|^2)^{|s|/2}) \|f\|_{(s)}$.

Remark. This can be proved directly when $s = k$ is a non–negative integer using $\|f\|_{(k)}^2 = \sum_{|\alpha| \le k} \binom{k}{\alpha} \|D^\alpha f\|^2$ and the Leibniz rule.

Proof. Note that $\|gf\|_{(s)}^2 = \sum_m |\sum_{i+j=m} b_i a_j|^2 (1 + \|m\|^2)^s$. Thus using Peetre’s inequality, we get

$$\|gf\|_{(s)}^2 = \sum_m \Big|\sum_{i+j=m} b_i a_j\Big|^2 (1 + \|m\|^2)^s \le 2^{|s|} \sum_m \Big|\sum_{i+j=m} |b_i| (1 + \|i\|^2)^{|s|/2} |a_j| (1 + \|j\|^2)^{s/2}\Big|^2.$$

Set $A_j = |a_j| (1 + \|j\|^2)^{s/2}$, $B_i = |b_i| (1 + \|i\|^2)^{|s|/2}$ and define $F = \sum A_m z^m$, $G = \sum B_m z^m$. The inequality $\|GF\|_2 \le \|G\|_\infty \|F\|_2 \le (\sum |B_m|) \|F\|_2$ shows that the right hand side is bounded by $2^{|s|} (\sum |b_i| (1 + \|i\|^2)^{|s|/2})^2 \sum |a_j|^2 (1 + \|j\|^2)^s$, as required.

I. Elliptic regularity and Weyl’s lemma. Let $H^{-\infty} = \bigcup H^s$ and $H^\infty = \bigcap H^s = C^\infty(T)$. Let $L = \Delta + V$.

Weyl’s lemma (global regularity). If $Lu = f$ with $f \in H^s$ and $u \in H^{-\infty}$, then $u \in H^{s+2}$. In particular, if $f \in C^\infty(T)$, then $u \in C^\infty(T)$.

Proof. Say $Lu = f$. Then $(\Delta + I)u = f + (1 - V)u$, so that $u = (\Delta + I)^{-1}(f + (1 - V)u)$. Now we use a “bootstrap” argument. Let $t$ be the biggest value of $t$ such that $u \in H^t$. If $t < s + 2$, then $f + (1 - V)u$ lies in $H^t$ if $t < s$ and in $H^s$ if $t \ge s$. So $u = (\Delta + I)^{-1}(f + (1 - V)u)$ lies in $H^{t+2}$ if $t < s$ and $H^{s+2}$ if $t \ge s$, a contradiction. Hence $t \ge s + 2$, so that $u \in H^{s+2}$.

We give the standard construction of bump functions or blips: these are smooth functions supported inside arbitrary balls in $\mathbb{R}^n$ or $T^n$.

Lemma (bump function). There is a smooth function $f$ on $\mathbb{R}^n$ such that $f(t) = 1$ for $|t| \le 1$, $f(t) = 0$ for $|t| \ge 1 + \delta$ and $0 \le f(t) \le 1$ for all $t$.

Proof. If we can solve this for $\mathbb{R}$, we take $f(a\|x\|^2)$ for $\mathbb{R}^n$. Let $g(x)$ be the smooth function on $\mathbb{R}$ given by $\exp(-(1 - x^2)^{-1})$ for $|x| < 1$ and 0 for $|x| \ge 1$. Let $h(x) = \int_{-\infty}^x g(t)\,dt / \int g$. Thus $h$ is smooth, $h(x) = 0$ for $x \le -1$, $h(x) = 1$ for $x \ge 1$ and $0 \le h(x) \le 1$ for all $x$. Taking $k(x) = h(\alpha x + \beta)$ for suitable $\alpha, \beta$, we get a smooth function $k$ with $k(x) = 0$ for $x \le -1 - \delta$ and $k(x) = 1$ for $x \ge -1$ and $0 \le k(x) \le 1$ for all $x$. Now set $f(x) = k(x)k(-x)$.
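The one-dimensional construction in this proof can be sketched numerically as follows. The quadrature grid, the linear interpolation of $h$ and the explicit choice $\alpha = 2/\delta$, $\beta = 1 + 2/\delta$ are assumptions of this example (the interpolated $h$ is only an approximation to the smooth function in the proof).

```python
import numpy as np

def g(x):
    """exp(-1/(1 - x^2)) for |x| < 1 and 0 otherwise."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    inside = np.abs(x) < 1
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

# Tabulate h(x) = (integral of g from -inf to x) / (integral of g) by the trapezium rule.
_grid = np.linspace(-1.0, 1.0, 20001)
_vals = g(_grid)
_cum = np.concatenate([[0.0], np.cumsum((_vals[1:] + _vals[:-1]) / 2 * np.diff(_grid))])
_cum /= _cum[-1]

def h(x):
    """Approximate h: 0 for x <= -1, 1 for x >= 1, increasing in between."""
    return np.interp(x, _grid, _cum, left=0.0, right=1.0)

def k(x, delta):
    """h(alpha x + beta) with the affine map sending -1-delta to -1 and -1 to 1."""
    alpha = 2.0 / delta
    beta = 1.0 + 2.0 / delta
    return h(alpha * np.asarray(x, dtype=float) + beta)

def bump(x, delta):
    """f(x) = k(x) k(-x): equal to 1 on [-1, 1] and 0 outside (-1-delta, 1+delta)."""
    x = np.asarray(x, dtype=float)
    return k(x, delta) * k(-x, delta)

print(bump(np.array([0.0, 1.0, 1.1, 1.25, 2.0]), 0.25))
```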

Weyl’s lemma (local regularity). Suppose Lu = f with L = ∆ + V , where V ∈ C∞(T ), f, u ∈ H−∞(T ). Let U be an open subset of T and suppose that ψf is smooth for every ψ ∈ C∞(T ) such that ψ = 0 off U . Then ψu is also smooth for each such ψ.


Proof. Note that $\psi u = (I + \Delta)^{-1}(\psi f + \psi(I - V)u - (\Delta\psi)u - 2\sum \partial_{x_i}(\psi_{x_i} u))$. If $\psi u \in H^s$ for all $\psi$, then the bracketed expression lies in $H^{s-1}$ so that $\psi u$ lies in $H^{s+1}$. By bootstrap $\psi u$ is smooth.

J. Smoothness of eigenfunctions.

Theorem. If the potential V in the operator L = ∆ + V is smooth, then the eigenfunctions ψn are smooth and exhaust the eigenfunctions of L in H−∞.

Proof. By elliptic regularity for L − λn, we see that ψn ∈ C∞(T ). To prove the second assertion, we may again assume that V ≥ ε > 0. Note that if Lψ = λψ with ψ ∈ H−∞, then ψ ∈ C∞(T ) by elliptic regularity. But then λ > 0 and hence Tψ = λ−1ψ. Thus ψ is already one of the eigenfunctions of T with eigenvalue µ = λ−1, as claimed.

K. Generalized Laplacian operators. We now extend the global and local versions of Weyl’s lemma to second order elliptic operators with matrix coefficients. Let

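$$D = -\sum \frac{\partial}{\partial x_j} a_{ij}(x) \frac{\partial}{\partial x_i} + \sum b_i(x) \frac{\partial}{\partial x_i} + c(x)$$

be a differential operator acting on $C^\infty(T, \mathbb{C}^N)$, where $A(x) = (a_{ij}(x))$ is an invertible positive definite symmetric matrix with $a_{ij} \in C^\infty(T)$ and $b_i, c \in C^\infty(T, M_N(\mathbb{C}))$. We define $H^s(T, \mathbb{C}^N)$ in the obvious way, by putting the standard complex inner product on $\mathbb{C}^N$. Thus $H^s(T, \mathbb{C}^N)$ may be identified with a direct sum of $N$ copies of $H^s(T)$. All the above theory applies coordinate–by–coordinate to these vector valued Sobolev spaces and $D$ sends $H^s$ to $H^{s-2}$. Note that, with respect to the inner product on $C^\infty(T, \mathbb{C}^N)$, we have $(Df, g) = (f, D^*g)$ where

$$D^* = -\sum \frac{\partial}{\partial x_j} a_{ij}(x) \frac{\partial}{\partial x_i} - \sum \frac{\partial}{\partial x_i} b_i(x)^* + c(x)^* = -\sum \frac{\partial}{\partial x_j} a_{ij}(x) \frac{\partial}{\partial x_i} - \sum b_i(x)^* \frac{\partial}{\partial x_i} - \sum \frac{\partial}{\partial x_i}(b_i(x)^*) + c(x)^*$$

is the formal adjoint of $D$. Our main theorem concerning $D$ is the following generalisation of the fact that $I + \Delta : H^s \to H^{s-2}$ is an isomorphism.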

Theorem. For each k ∈ Z, we can find λk such that (D + λ) : Hk → Hk−2 is an isomorphism for λ ≥ λk.

The proof of this result is based on the following elementary fact from Hilbert space theory.

Lemma. If $T : H_1 \to H_2$ is a bounded linear operator between Hilbert spaces, then $T$ is invertible iff $T$ and $T^*$ are bounded below, i.e. $\|Tx\| \ge \varepsilon\|x\|$ and $\|T^*y\| \ge \varepsilon\|y\|$ for some $\varepsilon > 0$.

Proof. Clearly if $T$ is invertible, $T$ and $T^*$ are bounded below using the norm estimates for $T^{-1}$ and $(T^*)^{-1}$. Conversely if $T$ and $T^*$ are bounded below, the image of $T$ is dense (since $(\mathrm{im}\,T)^\perp = \ker T^* = (0)$) and complete (hence closed), so the whole of $H_2$. But $T$ is one–one, and the formal inverse for $T$ must be bounded, because $T$ is bounded below. So $T$ has a bounded inverse.

Below we will prove:

Garding’s inequality. Given $k \in \mathbb{Z}$, we can find $\varepsilon_k > 0$ and $\lambda_k > 0$ such that

$$\mathrm{Re}((D + \lambda)f, f)_{(k)} \ge \varepsilon_k \|f\|_{(k+1)}^2 + (\lambda - \lambda_k)\|f\|_{(k)}^2.$$

Proof of Theorem. By the lemma we must show that $T = (D + \lambda) : H^k \to H^{k-2}$ and its adjoint are bounded below. We start by noting that if $f, g \in H^s$, then

$$|(f, g)_{(s)}| \le \|f\|_{(s+r)} \|g\|_{(s-r)}$$

for any $r$; in fact if $f = \sum a_m z^m$ and $g = \sum b_m z^m$, then

$$|(f, g)_{(s)}| \le \sum |a_m| \cdot |b_m| (1 + \|m\|^2)^s = \sum |a_m| (1 + \|m\|^2)^{(s+r)/2} \cdot |b_m| (1 + \|m\|^2)^{(s-r)/2} \le \Big(\sum |a_m|^2 (1 + \|m\|^2)^{s+r}\Big)^{1/2} \Big(\sum |b_m|^2 (1 + \|m\|^2)^{s-r}\Big)^{1/2}.$$


But then from Garding’s inequality we get

εk‖f‖2(k+1) ≤ |((D + λ)f, f)(k)| ≤ ‖(D + λ)f‖(k−1)‖f‖(k+1).

Hence $\|(D + \lambda)f\|_{(k-1)} \ge \varepsilon_k \|f\|_{(k+1)}$, so that $(D + \lambda)$ is bounded below. To treat the adjoint, we use duality to identify $D^*$. Under the pairing $H^s(T) \times H^{-s}(T) \to \mathbb{C}$, $f, g \mapsto (f, g)$ (i.e. $\sum a_m z^m, \sum b_m z^m \mapsto \sum \overline{b_m} a_m$), $H^{-s}$ can be identified with the dual of $H^s$. The adjoint of the map $D$ under this identification is then just $D^* : H^{-s} \to H^{-s-2}$. Thus the adjoint of the map $(D + \lambda)$ is bounded below, since our first argument shows that the map $D^* + \lambda : H^{-s} \to H^{-s-2}$ is bounded below with $s = k - 1$. Hence $T$ and $T^*$ are bounded below, as required.

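Proof of Garding’s inequality. Step 1. It suffices to prove the statement

$$\mathrm{Re}((D + \lambda_k)f, f)_{(k)} \ge \varepsilon_k \|f\|_{(k+1)}^2 - A_k \|f\|_{(k)} \|f\|_{(k+1)}.$$

In fact note that if $a, b \ge 0$ and $\delta > 0$, we get $ab \le \delta a^2 + (4\delta)^{-1} b^2$. Hence $A\|f\|_{(k)}\|f\|_{(k+1)} \le \tfrac{1}{2}\varepsilon\|f\|_{(k+1)}^2 + (2\varepsilon)^{-1} A^2 \|f\|_{(k)}^2$. So, after enlarging $\lambda_k$, we obtain Garding’s inequality (with $\tfrac{1}{2}\varepsilon$ in place of $\varepsilon$):

$$\mathrm{Re}((D + \lambda)f, f)_{(k)} \ge \tfrac{1}{2}\varepsilon\|f\|_{(k+1)}^2 + (\lambda - \lambda_k)\|f\|_{(k)}^2.$$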

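Step 2: k = 0. Integrating by parts we get

$$\mathrm{Re}((D + \lambda)f, f) = \lambda\|f\|^2 + \sum (a_{ij}(x) f_{x_j}, f_{x_i}) + \sum \mathrm{Re}(b_i f_{x_i}, f) + \mathrm{Re}(cf, f)$$

$$\ge \lambda\|f\|^2 + \varepsilon \sum (f_{x_i}, f_{x_i}) - B \sum \|f_{x_i}\|_2 \|f\|_2 - C\|f\|_2^2 \ge \varepsilon\|f\|_{(1)}^2 - A\|f\|_{(1)}\|f\|,$$

where $\varepsilon = \min\{\|A(x)v\| : x \in T, \|v\| = 1\}$ and $\lambda$ is sufficiently large.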

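Step 3: k > 0. We have

$$\mathrm{Re}((D + \lambda)f, f)_{(k)} = \mathrm{Re} \sum_{|\alpha| \le k} \binom{k}{\alpha} (D^\alpha (D + \lambda)f, D^\alpha f)$$

$$\ge \mathrm{Re} \sum_{|\alpha| \le k} \binom{k}{\alpha} ((D + \lambda)D^\alpha f, D^\alpha f) - \sum_{|\alpha| \le k} \binom{k}{\alpha} |([D, D^\alpha]f, D^\alpha f)|$$

$$\ge \varepsilon\|f\|_{(k+1)}^2 - A\|f\|_{(k+1)}\|f\|_{(k)},$$

using the inequality for $k = 0$ and noting that $[D, D^\alpha]$ is a differential operator of order $\le k + 1$.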

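Step 4: k < 0. If $j = -k > 0$, we have a unitary map $(\Delta + I)^j : H^j \to H^{-j}$. Set $f = (I + \Delta)^j g$ with $g \in H^j$. Then

$$\mathrm{Re}((D + \lambda)f, f)_{(-j)} = \mathrm{Re}((D + \lambda)(\Delta + I)^j g, (\Delta + I)^j g)_{(-j)}$$

$$= \mathrm{Re}((\Delta + I)^j (D + \lambda)g, (\Delta + I)^j g)_{(-j)} + \mathrm{Re}([D, (\Delta + I)^j]g, (\Delta + I)^j g)_{(-j)}$$

$$\ge \mathrm{Re}((\Delta + I)^j (D + \lambda)g, (\Delta + I)^j g)_{(-j)} - \|[D, (\Delta + I)^j]g\|_{(-j)} \|(\Delta + I)^j g\|_{(-j)}$$

$$\ge \mathrm{Re}((D + \lambda)g, g)_{(j)} - C\|g\|_{(j+1)}\|g\|_{(j)}$$

$$\ge \varepsilon\|g\|_{(j+1)}^2 - (A + C)\|g\|_{(j+1)}\|g\|_{(j)}$$

$$= \varepsilon\|f\|_{(k+1)}^2 - (A + C)\|f\|_{(k+1)}\|f\|_{(k)},$$

using the inequality for $j$ and noting that $[D, (\Delta + I)^j]$ is a differential operator of order $\le 2j + 1$. This completes the proof.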


Weyl’s lemma for D (global regularity). If $Du = f$ with $f \in H^s$ and $u \in H^{-\infty}$, then $u \in H^{s+2}$. In particular, if $f$ is smooth, so is $u$.

Proof. Say $Du = f$. Then $(D + \lambda)u = f + \lambda u$, so that $u = (D + \lambda)^{-1}(f + \lambda u)$. Now we use a “bootstrap” argument. Let $t$ be the biggest value of $t \in s + \mathbb{Z}$ such that $u \in H^t$. If $t < s + 2$, then $f + \lambda u \in H^t$. So $u = (D + \lambda)^{-1}(f + \lambda u)$ lies in $H^{t+2}$, a contradiction. Hence $t \ge s + 2$, so that $u \in H^{s+2}$.

Corollary. (a) If D is formally self–adjoint (D = D∗), then after adjusting by a scalar D is positive, i.e. (Df, f) ≥ 0 for all f ∈ C∞.

(b) If D = D∗ and D is positive, then D + I : Hk+2 → Hk is an isomorphism for all k and the norm ((I + D)kf, f)1/2 is equivalent to ‖f‖(k) on C∞.

(c) If D is formally self–adjoint, it has a complete set of eigenfunctions in L2 lying in C∞.

Proof. (a) is immediate from Garding’s inequality.

(b) Fix $k$. We first show that $|((D + \lambda)^k f, f)|^{1/2}$ defines an equivalent norm on $H^k(T)$ for $k \ge 0$. For if $k = 2m + 1$,

$$\mathrm{Re}((D + \lambda)^{2m+1} f, f) = \mathrm{Re}((D + \lambda)^{m+1} f, (D + \lambda)^m f) \ge C\|(\lambda + D)^m f\|_{(1)}^2 \ge C'\|f\|_{(2m+1)}^2.$$

If $k = 2m$, we get

$$\mathrm{Re}((D + \lambda)^{2m} f, f) = ((D + \lambda)^m f, (D + \lambda)^m f) \ge C\|f\|_{(2m)}^2.$$

Since there are obvious reverse inequalities, equivalence follows. The positivity of $D$ shows that $|((D + \lambda)^k f, f)|^{1/2}$ and $|((D + I)^k f, f)|^{1/2}$ define equivalent norms (each coefficient of $(I + x)^k - \varepsilon(\lambda + x)^k$ is positive for $\varepsilon$ sufficiently small). Since $D + \lambda : H^k \to H^{k-2}$ is an isomorphism, $D + I : H^k \to H^{k-2}$ is Fredholm of index zero by the Fredholm argument. But, since $D$ is positive, $D + I$ has zero kernel. Hence $D + I$ is an isomorphism. The results for general $k$ follow formally by duality.

(c) As in the constant coefficient case, we let $T \in B(H^0)$ be the composition of $(D + \lambda_0)^{-1} : H^0 \to H^2$ with the inclusion $H^2 \to H^0$. $T$ is compact and self–adjoint so the spectral theorem applies. The eigenfunctions must be smooth by elliptic regularity.

Weyl’s lemma for D (local regularity). Suppose that Du = f . Let U be an open subset of T and suppose that ψf is smooth for every ψ ∈ C∞(T ) such that ψ = 0 off U . Then ψu is also smooth for each such ψ.

Proof. Since Du = f , we get (D + λ)ψu = ψf + λψu + [D,ψ]u. Hence

ψu = (D + λ)−1(ψf + λψu + [D,ψ]u),

so the result follows by bootstrap, since [D,ψ]u may be written as a sum of ψ′u’s or first order operators applied to them.

Corollary. Say $Lu = f$ where $L = \sum a_{ij}(x)\,\partial^2/\partial x_i \partial x_j + \cdots$ with $A(x)$ a positive symmetric matrix in a neighbourhood of $x_0$. If $f$ is $C^\infty$ near $x_0$ then $u$ is $C^\infty$ near $x_0$.

Proof. Take a bump function $\chi = 1$ near $x_0$, $0 \le \chi \le 1$. Let $D = \chi L + (1 - \chi)\Delta$. Then $Du = f$ near $x_0$, since if $Du = g$, we have, for $\psi$ supported where $\chi = 1$,

$$\psi g = \psi Du = \psi D(\chi u) = \psi L(u) = \psi f,$$

so that $\psi g$ is smooth. Since $D$ is elliptic everywhere, the result follows.

5. HILBERT–SCHMIDT OPERATORS.

Definition. A compact operator $T \in B(H)$ is Hilbert–Schmidt iff the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ of $T^*T$ satisfy $\sum \lambda_n < \infty$.

Proposition 1. If $(e_i)$ is an orthonormal basis of $H$ and $T \in B(H)$ with matrix $(a_{ij})$ with $a_{ij} = (Te_j, e_i)$, then

$$\sum |a_{ij}|^2 = \sum \|Te_i\|^2 = \sum \|T^*e_i\|^2 \equiv \|T\|_2^2$$

is independent of the choice of $(e_i)$ (possibly infinite).


Proof. We have

$$\|T\|_2^2 = \sum |(Te_i, e_j)|^2 = \sum \|Te_i\|^2 = \sum (T^*Te_i, e_i).$$

If $(f_i)$ is another orthonormal basis, then we have

$$\sum_i \|Te_i\|^2 = \sum_{i,j} |(Te_i, f_j)|^2 = \sum_{i,j} |(T^*f_j, e_i)|^2 = \sum_j \|T^*f_j\|^2.$$

Setting $e_i = f_i$, we get $\sum \|T^*f_i\|^2 = \sum \|Tf_i\|^2$ and $\sum \|T^*e_i\|^2 = \sum \|Te_i\|^2$, hence the result.
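In finite dimensions Proposition 1 can be checked directly. The sketch below uses a random complex matrix and a random orthonormal basis obtained from a QR factorisation; both are assumptions made for illustration, and finite matrices only model the infinite–dimensional statement.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Columns of Q form a random orthonormal basis (f_i) of C^n.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

entries = float(np.sum(np.abs(T) ** 2))                                    # sum |a_ij|^2
columns = float(sum(np.linalg.norm(T @ Q[:, i]) ** 2 for i in range(n)))   # sum ||T f_i||^2
adjoint = float(sum(np.linalg.norm(T.conj().T @ Q[:, i]) ** 2 for i in range(n)))
eigen_sum = float(np.linalg.eigvalsh(T.conj().T @ T).sum())                # sum of eigenvalues of T*T

print(entries, columns, adjoint, eigen_sum)
```

All four numbers should coincide up to rounding; the last one is the quantity appearing in the definition of a Hilbert–Schmidt operator.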

Corollary. ‖T ‖2 <∞ iff T is Hilbert–Schmidt.

Proof. (⇐) If $T^*Te_i = \lambda_i e_i$, we have

$$\|T\|_2^2 = \sum (T^*Te_i, e_i) = \sum \lambda_i < \infty.$$

(⇒) Say $\|T\|_2 < \infty$. Since any unit vector can be completed to an orthonormal basis, we have $\|A\xi\|^2/\|\xi\|^2 \le \|A\|_2^2$, so that $\|A\| \le \|A\|_2$ always. Let $P_n$ be the orthogonal projection onto $\mathrm{lin}(e_1, \dots, e_n)$. Then

$$\|T(I - P_n)\|_2^2 = \sum_{i > n} \|Te_i\|^2 \to 0$$

as $n \to \infty$. But $T(I - P_n) = T - TP_n$ where $TP_n$ is finite rank. Thus $T$ is a limit of finite rank operators, so compact. If $(\lambda_i)$ are the eigenvalues of $T^*T$ with corresponding orthonormal eigenvectors $(e_i)$, then $\|T\|_2^2 = \sum \lambda_i$, so that $T$ is Hilbert–Schmidt.

Remark–exercise. If $T$ is an operator given by a kernel $K(x, y)$, $Tf(x) = \int K(x, y)f(y)\,dy$, then

$$\|T\|_2^2 = \int\!\!\int |K(x, y)|^2\,dx\,dy.$$

Proposition 2.
(a) (closure under adjoints) $\|T^*\|_2 = \|T\|_2$.
(b) The Hilbert–Schmidt operators form a Hilbert space $L^2(H)$ with inner product

$$(A, B) = \sum (Ae_j, e_i)(e_i, Be_j) \;\Big[= \sum a_{ij}\overline{b_{ij}}\Big] = \sum (B^*Ae_j, e_j) = (B^*, A^*).$$

(c) (bimodule) If $T \in B(H)$ and $A \in L^2(H)$, then $TA, AT \in L^2(H)$ and $\|AT\|_2, \|TA\|_2 \le \|T\|\|A\|_2$. Moreover $(TA, B) = (A, T^*B)$ and $(AT, B) = (A, BT^*)$.
(d) (continuity) If $T_n \xrightarrow{s} T$ in $B(H)$ and $A_n \to A$ in $L^2(H)$, then we have $T_nA_n \to TA$ in $L^2(H)$.

Exercise. If $S_n^* \xrightarrow{s} S^*$ in $B(H)$, show that $T_nA_nS_n \to TAS$ in $L^2(H)$.

Proof. (a) follows from Proposition 1.
(b) $(A, B)$ is obtained by polarising $\|A\|_2^2$, so is independent of the orthonormal basis. The rest is clear.
(c) $\sum \|TAe_i\|^2 \le \|T\|^2 \sum \|Ae_i\|^2$, so that $\|TA\|_2 \le \|T\|\|A\|_2$. The other result follows by taking adjoints.
(d) Recall that $T_n \xrightarrow{s} T$ iff $T_n\xi \to T\xi$ for all $\xi$. The Banach–Steinhaus uniform boundedness theorem implies that $\sup \|T_n\| < \infty$. Since

$$T_nA_n - TA = (T_n - T)A + T_n(A_n - A),$$

it suffices to show that $(T_n - T)A \to 0$ in $L^2(H)$. But $A$ can be approximated by linear combinations of rank one projections in $L^2(H)$, so we may reduce to the case when $A$ is a rank one projection onto $\mathbb{C}\xi$ say, with $\|\xi\| = 1$. But then $\|(T_n - T)A\|_2 = \|(T_n - T)\xi\|$.

6. TRACE–CLASS OPERATORS.

Definition. A compact operator $T \in B(H)$ is trace–class iff the eigenvalues $\mu_1 \ge \mu_2 \ge \cdots \ge 0$ of $T^*T$ satisfy $\sum \mu_n^{1/2} < \infty$.


Proposition 3 (polar decomposition). Let $T \in B(H)$ be a compact operator. Then there is a unique compact self–adjoint operator $P$ with non–negative eigenvalues such that $P^2 = T^*T$. Define $U(Px) = Tx$ for $x \in H$ and $U(y) = 0$ for $y \in \ker P$. Then $U$ extends uniquely to an isometry of the closure of $\mathrm{im}\,P$ onto the closure of $\mathrm{im}\,T$ and $T = UP$. Moreover $U^*U$ is the projection onto the closure of $\mathrm{im}\,P$ and $UU^*$ is the projection onto the closure of $\mathrm{im}\,T$.

Proof. If $T^*Te_n = \lambda_n e_n$, set $Pe_n = \lambda_n^{1/2} e_n$. Since

$$\|Px\|^2 = (Px, Px) = (P^2x, x) = (T^*Tx, x) = \|Tx\|^2,$$

the result is clear.

T = U · P is called the polar decomposition of T ; it is unique. We write P = |T | = (T ∗T )1/2.
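For matrices the polar decomposition can be read off from a singular value decomposition. This route via `numpy.linalg.svd` is an assumption of the sketch (the proof above works with eigenvectors of $T^*T$ instead), and the random matrix is only an example.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# SVD: T = W diag(sigma) Vh, with W and Vh unitary and sigma >= 0.
W, sigma, Vh = np.linalg.svd(T)

P = Vh.conj().T @ np.diag(sigma) @ Vh    # P = |T| = (T* T)^{1/2}
U = W @ Vh                               # unitary here, since a random T is invertible

trace_norm = float(sigma.sum())          # Tr|T| = sum of the square roots of eig(T* T)
print(np.allclose(U @ P, T), trace_norm)
```

The eigenvalues of $P$ are the singular values of $T$, so $\mathrm{Tr}\,P$ is the quantity appearing in the definition of a trace–class operator.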

Proposition 4. The following conditions are equivalent:
(1) $T$ is trace–class.
(2) $T = \sum_{i=1}^n B_i^* A_i$ with $A_i, B_i \in L^2(H)$.
(3) $T$ is compact with $|T|$ trace–class.

Proof. (1) and (3) are equivalent by definition of $|T| = (T^*T)^{1/2}$. If $T$ is trace–class, then $T = U \cdot |T|^{1/2} \cdot |T|^{1/2}$, so $T$ is of the form in (2). Conversely if $T$ has this form, then $|T| = U^*T = \sum U^*B_i^*A_i$. Let $(e_j)$ be an orthonormal basis such that $|T|e_j = \lambda_j e_j$. Then

$$\sum \lambda_j = \sum (|T|e_j, e_j) = \sum_i (A_i, B_iU) < \infty,$$

so that $T$ is trace–class.

Proposition 5 (trace). If $X = \sum B_j^*A_j$, then $\mathrm{Tr}(X) = \sum (A_j, B_j) = \sum (Xe_i, e_i)$ is independent of the choice of $(e_i)$. If $X$ is trace–class and $T \in B(H)$, then $TX$ and $XT$ are trace–class with $\mathrm{Tr}(TX) = \mathrm{Tr}(XT)$.

Proof. This is immediate from the properties of the inner product on $L^2(H)$ with its bimodule structure.

Proposition 6 (trace–norm). The trace–class operators form a Banach space with norm

$$\|X\|_1 \equiv \mathrm{Tr}(|X|) = \sup_{\|T\| \le 1} |\mathrm{Tr}(TX)| = \sup_{\|T\| \le 1,\ \text{finite rank}} |\mathrm{Tr}(TX)|.$$

Proof. By continuity, the last two terms are equal. Equality of the middle terms follows from

$$|\mathrm{Tr}(TX)| = |\mathrm{Tr}(TU|X|)| = |(TU|X|^{1/2}, |X|^{1/2})| \le \|TU\|\|X\|_1 \le \|T\|\|X\|_1$$

and

$$|\mathrm{Tr}(U^*X)| = \mathrm{Tr}(|X|) = \|X\|_1.$$

The last two terms obviously define a norm. To see completeness, regard $L^1(H)$ as a subset of the dual of the finite rank operators with the operator norm. Since $\|T\| \le \|T\|_2$, any norm continuous functional on the finite rank operators is continuous for $\|\cdot\|_2$. It therefore has the form $T \mapsto \mathrm{Tr}(TA)$ for $A \in L^2(H)$. Let $A = UP$ be the polar decomposition of $A$. Then

$$|\mathrm{Tr}(TP)| = |\mathrm{Tr}(TU^*UP)| \le K\|TU^*\| \le K\|T\|.$$

By taking $T$ to be the projection on the first $m$ eigenvectors of $P$, we deduce that $P$ is trace–class and so therefore is $A$.

Remark. This shows that $L^1(H)$ is the Banach space dual of $K(H)$ and that $B(H)$ is the Banach space dual of $L^1(H)$ (non–separable!). These are non–commutative analogues of the statements that $\ell^1$ is the dual of $c_0$ and $\ell^\infty$ the dual of $\ell^1$. More generally one can define the non–commutative $L^p$ spaces by $L^\infty(H) = B(H)$ and $L^p(H) = \{T \in K(H) : \mathrm{Tr}(|T|^p) < \infty\}$. All the duality properties of the commutative $\ell^p$ spaces generalise using the bilinear pairing $\mathrm{Tr}(AB)$.

Proposition 7 (continuity).
(a) $\|T\|_2 \le \|T\|_1$ for $T \in L^1(H)$.
(b) $\|TS\|_1 \le \|T\|_2\|S\|_2$ for $T, S \in L^2(H)$ (so that in particular $L^1(H)$ is a Banach algebra).
(c) If $T_n \xrightarrow{s} T$ and $X_n \to X$ in $L^1(H)$, then $T_nX_n \to TX$ in $L^1(H)$.

Proof. (a) If $(\lambda_i)$ are the eigenvalues of $|T|$, we have $(\sum \lambda_i^2)^{1/2} \le \sum \lambda_i$.

(b) We have

$$\|TS\|_1 = \sup_{\|X\| \le 1} |\mathrm{Tr}(TSX)| \le \|T\|_2\|SX\|_2 \le \|T\|_2\|S\|_2.$$

(c) Follows from the corresponding result for L2(H), using (b).
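The norm inequalities in Propositions 6 and 7 can be spot-checked for matrices with `numpy.linalg.norm`, whose `'nuc'`, `'fro'` and `2` options give the trace, Hilbert–Schmidt and operator norms respectively. The random matrices are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
S = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def op_norm(A):
    return float(np.linalg.norm(A, 2))        # operator norm (largest singular value)

def hs_norm(A):
    return float(np.linalg.norm(A, 'fro'))    # Hilbert-Schmidt norm

def trace_norm(A):
    return float(np.linalg.norm(A, 'nuc'))    # trace norm (sum of singular values)

chain = (op_norm(T), hs_norm(T), trace_norm(T))       # expected to be increasing
product_bound = (trace_norm(T @ S), hs_norm(T) * hs_norm(S))
print(chain, product_bound)
```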

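Mercer’s Theorem. Let $K(x, y)$ be a smooth kernel on $T = T^n$ defining the bounded operator $T_Kf(x) = \int K(x, y)f(y)\,dy$ on $L^2(T)$. Then $T_K$ is trace class and $\mathrm{Tr}\,T_K = \int K(x, x)\,dx$.

Proof. Since $K$ is a smooth function on $T \times T$, it can be written

$$K(x, y) = \sum a_{rs}\psi_r(x)\overline{\psi_s(y)},$$

where $\Delta\psi_r = \lambda_r\psi_r$. We know that $(I + \Delta)^{-k}$ is trace–class for $k > n/2$, hence so too is $T_K = (I + \Delta)^{-k}T_{K_1}$, where

$$K_1(x, y) = (I + \Delta_x)^k K(x, y) = \sum_{r,s} (1 + \lambda_r)^k a_{rs}\psi_r(x)\overline{\psi_s(y)}.$$

So

$$\mathrm{Tr}\,T_K = \sum a_{rr} = \sum a_{rs} \int_M \psi_r(x)\overline{\psi_s(x)}\,dx = \int_M K(x, x)\,dx.$$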

Remarks.
1. The result (and proof) holds for $N \times N$ matrix–valued kernels acting on $L^2(T, \mathbb{C}^N)$ provided $K(x, x)$ is replaced by $\mathrm{Tr}\,K(x, x)$ in the integral.
2. When $K_t(x, y)$ is the kernel of a heat operator $e^{-t\Delta}$ on $M$, it satisfies

$$\int_M K_{t_1}(x, y)K_{t_2}(y, z)\,dy = K_{t_1+t_2}(x, z).$$

Thus it can be expressed as the product of two Hilbert–Schmidt operators and its trace is therefore given by $\int_M K_t(x, x)\,dx$.
3. As we will see below, all the above comments apply equally well to a compact manifold, provided we replace the $e_r$'s by the eigenfunctions of the Laplacian.
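As a sanity check of the trace formula for the heat operator on the circle, the sketch below compares the spectral trace $\sum_m e^{-tm^2}$ with $\int_0^{2\pi} K_t(x, x)\,dx$. The formula used for $K_t(x, x)$, obtained by periodising the Euclidean heat kernel $(4\pi t)^{-1/2}e^{-x^2/4t}$, is a standard fact assumed here and not derived in this section; the cutoff is an arbitrary truncation.

```python
import numpy as np

def spectral_trace(t, cutoff=200):
    """Sum of the eigenvalues e^{-t m^2} of e^{-t Delta} on the circle."""
    m = np.arange(-cutoff, cutoff + 1)
    return float(np.sum(np.exp(-t * m ** 2)))

def diagonal_integral(t, cutoff=200):
    """Integral of K_t(x, x) over [0, 2 pi), using the periodised Euclidean kernel."""
    k = np.arange(-cutoff, cutoff + 1)
    k_xx = np.sum(np.exp(-((2 * np.pi * k) ** 2) / (4 * t))) / np.sqrt(4 * np.pi * t)
    return float(2 * np.pi * k_xx)            # K_t(x, x) does not depend on x

for t in (0.05, 1.0, 5.0):
    print(t, spectral_trace(t), diagonal_integral(t))
```

The agreement of the two columns is the classical theta function identity, here read as an instance of $\mathrm{Tr}\,e^{-t\Delta} = \int K_t(x, x)\,dx$.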

7. NASH’S EMBEDDING THEOREM FOR THE N–TORUS.

Convolution operators on $H^s(T)$. If $f, g$ are functions on $T$ (usually continuous), we define their convolution as $f \star g(x) = (2\pi)^{-n}\int f(x - y)g(y)\,dy$. Thus we have $\widehat{f \star g}(m) = \hat f(m)\hat g(m)$. So in Fourier, convolution just becomes multiplication and we can take this as the definition of $f \star g$ when $f \in C(T)$ or $C^\infty(T)$ and $g \in H^s(T)$ for any $s$. Thus $C^\infty(T)$ acts by convolution on each $H^s(T)$. Note that, when defined, $f \star g = g \star f$. As usual we identify $T^n$ with $I = [0, 2\pi)^n$.

Convolution Theorem. (a) (Riemann sums) If $f \in C^\infty(T)$, $g \in H^s(T)$ for $s > n/2$ (so that $H^s(T) \subset C(T)$), then $(2\pi)^{-n}\varepsilon^n \sum_{\varepsilon m \in I} f(x - \varepsilon m)g(\varepsilon m) \to f \star g$ in $H^s(T)$.

(b) (regularisation) Let $f \in C^\infty(T)$ have support in $0 \le x_i \le R < 2\pi$ and, for $\varepsilon \le 1$, define $f_\varepsilon(x) = f(x/\varepsilon)\varepsilon^{-n}$ if $0 \le x_i/\varepsilon \le R$ and 0 otherwise. Then, if $g \in H^s(T)$, $f_\varepsilon \star g \to \hat f(0)g$ in $H^s$ as $\varepsilon \to 0$ (where $\hat f(0) = (2\pi)^{-n}\int f$).

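Proof. (a) Let $h_\varepsilon(x) = (2\pi)^{-n}\varepsilon^n \sum_{\varepsilon m \in I} f(x - \varepsilon m)g(\varepsilon m)$. Then $\hat h_\varepsilon(p) = \hat f(p)K_\varepsilon(p)$, where

$$K_\varepsilon(p) = \varepsilon^n (2\pi)^{-n} \sum_{\varepsilon m \in I} g(\varepsilon m)e^{-i\varepsilon m \cdot p}.$$

Clearly $|K_\varepsilon(p)|$ is uniformly bounded and $K_\varepsilon(p) \to \hat g(p)$ as $\varepsilon \to 0$. Hence

$$\|h_\varepsilon - f \star g\|_{(s)}^2 = \sum |\hat f(p)|^2 |K_\varepsilon(p) - \hat g(p)|^2 (1 + \|p\|^2)^s \to 0.$$

(b) Changing variables, we get $\hat f_\varepsilon(m) = (2\pi)^{-n}\int f(x)e^{-i\varepsilon m \cdot x}\,dx$. The integrand tends uniformly to $f(x)$ as $\varepsilon \to 0$. Thus $\hat f_\varepsilon(m)$ is uniformly bounded and tends to $\hat f(0) = (2\pi)^{-n}\int f$ as $\varepsilon \to 0$. Hence

$$\|f_\varepsilon \star g - \hat f(0)g\|_{(s)}^2 = \sum |\hat f_\varepsilon(m) - \hat f(0)|^2 |\hat g(m)|^2 (1 + \|m\|^2)^s \to 0.$$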
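The rule that convolution becomes multiplication of Fourier coefficients can be illustrated on the circle with two explicit trigonometric polynomials. The functions, the grid size and the helper `coeff` are choices made for this example; for trigonometric polynomials of low degree the equispaced Riemann sums used below are exact up to rounding.

```python
import numpy as np

N = 64
x = 2 * np.pi * np.arange(N) / N

def f(t):
    return 0.5 + np.cos(t)

def g(t):
    return np.cos(t) + np.sin(2 * t)

# (f * g)(x_k) = (2 pi)^{-1} integral f(x_k - y) g(y) dy, as a mean over the grid.
conv = np.array([np.mean(f(xk - x) * g(x)) for xk in x])

def coeff(values, m):
    """Fourier coefficient (2 pi)^{-1} integral values(y) e^{-i m y} dy on the grid."""
    return complex(np.mean(values * np.exp(-1j * m * x)))

pairs = [(coeff(conv, m), coeff(f(x), m) * coeff(g(x), m)) for m in range(-3, 4)]
print(pairs)
```

Here only the modes $m = \pm 1$ survive, so $f \star g = \tfrac{1}{2}\cos x$.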

Second Sobolev Multiplication Theorem. If $s > n/2$, $H^s(T^n)$ is a Banach algebra, i.e. it is closed under pointwise multiplication and $\|fg\|_{(s)} \le K_s\|f\|_{(s)} \cdot \|g\|_{(s)}$ for some constant $K_s$.

Proof. It suffices to prove there is a constant $K_s > 0$ such that $\|fg\|_{(s)} \le K_s\|f\|_{(s)} \cdot \|g\|_{(s)}$ whenever $f$ and $g$ are trigonometric polynomials.

Now if $F(z) = \sum A_mz^m$ is in $H^s(T^n)$ for $s > n/2$, Lemma 1 shows that $\|F\|_\infty \le \sum |A_m| \le K_1\|F\|_{(s)}$. Now let $F(z) = \sum A_mz^m$ and $G(z) = \sum B_mz^m$ be absolutely convergent Fourier series. Clearly $\|FG\|_2 \le \|F\|_\infty\|G\|_2$; hence we get

$$\sum_m \Big|\sum_{i+j=m} A_iB_j\Big|^2 \le \Big(\sum |A_r|\Big)^2 \cdot \sum |B_s|^2.$$

It is clear that $\|i + j\| \le 2\max\{\|i\|, \|j\|\}$; it follows that

$$(1 + \|i + j\|^2)^s \le 4^s((1 + \|i\|^2)^s + (1 + \|j\|^2)^s). \qquad (*)$$

Now take $f(z) = \sum a_mz^m$ and $g(z) = \sum b_mz^m$ in $H^s(T^n)$ for $s > n/2$. Setting $A_i = |a_i|(1 + \|i\|^2)^{s/2}$ and $B_i = |b_i|$ or the other way round, we get $\|fg\|_{(s)}^2 \le C_s(\|f\|_{(s)}^2 (\sum |b_m|)^2 + (\sum |a_m|)^2 \|g\|_{(s)}^2)$ for a constant $C_s$ depending only on $s$. Since $\sum |a_m| \le K_1\|f\|_{(s)}$ and $\sum |b_m| \le K_1\|g\|_{(s)}$ by Lemma 1, we deduce that $\|fg\|_{(s)} \le K_s\|f\|_{(s)}\|g\|_{(s)}$.

Third Sobolev Multiplication Theorem. For $k \ge r > n/2$ with $r$ fixed, there is a constant $B_k > 0$ such that $\|fg\|_{(k)} \le 2(\|f\|_{(k)}\|g\|_{(r)} + \|f\|_{(r)}\|g\|_{(k)}) + B_k(\|f\|_{(k-1)}\|g\|_{(r)} + \|f\|_{(r)}\|g\|_{(k-1)})$.

Proof. The proof is as above except this time we use the inequality

$$(1 + \|i + j\|^2)^s \le 2[(1 + \|i\|^2)^s + (1 + \|j\|^2)^s] + B_s[(1 + \|i\|^2)^{s-1}(1 + \|j\|^2) + (1 + \|i\|^2)(1 + \|j\|^2)^{s-1}] \qquad (**)$$

instead of $(*)$, where $s = k$ (an integer). To prove $(**)$, note that if $\delta \le \|i\|/\|j\| \le \delta^{-1}$ with $\delta > 0$ small and fixed, the result follows from $(*)$, since

$$(1 + \|i + j\|^2)^s \le C'_s((1 + \|i\|^2)^s + (1 + \|j\|^2)^s) \le C_s((1 + \|i\|^2)^{s-1}(1 + \|j\|^2) + (1 + \|i\|^2)(1 + \|j\|^2)^{s-1}).$$

Otherwise we may assume that $\|j\| \le \delta\|i\|$ without loss of generality. Then

$$\frac{(1 + \|i + j\|^2)^s}{(1 + \|i\|^2)^s} = \Big(1 + \frac{2i \cdot j + \|j\|^2}{1 + \|i\|^2}\Big)^s \le 1 + 2s(2i \cdot j + \|j\|^2)/(1 + \|i\|^2).$$

On the other hand

$$2s\,i \cdot j \le 2s\|i\|\|j\| \le 2s(1 + \|i\|^2)^{1/2}(1 + \|j\|^2)^{1/2} \le (1 + \|i\|^2) + s^2(1 + \|j\|^2).$$

Hence

$$\frac{(1 + \|i + j\|^2)^s}{(1 + \|i\|^2)^s} \le 4 + (2s + 2s^2)(1 + \|j\|^2)/(1 + \|i\|^2),$$

which gives the inequality in this case.

Application to Nash’s embedding theorem. This theorem states that any Riemannian manifold can be embedded isometrically in a Euclidean space RN for N sufficiently large (see Chapter III). Here we give the key technical step in the proof as an application of the theory of Sobolev spaces; this very short proof replaces the original proof of Nash and Moser which was about 100 times longer and much harder.


Let $g_{ij}(x)$ be a smooth map of $T = T^n$ into the symmetric real non–zero $n \times n$ matrices with non–negative spectrum (eventually we will require each $g(x)$ to be invertible); we (abusively) call $g$ a smooth metric on $T$. We say that $g$ is realisable by a smooth map $u : T \to \mathbb{R}^N$ ($N$ arbitrary) if $\frac{\partial u}{\partial x_i} \cdot \frac{\partial u}{\partial x_j} = g_{ij}(x)$ (eventually we will require $u$ to be an injective embedding). Clearly if $g^{(1)}$ and $g^{(2)}$ are realisable by $u_1$ and $u_2$, then $g = t_1^2 g^{(1)} + t_2^2 g^{(2)}$ is realisable by $(t_1u_1, t_2u_2)$.
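The closing remark about combining realisable metrics can be illustrated on the circle (n = 1), where a metric is a single function. The two maps below, $u_1(x) = (\cos x, \sin x)$ and $u_2(x) = (\cos 2x, \sin 2x)$, and the weights $t_1, t_2$ are assumptions chosen for this sketch; they realise the constant metrics 1 and 4.

```python
import numpy as np

def du1(x):
    """Derivative of u1(x) = (cos x, sin x), which realises the metric g1 = 1."""
    return np.stack([-np.sin(x), np.cos(x)])

def du2(x):
    """Derivative of u2(x) = (cos 2x, sin 2x), which realises the metric g2 = 4."""
    return np.stack([-2 * np.sin(2 * x), 2 * np.cos(2 * x)])

t1, t2 = 0.7, 1.3
x = np.linspace(0.0, 2 * np.pi, 50, endpoint=False)

# Derivative of the combined map u = (t1 u1, t2 u2) : T -> R^4.
du = np.concatenate([t1 * du1(x), t2 * du2(x)], axis=0)

metric = np.sum(du * du, axis=0)            # u'(x) . u'(x) at each sample point
expected = t1 ** 2 * 1.0 + t2 ** 2 * 4.0    # t1^2 g1 + t2^2 g2
print(metric[:3], expected)
```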

Theorem A. The realisable metrics are dense in all metrics for each Hs norm.

Proof. We start by proving that any metric $g(x)$ can be approximated by a finite convex combination of metrics $f(x)B$ with $f(x) \ge 0$ smooth and $B$ a positive real symmetric matrix. Let $\psi(x)$ be a bump function on $T$ in a neighbourhood of 0. By (b) in the Convolution Theorem $\psi_\varepsilon \star g \to g$ in $H^s$. By (a) in the Convolution Theorem $\psi_\varepsilon \star g$ can be approximated by a convex combination as claimed.

We are thus reduced to proving that if $g(x) = f(x)B$, then $g$ can be approximated by representable metrics. Since $B$ can be diagonalised by a rotation, we may assume that $B$ is diagonal with diagonal entries $a_i > 0$. Now take bump functions $h_i$ on $\mathbb{T}$ supported near 0 and set $\psi(x) = \prod h_i(x_i)$. Since $h_i(x)^2$ is periodic, $\int h_i(x)h_i'(x)\,dx = 0$, so that

$$(2\pi)^{-n} \int \frac{\partial\psi}{\partial x_i}\frac{\partial\psi}{\partial x_j}\,dx = C\delta_{ij}\frac{\int h_i'^2}{\int h_i^2},$$

for some $C > 0$. We may choose $\int h_i'^2 / \int h_i^2$ equal to $a_i$ and then rescale $\psi$ so that $C = 1$. Thus

$$(2\pi)^{-n} \int \frac{\partial\psi}{\partial x_i}\frac{\partial\psi}{\partial x_j}\,dx = \delta_{ij}a_i.$$

Let

$$F^{ij}(x) = \frac{\partial\psi}{\partial x_i}\frac{\partial\psi}{\partial x_j}.$$

By (b) in the Convolution Theorem, $(F^{ij}_\varepsilon \star f) \to fB = g$ in $H^s$. By (a), $(F^{ij}_\varepsilon \star f)$ can be approximated in $H^s$ by finite positive combinations of terms $F^{ij}(x - a)$. But each such term is clearly representable, by definition of $F^{ij}$. Hence $g(x) = f(x)B$ can be approximated by representable metrics, as claimed.

Theorem B. Let $u^0 : T^n \to \mathbb{R}^N$ be an embedding such that at each $x \in M = T^n$, the $n^2 + n$ vectors $u^0_i(x), u^0_{ij}(x)$ are linearly independent. Let $g^0_{ij} = u^0_i \cdot u^0_j$ be the Riemannian metric on $M$ induced by the embedding. Then if $g_{ij}$ is another metric on $M$ sufficiently close to $g^0$ in $C^\infty(M, M_n(\mathbb{R}))$, we can find a smooth map $u : M \to \mathbb{R}^N$ such that $g_{ij} = u_i \cdot u_j$.

Remark. Such an embedding $u^0$ can be obtained by composing the inclusion $T^n \to \mathbb{C}^n = \mathbb{R}^{2n}$ with the map $\mathbb{R}^m \to \mathbb{R}^{m+m^2}$, $(x_i) \mapsto (x_i, x_jx_k)$, where $m = 2n$.

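Proof (Gunther). Set $g = g^0 + h$ and $u = u^0 + v$. We have to solve $u_i \cdot u_j = g_{ij}$. Hence $h_{ij} = v_i \cdot v_j + u^0_i \cdot v_j + u^0_j \cdot v_i$. Now, if $\Delta = -\sum \partial_i^2$, the Leibniz rule gives

$$\Delta(v_i \cdot v_j) = \partial_i(\Delta v \cdot v_j) + \partial_j(\Delta v \cdot v_i) - 2\Delta v \cdot v_{ij} - 2\sum_k v_{ik} \cdot v_{jk}.$$

Hence

$$(I + \Delta)(v_i \cdot v_j) = \partial_i(\Delta v \cdot v_j) + \partial_j(\Delta v \cdot v_i) - 2\Delta v \cdot v_{ij} - 2\sum_k v_{ik} \cdot v_{jk} + v_i \cdot v_j.$$

Define $F_i(v) = (I + \Delta)^{-1}(\Delta v \cdot v_i)$ and $U_{ij}(v) = (I + \Delta)^{-1}[-2\Delta v \cdot v_{ij} - 2\sum_k v_{ik} \cdot v_{jk} + v_i \cdot v_j]$. Then

$$v_i \cdot v_j = \partial_iF_j + \partial_jF_i + U_{ij}.$$

Hence

$$h_{ij} = v_i \cdot u^0_j + v_j \cdot u^0_i + \partial_iF_j + \partial_jF_i + U_{ij} = \partial_j(v \cdot u^0_i + F_i) + \partial_i(v \cdot u^0_j + F_j) + W_{ij},$$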


where $W_{ij} = U_{ij} - 2v \cdot u^0_{ij}$. Note that, if $v \in H^s$ for $s > n/2 + 2$, then by the multiplication theorem in $H^{s-2}$, $F_i, U_{ij} \in H^s$ with $\|F_i\|_{(s)}, \|U_{ij}\|_{(s)} \le A\|v\|_{(s)}^2$ for some $A > 0$, independent of $v$. Hence $\|W_{ij}\|_{(s)} \le B\|v\|_{(s)}^2 + C\|v\|_{(s)}$ for constants $B, C$ independent of $v$. We try to solve the above equation by the decoupling ansatz $v \cdot u^0_i = -F_i(v)$ and $v \cdot u^0_{ij} = (U_{ij}(v) - h_{ij})/2$. We look for $v$ in the form $v = \sum x_iu^0_i + \sum y_{ij}u^0_{ij}$ with $x_i, y_{ij} \in C^\infty(T)$. We can find $a_i, b_{ij}$ vector–valued and smooth such that $a_i \cdot u^0_j = \delta_{ij}$, $a_i \cdot u^0_{jk} = 0$, $b_{pq} \cdot u^0_{ij} = \delta_{pq,ij}$ and $b_{pq} \cdot u^0_i = 0$. Thus we are trying to solve $T(v) = v$ where

$$T(v) = -\sum F_i(v)a_i + \sum \tfrac{1}{2}(U_{ij}(v) - h_{ij})b_{ij}.$$

By the second Sobolev multiplication theorem applied to $H^r$, we get

$$\|T(v)\|_{(r)} \le A\|v\|_{(r)}^2 + B\|h\|_{(r)},$$

with $A$ and $B$ constants. Thus if $AR \le 1/2$ and $B\|h\|_{(r)} \le R/2$, we see that $\|v\|_{(r)} \le R$ implies $\|T(v)\|_{(r)} \le R$. Moreover $\|T(v) - T(w)\|_{(r)} \le C\|v + w\|_{(r)}\|v - w\|_{(r)}$. Thus if in addition $RC < 1/2$, $T$ is a contraction mapping on $\{v : \|v\|_{(r)} \le R\}$. It therefore has a unique fixed point $v_0$ and $T^p(0) \to v_0$ in $H^r$. We now show that if $R$ is chosen appropriately and $k \ge r$, the norms $(\|T^p(0)\|_{(k)})$ are also bounded.

We prove this by induction on $k$. For $k = r$, we saw that $\|T^p(0)\|_{(r)} \le R$. Applying the third Sobolev multiplication theorem to the formula for $T(v)$, we get

$$\|T(v)\|_{(k)} \le A\|v\|_{(k)}\|v\|_{(r)} + B_k\|v\|_{(k-1)}^2 + C_k,$$

where $A$ is independent of $k$. Choose $R$ so small that $AR \le 1/2$. The above inequality and the boundedness result for $k - 1$ then imply that $\|T^{p+1}(0)\|_{(k)} \le \tfrac{1}{2}\|T^p(0)\|_{(k)} + D_k$. It follows by induction that $\|T^p(0)\|_{(k)} \le 2D_k$ and hence is bounded.

Since $(\|T^p(0)\|_{(k+1)})$ is bounded, by Rellich’s theorem we may pick a subsequence $T^{p_i}(0)$ which is convergent in $H^k$. Since this must converge a fortiori in $H^r$, we see that $T^{p_i}(0) \to v_0$ and therefore $v_0$ lies in $\bigcap H^k = C^\infty$. (Note that, since the limit is independent of the subsequence, in fact $T^p(0) \to v_0$ in $H^k$.)

This completes the proof.

Nash’s Theorem for Tn. Every smooth metric on Tn is realisable by an injective embedding in some RN .

Proof. Let $g$ be a smooth metric on $T$ and let $g_0$ be the metric in Theorem B. Take $\delta > 0$, such that $g - \delta g_0$ is everywhere positive definite. If $\|f\|_{(r)}$ is sufficiently small, $g - \delta g_0 - \delta f$ will still be positive definite. It will also be representable for arbitrarily small $\|f\|_{(r)}$ by Theorem A. But if $\|f\|_{(r)}$ is small, $g_0 + f$ is representable by Theorem B. Hence $g = \delta(g_0 + f) + (g - \delta g_0 - \delta f)$ will be representable.

Thus every smooth metric $g$ can be realised by a smooth map $u : T \to \mathbb{R}^N$. We now show that we can assume that $u$ is an injective embedding, i.e. the map $u$ is one–one and the derivative matrix $\partial u/\partial x_i$ has rank $n$ everywhere. Let $u_1$ be the injective embedding $T \to \mathbb{C}^n \equiv \mathbb{R}^{2n}$ given by $(x_j) \mapsto (e^{ix_j})_j \equiv (\cos x_j, \sin x_j)$ and let $g^{(1)}$ be the corresponding flat metric. Now we can find $\delta > 0$ such that $g^{(2)}(x) = g(x) - \delta g^{(1)}(x)$ is positive definite for all $x \in T$. Let $u_2$ be a smooth map realising $g^{(2)}$. Then $u = (\sqrt{\delta}\,u_1, u_2)$ is a smooth map realising $g$. Clearly $u$ is an injective embedding, since $u_1$ is.

8. TENSOR, SYMMETRIC AND EXTERIOR ALGEBRAS.

Tensor products. If $V$ and $W$ are finite–dimensional vector spaces over $\mathbb{R}$ or $\mathbb{C}$, we define their tensor product $V \otimes W$ by taking bases $(v_i)$ and $(w_j)$ in $V$ and $W$ and then decreeing $V \otimes W$ to be the vector space with basis $v_i \otimes w_j$. In general we set $(\sum a_i v_i) \otimes (\sum b_j w_j) = \sum a_i b_j\, v_i \otimes w_j$, so that $v \otimes w$ is defined for any $v \in V$, $w \in W$. This definition is up to isomorphism independent of the choice of basis. Clearly $\dim(V \otimes W) = \dim(V)\dim(W)$. Iterating we get a similar definition of a $k$–fold tensor product $V_1 \otimes \cdots \otimes V_k$. By definition there is a natural one–one correspondence between the vector space of multilinear maps $V_1 \times \cdots \times V_k \to U$ and $\mathrm{Hom}(V_1 \otimes \cdots \otimes V_k, U)$; this could equally well be used as the universal property characterising the tensor product.

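The tensor product has various obvious functorial properties. Thus for example $V_1 \otimes V_2 \cong V_2 \otimes V_1$, $(V_1 \otimes V_2)^* = V_1^* \otimes V_2^*$, $V_1 \otimes V_2^* \cong \mathrm{Hom}(V_2, V_1)$, $V_1 \otimes V_2 \cong \mathrm{Hom}(V_2^*, V_1)$, $\mathrm{Hom}(V_1 \otimes V_2, V_3) \cong \mathrm{Hom}(V_1, V_2^* \otimes V_3)$.

Moreover if $f_i : U_i \to V_i$ are linear maps, then we have $f_1 \otimes f_2 : U_1 \otimes U_2 \to V_1 \otimes V_2$ sending $u_1 \otimes u_2$ to $f_1(u_1) \otimes f_2(u_2)$.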
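In coordinates the tensor product of vectors and of linear maps is the Kronecker product. The sketch below, which assumes NumPy and uses arbitrary random test data, checks the dimension count and the rule $(f_1 \otimes f_2)(u_1 \otimes u_2) = f_1(u_1) \otimes f_2(u_2)$.

```python
import numpy as np

rng = np.random.default_rng(0)
u1, u2 = rng.normal(size=2), rng.normal(size=3)
f1, f2 = rng.normal(size=(4, 2)), rng.normal(size=(5, 3))

# Coordinates of u1 (x) u2 in the basis v_i (x) w_j, ordered lexicographically.
u1_tensor_u2 = np.kron(u1, u2)
# Matrix of f1 (x) f2 with respect to the corresponding product bases.
f1_tensor_f2 = np.kron(f1, f2)

lhs = f1_tensor_f2 @ u1_tensor_u2
rhs = np.kron(f1 @ u1, f2 @ u2)
print(np.allclose(lhs, rhs))
```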

The tensor algebra. Let $T^n(V) = V^{\otimes n} = V \otimes \cdots \otimes V$ ($n$ times) and $T(V) = \bigoplus V^{\otimes n} = \bigoplus T^n(V)$, the tensor algebra. Multiplication $T^a(V) \times T^b(V) \to T^{a+b}(V)$ is defined by concatenation, so that $(v_1 \otimes \cdots \otimes v_a) \times (w_1 \otimes \cdots \otimes w_b) = v_1 \otimes \cdots \otimes v_a \otimes w_1 \otimes \cdots \otimes w_b$. This makes $T(V)$ into a non–commutative associative algebra.

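Action of $S_n$ on $V^{\otimes n}$. The symmetric group $S_n$ acts on $V^{\otimes n}$ by permuting the tensor factors. Thus $\sigma(v_1 \otimes \cdots \otimes v_n) = v_{\sigma 1} \otimes \cdots \otimes v_{\sigma n}$ for $\sigma \in S_n$. Define $\varepsilon : S_n \to \{\pm 1\}$ to be the sign homomorphism, assigning $+1$ to an even permutation and $-1$ to an odd permutation. Let

$$S\omega = \frac{1}{n!}\sum_{\sigma \in S_n} \sigma\omega, \qquad A\omega = \frac{1}{n!}\sum_{\sigma \in S_n} \varepsilon(\sigma)\,\sigma\omega$$

be the symmetrising and antisymmetrising operators on $V^{\otimes n}$.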

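Symmetric and exterior algebras. Let $S^n(V) = \{\omega \in V^{\otimes n} : \sigma\omega = \omega\ \forall \sigma \in S_n\} = SV^{\otimes n}$ and $\Lambda^n(V) = \{\omega \in V^{\otimes n} : \sigma\omega = \varepsilon(\sigma)\omega\ \forall \sigma \in S_n\} = AV^{\otimes n}$. $S(V) = \bigoplus S^n(V)$ and $\Lambda(V) = \bigoplus \Lambda^n(V)$ are called the symmetric and exterior algebras. Their multiplication is defined on homogeneous elements by $\omega_1 \cdot \omega_2 = S(\omega_1 \otimes \omega_2)$ or $\omega_1 \wedge \omega_2 = A(\omega_1 \otimes \omega_2)$ and extended bilinearly to the whole of $S(V)$ or $\Lambda(V)$. It is easy to check that $a \cdot (b \cdot c) = S(a \otimes b \otimes c) = (a \cdot b) \cdot c$ and that $a \wedge (b \wedge c) = A(a \otimes b \otimes c) = (a \wedge b) \wedge c$, so that $S(V)$ and $\Lambda(V)$ become associative algebras.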
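The operators $S$ and $A$ and the two products can be tried out numerically. In the following sketch (an illustration only; the representation of an element of $V^{\otimes n}$ as a NumPy array with $n$ axes and all function names are our own choices) we check that $S$ is idempotent, that $a \wedge b = -b \wedge a$ for vectors, and that the wedge product is associative.

```python
import itertools
import math

import numpy as np


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def symmetrise(omega):
    """S(omega): average of omega over all permutations of its tensor factors."""
    n = omega.ndim
    total = sum(np.transpose(omega, p) for p in itertools.permutations(range(n)))
    return total / math.factorial(n)


def antisymmetrise(omega):
    """A(omega): signed average over all permutations of the tensor factors."""
    n = omega.ndim
    total = sum(
        sign(p) * np.transpose(omega, p) for p in itertools.permutations(range(n))
    )
    return total / math.factorial(n)


def wedge(a, b):
    """Exterior product a ^ b = A(a (x) b)."""
    return antisymmetrise(np.tensordot(a, b, axes=0))


rng = np.random.default_rng(0)
a, b, c = (rng.normal(size=4) for _ in range(3))
abc = np.tensordot(np.tensordot(a, b, axes=0), c, axes=0)
print(np.allclose(wedge(wedge(a, b), c), antisymmetrise(abc)))
```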

Lemma. S(V ) is a commutative ring and Λ(V ) is a graded commutative ring.

Proof. The first result follows straight from the definitions and is just part of the fact that $S(V)$ coincides with the algebra of polynomial functions on $V^*$ (see below). The algebra $\Lambda(V)$ is $\mathbb{Z}_2$–graded into even or odd elements, according to the degree of homogeneous elements. We set $\partial a = 0$ or $1$ according as $a$ is even or odd. Graded commutativity is just the statement that $a \wedge b = (-1)^{\partial a\,\partial b}\, b \wedge a$, which is immediate from the definitions.

Concrete realisations of $S(V)$ and $\Lambda(V)$. We map $S(V)$ into polynomial functions on $V^*$. Note that $S^k(V)^* = S^k(V^*)$. We need

Polarisation Lemma. The tensors $v^{\otimes m}$ with $v \in V$ span $S^m V$.

Proof. Note that if $X$ is a subspace and $f(\lambda_1, \dots, \lambda_m)$ is a polynomial function of $\lambda_1, \dots, \lambda_m$ with values in $X$, then $\partial^{|\alpha|} f/\partial\lambda^\alpha$ also lies in $X$ for any multi–index $\alpha$, since $X$ is finite–dimensional, so closed. Take $v_1, \dots, v_m \in V$ and consider $f(\lambda) = (\sum \lambda_i v_i)^{\otimes m}$. Up to a constant non–zero factor $\partial^m f/\partial\lambda_1 \cdots \partial\lambda_m$ is the symmetrisation of $v_1 \otimes \cdots \otimes v_m$. This shows that the symmetrisation of any elementary tensor (and hence any tensor) lies in the subspace $X$ of $S^m V \subset V^{\otimes m}$ spanned by the tensors $v^{\otimes m}$.

In particular, $S^k V^*$ is spanned by tensors $x^{\otimes k}$ with $x \in V^*$. Hence the map $f \mapsto f(x^{\otimes k}) = f(x)$ defines an injection of $S^k(V)$ into the polynomials of degree $k$ on $V^*$. The map is clearly surjective, so we may identify $f \in S^k(V)$ with the polynomial $f(x)$. It is easy to see that under this identification $f \cdot g(x) = f(x)g(x)$, so that as a commutative algebra $S(V)$ can be identified with the algebra of polynomial functions on $V^*$.

Note that if $v_1, \dots, v_n$ is a basis of $V$, then a basis of $\Lambda^k(V)$ is given by $v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge v_{i_k}$ with $i_1 < i_2 < \cdots < i_k$. Thus $\dim \Lambda^k(V) = \binom{n}{k}$ and $\dim \Lambda(V) = 2^n$. In particular $\Lambda^m(V) = 0$ for $m > n$ and $\Lambda^n(V)$ is one–dimensional. We can also identify $\Lambda^k V$ with alternating multilinear functionals on $V^* \times \cdots \times V^*$. If $f$ and $g$ are homogeneous of degree $a$ and $b$ respectively, then exterior multiplication is given by the formula

$$f \wedge g(x_1, \dots, x_{a+b}) = \frac{1}{(a+b)!}\sum_{\sigma \in S_{a+b}} \varepsilon(\sigma)\, f(x_{\sigma 1}, \dots, x_{\sigma a})\, g(x_{\sigma(a+1)}, \dots, x_{\sigma(a+b)}).$$

(Actually the sum can be reduced to a sum over the coset space $S_{a+b}/S_a \times S_b$ since $\sigma(f \otimes g) = \varepsilon(\sigma) f \otimes g$ for $\sigma \in S_a \times S_b$.)

Finally note that $V \mapsto S(V)$ and $V \mapsto \Lambda(V)$ are functors from the additive category of vector spaces to the multiplicative tensor category of vector spaces. This will not be important for us, although it is the key to quantisation in quantum field theory. As Nelson said, first quantisation is a mystery while second quantisation is a functor. This functoriality appears in the isomorphisms $S(V \oplus W) = S(V) \otimes S(W)$ and $\Lambda(V \oplus W) = \Lambda(V) \otimes \Lambda(W)$ between (graded) commutative algebras. We need $(a \otimes b)(c \otimes d) = (-1)^{\partial b\,\partial c}\, ac \otimes bd$ to define the tensor product of graded algebras. The basic rule in discussing graded objects is that if we move a symbol of degree $\partial_1$ past a symbol of degree $\partial_2$, then a sign $(-1)^{\partial_1 \partial_2}$ must be introduced. The functor $S$ corresponds to bosons which satisfy the canonical commutation relations while the functor $\Lambda$ corresponds to fermions which satisfy the canonical anticommutation relations. The basic idea of supersymmetry is that the bosonic and fermionic theory can be developed in parallel at each stage, so that any concept introduced in one theory has its natural counterpart in the other.

Inner products and tensors. If $U$ and $V$ are real or complex inner product spaces, we can define an inner product on $U \otimes V$ by taking any positive multiple of the inner product $(u_1 \otimes v_1, u_2 \otimes v_2) = (u_1, u_2)(v_1, v_2)$. In particular we define the inner product on $T^k(V) = V^{\otimes k}$ by $(a_1 \otimes \cdots \otimes a_k, b_1 \otimes \cdots \otimes b_k) = k! \prod (a_i, b_i)$. (The factor of $k!$ is essential here to guarantee $(\exp(a), \exp(b)) = \exp (a, b)$ for $a, b \in V$.) This inner product extends to $T(V)$ by declaring the $T^k(V)$ to be mutually orthogonal. Note that, since $S(V), \Lambda(V) \subset T(V)$, there are naturally induced inner products on $S(V)$ and $\Lambda(V)$. The definition immediately gives the following explicit formulas in the functional realisations above.

Lemma. (a) In $\Lambda(V)$, we have $(a_1 \wedge \cdots \wedge a_m, b_1 \wedge \cdots \wedge b_n) = \delta_{nm} \det(a_i, b_j)$.
(b) In $S(V)$, we have $(x^m, y^n) = \delta_{mn}\, n!\, (x, y)^n$.

We will see that regarded as polynomial functions on $V^* = \mathbb{C}^n$, the inner product in $S(V)$ agrees with the inner product $(f, g) = \pi^{-n} \int_{\mathbb{C}^n} f(z)\overline{g(z)}\, e^{-|z|^2}$, so that $S(V)$ can be identified with so–called holomorphic Fock space (see below). Part (a) of the lemma shows that if $(e_i)$ is an orthonormal basis of $V$, then $e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k}$ ($i_1 < \cdots < i_k$) is an orthonormal basis for $\Lambda^k(V)$.

Now both on $\Lambda(V)$ and $S(V)$ we have the operation of multiplication by $v \in V$. We now work out their adjoints.

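Theorem (adjoint derivations). (a) The adjoint $e(v)^*$ of exterior multiplication $e(v)\omega = v \wedge \omega$ is the graded derivation $d_v(v_1 \wedge \cdots \wedge v_k) = \sum (-1)^{i+1} (v_i, v)\, v_1 \wedge \cdots \wedge v_{i-1} \wedge v_{i+1} \wedge \cdots \wedge v_k$ with $d_v(1) = 0$.

(b) The adjoint of multiplication by $v$ is the derivation $\partial_v$ given by $\partial_v(x_1 \cdots x_n) = \sum (x_i, v) \prod_{j \ne i} x_j$ for $x_j \in V$ with $\partial_v(1) = 0$.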

Proof. (a) We have

$$\begin{aligned}
(e(w_1)^* v_1 \wedge \cdots \wedge v_{n+1},\, w_2 \wedge \cdots \wedge w_{n+1}) &= (v_1 \wedge \cdots \wedge v_{n+1},\, w_1 \wedge \cdots \wedge w_{n+1}) \\
&= \det(v_i, w_j) \\
&= \sum (-1)^{i+1} (v_i, w_1)\,(v_1 \wedge \cdots \wedge v_{i-1} \wedge v_{i+1} \wedge \cdots \wedge v_{n+1},\, w_2 \wedge \cdots \wedge w_{n+1}),
\end{aligned}$$

expanding the determinant by the first column. This proves the formula for $e(v)^*$. This is usually called “contraction” with $v$ or “interior multiplication”. It is routine to check from the definition of $d_v$ that, if $\omega_1$ and $\omega_2$ are homogeneous, then $d_v(\omega_1 \wedge \omega_2) = d_v(\omega_1) \wedge \omega_2 + (-1)^{\partial\omega_1} \omega_1 \wedge d_v\omega_2$. This means that $d_v$ is a graded derivation, the signs being compatible with our previous convention since $d_v$ is odd. Note that $d_v$ is uniquely determined once we declare that it is a graded derivation, $d_v(1) = 0$ and $d_v w = (w, v)$ for $w \in V$.

(b) This can be checked directly using the inner product as in (a). When $V$ is a complex inner product space, it is also obvious in the functional realisation in terms of polynomials on $\mathbb{C}^n$ with the above inner product, for there clearly $z_i$ has adjoint $\partial/\partial z_i$.

Theorem (real and complex wave representation). (a) Let $V$ be an inner product space. Then if $a, b \in V$, the operators $e(a)$, $e(b)$ on $\Lambda(V)$ satisfy the canonical anticommutation relations $e(a)e(b) + e(b)e(a) = 0$, $e(a)^*e(b)^* + e(b)^*e(a)^* = 0$ and $e(a)e(b)^* + e(b)^*e(a) = (a, b)$.
(b) Let $V$ be an inner product space. Then, if $z, w \in V$, the operators $z$ and $\partial_w$ on $S(V)$ satisfy the canonical commutation relations $zw - wz = 0$, $\partial_z\partial_w - \partial_w\partial_z = 0$ and $\partial_w z - z\partial_w = (z, w)$.

Proof. (a) Clearly $e(a)$ and $e(b)$ anticommute, so taking adjoints so too do $e(a)^*$ and $e(b)^*$. Now, since $e(b)^*$ is a graded derivation,

$$(e(a)e(b)^* + e(b)^*e(a))\omega = a \wedge e(b)^*\omega + e(b)^*(a \wedge \omega) = a \wedge e(b)^*\omega + (a, b)\omega - a \wedge e(b)^*\omega = (a, b)\omega.$$

(b) Clearly $z$ and $w$ commute. Hence so do their adjoints $\partial_z$ and $\partial_w$. Now, since $\partial_w$ is a derivation,

$$(\partial_w z - z\partial_w)p = z\partial_w p + (z, w)p - z\partial_w p = (z, w)p.$$

This proves the last commutation relation.
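The anticommutation relations can be checked by writing the operators $e(v_i)$ as explicit matrices. The sketch below is an illustration under our own conventions: $V = \mathbb{R}^n$ with its standard orthonormal basis, basis vectors $v_S$ of $\Lambda(V)$ indexed by subsets $S$ encoded as bitmasks, and a hypothetical helper `exterior_multiplication`. Since the $v_S$ are orthonormal, the adjoint is just the transpose.

```python
import numpy as np


def exterior_multiplication(i, n):
    """Matrix of e(v_i) on Lambda(R^n) in the basis v_S, S a subset of {0..n-1}.

    Subsets are encoded as bitmasks; v_S is the wedge of its elements in
    increasing order.
    """
    dim = 2 ** n
    e = np.zeros((dim, dim))
    for S in range(dim):
        if S & (1 << i):
            continue  # v_i ^ v_S = 0 when i already lies in S
        # Moving v_i past each factor v_j with j < i costs one sign.
        sign = (-1) ** bin(S & ((1 << i) - 1)).count("1")
        e[S | (1 << i), S] = sign
    return e


n = 3
e = [exterior_multiplication(i, n) for i in range(n)]
identity = np.eye(2 ** n)

for i in range(n):
    for j in range(n):
        delta = 1.0 if i == j else 0.0
        # e(a)e(b) + e(b)e(a) = 0 and e(a)e(b)^* + e(b)^*e(a) = (a, b)
        assert np.allclose(e[i] @ e[j] + e[j] @ e[i], 0)
        assert np.allclose(e[i] @ e[j].T + e[j].T @ e[i], delta * identity)
```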

Theorem (irreducibility of wave representation). (a) If $V$ is a complex inner product space, the operators $e(v)$ and $e(v)^*$ act irreducibly on $\Lambda(V)$.
(b) If $V$ is a complex inner product space, the operators $v$ and $\partial_v$ act irreducibly on $S(V)$.

Proof. (a) Let $U \ne (0)$ be an invariant subspace and take $\omega \ne 0$ in $U$. Then $\omega = \sum a_I\, v_{i_1} \wedge \cdots \wedge v_{i_k}$ with respect to some orthonormal basis $(v_i)$. Pick a non–zero term of maximal degree, $a_I\, v_{i_1} \wedge \cdots \wedge v_{i_k}$. Then $e(v_{i_k})^* \cdots e(v_{i_1})^*\omega = a_I$, so that $1 \in U$. Since all of $\Lambda(V)$ can be obtained by applying $e(v)$’s to $1$, we see that $U = \Lambda(V)$.
(b) We have to show that the operators $z_i$ and $\partial/\partial z_j$ act irreducibly on the polynomial algebra $\mathbb{C}[z_1, \dots, z_n]$. Let $U \ne (0)$ be an invariant subspace and take $p(z) \ne 0$ in $U$. Then $p(z) = \sum a_\alpha z^\alpha$. Pick a non–zero term of maximal degree $a_\alpha z^\alpha$. Then $\partial^\alpha p(z) = \alpha!\, a_\alpha$, so that $1 \in U$. Since all polynomials can be obtained by multiplying $1$ by $z_i$’s, we see that $U = \mathbb{C}[z_1, \dots, z_n]$.

9. THE DOUBLE COMMUTANT THEOREM.

Let $V$ be a finite–dimensional inner product space over $\mathbb{C}$ and let $A \subseteq \mathrm{End}\,V$ be a *–subalgebra of $\mathrm{End}\,V$. This means that $I \in A$ and $A$ is a linear subspace closed under multiplication and the adjoint operation $T \mapsto T^*$. For any subset $S \subseteq \mathrm{End}\,V$, we define the commutant of $S$ by

$$S' = \mathrm{End}_S(V) = \{T \in \mathrm{End}\,V : Tx = xT \text{ for all } x \in S\}.$$

Schur’s Lemma. (i) $A$ acts irreducibly on $V$ (i.e. has no proper non–zero invariant subspaces) iff $A' = \mathbb{C}$.
(ii) If $A$ acts on two irreducible subspaces $V_i$ and $T \in \mathrm{Hom}_A(V_1, V_2)$ (i.e. commutes with $A$), then $T = 0$ or is an isomorphism.

Proof. (Spectral Theorem.) (i) Say $A$ does not act irreducibly and let $U \subset V$ be a proper non–zero subspace invariant under $A$ (i.e. $U$ is an $A$–submodule). Then, if $P$ is the orthogonal projection onto $U$, we have $P \in A'$. So $A' \ne \mathbb{C}$.

Conversely if $T \in A'$, then, since $A'$ is a *–algebra, both $\mathrm{Re}\,T = (T + T^*)/2$ and $\mathrm{Im}\,T = (T - T^*)/2i$ lie in $A'$. By the spectral theorem for self–adjoint matrices, so does any projection onto an eigenspace (i.e. a spectral projection). So if $T \notin \mathbb{C}$, we have produced a projection $P \in A'$ with $P \ne 0, I$. The corresponding subspace is invariant.
(ii) If $V_1$ and $V_2$ are irreducible and $T$ is an intertwiner, then so is $T^*$ (simply take adjoints of the intertwining relation and replace $a$ by $a^*$). But then $TT^*$ and $T^*T$ are also intertwiners, i.e. $T^*T \in \pi_1(A)'$ and $TT^* \in \pi_2(A)'$. They must be scalars by (i), so either both zero or both the same multiple of the identity.

Double commutant theorem. If A ⊂ End(V ) is a *–algebra, then A′′ = A.

Proof. (1) If $U$ is a subspace of $V$ invariant under $A$, then so is $U^\perp$. In particular $V$ is a direct sum of irreducible $A$–submodules.

Proof. Say $\xi \in U^\perp$ and $a \in A$. Let $\eta \in U$. Then $\langle a\xi, \eta\rangle = \langle \xi, a^*\eta\rangle = 0$ since $a^*\eta \in U$ and $\xi \perp U$. So $a\xi \perp U$, i.e. $a\xi \in U^\perp$. So $V = U \oplus U^\perp$ with $U$ and $U^\perp$ $A$–modules. We continue this game if $U$ or $U^\perp$ fail to be irreducible.

(2) If $S \in A''$ and $v \in V$, there is a $T \in A$ such that $Tv = Sv$.

Proof. In fact let $W = Av \subseteq V$. This is an $A$–submodule of $V$. The orthogonal projection onto $W$ gives a projection $E \in \mathrm{End}\,V$ ($E^2 = E = E^*$) which commutes with $A$, from (1). So $E \in A'$. But $S \in A''$, so $SE = ES$. This means that $S$ leaves $W$ and $W^\perp$ invariant. (Note that $I - E$ is the orthogonal projection onto $W^\perp$.) But $v \in W$. So $Sv \in W = Av$. So $Sv = Tv$ for some $T \in A$.

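(3) Let $V' = V \oplus \cdots \oplus V$ ($m$ times) with $A$ acting diagonally, $a(\xi_1, \dots, \xi_m) = (a\xi_1, \dots, a\xi_m)$. This means we can identify $A$ with a *–subalgebra of $\mathrm{End}\,V'$ (for the initiated, $V \oplus \cdots \oplus V = V \otimes \mathbb{C}^m$). It’s easy to check that $\pi(A)' = A' \otimes M_m(\mathbb{C}) = M_m(A')$, if we write elements of $\mathrm{End}\,V'$ as $m \times m$ matrices with entries in $\mathrm{End}\,V$. We go on to check that

$$\pi(A)'' = (\pi(A)')' = \pi(A'') = \left\{ \begin{pmatrix} x & & & \\ & x & & \\ & & \ddots & \\ & & & x \end{pmatrix} : x \in A'' \right\},$$

where here (and above) $\pi$ denotes the embedding $\mathrm{End}\,V \to \mathrm{End}\,V'$ taking operators to diagonal operators.

Take $m = \dim V$. Set $v = (e_1, \dots, e_m)^T \in V'$, where $e_1, \dots, e_m$ is a basis of $V$. By step (2), we have $\pi(A)v = \pi(A)''v$. But $\pi(A)'' = \pi(A'')$ from the above. So given $S \in A''$ we can find $T \in A$ such that $\pi(S)v = \pi(T)v$. Hence

$$\begin{pmatrix} S & & & \\ & S & & \\ & & \ddots & \\ & & & S \end{pmatrix} \begin{pmatrix} e_1 \\ \vdots \\ e_m \end{pmatrix} = \begin{pmatrix} T & & & \\ & T & & \\ & & \ddots & \\ & & & T \end{pmatrix} \begin{pmatrix} e_1 \\ \vdots \\ e_m \end{pmatrix}.$$

Thus $Se_i = Te_i$ for all $i$ and hence $S = T$.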
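The theorem can be tested on a small example by computing commutants as null spaces. The following sketch makes several assumptions of its own: it works with real matrices for simplicity, the example algebra is $A = M_2 \otimes I_2$ acting on a 4–dimensional space, and the helper names `commutant` and `span_dimension` are hypothetical. It finds $A'$ and $A''$ numerically and checks that $A''$ has the same span as $A$.

```python
import numpy as np


def commutant(generators, tol=1e-10):
    """Basis (list of n x n matrices) of {T : T x = x T for all x in generators}."""
    n = generators[0].shape[0]
    eye = np.eye(n)
    # With row-major flattening, vec(x T) = (x kron I) vec(T) and
    # vec(T x) = (I kron x^T) vec(T), so T commutes with x iff vec(T)
    # lies in the null space of (x kron I) - (I kron x^T).
    system = np.vstack([np.kron(x, eye) - np.kron(eye, x.T) for x in generators])
    _, singular_values, vh = np.linalg.svd(system)
    rank = int(np.sum(singular_values > tol))
    return [row.reshape(n, n) for row in vh[rank:]]


def span_dimension(matrices, tol=1e-10):
    """Dimension of the linear span of a list of matrices."""
    return int(np.linalg.matrix_rank(np.array([m.ravel() for m in matrices]), tol=tol))


# Example *-algebra A = M_2 (x) I_2, spanned by the matrices E_ij (x) I_2.
units = []
for i in range(2):
    for j in range(2):
        E = np.zeros((2, 2))
        E[i, j] = 1.0
        units.append(np.kron(E, np.eye(2)))

A_prime = commutant(units)
A_double_prime = commutant(A_prime)
print(len(A_prime), len(A_double_prime))
```

In this example $A' = I_2 \otimes M_2$, so both commutants are 4–dimensional.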

Corollary 1. A *–algebra A acts irreducibly iff A = End(V ).

Corollary 2. All *–representations of End(V ) are on direct sums of copies of V .

Proof 1. Since all representations are sums of irreducibles, it suffices to show that $V$ is the only irreducible representation of $\mathrm{End}(V)$. But if $W$ is another inequivalent irreducible, the double commutant on $V \oplus W$ must be $\mathrm{End}(V) \oplus \mathrm{End}(W)$ by Schur’s lemma and the double commutant theorem. But the image of $\mathrm{End}(V)$ must coincide with its double commutant, a contradiction.

Proof 2. Choose matrix units $(e_{ij})$ in $A = \mathrm{End}(V)$ and let $W$ be an $A$–module. Set $W_1 = e_{11}W$ and let $(w_j)$ be a basis of $W_1$. Consider the map $T : V \otimes W_1 \to W$, $\sum \mu_{ij}\, e_i \otimes w_j \mapsto \sum \mu_{ij}\, e_{i1} w_j$. $T$ is surjective since $I = \sum e_{j1} e_{11} e_{1j}$, so that $AW_1 = W$. $T$ is also injective, for $\sum \mu_{ij}\, e_{i1} w_j = 0$ forces $\sum_j \mu_{ij} w_j = 0$ for each $i$ (premultiply by $e_{1i}$) and hence $\mu_{ij} \equiv 0$. By construction it commutes with the actions of $A$. It is even unitary if $(w_j)$ is chosen orthonormal. So $W$ is a direct sum of copies of $V$.

Corollary 3 (Schur–Weyl duality). If $A$ is the *–algebra of linear combinations of $g^{\otimes m}$’s on $V^{\otimes m}$ as $g$ ranges over $GL(V)$ and $B$ is the *–algebra of linear combinations of the $\sigma$’s as $\sigma$ ranges over $S_m$, we have $A = B'$ and $B = A'$, so that $A$ and $B$ are each other’s commutants.

Proof. Since $A$ and $B$ are *–algebras, by the double commutant theorem, $A = A''$ and $B = B''$. So to prove $A' = B$, it is equivalent to check that $B' = A$. The algebra $A$ is a finite–dimensional subspace, so closed. Now any non–invertible matrix $x$ is the limit of invertible matrices, for $x + \varepsilon I$ is invertible for all $\varepsilon \ne 0$ sufficiently small. So $A$ contains all tensors $w \otimes \cdots \otimes w$ even if $w$ is not invertible. By the polarisation lemma, $A$ therefore coincides with the fixed points of $S_m$ in $\mathrm{End}\,V^{\otimes m}$, i.e. the commutant of $S_m$. (Note that conjugation by $\sigma$ gives the permutation action of $S_m$ on $(\mathrm{End}\,V)^{\otimes m} = W^{\otimes m}$, where $W = \mathrm{End}\,V$.) Thus $B' = A$ as required.

10. FERMIONS AND CLIFFORD ALGEBRAS.

Real Clifford algebras. Let $V$ be a real $2n$–dimensional inner product space. Operators $c(v)$ on a real or complex inner product space $W$ are said to satisfy the real Clifford algebra relations iff $v \mapsto c(v)$ is $\mathbb{R}$–linear, $c(v)^* = c(v)$ and $c(a)c(b) + c(b)c(a) = 2(a, b)I$.

Lemma 1. If the operators $c(v)$ satisfy the real Clifford algebra relations, then the real *–algebra $A$ they generate is spanned by products $c(v_{i_1})c(v_{i_2}) \cdots c(v_{i_k})$ with $i_1 < i_2 < \cdots < i_k$ and $(v_i)$ a basis of $V$. Moreover $\dim(A) \le 2^{\dim(V)}$.

Proof. Clearly the algebra generated by the $c(v_i)$’s is a *–algebra since each $c(v)$ is self–adjoint. Thus it suffices to prove that $A_0 = \mathrm{lin}_{\mathbb{R}}\{c(v_{i_1})c(v_{i_2}) \cdots c(v_{i_k})\}$ is closed under multiplication by $c(v_i)$. This, however, is obvious from the Clifford relations. Hence $A = A_0$. Clearly $\dim(A_0) \le 2^{\dim(V)}$.

Lemma 2. Let $c(v) = e(v) + e(v)^*$ acting on $W = \Lambda_{\mathbb{R}} V$.

(a) The $c(v)$’s satisfy the real Clifford algebra relations.

(b) The vector $\Omega = 1 \in \Lambda^0(V)$ is cyclic for the real *–algebra $A$ generated by the $c(v)$’s, i.e. $A\Omega = \Lambda(V)$.

(c) The operators $c(v_{i_1})c(v_{i_2}) \cdots c(v_{i_k})$ with $i_1 < i_2 < \cdots < i_k$ are linearly independent and the vector $\Omega = 1$ is separating for $A$, i.e. $a\Omega = 0$ implies $a = 0$.

(d) $c(v_{i_1}) \cdots c(v_{i_k})\omega = v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge v_{i_k} \wedge \omega$ + lower order terms modulo two.

Proof. (a) By the canonical anticommutation relations,

$$c(a)c(b) + c(b)c(a) = (e(a) + e(a)^*)(e(b) + e(b)^*) + (e(b) + e(b)^*)(e(a) + e(a)^*) = 2(a, b)I.$$

(b) Let $W_k = \mathrm{lin}\{c(x_1)c(x_2) \cdots c(x_j)\Omega : j \le k\}$ for $k \ge 0$. We prove by induction that $W_k = \bigoplus_{j=0}^k \Lambda^j(V)$. For $k = 0$, this is trivial. For $k > 0$, $c(x_1)\,x_2 \wedge \cdots \wedge x_k = x_1 \wedge \cdots \wedge x_k$ plus a term in $\Lambda^{k-2}(V)$. Thus, by induction, $x_1 \wedge \cdots \wedge x_k$ lies in $c(x_1)W_{k-1} + W_{k-2} \subset W_k$, as required.

(c) Since $2^{\dim(V)} = \dim \Lambda(V) = \dim A\Omega \le \dim(A) \le 2^{\dim(V)}$, this is obvious from (b) and Lemma 1.

(d) This follows easily by induction on $k$.

We define the real Clifford algebra $\mathrm{Cliff}(V)$ to be the real *–algebra generated by the $c(v)$’s on $\Lambda(V)$. We show that $\mathrm{Cliff}(V)$ has a similar universal property to the group algebra $\mathbb{C}[G]$. This is defined as the algebra of operators on $\ell^2(G)$ generated by left translations. Any finite–dimensional unitary representation of $G$ gives rise to a *–representation of $\mathbb{C}[G]$ and conversely, so that $\mathbb{C}[G]$ is the universal algebra for representations of $G$. We claim that, given any operators $C(v)$ on $W$ satisfying the real Clifford algebra relations, there is a unique *–representation of $\mathrm{Cliff}(V)$ sending $c(v)$ to $C(v)$. Uniqueness is clear, since the $c(v)$’s generate $\mathrm{Cliff}(V)$; to prove existence, we take an orthonormal basis $(v_i)$ of $V$ and send the basis element $c(v_{i_1}) \cdots c(v_{i_k})$ of $\mathrm{Cliff}(V)$ to $C(v_{i_1}) \cdots C(v_{i_k})$. This is clearly a homomorphism of *–algebras.

If $W = \Lambda(V)$, a real inner product space, we have a natural complexification $W_{\mathbb{C}} = W \otimes_{\mathbb{R}} \mathbb{C}$. This is just obtained by taking an orthonormal basis for $V$ and hence $\Lambda(V)$ and extending the scalars and inner product in the obvious way. The algebra $A = \mathrm{Cliff}(V)$ and its complexification $\mathrm{Cliff}_{\mathbb{C}}(V) = A_{\mathbb{C}} = A \oplus iA$ act on $W_{\mathbb{C}}$. $A_{\mathbb{C}}$ is a complex *–algebra and $\Omega$ is again cyclic and separating for $A_{\mathbb{C}}$. This means $A_{\mathbb{C}}$ cannot act irreducibly; for if it did, $A_{\mathbb{C}} = \mathrm{End}(W_{\mathbb{C}})$ and $\Omega$ is not separating for $\mathrm{End}(W_{\mathbb{C}})$. Note that $a \mapsto a\Omega$ gives an isomorphism between $\mathrm{Cliff}(V)$ and $\Lambda(V)$ as linear spaces. This allows us to speak about the degree of an element of $\mathrm{Cliff}(V)$. Note the following immediate consequence of Lemma 2 (d).

Corollary. If $\omega_1 \in \Lambda^a(V)$ and $\omega_2 \in \Lambda^b(V)$, then $\omega_1 \cdot \omega_2 = \omega_1 \wedge \omega_2$ + lower degree terms modulo two.

We now show how introducing a complex structure on $V$ allows us to produce an irreducible representation of the real Clifford algebra relations. By definition a complex structure on $V$ is a map $J \in \mathrm{End}(V)$ such that $J^2 = -I$ and $J$ is orthogonal. Since $\dim(V) = 2n$ is even, such maps always exist. We can then define a complex inner product space $V_J$ from $V$ by taking $J$ to be multiplication by $i$ and taking the complex inner product on $V$ as $(v, w)_{\mathbb{C}} = (v, w)_{\mathbb{R}} - i(Jv, w)_{\mathbb{R}}$, where $(v, w)_{\mathbb{R}}$ denotes the original real inner product on $V$.

Lemma. $V_J$ is a complex inner product space with $(v, v)_{\mathbb{R}} = (v, v)_{\mathbb{C}}$.

Proof. Clearly $(v, w)_{\mathbb{C}}$ is $\mathbb{R}$–bilinear. Moreover, since $J^t = -J$, we have $(v, w)_{\mathbb{C}} = (v, w)_{\mathbb{R}} - i(Jv, w)_{\mathbb{R}} = (w, v)_{\mathbb{R}} + i(Jw, v)_{\mathbb{R}} = \overline{(w, v)_{\mathbb{C}}}$. Since $(Jv, w)_{\mathbb{C}} = i(v, w)_{\mathbb{C}}$, it follows that $(v, w)_{\mathbb{C}}$ is $\mathbb{C}$–linear in $v$ and conjugate linear in $w$. Now $(Jv, v)_{\mathbb{R}} = -(v, Jv)_{\mathbb{R}} = -(Jv, v)_{\mathbb{R}}$, so that $(Jv, v)_{\mathbb{R}} = 0$. Hence $(v, v)_{\mathbb{C}} = (v, v)_{\mathbb{R}}$ and $V_J$ is a complex inner product space.
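The properties used in the lemma are easy to confirm numerically. The sketch below assumes the standard complex structure $J(x, y) = (-y, x)$ on $\mathbb{R}^{2n}$ (one choice among many) and random test vectors; `complex_inner` is our own name for the form $(v, w)_{\mathbb{C}} = (v, w)_{\mathbb{R}} - i(Jv, w)_{\mathbb{R}}$.

```python
import numpy as np

n = 2
# Standard complex structure on R^{2n}: J(x, y) = (-y, x).
J = np.block([[np.zeros((n, n)), -np.eye(n)],
              [np.eye(n), np.zeros((n, n))]])


def complex_inner(v, w):
    """(v, w)_C = (v, w)_R - i (Jv, w)_R."""
    return v @ w - 1j * ((J @ v) @ w)


rng = np.random.default_rng(1)
v, w = rng.normal(size=2 * n), rng.normal(size=2 * n)

# J is multiplication by i: linearity in the first variable.
print(np.isclose(complex_inner(J @ v, w), 1j * complex_inner(v, w)))
# Hermitian symmetry and agreement of the two norms.
print(np.isclose(complex_inner(v, w), np.conj(complex_inner(w, v))))
print(np.isclose(complex_inner(v, v), v @ v))
```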

Theorem. The formula $C(v) = e(v) + e(v)^*$ gives a faithful (= injective) irreducible representation of $\mathrm{Cliff}_{\mathbb{C}}(V)$ on $S = \Lambda(V_J)$, called the “spin module”. In particular $\mathrm{Cliff}_{\mathbb{C}}(V) \cong \mathrm{End}(S)$.

Proof. Clearly $v \mapsto C(v)$ is $\mathbb{R}$–linear, $C(v)^* = C(v)$ and

$$C(v)C(w) + C(w)C(v) = (e(v) + e(v)^*)(e(w) + e(w)^*) + (e(w) + e(w)^*)(e(v) + e(v)^*) = 2\,\mathrm{Re}(v, w)_{\mathbb{C}}\, I = 2(v, w)_{\mathbb{R}}\, I.$$

Hence $C(v)$ satisfies the real Clifford algebra relations and therefore we get a *–homomorphism of $\mathrm{Cliff}(V)$ into $\mathrm{End}(\Lambda(V_J))$. Now the relation $C(v) = e(v) + e(v)^*$ implies $C(Jv) = e(Jv) + e(Jv)^* = ie(v) - ie(v)^*$. Hence $e(v) = \tfrac12(C(v) - iC(Jv))$ and $e(v)^* = \tfrac12(C(v) + iC(Jv))$. But the $e(v)$’s and $e(v)^*$’s act irreducibly on $\Lambda(V_J)$ (it is the complex wave representation), so the $C(v)$’s must also act irreducibly. Therefore the $C(v)$’s generate $\mathrm{End}(S)$. Thus the *–algebra generated by the $C(v)$’s has $\mathbb{C}$–dimension $\dim(S)^2 = 2^{2\dim_{\mathbb{C}}(V_J)} = 2^{\dim_{\mathbb{R}}(V)}$. But this is the $\mathbb{C}$–dimension of $\mathrm{Cliff}_{\mathbb{C}}(V)$, so the representation of $\mathrm{Cliff}_{\mathbb{C}}(V)$ is faithful and surjective. Hence $\mathrm{Cliff}_{\mathbb{C}}(V) \cong \mathrm{End}(S)$. Moreover the representation must a fortiori be faithful on the real subalgebra $\mathrm{Cliff}(V)$.

11. QUANTISATION: THE SPIN GROUP AND ITS LIE ALGEBRA.

Bogoliubov automorphisms of Cliff(V ). Consider the compact group SO(V ).

Lemma. SO(V ) is connected.

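Proof. Any matrix in $SO(V)$ is conjugate to a block diagonal matrix with $2 \times 2$ diagonal blocks $D_i = \begin{pmatrix} \cos x_i & \sin x_i \\ -\sin x_i & \cos x_i \end{pmatrix}$, so can be connected by a continuous path to $I$ by the path of matrices with blocks

$$D_i(t) = \begin{pmatrix} \cos tx_i & \sin tx_i \\ -\sin tx_i & \cos tx_i \end{pmatrix}, \qquad t \in [0, 1].$$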

If $g \in SO(V)$, $v \mapsto c(gv)$ also satisfies the real Clifford algebra relations, so induces an automorphism of $\mathrm{Cliff}(V)$. In fact $SO(V)$ acts orthogonally on $\Lambda(V)$ via $g(x_1 \wedge \cdots \wedge x_k) = gx_1 \wedge \cdots \wedge gx_k$, so that $g\,e(v)\,g^{-1} = e(gv)$ and hence $g\,c(v)\,g^{-1} = c(gv)$ since $c(v) = e(v) + e(v)^*$. Thus $SO(V)$ normalises $\mathrm{Cliff}(V)$ on $\Lambda(V)$. We write $\alpha_g$ for the automorphism of $\mathrm{Cliff}(V)$ and $\mathrm{Cliff}_{\mathbb{C}}(V)$ induced by $\mathrm{Ad}\, g$, $a \mapsto gag^{-1}$. In particular $g_0 = -I$ acts and gives a period two automorphism $\gamma = \alpha_{-I}$ of $\mathrm{Cliff}(V)$ satisfying $\gamma(c(v)) = -c(v)$. This automorphism gives rise to a $\mathbb{Z}_2$–grading on $A = \mathrm{Cliff}(V)$, because we can take the $\pm 1$ eigenspaces $A_\pm$ of $\gamma$. Clearly $A_+A_+ \subset A_+$, $A_+A_- \subset A_-$, $A_-A_+ \subset A_-$ and $A_-A_- \subset A_+$. Under the identification $A \equiv \Lambda(V)$, $A_+ = \Lambda^{\mathrm{even}}(V)$ and $A_- = \Lambda^{\mathrm{odd}}(V)$.

Now if $v \mapsto C(v)$ is the irreducible representation of the Clifford algebra relations on the spin module $S$, $v \mapsto C(gv)$ will give another irreducible representation on $S$. By uniqueness we can find $U_g \in U(S)$ such that $C(gv) = U_g C(v) U_g^*$ for all $v$. Note that $g \in SO(V)$ commutes with the complex structure $J$ iff $g \in U(V_J)$. In this case $g$ is canonically implemented on $S = \Lambda(V_J)$ by $g(v_1 \wedge \cdots \wedge v_k) = gv_1 \wedge \cdots \wedge gv_k$. In particular $g_0 = -I$ commutes with all $J$’s, so is canonically implemented on each $S$: $g_0$ acts as $\pm 1$ on $S_\pm$.

The choice of $U_g$ is not unique. If $U_g'$ is another possible choice, then $U_g^* U_g'$ must commute with all $C(v)$’s and hence must be a scalar matrix by Schur’s lemma. Thus $U_g$ is uniquely determined up to a phase in $\mathbb{T}$, so that $U_g$ really gives a homomorphism of $SO(V)$ into $U(S)/\mathbb{T} = PU(S)$, the projective unitary group. This is what is meant by quantisation. The prequantised action on $V$ can be implemented on Fock space $S$ by a unitary; the phase represents the anomaly that usually arises when we quantise. As we shall see, we really get a 2–valued representation of $SO(V)$ or equivalently a representation of a double cover, called $\mathrm{Spin}(V)$, which we now construct. Observe first that $U_g C(v) U_g^* = C(gv)$, so that $U_g$ normalises the real subalgebra $A = \mathrm{Cliff}(V)$ of $\mathrm{End}(S)$.

Theorem (Noether–Skolem). An invertible $g \in \mathrm{End}(S)$ normalises $A$ iff $g \in A^* \cdot \mathbb{C}^*$, where $A^*$ denotes the invertible elements in $A$.

Proof. We know that $\mathrm{End}(S) = A \oplus iA$, a direct sum of real vector spaces. Let $g = a + ib$ with $a, b \in A$ and set $\alpha(x) = gxg^{-1}$ for $x \in A$. Then $(a + ib)x = \alpha(x)(a + ib)$. Hence $ax = \alpha(x)a$ and $bx = \alpha(x)b$. Consider the polynomial $p(t) = \det(a + tb)$. Since $p(i) = \det g \ne 0$, $p$ is not identically zero, so we can find $t \in \mathbb{R}$ such that $p(t) \ne 0$. Let $h = a + tb \in A$ and let $h^{-1} = u + iv$ with $u, v \in A$. Then $h(u + iv) = I$, so that $hv = 0$ and hence $v = 0$. Thus $h^{-1} \in A$. Since $hx = (a + tb)x = \alpha(x)(a + tb) = \alpha(x)h$, it follows that $z = h^{-1}g$ commutes with $A$, so lies in $\mathbb{C}^*$. Hence $g = hz$ as claimed.

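Corollary. For each $g \in SO(V)$, there is a unitary element $u_g \in A^*$ uniquely determined up to a sign such that $u_g c(v) u_g^* = c(gv)$.

Proof. By the theorem we may write $u_g = \lambda U_g$ with $u_g \in A^*$ and $\lambda \in \mathbb{C}^*$. Then $u_g u_g^* = |\lambda|^2 = u_g^* u_g$. Scaling $u_g$ by a positive real number, we may therefore arrange that $u_g$ is unitary. Since $A^* \cap \mathbb{C}^* = \mathbb{R}^*$, $u_g$ is uniquely determined up to sign.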

The spin group. Let $\mathrm{Spin}(V) = \{\pm u_g : g \in SO(V)\} \subset \mathrm{Cliff}(V)$, the spin group.

Lemma. $\mathrm{Spin}(V)$ consists of unitaries $u \in \mathrm{Cliff}(V)$ normalising $c(V)$ such that the orthogonal transformation $g$ defined by $c(gv) = uc(v)u^*$ lies in $SO(V)$. In particular $\mathrm{Spin}(V)$ is a closed subgroup of the unitary group of $A$, so compact.

Proof. Clearly any element of $\mathrm{Spin}(V)$ satisfies these conditions. Conversely if $u$ is such a unitary, $uc(v)u^* = c(gv)$ for a unique $g \in GL(V)$. Since $c(v)c(w) + c(w)c(v) = 2(v, w)I$, $g$ must lie in $O(V)$. The condition on the determinant guarantees that $g$ lies in $SO(V)$. But then by the corollary above $u$ must lie in $\mathrm{Spin}(V)$.

The map $\mathrm{Spin}(V) \to SO(V)$ is a surjective continuous homomorphism, by construction. Its kernel is $\pm I$, so that $\mathrm{Spin}(V)$ is a double cover of $SO(V)$.

Theorem. (a) $\mathrm{Spin}(V)$ is connected.
(b) $\mathrm{Spin}(V) \subset \mathrm{Cliff}_+(V)$.

Proof. (a) Let $f : \mathrm{Spin}(V) \to \mathbb{Z}$ be a continuous function; we must show it is constant. If we show that $f(-g) = f(g)$ for all $g$, then $f$ will drop to a continuous map of $SO(V)$ into $\mathbb{Z}$ and hence be constant, by the connectivity of $SO(V)$. But $x(t) = \cos \pi t + c(e_1)c(e_2) \sin \pi t$ ($t \in [0, 1]$) is a continuous path in $\mathrm{Spin}(V)$ from $I$ to $-I$. Hence $t \mapsto f(gx(t))$ is continuous so constant. Hence $f(g) = f(-g)$.
(b) Let $u_0$ be the element of $\mathrm{Cliff}(V)$ implementing the grading automorphism $\gamma$. Thus $u_0 c(v) u_0^* = -c(v)$. But $\alpha_g(u_0)$ also implements $\gamma$, so that $\alpha_g(u_0) = \lambda(g)u_0$ with $\lambda(g) = \pm 1$. Thus $\lambda(g)$ is a continuous homomorphism $SO(V) \to \{\pm 1\}$. Since $SO(V)$ is connected, $\lambda(g) \equiv 1$. Since $\alpha_g(u_0) = u_g u_0 u_g^*$, this implies that $u_g$ commutes with $u_0$. But then $\gamma(u_g) = u_0 u_g u_0^* = u_g$, so that $u_g \in \mathrm{Cliff}_+(V)$.
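The path $x(t)$ used in part (a) can be examined in the smallest case. The sketch below assumes $\dim V = 2$ and represents the Clifford relations by two hypothetical $2 \times 2$ real symmetric matrices `c1`, `c2` (any operators satisfying the relations would do). It checks that $x(t)$ is unitary, runs from $I$ to $-I$, and implements the rotation through angle $2\pi t$, which returns to the identity at $t = 1$: this is the double cover in miniature.

```python
import numpy as np

# Two symmetric real matrices with c1^2 = c2^2 = I and c1 c2 + c2 c1 = 0.
c1 = np.array([[0.0, 1.0], [1.0, 0.0]])
c2 = np.array([[1.0, 0.0], [0.0, -1.0]])
identity = np.eye(2)


def c(v):
    """c(v) for v = (v_1, v_2) in R^2."""
    return v[0] * c1 + v[1] * c2


def x(t):
    """The path x(t) = cos(pi t) + c(e_1) c(e_2) sin(pi t)."""
    return np.cos(np.pi * t) * identity + np.sin(np.pi * t) * (c1 @ c2)


def rotation(t):
    """The element g of SO(2) with x(t) c(v) x(t)^* = c(g v)."""
    angle = 2 * np.pi * t
    return np.array([[np.cos(angle), np.sin(angle)],
                     [-np.sin(angle), np.cos(angle)]])


for t in np.linspace(0.0, 1.0, 9):
    u = x(t)
    assert np.allclose(u @ u.T, identity)
    for v in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
        assert np.allclose(u @ c(v) @ u.T, c(rotation(t) @ v))
```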

Matrix groups and their Lie algebras. We start by proving von Neumann’s theorem on closed subgroups of $GL(V)$. We define $gl(V) = \mathrm{End}\,V$ with the usual operator norm.

Lemma (Lie’s formulas). If $a, b \in \mathrm{End}(V)$ then $(\exp(a/n)\exp(b/n))^n \to \exp(a + b)$ and

$$(\exp(a/n)\exp(b/n)\exp(-a/n)\exp(-b/n))^{n^2} \to \exp[a, b].$$

Proof. Recall that $\exp(a) = \sum a^n/n!$ for all $a$ and $\log(1 + x) = \sum (-1)^{n+1}x^n/n$ for $\|x\| < 1$. For $\|a\|$ sufficiently small, we have $\log \exp a = a$ and for $x$ sufficiently small $\exp \log(1 + x) = 1 + x$. Then

$$\log([\exp(a/n)\exp(b/n)]^n) = n \log(1 + (a + b)/n + O(1/n^2)) = a + b + O(1/n) \to a + b,$$

and

$$\log([\exp(a/n)\exp(b/n)\exp(-a/n)\exp(-b/n)]^{n^2}) = n^2 \log(1 + [a, b]/n^2 + O(1/n^3)) = [a, b] + O(1/n) \to [a, b].$$
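Both limits can be observed numerically. The following sketch is an illustration under our own choices: a truncated power series `expm` for the matrix exponential (adequate only for small matrices of modest norm) and the two nilpotent $2 \times 2$ matrices `a`, `b` as test data. It measures the distance from the limit for two values of $n$; the $O(1/n)$ decay from the proof suggests the error should shrink roughly tenfold between $n = 20$ and $n = 200$.

```python
import numpy as np


def expm(m, terms=30):
    """Matrix exponential via its power series (fine for small norms)."""
    result = np.eye(m.shape[0])
    term = np.eye(m.shape[0])
    for k in range(1, terms):
        term = term @ m / k
        result = result + term
    return result


a = np.array([[0.0, 1.0], [0.0, 0.0]])
b = np.array([[0.0, 0.0], [1.0, 0.0]])
bracket = a @ b - b @ a


def product_error(n):
    """Distance between (exp(a/n) exp(b/n))^n and exp(a + b)."""
    step = expm(a / n) @ expm(b / n)
    return np.linalg.norm(np.linalg.matrix_power(step, n) - expm(a + b))


def commutator_error(n):
    """Distance between the n^2-th power of the group commutator and exp[a, b]."""
    step = expm(a / n) @ expm(b / n) @ expm(-a / n) @ expm(-b / n)
    return np.linalg.norm(np.linalg.matrix_power(step, n * n) - expm(bracket))


for n in (20, 200):
    print(n, product_error(n), commutator_error(n))
```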

Theorem (von Neumann). Let $G$ be a closed subgroup of $GL(V)$ and let

$$\mathrm{Lie}(G) = \{X \in \mathrm{End}\,V \mid \exp(tX) \in G \text{ for all } t\}.$$

Then $\mathrm{Lie}(G)$ is a linear subspace of $\mathrm{End}\,V$ closed under the Lie bracket $[a, b] = ab - ba$ and $\exp(\mathrm{Lie}(G))$ is a neighbourhood of $1$ in $G$. In fact if $U$ is a sufficiently small open neighbourhood of $0$ in $\mathrm{Lie}(G)$ then $\exp(U)$ is an open neighbourhood of $1$ in $G$ and $\exp$ gives a homeomorphism between $U$ and $\exp(U)$.

Proof. Lie’s formulas applied to $tX$ and $tY$, together with the fact that $G$ is closed, immediately show that $\mathrm{Lie}(G)$ is a subspace closed under the bracket $[X, Y] = XY - YX$.

It remains to show that $\exp(\mathrm{Lie}(G))$ is a neighbourhood of $1$ in $G$. Let $\mathrm{Lie}(G)^\perp$ be a vector subspace complementing $\mathrm{Lie}(G)$ in $gl(V)$, so that $gl(V) = \mathrm{Lie}(G) \oplus \mathrm{Lie}(G)^\perp$. By the inverse function theorem, $X \oplus Y \mapsto \exp(X)\exp(Y)$ gives a homeomorphism between a neighbourhood of $0$ in $\mathrm{End}(V)$ and a neighbourhood of $1$ in $GL(V)$ (its derivative at $0$ is $I$). If $\exp(\mathrm{Lie}(G))$ is not a neighbourhood of $1$ in $G$, then we can find $g_n \in G$ with $g_n \to 1$ but $g_n \notin \exp(\mathrm{Lie}(G))$. Write $g_n = \exp(X_n)\exp(Y_n)$ with $X_n \in \mathrm{Lie}(G)$, $Y_n \in \mathrm{Lie}(G)^\perp$. By assumption $Y_n \ne 0$ for all $n$. But since $\exp(X_n)$ and $g_n$ are in $G$, it follows that $\exp(Y_n) \in G$ for all $n$. Since $g_n \to 1$, we must have $Y_n \to 0$. By compactness, we may assume by passing to a subsequence if necessary that $Y_n/\|Y_n\| \to Y \in \mathrm{Lie}(G)^\perp$ with $\|Y\| = 1$. Fix $t > 0$. Since $\|Y_n\| \to 0$, we can choose integers $m_n$ such that $m_n\|Y_n\| \to t$. Then $\exp(m_nY_n) = \exp(Y_n)^{m_n} \in G$ has limit $\exp(tY)$. Since $G$ is closed, $\exp(tY) \in G$ for $t > 0$ and hence for all $t$ on taking inverses. So by definition $Y$ lies in $\mathrm{Lie}(G)$, a contradiction.

This result says that matrix groups are Lie groups. If $G$ is a matrix group, we denote its Lie algebra by $\mathrm{Lie}(G)$. We shall be interested in matrix groups that are closed subgroups of $O(n)$. Since $O(n) \subset U(n)$, they are also closed subgroups of $U(n)$.

Corollary. Let $G$ and $H$ be matrix groups and $\pi : G \to H$ a continuous homomorphism. Then there is a unique Lie algebra homomorphism $\pi : \mathrm{Lie}(G) \to \mathrm{Lie}(H)$ such that $\pi(\exp(X)) = \exp \pi(X)$ for $X \in \mathrm{Lie}(G)$.

Proof. Uniqueness follows because we may replace $X$ by $tX$ and take the coefficient of $t$. Conversely note that $\pi(\exp(tX))$ is a one parameter subgroup in $H$. Now $H$ is a closed subgroup of $U(n)$; since commuting unitaries can be simultaneously diagonalised, it follows that $\pi(\exp(tX)) = \exp tA$ for some skew–adjoint matrix $A$. But then by definition $A$ lies in $\mathrm{Lie}(H)$. We define $\pi(X) = A$. From Lie’s formulas, the map $X \mapsto \pi(X)$ is a Lie algebra homomorphism.

Proposition. (a) $\mathrm{Lie}(SO(V)) = \{A \in \mathrm{End}(V) : A^t = -A\}$.
(b) $\mathrm{Lie}(\mathrm{Spin}(V)) = \{x \in \mathrm{Cliff}_+(V) \mid x^* = -x,\ [x, c(V)] \subset c(V)\} = \mathrm{lin}_{\mathbb{R}}\{c(a)c(b) - c(b)c(a) : a, b \in V\}$. A basis is given by $c(e_i)c(e_j)$ with $i < j$.
(c) If $\pi : \mathrm{Spin}(V) \to SO(V)$ is the double cover, $\pi^{-1}(A) = \frac14 \sum_{i \ne j} a_{ij}\, c(e_i)c(e_j)$, where $a_{ij} = (Ae_j, e_i)$.
(d) If $A \in \mathrm{Lie}(\mathrm{Spin}(V))$, then $[A, c(v)] = c(\pi(A)v)$.

Proof. (a) is obvious. To prove (b) and (c), note that if $e^{yt}$ lies in $\mathrm{Spin}(V)$, then $y$ is even, $y^* = -y$ and $e^{yt}c(v)e^{-yt} = c(e^{At}v)$ for some $A \in \mathrm{Lie}(SO(V))$. Taking the coefficient of $t$, we get $[y, c(v)] = c(Av)$. Let $(Ae_j, e_i) = a_{ij}$, so that $a_{ij}$ is antisymmetric and real, and let $x = \frac14 \sum_{i \ne j} a_{ij}\, c(e_i)c(e_j)$. Then

$$[x, c(v)] = \frac14 \sum_{i \ne j} a_{ij}\, [c(e_i)c(e_j), c(v)] = \frac12 \sum_{i \ne j} a_{ij}\, [(v, e_j)c(e_i) - (v, e_i)c(e_j)] = c(Av).$$

Thus $[y - x, c(v)] = 0$ for all $v \in V$ and therefore $y - x$ must be a real scalar. Since $(y - x)^* = -(y - x)$, we deduce that $y = x$ as required. The map between Lie algebras is $\frac14 \sum_{i \ne j} a_{ij}\, c(e_i)c(e_j) \mapsto (a_{ij})$ by uniqueness.

Finally, if $A \in \mathrm{Lie}(\mathrm{Spin}(V))$, then $e^{At}c(v)e^{-At} = c(\pi(e^{At})v) = c(e^{\pi(A)t}v)$. Taking coefficients of $t$, we get $[A, c(v)] = c(\pi(A)v)$, so (d) follows.
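The identity $[x, c(v)] = c(Av)$, and in particular the normalising constant in front of the sum, can be checked by machine. The sketch below assumes $V = \mathbb{R}^4$, builds $c(e_i) = e(e_i) + e(e_i)^*$ on $\Lambda(V)$ with basis vectors indexed by bitmasks (the helper `clifford_generators` is our own), picks a random antisymmetric matrix `A`, and takes $x = \frac14 \sum_{i \ne j} a_{ij}\, c(e_i)c(e_j)$ with $a_{ij} = (Ae_j, e_i)$.

```python
import numpy as np


def clifford_generators(n):
    """Matrices c(e_i) = e(e_i) + e(e_i)^* on Lambda(R^n), basis indexed by bitmasks."""
    dim = 2 ** n
    gens = []
    for i in range(n):
        e = np.zeros((dim, dim))
        for S in range(dim):
            if not S & (1 << i):
                # Sign from moving e_i past the factors e_j, j < i, of e_S.
                sign = (-1) ** bin(S & ((1 << i) - 1)).count("1")
                e[S | (1 << i), S] = sign
        gens.append(e + e.T)
    return gens


n = 4
cs = clifford_generators(n)


def c(v):
    return sum(v[i] * cs[i] for i in range(n))


rng = np.random.default_rng(2)
M = rng.normal(size=(n, n))
A = M - M.T  # a random element of Lie(SO(V)); a_ij = (A e_j, e_i) = A[i, j]
x = 0.25 * sum(A[i, j] * cs[i] @ cs[j]
               for i in range(n) for j in range(n) if i != j)

v = rng.normal(size=n)
lhs = x @ c(v) - c(v) @ x
rhs = c(A @ v)
print(np.allclose(lhs, rhs))
```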

12. BOSONS: THE HARMONIC OSCILLATOR AND MEHLER’S FORMULA.

Holomorphic Fock space. Holomorphic Fock space $\mathcal{F}$ is defined to be the vector space of holomorphic functions $f$ on $\mathbb{C}$ such that $\pi^{-1}\int |f(z)|^2 e^{-|z|^2}\, dx\, dy < \infty$ with inner product $(f, g) = \pi^{-1}\int f(z)\overline{g(z)}\, e^{-|z|^2}\, dx\, dy$. (Note the normalisation!)

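Theorem. $\mathcal{F}$ is a Hilbert space with orthonormal basis $e_n(z) = z^n/\sqrt{n!}$. Moreover the Taylor series expansion of $f \in \mathcal{F}$ gives the Fourier expansion in terms of $(e_n)$.

Proof. (1) Passing to polar coordinates, it is straightforward to check that the functions $f_n(z) = z^n$ are orthogonal with respect to each inner product $(f, g)_R = \pi^{-1}\int_{|z| \le R} f\overline{g}\, e^{-|z|^2}\, dx\, dy$. Moreover

$$\|f_n\|^2 = \pi^{-1}\int_0^{2\pi}\!\!\int_0^\infty r^{2n+1}e^{-r^2}\, dr\, d\theta = \int_0^\infty s^n e^{-s}\, ds = \left(-\frac{d}{dx}\right)^{\!n}\bigg|_{x=1} \int_0^\infty e^{-sx}\, ds = n!$$

(2) Clearly $\|f\|_R \uparrow \|f\|$ as $R \to \infty$. Now if $f \in \mathcal{F}$, $f(z) = \sum a_n z^n$ converges uniformly for $|z| \le R$. So $\sum_{n=0}^N a_n z^n \to f$ in the uniform norm. Hence $\sum_{n=0}^N a_n z^n \to f$ with respect to $\|\cdot\|_R$, since $\|g\|_R^2 \le \|g\|_\infty^2 \|1\|_R^2$. Hence $\|\sum_{n=0}^N a_n z^n\|_R^2 \to \|f\|_R^2$, that is $\|f\|_R^2 = \sum |a_n|^2 \|f_n\|_R^2$. Letting $R \to \infty$, we get $\|f\|^2 = \sum |a_n|^2 \|f_n\|^2 = \sum |a_n|^2 n!$. Thus $(e_n)$ forms an orthonormal basis in $\mathcal{F}$ (so that $\mathcal{F} = \overline{\mathrm{lin}}(e_0, e_1, e_2, \dots)$).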

(3) We end by checking that if $\sum |b_n|^2 < \infty$, then $f(z) = \sum b_n z^n/\sqrt{n!}$ defines a function in $\mathcal{F}$ with $(f, e_n) = b_n$; this means that $\mathcal{F}$ can be identified with $\ell^2$ and hence is a Hilbert space. In fact for $|z| \le R$ the series is absolutely convergent because by the Cauchy–Schwarz inequality

$$\sum |b_n| R^n/\sqrt{n!} \le \Big(\sum |b_n|^2\Big)^{1/2} \Big(\sum R^{2n}/n!\Big)^{1/2} = \Big(\sum |b_n|^2\Big)^{1/2} e^{R^2/2}.$$

Clearly $(f, e_n) = \lim_{R \to \infty}(f, e_n)_R = b_n \lim_{R \to \infty} \|e_n\|_R^2 = b_n$ and $\|f\|^2 = \lim \|f\|_R^2 = \lim \sum |b_n|^2 \|e_n\|_R^2 = \sum |b_n|^2 < \infty$ as required.

Let $F_0$ be the subspace of $F$ spanned by polynomials in $z$. This can be identified with the symmetric algebra of a 1–dimensional complex Hilbert space. The adjoint of multiplication by $z$ is just $d/dz$, so that if $A = d/dz$ and $A^* = z$, we have $AA^* - A^*A = I$ (the canonical commutation relations for a boson). Note that $A$ and $A^*$ act irreducibly on $F_0$. In fact more generally, the operators $z_i$, $\partial/\partial z_i$ act irreducibly on polynomials in $n$ variables. This is the complex wave representation for bosons. The Stone–von Neumann theorem states that this is essentially the unique such representation (see the exercises).
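The commutation relation $AA^* - A^*A = I$ on polynomials can be checked mechanically. The following is a minimal sketch, not part of the notes: polynomials in $z$ are stored as coefficient lists (lowest degree first), and the helper names `mul_z`, `ddz`, `sub`, `trim` and `commutator` are illustrative choices.

```python
from fractions import Fraction

def mul_z(p):
    """A* : multiply a polynomial (coefficients, lowest degree first) by z."""
    return [Fraction(0)] + list(p)

def ddz(p):
    """A : differentiate a polynomial with respect to z."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def sub(p, q):
    """Return p - q, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients (keeping at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def commutator(p):
    """(A A* - A* A) p with A = d/dz and A* = multiplication by z."""
    return trim(sub(ddz(mul_z(p)), mul_z(ddz(p))))

# The commutator acts as the identity on any polynomial.
sample = [3, -1, 4, 1, 5]          # 3 - z + 4z^2 + z^3 + 5z^4
print(commutator(sample) == sample)
```

Exact integer arithmetic is used so that the identity holds exactly rather than up to rounding.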

Hermite functions and the Schrödinger representation. Let $H$ be the space of functions $H = \{p(x)e^{-x^2/2} : p(x) \text{ a polynomial}\}$ with inner product $(f, g) = \int_{-\infty}^{\infty} f(x)\overline{g(x)}\,dx$.

Let $Q$ be the multiplication operator $Qf(x) = xf(x)$ ("position" operator) on $H$ and $P$ the differential operator $Pf(x) = i\,df/dx$ ("momentum" operator) on $H$. Thus $P$ and $Q$ satisfy the Heisenberg commutation relations $[P, Q] = PQ - QP = iI$. Define the harmonic oscillator by $D = P^2 + Q^2 = -d^2/dx^2 + x^2$. Note that formally $P^* = P$ and $Q^* = Q$ on $H$, so that $D$ is formally self–adjoint on $H$. Let $X = Q - iP = x + d/dx$, the annihilation operator, and $Y = X^* = x - d/dx$, the creation operator.

Lemma. (a) $XY - YX = 2I$, so that $A = X/\sqrt{2}$ satisfies $AA^* - A^*A = I$.

(b) $XY = Q^2 + P^2 + I = D + I$.

(c) $XY^n - Y^nX = 2nY^{n-1}$.

(d) If $F_0(x) = e^{-x^2/2}$ and $F_n = Y^nF_0$, then $DF_n = (2n+1)F_n$.

Proof. (a) and (b) are obvious. (c) follows by induction from (a), since $XY^n - Y^nX = (XY^{n-1} - Y^{n-1}X)Y + Y^{n-1}(XY - YX) = 2nY^{n-1}$. Finally by (b) and (c), $DF_n = (XY - I)Y^nF_0 = (Y^{n+1}X + 2(n+1)Y^n - Y^n)F_0 = (2n+1)F_n$, since $XF_0 = 0$. So (d) follows.
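Part (d) can be verified symbolically for small $n$. The sketch below is an illustration only: it writes $F_n = q_n(x)e^{-x^2/2}$ and tracks the polynomial $q_n$ as an integer coefficient list. The two identities it relies on follow from the product rule: $Y(q\,e^{-x^2/2}) = (2xq - q')e^{-x^2/2}$ and $D(q\,e^{-x^2/2}) = (-q'' + 2xq' + q)e^{-x^2/2}$. All function names are hypothetical.

```python
def trim(p):
    """Drop trailing zero coefficients (keeping at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def combine(p, q, s):
    """Return p + s*q for coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + s * b for a, b in zip(p, q)])

def deriv(p):
    return trim([k * c for k, c in enumerate(p)][1:] or [0])

def times_x(p):
    return trim([0] + list(p))

def creation(p):
    """Y = x - d/dx acting on p(x) e^{-x^2/2}: polynomial part 2x p - p'."""
    return combine([2 * c for c in times_x(p)], deriv(p), -1)

def oscillator(p):
    """D = -d^2/dx^2 + x^2 acting on p(x) e^{-x^2/2}: polynomial part -p'' + 2x p' + p."""
    first = combine([2 * c for c in times_x(deriv(p))], deriv(deriv(p)), -1)
    return combine(first, p, 1)

def F(n):
    """Polynomial part of F_n = Y^n F_0, starting from F_0 = e^{-x^2/2}."""
    p = [1]
    for _ in range(n):
        p = creation(p)
    return p

for n in range(6):
    print(n, F(n), oscillator(F(n)) == [(2 * n + 1) * c for c in F(n)])
```

The polynomials produced this way are the classical Hermite polynomials with leading coefficient $2^n$.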

Corollary. $(F_n, F_m) = \delta_{nm}\,2^n n!\sqrt{\pi}$.

Proof. Since the $F_n$'s correspond to different eigenvalues of the self–adjoint operator $D$, they must be pairwise orthogonal. Clearly $\|F_0\|^2 = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$. Moreover we have

$$(F_n, F_n) = (Y^nF_0, Y^nF_0) = (XY^nF_0, Y^{n-1}F_0) = (Y^nXF_0 + 2nY^{n-1}F_0, Y^{n-1}F_0) = 2n(Y^{n-1}F_0, Y^{n-1}F_0),$$

since $XF_0 = 0$ and $Y^* = X$. Thus $\|F_n\|^2 = 2n\|F_{n-1}\|^2$, so the result follows by induction.
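The corollary can also be checked exactly with rational arithmetic. This sketch makes two assumptions that are not spelled out in the notes: the polynomial part $q_n$ of $F_n$ satisfies the standard three–term recurrence $q_{k+1} = 2xq_k - 2kq_{k-1}$, and the Gaussian moments are $\int x^{2j}e^{-x^2}dx = \sqrt{\pi}\,(2j-1)!!/2^j$. The function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def hermite(n):
    """Coefficients (lowest degree first) of q_n, where F_n = q_n(x) e^{-x^2/2},
    via the recurrence q_{k+1} = 2x q_k - 2k q_{k-1}."""
    prev, cur = [0], [1]
    for k in range(n):
        nxt = [0] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= 2 * k * c
        prev, cur = cur, nxt
    return cur

def gaussian_moment(k):
    """(integral of x^k e^{-x^2} dx) / sqrt(pi): zero for odd k."""
    if k % 2:
        return Fraction(0)
    m = Fraction(1)
    for j in range(1, k, 2):
        m *= Fraction(j, 2)
    return m

def inner(n, m):
    """(F_n, F_m) / sqrt(pi), computed exactly."""
    p, q = hermite(n), hermite(m)
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))

for n in range(5):
    print(n, inner(n, n), 2 ** n * factorial(n))
```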

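Remark. A direct proof is equally easy. It is enough to show that $\int_{-\infty}^{\infty} F_n(x)\,x^m e^{-x^2/2}\,dx = 0$ for all $m < n$. Since $F_n(x) = (-1)^n e^{x^2/2}\,d^n e^{-x^2}/dx^n$ (see below), integrating by parts $n$ times, this integral gives $\int_{-\infty}^{\infty} e^{-x^2}\big(\frac{d}{dx}\big)^n x^m\,dx = 0$.

Similarly $\|F_n\|_2^2 = \int_{-\infty}^{\infty} e^{-x^2}\big(\frac{d}{dx}\big)^n 2^n x^n\,dx = 2^n n!\sqrt{\pi}$.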

By definition the functions $x^n e^{-x^2/2}$ form a basis of $H$, so applying the Gram–Schmidt process we get an orthonormal basis $H_n(x) = p_n(x)e^{-x^2/2}$, where $p_n(x)$ is a polynomial of degree $n$ with positive coefficient of $x^n$. We call $H_n$ and $p_n$ the $n$th Hermite function and polynomial. Note that $e^{x^2/2}\frac{d}{dx}(e^{-x^2/2}f) = (\frac{d}{dx} - x)f$, so that

$$F_n(x) = \Big(x - \frac{d}{dx}\Big)^n e^{-x^2/2} = (-1)^n e^{x^2/2}\,\frac{d^n e^{-x^2}}{dx^n} = (2^n x^n + \cdots)e^{-x^2/2}.$$

The orthogonality of the $F_n$'s and the positive coefficient of $x^n$ in the expression for $F_n$ imply that $H_n(x) = F_n(x)/\|F_n\|_2$.


Mehler's formula.

$$\sum_{n\ge 0} t^n H_n(x)H_n(y) = \frac{1}{\sqrt{\pi(1 - t^2)}}\exp\frac{4xyt - (1 + t^2)(x^2 + y^2)}{2(1 - t^2)}.$$

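Proof. Note that by Taylor's theorem

$$\sum F_n(x)\frac{z^n}{n!} = e^{x^2/2}\sum_{n=0}^{\infty}\frac{(-z)^n}{n!}\frac{d^n}{dx^n}e^{-x^2} = e^{x^2/2}e^{-(x-z)^2} = e^{-x^2/2 + 2xz - z^2}.$$

Thus if $s^2 = t$, we have

$$F_{s,x}(z) \equiv \sum H_n(x)s^n e_n(z) = \sum \frac{F_n(x)}{(2^n n!\sqrt{\pi})^{1/2}}\,\frac{s^n z^n}{\sqrt{n!}} = \pi^{-1/4}\exp(-x^2/2 + 2xzs/\sqrt{2} - z^2s^2/2).$$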

For $a \in (-1, 1)$ and $b, b' \in \mathbb{R}$, set $G_{a,b}(z) = e^{az^2/2 + bz}$. Then, writing $z = x + iy$, we get

$$(G_{a,b}, G_{a,b'}) = \frac{1}{\pi}\iint e^{-(1-a)x^2}e^{-(1+a)y^2}e^{(b+b')x}e^{i(b-b')y}\,dx\,dy$$

$$= \frac{1}{\pi}\iint e^{-(1-a)x^2}e^{-(1+a)y^2}\,dx\,dy\;\exp\Big[\frac{(b+b')^2}{4(1-a)} - \frac{(b-b')^2}{4(1+a)}\Big]$$

$$= \frac{1}{\sqrt{1 - a^2}}\exp\frac{a(b^2 + b'^2) + 2bb'}{2(1 - a^2)},$$

(completing squares and shifting contours). In particular $G_{a,b}$ lies in $F$ and

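$$\sum H_n(x)s^n e_n(z) = \pi^{-1/4}e^{-x^2/2}G_{a,b}(z)$$

with $a = -s^2$ and $b = \sqrt{2}xs$. So from Parseval's equation for $F_{s,x}(z)$ we get

$$\sum t^n H_n(x)H_n(y) = \pi^{-1/2}e^{-(x^2+y^2)/2}\,(G_{-s^2,\sqrt{2}xs},\, G_{-s^2,\sqrt{2}ys})$$

$$= \frac{\pi^{-1/2}}{\sqrt{1 - s^4}}\,e^{-(x^2+y^2)/2}\exp\frac{-s^2(2x^2s^2 + 2y^2s^2) + 4xys^2}{2(1 - s^4)}$$

$$= \frac{1}{\sqrt{\pi(1 - t^2)}}\exp\frac{4xyt - (1 + t^2)(x^2 + y^2)}{2(1 - t^2)}.$$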
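Mehler's formula is easy to test numerically. The sketch below is an illustration under one assumption not stated in the notes: the normalised Hermite functions are generated by the standard recurrence $H_{n+1} = \sqrt{2/(n+1)}\,xH_n - \sqrt{n/(n+1)}\,H_{n-1}$ with $H_0 = \pi^{-1/4}e^{-x^2/2}$. The truncation level `n_max` and the function names are arbitrary choices.

```python
import math

def hermite_functions(x, n_max):
    """Orthonormal Hermite functions H_0..H_{n_max} at x, by the three-term recurrence."""
    vals = [math.pi ** -0.25 * math.exp(-x * x / 2)]
    if n_max >= 1:
        vals.append(math.sqrt(2.0) * x * vals[0])
    for n in range(1, n_max):
        vals.append(math.sqrt(2.0 / (n + 1)) * x * vals[n]
                    - math.sqrt(n / (n + 1)) * vals[n - 1])
    return vals

def mehler_series(t, x, y, n_max=200):
    """Partial sum of sum_n t^n H_n(x) H_n(y)."""
    hx, hy = hermite_functions(x, n_max), hermite_functions(y, n_max)
    return sum(t ** n * a * b for n, (a, b) in enumerate(zip(hx, hy)))

def mehler_closed_form(t, x, y):
    """Right-hand side of Mehler's formula, valid for |t| < 1."""
    expo = (4 * x * y * t - (1 + t * t) * (x * x + y * y)) / (2 * (1 - t * t))
    return math.exp(expo) / math.sqrt(math.pi * (1 - t * t))

print(mehler_series(0.5, 0.3, -0.7), mehler_closed_form(0.5, 0.3, -0.7))
```

For $t$ close to 1 the series converges slowly, so a larger `n_max` would be needed there.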

Corollary 1 (Mehler's formula for the heat kernel of the harmonic oscillator). Let $D_a = -d^2/dx^2 + a^2x^2$. Then $a^{1/4}H_n(\sqrt{a}x)$ are the normalised eigenfunctions of $D_a$ with corresponding eigenvalue $(2n+1)a$. The operator $e^{-tD_a}$ has kernel

$$K_t(x, y) = \sum_{n\ge 0} e^{-at(2n+1)}\sqrt{a}\,H_n(\sqrt{a}x)H_n(\sqrt{a}y) = (4\pi t)^{-1/2}\Big(\frac{2at}{\sinh 2at}\Big)^{1/2}\exp\Big(-\frac{1}{4t}\Big[\frac{2at}{\tanh 2at}(x^2 + y^2) - \frac{2at}{\sinh 2at}(2xy)\Big]\Big).$$

To prove this note that the unitary $Uf(x) = a^{1/4}f(a^{1/2}x)$ satisfies $UD_1U^{-1} = a^{-1}D_a$, and apply Mehler's formula with $t$ replaced by $e^{-2at}$.
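Two quick sanity checks on the closed form are possible without summing any series. First, as $a \to 0$ the kernel should approach the free heat kernel $(4\pi t)^{-1/2}e^{-(x-y)^2/4t}$. Second, the trace $\int K_t(x,x)\,dx$ should equal $\sum_n e^{-at(2n+1)} = 1/(2\sinh at)$. The sketch below is illustrative only; the integration window and step count are ad hoc choices.

```python
import math

def oscillator_heat_kernel(t, x, y, a=1.0):
    """Closed form of the kernel of exp(-t D_a), D_a = -d^2/dx^2 + a^2 x^2."""
    u = 2 * a * t
    pref = (4 * math.pi * t) ** -0.5 * math.sqrt(u / math.sinh(u))
    expo = -(u / math.tanh(u) * (x * x + y * y)
             - u / math.sinh(u) * 2 * x * y) / (4 * t)
    return pref * math.exp(expo)

def free_heat_kernel(t, x, y):
    """Kernel of exp(t d^2/dx^2) on the real line."""
    return (4 * math.pi * t) ** -0.5 * math.exp(-(x - y) ** 2 / (4 * t))

def trace(t, a=1.0, half_width=12.0, steps=24000):
    """Riemann-sum approximation of the integral of K_t(x, x) over the line."""
    h = 2 * half_width / steps
    return h * sum(oscillator_heat_kernel(t, -half_width + k * h, -half_width + k * h, a)
                   for k in range(steps + 1))

print(trace(0.3, 1.5), 1 / (2 * math.sinh(0.3 * 1.5)))
print(oscillator_heat_kernel(0.5, 0.4, -0.2, a=1e-4), free_heat_kernel(0.5, 0.4, -0.2))
```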

*Corollary 2. Given $f \in C_c(\mathbb{R})$ we can find $h \in H$ such that $\|f - h\|_p \le \varepsilon$ for $p = 1, 2$.

Proof. The $H_n(x)$'s form an orthonormal basis of $H$. Let $P$ be the projection onto the closure of $H$ in $L^2(\mathbb{R})$. The operator $T_t$ with kernel $K_t(x, y) = \sum t^n H_n(x)H_n(y)$ is Hilbert–Schmidt, with $\|T_t\| = 1$, and $T_t \to P$ in the weak operator topology as $t \uparrow 1$. Now let $f \in C_c(\mathbb{R})$ and set $f_t(x) = \int K_t(x, y)f(y)\,dy$. We claim that $\|f_t - f\|_\infty \to 0$ as $t \uparrow 1$. It follows that $(T_tf, g) \to (f, g)$ as $t \uparrow 1$ for $f, g \in C_c(\mathbb{R})$ and hence $P = I$, so that $H$ is dense in $L^2(\mathbb{R})$.


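To prove the claim note that

$$\int K_t(x, y)\,dy = \Big(\frac{2}{1 + t^2}\Big)^{1/2}\exp\Big(-\frac{x^2(1 - t^2)}{2(1 + t^2)}\Big) \to 1 \qquad (1)$$

uniformly for $x \in [-R, R]$ as $t \uparrow 1$. Moreover

$$K_t(x, y) = \frac{1}{\sqrt{\pi(1 - t^2)}}\exp\Big[-\frac{1+t}{1-t}\,\frac{(x - y)^2}{4} - \frac{1-t}{1+t}\,\frac{(x + y)^2}{4}\Big].$$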

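Thus if $\delta > 0$, as $t \uparrow 1$, we have

$$\int_{|x-y|\ge\delta} K_t(x, y)\,dy \le \frac{1}{\sqrt{\pi(1 - t^2)}}\exp\Big(-\frac{1+t}{1-t}\,\frac{\delta^2}{4}\Big)\int \exp\Big(-\frac{1-t}{1+t}\,\frac{x^2}{4}\Big)dx = \frac{2}{1 - t}\exp\Big(-\frac{1+t}{1-t}\,\frac{\delta^2}{4}\Big) \to 0 \qquad (2)$$

uniformly in $x$ as $t \uparrow 1$. Now suppose $f$ is supported in $[-R, R]$. Given $\varepsilon > 0$, use the uniform continuity of $f$ to choose $\delta > 0$ such that $|f(x) - f(y)| \le \varepsilon$ whenever $|x - y| \le \delta$. Then, since

$$f(x) - f_t(x) = \int K_t(x, y)(f(x) - f(y))\,dy + f(x)\Big(1 - \int K_t(x, y)\,dy\Big),$$

we have

$$|f(x) - f_t(x)| \le 2\|f\|_\infty\int_{|x-y|\ge\delta} K_t(x, y)\,dy + \varepsilon\int_{|x-y|\le\delta} K_t(x, y)\,dy + \|f\|_\infty \sup_{|x|\le R}\Big|1 - \int K_t(x, y)\,dy\Big|.$$

Hence $\|f - f_t\|_\infty \to 0$ as claimed.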

Thus functions $p(x)e^{-x^2/2}$, with $p$ polynomial, are dense in $L^2(\mathbb{R})$. Similarly, applying a scaling transformation, functions $p(x)e^{-x^2/4}$ are dense in $L^2(\mathbb{R})$. If $f \in C_c(\mathbb{R})$, choose $p$ such that $\|g\|_2$ is small where $g = e^{x^2/4}f - pe^{-x^2/4}$. Hence $\|f - pe^{-x^2/2}\|_p = \|e^{-x^2/4}g\|_p$ is small for $p = 1, 2$: for $p = 2$, it is $\le \|g\|_2$; while for $p = 1$, it is $\le \|e^{-x^2/4}\|_2\|g\|_2$, by the Cauchy–Schwarz inequality.

Remark. 1. The approximation of functions by Hermite functions can be proved using the theory of Sobolev spaces associated with the operator $D = -d^2/dx^2 + x^2$ on $H_0 = L^2(\mathbb{R})$. A particular consequence of this is that $f$ lies in $S(\mathbb{R})$ (i.e. $\sup |x^a f^{(b)}(x)| < \infty$ for all $a, b$) iff $f = \sum a_nH_n(x)$ with $(a_n)$ of rapid decay.

2. A stronger form of approximation than that in Corollary 2 can also be proved, following the method of Norbert Wiener: if $f \in C_c(\mathbb{R})$ then we can find $h \in H$ with $\|f - h\|_p$ small for $1 \le p \le \infty$.

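*Lemma 1. If $f \in C_c(\mathbb{R})$ and $f_t(x) = \int K_t(x, y)f(y)\,dy$, then $\|f_t - f\|_p \to 0$ as $t \uparrow 1$ for $1 \le p \le \infty$.

Proof. We have already proved the result above for $p = \infty$. To handle the case $1 \le p < \infty$, suppose $f$ is supported in $[-R, R]$; since $f_t \to f$ uniformly, it suffices to show that $\int_{|x|\ge 2R} |f_t(x)|^p\,dx \to 0$. But by Hölder's inequality, and since $\int K_t(x, y)\,dy \le \sqrt{2}$ by (1),

$$\int_{|x|\ge 2R}\Big(\int_{|y|\le R} K_t(x, y)|f(y)|\,dy\Big)^p dx \le \int_{|x|\ge 2R} dx\,\Big(\int K_t(x, y)\,dy\Big)^{p-1}\cdot\int_{|y|\le R} K_t(x, y)|f(y)|^p\,dy$$

$$\le 2^{(p-1)/2}\,\|f\|_\infty^p \int_{|y|\le R}\int_{|x|\ge 2R} K_t(x, y)\,dx\,dy,$$

which is small by (2) (set $\delta = R$).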

*Lemma 2 (avoidance of measure theory). Suppose that $f_n \to f$ uniformly on $\mathbb{R}$ and $(f_n)$ is a Cauchy sequence in $L^p$ for $1 \le p < \infty$. Then $\|f_n - f\|_p \to 0$.


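Proof. Take $n_0$ such that $\|f_n - f_m\|_p \le \varepsilon$ for $n, m \ge n_0$. Let $\chi_R$ be the characteristic function of $[-R, R]$. Then $\|\chi_R(f_n - f_m)\|_p \le \varepsilon$. Let $m \to \infty$. By uniform convergence on $[-R, R]$, $\|\chi_R(f_n - f)\|_p \le \varepsilon$. Letting $R \to \infty$, $\|f_n - f\|_p \le \varepsilon$ for $n \ge n_0$.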

*Lemma 3 (norm estimates). $\|H_n\|_2 = 1$, $\|H_n\|_1 \le \sqrt{\pi}\,(n + 3/2)^{1/2} \sim n^{1/2}$ and $\|H_n\|_\infty \le \frac12(2n + 3)^{1/2} \sim n^{1/2}$. Hence for any $p$, $\|H_n\|_p \le Cn^{1/2}$.

Remark. In fact Cramer proved that ‖Hn‖∞ ≤ 2; however, the estimate for ‖Hn‖1 cannot be improved.

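Proof. We have $\|H_n\|_2 = 1$ by definition. Since $DH_n = (2n+1)H_n$, we have

$$2n + 1 = (DH_n, H_n) = \|xH_n\|_2^2 + \|H_n'\|_2^2 = 2\|xH_n\|_2^2,$$

using the Fourier transform. So $\|xH_n\|_2^2 = (2n+1)/2$. Hence

$$\|H_n\|_1 = \int |H_n(x)|\,dx \le \|(1 + x^2)^{-1/2}\|_2\,\|(1 + x^2)^{1/2}H_n\|_2 = \sqrt{\pi}\,(n + 3/2)^{1/2}$$

and

$$\|H_n\|_\infty = \sup_x \frac{1}{\sqrt{2\pi}}\Big|\int \hat{H}_n(t)e^{itx}\,dt\Big| \le \frac{1}{\sqrt{2\pi}}\|\hat{H}_n\|_1 = \frac{1}{\sqrt{2\pi}}\|H_n\|_1.$$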

The estimates for other values of p follow by Jensen’s inequality, since log∫|f |s is a convex function of s.

*Lemma 4. $f_t - f_{t,N} \to 0$ in $L^p$ ($p = 1, 2, \infty$) as $N \to \infty$, if $f_{t,N} = \sum_{n=0}^N t^n(f, H_n)H_n(x)$.

Proof. By Mehler's formula

$$K_t(x, y) = \lim_{N\to\infty}\sum_{n=0}^N t^nH_n(x)H_n(y).$$

Since $|H_n(x)H_n(y)| \le K(n + 1)$, this series is uniformly convergent for fixed $t$ (since $\sum t^n(n + 1) = (1 - t)^{-2}$). Integrating over $[-R, R]$ against $f(y)\,dy$, we see that $f_{t,N} \to f_t$ uniformly. By Bessel's inequality $\sum |(f, H_n)|^2 \le \|f\|_2^2$. Hence

$$\sum t^n|(f, H_n)|\,\|H_n\|_p \le \|f\|_2\sum t^n\|H_n\|_p,$$

which is finite by Lemma 3. Hence $(f_{t,N})$ is a Cauchy sequence in $L^p$. By Lemma 2, $f_{t,N} \to f_t$ in $L^p$.

To prove the approximation result take $t$ such that $\|f_t - f\|_p \le \varepsilon/2$. Then choose $N$ such that $\|f_t - f_{t,N}\|_p \le \varepsilon/2$. Thus $\|f - f_{t,N}\|_p \le \varepsilon$.

The density of Hermite functions in the $L^p$ spaces can also be proved directly using a weakened version of the Stone–Weierstrass theorem to show that $(1 + x^2)H$ is uniformly dense in $C_0(\mathbb{R})$. It follows that if $f \in C_c(\mathbb{R})$, there is an $h \in H$ such that $\|(1 + x^2)(f - h)\|_\infty \le \varepsilon$. This implies that $\|f - h\|_p$ is small for $1 \le p \le \infty$. The same result can be proved directly by reducing the problem to that of Fourier series:

Theorem. $H$ is dense in $L^2(\mathbb{R})$ so that the functions $H_n$ form a complete set of eigenfunctions for the harmonic oscillator in $L^2(\mathbb{R})$.

Proof (Igusa). We will prove the stronger statement that any $f \in C_c(\mathbb{R})$ can simultaneously be approximated in every $L^p$ norm by a Hermite function $h \in H$.

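Step I: Given $b > 0$, any function $f \in C_c(\mathbb{R})$ can be uniformly approximated by functions in $\mathrm{lin}(e^{iax}e^{-bx^2} : a \in \mathbb{R})$. In fact set $g(x) = e^{bx^2}f(x)$. Let $M = \|g\|_\infty$ and choose $R$ sufficiently large that $f$ vanishes outside $[-R, R]$ and $\sup_{|x|\ge R}\exp(-bx^2) \le \varepsilon/(\varepsilon + M)$. Since $g$ vanishes outside $[-R, R]$, we can extend $g$ to a periodic function $\tilde{g}$ of period $2R$ agreeing with $g$ on $[-R, R]$. Now uniformly approximate $\tilde{g}$ by a Fourier series, so that

$$\Big|\tilde{g}(x) - \sum_{|n|\le N} a_n\exp(ix\pi n/R)\Big| \le \varepsilon \qquad (*)$$

for all $x$. Set $h(x) = \sum a_n\exp(ix\pi n/R)\exp(-bx^2)$. Then for $|x| \le R$, $|f(x) - h(x)| \le \varepsilon$ by $(*)$; while if $|x| \ge R$

$$|f(x) - h(x)| = |h(x)| \le |e^{bx^2}h(x)|\,\frac{\varepsilon}{M + \varepsilon} \le \Big(\Big|\sum a_n\exp(ix\pi n/R) - \tilde{g}(x)\Big| + |\tilde{g}(x)|\Big)\frac{\varepsilon}{\varepsilon + M} \le \varepsilon,$$

also by $(*)$. Hence $\|f - h\|_\infty \le \varepsilon$.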

Step II: For a given $b > 0$, any $f \in C_c(\mathbb{R})$ can be uniformly approximated by a function $p(x)e^{-bx^2}$ with $p(x)$ a polynomial. Clearly for $x \in \mathbb{R}$ we have

$$\Big|e^{ix} - \sum_{m=0}^{n-1}\frac{(ix)^m}{m!}\Big| \le \frac{|x|^n}{n!},$$

since the difference is less than $\sum_{m\ge n}|x|^m/m!$ which can be estimated using the integral form of the remainder in Taylor's theorem. Thus if we replace each $e^{iax}$ by its first $n$ terms, we are reduced to showing that $c_n = \sup_x |a|^n|x|^ne^{-bx^2}/n!$ tends to zero as $n \to \infty$. But $c_n = |a|^n(n/2b)^{n/2}e^{-n/2}/n!$. Since $\log n! \sim n\log n$, $\log c_n \to -\infty$ so that $c_n \to 0$ as $n \to \infty$.

Step III. Given $\varepsilon > 0$, we can find a polynomial $p(x)$ such that $|f(x)e^{x^2/4} - p(x)e^{-x^2/4}| \le \varepsilon$. Hence $|f(x) - p(x)e^{-x^2/2}| \le \varepsilon e^{-x^2/4}$. Now just take any $L^p$ norm.

Remarks. 1. All the above applies equally well with $\mathbb{R}$ replaced by $\mathbb{R}^n$; in fact the $n$–dimensional case is just an $n$–fold tensor product of the one–dimensional case.

2. The Hermite functions can be used to give a quick proof of the properties of the Fourier transform defined for $f \in H$ by

$$\hat{f}(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{-itx}\,dx.$$

Indeed integration by parts and differentiation under the integral sign show that

$$\widehat{Pf} = -Q\hat{f}, \qquad \widehat{Qf} = P\hat{f},$$

so that

$$\widehat{Xf} = iX\hat{f}, \qquad \widehat{Yf} = -iY\hat{f}.$$

Thus $X$ and $Y$ may be regarded as "eigenoperators" for the Fourier transform. Hence $\hat{F}_n = (-i)^nF_n$ so that $\hat{\hat{H}}_n(x) = (-1)^nH_n(x) = H_n(-x)$. It follows immediately that

$$\hat{\hat{f}}(x) = f(-x) \quad\text{and}\quad \|\hat{f}\|_2 = \|f\|_2 \quad\text{for } f \in H.$$

The results on approximation by Hermite functions lead to the usual results for the Fourier transform in $L^2(\mathbb{R})$ and the extension of the formula for the transform to $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

3. $iI$, $iP$ and $iQ$ form the basis of a Lie algebra under the Lie bracket $[A, B] = AB - BA$. The corresponding matrix group is the Heisenberg group consisting of matrices

$$\begin{pmatrix} 1 & p & r \\ 0 & 1 & q \\ 0 & 0 & 1 \end{pmatrix}$$

with $p, q, r \in \mathbb{R}$. Also $iP^2$, $iQ^2$, $i(QP + PQ)$ form a Lie algebra, that of $SL(2, \mathbb{R})$. If $X$ lies in the second algebra and $Y$ in the first, then $[X, Y]$ lies in the first algebra. The corresponding groups act on $L^2(\mathbb{R})$ with $SL(2, \mathbb{R})$ normalising the Heisenberg group. This is called the metaplectic representation of $SL(2, \mathbb{R})$ and is the bosonic version of quantisation that we already saw for fermions. Note that the Fourier transform corresponds to the Weyl element $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. The existence of this representation follows from the Stone–von Neumann theorem which states that this is essentially the only irreducible representation of the Heisenberg commutation relations. If $\mathbb{R}$ is replaced by a $p$–adic local field, we get the Shale–Weil metaplectic representation, which is intimately related to quadratic reciprocity and the modern theory of automorphic forms.
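The group law of the Heisenberg group mentioned in Remark 3 can be read off from matrix multiplication. The following small sketch is illustrative (the helper names `heis`, `matmul` and `inverse` are hypothetical). It checks that the product is $(p, q, r)(p', q', r') = (p + p', q + q', r + r' + pq')$ and that group commutators are central.

```python
def heis(p, q, r):
    """Upper unitriangular 3x3 matrix with entries p, q, r as in the text."""
    return ((1, p, r), (0, 1, q), (0, 0, 1))

def matmul(a, b):
    """Product of two 3x3 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def inverse(p, q, r):
    """Inverse of heis(p, q, r), obtained by solving the group law."""
    return heis(-p, -q, p * q - r)

g, h = heis(2, 3, 5), heis(-1, 4, 7)
print(matmul(g, h) == heis(2 - 1, 3 + 4, 5 + 7 + 2 * 4))

# The group commutator g h g^{-1} h^{-1} only involves the central entry r.
comm = matmul(matmul(g, h), matmul(inverse(2, 3, 5), inverse(-1, 4, 7)))
print(comm == heis(0, 0, 2 * 4 - (-1) * 3))
```

The central entry $pq' - p'q$ of the commutator is the group-level counterpart of the relation $[P, Q] = iI$.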

4. The unitary map $W : L^2(\mathbb{R}) \to F$ taking $H_n$ to $e_n$ carries the Hermite functions onto the polynomials in $z$ and carries the operator $2^{-1/2}X$ to $A = \partial/\partial z$. It is called the Bargmann transform; its existence is predicted by the Stone–von Neumann theorem. The harmonic oscillator corresponds to the operator $A^*A = z\,\partial/\partial z$ (up to an additive constant), so is just the operator $i\,\partial/\partial\theta$. Moreover the Fourier transform on $L^2(\mathbb{R})$ becomes the map $f(z) \mapsto f(-z)$ (see the exercises). It is also possible to define analogues of the real Clifford algebra relations. These satisfy $c(v)c(w) - c(w)c(v) = 2B(v, w)I$ where $B(v, w)$ is a non–degenerate symplectic form on an even dimensional space. This form arises as the imaginary part of a complex inner product on a complex space. The irreducible representations of the $c(v)$'s come from choosing a complex structure on $V$ and invoking the complex wave representation. The symplectic group acts by automorphisms of these relations and so can be quantised as in the fermionic case. This is the metaplectic Shale–Weil–Segal representation. There is also a map of the algebra generated by the $c(v)$'s onto the polynomial algebra of $V$ (the symbol map) and one is soon led into the area of Weyl quantisation. (The $c(v)$'s generate the Weyl algebra.)

13. MANIFOLDS, TANGENT VECTORS AND METRICS. A compact smooth $n$–dimensional manifold $M$ is a compact metric space with a finite cover $U_i$ by open sets homeomorphic with open subsets $V_i$ of $\mathbb{R}^n$ by maps $\phi_i : U_i \to V_i$ such that $\phi_j \circ \phi_i^{-1} : \phi_i(U_i \cap U_j) \to \phi_j(U_i \cap U_j)$ is smooth for all $i, j$. Clearly we can then cover $M$ by open sets $B_j \subset U_i$ homeomorphic to open balls in $\mathbb{R}^n$ such that the coordinate changes $\phi_j \circ \phi_i^{-1}$ are smooth. We say that $f \in C^\infty(M)$ if $f \circ \phi_i^{-1}$ is smooth for each coordinate map $\phi_i$.

Partitions of unity. Let $M$ be a compact manifold and $U_i$ an open cover of $M$. We can find functions $\psi_i \in C^\infty(M)$ with $0 \le \psi_i \le 1$, $\mathrm{supp}\,\psi_i \subset U_i$ and $\sum \psi_i = 1$.

Proof. Since $M$ is compact we can find finitely many open sets $B_j$ diffeomorphic to open balls with each $B_j$ contained entirely in some $U_i$ (put such a ball at each point of $M$ and then take a subcover). By compactness finitely many of these $B_j$ cover $M$. For each ball $B_j$ pick a bump function $h_j$. Then $\sum h_j > 0$ on $M$. Set $f_j = h_j/\sum h_i$. Then $\mathrm{supp}\,f_j \subset B_j$, $0 \le f_j \le 1$ and $\sum f_j(x) = 1$. Finally match each $B_j$ up to some $U_i$ containing it and set $\psi_i(x)$ equal to the sum of the $f_j(x)$'s associated to $U_i$.
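The construction in the proof is concrete enough to run. The sketch below illustrates it on the circle $\mathbb{R}/\mathbb{Z}$, under stated assumptions: the "balls" are arcs of a common radius about equally spaced centres, and the bump function is the usual $\exp(-1/(1 - s^2))$ on $|s| < 1$. All names are illustrative.

```python
import math

def bump(s):
    """Smooth function of s, positive for |s| < 1 and zero otherwise."""
    if abs(s) >= 1:
        return 0.0
    return math.exp(-1.0 / (1.0 - s * s))

def circle_dist(x, c):
    """Distance between x and c on the circle R/Z."""
    d = abs(x - c) % 1.0
    return min(d, 1.0 - d)

def partition_of_unity(x, centres, radius):
    """Values f_j(x) = h_j(x) / sum_i h_i(x), with h_j a bump on the arc about centres[j]."""
    h = [bump(circle_dist(x, c) / radius) for c in centres]
    total = sum(h)  # positive as long as the arcs cover the circle
    return [v / total for v in h]

centres = [k / 8 for k in range(8)]   # arcs of radius 0.1 about these points cover the circle
weights = partition_of_unity(0.3, centres, 0.1)
print(weights, sum(weights))
```

Dividing by the sum is exactly the step $f_j = h_j/\sum h_i$ of the proof; positivity of the denominator is where the covering property is used.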

Whitney's embedding/extension theorem. Any compact manifold $M$ admits a smooth embedding in $\mathbb{R}^N$ for $N$ sufficiently large. Any smooth function on $M$ extends to a smooth function on $\mathbb{R}^N$ (of compact support).

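Proof. Cover $M$ by balls $B_i$ with each $B_i$ contained in a larger ball $B_i'$. Take embeddings $\phi_i : B_i' \to \mathbb{R}^n$. Take a bump function $\chi_i$ in $C_c^\infty(B_i')$ equal to 1 on a neighbourhood of $B_i$. Let $\Phi(x) = (\chi_i(x), \chi_i(x)\phi_i(x))_i$ so that $\Phi$ is a $C^\infty$ map into $\mathbb{R}^N$ where $N = (n+1)m$ with $m$ the number of balls. The derivative is injective at each point, since this is true of the function $\phi_i$ on $B_i$. The map itself is injective, since if $x \in B_i$ and $\Phi(x) = \Phi(y)$, then $\chi_i(x) = \chi_i(y) = 1$, so that $y \in B_i'$. But then $\phi_i(x) = \phi_i(y)$, so that $x = y$.

To construct the extension, take $\psi_i$ a partition of unity subordinate to $B_i$ and, for $f \in C^\infty(M)$, set $F(y) = \sum \psi_i \circ \phi_i^{-1}(y_{2i}) \cdot f \circ \phi_i^{-1}(y_{2i})$, where $y_{2i} \in \mathbb{R}^n$ denotes the block of coordinates of $y$ corresponding to $\chi_i\phi_i$. By construction $F \in C_c^\infty(\mathbb{R}^N)$ and $F(\Phi(x)) = \sum \psi_i(x)f(x) = f(x)$ for $x \in M$.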

Tangent vectors and vector fields. By definition a tangent vector $L$ at $x \in M$ is a linear map $L : C^\infty_{\mathbb{R}}(M) \to \mathbb{R}$ satisfying the Leibniz derivation rule $L(fg) = L(f)g(x) + f(x)L(g)$. The linear space of such $L$ is denoted by $T_xM$ and called the tangent space at $x$. A vector field on $M$ is a linear map $X : C^\infty(M) \to C^\infty(M)$ satisfying the Leibniz derivation rule $X(fg) = (Xf)g + f(Xg)$. The space of vector fields is denoted by $\mathrm{Vect}(M)$.

Proposition. (a) If $X$ is a vector field, $X_x(f) = (Xf)(x)$ defines a tangent vector at $x$ such that for each $f \in C^\infty(M)$, $g(x) = X_x(f)$ is smooth. Conversely any smooth family of tangent vectors forms a vector field.

(b) Vect(M) is a C∞(M)–module.

(c) Vect(M) is closed under the Lie bracket [X,Y ]f = XY f − Y Xf .

Proof. A simple exercise.


Local expressions for tangent vectors. Given local coordinates $(x_1, x_2, \ldots, x_n)$ near $a$, we can define tangent vectors at $a$ by $L(f) = \partial_if(a)$. (More properly we should write $\partial_i(f \circ \phi^{-1})(\phi(a))$, but normally one identifies functions on $U$ and $\phi(U)$.)

Proposition. The tangent vectors $f \mapsto \partial_if(a)$ are a basis of $T_aM$. Thus $\sum a_i\partial_i|_a$ gives a typical tangent vector at $a$. Vector fields have the form $Xf(x) = \sum g_i(x)\partial_if(x)$ in local coordinates with $g_i$ smooth. Under change of coordinate $x = x(\xi)$, $X = \sum g_i(x(\xi))\,\frac{\partial\xi_j}{\partial x_i}\,\frac{\partial}{\partial\xi_j}$.

Moral. Vector fields are first order differential operators on C∞(M).

Proof. We first observe that if $f \in C^\infty(\mathbb{R}^n)$ and $f(0) = 0$, then $f(x) = \sum f_i(x)x_i$ with $f_i \in C^\infty(\mathbb{R}^n)$. In fact since $\frac{d}{dt}f(tx) = \sum x_if_{x_i}(tx)$, we have $f(x) = \sum x_i\int_0^1 f_{x_i}(tx)\,dt$ so we may take $f_i(x) = \int_0^1 f_{x_i}(tx)\,dt$ (smoothness follows because we can differentiate under the integral sign).

Let $B$ be a neighbourhood of $a$ diffeomorphic to a ball. Take $\psi$ a bump function in $C_c^\infty(B)$ equal to 1 near $a$. Applying the Leibniz rule at $a$ to $(1 - \psi)f = (1 - \psi)^{1/2}\cdot(1 - \psi)^{1/2}f$, we get $L(f) = L(\psi f)$ for all $f \in C^\infty(M)$. By the initial observation, $\psi(x)f(x) - f(a) = \sum(x_i - a_i)f_i(x)$ for $f_i \in C^\infty(\mathbb{R}^n)$. Clearly $f_i(a) = \partial_i(\psi f)(a) = \partial_if(a)$. Since $L(1) = 0$, we get

$$L(f) = L(\psi^3f) = f(a)L(\psi^2) + \sum L(\psi^2(x_i - a_i)f_i) = f(a)L(1) + \sum L(\psi(x_i - a_i))f_i(a) = \sum \alpha_i\partial_if(a),$$

where $\alpha_i = L(\psi(x_i - a_i))$ is independent of $f$.

Tangent vectors to curves. A smooth curve $c : (a, b) \to M$ is a map which is smooth in any local chart. The tangent vector $\dot{c}(t)$ to the curve at $x = c(t)$ is $\dot{c}(f)(t) = \frac{d}{dt}f(c(t))$. In terms of local coordinates, if $c(t) = (c_i(t))$, then $\dot{c}(t)f = \frac{d}{dt}f(c(t)) = \sum \dot{c}_i\,\partial_if(c(t))$.

Geometric realisation of tangent space. If $u : M \to \mathbb{R}^N$ is an embedding, the usual physical tangent space to $M$ at $x$ consists of vectors $u(x) + v$ where $v$ lies in an $n$–dimensional subspace. The map $v = L(u)$ gives a natural isomorphism between $T_xM$ and this physical space. In particular if $c : (a, b) \to M$ is a curve with $c(t) = x$, then $\dot{c}(t) = \dot{c}(u) = du(c(t))/dt$ is the usual tangent vector to the curve in $\mathbb{R}^N$; and if $x_i$ are local coordinates, the tangent vector $\partial/\partial x_i$ goes to $\partial u/\partial x_i$. Thus the formal definition agrees with geometric intuition!

Riemannian metrics. By definition a metric on a compact manifold is an assignment of an inner product $g_x$ to each tangent space $T_xM$; the inner product should vary smoothly, so that whenever $X, Y$ are vector fields on $M$, $x \mapsto g_x(X_x, Y_x)$ is smooth. In local coordinates in $\mathbb{R}^n$, this means that we give an invertible positive definite symmetric matrix $g(x) = (g_{ij}(x))$ so that if $X = \sum a_i(x)\partial/\partial x_i$ and $Y = \sum b_i(x)\partial/\partial x_i$ are vector fields, $g_x(X_x, Y_x) = \sum_{ij} a_i(x)b_j(x)g_{ij}(x)$. Under a coordinate change $x = x(\xi)$, $g$ transforms to $g_{pq}(\xi) = \sum_{ij} g_{ij}\,\frac{\partial x_i}{\partial\xi_p}\frac{\partial x_j}{\partial\xi_q}$. Any embedding $u : M \to \mathbb{R}^N$ induces a metric on $M$ because of the identification of tangent spaces. Thus in local coordinates $g_{ij}(x) = g_x(\partial_i, \partial_j) = \partial_iu\cdot\partial_ju$.
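As an illustration of the induced metric $g_{ij} = \partial_iu\cdot\partial_ju$, the sketch below computes it by central differences for the unit sphere in $\mathbb{R}^3$ with spherical coordinates $(\theta, \phi)$, where the expected answer is $\mathrm{diag}(1, \sin^2\theta)$. The embedding, step size and function names are choices made for this example, not taken from the notes.

```python
import math

def embed(theta, phi):
    """Embedding u of the unit sphere in R^3 using spherical coordinates."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def partial(u, point, i, h=1e-5):
    """Central-difference approximation to the partial derivative of u in coordinate i."""
    up, dn = list(point), list(point)
    up[i] += h
    dn[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(u(*up), u(*dn))]

def induced_metric(u, point):
    """Matrix g_ij = (d_i u) . (d_j u) at the given coordinate point."""
    d = [partial(u, point, i) for i in range(len(point))]
    return [[sum(a * b for a, b in zip(di, dj)) for dj in d] for di in d]

g = induced_metric(embed, (1.1, 0.4))
print(g, math.sin(1.1) ** 2)
```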

Nash's embedding theorem. Every compact Riemannian manifold admits an embedding in some $\mathbb{R}^N$ which is isometric, i.e. the metric on $M$ is the one induced by the embedding.

Proof. By Whitney's embedding theorem $M$ can be embedded in $\mathbb{R}^N$ by a map $u$. The metric defines an inner product on each tangent space to $M$. Take the euclidean inner product on the orthogonal complement of each tangent space. This gives a smooth map $g(x)$ from $M$ into the positive definite $N \times N$ matrices. Set $h(x) = g(x)^{1/2}$. Extend $h$ to $H \in C_c^\infty(\mathbb{R}^N, M_N(\mathbb{R}))$. Let $\psi$ be a bump function equal to 1 on $M$ and with $\mathrm{supp}(\psi) \subset \mathrm{supp}(H)$. Since $\psi$ and $H$ are supported in some cube $[-R, R]^N$, we may regard them as functions on $T^N$ and $u$ as an embedding in $T^N$. Then $G = \psi H^tH + (1 - \psi)$ is a metric on $T^N$ extending $g$, so can be realised by an embedding $v$ in $\mathbb{R}^M$. The required isometric embedding is obtained by restricting $v \circ u$ to $M$.

14. GEODESICS AND NORMAL COORDINATES. If $M$ is a Riemannian manifold and $c : [0, 1] \to M$ is a piecewise smooth path, we define the length of $c$ by $\ell(c) = \int_0^1 \|\dot{c}(t)\|\,dt$, where the norm $\|\dot{c}(t)\|$ is computed using the inner product on $T_{c(t)}M$. By the chain rule, the length is independent of the parametrisation of the path.


The Euler equations for a geodesic. We work in a local coordinate patch $U$ around 0 in $\mathbb{R}^n$. Consider a smooth curve $c : [0, 1] \to U$ and smooth metric $g(x)$. We shall assume in addition that $\dot{c}$ is nowhere vanishing. The length of $c$ is just $\int_0^1 (g(c)\dot{c}, \dot{c})^{1/2}\,dt$. This does not change under reparametrisation, so we shall reparametrise proportionally to arclength. This means that $(g(c)\dot{c}, \dot{c})^{1/2} = L$ is constant. We want to look at paths that minimise the length, i.e. which are critical points subject to the end points being fixed, $c(0) = 0$ and $c(1) = x$. Assume that $c$ is parametrised proportionally to arclength. If we replace $c$ by $c + \varepsilon v$ where $v(0) = 0 = v(1)$, we find, using $\dot{g} = \sum \dot{c}_i\,\partial g/\partial x_i$,

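$$(\ell(c + \varepsilon v) - \ell(c))/\varepsilon = (2L)^{-1}\int_0^1 ((v\cdot\nabla)g(c)\dot{c}, \dot{c}) + 2(g(c)\dot{v}, \dot{c})\,dt + O(\varepsilon)$$

$$= (2L)^{-1}\int_0^1 ((v\cdot\nabla)g\,\dot{c}, \dot{c}) - 2(\dot{g}v, \dot{c}) - 2(gv, \ddot{c})\,dt + O(\varepsilon).$$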

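Thus for a critical point we require $\int_0^1 \sum_k v_k\big(\sum_i 2g_{ik}\ddot{c}_i + \sum_{ij} 2\partial_jg_{ik}\,\dot{c}_i\dot{c}_j - \partial_kg_{ij}\,\dot{c}_i\dot{c}_j\big)\,dt = 0$. Since $v_k$ is arbitrary, $\sum_i g_{ki}\ddot{c}_i + \sum_{ij}\big(\partial_jg_{ik}\,\dot{c}_i\dot{c}_j - \frac12\partial_kg_{ij}\,\dot{c}_i\dot{c}_j\big) = 0$. Symmetrising in $i$ and $j$, we deduce Euler's equations:

$$\ddot{c}_k + \sum \Gamma^k_{ij}\,\dot{c}_i\dot{c}_j = 0,$$

where the Christoffel symbols are given by $\Gamma^k_{ij} = \frac12\sum_\ell g^{k\ell}(\partial_jg_{i\ell} + \partial_ig_{j\ell} - \partial_\ell g_{ij})$ and $(g^{ij}(x)) = g(x)^{-1}$. We say that a curve $c(t)$ is a geodesic if it satisfies Euler's equations.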

Lemma. If c(t) is a geodesic, the parameter t is proportional to arclength.

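Proof. If $a_{ijk} = \frac12(\partial_ig_{jk} + \partial_jg_{ik} - \partial_kg_{ij})$, then Euler's equations read $(g\ddot{c})_k = -\sum_{ij} a_{ijk}\,\dot{c}_i\dot{c}_j$, so that

$$\frac{d}{dt}(g(c(t))\dot{c}, \dot{c}) = (\dot{g}\dot{c}, \dot{c}) + 2(g\ddot{c}, \dot{c}) = \sum \partial_kg_{ij}\,\dot{c}_i\dot{c}_j\dot{c}_k - 2\sum a_{ijk}\,\dot{c}_i\dot{c}_j\dot{c}_k = \sum \partial_kg_{ij}\,\dot{c}_i\dot{c}_j\dot{c}_k - \sum \partial_kg_{ij}\,\dot{c}_i\dot{c}_j\dot{c}_k = 0.$$

Hence $ds/dt = (g(c(t))\dot{c}, \dot{c})^{1/2}$ is constant.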
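The lemma can be observed numerically. The sketch below integrates Euler's equations on the round unit sphere with a classical Runge–Kutta step and monitors $(g\dot{c}, \dot{c})$ along the solution. The assumptions are labelled here: the metric is $\mathrm{diag}(1, \sin^2\theta)$ in coordinates $(\theta, \phi)$, with the standard Christoffel symbols $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$; the integrator and step size are arbitrary choices.

```python
import math

def geodesic_rhs(state):
    """Euler's equations on the unit sphere as a first order system (theta, phi, dtheta, dphi)."""
    th, ph, dth, dph = state
    return (dth, dph,
            math.sin(th) * math.cos(th) * dph * dph,
            -2.0 * math.cos(th) / math.sin(th) * dth * dph)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, s):
        return tuple(a + s * b for a, b in zip(state, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, h / 2))
    k3 = geodesic_rhs(shifted(k2, h / 2))
    k4 = geodesic_rhs(shifted(k3, h))
    return tuple(a + h / 6 * (p + 2 * q + 2 * r + s)
                 for a, p, q, r, s in zip(state, k1, k2, k3, k4))

def speed_squared(state):
    """(g(c) c', c') for the round metric diag(1, sin^2 theta)."""
    th, _, dth, dph = state
    return dth * dth + math.sin(th) ** 2 * dph * dph

state = (1.0, 0.0, 0.3, 0.5)
start = speed_squared(state)
for _ in range(1000):
    state = rk4_step(state, 0.002)
print(start, speed_squared(state))
```

The initial data are chosen so that the geodesic stays away from the poles, where these coordinates are singular.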

Theorem (exponential map). If $M$ is a compact Riemannian manifold and $p \in M$, then there exists $\delta > 0$ such that if $X \in T_pM$ with $\|X\| < \delta$, there is a unique geodesic $c_X : (-2, 2) \to M$ such that $c_X(0) = p$ and $\dot{c}_X(0) = X$. If $|s| \le 1$, then $c_{sX}(t) = c_X(st)$. Moreover $\exp_p(X) = c_X(1)$ defines a diffeomorphism between $B_p = \{X \in T_pM : \|X\| < \delta\}$ and an open neighbourhood of $p$ in $M$.

The map $(x, X) \mapsto (x, \exp_x(X))$ is smooth and locally a diffeomorphism onto a neighbourhood of $(x, x) \in M \times M$.

Proof. Consider the second order ODE $\ddot{c}_k + \sum \Gamma^k_{ij}(c(t))\dot{c}_i\dot{c}_j = 0$ with initial conditions $c(0) = 0$ and $\dot{c}(0) = X$. This is equivalent to the first order system of ODEs $\dot{c}_i = v_i$, $\dot{v}_k = -\sum_{ij}\Gamma^k_{ij}(c)v_iv_j$ with initial conditions $c_i(0) = 0$, $v_i(0) = X_i$. We quote the following result on ODEs, proved in the appendix:

"Let $f(t, x)$ be $C^\infty$ on $|t - t_0| \le a$, $\|x - y\| \le b$ where $x \in \mathbb{R}^n$. Then $\dot{x}(t) = f(t, x)$, $x(t_0) = x_0$ has a unique solution for $|t - t_0|$ sufficiently small and the solution depends smoothly on $t$ and $x_0$."

Thus there exist $\varepsilon, \delta > 0$ such that for $\|X\| \le \delta$ and $|t| < \varepsilon$ these equations have a unique solution depending smoothly on $X$ and $t$. If $c(t)$ satisfies Euler's equations so too does $c(st)$ for $|s| \le 1$. Looking at the initial conditions, it follows that $c_X(st) = c_{sX}(t)$. We set $c_0(t) \equiv 0$. Using the homogeneity condition, we see that we may assume that $\varepsilon = 2$ provided we make $\delta$ sufficiently small. Define $\exp_p(X) = c_X(1)$. This gives a smooth map of $B_p = \{X \in T_pM : \|X\| < \delta\}$ into $M$. By the inverse function theorem, the proof will be completed once we have shown that the derivative of this map at 0 is just the identity. But

$$\lim_{h\to 0}\frac{c_{hX}(1) - c_0(1)}{h} = \lim_{h\to 0}\frac{c_X(h)}{h} = \dot{c}_X(0) = X.$$

So the derivative is the identity.

To prove the last assertion, note that smoothness follows because the solution of the system of ODEs $\dot{c}_i = v_i$, $\dot{v}_k = -\sum_{ij}\Gamma^k_{ij}(c)v_iv_j$ with initial conditions $c_i(0) = x_i$, $v_i(0) = X_i$ depends smoothly on $X$ and $x$. The derivative of the map $(x, X) \mapsto (x, \exp_x(X))$ at $(0, 0)$ has the form $\begin{pmatrix} I & * \\ 0 & I \end{pmatrix}$, so this map is a local diffeomorphism by the inverse mapping theorem.

We get normal geodesic coordinates at $p$ by using the map $\exp_p : B_p \to M$ (or its inverse) to define a chart or coordinate system near $p$. If $\|X\|$ is the norm on $T_pM$, then $g(0) = I$ and the lines $t \mapsto tX$ are geodesics through 0. Hence $ds/dt = (g(xt)x, x)^{1/2} = k$ for some $k = k(x)$. Taking $t = 0$ we get $k = \|x\|$ so that $(g(xt)x, x) = \|x\|^2$ or equivalently $(g(x)x, x) = \|x\|^2$.

Gauss’ Lemma. In normal geodesic coordinates g(x)x = x.

Proof. Pick $x \ne 0$ and $b \perp x$ with $\|b\| = \|x\|$. Let $f_s(t) = (x\cos s + b\sin s)t$ be the geodesic from 0 to $x\cos s + b\sin s$. Let $u = x\cos s + b\sin s$ and $v = du/ds = -x\sin s + b\cos s$. By Euler's equation, $\sum \Gamma^k_{ij}u_iu_j = 0$, so that $2\sum_{ij}\partial_jg_{ik}u_iu_j = \sum_{ij}\partial_kg_{ij}u_iu_j$. Thus $\sum v_k(\partial_k(g)u, u) = 2\sum u_k(\partial_k(g)u, v)$. On the other hand

$$(g(tu)u, u) = \|x\|^2 \qquad (*)$$

for all $s$ and $t$ (since $(g(c)\dot{c}, \dot{c})$ is independent of $t$). Now if $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$, then by the chain rule

$$\frac{\partial g(ut)}{\partial t} = \sum u_j\frac{\partial g}{\partial x_j}, \qquad \frac{\partial g(tu(s))}{\partial s} = \sum tv_j\frac{\partial g}{\partial x_j}.$$

Differentiating $(*)$ with respect to $s$, we get

$$0 = \sum_k tv_k(\partial_k(g)u, u) + 2(gu, v) = 2\sum_{ijk} tv_k\,\partial_jg_{ik}u_iu_j + 2(gu, v) = 2t\,\partial_t(gu, v) + 2(gu, v).$$

Thus $\partial_t(t(gu, v)) = 0$ and hence $t(gu, v)$ is independent of $t$, so zero (setting $t = 0$). Hence $(gu, v) = 0$. Setting $s = 0$ and $t = 1$, we get $(g(x)x, b) = 0$. This is true for all $b \perp x$, so that $g(x)x = \lambda x$ for some $\lambda$. Since $(g(x)x, x) = \|x\|^2$, we get $\lambda = 1$ and therefore $g(x)x = x$.

Corollary 1 (geodesics minimise length). In normal geodesic coordinates, the path $t \mapsto tx$ is the shortest path in $M$ from 0 to $x$ for $\|x\| \le \delta$. A path from 0 to $x$ has minimal length $\|x\|$ only if it lies on the line $tx$; if it is parametrised by arclength it attains the minimum iff it is $t \mapsto tx$.

Proof. The path $c(t) = tx$ has length $\|x\|$, so that $d(0, x) \le \|x\|$. Now we claim that any path $c(t)$ from 0 to $x$ satisfies $\ell(c) \ge \|x\|$. Since we can apply this to the portion of the path up to when it first hits the shell of radius $\|x\|$, we may assume $\|c(t)\| \le \|x\|$. Write $c(t) = r(t)v(t)$ with $\|v(t)\| = 1$ and $r(t) > 0$. Then $\dot{c} = \dot{r}v + r\dot{v}$. Since $\|v(t)\| = 1$, $\dot{v}(t) \perp v(t)$. So by Gauss' lemma, $(g(c)v, \dot{v}) = (v, \dot{v}) = 0$. Hence $(g\dot{c}, \dot{c}) = \dot{r}^2 + r^2(g\dot{v}, \dot{v})$. So

$$\ell(c) = \int_0^1 (g\dot{c}, \dot{c})^{1/2}\,dt \ge \int_0^1 |\dot{r}|\,dt \ge \Big|\int_0^1 \dot{r}(t)\,dt\Big| = r(1) - r(0) = \|x\|.$$

So $\ell(c) \ge \|x\|$. Equality occurs iff $\dot{v}(t) = 0$ and $\dot{r}(t)$ always has the same sign. Thus $v(t)$ is constant and the path is just the segment between 0 and $x$.

Corollary 2. $d(x, y) = \inf\{\ell(c) : c \text{ is a piecewise smooth curve from } x \text{ to } y\}$ defines a metric on $M$ giving the usual topology. In geodesic coordinates $d(0, x) = \|x\|$ for $\|x\|$ sufficiently small.

Proof. This is immediate from corollary 1.

Corollary 3. In normal geodesic coordinates $\partial_kg(0) = 0$ and $\Gamma^k_{ij}(0) = 0$. Hence $g_{ij}(x) = \delta_{ij} + O(\|x\|^2)$.

Proof. Applying the radial vector field $\sum x_i\partial_i$ to the identity $g(x)x = x$, we get $\sum x_k(\partial_kg)(x)x = 0$. Replacing $x$ by $tx$, we get $\sum x_k(\partial_kg)(tx)x = 0$. Setting $t = 0$, we get $\sum x_k(\partial_kg)(0)x = 0$. Thus $\partial_kg_{ij}(0) + \partial_jg_{ik}(0) = 0$. Set $a_{ijk} = \partial_kg_{ij}(0)$. Then $a_{ijk} = a_{jik}$ and $a_{ijk} = -a_{ikj}$. Since $(1, 2)$ and $(2, 3)$ generate $S_3$, we must have $a_{ijk} = 0$; indeed $a_{ijk} = -a_{ikj} = -a_{kij} = a_{kji} = a_{jki} = -a_{jik} = -a_{ijk}$. Thus $\partial_kg(0) = 0$. The formula for $\Gamma^k_{ij}$ immediately gives $\Gamma^k_{ij}(0) = 0$.

Remark. Although we shall not need this, it is not hard to show that the quadratic correction is given by what will turn out to be the Riemannian curvature: $\partial^2_{pq}g_{ij}(0) = \frac23R_{ipqj}$ where $R_{ipqj} = \partial_q\Gamma^i_{jp}(0) - \partial_j\Gamma^i_{qp}(0)$.


15. HERMITIAN VECTOR BUNDLES AND PROJECTIONS. Note that by the embedding theorems, an embedding of $M$ in $\mathbb{R}^N$ exhibits the tangent bundle as a subbundle of the trivial bundle $M \times \mathbb{R}^N$. Let $p(x)$ be the orthogonal projection onto $T_xM \subset \mathbb{R}^N$, a rank $n$ projection. Then $p$ is a smooth function from $M$ into $M_N(\mathbb{R})$ and a vector field on $M$ is an element $\xi \in C^\infty(M, \mathbb{R}^N)$ such that $p\xi = \xi$. For the Nash embedding the natural inner product on $p(x)\mathbb{R}^N$ agrees with the one given by the Riemannian metric.

This can be generalised to more general projection–valued $p \in M_k(C^\infty(M))$, which define hermitian vector bundles. The space of sections is given by $\xi \in C^\infty(M, \mathbb{C}^k)$ such that $p\xi = \xi$. This is evidently a $C^\infty(M)$-submodule of $C^\infty(M)^k$ with a $C^\infty(M)$-valued inner product given pointwise by $(\xi(x), \eta(x))$. It has $(I - p)C^\infty(M)^k$ as a direct complement. Every finitely generated projective $C^\infty(M)$-module has this form.

It is often more convenient to have a local description of the vector bundle in terms of transition matrices. For the tangent bundle, these are given by the derivative of the coordinate changes $\phi_i \circ \phi_j^{-1}$, a smooth map $g_{ij} : U_i \cap U_j \to GL_n(\mathbb{R})$.

To prove the existence of these transition matrices we need:

Lemma. Let $p : B(0, r) \to M_n(\mathbb{C})$ be a smooth map such that $p(x)$ is an orthogonal projection. Then we can find $U : B(0, \delta) \to U(n)$ smooth such that $p(x) = U(x)p(0)U(x)^*$.

Proof. We give a proof based on parallel transport (see below). Let $F(x) = I - 2p(x)$. Thus $F(x)^* = F(x)$ and $F(x)^2 = I$. Fix $x_0 \in V$ with $\|x_0\| = 1$ and set $f(t) = F(tx_0)$. Consider the ODE $\dot{g}(t) = h(t)g(t)$, $g(0) = I$, where $h = \frac12\dot{f}f^{-1}$. Clearly $f^* = f$ and $f\dot{f} + \dot{f}f = 0$. Hence $h^* = -h$ and $\dot{g}^* = -g^*h$. So $\frac{d}{dt}(g^*g) = -g^*hg + g^*hg = 0$. Thus $g(t)^*g(t) = g(0)^*g(0) = I$, so that $g$ is unitary. We now claim that $g^{-1}fg = f(0)$, so that $p(tx_0) = g(t)p(0)g(t)^*$. Since $g$ depends smoothly on the initial data $x_0$, we get a smooth family $U(x)$ for $\|x\| < r$ such that $p(x) = U(x)p(0)U(x)^*$, where $U(tx_0) = g_{x_0}(t)$. To prove the claim, note that

$$\frac{d}{dt}(g^{-1}fg) = -g^{-1}\dot{g}g^{-1}fg + g^{-1}\dot{f}g + g^{-1}f\dot{g} = g^{-1}(-\dot{g}g^{-1}f + \dot{f} + f\dot{g}g^{-1})g = 0,$$

since $\dot{g}g^{-1} = h = \frac12\dot{f}f = -\frac12f\dot{f}$.

Differential equations on a matrix group. Let G ⊂ GL(V ) be a closed subgroup with Lie algebra g.

Proposition. If $A(t) : (a, b) \to \mathfrak{g}$ is smooth and $c \in (a, b)$, there is a unique smooth map $g : (a, b) \to G$ such that $\dot{g}(t) = A(t)g(t)$ and $g(c) = I$. If $A$ varies smoothly, so does $g$.

Proof. Suppose firstly that $G = GL(V)$. Then there is a unique solution of $\dot{g}(t) = A(t)g(t)$ and $g(c) = I$ with $g(t) \in \mathrm{End}(V)$. Similarly there is a unique solution to $\dot{h}(t) = -h(t)A(t)$ and $h(c) = I$ with $h(t) \in \mathrm{End}(V)$. But then $\frac{d}{dt}(hg) = 0$, so that $h(t)g(t) \equiv I$. If $G = O(V)$ or $U(V)$, we have $A(t)^* = -A(t)$ and $h_1(t) = g(t)^*$ would therefore satisfy the same equations as $h$. By uniqueness, $g^* = g^{-1}$, so $g$ is orthogonal or unitary. Finally if $A$ depends smoothly on an additional parameter $y$, we introduce $Y$ as an additional variable with $\dot{Y} = 0$, $Y(c) = y$ and apply the smoothness result for ODEs to this enlarged system.

For general G, we require the following lemma:

Lemma. If $\mathrm{ad}(x)y = [x, y]$ and $f : (a, b) \to \mathfrak{g}$, then

$$e^{-f}\frac{d}{dt}(e^f) = \frac{I - e^{-\mathrm{ad}\,f}}{\mathrm{ad}\,f}\,\frac{df}{dt}.$$

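Proof (Alex Selby). For $A, B \in \mathrm{End}(V)$, we have $\frac{d}{dt}e^{-At}e^{Bt} = e^{-At}(B - A)e^{Bt}$, so that integrating we get Duhamel's formula:

$$e^{-A}e^B - I = \int_0^1 e^{-At}(B - A)e^{Bt}\,dt.$$

We use this to compute the coefficient of $\varepsilon$ in $e^{-a}e^{a+\varepsilon x}$, setting $A = a$ and $B = a + \varepsilon x$. Note that $e^{-\mathrm{ad}(at)}x = e^{-at}xe^{at}$, since they both satisfy the differential equation $\dot{f}(t) = -[a, f(t)]$ with $f(0) = x$. We get

$$(e^{-a}e^{a+\varepsilon x} - I)/\varepsilon = \int_0^1 e^{-at}xe^{at}\,dt + O(\varepsilon) = \int_0^1 e^{-\mathrm{ad}(a)t}\cdot x\,dt + O(\varepsilon) = \Big(\frac{I - e^{-\mathrm{ad}\,a}}{\mathrm{ad}\,a}\Big)\cdot x + O(\varepsilon).$$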


This immediately gives the formula for the derivative.
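The first–order expansion used in this proof can be tested numerically for small matrices. In the sketch below, $\big((I - e^{-\mathrm{ad}\,a})/\mathrm{ad}\,a\big)x$ is evaluated through its power series $\sum_{k\ge 0}(-\mathrm{ad}\,a)^kx/(k+1)!$ and compared with the finite difference $(e^{-a}e^{a+\varepsilon x} - I)/\varepsilon$. The matrix exponential is computed by a truncated Taylor series, which is adequate only for matrices of modest norm; all names, sizes and tolerances are illustrative.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(a, b, s=1.0):
    """Return a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(a, s):
    return [[s * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(a, terms=40):
    """Matrix exponential by its Taylor series (fine for small matrices of modest norm)."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / k)
        result = mat_add(result, term)
    return result

def ad(a, x):
    """ad(a) x = [a, x]."""
    return mat_add(mat_mul(a, x), mat_mul(x, a), -1.0)

def series_side(a, x, terms=40):
    """((I - e^{-ad a}) / ad a) x = sum_{k>=0} (-ad a)^k x / (k+1)!."""
    total, term, fact = [[0.0] * len(a) for _ in a], x, 1.0
    for k in range(terms):
        fact *= (k + 1)
        total = mat_add(total, term, 1.0 / fact)
        term = mat_scale(ad(a, term), -1.0)
    return total

def finite_difference_side(a, x, eps=1e-6):
    """(e^{-a} e^{a + eps x} - I) / eps."""
    prod = mat_mul(mat_exp(mat_scale(a, -1.0)), mat_exp(mat_add(a, x, eps)))
    return mat_scale(mat_add(prod, identity(len(a)), -1.0), 1.0 / eps)

a = [[0.2, 0.5], [-0.3, 0.1]]
x = [[0.0, 1.0], [0.4, -0.2]]
print(series_side(a, x))
print(finite_difference_side(a, x))
```

The two printed matrices should agree up to the $O(\varepsilon)$ error of the finite difference.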

Corollary. $\frac{d}{dt}(e^f)\,e^{-f} = e^{\mathrm{ad}\,f}\big[(1 - e^{-\mathrm{ad}\,f})/\mathrm{ad}\,f\big]\dot{f} = \big[(e^{\mathrm{ad}\,f} - 1)/\mathrm{ad}\,f\big]\dot{f}$.

Since replacing $g(t)$ by $g_0^{-1}g(t)$ and $A(t)$ by $g_0^{-1}A(t)g_0$ preserves the ODE, it suffices to prove $g(t)$ lies in $G$ near $t = 0$. Write $g(t) = \exp(f(t))$ for $t$ small. The corollary shows that $\dot{f}(t) = h(\mathrm{ad}(f(t)))A(t)$, where $h$ is the power series $h(x) = x/(e^x - 1)$. Because $\mathfrak{g}$ is closed under Lie brackets, the right hand side lies in $\mathfrak{g}$ and this is an ODE with values in $\mathfrak{g}$. Hence its solution $f(t)$ lies in $\mathfrak{g}$ (by uniqueness), so that $g(t) = \exp f(t)$ lies in $G$.

Now if we fix a rank $k$ projection $p_0$ and balls $B_i$ covering $M$, for each $i$ we can find $U_i(x)$ unitary such that $p(x) = U_i(x)p_0U_i(x)^*$ on $B_i$. Now we set $g_{ij}(x) = U_j^*U_i$ on the image of $p_0$, $V$ say. From the definition, $g_{ij}(x)g_{ji}(x) = I$ on $B_i \cap B_j$ and $g_{ij}g_{jk}g_{ki} = I$ on $B_i \cap B_j \cap B_k$. Such a collection of $C^\infty$ functions is called a system of transition matrices. A section of the bundle corresponds to local maps $\xi_i : B_i \to \mathbb{R}^m$ or $\mathbb{C}^m$ such that $g_{ij}\xi_i = \xi_j$ on $B_i \cap B_j$.

Conversely any system of unitary transition matrices can be used to define a vector bundle, i.e. a projection. In fact let $\psi_i$ be a partition of unity subordinate to the $B_i$'s and consider
\[ \{(\psi_1^{1/2}\xi_1, \dots, \psi_n^{1/2}\xi_n) : g_{ij}\xi_i = \xi_j\}. \]
At each point x of M this defines a k–dimensional subspace of $\mathbb{C}^m$ and it is easy to see that the orthogonal projection p(x) varies smoothly. This gives the required projection. To summarise:

Theorem. The following objects are equivalent:

(1) hermitian vector bundles of rank m, i.e. transition functions $g_{ij}$ in U(m) or O(m);

(2) smooth maps of M into the rank m projections in $M_N(\mathbb{R})$ or $M_N(\mathbb{C})$;

(3) finitely generated projective hermitian C∞(M)–modules.

This equivalence is an equivalence of categories.

Operations on vector bundles. The importance of the above theorem is that it gives several quite different ways of viewing a vector bundle. In a particular situation or calculation, adopting the right point of view can often simplify things. All the usual operations on vector spaces have their natural counterparts for vector bundles. We have already defined the direct sum E ⊕ F. Clearly $C^\infty(E \oplus F) = C^\infty(E) \oplus C^\infty(F)$, so this corresponds to the direct sum of modules. In terms of transition matrices, we take $g^{E\oplus F}_{ij} = g^{E}_{ij} \oplus g^{F}_{ij}$.

Similarly we can define E ⊗ F via $g^{E\otimes F}_{ij} = g^{E}_{ij} \otimes g^{F}_{ij}$ or as $C^\infty(E) \otimes_{C^\infty(M)} C^\infty(F)$ or as $P_E \otimes P_F$. The definitions of $E^*$, Hom(E, F), $S^k(E)$ and $\Lambda^k(E)$ are similar and agree with the vector space definitions at points, so for example $\Lambda^k(E^*)_x = \Lambda^k(E^*_x)$. Note that sections of the vector bundle Hom(E, F) are just vector bundle homomorphisms E → F. They give linear maps $E_x \to F_x$; if each is an isomorphism, we get an isomorphism of vector bundles. Note that if E and F correspond to projections $P, Q \in M_N(C^\infty(M))$, then $C^\infty(\mathrm{Hom}(E, F)) = Q\,M_N(C^\infty(M))\,P$, so everything is quite concrete!

16. COHOMOLOGY, CONNECTIONS AND CURVATURE.

Differential forms. The most important example of a vector bundle is the tangent bundle TM. Embeddings in euclidean space give it the smooth structure of a vector bundle. The transition matrices $g_{ij} : U_i \cap U_j \to GL_n(\mathbb{R})$ are obtained by taking the derivative of $\varphi_i \circ \varphi_j^{-1}$. The embedding in euclidean space $\mathbb{R}^N$ makes the tangent bundle TM a subbundle of $M \times \mathbb{R}^N$, so we may take the corresponding orthogonal projection. Finally $C^\infty(TM) = \mathrm{Vect}(M)$, the vector fields. Let $T^*M$ be the dual vector bundle, called the cotangent bundle. Thus $C^\infty(T^*M) = \mathrm{Hom}_{C^\infty(M)}(\mathrm{Vect}(M), C^\infty(M))$. This is called the space of 1–forms and denoted $\Omega^1(M)$. If $f \in C^\infty(M)$, we define $df \in \Omega^1(M)$ by $(df, X) = Xf \in C^\infty(M)$. If $(x_1, \dots, x_n)$ are local coordinates, then $dx_1, \dots, dx_n$ satisfy $(dx_i, \partial_j) = \delta_{ij}$, so locally give a dual basis of 1–forms to the vector fields $\partial_j$. Hence locally we can write any 1–form as $\sum f_i\,dx_i$ with $f_i$ smooth.

Let $\Omega^k(M) = \Lambda^k(T^*M)$. Clearly $\Omega^k(M)$ may be identified with the space of alternating $C^\infty(M)$–linear maps of Vect(M) into $C^\infty(M)$, $(X_1, \dots, X_k) \mapsto \omega(X_1, \dots, X_k)$. In local coordinates any k–form may be written $\sum f_I\,dx_{i_1} \wedge \cdots \wedge dx_{i_k}$, where $i_1 < \cdots < i_k$. If α and β are homogeneous of degree a and b respectively, then exterior multiplication is given by the formula
\[ \alpha \wedge \beta(X_1, \dots, X_{a+b}) = \frac{1}{(a+b)!}\sum_{\sigma \in S_{a+b}} \varepsilon(\sigma)\,\alpha(X_{\sigma(1)}, \dots, X_{\sigma(a)})\,\beta(X_{\sigma(a+1)}, \dots, X_{\sigma(a+b)}). \]
In local coordinates if $\alpha = \sum a_I\,dx_{i_1} \wedge \cdots \wedge dx_{i_a}$ and $\beta = \sum b_J\,dx_{j_1} \wedge \cdots \wedge dx_{j_b}$, we have $\alpha \wedge \beta = \sum a_Ib_J\,dx_{i_1} \wedge \cdots \wedge dx_{i_a} \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_b}$, which may be rewritten using the usual rules for exterior multiplication. The space $\Omega^*(M) = \bigoplus_{k=0}^{n} \Omega^k(M)$ forms a graded commutative algebra under this product, so that $\alpha \wedge \beta = (-1)^{\partial\alpha\,\partial\beta}\beta \wedge \alpha$. As an algebra, it is generated by $\Omega^0M = C^\infty(M)$ and the 1–forms $\Omega^1M$. Indeed, using a partition of unity, we may express every k–form as a sum of terms $f_0\,df_1 \wedge \cdots \wedge df_k$.

Theorem. There is a unique linear mapping d : Ω∗(M) → Ω∗(M) such that:

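(a) $d\Omega^k(M) \subset \Omega^{k+1}(M)$.
(b) if $f \in C^\infty(M)$ and $X \in \mathrm{Vect}(M)$, then (df, X) = Xf.
(c) $d^2 = 0$.
(d) d is a graded derivation, i.e. $d(\alpha \wedge \beta) = (d\alpha) \wedge \beta + (-1)^{\partial\alpha}\alpha \wedge (d\beta)$.

d is given by the formula
\[ d\omega(X_1, \dots, X_{k+1}) = (k+1)^{-1}\Bigl[\sum_{i=1}^{k+1}(-1)^{i+1}X_i\,\omega(X_1, \dots, \widehat{X_i}, \dots, X_{k+1}) + \sum_{i<j}(-1)^{i+j}\omega([X_i, X_j], X_1, \dots, \widehat{X_i}, \dots, \widehat{X_j}, \dots, X_{k+1})\Bigr]. \quad (*) \]
In local coordinates if $\omega = \sum f_I\,dx_{i_1} \wedge \cdots \wedge dx_{i_k}$ with $i_1 < \cdots < i_k$, we have $d\omega = \sum_I\sum_j \partial_{x_j}f_I\,dx_j \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_k}$. In particular if $\alpha \in \Omega^1M$, we have $d\alpha(X, Y) = \tfrac12\bigl(X\alpha(Y) - Y\alpha(X) - \alpha([X, Y])\bigr)$.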
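As a quick check of these normalisation conventions, here is a worked example added for illustration; the choice α = x dy on R² is ours and does not come from the notes. Both sides of the last formula are evaluated on the coordinate vector fields.

```latex
% Worked example, assuming M = R^2 with coordinates (x, y) and alpha = x dy.
\begin{aligned}
d\alpha &= dx \wedge dy, \\
(dx \wedge dy)(\partial_x, \partial_y)
  &= \tfrac{1}{2!}\bigl[dx(\partial_x)\,dy(\partial_y) - dx(\partial_y)\,dy(\partial_x)\bigr] = \tfrac12, \\
\tfrac12\bigl(\partial_x\,\alpha(\partial_y) - \partial_y\,\alpha(\partial_x) - \alpha([\partial_x, \partial_y])\bigr)
  &= \tfrac12\bigl(\partial_x(x) - \partial_y(0) - 0\bigr) = \tfrac12 .
\end{aligned}
```

The factor 1/2 on both sides comes from the 1/(a+b)! in the definition of exterior multiplication used here.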

The Riemannian manifold M is said to be oriented if for some choice of charts the transition matrices $g_{ij}$ have positive determinant: the charts are then said to be oriented. If $\omega \in \Lambda^nM$ is an n–form on M and we use the oriented charts $\varphi_i : U_i \to M$, then $\omega = f_i(x)\,dx_1 \wedge \cdots \wedge dx_n$ on $U_i$. We define $\int_M \omega = \sum_i \int_{U_i} \psi_if_i(x)\,dx_1dx_2 \cdots dx_n$, where $(\psi_i)$ is a partition of unity subordinate to the $U_i$. Under any oriented change of coordinates, we get a positive jacobian, so this integral is independent of the choice of charts and partition of unity by the change of variables formula. Thus the integral is unambiguously defined once we have chosen an orientation.

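Theorem (Stokes' formula). $\int_M d\omega = 0$.

Proof. Let $\psi_i$ be a partition of unity subordinate to a covering by open cubes $D_i$. Then $\int_M d\omega = \sum \int_{D_i} d(\psi_i\omega)$. Thus it suffices to show that $\int_D d\omega = 0$ in an open cube $D = (-R, R)^n$ with $\omega = \sum f_i(x)\,dx_1 \wedge dx_2 \wedge \cdots \widehat{dx_i} \cdots \wedge dx_n$ with $f_i \in C^\infty_c(D)$. But
\[ \int_D d\omega = \int_{\mathbb{R}^n} d\Bigl(\sum f_i(x)\,dx_1 \wedge dx_2 \wedge \cdots \widehat{dx_i} \cdots \wedge dx_n\Bigr) = \sum_{i=1}^{n}(-1)^{i+1}\int_{\mathbb{R}^n} \partial_{x_i}f_i(x)\,dx_1 \wedge \cdots \wedge dx_n = 0, \]
since $\int_{-\infty}^{\infty} \partial_{x_i}f_i(x)\,dx_i \equiv 0$.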

de Rham cohomology. The de Rham cohomology groups are defined by $H^k_{dR}(M) = \ker(d)/\mathrm{im}(d)$ in $\Omega^kM$. We will see later that these are finite–dimensional spaces. Note that $H^*_{dR}(M)$ is a graded commutative algebra, because of the properties of d. Moreover $\int_M$ is unambiguously defined on $H^n_{dR}(M)$ by Stokes' formula.

Riemannian volume. If M is an oriented Riemannian manifold, there is a canonical n–form, given locally by $\det g(x)^{1/2}\,dx_1 \wedge \cdots \wedge dx_n$. It is called the Riemannian volume form.

Connections. If E is a vector bundle over M, we would like to make vector fields X act as differential operators on sections of E. If E were a trivial bundle, this would be easy since $C^\infty(M, K^m) = C^\infty(M)^m$, so we can differentiate componentwise using X ⊗ I. Any Grassmannian description of E gives $C^\infty(E) = P\,C^\infty(M)^N$ for some N. Define the covariant derivative $\nabla_X$ with respect to X by $\nabla_X(\xi) = P(X\xi)$. Thus we get a map $\mathrm{Vect}(M) \times C^\infty(E) \to C^\infty(E)$, $(X, \xi) \mapsto \nabla_X\xi$ such that

(a) $X \mapsto \nabla_X$ is $C^\infty(M)$–linear;
(b) $\nabla_X$ satisfies the Leibniz derivation rule $\nabla_X(f\xi) = (Xf)\xi + f(\nabla_X\xi)$.

Any such assignment is called a connection. We just constructed a Grassmannian connection (essentially by pulling back the canonical connection on the Grassmannian), so connections always exist. If E is a hermitian vector bundle, we say that the connection is compatible with the hermitian structure iff $X(\xi, \eta) = (\nabla_X\xi, \eta) + (\xi, \nabla_X\eta)$ for all $\xi, \eta \in C^\infty(E)$. If $E \subset M \times K^N$ has the induced hermitian structure, compatibility of the Grassmannian connection follows straight from the Leibniz rule, since $(\nabla_X\xi, \eta) = ((X \otimes I)\xi, \eta)$ and $(\xi, \nabla_X\eta) = (\xi, (X \otimes I)\eta)$.

All the connections we consider will be constructed in this way, including the most important example, the Riemannian connection.

Connection matrices and gauge transformations. On a trivialising ball $B_i$, the connection is given by
\[ \nabla_X\xi = X\xi + A_i(X)\xi \]
where the connection matrix $A_i(X)$ lies in the Lie algebra $\mathfrak{g}$ of the gauge group G. Since $A_i$ has a pairing with a vector field, it can be regarded as a 1–form on $B_i$ with values in $\mathfrak{g}$. If $g_{ij}$ are the transition matrices, then an equivalent set of matrices is given by $h_ig_{ij}h_j^{-1}$ where $h_i$ is a smooth map of $B_i$ into G, called a gauge transformation. It is clear that
\[ A_j = g_{ij}^{-1}A_ig_{ij} + g_{ij}^{-1}dg_{ij}. \]
Similarly under a gauge transformation $h_i$, $A_i$ is transformed into
\[ A_i' = h_i^{-1}A_ih_i + h_i^{-1}dh_i. \]
The 1–forms A(X) are usually described locally by their Christoffel symbols $\Gamma^r_{pq} = A(\partial_p)_{qr}$. Note that the Riemannian connection only reduces to an orthogonal connection after the gauge change $\xi \mapsto g(x)^{1/2}\xi$ which makes the matrices $g(x)^{1/2}(\varphi_i \circ \varphi_j^{-1})'g(x)^{-1/2}$ orthogonal.

Synchronous gauge change. We now show using the idea of parallel transport that on any local coordinate ball, the connection can be trivialised along radial lines. This is the so–called synchronous gauge change which we will see appearing again in Hadamard's parametrix construction. So assume that B is diffeomorphic to a ball in $\mathbb{R}^n$ with origin at 0. Thus the bundle has the form $B \times K^m$ and the connection is given by matrices $A_i(x) \in \mathfrak{g} \subset M_m(K)$, so that $\nabla_i = \nabla_{\partial_i} = \partial_i + A_i(x)$.

Theorem. There is a unique gauge change g(x) ∈ G on B such that g(0) = I and the gauge field $B_i = g^{-1}A_ig + g^{-1}\partial_ig$ is synchronous, i.e. $\sum x_iB_i(x) \equiv 0$. Thus the gauge field is trivial when restricted to rays emanating from 0, so the gauge change trivialises the restriction of the connection to radial lines.

Proof. Fix $x_0 \in B$ and set $A(t) = \sum_i (x_0)_iA_i(tx_0)$. Consider the ODE $f^{-1}\dot f = A$, f(0) = I. This differential equation has a solution $f_{x_0}(t) \in G$, depending smoothly on $x_0$. By uniqueness $f_{sx_0}(t) = f_{x_0}(st)$. Set $g(x_0) = f_{x_0}(1)$. Then g is smooth and by construction the gauge field $g^{-1}Ag + g^{-1}dg$ is synchronous.

Curvature. If $\nabla_X$ is a connection on E, we define the curvature tensor K by $K(X, Y) = [\nabla_X, \nabla_Y] - \nabla_{[X,Y]}$.

Lemma. (a) K(X, Y) is a $C^\infty(M)$–module endomorphism of $C^\infty(E)$.
(b) K(X, Y) = −K(Y, X) and K is $C^\infty(M)$–bilinear.

Proof. This is routine algebra. We have $\nabla_X\nabla_Y(f\xi) = f\nabla_X\nabla_Y\xi + (Xf)(\nabla_Y\xi) + (Yf)\nabla_X\xi + (XYf)\xi$. Hence $(\nabla_X\nabla_Y - \nabla_Y\nabla_X)f\xi - \nabla_{[X,Y]}f\xi = f(\nabla_X\nabla_Y - \nabla_Y\nabla_X)\xi - f\nabla_{[X,Y]}\xi$, so that K(X, Y)fξ = fK(X, Y)ξ as required. The assertions in (b) are proved similarly.

This result shows that the curvature tensor is a smooth section of $\Lambda^2 \otimes \mathrm{End}(E)$, i.e. a 2–form with values in End(E). We can get more concrete expressions for the curvature in the projection and transition matrix pictures.

Theorem (Chern character). If K is the curvature of a Grassmannian connection $\nabla_X$ on E, then $\mathrm{Tr}(K^\ell)$ is a closed 2ℓ–form [whose de Rham cohomology class is independent of the choice of connection]. If we denote it $ch_{2\ell}(E)$ and define the Chern character by
\[ \mathrm{Ch}(E) = \sum_\ell \frac{ch_{2\ell}(E)}{(-2\pi i)^\ell\,\ell!} = \mathrm{Tr}(e^{iK/2\pi}) \in H^{ev}_{dR}(M), \]
then Ch(E ⊕ F) = Ch(E) + Ch(F) and Ch(E ⊗ F) = Ch(E) ∧ Ch(F). If E corresponds to a projection $p \in M_N(C^\infty(M))$ and $\nabla_X\xi = p(X \otimes I)\xi$, then $K = 2p(dp)^2$ and $ch_{2\ell}(E) = 2^\ell\,\mathrm{Tr}(p(dp)^{2\ell})$.

Remark. The factor of i/2π appears to guarantee the integrality of certain characteristic classes associated with the Chern character, i.e. that the cohomology classes come from elements of $H^{2\ell}(M, \mathbb{Z})$.

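Proof of theorem (Chern–Weil). Direct computation shows that
\[ K(X, Y)\xi = ([\nabla_X, \nabla_Y] - \nabla_{[X,Y]})\xi = p(X \cdot p)Y\xi - p(Y \cdot p)X\xi. \]
Since $p^2 = p$, $(X \cdot p)p + p(X \cdot p) = X \cdot p$, so that $p(X \cdot p)p = 0$. Hence, as ξ = pξ, $p(X \cdot p)Y\xi = p(X \cdot p)Y(p\xi) = p(X \cdot p)(Y \cdot p)\xi$. Thus $K(X, Y) = \langle 2p(dp)^2, X \otimes Y \rangle$, as required.

To prove closedness we may take $\alpha = \mathrm{Tr}(p(dp)^{2\ell})$. Then $d\alpha = \mathrm{Tr}(dp)^{2\ell+1}$. Now p(dp) + (dp)p = dp, so that $(dp)^{2\ell+1} = (dp)p(dp)^{2\ell} + p(dp)^{2\ell+1}$. Hence $\mathrm{Tr}(dp)^{2\ell+1} = 2\,\mathrm{Tr}\,p(dp)^{2\ell+1}$. Since p(dp) = (dp)(1 − p), however, $p(dp)^{2\ell+1} = (dp)^{2\ell+1}(1 - p)$, so that $\mathrm{Tr}(p(dp)^{2\ell+1}) = \mathrm{Tr}(p(dp)^{2\ell+1}(1 - p)) = 0$. Thus $\mathrm{Tr}(dp)^{2\ell+1} = 0$, so that dα = 0.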

Local expressions for curvature. If a connection is given in local coordinates by $\nabla_{\partial_p} = \partial_p + A_p(x)$, then
\[ K_{pq} = K(\partial_p, \partial_q) = [\nabla_{\partial_p}, \nabla_{\partial_q}] = \partial_pA_q - \partial_qA_p + [A_p, A_q]. \]
If the connection reduces to the matrix group G, so that $A_i(x) \in \mathfrak{g}$, it follows from this formula that the curvature $K_{ij}$ also lies in $\mathfrak{g}$. In a synchronous frame about 0 in these coordinates, we have $\sum x_iA_i(x) = 0$. Setting tx in place of x and dividing by t, we get $\sum x_iA_i(tx) = 0$. Setting t = 0, we get $A_i(0) = 0$, i.e. the Christoffel symbols vanish at the origin in a synchronous frame. More generally if $A_i(0) = 0$ (even in the non–synchronous case), $K_{ij}(0) = (\partial_iA_j)(0) - (\partial_jA_i)(0)$. Since $\sum x_iA_i(x) = 0$, collecting the quadratic terms in x, we get $\partial_iA_j(0) + \partial_jA_i(0) = 0$. Hence $A_i(x) = -\tfrac12\sum_j K_{ij}(0)x_j + O(\|x\|^2)$, where K is the curvature tensor.
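The simplest instance of these formulas is an abelian connection with constant curvature. The following worked example is added for illustration only; the choice G = U(1) on a ball in R² and the real constant b are assumptions of ours, not taken from the notes.

```latex
% Worked example, assuming G = U(1), n = 2, and a real constant b.
\begin{aligned}
A_1(x) &= -\tfrac{i b}{2}\,x_2, \qquad A_2(x) = \tfrac{i b}{2}\,x_1, \\
\sum_i x_i A_i(x) &= -\tfrac{i b}{2}\,x_1 x_2 + \tfrac{i b}{2}\,x_2 x_1 = 0
  \quad\text{(the frame is synchronous)}, \\
K_{12} &= \partial_1 A_2 - \partial_2 A_1 + [A_1, A_2] = \tfrac{i b}{2} + \tfrac{i b}{2} = i b, \\
-\tfrac12 \sum_j K_{1j}(0)\,x_j &= -\tfrac{i b}{2}\,x_2 = A_1(x), \qquad
-\tfrac12 \sum_j K_{2j}(0)\,x_j = \tfrac{i b}{2}\,x_1 = A_2(x).
\end{aligned}
```

In this case the expansion $A_i(x) = -\tfrac12\sum_j K_{ij}(0)x_j$ holds exactly, with no error term, because the curvature is constant and the commutator vanishes.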

Example: Riemannian connection and curvature. Let M be a compact Riemannian manifold. By Nash's embedding theorem, M can be isometrically embedded in Euclidean space $\mathbb{R}^N$ for N sufficiently large. This makes the tangent bundle TM a hermitian subbundle of the trivial bundle $M \times \mathbb{R}^N$. Let P be the corresponding orthogonal projection and $\nabla_X$ the corresponding connection. This connection is called the Levi–Civita or Riemannian connection. By construction it is compatible with the hermitian structure on TM. We shall find a local formula for the connection which shows that it is independent of the isometric embedding $u : M \to \mathbb{R}^N$. Let h(x) be a positive square root of $g(x)^{-1}$. We take local coordinates $x_1, \dots, x_n$ near $x_0$. Since $\partial_iu \cdot \partial_ju = g_{ij}$, the vectors $e_i = \sum_j h_{ij}\partial_ju$ form an orthonormal basis of each tangent space $T_xM$ for x near $x_0$. So $P(x)v = \sum_i (v, e_i)e_i$ and hence
\[ \nabla_{\partial_i}\partial_ju = P(x)(\partial_i\partial_ju) = \sum_k (\partial_i\partial_ju, e_k)e_k = \sum_{\ell,p,k} (\partial_i\partial_ju) \cdot (\partial_\ell u)\,h_{k\ell}h_{kp}\,\partial_pu = \sum_{\ell,p} (\partial_i\partial_ju) \cdot (\partial_\ell u)\,g^{\ell p}\,\partial_pu. \]
Hence $\nabla_{\partial_i}\partial_j = \sum_p \Gamma^p_{ij}\partial_p$, where
\[ \Gamma^p_{ij} = \sum_\ell g^{p\ell}(\partial_i\partial_ju \cdot \partial_\ell u) = \tfrac12\sum_\ell g^{p\ell}\bigl(\partial_i(\partial_ju \cdot \partial_\ell u) + \partial_j(\partial_iu \cdot \partial_\ell u) - \partial_\ell(\partial_iu \cdot \partial_ju)\bigr). \]
Since $g_{ab} = \partial_au \cdot \partial_bu$, we get
\[ \Gamma^k_{ij} = \tfrac12\sum_\ell g^{k\ell}(\partial_ig_{j\ell} + \partial_jg_{i\ell} - \partial_\ell g_{ij}). \]
This formula shows that Γ is symmetric, i.e. $\Gamma^k_{ij} = \Gamma^k_{ji}$. Thus $\nabla_{\partial_i}\partial_j = \nabla_{\partial_j}\partial_i$. Taking combinations $\sum f_a\partial_a$ instead of $\partial_i, \partial_j$, this symmetry condition has a coordinate free reformulation $\nabla_XY - \nabla_YX = [X, Y]$. This can also be checked directly, for as we have seen, vector fields on M become maps $M \to \mathbb{R}^N$ under $X \mapsto Xu$. With this identification, we get $(\nabla_XY)u - (\nabla_YX)u = P(XYu - YXu) = P[X, Y]u = [X, Y]u$. Hence $\nabla_XY - \nabla_YX = [X, Y]$. The formula for the Christoffel symbols only involves the metric on M so is intrinsic. The more traditional characterisation of the Riemannian connection is as follows.
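The intrinsic formula for the Christoffel symbols is easy to evaluate numerically. The sketch below is illustrative only: it assumes the round unit sphere in coordinates (θ, φ) with metric diag(1, sin²θ), approximates the derivatives of the metric by central finite differences, and uses function names of our own choosing.

```python
import math

def metric(x):
    """Round metric on the unit sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def d_metric(x, i, h=1e-5):
    """Central-difference approximation to the matrix of d_i g_ab at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)]
            for a in range(2)]

def christoffel(x):
    """Return G with G[k][i][j] = Gamma^k_ij computed from the metric alone."""
    n = 2
    ginv = inverse_2x2(metric(x))
    dg = [d_metric(x, i) for i in range(n)]  # dg[i][a][b] = d_i g_ab
    return [[[0.5 * sum(ginv[k][l] * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j])
                        for l in range(n))
              for j in range(n)]
             for i in range(n)]
            for k in range(n)]
```

For this metric the non-zero symbols are Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = Γ^φ_φθ = cot θ, which `christoffel((theta, phi))` should reproduce up to discretisation error; the symmetry in the lower indices is visible directly in the output.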

Fundamental theorem of Riemannian geometry. The Riemannian connection is the unique connection on TM compatible with the hermitian structure satisfying the symmetry property $\nabla_XY - \nabla_YX = [X, Y]$ for all X, Y.


Proof. To prove uniqueness, suppose that $\nabla_X$ satisfies the conditions. Then $Xg(Y, Z) = g(\nabla_XY, Z) + g(Y, \nabla_XZ)$, $Yg(Z, X) = g(\nabla_YZ, X) + g(Z, \nabla_YX)$ and $Zg(X, Y) = g(\nabla_ZX, Y) + g(X, \nabla_ZY)$. Subtracting the third equation from the sum of the first two and using the condition $\nabla_AB - \nabla_BA = [A, B]$, we get
\[ g(\nabla_YX, Z) = \tfrac12\bigl(Xg(Y, Z) + Yg(Z, X) - Zg(X, Y) - g([X, Z], Y) - g([Y, Z], X) - g([X, Y], Z)\bigr). \quad (*) \]
Hence $\nabla_YX$ is uniquely determined.

To prove existence without using an embedding, define $\nabla_YX$ by requiring $g(\nabla_YX, Z)$ to equal the right hand side of (∗). It is easy to check that this expression is $C^\infty(M)$–linear in Y and Z, that $g(\nabla_Y(fX), Z) = fg(\nabla_YX, Z) + (Yf)g(X, Z)$ and that $g(\nabla_XY - \nabla_YX, Z) = g([X, Y], Z)$. It follows immediately that $\nabla_YX$ is a vector field and that $\nabla_Y$ is a connection such that $\nabla_XY - \nabla_YX = [X, Y]$ for all X, Y.

Remark. Note that the local formula for the Christoffel symbols can also be deduced from (∗).

Riemannian curvature. If $\nabla_X$ is the Riemannian connection on M, let R(X, Y) denote the corresponding curvature tensor, so that $R(X, Y) = [\nabla_X, \nabla_Y] - \nabla_{[X,Y]}$. Note that R satisfies R(X, Y) = −R(Y, X), is $C^\infty(M)$–bilinear and commutes with $C^\infty(M)$ (like any curvature tensor).

Lemma. With respect to an orthonormal frame in normal coordinates at x, $R_{ijk\ell} \equiv (R(\partial_i, \partial_j)\partial_k, \partial_\ell) = \partial_j\Gamma^i_{k\ell}(x) - \partial_k\Gamma^i_{j\ell}(x)$.

Proof. By the corollary to Gauss' lemma, Γ(x) = 0 in normal coordinates centred on x. So the formula follows straight from the local formula giving the curvature in terms of the Christoffel symbols.

Theorem (First Bianchi identity). R(X,Y )Z +R(Y, Z)X +R(Z,X)Y = 0.

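Proof. We have
\[ \begin{aligned} R(X, Y)Z &+ R(Y, Z)X + R(Z, X)Y \\ &= \nabla_X\nabla_YZ - \nabla_Y\nabla_XZ - \nabla_{[X,Y]}Z + \nabla_Y\nabla_ZX - \nabla_Z\nabla_YX - \nabla_{[Y,Z]}X + \nabla_Z\nabla_XY - \nabla_X\nabla_ZY - \nabla_{[Z,X]}Y \\ &= \nabla_X[Y, Z] + \nabla_Y[Z, X] + \nabla_Z[X, Y] - \nabla_{[Y,Z]}X - \nabla_{[Z,X]}Y - \nabla_{[X,Y]}Z \\ &= [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0, \end{aligned} \]
by the symmetry property $\nabla_AB - \nabla_BA = [A, B]$ and the Jacobi identity for Lie brackets.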

Corollary (symmetries of the curvature tensor). If $(X_i)$ is an orthonormal basis of $T_xM$ and $R_{ijk\ell} = (R(X_i, X_j)X_k, X_\ell)$, then (a) $R_{ijk\ell} + R_{jki\ell} + R_{kij\ell} = 0$; (b) $R_{ijk\ell} = -R_{jik\ell}$; (c) $R_{ijk\ell} = -R_{ij\ell k}$; and (d) $R_{ijk\ell} = R_{k\ell ij}$.

Proof. (a) is just the first Bianchi identity. (b) follows from R(X, Y) = −R(Y, X). (c) follows from the fact that R(X, Y) lies in the Lie algebra of SO(n) so is skew–symmetric. Finally to prove (d), we start from (a), $R_{ijk\ell} + R_{jki\ell} + R_{kij\ell} = 0$. Symmetrising over the 4–cycle (ijkℓ) and using (b) and (c), we obtain $2R_{\ell ijk} + 2R_{kj\ell i} = 0$. This clearly implies (d).

Scalar curvature. Let R(X, Y) be the Riemannian curvature tensor. The scalar curvature at x ∈ M is defined to be $\kappa(x) = \sum_{i,j} (R(X_i, X_j)X_j, X_i)$ where $(X_i)$ is any orthonormal basis of $T_xM$ (it is clearly independent of the choice). Although we shall not need this, up to a constant it equals (∆ Tr g)(x) with ∆ the Laplacian in normal coordinates.

The Riemannian connection on forms. For each X ∈ Vect(M), there is a unique derivation $\nabla_X$ on Ω(M) such that $X \mapsto \nabla_X$ is $C^\infty(M)$–linear and $\nabla_X$ is compatible with the $C^\infty(M)$–pairing $\mathrm{Vect}(M) \times \Omega^1 \to C^\infty(M)$. The connection is compatible with the metric on $\Omega^kM$ and is given by the formula
\[ (\nabla_X\omega)(Y_1, \dots, Y_k) = X \cdot \omega(Y_1, \dots, Y_k) - \sum_i \omega(Y_1, \dots, \nabla_X(Y_i), \dots, Y_k). \]
Any 1–form α defines a vector field $\alpha^*$ by $(\alpha^*, \omega) = g(\alpha, \omega)$. Compatibility of $\nabla_X$ with g and the pairing $\mathrm{Vect}(M) \times \Omega^1 \to C^\infty(M)$ imply easily that $\nabla_X\alpha^* = (\nabla_X\alpha)^*$.


17. CLIFFORD BUNDLES AND DIRAC OPERATORS.

Clifford bundles. A Clifford bundle E over a compact Riemannian manifold M is a hermitian vector bundle $E = E^+ \oplus E^-$ with a compatible connection $\nabla_X$ and a $C^\infty(M)$–bilinear map $\Omega^1M \times C^\infty(E^\pm) \to C^\infty(E^\mp)$, $(\omega, \xi) \mapsto c(\omega)\xi$ such that
\[ c(\omega_1)c(\omega_2) + c(\omega_2)c(\omega_1) = 2g(\omega_1, \omega_2), \qquad (c(\omega)\xi, \eta) = (\xi, c(\omega)\eta), \qquad [\nabla_X, c(\omega)] = c(\nabla_X\omega). \]
The first two rules imply that we get the Clifford relations pointwise. The operator c(ω) is called Clifford multiplication by ω. The third rule gives a compatibility between the connection on E and the Riemannian connection on 1–forms.

Dirac operators. The Dirac operator of a Clifford bundle $E = E^+ \oplus E^-$ is given by $D = \sum c(\omega_i)\nabla_{X_i}$ where $(X_i)$ is any local basis of vector fields and $(\omega_j)$ is the dual basis of 1–forms, i.e. $(\omega_i, X_j) = \delta_{ij}$. This is evidently independent of the choice of bases so is globally defined. We assume that M is oriented, so has a canonical volume form. The Dirac Laplacian is given by $-D^2$. We will see below that D is skew–adjoint (formally), so that $-D^2$ is self–adjoint. Note that D takes $C^\infty(E^\pm)$ to $C^\infty(E^\mp)$, so it breaks up into two parts $D_\pm : C^\infty(E^\pm) \to C^\infty(E^\mp)$ and D may thus be written in matrix form
\[ D = \begin{pmatrix} 0 & D_- \\ D_+ & 0 \end{pmatrix}. \]
The skew–adjointness condition means that $D_- = -D_+^*$ (formally).

Adjoints. Let Ω be the volume form on an oriented Riemannian manifold M. Thus locally $\Omega = \omega_1 \wedge \cdots \wedge \omega_n$, where $(\omega_i)$ is an oriented orthonormal basis of 1–forms. Note that Ω is independent of the choice of oriented basis. If f is a function on M, we shall sometimes write $\int_M f(x)\,dx$ or $\int_M f$ instead of $\int_M f\,\Omega$ when there is no risk of confusion.

Lemma. $\nabla_X\Omega = 0$.

Proof. Since $g(\omega_i, \omega_j) = \delta_{ij}$, we have $g(\nabla_X\omega_i, \omega_j) = -g(\omega_i, \nabla_X\omega_j)$, so that $\nabla_X\omega_i = \sum_j a_{ij}\omega_j$ with $a_{ij} = -a_{ji}$. In particular $a_{ii} = 0$ and hence, since $\nabla_X$ is a derivation, $\nabla_X\Omega = \sum_i a_{ii}\Omega = 0$. [This can also be proved using parallel transport, since by uniqueness Ω is invariant under parallel transport so its covariant derivatives are zero.]

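Divergence of a vector field. If X ∈ Vect(M), we define the divergence of X by $\mathrm{div}(X) = \sum_i \omega_i(\nabla_{X_i}X) \in C^\infty(M)$ where $(X_i)$ and $(\omega_i)$ are dual bases. By the Leibniz rule, div(fX) = (Xf) + f div(X), since $\mathrm{div}(fX) = \sum \omega_i(\nabla_{X_i}(fX)) = f \cdot \mathrm{div}(X) + \sum (X_if)\omega_i(X) = f \cdot \mathrm{div}(X) + Xf$.

The Leibniz rule immediately implies that $[\nabla_X, e(\omega)] = e(\nabla_X\omega)$ for $\omega \in \Omega^1M$, where e(ω) denotes exterior multiplication by ω. We now check the same relation for $e(\omega)^*$.

Lemma. $[\nabla_X, e(\omega)^*] = e(\nabla_X\omega)^*$.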

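Proof. We have
\[ \begin{aligned} (\nabla_Xe(\omega)^*\omega_1, \omega_2) &= -(e(\omega)^*\omega_1, \nabla_X\omega_2) + X(e(\omega)^*\omega_1, \omega_2) \\ &= -(\omega_1, e(\omega)\nabla_X\omega_2) + X(\omega_1, \omega \wedge \omega_2) \\ &= -(\omega_1, \nabla_X(\omega \wedge \omega_2)) + (\omega_1, (\nabla_X\omega) \wedge \omega_2) + X(\omega_1, \omega \wedge \omega_2) \\ &= (\nabla_X\omega_1, \omega \wedge \omega_2) + (\omega_1, (\nabla_X\omega) \wedge \omega_2) = (e(\omega)^*\nabla_X\omega_1, \omega_2) + (e(\nabla_X\omega)^*\omega_1, \omega_2), \end{aligned} \]
as required.

We shall need an expression for the exterior derivative in terms of the Riemannian connection $\nabla_X$.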

Proposition. If $\nabla_X$ is the Riemannian connection on forms, then $d\omega = \sum e(\xi_i)\nabla_{X_i}\omega$, where at each point $(X_i)$ is a basis of tangent vectors and $(\xi_i)$ is the dual basis of cotangent vectors.

Proof. Let $d'\omega = \sum e(\xi_i)\nabla_{X_i}\omega$. It is easy to check that d′ is a graded derivation of ΩM. If $f \in \Omega^0M = C^\infty(M)$, we have $(d'f, X) = \sum (\xi_i, X)X_i(f) = Xf = (df, X)$, so that d = d′ on $\Omega^0M$. If $\xi \in \Omega^1M$, then
\[ \begin{aligned} 2(d'\xi, X \otimes Y) &= \sum_i \bigl[\xi_i(X)(\nabla_{X_i}\xi, Y) - \xi_i(Y)(\nabla_{X_i}\xi, X)\bigr] = (\nabla_X\xi, Y) - (\nabla_Y\xi, X) \\ &= X(\xi, Y) - (\xi, \nabla_XY) - Y(\xi, X) + (\xi, \nabla_YX) = X(\xi, Y) - Y(\xi, X) - \xi([X, Y]) = 2(d\xi, X \otimes Y), \end{aligned} \]
since $\nabla_XY - \nabla_YX = [X, Y]$. (The 2 is needed because of the conventions for exterior multiplication and derivative.) Hence dξ = d′ξ. Since d and d′ agree on functions and 1–forms, they must agree everywhere.

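Gauss' divergence theorem. $\int_M \mathrm{div}(X)\,\Omega = 0$.

Proof. If we prove that $d(e(\alpha)^*\Omega) = \mathrm{div}(\alpha^*) \cdot \Omega$, then, with $X = \alpha^*$, by Stokes' theorem we get $\int_M \mathrm{div}(X)\,\Omega = \int_M d(e(\alpha)^*\Omega) = 0$. But, since $\nabla_{X_i}\Omega = 0$ and $e(\alpha)^*$ is a graded derivation, we get
\[ \begin{aligned} d(e(\alpha)^*\Omega) &= \sum \omega_i \wedge \nabla_{X_i}(e(\alpha)^*\Omega) = \sum \omega_i \wedge e(\nabla_{X_i}\alpha)^*\Omega \\ &= \sum e(\nabla_{X_i}\alpha)^*(\omega_i)\,\Omega = \sum g(\omega_i, \nabla_{X_i}\alpha)\,\Omega \\ &= \sum (\omega_i, \nabla_{X_i}(\alpha^*))\,\Omega = \mathrm{div}(\alpha^*) \cdot \Omega. \end{aligned} \]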

Corollary. $\int_M (Xf)\,\Omega = -\int_M f \cdot \mathrm{div}(X)\,\Omega$.

Proof. Apply the divergence theorem to the vector field fX.

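Proposition (adjoint of $\nabla_X$). $\nabla_X^* = -\nabla_X - \mathrm{div}(X)$.

Proof. Using the fact that the connection is hermitian together with the corollary to the divergence theorem, we get
\[ \int_M (\nabla_X\xi, \eta)\,\Omega = \int_M [X(\xi, \eta) - (\xi, \nabla_X\eta)]\,\Omega = \int_M [-\mathrm{div}(X)(\xi, \eta) - (\xi, \nabla_X\eta)]\,\Omega. \]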

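Lemma. $\sum_i \nabla_{X_i}\omega_i = -\sum_i \mathrm{div}(X_i)\omega_i$.

Proof. Since locally $(\omega_i, X_j) = \delta_{ij}$, we have $\sum_i (\nabla_{X_i}\omega_i, X_j) = -\sum_i (\omega_i, \nabla_{X_i}X_j) = -\mathrm{div}(X_j)$, so the result follows.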

Proposition (adjoint of D). D∗ = −D, so D is skew–adjoint.

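Proof. By the lemma, we have
\[ D^* = \sum_i \nabla_{X_i}^*c(\omega_i)^* = -\sum_i (\nabla_{X_i} + \mathrm{div}(X_i))c(\omega_i) = -\sum_i c(\omega_i)\nabla_{X_i} - c\Bigl(\sum_i \nabla_{X_i}\omega_i + \mathrm{div}(X_i)\omega_i\Bigr) = -D. \]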

Hence D∗ = −D on C∞(E).

Examples.

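A. Grassmann bundle. We have already seen that $d\omega = \sum e(\omega_i)\nabla_{X_i}\omega$ above, where $(\omega_i, X_j) = \delta_{ij}$ locally. We need a similar formula for $d^*$, the adjoint of d.

Theorem (adjoint of d). $d^* = -\sum e(\omega_i)^*\nabla_{X_i}$.

Proof. We have
\[ d^* = \sum_i \nabla_{X_i}^*e(\omega_i)^* = -\sum_i (\nabla_{X_i} + \mathrm{div}(X_i))e(\omega_i)^* = -\sum_i e(\omega_i)^*\nabla_{X_i} - e\Bigl(\sum_i \nabla_{X_i}\omega_i + \mathrm{div}(X_i)\omega_i\Bigr)^*. \]
By the lemma, we have $\sum_i \nabla_{X_i}\omega_i = -\sum_i \mathrm{div}(X_i)\omega_i$, so that $d^* = -\sum e(\omega_i)^*\nabla_{X_i}$.

With this preparation, it is easy to see that $\Lambda T^*M$ is a Clifford bundle. We define $c(\omega)\xi = e(\omega)\xi + e(\omega)^*\xi$ for $\xi \in \Omega^*M$. Since $e(\alpha)e(\beta)^* + e(\beta)^*e(\alpha) = g(\alpha, \beta)$, it follows that, if $c(\alpha) = e(\alpha) + e(\alpha)^*$, then $c(\alpha)c(\beta) + c(\beta)c(\alpha) = 2g(\alpha, \beta)$ for $\alpha, \beta \in \Omega^1M$. Thus the Clifford algebra axioms are satisfied. We have already checked that $[\nabla_X, e(\omega)] = e(\nabla_X\omega)$ and $[\nabla_X, e(\omega)^*] = e(\nabla_X\omega)^*$. Hence $[\nabla_X, c(\omega)] = c(\nabla_X\omega)$. Since $\nabla_X$ is compatible with the hermitian structure, this means that $\Lambda T^*M$ is a Clifford bundle. The associated Dirac operator is $D = \sum c(\omega_i)\nabla_{X_i} = d - d^*$.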
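The pointwise Clifford relations for c = e + e* can be checked by brute force on a single fibre. The following sketch is illustrative only and rests on assumptions of ours: the fibre is the exterior algebra of R³ with the orthonormal basis of wedge monomials, forms are stored as dictionaries keyed by index sets, and all function names are hypothetical.

```python
from itertools import combinations

N = 3  # dimension of the (assumed) euclidean fibre R^N

def basis():
    """Index sets S labelling the basis monomials e_S of the exterior algebra."""
    return [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]

def ext(i, form):
    """Exterior multiplication by the basis covector e_i."""
    out = {}
    for s, coeff in form.items():
        if i in s:
            continue  # e_i wedge e_S vanishes when i already lies in S
        sign = (-1) ** sum(1 for j in s if j < i)
        key = s | {i}
        out[key] = out.get(key, 0) + sign * coeff
    return out

def inner(i, form):
    """Adjoint of ext(i, .) with respect to the orthonormal basis e_S."""
    out = {}
    for s, coeff in form.items():
        if i not in s:
            continue
        sign = (-1) ** sum(1 for j in s if j < i)
        key = s - {i}
        out[key] = out.get(key, 0) + sign * coeff
    return out

def add(f, g):
    """Sum of two forms, dropping zero coefficients."""
    out = dict(f)
    for s, c in g.items():
        out[s] = out.get(s, 0) + c
    return {s: c for s, c in out.items() if c != 0}

def clifford(i, form):
    """Clifford multiplication c(e_i) = e(e_i) + e(e_i)^*."""
    return add(ext(i, form), inner(i, form))

def anticommutator(i, j, form):
    """(c_i c_j + c_j c_i) applied to form."""
    return add(clifford(i, clifford(j, form)), clifford(j, clifford(i, form)))
```

Applied to each basis monomial, `anticommutator(i, j, {s: 1})` should return `{s: 2}` when i = j and the empty form otherwise, which is the relation c(α)c(β) + c(β)c(α) = 2g(α, β) for an orthonormal basis.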


B. Spin bundles. Let M be a 2n–dimensional oriented compact Riemannian manifold. Thus the transition matrices for TM are given by maps $g_{ij} : U_i \cap U_j \to SO(2n)$. A spin structure is a lifting to a system of transition matrices $G_{ij} : U_i \cap U_j \to \mathrm{Spin}(2n)$. This means that $G_{ij}G_{ji} = I$ and $G_{ij}G_{jk}G_{ki} = I$.

Passing to a finer cover if necessary, we may assume that each $g_{ij}$ can be lifted to a smooth map $\tilde g_{ij}$ with $\tilde g_{ij}\tilde g_{ji} = I$. However $h_{ijk} = \tilde g_{ij}\tilde g_{jk}\tilde g_{ki}$ lies in the kernel of the homomorphism Spin(2n) → SO(2n), so takes the values ±I. This can be regarded as an element of the Čech cohomology group $H^2(M, \mathbb{Z}_2)$ and is called the second Stiefel–Whitney class. It is trivial iff M has a spin structure. The space of inequivalent spin structures (i.e. liftings) is indexed by the Čech group $H^1(M, \mathbb{Z}_2)$, which is isomorphic with $\mathrm{Hom}(\pi_1(M), \mathbb{Z}_2)$. There are spaces such that no finite cover admits a spin structure, for example $G_{m,n}$, the Grassmannian of rank m projections in $M_n(\mathbb{C})$ for certain m and n.

We say that M is a spin manifold if it is endowed with a spin structure. If $S = S^+ \oplus S^-$ is the unique irreducible representation of $\mathrm{Cliff}(\mathbb{R}^{2n}) \supset \mathrm{Spin}(2n)$, we know that Spin(2n) acts unitarily on S leaving $S^\pm$ invariant. This unitary representation $\pi_S$ of Spin(2n) gives a hermitian bundle S with transition matrices $\pi_S(G_{ij})$. It is called the spin bundle and is the direct sum of $S^+$ and $S^-$.

The Riemannian connection $\nabla_X$ gives connection matrices $A^\alpha(X)$ taking values in Lie(SO(2n)) since $\nabla_X$ preserves the hermitian structure on TM. The corresponding connection on S is $\pi^{-1}(A^\alpha(X)) \in \mathrm{Lie\,Spin}(2n) \subset \mathrm{Lie}(U(S))$. It follows that $\nabla_X$ is compatible with the hermitian structure. It is called the spin connection.

The last ingredient we need is Clifford multiplication. We define this in a trivialisation $U_\alpha \times S$ of the bundle S. In this trivialisation, $\nabla_{\partial_i} = \partial_i + \pi^{-1}(A^\alpha_i) = \partial_i + \tfrac14\sum_{j,k} \Gamma^k_{ij}c_jc_k$ (using the formula for π). The Clifford multiplication operators have the form c(f) where $f : U \to \mathbb{R}^{2n}$ is smooth. These operators act pointwise and satisfy the Clifford relations $c(f)c(g) + c(g)c(f) = 2(f, g) \in C^\infty(U)$.

To prove compatibility with the connection, recall that if A ∈ Lie(Spin(V)), then [A, c(v)] = c(π(A)v) where π : Spin(V) → SO(V) is the double cover. Thus $[\nabla_{\partial_i}, c(f)] = c(\partial_if) + [\pi^{-1}(A^\alpha_i), c(f)] = c(\partial_if) + c(A^\alpha_if) = c(\nabla_{\partial_i}f)$. Thus $\nabla_X$ is compatible with the Clifford multiplication and hence the spin bundle S is a Clifford bundle. The corresponding Dirac operator is called the Dirac operator of the spin manifold. Note that $D : C^\infty(S^\pm) \to C^\infty(S^\mp)$. Denote by $D_\pm$ the restriction of D to $C^\infty(S^\pm)$. Since $D = \begin{pmatrix} 0 & D_- \\ D_+ & 0 \end{pmatrix}$ and $D^* = -D$, it follows that $D_+^* = -D_-$ formally.

C. Twisted bundles. If S is a Clifford bundle with connection $\nabla^S_X$ (for example the spin bundle of a spin manifold) and E is a hermitian vector bundle over M with compatible connection $\nabla^E_X$, then S ⊗ E becomes a Clifford bundle with connection $\nabla^{S\otimes E}_X(\xi \otimes \eta) = (\nabla^S_X\xi) \otimes \eta + \xi \otimes (\nabla^E_X\eta)$. The Clifford multiplication c(ω) only acts on the first factor: $c(\omega)(\xi \otimes \eta) = (c(\omega)\xi) \otimes \eta$. When S is the spin bundle of a spin manifold, the corresponding Dirac operator $D_{S\otimes E}$ or $D_E$ is called the twisted Dirac operator with coefficients in E. This twisting operation is particularly simple in the Grassmannian framework: if E is a subbundle of $M \times \mathbb{C}^N$ corresponding to the projection p, then $D_E = p(D \otimes I)p$ acting on $C^\infty(S \otimes E) = p\,C^\infty(S)^N$. These are exactly the Dirac operators for which we will prove the index theorem. It can be shown that when M is a spin manifold, any Clifford bundle is of this form; and even when M does not admit a spin structure, this is true locally. Since the calculation of the index we give below is purely local, it is possible to compute the index by the same method in this case. For simplicity we shall only prove the index theorem for twistings of the spin bundle of a spin manifold.

18. SOBOLEV SPACES ON A CLIFFORD BUNDLE. Let $D = \sum c(\omega_i)\nabla_{X_i}$ be the Dirac operator on a Clifford bundle E and let $\Delta = -D^2$. We define the Sobolev norms on $C^\infty(E)$ for k ≥ 0 by $\|\xi\|^2_{(k)} = ((I + \Delta)^k\xi, \xi)$, where this inner product includes an integration over M. These come with inner products $(\xi, \eta)_{(k)} = ((I + \Delta)^k\xi, \eta)$. Let $H_k(E)$ be the Hilbert space completion of $C^\infty(E)$ with respect to this norm. For k ≥ 0, we define $\|\xi\|_{(-k)} = \sup\{|(\xi, \eta)| : \|\eta\|_{(k)} = 1,\ \eta \in C^\infty(E)\}$. Thus we use the pairing $\xi, \eta \mapsto (\xi, \eta)$ to embed $C^\infty(E)$ into the dual of $H_k(E)$. The norm is therefore a Hilbert space norm. Let $H_{-k}(E)$ be the corresponding Hilbert space completion.

Theorem 1 (duality). (a) $(I + \Delta)^k : H_k(E) \to H_{-k}(E)$ is a unitary isomorphism.
(b) $H_k$ and $H_{-k}$ are each other's duals under the pairing $\xi, \eta \mapsto (\xi, \eta)$, i.e. the map $H_{-k} \to H_k^*$ induced by the pairing is a unitary isomorphism.

Proof. Since $\|\eta\|_{(-k)} = \sup_{\|\xi\|_{(k)} \le 1} |(\xi, \eta)|$, the map $H_{-k}(E) \to H_k(E)^*$ is an isometry. Now if $\xi, \eta \in C^\infty(E)$, we have
\[ (\xi, (I + \Delta)^k\eta) = (\xi, \eta)_{(k)}. \quad (*) \]
By continuity this extends to all $\xi \in H_k(E)$. Hence
\[ \|(I + \Delta)^k\eta\|_{(-k)} = \sup_{\|\xi\|_{(k)} \le 1} |(\xi, (I + \Delta)^k\eta)| = \sup_{\|\xi\|_{(k)} \le 1} |(\xi, \eta)_{(k)}| = \|\eta\|_{(k)}. \]
Thus the map $(I + \Delta)^k : H_k \to H_{-k}$ is an isometry. The proof will be completed if we show that $(I + \Delta)^kC^\infty \subset H_{-k} \subset H_k^*$ is dense in $H_k^*$. This is immediate from (∗) since $(\xi, (I + \Delta)^k\eta) = 0$ implies $(\xi, \eta)_{(k)} = 0$ for all $\eta \in C^\infty(E)$, which in turn implies ξ = 0.

Theorem 2 (local expressions for Sobolev norms). (a) $C^\infty(M)$ and $C^\infty(\mathrm{End}(E))$ act continuously by multiplication on each $H_k(E)$.
(b) If $\varphi_i : V_i \times K^m \to E$ are local trivialisations of the bundle over $U_i \subset M$, with $V_i$ an open subset of $T = T^n$, and $\psi_i$ is a partition of unity subordinate to $U_i$, then the norm $\|\xi\|_{(k)}$ is equivalent to the norm $\sum_i \|(\psi_i\xi) \circ \varphi_i\|_{(k)}$ (or $\bigl(\sum_i \|(\psi_i\xi) \circ \varphi_i\|^2_{(k)}\bigr)^{1/2}$) calculated in $H_k(T, K^m)$.

Proof. Note that locally $\Delta = -D^2 = -\sum \partial_ig^{ij}\partial_j$ + lower order terms, so that ∆ is a generalised Laplacian. Let $B_i \subset B_i'$ be a family of Euclidean balls covering M. Take $\chi_i$ a partition of unity subordinate to $B_i$ and take $\psi_i \in C^\infty_c(B_i')$ such that $\psi_i = 1$ on a neighbourhood of $B_i$. Then
\[ ((I + \Delta)^k\xi, \xi) = \sum_i \int_{B_i'} \chi_i((I + \Delta)^k\xi, \xi) = \sum_i \int_{B_i'} \chi_i((I + \Delta)^k\psi_i\xi, \psi_i\xi). \]
By the results on generalised Laplacians and Sobolev spaces, this norm is equivalent to $\bigl(\sum \|(\psi_i\xi) \circ \varphi_i\|^2_{(k)}\bigr)^{1/2}$ with the Sobolev norms computed on the torus.

Using this equivalent norm it follows that each $H_k(E)$ is invariant under multiplication by $C^\infty(M)$ or $C^\infty(\mathrm{End}(E))$ for k ≥ 0. For k ≤ 0, invariance under multiplication follows by duality, since $(f\xi, \eta) = (\xi, f^*\eta)$.

Finally to get the result for $\|\cdot\|_{(-k)}$, note that $\|\xi\|_{(-k)} \le \sum \|\chi_i\xi\|_{(-k)} \le A\sum \|\xi\|_{(-k)}$, since multiplication is continuous. On the other hand $\|\chi_i\xi\|_{(-k)} = \sup_{g \in C^\infty(E),\ \|g\|_{(k)} \le 1} |(g, \chi_i\xi)|$. Since $\psi_i\chi_i = \chi_i$, we may insert the extra condition that $\psi_ig = g$. Because of the equivalence for k ≥ 0, we get an equivalent norm by transferring this supremum to the Sobolev spaces on the torus.

Theorem 3. (a) $H_{k+s}(E) \to C^k(E)$ for s > n/2 (Sobolev embedding theorem).
(b) The inclusion $H_k(E) \to H_\ell(E)$ is compact for k > ℓ (Rellich's compactness lemma).
(c) If (∆ + λ)ξ = η with η smooth, then ξ is smooth (elliptic regularity).
(d) $(I + \Delta) : H_k \to H_{k-2}$ is a unitary isomorphism for each $k \in \mathbb{Z}$.
(e) The operator ∆ (and hence D) admits a complete set of eigenfunctions $\psi_n \in L^2(E)$. They all lie in $C^\infty(E)$ and form orthogonal bases in each $H_k(E)$. The corresponding eigenvalues $\lambda_0 \le \lambda_1 \le \cdots$ satisfy $\lambda_k \to \infty$. They are given by the maximin principle
\[ \lambda_k = \max_{G \subset L^2(E),\ \dim(G) = k}\ \min_{\xi \in H_1(E),\ \xi \perp G} \frac{(\Delta\xi, \xi)}{(\xi, \xi)}. \]
(f) $H_k(E) = \{\sum a_n\psi_n : \sum |a_n|^2(1 + |\lambda_n|^2)^k < \infty\}$ with inner product $(\sum a_n\psi_n, \sum b_n\psi_n) = \sum a_n\overline{b_n}(1 + |\lambda_n|^2)^k$.

Proof. (a) If $\xi \in H_{k+s}(E)$ and $\psi_i \in C^\infty(M)$ is a suitable partition of unity, $\psi_i\xi \in H_{k+s}(E)$. Hence $\psi_i\xi \in C^k(E)$ by the Sobolev embedding theorem on the torus. So $\xi = \sum \psi_i\xi \in C^k(E)$.

(b) Say $(\xi_n)$ is bounded in $H_k(E)$. Then $\psi_i\xi_n$ is bounded in $H_k(T, \mathbb{C}^m)$ so admits a Cauchy subsequence in $H_\ell(T, \mathbb{C}^m)$. Hence $\psi_i\xi_n$ admits a Cauchy subsequence in $H_\ell(E)$. Since there are only finitely many $\psi_i$'s and $\xi_n = \sum \psi_i\xi_n$, $\xi_n$ admits a Cauchy subsequence in $H_\ell(E)$. Thus the inclusion $H_k(E) \to H_\ell(E)$ is compact.

(c) Let ψ be a bump function equal to 1 near $x_0$. Then (∆ + λ)(ψξ) = η near $x_0$. By the Weyl local regularity result on the torus, ψξ is smooth near $x_0$, since η is smooth near $x_0$. Since ψ = 1 near $x_0$, it follows that ξ is smooth near $x_0$ and hence everywhere.

(d) For k = 1, this was proved in Theorem 1. For k ≥ 2, V = I + ∆ is an isometry by definition of the positive Sobolev norms. So $V^*V = I$. Hence $VV^* = P$, a projection. It is therefore sufficient to show that $V^*$ is injective. But, since ∆ is formally self–adjoint, $V^*$ may be identified with the map $I + \Delta : H_{-k+2} \to H_{-k}$. If (I + ∆)ξ = 0, then ξ is smooth by elliptic regularity and therefore ξ = 0. So V and $V^*$ are unitary. This covers all the cases.

(e) Let $T \in B(H_0)$ be the operator obtained by composing the unitary $(I + \Delta)^{-1} : H_0 \to H_2$ with the compact inclusion $H_2 \to H_0$. Then T is a compact self–adjoint operator so the existence of an orthonormal basis of eigenfunctions follows from the spectral theorem. They are smooth by elliptic regularity. Finally we can apply the usual argument to derive the maximin principle.

(f) For k ≥ 0 this follows straight from the definition of the Sobolev norms. For k ≤ 0 it follows from the definition of the dual norms.
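On the circle the Sobolev norms and the duality statement of Theorem 1 reduce to weighted sums over Fourier coefficients. The following sketch assumes this simplest model case, which is our choice for illustration: M is the circle, D = d/dx, so that ∆ = −d²/dx² and (I + ∆)^k multiplies the n-th Fourier mode by (1 + n²)^k. The function names are hypothetical.

```python
import math

def sobolev_norm_sq(coeffs, k):
    """Squared norm ((I + Delta)^k xi, xi) for xi given by Fourier coefficients.

    coeffs maps the frequency n to the coefficient a_n of the normalised
    mode e^{inx} / sqrt(2 pi); k may be any integer, positive or negative.
    """
    return sum(abs(a) ** 2 * (1 + n * n) ** k for n, a in coeffs.items())

def apply_power(coeffs, k):
    """Apply the Fourier multiplier (I + Delta)^k to xi."""
    return {n: a * (1 + n * n) ** k for n, a in coeffs.items()}

def pairing(xi, eta):
    """The L^2 pairing (xi, eta), conjugate linear in the second variable."""
    return sum(a * eta[n].conjugate() for n, a in xi.items() if n in eta)
```

With these definitions, `sobolev_norm_sq(apply_power(xi, k), -k)` agrees with `sobolev_norm_sq(xi, k)`, which is the isometry statement for (I + ∆)^k from H_k to H_{−k} in this model case.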

19. THE HODGE THEOREM.

Theorem (Hodge decomposition). Let $\Delta = -D^2$ be the Dirac Laplacian on a Clifford bundle $V$ over $M$. Then $H^0(V) = \Delta H^2(V) \oplus \ker(\Delta)$ and $C^\infty(V) = \Delta C^\infty(V) \oplus \ker(\Delta)$.

Proof. $\Delta + I : H^2(V) \to H^0(V)$ is invertible, so that $\Delta(\Delta + I)^{-1} = (\Delta + I - I)(\Delta + I)^{-1} = I - (I + \Delta)^{-1}$ is Fredholm of index 0. Hence $\Delta : H^2(V) \to H^0(V)$ is Fredholm of index 0. Thus $\Delta H^2(V)$ is a closed subspace of $H^0(V)$ of codimension $\dim\ker(\Delta)$. But $\ker(\Delta)$ is orthogonal to $\Delta H^2(V)$ by self–adjointness, so that $\ker(\Delta) = (\Delta H^2(V))^\perp$ and $H^0(V) = \ker(\Delta) \oplus \Delta H^2(V)$. Now say $f \in C^\infty(V)$. So $f = \Delta u + g$ with $u \in H^2(V)$ and $g \in \ker(\Delta)$. By elliptic regularity, $g$ is smooth. Hence $\Delta u$ is smooth, so that $u$ is smooth, by elliptic regularity.

We have already explained how the Laplacian on forms can be regarded as a Dirac Laplacian. In fact $d = \sum e(\omega_i)\nabla_{X_i}$ and $d^* = -\sum e(\omega_i)^*\nabla_{X_i}$. Hence $D = d - d^* = \sum c(\omega_i)\nabla_{X_i}$, where $c(\omega) = e(\omega) + e(\omega)^*$.

Corollary (Hodge theorem). Let $M$ be a Riemannian manifold and let $\Delta = -D^2 = dd^* + d^*d$ be the Laplacian on forms $\Omega^*M$. Let $H^k_{dR}(M) = \ker(d)/\mathrm{im}(d)$ in $\Omega^kM$ be the $k$th de Rham cohomology group. Then
\[ H^k_{dR}(M) \cong \{\omega \in \Omega^kM : d\omega = 0,\ d^*\omega = 0\} = \{\omega \in \Omega^kM : \Delta\omega = 0\}, \]
the space of harmonic forms. This space is finite–dimensional. Moreover $\Omega M = (\ker(d) \cap \ker(d^*)) \oplus d\Omega M \oplus d^*\Omega M$.

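Proof. Let $D = d - d^*$ be the Dirac operator on forms, so that $\Delta = -D^2$ is the Laplacian. From the theorem, $\Omega M = \ker(\Delta) \oplus \Delta\Omega M$. But clearly $\ker(\Delta) = \ker(dd^* + d^*d) = \ker(d) \cap \ker(d^*)$. Since $d\Omega M$, $d^*\Omega M$ and $\ker(d) \cap \ker(d^*)$ are pairwise orthogonal and $\Delta\Omega M \subset d\Omega M \oplus d^*\Omega M$, it follows that
\[ \Omega M = (\ker(d) \cap \ker(d^*)) \oplus d\Omega M \oplus d^*\Omega M. \]
But then
\[ \ker(d)/\mathrm{im}(d) \cong \ker(d) \cap (\mathrm{im}(d))^\perp = \ker(d) \cap \big[(\ker(d) \cap \ker(d^*)) \oplus d^*\Omega M\big] = (\ker(d) \cap \ker(d^*)) \oplus (d^*\Omega M \cap \ker(d)) = \ker(d) \cap \ker(d^*), \]
since $d^*\Omega M \cap \ker(d) = (0)$ because they are orthogonal. Thus $\ker(d)/\mathrm{im}(d) \cong \ker(d) \cap \ker(d^*)$ as required.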
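The Hodge theorem has an elementary finite-dimensional analogue that can be checked by hand. The sketch below is an illustration only and is not taken from these notes: it replaces the de Rham complex by the cochain complex of a hypothetical example, the boundary of a triangle (a combinatorial circle), and checks that the kernel of each combinatorial Laplacian has the same dimension as the corresponding cohomology group. All names (`rank`, `nullity`, `d`) are chosen for this example.

```python
from fractions import Fraction


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def rank(a):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in a]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def nullity(a):
    """Dimension of the kernel of a matrix (number of columns minus rank)."""
    return len(a[0]) - rank(a)


# Coboundary d from functions on the vertices {0, 1, 2} to functions on the
# edges (0,1), (1,2), (0,2): (df)(i, j) = f(j) - f(i).
d = [[-1, 1, 0],
     [0, -1, 1],
     [-1, 0, 1]]
dt = transpose(d)
laplace0 = matmul(dt, d)  # d*d on 0-cochains
laplace1 = matmul(d, dt)  # dd* on 1-cochains (there are no 2-cells)

betti0 = nullity(d)          # dim ker(d) in degree 0
betti1 = len(d) - rank(d)    # dim of (1-cochains) / im(d)
harmonic0 = nullity(laplace0)
harmonic1 = nullity(laplace1)
print(betti0, harmonic0, betti1, harmonic1)
```

Exact rational arithmetic is used so that the rank computation is not affected by rounding.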

20. FREDHOLM PROPERTIES AND THE MCKEAN–SINGER INDEX FORMULA. The Dirac operator $D$ on a Clifford bundle $E = E_+ \oplus E_-$ splits up as
\[ D = \begin{pmatrix} 0 & D_- \\ D_+ & 0 \end{pmatrix} \]
with $D_+^* = -D_-$ formally.

Proposition. $D : H^1(E) \to H^0(E)$ is a Fredholm operator of index 0. Hence the operators $D_\pm : H^1(E_\pm) \to H^0(E_\mp)$ are Fredholm.

Remark. In fact it is not hard to show that the index of $D_+$ does not depend on the choice of defining connection.

Proof. Since $\Delta = -D^2$ and $D^* = -D$, we get $(I + D)^*(I + D) = (I - D)(I + D) = I + \Delta$. It follows easily that the operator $I + D$ gives a unitary map of $H^1$ onto $H^0$. Under this isomorphism $D$ becomes the operator $D(I + D)^{-1}$ on $H^0$. But $D(I + D)^{-1} = I + K$ with $K = -(I + D)^{-1}$ compact by Rellich’s compactness lemma. Hence $D$ is Fredholm of index 0. Since $D = D_+ \oplus D_-$, both $D_+$ and $D_-$ are Fredholm with $\mathrm{ind}(D_+) = -\mathrm{ind}(D_-)$.


Theorem (McKean–Singer). $\mathrm{ind}(D_+) = \mathrm{Tr}(e^{D_-D_+t}) - \mathrm{Tr}(e^{D_+D_-t})$.

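Proof. Let $V_\pm(\lambda)$ be the $\lambda$–eigenspace of $\Delta = -D^2$ in $C^\infty(E_\pm)$. Since $D^* = -D$ and $D_+^* = -D_-$, we have $V_\pm(0) = \ker(D_\pm^*D_\pm) = \ker(D_\pm)$. On the other hand if $\lambda \ne 0$, we have $V_+(\lambda) \cong V_-(\lambda)$. In fact $D(V_\pm(\lambda)) \subseteq V_\mp(\lambda)$ and this must be an isomorphism since $D^2 = -\lambda I$ on each space. Hence
\[ \mathrm{Tr}(e^{D_-D_+t}) - \mathrm{Tr}(e^{D_+D_-t}) = \sum_\lambda \dim(V_+(\lambda))e^{-\lambda t} - \sum_\lambda \dim(V_-(\lambda))e^{-\lambda t} = \dim(V_+(0)) - \dim(V_-(0)) = \dim(\ker D_+) - \dim(\ker D_-) = \mathrm{ind}(D_+). \]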
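The cancellation of the non-zero eigenvalues can be seen already for matrices. The following sketch is a finite-dimensional toy model and not part of the notes: a hypothetical real $2 \times 3$ matrix `D_PLUS` plays the role of $D_+$, and with the convention $D_+^* = -D_-$ used above the two heat operators are $e^{-tD_+^*D_+}$ and $e^{-tD_+D_+^*}$. The difference of traces should equal $\dim\ker D_+ - \dim\ker D_+^* = 3 - 2 = 1$ for every $t$.

```python
def matmul(a, b):
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def heat_trace(a, t, terms=60):
    """Tr exp(-t a) for a square matrix a, summed from the exponential series."""
    n = len(a)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total, coeff = 0.0, 1.0
    for k in range(terms):
        total += coeff * trace(power)   # adds (-t)^k Tr(a^k) / k!
        power = matmul(power, a)
        coeff *= -t / (k + 1)
    return total


def supertrace(dplus, t):
    """Tr exp(-t D+* D+) - Tr exp(-t D+ D+*) for a real matrix D+."""
    adjoint = [list(col) for col in zip(*dplus)]
    return (heat_trace(matmul(adjoint, dplus), t)
            - heat_trace(matmul(dplus, adjoint), t))


# Hypothetical example: D+ maps R^3 onto R^2, so its index is 3 - 2 = 1.
D_PLUS = [[1.0, 2.0, 0.0],
          [0.0, 1.0, -1.0]]

for t in (0.1, 0.5, 1.0):
    print(t, supertrace(D_PLUS, t))
```

The power series is adequate here only because the matrices are tiny and $t$ is moderate; it is not a recommended way to compute matrix exponentials in general.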

21. THE SPECTRUM OF THE DIRAC LAPLACIAN. We use a maximin argument to get more information on the spectrum of the Dirac Laplacian $\Delta$. Since the operator $T = (I + \Delta)^{-1} \in B(L^2(V))$ is compact and self–adjoint, we see that the eigenvalues of $\Delta$ satisfy $\lambda_0 \le \lambda_1 \le \cdots$ with $\lambda_n \to \infty$. Moreover $\lambda_k$ is given by the variational principle
\[ \lambda_k = \max_{G \subset C^\infty(V),\ \dim(G) = k}\ \min_{f \in H^1(V),\ f \in G^\perp} \frac{(Df, Df)}{(f, f)}. \]
We use the maximin principle to obtain upper and lower bounds for the eigenvalues in terms of the corresponding eigenvalues for the Dirichlet and Neumann boundary value problems for a rectangular cube.

Dirichlet boundary value problem for a rectangular cube. Let $D$ be the cube $[0, \pi]^n$ in $\mathbb{R}^n$ or $T^n$ (identified with $[-\pi, \pi]^n$). The eigenfunctions of the Dirichlet problem on $D$ are just functions $f$ continuous on $D$, vanishing on the boundary, twice differentiable in $D$ and such that $-\sum \partial_i^2 f = \mu f$. This is just a product of one–dimensional Sturm–Liouville problems: the eigenfunctions are $\prod \sin(m_ix_i)$ with $m_i \ge 1$ and corresponding eigenvalue $\|m\|^2$. Now observe that these eigenvalues satisfy
\[ \mu_k \le Ak^{1/n}, \qquad (*) \]
where $N = 2n$ is the dimension of the space. Indeed if $R$ is an integer, $|\{m : \|m\| \le \sqrt{N}R\}| \ge |\{m : 1 \le m_i \le R\}|$, so that $\mu_{R^N} \le NR^2$. Given $k$ large, take $R$ so that $R^N \le k \le (R + 1)^N$. Then $\mu_{R^N} \le \mu_k \le \mu_{(R+1)^N}$, so that $\mu_k \le N(R + 1)^2$ and hence $(*)$, since $R \sim k^{1/N} = k^{1/2n}$.

To use this information we need a result on $H^1(T^n)$. Let $A \subset T$ be a finite set, so that $\{z \in T^n : z_i \notin A \text{ for all } i\}$ is a union of open rectangular cubes $D_j$. Let $g$ be a continuous function on $T^n$ such that for each $i$, $\partial g/\partial x_i$ exists on each $D_j$ and extends to a continuous function on the closure $\overline{D_j}$. We call $g$ a piecewise differentiable function.

Lemma. If g is piecewise differentiable, then g ∈ H1(Tn).

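Proof. Clearly $\partial g/\partial x_i$ lies in $L^2(T^n)$. Integration by parts in the $i$th coordinate shows that $\widehat{D_ig}(m) = m_i\hat{g}(m)$. Hence
\[ \sum_m (1 + \|m\|^2)|\hat{g}(m)|^2 = (g, g) + \sum_i \Big(\frac{\partial g}{\partial x_i}, \frac{\partial g}{\partial x_i}\Big) < \infty, \]
as required.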

Theorem (Upper Weyl estimate). There is a constant $A > 0$ such that $\lambda_k \le Ak^{1/n}$ (where $\dim(M) = 2n$).

Proof. Let $U$ be an open set in $M$ diffeomorphic to an open ball $B \subset T^n$ and let $D$ be an open rectangular cube with $D \subset B$. The vector bundle becomes trivial over $B$ and the Dirac Laplacian has the form
\[ -D^2 = -\sum \frac{\partial}{\partial x_i}a_{ij}\frac{\partial}{\partial x_j} + \sum b_i(x)\frac{\partial}{\partial x_i} + c(x) \]
acting on vector–valued functions. Now if $f \in C_c^\infty(U)$, $(Df, Df) \le C(\sum (f_{x_i}, f_{x_i}) + \|f\|^2)$ for some $C > 0$. Let $f_0, f_1, \dots, f_k$ be the first eigenfunctions for the Dirichlet problem on $D$ with eigenvalues $\mu_0, \dots, \mu_k$ and extend $f_i$ to be zero off $D$. Then $f_i \in H^1(V)$ since it is piecewise differentiable. Let $G = \mathrm{lin}(g_0, \dots, g_{k-1})$, where $g_i$ are the eigenfunctions of $\Delta$ corresponding to eigenvalues $\lambda_i$. We can find $f \ne 0$ in $\mathrm{lin}(f_0, \dots, f_k)$ such that $f \perp G$. Hence
\[ \lambda_k \le \frac{(Df, Df)}{(f, f)} \le C\,\frac{\sum (f_{x_i}, f_{x_i})_D}{(f, f)_D} + C \le C\mu_k + C \le Ak^{1/n}. \]
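The lattice-point count behind the Dirichlet estimate is easy to test numerically. The sketch below is an illustration with hypothetical function names: it enumerates the Dirichlet eigenvalues $\|m\|^2$ ($m_i \ge 1$) on a cube of dimension `dim` and checks that at least $R^{\dim}$ of them are at most $\dim \cdot R^2$, which is the statement that the $R^{\dim}$-th eigenvalue is bounded by $\dim \cdot R^2$.

```python
from itertools import product
from math import isqrt


def dirichlet_eigenvalues(dim, bound):
    """Sorted Dirichlet eigenvalues ||m||^2 <= bound on the cube [0, pi]^dim."""
    top = isqrt(bound)  # no coordinate m_i can exceed sqrt(bound)
    values = []
    for m in product(range(1, top + 1), repeat=dim):
        value = sum(mi * mi for mi in m)
        if value <= bound:
            values.append(value)
    return sorted(values)


def check_counting_bound(dim, r):
    """True if the (r^dim)-th smallest Dirichlet eigenvalue is <= dim * r^2."""
    eigenvalues = dirichlet_eigenvalues(dim, dim * r * r)
    count = r ** dim
    return len(eigenvalues) >= count and eigenvalues[count - 1] <= dim * r * r


for r in range(1, 6):
    print(r, check_counting_bound(2, r), check_counting_bound(4, r))
```

The check succeeds for a simple reason: every $m$ with $1 \le m_i \le R$ has $\|m\|^2 \le \dim \cdot R^2$, so the box of side $R$ sits inside the ball of radius $\sqrt{\dim}\,R$.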


Neumann boundary value problem for a rectangular cube. We generalize the doubling trick already used for the one–dimensional Sturm–Liouville problem. Define even functions on $[-\pi, \pi]^n$ to be those invariant under the transformations $x_i \mapsto -x_i$. By definition they are periodic so can be identified with functions on $T^n$ invariant under the transformations $z_i \mapsto \overline{z_i}$. The Laplacian $\Delta_0 = -\sum \partial_i^2$ leaves this subspace invariant. Its eigenfunctions are just the functions $\prod \cos(m_ix_i)$ with $m_i \ge 0$ and corresponding eigenvalue $\|m\|^2$. Just as in the Dirichlet case, these eigenvalues satisfy
\[ \mu_k \ge Bk^{1/n}, \qquad (*) \]
where $N = 2n$ is the dimension of the space. Indeed, if $R$ is an integer, $|\{m : \|m\| \le \sqrt{N}^{-1}R\}| \le 2^N|\{m : 0 \le m_i \le R\}|$, so that $\mu_{R^N} \ge 2^{-N}N^{-1}(R + 1)^2$. Given $k$ large, take $R$ so that $R^N \le k \le (R + 1)^N$. Then $\mu_{R^N} \le \mu_k \le \mu_{(R+1)^N}$, so that $\mu_k \ge 2^{-n}N^{-1}R^2$ and hence $(*)$, since $R \sim k^{1/2n}$.

Theorem (Lower Weyl estimate). There is a constant $B > 0$ such that $\lambda_k \ge Bk^{1/n}$ (where $\dim(M) = 2n$).

Proof. Cover $M$ by finitely many open rectangular cubes $D^{(i)}$ ($i = 1, \dots, m$) contained in open balls $B_i$. Let $\langle f, g\rangle_{D^{(i)}}$ be the standard Euclidean inner product on each cube. It is clearly equivalent to the inner product induced by the metric on $M$. On each $D^{(i)}$ we have
\[ (Df, Df) \ge \varepsilon\sum_i \langle f_{x_i}, f_{x_i}\rangle - C\langle f, f\rangle \]
for some $\varepsilon > 0$ and constant $C$ (this is an easy case of Gårding’s inequality). Let $g^{(j)}_0, \dots, g^{(j)}_k$ be the first eigenfunctions for the Neumann problem on $D^{(j)}$ with corresponding eigenvalues $\mu^{(j)}_i$. Let $f_0, \dots, f_{km}$ be the first eigenfunctions of $\Delta = -D^2$ in $C^\infty(V)$ with corresponding eigenvalues $\lambda_i$. Choose $f \ne 0$ in $\mathrm{lin}(f_0, \dots, f_{km})$ such that $\langle f, g^{(j)}_i\rangle_{D^{(j)}} = 0$ for all $i, j$. Note that the lemma on piecewise differentiable functions shows that the restriction of $f$ to each cube $D^{(j)}$ defines a function in $H^1(T^n)$ by reflection. Since $f$ lies in the span of the first $mk + 1$ eigenfunctions, we know that $(Df, Df) \le \lambda_{km}(f, f)$. But then, since $m(g, g) \ge \sum (g, g)_{D^{(j)}} \ge (g, g)$ (the $D^{(j)}$’s cover $M$), we get, using the equivalence of the inner products,
\[ (Df, Df) \ge m^{-1}\sum_j (Df, Df)_{D^{(j)}} \ge m^{-1}\sum_{j=1}^m \varepsilon(\mu^{(j)}_k - C)\langle f, f\rangle_{D^{(j)}} \ge m^{-1}(\min_j \mu^{(j)}_k - C)\sum_j (f, f)_{D^{(j)}} \ge m^{-1}(\min_j \mu^{(j)}_k - C)(f, f). \]
Hence $\lambda_{km} \ge C'(\min_j \mu^{(j)}_k - C)$, which implies that $\lambda_k \ge Bk^{1/n}$ (for $k$ sufficiently large).

22. GLOBAL SOBOLEV CONSTRUCTION OF HEAT KERNEL. To understand the main idea we first treat the case of the usual Laplacian $\Delta$ acting on functions. Let $\psi_k \in C^\infty(M)$ be the real eigenfunctions with $\Delta\psi_k = \lambda_k\psi_k$. Then the heat operator $e^{-t\Delta}$ satisfies $e^{-t\Delta}\psi_k = e^{-\lambda_kt}\psi_k$. It therefore has kernel
\[ K_t(x, y) = \sum_k e^{-t\lambda_k}\psi_k(x)\psi_k(y). \]
On the other hand the Laplacian of $M \times M$ is $\Delta \otimes I + I \otimes \Delta$ and has eigenfunctions $\psi_k(x)\psi_\ell(y)$ with eigenvalue $\lambda_k + \lambda_\ell$. But then since $\lambda_k \sim k^{1/n}$ it is easy to check that $K_t$ lies in $H^m(M \times M)$ for all $m$, so is smooth. In particular it defines a Hilbert–Schmidt operator. Since
\[ K_t(x, y) = \int_M K_{t/2}(x, z)K_{t/2}(z, y)\,dz \]
it follows that $e^{-t\Delta}$ is trace–class as the product of Hilbert–Schmidt operators. Computing the trace using the inner product on these operators we get Mercer’s formula:
\[ \mathrm{Tr}\,e^{-t\Delta} = \int_M K_t(x, x)\,dx. \]


(The analogous formula in the bundle case requires a trace because the kernel is matrix–valued.) Note that the estimates for the eigenvalues guarantee that $\sum e^{-t\lambda_k} < \infty$, so that $e^{-t\Delta}$ is trace–class.

We now generalize this argument to Clifford bundles. Let $E$ be a Clifford bundle with Dirac operator $D$ and Dirac Laplacian $\Delta = -D^2$. The heat operator $e^{tD^2} = e^{-t\Delta}$ is defined as an operator on the Sobolev spaces $H^k(E)$ using the eigenfunctions $\psi_n$, with $\Delta\psi_n = \lambda_n\psi_n$. In fact we set
\[ e^{-t\Delta}\sum a_n\psi_n = \sum a_ne^{-\lambda_nt}\psi_n. \]

Since $\lambda_n \ge 0$ for all $n$, this clearly defines a bounded operator of norm $\le 1$ on each $H^k(E)$. We wish to show that this operator is Hilbert–Schmidt or trace–class with smooth kernel so that its trace on $H^0(E_\pm)$ can be computed by integrating the kernel “on the diagonal”. The trick is to realise this is a problem about Sobolev spaces on $M \times M$: we must construct the kernel within $H^k(E \boxtimes E^*)$ where $E \boxtimes E^*$ is the external tensor product bundle over $M \times M$.

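Let $E^*$ be the dual bundle of $E$ and form the external tensor product $E \boxtimes E^*$ over $M \times M$; this has fibre $E_x \otimes E_y^*$ over $(x, y) \in M \times M$. If $\psi \in C^\infty(E)$, we define $\psi^* \in C^\infty(E^*)$ using the inner product on the fibres, $\psi^*(\xi) = (\xi, \psi)$. If $\psi_n \in C^\infty(E)$ are the eigenfunctions of $\Delta = -D^2$ with $\Delta\psi_n = \lambda_n\psi_n$, then $\psi_n^*$ are the eigenfunctions of $\Delta$ on $C^\infty(E^*)$ with $\Delta\psi_n^* = \lambda_n\psi_n^*$. Clearly the Hilbert space $L^2(E \boxtimes E^*)$ has orthonormal basis $\psi_i \otimes \psi_j^*$. If $D_1$ is the Dirac operator on $E$ and $D_2$ the Dirac operator on $E^*$, then $D_1$ and $D_2$ act on sections $\xi \otimes \eta^*$ by $D_1(\xi \otimes \eta^*) = (D_1\xi) \otimes \eta^*$ and $D_2(\xi \otimes \eta^*) = \xi \otimes (D_2\eta^*)$. Because of the way in which $D_1$ and $D_2$ involve Clifford multiplications, they anticommute, so that $D_1D_2 = -D_2D_1$. On the other hand the Dirac operator for $E \boxtimes E^*$ is $D = D_1 + D_2$. Hence the Dirac Laplacian is $\Delta = -D^2 = -D_1^2 - D_2^2 = \Delta_1 + \Delta_2$. Thus $\Delta(\psi_i \otimes \psi_j^*) = (\lambda_i + \lambda_j)\psi_i \otimes \psi_j^*$. Thus the Sobolev spaces of $E \boxtimes E^*$ can be described in terms of the orthonormal basis $\psi_i \otimes \psi_j^*$ as
\[ H^k(E \boxtimes E^*) = \Big\{\sum a_{ij}\psi_i \otimes \psi_j^* : \sum |a_{ij}|^2(1 + (\lambda_i + \lambda_j)^2)^k < \infty\Big\}. \]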

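Kernels in $L^2(E \boxtimes E^*)$ acting on $C^\infty(E)$ correspond to Hilbert–Schmidt operators, $K = \sum a_{ij}\psi_i \otimes \psi_j^*$. They act according to the formula $K \star \xi(x) = \int_M K(x, y)\xi(y)\,dy$. Under this identification $\psi \otimes \varphi^*$ goes to the rank one operator $\xi \mapsto (\xi, \varphi)\psi$. The operator $e^{-t\Delta}$ is by definition the diagonal operator sending $\psi_n$ to $e^{-\lambda_nt}\psi_n$. Its matrix is $a_{ij} = \delta_{ij}e^{-\lambda_it}$. Since $\lambda_j \sim j^{1/n}$ if $\dim(M) = 2n$, it follows from the integral test that, for each $k$,
\[ \sum_j e^{-\lambda_jt}(1 + 4\lambda_j^2)^k < \infty. \]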

Hence $K_t = \sum_j e^{-\lambda_jt}\psi_j \otimes \psi_j^*$ lies in $H^k(E \boxtimes E^*)$ for all $k$, i.e. $K_t$ lies in $C^\infty(E \boxtimes E^*)$. Moreover by construction the map $t \mapsto K_t$ is a smooth map of $(0, \infty)$ into each Hilbert space $H^k$. “Smooth” here means that $t \mapsto (K_t, F)$ is smooth for any $F \in L^2(E \boxtimes E^*)$, which is clear from the construction. It is easy to check that this implies that $K_t(x, y)$ is smooth on $(0, \infty) \times M \times M$ in the classical sense. [In fact if $\chi(t) \in C_c^\infty(0, \infty)$ is a bump function, then $F(t, x, y) = \chi(t)K_t(x, y)$ may be regarded as periodic in $t$. By the Sobolev embedding theorem for $T \times M \times M$, we are reduced to checking that $\|F\|_{(k)} < \infty$. But this follows from the finiteness of
\[ \int_a^b \|\partial_t^mK_t\|_{(k)}^2\,dt = \int_a^b \sum_j |\lambda_j|^{2m}(1 + 4|\lambda_j|^2)^ke^{-2\lambda_jt}\,dt \]
for $b > a > 0$.] Since $\Delta$ leaves $C^\infty(E_\pm)$ invariant, the above arguments imply that the restrictions $e^{-t\Delta_\pm}$ of the heat operator are given by smooth kernels in $C^\infty(E_\pm \boxtimes E_\pm^*)$. We summarise our findings.

Theorem. The heat operators $e^{-t\Delta}$ and $e^{-t\Delta_\pm}$ are given by smooth kernels in $C^\infty(E \boxtimes E^*)$ and $C^\infty(E_\pm \boxtimes E_\pm^*)$ depending smoothly on $t \in (0, \infty)$.

Mercer’s formula. The same argument as above gives:
\[ \mathrm{Tr}(e^{-t\Delta_\pm}) = \int_M\int_M \mathrm{tr}\,K_{t/2}(x, z)K_{t/2}(z, x)\,dz\,dx = \int_M \mathrm{tr}\,K_t(x, x)\,dx. \]
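Mercer’s formula can be tried out on the simplest compact manifold. The sketch below is an illustration that is not in the notes: it takes the circle of length $2\pi$ with the scalar Laplacian $-d^2/dx^2$, whose eigenvalues are $k^2$ for $k \in \mathbb{Z}$, and compares the heat trace $\sum_k e^{-tk^2}$ with the integral over the diagonal of the Euclidean heat kernel $(4\pi t)^{-1/2}$. The function names and the cutoff are assumptions of the example.

```python
import math


def heat_trace_circle(t, cutoff=200):
    """Tr exp(-t * Laplacian) on the circle: the sum over k in Z of exp(-t k^2)."""
    return sum(math.exp(-t * k * k) for k in range(-cutoff, cutoff + 1))


def leading_term(t):
    """Integral over the circle of the Euclidean heat kernel on the diagonal."""
    return 2 * math.pi * (4 * math.pi * t) ** -0.5


for t in (1.0, 0.5, 0.1):
    print(t, heat_trace_circle(t), leading_term(t))
```

For small $t$ the two numbers agree to many digits (the discrepancy is exponentially small in $1/t$ by Poisson summation), which is the one-dimensional shadow of approximating the global kernel by a local one.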

23. LOCAL HADAMARD CONSTRUCTION OF HEAT KERNEL. We give Hadamard’s asymptotic method of constructing the heat kernel as first used by Minakshisundaram and Pleijel. This gives a formal power series solution in $t$ which need not converge: it is only asymptotic. This is in contrast to the corresponding constructions for the Laplacian or wave operator for which Hadamard proved convergence. Let $g_{ij}(x)$ be a metric on $B = \{x : \|x\| < r\}$ with $g_{ij}(0) = \delta_{ij}$. Let
\[ L = -\sum_{ij} \partial_jg^{ij}\partial_i + \sum_i b_i\partial_i + c, \]
where $g(x)^{-1} = (g^{ij}(x))$ and $b_i, c \in C^\infty(B, M_N(\mathbb{C}))$. We assume in addition that we are in normal coordinates, so that $g(x)x = x$ for $x \in B$.

Theorem. (a) There is a unique formal power series in $t$, $\sum_{n \ge 0} t^nB_n(x)$, with
\[ F(x, t) = (4\pi t)^{-n/2}\exp(-\|x\|^2/4t)\sum_{n \ge 0} t^nB_n(x), \]
such that $\partial_tF + LF = 0$, with $B_0(0) = I$ and $B_n \in C^\infty(B, M_N(\mathbb{C}))$ for $n > 0$.
(b) If $E = \sum x_i\partial_i$ and $h(x) = \frac{1}{2}\sum x_ib_i(x)$, then $(E + n - h)B_n = -LB_{n-1}$ for $n \ge 1$ (the transport equations). In particular $EB_0 = hB_0$, so that $B_0(x) \in GL(N, \mathbb{R})$, a synchronous gauge change. Moreover for $n > 0$, $\beta_n(x) = B_0(x)^{-1}B_n(x)$ satisfies $(E + n)\beta_n = -(B_0^{-1}LB_0)\beta_{n-1}$, $\beta_0(x) = I$, so that for $n \ge 1$
\[ \beta_n(x) = -\int_0^1 s^{n-1}(B_0^{-1}LB_0\beta_{n-1})(sx)\,ds, \qquad B_n(x) = -B_0(x)\int_0^1 s^{n-1}B_0(sx)^{-1}(LB_{n-1})(sx)\,ds. \]
(c) For $t > 0$,
\[ (\partial_t + L)\,(4\pi t)^{-n/2}\exp(-\|x\|^2/4t)\sum_{n \le N} t^nB_n(x) = (4\pi t)^{-n/2}\exp(-\|x\|^2/4t)\,L(B_N)\,t^N. \]

Proof. Let $A(x, t) = (4\pi t)^{-n/2}\exp(-\|x\|^2/4t)$ and $B(x, t) = \sum_{n \ge 0} B_n(x)t^n$. Note that, since the coordinates are normal, $\sum_j g^{ij}(x)\partial_jf(\|x\|^2) = \sum_j g^{ij}(x)2x_jf'(\|x\|^2) = 2x_if'(\|x\|^2) = \partial_if(\|x\|^2)$, so that $\sum_j g^{ij}\partial_j = \partial_i$ on radial functions in $x$. Hence $-\sum \partial_ig^{ij}\partial_j = -\sum \partial_i^2 = \Delta_0$ on radial functions in $x$. In particular this applies to $A$. By Leibniz’ rule
\[ \sum b_i\partial_i(AB) = A\Big(\sum b_i\partial_iB\Big) - t^{-1}AhB, \qquad (1) \]
where $h(x) = \frac{1}{2}\sum x_ib_i(x)$. Moreover, if $\Delta_x = -\sum \partial_ig^{ij}(x)\partial_j$, then
\[ \Delta_x(AB) = \Delta_0(A)B + A\Delta_xB + t^{-1}AEB. \qquad (2) \]
Using (1) and (2) we obtain
\[ (\partial_t + L)(AB) = \Big(\partial_t + \Delta_x + \sum b_i\partial_i + c\Big)AB = A(\partial_t + t^{-1}E - t^{-1}h + L)B, \]
since $(\partial_t + \Delta_0)A = 0$. We therefore require that $(\partial_t + t^{-1}E - t^{-1}h + L)B = 0$. Comparing coefficients of $t^{n-1}$, we get $(E - h)B_0 = 0$ and $(E + n - h)B_n = -LB_{n-1}$, the required transport equations. Setting $f(t) = B_0(x_0t)$, we must solve $\dot{f}(t) = t^{-1}h(x_0t)f(t)$, which is the ODE for a synchronous frame so has a smooth solution $B_0(x)$. Setting $\beta_n(x) = B_0(x)^{-1}B_n(x)$, we get $(E + n)\beta_n = -(B_0^{-1}LB_0)\beta_{n-1}$. Hence if $g(t) = t^n\beta_n(x_0t)$ for $x_0$ fixed, then $\dot{g}(t) = -t^{n-1}(B_0^{-1}LB_{n-1})(tx_0)$. Integrating between 0 and 1 gives the formula in (b) and also proves uniqueness. Finally
\[ (\partial_t + L)\Big[A\Big(\sum_{n=0}^N B_n(x)t^n\Big)\Big] = A(\partial_t + t^{-1}E - t^{-1}h + L)\Big(\sum_{n=0}^N B_n(x)t^n\Big) = AL(B_N)t^N, \]
by the transport equations.

Local construction of heat kernel. By compactness we can find a $\delta > 0$ such that $\{(X, x) : X \in T_xM,\ \|X\|_x < \delta\}$ is diffeomorphic to a neighbourhood $U$ of the diagonal in $M \times M$ via the exponential map $(X, x) \mapsto (\exp_x(X), x)$. By construction $U = \{(y, x) : d(x, y) < \delta\}$.


The Dirac Laplacian has the form $\Delta = -D^2 = -\sum_{i,j} \partial_ig^{ij}\partial_j\ +$ lower order terms. By Hadamard’s parametrix construction, we can construct a formal power series with coefficients $B_{n,x}(X)$, smooth for $x \in B$ and $\|X\| < \delta$, such that
\[ F_x(X, t) = (4\pi t)^{-n/2}\exp(-\|X\|^2/4t)\sum_{n \ge 0} t^nB_{n,x}(X) \]
satisfies $\partial_tF + \Delta F = 0$ with $B_{0,x}(0) = I$ and $\Delta = -D^2$. Moreover for $t > 0$, we have
\[ (\partial_t + \Delta)\,(4\pi t)^{-n/2}\exp(-\|X\|^2/4t)\sum_{n=0}^N t^nB_{n,x}(X) = (4\pi t)^{-n/2}\exp(-\|X\|^2/4t)\,\Delta(B_{N,x})\,t^N. \]
Thus if $F^{(N)}_{t,x}(X) = (4\pi t)^{-n/2}\exp(-\|X\|^2/4t)\sum_{n \le N} t^nB_{n,x}(X)$, then
\[ (\partial_t + \Delta)F^{(N)}_{t,x}(X) = (4\pi t)^{-n/2}\exp(-\|X\|^2/4t)\,\Delta(B_{N,x})\,t^N. \]

We can pull this back to a function on a neighbourhood of $(x_0, x_0)$ under the mapping $(X, x) \mapsto (\exp_x(X), x)$. By compactness finitely many such neighbourhoods cover the diagonal $\{(x, x) : x \in M\}$ in $M \times M$ and the $B_{n,x}$’s agree on the different neighbourhoods by definition. Thus if we define $B_n(\exp_y(X), y) = B_{n,y}(X)$ for $\|X\| < \delta$ we get a kernel defined on a neighbourhood $U$ of the diagonal in $M \times M$. Let $\psi(s)$ be a bump function equal to 1 near 0 and 0 for $s \ge \delta^2/4$ and let $\Psi$ be the bump function $\Psi(x, y) = \psi(d(x, y)^2)$. Set
\[ k^{(N)}_t(x, y) = \psi(d(x, y)^2)(4\pi t)^{-n/2}e^{-d(x,y)^2/4t}\sum_{k=0}^N B_k(x, y)t^k. \]

By construction $k_t = k^{(N)}_t$ lies in $C^\infty(E \boxtimes E^*)$. Moreover by Leibniz’ rule, we have locally
\[ (\partial_t + \Delta_x)k_t(\exp_y(X), y) = (\partial_t + L_y)\,\Psi A_y(X, t)\sum_{k=0}^N t^kB_{y,k}(X) = \psi(\|X\|^2)A_y(X, t)\,\Delta(B_{y,N})\,t^N + [L, \Psi]A_y(X, t)\sum_{i=0}^N B_{y,i}t^i, \]
where $[L, \Psi]$ is a first order differential operator vanishing in a neighbourhood of the diagonal.

Theorem. The Hadamard parametrix construction gives a kernel $k^{(N)}_t \in C^\infty(E \boxtimes E^*)$ such that

(a) $(\partial_t - D_x^2)k^{(N)}_t = (4\pi t)^{-n/2}e^{-d^2/4t}R_t$ where $R_t$ is a polynomial of degree $N$ in $t$ with coefficients in $C^\infty(E \boxtimes E^*)$, with the non–leading coefficients vanishing for $d(x, y) \le \delta$, so that in particular any Sobolev norm of the right hand side is bounded by a constant multiple of $t^{N-n/2}$;

(b) if $\xi, \eta \in C^\infty(E)$, then $\int_M (k^{(N)}_t(x, y)\xi(y), \eta(x))\,dx \to (\xi(y), \eta(y))$ uniformly in $y$ as $t \to 0$.

Proof. (a) is immediate from the previous discussion. (b) is easy to verify locally using the properties of the Euclidean heat kernel $k_t(X) = (4\pi t)^{-n/2}e^{-\|X\|^2/4t}$. In fact if $f \in C_c(\mathbb{R}^n)$, then setting $X = t^{1/2}Y$, we have
\[ \int k_t(X)f(X)\,dX - f(0) = (4\pi)^{-n/2}\int e^{-\|Y\|^2/4}\big(f(t^{1/2}Y) - f(0)\big)\,dY, \]
which evidently tends to 0 as $t \to 0$.

24. LICHNEROWICZ’S FORMULA FOR THE SQUARE OF THE DIRAC OPERATOR.

The connection Laplacian. Let $(X_i)$ be a local frame and $(Y_i)$ the dual frame, so that $g(X_i, Y_j) = \delta_{ij}$. If $E$ is a hermitian vector bundle with compatible connection $\nabla_X$, we define the connection Laplacian $\Delta^A = \sum \nabla^*_{Y_i}\nabla_{X_i}$ ($A$ here stands for the gauge field of the connection). This is independent of the choice of dual bases and hence globally defined.


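Lemma. If $(Y_i)$ is the dual basis to $(X_i)$ with respect to $g$, then
\[ \Delta^A = -\sum (\nabla_{Y_i}\nabla_{X_i} - \nabla_{\nabla_{Y_i}X_i}) = -\sum (\nabla_{Y_i}\nabla_{X_i} + \mathrm{div}(Y_i)\nabla_{X_i}). \]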

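Proof. Since $\nabla^*_{Y_i} = -\nabla_{Y_i} - \mathrm{div}(Y_i)$, we have $\Delta^A = \sum \nabla^*_{Y_i}\nabla_{X_i} = -\sum (\nabla_{Y_i}\nabla_{X_i} + \mathrm{div}(Y_i)\nabla_{X_i})$. On the other hand, since $g(X_i, Y_j) \equiv \delta_{ij}$ locally, we have
\[ \sum_i \mathrm{div}(Y_i)\nabla_{X_i} = \sum_{i,k} g(\nabla_{Y_k}Y_i, X_k)\nabla_{X_i} = -\sum_{i,k} g(Y_i, \nabla_{Y_k}X_k)\nabla_{X_i} = -\sum_k \nabla_{\nabla_{Y_k}X_k}. \]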

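Theorem. (a) If $(X_i)$ is any local frame and $g_{ij} = g(X_i, X_j)$, with inverse matrix $(g^{ij})$, then $\Delta^A = \sum \nabla^*_{X_i}g^{ij}\nabla_{X_j}$.

(b) In local coordinates $x_1, \dots, x_n$, $\Delta^A = -\sum g^{ij}(\nabla_i\nabla_j - \sum_k \Gamma^k_{ij}\nabla_k)$.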

Proof. (a) If $(X_i)$ is a local basis, then the dual basis with respect to $g$ is $Y_i = \sum_j g^{ij}X_j$. Since $\nabla_{Y_i} = \sum_j g^{ij}\nabla_{X_j}$, we get $\nabla^*_{Y_i} = \sum_j \nabla^*_{X_j}g^{ij}$, so the result follows.

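(b) Let $F(X, Y) = \nabla_X\nabla_Y - \nabla_{\nabla_XY}$. This is $C^\infty(M)$–linear in $X$. Moreover, since $\nabla_XY - \nabla_YX = [X, Y]$, we get $F(X, Y) - F(Y, X) = [\nabla_X, \nabla_Y] - \nabla_{[X,Y]} = K(X, Y)$. Since $K(X, Y)$ is $C^\infty(M)$–linear in $Y$, it follows that $F(X, Y)$ is also $C^\infty(M)$–linear in $Y$. From the lemma, $\Delta^A = -\sum_j F(Y_j, X_j)$. Since $Y_j = \sum g^{ij}X_i$ and $\nabla_{X_i}X_j = \sum_k \Gamma^k_{ij}X_k$, we get
\[ \Delta^A = -\sum g^{ij}F(X_i, X_j) = -\sum g^{ij}\Big(\nabla_{X_i}\nabla_{X_j} - \sum_k \Gamma^k_{ij}\nabla_{X_k}\Big), \]
as required.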

Remark. All the theory of Sobolev spaces could equally well have been done in terms of the connection Laplacian, so applies to any hermitian vector bundle. As we shall see, it only differs from the Dirac Laplacian by a multiplication operator in $C^\infty(M)$.

A curvature computation. If $M$ is a Riemannian manifold, the scalar curvature $\kappa$ is just $\sum_{i,j} R_{ijij}$, where for simplicity we write $R_{ijk\ell}$ for the components of the curvature tensor with respect to an orthonormal frame $(e_i)$. Let us also abbreviate $c(e_i)$ to $c_i$. Apart from the obvious symmetry properties $R_{ijk\ell} = R_{k\ell ij} = -R_{jik\ell} = -R_{ij\ell k}$, the tensor $R_{ijk\ell}$ satisfies Bianchi’s first identity, namely $R_{ijk\ell} + R_{ik\ell j} + R_{i\ell jk} = 0$. (This is an immediate consequence of the Jacobi identity.)

Lemma (Lichnerowicz). $\sum_{i,j,k,\ell} R_{ijk\ell}c_ic_jc_kc_\ell = -2\sum_{i,j} R_{ijij}$.

This lemma will be proved below. We use it now to establish the fundamental:

Theorem (Lichnerowicz’s formula). If $D$ is the Dirac operator on a Clifford bundle $S \otimes E$, then $-D^2 = \Delta^A + \kappa/4 + F$, where $\Delta^A$ is the connection Laplacian on $S \otimes E$, $\kappa$ is the scalar curvature and $F = \frac{1}{2}\sum K(X_i, X_j)c(\omega_i)c(\omega_j)$ with $K(X, Y)$ the curvature tensor of $E$.

Remark. This formula was first proved by Weitzenböck and Bochner for the Dirac operator $d + d^*$ on functions and forms.

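Proof. We take local coordinates $x_1, \dots, x_n$, set $X_i = \partial_i$ and let $\omega_i$ be the dual basis of 1–forms. We first work out how the covariant derivative acts on 1–forms. In fact
\[ 0 = X_i(X_a, \omega_b) = (\nabla_{X_i}X_a, \omega_b) + (X_a, \nabla_{X_i}\omega_b) = \Gamma^b_{ia} + (X_a, \nabla_{X_i}\omega_b), \]
so that $\nabla_{X_i}\omega_j = -\sum_k \Gamma^j_{ik}\omega_k$. Since $D = \sum c(\omega_i)\nabla_{X_i}$, $\Gamma^k_{ij} = \Gamma^k_{ji}$ and $[X_i, X_j] = 0$ locally, we get
\begin{align*}
D^2 &= \sum_{i,j} c(\omega_i)\nabla_{X_i}c(\omega_j)\nabla_{X_j} \\
&= \sum c(\omega_i)c(\omega_j)\nabla_{X_i}\nabla_{X_j} + \sum c(\omega_i)c(\nabla_{X_i}\omega_j)\nabla_{X_j} \\
&= \frac{1}{2}\sum \big[c(\omega_i)c(\omega_j)\nabla_{X_i}\nabla_{X_j} + c(\omega_j)c(\omega_i)\nabla_{X_j}\nabla_{X_i}\big] + \sum c(\omega_i)c(\nabla_{X_i}\omega_j)\nabla_{X_j} \\
&= \frac{1}{2}\sum \big[c(\omega_i)c(\omega_j) + c(\omega_j)c(\omega_i)\big]\nabla_{X_i}\nabla_{X_j} + \sum c(\omega_i)c(\nabla_{X_i}\omega_j)\nabla_{X_j} - \frac{1}{2}\sum c(\omega_i)c(\omega_j)[\nabla_{X_i}, \nabla_{X_j}] \\
&= \sum_{i,j} g^{ij}\nabla_{X_i}\nabla_{X_j} + \sum_{i,j} c(\omega_i)c(\nabla_{X_i}\omega_j)\nabla_{X_j} - \frac{1}{2}\sum_{i,j} c(\omega_i)c(\omega_j)\big([\nabla_{X_i}, \nabla_{X_j}] - \nabla_{[X_i,X_j]}\big) \\
&= \sum_{i,j} g^{ij}\nabla_{X_i}\nabla_{X_j} - \sum_{i,j,k} c(\omega_i)\Gamma^j_{ik}c(\omega_k)\nabla_{X_j} - \frac{1}{2}\sum_{i,j} c(\omega_i)c(\omega_j)\big([\nabla_{X_i}, \nabla_{X_j}] - \nabla_{[X_i,X_j]}\big) \\
&= \sum_{i,j} g^{ij}\nabla_{X_i}\nabla_{X_j} - \frac{1}{2}\sum_{i,j,k} \big[c(\omega_i)c(\omega_j) + c(\omega_j)c(\omega_i)\big]\Gamma^k_{ij}\nabla_{X_k} - \frac{1}{2}\sum_{i,j} c(\omega_i)c(\omega_j)\big([\nabla_{X_i}, \nabla_{X_j}] - \nabla_{[X_i,X_j]}\big) \\
&= \sum_{i,j} g^{ij}\nabla_{X_i}\nabla_{X_j} - \sum_{i,j,k} g^{ij}\Gamma^k_{ij}\nabla_{X_k} - \frac{1}{2}\sum_{i,j} c(\omega_i)c(\omega_j)\big([\nabla_{X_i}, \nabla_{X_j}] - \nabla_{[X_i,X_j]}\big) \\
&= \sum_{i,j} g^{ij}\Big(\nabla_{X_i}\nabla_{X_j} - \sum_k \Gamma^k_{ij}\nabla_{X_k}\Big) - \frac{1}{2}\sum_{i,j} c(\omega_i)c(\omega_j)\big(R(X_i, X_j) + K(X_i, X_j)\big) \\
&= -\Delta^A - \kappa/4 - F,
\end{align*}
by the previous lemma.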

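Proof of Lemma. By Bianchi’s identity we have
\begin{align*}
\sum_{i,j,k,\ell} R_{ijk\ell}c_ic_jc_kc_\ell &= -\sum_{i,j,k,\ell} R_{ijk\ell}c_i(c_kc_\ell c_j + c_\ell c_jc_k) \\
&= 2\sum_{i,j,k,\ell} R_{ijk\ell}c_i(\delta_{jk}c_\ell + \delta_{k\ell}c_j - 2\delta_{j\ell}c_k - c_jc_kc_\ell).
\end{align*}
Here we converted the two bracket terms to $c_jc_kc_\ell$ using the Clifford relations. Taking this term over to the other side gives
\[ \sum_{i,j,k,\ell} R_{ijk\ell}c_ic_jc_kc_\ell = \frac{2}{3}\sum_{i,j,k,\ell} R_{ijk\ell}c_i(\delta_{jk}c_\ell + \delta_{k\ell}c_j - 2\delta_{j\ell}c_k). \]
The symmetry properties allow the $\delta_{k\ell}$ term to be deleted and, upon rearrangement, we get
\begin{align*}
\sum_{i,j,k,\ell} R_{ijk\ell}c_ic_jc_kc_\ell &= \frac{2}{3}\sum_{i,j,k,\ell} R_{ijk\ell}(\delta_{jk}c_ic_\ell - 2\delta_{j\ell}c_ic_k) \\
&= 2\sum_{i,j,\ell} R_{ijj\ell}c_ic_\ell = 2\sum_{i,j,\ell} R_{ijj\ell}(c_ic_\ell + c_\ell c_i)/2 = 2\sum_{i,j} R_{ijji},
\end{align*}
as required.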

Corollary 1. If $D$ is the Dirac operator on the spin bundle $S$, then $-D^2 = \Delta^S + \kappa/4$, where $\kappa$ is the scalar curvature and $\Delta^S$ is the spin Laplacian.

Corollary 2 (Lichnerowicz). If $M$ has positive scalar curvature, $M$ has no harmonic spinors (i.e. solutions of $D\xi = 0$).


25. SUPERSYMMETRIC PROOF OF THE ATIYAH–SINGER INDEX THEOREM.

STEP I: The local heat kernel approximates the global heat kernel. We know that the Sobolev heat kernel $K_t \in C^\infty(E \boxtimes E^*)$ satisfies $(\partial_t - D_x^2)K_t = 0$ and $K_t \star \xi \to \xi$ in $H^k(E)$ for any $\xi \in C^\infty(E)$ (as can be seen by writing $\xi = \sum a_n\psi_n$). The Hadamard heat kernel $k_t = k^{(N)}_t \in C^\infty(E \boxtimes E^*)$ satisfies $(\partial_t - D_x^2)k_t = r_t$, where $t \mapsto r_t$ is a continuous map of $(0, 1]$ into $H^s(E \boxtimes E^*)$ with $\|r_t\|_{(s)} \le K_st^{N-n/2}$. Moreover $\int_M (k_t(x, y)\xi(y), \eta(x))\,dx \to (\xi(y), \eta(y))$ uniformly in $y$ as $t \to 0$ for any $\xi, \eta \in C^\infty(E)$. Take $N > n/2$.

Theorem. Let $\xi \in C^\infty(E)$ and $f_{t,y}(x) = (K_t(x, y) - k_t(x, y))\xi(y)$, so that $f_{t,y} \in C^\infty(E)$. Then we have $\sup_{y \in M} \|f_{t,y}\|_{(k)} \to 0$ as $t \to 0$. In particular $\sup_{x,y \in M} \|(K_t(x, y) - k_t(x, y))\xi(y)\| \to 0$ as $t \to 0$, so that $\sup_{x,y} \|K_t(x, y) - k_t(x, y)\| \to 0$.

Proof. Let $G_{t,y}(x) = K_t(x, y)\xi(y)$ and $g_{t,y}(x) = k_t(x, y)\xi(y)$ in $C^\infty(E)$, so that $f_{t,y} = G_{t,y} - g_{t,y}$. If $\eta \in C^\infty(E)$, then we claim that $\int_M (f_{t,y}(x), \eta(x))\,dx \to 0$ uniformly in $y$. Indeed
\[ (G_{t,y}, \eta) = \int_M (K_t(x, y)\xi(y), \eta(x))\,dx = \int_M (\xi(y), K_t(y, x)\eta(x))\,dx = (\xi(y), K_t \star \eta(y)). \]
Since $K_t \star \eta \to \eta$ in each $H^k(E)$, in particular $K_t \star \eta \to \eta$ uniformly, so that $(G_{t,y}, \eta) \to (\xi(y), \eta(y))$ uniformly. We already proved that $(g_{t,y}, \eta) \to (\xi(y), \eta(y))$ uniformly, so the claim follows.

We may write $f_{t,y}(x) = \sum a_{t,y}(m)\psi_m(x)$ where $a_{t,y}(m) = (f_{t,y}, \psi_m)$. By the claim just proved,
\[ a_{t,y}(m) \to 0 \text{ uniformly in } y \text{ as } t \to 0. \qquad (1) \]
On the other hand $(\partial_t - D^2)f_{t,y} = r_{t,y}$ where $r_{t,y} \in C^\infty(E)$ and $(t, y) \mapsto r_{t,y}$ is a continuous map of $(0, 1] \times M$ into each Sobolev space $H^k(E)$ satisfying $\|r_{t,y}\|_{(k)} \le A_kt^{N-n/2}$. Say $r_{t,y} = \sum b_{t,y}(m)\psi_m$ where $b_{t,y}(m) = (r_{t,y}, \psi_m)$. Hence
\[ \partial_ta_{t,y}(m) + \lambda_ma_{t,y}(m) = b_{t,y}(m). \qquad (2) \]
Solving (1) and (2), we get $a_{t,y}(m) = \int_0^t e^{\lambda_m(s-t)}b_{s,y}(m)\,ds$. Hence, by the Cauchy–Schwarz inequality,
\[ |a_{t,y}(m)|^2 \le \int_0^t |b_{s,y}(m)|^2\,ds \int_0^t ds = t\int_0^t |b_{s,y}(m)|^2\,ds, \]
so that
\[ \|f_{t,y}\|_{(k)}^2 \le t\int_0^t \|r_{s,y}\|_{(k)}^2\,ds \le t\int_0^t A^2s^{2N-n}\,ds \le A't^{2N-n+2}. \qquad (3) \]
Hence $\|f_{t,y}\|_{(k)} \to 0$ uniformly in $y$ as $t \to 0$. Since this is true for $k > n/2$, we see that $\|f_{t,y}(x)\| \to 0$ uniformly in $x$ and $y$ as $t \to 0$.

Corollary. $\mathrm{ind}(D_+) = \int_M \mathrm{Tr}_s(K_t(x, x))\,dx = \lim_{t \to 0} \int_M \mathrm{Tr}_s(k^{(N)}_t(x, x))\,dx$ for $N > n/2$.

STEP II: The supertrace. We start by determining the grading operator in Cliff(V ).

Lemma. If $(e_i)$ is any orthonormal basis of $V$, then $c(e_1) \cdots c(e_n)$ equals $\pm u_0$, the operator implementing the grading. Moreover $u_0^2 = (-1)^{\frac{1}{2}\dim(V)}I$. The grading operator on $S$ is given by $\lambda u_0$ where $\lambda = i^{\frac{1}{2}\dim(V)}$.

Proof. If $\dim(V) = 2m$, then the elements $a_i = c(e_{2i-1})c(e_{2i})$ commute and satisfy $a_i^2 = -1$. Hence $g = c(e_1) \cdots c(e_n)$ satisfies $g^2 = (-1)^mI$. Moreover $gc(e_i) = -c(e_i)g$. Hence $g = \pm u_0$ and $u_0^2 = g^2 = (-1)^mI$. Now a multiple $\lambda u_0$ of $u_0$ acts as $\pm 1$ on $S^\pm$. Since $u_0^2 = (-1)^m$, we get $\lambda^2 = (-1)^m$.

Corollary. $\mathrm{Spin}(V) = \{u \in \mathrm{Cliff}^+(V) : uu^* = u^*u = I,\ uc(V)u^* = c(V)\}$.

Proof. Suppose that $u \in \mathrm{Cliff}^+(V)$ is unitary and that the orthogonal transformation $g$ with $uc(v)u^* = c(gv)$ has determinant $-1$. Define $h \in O(V)$ by $he_1 = -e_1$ and $he_i = e_i$ for $i > 1$. Then $x = g^{-1}h \in SO(V)$, so corresponds to $v \in \mathrm{Spin}(V)$. Hence $h = gx$ corresponds to $w = uv \in \mathrm{Cliff}^+(V)$. But $wc(e_1) \cdots c(e_n)w^* = c(he_1) \cdots c(he_n) = -c(e_1) \cdots c(e_n)$, so that $\gamma(w) = -w$, a contradiction.


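Let $\varepsilon = \lambda u_0 \in \mathrm{Cliff}_{\mathbb{C}}$ be the grading operator. Let $\mathrm{Tr}(a)$ be the usual trace on $\mathrm{End}(S)$ and let $\mathrm{Tr}_s(a) = \mathrm{Tr}(\varepsilon a)$ be the supertrace. Thus if we take the block matrix
\[ a = \begin{pmatrix} a_{++} & a_{-+} \\ a_{+-} & a_{--} \end{pmatrix} \]
corresponding to the decomposition $S = S^+ \oplus S^-$, then $\mathrm{Tr}_s(a) = \mathrm{Tr}(a_{++}) - \mathrm{Tr}(a_{--})$. It is immediate from the definition that the supertrace is a graded trace, i.e. if $a$ and $b$ are homogeneous, $\mathrm{Tr}_s(ab) = (-1)^{\partial a\,\partial b}\mathrm{Tr}_s(ba)$. We need a way to calculate the supertrace.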

Lemma. Tr(c(ei1) · · · c(eik)) = 0 if i1 < · · · < ik and k ≥ 1.

Proof. The representation on $\Lambda(V)_{\mathbb{C}}$ breaks up as a direct sum of copies of $S = \Lambda V_J$, so it suffices to prove the result for the trace taken on $\Lambda(V)$. But $\Lambda(V) = \mathrm{Cliff}(V)\Omega$, so we may take the $c(e_{j_1})c(e_{j_2}) \cdots c(e_{j_\ell})\Omega$’s as a basis. An element $c(e_{i_1}) \cdots c(e_{i_k})$ with $k > 0$ permutes the elements of the basis up to a sign with no fixed vectors, so has trace zero.

Theorem. Under the identification $\mathrm{Cliff}(V) \equiv \Lambda(V)$, $\mathrm{Tr}_s(\omega)$ is $\lambda^{-1}2^{n/2}$ times the coefficient of $e_1 \wedge \cdots \wedge e_n$ in $\omega$.

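Proof. By the previous lemma, $\mathrm{Tr}_s(c(e_{i_1}) \cdots c(e_{i_k})) = 0$ if $k < n$, since $\varepsilon c(e_{i_1}) \cdots c(e_{i_k})$ is proportional to the product of the $c(e_i)$’s with $i$ distinct from the $i_j$’s, so has zero trace. Since $\bigoplus_{k=0}^{n-1} \Lambda^k(V)$ is the linear span of all vectors $c(e_{i_1}) \cdots c(e_{i_k})\Omega$ with $k \le n - 1$ and $\mathrm{Tr}_s$ vanishes on all products $c(e_{i_1}) \cdots c(e_{i_k})$ with $k \le n - 1$, under the identification $\mathrm{Cliff}(V) \equiv \Lambda(V)$ the supertrace $\mathrm{Tr}_s$ is proportional to the coefficient of $e_1 \wedge \cdots \wedge e_n$. Now $\mathrm{Tr}_s(\varepsilon) = \mathrm{Tr}(\varepsilon^2) = \mathrm{Tr}(I) = 2^m$. On the other hand $\varepsilon = \lambda c(e_1) \cdots c(e_n)$, so the coefficient of $e_1 \wedge \cdots \wedge e_n$ is $\lambda$ in this case. Hence $\mathrm{Tr}_s(\omega) = \lambda^{-1}2^m$ times the coefficient of $e_1 \wedge \cdots \wedge e_n$ in $\omega$.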

STEP III: Reduction to supersymmetric harmonic oscillator by Getzler scaling.

Extension of Mehler’s formula to the supersymmetric harmonic oscillator. Let $B = (b_{ij})$ be a real positive symmetric $n \times n$ matrix and $C = (c_{ij})$ a real skew–symmetric $n \times n$ matrix commuting with $B$. We shall initially assume that $B$ and $C$ are invertible, so that $n$ must be even. Let $F \in M_N(\mathbb{C})$. Consider the operator
\[ D = -\sum \frac{\partial^2}{\partial x_i^2} + x^tBx + x^tC\frac{\partial}{\partial x} + F \qquad (*) \]
acting on $\mathbb{C}^N$–valued functions on $\mathbb{R}^n$. We wish to compute the kernel of the operator $e^{-Dt}$. To do so we define convergent power series $f(T)$, $g(T)$ and $h(T)$ by $f(x^2) = x\coth(x)$, $g(x^2) = x\,\mathrm{cosech}(x)$ and $h(x^2) = g(x^2)^{1/2}$. Mehler’s formula can then be rewritten as
\[ K_t(x, y) = (4\pi t)^{-1/2}h(4t^2a^2)\exp\big(-[f(4t^2a^2)(x^2 + y^2) - 2g(4t^2a^2)xy]/4t\big). \]
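The one-dimensional formula can be sanity-checked numerically. The sketch below is an illustration rather than part of the notes: it evaluates Mehler’s kernel in the $f$, $g$, $h$ form above and uses central finite differences to estimate the residual of the heat equation $\partial_tK + (-\partial_x^2 + a^2x^2)K = 0$ for the scalar harmonic oscillator. The sample points and the step size are arbitrary choices.

```python
import math


def mehler_kernel(t, x, y, a):
    """Mehler's kernel for -d^2/dx^2 + a^2 x^2, written with f, g and h."""
    s = 2.0 * t * a                 # s^2 = 4 t^2 a^2
    f = s / math.tanh(s)            # f(s^2) = s coth(s)
    g = s / math.sinh(s)            # g(s^2) = s cosech(s)
    h = math.sqrt(g)                # h(s^2) = g(s^2)^(1/2)
    exponent = -(f * (x * x + y * y) - 2.0 * g * x * y) / (4.0 * t)
    return (4.0 * math.pi * t) ** -0.5 * h * math.exp(exponent)


def heat_residual(t, x, y, a, step=1e-3):
    """Finite-difference estimate of dK/dt - d^2K/dx^2 + a^2 x^2 K."""
    k = mehler_kernel
    dt = (k(t + step, x, y, a) - k(t - step, x, y, a)) / (2.0 * step)
    dxx = (k(t, x + step, y, a) - 2.0 * k(t, x, y, a)
           + k(t, x - step, y, a)) / step ** 2
    return dt - dxx + a * a * x * x * k(t, x, y, a)


print(heat_residual(0.5, 0.3, -0.2, 1.3))
print(heat_residual(1.0, 0.5, 0.5, 0.7))
```

The residuals should be small compared with the individual terms of the equation; they are not exactly zero because of the finite-difference truncation error.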

Theorem 1. The operator $e^{-Dt}$ has kernel $K_t(e^{Ct}x, y)e^{-Ft} = K_t(x, e^{-Ct}y)e^{-Ft}$, where
\[ K_t(x, y) = (4\pi t)^{-n/2}\det h(4t^2B)\exp\big(-[(f(4t^2B)x, x) + (f(4t^2B)y, y) - 2(g(4t^2B)x, y)]/4t\big). \]

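Proof. Note that $D$ is a sum of the three commuting operators $D_B = \Delta + x^tBx$, $R_C = x^tC\partial_x$ and $F$. Since $B$ and $C$ commute, by an orthogonal transformation $\mathbb{R}^n$ splits up as a direct sum of 2–dimensional subspaces where
\[ C = \begin{pmatrix} 0 & -c \\ c & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} a^2 & 0 \\ 0 & a^2 \end{pmatrix} \]
with $a, c \ne 0$. On each subspace
\[ D_0 = -\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} + a^2(x^2 + y^2) + c\Big(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\Big). \]
Now $\partial/\partial\theta = x\,\partial/\partial y - y\,\partial/\partial x$ and $\Delta + a^2r^2$ is rotationally invariant. So $D_0$ is a sum of two commuting operators. The Taylor series expansion implies that $e^{ct\,\partial/\partial\theta}$ is just rotation through an angle $ct$. Thus $(e^{R_Ct}\xi)(x) = \xi(e^{Ct}x)$. Since $F$ commutes with $D$, the formula therefore follows straight from the scalar version of Mehler’s formula.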

Corollary 1. The function $k_t(x, 0) = (4\pi t)^{-n/2}\det h(4t^2B)\exp(-tF)\exp[-(f(4t^2B)x, x)/4t]$ satisfies $\partial k/\partial t + Dk = 0$.


Corollary 2. Suppose that $a_{ij} = -a_{ji}$ is a real skew–symmetric matrix and let $D = -\sum_i (\partial_{x_i} + \sum_j a_{ij}x_j)^2 + F$ with $F \in M_N(\mathbb{C})$. Then for $t$ sufficiently small the $\mathrm{End}(W \otimes \mathbb{C}^n)$–valued analytic function
\[ K(t, x) = (4\pi t)^{-n/2}\det h(4t^2A^2)\exp(-tF)\exp[-(f(4t^2A^2)x, x)/4t] \]
satisfies $\partial K/\partial t + DK = 0$.

We now want to allow the matrix $A$ in Corollary 2 to take values in a commutative subalgebra $\mathcal{A} \subset \mathrm{End}(W)$. (In fact it will take values in 2–forms, and hence will be nilpotent so that the power series functions truncate to polynomials.) The next result shows that the corollary above still holds by “analytic continuation”.

Theorem 2. Suppose that $a_{ij} = -a_{ji}$ and $F$ lie in a commutative subalgebra $\mathcal{A} \subset \mathrm{End}\,W$ and $D = -\sum_i (\partial_{x_i} + \sum_j a_{ij}x_j)^2 + F$. Then for $t$ sufficiently small the $\mathrm{End}(W \otimes \mathbb{C}^n)$–valued analytic function
\[ K(t, x) = (4\pi t)^{-n/2}\det h(4t^2A^2)\exp(-tF)\exp[-(f(4t^2A^2)x, x)/4t] \]
satisfies $\partial K/\partial t + DK = 0$ and is the unique formal power series solution (in $t$) of the form
\[ (4\pi t)^{-n/2}\exp(-tF)\exp(-\|x\|^2/4t)\sum_{k=0}^n G_k(x)t^k \]
with $G_0(x) = I$ and $G_k(x) = g_k(x, a_{ij})$ with $g_k(x, t_{ij})$ a polynomial.

Proof. The uniqueness of the formal power series solution follows from the same recurrence relations of Hadamard that we used in the local construction of the heat kernel. From the previous corollary the formula for $G_k$ will apply when $A$ lies in an open subset of the real skew–symmetric matrices. As a formal power series identity it must also hold when the $a_{ij}$ are replaced by indeterminates. But then it remains true when these indeterminates are specialized to take values in $\mathcal{A}$.

We have already shown that
\[ \mathrm{ind}(D_+) = \lim_{t \to 0} \int_M \mathrm{Tr}_s\,k_t(x, x)\,dx = \lim_{\varepsilon \to 0} \int_M \mathrm{Tr}_s\,k_{\varepsilon^2t}(x, x)\,dx. \]

Locally $k_t(x, x) \in \mathrm{End}(S \otimes V) = \mathrm{End}(S) \otimes \mathrm{End}(V)$. This kernel can be regarded as acting on sections of the bundle $\mathrm{End}(S) \otimes \mathrm{End}(V)$. But $\mathrm{End}(S) = \Lambda\mathbb{R}^n$ as a graded vector space. In the local picture where $k_t(x, x)$ acts on functions with values in $\Lambda\mathbb{R}^n \otimes \mathrm{End}(V)$, $k_{t,x}(X) = k_t(\exp_xX, x)$ must be a partial sum of a formal power series solution of $(\partial_t + L)k_{t,x} = 0$, where
\[ L = -\sum g^{ij}\Big(\nabla_i\nabla_j - \sum_k \Gamma^k_{ij}\nabla_k\Big) + \frac{\kappa}{4} + F \quad\text{with}\quad \nabla_i = \partial_i + \frac{1}{4}\sum_{j,k} \Gamma^k_{ij}c_jc_k. \]
Thus
\[ k_{t,x}(X) = \psi(\|X\|^2)(4\pi t)^{-n/2}e^{-\|X\|^2/4t}\sum_{k=0}^N B_{k,x}(X)t^k. \]
We are going to define a smooth family of differential operators $L^\varepsilon$ and kernels $k^\varepsilon_{t,x}(X)$ such that $L^1 = L$ and $k^1_{t,x} = k_{t,x}$. The $k^\varepsilon_{t,x}$ will be related to the $L^\varepsilon$’s as in the corollary to the Hadamard parametrix construction. The operator $L^0$ will be the supersymmetric harmonic oscillator so that $k^0_{t,x}$ must actually be given by Mehler’s formula (it agrees with its truncation since $A$ is nilpotent). The supertrace will be invariant under the scaling, so the index can be computed using $k^0_{t,x}(0)$.

We first define the Getzler scaling operator for fermions. Let $S_\varepsilon$ be the operator on $\Lambda^*\mathbb{R}^n$ defined by $S_\varepsilon\omega = \varepsilon^{-\partial\omega}\omega$. Then $\varepsilon S_\varepsilon c(v)S_\varepsilon^{-1} = e(v) + \varepsilon^2e(v)^*$, as may be immediately verified by applying both sides to an $\omega \in \Lambda^kV$. Let $M_\varepsilon(v) = e(v) + \varepsilon^2e(v)^* = \varepsilon S_\varepsilon c(v)S_\varepsilon^{-1}$. Thus $M_1(v) = c(v)$ and $M_0(v) = e(v)$. [The parameter $\varepsilon$ plays the role of Planck’s constant; in the “classical limit” the Clifford algebra degenerates to the exterior algebra. Thus the Clifford algebra is the quantisation of the exterior algebra.]
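The scaling identity for $M_\varepsilon(v)$ is a statement about finitely many basis forms, so it can be checked mechanically. The sketch below is an illustration with hypothetical names: it models $\Lambda\mathbb{R}^n$ by dictionaries from bitmasks (subsets of $\{0, \dots, n-1\}$) to coefficients, implements $e(e_i)$, its adjoint $e(e_i)^*$ and $c(e_i) = e(e_i) + e(e_i)^*$, and measures the largest coefficient of $\varepsilon S_\varepsilon c(e_i)S_\varepsilon^{-1} - e(e_i) - \varepsilon^2e(e_i)^*$ over all basis forms.

```python
def sign_below(i, mask):
    """(-1)^(number of basis indices in `mask` that are smaller than i)."""
    return -1 if bin(mask & ((1 << i) - 1)).count("1") % 2 else 1


def wedge(i, form):
    """e(e_i): exterior multiplication by the basis vector e_i."""
    return {mask | (1 << i): sign_below(i, mask) * coeff
            for mask, coeff in form.items() if not mask & (1 << i)}


def contract(i, form):
    """e(e_i)*: the adjoint of exterior multiplication by e_i."""
    return {mask & ~(1 << i): sign_below(i, mask) * coeff
            for mask, coeff in form.items() if mask & (1 << i)}


def add(a, b, scale=1.0):
    """Return a + scale * b for forms stored as dictionaries."""
    out = dict(a)
    for mask, coeff in b.items():
        out[mask] = out.get(mask, 0.0) + scale * coeff
    return out


def scale_s(eps, form):
    """S_eps: multiply a form of degree k by eps^(-k)."""
    return {mask: coeff * eps ** (-bin(mask).count("1"))
            for mask, coeff in form.items()}


def clifford(i, form):
    """c(e_i) = e(e_i) + e(e_i)*."""
    return add(wedge(i, form), contract(i, form))


def getzler_defect(n, eps):
    """Largest coefficient of eps S c(e_i) S^-1 - e(e_i) - eps^2 e(e_i)*."""
    worst = 0.0
    for i in range(n):
        for mask in range(1 << n):
            omega = {mask: 1.0}
            inner = clifford(i, scale_s(1.0 / eps, omega))  # c(e_i) S^-1 omega
            lhs = {m: eps * c for m, c in scale_s(eps, inner).items()}
            rhs = add(wedge(i, omega), contract(i, omega), scale=eps ** 2)
            diff = add(lhs, rhs, scale=-1.0)
            worst = max([worst] + [abs(c) for c in diff.values()])
    return worst


print(getzler_defect(4, 0.5))
```

Here $S_\varepsilon^{-1}$ is implemented as $S_{1/\varepsilon}$. The defect should vanish up to rounding, which reflects that $e(v)$ raises the degree by one and $e(v)^*$ lowers it by one.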

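We now extend this scaling to include bosons, i.e. differential operators and functions of them. We define
\[ k^\varepsilon_{t,x}(X) = \varepsilon^nS_\varepsilon k_{\varepsilon^2t,x}(\varepsilon X)S_\varepsilon^{-1} = \psi(\varepsilon^2(g(x)X, X))(4\pi t)^{-n/2}e^{-(g(x)X,X)/4t}\sum_{k=0}^N S_\varepsilon B_{k,x}(\varepsilon X)S_\varepsilon^{-1}t^k\varepsilon^{2k}. \qquad (1) \]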


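We regard this as acting on functions with values in $\Lambda\mathbb{R}^n \otimes \mathrm{End}(V)$. Since $\operatorname{Tr}_s k^\varepsilon_{t,x}(0) = \operatorname{Tr}_s k^1_{t,x}(0) = \operatorname{Tr}_s k_t(x,x)$, it suffices to show that $\operatorname{Tr}_s k^\varepsilon_{t,x}(0)$ converges uniformly in $x$ as $\varepsilon \to 0$. This limit will give the formula for the index.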

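In local coordinates we have $L = -\sum \partial_i g^{ij}(X)\partial_j + \sum b_i\partial_i + c$. It is easy to see that

$$(4\pi t)^{-n/2} e^{-\|X\|^2/4t} \sum_k S_\varepsilon B_{k,x}(\varepsilon X) S_\varepsilon^{-1}\, t^k \varepsilon^{2k}$$

is the formal power series solution for $\partial_t + L_\varepsilon$, where

$$L_\varepsilon = -\sum \partial_i g^{ij}(\varepsilon X)\partial_j + \varepsilon \sum S_\varepsilon b_i(\varepsilon X) S_\varepsilon^{-1}\partial_i + \varepsilon^{2} S_\varepsilon c(\varepsilon X) S_\varepsilon^{-1}.$$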

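We must check that the coefficients of $L_\varepsilon$ extend smoothly or continuously to $(-1,1)$. In fact, using the Lichnerowicz formula, we get

$$L_\varepsilon = -\sum g^{ij}_x(\varepsilon X)\Big(\nabla^\varepsilon_i\nabla^\varepsilon_j - \varepsilon\sum_k \Gamma^k_{ij}(\varepsilon X)\nabla^\varepsilon_k\Big) + \frac{\varepsilon^2}{4}\kappa(\varepsilon X) + \frac12\sum K_{ij}(\varepsilon X) M_\varepsilon(v_i) M_\varepsilon(v_j).$$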

Here we have set $\nabla^\varepsilon_i = \partial_i + \frac{\varepsilon^{-1}}{4}\sum_{j,k}\Gamma^k_{ij}(\varepsilon X) M_\varepsilon(v_j) M_\varepsilon(v_k)$. Recall that in normal coordinates $g_{ij}(X) = \delta_{ij} + O(\|X\|^2)$ and $\Gamma^k_{ij}(X) = -\frac12\sum_\ell R_{i\ell jk}(0)X_\ell + O(\|X\|^2)$. Hence $\nabla^\varepsilon_i \to \nabla^0_i = \partial_i - \frac18\sum_{j,k,\ell} R_{ijk\ell}\, x_j\, e_k \wedge e_\ell$, where this convergence is uniform on the coefficients. Furthermore $L_{\varepsilon,x} \to L_0 = -\sum(\nabla^0_i)^2 + F_x(0)$, where $F_x(0) = \frac12\sum K_{ij}\, e_i \wedge e_j$. This is just the supersymmetric harmonic oscillator $-\sum_i(\partial_i + \sum_j a_{ij}x_j)^2 + F_x(0)$ with $a_{ij} = -\frac18\sum_{k,\ell} R_{ijk\ell}\, e_k \wedge e_\ell = -\frac14 R_{ij}$, where $R_{ij} = \sum_{k<\ell} R_{ijk\ell}\, e_k \wedge e_\ell$. The formal power series expansion for the heat kernel of this operator is

$$(4\pi t)^{-n/2} \det\Big(\frac{\sinh 2At}{2At}\Big)^{-1/2} \exp(-tF)\, \exp[-(2At\coth(2At)X, X)/4t]$$

and satisfies $\partial K/\partial t + DK = 0$. This is the unique formal power series solution (in $t$) of the form

$$(4\pi t)^{-n/2} \exp(-tF) \exp(-\|X\|^2/4t) \sum_{k=0}^{n} G_k(X)\, t^k$$

with $G_0(X) = I$. Note that this series actually terminates, because $A$ is a matrix of 2–forms and so nilpotent. So if the $N$ appearing in the construction of the Hadamard parametrix is sufficiently large, i.e. $N > n/2$, then the solution agrees exactly with Mehler's formula. Thus $k^0_{t,x}(0) = (4\pi t)^{-n/2} \det(\sinh 2At/2At)^{-1/2} \exp(-tF)$.
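As a sanity check on the shape of Mehler's formula, one can test a one-dimensional scalar analogue numerically. The sketch below is not the supersymmetric operator of the notes: it assumes the ordinary oscillator $-d^2/dx^2 + a^2x^2$ with a real frequency $a$, for which the kernel from $0$ to $x$ is $(4\pi t)^{-1/2}(2at/\sinh 2at)^{1/2}\exp[-(2at\coth 2at)\,x^2/4t]$, and verifies the heat equation by central differences.

```python
import math

def mehler(t, x, a=1.0):
    """Scalar analogue: heat kernel from 0 to x of -d^2/dx^2 + a^2 x^2 (assumed model)."""
    u = 2.0 * a * t
    return ((4 * math.pi * t) ** -0.5
            * math.sqrt(u / math.sinh(u))
            * math.exp(-(u / math.tanh(u)) * x * x / (4 * t)))

def heat_residual(t, x, a=1.0, h=1e-4):
    """Central-difference estimate of dK/dt - d^2K/dx^2 + a^2 x^2 K, which should vanish."""
    k_t = (mehler(t + h, x, a) - mehler(t - h, x, a)) / (2 * h)
    k_xx = (mehler(t, x + h, a) - 2 * mehler(t, x, a) + mehler(t, x - h, a)) / h**2
    return k_t - k_xx + a * a * x * x * mehler(t, x, a)

residuals = [abs(heat_residual(t, x))
             for t in (0.3, 0.7, 1.5)
             for x in (0.0, 0.4, 1.0)]
```

The residuals are at the level of the finite-difference error. In the notes the scalar $a$ is replaced by the nilpotent matrix $A$ of 2–forms, which is why the series in $t$ terminates.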

We define the $\hat A$ form by $\hat A = \det\Big(\dfrac{\sinh(R/4\pi i)}{R/4\pi i}\Big)^{-1/2}$. It is a closed form, so a characteristic class:

Lemma. The $\hat A$ form is closed, so gives an element of $\bigoplus_i H^{4i}_{dR}(M)$.

Proof. If $B = R^2$, then $\hat A = \det^{-1/2} f(B)$ where $f(x) = 1 + a_1x + \cdots$ is an analytic power series which terminates since $B$ is nilpotent. Let $g(x) = \log f(x)$. Thus

$$\det{}^{-1/2} f(B) = \det{}^{-1/2} e^{g(B)} = e^{-\operatorname{Tr} g(B)/2}.$$

Since the forms $\operatorname{Tr} B^k$ are closed, so too is $\hat A$.
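The identity $\det^{-1/2} f(B) = e^{-\operatorname{Tr} g(B)/2}$ is purely algebraic and can be illustrated with an ordinary numerical matrix. In this sketch $B$ is a symmetric positive definite stand-in (an assumption, since the $B$ of the proof is a nilpotent matrix of forms) and $f(x) = \sinh(\sqrt x)/\sqrt x = \sum_k x^k/(2k+1)!$; the helper names are ours.

```python
import numpy as np

def f_scalar(x):
    """f(x) = sinh(sqrt(x)) / sqrt(x) for x > 0."""
    r = np.sqrt(x)
    return np.sinh(r) / r

def f_matrix(B, terms=30):
    """f(B) = sum_k B^k / (2k+1)!, summed directly as a matrix power series."""
    total = np.zeros_like(B)
    power = np.eye(len(B))
    factorial = 1.0                      # holds (2k+1)! at step k
    for k in range(terms):
        total = total + power / factorial
        power = power @ B
        factorial *= (2 * k + 2) * (2 * k + 3)
    return total

rng = np.random.default_rng(1)
C = rng.standard_normal((4, 4))
B = C @ C.T + np.eye(4)                  # symmetric positive definite stand-in for R^2

lhs = np.linalg.det(f_matrix(B)) ** -0.5             # det^{-1/2} f(B)
eigenvalues = np.linalg.eigvalsh(B)
trace_g = np.sum(np.log(f_scalar(eigenvalues)))      # Tr g(B) with g = log f
rhs = np.exp(-trace_g / 2)                           # exp(-Tr g(B) / 2)
```

Since $\operatorname{Tr} g(B)$ is a power series in the traces $\operatorname{Tr} B^k$, closedness of those traces is all that the lemma needs.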

We recall also that $\exp(-F/2\pi i)$ is the Chern character of the vector bundle $V$. Hence, taking the supertrace and setting $t = 1$, we get at long last:

THE ATIYAH–SINGER INDEX THEOREM. $\operatorname{ind}(D^+) = \displaystyle\int_M \hat A \wedge \mathrm{Ch}(V)$.

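Proof. We have

$$\operatorname{ind}(D^+) = (-2i)^{n/2}(4\pi)^{-n/2}\int_M \det{}^{-1/2}\Big(\frac{\sinh(R/2)}{R/2}\Big) \wedge \exp(-F) = \int_M \det{}^{-1/2}\Big(\frac{\sinh(R/4\pi i)}{R/4\pi i}\Big) \wedge \exp\Big(\frac{-F}{2\pi i}\Big) = \int_M \hat A \wedge \mathrm{Ch}(V). \quad\text{QED}$$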


APPENDIX: SOLUTIONS OF ORDINARY DIFFERENTIAL EQUATIONS

1. Existence and uniqueness of solutions of ordinary differential equations.

Contraction Mapping Theorem. Let $(X,d)$ be a complete metric space and $T : X \to X$ a map such that $d(Ty_1, Ty_2) \le k\,d(y_1,y_2)$ with $k \in (0,1)$. Then $T$ has a unique fixed point in $X$; in fact if $y_0 \in X$, then $T^m y_0$ converges to the fixed point as $m \to \infty$.

Proof. Using the geometric progression

$$(1-k)^{-1} = \sum_{m\ge 0} k^m,$$

we check that $T^m y_0$ forms a Cauchy sequence in $X$. So by completeness of $X$, $T^m y_0 \to y$ for some $y$. But then $T^{m+1}y_0 \to Ty$ (as $T$ is continuous), so $Ty = y$ and $y$ is a fixed point. To check uniqueness, note that if $Ty_1 = y_1$ and $Ty_2 = y_2$, then $d(y_1,y_2) \le k\,d(y_1,y_2)$, so that $d(y_1,y_2) = 0$ and $y_1 = y_2$. Thus $T$ has a unique fixed point.
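The iteration $T^m y_0$ in the theorem is easy to run. As a small illustration (the choice of $T = \cos$ on $[0,1]$ is ours: it maps $[0,1]$ into itself and satisfies $|T'| \le \sin 1 < 1$ there), the following sketch iterates until successive terms agree to a tolerance.

```python
import math

def iterate_to_fixed_point(T, y0, tol=1e-12, max_iter=1000):
    """Return (approximate fixed point, number of steps) for the iteration y -> T(y)."""
    y = y0
    for m in range(1, max_iter + 1):
        y_next = T(y)
        if abs(y_next - y) < tol:
            return y_next, m
        y = y_next
    raise RuntimeError("iteration did not converge")

fixed_point, steps = iterate_to_fixed_point(math.cos, 0.5)
```

The geometric bound in the proof explains the behaviour: each step shrinks the distance to the fixed point by at least the factor $k$.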

Corollary. Suppose that $T^n$ is a contraction mapping for some $n$. Then the same conclusions hold.

Proof. By the theorem, $T^n$ has a unique fixed point $y$. But then $T^n(Ty) = T^{n+1}y = T(T^ny) = Ty$, so $Ty$ is also a fixed point of $T^n$. By uniqueness, $Ty = y$. Also $T^{mn}y_0 \to y$, $T^{mn+1}y_0 \to y$, $\ldots$, $T^{mn+(n-1)}y_0 \to y$ ($m \to \infty$). Combining these, we get $T^my_0 \to y$.
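A toy example shows that the corollary covers genuinely new cases. In the sketch below (our own illustration) $T(y) = Ay + b$ on $\mathbb{R}^2$ with a nilpotent matrix $A$ of norm 2, so $T$ is not a contraction, but $T^2$ is constant and hence a contraction.

```python
import numpy as np

A = np.array([[0.0, 2.0],
              [0.0, 0.0]])      # operator norm 2, but A @ A = 0
b = np.array([1.0, 3.0])

def T(y):
    """Affine map y -> A y + b; T is not a contraction, T o T is constant."""
    return A @ y + b

y = np.array([10.0, -7.0])      # arbitrary starting point
for _ in range(2):              # T^2 already lands on the fixed point
    y = T(y)
```

Here `y` ends at $(7, 3)$, which is the unique solution of $y = Ay + b$.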

Let $f(t,x)$ be continuous on $|t - t_0| \le a$, $\|x - x_0\| \le b$ where $x \in \mathbb{R}^n$. Suppose $f$ also satisfies the Lipschitz condition $\|f(t,x_1) - f(t,x_2)\| \le c\|x_1 - x_2\|$. Let $M = \sup\|f(t,x)\|$ and set $h = \min(a, b/M)$. We are interested in the ordinary differential equation:

x′(t) = f(t, x), x(t0) = x0. (1)

Theorem. The above differential equation has a unique solution for |t− t0| ≤ h.

Proof (Picard–Lindelöf). Let

$$(Tx)(t) = x_0 + \int_{t_0}^t f(s, x(s))\,ds. \tag{2}$$

Clearly $x$ solves (1) if and only if $Tx = x$ (just integrate or differentiate). Now let

$$X = \{x \in C([t_0 - h, t_0 + h], \mathbb{R}^n) \mid \|x(t) - x_0\| \le M\cdot h \ \ \forall t\}.$$

This is a complete metric space for $d(x_1,x_2) = \sup_{|t-t_0|\le h}\|x_1(t) - x_2(t)\|$. Moreover, if $x \in X$, $Tx$ is also in $X$ since $M\cdot h \le b$. We claim

$$\|T^kx_1(t) - T^kx_2(t)\| \le \frac{c^k}{k!}|t - t_0|^k\, d(x_1,x_2). \tag{3}$$

For k = 0, this is obvious. In general it follows by induction since

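$$\begin{aligned}
\|T^kx_1(t) - T^kx_2(t)\| &\le \int_{t_0}^t \|f(s, T^{k-1}x_1(s)) - f(s, T^{k-1}x_2(s))\|\,ds \\
&\le c\int_{t_0}^t \|T^{k-1}x_1(s) - T^{k-1}x_2(s)\|\,ds \\
&\le \frac{c^k}{(k-1)!}\int_{t_0}^t |s - t_0|^{k-1}\,ds\; d(x_1,x_2) \\
&\le \frac{c^k}{k!}|t - t_0|^k\, d(x_1,x_2).
\end{aligned}$$

Since $c^nh^n/n! \to 0$, $T^n$ is a contraction mapping for $n$ sufficiently large and the result follows.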
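The successive approximations $T^kx$ can be computed exactly in simple cases. The sketch below takes the model problem $x' = x$, $x(0) = 1$ (our choice of example), represents each iterate as a polynomial with rational coefficients, and applies the integral operator (2) repeatedly; the iterates are the Taylor partial sums of $e^t$.

```python
from fractions import Fraction
import math

def picard_step(coeffs):
    """Apply (Tx)(t) = 1 + integral_0^t x(s) ds to the polynomial sum(coeffs[j] * t**j)."""
    integrated = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(coeffs)]
    integrated[0] += 1           # add the initial value x0 = 1
    return integrated

def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return sum(float(c) * t**j for j, c in enumerate(coeffs))

x = [Fraction(1)]                # x_0(t) = 1, the constant initial guess
for _ in range(10):
    x = picard_step(x)           # after k steps: sum_{j <= k} t^j / j!
```

The factor $c^k/k!$ in (3) is visible here as the size of the coefficient of $t^k$.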


2. Dependence of ODEs on Initial Conditions.

Theorem. The solution of (1) above depends continuously on the initial data x0.

Proof. Pick $h_1 < h$ and take $\delta > 0$ such that $Mh_1 + \delta \le b$. Let

$$Y = \{y \in C([t_0 - h_1, t_0 + h_1] \times B(x_0,\delta), \mathbb{R}^n) : \|y(t,x) - x\| \le M\cdot h_1,\ y(t_0,x) = x\}.$$

Again $Y$ is complete for the metric $\partial(y_1,y_2) = \sup\|y_1(t,x) - y_2(t,x)\|$. Let

$$(Ty)(t,x) = x + \int_{t_0}^t f(s, y(s,x))\,ds.$$

Since $Mh_1 + \delta \le b$, $T$ maps $Y$ into $Y$ and as before we can check by induction that

$$\|T^ky_1(t,x) - T^ky_2(t,x)\| \le \frac{c^k}{k!}|t - t_0|^k\,\partial(y_1,y_2).$$

So $T^n$ is a contraction mapping for $n$ sufficiently large and hence $T$ has a unique fixed point $y$ satisfying $\partial y/\partial t = f(t,y)$, $y(t_0,x) = x$. Now $y$ is a continuous function of both $t$ and $x$ and if we fix $x = x_0$ then $y(t,x_0)$ solves the initial value problem (1). Since the solution of the ODE for given initial conditions is unique, this shows that the solution depends continuously on the initial data.

3. Perturbations of linear ODEs. Let $A(t,x)$ be a continuous matrix–valued function of $t$ and $x$. Consider the reference linear ODE for vector–valued functions: $d\xi(t,x)/dt = A(t,x)\xi(t,x)$, $\xi(t_0,x) = a(x)$. We prove that a perturbation of this linear ODE must have a nearby solution; thus if the data $A$ and $a$ vary continuously, the solution also varies continuously.

Theorem. Let $B(t,x)$ be a continuous matrix–valued function of $t$ and $x$. Let $M \ge \sup_{t,x}\|B\|$. Then $d\eta(t,x)/dt = B(t,x)\eta(t,x)$, $\eta(t_0,x) = b(x)$ has a solution satisfying

$$\sup_x \|\xi(t,x) - \eta(t,x)\| \le C\|A - B\|\,\frac{e^{M|t-t_0|} - 1}{M} + \|a - b\|\,e^{M|t-t_0|}$$

where $C$ is a constant depending only on $A$ and $a$.

Proof. By the method of successive approximations, we know that the sequences defined by $\xi_k = a + \int_{t_0}^t A\xi_{k-1}\,ds$, $\xi_0 = a$, and $\eta_k = b + \int_{t_0}^t B\eta_{k-1}\,ds$, $\eta_0 = b$ must satisfy $\xi_k \to \xi$, $\eta_k \to \eta$. Let $g_k(t) = \sup_x\|\xi_k(t,x) - \eta_k(t,x)\|$ and $C = \sup_{k,x,t}\|\xi_k\|$. Then we check that

$$g_n(t) \le \|a - b\| + C\|A - B\|\,|t - t_0| + M\int_{t_0}^t g_{n-1}(s)\,ds. \tag{$*$}$$

(Simply write $\xi_k - \eta_k = a - b + \int_{t_0}^t (A - B)\xi_{k-1}\,ds + \int_{t_0}^t B(\xi_{k-1} - \eta_{k-1})\,ds$.) Define $f_n$ by $f_0(t) = \|a - b\|$ and then inductively by making $(*)$ an equality:

$$f_n(t) = \|a - b\| + C\|A - B\|\,|t - t_0| + M\int_{t_0}^t f_{n-1}(s)\,ds.$$

Clearly $f_n \ge g_n$. As we have a contraction mapping, $f_n \to f$ where $f$ is a solution of

$$f(t) = \|a - b\| + C\|A - B\|\,|t - t_0| + M\int_{t_0}^t f(s)\,ds.$$

Solving the corresponding differential equation we get

$$f(t) = \|a - b\|\,e^{M|t-t_0|} + C\|A - B\|\,\frac{e^{M|t-t_0|} - 1}{M}.$$


As gn(t) ≤ fn(t), we get supx ‖ξn(t, x) − ηn(t, x)‖ ≤ fn(t). Letting n→ ∞, we get the result.
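The estimate can be tested on the simplest possible data. In the sketch below (all numbers are our own choices) $A$ and $B$ are constant $1\times1$ matrices, so $\xi = ae^{At}$ and $\eta = be^{Bt}$ explicitly, $t_0 = 0$, and $C = ae^{AT}$ bounds every successive approximation $\xi_k$ on $[0,T]$ because these are partial sums of the exponential series.

```python
import math

A, B = 1.0, 1.1              # constant 1x1 "matrices" (illustrative values)
a, b = 1.0, 1.05             # initial values at t0 = 0
T_MAX = 1.0
M = abs(B)                   # M >= sup |B|
C = a * math.exp(A * T_MAX)  # bounds all Picard iterates of xi on [0, T_MAX]

def gap(t):
    """|xi(t) - eta(t)| from the explicit solutions."""
    return abs(a * math.exp(A * t) - b * math.exp(B * t))

def bound(t):
    """Right-hand side of the perturbation estimate."""
    return C * abs(A - B) * (math.exp(M * t) - 1) / M + abs(a - b) * math.exp(M * t)

grid = [T_MAX * j / 100 for j in range(101)]
slack = [bound(t) - gap(t) for t in grid]   # should be non-negative throughout
```

The bound is sharp at $t = t_0$, where both sides reduce to $\|a - b\|$, and becomes more generous as $t$ grows.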

4. Differentiability of solutions. Consider the ODE

$$\frac{d}{dt}\alpha(t,x) = f(t, \alpha(t,x)), \qquad \alpha(0,x) = x.$$

We say that $f$ is $C^k$ if all derivatives of order $\le k$ exist and are continuous. If $f$ is $C^k$ for all $k$, we say $f$ is $C^\infty$ or smooth.

Theorem. If f is Ck for 1 ≤ k ≤ ∞, then α is Ck.

Proof. The hardest case to prove is $k = 1$ (differentiability); the others follow almost trivially by induction. So we assume $f$ is $C^1$, i.e. $\partial f/\partial t$, $\partial f/\partial x_i$ exist and are continuous. We must show that $\alpha$ is also $C^1$. Note that formally, if we set $\lambda(t,x) = \big(\partial\alpha(t,x)/\partial x_i\big) = D_x\alpha$ (an $n \times n$ matrix), then by the chain rule we expect $\lambda$ to satisfy the linear ODE

$$\frac{d\lambda}{dt} = D_xf(t,\alpha)\,\lambda. \tag{1}$$

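Let $\lambda$ be the continuous solution of (1) with $\lambda(0,x) = I$. We now show that $D_x\alpha$ exists and equals $\lambda$. Let $F(s) = f(t, a + s(b - a))$ for $t \in \mathbb{R}$. Then by the chain rule

$$\frac{dF}{ds} = D_xf(t, a + s(b - a))\cdot(b - a).$$

Thus $f(t,b) - f(t,a) = \int_0^1 D_xf(t, a + s(b - a))\cdot(b - a)\,ds$. But then

$$\begin{aligned}
\frac{d}{dt}\big(\alpha(t,x+y) - \alpha(t,x)\big) &= f(t,\alpha(t,x+y)) - f(t,\alpha(t,x)) \\
&= \int_0^1 D_xf\big(t, \alpha(t,x) + s(\alpha(t,x+y) - \alpha(t,x))\big)\cdot\big(\alpha(t,x+y) - \alpha(t,x)\big)\,ds.
\end{aligned}$$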

Let $A(t,x) = D_xf(t,\alpha(t,x))$, $\xi(t,x) = \lambda(t,x)y$ and $B_y(t,x) = \int_0^1 D_xf(t, \alpha(t,x) + s(\alpha(t,x+y) - \alpha(t,x)))\,ds$, $\eta_y(t,x) = \alpha(t,x+y) - \alpha(t,x)$. Thus $d\xi/dt = A\xi$ and $d\eta/dt = B_y\eta$ with $\eta(0,x) = y$. The perturbation theorem for ODEs implies that

$$\sup_{|t|\le\varepsilon}\|\lambda(t,x)y - (\alpha(t,x+y) - \alpha(t,x))\| = o(\|y\|). \tag{2}$$

This says that $D_x\alpha = \lambda$; since $d\alpha/dt = f(t,\alpha)$, this means that $\alpha$ is $C^1$. (To summarise: we had to prove (2) to show that $D_x\alpha = \lambda$; this was achieved by showing that the two terms were solutions of linear ODEs which were close when $y$ was small.)
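The statement $D_x\alpha = \lambda$ can be checked numerically in a case with a closed form. The sketch below uses $f(t,x) = x^2$ (our choice), for which $\alpha(t,x) = x/(1 - tx)$ and $D_x\alpha = 1/(1 - tx)^2$, integrates the flow together with the linearised equation (1) by a hand-written Runge–Kutta scheme, and compares $\lambda$ with a difference quotient in the initial value.

```python
def rk4(rhs, state, t_end, steps=1000):
    """Classical fourth-order Runge-Kutta for an autonomous system, from t = 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = rhs([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (p + 2 * q + 2 * r + w) / 6
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4)]
    return state

def rhs(state):
    """Flow alpha' = alpha^2 coupled with its linearisation lambda' = 2 alpha lambda."""
    alpha, lam = state
    return [alpha**2, 2 * alpha * lam]

x, t_end, y = 0.5, 1.0, 1e-6
alpha, lam = rk4(rhs, [x, 1.0], t_end)             # lambda(0, x) = 1
alpha_shift, _ = rk4(rhs, [x + y, 1.0], t_end)     # flow started at x + y
difference_quotient = (alpha_shift - alpha) / y
exact = 1.0 / (1.0 - t_end * x) ** 2               # closed form of D_x alpha
```

The difference quotient differs from $\lambda$ by a term of order $y$, which is the content of (2).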

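Now we continue the induction. Suppose we know that $f$ being $C^{k-1}$ implies $\alpha$ is $C^{k-1}$. (We have just proved that $f$ being $C^1$ implies $\alpha$ is $C^1$.) Now suppose that $f$ is $C^k$. Then by induction $\alpha$ is $C^{k-1}$. Now $d\lambda/dt = A\lambda$, where $A = D_xf(t,\alpha)$ is $C^{k-1}$. Hence $\lambda$ is $C^{k-1}$ by induction applied to $d\lambda/dt = A\lambda$. So $D_x\alpha$ is $C^{k-1}$. Also $d\alpha/dt = f(t,\alpha)$ is $C^{k-1}$. So the first derivatives of $\alpha$ are $C^{k-1}$ and hence $\alpha$ is $C^k$.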
