Quantum Gravity
Quantum gravity is perhaps the most important open problem in fundamental physics. It is the problem of merging quantum mechanics and general relativity, the two great conceptual revolutions in the physics of the twentieth century.
This book discusses the many aspects of the problem, and presents technical and conceptual advances towards a background-independent quantum theory of gravity, obtained in the last two decades. The first part of the book is an exploration of how to re-think basic physics from scratch in the light of the general-relativistic conceptual revolution. The second part is a detailed introduction to loop quantum gravity and the spinfoam formalism. It provides an overview of the current state of the field, including results on area and volume spectra, dynamics, extension of the theory to matter, applications to early cosmology and black-hole physics, and the perspectives for computing scattering amplitudes. The book is completed by a historical appendix which overviews the evolution of the research in quantum gravity, from the 1930s to the present day.
Carlo Rovelli was born in Verona, Italy, in 1956 and obtained his PhD in Physics in Padua in 1986. In 1996 he was awarded the Xanthopoulos International Prize for the development of the loop approach to quantum gravity and for research on the foundation of the physics of space and time. Over the years he has taught and worked in the University of Pittsburgh, Université de la Méditerranée, Marseille, and Università La Sapienza, Rome. Professor Rovelli’s main research interests lie in general relativity, gravitational physics, and the philosophy of space and time. He has had over 100 publications in international journals in physics and has written contributions for major encyclopedias. He is senior member of the Institut Universitaire de France.
CAMBRIDGE MONOGRAPHS ON MATHEMATICAL PHYSICS
General editors: P. V. Landshoff, D. R. Nelson, S. Weinberg
S. J. Aarseth: Gravitational N-Body Simulations
J. Ambjørn, B. Durhuus and T. Jonsson: Quantum Geometry: A Statistical Field Theory Approach
A. M. Anile: Relativistic Fluids and Magneto-Fluids
J. A. de Azcárraga and J. M. Izquierdo: Lie Groups, Lie Algebras, Cohomology and Some Applications in Physics†
O. Babelon, D. Bernard and M. Talon: Introduction to Classical Integrable Systems
V. Belinski and E. Verdaguer: Gravitational Solitons
J. Bernstein: Kinetic Theory in the Expanding Universe
G. F. Bertsch and R. A. Broglia: Oscillations in Finite Quantum Systems
N. D. Birrell and P. C. W. Davies: Quantum Fields in Curved Space†
M. Burgess: Classical Covariant Fields
S. Carlip: Quantum Gravity in 2+1 Dimensions
J. C. Collins: Renormalization†
M. Creutz: Quarks, Gluons and Lattices†
P. D. D’Eath: Supersymmetric Quantum Cosmology
F. de Felice and C. J. S. Clarke: Relativity on Curved Manifolds†
B. S. DeWitt: Supermanifolds, 2nd edition†
P. G. O. Freund: Introduction to Supersymmetry†
J. Fuchs: Affine Lie Algebras and Quantum Groups†
J. Fuchs and C. Schweigert: Symmetries, Lie Algebras and Representations: A Graduate Course for Physicists†
Y. Fujii and K. Maeda: The Scalar–Tensor Theory of Gravitation
A. S. Galperin, E. A. Ivanov, V. I. Ogievetsky and E. S. Sokatchev: Harmonic Superspace
R. Gambini and J. Pullin: Loops, Knots, Gauge Theories and Quantum Gravity†
M. Göckeler and T. Schücker: Differential Geometry, Gauge Theories and Gravity†
C. Gómez, M. Ruiz Altaba and G. Sierra: Quantum Groups in Two-dimensional Physics
M. B. Green, J. H. Schwarz and E. Witten: Superstring Theory, volume 1: Introduction†
M. B. Green, J. H. Schwarz and E. Witten: Superstring Theory, volume 2: Loop Amplitudes, Anomalies and Phenomenology†
V. N. Gribov: The Theory of Complex Angular Momenta
S. W. Hawking and G. F. R. Ellis: The Large-Scale Structure of Space-Time†
F. Iachello and A. Arima: The Interacting Boson Model
F. Iachello and P. van Isacker: The Interacting Boson–Fermion Model
C. Itzykson and J.-M. Drouffe: Statistical Field Theory, volume 1: From Brownian Motion to Renormalization and Lattice Gauge Theory†
C. Itzykson and J.-M. Drouffe: Statistical Field Theory, volume 2: Strong Coupling, Monte Carlo Methods, Conformal Field Theory, and Random Systems†
C. Johnson: D-Branes
J. I. Kapusta: Finite-Temperature Field Theory†
V. E. Korepin, A. G. Izergin and N. M. Bogoliubov: The Quantum Inverse Scattering Method and Correlation Functions†
M. Le Bellac: Thermal Field Theory†
Y. Makeenko: Methods of Contemporary Gauge Theory
N. Manton and P. Sutcliffe: Topological Solitons
N. H. March: Liquid Metals: Concepts and Theory
I. M. Montvay and G. Münster: Quantum Fields on a Lattice†
L. O’Raifeartaigh: Group Structure of Gauge Theories†
T. Ortín: Gravity and Strings
A. Ozorio de Almeida: Hamiltonian Systems: Chaos and Quantization†
R. Penrose and W. Rindler: Spinors and Space-Time, volume 1: Two-Spinor Calculus and Relativistic Fields†
R. Penrose and W. Rindler: Spinors and Space-Time, volume 2: Spinor and Twistor Methods in Space-Time Geometry†
S. Pokorski: Gauge Field Theories, 2nd edition
J. Polchinski: String Theory, volume 1: An Introduction to the Bosonic String
J. Polchinski: String Theory, volume 2: Superstring Theory and Beyond
V. N. Popov: Functional Integrals and Collective Excitations†
R. J. Rivers: Path Integral Methods in Quantum Field Theory†
R. G. Roberts: The Structure of the Proton†
C. Rovelli: Quantum Gravity
W. C. Saslaw: Gravitational Physics of Stellar and Galactic Systems†
H. Stephani, D. Kramer, M. A. H. MacCallum, C. Hoenselaers and E. Herlt: Exact Solutions of Einstein’s Field Equations, 2nd edition
J. M. Stewart: Advanced General Relativity†
A. Vilenkin and E. P. S. Shellard: Cosmic Strings and Other Topological Defects†
R. S. Ward and R. O. Wells Jr: Twistor Geometry and Field Theories†
J. R. Wilson and G. J. Mathews: Relativistic Numerical Hydrodynamics
† Issued as a paperback
Quantum Gravity
CARLO ROVELLI
Centre de Physique Théorique de Luminy
Université de la Méditerranée, Marseille
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Tokyo, Mexico City
Cambridge University Press, The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521715966
© Cambridge University Press 2004
This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published 2004
Reprinted 2005
First paperback edition published with correction 2008
Reprinted 2010
A catalogue record for this publication is available from the British Library
Library of Congress Cataloguing in Publication data
ISBN 978-0-521-83733-0 Hardback
ISBN 978-0-521-71596-6 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate. Information regarding prices, travel timetables and other factual information given in this work is correct at the time of first printing but Cambridge University Press does not guarantee the accuracy of such information thereafter.
Contents
Foreword, by James Bjorken
Preface
Preface to the paperback edition
Acknowledgements
Terminology and notation
Part I Relativistic foundations
1 General ideas and heuristic picture
  1.1 The problem of quantum gravity
    1.1.1 Unfinished revolution
    1.1.2 How to search for quantum gravity?
    1.1.3 The physical meaning of general relativity
    1.1.4 Background-independent quantum field theory
  1.2 Loop quantum gravity
    1.2.1 Why loops?
    1.2.2 Quantum space: spin networks
    1.2.3 Dynamics in background-independent QFT
    1.2.4 Quantum spacetime: spinfoam
  1.3 Conceptual issues
    1.3.1 Physics without time
2 General Relativity
  2.1 Formalism
    2.1.1 Gravitational field
    2.1.2 “Matter”
    2.1.3 Gauge invariance
    2.1.4 Physical geometry
    2.1.5 Holonomy and metric
  2.2 The conceptual path to the theory
    2.2.1 Einstein’s first problem: a field theory for the newtonian interaction
    2.2.2 Einstein’s second problem: relativity of motion
    2.2.3 The key idea
    2.2.4 Active and passive diffeomorphisms
    2.2.5 General covariance
  2.3 Interpretation
    2.3.1 Observables, predictions and coordinates
    2.3.2 The disappearance of spacetime
  2.4 Complements
    2.4.1 Mach principles
    2.4.2 Relationalism versus substantivalism
    2.4.3 Has general covariance any physical content?
    2.4.4 Meanings of time
    2.4.5 Nonrelativistic coordinates
    2.4.6 Physical coordinates and GPS observables
3 Mechanics
  3.1 Nonrelativistic mechanics
  3.2 Relativistic mechanics
    3.2.1 Structure of relativistic systems: partial observables, relativistic states
    3.2.2 Hamiltonian mechanics
    3.2.3 Nonrelativistic systems as a special case
    3.2.4 Mechanics is about relations between observables
    3.2.5 Space of boundary data G and Hamilton function S
    3.2.6 Evolution parameters
    3.2.7 Complex variables and reality conditions
  3.3 Field theory
    3.3.1 Partial observables in field theory
    3.3.2 Relativistic hamiltonian mechanics
    3.3.3 The space of boundary data G and the Hamilton function S
    3.3.4 Hamilton–Jacobi
  3.4 Thermal time hypothesis
4 Hamiltonian general relativity
  4.1 Einstein–Hamilton–Jacobi
    4.1.1 3d fields: “The length of the electric field is the area”
    4.1.2 Hamilton function of GR and its physical meaning
  4.2 Euclidean GR and real connection
    4.2.1 Euclidean GR
    4.2.2 Lorentzian GR with a real connection
    4.2.3 Barbero connection and Immirzi parameter
  4.3 Hamiltonian GR
    4.3.1 Version 1: real SO(3, 1) connection
    4.3.2 Version 2: complex SO(3) connection
    4.3.3 Configuration space and hamiltonian
    4.3.4 Derivation of the Hamilton–Jacobi formalism
    4.3.5 Reality conditions
5 Quantum mechanics
  5.1 Nonrelativistic QM
    5.1.1 Propagator and spacetime states
    5.1.2 Kinematical state space K and “projector” P
    5.1.3 Partial observables and probabilities
    5.1.4 Boundary state space K and covariant vacuum |0⟩
    5.1.5 Evolving constants of motion
  5.2 Relativistic QM
    5.2.1 General structure
    5.2.2 Quantization and classical limit
    5.2.3 Examples: pendulum and timeless double pendulum
  5.3 Quantum field theory
    5.3.1 Functional representation
    5.3.2 Field propagator between parallel boundary surfaces
    5.3.3 Arbitrary boundary surfaces
    5.3.4 What is a particle?
    5.3.5 Boundary state space K and covariant vacuum |0⟩
    5.3.6 Lattice scalar product, intertwiners and spin network states
  5.4 Quantum gravity
    5.4.1 Transition amplitudes in quantum gravity
    5.4.2 Much ado about nothing: the vacuum
  5.5 Complements
    5.5.1 Thermal time hypothesis and Tomita flow
    5.5.2 The “choice” of the physical scalar product
    5.5.3 Reality conditions and scalar product
  5.6 Relational interpretation of quantum theory
    5.6.1 The observer observed
    5.6.2 Facts are interactions
    5.6.3 Information
    5.6.4 Spacetime relationalism versus quantum relationalism
Part II Loop quantum gravity
6 Quantum space
  6.1 Structure of quantum gravity
  6.2 The kinematical state space K
    6.2.1 Structures in K
    6.2.2 Invariances of the scalar product
    6.2.3 Gauge-invariant and diffeomorphism-invariant states
  6.3 Internal gauge invariance. The space $K_0$
    6.3.1 Spin network states
    6.3.2 Details about spin networks
  6.4 Diffeomorphism invariance. The space $K_{\mathrm{diff}}$
    6.4.1 Knots and s-knot states
    6.4.2 The Hilbert space $K_{\mathrm{diff}}$ is separable
  6.5 Operators
    6.5.1 The connection A
    6.5.2 The conjugate momentum E
  6.6 Operators on $K_0$
    6.6.1 The operator A(S)
    6.6.2 Quanta of area
    6.6.3 n-hand operators and recoupling theory
    6.6.4 Degenerate sector
    6.6.5 Quanta of volume
  6.7 Quantum geometry
    6.7.1 The texture of space: weaves
7 Dynamics and matter
  7.1 Hamiltonian operator
    7.1.1 Finiteness
    7.1.2 Matrix elements
    7.1.3 Variants
  7.2 Matter: kinematics
    7.2.1 Yang–Mills
    7.2.2 Fermions
    7.2.3 Scalars
    7.2.4 The quantum states of space and matter
  7.3 Matter: dynamics and finiteness
  7.4 Loop quantum gravity
    7.4.1 Variants
8 Applications
  8.1 Loop quantum cosmology
    8.1.1 Inflation
  8.2 Black-hole thermodynamics
    8.2.1 The statistical ensemble
    8.2.2 Derivation of the Bekenstein–Hawking entropy
    8.2.3 Ringing modes frequencies
    8.2.4 The Bekenstein–Mukhanov effect
  8.3 Observable effects
9 Quantum spacetime: spinfoams
  9.1 From loops to spinfoams
  9.2 Spinfoam formalism
    9.2.1 Boundaries
  9.3 Models
    9.3.1 3d quantum gravity
    9.3.2 BF theory
    9.3.3 The spinfoam/GFT duality
    9.3.4 BC models
    9.3.5 Group field theory
    9.3.6 Lorentzian models
  9.4 Physics from spinfoams
    9.4.1 Particles’ scattering and Minkowski vacuum
10 Conclusion
  10.1 The physical picture of loop gravity
    10.1.1 GR and QM
    10.1.2 Observables and predictions
    10.1.3 Space, time and unitarity
    10.1.4 Quantum gravity and other open problems
  10.2 What has been achieved and what is missing?
Part III Appendices
Appendix A Groups and recoupling theory
  A1 SU(2): spinors, intertwiners, n-j symbols
  A2 Recoupling theory
    A2.1 Penrose binor calculus
    A2.2 KL recoupling theory
    A2.3 Normalizations
  A3 SO(n) and simple representations
Appendix B History
  B1 Three main directions
  B2 Five periods
    B2.1 The Prehistory: 1930–1957
    B2.2 The Classical Age: 1958–1969
    B2.3 The Middle Ages: 1970–1983
    B2.4 The Renaissance: 1984–1994
    B2.5 Nowadays: 1995–
  B3 The divide
Appendix C On method and truth
  C1 The cumulative aspects of scientific knowledge
  C2 On realism
  C3 On truth
References
Index
Foreword
The problem of what happens to classical general relativity at the extreme short-distance Planck scale of $10^{-33}$ cm is clearly one of the most pressing in all of physics. It seems abundantly clear that profound modifications of existing theoretical structures will be mandatory by the time one reaches that distance scale. There exist several serious responses to this challenge. These include effective field theory, string theory, loop quantum gravity, thermogravity, holography, and emergent gravity. Effective field theory is to gravitation as chiral perturbation theory is to quantum chromodynamics – appropriate at large distances, and impotent at short. Its primary contribution is the recognition that the Einstein–Hilbert action is no doubt only the first term in an infinite series constructed out of higher powers of the curvature tensor. String theory emphasizes the possible roles of supersymmetry, extra dimensions, and the standard-model internal symmetries in shaping the form of the microscopic theory. Loop gravity most directly attacks the fundamental quantum issues, and features the construction of candidate wave-functionals which are background independent. Thermogravity explores the apparent deep connection of semiclassical gravity to thermodynamic concepts such as temperature and entropy. The closely related holographic ideas connect theories defined in bulk spacetimes to complementary descriptions residing on the boundaries. Finally, emergent gravity suggests that the time-tested symbiotic relationship between condensed matter theory and elementary particle theory should be extended to the gravitational and cosmological contexts as well, with more lessons yet to be learned.
In each of the approaches, difficult problems stand in the way of attaining a fully satisfactory solution to the basic issues. Each has its band of enthusiasts, the largest by far being the string community. Most of the approaches come with rather strong ideologies, especially apparent when they are popularized. The presence of these ideologies tends to isolate the communities from each other. In my opinion this is extremely unfortunate, because it is probable that all these ideologies, including my own (which is distinct from the above listing), are dead wrong. The evidence is history: from the Greeks to Kepler to Newton to Einstein, there has been no shortage of grand ideas regarding the Basic Questions. In the presence of new data available to us and not them, only fragments of those grand visions remain viable. The clutter of thirty-odd standard model parameters and the descriptive nature of modern cosmology suggest that we too have quite a way to go before ultimate simplicity is attained. This does not mean abandoning ideologies – they are absolutely essential in driving us all to work hard on the problems. But it does mean that an attitude of humility and of high sensitivity toward alternative approaches is essential.
This book is about only one approach to the subject – loop quantum gravity. It is a subject of considerable technical difficulty, and the literature devoted to it is a formidable one. This feature alone has hindered the cross-fertilization which is, as delineated above, so essential for progress. However, within these pages one will find a much more accessible description of the subject, put forward by one of its leading architects and deepest thinkers. The existence of such a fine book will allow this important subject, quite likely to contribute significantly to the unknown ultimate theory, to be assimilated by a much larger community of theorists. If this does indeed come to pass, its publication will become one of the most important developments in this very active subfield since its onset.
James Bjorken
Preface
A dream I have long held was to write a “treatise” on quantum gravity once the theory had been finally found and experimentally confirmed. We are not yet there. There is neither experimental support nor sufficient theoretical consensus. Still, a large amount of work has been developed over the last twenty years towards a quantum theory of spacetime. Many issues have been clarified, and a definite approach has crystallized. The approach, variously denoted,¹ is mostly known as “loop quantum gravity”.
¹ See the notation section.
The problem of quantum gravity has many aspects. Ideas and results are scattered in the literature. In this book I have attempted to collect the main results and to present an overall perspective on quantum gravity, as developed during this twenty-year period. The point of view is personal and the choice of subjects is determined by my own interests. I apologize to friends and colleagues for what is missing; the reason so much is missing is due to my own limitations, for which I am the first to be sorry.
It is difficult to over-estimate the vastitude of the problem of quantum gravity. The physics of the early twentieth century has modified the very foundation of our understanding of the physical world, changing the meaning of the basic concepts we use to grasp it: matter, causality, space and time. We are not yet able to paint a consistent picture of the world in which these modifications taken together make sense. The problem of quantum gravity is nothing less than the problem of finding this novel consistent picture, finally bringing the twentieth century scientific revolution to an end.
Solving a problem of this sort is not just a matter of mathematical skill. As was the case with the birth of quantum mechanics, relativity, electromagnetism, and newtonian mechanics, there are conceptual and foundational problems to be addressed. We have to understand which (possibly new) notions make sense and which old notions must be discarded, in order to describe spacetime in the quantum relativistic regime. What we need is not just a technique for computing, say, graviton–graviton scattering amplitudes (although we certainly want to be able to do so, eventually). We need to re-think the world in the light of what we have learned about it with quantum theory and general relativity.
General relativity, in particular, has modified our understanding of the spatio-temporal structure of reality in a way whose consequences have not yet been fully explored. A significant part of the research in quantum gravity explores foundational issues, and Part I of this book (“Relativistic foundations”) is devoted to these basic issues. It is an exploration of how to rethink basic physics from scratch, after the general-relativistic conceptual revolution. Without this, we risk asking any tentative quantum theory of gravity the wrong kind of questions.
Part II of the book (“Loop quantum gravity”) focuses on the loop approach. The loop theory described in Part II can be studied by itself, but its reason and interpretation are only clear in the light of the general framework studied in Part I. Although several aspects of this theory are still incomplete, the subject is mature enough to justify a book. A theory begins to be credible only when its original predictions are reasonably unique and are confirmed by new experiments. Loop quantum gravity is not yet credible in this sense. Nor is any other current tentative theory of quantum gravity. The interest of the loop theory, in my opinion, is that at present it is the only approach to quantum gravity leading to well-defined physical predictions (falsifiable, at least in principle) and, more importantly, it is the most determined effort for a genuine merging of quantum field theory with the world view that we have discovered with general relativity. The future will tell us more.
There are several other introductions to loop quantum gravity. Classic reports on the subject [1–10, in chronological order] illustrate various stages of the development of the theory. For a rapid orientation, and to appreciate different points of view, see the review papers [11–15]. Much useful material can be found in [16]. Good introductions to spinfoam theory are to be found in [11, 17–19]. This book is self-contained, but I have tried to avoid excessive duplications, referring to other books and review papers for nonessential topics well developed elsewhere. This book focuses on physical and conceptual aspects of loop quantum gravity. Thomas Thiemann’s book [20], which is going to be completed soon, focuses on the mathematical foundation of the same theory. The two books are complementary: this book can almost be read as Volume 1 (“Introduction and conceptual framework”) and Thiemann’s book as Volume 2 (“Complete mathematical framework”) of a general presentation of loop quantum gravity.
The book assumes that the reader has a basic knowledge of general relativity, quantum mechanics and quantum field theory. In particular, the aim of the chapters on general relativity (Chapter 2), classical mechanics (Chapter 3), hamiltonian general relativity (Chapter 4) and quantum theory (Chapter 5) is to offer the fresh perspective on these topics which is needed for quantum gravity to a reader already familiar with the conventional formulation of these theories.
Sections with comments and examples are printed in smaller fonts (see Section 1.3.1 for a first such example). Sections that contain side or more complex topics and that can be skipped in a first reading without compromising the understanding of what follows are marked with a star (∗) in the title. References in the text appear only when strictly needed for comprehension. Each chapter ends with a short bibliographical section, pointing out essential references for the reader who wants to go into more detail or to trace original works on a subject. I have given up the immense task of collecting a full bibliography on loop quantum gravity. On many topics I refer to specific review articles where ample bibliographic information can be found. An extensive bibliography on loop quantum gravity is given in [9] and [20].
I have written this book thinking of a researcher interested in working in quantum gravity, but also of a good PhD student or an open-minded scholar curious about this extraordinary open problem. I have found the journey towards general relativistic quantum physics, towards quantum spacetime, a fascinating adventure. I hope the reader will see the beauty I see, and that he or she will be capable of completing the journey. The landscape is magic, the trip is far from being over.
Preface to the paperback edition
Three years have lapsed since the first edition of this book. During these three years, the research in loop gravity has been developing briskly and in several directions. Remarkable new results are, for instance, the proof that spinfoam and hamiltonian loop theory are equivalent in 3d, the proof of the unicity of the loop representation (the “LOST” theorem), the resolution of the r = 0 black hole singularity, major advances in loop cosmology, the result that in 3d loop quantum gravity plus matter yields an effective non-commutative quantum field theory, the “master constraint” program for the definition of the quantum dynamics, the idea of deriving particles from linking, the recalculation of the Immirzi parameter from black hole thermodynamics, and, last but not least, the first steps toward calculating scattering amplitudes from the background independent quantum theory. I am certainly neglecting something that will soon turn out to be important.
I have added notes and pointers to recent literature or recent review papers, where the interested reader can find updates on specific topics. In spite of these rapid developments, however, it is too early for a full-fledged second edition of this book: it seems to me that the book, as it is, still provides a comprehensive introduction to the field. In fact, several of these developments reinforce the point of view of this book, namely that the lines of research considered form a coherent picture, and define a common language in which a consistent quantum field theory without (background) space and time can be defined.
When I feel pessimistic, I see the divergence between research lines and the impressive number of problems that are still open. When I feel optimistic, I see their remarkable coherence, and I dream we might be, with respect to quantum gravity, as Einstein was in 1914: with all the machinery ready, trying a number of similar field equations. Then it seems to me that a quantum theory of gravity (certainly not the final theory of everything) is truly at hand; maybe we have it, maybe what we need is just the right combination of techniques, a few more details, or one last missing key idea.
Once again, my wish is that among the readers of this paperback edition there is she or he who will give us this last missing idea.
Acknowledgements
I am indebted to the many people that have sent suggestions and corrections to the draft of this book posted online and to its first edition. Among them are M. Carling, Alexandru Mustatea, Daniele Oriti, John Baez, Rafael Kaufmann Nedal, Colin Hayhurst, Jürgen Ehlers, Chris Gauthier, Gianluca Calcagni, Tomas Liko, Chang Chi-Ming, Youngsub Yoon, Martin Bojowald and Gen Zhang. Special thanks in particular to Justin Malecki, Jacob Bourjaily and Leonard Cottrell.
My great gratitude goes to the friends with whom I have had the privilege of sharing this adventure.
To Lee Smolin, companion of adventures and friend. His unique creativity and intelligence, intellectual freedom and total honesty are among the very best things I have found in life.
To Abhay Ashtekar, whose tireless analytical rigor, synthesis capacity and leadership have been a most precious guide. Abhay has solidified our ideas and transformed our intuitions into theorems. This book is a result of Lee’s and Abhay’s ideas and work as much as my own.
To Laura Scodellari and Chris Isham, my first teachers, who guided me into mathematics and quantum gravity.
To Ted Newman, who, with Sally, parented the little boy just arrived from the Empire’s far provinces. I have shared with Ted a decade of intellectual joy. His humanity, generosity, honesty, passion and love for thinking are the example against which I judge myself.
I would like to thank one by one all the friends working in this field, who have developed the ideas and results described in this book, but they are too many. I can only mention my direct collaborators and a few friends outside this field: Luisa Doplicher, Simone Speziale, Thomas Schücker, Florian Conrady, Daniele Colosi, Etera Livine, Daniele Oriti, Florian Girelli, Roberto DePietri, Robert Oeckl, Merced Montesinos, Kirill Krasnov, Carlos Kozameh, Michael Reisenberger, Don Marolf, Berndt Brügmann, Junichi Iwasaki, Gianni Landi, Mauro Carfora, Jorma Louko, Marcus Gaul, Hugo Morales-Tecotl, Laurent Freidel, Renate Loll, Alejandro Perez, Giorgio Immirzi, Philippe Roche, Federico Laudisa, Jorge Pullin, Thomas Thiemann, Louis Crane, Jerzy Lewandowski, John Baez, Ted Jacobson, Marco Toller, Jeremy Butterfield, John Norton, John Barrett, Jonathan Halliwell, Massimo Testa, David Finkelstein, Gary Horowitz, John Earman, Julian Barbour, John Stachel, Massimo Pauri, Jim Hartle, Roger Penrose, John Wheeler and Alain Connes.
With all these friends I have had the joy of talking about physics in a way far from problem-solving, from outsmarting each other, or from making weapons to make “us” stronger than “them”. I think that physics is about escaping the prison of the received thoughts and searching for novel ways of thinking the world, about trying to clear a bit the misty lake of our insubstantial dreams, which reflect reality like the lake reflects the mountains.
Foremost, thanks to Bonnie – she knows why.
Terminology and notation
• In this book, “relativistic” means “general relativistic”, unless otherwise specified. When referring to special relativity, I say so explicitly. Similarly, “nonrelativistic” and “prerelativistic” mean “non-general-relativistic” and “pre-general-relativistic”. The choice is a bit unusual (special relativity, in this language, is “nonrelativistic”). One reason for it is simply to make language smoother: the book is about general relativistic physics, and repeating “general” every other line sounds too much like a Frenchman talking about de Gaulle. But there is a more substantial reason: the complete revolution in spacetime physics which truly deserves the name of relativity is general relativity, not special relativity. This opinion is not always shared today, but it was Einstein’s opinion. Einstein has been criticized on this, but in my opinion the criticisms miss the full reach of Einstein’s discovery about spacetime. One of the aims of this book is to defend, in modern language, Einstein’s intuition that his gravitational theory is the full implementation of relativity in physics. This point is discussed at length in Chapter 2.
• I often indulge in the physicists’ bad habit of mixing up function $f$ and function values $f(x)$. Care is used when relevant. Similarly, I follow standard physicists’ abuse of language in denoting a field such as the Maxwell potential as $A_\mu(x)$, $A(x)$, or $A$, where the three notations are treated as equivalent manners of denoting the field. Again, care is used where relevant.
• All fields are assumed to be smooth, unless otherwise specified. All statements about manifolds and functions are local unless otherwise specified; that is, they hold within a single coordinate patch. In general I do not specify the domain of definition of functions; clearly, equations hold where functions are defined.
• Index notation follows the most common choice in the field: Greek indices from the middle of the alphabet, $\mu, \nu, \ldots = 0, 1, 2, 3$, are 4d spacetime tangent indices. Upper case Latin indices from the middle of the alphabet, $I, J, \ldots = 0, 1, 2, 3$, are 4d Lorentz tangent indices. (In the special relativistic context the two are used without distinction.) Lower case Latin indices from the beginning of the alphabet, $a, b, \ldots = 1, 2, 3$, are 3d tangent indices. Lower case Latin indices from the middle of the alphabet, $i, j, \ldots = 1, 2, 3$, are 3d indices in $R^3$. Coordinates of a 4d manifold are usually indicated as $x, y, \ldots$, while 3d manifold coordinates are usually indicated as $\vec{x}, \vec{y}, \ldots$ (also as $\vec{\tau}$). Thus the components of a spacetime coordinate $x$ are
$$x^\mu = (t, \vec{x}) = (x^0, x^a),$$
while the components of a Lorentz vector $e$ are
$$e^I = (e^0, e^i).$$
• $\eta_{IJ}$ is the Minkowski metric, with signature $[-,+,+,+]$. The indices $I, J, \ldots$ are raised and lowered with $\eta_{IJ}$. $\delta_{ij}$ is the Kronecker delta, or the $R^3$ metric. The indices $i, j, \ldots$ are raised and lowered with $\delta_{ij}$.
• For reasons explained at the beginning of Chapter 2, I call “gravitational field” the tetrad field $e^I_\mu(x)$, instead of the metric tensor $g_{\mu\nu}(x) = \eta_{IJ}\, e^I_\mu(x)\, e^J_\nu(x)$.
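• $\epsilon_{IJKL}$, or $\epsilon_{\mu\nu\rho\sigma}$, is the completely antisymmetric object with $\epsilon_{0123} = 1$. Similarly for $\epsilon_{abc}$, or $\epsilon_{ijk}$, in 3d. The Hodge star is defined by
$$F^*_{IJ} = \frac{1}{2}\,\epsilon_{IJKL}\, F^{KL}$$
in flat space, and by the same equation, where $F_{IJ}\, e^I_\mu e^J_\nu = F_{\mu\nu}$ and $F^*_{IJ}\, e^I_\mu e^J_\nu = F^*_{\mu\nu}$, in the presence of gravity. Equivalently,
$$F^*_{\mu\nu} = \sqrt{-\det g}\;\frac{1}{2}\,\epsilon_{\mu\nu\rho\sigma}\, F^{\rho\sigma} = |\det e|\;\frac{1}{2}\,\epsilon_{\mu\nu\rho\sigma}\, F^{\rho\sigma}.$$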
• Symmetrization and antisymmetrization of indices is defined with a half: $A_{(ab)} = \frac{1}{2}(A_{ab} + A_{ba})$ and $A_{[ab]} = \frac{1}{2}(A_{ab} - A_{ba})$.
• I call “curve” on a manifold $M$ a map
$$\gamma : I \to M, \qquad s \mapsto \gamma^a(s),$$
where $I$ is an interval of the real line $R$ (possibly the entire $R$). I call “path” an oriented unparametrized curve, namely an equivalence class of curves under change of parametrization $\gamma^a(s) \to \gamma'^a(s) = \gamma^a(s'(s))$, with $ds'/ds > 0$.
• An orthonormal basis in the Lie algebras $su(2)$ and $so(3)$ is chosen once and for all, and these algebras are identified with $R^3$. For $so(3)$, the basis vectors $(v_i)^j{}_k$ can be taken proportional to $\epsilon_i{}^j{}_k$; for $su(2)$, the basis vectors $(v_i)^A{}_B$ can be taken proportional to the Pauli matrices, see Appendix A1. Thus, an algebra element $\omega$ in $su(2) \sim so(3)$ has components $\omega^i$.
• For any antisymmetric quantity $v^{ij}$ with two 3d indices $i, j$, I use also the one-index notation
$$v^i = \frac{1}{2}\,\epsilon^i{}_{jk}\, v^{jk}, \qquad v^{ij} = \epsilon^{ij}{}_k\, v^k;$$
the one-index and the two-indices notation are considered as defining the same object. For instance, the SO(3) connections $\omega^{ij}$ and $A^{ij}$ are equivalently denoted $\omega^i$ and $A^i$.
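As a consistency check on two of the conventions above, the following sketch verifies (i) that the two forms of the curved-space Hodge star agree, using only the relation between metric and tetrad and the fact that the Minkowski metric has determinant $-1$, and (ii) that the one-index and two-index notations are inverse to each other, using the standard contraction identity $\epsilon_{ijk}\epsilon_{jkl} = 2\delta_{il}$. The position of the 3d indices is immaterial here, since they are moved with $\delta_{ij}$; the placement chosen below is only one possible way of writing it.

```latex
% (i) Determinant of the metric in terms of the tetrad
\det g = \det\!\big(\eta_{IJ}\, e^I_\mu e^J_\nu\big)
       = (\det \eta)\,(\det e)^2
       = -(\det e)^2
\quad\Longrightarrow\quad
\sqrt{-\det g} = |\det e| .

% (ii) Inserting v^{jk} = \epsilon^{jk}{}_l v^l into the one-index definition
\tfrac12\, \epsilon^{i}{}_{jk}\, v^{jk}
  = \tfrac12\, \epsilon^{i}{}_{jk}\, \epsilon^{jk}{}_{l}\, v^{l}
  = \tfrac12\, \big(2\,\delta^{i}{}_{l}\big)\, v^{l}
  = v^{i} .
```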
Symbols. Here is a list of symbols, with their name and the equation, chapter or section where they are introduced or defined.

A: area, Section 2.1.4
A: Yang–Mills connection, Equation (2.30)
A, $A^i_\mu(x)$: selfdual 4d gravitational connection, Equation (2.19)
A, $A^i_a(x)$: selfdual or real 3d gravitational connection, Sections 4.1.1, 4.2
C: relativistic configuration space, Section 3.2.1
$D_\mu$: covariant derivative, Equation (2.31)
$Diff^*$: extended diffeomorphism group, Section 6.2.2
$e^I_\mu(x)$: gravitational field, Equation (2.1)
e: determinant of $e^I_\mu$
e: edge (of spinfoam), Section 9.1
E, $E^a_i(x)$: gravitational electric field, Section 4.1.1
f: face (of spinfoam), Section 9.1
F: curvature two-form, Section 2.1.1
g, or U: group element
G: Newton constant
G: space of boundary data, Sections 3.2.5–3.3.3
$h_\gamma$: $U(A, \gamma)$, Section 7.1
H: relativistic hamiltonian, Section 3.2
$H_0$: nonrelativistic (conventional) hamiltonian, Section 3.2
H: quantum state space, Chapter 5
$H_0$: nonrelativistic quantum state space, Chapter 5
$i_n$: intertwiner on spin network node n, Section 6.3
$i_e$: intertwiner on spinfoam edge e, Chapter 9
j: irreducible representation (for SU(2): spin)
$j_l$: spin associated to spin network link l, Section 6.2.1
$j_f$: representation associated to spinfoam face f, Chapter 9
K: kinematical quantum state space, Section 5.2
$K_0$: SU(2) invariant quantum state space, Section 6.2.3
$K_{diff}$: diff-invariant quantum state space, Section 6.2.3
K: boundary quantum space, Sections 5.1.4, 5.3.5
l: link (of spin network), Section 9.1
$l_P$: Planck length, $\sqrt{\hbar G c^{-3}}$
L: length, Section 2.1.4
M: spacetime manifold
n: node (of spin network), Section 9.1
$p_a$: relativistic momenta (including $p_t$), Section 3.2
$p_t$: momentum conjugate to t, Section 3.2
P: the “projector” operator, Section 5.2
$P_G$: group G projector, Equation (9.117)
$P_H$: subgroup H projector, Equation (9.119)
P: transition probability, Chapter 5
P: path ordered, Equation (2.81)
$q^a$: partial observables, Section 3.2
$R^I{}_{J\mu\nu}(x)$: curvature, Equation (2.8)
$R^{(j)\alpha}{}_\beta(g)$: matrix of group element g in representation j
R: 3d region, Section 2.1.4
s: s-knot, abstract spin network, Equation (6.41)
$|s\rangle$: s-knot state, Equation (6.41)
$S_{BH}$: black-hole entropy, Section 8.2
S: embedded spin network, Section 6.3
$|S\rangle$: spin network state, Section 6.3.1
S: 2d surface, Section 2.1.4
S: space of fast decrease functions, Chapter 5
$S_0$: space of tempered distributions, Chapter 5
$S[\gamma]$: action functional, Section 3.2
$S(q^a)$: Hamilton–Jacobi function, Section 3.2.2
$S(q^a, q^a_0)$: Hamilton function, Section 3.2.5
$t_\rho$: thermal time, Sections 3.4, 5.5.1
T: target space of a field theory, Section 3.3.1
U, or g: group element
$U(A, \gamma)$: holonomy, Section 2.1.5
v: vertex (of spinfoam), Section 9.1
V: volume, Section 2.1.4
$W(q^a, q'^a)$: propagator, Chapter 5
W: transition amplitudes, propagator, Section 5.2
x: 4d spacetime coordinates
$\vec{x}$: 3d coordinates
Z: partition function, Chapter 9
α: loop, closed path
β: inverse temperature, Section 3.4
γ: path
γ: motion (in C), Section 3.2.1
γ: Immirzi parameter, Section 4.2.3
γ: motion in Ω, Section 3.2
Γ: relativistic phase space, Section 3.2.1
Γ: graph, Section 6.2
Γ: two-complex, Chapter 9
θ: Poincaré–Cartan form on Σ, Section 3.2.2
θ: Poincaré form on Ω, Equation (3.9)
$\eta_{IJ}$, $\eta_{\mu\nu}$: Minkowski metric = diag$[-1, 1, 1, 1]$
λ: cosmological constant, Equation (2.11)
λ: gauge parameter, Section 2.1.3
ρ: statistical state, Sections 3.4, 5.5.1
Σ: constraint surface H = 0, Section 3.2.2
σ, Σ: 3d boundary surface, Chapter 4
σ: spinfoam, Chapter 9
φ(x): scalar field, Equation (2.32)
ψ(x): fermion field, Equation (2.35)
ω: presymplectic form on Σ, Section 3.2.2
$\omega^I_{\mu J}(x)$: spin connection, Equation (2.2)
ω: symplectic form on Ω, Section 3.2.2
Ω: space of observables and momenta, Sections 3.2–3.3.2
6j: Wigner 6j symbol, Equation (9.33)
10j: Wigner 10j symbol, Equation (9.103)
15j: Wigner 15j symbol, Equation (9.56)
$|0\rangle$: covariant vacuum in K, Sections 5.1.4, 5.3.5
$|0_t\rangle$: dynamical vacuum in $K_t$, Sections 5.1.4, 5.3.2
$|0_M\rangle$: Minkowski vacuum in H, Sections 5.1.4, 5.3.1
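To give a sense of the scale set by the Planck length listed above, here is a rough numerical estimate. It assumes the standard definition in terms of $\hbar$, $G$ and $c$, with rounded SI values of the constants, so the result should be read as an order-of-magnitude figure only.

```latex
l_P = \sqrt{\frac{\hbar G}{c^{3}}}
    \approx \sqrt{\frac{(1.05\times10^{-34}\,\mathrm{J\,s})\,
                        (6.67\times10^{-11}\,\mathrm{m^{3}\,kg^{-1}\,s^{-2}})}
                       {(3.00\times10^{8}\,\mathrm{m\,s^{-1}})^{3}}}
    \approx 1.6\times10^{-35}\,\mathrm{m}.
```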
• The name of the theory. Finally, a word about the name of the quantum theory of gravity described in this book. The theory is known as “loop quantum gravity” (LQG), or sometimes “loop gravity” for short. However, the theory is also designated in the literature using a variety of other names. I list here these other names, and the variations of their use, for the benefit of the disoriented reader.
– “Quantum spin dynamics” (QSD) is used as a synonym of LQG. Within LQG, it is sometimes used to designate in particular the dynamical aspects of the hamiltonian theory.
– “Quantum geometry” is also sometimes used as a synonym of LQG. Within LQG, it is used to designate in particular the kinematical aspects of the theory. The expression “quantum geometry” is generic: it is also widely used in other approaches to quantum spacetime, in particular dynamical triangulations [21] and noncommutative geometry.
– “Nonperturbative quantum gravity”, “canonical quantum gravity” and “quantum general relativity” (QGR) are often used to designate LQG, although their proper meaning is wider.
– The expression “Ashtekar approach” is still used sometimes to designate LQG: it comes from the fact that a key ingredient of LQG is the reformulation of classical GR as a theory of connections, developed by Abhay Ashtekar.
– In the past, LQG was also called “the loop representation of quantum general relativity”. Today, “loop representation” and “connection representation” are used within LQG to designate the representations of the states of LQG as functionals of loops (or spin networks) and as functionals of the connection, respectively. The two are related in the same manner as the energy ($\psi_n = \langle n|\psi\rangle$) and position ($\psi(x) = \langle x|\psi\rangle$) representations of the harmonic oscillator states.
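The harmonic-oscillator analogy in the last item can be made explicit. The sketch below shows only the elementary quantum-mechanical change of basis; the symbol $\phi_n(x) = \langle x|n\rangle$ for the energy eigenfunctions is introduced here for illustration, and no claim is made about the precise form of the corresponding transform in LQG.

```latex
% From the position representation to the energy representation
\psi_n = \langle n|\psi\rangle
       = \int dx\; \langle n|x\rangle\, \langle x|\psi\rangle
       = \int dx\; \overline{\phi_n(x)}\; \psi(x) ,

% and back
\psi(x) = \langle x|\psi\rangle
        = \sum_n \langle x|n\rangle\, \langle n|\psi\rangle
        = \sum_n \phi_n(x)\; \psi_n .
```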
Part I
Relativistic foundations
I know that I am mortal and the creature of a day; but when I search out the massed wheeling circles of the stars, my feet no longer touch the earth: side by side with Zeus himself, I drink my fill of ambrosia, food of the gods.
Claudius Ptolemy, Mathematical Syntaxis
1 General ideas and heuristic picture
The aim of this chapter is to introduce the general ideas on which this book is based, and to present the picture of quantum spacetime that emerges from loop quantum gravity, in a heuristic and intuitive manner. The style of the chapter is therefore conversational, with little regard for precision and completeness. In the course of the book, the ideas and notions introduced here will be made precise, and the claims will be justified and formally derived.
1.1 The problem of quantum gravity
1.1.1 Unfinished revolution
Quantum mechanics (QM) and general relativity (GR) have greatly widened our understanding of the physical world. A large part of the physics of the last century has been a triumphant march of exploration of new worlds opened up by these two theories. QM led to atomic physics, nuclear physics, particle physics, condensed matter physics, semiconductors, lasers, computers, quantum optics. GR led to relativistic astrophysics, cosmology, GPS technology, and is today leading us, hopefully, towards gravitational wave astronomy.
But QM and GR have destroyed the coherent picture of the world provided by prerelativistic classical physics: each was formulated in terms of assumptions contradicted by the other theory. QM was formulated using an external time variable (the t of the Schrödinger equation) or a fixed, nondynamical background spacetime (the spacetime on which quantum field theory is defined). But this external time variable and this fixed background spacetime are incompatible with GR. In turn, GR was formulated in terms of riemannian geometry, assuming that the metric is a smooth and deterministic dynamical field. But QM requires that any dynamical field be quantized: at small scales it manifests itself in discrete quanta and is governed by probabilistic laws.
We have learned from GR that spacetime is dynamical, and we have learned from QM that any dynamical entity is made up of quanta and can be in probabilistic superposition states. Therefore at small scales there should be quanta of space and quanta of time, and quantum superposition of spaces. But what does this mean? We live in a spacetime with quantum properties: a quantum spacetime. What is quantum spacetime? How can we describe it?
Classical prerelativistic physics provided a coherent picture of the physical world. This was based on clear notions such as time, space, matter, particle, wave, force, measurement, deterministic law. This picture has partially evolved (in particular with the advent of field theory and special relativity), but it has remained consistent and quite stable for three centuries. GR and QM have modified these basic notions in depth. GR has modified the notions of space and time; QM the notions of causality, matter, and measurement. The novel, modified notions do not fit together easily. The new coherent picture is not yet available. With all their immense empirical success, GR and QM have left us with an understanding of the physical world which is unclear and badly fragmented. At the foundations of physics there is today confusion and incoherence.
We want to combine what we have learnt about our world from the two theories and to find a new synthesis. This is a major challenge – perhaps the major challenge – in today’s fundamental physics. GR and QM have opened a revolution. The revolution is not yet complete.
With notable exceptions (Dirac, Feynman, Weinberg, DeWitt, Wheeler, Penrose, Hawking, ’t Hooft, among others), most of the physicists of the second half of the last century have ignored this challenge. The urgency was to apply the two theories to larger and larger domains. The developments were momentous, and the dominant attitude was pragmatic. Applying the new theories was more important than understanding them. But an overly pragmatic attitude is not productive in the long run. Towards the end of the twentieth century, the attention of theoretical physics has been increasingly focusing on the challenge of merging the conceptual novelties of QM and GR.
This book is the account of an effort to do so.
1.1.2 How to search for quantum gravity
How to search for this new synthesis? Conventional field quantization methods are based on the weak-field perturbation expansion. Their application to GR fails because it yields a nonrenormalizable theory. Perhaps this is not surprising: GR has changed the notions of space and time too radically to docilely agree with flat space quantum field theory. Something else is needed.
In science there are no secure recipes for discovery, and it is important to explore different directions at the same time. Currently, a quantum theory of gravity is sought along various paths. The two most developed are loop quantum gravity, described in this book, and string theory. Other research directions include dynamical triangulations, noncommutative geometry, Hartle’s quantum mechanics of spacetime (this is not really a specific quantum theory of gravity, but rather a general theoretical framework for general-relativistic quantum theory), Hawking’s euclidean sum over geometries, quantum Regge calculus, Penrose’s twistor theory, Sorkin’s causal sets, ’t Hooft’s deterministic approach and Finkelstein’s theory. The reader can find ample references in the general introductions to quantum gravity mentioned in the note at the end of this chapter. Here, I sketch only the general ideas that motivate the approach described in this book, plus a brief comment on string theory, which is currently the most popular alternative to loop gravity.
Our present knowledge of the basic structure of the physical universe is summarized by GR, quantum theory and quantum field theory (QFT), together with the particle-physics standard model. This set of fundamental theories is inconsistent. But it is characterized by an extraordinary empirical success, nearly unique in the history of science. Indeed, currently there is no evidence of any observed phenomenon that clearly escapes, questions or contradicts this set of theories (or a minor modification of the same, to account, say, for a neutrino mass or a cosmological constant). This set of theories becomes meaningless in certain physical regimes. In these regimes, we expect the predictions of quantum gravity to become relevant and to differ from the predictions of GR and the standard model. These regimes are outside our experimental or observational reach, at least so far. Therefore, we have no direct empirical guidance for searching for quantum gravity – as, say, atomic spectra guided the discovery of quantum theory.
Since quantum gravity is a theory expected to describe regimes that are so far inaccessible, one might worry that anything could happen in these regimes, at scales far removed from our experience. Maybe the search is impossible because the range of the possible theories is too large. This worry is unjustified. If this was the problem, we would have plenty of complete, predictive and coherent theories of quantum gravity. Instead, the situation is precisely the opposite: we haven’t any. The fact is that we do have plenty of information about quantum gravity, because we have QM and we have GR. Consistency with QM and GR is an extremely strict constraint.
A view is sometimes expressed that some totally new, radical and wild hypothesis is needed for quantum gravity. I do not think that this is the case. Wild ideas pulled out of the blue sky have never made science advance. The radical hypotheses that physics has successfully adopted have always been reluctantly adopted because they were forced upon us by new empirical data – Kepler’s ellipses, Bohr’s quantization – or by stringent theoretical deductions – Maxwell’s inductive current, Einstein’s relativity (see Appendix C). Generally, arbitrary novel hypotheses lead nowhere.
In fact, today we are precisely in one of the typical situations in which theoretical physics has worked at its best in the past. Many of the most striking advances in theoretical physics have derived from the effort of finding a common theoretical framework for two basic and apparently conflicting discoveries. For instance, the aim of combining the keplerian orbits with galilean physics led to newtonian mechanics; combining Maxwell theory with galilean relativity led to special relativity; combining special relativity and nonrelativistic quantum theory led to the theoretical discovery of antiparticles; combining special relativity with newtonian gravity led to general relativity, and so on. In all these cases, major advances have been obtained by “taking seriously”¹ apparently conflicting theories, and exploring the implications of holding the key tenets of both theories for true. Today we are precisely in one of these characteristic situations. We have learned two new very general “facts” about Nature, expressed by QM and GR: we have “just” to figure out what they imply, taken together. Therefore, the question we have to ask is: what have we really learned about the world from QM and from GR? Can we combine these insights into a coherent picture? What we need is a conceptual scheme in which the insights obtained with GR and QM fit together.
This view is not the majority view in theoretical physics at present. There is consensus that QM has been a conceptual revolution, but many do not view GR in the same way. According to many, the discovery of GR has been just the writing of one more field theory. This field theory is, furthermore, likely to be only an approximation to a theory we do not yet know. According to this opinion, GR should not be taken too seriously as a guidance for theoretical developments.
I think that this opinion derives from a confusion: the confusion between the specific form of the Einstein–Hilbert action and the modification of the notions of space and time engendered by GR. The Einstein–Hilbert action might very well be a low-energy approximation of a high-energy theory. But the modification of the notions of space and time does not depend on the specific form of the Einstein–Hilbert action. It depends on its diffeomorphism invariance and its background independence. These properties (which are briefly illustrated in Section 1.1.3 below, and discussed in detail in Chapter 2) are most likely to hold in the high-energy theory as well. One should not confuse the details of the dynamics of GR with the modifications of the notions of space and time that GR has determined. If we make this confusion, we underestimate the radical novelty of the physical content of GR. The challenge of quantum gravity is precisely to fully incorporate this radical novelty into QFT. In other words, the task is to understand what is a general-relativistic QFT, or a background-independent QFT.
¹ In [22], Gell-Mann says that the main lesson to be learnt from Einstein is “to ‘take very seriously’ ideas that work, and see if they can be usefully carried much further than the original proponent suggested”.
Today, many physicists prefer disregarding or postponing these foundational issues and, instead, choose to develop and adjust current theories. The most popular strategy towards quantum gravity, in particular, is to pursue the line of research grown in the wake of the success of the standard model of particle physics. The failure of perturbative quantum GR is interpreted as a replay of the failure of Fermi theory.² Namely, as an indication that we must modify GR at high energy. With the input of the grand-unified-theories (GUTs), supersymmetry, and the Kaluza–Klein theory, the search for a high-energy correction of GR free from bad ultraviolet divergences has led to higher derivative theories, supergravity, and finally to string theory.
Sometimes the claim is made that the quantum theory of gravity has already been found, and it is string theory. Since this is a book about quantum gravity without strings, I should say a few words about this claim. String theory is based on a physical hypothesis: elementary objects are extended, rather than particle-like. This hypothesis leads to a very rich unified theory, which contains much phenomenology, including (with suitable inputs) fermions, Yang–Mills fields and gravitons, and is expected by many to be free of ultraviolet divergences. The price to pay for these theoretical results is a gigantic baggage of additional physics: supersymmetry, extra dimensions, an infinite number of fields with arbitrary masses and spins, and so on.
So far, nothing of this new physics shows up in experiments. Supersymmetry, in particular, has been claimed to be on the verge of being discovered for years, but hasn’t shown up. Unfortunately, so far the theory can accommodate any disappointing experimental result because it is hard to derive precise new quantitative physical predictions, with which the theory could be falsified, from the monumental mathematical apparatus of the theory. Furthermore, even recovering the real world is not easy within the theory: the search for a compactification leading to the standard model, with its families and masses and no instabilities, has not yet succeeded, as far as I know. It is clear that string theory is a very interesting hypothesis, but certainly not an established theory. It is therefore important to pursue alternative directions as well.
² Fermi theory was an empirically successful but nonrenormalizable theory of the weak interactions, just as GR is an empirically successful but nonrenormalizable theory of the gravitational interaction. The solution has been the Glashow–Weinberg–Salam electroweak theory, which corrects Fermi theory at high energy.
String theory is a direct development of the standard model and is deeply rooted in the techniques and the conceptual framework of flat space QFT. As I shall discuss in detail throughout this book, many of the tools used in this framework – energy, unitary time evolution, vacuum state, Poincaré invariance, S-matrix, objects moving in a spacetime, Fourier transform – no longer make sense in the quantum gravitational regime, in which the gravitational field cannot be approximated by a background spacetime – perhaps not even asymptotically.³ Therefore string theory does not address directly the main challenge of quantum gravity: understanding what is a background-independent QFT. Facing this challenge directly, before worrying about unification, leads, instead, to the direction of research investigated by loop quantum gravity.⁴
The alternative to the line of research followed by string theory is given by the possibility that the failure of perturbative quantum GR is not a replay of Fermi theory. That is, it is not due to a flaw of the GR action, but, instead, it is due to the fact that the conventional weak-field quantum perturbation expansion cannot be applied to the gravitational field.
This possibility is strongly supported a posteriori by the results of loop quantum gravity. As we shall see, loop quantum gravity leads to a picture of the short-scale structure of spacetime extremely different from that of a smooth background geometry. (There are hints in this direction from string theory calculations as well [25].) Spacetime turns out to have a nonperturbative, quantized, discrete structure at the Planck scale, which is explicitly described by the theory. The ultraviolet divergences are cured by this structure. The ultraviolet divergences that appear in the perturbation expansion of conventional QFT are a consequence of the fact that we erroneously replace this discrete Planck-scale structure with a smooth background geometry.
³ To be sure, the development of string theory has incorporated many aspects of GR, such as curved spacetimes, horizons, black holes and relations between different backgrounds. But this is far from a background-independent framework, such as the one realized by GR in the classical context. GR is not about physics on a curved spacetime, or about relations between different backgrounds: it is about the dynamics of spacetime. A background-independent fundamental definition of string theory is being actively searched for along several directions, but so far the definition of the theory and all calculations rely on background metric spaces.
⁴ It has been repeatedly suggested that loop gravity and string theory might merge, because loop gravity has developed precisely the background-independent QFT methods that string theory needs [23]. Also, excitations over a weave (see Section 6.7.1) have a natural string structure in loop gravity [24].
If this is physically correct, ultraviolet divergences do not require the heavy machinery of string theory to be cured. On the other hand, the conventional weak-field perturbative methods cannot be applied, because we cannot work with a fixed smooth background geometry. We must therefore adapt QFT to the full conceptual novelty of GR, and in particular to the change in the notion of space and time induced by GR. What are these changes? I sketch an answer below, leaving a complete discussion to Chapter 2.
1.1.3 The physical meaning of general relativity
GR is the discovery that spacetime and the gravitational field are the same entity. What we call “spacetime” is itself a physical object, in many respects similar to the electromagnetic field. We can say that GR is the discovery that there is no spacetime at all. What Newton called “space”, and Minkowski called “spacetime”, is unmasked: it is nothing but a dynamical object – the gravitational field – in a regime in which we neglect its dynamics.
In newtonian and special-relativistic physics, if we take away the dynamical entities – particles and fields – what remains is space and time. In general-relativistic physics, if we take away the dynamical entities, nothing remains. The space and time of Newton and Minkowski are re-interpreted as a configuration of one of the fields, the gravitational field. This implies that physical entities – particles and fields – are not immersed in space and moving in time. They do not live on spacetime. They live, so to say, on one another.
It is as if we had observed in the ocean many animals living on an island: animals on the island. Then we discover that the island itself is in fact a great whale. So the animals are no longer on the island, just animals on animals. Similarly, the Universe is not made up of fields on spacetime; it is made up of fields on fields. This book studies the far-reaching effect that this conceptual shift has on QFT.
One consequence is that the quanta of the field cannot live in spacetime: they must build “spacetime” themselves. This is precisely what the quanta of space do in loop quantum gravity.
We may continue to use the expressions “space” and “time” to indicate aspects of the gravitational field, and I do so in this book. We are used to this in classical GR. But in the quantum theory, where the field has quantized “granular” properties and its dynamics is quantized and therefore only probabilistic, most of the “spatial” and “temporal” features of the gravitational field are lost.
Therefore, to understand the quantum gravitational field we must abandon some of the emphasis on geometry. Geometry represents the classical gravitational field, but not quantum spacetime. This is not a betrayal of Einstein’s legacy: on the contrary, it is a step in the direction of “relativity” in the precise sense meant by Einstein. Alain Connes has described beautifully the existence of two points of view on space: the geometric one, centered on space points, and the algebraic, or “spectral” one, centered on the algebra of dual spectral quantities. As emphasized by Alain, quantum theory forces us to a complete shift to this second point of view, because of noncommutativity. In the light of quantum theory, continuous spacetime cannot be anything else than an approximation in which we disregard quantum noncommutativity. In loop gravity, the physical features of space appear as spectral properties of quantum operators that describe our (the observers’) interactions with the gravitational field.
The key conceptual difficulty of quantum gravity is therefore to accept the idea that we can do physics in the absence of the familiar stage of space and time. We need to free ourselves from the prejudices associated with the habit of thinking of the world as “inhabiting space” and “evolving in time”. Chapter 3 describes a language for describing mechanical systems in this generalized conceptual framework.
This absence of the familiar spacetime “stage” is called the background independence of the classical theory. Technically, it is realized by the gauge invariance of the action under (active) diffeomorphisms. A diffeomorphism is a transformation that smoothly drags all dynamical fields and particles from one region of the four-dimensional manifold to another (the precise definition of these transformations is given in Chapter 2). In turn, gauge invariance under diffeomorphism (or diffeomorphism invariance) is the consequence of the combination of two properties of the action: its invariance under arbitrary changes of coordinates and the fact that there is no nondynamical “background” field.
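In formulas, the statement above can be sketched as follows. This is only a schematic illustration: the symbol $\phi$ for the diffeomorphism, the pull-back notation $\phi^*$, and the generic matter field $\psi$ are notational assumptions made here, and the precise definitions are those of Chapter 2.

```latex
% An active diffeomorphism is a smooth invertible map of the manifold to itself
\phi : M \to M , \qquad x \mapsto \phi(x) .

% It drags every dynamical field; for the tetrad, a one-form in the index \mu,
(\phi^{*} e)^{I}_{\mu}(x)
  = \frac{\partial \phi^{\nu}(x)}{\partial x^{\mu}}\;
    e^{I}_{\nu}\big(\phi(x)\big) .

% Background independence: the action takes the same value on the dragged fields,
% and no fixed (nondynamical) field is left behind to detect the dragging
S[\phi^{*} e,\ \phi^{*}\psi] = S[e,\ \psi] .
```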
1.1.4 Background-independent quantum field theory
Is quantum mechanics⁵ compatible with the general-relativistic notions of space and time? It is, provided that we choose a sufficiently general formulation. For instance, the Schrödinger picture is only viable for theories where there is a global observable time variable t; this conflicts with GR, where no such variable exists. Therefore the Schrödinger picture makes little sense in a background-independent context. But there are formulations of quantum theory that are more general than the Schrödinger picture. In Chapter 5, I describe a formulation of QM sufficiently general to deal with general-relativistic systems. (For another relativistic formulation of QM, see [26].) Formulations of this kind are sometimes denoted “generalized quantum mechanics”. I prefer to use “quantum mechanics” to denote any formulation of quantum theory, irrespective of its generality, just as “classical mechanics” is used to designate formalisms with different degrees of generality, such as Newton’s, Lagrange’s, Hamilton’s or symplectic mechanics.
⁵ I use the expression “quantum mechanics” to indicate the theory of all quantum systems, with a finite or infinite number of degrees of freedom. In this sense, QFT is part of quantum mechanics.
On the other hand, most of the conventional machinery of perturbative QFT is profoundly incompatible with the general-relativistic framework. There are many reasons for this:
• The conventional formalism of QFT relies on Poincaré invariance. In particular, it relies on the notion of energy and on the existence of the nonvanishing hamiltonian operator that generates unitary time evolution. The vacuum, for instance, is the state that minimizes the energy. Generally, there is no global Poincaré invariance, no general notion of energy and no nonvanishing hamiltonian operator in a general-relativistic theory.
• At the root of conventional QFT is the physical notion of particle. The theoretical experience with QFT on curved spacetime [27] and on the relation between acceleration and temperature in QFT [28] indicates that in a generic gravitational situation the notion of particle can be quite delicate. (This point is discussed in Section 5.3.4.)
• Consider a conventional renormalized QFT. The physical content of the theory can be expressed in terms of its n-point functions $W(x_1, \ldots, x_n)$. The n-point functions reflect the invariances of the classical theory. In a general-relativistic theory, invariance under a coordinate transformation $x \to x' = x'(x)$ implies immediately that the n-point functions must satisfy
$$W(x_1, \ldots, x_n) = W(x'(x_1), \ldots, x'(x_n)) \qquad (1.1)$$
and therefore (if the points in the argument are distinct) it must be a constant. That is,
$$W(x_1, \ldots, x_n) = \text{constant}. \qquad (1.2)$$
Clearly we are immediately in a very different framework from conventional QFT.
• Similarly, the behavior for small $|x - y|$ of the two-point function of a conventional QFT,
$$W(x, y) = \frac{\text{constant}}{|x - y|^{d}}, \tag{1.3}$$
expresses the short-distance structure of the QFT. More generally, the short-distance structure of the QFT is reflected in the operator product expansion
$$O(x)\, O'(y) = \sum_n \frac{O_n(x)}{|x - y|^{n}}. \tag{1.4}$$
Here $|x - y|$ is the distance measured in the spacetime metric. On flat space, for instance, $|x - y|^2 = \eta_{\mu\nu}(x^\mu - y^\mu)(x^\nu - y^\nu)$. In a general-relativistic context these expressions make no sense, since there is no background Minkowski (or other) metric $\eta_{\mu\nu}$. In its place, there is the gravitational field, namely the quantum field operator itself. But then, if standard operator product expansion becomes meaningless, the short-distance structure of a quantum gravitational theory must be profoundly different from that of conventional QFT. As we shall see in Chapter 7, this is precisely the case.
There is a tentative escape strategy to circumvent these difficulties: write the gravitational field e(x) as the sum of two terms
$$e(x) = e_{\text{background}}(x) + h(x), \tag{1.5}$$
where $e_{\text{background}}(x)$ is a background field configuration. This may be Minkowski, or any other. Assume that $e_{\text{background}}(x)$ defines spacetime, namely it defines location and causal relations. Then consider h(x) as the gravitational field, governed by a QFT on the spacetime background defined by $e_{\text{background}}$. For instance, the field operator h(x) is assumed to commute at spacelike separations, where spacelike is defined in the geometry determined by $e_{\text{background}}(x)$. As a second step, one may then consider conditions on $e_{\text{background}}(x)$ or relations between the formulations of the theory defined by different choices of $e_{\text{background}}(x)$. This escape strategy leads to three orders of difficulties. (i) Conventional perturbative QFT of GR based on (1.5) leads to a nonrenormalizable theory. To get rid of the uncontrollable ultraviolet divergences, one has to resort to the complications of string theory. (ii) As mentioned, loop quantum gravity shows that the structure of spacetime at the Planck scale is discrete. Therefore physical spacetime has no short-distance structure at all. The unphysical assumption of a smooth background $e_{\text{background}}(x)$ implicit in (1.5) may be precisely the cause of the ultraviolet divergences. (iii) The separation of the gravitational field from spacetime is in strident contradiction with the very physical lesson of GR. If GR is of any guide in searching for a quantum theory of gravity, the relevant spacetime geometry is the one determined by the full gravitational field e(x), and the separation (1.5) is misleading.
A formulation of quantum gravity that does not take the escape strategy (1.5) is a background-independent, or general covariant, QFT. The main aim of this book is to develop the formalism for background-independent QFT.
1.2 Loop quantum gravity
I sketch here the physical picture of quantum spacetime that emerges from loop quantum gravity (LQG). The basic ideas and assumptions on which LQG is based are the following.
(i) Quantum mechanics and general relativity. QM, suitably formulated to be compatible with general covariance, is assumed to be correct. The Einstein equations may be modified at high energy, but the general-relativistic notions of space and time are assumed to be correct. The motivation for these two assumptions is the extraordinary empirical success they have had so far, and the absence of any contrary empirical evidence.
(ii) Background independence. LQG is based on the idea that the quantization strategy based on the separation (1.5) is not appropriate for describing the quantum properties of spacetime.
To this we can add:
(iii) No unification. Nowadays, a fashionable idea is that the problem of quantizing gravity has to be solved together with the problem of finding a unified description of all interactions. LQG is a solution of the first problem, not the second.⁶
(iv) Four spacetime dimensions and no supersymmetry. LQG is compatible with these possibilities, but there is nothing in the theory that requires higher dimensions or supersymmetry. Higher spacetime dimensions and supersymmetry are interesting theoretical ideas, which, as many other interesting theoretical ideas, can be physically wrong. In spite of 15 years of search, numerous preliminary announcements of discovery then turned out to be false, and despite repeated proclamations that supersymmetry was going to be discovered "next year," so far empirical evidence has been solidly and consistently against supersymmetry. This might change, but as scientists we must take the indications of the experiments seriously.
⁶ A motivation for the idea that these two issues are connected is the expectation that we are "near the end of physics." Unfortunately, the expectation of being "near the end of physics" has been present all along the three centuries of the history of modern physics. In the present situation of deep conceptual confusion on the fundamental aspects of the world, I see no sign indicating that we are close to the end of our discoveries about the physical world. When I was a student, it was fashionable to claim that the problem of finding a theory of the strong interactions had to be solved together with the problem of getting rid of renormalization theory. Nice idea. But wrong.
On the basis of these assumptions, LQG is a straightforward quantization of GR with its conventional matter couplings. The program of LQG is therefore conservative, and of small ambition. The physical inputs of the theory are just QM and GR, well-tested physical theories. No major additional physical hypothesis or assumption is made (such as: elementary objects are strings, space is made by individual discrete points, quantum mechanics is wrong, GR is wrong, supersymmetry, extra dimensions, ...). No claim of being the final "Theory Of Everything" is made.
On the other hand, LQG has a radical and ambitious side: to merge the conceptual insight of GR into QM. In order to achieve this, we have to give up the familiar notions of space and time. The space continuum "on which" things are located and the time "along which" evolution happens are semiclassical approximate notions in the theory. In LQG, this radical step is assumed in its entirety.
LQG does not make use of most of the familiar tools of conventional QFT, because these become inadequate in a background-independent context. It only makes use of the general tools of quantum theory: a Hilbert space of states, operators related to the measurement of physical quantities, and transition amplitudes that determine the probability outcome of measurements of these quantities. The Hilbert space of states and the operators associated to physical observables are obtained from classical GR following a rather standard quantization strategy. A quantization strategy is a technique for searching for a solution to a well-posed inverse problem: finding a quantum theory with a given classical limit. The inverse problem could have many solutions. As noticed, presently the difficulty is not to discriminate among many complete and consistent quantum theories of gravity. We would be content with one.
1.2.1 Why loops?
Among the technical choices to make in order to implement a quantization procedure is which algebra of field functions to promote to quantum operators. In conventional QFT, this is generally the canonical algebra formed by the positive and negative frequency components of the field modes. The quantization of this algebra leads to the creation and annihilation operators a and a†. The characterization of the positive and negative frequencies requires a background spacetime.
In contrast to this, what characterizes LQG is the choice of a different algebra of basic field functions: a noncanonical algebra based on the holonomies of the gravitational connection. The holonomy (or "Wilson loop") is the matrix of the parallel transport along a closed curve.
The idea that holonomies are the natural variables in a gauge theory has a long history. In a sense, it can be traced back to the very origin of gauge theory, in the physical intuition of Faraday. Faraday understood electromagnetic phenomena in terms of "lines of force." Two key ideas underlie this intuition. First, that the relevant physical variables fill up space; this intuition by Faraday is the origin of field theory. Second, that the relevant variables do not refer to what happens at a point, but rather refer to the relation between different points connected by a line. The mathematical quantity that expresses this idea is the holonomy of the gauge potential along the line. In the Maxwell case, for instance, the holonomy U(A, α) along a loop α is simply the exponential of the line integral along α of the three-dimensional Maxwell potential A:
$$U(A, \alpha) = e^{\oint_\alpha A} = \exp \int_0^{2\pi} ds\, A_a(\alpha(s))\, \frac{d\alpha^a(s)}{ds}. \tag{1.6}$$
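The line integral in (1.6) can be evaluated numerically. The following is a minimal sketch, not taken from the book: it assumes a real potential A and the common convention in which the U(1) holonomy is the phase exp(i ∮A) (the factor i is absorbed into A in the notation of (1.6)). The function names, the uniform-field potential A = (B/2)(-y, x, 0) and the circular loop are illustrative choices. For this potential the loop integral should reproduce the magnetic flux B π r² through the circle.

```python
import cmath
import math


def loop_integral(A, alpha, n=2000):
    """Approximate the integral of A_a(alpha(s)) * d(alpha^a)/ds over s in [0, 2*pi].

    A     : function mapping a point (x, y, z) to the vector (A_x, A_y, A_z)
    alpha : function mapping s in [0, 2*pi] to a point, with alpha(0) == alpha(2*pi)
    The loop is cut into n chords and A is sampled at the midpoint parameter.
    """
    ds = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        start = alpha(k * ds)
        end = alpha((k + 1) * ds)
        a_mid = A(alpha((k + 0.5) * ds))
        total += sum(a * (q - p) for a, p, q in zip(a_mid, start, end))
    return total


def holonomy(A, alpha, n=2000):
    """U(1) holonomy exp(i * loop integral) for a real potential A (assumed convention)."""
    return cmath.exp(1j * loop_integral(A, alpha, n))


def uniform_field_potential(B):
    """Illustrative potential A = (B/2) * (-y, x, 0) of a uniform field B along z."""
    return lambda p: (-0.5 * B * p[1], 0.5 * B * p[0], 0.0)


def circle(radius):
    """Illustrative loop: a circle of the given radius in the z = 0 plane."""
    return lambda s: (radius * math.cos(s), radius * math.sin(s), 0.0)


if __name__ == "__main__":
    B, r = 0.5, 1.0
    # The two printed numbers should nearly agree (numerical flux vs. B * pi * r^2).
    print(loop_integral(uniform_field_potential(B), circle(r)), B * math.pi * r ** 2)
    print(holonomy(uniform_field_potential(B), circle(r)))
```

The holonomy depends on the loop as a whole and not on the value of A at any single point, which is the feature of Faraday's intuition stressed above.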
In LQG, the holonomy becomes a quantum operator that creates "loop states." In the loop representation formulation of Maxwell theory, for instance, a loop state |α⟩ is a state in which the electric field vanishes everywhere except along a single Faraday line α. More precisely, it is an eigenstate of the electric field with eigenvalue
$$\mathbf{E}_\alpha(\mathbf{x}) = \oint ds\, \frac{d\boldsymbol{\alpha}(s)}{ds}\, \delta^3(\mathbf{x}, \boldsymbol{\alpha}(s)), \tag{1.7}$$
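where s → α(s) is the Faraday line in space. This electric field vanishes everywhere except on the loop α itself, and at every point of α it is tangent to the loop; see Figure 1.1. Notice that the vector distribution field E(x) defined in (1.7) is divergenceless, that is, it satisfies Coulomb law
$$\operatorname{div} \mathbf{E}_\alpha(\mathbf{x}) = 0 \tag{1.8}$$
in the sense of distributions. In fact, for any smooth function f, we have
$$\begin{aligned}
[\operatorname{div} \mathbf{E}_\alpha](f) &= \int d^3x\, f(\mathbf{x})\, \operatorname{div} \mathbf{E}_\alpha(\mathbf{x}) \\
&= \int d^3x\, f(\mathbf{x})\, \frac{\partial}{\partial x^a} \oint ds\, \frac{d\alpha^a(s)}{ds}\, \delta^3(\mathbf{x}, \boldsymbol{\alpha}(s)) \\
&= -\oint ds\, \frac{d\alpha^a(s)}{ds}\, \frac{\partial}{\partial \alpha^a} f(\boldsymbol{\alpha}(s)) \\
&= -\oint_\alpha df = -\oint ds\, \frac{d}{ds} f(\boldsymbol{\alpha}(s)) = 0.
\end{aligned} \tag{1.9}$$
Fig. 1.1 A loop α and the distributional electric field configuration E_α (represented by the arrows).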
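The last step of (1.9) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original text: the curve `alpha`, the Gaussian test function `f` and all function names are arbitrary choices. It evaluates minus the integral of (dα^a/ds) ∂_a f(α(s)) by the midpoint rule. Over the full closed loop the result should vanish up to rounding; over only half of the curve (an open line with two endpoints) it should equal f at the start minus f at the end, which is generally nonzero. This is the sense in which only closed Faraday lines satisfy (1.8).

```python
import math


def alpha(s):
    """Illustrative smooth closed curve, with period 2*pi in s."""
    return (math.cos(s), math.sin(s), 0.3 * math.sin(2.0 * s))


def alpha_dot(s):
    """Tangent d(alpha)/ds of the curve above."""
    return (-math.sin(s), math.cos(s), 0.6 * math.cos(2.0 * s))


def f(p):
    """Illustrative smooth test function (a Gaussian bump)."""
    x, y, z = p
    return math.exp(-((x - 0.3) ** 2 + y ** 2 + (z + 0.1) ** 2))


def grad_f(p):
    """Analytic gradient of f."""
    x, y, z = p
    value = f(p)
    return (-2.0 * (x - 0.3) * value, -2.0 * y * value, -2.0 * (z + 0.1) * value)


def smeared_divergence(s_end=2.0 * math.pi, n=4000):
    """Minus the integral over s in [0, s_end] of alpha_dot . grad f (midpoint rule).

    For s_end = 2*pi this is the third line of (1.9) for the closed loop.
    """
    ds = s_end / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        total += sum(v * g for v, g in zip(alpha_dot(s), grad_f(alpha(s)))) * ds
    return -total


if __name__ == "__main__":
    print(smeared_divergence())          # closed loop: should be close to zero
    print(smeared_divergence(math.pi))   # open half-curve: generally nonzero
    print(f(alpha(0.0)) - f(alpha(math.pi)))
```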
Indeed, intuitively, Coulomb law requires precisely that an electric field at a point "continues" in the direction of the field itself, namely that it defines Faraday lines. The state |α⟩ is therefore a sort of minimal quantum excitation satisfying (1.8): it is an elementary quantum excitation of a single Faraday line.
The idea that a Yang–Mills theory is truly a theory of these loops has been around for as long as such theories have been studied. Mandelstam, Polyakov and Wilson, among many others, have long argued that loop excitations should play a major role in quantum Yang–Mills theories, and that we must get to understand quantum Yang–Mills theories in terms of these excitations. In fact, much of the development of string theory has been influenced by this idea.
In lattice Yang–Mills theory, namely in the approximation to Yang–Mills theory where spacetime is replaced by a fixed lattice, loop states have finite norm. In fact, certain finite linear combinations of loop states, called "spin network" states, form a well-defined and well-understood orthonormal basis in the Hilbert space of a lattice gauge theory.
However, in a QFT theory over a continuous background, the idea of formulating the theory in terms of loop-like excitations has never proved fruitful. The difficulty is essentially that loop states over a background are "too singular" and "too many." The quantum Maxwell state |α⟩ described above, for instance, has infinite norm, and an infinitesimal displacement of a loop state over the background spacetime produces a distinct independent loop state, yielding a continuum of loop states. Over a continuous background, the space spanned by the loop states is far "too big" for providing a basis of the (separable) Hilbert space of a QFT.
However, loop states are not too singular, nor too many, in a background-independent theory. This is the key technical point on which LQG relies. The intuitive reason is as follows. Spacetime itself is formed by loop-like states. Therefore the position of a loop state is relevant only with respect to other loops, and not with respect to the background. An infinitesimal (coordinate) displacement of a loop state does not produce a distinct quantum state, but only a gauge equivalent representation of the same physical state. Only a finite displacement, carrying the loop state across another loop, produces a physically different state. Therefore, the size of the space of the loop states is dramatically reduced by diffeomorphism invariance: most of it is just gauge. Equivalently, we can think that a single loop has an intrinsic Planck-size "thickness."
Therefore, in a general-relativistic context, the loop basis becomes viable. The state space of the theory, called K_diff, is a separable Hilbert space spanned by loop states. More precisely, as we shall see in Chapter 6, K_diff admits an orthonormal basis of spin network states, which are formed by finite linear combinations of loop states, and are defined precisely as the spin network states of a lattice Yang–Mills theory. This Hilbert space and the field operators that act on it are described in Chapter 6. They form the basis of the mathematical structure of LQG.
Therefore LQG is the result of the convergence of two lines of thinking, each characteristic of twentieth-century theoretical physics. On the one hand, the intuition of Faraday, Yang and Mills, Wilson, Mandelstam, Polyakov and others, that forces are described by lines. On the other hand, the Einstein–Wheeler–DeWitt intuition of background independence and background-independent quantum states. Truly remarkably, each of these two lines of thinking is the solution of the blocking difficulty of the other. On the one hand, the traditional nonviability of the loop basis in the continuum disappears because of background independence. On the other hand, the traditional difficulty of controlling diffeomorphism-invariant quantities comes under control thanks to the loop basis.
Even more remarkably, the spin network states generated by this happy marriage turn out to have a surprisingly compelling geometric interpretation, which I sketch below.
1.2.2 Quantum space: spin networks
Physical systems reveal themselves by interacting with other systems. These interactions may happen in "quanta": energy is exchanged with an oscillator of frequency ν in discrete packets, or quanta, of size E = hν. If the oscillator is in the nth energy eigenstate, we say that there are n quanta in it. If the oscillator is a mode of a free field, we say that there are n "particles" in the field. Therefore we can view the electromagnetic field as made up of its quanta, the photons. What are the quanta of the gravitational field? Or, since the gravitational field is the same entity as spacetime, what are the quanta of space?
The properties of the quanta of a system are determined by the spectral properties of the operators representing the quantities involved in our interaction with the system. The operator associated with the energy of the oscillator, for instance, has a discrete spectrum, and the number of quanta n labels its eigenvalues. The set of its eigenstates forms a basis in the state space of the quantum system: this fact allows us to view each state of the system as a quantum superposition of states |n⟩ formed by n quanta. To understand the quantum properties of space, we have therefore to consider the spectral problem of the operators associated with the quantities involved in our interaction with space itself. The most direct interaction we have with the gravitational field is via the geometric structure of the physical space. A measurement of length, area or volume is in fact, according to GR, a measurement of a local property of the gravitational field.
For instance, the volume V of a physical region R is
$$V = \int_R d^3x\, |\det e(x)|, \tag{1.10}$$
where e(x) is the (triad matrix representing the) gravitational field. In quantum gravity, e(x) is a field operator, and V is therefore an operator as well.
The volume V is a nonlinear function of the field e, and the definition of the volume operator implies products of local operator-valued distributions. This can be achieved as a limit, using an appropriate regularization procedure. The development of regularization procedures that remain meaningful in the absence of a background metric is a major technical tool on which LQG is based. Using these techniques, a well-defined self-adjoint operator V can be defined. The computation of its spectral properties is then one of the main results of LQG, and will be derived in Section 6.6.5.
The spectrum of V turns out to be discrete. Therefore the spacetime volume manifests itself in quanta, of definite volume size, given by the eigenvalues of the volume operator. These quanta of space can be intuitively thought of as quantized "grains" of space, or "atoms of space." The first intuitive picture of quantum space is therefore that of "grains of space." These have quantized amounts of volume, determined by the spectrum of the operator V.
Fig. 1.2 A simple spin network s, with nodes labeled i_1, i_2 and links labeled j_1 = 1, j_2 = 1/2, j_3 = 1/2.
The next element of the picture is the information on which grain is adjacent to which. Adjacency (being contiguous, being in touch, being nearby) is the basis of spatial relations. If two spacetime regions are adjacent, that is, if they touch each other, they are separated by a surface S. Let A be the area of the surface S. Area also is a function of the gravitational field, and is therefore represented by an operator, like volume. The spectral problem for this operator has been solved in LQG as well. It is discussed in detail in Section 6.6.2. This spectrum turns out also to be discrete. Intuitively, the grains of space are separated by "quanta of area." The principal series of the eigenvalues of the area, for instance, is labeled by multiplets of half-integers $j_i$, $i = 1, \ldots, n$, and turns out to be given by
$$A = 8\pi\gamma\, \hbar G \sum_i \sqrt{j_i(j_i + 1)}, \tag{1.11}$$
where γ, the Immirzi parameter, is a free dimensionless constant of the theory.
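As a small worked illustration of (1.11), the sketch below evaluates the principal-series area eigenvalue for a given multiplet of half-integer spins. It is an illustration under assumptions: the function name is hypothetical, the result is expressed in units of ħG (with c = 1), and `gamma=1.0` is only a placeholder, since the text leaves the Immirzi parameter free.

```python
import math
from fractions import Fraction


def area_eigenvalue(spins, gamma=1.0, hbar_G=1.0):
    """Principal-series area eigenvalue 8*pi*gamma*hbar*G * sum_i sqrt(j_i (j_i + 1)).

    spins  : iterable of positive half-integers j_i (Fraction, int, or exact floats)
    gamma  : Immirzi parameter (placeholder default, free in the theory)
    hbar_G : value of hbar*G in the chosen units (1.0 means "units of hbar*G")
    """
    total = 0.0
    for j in spins:
        j = Fraction(j)
        if j <= 0 or (2 * j).denominator != 1:
            raise ValueError(f"spin {j} is not a positive half-integer")
        total += math.sqrt(float(j * (j + 1)))
    return 8.0 * math.pi * gamma * hbar_G * total


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Smallest quantum in this series: a single link with j = 1/2.
    print(area_eigenvalue([half]))
    # A surface crossed by links with spins 1, 1/2, 1/2.
    print(area_eigenvalue([1, half, half]))
```

Because each term depends on a half-integer label, the allowed values form a discrete set: a single j = 1/2 link contributes 4π√3 γ ħG, and no smaller nonzero value occurs in this series.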
Consider a quantum state of space |s⟩ formed by N "grains" of space, some of which are adjacent to one another. Represent this state as an abstract graph Γ with N nodes. (By abstract graph I mean here an equivalence class under smooth deformations of graphs embedded in a 3-manifold.) The nodes of the graph represent the grains of space; the links of the graph link adjacent grains and represent the surfaces separating two adjacent grains. The quantum state is then characterized by the graph Γ and by labels on nodes and on links: the label $i_n$ on a node n is the quantum number of the volume and the label $j_l$ on a link l is the quantum number of the area.
Fig. 1.3 The graph of an abstract spinfoam and the ensemble of "chunks of space," or quanta of volume, it represents. Chunks are adjacent when the corresponding nodes are linked. Each link cuts one elementary surface separating two chunks.
A graph with these labels is called an (abstract) "spin network" $s = (\Gamma, i_n, j_l)$; see Figure 1.2. In Section 6.3.1, we will see that the quantum numbers $i_n$ and $j_l$ are determined by the representation theory of the local gauge group (SU(2)). More precisely, $j_l$ labels unitary irreducible representations, and $i_n$ labels a basis in the space of the intertwiners between the representations adjacent to the node n. The area of a surface cutting n links of the spin network with labels $j_i$, $i = 1, \ldots, n$, is then given by (1.11).
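The labeled graph $s = (\Gamma, i_n, j_l)$ can be written down as a small data structure. The following is a sketch under assumptions, not the book's construction: the class and function names are hypothetical, the example graph reads Figure 1.2 as two nodes joined by three links with spins 1, 1/2, 1/2, and the check uses the standard SU(2) fact that three representations admit an intertwiner only if their spins sum to an integer and satisfy the triangle inequality.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Link:
    """A link l of the graph, carrying a spin j_l (a half-integer)."""
    source: str
    target: str
    spin: Fraction


@dataclass(frozen=True)
class SpinNetwork:
    """Abstract spin network s = (Gamma, i_n, j_l): only combinatorial data."""
    links: tuple          # tuple of Link objects (defines the graph Gamma)
    intertwiners: dict    # node name -> intertwiner label i_n

    def spins_at(self, node):
        """Sorted spins of the links meeting at a node."""
        spins = []
        for link in self.links:
            # A link starting and ending on the same node is counted twice.
            spins.extend(link.spin for end in (link.source, link.target) if end == node)
        return sorted(spins)


def trivalent_intertwiner_exists(j1, j2, j3):
    """Standard SU(2) condition: integer sum and triangle inequality."""
    total = j1 + j2 + j3
    return total.denominator == 1 and 2 * max(j1, j2, j3) <= total


# Illustrative reading of Figure 1.2: two nodes joined by three links.
THETA = SpinNetwork(
    links=(
        Link("n1", "n2", Fraction(1)),
        Link("n1", "n2", Fraction(1, 2)),
        Link("n1", "n2", Fraction(1, 2)),
    ),
    intertwiners={"n1": "i1", "n2": "i2"},
)

if __name__ == "__main__":
    for node in THETA.intertwiners:
        spins = THETA.spins_at(node)
        print(node, spins, trivalent_intertwiner_exists(*spins))
```

Nothing in the structure records coordinates or shapes; only which nodes are linked and how they are labeled, in line with the notion of abstract graph used in the text.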
As shown in Section 6.3.1, the (kinematical) Hilbert space K_diff admits a basis labeled precisely by these spin networks. This is a basis of states in which certain area and volume operators are diagonal. Its physical interpretation is the one sketched in Figure 1.3: a spin network state |s⟩ describes a quantized three-geometry.
A loop state |α⟩ is a spin network state in which the graph Γ has no nodes, namely is a single loop α, and is labeled by the fundamental representation of the group. In such a state, the gravitational field has support just on the loop α itself, as the electric field in (1.7).
In LQG, physical space is a quantum superposition of spin networks, in the same sense that the electromagnetic field is a quantum superposition of n-photon states. The first and basic prediction of the (free) QFT of the electromagnetic field is the existence of the photons, and the specific quantitative prediction of the energy and the momentum of the photons of a given frequency. Similarly, the first prediction of LQG is the existence of the quanta of area and volume, and the quantitative prediction of their spectrum.
The theory predicts that any sufficiently accurate measurement of area or volume would measure one of these spectral values. So far, verifying this prediction appears to be outside our technological capacities.
Where is a spin network? A spin network state does not have a position. It is an abstract graph, not a graph immersed in a spacetime manifold. Only abstract combinatorial relations defining the graph are significant, not its shape or its position in space.
In fact, a spin network state is not in space: it is space. It is not localized with respect to something else: something else (matter, particles, other fields) might be localized with respect to it. To ask "where is a spin network" is like asking "where is a solution of the Einstein equations." A solution of the Einstein equations is not "somewhere": it is the "where" with respect to which anything else can be localized. In the same way, the other dynamical objects, such as Yang–Mills and fermion fields, live on the spin network state.
This is a consequence of diffeomorphism invariance. Technically, spin network states are first defined as graphs embedded in a three-dimensional manifold; then the implementation of the diffeomorphism gauge identifies two graphs that can be deformed into each other. They are gauge equivalent. This is like identifying two solutions of the Einstein equations that are related by a change of coordinates. Spin networks embedded in a manifold are denoted S and called "embedded spin networks"; equivalence classes of these under diffeomorphisms are denoted s and are called "abstract spin networks," or s-knots. A quantum state of space is determined by an s-knot.⁷
The fact that spin networks do not live in space, but rather are space, has far-reaching consequences. Space itself turns out to have a discrete and combinatorial character. Notice that this is not imposed on the theory, or assumed. It is the result of a completely conventional quantum mechanical calculation of the spectrum of the physical quantities that describe the geometry of space. Since there is no spatial continuity at short scale, there is (literally) no room in the theory for ultraviolet divergences. The theory effectively cuts itself off at the Planck scale. Space is effectively granular at the Planck scale, and there is no infinite ultraviolet limit.
Chapter 7 describes how Yang–Mills and fermion fields can be coupled to the theory. This can be obtained by enriching the structure of the spin networks s. In the case of a Yang–Mills theory with gauge group G, for instance, links carry an additional quantum number, labeling irreducible representations of G. The spin network itself behaves like the lattice of lattice Yang–Mills theory. In quantum gravity, therefore, the lattice itself becomes a dynamical variable. But notice a crucial difference with respect to conventional lattice Yang–Mills theory: the lattice size is not to be scaled down to zero; it has physical Planck size.
In summary, spin networks provide a mathematically well-defined and physically compelling description of the kinematics of the quantum gravitational field. They also provide a well-defined picture of the small-scale structure of space. It is remarkable that this novel picture of space emerges simply from the combination of old Yang–Mills theory ideas with general-relativistic background independence.
⁷ The expression "spin network" is used in the literature to designate both the embedded and the abstract ones, as well as to designate the quantum states they label.
1.2.3 Dynamics in background-independent QFT
The dynamics of the quantum gravitational field can be described giving amplitudes W(s) for spin network states. Let me illustrate here, in a heuristic manner, the physical interpretation of these amplitudes and the way they are defined in the theory. A major feature of this book is that it is based on a general-relativistic way of thinking about observables and evolution. This section sketches this view, and may be somewhat harder to follow than the previous ones.
Interpretation of the amplitude W(s). The quantum dynamics of a particle is entirely described by the transition probability amplitudes
$$W(x, t, x', t') = \langle x |\, e^{-iH_0(t - t')} \,| x' \rangle = \langle x, t | x', t' \rangle, \tag{1.12}$$
where |x, t⟩ is the eigenstate of the Heisenberg position operator x(t) with eigenvalue x, H_0 is the hamiltonian operator and |x⟩ = |x, 0⟩. The propagator W(x, t, x′, t′) depends on two events (x, t) and (x′, t′) that bound a finite portion of a classical trajectory. The space of the pairs of events (x, t, x′, t′) is called G in this book.
A physical experiment consists of a preparation at time t′ and a measurement at time t. Say that in a particular experiment we have localized the particle in x′ at t′ and then found it in x at time t. The set (x, t, x′, t′) represents the complete set of data of a specific complete observational set up, including preparation and measurement. The space G is the space of these data sets. In the quantum theory, we associate the complex amplitude W(x, t, x′, t′), which is a function on G, with any such data set. As emphasized by Feynman, this amplitude codes the full quantum dynamics. Following Feynman, we can compute W(x, t, x′, t′) with a sum-over-paths that take the values x and x′ at t and t′, respectively.
If we measure a different observable than position, we obtain states different from the states |x⟩. Let |ψ_in⟩ be the state prepared at time t′ and let |ψ_out⟩ be the state measured at time t. The amplitude associated to these measurements is
$$A = \langle \psi_{\text{out}} |\, e^{-iH_0(t - t')} \,| \psi_{\text{in}} \rangle. \tag{1.13}$$
The pair of states (ψ_in, ψ_out) determines a state ψ = |ψ_in⟩ ⊗ ⟨ψ_out| in the space $K_{tt'}$, which is the tensor product of the Hilbert space of the initial states and (the dual of) the Hilbert space of the final states. The propagator defines a (possibly generalized) state |0⟩ in $K_{tt'}$ by ⟨0|(|x′⟩ ⊗ ⟨x|) = W(x, t, x′, t′). The amplitude (1.13) can be written simply as
$$A = \langle 0 | \psi \rangle. \tag{1.14}$$
Therefore we can express the dynamics from t′ to t in terms of a single state |0⟩ in a Hilbert space $K_{tt'}$ that represents outcomes of measurements on t′ and t. The state |0⟩ is called the covariant vacuum, and should not be confused with the state of minimal energy.
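The passage from (1.13) to (1.14) can be made concrete in a finite-dimensional toy model. The sketch below is an illustration under assumptions, not the book's construction: it replaces the particle by a two-level system with hamiltonian σ_x, stores the propagator as a matrix `W[x][x']`, and pairs it with the components of ψ = |ψ_in⟩ ⊗ ⟨ψ_out|. All function names are hypothetical, and the matrix exponential is a plain truncated Taylor series, adequate only for small matrices.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def mat_exp(m, terms=40):
    """Matrix exponential by truncated Taylor series (small matrices only)."""
    n = len(m)
    result = [[complex(r == c) for c in range(n)] for r in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, m)]
        result = [[result[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return result


def propagator(H0, dt):
    """W[x][x'] = <x| exp(-i H0 dt) |x'>, a finite analog of (1.12)."""
    return mat_exp([[-1j * dt * h for h in row] for row in H0])


def boundary_state(psi_in, psi_out):
    """Components psi[x'][x] of the state |psi_in> (x) <psi_out|."""
    return [[a * b.conjugate() for b in psi_out] for a in psi_in]


def amplitude(W, psi):
    """A = <0|psi>: the sum over x, x' of W[x][x'] * psi[x'][x], as in (1.14)."""
    n = len(W)
    return sum(W[x][xp] * psi[xp][x] for x in range(n) for xp in range(n))


SIGMA_X = [[0, 1], [1, 0]]   # toy hamiltonian H0 (an assumption of this sketch)

if __name__ == "__main__":
    dt = 0.7
    W = propagator(SIGMA_X, dt)
    psi = boundary_state(psi_in=[0, 1], psi_out=[1, 0])
    # For H0 = sigma_x the exact value is -i * sin(dt); the two should agree.
    print(amplitude(W, psi), -1j * math.sin(dt))
```

In this form the dynamics enters only through the single object `W`, which plays the role of the covariant vacuum ⟨0|; the preparation and the measurement enter on the same footing through `psi`.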
Let us extend this idea to field theory. In field theory, the analog of the data set (x, t, x′, t′) is the couple [Σ, ϕ], where Σ is a 3d surface bounding a finite spacetime region, and ϕ is a field configuration on Σ. These data define a set of events (x ∈ Σ, ϕ(x)) that bound a finite portion of a classical configuration of the field, just as (x, t, x′, t′) bound a finite portion of the classical trajectory of the particle. The data from a local experiment (measurements, preparation, or just assumptions) must in fact refer to the state of the system on the entire boundary of a finite spacetime region. The field theoretical space G is therefore the space of surfaces Σ and field configurations ϕ on Σ. Quantum dynamics can be expressed in terms of an amplitude W[Σ, ϕ]. Following Feynman's intuition, we can formally define W[Σ, ϕ] in terms of a sum over bulk field configurations that take the value ϕ on the boundary Σ. In fact, in Section 5.3, I argue that the functional W[Σ, ϕ] captures the dynamics of a QFT.
Notice that the dependence of W[Σ, ϕ] on the geometry of Σ codes the spacetime position of the measuring apparatus. In fact, the relative position of the components of the apparatus is determined by their physical distance and the physical time lapsed between measurements, and these data are contained in the metric of Σ.
Consider now a background-independent theory. Diffeomorphism invariance implies immediately that W[Σ, ϕ] is independent of Σ. This is the analog of the independence of W(x, y) from x and y mentioned in Section 1.1.4. Therefore, in gravity, W depends only on the boundary value of the fields. However, the fields include the gravitational field, and the gravitational field determines the spacetime geometry. Therefore the dependence of W on the fields is still sufficient to code the relative distance and time separation of the components of the measuring apparatus.
What is happening is that in background-dependent QFT we have two kinds of measurements: those that determine the distances of the parts of the apparatus and the time lapsed between measurements, and the actual measurements of the fields' dynamical variables. In quantum gravity, instead, distances and time separations are on an equal footing with the dynamical fields. This is the core of the general-relativistic revolution, and the key for background-independent QFT.
We need one final step. Notice from (1.12) that the argument of W is not the classical quantity, but rather the eigenstate of the corresponding operator. The eigenstates of the gravitational field are spin networks. Therefore, in quantum gravity the argument of W must be a spin network, representing the possible outcome of a measurement of the gravitational field (or the geometry) on a closed 3d surface. Thus, in quantum gravity physical amplitudes must be expressed by amplitudes of the form W(s). These give the correlation probability amplitude associated with the outcome s in a measurement of a geometry, just as W(x, t, x′, t′) does for a particle.
A particularly interesting case is when we can separate the boundary surface in two components; then s = s_out ∪ s_in. In this case, W(s_out, s_in) can be interpreted as the probability amplitude of measuring the quantum three-geometry s_out if s_in was observed.
Notice that a spin network s_in is the analog of (x, t), not just x alone. The time variable is mixed up with the physical variables (Chapter 3). The notion of unitary quantum evolution in time is ill defined in this context, but probability amplitudes remain well defined and physically meaningful (Chapter 5). The quantum dynamical information of the theory is entirely contained in the spin network amplitudes W(s). Given a configuration of space and matter, these amplitudes determine a correlation probability of observing it.
Calculation of the amplitude W(s). In the relativistic formulation of classical hamiltonian theory, dynamics is governed by the relativistic hamiltonian H.⁸ This is discussed in detail in Chapter 3. The quantum dynamics is governed by the corresponding quantum operator H. In quantum gravity, H is defined on the space of the spin networks. There is no external time variable t in the theory, and the quantum dynamical equation which replaces the Schrödinger equation is the equation HΨ = 0, called the Wheeler–DeWitt equation. The space of the solutions of the Wheeler–DeWitt equation is denoted $\mathcal{H}$. There is an operator $P: K_{\text{diff}} \to \mathcal{H}$ that projects K_diff on the solutions of the Wheeler–DeWitt equation (for a mathematically more precise statement, see Section 5.2).
The transition amplitudes W(s, s′) are the matrix elements of the operator P. They define the physical scalar product, namely the scalar product of the space $\mathcal{H}$:
$$W(s, s') = \langle s | P | s' \rangle_{K_{\text{diff}}} = \langle s | s' \rangle_{\mathcal{H}}. \tag{1.15}$$
⁸ H is sometimes called the "hamiltonian constraint" or the "superhamiltonian."
Fig. 1.4 Scheme of the action of H on a node of a spin network.
Thus, the transition amplitude between two states is simply their physical scalar product (Chapter 5). More generally, there is a preferred state |∅⟩ in K_diff which is formed by no spin networks. It represents a space with zero volume or, more precisely, no space at all. The covariant vacuum state, which defines the dynamics of the theory, is defined by |0⟩ = P|∅⟩. The amplitude of a spin network is defined by
$$W(s) = \langle 0 | s \rangle = \langle \emptyset | P | s \rangle. \tag{1.16}$$
The construction of the operator H is a major task in LQG. It is delicate, and it requires a nontrivial regularization procedure in order to deal with operator products. Chapter 7 is devoted to this construction. Remarkably, the limit in which the regularization is removed exists precisely thanks to diffeomorphism invariance (Section 7.1). This is a second major payoff of background independence. At present, more than one version of the operator H has been constructed, and it is not yet clear which variant (if any) is correct. The remarks that follow refer to all of them.
The most remarkable aspect of the hamiltonian operator H is that it acts only on the nodes. A state labeled by a spin network without nodes (that is, one in which the graph Γ is simply a collection of nonintersecting loops) is a solution of the Wheeler-DeWitt equation. In fact, the unexpected discovery that exact solutions of the Wheeler-DeWitt equation could be found at all was the first major surprise that raised interest in LQG in the late 1980s.
Acting on a generic state $|s\rangle$, the action of the operator H turns out to be discrete and combinatorial: the topology of the graph is changed and the labels are modified in the vicinity of a node. A typical example of the action of H on a node is illustrated in Figure 1.4: the action on a node splits the node into three nodes and multiplies the state by a number a (that depends on the labels of the spin network around the node). Labels of links and nodes are not indicated in the figure.
Notice the various manners in which the spin network basis is effective in quantum gravity. The states in the spin network basis:
(i) diagonalize area and volume;
(ii) control diff-invariance: diffeomorphism equivalence classes of states are labeled by the s-knots;
(iii) simplify the action of H, reducing it to a combinatorial action on the nodes.
The construction of the hamiltonian operator H completes the definition of the general formalism of LQG in the case of pure gravity. This is extended to matter couplings in Chapter 7. In Chapter 8, I describe some of the most interesting applications of the theory. In particular, I illustrate the application of LQG to cosmology (control of the classical initial singularity, inflation) and to black-hole physics (entropy, emitted spectrum). I also mention some of its tentative applications in astrophysics.
1.2.4 Quantum spacetime: spinfoam
To be able to compute all the predictions of a theory, it is not sufficient to have the general definition of the theory. A road towards the calculation of transition amplitudes in quantum gravity is provided by the spinfoam formalism.
Following Feynman's ideas, we can give $W(s, s')$ a representation as a sum-over-paths. This representation can be obtained in various manners. In particular, it can be intuitively derived from a perturbative expansion, summing over different histories of sequences of actions of H that send $s'$ into $s$.
A path is then the "world-history" of a graph, with interactions happening at the nodes. This world-history is a two-complex, as in Figure 1.5, namely a collection of faces (the world-histories of the links); faces join at edges (the world-histories of the nodes); in turn, edges join at vertices. A vertex represents an individual action of H. An example of a vertex, corresponding to the action of H of Figure 1.4, is illustrated in Figure 1.6. Notice that on moving from the bottom to the top, a section of the two-complex goes precisely from the graph on the left-hand side of Figure 1.4 to the one on the right-hand side. Thus, a two-complex is like a Feynman graph, but with one additional structure. A Feynman graph is composed of vertices and edges, a spinfoam of vertices, edges and faces.
Faces are labeled by the area quantum numbers $j_l$ and edges by the volume quantum numbers $i_n$. A two-complex with faces and edges labeled in this manner is called a "spinfoam" and denoted σ. Thus, a spinfoam is a Feynman graph of spin networks, or a world-history of spin networks. A history going from $s'$ to $s$ is a spinfoam σ bounded by $s'$ and $s$.
In the perturbative expansion of $W(s, s')$, there is a term associated with each spinfoam σ bounded by $s$ and $s'$. This term is the amplitude of σ. The amplitude of a spinfoam turns out to be given by (a measure term $\mu(\sigma)$ times) the product over the vertices v of a vertex amplitude $A_v(\sigma)$. The vertex amplitude is determined by the matrix element of H between the incoming and the outgoing spin networks, and is a function of the labels of the faces and the edges adjacent to the vertex. This is analogous to the amplitude of a conventional Feynman vertex, which is determined by the matrix element of the hamiltonian between the incoming and outgoing states.
Fig. 1.5 A spinfoam representing the evolution of an initial spin network $s_i$ to a final spin network $s_f$, via an intermediate spin network $s_1$. Here $v_1$ and $v_2$ are the interaction vertices.
Fig. 1.6 The vertex of a spinfoam.
The physical transition amplitudes $W(s, s')$ are then obtained by summing over spinfoams bounded by the spin networks $s$ and $s'$:
$$ W(s, s') \sim \sum_{\sigma,\ \partial\sigma = s \cup s'} \mu(\sigma) \prod_v A_v(\sigma) \tag{1.17} $$
More generally, for a spin network $s$ representing a closed surface,
$$ W(s) \sim \sum_{\sigma,\ \partial\sigma = s} \mu(\sigma) \prod_v A_v(\sigma) \tag{1.18} $$
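The structure of the spinfoam sum above (a measure factor times a product of vertex amplitudes, summed over histories) can be made concrete with a toy computation. The sketch below is only an illustration of the bookkeeping: the three "spinfoams", their measure factors and their vertex amplitudes are invented numbers, not values derived from any model, and the names `toy_spinfoams`, `spinfoam_amplitude` and `transition_amplitude` are hypothetical.

```python
from math import prod

# Hypothetical toy data: each "spinfoam" is reduced to a measure factor mu
# and the list of its vertex amplitudes A_v. In the real theory these depend
# on the face and edge labels; the numbers here are invented for illustration.
toy_spinfoams = [
    {"mu": 1.0, "vertex_amplitudes": []},                   # history with no vertices
    {"mu": 0.5, "vertex_amplitudes": [0.2 + 0.1j]},         # one action of H
    {"mu": 0.25, "vertex_amplitudes": [0.2 + 0.1j, 0.3]},   # two actions of H
]


def spinfoam_amplitude(foam):
    """Return mu(sigma) times the product over vertices of A_v(sigma)."""
    return foam["mu"] * prod(foam["vertex_amplitudes"], start=1.0)


def transition_amplitude(foams):
    """Truncated sum over the supplied spinfoams (all assumed to share the same boundary)."""
    return sum(spinfoam_amplitude(f) for f in foams)


W = transition_amplitude(toy_spinfoams)
print(W)
```

The empty product for the vertex-free history is 1, so that term contributes only its measure factor, in the same way that the zeroth-order term of a Feynman expansion contains no interaction vertices.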
In general, the Feynman path integral can be derived from Schrödinger theory by exponentiating the hamiltonian operator, but it can also be directly interpreted as a sum over classical trajectories of the particle. Similarly, the spinfoam sum (1.17) can be interpreted as a sum over spacetimes. That is, the sum (1.17) can be seen as a concrete and mathematically well-defined realization of the (ill-defined) Wheeler-Misner-Hawking representation of quantum gravity as a sum over four-geometries:
$$ W({}^3g, {}^3g') \sim \int_{\partial g = {}^3g \,\cup\, {}^3g'} [Dg]\; e^{iS[g]} \tag{1.19} $$
Because of their foamy structure at the Planck scale, spinfoams can be viewed as a mathematically precise realization of Wheeler's intuition of a spacetime "foam". In Chapter 9, I describe various concrete realizations of (1.17), as well as the possibility of directly relating (1.17) with a discretization of (1.19).
1.3 Conceptual issues
The search for a quantum theory of gravity raises questions such as: What is space? What is time? What is the meaning of "being somewhere"? What is the meaning of "moving"? Is motion to be defined with respect to objects or with respect to space? Can we formulate physics without referring to time or to spacetime? And also: What is matter? What is causality? What is the role of the observer in physics?
Questions of this kind have played a central role in periods of major advances in physics. For instance, they played a central role for Einstein, Heisenberg, Bohr and their colleagues, but also for Descartes, Galileo, Newton and their contemporaries, and for Faraday, Maxwell and their colleagues. Today, this manner of posing problems is often regarded as "too philosophical" by many physicists.
Indeed, most physicists of the second half of the twentieth century have viewed questions of this nature as irrelevant. This view was appropriate for the problems they were facing: one does not need to worry about first principles in order to apply the Schrödinger equation to the helium atom, to understand how a neutron star holds together, or to find out the symmetry group governing the strong interactions. During this period, physicists lost interest in general issues. As someone has said of this period: "do not ask what the theory can do for you; ask what you can do for the theory". That is, do not ask foundational questions; just keep developing and adjusting the theory you happen to find in front of you. When the basics are clear and the issue is problem-solving within a given conceptual scheme, there is no reason to worry about foundations: the problems are technical and the pragmatic approach is the most effective one.
Today, the kind of difficulties that we face has changed. To understand quantum spacetime, we have to return once more to those foundational issues. We have to find new answers to the old foundational questions. The new answers have to take into account what we have learned with QM and GR. This conceptual approach is not the one of Weinberg and Gell-Mann, but it is the one of Newton, Maxwell, Einstein, Bohr, Heisenberg, Faraday, Boltzmann and many others. It is clear from the writings of the latter that they discovered what they did discover by thinking about general foundational questions. The problem of quantum gravity will not be solved unless we reconsider these questions.
Several of these questions are discussed in the text. Here, I only comment on one of these conceptual issues: the role of the notion of time.
1.3.1 Physics without time
The transition amplitudes $W(s, s')$ do not depend explicitly on time. This is to be expected, because the physical predictions of classical GR do not depend explicitly on the time coordinate t either. The theory predicts correlations between physical variables, not the way physical variables evolve with respect to a preferred time variable. But what is the meaning of a physical theory in which the time variable t does not appear?
Let me tell a story. It was Galileo Galilei who first realized that the physical motion of objects on Earth could be described by mathematical laws expressing the evolution of observable quantities A, B, C, ... in time. That is, laws for the functions A(t), B(t), C(t), ... A crucial contribution by Galileo was to find an effective way to measure the time variable t, and therefore provide an operational meaning to these functions. In fact, Galileo gave a decisive contribution to the discovery of the modern clock by realizing, as a young man, that the small oscillations of a pendulum "take equal time". The story goes that Galileo was staring at the slow oscillations of the big chandelier that can still be seen in the marvelous Cathedral of Pisa (footnote 9). He checked the period of the oscillations against his pulse and realized that the same number of pulses lapsed during any oscillation of the chandelier. This was the key insight, the basis of the modern clock: today, virtually every clock contains an oscillator. Later in life, Galileo used a clock to discover the first quantitative terrestrial physical law, in his historic experiments on descent down inclines.
Now, the puzzling part of the story is that while Galileo checked the pendulum against his pulse, not long afterwards doctors were checking their patients' pulse against a pendulum. What is the actual meaning of the pendulum periods taking "equal time"? An equal amount of t lapses in any oscillation: how do we know this, if we can access t only via another pendulum?
It was Newton who cleared up the issue conceptually. Newton assumes that an unobservable quantity t exists, which flows ("absolute and equal to itself"). We write equations of motion in terms of this t, but we cannot truly access t: we can build clocks that give readings $T_1(t), T_2(t), \ldots$ that, according to our equations, approximate t with the precision we want. What we actually measure is the evolution of other variables against clocks, namely $A(T_1), B(T_1), \ldots$ Furthermore, we can check clocks against one another by measuring the functions $T_1(T_2), T_2(T_3), \ldots$ The fact that all these observations agree with what we compute using evolution equations in t gives us confidence in the method. In particular, it gives us confidence that to assume the existence of the unobservable physical quantity t is a useful and reasonable thing to do.
Simply, the usefulness of this assumption is lost in quantum gravity. The theory allows us to calculate the relations between observable quantities, such as $A(B), B(C), A(T_1), T_1(A), \ldots$, which is what we see. But it does not give us the evolution of these observable quantities in terms of an unobservable t, as Newton's theory and special relativity do. In a sense, this simply means that there are no good clocks at the Planck scale.
Of course, in a specific problem we can choose one variable, decide to treat it as the independent variable, and call it "the" time. For instance, a certain clock time, a certain proper time along a certain particle history, etc. The choice is largely arbitrary, and generally it is only locally meaningful. A generally covariant theory does not choose a preferred time variable.
Here are two examples to illustrate this arbitrariness.
- Imagine we throw a precise clock upward and compare its lapsed reading $t_f$ when it lands back with the lapsed reading $t_e$ of a clock remaining on the Earth. GR predicts that the two clocks read differently and provides a quantitative relation between $t_f$ and $t_e$. Is this about the observable $t_f$ evolving in the physical time $t_e$, or about the observable $t_e$ evolving in the physical time $t_f$? (footnote 10)
- The cosmological context is often indicated as one in which a natural choice of time is available: the cosmological time $t_c$ is the proper time from the Big Bang along the galaxies' worldlines. But an event A happening on Andromeda at the same $t_c$ as ours happens much later than an event B on Andromeda simultaneous to us in the sense of Einstein's definition of simultaneity (footnote 11). So, what is happening "right now" on Andromeda: A or B? Furthermore, the real world is not truly homogeneous: when two galaxies having two different ages relative to the Big Bang, or two different masses, merge, which of the two has the right time?
Footnote 9: Nice story. Too bad the chandelier was hung there a few decades after Galileo's discovery.
So long as we remain within classical general relativity, a given gravitational field has the structure of a pseudo-riemannian manifold. Therefore, the dynamics of the theory has no preferred time variable, but we nevertheless have a notion of spacetime for each given solution. But in quantum theory there are no classical field configurations, just as there are no trajectories of a particle. Thus, in quantum gravity the notion of spacetime disappears in the same manner in which the notion of trajectory disappears in the quantum theory of a particle. A single spinfoam can be thought of as representing a spacetime, but the history of the world is not a single spinfoam: it is a sum over spinfoams.
The theory is conceptually well defined without making use of the notion of time. It provides probabilistic predictions for correlations between the physical quantities that we can observe. In principle, we can check these predictions against experiments (footnote 12). Furthermore, the theory provides a clear and intelligible picture of the quantum gravitational field, namely of a "quantum geometry".
Thus, there is no background "spacetime" forming the stage on which things move. There is no "time" along which everything flows. The world in which we happen to live can be understood without using the notion of time.
Bibliographical notes
The fact that perturbative quantum general relativity is nonrenormalizable has been long believed, but was proven only in 1986 by Goroff and Sagnotti [29].
Footnote 10: If you are tempted to say that the lapsed reading $t_e$ of the clock remaining on Earth gives the "true time", recall that the pseudo-riemannian distance between the two events at which the clocks meet is $t_f$, not $t_e$: it is the clock going up and down that follows a geodesic.
Footnote 11: Thanks to Marc Lachieze-Rey for this observation.
Footnote 12: The special properties of a time variable may emerge only macroscopically. This is discussed in Sections 3.4 and 5.5.1.
For an orientation on current research on quantum gravity, see for instance the review papers [30-33]. An interesting panorama of points of view on the problem is in the various contributions to the book [34]. I have given a critical discussion on the present state of spacetime physics in [35-37]. A historical account of the development of quantum gravity is given in Appendix B.
As a general introduction to quantum gravity, a subject where nothing yet is certain, the student eager to learn is strongly advised to study also the classic reviews, which are rich in ideas and present different points of view, such as John Wheeler 1967 [38], Steven Weinberg 1979 [39], Stephen Hawking 1979 and 1980 [40, 41], Karel Kuchar 1980 [42], and Chris Isham's magisterial syntheses [43-45]. On string theory, classic textbooks are Green, Schwarz and Witten, and Polchinski [46]. For a discussion of the difficulties of string theory and a comparison of the results of strings and loops, see [47], written in the form of a dialog, and [48]. For a fascinating presentation of Alain Connes' vision, see [49]. Lee Smolin's popular-science book [50] provides a readable and enjoyable introduction to LQG.
LQG has inspired novels and short stories. Blue Mars, by Kim Stanley Robinson [51], contains a description of the future evolution and merging of loop gravity and strings. I recommend the science fiction novel Schild's Ladder, by Greg Egan [52], which opens with one of the clearest presentations of the picture of space given by loop gravity (Greg is a talented writer and also a scientist who is contributing to the development of LQG), and, for those who can read Italian, Anna prende il volo, by Enrico Palandri [53], a charming novel with a gentle meditation on the meaning of the disappearance of time. Literature has the capacity of delicately merging the novel, hard views that science develops into the common discourse of our civilization.
2 General Relativity
Lev Landau has called GR "the most beautiful" of the scientific theories. The theory is first of all a description of the gravitational force. Nowadays it is very extensively supported by terrestrial and astronomical observations, and so far it has never been questioned by an empirical observation.
But GR is far more than that. It is a complete modification of our understanding of the basic grammar of nature. This modification does not apply solely to gravitational interaction: it applies to all aspects of physics. In fact, the extent to which Einstein's discovery of this theory has modified our understanding of the physical world, and the full reach of its consequences, have not yet been completely unraveled.
This chapter is not an introduction to GR, nor an exhaustive description of the theory. For this, I refer the reader to the classic textbooks on the subject. Here, I give a short presentation of the formalism in a compact and modern form, emphasizing the reading of the theory which is most useful for quantum gravity. I also discuss in detail the physical and conceptual basis of the theory, and the way it has modified our understanding of the physical world.
2.1 Formalism
2.1.1 Gravitational field
Let M be the "spacetime" four-dimensional manifold. Coordinates on M are written as $x$, where $x = (x^\mu) = (x^0, x^1, x^2, x^3)$. Indices $\mu, \nu, \ldots = 0, 1, 2, 3$ are spacetime tangent indices.
• The gravitational field e is a one-form
$$ e^I(x) = e^I_\mu(x)\, dx^\mu \tag{2.1} $$
with values in Minkowski space. Indices $I, J, \ldots = 0, 1, 2, 3$ label the components of a Minkowski vector. They are raised and lowered with the Minkowski metric $\eta_{IJ}$.
I call "gravitational field" the tetrad field, rather than Einstein's metric field $g_{\mu\nu}(x)$. There are three reasons for this: (i) the standard model cannot be written in terms of g, because fermions require the tetrad formalism; (ii) the tetrad field e is nowadays more utilized than g in quantum gravity; and (iii) I think that e represents the gravitational field in a more conceptually clean way than g (see Section 2.2.3). The relation with the metric formalism is given in Section 2.1.5.
• The spin connection ω is a one-form with values in the Lie algebra of the Lorentz group so(3, 1),
$$ \omega^I{}_J(x) = \omega^I{}_{\mu J}(x)\, dx^\mu \tag{2.2} $$
where $\omega^{IJ} = -\omega^{JI}$. It defines a covariant partial derivative $D_\mu$ on all fields that have Lorentz (I) indices,
$$ D_\mu v^I = \partial_\mu v^I + \omega^I{}_{\mu J}\, v^J \tag{2.3} $$
and a gauge-covariant exterior derivative D on forms. For instance, for a one-form $u^I$ with a Lorentz index,
$$ Du^I = du^I + \omega^I{}_J \wedge u^J \tag{2.4} $$
The torsion two-form is defined as
$$ T^I = De^I = de^I + \omega^I{}_J \wedge e^J \tag{2.5} $$
A tetrad field e determines uniquely a torsion-free spin connection $\omega = \omega[e]$, called compatible with e, by
$$ T^I = de^I + \omega[e]^I{}_J \wedge e^J = 0 \tag{2.6} $$
The explicit solution of this equation is given below in (2.91) or (2.92).
• The curvature R of ω is the Lorentz algebra valued two-form (footnote 1)
$$ R^I{}_J = R^I{}_{J\mu\nu}\, dx^\mu \wedge dx^\nu \tag{2.7} $$
defined by (footnote 2)
$$ R^I{}_J = d\omega^I{}_J + \omega^I{}_K \wedge \omega^K{}_J \tag{2.8} $$
Footnote 1: Generally, I write spacetime indices μν before internal Lorentz indices IJ. But for the curvature, I prefer to stay closer to Riemann's notation.
Footnote 2: Sometimes the curvature of a connection $\omega^I{}_J$ is written as $R^I{}_J = D\omega^I{}_J$. If we naively use the definition (2.4) for D, we get an extra 2 in the quadratic term. The point is that the indices on the connection are not vector indices. That is, (2.4) defines the action of D on sections of a vector bundle, and a connection is not a section of a vector bundle.
We have then immediately, from (2.4),
$$ D^2 u^I = R^I{}_J \wedge u^J \tag{2.9} $$
and, from this equation and (2.6),
$$ R^I{}_J \wedge e^J = 0 \tag{2.10} $$
A region where the curvature is zero is called "flat". Equations (2.5) and (2.8) are called the Cartan structure equations.
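The word "immediately" hides a short computation, which can be spelled out as follows. The only assumption added here is that D acts on a two-form $B^I$ carrying a Lorentz index by the same rule as on a one-form, $DB^I = dB^I + \omega^I{}_J \wedge B^J$.

```latex
\begin{aligned}
D^2 u^I &= d(Du^I) + \omega^I{}_J \wedge Du^J \\
        &= d\left(du^I + \omega^I{}_J \wedge u^J\right)
           + \omega^I{}_J \wedge \left(du^J + \omega^J{}_K \wedge u^K\right) \\
        &= d\omega^I{}_J \wedge u^J - \omega^I{}_J \wedge du^J
           + \omega^I{}_J \wedge du^J + \omega^I{}_K \wedge \omega^K{}_J \wedge u^J \\
        &= \left(d\omega^I{}_J + \omega^I{}_K \wedge \omega^K{}_J\right) \wedge u^J
         = R^I{}_J \wedge u^J .
\end{aligned}
```

The minus sign in the third line comes from moving d past the one-form ω. Applying the same identity to $u^I = e^I$ gives $D^2 e^I = DT^I = R^I{}_J \wedge e^J$, so the vanishing of the torsion implies $R^I{}_J \wedge e^J = 0$.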
• The Einstein equations "in vacuum" are
$$ \epsilon_{IJKL}\left(e^I \wedge R^{JK} - \tfrac{2}{3}\lambda\, e^I \wedge e^J \wedge e^K\right) = 0 \tag{2.11} $$
The equation (2.6) relating e and ω and the Einstein equations (2.11) are the field equations of GR in the absence of other fields. They are the Euler-Lagrange equations of the action
$$ S[e, \omega] = \frac{1}{16\pi G} \int \epsilon_{IJKL}\left(\tfrac{1}{4}\, e^I \wedge e^J \wedge R[\omega]^{KL} - \tfrac{1}{12}\lambda\, e^I \wedge e^J \wedge e^K \wedge e^L\right) \tag{2.12} $$
where G is the Newton constant (footnote 3) and λ is the cosmological constant, which I often set to zero below.
• Inverse tetrad. Using the matrix $e^\mu_I(x)$, defined to be the inverse of the matrix $e^I_\mu(x)$, we define the Ricci tensor
$$ R^I_\mu = R^{IJ}{}_{\mu\nu}\, e^\nu_J \tag{2.13} $$
and the Ricci scalar
$$ R = R^I_\mu\, e^\mu_I \tag{2.14} $$
and write the vacuum Einstein equations (2.11) as
$$ R^I_\mu - \tfrac{1}{2} R\, e^I_\mu + \lambda\, e^I_\mu = 0 \tag{2.15} $$
Footnote 3: The constant 16πG has no effect on the classical equations of motion (2.11). However, it governs the strength of the interaction with the matter fields described below, and it also determines the quantum properties of the system. In this, it is similar to the mass constant m in front of a free-particle action: the classical equations of motion ($\ddot{x} = 0$) do not depend on m, but the quantum dynamics of the particle does. For instance, the rate at which a wave packet spreads depends on m. Similarly, we will see that the quanta of pure gravity are governed by this constant.
• Second-order formalism. Replacing ω with ω[e] in (2.12), we get the equivalent action
$$ S[e] = \frac{1}{16\pi G} \int \epsilon_{IJKL}\left(\tfrac{1}{4}\, e^I \wedge e^J \wedge R[\omega[e]]^{KL} - \tfrac{1}{12}\lambda\, e^I \wedge e^J \wedge e^K \wedge e^L\right) \tag{2.16} $$
The formalism in (2.12), where e and ω are independent, is called the first-order formalism. The two formalisms are not equivalent in the presence of fermions; we do not know which one is physically correct, because the effect of gravity on single fermions is hard to measure.
• Selfdual formalism. Consider the selfdual "projector" $P^i_{IJ}$ given by
$$ P^i{}_{jk} = \tfrac{1}{2}\,\epsilon^i{}_{jk}, \qquad P^i{}_{0j} = -P^i{}_{j0} = \tfrac{i}{2}\,\delta^i_j \tag{2.17} $$
where i = 1, 2, 3 (footnote 4). This verifies the two properties
$$ \tfrac{1}{2}\,\epsilon_{IJ}{}^{KL} P^i_{KL} = i\, P^i_{IJ}, \qquad P^{IJ}_i P^i_{KL} = P^{IJ}{}_{KL} \equiv \tfrac{1}{2}\,\delta^{I}_{[K}\delta^{J}_{L]} + \tfrac{i}{4}\,\epsilon^{IJ}{}_{KL} \tag{2.18} $$
where $P^{IJ}{}_{KL}$ is the projector on selfdual tensors. Define the complex SO(3) connection
$$ A^i_\mu = P^i_{IJ}\, \omega^{IJ}_\mu \tag{2.19} $$
Equivalently,
$$ A^i = \omega^i + i\,\omega^{0i} \tag{2.20} $$
(We write $\omega^i = \tfrac{1}{2}\epsilon^i{}_{jk}\,\omega^{jk}$. See p. xxii.) We can use the complex selfdual connection $A^i$ (three complex one-forms) instead of the real connection $\omega^{IJ}$ (six real one-forms) as the dynamical variable for GR. (This is equivalent to describing a system with two real degrees of freedom x and y in terms of a single complex variable z = x + iy.) In terms of $A^i$, the vacuum Einstein equations read
$$ P_{iIJ}\, e^I \wedge \left(F^i - \tfrac{2}{3}\lambda\, P^i_{KL}\, e^K \wedge e^L\right) = 0 \tag{2.21} $$
where $F^i = dA^i + \epsilon^i{}_{jk} A^j A^k$ is the curvature of A (footnote 5). These are the Euler-Lagrange equations of the action
$$ S[e, A] = \frac{1}{16\pi G} \int \left(-i\, P_{iIJ}\, e^I \wedge e^J \wedge F^i - \tfrac{1}{12}\lambda\, \epsilon_{IJKL}\, e^I \wedge e^J \wedge e^K \wedge e^L\right) \tag{2.22} $$
which differs from the action (2.12) by an imaginary term that does not change the equations of motion. The selfdual formalism is often used in canonical quantization, because it simplifies the form of the hamiltonian theory. If we replace the imaginary unit i in (2.17) with a real parameter γ, (2.22) is called the Holst action [54] and gives rise to the Ashtekar-Barbero-Immirzi formalism; γ is called the Immirzi parameter.
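The algebra of the selfdual projector is easy to get wrong by a sign, so a numerical check can be reassuring. The sketch below builds the components of $P^i_{IJ}$ and verifies the selfduality property and the equivalence of the two expressions for $A^i$. The conventions are assumptions made here, not taken from the book's conventions page: signature (−,+,+,+) and $\epsilon^{0123} = +1$ (so that $\epsilon_{0123} = -1$), chosen because the selfduality property holds with them. The function names are hypothetical.

```python
import itertools
import numpy as np


def levi_civita(n):
    """Totally antisymmetric symbol with eps[0, 1, ..., n-1] = +1."""
    eps = np.zeros((n,) * n)
    for perm in itertools.permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps


ETA = np.diag([-1.0, 1.0, 1.0, 1.0])   # assumed signature (-,+,+,+)
EPS3 = levi_civita(3)
EPS4_UP = levi_civita(4)                # assumed convention: eps^{0123} = +1
# Lower all four indices with eta; this gives eps_{0123} = -1.
EPS4_DOWN = np.einsum("ia,jb,kc,ld,abcd->ijkl", ETA, ETA, ETA, ETA, EPS4_UP)


def selfdual_projector():
    """Components P[i, I, J] of P^i_{IJ}; array index i = 0, 1, 2 stands for i = 1, 2, 3."""
    P = np.zeros((3, 4, 4), dtype=complex)
    P[:, 1:, 1:] = 0.5 * EPS3           # P^i_{jk} = (1/2) eps_{ijk}
    for i in range(3):
        P[i, 0, i + 1] = 0.5j           # P^i_{0j} = (i/2) delta_{ij}
        P[i, i + 1, 0] = -0.5j          # P^i_{j0} = -P^i_{0j}
    return P


P = selfdual_projector()

# Selfduality: (1/2) eps_{IJ}^{KL} P^i_{KL} = i P^i_{IJ}
eps_mixed = np.einsum("ijmn,mk,nl->ijkl", EPS4_DOWN, ETA, ETA)
dual_of_P = 0.5 * np.einsum("ijkl,akl->aij", eps_mixed, P)
is_selfdual = np.allclose(dual_of_P, 1j * P)

# A^i = P^i_{IJ} omega^{IJ} versus A^i = omega^i + i omega^{0i}, for a random
# antisymmetric omega^{IJ} (the components of the connection along one direction).
rng = np.random.default_rng(0)
m = rng.normal(size=(4, 4))
omega = m - m.T
A_from_projector = np.einsum("aij,ij->a", P, omega)
omega_i = 0.5 * np.einsum("ajk,jk->a", EPS3, omega[1:, 1:])
A_direct = omega_i + 1j * omega[0, 1:]
connections_agree = np.allclose(A_from_projector, A_direct)

print(is_selfdual, connections_agree)
```

With the opposite sign convention for the four-dimensional epsilon symbol, the same projector would come out anti-selfdual, which is why the convention is stated explicitly in the code.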
Footnote 4: The complex Lorentz algebra splits into two complex so(3) algebras, called the selfdual and anti-selfdual components: $so(3, 1; \mathbb{C}) = so(3, \mathbb{C}) \oplus so(3, \mathbb{C})$. The projector (2.17) reads out the selfdual component.
Footnote 5: Because of the split mentioned in the previous footnote, the curvature of the selfdual component of the connection is the selfdual component of the curvature.
• Plebanski formalism. The Plebanski selfdual two-form is defined as
$$ \Sigma^i = P^i_{IJ}\, e^I \wedge e^J \tag{2.23} $$
That is,
$$ \Sigma^1 = e^2 \wedge e^3 + i\, e^0 \wedge e^1 \tag{2.24} $$
and so on cyclically. A straightforward calculation shows that Σ satisfies
$$ D\Sigma^i \equiv d\Sigma^i + A^i{}_j \wedge \Sigma^j = 0 \tag{2.25} $$
where we write $A^i{}_j = \epsilon^i{}_{jk} A^k$. See p. xxii. The algebraic equations for a triplet of complex two-forms $\Sigma^i$
$$ 3\, \Sigma^i \wedge \Sigma^j = \delta^{ij}\, \Sigma_k \wedge \Sigma^k = -\delta^{ij}\, \bar\Sigma_k \wedge \bar\Sigma^k, \qquad \Sigma^i \wedge \bar\Sigma^j = 0 \tag{2.26} $$
are solved by (2.23), where $e^I$ is an arbitrary real tetrad. The GR action can thus be written as
$$ S[\Sigma, A] = \frac{-i}{16\pi G} \int \left(\Sigma_i \wedge F^i + \tfrac{1}{3}\lambda\, \Sigma_k \wedge \Sigma^k\right) \tag{2.27} $$
where $\Sigma^i$ satisfies the Plebanski constraints (2.26). The Plebanski formalism is often used as a starting point for spinfoam models.
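As a worked instance of the definition of the Plebanski two-form, the component $\Sigma^1$ can be obtained by inserting the nonvanishing components of the selfdual projector, $P^1{}_{23} = -P^1{}_{32} = \tfrac12$ and $P^1{}_{01} = -P^1{}_{10} = \tfrac{i}{2}$, and using the antisymmetry of the wedge product:

```latex
\begin{aligned}
\Sigma^1 &= P^1{}_{IJ}\, e^I \wedge e^J \\
         &= P^1{}_{23}\, e^2 \wedge e^3 + P^1{}_{32}\, e^3 \wedge e^2
          + P^1{}_{01}\, e^0 \wedge e^1 + P^1{}_{10}\, e^1 \wedge e^0 \\
         &= \tfrac12\, e^2 \wedge e^3 + \tfrac12\, e^2 \wedge e^3
          + \tfrac{i}{2}\, e^0 \wedge e^1 + \tfrac{i}{2}\, e^0 \wedge e^1 \\
         &= e^2 \wedge e^3 + i\, e^0 \wedge e^1 .
\end{aligned}
```

The components $\Sigma^2$ and $\Sigma^3$ follow in the same way by cyclic permutation of the spatial indices.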
2.1.2 "Matter"
In the general-relativistic parlance, "matter" is anything which is not the gravitational field. As far as we know, the world is made up of the gravitational field, Yang-Mills fields, fermion fields and, presumably, scalar fields.
• Maxwell. The electromagnetic field is described by the one-form field A, the Maxwell potential
$$ A(x) = A_\mu(x)\, dx^\mu \tag{2.28} $$
Its curvature is the two-form F = dA, with components $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Its dynamics is governed by the action
$$ S_{\mathrm{M}}[e, A] = \frac{1}{4} \int F^* \wedge F \tag{2.29} $$
• Yang-Mills. The above generalizes to a nonabelian connection A in a Yang-Mills group G. A defines a gauge covariant exterior derivative D and curvature F. The action is
$$ S_{\mathrm{YM}}[e, A] = \frac{1}{4} \int \mathrm{tr}[F^* \wedge F] \tag{2.30} $$
where tr is a trace on the algebra.
• Scalar. Let ϕ(x) be a scalar field, possibly with values in a representation of G. The Yang-Mills field A defines the covariant partial derivative
$$ D_\mu \phi = \partial_\mu \phi + A^A_\mu L_A \phi \tag{2.31} $$
where $L_A$ are the generators of the gauge algebra in the representation to which ϕ belongs. The action that governs the dynamics of the field is
$$ S_{\mathrm{sc}}[e, A, \phi] = \int d^4x\; e \left(\eta^{IJ} e^\mu_I D_\mu\phi\; e^\nu_J D_\nu\phi + V(\phi)\right) \tag{2.32} $$
where e is the determinant of $e^I_\mu$ and V(ϕ) is a self-interaction potential.
• Fermion. A fermion field ψ is a field in a spinor representation of the Lorentz group, possibly with values in a representation of G. The spin connection ω and the Yang-Mills field A define the covariant partial derivative
$$ D_\mu \psi = \partial_\mu \psi + \omega^I{}_{\mu J}\, L^J{}_I\, \psi + A^A_\mu L_A \psi \tag{2.33} $$
where $L^J{}_I$ and $L_A$ are the generators of the Lorentz and gauge algebras in the representations to which ψ belongs. Define
$$ \not{D}\psi = \gamma^I e^\mu_I D_\mu \psi \tag{2.34} $$
where $\gamma^I$ are the standard Dirac matrices. The action that governs the dynamics of the fermion field is
$$ S_{\mathrm{f}}[e, \omega, A, \phi, \psi] = \int d^4x\; e \left(\bar\psi \not{D}\psi + Y(\phi, \bar\psi, \psi)\right) + \text{complex conjugate} \tag{2.35} $$
where the second term is a polynomial interaction potential with a scalar field.
• The "lagrangian of the world": the standard model. As far as we know, the world can be described in terms of a set of fields e, ω, A, ψ, ϕ, where G = SU(3) × SU(2) × U(1) and ψ and ϕ are in suitable multiplets, and is governed by the action
$$ S[e, \omega, A, \psi, \phi] = S_{\mathrm{GR}}[e, \omega] + S_{\mathrm{YM}}[e, A] + S_{\mathrm{f}}[e, \omega, A, \psi] + S_{\mathrm{sc}}[e, A, \phi] = S_{\mathrm{GR}}[e, \omega] + S_{\mathrm{matter}}[e, \omega, A, \phi, \psi] \tag{2.36} $$
with suitable polynomials V and Y. The equations of motion that follow from this action by varying e are the Einstein equations (2.11) with a source term, namely
$$ \epsilon_{IJKL}\left(e^I \wedge R^{JK} - \tfrac{2}{3}\lambda\, e^I \wedge e^J \wedge e^K\right) = 2\pi G\, T_L \tag{2.37} $$
where the energy-momentum three-form
$$ T_I = \frac{\det e}{3}\; T^\mu_I\, \epsilon_{\mu\nu\rho\sigma}\, dx^\nu \wedge dx^\rho \wedge dx^\sigma \tag{2.38} $$
is defined by
$$ T_I(x) = \frac{\delta S_{\mathrm{matter}}}{\delta e^I(x)} \tag{2.39} $$
Equivalently, the Einstein equations (2.37) can be written as
$$ R^I_\mu - \tfrac{1}{2} R\, e^I_\mu + \lambda\, e^I_\mu = 8\pi G\, T^I_\mu \tag{2.40} $$
$T^I_\mu(x)$ is called the energy-momentum tensor. It is the sum of the individual energy-momentum tensors of the various matter terms (footnote 6).
• Particles. The trajectory $x^\mu(s)$ of a point particle is an approximate notion. Macroscopic objects have finite size, and elementary particles are quantum entities and therefore have no trajectories. At macroscopic scales, the notion of a point-particle trajectory is nevertheless very useful.
In the absence of nongravitational forces, the equations of motion for the worldline $\gamma: s \mapsto x^\mu(s)$ of a particle are determined by the action
$$ S[e, \gamma] = m \int ds\, \sqrt{-\eta_{IJ}\, v^I(s)\, v^J(s)} \tag{2.41} $$
where
$$ v^I(s) = e^I_\mu(x(s))\, v^\mu(s) \tag{2.42} $$
and $v^\mu$ is the particle velocity
$$ v^\mu(s) = \dot{x}^\mu(s) \equiv \frac{dx^\mu(s)}{ds} \tag{2.43} $$
This action is independent of the way the trajectory is parametrized, and therefore determines the path, not its parametrization. With the parametrization choice $v_I v^I = -1$, the equations of motion are
$$ \ddot{x}^\mu = -\Gamma^\mu_{\nu\rho}\, \dot{x}^\nu \dot{x}^\rho \tag{2.44} $$
where
$$ \Gamma^\sigma_{\mu\nu} = e^\rho_J\, e^{J\sigma} \left(e_{\rho I}\, \partial_{(\mu} e^I_{\nu)} + e_{\nu I}\, \partial_{[\mu} e^I_{\rho]} + e_{\mu I}\, \partial_{[\nu} e^I_{\rho]}\right) \tag{2.45} $$
is called the Levi-Civita connection. In an arbitrary parametrization, the equations of motion are
$$ \ddot{x}^\mu + \Gamma^\mu_{\nu\rho}\, \dot{x}^\nu \dot{x}^\rho = I(s)\, \dot{x}^\mu \tag{2.46} $$
where I(s) is an arbitrary function of s.
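The statement that the particle action depends on the path and not on its parametrization can be checked numerically. The sketch below works in flat space (tetrad equal to the identity) and evaluates the action as a discretized sum over small segments for one hypothetical timelike path, traversed with two different parameters. The path, the reparametrization $s = u^2$ and the function names are all illustrative assumptions.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric eta_IJ, signature (-,+,+,+)


def worldline(s):
    """Hypothetical timelike path in flat space: t = s, x = 0.1 sin(2 pi s), y = z = 0."""
    s = np.asarray(s, dtype=float)
    zeros = np.zeros_like(s)
    return np.stack([s, 0.1 * np.sin(2.0 * np.pi * s), zeros, zeros], axis=1)


def particle_action(param_to_s, mass=1.0, n=20000):
    """Discretized m * integral of sqrt(-eta_IJ v^I v^J), with the tetrad set to the identity.

    param_to_s maps an arbitrary parameter u in [0, 1] to the path label s in [0, 1].
    """
    u = np.linspace(0.0, 1.0, n + 1)
    x = worldline(param_to_s(u))
    dx = np.diff(x, axis=0)                                 # displacement on each segment
    interval = -np.einsum("ni,ij,nj->n", dx, ETA, dx)       # positive for timelike steps
    return mass * np.sum(np.sqrt(interval))


S_linear = particle_action(lambda u: u)          # parameter equal to s
S_quadratic = particle_action(lambda u: u ** 2)  # same path, different parametrization

print(S_linear, S_quadratic)
```

The two values agree up to discretization error, and both are smaller than the coordinate time elapsed (equal to 1 here), as expected for the proper time along a path that is not a straight line in Minkowski space.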
Footnote 6: The energy-momentum tensor defined as the variation of the action with respect to the gravitational field may differ by a total derivative from the one conventional in Minkowski space, defined as the Noether current of translations.
Minkowski solution. Consider a regime in which we can assume that the Newton constant G is small, that is, a regime in which we can neglect the effect of matter on the gravitational field. Assume also that within our approximation the cosmological constant λ is negligible. The Einstein equations (2.11) then admit (among many others) the particularly interesting solution
$$ e^I_\mu(x) = \delta^I_\mu, \qquad \omega^I{}_{\mu J}(x) = 0 \tag{2.47} $$
which is called the Minkowski solution. This solution is everywhere flat.
Assume that the gravitational field is in this configuration. What are the equations of motion of the matter interacting with this particular gravitational field? These are easily obtained by inserting the Minkowski solution (2.47) into the matter action (2.36):
$$ S[A, \phi, \psi] = S_{\mathrm{matter}}[e = \delta, \omega = 0, A, \phi, \psi] \tag{2.48} $$
The action $S[A, \phi, \psi]$ is the action of the standard model used in high-energy physics. This action is usually written in terms of the spacetime Minkowski metric $\eta_{\mu\nu}$. This metric is obtained from the Minkowski value (2.47) of the tetrad field. For instance, in the action of a scalar field (2.32), the combination $\eta^{IJ} e^\mu_I(x) e^\nu_J(x)$ becomes
$$ \eta^{IJ} e^\mu_I(x)\, e^\nu_J(x) = \eta^{IJ} \delta^\mu_I \delta^\nu_J = \eta^{\mu\nu} \tag{2.49} $$
on this solution.
The Minkowski metric $\eta_{\mu\nu}$ of special-relativistic physics is nothing but a particular value of the gravitational field. It is one of the solutions of the Einstein equation, within a certain approximation.
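The way the tetrad produces the spacetime metric can be illustrated numerically. The sketch below evaluates the combination $\eta^{IJ} e^\mu_I e^\nu_J$ for the Minkowski value of the tetrad, and again after multiplying the tetrad by a constant Lorentz boost, which is a special case of the local Lorentz transformations of the tetrad described in the section on gauge invariance. The function names and the rapidity value are illustrative assumptions.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # eta_IJ; eta^{IJ} has the same components


def inverse_metric_from_tetrad(e):
    """Return eta^{IJ} e^mu_I e^nu_J, where e[I, mu] = e^I_mu and e^mu_I is its matrix inverse."""
    e_inv = np.linalg.inv(e)          # e_inv[mu, I] = e^mu_I
    return e_inv @ ETA @ e_inv.T


def boost_x(rapidity):
    """Constant Lorentz boost along the first spatial direction, as a 4x4 matrix."""
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = np.cosh(rapidity)
    lam[0, 1] = lam[1, 0] = np.sinh(rapidity)
    return lam


e_minkowski = np.eye(4)                         # e^I_mu = delta^I_mu
g_inv = inverse_metric_from_tetrad(e_minkowski)

e_boosted = boost_x(0.7) @ e_minkowski          # e^I_mu -> lambda^I_J e^J_mu
g_inv_boosted = inverse_metric_from_tetrad(e_boosted)

print(np.allclose(g_inv, ETA), np.allclose(g_inv_boosted, ETA))
```

The boosted tetrad differs from the identity, yet it yields the same $\eta^{\mu\nu}$: different tetrads related by a Lorentz transformation describe the same metric.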
2.1.3 Gauge invariance
The general definition of a system with a gauge invariance, and the one which is most useful for understanding the physics of gauge systems, is the following, which is due to Dirac. Consider a system of evolution equations in an evolution parameter t. The system is said to be "gauge" invariant if evolution is under-determined, that is, if there are two distinct solutions that are equal for t less than a certain $\hat{t}$; see Figure 2.1. These two solutions are said to be "gauge equivalent". Any two solutions are said to be gauge equivalent if they are gauge equivalent (as above) to a third solution. The gauge group G is a group that acts on the physical fields and maps gauge-equivalent solutions into one another. Since classical physics is deterministic, under-determined evolution equations are physically consistent only under the stipulation that only quantities invariant under gauge transformations are physical predictions of the theory. These quantities are called the gauge-invariant observables.
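Dirac's definition can be illustrated with a deliberately simple under-determined system, invented here for the purpose: two variables x(t) and y(t) subject to the single equation d(x − y)/dt = 1, with x − y = 0 at t = 0. Every choice of a function f(t) gives a solution x = t + f, y = f. The sketch below builds two solutions that coincide before a chosen time and differ afterwards, and checks that the combination x − y, which plays the role of a gauge-invariant observable, is the same on both. The names `T_HAT`, `f_zero` and `f_late` are hypothetical.

```python
import numpy as np

T_HAT = 1.0  # hypothetical time after which the two solutions are allowed to differ


def solution(t, gauge_function):
    """Solution of the toy equation d(x - y)/dt = 1 with x - y = t: x = t + f(t), y = f(t)."""
    f = gauge_function(t)
    return t + f, f


def f_zero(t):
    """Arbitrary function chosen to vanish identically."""
    return np.zeros_like(t)


def f_late(t):
    """Arbitrary function that vanishes for t <= T_HAT and is nonzero afterwards."""
    return np.where(t > T_HAT, (t - T_HAT) ** 2, 0.0)


t = np.linspace(0.0, 2.0, 201)
x1, y1 = solution(t, f_zero)
x2, y2 = solution(t, f_late)

early = t < T_HAT
same_before = bool(np.allclose(x1[early], x2[early]) and np.allclose(y1[early], y2[early]))
differ_after = not np.allclose(x1[~early], x2[~early])
observable_agrees = bool(np.allclose(x1 - y1, x2 - y2))

print(same_before, differ_after, observable_agrees)
```

The values of x and y separately are not predicted by the equation, since they depend on the arbitrary function f; only x − y is, which is the sense in which only gauge-invariant quantities are physical predictions.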
The equations of motion derived by the action (236) are invariant underthree groups of gauge transformations (i) local YangndashMills gauge trans-formations (ii) local Lorentz transformations and (iii) diffeomorphismtransformations They are described below Gauge-invariant observablesmust be invariant under these three groups of transformations
(i) Local G transformations G is the YangndashMills group A local G transformationis labeled by a map λ M rarr G It acts on ϕψ and the connection A in the
21 Formalism 41
Fig. 2.1 Dirac definition of gauge: two different solutions of the equations of motion must be considered gauge equivalent if they are equal for $t < \hat t$. [Plot: two solutions, $\varphi(t)$ and $\tilde\varphi(t)$, drawn against $t$.]
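well-known form, while $e$ and $\omega$ are invariant:

    \lambda:\quad \varphi(x) \to R_\varphi(\lambda(x))\, \varphi(x),    (2.50)
    \psi(x) \to R_\psi(\lambda(x))\, \psi(x),    (2.51)
    A_\mu(x) \to R(\lambda(x))\, A_\mu(x) + \lambda(x)\, \partial_\mu \lambda^{-1}(x),    (2.52)
    e^I_\mu(x) \to e^I_\mu(x),    (2.53)
    \omega^I_{\mu J}(x) \to \omega^I_{\mu J}(x).    (2.54)

Here $R_\varphi$ and $R_\psi$ are the representations of $G$ to which $\varphi$ and $\psi$ belong, and $R$ is the adjoint representation.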
(ii) Local Lorentz transformations. A local Lorentz transformation is labeled by a map $\lambda: M \to SO(3,1)$. It acts on $\varphi$, $\psi$ and the connection $\omega$ precisely as a Yang–Mills local transformation with Yang–Mills group $G = SO(3,1)$. Scalars $\varphi$ belong to the trivial representation; fermions $\psi$ belong to the spinor representations $S$. The gravitational field $e$ transforms in the fundamental representation. Explicitly, writing an element of $SO(3,1)$ as $\lambda^I{}_J$, with inverse $(\lambda^{-1})^L{}_J = \lambda_J{}^L$, we have

    \lambda:\quad \varphi(x) \to \varphi(x),    (2.55)
    \psi(x) \to S(\lambda(x))\, \psi(x),    (2.56)
    A_\mu(x) \to A_\mu(x),    (2.57)
    e^I_\mu(x) \to \lambda^I{}_J(x)\, e^J_\mu(x),    (2.58)
    \omega^I_{\mu J}(x) \to \lambda^I{}_K(x)\, \omega^K_{\mu L}(x)\, (\lambda^{-1})^L{}_J(x) + \lambda^I{}_K(x)\, \partial_\mu (\lambda^{-1})^K{}_J(x).    (2.59)
(iii) Diffeomorphisms. Third, and most important, is the invariance under diffeomorphisms. A diffeomorphism gauge transformation is labeled by a smooth invertible map $\phi: M \to M$ (that is, by a "diffeomorphism" of $M$).^7 It acts nonlocally on all the fields, by pulling them back according to their form character: $\varphi$ and $\psi$ are zero-forms; $e$, $\omega$ and $A$ are one-forms:^8

    \phi:\quad \varphi(x) \to \varphi(\phi(x)),    (2.60)
    \psi(x) \to \psi(\phi(x)),    (2.61)
    A_\mu(x) \to \frac{\partial \phi^\nu(x)}{\partial x^\mu}\, A_\nu(\phi(x)),    (2.62)
    e^I_\mu(x) \to \frac{\partial \phi^\nu(x)}{\partial x^\mu}\, e^I_\nu(\phi(x)),    (2.63)
    \omega^I_{\mu J}(x) \to \frac{\partial \phi^\nu(x)}{\partial x^\mu}\, \omega^I_{\nu J}(\phi(x)).    (2.64)

7 There is an unfortunate terminological imprecision. A map $\phi: M \to M$ is called a diffeomorphism. The associated transformations (2.60)–(2.64) on the fields are also often loosely called a diffeomorphism (also in this book), instead of diffeomorphism gauge transformations. This tends to generate confusion.
These three groups of transformations send solutions of the equations of motion into other solutions of the equations of motion. They are gauge transformations because we can take these transformations to be the identity before a given coordinate time $\hat t$ and different from the identity afterwards. Therefore they are responsible for the under-determination of the evolution equations. Following Dirac's argument given above, physical predictions of the theory must be given by quantities invariant under all three of these transformations.

In particular, let a local quantity in spacetime be a quantity dependent on a fixed given point $x$. Notice that such a quantity cannot be invariant under a diffeomorphism. Therefore no local quantity in spacetime (in this sense) is a gauge-invariant observable in GR. The meaning of this fact, and the far-reaching consequences of diffeomorphism invariance, are discussed below in Section 2.3.2.
2.1.4 Physical geometry

At each point $x$ of the spacetime manifold $M$, the gravitational field $e^I_\mu(x)$ defines a map from the tangent space $T_xM$ to Minkowski space. The map sends a vector $v^\mu$ in $T_xM$ into the Minkowski vector $u^I = e^I_\mu(x)\, v^\mu$. The Minkowski length $|u| = \sqrt{-u \cdot u} = \sqrt{-\eta_{IJ} u^I u^J}$ defines a norm $|v|$ of the tangent vector $v^\mu$:

    |v| \equiv |u| = \sqrt{-\eta_{IJ}\, (e^I_\mu(x)\, v^\mu)\, (e^J_\nu(x)\, v^\nu)}.    (2.65)
8 Under this definition, internal Lorentz, spinor and gauge indices do not transform under a diffeomorphism. Alternatively, one should consider fiber-preserving diffeomorphisms of the Lorentz and gauge bundle. This alternative can be viewed as mathematically cleaner and physically more attractive, because it makes more explicit the fact that local inertial frames, or local gauge choices, at different spacetime points cannot be identified (see later). However, the mathematical description of a diffeomorphism becomes more complicated, while the two choices are ultimately physically equivalent, due to the gauge invariance under local Lorentz and gauge transformations. The proper mathematical transformation of a spinor under diffeomorphisms is discussed in [55] and [56].
$|v|$ is called the "physical length" of the tangent vector $v$. The tangent vector $v$ is called timelike (spacelike or lightlike) if $u$ is timelike (spacelike or lightlike).

This fact allows us to assign a size to any $d$-dimensional surface in $M$. At any point $x$ on the surface, the gravitational field maps the tangent space of the surface into a surface in Minkowski space. This surface carries a volume form, which can be pulled back to the tangent space of $x$, and then to the surface itself, and integrated. In particular:
• The length $L$ of a curve $\gamma: s \to x^\mu(s)$ is the line integral of the norm of its tangent,

      L[e, \gamma] = \int |d\gamma| = \int ds\, |u(s)| = \int ds\, \sqrt{-\eta_{IJ}\, u^I(s)\, u^J(s)},    (2.66)

  where

      u^I(s) = e^I_\mu(\gamma(s))\, \frac{dx^\mu(s)}{ds}.    (2.67)

  This can be written as the line integral of the norm of the one-form $e^I(x) = e^I_\mu(x)\, dx^\mu$ along $\gamma$:

      L[e, \gamma] = \int_\gamma |e|.    (2.68)

  The length is independent of the parametrization and the orientation of $\gamma$. A curve is called timelike if its tangent is everywhere timelike. Notice that the action of a particle (2.41) is nothing but the length of its path in spacetime:

      S[e, \gamma] = m\, L[e, \gamma].    (2.69)
• The area $A$ of a two-dimensional surface $S: \sigma = (\sigma^i) \to x^\mu(\sigma^i)$, $i = 1, 2$, immersed in $M$ is

      A[e, S] = \int |d^2S| = \int_S d^2\sigma\, \sqrt{\det(u_i \cdot u_j)},    (2.70)

  where

      u^I_i(\sigma) = e^I_\mu(x(\sigma))\, \frac{\partial x^\mu(\sigma)}{\partial \sigma^i}    (2.71)

  and the determinant is over the $i, j$ indices. That is,

      A[e, S] = \int d^2\sigma\, \sqrt{(u_1 \cdot u_1)(u_2 \cdot u_2) - (u_1 \cdot u_2)^2}.    (2.72)

  A surface is called spacelike if its tangents are all spacelike.
• The volume $V$ of a three-dimensional region $R: \sigma = (\sigma^i) \to x^\mu(\sigma^i)$, $i = 1, 2, 3$, immersed in $M$ is

      V[e, R] = \int |d^3R| = \int_R d^3\sigma\, \sqrt{n \cdot n},    (2.73)

  where

      n_I = \epsilon_{IJKL}\, u^J_1\, u^K_2\, u^L_3    (2.74)

  is normal to the surface. A region is called spacelike if $n$ is everywhere timelike.

The quantities $L$, $A$ and $V$ are particular functions of the gravitational field $e$. The reason they have these geometric names is discussed below in Section 2.2.3.
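As a rough numerical illustration of the length (2.66) and the area (2.72), the following Python sketch evaluates both by midpoint sums. It assumes the signature $\eta = {\rm diag}(-1, +1, +1, +1)$, which is what the minus sign under the root in (2.65) suggests, and it uses the Minkowski tetrad $e = \delta$. The function names, the finite-difference step, and the sample curve and surface are illustrative choices, not taken from the text.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # assumed signature: eta = diag(-1, +1, +1, +1)


def flat_tetrad(x):
    """Minkowski solution e^I_mu = delta^I_mu (any 4x4 field could be passed instead)."""
    return [[1.0 if I == mu else 0.0 for mu in range(4)] for I in range(4)]


def dot(u, w):
    """Minkowski product eta_IJ u^I w^J."""
    return sum(ETA[I] * u[I] * w[I] for I in range(4))


def push(e, x, dx):
    """u^I = e^I_mu(x) dx^mu, as in (2.67) and (2.71)."""
    E = e(x)
    return [sum(E[I][mu] * dx[mu] for mu in range(4)) for I in range(4)]


def curve_length(e, curve, n=20000, h=1e-6):
    """Midpoint-rule estimate of (2.66) for a timelike curve with s in [0, 1]."""
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        a, b = curve(s - h), curve(s + h)
        dx = [(b[mu] - a[mu]) / (2 * h) for mu in range(4)]  # tangent dx/ds
        u = push(e, curve(s), dx)
        total += math.sqrt(max(0.0, -dot(u, u))) / n
    return total


def surface_area(e, surf, lo, hi, n=200, h=1e-6):
    """Midpoint-rule estimate of (2.72) for a spacelike surface x(s1, s2)."""
    d1, d2 = (hi[0] - lo[0]) / n, (hi[1] - lo[1]) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            s1, s2 = lo[0] + (i + 0.5) * d1, lo[1] + (j + 0.5) * d2
            x = surf(s1, s2)
            a, b = surf(s1 - h, s2), surf(s1 + h, s2)
            c, d = surf(s1, s2 - h), surf(s1, s2 + h)
            u1 = push(e, x, [(b[m] - a[m]) / (2 * h) for m in range(4)])
            u2 = push(e, x, [(d[m] - c[m]) / (2 * h) for m in range(4)])
            det = dot(u1, u1) * dot(u2, u2) - dot(u1, u2) ** 2
            total += math.sqrt(max(0.0, det)) * d1 * d2
    return total


def curve_a(s):
    """Timelike curve x(s) = (s, s^2/2, 0, 0): -u.u = 1 - s^2, so L = pi/4."""
    return [s, 0.5 * s * s, 0.0, 0.0]


def curve_b(r):
    """The same curve reparametrized by s = r^2; the length should not change."""
    return curve_a(r * r)


def sphere(theta, phi):
    """Unit 2-sphere at fixed time, whose area is 4*pi."""
    return [0.0, math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi), math.cos(theta)]


length_a = curve_length(flat_tetrad, curve_a)
length_b = curve_length(flat_tetrad, curve_b)
area = surface_area(flat_tetrad, sphere, (0.0, 0.0), (math.pi, 2 * math.pi))
print(length_a, length_b, math.pi / 4)
print(area, 4 * math.pi)
```

The two parametrizations of the curve give the same length up to discretization error, which is the parametrization independence stated after (2.68). Passing a different tetrad function in place of `flat_tetrad` changes the lengths and areas, which is the sense in which $L$ and $A$ are functions of $e$.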
2.1.5 Holonomy and metric

In GR, quantities close to observations, such as lengths and areas, are nonlocal, in the sense that they depend on finite but extended regions in spacetime, such as lines and surfaces. Another natural nonlocal quantity, which plays a central role in the quantum theory, is the holonomy $U$ of the gravitational connection ($\omega$, or its selfdual part $A$) along a curve $\gamma$.

Definition of the holonomy. Given a connection $A$ in a group $G$ over a manifold $M$, the holonomy is defined as follows. Let a curve $\gamma$ be a continuous, piecewise smooth map from the interval $[0, 1]$ into $M$,

    \gamma: [0, 1] \to M,    (2.75)
    s \mapsto x^\mu(s).    (2.76)

The holonomy, or parallel propagator, $U[A, \gamma]$ of the connection $A$ along the curve $\gamma$ is the element of $G$ defined by

    U[A, \gamma](0) = \mathbb{1},    (2.77)
    \frac{d}{ds} U[A, \gamma](s) - \dot\gamma^\mu(s)\, A_\mu(\gamma(s))\, U[A, \gamma](s) = 0,    (2.78)
    U[A, \gamma] = U[A, \gamma](1),    (2.79)
where $\dot\gamma^\mu(s) \equiv dx^\mu(s)/ds$ is the tangent to the curve. (In the mathematical literature, the term "holonomy" is generally used for closed curves only. In the quantum gravity literature, it is commonly employed for open curves as well.) The formal solution of this equation is

    U[A, \gamma] = P \exp \int_0^1 ds\, \dot\gamma^\mu(s)\, A^i_\mu(\gamma(s))\, \tau_i \equiv P \exp \int_\gamma A,    (2.80)

where $\tau_i$ is a basis in the Lie algebra of the group $G$, and the path ordering $P$ is defined by the power series expansion

    P \exp \int_0^1 ds\, A(\gamma(s)) = \sum_{n=0}^{\infty} \int_0^1 ds_1 \int_0^{s_1} ds_2 \cdots \int_0^{s_{n-1}} ds_n\, A(\gamma(s_n)) \cdots A(\gamma(s_1)).    (2.81)
The connection $A$ is a rule that defines the meaning of parallel-transporting a vector in a representation $R$ of $G$ from a point of $M$ to a nearby point: the vector $v$ at $x$ is defined to be parallel to the vector $v + R(A_\mu dx^\mu)\, v$ at $x + dx$. A vector is parallel-transported along $\gamma$ to the vector $R(U(A, \gamma))\, v$.

An important property of the holonomy is that it transforms homogeneously under the gauge transformation (2.52) of $A$. That is, $U[A_\lambda, \gamma] = \lambda(x^\gamma_f)\, U[A, \gamma]\, \lambda^{-1}(x^\gamma_i)$, where $x^\gamma_{i,f}$ are the initial and final points of $\gamma$.

A technical remark that we shall need later on: the holonomy of any curve $\gamma$ is well defined even if there are (a finite number of) points where $\gamma$ is nondifferentiable and $A$ is ill defined. The reason is that we can break $\gamma$ into components where everything is differentiable, and define the holonomy of $\gamma$ as the product of the holonomies of the components, which are well defined by continuity.
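The definition (2.77)–(2.79) can be tried out numerically. The sketch below is a minimal illustration, assuming $G = SU(2)$ with the basis $\tau_i = -(i/2)\sigma_i$ (a common convention, assumed here rather than taken from the text). The curve is cut into small steps and the holonomy is built as an ordered product of step exponentials, with later steps multiplying from the left, as the differential equation (2.78) dictates. The connection components, the two sample curves and all helper names are hypothetical.

```python
import math

IDENTITY = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_su2(a):
    """Exact exp(a^i tau_i), with the assumed basis tau_i = -(i/2) sigma_i."""
    theta = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    if theta == 0.0:
        return [row[:] for row in IDENTITY]
    n = [comp / theta for comp in a]
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    # exp(theta n.tau) = cos(theta/2) 1 - i sin(theta/2) n.sigma
    return [[c - 1j * s * n[2], (-1j * n[0] - n[1]) * s],
            [(-1j * n[0] + n[1]) * s, c + 1j * s * n[2]]]


def holonomy(conn, curve, steps=2000, h=1e-6):
    """Ordered product approximating U[A, gamma] of (2.77)-(2.79).

    conn(x) returns A^i_mu as a list over mu of 3-component lists;
    curve(s) returns the point x(s) for s in [0, 1].
    """
    U = IDENTITY                                   # initial condition (2.77)
    ds = 1.0 / steps
    for k in range(steps):
        s = (k + 0.5) * ds
        x, lo, hi = curve(s), curve(s - h), curve(s + h)
        tangent = [(hi[m] - lo[m]) / (2 * h) for m in range(len(x))]
        A = conn(x)
        a = [ds * sum(tangent[m] * A[m][i] for m in range(len(x)))
             for i in range(3)]
        U = matmul(exp_su2(a), U)                  # later steps act from the left
    return U


def trace(U):
    return U[0][0] + U[1][1]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def line(s):
    """Straight unit segment along the first coordinate."""
    return [s, 0.0]


def corner(s):
    """L-shaped curve: along x for s < 1/2, then along y (one nondifferentiable point)."""
    return [2 * s, 0.0] if s < 0.5 else [1.0, 2 * s - 1.0]


# Example 1: constant connection A_x = 0.7 tau_3 along the straight segment.
U_line = holonomy(lambda x: [[0.0, 0.0, 0.7], [0.0, 0.0, 0.0]], line)

# Example 2: A_x = 0.7 tau_1, A_y = 1.1 tau_2 along the L-shaped curve.
U_corner = holonomy(lambda x: [[0.7, 0.0, 0.0], [0.0, 1.1, 0.0]], corner)
U_ordered = matmul(exp_su2([0.0, 1.1, 0.0]), exp_su2([0.7, 0.0, 0.0]))
U_naive = exp_su2([0.7, 1.1, 0.0])  # plain exponential, ignoring path ordering

print(trace(U_line).real, 2 * math.cos(0.35))
print(max_abs_diff(U_corner, U_ordered), max_abs_diff(U_corner, U_naive))
```

The second example is a small instance of the technical remark above: the holonomy of the L-shaped curve is the product of the holonomies of its two smooth pieces, and it differs from the unordered exponential because the two pieces of the connection do not commute.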
Physical interpretation of the holonomy. Consider two left-handed neutrinos that meet at the spacetime point A, separate, and then meet again at the spacetime point B. Assume their spins are parallel at A, and evolve under the sole influence of the gravitational field. What is their relative spin at B? A left-handed neutrino lives in the selfdual representation of the Lorentz group, and therefore its spin is parallel-transported by the selfdual connection $A$. Let $\gamma_1$ and $\gamma_2$ be the worldlines of the two neutrinos from A to B, and let $\gamma = \gamma_2^{-1} \gamma_1$ be the loop formed by the two worldlines. If the first neutrino has spin $\psi$ at B, the second has spin $\psi' = U(A, \gamma)\, \psi$. By having the two neutrinos interact, we can in principle measure a quantity such as $\alpha = 2\, {\rm Re}\langle\psi|\psi'\rangle$, which (assuming $|\psi| = 1$) gives the trace of the holonomy: $\alpha = {\rm tr}\, U[A, \gamma]$.
Metric notation. Einstein wrote GR in terms of the metric field. Here I give the translation to metric variables. Notice, however, that this is necessarily incomplete, since the fermion equations of motion cannot be written in terms of the metric field.

The metric field $g$ is a symmetric tensor field defined by

    g_{\mu\nu}(x) = e^I_\mu(x)\, e^J_\nu(x)\, \eta_{IJ}.    (2.82)

At each point $x$ of $M$, $g$ defines a scalar product in the tangent space $T_xM$,

    (u, v) = g_{\mu\nu}(x)\, u^\mu v^\nu, \qquad u, v \in T_xM,    (2.83)

and therefore maps $T_xM$ into $T^*_xM$. In other words, $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$ can be used to raise and lower tangent indices. The fact that $e^\mu_I(x) \equiv \eta_{IJ}\, g^{\mu\nu}\, e^J_\nu(x)$ is the inverse matrix of $e^I_\mu(x)$ is then a result, not a definition.
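To see concretely that the metric (2.82) does not depend on the choice of local Lorentz frame, the following sketch builds $g$ from a tetrad at a single point, applies the transformation (2.58) with a boost, and compares the two results. The signature ${\rm diag}(-1, +1, +1, +1)$ is assumed; the tetrad entries and the rapidity are made-up numbers chosen only for illustration.

```python
import math

# Assumed signature: eta = diag(-1, +1, +1, +1)
ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]


def metric_from_tetrad(e):
    """g_mu_nu = e^I_mu e^J_nu eta_IJ, equation (2.82); e[I][mu] stores e^I_mu."""
    return [[sum(e[I][mu] * ETA[I][J] * e[J][nu]
                 for I in range(4) for J in range(4))
             for nu in range(4)] for mu in range(4)]


def boost_x(rapidity):
    """An element lambda^I_J of SO(3,1): a boost along the first spatial axis."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]


def lorentz_transform(lam, e):
    """e^I_mu -> lambda^I_J e^J_mu, equation (2.58)."""
    return [[sum(lam[I][J] * e[J][mu] for J in range(4)) for mu in range(4)]
            for I in range(4)]


# Hypothetical value of the tetrad at one spacetime point.
e_point = [[1.2, 0.1, 0.0, 0.0],
           [0.3, 0.9, 0.2, 0.0],
           [0.0, 0.0, 1.1, 0.4],
           [0.0, 0.5, 0.0, 1.0]]

g_before = metric_from_tetrad(e_point)
g_after = metric_from_tetrad(lorentz_transform(boost_x(0.8), e_point))
max_diff = max(abs(g_before[m][n] - g_after[m][n])
               for m in range(4) for n in range(4))
print(max_diff)  # zero up to floating-point rounding

# The Minkowski tetrad e = delta gives back eta itself.
delta = [[1.0 if I == mu else 0.0 for mu in range(4)] for I in range(4)]
g_flat = metric_from_tetrad(delta)
```

The tetrad has sixteen components at each point and the symmetric metric only ten; the six that drop out correspond to the local Lorentz freedom illustrated here.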
The metric-preserving linear connection $\Gamma$ is the field $\Gamma^\rho_{\mu\nu}(x)$ defined by

    \Gamma^\rho_{\mu\nu} = e^\rho_I\, (\partial_\mu e^I_\nu + \omega^I_{\mu J}\, e^J_\nu).    (2.84)

It defines a covariant partial derivative $D_\mu$ on all fields that have tangent ($\mu$) indices:

    D_\mu v^\nu = \partial_\mu v^\nu + \Gamma^\nu_{\mu\rho}\, v^\rho.    (2.85)

Together with $\omega$, it defines a covariant partial derivative $D_\mu$ on all objects that have Lorentz as well as tangent indices. In particular, notice that (2.84) yields immediately

    D_\mu e^I_\nu = \partial_\mu e^I_\nu + \omega^I_{\mu J}\, e^J_\nu - \Gamma^\rho_{\mu\nu}\, e^I_\rho = 0.    (2.86)

The antisymmetric part $T^\rho_{\mu\nu} = \Gamma^\rho_{\mu\nu} - \Gamma^\rho_{\nu\mu}$ of the linear connection gives the torsion $T^I = e^I_\rho\, T^\rho_{\mu\nu}\, dx^\mu dx^\nu$ defined in (2.5).
The Levi–Civita connection is the (metric-preserving) linear connection determined by $e$ and $\omega[e]$. That is, it is defined by

    \partial_\mu e^I_\nu + \omega[e]^I_{\mu J}\, e^J_\nu - \Gamma^\rho_{\mu\nu}\, e^I_\rho = 0,    (2.87)

whose solution is (2.45). It is torsion-free. Notice that the antisymmetric part of this equation is the first Cartan structure equation with vanishing torsion, namely (2.6), which is sufficient to determine $\omega[e]$ as a function of $e$.

The Levi–Civita connection is uniquely determined by $g$: it is the unique torsion-free linear connection that is metric preserving, namely that satisfies

    D_\mu g_{\nu\rho} = 0,    (2.88)

or, equivalently,

    \partial_\mu g_{\nu\rho} - \Gamma^\sigma_{\mu\nu}\, g_{\sigma\rho} - \Gamma^\sigma_{\mu\rho}\, g_{\nu\sigma} = 0.    (2.89)

This equation is solved by (2.45), or

    \Gamma^\rho_{\mu\nu} = \frac{1}{2}\, g^{\rho\sigma}\, (\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}).    (2.90)
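Formula (2.90) is straightforward to evaluate numerically. The sketch below does so with central finite differences for a two-dimensional example, the round unit sphere with $g = {\rm diag}(1, \sin^2\theta)$. This example is chosen only for illustration and is not discussed in the text; its nonzero symbols are known in closed form, $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$. The function names and the step size are assumptions of the sketch.

```python
import math


def sphere_metric(x):
    """g_mu_nu of the unit 2-sphere in coordinates x = (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix (this sketch is restricted to two dimensions)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Gamma[rho][mu][nu] from (2.90), with derivatives by central differences."""
    dim = len(x)
    dg = []  # dg[s][m][n] = partial_s g_mn
    for s in range(dim):
        x_plus, x_minus = list(x), list(x)
        x_plus[s] += h
        x_minus[s] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[m][n] - g_minus[m][n]) / (2 * h)
                    for n in range(dim)] for m in range(dim)])
    g_inv = inverse_2x2(metric(x))
    return [[[0.5 * sum(g_inv[r][s] * (dg[m][s][n] + dg[n][m][s] - dg[s][m][n])
                        for s in range(dim))
              for n in range(dim)] for m in range(dim)] for r in range(dim)]


theta = 0.9
gamma = christoffel(sphere_metric, [theta, 0.3])
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

The symmetry of the result in its two lower indices reflects the torsion-free property of the Levi–Civita connection.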
Notice that equations (2.87) and (2.90) allow us to write the explicit solution of the GR equation of motion (2.6):

    \omega[e]^I_{\mu J} = e^\nu_J\, (\Gamma^\rho_{\mu\nu}\, e^I_\rho - \partial_\mu e^I_\nu),    (2.91)

where $\Gamma$ is given by (2.90) and $g$ by (2.82). Explicitly, this gives, with a bit of algebra,

    \omega[e]^{IJ}_\mu = 2\, e^{\nu[I} \partial_{[\mu} e_{\nu]}{}^{J]} + e_{\mu K}\, e^{\nu I} e^{\sigma J}\, \partial_{[\sigma} e_{\nu]}{}^K.    (2.92)
The Riemann tensor can be defined via

    R^\mu{}_{\nu\rho\sigma}\, e^I_\mu = R^I{}_{J\rho\sigma}\, e^J_\nu.    (2.93)

The Ricci tensor is

    R_{\mu\nu} = R^I_\mu\, e_{I\nu},    (2.94)

where $R^I_\mu$ is defined in (2.13). The energy-momentum tensor (see footnote 6 after (2.40)) is

    T_{\mu\nu} = T^I_\mu\, e_{I\nu}.    (2.95)

In terms of these quantities, the Einstein equations (2.40) read

    R_{\mu\nu} - \frac{1}{2}\, R\, g_{\mu\nu} + \lambda\, g_{\mu\nu} = 8\pi G\, T_{\mu\nu}.    (2.96)

The Minkowski solution is

    g_{\mu\nu}(x) = \eta_{\mu\nu},    (2.97)

where we see clearly that the spacetime Minkowski metric is nothing but a particular value of the gravitational field. With a straightforward calculation, the action (2.12) reads

    S[g] = \frac{1}{16\pi G} \int (R + \lambda)\, \sqrt{-\det g}\; d^4x.    (2.98)

The matter action cannot be written in metric variables.
Riemann geometry. The tensor $g$ equips the spacetime manifold $M$ with a metric structure: it defines a distance between any two points, and this distance is a smooth function on $M$. (More precisely, it defines a pseudo-metric structure, as distance can be imaginary.) Riemann studied the structure defined by $(M, g)$, called today a riemannian manifold, and defined the Riemann curvature tensor as a generalization of Gauss's theory of curved surfaces to an arbitrary number of dimensions. Riemann presented this mathematical theory as a general theory of "geometry" that generalizes Euclidean geometry. Einstein utilized this mathematical theory for describing the physical dynamics of the gravitational field. In retrospect, the reason this was possible is that, as understood by Einstein, the euclidean structure of the physical space in which we live is determined by the local gravitational field. Therefore elementary physical geometry is simply a description of the local properties of the gravitational field, as revealed by matter (rigid bodies) interacting with it. This point is discussed in more detail below in Section 2.2.3.
----------
The basic equations of GR presented in this section do not look too different from the equations of a prerelativistic^9 field theory, such as QED or the standard model. But the similarity can be very misleading. The physical interpretation of a general-relativistic theory is very different from the interpretation of a prerelativistic one. In particular, the meaning of the coordinates $x^\mu$ is different from their meaning in prerelativistic physics, and the gauge-invariant observables are not related to the fields as they are in prerelativistic physics.

9 Recall that in this book "relativistic" means general relativistic.
The process of understanding the physical meaning of the GR formalism has taken many decades, and perhaps it is not entirely concluded yet. For several decades after Einstein's discovery of the theory, for instance, it was not clear whether or not the theory predicted gravitational waves. The prevailing opinion was that wave solutions were only a coordinate artifact and did not represent physical waves capable of carrying energy and momentum, or, as Bondi put it, capable of "boiling a glass of water." This opinion was wrong, of course. Einstein himself badly misinterpreted the meaning of the Schwarzschild singularity. Wrong high-precision measurements of the Earth–Moon distance have been in the literature for a while, because of a mistake due to a conceptual confusion between physical and coordinate distance.

I do not want to give the impression that GR is "foggy." Quite the reverse: the fact that in all these and similar instances consensus has eventually emerged indicates that the conceptual structure of GR is secure. But to understand this conceptual structure, to understand how to use the equations of GR correctly, and how to relate the quantities appearing in these equations to the numbers measured in the laboratory or observed by the astronomers, is definitely a nontrivial problem. More generally, the problem is to understand what precisely GR says about the world. Clarity in this respect is essential if we want to understand the quantum physics of the theory.

In order to shed light on this problem, it is illuminating to retrace the conceptual path and the problems that led to the discovery of the theory. This is done in the following Section 2.2. The impatient reader may skip Section 2.2 and jump to Section 2.3, where the interpretation of GR is compactly presented (but impatience slows understanding).
2.2 The conceptual path to the theory

The roots of GR are in two distinct problems. Einstein's genius was to understand that the two problems solve each other.

2.2.1 Einstein's first problem: a field theory for the newtonian interaction

It was Newton who discovered dynamics. But to a large extent it was Descartes who, a generation earlier, fixed the general rules of the modern science of nature, or the Scientia Nova, as it was called at the time. One of Descartes' prescriptions was the elimination of all the "influences from far away" that plagued mediaeval science. According to Descartes, physical interactions happen only between contiguous entities – as in collisions, pushes and pulls. Newton violated this prescription, describing gravity as the instantaneous "action-at-a-distance" of the force

    F = G\, \frac{m_1 m_2}{d^2}.    (2.99)

Newton did not introduce action-at-a-distance with a light heart: he calls it "repugnant." His violation of the cartesian prescriptions was one of the reasons for the strong initial opposition to newtonianism. For many, his law of gravitation sounded too much like the discredited "influences from the stars" of the ineffective science of the Middle Ages. But the empirical success of Newton's dynamics and gravitational theory was so immense that most worries about action-at-a-distance dissipated.
Two centuries later, it is another Briton who finds the way to address the problem afresh. In an effort to understand electric and magnetic forces, Faraday introduces a new notion,^10 which is going to revolutionize modern physics: the notion of field. For Faraday, the field is a set of lines filling space. The Faraday lines begin and end on charges; in the absence of charges, each line closes, forming a loop. In his wonderful book, which is one of the pillars of modern physics and has virtually no equations, Faraday discusses whether the field is a real physical entity.^11 Maxwell formalizes Faraday's powerful physical intuition into a beautiful mathematical theory – a field theory. At each spacetime point, Maxwell's electric and magnetic fields represent the tangent to the Faraday line. There is no action-at-a-distance in the theory: the Coulomb description of the electric force between two charges, namely the instantaneous action-at-a-distance law

    F = k\, \frac{q_1 q_2}{d^2},    (2.100)

is understood to be correct only in the static limit. A charge $q_1$ at distance $d$ from another charge $q_2$ does not produce an instantaneous force on $q_2$, because if we move $q_1$ rapidly away, it takes a time $t = d/c$ before $q_2$ begins to feel any change. This is the time the interaction takes to move across space, at a finite speed, in a manner remarkably consistent with Descartes' prescription.

10 Many ideas of modern science have been resuscitated from hellenistic science [57]. Is the Faraday–Maxwell notion of field a direct descendant of the notion of πνευμα (pneuma), which appears for instance in Hipparchus as the carrier of the attraction of the Moon on the oceans, causing the tides, and which also appears in contexts related to magnetism [58]? Did Faraday know this notion?

11 "With regards to the great point under consideration, it is simply whether the lines of force have a physical existence or not. I think that the physical nature of the lines must be granted." [59] Strictly speaking, we can translate the problem in modern terms as to whether the field has degrees of freedom independent from the charges or not. But this doesn't diminish the ontological significance of Faraday's question, which seems to me transparent in these lines. Faraday's continuation is lovely: "And though I should not have raised the argument unless I had thought it both important and likely to be answered ultimately in the affirmative, I still hold the opinion with some hesitation, with as much, indeed, as accompanies any conclusion I endeavor to draw respecting points in the very depths of science." I think that Faraday's greatness shines in this "hesitation," which betrays his full awareness of the importance of the step he is taking (virtually all of modern fundamental physics comes out of these lines), as well as the full awareness of the risk of taking any major novel step.
When Einstein studies physics, Maxwell theory is only three decades old. In his writings, Einstein rhapsodizes on the beauty of Maxwell theory and the profound impression it made upon him. Given the formal similarity of the Newton and Coulomb forces (2.99) and (2.100), it is completely natural to suspect that (2.99) also is only correct in the static limit. Namely, that the gravitational force is not instantaneous either: if a neutron star rushing at great speed from the deep sky smashed away the Sun, it would take a finite time before any effect could be felt on Earth. That is, it is natural to suspect that there is a field theory behind Newton theory as well. Einstein set out to find this field theory. GR is what he found.
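To put a number on the delay $t = d/c$ in the Sun–Earth example, here is a short calculation. It assumes that the influence propagates at the speed of light $c$ (the argument above only requires that the delay be finite) and takes $d$ to be one astronomical unit. Both constants are exact by definition; the function name is an illustrative choice.

```python
C = 299_792_458.0        # speed of light in m/s (exact by definition)
AU = 149_597_870_700.0   # astronomical unit in m (exact by definition)


def delay_seconds(distance_m, speed=C):
    """Retardation time t = d / c for an influence crossing distance_m."""
    return distance_m / speed


sun_earth_delay = delay_seconds(AU)
print(f"{sun_earth_delay:.1f} s = {sun_earth_delay / 60:.2f} min")
```

The delay comes out at roughly 499 seconds, a little over eight minutes.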
Special relativity. In fact, the need for a field theory behind Newton law (2.99) is not just suggested by the Coulomb–Maxwell analogy: it is indirectly required by Maxwell theory. The reason is that Maxwell theory not only eliminated the apparent action-at-a-distance of Coulomb law (2.100), but it also led to a reorganization of the notions of space and time which, in turn, renders any action-at-a-distance inconsistent. This reorganization of the notions of space and time is special relativity, a key step towards GR.

In spite of its huge empirical success, Maxwell theory had an apparent flaw if taken as a fundamental theory:^12 it is not galilean invariant. Galilean invariance is a consequence of the equivalence of inertial frames – at least, it had always been understood as such. Inertial frame equivalence, or the fact that velocity is a relative notion, is one of the pillars of dynamics. The story goes that in the silent halls of Warsaw's University, an old and grave professor stormed out of his office like a madman, shouting "Eureka! Eureka! The new Archimedes is born!" when he saw Einstein's 1905 paper offering the solution of this apparent contradiction. The way Einstein solves the problem is an example of theoretical thinking at its best. I think it should be kept in mind as an exemplar when we consider the apparent contradictions between GR and QM.

12 Rather than as a phenomenological theory of the disturbances of a mechanical ether, whose dynamics is still to be found.
Einstein maintains his confidence in the galilean discovery that physics is the same in all moving inertial frames, and also maintains his confidence that Maxwell equations are correct, in spite of the apparent contradiction. He realizes that there is contradiction only because we implicitly hold a third assumption. By dropping this third assumption, the contradiction disappears. The third assumption regards the notion of time. It is the idea that it is always meaningful to say which of two distant events, A and B, happens first. Namely, that simultaneity is well defined in a manner independent of the observer. Einstein observes that this is a prejudice we have on the structure of reality. We can drop this prejudice and accept the fact that the temporal ordering of distant events may have no meaning. If we do so, the picture returns to consistency.

The success of special relativity was rapid, and the theory is today widely empirically supported and universally accepted. Still, I do not think that special relativity has really been fully absorbed even now: the large majority of cultivated people, as well as a surprisingly high number of theoretical physicists, still believe, deep in their heart, that there is something happening "right now" on Andromeda; that there is a single universal time ticking away the life of the Universe. Do you, my reader?

An immediate consequence of special relativity is that action-at-a-distance is not just "repugnant," as Newton felt: it is nonsense. There is no (reasonable) sense in which we can say that the force due to the mass $m_1$ acts on the mass $m_2$ "instantaneously." If special relativity is correct, (2.99) is not just likely to be the static limit of a field theory: it has to be the static limit of a field theory. When the neutron star hits the Sun, there is no "now" at which the Earth could feel the effect. The information that the Sun is no longer there must travel from Sun to Earth across space, carried by an entity. This entity is the gravitational field.

Maxwell → Einstein. Therefore, shortly after having worked out the key consequences of special relativity, Einstein attacks what is obviously the next problem: searching for the field theory that gives (2.99) in the static limit. His aim is to do for (2.99) what Faraday and Maxwell had done for (2.100). The result, in brief, is the following, expressed in modern language.
Maxwell's solution to the problem is to introduce the one-form field $A_\mu(x)$.

The force on the particles is

    \ddot x^\mu = e\, F^\mu{}_\nu\, \dot x^\nu,    (2.101)

where $F$ is constructed with the first derivatives of $A$.

$A$ satisfies the (Maxwell) field equations

    \partial_\mu F^{\nu\mu} = J^\nu,    (2.102)

a system of second-order partial differential equations for $A$, with the charge current $J^\nu$ as source.

More generally, the field equations can be obtained as Euler–Lagrange equations of the action

    S[A, {\rm matt}] = \frac{1}{4} \int F^* \wedge F + S_{\rm matt}[A, {\rm matt}],    (2.103)

where $F$ is the curvature of $A$.

$S_{\rm matt}$ is obtained from the matter action by replacing derivatives with covariant derivatives.

It follows that the source of the field equations is

    J^\mu = \frac{\delta}{\delta A_\mu}\, S_{\rm matt}[A, {\rm matt}].    (2.104)

Einstein's solution is to introduce the field $e^I_\mu(x)$, a one-form with value in Minkowski space.

The force on the particles is (eq. (2.44))

    \ddot x^\mu = -\Gamma^\mu_{\nu\rho}\, \dot x^\nu \dot x^\rho,    (2.105)

where $\Gamma$ is constructed with the first derivatives of $e$ (equation (2.45)).

$e$ satisfies the (Einstein) field equations (eq. (2.37), here with $\lambda = 0$)

    R^I_\mu - \frac{1}{2}\, e^I_\mu R = 8\pi G\, T^I_\mu,    (2.106)

a system of second-order partial differential equations for $e$, with the energy-momentum tensor $T^I_\mu$ as source.

More generally, the field equations can be obtained as Euler–Lagrange equations of the action ((2.36), in second-order form)

    S[e, {\rm matt}] = \frac{1}{16\pi G} \int e^I \wedge e^J \wedge R^{KL}\, \epsilon_{IJKL} + S_{\rm matt}[e, {\rm matt}],    (2.107)

where $R$ is the curvature of the connection $\omega$ compatible with $e$.

$S_{\rm matt}$ is obtained from the matter action by replacing derivatives with covariant derivatives and the Minkowski metric with the gravitational metric.

It follows that the source of the field equations is (2.37)

    T^I_\mu = \frac{\delta}{\delta e^I_\mu}\, S_{\rm matt}[e, {\rm matt}].    (2.108)
The structural similarity between the theories of Maxwell and Einstein is evident. However, this is only half of the story.

2.2.2 Einstein's second problem: relativity of motion

To understand Einstein's second problem, we have to return again to the origin of modern physics. In the western culture, there are two traditional ways of understanding what is "space": as an entity or as a relation.

"Space is an entity" means that space still exists when there is nothing else besides space. It exists by itself, and objects move in it. This is the way Newton describes space, and is called absolute space. It is also the way spacetime (rather than space) is understood in special relativity. Although considered since ancient times (in the democritean tradition), this way of understanding space was not the traditional dominant view in western culture. The dominant view, from Aristotle to Descartes, was to understand space as a relation.

"Space is a relation" means that the world is made up of physical objects, or physical entities. These objects have the property that they can be in touch with one another, or not. Space is this "touch", or "contiguity", or "adjacency" relation between objects. Aristotle, for instance, defines the spatial location of an object as the (internal) boundary of the set of the objects that surround it. This is relational space.

Strictly connected to these two ways of understanding space, there are two ways of understanding motion.

"Absolute motion." If space is an entity, motion can be defined as going from one part of space to another part of space. This is how Newton defines motion.

"Relative motion." If space is a relation, motion can only be defined as going from the contiguity of one object to the contiguity of another object. This is how Descartes^13 and Aristotle^14 define motion.
For a physicist, the issue is which of these two ways of thinking about space and motion allows a more effective description of the world.

For Newton, space is absolute and motion is absolute.^15 This is a second violation of cartesianism. Once more, Newton does not take this step with a light heart: he devotes a long initial section of the Principia to explaining the reasons for his choice. The strongest argument in Newton's favor is entirely a posteriori: his theoretical construction works extraordinarily well. Cartesian physics was never as effective. But this is not Newton's argument. Newton resorts to empirical evidence, discussing a famous experiment with a bucket.

13 "We can say that movement is the transference of one part of matter or of one body, from the vicinity of those bodies immediately contiguous to it, and considered at rest, into the vicinity of some others." (Descartes, Principia Philosophiae, Section II-25, p. 51) [60].

14 Aristotle insists that motion is relative. He illustrates the point with the example of a man walking on a boat. The man moves with respect to the boat, which moves with respect to the water of the river, which moves with respect to the ground. Aristotle's relationalism is tempered by the fact that there are preferred objects that can be used as a preferred reference: the Earth at the center of the Universe, and the celestial spheres, in particular one of the fixed stars. Thus, we can say that something is moving "in absolute terms" if it moves with respect to the Earth. However, there are two preferred frames in ancient cosmology, the Earth and the fixed stars, and the two rotate with respect to each other. The thinkers of the Middle Ages did not miss this point, and discussed at length whether the stars rotate around the Earth or the Earth rotates under the stars. Remarkably, in the fourteenth century Buridan concluded that neither view is more true than the other on grounds of reason, and Oresme studied the rotation of the Earth more than a century before Copernicus.

15 "So, it is necessary that the definition of places, and hence local motion, be referred to some motionless thing such as extension alone or space, in so far as space is seen truly distinct from moving bodies." [61] This is in open contrast with Descartes' definition, given in footnote 13.
Newtonrsquos bucket Consider a ldquobucket full of water hung by a long cordso often turned about that the cord is strongly twistedrdquo Whirl thebucket so that it starts rotating and the cord untwisting At first
(i) the bucket rotates (with respect to us) and the water remainsstill The surface of the water is flatThen the motion of the bucket is transmitted to the water byfriction and thus the water starts rotating together with thebucket At some time
(ii) the water and the bucket rotate together The surface of thewater is no longer flat it is concave
We know from experience that the concavity of the water is caused byrotation Rotation with respect to what Newtonrsquos bucket experimentshows something subtle about this question If motion is change of placewith respect to the surrounding objects as Descartes demands then wemust say that in (i) water rotates (with respect to the bucket whichsurrounds it) while in (ii) water is still (with respect to the bucket) Butobserves Newton the concavity of the surface appears in (ii) not in (i)It appears when the water is still with respect to the bucket not whenthe water moves with respect to the bucket Therefore the rotation thatproduces the physical effect is not the rotation with respect to the bucketIt is the rotation with respect to what
It is rotation with respect to space itself, answers Newton. The concavity of the water surface is an effect of the absolute motion of the water: the motion with respect to absolute space, not to the surrounding bodies. This, claims Newton, proves the existence of absolute space.
Newton's argument is subtle, and for three centuries nobody had been able to defeat it. To understand it correctly, we should lay to rest a common misunderstanding. Relationalism, namely the idea that motion can be defined only in relation to other objects, should not be confused with galilean relativity. Galilean relativity is the statement that "rectilinear uniform motion" is a priori indistinguishable from stasis. Namely, that velocity (just velocity) is relative to other bodies. Relationalism, on the other hand, holds that any motion (however zig-zagging) is a priori indistinguishable from stasis. The very formulation of galilean relativity assumes a nonrelational definition of motion: "rectilinear and uniform" with respect to what?
Now, when Newton claimed that motion with respect to absolute space is real and physical, he in a sense overdid it, insisting that even rectilinear uniform motion is absolute. This caused a painful debate, because there are no physical effects of inertial motion, and therefore the bucket argument fails for this particular class of motions.¹⁶ Therefore inertial motion and velocity are to be considered relative in newtonian mechanics.
What Newton needed for the foundation of dynamics – and what we are discussing here – is not the relativity of inertial motion: it is whether accelerated motion, exemplified by the rotation of the water in the bucket, is relative or absolute. The question here is not whether or not there is an absolute space with respect to which velocity can be defined. The question is whether or not there is an absolute space with respect to which acceleration can be defined. Newton's answer, supported by the bucket argument, was positive. Without this answer, Newton's main law

$$F = ma \qquad (2.109)$$

wouldn't even make sense.
Opposition to Newton's absolute space was even stronger than opposition to his action-at-a-distance. Leibniz and his school argued fiercely against Newton's absolute motion and Newton's use of absolute acceleration.¹⁷ Doubts never really disappeared down through subsequent centuries, and a lingering feeling remained that something was wrong in Newton's argument. At the end of the nineteenth century, Ernst Mach returned to the issue, suggesting that Newton's bucket argument could be wrong because the water does not rotate with respect to absolute space: it rotates with respect to the full matter content of the Universe. I will comment on this idea, and its influence on Einstein, in Section 2.4.1. But, as for action-at-a-distance, the immense empirical triumph of newtonianism could not be overcome.
Or could it? After all, in the early twentieth century, 43 seconds of arc per century in Mercury's orbit were observed, which Newton's theory didn't seem to be able to account for.
Generalize relativity. Einstein was impressed by galilean relativity. The velocity of a single object has no meaning; only the velocity of objects with respect to one another is meaningful. Notice that, in a sense, this is a failure of Newton's program of revealing the "true motions." It is a minor, but significant failure. For Einstein, this was a hint that there is something wrong in the newtonian (and special-relativistic) conceptual scheme.
¹⁶ Newton is well aware of this point, which is clearly stated in Corollary V of the Principia, but he chooses to ignore it in the introduction to the Principia. I think he did this just to simplify his argument, which was already hard enough for his contemporaries.

¹⁷ Leibniz had other reasons of complaint with Newton. The two were fighting over the priority for the invention of calculus – scientists' frailties remain the same in all centuries.
In spite of its immense empirical success, Newton's idea of an absolute space has something deeply disturbing in it. As Leibniz, Mach, and many others emphasized, space is a sort of extrasensorial entity that acts on objects but cannot be acted upon. Einstein was convinced that the idea of such an absolute space was wrong. There can be no absolute space, no "true motion." Only relative motion, and therefore relative acceleration, must be physically meaningful. Absolute acceleration should not enter physical equations. With special relativity, Einstein had succeeded in vindicating galilean relativity of velocities from the challenge of Maxwell theory. He was then convinced that he could vindicate the entire aristotelian–cartesian relativity of motion. In Einstein's terms, "the laws of motion should be the same in all reference frames, not just in the inertial frames." Things move with respect to one another, not with respect to an absolute space; there cannot be any physical effect of absolute motion.
According to many contemporary physicists, this is excessive weight given to "philosophical" thinking, which should not play a role in physics. But Einstein's achievements in physics are far more effective than the ones obtained by these physicists.
2.2.3 The key idea
The question addressed in Newton's bucket experiment is the following. The rotation of the water has a physical effect – the concavity of the water surface: with respect to what does the water "rotate"? Newton argues that the relevant rotation is not the rotation with respect to the surrounding objects (the bucket); therefore it is rotation with respect to absolute space. Einstein's new answer is simple and fulgurating:

The water rotates with respect to a local physical entity: the gravitational field.

It is the gravitational field, not Newton's inert absolute space, that tells objects if they are accelerating or not, if they are rotating or not. There is no inert background entity such as newtonian space: there are only dynamical physical entities. Among these are the fields. Among the fields is the gravitational field.
The flatness or concavity of the water surface in Newton's bucket is not determined by the motion of the water with respect to absolute space. It is determined by the physical interaction between the water and the gravitational field.
The two lines of Einstein's thinking about gravity (finding a field theory for the newtonian interaction, and getting rid of absolute acceleration) meet here. Einstein's key idea is that Newton has mistaken the gravitational field for an absolute space.

What leads Einstein to this idea? Why should newtonian acceleration be defined with respect to the gravitational field? The answer is given by the special properties of the gravitational interaction.¹⁸ These can be revealed by a thought experiment, called Einstein's elevator. I present below a modern and more realistic version of Einstein's elevator argument.
An "elevator" argument: newtonian cosmology. Here is a simple physical situation that illustrates that inertia and gravity are the same thing. The model is simple, but completely realistic. It leads directly to the physical intuition underlying GR.
In the context of newtonian physics, consider a universe formed by a very large spherical cloud of galaxies. Assume that the galaxies are – and remain – uniformly distributed in space, with a time-dependent density $\rho(t)$, and that they attract each other gravitationally. Let C be the center of the cloud. Consider a galaxy A (say, ours) at a distance $r(t)$ from the center C. As is well known, the gravitational force on A due to the galaxies outside a sphere of radius $r$ around C cancels out, and the gravitational force due to the galaxies inside this sphere is the same as the force due to the same mass concentrated in C. Therefore the gravitational force on A is

$$F = -G\, m_A\, \frac{\frac{4}{3}\pi r^3(t)\,\rho(t)}{r^2(t)}, \qquad (2.110)$$

or

$$\frac{d^2 r}{dt^2} = -G\,\frac{4}{3}\pi\, r(t)\,\rho(t). \qquad (2.111)$$

If the density remains spatially constant, it scales uniformly as $r^{-3}$. That is, $\rho(t) = \rho_0\, r^{-3}(t)$, where $\rho_0$ is a constant equal to the density at $r(t) = 1$. Therefore

$$\frac{d^2 r}{dt^2} = -\frac{4}{3}\pi G \rho_0\, \frac{1}{r^2(t)} = -\frac{c}{r^2(t)}, \qquad (2.112)$$

where

$$c = \frac{4\pi G \rho_0}{3} \qquad (2.113)$$

is a constant. Equation (2.112) is the Friedmann cosmological equation, which governs the expansion of the universe. (It is the same equation that one obtains from full GR in the spatially flat case.)
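The step from (2.111) to (2.112) can be checked numerically. The following is a minimal sketch, not part of the original text: the values of `G`, `RHO0`, the initial conditions, and the velocity-Verlet integrator are all illustrative assumptions. It verifies that substituting $\rho = \rho_0 r^{-3}$ into (2.111) reproduces $-c/r^2$, and that the quantity $\tfrac{1}{2}\dot r^2 - c/r$ (the standard first integral of an inverse-square law, not mentioned in the text) stays constant along a numerical solution.

```python
import math

G = 1.0      # gravitational constant in arbitrary units (illustrative assumption)
RHO0 = 0.3   # density at r = 1 (hypothetical value)

# The constant c of equation (2.113).
C = 4.0 * math.pi * G * RHO0 / 3.0


def accel_from_density(r):
    """Right-hand side of (2.111) with rho(t) = rho0 * r**-3 substituted."""
    rho = RHO0 * r ** -3
    return -G * (4.0 / 3.0) * math.pi * r * rho


def accel(r):
    """Right-hand side of (2.112): -c / r**2."""
    return -C / r ** 2


def integrate(r0, v0, dt, steps):
    """Velocity-Verlet integration of d^2r/dt^2 = -c/r^2; returns the final (r, v)."""
    r, v = r0, v0
    a = accel(r)
    for _ in range(steps):
        r += v * dt + 0.5 * a * dt * dt
        a_new = accel(r)
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return r, v


def energy(r, v):
    """First integral of (2.112) per unit mass: 0.5 * v**2 - c / r."""
    return 0.5 * v * v - C / r


r0, v0 = 1.0, 1.5                      # hypothetical initial distance and expansion speed
r1, v1 = integrate(r0, v0, dt=1e-4, steps=20000)
energy_drift = abs(energy(r1, v1) - energy(r0, v0))
```

With these choices the cloud keeps expanding while decelerating, which is the qualitative behavior the text attributes to the Friedmann equation.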
In the newtonian model we are considering, the galaxy C is in the center of the universe and defines an inertial frame, while the galaxy A is not in the center and is not inertial. Assume that the cloud is so large that its boundary cannot be observed from C or A. If you are in one of these two galaxies, how can you tell in which you are? That is, how can you tell whether you are in the inertial reference frame C or in the accelerated frame A?
The answer is, very remarkably, that you cannot. Since the entire cloud expands or contracts uniformly, the picture of the local sky looks uniformly expanding or contracting precisely in the same manner from all galaxies. But you cannot detect if you are in the inertial galaxy C or in the accelerated galaxy A by local experiments either. Indeed, to detect if you are in an accelerated frame you have to observe inertial forces
¹⁸ Gravity is "special" in the sense that newtonian absolute space is a configuration of the gravitational field. Once we get rid of the notion of absolute space, the gravitational interaction is no longer particularly special. It is one of the fields forming the world. But it is a very different world from that of Newton and Maxwell.
such as the ones that make the water surface of Newton's bucket concave. The A frame acceleration is

$$\mathbf{a} = \frac{c}{r^2(t)}\,\mathbf{u}, \qquad (2.114)$$

where $\mathbf{u}$ is a unit vector pointing towards C. Therefore there is an inertial force

$$\mathbf{F}_{\rm inertial} = -\frac{c}{r^2(t)}\,\mathbf{u} \qquad (2.115)$$

on all moving masses. This is the force that should allow us to detect that the frame is not inertial. However, all masses feel, besides the local forces $\mathbf{F}_{\rm local}$, also the cosmological gravitational pull towards C

$$\mathbf{F}_{\rm cosmological} = \frac{c}{r^2(t)}\,\mathbf{u}, \qquad (2.116)$$

so that their motion in the accelerated A frame is governed by

$$\begin{aligned} m\mathbf{a} &= \mathbf{F}_{\rm local} + \mathbf{F}_{\rm inertial} + \mathbf{F}_{\rm cosmological} &\qquad (2.117)\\ &= \mathbf{F}_{\rm local}, &\qquad (2.118)\end{aligned}$$

because (2.115) and (2.116) cancel out exactly. Therefore the local dynamics in A looks precisely as if it were inertial. The parabola of a falling stone in A, seen from the accelerated A frame, looks like a straight line. There is no way of telling if you are at the center, and no way of telling if you are inertial or not.
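The cancellation in (2.117) and (2.118) can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the position of galaxy A, the value of `c`, and the local force are hypothetical numbers, and the forces are taken for a unit test mass so that they match the way (2.115) and (2.116) are written.

```python
import math


def unit_towards_center(position):
    """Return (u, r): the unit vector from `position` towards C (the origin) and the distance r."""
    r = math.sqrt(sum(x * x for x in position))
    return tuple(-x / r for x in position), r


def forces_in_A_frame(f_local, position, c):
    """Return (F_inertial, F_cosmological, total) for a unit test mass at `position`."""
    u, r = unit_towards_center(position)
    f_inertial = tuple(-c / r ** 2 * ui for ui in u)        # equation (2.115)
    f_cosmological = tuple(c / r ** 2 * ui for ui in u)     # equation (2.116)
    total = tuple(fl + fi + fc                              # equation (2.117)
                  for fl, fi, fc in zip(f_local, f_inertial, f_cosmological))
    return f_inertial, f_cosmological, total


# Hypothetical numbers: galaxy A at (3, 4, 0), c = 2, and some local force.
f_local = (0.3, -0.1, 0.7)
f_inertial, f_cosmological, total = forces_in_A_frame(f_local, (3.0, 4.0, 0.0), c=2.0)
```

Up to floating-point rounding, `total` equals `f_local`, which is the content of (2.118): an observer in A sees only the local forces.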
How do we interpret this impossibility of detecting the inertial frame? According to newtonian physics, the dynamics in C or A should be completely different. But this difference is not physically observable. In the newtonian conceptual scheme, A is noninertial, there are gravitational forces and inertial forces, but there is a sort of conspiracy that hides both of them. In fact, the situation is completely general: in a sufficiently small region, inertial and gravitational forces cancel to any accuracy in a free-falling reference system.¹⁹ It is clear that there should be a better way of understanding this physical situation, without resorting to all these unobservable forces.
The better way is to drop the newtonian preferred global frame, and to realize that each galaxy has its own local inertial reference frame. We can define a local inertial frame by the absence of observable inertial effects, as in newtonian physics. Each galaxy then has its local inertial frame. These
¹⁹ This is the equivalence principle. By the way, Newton, the genius, knew it: "If bodies, moved among themselves, are urged in the direction of parallel lines by equal accelerative forces, they will all continue to move among themselves, after the same manner as if they had not been urged by those forces." (Newton, Principia, Corollary VI to the "Laws of Motion") [62]. Newton uses this corollary for computing the complicated motion of the Moon in the Solar System. In the frame of the Earth, inertial forces and the solar gravity cancel out with good approximation, and the Moon follows a keplerian orbit.
frames are determined by the gravitational force. That is, it is gravity that determines at each point what is inertial. Inertial motion is such with respect to the local gravitational field, not with respect to absolute space.
Gravity determines then the way the frames of different galaxies fall with respect to one another. The gravitational field expresses the relation between the various inertial frames. It is the gravitational field that determines inertial motion. Newton's true motion is not motion with respect to absolute space: it is motion with respect to a frame determined by the gravitational field. It is motion relative to the gravitational field. Equation (2.109) governs the motion of objects with respect to the gravitational field.
The form of the gravitational field. Recall that Einstein's problem was to describe the gravitational field. The discussion above indicates that the gravitational field can be viewed as the field that determines, at each point of spacetime, the preferred frames in which motion is inertial. Let us write the mathematics that expresses this intuition.
Return to the cloud of galaxies. Since we have dropped the idea of a global inertial reference system, let us coordinatize events in the cloud with arbitrary coordinates $x = (x^\mu)$. The precise physical meaning of these coordinates is discussed in detail in the next section. Let $x^\mu_A$ be coordinates of a particular event A, say in our galaxy. Since these coordinates are arbitrarily chosen, motion described in the coordinates $x^\mu$ is, in general, not inertial in our galaxy. For instance, particles free from local forces do not follow straight lines. But we can find a locally inertial reference frame around A. Let us denote the coordinates it defines as $X^I$, and take the event A as the origin, so that $X^I(A) = 0$. The coordinates $X^I$ can be expressed as functions

$$X^I = X^I(x) \qquad (2.119)$$

of the arbitrary coordinates $x$. In the $x$ coordinates, the noninertiality of the motion in A is gravity. Gravity in A is the information of the change of coordinates that takes us to inertial coordinates. This information is contained in the functions (2.119). But only the value of these functions in a small neighborhood around A is relevant, because if we move away the local inertial frame will change. Therefore we can Taylor-expand (2.119) and keep only the first nonvanishing term. As $X^I(A) = 0$, to first nonvanishing order we have

$$X^I(x) = e^I_\mu(x_A)\, x^\mu, \qquad (2.120)$$
where we have defined

$$e^I_\mu(x_A) = \left.\frac{\partial X^I(x)}{\partial x^\mu}\right|_{x = x(A)}. \qquad (2.121)$$

The quantity $e^I_\mu(x_A)$ contains all the information we need to know the local inertial frame in A. The construction can be repeated at each point $x$. The quantity

$$e^I_\mu(x) = \left.\frac{\partial X^I(x)}{\partial x^\mu}\right|_{x}, \qquad (2.122)$$

where $X^I$ are now inertial coordinates at $x$, is the gravitational field at $x$. This is the form of the field introduced in Section 2.1.1.

The gravitational field $e^I_\mu(x)$ is therefore the jacobian matrix of the change of coordinates from the $x$ coordinates to the coordinates $X^I$ that are locally inertial at $x$. The field $e^I_\mu(x)$ is also called the "tetrad" field, from the Greek word for "four," or the "soldering form," because it "solders" a Minkowski vector bundle to the tangent bundle, or, following Cartan, the "moving frame," although there is nothing moving about it.
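To make the jacobian definition (2.122) concrete, here is a minimal two-dimensional sketch. The change of coordinates used, $X = (z\sinh t,\; z\cosh t)$ with $x = (t, z)$, is an illustrative assumption (an accelerated-observer chart of flat spacetime) and not an example taken from the text; the helper names are hypothetical. The tetrad is obtained by central finite differences and compared with the analytic partial derivatives.

```python
import math


def inertial_coords(x):
    """Hypothetical X^I(x): arbitrary coordinates x = (t, z) -> inertial X = (X^0, X^1)."""
    t, z = x
    return [z * math.sinh(t), z * math.cosh(t)]


def tetrad_at(X_of_x, x, h=1e-6):
    """Approximate e[I][mu] = dX^I/dx^mu at the point x by central differences, as in (2.122)."""
    n_out = len(X_of_x(x))
    e = [[0.0] * len(x) for _ in range(n_out)]
    for mu in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[mu] += h
        x_minus[mu] -= h
        X_plus, X_minus = X_of_x(x_plus), X_of_x(x_minus)
        for I in range(n_out):
            e[I][mu] = (X_plus[I] - X_minus[I]) / (2.0 * h)
    return e


x_A = [0.5, 2.0]                      # coordinates of the event A (hypothetical values)
e_A = tetrad_at(inertial_coords, x_A)

# Analytic jacobian of the same map, for comparison.
t, z = x_A
e_exact = [[z * math.cosh(t), math.sinh(t)],
           [z * math.sinh(t), math.cosh(t)]]
```

Because this toy spacetime is flat, the same functions $X^I(x)$ serve at every point; in a genuine gravitational field a different set of locally inertial coordinates is needed at each $x$, which is why the text defines $e^I_\mu(x)$ point by point.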
Transformation properties. If the coordinate system $X^I$ defines a local inertial system at a given point, so does any other local coordinate system $Y^J = \Lambda^J{}_I X^I$, where $\Lambda$ is a Lorentz transformation. Therefore the index $I$ of $e^I_\mu(x)$ transforms as a Lorentz index under a local Lorentz transformation, and the two fields $e^I_\mu(x)$ and

$$e'^J_\mu(x) = \Lambda^J{}_I(x)\, e^I_\mu(x) \qquad (2.123)$$

represent the same physical gravitational field. Thus, this description of gravity has a local Lorentz gauge invariance.
What happens if, instead of using the physical coordinates $x$, we had chosen coordinates $y = y(x)$? The chain rule determines the field $e'^I_\nu(y)$ that we would have found had we used coordinates $y$:

$$e'^I_\nu(y) = \frac{\partial x^\mu(y)}{\partial y^\nu}\, e^I_\mu(x(y)). \qquad (2.124)$$

The transformation properties (2.123) and (2.124) are precisely the transformation properties (2.58) and (2.63) under which the GR action is invariant.
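One way to see that the two fields in (2.123) carry the same physical information is to check that the combination $\eta_{IJ}\, e^I_\mu e^J_\nu$, which the text uses later for lengths and clock readings, does not change under a Lorentz transformation of the index $I$. The sketch below does this in two spacetime dimensions; the tetrad entries and the rapidity are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import math

ETA = [[-1.0, 0.0],
       [0.0, 1.0]]      # Minkowski tensor eta_IJ in two dimensions


def boost(rapidity):
    """A Lorentz transformation Lambda^J_I: a boost with the given rapidity."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh],
            [sh, ch]]


def lorentz_transform(lam, e):
    """Equation (2.123): e'[J][mu] = sum_I lam[J][I] * e[I][mu]."""
    return [[sum(lam[J][I] * e[I][mu] for I in range(2)) for mu in range(2)]
            for J in range(2)]


def metric_from_tetrad(e):
    """g[mu][nu] = sum_IJ eta_IJ * e[I][mu] * e[J][nu]."""
    return [[sum(ETA[I][J] * e[I][mu] * e[J][nu] for I in range(2) for J in range(2))
             for nu in range(2)]
            for mu in range(2)]


e = [[2.0, 0.3],
     [0.1, 1.5]]                       # tetrad at one point (hypothetical values)
e_prime = lorentz_transform(boost(0.7), e)

g = metric_from_tetrad(e)
g_prime = metric_from_tetrad(e_prime)
```

The components of `e_prime` differ from those of `e`, yet `g_prime` agrees with `g` up to rounding, because a boost satisfies $\Lambda^T \eta \Lambda = \eta$.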
These transformation laws are also the ones of a one-form field valued in a vector bundle $P$ over the spacetime manifold $M$, whose fiber is Minkowski space $\mathcal{M}$, associated with a principal SO(3, 1) Lorentz bundle. This is a natural geometric setting for the gravitational field. The connection $\omega$ defined in Section 2.1.1 is a connection of this bundle. This setting realizes the physical picture of a patchwork of Minkowski spaces, suggested by the cloud of galaxies, carrying Lorentz frames at each galaxy. More precisely, the gravitational field can be viewed as a map $e: TM \to P$ that sends tangent vectors to Lorentz vectors.
Matter. Finally, consider a particle moving in spacetime along a worldline $x^\mu(\tau)$. If a particle has velocity $v^\mu = dx^\mu/d\tau$ at a point $x$, its velocity in local Minkowski coordinates $X^I$ at $x$ is

$$u^I = \left.\frac{\partial X^I(x)}{\partial x^\mu}\right|_{x} v^\mu = e^I_\mu(x)\, v^\mu. \qquad (2.125)$$

In this local Minkowski frame, the infinitesimal action along the trajectory is

$$dS = m\sqrt{-\eta_{IJ}\, u^I u^J}\; d\tau. \qquad (2.126)$$

Therefore the action along the trajectory is the one given in (2.41). The same argument applies to all matter fields: the action is a sum over spacetime of local terms, which can be inferred from their Minkowski space equivalent.
Metric geometry. In Section 2.1.4 we saw that the gravitational field $e$ defines a metric structure over spacetime. One is often tempted to give excessive significance to this structure, as if distance was an essential property of reality. But there is no a priori kantian notion of distance needed to understand the world. We could have developed physics without ever thinking about distances, and still have retained the complete predictive and descriptive power of our theories.
What is the physical meaning of the spacetime metric structure? What do we mean when we say that two points are 3 centimeters apart, or two events are 3 seconds apart?
The answer is in the dynamics of matter interacting with the gravitational field. Let us first consider Minkowski space. Consider two objects A and B that are 3 centimeters apart. This means that if we put a ruler between the two points, the part of the ruler that fits between the two is marked 3 cm. The shape of the ruler is determined by the Maxwell and Schrödinger equations at the atomic level. These equations contain the Minkowski tensor $\eta_{IJ}$. They have stable solutions in which the molecules maintain positions (better: vibrate around equilibrium positions) at a fixed "distance" $L$ from one another. $L$ is determined by the constants in these equations. This means that the molecules maintain positions at points with coordinate distances $\Delta x^I$ such that

$$\eta_{IJ}\, \Delta x^I \Delta x^J = L^2. \qquad (2.127)$$

We exploit this peculiar behavior of condensed matter for coordinatizing spacetime locations. That is, "distance" is nothing but a convenient manner for labeling locations determined by material objects (the ruler), whose dynamics is governed by certain equations. We could avoid mentioning distance by saying: a number $N = 3\,[\mathrm{cm}]/L$ of molecules obeying the Maxwell and Schrödinger equations with given initial values fit between A and B.
Consider now the same situation in a gravitational field $e$. Again, the fact that two points A and B are 3 centimeters apart means that we can fit the $N$ molecules of the ruler between A and B. But now the dynamics of the molecules is determined by their interaction with the gravitational field. The Maxwell and Schrödinger equations have stable solutions in which the molecules keep themselves at coordinate distances $\Delta x^\mu$ such that

$$\eta_{IJ}\, e^I_\mu(x)\, e^J_\nu(x)\, \Delta x^\mu \Delta x^\nu = L^2. \qquad (2.128)$$

Thus, a measure of distance is a measurement of the local gravitational field, performed exploiting the peculiar way matter interacts with gravity.
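As a small worked instance of (2.127) and (2.128), the sketch below evaluates the left-hand side for a given tetrad and coordinate separation in two spacetime dimensions. The tetrad values are hypothetical and chosen only so that the arithmetic is transparent: the same physical length $L = 3$ corresponds to a coordinate separation of 3 in the flat tetrad, but of 1.5 when the spatial component of the tetrad is doubled.

```python
ETA = [[-1.0, 0.0],
       [0.0, 1.0]]      # Minkowski tensor eta_IJ in two dimensions


def interval_squared(e, dx):
    """Left-hand side of (2.128): eta_IJ e[I][mu] e[J][nu] dx[mu] dx[nu]."""
    # First map the coordinate separation to the local Lorentz frame ...
    u = [sum(e[I][mu] * dx[mu] for mu in range(len(dx))) for I in range(len(e))]
    # ... then contract with the Minkowski tensor.
    return sum(ETA[I][J] * u[I] * u[J] for I in range(2) for J in range(2))


flat_tetrad = [[1.0, 0.0],
               [0.0, 1.0]]       # Minkowski space in inertial coordinates
stretched_tetrad = [[1.0, 0.0],
                    [0.0, 2.0]]  # hypothetical field value at the ruler's location

L2_flat = interval_squared(flat_tetrad, [0.0, 3.0])            # purely spatial separation
L2_stretched = interval_squared(stretched_tetrad, [0.0, 1.5])  # smaller coordinate separation
```

Both calls return the same $L^2$, so the same number of molecules fits between the two points in each case: the coordinate separation alone carries no physical meaning until the value of $e$ is specified.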
The same is true for temporal intervals. Consider two events A and B that happen in time. The meaning that 3 seconds have elapsed between A and B is that a second-ticking clock has ticked three times in this time interval. The physical system that we use as a clock interacts with the gravitational field. The pace of the clock is determined by the local value of $e$. Thus, a clock is nothing but a device measuring an extensive function of the gravitational field along a worldline going from A to B.
Imagine that a particle falls along a timelike geodesic from A to B. We know from special relativity that the increase of the action of the particle in the particle frame is

$$dS = m\, dt, \qquad (2.129)$$

where $m$ is the particle mass. Therefore a clock comoving with the particle will measure the quantity

$$T = \frac{1}{m}\, S = \int_A^B d\tau\, \sqrt{-\eta_{IJ}\, e^I_\mu e^J_\nu\, \dot x^\mu \dot x^\nu}. \qquad (2.130)$$

Thus a clock is a device for measuring a function $T$ of the gravitational field. In general, any metric measurement is nothing but a measurement of a nonlocal function of the gravitational field.
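The integral (2.130) is easy to evaluate numerically for simple cases. The following sketch, in two spacetime dimensions and with hypothetical helper names, approximates it with a midpoint rule and finite-difference velocities. Two illustrative cases are assumed: a clock moving at constant coordinate velocity 0.6 in the flat tetrad, and a clock at rest in a constant tetrad whose time component is 0.5.

```python
import math

ETA = [[-1.0, 0.0],
       [0.0, 1.0]]      # Minkowski tensor eta_IJ in two dimensions


def clock_reading(worldline, tetrad, lam_start, lam_end, steps=1000):
    """Midpoint-rule approximation of (2.130) along x(lam) = worldline(lam)."""
    dlam = (lam_end - lam_start) / steps
    total = 0.0
    for k in range(steps):
        lam = lam_start + (k + 0.5) * dlam
        x_before = worldline(lam - 0.5 * dlam)
        x_after = worldline(lam + 0.5 * dlam)
        xdot = [(b - a) / dlam for a, b in zip(x_before, x_after)]   # dx^mu / dlam
        e = tetrad(worldline(lam))
        u = [sum(e[I][mu] * xdot[mu] for mu in range(2)) for I in range(2)]
        total += math.sqrt(-sum(ETA[I][J] * u[I] * u[J]
                                for I in range(2) for J in range(2))) * dlam
    return total


def flat(x):
    """Flat-space tetrad in inertial coordinates."""
    return [[1.0, 0.0], [0.0, 1.0]]


def slow_time(x):
    """Hypothetical constant tetrad in which coordinate time runs twice as fast as clocks."""
    return [[0.5, 0.0], [0.0, 1.0]]


moving_clock = clock_reading(lambda lam: (lam, 0.6 * lam), flat, 0.0, 10.0)
resting_clock = clock_reading(lambda lam: (lam, 0.0), slow_time, 0.0, 10.0)
```

The first value reproduces the special-relativistic time dilation factor $\sqrt{1 - 0.6^2} = 0.8$ over ten units of coordinate time; the second shows the clock reading responding directly to the value of $e$.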
This is true in an arbitrary gravitational field $e$ as well as in flat space. In flat space, we can use these measurements for determining positions with respect to the gravitational field. Since the flat-space gravitational field is Newton's absolute space, these measurements locate points in spacetime.
2.2.4 Active and passive diffeomorphisms
Before getting to the last, and main, step in Einstein's discovery of GR, we need the notion of active diffeomorphism. I introduce this notion with an example.
Consider the surface of the Earth, and call it $M$. At each point $P \in M$ on Earth, say the city of Paris, there is a certain temperature $T(P)$. The temperature is a scalar function $T: M \to \mathbb{R}$ on the Earth's surface. Imagine a simplified model of weather evolution in which the only factor determining temperature change was the displacement of air due to wind. By this I mean the following. Fix a time interval: say we call $T$ the temperature on May 1st and $\tilde T$ the temperature on May 2nd. During this time interval, the winds move the air which is over a point $Q = \phi(P)$ to the point $P$. If, say, $Q$ is the French village of Quintin, this means that the winds have blown the air of Quintin to Paris. Assume the temperature $\tilde T(P)$ of Paris on May 2nd is equal to the temperature $T(Q)$ of Quintin the day before. The "wind" map $\phi$ is a map from the Earth's surface to itself, which associates with each point $P$ the point $Q$ from which the air has been blown by the wind. From May 1st to May 2nd, the temperature field changes then as follows:

$$T(P) \to \tilde T(P) = T(\phi(P)). \qquad (2.131)$$
Assuming it is smooth and invertible, the map $\phi: M \to M$ is an active diffeomorphism. The scalar field $T$ on $M$ is transformed by this active diffeomorphism as in (2.131): it is "dragged" along the surface of the Earth by the diffeomorphism $\phi$. Notice that coordinates play no role in all this.
Now imagine that we choose certain geographical coordinates $x$ to coordinatize the surface of the Earth. For instance, latitude and longitude, namely the polar coordinates $x = (\theta, \varphi)$, with $\varphi = 0$ being Greenwich. Using these coordinates, the temperature is represented by a function of the coordinates $T(x)$. The May 1st temperature $T(x)$ and the May 2nd temperature $\tilde T(x)$ are related by

$$\tilde T(x) = T(\phi(x)). \qquad (2.132)$$

For instance, if the wind has blown uniformly westward by 2°20′ (Quintin is 2°20′ west of Paris), then

$$\tilde T(\theta, \varphi) = T(\theta, \varphi + 2^\circ 20'). \qquad (2.133)$$

Of course, there is nothing sacred about this choice of coordinates. For instance, the French might resent that the origin of the coordinates is Greenwich, and have it pass through Paris instead. Thus, the French would describe the same temperature field that the British describe as $T(\theta, \varphi)$ by means of different polar coordinates, defined by $\varphi = 0$ being Paris. Since Paris is 2°20′ East of Greenwich, for the French the temperature field on May 1st is

$$T'(\theta, \varphi) = T(\theta, \varphi + 2^\circ 20'). \qquad (2.134)$$

This is a change of coordinates, or a passive diffeomorphism.

Now, the two equations (2.133) and (2.134) look precisely the same. But it would be silly to confuse them. In (2.133), $\tilde T(\theta, \varphi)$ is the temperature on May 2nd, while in (2.134) $T'(\theta, \varphi)$ is the temperature on May 1st, but written in French coordinates. In summary, the first equation represents a change in the temperature field due to the wind, the second equation represents a change in convention. The first equation describes an "active diffeomorphism," the second a change of coordinates, also called a "passive diffeomorphism."
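The distinction can be spelled out in a few lines of code. In this sketch the May 1st temperature profile `temp_may1` is an invented function, and the latitude value used for Paris is a placeholder; only the shift of 2°20′ and the form of (2.133) and (2.134) come from the text. The two derived functions have identical bodies, which is exactly the formal coincidence the text warns about; what differs is what they describe.

```python
import math

SHIFT = 2.0 + 20.0 / 60.0   # 2 degrees 20 minutes, in degrees


def temp_may1(theta, phi):
    """Hypothetical May 1st temperature field in Greenwich-based coordinates (degrees)."""
    return 15.0 + 10.0 * math.cos(math.radians(theta)) + 5.0 * math.sin(math.radians(phi))


def temp_may2(theta, phi):
    """Active diffeomorphism, equation (2.133): a NEW field, in the SAME coordinates."""
    return temp_may1(theta, phi + SHIFT)


def temp_may1_french(theta, phi):
    """Passive diffeomorphism, equation (2.134): the SAME field, in Paris-based coordinates."""
    return temp_may1(theta, phi + SHIFT)


THETA_PARIS = 41.0   # placeholder value for the theta coordinate of Paris

# Paris sits at phi = SHIFT for the British and at phi = 0 for the French.
paris_may1_british = temp_may1(THETA_PARIS, SHIFT)
paris_may1_french = temp_may1_french(THETA_PARIS, 0.0)
paris_may2_british = temp_may2(THETA_PARIS, SHIFT)
```

The first two numbers coincide, because they are the same temperature at the same place and time written in two conventions. The third is in general different, because the wind has physically changed the field.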
Given a manifold $M$, an active diffeomorphism $\phi$ is a smooth invertible map from $M$ to $M$. A scalar field $T$ on $M$ is a map $T: M \to \mathbb{R}$. Given an active diffeomorphism $\phi$, we define the new scalar field $\tilde T$ transformed by $\phi$ as

$$\tilde T(P) = T(\phi(P)). \qquad (2.135)$$

Coordinates play no role in this.

A coordinate system $x$ on a $d$-dimensional manifold $M$ is an invertible differentiable map from (an open set of) $M$ to $\mathbb{R}^d$. Given a field $T$ on $M$, this map determines the function $t: \mathbb{R}^d \to \mathbb{R}$ defined by $t(x) = T(P(x))$, called "the field $T$ in coordinates $x$."²⁰ A passive diffeomorphism is an
²⁰ In the physics literature, the two maps $T: M \to \mathbb{R}$ and $t = T \circ x^{-1}: \mathbb{R}^d \to \mathbb{R}$
invertible differentiable map $\phi: \mathbb{R}^d \to \mathbb{R}^d$ that defines a new coordinate system $x'$ on $M$ by $x(P) = \phi(x'(P))$. The value of the field $T$ in coordinates $x'$ is given by

$$t'(x') = t(\phi(x')). \qquad (2.136)$$
Beware the formal similarity between (2.135) and (2.136).

The above extends immediately to all structures on $M$. For instance, an active diffeomorphism $\phi$ carries a one-form field $e$ on $M$ to the new one-form field $\tilde e = \phi^* e$, the pull-back of $e$ under $\phi$, and so on.
In particular, a metric $d: M \times M \to \mathbb{R}^+$ is an assignment of a distance $d(A, B)$ between any two points A and B of $M$. An active diffeomorphism defines the new metric $\tilde d$ given by $\tilde d(A, B) \equiv d(\phi^{-1}(A), \phi^{-1}(B))$. The two metrics $d$ and $\tilde d$ are isometric but distinct.²¹ An equivalence class of metrics under active diffeomorphisms is sometimes called a "geometry." Given a coordinate system, we can represent a (Riemannian) metric $d$ by means of a tensor field on $\mathbb{R}^d$: Riemann's metric tensor $g_{\mu\nu}(x)$, or, equivalently, the tetrad field $e^I_\mu(x)$. Under a change of coordinate system, the same metric is represented by a different $g_{\mu\nu}(x)$ or $e^I_\mu(x)$.
The example of the Earth's temperature given above illustrates a peculiar relation between active and passive diffeomorphisms: given two temperature fields $T$ and $\tilde T$ related by an active diffeomorphism, we can always find a coordinate transformation such that in the new coordinates $\tilde T$ is represented by the same function as $T$ in the old coordinates. This simple mathematical observation is at the root of Einstein's arguments that I will describe below. (The argument will be essentially that a theory that does not distinguish coordinate systems cannot distinguish fields related by active diffeomorphisms either.)
More precisely, the relation between active and passive diffeomorphisms is as follows. The group of the active diffeomorphisms acts on the space
are always indicated with the same symbol, generating confusion between active and passive diffeomorphisms. In this paragraph I use distinct notations. In the rest of the text, however, I shall adhere to the standard notation and indicate the field and its coordinate representation with the same symbol.
²¹ Here is an example of isometric but distinct metrics. The 2001 Shell road-map says that the distances between New York (NY), Chicago (C), and Kansas City (KC) are $d(\mathrm{NY}, \mathrm{C}) = 100$ miles, $d(\mathrm{C}, \mathrm{KC}) = 50$ miles, $d(\mathrm{KC}, \mathrm{NY}) = 100$ miles, while the 2002 Lonely Planet tourist guide claims that these distances are $\tilde d(\mathrm{NY}, \mathrm{C}) = 100$ miles, $\tilde d(\mathrm{C}, \mathrm{KC}) = 100$ miles, $\tilde d(\mathrm{KC}, \mathrm{NY}) = 50$ miles. Obviously these are not the same distances. But they are isometric: the two are transformed into each other by the active diffeomorphism $\phi(\mathrm{NY}) = \mathrm{C}$, $\phi(\mathrm{C}) = \mathrm{KC}$, $\phi(\mathrm{KC}) = \mathrm{NY}$.
Fig. 2.2 Active and passive diffeomorphisms. [Figure not reproduced. Labels in the figure: space of metrics $d$; space of functions $g_{\mu\nu}(x)$; orbit of the active diffeomorphism group; orbit of the passive diffeomorphism group; coordinate system $S$; coordinate system $S'$.]
of metrics $d$. The group of passive diffeomorphisms acts on the space of functions $g_{\mu\nu}(x)$. The orbits of the first group are in natural one-to-one correspondence with the orbits of the second. However, the relation between the individual metrics $d$ and the individual functions $g_{\mu\nu}(x)$ depends on the coordinate system chosen. The situation is illustrated in Figure 2.2.
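The road-map example of footnote 21 gives a finite instance of how an active diffeomorphism acts on a metric, $\tilde d(A, B) = d(\phi^{-1}(A), \phi^{-1}(B))$. The sketch below encodes the three cities and the two distance tables from the footnote and checks that the second table is the first one carried along by $\phi$. The dictionary layout and function name are illustrative choices.

```python
# The active "diffeomorphism" of footnote 21, acting on three points.
phi = {"NY": "C", "C": "KC", "KC": "NY"}
phi_inverse = {image: point for point, image in phi.items()}

# Distances (in miles) according to the two sources quoted in the footnote.
d_shell = {
    frozenset(("NY", "C")): 100,
    frozenset(("C", "KC")): 50,
    frozenset(("KC", "NY")): 100,
}
d_lonely_planet = {
    frozenset(("NY", "C")): 100,
    frozenset(("C", "KC")): 100,
    frozenset(("KC", "NY")): 50,
}


def transformed_metric(d, phi_inv):
    """Return d_tilde with d_tilde(A, B) = d(phi_inv(A), phi_inv(B))."""
    return {pair: d[frozenset(phi_inv[p] for p in pair)] for pair in d}


d_tilde = transformed_metric(d_shell, phi_inverse)
```

Here `d_tilde` coincides with `d_lonely_planet` although `d_shell` does not: the two tables are distinct metrics on the same three points, but they lie on the same orbit of the active diffeomorphism group and so define the same "geometry."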
2.2.5 General covariance
Around 1912, using the idea that any motion is relative, Einstein had found the form of the gravitational field as well as the equations of motion of matter in a given gravitational field. This was already a remarkable achievement, but the field equations for the gravitational field were still missing. In fact, the best part of the story had yet to come.
Two problems remained open: the field equations, and understanding the physical meaning of the coordinates $x^\mu$ introduced above. Einstein struggled with these two problems during the years 1912–1915, trying several solutions and changing his mind repeatedly. Einstein has called this search his "struggle with the meaning of the coordinates." The struggle was epic. The result turned out to be amazing. In Einstein's words, it was "beyond my wildest expectations."
To increase Einstein's stress, Hilbert, probably the greatest mathematician at the time, was working on the same problem, trying to be first to find the gravitational field equations. The fact that Hilbert, with his far superior mathematical skills, could not find these equations first testifies to the profound differences between fundamental physical problems and mathematical problems.
In his search for the field equations, Einstein was guided by several pieces of information. First, the static limit of the field equations must yield the Newton law, as the static limit of Maxwell theory yields the Coulomb law. Second, the source of Coulomb law is charge, and the charge density is the temporal component of four-current $J^\mu(x)$, which is the source of Maxwell equations. The source of the Newtonian interaction is mass. Einstein had understood with special relativity that mass is in fact a form of energy, and that the energy density is the temporal component of the energy-momentum tensor $T_{\mu\nu}(x)$. Therefore $T_{\mu\nu}(x)$ had to be the likely source of the field equations. Third, the introduction of the gravitational field was based on the use of arbitrary coordinates; therefore there should be some form of covariance under arbitrary changes of coordinates in the field equations. Einstein searched for covariant second-order equations as relations between tensorial quantities, since these are unaffected by coordinate change. He learned from Riemannian geometry that the only combination of second derivatives of the gravitational field that transforms tensorially is the Riemann tensor $R^\mu{}_{\nu\rho\sigma}(x)$. This was in fact Riemann's major result. Einstein knew all this in 1912. To derive Einstein's field equations (2.97) from these ideas is a simple calculation, presented in all GR textbooks, and which a good graduate student can today repeat easily. Still, Hilbert couldn't do it, and Einstein got stuck for several years. What was the problem?
The problem was "the meaning of the coordinates." Here is the story.
1. Einstein for general covariance. At first, Einstein demands the field equations for the gravitational field $e^I_\mu(x)$ to be generally covariant on $M$. This means that if $e^I_\mu(x)$ is a solution, then $e'^I_\nu(y)$ defined in (2.124) should also be a solution. For Einstein, this requirement (unheard of at the time) was the formalization of the idea that the laws of nature must be the same in all reference frames, and therefore in all coordinate systems.
2 Einstein against general covariance In 1914 however Einstein con-vinces himself that the field equations should not be generally covariant
22 The conceptual path to the theory 67
t = 0
M
AB
M
AB
f
e e~
(a) (b)
Fig 23 The active diffeomorphism φ drags the nonflat (wavy) gravitationalfield from the point B to the point A
[63]. Why? Because Einstein rapidly understands the physical consequences of general covariance, and he initially panics in front of them. The story is very instructive, because it reveals the true magic hidden inside GR. Einstein's argument against general covariance is the following.^22
Consider a region of spacetime containing two spacetime points A and B. Let $e$ be a gravitational field in this region. Say that around the point A the field is flat, while at the point B it is not (see Figure 2.3(a)). Next, consider a map $\phi$ from M to M that maps the point A to the point B. Consider the new field $\tilde e = \phi^* e$, which is pulled back by this map. The value of the field $\tilde e$ at A is determined by the value of $e$ at B, and therefore the field $\tilde e$ will not be flat around A (see Figure 2.3(b)).
Now, if $e$ is a solution of the equations of motion, and if the equations of motion are generally covariant, then $\tilde e$ is also a solution of the equations of motion. This is because of the relation between active diffeomorphisms and changes of coordinates: we can always find two different coordinate systems on M, say $x$ and $y$, such that the function $\tilde e^I_\mu(x)$ that represents $\tilde e$ in the coordinate system $x$ is the same function as the function $e^I_\mu(y)$ that represents $e$ in the coordinate system $y$. Since the equations of motion
^22 At first, Einstein got discouraged about generally covariant field equations because of a mistake he was making while deriving the static limit: the calculation yielded the wrong limit. But this is of little importance here, given the powerful use that Einstein has been routinely capable of making of general conceptual arguments.
are the same in the two coordinate systems, the fact that this function satisfies the Einstein equations implies that $e$ as well as $\tilde e$ are physical solutions.
Let me repeat the argument in a different form. We have found in the previous section that if $e^I_\mu(x)$ is a solution of the Einstein equations, then so is $e'^I_\nu(y)$ defined in (2.124). But the function $e'^I_\nu$ can be interpreted in two distinct manners. First, as the same field as $e$, expressed in a different coordinate system. Second, as a different field $\tilde e$, expressed in the same coordinate system. That is, we can define the new field as
$$\tilde e^I_\mu(x) = e'^I_\mu(x). \qquad (2.137)$$
This new field $\tilde e$ is genuinely different from $e$. In general, it will not be flat around A. In particular, the scalar curvature $\tilde R$ of $\tilde e$ at A is
$$\tilde R|_A = \tilde R(x_A) = R(\phi(x_A)) = R|_B. \qquad (2.138)$$
In other words, if the equations of motion are generally covariant, they are also invariant under active diffeomorphisms.
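The relation (2.138) can be illustrated with a minimal numerical sketch. Everything specific in it is an assumption made for illustration only: the manifold is reduced to a one-dimensional line, the curvature profile `R` and the map `phi` are invented toy functions, and the points A and B are given the coordinates 0 and 0.5. The sketch only shows that the dragged curvature evaluated at A equals the original curvature evaluated at B.

```python
import math

A, B = 0.0, 0.5  # assumed coordinates of the points A and B on a 1D toy manifold

def R(x):
    """Toy scalar curvature of the field e: a narrow bump at B, essentially zero at A."""
    return math.exp(-((x - B) / 0.1) ** 2)

def phi(x):
    """Toy smooth invertible map with phi(A) = B that tends to the identity far away."""
    return x + 0.5 * math.exp(-x ** 2)

def R_tilde(x):
    """Scalar curvature of the dragged field, following (2.138): R~(x) = R(phi(x))."""
    return R(phi(x))

# e is (almost) flat at A, the dragged field is not: it carries the value that e has at B.
print(R(A), R_tilde(A), R(B))
```

The map `phi` is chosen with derivative everywhere positive, so it is invertible, and it approaches the identity away from the origin, loosely mimicking a map that acts only inside a bounded region.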
Given this, Einstein makes the following famous observation.
The "hole" argument. Assume the gravitational field equations are generally covariant. Consider a solution of these equations in which the gravitational field is $e$ and there is a region H of the universe without matter (the "hole," represented as the white region in Figure 2.3). Assume that inside H there is a point A where $e$ is flat and a point B where it is not flat. Consider a smooth map $\phi: M \to M$ which reduces to the identity outside H, and such that $\phi(A) = B$, and let $\tilde e = \phi^* e$ be the pull-back of $e$ under $\phi$. The two fields $e$ and $\tilde e$ have the same past, are both solutions of the field equations, but have different properties at the point A. Therefore the field equations do not determine the physics at the spacetime point A. Therefore they are not deterministic. But we know that (classical) gravitational physics is deterministic. Therefore either
(i) the field equations must not be generally covariant, or
(ii) there is no meaning in talking about the physical spacetime point A.
On the basis of this argument, Einstein searched for nongenerally covariant field equations for three years, in a frantic race against Hilbert.
3. Einstein's return to general covariance. Then, rather suddenly, in 1915, Einstein published generally covariant field equations. What had happened? Why had Einstein changed his mind? Is there a mistake in the
[Figure 2.4: The diffeomorphism moves the nonflat region, as well as the intersection point of the two particles a and b, from the point B to the point A.]
hole argument? No, the hole argument is correct. The correct physical conclusion, however, is (ii), not (i). This point hit Einstein like a flash of lightning: the precise conceptual discovery to which all his previous thoughts had led.
Einstein's way out from the difficulty raised by the hole argument is to realize that there is no meaning in referring to "the point A" or "the event A," without further specifications.
Let us follow Einstein's explanation in detail.
Spacetime coincidences. Consider again the solution $e$ of the field equations, but assume that in the universe there are also the two particles a and b. Say that the worldlines $(x_a(\tau_a), x_b(\tau_b))$ of the two particles intersect at the spacetime point B; see Figure 2.4.
Now, for given initial conditions, the worldlines of the particles are determined by the gravitational field. They are geodesics of $e$ or, if other forces are involved, they satisfy the geodesic equation with an additional force term. Consider the field $\tilde e = \phi^* e$. The particles' worldlines $(x_a(\tau_a), x_b(\tau_b))$ are no longer solutions of the particles' equations of motion in this gravitational field. If the gravitational field is $\tilde e$ instead of $e$, the particles' motions over M will be different. But it is easy to find the motion of the particles determined by $\tilde e$, precisely because the complete set of equations of motion is generally covariant. Therefore an active diffeomorphism on the gravitational field and the particles sends solutions into solutions. Thus, the motion of the particles in the field $\tilde e$ is given by
the worldlines
$$\tilde x_a(\tau_a) = \phi^{-1}(x_a(\tau_a)), \qquad \tilde x_b(\tau_b) = \phi^{-1}(x_b(\tau_b)). \qquad (2.139)$$
Then, the particles a and b no longer intersect in B. They intersect in $A = \phi^{-1}(B)$.
Now, instead of asking whether or not the field is flat at A, let us ask whether or not the field is flat at the point where the particles meet. Clearly the result is the same for the two cases $(e, x_a, x_b)$ and $(\tilde e, \tilde x_a, \tilde x_b)$. Formally, assuming the intersection point is at $\tau_a = \tau_b = 0$,
$$\tilde R|_{\rm inters} = \tilde R(\tilde x_a(0)) = R(\phi(\tilde x_a(0))) = R(\phi(\phi^{-1}(x_a(0)))) = R(x_a(0)) = R|_{\rm inters}. \qquad (2.140)$$
This prediction is deterministic. There are not two contradictory predictions; therefore there is determinism, so long as we restrict ourselves to this kind of prediction. Einstein calls "spacetime coincidences" this way of determining points.
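The chain of equalities in (2.140) can be checked on a toy model. The following is only a sketch under stated assumptions: a two-dimensional manifold with coordinates (t, x), an invented curvature profile `R`, an invented invertible map `phi` with a closed-form inverse, and two straight worldlines standing in for the particle motions. None of these specific functions comes from the text.

```python
import math

A, B = (0.0, 0.0), (0.0, 0.5)  # assumed coordinates (t, x) of the points A and B

def R(p):
    """Toy scalar curvature of e: a bump centred at B."""
    t, x = p
    return math.exp(-(t ** 2 + (x - 0.5) ** 2) / 0.01)

def phi(p):
    """Toy invertible map of the (t, x) plane with phi(A) = B."""
    t, x = p
    return (t, x + 0.5 * math.exp(-t ** 2))

def phi_inv(p):
    """Closed-form inverse of phi."""
    t, x = p
    return (t, x - 0.5 * math.exp(-t ** 2))

def R_tilde(p):
    """Curvature of the dragged field: R~ = R composed with phi."""
    return R(phi(p))

def x_a(tau):
    """Toy worldline of particle a; passes through B at tau = 0."""
    return (tau, 0.5 + tau)

def x_b(tau):
    """Toy worldline of particle b; passes through B at tau = 0."""
    return (tau, 0.5 - tau)

def x_a_tilde(tau):
    """Dragged worldline of particle a, following (2.139)."""
    return phi_inv(x_a(tau))

def x_b_tilde(tau):
    """Dragged worldline of particle b, following (2.139)."""
    return phi_inv(x_b(tau))

# The dragged worldlines meet at A, and the curvature at the meeting point is unchanged.
print(x_a_tilde(0.0), x_b_tilde(0.0))
print(R_tilde(x_a_tilde(0.0)), R(x_a(0.0)))
```

The value of the curvature at a fixed manifold point differs between the two solutions, but the value at the meeting point of the particles is the same, which is the content of (2.140).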
Einstein observes that this conclusion is general: the theory does not predict what happens at spacetime points (like newtonian and special-relativistic theories do). Rather, it predicts what happens at locations determined by the dynamical elements of the theory themselves. In Einstein's words:
All our space-time verifications invariably amount to a determination of space-time coincidences. If, for example, events consisted merely in the motion of material points, then ultimately nothing would be observable but the meeting of two or more of these points. Moreover, the results of our measuring are nothing but verifications of such meetings of the material points of our measuring instruments with other material points, coincidences between the hands of a clock and points on the clock dial, and observed point-events happening at the same place at the same time. The introduction of a system of reference serves no other purpose than to facilitate the description of the totality of such coincidences. [64]
The two solutions $(e, x_a, x_b)$ and $(\tilde e, \tilde x_a, \tilde x_b)$ are only distinguished by their localization on the manifold. They are different in the sense that they ascribe different properties to manifold points. However, if we demand that localization is defined only with respect to the fields and particles themselves, then there is nothing that distinguishes the two solutions physically. In fact, concludes Einstein, the two solutions represent the same physical situation. The theory is gauge invariant in the sense of Dirac under active diffeomorphisms: there is a redundancy in the mathematical formalism; the same physical world can be described by different solutions of the equations of motion.
It follows that localization on the manifold has no physical meaning. The physical picture is completely different from the example of the temperature field on the Earth's surface illustrated in the previous section. In that example, the cities of Paris and Quintin were real distinguishable entities, independent from the temperature field. In GR, general covariance is compatible with determinism only assuming that individual spacetime points have no physical meaning by themselves. It is like having only the temperature field, without the underlying Earth.
What disappears in this step is precisely the background spacetime that Newton believed to have been able to detect with great effort beyond the apparent relative motions.
Einstein's step toward a profoundly novel understanding of nature is achieved. Background space and spacetime are effaced from this new understanding of the world. Motion is entirely relative. Active diffeomorphism invariance is the key to implement this complete relativization. Reality is not made up of particles and fields on a spacetime: it is made up of particles and fields (including the gravitational field) that can only be localized with respect to one another. No more fields on spacetime: just fields on fields. Relativity has become general.
2.3 Interpretation
General covariance makes the relation between formalism and experiment far more indirect than in conventional field theories.
Take Maxwell theory as an example. We assume that there is a background spacetime. We have special objects at our disposal (the walls of the lab, the Earth) that define an inertial frame to a desired approximation. These objects allow us to designate locations relative to background spacetime. We have two kinds of measuring devices: (a) meters and clocks that measure distance and time intervals from these reference objects, and (b) devices that measure the electric and magnetic fields. The reading of the devices (a) gives us $x^\mu$. The reading of the devices (b) gives us $F_{\mu\nu}$. We measure the two and say that the field has the value $F_{\mu\nu}$ at the point $x^\mu$. The theory can predict the value $F_{\mu\nu}$ at the point $x^\mu$.
We cannot do the same in GR. The theory does not predict the value of the field at the point $x^\mu$. So, how do we compare theory and observations?
2.3.1 Observables, predictions and coordinates
As discussed at the end of the previous section, a physical state does not correspond to a solution $e(x)$ of Einstein's equations, but to an equivalence class of solutions under active diffeomorphisms. Therefore the quantities that the theory predicts are all and only the quantities that are well
defined on these equivalence classes. That is, only the quantities that are invariant under diffeomorphisms. These quantities are independent from the coordinates $x^\mu$.
In concrete applications of the theory, these quantities are generally obtained by solving away the coordinates $x$ from solutions to the equations of motion. Here are a few examples.
Solar System. Consider the dynamics of the Solar System. The variables are the gravitational field $e(x)$ and the worldlines of the planets $x_n(\tau_n)$. Fix a solution $(e(x), x_n(\tau_n))$ to the equations of motion. We want to derive physical predictions from this solution and compare them with observations. Choose for simplicity $\tau_n = x^0$, so that the solution is expressed by $(e(x), x_n(x^0))$. Consider the worldline of the Earth. Compute the distance $d_n(x^0)$ between the Earth and the planet $n$, defined as the proper time elapsed along the Earth's worldline while a null geodesic (a light pulse) leaving the Earth at $x^0$ travels from Earth to the planet and back.
The functions $(d_n(x^0))$ can be computed from the given solutions to the equations of motion. Consider a space C with coordinates $(d_n)$. The functions $(d_n(x^0))$ define a curve $\gamma$ on this space.
We can associate a measuring device with each $d_n$: a laser apparatus that measures the distance to planet $n$. These quantities can be measured together. We obtain the event $(d_n)$, which can be represented by a point in C. The theory predicts that this point will fall on the curve $\gamma$. A sequence of these events can be compared with the curve $\gamma$, and in this way we can test the given solutions to the equations of motion against experience. (In the terminology of Chapter 3, the quantities $d_n$ are partial observables.) Notice that this can be done with arbitrary precision and that distant stars, inertial systems, preferred coordinates, or choice of time variable play no role.
Clocks. Consider the gravitational field around the Earth. Consider two worldlines. Let the first be the worldline of an object fixed on the Earth's surface. Let the second be the worldline of an object in free fall on a keplerian orbit around the Earth, that is, a satellite. Fix an arbitrary initial point P on the worldline of the orbiting object and let $T_1$ be the proper time from P along this worldline. Send a light signal from P to the object on Earth, let Q be the point on the Earth's worldline when the signal is received, and let $T_2$ be the proper time from Q along this worldline. Then let $T_2(T_1)$ be the reception proper time on Earth of a signal sent at $T_1$ proper time in orbit. GR allows us to compute the function $T_2(T_1)$ for any $T_1$.
It is easy to associate measuring devices to $T_1$ and $T_2$: these are a clock on Earth and a clock in orbit. If the orbiting object sends a signal at fixed proper times $T_1$, the reception times $T_2$ can be compared with the predictions of the theory. Here $T_1$ and $T_2$ are the partial observables. I let you decide which one of the two is the "true time variable."
Solar System with a clock. We can add a clock to the Solar System measurements described above. Fixing arbitrarily an initial event on Earth (a particular eclipse, the birth of Jesus, or the death of John Lennon), we can compute the proper time $T(x^0)$ lapsed from this event along the Earth's worldline. The partial observable $T$ can be added to the partial observables $d_n$, giving the set $(d_n, T)$ of partial observables. If we do so, it may be convenient to express the correlations $(d_n, T)$ as functions $d_n(T)$. A complete gauge-invariant observable, fully predicted by the theory, is the value $d_n(T)$ of a planet distance at a certain given Earth proper time $T$ from the initial event. Notice that $T$ is not a coordinate. It is a complicated nonlocal function of the gravitational field, to which a measuring device (measuring a partial observable) has been attached. The use of a clock on Earth to determine a local temporal localization is just a matter of convenience.
Binary pulsar. Consider a binary-star system in which one of the two stars is a pulsar. Because of a Doppler effect, the frequency of the pulsing signal oscillates with the orbital period of the system. This fact allows us to count the number of pulses in each orbit. Let $N_n$ be the number of pulses we receive in the $n$th orbit. A theoretical model of the pulsar allows us to compute the expected decrease in orbital period due to gravitational wave emission, and therefore the expected sequence $N_n$, which can be compared with the observed one. Doing this with sufficient care won J.H. Taylor and R.A. Hulse the 1993 Nobel Prize.
Notice that in all these examples the coordinates $x^\mu$ have disappeared from the observable quantities. This is true in general. A theoretical model of a physical system is made using coordinates $x^\mu$, but then observable quantities are independent of the coordinates $x^\mu$.^23
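The step of "solving away the coordinates" in the Solar System examples can be sketched numerically. The sketch below is illustrative only: the clock reading `T`, the distance `d`, and the change of time coordinate `g` are invented toy functions, not solutions of Einstein's equations, and the bisection helper `d_of_T` is a hypothetical name introduced here. It shows that the correlation $d(T)$ comes out the same whether the model is written in the coordinate $x^0$ or in a different coordinate $y^0$ with $x^0 = g(y^0)$.

```python
import math

def T(x0):
    """Toy clock reading (Earth proper time) as a function of coordinate time x0."""
    return x0 + 0.1 * math.sin(x0)

def d(x0):
    """Toy Earth-planet distance as a function of coordinate time x0."""
    return 1.5 + 0.5 * math.cos(x0)

def g(y0):
    """An arbitrary monotonic change of time coordinate, x0 = g(y0)."""
    return y0 ** 3 + y0

def d_of_T(dist, clock, T_value, lo, hi, steps=200):
    """Solve clock(s) = T_value for the coordinate s by bisection, then return dist(s).

    Assumes clock is increasing on [lo, hi] and clock(lo) < T_value < clock(hi).
    """
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if clock(mid) < T_value:
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

T_value = 3.0
# Same model, expressed once in the coordinate x0 and once in the coordinate y0.
in_x = d_of_T(d, T, T_value, 0.0, 10.0)
in_y = d_of_T(lambda y0: d(g(y0)), lambda y0: T(g(y0)), T_value, 0.0, 2.0)
print(in_x, in_y)
```

The coordinate value at which the clock reads `T_value` differs between the two descriptions, but the predicted distance at that clock reading does not: the coordinate has dropped out of the prediction.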
2.3.2 The disappearance of spacetime
In the mathematical formalism of GR we utilize the "spacetime" manifold M, coordinatized by $x$. However, a state of the universe does not
^23 Unless we gauge-fix them to given partial observables; see Section 2.4.6.
correspond to a configuration of fields on M. It corresponds to an equivalence class of field configurations under active diffeomorphisms. An active diffeomorphism changes the localization of the field on M by dragging it around. Therefore localization on M is just gauge: it is physically irrelevant.
In fact, M itself has no physical interpretation; it is just a mathematical device, a gauge artifact. Pre-general-relativistic coordinates $x^\mu$ designate points of the physical spacetime manifold "where" things happen (see a detailed discussion below in Section 2.4.5); in GR there is nothing of the sort. The manifold M cannot be interpreted as a set of physical "events," or physical spacetime points "where" the fields take value. It is meaningless to ask whether or not the gravitational field is flat around the point A of M, because there is no physical entity "spacetime point A." Contrary to Newton and to Minkowski, there are no spacetime points where particles and fields live. There are no spacetime points at all. The Newtonian notions of space and time have disappeared.
In Einstein's words:
the requirement of general covariance takes away from space and time the last remnant of physical objectivity. [64]
Einstein justifies this conclusion in the immediate continuation of this text, which is the paragraph I quoted at the end of the previous section, with the observation that all observations are spacetime coincidences.
In newtonian physics, if we take away the dynamical entities, what remains is space and time. In general-relativistic physics, if we take away the dynamical entities, nothing remains. The space and time of Newton and Minkowski are reinterpreted as a configuration of one of the fields, the gravitational field.
Concretely, this radically novel understanding of spatial and temporal relations is implemented in the theory by the invariance of the field equations under diffeomorphisms. Because of background independence – that is, since there are no nondynamical objects that break this invariance in the theory – diffeomorphism invariance is formally equivalent to general covariance, namely the invariance of the field equations under arbitrary changes of the spacetime coordinates $x$ and $t$.
Diffeomorphism invariance implies that the spacetime coordinates $x$ and $t$ used in GR have a different physical meaning from the coordinates $x$ and $t$ used in prerelativistic physics. In prerelativistic physics, $x$ and $t$ denote localization with respect to appropriately chosen reference objects. These reference objects are chosen in such a way that they make the physical influence of background spacetime manifest. In particular, their motion can be chosen to be inertial. In GR, on the other hand, the spacetime coordinates $x$ and $t$ have no physical meaning: physical predictions of GR are independent of the coordinates $x$ and $t$.
A physical theory should not describe the location in space and the evolution in time of dynamical objects. It describes relative location and relative evolution of dynamical objects. Newton introduced the notion of background spacetime because he needed the acceleration of a particle to be well defined (so that $F = ma$ could make sense). In the newtonian theory and in special relativity, a particle accelerates when it does so with respect to a fixed spacetime in which the particle moves. In general relativity, a particle (a dynamical object) accelerates when it does so with respect to the local values of the gravitational field (another dynamical object). There is no meaning for the location of the gravitational field or the location of the particle: only the relative location of the particle with respect to the gravitational field has physical meaning.
What remains of the prerelativistic notion of spacetime is a relation between dynamical entities: we can say that two particles' worldlines "intersect"; that a field has a certain value "where" another field has a certain value; or that we measure two partial observables "together." This is precisely the modern realization of Descartes' notion of contiguity, and it is the basis of spatial and temporal notions in GR.
As Whitehead put it, we cannot have spacetime without dynamical entities, any more than we can have the cat's grin without the cat. The world is made up of fields. Physically, these do not live on spacetime. They live, so to say, on one another. No more fields on spacetime, just fields on fields. It is as outlined in the metaphor in Section 1.1.3, where we no longer had animals on the island, just animals on the whale, animals on animals. Our feet are no longer in space: we have to ride the whale.
2.4 Complements
I close this chapter by discussing a certain number of issues related to the interpretation of GR.
2.4.1 Mach principles
The ideas of Ernst Mach had a strong influence on Einstein's discovery of GR. Mach presented a number of acute criticisms of Newton's motivations for introducing absolute space and absolute time. In particular, he pointed out that in Newton's bucket argument there is a missing element: he observed that the inertial reference frame (the reference frame with respect to which rotation has detectable physical effects) is also the reference frame in which the fixed stars do not rotate. Mach then suggested that the inertial reference frame is not determined by absolute space, but rather it is determined by the entire matter content of the Universe, including distant stars. He suggested that if we could repeat the experiment with a very massive bucket, the mass of the bucket would affect the inertial frame, and the inertial frame would rotate with the bucket.
In the light of GR, the observation is certainly pertinent, and it is clear that the argument may have played a role in Einstein's dismissal of Newton's argument. However,
for some reason, the precise relation between Mach's suggestion and GR has generated a vast debate. Mach's suggestion that inertia is determined by surrounding matter has been called "the Mach principle," and much ink has been employed to discuss whether or not GR implements this principle; whether or not "GR is machian." Remarkably, in the literature one finds arguments and proofs in favor as well as against the conclusion that GR is machian. Why this confusion?
Because there is no well-defined "Mach principle." Mach provided a very important but vague suggestion, that Einstein developed into a theory, not a precise statement that can be true or false. Every author that has discussed "the Mach principle" has actually considered a different principle. Some of these "Mach principles" are implemented in GR, others are not.
In spite of the confusion, or perhaps thanks to it, the discussion on how machian GR is sheds some light on the physical content of GR. Here I list several versions of the Mach principle that have been considered in the literature, and, for each of these, I comment on whether this particular Mach principle is True or False in GR. In the following, "matter" means any dynamical entity except the gravitational field.
• Mach principle 1: Distant stars can affect the local inertial frame.
  True. Because matter affects the gravitational field.
• Mach principle 2: The local inertial frame is completely determined by the matter content of the Universe.
  False. The gravitational field has independent degrees of freedom.
• Mach principle 3: The rotation of the inertial reference frame inside the bucket is in fact dragged by the bucket, and this effect increases with the mass of the bucket.
  True. In fact, this is the Lense–Thirring effect: a rotating mass drags the inertial frames in its vicinity.
• Mach principle 4: In the limit in which the mass of the bucket is large, the internal inertial reference frame rotates with the bucket.
  Depends. It depends on the details of the way the limit is taken.
• Mach principle 5: There can be no global rotation of the Universe.
  False. Einstein believed this to be true in GR, but Gödel's solution is a counter-example.
• Mach principle 6: In the absence of matter, there would be no inertia.
  False. There are vacuum solutions of the Einstein field equations.
• Mach principle 7: There is no absolute motion, only motion relative to something else; therefore the water in the bucket does not rotate in absolute terms, it rotates with respect to some dynamical physical entity.
  True. This is the basic physical idea of GR.
• Mach principle 8: The local inertial frame is completely determined by the dynamical fields in the Universe.
  True. In fact, this is precisely Einstein's key idea.
2.4.2 Relationalism versus substantivalism
In contemporary philosophy of science there is an interesting debate on the interpretation of GR. The two traditional theses about space – absolute and relational – suitably edited to take into account scientific progress, continue under the names
of substantivalism and relationalism. Here I present a few considerations on the issue.
GR changes the notion of spacetime in physics in the sense of relationalism. In prerelativistic physics, spacetime is a fixed nondynamical entity in which physics happens. It is a sort of structured container which is the home of the world. In relativistic physics, there is nothing of the sort. There are only interacting fields and particles. The only notion of localization which is present in the theory is relative: dynamical objects can be localized only with respect to one another. This is the notion of space defended by Aristotle and Descartes, against which Newton wrote the initial part of Principia. Newton had two points: the physical reality of inertial effects such as the concavity of the water in the bucket, and the immense empirical success of his theory based on absolute space. Einstein provided an alternative interpretation for the cause of the concavity – interaction with the local gravitational field – and a theory based on relational space that has better empirical success than Newton's theory. After three centuries, the European culture has returned to a fully relational understanding of space and time.
At the basis of cartesian relationalism is the notion of "contiguity." Two objects are contiguous if they are close to one another. Space is the order of things with respect to the contiguity relation. At the basis of the spacetime structure of GR is essentially the same notion. Einstein's "spacetime coincidences" are analogous to Descartes' "contiguity."
A substantivalist position can nevertheless still be defended to some extent. Einstein's discovery is that newtonian spacetime and the gravitational field are the same entity. This can be expressed in two equivalent ways. One states that there is no spacetime; there is only the gravitational field. This is the choice I have made in this book. The second states that there is no gravitational field; it is spacetime that has dynamical properties. This choice is common in the literature. I prefer the first because I find that the differences between the gravitational field and other fields are more accidental than essential. But the choice between the two points of view is only a matter of choice of words, and thus, ultimately, personal taste. If one prefers to keep the name "spacetime" for the gravitational field, then one can still hold a substantivalist position and claim that, according to GR, spacetime is an entity, not a relation. Furthermore, localization can be defined with respect to the gravitational field, and therefore the substantivalist can say that spacetime is an entity that defines localization. For an articulation of this thesis, see for instance [65].
However, this is a very weakened substantivalist position. One is free to call "spacetime" anything with respect to which we define position. But to what extent is spacetime different from any arbitrary continuum of objects used to define position? Newton's acute formulation of his substantivalism, already mentioned in footnote 15 above, contains a precise characterization of "space":
so it is necessary that the definition of places, and hence of local motion, be referred to some motionless thing such as extension alone or "space," in so far as space is seen to be truly distinct from moving bodies.^24
The characterizing feature of space is that of being truly distinct from moving bodies, that is, in modern terms and after the Faraday–Maxwell conceptual revolution, that of
^24 I. Newton, De Gravitatione et aequipondio fluidorum [61].
being truly distinct from dynamical entities such as particles or fields. This is clearly not the case for the spacetime of GR. If the modern substantivalist is happy to give up Newton's strong substantivalism and identify the thesis that "spacetime is an entity" with the thesis that "spacetime is the gravitational field, which is a dynamical entity," then the distinction between substantivalism and relationalism is completely reduced to one of semantics.
When two opposite positions in a long-standing debate have come so close that their distinction is reduced to semantics, one can probably say that the issue is solved. I think one can say that in this sense GR has solved the long-standing issue of the relational versus substantivalist interpretations of space.
2.4.3 Has general covariance any physical content? Kretschmann's objection
Virtually any field theory can be reformulated in a generally covariant form. An example of a generally covariant reformulation of a scalar field theory on Minkowski spacetime is presented below. This fact has led some people to wonder whether general covariance has any physical significance at all. The argument is as follows: if any theory can be formulated in a general covariant language, then general covariance is not a principle that selects a particular class of theories; therefore it has no physical content. This argument was presented by Kretschmann shortly after Einstein's publication of GR. It is heard among some philosophers of science, and sometimes used also by some physicists that dismiss the conceptual novelty of GR.
I think that the argument is wrong. The non sequitur is the idea that a formal property that does not restrict the class of admissible theories has no physical significance. Why should that be? Formalism is flexible and we can artificially give a theory a certain formal property, especially if we accept byzantine formulations. But it does not follow from this that the use of one formalism or another is irrelevant. Physics is the search for the more effective formalism to read Nature. The relevant question is not whether general covariance restricts the class of admissible theories, but whether GR could have been conceived or understood at all without general covariance. Let me illustrate this point with the example of rotational invariance.
Kretschmann's objection applied to rotational symmetry. Ancient physics assumed that space has a preferred direction. The "up" and the "down" were considered absolutely defined. This changes with newtonian physics, where space has rotational symmetry: all spatial directions are a priori equivalent, and only contingent circumstances – such as the presence of a nearby mass like the Earth – can make one direction particular. Physicists often say that rotational invariance limits the admissible forces. But strictly speaking this is not true. Kretschmann's objection applies equally well to rotational invariance: given a theory which is not rotationally invariant, we can reformulate it as a rotationally invariant theory just by adding some variable. For instance, consider a physical theory $T$ in which all bodies are subject to a force in the z-direction, $F = -g$, where $g$ is a constant (such as gravity). This is a nonrotationally-invariant theory. Now consider another theory $T'$ in which there is a dynamical vector quantity $v$, of length unity, and a force $F = gv$. The theory $T'$ is rotationally invariant, but in each solution the vector $v$ will take a particular value, in a particular direction. Calling z this direction, we have precisely the same phenomenology as theory $T$.
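The difference between the two formulations can be made concrete with a small sketch. It rests on assumptions introduced here for illustration: unit-mass bodies, a numerical value for $g$, closed-form motion under a constant force, and the sign convention that $v$ points along $-z$ so that $F = gv$ reproduces the force of theory $T$. The function names are hypothetical. The sketch checks that rotating all the data of a solution of $T'$, the vector $v$ included, yields the rotated solution, whereas in $T$ there is no $v$ to rotate and the same check fails.

```python
import math

G = 9.8  # assumed numerical value of the constant g

def rotate_x(p, angle):
    """Rotate the 3-vector p about the x-axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (p[0], c * p[1] - s * p[2], s * p[1] + c * p[2])

def position_T_prime(x0, u0, v, t):
    """Theory T': unit-mass body under the constant force F = g v, with v a unit vector."""
    return tuple(x0[i] + u0[i] * t + 0.5 * G * v[i] * t * t for i in range(3))

def position_T(x0, u0, t):
    """Theory T: the same force, with its direction frozen along -z."""
    return position_T_prime(x0, u0, (0.0, 0.0, -1.0), t)

def close(p, q):
    """Componentwise comparison of two 3-vectors up to rounding error."""
    return all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(p, q))

x0, u0, v = (1.0, 2.0, 3.0), (0.5, 0.0, 1.0), (0.0, 0.0, -1.0)
t, angle = 2.0, 0.7

# T': rotating all the data, v included, gives the rotated trajectory.
covariant_T_prime = close(
    position_T_prime(rotate_x(x0, angle), rotate_x(u0, angle), rotate_x(v, angle), t),
    rotate_x(position_T_prime(x0, u0, v, t), angle),
)
# T: there is no v to rotate, so rotated data do not give the rotated trajectory.
covariant_T = close(
    position_T(rotate_x(x0, angle), rotate_x(u0, angle), t),
    rotate_x(position_T(x0, u0, t), angle),
)
print(covariant_T_prime, covariant_T)
```

With $v$ fixed along $-z$ the two theories predict identical motions; what $T'$ adds is a variable that transforms under rotations.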
The example shows that we can express a nonrotationally invariant theory $T$ in a rotationally invariant formalism $T'$. Therefore rotational invariance does not truly restrict the class of admissible theories. Shall we conclude, with Kretschmann, that rotational invariance has no physical significance?
Obviously not. Modern physics has made real progress with respect to ancient physics in understanding that space is rotationally invariant. Where is the progress? It is in the fact that the discovery of the rotational invariance of space puts us in a far more effective position for understanding Nature. We can say that we have discovered that, in general, there is no preferred "up" and "down" in the Universe. Equivalently, we can say that a rotationally invariant physical formalism is far more effective for understanding Nature than a nonrotationally invariant one.
There are two key issues here. First, it would have been difficult to find newtonian theory within a conceptual framework in which the “up” and the “down” are considered absolute. Second, reformulating the theory T in the rotationally invariant form T′ modifies our understanding of it: we have to introduce the dynamical vector v. From the point of view of the two theories T and T′, the vector v is a byzantine construction without much sense. But notice that, from the point of view of understanding Nature, the introduction of v points in the physically correct direction: we are led to investigate the nature and the dynamics of this vector. v is indeed the local gravitational field, and this is precisely the right track towards a more effective understanding of Nature. This is the strength of having understood rotational invariance.
In fact, if there is rotational invariance in the Universe, there should be a rotationally invariant manner of understanding ancient physics, which, in its limited extent, was effective. Theory T′ above represents precisely this better understanding of ancient physics. More than that, the reinterpretation itself indicates a new effective way of understanding the world. In conclusion, the fact that the effective but nonrotationally invariant theory T admits the byzantine rotationally invariant formulation T′ is not an argument for the physical irrelevance of rotational invariance. Far from that, it is something that is required for us to have confidence in rotational invariance.
On the one hand, rotational invariance is interesting because it enlarges, not because it restricts, the kind of physics we can naturally describe. On the other hand, rotational invariance does drastically reduce the kind of theories that we are willing to consider. Not because it forbids us to write certain theories – such as theory T′ – but because, if we want to describe a theory such as T, we have to pay a price: here, the introduction of the vector v. It is up to the theoretician to judge whether this price is worth paying, that is, whether v is in fact a physical entity worthwhile considering.
The value of a novel idea, or a novel language, in theoretical physics is not in the fact that old physics cannot be expressed in the new language. It is simply in the fact that it is more effective for describing reality. A physical theoretical framework is a map of reality. If the symbols of the map are better chosen, the map is more effective. A new language by itself rarely truly restricts the kind of theories that can be expressed. But it renders certain theories far simpler and others awkward. It orients our investigation of Nature. This, and nothing else, is scientific knowledge.
Let me come back to general covariance. Like rotational invariance, general covariance is a novel language which expresses a general physical idea about the world. It is possible to express newtonian physics in a generally covariant language. It is also possible to express GR physics in a nongenerally covariant language (by gauge-fixing the coordinates). But newtonian physics expressed in a covariant language, or GR expressed in a noncovariant language, are both monsters, formulated in a form far more intricate than what is possible. Nobody would have found them.
What Einstein discovered is that two classes of entities previously considered distinct are in fact entities of the same kind. Newton taught us that (an effective way to understand the world is to think that) the world is made up of two clearly distinct classes of entities, of very different nature. The first class is formed by space and time. The second class includes all dynamical entities moving in space and in time. In newtonian physics these two classes of entities are different in many respects, and enter the formalism of physical models in very different manners. Einstein has understood that (a more effective way to understand the world is to think that) the world is not made up of two distinct kinds of entities. There is only one type of entity: dynamical fields. General covariance is the language for describing a world without distinction between the spacetime entities and the dynamical entities. It is the language that does not assume this distinction.
We can reinterpret prerelativistic physics in a generally covariant language. It suffices to rewrite the newtonian absolute space and absolute time as a dynamical field, and then write generally covariant equations that fix them to their flat-space values. But if we do so, we are not denying the physical content of Einstein’s idea. On the contrary, we are simply reinterpreting the world in Einstein’s terms. In other words, we are showing the strength, not the weakness, of general covariance. Furthermore, in so doing we introduce a new physical field, and we find ourselves in the funny situation of having to write equations of motion for this field that constrain it to a single value. Thus, we have a theory where one of the dynamical fields is strangely constrained to a single value. This immediately suggests that perhaps we can relax these equations and allow a full dynamics for this field. If we do so, we are directly on the track of GR. Again, far from showing the physical irrelevance of general covariance, this indicates its enormous cognitive strength.
I think that the mistake behind Kretschmann’s argument is an excessively legalistic reading of the scientific enterprise. It is the mistake of taking certain common physicists’ statements too literally. Physicists often write that a certain symmetry or a certain principle “uniquely determines” a certain theory. At a close reading, these statements are almost always much exaggerated. The uniqueness only holds under a vast number of other assumptions that are left implicit, and which are facts or ideas the physicist considers natural and does not bother detailing. The typical physicist carelessly dismisses counter-examples by saying that they would be unphysical, implausible, or completely artificial. The connection between general physical ideas, general principles, intuitions, and symmetries is a burning melt of powerful ideas, not the icy demonstration of a mathematical theorem. What is at stake is finding the most effective language for thinking the world, not writing axioms. It is language in formation, not bureaucracy.²⁵
²⁵ Historically, the entire issue might be the result of a misunderstanding. Kretschmann attacked Einstein in a virulent form. In particular, he attacked Einstein’s coincidences solution of the hole argument. Now, Einstein probably learned the idea that coincidences are the only observables precisely from Kretschmann, but didn’t give much credit to Kretschmann for this. I suppose this should have made Kretschmann quite bitter. I think that Kretschmann’s subtext, in saying that general covariance is empty, was not that general covariance was no progress with respect to old physics; it was that general covariance was no progress with respect to what he himself had already realized, before Einstein.
Generally covariant flat-space field theory. Consider the field theory of a free massless scalar field $\phi(x)$ on Minkowski space. The theory is defined by the action
$$S[\phi] = \int d^4x \; \eta^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi. \tag{2.141}$$
The equation of motion is the flat-space Klein–Gordon equation
$$\eta^{\alpha\beta}\,\partial_\alpha\partial_\beta\phi = 0, \tag{2.142}$$
and the theory is obviously not generally covariant.
A trivial way to reformulate this theory in generally covariant language is to introduce the tetrad field $e^{\alpha}_{\mu}(x)$ and write the equations
$$\partial_\mu\left(e\,\eta^{\alpha\beta}\,e^{\mu}_{\alpha}\,e^{\nu}_{\beta}\,\partial_\nu\phi\right) = 0, \tag{2.143}$$
$$R^{\alpha}{}_{\beta\mu\nu} = 0. \tag{2.144}$$
The solution of (2.144) is that $e$ is flat. Since the system is covariant, we can choose a gauge in which $e^{\alpha}_{\mu}(x) = \delta^{\alpha}_{\mu}$. In this gauge, (2.143) becomes the flat-space equation of motion (2.142).
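The gauge-fixing step can be spelled out in one line. This is only a short check, assuming (as the notation suggests, but the text does not state explicitly) that $e$ in the covariant field equation denotes the determinant of the tetrad and that $e^{\mu}_{\alpha}$ is its inverse:

```latex
\begin{aligned}
e^{\alpha}_{\mu}(x) = \delta^{\alpha}_{\mu}
\;&\Longrightarrow\;
e^{\mu}_{\alpha} = \delta^{\mu}_{\alpha},
\qquad e = \det e^{\alpha}_{\mu} = 1, \\
\partial_\mu\!\left(e\,\eta^{\alpha\beta}\,e^{\mu}_{\alpha}\,e^{\nu}_{\beta}\,\partial_\nu\phi\right)
&= \partial_\mu\!\left(\eta^{\mu\nu}\,\partial_\nu\phi\right)
= \eta^{\mu\nu}\,\partial_\mu\partial_\nu\phi = 0 .
\end{aligned}
```

The last equality uses only the fact that the Minkowski metric $\eta^{\mu\nu}$ is constant, so the gauge-fixed equation is the flat-space Klein–Gordon equation.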
A more interesting way is as follows. Consider a field theory for five scalar fields $\Phi^A(x)$, where $A = 1, \ldots, 5$. Use the notation
$$V_A = \epsilon_{ABCDE}\,\partial_\mu\Phi^B\,\partial_\nu\Phi^C\,\partial_\rho\Phi^D\,\partial_\sigma\Phi^E\,\epsilon^{\mu\nu\rho\sigma}, \tag{2.145}$$
where $\epsilon^{\mu\nu\rho\sigma}$ and $\epsilon_{ABCDE}$ are the 4-dimensional and 5-dimensional completely antisymmetric pseudo-tensors. Consider the theory defined by the action
$$S[\Phi^A] = \int d^4x \; V_5^{-1}\left(V_4 V_4 - V_3 V_3 - V_2 V_2 - V_1 V_1\right), \tag{2.146}$$
where $V_5$ is assumed never to vanish. The theory is invariant under diffeomorphisms. Indeed, $V_A$ transforms as a scalar density (because $\epsilon^{\mu\nu\rho\sigma}$ is a scalar density); hence the integrand is a scalar density, and the integral is invariant. For $\alpha = 1, 2, 3, 4$, define the matrix
$$E^{\alpha}_{\mu}(x) = \partial_\mu\Phi^{\alpha}(x), \tag{2.147}$$
its inverse $E^{\mu}_{\alpha}$, and its determinant $E$. Varying $\Phi^5$, we obtain the equation of motion
$$\partial_\mu\left(E\,\eta^{\alpha\beta}\,E^{\mu}_{\alpha}\,E^{\nu}_{\beta}\,\partial_\nu\Phi^5\right) = 0. \tag{2.148}$$
This is the massless Klein–Gordon equation (2.143) interacting with a gravitational field $E^{\alpha}_{\mu}$. Varying $\Phi^{\alpha}$, we do not obtain independent equations. We obtain the energy-momentum conservation law implied by (2.148). The fact that there is only one independent equation is a consequence of the fact that there is a four-fold gauge invariance. We can choose a gauge in which
$$\Phi^a(x) = x^a. \tag{2.149}$$
We then have immediately $E^a_\mu = \delta^a_\mu$, and (2.148) becomes (2.142). The other four equations are
$$\partial_a\left(\partial^a\Phi^5\,\partial_b\Phi^5 - \tfrac{1}{2}\,\delta^a_b\,\partial_c\Phi^5\,\partial^c\Phi^5\right) = 0. \tag{2.150}$$
Even better, we may not fix the gauge and consider the gauge-invariant function of four variables $\phi(X^a)$ defined by
$$\phi(\Phi^a(x)) = \Phi^5(x). \tag{2.151}$$
This function satisfies the Minkowski-space Klein–Gordon equation (2.142).
How to interpret such a theory? The theory (2.141) is not generally covariant; therefore its coordinates $x$ are (partial) observables. The theory is defined by five partial observables: four $x^\mu$ and $\phi$. To interpret the theory, we must have measuring procedures associated with these five quantities. The relation between these observables is governed by (2.141). On the other hand, the theory (2.146) is generally covariant; therefore the coordinates $x$ are not observable. The theory is defined by five partial observables: the five $\Phi^A$. We must have measuring procedures associated with these five quantities. The relation between these observables is governed again by (2.141). Therefore, in the two cases we have the same partial observables, identified by $\Phi^A \leftrightarrow (x^a, \phi)$, related by the same equation.
There is only one subtle but important difference between theory (2.146) and theory (2.141). Theory (2.141) separates the five partial observables $(x, \phi)$ into two sets: the independent ones ($x$) and the dependent one ($\phi$). Theory (2.146) treats the five partial observables $\Phi^A$ on an equal footing. Thus, in a strict sense, theory (2.141) contains one extra item of information: a distinction between dependent and independent partial observables. Because of this difference, the two theories reflect two quite different interpretations of the world. The first describes a world’s ontology split into spacetime and matter. The second describes a world where the spacetime structure is interpreted as relational.
2.4.4 Meanings of time
The concept of time used in natural language carries many properties. Within a given theoretical framework (say, newtonian mechanics), time maintains some of these properties and loses others. In different theoretical frameworks, time has different properties. The best-known example is probably the directionality of time: absent in mechanics, present in thermodynamics. But many other features of time are lacking in one theory and present in others. For instance, a property of time in newtonian mechanics is uniqueness: there is a unique time interval between any two events. Conversely, in special relativity there are as many time variables as there are Lorentz observers ($x^0$, $x'^0$, ...). Another attribute of time in newtonian mechanics is globality: every solution of the equations of motion “passes” through every value of newtonian time t once and only once. In some cosmological models, on the other hand, there is no choice of time variable with such a property: there is “no time”, if we demand that being global is an essential property of time. In other words, we use the word “time” to denote quite different concepts, which may or may not include this or that property.
Here I describe a simple classification of possible attributes of time. Below, I identify and list nine properties of time. Then I describe, and tabulate, ten separate levels of increasing complexity of the notion of time, corresponding to an increasing number of properties. Theories typically fall in one of these levels, according to the set of attributes that the theory ascribes to the notion of time it uses. The ten-fold arrangement is conventional: the main point I intend to emphasize is that a single, clear and pure notion of “time” does not exist.
Properties of time. Consider an infinite set S without any structure. Add to S a topology and a differential structure dx. Thus S becomes a manifold; assume that this manifold is one-dimensional, and denote the set S together with its differentiable structure as the line L = (S, dx). Next, assume we add a metric structure d to L; denote the resulting metric line as M = (S, dx, d). Next, fix an ordering < (a direction) in M. Denote the resulting oriented line as the affine line A = (S, dx, d, <). Next, fix a preferred point of A as the origin 0; the resulting space is isomorphic to the real line R = (S, dx, d, <, 0).
The real line R is the traditional metaphor for the idea of time. Time is frequently represented by a variable t in R. The structure of R corresponds to an ensemble of properties that we naturally associate to the notion of time, as follows. (a) The existence of a topology on the set of the time instants, namely the existence of a notion of two time instants being close to each other, and the fact that time is “one-dimensional”. (b) The existence of a metric, namely the possibility of stating that two distinct time intervals are equal in magnitude: time is “metric”. (c) The existence of an ordering relation between time instants, namely the possibility of distinguishing the past direction from the future direction. (d) The existence of a preferred time instant, the present, the “now”. To capture these properties in mathematical language, we describe time as a real line R. An affine line A describes time up to the notion of present; a metric line M describes time up to the notions of present and past/future distinction; a line L describes time up to the notion of metricity.
In newtonian mechanics, we begin by representing time as a variable in R, but then the equations are invariant both under t → −t and under t → t + a. Thus, the theory is actually defined in terms of a variable t in a metric line M. Newtonian mechanics in fact incorporates both the notions of topology of the set of time instants and (in a very essential way) the fact that time is metric, but it does not make any use of the notion of present, nor of the direction of time. This is well known. Note that Newton’s theory is not inconsistent with the introduction of the notions of a present and of time-directionality: it simply does not make any use of these notions. These notions are not present in Newton’s theory.
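The two invariances can be illustrated on a concrete solution. The sketch below is an assumption-laden toy: it takes the one-dimensional equation x''(t) = −g with an illustrative g, uses a finite-difference second derivative, and all names are hypothetical. It checks that time reversal and time translation map a solution into another solution.

```python
G = 9.8  # illustrative constant acceleration (assumption)

def solution(t, x0=2.0, v0=3.0):
    """One solution of Newton's equation x''(t) = -G."""
    return x0 + v0 * t - 0.5 * G * t * t

def reversed_solution(t):
    """The same motion after t -> -t."""
    return solution(-t)

def shifted_solution(t, a=0.7):
    """The same motion after t -> t + a."""
    return solution(t + a)

def residual(x, t, h=1e-2):
    """Finite-difference value of x''(t) + G; zero when x solves the equation."""
    second_derivative = (x(t + h) - 2.0 * x(t) + x(t - h)) / (h * h)
    return second_derivative + G

for name, x in [("original", solution),
                ("t -> -t", reversed_solution),
                ("t -> t + a", shifted_solution)]:
    # Each residual vanishes up to rounding error.
    print(name, abs(residual(x, 1.3)) < 1e-6)
```

Nothing in the equation singles out an origin or a direction of t, which is why a metric line M is enough to describe newtonian time.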
The properties listed above do not exhaust the different ways in which the notion of time enters physical theories; the development of theoretical physics has modified substantially the natural notion of time. A first modification was introduced by special relativity. Einstein’s definition of the time coordinate of distant events yields a notion of time which is observer dependent. An invariant structure can be maintained at the price of relaxing the 1d character of time and the 3d character of space, in favor of a notion of 4d spacetime. Alternatively, we may say that the notion of a single time is replaced by a three-parameter family of times $t_v$, one for each Lorentz observer. Therefore, the time we use in special relativity is not unique, as is the time in newtonian mechanics. Rather than a single line, we have a three-parameter family of lines (the straight lines through the origin that fill the light cone of Minkowski space). Denote this three-parameter set of lines as $M^3$.
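The three-parameter family of times can be made concrete with the standard Lorentz transformation of the time coordinate. This is a minimal sketch, assuming units with c = 1 and observers whose clocks all read zero at the common origin; the function name and the sample numbers are illustrative only.

```python
import math

def time_coordinate(event, velocity):
    """Time t_v assigned to `event` = (t, x, y, z) by the Lorentz observer
    moving with 3-velocity `velocity` (units with c = 1)."""
    t, x, y, z = event
    vx, vy, vz = velocity
    speed_squared = vx * vx + vy * vy + vz * vz
    if speed_squared >= 1.0:
        raise ValueError("observer speed must be below c = 1")
    gamma = 1.0 / math.sqrt(1.0 - speed_squared)
    # Standard Lorentz boost of the time coordinate: t' = gamma (t - v . x).
    return gamma * (t - (vx * x + vy * y + vz * z))

event = (0.0, 1.0, 0.0, 0.0)  # simultaneous with the origin for the observer at rest

# The three components of the velocity label the family of times t_v.
for velocity in [(0.0, 0.0, 0.0), (0.6, 0.0, 0.0), (-0.6, 0.0, 0.0)]:
    print(velocity, time_coordinate(event, velocity))
```

The same event is assigned a zero, a negative, or a positive time depending on the observer, which is the loss of uniqueness described above.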
Times in GR. There are several distinct possibilities of identifying “time” in GR. Each singles out a different notion of time. Each of these notions reduces to the standard nonrelativistic or special-relativistic time in appropriate limits, but each lacks at least some of the properties of nonrelativistic time. The most common ways of identifying time within GR are the following.
Coordinate time $x^0$. Coordinate time can be arbitrarily rescaled, and does not provide a way of identifying two time intervals as equal in duration. Therefore it is not metric, in the sense defined above. In addition, the possibility of changing the time coordinate freely from point to point implies that there is an infinite-dimensional choice of equally good coordinate times. Finally, unlike prerelativistic time, $x^0$ is not an observable quantity. Denote the set of all the possible coordinate times as $L^\infty$.
Proper time $\tau$. This notion of time is metric. But it is very different from the notion of time in special relativity, for several reasons. First, it is determined by the gravitational field. Second, we have a different time for each worldline or, infinitesimally, for every speed at every point. For an infinitesimal timelike displacement $dx^\mu$ at a point $x$, the infinitesimal time interval is $d\tau = \sqrt{-g_{\mu\nu}(x)\,dx^\mu dx^\nu}$. This notion of time is a radical departure from the notion of time used in special relativity, because it is determined by the dynamical fields in the theory. A solution of Einstein’s equations defines a point in the phase space $\Gamma$ of GR. It assigns a metric structure to every worldline. Therefore this notion of time is given by a function from the phase space $\Gamma$, multiplied by the set of the worldlines $wl$, into the metric structures $d: wl \times wl \to R^+$. Denote this function as $m^\infty$. Call “internal” a notion of time affected by the dynamics.
Before GR, dynamics could be expressed as evolution in a single time variable, which has metric properties and could be measured. In general-relativistic physics, this concept of time splits into two distinct concepts: we can still view the dynamics as evolution in a time variable, $x^0$, but this time has no metric properties and is not observable; alternatively, there is a notion of time that has metric properties, $\tau$, but the dynamics of the theory cannot be expressed as evolution in $\tau$. Is there a way to go around this split, and view GR as a dynamical theory in the sense of a theory expressing evolution in an observable metric time?
Clock time. The dynamics of GR determines how observable quantities evolve with respect to one another. We can always choose one observable quantity $t_c$, declare it the independent one, and describe how the other observables evolve as functions of it. A typical example of this clock time is the radius of a spatially compact universe in relativistic cosmology, R. Formally, clock time is a function on the extended configuration space C of the theory (see Chapter 3). Denote this notion of time as the clock time $\tau_c: C \to R$.
Under this definition of time, GR becomes similar to a standard hamiltonian dynamical theory. A clock time, however, generally behaves as a clock only in certain states, or for a limited amount of time. The radius of the universe, for instance, fails as a good time variable when the universe recollapses. In general, a clock time lacks temporal globality. In fact, several results are known concerning obstructions to defining a function $t_c$ that behaves as “a good time” globally [66].
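The contrast between coordinate time and proper time in the list above can be sketched numerically by integrating $d\tau = \sqrt{-g_{\mu\nu}(x)\,dx^\mu dx^\nu}$ along a worldline. The code is only an illustration under stated assumptions: signature (−,+,+,+), units with c = 1, a simple midpoint sum, and two hand-picked metrics. In particular `slow_clock_metric` is a hypothetical toy field chosen for the example, not a solution of Einstein’s equations taken from the text.

```python
import math

def minkowski_metric(x):
    """Flat metric g_{mu nu}, signature (-,+,+,+); independent of the point x."""
    return [[-1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def slow_clock_metric(x):
    """Toy metric with g_00 = -1/4 (hypothetical, for illustration only)."""
    g = minkowski_metric(x)
    g[0][0] = -0.25
    return g

def proper_time(worldline, metric, s0, s1, steps=10000):
    """Sum sqrt(-g_{mu nu} dx^mu dx^nu) along worldline(s) for s in [s0, s1]."""
    total = 0.0
    previous = worldline(s0)
    for k in range(1, steps + 1):
        current = worldline(s0 + (s1 - s0) * k / steps)
        dx = [c - p for c, p in zip(current, previous)]
        midpoint = [(c + p) / 2.0 for c, p in zip(current, previous)]
        g = metric(midpoint)
        interval = sum(g[m][n] * dx[m] * dx[n] for m in range(4) for n in range(4))
        if interval > 0.0:
            raise ValueError("displacement is not timelike")
        total += math.sqrt(-interval)
        previous = current
    return total

def at_rest(s):
    """Worldline parametrized by coordinate time, sitting at the origin."""
    return (s, 0.0, 0.0, 0.0)

def moving(s):
    """Worldline parametrized by coordinate time, with speed 0.6 along x."""
    return (s, 0.6 * s, 0.0, 0.0)

# The same coordinate-time interval yields different proper times,
# depending on the worldline and on the metric (the gravitational field).
print(proper_time(at_rest, minkowski_metric, 0.0, 10.0))
print(proper_time(moving, minkowski_metric, 0.0, 10.0))
print(proper_time(at_rest, slow_clock_metric, 0.0, 10.0))
```

The function takes both a metric and a worldline as arguments, mirroring the statement that proper time is a function on the phase space times the set of worldlines.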
Notice that some of these relativistic notions of time are, in a sense, opposite to the prerelativistic case: while in newtonian theory time evolution is captured by a function from the metric line M (time) to the configuration or phase space, now the notion of time is captured by a function from the configuration or phase space to the metric line. This inversion is the mathematical expression of the physical idea that the flow of time is affected, or determined, by the dynamics of the system itself.
Finally, none of the ways of thinking of time in classical GR can be uncritically extended to the quantum regime. Quantum fluctuations of physical clocks, and quantum superposition of different metric structures, make the very notion of time fuzzy at the Planck scale. As will be discussed in the second part of this book, a fundamental concept of time may be absent in quantum gravity.
Notions of time. Notice that properties of time progressively disappear in going toward more fundamental physical theories. At the opposite end of the spectrum, there are properties associated with the notion of time used in the natural languages which are not present in physical theories. They play a role in other areas of natural investigations. I mention these properties for the sake of completeness. These are, for instance, memory, expectations, and the psychological perception of free will.
To summarize, I have identified the following properties of the notion of time:
1. Existence of memory and expectations.
2. Existence of a preferred instant of time: the present, the now.
3. Directionality: the possibility of distinguishing the past from the future direction.
4. Uniqueness: the feature that is lost in special and general relativity, where we cannot identify a preferred time variable.
5. The property of being external: the independence of the notion of time from the dynamical variables of the theory.
6. Spatial globality: the possibility of defining the same time variable in all space points.
7. Temporal globality: the fact that every motion goes through every value of the time variable once and only once.
8. Metricity: the possibility of saying that two time intervals have equal duration.
9. One-dimensionality: namely the possibility of arranging the time instants in a one-dimensional manifold.
This discussion suggests a sequence of notions of time, which I list here in order of decreasing complexity.
Time of natural language. This is the notion of time of everyday language, which includes all the features just listed. This notion of time is not necessarily nonscientific: for instance, any scientific approach to, say, the human brain should make use of this notion of time.
Time-with-a-present. This is the notion of time that has all the features just listed, including the existence of a preferred instant, the present, but not the notions of memory and expectations, which are notions usually more related to complex systems (brain) than to time itself. The notion of present is generally considered a feature of time itself. This notion of time is the one to which people often refer when they refer to the “flow of time”, or Eddington’s “vivid perception of the flow of time” [67]. This notion of time can be described by the structure of a parametrized line R.
Thermodynamical time. If we maintain the distinction between a future direction and a past direction, but we give up the notion of present, we obtain the notion of time typical of thermodynamics. Since thermodynamics is the first physical science that appears in this list, this is maybe a good place to emphasize that the notion of present, of the “now”, is completely absent from the description of the world in physical terms. This notion of time can be described by the structure of an affine line A.
Newtonian time. In newtonian mechanics there is no preferred direction of time. Notice that in the absence of a preferred direction of time, the notions of cause and effect are interchangeable. This notion of time can be described by the structure of a metric line M.
Special-relativistic time. If we give up uniqueness, we have the time used in special relativity: different Lorentz observers have a different notion of time. Special-relativistic time is still external, spatially and temporally global, metrical and one-dimensional, but it is not unique. There is a three-parameter set of quantities that share the status of time. This notion of time can be described by the three-parameter set of metric lines $M^3$.
Table 2.1 Notions of time

Time notion                  Property             Example                  Form
natural language time        memory               brain
time-with-a-present          present              biology                  R
thermodynamical time         direction            thermodynamics           A
newtonian time               unique               newtonian mechanics      M
special-relativistic time    external             special relativity       $M^3$
cosmological time            spatially global     cosmological time        m
proper time                  temporally global    worldline proper time    $m^\infty$
clock time                   metric               clocks in GR             c
parameter time               one-dimensional      coordinate time          $L^\infty$
no-time                      none                 quantum gravity          none
Cosmological time. By this I indicate a time which is spatially and temporally global, metrical and one-dimensional, but is not external; namely, it is dynamically determined by the theory. Proper time in cosmology is the typical example. It is the most structured notion of time that occurs in GR. Denote it by m.
Proper time. By this I indicate a time which is temporally global, metrical and one-dimensional, but is not spatially global, such as the notion of proper time along worldlines in GR. It can be represented by a function $m^\infty$ defined on the cartesian product of the phase space and the ensemble of the worldlines.
Clock time. By this I indicate a time which is metrical and one-dimensional, but is not temporally global. A realistic matter clock in GR defines a time in this sense. This notion of time can be described by a function c on the phase space.
Parameter time. By this I mean a notion of time which is not metric and not observable. The typical example is the coordinate time in GR. Another example of parameter time is the evolution parameter in the parametrized formulation of the dynamics of a relativistic particle. Parameter time is described by an unparametrized line L, or by an infinite set $L^\infty$ of unparametrized lines.
No-time. Finally, this is the bottom level in the analysis; it is not a time concept. Rather, I indicate by no-time the idea that a predictive physical theory can be well defined also in the absence of any notion of time.
The list must not be taken rigidly. It is summarized in Table 2.1.
There is an interesting feature that emerges from the above analysis: the hierarchical arrangement. While some details of this arrangement may be artificial, the analysis nevertheless points to a general fact: moving from theories of “special” objects, like the brain or living beings, toward more general theories that include larger portions of Nature, we make use of a physical notion of time that is less specific and has fewer determinations. If we observe Nature at progressively more fundamental levels, and we seek laws that hold in more general contexts, then we discover that these laws require, or admit, an increasingly weaker notion of time.
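The hierarchical arrangement can be written down as a small data structure. The sketch below is an idealization, not a construction taken from the text: it assumes a strictly nested ordering, in which each notion of time keeps every property listed after the one it drops, whereas the list itself is explicitly not meant to be taken rigidly. The names `PROPERTIES`, `NOTIONS` and `properties_of` are hypothetical.

```python
# The nine properties, ordered from the most specific to the most basic.
PROPERTIES = [
    "memory and expectations",
    "present",
    "directionality",
    "uniqueness",
    "externality",
    "spatial globality",
    "temporal globality",
    "metricity",
    "one-dimensionality",
]

# The ten notions of time, in order of decreasing complexity.
NOTIONS = [
    "time of natural language",
    "time-with-a-present",
    "thermodynamical time",
    "newtonian time",
    "special-relativistic time",
    "cosmological time",
    "proper time",
    "clock time",
    "parameter time",
    "no-time",
]

def properties_of(notion):
    """Properties retained by a notion, assuming a strictly nested hierarchy:
    the notion at level k keeps properties k, k+1, ..., 9 and has lost the rest."""
    level = NOTIONS.index(notion)
    return PROPERTIES[level:]

for notion in NOTIONS:
    retained = properties_of(notion)
    print(f"{notion}: {', '.join(retained) if retained else 'none'}")
```

Under this idealization, the first retained property of each notion is the entry in the "Property" column of the table, and no-time retains nothing.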
This observation suggests that “high-level” features of time are not present at the fundamental level, but “emerge” as features of specific physical regimes, like the notion of “water surface” emerges in certain regimes of the dynamics of a combination of water and air molecules (see, for instance, [68]).
Notions of time with more attributes are high-level notions that have no meaning in more general situations. The uniqueness of newtonian time, for instance, makes sense only in the special regime in which we consider an ensemble of bodies moving slowly with respect to each other. Thus, the notion of a unique time is a high-level notion that makes sense only for some regimes in Nature. For general systems, most features of time are genuinely meaningless.
2.4.5 Nonrelativistic coordinates
The precise meaning of the coordinates $x = (\vec{x}, t)$ in newtonian and special-relativistic physics is far from obvious. Let me recall it here, in order to clarify the precise difference between these and the relativistic coordinates.
Newton is well aware that the motions we observe are relative motions, and stresses this point in Principia. His point is not that we can directly observe absolute motion. His point is that we can infer the absolute motion, or “true motion”, or motion with respect to absolute space, from its physical effects (such as the concavity of the water in the bucket), starting from our observation of relative motions.
For instance, we observe and describe motions with respect to Earth, but from subtle effects such as Foucault’s pendulum we infer that these are not true motions. The experiment of the bucket is an example of the possibility of revealing true motion (rotation of the water with respect to space), disentangling it from relative motion (rotation with respect to the bucket), by means of an observable effect (the concavity of the water surface).²⁶
For Newton, the coordinates $\vec{x}$ that enter his main equation
$$\vec{F} = m\,\frac{d^2\vec{x}(t)}{dt^2} \tag{2.152}$$
are the coordinates of absolute space. However, since we cannot directly observe space, the only way we can coordinatize space points is by using physical objects. The coordinates $\vec{x}$ of the object A moving along the trajectory $\vec{x}(t)$ are therefore defined as distances from a chosen system O of objects, which we call a “reference frame”. But then $\vec{x}$ are not the coordinates of absolute space. So, how can equation (2.152) work?
²⁶ Newton accords deep significance to the fact that we can unveil true motion. He describes relative motion as the way reality is observed by us, and true motion as the way reality might be directly “perceived”, or “sensed”, by God. This is why Newton calls space – the entity with respect to which true motion happens – the “Sensorium of God”: true motion is motion “with respect to God”, or “as perceived by God”. There is a platonic tone in this idea that reason finds the way to the veiled divine truth beyond appearances. I wouldn’t read this as so removed from modernity as it is often portrayed. There isn’t all that much difference between Newton’s inquiry into God’s way of “sensing the world” and the modern search for the most effective way of conceptualizing reality. Newton’s God plays a mere linguistical role here, the role of denoting a major enterprise: upgrading our own conceptual structure for understanding reality.
The solution of the difficulty is to use the capacity of unveiling “true motion” that Newton has pointed out, in order to select the objects forming the reference frame O wisely. There are “good” and “bad” reference frames. The good ones are the ones in which no effect such as the concavity of the water surface of Newton’s bucket can be observed within a desired accuracy. Equation (2.152) is correct, to the desired accuracy, if we use coordinates defined with respect to these good frames. In other words, the physical content of (2.152) is actually quite subtle:
There exist reference objects O with respect to which the motion of any other object A is correctly described by (2.152).
This is a statement that begins to be meaningful only when a sufficiently large number of moving objects is involved.
Notice also that, for this construction to work, it is important that the objects O forming the reference frame are not affected by the motion of the object A. There shouldn’t be any dynamical interaction between A and O.
Special relativity does not change much of this picture. Since absolute simultaneity makes no sense, if the event A is distant from the clock in the origin, its time t is ill defined. Einstein’s idea is to define a procedure for assigning a t to distant events, using clocks moving inertially:
At clock time $t_e$, send a light signal that reaches the event. Receive the reflected signal back at $t_r$. The time coordinate of the event is defined to be $t_A = \tfrac{1}{2}(t_e + t_r)$.
It is important to emphasize that this is a useful definition, not a metaphysical statement that the event A happens “right at the time when” the observer clock displays $t_A$.
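Einstein’s procedure is simple enough to state as code. The following minimal sketch computes the time coordinate from the emission and reception times read on the observer’s clock. The radar distance returned alongside it is a standard companion definition that is added here as an assumption, not something stated in the text, and the function name is hypothetical.

```python
def radar_coordinates(t_emit, t_receive, c=1.0):
    """Assign coordinates to a distant event from one clock's readings.

    t_emit:    clock time at which the light signal is sent
    t_receive: clock time at which the reflected signal returns
    Returns (t_event, distance); the distance is the usual radar
    distance, included here as an assumption of this sketch.
    """
    if t_receive < t_emit:
        raise ValueError("the reflected signal cannot return before it is sent")
    t_event = 0.5 * (t_emit + t_receive)          # Einstein's definition
    distance = 0.5 * c * (t_receive - t_emit)     # half the round-trip light path
    return t_event, distance

# A signal sent at clock time 2 and received back at clock time 8.
print(radar_coordinates(2.0, 8.0))
```

The function uses only readings of the observer’s own clock, which makes explicit that the assigned time is a convention built from local measurements.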
Special relativity replaces Newton’s absolute space and absolute time with a single entity, Minkowski’s absolute spacetime, while the notion of inertial system and the meaning of the coordinates are the same as in newtonian mechanics.
Summarizing, these coordinates have the following properties.
(i) Coordinates describe position with respect to physical reference objects (reference frames).
(ii) Space coordinates are defined by the distance from the reference bodies. Time coordinates are defined with respect to isochronous clocks.
(iii) Reference objects are appropriately chosen: they are such that the reference system they define is inertial.
(iv) Inertial frames reveal the structure of absolute spacetime itself.
(v) The object A whose dynamics is described by the coordinates does not interact with the reference objects O. There is no dynamical coupling between A and O.
Relativistic coordinates do not have any of these properties. The fact that the two are indicated with the same notation $x^\mu$ is only an unfortunate historical accident.
2.4.6 Physical coordinates and GPS observables
Instead of working with arbitrary unphysical coordinates $x^\mu$, we can choose to coordinatize spacetime events with coordinates $X^\mu$ having an assigned physical interpretation. For instance, we can describe the Universe by giving a name $X$ to each galaxy and choosing $X^0$ as the proper time from the Big Bang along the galaxy worldline. If we do so, the defining properties of the coordinates $X$ must be added to the formalism. We must add a certain number of equations for the gravitational field: the equations of motion of the objects used to fix the coordinates (the galaxies in the example). These additional equations gauge-fix general covariance.
The gauge-fixing can also be partial. For instance, a common choice is
\[ e^0_0(X) = 1, \qquad e^i_0(X) = 0, \qquad e^0_a(X) = 0, \tag{2.153} \]
where $i = 1, 2, 3$ and $a = 1, 2, 3$. This corresponds to partially fixing the coordinates by requiring that $X^0$ measures proper time, that equal-$X^0$ surfaces are locally instantaneity surfaces in the sense of Einstein for the constant-$X$ lines, and that the local Lorentz frames are chosen so that these lines are still.
If the coordinates are fully specified, the set formed by these physical gauge-fixing equations and the equations of motion has no residual gauge invariance; that is, initial data determine evolution uniquely. This procedure can be implemented in many possible ways, since there are arbitrarily many ways of fixing physical coordinates, and none is a priori better than any other. In spite of this arbitrariness, this procedure is often convenient when the physical situation suggests a natural coordinate choice, as in the cosmological context mentioned.
Physical coordinates $X^\mu$ defined by matter filling space can only be effectively used in the cosmological context, because it is only at the cosmological scale that matter fills space. In a system in which there are empty regions, such as the Solar System, these physical coordinates are not available. An interesting alternative choice is provided by the GPS coordinates described below.
The physical coordinates $X^\mu$ are partial observables, and we can associate measuring devices with them.
Undetermined physical coordinates. Finally, there is a third interpretation of the coordinates of GR, which is intermediate between arbitrary coordinates $x^\mu$ and physical coordinates $X^\mu$. Imagine that a region of the universe is filled with certain light objects, which may not be in free fall. We can use these objects to define physical coordinates $X^\mu$, but also choose to ignore the equations of motion of these objects. We obtain a system of equations for the gravitational field and other matter, expressed in terms of coordinates $X^\mu$ that are interpreted as the spacetime location of reference objects whose dynamics we have chosen to ignore.
This set of equations is under-determined: the same initial conditions can evolve into different solutions. However, the interpretation of such under-determination is simply that we have chosen to neglect part of the equations of motion. Different solutions with the same initial conditions represent the same physical configuration of the fields, but expressed, say, in one case with respect to free-falling reference objects, in the other case with respect to reference objects on which a force has acted at a certain moment, and so on. This procedure has the disadvantage of being useless in quantum theory, where we cannot assume that something is observable and at the same time neglect its dynamics.
In conclusion, one should always be careful in talking about general-relativistic coordinates, whether one is referring to
(i) arbitrary mathematical coordinates $x$;
(ii) physical coordinates $X$ with an interpretation as positions with respect to objects whose equations of motion are taken into account;
(iii) physical coordinates with an interpretation as positions with respect to objects whose equations of motion are ignored.
The system of equations of motion is nondeterministic in (i) and (iii), deterministic in (ii). The coordinates are partial observables in (ii) and (iii), but not in (i). Confusion about observability in GR follows from confusing these three different interpretations of the coordinates. The following is an example of physical coordinates.
GPS observables. In the literature there are many attempts to define useful physical coordinates. It is easier to define physical coordinates in the presence of matter than in the context of pure GR. Ideally, we can consider GR interacting with four scalar matter fields. Assume that the configuration of these fields is sufficiently nondegenerate. Then the components of the gravitational field at points defined by given values of the matter fields are gauge-invariant observables. This idea has been developed in a number of variants, such as dust-carrying clocks and others (see [69–71] and references therein). The extent to which the result is realistic or useful is questionable. It is rather unsatisfactory to understand the theory in terms of fields that do not exist, or phenomenological objects such as dust, and it is questionable whether these procedures could make sense in the quantum theory, where the aim is to describe Planck-scale dynamics. Earlier attempts to write a complete set of gauge-invariant observables are in the context of pure GR [72]. The idea is to construct four scalar functions of the gravitational field (say, scalar polynomials of the curvature), and use these to localize points. The value of a fifth scalar function in a point where the four scalar functions have a given value is a gauge-invariant observable. This works, but the result is mathematically very intricate and physically very unrealistic. It is certainly possible in principle to construct detectors of such observables, but I doubt any experimenter would get funded for a proposal to build such an apparatus.
There is a simple way out, based on GR coupled with a minimal and very realistic amount of additional matter. Indeed, this way out is so realistic that it is in fact real: it is essentially already implemented by existing technology, the Global Positioning System (GPS), which is the first technological application of GR, or the first large-scale technology that needs to take GR effects into account [73].
Consider a generally covariant system formed by GR coupled with four small bodies. These are taken to have negligible mass; they will be considered as point particles for simplicity, and called “satellites”. Assume that the four satellites follow timelike geodesics; that these geodesics meet in a common (starting) point O; and at O they have a given (fixed) speed – the same for all four – and directions as the four vertices of a tetrahedron. The theory might include any other matter. Then (there is a region R of spacetime for which) we can uniquely associate four numbers $s^\alpha$, $\alpha = 1, 2, 3, 4$, to each spacetime point p as follows. Consider the past lightcone of p. This will (generically) intersect the four geodesics in four points $p_\alpha$. The numbers $s^\alpha$ are defined as the distance between $p_\alpha$ and O (that is, the proper time along the satellites’ geodesic). We can use the $s^\alpha$ as physically defined coordinates for p. The components $g_{\alpha\beta}(s)$ of the metric tensor in these coordinates are gauge-invariant quantities. They are invariant under four-dimensional diffeomorphisms (because these deform the metric as well as the satellites’ worldlines). They define a complete set of gauge-invariant observables for the region R.
The physical picture is simple, and its realism is transparent. Imagine that the four “satellites” are in fact satellites, each carrying a clock that measures the proper time along its trajectory, starting at the meeting point O. Imagine also that each satellite broadcasts its local time with a radio signal. Suppose I am at the point p and have an electronic device that simply receives the four signals and displays the four readings; see Figure 2.5. These four numbers are precisely the four physical coordinates $s^\alpha$ defined above. Current technology permits us to perform these measurements with an accuracy well within the relativistic regime [73, 74]. If we then use a rod and a clock and measure the physical 4-distances between $s^\alpha$ coordinates, we are directly measuring the components of the metric tensor in the physical coordinate system. In the terminology of Chapter 3, the $s^\alpha$ are partial observables, while the $g_{\alpha\beta}(s)$ are complete observables.
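[Figure 2.5: spacetime diagram with axes $t$ and $x$, showing the origin O, the point p, the surface Σ, and the coordinates $s^1$, $s^2$.]
Fig. 2.5 $s^1$ and $s^2$ are the GPS coordinates of the point p. Σ is a Cauchy surface with p in its future domain of dependence.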
As shown below, the physical coordinates $s^\alpha$ have nice geometrical properties; they are characterized by
\[ g^{\alpha\alpha}(s) = 0, \qquad \alpha = 1, \ldots, 4. \tag{2.154} \]
Surprisingly, in spite of the fact that they are defined by what looks like a rather nonlocal procedure, the evolution equations for $g_{\alpha\beta}(s)$ are local. These evolution equations can be written explicitly using the Arnowitt–Deser–Misner (ADM) variables (see [131] of Chapter 3 for details). Lapse and Shift turn out to be fixed local functions of the three-metric.
In what follows, I first introduce the GPS coordinates $s^\alpha$ in Minkowski space. Then I consider a general spacetime. I assume the Einstein summation convention only for couples of repeated indices that are one up and one down. Thus, in (2.154), α is not summed. While dealing with Minkowski spacetime, the spacetime indices μ, ν are raised and lowered with the Minkowski metric. Here I write an arrow over three- as well as four-dimensional vectors. Also, here I use the signature [+, −, −, −], in order to have the same expressions as in the original article on the subject.
Consider a tetrahedron in three-dimensional euclidean space. Let its center be at the origin and its four vertices $\vec v^\alpha$, where the vectors $\vec v^\alpha$ have unit length, $|\vec v^\alpha|^2 = 1$, and $\vec v^\alpha \cdot \vec v^\beta = -1/3$ for $\alpha \neq \beta$. Here $\alpha = 1, 2, 3, 4$ is an index that distinguishes the four vertices, and should not be confused with vector indices. With a convenient orientation, these vertices have cartesian coordinates ($a = 1, 2, 3$)
\[ v^{1a} = (0,\ 0,\ 1), \qquad v^{2a} = \left(\tfrac{2\sqrt{2}}{3},\ 0,\ -\tfrac{1}{3}\right), \tag{2.155} \]
\[ v^{3a} = \left(-\tfrac{\sqrt{2}}{3},\ \sqrt{\tfrac{2}{3}},\ -\tfrac{1}{3}\right), \qquad v^{4a} = \left(-\tfrac{\sqrt{2}}{3},\ -\sqrt{\tfrac{2}{3}},\ -\tfrac{1}{3}\right). \tag{2.156} \]
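As a quick sanity check on the vertex coordinates, the following sketch verifies numerically that the four vectors have unit length, pairwise scalar product −1/3, and sum to zero (center at the origin). The variable names are illustrative.

```python
import math

R2 = math.sqrt(2.0)

# Cartesian coordinates of the four tetrahedron vertices.
V = [
    (0.0, 0.0, 1.0),
    (2 * R2 / 3, 0.0, -1 / 3),
    (-R2 / 3, math.sqrt(2 / 3), -1 / 3),
    (-R2 / 3, -math.sqrt(2 / 3), -1 / 3),
]


def dot3(a, b):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


# Gram matrix: 1 on the diagonal, -1/3 off the diagonal.
gram = [[dot3(a, b) for b in V] for a in V]

# The center of the tetrahedron sits at the origin.
centroid = [sum(v[k] for v in V) / 4 for k in range(3)]

for row in gram:
    print(["%+.4f" % x for x in row])
print(centroid)
```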
Let us now go to a four-dimensional Minkowski space. Consider four timelike 4-vectors $\vec W^\alpha$, of length unity, $|\vec W^\alpha|^2 = 1$, representing the normalized 4-velocities of four particles moving away from the origin in the directions $\vec v^\alpha$ at a common speed $v$. Their Minkowski coordinates ($\mu = 0, 1, 2, 3$) are
\[ W^{\alpha\mu} = \frac{1}{\sqrt{1 - v^2}}\,\left(1,\ v\, v^{\alpha a}\right). \tag{2.157} \]
Fix the velocity $v$ by requiring the determinant of the matrix $W^{\alpha\mu}$ to be unity. (This choice fixes $v$ at about one-half the speed of light; a different choice changes only a few normalization factors in what follows.) The four-by-four matrix $W^{\alpha\mu}$ plays an important role in what follows. Notice that it is a fixed matrix whose entries are certain given numbers.
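The value of the speed singled out by the determinant condition can be found numerically. The sketch below builds the matrix of 4-velocities from the tetrahedron vertices and solves for v by bisection. One assumption is worth flagging: it normalizes the absolute value of the determinant, since swapping two satellites flips its sign. The helper names are illustrative.

```python
import math

R2 = math.sqrt(2.0)
V = [
    (0.0, 0.0, 1.0),
    (2 * R2 / 3, 0.0, -1 / 3),
    (-R2 / 3, math.sqrt(2 / 3), -1 / 3),
    (-R2 / 3, -math.sqrt(2 / 3), -1 / 3),
]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def w_matrix(v):
    """Rows alpha = 1..4 of W^{alpha mu} = gamma * (1, v * v^alpha)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma] + [gamma * v * c for c in vert] for vert in V]


def solve_speed(lo=0.1, hi=0.9, tol=1e-12):
    """Bisection on |det W(v)| - 1, which increases with v on (0, 1)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if abs(det(w_matrix(mid))) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


v_star = solve_speed()
print(v_star)   # a little above one-half, in units of the speed of light
```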
Consider one of the four 4-vectors, say $\vec W = \vec W^1$. Consider a free particle in Minkowski space that starts from the origin with 4-velocity $\vec W$. Call it a “satellite”. Its worldline $l$ is $\vec x(s) = s\,\vec W$. Since $\vec W$ is normalized, $s$ is precisely the proper time along the worldline. Consider now an arbitrary point p in Minkowski spacetime, with coordinates $\vec X$. Compute the value of $s$ at the intersection between $l$ and the past lightcone of p. This is a simple exercise, giving
\[ s = \vec X \cdot \vec W - \sqrt{(\vec X \cdot \vec W)^2 - |\vec X|^2}. \tag{2.158} \]
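For readers who want the intermediate step, the exercise can be spelled out as follows. This is a sketch using only the normalization of the 4-velocity and the condition that the separation between p and the emission event is null.

```latex
\begin{aligned}
0 &= |\vec X - s\,\vec W|^2
   = |\vec X|^2 - 2s\,\vec X\cdot\vec W + s^2\,|\vec W|^2
   = s^2 - 2s\,\vec X\cdot\vec W + |\vec X|^2 , \\
s_\pm &= \vec X\cdot\vec W \pm \sqrt{(\vec X\cdot\vec W)^2 - |\vec X|^2}\, .
\end{aligned}
```

The two roots are the intersections of the worldline with the full lightcone of p. A future-directed timelike worldline crosses the past lightcone first, so the emission event corresponds to the smaller root, which is the expression with the minus sign.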
Now consider four satellites moving out of the origin at 4-velocity $\vec W^\alpha$. If they radio broadcast their position, an observer at the point p with Minkowski coordinates $\vec X$ receives the four signals $s^\alpha$:
\[ s^\alpha = \vec X \cdot \vec W^\alpha - \sqrt{(\vec X \cdot \vec W^\alpha)^2 - |\vec X|^2}. \tag{2.159} \]
Introduce (nonlorentzian) general coordinates $s^\alpha$ on Minkowski space, defined by the change of variables (2.159). These are the coordinates read out by a GPS device in Minkowski space. The jacobian matrix of the change of coordinates is given by
\[ \frac{\partial s^\alpha}{\partial x^\mu} = W^\alpha_\mu - \frac{W^\alpha_\mu\,(\vec X \cdot \vec W^\alpha) - X_\mu}{\sqrt{(\vec X \cdot \vec W^\alpha)^2 - |\vec X|^2}}, \tag{2.160} \]
where $W^\alpha_\mu$ and $X_\mu$ are $W^{\alpha\mu}$ and $X^\mu$ with the spacetime index lowered with the Minkowski metric. This defines the tetrad field $e^\alpha_\mu(s)$:
\[ e^\alpha_\mu(s(X)) = \frac{\partial s^\alpha}{\partial x^\mu}(X). \tag{2.161} \]
The contravariant metric tensor is given by $g^{\alpha\beta}(s) = e^\alpha_\mu(s)\, e^{\mu\beta}(s)$. Using the relation $|\vec W^\alpha|^2 = 1$, a straightforward calculation shows that
\[ g^{\alpha\alpha}(s) = 0, \qquad \alpha = 1, \ldots, 4. \tag{2.162} \]
This equation has the following nice geometrical interpretation. Fix α and consider the one-form field $\omega^\alpha = ds^\alpha$. In $s^\alpha$ coordinates, this one-form has components $\omega^\alpha_\beta = \delta^\alpha_\beta$ and therefore “length” $|\omega^\alpha|^2 = g^{\beta\gamma}\,\omega^\alpha_\beta\,\omega^\alpha_\gamma = g^{\alpha\alpha}$. But the “length” of a one-form is proportional to the volume of the (infinitesimal, now) 3-surface defined by the form. The 3-surface defined by $ds^\alpha$ is the surface $s^\alpha$ = constant. But $s^\alpha$ = constant is the set of points that read the GPS coordinate $s^\alpha$, namely that receive a radio broadcasting from a same event $p_\alpha$ of the satellite α, namely that are on the future lightcone of $p_\alpha$. Therefore $s^\alpha$ = constant is a portion of this lightcone; it is a null surface, and therefore its volume is zero. And so $|\omega^\alpha|^2 = 0$ and $g^{\alpha\alpha} = 0$.
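The vanishing of the diagonal components can also be checked numerically in Minkowski space. The sketch below computes the four GPS readings at a sample point, evaluates the jacobian from its closed form, and contracts it with the Minkowski metric in signature [+, −, −, −]. The sample point and the value 0.544 for the speed are illustrative assumptions; any speed between 0 and 1 gives null diagonal entries.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)        # diagonal of the Minkowski metric
R2 = math.sqrt(2.0)
V = [
    (0.0, 0.0, 1.0),
    (2 * R2 / 3, 0.0, -1 / 3),
    (-R2 / 3, math.sqrt(2 / 3), -1 / 3),
    (-R2 / 3, -math.sqrt(2 / 3), -1 / 3),
]
SPEED = 0.544                         # assumed approximate satellite speed


def four_velocities(v):
    """Contravariant components W^{alpha mu} = gamma * (1, v * v^alpha)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma] + [gamma * v * c for c in vert] for vert in V]


def mdot(a, b):
    """Minkowski scalar product of two 4-vectors with upper indices."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))


def gps_coordinates(X, W):
    """Proper-time readings s^alpha received at the point X."""
    out = []
    for w in W:
        xw = mdot(X, w)
        out.append(xw - math.sqrt(xw * xw - mdot(X, X)))
    return out


def jacobian(X, W):
    """d s^alpha / d x^mu in closed form, indices lowered with ETA."""
    x_low = [e * x for e, x in zip(ETA, X)]
    rows = []
    for w in W:
        w_low = [e * x for e, x in zip(ETA, w)]
        xw = mdot(X, w)
        root = math.sqrt(xw * xw - mdot(X, X))
        rows.append([wl - (wl * xw - xl) / root for wl, xl in zip(w_low, x_low)])
    return rows


def inverse_metric(J):
    """g^{alpha beta} = eta^{mu nu} J^alpha_mu J^beta_nu."""
    return [[sum(e * a * b for e, a, b in zip(ETA, ra, rb)) for rb in J] for ra in J]


W = four_velocities(SPEED)
X = [5.0, 0.3, -0.7, 1.1]             # a sample point in the future of the origin
s = gps_coordinates(X, W)
J = jacobian(X, W)
g = inverse_metric(J)
diag = [g[a][a] for a in range(4)]
print(s)
print(diag)                           # each entry vanishes up to rounding
```

The off-diagonal entries of `g` are generally nonzero; only the four diagonal ones are constrained by the null character of the constant-reading surfaces.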
Since the $s^\alpha$ coordinates define $s^\alpha$ = constant surfaces that are null, we denote them as “null GPS coordinates”. It is useful to introduce another set of GPS coordinates as well, which have the traditional timelike and spacelike character. We denote these as $s^\mu$, call them “timelike GPS coordinates”, and define them by
\[ s^\alpha = W^\alpha_\mu\, s^\mu. \tag{2.163} \]
This is a simple algebraic relabeling of the names of the four GPS coordinates, such that $s^{\mu=0}$ is timelike and $s^{\mu=a}$ is spacelike. In these coordinates, the gauge condition (2.162) reads
\[ W^\alpha_\mu\, W^\alpha_\nu\, g^{\mu\nu}(s) = 0. \tag{2.164} \]
This can be interpreted geometrically as follows. The (timelike) GPS coordinates are coordinates $s^\mu$ such that the four one-form fields
\[ \omega^\alpha = W^\alpha_\mu\, ds^\mu \tag{2.165} \]
are null.
Let us now jump from Minkowski space to full GR. Consider GR coupled with four satellites of negligible mass that move geodesically and whose worldlines emerge from a point O with directions and velocity as above. Locally around O the metric can be taken to be minkowskian; therefore the details of the initial conditions of the satellites’ worldlines can be taken as above. The phase space of this system is the one of pure GR plus ten parameters, giving the location of O and the Lorentz orientation of the initial tetrahedron of velocities. The integration of the satellites’ geodesics and of the lightcones can be arbitrarily complicated in an arbitrary metric. However, if the metric is sufficiently regular, there will still be a region R in which the radio signals broadcast by the satellites are received. (In the case of multiple reception, the strongest one can be selected. That is, if the past lightcone of p intersects $l$ more than once, generically there will be one intersection which is at shorter luminosity distance.) Thus, we still have well-defined physical coordinates $s^\alpha$ on R. Equation (2.162) holds in these coordinates, because it depends only on the properties of the light propagation around p. We define also timelike GPS coordinates $s^\mu$ by (2.163), and we get condition (2.164) on the metric tensor.
To study the evolution of the metric tensor in GPS coordinates, it is easier to shift to ADM variables $N$, $N^a$, $\gamma_{ab}$. These are functions of the covariant components of the metric tensor, defined in general by
\[ ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu = N^2 dt^2 - \gamma_{ab}\,(dx^a - N^a dt)(dx^b - N^b dt). \tag{2.166} \]
Equivalently, they are related to the contravariant components of the metric tensor by
\[ g^{\mu\nu} v_\mu v_\nu = -\gamma^{ab} v_a v_b + (n^\mu v_\mu)^2, \tag{2.167} \]
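where $\gamma^{ab}$ is the inverse of $\gamma_{ab}$ and $n^\mu = (1/N,\ N^a/N)$. Using these variables, the gauge condition (2.164) reads
\[ W^\alpha_a\, W^\alpha_b\, \gamma^{ab} = (W^\alpha_\mu\, n^\mu)^2. \tag{2.168} \]
Notice now that this can be solved for the Lapse and Shift as a function of the 3-metric (recall that $W^\alpha_\mu$ are fixed numbers), obtaining
\[ n^\mu = W^\mu_\alpha\, q^\alpha, \tag{2.169} \]
where $W^\mu_\alpha$ is the inverse of the matrix $W^\alpha_\mu$ and
\[ q^\alpha = \sqrt{W^\alpha_a\, W^\alpha_b\, \gamma^{ab}}. \tag{2.170} \]
Or, explicitly,
\[ N = \frac{1}{W^0_\alpha\, q^\alpha}, \qquad N^a = \frac{W^a_\alpha\, q^\alpha}{W^0_\alpha\, q^\alpha}. \tag{2.171} \]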
The geometrical interpretation is as follows. We want the one-form $\omega^\alpha$ defined in (2.165) to be null, namely its norm to vanish. But in the ADM formalism this norm is the sum of two parts: the norm of the pull-back of $\omega^\alpha$ on the constant-time ADM surface, which is $q^\alpha$ given in (2.170) and depends on the three-metric, plus the square of the projection of $\omega^\alpha$ on $n^\mu$. We can thus obtain the vanishing of the norm by adjusting the Lapse and Shift. We have four conditions (one per α), and we can thus determine Lapse and Shift from the 3-metric. In other words, whatever the 3-metric, we can always adjust Lapse and Shift so that the gauge condition (2.164) is satisfied. But in the ADM formalism, the arbitrariness of the evolution in the Einstein equations is entirely captured by the freedom in choosing Lapse and Shift. Since here Lapse and Shift are uniquely determined by the 3-metric, evolution is determined uniquely if the initial data on a Cauchy surface are known. Therefore the evolution in the GPS coordinate $s^0$ of the GPS components of the metric tensor, $g_{\mu\nu}(s)$, is governed by deterministic equations: the ADM evolution equation with Lapse and Shift determined by (2.170)–(2.171). Notice also that evolution is local, since the ADM evolution equations, as well as (2.170)–(2.171), are local.²⁷
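To make the algebra concrete, here is a minimal sketch that takes an inverse 3-metric at a point, computes the four square roots, solves the linear system for the components of the normal, and reads off Lapse and Shift. It then rebuilds the contravariant metric and checks that the four one-forms are null. The sample inverse 3-metric, the speed 0.544 and all function names are illustrative assumptions; the linear solver is plain Gaussian elimination.

```python
import math

R2 = math.sqrt(2.0)
V = [
    (0.0, 0.0, 1.0),
    (2 * R2 / 3, 0.0, -1 / 3),
    (-R2 / 3, math.sqrt(2 / 3), -1 / 3),
    (-R2 / 3, -math.sqrt(2 / 3), -1 / 3),
]
SPEED = 0.544                          # assumed approximate satellite speed


def lowered_w(v):
    """W^alpha_mu = gamma * (1, -v * v^alpha): index lowered with [+,-,-,-]."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma] + [-gamma * v * c for c in vert] for vert in V]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def lapse_shift(gamma_inv, W):
    """Return (N, [N^a], n^mu) determined by the inverse 3-metric gamma^{ab}."""
    q = [math.sqrt(sum(w[a + 1] * w[b + 1] * gamma_inv[a][b]
                       for a in range(3) for b in range(3))) for w in W]
    n = solve(W, q)                    # n^mu is the inverse of W applied to q
    return 1.0 / n[0], [n[a] / n[0] for a in (1, 2, 3)], n


def gauge_residuals(gamma_inv, W, n):
    """W^alpha_mu W^alpha_nu g^{mu nu} for each alpha (no sum on alpha)."""
    g = [[n[m] * n[k] for k in range(4)] for m in range(4)]
    for a in range(3):
        for b in range(3):
            g[a + 1][b + 1] -= gamma_inv[a][b]
    return [sum(w[m] * w[k] * g[m][k] for m in range(4) for k in range(4)) for w in W]


W = lowered_w(SPEED)
gamma_inv = [[1.2, 0.1, 0.0], [0.1, 0.9, -0.2], [0.0, -0.2, 1.5]]   # made-up sample
N, shift, n = lapse_shift(gamma_inv, W)
res = gauge_residuals(gamma_inv, W, n)
print(N, shift)
print(res)                             # four residuals, zero up to rounding
```

Lapse and Shift come out as functions of the 3-metric at the same point only, which is the locality property emphasized in the text.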
How can the evolution of the quantities $g_{\mu\nu}(s)$ be local? The conditions on the null surfaces described in the previous paragraph are nonlocal. Coordinate distances typically yield nonlocality: imagine we define physical coordinates in the Solar System using the cosmological time $t_c$ and the spatial distances $x_S$, $x_E$, $x_J$ (at fixed $t_c$) from, say, the Sun, the Earth and Jupiter. The metric tensor $g_{\mu\nu}(t_c, x_S, x_E, x_J)$ in these coordinates is a gauge-invariant observable, but its evolution is highly nonlocal. To see this, imagine that in this moment (in cosmological time) Jupiter is swept away by a huge comet. Then the value of $g_{\mu\nu}(t_c, x_S, x_E, x_J)$ here changes instantaneously, without any local cause: the value of the coordinate $x_J$ has changed because of an event happening far away. What’s special about the GPS coordinates that avoids this nonlocality? The answer is that the value of a GPS coordinate at a point p does in fact depend on what happens “far away” as well. Indeed, it depends on what happens to the satellite. However, it only depends on what happened to the satellite when it was broadcasting the signal received in p, and this is in the past of p. If p is in the future domain of dependence of a partial Cauchy surface Σ, then the value of $g_{\mu\nu}(s)$ in p is completely determined by the metric and its derivative on Σ; namely, evolution is causal, because the entire information needed to set up the GPS coordinates is in the data in Σ; see Figure 2.5. Explicitly, the $s^\alpha$ = constant surfaces around Σ can be uniquely integrated ahead, all the way to p. They certainly can, as they represent just the evolution of a light front. This is how local evolution is achieved by these coordinates.
Summarizing, I have introduced a set of physical coordinates, determined by certain material bodies. Geometrical quantities, such as the components of the metric tensor expressed in physical coordinates, are gauge-invariant observables. There is no need to introduce a large unrealistic amount of matter or to construct complicated and unrealistic physical quantities out of the metric tensor. Four particles are sufficient to coordinatize a (region of a) four-geometry. Furthermore, the coordinatization procedure is not artificial: it is the real one utilized by existing technology.
²⁷ This does not imply that the full set of equations satisfied by $g_{\mu\nu}(s)$ must be local, since initial conditions on $s^0 = 0$ satisfy four other constraints besides the ADM ones.
Fig. 2.6 A simple apparatus to measure the gravitational field. Two GPS devices, reading $s^\mu_L$ and $s^\mu_R$ respectively, connected by a 1 meter rod. If, for instance, $s^\mu_R = s^\mu_L$ for $\mu = 0, 2, 3$, then the local value of $g_{11}(s)$ is $g_{11}(s) = (s^1_R - s^1_L)^{-2}\ \mathrm{m}^2$.
The components of the metric tensor in (timelike) GPS coordinates can be measured as follows (see Figure 2.6). Take a rod of physical length $L$ (small with respect to the distance along which the gravitational field changes significantly), with two GPS devices at its ends (reading timelike GPS coordinates). Orient the rod (or search among recorded readings) so that the two GPS devices have the same reading $s$ of all coordinates except for $s^1$. Let $\delta s^1$ be the difference in the two $s^1$ readings. Then we have along the rod
\[ ds^2 = g_{11}(s)\,\delta s^1\,\delta s^1 = L^2. \tag{2.172} \]
Therefore
\[ g_{11}(s) = \left(\frac{L}{\delta s^1}\right)^2. \tag{2.173} \]
Nondiagonal components of $g_{ab}(s)$ can be measured by simple generalizations of this procedure. The $g_{0b}(s)$ are then algebraically determined by the gauge conditions. In a thought experiment, data from a spaceship traveling in a spacetime region could be used to produce a map of values of the GPS components of the metric tensor. Instead of using a rod, which is a rather crude device for measuring distances, one could send a light pulse forward and back between the two GPS devices, kept at fixed spatial $s^\mu$ coordinates. If $T$ is the (physical) time for flying back and forward, measured by a precise clock on one device, then $g_{11}(s) = (cT/2\delta s^1)^2$. This is valid so long as $T$ and $L$ are small compared to the distances over which the $g_{ab}(s)$ change by amounts of the order of the experimental errors.
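The two measurement recipes reduce to one-line formulas. The sketch below evaluates both for a made-up reading difference and checks that they agree when the light round-trip time corresponds to the rod length. The numerical reading and the function names are hypothetical, chosen only for illustration.

```python
C = 299_792_458.0           # speed of light in m/s


def g11_from_rod(rod_length, delta_s1):
    """Rod method: g_11 = (L / delta s^1)^2."""
    return (rod_length / delta_s1) ** 2


def g11_from_light(round_trip_time, delta_s1, c=C):
    """Light-pulse method: g_11 = (c T / (2 delta s^1))^2."""
    return (c * round_trip_time / (2.0 * delta_s1)) ** 2


L = 1.0                     # rod of 1 meter, as in Fig. 2.6
delta_s1 = 0.8              # hypothetical difference between the two s^1 readings
T = 2.0 * L / C             # round-trip light time over the same separation

print(g11_from_rod(L, delta_s1))
print(g11_from_light(T, delta_s1))
```

Both functions return the same number here because the light pulse is used as a substitute for the rod over the same physical separation.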
The individual components of the metric tensor expressed in physical coordinates are measurable. The statement that “the curvature is measurable but the metric is not measurable”, which is often heard, is incorrect. Both metric and curvature in physical coordinates are measurable and predictable. Neither metric nor curvature in arbitrary nonphysical coordinates are measurable.
The GPS coordinates are partial observables (see Chapter 3). The complete observables are the quantities $g_{\mu\nu}(s)$, for any given value of the coordinates $s^\mu$. These quantities are diffeomorphism invariant, are uniquely determined by the initial data and, in a canonical formulation, are represented by functions on the phase space that commute with all constraints.
The GPS observables are a straightforward generalization of Einstein’s “spacetime coincidences”. In a sense, they are precisely Einstein’s point coincidences. Einstein’s “material points” are just replaced by photons (light pulses): the spacetime point $s^\alpha$ is characterized as the meeting point of four photons designated by the fact of carrying the radio signals $s^\alpha$.
mdashmdash
Bibliographical notes
There are many beautiful classic textbooks on GR. Two among the best, offering remarkably different points of view on the theory, are Weinberg [75] and Wald [76]. The first stresses the similarity between GR and flat-space field theory; the second, on the contrary, emphasizes the geometric reading of GR. Here I have followed a third path: I place emphasis on the change of the notions of space and time needed for general-relativistic physics (which affects quantization dramatically), but I put little emphasis on the geometric interpretation of the gravitational field (which is going to be largely lost in the quantum theory).
Relevant mathematics is nicely presented, for instance, in the text by Choquet-Bruhat, DeWitt-Morette and Dillard-Bleick [77] and in [16]. On the large empirical evidence in favor of GR piled up in recent years, see Ciufolini and Wheeler [78].
The tetrad formalism and its introduction into quantum gravity are mainly due to Cartan, to Weyl [80] and to Schwinger [80]; the first-order formalism to Palatini. The Plebanski two-form was introduced in [81]. The selfdual connection, which is at the root of Ashtekar’s canonical theory (see Chapter 4), was introduced by Amitaba Sen [82]. The lagrangian formulation for the selfdual connections was given in [83]. A formulation of GR based on the sole connection is discussed in [84].
Interesting reconstructions of Einstein’s path towards GR are in [85, 86]. Kretschmann’s objection to the significance of general covariance appeared in [87]. On this, see also Anderson’s book [88]. An account of the historical debate on the interpretation of space and motion is Julian Barbour’s [89], a wonderful historical book. In the philosophy of science, the debate was reopened by a 1987 paper on the hole argument by John Earman and John Norton [90]. On the contemporary version of this debate, see [65, 91–93]. On the physical side of the discussion of what is “observable” in GR, see [71].
The discussion of the different notions of time follows [94]. A surprising and inspiring book on the subject is Fraser [95], a book that will convince the reader that the notion of time is far from being a monolithic concept. The literature on the problem of time in quantum gravity is vast. I list only a few pointers here, distinguishing various problems: origin of the “arrow of time” and the cosmological time asymmetry [96]; disappearance of the coordinate-time variable in canonical quantum gravity [97]; possibility of a consistent interpretation of quantum mechanics for systems without global time [26, 98, 99]; problems in choosing an “internal time” in general relativity, and the properties that such an internal time should have [66]; see also [100]. The presentation of the GPS observables follows [101]; see also [102, 103].
3 Mechanics
In its conventional formulation, mechanics describes the evolution of states and observables in time. This evolution is governed by a hamiltonian. This is also true for special-relativistic theories, where evolution is governed by a representation of the Poincaré group, which includes a hamiltonian. This conventional formulation is not sufficiently broad, because general-relativistic systems – in fact, the world in which we live – do not fit into this conceptual scheme. Therefore we need a more general formulation of mechanics than the conventional one. This formulation must be based on notions of “observable” and “state” that maintain a clear meaning in a general-relativistic context. A formulation of this kind is described in this chapter.
The structure of conventional nonrelativistic mechanics already points rather directly to the relativistic formulation described here. Indeed, many aspects of this formulation are already utilized by many authors. For instance, Arnold [104] identifies the (presymplectic) space with coordinates $(t, q^i, p_i)$ (time, lagrangian variables and their momenta) as the natural home for mechanics. Souriau has developed a beautiful and little-known relativistic formalism [105]. Probably the first to consider the point of view used here was Lagrange himself, in pointing out that the most convenient definition of “phase space” is the space of the physical motions [106]. Many of the tools used below are also used in hamiltonian treatments of generally covariant theories as constrained systems, although generally within a rather obscure interpretative cloud.
3.1 Nonrelativistic mechanics: mechanics is about time evolution
I begin with a brief review of conventional mechanics. This is useful to fix notations and introduce some notions that will play a role in the relativistic formalism. I give no derivations here: they are standard, and they can be obtained as a special case of the derivations in the next section.
Lagrangian. A dynamical system with $m$ degrees of freedom describes the evolution in time $t$ of $m$ lagrangian variables $q^i$, where $i = 1, \ldots, m$. The space in which the variables $q^i$ take value is the $m$-dimensional (nonrelativistic) configuration space $\mathcal{C}_0$. The dynamics of the system is determined by a single function of $2m$ variables, $L(q^i, v^i)$, the lagrangian. Given two times $t_1$ and $t_2$ and two points $q^i_1$ and $q^i_2$ in $\mathcal{C}_0$, physical motions are such that the action
\[ S[q] = \int_{t_1}^{t_2} dt\ L\!\left(q^i(t), \frac{dq^i(t)}{dt}\right) \tag{3.1} \]
is an extremum in the space of the motions $q^i(t)$ such that $q^i(t_1) = q^i_1$ and $q^i(t_2) = q^i_2$. A dynamical system is therefore specified by the couple $(\mathcal{C}_0, L)$. Physical motions satisfy the Lagrange equations
\[ \frac{d}{dt}\, p_i\!\left(q^i(t), \frac{dq^i(t)}{dt}\right) = F_i\!\left(q^i(t), \frac{dq^i(t)}{dt}\right), \tag{3.2} \]
where momenta and forces are defined by
\[ p_i(q^i, v^i) = \frac{\partial L(q^i, v^i)}{\partial v^i}, \qquad F_i(q^i, v^i) = \frac{\partial L(q^i, v^i)}{\partial q^i}. \tag{3.3} \]
Hamiltonian. The Lagrange equations can be cast in first-order form by using the lagrangian coordinates $q^i$ and the momenta $p_i$ as variables. Inverting the function $p_i(q^i, v^i)$ yields the function $v^i(q^i, p_i)$; inserting this in the function $F_i(q^i, v^i)$ defines the force $f_i(q^i, p_i) \equiv F_i(q^i, v^i(q^i, p_i))$ as a function of coordinates and momenta. The equations of motion (3.2) become
\[ \frac{dq^i(t)}{dt} = v^i(q^i(t), p_i(t)), \qquad \frac{dp_i(t)}{dt} = f_i(q^i(t), p_i(t)). \tag{3.4} \]
These equations are determined by the function $H_0(q^i, p_i)$, the nonrelativistic hamiltonian, defined by $H_0(q^i, p_i) = p_i\, v^i(q^i, p_i) - L(q^i, v^i(q^i, p_i))$. Indeed, (3.4) is equivalent to (3.2) with
\[ v^i(q^i, p_i) = \frac{\partial H_0(q^i, p_i)}{\partial p_i}, \qquad f_i(q^i, p_i) = -\frac{\partial H_0(q^i, p_i)}{\partial q^i}. \tag{3.5} \]
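A one-dimensional harmonic oscillator gives a compact illustration of the passage from the lagrangian to the first-order equations. In the sketch below the velocity and force functions are obtained by differentiating the hamiltonian numerically, and the resulting equations are integrated with a fourth-order Runge–Kutta step. The choice of system, the parameter values and the integrator are assumptions made for the example, not something prescribed by the text.

```python
import math

M, K = 1.0, 4.0                      # mass and spring constant (illustrative)


def lagrangian(q, v):
    return 0.5 * M * v * v - 0.5 * K * q * q


def hamiltonian(q, p):
    """H0 = p v(q, p) - L(q, v(q, p)), with v = p / M from inverting p = dL/dv."""
    v = p / M
    return p * v - lagrangian(q, v)


def partial(fun, args, k, h=1e-6):
    """Central-difference derivative of fun with respect to its k-th argument."""
    up, dn = list(args), list(args)
    up[k] += h
    dn[k] -= h
    return (fun(*up) - fun(*dn)) / (2 * h)


def velocity(q, p):
    """v = dH0/dp."""
    return partial(hamiltonian, (q, p), 1)


def force(q, p):
    """f = -dH0/dq."""
    return -partial(hamiltonian, (q, p), 0)


def rk4_step(q, p, dt):
    """One Runge-Kutta step for dq/dt = v(q, p), dp/dt = f(q, p)."""
    def rhs(q_, p_):
        return velocity(q_, p_), force(q_, p_)
    k1 = rhs(q, p)
    k2 = rhs(q + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
    k3 = rhs(q + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
    k4 = rhs(q + dt * k3[0], p + dt * k3[1])
    return (q + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            p + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)


q, p = 1.0, 0.0
dt, steps = 0.001, 1000
for _ in range(steps):
    q, p = rk4_step(q, p, dt)

exact_q = math.cos(math.sqrt(K / M) * dt * steps)
print(q, exact_q)
```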
Symplectic. The Hamilton equations (3.4)–(3.5) can be written in a useful and compact geometric language. The $2m$-dimensional space coordinatized by the coordinates $q^i$ and the momenta $p_i$ is the nonrelativistic phase space $\Gamma_0$. (The reason for the subscript 0 will be clear below.) Time evolution is a flow $(q^i(t), p_i(t))$ in this space; the vector field on $\Gamma_0$ tangent to this flow is
\[ X_0 = v^i(q^i, p_i)\,\frac{\partial}{\partial q^i} + f_i(q^i, p_i)\,\frac{\partial}{\partial p_i}. \tag{3.6} \]
Therefore the dynamics is specified by assigning the vector field $X_0$ on $\Gamma_0$. Now, $\Gamma_0$ can be interpreted as the cotangent space $T^*\mathcal{C}_0$. Any cotangent space carries a natural¹ one-form $\theta_0 = p_i\,dq^i$, where $d\theta_0$ is nondegenerate. A space equipped with such a one-form has the remarkable property that every function $f$ determines a vector field $X_f$ via the relation $(d\theta_0)(X_f) = -df$. A straightforward calculation shows that the flow defined by $H_0$ is precisely the time evolution vector field (3.6). Therefore the equations of motion (3.4)–(3.5) can be written simply² as
\[ (d\theta_0)(X_0) = -dH_0. \tag{3.7} \]
The two-form $\omega_0 = d\theta_0$ entering (3.7) is symplectic.³ A dynamical system is determined by a triple $(\Gamma_0, \omega_0, H_0)$, where $\Gamma_0$ is a manifold, $\omega_0$ is a symplectic two-form and $H_0$ is a function on $\Gamma_0$.
Presymplectic. A very elegant formulation of mechanics, and a crucial step in the direction of the relativistic theory, is provided by the presymplectic formalism. This formalism is based on the idea of describing motions by using the graph of the function $(q^i(t), p_i(t))$ instead of the functions themselves. The graph of the function $(q^i(t), p_i(t))$ is an unparametrized curve $\gamma$ in the $(2m + 1)$-dimensional space $\Sigma = \mathbb{R} \times \Gamma_0$, with coordinates $(t, q^i, p_i)$: it is formed by all the points $(t, q^i(t), p_i(t))$ in this space. The vector field
\[ X = \frac{\partial}{\partial t} + v^i(q^i, p_i)\,\frac{\partial}{\partial q^i} + f_i(q^i, p_i)\,\frac{\partial}{\partial p_i} \tag{3.8} \]
is tangent to all these curves. (So is any other vector field obtained by scaling $X$, namely any vector field $X' = fX$, where $f$ is a scalar function on $\Sigma$.) Now, consider the Poincaré one-form
\[ \theta = p_i\,dq^i - H_0(q^i, p_i)\,dt \tag{3.9} \]
on $\Sigma$. The two-form $\omega = d\theta$ is closed, but it is degenerate (every two-form is degenerate in odd dimensions); that is, there is a vector field $X$ (called the null vector field of $\omega$) satisfying
\[ (d\theta)(X) = 0. \tag{3.10} \]
¹ It is defined intrinsically by $\theta_0(X)(s) = s(\pi X)$, where $X$ is a vector field on $T^*\mathcal{C}_0$, $s$ a point in $T^*\mathcal{C}_0$ and $\pi$ the bundle projection.
² The contraction between a two-form and a vector is defined by $(\alpha \wedge \beta)(X) = \alpha(X)\,\beta - \beta(X)\,\alpha$.
³ That is, closed and nondegenerate. Closed means $d\omega_0 = 0$; nondegenerate means that $\omega_0(X) = 0$ implies $X = 0$.
The integral curves⁴ of the null vector field of a two-form $\omega$ are called the "orbits" of $\omega$. It is easy to see that $X$ given in (3.8) satisfies (3.10). Therefore the graphs of the motions are simply the orbits of $d\theta$. In other words, (3.10) is a rewriting of the equations of motion.

A space $\Sigma$ equipped with a closed degenerate two-form $\omega$ is called presymplectic. A dynamical system is thus completely defined by a presymplectic space $(\Sigma, \omega)$. We use also the notation $(\Sigma, \theta)$, where $\omega = d\theta$.

Notice that (3.10) is homogeneous, and therefore it determines $X$ only up to scaling. This is consistent with the fact that the vector field tangent to the motions is defined only up to scaling, which in turn reflects the fact that motions are represented by unparametrized curves in $\Sigma$.

Finally, it is easy to see that the action (3.1) is simply the line integral of the Poincaré one-form (3.9) along the orbits: if $\gamma$ is an orbit $(t, q^i(t), p_i(t))$ of $\omega$, then the action of the motion $q^i(t)$ is

$$S[q] = \int_\gamma \theta. \tag{3.11}$$
Extended. Finally, let me come to a formulation of dynamics that extends naturally to general-relativistic systems. In light of the presymplectic formulation described above, it is natural to consider the relativistic configuration space

$$C = \mathbb{R}\times C_0, \tag{3.12}$$

coordinatized by the $m+1$ variables $(t, q^i)$, and to describe motions with the graphs of the functions $q^i(t)$, which are unparametrized curves in $C$. Consider the cotangent space $T^*C$, with coordinates $(t, q^i, p_t, p_i)$, and the function

$$H(t, q^i, p_t, p_i) = p_t + H_0(q^i, p_i) \tag{3.13}$$

on this space. Let $\Sigma$ be the surface in $T^*C$ defined by

$$H(t, q^i, p_t, p_i) = 0. \tag{3.14}$$

We can coordinatize $\Sigma$ with the coordinates $(t, q^i, p_i)$. Since it is a cotangent space, $T^*C$ carries a natural one-form, which is

$$\tilde\theta = p_i\,dq^i + p_t\,dt. \tag{3.15}$$

The restriction of this one-form to the surface (3.14) is precisely (3.9). Therefore the surface (3.14) is the presymplectic space that defines the dynamics.
⁴ An integral curve of a vector field is a curve everywhere tangent to the field.
In other words, the dynamics is completely defined by the couple $(C, H)$: a relativistic configuration space $C$ and a function $H$ on $T^*C$. The graphs of the motions are simply the orbits of $d\theta$ on the surface (3.14).⁵ I call $H$ the relativistic hamiltonian.

Remarkably, the dynamics can be directly expressed in terms of a variational principle based on $(C, H)$. An unparametrized curve $\gamma$ in $C$ describes a physical motion if $\tilde\gamma$ extremizes the integral

$$S[\tilde\gamma] = \int_{\tilde\gamma} \tilde\theta \tag{3.16}$$

in the class of the curves $\tilde\gamma$ in $T^*C$ satisfying (3.14) whose restriction $\gamma$ to $C$ connects two given points $(t_1, q^i_1)$ and $(t_2, q^i_2)$.

The relativistic configuration space $C$ has the structure (3.12) and the relativistic hamiltonian $H$ has the form (3.13). As we shall see, the structure (3.12)–(3.13) does not survive in the relativistic formulation of mechanics.
Relativistic phase space. Denote $\Gamma$ the space of the orbits of $d\theta$ in $\Sigma$. There is a natural projection $\pi: \Sigma \to \Gamma$ that sends each point of $\Sigma$ to the curve to which it belongs. It is not hard to show that there is one and only one symplectic two-form $\omega_{ph}$ on $\Gamma$ such that its pull-back to $\Sigma$ is $d\theta$, namely $\pi^*\omega_{ph} = d\theta$. Therefore $\Gamma$ is a symplectic space. $\Gamma$ is the space of the physical motions. I shall call it the relativistic phase space.

The relation between the relativistic phase space $\Gamma$ and the nonrelativistic phase space $\Gamma_0 = T^*C_0$ is the following. $\Gamma_0$ is the space of the instantaneous states: the states that the system can have at a fixed time $t = t_0$. On the other hand, $\Gamma$ is the space of all solutions of the equations of motion. Now fix a time, say $t = t_0$. If at $t = t_0$ the system is in an initial state in $\Gamma_0$, it will then evolve in a well-defined motion. The other way around, each motion determines an instantaneous state at $t = t_0$. Therefore there is a one-to-one mapping between $\Gamma$ and $\Gamma_0$. The identification between $\Gamma$ and $\Gamma_0$ depends on the $t_0$ chosen.
Hamilton–Jacobi. The Hamilton–Jacobi equation is

$$\frac{\partial S(q^i, t)}{\partial t} + H_0\!\left(q^i, \frac{\partial S(q^i, t)}{\partial q^i}\right) = 0. \tag{3.17}$$

If a family of solutions $S(q^i, Q^i, t)$ depending on $m$ parameters $Q^i$ is found, then we can compute the function

$$P_i(q^i, Q^i, t) = -\frac{\partial S(q^i, Q^i, t)}{\partial Q^i} \tag{3.18}$$
⁵ More precisely, the projections of these orbits on $C$.
by simple differentiation. Inverting this function we obtain

$$q^i(t) = q^i(Q^i, P_i, t), \tag{3.19}$$

which are physical motions, namely the general solution of the equations of motion, where the quantities $(Q^i, P_i)$ are the $2m$ integration constants.

Solutions of (3.17) can be found in the form $S(q^i, Q^i, t) = -Et + W(q^i, Q^i)$, where $E$ is a constant and $W$ satisfies

$$H_0\!\left(q^i, \frac{\partial W(q^i, Q^i)}{\partial q^i}\right) = E. \tag{3.20}$$

$S$ is called the principal Hamilton–Jacobi function; $W$ is called the characteristic Hamilton–Jacobi function.
The Hamilton–Jacobi equation (3.17) can be obtained from the classical limit of the Schrödinger equation.
The Hamilton function. Consider two points $(t_1, q^i_1)$ and $(t_2, q^i_2)$ in $C$. The function on $G = C \times C$

$$S(t_1, q^i_1, t_2, q^i_2) = \int_{t_1}^{t_2} dt\; L(q^i(t), \dot q^i(t)), \tag{3.21}$$

where $q^i(t)$ is the physical motion from $(t_1, q^i_1)$ to $(t_2, q^i_2)$ (that minimizes the action), is called the Hamilton function. Equivalently,

$$S(t_1, q^i_1, t_2, q^i_2) = \int_\gamma \theta, \tag{3.22}$$

where $\gamma$ is the orbit in $\Sigma$ that projects to $q^i(t)$. Notice the difference between the action (3.1) and the Hamilton function (3.21): the first is a functional of the motion; the second is a function of the end points. It is not hard to see that the Hamilton function solves the Hamilton–Jacobi equation (in both sets of variables). The Hamilton function is therefore a preferred solution of the Hamilton–Jacobi equation. If we know the Hamilton function, we have solved the equations of motion, because we obtain the general solution of the equations of motion in the form $q^i = q^i(t; Q^i, P_i, T)$ by simply inverting the function

$$P_i(t, q^i, T, Q^i) = \frac{\partial S(t, q^i, T, Q^i)}{\partial Q^i} \tag{3.23}$$

with respect to $q^i$. The resulting function $q^i(t; T, Q^i, P_i)$ is the general solution of the equations of motion, where the integration constants are the initial coordinates and momenta $Q^i, P_i$ at time $T$.
Thus, the action defines a dynamical system; the Hamilton function directly gives all the motions.⁶ The Hamilton function (3.21) is the classical limit of the quantum mechanical propagator.
Example: a pendulum. Let $\alpha$ be the lagrangian variable describing the elongation of a simple harmonic oscillator, which I call "pendulum" for simplicity. The lagrangian is $L(\alpha, v) = (mv^2/2) - (m\omega^2\alpha^2/2)$; the nonrelativistic hamiltonian is $H_0(\alpha, p) = (p^2/2m) + (m\omega^2\alpha^2/2)$. The extended configuration space has coordinates $(t, \alpha)$ and the relativistic hamiltonian is

$$H(t, \alpha, p_t, p) = p_t + \frac{p^2}{2m} + \frac{m\omega^2\alpha^2}{2}. \tag{3.24}$$

Choose coordinates $(t, \alpha, p)$ on the constraint surface $H = 0$, which is therefore defined by $p_t = -H_0(\alpha, p)$. The restriction of the one-form $\tilde\theta = p_t\,dt + p\,d\alpha$ to this surface is

$$\theta = p\,d\alpha - \left(\frac{p^2}{2m} + \frac{m\omega^2\alpha^2}{2}\right)dt. \tag{3.25}$$

The presymplectic two-form is therefore

$$\omega = d\theta = dp\wedge d\alpha - \frac{p}{m}\,dp\wedge dt - m\omega^2\alpha\; d\alpha\wedge dt. \tag{3.26}$$
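The orbits are obtained by integrating the vector field

$$X = X^t\,\frac{\partial}{\partial t} + X^\alpha\,\frac{\partial}{\partial \alpha} + X^p\,\frac{\partial}{\partial p} \tag{3.27}$$

satisfying $\omega(X) = 0$. Inserting (3.26) and (3.27) in $\omega(X) = 0$ we get

$$\begin{aligned}
\omega(X) &= X^t\left(-\frac{p}{m}\,dp - m\omega^2\alpha\, d\alpha\right) + X^\alpha\left(dp + m\omega^2\alpha\, dt\right) + X^p\left(-d\alpha + \frac{p}{m}\,dt\right)\\
&= \left(-\frac{p}{m}X^t + X^\alpha\right)dp + \left(-m\omega^2\alpha X^t - X^p\right)d\alpha + \left(m\omega^2\alpha X^\alpha + \frac{p}{m}X^p\right)dt\\
&= 0.
\end{aligned} \tag{3.28}$$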
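Writing $dt(\tau)/d\tau = X^t$, $d\alpha(\tau)/d\tau = X^\alpha$, $dp(\tau)/d\tau = X^p$, equation (3.28) reads

$$\frac{d\alpha(\tau)}{d\tau} - \frac{p}{m}\frac{dt(\tau)}{d\tau} = 0, \qquad -\frac{dp(\tau)}{d\tau} - m\omega^2\alpha\,\frac{dt(\tau)}{d\tau} = 0, \tag{3.29}$$

together with a third equation dependent on the first two. Equation (3.29) can be written as

$$\frac{d\alpha(t)}{dt} = \frac{p}{m}, \qquad \frac{dp(t)}{dt} = -m\omega^2\alpha, \tag{3.30}$$

which are the Hamilton equations of the pendulum. We can write its general solution in the form

$$\alpha(t) = a\,e^{i\omega t} + \bar a\,e^{-i\omega t}. \tag{3.31}$$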
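As a cross-check of the null direction of the presymplectic form of the pendulum, the sketch below builds the antisymmetric matrix of components of $\omega = dp\wedge d\alpha - (p/m)\,dp\wedge dt - m\omega^2\alpha\, d\alpha\wedge dt$ in the coordinates $(t, \alpha, p)$ and extracts its kernel. The function names and numerical values are illustrative assumptions; the only input from the text is the two-form itself.

```python
def omega_matrix(alpha, p, m, w):
    """Components omega[mu][nu] of the two-form in the coordinates (t, alpha, p)."""
    w_ta = m * w**2 * alpha   # dt ^ dalpha component (minus the dalpha ^ dt one)
    w_tp = p / m              # dt ^ dp component (minus the dp ^ dt one)
    w_ap = -1.0               # dalpha ^ dp component (minus the dp ^ dalpha one)
    return [[0.0, w_ta, w_tp],
            [-w_ta, 0.0, w_ap],
            [-w_tp, -w_ap, 0.0]]


def null_vector(M):
    """Kernel of an antisymmetric 3x3 matrix: (M[1][2], -M[0][2], M[0][1])."""
    return (M[1][2], -M[0][2], M[0][1])


def normalized_null_vector(alpha, p, m=1.0, w=1.0):
    """Null vector rescaled so that X^t = 1, i.e. parametrized by the clock time."""
    v = null_vector(omega_matrix(alpha, p, m, w))
    return tuple(c / v[0] for c in v)


alpha, p, m, w = 0.3, -0.7, 2.0, 1.5
M = omega_matrix(alpha, p, m, w)
X = normalized_null_vector(alpha, p, m, w)
# omega(X): every component of the contraction must vanish
residual = [sum(M[i][j] * X[j] for j in range(3)) for i in range(3)]
print("X =", X)                   # expected direction: (1, p/m, -m w^2 alpha)
print("omega(X) =", residual)
```

The overall scale of the null vector is arbitrary, as the text stresses; fixing $X^t = 1$ is just the choice that turns its remaining components into the right-hand sides of the Hamilton equations of the pendulum.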
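The Hamilton function $S(\alpha_1, t_1, \alpha_2, t_2)$ is the preferred solution of the Hamilton–Jacobi equation

$$\frac{\partial S(\alpha, t)}{\partial t} + \frac{1}{2m}\left(\frac{\partial S(\alpha, t)}{\partial \alpha}\right)^2 + \frac{m\omega^2\alpha^2}{2} = 0 \tag{3.32}$$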
⁶ Hamilton (talking about himself in the third person): "Mr Lagrange's function states the problem, Mr Hamilton's function solves it" [107].
obtained by computing the action of the physical motion $\alpha(t)$ that goes from $\alpha(t_1) = \alpha_1$ to $\alpha(t_2) = \alpha_2$. This motion is given by (3.31) with

$$a = \frac{\alpha_1 e^{-i\omega t_2} - \alpha_2 e^{-i\omega t_1}}{2i \sin[\omega(t_1 - t_2)]}. \tag{3.33}$$

Inserting this in the action and integrating, we obtain the Hamilton function

$$S(\alpha_1, t_1, \alpha_2, t_2) = m\omega\, \frac{2\alpha_1\alpha_2 - (\alpha_1^2 + \alpha_2^2)\cos[\omega(t_1 - t_2)]}{2\sin[\omega(t_1 - t_2)]}. \tag{3.34}$$
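That this Hamilton function solves the Hamilton–Jacobi equation of the pendulum can be verified numerically. The sketch below evaluates $S = m\omega\,[2\alpha_1\alpha_2 - (\alpha_1^2+\alpha_2^2)\cos\omega(t_1-t_2)]/[2\sin\omega(t_1-t_2)]$ and estimates its derivatives with respect to the final event $(\alpha_2, t_2)$ by central differences. The sample values of $m$, $\omega$ and of the end points are arbitrary assumptions, chosen away from the zeros of the sine.

```python
import math


def hamilton_function(a1, t1, a2, t2, m=1.0, omega=1.0):
    """Hamilton function of the harmonic oscillator between two events."""
    phase = omega * (t1 - t2)
    numerator = 2.0 * a1 * a2 - (a1**2 + a2**2) * math.cos(phase)
    return m * omega * numerator / (2.0 * math.sin(phase))


def hj_residual(a1, t1, a2, t2, m=1.0, omega=1.0, h=1e-5):
    """dS/dt + (dS/dalpha)^2 / 2m + m omega^2 alpha^2 / 2 at the final event."""
    S = lambda a, t: hamilton_function(a1, t1, a, t, m, omega)
    dS_dt = (S(a2, t2 + h) - S(a2, t2 - h)) / (2.0 * h)
    dS_da = (S(a2 + h, t2) - S(a2 - h, t2)) / (2.0 * h)
    return dS_dt + dS_da**2 / (2.0 * m) + m * omega**2 * a2**2 / 2.0


residual = hj_residual(a1=0.4, t1=0.0, a2=-0.3, t2=0.8, m=1.5, omega=2.0)
print("Hamilton-Jacobi residual:", residual)
```

The residual vanishes up to the finite-difference error, while the individual terms of the equation are of order one for these values.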
This concludes the short review of nonrelativistic mechanics. I now consider the generalization of this formalism to relativistic systems.
3.2 Relativistic mechanics
3.2.1 Structure of relativistic systems: partial observables, relativistic states
Is there a version of the notions of "state" and "observable" broad enough to apply naturally to relativistic systems? I begin by introducing the main notions and tools of covariant mechanics in the context of a simple system.

The pendulum revisited. Say we want to describe the small oscillations of a pendulum. To this aim, we need two measuring devices: a clock and a device that reads the elongation of the pendulum. Let $t$ be the reading of the clock (in seconds) and $\alpha$ the reading of the device measuring the elongation of the pendulum (in centimeters). Call the variables $t$ and $\alpha$ the partial observables of the pendulum. (I use also relativistic observables, or simply observables, if there is no risk of confusion with the nonrelativistic notion of observable, which is different.)

A useful observation is a reading of the time $t$ and the elongation $\alpha$ together. Thus, an observation yields a pair $(t, \alpha)$. Call a pair obtained in this manner an event.

Let $C$ be the two-dimensional space with coordinates $t$ and $\alpha$. Call $C$ the event space of the pendulum. (I use also relativistic configuration space, or simply configuration space, if there is no risk of confusion with the nonrelativistic configuration space $C_0$, which is different.)

Experience shows we can find mathematical laws characterizing sequences of events. This is the reason we can do science. These laws have the following form. Call an unparametrized curve $\gamma$ in $C$ a motion of the system. Perform a sequence of measurements of pairs $(t, \alpha)$, and find that the points representing the measured pairs sit on a motion $\gamma$. Then we say that $\gamma$ is a physical motion. We express a motion as a relation in $C$

$$f(\alpha, t) = 0. \tag{3.35}$$

Thus a motion $\gamma$ is a relation, or a correlation, between partial observables.
Then, disturb the pendulum (push it with a finger) and repeat the entire experiment over. At each repetition of the experiment, a different motion $\gamma$ is found. That is, a different mathematical relation of the form (3.35) is found. Experience shows that the space of the physical motions is very limited: it is just a two-dimensional space. Only a two-dimensional space of curves $\gamma$ is realized in Nature.

In the case of the small oscillations of a frictionless pendulum, we can coordinatize the physical motions by the two real numbers $A \ge 0$ and $0 \le \phi < 2\pi$, and (3.35) is given by

$$f(\alpha, t; A, \phi) = \alpha - A\sin(\omega t + \phi) = 0. \tag{3.36}$$

This equation gives a curve $\gamma$ in $C$ for each couple $(A, \phi)$.

Let $\Gamma$ be the two-dimensional space of the physical motions, coordinatized by $A$ and $\phi$. $\Gamma$ is the relativistic phase space of the pendulum (or the space of the motions). A point in $\Gamma$ is also called a relativistic state. (Or a Heisenberg state, or simply a state, if there is no risk of confusion with the nonrelativistic notion of state, which is different.)

Equation (3.36) is the mathematical law that captures the empirical information we have on the pendulum. This equation is the evolution equation of the system. The function $f$ is the evolution function of the system.

A relativistic state is determined by a couple $(A, \phi)$. It determines a curve $\gamma$ in the $(t, \alpha)$ plane. That is, it determines a correlation between the two partial observables $t$ and $\alpha$, via (3.36). If we disturb the pendulum by interacting with it, or if we start a new experiment over, we have a new state. The state remains the same if we observe the pendulum and the clock without disturbing them (here we disregard quantum theory, of course).

Summarizing: each state in the phase space $\Gamma$ determines a correlation between the observables in the configuration space $C$. The set of these relations is captured by the evolution equation (3.36), namely by the vanishing of a function

$$f: \Gamma \times C \to \mathbb{R}. \tag{3.37}$$

The evolution equation $f = 0$ expresses all predictions that can be made using the theory. Equivalently, these predictions are captured by the surface $f = 0$ in the cartesian product of the phase space with the configuration space.
General structure of the dynamical systems. The $(C, \Gamma, f)$ language described above is general. It is sufficient to describe all predictions of conventional mechanics. On the other hand, it is broad enough to describe general-relativistic systems. All fundamental systems can be described (to the accuracy at which quantum effects can be disregarded) by making use of these concepts:
(i) The relativistic configuration space $C$ of the partial observables.

(ii) The relativistic phase space $\Gamma$ of the relativistic states.

(iii) The evolution equation $f = 0$, where $f: \Gamma \times C \to V$.

Here $V$ is a linear space. The state in the phase space $\Gamma$ is fixed until the system is disturbed. Each state in $\Gamma$ determines (via $f = 0$) a motion $\gamma$ of the system, namely it describes a relation, or a set of relations, between the observables in $C$.

A motion is not necessarily a one-dimensional curve in $C$: it can be a surface in $C$ of any dimension $k$. If $k > 1$, we say that there is gauge invariance. For a system with gauge invariance, we call "motion" the motion itself and any curve within it. In this chapter we shall not deal much with systems with gauge invariance, but we shall mention them where relevant.

Predictions are obtained as follows. We first perform enough measurements to determine the state. (In reality, the state of a large system is often "guessed" on the basis of incomplete observations and reasonable assumptions, justified inductively.) Once the state is so determined or guessed, the evolution equation predicts all the possible events, namely all the allowed correlations between the observables in any subsequent measurement.

In the example of the pendulum, for instance, the equation predicts the value of $\alpha$ that can be measured together with any given $t$, or the values of $t$ that can be measured together with any given $\alpha$. These predictions are valid until the system is disturbed.
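The two kinds of prediction just described can be made concrete with a small sketch built on the evolution function $f(\alpha, t; A, \phi) = \alpha - A\sin(\omega t + \phi)$ of the pendulum. The value of $\omega$, the sample state and the function names are illustrative assumptions; the degenerate state $A = 0$ is deliberately not handled.

```python
import math

OMEGA = 1.5  # angular frequency in rad/s (illustrative value)


def f(alpha, t, A, phi):
    """Evolution function: vanishes when the event (t, alpha) lies on the motion (A, phi)."""
    return alpha - A * math.sin(OMEGA * t + phi)


def predict_alpha(t, A, phi):
    """The elongation that can be measured together with the clock reading t."""
    return A * math.sin(OMEGA * t + phi)


def predict_times(alpha, A, phi, t_max):
    """All clock readings in [0, t_max] compatible with the elongation alpha."""
    if A == 0.0 or abs(alpha) > A:
        return []  # no event with this elongation (A = 0 is not treated here)
    base = math.asin(alpha / A)
    k_max = int((OMEGA * t_max + abs(phi)) / (2.0 * math.pi)) + 2
    times = set()
    for k in range(-k_max, k_max + 1):
        for angle in (base, math.pi - base):   # the two branches of the sine
            t = (angle + 2.0 * math.pi * k - phi) / OMEGA
            if 0.0 <= t <= t_max:
                times.add(round(t, 12))
    return sorted(times)


state = (2.0, 0.3)                                   # a relativistic state (A, phi)
alpha_at_1s = predict_alpha(1.0, *state)             # one alpha for a given t
times_for_1cm = predict_times(1.0, *state, t_max=10.0)   # several t for a given alpha
print(alpha_at_1s, times_for_1cm)
```

Note the asymmetry: a given $t$ is compatible with a single $\alpha$, while a given $\alpha$ is compatible with many values of $t$, or with none if $|\alpha| > A$. Nothing in the relation $f = 0$ itself singles out $t$ as the independent variable.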
The definitions of observable, state, configuration space and phase space given here are different from the conventional definitions. In particular, notions of instantaneous state, evolution in time, observable at a fixed time, play no role here. These notions make no sense in a general-relativistic context. For nonrelativistic systems, the usual notions can be recovered from the definitions given. The relation between the relativistic definitions considered here and the conventional nonrelativistic notions is discussed in Section 3.2.4.
The task of mechanics is to find the $(C, \Gamma, f)$ description for all physical systems. The first step, kinematics, consists in the specification of the observables that characterize the system. Namely, it consists in the specification of the configuration space $C$ and its physical interpretation. Physical interpretation means the association of coordinates on $C$ with measuring devices. The second step, dynamics, consists in finding the phase space $\Gamma$ and the function $f$ that describe the physical motions of the system.
In the next section, I describe a relativistic hamiltonian formalism for mechanics, based on the relativistic notions of state and observable defined here.
3.2.2 Hamiltonian mechanics
Elementary physical systems can be described by hamiltonian mechanics.⁷ Once the kinematics (that is, the space $C$ of the partial observables $q^a$) is known, the dynamics (that is, $\Gamma$ and $f$) is fully determined by giving a surface $\Sigma$ in the space $\Omega$ of the observables $q^a$ and their momenta $p_a$. The surface $\Sigma$ can be specified by giving a function $H: \Omega \to \mathbb{R}^k$; $\Sigma$ is then defined by $H = 0$.⁸ Denote $\tilde\gamma$ a curve in $\Omega$ (observables and momenta) and $\gamma$ its restriction to $C$ (observables alone). $H$ determines the physical motions via the following.

Variational principle. A curve $\gamma$ connecting the events $q^a_1$ and $q^a_2$ is a physical motion if $\tilde\gamma$ extremizes the action

$$S[\tilde\gamma] = \int_{\tilde\gamma} p_a\, dq^a \tag{3.38}$$

in the class of the curves $\tilde\gamma$ satisfying

$$H(q^a, p_a) = 0, \tag{3.39}$$

whose restriction $\gamma$ to $C$ connects $q^a_1$ and $q^a_2$.

All (relativistic and nonrelativistic) hamiltonian systems can be formulated in this manner.

If $k = 1$, $H$ is a scalar function and is sometimes called the hamiltonian constraint. The case $k > 1$ is the case in which there is gauge invariance. In this case, the system (3.39) is sometimes called the system of the "constraint equations". I call $H$ the relativistic hamiltonian or, if there is no ambiguity, simply the hamiltonian. I call the pair $(C, H)$ a relativistic dynamical system. The generalization to field theory is discussed in Section 3.3.

The relativistic hamiltonian $H$ is related to, but should not be confused with, the usual nonrelativistic hamiltonian, denoted $H_0$ in this book. $H$ always exists, while $H_0$ exists only for nonrelativistic systems.
⁷ Perhaps because they are the classical limit of a quantum system.
⁸ Different $H$s that vanish on the same surface $\Sigma$ define the same physical system.
Indeed, notice that this formulation of mechanics is similar to the extended formulation of nonrelativistic mechanics defined in Section 3.1. The novelty is that $C$ and $H$ do not have the structure (3.12)–(3.13). The discussion above shows that this structure is not necessary in order to have a well-defined physical interpretation of the formalism. A nonrelativistic system is characterized by the fact that one of its partial observables $q^a$ is singled out by having the special role of an independent variable $t$. This does not happen in a relativistic system. The following simple example shows that the relativistic formulation of mechanics is a proper generalization of standard mechanics.
Timeless double pendulum. I now introduce a genuinely timeless system, which I will repeatedly use as a simple model to illustrate the theory. Consider a mechanical model with two partial observables, say $a$ and $b$, whose dynamics is defined by the relativistic hamiltonian

$$H(a, b, p_a, p_b) = -\frac{1}{2}\left(p_a^2 + p_b^2 + a^2 + b^2 - 2E\right), \tag{3.40}$$

where $E$ is a constant. The extended configuration space is $C = \mathbb{R}^2$. The constraint surface has dimension 3: it is the sphere of radius $\sqrt{2E}$ in $T^*C$. The phase space has dimension 2. The motions are curves in the $(a, b)$ space. For each state, the theory predicts the correlation between $a$ and $b$.
A straightforward calculation (see below) shows that the evolution equation determined by $H$ is an ellipse in the $(a, b)$ space:

$$f(a, b; \alpha, \beta) = \left(\frac{a}{\sin\alpha}\right)^2 + \left(\frac{b}{\cos\alpha}\right)^2 - 2\,\frac{a}{\sin\alpha}\,\frac{b}{\cos\alpha}\,\cos\beta - 2E\sin^2\beta = 0, \tag{3.41}$$

where $\alpha$ and $\beta$ parametrize $\Gamma$. Therefore motions are closed curves, and in fact ellipses, in $C$. The system does not admit a conventional hamiltonian formulation, because for a nonrelativistic hamiltonian system motions in $C = \mathbb{R}\times C_0$ are monotonic in $t \in \mathbb{R}$ and therefore cannot be closed curves.
The example is not artificial. There exist cosmological models that have precisely this structure. For instance, we can identify $a$ with the radius of a maximally symmetric universe and $b$ with the spatially constant value of a field representing the matter content of that universe, and adopt the approximation in which these are the only two variables that govern the large-scale evolution of the universe. Then the dynamics of general relativity reduces to a system with the structure (3.40).
The associated nonrelativistic system. The system (3.40) can also be viewed as follows. Consider a physical system, which we denote the "associated nonrelativistic system", formed by two noninteracting harmonic oscillators. Let me stress that the associated nonrelativistic system is a different physical system than the timeless double pendulum considered above. The timeless double pendulum has one degree of freedom; its associated nonrelativistic system has two degrees of freedom. The partial observables of the associated nonrelativistic system are the two elongations $a$ and $b$, and the time $t$. The nonrelativistic hamiltonian that governs the evolution in $t$ is

$$H_0(a, b, p_a, p_b) = \frac{1}{2}\left(p_a^2 + p_b^2 + a^2 + b^2 - 2E\right). \tag{3.42}$$
It has the same form as the relativistic hamiltonian (3.40) of the timeless double pendulum.⁹ The constant term $2E$, of course, has no effect on the equations of motion; it only redefines the energy. Physically, we can view the relation between the two systems as follows. Imagine that we take the associated nonrelativistic system, but we decide to ignore the clock that measures $t$: we consider just measurements of the two observables $a$ and $b$. Furthermore, assume that the energy of the double pendulum is constrained to vanish, namely

$$\frac{1}{2}\left(p_a^2 + p_b^2 + a^2 + b^2\right) = E. \tag{3.44}$$

Then the observed relation between the measurements of $a$ and $b$ is described by the relativistic system (3.40).
Geometric formalism. As for nonrelativistic hamiltonian mechanics, the equations of motion can be expressed in an elegant geometric form. The variables $(q^a, p_a)$ are coordinates on the cotangent space $\Omega = T^*C$. Equation (3.39) defines a surface $\Sigma$ in this space. The cotangent space carries the natural one-form

$$\tilde\theta = p_a\, dq^a. \tag{3.45}$$

Denote $\theta$ the restriction of $\tilde\theta$ to the surface $\Sigma$. The two-form $\omega = d\theta$ on $\Sigma$ is degenerate: it has null directions. The integral surfaces of these null directions are the orbits of $\omega$ on $\Sigma$. Each such orbit projects from $T^*C$ to $C$ to give a surface in $C$. These surfaces are the motions.

Consider the case $k = 1$. In this case $\Sigma$ has dimension $2n - 1$, the kernel of $\omega$ is generically one-dimensional, and the motions are generically one-dimensional. Let $\tilde\gamma$ be a motion on $\Sigma$ and $X$ be a vector tangent to the motion; then

$$\omega(X) = 0. \tag{3.46}$$

To find the motions, we have just to integrate this equation. Equation (3.46) is the equation of motion. $X$ is defined by the homogeneous equation (3.46) only up to a multiplicative factor. Therefore the tangent of the orbit is defined only up to a multiplicative factor, and so the parametrization of the orbit is not determined by (3.46).

The case $k > 1$ is analogous. In this case, $\Sigma$ has dimension $2n - k$, the kernel of $\omega$ is generically $k$-dimensional, and the motions are generically $k$-dimensional. $X$ is then a $k$-dimensional multi-tangent, and it still satisfies (3.46).
Let $\pi: \Sigma \to \Gamma$ be the projection map that associates with each point of the constraint surface the motion to which the point belongs. The projection $\pi$ equips the phase space $\Gamma$ with a symplectic two-form $\omega_{ph}$, defined to be the two-form whose pull-back to $\Sigma$ under $\pi$ is $\omega$. Locally it exists and it is unique, precisely because $\omega$ is degenerate along the orbits.

⁹ The relativistic hamiltonian of the associated nonrelativistic system is

$$H(a, b, t, p_a, p_b, p_t) = p_t + \frac{1}{2}\left(p_a^2 + p_b^2 + a^2 + b^2 - 2E\right). \tag{3.43}$$
Relation with the variational principle. Let $\tilde\gamma$ be an orbit of $\omega$ on $\Sigma$, such that its restriction $\gamma$ in $C$ is bounded by the initial and final events $q_1$ and $q_2$. Let $\tilde\gamma'$ be a curve in $\Sigma$ infinitesimally close to $\tilde\gamma$, such that its restriction $\gamma'$ is also bounded by $q_1$ and $q_2$. Let $\delta s_1$ (and $\delta s_2$) be the difference between the initial (and final) points of $\tilde\gamma$ and $\tilde\gamma'$. The four curves $\tilde\gamma$, $\delta s_1$, $-\tilde\gamma'$ and $-\delta s_2$ form a closed curve in $\Sigma$. Consider the integral of $\omega$ over the infinitesimal surface bounded by this curve. This integral vanishes because at every point of the surface one of the tangents is (to first order) a null direction of $\omega$ (the surface is a strip parallel to the motion $\tilde\gamma$). But $\omega = d\theta$, and therefore, by Stokes' theorem, the integral of $\theta$ along the closed curve vanishes as well. The integral of $\theta = p_a\,dq^a$ along $\delta s_1$ and $\delta s_2$ is zero because $q^a$ is constant along these segments. Therefore

$$\int_{\tilde\gamma} \theta + \int_{-\tilde\gamma'} \theta = 0, \tag{3.47}$$

or

$$\delta \int_{\tilde\gamma} \theta = 0, \tag{3.48}$$

for any variation in the class considered. This is precisely the variational principle stated in Section 3.2.
Hamilton equations. Consider first the case $k = 1$. Motions are one-dimensional. Parametrize the curve with an arbitrary parameter $\tau$. That is, describe a motion (in $\Omega$) with the functions $(q^a(\tau), p_a(\tau))$. These functions satisfy the Hamilton system

$$H(q^a, p_a) = 0, \tag{3.49}$$

$$\frac{dq^a(\tau)}{d\tau} = N(\tau)\, v^a(q^a(\tau), p_a(\tau)), \qquad \frac{dp_a(\tau)}{d\tau} = N(\tau)\, f_a(q^a(\tau), p_a(\tau)), \tag{3.50}$$

where

$$v^a(q^a, p_a) = \frac{\partial H(q^a, p_a)}{\partial p_a}, \qquad f_a(q^a, p_a) = -\frac{\partial H(q^a, p_a)}{\partial q^a}. \tag{3.51}$$

The function $N(\tau)$ is called the "Lapse function". It is arbitrary. Different choices of $N(\tau)$ determine different parameters $\tau$ along the motion. To obtain a monotonic parametrization we need $N(\tau) > 0$. A preferred parametrization can be obtained by taking $N(\tau) = 1$, that is, replacing (3.50)–(3.51) by the equations (written in the usual compact form)

$$\dot q^a = \frac{\partial H}{\partial p_a}, \qquad \dot p_a = -\frac{\partial H}{\partial q^a}, \tag{3.52}$$
where the dot indicates derivative with respect to $\tau$. This choice is called the Lapse = 1 gauge. It is not preferred in a physical sense. In particular, different but physically equivalent hamiltonians $H$ defining the same surface $\Sigma$ determine different preferred parametrizations. Nevertheless, it is often the easiest gauge to compute with.
If $k > 1$, the function $H$ has components $H^j$, with $j = 1, \ldots, k$, and motions are $k$-dimensional surfaces. We can parametrize a motion with $k$ arbitrary parameters $\vec\tau = \{\tau_j\}$. Namely, we can represent it using the $2n$ functions $q^a(\vec\tau), p_a(\vec\tau)$ of the $k$ parameters $\tau_j$. These functions satisfy the system given by (3.49) and

$$\frac{\partial q^a(\vec\tau)}{\partial \tau_j} = N_j(\vec\tau)\, \frac{\partial H^j(q^a, p_a)}{\partial p_a}, \qquad \frac{\partial p_a(\vec\tau)}{\partial \tau_j} = -N_j(\vec\tau)\, \frac{\partial H^j(q^a, p_a)}{\partial q^a}. \tag{3.53}$$

A motion is determined by the full $k$-dimensional surface in $C$; we can choose a particular curve $\vec\tau(\tau)$ on this surface, where $\tau$ is an arbitrary parameter, and represent the motion by the one-dimensional curve $q^a(\tau) = q^a(\vec\tau(\tau))$ in $C$. This satisfies the system formed by (3.49) and

$$\frac{dq^a(\tau)}{d\tau} = N_j(\tau)\, \frac{\partial H^j(q^a, p_a)}{\partial p_a}, \qquad \frac{dp_a(\tau)}{d\tau} = -N_j(\tau)\, \frac{\partial H^j(q^a, p_a)}{\partial q^a}, \tag{3.54}$$

for $k$ arbitrary functions of one variable $N_j(\tau)$. Different choices of the functions $N_j(\tau)$ determine different curves on the single surface that defines a motion. These are gauge-equivalent representations of the same motion.
It is important to stress that the parameters $\tau$ or $\tau_j$ are an artifact of this technique. They have no physical significance. They are absent in the geometric formalism as well as in the Hamilton–Jacobi formalism, as we shall see below. The physical content of the theory is in the motion in $C$, not in the way the motion is parametrized. That is, the physical information is not in the functions $q^a(\tau)$: it is in the image of these functions in $C$.
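The independence of the motion from the parametrization can be illustrated numerically on the timeless double pendulum. The sketch below integrates the equations $da/d\tau = N p_a$, $db/d\tau = N p_b$, $dp_a/d\tau = -N a$, $dp_b/d\tau = -N b$ with a nonconstant lapse and compares the result with the unit-lapse curve $a = A_a\sin s$, $b = A_b\sin(s+\beta)$ evaluated at $s = \int_0^\tau N$. The lapse $N(\tau) = 2 + \cos\tau$, the state labels and the Runge–Kutta integrator are assumptions made for the illustration.

```python
import math

E = 1.0
ALPHA, BETA = 0.6, 0.9                       # illustrative labels of the state
A_a = math.sqrt(2.0 * E) * math.sin(ALPHA)
A_b = math.sqrt(2.0 * E) * math.cos(ALPHA)


def lapse(tau):
    """An arbitrary positive lapse function (hypothetical choice)."""
    return 2.0 + math.cos(tau)


def rhs(tau, y):
    """Hamilton equations of the double pendulum rescaled by the lapse."""
    a, b, pa, pb = y
    n = lapse(tau)
    return (n * pa, n * pb, -n * a, -n * b)


def rk4(y, tau_end, steps=4000):
    """Classical fourth-order Runge-Kutta integration from tau = 0 to tau_end."""
    h = tau_end / steps
    tau = 0.0
    for _ in range(steps):
        k1 = rhs(tau, y)
        k2 = rhs(tau + h / 2, tuple(v + h / 2 * k for v, k in zip(y, k1)))
        k3 = rhs(tau + h / 2, tuple(v + h / 2 * k for v, k in zip(y, k2)))
        k4 = rhs(tau + h, tuple(v + h * k for v, k in zip(y, k3)))
        y = tuple(v + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
                  for v, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4))
        tau += h
    return y


# Initial point on the constraint surface a^2 + b^2 + pa^2 + pb^2 = 2E
y0 = (0.0, A_b * math.sin(BETA), A_a, A_b * math.cos(BETA))
tau_end = 3.0
a_num, b_num, _, _ = rk4(y0, tau_end)

# The same point of C reached along the unit-lapse curve, at s = integral of N
s = 2.0 * tau_end + math.sin(tau_end)
a_ref, b_ref = A_a * math.sin(s), A_b * math.sin(s + BETA)
deviation = max(abs(a_num - a_ref), abs(b_num - b_ref))
print("deviation between the two parametrizations:", deviation)
```

The two parametrizations visit the same points of $C$ at different values of the parameter: the curve traced in the $(a, b)$ plane is the same, and only this curve carries physical information.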
Relation with the variational principle. Parametrize the curve $\tilde\gamma$ with a parameter $\tau$. The action (3.38) reads

$$S = \int d\tau\; p_a(\tau)\, \frac{dq^a(\tau)}{d\tau}. \tag{3.55}$$

The constraint (3.39) can be implemented in the action with Lagrange multipliers $N_i(\tau)$. This defines the action

$$S = \int d\tau \left( p_a \frac{dq^a}{d\tau} - N_i\, H^i(p_a, q^a) \right). \tag{3.56}$$

Varying this action with respect to $N_i(\tau)$, $q^a(\tau)$ and $p_a(\tau)$ gives the Hamilton equations (3.49), (3.54).
Example: double pendulum. Consider the system defined by the hamiltonian (3.40). The Hamilton equations (3.49), (3.52) in the Lapse = 1 gauge give

$$\dot a = p_a, \quad \dot b = p_b, \quad \dot p_a = -a, \quad \dot p_b = -b, \quad a^2 + b^2 + p_a^2 + p_b^2 = 2E. \tag{3.57}$$

The general solution is

$$a(\tau) = A_a \sin(\tau), \qquad b(\tau) = A_b \sin(\tau + \beta), \tag{3.58}$$

where $A_a = \sqrt{2E}\,\sin\alpha$ and $A_b = \sqrt{2E}\,\cos\alpha$. The motions are given by the image in $C$ of these curves. These are the ellipses (3.41). The parametrization of the curves (3.58) has no physical significance. The physics is in the unparametrized ellipses in $C$ and in the relation between $a$ and $b$ they determine.
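A short numerical check that the parametrized curves trace an ellipse: the sketch below evaluates, along $a = \sqrt{2E}\sin\alpha\,\sin\tau$, $b = \sqrt{2E}\cos\alpha\,\sin(\tau+\beta)$, the combination $x^2 + y^2 - 2xy\cos\beta - 2E\sin^2\beta$ with $x = a/\sin\alpha$ and $y = b/\cos\alpha$. The sign of the cross term and the power of $E$ used here are the ones that follow from this parametrization through the identity $\sin^2 u + \sin^2 v - 2\sin u\,\sin v\,\cos(u - v) = \sin^2(u - v)$. The sample values and function names are assumptions.

```python
import math


def motion(tau, alpha, beta, E):
    """A point of the curve a = A_a sin(tau), b = A_b sin(tau + beta)."""
    A_a = math.sqrt(2.0 * E) * math.sin(alpha)
    A_b = math.sqrt(2.0 * E) * math.cos(alpha)
    return A_a * math.sin(tau), A_b * math.sin(tau + beta)


def ellipse_residual(a, b, alpha, beta, E):
    """x^2 + y^2 - 2 x y cos(beta) - 2 E sin^2(beta), x = a/sin(alpha), y = b/cos(alpha)."""
    x = a / math.sin(alpha)
    y = b / math.cos(alpha)
    return x * x + y * y - 2.0 * x * y * math.cos(beta) - 2.0 * E * math.sin(beta) ** 2


alpha, beta, E = 0.6, 0.9, 1.3
residuals = [
    abs(ellipse_residual(*motion(0.05 * n, alpha, beta, E), alpha, beta, E))
    for n in range(200)          # tau from 0 to about 10, more than one period
]
print("largest residual along the curve:", max(residuals))
```

The residual stays at rounding level for every value of $\tau$, so $\tau$ drops out of the relation between $a$ and $b$, as it must.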
Hamilton–Jacobi. The Hamilton–Jacobi formalism is elegant, general and powerful; it has a direct connection with quantum theory and is conceptually clear. The relativistic formulation of Hamilton–Jacobi theory is simpler than the conventional nonrelativistic version, indicating that the relativistic formulation unveils a natural and general structure of mechanical systems.

The relativistic Hamilton–Jacobi formalism is given by the system of $k$ partial differential equations

$$H\!\left(q^a, \frac{\partial S(q^a)}{\partial q^a}\right) = 0 \tag{3.59}$$

for the function $S(q^a)$ defined on the extended configuration space $C$. Let $S(q^a, Q^i)$ be a family of solutions, parametrized by the $n - k$ constants of integration $Q^i$. Pose

$$f^i(q^a, P_i, Q^i) \equiv \frac{\partial S(q^a, Q^i)}{\partial Q^i} + P_i = 0 \tag{3.60}$$

for $n - k$ arbitrary constants $P_i$. This is the evolution equation. The constants $Q^i, P_i$ coordinatize a $2(n - k)$-dimensional space $\Gamma$. This is the phase space.

The form of the relativistic Hamilton–Jacobi equation (3.59) is simpler than the usual nonrelativistic Hamilton–Jacobi equation (3.17). Furthermore, there is no equation to invert, as in the nonrelativistic formalism. Notice also that the function $S(q^a, Q^i)$ can be identified with the principal Hamilton–Jacobi function $S(t, q^i, Q^i) = -Et + W(q^i, Q^i)$ of the nonrelativistic formalism, as well as with the characteristic Hamilton–Jacobi function $W(q^i, Q^i)$, since (3.59) is formally like (3.20) with vanishing energy. The two functions are in fact identified in the relativistic formalism.
Example: double pendulum. The Hamilton–Jacobi equation of the timeless system (3.40) is

$$\left(\frac{\partial S(a, b)}{\partial a}\right)^2 + \left(\frac{\partial S(a, b)}{\partial b}\right)^2 + a^2 + b^2 - 2E = 0. \tag{3.61}$$
A one-parameter family of solutions is given by

$$\begin{aligned}
S(a, b, A) ={}& \frac{a}{2}\sqrt{A^2 - a^2} + \frac{A^2}{2}\arctan\!\left(\frac{a}{\sqrt{A^2 - a^2}}\right)\\
&+ \frac{b}{2}\sqrt{2E - A^2 - b^2} + \frac{2E - A^2}{2}\arctan\!\left(\frac{b}{\sqrt{2E - A^2 - b^2}}\right).
\end{aligned} \tag{3.62}$$

The general solution (3.41) of the system is directly obtained by writing

$$\frac{\partial S(a, b, A)}{\partial A} - \phi = 0, \tag{3.63}$$

where $\phi$ is an integration constant.
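Both statements can be checked with finite differences. The sketch below evaluates the family $S(a, b, A)$ written above, verifies that $(\partial S/\partial a)^2 + (\partial S/\partial b)^2 + a^2 + b^2 - 2E$ vanishes, and verifies that $\partial S/\partial A$ takes the same value at two different points of one motion $a = A\sin\tau$, $b = \sqrt{2E - A^2}\,\sin(\tau+\beta)$. The numerical values are assumptions, chosen so that all square roots are real and $|\tau|$, $|\tau + \beta| < \pi/2$.

```python
import math


def S(a, b, A, E):
    """One-parameter family of solutions of the timeless Hamilton-Jacobi equation."""
    ra = math.sqrt(A * A - a * a)
    rb = math.sqrt(2.0 * E - A * A - b * b)
    return (a / 2.0 * ra + A * A / 2.0 * math.atan(a / ra)
            + b / 2.0 * rb + (2.0 * E - A * A) / 2.0 * math.atan(b / rb))


def hj_residual(a, b, A, E, h=1e-6):
    """(dS/da)^2 + (dS/db)^2 + a^2 + b^2 - 2E, by central differences."""
    dS_da = (S(a + h, b, A, E) - S(a - h, b, A, E)) / (2.0 * h)
    dS_db = (S(a, b + h, A, E) - S(a, b - h, A, E)) / (2.0 * h)
    return dS_da**2 + dS_db**2 + a * a + b * b - 2.0 * E


def dS_dA(a, b, A, E, h=1e-6):
    """Derivative of S with respect to the parameter A, by central differences."""
    return (S(a, b, A + h, E) - S(a, b, A - h, E)) / (2.0 * h)


E, A, beta = 1.0, 1.0, 0.4
residual = hj_residual(0.3, 0.4, A, E)

B = math.sqrt(2.0 * E - A * A)          # amplitude of b on this motion
phis = [dS_dA(A * math.sin(tau), B * math.sin(tau + beta), A, E)
        for tau in (0.1, 0.5)]
print("Hamilton-Jacobi residual:", residual)
print("dS/dA at two points of the same motion:", phis)
```

The constancy of $\partial S/\partial A$ along the motion is what makes the relation $\partial S/\partial A - \phi = 0$ an equation for the curve in the $(a, b)$ plane, with no parameter along the curve ever introduced.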
Derivation of the Hamilton–Jacobi formalism. Since the phase space $\Gamma$ is a symplectic space, we can locally choose canonical coordinates $(Q^i, P_i)$ over it. These coordinates can be pulled back to $\Sigma$, where they are constant along the orbits. In fact, they label the orbits. Let $\theta_{ph} = P_i\,dQ^i$; then, on $\Sigma$, $d\theta_{ph} = \omega$. But $\omega = d\theta = d(p_a\,dq^a)$, so on $\Sigma$ we have

$$d(\theta_{ph} - \theta) = d(P_i\,dQ^i - p_a\,dq^a) = 0. \tag{3.64}$$

This implies that there should locally exist a function $S$ on $\Sigma$ such that

$$P_i\,dQ^i - p_a\,dq^a = -dS. \tag{3.65}$$

Let us choose $q^a$ and $Q^i$ as independent coordinates on $\Sigma$. Then (3.65) reads

$$dS(q^a, Q^i) = p_a(q^a, Q^i)\,dq^a - P_i(q^a, Q^i)\,dQ^i, \tag{3.66}$$

that is,

$$\frac{\partial S(q^a, Q^i)}{\partial q^a} = p_a(q^a, Q^i), \tag{3.67}$$

$$\frac{\partial S(q^a, Q^i)}{\partial Q^i} = -P_i(q^a, Q^i). \tag{3.68}$$

By the definition of $\Sigma$ we have $H(q^a, p_a) = 0$, which, using (3.67), gives the Hamilton–Jacobi equation (3.59). Equation (3.68) is then immediately the evolution equation (3.60).

In other words, $S(q^a, Q^i)$ is the generating function of a canonical transformation that relates the observables and their momenta $(q^a, p_a)$ to new canonical variables $(Q^i, P_i)$ satisfying $\dot Q^i = 0$, $\dot P_i = 0$. These new variables are constants of motion, and therefore define $\Gamma$. The relation between $C$ and $\Gamma$ given by the canonical transformation equations (3.67)–(3.68) is the evolution equation.
3.2.3 Nonrelativistic systems as a special case
Here I discuss in more detail how the notions and the structures of conven-tional mechanics described in Section 31 are recovered from the relativis-tic formalism A nonrelativistic system is simply a relativistic dynamicalsystem in which one of the partial observables qa is denoted t and calledldquotimerdquo and the hamiltonian H has the form
H = pt + H0 (369)
partt+ X0 (374)
32 Relativistic mechanics 115
where $H_0$ is independent of $p_t$ and is called the nonrelativistic hamiltonian. The quantity $E = -p_t$ is called energy. The device that measures the partial observable $t$ is called a clock.
The relativistic configuration space therefore has the structure
$$\mathcal{C} = \mathbb{R} \times \mathcal{C}_0, \tag{3.70}$$
with coordinates $q^a = (t, q^i)$, where $i = 1, \ldots, n-1$. The space $\mathcal{C}_0$ is the usual nonrelativistic configuration space. Accordingly, the cotangent space $\Omega = T^*\mathcal{C}$ has coordinates $(q^a, p_a) = (t, q^i, p_t, p_i)$.
If $H$ has the form (3.69), the relativistic Hamilton–Jacobi equation (3.59) becomes the conventional nonrelativistic Hamilton–Jacobi equation (3.17).
Given a state and a value $t$ of the clock observable, we can ask what are the possible values of the observables $q^i$ such that $(q^i, t)$ is a possible event. That is, we can ask what is the value of $q^i$ "when" the time is $t$. The solution is obtained by solving the evolution function $f^i(q^i, t; Q^i, P_i) = 0$ for the $q^i$. This gives
$$q^i = q^i(t; Q^i, P_i), \tag{3.71}$$
which is interpreted as the evolution equation of the variables $q^i$ in the time $t$. The form (3.69) of the hamiltonian guarantees that we can solve $f$ with respect to the $q^i$, because the Hamilton equation for $t$ (in the gauge Lapse $= 1$) is simply $t = \tau$, which can be inverted.
In the parametrized hamiltonian formalism, the evolution equation for $t(\tau)$ is trivial and gives, taking advantage of the freedom in rescaling $\tau$, just $t = \tau$. Using this, equations (3.53) become the conventional Hamilton equations, and (3.49) simply fixes the value of $p_t$, namely the energy.
In the presymplectic formalism, the surface $\Sigma$ turns out to be
$$\Sigma = \mathbb{R} \times \Gamma_0, \tag{3.72}$$
where the coordinate on $\mathbb{R}$ is the time $t$ and $\Gamma_0 = T^*\mathcal{C}_0$ is the nonrelativistic phase space. The restriction of $\theta$ to this surface has the Cartan form
$$\theta = p_i\, dq^i - H_0\, dt = \theta_0 - H_0\, dt. \tag{3.73}$$
We can take the vector field $X$ to have the form
$$X = \frac{\partial}{\partial t} + X_0, \tag{3.74}$$
where $X_0$ is a vector field on $\Gamma_0$. Then the equation of motion (3.46) reduces to the equation
$$(d\theta_0)(X_0) = -dH_0, \tag{3.75}$$
which is the geometric form of the conventional Hamilton equations. Thus $H$ determines how the variables in $\Gamma_0$ are correlated to the variable $t$, that is, "how the variables in $\Gamma_0$ evolve in time". In this sense, the nonrelativistic hamiltonian $H_0$ generates "evolution in the time $t$". This evolution is generated in $\Gamma_0$ by the hamiltonian flow $X_0$ of $H_0$. A point $s = (q^i, p_i)$ in $\Gamma_0$ is taken to the point $s(t) = (q^i(t), p_i(t))$, where
$$\frac{ds(t)}{dt} = X_0(s(t)). \tag{3.76}$$
The evolution of an observable (not depending explicitly on time), defined by $A_t(s) = A(s(t)) = A(s, t)$, can be written, introducing the Poisson bracket notation
$$\{A, B\} = -X_A(B) = X_B(A) = \sum_i \left( \frac{\partial A}{\partial q^i} \frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i} \frac{\partial B}{\partial q^i} \right), \tag{3.77}$$
as
$$\frac{dA_t}{dt} = \{A_t, H_0\}. \tag{3.78}$$
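As a numerical illustration of (3.77) and (3.78), the following sketch checks that $dA_t/dt$ agrees with $\{A_t, H_0\}$ for a one-dimensional harmonic oscillator. The choice of system, the observable $A = q^2 p$, the parameter values and all function names are assumptions made for this illustration only; derivatives are approximated by central finite differences.

```python
import math

M, OMEGA = 1.0, 2.0  # illustrative mass and frequency (assumed values)


def poisson_bracket(A, B, q, p, h=1e-5):
    """{A, B} = dA/dq dB/dp - dA/dp dB/dq for one degree of freedom,
    with partial derivatives taken by central finite differences."""
    dA_dq = (A(q + h, p) - A(q - h, p)) / (2 * h)
    dA_dp = (A(q, p + h) - A(q, p - h)) / (2 * h)
    dB_dq = (B(q + h, p) - B(q - h, p)) / (2 * h)
    dB_dp = (B(q, p + h) - B(q, p - h)) / (2 * h)
    return dA_dq * dB_dp - dA_dp * dB_dq


def H0(q, p):
    """Nonrelativistic hamiltonian of the harmonic oscillator."""
    return p * p / (2 * M) + 0.5 * M * OMEGA ** 2 * q * q


def flow(q, p, t):
    """Exact hamiltonian flow of H0: the point s = (q, p) is taken to s(t)."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (q * c + p / (M * OMEGA) * s, p * c - M * OMEGA * q * s)


def A(q, p):
    """An arbitrary observable with no explicit time dependence."""
    return q * q * p


def dA_dt_numeric(q, p, t, h=1e-5):
    """Left-hand side of (3.78): time derivative of A_t(s) = A(s(t))."""
    return (A(*flow(q, p, t + h)) - A(*flow(q, p, t - h))) / (2 * h)


def bracket_side(q, p, t):
    """Right-hand side of (3.78): {A_t, H0} evaluated at s = (q, p)."""
    def A_t(qq, pp):
        return A(*flow(qq, pp, t))
    return poisson_bracket(A_t, H0, q, p)
```

Calling `dA_dt_numeric(0.3, 0.7, 0.5)` and `bracket_side(0.3, 0.7, 0.5)` should give the same number up to finite-difference error.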
Instantaneous states and relativistic states. The nonrelativistic definition of state refers to the properties of a system at a certain moment of time. Denote this conventional notion of state as the "instantaneous state". The space of the instantaneous states is the conventional nonrelativistic phase space $\Gamma_0$. Let's fix the value $t = t_0$ of the time variable and characterize the instantaneous state in terms of the initial data. For the pendulum, these are position and momentum $(\alpha_0, p_0)$ at $t = t_0$. Thus $(\alpha_0, p_0)$ are coordinates on $\Gamma_0$.
On the other hand, a relativistic state is a solution of the equations of motion. (If there is gauge invariance, a state is a gauge equivalence class of solutions of the equations of motion.) The relativistic phase space $\Gamma$ is the space of the solutions of the equations of motion.
Given a value $t_0$ of the time, there is a one-to-one correspondence between initial data and solutions of the equations of motion: each solution of the equations of motion determines initial data at $t = t_0$, and each choice of initial data at $t_0$ determines uniquely a solution of the equations of motion. Therefore there is a one-to-one correspondence between instantaneous states and relativistic states, and the relativistic phase space $\Gamma$ is isomorphic to the nonrelativistic phase space: $\Gamma \sim \Gamma_0$. However, the isomorphism depends on the time $t_0$ chosen, and the physical interpretation of the two spaces is quite different. One is a space of states at a given time; the other is a space of motions.
In the case of the pendulum, the nonrelativistic phase space $\Gamma_0$ can be coordinatized with $(\alpha_0, p_0)$; the relativistic phase space $\Gamma$ can be coordinatized with $(A, \phi)$. The identification map $(A, \phi) \to (\alpha_0, p_0)$ is given by
$$\alpha_0(A, \phi) = A \sin(\omega t_0 + \phi), \tag{3.79}$$
$$p_0(A, \phi) = \omega m A \cos(\omega t_0 + \phi). \tag{3.80}$$
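The identification map (3.79)–(3.80) and its inverse can be sketched in a few lines. The function names and the numerical values used to exercise them are hypothetical; the inverse is obtained by elementary trigonometry and is not taken from the text. The sketch makes the dependence of the isomorphism on the chosen time $t_0$ explicit.

```python
import math


def to_instantaneous(A, phi, t0, omega, m):
    """Map a relativistic state (A, phi) to initial data (alpha0, p0) at time t0,
    following (3.79) and (3.80)."""
    alpha0 = A * math.sin(omega * t0 + phi)
    p0 = omega * m * A * math.cos(omega * t0 + phi)
    return alpha0, p0


def to_relativistic(alpha0, p0, t0, omega, m):
    """Inverse map: recover (A, phi) from initial data given at time t0.
    phi is returned up to a multiple of 2*pi."""
    scaled_p = p0 / (omega * m)              # equals A * cos(omega * t0 + phi)
    A = math.hypot(alpha0, scaled_p)
    phi = math.atan2(alpha0, scaled_p) - omega * t0
    return A, phi
```

The same motion $(A, \phi)$ is sent to different points of $\Gamma_0$ for different choices of `t0`, while the round trip at a fixed `t0` returns the original motion.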
The nonrelativistic phase space $\Gamma_0$ plays a double role in nonrelativistic hamiltonian mechanics: it is the space of the instantaneous states, but it is also the arena of nonrelativistic hamiltonian mechanics, over which $H_0$ is defined. In the relativistic context this double role is lost: one must distinguish the cotangent space $\Omega = T^*\mathcal{C}$, over which $H$ is defined, from the phase space $\Gamma$, which is the space of the motions. This distinction will become important in field theory, where $\Omega$ is finite-dimensional while $\Gamma$ is infinite-dimensional.
In a nonrelativistic system, $X_0$ generates a one-parameter group of transformations in $\Gamma_0$, the hamiltonian flow of $H_0$ on $\Gamma_0$. Instead of having the observables in $\mathcal{C}_0$ depending on $t$, one can shift perspective and view the observables in $\mathcal{C}_0$ as time-independent objects and the states in $\Gamma_0$ as time-dependent objects. This is a classical analog of the shift from the Heisenberg to the Schrödinger picture in quantum theory, and can be called the "classical Schrödinger picture".
In the relativistic theory there is no special "time" variable, $\mathcal{C}$ does not split naturally as $\mathcal{C} = \mathbb{R} \times \mathcal{C}_0$, the constraints do not have the form $H = p_t + H_0$, and the description of the correlations in terms of "how the variables in $\mathcal{C}_0$ evolve in time" is not available in general. In a system that does not admit a nonrelativistic formulation, the classical Schrödinger picture, in which states evolve in time, is not available: only the relativistic notions of state and observable make sense.
Special-relativistic systems. There are relativistic systems that do not admit a nonrelativistic formulation, such as the example of the double pendulum discussed above. There are also systems that can be given a nonrelativistic formulation, but whose structure is far cleaner in the relativistic formalism. Lorentz-invariant systems are typical examples. They can be formulated in the conventional hamiltonian picture only at the price of breaking Lorentz invariance. The choice of a preferred Lorentz frame specifies a preferred Lorentz time variable $t = x^0$. The predictions of the theory are Lorentz invariant, but the formalism is not. This way of dealing with the mechanics of special-relativistic systems hides the simplicity and symmetry of its hamiltonian structure. The relativistic hamiltonian formalism, exemplified below for the case of a free particle, is manifestly Lorentz invariant.
Example: relativistic particle. The configuration space $\mathcal{C}$ is a Minkowski space $M$, with coordinates $x^\mu$. The dynamics is given by the hamiltonian $H = p^\mu p_\mu + m^2$, which defines the mass-$m$ Lorentz hyperboloid $K_m$. The constraint surface $\Sigma$ is therefore given by $\Sigma = T^*M|_{H=0} = M \times K_m$. The null vectors of the restriction of $d\theta = dp_\mu \wedge dx^\mu$ to $\Sigma$ are
$$X = p_\mu \frac{\partial}{\partial x_\mu}, \tag{3.81}$$
because $\omega(X) = p_\mu\, dp^\mu = \tfrac{1}{2}\, d(p^2) = 0$ on $p^\mu p_\mu = -m^2$. The integral lines of $X$, namely the lines whose tangent is $X$, are
$$x^\mu(\tau) = P^\mu \tau + X^\mu, \qquad p^\mu(\tau) = P^\mu, \tag{3.82}$$
which give the physical motions of the particle. The space of these lines is six-dimensional (it is coordinatized by the eight numbers $(X^\mu, P^\mu)$, but $P^\mu P_\mu = -m^2$, and $(P^\mu, X^\mu)$ defines the same line as $(P^\mu, X^\mu + P^\mu a)$ for any $a$) and represents the phase space. The motions are thus the timelike straight lines in $M$.
Notice that all notions used are completely Lorentz invariant. A state is a timelike geodesic; an observable is any Minkowski coordinate; a correlation is a point in Minkowski space. The theory is about correlations between Minkowski coordinates, that is, observations of the particle at certain spacetime points. On the other hand, the split $M = \mathbb{R} \times \mathbb{R}^3$ necessary to define the usual hamiltonian formalism is observer dependent.
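The counting of the phase-space dimension above rests on two facts: the momentum lies on the mass hyperboloid, and shifting $X^\mu$ along $P^\mu$ relabels the same line. A minimal sketch of both, assuming the signature $(-,+,+,+)$ and units with $c = 1$; all function names are hypothetical:

```python
import math


def minkowski_dot(u, v):
    """Inner product of two four-vectors with signature (-, +, +, +)."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))


def on_shell_momentum(m, p_spatial):
    """Future-pointing P^mu with P.P = -m^2, built from its spatial part."""
    energy = math.sqrt(m * m + sum(c * c for c in p_spatial))
    return (energy, *p_spatial)


def motion(X, P, tau):
    """Point x^mu(tau) = P^mu tau + X^mu of the line labelled by (X, P)."""
    return tuple(Pm * tau + Xm for Xm, Pm in zip(X, P))


def shift_along_motion(X, P, a):
    """Equivalent label (X + P a, P): same line, shifted parameter."""
    return tuple(Xm + Pm * a for Xm, Pm in zip(X, P))
```

The line labelled by `shift_along_motion(X, P, a)` at parameter `tau` passes through the same event as the line labelled by `X` at parameter `tau + a`, which is why the two labels describe one state.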
The relativistic formulation of mechanics is not only more general, but also simpler and more elegant, and better operationally founded, than the conventional nonrelativistic formulation. This is true whether one uses the Hamilton equations, the geometric language, or the Hamilton–Jacobi formalism.
3.2.4 Discussion: mechanics is about relations between observables
The key difference between the relativistic formulation of mechanics discussed in this chapter and the conventional one, and in particular between the relativistic definitions of state and observable and the conventional ones, is the role played by time. In the nonrelativistic context, time is a primary concept. Mechanics is defined as the theory of the evolution in time. In the definition considered here, on the other hand, no special partial observable is singled out as the independent variable. Mechanics is defined as the theory of the correlations between partial observables.
Technically, $\mathcal{C}$ does not split naturally as $\mathcal{C} = \mathbb{R} \times \mathcal{C}_0$, the constraints do not have the form $H = p_t + H_0$, and the Schrödinger-like description of correlations in terms of "how states and observables evolve in time" is not available in general.
It is important to understand clearly the meaning of this shift of perspective.
The first point is that it is possible to formulate conventional mechanics in this time-independent language. In fact, the formalism of mechanics becomes even more clean and symmetric (for instance, Lorentz covariant) in this language. This is a remarkable fact by itself. What is remarkable is that the formal structure of mechanics doesn't really treat the time variable on a different footing than the other variables. The structure of mechanics is the formalization of what we have understood about the physical structure of the world. Therefore we can say that the physical (more precisely, mechanical) structure of the world is quite blind to the fact that there is anything "special" about the variable $t$.
Historically, the idea that in a relativistic context we need the time-independent notion of state has been advocated particularly by Dirac (see [148] in Chapter 5) and by Souriau [105]. The advantages of the relativistic notion of state are manifold. In special relativity, for instance, time transforms with other variables, and there is no covariant definition of instantaneous state. In a Lorentz-invariant field theory, in particular, the notion of instantaneous state breaks explicit Lorentz covariance: the instantaneous state is the value of the field on a simultaneity surface, which is such for a certain observer only. The relativistic notion of state, on the other hand, is Lorentz invariant.
The second point is that this shift in perspective is forced in general relativity, where the notion of a special spacelike surface over which initial data are fixed conflicts with diffeomorphism invariance. A generally covariant notion of instantaneous state, or a generally covariant notion of observable "at a given time", makes little physical sense. Indeed, none of the various notions of time that appear in general relativity (coordinate time, proper time, clock time) plays the role that $t$ plays in nonrelativistic mechanics. A consistent definition of state and observable in a generally covariant context cannot explicitly involve time.
The physical reason for this difference is discussed in Chapter 2. In nonrelativistic physics, time and position are defined with respect to a system of reference bodies and clocks that are implicitly assumed to exist and not to interact with the physical system studied. In gravitational physics, one discovers that no body or clock exists which does not interact with the gravitational field: the gravitational field affects directly the motion and the rate of any reference body or clock. Therefore one cannot separate reference bodies and clocks from the dynamical variables of the system. General relativity, and in fact any generally covariant theory, is always a theory of interacting variables that necessarily include the physical bodies and clocks used as references to characterize spacetime points. In the example of the pendulum discussed in Section 3.2.1, for instance, we can assume that the pendulum and the clock do not interact. In a general-relativistic context the two always interact, and $\mathcal{C}$ does not split into $\mathcal{C}_0$ and $\mathbb{R}$.
Summarizing, it is only in the nonrelativistic limit that mechanics can be seen as the theory of the evolution of the physical variables in time. In a fully relativistic context, mechanics is a theory of correlations between partial observables.
3.2.5 Space of boundary data $\mathcal{G}$ and Hamilton function $S$
I describe here the relativistic version of a structure that plays an important role in the quantum theory.
Hamilton function. Notice that the Hamilton function defined in (3.21) is naturally a function on (two copies of) the relativistic configuration space $\mathcal{C}$. In fact, its definition extends to the relativistic context: given two events $q^a$ and $q^a_0$ in $\mathcal{C}$, the Hamilton function is defined as
$$S(q^a, q^a_0) = \int_\gamma \theta, \tag{3.83}$$
where $\gamma$ is the orbit in $\Sigma$ of the motion that goes from $q^a_0$ to $q^a$. This is also the value of the action along this motion. For instance, for a nonrelativistic system we can write
$$S(q^a, q^a_0) = \int_\gamma \theta = \int_\gamma p_a\, dq^a \tag{3.84}$$
$$\begin{aligned}
&= \int_0^1 p_a(\tau)\, \dot q^a(\tau)\, d\tau = \int_0^1 \left( p_i(\tau)\, \dot q^i(\tau) + p_t(\tau)\, \dot t(\tau) \right) d\tau \\
&= \int_0^1 \left( p_i(\tau)\, \dot q^i(\tau) - H_0(\tau)\, \dot t(\tau) \right) d\tau \\
&= \int_{t_0}^{t} \left( p_i(t)\, \frac{dq^i(t)}{dt} - H_0(t) \right) dt \\
&= \int_{t_0}^{t} L\!\left( q^i, \frac{dq^i(t)}{dt} \right) dt,
\end{aligned} \tag{3.85}$$
where $L$ is the lagrangian. From the definition we have
$$\frac{\partial S(q^a, q^a_0)}{\partial q^a} = p_a(q^a, q^a_0), \tag{3.86}$$
where $p_a(q^a, q^a_0)$ is the value of the momentum at the final event. Notice that this value depends on $q^a$ as well as on $q^a_0$. The derivation of this equation is less obvious than it appears at first sight. I leave the details to the acute reader.
It follows from (3.86) that $S(q^a, q^a_0)$ satisfies the Hamilton–Jacobi equation (3.59). The quantities $q^a_0$ can be seen as the Hamilton–Jacobi integration constants. Notice that they are $n$, not $n-1$. Equations (3.60) now read
$$f_a(q^a, q^a_0, p_{a0}) = \frac{\partial S(q^a, q^a_0)}{\partial q^a_0} + p_{a0} = 0. \tag{3.87}$$
Therefore the phase space is directly (over-)coordinatized by initial coordinates and momenta $(q^a_0, p_{a0})$. These are not independent, for two reasons. First, they satisfy the equation $H = 0$. Second, different sets $(q^a_0(\tau), p_{a0}(\tau))$ along the same motion determine the same motion. Furthermore, one of the equations (3.87) turns out to be dependent on the others.
$S(q^a, q^a_0)$ satisfies the Hamilton–Jacobi equation in both sets of variables, namely it satisfies also
$$H\!\left( q^a_0, -\frac{\partial S(q^a, q^a_0)}{\partial q^a_0} \right) = 0, \tag{3.88}$$
where the minus sign comes from the fact that the second set of variables is in the lower integration boundary in (3.83).
If there is more than one physical motion $\gamma$ connecting the boundary data, the Hamilton function is multivalued. If $\gamma_1, \ldots, \gamma_n$ are distinct solutions with the same boundary values, we denote its different branches as
$$S_i(q^a_1, q^a_2) = \int_{\gamma_i} \theta. \tag{3.89}$$
The Hamilton function is closely related to the quantum theory. It is the phase of the propagator $W(q^a, q^a_0)$, which, as we shall see in Chapter 5, is the main object of the quantum theory. If $S$ is single valued, we have
$$W(q^a, q^a_0) \sim A(q^a, q^a_0)\, e^{\frac{i}{\hbar} S(q^a, q^a_0)}, \tag{3.90}$$
up to higher terms in $\hbar$. If $S$ is multivalued,
$$W(q^a, q^a_0) \sim \sum_i A_i(q^a, q^a_0)\, e^{\frac{i}{\hbar} S_i(q^a, q^a_0)}. \tag{3.91}$$
Example: free particle. In the case of the free particle, the value of the classical action along the motion is
$$\begin{aligned}
S(x, t, x_0, t_0) &= \int_0^1 \left( p_t\, \dot t + p\, \dot x \right) d\tau = p_t \int_{t_0}^{t} dt + p \int_{x_0}^{x} dx \\
&= -\frac{m (x - x_0)^2}{2 (t - t_0)} + \frac{m (x - x_0)^2}{t - t_0} = \frac{m (x - x_0)^2}{2 (t - t_0)}.
\end{aligned} \tag{3.92}$$
It is easy to check that $S$ solves the Hamilton–Jacobi equation of the free particle. The first of the two equations (3.87) gives the evolution equation
$$\frac{\partial S(x, t, x_0, t_0)}{\partial x_0} + p_0 = -m\, \frac{x - x_0}{t - t_0} + p_0 = 0. \tag{3.93}$$
The second equation constrains the $p_t$ integration constant. Since $\partial S / \partial t_0 = m (x - x_0)^2 / 2 (t - t_0)^2 = p_0^2 / 2m$, it reads
$$\frac{\partial S(x, t, x_0, t_0)}{\partial t_0} + p_{t0} = \frac{1}{2m}\, p_0^2 + p_{t0} = 0. \tag{3.94}$$
Recall that the propagator of the Schrödinger equation of the free particle is
$$W(x, t, x_0, t_0) = \frac{1}{\sqrt{i (t - t_0)}}\, e^{\, i\, \frac{m (x - x_0)^2}{2 (t - t_0)}} = \frac{1}{\sqrt{i (t - t_0)}}\, e^{\, i S(x, t, x_0, t_0)}. \tag{3.95}$$
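The statement that the free-particle Hamilton function (3.92) solves the Hamilton–Jacobi equation in both sets of variables, in the sense of (3.88), can be checked numerically. The sketch below assumes the relativistic hamiltonian $H = p_t + p^2/2m$ of the form (3.69); the helper names and test values are hypothetical, and derivatives are central finite differences.

```python
def hamilton_function(x, t, x0, t0, m=1.0):
    """Free-particle Hamilton function, eq. (3.92)."""
    return m * (x - x0) ** 2 / (2.0 * (t - t0))


def partial(f, args, index, h=1e-6):
    """Central finite-difference derivative of f with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def hj_residuals(x, t, x0, t0, m=1.0):
    """Residuals of H = p_t + p^2/2m evaluated on the gradients of S.
    First value: H(q, dS/dq) at the final event.
    Second value: H(q0, -dS/dq0) at the initial event, as in (3.88)."""
    def S(*a):
        return hamilton_function(*a, m=m)
    args = (x, t, x0, t0)
    dS_dx, dS_dt = partial(S, args, 0), partial(S, args, 1)
    dS_dx0, dS_dt0 = partial(S, args, 2), partial(S, args, 3)
    final = dS_dt + dS_dx ** 2 / (2.0 * m)
    initial = -dS_dt0 + (-dS_dx0) ** 2 / (2.0 * m)
    return final, initial


def initial_momentum(x, t, x0, t0, m=1.0):
    """p0 = -dS/dx0, which (3.93) identifies with m (x - x0) / (t - t0)."""
    def S(*a):
        return hamilton_function(*a, m=m)
    return -partial(S, (x, t, x0, t0), 2)
```

Both residuals should vanish up to finite-difference error for any boundary data with $t \neq t_0$.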
Example: double pendulum. The Hamilton function of the timeless system (3.40) can be computed directly from its definition. This gives
$$S(a, b, a', b') = S(a, b, a', b', A(a, b, a', b')), \tag{3.96}$$
where
$$S(a, b, a', b', A) = S(a, b, A) - S(a', b', A), \tag{3.97}$$
$S(a, b, A)$ is given in (3.62), and $A(a, b, a', b')$ is the value of $A$ of the ellipse (3.41) that crosses $(a, b)$ and $(a', b')$. This value can be obtained by noticing that (3.58) imply, with a little algebra, that
$$A^2 = \frac{a^2 + a'^2 - 2 a a' \cos\tau}{\sin^2\tau} \tag{3.98}$$
and
$$E = \frac{(a^2 + b^2 + a'^2 + b'^2) - 2 (a a' + b b') \cos\tau}{\sin^2\tau}. \tag{3.99}$$
The second equation can be solved for $\tau(a, b, a', b')$, and inserting this in the first gives $A(a, b, a', b')$. It is not complicated to check that the derivative $\partial S(a, b, a', b', A) / \partial A$ vanishes when $A = A(a, b, a', b')$. Using this, it is easy to see that (3.96) solves the Hamilton–Jacobi equation in both sets of variables.
Notice that, for given $(a, b, a', b')$, equation (3.98) gives $A$ as a function of $\tau$. We can therefore consider also the function
$$S(a, b, a', b', \tau) = S(a, b, A(\tau)) - S(a', b', A(\tau)), \tag{3.100}$$
which is the value of the action of the nonrelativistic system formed by two harmonic oscillators evolving in a physical time $\tau$, with a nonrelativistic hamiltonian $H$; that is, it is the Hamilton function of this system. With some algebra, this can be written also as
$$S(a, b, a', b', \tau) = M\tau + \frac{(a^2 + b^2 + a'^2 + b'^2) \cos\tau - 2 (a a' + b b')}{\sin\tau}. \tag{3.101}$$
As for $A$, we have immediately
$$\left. \frac{\partial S(a, b, a', b', \tau)}{\partial \tau} \right|_{\tau = \tau(a, b, a', b')} = 0. \tag{3.102}$$
This means that the Hamilton function of the timeless system is numerically equal to the Hamilton function of the two oscillators for the "correct" time $\tau$ needed to go from $(a', b')$ to $(a, b)$ staying on a motion of total energy $E$, and that this "correct" time $\tau = \tau(a, b, a', b')$ is the one that minimizes the Hamilton function of the two oscillators.
More precisely, for given $(a, b, a', b')$ there are two paths connecting $(a', b')$ with $(a, b)$: these are the two paths into which the ellipse that goes through $(a', b')$ and $(a, b)$ is cut by these two points. Denote by $S_1$ and $S_2$ the two values of the action along these paths. Their relation is easily obtained by noticing that the action along the entire ellipse is easily computed as
$$S_1 + S_2 = 2\pi E. \tag{3.103}$$
The space of the boundary data $\mathcal{G}$. The Hamilton function is a function on the space $\mathcal{G} = \mathcal{C} \times \mathcal{C}$. An element $\alpha \in \mathcal{G}$ is an ordered pair of elements of the extended configuration space $\mathcal{C}$: $\alpha = (q^a, q^a_0)$. Notice that $\alpha$ is the ensemble of the boundary conditions for a physical motion. For a nonrelativistic system, $\alpha = (t, q^i, t_0, q^i_0)$: the motion begins at $q^i_0$ at time $t_0$ and ends at $q^i$ at time $t$.
The space $\mathcal{G}$ carries a natural symplectic structure. In fact, let $i: \mathcal{G} \to \Gamma$ be the map that sends each pair to the orbit that the pair defines. Then we can define the two-form $\omega_{\mathcal{G}} = i^*\omega_{\rm ph}$, where $\omega_{\rm ph}$ is the symplectic form of the phase space, defined in Section 3.2.2. In other words, $\alpha = (q^a, q^a_0)$ can be taken as a natural over-coordinatization of the phase space. Instead of coordinatizing a motion with initial positions and momenta, we coordinatize it with initial and final positions. In these coordinates, the symplectic form is given by $\omega_{\mathcal{G}}$.
The two-form $\omega_{\mathcal{G}}$ can be computed without having first to compute $\Gamma$ and $\omega_{\rm ph}$. Denote by $\tilde\gamma_\alpha$ the orbit in $\Sigma$ with boundary data $\alpha$, and by $\gamma_\alpha$ its projection to $\mathcal{C}$. Then $\alpha$ is the boundary of $\gamma_\alpha$; we write $\alpha = \partial\gamma_\alpha$. Denote by $s$ and $s_0$ the final and initial points of $\tilde\gamma_\alpha$ in $\Sigma$. That is, $s = (q^a, p_a)$ and $s_0 = (q^a_0, p_{0a})$, where in general both $p_a$ and $p_{0a}$ depend on $q^a$ and on $q^a_0$. Let $\delta\alpha = (\delta q^a, \delta q^a_0)$ be a vector (an infinitesimal displacement) at $\alpha$. Then the following is true:
$$\begin{aligned}
\omega_{\mathcal{G}}(\alpha)(\delta_1\alpha, \delta_2\alpha) &= \omega_{\mathcal{G}}(q^a, q^a_0)\big( (\delta_1 q^a, \delta_1 q^a_0), (\delta_2 q^a, \delta_2 q^a_0) \big) \\
&= \omega(s)(\delta_1 s, \delta_2 s) - \omega(s_0)(\delta_1 s_0, \delta_2 s_0).
\end{aligned} \tag{3.104}$$
Notice that $\delta_1 s$, the variation of $s$, is determined by $\delta_1 q$ as well as by $\delta_1 q_0$, and so on. This equation expresses $\omega_{\mathcal{G}}$ directly in terms of $\omega$. As we shall see, this equation admits an immediate generalization in the field theoretical framework, where $\omega$ will be a five-form and $\omega_{\mathcal{G}}$ is a two-form.
Now fix a pair $\alpha = (q^a, q^a_0)$ and consider a small variation of only one of its elements, say
$$\delta\alpha = (\delta q^a, 0). \tag{3.105}$$
This defines a vector $\delta\alpha$ at $\alpha$ on $\mathcal{G}$, which can be pushed forward to $\Gamma$. If the variation is along the direction of the motion, then the push forward vanishes, that is, $i_*\delta\alpha = 0$, because $\alpha$ and $\alpha + \delta\alpha$ define the same motion. It follows that if the variation is along the direction of the motion, $\omega_{\mathcal{G}}(\delta\alpha) = 0$. Therefore the equation
$$\omega_{\mathcal{G}}(X) = 0 \tag{3.106}$$
gives the solutions of the equations of motion.
Thus the pair $(\mathcal{G}, \omega_{\mathcal{G}})$ contains all the relevant information of the system. The null directions of $\omega_{\mathcal{G}}$ define the physical motions, and if we divide $\mathcal{G}$ by these null directions, the factor space is the physical phase space, equipped with the physical symplectic structure.
Example: free particle. The space $\mathcal{G}$ has coordinates $\alpha = (t, x, t_0, x_0)$. Given this point in $\mathcal{G}$, there is one motion that goes from $(t_0, x_0)$ to $(t, x)$, which is
$$t(\tau) = t_0 + (t - t_0)\,\tau, \tag{3.107}$$
$$x(\tau) = x_0 + (x - x_0)\,\tau. \tag{3.108}$$
Along this motion,
$$p = m\, \frac{x - x_0}{t - t_0}, \tag{3.109}$$
$$p_t = -\frac{m (x - x_0)^2}{2 (t - t_0)^2}. \tag{3.110}$$
The map $i: \mathcal{G} \to \Gamma$ is thus given by
$$P = p = m\, \frac{x - x_0}{t - t_0}, \tag{3.111}$$
$$Q = x - \frac{p}{m}\, t = x - \frac{x - x_0}{t - t_0}\, t, \tag{3.112}$$
and therefore the two-form $\omega_{\mathcal{G}}$ is
$$\begin{aligned}
\omega_{\mathcal{G}} = i^*\omega_\Gamma &= dP(t, x, t_0, x_0) \wedge dQ(t, x, t_0, x_0) \\
&= m\; d\frac{x - x_0}{t - t_0} \wedge d\left( x - \frac{x - x_0}{t - t_0}\, t \right) \\
&= \frac{m}{t - t_0} \left( dx - \frac{x - x_0}{t - t_0}\, dt \right) \wedge \left( dx_0 - \frac{x - x_0}{t - t_0}\, dt_0 \right).
\end{aligned} \tag{3.113}$$
Immediately, we see that a variation $\delta\alpha = (\delta t, \delta x, 0, 0)$ (at constant $(x_0, t_0)$) such that $\omega_{\mathcal{G}}(\delta\alpha) = 0$ must satisfy
$$\delta x = \frac{x - x_0}{t - t_0}\, \delta t. \tag{3.114}$$
This is precisely a variation of $x$ and $t$ along the physical motion (determined by $(x_0, t_0)$). Therefore $\omega_{\mathcal{G}}(\delta\alpha) = 0$ gives again the equations of motion. The two null directions of $\omega_{\mathcal{G}}$ are thus given by the two vector fields
$$X = \frac{x - x_0}{t - t_0}\, \partial_x + \partial_t, \tag{3.115}$$
$$X_0 = \frac{x - x_0}{t - t_0}\, \partial_{x_0} + \partial_{t_0}, \tag{3.116}$$
which are in involution (their Lie bracket vanishes) and therefore define a foliation of $\mathcal{G}$ with two-dimensional surfaces. These surfaces are parametrized by $P$ and $Q$, given in (3.111), (3.112), and in fact
$$X(P) = X(Q) = X_0(P) = X_0(Q) = 0. \tag{3.117}$$
We have simply recovered in this way the physical phase space: the space of these surfaces is the phase space $\Gamma$, and the restriction of $\omega_{\mathcal{G}}$ to it is the physical symplectic form $\omega_{\rm ph}$.
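The vanishing of the four derivatives in (3.117) can be checked numerically. The sketch below evaluates the directional derivatives of $P$ and $Q$ along the vector fields (3.115) and (3.116) by central finite differences; the function names and the sample point are hypothetical, and coordinates are ordered as $\alpha = (t, x, t_0, x_0)$.

```python
def P(t, x, t0, x0, m=1.0):
    """Momentum of the motion defined by the boundary data, eq. (3.111)."""
    return m * (x - x0) / (t - t0)


def Q(t, x, t0, x0, m=1.0):
    """Position extrapolated to t = 0 along the motion, eq. (3.112)."""
    return x - (x - x0) / (t - t0) * t


def null_vectors(t, x, t0, x0):
    """Components of X (3.115) and X0 (3.116) in the coordinates (t, x, t0, x0)."""
    v = (x - x0) / (t - t0)
    X = (1.0, v, 0.0, 0.0)
    X0 = (0.0, 0.0, 1.0, v)
    return X, X0


def directional_derivative(f, point, direction, h=1e-6):
    """Derivative of f at `point` along `direction` (central difference)."""
    plus = [c + h * d for c, d in zip(point, direction)]
    minus = [c - h * d for c, d in zip(point, direction)]
    return (f(*plus) - f(*minus)) / (2.0 * h)


def null_direction_residuals(point):
    """Return (X(P), X(Q), X0(P), X0(Q)) at the given point of G."""
    X, X0 = null_vectors(*point)
    return tuple(directional_derivative(f, point, V) for V in (X, X0) for f in (P, Q))
```

All four residuals should vanish up to finite-difference error, confirming that $P$ and $Q$ are constant along the null directions and therefore label the leaves of the foliation.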
Physical predictions from $S$. There are several different ways of deriving physical predictions from the Hamilton function $S(q^a, q^a_0)$.
• Equation (3.87) gives the evolution function $f$ in terms of the Hamilton function.
• If we can measure the partial observables $q^a$ as well as their momenta $p_a$, then the Hamilton function can be used for making predictions as follows. Let
$$p^1_a(q^a_1, q^a_2) = \frac{\partial S(q^a_1, q^a_2)}{\partial q^a_1}, \qquad p^2_a(q^a_1, q^a_2) = \frac{\partial S(q^a_1, q^a_2)}{\partial q^a_2}. \tag{3.118}$$
The two equations
$$p^1_a = p^1_a(q^a_1, q^a_2), \qquad p^2_a = p^2_a(q^a_1, q^a_2) \tag{3.119}$$
relate the four partial observables of the quadruplet $(q^a_1, p^1_a, q^a_2, p^2_a)$. The theory predicts that it is possible to observe the quadruplet $(q^a_1, p^1_a, q^a_2, p^2_a)$ only if this satisfies (3.119). In this way, the classical theory determines which combinations of values of partial observables can be observed.
• Alternatively, we can fix two points $q^a_i$ and $q^a_f$ in $\mathcal{C}$ and ask whether a third point $q^a$ is on the motion determined by $q^a_i$ and $q^a_f$. That is, ask whether or not we could observe the correlation $q^a$, given that the correlations $q^a_i$ and $q^a_f$ are observed. A moment of reflection will convince the reader that if the answer to this question is positive, then
$$S(q^a_f, q^a) + S(q^a, q^a_i) = S(q^a_f, q^a_i), \tag{3.120}$$
because the action is additive along the motion. Furthermore, the incoming momentum at $q^a$ and the outgoing one must be equal; therefore
$$\frac{\partial S(q^a_f, q^a)}{\partial q^a} = -\frac{\partial S(q^a, q^a_i)}{\partial q^a}. \tag{3.121}$$
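The last criterion can be made concrete for the free particle, whose Hamilton function is $S = m (x - x_0)^2 / 2 (t - t_0)$. In the sketch below, events are pairs `(x, t)`; the function names and the sample events are hypothetical. The additivity defect vanishes when the intermediate event lies on the straight motion joining the end events, and so does the mismatch between the incoming and outgoing momenta at the intermediate event.

```python
def S(x, t, x0, t0, m=1.0):
    """Free-particle Hamilton function between (x0, t0) and (x, t)."""
    return m * (x - x0) ** 2 / (2.0 * (t - t0))


def additivity_defect(qf, q, qi, m=1.0):
    """S(qf, q) + S(q, qi) - S(qf, qi) for events given as (x, t)."""
    (xf, tf), (x, t), (xi, ti) = qf, q, qi
    return S(xf, tf, x, t, m) + S(x, t, xi, ti, m) - S(xf, tf, xi, ti, m)


def momentum_mismatch(qf, q, qi, m=1.0):
    """Incoming minus outgoing momentum at the intermediate event q."""
    (xf, tf), (x, t), (xi, ti) = qf, q, qi
    incoming = m * (x - xi) / (t - ti)    # dS(q, qi)/dx
    outgoing = m * (xf - x) / (tf - t)    # -dS(qf, q)/dx
    return incoming - outgoing


def on_motion(qf, q, qi, m=1.0, tol=1e-9):
    """True if q is compatible with the motion determined by qi and qf."""
    return abs(additivity_defect(qf, q, qi, m)) < tol
```

For instance, with `qi = (0.0, 0.0)` and `qf = (4.0, 2.0)`, the event `(1.0, 0.5)` lies on the motion while `(1.5, 0.5)` does not.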
3.2.6 Evolution parameters
A physical system is often defined by an action which is the integral of a lagrangian in an evolution parameter. But there are two different physical meanings that the evolution parameter may have.
We have seen that the variational principle governing any hamiltonian system can be written in the form (here $k = 1$)
$$S = \int d\tau \left( p_a \frac{dq^a}{d\tau} - N H(p_a, q^a) \right). \tag{3.122}$$
The action is invariant under reparametrizations of the evolution parameter $\tau$. The evolution parameter $\tau$ has no physical meaning: there is no measuring device associated with it.
On the other hand, consider a nonrelativistic system, where $q^a = (t, q^i)$ and $H = p_t + H_0$. The action (3.122) becomes
$$S = \int d\tau \left( p_t \frac{dt}{d\tau} + p_i \frac{dq^i}{d\tau} - N \big( p_t + H_0(p_i, q^i) \big) \right). \tag{3.123}$$
Varying $N$, we obtain the equation of motion
$$p_t = -H_0. \tag{3.124}$$
Inserting this relation back into the action, we obtain
$$S = \int d\tau \left( -H_0 \frac{dt}{d\tau} + p_i \frac{dq^i}{d\tau} \right). \tag{3.125}$$
We can now change the integration variable from $\tau$ to $t(\tau)$. Defining (in bad physicists' notation) $q^i(t) \equiv q^i(\tau(t))$, and so on, we can write
$$S = \int d\tau\, \frac{dt}{d\tau} \left( -H_0 + p_i \frac{dq^i}{dt} \right) = \int dt \left( p_i \frac{dq^i(t)}{dt} - H_0 \right). \tag{3.126}$$
The evolution parameter in the action is no longer an arbitrary unphysical parameter $\tau$. It is one of the partial observables: the time observable $t$.
If we are given an action, we must understand whether the evolution parameter in the action is a partial observable, such as $t$, or an unphysical parameter, such as $\tau$. If the action is invariant under reparametrizations of its evolution parameter, then the evolution parameter is unphysical. If it is not, then the evolution parameter is a partial observable.
The same is true if the action is given in lagrangian form. In performing the Legendre transform from the lagrangian to the hamiltonian formalism, the consequence of the invariance of the action under reparametrizations is twofold. First, the relation between velocities and momenta cannot be inverted. The map from the space of the coordinates and velocities $(q^a, \dot q^a)$ to the space of coordinates and momenta $(q^a, p_a)$ is not invertible. The image of this map is a subspace $\Sigma$ of $\Omega$, and we can characterize $\Sigma$ by means of an equation $H = 0$ for a suitable hamiltonian $H$. Second, the canonical hamiltonian computed via the Legendre transform vanishes on $\Sigma$. In the language of constrained system theory, this is because the canonical hamiltonian generates evolution in the parameter of the action; since this is unphysical, this evolution is gauge; the generator of a gauge is a constraint, and therefore vanishes on $\Sigma$.
The evolution parameter in the action is often denoted $t$, whether it is a partial observable or an unphysical parameter. One should not confuse the $t$ in the first case with the $t$ in the second case. They have very different physical interpretations. The time coordinate $t$ in Maxwell theory is a partial observable. The time coordinate $t$ in GR is an unphysical parameter. The fact that the two are generally denoted with the same letter and with the same name is a very unfortunate historical accident.
Example: relativistic particle. As we have seen, the hamiltonian dynamics of a relativistic particle is defined by the relativistic hamiltonian $H = p_\mu p^\mu + m^2$, namely by the action principle
$$S = \int d\tau \left( p_\mu \dot x^\mu - \frac{N}{2} \left( p_\mu p^\mu + m^2 \right) \right). \tag{3.127}$$
The relation between velocities and momenta, obtained by varying $p_\mu$, is $\dot x^\mu = N p^\mu$. The inverse Legendre transform therefore gives
$$S = \frac{1}{2} \int d\tau \left( \frac{\dot x_\mu \dot x^\mu}{N} - N m^2 \right). \tag{3.128}$$
We can also get rid of the Lagrange multiplier $N$ from this action by writing its equation of motion,
$$-\frac{\dot x_\mu \dot x^\mu}{N^2} - m^2 = 0, \tag{3.129}$$
which is solved by
$$N = \frac{\sqrt{-\dot x_\mu \dot x^\mu}}{m}, \tag{3.130}$$
and inserting this relation back into the action. Each of the two terms in (3.128) then contributes $-\tfrac{m}{2} \sqrt{-\dot x_\mu \dot x^\mu}$, which gives
$$S = -m \int d\tau\, \sqrt{-\dot x_\mu \dot x^\mu}, \tag{3.131}$$
which is the best known reparametrization invariant action for the relativistic particle.
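Reparametrization invariance can be illustrated numerically on the integral $\int d\tau\, \sqrt{-\dot x_\mu \dot x^\mu}$, which is the action of the relativistic particle up to the constant factor multiplying it. The sketch below works in 1+1 dimensions with signature $(-,+)$ and $c = 1$, and evaluates the integral along one straight worldline in two different parametrizations. The worldline, the parametrizations and all names are assumptions made for the illustration.

```python
import math


def worldline_length(t_of_tau, x_of_tau, n=2000, h=1e-6):
    """Midpoint-rule estimate of the integral over tau in [0, 1] of
    sqrt(-xdot.xdot) = sqrt(tdot^2 - xdot^2), with velocities taken
    by central finite differences."""
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) / n
        t_dot = (t_of_tau(tau + h) - t_of_tau(tau - h)) / (2.0 * h)
        x_dot = (x_of_tau(tau + h) - x_of_tau(tau - h)) / (2.0 * h)
        total += math.sqrt(t_dot ** 2 - x_dot ** 2) / n
    return total


T, V = 2.0, 0.6  # coordinate time elapsed and velocity (assumed values)


def linear(tau):
    """Parametrization with t proportional to tau."""
    return tau


def warped(tau):
    """A different monotonic parametrization with the same end points."""
    return 0.5 * (tau + tau * tau)


length_linear = worldline_length(lambda s: T * linear(s), lambda s: V * T * linear(s))
length_warped = worldline_length(lambda s: T * warped(s), lambda s: V * T * warped(s))
```

Both values should agree with the proper time $T \sqrt{1 - V^2}$ elapsed along the worldline, independently of how $\tau$ labels its points.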
3.2.7 Complex variables and reality conditions
In GR, it is often convenient to use complex dynamical variables, since these simplify the form of the dynamical equations. A particularly convenient choice is a mixture of complex and real variables, where one canonical variable is complex while the conjugate one is real. As we shall see, the selfdual connection (2.19), which is complex, naturally leads to canonical variables of this type. To exemplify how the use of such variables affects dynamics, consider a free particle with coordinate $x$, momentum $p$ and hamiltonian $H_0(x, p) = p^2/2m$, and assume we want to describe its dynamics in terms of the variables $(x, z)$, where
$$z = x - ip. \tag{3.132}$$
In terms of these variables, the nonrelativistic hamiltonian reads
$$H_0(x, z) = -\frac{1}{2m} (x - z)^2. \tag{3.133}$$
Consider $z$ as a configuration variable and $ix$ as its momentum variable. The Hamilton–Jacobi equation becomes
$$\frac{\partial S(z, t)}{\partial t} = -H_0\!\left( -i\, \frac{\partial S(z, t)}{\partial z},\, z \right) = \frac{1}{2m} \left( i\, \frac{\partial S(z, t)}{\partial z} + z \right)^2. \tag{3.134}$$
This is solved by
$$S(z, t; k) = k z + \frac{i}{2} z^2 - \frac{k^2}{2m}\, t. \tag{3.135}$$
Equating the derivative of $S$ with respect to the parameter $k$ to a constant, we obtain the solution
$$C = \frac{\partial S(z, t; k)}{\partial k} = z - \frac{k}{m}\, t, \tag{3.136}$$
that is,
$$z(t) = \frac{k}{m}\, t + C. \tag{3.137}$$
This is not the end of the story, since so far $k$ and $C$ can be arbitrary complex constants. To find the good solutions, corresponding to real $x$ and $p$, we have to remind ourselves that $z$ and $x$ are not truly independent, since $x$ is the real part of $z$:
$$z + \bar z = 2x, \tag{3.138}$$
that is,
$$z + \bar z = -2i\, \frac{\partial S}{\partial z}. \tag{3.139}$$
Inserting the solutions (3.137) in the lhs, we get
$$\mathrm{Im}[k]\, \frac{t}{m} + \mathrm{Im}[C] = -k. \tag{3.140}$$
Therefore $k$ is real, and the imaginary part of $C$ is $-k$. This immediately gives the correct solution.
Equation (3.138) is called the reality condition. The example illustrates that in the Hamilton–Jacobi formalism the reality condition restricts the values of the Hamilton–Jacobi constants, once the solutions of the evolution equations are inserted.
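That (3.135) solves (3.134), and that its derivative with respect to $k$ has the form (3.136), can be checked with complex arithmetic before any reality condition is imposed. The sketch below uses central finite differences with a real step, which is legitimate here because $S$ is a polynomial in $z$, $t$ and $k$; the function names and sample values are hypothetical.

```python
def S(z, t, k, m=1.0):
    """Complex Hamilton-Jacobi solution, eq. (3.135)."""
    return k * z + 0.5j * z * z - k * k * t / (2.0 * m)


def hj_residual(z, t, k, m=1.0, h=1e-6):
    """dS/dt - (1/2m) (i dS/dz + z)^2, which (3.134) requires to vanish."""
    dS_dt = (S(z, t + h, k, m) - S(z, t - h, k, m)) / (2.0 * h)
    dS_dz = (S(z + h, t, k, m) - S(z - h, t, k, m)) / (2.0 * h)
    return dS_dt - (1j * dS_dz + z) ** 2 / (2.0 * m)


def constant_C(z, t, k, m=1.0, h=1e-6):
    """C = dS/dk, which (3.136) gives as z - k t / m."""
    return (S(z, t, k + h, m) - S(z, t, k - h, m)) / (2.0 * h)
```

The residual should vanish for arbitrary complex `z` and `k`, which is the sense in which $k$ and $C$ are still unrestricted complex constants at this stage.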
3.3 Field theory
There are several ways in which a field theory can be cast in hamiltonian form. One possibility is to take the space of the fields at fixed time as the nonrelativistic configuration space $Q$. This strategy badly breaks special- and general-relativistic invariance. Lorentz covariance is broken by the fact that one has to choose a Lorentz frame for the $t$ variable. Far more disturbing is the conflict with general covariance. The very foundation of generally covariant physics is the idea that the notion of a simultaneity surface over all the Universe is devoid of physical meaning. It is better to found hamiltonian mechanics on a notion not devoid of physical significance.
A second alternative is to formulate mechanics on the space of the solutions of the equations of motion. The idea goes back to Lagrange. In the generally covariant context, a symplectic structure can be defined over this space using a spacelike surface, but one can show that the definition is surface independent, and therefore it is well defined. This strategy has been explored by several authors [108]. The structure is viable in principle, and has the merit of showing that the hamiltonian formalism is intrinsically covariant. In practice, it is difficult to work with the space of solutions to the field equations in the case of an interacting theory. Therefore we must either work over a space that we can't even coordinatize, or coordinatize the space with initial data on some instantaneity surface, and therefore effectively go back to the conventional fixed-time formulation.
The third possibility, which I consider here, is to use a covariant finite-dimensional space for formulating hamiltonian mechanics. I noted above that in the relativistic context the double role of the phase space, as the arena of mechanics and the space of the states, is lost. The space of the states, namely the phase space $\Gamma$, is infinite-dimensional in field theory, essentially by definition of field theory. But this does not imply that the arena of hamiltonian mechanics has to be infinite-dimensional as well. The natural arena for relativistic mechanics is the extended configuration space $\mathcal{C}$ of the partial observables. Is the space of the partial observables of a field theory finite- or infinite-dimensional?
331 Partial observables in field theory
Consider a field theory for a field φ(x) with N components The fieldis defined over spacetime M with coordinates x and takes values in anN -dimensional target space T
φ M minusrarr T
x minusrarr φ(x) (3141)
For instance, this could be Maxwell theory for the electric and magnetic fields $\phi = (\vec{E}, \vec{B})$, where $N = 6$. In order to make physical measurements on the field described by this theory, we need $N$ measuring devices to measure the components of the field $\phi$, and four devices (one clock and three devices giving us the distance from three reference objects) to determine the spacetime position $x$. Field values $\phi$ and positions $x$ are therefore the partial observables of a field theory. Therefore the operationally motivated relativistic configuration space for a field theory is the finite-dimensional space
$$\mathcal{C} = M \times T, \qquad (3.142)$$
which has dimension $4 + N$. A correlation is a point $(x, \phi)$ in $\mathcal{C}$. It represents a certain value ($\phi$) of the fields at a certain spacetime point ($x$). This is the obvious generalization of the $(t, \alpha)$ correlations of the pendulum of the example in Section 3.2.1.
A physical motion $\gamma$ is a physically realizable ensemble of correlations. A motion is determined by a solution $\phi(x)$ of the field equations. Such a solution determines a 4-dimensional surface in the ($(4+N)$-dimensional) space $\mathcal{C}$: the surface is the graph of the function (3.141), namely the ensemble of the points $(x, \phi(x))$. The space of the solutions of the field equations, namely the phase space $\Gamma$, is therefore an (infinite-dimensional) space of 4d surfaces $\gamma$ in the $(4+N)$-dimensional configuration space $\mathcal{C}$. Each state in $\Gamma$ determines a surface $\gamma$ in $\mathcal{C}$.
Hamiltonian formulations of field theory defined directly on $\mathcal{C} = M \times T$ are possible, and have been studied. The main reason this is possible is that in a local field theory the equations of motion are local, and therefore what happens at a point depends only on the neighborhood of that point. There is no need, therefore, to consider full spacetime to find the hamiltonian structure of the field equations. I refer the reader to the beautiful and detailed paper [109], and the ample references therein, for a discussion of this kind of approach. I give a simple and self-contained illustration of the formalism below, with the emphasis on its general covariance.
3.3.2 Relativistic hamiltonian mechanics
Consider a field theory on Minkowski space $M$. Call $\phi^A(x^\mu)$ the field, where $A = 1, \ldots, N$. The field is a function $\phi: M \to T$, where $T = \mathbb{R}^N$ is the target space, namely the space in which the field takes values. The extended configuration space of this theory is the finite-dimensional space $\mathcal{C} = M \times T$, with coordinates $q^a = (x^\mu, \phi^A)$. The coordinates $q^a$ are the $(4+N)$ partial observables whose relations are described by the theory. A solution of the equations of motion defines a four-dimensional surface $\gamma$ in $\mathcal{C}$. If we coordinatize this surface using the coordinates $x^\mu$, then this surface is given by $[x^\mu, \phi^A(x^\mu)]$, where $\phi^A(x^\mu)$ is a solution of the field equations. If, alternatively, we use an arbitrary parametrization with parameters $\tau^\rho$, $\rho = 0, 1, 2, 3$, then the surface is given by $[x^\mu(\tau^\rho), \phi^A(\tau^\rho)]$, where $\phi^A(x^\mu(\tau^\rho)) = \phi^A(\tau^\rho)$.
In the case of a finite number of degrees of freedom (and no gauges), motions are given by one-dimensional curves. At each point of the curve there is one tangent vector, and momenta coordinatize the one-forms. In field theory, motions are four-dimensional surfaces, and have four independent tangents $X_\mu$, or a "quadritangent" $X = \epsilon^{\mu\nu\rho\sigma} X_\mu \otimes X_\nu \otimes X_\rho \otimes X_\sigma$, at each point. Accordingly, momenta coordinatize the four-forms. Let $\Omega = \Lambda^4 T^*\mathcal{C}$ be the bundle of the four-forms $p_{abcd}\, dq^a \wedge dq^b \wedge dq^c \wedge dq^d$ over $\mathcal{C}$. A point in $\Omega$ is thus a pair $(q^a, p_{abcd})$. The space $\Omega$ carries the canonical four-form
$$\tilde\theta = p_{abcd}\, dq^a \wedge dq^b \wedge dq^c \wedge dq^d. \qquad (3.143)$$
In general, given the finite-dimensional space $\mathcal{C}$ of the partial observables $q^a$, dynamics is defined by a relativistic hamiltonian $H: \Omega \to V$, where $\Omega = \Lambda^4 T^*\mathcal{C}$ and $V$ is a vector space. Denote by $\tilde\gamma$ a four-dimensional surface in $\Omega$, and by $\gamma$ the projection of this surface on $\mathcal{C}$. The physical motions $\gamma$ are determined by the following.

Variational principle. A surface $\gamma$ with a boundary $\alpha$ is a physical motion if $\tilde\gamma$ extremizes the integral
$$S[\tilde\gamma] = \int_{\tilde\gamma} p_{abcd}\, dq^a \wedge dq^b \wedge dq^c \wedge dq^d \qquad (3.144)$$
in the class of the surfaces $\tilde\gamma$ satisfying
$$H(q^a, p_{abcd}) = 0 \qquad (3.145)$$
and whose restriction $\gamma$ to $\mathcal{C}$ is bounded by $\alpha$.
This is a completely straightforward generalization of the variational principle of Section 3.2. Equation (3.145) defines a surface $\Sigma$ in $\Omega$. As before, we denote by $\theta$ the restriction to $\Sigma$ of the canonical four-form $\tilde\theta$ of (3.143), and $\omega = d\theta$.
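For a field theory on Minkowski space without gauges, the system (3.145) is given by
$$p_{ABCD} = p_{ABC\mu} = p_{AB\mu\nu} = 0, \qquad (3.146)$$
$$H = \pi + H_0(x^\mu, \phi^A, p^\mu_A) = 0, \qquad (3.147)$$
where $H_0$ is DeDonder's covariant hamiltonian [110] (see below for an example). It is convenient to use the notation $p_{\mu\nu\rho\sigma} = \pi\, \epsilon_{\mu\nu\rho\sigma}$ and $p_{A\nu\rho\sigma} = p^\mu_A\, \epsilon_{\mu\nu\rho\sigma}$ for the nonvanishing momenta, and to use coordinates $(x^\mu, \phi^A, p^\mu_A)$ on $\Sigma$. On the surface defined by (3.146),
$$\tilde\theta = \pi\, d^4x + p^\mu_A\, d\phi^A \wedge d^3x_\mu, \qquad (3.148)$$
where we have introduced the notation $d^4x = dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3$ and $d^3x_\mu = d^4x(\partial_\mu) = \frac{1}{3!}\, \epsilon_{\mu\nu\rho\sigma}\, dx^\nu \wedge dx^\rho \wedge dx^\sigma$. On $\Sigma$, defined by (3.146) and (3.147),
$$\theta = \tilde\theta|_\Sigma = -H_0(x^\mu, \phi^A, p^\mu_A)\, d^4x + p^\mu_A\, d\phi^A \wedge d^3x_\mu, \qquad (3.149)$$
and $\omega$ is the five-form
$$\omega = -dH_0(x^\mu, \phi^A, p^\mu_A) \wedge d^4x + dp^\mu_A \wedge d\phi^A \wedge d^3x_\mu. \qquad (3.150)$$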
An orbit of $\omega$ is a four-dimensional surface $m$ immersed in $\Sigma$, such that at each of its points a quadruplet $X$ of tangents to the surface satisfies
$$\omega(X) = 0. \qquad (3.151)$$
I leave to the reader the exercise of showing that the projection of an orbit on $\mathcal{C}$ is a physical motion.
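In more detail, let $(\partial_\mu, \partial_A, \partial^A_\mu)$ be the basis in the tangent space of $\Sigma$ determined by the coordinates $(x^\mu, \phi^A, p^\mu_A)$. Parametrize the surface with arbitrary parameters $\tau^\rho$. The surface is then given by the points $[x^\mu(\tau^\rho), \phi^A(\tau^\rho), p^\mu_A(\tau^\rho)]$. Let $\partial_\rho = \partial/\partial\tau^\rho$. Then let
$$X_\rho = \partial_\rho x^\mu(\tau^\rho)\, \partial_\mu + \partial_\rho \phi^A(\tau^\rho)\, \partial_A + \partial_\rho p^\mu_A(\tau^\rho)\, \partial^A_\mu. \qquad (3.152)$$
Then $X = X_0 \otimes X_1 \otimes X_2 \otimes X_3$ is a rank four tensor on $\Sigma$. If $\omega(X) = 0$, then $\phi^A(x^\mu)$, determined by $\phi^A(x^\mu(\tau^\rho)) = \phi^A(\tau^\rho)$, is a physical motion.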
Summarizing, the canonical formalism of field theory is completely defined by the couple $(\mathcal{C}, H)$, where $\mathcal{C}$ is the finite-dimensional space of the partial observables (field values and spacetime coordinates) and $H$ is a hamiltonian on the finite-dimensional space $\Omega = \Lambda^4 T^*\mathcal{C}$. Equivalently, it is completely defined by the finite-dimensional presymplectic space $(\Sigma, \theta)$. The formalism, as well as its interpretation, make sense even in the case in which the coordinates of $\mathcal{C}$ do not split into $x^\mu$ and $\phi^A$, and the relativistic hamiltonian does not have the particular form (3.146)–(3.147).
Example: scalar field. As an example, consider a scalar field $\phi(x^\mu)$ on Minkowski space, satisfying the field equations
$$\partial_\mu \partial^\mu \phi(x^\mu) + m^2 \phi(x^\mu) + V'(\phi(x^\mu)) = 0. \qquad (3.153)$$
Here the Minkowski metric has signature $[+,-,-,-]$ and $V'(\phi) = dV(\phi)/d\phi$. The field is a function $\phi: M \to T$, where here $T = \mathbb{R}$. The relativistic configuration space of this theory is the five-dimensional space $\mathcal{C} = M \times T$, with coordinates $(x^\mu, \phi)$. The space $\Omega$ has coordinates $(x^\mu, \phi, \pi, p^\mu)$ (equation (3.146) is trivially satisfied) and carries the canonical four-form
$$\tilde\theta = \pi\, d^4x + p^\mu\, d\phi \wedge d^3x_\mu. \qquad (3.154)$$
The dynamics is defined on this space by the DeDonder relativistic hamiltonian
$$H = \pi + H_0 = 0, \qquad (3.155)$$
$$H_0 = \frac{1}{2}\left(p^\mu p_\mu + m^2\phi^2 + 2V(\phi)\right). \qquad (3.156)$$
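Therefore we can use coordinates $(x^\mu, \phi, p^\mu)$ on the surface $\Sigma$ defined by these equations, and (3.149) gives
$$\theta = -\frac{1}{2}\left(p^\mu p_\mu + m^2\phi^2 + 2V(\phi)\right) d^4x + p^\mu\, d\phi \wedge d^3x_\mu. \qquad (3.157)$$
The couple $(\Sigma, \theta)$ defines the presymplectic formulation of the system. $\omega$ is the five-form
$$\omega = d\theta = -\left(p^\mu dp_\mu + m^2\phi\, d\phi + V'(\phi)\, d\phi\right) \wedge d^4x + dp^\mu \wedge d\phi \wedge d^3x_\mu. \qquad (3.158)$$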
A tangent vector has the form
$$V = X^\mu\, \partial_{x^\mu} + X^\phi\, \partial_\phi + Y^\mu\, \partial_{p^\mu}. \qquad (3.159)$$
If we coordinatize the orbits of $\omega$ with the coordinates $x^\mu$, at every point we have the four independent tangent vectors
$$X_\mu = \partial_{x^\mu} + (\partial_\mu \phi)\, \partial_\phi + (\partial_\mu p^\rho)\, \partial_{p^\rho}, \qquad (3.160)$$
and the quadritangent $X = \epsilon^{\mu\nu\rho\sigma} X_\mu \otimes X_\nu \otimes X_\rho \otimes X_\sigma$. Inserting (3.160) and (3.158) in $\omega(X) = 0$, a straightforward calculation yields
$$\partial_\mu \phi(x) = p_\mu(x), \qquad (3.161)$$
$$\partial_\mu p^\mu(x) = -m^2\phi(x) - V'(\phi(x)), \qquad (3.162)$$
and therefore precisely the field equations (3.153). Notice that the canonical formalism is manifestly Lorentz covariant, and no equal-time initial data surface has to be chosen.
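The pair of first-order equations (3.161)–(3.162) can be checked numerically on a simple solution. The following is a minimal sketch, assuming a free field ($V = 0$) and a plane wave with illustrative values of the mass and wave vector; the names `phi`, `partial`, `p_upper` and `residual` are hypothetical helpers introduced here, not notation from the text. The momentum is built from (3.161) by finite differences, and the residual of (3.162) is then evaluated at a sample spacetime point.

```python
import math

# Illustrative (assumed) parameters: mass and spatial wave vector of a plane wave.
MASS = 0.8
K = (0.3, -0.5, 0.4)
OMEGA = math.sqrt(sum(k * k for k in K) + MASS ** 2)  # on-shell frequency
ETA = (1.0, -1.0, -1.0, -1.0)  # Minkowski metric, signature [+,-,-,-]


def phi(x):
    """Free plane-wave solution phi(x) = cos(omega*t - k.x), with x = (t, x1, x2, x3)."""
    phase = OMEGA * x[0] - sum(k * xi for k, xi in zip(K, x[1:]))
    return math.cos(phase)


def partial(f, x, mu, h=1e-4):
    """Central finite-difference approximation of d f / d x^mu at the point x."""
    up, down = list(x), list(x)
    up[mu] += h
    down[mu] -= h
    return (f(up) - f(down)) / (2.0 * h)


def p_upper(mu):
    """Momentum p^mu built from (3.161), p_mu = d_mu phi, with the index raised by eta."""
    return lambda x: ETA[mu] * partial(phi, x, mu)


def residual(x):
    """Difference between the two sides of (3.162) for V = 0: d_mu p^mu + m^2 phi."""
    divergence = sum(partial(p_upper(mu), x, mu) for mu in range(4))
    return divergence + MASS ** 2 * phi(x)


point = [0.2, 0.1, -0.3, 0.5]
print("residual of (3.162):", residual(point))
```

The residual vanishes up to finite-difference error because the frequency is chosen on shell, $\omega^2 = |k|^2 + m^2$; an off-shell frequency would leave a residual proportional to $\phi$.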
A state is a 4d surface $(x, \phi(x))$ in the extended configuration space $\mathcal{C}$. It represents a set of combinations of measurements of partial observables that can be realized in Nature. The phase space $\Gamma$ is the infinite-dimensional space of these states. A state determines whether or not a certain correlation $(x, \phi)$, or a certain set of correlations $(x_1, \phi_1), \ldots, (x_n, \phi_n)$, can be observed. They can be observed if the points $(x_i, \phi_i)$ lie on the 4d surface that represents the state. Conversely, the observation of a certain set of correlations gives us information on the state: the surface has to pass through the observed points.
3.3.3 The space of boundary data $\mathcal{G}$ and the Hamilton function $S$
The space of boundary data $\mathcal{G}$, described in Section 3.2.5, plays a key role in quantum theory. In the finite-dimensional case, $\mathcal{G}$ is the cartesian product of the extended configuration space with itself, but the same is not true in the field theoretical context, where we need an infinite number of boundary data to characterize solutions. Recall that in the finite-dimensional case $\mathcal{G}$ is the space of the possible boundaries of a motion in $\mathcal{C}$. In field theory, a motion is a 4d surface in $\mathcal{C}$. Its boundary is a three-dimensional surface $\alpha$ without boundaries in $\mathcal{C}$. Let us therefore define $\mathcal{G}$ in field theory as a space of oriented three-dimensional surfaces $\alpha$ without boundaries in $\mathcal{C}$. As $\mathcal{C} = M \times T$, the boundary data $\alpha$ includes a 3d boundary surface $\sigma$ in spacetime, as well as the value $\varphi$ of the field on this surface.
More precisely, let $x^\mu$ be spacetime coordinates in $M$ and $\phi^A$ coordinates in the target space. Coordinatize the 3d surface $\alpha$ with 3d coordinates $\tau = (\tau^1, \tau^2, \tau^3)$. Then $\alpha$ is given by the functions
$$\alpha = [\sigma, \varphi], \qquad (3.163)$$
$$\sigma: \tau \to x^\mu(\tau), \qquad (3.164)$$
$$\varphi: \tau \to \varphi^A(\tau). \qquad (3.165)$$
The functions $x^\mu(\tau)$ define the 3d surface $\sigma$ without boundaries in spacetime. The functions $\varphi^A(\tau)$ define the value of the field $\phi(x)$ on this surface:
$$\phi^A(x(\tau)) \equiv \varphi^A(\tau). \qquad (3.166)$$
Say $\sigma$ is the boundary of a connected region $R$ of $M$. Then, generically, $\varphi$ determines a solution $\phi(x)$ of the equations of motion in the interior $R$, such that $\phi|_\sigma = \varphi$. Imagine that $\sigma$ is a cylinder in Minkowski space. To determine a solution in the interior, we need the initial value of the field on the bottom of the cylinder, its final value on the top of the cylinder, as well as spatial boundary conditions on the side of the cylinder. The data $\alpha$ determine all these field values, as well as the spacetime location of the cylinder itself. These data form the field theoretical generalization of the set $(t, q^i, t', q'^i)$ which forms the argument of the Hamilton function and of the quantum propagator in finite-dimensional mechanics. Alternatively, the surface $\alpha$ need not be connected. For instance, it can be formed by two components, which we can view as initial and final configurations.
The Hamilton function $S[\alpha] = S[\sigma, \varphi]$ is defined as the action of the solution of the equations of motion $\phi(x)$ such that $\phi|_\sigma = \varphi$ in $R$. We shall see below that $S[\alpha]$ satisfies a functional Hamilton–Jacobi equation, and can be seen as the classical limit of a quantum mechanical propagator (see footnote 10).

We can give a more formal definition of $S[\alpha]$, analogous to the definition (3.83). Let $\gamma$ be the motion in $\mathcal{C}$ bounded by $\alpha$. Let $\tilde\gamma$ be the lift of $\gamma$ to $\Sigma$. That is, let $\tilde\gamma$ be the orbit of $\omega$ that projects down to $\gamma$. Then
$$S[\alpha] = \int_{\tilde\gamma} \theta. \qquad (3.167)$$
Example: scalar field. For a scalar field, for instance,
$$S[\alpha] = \int_{\tilde\gamma} \theta = \int_{\tilde\gamma} \left(\pi\, d^4x + p^\mu\, d\phi \wedge d^3x_\mu\right) = \int_R \left(\pi + p^\mu \partial_\mu \phi\right) d^4x$$
$$= \int_R \left(-\frac{1}{2} p^\mu p_\mu - \frac{1}{2} m^2\phi^2 - V(\phi) + p^\mu \partial_\mu \phi\right) d^4x \qquad (3.168)$$
$$= \int_R \left(\frac{1}{2} \partial_\mu\phi\, \partial^\mu\phi - \frac{1}{2} m^2\phi^2 - V(\phi)\right) d^4x = \int_R L(\phi, \partial_\mu\phi)\, d^4x, \qquad (3.169)$$
where $L$ is the lagrangian density, and we have used the equation of motion $p_\mu = \partial_\mu\phi$.

It is not hard to compute the Hamilton function for a free scalar field in the special case in which $\alpha$ is formed by the two spacelike parallel hypersurfaces $x^\mu(\tau) = (t_1, \tau)$ and $x^\mu(\tau) = (t_2, \tau)$, and by the values $\phi_1(x)$ and $\phi_2(x)$ of the field on these surfaces. The calculation is simplified by the fact that a free field is essentially a collection of oscillators, with modes of wave vector $k$ and frequency $\omega(k) = \sqrt{|k|^2 + m^2}$. Using this fact and (3.34), it is straightforward to compute the field for given boundary values, and its action. This gives
$$S(\phi_1, t_1, \phi_2, t_2) = \int d^3k\; \omega(k)\, \frac{2\phi_1(k)\phi_2(k) - \left(|\phi_1|^2(k) + |\phi_2|^2(k)\right) \cos[\omega(k)(t_1 - t_2)]}{2 \sin[\omega(k)(t_1 - t_2)]}, \qquad (3.170)$$
where $\phi(k)$ are the Fourier components of $\phi(x)$.
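Since (3.170) is a sum over independent oscillators, its structure can be tested on a single mode. The sketch below assumes one real mode with amplitude $q$ and frequency $\omega$ (a simplification introduced here; the text works with Fourier components) and compares the closed-form integrand of (3.170) with the action obtained by integrating the lagrangian $\frac{1}{2}(\dot q^2 - \omega^2 q^2)$ along the classical solution joining the boundary values. The function names and numerical values are illustrative assumptions.

```python
import math


def hamilton_function(q1, t1, q2, t2, omega):
    """One-mode analog of the integrand of (3.170), for real amplitudes q1, q2."""
    dt = t1 - t2
    numerator = 2.0 * q1 * q2 - (q1 ** 2 + q2 ** 2) * math.cos(omega * dt)
    return omega * numerator / (2.0 * math.sin(omega * dt))


def classical_path(q1, t1, q2, t2, omega):
    """Return q(t) and its time derivative for the solution with q(t1)=q1, q(t2)=q2."""
    s = math.sin(omega * (t2 - t1))  # must be nonzero for the solution to be unique

    def q(t):
        return (q1 * math.sin(omega * (t2 - t)) + q2 * math.sin(omega * (t - t1))) / s

    def qdot(t):
        return omega * (-q1 * math.cos(omega * (t2 - t)) + q2 * math.cos(omega * (t - t1))) / s

    return q, qdot


def action_by_quadrature(q1, t1, q2, t2, omega, n=2000):
    """Integrate L = (qdot^2 - omega^2 q^2)/2 along the path with Simpson's rule (n even)."""
    q, qdot = classical_path(q1, t1, q2, t2, omega)
    h = (t2 - t1) / n
    total = 0.0
    for i in range(n + 1):
        t = t1 + i * h
        lagrangian = 0.5 * (qdot(t) ** 2 - omega ** 2 * q(t) ** 2)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * lagrangian
    return total * h / 3.0


args = (0.4, 0.0, -0.7, 1.1, 1.3)  # q1, t1, q2, t2, omega (illustrative values)
print(hamilton_function(*args), action_by_quadrature(*args))
```

The two numbers agree to quadrature accuracy. The closed form becomes singular when $\omega (t_1 - t_2)$ is a multiple of $\pi$, which is the one-mode counterpart of the remark in footnote 10 that $S$ is defined only where the boundary value problem has a solution.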
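The symplectic structure on $\mathcal{G}$. As in the finite-dimensional case, we can define a symplectic structure on $\mathcal{G}$. Let $s$ be the 3d surface in $\Sigma$ that bounds $\tilde\gamma$, the lift to $\Sigma$ of the motion $\gamma$ bounded by $\alpha$. That is, $s = [x^\mu(\tau), \varphi^A(\tau), p^\mu_A(\tau)]$, where the momenta $p^\mu_A(\tau)$ are determined by the solution of the field equations determined by the entire $\alpha$.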
Footnote 10: $S[\alpha]$ is only defined on the regions of $\mathcal{G}$ where this solution exists, and it is multivalued where there is more than one solution.
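Define a two-form on $\mathcal{G}$ as follows:
$$\omega_{\mathcal{G}}[\alpha] = \int_s \omega. \qquad (3.171)$$
The form $\omega_{\mathcal{G}}$ is a two-form: it is the integral of a five-form over a 3d surface. More precisely, let $\delta\alpha$ be a small variation of $\alpha$. This variation can be seen as a vector field $\delta\alpha(\tau)$ defined on $\alpha$. This variation determines a corresponding small variation $\delta s$, which in turn is a vector field $\delta s(\tau)$ over $s$. Then
$$\omega_{\mathcal{G}}[\alpha](\delta_1\alpha, \delta_2\alpha) = \int_{s_\alpha} \omega(\delta_1 s, \delta_2 s). \qquad (3.172)$$
Thus, the five-form $\omega$ on the finite-dimensional space $\Sigma$ defines the two-form $\omega_{\mathcal{G}}$ on the infinite-dimensional space $\mathcal{G}$.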
Consider a small local variation $\delta\alpha$ of $\alpha$. This means varying the surface $\alpha_M$ in Minkowski space, as well as varying the value of the field over it. Assume that this variation satisfies the field equations, that is, the variation of the field is the correct one for the solution of the field equations determined by $\alpha$. We have
$$\omega_{\mathcal{G}}[\alpha](\delta\alpha) = \int_{s_\alpha} \omega(\delta s). \qquad (3.173)$$
But the variation $\delta s$ is by construction along the orbit, namely in the null direction of $\omega$, and therefore the right-hand side of this equation vanishes. It follows that if $\delta\alpha$ is an infinitesimal physical motion, then
$$\omega_{\mathcal{G}}(\delta\alpha) = 0. \qquad (3.174)$$
The pair $(\mathcal{G}, \omega_{\mathcal{G}})$ contains all the relevant information on the system. The null directions of $\omega_{\mathcal{G}}$ determine the variations of the 3-surface $\alpha$ along the physical motions. The space $\mathcal{G}$ divided by these null directions, namely the space of the orbits of these variations, is the physical phase space $\Gamma$, and the $\omega_{\mathcal{G}}$ restricted to this space is the physical symplectic two-form of the system.
Example: scalar field. Let's compute $\omega_{\mathcal{G}}$ in a slightly more explicit form for the example of the scalar field. From the definition (3.171),
$$\omega_{\mathcal{G}}[\alpha] = \int_s \omega = \int_s d\pi \wedge d^4x + dp^\mu \wedge d\phi \wedge d^3x_\mu$$
$$= \int_s \left(p^\nu dp_\nu + m^2\phi\, d\phi + V' d\phi\right) \wedge d^4x + dp^\mu \wedge d\phi \wedge d^3x_\mu$$
$$= \int_{\alpha_M} d^3x_\nu \left[(p^\mu - \partial^\mu\phi)\, dp_\mu \wedge dx^\nu + (m^2\phi + V' + \partial_\mu p^\mu)\, d\phi \wedge dx^\nu + dp^\nu \wedge d\phi\right]$$
$$= \int_{\alpha_M} d^3x_\nu\; dp^\nu \wedge d\phi, \qquad (3.175)$$
where we have used the $x^\mu$ coordinates themselves as integration variables, and therefore the integrand fields are functions of the $x^\mu$. Notice that since the integral is on $s$, the $p^\mu$ in the integrand is the one given by the solution of the field equation determined by the data on $\alpha$. Therefore it satisfies the equations of motion (3.161)–(3.162), which we have used above. Using (3.161) again, we have
$$\omega_{\mathcal{G}}[\alpha] = \int_{\alpha_M} d^3x\; n_\nu\, d(\nabla^\nu\phi) \wedge d\phi. \qquad (3.176)$$
In particular, if we consider variations $\delta\alpha$ that do not move the surface, and such that the change of the field on the surface is $\delta\phi(x)$, we have
$$\omega_{\mathcal{G}}[\alpha](\delta_1\alpha, \delta_2\alpha) = \int_{\alpha_M} d^3x\; n_\nu \left(\delta_1\phi\, \nabla^\nu \delta_2\phi - \delta_2\phi\, \nabla^\nu \delta_1\phi\right). \qquad (3.177)$$
This formula can be directly compared with the expression of the symplectic two-form given on the space of the solutions of the field equations in [108]. The expression is the same, but with a nuance in the interpretation: $\omega_{\mathcal{G}}$ is not defined on the space of the solutions of the field equations; it is defined on the space of the lagrangian data $\mathcal{G}$, and the normal derivative $n_\nu \nabla^\nu\phi$ of these data is determined by the data themselves via the field equations.
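The surface independence of an expression of the form (3.177) can be illustrated numerically. The sketch below makes several simplifying assumptions that are not in the text: one space dimension, a periodic box, a free field, and surfaces of constant $t$, so that $n_\nu \nabla^\nu$ reduces to $\partial_t$. Two linearized solutions (superpositions of plane waves with illustrative amplitudes and phases) are inserted in the integrand, and the result is evaluated on two different constant-time surfaces.

```python
import math

MASS = 1.0
BOX = 2.0 * math.pi   # length of the periodic box (assumed)
N_GRID = 64           # number of grid points used for the spatial integral


def omega(k):
    """Dispersion relation of the free field."""
    return math.sqrt(k * k + MASS ** 2)


def variation(modes, x, t):
    """Return (delta_phi, d/dt delta_phi) for a superposition of plane waves.

    Each mode is (amplitude, integer wave number, phase offset, sign of frequency).
    """
    value = rate = 0.0
    for amp, k, phase, sign in modes:
        w = sign * omega(k)
        theta = k * x - w * t + phase
        value += amp * math.cos(theta)
        rate += amp * w * math.sin(theta)
    return value, rate


def symplectic_form(modes1, modes2, t):
    """Constant-t version of (3.177): integral of d1 * dt(d2) - d2 * dt(d1) over the box."""
    dx = BOX / N_GRID
    total = 0.0
    for i in range(N_GRID):
        v1, r1 = variation(modes1, i * dx, t)
        v2, r2 = variation(modes2, i * dx, t)
        total += (v1 * r2 - v2 * r1) * dx
    return total


VAR1 = [(1.0, 1, 0.0, +1), (0.5, 2, 0.3, -1)]
VAR2 = [(1.0, 1, 1.2, +1), (0.3, 2, -0.4, -1)]
print(symplectic_form(VAR1, VAR2, 0.0), symplectic_form(VAR1, VAR2, 1.7))
```

The two printed values coincide because both variations solve the linearized field equations, so the time derivative of the integrand is a total spatial derivative that integrates to zero on the periodic box.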
3.3.4 Hamilton–Jacobi
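A Hamilton–Jacobi equation for the field theory can be written as a local equation on the boundary, satisfied by the Hamilton function. I illustrate here the derivation of the Hamilton–Jacobi equation in the case of the scalar field, leaving the generalization to the interested reader. From the definition
$$S[\alpha] = \int_{\tilde\gamma} \theta = \int_{\tilde\gamma} \left(\pi\, d^4x + p^\mu\, d\phi \wedge d^3x_\mu\right), \qquad (3.178)$$
we can write
$$\frac{\delta S[\alpha]}{\delta x^\mu(\tau)} = \pi(\tau)\, n_\mu(\tau) + \epsilon_{\mu\nu\rho\sigma}\, p^\nu(\tau)\, \partial_i\phi(\tau)\, \partial_j x^\rho(\tau)\, \partial_k x^\sigma(\tau)\, \epsilon^{ijk}, \qquad (3.179)$$
where
$$n_\mu(\tau) = \frac{1}{3!}\, \epsilon_{\mu\nu\rho\sigma}\, \partial_1 x^\nu(\tau)\, \partial_2 x^\rho(\tau)\, \partial_3 x^\sigma(\tau) \qquad (3.180)$$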
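is the normal to the 3-surface $\sigma$. The momentum $\pi$ depends on the full $\alpha$. Contracting this equation with $n^\mu$, we obtain
$$\pi(\tau) = n^\mu(\tau)\, \frac{\delta S[\alpha]}{\delta x^\mu(\tau)} + p^i(\tau)\, \partial_i\phi(\tau). \qquad (3.181)$$
Using the equation of motion $p_\mu = \partial_\mu\phi$, this becomes
$$\pi(\tau) = n^\mu(\tau)\, \frac{\delta S[\alpha]}{\delta x^\mu(\tau)} + \partial^i\phi(\tau)\, \partial_i\phi(\tau). \qquad (3.182)$$
Also,
$$\frac{\delta S[\alpha]}{\delta\varphi(\tau)} = p^\mu(\tau)\, n_\mu(\tau). \qquad (3.183)$$
The derivation of these two equations requires steps analogous to the ones we used to derive (3.86).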
Now, from (3.155) and (3.156) we have that the scalar field dynamics is governed by the equation
$$\pi + \frac{1}{2}\left(p^\mu p_\mu + m^2\phi^2 + 2V(\phi)\right) = 0. \qquad (3.184)$$
We split $p^\mu$ into its normal ($p = p^\mu n_\mu$) and tangential ($p^i$) components (so that $p^\mu = p^i\, \partial_i x^\mu + p\, n^\mu$), obtaining
$$\pi + \frac{1}{2}\left(p^2 - p^i p_i + m^2\phi^2 + 2V(\phi)\right) = 0. \qquad (3.185)$$
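Inserting (3.182) and (3.183), we obtain
$$\frac{\delta S[\alpha]}{\delta x^\mu(\tau)}\, n^\mu(\tau) + \frac{1}{2}\left[\left(\frac{\delta S[\alpha]}{\delta\varphi(\tau)}\right)^2 + \partial_j\varphi(\tau)\, \partial^j\varphi(\tau) + m^2\varphi^2(\tau) + 2V(\varphi(\tau))\right] = 0. \qquad (3.186)$$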
This is the Hamilton–Jacobi equation. Notice that the function
$$S[x^\mu(\tau), \varphi(\tau)] = S[\sigma, \varphi] = S[\alpha] \qquad (3.187)$$
is a function of the surface, not of the way the surface is parametrized. Therefore it is invariant under a change of parametrization. It follows that
$$\frac{\delta S[\alpha]}{\delta x^\mu(\tau)}\, \partial_j x^\mu(\tau) + \frac{\delta S[\alpha]}{\delta\varphi(\tau)}\, \partial_j\varphi(\tau) = 0. \qquad (3.188)$$
(This equation can be obtained also from the tangential component of (3.179).) The two equations (3.186) and (3.188) govern the Hamilton–Jacobi function $S[\alpha]$.
The connection with the nonrelativistic field theoretical Hamilton–Jacobi formalism is the following. We can restrict the formalism to a preferred choice of parameters $\tau$. Choosing $\tau^j = x^j$, we obtain $S$ in the form $S[t(x), \phi(x)]$, and the Hamilton–Jacobi equation (3.186) becomes
$$\frac{\delta S}{\delta t(x)} + \frac{1}{2}\left[\left(\frac{\delta S[\alpha]}{\delta\phi(x)}\right)^2 + \partial_j\phi\, \partial^j\phi + m^2\phi^2 + 2V(\phi)\right] = 0. \qquad (3.189)$$
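Further restricting the surfaces to the ones of constant $t$ gives the functional $S[t, \phi(x)]$, satisfying the Hamilton–Jacobi equation
$$\frac{\partial S}{\partial t} + \frac{1}{2} \int d^3x \left[\left(\frac{\delta S}{\delta\phi(x)}\right)^2 + |\nabla\phi|^2 + m^2\phi^2 + 2V(\phi)\right] = 0, \qquad (3.190)$$
which is the usual nonrelativistic Hamilton–Jacobi equation
$$\frac{\partial S}{\partial t} + H\left(\phi, \nabla\phi, \frac{\delta S[\alpha]}{\delta\phi(x)}\right) = 0, \qquad (3.191)$$
where $H(\phi, \nabla\phi, \partial_t\phi)$ is the nonrelativistic hamiltonian.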
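For a free field, (3.190) decouples into one ordinary Hamilton–Jacobi equation per mode, which makes a direct numerical check possible. The sketch below assumes a single real mode of frequency $\omega$ with Hamilton function given by the one-mode form of (3.170); this reduction, the function names and the sample values are assumptions made for illustration. It evaluates $\partial S/\partial t_2 + \frac{1}{2}\left[(\partial S/\partial q_2)^2 + \omega^2 q_2^2\right]$ by finite differences.

```python
import math


def hamilton_function(q1, t1, q2, t2, omega):
    """One-mode analog of the integrand of (3.170), for real amplitudes q1, q2."""
    dt = t1 - t2
    numerator = 2.0 * q1 * q2 - (q1 ** 2 + q2 ** 2) * math.cos(omega * dt)
    return omega * numerator / (2.0 * math.sin(omega * dt))


def hj_residual(q1, t1, q2, t2, omega, h=1e-5):
    """Left-hand side of the one-mode Hamilton-Jacobi equation at the final boundary."""
    ds_dt2 = (hamilton_function(q1, t1, q2, t2 + h, omega)
              - hamilton_function(q1, t1, q2, t2 - h, omega)) / (2.0 * h)
    ds_dq2 = (hamilton_function(q1, t1, q2 + h, t2, omega)
              - hamilton_function(q1, t1, q2 - h, t2, omega)) / (2.0 * h)
    return ds_dt2 + 0.5 * (ds_dq2 ** 2 + omega ** 2 * q2 ** 2)


print("Hamilton-Jacobi residual:", hj_residual(0.4, 0.0, -0.7, 1.1, 1.3))
```

The residual vanishes up to finite-difference error for any boundary values for which $\sin[\omega(t_1 - t_2)] \neq 0$.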
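Canonical formulation on $\mathcal{G}$. We can write a hamiltonian density function $H(\tau)$ directly for the infinite-dimensional space $\mathcal{G}$. $H(\tau)$ is a function on the cotangent space $T^*\mathcal{G}$. We coordinatize this cotangent space with the functions $(x^\mu(\tau), \varphi(\tau))$ and their momenta $(\pi_\mu(\tau), p(\tau))$. The hamiltonian is then
$$H[x^\mu, \varphi, \pi_\mu, p](\tau) = \pi_\mu(\tau)\, n^\mu(\tau) + \frac{1}{2}\left[p^2(\tau) + \partial_j\varphi(\tau)\, \partial^j\varphi(\tau) + m^2\varphi^2(\tau) + 2V(\varphi(\tau))\right], \qquad (3.192)$$
and the Hamilton–Jacobi equation (3.186) reads
$$H\left[x^\mu, \varphi, \frac{\delta S[\alpha]}{\delta x^\mu}, \frac{\delta S[\alpha]}{\delta\varphi}\right](\tau) = 0. \qquad (3.193)$$
If we restrict the surface $x^\mu(\tau)$ to the case $x^\mu(\tau) = (t, \tau)$, then $H(\tau)$ becomes
$$H[x^\mu, \varphi, \pi_\mu, p](x) = \pi_0(x) + H_0(x), \qquad (3.194)$$
where $H_0(x)$ is the conventional nonrelativistic hamiltonian density
$$H_0[\varphi, p] = \frac{1}{2}\left[p^2 + \partial_j\varphi\, \partial^j\varphi + m^2\varphi^2 + 2V(\varphi)\right]. \qquad (3.195)$$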
Physical predictions from $S$. The complete physical predictions of the theory can be obtained directly from the Hamilton function $S[\alpha] = S[\sigma, \varphi]$ as follows. Let $p(\tau)$ be a function on the surface $\sigma$. Define
$$F[\sigma, \varphi, p](\tau) = \frac{\delta S[\sigma, \varphi]}{\delta\varphi(\tau)} - p(\tau). \qquad (3.196)$$
Given a closed surface $\sigma$ in spacetime, we can observe field boundary values $\phi(x(\tau)) = \varphi(\tau)$ together with momenta $n^\mu \partial_\mu\phi(x(\tau)) = p(\tau)$ if and only if
$$F[\sigma, \varphi, p](\tau) = 0. \qquad (3.197)$$
This equation is equivalent to the equations of motion, and expresses directly the physical content of the theory as a restriction on the partial observables that can be observed on a boundary surface.

As in the case of finite-dimensional systems, the general solution of the equations of motion can be obtained by derivations. For instance, let $\alpha$ be formed by two connected components, which we denote $\alpha = [\sigma, \varphi]$ and $\alpha_0 = [\sigma_0, \varphi_0]$, parametrized by $\tau$ and $\tau_0$ respectively. Consider the equation for $\alpha$
$$f[\alpha](\tau) = \frac{\delta S[\alpha \cup \alpha_0]}{\delta\varphi_0(\tau_0)} - p_0(\tau_0) = 0, \qquad (3.198)$$
where $p_0(\tau)$ is an arbitrary initial value momentum. This is the evolution equation that determines all surfaces $\alpha$ compatible with the initial data $\varphi_0, p_0$ on $\sigma_0$.
3.4 Thermal time hypothesis
Earth lay with Sky, and after them was born Time,
The wily, youngest and most terrible of her children.
Hesiod, Theogony [111]
In the macroscopic world, the physical variable $t$ measured by a clock has peculiar properties. It is not easy to pinpoint these properties with precision without referring to a presupposed notion of time, but it is also difficult to deny that they exist. From the point of view developed in this book, at the fundamental level the variable $t$ measured by a clock is on the same footing as any other partial observable. If we accept this idea, we have then to reconcile the fact that time is not a special variable at the fundamental level with its peculiar properties at the macroscopic level. What is so special about time? An interesting possibility is that it is statistical mechanics, and therefore thermodynamics, that singles out $t$ and gives it its special properties. I briefly illustrate this idea in this section.

The world around us is made up of systems with a large number of degrees of freedom, such as fields. We never measure the totality of these degrees of freedom. Rather, we measure certain macroscopic parameters, and make predictions on the basis of assumptions on the state of the other degrees of freedom. The viability of our choice of macroscopic parameters and our assumptions about the state of the others is justified a posteriori if the system of prediction works. We represent our incomplete knowledge and assumptions in terms of a statistical state $\rho$. The state $\rho$ can be represented as a normalized positive function on the phase space $\Gamma$:
$$\rho: \Gamma \to \mathbb{R}^+, \qquad (3.199)$$
$$\int_\Gamma ds\; \rho(s) = 1. \qquad (3.200)$$
$\rho(s)$ represents the assumed probability density of the state $s$ in $\Gamma$. Then the expectation value of any observable $A: \Gamma \to \mathbb{R}$ in the state $\rho$ is
$$\rho[A] = \int_\Gamma ds\; A(s)\, \rho(s). \qquad (3.201)$$
The fundamental postulate of statistical mechanics is that a system left free to thermalize reaches a time-independent equilibrium state that can be represented by means of the Gibbs statistical state
$$\rho_0(s) = N e^{-\beta H_0(s)}, \qquad (3.202)$$
where $\beta = 1/T$ is a constant (the inverse temperature) and $H_0$ is the nonrelativistic hamiltonian. Classical thermodynamics follows from this postulate. Time evolution $A_t = \alpha_t(A)$ of $A$ is determined by (3.78). Equivalently, $A_t(s) = A(s(t))$, where $s(t)$ is the hamiltonian flow of $H_0$ on $\Gamma$. The correlation probability between $A_t$ and $B$ is given by
$$W_{AB}(t) = \rho_0[\alpha_t(A) B] = \int_\Sigma ds\; A(s(t))\, B(s)\, e^{-\beta H_0(s)}. \qquad (3.203)$$
In this chapter we have seen that the formulas of mechanics do not single out a preferred variable, because all mechanical predictions can be obtained using the relativistic hamiltonian $H$, which treats all variables on an equal footing, instead of using the nonrelativistic hamiltonian $H_0$, which singles $t$ out. Is this true also for statistical mechanics and thermodynamics? Equations (3.200)–(3.201) are meaningful also in the relativistic context, where $\Gamma$ is the space of the solutions of the equations of motion. But this is not true for (3.202) and (3.203). These depend on the nonrelativistic hamiltonian. They depend on the fact that $t$ is a variable different from the others. Equations (3.202) and (3.203) definitely single out $t$ as a special variable. This observation indicates that the peculiar properties of the $t$ variable have to do with statistical mechanics and thermodynamics, rather than with mechanics. With purely mechanical measurements we cannot recognize the time variable. With statistical or thermal measurements, we can.
Indeed, notice that if we try to pinpoint what is special about the variable $t$, we generally find features connected to thermodynamics: irreversibility, convergence to equilibrium, memory, feeling of "flow", and so on.
Indeed, there is an intriguing fact about (3.202) and (3.203). Imagine that we study a system which is in equilibrium at inverse temperature $\beta$, and we do not know its nonrelativistic hamiltonian $H_0$. In principle, we can figure out $H_0$ simply by repeated microscopic measurements on copies of the system, without any need of observing time evolution. Indeed, if we find out the distribution of microstates $\rho_0$, then, up to an irrelevant additive constant, we have
$$H_0 = -\frac{1}{\beta} \ln\rho_0. \qquad (3.204)$$
Therefore, in a statistical context we have in principle an operational procedure for determining which one is the time variable. First, measure $\rho_0$; second, compute $H_0$ from (3.204); third, compute the hamiltonian flow $s(t)$ of $H_0$ on $\Sigma$. The time variable $t$ is the parameter of this flow. A "clock" is any measuring apparatus whose reading grows linearly with this flow. The multiplicative constant in front of $H_0$ just sets the unit in which time is measured. Up to this unit, we can find out which one is the time variable just by measuring $\rho_0$. This is in strident contrast with the purely mechanical context, where no operational procedure for singling out the time variable is available.
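The three-step procedure can be sketched on the simplest example. The code below assumes a single harmonic oscillator in a Gibbs state with illustrative values of $\beta$ and $\omega$; the names `rho`, `h_rho`, `thermal_flow` and `h0_flow` are hypothetical helpers introduced here. It builds the hamiltonian $-\ln\rho$ from the density alone, integrates its hamiltonian flow with a fourth-order Runge–Kutta step, and compares the result with the ordinary flow of $H_0$ evaluated at the rescaled time $\beta$ times the flow parameter.

```python
import math

BETA = 2.5    # inverse temperature (illustrative value)
OMEGA = 1.3   # oscillator frequency (illustrative value)


def h0(q, p):
    """Nonrelativistic hamiltonian of the oscillator."""
    return 0.5 * (p * p + OMEGA ** 2 * q * q)


def rho(q, p):
    """Unnormalized Gibbs density; normalization only shifts -ln(rho) by a constant."""
    return math.exp(-BETA * h0(q, p))


def h_rho(q, p):
    """Hamiltonian reconstructed from the measured density alone: -ln(rho)."""
    return -math.log(rho(q, p))


def thermal_vector_field(q, p, h=1e-6):
    """Hamilton equations of h_rho, with derivatives taken by central differences."""
    dh_dq = (h_rho(q + h, p) - h_rho(q - h, p)) / (2.0 * h)
    dh_dp = (h_rho(q, p + h) - h_rho(q, p - h)) / (2.0 * h)
    return dh_dp, -dh_dq


def thermal_flow(q, p, t_rho, steps=2000):
    """Integrate the flow of h_rho for a parameter interval t_rho with RK4."""
    dt = t_rho / steps
    for _ in range(steps):
        k1q, k1p = thermal_vector_field(q, p)
        k2q, k2p = thermal_vector_field(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = thermal_vector_field(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = thermal_vector_field(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p


def h0_flow(q, p, t):
    """Exact flow of h0 for a time t."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return q * c + p * s / OMEGA, p * c - q * OMEGA * s


print(thermal_flow(0.6, -0.2, 0.8), h0_flow(0.6, -0.2, BETA * 0.8))
```

The two flows coincide once the parameter is rescaled by $\beta$, which is the sense in which the multiplicative constant in front of $H_0$ only sets the unit of time.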
Now, let me come to the main observation. Imagine that we have a truly relativistic system where no partial observable is singled out as the time variable. Imagine that we make measurements on many copies of the system, and find that the statistical state describing the system is given by a certain arbitrary state $\rho$ (see footnote 11). Define the quantity
$$H_\rho = -\ln\rho. \qquad (3.205)$$
Let $s(t_\rho)$ be the hamiltonian flow of $H_\rho$. Call $t_\rho$ "thermal time". Call "thermal clock" any measuring device whose reading grows linearly with this flow. Given an observable $A$, consider the one-parameter family of observables $A_{t_\rho}$ defined by $A_{t_\rho}(s) = A(s(t_\rho))$. Then it follows that the correlation probability between the observables $A_{t_\rho}$ and $B$ is given by
$$W_{AB}(t_\rho) = \int_\Sigma ds\; A(s(t_\rho))\, B(s)\, e^{-H_\rho(s)}. \qquad (3.206)$$
What is the difference between the physics described by (3.202)–(3.203) and that described by (3.205)–(3.206)? None. That is, whatever the statistical state $\rho$, there exists always a variable $t_\rho$, measured by the thermal clock, with respect to which the system is in equilibrium and whose physics is the same as in the conventional nonrelativistic statistical case. This key observation naturally leads us to the following hypothesis.

Footnote 11: For (3.205) to make sense, assume that $\rho$ nowhere vanishes on $\Sigma$.
The thermal time hypothesis. In Nature, there is no preferred physical time variable $t$. There are no equilibrium states $\rho_0$ preferred a priori. Rather, all variables are equivalent; we can find the system in an arbitrary state $\rho$; if the system is in a state $\rho$, then a preferred variable is singled out by the state of the system. This variable is what we call time.
In other words, it is the statistical state that determines which variable is physical time, and not any a priori hypothetical "flow" that drives the system to a preferred statistical state. All variables are physically equivalent at the mechanical level. But if we restrict our observations to macroscopic parameters, and assume the other dynamical variables are distributed according to a statistical state $\rho$, then a preferred variable is singled out by this procedure. This variable has the property that correlations with respect to it are described precisely by ordinary statistical mechanics. In other words, it has precisely the properties that characterize our macroscopic time parameter.
In other words, when we say that a certain variable is "the time", we are not making a statement concerning the fundamental mechanical structure of reality (see footnote 12). Rather, we are making a statement about the statistical distribution we use to describe the macroscopic properties of the system that we describe macroscopically.
The hamiltonian $H_\rho$ determined by a state $\rho$ is called the thermal hamiltonian. The "thermal time hypothesis" is the idea that what we call "time" is simply the thermal time of the statistical state in which the world happens to be, when described in terms of the macroscopic parameters we have chosen.
Let the system be in the mechanical microstate $s$. Describe it with macroscopic observables $A_i$. In general (but not always), there exists a statistical state $\rho$ whose mean values give the correct predictions for the $A_i$, that is, $A_i(s) \sim \rho[A_i]$. Assuming it exists, $\rho$ codes, in a sense, our ignorance of the microscopic details of the state. Intuitively, we can therefore say that the existence of time is the result of this ignorance of ours. Time is the expression of our ignorance of the microstate.
The thermal time hypothesis works surprisingly well in a number of cases. For example, if we start from a radiation-filled covariant cosmological model having no preferred time variable, and write a statistical state representing the cosmological background radiation, then the thermal time of this state turns out to be precisely the Friedmann time [112]. Furthermore, we will see in Section 5.5.1 that this hypothesis extends in an extremely natural way to the quantum context, and even more naturally to the quantum field theoretical context, where it leads also to a general abstract state-independent notion of time flow.

Footnote 12: Time (Κρόνος) comes after matter (Earth, Γαῖα, and Sky, Οὐρανός) also in Greek mythology. See Hesiod's quote [111] at the beginning of this section.
Bibliographical notes
The hamiltonian theory of systems with constraints is one of Dirac's many masterpieces. The theory is not just a technical complication of standard hamiltonian mechanics: it is a powerful generalization of mechanics, which remains valid in the general-relativistic context. The title of Dirac's initial work on the subject was "Generalized Hamiltonian dynamics" [113]. The theory is synthesized in [114]. For modern accounts and developments, see [115]. For the notion of partial observable, I have followed [116]. On the general structure of mechanics, I have followed [117, 118]. For a nontrivial example of relational evolution treated in detail, see [119].

The canonical treatment of field theory on finite-dimensional spaces derives from the Weyl and DeDonder calculus of variations [110, 120]. A beautiful, comprehensive and mathematically precise discussion of covariant hamiltonian field theory is in [109], which contains complete references to the literature on the subject. See also [121].

The idea of the thermal origin of time was introduced in [112, 122] in the context of classical field theory, and was independently suggested by Alain Connes. It is developed in quantum field theory (see Section 5.5.1) in [125]; see also [124]. For a related, Boltzmann-like approach, see [123].
4 Hamiltonian general relativity
I begin this chapter by presenting the Hamilton–Jacobi formulation of GR. This is the basis of the quantum theory.

In the remainder of the chapter, I present formulations of hamiltonian GR on a finite-dimensional configuration space, along the lines illustrated at the end of the previous chapter.

This order of presentation is inverse to the logical order, which should start from the finite-dimensional configuration space of the partial observables. But I do not want to force the hurried reader to navigate through the entire chapter before finding the few simple equations that are the basis of the quantum theory.

I take the cosmological constant to be zero and ignore matter fields, leaving to the reader the generally easy exercise of adding the cosmological and matter terms to the relevant equations.
4.1 Einstein–Hamilton–Jacobi
GR can be expressed in terms of a complex field $A^i_a(\tau)$ and a 3d real momentum field $E^a_i(\tau)$, defined on a three-dimensional space $\sigma$ without boundaries, satisfying the reality conditions
$$A^i_a + \overline{A^i_a} = \Gamma^i_a[E], \qquad (4.1)$$
where $\Gamma$ is defined below in (4.23)–(4.24). The theory is defined by the hamiltonian system
$$D_a E^a_i = 0, \qquad (4.2)$$
$$E^a_i F^i_{ab} = 0, \qquad (4.3)$$
$$F^{ij}_{ab} E^a_i E^b_j = 0, \qquad (4.4)$$
where $F^{ij}_{ab} = \epsilon^{ij}{}_k F^k_{ab}$ (see pg. xxii), and $D_a$ and $F^i_{ab}$ are the covariant derivative and the curvature of $A^i_a$, defined by
$$D_a v_i = \partial_a v_i + \epsilon_{ijk} A^j_a v^k, \qquad (4.5)$$
$$F^i_{ab} = \partial_a A^i_b - \partial_b A^i_a + \epsilon^i{}_{jk} A^j_a A^k_b. \qquad (4.6)$$
I sketch the derivation of these equations from the lagrangian formalism below. An indirect derivation, via a finite-dimensional canonical formulation, is given at the end of this chapter.
The Hamilton–Jacobi system is given in terms of the functional $S[A]$ by writing
$$E^a_i(\tau) = \frac{\delta S[A]}{\delta A^i_a(\tau)} \qquad (4.7)$$
in the hamiltonian system. The first two equations that we obtain,
$$D_a \frac{\delta S[A]}{\delta A^i_a(\tau)} = 0, \qquad F^i_{ab}(\tau)\, \frac{\delta S[A]}{\delta A^i_a(\tau)} = 0, \qquad (4.8)$$
require that $S[A]$ is invariant under 3d diffeomorphisms (diffs) and local $SO(3)$ transformations, as I will show in a moment. The last reads
$$F^{ij}_{ab}(\tau)\, \frac{\delta S[A]}{\delta A^i_a(\tau)}\, \frac{\delta S[A]}{\delta A^j_b(\tau)} = 0. \qquad (4.9)$$
This is the Hamilton–Jacobi equation of GR. It defines the dynamics of GR.
Smeared form. Equivalently, we can integrate equations (4.2)–(4.4) against suitable “test” functions and demand the integral to vanish for any such function. For the first two we get

$$G[\lambda] = -\int d^3\tau\; \lambda^i\, D_a E^a_i = \int d^3\tau\; D_a\lambda^i\, E^a_i = 0, \tag{4.10}$$
$$C[f] = -\int d^3\tau\; f^a F^i_{ab}\, E^b_i = 0. \tag{4.11}$$

The quantities $D_a\lambda^i$ and $f^a F^i_{ab}$ that appear in these equations are the infinitesimal transformations of the connection under an internal gauge transformation with generator $\lambda^i(\tau)$, and under (the combination of an internal gauge transformation and) an infinitesimal diffeomorphism generated by the vector field $f^a(\tau)$:

$$\delta_\lambda A^i_a = D_a\lambda^i, \qquad \delta_f A^i_a = f^b F^i_{ab}. \tag{4.12}$$
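Therefore the smeared form of (4.8) reads

$$\int d^3\tau\; \delta_\lambda A^i_a(\tau)\, \frac{\delta S[A]}{\delta A^i_a(\tau)} = 0, \qquad \int d^3\tau\; \delta_f A^i_a(\tau)\, \frac{\delta S[A]}{\delta A^i_a(\tau)} = 0, \tag{4.13}$$

which is the requirement that $S[A]$ is invariant under gauge transformations and diffeomorphisms.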
The quantity (4.4), on the other hand, is a density of weight two. To be able to integrate it against a scalar quantity and get a well-defined result, we need a density of weight one. This can be obtained by dividing the hamiltonian by the square root of the determinant of $E$, exploiting the freedom in the definition of the hamiltonian. The Poisson bracket (derived below in (4.25)) between the volume

$$V = \int d^3x\, \sqrt{|\det E(x)|} \tag{4.14}$$

and the connection is

$$\{V, A^i_a(x)\} = (8\pi i G)\, \frac{E^b_j(x)\, E^c_k(x)\, \epsilon_{abc}\, \epsilon^{ijk}}{4\sqrt{|\det E(x)|}}. \tag{4.15}$$

Using this, we can write (4.4) in the form

$$H[N] = \int N\, \mathrm{tr}(F \wedge \{V, A\}) = 0. \tag{4.16}$$

This form of the hamiltonian will prove convenient in the quantum theory. Equations (4.10), (4.11) and (4.16) define GR.
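The structure of (4.15), and the way (4.16) reproduces (4.4) divided by $\sqrt{\det E}$, can be checked with a short computation. The following is a sketch under stated assumptions, not a derivation taken from the text: it assumes $\det E > 0$, leaves aside the overall sign (which depends on the ordering convention of the bracket) and the normalization of the trace, and uses only (4.14), (4.25) and the cofactor formula for a $3\times 3$ determinant.

```latex
\begin{align*}
\frac{\partial \det E}{\partial E^a_i}
  &= \tfrac{1}{2}\,\epsilon_{abc}\,\epsilon^{ijk}\,E^b_j E^c_k ,\\[4pt]
\frac{\delta V}{\delta E^a_i(x)}
  &= \frac{1}{2\sqrt{\det E}}\,\frac{\partial \det E}{\partial E^a_i}
   = \frac{\epsilon_{abc}\,\epsilon^{ijk}\,E^b_j E^c_k}{4\sqrt{\det E}} ,\\[4pt]
\epsilon^{abc}\,F_{i\,ab}\,
\frac{\epsilon_{cde}\,\epsilon^{ijk}\,E^d_j E^e_k}{4\sqrt{\det E}}
  &= \frac{\left(\delta^a_d\,\delta^b_e-\delta^a_e\,\delta^b_d\right)
           \epsilon^{ijk}\,F_{i\,ab}\,E^d_j E^e_k}{4\sqrt{\det E}}
   = \frac{F^{jk}_{ab}\,E^a_j E^b_k}{2\sqrt{\det E}} .
\end{align*}
```

The second line is, up to the factor $8\pi i G$ coming from (4.25), the right-hand side of (4.15). The third line shows that contracting the curvature with this bracket gives the left-hand side of (4.4) divided by $\sqrt{\det E}$, that is, a density of weight one.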
4.1.1 3d fields: “The length of the electric field is the area”
What is the relation between the 4d fields used in Chapter 2 and the 3d fields used above? Consider a solution $(e^I_\mu(x), A^i_\mu(x))$ of the Einstein equations (2.21). Choose a 3d surface $\sigma: \tau = (\tau^a) \mapsto x^\mu(\tau)$ without boundaries in the coordinate space. The four-dimensional forms $A^i$ (the selfdual connection, defined in (2.19)), $\Sigma^i$ (the 4d Plebanski two-form, defined in (2.23)) and $e^I$ (the gravitational field, introduced in (2.1)) induce the three-dimensional forms

$$A^i(\tau) = A^i_a(\tau)\, d\tau^a, \tag{4.17}$$
$$\Sigma^i(\tau) = \Sigma^i_{ab}(\tau)\, d\tau^a \wedge d\tau^b, \tag{4.18}$$
$$e^I(\tau) = e^I_a(\tau)\, d\tau^a \tag{4.19}$$

on $\sigma$. The 3d field $E$ is defined as the vector density associated to $\Sigma^i$, that is,

$$E^{ai}(\tau) = \epsilon^{abc}\, \Sigma^i_{bc}(\tau). \tag{4.20}$$
Let's write $e^I(\tau) = (e^0(\tau), e^i(\tau))$. Choose a gauge in which

$$e^0(\tau) = 0. \tag{4.21}$$

(The extension of the formalism to a more general gauge deserves to be investigated. See [127].) It is easy to see that in this gauge $E^a_i(\tau)$ is real, and

$$E^a_i(\tau) = \frac{1}{2}\, \epsilon_{ijk}\, \epsilon^{abc}\, e^j_b(\tau)\, e^k_c(\tau). \tag{4.22}$$

The connection $\Gamma^i[E](\tau) = \epsilon^i{}_{jk}\, \Gamma^{jk}[E](\tau)$ used in (4.1) is defined by

$$de^i + \Gamma^i{}_j[E] \wedge e^j = 0 \tag{4.23}$$

(this is the first Cartan structure equation for $\sigma$), which is solved by

$$\Gamma^j{}_{ak} = \frac{1}{2}\, e^b_k \left(\partial_a e^j_b - \partial_b e^j_a + e^{cj} e_{al}\, \partial_b e^l_c\right). \tag{4.24}$$

That is, it is the spin connection of the triad $e^i_a$. It is also easy to verify that in this gauge the two quantities $A^i_a(\tau)$ and $E^a_i(\tau)$ defined by (4.17) and (4.20) satisfy the “reality condition” (4.1).

The quantity $E^a_i(\tau)$ is ($8\pi i G$ times) the momentum conjugate to $A^i_a(\tau)$. Hence we can write immediately the Poisson brackets

$$\{A^i_a(\tau), E^b_j(\tau')\} = (8\pi i G)\, \delta^b_a\, \delta^i_j\, \delta^3(\tau, \tau'). \tag{4.25}$$

In Maxwell and Yang–Mills theories, the momentum conjugate to the three-dimensional connection $A$ is called the electric field. The field $E$ is therefore called the gravitational electric field. In the gauge (4.21) we are considering, $E$ is determined just by $e^i_a(\tau)$, the triad field of $\sigma$. Equation (4.22) shows that $E$ is the inverse matrix of the triad $e^i_a(\tau)$ multiplied by its determinant:

$$E^a_i = (\det e)\, e^a_i. \tag{4.26}$$
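The equivalence between (4.22) and (4.26) is the cofactor formula for the inverse of a $3\times 3$ matrix, and it can be verified numerically. The sketch below is illustrative only: the triad values are arbitrary, the triad $e^i_a$ is stored as `e[i][a]`, the field $E^a_i$ as `E[a][i]`, and the function names are hypothetical.

```python
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def det3(m):
    """Determinant of a 3x3 nested list, written with the Levi-Civita symbol."""
    R = range(3)
    return sum(eps(i, j, k) * m[0][i] * m[1][j] * m[2][k]
               for i, j, k in product(R, R, R))

def electric_field(e):
    """E[a][i] = (1/2) eps_ijk eps^abc e^j_b e^k_c, as in (4.22)."""
    R = range(3)
    return [[sum(eps(i, j, k) * eps(a, b, c) * e[j][b] * e[k][c]
                 for j, k, b, c in product(R, R, R, R)) / 2
             for i in R] for a in R]

# An arbitrary invertible triad e^i_a (rows: internal index i, columns: index a).
e = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 1.0]]
E = electric_field(e)
det_e = det3(e)

# If E^a_i = det(e) * (inverse triad), then E^a_i e^j_a = det(e) * delta^j_i.
contraction = [[sum(E[a][i] * e[j][a] for a in range(3)) for j in range(3)]
               for i in range(3)]
```

The matrix `contraction` comes out proportional to the identity, with proportionality factor `det_e`, which is the content of (4.26).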
I sketch here the derivation of the basic equations of the hamiltonian formalism, namely the Poisson brackets (4.25) and the constraint system (4.2)–(4.4). For a detailed discussion of this derivation see, for instance, [2, 9, 20, 126]. An indirect derivation via a finite-dimensional canonical formulation is given at the end of this chapter. We can start, for instance, from the action (2.27) without the cosmological constant, and write it as
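follows:

$$\begin{aligned}
S[\Sigma, A] &= \frac{-i}{16\pi G} \int \Sigma_i \wedge F^i = \frac{-i}{16\pi G} \int \Sigma_{i\mu\nu}\, F^i_{\rho\sigma}\, \epsilon^{\mu\nu\rho\sigma}\, d^4x \\
&= \frac{-i}{8\pi G} \int \left(\Sigma_{iab}\, F^i_{c0} + \Sigma_{i0a}\, F^i_{bc}\right) \epsilon^{abc}\, d^4x \\
&= \frac{-i}{8\pi G} \int \left(E^c_i \left(\partial_0 A^i_c - \partial_c A^i_0 + \epsilon^i{}_{jk} A^j_0 A^k_c\right) + P_{iIJ}\, e^I_a e^J_0\, F^i_{bc}\, \epsilon^{abc}\right) d^4x \\
&= \frac{-i}{8\pi G} \int \left(E^c_i \dot{A}^i_c + A^i_0\, D_c E^c_i + \frac{1}{2}\left(\epsilon_{ijk}\, e^j_a e^k_0 + i\, e^0_0\, e_{ia}\right) F^i_{bc}\, \epsilon^{abc}\right) d^4x \\
&= \frac{-i}{8\pi G} \int \left(E^c_i \dot{A}^i_c + \lambda^i_0 \left(D_c E^c_i\right) + \lambda^b \left(E^a_i F^i_{ab}\right) + \lambda \left(E^a_j E^b_k F^{jk}_{ab}\right)\right) d^4x.
\end{aligned} \tag{4.27}$$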
The dot over $A$ indicates the time derivative. I have used the gauge condition $e^0_i = 0$, and the Lagrange multipliers are multiples of the nondynamical variables $A^i_0$, $e^0_0$, $e^i_0$. The first term shows that $E^c_i/(8 i \pi G)$ is the momentum conjugate to $A^i_c$; varying with respect to the Lagrange multipliers yields the constraint system (4.2)–(4.4).
The geometry of the three surface. In Section 2.1.4 we saw that the gravitational field has a metric interpretation. The metric structure inherited by $\sigma$ depends on the gravitational electric field $E$. In particular, consider a two-dimensional surface $S: \vec\sigma = (\sigma^1, \sigma^2) \mapsto \tau(\vec\sigma)$ embedded in the three-dimensional surface $\sigma$. What is the area of $S$? From the definition of the area, equation (2.70), we have in a few steps

$$A(S) = \int_S d^2\sigma\; |E|. \tag{4.28}$$

Here the norm is defined by $|v| = \sqrt{\delta^{ij} v_i v_j}$, and

$$E_i(\vec\sigma) = E^a_i(\tau(\vec\sigma))\, n_a(\vec\sigma), \tag{4.29}$$

the normal to the surface being defined by

$$n_a(\vec\sigma) = \epsilon_{abc}\, \frac{\partial \tau^b(\vec\sigma)}{\partial \sigma^1}\, \frac{\partial \tau^c(\vec\sigma)}{\partial \sigma^2}. \tag{4.30}$$

Equation (4.28) can be interpreted as the surface integral of the norm of the two-form

$$E_i = E^a_i\, \epsilon_{abc}\, dx^b \wedge dx^c \tag{4.31}$$

and written

$$A(S) = \int_S |E|. \tag{4.32}$$
Thus $E^a_i$, or more precisely its norm, or “length” $|E|$, defines the area element. We could therefore say that in gravity “the length of the electric field is the area”, or, more precisely, the area of a surface is the flux of (the norm of) the gravitational electric field across the surface.
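The area formula (4.28)–(4.30) is easy to test on flat examples where the answer is known. The sketch below is not from the text: it assumes a constant field `E[a][i]`, parametrizes the surface over the unit square $0 \le \sigma^1, \sigma^2 \le 1$, integrates with a simple midpoint rule, and uses hypothetical function names throughout.

```python
import math

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def normal(t1, t2):
    """n_a = eps_abc t1^b t2^c, eq. (4.30); t1, t2 are d(tau)/d(sigma^1), d(tau)/d(sigma^2)."""
    return [sum(eps(a, b, c) * t1[b] * t2[c] for b in range(3) for c in range(3))
            for a in range(3)]

def area(E, tangents, n_steps=200):
    """Midpoint-rule estimate of A(S) = int d^2sigma |E|, eq. (4.28), for constant E[a][i]."""
    h = 1.0 / n_steps
    total = 0.0
    for p in range(n_steps):
        for q in range(n_steps):
            s1, s2 = (p + 0.5) * h, (q + 0.5) * h
            n = normal(*tangents(s1, s2))
            E_i = [sum(E[a][i] * n[a] for a in range(3)) for i in range(3)]  # eq. (4.29)
            total += math.sqrt(sum(x * x for x in E_i)) * h * h
    return total

def plane(s1, s2):
    """Coordinate unit square in the tau^1-tau^2 plane."""
    return (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)

def hemisphere(s1, s2, r=1.5):
    """Upper hemisphere of coordinate radius r: theta = pi*s1/2, phi = 2*pi*s2."""
    th, ph = 0.5 * math.pi * s1, 2.0 * math.pi * s2
    dth, dph = 0.5 * math.pi, 2.0 * math.pi
    t1 = (r * math.cos(th) * math.cos(ph) * dth, r * math.cos(th) * math.sin(ph) * dth,
          -r * math.sin(th) * dth)
    t2 = (-r * math.sin(th) * math.sin(ph) * dph, r * math.sin(th) * math.cos(ph) * dph, 0.0)
    return t1, t2

# Triad diag(2, 3, 5)  =>  E = det(e) * e^{-1} = diag(15, 10, 6), by (4.26).
E_box = [[15.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 6.0]]
E_flat = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

area_plane = area(E_box, plane)        # metric lengths 2 and 3 along the square
area_dome = area(E_flat, hemisphere)   # flat-space hemisphere of radius 1.5
```

In the first case the coordinate square has physical sides 2 and 3, and the flux of $|E|$ returns their product. In the second, the flat field $E^a_i = \delta^a_i$ reproduces the ordinary euclidean area $2\pi r^2$ up to discretization error.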
Using (2.73), a similar calculation gives the volume of a 3d region $R$:

$$V(R) = \int_R d^3\tau\, \sqrt{|\det E|}. \tag{4.33}$$
If we know the area of any surface and the volume of any region, we know the geometry.
These expressions for area and volume in terms of the gravitational electric field $E$ play a major role in quantum gravity. The corresponding quantum operators have a discrete spectrum; their eigenstates are known and determine a convenient basis in the quantum state space.
For later use, notice that $\det E = \det\left((\det e)\, e^{-1}\right) = (\det e)^3 (\det e)^{-1} = (\det e)^2$. Hence $\sqrt{n \cdot n} = |\det e| = \sqrt{|\det E|}$, where $n$ is defined in (4.30).
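The relation $\det E = (\det e)^2$ and the volume formula (4.33) can be checked on constant fields. The following sketch is illustrative: it assumes a constant triad stored as `e[i][a]`, builds $E$ from the cofactor form of (4.26), and integrates (4.33) trivially over a region of given coordinate volume. The function names are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 nested list by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def densitized_inverse(e):
    """E[a][i] = det(e) * (inverse triad)[a][i], i.e. the transposed cofactor matrix of e."""
    def cof(i, a):
        rows = [r for r in range(3) if r != i]
        cols = [c for c in range(3) if c != a]
        minor = (e[rows[0]][cols[0]] * e[rows[1]][cols[1]]
                 - e[rows[0]][cols[1]] * e[rows[1]][cols[0]])
        return (-1) ** (i + a) * minor
    return [[cof(i, a) for i in range(3)] for a in range(3)]

def volume_constant_field(E, coordinate_volume):
    """V(R) = int_R d^3tau sqrt|det E| for a field E that is constant over R."""
    return math.sqrt(abs(det3(E))) * coordinate_volume

e_diag = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 5.0]]
e_generic = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 1.0]]

vol_diag = volume_constant_field(densitized_inverse(e_diag), 1.0)
det_ratio = det3(densitized_inverse(e_generic)) / det3(e_generic) ** 2
```

For the diagonal triad the unit coordinate cube has physical sides 2, 3 and 5, and the formula returns their product; for the generic triad the ratio $\det E / (\det e)^2$ comes out equal to one.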
Phase space, states and relation with the Einstein equations. A state of GR is an equivalence class of 4d field configurations $e^I_\mu(x)$ solving the Einstein equations, under the two gauge transformations (2.123) and (2.124). The space $\Gamma$ of these equivalence classes is the phase space of GR.
Given one solution of the Einstein equations, consider a 3d surface $\sigma$ without boundaries in coordinate space. Let $A^i_a(\tau)$ be the connection on $\sigma$ induced by the 4d selfdual connection $A^i_\mu(x)$. A state determines a family of possible 3d fields $A^i_a(\tau)$, called compatible with the state, obtained by changing the representative in the equivalence class of solutions, or, equivalently, changing $\sigma$. A family of 3d fields $A^i_a(\tau)$ compatible with a state can be obtained in principle from a solution of the Hamilton–Jacobi system, as follows.
In general, to solve the system we need a solution $S[A, \alpha]$ of the Hamilton–Jacobi system depending on a sufficiently large number of parameters $\alpha_n$. A state is then determined by constants $\alpha_n$ and $\beta_n$ as follows. The equation

$$F[A, \alpha] = \frac{\partial S[A, \alpha]}{\partial \alpha_n} - \beta_n = 0 \tag{4.34}$$

determines the $A^i_a(\tau)$ compatible with the state. In this sense, a solution $S[A, \alpha]$ of the HJ equation (4.9) contains the solution of the Einstein equations. In what follows, I focus on the particular solution of the Hamilton–Jacobi system provided by the Hamilton function.
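The role of (4.34) is perhaps easiest to see in a finite-dimensional analogue. The following worked example is not from the text: it uses a nonrelativistic free particle of mass $m$ (an assumption made purely for illustration), with the separation constant $p$ playing the role of $\alpha_n$ and a constant $q_0$ playing the role of $\beta_n$.

```latex
\begin{align*}
&\frac{\partial S}{\partial t} + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^{2} = 0
  &&\text{(Hamilton--Jacobi equation)}\\[4pt]
&S(q,t;p) = p\,q - \frac{p^{2}}{2m}\,t
  &&\text{(solution depending on one parameter } p\text{)}\\[4pt]
&\frac{\partial S}{\partial p} - q_0 = q - \frac{p}{m}\,t - q_0 = 0
  \;\Longrightarrow\; q(t) = q_0 + \frac{p}{m}\,t
  &&\text{(motions compatible with the state } (p,q_0)\text{)}
\end{align*}
```

The pair $(p, q_0)$ labels a state, and the last equation selects the configurations $(q, t)$ compatible with it, in the same way that (4.34) selects the 3d fields $A^i_a(\tau)$ compatible with a state of GR.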
4.1.2 Hamilton function of GR and its physical meaning
A preferred solution of the Hamilton–Jacobi equation is the Hamilton function $S[A]$. This is defined as the value of the action of the region $R$ bounded by $\sigma = \partial R$, computed on a solution of the field equations determined by the boundary value $A$.
A boundary value $A$ on $\sigma$ determines a solution $(e^I_\mu(x), A^i_\mu(x))$ of the Einstein equations in the region $R$. In turn, this solution induces on $\sigma$ the 3d field $E[A]$. The Hamilton function satisfies

$$\frac{\delta S[A]}{\delta A^i_a(\tau)} = E^a_i(\tau)[A]. \tag{4.35}$$

Notice that $E^a_i(\tau)[A]$ is the value of $E^a_i(\tau)$ which is determined, via the Einstein equations, by the value of $A$ on the entire surface $\sigma$. Define the functional

$$F[A, E](\tau) = \frac{\delta S[A]}{\delta A^i_a(\tau)} - E^a_i(\tau); \tag{4.36}$$

then the equation

$$F[A, E] = 0 \tag{4.37}$$

is equivalent to the Einstein equations. It expresses the conditions that the Einstein equations put on the possibility of having fields $A$ and $E$ on a 3d surface.
This can be viewed as an evolution problem in the special case in which we take $\sigma$ to be formed by two connected components, $\sigma_{in}$ and $\sigma_{out}$, bounding a single connected region $R$. For instance, $\sigma_{in}$ and $\sigma_{out}$ could be two spacelike surfaces of a spatially closed universe. In this case, a solution is determined by the components $[A_{in}, A_{out}]$ of the connection on $\sigma_{in}$ and $\sigma_{out}$, and also by the components $[A_{in}, E_{in}]$ of the connection and electric field on $\sigma_{in}$ alone. We can write

$$F[A_{out}, A_{in}, E_{in}](\tau) = \frac{\delta S[A_{in}, A_{out}]}{\delta A_{in}{}^i_a(\tau)} - E_{in}{}^a_i(\tau) = 0. \tag{4.38}$$

Taking $A_{out}$ as the unknown and $A_{in}$ and $E_{in}$ as data, this equation gives the general solution of the Einstein equations: for fixed $A_{in}$ and $E_{in}$, it is solved by all 3d connections $A_{out}$ on $\sigma$ that are compatible with a solution bounded by a 3d surface with “initial conditions” $A_{in}$ and $E_{in}$. That is, (4.38) determines all fields $A_{out}$ that can “evolve” from the “initial conditions” $A_{in}$ and $E_{in}$. Therefore, the Hamilton function $S[A]$ contains the full solution of the Einstein equations: $S[A]$ expresses the full dynamics of GR.
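The way an equation like (4.38) encodes evolution can be seen in a one-dimensional analogue. The sketch below is not from the text: it uses the Hamilton function of a free particle, treats $q_{out}$ as the unknown and $(q_{in}, p_{in})$ as data, and adopts the usual particle sign convention $\partial S/\partial q_{in} = -p_{in}$, which differs in appearance from (4.38) because of the orientation of the boundary. All names and numerical values are hypothetical.

```python
def hamilton_function(q_in, q_out, T, m=1.0):
    """Hamilton function of a free particle of mass m going from q_in to q_out in time T."""
    return m * (q_out - q_in) ** 2 / (2.0 * T)

def F(q_out, q_in, p_in, T, m=1.0, h=1e-6):
    """Analogue of (4.38): dS/dq_in + p_in, which vanishes on physical motions.

    The derivative is taken by a central finite difference in q_in.
    """
    dS = (hamilton_function(q_in + h, q_out, T, m)
          - hamilton_function(q_in - h, q_out, T, m)) / (2.0 * h)
    return dS + p_in

def solve_q_out(q_in, p_in, T, m=1.0, lo=-1e3, hi=1e3, tol=1e-10):
    """Bisection on q_out for F = 0 (F is monotonic in q_out when T > 0)."""
    f_lo = F(lo, q_in, p_in, T, m)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = F(mid, q_in, p_in, T, m)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Data on the "initial" boundary; the solver returns the compatible final position.
q_out = solve_q_out(q_in=1.0, p_in=2.0, T=3.0, m=0.5)
```

The value found agrees with the elementary result $q_{out} = q_{in} + p_{in} T / m$, obtained here without ever writing an equation of motion: the Hamilton function alone fixes which boundary data are compatible.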
As we shall see, the full dynamics of quantum GR is contained in the corresponding quantum propagator $W[A]$. To the first relevant order in $\hbar$, $W[A]$ will be related to $e^{-\frac{i}{\hbar} S[A]}$.
I have made no request that the 3d surface $\sigma$ be spacelike. In particular, I have avoided the issue of whether arbitrary boundary values $A$ admit, or determine, an interpolating solution. In general, the function $S[A]$ will be defined only on a region and can be multivalued in some other region. These issues are important, and I refer the interested reader to [109], and references therein, for literature on the topic. On the other hand, I think that the insistence on spacelike surfaces might be more tied to our prerelativistic thinking habits than to their relations with Cauchy problems. In view of the construction of the quantum theory, these problems can perhaps be postponed. If needed, the requirement that the 3d surface is spacelike can be implemented as a restriction on the momentum $E$.
Experiments. Suppose we knew explicitly the Hamilton function $S[A]$. How could we compare the theory with experience? The answer is simple. We should measure the 3d fields $A$ and $E$ on a closed 3-surface $\sigma$. The theory predicts that the only fields $(A, E)$ we could measure are the ones that satisfy (4.36) and (4.37). Therefore, the theory determines which 3d fields could be measured and which could not. In turn, this determines restrictions (namely predictions) on any other quantity depending on these fields. Several important observations are in order.
First, the prediction is local, in the sense that it regards a finite region of spacetime. Observables that require the full spacetime, or the full space, to be observed are not realistic.
Second, and most important, where is the surface $\sigma$ located? Which surface $\sigma$ should we consider? The remarkable answer is: it doesn't matter. This is a key point in the interpretation of GR and should be understood in detail.
Consider a concrete experimental situation. Consider, for instance, a scattering experiment in a particle accelerator, or the propagation and reception of waves (electromagnetic or gravitational). In a nonrelativistic situation, say on Minkowski spacetime, we can view the situation as follows. We have a certain number of objects and detectors located in certain known positions of spacetime. We measure the initial, or incoming, data. We measure the final, or outgoing, data. Furthermore, we specify spatial boundary values (that forbid, for instance, spurious incoming radiation). The initial, final and boundary values of the fields can be represented by the value of the fields on a compact 3d surface $\sigma$. These data, however, are not sufficient to make theoretical predictions: we also need to know the location of $\sigma$ in spacetime. To fix the ideas, say that $\sigma$ is a cylinder in Minkowski space. The height of the cylinder, for instance, is the time lapse between the beginning and the end of the experiment.
Notice that the only relevant aspects of the location of objects, apparatus and detectors are their relative distances and time lapses. Therefore, the only relevant aspect of the location of $\sigma$ is the value of the metric on the surface and in its interior. Indeed, if we displace the surface (that is, the full experiment) in such a way that the geometry of the experiment remains the same, we expect that the outcome will not change. Since the geometry of the interior is dictated (on Minkowski) by the geometry of $\sigma$, we actually need to know only the geometry of $\sigma$. It is this geometry that determines the relative distances and time lapses between emissions and detections. Thus, the full data that we need in a prerelativistic situation are
(i) the value of the dynamical fields on $\sigma$, and
(ii) the geometry of $\sigma$.
Consider now the general-relativistic situation. The same data as above are needed, but now the geometry of $\sigma$ is determined by the value of the dynamical fields on $\sigma$, because the geometry is determined by the gravitational field. Therefore, the data that we need are
(i) the value of the dynamical fields on $\sigma$,
and nothing else.
The “location” of $\sigma$ in the coordinate manifold is irrelevant, because it only reflects the arbitrary choice of coordinatization of spacetime. In other words, the distances and the time lapses among the detectors are precisely part of the boundary data $(A, E)$ on $\sigma$. For instance, if $\sigma$ is a cylinder, the time lapse between the initial and final measurement is precisely coded in the value of the gravitational field on the vertical (timelike) side of the cylinder. Asking what happens after a longer time means nothing but asking what happens for larger values of $E$ on the side of the cylinder.
4.2 Euclidean GR and real connection
4.2.1 Euclidean GR
In this section I describe a field theory different from GR, but which plays an important role in quantum gravity. This is often called “euclidean GR.” Usual physical GR is then denoted “lorentzian” to emphasize its distinction from euclidean GR. Euclidean GR can be defined by the same equations as GR, for instance the action (2.13), with the only difference that indices $I, J$ in the internal space are raised and lowered with the euclidean metric $\delta_{IJ}$ instead of the Minkowski metric $\eta_{IJ}$. Accordingly, the euclidean spin connection $\omega$ is an SO(4) connection, instead of an SO(3,1) connection.
It is still convenient to define the selfdual connection $A$ as in (2.19), but the appropriate selfdual projector $P$ is now defined without the imaginary factor, that is,

$$A^i = \omega^i + \omega^{0i}. \tag{4.39}$$

Therefore the selfdual connection $A$ is real in the euclidean case. The absence of the imaginary factor gives immediately the Poisson brackets

$$\{A^i_a(\tau), E^b_j(\tau')\} = (8\pi G)\, \delta^b_a\, \delta^i_j\, \delta^3(\tau, \tau') \tag{4.40}$$
instead of (4.25).
There is an important difference between the lorentzian and euclidean cases. In the euclidean case, the connection lives in the $so(4)$ algebra. This algebra decomposes as $so(4) = so(3) \oplus so(3)$. The real connection (4.39) is simply one of the two components. Therefore (4.39) has half the information of $\omega$.
In the lorentzian case, on the other hand, the Lorentz algebra $so(3,1)$ does not decompose at all. However, its complexification $so(3,1;\mathbb{C})$ decomposes as $so(3,1;\mathbb{C}) = so(3,\mathbb{C}) \oplus so(3,\mathbb{C})$. A real $\omega$ determines two complex components, which are complex conjugate to each other, and each component contains the same information as $\omega$ itself. In this case, indeed, the connection (2.19) has three complex components, which is precisely the same information as the six real components of $\omega$.
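The counting of components in the two cases can be made concrete at a single point. The sketch below is illustrative and rests on assumptions not spelled out in the text: it takes $\omega^i = \tfrac12 \epsilon^i{}_{jk}\omega^{jk}$, writes the second euclidean component as $\omega^i - \omega^{0i}$, and uses $\omega^i + i\,\omega^{0i}$ for the lorentzian selfdual combination. The matrix entries and function names are arbitrary.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def rotation_part(omega):
    """omega^i = (1/2) eps^i_jk omega^jk, built from the spatial block of omega^IJ."""
    return [sum(eps(i, j, k) * omega[j + 1][k + 1] for j in range(3) for k in range(3)) / 2
            for i in range(3)]

def boost_part(omega):
    """omega^0i, the mixed components of omega^IJ."""
    return [omega[0][i + 1] for i in range(3)]

# A real antisymmetric omega^IJ at one point: six independent entries.
omega = [[0.0, 1.0, 2.0, 3.0],
         [-1.0, 0.0, 4.0, 5.0],
         [-2.0, -4.0, 0.0, 6.0],
         [-3.0, -5.0, -6.0, 0.0]]
rot, boost = rotation_part(omega), boost_part(omega)

# Euclidean case: two real 3-component pieces; each alone is half of the data.
A_plus = [r + b for r, b in zip(rot, boost)]
A_minus = [r - b for r, b in zip(rot, boost)]
rot_from_both = [(p + m) / 2 for p, m in zip(A_plus, A_minus)]
boost_from_both = [(p - m) / 2 for p, m in zip(A_plus, A_minus)]

# Lorentzian case: one complex 3-component piece carries all six real numbers.
A_selfdual = [complex(r, b) for r, b in zip(rot, boost)]
rot_from_one = [z.real for z in A_selfdual]
boost_from_one = [z.imag for z in A_selfdual]
```

In the euclidean case both `A_plus` and `A_minus` are needed to rebuild $\omega$, while in the lorentzian case the real and imaginary parts of the single complex combination already suffice.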
Remarkably, the canonical formalism for the euclidean theory parallels completely the one for the lorentzian theory. The theory is defined by the same Hamilton–Jacobi equations (4.2)–(4.4), with the only difference that (in the gauge (4.21)) the reality conditions (4.1) are replaced by

$$E^a_i - \overline{E^a_i} = 0, \qquad A^i_a - \overline{A^i_a} = 0. \tag{4.41}$$
The world is described by lorentzian GR, not by euclidean GR. Why then is euclidean GR useful at all? Because euclidean GR plays a role in the search for a physical quantum theory of gravity in several ways. These will be discussed in more detail in the second part of the book, but it is appropriate to anticipate some of these reasons here.
First, the key difficulty of quantum gravity is to understand how to formulate a nontrivial generally covariant quantum field theory. Euclidean GR is an example of a nontrivial generally covariant field theory which is simpler than lorentzian GR, because the reality conditions are simpler. Therefore, a complete and consistent formulation of euclidean quantum GR is not yet a quantum theory of gravity, but is probably a major step in that direction. Euclidean GR is a highly nontrivial model of the true theory.
Second, it is well known that the euclidean version of flat-space quantum field theories is strictly connected to the physical lorentzian version. Under wide assumptions, one can prove that physical n-point functions are analytical continuations of the ones of the euclidean theory. Naively, one can simply Wick-rotate the time coordinate in the imaginary plane. More precisely, solid theorems from axiomatic quantum field theory
assure us that Wightman distributions are indeed the analytic continuation of the moments of a euclidean process (the Schwinger functions) under very general hypotheses. Defining the euclidean quantum field theory is therefore equivalent to defining the physical theory. In fact, calculations are routinely performed in the euclidean region in standard quantum field theory. We cannot assume naively that the same remains true in quantum gravity. There is no Wick rotation to consider (recall that the coordinate $t$ is irrelevant for observable amplitudes anyway), and we are outside the hypotheses of the axiomatic approach. Therefore, we cannot, as we do on flat space, content ourselves with defining the euclidean quantum field theory and lazily be sure that a consistent physical theory will follow.
Still, the very strict connection between the euclidean and the lorentzian theory that exists on flat space strongly suggests that some connection between euclidean and lorentzian quantum GR is likely to exist. Stephen Hawking, in particular, has explored the hypothesis that physical quantum gravity could be directly defined in terms of the quantization of the euclidean theory. There are various indications for that. First, the formal functional path integral of the euclidean theory solves the Wheeler–DeWitt equation for the lorentzian theory as well. Second, there is a standard technique for obtaining the vacuum of a quantum theory by propagating for an infinite euclidean time; the adaptation of this idea to gravity led Jim Hartle and Stephen Hawking to the idea that a quantum gravitational “vacuum” is obtained from propagation in imaginary time, or, equivalently, from the quantum euclidean theory.
Finally, as I show in the next section, the lorentzian theory admits a formulation that has the same kinematics as the euclidean theory. It is therefore reasonable to expect that the kinematical features of the two theories are the same, and therefore kinematical aspects of the physical theory can be studied in the euclidean context.
4.2.2 Lorentzian GR with a real connection
Let us return to lorentzian GR. In this context, define the quantity

$$A^i = \omega^i + \omega^{0i}, \tag{4.42}$$

precisely as in (4.39). This quantity does not transform as a connection under a local Lorentz transformation (as it does in the euclidean case), but it is still a well-defined field. If we fix the gauge (4.21), then the reduced local internal gauge invariance is SO(3), and $A$ defined in (4.42) transforms as a connection under SO(3) transformations. For this reason, it is denoted the “real connection” of the lorentzian theory.
Remarkably, we can take the real connection, or more precisely its three-dimensional restriction to the boundary surface, as a canonical coordinate. Lorentzian GR, in other words, can be expressed in terms of a real SO(3) connection. The reality conditions are trivial. The only difference with respect to the euclidean theory is the form of the hamiltonian, which acquires another, more complicated, term with respect to (4.4):

$$H = \left(F^{ij}_{ab} + 2 K^i_{[a} K^j_{b]}\right) E^a_i E^b_j, \tag{4.43}$$
where $K^i_a = A^i_a - \Gamma^i_a[E]$. (See for instance [20].) The connection (4.42) and the hamiltonian (4.43) provide a second hamiltonian formalism for GR, alternative to the one described at the beginning of this chapter.
4.2.3 Barbero connection and Immirzi parameter
Finally, there is a third possible formalism for lorentzian GR. It consists in using the connection

$$A^i = \omega^i + \gamma\, \omega^{0i}, \tag{4.44}$$

where $\gamma$ is an arbitrary complex parameter. This is called the Barbero connection; it derives naturally from the use of the Holst action (see Sec. 2.1.1). The case $\gamma = i$ gives the selfdual connection. When $\gamma$ is real, it is called the Immirzi parameter. In this case, the reality conditions are still trivial (that is, $A = \overline{A}$) and the hamiltonian is a small modification of (4.43) (see [20]):

$$H = \left(F^{ij}_{ab} + (\gamma^2 + 1)\, K^i_{[a} K^j_{b]}\right) E^a_i E^b_j. \tag{4.45}$$
We will use this formalism in the quantum theory.
Since $\gamma$ scales the term $\omega^{0i}$, which is the one that has nonvanishing Poisson brackets with $E$, it is easy to see that the Poisson brackets between the Barbero connection and the electric field are

$$\{A^i_a(x), E^b_j(y)\} = (8\pi\gamma G)\, \delta^b_a\, \delta^i_j\, \delta^3(x, y). \tag{4.46}$$
The fact that $\gamma$ can be arbitrary is important because, as we shall see, the quantum theories obtained starting with different values of $\gamma$ lead to different physical predictions. That is, in pure gravity $\gamma$ has no effect in the classical theory, but has an effect in the quantum theory. (In the presence of minimally coupled fermions, $\gamma$ appears in the equation of motion [128].) Presumably, the presence of this parameter reflects a one-parameter quantization ambiguity of the theory: $\gamma$ is a parameter of the quantum theory that is absent in the classical theory, such as, for instance, the $\theta$ parameter of the QCD $\theta$-vacua. In fact, $\gamma$ can also be introduced as the constant in front of a topological term added to the action, precisely as the $\theta$ parameter in QCD. Such terms do not affect the classical equations of motion, but affect the quantum theory.
As we shall see in Chapter 8, $\gamma$ enters in several key predictions of the quantum theory. In particular, it enters in the computation of the black-hole entropy. Comparing the black-hole entropy with the one determined thermodynamically then determines $\gamma$. A calculation along these lines, sketched in Chapter 8, suggests the value

$$\gamma \approx 0.2375. \tag{4.47}$$
It has also been repeatedly suggested that $\gamma$ may determine the relation between the bare and renormalized Newton constant. Nevertheless, the physical interpretation of this parameter is not yet clear.
4.3 Hamiltonian GR
I give here a formulation of canonical GR on a finite-dimensional configuration space, along the lines described in Section 3.3.2.
4.3.1 Version 1: real SO(3,1) connection
Let $T$ be the space on which the fields $e$ and $\omega$ take value. This is a (16 + 24)-dimensional space with coordinates $(e^I_\mu, \omega^{IJ}_\mu)$. Let $\Sigma = M \times T$ be the (4 + 16 + 24)-dimensional space with coordinates $(x^\mu, e^I_\mu, \omega^{IJ}_\mu)$. Consider the four-form

$$\theta = \epsilon_{IJKL}\, e^I_\mu\, e^J_\nu\, D\omega^{KL}_\rho \wedge dx^\mu \wedge dx^\nu \wedge dx^\rho \tag{4.48}$$

defined on this space. Here the covariant differential $D$ is defined by

$$D\omega^{KL}_\rho = d\omega^{KL}_\rho + \omega^K{}_{\sigma I}\, \omega^{IL}_\rho\, dx^\sigma. \tag{4.49}$$

This structure defines GR as follows. Consider a four-dimensional surface $\gamma$ in $\Sigma$. Recall from Section 3.3.2 that we say that $\gamma$ is an orbit of $\omega$ if the quadritangent $X$ to the orbit is in the kernel of the five-form $\omega = d\theta$:

$$d\theta(X) = 0. \tag{4.50}$$

The orbits of $\omega$ are the solutions of the Einstein equations. If we use the $x$ as coordinates on $\gamma$, then $\gamma$ is represented by

$$\gamma = (x^\mu, e^I_\mu(x), \omega^{IJ}_\mu(x)). \tag{4.51}$$

If $\gamma$ is an orbit of $\omega$, then the functions $(e^I_\mu(x), \omega^{IJ}_\mu(x))$ solve the Einstein equations. The demonstration is a straightforward calculation along the lines sketched for the scalar field example in Section 3.3.2.
4.3.2 Version 2: complex SO(3) connection
Consider the space $\Sigma$ with coordinates $(x^\mu, A^i_\mu, e^I_\mu)$, where $A^i_\mu$ is complex and $e^I_\mu$ is real. Define the gauge-covariant differential acting on all quantities with internal $i$ indices as

$$Dv^i = dv^i + \epsilon^i{}_{jk}\, A^j_\mu\, v^k\, dx^\mu \tag{4.52}$$
and

$$DA^i_\mu = dA^i_\mu + \epsilon^i{}_{jk}\, A^j_\nu A^k_\mu\, dx^\nu. \tag{4.53}$$

GR is defined by the four-form

$$\theta = P_{IJi}\; e^I \wedge e^J \wedge DA^i, \tag{4.54}$$

where $P^i_{IJ}$ is the selfdual projector defined in (2.17). Indeed, the orbits $(x^\mu, A^i_\mu(x^\mu), e^I_\mu(x^\mu))$ of $\omega = d\theta$ satisfy the Einstein equations in the form

$$e^I \wedge \left(de^J + P^J{}_{Ki}\, A^i \wedge e^K\right) = 0, \tag{4.55}$$
$$P_{IJi}\; e^I \wedge F^i = 0, \tag{4.56}$$

where $F^i$ is the curvature of $A^i$. The calculation is straightforward.
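4.3.3 Configuration space and hamiltonian
Above, I have defined canonical GR directly as a presymplectic $(\Sigma, \theta)$ system. This form can be derived from a configuration space and a hamiltonian, namely from the $(C, H)$ formalism described in Section 3.3.2, as follows.
Consider the finite-dimensional space $C$ with coordinates $(x^\mu, A^i_\mu)$. Here $A^i_\mu$ is a complex matrix. Assuming immediately (3.146), the corresponding space $\Omega$ has coordinates $(x^\mu, A^i_\mu, \pi, p^{\mu\nu}_i)$ and carries the canonical four-form

$$\theta = \pi\, d^4x + p^{\mu\nu}_i\, dA^i_\nu \wedge d^3x_\mu. \tag{4.57}$$

Using $D$, the canonical form (4.57) reads

$$\theta = p\, d^4x + p^{\mu\nu}_i\, DA^i_\mu \wedge d^3x_\nu, \tag{4.58}$$

where $p = \pi - p^{\mu\nu}_i A^j_\nu A^k_\mu\, \epsilon^i{}_{jk}$. Also, define

$$E^i_{\mu\nu} = \epsilon_{\mu\nu\rho\sigma}\, \delta^{ij}\, p^{\rho\sigma}_j \tag{4.59}$$

and the forms $A^i = A^i_\mu\, dx^\mu$, $DA^i = dA^i_\mu \wedge dx^\mu + A^j_\nu A^k_\mu\, \epsilon^i{}_{jk}\, dx^\nu \wedge dx^\mu$, $E^i = E^i_{\mu\nu}\, dx^\mu \wedge dx^\nu$, and so on, on $\Omega$.
GR is defined by the hamiltonian system

$$p = 0, \tag{4.60}$$
$$p^{\mu\nu}_i + p^{\nu\mu}_i = 0, \tag{4.61}$$
$$\overline{E^i} \wedge E^j = 0, \tag{4.62}$$
$$\left(\delta_{ik}\delta_{jl} - \tfrac{1}{3}\, \delta_{ij}\delta_{kl}\right) E^i \wedge E^j = 0. \tag{4.63}$$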
The key point is that the constraints (4.62), (4.63) imply that there exists a real four by four matrix $e^I_\mu$, where $I = 0, 1, 2, 3$, such that $E^i_{\mu\nu}$ is the selfdual part of $e^I_\mu e^J_\nu$. In fact, it is easy to check that (4.62) and (4.63) are solved by

$$E^i = P^i_{IJ}\; e^I \wedge e^J, \tag{4.64}$$

and the counting of degrees of freedom indicates that this is the unique solution. Therefore, we can use the coordinates $(x^\mu, A^i_\mu, e^I_\mu)$ on the constraint surface $\Sigma$ (where $A^i_\mu$ is complex and $e^I_\mu$ is real), and the induced canonical four-form is (4.54). Thus, we recover the above $(\Sigma, \theta)$ structure.
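4.3.4 Derivation of the Hamilton–Jacobi formalism
Let $\alpha$ be a three-dimensional surface in $C$. Thus $\alpha = [x^\mu(\tau), A^i_\mu(\tau)]$, where $\tau = (\tau^1, \tau^2, \tau^3) = (\tau^a)$. Define the functional

$$S[\alpha] = \int_\gamma \theta, \tag{4.65}$$

as in (3.167). That is, $\gamma$ is the four-dimensional surface in $\Sigma$ which is an orbit of $d\theta$, and therefore a solution of the field equations, and is such that the projection of its boundary to $C$ is $\alpha$. From the definition (4.54),

$$\frac{\delta S[\alpha]}{\delta A^i_\mu(\tau)} = P_{iIJ}\; \epsilon^{\mu\nu\rho\sigma}\, e^J_\rho(\tau)\, e^I_\sigma(\tau)\, n_\nu(\tau), \tag{4.66}$$

where $n_\nu$ is defined in (3.180). Since from this equation we have immediately

$$n_\mu(\tau)\, \frac{\delta S[\alpha]}{\delta A^i_\mu(\tau)} = 0, \tag{4.67}$$

it follows that the dependence of $S[\alpha]$ on $A^i_\mu(\tau)$ is only through the restriction of $A^i(\tau)$ to the 3-surface $\alpha_M$, that is, only through the components

$$A^i_a(\tau) = \partial_a x^\mu(\tau)\, A^i_\mu(\tau). \tag{4.68}$$

Thus $S[\alpha] = S[x^\mu(\tau), A^i_a(\tau)]$ and

$$\frac{\delta S[\alpha]}{\delta A^i_a(\tau)} = P_{iJK}\; \epsilon^{a\nu bc}\, \partial_b x^\rho(\tau)\, \partial_c x^\sigma(\tau)\, e^J_\rho(\tau)\, e^K_\sigma(\tau)\, n_\nu(\tau) \equiv E^a_i(\tau). \tag{4.69}$$

Therefore $E^a_i$ is the conjugate momentum to the connection $A^i_a$. Notice that $E^a_i$ is the dual of the restriction to the boundary surface $\sigma$ of the Plebanski two-form $\Sigma^i = \Sigma^i_{\mu\nu}\, dx^\mu \wedge dx^\nu$ defined in (2.23). Assume for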
simplicity that the boundary surface is given by $x^0 = 0$ and coordinatized by $x(\tau) = \tau$, and that we have chosen the gauge $e^0_b(\tau) = 0$. Then $n_\mu = (1, 0, 0, 0)$ and

$$E^a_i = \epsilon^{abc}\, \Sigma_{ibc}. \tag{4.70}$$

Its real part is the densitized inverse triad

$$\mathrm{Re}\, E^a_i = -\epsilon_{ijk}\, \epsilon^{abc}\, e^j_b\, e^k_c = \det(e)\, e^a_i, \tag{4.71}$$

where $e^a_i$ is the matrix inverse to the “triad” one-form $e^i_a$. Its imaginary part is

$$\mathrm{Im}\, E^{ai} = \epsilon^{abc}\, e^i_b\, e^0_c. \tag{4.72}$$
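The projection of the field equations (4.56) on $\sigma$, written in terms of $E^a_i$, reads $D_a E^a_i = 0$, $F^i_{ab} E^a_i = 0$ and $F^i_{ab} E^{aj} E^{bk} \epsilon_{ijk} = 0$, where $D_a$ and $F^i_{ab}$ are the covariant derivative and the curvature of $A^i_a$. Using (4.69), these give the three Hamilton–Jacobi equations of GR

$$D_a \frac{\delta S[\alpha]}{\delta A^i_a(\tau)} = 0, \tag{4.73}$$
$$\frac{\delta S[\alpha]}{\delta A^i_a(\tau)}\, F^i_{ab} = 0, \tag{4.74}$$
$$F^{ij}_{ab}(\tau)\, \frac{\delta S[\alpha]}{\delta A^i_a(\tau)}\, \frac{\delta S[\alpha]}{\delta A^j_b(\tau)} = 0. \tag{4.75}$$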
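Kinematical gauges. Equation (4.73) could have been obtained by simply observing that $S[\alpha]$ is invariant under local SU(2) gauge transformations on the 3-surface. Under one such transformation, generated by a function $f^i(\tau)$, the variation of the connection is $\delta_f A^i_a = D_a f^i$. Therefore $S$ satisfies

$$\begin{aligned}
0 = \delta_f S &= \int d^3\tau\; \delta_f A^i_a(\tau)\, \frac{\delta S[\alpha]}{\delta A^i_a(\tau)} = \int d^3\tau\; D_a f^i(\tau)\, \frac{\delta S[\alpha]}{\delta A^i_a(\tau)} \\
&= -\int d^3\tau\; f^i(\tau)\, D_a \frac{\delta S[\alpha]}{\delta A^i_a(\tau)}.
\end{aligned} \tag{4.76}$$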
This gives (4.73). Next, the action is invariant under a change of coordinates on the 3-surface $\alpha_M$. Under one such transformation, generated by a function $f^a(\tau)$, the variation of the connection is $\delta_f A^i_a = f^b \partial_b A^i_a + A^i_b\, \partial_a f^b$. Integrating by parts as in (4.76), this gives
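
$$\partial_b A^i_a\, \frac{\delta S[\alpha]}{\delta A^i_a(\tau)} - \partial_a \left(A^i_b\, \frac{\delta S[\alpha]}{\delta A^i_a(\tau)}\right) = 0, \tag{4.77}$$
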
which, combined with (4.73), gives (4.74). Thus, (4.73) and (4.74) are simply the requirement that $S[\alpha]$ is invariant under internal gauge transformations and changes of coordinates on the 3-surface. The three equations (4.73), (4.74) and (4.75) govern the dependence of $S$ on $A^i_a(\tau)$.
Dropping the coordinates. It is easy to see that $S$ is independent from $x^\mu(\tau)$. A change of coordinates $x^\mu(\tau)$ tangential to the surface cannot affect the action, which is independent of the coordinates used. More formally, the invariance under change of parameter $\tau$ implies

$$\frac{\delta S[\alpha]}{\delta x^\mu(\tau)}\, \partial_j x^\mu(\tau) = \frac{\delta S[\alpha]}{\delta A^i_a(\tau)}\, \delta_j A^i_a(\tau), \tag{4.78}$$

and we have already seen that the right-hand side vanishes. The variation of $S$ under a change of $x^\mu(\tau)$ normal to the surface is governed by the Hamilton–Jacobi equation proper, equation (3.186). In the present case, following the same steps as for the scalar field, we obtain

$$\frac{\delta S[\alpha]}{\delta x^\mu(\tau)}\, n^\mu(\tau) + \epsilon_{ijk}\, F^i_{ab}\, \frac{\delta S[\alpha]}{\delta A^j_a(\tau)}\, \frac{\delta S[\alpha]}{\delta A^k_b(\tau)} = 0. \tag{4.79}$$
But the second term vanishes because of (4.75). Therefore $S[\alpha]$ is independent of tangential as well as normal parts of $x^\mu(\tau)$: $S$ depends only on $[A^i_a(\tau)]$.
We can thus drop altogether the spacetime coordinates $x^\mu$ from the extended configuration space. Define a smaller extended configuration space $C$ as the 9d complex space of the variables $A^i_a$. Geometrically, this can be viewed as the space of the linear mappings $A: D \to sl(2, \mathbb{C})$, where $D = \mathbb{R}^3$ is a “space of directions” and we have chosen the complex selfdual basis in the $sl(2, \mathbb{C})$ algebra. We then identify the space $G$ as a space of parametrized 3d surfaces $A$, with components $[A^i_a(\tau)]$, and without boundaries in $C$. GR is defined on this space by the Hamilton–Jacobi system

$$D_a \frac{\delta S[A]}{\delta A^i_a(\tau)} = 0, \tag{4.80}$$
$$\frac{\delta S[A]}{\delta A^i_a(\tau)}\, F^i_{ab} = 0, \tag{4.81}$$
$$F^{ij}_{ab}(\tau)\, \frac{\delta S[A]}{\delta A^i_a(\tau)}\, \frac{\delta S[A]}{\delta A^j_b(\tau)} = 0. \tag{4.82}$$
These are the equations presented at the beginning of this chapter, on which we will base quantum gravity.
Equivalently, we can solve immediately (4.80) and (4.81) by defining the space $G_0$ of the equivalence classes of 3d SU(2) connections $A$ under gauge and 3d diffeomorphism transformations $\left(A^i_a(\tau) = \frac{\partial \tau'^b}{\partial \tau^a}\, A'^i_b(\tau'(\tau))\right)$. Then GR is defined by the sole equation (4.82) on this space (where functions $S[A^i_a(\tau)]$ overcoordinatize $G_0$). Accordingly, we can interpret GR as the dynamical system defined by the extended configuration space
$G_0$ and the relativistic hamiltonian

$$H(\tau) = F^{ij}_{ab}(\tau)\, E^a_i(\tau)\, E^b_j(\tau). \tag{4.83}$$
4.3.5 Reality conditions
The two variables on which we have based the canonical formulation of GR described above are a complex 3d connection $A^i_a$ and its complex conjugate momentum $E^a_i$. They have 9 complex components each. On the other hand, the degrees of freedom of GR have (9 + 9) real components, of which (2 + 2) are physical degrees of freedom, 7 are constrained and 7 are gauges. The explanation of the apparent doubling of the components is that $A$ and $E$ are like the coordinates $z = x + ip$ and $\bar z = x - ip$ over the phase space of a one-dimensional system. That is, they are not independent of each other.
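To find these relations, let us write the real and imaginary parts of A and E. From their definition, we have

    \mathrm{Re}\, A^i_a = \omega^i_a    (4.84)
    \mathrm{Im}\, A^i_a = \omega^{0i}_a    (4.85)
    \mathrm{Re}\, E^a_i = \det(e) \, e^a_i    (4.86)
    \mathrm{Im}\, E^a_i = \epsilon^{abc} \, e^0_b \, e_{ic}    (4.87)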
We have chosen a gauge in which e^0_a = 0. Then (4.87) implies that E is real. Recall that the tetrad and the connection ω are related by the equation de^I = ω^I_J ∧ e^J. Projecting this equation on the 3-surface we obtain

    de^i = \omega^i{}_j \wedge e^j + \omega^i{}_0 \wedge e^0    (4.88)

In the gauge chosen, the last term vanishes and ω^i_j is the spin connection of the triad e^i. Hence (4.84) implies that the real part of A satisfies (4.1). Without fixing the gauge e^0_a = 0, the reality conditions are a bit more cumbersome.
----
Bibliographical notes
The hamiltonian formulation of GR was developed independently by Peter Bergmann and his group [129] and by Dirac [130]. The long-term goal of both was quantum gravity. The main tool for this, the hamiltonian theory of constrained systems, was developed for this purpose. The great algebraic complexity of the hamiltonian formalism was dramatically reduced by the introduction of the ADM variables by Arnowitt, Deser and Misner [131], and then by the selfdual connection variables systematized by Ashtekar [132].
The conventional derivation of the fundamental equations (4.2–4.4) from the lagrangian formalism can be found in many books and articles; see for instance [2, 9, 20, 126]. See also the original articles [132]. The expression (4.16) of the hamiltonian, which plays an important role in the quantum theory, was introduced by Thomas Thiemann [133]. The usefulness of the Barbero connection was pointed out in [134]; on its geometrical interpretation, see [136]; on the importance of the Immirzi parameter for the quantum theory, see [135]. An (inconclusive) discussion on the Immirzi parameter and its physical interpretation is in [137].
For the finite-dimensional formulation, I have followed here [138, 139]. On other versions of this formalism, see [140]. For the covariant Hamilton–Jacobi formalism for GR, see also [141].
5 Quantum mechanics
Quantum mechanics (QM) is not just a theory of micro-objects: it is our current fundamental theory of motion. It expresses a deeper understanding of Nature than classical mechanics. Precisely as classical mechanics, the conventional formulation of QM describes evolution of states and observables in time. Precisely as classical mechanics, this is not sufficient to deal with general relativistic systems, because these systems do not describe evolution in time; they describe correlations between observables. Therefore, a formulation of QM slightly more general than the conventional one – or a quantum version of the relativistic classical mechanics discussed in the previous chapter – is needed. In this chapter I discuss the possibility of such a formulation. In the last section I discuss the general physical interpretation of QM.
QM can be formulated in a number of more or less equivalent formalisms: canonical (Hilbert spaces and self-adjoint operators), covariant (Feynman's sum-over-histories), algebraic (states as linear functionals over an abstract algebra of observables) and others. Generally, but not always, we are able to translate these formalisms into one another, but often what is easy in one formulation is difficult in another. A general-relativistic sum-over-histories formalism has been developed by Jim Hartle [26]. Here I focus on the canonical formalism, because the canonical formalism has provided the mathematical completeness and precision needed to explicitly construct the mathematical apparatus of quantum gravity. Later I will consider alternative formalisms.
5.1 Nonrelativistic QM
Conventional QM can be formulated as follows.

States. The states of a system are represented by vectors ψ in a complex separable Hilbert space ℋ_0.

Observables. Each observable quantity A is represented by a self-adjoint operator A on ℋ_0. The possible values that A can take are the numbers in the spectrum of A.
Probability. The average of the values that A takes over many equal states represented by ψ is a = ⟨ψ|A|ψ⟩/⟨ψ|ψ⟩.

Projection. If the observable A takes values in the spectral interval I, the state ψ then becomes the state P_I ψ, where P_I is the spectral projector on the interval I.

Evolution. States evolve in time according to the Schrödinger equation

    i\hbar \, \partial_t \psi(t) = H_0 \psi(t)    (5.1)

where H_0 is the hamiltonian operator corresponding to the energy. Equivalently, states do not evolve in time but observables do, and their evolution is governed by the Heisenberg equation

    \frac{d}{dt} A(t) = -\frac{i}{\hbar} [A(t), H_0]    (5.2)

A given quantum system is defined by a family (generally an algebra) of operators A_i, including H_0, defined over a Hilbert space ℋ_0.
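The postulates above can be tried out on a finite-dimensional toy system. The following is only a sketch under stated assumptions: the two-level matrices `H0` and `A`, the state `psi` and the time `t` are hypothetical choices made for illustration (with ħ = 1), not anything taken from the text. It computes the spectrum of an observable, a spectral projector, and checks that the Schrödinger and Heisenberg pictures give the same average.

```python
import numpy as np

# Hypothetical two-level system with hbar = 1; the matrices are illustrative only.
H0 = np.array([[1.0, 0.5], [0.5, -1.0]])   # hamiltonian, self-adjoint
A = np.array([[0.0, 1.0], [1.0, 0.0]])     # observable, self-adjoint
psi = np.array([1.0, 2.0j])                # state vector, deliberately unnormalized


def expectation(op, state):
    """Average value <psi|op|psi> / <psi|psi>."""
    return (np.vdot(state, op @ state) / np.vdot(state, state)).real


def evolution_operator(t):
    """U(t) = exp(-i t H0), built from the spectral decomposition of H0."""
    energies, vectors = np.linalg.eigh(H0)
    return vectors @ np.diag(np.exp(-1j * energies * t)) @ vectors.conj().T


# Possible values of A: its spectrum, in ascending order.
spectrum, eigenvectors = np.linalg.eigh(A)

# Spectral projector on an interval containing only the largest eigenvalue.
v = eigenvectors[:, -1]
P_I = np.outer(v, v.conj())

# Schrodinger picture evolves the state; Heisenberg picture evolves the operator.
t = 0.7
U = evolution_operator(t)
average_schrodinger = expectation(A, U @ psi)
average_heisenberg = expectation(U.conj().T @ A @ U, psi)

print(spectrum, average_schrodinger, average_heisenberg)
```

The two averages agree because A(t) = e^{itH_0} A e^{−itH_0} is simply the evolution moved from the state onto the operator.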
This scheme for describing Nature differs substantially from the newtonian one. Here are the main features of the physical content of the above scheme.

Probability. Predictions are only probabilistic.

Quantization. Some physical quantities can take certain discrete values only (are “quantized”).

Superposition principle. If a system can be in a state A, where a physical quantity q has value a, as well as in state B, where q has value b, then the system can also be in states (denoted ψ = c_a A + c_b B, with |c_a|² + |c_b|² = 1) where q has value a with probability |c_a|² and value b with probability |c_b|².

Uncertainty principle. There are couples of (conjugate) variables that cannot have determined values at the same time.

Effect of observations on predictions. The properties we expect the system to have at some time t_2 are determined not only by the properties we know the system had at time t_0, but also by the properties we know the system has at the time t_1, where t_0 < t_1 < t_2.¹

¹ Bohr expressed this fact by saying that observation affects the observed system. But formulations such as Bohm's or consistent histories force us to express this physical fact using more careful wording.
In Section 5.6 I discuss the physical content of QM in more depth.

In general, a quantum system (ℋ_0, A_i, H_0) has a classical limit, which is a mechanical system describing the results of observations made on the system at scales and with accuracy larger than the Planck constant. In the classical limit, Heisenberg uncertainty can be neglected and the observables A_i can be taken as coordinates of a commutative phase space Γ_0. Quantum commutators define classical Poisson brackets, and (5.2) reduces to Hamilton equation (3.78).
If the classical limit is known, the search for a quantum system from which this limit may derive is called the quantization problem. There is no reason for the quantization problem to have a unique solution. The existence of distinct solutions is denoted “quantization ambiguity.” Experience shows that the simplest quantization of a given classical system is very often the physically correct one. If we are given a classical system defined by a nonrelativistic configuration space 𝒞_0 with coordinates q^i and by a nonrelativistic hamiltonian H_0(q^i, p_i), then a solution of the quantization problem can be obtained by interpreting the Hamilton–Jacobi equation (3.17) as the eikonal approximation of the wave equation (5.1) that governs the quantum dynamics [142]. This can be achieved by defining multiplicative operators q^i, derivative operators p_i = −iħ ∂/∂q^i, and the hamiltonian operator

    H_0 = H_0\left( q^i, -i\hbar \frac{\partial}{\partial q^i} \right)    (5.3)

on the Hilbert space ℋ_0 = L²[𝒞_0], the space of the square integrable functions on the nonrelativistic configuration space [143].
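As a rough numerical illustration of this prescription, one can discretize the configuration space and represent q as a multiplicative (diagonal) operator and p² through finite differences. The sketch below does this for a harmonic oscillator; the grid size `N`, the box length `L`, the units m = ω = ħ = 1 and the finite-difference scheme are all assumptions made for the example, not part of the text.

```python
import numpy as np

# Assumed units and grid (illustrative choices only).
m, omega, hbar = 1.0, 1.0, 1.0
N, L = 800, 16.0
q = np.linspace(-L / 2, L / 2, N)
dq = q[1] - q[0]

# p^2 = -hbar^2 d^2/dq^2, approximated by central differences
# (wave functions are taken to vanish outside the box).
off_diagonal = np.ones(N - 1)
laplacian = (np.diag(off_diagonal, -1) - 2.0 * np.eye(N)
             + np.diag(off_diagonal, 1)) / dq**2

# H0(q, p) = p^2 / 2m + m omega^2 q^2 / 2, with q acting by multiplication.
H0 = -(hbar**2 / (2.0 * m)) * laplacian + np.diag(0.5 * m * omega**2 * q**2)

# The lowest eigenvalues should approximate hbar * omega * (n + 1/2).
energies = np.linalg.eigvalsh(H0)[:4]
print(energies)
```

With these parameters the lowest levels come out close to 0.5, 1.5, 2.5, 3.5, up to a discretization error of order dq².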
In a special-relativistic context, this structure remains the same, but the Evolution postulate above is extended to the requirement that the Hilbert space ℋ_0 carries a unitary representation of the Poincaré group and the hamiltonian H_0 is the generator of the time translations of this representation.

This structure is not generally relativistic. In particular, the notions of “state” and “observable” used above are the nonrelativistic ones. Can the structure of QM be extended to the relativistic framework? In Section 5.2 I discuss such an extension. As a preliminary step, however, in the rest of this section I introduce and illustrate some tools needed for this reformulation, in the context of a very simple system – as I did for classical mechanics.
5.1.1 Propagator and spacetime states
Nonrelativistic formulation. The quantum theory of the pendulum can be written on the Hilbert space ℋ_0 = L²[R] of wave functions ψ_0(α), in terms of the multiplicative position operator α, the momentum operator p_α = −iħ ∂/∂α, and the hamiltonian

    H_0 = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial \alpha^2} + \frac{m\omega^2}{2} \alpha^2    (5.4)
More precisely, the theory is defined on a rigged Hilbert space, or Gelfand triple. A Gelfand triple 𝒮 ⊂ ℋ ⊂ 𝒮′ is formed by a Hilbert space ℋ, a proper subset 𝒮 dense in ℋ and equipped with a weak topology, and the dual 𝒮′ of 𝒮, with their natural identifications. A manifold M with a measure dx determines a rigged Hilbert space 𝒮_M ⊂ ℋ_M ⊂ 𝒮′_M, where 𝒮_M is the space of smooth functions on M with fast decrease (Schwartz space), ℋ_M = L²[M, dx], and 𝒮′_M is the space of the tempered distributions on M. This setting allows us, in particular, to deal with eigenstates of observables with continuous spectrum and Fourier transforms.
The operators (here ħ = 1)

    \alpha(t) = e^{itH_0} \, \alpha \, e^{-itH_0}    (5.5)

which solve (5.2), are the Heisenberg position operators that give the position at any time t. Denote |α; t⟩ the generalized eigenstate of the operator α(t) with eigenvalue α (these eigenstates are in 𝒮′):

    \alpha(t) \, |\alpha; t\rangle = \alpha \, |\alpha; t\rangle    (5.6)

and |α⟩ = |α; 0⟩. Clearly, |α; t⟩ = e^{itH_0}|α⟩. Given a state |ψ⟩, the Schrödinger wave function

    \psi(\alpha, t) = \langle \alpha; t | \psi \rangle = \langle \alpha | e^{-itH_0} | \psi \rangle    (5.7)

satisfies the Schrödinger equation (5.1). Conversely, each solution of the Schrödinger equation, restricted to t = 0, defines a state in ℋ_0. Therefore there is a one-to-one correspondence between states at fixed time ψ_0(α) and solutions of the Schrödinger equation ψ(α, t). I call ℋ the space of the solutions of the Schrödinger equation. Thanks to the identification just mentioned, ℋ is a Hilbert space, isomorphic to the Hilbert space ℋ_0 of the states at fixed time. I call

    R_0 : \mathcal{H} \to \mathcal{H}_0    (5.8)
    \psi(\alpha, t) \mapsto \psi_0(\alpha) = \psi(\alpha, 0)    (5.9)

the identification map. The relation between ℋ and ℋ_0 is analogous to the relation between the spaces Γ and Γ_0 in classical mechanics, discussed in Chapter 3.
The propagator is defined as

    W(\alpha, t; \alpha', t') = \langle \alpha; t | \alpha'; t' \rangle = \langle \alpha | e^{-i(t-t')H_0} | \alpha' \rangle
                              = \sum_n H_n(\alpha) \, e^{-iE_n(t-t')} \, H_n(\alpha')    (5.10)

where H_n(α) is the eigenfunction of H_0 with eigenvalue E_n. Explicitly, a straightforward calculation that can be found in many books gives

    W(\alpha, t; \alpha', t') = \sqrt{\frac{m\omega}{i h \sin[\omega(t-t')]}} \; \exp\left\{ \frac{i\omega m}{2\hbar} \, \frac{(\alpha^2 + \alpha'^2) \cos[\omega(t-t')] - 2\alpha\alpha'}{\sin[\omega(t-t')]} \right\}    (5.11)

where h = 2πħ. The propagator satisfies the Schrödinger equation in the variables (α, t) (and the conjugate equation in the variables (α′, t′)).
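The statement that the closed-form harmonic-oscillator propagator solves the Schrödinger equation in (α, t) can be checked by machine. The sketch below is an illustration, not the book's derivation: it writes the standard closed form with T = t − t′ and h = 2πħ, forms the residual of iħ ∂_t W = H_0 W divided by W, and evaluates it at one arbitrarily chosen numerical point. The symbol names and the sample values are assumptions of the example.

```python
import sympy as sp

alpha, alpha_p, T = sp.symbols("alpha alpha_p T", real=True)
m, omega, hbar = sp.symbols("m omega hbar", positive=True)

s, c = sp.sin(omega * T), sp.cos(omega * T)

# Closed-form propagator, with T = t - t' and h = 2 pi hbar in the prefactor.
W = sp.sqrt(m * omega / (2 * sp.pi * sp.I * hbar * s)) * sp.exp(
    sp.I * m * omega / (2 * hbar * s)
    * ((alpha**2 + alpha_p**2) * c - 2 * alpha * alpha_p)
)

# Schrodinger equation in (alpha, t): i hbar dW/dt = H0 W, and d/dt = d/dT.
lhs = sp.I * hbar * sp.diff(W, T)
rhs = -hbar**2 / (2 * m) * sp.diff(W, alpha, 2) + m * omega**2 * alpha**2 * W / 2
relative_residual = (lhs - rhs) / W

# Evaluate at an arbitrary sample point (values chosen only for illustration).
point = {m: 1.3, omega: 0.7, hbar: 1.0, alpha: 0.4, alpha_p: -0.9, T: 0.8}
residual_value = complex(relative_residual.subs(point).evalf())
print(abs(residual_value))
```

A residual at the level of floating-point rounding is what one expects if the closed form is right; a single sample point is of course evidence, not a proof.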
Spacetime states. It is convenient to consider the following states. Given any compact support complex function f(α, t), the state

    |f\rangle = \int d\alpha \, dt \; f(\alpha, t) \, |\alpha; t\rangle    (5.12)

is in ℋ_0, and is called the “spacetime smeared state,” or simply the “spacetime state,” of the function f(α, t). Since standard normalizable states are dimensionless (for ⟨ψ|ψ⟩ = 1 to make sense) and the states |α; t⟩ have dimension L^{−1/2}, the function f must have dimensions T^{−1}L^{−1/2}. These states generalize the conventional wave packets, for which f(α, t) = f(α)δ(t). Conventional wave packets can be thought of as being associated with results of instantaneous position measurements with finite resolution in space; as I will illustrate later on, spacetime states can be associated with realistic measurements, where the measuring apparatus has finite resolution in space as well as in time. The Schrödinger wave function of |f⟩ is

    \psi_f(\alpha, t) = \langle \alpha; t | f \rangle
                      = \langle \alpha; t | \int d\alpha' \, dt' \; f(\alpha', t') \, |\alpha'; t'\rangle
                      = \int d\alpha' \, dt' \; W(\alpha, t; \alpha', t') \, f(\alpha', t')    (5.13)
and satisfies the Schrödinger equation. The scalar product of two spacetime states is

    \langle f | f' \rangle = \int d\alpha \, dt \, d\alpha' \, dt' \; \overline{f(\alpha, t)} \, W(\alpha, t; \alpha', t') \, f'(\alpha', t')    (5.14)

In particular, we can associate a normalized state |R⟩ to each spacetime region R:

    |R\rangle = C_R \int_R d\alpha \, dt \; |\alpha; t\rangle    (5.15)

where the factor

    C_R^{-2} = \int_R d\alpha \, dt \int_R d\alpha' \, dt' \; W(\alpha, t; \alpha', t')    (5.16)

fixes the normalization ⟨R|R⟩ = 1, as well as giving the state the right dimensions.
5.1.2 Kinematical state space 𝒦 and “projector” P
As discussed in Chapter 3, the kinematics of a pendulum is described by two partial observables: time t and elongation α. These coordinatize the relativistic configuration space 𝒞. The classical relativistic formalism treats α and t on an equal footing. The quantum relativistic formalism, as well, treats α and t on an equal footing, and therefore it is based on functions f(α, t) on 𝒞.

To be precise, let 𝒮 ⊂ 𝒦 ⊂ 𝒮′ be the Gelfand triple defined by 𝒞 and the measure dα dt. That is, 𝒮 is the space of the smooth functions f(α, t) on 𝒞 with fast decrease, 𝒦 = L²[𝒞, dα dt], and 𝒮′ is formed by the tempered distributions over 𝒞.

I call 𝒮 the “kinematical state space” and its elements f(α, t) “kinematical states.”
In the relativistic formalism, the dynamics of the system is defined by the relativistic hamiltonian H(α, t, p, p_t) given in (3.24). The quantum dynamics is defined by the “Wheeler–DeWitt” (WdW) equation

    H \psi(\alpha, t) = 0    (5.17)

where

    H = H\left( \alpha, t, -i\hbar \frac{\partial}{\partial \alpha}, -i\hbar \frac{\partial}{\partial t} \right)
      = -i\hbar \frac{\partial}{\partial t} + H_0
      = -i\hbar \frac{\partial}{\partial t} - \frac{\hbar^2}{2m} \frac{\partial^2}{\partial \alpha^2} + \frac{m\omega^2}{2} \alpha^2    (5.18)

and H_0 is given in (5.4). In the case of the pendulum, (5.17) reduces to the Schrödinger equation (5.1), but (5.17) is more general than the Schrödinger equation, because in general H does not have the nonrelativistic form H = p_t + H_0. Solutions ψ(α, t) of this equation form a linear space ℋ, which carries a natural scalar product that I will construct in a moment. The key object for the relativistic quantum theory is the operator

    P = \int d\tau \; e^{-i\tau H}    (5.19)

defined on 𝒮′. This operator maps arbitrary functions f(α, t) into solutions of the WdW equation (5.17), namely into ℋ.
To see this, expand a function f(α, t) as

    f(\alpha, t) = \sum_n \int dE \; f_n(E) \, H_n(\alpha) \, e^{-iEt}    (5.20)

Acting with P on this function, we obtain

    [Pf](\alpha, t) = \int d\tau \; e^{-i\tau H} \sum_n \int dE \; f_n(E) \, H_n(\alpha) \, e^{-iEt}
                    = \int d\tau \sum_n \int dE \; e^{-i\tau(-E + E_n)} \, f_n(E) \, H_n(\alpha) \, e^{-iEt}
                    = \sum_n \int dE \; \delta(E - E_n) \, f_n(E) \, H_n(\alpha) \, e^{-iEt}
                    = \sum_n \psi_n \, H_n(\alpha) \, e^{-iE_n t}    (5.21)

where ψ_n = f_n(E_n), which is the general solution of (5.17). Therefore P sends arbitrary functions into solutions of the WdW equation. Intuitively, P ∼ δ(H).
The integral kernel of P is the propagator (5.10). Indeed, the inverse of (5.20) gives

    \psi_n = f_n(E_n) = \int d\alpha \, dt \; H_n(\alpha) \, e^{iE_n t} \, f(\alpha, t)    (5.22)

Inserting this in (5.21), we have

    [Pf](\alpha, t) = \sum_n \int d\alpha' \, dt' \; H_n(\alpha') \, e^{iE_n t'} \, H_n(\alpha) \, e^{-iE_n t} \, f(\alpha', t')
                    = \int d\alpha' \, dt' \; W(\alpha, t; \alpha', t') \, f(\alpha', t')    (5.23)
P is often called “the projector,” although improperly so. Intuitively, it “projects” on the space of the solutions of the WdW equation. In some systems (when 0 is an eigenvalue in the discrete spectrum of H), P is indeed a projector. But generically, and in particular for the nonrelativistic systems (where 0 is in the continuum spectrum of H), P is not a projector, because its domain is smaller than the full 𝒮′. In particular, it does not contain the solutions of the WdW equation, namely P's codomain. The domain of P contains, on the other hand, 𝒮.
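The discrete-spectrum case, where P is a genuine projector, is easy to see in a toy model. The sketch below is an assumption-laden illustration: it takes a hypothetical 4×4 self-adjoint matrix `H` with integer spectrum containing 0, restricts the τ integral to one period [0, 2π), and inserts a normalization 1/2π, neither of which is fixed by the text (the appropriate integration range depends on the system). The averaged operator then annihilates every nonzero eigenvalue of H.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "relativistic hamiltonian": integer spectrum, 0 doubly degenerate.
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
V, _ = np.linalg.qr(X)                      # random unitary change of basis
spectrum = np.array([-1.0, 0.0, 0.0, 2.0])
H = V @ np.diag(spectrum) @ V.conj().T


def group_average(H, n_steps=16):
    """(1/2pi) * integral over [0, 2pi) of exp(-i tau H), as a Riemann sum.

    For integer eigenvalues smaller in modulus than n_steps the sum is exact.
    """
    energies, U = np.linalg.eigh(H)
    P = np.zeros_like(H, dtype=complex)
    for tau in 2.0 * np.pi * np.arange(n_steps) / n_steps:
        P += U @ np.diag(np.exp(-1j * tau * energies)) @ U.conj().T
    return P / n_steps


P = group_average(H)

# An arbitrary "kinematical state" f is mapped to a solution of H psi = 0.
f = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = P @ f

# "Physical" scalar product <f|P|f>, here equal to the squared norm of psi.
physical_norm = np.vdot(f, P @ f).real
print(np.linalg.norm(H @ psi), physical_norm)
```

In this finite-dimensional setting P² = P holds, and ⟨f|P|f⟩ coincides with ‖Pf‖²; in the continuum case discussed in the text neither statement can be taken literally, which is why the name "projector" is improper there.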
The matrix elements of P

    \langle f | P | f' \rangle_{\mathcal{K}} = \int d\alpha \, dt \, d\alpha' \, dt' \; \overline{f(\alpha, t)} \, W(\alpha, t; \alpha', t') \, f'(\alpha', t')    (5.24)

define a degenerate inner product in 𝒮. Dividing 𝒮 by the kernel of this inner product, that is, identifying f and f′ if Pf = Pf′, and completing in norm, we obtain a Hilbert space. But if Pf = Pf′, then f and f′ define the same solution of the WdW equation. In fact, they define the solution that corresponds to the spacetime state |f⟩ defined above. Therefore an element of this Hilbert space corresponds to a solution of the WdW equation: the Hilbert space can be identified with the space ℋ of the solutions of the WdW equation. Therefore

    P : \mathcal{S} \to \mathcal{H}, \qquad f \mapsto |f\rangle    (5.25)

It follows that P directly equips the space ℋ of the solutions with a Hilbert space structure: if ψ = Pf and ψ′ = Pf′ are two solutions of the WdW equation (5.17), their scalar product is defined by

    \langle \psi | \psi' \rangle \equiv \langle f | P | f' \rangle_{\mathcal{K}}    (5.26)

where the right-hand side is the scalar product in 𝒦, and is explicitly given in (5.24).
Notice that the scalar product on the space of the solutions of the WdW equation can be defined just by using the relativistic operator P, without any need of picking out t as a preferred variable.

For all nonrelativistic systems, the configuration space has the structure 𝒞 = 𝒞_0 × R, where t ∈ R, and a function ψ(α, t) in ℋ is uniquely determined by its restriction ψ_t = R_t ψ on 𝒞_0, for a fixed t:

    \psi_t(\alpha) \equiv \psi(\alpha, t)    (5.27)

For each t, denote ℋ_t the space of the L²[𝒞_0] functions ψ_t(α), so that R_t : ℋ → ℋ_t. The spaces ℋ and ℋ_t are in one-to-one correspondence: the inverse map R_t^{−1} is the evolution determined by equation (5.17). In particular, ℋ_0 is the Hilbert space used in the nonrelativistic formulation of the quantum theory. Under the identification between ℋ and ℋ_0 given by R_0, the scalar product defined above is precisely the usual scalar product of the nonrelativistic Hilbert space.
This can be directly seen by noticing that the right-hand side of (5.24) is precisely (5.14). More explicitly, let ψ(α, t) = Σ_n ψ_n H_n(α) e^{−iE_n t} be a function in ℋ, namely a solution of the WdW equation. Its restriction to t = 0 is ψ_0(α) = Σ_n ψ_n H_n(α), and its norm in ℋ_0 is ||ψ_0||² = ∫dα |ψ_0(α)|² = Σ_n |ψ_n|². A function f such that Pf = ψ is, for instance, simply f(α, t) = ψ_0(α)δ(t) = Σ_n ψ_n H_n(α) ∫dE e^{−iEt}. (This is actually not in 𝒮_0, but we could take a sequence of functions in 𝒮_0 converging to f. But f is in the domain of P, and such a procedure would not give anything new.) The norm of ψ is

    \|\psi\|^2 = \langle f | f \rangle_{\mathcal{H}} = \langle f | P | f \rangle_{\mathcal{K}}
               = \int d\tau \int d\alpha \int dt \; \overline{f(\alpha, t)} \, e^{-i\tau H} f(\alpha, t)
               = \int d\tau \int d\alpha \int dt \sum_n \overline{\psi_n} \, H_n(\alpha) \int dE \; e^{iEt} \; e^{-i\tau H} \sum_m \psi_m \, H_m(\alpha) \int dE' \; e^{-iE't}
               = \sum_n \int d\tau \int dt \int dE \int dE' \; |\psi_n|^2 \, e^{iEt} \, e^{-i\tau(-E' + E_n)} \, e^{-iE't}
               = \sum_n |\psi_n|^2 = \|\psi_0\|^2    (5.28)
5.1.3 Partial observables and probabilities
Consider two events (α, t) and (α′, t′) in the extended configuration space. Suppose we have observed the event (α′, t′). What is the probability of observing the event (α, t)?

To measure this probability, we need measuring apparata for α and for t. In general, these apparata will have a certain resolution, say Δα and Δt. The proper question is therefore: what is the probability of observing an event included in the region R = (α ± Δα, t ± Δt)? It is important to remark that no realistic measuring device or detector can have Δα = 0, nor Δt = 0. Most QM textbooks put much emphasis on the fact that Δα > 0 and completely ignore the fact that Δt > 0. Consider thus two regions R and R′. If a detector at R′ has detected the pendulum, what is the probability P_{RR′} that a detector at R detects the pendulum?
If the regions R and R′ are much smaller than any other physical quantity in the problem, including the spatial and temporal separation of R and R′, a direct application of perturbation theory shows that

    P_{RR'} = \gamma^2 \, |\langle R | R' \rangle|^2    (5.29)

where γ² is a dimensionless constant related to the efficiency of the detector. (We may assume that a “perfect” detector is defined by γ = 1.) The reader can repeat the calculation himself, or find it, for instance, in [144]. Explicitly, we can write this probability as the modulus square of the amplitude, P_{RR′} = |A_{RR′}|², with

    A_{RR'} = \gamma \, \frac{\langle R | R' \rangle}{\sqrt{\langle R | R \rangle} \, \sqrt{\langle R' | R' \rangle}}    (5.30)

    \langle R | R' \rangle = \int_R d\alpha \, dt \int_{R'} d\alpha' \, dt' \; W(\alpha, t; \alpha', t')    (5.31)

Therefore the propagator has all the information about transition probabilities.
Assume that R is sufficiently small that the wave function ψ(α, t) = ⟨α; t|R′⟩ is constant within R, and has the value ψ(α, t). Then we can write the probability of the pendulum being detected in R as

    P_R = \gamma^2 \, (V_R C_R)^2 \, |\psi(\alpha, t)|^2    (5.32)

where V_R is the volume of the region R. Now, assume the region R has sides Δα, Δt. A direct calculation (see [144]) shows that if Δt ≪ mΔα²/ħ, then (V_R C_R)² is proportional to Δα; therefore

    P_R \sim \Delta\alpha \, |\psi(\alpha, t)|^2    (5.33)

So, for small regions we have the two important results that (i) the temporal resolution of the detector drops out from the detection probability, and (ii) the probability is proportional to the spatial resolution of the detector. Because of (i), we can forget the temporal resolution of the detector and take the idealized limit of an instantaneous detector. Because of (ii), we can associate a probability density in α to each infinitesimal interval dα in α. Fixing the overall normalization by requiring that an idealized perfect detector covering all values of α detects with certainty, this yields the result that |ψ(α, t)|² is the probability density in α to detect the system at (α, t) with an instantaneous detector. That is, we recover the conventional probabilistic interpretation of the wave function from (5.29).

In the opposite limit, when Δt ≫ mΔα²/ħ, (V_R C_R)² is proportional to (Δt)^{−1/2}. Therefore

    P_R \sim (\Delta t)^{-1/2} \, |\psi(\alpha, t)|^2    (5.34)

and we cannot associate a probability density in t with this detector, because the detection probability does not scale linearly with Δt. The different behavior of the probability in α and t is a consequence of the specific form of the dynamics.
Partial observables in quantum theory. Recall that α and t are partial observables. They determine commuting self-adjoint operators in 𝒦. These act simply by multiplication. Their common generalized eigenstates |α, t⟩ are in 𝒮′. The states |α, t⟩ satisfy

    \langle \alpha, t | P | \alpha', t' \rangle = W(\alpha, t; \alpha', t')    (5.35)

We can view the states |α, t⟩ as “kinematical states” that do not know anything about dynamics. They correspond to a single quantum event. The “kinematical” scalar product of these states in 𝒦, given below in (5.36), expresses only their independence, while the “physical” scalar product of these states in ℋ, given in (5.35), expresses the physical relation between the two events: it determines the probability that one event happens given that the other happened.
Do not confuse |α, t⟩ with |α; t⟩. The first is an eigenstate of α and t; the second is an eigenstate of α(t). They both determine (generalized) functions on 𝒞. The state |α, t⟩ determines a delta distribution at the point (α, t):

    \langle \alpha', t' | \alpha, t \rangle = \delta(\alpha', \alpha) \, \delta(t', t)    (5.36)

while the state |α; t⟩ determines a solution of the Schrödinger equation. This solution has support all over 𝒞, and is such that on the line t = constant it is a delta function in α:

    \langle \alpha'; t | \alpha; t \rangle = \delta(\alpha', \alpha)    (5.37)

while for different t's

    \langle \alpha; t | \alpha'; t' \rangle = W(\alpha, t; \alpha', t')    (5.38)

The relation between the two is simply

    |\alpha; t\rangle = P \, |\alpha, t\rangle    (5.39)

Notice that (5.38) and (5.39) give

    W(\alpha, t; \alpha', t') = \langle \alpha, t | P^\dagger P | \alpha', t' \rangle_{\mathcal{H}}    (5.40)

which is consistent with (5.35), because the definition of the scalar product in ℋ (indicated in (5.40) by ⟨·|·⟩_ℋ) is (5.26).
5.1.4 Boundary state space K and covariant vacuum |0⟩

In this subsection I introduce some notions that play an important role in the field theoretical context. Fix two times, t = 0 and t. Let ℋ_0 = L²[R, dα] be the space of the instantaneous quantum states ψ_0 at t = 0. Let ℋ_t ∼ ℋ_0 be the space of the instantaneous states ψ_t at t. The probability amplitude of measuring a state ψ_t at t, if the state ψ_0 was measured at t = 0, is

    A = \langle \psi_t | e^{-iH_0 t} | \psi_0 \rangle    (5.41)

Consider the boundary state space

    K_t = \mathcal{H}_t^* \otimes \mathcal{H}_0 = L^2[R^2, d\alpha \, d\alpha']    (5.42)

The linear functional ρ_t defined by

    \rho_t(\psi_t \otimes \psi_0) = \langle \psi_t | e^{-iH_0 t} | \psi_0 \rangle    (5.43)

is well defined on K_t. This functional captures the entire dynamical information about the system. A linear functional on a Hilbert space defines a state. I denote |0_t⟩ the state defined by ρ_t:

    \rho_t(\psi) = \langle 0_t | \psi \rangle_{K_t}    (5.44)

and call it the “dynamical vacuum” state in the boundary state space K_t.
These definitions can be given the following physical interpretation. We make a measurement on the system at t = 0 and a measurement at t. We can measure the positions (α, α′), or the momenta, or other combinations. The outcomes of the two measurements are not independent, because of the dynamics; but, to start with, let's ignore the dynamics. All possible outcomes of measurements at t = 0 (with their kinematical relations) are described by instantaneous states at t = 0, namely by the nonrelativistic Hilbert space ℋ_0. Similarly for t. If we ignore the dynamical correlations, we can view the two measurements as if they were done on two independent systems, and therefore we can describe the outcomes of the two measurements using the Hilbert space K_t. Dynamics is a correlation between the two measurements. These correlations are described by a probability amplitude associated with any given couple of states, namely, with any state in K_t.

It is a simple exercise, which I leave to the reader, to show that in the representation K_t = L²[R², dα dα′] the state |0_t⟩ is precisely the propagator

    \langle 0_t | \alpha, \alpha' \rangle = W(\alpha, t; \alpha', 0)    (5.45)
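Dynamical vacuum versus Minkowski vacuum. Denote |0_M⟩ the lowest eigenstate of H_0 in ℋ_0 (written here in units m = ω = ħ = 1, with the prefactor that normalizes it to unit norm):

    \langle \alpha | 0_M \rangle = H_0(\alpha) = \pi^{-1/4} \, e^{-\frac{1}{2}\alpha^2}    (5.46)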
and call it the “Minkowski” vacuum, because of its analogy with the vacuum state of the quantum field theories on Minkowski space. Consider the analytic continuation in imaginary time of the propagator (5.10):

    W(\alpha, -it; \alpha', 0) = \langle \alpha | e^{-H_0 t} | \alpha' \rangle = \sum_n H_n(\alpha) \, e^{-E_n t} \, H_n(\alpha')    (5.47)

For large t, only the lowest-energy state survives in the sum, and we have

    W(\alpha, -it; \alpha', 0) \xrightarrow[t \to \infty]{} H_0(\alpha) \, e^{-E_0 t} \, H_0(\alpha')    (5.48)

Using the definitions of the previous section, this can be written as

    \lim_{t \to \infty} e^{E_0 t} \, |0_{-it}\rangle = |0_M\rangle \otimes \langle 0_M|    (5.49)

(The ket and bra in the right-hand side are in ℋ_0, while the ket in the left-hand side is in K = ℋ_0^* ⊗ ℋ_0.) This expression relates the dynamical vacuum |0_t⟩ and the Minkowski vacuum |0_M⟩. We will use this equation to find the quantum states corresponding to Minkowski spacetime from the spinfoam formulation of quantum gravity.
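The large-time behavior of the imaginary-time propagator can be watched numerically. The following sketch rests on assumptions that are not in the text: a finite-difference grid for the oscillator hamiltonian, units m = ω = ħ = 1, and the particular values of `N`, `L` and `t`. It builds the spectral sum for the Euclidean propagator on the grid and compares e^{E_0 t} W with the product of ground-state wave functions.

```python
import numpy as np

# Assumed units and grid (illustrative choices only).
m = omega = hbar = 1.0
N, L = 600, 16.0
alpha = np.linspace(-L / 2, L / 2, N)
d = alpha[1] - alpha[0]

# Discretized H0 = -(hbar^2 / 2m) d^2/dalpha^2 + m omega^2 alpha^2 / 2.
off_diagonal = np.ones(N - 1)
laplacian = (np.diag(off_diagonal, -1) - 2.0 * np.eye(N)
             + np.diag(off_diagonal, 1)) / d**2
H0 = -(hbar**2 / (2.0 * m)) * laplacian + np.diag(0.5 * m * omega**2 * alpha**2)

E, V = np.linalg.eigh(H0)
H_n = V / np.sqrt(d)        # columns: eigenfunctions with sum |H_n|^2 * d = 1


def euclidean_propagator(t):
    """Spectral sum  sum_n H_n(alpha) exp(-E_n t / hbar) H_n(alpha')  on the grid."""
    return (H_n * np.exp(-E * t / hbar)) @ H_n.T


t = 12.0
rescaled = np.exp(E[0] * t / hbar) * euclidean_propagator(t)
limit = np.outer(H_n[:, 0], H_n[:, 0])          # H_0(alpha) H_0(alpha')
max_deviation = np.max(np.abs(rescaled - limit))
print(E[0], max_deviation)
```

The deviation is controlled by the first excited state and decays like e^{−(E_1 − E_0)t}, so it is already tiny at t = 12 in these units.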
The boundary state space K and covariant vacuum |0⟩. The construction above can be given a more covariant formulation as follows. Consider the Hilbert space

    K = \mathcal{K}^* \otimes \mathcal{K} = L^2[R^4, d\alpha \, dt \, d\alpha' \, dt'] = L^2[\mathcal{G}]    (5.50)

I call this space the “total” quantum space. The propagator defines a preferred state |0⟩ in K:

    \langle \alpha, t, \alpha', t' | 0 \rangle = W(\alpha, t; \alpha', t')    (5.51)

I call this state the covariant vacuum state.

To run a complete experiment in a one-dimensional quantum system, we need to measure two events: a “preparation” and a “measurement.” The space K describes all possible (a priori equal) outcomes of the measurements of these two events. Any couple of measurements is represented by operators on K, and any outcome is represented by a state ψ ∈ K, which is an eigenstate of these operators. The dynamics is given by the bra ⟨0|. The probability amplitude of the given outcome is determined by

    A = \langle 0 | \psi \rangle    (5.52)

This is a compact and fully covariant formulation of quantum dynamics.
5.1.5 Evolving constants of motion
The interpretation of the theory is already entirely contained in (5.29). Still, to make the connection with the nonrelativistic formalism more direct, we can also consider operators related to observable quantities whose probability distribution can be predicted by the theory.

In the classical theory, if we know the (relativistic) state of the pendulum, we can predict the value of α when t has value, say, t = T. In the quantum theory, there is an operator that corresponds to this physical prediction. It is, of course, the Heisenberg position operator (5.5) for t = T, that is, α(T). (For clarity, it is convenient to distinguish the particular numerical value T from the argument of the wave function t.) I now define and characterize this operator in a relativistic language.

First of all, notice that the operator α(T), defined on ℋ_0 in (5.5), is in fact well defined on ℋ as

    \alpha(T) = R_0^{-1} \, \alpha(T) \, R_0 = R_0^{-1} \, e^{iTH_0} \, \alpha \, e^{-iTH_0} \, R_0 = R_T^{-1} \, \alpha \, R_T    (5.53)

The operator α(T) can be directly defined on ℋ without referring to ℋ_0, as follows. Consider the operator

    a(T) = e^{-i\omega(T - t)} \left( \alpha + i \frac{p_\alpha}{m\omega} \right)    (5.54)
and its real part

    \alpha(T) = \mathrm{Re}\,[a(T)] = \frac{a(T) + a^\dagger(T)}{2}    (5.55)

defined on 𝒮. These operators commute with H, for any T. Therefore they are well defined on the space of the solutions of (5.17), namely on ℋ. The restriction of the operator (5.55) to ℋ is precisely the operator (5.53).

The operator α(T) is characterized by two properties. First, the fact that it commutes with the hamiltonian:

    [\alpha(T), H] = 0    (5.56)

Second, if we put T = t in the expressions (5.54), (5.55), we obtain α. That is, α(T) is defined as an operator function α(T)(α, p_α, t) such that

    \alpha(T)(\alpha, p_\alpha, T) = \alpha    (5.57)

Intuitively, these two equations determine α(T), since the second fixes it at t = T and the first evolves it for all t. Operators of this kind are called “evolving constants of motion.” They are “evolving” because they describe the evolution (here the evolution of α with respect to t); they are “constants of motion” because they commute with the hamiltonian. In GR, the operators of this kind are independent of the temporal coordinate.
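Both characterizing properties can be checked symbolically by letting the operators act on an arbitrary test function f(α, t). The sketch below is only an illustration under assumptions: ħ = 1, p_α = −i ∂/∂α, H = −i ∂/∂t + H_0 for the pendulum, and hypothetical helper names (`p_alpha`, `H`, `a`, `a_dag`, `alpha_T`) introduced for the example.

```python
import sympy as sp

alpha, t, T = sp.symbols("alpha t T", real=True)
m, omega = sp.symbols("m omega", positive=True)
f = sp.Function("f")(alpha, t)          # arbitrary test function


def p_alpha(g):
    """Momentum operator with hbar = 1."""
    return -sp.I * sp.diff(g, alpha)


def H(g):
    """Relativistic hamiltonian of the pendulum: H = p_t + H0, p_t = -i d/dt."""
    return (-sp.I * sp.diff(g, t)
            + p_alpha(p_alpha(g)) / (2 * m)
            + m * omega**2 * alpha**2 * g / 2)


def a(g):
    return sp.exp(-sp.I * omega * (T - t)) * (alpha * g + sp.I * p_alpha(g) / (m * omega))


def a_dag(g):
    return sp.exp(sp.I * omega * (T - t)) * (alpha * g - sp.I * p_alpha(g) / (m * omega))


def alpha_T(g):
    """Evolving constant of motion: the real part (a + a^dagger) / 2."""
    return (a(g) + a_dag(g)) / 2


# First property: [alpha(T), H] f = 0 for every f.
commutator = sp.simplify(sp.expand(alpha_T(H(f)) - H(alpha_T(f))))

# Second property: at T = t the operator reduces to multiplication by alpha.
at_equal_time = sp.simplify(alpha_T(f).subs(T, t) - alpha * f)

print(commutator, at_equal_time)
```

The cancellation in the commutator comes from the explicit t-dependence of the phase e^{∓iω(T − t)}, which compensates the commutator of α ± i p_α/(mω) with H_0.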
5.2 Relativistic QM

In the previous section, I used the example of a pendulum to introduce a certain number of notions on which a relativistic hamiltonian formulation of QM can be based. It is now time to attempt a general theory of relativistic QM.

5.2.1 General structure
Kinematical states. Kinematical states form a space 𝒮 in a rigged Hilbert space 𝒮 ⊂ 𝒦 ⊂ 𝒮′.

Partial observables. A partial observable is represented by a self-adjoint operator in 𝒦. Common eigenstates |s⟩ of a complete set of commuting partial observables are denoted quantum events.

Dynamics. Dynamics is defined by a self-adjoint operator H in 𝒦, the (relativistic) hamiltonian. The operator from 𝒮 to 𝒮′

    P = \int d\tau \; e^{-i\tau H}    (5.58)

is (sometimes improperly) called the projector. (The integration range in this integral depends on the system.) Its matrix elements

    W(s, s') = \langle s | P | s' \rangle    (5.59)

are called transition amplitudes.
Probability. Discrete spectrum: the probability of the quantum event s given the quantum event s′ is

    P_{ss'} = |W(s, s')|^2    (5.60)

where |s⟩ is normalized by ⟨s|P|s⟩ = 1. Continuous spectrum: the probability of a quantum event in a small spectral region R given a quantum event in a small spectral region R′ is

    P_{RR'} = \left| \frac{W(R, R')}{\sqrt{W(R, R)} \, \sqrt{W(R', R')}} \right|^2    (5.61)

where

    W(R, R') = \int_R ds \int_{R'} ds' \; W(s, s')    (5.62)
To this we may add
Boundary quantum space and covariant vacuum For a finite number ofdegrees of freedom the boundary Hilbert space K = Klowast otimes K rep-resents any observations of pairs of quantum events The covariantvacuum state |0〉 isin K defined by
〈0|(ψ otimes ψprime)〉K = 〈ψ|P |ψprime〉K (563)
expresses the dynamics It determines the correlation probabilityamplitude of any such observation The extension to QFT is con-sidered in Section 535
States A physical state is a solution of the WheelerndashDeWitt equation
Hψ = 0 (564)
Equivalently it is an element of the Hilbert space H defined by thequadratic form 〈 middot |P | middot 〉 on S (Elements of K are called kinematicalstates and elements of K are called boundary states)
Complete observables. A complete observable $A$ is represented by a self-adjoint operator on $\mathcal{H}$. A self-adjoint operator $A$ in $K$ defines a complete observable if

$$[A, H] = 0. \tag{5.65}$$

Projection. If the value of the observable $A$ is restricted to the spectral interval $I$, the state $\psi$ becomes the state $P_I\psi$, where $P_I$ is the spectral projector on the interval $I$. If an event corresponding to a sufficiently small region $R$ is detected, the state becomes $|R\rangle$.
A relativistic quantum system is defined by a rigged Hilbert space of kinematical states $K$ and a set of partial observables $A_i$, including a relativistic hamiltonian operator $H$. Alternatively, it is defined by giving the projector $P$.
Axiomatizations are meant to be clarifying, not prescriptive. The structure defined above is still tentative and perhaps incomplete. There are aspects of this structure that deserve to be better understood, clarified and specified. Among these are the precise meaning of the "smallness" of the region $R$ in the case of the continuous spectrum, and the correct treatment of repeated measurements. On the other hand, the conventional structure of QM is certainly physically incomplete in the light of GR. The above is an attempt to complete it, making it general relativistic.
5.2.2 Quantization and classical limit
In general, a quantum system $(K, A_i, H)$ has a classical limit, which is a relativistic mechanical system $(C, H)$ describing the results of observations on the system at scales and with accuracy larger than the Planck constant. In the classical limit, Heisenberg uncertainty can be neglected and a commuting set of partial observables $A_i$ can be taken as coordinates of a commutative relativistic configuration space $C$.
If we are given a classical system defined by a nonrelativistic configuration space $C$ with coordinates $q^a$ and by a relativistic hamiltonian $H(q^a, p_a)$, a solution of the quantization problem is provided by the multiplicative operators $q^a$, the derivative operators

$$p_a = -i\hbar\frac{\partial}{\partial q^a} \tag{5.66}$$

and the hamiltonian operator

$$H = H\!\left(q^a, -i\hbar\frac{\partial}{\partial q^a}\right) \tag{5.67}$$

on the Hilbert space $K = L_2[C, dq^a]$, or more precisely, the Gelfand triple determined by $C$ and the measure $dq^a$. The physics is entirely contained in the transition amplitudes

$$W(q^a, q'^a) = \langle q^a|P|q'^a\rangle \tag{5.68}$$

where the states $|q^a\rangle$ are the eigenstates of the multiplicative operators $q^a$.
In turn, the boundary space $\mathcal{K}$ has the structure

$$\mathcal{K} = L_2[G]. \tag{5.69}$$

As we shall see, this remains true in field theory and in quantum gravity. The space $G$ was defined in Section 3.2.5 for finite-dimensional systems, in Section 3.3.3 for field theories, and in Section 4.3.4 in the case of gravity.
In the limit $\hbar \to 0$, the Wheeler–DeWitt equation becomes the relativistic Hamilton–Jacobi equation (3.59), and the propagator has the form (writing $q \equiv (q^a)$)

$$W(q, q') \sim \sum_i A_i(q, q')\, e^{\frac{i}{\hbar} S_i(q, q')} \tag{5.70}$$

where $S_i(q, q')$ are the different branches of the Hamilton function, as in (3.89). Now, the reverse of each path is still a path. The Hamilton function and the amplitude of a reversed path acquire a minus sign, giving

$$W(q, q') \sim \sum_i A_i(q, q')\, \sin\!\left[\frac{1}{\hbar} S_i(q, q')\right] \tag{5.71}$$

and $W$ is real. Assuming only one path matters,

$$W(q, q') \sim A(q, q')\, \sin\!\left[\frac{1}{\hbar} S(q, q')\right] \tag{5.72}$$
and we can write, for instance,

$$\lim_{\hbar\to 0}\; \frac{1}{W}\; i\hbar\frac{\partial}{\partial q^a}\; i\hbar\frac{\partial}{\partial q^b}\; W(q, q') = \frac{\partial S(q, q')}{\partial q^a}\,\frac{\partial S(q, q')}{\partial q^b}. \tag{5.73}$$

This equation provides a precise relation between a quantum theory (entirely defined by the propagator $W(q, q')$) and a classical theory (entirely defined by the Hamilton function $S(q, q')$). Using (3.86) and (5.66), this equation can be written in the suggestive form

$$\lim_{\hbar\to 0}\; \frac{1}{W}\; p_a p_b\, W(q, q') = p_a(q, q')\, p_b(q, q'). \tag{5.74}$$
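A one-variable numerical sketch can make the limit (5.73) concrete. It assumes a constant amplitude $A$, a made-up Hamilton function $S(q) = q^2/2 + q$ at fixed $q'$, and central finite differences for the derivatives; none of these choices comes from the text. For this ansatz the exact value of the left-hand side is $S'^2 - \hbar S'' \cot(S/\hbar)$, so the correction is of order $\hbar$ away from the zeros of $W$.

```python
import math

def hamilton_function(q):
    """Hypothetical Hamilton function S(q) at fixed q' (illustration only)."""
    return 0.5 * q * q + q

def classical_momentum_squared(q):
    """(dS/dq)^2 for the S above: the right-hand side of (5.73)."""
    return (q + 1.0) ** 2

def quantum_momentum_squared(q, hbar, eps=1e-5):
    """(1/W) (i hbar d/dq)(i hbar d/dq) W, with W = sin(S/hbar), by central differences."""
    def W(x):
        return math.sin(hamilton_function(x) / hbar)
    second_derivative = (W(q + eps) - 2.0 * W(q) + W(q - eps)) / eps ** 2
    return -(hbar ** 2) * second_derivative / W(q)

q = 1.0
error_coarse = abs(quantum_momentum_squared(q, 0.1) - classical_momentum_squared(q))
error_fine = abs(quantum_momentum_squared(q, 0.01) - classical_momentum_squared(q))
```

The step `eps` must stay well below the oscillation length $\hbar / S'$ of $W$, which is why it is kept much smaller than the values of `hbar` used here.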
5.2.3 Examples: pendulum and timeless double pendulum
Pendulum. An example of relativistic formalism is provided by the quantization of the pendulum described in the previous section: the kinematical state space is $K = L_2[R^2, d\alpha\, dt]$. The partial observable operators are the multiplicative operators $\alpha$ and $t$ acting on the functions $\psi(\alpha, t)$ in $K$. Dynamics is defined by the operator $H$ given in (5.18). The Wheeler–DeWitt equation is therefore

$$\left(-i\hbar\frac{\partial}{\partial t} - \frac{\hbar^2}{2m}\frac{\partial^2}{\partial\alpha^2} + \frac{m\omega^2}{2}\alpha^2\right)\Psi(\alpha, t) = 0. \tag{5.75}$$
$\mathcal{H}$ is a space of solutions of this equation. The "projector" operator $P : K \to \mathcal{H}$ defined by $H$ is given in (5.23) and defines the scalar product in $\mathcal{H}$. Its matrix elements $W(\alpha, t, \alpha', t')$ between the common eigenstates of $\alpha$ and $t$ are given by the propagator (5.11). They express all predictions of the theory. Because of the specific form of $H$, these define a probability density in $\alpha$ but not in $t$, as explained in Section 5.1.3.
Equivalently, the quantum theory can be defined by the boundary state space $\mathcal{K} = L_2[G]$, where $G$ is the boundary space of the classical theory, with coordinates $(\alpha, t, \alpha', t')$, and the covariant vacuum state $\langle\alpha, t, \alpha', t'|0\rangle = W(\alpha, t, \alpha', t')$, which determines the amplitude $A = \langle 0|\psi\rangle$ of any possible outcome $\psi \in \mathcal{K}$ of a preparation/measurement experiment.
Timeless double pendulum. An example of a relativistic quantum system which cannot be expressed in terms of conventional relativistic quantum mechanics is provided by the quantum theory of the timeless system (3.40). The kinematical Hilbert space $K$ is $L_2[R^2, da\, db]$, and the Wheeler–DeWitt equation is

$$\frac{1}{2}\left(-\hbar^2\frac{\partial^2}{\partial a^2} - \hbar^2\frac{\partial^2}{\partial b^2} + a^2 + b^2 - 2E\right)\Psi(a, b) = 0. \tag{5.76}$$

Below, I describe this system in some detail.
States. Since $H = H_a + H_b - E$, where $H_a$ (resp. $H_b$) is the harmonic oscillator hamiltonian in the variable $a$ (resp. $b$), this equation is easy to solve by using the basis that diagonalizes the harmonic oscillator. Let

$$\psi_n(a) = \langle a|n\rangle = \frac{1}{\sqrt{n!}}\, H_n(a)\, e^{-a^2/2\hbar} \tag{5.77}$$

be the normalized $n$th eigenfunction of the harmonic oscillator, with eigenvalue $E_n = \hbar(n + 1/2)$. Here $H_n(a)$ is the $n$th Hermite polynomial. Then, clearly,

$$\Psi_{n_a, n_b}(a, b) = \psi_{n_a}(a)\,\psi_{n_b}(b) \equiv \langle a, b|n_a, n_b\rangle \tag{5.78}$$

solves (5.76) if

$$\hbar(n_a + n_b + 1) = E. \tag{5.79}$$
Therefore the quantum theory exists (with this ordering) only if $E/\hbar = N + 1$ is an integer, which we assume from now on. The general solution of (5.76) is

$$\Psi(a, b) = \sum_{n=0}^{N} c_n\, \psi_n(a)\, \psi_{N-n}(b). \tag{5.80}$$

Therefore $\mathcal{H}$ is an $(N+1)$-dimensional proper subspace of $K$. An orthonormal basis is formed by the $N + 1$ states $|n, N-n\rangle$ with $n = 0, \ldots, N$.
Projector. The projector $P : S \to \mathcal{H}$ is in fact a true projector, and can be written explicitly as

$$P = \sum_{n=0}^{N} |n, N-n\rangle\langle n, N-n|. \tag{5.81}$$
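This can be obtained from (5.58) by taking the integration range to be $2\pi$, determined by the range of $\tau$ in the classical hamiltonian evolution, or by the fact that $H$ is the generator of a $U(1)$ unitary action on $K$ with period $2\pi$. Indeed,

$$\begin{aligned}
\int_0^{2\pi} d\tau\; e^{-\frac{i}{\hbar}\tau H}
&= \int_0^{2\pi} d\tau \sum_{n_a, n_b} |n_a, n_b\rangle\; e^{-\frac{i}{\hbar}\tau\left(\hbar(n_a + n_b + 1) - E\right)}\; \langle n_a, n_b| \\
&= \sum_{n_a, n_b} |n_a, n_b\rangle\; \delta(n_a + n_b + 1 - E/\hbar)\; \langle n_a, n_b| \\
&= P.
\end{aligned} \tag{5.82}$$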
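The averaging over $\tau$ in (5.82) can be checked numerically. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, a truncation `n_max` of the number basis, a Riemann sum in place of the integral, and function names that are not from the text. The integral is divided by $2\pi$ so that the surviving phase equals 1; (5.82) leaves this constant factor implicit.

```python
import cmath
import math

def averaged_phase(m, steps=400):
    """Riemann sum for (1/2pi) * integral over [0, 2pi] of exp(-i tau m) d tau."""
    dtau = 2.0 * math.pi / steps
    total = sum(cmath.exp(-1j * k * dtau * m) for k in range(steps))
    return total * dtau / (2.0 * math.pi)

def surviving_states(N, n_max):
    """Number states |n_a, n_b> (n_a, n_b <= n_max) kept by the averaging when E/hbar = N + 1."""
    kept = []
    for na in range(n_max + 1):
        for nb in range(n_max + 1):
            m = na + nb + 1 - (N + 1)   # eigenvalue of H/hbar on |n_a, n_b>
            if abs(averaged_phase(m)) > 0.5:
                kept.append((na, nb))
    return kept

basis = surviving_states(N=3, n_max=4)
```

For `N = 3` only the pairs with `n_a + n_b = 3` survive, which reproduces the $N + 1 = 4$ basis states $|n, N-n\rangle$ of (5.81).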
Transition amplitudes. The transition amplitudes are the matrix elements of $P$. In the basis that diagonalizes $a$ and $b$,

$$W(a, b, a', b') = \langle a, b|P|a', b'\rangle = \sum_{n=0}^{N} \langle a, b|n, N-n\rangle\langle n, N-n|a', b'\rangle. \tag{5.83}$$

Explicitly, inserting the eigenfunctions (5.77), this is

$$W(a, b, a', b') = \sum_{n=0}^{N} \frac{1}{n!\,(N-n)!}\; H_n(a)\, H_{N-n}(b)\; H_n(a')\, H_{N-n}(b')\; e^{-(a^2 + b^2 + a'^2 + b'^2)/2\hbar}. \tag{5.84}$$
This function codes all the properties of the quantum system. Roughly, it determines the probability density of measuring $(a, b)$ if $(a', b')$ was measured. Let us study its properties.
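The transition amplitude (5.83) is easy to evaluate numerically. The following sketch makes several assumptions that are not in the text: units with $\hbar = 1$, the standard textbook normalization $1/\sqrt{2^n n! \sqrt{\pi}}$ of the oscillator eigenfunctions (rather than the schematic factor printed in (5.77)), and a finite-difference check of the Wheeler–DeWitt equation (5.76) on the unprimed arguments with $E = N + 1$.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x), via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def psi(n, x):
    """Normalized oscillator eigenfunction for hbar = 1 (standard normalization)."""
    norm = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return norm * hermite(n, x) * math.exp(-0.5 * x * x)

def W(a, b, a2, b2, N):
    """Transition amplitude (5.83): sum over the N + 1 physical states |n, N - n>."""
    return sum(psi(n, a) * psi(N - n, b) * psi(n, a2) * psi(N - n, b2)
               for n in range(N + 1))

def wdw_residual(a, b, a2, b2, N, eps=1e-3):
    """Finite-difference residual of (5.76) acting on (a, b), with E = N + 1."""
    w0 = W(a, b, a2, b2, N)
    d2a = (W(a + eps, b, a2, b2, N) - 2.0 * w0 + W(a - eps, b, a2, b2, N)) / eps ** 2
    d2b = (W(a, b + eps, a2, b2, N) - 2.0 * w0 + W(a, b - eps, a2, b2, N)) / eps ** 2
    return 0.5 * (-d2a - d2b + (a * a + b * b - 2.0 * (N + 1)) * w0)

residual = wdw_residual(0.3, -0.7, 1.1, 0.4, N=3)
```

The residual vanishes up to discretization error, and `W` is real and symmetric under exchange of the primed and unprimed pairs, as expected for the matrix elements of a projector in a real basis.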
Semiclassical limit of the projector. Notice that, by inserting (5.82) into (5.83), we can write the projector as

$$\begin{aligned}
W(a, b, a', b') &= \int_0^{2\pi} d\tau\; \langle a, b|e^{-\frac{i}{\hbar}H\tau}|a', b'\rangle \\
&= \int_0^{2\pi} d\tau\; e^{\frac{i}{\hbar}E\tau}\; \langle a|e^{-\frac{i}{\hbar}H_a\tau}|a'\rangle\,\langle b|e^{-\frac{i}{\hbar}H_b\tau}|b'\rangle,
\end{aligned} \tag{5.85}$$

$$W(a, b, a', b') = \int_0^{2\pi} d\tau\; e^{\frac{i}{\hbar}E\tau}\; W(a, a', \tau)\; W(b, b', \tau) \tag{5.86}$$
where $W(a, a', \tau)$ is the propagator of the harmonic oscillator in a physical time $\tau$, given in (5.11). Inserting (5.11) in (5.86), we obtain

$$W(a, b, a', b') = \int_0^{2\pi} d\tau\; \frac{1}{\sin\tau}\; e^{-\frac{i}{\hbar}S(a, b, a', b', \tau)} \tag{5.87}$$
where $S(a, b, a', b', \tau)$ is given in (3.101). We can evaluate this integral in a saddle-point approximation. This gives

$$W(a, b, a', b') \sim \sum_i \frac{1}{\sin\tau_i}\; e^{-\frac{i}{\hbar}S(a, a', b, b', \tau_i)} \tag{5.88}$$

where the $\tau_i$ are determined by

$$\left.\frac{\partial S(a, b, a', b', \tau)}{\partial\tau}\right|_{\tau = \tau_i(a, b, a', b')} = 0. \tag{5.89}$$
But this is precisely (3.102), which defines the value of $\tau$ giving the Hamilton function of the timeless system. This equation has two solutions, corresponding to the two portions into which the ellipse is cut. The relation between the two actions is given in (3.103). Recalling that $E/\hbar$ is an integer, this gives

$$W(a, b, a', b') \sim \frac{1}{\sin\tau(a, b, a', b')}\left(e^{-\frac{i}{\hbar}S(a, a', b, b')} - e^{\frac{i}{\hbar}S(a, a', b, b')}\right), \tag{5.90}$$

that is,

$$W(a, b, a', b') \sim \frac{1}{\sin\tau(a, b, a', b')}\; \sin\!\left[\frac{1}{\hbar}S(a, a', b, b')\right] \tag{5.91}$$
as in (5.72). Here $\sim$ indicates equality in the lowest order in $\hbar$. This equation expresses the precise relation between the quantum theory and the classical theory.
Propagation "forward and backward in time". Notice that the two terms in (5.90) have two natural interpretations. One is that they represent the two classical paths going from $(a', b')$ to $(a, b)$ in $C$. The other, more interesting, interpretation is that they correspond to a trajectory going from $(a', b')$ to $(a, b)$ and a "time-reversed" trajectory going from $(a, b)$ to $(a', b')$. In fact, the projector (which, recall, is real) can be naturally interpreted as the sum of two propagators: one going forward and one going backward in the parameter time $\tau$.
The distinction between forward and backward in the parameter time $\tau$ has no physical significance in the classical theory, because the physics is only in the ellipses in $C$, not in the orientation of the ellipses.
However, in the quantum theory we can identify in $\mathcal{H}$ "clockwise-moving" and "anticlockwise-moving" components. These components are the eigenspaces of the positive and negative eigenvalues of the angular momentum operator $L = a\,\partial_b - b\,\partial_a$ (or $L = \partial_\phi$, where $a = r\sin\phi$, $b = r\cos\phi$). Thus, we can write wave packets "traveling along the ellipses purely forward or purely backward in the parameter time". If we consider only a local evolution in a small region of $C$, and we interpret, say, $b$ as the independent time variable and $a$ as the dynamical variable, then these two components have, respectively, positive and negative energy. In a sense, they can be viewed as particles and antiparticles.
5.3 Quantum field theory
I assume the reader is familiar with standard quantum field theory (QFT). Here I illustrate the connection between QFT and the relativistic formalism developed above, and I recall a few techniques that will be used in Part II and are not widely known. Of particular importance are the distinction between Minkowski vacuum and covariant vacuum, the functional representation of a field theory, and the construction of the physical Hilbert space of lattice Yang–Mills theory.
In Chapter 3, we have seen that a classical field theory can be defined covariantly by the boundary space $G$ of closed surfaces $\alpha$ in a finite-dimensional space $C$, and a relativistic hamiltonian $H$ on $T^*G$. For instance, in a scalar field theory, $C = M \times R$ has coordinates $(x^\mu, \phi)$, where $x^\mu$ is a point in Minkowski space and $\phi$ a field value. A surface $\alpha$ is determined by the two functions

$$\alpha = [x^\mu(\tau), \varphi(\tau)] \tag{5.92}$$

and determines a boundary 3-surface $x^\mu(\tau)$ in Minkowski space $M$ and boundary values $\phi(x(\tau)) = \varphi(\tau)$ of the field on this surface.
A quantization of the theory can be obtained precisely as in the finite-dimensional case, in terms of a boundary state space $\mathcal{K}$ of functionals $\Psi[\alpha]$ on $G$. Notice, however, that the difference between the kinematical state space $K$ and the boundary state space $\mathcal{K}$ is far less significant in field theory than for finite-dimensional systems. In the finite-dimensional case, the states $\psi(q^a)$ in $K$ are functions on the extended configuration space $C$, while the states $\psi(q^a, q'^a)$ in $\mathcal{K}$ are functions on the boundary space $G = C \times C$. In the field theoretical case, both states have the form $\Psi[\alpha]$. The difference is that the states in $K$ are functions of an "initial" surface $\alpha$, where $x^\mu(\tau)$ can be, for instance, the spacelike surface $x^0 = 0$; in this case, $\alpha$ contains only one-half of the data needed to determine a solution of the field equations. On the other hand, the states $\Psi[\alpha]$ in $\mathcal{K}$ are functions of a closed surface $\alpha$. In fact, the only difference between $K$ and $\mathcal{K}$ is in the global topology of $\alpha$. If we disregard this and consider local equations, we can confuse $K$ and $\mathcal{K}$ (see Section 5.3.5).
The relativistic hamiltonian is given in (3.192). The complete solution of the classical dynamics is known if we know the Hamilton function $S[\alpha]$, which is the value of the action

$$S[\alpha] = S[R, \phi] = \int_R \mathcal{L}(\phi(x), \partial_\mu\phi(x))\; d^4x \tag{5.93}$$

where $R$ is the four-dimensional region bounded by $x(\tau)$, and $\phi(x)$ is the solution of the equations of motion in this region, determined by the boundary data $\phi(x(\tau)) = \varphi(\tau)$. If there is more than one of these solutions, we write them as $\phi_i(x)$, and the Hamilton function is multivalued:

$$S_i[\alpha] = S[R, \phi_i] = \int_R \mathcal{L}(\phi_i(x), \partial_\mu\phi_i(x))\; d^4x. \tag{5.94}$$
The relativistic hamiltonian gives rise to the Wheeler–DeWitt equation

$$H\!\left[x^\mu, \phi, -i\hbar\frac{\delta}{\delta x^\mu}, -i\hbar\frac{\delta}{\delta\varphi}\right]\!(\tau)\; \Psi[\alpha] = 0 \tag{5.95}$$

precisely as in the finite-dimensional case. The Hamilton–Jacobi equation (3.193) can be interpreted as the eikonal approximation for this wave equation.
The complete solution of the dynamics is known if we know the propagator $W[\alpha]$, which is a solution of this equation. Formally, the field propagator can be written as a functional integral

$$W[\alpha] = \int_{\phi(x(\tau)) = \varphi(\tau)} [D\phi]\; e^{-\frac{i}{\hbar}S[R, \phi]}. \tag{5.96}$$

Of course, one should not confuse the field propagator $W[\alpha]$ with the Feynman propagator. The first propagates the field, the second the particles of a QFT. The first is a functional of a surface and the value of the field on this surface; the second is a function of two spacetime points. To the lowest order in $\hbar$, the saddle-point approximation gives

$$W[\alpha] \sim \sum_i A_i[\alpha]\; e^{-\frac{i}{\hbar}S_i[\alpha]}. \tag{5.97}$$
There are two characteristic difficulties in the field theoretical context that are absent in finite dimensions: the definition of the scalar product and the need to regularize operator products.
First, in finite dimensions a measure $dq^a$ on $C$ is sufficient to define an associated $L_2$ Hilbert space of wave functions. In the field theoretical case, we have to define the scalar product in some other way. The scalar product must respect the invariances of the theory and must be such that real classical variables be represented by self-adjoint operators. This is because self-adjoint operators have a real spectrum, and the spectrum determines the values that a quantity can take in a measurement. Given a set of linear operators on a linear space, the requirement that they are self-adjoint puts stringent conditions on the scalar product. As we shall see, in all cases of interest these requirements are sufficient to determine the scalar product.
Second, local operators are in general distributions, and their products are ill defined. Operator products arise in physical observable quantities as well as in the dynamical equation, namely in the Wheeler–DeWitt equation. In particular, functional derivatives are distributions. In the classical Hamilton–Jacobi equation, we have products of functional derivatives of the Hamilton–Jacobi functional, which are well-defined products of functions. In the corresponding quantum Wheeler–DeWitt equation, these become products of functional-derivative operators, which are ill defined without an appropriate renormalization procedure. The definition of generally covariant regularization techniques will be a major concern in the second part of the book.
5.3.1 Functional representation
Consider a simple free scalar theory, where $V = 0$. I describe this well-known QFT in some detail in order to illustrate certain techniques that play a role in quantum gravity. In particular, I illustrate the functional representation of quantum field theory, a simple form of the Wheeler–DeWitt equation, the general form of $W[\alpha]$, and its physical interpretation. The functional representation is the representation in which the field operator is diagonal. The quantum states will be represented as functionals $\Psi[\phi] = \langle\phi|\Psi\rangle$, where $|\phi\rangle$ is the (generalized) eigenstate of the field operator with eigenvalue $\phi(x)$. The relation between this representation and the conventional one on the Fock basis $|k_1, \ldots, k_n\rangle$ is precisely the same as the relation between the Schrödinger representation $\psi(x)$ and the one on the energy basis $|n\rangle$ for a simple harmonic oscillator. I also illustrate the way in which the scalar product on the space of the solutions of the Wheeler–DeWitt equation is determined by the reality properties of the field operators.
To start with, and to connect the generally covariant formalism described above with conventional QFT, let's restrict the surface $x(\tau)$ in $\alpha$ to a spacelike surface $x^\mu(\tau) = (t, \tau)$ in Minkowski space. Then $\alpha = [t, \phi(x)]$ and $\Psi[\alpha] = \Psi[t, \phi(x)]$. The Hamilton–Jacobi equation (3.186) reduces to (3.190). The corresponding quantum Wheeler–DeWitt equation becomes

$$i\hbar\frac{\partial}{\partial t}\Psi = H_0\Psi \tag{5.98}$$

where the nonrelativistic hamiltonian operator $H_0$ is

$$H_0 = \int d^3x\; H_0\!\left[\phi, -i\hbar\frac{\delta}{\delta\phi}\right]\!(x) \tag{5.99}$$
and $H_0[\phi, p](x)$ is given in (3.195). The factor ordering of this operator can be chosen in order to avoid the divergence that would result from the naive factor ordering

$$H_0^{\rm naive} = \frac{1}{2}\int d^3x \left[-\hbar^2\frac{\delta}{\delta\phi(x)}\frac{\delta}{\delta\phi(x)} + |\nabla\phi|^2(x) + m^2\phi^2(x)\right]. \tag{5.100}$$
The Fourier modes

$$\phi(k) = (2\pi)^{-3/2}\int d^3x\; e^{+ik\cdot x}\,\phi(x) \tag{5.101}$$

decouple:

$$H_0 = \frac{1}{2}\int d^3k \left[p^2(k) + \omega^2(k)\,\phi^2(k)\right] \tag{5.102}$$

where $\omega = \sqrt{|k|^2 + m^2}$. The dangerous divergence is produced by the vacuum energy of the quantum oscillators associated with each mode $k$, and can be avoided by normal ordering. In terms of the positive and negative frequency fields

$$a(k) = \frac{i}{\sqrt{2\omega}}\, p(k) + \sqrt{\frac{\omega}{2}}\,\phi(k), \tag{5.103}$$

$$a^\dagger(k) = -\frac{i}{\sqrt{2\omega}}\, p(-k) + \sqrt{\frac{\omega}{2}}\,\phi(-k), \tag{5.104}$$
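the hamiltonian reads

$$H_0 = \int d^3k\; \omega(k)\, a^\dagger(k)\, a(k). \tag{5.105}$$

We define the quantum hamiltonian by this equation, where

$$a(k) = \frac{\hbar}{\sqrt{2\omega}}\frac{\delta}{\delta\phi(k)} + \sqrt{\frac{\omega}{2}}\,\phi(k), \tag{5.106}$$

$$a^\dagger(k) = -\frac{\hbar}{\sqrt{2\omega}}\frac{\delta}{\delta\phi(-k)} + \sqrt{\frac{\omega}{2}}\,\phi(-k). \tag{5.107}$$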
The lowest-energy eigenvector of the hamiltonian has vanishing eigenvalue and is called the Minkowski vacuum state. This state is usually denoted $|0\rangle$. I denote it here as $|0_M\rangle$, where M stands for Minkowski, in order to distinguish it from other vacuum states that will be introduced later on. The Minkowski vacuum state is determined by $a(k)|0_M\rangle = 0$. In the functional representation, this state reads

$$\Psi_{0_M}[\phi] \equiv \langle\phi|0_M\rangle \tag{5.108}$$

and is determined by

$$a(k)\,\Psi_{0_M}[\phi] = \frac{\hbar}{\sqrt{2\omega}}\frac{\delta}{\delta\phi(k)}\Psi_{0_M}[\phi] + \sqrt{\frac{\omega}{2}}\,\phi(k)\,\Psi_{0_M}[\phi] = 0. \tag{5.109}$$
The solution of this equation gives the functional form of the vacuum state:

$$\Psi_{0_M}[\phi] = N\, e^{-\frac{1}{2\hbar}\int d^3k\; \omega(k)\,\phi(k)\,\phi(k)}. \tag{5.110}$$
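The vacuum condition (5.109) can be checked for a single mode. The sketch below is a simplification assumed for illustration: one real mode amplitude `phi` replaces the field $\phi(k)$, the functional derivative becomes an ordinary derivative approximated by a central difference, the normalization $N$ is dropped, and the function names are hypothetical.

```python
import math

def vacuum(phi, omega, hbar=1.0):
    """Single-mode version of (5.110), normalization dropped."""
    return math.exp(-omega * phi * phi / (2.0 * hbar))

def annihilate(state, phi, omega, hbar=1.0, eps=1e-5):
    """Single-mode version of (5.106) applied to `state`, evaluated at `phi`."""
    derivative = (state(phi + eps) - state(phi - eps)) / (2.0 * eps)
    return (hbar / math.sqrt(2.0 * omega)) * derivative \
        + math.sqrt(omega / 2.0) * phi * state(phi)

omega = 2.0
residual = annihilate(lambda x: vacuum(x, omega), 0.7, omega)
```

The gaussian is annihilated up to discretization error, whereas a state such as `phi * vacuum(phi, omega)` is not.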
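The one-particle state with momentum $k$ is created by $a^\dagger(k)$:

$$\Psi_k[\phi] \equiv \langle\phi|k\rangle = a^\dagger(k)\,\Psi_{0_M}[\phi] = \sqrt{2\omega}\;\phi(k)\;\Psi_{0_M}[\phi]. \tag{5.111}$$

It has energy $\hbar\omega(k)$. Therefore the time-dependent state

$$\Psi_k[t, \phi] \equiv \sqrt{2\omega}\; e^{-i\omega(k)t}\;\phi(k)\;\Psi_{0_M}[\phi] \tag{5.112}$$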
is a solution of the Wheeler–DeWitt equation (5.98).
A generic one-particle state with wave function $f(k)$ is defined by

$$|f\rangle \equiv \int \frac{d^3k}{\sqrt{2\omega}}\; f(k)\, |k\rangle \tag{5.113}$$

and its functional representation is therefore

$$\Psi_f[\phi] \equiv \langle\phi|f\rangle = \int d^3k\; f(k)\,\phi(k)\,\Psi_0[\phi] \tag{5.114}$$

or

$$\Psi_f[\phi] = \phi[f]\;\Psi_0[\phi] \tag{5.115}$$
where

$$\phi[f] = \int d^3k\; f(k)\,\phi(k). \tag{5.116}$$

The corresponding solution of the Wheeler–DeWitt equation (5.98) is

$$\Psi_f[t, \phi] = \int d^3k\; f(k)\; e^{-i\omega(k)t}\;\phi(k)\;\Psi_0[\phi] \tag{5.117}$$

or, in Fourier transform,

$$\Psi_f[t, \phi] = \int d^3x\; F(t, x)\;\phi(x)\;\Psi_0[\phi] \tag{5.118}$$

where

$$F(x) = F(t, x) = (2\pi)^{-3/2}\int d^3k\; e^{i(k\cdot x - \omega(k)t)}\, f(k) \tag{5.119}$$
is a positive-energy solution of the Klein–Gordon equation.
The $n$-particle states $|k_1, \ldots, k_n\rangle$ can be obtained using again the creation operator $a^\dagger(k)$ in the well-known way. They have energy $\hbar(\omega_1 + \cdots + \omega_n)$, where $\omega_i = \omega(k_i)$. The general solution of the Wheeler–DeWitt equation is therefore

$$\Psi[t, \phi] = \sum_n \int \frac{d^3k_1 \cdots d^3k_n}{\sqrt{2\omega_1 \cdots 2\omega_n}}\; f(k_1, \ldots, k_n)\; e^{-i(\omega_1 + \cdots + \omega_n)t}\; a^\dagger(k_1) \cdots a^\dagger(k_n)\,\Psi_0[\phi]. \tag{5.120}$$

The space $\mathcal{F}$ of these solutions, labeled by the functions $f(k_1, \ldots, k_n)$, is the physical state space $\mathcal{H}$ of the theory. Since $\Psi[t, \phi]$ is determined by $\Psi[\phi] = \Psi[0, \phi]$, we can also represent the quantum states by their value on the $t = 0$ surface, namely as functionals $\Psi[\phi]$.
Scalar product. The scalar product can be determined on the space of the solutions of the Wheeler–DeWitt equation from the requirement that real quantities are represented by self-adjoint operators. The scalar field $\phi(x)$ and its momentum $p(x)$ are real. Therefore we must demand that the corresponding operators are self-adjoint. It follows that the operator $a^\dagger(k)$ is the adjoint of the operator $a(k)$. Using this, we obtain easily

$$\langle k|k'\rangle = \langle a^\dagger(k)\,0|a^\dagger(k')\,0\rangle = \langle 0|a(k)\,a^\dagger(k')\,0\rangle = \hbar\,\delta(k - k'). \tag{5.121}$$

It follows from (5.113) that

$$\langle f|f'\rangle = \hbar\int \frac{d^3k}{2\omega}\; \overline{f(k)}\, f'(k). \tag{5.122}$$
(Recall that $d^3k/2\omega$ is the Lorentz-invariant measure.) Therefore the one-particle state space is $\mathcal{H}_1 = L_2[R^3, d^3k/2\omega]$. Let us write

$$f(x) = (2\pi)^{-3/2}\int \frac{d^3k}{2\omega}\; e^{ik\cdot x}\, f(k). \tag{5.123}$$

The spacetime function

$$f(x) = f(t, x) = (2\pi)^{-3/2}\int \frac{d^3k}{2\omega}\; e^{i(k\cdot x - \omega(k)t)}\, f(k) \tag{5.124}$$

is a positive-frequency solution of the Klein–Gordon equation with initial value $f(0, x) = f(x)$ (not to be confused with the one defined in (5.119), which is $F = i\partial_t f$). Then easily

$$\langle f|f'\rangle = i\hbar\int d^3x \left[\overline{f(x)}\,\partial_0 f'(x) - f'(x)\,\partial_0\overline{f(x)}\right]_{t=0}. \tag{5.125}$$
This is the well-known Klein–Gordon scalar product, which is positive definite on the positive-frequency solutions.
Notice that the one-particle Hilbert space can be represented in various equivalent manners. It is:
• the space of the positive-frequency solutions $f(x)$ of the Klein–Gordon equation, with the scalar product (5.125);
• the space $\mathcal{H} = L_2[R^3, d^3k/2\omega]$ of the functions $f(k)$;
• the space $\mathcal{H} = L_2[R^4, \delta(k^2 + m^2)\,\theta(k^0)\,d^4k]$ of the functions $f(k)$;
• the space $\mathcal{H} = L_2[R^3, d^3x]$ of the functions

$$f(x) = \int \frac{d^3k}{\sqrt{2\omega}}\; e^{ik\cdot x}\, f(k) \tag{5.126}$$

(the position operator $x$ in this representation is obviously self-adjoint; it is the well-known Newton–Wigner operator, which has a far more complicated form in other representations);
• and so on.
Using the same technique, the entire space $\mathcal{F}$ can be equipped with a scalar product. The resulting Hilbert space is, of course, the well-known Fock space over this one-particle Hilbert space.
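As a worked check, the one-particle scalar product (5.122) follows in one line from the definition (5.113) and the normalization (5.121). The sketch below assumes, as is conventional but not stated explicitly in the text, that the scalar product is antilinear in its first argument:

```latex
\begin{aligned}
\langle f|f'\rangle
  &= \int \frac{d^3k}{\sqrt{2\omega(k)}} \int \frac{d^3k'}{\sqrt{2\omega(k')}}\;
     \overline{f(k)}\, f'(k')\, \langle k|k'\rangle \\
  &= \hbar \int \frac{d^3k}{\sqrt{2\omega(k)}} \int \frac{d^3k'}{\sqrt{2\omega(k')}}\;
     \overline{f(k)}\, f'(k')\, \delta(k - k') \\
  &= \hbar \int \frac{d^3k}{2\omega(k)}\; \overline{f(k)}\, f'(k).
\end{aligned}
```

The two factors of $1/\sqrt{2\omega}$ in (5.113) are what combine into the Lorentz-invariant measure $d^3k/2\omega$.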
5.3.2 Field propagator between parallel boundary surfaces
Consider now a surface $\Sigma_t$ formed by two parallel spacelike planes in Minkowski space, say $x_1^\mu(\tau) = (t_1, \tau)$ and $x_2^\mu(\tau) = (t_2, \tau)$. Consider two scalar fields $\varphi_1(\tau)$, $\varphi_2(\tau)$ on these planes. Let $\alpha$ be the union of these two surfaces with their fields; that is, $\alpha$ is formed by two disconnected components, $\alpha = \alpha_1 \cup \alpha_2 = [x_1^\mu(\tau), \varphi_1(\tau)] \cup [x_2^\mu(\tau), \varphi_2(\tau)]$. Consider the field propagator (5.96) for this value of $\alpha$. Thus, $W[\alpha] = W[t_1, \varphi_1, t_2, \varphi_2]$. In this case, we can simply write

$$W[t_1, \varphi_1, t_2, \varphi_2] = \langle t_1, \varphi_1|t_2, \varphi_2\rangle = \langle\varphi_1|e^{-\frac{i}{\hbar}H_0(t_1 - t_2)}|\varphi_2\rangle. \tag{5.127}$$
The calculation of the propagator is simplified by the fact that the quantum field theory is essentially a collection of one harmonic oscillator for each mode $k$. Using the propagator of the harmonic oscillator given in (5.11), one obtains with some algebra

$$W[t_1, \varphi_1, t_2, \varphi_2] = N \exp\left\{-\frac{i}{2\hbar}\int \frac{d^3k}{(2\pi)^3}\;\omega\;\frac{\left(|\varphi_1|^2 + |\varphi_2|^2\right)\cos[\omega(t_1 - t_2)] - 2\varphi_1\overline{\varphi_2}}{\sin[\omega(t_1 - t_2)]}\right\} \tag{5.128}$$

where $N$ is the formal divergent normalization factor

$$N \sim \prod_k \sqrt{\frac{m\,\omega(k)}{\hbar}}\; \exp\left\{-\frac{V}{2}\int \frac{d^3k}{(2\pi)^3}\, \ln\bigl[\sin[\omega(k)(t_1 - t_2)]\bigr]\right\}. \tag{5.129}$$

This has the form (5.97); see the classical Hamilton function given in (3.170).
Minkowski vacuum from the euclidean field propagator. The state space at time zero, $\mathcal{H}_{t=0}$, is Fock space, where the field operators $\varphi(x) = \phi(x, t)$ and the hamiltonian $H_0$ are defined. Fock space is separable and therefore admits countable bases. Choose a basis $|n\rangle$ of eigenstates of $H_0$ with eigenvalues $E_n$, and consider the operator

$$W(T) = \sum_n e^{-\frac{T}{\hbar}E_n}\; |n\rangle\langle n|. \tag{5.130}$$

In the large-$T$ limit, this becomes the projection on the only eigenstate with vanishing energy, namely the Minkowski vacuum:

$$\lim_{T\to\infty} W(T) = |0_M\rangle\langle 0_M|. \tag{5.131}$$
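A short numerical sketch shows how quickly (5.130) approaches the vacuum projector. The spectrum used below (a vanishing vacuum energy followed by a unit gap) is a made-up stand-in for the spectrum of $H_0$, and the function names are illustrative only.

```python
import math

def euclidean_weights(energies, T, hbar=1.0):
    """Diagonal entries exp(-T E_n / hbar) of W(T) in the energy basis, as in (5.130)."""
    return [math.exp(-T * E / hbar) for E in energies]

def largest_excited_weight(energies, T, hbar=1.0):
    """Largest weight left on a non-vacuum state; assumes energies[0] == 0 is the vacuum."""
    return max(euclidean_weights(energies[1:], T, hbar))

spectrum = [0.0, 1.0, 2.0, 3.0]   # hypothetical spectrum with unit gap
leftover = [largest_excited_weight(spectrum, T) for T in (1.0, 5.0, 10.0)]
```

The vacuum weight stays equal to 1 while the leftover weight falls as $e^{-T\,\Delta E/\hbar}$, with $\Delta E$ the gap above the vacuum.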
In the functional Schrödinger representation, the operator (5.130) reads

$$W[\varphi_1, \varphi_2, T] = \langle\varphi_1|e^{-\frac{i}{\hbar}H_0(-iT)}|\varphi_2\rangle = W[0, \varphi_1, iT, \varphi_2]. \tag{5.132}$$

Therefore it is the analytical continuation of the field propagator (5.127), and satisfies the euclidean Schrödinger equation

$$-\hbar\frac{\partial}{\partial T}W[\varphi_1, \varphi_2, T] = H_{\varphi_1}\, W[\varphi_1, \varphi_2, T]. \tag{5.133}$$

We can obtain the vacuum (up to normalization) as

$$\Psi_{0_M}[\varphi] = \langle\varphi|0_M\rangle = \lim_{T\to\infty} W[\varphi, 0, T]. \tag{5.134}$$
We can derive all particle scattering amplitudes from the functional $W[\varphi_1, \varphi_2, T]$. For instance, the 2-point function can be obtained as the analytic continuation of the Schwinger function

$$S(x_1, x_2) = \lim_{T\to\infty}\int D\varphi_1\, D\varphi_2\;\; W[0, \varphi_1, T]\;\varphi_1(x_1)\; W[\varphi_1, \varphi_2, (t_1 - t_2)]\;\varphi_2(x_2)\; W[\varphi_2, 0, T]. \tag{5.135}$$

This can be generalized to any $n$-point function where the times $t_1, \ldots, t_n$ are on the $t = 0$ and the $t = T$ surfaces; these, in turn, are sufficient to compute all scattering amplitudes, since time dependence of asymptotic states is trivial.
$W[\varphi_1, \varphi_2, T]$ admits the well-defined functional integral representation

$$W[\varphi_1, \varphi_2, T] = \int_{\substack{\phi|_{t=T} = \varphi_1 \\ \phi|_{t=0} = \varphi_2}} D\phi\; e^{-\frac{1}{\hbar}S^E_T[\phi]}. \tag{5.136}$$

Here the integral is over all fields $\phi$ on the strip $R$ bounded by the two surfaces $t = 0$ and $t = T$, with fixed boundary value. The action $S^E_T[\phi]$ is the euclidean action. Notice that, using this functional integral representation, the expression (5.135) for the Schwinger function becomes the well-known expression

$$S(x_1, x_2) = \int D\phi\;\; \phi(x_1)\,\phi(x_2)\; e^{-\frac{1}{\hbar}S^E[\phi]} \tag{5.137}$$
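obtained by joining at the two boundaries the three functional integrals in the regions $t < t_2$, $t_2 < t < t_1$ and $t_1 < t$. The functional $W[\varphi_1, \varphi_2, T]$ can be computed explicitly in the free field theory. Its expression in terms of the Fourier transform $\varphi$ of $\varphi$ is the analytic continuation of (5.128):

$$W[\varphi_1, \varphi_2, T] = N \exp\left\{-\frac{1}{2\hbar}\int \frac{d^3k}{(2\pi)^3}\;\omega \left(\frac{|\varphi_1|^2 + |\varphi_2|^2}{\tanh(\omega T)} - \frac{2\varphi_1\overline{\varphi_2}}{\sinh(\omega T)}\right)\right\}. \tag{5.138}$$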
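The way (5.138) reproduces the vacuum functional at large $T$ can be seen on a single mode. The sketch below assumes one real mode amplitude per boundary (so the complex conjugation plays no role), drops the normalization $N$ and the momentum integral, and uses an illustrative function name.

```python
import math

def log_euclidean_propagator(phi1, phi2, omega, T, hbar=1.0):
    """Exponent of (5.138) restricted to a single real mode (normalization dropped)."""
    return -(omega / (2.0 * hbar)) * (
        (phi1 ** 2 + phi2 ** 2) / math.tanh(omega * T)
        - 2.0 * phi1 * phi2 / math.sinh(omega * T)
    )

omega, phi = 1.5, 0.8
large_T_exponent = log_euclidean_propagator(phi, 0.0, omega, T=20.0)
vacuum_exponent = -omega * phi ** 2 / 2.0   # single-mode exponent of the gaussian vacuum
```

With the second boundary field set to zero and $\omega T \gg 1$, $\tanh(\omega T) \to 1$ and the exponent reduces to $-\omega\varphi^2/2\hbar$, the gaussian form of the Minkowski vacuum functional, in agreement with (5.134).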
The dynamical vacuum $|0_{\Sigma_T}\rangle$. Consider the boundary state space $\mathcal{K}_{\Sigma_t}$ associated with the entire surface $\Sigma_t$, as in Section 5.1.4. That is, define $\mathcal{K}_{\Sigma_t} = \mathcal{H}_t \otimes \mathcal{H}_0^*$. Denote $\varphi = (\varphi_1, \varphi_2)$ a field on $\Sigma_t$. The field basis of the Fock space induces the basis $|\varphi\rangle = |\varphi_1, \varphi_2\rangle \equiv |\varphi_1\rangle_t \otimes \langle\varphi_2|_0$ in $\mathcal{K}_{\Sigma_t}$: the vectors $|\Psi\rangle$ of $\mathcal{K}_{\Sigma_t}$ are written in this basis as functionals $\Psi[\varphi] = \Psi[\varphi_1, \varphi_2] \equiv \langle\varphi_1, \varphi_2|\Psi\rangle$. This is the field theoretical generalization of the boundary state space defined in (5.42).
The functional $W$ defines a preferred state in this Hilbert space, as in (5.43)–(5.44). Denote this state $|0_{\Sigma_t}\rangle$, and call it the dynamical vacuum. It is defined by $\langle\varphi|0_{\Sigma_t}\rangle \equiv W[t, \varphi_1, 0, \varphi_2]$. This state expresses the dynamics from $t = 0$ to $t$. A state in the tensor product of two Hilbert spaces defines a linear mapping between the two spaces. The linear mapping from $\mathcal{H}_{t=0}$ to $\mathcal{H}_{t=T}$ defined by $|0_{\Sigma_T}\rangle$ is precisely the time evolution $e^{-iHt}$.
The interpretation of this state is the same as in the finite-dimensional case. The tensor product of two quantum state spaces describes the ensemble of the measurements described by the two factors. Therefore $\mathcal{K}_{\Sigma_t}$ is the space of the possible results of all measurements performed at time 0 and at time $t$. Observations at two different times are correlated by the dynamics. Hence $\mathcal{K}_{\Sigma_t}$ is a "kinematical" state space, in the sense that it describes more outcomes than the physically realizable ones. Dynamics is then a restriction on the possible outcomes of observations. It expresses the fact that measurement outcomes are correlated. The linear functional $\langle 0_{\Sigma_t}|$ on $\mathcal{K}_{\Sigma_t}$ assigns an amplitude to any outcome of observations. This amplitude gives us the correlation between outcomes at time 0 and outcomes at time $t$.
Therefore the theory can be represented as follows. The Hilbert space $\mathcal{K}_{\Sigma_t}$ describes all possible outcomes of measurements made on $\Sigma_t$. The dynamics is given by a single bra state $\langle 0_{\Sigma_t}| : \mathcal{K}_t \to C$. For a given collection of measurement outcomes described by a state $|\Psi\rangle$, the quantity $\langle 0_{\Sigma_t}|\Psi\rangle$ gives the correlation probability amplitude between these measurements.
Using (5.131), we have then the relation between the dynamical vacuum and the Minkowski vacuum (the bra/ket mismatch is apparent only, as the three states are in different spaces):

$$\lim_{t\to\infty} |0_{\Sigma_{-it}}\rangle = |0_M\rangle \otimes \langle 0_M|. \tag{5.139}$$
5.3.3 Arbitrary boundary surfaces
So far, I have considered only boundary surfaces formed by two parallel spacelike planes. This restriction is sufficient and convenient in ordinary QFT on Minkowski space, but it has no meaning in a generally covariant context. It is therefore necessary to consider arbitrary boundary surfaces, so let us study the extension of the formalism to the case where the surface $\Sigma$, instead of being formed by two parallel planes, is the boundary of a (sufficiently regular) arbitrary finite region of spacetime $R$.
Let $\Sigma$ be a closed, connected 3d surface in Minkowski spacetime, with the topology (but, in general, not the geometry) of a 3-sphere, and $\Sigma = \partial R$. Let $\varphi$ be a scalar field on $\Sigma$ and consider the functional

$$W[\varphi, \Sigma] = \int_{\phi|_\Sigma = \varphi} D\phi\; e^{-S^E_R[\phi]}. \tag{5.140}$$

The integral is over all 4d fields on $R$ that take the value $\varphi$ on $\Sigma$, and the action in the exponent is the euclidean action, where the 4d integral is over $R$. In the free theory, the integral is a well-defined gaussian integral and can be evaluated. The classical equations of motion with boundary value $\varphi$ on $\Sigma$ form an elliptic system, which in general has a solution $\phi_{\rm cl}[\varphi]$ that can be obtained by integration from the Green function for the shape $R$. A change of variable in the integral reduces it to a trivial gaussian integration times $e^{-S^E_R[\phi_{\rm cl}]}$. Here $S^E_R[\varphi]$ is the field theoretical Hamilton function: the action of the bulk field determined by the boundary condition $\varphi$.
$W[\varphi, \Sigma]$ can be defined in the Minkowski regime as well. If $\Sigma$ is a rectangular box in Minkowski space, let $\varphi = (\varphi_{\rm out}, \varphi_{\rm in}, \varphi_{\rm side})$ be the components of the field on the spacelike bases and timelike side. Consider the field theory defined in the box, with time-dependent boundary conditions $\varphi_{\rm side}$, and let $U[\varphi_{\rm side}]$ be the evolution operator from $t = 0$ to $t = T$ generated by the (time-dependent) hamiltonian of the theory. Then we can write

$$W[\varphi, \Sigma] \equiv \langle\varphi_{\rm out}|U[\varphi_{\rm side}]|\varphi_{\rm in}\rangle. \tag{5.141}$$

In particular, if $\varphi_{\rm side}$ is constant in time, $W$ can be obtained by analytic continuation from the euclidean functional. More generally, we can write the formal definition

$$W[\varphi, \Sigma] = \int_{\phi|_\Sigma = \varphi} D\phi\; e^{iS_R[\phi]}. \tag{5.142}$$
Notice that $W[\varphi, \Sigma]$ is a function on the space $G$ defined in Section 3.3.3. This space represents all possible ensembles of classical field measurements on a closed surface, namely the minimal data for a local experiment. Formally, functions on $G$ define the quantum state space $\mathcal{K}$, and $W[\varphi, \Sigma]$ defines the preferred covariant vacuum state $|0\rangle$ in $\mathcal{K}$.
Local Schrödinger equation. $W[\varphi, \Sigma]$ satisfies a local functional equation that governs its dependence on $\Sigma$. Let $\tau$ be arbitrary coordinates on $\Sigma$. Represent the surface and the boundary fields as $\Sigma : \tau \mapsto x^\mu(\tau)$ and $\varphi : \tau \mapsto \varphi(\tau)$. Let $n^\mu(\tau)$ be the unit length normal to $\Sigma$. Then

$$n^\mu(\tau)\,\frac{\delta}{\delta x^\mu(\tau)}\, W[\varphi, \Sigma] = H(\tau)\; W[\varphi, \Sigma] \tag{5.143}$$

where $H(x)$ is an operator obtained by replacing $\pi(x)$ by $-i\,\delta/\delta\varphi(x)$ in the hamiltonian density

$$H(x) = g^{-1/2}\,\pi^2(x) + g^{1/2}\left(|\nabla\varphi|^2 + m^2\varphi^2\right). \tag{5.144}$$
Here $g$ is the determinant of the induced metric on $\Sigma$, and the norm is taken in this metric (see [145, 146]). The local Hamilton–Jacobi equation (3.186) can be viewed as the eikonal approximation of this equation. Since $W$ is independent from the parametrization, we have

$$\frac{\partial x^\mu(\tau)}{\partial\tau}\,\frac{\delta}{\delta x^\mu(\tau)}\, W[\varphi, \Sigma] = P(\tau)\; W[\varphi, \Sigma] \tag{5.145}$$

where the linear momentum is $P(\tau) = \nabla\phi(\tau)\,\delta/\delta\varphi(\tau)$. If $\Sigma$ is spacelike, (5.143) is the (euclidean) Tomonaga–Schwinger equation.
We expect a local equation like (5.143) to hold in any field theory. If the theory is generally covariant, the functional $W$ will be independent from $\Sigma$, and therefore the left-hand side of the equation will vanish, leaving only the hamiltonian operator acting on the field variables, namely a Wheeler–DeWitt equation.
5.3.4 What is a particle?
Choose Σ to be a cylinder Σ_RT, with radius R and height T, with the two bases on the surfaces t = 0 and t = T. Given two compact-support functions ϕ1 and ϕ2, defined on t = 0 and t = T respectively, we can always choose R large enough for the two compact supports to be included in the bases of the cylinder. Then
$$ \lim_{R \to \infty} W[\phi_1, \phi_2, \Sigma_{RT}] = W[\phi_1, \phi_2, T], \tag{5.146} $$
because the euclidean Green function decays rapidly, and the effect of having the side of the cylinder at finite distance goes rapidly to zero as R increases. Equation (5.135) illustrates how scattering amplitudes can be computed from W[ϕ1, ϕ2, T]. In turn, (5.146) indicates how W[ϕ1, ϕ2, T] can be obtained from W[ϕ, Σ], where Σ is the boundary of a finite region. Therefore knowledge of W[ϕ, Σ] allows us to compute particle scattering amplitudes. We expect this to remain true in the perturbative expansion of an interacting field theory as well, where R includes the interaction region.
The limits T, R → ∞ seem to indicate that arbitrarily large surfaces Σ are needed to compute vacuum and particle scattering amplitudes. But notice that the convergence of W[ϕ1, ϕ2, T] to the vacuum projector is dictated by (5.130) and is exponential in the mass gap, or the Compton
frequency of the particle. Thus T at laboratory scales is largely sufficient to guarantee arbitrarily accurate convergence. In the euclidean regime, rotational symmetry suggests the same to hold for the R → ∞ limit. Thus the limits can be replaced by choosing R and T at laboratory scales. (At least for the vacuum, which does not require analytic continuation.)
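The exponential convergence to the vacuum projector can be illustrated in a finite toy model. The sketch below is only an illustration under stated assumptions: the five-level spectrum and the function name `projector_error` are hypothetical choices made here, not quantities from the text, and the hamiltonian is taken diagonal in its own eigenbasis. It compares the euclidean evolution operator exp(−T(H − E0)) with the projector on the ground state and shows that the error is exp(−T × gap).

```python
import numpy as np

def projector_error(energies, T):
    """Spectral-norm distance between exp(-T (H - E0)) and the ground-state projector."""
    energies = np.sort(np.asarray(energies, dtype=float))
    shifted = energies - energies[0]             # measure energies from the ground state
    evolution = np.diag(np.exp(-T * shifted))    # H is diagonal in its eigenbasis
    projector = np.zeros_like(evolution)
    projector[0, 0] = 1.0                        # projector on the (nondegenerate) ground state
    return np.linalg.norm(evolution - projector, 2)

# Hypothetical toy spectrum with gap 1 above the ground state.
spectrum = [0.0, 1.0, 1.5, 2.0, 3.0]
for T in (1.0, 5.0, 10.0):
    print(T, projector_error(spectrum, T), np.exp(-T))
```

With a gap of order one, T equal to a few tens of inverse-gap units already makes the error negligible, which is the sense in which laboratory-scale T suffices.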
The conventional notions of vacuum and particle states are global in nature. How is it possible that we can recover them from the local functional W[ϕ, Σ]? This is an important question that plays a role in QFT on curved spacetime and in quantum gravity. To answer this question, notice that realistic particle detectors are finitely extended. How can a finitely extended detector detect particles, if particles are globally defined objects?
The answer is that there exist two distinct notions of particle. Fock particle states are "global," while the physical states detected by a localized detector (eigenstates of local operators describing detection) are "local" particle states. Local particle states are close to (in a suitable topology), but distinct from, the global particle states. In conventional QFT, we use a global particle state in order to conveniently approximate the local particle state detected by a detector. Global particle states, indeed, are far easier to deal with.
Therefore the global nature of the conventional definition of vacuum and particles is not dictated by the physical properties of particles: it is an approximation adopted for convenience. Replacing the limits R → ∞ and T → ∞ with finite macroscopic R and T, we miss the exact global vacuum or n-particle state, but we can nevertheless describe local experiments. The restriction of QFT to a finite region of spacetime must describe completely experiments confined to this region.
Global and local particles in a simple finite system. The distinction between global particles and local particles can be illustrated in a very simple system. Consider two weakly coupled harmonic oscillators. Let the total hamiltonian of the system be
$$ H = \frac{1}{2}(p_1^2 + q_1^2) + \frac{1}{2}(p_2^2 + q_2^2) - 2\lambda q_1 q_2 = H_1 + H_2 - \lambda V. \tag{5.147} $$
Consider a measuring apparatus that interacts only with the first oscillator and measures the quantity H_1. The Hilbert space of the system is ℋ = L2[R², dq1 dq2]. On this space, the quantity H_1 is represented by the operator −ħ² ∂²/∂q1² + q1². The operator has a discrete spectrum E = (n + 1/2)ħ. If the result of the measurement is the eigenvalue (1 + 1/2)ħ, let us say that "there is one local particle in the first oscillator." In particular, a one-local-particle state is the common eigenstate of H_1 and H_2
$$ \psi_{\rm local}(q_1, q_2) = q_1\, e^{-(q_1^2 + q_2^2)/2\hbar}, \tag{5.148} $$
in which there is one local particle in the first oscillator and no local particles in the second.
Next, let us diagonalize the full hamiltonian H. This can easily be done by finding the normal modes of the system, which are q± = (q1 ± q2)/√2 and have frequencies ω±² = 1 ± λ. The eigenvalues of H are therefore E = ħ(n+ ω+ + n− ω− + 1). We call |n+, n−〉 the corresponding eigenstates and N = n+ + n− the global-particle number. In particular, we call "one-global-particle state" all states with N = 1, namely any state of the form |ψ〉 = α|1, 0〉 + β|0, 1〉. Notice that this is precisely the definition of one-particle states in QFT: a one-particle state is an arbitrary linear combination of states |k〉 where there is a single quantum in one of the modes. In particular, consider the one-global-particle state |ψ〉 = (|1, 0〉 + |0, 1〉)/√2. This is a global particle which is maximally localized on the first oscillator. A straightforward calculation gives, to first order in λ,
$$ \psi_{\rm global}(q_1, q_2) = \left( q_1 + \frac{\lambda}{4} q_2 \right) e^{-(q_1^2 + q_2^2 - 2\lambda q_1 q_2)/2\hbar}. \tag{5.149} $$
The two states ψ_local and ψ_global are different and have different physical meaning. The state ψ_global is the kind of state that is called a one-particle state in QFT. It is the one-particle state which is most localized on the first oscillator. On the other hand, if our measuring apparatus interacts only with the first oscillator, then what we measure is not ψ_global: it is ψ_local, which is an eigenstate of an operator that acts only on the variable q1.
In QFT we confuse the two kinds of states. In the formalism we use global-particle states such as ψ_global. However, particle detectors are localized in space. (A local measuring apparatus can only interact with the components of the field in a finite region, like the apparatus that interacts only with the variable q1 in the example.) Therefore they measure particle states such as ψ_local. Strictly speaking, therefore, the interpretation of the particle states measured by particle detectors as global-particle states is a mistake, because a global-particle state can never be an eigenstate of a local measuring apparatus, and therefore cannot be detected by a local apparatus.
The reason we can nevertheless use this interpretation successfully is that the states ψ_local and ψ_global are very similar. In the example, their distance in the Hilbert norm vanishes to first order in λ
$$ (\psi_{\rm global}, \psi_{\rm local}) = O(\lambda). \tag{5.150} $$
The error we make in using ψ_global to describe the physical state ψ_local is small if λV is small. In the field theoretical case, λV represents the interaction energy between the region inside the detector and the region outside the detector: this energy is very small compared to the energy of the state itself, for all the states of interest. We can effectively approximate the local-particle states that are detected by our measuring apparatus by means of the global-particle states, which are easier to deal with.
On the other hand, the argument shows that global-particle states are not required for dealing with the realistic observed particles: they are just a convenient approximation. If we can define local-particle states by means of a local formalism, we are not making a mistake; rather, we are simply not using an approximation that was convenient on flat space, but may not be viable in a generally covariant context.
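The closeness of the local and global one-particle states can be checked numerically. The following is a minimal sketch, assuming ħ = 1 and the coupling term −2λ q1 q2 exactly as written in (5.147); the normal-mode frequencies are derived inside the code from that coupling rather than taken from a closed-form expression, and the grid parameters and function names are hypothetical choices made for this illustration. It builds ψ_local and the equal superposition of the two one-quantum eigenstates of the full hamiltonian on a grid, and prints their distance in the L2 norm for several values of λ.

```python
import numpy as np

def local_and_global_states(lam, n=141, qmax=7.0):
    """Return (psi_local, psi_global, dq) sampled on an n x n grid, with hbar = 1."""
    q = np.linspace(-qmax, qmax, n)
    dq = q[1] - q[0]
    q1, q2 = np.meshgrid(q, q, indexing="ij")

    def normalize(psi):
        return psi / np.sqrt(np.sum(psi**2) * dq**2)

    # One quantum in oscillator 1, none in oscillator 2 (eigenstate of H1 and H2).
    psi_local = normalize(q1 * np.exp(-(q1**2 + q2**2) / 2))

    # Normal modes of V = (q1^2 + q2^2)/2 + c*q1*q2, with c = -2*lam as in (5.147).
    c = -2.0 * lam
    q_plus, q_minus = (q1 + q2) / np.sqrt(2), (q1 - q2) / np.sqrt(2)
    w_plus, w_minus = np.sqrt(1 + c), np.sqrt(1 - c)
    ground = np.exp(-(w_plus * q_plus**2 + w_minus * q_minus**2) / 2)

    # Equal superposition of the two one-quantum eigenstates of the full hamiltonian.
    psi_global = (normalize(q_plus * ground) + normalize(q_minus * ground)) / np.sqrt(2)
    return psi_local, psi_global, dq

def distance(lam):
    """L2 distance between the global and the local one-particle state."""
    psi_local, psi_global, dq = local_and_global_states(lam)
    return np.sqrt(np.sum((psi_global - psi_local)**2) * dq**2)

for lam in (0.0, 0.025, 0.05, 0.1):
    print(lam, distance(lam))
```

The printed distances shrink with λ and vanish at λ = 0, where the two notions of particle coincide.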
5.3.5 Boundary state space K and covariant vacuum |0〉
Finally, consider the space G of the variable α = (Σ, ϕ), where Σ is a closed 3d surface in spacetime. Call K the space of functions ψ[α] = ψ[Σ, ϕ]. This space represents all possible outcomes of ensembles of measurements on the boundary of a finite region of spacetime. The measurements include spacetime localization measurements that determine the surface Σ, as well
as field (or particle) measurements that determine ϕ (or a function of ϕ). K is the boundary quantum space.
There is a preferred state |0〉 in K, given by
$$ \langle \Sigma, \phi | 0 \rangle = W[\Sigma, \phi]. \tag{5.151} $$
If the functional integral can be defined, this is given by (5.142). The state |0〉 expresses the dynamics entirely. As we shall see, this formulation of QFT makes sense in quantum gravity.
In general, K is a space of functions over G. Recall that G is the space of data needed to determine a classical solution: two events in the finite-dimensional case; a set of events forming a 3d closed surface in the field case.
In the case of a finite-dimensional theory, a classical solution in some interval is determined by two events in C. In the quantum theory, a complete experiment consists of two events: a preparation and a quantum measurement. In this case, the boundary space 𝒦 = L2[G] = L2[C × C] ∼ L2[C] ⊗ L2[C] = K ⊗ K is the space representing two quantum events, while K = L2[C] is the space representing a single quantum event.
In the field theoretical case, a classical solution in a region R is determined by infinite events in C forming a closed 3d surface, namely by a 3d surface Σ = ∂R in spacetime and the field ϕ on it. In the quantum theory, a complete experiment requires measurements (or assumptions) on the entire Σ. In this case, the boundary space 𝒦 ∼ L2[G] is the space describing the observation of the entire boundary surface Σ and the measurements on it.
The boundary of R can be formed by two (or even more) connected components Σ. In this case, we can decompose the boundary space 𝒦 into the tensor product of one factor K associated with each component. The space K is then a space of functionals of the connected surface Σ and the field on it. Since the Wheeler–DeWitt equation is local, it looks the same on 𝒦 and on K. Therefore the distinction between 𝒦 and K is of much less importance in the field theoretical context than in the finite-dimensional case. The space 𝒦 is associated with the idea of the full data characterizing an experiment on a closed surface Σ, while the space K is associated with the idea of an "initial data" surface Σ.
5.3.6 Lattice scalar product, intertwiners and spin network states
An interacting quantum field theory can be constructed as a perturbation expansion around a free theory. An alternative is to define a cut-off theory, with a large but finite number of degrees of freedom, using a lattice. One expects then to recover physical predictions as suitable limits, as the lattice spacing is taken to zero. I illustrate here the definition of the scalar product in a lattice gauge theory, since the same technique is used in quantum gravity.
Consider a three-dimensional lattice Γ with L links l and N nodes n. To define a Yang–Mills theory for a compact Yang–Mills group G on this lattice, we associate a group element U_l to each link l, and we consider the Hilbert space K_Γ = L2[G^L, dU_l], where G^L is the product of L copies of G and dU_l ≡ dU_1 ... dU_L is the Haar measure on the group. Quantum states in K_Γ are functions Ψ(U_l) of L group elements. The scalar product of two states is given by
$$ \langle \Psi | \Phi \rangle \equiv \int dU_1 \cdots dU_L \; \overline{\Psi(U_1, \ldots, U_L)}\, \Phi(U_1, \ldots, U_L). \tag{5.152} $$
An orthonormal basis of states in K_Γ can be obtained as follows. Let j label unitary irreducible representations of G, and let (R^j(U))^α_β be the matrix elements of the representation. The Peter–Weyl theorem tells us that the states |j, β, α〉 defined by 〈U|j, β, α〉 = (R^j(U))^α_β form an orthonormal basis in L2[G, dU]. A basis in K_Γ is therefore given by the states
$$ |j_l, \beta_l, \alpha_l\rangle \equiv |j_1, \ldots, j_L, \beta_1, \ldots, \beta_L, \alpha_1, \ldots, \alpha_L\rangle, \tag{5.153} $$
defined by
$$ \langle U_l | j_l, \beta_l, \alpha_l \rangle = \prod_l \left( R^{j_l}(U_l) \right)^{\alpha_l}{}_{\beta_l}. $$
The theory is invariant under local Yang–Mills transformations on the lattice. These depend on a group element λ_n for each node n. The variables U_l transform under a gauge transformation as U_l → λ_{l_i}^{-1} U_l λ_{l_f}, where the link l goes from the initial node l_i to the final node l_f. Hence the gauge-invariant states are the ones satisfying
$$ \Psi(U_l) = \Psi(\lambda_{l_i}^{-1} U_l \lambda_{l_f}). \tag{5.154} $$
These states form a linear subspace K^0_Γ of K_Γ; the space K^0_Γ is the (fixed-time) Hilbert space of the gauge-invariant states of the theory. An orthonormal basis of states in K^0_Γ can be obtained using the notion of an intertwiner.
Intertwiners. Consider N irreducible representations j_1, ..., j_N. Consider the tensor product of their Hilbert spaces
$$ \mathcal{H}_{j_1 \ldots j_N} = \mathcal{H}_{j_1} \otimes \cdots \otimes \mathcal{H}_{j_N}. \tag{5.155} $$
This space can be decomposed into a sum of irreducible components. In particular, let ℋ^0_{j_1...j_N} be the subspace formed by the invariant vectors, namely the subspace that transforms in the trivial representation. This space is k-dimensional, where k is the multiplicity with which the trivial representation appears in the decomposition. It is, of course, a Hilbert space, and therefore we can choose an orthonormal basis in it. We call
the elements i of this basis "intertwiners" between the representations j_1, ..., j_N.
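More explicitly, elements of ℋ_{j_1...j_N} are tensors v^{α_1...α_N} with one index in each representation. Elements of ℋ^0_{j_1...j_N} are tensors v^{α_1...α_N} that are invariant under the action of G on all their indices. That is, they satisfy
$$ R^{(j_1)\alpha_1}{}_{\beta_1}(U) \cdots R^{(j_N)\alpha_N}{}_{\beta_N}(U)\; v^{\beta_1 \ldots \beta_N} = v^{\alpha_1 \ldots \alpha_N}. \tag{5.156} $$
The intertwiners v_i^{α_1...α_N} are a set of k such invariant tensors, which are orthonormal in the scalar product of ℋ^0_{j_1...j_N}. That is, they satisfy
$$ v_i^{\alpha_1 \ldots \alpha_N}\, v_{i'\, \alpha_1 \ldots \alpha_N} = \delta_{ii'}. \tag{5.157} $$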
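The dimension k of the intertwiner space can be computed explicitly once a group is chosen. The sketch below assumes G = SU(2), which is an assumption made for this illustration (the discussion above holds for a generic compact group), and the function name `invariant_dimension` is hypothetical. It counts how many times the trivial representation appears in j_1 ⊗ ... ⊗ j_N by applying the Clebsch–Gordan series J ⊗ j = |J − j| ⊕ ... ⊕ (J + j) one factor at a time.

```python
from collections import Counter
from fractions import Fraction

def invariant_dimension(spins):
    """Multiplicity of the trivial SU(2) representation in j1 x ... x jN."""
    # Work with doubled spins (2j) so that all labels are integers.
    doubled = [int(2 * Fraction(j)) for j in spins]
    totals = Counter({0: 1})        # multiplicity of each total spin, starting from J = 0
    for two_j in doubled:
        updated = Counter()
        for two_J, mult in totals.items():
            # Clebsch-Gordan series: J x j contains |J - j|, ..., J + j in integer steps.
            for two_K in range(abs(two_J - two_j), two_J + two_j + 1, 2):
                updated[two_K] += mult
        totals = updated
    return totals[0]                # a Counter returns 0 if spin 0 never appears

print(invariant_dimension([0.5, 0.5]))            # two spin-1/2: one invariant (the singlet)
print(invariant_dimension([0.5, 0.5, 0.5, 0.5]))  # four spin-1/2
print(invariant_dimension([1, 1, 1]))             # three spin-1
```

For a node of the lattice, the list `spins` would contain the representations carried by the links adjacent to that node, and the returned number is how many independent intertwiners can be placed there.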
If the space ℋ_j carries the representation j, its dual space ℋ*_j carries the dual representation j*. An intertwiner i between n dual representations j*_1, ..., j*_n and m representations j_1, ..., j_m is an invariant tensor in the space (⊗_{i=1,n} ℋ*_{j_i}) ⊗ (⊗_{k=1,m} ℋ_{j_k}), that is, a covariant map
$$ i : \Big( \bigotimes_{i=1,n} \mathcal{H}_{j_i} \Big) \longrightarrow \Big( \bigotimes_{k=1,m} \mathcal{H}_{j_k} \Big), \tag{5.158} $$
or an invariant tensor with n lower indices and m upper indices.
Now associate a representation j_l to each link l and an intertwiner i_n in each node n (in the tensor product of the representations associated with the links adjacent to the node) of the lattice. The set s = (Γ, j_l, i_n) is called a "spin network." Each spin network s defines a state |s〉 by
$$ \langle U_l | s \rangle = \psi_s(U_l) = \prod_l R^{j_l}(U_l) \cdot \prod_n i_n, \tag{5.159} $$
where the raised dot indicates index contraction. Notice that the indices (not indicated in the equation) match, as on each side of the dot there is one index for each couple node-link. The states |s〉 form a complete and orthonormal basis in K^0_Γ:
$$ \langle s | s' \rangle = \delta_{ss'}. \tag{5.160} $$
This basis will play a major role in quantum gravity.
5.4 Quantum gravity
Finally, I sketch here the formal structure of quantum gravity. The actual mathematical definition of the quantities mentioned here is the task I undertake in the second part of the book.
5.4.1 Transition amplitudes in quantum gravity
In the presence of a background, QFT yields scattering amplitudes and cross sections for asymptotic particle states, and these are compared with
data obtained in a lab. The conventional theoretical definition of these amplitudes involves infinitely extended spacetime regions and relies on symmetry properties of the background. In a background-independent context, this procedure becomes problematic. For instance, background independence implies immediately that any 2-point function W(x, y) is constant for x ≠ y, as mentioned in Section 1.1.4. How can the formalism control the localization of the measuring apparatus?
We have seen above that in the context of a simple scalar field theory, local physics can be expressed in terms of a functional W[ϕ, Σ] that depends on field boundary eigenstates ϕ and the geometry of the 3d surface Σ that bounds R. Physical predictions concerning measurements performed in the finite region R, including scattering amplitudes between particles detected in the lab, can be expressed in terms of W[ϕ, Σ]. The functional satisfies a local version of the Schrödinger equation. The geometry of Σ codes the relative spacetime localization of the particle detectors. W[ϕ, Σ] can be expressed as a functional integral over a finite region R of spacetime. In the euclidean regime, the functional integral is well defined and can be used to determine the Minkowski vacuum state.
This technique can be extended to quantum gravity, namely to a diffeomorphism-invariant context. The effect of diffeomorphism invariance is that the functional W turns out to be independent of the location of Σ. At first sight, this seems to leave us in the characteristic interpretative obscurity of background-independent QFT: the independence of W from Σ is equivalent to the independence of W(x, y) from x and y mentioned above.
But a closer look reveals it is not so. The boundary field includes the gravitational field, which is the metric, and therefore the argument of W does describe the metric of the boundary surface, that is, the relative spacetime location of the detectors, as explained in Section 4.1.2. Therefore the relative location of the detectors, lost with Σ because of general covariance, comes back with ϕ, as this now includes the boundary value of the gravitational field. The boundary value of the gravitational field plays the double role previously played by ϕ and Σ. In fact, this is precisely the core of the conceptual novelty of general relativity: there is no a priori distinction between localization measurements and measurements of dynamical variables.
More formally, in a background-dependent theory the space G is a space of couples (Σ, ϕ), but in a general-relativistic theory the space G is just a space of fields on a closed differential surface. In pure GR, we can take G as the space of the gravitational connections A on a closed surface. Accordingly, the space K is a space of functionals of the field A on a closed surface. These functionals are invariant under 3d diffeomorphisms of the surface. In the second part of the book, the space K will be built explicitly. As explained in the previous section, the functional W determines a
preferred state |0〉 in K. This is the covariant vacuum state, which contains the dynamical information of the theory.
A key result of the theory developed in the second part of the book is that the eigenstates of the gravitational field on a 3d surface are not smooth fields. They present a characteristic Planck-scale discreteness. These eigenstates determine a preferred basis |s〉 in K, labeled by the "spin networks" s that will be described in detail in the second part of the book. Each state |s〉 describes a "quantum geometry of space," namely the possible result of a complete measurement of the gravitational field on the 3d surface. We shall express W in this preferred basis:
$$ W(s) = \langle 0 | s \rangle. \tag{5.161} $$
Therefore, because of the Planck-scale discreteness of space, in the gravitational context the analog of W[ϕ, Σ] is the functional W(s). A definition of W(s) in the canonical quantum theory will be given below in (7.37). As we shall see, the covariant vacuum state |0〉 will simply be related to the spin network state with no nodes and no links. A sum-over-histories definition of W(s) will be given below in (9.21).
A case of particular interest is the one in which we can separate the boundary surface Σ into two components. For instance, these can be disconnected. Accordingly, we can write s as (s_out, s_in) and the associated amplitude as
$$ W(s_{\rm out}, s_{\rm in}) = \langle 0 | s_{\rm out}, s_{\rm in} \rangle = \langle s_{\rm out} | P | s_{\rm in} \rangle, \tag{5.162} $$
where P is the projector on the solutions of the Wheeler–DeWitt equation. A sum-over-histories expression of W(s_out, s_in) is given in terms of histories that go from s_in to s_out.
5.4.2 Much ado about nothing: the vacuum
The notion of "vacuum state" plays a central role in QFT on a background spacetime. The vacuum is the basis over which Fock space is built. In gravity, on the other hand, the notion of vacuum is very ambiguous. This fact contributes to make quantum gravity sharply different from conventional QFTs. However, this is not a difficulty: a preferred notion of vacuum is not needed for a quantum theory to be well defined. The quantum theory of a harmonic oscillator has a vacuum state, but the quantum theory of a free particle does not. In this respect, general relativity resembles more a free particle than a harmonic oscillator.
Notice that even the terminology of classical GR is confusing with respect to the notion of vacuum: in relativistic parlance, all solutions of the Einstein equations without a source term are called "vacuum solutions."
We use three distinct notions of vacuum in quantum gravity.
Covariant vacuum. The first is the nonperturbative, or covariant, vacuum state |0〉 defined in Sections 5.1.4 and 5.3.2. This is the state in the boundary state space that defines the dynamics. Intuitively, it is defined by the sum-over-histories on a region bounded by the given boundary data. If the metric boundary data are chosen to be spacelike, this is the Hartle–Hawking state. In the context we are considering, instead, the boundary surface bounds a finite 4d region of spacetime, and the state |0〉 is a background-independent way of coding quantum dynamics.
Empty state. The state |∅〉 is the kinematical quantum state of the gravitational field in which the volume of space is zero, namely in which there is no physical space. As we shall see, it is related to the covariant vacuum state |0〉.
Minkowski vacuum. A different notion of vacuum is the Minkowski vacuum state |0_M〉. The quantum state |0_M〉 that describes the Minkowski vacuum is not singled out by the dynamics alone. Instead, it is singled out as the lowest eigenstate of an energy H_T, which is the variable canonically conjugate to a nonlocal function of the gravitational field, defined as the proper time T along a given worldline. This is analogous to the identification of the energy with a momentum p_0 under the choice of a specific Lorentz time x^0. To find this state in quantum gravity, we can use the procedure employed in (5.49) and (5.139). This will be briefly discussed at the end of Chapter 9. Alternatively, in an asymptotically flat context, we expect |0_M〉 to be the lowest eigenstate of the ADM energy.
The notion of vacuum is strictly connected to the notion of energy. The vacuum can be defined as the state with lowest energy. In GR, the notion of energy is ambiguous, and the ambiguity in the definition of energy is reflected in the ambiguity in defining the vacuum. Indeed, we can identify several notions of energy in GR.
Canonical energy. The canonical energy, namely the generator H of translations in coordinate time, vanishes identically in any general-relativistic theory. In this sense, all physical states of quantum gravity are vacuum states.
Matter energy. The energy-momentum tensor T^I_μ of the nongravitational fields is well defined, and therefore the energy E_matter = T^0_0 of the nongravitational fields is well defined. In classical GR, a vacuum solution is a solution with E_matter = 0. In this sense, vacuum states are all the pure gravity physical states without matter.
Gravitational energy. The energy of the gravitational field, E_gravity, is, strictly speaking, (minus) the left-hand side of the time–time component of the Einstein equations, so the time–time component of the Einstein equations reads E_gravity + E_matter = 0. That is, the total energy vanishes; see for instance [147].
ADM energy. We can associate an energy E_ADM to an isolated system surrounded by a region where the gravitational field is approximately minkowskian. Such a system can be described by asymptotically flat solutions of the Einstein equations. For such a system, we can identify the energy with the generator E_ADM of time translations in the asymptotic Minkowski space. Given asymptotic flatness, E_ADM is minimized by the Minkowski solution. In this sense, the Minkowski solution is "the vacuum" of the asymptotic minkowskian theory.
The fact that the notions of energy and vacuum are so ambiguous in GR should not be disconcerting. There is nothing essential in these notions: a quantum theory and its predictions are meaningful also in the absence of them. The notions of energy and vacuum play an important role in non-general-relativistic physics just because of the accidental fact that we live in a region of the Universe which happens to have a peculiar symmetry: translation invariance in newtonian or special-relativistic time.
5.5 Complements
5.5.1 Thermal time hypothesis and Tomita flow
The thermal time hypothesis, discussed in Section 3.4, extends nicely to QM and very nicely to QFT.
QM. In QM, the time flow is given by
$$ A_t = \alpha_t(A) = e^{itH_0} A\, e^{-itH_0}. \tag{5.163} $$
A statistical state is described by a density matrix ρ. It determines the expectation values of any observable A via
$$ \rho[A] = {\rm tr}[A\rho]. \tag{5.164} $$
This equation defines a positive functional ρ on the observables' algebra. The relation between a quantum Gibbs state ρ_0 and H_0 is the same as in (3.202). That is,
$$ \rho_0 = N e^{-\beta H_0}. \tag{5.165} $$
Correlation probabilities can be written as
$$ W_{AB}(t) = \rho_0[\alpha_t(A) B] = {\rm tr}[e^{itH_0} A\, e^{-itH_0} B\, e^{-\beta H_0}]. \tag{5.166} $$
Notice that it follows immediately from the definition that
$$ \rho_0[\alpha_t(A) B] = \rho_0[\alpha_{(-t - i\beta)}(B) A], \tag{5.167} $$
namely
$$ W_{AB}(t) = W_{BA}(-t - i\beta). \tag{5.168} $$
A state ρ_0 over an algebra satisfying the relation (5.167) is said to be KMS (Kubo–Martin–Schwinger) with respect to the flow α_t.
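The KMS relation (5.167) can be verified numerically in a finite-dimensional system. The sketch below is an illustration only: the 4×4 random hermitian matrices standing in for H_0, A and B, the values of β and t, and the helper names are all assumptions made here. It builds the Gibbs state (5.165) normalized to unit trace, continues the flow (5.163) to complex time, and compares the two sides of (5.167).

```python
import numpy as np

rng = np.random.default_rng(0)
dim, beta, t = 4, 0.7, 1.3          # hypothetical size, inverse temperature and time

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

H0 = random_hermitian(dim)
A = random_hermitian(dim)
B = random_hermitian(dim)
energies, vecs = np.linalg.eigh(H0)

def func_of_H(f):
    """Return f(H0), built from the spectral decomposition of H0."""
    return (vecs * f(energies)) @ vecs.conj().T

def alpha(s, X):
    """Flow (5.163) continued to complex s: exp(i s H0) X exp(-i s H0)."""
    forward = func_of_H(lambda e: np.exp(1j * s * e))
    backward = func_of_H(lambda e: np.exp(-1j * s * e))
    return forward @ X @ backward

rho0 = func_of_H(lambda e: np.exp(-beta * e))
rho0 = rho0 / np.trace(rho0).real   # Gibbs state (5.165), normalized to unit trace

def expectation(X):
    return np.trace(X @ rho0)       # rho[X] = tr[X rho], as in (5.164)

lhs = expectation(alpha(t, A) @ B)
rhs = expectation(alpha(-t - 1j * beta, B) @ A)
print(lhs, rhs)
```

The two printed numbers agree up to rounding for any choice of A and B, since the identity relies only on the cyclicity of the trace and on the fact that functions of H_0 commute.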
We can generalize easily the thermal time hypothesis. Given a generic state ρ, the thermal hamiltonian is defined by
$$ H_\rho = \ln \rho, \tag{5.169} $$
and the thermal time flow is defined by
$$ A_{t_\rho} = \alpha_{t_\rho}(A) = e^{i t_\rho H_\rho} A\, e^{-i t_\rho H_\rho}. \tag{5.170} $$
ρ is a KMS state with respect to the thermal time flow.
QFT: Tomita flow. In QFT, finite-temperature states do not live in the same Hilbert space as the zero-temperature states. H_0 is a divergent operator on these finite-temperature states. This is to be expected, since in a thermal state there is a constant energy density, and therefore a diverging total energy H_0. Therefore (5.165) makes no sense in QFT. How then do we characterize the Gibbs states? The solution to this problem is well known: equation (5.167) can still be used to characterize a Gibbs state ρ_0 in the algebraic framework, and can be taken as the basic postulate of statistical QFT: a Gibbs state ρ_0 over an algebra of observables is a KMS state with respect to the time flow α(t).
It follows that if we want to extend the thermal time hypothesis to field theory, we cannot use (5.169). Can we get around this problem? Is there a flow α_{t_ρ} which is KMS with respect to a generic thermal state ρ? Remarkably, the answer is yes. A celebrated theorem by Tomita states precisely that given any² state ρ over a von Neumann algebra,³ there is always a flow α_t, called the Tomita flow of ρ, such that (5.167) holds.
² Any separating state ρ. A separating density matrix has no zero eigenvalues. This is the QFT equivalent of the condition stated in footnote 11 of Chapter 3.
³ The observables' algebra is in general a C* algebra. We obtain a von Neumann algebra by closing in the Hilbert norm of the quantum state space.
This theorem allows us to extend (3.205) to QFT: the thermal time flow α_{t_ρ} is defined in general as the Tomita flow of the statistical state ρ.
Thus the thermal time hypothesis can be readily extended to QFT: what we call the "flow of time" is simply the Tomita flow of the statistical state ρ in which the world happens to be, when it is described in terms of macroscopic parameters.
The flow α_{t_ρ} depends on the state ρ. However, a von Neumann algebra possesses also a more abstract notion of time flow, independent of ρ. This is given by the one-parameter group of outer automorphisms, formed by the equivalence classes of automorphisms under inner (unitary) automorphisms. Alain Connes has shown that this group is independent of ρ. It only depends on the algebra itself. Connes has stressed the fact that this group provides an abstract notion of time flow that depends only on the algebraic structure of the observables, and nothing else.
The thermal time hypothesis and the notion of thermal time have not yet been extensively investigated. They might provide the key by which to relate timeless fundamental mechanics with our experience of a world evolving in time.
5.5.2 The "choice" of the physical scalar product
The solutions of the Wheeler–DeWitt equation (5.64) form the linear space ℋ. This space is naturally equipped with a scalar product that makes it a Hilbert space. This scalar product is often denoted the "physical" scalar product, in order to distinguish it from the scalar product in K, denoted the "kinematical" scalar product.
The relation between kinematical and physical scalar product depends on the hamiltonian H. The space ℋ is the eigenspace of H corresponding to the eigenvalue zero. In order for solutions to exist, the spectrum of H must therefore include zero. If zero is part of the discrete spectrum of H, then ℋ is a proper subspace of K; that is, the solutions of the Wheeler–DeWitt equation (5.64) are normalizable states in K. In this case, the physical scalar product is the same as the kinematical scalar product and there is no complication. But if zero is part of the continuum spectrum of H, then ℋ is formed by generalized eigenvectors, which are in S′ and not in K. That is, the solutions of the Wheeler–DeWitt equation (5.64) are nonnormalizable states in K. In this case, the physical scalar product is different from the kinematical scalar product. What is it?
In the quantum gravity (and quantum cosmology) literature there is a certain confusion regarding the issue of the definition of the physical scalar product. For instance, one often reads that this issue has to do with the notion of time. This is a conceptual mistake that derives from the observation that in a nonrelativistic theory there is a preferred time variable and the problem of defining ℋ starting from K does not appear. But the fact that the issue of defining the product in ℋ appears in timeless systems doesn't imply that it cannot be resolved unless there is a time variable.
In fact, there are a large number of solutions to this issue, all essentially equivalent. Preferences vary; here are some of the solutions proposed.
(i) The scalar product can be defined on ℋ using the matrix elements of the projector, as illustrated above.
(ii) Here is a general theorem on the issue. If H is a self-adjoint operator on a Hilbert space K, then we can write
$$ K = \int_S ds\; \mathcal{H}_s. \tag{5.171} $$
Here S is the spectrum of H, ds is a measure on this spectrum, and ℋ_s is a family of Hilbert spaces labeled by the eigenvalues s. The meaning of this integral over Hilbert spaces is the following: any vector ψ ∈ K can be written as a family ψ_s, where for every s, ψ_s ∈ ℋ_s, and
$$ (\psi, \varphi)_K = \int_S ds\; (\psi_s, \varphi_s)_{\mathcal{H}_s}, \tag{5.172} $$
where ( , )_ℋ is the scalar product in the Hilbert space ℋ and, in this instance, the integral is a standard numerical integral. The relevance of this theorem is that it states that there is a Hilbert space ℋ_0, that is, a scalar product on the space of the solutions of Hψ = 0.
Here is a simple example of how the theorem works. Consider the space K = L2[R², dx dy] and the self-adjoint operator H = −i d/dy. The solutions of Hψ = 0, or
$$ -i \frac{d}{dy} \psi(x, y) = 0, \tag{5.173} $$
are functions ψ(x, y) constant in y, and are nonnormalizable in K. However, the decomposition (5.171), (5.172) is immediate:
$$ K = \int_{\mathbb{R}} dy\; \mathcal{H}_y, \tag{5.174} $$
where ℋ_y = L2[R, dx]. In fact,
$$ (\psi, \varphi)_K = \int_{\mathbb{R}^2} dx\, dy\; \overline{\psi(x, y)}\, \varphi(x, y) = \int_{\mathbb{R}} dy\; (\psi_y, \varphi_y)_{\mathcal{H}_y}, \tag{5.175} $$
where ψ_y(x) = ψ(x, y) and
$$ (\psi_y, \varphi_y)_{\mathcal{H}_y} = \int_{\mathbb{R}} dx\; \overline{\psi_y(x)}\, \varphi_y(x). \tag{5.176} $$
The space of the solutions of (5.173) is ℋ_0 and has the natural Hilbert structure ℋ_0 = L2[R, dx].
(iii) Here is another solution. Pick a set of self-adjoint operators A_i in K that commute with H. These are well defined on the space ℋ, because if Hψ = 0, then H(A_iψ) = A_iHψ = 0. Now require that the operators A_i be self-adjoint in the physical scalar product. For a sufficient number of operators, this requirement fixes the scalar product of ℋ.
In the example given in (ii) above, the obvious self-adjoint operators that commute with H = −i d/dy are x and −i d/dx. These are well defined on the space of the functions of x alone. There is only one scalar product on this space of functions that makes x and −i d/dx self-adjoint: the one of L2[R, dx].
(iv) A convenient way of addressing the problem, especially in the case in which H is not a single operator but has many components, is given by the "group averaging" technique. Assume the Wheeler–DeWitt equation has the form H_iψ = 0, where the self-adjoint operators H_i are the generators of a unitary action of a group U on
K Assume also that S is invariant under this action and that we can find an invariantmeasure on the group or at least on the orbit of the group in K Then we can generalizethe operator P S rarr H of (558)
P =
int
U
dτ U(τ) (5177)
and write the physical scalar product as
(Pψ Pφ)H equiv (Pψ)(φ) =
int
U
dτ (ψ|U(τ)|φ)K (5178)
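The group-averaging formulas can be tried out on a finite toy model. The sketch below is an illustration under stated assumptions, not the construction used in the text: the kinematical space is taken to be C^n, the group is the cyclic group Z_n acting by shifts, and the integral over the group is replaced by a plain sum (the counting measure). All function names are hypothetical.

```python
# Toy finite analogue of group averaging (assumption: K = C^n, group Z_n).
# U(g) acts by cyclic shift: (U(g) psi)[i] = psi[(i - g) % n].

def shift(psi, g):
    """Unitary action U(g) of Z_n on a vector psi in C^n."""
    n = len(psi)
    return [psi[(i - g) % n] for i in range(n)]

def inner(psi, phi):
    """Kinematical scalar product on C^n, antilinear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def group_average(psi):
    """P psi = sum_g U(g) psi, the discrete stand-in for (5.177)."""
    n = len(psi)
    out = [0j] * n
    for g in range(n):
        out = [o + s for o, s in zip(out, shift(psi, g))]
    return out

def physical_product(psi, phi):
    """sum_g <psi| U(g) |phi>_K, the discrete stand-in for (5.178)."""
    return sum(inner(psi, shift(phi, g)) for g in range(len(psi)))

psi = [1 + 2j, 0.5j, -1.0, 2.0]
phi = [0.3, 1j, 1 - 1j, -2.0]

p_psi = group_average(psi)
# P psi is invariant under the group, i.e. it solves the discrete constraint.
is_invariant = all(abs(a - b) < 1e-12 for a, b in zip(shift(p_psi, 1), p_psi))
value = physical_product(psi, phi)
print(is_invariant, value)
```

In this toy model the averaged vector is constant, and the physical product coincides with the kinematical pairing of P psi with phi, which mirrors the identification (Pψ)(φ) in (5.178).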
There certainly are other techniques as well. This is a field in which the same ideas have independently reappeared many times under different names (and with different levels of mathematical precision). All these techniques are generally equivalent. If there is a case in which they differ, we’ll have to resort to physical arguments to find the physically correct choice.
5.5.3 Reality conditions and scalar product
Section 3.2.7 illustrated the possibility of using mixed complex and real dynamical variables, a strategy that will turn out to be useful in GR. Here I illustrate what happens with the same choice in quantum theory. In particular, I illustrate the key role that the reality conditions play in quantum theory. Recall the simple example discussed in Section 3.2.7: a free particle described in the coordinates $x$ and $z = x - ip$. We can write the quantum theory in terms of wave functionals $\psi(z)$ of the complex variable $z$. The Schrödinger equation gives immediately (see (3.134))

$$i\hbar\frac{\partial\psi(z,t)}{\partial t} = H_0\!\left(\hbar\frac{\partial}{\partial z},\, z\right)\psi(z,t) = -\frac{1}{2m}\left(\hbar\frac{\partial}{\partial z} - z\right)^{2}\psi(z,t). \tag{5.179}$$
A complete family of solutions is given by

$$\psi_k(z,t) = e^{-\frac{i}{\hbar}S(z,t,k)}, \tag{5.180}$$

where $S(z,t,k)$ is given in (3.135). Observe now that in the quantum theory the reality condition (3.138) becomes a relation between operators:

$$z + z^\dagger = 2\hbar\frac{\partial}{\partial z}. \tag{5.181}$$
Notice that classical complex conjugation is translated into the adjoint operation; this is necessary in order for real quantities to be represented by self-adjoint operators. Now, (5.181) makes sense only after we have specified the scalar product, because the dagger operation is defined in terms of, and therefore depends on, the scalar product. Indeed, requiring the reality condition (5.181) to hold amounts to posing a condition on the scalar product of the theory. Let us search for a scalar product of the form

$$(\psi,\phi) = \int dz\,d\bar z\ f(z,\bar z)\ \overline{\psi(z)}\,\phi(z), \tag{5.182}$$

where $f$ is a function to specify. Imposing (5.181) gives the condition on $f$

$$(z + \bar z)\,f(z,\bar z) = -2\hbar\frac{\partial}{\partial z}f(z,\bar z). \tag{5.183}$$

This gives

$$f(z,\bar z) = e^{-(z+\bar z)^2/4\hbar}. \tag{5.184}$$
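The step from the reality condition to the condition on $f$ can be sketched as follows. This is a reconstruction under two assumptions not spelled out in the text: the wave functions are holomorphic in $z$, so that $\partial_z\overline{\psi(z)} = 0$, and boundary terms in the integration by parts vanish.

```latex
\begin{align*}
(\psi, (z + z^\dagger)\phi) &= (\psi, z\phi) + (z\psi, \phi)
  = \int dz\,d\bar z\ f\,(z + \bar z)\,\overline{\psi}\,\phi ,\\
(\psi, 2\hbar\,\partial_z\phi) &= 2\hbar\int dz\,d\bar z\ f\,\overline{\psi}\,\partial_z\phi
  = -2\hbar\int dz\,d\bar z\ (\partial_z f)\,\overline{\psi}\,\phi .
\end{align*}
```

Requiring the two expressions to agree for all $\psi$ and $\phi$ gives $(z+\bar z)f = -2\hbar\,\partial_z f$. The Gaussian $f = e^{-(z+\bar z)^2/4\hbar}$ solves it, since $\partial_z f = -\frac{z+\bar z}{2\hbar}\,f$.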
Let us check whether the states (5.180) are well defined with respect to this product. Inserting (5.180) (at $t = 0$ for simplicity) and (5.184) in (5.182) gives

$$(\psi_k,\psi_{k'}) = \int dz\,d\bar z\ e^{-(z+\bar z)^2/4\hbar}\ e^{\frac{i}{\hbar}\left(k\bar z - \frac{i}{2}\bar z^2\right)}\ e^{-\frac{i}{\hbar}\left(k'z + \frac{i}{2}z^2\right)}. \tag{5.185}$$
A simple change of variables shows that the integral in the imaginary part of $z$ is finite, and the integral in the real part of $z$ is proportional to $\delta(k, k')$. Therefore the states $\psi_k$ form a standard continuous orthogonal basis of generalized states. They are clearly eigenstates of the momentum, since

$$p\,\psi_k = i(x - z)\,\psi_k = i\left(\hbar\frac{\partial}{\partial z} - z\right)\psi_k = k\,\psi_k. \tag{5.186}$$

In fact, what we have developed is a simple rewriting of the standard Hilbert space of a quantum particle.
Notice that appearances can be misleading. For instance, for $k = 0$, the state $\psi_k$ reads

$$\psi_0(z,t) = e^{+z^2/2\hbar}. \tag{5.187}$$

This looks like a badly nonnormalizable state, but it is not. It is a well-defined generalized state, since the negative exponential in the measure compensates for the positive exponential in the state.
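The relations (5.183), (5.184) and (5.186) can be spot-checked numerically. The sketch below is only a consistency check under assumptions: it uses $S(z, 0, k) = kz + \frac{i}{2}z^2$ as read off from the integrand of (5.185), treats $z$ and $\bar z$ as independent variables, and picks arbitrary numerical values for ħ, k and the sample point. The helper names are hypothetical.

```python
import cmath

HBAR = 1.3    # arbitrary positive value for the check, not a physical constant
K_MOM = 0.7   # arbitrary momentum label k

def f(z, zb):
    """Measure density (5.184), with z and its conjugate zb kept independent."""
    return cmath.exp(-(z + zb) ** 2 / (4 * HBAR))

def psi_k(z):
    """State (5.180) at t = 0, assuming S(z, 0, k) = k z + (i/2) z^2."""
    return cmath.exp(-1j / HBAR * (K_MOM * z + 0.5j * z * z))

def d_dz(func, z, eps=1e-6):
    """Central finite difference; valid for functions holomorphic in z."""
    return (func(z + eps) - func(z - eps)) / (2 * eps)

z = 0.4 - 0.9j
zb = z.conjugate()

# (5.183): (z + zb) f = -2 hbar df/dz, differentiating at fixed zb
lhs_measure = (z + zb) * f(z, zb)
rhs_measure = -2 * HBAR * d_dz(lambda w: f(w, zb), z)

# (5.186): p psi_k = i (hbar d/dz - z) psi_k = k psi_k
p_psi = 1j * (HBAR * d_dz(psi_k, z) - z * psi_k(z))
k_psi = K_MOM * psi_k(z)

print(abs(lhs_measure - rhs_measure), abs(p_psi - k_psi))
```

Both printed residuals should be at the level of the finite-difference error. Dropping the factor ħ in front of the derivative in the momentum check makes the second residual large unless ħ = 1.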
5.6 Relational interpretation of quantum theory
Quantum mechanics is one of the most successful scientific theories ever. However, its interpretation is controversial. What does the theory actually tell us about the physical world? This question sparked off a lively debate, which was intense during the 1930s, the early days of the theory, and is generating new interest today.
The possibility that the interpretation of an empirically successful theory could be debated should not surprise: examples abound in the history of science. For instance, the great scientific revolution was fueled by the debate on whether the efficacy of the Copernican system should be taken as an indication that the Earth really moves. In more recent times, Einstein’s celebrated contribution to special relativity consisted to a large extent just in understanding the physical interpretation (simultaneity is relative) of an already existing effective mathematical formalism (the Lorentz transformations). In these cases, as in the case of quantum mechanics, an overly strictly empiricist position could have circumvented the problem altogether, by reducing the content of the theory to a list of predicted numbers. But science would not then have progressed.
Quantum theory was first constructed for describing microscopic objects (atoms, electrons, photons) and the way these interact with macroscopic apparatuses built to measure their properties. Such interactions were called “measurements”. The theory is formed by a mathematical formalism, which allows probabilities of alternative outcomes of such measurements to be calculated. If used just for this purpose, the theory raises no difficulty. But we expect the macroscopic apparatuses themselves – in fact, any physical system in the world – to obey quantum theory, and this seems to raise contradictions within the theory. Here I discuss these apparent contradictions and a possible resolution. This resolution offers a precise answer to the question of what the quantum theory actually tells us about the physical world.
5.6.1 The observer observed
Measurements. A “measurement” of the variable A of a system S is an interaction between the system S and another system O, whose effect on O depends on the value that the variable A has at the time of the interaction. We say that the variable A is “measured” and that its value a is the “outcome of the measurement”. For instance, let S be a particle that impacts on O; let the effect of this impact depend on the position of the particle; and let q be the value of the position at the moment of the impact. Then we say that the position Q is measured and that the outcome of the measurement is q.
The term “measurement” and the common terminology used to describe measurement situations (S for “System” and O for “Observer”) are very misleading, because they evoke a human intentionally “observing” S and using an apparatus to gather data about it. There is nothing “human” or “intentional” in the definition of measurement given above. The system O does not need to be human, nor to be a special “apparatus”, nor to be macroscopic. The measured value need not be stored. Any interaction between two physical systems is a measurement. The measured variable of the system S is the variable that determines the effect that the interaction has on O. This is true in classical as well as in quantum theory.
Classical states and quantum states. In classical mechanics, a system S is described by a certain number of physical variables A, B, C, ... For instance, a particle is described by its position Q and velocity V. These variables change with time. They represent the contingent properties of the system. We say that the values of these variables determine, at every moment, the “state” of the system. If the value of the position Q of the particle is q and the value of its velocity V is v, we say that the state is (q, v). In classical mechanics, a state is therefore a list of values of physical variables.
Quantum mechanics differs from classical mechanics because it assumes that the variables of the system do not have a determined value at all times. Werner Heisenberg introduced this key idea. According to quantum theory, an electron does not have a well-determined position at every time. When it is not interacting with an external system sensitive to its position, the electron can be “spread out” over different positions. It is in a “quantum superposition” of different positions.
It follows that, in quantum mechanics, the state of the system cannot be captured by giving the value of its variables. Instead, quantum theory introduces a novel notion of the “state” of a system, different from the classical list of variable values. The new notion of “quantum state” was introduced in the work of Erwin Schrödinger, in the form of the “wave function” of the system. Paul Adrien Maurice Dirac gave it its general abstract formulation, in terms of a vector Ψ moving in an abstract vector space. From the knowledge of the state Ψ, we can compute the probability of the different measurement outcomes a₁, a₂, ... of any variable A. That is, the probability of the different ways in which the system S can affect a system O in the course of an interaction.
The theory prescribes that at every such measurement we must update the value of Ψ, to take into account which of the different outcomes has been realized. This sudden change of the state Ψ depends on the outcome of the measurement and is therefore probabilistic. This is the “collapse of the wave function”.
The notion of “state of the system” of classical mechanics is therefore split into two distinct notions in quantum theory: (i) the state Ψ, which expresses the probability for the different ways the system S can interact with its surroundings, and (ii) the actual sequence of values q₁, q₂, q₃, ... that the variables of S take in the course of the interactions. These are called “measurement outcomes”. I prefer calling them “quantum events”.
We can either think that Ψ is a “real” entity, or that it is nothing more than a theoretical bookkeeping for the quantum events, which are the “real” events. The choice of the relative ontological weight we attribute to the state Ψ or the quantum events q₁, q₂, q₃, ... is a matter of convenience: empirical evidence alone does not uniquely determine what is “real”. I think the second choice is cleaner, but in the following I refer to both.
Fig. 5.1 The observer observed.

The observer observed. The key problem of the interpretation of quantum mechanics is illustrated by the following situation. Consider a real physical situation, illustrated in Figure 5.1, in which at some time t a system O interacts with a system S, and then at a later time t′ a third system O′ interacts with the coupled system [S + O] formed by S and O together. Let the effect on O of the first interaction depend on the variable A of the system S, and the effect on O′ of the second interaction depend on the variable B of the coupled system [S + O]. (That is, we can say that O measures the variable A of S at time t, and then O′ measures the variable B of [S + O] at time t′.) Before the first interaction, say, S was in a state where A is a quantum superposition of two values a₁ and a₂. Say that at the first interaction O measures the value a₁ of the variable A. The puzzling question can be formulated in various equivalent manners:
• What is the state of S and O between the two interactions?

• Has the quantum event a₁ happened or not?

• Does the quantity A have a determined value after the first interaction, or not?
Say that before the first interaction the state of S was Ψ = c₁Ψ₁ + c₂Ψ₂, where Ψ₁, Ψ₂ are states where A has values a₁, a₂, respectively. Then at time t we have

$$c_1\Psi_1 + c_2\Psi_2 \;\longrightarrow\; \Psi_1 \qquad (A\ \text{takes the value}\ a_1). \tag{5.188}$$
However, the system O obeys the laws of quantum theory as well. Therefore we can also give a quantum description of the evolution of the coupled quantum system (S + O) formed by S and O together. If we do so, no collapse happens. Instead, the effect of the interaction is the Schrödinger evolution

$$(c_1\Psi_1 + c_2\Psi_2)\otimes\Phi \;\longrightarrow\; (c_1\Psi_1\otimes\Phi_1 + c_2\Psi_2\otimes\Phi_2) \qquad (A\ \text{is still in the superposition of the two values}\ a_1, a_2), \tag{5.189}$$

for suitable states Φ, Φ₁, Φ₂ of O.

What is real seems to depend on how we choose to describe the world.
What is the real state of affairs of the world after the interaction between S and O: (5.188) or (5.189)? In either case, we get a difficulty. If we say that after t the state has collapsed as in (5.188) and A has the value a₁, we get the wrong predictions about the second measurement at time t′. In fact, quantum theory allows us to predict the probability distribution of the possible outcomes of the second measurement, but to compute this we have to use the state (5.189), and not the state (5.188). Indeed, if B and A do not commute, this probability distribution can be affected by the interference between the two different “branches” in (5.189). In other words, we have to assume that the variable A was in a quantum superposition of the values a₁, a₂, and not determined.
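The difference between the two descriptions can be made concrete with a two-state toy model. The sketch below rests on assumptions that are not in the text: S and O are each two-state systems, c₁ = c₂ = 1/√2, the collapsed description is read as the joint state Ψ₁ ⊗ Φ₁, and the variable B measured by O′ is a hypothetical yes/no variable whose “yes” outcome projects on the vector `chi`. Since `chi` mixes the two values of A, this B does not commute with A.

```python
import math

# Basis ordering for [S + O]: |Psi1,Phi1>, |Psi1,Phi2>, |Psi2,Phi1>, |Psi2,Phi2>.
c1 = c2 = 1 / math.sqrt(2)          # illustrative amplitudes (assumption)

collapsed = [1.0, 0.0, 0.0, 0.0]    # (5.188), read as Psi1 (x) Phi1
entangled = [c1, 0.0, 0.0, c2]      # (5.189): c1 Psi1(x)Phi1 + c2 Psi2(x)Phi2

# Hypothetical variable B of [S + O]: "yes" is the projector on chi.
chi = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]

def prob_yes(state):
    """Born probability |<chi|state>|^2 (all amplitudes are real here)."""
    amplitude = sum(x * y for x, y in zip(chi, state))
    return amplitude ** 2

def mixture_prob():
    """Classical ignorance about which branch was realized (no interference)."""
    branch1 = [1.0, 0.0, 0.0, 0.0]
    branch2 = [0.0, 0.0, 0.0, 1.0]
    return c1 ** 2 * prob_yes(branch1) + c2 ** 2 * prob_yes(branch2)

p_collapsed = prob_yes(collapsed)
p_entangled = prob_yes(entangled)
p_mixture = mixture_prob()
print(p_collapsed, p_entangled, p_mixture)
```

With these choices the superposition predicts the “yes” outcome with certainty, while both the collapsed state and the classical mixture of branches give one half. The gap is the interference between the two branches referred to above.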
But if we do so, and say that after the first measurement the state is (5.189), then we must say that A has no determined value at time t. But the situation is general: any measurement can be thought of as the first measurement of the example, and therefore we must conclude that no variable can take a definite value, ever.
Thus, we seem to get a contradiction in both cases: whether we think that the wave function has collapsed and a₁ was realized, or whether we think it hasn’t. This is the core of the difficulty of the interpretation of quantum theory.
Real wave functions, or real quantum events. Let us examine the above difficulty in a bit more detail, from the two points of view of the two possible ontologies of quantum theory.
If we think that Ψ is real, but it never truly collapses, there is no simple and compelling reason why the world should appear as described by values of physical quantities that take determined values at each interaction, as it does. We experience particles in given positions, not particle wavefunctions. The relation between a noncollapsing wavefunction ontology and our experience of the world is very indirect and involuted. We need some complicated story to understand how specific observed values q₁, q₂, q₃, ... can emerge from the sole Ψ. If this story is given (which is possible), we are then in a situation similar to the one of a quantum event ontology, to which I now turn.
I think it is preferable to take the quantum events as the actual elements of reality, and view Ψ just as a bookkeeping device, coding the events that happened in the past and their consequences. For instance, I prefer to say that the “reality” of a subatomic particle is expressed by the sequence of the positions of the particle revealed by the bubbles in a bubble chamber, not by the spherically symmetric wave function emerging from the interaction area. The reality of the electron is in the events where it reveals itself interacting with its surroundings, not in the abstract probability amplitude for such events. From this perspective, the real events of the world are the “realizations” (the “coming to reality”, the “actualization”) of the values q₁, q₂, q₃, ... in the course of the interactions between physical systems. These quantum events have an intrinsically discrete, “quantized”, granular structure.
This perspective, however, does not solve the above puzzle either. The key puzzle of quantum mechanics becomes the fact that the statement that the quantum event a₁ “has happened” can be at the same time true and not-true: has the quantum event a₁ happened or not? If we answer no, then we are forced to say that no quantum event ever happens, because the situation described above is completely general: any quantum event happening in the interaction of two systems S and O is “non-happening” as far as the effect of (S + O) on a further system O′ is concerned. If we say yes, then we contradict the predictions of quantum mechanics (about the second interaction).
The “second observer” puzzle captures the core conceptual difficulty of the interpretation of quantum mechanics: reconciling the possibility of quantum superposition with the fact that the world we observe and describe is characterized by determined values of physical quantities. More precisely, the puzzle shows that we cannot disentangle the two: according to the theory, a quantum event (a₁) can be at the same time realized and not realized.
A possible escape from the puzzle is to assume that there are “special” systems that produce the collapse and cause quantum events to happen. For instance, these could be “macroscopic” systems, or “sufficiently complex” systems, or “systems with memory”, or the “gravitational field”, or human “consciousness”. All these systems, and others, have been suggested as causing quantum collapse and generating quantum events. If this were correct, at some point we should be able to measure violations of the predictions of QM. That is, QM as we know it would break down for those systems.
So far, this breaking down of QM has never been observed. We can fancy a phenomenology that we have not yet observed, that could bring back reality to the way we used to think it is. It is certainly worthwhile to investigate this possibility, theoretically and experimentally. But we should not forget that reality might be truly different from what we thought, and might be simply demanding us to renounce some old prejudice. I think that the history of physics indicates that the productive attitude is not to resist the conceptual novelty of empirically successful theories, but rather to make an effort to understand it. We should not force reality into our prejudices, but rather try to adapt our conceptual schemes to what we learn about the world.
5.6.2 Facts are interactions
I think that the key to the solution of the difficulty can be found in the observation that the two descriptions (5.188) and (5.189) refer to different systems: the first to O, the second to O′. More precisely, the first is relevant for describing the effects of interactions on O, the second for describing the effects of interactions on O′.
The solution of the puzzle can be found in the idea that quantum events are the elements of reality, but they are always relative to a physical system: the quantum event a₁ happens with respect to O, but it does not happen with respect to O′.
In other words, the way out from the puzzle is that the values of the variables of any physical system are relational. They do not express a property of the system S alone, but rather refer to the relation between this system and another system. The variable A has value a₁ with respect to O, but it has no determined values with respect to O′. This point of view is called the relational interpretation of quantum mechanics, or simply relational quantum mechanics.
The central idea of relational quantum mechanics is that there is no meaning in saying that a certain variable of the system S takes the value q. There is only meaning in saying that a variable has value q with respect to a system O. In the example discussed above, for instance, the fact that A takes the value a₁ with respect to O does not imply that A has the value a₁ also with respect to O′.
If we avoid all statements that are not referred to a physical system, we can get rid of all apparent contradictions of quantum theory. The apparent contradiction between the two statements, that a variable has or hasn’t a value, is resolved by referring the statements to the different systems with which the system in question interacts. If I observe an electron at a certain position, I cannot conclude that the electron is there: I can only conclude that the electron, as seen by me, is there.
Indeed, quantum theory must be understood as an account of the way distinct physical systems affect one another when they interact, and not the way physical systems “are”. This account exhausts all that can be said about the physical world. The physical world can be described as a network of interacting components, where there is no meaning to “the state of an isolated system”. The state of a physical system is the network of its relationships with the surrounding systems. The physical structure of the world is identified with this network of relationships.
The unique account of the state of the world of the classical theory is thus shattered into a multiplicity of accounts, one for each possible “observing” physical system. Quantum mechanics is a theory about the physical description of physical systems relative to other systems, and this is a complete description of the world.
Of course, we can pick a system O once and for all as “the observer system” and be concerned only with the effects of the rest of the world on this system. Each interaction between the rest of the world and O is correctly described by standard quantum mechanics. In this description, the quantum state Ψ collapses at each interaction with O. This description is completely self-consistent, but it treats O as if it were a special system: a classical, nonquantum system. If we want to describe O itself quantum mechanically, we can, but we have to pick a different system O′ as the observer, and describe the way O interacts with O′. In this description, the quantum properties of O are taken into account, but not the ones of O′, because this description describes the effects of the rest of the world on O′.
Consistency. This relativisation of actuality is viable thanks to a remarkable property of the formalism of quantum mechanics.
John von Neumann was the first to notice that the formalism of the theory treats the measured system (S) and the measuring system (O) differently, but the theory is surprisingly flexible on the choice of where to put the boundary between the two. Different choices give different accounts of the state of the world (for instance, the collapse of the wave function happens at different times), but this does not affect the predictions about the final observations. This flexibility reflects a general structural property of quantum theory, which guarantees consistency among all the distinct “accounts of the world” of the different observing systems. The manner in which this consistency is realized, however, is subtle.
As a simple illustration of this phenomenon, consider the case in which a system O with two states Φ₁ and Φ₂ (say, a light-bulb which can be on or off) interacts with a system S with two states Ψ₁ and Ψ₂ (say, the spin of the electron, which can be up or down). Assume the interaction is such that if the spin is up (down) the light goes on (off). To start with, the electron can be in a superposition of its two states. In the account of the state of the electron that we can associate with the light, the wave function of the electron collapses to one of two states during the interaction, as in (5.188), and the light is then either on or off. But we can also consider the light/electron composite system as a quantum system and study the interactions of this composite system with another system O′. In the account associated to O′, there is no collapse at the time of the interaction, and the composite system is still in the superposition of the two states [spin up/light on] and [spin down/light off] after the interaction, as in (5.189). As remarked above, it is necessary to assume this superposition because it accounts for measurable interference effects between the two states: if quantum mechanics is correct, these interference effects are truly observable by O′.
So, we have two discordant accounts of the same events: the one associated to O, where the spin has a determined value, and the one associated to O′, where the spin is in a superposition. Now, can the two discordant accounts be compared, and does the comparison lead to a contradiction?
They can be compared, because the information on the first account is stored in the state of the light, and O′ has access to this information. Therefore O and O′ can compare their accounts of the state of the world. However, the comparison does not lead to a contradiction, because the comparison is itself a physical process that must be understood in the context of quantum mechanics.
Indeed, O′ can physically interact with the electron and then with the light (or, equivalently, with the light and then with the electron). If, for instance, O′ finds the spin of the electron up, quantum mechanics predicts that the observer will then consistently find the light on, because in the first measurement the state of the composite system collapses on its [spin up/light on] component, namely on the first term of the right-hand side of (5.189).
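The consistency check just described can be sketched for the electron/light example. The code below is an illustration under assumptions: the amplitudes c₁, c₂ are arbitrary illustrative values, the state is stored as a dictionary of amplitudes, and all names are hypothetical. It follows the two successive interactions of O′: first with the spin, then with the light.

```python
import math

c1, c2 = math.sqrt(0.3), math.sqrt(0.7)   # illustrative amplitudes (assumption)

# State (5.189) of [electron + light] relative to O', as amplitudes on (spin, light).
state = {("up", "on"): c1, ("down", "off"): c2,
         ("up", "off"): 0.0, ("down", "on"): 0.0}

def prob_spin(spin):
    """Probability that O' finds the given spin value in its first interaction."""
    return sum(abs(a) ** 2 for (s, _), a in state.items() if s == spin)

def collapse_on_spin(spin):
    """Project on the spin outcome and renormalize (collapse relative to O')."""
    norm = math.sqrt(prob_spin(spin))
    return {key: (a / norm if key[0] == spin else 0.0)
            for key, a in state.items()}

def prob_light(post_state, light):
    """Probability of the light reading in the second interaction."""
    return sum(abs(a) ** 2
               for (_, light_value), a in post_state.items()
               if light_value == light)

after_up = collapse_on_spin("up")
p_on_given_up = prob_light(after_up, "on")
p_off_given_up = prob_light(after_up, "off")
print(prob_spin("up"), p_on_given_up, p_off_given_up)
```

Whatever the amplitudes, once O′ has found the spin up, the light is found on with probability one: the account of O′ never contradicts the account stored in the light.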
That is, the multiplicity of accounts does not lead to a contradiction precisely because the comparison between different accounts can only be a physical quantum interaction. Many common paradoxes of quantum mechanics follow from assuming that the communication between different observers violates quantum mechanics.⁴ This internal self-consistency of the quantum formalism is general, and is perhaps its most remarkable aspect.⁵ This self-consistency is a strong indication of the relational nature of the world.

⁴ The EPR (Einstein–Podolsky–Rosen) apparent paradox might be among these. The two observers far from each other are physical systems. The standard account neglects the fact that each of the two is in a quantum superposition with respect to the other, until the moment they physically communicate. But this communication is a physical interaction and must be strictly consistent with causality.
5.6.3 Information
What appears with respect to O as a measurement of the variable A (with a specific outcome) appears with respect to O′ simply as a dynamical process that establishes a correlation between S and O. As far as the observer O is concerned, the variable A of a system S has taken a certain value. As far as the second observer O′ is concerned, the only relevant element of reality is that a correlation is established between S and O.
Concretely, this correlation appears in all further observations that O′ would perform on the [S + O] system. That is, the way the two systems S and O will interact with O′ is characterized by the fact that there is a correlation: O′ will find some properties of O correlated with some properties of S.
On the other hand, until it physically interacts with [S + O], the system O′ has no access to the actual outcomes of the measurements performed by O on S. These actual outcomes are real only with respect to O.
The existence of a correlation between the possible outcomes of a measurement performed by O′ on S and the outcomes of a measurement performed by O′ on O can be interpreted in terms of information. In fact, this correlation corresponds precisely to Shannon’s definition of information. According to this definition, “O has information about S” means that we shall observe O and S in a subset of the set formed by the Cartesian product of the possible states of O and the possible states of S. Thus, a measurement of S by O has the effect that “O has information about S”. This statement has a precise technical meaning, which refers to the possible outcomes of the observations by a third system O′.
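One standard way to put a number on such a correlation is the Shannon mutual information of the joint outcome distribution seen by O′. This quantity is brought in here as an illustrative choice, not as the text’s own definition; the two joint distributions below (equal weights on the correlated pairs, and a fully uncorrelated table) are assumptions made for the example.

```python
import math

def mutual_information(joint):
    """Shannon mutual information, in bits, of a joint distribution {(s, o): p}."""
    p_s, p_o = {}, {}
    for (s, o), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
        p_o[o] = p_o.get(o, 0.0) + p
    return sum(p * math.log2(p / (p_s[s] * p_o[o]))
               for (s, o), p in joint.items() if p > 0)

# After the S-O interaction: O' only ever finds outcomes in a subset of the
# Cartesian product {up, down} x {on, off}.
correlated = {("up", "on"): 0.5, ("down", "off"): 0.5}

# Without any correlation: every pair in the Cartesian product can occur.
uncorrelated = {(s, o): 0.25 for s in ("up", "down") for o in ("on", "off")}

info_correlated = mutual_information(correlated)
info_uncorrelated = mutual_information(uncorrelated)
print(info_correlated, info_uncorrelated)
```

In the correlated case the observed pairs fill only a subset of the Cartesian product and the light carries one bit about the spin. In the uncorrelated case the whole product is populated and the information vanishes.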
On the other hand, if we interact a sufficient number of times with a physical system S, we can then predict (the probability distribution of the) future outcomes of our interactions with this system. In this sense, by interacting with S we can say we “have information” about S. (This information need not be stored or utilized, but its existence is the necessary physical condition for being able to store it or utilize it for predictions.)

⁵ In fact, one may conjecture that this peculiar consistency between the observations of different observers is the missing ingredient for a reconstruction theorem of the Hilbert space formalism of quantum theory. Such a reconstruction theorem is still unavailable. On the basis of reasonable physical assumptions, one is able to derive the structure of an orthomodular lattice containing subsets that form Boolean algebras, which “almost”, but not quite, implies the existence of a Hilbert space and its projectors’ algebra. Perhaps an appropriate algebraic formulation of the condition of consistency between subsystems could provide the missing hypothesis to complete the reconstruction theorem.
Therefore, we have two distinct senses in which the physical theory is about information. But a moment of reflection shows that the two simply reflect the same physical reality, as it affects two different systems. On the one hand, O has information about S because it has interacted with S, and the past interactions are sufficient to “give information”, namely to determine (the probability distribution of) the result of future interactions. On the other hand, O has information about S in the sense that there are correlations in the outcomes of measurements that O′ can make on the two.
There is a crucial subtle difference that can be figuratively expressed as follows: O “knows” about S, while O′ only knows that O knows about S, but does not know what O knows. As far as O′ is concerned, a physical interaction between S and O establishes a correlation: it does not select an outcome.
These observations are sufficient to conclude that what precisely quantum mechanics is about is the information that physical systems have about one another.
The common unease with taking quantum mechanics as a fundamental description of Nature, referred to as the measurement problem, can be traced to the use of an incorrect notion, in the same way that unease with Lorentz transformations derived from the notion, shown by Einstein to be mistaken, of an observer-independent time. The incorrect notion that generates the unease with quantum mechanics is the notion of an observer-independent state of a system, or observer-independent values of physical quantities, or an observer-independent quantum event.
We can assume that all systems are equivalent: there is no a priori observer–observed distinction. The theory describes the information that systems have about one another. The theory is complete, because this description exhausts the physical world.
In physics, the move of deepening our insight into the physical world by relativizing notions previously treated as absolute has been applied repeatedly and very successfully. Here are a few examples.
The notion of the velocity of an object has been understood as meaningless, unless it is referred to a reference body with respect to which the object is moving. With special relativity, simultaneity of two distant events has been understood as meaningless, unless referred to a specific state of motion of something. (This something is usually denoted as “the observer”, without, of course, any implication that the observer is human or has any other peculiar property besides having a state of motion. Similarly, the “observer system” O in quantum mechanics need not be human or have any other property beside the possibility of interacting with the “observed” system S.) With general relativity, the position in space and time of an object has been understood as meaningless, unless it is referred to the gravitational field, or to another dynamical physical entity.
The step proposed by the relational interpretation of quantum mechanics has strong analogies with these. In a sense, it is a longer jump, since all the contingent (variable) properties of all physical systems are taken to be meaningful only as relative to a second physical system. This is not an arbitrary step. It is a conclusion which is difficult to escape, following from the observation – explained above in the example of the “second observer” – that a variable (of a system S) can have a well-determined value a₁ for one observer (O) and at the same time fail to have a determined value for another observer (O′).
This way of thinking of the world has perhaps heavy philosophical implications. But it is Nature that is forcing us to this way of thinking. If we want to understand Nature, our task is not to frame Nature into our philosophical prejudices, but rather to learn how to adjust our philosophical prejudices to what we learn from Nature.
5.6.4 Spacetime relationalism versus quantum relationalism
I close with a very speculative suggestion. As discussed in Section 2.3, the main idea underlying GR is the relational interpretation of localization: objects are not located in spacetime. They are located with respect to one another. In the present section, I have observed that the lesson of QM is that quantum events and states of systems are relational: they make sense only with respect to another system. Thus, both GR and QM are characterized by a form of relationalism. Is there a connection between these two forms of relationalism?
Let us look closer at the two relations. In GR, the localization of an object S in spacetime is relative to another object (or field) O, to which S is contiguous. Contiguities, or equivalently Einstein’s “spacetime coincidences”, are the basic relations that construct spacetime. In QM, there are no absolute properties or facts: properties of a system S are relative to another system O with which S is interacting. Facts are interactions. Thus, interactions form the basic relations between systems.
But there is a strict connection between contiguity and interaction. On the one hand, S and O can interact only if they are contiguous, if they are nearby in spacetime: this is locality. Interaction requires contiguity. On the other hand, what does it mean that S and O are contiguous? What else does it mean besides the fact that they can interact?6 Therefore, contiguity is manifested by interacting. In a sense, the fact that interactions are local means that there is a sort of identity between being contiguous and interacting.

6 The very word “contiguous” derives from the Latin cum-tangere: to touch each other, that is, to inter-act.
Thus, locality ties together very strictly the spacetime relationalism of GR with the relationalism underlying QM. It is tempting to try to develop a general conceptual scheme based on this observation. This could be a conceptual scheme in which contiguity is nothing else than a manifestation of, or can be identified with, the existence of a quantum interaction. The spatiotemporal structure of the world would then be directly determined by who is interacting with whom. This is, of course, very vague, and might lead nowhere, but I find the idea intriguing.
------
Bibliographical notes
Textbooks on quantum theory are numerous. I think the best of all is the first of them, Dirac [148], because of Dirac’s crystal-clear thinking. In the earlier editions, Dirac uses a relativistic notion of state (that does not evolve in time), as is done here. He calls these states “relativistic”, as is done here. In later editions he switches to Schrödinger states that evolve in time, explaining in a preface that it is easier to calculate with these, but it is a pity to give up relativistic states, which are more fundamental.
I have discussed the idea that QM remains consistent also in the absence of unitary time evolution in [98] and [149]. The same idea is developed by many authors; see [26], [150] and references therein.
In the past, I have discussed relativistic systems only in terms of “evolving constants”. The two-oscillators example used in the text was considered in these terms in [151, 152]. The probabilistic interpretation of the covariant formulation presented in this chapter is an evolution of this point of view, and derives from [144].
I have taken the discussion on the boundary formulation of QFT from [145]. The idea that quantum field theory must be formulated in terms of boundary data on a finite surface has been advocated by Robert Oeckl [153]. The derivation of the local Schrödinger equation is in [146] and [154]. The Tomonaga–Schwinger equation was introduced in [155]. On the difficulty of a direct interpretation of the n-point functions in quantum gravity, see for instance [156]. The Hartle–Hawking state was introduced in [157].
The possibility of defining the physical scalar product on the space of the solutions of the Wheeler–DeWitt equation, even when these solutions are nonnormalizable in the kinematical Hilbert space, has been discussed by many authors using a variety of techniques. A nice mathematical construction has been given by Don Marolf; see [158] and references therein.
The thermal time hypothesis was extended to QM and QFT in [125].
The relational interpretation presented here is discussed in [159, 160]; see also [161, 162]. An overview of similar points of view is in the online Stanford Encyclopedia of Philosophy [163]; on a possibly related point of view, see also [164]. The role of information in the foundations of quantum theory has been stressed by John Wheeler in [165, 166]. For a recent discussion on the role of information in the foundation of quantum theory, see for instance [167] and references therein. An original and fascinating point of view on the relational aspects of quantum and relativity is explored by David Finkelstein in [168].
Part II
Loop quantum gravity
– Now it’s time to leave the capsule
if you dare

– This is Major Tom to Ground Control
I’m stepping through the door
And I’m floating in a most peculiar way
And the stars look very different today

David Bowie, Space Oddity