
THERMODYNAMICS AND STATISTICAL MECHANICS

V. Parameswaran Nair
City College of New York


Book: Thermodynamics and Statistical Mechanics (Nair)


This text is disseminated via the Open Education Resource (OER) LibreTexts Project (https://LibreTexts.org) and like the hundreds of other texts available within this powerful platform, it is freely available for reading, printing and "consuming." Most, but not all, pages in the library have licenses that may allow individuals to make changes, save, and print this book. Carefully consult the applicable license(s) before pursuing such effects.

Instructors can adopt existing LibreTexts texts or Remix them to quickly build course-specific resources to meet the needs of their students. Unlike traditional textbooks, LibreTexts' web based origins allow powerful integration of advanced features and new technologies to support learning.

The LibreTexts mission is to unite students, faculty and scholars in a cooperative effort to develop an easy-to-use online platform for the construction, customization, and dissemination of OER content to reduce the burdens of unreasonable textbook costs to our students and society. The LibreTexts project is a multi-institutional collaborative venture to develop the next generation of open-access texts to improve postsecondary education at all levels of higher learning by developing an Open Access Resource environment. The project currently consists of 14 independently operating and interconnected libraries that are constantly being optimized by students, faculty, and outside experts to supplant conventional paper-based books. These free textbook alternatives are organized within a central environment that is both vertically (from advanced to basic level) and horizontally (across different fields) integrated.

The LibreTexts libraries are Powered by MindTouch and are supported by the Department of Education Open Textbook Pilot Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable Learning Solutions Program, and Merlot. This material is based upon work supported by the National Science Foundation under Grant No. 1246120, 1525057, and 1413739. Unless otherwise noted, LibreTexts content is licensed by CC BY-NC-SA 3.0.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation nor the US Department of Education.

Have questions or comments? For information about adoptions or adaptions contact [email protected]. More information on our activities can be found via Facebook (https://facebook.com/Libretexts), Twitter (https://twitter.com/libretexts), or our blog (http://Blog.Libretexts.org).

This text was compiled on 06/11/2022



TABLE OF CONTENTS

These are the lecture notes for the course on Thermodynamics and Statistical Mechanics. This is a course meant for upper level undergraduate students in physics, so that is the level at which most topics are discussed.

1: Basic Concepts

1.1: Definitions
1.2: The Zeroth Law of Thermodynamics
1.3: Equation of State

2: The First Law of Thermodynamics

2.1: The First Law
2.2: Adiabatic and Isothermal Processes
2.3: Barometric Formula and the Speed of Sound

3: The Second Law of Thermodynamics

3.1: Carnot Cycle
3.2: The Second Law
3.3: Consequences of the Second Law
3.4: Absolute Temperature and Entropy
3.5: Some Other Thermodynamic Engines

4: The Third Law of Thermodynamics

5: Thermodynamic Potentials and Equilibrium

5.1: Thermodynamic Potentials
5.2: Thermodynamic Equilibrium
5.3: Phase Transitions

6: Thermodynamic Relations and Processes

6.1: Maxwell Relations
6.2: Other Relations
6.3: Joule-Kelvin Expansion

7: Classical Statistical Mechanics

7.1: The Binomial Distribution
7.2: Maxwell-Boltzmann Statistics
7.3: The Maxwell Distribution For Velocities
7.4: The Gibbsian Ensembles
7.5: Equation of State
7.6: Fluctuations
7.7: Internal Degrees of Freedom
7.8: Examples


8: Quantum Statistical Mechanics

8.1: Prelude to Quantum Statistical Mechanics
8.2: Bose-Einstein Distribution
8.3: Fermi-Dirac Distribution
8.4: Applications of the Bose-Einstein Distribution
8.5: Applications of the Fermi-Dirac Distribution

9: The Carathéodory Principle

9.1: Mathematical Preliminaries
9.2: Carathéodory Statement of the Second Law

10: Entropy and Information

10.1: Information
10.2: Maxwell’s Demon
10.3: Entropy and Gravity

Bibliography

Index

Glossary

Book: Thermodynamics and Statistical Mechanics (Nair) is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


CHAPTER OVERVIEW

1: Basic Concepts

Thermodynamics is the description of thermal properties of matter in bulk. The study of phenomena involving the transfer of heat energy and allied processes forms the subject matter. The fundamental description of the properties of matter in bulk, such as temperature, heat energy, etc., is given by statistical mechanics. For equilibrium states of a system the results of statistical mechanics give us the laws of thermodynamics. These laws were empirically enunciated before the development of statistical mechanics. Taking these laws as axioms, a logical buildup of the subject of thermodynamics is possible.

1.1: Definitions
1.2: The Zeroth Law of Thermodynamics
1.3: Equation of State

1: Basic Concepts is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


1.1: Definitions

Thermodynamic Coordinates

The macroscopically and directly observable quantities for any state of a physical system are the thermodynamic coordinates of that state. As an example, the pressure $p$ and volume $V$ of a gas can be taken as thermodynamic coordinates. In more general situations, other coordinates, such as magnetization and magnetic field, surface tension and area, may be necessary. The point is that the thermodynamic coordinates uniquely characterize the macroscopic state of the system.

Thermal Contact

Two bodies are in thermal contact if there can be free flow of heat between the two bodies.

Adiabatic Isolation

A body is said to be in adiabatic isolation if there can be no flow of heat energy into the body or out of the body into the environment. In other words, there is no exchange of heat energy with the environment.

Thermodynamic Equilibrium

A body is said to be in thermodynamic equilibrium if the thermodynamic coordinates of the body do not change with time.

Two bodies are in thermal equilibrium with each other if on placing them in thermal contact, the thermodynamic coordinates do not change with time.

Quasistatic Changes

The thermodynamic coordinates of a physical system can change due to any number of reasons: compression, magnetization, supply of external heat, work done by the system, etc. A change is said to be quasistatic if the change in going from an initial state of equilibrium to a final state of equilibrium is carried out through a sequence of intermediate states which are all equilibrium states. The expectation is that such quasistatic changes can be achieved by changes which are slow on the time-scale of the molecular interactions.

Since thermodynamics is the description of equilibrium states, the changes considered in thermodynamics are all quasistatic changes.

Work done by a System

It is possible to extract work from a thermodynamic system or work can be done by external agencies on the system, through a series of quasistatic changes. The work done by a system is denoted by $W$. The amount of work done between two equilibrium states of a system will depend on the process connecting them. For example, for the expansion of a gas, the work done by the system is

$$dW = p\, dV \tag{1.1.1}$$

Exact Differentials

Consider a differential form $A$ defined on some neighborhood of an $n$-dimensional manifold which may be written explicitly as

$$A = \sum_i f_i\, dx_i \tag{1.1.2}$$

where $f_i$ are functions of the coordinates $x_i$. $A$ is an exact differential form if we can integrate $A$ along any curve $C$ between two points, say, $\vec{x} = (x_1, x_2, \cdots, x_n)$ and $\vec{x}^{\,\prime} = (x'_1, x'_2, \cdots, x'_n)$, and the result depends only on the two end-points and is independent of the path $C$. This means that there exists a function $F$ in the neighborhood under consideration such that $A = dF$. A necessary condition for exactness of $A$ is

$$\frac{\partial f_j}{\partial x_i} - \frac{\partial f_i}{\partial x_j} = 0 \tag{1.1.3}$$

Page 8: ME CH ANI CS AND STATI STI CAL TH E R MODYNAMI CS

1.1.2 https://phys.libretexts.org/@go/page/32010

Conversely, if the conditions in Equation 1.1.3 are satisfied, then one can find a function $F$ such that $A = dF$ in a star-shaped neighborhood of the points $\vec{x}$ and $\vec{x}^{\,\prime}$.

The differential forms we encounter in thermodynamics are not necessarily exact. For example, the work done by a system, say $dW$, is not an exact differential. Thus the work done in connecting two states $\alpha$ and $\beta$, which is given by $\int_\alpha^\beta dW$, will depend on the path, i.e., the process involved in going from state $\alpha$ to state $\beta$.

1.1: Definitions is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
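As a quick illustration of the exactness condition 1.1.3 (this sketch is an added example, not part of the original notes; it assumes the SymPy library is available), one can test simple differential forms in the plane symbolically:

import sympy as sp

x1, x2 = sp.symbols('x1 x2')

def is_exact(f, coords):
    # Necessary condition for exactness, Eq. (1.1.3): all antisymmetrized
    # derivatives of the components must vanish.
    n = len(coords)
    return all(
        sp.simplify(sp.diff(f[j], coords[i]) - sp.diff(f[i], coords[j])) == 0
        for i in range(n) for j in range(i + 1, n)
    )

# A = x2 dx1 + x1 dx2 is exact (A = d(x1*x2))
print(is_exact([x2, x1], [x1, x2]))      # True

# A = x2 dx1 - x1 dx2 is not exact
print(is_exact([x2, -x1], [x1, x2]))     # False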


1.2: The Zeroth Law of Thermodynamics

This can be stated as follows.

Zeroth Law of Thermodynamics: If two bodies $A$ and $B$ are in thermal equilibrium with a third body $C$, then they are in thermal equilibrium with each other.

Consequences of the Zeroth Law

Thermal equilibrium of two bodies will mean a restrictive relation between the thermodynamic coordinates of the first body and those of the second body. In other words, thermal equilibrium means that

$$F(\vec{x}_A, \vec{x}_B) = 0 \tag{1.2.1}$$

if $A$ and $B$ are in thermal equilibrium. Thus the zeroth law states that

$$\left.\begin{aligned} F(\vec{x}_A, \vec{x}_B) &= 0\\ F(\vec{x}_B, \vec{x}_C) &= 0\end{aligned}\right\} \;\Rightarrow\; F(\vec{x}_A, \vec{x}_C) = 0 \tag{1.2.2}$$

This is possible if and only if the relations are of the form

$$F(\vec{x}_A, \vec{x}_B) = t(\vec{x}_A) - t(\vec{x}_B) = 0 \tag{1.2.3}$$

This means that, for any body, there exists a function $t(\vec{x})$ of the thermodynamic coordinates $\vec{x}$, such that equality of $t$ for two bodies implies that the bodies are in thermal equilibrium. The function $t$ is not uniquely defined. Any single-valued function of $t$, say $T(t)$, will also satisfy the conditions for equilibrium, since

$$t_A = t_B \;\Rightarrow\; T_A = T_B \tag{1.2.4}$$

The function $t$ is called the empirical temperature. This is the temperature measured by gas thermometers.

The zeroth law defines the notion of temperature. Once it is defined, we can choose the $n+1$ variables $(\vec{x}, t)$ as the thermodynamic coordinates of the body, of which only $n$ are independent. The relation $t = t(\vec{x})$ is an equation of state.

1.2: The Zeroth Law of Thermodynamics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


1.3: Equation of State

The Ideal Gas Equation of State

In specifying the equation of state, we will use the absolute temperature, denoted by $T$. We will introduce this concept later, but for now, we will take it as given. The absolute temperature is always positive, varying from zero (or absolute zero) to infinity. The ideal gas is then characterized by the equation of state

$$p\, V = N\, k\, T \tag{1.3.1}$$

where $N$ denotes the number of molecules of the gas and $k$ is a constant, known as Boltzmann’s constant. Another way of writing this is as follows. We define the Avogadro number as $6.02214 \times 10^{23}$. This number comes about as follows. The mass of an atom is primarily due to the protons and neutrons in its nucleus. Each proton has a mass of $1.6726 \times 10^{-24}$ gm, each neutron has a mass of $1.6749 \times 10^{-24}$ gm. If we neglect the mass difference between the proton and the neutron, the mass of an atom of atomic weight $A$ (= number of protons + number of neutrons in the nucleus) is given by $A \times 1.6726 \times 10^{-24}$ gm. Thus if we take $A$ grams of the material, the number of atoms is given by $(1.67 \times 10^{-24})^{-1} \approx 6 \times 10^{23}$. This is essentially the Avogadro number. The mass difference between the proton and neutron is not completely negligible and also there are slight variations from one type of nucleus to another due to the varying binding energies of the protons and neutrons. So we standardize the Avogadro number by defining it as $6.02214 \times 10^{23}$, which is very close to the number of atoms in 12 gm of the isotope $C^{12}$ of carbon (which was used to standardize the atomic masses).

If we have $N$ molecules of a material, we say that it has $n$ moles of the material, where $n = \frac{N}{\text{Avogadro Number}}$. Thus we can rewrite the ideal gas equation of state as

$$p\, V = n\, R\, T, \qquad R = k\,(\text{Avogadro Number}) \tag{1.3.2}$$

Numerically, we have, in joules per kelvin unit of temperature,

$$k \approx 1.38065 \times 10^{-23}\ \frac{\text{J}}{\text{K}}, \qquad R \approx 8.3145\ \frac{\text{J}}{\text{mole K}} \tag{1.3.3}$$

The van der Waals Equation of State

The ideal gas law is never obtained for real gases. There are intermolecular forces which change the equation of state, not to mention the quantum nature of the dynamics of the molecules which becomes more important at low temperatures. The equation of state can in principle be calculated or determined from the intermolecular forces in statistical mechanics. The corrections to the ideal gas law can be expressed as a series of terms known as the virial expansion, the second virial coefficient being the first such correction. While the method is general, the specifics depend on the nature of the molecules and a simple formula is not easy to write down.

Figure 1.3.1: The general form of the intermolecular potential as a function of distance

An equation of state which captures some very general features of the intermolecular forces was written down by van der Waals, in the 1870s, long before the virial expansion was developed. It is important in that it gives a good working approximation for many gases. The van der Waals equation of state is

$$\left(p + \frac{a N^2}{V^2}\right)(V - bN) = NkT \tag{1.3.4}$$


The reasoning behind this equation is as follows. In general, intermolecular forces have a short range repulsion, see Fig. 1.3.1. This prevents the molecules from forming bound states. The formation of bound states would be a chemical reaction, so we are really considering gases where there is no further chemical reaction beyond the initial formation of the molecules. For example, if we consider oxygen, two oxygen atoms bind together to form the oxygen molecule $O_2$, but there is no binding for two oxygen molecules (two $O_2$'s) to form something more complicated. At the potential level, this is due to a short range repulsion keeping them from binding together. In van der Waals’ reasoning, such an effect could be incorporated by arguing that the full volume $V$ is not available to the molecules, a certain volume $b$ around each molecule is excluded from being accessible to other molecules. So we must replace $V$ by $(V - bN)$ in the ideal gas law.

Intermolecular forces also have an attractive part at slightly larger separations. This attraction would lead to the molecules coming together, thus reducing the pressure. So the pressure calculated assuming the molecules are noninteracting, which is the kinetic pressure $p_{\rm kin}$, must be related to the actual pressure $p$ by

$$p = p_{\rm kin} - \frac{a N^2}{V^2} \tag{1.3.5}$$

The interaction is pairwise primarily, so we expect a factor of $\frac{N(N-1)}{2}$ (which is the number of pairings one can do) for $N$ molecules. This goes like $\sim N^2$, which explains the second term in Equation 1.3.5. Using the ideal gas law for $p_{\rm kin}$ with the volume $(V - bN)$, we get Equation 1.3.4. In this equation, $a$ and $b$ are parameters specific to the gas under consideration.

1.3: Equation of State is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
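To get a feel for the size of the van der Waals corrections, the short sketch below (an added illustration, not part of the original notes) evaluates Equation 1.3.1 and Equation 1.3.4 for one mole of gas; the values of $a$ and $b$ used are merely representative placeholders, not data for any particular gas.

# Compare the ideal gas pressure (Eq. 1.3.1) with the van der Waals
# pressure from Eq. 1.3.4, p = NkT/(V - bN) - a N^2/V^2.
k = 1.38065e-23          # Boltzmann's constant, J/K (Eq. 1.3.3)
N_A = 6.02214e23         # Avogadro number

def p_ideal(N, V, T):
    return N * k * T / V

def p_vdw(N, V, T, a, b):
    return N * k * T / (V - b * N) - a * N**2 / V**2

N = N_A                   # one mole of molecules
T = 300.0                 # kelvin
V = 1.0e-3                # cubic metres (1 litre)
a, b = 1.0e-48, 5.0e-29   # placeholder van der Waals parameters (SI units)

print(f"ideal gas:     p = {p_ideal(N, V, T):.3e} Pa")
print(f"van der Waals: p = {p_vdw(N, V, T, a, b):.3e} Pa")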


2.1: The First Law

The first law of thermodynamics is the conservation of energy, including the equivalence of work and energy, and is about assigning an internal energy to the system.

First Law of Thermodynamics: When an amount of heat $dQ$ is supplied to the system and an amount of work $dW$ is done by the system, changes are produced in the thermodynamic coordinates of the system such that

$$dU = dQ - dW \tag{2.1.1}$$

where $U$ is a function of the thermodynamic coordinates of the system. (In other words, $dU$ is an exact differential.)

If the system is in adiabatic isolation, $dQ = 0$ and $dU = -dW$. Since $dU$ is an exact differential, this means that

$$W_{\rm adiab} = U_{\rm initial} - U_{\rm final} \tag{2.1.2}$$

Thus the adiabatic work done by the system is independent of the process involved and depends only on the initial and final states of the system. It is the recognition of this fact through careful experiments (by Joule) which led to the first law.

The quantity $U$ is called the internal energy of the system. For a gas, where the work done is given by Equation 1.1.1, we may write the first law as

$$dU = dQ - p\, dV \tag{2.1.3}$$

If we put $dQ = 0$, we get a relation between $p$ and $V$ which is valid for adiabatic processes. The curve connecting $p$ and $V$ so obtained is called an adiabatic. Starting with different initial states, we can get a family of adiabatics. In general, when we have more thermodynamic coordinates, adiabatics can be similarly defined, but are higher dimensional surfaces.

Specific Heats

When heat is supplied to a body, the temperature increases. The amount of heat $dQ$ needed to raise the temperature by $dT$ is called the specific heat. This depends on the process, on what parameters are kept constant during the supply of heat. Two useful specific heats for a gas are defined for constant volume and constant pressure. If heat is supplied keeping the volume constant, then the internal energy will increase. From the first law, we find, since $dV = 0$,

$$dU = dQ = C_v\, dT \tag{2.1.4}$$

Thus the specific heat $C_v$ may be defined as the rate of increase of internal energy with respect to temperature. For supply of heat at constant pressure, we have

$$d(U + pV) = dQ + V\, dp = dQ \equiv C_p\, dT$$

Thus the specific heat at constant pressure may be taken as the rate at which the quantity $U + pV$ increases with temperature. The latter quantity is called the enthalpy.

In general, the two specific heats are functions of temperature. The specific heat at constant volume, $C_v$, can be calculated using statistical mechanics or it can be measured in experiments. $C_p$ can then be evaluated using the equation of state for the material. The ratio $C_p/C_v$ is often denoted by $\gamma$.

2.1: The First Law is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
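For an ideal gas the equation of state $pV = NkT$ makes the enthalpy $U + NkT$, so the two specific heats defined above differ by $Nk$ (per mole, $C_p = C_v + nR$). The snippet below is an added illustration of this bookkeeping, not part of the notes; the value $C_v = \tfrac{3}{2} nR$ for a monatomic ideal gas is the standard result, quoted here without derivation.

# Specific-heat bookkeeping for one mole of a monatomic ideal gas.
R = 8.3145            # J / (mol K), Eq. 1.3.3
n = 1.0               # moles

Cv = 1.5 * n * R      # standard monatomic value (not derived in this chapter)
Cp = Cv + n * R       # enthalpy U + pV = U + nRT  =>  Cp = Cv + nR
gamma = Cp / Cv

dT = 10.0             # raise the temperature by 10 K
print(f"Q at constant volume:   {Cv * dT:.1f} J   (dU = Cv dT, Eq. 2.1.4)")
print(f"Q at constant pressure: {Cp * dT:.1f} J   (d(U + pV) = Cp dT)")
print(f"gamma = Cp/Cv = {gamma:.3f}")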


2.2: Adiabatic and Isothermal Processes

Among the various types of thermodynamic processes possible, there are two very important ones. These are the adiabatic and isothermal processes. An adiabatic process is one in which there is no supply of heat to the body undergoing change of thermodynamic state. In other words, the body is in adiabatic isolation. An isothermal process is a thermodynamic change where the temperature of the body does not change.

The thermodynamic variables involved in the change can be quite general; for example, we could consider magnetization and the magnetic field, surface tension and area, or pressure and volume. For a gas undergoing thermodynamic change, the relevant variables are pressure and volume. In this case, for an adiabatic process, since $dQ = 0$,

$$dU \equiv C_v\, dT = -p\, dV, \qquad d(U + pV) \equiv C_p\, dT = V\, dp$$

From these, we find

$$\gamma\, \frac{dV}{V} + \frac{dp}{p} = 0 \tag{2.2.1}$$

Generally $\gamma$ can depend on temperature (and hence on pressure), but if we consider a material (such as the ideal gas) for which $\gamma$ is a constant, the above equation gives

$$p\, V^{\gamma} = \text{constant} \tag{2.2.2}$$

This is the equation for an adiabatic process for an ideal gas.

If we consider an isothermal process for an ideal gas, the equation of state gives

$$pV = nRT = \text{constant} \tag{2.2.3}$$

2.2: Adiabatic and Isothermal Processes is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
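As a numerical check on Equations 2.2.2 and 2.2.3 (an added sketch, not from the notes; the starting state and $\gamma = 5/3$ are representative example values), one can compare the pressures reached when an ideal gas is compressed to half its volume isothermally versus adiabatically:

# Compress an ideal gas from (p0, V0) to V1 = V0/2:
#  - isothermally:  p V = const          (Eq. 2.2.3)
#  - adiabatically: p V**gamma = const   (Eq. 2.2.2)
p0, V0 = 1.0e5, 1.0e-3      # Pa, m^3
V1 = V0 / 2
gamma = 5.0 / 3.0           # representative value for a monatomic gas

p_isothermal = p0 * V0 / V1
p_adiabatic = p0 * (V0 / V1)**gamma

print(f"isothermal: p = {p_isothermal:.3e} Pa")
print(f"adiabatic:  p = {p_adiabatic:.3e} Pa")
# The adiabatic pressure is higher because the temperature rises during
# the compression (pV**gamma = const together with pV = nRT).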


2.3: Barometric Formula and the Speed of Sound

Here we consider two simple examples of using the ideal gas law and the formula for adiabatic expansion. First, consider the barometric formula which gives the density (or pressure) of air at a height $h$ above the surface of Earth (also called Pascal's principle). We assume complete equilibrium, mechanical and thermal. The argument is illustrated in Figure 2.3.1.

Figure 2.3.1: Matching of forces for barometric formula

We consider a layer of air, with horizontal cross-sectional area $A$ and height $dh$. If we take the molecules to have a mass $m$ and the number density of particles to be $\rho$, then the weight of this layer of air is $mg \times \rho A\, dh$. This is the force acting downward. It is compensated by the difference of pressure between the upper boundary and the lower boundary for this layer. The latter is thus $dp \times A$, again acting downward, as we have drawn it. The total force being zero for equilibrium, we get

$$dp\, A + mg\,\rho A\, dh = 0 \tag{2.3.1}$$

Thus the variation of pressure with height is given by

$$\frac{dp}{dh} = -mg\,\rho \tag{2.3.2}$$

Let us assume the ideal gas law for air; this is not perfect, but is a reasonably good approximation. Then $p = \frac{NkT}{V} = \rho k T$ and the equation above becomes

$$\frac{dp}{dh} = -\frac{mg}{kT}\, p \tag{2.3.3}$$

The solution is

$$p = p_0\, \exp\left(-\frac{mgh}{kT}\right), \tag{2.3.4}$$

or

$$\rho = \rho_0\, \exp\left(-\frac{mgh}{kT}\right) \tag{2.3.5}$$

This argument is admittedly crude. In reality, the temperature also varies with height. Further, there are so many nonequilibrium processes (such as wind, heating due to absorption of solar radiation, variation of temperature between day and night, etc.) in the atmosphere that Equation 2.3.5 can only be valid for a short range of height and over a small area of local equilibrium.

Our next example is about the speed of sound. For this, we can treat the medium, say, air, as a fluid, characterized by a number density $\rho$ and a flow velocity $\vec{v}$. The equations for the fluid are

$$\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\, \vec{v}) = 0 \tag{2.3.6}$$

$$\frac{\partial \vec{v}}{\partial t} + v_i\, \partial_i \vec{v} = -\frac{\nabla p}{\rho} \tag{2.3.7}$$

Equation 2.3.6 is the equation of continuity which expresses the conservation of particle number, or mass, if we multiply the equation by the mass of a molecule. Equation 2.3.7 is the fluid equivalent of Newton’s second law.


In the absence of external forces, we still can have force terms; in fact the gradient of the pressure, as seen from the equation, acts as a force term. This may also be re-expressed in terms of the density, since pressure and density are related by the equation of state.

Now consider the medium in equilibrium with no sound waves in it. The equilibrium pressure should be uniform; we denote this by $p_0$, with the corresponding uniform density as $\rho_0$. Further, we have $\vec{v} = 0$ in equilibrium. Now we can consider sound waves as perturbations on this background, writing

$$\rho = \rho_0 + \delta\rho, \qquad p = p_0 + \left(\frac{\partial p}{\partial \rho}\right)_0 \delta\rho, \qquad \vec{v} = 0 + \vec{v}$$

Treating $\delta\rho$ and $\vec{v}$ as being of the first order in the perturbation, the fluid equations can be approximated as

$$\frac{\partial}{\partial t}(\delta\rho) \approx -\rho_0\, \nabla\cdot\vec{v}, \qquad \frac{\partial \vec{v}}{\partial t} \approx -\frac{1}{\rho_0}\left(\frac{\partial p}{\partial \rho}\right)_0 \nabla\delta\rho$$

Taking the derivative of the first equation and using the second, we find

$$\left(\frac{\partial^2}{\partial t^2} - c_s^2\, \nabla^2\right)\delta\rho = 0 \tag{2.3.8}$$

where

$$c_s^2 = \left(\frac{\partial p}{\partial \rho}\right)_0 \tag{2.3.9}$$

Equation 2.3.8 is the wave equation with the speed of propagation given by $c_s$. This equation shows that density perturbations can travel as waves; these are the sound waves. It is easy to check that the wave equation has wave-like solutions of the form

$$\delta\rho = A\, \cos(\omega t - \vec{k}\cdot\vec{x}) + B\, \sin(\omega t - \vec{k}\cdot\vec{x}) \tag{2.3.10}$$

with $\omega^2 = c_s^2\, \vec{k}\cdot\vec{k}$. This again identifies $c_s$ as the speed of propagation of the wave.

We can calculate the speed of sound waves more explicitly using the equation of state. If we take air to obey the ideal gas law, we have $p = \rho k T$. If the process of compression and de-compression which constitutes the sound wave is isothermal, then

$$\left(\frac{\partial p}{\partial \rho}\right)_0 = \frac{p_0}{\rho_0}. \tag{2.3.11}$$

Actually, the time-scale for the compression and de-compression in a sound wave is usually very short compared to the time needed for proper thermalization, so that it is more accurate to treat it as an adiabatic process. In this case, we have

$$p\, V^{\gamma} = \text{constant} \tag{2.3.12}$$

so that

$$c_s^2 = \left(\frac{\partial p}{\partial \rho}\right)_0 = \gamma\, \frac{p_0}{\rho_0} \tag{2.3.13}$$

Since $\gamma > 1$, this gives a speed higher than what would be obtained in an isothermal process. Experimentally, one can show that this formula for the speed of sound is what is obtained, showing that sound waves result from an adiabatic process.

2.3: Barometric Formula and the Speed of Sound is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
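Both results of this section are easy to evaluate numerically. The sketch below is an added illustration (the average molecular mass, temperature and $\gamma = 1.4$ are rough values for air, assumed here and not taken from the notes); it evaluates the barometric formula 2.3.4 and compares the isothermal and adiabatic sound speeds of Equations 2.3.11 and 2.3.13.

import math

k = 1.38065e-23      # Boltzmann's constant, J/K
m = 4.8e-26          # rough average molecular mass of air, kg (assumed)
g = 9.81             # m/s^2
T = 300.0            # K
gamma = 1.4          # value appropriate for air

# Barometric formula, Eq. 2.3.4: p = p0 exp(-mgh/kT)
p0 = 1.013e5         # Pa
for h in (0.0, 1000.0, 5000.0, 10000.0):
    print(f"h = {h:7.0f} m   p = {p0 * math.exp(-m * g * h / (k * T)):9.0f} Pa")

# Speed of sound: c_s^2 = p/rho (isothermal) vs gamma p/rho (adiabatic),
# where rho here is the mass density, rho = m p / (kT).
rho = p0 * m / (k * T)
c_isothermal = math.sqrt(p0 / rho)
c_adiabatic = math.sqrt(gamma * p0 / rho)
print(f"isothermal sound speed: {c_isothermal:6.1f} m/s")
print(f"adiabatic  sound speed: {c_adiabatic:6.1f} m/s")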


CHAPTER OVERVIEW

3: The Second Law of Thermodynamics

3.1: Carnot Cycle
3.2: The Second Law
3.3: Consequences of the Second Law
3.4: Absolute Temperature and Entropy
3.5: Some Other Thermodynamic Engines

3: The Second Law of Thermodynamics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


3.1: Carnot Cycle

A thermodynamic engine operates by taking in heat from a hot reservoir and performing certain work and then giving up a certain amount of heat into a colder reservoir. If it can be operated in reverse, it can function as a refrigerator. The Carnot cycle is a reversible cyclic process (or engine) made of the following four steps:

1. It starts with an adiabatic process which raises the temperature of the working material of the engine to, say, $T_H$.
2. This is followed by an isothermal process, taking in heat from the reservoir at $T_H$.
3. The next step is an adiabatic process which does some amount of work and lowers the temperature of the material to $T_L$.
4. The final step is isothermal, at the lower temperature $T_L$, dumping some amount of heat into a colder reservoir, with the material returning to the thermodynamic state at the beginning of the cycle.

This is an idealized engine, no real engine can be perfectly reversible. The utility of the Carnot engine is to give the framework and logic of the arguments related to the second law of thermodynamics. We may say it is a gedanken engine. The processes involved in the Carnot cycle may refer to compression and expansion if the material is a gas; in this case, the cycle can be illustrated in a $p-V$ diagram as shown in Fig. 3.1.1. But any other pair of thermodynamic variables will do as well. We can think of a Carnot cycle utilizing magnetization and magnetic field, or surface tension and area, or one could consider an electrochemical cell.

Let the amount of heat taken in at temperature $T_H$ be $Q_H$ and let the amount of heat given up at the lower temperature $T_L$ be $Q_L$. Since this is an idealized case, we assume there is no loss of heat due to anything like friction. Thus the amount of work done, according to the first law is

$$W = Q_H - Q_L \tag{3.1.1}$$

Figure 3.1.1: The Carnot cycle for a gas

The efficiency of the engine is given by the amount of work done when a given amount of heat is supplied, which is $\frac{W}{Q_H}$. (The heat $Q_L$ which is dumped into the reservoir at lower temperature is not usable for work.) The efficiency $\eta$ for a Carnot cycle is thus

$$\eta = \frac{Q_H - Q_L}{Q_H} = 1 - \frac{Q_L}{Q_H} \tag{3.1.2}$$

The importance of the Carnot cycle is due to its idealized nature of having no losses and because it is reversible. This immediately leads to some simple but profound consequences.

3.1: Carnot Cycle is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


3.2: The Second Law

The second law of thermodynamics is a statement of what we know by direct experience. It is not something that is derived from more fundamental principles, even though a better understanding of this law has emerged over time. There are several ways to state the second law, the most common ones being the Kelvin statement and the Clausius statement.

K: Kelvin statement of the second law: There exists no thermodynamic process whose sole effect is to extract an amount of heat from a source and convert it entirely to work.

C: Clausius statement of the second law: There exists no thermodynamic process whose sole effect is to extract an amount of heat from a colder source and deliver it to a hotter source.

The keyword here is “sole". Consider the expansion of a gas and consequent conversion of heat into work. Complete conversion can be achieved but this is not the sole effect, for the state of the system has been changed. The second law does not forbid such a process.

The two statements are equivalent. This can be seen by showing that $\tilde{K} \Rightarrow \tilde{C}$ and $\tilde{C} \Rightarrow \tilde{K}$, where the tildes denote the negation of the statements.

First consider $\tilde{K} \Rightarrow \tilde{C}$. Consider two heat reservoirs at temperatures $T_L$ and $T_H$, with $T_H > T_L$. Since $K$ is false, we can extract a certain amount of heat from the colder reservoir (at $T_L$) and convert it entirely to work. Then, we can use this work to deliver a certain amount of heat to the hotter reservoir. For example, work can be converted to heat by processes like friction. So we can have some mechanism like this to heat up the hotter source further. The net result of this operation is to extract a certain amount of heat from a colder source and deliver it to a hotter source, thus contradicting $C$. Thus $\tilde{K} \Rightarrow \tilde{C}$.

Now consider $\tilde{C} \Rightarrow \tilde{K}$. Since $C$ is presumed false, we can extract an amount of heat, say $Q_2$, from the colder reservoir (at $T_L$) and deliver it to the hotter source. Then we can have a thermodynamic engine take this amount of heat $Q_2$ from the hotter reservoir and do a certain amount of work $W = Q_2 - Q_1$, delivering an amount of heat $Q_1$ to the colder reservoir. The net result of this cycle is to take the net amount of heat $Q_2 - Q_1$ from the reservoir at $T_L$ and convert it entirely to work. This shows that $\tilde{C} \Rightarrow \tilde{K}$.

The two statements $\tilde{K} \Rightarrow \tilde{C}$ and $\tilde{C} \Rightarrow \tilde{K}$ show the equivalence of the Kelvin and Clausius statements of the second law.

The second law is a statement of experience. Most of the thermodynamic results can be derived from a finer description of materials, in terms of molecules, atoms, etc. However, to date, there is no clear derivation of the second law. Many derivations, such as Boltzmann’s $H$-theorem, or descriptions in terms of information, have been suggested, which are important in their own ways, but all of them have some additional assumptions built-in. This is not to say that they are not useful. The assumptions made have a more fundamental nature and do clarify many aspects of the second law.

3.2: The Second Law is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


3.3: Consequences of the Second Law

Once we take the second law as an axiom of thermodynamics, there are some important and immediate consequences. The first result is about the efficiency of the Carnot cycle, captured as the following theorem.

Theorem 3.3.1: Carnot Theorem

No engine operating between two specified heat reservoirs can be more efficient than a Carnot engine.

The proof is easy, based on the second law. Consider two engines, say a Carnot engine $E_1$ and another engine we call $E_2$, and two reservoirs, at temperatures $T_H$ and $T_L$. The Carnot engine is reversible, so we can operate it as a refrigerator. So we can arrange for it to take a certain amount of heat $Q_1$ from the colder reservoir and deliver an amount $Q_2 > Q_1$ to the hotter reservoir. This will, of course, require work $W = Q_2 - Q_1$ to drive the Carnot engine. Now we can arrange for $E_2$ to take $Q_2$ from the hotter reservoir, and do an amount of work $W' = Q_2 - Q'_1$, delivering heat $Q'_1$ to the colder reservoir. The efficiencies are given by

$$\eta_1 = \frac{W}{Q_2}, \qquad \eta_2 = \frac{W'}{Q_2} \tag{3.3.1}$$

Assume $E_2$ is more efficient. Then $\eta_2 > \eta_1$, or $W' > W$. Thus $Q_1 > Q'_1$. The net amount of heat extracted from the hotter reservoir is zero, the net amount of heat extracted from the colder reservoir is $Q_1 - Q'_1$. This is entirely converted to work (equal to $W' - W$) contradicting the Kelvin statement of the second law. Hence our assumption of $\eta_2 > \eta_1$ must be false, proving the Carnot theorem. Thus we must have $\eta_2 \le \eta_1$.

We also have an immediate corollary to the theorem:

Proposition 1

All perfectly reversible engines operating between two given temperatures have the same efficiency.

This is also easily proved. Consider the engine $E_2$ to be a Carnot engine. From what we have already shown, we will have $\eta_2 \le \eta_1$. Since $E_2$ is reversible, we can change the roles of $E_1$ and $E_2$, running $E_2$ as a refrigerator and $E_1$ as the engine producing work. In this case, the previous argument would lead to $\eta_1 \le \eta_2$. We end up with two statements, $\eta_1 \le \eta_2$ and $\eta_2 \le \eta_1$. The only solution is $\eta_1 = \eta_2$. Notice that this applies to any reversible engine, since we have not used any specific properties of the Carnot engine except reversibility.

If an engine $E_2$ is irreversible, the previous arguments hold, showing $\eta_2 \le \eta_1$, but we cannot get the other inequality because $E_2$ is not reversible. Thus irreversible engines are less efficient than the Carnot engine.

A second corollary to the theorem is the following:

Proposition 2

The efficiency of a Carnot engine is independent of the working material of the engine.

The arguments so far did not use any specific properties of the material of the Carnot engine, and since all Carnot engines between two given reservoirs have the same efficiency, this is clear. We now state another important consequence of the second law.

Proposition 3

The adiabatics of a thermodynamic system do not intersect.

We prove again by reductio ad absurdum. Assume the adiabatics can intersect, as shown in Fig. 3.3.1. Then we can consider a process going from $A$ to $B$ which is adiabatic and hence no heat is absorbed or given up, then a process from $B$ to $C$ which absorbs some heat $\Delta Q$, and then goes back to $A$ along another adiabatic. Since the thermodynamic state at $A$ is restored, the temperature and internal energy are the same at the end as at the beginning, so that $\Delta U = 0$. Thus by the first law, $\Delta Q = \Delta W$, which means that a certain amount of heat is absorbed and converted entirely to work with no other change in the system. This contradicts the Kelvin statement of the second law. It follows that adiabatics cannot intersect.


3.4: Absolute Temperature and Entropy

Another consequence of the second law is the existence of an absolute temperature. Although we have used the notion of absolute temperature, it was not proven. Now we can show this just from the laws of thermodynamics.

We have already seen that the efficiency of a Carnot cycle is given by

$$\eta = 1 - \frac{Q_2}{Q_3} \tag{3.4.1}$$

where $Q_3$ is the amount of heat taken from the hotter reservoir and $Q_2$ is the amount given up to the colder reservoir. The efficiency is independent of the material and is purely a function of the lower and upper temperatures. The system under consideration can be taken to be in thermal contact with the reservoirs, which may be considered very large. There is no exchange of particles or any other physical quantity between the reservoirs and the system, so no parameter other than the temperature can play a role in this. Let $\theta$ denote the empirically defined temperature, with $\theta_3$ and $\theta_2$ corresponding to the reservoirs between which the engine is operating. We may thus write

$$\frac{Q_2}{Q_3} = f(\theta_2, \theta_3) \tag{3.4.2}$$

for some function $f$ of the temperatures. Now consider another Carnot engine operating between $\theta_2$ and $\theta_1$, with corresponding Q’s, so that we have

$$\frac{Q_1}{Q_2} = f(\theta_1, \theta_2) \tag{3.4.3}$$

Now we can couple the two engines and run it together as a single engine, operating between $\theta_3$ and $\theta_1$, with

$$\frac{Q_1}{Q_3} = f(\theta_1, \theta_3) \tag{3.4.4}$$

Evidently

$$\frac{Q_2}{Q_3}\,\frac{Q_1}{Q_2} = \frac{Q_1}{Q_3} \tag{3.4.5}$$

so that we get the relation

$$f(\theta_1, \theta_2)\, f(\theta_2, \theta_3) = f(\theta_1, \theta_3) \tag{3.4.6}$$

This requires that the function $f$ must be of the form

$$f(\theta_1, \theta_2) = \frac{f(\theta_1)}{f(\theta_2)} \tag{3.4.7}$$

for some function $f(\theta)$. Thus there must exist some function of the empirical temperature which can be defined independently of the material. This temperature is called the absolute temperature. Notice that since $Q_1 < Q_2$, we have $|f(\theta_1)| < |f(\theta_2)|$ if $\theta_1 < \theta_2$. Thus $|f|$ should be an increasing function of the empirical temperature. Further, we cannot have $f(\theta_1) = 0$ for some temperature $\theta_1$. This would require $Q_1 = 0$. The corresponding engine would take some heat $Q_2$ from the hotter reservoir and convert it entirely to work, contradicting the Kelvin statement. This means that we must take $f$ to be either always positive or always negative, for all $\theta$. Conventionally we take this to be positive. The specific form of the function determines the scale of temperature. The simplest is to take a linear function of the empirical temperatures (as defined by conventional thermometers). Today, we take this to be

$$f(\theta) \equiv T = \text{Temperature in Celsius} + 273.16 \tag{3.4.8}$$

The unit of absolute temperature is the kelvin.

Once the notion of absolute temperature has been defined, we can simplify the formula for the efficiency of the Carnot engine as

$$\eta = 1 - \frac{T_L}{T_H} \tag{3.4.9}$$
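A quick numerical illustration of Equations 3.1.1, 3.4.9 and 3.4.10 (this is an added example; the temperatures and heat input are arbitrary numbers, not taken from the notes):

# Carnot efficiency, eta = 1 - T_L/T_H (Eq. 3.4.9), and the heat
# bookkeeping W = Q_H - Q_L of Eq. 3.1.1.
T_H, T_L = 500.0, 300.0       # reservoir temperatures in kelvin
Q_H = 1000.0                  # heat taken in per cycle, joules

eta = 1.0 - T_L / T_H
W = eta * Q_H                 # work done per cycle
Q_L = Q_H - W                 # heat dumped into the colder reservoir

print(f"efficiency eta = {eta:.3f}")
print(f"work per cycle W = {W:.1f} J, heat rejected Q_L = {Q_L:.1f} J")
# For a reversible cycle Q_H/T_H = Q_L/T_L, as in Eq. 3.4.10.
print(f"Q_H/T_H = {Q_H / T_H:.3f}, Q_L/T_L = {Q_L / T_L:.3f}")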


Figure 3.4.1: Illustrating Clausius theorem

Also, we have $\frac{Q_1}{Q_2} = \frac{T_1}{T_2}$, which we may rewrite as $\frac{Q_1}{T_1} = \frac{Q_2}{T_2}$. Since $Q_2$ is the heat absorbed and $Q_1$ is the heat released into the reservoir, we can assign ± signs to the Q’s, + for intake and − for release of heat, and write this equation as

$$\frac{Q_1}{T_1} + \frac{Q_2}{T_2} = 0 \tag{3.4.10}$$

In other words, if we sum over various steps (denoted by the index $i$) of the cycle, with appropriate algebraic signs,

$$\sum_{\rm cycle} \frac{Q_i}{T_i} = 0 \tag{3.4.11}$$

If we consider any closed and reversible cycle, as shown in Fig. 3.4.1, we can divide it into small cycles, each of which is a Carnot cycle. A few of these smaller cycles are shown by dotted lines, say with the long dotted lines being adiabatics and the short dotted lines being isothermals. By taking finer and finer such divisions, the error in approximating the cycle by a series of closed Carnot cycles will go to zero as the number of Carnot cycles goes to infinity. Since along the adiabatics, the change in $Q$ is zero, we can use the result from Equation 3.4.11 above to write

$$\oint_{\rm cycle} \frac{dQ}{T} = 0 \tag{3.4.12}$$

where we denote the heat absorbed or given up during each infinitesimal step as $dQ$. The statement in Equation 3.4.12 is due to Clausius. The important thing is that this applies to any closed curve in the space of thermodynamic variables, provided the process is reversible.

This equation has another very important consequence. If the integral of a differential around any closed curve is zero, then we can write the differential as the derivative of some function. Thus there must exist a function $S(p, V)$ such that

$$\frac{dQ}{T} = dS, \quad \text{or} \quad dQ = T\, dS \tag{3.4.13}$$

This function $S$ is called entropy. It is a function of state, given in terms of the thermodynamic variables.

Clausius’ Inequality

Clausius’ discovery of entropy is one of the most important advances in the physics of material systems. For a reversible process, we have the result,

$$\oint_{\rm cycle} \frac{dQ}{T} = \oint_{\rm cycle} dS = 0 \tag{3.4.14}$$

as we have already seen. There is a further refinement we can make by considering irreversible processes. There are many processes such as diffusion which are not reversible. For such a process, we cannot write $dQ = T\,dS$. Nevertheless, since entropy is a function of the state of the system, we can still define entropy for each state. For an irreversible process, the heat transferred to a system is less than $T\,dS$ where $dS$ is the entropy change produced by the irreversible process. This is easily seen from the second law. For assume that a certain amount of heat $dQ_{\rm irr}$ is absorbed by the system in the irreversible process. Consider then a combined process where the system changes from state A to state B in an irreversible manner and then we restore state A by a reversible process.


For the latter step, $dQ_{\rm rev} = T\,dS$. The combination thus absorbs an amount of heat equal to $dQ_{\rm irr} - T\,dS$ with no change of state. If this is positive, this must be entirely converted to work. However, that would violate the second law. Hence we should have

$$dQ_{\rm irr} - T\,dS < 0 \tag{3.4.15}$$

If we have a cyclic process, $\oint dS = 0$ since $S$ is a state function, and hence

$$\oint \frac{dQ}{T} \le 0 \tag{3.4.16}$$

with equality holding for a reversible process. This is known as Clausius’ inequality.

For a system in thermal isolation, $dQ = 0$, and the condition $dQ_{\rm irr} < T\,dS$ becomes

$$dS > 0 \tag{3.4.17}$$

In other words, the entropy of a system left to itself can only increase, equilibrium being achieved when the entropy (for the specified values of internal energy, number of particles, etc.) is a maximum.

The second law has been used to define entropy. But once we have introduced the notion of entropy, the second law is equivalent to the statement that entropy tends to increase. For any process, we can say that

$$\frac{dS}{dt} \ge 0 \tag{3.4.18}$$

We can actually see that this is equivalent to the Kelvin statement of the second law as follows. Consider a system which takes up heat $\Delta Q$ at temperature $T$. For the system (labeled 1) together with the heat source (labeled 2), we have $\Delta S_1 + \Delta S_2 \ge 0$. But the source is losing heat at temperature $T$ and if this is reversible, $\Delta S_2 = -\frac{\Delta Q}{T}$. Further if there is no other change in the system, $\Delta S_1 = 0$ and $\Delta U_1 = 0$. Thus

$$\Delta S_1 + \Delta S_2 \ge 0 \;\Rightarrow\; -\frac{\Delta Q}{T} \ge 0 \;\Rightarrow\; \Delta Q \le 0 \tag{3.4.19}$$

Since $\Delta U_1 = 0$, $\Delta Q = \Delta W$ and this equation implies that the work done by the system cannot be positive, if we have $\frac{dS}{dt} \ge 0$. Thus, we have arrived at the Kelvin statement that a system cannot absorb heat from a source and convert it entirely to work without any other change. We may thus restate the second law in the form:

Proposition 4: Second Law of Thermodynamics

The entropy of a system left to itself will tend to increase to a maximum value compatible with the specified values of internal energy, particle number, etc.

Nature of Heat Flow

We can easily see that heat by itself flows from a hotter body to a cooler body. This may seem obvious, but is a crucial result of the second law. In some ways, the second law is the formalization of such statements which are “obvious" from our experience.

Consider two bodies at temperatures $T_1$ and $T_2$, thermally isolated from the rest of the universe but in mutual thermal contact. The second law tells us that $dS \ge 0$. This means that

$$\frac{dQ_1}{T_1} + \frac{dQ_2}{T_2} \ge 0 \tag{3.4.20}$$

Because the bodies are isolated from the rest of the world, $dQ_1 + dQ_2 = 0$, so that we can write the condition above as

$$\left(\frac{1}{T_1} - \frac{1}{T_2}\right) dQ_1 \ge 0 \tag{3.4.21}$$

If $T_1 > T_2$, we must have $dQ_1 < 0$ and if $T_1 < T_2$, $dQ_1 > 0$. Either way, heat flows from the hotter body to the colder body.
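The entropy bookkeeping behind Equations 3.4.20 and 3.4.21 can be made concrete in a small numerical sketch (an added illustration; the temperatures and heat exchanged are arbitrary example values):

# Entropy change when a small amount of heat dQ flows between two bodies
# in mutual thermal contact (Eqs. 3.4.20 - 3.4.21).
T1, T2 = 400.0, 300.0     # kelvin, body 1 is the hotter one
dQ = 1.0                  # joules leaving body 1 and entering body 2

dS = (-dQ / T1) + (dQ / T2)    # dQ1 = -dQ, dQ2 = +dQ, so dQ1 + dQ2 = 0
print(f"dS = {dS:.2e} J/K")    # positive: heat flowing hot -> cold raises S

# Flowing the same heat the other way would give dS < 0, which the
# second law forbids.
dS_reversed = (dQ / T1) + (-dQ / T2)
print(f"reversed flow would give dS = {dS_reversed:.2e} J/K")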


3.5: Some Other Thermodynamic Engines

We will now consider some other thermodynamic engines which are commonly used.

Otto Cycle

The automobile engine operates in four steps, with the injection of the fuel-air mixture into the cylinder. It undergoes compression which can be idealized as being adiabatic. The ignition then raises the pressure to a high value with almost no change of volume. The high pressure mixture rapidly expands, which is again almost adiabatic. This is the power stroke driving the piston down. The final step is the exhaust when the spent fuel is removed from the cylinder. This step happens without much change of volume. This process is shown in Fig. 3.4.2. We will calculate the efficiency of the engine, taking the working material to be an ideal gas.

Heat is taken in during the ignition cycle $B$ to $C$. The heat comes from the chemical process of burning but we can regard it as heat taken in from a reservoir. Since this is at constant volume, we have

$$Q_H = C_v\, (T_C - T_B) \tag{3.5.1}$$

Heat is given out during the exhaust process $D$ to $A$, again at constant volume, so

$$Q_L = C_v\, (T_D - T_A) \tag{3.5.2}$$

Further, states $C$ and $D$ are connected by an adiabatic process, so are $A$ and $B$. Thus $p\,V^{\gamma} = nRT\, V^{\gamma - 1}$ is preserved for these processes. Also, $V_A = V_D$, $V_B = V_C$, so we have

$$T_D = T_C \left(\frac{V_B}{V_A}\right)^{\gamma-1}, \qquad T_A = T_B \left(\frac{V_B}{V_A}\right)^{\gamma-1} \tag{3.5.3}$$

Figure 3.5.1: The idealized Diesel cycle

This gives

$$Q_L = C_v\, (T_C - T_B) \left(\frac{V_B}{V_A}\right)^{\gamma-1} \tag{3.5.4}$$

The efficiency is then

$$\eta = 1 - \frac{Q_L}{Q_H} = 1 - \left(\frac{V_B}{V_A}\right)^{\gamma-1} \tag{3.5.5}$$

Diesel Cycle

The idealized operation of a diesel engine is shown in Fig. 3.5.1. Initially, only air is admitted into the cylinder. It is then compressed adiabatically to very high pressure (and hence very high temperature). Fuel is then injected into the cylinder. The temperature in the cylinder is high enough to ignite the fuel. The injection of the fuel is controlled so that the burning happens at essentially constant pressure ($B$ to $C$ in the figure). This is the key difference with the automobile engine. At the end of the burning process, the expansion continues adiabatically (part $C$ to $D$). From $D$ back to $A$ we have the exhaust cycle as in the automobile engine.


Taking the air (and fuel) to be an ideal gas, we can calculate the efficiency of the diesel engine. Heat intake (from burning fuel) is at constant pressure, so that

$$Q_H = C_p\, (T_C - T_B) \tag{3.5.6}$$

Heat is given out ($D$ to $A$) at constant volume so that

$$Q_L = C_v\, (T_D - T_A) \tag{3.5.7}$$

We also have the relations,

$$T_A = T_B \left(\frac{V_B}{V_A}\right)^{\gamma-1}, \qquad T_D = T_C \left(\frac{V_C}{V_A}\right)^{\gamma-1} \tag{3.5.8}$$

Further, $p_B = p_C$ implies $T_B = T_C \left(\frac{V_B}{V_C}\right)$, which, in turn, gives

$$T_A = T_C \left(\frac{V_B}{V_C}\right)\left(\frac{V_B}{V_A}\right)^{\gamma-1} \tag{3.5.9}$$

We can now write

$$Q_L = C_v\left[T_C\left(\frac{V_C}{V_A}\right)^{\gamma-1} - T_B\left(\frac{V_B}{V_A}\right)^{\gamma-1}\right] = C_v\, T_C\left(\frac{V_C}{V_A}\right)^{\gamma-1}\left[1 - \left(\frac{V_B}{V_C}\right)^{\gamma}\right] \tag{3.5.10}$$

$$Q_H = C_p\, T_C\left[1 - \frac{V_B}{V_C}\right] \tag{3.5.11}$$

These two equations give the efficiency of the diesel cycle as

$$\eta = 1 - \frac{1}{\gamma}\left(\frac{V_C}{V_A}\right)^{\gamma-1}\left[\frac{1 - \left(\frac{V_B}{V_C}\right)^{\gamma}}{1 - \left(\frac{V_B}{V_C}\right)}\right] = 1 - \frac{1}{\gamma}\left(\frac{T_D}{T_C}\right)\left[\frac{1 - \left(\frac{V_B}{V_C}\right)^{\gamma}}{1 - \left(\frac{V_B}{V_C}\right)}\right] \tag{3.5.12}$$

3.5: Some Other Thermodynamic Engines is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
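The closed-form efficiencies 3.5.5 and 3.5.12 are easy to evaluate. The sketch below is an added illustration (the compression ratio and cutoff ratio are arbitrary example numbers, not data from the notes):

# Otto efficiency (Eq. 3.5.5):   eta = 1 - (V_B/V_A)**(gamma - 1)
# Diesel efficiency (Eq. 3.5.12):
#   eta = 1 - (1/gamma) * (V_C/V_A)**(gamma - 1)
#             * (1 - (V_B/V_C)**gamma) / (1 - V_B/V_C)
gamma = 1.4          # value for air

def eta_otto(VA, VB):
    return 1.0 - (VB / VA)**(gamma - 1.0)

def eta_diesel(VA, VB, VC):
    r = VB / VC
    return 1.0 - (1.0 / gamma) * (VC / VA)**(gamma - 1.0) * (1.0 - r**gamma) / (1.0 - r)

# Example: compression ratio V_A/V_B = 10, and for the diesel V_C/V_B = 2.
VA, VB = 10.0, 1.0
print(f"Otto:   eta = {eta_otto(VA, VB):.3f}")
print(f"Diesel: eta = {eta_diesel(VA, VB, VC=2.0):.3f}")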


4: The Third Law of Thermodynamics

The second law defines entropy by the relation $dQ = T\,dS$. Thus entropy is defined only up to an additive constant. Further, it does not tell us about the behavior of $S$ as a function of $T$ close to absolute zero. This is done by the third law.

Third Law of Thermodynamics

The contribution to the entropy by each set of degrees of freedom in internal thermodynamic equilibrium tends to zero in a differentiable way as the absolute zero of temperature is approached.

The limiting value of $S$ is independent of the process by which $T = 0$ is approached; it does not matter whether the system is in liquid or solid phase, whether it is under pressure, etc. Further, the differentiability condition says that $\left(\frac{\partial S}{\partial T}\right)$ is finite at absolute zero, where the derivative is taken along any process.

Unattainability of Absolute Zero

For different starting points, the variation of entropy with temperature will be as shown in Figure 4.1. A process that reduces the temperature can be viewed as an isothermal decrease of entropy from $A$ to $B$, followed by an isentropic (adiabatic) decrease of temperature along $BC$. By continuing the process, one can get closer and closer to absolute zero. But it is obvious that the temperature decreases achieved will become smaller and smaller; after any finite number of steps, there will still be a positive nonzero temperature. Thus we conclude that only an asymptotic approach to absolute zero is possible for any thermodynamic process. Since any thermodynamic process for cooling can be built up as a sum of steps like $ABC$, the conclusion holds in general.

Figure 4.1: Illustrating the unattainability of absolute zero

Vanishing of Specific Heats at Absolute Zero

The specific heat for any process can be written as

$$C_R = \left(\frac{\partial Q}{\partial T}\right)_R = T\left(\frac{\partial S}{\partial T}\right)_R \tag{4.1}$$

where $R$ specifies the quantity held constant. Since, by the third law, $\left(\frac{\partial S}{\partial T}\right)_R$ is finite as $T \to 0$, we find $C_R \to 0$ as $T \to 0$. In particular, the specific heats at constant volume ($C_v$) and at constant pressure ($C_p$) vanish at absolute zero.

4: The Third Law of Thermodynamics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
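If, purely for illustration, the low-temperature entropy is taken to behave like a power law $S = a\,T^n$ (an assumed toy form, not something derived in this chapter), Equation 4.1 immediately gives a specific heat that vanishes at $T = 0$. A symbolic check, assuming SymPy is available:

import sympy as sp

T, a = sp.symbols('T a', positive=True)
n = 3                         # any positive power works; 3 is just an example

S = a * T**n                  # assumed low-temperature form of the entropy
C = T * sp.diff(S, T)         # Eq. 4.1: C_R = T (dS/dT)_R

print(C)                      # 3*a*T**3
print(sp.limit(C, T, 0))      # 0: the specific heat vanishes at absolute zero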


CHAPTER OVERVIEW

5: Thermodynamic Potentials and Equilibrium

Extensive and Intensive Variables

All quantities in thermodynamics fall into two types: extensive and intensive. If we consider two independent systems, quantities which add up to give the corresponding quantities for the complete system are characterized as extensive quantities. The volume $V$, the internal energy $U$, the enthalpy, and as we will see later on, the entropy $S$ are extensive quantities. If we divide a system into subsystems, those quantities which remain unaltered are called intensive variables. The pressure $p$, the temperature $T$, the surface tension are examples of intensive variables.

In any thermodynamic system, there is a natural pairing between extensive and intensive variables. For example, pressure and volume go together as in the formula $dU = dQ - p\,dV$. Temperature is paired with the entropy, surface tension with the area, etc.

5.1: Thermodynamic Potentials
5.2: Thermodynamic Equilibrium
5.3: Phase Transitions

5: Thermodynamic Potentials and Equilibrium is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


5.1: Thermodynamic Potentials

The second law leads to the result $dQ = T\,dS$, so that, for a gaseous system, the first law may be written as

$$dU = T\,dS - p\,dV \tag{5.1.1}$$

More generally, we have

$$dU = T\,dS - p\,dV - \sigma\,dA - M\,dB \tag{5.1.2}$$

where $\sigma$ is the surface tension, $A$ is the area, $M$ is the magnetization, $B$ is the magnetic field, etc.

Returning to the case of a gaseous system, we now define a number of quantities related to $U$. These are called thermodynamic potentials and are useful when considering different processes. We have already seen that the enthalpy $H$ is useful for considering processes at constant pressure. This follows from

$$dH = d(U + pV) = dQ + V\,dp \tag{5.1.3}$$

so that for processes at constant pressure, the inflow or outflow of heat may be seen as changing the enthalpy.

The Helmholtz free energy $F$ is defined by

$$F = U - TS \tag{5.1.4}$$

Taking differentials and comparing with the formula for $dU$, we get

$$dF = -S\,dT - p\,dV \tag{5.1.5}$$

The Gibbs free energy $G$ is defined by

$$G = F + pV = H - TS = U - TS + pV \tag{5.1.6}$$

Evidently,

$$dG = -S\,dT + V\,dp \tag{5.1.7}$$

Notice that by construction, $H$, $F$ and $G$ are functions of the state of the system. They may be expressed as functions of $p$ and $V$, for example. They are obviously extensive quantities.

So far, we have considered the system characterized by pressure and volume. If there are a number of particles, say $N$, which make up the system, we can also consider the $N$-dependence of various quantities. Thus we can think of the internal energy $U$ as a function of $S$, $V$ and $N$, so that

$$dU = \left(\frac{\partial U}{\partial S}\right)_{V,N} dS + \left(\frac{\partial U}{\partial V}\right)_{S,N} dV + \left(\frac{\partial U}{\partial N}\right)_{S,V} dN \equiv T\,dS - p\,dV + \mu\,dN \tag{5.1.8}$$

The quantity $\mu$ which is defined by

$$\mu = \left(\frac{\partial U}{\partial N}\right)_{S,V} \tag{5.1.9}$$

is called the chemical potential. It is obviously an intensive variable. The corresponding equations for $H$, $F$ and $G$ are

$$dH = T\,dS + V\,dp + \mu\,dN, \qquad dF = -S\,dT - p\,dV + \mu\,dN, \qquad dG = -S\,dT + V\,dp + \mu\,dN \tag{5.1.10}$$

Thus the chemical potential $\mu$ may also be defined as

$$\mu = \left(\frac{\partial H}{\partial N}\right)_{S,p} = \left(\frac{\partial F}{\partial N}\right)_{T,V} = \left(\frac{\partial G}{\partial N}\right)_{T,p} \tag{5.1.11}$$

Since $U$ is an extensive quantity and so are $S$ and $V$, the internal energy has the general functional form

$$U = N\, u\!\left(\frac{S}{N}, \frac{V}{N}\right) \tag{5.1.12}$$


where $u$ depends only on $\frac{S}{N}$ and $\frac{V}{N}$ which are intensive variables. In a similar way, we can write

$$H = N\, h\!\left(\frac{S}{N}, p\right), \qquad F = N\, f\!\left(T, \frac{V}{N}\right), \qquad G = N\, g(T, p) \tag{5.1.13}$$

The last equation is of particular interest. Taking its variation, we find

$$dG = N\left(\frac{\partial g}{\partial T}\right)_p dT + N\left(\frac{\partial g}{\partial p}\right)_T dp + g\, dN \tag{5.1.14}$$

Comparing with Equation 5.1.10, we get

$$S = -N\left(\frac{\partial g}{\partial T}\right)_p, \qquad V = N\left(\frac{\partial g}{\partial p}\right)_T, \qquad \mu = g \tag{5.1.15}$$

The quantity $g$ is identical to the chemical potential, so that we may write

$$G = \mu N \tag{5.1.16}$$

We may rewrite the other two relations as

$$S = -N\left(\frac{\partial \mu}{\partial T}\right)_p, \qquad V = N\left(\frac{\partial \mu}{\partial p}\right)_T \tag{5.1.17}$$

Further, using $G = \mu N$, we can rewrite the equation for $dG$ as

$$N\,d\mu + S\,dT - V\,dp = 0 \tag{5.1.18}$$

This essentially combines the previous two relations and is known as the Gibbs-Duhem relation. It is important in that it provides a relation among the intensive variables of a thermodynamic system.

5.1: Thermodynamic Potentials is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
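Because Equations 5.1.15–5.1.18 are pure consequences of $G = N\,g(T, p)$, they can be checked symbolically for an arbitrary function $g$. The sketch below is an added illustration (it assumes SymPy is available), verifying that the Gibbs-Duhem combination vanishes identically:

import sympy as sp

T, p, N = sp.symbols('T p N', positive=True)
g = sp.Function('g')(T, p)         # arbitrary Gibbs free energy per particle

G = N * g                          # Eq. 5.1.13 / 5.1.16: G = N g(T, p) = mu N
S = -N * sp.diff(g, T)             # Eq. 5.1.15
V = N * sp.diff(g, p)              # Eq. 5.1.15
mu = g                             # Eq. 5.1.15

# Gibbs-Duhem relation, Eq. 5.1.18:  N dmu + S dT - V dp = 0.
# Check the coefficients of dT and dp separately.
coeff_dT = sp.simplify(N * sp.diff(mu, T) + S)
coeff_dp = sp.simplify(N * sp.diff(mu, p) - V)
print(coeff_dT, coeff_dp)          # both print 0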


5.2: Thermodynamic Equilibrium

The second law of thermodynamics implies that entropy does not decrease in any natural process. The final equilibrium state will thus be the state of maximum possible entropy. After attaining this maximum possible value, the entropy will remain constant. The criterion for equilibrium may thus be written as

$$\delta S = 0 \tag{5.2.1}$$

We can take $S$ to be a function of $U$, $V$ and $N$. The system, starting in an arbitrary state, adjusts $U$, $V$ and $N$ among its different parts and constitutes itself in such a way as to maximize entropy. Consider the system subdivided into various smaller subsystems, say, indexed by $i = 1, \ldots, n$. The thermodynamic quantities for each such unit will be indicated by a subscript $i$. For an isolated system, the total internal energy, the total volume and the total number of particles will be fixed, so that the changes in the subsystems must be constrained as

$$\sum_i \delta U_i = 0, \qquad \sum_i \delta V_i = 0, \qquad \sum_i \delta N_i = 0 \tag{5.2.2}$$

Since $S$ is extensive, $S = \sum_i S_i$ where $S_i = S_i(U_i, V_i, N_i)$. We can now maximize entropy subject to the constraints Equation 5.2.2 by considering the maximization of

$$S = \sum_i S_i - \lambda_1\left(\sum_i U_i - U\right) - \lambda_2\left(\sum_i V_i - V\right) - \lambda_3\left(\sum_i N_i - N\right) \tag{5.2.3}$$

where the Lagrange multipliers $\lambda_1, \lambda_2, \lambda_3$ enforce the required constraints. The variables $U_i$, $V_i$ and $N_i$ can now be freely varied. Thus, for the condition of equilibrium, we get

$$dS = \sum_i \left[\left(\frac{\partial S}{\partial U_i}\right) - \lambda_1\right] dU_i + \left[\left(\frac{\partial S}{\partial V_i}\right) - \lambda_2\right] dV_i + \left[\left(\frac{\partial S}{\partial N_i}\right) - \lambda_3\right] dN_i = 0 \tag{5.2.4}$$

Since the variations are now independent, this gives, for equilibrium,

$$\left(\frac{\partial S}{\partial U_i}\right) = \lambda_1, \qquad \left(\frac{\partial S}{\partial V_i}\right) = \lambda_2, \qquad \left(\frac{\partial S}{\partial N_i}\right) = \lambda_3 \tag{5.2.5}$$

This can be rewritten as

$$\frac{1}{T_i} = \lambda_1, \qquad \frac{p_i}{T_i} = \lambda_2, \qquad \frac{\mu_i}{T_i} = \lambda_3 \tag{5.2.6}$$

where we used

$$dS = \frac{dU}{T} + \frac{p}{T}\, dV - \frac{\mu}{T}\, dN \tag{5.2.7}$$

Equation 5.2.6 tells us that, for equilibrium, the temperature of all subsystems must be the same, the pressure in different subsystems must be the same and the chemical potential for different subsystems must be the same.

Reaction Equilibrium

Suppose we have a number of constituents $A_1, A_2, \ldots, B_1, B_2, \ldots$ at constant temperature and pressure which undergo a reaction of the form

$$\nu_{A_1} A_1 + \nu_{A_2} A_2 + \ldots \;\rightleftharpoons\; \nu_{B_1} B_1 + \nu_{B_2} B_2 + \ldots \tag{5.2.8}$$

The entropy of the system is of the form

$$S = \sum_k S_k(N_k) \tag{5.2.9}$$

where the summation covers all $A$'s and $B$'s. Since the temperature and pressure are constant, reaction equilibrium is obtained when the $N_k$ change so as to maximize the entropy. This gives

$$dS = \frac{1}{T}\sum_k \mu_k\, dN_k = 0 \tag{5.2.10}$$


$$dS = \frac{1}{T}\sum_k \mu_k\, dN_k = 0 \tag{5.2.10}$$

The quantities $N_k$ are not independent, but are restricted by the reaction. When the reaction happens, $\nu_{A_1}$ of the $A_1$-particles must be destroyed, $\nu_{A_2}$ of the $A_2$-particles must be destroyed, etc., while $\nu_{B_1}$ of the $B_1$ particles are produced, etc. Thus we can write

$$dN_{A_1} = -\nu_{A_1}\,dN_0,\quad dN_{A_2} = -\nu_{A_2}\,dN_0,\ \ldots,\quad dN_{B_1} = \nu_{B_1}\,dN_0,\quad dN_{B_2} = \nu_{B_2}\,dN_0,\ \ldots \tag{5.2.11}$$

where $dN_0$ is arbitrary. The condition of equilibrium thus reduces to

$$-\sum_A \nu_{A_i}\,\mu_{A_i} + \sum_B \nu_{B_i}\,\mu_{B_i} = 0 \tag{5.2.12}$$

With the understanding that the $\nu$'s for the reactants will carry a minus sign while those for the products have a plus sign, we can rewrite this as

$$\sum_k \nu_k\,\mu_k = 0 \tag{5.2.13}$$

This condition of reaction equilibrium can be applied to chemical reactions, ionization and dissociation processes, nuclear reactions, etc.

5.2: Thermodynamic Equilibrium is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


5.3: Phase Transitions

If we have a single constituent, for thermodynamic equilibrium, we should have equality of $T$, $p$ and $\mu$ for different subparts of the system. If we have different phases of the system, such as gas, liquid, or solid, in equilibrium, we have

$$T_1 = T_2 = T_3 = \ldots = T, \qquad p_1 = p_2 = p_3 = \ldots = p, \qquad \mu_1 = \mu_2 = \mu_3 = \ldots \tag{5.3.1}$$

where the subscripts refer to various phases.

We will consider the equilibrium of two phases in more detail. In this case, we have

$$\mu_1(p, T) = \mu_2(p, T) \tag{5.3.2}$$

If equilibrium is also obtained for a state defined by $p + dp$, $T + dT$, then we have

$$\mu_1(p + dp,\ T + dT) = \mu_2(p + dp,\ T + dT) \tag{5.3.3}$$

These two equations yield

$$\Delta\mu_1 = \Delta\mu_2, \qquad \Delta\mu = \mu(p + dp,\ T + dT) - \mu(p, T) \tag{5.3.4}$$

This equation will tell us how $p$ should change when $T$ is altered (or vice versa) so as to preserve equilibrium. Expanding to first order in the variations, we find

$$-s_1\,dT + v_1\,dp = -s_2\,dT + v_2\,dp \tag{5.3.5}$$

where we have used

$$\frac{\partial\mu}{\partial T} = -\frac{S}{N} \equiv -s, \qquad \frac{\partial\mu}{\partial p} = \frac{V}{N} \equiv v \tag{5.3.6}$$

Equation 5.3.5 reduces to

$$\frac{dp}{dT} = \frac{s_1 - s_2}{v_1 - v_2} = \frac{L}{T(v_1 - v_2)} \tag{5.3.7}$$

where $L = T(s_1 - s_2)$ is the latent heat of the transition. This equation is known as the Clausius-Clapeyron equation. It can be used to study the variation of saturated vapor pressure with temperature (or, conversely, the variation of boiling point with pressure). As an example, consider the variation of boiling point with pressure, when a liquid boils to form gaseous vapor. In this case, we can take $v_1 = v_g \gg v_2 = v_l$. Further, if we assume, for the sake of the argument, that the gaseous phase obeys the ideal gas law, $v_g = kT/p$, then the Clausius-Clapeyron Equation 5.3.7 becomes

$$\frac{dp}{dT} \approx p\,\frac{L}{kT^2} \tag{5.3.8}$$

Integrating this from one value of $T$ to another,

$$\log\left(\frac{p}{p_0}\right) = \frac{L}{k}\left(\frac{1}{T_0} - \frac{1}{T}\right) \tag{5.3.9}$$

Thus for $p > p_0$, $T$ must be larger than $T_0$; this explains the increase of boiling point with pressure.

If $\frac{\partial\mu}{\partial T}$ and $\frac{\partial\mu}{\partial p}$ are continuous at the transition, $s_1 = s_2$ and $v_1 = v_2$. In this case, we have to expand to second order in the variations. Such a transition is called a second order phase transition. In general, if the first $(n-1)$ derivatives of $\mu$ are continuous, and the $n$-th derivatives are discontinuous at the transition, the transition is said to be of the $n$-th order. The Clausius-Clapeyron equation, as we have written it, applies to first order phase transitions. These have a latent heat of transition.

5.3: Phase Transitions is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
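Equation 5.3.9 is easy to explore numerically. The short Python sketch below is an added illustration, not part of the original text; the latent heat per molecule and the reference boiling point are assumed round values for water, and $L$ is treated as constant.

```python
import numpy as np

# Illustration of Eq. 5.3.9: boiling point of water vs pressure.
# Assumed inputs: L ~ 40.7 kJ/mol (latent heat of vaporization), T0 = 373.15 K at 1 atm.
k = 1.380649e-23          # Boltzmann constant, J/K
N_A = 6.02214076e23       # Avogadro number, 1/mol
L = 40.7e3 / N_A          # latent heat per molecule, J
T0, p0 = 373.15, 1.013e5  # reference boiling point and pressure

def boiling_point(p):
    """Invert log(p/p0) = (L/k)(1/T0 - 1/T) for T, assuming constant L."""
    return 1.0 / (1.0/T0 - (k/L) * np.log(p/p0))

for factor in (0.5, 1.0, 2.0):
    T = boiling_point(factor * p0)
    print(f"p = {factor} atm  ->  T_boil ≈ {T:6.1f} K")
```

With these assumed numbers the sketch gives a boiling point near 394 K at 2 atm, consistent with the qualitative statement above that higher pressure raises the boiling point.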


6.1: Maxwell Relations

For a system with one constituent with a fixed number of particles, from the first and second laws, and from Equation 5.1.10, we have the basic relations

$$\begin{aligned}
dU &= T\,dS - p\,dV \\
dH &= T\,dS + V\,dp \\
dF &= -S\,dT - p\,dV \\
dG &= -S\,dT + V\,dp
\end{aligned} \tag{6.1.1}$$

The quantities on the left are all perfect differentials. For a general differential $dR$ of the form

$$dR = X\,dx + Y\,dy \tag{6.1.2}$$

to be a perfect differential, the necessary and sufficient condition is

$$\left(\frac{\partial X}{\partial y}\right)_x = \left(\frac{\partial Y}{\partial x}\right)_y \tag{6.1.3}$$

Applying this to the four differentials in 6.1.1, we get

$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V, \qquad
\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p, \qquad
\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V, \qquad
\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p \tag{6.1.4}$$

These four relations are called the Maxwell relations.

A Mathematical Result

Let $X$, $Y$, $Z$ be three variables, of which only two are independent. Taking $Z$ to be a function of $X$ and $Y$, we can write

$$dZ = \left(\frac{\partial Z}{\partial X}\right)_Y dX + \left(\frac{\partial Z}{\partial Y}\right)_X dY \tag{6.1.5}$$

If now we take $X$ and $Z$ as the independent variables, we can write

$$dY = \left(\frac{\partial Y}{\partial X}\right)_Z dX + \left(\frac{\partial Y}{\partial Z}\right)_X dZ \tag{6.1.6}$$

Upon substituting this result into 6.1.5, we get

$$dZ = \left[\left(\frac{\partial Z}{\partial X}\right)_Y + \left(\frac{\partial Z}{\partial Y}\right)_X\left(\frac{\partial Y}{\partial X}\right)_Z\right] dX + \left(\frac{\partial Z}{\partial Y}\right)_X\left(\frac{\partial Y}{\partial Z}\right)_X dZ \tag{6.1.7}$$

Since we are considering $X$ and $Z$ as independent variables now, this equation immediately yields the relations

$$\left(\frac{\partial Z}{\partial Y}\right)_X\left(\frac{\partial Y}{\partial Z}\right)_X = 1, \qquad
\left(\frac{\partial Z}{\partial X}\right)_Y + \left(\frac{\partial Z}{\partial Y}\right)_X\left(\frac{\partial Y}{\partial X}\right)_Z = 0 \tag{6.1.8}$$

These relations can be rewritten as

$$\left(\frac{\partial Z}{\partial Y}\right)_X = \frac{1}{\left(\frac{\partial Y}{\partial Z}\right)_X}, \qquad
\left(\frac{\partial X}{\partial Z}\right)_Y\left(\frac{\partial Z}{\partial Y}\right)_X\left(\frac{\partial Y}{\partial X}\right)_Z = -1 \tag{6.1.9}$$

6.1: Maxwell Relations is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
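The cyclic identity in Equation 6.1.9 holds for any equation of state; the following short symbolic check (an added illustration, not part of the original notes) verifies it for the ideal gas, taking $X = V$, $Y = p$, $Z = T$.

```python
import sympy as sp

# Check (dV/dT)_p (dT/dp)_V (dp/dV)_T = -1  [Eq. 6.1.9] for the ideal gas p V = N k T.
p, V, T, N, k = sp.symbols('p V T N k', positive=True)

V_of_Tp = N*k*T/p                  # V(T, p)
T_of_pV = p*V/(N*k)                # T(p, V)
p_of_TV = N*k*T/V                  # p(T, V)

product = (sp.diff(V_of_Tp, T)     # (dV/dT)_p
           * sp.diff(T_of_pV, p)   # (dT/dp)_V
           * sp.diff(p_of_TV, V))  # (dp/dV)_T

# Impose the equation of state to eliminate T; the product simplifies to -1.
print(sp.simplify(product.subs(T, p*V/(N*k))))
```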


6.2: Other Relations

The TdS Equations

The entropy is a function of the state of the system. We can take it to be a function of any two of the three variables ($p$, $T$, $V$). Taking $S$ to be a function of $p$ and $T$, we write

$$T\,dS = T\left(\frac{\partial S}{\partial T}\right)_p dT + T\left(\frac{\partial S}{\partial p}\right)_T dp \tag{6.2.1}$$

For the first term on the right hand side, we can use

$$T\left(\frac{\partial S}{\partial T}\right)_p dT = \left(\frac{\partial Q}{\partial T}\right)_p dT = C_p\,dT \tag{6.2.2}$$

where $C_p$ is the specific heat at constant pressure. Further, using the last of the Maxwell relations, we can now write Equation 6.2.1 as

$$T\,dS = C_p\,dT - T\left(\frac{\partial V}{\partial T}\right)_p dp \tag{6.2.3}$$

The coefficient of volumetric expansion (due to heating) is defined by

$$\alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p \tag{6.2.4}$$

Equation 6.2.3 can thus be rewritten as

$$T\,dS = C_p\,dT - \alpha T V\,dp \tag{6.2.5}$$

If we take $S$ to be a function of $V$ and $T$,

$$T\,dS = T\left(\frac{\partial S}{\partial T}\right)_V dT + T\left(\frac{\partial S}{\partial V}\right)_T dV \tag{6.2.6}$$

Again the first term on the right hand side can be expressed in terms of $C_v$, the specific heat at constant volume, using

$$T\left(\frac{\partial S}{\partial T}\right)_V = C_v \tag{6.2.7}$$

Further using the Maxwell relations, we get

$$T\,dS = C_v\,dT + T\left(\frac{\partial p}{\partial T}\right)_V dV \tag{6.2.8}$$

Equations 6.2.3 (or 6.2.5) and 6.2.8 are known as the $T\,dS$ equations.

Equations for Specific Heats

Equating the two expressions for $T\,dS$, we get

$$(C_p - C_v)\,dT = T\left[\left(\frac{\partial p}{\partial T}\right)_V dV + \left(\frac{\partial V}{\partial T}\right)_p dp\right] \tag{6.2.9}$$

By the equation of state, we can write $p$ as a function of $V$ and $T$, so that

$$dp = \left(\frac{\partial p}{\partial T}\right)_V dT + \left(\frac{\partial p}{\partial V}\right)_T dV \tag{6.2.10}$$

Using this in Equation 6.2.9, we find

$$(C_p - C_v)\,dT = T\left[\left(\frac{\partial p}{\partial T}\right)_V + \left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial p}{\partial V}\right)_T\right] dV + T\left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial p}{\partial T}\right)_V dT \tag{6.2.11}$$


However, using Equation 6.1.9, taking $X = V$, $Y = p$ and $Z = T$, we have

$$\left(\frac{\partial p}{\partial T}\right)_V + \left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial p}{\partial V}\right)_T = 0 \tag{6.2.12}$$

Thus the coefficient of $dV$ in Equation 6.2.11 vanishes and we can simplify it as

$$C_p - C_v = T\left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial p}{\partial T}\right)_V = -T\left[\left(\frac{\partial V}{\partial T}\right)_p\right]^2\left(\frac{\partial p}{\partial V}\right)_T \tag{6.2.13}$$

where we have used Equation 6.2.12 again. We have already defined the coefficient of volumetric expansion $\alpha$. The isothermal compressibility $\kappa_T$ is defined by

$$\frac{1}{\kappa_T} = -V\left(\frac{\partial p}{\partial V}\right)_T \tag{6.2.14}$$

In terms of these we can express $C_p - C_v$ as

$$C_p - C_v = \frac{T V \alpha^2}{\kappa_T} \tag{6.2.15}$$

This equation is very useful in calculating $C_v$ from measurements of $C_p$, $\alpha$ and $\kappa_T$. Further, for all substances, $\kappa_T > 0$. Thus, we see from this equation that $C_p \geq C_v$. (The result $\kappa_T > 0$ can be proved in statistical mechanics.)

In the $T\,dS$ equations, if $dT$, $dV$ and $dp$ are related adiabatically, $dS = 0$ and we get

$$C_p = T\left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial p}{\partial T}\right)_S, \qquad C_v = -T\left(\frac{\partial p}{\partial T}\right)_V\left(\frac{\partial V}{\partial T}\right)_S \tag{6.2.16}$$

This gives

$$\frac{C_p}{C_v} = -\left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial p}{\partial T}\right)_S\left[\left(\frac{\partial p}{\partial T}\right)_V\left(\frac{\partial V}{\partial T}\right)_S\right]^{-1} \tag{6.2.17}$$

We have the following relations among the terms involved in this expression,

$$\left(\frac{\partial p}{\partial T}\right)_S = \left(\frac{\partial V}{\partial T}\right)_S\left(\frac{\partial p}{\partial V}\right)_S = -\left(\frac{\partial V}{\partial T}\right)_S\frac{1}{V\kappa_S}, \qquad
\left(\frac{\partial V}{\partial T}\right)_p = \left(\frac{\partial p}{\partial T}\right)_V\left(\frac{\partial T}{\partial p}\right)_V\left(\frac{\partial V}{\partial T}\right)_p = -\left(\frac{\partial p}{\partial T}\right)_V\frac{1}{\left(\frac{\partial p}{\partial V}\right)_T} = \left(\frac{\partial p}{\partial T}\right)_V\kappa_T V \tag{6.2.18}$$

where $\kappa_S$ is the adiabatic compressibility, defined as in Equation 6.2.14 but with the derivative taken at constant entropy. Using these we find

$$\frac{C_p}{C_v} = \frac{\kappa_T}{\kappa_S} \tag{6.2.19}$$

Going back to the Maxwell relations and using the expressions for $T\,dS$, we find

$$dU = C_v\,dT + \left[T\left(\frac{\partial p}{\partial T}\right)_V - p\right]dV, \qquad dH = C_p\,dT + \left[V - T\left(\frac{\partial V}{\partial T}\right)_p\right]dp \tag{6.2.20}$$

These immediately yield the relations

$$C_v = \left(\frac{\partial U}{\partial T}\right)_V, \qquad C_p = \left(\frac{\partial H}{\partial T}\right)_p, \qquad
\left(\frac{\partial U}{\partial V}\right)_T = T\left(\frac{\partial p}{\partial T}\right)_V - p, \qquad
\left(\frac{\partial H}{\partial p}\right)_T = V - T\left(\frac{\partial V}{\partial T}\right)_p \tag{6.2.21}$$
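As a quick consistency check of Equation 6.2.15 (an added illustration, not in the original), the following sympy snippet evaluates $\alpha$, $\kappa_T$ and $C_p - C_v$ for the ideal-gas equation of state, recovering the familiar result $C_p - C_v = Nk$.

```python
import sympy as sp

# C_p - C_v = T V alpha^2 / kappa_T  [Eq. 6.2.15] evaluated for the ideal gas.
T, p, N, k = sp.symbols('T p N k', positive=True)

V = N*k*T/p                              # V(T, p) for the ideal gas
alpha   = sp.diff(V, T) / V              # (1/V)(dV/dT)_p  -> 1/T
kappa_T = -sp.diff(V, p) / V             # -(1/V)(dV/dp)_T -> 1/p

Cp_minus_Cv = sp.simplify(T * V * alpha**2 / kappa_T)
print(Cp_minus_Cv)                       # prints N*k
```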


Gibbs-Helmholtz Relation

Since the Helmholtz free energy is defined as $F = U - TS$,

$$dF = dU - T\,dS - S\,dT = -S\,dT - p\,dV \tag{6.2.22}$$

This gives immediately

$$S = -\left(\frac{\partial F}{\partial T}\right)_V, \qquad p = -\left(\frac{\partial F}{\partial V}\right)_T \tag{6.2.23}$$

Using this equation for entropy, we find

$$U = F + TS = F - T\left(\frac{\partial F}{\partial T}\right)_V \tag{6.2.24}$$

This is known as the Gibbs-Helmholtz relation. If $F$ is known as a function of $T$ and $V$, we can use these to obtain $S$, $p$ and $U$. Thus, all thermodynamic variables can be obtained from $F$ as a function of $T$ and $V$.

6.2: Other Relations is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


6.3: Joule-Kelvin Expansion

The expansion of a gas through a small opening or a porous plug, with the pressure on either side being maintained, is called Joule-Kelvin expansion. It is sometimes referred to as the Joule-Thomson expansion, since Thomson was Lord Kelvin's original name. The pressures are maintained by the flow of gases, but for the theoretical discussion we may think of them as being maintained by pistons which move in or out to keep the pressure the same. The values of the pressures on the two sides of the plug are not the same. The gas undergoes a decrease in volume on one side as the molecules move through the opening to the other side. The volume on the other side increases as molecules move in. The whole system is adiabatically sealed so that the net flow of heat in or out is zero.

Since $dQ = 0$, we can write, from the first law,

$$dU + p\,dV = 0 \tag{6.3.1}$$

Consider the gas on one side starting with volume $V_1$ going down to zero while on the other side the volume increases from zero to $V_2$. Integrating Equation 6.3.1, we find

$$\int_{(p_1, V_1, U_1)}^{0}(dU + p\,dV) + \int_{0}^{(p_2, V_2, U_2)}(dU + p\,dV) = 0 \tag{6.3.2}$$

This yields the relation

$$U_1 + p_1 V_1 = U_2 + p_2 V_2 \tag{6.3.3}$$

Thus the enthalpy on either side of the opening is the same. It is an isenthalpic expansion. The change in the temperature of the gas is given by

$$\Delta T = \int_{p_1}^{p_2}\left(\frac{\partial T}{\partial p}\right)_H dp = \int_{p_1}^{p_2}\mu_{JK}\,dp \tag{6.3.4}$$

The quantity

$$\mu_{JK} = \left(\frac{\partial T}{\partial p}\right)_H \tag{6.3.5}$$

is called the Joule-Kelvin coefficient. From the variation of $H$ we have

$$dH = C_p\,dT + \left[V - T\left(\frac{\partial V}{\partial T}\right)_p\right]dp \tag{6.3.6}$$

so that, considering an isenthalpic process, we get

$$\mu_{JK} = \frac{1}{C_p}\left[T\left(\frac{\partial V}{\partial T}\right)_p - V\right] \tag{6.3.7}$$

This gives a convenient formula for $\mu_{JK}$. Depending on whether this coefficient is positive or negative, there will be cooling or heating of the gas upon expansion by this process. By choosing a range of pressures for which $\mu_{JK}$ is positive, this process can be used for cooling and eventual liquefaction of gases.

6.3: Joule-Kelvin Expansion is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
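A short symbolic check of Equation 6.3.7 (an added illustration): for an ideal gas the Joule-Kelvin coefficient vanishes, so any heating or cooling observed in such an expansion is a signature of interatomic interactions.

```python
import sympy as sp

# mu_JK = (1/C_p)[ T (dV/dT)_p - V ]  [Eq. 6.3.7], evaluated for the ideal gas.
T, p, N, k, Cp = sp.symbols('T p N k C_p', positive=True)

V = N*k*T/p                              # ideal-gas V(T, p)
mu_JK = (T*sp.diff(V, T) - V) / Cp       # Joule-Kelvin coefficient
print(sp.simplify(mu_JK))                # prints 0
```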


CHAPTER OVERVIEW

7: Classical Statistical Mechanics

Dynamics of particles is given by Newton's laws or, if we include quantum effects, the quantum mechanics of point-particles. Thus, if we have a large number of particles such as molecules or atoms which constitute a macroscopic system, then, in principle, the dynamics is determined. Classically, we just have to work out solutions to Newton's laws. But for systems with large numbers of particles, such as the Avogadro number which may be necessary in some cases, this is a wholly impractical task. We do not have general solutions for the three-body problem in mechanics, let alone for $10^{23}$ particles. What we can attempt to do is a statistical approach, where one focuses on certain averages of interest, which can be calculated with some simplifying assumptions. This is the province of Statistical Mechanics.

If we have $N$ particles, in principle, we can calculate the future of the system if we are given the initial data, namely, the initial positions and velocities or momenta. Thus we need $6N$ input numbers. Already, as a practical matter, this is impossible, since $N \sim 10^{23}$ and we do not, in fact, cannot measure the initial positions and momenta for all the molecules in a gas at any time. So generally we can make the assumption that a probabilistic treatment is possible. The number of molecules is so large that we can take the initial data to be a set of random numbers, distributed according to some probability distribution. This is the basic working hypothesis of statistical mechanics. To get some feeling for how large numbers lead to simplification, we start with the binomial distribution.

7.1: The Binomial Distribution
7.2: Maxwell-Boltzmann Statistics
7.3: The Maxwell Distribution For Velocities
7.4: The Gibbsian Ensembles
7.5: Equation of State
7.6: Fluctuations
7.7: Internal Degrees of Freedom
7.8: Examples

7: Classical Statistical Mechanics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


7.1: The Binomial Distribution

This is exemplified by the tossing of a coin. For a fair coin, we expect that if we toss it a very large number of times, then roughly half the time we will get heads and half the time we will get tails. We can say that the probability of getting heads is $\frac{1}{2}$ and the probability of getting tails is $1 - \frac{1}{2} = \frac{1}{2}$. Thus the two possibilities have equal a priori probabilities.

Now consider the simultaneous tossing of $N$ coins. What are the probabilities? For example, if $N = 2$, the possibilities are $HH$, $HT$, $TH$ and $TT$. There are two ways we can get one head and one tail, so the probabilities are $\frac{1}{4}$, $\frac{1}{2}$, and $\frac{1}{4}$ for two heads, one head and no heads respectively. The probability for one head (and one tail) is higher because there are many (two in this case) ways to get that result. So we can ask: How many ways can we get $n_1$ heads (and $(N - n_1)$ tails)? This is given by the number of ways we can choose $n_1$ out of $N$, to which we can assign the heads. In other words, it is given by

$$W(n_1, n_2) = \frac{N!}{n_1!\,(N - n_1)!} = \frac{N!}{n_1!\,n_2!}, \qquad n_1 + n_2 = N \tag{7.1.1}$$

The probability for any arrangement $n_1$, $n_2$ will be given by

$$p(n_1, n_2) = \frac{W(n_1, n_2)}{\sum_{n'_1, n'_2} W(n'_1, n'_2)} = \frac{W(n_1, n_2)}{2^N} \tag{7.1.2}$$

where we have used the binomial theorem to write the denominator as $2^N$. This probability as a function of $n_1$, for large values of $N$, is shown in Figure 7.1.1. Notice that already for $N = 8$, the distribution is sharply peaked around the middle value of $n_1 = 4$. This becomes more and more pronounced as $N$ becomes large. We can check the place where the maximum occurs by noting that the values $n_1$ and $(n_1 + 1)$ are very close to each other, infinitesimally different for $N \to \infty$, so that $x = \frac{n_1}{N}$ may be taken to be continuous as $N \to \infty$. Further, for large numbers, we can use the Stirling formula

$$\log N! \approx N\log N - N \tag{7.1.3}$$

Figure 7.1.1: The binomial distribution showing $W(n_1, n_2)$ as a function of $n_1$ = the number of heads, for $N = 8$

Then we get

$$\begin{aligned}
\log p &= \log W(n_1, n_2) - N\log 2 \\
&\approx N\log N - N - (n_1\log n_1 - n_1) - (N - n_1)\log(N - n_1) + (N - n_1) - N\log 2 \\
&\approx -N\left[x\log x + (1 - x)\log(1 - x)\right] - N\log 2
\end{aligned}$$

This has a maximum at $x = x^* = \frac{1}{2}$. Expanding around this value, we get

$$\log p \approx -2N(x - x^*)^2 + \mathcal{O}\left((x - x^*)^3\right), \quad \text{or} \quad p \approx \exp\left(-2N(x - x^*)^2\right) \tag{7.1.4}$$

We see that the distribution is peaked around $x^*$ with a width given by $\Delta x^2 \sim \frac{1}{4N}$. The probability of deviation from the mean value is very very small as $N \to \infty$. This means that many quantities can be approximated by their mean values or values at the maximum of the distribution.


We have considered equal a priori probabilities. If we did not have equal probabilities then the result will be different. For example, suppose we had a coin with a probability of $q$, $0 < q < 1$, for heads and probability $(1 - q)$ for tails. Then the probability for $N$ coins would go like

$$p(n_1, n_2) = q^{n_1}(1 - q)^{n_2}\,W(n_1, n_2) \tag{7.1.5}$$

(Note that $q = \frac{1}{2}$ reproduces Equation 7.1.2.) The maximum is now at $x = x^* = q$. The standard deviation from the maximum value is unchanged.

Here we considered coins for each of which there are only two outcomes, head or tail. If we have a die with 6 outcomes possible, we must consider splitting $N$ into $n_1, n_2, \cdots, n_6$. Thus we can first choose $n_1$ out of $N$ in $\frac{N!}{n_1!(N - n_1)!}$ ways, then choose $n_2$ out of the remaining $N - n_1$ in $\frac{(N - n_1)!}{n_2!(N - n_1 - n_2)!}$ ways and so on, so that the number of ways we can get a particular assignment $n_1, n_2, \cdots, n_6$ is

$$W(\{n_i\}) = \frac{N!}{n_1!(N - n_1)!}\cdot\frac{(N - n_1)!}{n_2!(N - n_1 - n_2)!}\cdots = \frac{N!}{n_1!\,n_2!\cdots n_6!}, \qquad \sum_i n_i = N \tag{7.1.6}$$

More generally, the number of ways we can distribute $N$ particles into $K$ boxes is

$$W(\{n_i\}) = N!\prod_{i=1}^{K}\frac{1}{n_i!}, \qquad \sum_{i=1}^{K} n_i = N \tag{7.1.7}$$

Basically, this gives the multinomial distribution.

7.1: The Binomial Distribution is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
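The sharpening described by Equation 7.1.4 is easy to see numerically. The following short Python snippet (an added illustration) computes the exact binomial distribution for a fair coin and checks that the rms spread of $x = n_1/N$ tracks $1/\sqrt{4N}$.

```python
import numpy as np
from scipy.stats import binom

# The binomial distribution sharpens around n1/N = 1/2 as N grows,
# with rms width ~ 1/sqrt(4N), as in Eq. 7.1.4.
for N in (8, 100, 10000):
    n1 = np.arange(N + 1)
    p = binom.pmf(n1, N, 0.5)            # exact probabilities p(n1)
    x = n1 / N
    rms = np.sqrt(np.sum(p * (x - 0.5)**2))
    print(f"N = {N:6d}:  rms(x - 1/2) = {rms:.4f},  1/sqrt(4N) = {1/np.sqrt(4*N):.4f}")
```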


7.2: Maxwell-Boltzmann Statistics

Now we can see how all this applies to the particles in a gas. The analog of heads or tails would be the momenta and other numbers which characterize the particle properties. Thus, we can consider $N$ particles distributed into different cells, each of the cells standing for a collection of observables or quantum numbers which can characterize the particle. The number of ways in which $N$ particles can be distributed into these cells, say $K$ of them, would be given by Equation 7.1.7. The question is then about a priori probabilities. The basic assumption which is made is that there is nothing to prefer one set of values of observables over another, so we assume equal a priori probabilities. This is a key assumption of statistical mechanics. So we want to find the distribution of particles into different possible values of momenta or other observables by maximizing the probability

$$p(\{n_i\}) = C\,N!\prod_{i}^{K}\frac{1}{n_i!} \tag{7.2.1}$$

Here $C$ is a normalization constant, given by $C^{-1} = \sum_{\{n_i\}} W(\{n_i\})$, the analog of the $2^N$ in Equation 7.1.2. Now the maximization has to be done obviously keeping in mind that $\sum_i n_i = N$, since we have a total of $N$ particles. But this is not the only condition. For example, energy is a conserved quantity and if we have a system with a certain energy $U$, no matter how we distribute the particles into different choices of momenta and so on, the energy should be the same. Thus the maximization of probability should be done subject to this condition. Any other conserved quantity should also be preserved. Thus our condition for the equilibrium distribution should read

$$\delta p(\{n_i\}) = 0, \qquad \text{subject to } \sum_i n_i\,\mathcal{O}^{(\alpha)}_i = \text{fixed} \tag{7.2.2}$$

where the $\mathcal{O}^{(\alpha)}$ (for various values of $\alpha$) give the conserved quantities, the total particle number and energy being two such observables.

The maximization of probability seems very much like what is given by the second law of thermodynamics, wherein equilibrium is characterized by maximization of entropy. In fact this suggests that we can define entropy in terms of probability or $W$, so that the condition of maximization of probability is the same as the condition of maximization of entropy. This identification was made by Boltzmann who defined entropy corresponding to a distribution $\{n_i\}$ of particles among various values of momenta and other observables by

$$S = k\log W(\{n_i\}) \tag{7.2.3}$$

where $k$ is the Boltzmann constant. For two completely independent systems $A$, $B$, we need $S = S_A + S_B$, while $W = W_A W_B$. Thus the relation should be in terms of $\log W$. This equation is one of the most important formulae in physics. It is true even for quantum statistics, where the counting of the number of ways of distributing particles is different from what is given by Equation 7.1.7. We will calculate entropy using this and show that it agrees with the thermodynamic properties expected of entropy. We can restate Boltzmann's hypothesis as

$$p(\{n_i\}) = C\,W(\{n_i\}) = C\,e^{S/k} \tag{7.2.4}$$

With this identification, we can write

$$S = k\log\left[N!\prod_i\frac{1}{n_i!}\right] \tag{7.2.5}$$

$$\approx k\left[N\log N - N - \sum_i\left(n_i\log n_i - n_i\right)\right] \tag{7.2.6}$$

We will consider a simple case where the single particle energy values are $\epsilon_i$ (where $i$ may be interpreted as a momentum label) and we have only two conserved quantities to keep fixed, the particle number and the energy. To carry out the variation subject to the conditions we want to impose, we can use Lagrange multipliers. We add terms $\lambda\left(\sum_i n_i - N\right) - \beta\left(\sum_i n_i\epsilon_i - U\right)$, and vary the parameters (or multipliers) $\beta$, $\lambda$ to get the two conditions

$$\sum_i n_i = N, \qquad \sum_i n_i\epsilon_i = U \tag{7.2.7}$$


Since these are anyway obtained as variational conditions, we can vary $n_i$ freely without worrying about the constraints, when we try to maximize the entropy. Usually we use $\mu$ instead of $\lambda$, where $\lambda = \beta\mu$, so we will use this way of writing the Lagrange multiplier. The equilibrium condition now becomes

$$\delta\left[\frac{S}{k} - \beta\left(\sum_i n_i\epsilon_i - U\right) + \lambda\left(\sum_i n_i - N\right)\right] = 0 \tag{7.2.8}$$

This simplifies to

$$\sum_i \delta n_i\left(\log n_i + \beta\epsilon_i - \beta\mu\right) = 0 \tag{7.2.9}$$

Since the $n_i$ are not constrained, this means that the quantity in brackets should vanish, giving the solution

$$n_i = e^{-\beta(\epsilon_i - \mu)} \tag{7.2.10}$$

This is the value at which the probability and entropy are a maximum. It is known as the Maxwell-Boltzmann distribution. As in the case of the binomial distribution, the variation around this value is very very small for large values of $n_i$, so that observable values can be obtained by using just the solution in Equation 7.2.10. We still have the conditions from Equation 7.2.7 obtained as maximization conditions (for variation of $\beta$, $\lambda$), which means that

$$\sum_i e^{-\beta(\epsilon_i - \mu)} = N, \qquad \sum_i \epsilon_i\,e^{-\beta(\epsilon_i - \mu)} = U \tag{7.2.11}$$

The first of these conditions will determine $\mu$ in terms of $N$ and the second will determine $\beta$ in terms of the total energy $U$.

In order to complete the calculation, we need to identify the summation over the index $i$. This should cover all possible states of each particle. For a free particle, this would include all momenta and all possible positions. This means that we can replace the summation by an integration over $d^3x\,d^3p$. Further the single-particle energy is given by

$$\epsilon = \frac{p^2}{2m} \tag{7.2.12}$$

Since

$$\int d^3x\,d^3p\,\exp\left(-\frac{\beta p^2}{2m}\right) = V\left(\frac{2\pi m}{\beta}\right)^{\frac{3}{2}} \tag{7.2.13}$$

we find from 7.2.11

$$\beta = \frac{3N}{2U}, \qquad \beta\mu = \log\left[\frac{N}{V}\left(\frac{\beta}{2\pi m}\right)^{\frac{3}{2}}\right] = \log\left[\frac{N}{V}\left(\frac{3N}{4\pi m U}\right)^{\frac{3}{2}}\right] \tag{7.2.14}$$

The value of the entropy at the maximum can now be expressed as

$$S = k\left[\frac{5}{2}N - N\log N + N\log V - \frac{3}{2}N\log\left(\frac{3N}{4\pi m U}\right)\right] \tag{7.2.15}$$

From this, we find the relations


$$\left(\frac{\partial S}{\partial U}\right)_{V,N} = k\,\frac{3N}{2U} = k\beta \tag{7.2.16}$$

$$\left(\frac{\partial S}{\partial N}\right)_{V,U} = -k\log\left[\frac{N}{V}\left(\frac{3N}{4\pi m U}\right)^{\frac{3}{2}}\right] = -k\beta\mu \tag{7.2.17}$$

$$\left(\frac{\partial S}{\partial V}\right)_{U,N} = k\,\frac{N}{V} \tag{7.2.18}$$

Comparing these with

$$dU = T\,dS - p\,dV + \mu\,dN \tag{7.2.19}$$

which is the same as Equation 5.1.8, we identify

$$\beta = \frac{1}{kT} \tag{7.2.20}$$

Further, $\mu$ is the chemical potential and $U$ is the internal energy. The last relation in Equation 7.2.18 tells us that

$$p = \frac{NkT}{V} \tag{7.2.21}$$

which is the ideal gas equation of state.

Once we have the identification from Equation 7.2.20, we can also express the chemical potential and internal energy as functions of the temperature:

$$\mu = kT\log\left[\frac{N}{V}\left(\frac{1}{2\pi m k T}\right)^{\frac{3}{2}}\right], \qquad U = \frac{3}{2}NkT \tag{7.2.22}$$

The last relation also gives the specific heats for a monatomic ideal gas as

$$C_v = \frac{3}{2}Nk, \qquad C_p = \frac{5}{2}Nk \tag{7.2.23}$$

These specific heats do not vanish as $T \to 0$, so clearly we are not consistent with the third law of thermodynamics. This is because of the classical statistics we have used. The third law is a consequence of quantum dynamics. So, apart from the third law, we see that with Boltzmann's identification of entropy as $S = k\log W$, we get all the expected thermodynamic relations and explicit formulae for the thermodynamic quantities.

We have not addressed the normalization of the entropy carefully so far. There are two factors of importance. In arriving at Equation 7.2.15, we omitted the $N!$ in $W$, using $W/N!$ in Boltzmann's formula rather than $W$ itself. This division by $N!$ is called the Gibbs factor and helps to avoid a paradox about the entropy of mixing, as explained below. Further, the number of states cannot be just given by $d^3x\,d^3p$ since this is, among other reasons, a dimensionful quantity. The correct prescription comes from quantum mechanics which gives the semiclassical formula for the number of states as

$$\text{Number of states} = \frac{d^3x\,d^3p}{(2\pi\hbar)^3} \tag{7.2.24}$$

where $\hbar = h/2\pi$, $h$ being Planck's constant. Including this factor, the entropy can be expressed as

$$S = Nk\left[\frac{5}{2} + \log\left(\frac{V}{N}\right) + \frac{3}{2}\log\left(\frac{U}{N}\right) + \frac{3}{2}\log\left(\frac{4\pi m}{3(2\pi\hbar)^2}\right)\right] \tag{7.2.25}$$

This is known as the Sackur-Tetrode formula for the entropy of a classical ideal gas.
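As a numerical illustration of the Sackur-Tetrode formula 7.2.25 (added here; the gas and conditions are assumed for the example), the snippet below evaluates $S/Nk$ for helium at room temperature and atmospheric pressure, using $U/N = \frac{3}{2}kT$ and $V/N = kT/p$.

```python
import numpy as np

# Sackur-Tetrode entropy per particle (in units of k), Eq. 7.2.25,
# for a monatomic ideal gas.  Assumed values: helium at T = 300 K, p = 1 atm.
k    = 1.380649e-23                  # J/K
hbar = 1.054571817e-34               # J s
m    = 4.0026 * 1.66053907e-27       # helium atom mass, kg
T, p = 300.0, 1.013e5

V_per_N = k*T/p                      # V/N from the ideal gas law
U_per_N = 1.5*k*T                    # U/N for a monatomic gas

S_per_Nk = 2.5 + np.log(V_per_N) + 1.5*np.log(U_per_N) \
           + 1.5*np.log(4*np.pi*m/(3*(2*np.pi*hbar)**2))
print(f"S/(Nk) ≈ {S_per_Nk:.2f}")    # about 15, i.e. roughly 125 J/(mol K)
```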


Gibbs Paradox

Our expression for entropy has omitted the factor $N!$. The original formula for the entropy in terms of $W$ includes the factor of $N!$ in $W$. This corresponds to an additional factor of $N\log N - N$ in the formula from Equation 7.2.15. The question of whether we should keep it or not was considered immaterial since the entropy contained an additive undetermined term anyway. However, Gibbs pointed out a paradox that arises with such a result. Consider two ideal gases at the same temperature, originally with volumes $V_1$ and $V_2$ and number of particles $N_1$ and $N_2$. Assume they are mixed together; this creates some additional entropy which can be calculated as $S - S_1 - S_2$. Since $U = \frac{3}{2}NkT$, if we use the formula from Equation 7.2.15 without the factors due to $\frac{1}{N!}$ (which means with an additional $N\log N - N$), we find

$$S - S_1 - S_2 = k\left[N\log V - N_1\log V_1 - N_2\log V_2\right] \tag{7.2.26}$$

(We have also ignored the constants depending on $m$ for now.) This entropy of mixing can be tested experimentally and is indeed correct for monatomic nearly ideal gases. The paradox arises when we think of making the gases more and more similar, taking a limit when they are identical. In this case, we should not get any entropy of mixing, but the above formula gives

$$S - S_1 - S_2 = k\left[N_1\log\left(\frac{V}{V_1}\right) + N_2\log\left(\frac{V}{V_2}\right)\right] \tag{7.2.27}$$

(In this limit, the constants depending on $m$ are the same, which is why we did not have to worry about it in posing this question.) This is the paradox. Gibbs suggested dividing out the factor of $N!$, which leads to the formula in Equation 7.2.25. If we use that formula, there is no change for identical gases because the specific volume $V/N$ is the same before and after mixing. For dissimilar gases, the entropy of mixing formula is still obtained. The Gibbs factor of $N!$ arises naturally in quantum statistics.

Equipartition

The formula for the energy of a single particle is given by

$$\epsilon = \frac{p^2}{2m} = \frac{p_1^2 + p_2^2 + p_3^2}{2m} \tag{7.2.28}$$

If we consider the integral in Equation 7.2.13 for each direction of $p$, we have

$$\int dx\,dp\,\exp\left(-\frac{\beta p_1^2}{2m}\right) = L_1\left(\frac{2\pi m}{\beta}\right)^{\frac{1}{2}} \tag{7.2.29}$$

The corresponding contribution to the internal energy is $\frac{1}{2}kT$, so that for the three degrees of freedom we get $\frac{3}{2}kT$, per particle. We have considered the translational degrees of freedom corresponding to the movement of the particle in 3-dimensional space. For more complicated molecules, one has to include rotational and vibrational degrees of freedom. In general, in classical statistical mechanics, we will find that for each degree of freedom we get $\frac{1}{2}kT$. This is known as the equipartition theorem. The specific heat is thus given by

$$C_v = \frac{1}{2}k \times \text{number of degrees of freedom} \tag{7.2.30}$$

Quantum mechanically, equipartition does not hold, at least in this simple form, which is as it should be, since we know the specific heats must go to zero as $T \to 0$.

7.2: Maxwell-Boltzmann Statistics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


7.3: The Maxwell Distribution For Velocities

The most probable distribution of velocities of particles in a gas is given by Equation 7.2.10 with $\epsilon = \frac{p^2}{2m} = \frac{1}{2}mv^2$. Thus we expect the distribution function for velocities to be

$$f(v)\,d^3v = C\exp\left(-\frac{mv^2}{2kT}\right)d^3v \tag{7.3.1}$$

This is known as the Maxwell distribution. Maxwell arrived at this by an ingenious argument many years before the derivation we gave in the last section was worked out. He considered the probability of a particle having velocity components $(v_1, v_2, v_3)$. If the probability of a particle having the x-component of velocity between $v_1$ and $v_1 + dv_1$ is $f(v_1)\,dv_1$, then the probability for $(v_1, v_2, v_3)$ would be

$$\text{Probability of } (v_1, v_2, v_3) = f(v_1)f(v_2)f(v_3)\,dv_1\,dv_2\,dv_3 \tag{7.3.2}$$

Since the dynamics along the three dimensions are independent, the probability should be the product of the individual ones. Further, there is nothing to single out any particular Cartesian component, they are all equivalent, so the function $f$ should be the same for each direction. This leads to Equation 7.3.2. Finally, we have rotational invariance in a free gas with no external potentials, so the probability should be a function only of the speed $v = \sqrt{v_1^2 + v_2^2 + v_3^2}$. Thus we need a function $f$ such that $f(v_1)f(v_2)f(v_3)$ depends only on $v$. The only solution is for $f$ to be of the form

$$f(v_1) \propto \exp\left(-\alpha v_1^2\right) \tag{7.3.3}$$

for some constant $\alpha$. The distribution of velocities is thus

$$f\,d^3v = C\exp\left(-\alpha v^2\right)d^3v \tag{7.3.4}$$

Since the total probability $\int f\,d^3v$ must be one, we can identify the constant as $C = \left(\frac{\alpha}{\pi}\right)^{\frac{3}{2}}$.

We now consider particles colliding with the wall of the container, say the face at $x = L_1$, as shown in Fig. 7.3.1. The momentum imparted to the wall in an elastic collision is $\Delta p_1 = 2mv_1$. At any given instant roughly half of the molecules will have a velocity component $v_1$ towards the wall. All of them in a volume $v_1 \times A$ (where $A$ is the area of the face) will reach the wall in one second, so that the force on the wall due to collisions is

$$F = \frac{1}{2}\left(\frac{N}{V}\right)A\,v_1 \times (2mv_1) = \left(\frac{N}{V}\right)A\,m\,v_1^2 \tag{7.3.5}$$

Averaging over $v_1^2$ using Equation 7.3.4, we get the pressure

$$p = \frac{\langle F\rangle}{A} = \left(\frac{N}{V}\right)m\langle v_1^2\rangle = \left(\frac{N}{V}\right)\frac{m}{2\alpha} \tag{7.3.6}$$

Figure 7.3.1: A typical collision with the wall of the container at $x = L_1$. The velocity component $v_1$ before and after collision is shown by the dotted line.

Comparing with the ideal gas law, we can identify $\alpha$ as $\frac{m}{2kT}$. Thus the distribution of velocities is


$$f(v)\,d^3v = \left(\frac{m}{2\pi kT}\right)^{\frac{3}{2}}\exp\left(-\frac{mv^2}{2kT}\right)d^3v = \left(\frac{1}{2\pi m kT}\right)^{\frac{3}{2}}\exp\left(-\frac{\epsilon}{kT}\right)d^3p \tag{7.3.7}$$

This is in agreement with Equation 7.2.10.

Adapting Maxwell's Argument to a Relativistic Gas

Maxwell's argument leading to Equation 7.3.7 is so simple and elegant that it is tempting to see if there are other situations to which such a symmetry-based reasoning might be applied. The most obvious case would be a gas of free particles for which relativistic effects are taken into account. In this case, $\epsilon = \sqrt{p^2 + m^2}$ and it is clear that $e^{-\beta\epsilon}$ cannot be obtained from a product of the form $f(p_1)f(p_2)f(p_3)$. So, at first glance, Maxwell's reasoning seems to fail. But this is not quite so, as the following line of reasoning will show.

As a first step, notice that the distribution in Equation 7.3.7 is for a gas which has no overall drift motion. This is seen by noting that

$$\langle v_i\rangle = \left(\frac{1}{2\pi m kT}\right)^{\frac{3}{2}}\int d^3p\,\frac{p_i}{m}\exp\left(-\frac{\epsilon}{kT}\right) = 0 \tag{7.3.8}$$

We can include an overall velocity by changing the distribution to

$$f(p) = \left(\frac{1}{2\pi m kT}\right)^{\frac{3}{2}}\exp\left(-\frac{(\vec{p} - m\vec{u})^2}{2mkT}\right) \tag{7.3.9}$$

It is easily verified that $\langle v_i\rangle = u_i$. It is important to include the overall motion in the reasoning since the symmetry is the full set of Lorentz transformations in the relativistic case and they include velocity-transformations.

Secondly, we note that in the relativistic case we have the 4-momentum $p_\mu$, $\mu = 0, 1, 2, 3$, and what is needed to sum over all states is not the integration over all $p_\mu$; rather we must integrate with the invariant measure

$$d\mu = d^4p\,\delta(p^2 - m^2)\,\Theta(p_0) \tag{7.3.10}$$

where $\Theta$ is the step function,

$$\Theta(x) = \begin{cases} 1 & \text{for } x > 0 \\ 0 & \text{for } x < 0\end{cases} \tag{7.3.11}$$

Further the $\delta$-function can be expanded as

$$\delta(p^2 - m^2) = \delta\left(p_0^2 - \vec{p}^{\,2} - m^2\right) = \frac{1}{2p_0}\left[\delta\left(p_0 - \sqrt{\vec{p}^{\,2} + m^2}\right) + \delta\left(p_0 + \sqrt{\vec{p}^{\,2} + m^2}\right)\right] \tag{7.3.12}$$

The integration over $p_0$ is trivial because of these equations and we find that

$$\int d\mu\,f(p_\mu) = \int\frac{d^3p}{2\sqrt{\vec{p}^{\,2} + m^2}}\,f\!\left(\sqrt{\vec{p}^{\,2} + m^2},\,\vec{p}\right) \tag{7.3.13}$$

Now we seek a function which can be written in the form $f(p_0)f(p_1)f(p_2)f(p_3)$ involving the four components of $p_\mu$ and integrate it with the measure in Equation 7.3.10. The function $f$ must also involve the drift velocity in general. In the relativistic case, this is the 4-velocity $U^\mu$, whose components are

$$U^0 = \frac{1}{\sqrt{1 - \vec{u}^{\,2}}}, \qquad U^i = \frac{u^i}{\sqrt{1 - \vec{u}^{\,2}}} \tag{7.3.14}$$

The solution is again an exponential

$$f(p) = C\exp\left(-\beta\,p^\mu U_\mu\right) = C\exp\left(-\beta\left(p^0 U^0 - \vec{p}\cdot\vec{U}\right)\right) \tag{7.3.15}$$


With the measure from Equation 7.3.10, we find

$$\int d\mu\,f(p)\,B(p) = C\int\frac{d^3p}{2\epsilon_p}\,\exp\left(-\beta\left(\epsilon_p U^0 - \vec{p}\cdot\vec{U}\right)\right)B(\epsilon_p, \vec{p}) \tag{7.3.16}$$

for any observable $B(p)$, where $\epsilon_p = \sqrt{p^2 + m^2}$. At this stage, if we wish to, we can consider a gas with no overall motion, setting $\vec{u} = 0$, to get

$$\langle B\rangle = C\int\frac{d^3p}{2\epsilon_p}\,\exp\left(-\beta\epsilon_p\right)B(\epsilon_p, \vec{p}) \tag{7.3.17}$$

This brings us back to a form similar to Equation 7.2.10, even for the relativistic case.

7.3: The Maxwell Distribution For Velocities is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
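The nonrelativistic distribution 7.3.7 is easy to check numerically. The sketch below is an added illustration with assumed parameters (argon at 300 K): each velocity component is Gaussian with variance $kT/m$, and the sampled mean kinetic energy should come out as $\frac{3}{2}kT$, in line with equipartition.

```python
import numpy as np

# Sample velocities from the Maxwell distribution, Eq. 7.3.7.
rng = np.random.default_rng(0)
k = 1.380649e-23                     # J/K
m = 39.95 * 1.66053907e-27           # argon atom mass, kg (assumed)
T = 300.0
n_samples = 200_000

v = rng.normal(0.0, np.sqrt(k*T/m), size=(n_samples, 3))   # (v1, v2, v3)
mean_KE = 0.5*m*np.mean(np.sum(v**2, axis=1))

print(f"<(1/2) m v^2> = {mean_KE:.3e} J,   (3/2) kT = {1.5*k*T:.3e} J")
```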


7.4: The Gibbsian Ensembles

The distribution which was obtained in Equation 7.2.10 gives the most probable number of particles with momentum $p$ as $n_p = \exp(-\beta(\epsilon_p - \mu))$. This was obtained by considering the number of ways in which free particles can be distributed among possible momentum values subject to the constraints of fixed total number of particles and total energy. We want to consider some generalizations of this now. First of all, one can ask whether a similar formula holds if we have an external potential. The barometric formula (2.13) has a similar form since $mgh$ is the potential energy of a molecule or atom in that context. So, for external potentials, one can make a similar argument.

Interatomic or intermolecular forces are not so straightforward. In principle, if we have intermolecular forces, single particle energy values are not easily identified. Further, in some cases, one may even have new molecules formed by combinations or bound states of old ones. Should they be counted as one particle or two or more? So, one needs to understand the distribution from a more general perspective. The idea is to consider the physical system of interest as part of a larger system, with exchange of energy with the larger system. This certainly is closer to what is really obtained in most situations. When we study or do experiments with a gas at some given temperature, it is maintained at this temperature by being part of a larger system with which it can exchange energy. Likewise, one could also consider a case where exchange of particles is possible. The important point is that, if equilibrium is being maintained, the exchange of energy or particles with a larger system will not change the distribution in the system under study significantly. Imagine high energy particles get scattered into the volume of gas under study from the environment. This can raise the temperature slightly. But there will be a roughly equal number of particles of similar energy being scattered out of the volume under study as well. Thus while we will have fluctuations in energy and particle number, these will be very small compared to the average values, in the limit of large numbers of particles. So this approach should be a good way to analyze systems statistically.

Arguing along these lines one can define three standard ensembles for statistical mechanics: the micro-canonical, the canonical and the grand canonical ensembles. The canonical ensemble is the case where we consider the system under study (of fixed volume $V$) as one of a large number of similar systems which are all in equilibrium with larger systems with free exchange of energy possible. For the grand canonical ensemble, we also allow free exchange of particles, so that only the average value of the number of particles in the system under study is fixed. The micro-canonical ensemble is the case where we consider a system with fixed energy and fixed number of particles. (One could also consider fixing the values of other conserved quantities, either at the average level (for the grand canonical case) or as rigidly fixed values (for the micro-canonical case).)

We still need a formula for the probability for a given distribution of particles in various states. In accordance with the assumption of equal a priori probabilities, we expect the probability to be proportional to the number of states available to the system subject to the constraints on the conserved quantities. In classical mechanics, the set of possible trajectories for a system of particles is given by the phase space, since the latter constitutes the set of possible initial data. Thus the number of states for a system of $N$ particles would be proportional to the volume of the subspace of the phase space defined by the conserved quantities. In quantum mechanics, the number of states would be given in terms of the dimension of the Hilbert space. The semiclassical formula for the counting of states is then

$$dN = \prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3} \tag{7.4.1}$$

In other words, a cell of volume $(2\pi\hbar)^{3N}$ in phase space corresponds to a state in the quantum theory. (This holds for large numbers of states; in other words, it is semiclassical.) This gives a more precise meaning to the counting of states via the phase volume. In the microcanonical ensemble, the total number of states with total energy between $E$ and $E + \delta E$ would be

$$N = \int_{H=E}^{H=E+\delta E}\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3} \equiv W(E) \tag{7.4.2}$$

where $H(\{x\},\{p\})$ is the Hamiltonian of the $N$-particle system. The entropy is then defined by Boltzmann's formula as $S(E) = k\log W(E)$. For a Hamiltonian $H = \sum_i\frac{p_i^2}{2m}$, this can be explicitly calculated and leads to the formulae we have already obtained. However, as explained earlier, this is not easy to do explicitly when the particles are interacting. Nevertheless, the key idea is that the required phase volume is proportional to the exponential of the entropy,

$$\text{Probability} \propto \exp\left(\frac{S}{k}\right) \tag{7.4.3}$$


This idea can be carried over to the canonical and grand canonical ensembles.

In the canonical ensemble, we consider the system of interest as part of a much larger system, with, say, $N + M$ particles. The total number of available states is then

$$dN = \prod_{i=1}^{N+M}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3} \tag{7.4.4}$$

The idea is then to consider integrating over the $M$ particles to obtain the phase volume for the remaining, viewed as a subsystem. We refer to this subsystem of interest as system 1 while the $M$ particles which are integrated out will be called system 2. If the total energy is $E$, we take system 1 to have energy $E_1$, with system 2 having energy $E - E_1$. Of course, $E_1$ is not fixed, but can vary as there can be some amount of exchange of energy between the two systems. Integrating out system 2 leads to

$$\delta N = \prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\,W(E - E_1) = \prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\,e^{S(E - E_1)/k} \tag{7.4.5}$$

We then expand $S(E - E_1)$ as

$$\begin{aligned}
S(E - E_1) &= S(E) - E_1\left(\frac{\partial S}{\partial E}\right)_{V,N} + \ldots \\
&= S(E) - \frac{E_1}{T} + \ldots \\
&= S(E) - \frac{H_N}{T} + \ldots
\end{aligned} \tag{7.4.6}$$

where we have used the thermodynamic formula for the temperature. The temperature is the same for system 1 and the larger system (system 1 + system 2) of which it is a part. $H_N$ is the Hamiltonian of the $N$ particles in system 1. This shows that, as far as the system under study is concerned, we can take the probability as

$$\text{Probability} = C\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\,\exp\left(-\frac{H_N(x, p)}{kT}\right) \tag{7.4.7}$$

Here $C$ is a proportionality factor which can be set by the normalization requirement that the total probability (after integration over all remaining variables) is 1. (The factor $e^{S(E)/k}$ from Equation 7.4.6 can be absorbed into the normalization as it is a constant independent of the phase space variables for the particles in system 1. Also, the subscript 1 referring to the system under study is now redundant and has been removed.)

There are higher powers in the Taylor expansion in Equation 7.4.6 which have been neglected. The idea is that these are very small as $E_1$ is small compared to the energy of the total system. In doing the integration over the remaining phase space variables, in principle, one could have regions with $H_N$ comparable to $E$, and the neglect of terms of order $E_1^2$ may not seem justified. However, the formula 7.4.7 in terms of the energy is sharply peaked around a certain average value with fluctuations being very small, so that the regions with $E_1$ comparable to $E$ will have exponentially vanishing probability. This is the ultimate justification for neglecting the higher terms in the expansion from Equation 7.4.6. We can a posteriori verify this by calculating the mean square fluctuation in the energy value which is given by the probability distribution in Equation 7.4.7. This will be taken up shortly.

Turning to the grand canonical case, when we allow exchange of particles as well, we get

$$\begin{aligned}
S(E - E_1,\ (N + M) - N) &= S(E, N + M) - E_1\left(\frac{\partial S}{\partial U}\right)_{V,N+M} - N\left(\frac{\partial S}{\partial N}\right)_{U,V} + \ldots \\
&= S(E, N + M) - \frac{E_1}{T} + \frac{\mu}{T}N + \ldots \\
&= S(E, N + M) - \frac{1}{T}\left(H_N - \mu N\right) + \ldots
\end{aligned} \tag{7.4.8}$$


By a similar reasoning as in the case of the canonical ensemble, we find, for the grand canonical ensemble,

$$\text{Probability} \equiv dp_N = C\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\,\exp\left(-\frac{H(x, p) - \mu N}{kT}\right) \tag{7.4.9}$$

More generally, let us denote by $\mathcal{O}^\alpha$ an additively conserved quantum number or observable other than energy. The general formula for the probability distribution is then

$$dp_N = C\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\,\exp\left(-\frac{H(x, p) - \sum_\alpha\mu_\alpha\mathcal{O}^\alpha}{kT}\right) \tag{7.4.10}$$

Even though we write the expression for $N$ particles, it should be kept in mind that averages involve a summation over $N$ as well. Thus the average of some observable $B(x, p)$ is given by

$$\langle B\rangle = \sum_{N=0}^{\infty}\int dp_N\,B_N(x, p) \tag{7.4.11}$$

Since the normalization factor $C$ is fixed by the requirement that the total probability is 1, it is convenient to define the "partition function". In the canonical case, it is given by

$$Q_N = \frac{1}{N!}\int\left[\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\right]\exp\left(-\frac{H(x, p)}{kT}\right) \tag{7.4.12}$$

We have introduced an extra factor of $\frac{1}{N!}$. This is the Gibbs factor needed for resolving the Gibbs paradox; it is natural in the quantum counting of states. Effectively, because the particles are identical, permutation of particles should not be counted as a new configuration, so the phase volume must be divided by $N!$ to get the "correct" counting of states. We will see that even this is not entirely adequate when full quantum effects are taken into account. In the grand canonical case, the partition function is defined by

$$Z = \sum_N\frac{1}{N!}\int\left[\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\right]\exp\left(-\frac{H(x, p) - \sum_\alpha\mu_\alpha\mathcal{O}^\alpha}{kT}\right) \tag{7.4.13}$$

Using the partition functions in place of $\frac{1}{C}$, and including the Gibbs factor, we find the probability of a given configuration as

$$dp_N = \frac{1}{Q_N}\frac{1}{N!}\left[\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\right]\exp\left(-\frac{H(x, p)}{kT}\right) \tag{7.4.14}$$

while for the grand canonical case we have

$$dp_N = \frac{1}{Z}\frac{1}{N!}\left[\prod_{i=1}^{N}\frac{d^3x_i\,d^3p_i}{(2\pi\hbar)^3}\right]\exp\left(-\frac{H(x, p) - \sum_\alpha\mu_\alpha\mathcal{O}^\alpha}{kT}\right) \tag{7.4.15}$$

The partition function contains information about the thermodynamic quantities. Notice that, in particular,

$$\frac{1}{\beta}\frac{\partial}{\partial\mu_\alpha}\log Z = \langle\mathcal{O}^\alpha\rangle, \qquad
-\frac{\partial}{\partial\beta}\log Z = \left\langle H - \sum_\alpha\mu_\alpha\mathcal{O}^\alpha\right\rangle = U - \sum_\alpha\mu_\alpha\langle\mathcal{O}^\alpha\rangle \tag{7.4.16}$$

We can also define the average value of the entropy (not the entropy of the configuration corresponding to a particular way of distributing particles among states, but the average over the distribution) as

$$S = k\left[\log Z + \beta\left(U - \sum_\alpha\mu_\alpha\langle\mathcal{O}^\alpha\rangle\right)\right] \tag{7.4.17}$$

While the averages $U = \langle H\rangle$ and $\langle\mathcal{O}^\alpha\rangle$ do not depend on the factors of $N!$ and $(2\pi\hbar)^{3N}$, the entropy does. This is why we chose the normalization factors in Equation 7.4.12 to be what they are.


Consider the case when we have only one conserved quantity, the particle number, in addition to the energy. In this case, Equation 7.4.17 can be written as

$$\mu N = U - TS + kT\log Z \tag{7.4.18}$$

Comparing this with the definition of the Gibbs free energy in Equation 5.1.6 and its expression in terms of $\mu$ in Equation 5.1.16, we find that we can identify

$$pV = kT\log Z \tag{7.4.19}$$

This gives the equation of state in terms of the partition function.

These equations (7.4.13), (7.4.16) - (7.4.19) are very powerful. Almost all of the thermodynamics we have discussed before is contained in them. Further, they can be used to calculate various quantities, including corrections due to interactions among particles, etc. As an example, we can consider the calculation of corrections to the equation of state in terms of the intermolecular potential.

7.4: The Gibbsian Ensembles is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
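The semiclassical counting rule in Equation 7.4.1, one state per cell of phase-space volume $(2\pi\hbar)^3$, can be checked against an explicit quantum sum. The sketch below is an added illustration with assumed parameters: it compares the single-particle sum $\sum e^{-\beta E}$ over particle-in-a-box levels with the phase-space integral $V(mkT/2\pi\hbar^2)^{3/2}$; the two agree at the few-percent level once many levels are thermally occupied.

```python
import numpy as np

# Quantum single-particle partition sum for a cubical box of side L
# versus the semiclassical phase-space result V (m k T / 2 pi hbar^2)^{3/2}.
# Assumed parameters: a helium atom in a 10 nm box at 6 K (many levels occupied).
hbar = 1.054571817e-34
k    = 1.380649e-23
m    = 6.646e-27             # helium atom mass, kg
L    = 1e-8                  # box size, m
T    = 6.0                   # K

beta = 1.0/(k*T)
E1 = (np.pi*hbar)**2/(2*m*L**2)          # E = E1*(nx^2 + ny^2 + nz^2)

n = np.arange(1, 400)                    # quantum numbers in one direction
q1d = np.sum(np.exp(-beta*E1*n**2))      # the 3D sum factorizes as q1d**3

Q_quantum = q1d**3
Q_semiclassical = L**3 * (m*k*T/(2*np.pi*hbar**2))**1.5
print(f"quantum sum: {Q_quantum:.1f}   semiclassical: {Q_semiclassical:.1f}")
```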


7.5: Equation of State

The equation of state, as given by Equation 7.4.19, requires the computation of the grand canonical partition function. We will consider the case where the only conserved quantities are the Hamiltonian and the number of particles. The grand canonical partition function can then be written as

$$Z = \sum_{N=0}^{\infty} z^N Q_N = 1 + \sum_{N=1}^{\infty} z^N Q_N, \qquad z = e^{\beta\mu} \tag{7.5.1}$$

where $Q_N$ is the canonical partition function for a fixed number of particles, given in Equation 7.4.12. The variable $z = e^{\beta\mu}$ is called the fugacity. The easiest way to proceed to the equation of state is to consider an expansion of $\log Z$ in powers of $z$. This is known as the cumulant expansion and, explicitly, it takes the form

$$\log\left[1 + \sum_{N=1}^{\infty} z^N Q_N\right] = \sum_{n=1}^{\infty} z^n a_n \tag{7.5.2}$$

where the first few coefficients are easily worked out as

$$a_1 = Q_1, \qquad a_2 = Q_2 - \frac{1}{2}Q_1^2, \qquad a_3 = Q_3 - Q_2 Q_1 + \frac{1}{3}Q_1^3 \tag{7.5.3}$$

The Hamiltonian, for $N$ particles, has the form

$$H_N = \sum_i\frac{p_i^2}{2m} + \sum_{i<j}V(x_i, x_j) + \ldots \tag{7.5.4}$$

where we have included a general two-particle interaction and the ellipsis stands for possible 3-particle and higher interactions. For $N = 1$ we just have the first term. The $p$-integrations factorize and, if $V(x_i, x_j) = 0$, we get $Q_N = \frac{1}{N!}Q_1^N$. All cumulants $a_n$ vanish except for $a_1$. We can explicitly obtain

$$a_1 = Q_1 = \int\frac{d^3x\,d^3p}{(2\pi\hbar)^3}\exp\left(-\frac{\beta p^2}{2m}\right) = V\left(\frac{m}{2\pi\beta\hbar^2}\right)^{\frac{3}{2}} \tag{7.5.5}$$

For $Q_2$, we find

$$Q_2 = \frac{1}{2!}\frac{Q_1^2}{V^2}\int d^3x_1\,d^3x_2\,e^{-\beta V(x_1, x_2)} = \frac{Q_1^2}{2V}\int d^3x\,e^{-\beta V(x)} \tag{7.5.6}$$

where, in the second form, we have taken the potential to depend only on the difference $\vec{x}_1 - \vec{x}_2$ and carried out the integration over the center of mass coordinate. Thus

$$a_2 = -\frac{Q_1^2}{2V}\int d^3x\left(1 - e^{-\beta V(x)}\right) \tag{7.5.7}$$

Using the cumulant expansion to this order, we find $\log Z \approx zQ_1 + z^2 a_2$ and the average number of particles in the volume $V$ is given by

$$N = \frac{1}{\beta}\frac{\partial}{\partial\mu}\log Z \approx zQ_1 + 2z^2 a_2 \tag{7.5.8}$$

which can be solved for $z$ as


$$zQ_1 \approx N - 2a_2\frac{N^2}{Q_1^2} + \ldots \tag{7.5.9}$$

If we ignore the $a_2$-term, this relation is the same as what we found in Equation 7.2.14, with the addition of the $(2\pi\hbar)^3$ correction,

$$\beta\mu = \log\left[\frac{N}{V}\left(\frac{2\pi\hbar^2}{mkT}\right)^{\frac{3}{2}}\right] \tag{7.5.10}$$

Using Equation 7.5.9 (with the $a_2$-term) back in $\log Z$ and the expression 7.5.7 for $a_2$, we get

$$\frac{pV}{kT} \approx N\left[1 + \frac{N}{V}B_2 + \ldots\right], \qquad B_2 = \frac{1}{2}\int d^3x\left(1 - e^{-\beta V(x)}\right) \tag{7.5.11}$$

These formulae show explicitly the first correction to the ideal gas equation of state. The quantity $B_2$ is a function of temperature; it is called the second virial coefficient and can be calculated, once a potential is known, by carrying out the integration. Even for complicated potentials it can be done, at least, numerically. As a simple example, consider a hard sphere approximation to the interatomic potential,

$$V(r) = \begin{cases}\infty & r < r_0 \\ 0 & r > r_0\end{cases} \tag{7.5.12}$$

In this case the integral is easily done to obtain $B_2 = \frac{2\pi r_0^3}{3}$. This is independent of the temperature. One can consider more realistic potentials for better approximations to the equation of state.

The van der Waals equation, which we considered earlier, is, at best, a model for the equation of state incorporating some features of the interatomic forces. Here we have a more systematic way to calculate with realistic interatomic potentials. Nevertheless, there is a point of comparison which is interesting. If we expand the van der Waals equation in the form in Equation 7.5.11, it has $B_2 = b - \frac{a}{kT}$; the $b$ term is independent of the temperature. Comparing with the hard sphere repulsion at short distances, we see how something like the excluded volume effect can arise.

We have considered the first corrections due to the interatomic forces. More generally, the equation of state takes the form

$$\frac{pV}{kT} = N\left[1 + \sum_{n=2}^{\infty}\left(\frac{N}{V}\right)^{n-1}B_n\right] \tag{7.5.13}$$

This is known as the virial expansion, with $B_n$ referred to as the $n$-th virial coefficient. These are in general functions of the temperature; they can be calculated by continuing the cumulant expansion to higher order and doing the integrations needed for $Q_N$. In practice such a calculation becomes more and more difficult as $N$ increases. The virial expansion in Equation 7.5.13 is in powers of the density $\frac{N}{V}$ and integrations involving powers of $e^{-\beta V_{int}}$, where $V_{int}$ is the potential energy of the interaction. Thus for low densities and interaction strengths small compared to $kT$, truncation of the series at some finite order is a good approximation. So only a few of the virial coefficients are usually calculated.

It is useful to calculate corrections to some of the other quantities as well. From the identification of $N$ in Equation 7.5.8 and from Equation 7.4.16, we can find the internal energy as

$$U = \frac{3}{2}NkT + \frac{N^2}{2V}\int d^3x\,V(x)\,e^{-\beta V(x)} + \ldots \tag{7.5.14}$$

In a similar way, the entropy is found to be

$$S = Nk\left[\frac{5}{2} + \log\frac{V}{N} + \frac{3}{2}\log\left(\frac{mkT}{2\pi\hbar^2}\right) + \frac{N}{2V}\int d^3x\left(\beta V(x)\,e^{-\beta V(x)} - 1 + e^{-\beta V(x)}\right)\right] \tag{7.5.15}$$

Since the number of pairs of particles is $\approx\frac{N^2}{2}$, the correction to the internal energy is easily understood as an average of the potential energy. Also the first set of terms in the entropy reproduces the Sackur-Tetrode formula (7.2.25) for an ideal gas.

7.5: Equation of State is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
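The integral for $B_2$ in Equation 7.5.11 is easy to do numerically for any assumed pair potential. The sketch below (added illustration; the hard-sphere radius is an assumed value) checks that the numerical integral reproduces the exact hard-sphere result $2\pi r_0^3/3$.

```python
import numpy as np
from scipy.integrate import quad

# B2 = (1/2) \int d^3x (1 - e^{-beta V(x)}) = 2 pi \int_0^inf dr r^2 (1 - e^{-beta V(r)})
def B2(V, beta, r_max=5e-9, r_break=None):
    integrand = lambda r: r**2 * (1.0 - np.exp(-beta*V(r)))
    pts = [r_break] if r_break else None
    val, _ = quad(integrand, 0.0, r_max, points=pts, limit=200)
    return 2.0*np.pi*val

k, T = 1.380649e-23, 300.0
r0 = 3e-10                                   # assumed hard-sphere radius, m
hard_sphere = lambda r: np.inf if r < r0 else 0.0

print(B2(hard_sphere, 1.0/(k*T), r_break=r0))  # numerical value
print(2*np.pi*r0**3/3)                         # exact hard-sphere result
```

The same routine can be reused with a more realistic assumed potential (for example a Lennard-Jones form) to obtain a temperature-dependent $B_2$.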


7.6: Fluctuations

We will now calculate the fluctuations in the values of energy and the number of particles as given by the canonical and grand canonical ensembles. First consider the particle number $N$. From the definition, we have

$$\frac{1}{\beta}\frac{\partial Z}{\partial\mu} = Z\langle N\rangle = Z N, \qquad \frac{1}{\beta^2}\frac{\partial^2 Z}{\partial\mu^2} = Z\langle N^2\rangle \tag{7.6.1}$$

If we calculate $N$ from the partition function as a function of $\beta$ and $\mu$, we can differentiate it with respect to $\mu$ to get

$$\frac{1}{\beta}\frac{\partial N}{\partial\mu} = -\frac{1}{Z^2\beta^2}\left(\frac{\partial Z}{\partial\mu}\right)^2 + \frac{1}{\beta^2}\frac{1}{Z}\frac{\partial^2 Z}{\partial\mu^2} = \langle N^2\rangle - \langle N\rangle^2 = \Delta N^2 \tag{7.6.2}$$

The Gibbs free energy is given by $G = \mu N$ and it also obeys

$$dG = -S\,dT + V\,dp + \mu\,dN \tag{7.6.3}$$

These follow from Equation 5.1.16 and Equation 5.1.10. Since $T$ is fixed in the differentiations we are considering, this gives

$$\frac{\partial\mu}{\partial N} = \frac{V}{N}\frac{\partial p}{\partial N} \tag{7.6.4}$$

The equation of state gives $p$ as a function of the number density $\rho \equiv \frac{N}{V}$, at fixed temperature. Thus

$$\frac{\partial p}{\partial N} = \frac{1}{V}\frac{\partial p}{\partial\rho} \tag{7.6.5}$$

Using this in Equation 7.6.4, we get

$$\frac{1}{\beta}\frac{\partial N}{\partial\mu} = \frac{N\,kT}{\partial p/\partial\rho} \tag{7.6.6}$$

From Equation 7.6.2, we now see that the mean square fluctuation in the number is given by

$$\frac{\Delta N^2}{N^2} = \frac{1}{N}\frac{kT}{\partial p/\partial\rho} \tag{7.6.7}$$

This goes to zero as $N$ becomes large, in the thermodynamic limit. An exception could occur if $\left(\frac{\partial p}{\partial\rho}\right)$ becomes very small. This can happen at a second order phase transition point. The result is that fluctuations in numbers become very large at the transition. The theoretical treatment of such a situation needs more specialized techniques.

We now turn to energy fluctuations in the canonical ensemble. For this we consider $N$ to be fixed and write

$$\frac{\partial U}{\partial\beta} = \frac{\partial}{\partial\beta}\left[-\frac{1}{Q_N}\frac{\partial Q_N}{\partial\beta}\right] = \left[\frac{1}{Q_N}\frac{\partial Q_N}{\partial\beta}\right]^2 - \frac{1}{Q_N}\frac{\partial^2 Q_N}{\partial\beta^2} = \langle H\rangle^2 - \langle H^2\rangle \equiv -\Delta U^2 \tag{7.6.8}$$

The derivative of $U = \langle H\rangle$ with respect to $T$ gives the specific heat, so we find

$$\Delta U^2 = kC_v T^2, \qquad \frac{\Delta U^2}{U^2} = \frac{kC_v T^2}{U^2} \sim \frac{1}{N} \tag{7.6.9}$$

Once again, the fluctuations are small compared to the average value as $N$ becomes large.

7.6: Fluctuations is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
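Equation 7.6.7 is easy to see in a toy simulation (an added illustration, not part of the original text): for an ideal gas $\partial p/\partial\rho = kT$, so $\Delta N^2/\langle N\rangle^2 = 1/\langle N\rangle$, i.e. the count in a subvolume fluctuates like a Poisson variable. The sketch below scatters particles uniformly in a box and measures the variance of the count in a subvolume.

```python
import numpy as np

# Ideal-gas number fluctuations: for a subvolume of a large reservoir,
# Eq. 7.6.7 predicts Var(N_sub) ~ <N_sub>.
rng = np.random.default_rng(1)
N_total, trials = 10_000, 2_000
fraction = 0.1                     # subvolume is 10% of the box (assumed)

counts = rng.binomial(N_total, fraction, size=trials)   # particles landing in the subvolume
mean, var = counts.mean(), counts.var()
print(f"<N_sub> = {mean:.1f},  Var(N_sub) = {var:.1f},  ratio = {var/mean:.3f}")
# With a fixed total N the ratio is (1 - fraction); it approaches 1, the grand
# canonical (Poisson) value, as the subvolume becomes a small part of the reservoir.
```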


7.7: Internal Degrees of FreedomMany particles, such as atoms, molecules have internal degrees of freedom. This can be due to atomic energy levels, due tovibrational and rotational states for molecules, etc. Very often one has to consider mixtures of particles where they can be indifferent internal states as well. For example, in a sodium lamp where the atoms are at a high temperature, some of the atoms are inan electronic excited state while some are in the ground state. Of course, the translational degrees of freedom are important as well.In principle, at the classical level, the internal dynamics has its own phase space and by including it in the integration measure forthe partition function, we can have a purely classical statistical mechanics of such systems. However, this can be grosslyinadequate. Even though in many situations, the translational degrees of freedom can be treated classically, it is necessary to takeaccount of the discreteness of states and energy levels for the internal degrees of freedom. The question is: How do we do this?

The simplest strategy is to consider the particles in different internal states as different species of particles. For example, consider agas of, say, argon atoms which can be in the ground state (call them A) and in an excited state (call them A∗ ). The partitionfunction would thus be

If we ignore interatomic forces, considering a gas of free particles, , so that

The single particle partition function which we have used so far is of the for

This is no longer good enough since and have difference in energies for the internal degrees of freedom and this is not

reflected in using just . Because of the equivalence of mass and energy, this means that and are different. From therelativistic formula

we see that this difference is taken account of if we include the rest energy in . (Here is the speed of light in vacuum.) Thus weshould use the modified formula

even for nonrelativistic calculations. If there are degenerate internal states, the mass would be the same, so in $\log Z$ we could get a multiplicity factor. To see how this arises, consider a gas of particles each of which can have $g$ internal states of the same energy (or mass). Such a situation is realized, for example, by a particle of spin $s$ with $g = 2s+1$. If we treat each internal state as a separate species of particle, the partition function would be

$$Z = \sum_{\{N_i\}} z_1^{N_1}\, z_2^{N_2} \cdots z_g^{N_g}\, Q_{N_1, N_2, \ldots, N_g} \tag{7.7.6}$$

Each of the $z_i$ (equivalently, each chemical potential $\mu_i$) will specify the average number in the distribution for each internal state. However, in many cases we do not specify the numbers for each state; only the total average number is macroscopically fixed for the gas. Thus all the $z_i$ are the same, giving a single factor $z^N$, $N = \sum_i N_i$, in Equation 7.7.6. Correspondingly, if we have no interparticle interactions, we get

$$Z = \exp\left(z \sum_i Q_i\right) = \exp\left(z\, g\, Q_1\right) \tag{7.7.7}$$

where we used the fact that all the masses are the same. We see the degeneracy factor $g$ explicitly.
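To get a feel for the numbers, here is a minimal sketch (plain Python, standard library only) of how the internal-state machinery plays out for a hypothetical two-level atom: since each internal state enters through its own factor $e^{-\beta m c^2}$ in Equation 7.7.5, the relative population of two internal states of the same atom reduces to a degeneracy ratio times a Boltzmann factor. The excitation energy, temperature and degeneracies below are illustrative values, not taken from the text.

```python
import math

k_B = 1.380649e-23       # J/K
eV  = 1.602176634e-19    # J

def population_ratio(delta_E_eV, g_ground, g_excited, T):
    """Ratio n_excited / n_ground for two internal states of one atom.

    From Equation 7.7.5 each internal state contributes z * g * Q_1 with
    Q_1 proportional to exp(-beta m c^2); the mass difference is just the
    internal excitation energy, so the ratio is a Boltzmann factor times
    the degeneracy ratio.
    """
    beta = 1.0 / (k_B * T)
    return (g_excited / g_ground) * math.exp(-beta * delta_E_eV * eV)

# Hypothetical numbers: a 2.1 eV excitation, degeneracies 2 and 6, T = 2500 K.
print(population_ratio(2.1, 2, 6, 2500.0))   # ~ 1.8e-4
```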


7.8: Examples

7.8.1: Osmotic Pressure

Figure 7.8.1: Illustrating the set-up for calculating osmotic pressure

An example of the use of the idea of the partition function in a very simple way is provided by the osmotic pressure. Here one considers a vessel partitioned into two regions, say, I and II, with a solvent (labeled A) on one side and a solution of the solvent plus a solute (labeled B) on the other side. The separation is via a semipermeable membrane which allows the solvent molecules to pass through either way, but does not allow the solute molecules to pass through. Thus the solute molecules stay in region II, as shown in Fig. 7.8.1. When such a situation is set up, the solvent molecules pass back and forth and eventually achieve equilibrium with the average number of solvent molecules on each side not changing any further. What is observed is that the pressure $p_{II}$ in the solution is higher than the pressure $p_I$ in the solvent in region I. Once equilibrium is achieved, there is no further change of volume or temperature either, so we can write the equilibrium condition for the solvent as

$$\mu_A^{(I)} = \mu_A^{(II)} \tag{7.8.1}$$

Correspondingly, we have $z_A^{(I)} = z_A^{(II)}$ for the fugacities. The partition function has the form

$$Z = \sum_{N_A, N_B} z_A^{N_A}\, z_B^{N_B}\, Q_{N_A, N_B} = \sum_{N_A} z_A^{N_A}\, Q_{N_A, 0} + \sum_{N_A} z_A^{N_A}\, z_B\, Q_{N_A, 1} + \cdots = Z_A \left( 1 + \frac{1}{Z_A} \sum_{N_A} z_A^{N_A}\, z_B\, Q_{N_A, 1} + \cdots \right) \tag{7.8.2}$$

Here $Z_A$ is the partition function for just the solvent. For simplicity, let us take the volumes of the two regions to be the same. Then we may write

$$\log Z_A = \frac{p_I\, V}{kT} \tag{7.8.3}$$

even though this occurs in the formula for the full partition function in region II, since $z_A$ is the same for regions I and II. Going back to Equation 7.8.2, we expand $\log Z$ in powers of $z_B$, keeping only the lowest order term, which is adequate for dilute solutions. Thus

$$\log Z \approx \log Z_A + z_B\, \frac{1}{Z_A} \sum_{N_A} z_A^{N_A}\, Q_{N_A, 1} \tag{7.8.4}$$

The derivative of the partition function with respect to $\mu_B$ is related to $N_B$, as in Equation 7.4.16, so that

$$z_B\, \frac{1}{Z_A} \sum_{N_A} z_A^{N_A}\, Q_{N_A, 1} = N_B \tag{7.8.5}$$

Further, $\log Z$ is given by $\frac{p_{II}\, V}{kT}$. Using these results, Equation 7.8.4 gives

$$p_{II} = p_I + kT\, \frac{N_B}{V} = p_I + kT\, n_B \tag{7.8.6}$$


where $n_B = N_B/V$ is the number density of the solute. The pressure difference $p_{II} - p_I$ is called the osmotic pressure.
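Equation 7.8.6 is easy to evaluate numerically. The sketch below (plain Python; the solute amount and temperature are illustrative choices, not values from the text) shows that even a modest solute concentration produces a sizable osmotic pressure.

```python
k_B = 1.380649e-23      # J/K
N_A = 6.02214076e23     # Avogadro number

# Illustrative case: 1 mol of solute dissolved in 1 litre of solvent at 300 K.
T   = 300.0             # K
n_B = N_A / 1.0e-3      # solute number density in m^-3

osmotic_pressure = n_B * k_B * T     # p_II - p_I from Equation 7.8.6, in Pa
print(osmotic_pressure / 1.0e5)      # ~ 25 bar
```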

7.8.2: Equilibrium of a Chemical Reaction

Here we consider a general chemical reaction of the form

$$\nu_A\, A + \nu_B\, B \rightleftharpoons \nu_C\, C + \nu_D\, D \tag{7.8.7}$$

If the substances $A$, $B$, $C$, $D$ can be approximated as ideal gases, the partition function is given by

$$\log Z = z_A Q_{1A} + z_B Q_{1B} + z_C Q_{1C} + z_D Q_{1D} \tag{7.8.8}$$

For the individual chemical potentials, we can use the general formula in 7.5.10 but with the correction due to the factor $e^{-\beta m c^2}$ as in Equation 7.7.5, since we have different species of particles here. Thus

$$\beta\mu = \log n + \beta m c^2 - \log\left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} \tag{7.8.9}$$

The condition of equilibrium of the reaction is given as $\nu_A \mu_A + \nu_B \mu_B - \nu_C \mu_C - \nu_D \mu_D = 0$. Using Equation 7.8.9, this becomes

$$\log\left(\frac{n_A^{\nu_A}\, n_B^{\nu_B}}{n_C^{\nu_C}\, n_D^{\nu_D}}\right) = -\beta c^2 \left(\nu_A m_A + \nu_B m_B - \nu_C m_C - \nu_D m_D\right) + \nu_A f_A + \nu_B f_B - \nu_C f_C - \nu_D f_D, \qquad f = \log\left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} \tag{7.8.10}$$

The total pressure of the mixture of the substances is given from $pV = kT \log Z$ as

$$p = (n_A + n_B + n_C + n_D)\, kT \tag{7.8.11}$$

So if we define the concentrations

$$x_i = \frac{n_i}{\sum_j n_j}, \qquad i = A, B, C, D \tag{7.8.12}$$

then we can rewrite Equation 7.8.10 as

$$\frac{x_A^{\nu_A}\, x_B^{\nu_B}}{x_C^{\nu_C}\, x_D^{\nu_D}}\; p^{\,\nu_A + \nu_B - \nu_C - \nu_D} = \exp\left(-\beta\epsilon + \nu_A \phi_A + \nu_B \phi_B - \nu_C \phi_C - \nu_D \phi_D\right) \equiv K, \qquad \epsilon = \left(\nu_A m_A + \nu_B m_B - \nu_C m_C - \nu_D m_D\right) c^2, \qquad \phi = f + \log kT \tag{7.8.13}$$

With our interpretation of the masses as rest energy, we see that $\epsilon$ is the heat of reaction, i.e., the total energy released by the reaction. $\epsilon$ is positive for an exothermic reaction and negative for an endothermic reaction. $K$, in Equation 7.8.13, is known as the reaction constant and is a function only of the temperature (and the masses of the molecules involved, but these are fixed once a reaction is chosen). The condition in Equation 7.8.13 on the concentrations of the reactants is called the law of mass action.

7.8.3: Ionization Equilibrium

Another interesting example is provided by ionization equilibrium, which is of interest in plasmas and in astrophysical contexts. Consider the ionization reaction of an atom $X$

$$X \rightleftharpoons X^+ + e^- \tag{7.8.14}$$

There is a certain amount of energy $\epsilon_I$ needed to ionize the atom $X$. Treating the particles involved as different species with possible internal states, we can use Equation 7.7.7 to write

$$\log Z = z_X\, g_X\, Q_{1X} + z_e\, g_e\, Q_{1e} + z_{X^+}\, g_{X^+}\, Q_{1X^+} \tag{7.8.15}$$

By differentiation with respect to $\mu$, we find, for each species,

$$z = \frac{n}{g}\, e^{\beta m c^2} \left(\frac{2\pi\hbar^2}{mkT}\right)^{3/2} \tag{7.8.16}$$


The condition for equilibrium is $\mu_X - \mu_{X^+} - \mu_e = 0$, which is the same as $\frac{z_X}{z_{X^+}\, z_e} = 1$. Using Equation 7.8.16, this becomes

$$1 = \frac{n_X}{g_X}\, \frac{g_{X^+}}{n_{X^+}}\, \frac{g_e}{n_e}\; e^{\beta\left(m_X - m_{X^+} - m_e\right)c^2} \left(\frac{m_{X^+}\, m_e}{m_X}\right)^{3/2} \left(\frac{kT}{2\pi\hbar^2}\right)^{3/2} \tag{7.8.17}$$

The mass of the atom $X$ is almost equal to the mass of $X^+$ and the electron; the difference is the binding energy of the electron in $X$. This is the ionization energy $\epsilon_I$, $\left(m_X - m_{X^+} - m_e\right)c^2 = -\epsilon_I$. Using this, Equation 7.8.17 can be rewritten as

$$\frac{n_{X^+}\, n_e}{n_X} = \frac{g_{X^+}\, g_e}{g_X}\; e^{-\beta\epsilon_I} \left(\frac{m_{X^+}\, m_e}{m_X}\right)^{3/2} \left(\frac{kT}{2\pi\hbar^2}\right)^{3/2} \tag{7.8.18}$$

This is known as Saha's equation for ionization equilibrium. It relates the number density of the ionized atom to that of the neutral atom. (While the mass difference is important for the exponent, it is generally a good approximation to take $m_{X^+}/m_X \approx 1$ in the prefactor, so that $\left(\frac{m_{X^+}\, m_e}{m_X}\right)^{3/2} \approx m_e^{3/2}$. So it is often omitted.) The degeneracy for the electron states, namely $g_e$, is due to the spin degrees of freedom, so $g_e = 2$. The degeneracies $g_X$ and $g_{X^+}$ will depend on the atom and the energy levels involved.

The number densities can be related to the pressure by the equation of state; they are also important in determining the intensities of spectral lines. Thus by observation of spectral lines from the photospheres of stars, one can estimate the pressures involved.
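Saha's equation can be turned into a number quite easily. The sketch below (plain Python) applies Equation 7.8.18 to hydrogen, taking $m_{X^+} m_e / m_X \approx m_e$ and lumping the degeneracies into a single ratio set to 1 for simplicity; the total density used is an illustrative, photosphere-like value, not a number quoted in the text. Writing $n_e = n_{X^+} = x\, n_{\text{tot}}$ and $n_X = (1-x)\, n_{\text{tot}}$ turns Equation 7.8.18 into a quadratic for the ionization fraction $x$.

```python
import math

k_B  = 1.380649e-23      # J/K
hbar = 1.054571817e-34   # J s
m_e  = 9.1093837e-31     # kg
eV   = 1.602176634e-19   # J

def saha_rhs(T, eps_I_eV=13.6, g_ratio=1.0):
    """Right-hand side of Equation 7.8.18, i.e. n_+ n_e / n_X in m^-3."""
    return (g_ratio * (m_e * k_B * T / (2.0 * math.pi * hbar**2))**1.5
            * math.exp(-eps_I_eV * eV / (k_B * T)))

def ionization_fraction(T, n_total):
    """Solve x^2/(1-x) = S/n_total for the ionization fraction x."""
    S = saha_rhs(T) / n_total
    return 0.5 * (-S + math.sqrt(S * S + 4.0 * S))

# Illustrative density of 1e20 atoms/m^3; hydrogen is roughly half-ionized near 1e4 K.
for T in (5000.0, 10000.0, 20000.0):
    print(T, ionization_fraction(T, 1.0e20))
```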

7.8: Examples is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


CHAPTER OVERVIEW

8: Quantum Statistical Mechanics

One of the important lessons of quantum mechanics is that there is no a priori meaning to qualities of any system, no independent reality, aside from what can be defined operationally in terms of observations. Thus we cannot speak of this electron (or photon, or any other particle) versus that electron (or photon, or any other particle). We can only say that there is one particle with a certain set of values for observables and there is another, perhaps with a different set of values for observables. This basic identity of particles affects the counting of states and hence leads to distributions different from the Maxwell-Boltzmann distribution we have discussed. This is the essential refinement due to quantum statistics.

8.1: Prelude to Quantum Statistical Mechanics
8.2: Bose-Einstein Distribution
8.3: Fermi-Dirac Distribution
8.4: Applications of the Bose-Einstein Distribution
8.5: Applications of the Fermi-Dirac Distribution

8: Quantum Statistical Mechanics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


8.1: Prelude to Quantum Statistical Mechanics

There are two kinds of particles from the point of view of statistics, bosons and fermions. The corresponding statistical distributions are called the Bose-Einstein distribution and the Fermi-Dirac distribution. Bosons have the property that one can have any number of particles in a given quantum state, while fermions obey the Pauli exclusion principle which allows a maximum of only one particle per quantum state. Any species of particles can be put into one of these two categories. The natural question is, of course, how do we know which category a given species belongs to; is there an a priori way to know this? The answer to this question is yes, and it constitutes one of the deep theorems in quantum field theory, the so-called spin-statistics theorem. The essence of the theorem, although this is not the precise statement, is given as follows.

Theorem 8.1.1: Spin-Statistics Theorem

Identical particles with integer values for spin (or intrinsic angular momentum) are bosons; they obey Bose-Einstein statistics. Identical particles with spin (or intrinsic angular momentum) equal to $\left(n + \frac{1}{2}\right)\hbar$, where $n$ is an integer, obey the Pauli exclusion principle and hence they are fermions and obey Fermi-Dirac statistics.

Thus, among familiar examples of particles, photons (which have spin = 1), phonons (the quantized version of elastic vibrations in a solid), atoms of $^4\text{He}$, etc. are bosons, while the electron (spin = $\frac{1}{2}$), proton (also spin $\frac{1}{2}$), atoms of $^3\text{He}$, etc. are fermions. In all cases, particles of the same species are identical.

8.1: Prelude to Quantum Statistical Mechanics is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V.Parameswaran Nair.


8.2: Bose-Einstein Distribution

We will now consider the derivation of the distribution function for free bosons, carrying out the counting of states along the lines of what we did for the Maxwell-Boltzmann distribution. Let us start by considering how we obtained the binomial distribution. We considered a number of particles and how they can be distributed among, say, $K$ boxes. As the simplest case of this, consider two particles and two boxes. The ways in which we can distribute them are: both particles in the first box, both in the second box, or one in each. The boxes may correspond to different values of momenta, say $\vec{p}_1$ and $\vec{p}_2$, which have the same energy. There is one way to get the first or last arrangement, with both particles in one box; this corresponds to

$$W(2,0) = \frac{2!}{2!\,0!} = \frac{2!}{0!\,2!} = W(0,2) \tag{8.2.1}$$

There are two ways to get the arrangement of one particle in each box, corresponding to

$$W(1,1) = \frac{2!}{1!\,1!} \tag{8.2.2}$$

The generalization of this counting is what led us to the Maxwell-Boltzmann statistics. But if particles are identical, with no intrinsic attributes distinguishing the particles we have labeled 1 and 2, this result is incorrect. The possible arrangements are just: both particles in the first box, both in the second box, or one in each box. Counting the arrangement of one particle per box twice is incorrect; the correct counting should give $W(2,0) = W(0,2) = 1$ and $W(1,1) = 1$, giving a total of 3 distinct arrangements or states. To generalize this, we first consider $n$ particles to be distributed in 2 boxes. The possible arrangements are $n$ in box 1, 0 in box 2; $n-1$ in box 1, 1 in box 2; $\cdots$; 0 in box 1, $n$ in box 2, giving $n+1$ distinct arrangements or states in all. If we have $n$ particles and 3 boxes, we can take $n-k$ particles in the first two boxes (with $n-k+1$ possible states) and $k$ particles in the third box. But $k$ can be anything from zero to $n$, so that the total number of states is

$$\sum_{k=0}^{n} (n-k+1) = \frac{(n+2)(n+1)}{2} = \frac{(n+3-1)!}{n!\,(3-1)!} \tag{8.2.3}$$

We may arrive at this in another way as well. Represent the particles as dots and use $3-1 = 2$ partitions to separate them into three groups. Any arrangement then looks like

$$\bullet \ \bullet \ \big| \ \bullet \ \big| \ \bullet \ \bullet$$

Clearly any permutation of the $n+2$ entities (dots or partitions) gives an acceptable arrangement. There are $(n+2)!$ such permutations. However, the permutations of the two partitions do not change the arrangement, neither do the permutations of the dots among themselves. Thus, the number of distinct arrangements or states is

$$\frac{(n+2)!}{n!\,2!} = \frac{(n+2)(n+1)}{2} \tag{8.2.4}$$

Generalizing this argument, for $g$ boxes and $n$ particles, the number of distinct arrangements is given by

$$W = \frac{(n+g-1)!}{n!\,(g-1)!} \tag{8.2.5}$$
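The counting formula 8.2.5 is easy to check by brute force: enumerate all occupation-number patterns of $g$ boxes that add up to $n$ and compare with $(n+g-1)!/(n!\,(g-1)!)$. A minimal sketch in Python:

```python
from itertools import product
from math import comb

def bose_count_brute(n, g):
    """Count occupation patterns (n_1,...,n_g) of identical bosons summing to n."""
    return sum(1 for occ in product(range(n + 1), repeat=g) if sum(occ) == n)

def bose_count_formula(n, g):
    """Equation 8.2.5: W = (n+g-1)! / (n! (g-1)!)."""
    return comb(n + g - 1, n)

for n, g in [(2, 2), (5, 3), (4, 4)]:
    assert bose_count_brute(n, g) == bose_count_formula(n, g)
    print(n, g, bose_count_formula(n, g))   # (2,2) gives the 3 arrangements above
```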

We now consider $N$ particles, with $n_1$ of them having energy $\epsilon_1$ each, $n_2$ of them with energy $\epsilon_2$ each, etc. Further, let $g_1$ be the number of states with energy $\epsilon_1$, $g_2$ the number of states with energy $\epsilon_2$, etc. The degeneracy $g_\alpha$ may be due to different values of momentum (e.g., different directions of $\vec{p}$ with the same energy $\epsilon = \frac{p^2}{2m}$), or other quantum numbers, such as spin. The total number of distinct arrangements for this configuration is

$$W(\{n_\alpha\}) = \prod_\alpha \frac{(n_\alpha + g_\alpha - 1)!}{n_\alpha!\,(g_\alpha - 1)!} \tag{8.2.6}$$


The corresponding entropy is $S = k \log W$ and we must maximize this subject to $\sum_\alpha n_\alpha = N$ and $\sum_\alpha n_\alpha \epsilon_\alpha = U$. With the Stirling formula for the factorials, the function to be extremized is thus

$$\frac{S}{k} - \beta U + \beta\mu N = \sum_\alpha \Big[ (n_\alpha + g_\alpha - 1)\log(n_\alpha + g_\alpha - 1) - (n_\alpha + g_\alpha - 1) - n_\alpha \log n_\alpha + n_\alpha - \beta \epsilon_\alpha n_\alpha + \beta\mu\, n_\alpha \Big] + \text{terms independent of } n_\alpha \tag{8.2.7}$$

The extremization condition is

$$\log\left[\frac{n_\alpha + g_\alpha - 1}{n_\alpha}\right] = \beta(\epsilon_\alpha - \mu) \tag{8.2.8}$$

with the solution

$$n_\alpha = \frac{g_\alpha - 1}{e^{\beta(\epsilon_\alpha - \mu)} - 1} \approx \frac{g_\alpha}{e^{\beta(\epsilon_\alpha - \mu)} - 1} \tag{8.2.9}$$

where we have approximated $g_\alpha - 1 \approx g_\alpha$, since the use of the Stirling formula needs large numbers. As the occupation number per state, we can take the result as

$$n = \frac{1}{e^{\beta(\epsilon - \mu)} - 1} \tag{8.2.10}$$

with the degeneracy factors arising from the summation over states of the same energy. This is the Bose-Einstein distribution. For a free particle, the number of states in terms of the momenta can be taken as the single-particle $dN$, which, from Equation 7.4.1, is

$$dN = \frac{d^3x\, d^3p}{(2\pi\hbar)^3} \tag{8.2.11}$$

Thus the normalization conditions on the Bose-Einstein distribution are

$$\sum \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \frac{1}{e^{\beta(\epsilon - \mu)} - 1} = N, \qquad \sum \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \frac{\epsilon}{e^{\beta(\epsilon - \mu)} - 1} = U \tag{8.2.12}$$

The remaining sum in this formula is over the internal states of the particle, such as spin states. It is also useful to write down the partition function $Z$. Notice that we may write the occupation number in Equation 8.2.10 as

$$n = \frac{1}{\beta} \frac{\partial}{\partial \mu}\left[-\log\left(1 - e^{-\beta(\epsilon - \mu)}\right)\right] \tag{8.2.13}$$

This result holds for each state for which we are calculating $n$. But recall that the total number $N$ should be given by

$$N = \frac{1}{\beta} \frac{\partial \log Z}{\partial \mu} \tag{8.2.14}$$

Therefore, we expect the partition function to be given by

$$\log Z = -\sum \log\left(1 - e^{-\beta(\epsilon - \mu)}\right), \qquad Z = \prod \frac{1}{1 - e^{-\beta(\epsilon - \mu)}} \tag{8.2.15}$$

For each state with fixed quantum numbers, we can write

$$\frac{1}{1 - e^{-\beta(\epsilon - \mu)}} = 1 + e^{-\beta(\epsilon - \mu)} + e^{-2\beta(\epsilon - \mu)} + e^{-3\beta(\epsilon - \mu)} + \cdots = \sum_n e^{-n\beta(\epsilon - \mu)} \tag{8.2.16}$$


This shows that the partition function is the sum over states of possible occupation numbers of the Boltzmann factor $e^{-n\beta(\epsilon - \mu)}$. This is obtained in a clearer way in the full quantum theory where

$$Z = \text{Tr}\left( e^{-\beta(H - \mu N)} \right) \tag{8.2.17}$$

where $H$ and $N$ are the Hamiltonian operator and the number operator, respectively, and $\text{Tr}$ denotes the trace over a complete set of states of the system.
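As a quick consistency check on Equations 8.2.10 and 8.2.16, the sketch below (plain Python) computes the average occupation number directly from the state sum $\sum_n n\, e^{-n\beta(\epsilon-\mu)} / \sum_n e^{-n\beta(\epsilon-\mu)}$, truncated at a large $n_{\max}$, and compares it with $1/(e^{\beta(\epsilon-\mu)}-1)$:

```python
import math

def n_from_state_sum(x, n_max=2000):
    """<n> from the occupation-number sum of Equation 8.2.16, x = beta*(eps - mu) > 0."""
    weights = [math.exp(-n * x) for n in range(n_max + 1)]
    Z = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / Z

def n_bose(x):
    """Equation 8.2.10."""
    return 1.0 / math.expm1(x)

for x in (0.1, 0.5, 2.0):
    print(x, n_from_state_sum(x), n_bose(x))   # the two columns agree
```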

8.2: Bose-Einstein Distribution is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


8.3: Fermi-Dirac Distribution

The counting of distinct arrangements for fermions is even simpler than for the Bose-Einstein case, since each state can have an occupation number of either zero or 1. Thus consider $g$ states with $n$ particles to be distributed among them. There are $n$ states which are singly occupied and these can be chosen in $\frac{g!}{n!\,(g-n)!}$ ways. The total number of distinct arrangements is thus given by

$$W(\{n_\alpha\}) = \prod_\alpha \frac{g_\alpha!}{n_\alpha!\,(g_\alpha - n_\alpha)!} \tag{8.3.1}$$

The function to be maximized to identify the equilibrium distribution is therefore given by

$$\frac{S}{k} - \beta U + \beta\mu N = \sum_\alpha \Big[ -n_\alpha \log n_\alpha - (g_\alpha - n_\alpha)\log(g_\alpha - n_\alpha) - \beta(\epsilon_\alpha - \mu)\, n_\alpha \Big] + \text{constant} \tag{8.3.2}$$

The extremization condition reads

$$\log\left[\frac{g_\alpha - n_\alpha}{n_\alpha}\right] = \beta(\epsilon_\alpha - \mu) \tag{8.3.3}$$

with the solution

$$n_\alpha = \frac{g_\alpha}{e^{\beta(\epsilon_\alpha - \mu)} + 1} \tag{8.3.4}$$

So, for fermions in equilibrium, we can take the occupation number per state to be given by

$$n = \frac{1}{e^{\beta(\epsilon - \mu)} + 1} \tag{8.3.5}$$

with the degeneracy factor arising from summation over states of the same energy. This is the Fermi-Dirac distribution. The normalization conditions are again

$$\sum \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \frac{1}{e^{\beta(\epsilon - \mu)} + 1} = N, \qquad \sum \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \frac{\epsilon}{e^{\beta(\epsilon - \mu)} + 1} = U \tag{8.3.6}$$

As in the case of the Bose-Einstein distribution, we can write down the partition function for free fermions as

$$\log Z = \sum \log\left(1 + e^{-\beta(\epsilon - \mu)}\right), \qquad Z = \prod \left(1 + e^{-\beta(\epsilon - \mu)}\right) \tag{8.3.7}$$

Notice that, here too, the partition function for each state is $\sum_n e^{-n\beta(\epsilon - \mu)}$; it is just that, in the present case, $n$ can only be zero or 1.
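It is instructive to put the Bose-Einstein, Fermi-Dirac and Maxwell-Boltzmann occupation numbers side by side; for $\beta(\epsilon-\mu)$ of order a few, all three essentially coincide, which is why classical statistics works at high temperature and low density. A minimal sketch:

```python
import math

def n_BE(x):   # Equation 8.2.10, x = beta*(epsilon - mu)
    return 1.0 / (math.exp(x) - 1.0)

def n_FD(x):   # Equation 8.3.5
    return 1.0 / (math.exp(x) + 1.0)

def n_MB(x):   # Maxwell-Boltzmann limit
    return math.exp(-x)

for x in (0.5, 1.0, 3.0, 6.0):
    print(f"x={x:4.1f}  BE={n_BE(x):.4f}  FD={n_FD(x):.4f}  MB={n_MB(x):.4f}")
```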

8.3: Fermi-Dirac Distribution is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


8.4: Applications of the Bose-Einstein Distribution

We shall now consider some simple applications of quantum statistics, focusing in this section on the Bose-Einstein distribution.

8.3.1: The Planck Distribution for Black Body Radiation

Any material body at a finite nonzero temperature emits electromagnetic radiation, or photons in the language of the quantum theory. The detailed features of this radiation will depend on the nature of the source, its atomic composition, emissivity, etc. However, if the source has a sufficiently complex structure, the spectrum of radiation is essentially universal. We want to derive this universal distribution, which is also known as the Planck distribution.

Since a black body absorbs all radiation falling on it, treating all wavelengths the same, a black body may be taken as a perfect absorber. (Black bodies in reality do this only for a small part of the spectrum, but here we are considering the idealized case.) By the same token, black bodies are also perfect emitters and hence the formula for the universal thermal radiation is called the black body radiation formula.

The black body radiation formula was obtained by Max Planck by fitting to the observed spectrum. He also spelled out some of the theoretical assumptions needed to derive such a result and this was, as is well known, the beginning of the quantum theory. Planck's derivation of this formula is fairly simple once certain assumptions, radical for his time, are made; from the modern point of view it is even simpler. Photons are particles of zero rest mass; the energy and momentum of a photon are given as

$$\epsilon = \hbar\omega, \qquad \vec{p} = \hbar\vec{k} \tag{8.4.1}$$

where the frequency $\omega$ of the radiation and the wave number $\vec{k}$ are related to each other in the usual way, $\omega = c|\vec{k}|$. Further, photons are spin-1 particles, so we know that they are bosons. Because they are massless, they have only two polarization states, even though they have spin equal to 1. (For a massive particle we should expect $(2s+1) = 3$ polarization states for a spin-1 particle.) We can apply the Bose-Einstein distribution of Equation 8.2.10 directly, with one caveat. The number of photons is not a well-defined concept. Since long-wavelength photons carry very little energy, the number of photons for a state of given energy could have an ambiguity of a large number of soft or long-wavelength photons. This is also seen more theoretically; there is no conservation law in electromagnetic theory beyond the usual ones of energy and momentum. This means that we should not have a chemical potential which is used to fix the number of photons. Thus the Bose-Einstein distribution simplifies to

$$n = \frac{1}{e^{\beta\epsilon} - 1} \tag{8.4.2}$$

We now consider a box of volume $V$ in which we have photons in thermal equilibrium with material particles such as atoms and molecules. The distribution of the internal energy as a function of momentum is given by

$$dU = 2\, \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \frac{\epsilon}{e^{\beta\epsilon} - 1} \tag{8.4.3}$$

where the factor of 2 is from the two polarization states. Using Equation 8.4.1, for the energy density, we find

$$d^3u = 2\, \frac{d^3k}{(2\pi)^3}\, \frac{\hbar\omega}{e^{\hbar\omega/kT} - 1} \tag{8.4.4}$$

This is Planck's radiation formula. If we use $\omega = c|\vec{k}|$ and carry out the integration over angular directions of $\vec{k}$, it reduces to

$$du = \frac{\hbar}{\pi^2 c^3}\, \frac{\omega^3\, d\omega}{e^{\hbar\omega/kT} - 1} \tag{8.4.5}$$

This distribution function vanishes at $\omega = 0$ and as $\omega \rightarrow \infty$. It peaks at a certain value of $\omega$ which is a function of the temperature. In Fig. 8.3.1, we show the distribution for some sample values of temperature. Note that the value of $\omega$ at the maximum increases with temperature; in addition, the total amount of radiation (corresponding to the area under the curve) also increases with temperature.


Figure 8.3.1: The Planck distribution as a function of frequency for three sample values of temperature, with $T_3 > T_2 > T_1$; units are arbitrary.

If we integrate Equation 8.4.5 over all frequencies, the total energy density comes out to be

$$u = \frac{\pi^2}{15}\, \frac{(kT)^4}{(\hbar c)^3} \tag{8.4.6}$$

where we have used the result

$$\int_0^\infty \frac{x^3}{e^x - 1}\, dx = 3!\,\zeta(4) = \frac{\pi^4}{15} \tag{8.4.7}$$

Rate of Radiation from a Black Body

We can convert the formula for the energy density to the intensity of the radiation by considering the conservation of energy in electrodynamics. The energy density of the electromagnetic field is given by

$$u = \frac{1}{2}\left( E^2 + B^2 \right) \tag{8.4.8}$$

Using the Maxwell equations in free space, we find

$$\frac{\partial u}{\partial t} = E_i \dot{E}_i + B_i \dot{B}_i = c\left[ E_i (\nabla \times \vec{B})_i - B_i (\nabla \times \vec{E})_i \right] = -c\, \nabla \cdot (\vec{E} \times \vec{B}) = -\nabla \cdot \vec{P}, \qquad \vec{P} = c\, (\vec{E} \times \vec{B}) \tag{8.4.9}$$

Integrating over a volume $V$, we find

$$\frac{\partial}{\partial t}\int d^3x\; u = -\oint_{\partial V} \vec{P} \cdot d\vec{S} \tag{8.4.10}$$

Thus the energy flux per unit area, or the intensity, is given by the Poynting vector $\vec{P} = c(\vec{E} \times \vec{B})$. For electromagnetic waves, $|E| = |B|$, $\vec{E}$ and $\vec{B}$ are orthogonal to each other and both are orthogonal to $\vec{k}$, the wave vector which gives the direction of propagation, i.e., the direction of propagation of the photon. In this case we find

$$u = E^2, \qquad \vec{P} = c\, u\, \hat{k} \tag{8.4.11}$$

Using the Planck formula 8.4.4, the magnitude of the intensity of blackbody radiation is given by

$$d^3I = 2c\, \frac{d^3k}{(2\pi)^3}\, \frac{\hbar\omega}{e^{\hbar\omega/kT} - 1} \tag{8.4.12}$$

We have considered radiation in a box of volume $V$ in equilibrium. To get the rate of radiation per unit area of a blackbody, note that, because of equilibrium, the radiation rate from the body must equal the energy flux falling on the area under consideration (which


is all taken to be absorbed since it is a blackbody); thus the emission rate equals the absorption rate as expected for equilibrium. The flux is given by

$$\vec{P} \cdot d\vec{S} = \vec{P} \cdot \hat{n}\, dS = c\, u\, \hat{k} \cdot \hat{n}\, dS = c\, u \cos\theta\, dS \tag{8.4.13}$$

where $\hat{n}$ is the normal to the surface and $\theta$ is the angle between $\hat{k}$ and $\hat{n}$. Further, in the equilibrium situation, there are photons going to and away from the surface under consideration, so we must only consider positive values of $\hat{k} \cdot \hat{n} = \cos\theta$, or $0 \leq \theta \leq \frac{\pi}{2}$. Thus the radiation rate $R$ over all wavelengths per unit area of the emitter is given by

$$R = 2c \int \frac{d^3k}{(2\pi)^3}\, \frac{\hbar\omega \cos\theta}{e^{\hbar\omega/kT} - 1} = 2c \int \frac{k^2\, dk}{4\pi^2} \int_0^{\pi/2} d\theta\, \sin\theta \cos\theta\, \frac{\hbar\omega}{e^{\hbar\omega/kT} - 1} = \frac{\hbar}{4\pi^2 c^2} \int_0^\infty d\omega\, \frac{\omega^3}{e^{\hbar\omega/kT} - 1} = \sigma\, T^4, \qquad \sigma = \frac{\pi^2 k^4}{60\, \hbar^3 c^2} \tag{8.4.14}$$

This result is known as the Stefan-Boltzmann law.
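The Stefan-Boltzmann law is easy to verify numerically: integrate the last form of Equation 8.4.14 with a simple midpoint rule and compare with $\sigma = \pi^2 k^4/(60\hbar^3 c^2) \approx 5.67 \times 10^{-8}\ \text{W m}^{-2}\text{K}^{-4}$. A sketch in plain Python (the temperature used is the solar photospheric value quoted later in this section):

```python
import math

k_B  = 1.380649e-23
hbar = 1.054571817e-34
c    = 2.99792458e8

def radiated_power(T, n_steps=100000, omega_max_factor=50.0):
    """Midpoint-rule integration of R = (hbar / 4 pi^2 c^2) * int w^3/(e^{hbar w/kT}-1) dw."""
    w_max = omega_max_factor * k_B * T / hbar   # integrand is negligible beyond this
    dw = w_max / n_steps
    total = 0.0
    for i in range(1, n_steps + 1):
        w = (i - 0.5) * dw
        total += w**3 / math.expm1(hbar * w / (k_B * T))
    return hbar / (4.0 * math.pi**2 * c**2) * total * dw

sigma = math.pi**2 * k_B**4 / (60.0 * hbar**3 * c**2)
T = 5777.0
print(radiated_power(T), sigma * T**4)   # both ~ 6.3e7 W/m^2
```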

Radiation Pressure

Another interesting result concerning thermal radiation is the pressure of radiation. For this, it is convenient to use one of the relations in Equation 6.2.21, namely,

$$\left(\frac{\partial U}{\partial V}\right)_T = T \left(\frac{\partial p}{\partial T}\right)_V - p \tag{8.4.15}$$

From Equation 8.4.6, we have

$$U = \frac{\pi^2}{15\, (\hbar c)^3}\, V\, k^4 T^4 \tag{8.4.16}$$

Equations 8.4.15 and 8.4.16 immediately lead to

$$p = \frac{\pi^2}{45\, (\hbar c)^3}\, k^4 T^4 = \frac{1}{3}\, u \tag{8.4.17}$$

Radiation pressure is significant and important in astrophysics. Stars can be viewed as a gas or fluid held together by gravity. The gas has pressure and the pressure gradient between the interior of the star and the exterior region tends to create a radial outflow of the material. This is counteracted by gravity which tends to contract or collapse the material. The hydrostatic balance in the star is thus between gravity and pressure gradients. The normal fluid pressure is not adequate to prevent collapse. The radiation produced by nuclear fusion in the interior creates an outward pressure and this is a significant component in the hydrostatic equilibrium of the star. Without this pressure a normal star would rapidly collapse.

Maximum of Planck Distribution

We have seen that the Planck distribution has a maximum at a certain value of $\omega$. It is interesting to consider the wavelength $\lambda_*$ at which the distribution has a maximum. This can be done in terms of frequency or wavelength, but we will use the wavelength here as this is more appropriate for the application we consider later. (The peak for frequency and wavelength occur at different places since these variables are not linearly related, but rather are reciprocally related.) Using $d\omega = -(2\pi c)\, \frac{d\lambda}{\lambda^2}$, we can write down the Planck distribution (Equation 8.4.12) in terms of the wavelength as

$$d\,I = \frac{2(2\pi\hbar)\, c^2}{\lambda^5}\, \frac{1}{\left(e^{\hbar\omega/kT} - 1\right)}\, d\lambda\, d\Omega \tag{8.4.18}$$


(The minus sign in $d\omega$ only serves to show that when the intensity increases with frequency, it should decrease with $\lambda$ and vice versa. So we have dropped the minus sign. $\Omega$ is the solid angle for the angular directions.) Extremization with respect to $\lambda$ gives the condition

$$(x - 5)\, e^x + 5 = 0 \tag{8.4.19}$$

where $x = \beta\hbar\omega = \frac{2\pi\hbar c}{\lambda kT}$. The solution of this transcendental equation is

$$\lambda_* \approx \frac{(2\pi\hbar)\, c}{k}\, \frac{1}{4.96511}\, \frac{1}{T} \tag{8.4.20}$$

This relation is extremely useful in determining the temperature of the outer layer of stars, called the photosphere, from which we receive radiation. By spectroscopically resolving the radiation and working out the distribution as a function of wavelength, we can see where the maximum is, and this gives, via Equation 8.4.20, the temperature of the photosphere. Notice that higher temperatures correspond to smaller wavelengths; thus blue stars are hotter than red stars. For the Sun, the temperature of the photosphere is about $5777\, K$, corresponding to a wavelength $\lambda_* \approx 502\, \text{nm}$. Thus the maximum for radiation from the Sun is in the visible region, around the color green.
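Equation 8.4.19 is a one-line Newton iteration, and the resulting constant $\lambda_* T = 2\pi\hbar c/(4.96511\, k) \approx 2.9 \times 10^{-3}\ \text{m K}$ reproduces both numbers used in this section (the solar peak near 502 nm and, below, the microwave peak of the 2.7 K background). A sketch:

```python
import math

k_B  = 1.380649e-23
hbar = 1.054571817e-34
c    = 2.99792458e8

def wien_x(x=5.0):
    """Solve (x-5) e^x + 5 = 0 (Equation 8.4.19) by Newton iteration."""
    for _ in range(50):
        f  = (x - 5.0) * math.exp(x) + 5.0
        df = (x - 4.0) * math.exp(x)
        x -= f / df
    return x

x_star = wien_x()                                # ~ 4.96511
b = 2.0 * math.pi * hbar * c / (k_B * x_star)    # lambda_* T, ~ 2.90e-3 m K
print(x_star, b)
print(b / 5777.0 * 1e9, "nm (Sun)")   # ~ 502 nm
print(b / 2.725 * 1e3,  "mm (CMB)")   # ~ 1.06 mm
```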

Another case in which the radiation pressure and the energy density we calculated are important is in the early history of the universe. Shortly after the Big Bang, the universe was in a very hot phase with all particles having an average energy so high that their masses could be neglected. The radiation pressure from all these particles, including the photon, is an important ingredient in solving the Einstein equations for gravity to work out how the universe was expanding. As the universe cooled by expansion, the unstable massive particles decayed away, since there was not enough average energy in collisions to sustain the reverse process. Photons continued to dominate the evolution of the universe. This phase of the universe is referred to as the radiation-dominated era.

Later, the universe cooled enough for electrons and nuclei to combine to form neutral atoms, a phase known as the recombination era. Once this happened, since neutral particles couple only weakly (through dipole and higher multipole moments) to radiation, the existing radiation decoupled and continued to cool down independently of matter. This is the matter-dominated era in which we now live. The radiation obeyed the Planck spectrum at the time of recombination, and apart from cooling would continue to do so in the expanding universe. Thus the existence of this background relic radiation is evidence for the Big Bang theory. This cosmic microwave background radiation was predicted to be a consequence of the Big Bang theory, by Gamow, Dicke and others in the 1940s. The temperature was estimated in calculations by Alpher and Herman and by Gamow in the 1940s and 1950s. The radiation was observed by Penzias and Wilson in 1964. The temperature of this background can be measured in the same way as for stars, by comparing the maximum of the distribution with the formula 8.4.20. It is found to be approximately $2.7\, K$. (Actually this has been measured with great accuracy by now, the latest value being $2.72548 \pm 0.00057\, K$.) The corresponding $\lambda_*$ is in the microwave region, which is why this is called the cosmic microwave background.

8.3.2: Bose-Einstein Condensation

We will now work out some features of an ideal gas of bosons with a conserved particle number; in this case, we do have a chemical potential. There are many atoms which are bosons and, if we can neglect the interatomic forces as a first approximation, this discussion can apply to gases made of such atoms. The partition function $Z$ for a gas of bosons was given in Equation 8.2.15. Since $\log Z$ is related to the pressure as in Equation 7.4.19, this gives immediately

$$\frac{pV}{kT} = \log Z = -\int \frac{d^3x\, d^3p}{(2\pi\hbar)^3} \log\left(1 - e^{-\beta(\epsilon - \mu)}\right) = V \left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} \left[ z + \frac{z^2}{2^{5/2}} + \frac{z^3}{3^{5/2}} + \cdots \right] = V \left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} \text{Li}_{5/2}(z) \tag{8.4.21}$$


where $z = e^{\beta\mu}$ is the fugacity and $\text{Li}_s(z)$ denotes the polylogarithm defined by

$$\text{Li}_s(z) = \sum_{n=1}^\infty \frac{z^n}{n^s} \tag{8.4.22}$$

The total number of particles $N$ is given by the normalization condition (Equation 8.2.12) and works out to

$$\frac{N}{V} = \left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} \left[ z + \frac{z^2}{2^{3/2}} + \frac{z^3}{3^{3/2}} + \cdots \right] = \left(\frac{mkT}{2\pi\hbar^2}\right)^{3/2} \text{Li}_{3/2}(z) = \frac{1}{\lambda^3}\, \text{Li}_{3/2}(z) \tag{8.4.23}$$

We have defined the thermal wavelength $\lambda$ by

$$\lambda = \sqrt{\frac{2\pi\hbar^2}{mkT}} \tag{8.4.24}$$

Apart from some numerical factors of order 1, this is the de Broglie wavelength for a particle of energy $kT$.

If we eliminate $z$ in favor of $\frac{N}{V}$ from this equation and use it in Equation 8.4.21, we get the equation of state for the ideal gas of bosons. For high temperatures, this can be done by keeping the terms up to order $z^2$ in the polylogarithms. This gives

$$p = \frac{N}{V}\, kT \left[ 1 - \frac{N}{V}\, \frac{\lambda^3}{2^{5/2}} + \cdots \right] \tag{8.4.25}$$

This equation shows that even the perfect gas of bosons does not follow the classical ideal gas law. In fact, we may read off the second virial coefficient as

$$B_2 = -\frac{\lambda^3}{2^{5/2}} \tag{8.4.26}$$

The thermal wavelength is small for large $T$, so this correction is small at high temperatures, which is why the ideal gas was a good approximation for many of the early experiments in thermal physics. If we compare this with the second virial coefficient of a classical gas with interatomic potential $V(x)$ as given in Equation 7.5.11, namely,

$$B_2 = \frac{1}{2} \int d^3x\, \left(1 - e^{-\beta V(x)}\right) \tag{8.4.27}$$

we see that we can mimic Equation 8.4.26 by an attractive interatomic potential $(V(x) < 0)$. Thus bosons exhibit a tendency to cluster together.

We can now consider what happens when we lower the temperature. It is useful to calculate a typical value of $\lambda$. Putting in the constants,

$$\lambda = \sqrt{\left(\frac{300}{T}\right)\left(\frac{m_p}{m}\right)} \times 6.3 \times 10^{-10}\ \text{meters} \tag{8.4.28}$$

($m_p$ is the mass of the proton $\approx$ the mass of the hydrogen atom.) Thus for hydrogen at room temperature, $\lambda$ is of atomic size. Since $\frac{V}{N}$ is approximately the free volume available to a molecule, we find from Equation 8.4.23 that $z$ must be very small under normal conditions. The function $\text{Li}_{3/2}(z)$ starts from zero at $z = 0$ and rises to about 2.61238 at $z = 1$; see Fig. 8.3.2. Beyond that, even though the function can be defined by analytic continuation, it is imaginary. In fact, there is a branch cut from $z = 1$ to $\infty$. Thus for $z < 1$, we can solve Equation 8.4.23 for $z$ in terms of $N$. As the temperature is lowered, $\frac{1}{\lambda^3}$ decreases and eventually we get to the point where $z = 1$. This happens at a temperature

$$T_c = \frac{1}{k}\left(\frac{N}{2.61238\, V}\right)^{2/3} \frac{2\pi\hbar^2}{m} \tag{8.4.29}$$


If we lower the temperature further, it becomes impossible to satisfy Equation 8.4.23. We can see the problem at $z = 1$ more clearly by considering the partition function, where we separate the contribution due to the zero energy state,

$$Z = \frac{1}{1 - z}\, \prod_{p \neq 0} \frac{1}{1 - z\, e^{-\beta\epsilon_p}} \tag{8.4.30}$$

We see that the partition function has a singularity at $z = 1$. This is indicative of a phase transition. The system avoids the singularity by having a large number of particles making a transition to the state of zero energy and momentum. Recall that the factor $\frac{1}{1-z}$ may be viewed as $\sum_n z^n$, a sum over different possible occupation numbers for the ground state. The idea here is that, instead of various possible occupation numbers for the ground state, what happens below $T_c$ is that there is a certain occupation number for the ground state, say $N_0$, so that the partition function should read

$$Z = z^{N_0} \prod_{p \neq 0} \frac{1}{1 - z\, e^{-\beta\epsilon_p}} \tag{8.4.31}$$

Thus, rather than having different probabilities for the occupation numbers for the ground state, with correspondingly different probabilities as given by the Boltzmann factor, we have a single multiparticle quantum state, with occupation number $N_0$, for the ground state. The normalization condition Equation 8.4.23 is then changed to

$$\frac{N}{V} = \frac{N_0}{V} + \frac{1}{\lambda^3}\, \text{Li}_{3/2}(z) \tag{8.4.32}$$

Below $T_c$, this equation is satisfied with $z = 1$, and with $N_0$ compensating for the second term on the right hand side as $\lambda$ increases. This means that a macroscopically large number of particles have to be in the ground state. This is known as Bose-Einstein condensation. In terms of $T_c$, we can rewrite Equation 8.4.32 as

$$\frac{N_0}{V} = \frac{N}{V}\left[ 1 - \left(\frac{T}{T_c}\right)^{3/2} \right] \tag{8.4.33}$$

which gives the fraction of particles which are in the ground state.

Since $z = 1$ for temperatures below $T_c$, we have $\mu = 0$. This is then reminiscent of the case of photons where we do not have a conserved particle number. The proper treatment of this condensation effect requires quantum field theory, using the concept of spontaneous symmetry breaking. In such a description, it will be seen that the particle number is still a conserved operator but that the condensed state cannot be an eigenstate of the particle number.

Figure 8.3.2: The polylogarithm $\text{Li}_{3/2}(z)$


Figure 8.3.3: Qualitative behavior of the specific heat of a gas of bosons

There are many other properties of the condensation phenomenon we can calculate. Here we will focus on just the specific heat. The internal energy for the gas is given by

$$U = \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \frac{\epsilon}{e^{\beta(\epsilon - \mu)} - 1} = \frac{3}{2}\, kT\, \frac{V}{\lambda^3}\, \text{Li}_{5/2}(z) \tag{8.4.34}$$

At high temperatures, $z$ is small and $\text{Li}_{5/2}(z) \approx z$, and Equation 8.4.23 gives $z = \frac{N}{V}\lambda^3$. Thus $U = \frac{3}{2} NkT$, in agreement with the classical ideal gas. This gives $C_v = \frac{3}{2} Nk$.

For low temperatures below $T_c$, $z = 1$ and we can set $\text{Li}_{5/2}(z) = \text{Li}_{5/2}(1) \approx 1.3415$. The specific heat becomes

$$C_v = \frac{15}{4}\, \frac{V k}{\lambda^3}\, \text{Li}_{5/2}(1) = \frac{15}{4}\, \frac{\text{Li}_{5/2}(1)}{\text{Li}_{3/2}(1)}\, N k \left(\frac{T}{T_c}\right)^{3/2} \approx 1.926\, N k \left(\frac{T}{T_c}\right)^{3/2} \tag{8.4.35}$$

We see that the specific heat goes to zero at absolute zero, in agreement with the third law of thermodynamics. It rises to a value which is somewhat above $\frac{3}{2} Nk$ at $T = T_c$. Above $T_c$, we must solve for $z$ in terms of $N$ and substitute back into the formula for $U$. But qualitatively, we can see that the specific heat has to decrease for $T > T_c$, reaching the ideal gas value of $\frac{3}{2} Nk$ at very high temperatures. A plot of the specific heat is shown in Fig. 8.3.3.

There are many examples of Bose-Einstein condensation by now. The formula for the thermal wavelength in Equation 8.4.28 shows that smaller atomic masses will have larger $\lambda$ and one may expect them to undergo condensation at higher temperatures. While molecular hydrogen (which is a boson) may seem to be the best candidate, it turns into a solid at around $14\, K$. The best candidate is thus liquid Helium. The atoms of the isotope $^4\text{He}$ are bosons. Helium becomes a liquid below $4.2\, K$ and it has a density of about $125\, \text{kg/m}^3$ (under normal atmospheric pressure), and if this value is used in the formula 8.4.29, we find $T_c$ to be about $3\, K$. What is remarkable is that liquid Helium undergoes a phase change at $2.17\, K$. Below this temperature, it becomes a superfluid, exhibiting essentially zero viscosity. ($^3\text{He}$ atoms are fermions; there is superfluidity here too, at a much lower temperature, and the mechanism is very different.) This transition can be considered as an example of Bose-Einstein condensation. Helium is not an ideal gas of bosons; interatomic forces (particularly a short-range repulsion) are important and this may explain the discrepancy in the value of $T_c$. The specific heat of liquid $^4\text{He}$ is shown in Fig. 8.3.4. There is a clear transition point, with the specific heat showing a discontinuity in addition to the peaking at this point. Because of the similarity of the graph to the Greek letter $\lambda$, this is often referred to as the $\lambda$-transition. The graph is very similar, in a broad qualitative sense, to the behavior we found for Bose-Einstein condensation in Fig. 8.3.3; however, the Bose-Einstein condensation in the noninteracting gas is a first-order transition, while the $\lambda$-transition is a second-order transition, so there are differences with the Bose-Einstein condensation of a perfect gas of bosons.


Figure 8.3.4: Specific heat of liquid Helium near the $\lambda$-transition

The treatment of superfluid Helium along the lines we have used for a perfect gas is very inadequate. A more sophisticated treatment has to take account of interatomic forces and incorporate the idea of spontaneous symmetry breaking. By now, there is a fairly comprehensive theory of liquid Helium.

Recently, Bose-Einstein condensation has been achieved in many other atomic systems such as a gas of $^{87}\text{Rb}$ atoms, $^{23}\text{Na}$ atoms, and a number of others, mostly alkaline and alkaline earth elements.

8.3.3: Specific Heats of Solids

We now turn to the specific heats of solids, along the lines of work done by Einstein and Debye. In a solid, atoms are not free to move around and hence we do not have the usual translational degrees of freedom. Hence the natural question which arises is: When a solid is heated, what are the degrees of freedom in which the energy which is supplied can be stored? As a first approximation, atoms in a solid may be taken to be at the sites of a regular lattice. Interatomic forces keep each atom at its site, but some oscillation around the lattice site is possible. This is the dynamics behind the elasticity of the material. These oscillations, called lattice vibrations, constitute the degrees of freedom which can be excited by the supplied energy and are thus the primary agents for the specific heat capacity of solids. In a conductor, translational motion of electrons is also possible. There is thus an electronic contribution to the specific heat as well. This will be taken up later; here we concentrate on the contribution from the lattice vibrations. In an amorphous solid, a regular lattice structure is not obtained throughout the solid, but domains with regular structure exist, and so the elastic modes of interest are still present.

Figure 8.3.5: Typical phonon dispersion curves

Turning to the details of the lattice vibrations, for $N$ atoms on a lattice, we expect $3N$ modes, since each atom can oscillate along any of the three dimensions. Since the atoms are like beads on an elastic string, the oscillations can be transferred from one atom to the next and so we get traveling waves. We may characterize these by a frequency $\omega$ and a wave number $\vec{k}$. The dispersion relation between $\omega$ and $\vec{k}$ can be obtained by solving the equations of motion for $N$ coupled particles. There are distinct modes corresponding to different $\omega$-$\vec{k}$ relations; the typical qualitative behavior is shown in Fig. 8.3.5. There are three acoustic modes for which $\omega \approx c_s |\vec{k}|$ for low $|\vec{k}|$, $c_s$ being the speed of sound in the material. The three polarizations correspond to oscillations in the three possible directions. The long-wavelength part of these modes can also be obtained by solving for elastic waves (in terms of the elastic moduli) in the continuum approximation to the lattice. They are basically sound waves, hence the name acoustic modes. The highest value for $|\vec{k}|$ is limited by the fact that we do not really have a continuum; the shortest wavelength is of the order of the lattice spacing.

There are also the so-called optical modes for which $\omega \neq 0$ for any $\vec{k}$. The minimal energy needed to excite these is typically in the range of 30-60 meV or so; in terms of photon energy, this corresponds to the infrared and visible optical frequencies, hence the


name. Since $1\, \text{eV} \approx 10^4\, K$, the optical modes are not important for the specific heat at low temperatures.

Just as electromagnetic waves, upon quantization, can be viewed as particles, the photons, the elastic waves in the solid can be described as particles in the quantum theory. These particles are called phonons and obey the expected energy and momentum relations

$$E = \hbar\omega, \qquad \vec{p} = \hbar\vec{k} \tag{8.4.36}$$

The relation between $\omega$ and $\vec{k}$ may be approximated for the two cases rather well by

$$\omega \approx \begin{cases} c_s |\vec{k}| & \text{(Acoustic)} \\ \omega_0 & \text{(Optical)} \end{cases} \tag{8.4.37}$$

where $\omega_0$ is a constant independent of $\vec{k}$. If there are several optical modes, the corresponding $\omega_0$'s may be different. Here we consider just one for simplicity. The polarizations correspond to the three Cartesian axes and hence they transform as vectors under rotations; i.e., they have spin = 1 and hence are bosons. The thermodynamics of these can now be worked out easily.

First, consider the acoustic modes. The total internal energy due to these modes is

$$U = 3 \int \frac{d^3x\, d^3k}{(2\pi)^3}\, \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1} \tag{8.4.38}$$

The factor of 3 is for the three polarizations. For most of the region of integration which contributes significantly, we are considering modes of wavelengths long compared to the lattice spacing and so we can assume isotropy and carry out the angular integration. For high $k$, the specific crystal structure and anisotropy will matter, but the corresponding $\omega$'s are high and the factor $e^{-\beta\hbar\omega}$ will diminish their contributions to the integral. Thus

$$U = \frac{3\hbar\, V}{2\pi^2 c_s^3} \int_0^{\omega_D} d\omega\, \frac{\omega^3}{e^{\beta\hbar\omega} - 1} \tag{8.4.39}$$

Here $\omega_D$ is the Debye frequency, which is the highest frequency possible for the acoustic modes. The value of this frequency will depend on the solid under consideration. We also define a Debye temperature $T_D$ by $\hbar\omega_D = k T_D$. We then find

$$U = 3 \left(\frac{V}{2\pi^2 \hbar^3 c_s^3}\right) (kT)^4 \int_0^{T_D/T} du\, \frac{u^3}{e^u - 1} \tag{8.4.40}$$

For low temperatures, $\frac{T_D}{T}$ is so large that one can effectively replace it by $\infty$ in a first approximation to the integral. For high $T$, we can expand the integrand in powers of $u$ to carry out the integration. This way we find

$$\int_0^{T_D/T} du\, \frac{u^3}{e^u - 1} = \begin{cases} \dfrac{\pi^4}{15} + O\big(e^{-T_D/T}\big) & T \ll T_D \\[2ex] \dfrac{1}{3}\left(\dfrac{T_D}{T}\right)^3 - \dfrac{1}{8}\left(\dfrac{T_D}{T}\right)^4 + \cdots & T \gg T_D \end{cases} \tag{8.4.41}$$

The internal energy for $T \ll T_D$ is thus

$$U = \left(\frac{V}{2\pi^2 \hbar^3 c_s^3}\right) \frac{\pi^4 (kT)^4}{5} + O\big(e^{-T_D/T}\big) \tag{8.4.42}$$

The specific heat at low temperatures is thus given by

$$C_v \approx \left(\frac{V}{2\pi^2 \hbar^3 c_s^3}\right) \frac{4\pi^4}{5}\, k\, (kT)^3 \tag{8.4.43}$$

We can relate this to the total number of atoms in the material as follows. Recall that the total number of vibrational modes for $N$ atoms is $3N$. Thus

$$3 \int^{\omega_D} \frac{d^3x\, d^3k}{(2\pi)^3} + \text{Total number of optical modes} = 3N \tag{8.4.44}$$

If we ignore the optical modes, we get

$$\frac{V\, \omega_D^3}{2\pi^2 c_s^3} = 3N \tag{8.4.45}$$


This formula will hold even with the optical modes if $N$ is interpreted as the number of unit cells rather than the number of atoms. In terms of $N$, we get, for $T \ll T_D$,

$$U = \frac{3 N k \pi^4}{5}\, \frac{T^4}{T_D^3} + O\big(e^{-T_D/T}\big), \qquad C_v = \frac{12 N k \pi^4}{5}\, \frac{T^3}{T_D^3} + O\big(e^{-T_D/T}\big) \tag{8.4.46}$$

The expression for $C_v$ in Equations 8.4.43 and 8.4.46 is the famous $T^3$ law for specific heats of solids at low temperatures derived by Debye in 1912. There is a universality to it. The derivation relies only on having modes with $\omega \sim k$ at low $k$. There are always three such modes for any elastic solid. These are the sound waves in the solid. (The existence of these modes can also be understood from the point of view of spontaneous symmetry breaking, but that is another matter.) The power 3 is of course related to the fact that we have three spatial dimensions. So any elastic solid will exhibit this behavior for the contribution from the lattice vibrations. As we shall see shortly, the optical modes will not alter this result. Some sample values of the Debye temperature are given in Table 8.3.1. This will give an idea of when the low temperature approximation is applicable.

Table 8.3.1: Some Sample Debye Temperatures

Solid        T_D in K        Solid        T_D in K
Gold         170             Aluminum     428
Silver       215             Iron         470
Platinum     240             Silicon      645
Copper       343.5           Carbon       2230

For $T \gg T_D$, we find

$$U = \frac{V}{2\pi^2 c_s^3 \hbar^3}\left[ kT\, (\hbar\omega_D)^3 - \frac{3}{8} (\hbar\omega_D)^4 + O\left(\frac{T_D}{T}\right) \right] = 3N \left[ kT - \frac{3}{8}\hbar\omega_D + O\left(\frac{T_D}{T}\right) \right] \tag{8.4.47}$$

The specific heat is then given by

$$C_v = k \left(\frac{V}{2\pi^2 c_s^3 \hbar^3}\right) (\hbar\omega_D)^3 + O\left(\frac{1}{T^2}\right) = 3Nk + O\left(\frac{1}{T^2}\right) \tag{8.4.48}$$

Turning to the optical modes, we note that the frequency is almost independent of $k$, for the whole range of $k$. So it is a good approximation to consider just one frequency $\omega_0$ for each optical mode. Let $N_{\text{opt}}$ be the total number of degrees of freedom in the optical mode of frequency $\omega_0$. Then the corresponding internal energy is given by

$$U_{\text{opt}} = N_{\text{opt}}\, \frac{\hbar\omega_0}{e^{\beta\hbar\omega_0} - 1} \tag{8.4.49}$$

The specific heat contribution is given by

$$C_{v,\text{opt}} = N_{\text{opt}}\, k\, \frac{(\beta\hbar\omega_0)^2}{\left(e^{\beta\hbar\omega_0} - 1\right)\left(1 - e^{-\beta\hbar\omega_0}\right)} \approx \begin{cases} N_{\text{opt}}\, k \left[ 1 - \dfrac{1}{12}\left(\dfrac{\hbar\omega_0}{kT}\right)^2 + \cdots \right] & T \gg \hbar\omega_0/k \\[2ex] N_{\text{opt}}\, k \left(\dfrac{\hbar\omega_0}{kT}\right)^2 \exp\left(-\dfrac{\hbar\omega_0}{kT}\right) & T \ll \hbar\omega_0/k \end{cases} \tag{8.4.50}$$
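For the acoustic (Debye) part of the specific heat, differentiating Equation 8.4.40 and using Equation 8.4.45 gives the standard Debye form $C_v = 9Nk\,(T/T_D)^3 \int_0^{T_D/T} x^4 e^x (e^x-1)^{-2}\, dx$; this form is not written out explicitly in the text, but it is equivalent to the results above, interpolating between the $T^3$ law of Equation 8.4.46 and the Dulong-Petit value $3Nk$ of Equation 8.4.48. A numerical sketch:

```python
import math

def debye_cv_per_atom(T, T_D, n_steps=2000):
    """C_v / (3Nk) in the Debye model, via midpoint-rule integration."""
    x_D = T_D / T
    dx = x_D / n_steps
    integral = 0.0
    for i in range(1, n_steps + 1):
        x = (i - 0.5) * dx
        integral += x**4 * math.exp(x) / math.expm1(x)**2
    integral *= dx
    return 3.0 / x_D**3 * integral

T_D = 343.5   # Debye temperature of copper from Table 8.3.1
for T in (10.0, 50.0, 150.0, 300.0, 1000.0):
    print(T, debye_cv_per_atom(T, T_D))   # tiny (T^3 law) at low T, -> 1 at high T
```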


8.4: Applications of the Bose-Einstein Distribution is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V.Parameswaran Nair.


8.5: Applications of the Fermi-Dirac Distribution

We now consider some applications of the Fermi-Dirac distribution (Equation 8.3.5). It is useful to start by examining the behavior of this function as the temperature goes to zero. This is given by

$$n \rightarrow \begin{cases} 1 & \epsilon < \mu \\ 0 & \epsilon > \mu \end{cases} \tag{8.5.1}$$

Thus all states below a certain value, which is the zero-temperature value of the chemical potential, are filled with one fermion each. All states above this value are empty. This is a highly quantum state. The value of $\epsilon$ for the highest filled state is called the Fermi level. Given the behavior in Equation 8.5.1, it is easy to calculate the Fermi level in terms of the number of particles. Let $p_F$ correspond to the magnitude of the momentum of the highest filled level. Then

$$N = g_s \int^{p_F} \frac{d^3x\, d^3p}{(2\pi\hbar)^3} = g_s\, V\, \frac{p_F^3}{6\pi^2 \hbar^3} \tag{8.5.2}$$

where $g_s$ is the number of polarizations for spin, $g_s = 2s+1$. Denoting $\frac{N}{V} = n$, the Fermi level is thus given by

$$\epsilon_F = \frac{p_F^2}{2m} = \frac{\hbar^2}{2m}\left(\frac{6\pi^2 n}{g_s}\right)^{2/3} \tag{8.5.3}$$

The ground state energy is given by

$$U = g_s \int^{p_F} \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \frac{p^2}{2m} = g_s\, V\, \frac{p_F^5}{20\pi^2 m \hbar^3} = \frac{3}{10}\, \frac{\hbar^2}{2m}\left(\frac{6\pi^2}{g_s}\right)^{2/3} \frac{N^{5/3}}{V^{2/3}} \tag{8.5.4}$$

$$\phantom{U} = \frac{3}{5}\, \epsilon_F\, n\, V \tag{8.5.5}$$

The pressure is then easily calculated as

$$p = \frac{\hbar^2}{5m}\left(\frac{6\pi^2}{g_s}\right)^{2/3} n^{5/3} \tag{8.5.6}$$

(Since $\epsilon_F$ depends on $n$, it is easier to use Equation 8.5.4 for this.) The multiparticle state here is said to be highly degenerate as particles try to go to the single quantum state of the lowest energy possible subject to the constraints of the exclusion principle. The pressure from Equation 8.5.6 is referred to as the degeneracy pressure. Since fermions try to exclude each other, it is as if there is some repulsion between them and this is the reason for this pressure. It is entirely quantum mechanical in origin, due to the needed correlation between the electrons. As we will see, it plays an important role in astrophysics.

The Fermi energy determines what temperatures can be considered as high or low. For electrons in a metal, $\epsilon_F$ is of the order of an $\text{eV}$, corresponding to temperatures around $10^4\, K$. Thus, for most of the physics considerations, electrons in a metal are at low temperatures. For atomic gases, the Fermi level is much smaller due to the factor $\frac{1}{m}$ in Equation 8.5.3, and room temperature is high compared to $\epsilon_F$. We will first consider the high temperature case, where we expect small deviations from the classical physics.
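Equation 8.5.3 with $g_s = 2$ reproduces the Fermi levels of Table 8.4.1 below from the conduction-electron density alone. The sketch does this for copper; the electron density used (one conduction electron per atom, $n \approx 8.5 \times 10^{28}\ \text{m}^{-3}$) is a standard textbook value, not a number quoted in this text:

```python
import math

k_B  = 1.380649e-23
hbar = 1.054571817e-34
m_e  = 9.1093837e-31
eV   = 1.602176634e-19

def fermi_energy(n, g_s=2, m=m_e):
    """Equation 8.5.3: eps_F = (hbar^2 / 2m) (6 pi^2 n / g_s)^(2/3)."""
    return hbar**2 / (2.0 * m) * (6.0 * math.pi**2 * n / g_s)**(2.0 / 3.0)

n_Cu = 8.5e28                       # assumed conduction-electron density of copper
eps_F = fermi_energy(n_Cu)
print(eps_F / eV, "eV")             # ~ 7 eV, consistent with Table 8.4.1
print(eps_F / k_B, "K")             # ~ 8e4 K, so room temperature is 'low'
```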

The expression for $N$, given by the normalization condition (Equation 8.3.6), is

$$n = \frac{N}{V} = g_s \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{e^{\beta(\epsilon - \mu)} + 1} = g_s \int \frac{d^3k}{(2\pi)^3}\, \frac{z\, e^{-\beta\epsilon}}{1 + z\, e^{-\beta\epsilon}} = \frac{g_s}{\lambda^3}\, \frac{4}{\sqrt{\pi}} \int_0^\infty du\; u^2 \left[ z\, e^{-u^2} - z^2 e^{-2u^2} + \cdots \right] = \frac{g_s}{\lambda^3}\left( -\text{Li}_{3/2}(-z) \right) \tag{8.5.7}$$


where $\lambda$ is the thermal wavelength, defined as before by $\lambda = \sqrt{\frac{2\pi\hbar^2}{mkT}}$. The partition function $Z$, from Equation 8.3.7, is given by

$$\log Z = g_s \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \log\left(1 + e^{-\beta(\epsilon - \mu)}\right) \tag{8.5.8}$$

This being $\frac{pV}{kT}$, the equation of state is given by

$$\frac{p}{kT} = \frac{g_s}{V} \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, \log\left(1 + e^{-\beta(\epsilon - \mu)}\right) = \frac{g_s}{\lambda^3}\left( -\text{Li}_{5/2}(-z) \right) \tag{8.5.9}$$

At low densities and high temperatures, we see from the power series expansion of the polylogarithms that it is consistent to take $z$ to be small. Keeping terms up to the quadratic order in $z$, we get

$$\frac{p}{kT} = n \left[ 1 + \frac{n\lambda^3}{2^{5/2}\, g_s} + \cdots \right] \tag{8.5.10}$$

So, as in the bosonic case, we are not far from the ideal gas law. The correction may be identified in terms of the second virial coefficient as

$$B_2 = \frac{\lambda^3}{2^{5/2}\, g_s} \tag{8.5.11}$$

This is positive; so, unlike the bosonic case, we would need a repulsive potential between classical particles to mimic this effect via the classical expression (Equation 8.4.27) for $B_2$.

8.4.1: Electrons in a Metal

Consider a two-state system in quantum mechanics and, to begin with, we take the states to be degenerate. Thus the Hamiltonian is just a diagonal matrix,

$$H_0 = \begin{pmatrix} E_0 & 0 \\ 0 & E_0 \end{pmatrix} \tag{8.5.12}$$

If we consider a perturbation to this system such that the Hamiltonian becomes

$$H = H_0 + V = H_0 + \begin{pmatrix} 0 & v \\ v & 0 \end{pmatrix} = \begin{pmatrix} E_0 & v \\ v & E_0 \end{pmatrix} \tag{8.5.13}$$

then the degeneracy between the two eigenstates of $H_0$ is lifted and we have two eigenstates with eigenvalues

$$E_\pm = E_0 \pm v \tag{8.5.14}$$

Now consider a system with $N$ states, with the Hamiltonian as an $N \times N$ matrix. Starting with all states degenerate, a perturbation would split the levels by an amount depending on the perturbing term. We would still have $N$ eigenstates, of different energies which will be close to each other if the perturbation is not large. As $N$ becomes very large, the eigenvalues will be almost continuous; we get a band of states as the new eigenstates. This is basically what happens in a solid. Consider $N$ atoms on a lattice. The electronic states, for each atom by itself, are identical to the electronic states of any other atom by itself. Thus we have a Hamiltonian with a very large degeneracy for any of the atomic levels. The interatomic forces act as a perturbation to these levels. The result is that, instead of each atomic level, the solid has a band of energy levels corresponding to each unperturbed single-atom state. Since typically $N \sim$ the Avogadro number, it is a very good approximation to treat the band as having continuous energy eigenvalues between two fixed values. There are gaps between different bands, reflecting the energy gaps in the single-atom case. Thus the structure of electronic states in a solid is a series of well-separated bands with the energy levels within each band so close together as to be practically continuous. Many of these eigenstates will have wave functions localized around individual nuclei.


These correspond to the original single-atom energy states which are not perturbed very much by the neighboring atoms. Typically,inner shell electrons in a multi-electron atom would reside in such states. However, for the outer shell electrons, the perturbationscan be significant enough that they can hop from one atomic nucleus to a neighbor, to another neighbor, and so on, givingessentially free electrons subject to a periodic potential due to the nuclei. In fact, for the calculation of these bands, it is a betterapproximation to start from free electrons in a periodic potential rather than perturbing individual atomic states. These nonlocalizedbands are crucial for electrical conductivity. The actual calculation of the band structure of a solid is a formidable problem, but forunderstanding many physical phenomena, we only need the general structure.

Consider now a solid with the electronic states being a set of bands. We then consider filling in these bands with the availableelectrons. Assume that the number of electrons is such that a certain number of bands are completely filled, at zero temperature.Such a material is an insulator, because if an electric field is applied, then the electrons cannot respond to the field because of theexclusion principle, as there are no unoccupied states of nearby energy.

Table 8.4.1: Some Sample Fermi levels

Metal       eps_F in eV        Metal       eps_F in eV
Gold        5.53               Aluminum    11.7
Silver      5.49               Iron        11.1
Copper      7.00               Zinc        9.47

The only available unoccupied states are in the next higher band separated by an energy gap. As a result, the electrical conductivityis zero. If the field is strong enough to overcome the gap, then, of course, there can be conduction; this amounts to a dielectricbreakdown.

However, when all the available electrons have been assigned to states, if there is a band of nonlocalized states which is not entirelyfilled, it would mean that there are unoccupied states very close to the occupied ones. Electrons can move into these when anelectric field is applied, even if the amount of energy given by the potential is very small. This will lead to nonzero electricalconductivity. This is the case for conductors; they have bands which are not fully filled. Such bands are called conducting bands,while the filled ones are called valence bands.

The nonlocalized states of the conducting band can be labeled by the electron momentum $\hbar\vec{k}$ with energy $\epsilon(k)$. The latter is, in general, not a simple function like $\frac{\hbar^2 k^2}{2m}$, because of interactions with the lattice of atoms and between electrons. In general, it is not isotropic either but will depend on the crystalline symmetry of the lattice. But for most metals, we can approximate it by the simple form

$$\epsilon = \frac{\hbar^2 k^2}{2m_*} \tag{8.5.15}$$

The effect of interactions can be absorbed into an effective electron mass $m_*$. (Showing that this can actually be done is a fairly complicated task; it goes by the name of Fermi liquid theory, originally guessed, with supporting arguments, by Landau and proved to some extent by Migdal, Luttinger and others. We will not consider it here.) At zero temperature, when we have a partially filled band, the highest occupied energy level within the band is the Fermi level $\epsilon_F$. The value of $\epsilon_F$ can be calculated from Equation 8.5.3, knowing $n$, the number of electrons (not bound to sites); this is shown in Table 8.4.1. Since $\epsilon_F$ is of the order of $10^4\, K$ in terms of temperature, we see that, for phenomena at normal temperatures, we must consider the low temperature regime of the Fermi-Dirac distribution. The fugacity $z = e^{\beta\mu}$ is very large and we need a large-fugacity asymptotic expansion for various averages. This is done using a method due to Sommerfeld.

Consider the expression for $n$ from Equation 8.5.7, which we can write as

$$n = g_s \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{e^{\beta(\epsilon - \mu)} + 1} = \frac{g_s}{\lambda^3}\, \frac{4}{\sqrt{\pi}} \int_0^\infty du\, \frac{u^2}{e^{u^2 - \beta\mu} + 1} = \frac{g_s}{\lambda^3}\, \frac{2}{\sqrt{\pi}} \int_0^\infty dw\, \frac{\sqrt{w}}{e^{w - \beta\mu} + 1} \tag{8.5.16}$$


where $u = \sqrt{\frac{\hbar^2}{2mkT}}\, k$ and $w = u^2$. The idea is to change the variable of integration to $w - \beta\mu$. The lower limit of integration will then be $-\beta\mu$, which may be replaced by $-\infty$ as a first approximation. But in doing so, we need to ensure that the integrand vanishes at $-\infty$. For this one needs to do a partial integration first. Explicitly, we rewrite Equation 8.5.16 as

$$\frac{n\lambda^3}{g_s} = \frac{4}{3\sqrt{\pi}} \int_{-\beta\mu}^\infty dw\, \frac{(w + \beta\mu)^{3/2}}{(e^w + 1)(e^{-w} + 1)} = \frac{4}{3\sqrt{\pi}} \int_{-\infty}^\infty dw\, \frac{(w + \beta\mu)^{3/2}}{(e^w + 1)(e^{-w} + 1)} + O\big(e^{-\beta\mu}\big) \tag{8.5.17}$$

In the first line, we have done a partial integration of the expression from Equation 8.5.16; in the second line, we replaced the lower limit by $-\infty$. The discrepancy in doing this is at least of order $e^{-\beta\mu}$ due to the $e^{-w}$ in the denominator of the integrand. This is why we needed a partial integration. We can now expand $(w + \beta\mu)^{3/2}$ in powers of $w$; the contribution from large values of $|w|$ will be small because the denominator ensures the integrand is sharply peaked around $w = 0$. Odd powers of $w$ give zero since the integrand would be odd under $w \rightarrow -w$. Thus

$$\int_{-\infty}^\infty dw\, \frac{(w + \beta\mu)^{3/2}}{(e^w + 1)(e^{-w} + 1)} = \int_{-\infty}^\infty \frac{dw}{(e^w + 1)(e^{-w} + 1)}\left[ (\beta\mu)^{3/2} + \frac{3}{8}(\beta\mu)^{-1/2}\, w^2 + \cdots \right] = (\beta\mu)^{3/2} + \frac{\pi^2}{8}(\beta\mu)^{-1/2} + \cdots \tag{8.5.18}$$

This gives us the equation for $\mu$ as

$$\frac{3\, n \lambda^3 \sqrt{\pi}}{4\, g_s} = (\beta\mu)^{3/2} + \frac{\pi^2}{8}(\beta\mu)^{-1/2} + \cdots \tag{8.5.19}$$

By writing $\mu = \mu_0 + \mu_1 + \cdots$, we can solve this to first order as

$$\beta\mu \approx \beta\epsilon_F\left[ 1 - \frac{\pi^2}{12}\left(\frac{kT}{\epsilon_F}\right)^2 + \cdots \right] \tag{8.5.20}$$

where we have used the expression for $\epsilon_F$ in terms of $n$. As expected, the value of $\mu$ at zero temperature is $\epsilon_F$. Turning to the internal energy, by a similar procedure, we find

$$\frac{U}{V} = \frac{g_s}{\lambda^3}\, kT\, \frac{4}{5\sqrt{\pi}} \int_{-\infty}^\infty dw\, \frac{(w + \beta\mu)^{5/2}}{(e^w + 1)(e^{-w} + 1)} + O\big(e^{-\beta\mu}\big) = \frac{g_s}{\lambda^3}\, kT\, \frac{4}{5\sqrt{\pi}}\left[ (\beta\mu)^{5/2} + \frac{5\pi^2}{8}(\beta\mu)^{1/2} + \cdots \right] \tag{8.5.21}$$

Using the result (8.5.20) for $\mu$, this becomes

$$\frac{U}{V} = \frac{3}{5}\, \epsilon_F\, n \left[ 1 + \frac{5\pi^2}{12}\left(\frac{kT}{\epsilon_F}\right)^2 + \cdots \right] \tag{8.5.22}$$

The most interesting result of this calculation is that there is an electronic contribution to the specific heat which, at low temperatures, is given by

$$C_v = N k\, \frac{\pi^2}{2}\, \frac{kT}{\epsilon_F} + O(T^3) \tag{8.5.23}$$

As expected from the third law, this too vanishes as $T \rightarrow 0$.

8.4.2: White Dwarf StarsA gas of fermions which is degenerate is also important in many other physical phenomena, including astrophysics. Here we willbriefly consider its role in white dwarf stars.

u = k ħ2

2mkT

− −−−√ w = u2 w−βµ

−βµ −∞−∞ 8.5.16

nλ3

gs= dw

4

3 π−−√∫

−βμ

(w+βμ)3

2

( +1)( +1)ew e−w

= dw +O( )4

3 π−−√∫

−∞

(w+βμ)3

2

( +1)( +1)ew e−we−βμ

(8.5.17)

8.5.16

−∞ e−βµ e−w

(w+βµ)3

2 w |w|w = 0

w → −w

dw∫∞

−∞

(w+βμ)3

2

( +1)( +1)ew e−w= [(βµ + (βµ +⋅ ⋅ ⋅]∫

−∞

dw

( +1)( +1)ew e−w)

3

23

8)−

1

2 w2

= (βµ +(βµ +⋅ ⋅ ⋅)3

2 )−1

2π2

8

(8.5.18)

µ

= (βµ + (βµ3nλ3 π

−−√

4 gs)

3

2π2

8)−

1

2 (8.5.19)

µ = + +⋅ ⋅ ⋅µ0 µ1

βµ ≈ β [1 − +⋅ ⋅ ⋅]ϵFπ2

12( )kT

ϵF

2

(8.5.20)

ϵF n µ ϵF

U

V= kT dw +O( )gs

4

5 π−−√∫

−∞

(w+βμ)5

2

( +1)( +1)ew e−we−βμ

= kT [(βµ + (βµ +⋅ ⋅ ⋅]gs4

5 π−−√)

5

25 π2

8)

1

2

(8.5.21)

8.5.20 µ

= [1 + +⋅ ⋅ ⋅]U

V

3

5ϵF n

5π2

12( )kT

ϵF

2

(8.5.22)

= N k +O( )Cv

π2

2

kT

ϵFT 3 (8.5.23)

T → 0
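Before turning to the astrophysics, here is a rough numerical check (added for illustration; the density used is an assumed representative value, not one quoted in the text) that the electrons in a white dwarf are both degenerate and relativistic, using the free-particle relations k_F = (3π²n_e)^{1/3} (the g_s = 2 case of the formula below) and the extreme relativistic energy ε_F ≈ ħ k_F c.

```python
from math import pi

# Constants (SI)
hbar, c, k_B = 1.054571817e-34, 2.99792458e8, 1.380649e-23
m_p, MeV = 1.67262192e-27, 1.602176634e-13

# Assumed central density of a massive white dwarf (order of magnitude only)
rho = 1.0e12            # kg/m^3
n_e = rho / (2 * m_p)   # helium: two nucleons per electron

k_F = (3 * pi**2 * n_e)**(1/3)
eps_F_rel = hbar * k_F * c      # extreme relativistic estimate, cf. Eq. (8.5.24)
kT_core = k_B * 1.0e7           # core temperature ~ 10^7 K

print(f"p_F c    = {eps_F_rel/MeV:.2f} MeV   (compare m_e c^2 = 0.51 MeV)")
print(f"k T_core = {kT_core/MeV*1e3:.2f} keV  (<< eps_F: the electron gas is degenerate)")
```

With these assumed inputs, p_F c comes out of order a few MeV, much larger than both the electron rest energy and kT, which is the regime assumed in the treatment that follows.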


The absolute magnitude of a star, which is proportional to its luminosity or total output of energy per unit time, is related to its spectral characteristic, which is in turn related to the temperature of its photosphere. Thus a plot of luminosity versus spectral classification, known as a Hertzsprung-Russell diagram, is a useful guide to classifying stars. Generally, bluer stars or hotter stars have a higher luminosity compared to stars in the red part of the spectrum. They roughly fall into a fairly well-defined curve. Stars in this category are called main sequence stars. Our own star, the Sun, is a main sequence star. There are two main exceptions: white dwarfs, which tend to have low luminosity even though they are white, and red giants, which have a higher luminosity than expected for the red part of the spectrum. White dwarfs have lower luminosity because they have basically run out of hydrogen for fusion and usually are not massive enough to pass the threshold for the fusion of higher nuclei. They are thus mostly made of helium. The radiation is primarily from gravitational contraction. (Red giants are rather low mass stars which have exhausted the hydrogen in their cores. But then the cores contract, hydrogen from outer layers gets pulled in somewhat and compressed enough to sustain fusion outside the core. Because the star has a large radius, the total output is very high even though the photosphere is not very hot, only around 4000 K.)

Returning to white dwarfs, what keeps them from completely collapsing is the degeneracy pressure due to electrons. The stars are hot enough for most of the helium to be ionized and so there is a gas of electrons. The Fermi level is around 20 MeV or so, while the temperature in the core is of the order of 10⁷ K ∼ 10³ eV. Thus the electron gas is degenerate and the pressure due to this is important in maintaining equilibrium. The electron mass being ∼ 0.5 MeV, the gas is relativistic. If we use the extreme relativistic formula, the energy-momentum relation is

\[ \epsilon \approx c\, |\vec{p}\,| \tag{8.5.24} \]

Calculating the Fermi level and energy density, we find

\[ n \equiv \frac{N}{V} = g_s \int_0^{k_F} \frac{d^3k}{(2\pi)^3} = \frac{g_s\, k_F^3}{6\pi^2}, \qquad \frac{U}{V} = g_s\, \hbar c \int_0^{k_F} \frac{d^3k}{(2\pi)^3}\, k = \frac{\hbar c\, g_s}{8\pi^2}\left(\frac{6\pi^2 n}{g_s}\right)^{\frac{4}{3}} \equiv K\, n^{\frac{4}{3}}, \qquad p = \frac{K}{3}\, n^{\frac{4}{3}} \tag{8.5.25} \]

For the condition for hydrostatic equilibrium, consider a spherical shell of material in the star, of thickness dr at a radius r from the center. If the density (which is a function of the radius) is ρ(r), then the mass of this shell is ρ 4πr² dr. The attractive force pulling this towards the center is −(Gmρ/r²) 4πr² dr, where m(r) is the mass enclosed inside the sphere of radius r. The pressure difference between the inside and outside of the shell under consideration is p(r) − p(r+dr), with an outward force −(dp/dr) 4πr² dr. Thus equilibrium requires

\[ \frac{dp}{dr} = -\frac{G\, m\, \rho}{r^2} \tag{8.5.26} \]

Further, the mass enclosed can be written as

\[ m(r) = \int_0^r dr'\, 4\pi r'^2\, \rho \tag{8.5.27} \]

These two equations, along with the equation of state, give a second order equation for ρ(r). The radius R of the star is defined by p(R) = 0.

What contributes to the pressure? This is the key issue in solving these equations. For a main sequence star which is still burning hydrogen, the kinetic pressure (due to the random movements of the material particles) and the radiation pressure contribute. For a white dwarf, it is basically the degeneracy pressure. Thus we must solve these equations, using the pressure from Equation (8.5.25). The result is then striking. If the mass of the star is beyond a certain value, then the electron degeneracy pressure is not enough to counterbalance it, and hence the star cannot continue as a white dwarf. This upper limit on the mass of a white dwarf is approximately 1.4 times the mass of the Sun. This limit is known as the Chandrasekhar limit.

What happens to white dwarfs with higher masses? They can collapse and ignite other fusion processes, usually resulting in a supernova. They could end up as a neutron star, where the electrons, despite the degeneracy pressure, have been squeezed to a point


where they combine with the protons and we get a star made of neutrons. This (very dense) star is held up by neutron degeneracy pressure. (The remnant from the Crab Nebula supernova explosion is such a neutron star.) There is an upper limit to the mass of neutron stars as well, by reasoning very similar to what led to the Chandrasekhar limit; this is known as the Tolman-Oppenheimer-Volkoff limit. What happens for higher masses? They may become quark stars, and for even higher masses, beyond the stability limit of quark stars, they may completely collapse to form a black hole.
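As an aside (not from the original text), the scale of the Chandrasekhar limit can be recovered by dimensional analysis from the quantities entering Equations (8.5.25) through (8.5.27): balancing the degeneracy pressure K n^{4/3} against gravity gives a mass that depends only on ħ, c, G and the mass per electron μ_e m_p. The numerical prefactor (≈3.1) used below comes from the full solution of the structure equations and is quoted here as an assumed input.

```python
# Order-of-magnitude estimate of the Chandrasekhar mass (illustrative sketch).
hbar, c, G = 1.054571817e-34, 2.99792458e8, 6.67430e-11
m_p, M_sun = 1.67262192e-27, 1.989e30
mu_e = 2.0                      # nucleons per electron (helium or carbon-oxygen matter)

M_scale = (hbar * c / G)**1.5 / (mu_e * m_p)**2   # purely dimensional estimate
prefactor = 3.1                                   # from the full n = 3 polytrope solution (assumed)

print(f"dimensional scale : {M_scale/M_sun:.2f} M_sun")
print(f"with prefactor    : {prefactor*M_scale/M_sun:.2f} M_sun   (~1.4 M_sun)")
```

The point of the sketch is that the limiting mass involves no property of the star other than the composition parameter μ_e; it is fixed by fundamental constants.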

8.4.3: Diamagnetism and Paramagnetism

Diamagnetism and paramagnetism refer to the response of a material system to an external magnetic field B_i. To quantify this, we look at the internal energy U of the material, considered as a function of the magnetic field. The magnetization M_i of the material is then defined by

\[ M_i = \frac{1}{V}\left( -\frac{\partial U}{\partial B_i} \right)_{S,V,N} \tag{8.5.28} \]

The magnetization is the average magnetic dipole moment (per unit volume) which the material develops in response to the field and, in general, is itself a function of the field B_i. For ferromagnetic materials, the magnetization can be nonzero even when we turn off the external field, but for other materials, for small values of the field, we can expect a series expansion in powers of B, so that

\[ M_i = \chi_{ij}\, B_j + O(B^2) \tag{8.5.29} \]

χ_{ij} is the magnetic susceptibility of the material. In cases where the linear approximation (8.5.29) is not adequate, we define

\[ \chi_{ij} = \left( \frac{\partial M_i}{\partial B_j} \right) = \frac{1}{V}\left( -\frac{\partial^2 U}{\partial B_i\, \partial B_j} \right)_{S,V,N} \tag{8.5.30} \]

In general χ_{ij} is a tensor, but for materials which are isotropic to a good approximation, we can take χ_{ij} = χ δ_{ij}, defining a scalar susceptibility χ. Materials for which χ < 0 are said to be diamagnetic, while materials for which χ > 0 are said to be paramagnetic. The field H⃗, which appears in the Maxwell equation which has the free current J⃗ as the source, is related to the B⃗ field by H⃗ = (1 − χ)B⃗ = B⃗/µ; µ is the magnetic permeability.

Regarding magnetization and susceptibility, there is a theorem which is very simple but deep in its implications. It is originally due to Niels Bohr and later rediscovered by H.J. van Leeuwen. The theorem can be rephrased as follows.

Theorem 8.5.1 (Bohr-van Leeuwen Theorem)

The equilibrium partition function of a system of charged particles obeying classical statistics in an external magnetic field is independent of the magnetic field.

It is very easy to prove this theorem. Consider the Hamiltonian H of a system of N charged particles in an external magnetic field. It is given by

\[ H = \sum_{\alpha=1}^{N} \frac{\left( p_{\alpha i} - q\, A_i(x_\alpha) \right)^2}{2 m_\alpha} + V(x) \tag{8.5.31} \]

where α refers to the particle, i = 1, 2, 3, as usual, and V is the potential energy. It could include the electrostatic potential energy for the particles as well as the contribution from any other source. A_i(x_α) is the vector potential which is evaluated at the position of the α-th particle. The classical canonical partition function is given by

\[ Q_N = \frac{1}{N!} \int \prod_\alpha \frac{d^3x_\alpha\, d^3p_\alpha}{(2\pi\hbar)^3}\; e^{-\beta H} \tag{8.5.32} \]

The strategy is to change the variables of integration to

\[ \Pi_{\alpha i} = p_{\alpha i} - q\, A_i(x_\alpha) \tag{8.5.33} \]

The Hamiltonian becomes


\[ H = \sum_{\alpha=1}^{N} \frac{\Pi_{\alpha i}\, \Pi_{\alpha i}}{2 m_\alpha} + V(x) \tag{8.5.34} \]

Although this eliminates the external potential A_i from the Hamiltonian, we have to be careful about the Jacobian of the transformation. But in this case, we can see that the Jacobian is 1. For the phase space variables of one particle, we find

\[ \begin{pmatrix} d\Pi_i \\ dx_i \end{pmatrix} = \begin{pmatrix} \delta_{ij} & -q\,\dfrac{\partial A_i}{\partial x_j} \\ 0 & \delta_{ij} \end{pmatrix} \begin{pmatrix} dp_j \\ dx_j \end{pmatrix} \tag{8.5.35} \]

The determinant of the matrix in this equation is easily verified to be the identity, and the argument generalizes to N particles. Hence

\[ Q_N = \frac{1}{N!} \int \prod_\alpha \frac{d^3x_\alpha\, d^3\Pi_\alpha}{(2\pi\hbar)^3}\; e^{-\beta H(\{\Pi, x\})} \tag{8.5.36} \]

We see that A_i has disappeared from the integral, proving the theorem. Notice that any mutual binding of the particles via electrostatic interactions, which is contained in V(x), does not change this conclusion. The argument extends to the grand canonical partition function since it is Σ_n z^n Q_n.

Note

For the cognoscenti, what we are saying is that one can describe the dynamics of charged particles in a magnetic field in two ways. We can use the Hamiltonian (8.5.31) with the symplectic form ω = dp_i ∧ dx_i, or one can use the Hamiltonian (8.5.34) with the symplectic form Ω = dΠ_i ∧ dx_i + q (∂A_j/∂x_i) dx_i ∧ dx_j. The equations of motion will be identical. But in the second form, the Hamiltonian does not involve the vector potential. The phase volume defined by Ω is also independent of A_i. Thus the partition function is independent of A_i.

This theorem shows that the explanation for diamagnetism and paramagnetism must come from the quantum theory. We will consider these briefly, starting with diamagnetism. The full treatment for an actual material has to take account of the proper wave functions of the charged particles involved, for both the localized states and the extended states. We will consider a gas of charged particles, each of charge e and mass m, for simplicity. We take the magnetic field to be along the third axis. The energy eigenstates of a charged particle in an external uniform magnetic field are the so-called Landau levels, and these are labeled by p, k, λ. The energy eigenvalues are

\[ E_{k,p} = \frac{p^2}{2m} + \hbar\omega\left(k + \tfrac{1}{2}\right), \qquad k = 0, 1, \cdots \tag{8.5.37} \]

where ω = eB/m. Here p is the momentum along the third axis and k labels the Landau level. Each of these levels has a degeneracy equal to

\[ \text{Degeneracy} = \frac{eB}{(2\pi\hbar)} \times \text{Area of sample} \tag{8.5.38} \]

The states with the same energy eigenvalue are labeled by λ. The particles are fermions (electrons) and hence the occupation number of each state can be zero or one. Thus the partition function Z is given by

\[ \log Z = \sum_{\lambda, k, p} \log\left( 1 + e^{-\beta E_{k,p} + \beta\mu} \right) = \frac{eB}{(2\pi\hbar)} \int d^2x\, \frac{dp\, dx_3}{(2\pi\hbar)} \sum_k \log\left( 1 + z\, e^{-\beta E_{k,p}} \right) \tag{8.5.39} \]

where z is the fugacity as usual. For high temperatures, we can consider a small z-expansion. Retaining only the leading term,

\[ \log Z = V\, \frac{eB}{(2\pi\hbar)} \left( \frac{2\pi m kT}{(2\pi\hbar)^2} \right)^{\frac{1}{2}} z\, \frac{e^{-x/2}}{1 - e^{-x}} + \cdots, \qquad x = \frac{\hbar\omega}{kT} \tag{8.5.40} \]

For high temperatures, we can also use a small x-expansion,


\[ \frac{e^{-x/2}}{1 - e^{-x}} \approx \frac{1}{x}\left( 1 - \frac{x^2}{24} + \cdots \right) \tag{8.5.41} \]

This leads to

\[ \log Z = V \left( \frac{2\pi m kT}{(2\pi\hbar)^2} \right)^{\frac{3}{2}} z \left( 1 - \frac{x^2}{24} + \cdots \right) \tag{8.5.42} \]

The definition (8.5.28) is equivalent to dU = T dS − p dV + µ dN − M V dB. From pV = kT log Z, we have G − F = µN − F = kT log Z, so that

\[ d(kT \log Z) = dG - dF = N d\mu + S dT + p dV + M V dB \tag{8.5.43} \]

which shows that

\[ M = \frac{kT}{V}\left( \frac{\partial \log Z}{\partial B} \right)_{T,V,\mu} = \frac{kT}{V}\left( \frac{\partial \log Z}{\partial B} \right)_{T,V,z} \tag{8.5.44} \]

Using Equations (8.5.42) and (8.5.44), we see that

\[ M = \left( \frac{2\pi m kT}{(2\pi\hbar)^2} \right)^{\frac{3}{2}} z \left[ -\frac{1}{12} \left( \frac{e\hbar}{m} \right)^2 \frac{B}{kT} + \cdots \right] \tag{8.5.45} \]

Further, the average particle number is given by

\[ N \equiv z \left( \frac{\partial \log Z}{\partial z} \right)_{T,V,B} = V \left( \frac{2\pi m kT}{(2\pi\hbar)^2} \right)^{\frac{3}{2}} z \left( 1 - \frac{x^2}{24} + \cdots \right) \tag{8.5.46} \]

Using this to eliminate z, we find from Equation (8.5.45),

\[ M = \frac{N}{V} \left[ -\frac{1}{12} \left( \frac{e\hbar}{m} \right)^2 \frac{B}{kT} + \cdots \right] \left( 1 - \frac{x^2}{24} \right)^{-1} \approx \frac{N}{V} \left[ -\frac{1}{12} \left( \frac{e\hbar}{m} \right)^2 \frac{B}{kT} \right] \tag{8.5.47} \]

The diamagnetic susceptibility is thus

\[ \chi \approx -\frac{N}{V} \left( \frac{e\hbar}{m} \right)^2 \frac{1}{12\, kT} \tag{8.5.48} \]

Although the quantum mechanical formula for the energy levels is important in this derivation, we have not really used the Fermi-Dirac distribution, since only the high temperature case was considered. At low temperatures, the Fermi-Dirac distribution will be important. The problem also becomes closely tied in with the quantum Hall effect, which is somewhat outside the scope of what we want to discuss here. So we will not consider the low temperature case for diamagnetism here. Instead we shall turn to a discussion of paramagnetism.

Paramagnetism can arise from the spin magnetic moment of the electron. Thus this is also very much a quantum effect. The Hamiltonian for a charged point particle including the spin-magnetic field coupling is

\[ H = \frac{(p - eA)^2}{2m} - \gamma\, \frac{e}{2m}\, \vec{S} \cdot \vec{B} \tag{8.5.49} \]

Here S⃗ is the spin vector and γ is the gyromagnetic ratio. For the electron, S⃗ = (ħ/2)σ⃗, σ⃗ being the Pauli matrices, and γ is very close to 2; we will take γ = 2. Since we want to show how a positive χ can arise from the spin magnetic moment, we will, for this argument, ignore the vector potential A in the first term of the Hamiltonian. The energy eigenvalues are thus

\[ E_{p,\pm} = \frac{p^2}{2m} \mp \mu_0 B, \qquad \mu_0 = \frac{e\hbar}{2m} \tag{8.5.50} \]


The partition function Z is thus given by

\[ \log Z = \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3} \left[ \log\left( 1 + z_+\, e^{-\frac{\beta p^2}{2m}} \right) + \log\left( 1 + z_-\, e^{-\frac{\beta p^2}{2m}} \right) \right] \tag{8.5.51} \]

where z_± = exp(βµ_±) with

\[ \mu_\pm = \mu \pm \mu_0 B \tag{8.5.52} \]

From log Z, we get

\[ M = \frac{1}{V}\, kT \left( \frac{\partial \log Z}{\partial B} \right)_z = \mu_0 \left( n_+ - n_- \right), \qquad n_\pm = \int \frac{d^3p}{(2\pi\hbar)^3}\, \frac{1}{e^{\beta\left( \frac{p^2}{2m} - \mu_\pm \right)} + 1} \tag{8.5.53} \]

By taking z(∂ log Z/∂z), we see that the number density of electrons for both spin states together is n = n₊ + n₋. For high temperatures, we can approximate the integral in Equation (8.5.53) by

\[ n_\pm \approx \int \frac{d^3p}{(2\pi\hbar)^3}\, e^{-\frac{\beta p^2}{2m}}\, e^{\beta\mu_\pm} = \left[ \frac{2\pi m kT}{(2\pi\hbar)^2} \right]^{\frac{3}{2}} e^{\beta\mu_\pm} \tag{8.5.54} \]

Using this, the magnetization becomes

\[ M = n\, \mu_0 \tanh\left( \frac{\mu_0 B}{kT} \right) \approx \frac{n \mu_0^2}{kT}\, B, \qquad \text{for } \mu_0 B \ll kT \tag{8.5.55} \]

The susceptibility at high temperatures is thus given as

\[ \chi = \frac{n \mu_0^2}{kT} \tag{8.5.56} \]

Turning to low temperatures, notice that we have already obtained the required expansion for the integral in Equation (8.5.53); this is what we have done following Equation (8.5.16), so we can use the formula (8.5.20) for µ, along with Equation (8.5.3) for the Fermi level in terms of n, applied in the present case to n_± separately. Thus

\[ \mu_\pm = \epsilon_F(n_\pm) \left[ 1 - \frac{\pi^2}{12}\left( \frac{kT}{\epsilon_F(n_\pm)} \right)^2 + \cdots \right], \qquad \epsilon_F(n_\pm) = \frac{\hbar^2}{2m}\left( 6\pi^2 n_\pm \right)^{\frac{2}{3}} \tag{8.5.57} \]

Defining Δ = n₊ − n₋, we can write

\[ \epsilon_F(n_\pm) = \epsilon_F(n) \left( 1 \pm \frac{\Delta}{n} \right)^{\frac{2}{3}}, \qquad \epsilon_F = \frac{\hbar^2}{2m}\left( 3\pi^2 n \right)^{\frac{2}{3}} \tag{8.5.58} \]

The fact that µ₊ − µ₋ = 2µ₀B from Equation (8.5.52) now can be written as

\[ 2\mu_0 B = \epsilon_F \left[ \left( 1 + \frac{\Delta}{n} \right)^{\frac{2}{3}} - \left( 1 - \frac{\Delta}{n} \right)^{\frac{2}{3}} \right] - \frac{\pi^2}{12}\left( \frac{kT}{\epsilon_F} \right)^2 \epsilon_F \left[ \frac{1}{\left( 1 + \frac{\Delta}{n} \right)^{\frac{2}{3}}} - \frac{1}{\left( 1 - \frac{\Delta}{n} \right)^{\frac{2}{3}}} \right] + \cdots \tag{8.5.59} \]

After solving this for Δ, we can get the magnetization as M = µ₀Δ, or χ = µ₀Δ/B. Since Δ = 0 for B = 0, to linear order in the magnetic field, we find


\[ \frac{\Delta}{n} = \frac{3}{2}\, \frac{\mu_0 B}{\epsilon_F} \left[ 1 - \frac{\pi^2}{12}\left( \frac{kT}{\epsilon_F} \right)^2 + \cdots \right] \tag{8.5.60} \]

The susceptibility at low temperatures is then

\[ \chi = \frac{3}{2}\, \frac{n \mu_0^2}{\epsilon_F} \left[ 1 - \frac{\pi^2}{12}\left( \frac{kT}{\epsilon_F} \right)^2 + \cdots \right] \tag{8.5.61} \]

The susceptibility from the spin magnetic moment shows paramagnetic behavior both at high and low temperatures, as seen from Equations (8.5.56) and (8.5.61).

8.5: Applications of the Fermi-Dirac Distribution is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
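As a closing illustration for this section (added here, not part of the original notes), one can compare the high-temperature Pauli paramagnetic susceptibility (8.5.56) with the Landau diamagnetic susceptibility (8.5.48): with µ₀ = eħ/2m they differ by a factor of −1/3. The sketch below checks this numerically; the electron density used is an arbitrary assumed value.

```python
# Compare Pauli paramagnetic and Landau diamagnetic susceptibilities
# in the high-temperature limit, per Eqs. (8.5.56) and (8.5.48).
e, hbar, m_e, k_B = 1.602176634e-19, 1.054571817e-34, 9.1093837e-31, 1.380649e-23

n = 1.0e28       # assumed electron density, m^-3
T = 300.0        # K
mu0 = e * hbar / (2 * m_e)       # spin magnetic moment used in the text

chi_pauli  = n * mu0**2 / (k_B * T)                        # Eq. (8.5.56)
chi_landau = -n * (e * hbar / m_e)**2 / (12 * k_B * T)     # Eq. (8.5.48)

print(f"chi_Pauli  = {chi_pauli:.3e}")
print(f"chi_Landau = {chi_landau:.3e}")
print(f"ratio      = {chi_landau / chi_pauli:.3f}   # equals -1/3")
```

The ratio −1/3 is independent of temperature and density in this limit, since both susceptibilities are proportional to n/kT.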


CHAPTER OVERVIEW

9: The Carathéodory Principle

The formulation of the second law from thermodynamics used the concept of heat engines, at least indirectly. But the law is very general and one could ask whether there is another formulation which does not invoke heat engines but leads to the notion of absolute temperature and the principle that entropy cannot spontaneously decrease. Such a version of the second law is obtained in an axiomatization of thermodynamics due to C. Carathéodory.

9.1: Mathematical Preliminaries
9.2: Carathéodory Statement of the Second Law

9: The Carathéodory Principle is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


9.1: Mathematical Preliminaries

We will start with a theorem on differential forms which is needed to formulate Carathéodory's version of the second law. Before proving Carathéodory's theorem, we will need the following result.

Theorem 9.1.1 (Integrating Factor Theorem)

Let A = A_i dx^i denote a differential one-form. If A ∧ dA = 0, then at least locally, one can find an integrating factor for A; i.e., there exist functions τ and φ such that A = τ dφ.

The proof of this result is most easily done inductively in the dimension of the space. First, we consider the two-dimensional case, so that i = 1, 2. In this case, the condition A ∧ dA = 0 is vacuous. Write A = A₁ dx¹ + A₂ dx². We make a coordinate transformation to λ, φ, where

\[ \frac{dx^1}{d\lambda} = -f(x^1, x^2)\, A_2, \qquad \frac{dx^2}{d\lambda} = f(x^1, x^2)\, A_1 \tag{9.1.1} \]

where f(x¹, x²) is an arbitrary function which can be chosen in any convenient way. This equation shows that

\[ A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} = 0 \tag{9.1.2} \]

Equations (9.1.1) define a set of nonintersecting trajectories, λ being the parameter along the trajectory. We choose φ as the coordinate on transverse sections of the flow generated by (9.1.1). Making the coordinate transformation from x¹, x² to λ, φ, we can now write the one-form as

\[ A = \left( A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} \right) d\lambda + \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d\phi = \tau\, d\phi, \qquad \tau = A_i \frac{\partial x^i}{\partial \phi} \tag{9.1.3} \]

This proves the theorem for two dimensions. In three dimensions, we have

\[ A = A_1 dx^1 + A_2 dx^2 + A_3 dx^3 \tag{9.1.4} \]

The strategy is to start by determining τ, φ for the A₁, A₂ subsystem. We choose the new coordinates as λ, φ, x³ and impose Equation (9.1.1). Solving these, we will find x¹ and x² as functions of λ and x³. The trajectories will also depend on the starting points, which may be taken as points on the transverse section and hence labeled by φ. Thus we get

\[ x^1 = x^1(\lambda, \phi, x^3), \qquad x^2 = x^2(\lambda, \phi, x^3) \tag{9.1.5} \]

The one-form A in Equation (9.1.4) now becomes

\[ A = \left( A_1 \frac{\partial x^1}{\partial \lambda} + A_2 \frac{\partial x^2}{\partial \lambda} \right) d\lambda + \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d\phi + A_3\, dx^3 + \left( A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right) dx^3 = \tau\, d\phi + \tilde{A}_3\, dx^3, \qquad \tilde{A}_3 = A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \tag{9.1.6} \]

We now consider imposing the equation A ∧ dA = 0,

\[ A \wedge dA = \left[ \tilde{A}_3 \left( \partial_\lambda A_\phi - \partial_\phi A_\lambda \right) + A_\lambda \left( \partial_\phi \tilde{A}_3 - \partial_3 A_\phi \right) + A_\phi \left( \partial_3 A_\lambda - \partial_\lambda \tilde{A}_3 \right) \right] dx^3 \wedge d\lambda \wedge d\phi = 0 \tag{9.1.7} \]


Since A_λ = 0 and A_φ = τ from Equation (9.1.6), this equation becomes

\[ \tilde{A}_3 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_3}{\partial \lambda} = 0 \tag{9.1.8} \]

Writing Ã₃ = τ h, this becomes

\[ \tau^2\, \frac{\partial h}{\partial \lambda} = 0 \tag{9.1.9} \]

Since τ is not identically zero for us, we get ∂h/∂λ = 0 and, going back to Equation (9.1.6), we can write

\[ A = \tau \left[ d\phi + h(\phi, x^3)\, dx^3 \right] \tag{9.1.10} \]

The quantity in the square brackets is a one-form on the two-dimensional space defined by φ, x³. For this we can use the two-dimensional result and write it as τ̃ dφ̃, so that

\[ A = \tau\, \tilde{\tau}\, d\tilde{\phi} \equiv T\, d\tilde{\phi}, \qquad T = \tau\, \tilde{\tau} \tag{9.1.11} \]

This proves the theorem for the three-dimensional case.

The extension to four dimensions follows a similar pattern. The solutions to Equation (9.1.1) become

\[ x^1 = x^1(\lambda, \phi, x^3, x^4), \qquad x^2 = x^2(\lambda, \phi, x^3, x^4) \tag{9.1.12} \]

so that we can bring A to the form

\[ A = \left( A_1 \frac{\partial x^1}{\partial \phi} + A_2 \frac{\partial x^2}{\partial \phi} \right) d\phi + \left( A_3 + A_1 \frac{\partial x^1}{\partial x^3} + A_2 \frac{\partial x^2}{\partial x^3} \right) dx^3 + \left( A_4 + A_1 \frac{\partial x^1}{\partial x^4} + A_2 \frac{\partial x^2}{\partial x^4} \right) dx^4 = \tau\, d\phi + \tilde{A}_3\, dx^3 + \tilde{A}_4\, dx^4 \tag{9.1.13} \]

We now turn to imposing the condition A ∧ dA = 0. In local coordinates this becomes

\[ A_\alpha \left( \partial_\mu A_\nu - \partial_\nu A_\mu \right) + A_\mu \left( \partial_\nu A_\alpha - \partial_\alpha A_\nu \right) + A_\nu \left( \partial_\alpha A_\mu - \partial_\mu A_\alpha \right) = 0 \tag{9.1.14} \]

There are four independent conditions here, corresponding to (α, µ, ν) = (1, 2, 3), (4, 1, 2), (3, 4, 1), (3, 2, 4). Using A_λ = 0 and A_φ = τ, these four equations become

\[ \tilde{A}_3 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_3}{\partial \lambda} = 0 \tag{9.1.15} \]

\[ \tilde{A}_4 \frac{\partial \tau}{\partial \lambda} - \tau \frac{\partial \tilde{A}_4}{\partial \lambda} = 0 \tag{9.1.16} \]

\[ \tilde{A}_4 \frac{\partial \tilde{A}_3}{\partial \lambda} - \tilde{A}_3 \frac{\partial \tilde{A}_4}{\partial \lambda} = 0 \tag{9.1.17} \]

\[ \tilde{A}_3 \frac{\partial \tilde{A}_4}{\partial \phi} - \tilde{A}_4 \frac{\partial \tilde{A}_3}{\partial \phi} + \tau \frac{\partial \tilde{A}_3}{\partial x^4} - \tilde{A}_3 \frac{\partial \tau}{\partial x^4} + \tilde{A}_4 \frac{\partial \tau}{\partial x^3} - \tau \frac{\partial \tilde{A}_4}{\partial x^3} = 0 \tag{9.1.18} \]

Again, we introduce h and g by Ã₃ = τ h, Ã₄ = τ g. Then, equations (9.1.15) and (9.1.16) become

\[ \frac{\partial h}{\partial \lambda} = 0, \qquad \frac{\partial g}{\partial \lambda} = 0 \tag{9.1.19} \]

Equation (9.1.17) is then identically satisfied. The last equation, namely (9.1.18), simplifies to

\[ h\, \frac{\partial g}{\partial \phi} - g\, \frac{\partial h}{\partial \phi} + \frac{\partial h}{\partial x^4} - \frac{\partial g}{\partial x^3} = 0 \tag{9.1.20} \]

Using these results, Equation (9.1.13) becomes

\[ A = \tau \left[ d\phi + h\, dx^3 + g\, dx^4 \right] \tag{9.1.21} \]

The quantity in the square brackets is a one-form on the three-dimensional space of φ, x³, x⁴, and we can use the previous result for an integrating factor for this. The condition for the existence of an integrating factor for dφ + h dx³ + g dx⁴ is precisely (9.1.20). Thus if we have Equation (9.1.20), we can write dφ + h dx³ + g dx⁴ as t ds for some functions t and s, so that finally A takes the form A = T dS.


Thus the theorem is proved for four dimensions. The procedure can be extended to higher dimensions recursively, establishing the theorem for all dimensions.
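To make the integrating factor theorem concrete, here is a small symbolic check (an added illustration, not part of the original notes) for a particular three-dimensional one-form chosen by hand, A = y dx + x dy + xy dz. One verifies that A ∧ dA = 0 and that A = τ dφ with τ = xy and φ = log(xy) + z.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
A = [y, x, x*y]                      # components (A_1, A_2, A_3) of A = y dx + x dy + xy dz
coords = (x, y, z)

# F_ij = d_i A_j - d_j A_i ; A ∧ dA = 0 is the cyclic condition A_1 F_23 + A_2 F_31 + A_3 F_12 = 0
F = lambda i, j: sp.diff(A[j], coords[i]) - sp.diff(A[i], coords[j])
wedge = A[0]*F(1, 2) + A[1]*F(2, 0) + A[2]*F(0, 1)
print("A wedge dA coefficient:", sp.simplify(wedge))      # -> 0

# Proposed integrating factor: A = tau * d(phi)
tau, phi = x*y, sp.log(x*y) + z
check = [sp.simplify(tau*sp.diff(phi, c) - a) for c, a in zip(coords, A)]
print("tau*d(phi) - A        :", check)                    # -> [0, 0, 0]
```

A one-form that fails the condition, such as y dx + x dy + x y² dz, admits no such τ and φ, which is what the accessibility theorem below exploits.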

Now we turn to the basic theorem needed for the Carathéodory formulation. Consider an n-dimensional manifold M with a one-form A on it. A solution curve to A is defined by A = 0 along the curve. Explicitly, the curve may be taken as given by a set of functions x^i = ξ^i(t), where t is the parameter along the curve and

\[ A_i\, \frac{dx^i}{dt} = A_i\, \dot{\xi}^i = 0 \tag{9.1.22} \]

In other words, the tangent vector to the curve is orthogonal to A_i. The curve therefore lies on an (n−1)-dimensional surface. Two points, say, P and P′ on M are said to be accessible if there is a solution curve which contains P and P′. Carathéodory's theorem is the following:

Theorem 9.1.2 (Carathéodory's Theorem)

If in the neighborhood of a point P there are A-inaccessible points, then A admits an integrating factor; i.e., A = T dS, where T and S are well defined functions in the neighborhood.

The proof of the theorem involves a reductio ad absurdum argument which constructs paths connecting P to any other point in the neighborhood. (This proof is due to H.A. Buchdahl, Proc. Camb. Phil. Soc. 76, 529 (1979).) For this, define

\[ C_{ijk} = A_i \left( \partial_j A_k - \partial_k A_j \right) + A_k \left( \partial_i A_j - \partial_j A_i \right) + A_j \left( \partial_k A_i - \partial_i A_k \right) \tag{9.1.23} \]

Now consider a point P′ near P. We have a displacement vector εη^i for the coordinates of P′ (from P). η^i can in general have a component along A_i and some components orthogonal to A_i. The idea is to solve for these from the equation A = 0. Let ξ^i(t) be a path which begins and ends at P, i.e., ξ^i(0) = ξ^i(1) = 0, 0 ≤ t ≤ 1, and which is orthogonal to A_i. Thus it is a solution curve. Any closed curve starting at P and lying in the (n−1)-dimensional space orthogonal to A_i can be chosen. Consider now a nearby path given by x^i(t) = ξ^i(t) + εη^i(t). This will also be a solution curve if A_i(ξ + εη)(ξ̇ + εη̇)^i = 0. Expanding to first order in ε, this is equivalent to

\[ A_i\, \dot{\eta}^i + \dot{\xi}^i\, \frac{\partial A_i}{\partial x^j}\, \eta^j = 0 \tag{9.1.24} \]

where we also used A_i ξ̇^i = 0. We may choose ξ̇^i to be of the form ξ̇^i = f^{ij} A_j, where f^{ij} is antisymmetric, to be consistent with A_i ξ̇^i = 0. We can find quantities f^{ij} such that this is true; in any case, it is sufficient to show one path which makes P′ accessible. So we may consider ξ̇^i's of this form. Thus Equation (9.1.24) becomes

\[ A_i\, \dot{\eta}^i + \eta^j \left( \partial_j A_i \right) f^{ik} A_k = 0 \tag{9.1.25} \]

This is one equation for the n components of the displacement η^i. We can choose the n−1 components of η^i which are orthogonal to A_i as we like and view this equation as determining the remaining component, the one along A_i. So we rewrite this equation as an equation for A_i η^i as follows.

\[ \frac{d}{dt}\left( A_i \eta^i \right) = \dot{A}_i \eta^i + A_i \dot{\eta}^i = \left( \partial_j A_i \right) \dot{\xi}^j \eta^i - \eta^j \left( \partial_j A_i \right) f^{ik} A_k = -\eta^i f^{jk} \left( \partial_i A_j - \partial_j A_i \right) A_k \]
\[ = -\frac{1}{2}\, \eta^i f^{jk} \left[ A_k \left( \partial_i A_j - \partial_j A_i \right) + A_j \left( \partial_k A_i - \partial_i A_k \right) + A_i \left( \partial_j A_k - \partial_k A_j \right) \right] + \left( A \cdot \eta \right) \frac{1}{2}\, f^{jk} \left( \partial_j A_k - \partial_k A_j \right) = -\frac{1}{2}\, \eta^i f^{jk} C_{kij} + \left( A \cdot \eta \right) \frac{1}{2}\, f^{ij} \left( \partial_i A_j - \partial_j A_i \right) \tag{9.1.26} \]

This can be rewritten as

\[ \frac{d}{dt}\left( A \cdot \eta \right) - F \left( A \cdot \eta \right) = -\frac{1}{2}\, C_{kij}\, \eta^i f^{jk} \tag{9.1.27} \]


where F = (1/2) f^{ij}(∂_iA_j − ∂_jA_i). The important point is that we can choose f^{ij}, along with a coordinate transformation if needed, such that C_{kij}η^if^{jk} has no component along A_i. For this, notice that

\[ C_{kij}\, A^i f^{jk} = \left[ -A^2 F_{ij} + A_i A_k F_{kj} + A_j A_k F_{ki} \right] f^{ij} \tag{9.1.28} \]

where F_{ij} = ∂_iA_j − ∂_jA_i. There are n(n−1)/2 components for f^{ij}, for which we have one equation if we set C_{kij}A^if^{jk} to zero. We can always find a solution; in fact, there are many solutions. Making this choice, C_{kij}η^if^{jk} has no component along A_i, so the components of η on the right hand side of Equation (9.1.27) are orthogonal to A_i. As mentioned earlier, there is a lot of freedom in how these components of η are chosen. Once they are chosen, we can integrate Equation (9.1.27) to get (A·η), the component along A_i. Integrating Equation (9.1.27), we get

\[ A \cdot \eta(1) = \int_0^1 dt\, \exp\!\left( \int_t^1 dt'\, F(t') \right) \left( -\frac{1}{2}\, C_{kij}\, \eta^i f^{jk} \right) \tag{9.1.29} \]

We have chosen η(0) = 0. It is important that the right-hand side of Equation (9.1.27) does not involve (A·η) for us to be able to integrate like this. We choose all components of η^i orthogonal to A_i to be such that

\[ \epsilon\, \eta^i = \text{coordinates of } P' \text{ orthogonal to } A \tag{9.1.30} \]

We then choose f^{jk}, if needed by scaling it, such that A·η(1) in Equation (9.1.29) gives A_i(x_{P'} − x_P)^i. We have thus shown that we can always access P′ along a solution curve. The only case where the argument would fail is when C_{ijk} = 0. In this case, A·η(1) as calculated is zero and we have no guarantee of matching the component of the displacement of P′ along the direction of A_i. Thus if there are inaccessible points in the neighborhood of P, then we must have C_{ijk} = 0. In this case, by the previous theorem, A admits an integrating factor and we can write A = T dS for some functions T and S in the neighborhood of P. This completes the proof of the Carathéodory theorem.

9.1: Mathematical Preliminaries is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


9.2: Carathéodory Statement of the Second Law

The statement of the second law due to Carathéodory is:

Definition: Carathéodory Principle

In the neighborhood of any equilibrium state of a physical system with any number of thermodynamic coordinates, there exist states which are inaccessible by adiabatic processes.

The adiabatic processes can be quite general, not necessarily quasi-static. It is easy to see that this leads immediately to the notion of absolute temperature and entropy. This has been discussed in a concise and elegant manner in Chandrasekhar's book on stellar structure. We briefly repeat his argument for completeness. For simplicity, consider a gas characterized by pressure p, volume V, and (empirical) temperature t, only two of which are adequate to specify the thermodynamic state, the third being given by an equation of state. Since these are the only variables, dQ has an integrating factor and we may write

\[ dQ = \tau\, d\sigma \tag{9.2.1} \]

where σ and τ will be functions of the variables p, V, t. The power of Carathéodory's formulation becomes clear when we consider two such systems brought into thermal contact which come to equilibrium. We then have a common temperature t, and the thermodynamic variables can now be taken as V₁, V₂, t (or t and one variable from each of (p₁, V₁), (p₂, V₂)). We also have dQ = dQ₁ + dQ₂. The number of variables is now three; nevertheless, the Carathéodory principle tells us that we can write

\[ \tau\, d\sigma = \tau_1\, d\sigma_1 + \tau_2\, d\sigma_2 \tag{9.2.2} \]

We now choose t, σ₁, σ₂ as the independent variables. Equation (9.2.2) then leads to

\[ \frac{\partial \sigma}{\partial \sigma_1} = \frac{\tau_1}{\tau}, \qquad \frac{\partial \sigma}{\partial \sigma_2} = \frac{\tau_2}{\tau}, \qquad \frac{\partial \sigma}{\partial t} = 0 \tag{9.2.3} \]

The last of these equations tells us that σ is only a function of σ₁ and σ₂, σ = σ(σ₁, σ₂). Further, since σ is a well-defined function of the various variables, derivatives on σ commute and so

\[ \frac{\partial}{\partial t} \frac{\partial \sigma}{\partial \sigma_1} - \frac{\partial}{\partial \sigma_1} \frac{\partial \sigma}{\partial t} = 0 \tag{9.2.4} \]

with a similar relation for derivatives with respect to σ₂ as well. Thus we have the result

\[ \frac{\partial}{\partial t}\left( \frac{\tau_1}{\tau} \right) = 0, \qquad \frac{\partial}{\partial t}\left( \frac{\tau_2}{\tau} \right) = 0 \tag{9.2.5} \]

Equivalently, we can write

\[ \frac{1}{\tau_1} \frac{\partial \tau_1}{\partial t} = \frac{1}{\tau_2} \frac{\partial \tau_2}{\partial t} = \frac{1}{\tau} \frac{\partial \tau}{\partial t} \tag{9.2.6} \]

This shows that the combination (1/τ)(∂τ/∂t) is independent of the system and is a universal function of the common variable t. Taking this function as g(t) and integrating, we get

\[ \tau = \Sigma(\sigma_1, \sigma_2)\; C \exp\!\left( \int_{t_0}^{t} dt\, g(t) \right), \qquad \tau_1 = \Sigma_1(\sigma_1)\; C \exp\!\left( \int_{t_0}^{t} dt\, g(t) \right), \qquad \tau_2 = \Sigma_2(\sigma_2)\; C \exp\!\left( \int_{t_0}^{t} dt\, g(t) \right) \tag{9.2.7} \]

The τ's are determined up to a function of the σ's; we take this arbitrariness as CΣ, where C is a constant and Σ is a function of the σ's involved. We can now define the absolute temperature as


\[ T \equiv C \exp\!\left( \int_{t_0}^{t} dt\, g(t) \right) \tag{9.2.8} \]

Notice that, in the case under consideration, T₁ = T₂ = T as expected for equilibrium. This gives dQ₁ = T Σ₁ dσ₁, etc. The relation dQ = dQ₁ + dQ₂ now reduces to

\[ \Sigma\, d\sigma = \Sigma_1\, d\sigma_1 + \Sigma_2\, d\sigma_2 \tag{9.2.9} \]

In the two-dimensional space with coordinates σ₁, σ₂, the vector (Σ₁, Σ₂) has vanishing curl, i.e., ∂₁Σ₂ − ∂₂Σ₁ = 0, since Σ₁ only depends on σ₁ and similarly for Σ₂. Thus Equation (9.2.9) shows that Σdσ is a perfect differential. This means that there exists a function S such that Σdσ = dS; this also means that Σ can depend on σ₁ and σ₂ only through the combination σ(σ₁, σ₂). Thus finally we have

\[ dQ = T\, dS \tag{9.2.10} \]

In this way, the Carathéodory principle leads to the definition of entropy S.

Figure 9.2.1: Illustrating the Carathéodory principle and increase of entropy

One can also see how this leads to the principle of increase of entropy. For this, consider a system with n thermodynamic variables. The entropy S will be a function of these. We can alternatively choose n−1 of the given variables and the entropy S to characterize states of the system. Now we ask the question: Given a state A, can we find a path which takes us via adiabatic processes to another state C? It is useful to visualize this in a diagram, with S as one of the axes, as in Fig. 9.2.1. We show one of the other axes, but there could be many. To get to C, we can start from A and go along a quasi-static reversible adiabatic to B and then, via some nonquasi-static process such as stirring, mixing, etc., get to C, keeping the system in adiabatic isolation. This second process can be irreversible. The idea is that the first part does not change the entropy, but brings the other variables to their desired final value. Then we move to the required value of S by some irreversible process. As shown, S_C > S_B = S_A. Suppose the second process can also decrease the entropy in some cases, so that we can go from B to D by some similar process. Then we see that all states close to B are accessible. Starting from any point, we can move along the surface of constant S to get to the desired value of the variables, except for S, and then jump to the required value of S by the second process. This contradicts the Carathéodory principle. Thus, if we postulate this principle, then we have to conclude that in all irreversible processes in adiabatic isolation the entropy has to either decrease or increase; we cannot have it increase in some processes and decrease in some other processes. So S should be either a nondecreasing quantity or a nonincreasing quantity. The choice of the sign of the absolute temperature, via the choice of the sign of the constant C in Equation (9.2.8), is related to which case we choose for entropy. The conventional choice, of course, is to take T ≥ 0 and entropy to be nondecreasing. In other words

\[ \Delta S \geq 0 \tag{9.2.11} \]

Thus effectively, we have obtained the version of the second law as given in Proposition 4 in Chapter 3.

9.2: Carathéodory Statement of the Second Law is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
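A concrete illustration of the result dQ = T dS (added here as an aside to this chapter): for an ideal gas, dQ = C_V dT + (NkT/V) dV is not a perfect differential, but dividing by T produces one. The short symbolic sketch below applies the mixed-partial (curl) test before and after multiplying by the integrating factor 1/T; C_V is treated as a constant.

```python
import sympy as sp

T, V, N, k, C_V = sp.symbols('T V N k C_V', positive=True)

# dQ = a dT + b dV for an ideal gas (dU = C_V dT, p = N k T / V)
a, b = C_V, N*k*T/V

# Perfect-differential test: d(a)/dV should equal d(b)/dT
print(sp.simplify(sp.diff(a, V) - sp.diff(b, T)))        # -> -N*k/V, nonzero: dQ is not exact

# After multiplying by the integrating factor 1/T: dS = (a/T) dT + (b/T) dV
print(sp.simplify(sp.diff(a/T, V) - sp.diff(b/T, T)))    # -> 0: dS is exact
```

Here 1/T plays the role of 1/τ in Equation (9.2.1), and the resulting exact form integrates to the familiar ideal-gas entropy S = C_V log T + Nk log V + const.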


CHAPTER OVERVIEW

10: Entropy and Information

The concept of entropy is one of the more difficult concepts in physics. Historically, it emerged as a consequence of the second law of thermodynamics, as in Equation 3.4.13. Later, Boltzmann gave a general definition for it in terms of the number of ways of distributing a given number of particles, as in Equation 7.2.3. But a clearer understanding of entropy is related to its interpretation in terms of information. We will briefly discuss this point of view here.

10.1: Information
10.2: Maxwell's Demon
10.3: Entropy and Gravity

10: Entropy and Information is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.


10.1: Information

We want to give a quantification of the idea of information. This is originally due to C. Shannon.

Consider a random variable x with probability distribution p(x). For simplicity, initially, we take x to be a discrete random variable, with N possible values x₁, x₂, ···, x_N, with p_i ≡ p(x_i) being the probability for x_i. We may think of an experiment for which the outcomes are the x_i, and p_i the probability for x_i in a trial run of the experiment. We want to define a concept of information associated with p(x). The key idea is to note that if an outcome has probability 1, the occurrence of that outcome carries no information since it was clear that it would definitely happen. If an outcome has a probability less than 1, then its occurrence can carry information. If the probability is very small, and the outcome occurs, it is unlikely to be a random event and so it makes sense to consider it as carrying information. Based on this intuitive idea, we expect information to be a function of the probability. By convention, we choose I(p) to be positive. Further, from what we said, I(1) = 0. Now consider two completely independent events, with probabilities p and p̃. The probability for both to occur is p p̃, and will carry information I(p p̃). Since the occurrence of each event separately carries information I(p) and I(p̃), we expect

\[ I(p\, \tilde{p}) = I(p) + I(\tilde{p}) \tag{10.1.1} \]

Finally, if the probability of some event is changed by a small amount, we expect the information for the event to be changed by a small amount as well. This means that we would like I(p) to be a continuous and differentiable function of p. Thus we need a continuous and differentiable function I(p) obeying the requirements I(p) ≥ 0, I(1) = 0 and I(p p̃) = I(p) + I(p̃). The only function which obeys these conditions is given by

\[ I(p) = -\log p \tag{10.1.2} \]

This is basically Shannon's definition of information. The base used for this logarithm is not specified by what has been said so far; it is a matter of choosing a unit for information. Conventionally, for systems using binary codes, we use log₂ p, while for most statistical systems we use the natural logarithms.

Consider now the outcome x_i which has a probability p_i. The amount of information for x_i is −log p_i. Suppose now that we do N trials of the experiment, where N is very large. Then the number of times x_i will be realized is N p_i. Thus it makes sense to define an average or expectation value for information as

\[ S = \sum_i p_i\, I(p_i) = -\sum_i p_i \log p_i \tag{10.1.3} \]

This expected value for information is Shannon's definition of entropy.

This definition of entropy requires some clarification. It stands for the amount of information which can be coded using the available outcomes. This can be made clearer by considering an example, say, of N tosses of a coin, or equivalently a stream of 0s and 1s, N units long. Each outcome is then a string of 0s and 1s; we will refer to this as a word, since we may think of it as the binary coding of a word. We take these to be ordered so that permutations of 0s and 1s in a given word will be counted as distinct. The total number of possibilities for this is 2^N, and each occurs with equal probability. Thus the amount of information in realizing a particular outcome is I = N log 2, or N bits if we use logarithm to base 2. The entropy of the distribution is

\[ S = \sum \frac{1}{2^N} \log 2^N = N \log 2 \tag{10.1.4} \]

Now consider a situation where we specify or fix some of the words. For example, let us say that all words start with 0. Then the probability of any word among this restricted set is now 1/2^{N−1}, and the entropy becomes S = (N−1) log 2. Thus entropy has decreased because we have made a choice; we have used some information. Thus entropy is the amount of information which can be potentially coded using a probability distribution.

This definition of entropy is essentially the same as Boltzmann's definition or what we have used in arriving at various distribution functions for particles. For this, consider the formula for entropy which we used in chapter 7, Equation 7.2.5,

\[ S \approx k \left[ N \log N - N - \sum_i \left( n_i \log n_i - n_i \right) \right] \tag{10.1.5} \]


Here n_i is the occupation number for the state i. In the limit of large N, n_i/N may be interpreted as the probability p_i for the state i. Using the symbol p_i for this, we can rewrite Equation (10.1.5) as

\[ \frac{S}{k} = -\sum_i p_i \log p_i \tag{10.1.6} \]

showing that the entropy as defined by Boltzmann in statistical physics is the same as Shannon's information-theoretic definition. (In thermodynamics, we measure S in J/K; we can regard Boltzmann's constant k as a unit conversion factor. Thus S/k from thermodynamics is the quantity to be compared to the Shannon definition.) The states in thermodynamics are specified by the values of positions and momenta for the particles, so the outcomes are continuous. A continuum generalization of Equation (10.1.6) is then

\[ \frac{S}{k} = -\int dN\; p \log p \tag{10.1.7} \]

where dN is an appropriate measure, like the phase space measure in Equation 7.4.1.

Normally, we maximize entropy subject to certain averages such as the average energy and average number of particles being specified. This means that the observer has, by observations, determined these values, and hence the number of available states is restricted. Only those states which are compatible with the given average energy and number of particles are allowed. This constrains the probability distribution which maximizes the entropy. If we specify more averages, then the maximal entropy is lower. The argument is similar to what was given after Equation (10.1.4), but we can see this more directly as well. Let A^α, α = 1, 2, ···, n, be a set of observables. The maximization of entropy subject to specifying the average values of these is given by maximizing

\[ \frac{S}{k} = \int \left[ -p \log p - \sum_\alpha^{n} \lambda_\alpha A^\alpha\, p \right] + \sum_\alpha^{n} \lambda_\alpha \langle A^\alpha \rangle \tag{10.1.8} \]

Here ⟨A^α⟩ are the average values which have been specified and λ_α are Lagrange multipliers. Variation with respect to the λ's gives the required constraint

\[ \langle A^\alpha \rangle = \int A^\alpha\, p \tag{10.1.9} \]

The distribution which extremizes Equation (10.1.8) is given by

\[ p_n = \frac{1}{Z_n}\, e^{-\sum_\alpha^{n} \lambda_\alpha A^\alpha}, \qquad Z_n = \int e^{-\sum_\alpha^{n} \lambda_\alpha A^\alpha} \tag{10.1.10} \]

The corresponding entropy is given by

\[ \frac{S_n}{k} = \log Z_n + \sum_\alpha^{n} \lambda_\alpha \langle A^\alpha \rangle \tag{10.1.11} \]

Now let us consider specifying n+1 averages. In this case, we have

\[ p_{n+1} = \frac{1}{Z_{n+1}}\, e^{-\sum_\alpha^{n+1} \lambda_\alpha A^\alpha}, \qquad Z_{n+1} = \int e^{-\sum_\alpha^{n+1} \lambda_\alpha A^\alpha}, \qquad \frac{S_{n+1}}{k} = \log Z_{n+1} + \sum_\alpha^{n+1} \lambda_\alpha \langle A^\alpha \rangle \tag{10.1.12} \]

This distribution reverts to p_n, and likewise S_{n+1} → S_n, if we set λ_{n+1} to zero.

If we calculate ⟨A^{n+1}⟩ using the distribution p_n and the answer comes out to be the specified value, then there is no information in going to p_{n+1}. Thus it is only if the distribution which realizes the specified value ⟨A^{n+1}⟩ differs from p_n that there is additional information in the choice of ⟨A^{n+1}⟩. This happens if λ_{n+1} ≠ 0. It is therefore useful to consider how S changes with λ_α. We find, directly from Equation (10.1.11),


\[ \frac{\partial S}{\partial \lambda_\alpha} = \sum_\beta \lambda_\beta\, \frac{\partial}{\partial \lambda_\alpha} \langle A^\beta \rangle = \sum_\beta \lambda_\beta \left[ -\langle A^\alpha A^\beta \rangle + \langle A^\alpha \rangle \langle A^\beta \rangle \right] = -\sum_\beta M_{\alpha\beta}\, \lambda_\beta, \qquad M_{\alpha\beta} = \langle A^\alpha A^\beta \rangle - \langle A^\alpha \rangle \langle A^\beta \rangle \tag{10.1.13} \]

The change of the maximal entropy with the λ's is given by a set of correlation functions designated as M_{αβ}. We can easily see that this matrix is positive semi-definite. For this, we use the Schwarz inequality

\[ \left[ \int B^* B \right] \left[ \int C^* C \right] \geq \left[ \int B^* C \right] \left[ \int C^* B \right] \tag{10.1.14} \]

For any set of complex numbers γ_α, we take B = γ_α A^α and C = 1. We then see from (10.1.14) that γ_α γ*_β M_{αβ} ≥ 0. (The integrals in Equation (10.1.14) should be finite for the inequality to make sense. We will assume that at least one of the λ's, say the one corresponding to the Hamiltonian, is always included so that the averages are finite.) Equation (10.1.13) then tells us that S decreases as more and more λ's pick up nonzero values. Thus we must interpret entropy as a measure of the information in the states which are still freely available for coding after the constraints imposed by the averages of the observables already measured. This also means that the increase of entropy in a system left to itself means that the system tends towards the probability distribution which is completely random except for specified values of the conserved quantities. The averages of all other observables tend towards the values given by such a random distribution. In such a state, the observer has minimum knowledge about observables other than the conserved quantities.

10.1: Information is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by V. Parameswaran Nair.
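The statement that specifying more averages lowers the maximal entropy can be checked numerically. The sketch below (an added illustration, not from the original text) maximizes the Shannon entropy of a distribution over a few discrete "energy" levels, first with only normalization imposed and then with an additional constraint on the average energy; the maximal entropy is smaller in the second case. The level energies and the target average are arbitrary assumed inputs.

```python
from math import log, exp

# Discrete "energies" and a target average (assumed values for illustration)
E = [0.0, 1.0, 2.0, 3.0]
E_target = 0.8

def entropy(p):
    return -sum(pi * log(pi) for pi in p if pi > 0)

# Only normalization imposed: the uniform distribution maximizes S
p_uniform = [1.0 / len(E)] * len(E)

# With <E> fixed: the maximizer is p_i ~ exp(-lam * E_i); solve for lam by bisection
def avg_E(lam):
    w = [exp(-lam * e) for e in E]
    return sum(e * wi for e, wi in zip(E, w)) / sum(w)

lo, hi = -50.0, 50.0
for _ in range(200):                       # avg_E(lam) is monotonically decreasing in lam
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if avg_E(mid) < E_target else (mid, hi)
lam = 0.5 * (lo + hi)
w = [exp(-lam * e) for e in E]
p_constrained = [wi / sum(w) for wi in w]

print(f"S with normalization only   = {entropy(p_uniform):.4f}")      # log 4 ~ 1.386
print(f"S with <E> also specified   = {entropy(p_constrained):.4f}")  # strictly smaller
```

The constrained maximizer is of the exponential form of Equation (10.1.10), and the drop in entropy is exactly the "information used up" by measuring the extra average.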


10.2: Maxwell's Demon

There is a very interesting thought experiment due to Maxwell which is perhaps best phrased as a potential violation of the second law of thermodynamics. The resolution of this problem highlights the role of entropy as information.

We consider a gas of particles in equilibrium in a box at some temperature T. The velocities of the particles follow the Maxwell distribution from Equation 7.3.7,

\[ f(v)\, d^3v = \left( \frac{m}{2\pi kT} \right)^{\frac{3}{2}} \exp\!\left( -\frac{m v^2}{2kT} \right) d^3v \tag{10.2.1} \]

The mean square speed given by

\[ \langle v^2 \rangle = \frac{3kT}{m} \tag{10.2.2} \]

may be used as a measure of the temperature. Now we consider a partition which divides the box into two parts. Further, we consider a small gate in the partition which can be opened and closed and requires a very small amount of energy, which can be taken to be zero in an idealized limit. Further, we assume there is a creature ("Maxwell's demon") with a large brain capacity to store a lot of data sitting next to this box. Now the demon is supposed to do the following. Every time he sees a molecule of high speed coming towards the gate from the left side, he opens the gate and lets it through to the right side. If he sees a slowly moving molecule coming towards the gate from the right side, he opens it and lets the molecule through to the left side. If he sees a slow-moving molecule on the left side, or a fast-moving one on the right side, he does nothing. Now after a while, the mean square speed on the left will be smaller than what it was originally, showing that the temperature on the left side is lower than T. Correspondingly, the mean square speed on the right side is higher and so the temperature there is larger than T. Effectively, heat is being transferred from a cold body (left side of the box) to a hot body (the right side of the box). Since the demon imparts essentially zero energy to the system via opening and closing the gate, this transfer is done with no other change, thus seemingly providing a violation of the second law. This is the problem.

We can rephrase this in terms of entropy change. To illustrate the point, it is sufficient to consider the simple case of N particles in the volume V forming an ideal gas, with the demon separating them into two groups of N/2 particles in volume V/2 each. If the initial temperature is T and the final temperatures are T₁ and T₂, then the conservation of energy gives T = (T₁ + T₂)/2. Further, we can use the Sackur-Tetrode formula (7.2.25) for the entropies,

\[ S = N k \left[ \frac{5}{2} + \log\!\left( \frac{V}{N} \right) + \frac{3}{2} \log\!\left( \frac{U}{N} \right) + \frac{3}{2} \log\!\left( \frac{4\pi m}{3 (2\pi\hbar)^2} \right) \right] \tag{10.2.3} \]

The change in entropy when the demon separates the molecules is then obtained as

\[ \Delta S = S_1 + S_2 - S = \frac{3Nk}{2} \log\!\left( \frac{\sqrt{T_1 T_2}}{T} \right) \tag{10.2.4} \]

Since

\[ \left( \frac{T_1 + T_2}{2} \right)^2 = T_1 T_2 + \left( \frac{T_1 - T_2}{2} \right)^2 \geq T_1 T_2 \tag{10.2.5} \]

we see that ΔS ≤ 0. Thus the process ends up decreasing the entropy, in contradiction to the second law.

The resolution of this problem is in the fact that the demon must have information about the speeds of molecules to be able to let the fast ones to the right side and the slow ones to the left side. This means that using the Sackur-Tetrode formula for the entropy of the gas in the initial state is not right. We are starting off with a state of entropy (of gas and demon combined) which is less than what is given by Equation (10.2.3), once we include the information carried by (or obtained via the observation of velocities by) the demon, since the specification of more observables decreases the entropy as we have seen in the last section. While it is difficult to estimate this entropy quantitatively, the expectation is that with this smaller value of S to begin with, ΔS will come out to be positive and that there will be no contradiction with the second law. Of course, this means that we are considering a generalization of the second law, namely that the entropy of an isolated system does not decrease over time, provided all sources of entropy in the information-theoretic sense are taken into account.
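A quick numerical illustration of Equations (10.2.4) and (10.2.5) (added here; the temperatures are arbitrary sample values): for any splitting T₁ ≠ T₂ with (T₁ + T₂)/2 = T, the geometric mean √(T₁T₂) is below the arithmetic mean, so the naive entropy change is negative.

```python
from math import log, sqrt

k_B = 1.380649e-23       # J/K
T = 300.0                # initial temperature, K (assumed)
T1, T2 = 250.0, 350.0    # demon-sorted temperatures with (T1 + T2)/2 = T

dS_per_N = 1.5 * k_B * log(sqrt(T1 * T2) / T)   # Eq. (10.2.4) divided by N
print(f"sqrt(T1*T2) = {sqrt(T1*T2):.1f} K  (< arithmetic mean {T:.0f} K)")
print(f"naive entropy change per particle = {dS_per_N:.3e} J/K  (negative)")
```

The apparent decrease is tiny per particle but macroscopic for N of order Avogadro's number, which is why the informational entropy of the demon's measurement record cannot be neglected.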


10.3: Entropy and Gravity

There is something deep about the concept of entropy which is related to gravity. This is far from being well understood, and is a topic of ongoing research, but there are good reasons to think that the Einstein field equations for gravity may actually emerge as some sort of entropy maximization condition. A point of contact between gravity and entropy is for spacetimes with a horizon, an example being a black hole. In an ultimate theory of quantum gravity, spacetimes with a horizon may turn out to be nothing special, but for now, they may be the only window to the connection between entropy and gravity. To see something of the connection, we look at a spherical solution to the Einstein equations, corresponding to the metric around a point (or spherical distribution of) mass. This is the Schwarzschild metric, given as

\[ ds^2 = c^2 dt^2 \left( 1 - \frac{2GM}{r c^2} \right) - \frac{dr^2}{\left( 1 - \frac{2GM}{r c^2} \right)} - r^2\, d\theta^2 - r^2 \sin^2\theta\, d\varphi^2 \tag{10.3.1} \]

We are writing this in the usual spherical coordinates (r, θ, φ) for the spatial dimensions. G is Newton's gravitational constant and c is the speed of light in vacuum. We can immediately see that there are two singularities in this expression. The first is obviously at r = 0, similar to what occurs in Newton's theory for the gravitational potential, and the second is at r = 2GM/c². This second singularity is a two-sphere since it occurs at finite radius. Now, one can show that r = 0 is a genuine singularity of the theory, in the sense that it cannot be removed by a coordinate transformation. The singularity at r = 2GM/c² is a coordinate singularity. It is like the singularity at θ = 0, π when we use spherical coordinates, and can be eliminated by choosing a different set of coordinates. Nevertheless, the radius 2GM/c² does have an important role. The propagation of light, in the ray optics approximation, is described by ds = 0. As a result, one can see that nothing can escape from r < 2GM/c² to larger values of the radius, to be detected by observers far away. An observer far away who is watching an object falling to the center will see the light coming from it being redshifted due to the (1 − 2GM/rc²) factor, eventually being redshifted to zero frequency as it crosses r = 2GM/c²; the object fades out. For this reason, and because it is not a real singularity, we say that the sphere at r = 2GM/c² is a horizon. Because nothing can escape from inside the horizon, the region inside is a black hole. The value 2GM/c² is called the Schwarzschild radius.

Are there examples of black holes in nature? The metric (10.3.1) can be used to describe the spacetime outside of a nearly spherical matter distribution such as a star or the Sun. For the Sun, with a mass of about 2 × 10³⁰ kg, the Schwarzschild radius is about 3 km. The form of the metric in (10.3.1) ceases to be valid once we pass inside the surface of the Sun, and so there is no horizon physically realized for the Sun (and for most stars). (Outside of the gravitating mass, one can use (10.3.1), which is how observable predictions of Einstein's theory such as the precession of the perihelion of Mercury are obtained.) But consider a star which is more massive than the Chandrasekhar and Tolman-Oppenheimer-Volkoff limits. If it is massive enough to contract gravitationally, overcoming even the quark degeneracy pressure, its radius can shrink below its Schwarzschild radius and we can get a black hole. The belief is that there is such a black hole at the center of our galaxy, and of most other galaxies as well.

Returning to the physical properties of black holes, although classical theory tells us that nothing can escape a black hole, a most interesting effect is that black holes radiate. This is a quantum process. A full calculation of this process cannot be done without a quantum theory of gravity (which we do not yet have). So, while the fact that black holes must radiate can be argued in generality, the nature of the radiation can only be calculated in a semiclassical way. The result of such a semiclassical calculation is that, irrespective of the nature of the matter which went into the formation of the black hole, the radiation which comes out is thermal, following the Planck spectrum, corresponding to a certain temperature

\[ T_H = \frac{\hbar c^3}{8\pi k\, G M} \tag{10.3.2} \]

Although related processes were understood by many scientists, the general argument for radiation from black holes was due to Hawking, and hence the radiation from any spacetime horizon and the corresponding temperature are referred to as the Hawking radiation and Hawking temperature, respectively.

Because there is a temperature associated with a black hole, we can think of it as a thermodynamic system obeying

\[ dU = T\, dS \tag{10.3.3} \]

The internal energy can be taken as Mc², following the Einstein mass-energy equivalence. We can then use Equation (10.3.3) to calculate the entropy of a black hole as


\[ S_{B\text{-}H} = \frac{k c^3}{\hbar}\, \frac{A}{4 G} \tag{10.3.4} \]

(This formula for the entropy is known as the Bekenstein-Hawking formula.) Here A is the area of the horizon, A = 4πR_S², with R_S = 2GM/c² being the Schwarzschild radius.

These results immediately bring up a number of puzzles.

1. A priori, there is nothing thermodynamic about the Schwarzschild metric or the radiation process. The radiation can be obtained from the quantized version of the Maxwell equations in the background spacetime (10.3.1). So how do thermodynamic concepts arise in this case?

2. One could envisage forming a black hole from a very ordered state of very low entropy. Yet once the black hole forms, the entropy is given by (10.3.4). There is nothing wrong with generating more entropy, but how did we lose the information coded into the low entropy state? Further, the radiation coming out is thermal and hence carries no information. So is there any way to understand what happened to it? These questions can be sharpened further. First of all, we can see that the Schwarzschild black hole can evaporate away by Hawking radiation in a finite time. This is because the radiation follows the Planck spectrum and so we can use the Stefan-Boltzmann law (8.3.14) to calculate the rate of energy loss. Then from

\[ \frac{d(M c^2)}{dt} = -\sigma\, T_H^4\, A \tag{10.3.5} \]

we can obtain the evaporation time. Now, there is a problem with the radiation being thermal. Time-evolution in the quantum theory is by unitary transformations and these do not generate any entropy. So if we make a black hole from a very low entropy state and then it evaporates into thermal radiation, which is a high entropy state, how is this compatible with unitary time-evolution? Do we need to modify quantum theory, or do we need to modify the theory of gravity?

3. Usually, when we have nonzero entropy, we can understand that in terms of microscopic counting of states. Is the number of states of a black hole proportional to e^{S_{B-H}/k}? Is there a quantitative way to show this?

4. The entropy is proportional to the area of the horizon. Usually, entropy is extensive and the number of states is proportional to the volume (via things like ∫ d³x d³p/(2πħ)³). How can all the states needed for a system be realized in terms of a lower dimensional surface?

There are some tentative answers to some of these questions. Although seemingly there is a problem with unitary time-evolution, this may be because we cannot do a full calculation. The semiclassical approximation breaks down for very small black holes. So we cannot reliably calculate the late stages of black hole evaporation. Example calculations with black holes in lower dimensions can be done using string theory, and this suggests that time-evolution is indeed unitary and that information is recovered in the correlations in the radiation which develop in later stages.

For most black hole solutions, there is no reliable counting of microstates which leads to the formula (10.3.4). But there are some supersymmetric black holes in string theory for which such a counting can be done using techniques special to string theory. For those cases, one does indeed get the formula (10.3.4). This suggests that string theory could provide a consistent quantum theory of black holes and, more generally, of spacetimes with horizons. It could also be that the formula (10.3.4) has such universality (as many things in thermodynamics do) that the microscopic theory may not matter, and that if we learn to do the counting of states correctly, any theory which has quantum gravity will lead to (10.3.4), with, perhaps, calculable additional corrections (which are subleading, i.e., less extensive than area).

The idea that a lower dimensional surface can encode enough information to reconstruct dynamics in a higher dimensional space is similar to what happens in a hologram. So perhaps to understand the entropy formula (10.3.4), one needs a holographic formulation of physical laws. Such a formulation is realized, at least for a restricted class of theories, in the so-called AdS/CFT correspondence (or holographic correspondence) and its later developments. The original conjecture for this is due to J. Maldacena and states that string theory on an anti-de Sitter (AdS) spacetime background in five dimensions (with an additional 5-sphere) is dual to the maximally supersymmetric Yang-Mills gauge theory (which is a conformal field theory (CFT)) on the boundary of the AdS space. One can, in principle, go back and forth, calculating quantities in one using the other. Although still a conjecture, this does seem to hold for all cases where calculations have been possible.

It is clear that this is far from a finished story. But from what has been said so far, there is good reason to believe that research over the next few years will discover some deep connection between gravity and entropy.
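As a final numerical aside (added for illustration, not part of the original text), one can put numbers into Equations (10.3.2), (10.3.4) and (10.3.5) for a solar-mass black hole. Integrating (10.3.5) with A = 4πR_S² and T_H ∝ 1/M gives an evaporation time t ∝ M³; the closed-form prefactor used below follows from that integration for this simple photon-only model.

```python
from math import pi

hbar, c, G, k_B = 1.054571817e-34, 2.99792458e8, 6.67430e-11, 1.380649e-23
sigma = 5.670374419e-8          # Stefan-Boltzmann constant, W m^-2 K^-4
M_sun = 1.989e30                # kg
yr = 3.156e7                    # s

M = M_sun
R_s = 2 * G * M / c**2
T_H = hbar * c**3 / (8 * pi * k_B * G * M)               # Eq. (10.3.2)
S_BH = k_B * c**3 * (4 * pi * R_s**2) / (4 * G * hbar)   # Eq. (10.3.4)

# Integrating d(M c^2)/dt = -sigma T_H^4 (4 pi R_s^2) from M down to 0 gives
# t = 256 pi^3 k_B^4 G^2 M^3 / (3 sigma hbar^4 c^6)   (this simple model only)
t_evap = 256 * pi**3 * k_B**4 * G**2 * M**3 / (3 * sigma * hbar**4 * c**6)

print(f"R_s    = {R_s/1e3:.2f} km")
print(f"T_H    = {T_H:.2e} K")
print(f"S_BH   = {S_BH:.2e} J/K   (S/k ~ {S_BH/k_B:.2e})")
print(f"t_evap ~ {t_evap/yr:.2e} years")
```

The numbers make the puzzles above vivid: the Hawking temperature of a solar-mass black hole is of order 10⁻⁷ K, its entropy S/k is of order 10⁷⁷ (enormously larger than that of the star that formed it), and its evaporation time is of order 10⁶⁷ years.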


Bibliography

Some useful reference books are:

1. H.B. Callen, Thermodynamics and an Introduction to Thermostatistics (John Wiley & Sons, 1985)
2. D. Chandler, Introduction to Modern Statistical Mechanics (Oxford University Press, 1987)
3. W. Greiner, L. Neise and H. Stöcker, Thermodynamics and Statistical Mechanics (Springer, 1995)
4. K. Huang, Statistical Mechanics (John Wiley & Sons, Inc., 1987)
5. R. Kubo, Thermodynamics (North-Holland Publishing Co., 1968)
6. F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill Book Co., 1965)
7. D.V. Schroeder, An Introduction to Thermal Physics (Addison Wesley Longman, 2000)
8. M.W. Zemansky and R.H. Dittman, Heat and Thermodynamics (7th Ed.) (The McGraw-Hill Companies, Inc., 1997)

The book by Chandrasekhar, referred to in chapter 9, is S. Chandrasekhar, An Introduction to the Study of Stellar Structure (Dover Publications, Inc., 1967)


Index

A
absolute zero - 4: The Third Law of Thermodynamics
adiabatics - 2.1: The First Law

B
Binomial distribution - 7.1: The Binomial Distribution
Black body radiation - 8.4: Applications of the Bose-Einstein Distribution

C
Carnot cycle - 3.1: Carnot Cycle
Chandrasekhar limit - 10.3: Entropy and Gravity
coefficient of expansion - 6.2: Other Relations
cumulant expansion - 7.5: Equation of State

D
Diesel cycle - 3.5: Some Other Thermodynamic Engines

E
empirical temperature - 1.2: The Zeroth Law of Thermodynamics
equipartition theorem - 7.2: Maxwell-Boltzmann Statistics

F
first law of thermodynamics - 2.1: The First Law
fluctuations - 7.6: Fluctuations
fugacity - 7.5: Equation of State

G
Gibbs paradox - 7.2: Maxwell-Boltzmann Statistics
grand canonical partition function - 7.5: Equation of State

H
Hawking radiation - 10.3: Entropy and Gravity
Hawking temperature - 10.3: Entropy and Gravity
holographic correspondence - 10.3: Entropy and Gravity

I
ideal gas equation of state - 1.3: Equation of State
internal energy - 2.1: The First Law
isothermal compressibility - 6.2: Other Relations

M
Maxwell distribution - 7.3: The Maxwell Distribution For Velocities
Maxwell relations - 6.1: Maxwell Relations
Maxwell's demon - 10.2: Maxwell's Demon

O
osmotic pressure - 7.8: Examples
Otto cycle - 3.5: Some Other Thermodynamic Engines

P
perfect differential - 6.1: Maxwell Relations
Planck distribution - 8.4: Applications of the Bose-Einstein Distribution

S
Schwarzschild metric - 10.3: Entropy and Gravity
Schwarzschild radius - 10.3: Entropy and Gravity
second virial coefficient - 7.5: Equation of State
Shannon entropy - 10.1: Information
specific heat - 2.1: The First Law
speed of sound - 2.3: Barometric Formula and the Speed of Sound
Stirling approximation - 7.1: The Binomial Distribution

T
thermodynamic coordinates - 1.1: Definitions
thermodynamic equilibrium - 5.2: Thermodynamic Equilibrium

V
van der Waals equation of state - 1.3: Equation of State
virial expansion - 7.5: Equation of State

Z
zeroth law of thermodynamics - 1.2: The Zeroth Law of Thermodynamics