
Advanced Calculus 2 for Electrical Engineers

MATH-212/ECE-206

FALL TERM, 2013

Andrew J. Heunis ©

Department of Electrical and Computer Engineering

University of Waterloo

Waterloo

Ontario N2L 3G1

February 20, 2014

Contents

1 Goals and Preview

2 Multidimensional Integration
2.1 Two Dimensional Integration
2.2 Three Dimensional Integration

3 Scalar and Vector Fields
3.1 Motivating Examples
3.2 Definition of Vector and Scalar Fields

4 Curves and Paths in Space
4.1 Motivating Examples
4.2 Paths and Parametric Representation of Curves
4.3 Derivatives Along a Path and Tangent to a Curve
4.4 Simple Curves and Closed Curves

5 Line Integral and Arc Length
5.1 Line Integral of a Vector Field
5.2 Line Integral of Scalar Field and Arc Length

6 Conservative Vector Fields
6.1 Gradient of a Scalar Field
6.2 Conservative Vector Fields
6.3 Conservation of Energy

7 Green’s Theorem in the Plane
7.1 Green’s Theorem for Rectangles
7.2 Green’s Theorem: General Case

8 Surfaces, Surface Area and Surface Integrals
8.1 Parametric Representation of Surfaces
8.2 Tangents to a Surface and Smooth Surfaces
8.3 Area of a Surface
8.4 Surface Integral of a Scalar Field
8.5 Surface Integral of a Vector Field

9 Vector Calculus
9.1 Differential Operators of Vector Calculus: Divergence, Curl, Laplacian
9.2 Theorem of Stokes
9.3 Divergence Theorem of Gauss-Ostrogradskii
9.4 The Continuity Equation

10 The Basic Laws of Electricity and Magnetism
10.1 Static Electric Fields
10.2 Static Magnetic Fields
10.3 Time Varying Fields

11 Maxwell’s Equations
11.1 The Ampere-Maxwell Law for Time Varying Fields
11.2 Maxwell’s Equations
11.3 Electromagnetic Waves without Sources
11.4 Electromagnetic Waves with Sources

12 Cylindrical and Spherical Coordinates
12.1 Polar Coordinates
12.2 Cylindrical Coordinates
12.3 Spherical Coordinates

Chapter 1

Goals and Preview

This course continues the sequence of calculus courses that you have taken during the last several

years, and is specifically about vector calculus and the calculus of complex variables. Vector calculus

builds upon the elementary calculus you have learned, but is far more powerful than this elementary

calculus. Indeed, comparing vector calculus with elementary calculus is like comparing a turbo-

charged Mercedes Benz with an oxcart. The power of vector calculus is really a consequence of its

main theorems; these are Green’s theorem, Stokes’ theorem and the Gauss-Ostrogradskii theorem,

and we shall see all of these in the course. The ideas and theorems of vector calculus are completely

indispensable for modern science and technology, and are used in electromagnetism, aerodynamics,

fluid mechanics, classical mechanics, quantum mechanics and gravitational physics. In particular,

vector calculus is the essential tool for really understanding the laws of electricity and magnetism,

for vector calculus enables one to effectively compress every single known law of electricity and

magnetism into a set of just four equations, called Maxwell’s equations. In this course we shall apply

the tools of vector calculus to a preliminary study of the main properties of Maxwell’s equations,

and you will be seeing much more on these equations in follow-up courses on electromagnetic fields

and electromagnetic waves.

Concerning Maxwell’s equations the physicist Richard P. Feynman (Nobel Prize in Physics,

1965) stated “From a long view of the history of mankind - seen from, say, ten thousand years from

now - there can be little doubt that the most significant event of the 19th century will be judged as

Maxwell’s discovery of the laws of electrodynamics” (see Richard P. Feynman, “The Feynman Lectures on Physics, Volume II: Mainly Electromagnetism and Matter”, Chapter 1, Section 6, “Electromagnetism in science and technology”). That is, the discovery of Maxwell’s equations
(c. 1861 - 1865) transcends in importance absolutely everything that took place in the 19th century,
including the depressing litany of wars, revolutions, colonizations, exploitations and land-grabs, and

the usual political double-dealing, finagling, back-stabbing and horse-trading. Ten thousand years

from now all of that will have been largely forgotten, but the priceless value of Maxwell’s equations

will remain. In fact, without Maxwell’s equations, there would be quite literally no modern science

or modern technology, and therefore of course no modern civilization either. Today Maxwell’s

equations are indispensable to scientists working at the very frontiers of physics, on problems of

high energy physics, gravitational physics and quantum electrodynamics, and are just as essential for

engineering applications, for these equations are the very key to radio, television, radar, microwave

ovens, microwave communications, space satellites, cell phones, the internet ..... the list is virtually

endless!

Here are Maxwell’s marvelous equations:

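$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0},$

$\nabla \cdot \mathbf{B} = 0,$

$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0,$

$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}.$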

These equations will likely look impenetrable to you now, but you will be quite familiar with them

by the end of this course. At this point you may perhaps recognize some of the symbols from

elementary physics, such as the electric field E and the magnetic field B, and we will shortly find
out about the charge density field ρ and the current density field J. However, what are the bizarre
looking symbols ∇· and ∇×, and how does one extract such an enormous history-transforming
punch out of these equations? The answers to these questions are to be found in the power of the

vector calculus that we are going to study in this course.


Chapter 2

Multidimensional Integration

Vector calculus involves a number of rather fancy integrals that we shall study later in the course,

such as line integrals and surface integrals. In fact, the basic theorems which give vector calculus its

extraordinary power (that is Green’s theorem, theorem of Stokes, theorem of Gauss-Ostrogradskii),

can only be formulated in terms of line and surface integrals. However, these integrals, and especially
the surface integral, in their turn rely on the simpler ideas of multidimensional integration,
specifically two dimensional “dx dy” integrals (or double integrals) and three dimensional “dx dy dz”
integrals (or triple integrals). We are therefore going to devote this chapter to recalling the main
ideas of multidimensional integration. You should also note that multidimensional integrals serve
not only the requirements of vector calculus, but are also indispensable in areas such as probability
and communications, where vector calculus does not naturally arise. The ideas of the present
chapter are therefore also prerequisites for later courses that you will take on probability and
communications.

2.1 Two Dimensional Integration

In the present section we focus on two dimensional integration, that is the integration of functions

defined on portions of the plane R2. You are likely already familiar with two dimensional integration,

but in view of its huge importance we shall briefly recall the main aspects of two dimensional

integration here. Suppose we have a real-valued function

(2.1.1) f : D → R,

in which D ⊂ R² is the rectangle shown in Figure 2.1. The sides of the rectangle D are the intervals
a ≤ x ≤ b and c ≤ y ≤ d, and in the notation of sets we write D as

(2.1.2) $D = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b \text{ and } c \le y \le d\}.$

Figure 2.1: Rectangle D in the plane R²

For the sake of brevity we will usually denote this rectangle in the following “mathematical” notation

(2.1.3) D = [a, b]× [c, d].

in which the intervals a ≤ x ≤ b and c ≤ y ≤ d are indicated by the abbreviated notations [a, b]

and [c, d]. We now define what is meant by the integral of the function f over the rectangle D. To

this end subdivide the interval a ≤ x ≤ b into n + 1 equally spaced points xj and subdivide the

interval c ≤ y ≤ d into n+ 1 equally spaced points yk, that is

(2.1.4) a = x0 < x1 < . . . < xn = b, c = y0 < y1 < . . . < yn = d,

with spacing ∆x and ∆y between successive subdivision points

(2.1.5) $\Delta x := x_{j+1} - x_j = \frac{b - a}{n}, \qquad \Delta y := y_{k+1} - y_k = \frac{d - c}{n},$

(see Figure 2.2). Let Djk be the (small) rectangle given by

(2.1.6) $D_{jk} := \{(x, y) \in \mathbb{R}^2 \mid x_j \le x \le x_{j+1} \text{ and } y_k \le y \le y_{k+1}\} \equiv [x_j, x_{j+1}] \times [y_k, y_{k+1}],$

and fix some point rjk := (ξj, ηk) in Djk, that is xj ≤ ξj ≤ xj+1 and yk ≤ ηk ≤ yk+1 (again see
Figure 2.2).

Figure 2.2: Rectangles Djk and D in the plane R²

Now define the Riemann sum of the function f on the rectangle D:

(2.1.7) $S_n := \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} f(\xi_j, \eta_k)\, \Delta x\, \Delta y,$

for each n = 1, 2, . . .. We can define the integral of the function f over the rectangle D as follows:

Definition 2.1.1. If the sequence of Riemann sums Sn, n = 1, 2, . . . converges to a limit S as

n → ∞, and the limit S is the same for every choice of points (ξj, ηk) in Djk, then S is called the

integral of the function f over the rectangle D.
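Definition 2.1.1 can be tried out numerically. The following Python sketch (not part of the original notes) evaluates the Riemann sum (2.1.7) for the illustrative function f(x, y) = x²y on D = [0, 1] × [0, 2], whose integral is 2/3. The function name `riemann_sum_2d`, the parameter `t` that selects the sample point inside each Djk, and the test function itself are assumptions made purely for illustration.

```python
def riemann_sum_2d(f, a, b, c, d, n, t=0.5):
    """Riemann sum S_n of (2.1.7) for f on the rectangle [a, b] x [c, d].

    The sample point in each small rectangle D_jk is
    (x_j + t*dx, y_k + t*dy) with 0 <= t <= 1. The knob t is hypothetical:
    t = 0.5 picks the midpoint, t = 0 the lower-left corner.
    """
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for j in range(n):
        xi = a + (j + t) * dx            # xi_j lies in [x_j, x_{j+1}]
        for k in range(n):
            eta = c + (k + t) * dy       # eta_k lies in [y_k, y_{k+1}]
            total += f(xi, eta) * dx * dy
    return total


def f(x, y):
    # Illustrative integrand (an assumption, not taken from the notes).
    return x * x * y


if __name__ == "__main__":
    for n in (10, 100, 400):
        midpoint = riemann_sum_2d(f, 0.0, 1.0, 0.0, 2.0, n)
        corner = riemann_sum_2d(f, 0.0, 1.0, 0.0, 2.0, n, t=0.0)
        print(n, midpoint, corner)
```

Both choices of sample point approach 2/3 as n grows, as Definition 2.1.1 requires, although the midpoint choice gets there with far fewer subdivisions.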

Remark 2.1.2. The various notations for the integral in Definition 2.1.1 are

(2.1.8) $\int_D f(x, y)\, dx\, dy, \qquad \int_D f(x, y)\, dA, \qquad \int_D f\, dx\, dy, \qquad \int_D f\, dA.$

The essential elements in all of these notations are the subscript D attached to the integral, indicating
the region in R² over which one integrates, and the integrand f indicating the function being
integrated. The symbol “dA” is effectively just shorthand for “dx dy”. The first two notations

explicitly remind us that we are integrating over D with respect to an underlying space variable in

R2 which is generically denoted by (x, y). It can be quite tedious to keep carrying the space variable

(x, y), and so, in the third and fourth notations of (2.1.8), this variable is suppressed, but always

understood to be present!


Remark 2.1.3. Definition 2.1.1 raises a number of questions. What is the situation if the sequence

of Riemann sums Sn, n = 1, 2, . . . fails to converge to any limit? In this case the integral of

the function f over the rectangle D does “not make sense” and is said to be undefined. Another

very natural question: under what conditions on f can we be sure that the integral of f over a

rectangle D “makes sense” (or is defined) in the sense of Definition 2.1.1? This is a rather profound

question, the answer to which is given by a branch of mathematics called the abstract theory of

Lebesgue measure and integration. Fortunately, we need never be concerned with this question, for

the abstract theory of measure and integration tells us that the class of functions which can be

integrated over D is simply huge, and we are completely safe in assuming that every function that

we shall encounter has an integral which is defined.

Remark 2.1.4. It is one thing to formulate the definition of an integral, as we have done in

Definition 2.1.1, quite another matter to actually calculate the integral over a rectangle D of a given

function f . For this we need an essential result called Fubini’s theorem. To state this result suppose

that f : D → R and D is the rectangle at (2.1.3) i.e. shown in Figure 2.1. Now define the function

h1(x) for all a ≤ x ≤ b as follows:

(2.1.9) $h_1(x) := \int_c^d f(x, y)\, dy, \quad \text{for all } a \le x \le b.$

It is most important to understand the sense in which the integration on the right side of (2.1.9)

is meant: We fix some x in the interval a ≤ x ≤ b which leaves us with a function f(x, y) which

depends only on y in the interval c ≤ y ≤ d (since x is fixed). The right side of (2.1.9) is the integral

of this function with respect to y in the interval c ≤ y ≤ d. Of course, for different choices of x in

the interval a ≤ x ≤ b, we generally get different values for the integral, that is we get a function

h1(x) of x over the interval a ≤ x ≤ b. The important thing is that the integral in (2.1.9) is just

an ordinary single-variable integral over an interval, and this is usually quite easy to evaluate. In

exactly the same way we also define the function h2(y) for all c ≤ y ≤ d as follows:

(2.1.10) $h_2(y) := \int_a^b f(x, y)\, dx, \quad \text{for all } c \le y \le d.$

Having defined the integrals at (2.1.9) and (2.1.10) we can state

Theorem 2.1.5 (Fubini theorem for rectangles in R2). Suppose that f : D → R where D is the

rectangle at (2.1.3), and h1(x) and h2(y) are defined by (2.1.9) and (2.1.10) respectively. Then

(2.1.11) $\int_D f(x, y)\, dx\, dy = \int_a^b h_1(x)\, dx = \int_c^d h_2(y)\, dy.$

Remark 2.1.6. The equalities at (2.1.11) are usually written in a more complete and self-contained

way as follows:

(2.1.12) $\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_c^d f(x, y)\, dy \right\} dx = \int_c^d \left\{ \int_a^b f(x, y)\, dx \right\} dy.$

Notice that in the right-hand integral in (2.1.12) we first keep y fixed and integrate with respect

to x to get the “inner integral” in braces, and then integrate with respect to y, whereas for the

middle integral in (2.1.12) we do things the other way around. The absolutely essential thing

about Fubini’s theorem is that it reduces evaluation of an integral over a rectangle to the successive

evaluations of two integrals over intervals. These are called iterated integrals. Either we can use

the iterated integral in the middle or the iterated integral on the right of (2.1.12). Both choices will

work (Fubini’s theorem guarantees this!) but in practice it is often the case that one choice involves

less work than the other.

Example 2.1.7. For the function f : D → R given by

(2.1.13) $f(x, y) = x^2 + y^2, \quad \text{with } D := [-1, 1] \times [0, 1],$

evaluate the integral ∫_D f(x, y) dx dy.

We use Fubini’s theorem with a = −1, b = 1, c = 0 and d = 1. Following (2.1.9), put

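(2.1.14)
$\begin{aligned}
h_1(x) &:= \int_c^d f(x, y)\, dy \\
&= \int_0^1 [x^2 + y^2]\, dy && \text{(from (2.1.13))} \\
&= \left[ x^2 y + \frac{y^3}{3} \right]_{y=0}^{y=1} && \text{(keeping } x \text{ constant in the } dy\text{-integration)} \\
&= x^2 + \frac{1}{3}.
\end{aligned}$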

Now use Fubini’s theorem in the form of (2.1.11):

(2.1.15) $\int_D f(x, y)\, dx\, dy = \int_a^b h_1(x)\, dx = \int_{-1}^{1} \left\{ x^2 + \frac{1}{3} \right\} dx = \frac{4}{3}.$

You should now repeat this calculation, but using (2.1.10) instead of (2.1.9), to verify that you get

the same value for the integral.
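As a numerical cross-check of Example 2.1.7, the sketch below (Python, not part of the original notes) approximates both iterated integrals of (2.1.12) with a simple midpoint rule. The helper `integrate_1d` and the number of subintervals are assumptions chosen for illustration; any single-variable quadrature rule would serve equally well.

```python
def integrate_1d(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def f(x, y):
    return x ** 2 + y ** 2               # integrand of (2.1.13)


a, b, c, d = -1.0, 1.0, 0.0, 1.0         # D = [-1, 1] x [0, 1]


def h1(x):
    # (2.1.9): integrate in y with x held fixed
    return integrate_1d(lambda y: f(x, y), c, d)


def h2(y):
    # (2.1.10): integrate in x with y held fixed
    return integrate_1d(lambda x: f(x, y), a, b)


dy_then_dx = integrate_1d(h1, a, b)      # middle integral of (2.1.12)
dx_then_dy = integrate_1d(h2, c, d)      # right-hand integral of (2.1.12)

if __name__ == "__main__":
    print(dy_then_dx, dx_then_dy, 4.0 / 3.0)
```

Both orders of integration agree with the value 4/3 found at (2.1.15) to within the accuracy of the midpoint rule.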


Remark 2.1.8. Suppose we must integrate a function f : D → R when D ⊂ R² is not a rectangle
with sides parallel to the x-y axes, unlike in all of the previous double integrals. An obvious example

of such a non-rectangular region is the unit disc, that is the disc of unit radius centered at the

origin of R2. We cannot directly evaluate an integral over the unit disc by Fubini’s theorem, which

is restricted to integration over rectangular regions of R2. However, we can easily modify Fubini’s

theorem to integrate over a large class of non-rectangular regions D ⊂ R2 provided that these

regions are not too complicated. To formulate such a region suppose that φ1 : [a, b] → R and

φ2 : [a, b]→ R are given continuous functions over some fixed interval a ≤ x ≤ b such that

(2.1.16) φ1(x) ≤ φ2(x) for all a ≤ x ≤ b.

The region D ⊂ R2 is called y-simple with lower function φ1(x), upper function φ2(x) and common

interval of definition a ≤ x ≤ b, when D is the set of all points (x, y) such that (see Figure 2.3)

(2.1.17) a ≤ x ≤ b and φ1(x) ≤ y ≤ φ2(x),

that is

(2.1.18) $D = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ \phi_1(x) \le y \le \phi_2(x)\}.$

In short, a y-simple region D ⊂ R2 with lower function φ1 : [a, b] → R and upper function

φ2 : [a, b]→ R is lower bounded by the graph of the function φ1(x) and upper bounded by the graph

of the function φ2(x). From now on we fix constants c and d such that

(2.1.19) c < φ1(x) ≤ φ2(x) < d for all a ≤ x ≤ b.

It then follows that the region D is contained within the rectangle E defined by

(2.1.20) $E := \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b \text{ and } c \le y \le d\} = [a, b] \times [c, d],$

(see Figure 2.3). Now define the function f ∗ on the rectangle E as

(2.1.21) $f^*(x, y) := \begin{cases} f(x, y), & \text{for all } (x, y) \text{ in } D, \\ 0, & \text{for all } (x, y) \text{ in } E \text{ but outside } D. \end{cases}$

It is then evident that

(2.1.22) $\int_D f(x, y)\, dx\, dy = \int_E f^*(x, y)\, dx\, dy.$

Figure 2.3: y-simple region D

We now evaluate the integral over the rectangle E on the right of (2.1.22) by Fubini’s theorem in
the form of the identity (2.1.12) with f* in place of f, and making use of the middle integral of
(2.1.12), that is

(2.1.23) $\int_E f^*(x, y)\, dx\, dy = \int_a^b \left\{ \int_c^d f^*(x, y)\, dy \right\} dx.$

From (2.1.21) and (2.1.19), for each fixed x in the interval a ≤ x ≤ b we must have

(2.1.24) $f^*(x, y) := \begin{cases} f(x, y), & \text{for all } \phi_1(x) \le y \le \phi_2(x), \\ 0, & \text{when either } c \le y < \phi_1(x) \text{ or } \phi_2(x) < y \le d. \end{cases}$

In view of (2.1.24), in the “inner integral” over c ≤ y ≤ d appearing in (2.1.23) the upper limit of

integration d can be replaced with φ2(x) and the lower limit of integration c can be replaced with

φ1(x) without changing the value of the integral, that is

(2.1.25) $\int_c^d f^*(x, y)\, dy = \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \quad \text{for all } a \le x \le b,$

so that (2.1.25) and (2.1.23) then give

(2.1.26) $\int_E f^*(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \right\} dx.$

Upon combining (2.1.26) and (2.1.22) we find that

$\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \right\} dx.$

This result is so useful that we repeat it stated as a theorem:

Theorem 2.1.9 (Fubini for y-simple region in R2). Suppose that D is any y-simple region with

lower function φ1(x), upper function φ2(x) and common interval of definition a ≤ x ≤ b (see Figure

2.3), and f : D → R is a given function. Then

(2.1.27) $\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \right\} dx.$
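The argument leading to (2.1.27) can be checked numerically. The sketch below (Python, not part of the original notes) compares a Riemann sum of the zero-extended function f* over the enclosing rectangle E, as in (2.1.22), with the iterated integral on the right of (2.1.27). The region between φ1(x) = x² and φ2(x) = x on 0 ≤ x ≤ 1, the integrand f(x, y) = x + y, the constants c = −1 and d = 2, and all function names are assumptions chosen for illustration; for this choice the exact value is 3/20.

```python
def integrate_1d(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


# Hypothetical y-simple region and integrand (not from the notes).
def phi1(x):
    return x * x                         # lower function


def phi2(x):
    return x                             # upper function


def f(x, y):
    return x + y


a, b = 0.0, 1.0
c, d = -1.0, 2.0                         # c < phi1 <= phi2 < d, as in (2.1.19)


def f_star(x, y):
    """Zero extension (2.1.21) of f from D to the rectangle E."""
    return f(x, y) if phi1(x) <= y <= phi2(x) else 0.0


def riemann_sum_over_E(n):
    """Midpoint Riemann sum of f_star over E = [a, b] x [c, d]."""
    dx = (b - a) / n
    dy = (d - c) / n
    return sum(
        f_star(a + (j + 0.5) * dx, c + (k + 0.5) * dy) * dx * dy
        for j in range(n)
        for k in range(n)
    )


def iterated_integral():
    """Right side of (2.1.27): dy from phi1(x) to phi2(x), then dx."""
    return integrate_1d(
        lambda x: integrate_1d(lambda y: f(x, y), phi1(x), phi2(x)), a, b
    )


if __name__ == "__main__":
    print(riemann_sum_over_E(400), iterated_integral(), 3.0 / 20.0)
```

The Riemann sum over E converges slowly because f* jumps across the boundary of D, whereas the iterated form needs only two single-variable integrals of a smooth function; this is the practical advantage of (2.1.27).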

The nice thing about (2.1.27) is that the right hand side is often very easy to calculate as the

next example shows:

Example 2.1.10. D ⊂ R2 is a y-simple region with lower function φ1(x) and upper function φ2(x)

defined by

(2.1.28) $\phi_1(x) := 0, \qquad \phi_2(x) := \sqrt{1 + \cos(x)}, \quad \text{for all } 0 \le x \le 2\pi.$

Note that 1 + cos(x) ≥ 0 for all 0 ≤ x ≤ 2π i.e. the square root in the definition of φ2(x) is real

(not imaginary) valued. The function f is defined by

(2.1.29) f(x, y) := 2y for all (x, y) in D.

Evaluate the integral of function f over the region D.

From (2.1.27) with a = 0 and b = 2π

(2.1.30) $\int_D f(x, y)\, dx\, dy = \int_0^{2\pi} \left\{ \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \right\} dx.$

For the inner dy-integral in (2.1.30) define

(2.1.31) $h_1(x) := \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy, \quad \text{for all } 0 \le x \le 2\pi,$

so that

(2.1.32) $\int_D f(x, y)\, dx\, dy = \int_0^{2\pi} h_1(x)\, dx.$

From (2.1.31) with (2.1.29) and (2.1.28)

(2.1.33) $h_1(x) = \int_{\phi_1(x)}^{\phi_2(x)} (2y)\, dy = \left[ y^2 \right]_{y=\phi_1(x)}^{y=\phi_2(x)} = \phi_2(x)^2 - \phi_1(x)^2 = 1 + \cos(x).$

From (2.1.32) with (2.1.33)

(2.1.34) $\int_D f(x, y)\, dx\, dy = \int_0^{2\pi} [1 + \cos(x)]\, dx = 2\pi.$
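The result of Example 2.1.10 can be confirmed with a short numerical sketch (Python, not part of the original notes). The helper names `integrate_1d` and `integrate_y_simple` are assumptions introduced here; the second one simply evaluates the right side of (2.1.27) with a midpoint rule.

```python
import math


def integrate_1d(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def integrate_y_simple(f, a, b, phi1, phi2, n=400):
    """Right side of (2.1.27): dy from phi1(x) to phi2(x), then dx over [a, b]."""
    def h1(x):
        return integrate_1d(lambda y: f(x, y), phi1(x), phi2(x), n)
    return integrate_1d(h1, a, b, n)


value = integrate_y_simple(
    f=lambda x, y: 2.0 * y,                                   # (2.1.29)
    a=0.0,
    b=2.0 * math.pi,
    phi1=lambda x: 0.0,                                       # (2.1.28)
    # max(...) only guards against a tiny negative rounding error
    phi2=lambda x: math.sqrt(max(0.0, 1.0 + math.cos(x))),
)

if __name__ == "__main__":
    print(value, 2.0 * math.pi)
```

The computed value agrees with 2π from (2.1.34) to within the accuracy of the midpoint rule.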

Remark 2.1.11. Complementary to the idea of a y-simple region is an x-simple region. To define

an x-simple region suppose that ψ1 : [c, d] → R and ψ2 : [c, d] → R are given continuous functions

over some fixed interval c ≤ y ≤ d such that

(2.1.35) ψ1(y) ≤ ψ2(y) for all c ≤ y ≤ d.

The region D ⊂ R2 is called x-simple with left function ψ1(y), right function ψ2(y) and common

interval of definition c ≤ y ≤ d, when D is the set of all points (x, y) such that (see Figure 2.4)

(2.1.36) c ≤ y ≤ d and ψ1(y) ≤ x ≤ ψ2(y),

that is

(2.1.37) $D = \{(x, y) \in \mathbb{R}^2 \mid c \le y \le d,\ \psi_1(y) \le x \le \psi_2(y)\}.$

We see that an x-simple region D ⊂ R2 with left function ψ1 : [c, d] → R and right function

ψ2 : [c, d]→ R is bounded on the left by the graph of the function ψ1(y) and bounded on the right

by the graph of the function ψ2(y). We then have the following analog of Theorem 2.1.9:

Theorem 2.1.12 (Fubini for x-simple region in R2). Suppose that D is any x-simple region with

left function ψ1(y), right function ψ2(y) and common interval of definition c ≤ y ≤ d (see Figure

2.4), and f : D → R is a given function. Then

(2.1.38) $\int_D f(x, y)\, dx\, dy = \int_c^d \left\{ \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\, dx \right\} dy.$

Figure 2.4: x-simple region D

Remark 2.1.13. Particularly useful in applications are regions that are both x-simple and y-simple
at the same time. Such regions are called regular regions. A region D ⊂ R² is therefore regular
when it is both lower bounded by a continuous function φ1 : [a, b] → R and upper bounded by a
continuous function φ2 : [a, b] → R (where φ1 and φ2 satisfy (2.1.16)), as well as left bounded by a
continuous function ψ1 : [c, d] → R and right bounded by a continuous function ψ2 : [c, d] → R (in
which ψ1 and ψ2 satisfy (2.1.35)). Put another way, a region D ⊂ R² is regular when it is given by

(2.1.39) $D = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ \phi_1(x) \le y \le \phi_2(x)\} = \{(x, y) \in \mathbb{R}^2 \mid c \le y \le d,\ \psi_1(y) \le x \le \psi_2(y)\},$

for some continuous functions φ1 : [a, b] → R, φ2 : [a, b] → R, ψ1 : [c, d] → R, ψ2 : [c, d] → R. For

such a region it is clear that both Theorem 2.1.9 and Theorem 2.1.12 must hold, namely

Theorem 2.1.14 (Fubini for a regular region in R2). Suppose that D ⊂ R2 is a regular region, that

is both y-simple with lower function φ1(x), upper function φ2(x), and common interval of definition

a ≤ x ≤ b, as well as x-simple with left function ψ1(y), right function ψ2(y), and common interval

of definition c ≤ y ≤ d. For a function f : D → R we have

(2.1.40) $\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \right\} dx = \int_c^d \left\{ \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\, dx \right\} dy.$

In order to integrate f on the regular region D we can evaluate either the middle integral or

the right hand integral in (2.1.40). As we shall see this flexibility of choice can be very useful;

sometimes one of these integrals is very much easier to evaluate than the other, in which case one

obviously evaluates the easier of the two integrals.


Example 2.1.15. Show that the triangular shaped region D ⊂ R2 in Figure 2.5 is a regular region.

Figure 2.5: Region D for Example 2.1.15

First show that D is y-simple: From Figure 2.5 it looks reasonable to define

(2.1.41) a := 1, b := 3, and φ1(x) := 1 for all 1 ≤ x ≤ 3.

To define the upper function φ2(x) we write out the equation of the straight line passing through

A and B, that is

(2.1.42) $y = \frac{x + 1}{2}.$

Now (2.1.42) shows that we must define the upper function by

(2.1.43) $\phi_2(x) := \frac{x + 1}{2} \quad \text{for all } 1 \le x \le 3.$

Next show that D is also x-simple: From Figure 2.5 it looks reasonable to take

(2.1.44) c := 1, d := 2, and ψ2(y) := 3 for all 1 ≤ y ≤ 2.

It remains to define the left function ψ1(y). For this we just rewrite (2.1.42), but putting x in terms

of y, that is

(2.1.45) x = 2y − 1,

and (2.1.45) shows that we must define the left function by

(2.1.46) ψ1(y) := 2y − 1 for all 1 ≤ y ≤ 2.
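The two descriptions of the triangle in Example 2.1.15 can be compared numerically. The sketch below (Python, not part of the original notes) integrates over D once as a y-simple region and once as an x-simple region, following (2.1.40). The helper names, the midpoint rule, and the trial integrand g(x, y) = xy are assumptions for illustration. The integral of the constant function 1 gives the area of the triangle, which is 1, and a direct hand calculation gives 19/6 for the integral of xy.

```python
def integrate_1d(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def integrate_y_simple(f, a, b, phi1, phi2):
    """Middle integral of (2.1.40): dy first, then dx."""
    return integrate_1d(
        lambda x: integrate_1d(lambda y: f(x, y), phi1(x), phi2(x)), a, b
    )


def integrate_x_simple(f, c, d, psi1, psi2):
    """Right-hand integral of (2.1.40): dx first, then dy."""
    return integrate_1d(
        lambda y: integrate_1d(lambda x: f(x, y), psi1(y), psi2(y)), c, d
    )


# y-simple description, from (2.1.41) and (2.1.43)
a, b = 1.0, 3.0
phi1 = lambda x: 1.0
phi2 = lambda x: (x + 1.0) / 2.0

# x-simple description, from (2.1.44) and (2.1.46)
c, d = 1.0, 2.0
psi1 = lambda y: 2.0 * y - 1.0
psi2 = lambda y: 3.0

one = lambda x, y: 1.0                   # constant integrand: gives the area
g = lambda x, y: x * y                   # hypothetical trial integrand

area_y = integrate_y_simple(one, a, b, phi1, phi2)
area_x = integrate_x_simple(one, c, d, psi1, psi2)
int_y = integrate_y_simple(g, a, b, phi1, phi2)
int_x = integrate_x_simple(g, c, d, psi1, psi2)

if __name__ == "__main__":
    print(area_y, area_x)
    print(int_y, int_x, 19.0 / 6.0)
```

Agreement between the two orders of integration is a useful sanity check that the boundary functions φ1, φ2, ψ1, ψ2 describe the same region.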


Example 2.1.16. Show that the disc of radius r centered at the point (α, β) in R2 (see Figure 2.6)

is a regular region.


We first show that D is y-simple. For this we must determine lower and upper functions φ1(x)

and φ2(x) defined on some common interval a ≤ x ≤ b. From Figure 2.6 it looks reasonable to fix

(2.1.47) a := α− r, b := α + r.

Now the equation of the circle ABCE is of course

(2.1.48) $(x - \alpha)^2 + (y - \beta)^2 = r^2.$

We use this to determine the lower function φ1(x) and upper function φ2(x). From (2.1.48) we find

$(y - \beta)^2 = r^2 - (x - \alpha)^2,$

that is

(2.1.49) $y = \beta \pm \sqrt{r^2 - (x - \alpha)^2}.$

From (2.1.49) it follows that the equation of the lower arc AEC is

(2.1.50) $\phi_1(x) = \beta - \sqrt{r^2 - (x - \alpha)^2}, \quad \text{for all } a \le x \le b,$

and the equation of the upper arc ABC is

(2.1.51) $\phi_2(x) = \beta + \sqrt{r^2 - (x - \alpha)^2}, \quad \text{for all } a \le x \le b.$

This shows that D is y-simple with lower function φ1(x) (see (2.1.50)) and upper function φ2(x)

(see (2.1.51)), and common interval a ≤ x ≤ b given by (2.1.47).

We next show that D is x-simple. For this we must determine left and right functions ψ1(y)

and ψ2(y) defined on some common interval c ≤ y ≤ d. From Figure 2.6 it looks reasonable to fix

(2.1.52) c := β − r, d := β + r.

From (2.1.48) we get (exactly as at (2.1.49))

(2.1.53) $x = \alpha \pm \sqrt{r^2 - (y - \beta)^2}.$

From (2.1.53) it follows that the equation of the left arc BAE is

(2.1.54) $\psi_1(y) = \alpha - \sqrt{r^2 - (y - \beta)^2}, \quad \text{for all } c \le y \le d,$

and the equation of the right arc BCE is

(2.1.55) $\psi_2(y) = \alpha + \sqrt{r^2 - (y - \beta)^2}, \quad \text{for all } c \le y \le d.$

This shows that D is x-simple with left function ψ1(y) (see (2.1.54)) and right function ψ2(y) (see
(2.1.55)) and common interval c ≤ y ≤ d given by (2.1.52).
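That the y-simple and x-simple descriptions of Example 2.1.16 really describe the same disc can be spot-checked on a grid of points. The sketch below (Python, not part of the original notes) tests membership in three ways: directly from the circle equation (2.1.48), from the y-simple description, and from the x-simple description. The particular values of α, β and r, the grid spacing, and all names are hypothetical choices for illustration.

```python
import math

ALPHA, BETA, R = 1.0, -2.0, 1.5          # hypothetical centre and radius


def in_disc(x, y):
    """Membership from the circle equation (2.1.48), with <= for the disc."""
    return (x - ALPHA) ** 2 + (y - BETA) ** 2 <= R ** 2


def in_y_simple(x, y):
    a, b = ALPHA - R, ALPHA + R                      # (2.1.47)
    if not (a <= x <= b):
        return False
    s = math.sqrt(max(0.0, R ** 2 - (x - ALPHA) ** 2))
    return BETA - s <= y <= BETA + s                 # (2.1.50), (2.1.51)


def in_x_simple(x, y):
    c, d = BETA - R, BETA + R                        # (2.1.52)
    if not (c <= y <= d):
        return False
    s = math.sqrt(max(0.0, R ** 2 - (y - BETA) ** 2))
    return ALPHA - s <= x <= ALPHA + s               # (2.1.54), (2.1.55)


# Grid of test points covering the disc and part of its surroundings.
points = [
    (ALPHA + R * 0.13 * m, BETA + R * 0.13 * n)
    for m in range(-10, 11)
    for n in range(-10, 11)
]

# True when all three membership tests agree at every grid point.
agree = all(
    in_disc(x, y) == in_y_simple(x, y) == in_x_simple(x, y)
    for x, y in points
)
inside = sum(in_disc(x, y) for x, y in points)

if __name__ == "__main__":
    print(agree, inside, len(points))
```

The grid spacing is chosen so that no test point falls on the boundary circle, which avoids spurious disagreements caused by floating-point rounding.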

Example 2.1.17. A region D ⊂ R2 is shown in Figure 2.7 and function f is defined on D by

(2.1.56) $f(x, y) := \exp\{y^3\}, \quad \text{for all } (x, y) \text{ in } D.$

Determine the integral of f on region D. It is clear that D is a y-simple region. In fact, it is

immediate from Figure 2.7 that, for the lower and upper functions φ1 : [a, b]→ R and φ2 : [a, b]→ R,

we should take

(2.1.57) $a := 0, \quad b := 1, \quad \phi_1(x) := \sqrt{x} \quad \text{and} \quad \phi_2(x) := 1 \quad \text{for all } 0 \le x \le 1.$

Since D is y-simple, from (2.1.27) we get

(2.1.58) $\int_D f(x, y)\, dx\, dy = \int_0^1 \left\{ \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \right\} dx.$

For the “inner” dy-integral at (2.1.58) define

(2.1.59) $h_1(x) := \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy, \quad \text{for all } 0 \le x \le 1,$

so that, from (2.1.59) and (2.1.58),

(2.1.60) $\int_D f(x, y)\, dx\, dy = \int_0^1 h_1(x)\, dx.$

We must calculate the integral on the right of (2.1.60), and so we must first determine h1(x). From
(2.1.57) and (2.1.56),

(2.1.61) $h_1(x) = \int_{\sqrt{x}}^{1} \exp\{y^3\}\, dy, \quad \text{for all } 0 \le x \le 1.$

We must now integrate the function exp{y³}. Here, however, we run into a serious problem. To
integrate this function we need to find some function g(y) such that

(2.1.62) $\frac{dg(y)}{dy} = \exp\{y^3\}.$

Unfortunately, no function g(y) satisfying (2.1.62) can be written explicitly in terms of elementary
functions, which means that we cannot explicitly calculate the integral at (2.1.61) for h1(x), and
therefore of course we cannot determine the integral on the right of (2.1.60). At this point we could
just give up and try to approximate the integral of f on D numerically. However, note that the

region D shown in Figure 2.7 is also x-simple (that is, D is a regular region). In fact, it is immediate

from Figure 2.7 that the left and right boundary functions ψ1 : [c, d] → R and ψ2 : [c, d] → R are

given by

(2.1.63) $c := 0, \quad d := 1, \quad \psi_1(y) := 0 \quad \text{and} \quad \psi_2(y) := y^2 \quad \text{for all } 0 \le y \le 1.$

Since D is a regular region we also have available the third integral on the right side of (2.1.40),

that is

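(2.1.64) $\int_D f(x, y)\, dx\, dy = \int_c^d \left\{ \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\, dx \right\} dy.$

For the “inner” dx-integral at (2.1.64) define

(2.1.65)
$\begin{aligned}
h_2(y) &:= \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\, dx \\
&= \int_0^{y^2} \exp\{y^3\}\, dx && \text{(from (2.1.63) and (2.1.56))} \\
&= \exp\{y^3\} \int_0^{y^2} dx \\
&= y^2 \exp\{y^3\}, && \text{for all } 0 \le y \le 1.
\end{aligned}$

Upon combining (2.1.65), (2.1.64) and (2.1.63) we obtain

(2.1.66) $\int_D f(x, y)\, dx\, dy = \int_0^1 h_2(y)\, dy = \int_0^1 y^2 \exp\{y^3\}\, dy.$

Now of course

$\frac{d \exp\{y^3\}}{dy} = 3 y^2 \exp\{y^3\},$

so that

(2.1.67) $\int_0^1 y^2 \exp\{y^3\}\, dy = \left[ \frac{\exp\{y^3\}}{3} \right]_{y=0}^{y=1} = \frac{e - 1}{3}.$

From (2.1.67) and (2.1.66)

(2.1.68) $\int_D f(x, y)\, dx\, dy = \frac{e - 1}{3}.$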
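Although the dy-then-dx order of Example 2.1.17 has no closed-form inner integral, it can still be approximated numerically and compared with the closed-form value (e − 1)/3. The sketch below (Python, not part of the original notes) does this with a midpoint rule; the helper `integrate_1d` and the number of subintervals are assumptions for illustration.

```python
import math


def integrate_1d(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def f(x, y):
    return math.exp(y ** 3)              # (2.1.56)


# Order dx then dy, as in (2.1.64): inner limits psi1(y) = 0, psi2(y) = y^2.
dx_then_dy = integrate_1d(
    lambda y: integrate_1d(lambda x: f(x, y), 0.0, y ** 2), 0.0, 1.0
)

# Order dy then dx, as in (2.1.58): inner limits phi1(x) = sqrt(x), phi2(x) = 1.
dy_then_dx = integrate_1d(
    lambda x: integrate_1d(lambda y: f(x, y), math.sqrt(x), 1.0), 0.0, 1.0
)

exact = (math.e - 1.0) / 3.0             # closed form (2.1.68)

if __name__ == "__main__":
    print(dx_then_dy, dy_then_dx, exact)
```

Both numerical values agree with (2.1.68) to within the accuracy of the midpoint rule, which illustrates that the choice of order affects only how easy the hand calculation is, not the value of the integral.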


Remark 2.1.18. In Remark 2.1.13 we observed that, when the region D ⊂ R2 is regular, then

we have available both the middle and third iterated integrals in (2.1.40) with which to compute

the integral of function f on D. We also observed that one of these integrals may be difficult

to compute whereas the other integral may be easy to compute. Example 2.1.17 shows this very

clearly. Indeed, the middle integral of (2.1.40) is actually impossible to calculate (at least explicitly)

whereas the integral on the right of (2.1.40) is quite easy to evaluate.

Remark 2.1.19. The following elementary result from two dimensional calculus in the plane is

often useful: If D ⊂ R2 is some region which is either x-simple or y-simple, then taking f to be the

function with constant value

f(x, y) = 1 for all (x, y) in D,

we have

(2.1.69) $\int_D dx\, dy = \text{area of } D.$

This follows immediately from Definition 2.1.1.

2.2 Three Dimensional Integration

In Section 2.1 we reviewed the main aspects of two dimensional integration, that is integration of

a real-valued function f on a region D in the plane R2. In this section we extend the ideas of two

dimensional integration to integration in three dimensions, following an approach which is a clear

extension of the approach to two dimensional integrals in Section 2.1. The goal of three dimensional

integration is to integrate a real-valued function f on a region Ω of three dimensional space R3.

Three dimensional integration is essential in many areas of physics and engineering; in particular

the forthcoming divergence theorem of Gauss-Ostrogradskii, an essential result of vector calculus,

relies on three dimensional integration.

Suppose we have a real-valued function

(2.2.70) f : Ω→ R,

in which Ω ⊂ R³ is the rectangular parallelepiped shown in Figure 2.8. The sides of the parallelepiped
Ω are the intervals a ≤ x ≤ b, c ≤ y ≤ d and e ≤ z ≤ g, and in the notation of sets we write Ω as

(2.2.71) $\Omega = \{(x, y, z) \in \mathbb{R}^3 \mid a \le x \le b,\ c \le y \le d \text{ and } e \le z \le g\}.$

Figure 2.8: Rectangular parallelepiped in R³

For the sake of brevity we will usually denote this parallelepiped in the following “mathematical” notation

(2.2.72) Ω = [a, b] × [c, d] × [e, g],

in which the intervals a ≤ x ≤ b, c ≤ y ≤ d and e ≤ z ≤ g are indicated by the abbreviated

notations [a, b], [c, d] and [e, g]. We now define the integral of the function f over the parallelepiped

Ω. To this end subdivide the interval a ≤ x ≤ b into n + 1 equally spaced points xj, subdivide

the interval c ≤ y ≤ d into n + 1 equally spaced points yk, and subdivide the interval e ≤ z ≤ g

into the n+ 1 equally spaced points zl, that is

(2.2.73) a = x0 < x1 < . . . < xn = b, c = y0 < y1 < . . . < yn = d, e = z0 < z1 < . . . < zn = g,

with spacing ∆x, ∆y and ∆z between successive subdivision points given by

(2.2.74) $\Delta x := x_{j+1} - x_j = \frac{b - a}{n}, \qquad \Delta y := y_{k+1} - y_k = \frac{d - c}{n}, \qquad \Delta z := z_{l+1} - z_l = \frac{g - e}{n},$

(c.f. (2.1.4) and (2.1.5)). Let Ωjkl be the (small) parallelepiped given by

(2.2.75) $\Omega_{jkl} := \{(x, y, z) \in \mathbb{R}^3 \mid x_j \le x \le x_{j+1},\ y_k \le y \le y_{k+1}, \text{ and } z_l \le z \le z_{l+1}\} \equiv [x_j, x_{j+1}] \times [y_k, y_{k+1}] \times [z_l, z_{l+1}],$

fix some point rjkl := (ξj, ηk, ζl) in Ωjkl (i.e. xj ≤ ξj ≤ xj+1, yk ≤ ηk ≤ yk+1 and zl ≤ ζl ≤ zl+1) and
define the Riemann sum of the function f on the parallelepiped Ω as follows

(2.2.76) $S_n := \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \sum_{l=0}^{n-1} f(\xi_j, \eta_k, \zeta_l)\, \Delta x\, \Delta y\, \Delta z,$

for each n = 1, 2, . . .. Now we can define the integral of the function f over the parallelepiped Ω as

follows:

Definition 2.2.1. If the sequence of Riemann sums Sn, n = 1, 2, . . . defined by (2.2.76) converges

to a limit S as n → ∞, and the limit S is the same for every choice of points (ξj, ηk, ζl) in Ωjkl,

then S is called the integral of the function f over the parallelepiped Ω.
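
As a rough numerical illustration of Definition 2.2.1, the following Python sketch evaluates the Riemann sum (2.2.76) with each sample point (ξj, ηk, ζl) taken at the midpoint of its small parallelepiped. The name riemann_sum_3d, the test function f(x, y, z) = xy² and the box [0, 1] × [0, 2] × [0, 3] are illustrative assumptions and are not part of the notes.

```python
def riemann_sum_3d(f, a, b, c, d, e, g, n):
    """Riemann sum S_n of (2.2.76) on [a, b] x [c, d] x [e, g].

    Assumption: the sample point in each cell is the cell midpoint.
    """
    dx, dy, dz = (b - a) / n, (d - c) / n, (g - e) / n
    total = 0.0
    for j in range(n):
        xi = a + (j + 0.5) * dx
        for k in range(n):
            eta = c + (k + 0.5) * dy
            for l in range(n):
                zeta = e + (l + 0.5) * dz
                total += f(xi, eta, zeta)
    return total * dx * dy * dz


def f(x, y, z):
    # Illustrative integrand only.
    return x * y**2


for n in (2, 4, 8, 16):
    print(n, riemann_sum_3d(f, 0.0, 1.0, 0.0, 2.0, 0.0, 3.0, n))
```

For this choice the three one dimensional integrals multiply out to (1/2)(8/3)(3) = 4, so the printed sums should settle towards 4 as n grows.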

Remark 2.2.2. The various notations for the integral in Definition 2.2.1 are

(2.2.77) $\int_\Omega f(x, y, z)\, dx\, dy\, dz, \quad \int_\Omega f(x, y, z)\, dV, \quad \int_\Omega f\, dx\, dy\, dz, \quad \int_\Omega f\, dV.$

The essential elements in all of these notations are the subscript Ω attached to the integral, indicating

the region in R3 over which one integrates, and the integrand f indicating the function being

integrated. The symbol “ dV ” is effectively just shorthand for “ dx dy dz”. The first two notations

explicitly remind us that we are integrating over Ω with respect to an underlying space variable

in R3 which is generically denoted by (x, y, z). It can be quite tedious to keep carrying the space

variable (x, y, z), and so, in the third and fourth notations of (2.2.77), this variable is suppressed,

but always understood to be present! (compare with Remark 2.1.2 for two dimensional integrals).

Remark 2.2.3. Exactly as for two dimensional integrals (recall Remark 2.1.3) Definition 2.2.1 raises

a number of questions. What is the situation if the sequence of Riemann sums Sn, n = 1, 2, . . . fails

to converge to any limit? In this case the integral of the function f over the parallelepiped

Ω does “not make sense” and is said to be undefined. Another very natural question: how can

we be sure that the integral of a function f over a parallelepiped Ω “makes sense” in the sense of

Definition 2.2.1? As in the case of two dimensional integrals, real analysis tells us that the class of

functions which can be integrated over Ω is simply huge, and we are completely safe in assuming

that every function that we shall encounter can be integrated.

Remark 2.2.4. In Remark 2.1.4 we observed that the actual calculation of two dimensional integrals

over a rectangle depended on a result called Fubini’s theorem. In exactly the same way, in

order to evaluate a three dimensional integral over a parallelepiped we need a version of Fubini’s


theorem extended to three dimensions. We develop this extension next. Define the rectangle D1 in

the x− y plane by (see Figure 2.8)

(2.2.78) D1 := {(x, y) ∈ R2 | a ≤ x ≤ b, c ≤ y ≤ d} = [a, b]× [c, d],

and define the function

(2.2.79) $h_1(x, y) := \int_e^g f(x, y, z)\, dz$, for all (x, y) in D1.

In (2.2.79) we have fixed some (x, y) in D1 so that f(x, y, z) is now just a function of z only, and

on the right of (2.2.79) we integrate this function of z over e ≤ z ≤ g to get a real number h1(x, y)

which depends on our choice of (x, y). Similarly, we can define the rectangle D2 in the x− z plane

by (see Figure 2.8)

(2.2.80) D2 := {(x, z) ∈ R2 | a ≤ x ≤ b, e ≤ z ≤ g} = [a, b]× [e, g],

and the function

(2.2.81) $h_2(x, z) := \int_c^d f(x, y, z)\, dy$, for all (x, z) in D2,

and, likewise, we can define the rectangle D3 in the y − z plane by (see Figure 2.8)

(2.2.82) D3 := {(y, z) ∈ R2 | c ≤ y ≤ d, e ≤ z ≤ g} = [c, d]× [e, g],

and the function

(2.2.83) $h_3(y, z) := \int_a^b f(x, y, z)\, dx$, for all (y, z) in D3.

We then have the following Fubini theorem for three dimensions:

Theorem 2.2.5 (Fubini for rectangular parallelepiped in R3). Suppose that f : Ω → R where Ω

is the rectangular parallelepiped at (2.2.71), the functions h1(x, y), h2(x, z) and h3(y, z) are defined

by (2.2.79), (2.2.81) and (2.2.83) respectively, and the rectangles D1, D2 and D3 are defined by

(2.2.78), (2.2.80) and (2.2.82) respectively. Then

(2.2.84) $\int_\Omega f\, dV = \int_{D_1} h_1(x, y)\, dx\, dy = \int_{D_2} h_2(x, z)\, dx\, dz = \int_{D_3} h_3(y, z)\, dy\, dz.$


Remark 2.2.6. Observe that the three integrals on the right of (2.2.84) are two dimensional

integrals (over the rectangles D1, D2 and D3), and each of these can be reduced by the Fubini

Theorem 2.1.5 to iterated integrals over intervals. Indeed, if we apply Fubini’s theorem for two

dimensional integrals in the form of the identity (2.1.12) to, say, the two dimensional integral

of h1(x, y) over D1, we obtain

(2.2.85) $\int_{D_1} h_1(x, y)\, dx\, dy = \int_a^b \left[ \int_c^d h_1(x, y)\, dy \right] dx = \int_c^d \left[ \int_a^b h_1(x, y)\, dx \right] dy,$

and similarly for the remaining two dimensional integrals over D2 and D3 at (2.2.84).

Example 2.2.7. A rectangular parallelepiped Ω is given by

(2.2.86) Ω := {(x, y, z) ∈ R3 | 0 ≤ x ≤ α, 0 ≤ y ≤ β and 0 ≤ z ≤ γ} = [0, α]× [0, β]× [0, γ],

in which α, β and γ are positive constants, and f : Ω→ R is defined by

(2.2.87) $f(x, y, z) := x y^2$ for all (x, y, z) in Ω.

Evaluate the integral $\int_\Omega f\, dV$. Following (2.2.78) and (2.2.79) define

(2.2.88) D1 := {(x, y) ∈ R2 | 0 ≤ x ≤ α, 0 ≤ y ≤ β} = [0, α]× [0, β],

and

(2.2.89) $h_1(x, y) := \int_0^\gamma f(x, y, z)\, dz$, for all (x, y) in D1.

From (2.2.89) and (2.2.87) we get

(2.2.90) $h_1(x, y) := \int_0^\gamma x y^2\, dz = x y^2 \int_0^\gamma dz = \gamma x y^2.$

From (2.2.90) and (2.1.12) (i.e. the Fubini theorem for two dimensions) we get

(2.2.91) $\int_{D_1} h_1(x, y)\, dx\, dy = \int_0^\alpha \left[ \int_0^\beta h_1(x, y)\, dy \right] dx = \int_0^\alpha \left[ \int_0^\beta \gamma x y^2\, dy \right] dx.$

For the dy-integral at (2.2.91) we have

(2.2.92) $\int_0^\beta \gamma x y^2\, dy = \gamma x \int_0^\beta y^2\, dy = \frac{\gamma x \beta^3}{3}.$


Now put (2.2.92) into (2.2.91) to get

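(2.2.93) $\int_{D_1} h_1(x, y)\, dx\, dy = \int_0^\alpha \frac{\gamma x \beta^3}{3}\, dx = \frac{\gamma \beta^3}{3} \int_0^\alpha x\, dx = \frac{\alpha^2 \beta^3 \gamma}{6}.$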

From (2.2.93) and the Fubini Theorem 2.2.5 (see (2.2.84))

(2.2.94) $\int_\Omega f\, dV = \int_{D_1} h_1(x, y)\, dx\, dy = \frac{\alpha^2 \beta^3 \gamma}{6}.$

From (2.2.84) it follows that one could also get the result at (2.2.94) by either integrating h2(x, z)

over D2 or integrating h3(y, z) over D3.
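
The reduction used in Example 2.2.7 can be checked numerically. The sketch below approximates the inner integral (2.2.89) and then the iterated integral (2.2.91) with a simple midpoint rule, and compares the result with α²β³γ/6 from (2.2.94). The helper integrate_1d, the step counts and the particular values of α, β and γ are assumptions chosen only for illustration.

```python
def integrate_1d(func, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


alpha, beta, gamma = 1.5, 2.0, 0.5  # illustrative positive constants


def f(x, y, z):
    return x * y**2  # the integrand (2.2.87)


def h1(x, y):
    # Inner z-integral over [0, gamma], as in (2.2.89).
    return integrate_1d(lambda z: f(x, y, z), 0.0, gamma)


def integral_via_h1(n=40):
    # Iterated integral (2.2.91): dy inside, dx outside.
    return integrate_1d(
        lambda x: integrate_1d(lambda y: h1(x, y), 0.0, beta, n),
        0.0, alpha, n)


exact = alpha**2 * beta**3 * gamma / 6  # closed form (2.2.94)
approx = integral_via_h1()
print(approx, exact)
```

The two printed numbers should agree to a few decimal places; the small gap comes from the midpoint rule, not from Fubini's theorem.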

Remark 2.2.8. It remains to define the integral of a function f defined over a region Ω ⊂ R3

which is not a rectangular parallelepiped. In this case we fix any rectangular parallelepiped Ξ ⊂ R3

which is large enough to contain the region Ω, that is Ω ⊂ Ξ. Now define f ∗ on the rectangular

parallelepiped Ξ by

(2.2.95) $f^*(x, y, z) := \begin{cases} f(x, y, z), & \text{for all } (x, y, z) \text{ in } \Omega, \\ 0, & \text{for all } (x, y, z) \text{ in } \Xi \text{ but outside } \Omega. \end{cases}$

We then define the integral of f over the region Ω by

(2.2.96) $\int_\Omega f\, dV = \int_\Xi f^*\, dV.$

Since Ξ is a rectangular parallelepiped the integral on the right of (2.2.96) can, at least in principle,

be evaluated using the three dimensional Fubini Theorem 2.2.5.
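
To make the construction (2.2.95) and (2.2.96) concrete, here is a minimal sketch that extends a function by zero outside Ω and forms a midpoint Riemann sum over an enclosing box Ξ. The choice of Ω as the unit ball, of Ξ = [−1, 1]³, of f = 1 (so that the integral is the volume of Ω) and all helper names are assumptions made for the illustration.

```python
import math


def f_star(f, in_omega):
    """Extension (2.2.95): f inside Omega, 0 at points of Xi outside Omega."""
    def extended(x, y, z):
        return f(x, y, z) if in_omega(x, y, z) else 0.0
    return extended


def riemann_sum_box(func, box, n):
    """Midpoint Riemann sum of func over the box ((a, b), (c, d), (e, g))."""
    (a, b), (c, d), (e, g) = box
    dx, dy, dz = (b - a) / n, (d - c) / n, (g - e) / n
    total = 0.0
    for j in range(n):
        for k in range(n):
            for l in range(n):
                total += func(a + (j + 0.5) * dx,
                              c + (k + 0.5) * dy,
                              e + (l + 0.5) * dz)
    return total * dx * dy * dz


def in_unit_ball(x, y, z):
    return x * x + y * y + z * z <= 1.0


xi_box = ((-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0))
volume = riemann_sum_box(f_star(lambda x, y, z: 1.0, in_unit_ball), xi_box, 40)
print(volume, 4.0 * math.pi / 3.0)
```

The sum converges slowly here because f* jumps across the boundary of Ω, which is one reason the simple-region formulas developed next are preferred in practice.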

Remark 2.2.9. In practice, use of Theorem 2.2.5 to actually compute the integral at (2.2.96)

relies on the region Ω not being too complicated, much as was the case in Remark 2.1.8 for two

dimensional integrals. We now formulate a particularly useful type of region Ω ⊂ R3 over which we

can evaluate integrals. To this end from now on we write

(2.2.97) R2xy := the x− y plane in R3,

so that generic points in R2xy are denoted by (x, y). Suppose that γ1 and γ2 are real valued continuous

functions defined on the common region D ⊂ R2xy, that is γ1 : D → R and γ2 : D → R, and suppose

that

(2.2.98) γ1(x, y) ≤ γ2(x, y), for all (x, y) in D.


Let S1 be the set of points traced out in R3 by the point (x, y, γ1(x, y)) as the point (x, y) varies

throughout the set D ⊂ R2xy, that is

(2.2.99) S1 = {(x, y, γ1(x, y)) ∈ R3 | (x, y) in D}.

As shown in Figure 2.9 the set S1 is a surface in R3. Effectively, one can imagine D as a “floor”

and γ1(x, y) gives the “height” of a “roof” at the point (x, y) in D; then S1 represents the shape of

the roof. Similarly, corresponding to γ2 : D → R is the surface S2 given by

(2.2.100) S2 = {(x, y, γ2(x, y)) ∈ R3 | (x, y) in D}.

Now let Ω ⊂ R3 be the set of all points (x, y, z) in R3 which are between the surfaces S1 and S2 (see

Figure 2.9). Put another way, Ω is the set of all (x, y, z) in R3 such that (x, y) is a member of D,

and, for this (x, y), z is in the range γ1(x, y) ≤ z ≤ γ2(x, y). In set-theoretic terms we write this as

(2.2.101) Ω = {(x, y, z) ∈ R3 | (x, y) in D and γ1(x, y) ≤ z ≤ γ2(x, y)}.

By analogy with the simple regions in R2 discussed in Remark 2.1.8 this region Ω ⊂ R3 is called

z-simple with lower function γ1(x, y), upper function γ2(x, y) and common domain of definition

D ⊂ R2xy. By following very much the same argument that we used to obtain Theorem 2.1.9 one

can establish

Theorem 2.2.10 (Fubini for z-simple region in R3). Suppose that Ω is any z-simple region with

lower function γ1(x, y), upper function γ2(x, y) and common domain of definition D ⊂ R2xy (see

Figure 2.9). If f : Ω→ R is a given function then

(2.2.102) $\int_\Omega f\, dV = \int_D \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz \right] dx\, dy.$

Equivalently, we can write (2.2.102) as

(2.2.103) $\int_\Omega f\, dV = \int_D h_1(x, y)\, dx\, dy,$

in which h1(x, y) is defined by the one dimensional integral

(2.2.104) $h_1(x, y) := \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz$, for all (x, y) in D.

The nice thing about (2.2.103) is that it reduces calculation of the three dimensional integral over

Ω on the left to calculation of the two dimensional integral over D ⊂ R2xy on the right. We already


Figure 2.9: z-simple region Ω ⊂ R3

know from Section 2.1 how to deal with such two dimensional integrals. In fact, suppose that the

common domain of definition D is itself y-simple with lower function φ1(x), upper function φ2(x)

and common interval of definition a ≤ x ≤ b, that is D is given by

(2.2.105) D = {(x, y) ∈ R2 | a ≤ x ≤ b, φ1(x) ≤ y ≤ φ2(x)}

(recall Remark 2.1.8 and see (2.1.18)). Then, from (2.1.27) with h1 in place of f , we find

(2.2.106) $\int_D h_1(x, y)\, dx\, dy = \int_a^b \left[ \int_{\phi_1(x)}^{\phi_2(x)} h_1(x, y)\, dy \right] dx,$

so that (2.2.106) and (2.2.103) give

(2.2.107) $\int_\Omega f\, dV = \int_a^b \left[ \int_{\phi_1(x)}^{\phi_2(x)} h_1(x, y)\, dy \right] dx.$


Usually the definition of h1 at (2.2.104) is substituted into (2.2.107) to give the following compressed

version of (2.2.107)

(2.2.108) $\int_\Omega f\, dV = \int_a^b \int_{\phi_1(x)}^{\phi_2(x)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz \right] dy\, dx,$

which displays the integral of f over the three dimensional region Ω as three iterated integrals. In

(2.2.108) the common domain of definition D of the z-simple region Ω was assumed to be given

by (2.2.105), that is D is y-simple with lower function φ1(x), upper function φ2(x) and common

interval of definition a ≤ x ≤ b. Suppose instead that D is x-simple with left function ψ1(y), right

function ψ2(y) and common interval of definition c ≤ y ≤ d, that is

(2.2.109) D = {(x, y) ∈ R2 | c ≤ y ≤ d, ψ1(y) ≤ x ≤ ψ2(y)}

(see (2.1.37)). Repeating the argument which led to (2.2.108) we find of course that

(2.2.110) $\int_\Omega f\, dV = \int_c^d \int_{\psi_1(y)}^{\psi_2(y)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz \right] dx\, dy.$

If the domain D is regular, that is x-simple as well as y-simple and given by both (2.2.109) and

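(2.2.105) (see Remark 2.1.8) then (2.2.110) and (2.2.108) must hold together, that is

(2.2.111) $\int_\Omega f\, dV = \int_a^b \int_{\phi_1(x)}^{\phi_2(x)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz \right] dy\, dx = \int_c^d \int_{\psi_1(y)}^{\psi_2(y)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz \right] dx\, dy.$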

Example 2.2.11. The region Ω ⊂ R3 is the tetrahedron in Figure 2.10(a) with vertices O, A, B

and C, and the function f : Ω→ R is defined by

(2.2.112) f(x, y, z) := y for all (x, y, z) in Ω.

Evaluate the three dimensional integral $\int_\Omega f\, dV$.

We see that Ω is given in set-theoretic terms by

(2.2.113) Ω := {(x, y, z) ∈ R3 | x ≥ 0, y ≥ 0, z ≥ 0, x+ y + z ≤ 1}.


Figure 2.10: (a) Tetrahedron Ω for the Example 2.2.11 (b) Common domain of definition D

We cannot directly use the formulation of Ω at (2.2.113) to evaluate the integral. Notice however

that Ω is a z-simple region. To see this observe that all points (x, y, z) on the triangular surface

with vertices A, B and C must satisfy the identity

(2.2.114) z = 1− x− y.

Let D ⊂ R2xy be the triangular region in the x− y plane with vertices O, A and C shown in Figure

2.10(b), and define the functions

(2.2.115) γ1(x, y) := 0, γ2(x, y) := 1− x− y, for all (x, y) in D.

With these definitions it is clear that Ω ⊂ R3 is the z-simple region given by

(2.2.116) Ω = {(x, y, z) ∈ R3 | (x, y) in D and γ1(x, y) ≤ z ≤ γ2(x, y)},

that is Ω is z-simple with lower function γ1(x, y) and upper function γ2(x, y) defined by (2.2.115),

and with common domain of definition D ⊂ R2xy in Figure 2.10(b). From Figure 2.10(b) we observe

that D is given by

(2.2.117) D = {(x, y) ∈ R2 | a ≤ x ≤ b, φ1(x) ≤ y ≤ φ2(x)},

in which

(2.2.118) a := 0, b := 1, φ1(x) := 0 and φ2(x) := 1− x for all 0 ≤ x ≤ 1,

that is D is y-simple with lower function φ1(x) and upper function φ2(x) defined by (2.2.118), and

with common interval of definition 0 ≤ x ≤ 1. Since Ω is z-simple and D is y-simple we can use

(2.2.108), that is

(2.2.119) $\int_\Omega f\, dV = \int_a^b \int_{\phi_1(x)}^{\phi_2(x)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz \right] dy\, dx = \int_0^1 \int_0^{1-x} \left[ \int_0^{1-x-y} y\, dz \right] dy\, dx,$

in which the second equality at (2.2.119) follows from (2.2.112), (2.2.115) and (2.2.118). Now

evaluate the successive iterated integrals on the right hand side of (2.2.119):

(2.2.120) $\int_0^{1-x-y} y\, dz = y \int_0^{1-x-y} dz = y(1-x-y),$

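so that

(2.2.121) $\int_0^{1-x} \left[ \int_0^{1-x-y} y\, dz \right] dy = \int_0^{1-x} y(1-x-y)\, dy$ (from (2.2.120))

$= (1-x) \int_0^{1-x} y\, dy - \int_0^{1-x} y^2\, dy$

$= (1-x) \left[ \frac{y^2}{2} \right]_{y=0}^{y=1-x} - \left[ \frac{y^3}{3} \right]_{y=0}^{y=1-x}$

$= \frac{(1-x)^3}{2} - \frac{(1-x)^3}{3} = \frac{(1-x)^3}{6}.$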

From (2.2.121) and (2.2.119) we find

$\int_\Omega f\, dV = \frac{1}{6} \int_0^1 (1-x)^3\, dx = \frac{1}{6} \left[ -\frac{1}{4}(1-x)^4 \right]_{x=0}^{x=1} = \frac{1}{24}.$
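
The three iterated integrals in (2.2.119) can also be approximated directly, which gives a quick sanity check on the value 1/24. The sketch below nests a midpoint rule three deep, using the variable limits 1 − x − y and 1 − x from (2.2.115) and (2.2.118). The helper integrate_1d and the step count are assumptions for the illustration.

```python
def integrate_1d(func, lo, hi, n=60):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def f(x, y, z):
    return y  # the integrand (2.2.112)


def tetrahedron_integral():
    # dz innermost over [0, 1 - x - y], then dy over [0, 1 - x], then dx over [0, 1].
    return integrate_1d(
        lambda x: integrate_1d(
            lambda y: integrate_1d(lambda z: f(x, y, z), 0.0, 1.0 - x - y),
            0.0, 1.0 - x),
        0.0, 1.0)


approx = tetrahedron_integral()
print(approx, 1.0 / 24.0)
```

Note how the limits of each inner integral are functions of the outer variables, exactly as in (2.2.108).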

Remark 2.2.12. In Remark 2.2.9 we defined a region Ω ⊂ R3 which is z-simple, with lower function

γ1(x, y), upper function γ2(x, y) and common domain of definition D ⊂ R2xy (see (2.2.101)). In much

the same way we can also formulate the analogous ideas of y-simple and x-simple regions in R3.

For this put (c.f. (2.2.97))

(2.2.122) R2xz := the x− z plane in R3, R2yz := the y − z plane in R3.

Now suppose that ρ1 : D → R and ρ2 : D → R are continuous functions defined on a common

region D ⊂ R2xz such that

(2.2.123) ρ1(x, z) ≤ ρ2(x, z) for all (x, z) in D.

Then Ω ⊂ R3 is called a y-simple region with lower function ρ1(x, z), upper function ρ2(x, z) and

common domain of definition D ⊂ R2xz when

(2.2.124) Ω = {(x, y, z) ∈ R3 | (x, z) in D and ρ1(x, z) ≤ y ≤ ρ2(x, z)},

(c.f. (2.2.101) for the analogous z-simple case). Similarly, if η1 : D → R and η2 : D → R are

continuous functions defined on some common region D ⊂ R2yz such that

(2.2.125) η1(y, z) ≤ η2(y, z) for all (y, z) in D,

then Ω ⊂ R3 is called an x-simple region with lower function η1(y, z), upper function η2(y, z) and

common domain of definition D ⊂ R2yz when

(2.2.126) Ω = {(x, y, z) ∈ R3 | (y, z) in D and η1(y, z) ≤ x ≤ η2(y, z)}.

Finally, the region Ω ⊂ R3 is regular when it is simultaneously z-simple, y-simple and x-simple, that

is given equivalently by (2.2.101), (2.2.124) and (2.2.126).

We can now give the ultimate version of Fubini’s theorem for three dimensional integration,

which for completeness repeats Theorem 2.2.10 for z-simple regions and includes the cases of the

x-simple, and y-simple regions of Remark 2.2.12:

Theorem 2.2.13 (Fubini for general regions in R3). Suppose that Ω ⊂ R3 is a given region and

f : Ω→ R is a given function.

(a) If Ω is a z-simple region with lower function γ1(x, y), upper function γ2(x, y) and common

domain of definition D ⊂ R2xy (see Figure 2.9) then

(2.2.127) $\int_\Omega f\, dV = \int_D \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\, dz \right] dx\, dy.$

(b) If Ω ⊂ R3 is a y-simple region with lower function ρ1(x, z), upper function ρ2(x, z), and common

domain of definition D ⊂ R2xz (see (2.2.124)) then

(2.2.128) $\int_\Omega f\, dV = \int_D \left[ \int_{\rho_1(x,z)}^{\rho_2(x,z)} f(x, y, z)\, dy \right] dx\, dz.$

(c) If Ω ⊂ R3 is an x-simple region with lower function η1(y, z), upper function η2(y, z) and common

domain of definition D ⊂ R2yz (see (2.2.126)) then

(2.2.129) $\int_\Omega f\, dV = \int_D \left[ \int_{\eta_1(y,z)}^{\eta_2(y,z)} f(x, y, z)\, dx \right] dy\, dz.$

(d) If Ω ⊂ R3 is regular (that is simultaneously x-simple, y-simple and z-simple) then (2.2.127),

(2.2.128) and (2.2.129) all hold.
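
Part (d) can be illustrated on the tetrahedron of Example 2.2.11, which is regular. The sketch below evaluates the integral of f(x, y, z) = y in the forms (2.2.128) and (2.2.129). The descriptions of the tetrahedron as a y-simple region (0 ≤ y ≤ 1 − x − z over the triangle 0 ≤ x ≤ 1, 0 ≤ z ≤ 1 − x) and as an x-simple region (0 ≤ x ≤ 1 − y − z over the triangle 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 − y) are our own working assumptions and are not written out in the notes, as are the helper names.

```python
def integrate_1d(func, lo, hi, n=60):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def f(x, y, z):
    return y


def via_y_simple():
    # Form (2.2.128): y innermost, then the triangle D in the x-z plane.
    return integrate_1d(
        lambda x: integrate_1d(
            lambda z: integrate_1d(lambda y: f(x, y, z), 0.0, 1.0 - x - z),
            0.0, 1.0 - x),
        0.0, 1.0)


def via_x_simple():
    # Form (2.2.129): x innermost, then the triangle D in the y-z plane.
    return integrate_1d(
        lambda y: integrate_1d(
            lambda z: integrate_1d(lambda x: f(x, y, z), 0.0, 1.0 - y - z),
            0.0, 1.0 - y),
        0.0, 1.0)


print(via_y_simple(), via_x_simple(), 1.0 / 24.0)
```

Both orders should reproduce, up to discretization error, the value 1/24 found in Example 2.2.11 from the z-simple form.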


Chapter 3

Scalar and Vector Fields

One of the most important ideas in physics and engineering is the notion of a field. Roughly speaking,

a field describes how a scalar-valued or vector-valued quantity varies through space, leading to the

more specialized ideas of a scalar field and a vector field.

3.1 Motivating Examples

Before stating the formal definitions of scalar and vector fields we give some motivating examples:

Example 3.1.1. A flat circular metal plate of radius 1 m is located with its centre at the origin of

the xy plane (which we shall denote by R2), and heated with a blow-torch. At each point (x, y) of

the unit disc centred at the origin (exactly the part of the plane occupied by the metal disc) denote

the temperature of the disc by T (x, y) (see Figure 3.1). We then have a scalar-valued function

T (x, y) defined for each point (x, y) in the unit disc. This function is an instance of a scalar field

defined on a region of two-dimensional space, namely the unit disc centred at the origin of the plane

R2.

Example 3.1.2. We can easily generalize Example 3.1.1 to the case of three dimensions as follows.

A solid metal ball of radius 1 m is located with its centre at the origin of three-dimensional xyz

space (which we shall denote by R3), and heated with a blow-torch. At each point (x, y, z) of the

unit sphere centred at the origin (exactly the part of space occupied by the metal ball) denote the

temperature of the ball by T (x, y, z). We then have a scalar-valued function T (x, y, z) defined for

each point (x, y, z) in the unit sphere. This function is an instance of a scalar field defined on a

region of three-dimensional space, namely the unit sphere centred at the origin of R3.


Figure 3.1: Temperature variation on the unit disc.

Example 3.1.3. A positive point charge of Q coul. is located at the origin of three-dimensional

space R3. If a positive test charge of 1 coul. is located at the point (x, y, z) then, according to

Coulomb’s law of electrostatics, a force is exerted on the test charge with a magnitude given by

(3.1.1) $\frac{Q}{4\pi\varepsilon_0 r^2}$, for $r := \sqrt{x^2 + y^2 + z^2}$,

(ε0 is a physical constant) and direction along the radial line from the origin to (x, y, z) and away

from the charge Q at the origin (since like charges repel). We denote this force by $\mathbf{E}(x, y, z)$; this

quantity is a vector, since it has both a magnitude given by (3.1.1) and direction along the radial

line, and is called the electrostatic field in space due to the charge Q coul. at the origin (see Figure

3.2). Notice that the electrostatic field $\mathbf{E}(x, y, z)$ is undefined when (x, y, z) is at the origin (i.e.

(x, y, z) = 0) since r = 0, so that the magnitude at (3.1.1) is undefined; moreover, the radial line

from the origin to itself makes no sense, so there is not a well-defined direction either. However,

$\mathbf{E}(x, y, z)$ is defined for all (x, y, z) ≠ 0, i.e. all (x, y, z) not at the origin. This vector-valued function

is an instance of a vector field defined everywhere in three-dimensional space R3 except for the origin.
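
A vector field of this kind is naturally represented in code as a function returning three components. The sketch below builds the field of Example 3.1.3 from the magnitude (3.1.1) and the outward radial unit vector, and refuses to evaluate at the origin. The function name, the tuple representation and the sample charge of 1 nC are illustrative assumptions; the numerical value of ε0 is the standard SI vacuum permittivity.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in farads per metre


def electrostatic_field(q, x, y, z):
    """Components of the field at (x, y, z) due to a point charge q at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # The origin is excluded from the domain of the field.
        raise ValueError("the field is undefined at the origin")
    magnitude = q / (4.0 * math.pi * EPSILON_0 * r * r)  # magnitude (3.1.1)
    # Multiply by the unit vector (x, y, z) / r pointing away from the origin.
    return (magnitude * x / r, magnitude * y / r, magnitude * z / r)


ex, ey, ez = electrostatic_field(1e-9, 0.0, 3.0, 4.0)
print(ex, ey, ez)
```

Raising an exception at the origin mirrors the fact that the domain of this field is R3 with the origin removed.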

Figure 3.2: Vector field $\mathbf{E}(x, y, z)$.

Example 3.1.4. Suppose that some electric charge is continuously spread or distributed or “smeared”

throughout some fixed region D of three-dimensional space R3 (i.e. D ⊂ R3). For example,

D could be the sphere with radius of 1 m. centered at the origin of the xyz-coordinate system, or D

could be the whole of R3 (i.e. D = R3), but all kinds of other choices for D are of course possible.

The case D = R3 is the simplest and most commonly occurring. Fix some point (x, y, z) in the

region D, and visualize another very small sphere of radius 0 < ε << 1 which is centred at (x, y, z)

(see Figure 3.3). If Vε is the volume of the small sphere and Qε is the total charge contained within

this sphere, then the ratio Qε/Vε (with units of coulombs per cubic metre) is the average charge

density in the small sphere. If the limit

(3.1.2) $\rho(x, y, z) = \lim_{\varepsilon \to 0} \frac{Q_\varepsilon}{V_\varepsilon}$

exists, then the scalar quantity ρ(x, y, z) defines the charge density at the point (x, y, z). If, furthermore,

the limit at (3.1.2) exists for each and every point (x, y, z) in the region D, then we have

a scalar-valued function ρ(x, y, z) defined at each point (x, y, z) in D. This function is a scalar field

giving the charge density at every point in the region D of three-dimensional space R3. Effectively,

ρ(x, y, z) gives the quantity of charge per unit volume concentrated at (x, y, z) that is ρ describes

the local concentration of charge at each point in the region D.

Example 3.1.5 (Total enclosed charge). In this example we use three dimensional integration (see

Section 2.2) to relate total charge to the charge density of Example 3.1.4, and for simplicity we take

D = R3 in Example 3.1.4, so that the charge is spread “everywhere” in space. Fix some region

Ω ⊂ R3. For example, Ω could be a spherical shaped region or a parallelepiped in R3. Then the

total charge enclosed within Ω must be given by the integral

(3.1.3) $Q = \int_\Omega \rho\, dV.$

Figure 3.3: Small sphere centered at (x, y, z).

Later, in Section 9.4, we shall use the very simple relation (3.1.3) to obtain an essential result called

the continuity equation, which describes the movement of charge through space.

Example 3.1.6. Here we use the charge density scalar field ρ(x, y, z) of Example 3.1.4 to construct

a vector field called the current density field. Suppose that the charge in the fixed region D is in

motion, moving through the region, and in particular, at each point (x, y, z) in D, the charge moves

past that point with a velocity $\mathbf{v}(x, y, z)$. Notice that $\mathbf{v}(x, y, z)$ is a vector, since it involves both

the direction of motion of the charge as well as the speed at which the charge moves past (x, y, z).

For each (x, y, z) in D define

(3.1.4) $\mathbf{J}(x, y, z) := \rho(x, y, z)\, \mathbf{v}(x, y, z).$

Then $\mathbf{J}(x, y, z)$ is a vector (since the product of a scalar and a vector is always a vector) defined

for each (x, y, z) in the region D. This vector-valued function is therefore a vector field defined

everywhere in the region D, called the current density field. Since the units of ρ(x, y, z) are coul./m3,

and the units of $\mathbf{v}(x, y, z)$ are m/sec., the units of $\mathbf{J}(x, y, z)$ must be coulombs per sec. per square

metre, that is amperes per square metre (or amps./m2). To get a better understanding of what the

current density field $\mathbf{J}(x, y, z)$ really means fix a plane or “flat” surface S with area A, and let $\mathbf{n}$ be

the unit vector normal to the surface S (see Figure 3.4). Suppose, to begin with, that the charge

density ρ(x, y, z) has the constant value ρ for all (x, y, z), and similarly suppose that the velocity

$\mathbf{v}(x, y, z)$ of movement of the charge has the constant value $\mathbf{v}$ for all (x, y, z), so that

(3.1.5) ρ(x, y, z) = ρ, $\mathbf{v}(x, y, z) = \mathbf{v}$ for all (x, y, z) in D.

Figure 3.4: Movement of charge perpendicular to S: $\mathbf{v}$ and $\mathbf{n}$ are collinear.

From (3.1.4) and (3.1.5) the current density field $\mathbf{J}(x, y, z)$ must then have the constant value

(3.1.6) $\mathbf{J} = \rho \mathbf{v}.$

In the first instance suppose that the direction of movement of the charge is exactly in the direction

of the unit normal $\mathbf{n}$, that is $\mathbf{n}$ and $\mathbf{v}$ are collinear (see Figure 3.4). If v denotes the speed of

movement of the charge then of course

(3.1.7) $v = \|\mathbf{v}\| = \mathbf{n} \cdot \mathbf{v},$

where the second equality follows by the collinearity of $\mathbf{v}$ and $\mathbf{n}$, and the fact that $\mathbf{n}$ has unit length.

Now fix some small ∆t > 0 (regarded as time). Since $\mathbf{v}$ is perpendicular to the surface S one sees

that the total “volume of space” that crosses surface S in the time ∆t must be Av∆t, so that the

total charge Q which flows across the surface S in the time ∆t must be this total “volume of space”


multiplied by the constant charge density ρ, that is the total charge Q flowing across the surface S

in the time ∆t is given by

(3.1.8) Q = (Av∆t)ρ when the charge velocity $\mathbf{v}$ is collinear with $\mathbf{n}$.

Using (3.1.7) in (3.1.8) then gives

(3.1.9) $Q = (A \Delta t)(\mathbf{n} \cdot \mathbf{v})\rho$ when the charge velocity $\mathbf{v}$ is collinear with $\mathbf{n}$.

Now suppose that the direction of charge movement is no longer collinear with the unit normal $\mathbf{n}$,

as previously, but is instead tangential to the surface S, so that the velocity vector $\mathbf{v}$ lies along S,

that is $\mathbf{v}$ is orthogonal to the unit vector $\mathbf{n}$ (see Figure 3.5). Since the direction of charge movement

is along the surface S and not through the surface, there can be no charge crossing S, so the total

charge Q that crosses S in the time ∆t is zero, that is

(3.1.10) Q = 0 when the charge velocity $\mathbf{v}$ is orthogonal to $\mathbf{n}$.

Figure 3.5: Movement of charge tangential to S: $\mathbf{v}$ and $\mathbf{n}$ are orthogonal.

When $\mathbf{v}$ and $\mathbf{n}$ are orthogonal then of course $\mathbf{n} \cdot \mathbf{v} = 0$, so that we can write (3.1.10) as

(3.1.11) $Q = (A \Delta t)(\mathbf{n} \cdot \mathbf{v})\rho$ when the charge velocity $\mathbf{v}$ is orthogonal to $\mathbf{n}$.

Figure 3.6: Movement of charge across S: velocity $\mathbf{v}$ in a general direction.

Finally suppose that the direction of charge movement is neither collinear with $\mathbf{n}$ nor orthogonal

to $\mathbf{n}$, as in the previous cases, but instead $\mathbf{v}$ is just in some general direction, as shown in Figure

3.6. There is now a component of velocity $\mathbf{v}_1$ with magnitude v1 in the direction of the unit normal

$\mathbf{n}$, and a component of velocity $\mathbf{v}_2$ along the surface S and orthogonal to the unit vector $\mathbf{n}$. As we

have just seen from (3.1.8) and (3.1.10), all charge flowing across S is due to the first component

$\mathbf{v}_1$ collinear with $\mathbf{n}$, and none is due to the component $\mathbf{v}_2$ orthogonal to $\mathbf{n}$, so that the total charge

Q that crosses surface S in the time ∆t is given by (3.1.8) with v1 in place of v, that is

(3.1.12) Q = (Av1∆t)ρ.

But v1 is just the projection of $\mathbf{v}$ along $\mathbf{n}$ so that

(3.1.13) $v_1 = \mathbf{n} \cdot \mathbf{v},$

and therefore, in this general case, from (3.1.13) and (3.1.12), we find that the total charge Q

passing through the surface S in the time ∆t is given by

(3.1.14) $Q = (A \Delta t)(\mathbf{n} \cdot \mathbf{v})\rho$ when the charge velocity $\mathbf{v}$ is in a general direction.

Observe that (3.1.9) and (3.1.11) are just special cases of the general relation (3.1.14). Now the

quantity I = Q/∆t is the total current passing through the surface S, therefore

$I = \frac{Q}{\Delta t} = A(\mathbf{n} \cdot \mathbf{v})\rho = A(\rho \mathbf{v}) \cdot \mathbf{n} = A(\mathbf{J} \cdot \mathbf{n}),$

where the first equality follows from (3.1.14), the second equality is clear, and the third equality

follows from (3.1.6). We therefore have

(3.1.15) $I = A(\mathbf{J} \cdot \mathbf{n}),$

that is the total current through the surface S is the product of the area A of S and the inner product

$\mathbf{J} \cdot \mathbf{n}$ of the current density $\mathbf{J}$ with the unit normal $\mathbf{n}$ to S.

We next remove the assumption, made at (3.1.5), that the charge density and charge velocity

are constant in space (i.e. constant with respect to (x, y, z)). This is just a matter of writing out

(3.1.15) for an infinitesimally small surface dS with infinitesimally small area dA. In fact, since

the surface dS is infinitesimally small, the current density field $\mathbf{J}(x, y, z)$ is effectively constant as

(x, y, z) varies through dS. Therefore we can use (3.1.15) with dA in place of A and $\mathbf{J}(x, y, z)$ in

place of $\mathbf{J}$, to see that the infinitesimal current passing through the infinitesimal surface dS with

unit normal $\mathbf{n}(x, y, z)$ is given by

(3.1.16) $dI = (\mathbf{J}(x, y, z) \cdot \mathbf{n}(x, y, z))\, dA.$

The relation (3.1.16) shows the usefulness of the current density vector field $\mathbf{J}(x, y, z)$; knowing this

vector field we can calculate the current dI flowing across a small planar surface dS with area dA

and unit normal vector $\mathbf{n}(x, y, z)$ at a point (x, y, z) in dS. Later, in Section 8.5, we shall extend

this relation for applications to Maxwell’s equations.
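
The constant-field relation (3.1.15) is easy to try out numerically. The sketch below forms J = ρv as in (3.1.6) and evaluates I = A(J · n) for the three situations discussed in the example: velocity collinear with the normal, tangential to the surface, and in a general direction. All names and numerical values (charge density 2, area 0.5, the velocity vectors) are made up for the illustration.

```python
import math


def dot(u, v):
    """Inner product of two vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


def current_through_flat_surface(rho, v, n, area):
    """I = A (J . n) with J = rho v, for constant rho and v; n is normalized first."""
    norm = math.sqrt(dot(n, n))
    n_unit = tuple(c / norm for c in n)
    j = tuple(rho * c for c in v)  # current density (3.1.6)
    return area * dot(j, n_unit)   # relation (3.1.15)


rho, area = 2.0, 0.5
normal = (0.0, 0.0, 1.0)
i_collinear = current_through_flat_surface(rho, (0.0, 0.0, 3.0), normal, area)
i_tangential = current_through_flat_surface(rho, (3.0, 0.0, 0.0), normal, area)
i_general = current_through_flat_surface(rho, (3.0, 0.0, 4.0), normal, area)
print(i_collinear, i_tangential, i_general)
```

Only the component of the velocity along the normal contributes, so the tangential case gives zero current and the general case depends only on the third component of the velocity.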

Remark 3.1.7. The notion of charge density formulated in Example 3.1.4 and current density

formulated in Example 3.1.6 are at this stage just illustrative cases of a scalar field and a vector

field. Later these quantities will take on a much deeper significance. In fact, we shall see that

charge and current density are absolutely indispensable in the formulation of Maxwell’s equations

of electromagnetism.

3.2 Definition of Vector and Scalar Fields

With the preceding examples in mind we can now formulate in general terms exactly what is meant

by a scalar field and a vector field:

Definition 3.2.1. A vector field in R3 comprises a specified region D ⊂ R3, called the domain

of the vector field, together with a function or mapping $\mathbf{F}$ : D → R3 which assigns to each point

(x, y, z) in D the vector $\mathbf{F}(x, y, z)$ in R3. We usually resolve the vector $\mathbf{F}(x, y, z)$ into its scalar

components along the standard $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ axes in the representation

(3.2.17) $\mathbf{F}(x, y, z) = F_1(x, y, z)\,\mathbf{i} + F_2(x, y, z)\,\mathbf{j} + F_3(x, y, z)\,\mathbf{k}$, for all (x, y, z) in D,

so that F1(x, y, z), F2(x, y, z) and F3(x, y, z) are respectively the x, y and z coordinates of the vector

$\mathbf{F}(x, y, z)$. In exactly the same way, a scalar field in R3 comprises a specified region D ⊂ R3, called

the domain of the scalar field, together with a function or mapping f : D → R which assigns to

each point (x, y, z) in D the real number f(x, y, z).

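Remark 3.2.2. The vector field $\mathbf{F}$ : D → R3 is called a C1-vector field when, for each i = 1, 2, 3,

the partial derivatives

$\frac{\partial F_i(x, y, z)}{\partial x}, \quad \frac{\partial F_i(x, y, z)}{\partial y}, \quad \frac{\partial F_i(x, y, z)}{\partial z},$

all exist and are continuous functions of (x, y, z) in D. For the most part we shall be dealing with

C1-vector fields in this course. However, we shall sometimes also have to deal with vector fields

which are even better behaved in the following sense: A vector field $\mathbf{F}$ : D → R3 is called a C2-vector

field when $\mathbf{F}$ is a C1-vector field, and in addition, for each i = 1, 2, 3, the second partial

derivatives

$\frac{\partial^2 F_i(x, y, z)}{\partial x^2}, \quad \frac{\partial^2 F_i(x, y, z)}{\partial y^2}, \quad \frac{\partial^2 F_i(x, y, z)}{\partial z^2}, \quad \frac{\partial^2 F_i(x, y, z)}{\partial x \partial y}, \quad \frac{\partial^2 F_i(x, y, z)}{\partial y \partial z}, \quad \frac{\partial^2 F_i(x, y, z)}{\partial x \partial z},$

all exist and are continuous functions of (x, y, z) in D. A standard result from elementary calculus

says that, when $\mathbf{F}$ : D → R3 is a C2-vector field, then we always have

$\frac{\partial^2 F_i(x, y, z)}{\partial x \partial y} = \frac{\partial^2 F_i(x, y, z)}{\partial y \partial x}, \quad \frac{\partial^2 F_i(x, y, z)}{\partial x \partial z} = \frac{\partial^2 F_i(x, y, z)}{\partial z \partial x}, \quad \frac{\partial^2 F_i(x, y, z)}{\partial y \partial z} = \frac{\partial^2 F_i(x, y, z)}{\partial z \partial y},$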

for all (x, y, z) in D. That is, the second partial derivatives are equal regardless of the order in

which they are calculated. In exactly the same way, a scalar field f : D → R is called a C1-scalar

field when the partial derivatives

$\frac{\partial f(x, y, z)}{\partial x}, \quad \frac{\partial f(x, y, z)}{\partial y}, \quad \frac{\partial f(x, y, z)}{\partial z},$

all exist and are continuous functions of (x, y, z) in D, and is called a C2-scalar field when it is a

C1-scalar field with the additional property that the second partial derivatives

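$\frac{\partial^2 f(x, y, z)}{\partial x^2}, \quad \frac{\partial^2 f(x, y, z)}{\partial y^2}, \quad \frac{\partial^2 f(x, y, z)}{\partial z^2}, \quad \frac{\partial^2 f(x, y, z)}{\partial x \partial y}, \quad \frac{\partial^2 f(x, y, z)}{\partial y \partial z}, \quad \frac{\partial^2 f(x, y, z)}{\partial x \partial z},$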

all exist and are continuous functions of (x, y, z) in D. Again, if f : D → R is a C2-scalar field then

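$\frac{\partial^2 f(x, y, z)}{\partial x \partial y} = \frac{\partial^2 f(x, y, z)}{\partial y \partial x}, \quad \frac{\partial^2 f(x, y, z)}{\partial x \partial z} = \frac{\partial^2 f(x, y, z)}{\partial z \partial x}, \quad \frac{\partial^2 f(x, y, z)}{\partial y \partial z} = \frac{\partial^2 f(x, y, z)}{\partial z \partial y},$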

for all (x, y, z) in D.

Remark 3.2.3. It is also useful, especially for simple examples, to have the notion of a vector

field in R2 and a scalar field in R2; we do not formulate these here, for they are obviously given by

Definition 3.2.1 with three-dimensional space R3 everywhere replaced by two-dimensional space R2.

Remark 3.2.4. The domain D of a field (either vector or scalar) occurring in Definition 3.2.1 may

represent a region of space that is natural or intrinsic to the field. In Example 3.1.2 the domain D of

the temperature field T (x, y, z) is naturally the unit sphere centred at the origin of R3, since this is

exactly the region of space occupied by the heated metal ball. In Example 3.1.3 the domain D of the

electrostatic field $\mathbf{E}(x, y, z)$ is all of R3 except for the origin (in mathematical notation D = R3 \ {0})

since we have seen that the electrostatic field is undefined (i.e. does not make sense) when (x, y, z)

is at the origin. On the other hand, the domain D could simply be an arbitrary portion of space to

which we wish to restrict attention; this is the case in Example 3.1.4 and Example 3.1.6 where we

just want to focus on some designated portion of space in which we have a distribution of charge.

If our charge had been spread out through all space then we could have taken D = R3. In many

instances, for simplicity and to focus on the essentials, we shall just assume D = R3, that is our

fields are defined everywhere. However, as the electrostatic field of Example 3.1.3 makes clear, we

occasionally come across fields which cannot be defined everywhere, and in these cases we must

carefully specify the domain, for instance D = R3 \ {0}, that is all of R3 except for the origin.

Remark 3.2.5. It is clear from Definition 3.2.1 that a vector field describes how a vector-valued

quantity changes through space (or a portion of space identified by the domain D of the field) and

likewise for a scalar field. However, in most instances of interest in physics and engineering, one

comes across vector and scalar-valued quantities which change not only through space but also


vary with time, so that, with all dependencies displayed, a time varying vector field FFF should be

written FFF (t, x, y, z) and a time varying scalar field f should be written f(t, x, y, z), in which t of

course denotes time, and (x, y, z) is a general point in some domain D ⊂ R3. This additional

time-dependence is easily fitted within Definition 3.2.1: a time-varying vector field is one in which,

for each fixed instant t, we just have a vector field which maps each (x, y, z) in D into the vector

FFF (t, x, y, z) in R3 (i.e. t is kept constant and we think only of the dependence on the space variables

(x, y, z)). Similarly, a time varying scalar field is one in which, for each fixed instant t, we just have

a scalar field which maps each (x, y, z) in D into the real number f(t, x, y, z).


Chapter 4

Curves and Paths in Space

Curves and paths in space are among the essential building-blocks of vector calculus. We first

motivate the ideas of a path and a curve in space with a few simple examples.

4.1 Motivating Examples

Example 4.1.1. Define

(4.1.1) γγγ(t) := (cos(πt), sin(πt)), 0 ≤ t ≤ 1.

This defines a function or mapping

(4.1.2) γγγ : [0, 1]→ R2

which takes each 0 ≤ t ≤ 1 into the vector γγγ(t) in the plane R2 given by (4.1.1). This function is

called a path in the plane R2. If we plot γγγ(t) versus 0 ≤ t ≤ 1 then we get a semicircle on the plane

R2 shown in the next figure: This semicircle, which is the image or range of the function γγγ(t), is

called the curve or trace of the path. Notice that the curve has a natural direction as t increases

from t = 0 until t = 1.


Example 4.1.2. Define

(4.1.3) γγγ(t) := (t^2, t^3), 1 ≤ t ≤ 3.

This defines a function or mapping γγγ : [1, 3] → R2 which takes each 1 ≤ t ≤ 3 into the vector γγγ(t)

in the plane R2 given by (4.1.3). A plot of γγγ(t) versus 1 ≤ t ≤ 3 is shown at Figure 4.2. Again, this

plot is the curve or trace of the path and has a direction which corresponds to increasing t from

t = 1 until t = 3.
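The two motivating paths can also be explored numerically. The following is a minimal sketch that samples the paths at (4.1.1) and (4.1.3) at equally spaced parameter values; the function names and the number of sample points are our own choices for illustration.

```python
import math

def gamma_semicircle(t):
    """Path (4.1.1): gamma(t) = (cos(pi t), sin(pi t)), 0 <= t <= 1."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def gamma_cubic(t):
    """Path (4.1.3): gamma(t) = (t^2, t^3), 1 <= t <= 3."""
    return (t ** 2, t ** 3)

def sample_path(gamma, a, b, n):
    """Return the n + 1 points gamma(t_i) at equally spaced t_i in [a, b]."""
    return [gamma(a + (b - a) * i / n) for i in range(n + 1)]

semicircle_points = sample_path(gamma_semicircle, 0.0, 1.0, 4)
cubic_points = sample_path(gamma_cubic, 1.0, 3.0, 4)

# The points are listed in order of increasing t, which is the direction of the curve.
for point in semicircle_points:
    print(point)
```

Listing the points in order of increasing t makes the natural direction of each curve visible: the first point is the starting point and the last point is the ending point.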


Figure 4.1: Path defined by (4.1.1)

4.2 Paths and Parametric Representation of Curves

With Example 4.1.1 and Example 4.1.2 in mind, we can now formulate in general terms what is

meant by a path and by the corresponding curve (or trace) of the path:

Definition 4.2.1. A two-dimensional path (or parametric function for a two-dimensional curve) is

a given function or mapping

γγγ : [a, b]→ R2,

from a specified interval [a, b] into R2, which maps each a ≤ t ≤ b into the vector γγγ(t) in R2. This

mapping is usually written in the scalar component form

(4.2.4) γγγ(t) = (x(t), y(t)) = x(t)iii + y(t)jjj for all t in a ≤ t ≤ b.

The curve of the path is the set of points Γ in the plane R2 traced by γγγ(t) as t traverses the interval

a ≤ t ≤ b. In the notation of sets we write this as

Γ := {γγγ(t) ∈ R2 | a ≤ t ≤ b}.

The interval a ≤ t ≤ b on which the path is defined is called the parametric interval, the variable

t is called the parametric variable of the path, and the whole function γγγ : [a, b] → R2 is called a



parametric representation of the curve Γ. The starting point of the path is the vector γγγ(a) while

the ending point of the path is the vector γγγ(b), and the curve Γ has a direction from the starting to

the ending point corresponding to t increasing from t = a until t = b.

We can clearly formulate analogous ideas in three-dimensional R3 space rather than the plane

R2, by an obvious modification of the preceding definition. For completeness we give this next:

Definition 4.2.2. A three-dimensional path (or parametric function for a three-dimensional curve)

is a given function or mapping

γγγ : [a, b]→ R3,

from a specified interval [a, b] into R3, which maps each a ≤ t ≤ b into the vector γγγ(t) in R3. This

mapping is usually written in the scalar component form

(4.2.5) γγγ(t) = (x(t), y(t), z(t)) = x(t)iii + y(t)jjj + z(t)kkk for all t in a ≤ t ≤ b.

The curve of the path is the set of points Γ in the space R3 traced by γγγ(t) as t traverses the interval

a ≤ t ≤ b. In the notation of sets we write this as

Γ := {γγγ(t) ∈ R3 | a ≤ t ≤ b}.


The interval a ≤ t ≤ b on which the path is defined is called the parametric interval, the variable

t is called the parametric variable of the path, and the whole function γγγ : [a, b] → R3 is called a

parametric representation of the curve Γ. The starting point of the path is the vector γγγ(a) while

the ending point of the path is the vector γγγ(b), and the curve Γ has a direction from the starting to

the ending point corresponding to t increasing from t = a until t = b.

Figure 4.3: Path γγγ : [a, b]→ R3

Remark 4.2.3. Definition 4.2.1 and Definition 4.2.2 are word-for-word identical except that everywhere

we just replace the plane R2 with three-dimensional space R3. From now on we shall usually

just formulate ideas such as path and curve in the more general case of R3 and you will be left

to formulate the corresponding idea in the simpler case of R2. On the other hand, many of our

concrete examples will be for the two-dimensional case simply because it is much easier to draw

curves on R2 than in R3.

Remark 4.2.4. If the path γγγ : [a, b]→ R3 is such that the first derivatives

$$\frac{dx(t)}{dt}, \qquad \frac{dy(t)}{dt}, \qquad \frac{dz(t)}{dt},$$

of the scalar components at (4.2.5) exist and are continuous for all a ≤ t ≤ b, then γγγ is called a

C1-path and the curve corresponding to γγγ is called a C1-curve (compare with C1-fields in Remark

3.2.2). Again, if γγγ : [a, b] → R3 is a C1-path with the further property that the second derivatives

of the scalar components

$$\frac{d^2x(t)}{dt^2}, \qquad \frac{d^2y(t)}{dt^2}, \qquad \frac{d^2z(t)}{dt^2},$$

exist and are continuous for all a ≤ t ≤ b, then γγγ is called a C2-path and the curve corresponding

to γγγ is called a C2-curve. In this course our focus will be almost exclusively on C1 and C2-paths

and curves. These ideas specialize in the obvious way for curves in the plane given by the path

γγγ : [a, b]→ R2; one simply discards the derivatives of the scalar component z(t).

Remark 4.2.5. Notice from Definition 4.2.1 and Definition 4.2.2 the distinction between a path

and its curve or trace. The path refers to the whole mapping, including the interval of definition,

whereas the curve is just the totality of points in space (R2 or R3) that are successively occupied

by γγγ(t) as t increases through the interval of definition. In particular, all information concerning

dependence on the parametric variable t and the interval of definition [a, b] is lost if we are just

given the curve of a path, rather than the path itself. The next example illustrates that different

paths can nevertheless have identical curves:


Example 4.2.6. Define

(4.2.6) γγγ(t) := (cos(2πt), sin(2πt)), 0 ≤ t ≤ 1/2.

We then have a mapping γγγ : [0, 1/2] → R2, and the plot of γγγ(t) versus 0 ≤ t ≤ 1/2 is shown in

Figure 4.4:

Remark 4.2.7. Observe that the path γγγ(t) at (4.1.1) is different from the path γγγ(t) at (4.2.6) (the

intervals of definition and defining formulae are clearly different) but, upon comparing Figure 4.1 and

Figure 4.4, it becomes clear that these different paths have identical curves, namely the semicircle

ABC. One can think of the paths at (4.1.1) and (4.2.6) as distinct parametric representations of

the same curve, namely the semicircular arc ABC at Figure 4.1 and Figure 4.4.
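The claim that the paths at (4.1.1) and (4.2.6) have identical curves can be checked numerically. The sketch below compares the point reached by (4.1.1) at parameter t with the point reached by (4.2.6) at parameter t/2; the helper name `max_gap` and the grid size are our own choices.

```python
import math

def gamma_a(t):
    """Path (4.1.1), defined on 0 <= t <= 1."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def gamma_b(t):
    """Path (4.2.6), defined on 0 <= t <= 1/2."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def max_gap(n=100):
    """Largest distance between gamma_a(t) and gamma_b(t / 2) over a grid of t in [0, 1]."""
    worst = 0.0
    for i in range(n + 1):
        t = i / n
        xa, ya = gamma_a(t)
        xb, yb = gamma_b(t / 2)
        worst = max(worst, math.hypot(xa - xb, ya - yb))
    return worst

print(max_gap())
```

A gap at the level of rounding error indicates that both paths visit the same points in the same order, with the second path moving twice as fast in its parameter.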

Remark 4.2.8. Here we illustrate a general method for starting with a given path and changing

the underlying parametrization to get a generally different path which nevertheless has exactly the

same curve. To illustrate the idea in a specific case let

(4.2.7) γγγ1(t) := (cos(πt), sin(πt)), 0 ≤ t ≤ 1.

be the path in Example 4.1.1. We see from Figure 4.1 that the curve of this path is the semicircular

arc ABC. Now define the function

(4.2.8) $\psi : [1, \sqrt{2}] \to \mathbb{R}$ as $\psi(s) := s^2 - 1$.



We then see that

(4.2.9) $\psi(1) = 0, \quad \psi(\sqrt{2}) = 1, \quad \psi^{(1)}(s) = 2s > 0$ for all $1 \le s \le \sqrt{2}$,

(see the following Figure 4.5). It is clear that the function ψ(s) increases strictly monotonically

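through the interval [0, 1] as s increases through the interval $1 \le s \le \sqrt{2}$. Now define the path

$\boldsymbol{\gamma}_2 : [1, \sqrt{2}] \to \mathbb{R}^2$

as follows:

(4.2.10) $\boldsymbol{\gamma}_2(s) := \boldsymbol{\gamma}_1(\psi(s)) = (\cos(\pi(s^2 - 1)), \sin(\pi(s^2 - 1))), \quad 1 \le s \le \sqrt{2}.$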

In the substitution at (4.2.10) the quantity $\psi(s) = s^2 - 1$ replaces every occurrence of t in (4.2.7).

In Figure 4.6 we show how γγγ2(s) moves through R2 for different values of $1 \le s \le \sqrt{2}$, indicating

specifically $s = 1$, $s = \sqrt{3/2}$ (at which $\psi(s) = 1/2$) and $s = \sqrt{2}$. Comparing Figure 4.1 and Figure 4.6 we see that the two

paths defined by (4.2.7) and (4.2.10), which are clearly different paths, nevertheless have the same

curve, that is the arc ABC. The paths defined by (4.2.7) and (4.2.10) are distinct parametric

representations of the same curve ABC.

We can repeat this reparametrization of curves in complete generality. Suppose that

(4.2.11) γγγ1 : [a1, b1]→ R3,


Figure 4.5: Function ψ at (4.2.9)

is a given path. We are going to change the parametrization to get a different path with exactly

the same curve, much as we did in the preceding special case. To this end, fix some interval [a2, b2]

as well as some strictly increasing C1-function

(4.2.12) ψ : [a2, b2]→ R

such that

(4.2.13) $\psi(a_2) = a_1, \quad \psi(b_2) = b_1, \quad \psi^{(1)}(s) > 0$ for all $a_2 \le s \le b_2$,

(see the following Figure 4.7). It is clear that the function ψ(s) increases strictly monotonically

through the interval [a1, b1] as s increases through the interval a2 ≤ s ≤ b2. Now define

(4.2.14) γγγ2(s) := γγγ1(ψ(s)), a2 ≤ s ≤ b2.

We then get a path

(4.2.15) γγγ2 : [a2, b2]→ R3,

which is clearly different from the path (4.2.11) (except for the trivial case where a1 = a2, b1 = b2

and ψ(s) = s). However, it is clear from (4.2.14) that the two paths nevertheless follow identical

curves in R3.
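The general construction at (4.2.14) amounts to composing two functions, which can be sketched directly in code. Below, the helper `reparametrize` is a hypothetical name of our own, and the concrete choices of γγγ1 and ψ are those of (4.2.7) and (4.2.8), with a zero z-component added so that the path lies in R3.

```python
import math

def reparametrize(gamma1, psi):
    """Return the path gamma2(s) := gamma1(psi(s)) of (4.2.14)."""
    return lambda s: gamma1(psi(s))

def gamma1(t):
    """Path (4.2.7), written in R^3 with z = 0; defined on [a1, b1] = [0, 1]."""
    return (math.cos(math.pi * t), math.sin(math.pi * t), 0.0)

def psi(s):
    """psi(s) = s^2 - 1 maps [a2, b2] = [1, sqrt(2)] onto [0, 1]; its derivative 2s is positive."""
    return s * s - 1.0

gamma2 = reparametrize(gamma1, psi)
a2, b2 = 1.0, math.sqrt(2.0)

# The starting and ending points of the two paths should coincide.
print(gamma2(a2), gamma1(0.0))
print(gamma2(b2), gamma1(1.0))
```

Because ψ is strictly increasing with ψ(a2) = a1 and ψ(b2) = b1, every point γγγ2(s) is also a point γγγ1(t) for t = ψ(s), and the points are visited in the same order.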



4.3 Derivatives Along a Path and Tangent to a Curve

Suppose that we are given a path

(4.3.16) γγγ : [a, b]→ R3,

written in the scalar component form (cf. (4.2.5))

(4.3.17) γγγ(t) = x(t)iii + y(t)jjj + z(t)kkk = (x(t), y(t), z(t)), a ≤ t ≤ b,

so that the right side of (4.3.17) gives the usual (x, y, z)-coordinates in three-dimensional space

at each value of the parameter t. We then define the derivative of the path with respect to the

parameter t to be the vector in R3 given by

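(4.3.18)
$$
\begin{aligned}
\boldsymbol{\gamma}^{(1)}(t) &:= \frac{dx(t)}{dt}\,\mathbf{i} + \frac{dy(t)}{dt}\,\mathbf{j} + \frac{dz(t)}{dt}\,\mathbf{k} \\
&= \left(\frac{dx(t)}{dt}, \frac{dy(t)}{dt}, \frac{dz(t)}{dt}\right), \quad \text{for all instants } a \le t \le b,
\end{aligned}
$$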

in which the scalar components of γγγ(1)(t) are the t-derivatives of the scalar components of γγγ(t). An

alternative notation for the derivative γγγ(1)(t) at (4.3.18) is

(4.3.19) $\dfrac{d\boldsymbol{\gamma}(t)}{dt} := \dfrac{dx(t)}{dt}\,\mathbf{i} + \dfrac{dy(t)}{dt}\,\mathbf{j} + \dfrac{dz(t)}{dt}\,\mathbf{k}.$


Figure 4.7: Function ψ at (4.2.13)

Since γγγ(1)(t) is a vector in R3 for each a ≤ t ≤ b, we end up with another path

(4.3.20) γγγ(1) : [a, b]→ R3,

given by the t-derivative γγγ(1)(t) in R3 at every instant a ≤ t ≤ b. In the same way it is natural to

define the second t-derivative of the path γγγ as

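(4.3.21)
$$
\begin{aligned}
\boldsymbol{\gamma}^{(2)}(t) &:= \frac{d}{dt}\boldsymbol{\gamma}^{(1)}(t) \\
&= \frac{d^2x(t)}{dt^2}\,\mathbf{i} + \frac{d^2y(t)}{dt^2}\,\mathbf{j} + \frac{d^2z(t)}{dt^2}\,\mathbf{k} \\
&= \left(\frac{d^2x(t)}{dt^2}, \frac{d^2y(t)}{dt^2}, \frac{d^2z(t)}{dt^2}\right), \quad \text{for all instants } a \le t \le b.
\end{aligned}
$$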

An alternative notation for the derivative γγγ(2)(t) at (4.3.21) is

(4.3.22) $\dfrac{d^2\boldsymbol{\gamma}(t)}{dt^2} := \dfrac{d^2x(t)}{dt^2}\,\mathbf{i} + \dfrac{d^2y(t)}{dt^2}\,\mathbf{j} + \dfrac{d^2z(t)}{dt^2}\,\mathbf{k}$, for all instants $a \le t \le b$.

Notice that the right side of (4.3.21) also defines a point in R3 for each a ≤ t ≤ b, so that we have

yet another path

(4.3.23) γγγ(2) : [a, b]→ R3,

which gives the second t-derivative γγγ(2)(t) in R3 at every instant a ≤ t ≤ b.


From (4.3.17) and (4.3.18), together with the fact that

$$\frac{dx(t)}{dt} = \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t}, \quad \text{and similarly for } \frac{dy(t)}{dt} \text{ and } \frac{dz(t)}{dt},$$

it follows that we can write the first t-derivative γγγ(1)(t) as

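(4.3.24) $\boldsymbol{\gamma}^{(1)}(t) = \lim\limits_{\Delta t \to 0} \dfrac{\boldsymbol{\gamma}(t + \Delta t) - \boldsymbol{\gamma}(t)}{\Delta t}, \quad a \le t \le b.$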

From Figure 4.8 one sees that the vector difference γγγ(t + ∆t) − γγγ(t) is very close to tangential to

the curve of the path (4.3.16) at γγγ(t) when ∆t is small, and therefore the “rescaled” vector

$$\frac{\boldsymbol{\gamma}(t + \Delta t) - \boldsymbol{\gamma}(t)}{\Delta t}$$

is again close to tangent to the curve of the path (4.3.16) at γγγ(t) when ∆t is small, and the limit

at (4.3.24) is exactly tangent to the curve of the path (4.3.16) at γγγ(t) (see Figure 4.9).

Figure 4.8: Approximation of first t-derivative

We therefore have the important fact that

the derivative γγγ(1)(t) is tangent to the curve of the path (4.3.16) at γγγ(t) for each instant a ≤ t ≤ b,

(see Figure 4.9). In exactly the same way one can see that

the derivative γγγ(2)(t) is tangent to the curve of the path (4.3.20) at γγγ(1)(t) for each instant a ≤ t ≤ b.

One can of course define further t-derivatives γγγ(n)(t), n = 3, 4, . . ., in exactly the same way (although

there is seldom a need for these).
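The tangency of the first derivative can be illustrated numerically for the semicircle path (4.1.1). The sketch below approximates γγγ(1)(t) by a difference quotient in the spirit of (4.3.24); note that it uses a central (two-sided) difference rather than the one-sided quotient of (4.3.24), which is our own substitution for better accuracy, and the step size is an arbitrary choice.

```python
import math

def gamma(t):
    """Path (4.1.1), the semicircle gamma(t) = (cos(pi t), sin(pi t))."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def derivative(path, t, dt=1e-6):
    """Central-difference approximation of the first t-derivative of a path."""
    forward = path(t + dt)
    backward = path(t - dt)
    return tuple((f - b) / (2 * dt) for f, b in zip(forward, backward))

t = 0.25
approx = derivative(gamma, t)
# Exact derivative obtained by differentiating each scalar component, as in (4.3.18).
exact = (-math.pi * math.sin(math.pi * t), math.pi * math.cos(math.pi * t))
# For a circle centred at the origin, a tangent vector is perpendicular to the radius.
radial_dot_tangent = sum(g * d for g, d in zip(gamma(t), approx))
print(approx, exact, radial_dot_tangent)
```

The inner product of γγγ(t) with the approximate derivative is close to zero, which for a circle centred at the origin is exactly the statement that the derivative is tangent to the curve.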


Figure 4.9: First derivative γγγ(1)(t) is tangent to the curve at γγγ(t)

Remark 4.3.1. Given two paths

(4.3.25) γγγ1 : [a, b]→ R3 and γγγ2 : [a, b]→ R3,

define the inner product of the vectors γγγ1(t) and γγγ2(t) in R3 for each a ≤ t ≤ b, giving the R valued

function

(4.3.26) ϕ(t) := γγγ1(t) · γγγ2(t), for all t in a ≤ t ≤ b.

Then we recall the following rule for differentiation of products:

(4.3.27) $\dfrac{d}{dt}\varphi(t) = \boldsymbol{\gamma}_1(t) \cdot \boldsymbol{\gamma}_2^{(1)}(t) + \boldsymbol{\gamma}_1^{(1)}(t) \cdot \boldsymbol{\gamma}_2(t)$, for all t in a ≤ t ≤ b.
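The product rule for the inner product of two paths can be checked numerically. In the sketch below the two paths are hypothetical examples of our own choosing (a helix and a polynomial path), and the derivative of the inner product is approximated by a central difference with an arbitrary step size.

```python
import math

def gamma1(t):
    """Illustrative path: a helix."""
    return (math.cos(t), math.sin(t), t)

def dgamma1(t):
    return (-math.sin(t), math.cos(t), 1.0)

def gamma2(t):
    """Illustrative path with polynomial components."""
    return (t, t ** 2, 1.0)

def dgamma2(t):
    return (1.0, 2 * t, 0.0)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def phi(t):
    """phi(t) = gamma1(t) . gamma2(t), the inner product of the two paths."""
    return dot(gamma1(t), gamma2(t))

def dphi_product_rule(t):
    """gamma1 . gamma2' + gamma1' . gamma2, the product rule for the inner product."""
    return dot(gamma1(t), dgamma2(t)) + dot(dgamma1(t), gamma2(t))

t, h = 0.7, 1e-6
dphi_numeric = (phi(t + h) - phi(t - h)) / (2 * h)
print(dphi_numeric, dphi_product_rule(t))
```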

Remark 4.3.2. The parametric variable for a given path

(4.3.28) γγγ : [a, b]→ R3,

typically indicated by the variable t (although symbols such as s, σ, τ , u etc. could equally

well be used) does not have any “physical” interpretation in Definition 4.2.2. However, there are

applications in which this variable is specifically interpreted as time, the parametric interval [a, b]

is a given time interval, and the vector γγγ(t) represents the point in three-dimensional space R3

occupied (e.g. by a particle, an electric charge etc.) at the instant of time a ≤ t ≤ b. Since we


are now talking about movement through space as time increases it makes sense to introduce the

corresponding velocity and acceleration. Of course the velocity is defined as the first t-derivative of

γγγ, namely

(4.3.29) vvv(t) := γγγ(1)(t), for all times t in a ≤ t ≤ b,

and the acceleration is defined as the second t-derivative of γγγ, namely

(4.3.30) aaa(t) := γγγ(2)(t), for all times t in a ≤ t ≤ b.

This has some very important consequences. In fact, suppose that FFF : D → R3 is a vector field

with domain which we shall take to be D := R3 for simplicity, and that, at each point (x, y, z) in

R3, the vector FFF (x, y, z) is the force acting on a particle of mass m located at (x, y, z). Suppose the

particle follows the path

γγγ : [a, b]→ R3

in response to this force. If the particle is at γγγ(t) at instant t then the force on the particle is given

by FFF (γγγ(t)). Newton’s second law then says that

(4.3.31) maaa(t) = FFF (γγγ(t)), for all times t in a ≤ t ≤ b.

Combining this with (4.3.30) gives

(4.3.32) mγγγ(2)(t) = FFF (γγγ(t)), for all times t in a ≤ t ≤ b.

This is a second order vector differential equation which can, in principle, be solved to get the path

γγγ : [a, b] → R3 followed by the particle if one knows the force vector field FFF . In practice of course

this equation may be very difficult to solve explicitly, but it can usually be solved numerically. In

fact, the path followed by a space probe moving through the solar system is calculated by solving

(4.3.32) (numerically); here FFF is the sum of the forces exerted on the probe by the sun and the

various planets. These forces are obtained from Newton’s law of universal gravitation.
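To make the idea of a numerical solution of Newton's second law concrete, here is a minimal sketch. Everything specific in it is an assumption made for illustration: the force field F(r) = −r (a linear restoring force, chosen only because the exact solution x(t) = cos(t) is known), the unit mass, the initial conditions, and the velocity Verlet time-stepping scheme. It is not the method used for actual space-probe trajectories.

```python
import math

def force(r):
    """Hypothetical force field F(r) = -r, a linear restoring force."""
    return tuple(-c for c in r)

def solve_newton(force, m, r0, v0, t_end, n_steps):
    """Integrate m * gamma''(t) = F(gamma(t)) with the velocity Verlet scheme."""
    dt = t_end / n_steps
    r, v = tuple(r0), tuple(v0)
    a = tuple(f / m for f in force(r))
    for _ in range(n_steps):
        # Advance the position using the current velocity and acceleration.
        r = tuple(ri + vi * dt + 0.5 * ai * dt * dt for ri, vi, ai in zip(r, v, a))
        a_new = tuple(f / m for f in force(r))
        # Advance the velocity using the average of old and new accelerations.
        v = tuple(vi + 0.5 * (ai + ani) * dt for vi, ai, ani in zip(v, a, a_new))
        a = a_new
    return r, v

r_end, v_end = solve_newton(force, m=1.0, r0=(1.0, 0.0, 0.0),
                            v0=(0.0, 0.0, 0.0), t_end=1.0, n_steps=1000)
print(r_end[0], math.cos(1.0))
```

For these choices the exact path is γγγ(t) = (cos(t), 0, 0), so the computed position at t = 1 can be compared directly with cos(1).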

4.4 Simple Curves and Closed Curves

Suppose that

(4.4.33) γγγ : [a, b]→ R2


is a path (or parametric function) with corresponding curve Γ (recall Definition 4.2.1) shown at the

left of Figure 4.10. The curve Γ clearly has the property that it does not “cross itself” anywhere.

What this means is that for any distinct t1 and t2 in the interval [a, b] (i.e. t1 ≠ t2) it must be the

case that γγγ(t1) ≠ γγγ(t2), since, if γγγ(t1) = γγγ(t2) for distinct t1 and t2, then the curve must cross itself

somewhere (shown at the right of Figure 4.10). The curve Γ at the left of Figure 4.10, which does

not cross itself anywhere, is called a simple curve, whereas the curve Γ at the right of Figure 4.10,

which does cross itself somewhere, is called a non-simple curve. Of course exactly the same basic

Figure 4.10: Simple and non-simple curves

idea holds for the curves of paths in three dimensional space, that is

(4.4.34) γγγ : [a, b]→ R3.

If the curve in R3 of the path (4.4.34) does not cross itself anywhere then it is a simple curve, and

if it does cross itself somewhere then it is a non-simple curve.

For paths (4.4.33) and (4.4.34) the point γγγ(a) is called the starting point of the curve, and the

point γγγ(b) is called the ending point of the curve. A curve, simple or not, is called a closed curve

when the starting point and ending point coincide, that is

(4.4.35) γγγ(a) = γγγ(b).

A closed curve which is also simple (see the left of Figure 4.11) is called a simple closed curve

whereas a closed curve which does cross itself somewhere (distinct from the “exceptional” point

γγγ(a) = γγγ(b)) is a non-simple closed curve. The study of simple and non-simple curves is quite deep

and is part of algebraic topology. We are going to see that simple closed curves in particular are

quite important in vector calculus.

Figure 4.11: Simple and non-simple closed curves


Chapter 5

Line Integral and Arc Length

You are all familiar with the integration of a given function f(t) for all t in some interval a ≤ t ≤ b.

In Chapter 2 we extended the idea of integration over intervals to integration over regions of R2 and

R3. Vector calculus acquires its extraordinary power from two further extensions of the integration

concept, namely line integrals and surface integrals. Indeed, the main theorems of vector calculus,

as well as the basic laws of electricity and magnetism, can only be stated in terms of line integrals

and surface integrals. Our goal in this chapter is to study the construction and main properties of

line integrals, deferring the more sophisticated notion of surface integrals to Chapter 8.

5.1 Line Integral of a Vector Field

To formulate what is meant by a line integral suppose that we are given a curve in the space R3

which starts at a point A and ends at a point B which we denote by Γ, and we are also given a

vector field FFF (x, y, z) (see Figure 5.1). For the sake of simplicity, and to focus on just the essentials,

we take the domain of the vector field to be all of R3.

Remark 5.1.1. In order to simplify the notation, from now on we are always going to write rrr for

the vector corresponding to a point (x, y, z) in R3, so that

(5.1.1) $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, with length $\|\mathbf{r}\| = \sqrt{x^2 + y^2 + z^2}$,

in which iii, jjj and kkk are the usual standard unit vectors along the x, y and z-axes respectively. With

this notation we will write FFF (rrr) as an alternative to the notation FFF (x, y, z) for the value of the

vector field FFF at the point rrr given by (5.1.1).


Figure 5.1: Curve Γ in R3

Now we can define the line integral of the vector field FFF (rrr) along the curve Γ as follows: Introduce

points $\mathbf{r}_0, \mathbf{r}_1, \ldots, \mathbf{r}_i, \mathbf{r}_{i+1}, \ldots, \mathbf{r}_n$ along the curve as shown at Figure 5.1, with $\mathbf{r}_0$ and $\mathbf{r}_n$ corresponding

to A and B respectively, and put

(5.1.2) $\Delta\mathbf{r}_i := \mathbf{r}_{i+1} - \mathbf{r}_i, \quad i = 0, 1, \ldots, n-1.$

Then the inner product $\mathbf{F}(\mathbf{r}_i) \cdot \Delta\mathbf{r}_i$ is a scalar. If the sum

(5.1.3) $\sum_{i=0}^{n-1} \mathbf{F}(\mathbf{r}_i) \cdot \Delta\mathbf{r}_i$

converges to a real number as $n \to \infty$ and $\max_{0 \le i \le n-1} \|\Delta\mathbf{r}_i\| \to 0$, then this limit is called the line

integral of the vector field FFF (rrr) along the curve Γ and denoted by

(5.1.4) $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}$ or $\int_\Gamma \mathbf{F}(x, y, z) \cdot d\mathbf{r}$ or, more briefly, by $\int_\Gamma \mathbf{F} \cdot d\mathbf{r}$.
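The defining sum at (5.1.3) can be evaluated directly once points along the curve are chosen. The sketch below does this for an illustrative case of our own choosing: the field F(r) = r and the straight segment from A = (0, 0, 0) to B = (1, 1, 1) with equally spaced points. For this choice the sum works out by hand to 3(n − 1)/(2n), which tends to 3/2 as n grows.

```python
def riemann_line_sum(F, points):
    """Sum of F(r_i) . (r_{i+1} - r_i) over consecutive points, as in (5.1.3)."""
    total = 0.0
    for r_i, r_next in zip(points, points[1:]):
        delta = tuple(b - a for a, b in zip(r_i, r_next))
        total += sum(f * d for f, d in zip(F(r_i), delta))
    return total

def F(r):
    """Illustrative vector field F(r) = r = (x, y, z)."""
    return r

# Points r_0, ..., r_n on the straight segment from A = (0, 0, 0) to B = (1, 1, 1).
n = 1000
points = [(i / n, i / n, i / n) for i in range(n + 1)]
approx = riemann_line_sum(F, points)
print(approx)
```

Increasing n refines the subdivision of the curve, so the largest ‖∆r_i‖ shrinks and the sum approaches the line integral.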

Remark 5.1.2. In the particular case where the starting and end points A and B of the curve Γ

coincide, so that Γ is a closed curve in R3 (see Figure 5.2) the notations of (5.1.4) are sometimes

modified to

(5.1.5) $\oint_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}$ or $\oint_\Gamma \mathbf{F}(x, y, z) \cdot d\mathbf{r}$ or, more briefly, by $\oint_\Gamma \mathbf{F} \cdot d\mathbf{r}$,

the purpose of the small circle over the integral sign being to indicate that the line integral is over

a closed curve. In this case the line integral is called the circulation of the vector field FFF around the


closed curve Γ. In general, the small circle over the integral sign is quite redundant and does not

tell us anything new. We shall therefore avoid the notations at (5.1.5) and just use the notations

at (5.1.4) even when Γ is a closed curve.

Figure 5.2: Circulation of the vector field FFF

Remark 5.1.3. What is the significance of the line integral we have just defined? This depends

on the physical significance of the field FFF . Suppose that the vector field FFF is an electric field.

Then, with reference to Figure 5.1, FFF (rrri) is the force exerted on a standard unit positive charge

(i.e. a positive charge of one coulomb) when it is located at point rrri on Γ, the quantity FFF (rrri) ·∆rrri is

approximately the work done by the electric field in displacing the standard unit positive charge

along Γ from rrri to rrri+1, and the quantity at (5.1.3) is approximately the work done by the electric

field in moving the unit positive charge along Γ from A to B. It is clear that the limit given by the

line integral at (5.1.4), that is

(5.1.6) $\int_\Gamma \mathbf{F} \cdot d\mathbf{r},$

is exactly the work done by the electric field in moving the standard unit positive charge along Γ

from A to B. This quantity is of course just the voltage difference between A and B. If Γ is the

closed curve in Figure 5.2 then the circulation of the electric field FFF around the closed curve Γ,

which is again the line integral at (5.1.6), is the total work done by the electric field in moving


a unit positive charge once completely around Γ. This quantity (measured in volts) is called the

electromotive force (or emf) of the electric field FFF around Γ. In conclusion, we see that line integrals

can have definite physical significance. In fact, line integrals are indispensable in much of physics

and engineering. In particular, as we shall see later, line integrals are essential for expressing the

basic laws of electromagnetism in mathematical form.

It is one thing to define the line integral as we have done, quite another thing to actually calculate

the line integral of a given vector field along a given curve. As one knows from ordinary calculus

the calculation of integrals can be a challenge. Fortunately, line integrals can be quite tractable

when we know a path

(5.1.7) γγγ : [a, b]→ R3

giving the curve Γ (recall Definition 4.2.2), which is nearly always the case when we actually have

to calculate a line integral. To see this, fix points ti in the interval a ≤ t ≤ b such that

(5.1.8) $a = t_0 < t_1 < \ldots < t_n = b$ and put $\Delta t_i := t_{i+1} - t_i$ and $\mathbf{r}_i := \boldsymbol{\gamma}(t_i)$.

Then of course

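(5.1.9)
$$
\begin{aligned}
\Delta\mathbf{r}_i &= \mathbf{r}_{i+1} - \mathbf{r}_i \\
&= \boldsymbol{\gamma}(t_{i+1}) - \boldsymbol{\gamma}(t_i) \\
&\approx \boldsymbol{\gamma}^{(1)}(t_i)(t_{i+1} - t_i) \quad \text{(linear approximation when } \Delta t_i \text{ is small)} \\
&= \boldsymbol{\gamma}^{(1)}(t_i)\,\Delta t_i.
\end{aligned}
$$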

Then

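(5.1.10) $\sum_{i=0}^{n-1} \mathbf{F}(\mathbf{r}_i) \cdot \Delta\mathbf{r}_i \approx \sum_{i=0}^{n-1} \mathbf{F}(\boldsymbol{\gamma}(t_i)) \cdot \boldsymbol{\gamma}^{(1)}(t_i)\,\Delta t_i,$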

as follows from (5.1.9) and (5.1.8). If we define the function

(5.1.11) g(t) := FFF (γγγ(t)) · γγγ(1)(t), a ≤ t ≤ b,

then it is clear that the right side of (5.1.10) converges to the ordinary integral

(5.1.12) $\int_a^b g(t)\,dt \equiv \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt.$


It follows that

(5.1.13) $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt$

so that evaluation of the line integral on the left boils down to evaluation of the ordinary integral

on the right, this just being evaluation of the integral of g(t) at (5.1.11) over the interval a ≤ t ≤ b.

In view of the expansion at (4.2.5), that is

(5.1.14) γγγ(t) = x(t)iii+ y(t)jjj + z(t)kkk for all t in a ≤ t ≤ b,

we can “expand” the right side of (5.1.13). Indeed, from (5.1.14) we have

(5.1.15) $\boldsymbol{\gamma}^{(1)}(t) = \dfrac{dx}{dt}(t)\,\mathbf{i} + \dfrac{dy}{dt}(t)\,\mathbf{j} + \dfrac{dz}{dt}(t)\,\mathbf{k}$ for all t in a ≤ t ≤ b.

Moreover, expanding the vector field FFF according to (3.2.17), that is

(5.1.16) FFF (x, y, z) = F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk, for all (x, y, z) in D,

we see from (5.1.16), (5.1.15) and (5.1.14) that

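(5.1.17)
$$
\begin{aligned}
\mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) &= \left[F_1(x(t), y(t), z(t))\,\mathbf{i} + F_2(x(t), y(t), z(t))\,\mathbf{j} + F_3(x(t), y(t), z(t))\,\mathbf{k}\right] \cdot \left[\frac{dx}{dt}(t)\,\mathbf{i} + \frac{dy}{dt}(t)\,\mathbf{j} + \frac{dz}{dt}(t)\,\mathbf{k}\right] \\
&= F_1(x(t), y(t), z(t))\,\frac{dx}{dt}(t) + F_2(x(t), y(t), z(t))\,\frac{dy}{dt}(t) + F_3(x(t), y(t), z(t))\,\frac{dz}{dt}(t).
\end{aligned}
$$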

From (5.1.17) and (5.1.13) we are able to write the line integral in the expanded form

(5.1.18)
$$
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_a^b \left[F_1(x(t), y(t), z(t))\,\frac{dx}{dt}(t) + F_2(x(t), y(t), z(t))\,\frac{dy}{dt}(t) + F_3(x(t), y(t), z(t))\,\frac{dz}{dt}(t)\right] dt.
$$
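The expanded form of the line integral translates almost line by line into a numerical computation. The sketch below evaluates it for a three-dimensional case of our own choosing: the field F(x, y, z) = (−y, x, z) along the helix γγγ(t) = (cos(t), sin(t), t) for 0 ≤ t ≤ 2π. For this choice the integrand reduces by hand to 1 + t, so the exact value is 2π + 2π². The ordinary integral is approximated with the composite Simpson rule, which is an arbitrary choice of quadrature.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def integrand(t):
    """F1 dx/dt + F2 dy/dt + F3 dz/dt for F = (-y, x, z) on the helix (cos t, sin t, t)."""
    x, y, z = math.cos(t), math.sin(t), t
    dx, dy, dz = -math.sin(t), math.cos(t), 1.0
    F1, F2, F3 = -y, x, z
    return F1 * dx + F2 * dy + F3 * dz

value = simpson(integrand, 0.0, 2 * math.pi)
print(value, 2 * math.pi + 2 * math.pi ** 2)
```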

Remark 5.1.4. Needless to say, all of the preceding trivially specializes to the case where we have

a curve Γ in the plane R2 (instead of in R3) and FFF (x, y) is a vector field in R2.

Example 5.1.5. A vector field in R2 is defined by

(5.1.19) FFF (x, y) := (y,−x) = yiii− xjjj for all (x, y) in R2.


Γ1 is a curve in R2 of the path

(5.1.20) γγγ : [0, π/2]→ R2, given by γγγ(t) := (cos(t), sin(t)).

Determine the line integral of FFF along Γ1.

We apply (5.1.13). From (5.1.20) and (5.1.19)

(5.1.21) FFF (γγγ(t)) = (sin(t),− cos(t)), 0 ≤ t ≤ π/2,

and

(5.1.22) $\boldsymbol{\gamma}^{(1)}(t) = \left(\dfrac{d}{dt}\cos(t), \dfrac{d}{dt}\sin(t)\right) = (-\sin(t), \cos(t)), \quad 0 \le t \le \pi/2.$

From (5.1.22) and (5.1.21)

(5.1.23) $\mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) = (\sin(t), -\cos(t)) \cdot (-\sin(t), \cos(t)) = -\sin^2(t) - \cos^2(t) = -1.$

From (5.1.23) and (5.1.13) we have

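(5.1.24) $\int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_0^{\pi/2} \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt = \int_0^{\pi/2} (-1)\,dt = -\dfrac{\pi}{2}.$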

Now suppose that Γ2 is another curve in R2, of the path

(5.1.25) γγγ : [0, 1]→ R2, given by γγγ(t) := (1− t, t).

We repeat the computation of the line integral of the vector field FFF at (5.1.19) but along the curve

Γ2 corresponding to the path at (5.1.25).

From (5.1.19) and (5.1.25)

(5.1.26) FFF (γγγ(t)) = (t, t− 1) 0 ≤ t ≤ 1,

and

(5.1.27) $\boldsymbol{\gamma}^{(1)}(t) = \left(\dfrac{d}{dt}(1 - t), \dfrac{d}{dt}(t)\right) = (-1, 1), \quad 0 \le t \le 1,$

so that

(5.1.28) FFF (γγγ(t)) · γγγ(1)(t) = (t, t− 1) · (−1, 1) = −1.

From (5.1.28) and (5.1.13) we have

(5.1.29)

∫Γ2

FFF (rrr) · drrr =

∫ 1

0

FFF (γγγ(t)) · γγγ(1)(t) dt =

∫ 1

0

(−1) dt = −1.
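The two line integrals of Example 5.1.5 can be cross-checked numerically from (5.1.13). The sketch below is one possible way to do this; the helper `line_integral`, the midpoint rule and the number of subintervals are our own choices rather than anything prescribed by the notes.

```python
import math

def line_integral(F, gamma, dgamma, a, b, n=2000):
    """Midpoint-rule approximation of the integral of F(gamma(t)) . gamma'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(f * d for f, d in zip(F(gamma(t)), dgamma(t)))
    return total * h

def F(r):
    """Vector field (5.1.19): F(x, y) = (y, -x)."""
    x, y = r
    return (y, -x)

# Curve Gamma_1: quarter circle (5.1.20); curve Gamma_2: straight segment (5.1.25).
integral_1 = line_integral(F, lambda t: (math.cos(t), math.sin(t)),
                           lambda t: (-math.sin(t), math.cos(t)), 0.0, math.pi / 2)
integral_2 = line_integral(F, lambda t: (1.0 - t, t),
                           lambda t: (-1.0, 1.0), 0.0, 1.0)
print(integral_1, integral_2)
```

The two printed values should agree with the hand computations of the example, one for each curve.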


Figure 5.3: The curves Γ1 and Γ2

Remark 5.1.6. Example 5.1.5 illustrates something very important. In Figure 5.3 we have drawn

the curves Γ1 and Γ2 corresponding respectively to the paths γγγ at (5.1.20) and (5.1.25).

We see that Γ1 and Γ2 are distinct curves in R2, but do start at the common point A = (1, 0) and

end at the common point B = (0, 1), and that the line integrals of the vector field at (5.1.19) are

different. In general, if one has distinct curves Γ1 and Γ2 which nevertheless begin at a common

point A and end at a common point B, then the line integrals of a vector field over these curves

can be expected to be different. We will later identify an important class of vector fields which have the special

property that the line integral is the same for any curve from a given point A to a given point

B.

Remark 5.1.7. The equation (5.1.13) relates the line integral of the vector field FFF along a curve Γ

to a dt-integral which involves the parametric representation of Γ by some path (5.1.7). We know

from Example 4.1.1 and Example 4.2.6 that different paths can be the parametric representation of

the same curve. Furthermore, in Remark 4.2.8 we gave a general method for starting with the path

of a curve and constructing a generally different path with the same curve. With this in mind, it is

essential to check that the dt-integral on the right of (5.1.13) is the same regardless of which path

γγγ : [a, b] → R3 we use as a parametric representation of the curve Γ. The following result assures

us that this is the case:


Theorem 5.1.8. Suppose that FFF : R3 → R3 is a continuous vector field and that

(5.1.30) γγγ1 : [a1, b1]→ R3 and γγγ2 : [a2, b2]→ R3

are C1-paths having the same curve Γ, so that

(5.1.31) Γ = {γγγ1(t) ∈ R3 | a1 ≤ t ≤ b1} and Γ = {γγγ2(t) ∈ R3 | a2 ≤ t ≤ b2},

i.e. γγγ1(t) traverses the curve Γ as t increases through the interval a1 ≤ t ≤ b1, and γγγ2(t) traverses

the identical curve Γ as t increases through the interval a2 ≤ t ≤ b2. Then

(5.1.32)

∫ b1

a1

FFF (γγγ1(t)) · γγγ(1)1 (t) dt =

∫ b2

a2

FFF (γγγ2(t)) · γγγ(1)2 (t) dt.

Proof: Suppose that the second parametric representation (or path)

(5.1.33) γγγ2 : [a2, b2]→ R3,

of the curve Γ is related to the first parametric representation

(5.1.34) γγγ1 : [a1, b1]→ R3,

of the same curve Γ by the construction seen at Remark 4.2.8, that is

(5.1.35) γγγ2(s) := γγγ1(ψ(s)), a2 ≤ s ≤ b2,

for some function

(5.1.36) ψ : [a2, b2]→ [a1, b1]

such that

(5.1.37) $\psi(a_2) = a_1, \quad \psi(b_2) = b_1, \quad \psi^{(1)}(s) > 0$ for all $a_2 \le s \le b_2$.

To see that (5.1.32) holds we first evaluate the left side of (5.1.32) using integration by substitution.

Therefore define the substitution

(5.1.38) t := ψ(s), so that dt = ψ(1)(s) ds.

We will use this substitution to write the dt-integral on the left side of (5.1.32) as a ds-integral.

With this substitution the lower limit of the ds-integral is given by s1 such that a1 = ψ(s1), and

the upper limit is given by s2 such that b1 = ψ(s2), that is

(5.1.39) $s_1 = \psi^{-1}(a_1), \quad s_2 = \psi^{-1}(b_1).$


Then integration by substitution gives

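(5.1.40) $\int_{a_1}^{b_1} \mathbf{F}(\boldsymbol{\gamma}_1(t)) \cdot \boldsymbol{\gamma}_1^{(1)}(t)\,dt = \int_{\psi^{-1}(a_1)}^{\psi^{-1}(b_1)} \mathbf{F}(\boldsymbol{\gamma}_1(\psi(s))) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds.$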

From (5.1.37) we have

(5.1.41) $\psi^{-1}(a_1) = a_2, \quad \psi^{-1}(b_1) = b_2.$

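Now put (5.1.41) into (5.1.40):

(5.1.42)
$$
\begin{aligned}
\int_{a_1}^{b_1} \mathbf{F}(\boldsymbol{\gamma}_1(t)) \cdot \boldsymbol{\gamma}_1^{(1)}(t)\,dt &= \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_1(\psi(s))) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds \\
&= \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds,
\end{aligned}
$$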

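(we used (5.1.35) at the second equality). Now evaluate the derivative $\boldsymbol{\gamma}_1^{(1)}(\psi(s))$; for this take the

s-derivative of each side of (5.1.35) to get

(5.1.43)
$$
\begin{aligned}
\boldsymbol{\gamma}_2^{(1)}(s) &= \frac{d}{ds}\boldsymbol{\gamma}_2(s) \\
&= \frac{d}{ds}\boldsymbol{\gamma}_1(\psi(s)) \quad \text{(from (5.1.35))} \\
&= \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s) \quad \text{(from the chain rule)},
\end{aligned}
$$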

and from (5.1.43) we get

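(5.1.44) $\int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds = \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_2^{(1)}(s)\,ds.$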

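Upon combining (5.1.44) and (5.1.42) we obtain

$$\int_{a_1}^{b_1} \mathbf{F}(\boldsymbol{\gamma}_1(t)) \cdot \boldsymbol{\gamma}_1^{(1)}(t)\,dt = \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_2^{(1)}(s)\,ds,$$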

which is just (5.1.32).

Of course Theorem 5.1.8 is an extremely important result, for it assures us that we get the same

line integral regardless of the path that we use for a parametric representation of the curve Γ. Were

this not the case then the whole notion of a line integral would not make any sense at all!
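The independence of the line integral from the choice of parametric representation can be illustrated numerically. The sketch below takes the two representations of the semicircle from Remark 4.2.8, namely (4.2.7) and its reparametrization by ψ(s) = s² − 1, together with the field F(x, y) = (y, −x) of (5.1.19), and evaluates both sides of (5.1.32). The choice of field and the midpoint rule are our own; for this field both sides work out by hand to −π.

```python
import math

def midpoint(g, a, b, n=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(r):
    """Illustrative field F(x, y) = (y, -x), as at (5.1.19)."""
    x, y = r
    return (y, -x)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gamma1(t):
    """Path (4.2.7) on [a1, b1] = [0, 1]."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def dgamma1(t):
    return (-math.pi * math.sin(math.pi * t), math.pi * math.cos(math.pi * t))

def gamma2(s):
    """Reparametrized path gamma1(psi(s)), psi(s) = s^2 - 1, on [a2, b2] = [1, sqrt(2)]."""
    return gamma1(s * s - 1.0)

def dgamma2(s):
    # Chain rule: the derivative of gamma1(psi(s)) is gamma1'(psi(s)) * psi'(s), with psi'(s) = 2s.
    return tuple(2.0 * s * c for c in dgamma1(s * s - 1.0))

left = midpoint(lambda t: dot(F(gamma1(t)), dgamma1(t)), 0.0, 1.0)
right = midpoint(lambda s: dot(F(gamma2(s)), dgamma2(s)), 1.0, math.sqrt(2.0))
print(left, right)
```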


5.2 Line Integral of Scalar Field and Arc Length

We have defined the line integral of a vector field along a given curve in R3. We next define the line

integral of a scalar field along a given curve in R3 by a very similar construction. We therefore fix

a given curve in the space R3 which starts at a point A and ends at a point B which we denote by

Γ, and we are also given a scalar field f(x, y, z) (see Figure 5.1). As at Remark 5.1.1 we identify a

point (x, y, z) with a vector rrr (see (5.1.1)) and write f(rrr) instead of f(x, y, z). Exactly as with the

definition of the line integral of a vector field we introduce points rrr0, rrr1, . . . , rrri, rrri+1, . . . , rrrn along

the curve as shown at Figure 5.1, with rrr0 and rrrn corresponding to A and B respectively, and put

(5.2.45) ∆rrri := rrri+1 − rrri, i = 0, 1, . . . , n− 1.

If

(5.2.46) ∆si := ‖∆rrri‖

denotes the usual Euclidean length of the vector ∆rrri then the product f(rrri)∆si = f(rrri) ‖∆rrri‖ is a

scalar. If the sum

\[
\sum_{i=0}^{n-1} f(\mathbf{r}_i)\,\Delta s_i \tag{5.2.47}
\]

converges to a real number as n→∞ and max0≤i≤n−1 ‖∆rrri‖ → 0, then this limit is called the line

integral of the scalar field f along the curve Γ and denoted by

\[
\int_\Gamma f(\mathbf{r})\,ds \quad\text{or}\quad \int_\Gamma f(x, y, z)\,ds \quad\text{or, more briefly, by}\quad \int_\Gamma f\,ds. \tag{5.2.48}
\]

Exactly as with the line integral of a vector field, evaluation of the line integral of a scalar field is

facilitated when the curve Γ is the curve of a path (5.1.7). To see this, fix points ti in the interval

a ≤ t ≤ b exactly as at (5.1.8). Then (5.1.9) continues to hold, so that in particular

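\[
\Delta s_i = \|\Delta\mathbf{r}_i\| \approx \|\boldsymbol{\gamma}^{(1)}(t_i)\|\,\Delta t_i, \tag{5.2.49}
\]

and

\[
\sum_{i=0}^{n-1} f(\mathbf{r}_i)\,\Delta s_i \approx \sum_{i=0}^{n-1} f(\boldsymbol{\gamma}(t_i))\,\|\boldsymbol{\gamma}^{(1)}(t_i)\|\,\Delta t_i, \tag{5.2.50}
\]

as follows from (5.2.49) and (5.1.8). If we define the function

\[
g(t) := f(\boldsymbol{\gamma}(t))\,\|\boldsymbol{\gamma}^{(1)}(t)\|, \quad a \le t \le b, \tag{5.2.51}
\]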

then the right side of (5.2.50) converges to the ordinary integral

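\[
\int_a^b g(t)\,dt \equiv \int_a^b f(\boldsymbol{\gamma}(t))\,\|\boldsymbol{\gamma}^{(1)}(t)\|\,dt. \tag{5.2.52}
\]

It follows that

\[
\int_\Gamma f(\mathbf{r})\,ds = \int_a^b f(\boldsymbol{\gamma}(t))\,\|\boldsymbol{\gamma}^{(1)}(t)\|\,dt. \tag{5.2.53}
\]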

That is, evaluation of the line integral of a scalar field f along a curve Γ with a parametric repre-

sentation (5.1.7) reduces to evaluation of the ordinary integral of the function at (5.2.51) over the

interval a ≤ t ≤ b.

Example 5.2.1. A scalar field in R2 is defined by

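\[
f(x, y) := x^2 + y \quad \text{for all } (x, y) \text{ in } \mathbb{R}^2, \tag{5.2.54}
\]

and Γ is a curve in R2 with the parametric representation

\[
\boldsymbol{\gamma} : [0, 1] \to \mathbb{R}^2, \quad \text{given by} \quad \boldsymbol{\gamma}(t) := t\,\mathbf{i} - t\,\mathbf{j} \equiv (t, -t). \tag{5.2.55}
\]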

We apply (5.2.53). From (5.2.55) and (5.2.54)

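\[
f(\boldsymbol{\gamma}(t)) = f(x(t), y(t)) = f(t, -t) = t^2 - t, \quad 0 \le t \le 1, \tag{5.2.56}
\]

and

\[
\boldsymbol{\gamma}^{(1)}(t) = \frac{d}{dt}(t)\,\mathbf{i} + \frac{d}{dt}(-t)\,\mathbf{j} = \mathbf{i} - \mathbf{j} \equiv (1, -1), \quad 0 \le t \le 1, \tag{5.2.57}
\]

so that

\[
\|\boldsymbol{\gamma}^{(1)}(t)\| = \sqrt{1^2 + (-1)^2} = \sqrt{2}, \quad 0 \le t \le 1. \tag{5.2.58}
\]

From (5.2.58) and (5.2.56)

\[
f(\boldsymbol{\gamma}(t))\,\|\boldsymbol{\gamma}^{(1)}(t)\| = \sqrt{2}\,(t^2 - t), \tag{5.2.59}
\]

and from (5.2.59) and (5.2.53) we have

\[
\int_\Gamma f(\mathbf{r})\,ds = \sqrt{2}\int_0^1 (t^2 - t)\,dt = -\frac{\sqrt{2}}{6}. \tag{5.2.60}
\]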
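
The value obtained in Example 5.2.1 can be confirmed numerically by approximating the ordinary integral on the right side of (5.2.53). The following is a minimal sketch, assuming a simple midpoint rule; the function name `scalar_line_integral` and the number of subintervals are illustrative choices, not part of the notes.

```python
import math

def scalar_line_integral(f, gamma, dgamma, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f(gamma(t)) * ||gamma'(t)|| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, y = gamma(t)
        dx, dy = dgamma(t)
        total += f(x, y) * math.hypot(dx, dy) * h
    return total

f = lambda x, y: x ** 2 + y        # scalar field (5.2.54)
gamma = lambda t: (t, -t)          # path (5.2.55)
dgamma = lambda t: (1.0, -1.0)     # derivative (5.2.57)

approx = scalar_line_integral(f, gamma, dgamma, 0.0, 1.0)
exact = -math.sqrt(2) / 6          # value from (5.2.60)
print(approx, exact)
```

The two printed numbers agree to many decimal places, which is consistent with (5.2.60).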


Remark 5.2.2. Returning to the partition of the curve Γ shown in Figure 5.1, it is evident that if

the limit of the sum

\[
\sum_{i=0}^{n-1} \|\Delta\mathbf{r}_i\| \tag{5.2.61}
\]

converges to a real number as n→∞ and max0≤i≤n−1 ‖∆rrri‖ → 0, then this limit must be the length

of the curve Γ, so that length of Γ is just the line integral of the constant scalar field f(x, y, z) ≡ 1

along the curve Γ. In particular, the length of Γ is just given by (5.2.53) when we take f(x, y, z) ≡ 1,

that is

\[
\operatorname{length}(\Gamma) = \int_a^b \|\boldsymbol{\gamma}^{(1)}(t)\|\,dt. \tag{5.2.62}
\]

An (undramatic) illustration of the use of (5.2.62) is

Example 5.2.3. Find the length of the circle of radius α > 0 in the plane R2. The circle is the

curve Γ. In order to use (5.2.62) we must parametrically represent Γ as the curve of some path

γγγ : [a, b]→ R2. In this case a clear choice of path is

(5.2.63) γγγ(t) := (α cos(t), α sin(t)), 0 ≤ t ≤ 2π.

Then, from (5.2.63),

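\[
\boldsymbol{\gamma}^{(1)}(t) = \left(\frac{d}{dt}(\alpha\cos(t)), \frac{d}{dt}(\alpha\sin(t))\right) = (-\alpha\sin(t), \alpha\cos(t)), \quad 0 \le t \le 2\pi, \tag{5.2.64}
\]

and from (5.2.64)

\[
\|\boldsymbol{\gamma}^{(1)}(t)\| = \sqrt{(-\alpha\sin(t))^2 + (\alpha\cos(t))^2} = \alpha. \tag{5.2.65}
\]

From (5.2.65) and (5.2.62)

\[
\operatorname{length}(\Gamma) = \int_0^{2\pi} \alpha\,dt = 2\pi\alpha.
\]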
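
The sum (5.2.61) of Remark 5.2.2 can be evaluated directly for the circle, which shows the polygonal lengths approaching 2πα as the partition is refined. This is a small sketch only; the radius α = 2 and the helper name `polygonal_length` are assumptions made for illustration.

```python
import math

def polygonal_length(gamma, a, b, n):
    """Sum of ||r_{i+1} - r_i|| over n equal parameter steps, as in (5.2.61)."""
    h = (b - a) / n
    pts = [gamma(a + i * h) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

alpha = 2.0  # assumed radius, for illustration only
circle = lambda t: (alpha * math.cos(t), alpha * math.sin(t))  # path (5.2.63)

for n in (8, 64, 512):
    print(n, polygonal_length(circle, 0.0, 2 * math.pi, n))
print("2*pi*alpha =", 2 * math.pi * alpha)
```

Each polygonal length is slightly smaller than 2πα, since every chord is shorter than the arc it spans, and the gap shrinks as n grows.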


Chapter 6

Conservative Vector Fields

A conservative vector field is a particularly important type of vector field with the property that

“conservation of energy” always holds for these fields. Conservative vector fields occur everywhere

in physics and engineering including electromagnetism, gravitational physics and hydrodynamics.

Before defining a conservative vector field we must first formulate the idea of the gradient of a scalar

field and dispose of some “calculus” preliminaries.

6.1 Gradient of a Scalar Field

Definition 6.1.1. Suppose that f : D → R is a C1-scalar field in R3 with domain D ⊂ R3 (see

Definition 3.2.1). Then the gradient of the scalar field f is the vector field gradf on the same domain

D defined by

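\[
\begin{aligned}
(\operatorname{grad} f)(x, y, z) &:= \left(\frac{\partial f}{\partial x}(x, y, z), \frac{\partial f}{\partial y}(x, y, z), \frac{\partial f}{\partial z}(x, y, z)\right) \\
&\equiv \frac{\partial f}{\partial x}(x, y, z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x, y, z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x, y, z)\,\mathbf{k}, \quad \text{for all } (x, y, z) \text{ in } D.
\end{aligned} \tag{6.1.1}
\]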

An alternative notation for gradf is ∇f (the symbol ∇ is called “del” or “nabla”) so that (gradf)(x, y, z)

and ∇f(x, y, z) denote the same vector in R3 at each (x, y, z) in D.

Remark 6.1.2. Often the defining relation (6.1.1) is written with the generic variable (x, y, z)

stripped away, that is

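\[
\operatorname{grad} f = \nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}, \tag{6.1.2}
\]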

in which case it is understood that the domain is still the set D.

Remark 6.1.3. In Definition 6.1.1 we begin with a scalar field f and end up with a “new” field

(namely a vector field) denoted by gradf , or alternatively by ∇f . We can imagine this whole

process as given by a sort of “black box” in which the “input” is the whole scalar field f and the

“output” is the whole vector field gradf (or ∇f), as shown in Figure 6.1. That is, the black box

takes in the scalar field f and by a process of partial differentiation “pummels” it into the vector

field gradf appearing at the “output” of the box. In the circumstances it makes sense to label this

Figure 6.1: Black box for the gradient vector field

box with the alternative symbols “grad” or “∇”. Put another way, we can regard the black box as

an “operator” which takes the scalar field f and “operates” on it to produce the vector field gradf .

Since this process of operation clearly involves partial differentiation the black box defines a so-

called “differential operator”. There is another useful way to think about this differential operator.

Denote by grad (or ∇) the “symbolic” three dimensional vector

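\[
\operatorname{grad} \equiv \nabla := \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \equiv \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k}. \tag{6.1.3}
\]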

Of course this is just a “symbolic” vector, and not a “real” vector, in the sense that its components

are partial derivative symbols rather than actual numbers, but it is nevertheless very useful. If

we imagine that there are “gaps” behind the symbols grad (or ∇) as well as the partial derivative

symbols ∂/∂x, ∂/∂y and ∂/∂z appearing in (6.1.3), then by inserting f(x, y, z) into each of these

gaps we obtain

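\[
\operatorname{grad} f(x, y, z) = \nabla f(x, y, z) = \frac{\partial f}{\partial x}(x, y, z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x, y, z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x, y, z)\,\mathbf{k}, \tag{6.1.4}
\]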

which is exactly the relation (6.1.1). In this sense the symbolic vector at (6.1.3) describes how the

black box in Figure 6.1 works.

Remark 6.1.4. The gradient of a scalar field appears in the following useful calculation that we shall

frequently use: Suppose that f : D → R is a scalar field with domain D = R3 and γγγ : [a, b]→ R3

is a path with the component-wise representation at (4.3.17). Then f(γγγ(t)) ≡ f(x(t), y(t), z(t)) is

a real valued function defined on a ≤ t ≤ b. By the chain rule the t-derivative of this function is

given by

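\[
\begin{aligned}
\frac{d}{dt} f(\boldsymbol{\gamma}(t)) &= \frac{d}{dt} f(x(t), y(t), z(t)) \\
&= \frac{\partial f}{\partial x}(x(t), y(t), z(t))\,\frac{d}{dt}x(t) + \frac{\partial f}{\partial y}(x(t), y(t), z(t))\,\frac{d}{dt}y(t) + \frac{\partial f}{\partial z}(x(t), y(t), z(t))\,\frac{d}{dt}z(t) \\
&= \left(\frac{\partial f}{\partial x}(x(t), y(t), z(t)), \frac{\partial f}{\partial y}(x(t), y(t), z(t)), \frac{\partial f}{\partial z}(x(t), y(t), z(t))\right) \cdot \left(\frac{d}{dt}x(t), \frac{d}{dt}y(t), \frac{d}{dt}z(t)\right) \\
&= \nabla f(x(t), y(t), z(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) && \text{(see (4.3.18) and (6.1.4))} \\
&= \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) && \text{(see (4.3.17))},
\end{aligned}
\]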

that is we have the general relation

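\[
\frac{d}{dt} f(\boldsymbol{\gamma}(t)) = \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t), \quad \text{for all } t \text{ in } a \le t \le b, \tag{6.1.5}
\]

which holds for any scalar field f and any path γγγ : [a, b] → R3. Furthermore, integrating each side

of (6.1.5) over the interval a ≤ t ≤ b one finds

\[
\begin{aligned}
\int_a^b \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt &= \int_a^b \frac{d}{dt} f(\boldsymbol{\gamma}(t))\,dt \\
&= f(\boldsymbol{\gamma}(b)) - f(\boldsymbol{\gamma}(a)),
\end{aligned}
\]

that is we have the further general relation

\[
\int_a^b \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt = f(\boldsymbol{\gamma}(b)) - f(\boldsymbol{\gamma}(a)), \tag{6.1.6}
\]

which again holds for any scalar field f and any path γγγ : [a, b]→ R3.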


6.2 Conservative Vector Fields

With the preceding preliminaries out of the way, we can define a conservative vector field:

Definition 6.2.1. A vector field FFF : D → R3 with domain D ⊂ R3 is a conservative vector field

when

(6.2.7) FFF (x, y, z) = ∇Ψ(x, y, z), for all (x, y, z) in D,

for some scalar field Ψ : D → R with the same domain D. The scalar field Ψ is called a potential

function of the vector field. In short, a vector field is conservative if it is the gradient of some scalar

field called a potential function of the vector field.

Remark 6.2.2. Observe that a vector field FFF with a potential function Ψ in fact has many potential

functions. Indeed, if c is a real constant, and we put

(6.2.8) Ψ1(x, y, z) := Ψ(x, y, z) + c, for all (x, y, z) in D,

then of course

FFF (x, y, z) = ∇Ψ1(x, y, z), for all (x, y, z) in D,

so that Ψ1 is also a potential function of FFF .

Example 6.2.3. We now give a simple but very important example of a conservative vector field

namely the electrostatic field EEE(x, y, z) from a single point charge Q of Example 3.1.3. We must

prove that this vector field is the gradient of some scalar field. Exactly as at Remark 5.1.1 we put

rrr for the vector from the origin of R3 to point (x, y, z), so that

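\[
\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}, \quad \text{with length} \quad \|\mathbf{r}\| = \sqrt{x^2 + y^2 + z^2}. \tag{6.2.9}
\]

From Example 3.1.3 we know that $\mathbf{E}(x, y, z)$ has magnitude (or length) given by

\[
\|\mathbf{E}(x, y, z)\| = \frac{Q}{4\pi\varepsilon_0\,[x^2 + y^2 + z^2]} \tag{6.2.10}
\]

(see (3.1.1)). Moreover, the direction of $\mathbf{E}(x, y, z)$ is collinear with $\hat{\mathbf{r}}$, the unit vector in the direction

of $\mathbf{r}$, namely

\[
\hat{\mathbf{r}} := \frac{\mathbf{r}}{\|\mathbf{r}\|} = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}}. \tag{6.2.11}
\]

From (6.2.11) and (6.2.10)

\[
\begin{aligned}
\mathbf{E}(x, y, z) &= \frac{Q}{4\pi\varepsilon_0\,[x^2 + y^2 + z^2]}\,\frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}} \\
&= \frac{Q}{4\pi\varepsilon_0}\left(\frac{x}{[x^2 + y^2 + z^2]^{3/2}}, \frac{y}{[x^2 + y^2 + z^2]^{3/2}}, \frac{z}{[x^2 + y^2 + z^2]^{3/2}}\right).
\end{aligned} \tag{6.2.12}
\]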

Now define a scalar field

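\[
\Psi(x, y, z) := -\frac{Q}{4\pi\varepsilon_0}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}}. \tag{6.2.13}
\]

But (easy exercise!) we have

\[
\begin{aligned}
\frac{\partial}{\partial x}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}} &= \frac{-x}{[x^2 + y^2 + z^2]^{3/2}}, \\
\frac{\partial}{\partial y}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}} &= \frac{-y}{[x^2 + y^2 + z^2]^{3/2}}, \\
\frac{\partial}{\partial z}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}} &= \frac{-z}{[x^2 + y^2 + z^2]^{3/2}}.
\end{aligned} \tag{6.2.14}
\]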

From (6.2.12), (6.2.13) and (6.2.14) we get

(6.2.15) EEE(x, y, z) = ∇Ψ(x, y, z),

as required to demonstrate that EEE is a conservative vector field with a potential function given by

the scalar field Ψ at (6.2.13).
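
The relation (6.2.15) can be spot-checked numerically by comparing a finite-difference gradient of Ψ with the formula (6.2.12) for the field. The sketch below rests on assumptions: the constant Q/(4πε0) is normalized to 1, the test point is arbitrary, and `grad` is a hypothetical central-difference helper rather than anything defined in the notes.

```python
import math

K = 1.0  # assumed value of Q / (4*pi*eps0), chosen only for illustration

def psi(x, y, z):
    """Potential function (6.2.13) with the constant normalized to K."""
    return -K / math.sqrt(x * x + y * y + z * z)

def E(x, y, z):
    """Electrostatic field (6.2.12) with the constant normalized to K."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (K * x / r3, K * y / r3, K * z / r3)

def grad(f, p, h=1e-5):
    """Central-difference approximation of the gradient of f at the point p."""
    g = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        g.append((f(*plus) - f(*minus)) / (2 * h))
    return tuple(g)

point = (1.0, -2.0, 0.5)  # arbitrary test point away from the origin
print(grad(psi, point))
print(E(*point))
```

The two printed vectors agree up to the finite-difference error, as (6.2.15) predicts. The check must avoid the origin, where neither Ψ nor the field is defined.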

Remark 6.2.4. It is extremely important to be able to verify when a given vector field is conserva-

tive. In the previous example we did this by “guessing” a scalar function Ψ and then checking that

this is a potential function of the electric field EEE. Clearly this “guesswork” is not a very satisfactory

way in which to proceed in general. Later, we shall learn a mechanical (and easy!) test for verifying

when a vector field is conservative.

Remark 6.2.5. Here we demonstrate that it is easy to calculate the line integral of a conservative

vector field when we know its potential function. Suppose that FFF is a conservative vector field in

R3 with corresponding potential function Ψ, that is

(6.2.16) FFF (x, y, z) = ∇Ψ(x, y, z), for all (x, y, z) in R3.


If Γ is a curve from point A with coordinates (x0, y0, z0) to a point B with coordinates (x1, y1, z1)

then

\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \Psi(x_1, y_1, z_1) - \Psi(x_0, y_0, z_0). \tag{6.2.17}
\]

To verify this suppose that the path

(6.2.18) γγγ : [a, b]→ R3

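is some parametric representation of the curve Γ. Then

\[
\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt && \text{(see (5.1.13))} \\
&= \int_a^b \nabla\Psi(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt && \text{(see (6.2.16))} \\
&= \Psi(\boldsymbol{\gamma}(b)) - \Psi(\boldsymbol{\gamma}(a)) && \text{(from (6.1.6) with } \Psi \text{ in place of } f\text{)}.
\end{aligned} \tag{6.2.19}
\]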

Now (6.2.17) follows from (6.2.19) since γγγ(a) = (x0, y0, z0) and γγγ(b) = (x1, y1, z1).

Remark 6.2.6. The relation (6.2.17) shows that the line integral of a conservative vector field is

easy to evaluate provided that we know a potential function of the conservative field. Much more

important, however, is the following consequence of (6.2.17): Suppose that Γ1 and Γ2 are two curves

in R3 starting at a common point (x0, y0, z0) and ending at the common point (x1, y1, z1) (see Figure

6.2).

Then, from (6.2.17) we have

\[
\int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}, \tag{6.2.20}
\]

that is, the line integral of a conservative vector field along a curve depends only on the end points

of the curve and not on the form of the curve between these end points, i.e. the line integral of a

conservative vector field is path independent.

Another related, and very important, property of conservative vector fields also follows at once

from (6.2.17). Suppose that Γ is a closed curve in R3, that is a curve which begins and ends at

the same point. Fix some point (x0, y0, z0) on Γ; then the curve starts at (x0, y0, z0) and ends at

(x1, y1, z1) = (x0, y0, z0). Upon putting (x1, y1, z1) = (x0, y0, z0) in (6.2.17) we obtain the following

(recall the notation (5.1.5)): For a conservative vector field we have

\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0 \quad \text{for every closed curve } \Gamma \text{ in } \mathbb{R}^3, \tag{6.2.21}
\]


Figure 6.2: Curves Γ1 and Γ2 in R3 with common starting and ending points

that is, the circulation of a conservative vector field around every closed curve in R3 is always zero.

How about a converse to the above statement? That is, if we know that (6.2.21) holds for some

vector field FFF , then is it necessarily the case that FFF is conservative? Actually, this converse is true

but its proof relies on a result that we will only learn about much later, called Stokes’ theorem. We

therefore just state the following result, which not only gives the preceding converse, but, much

more importantly, also provides a genuinely practical test for determining when a given vector field

is conservative:

Theorem 6.2.7. Suppose that FFF : D → R3 is a C1-vector field with domain D = R3, and we

put FFF (x, y, z) = (F1(x, y, z), F2(x, y, z), F3(x, y, z)) for all (x, y, z) in R3 i.e. F1(x, y, z), F2(x, y, z),

and F3(x, y, z) are the real scalar components of the vector FFF (x, y, z) in R3. Then the following are

equivalent:

(a) FFF is a conservative vector field;

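(b) $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0$ for every closed curve Γ in R3;

(c) for all (x, y, z) in R3 we have

\[
\frac{\partial F_3}{\partial y}(x, y, z) = \frac{\partial F_2}{\partial z}(x, y, z), \quad
\frac{\partial F_1}{\partial z}(x, y, z) = \frac{\partial F_3}{\partial x}(x, y, z), \quad
\frac{\partial F_2}{\partial x}(x, y, z) = \frac{\partial F_1}{\partial y}(x, y, z).
\]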

Of course it is the equivalence of (a) and (c) in Theorem 6.2.7 which is of particular interest: if

we can check the three conditions of (c) then the vector field is indeed conservative.


Remark 6.2.8. Everything we have said above of course specializes trivially to the case of vector

fields in R2, as we briefly indicate next: Suppose that f : D → R is a scalar field in R2 with domain

D ⊂ R2. Then the gradient of this scalar field is the vector field in R2 with the same domain D

defined by

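\[
\begin{aligned}
\nabla f(x, y) &= (\operatorname{grad} f)(x, y) \\
&:= \left(\frac{\partial f}{\partial x}(x, y), \frac{\partial f}{\partial y}(x, y)\right), \quad \text{for all } (x, y) \text{ in } D,
\end{aligned} \tag{6.2.22}
\]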

in which ∇ is now the two-dimensional operator defined by

\[
\nabla := \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right). \tag{6.2.23}
\]

Exactly as at (6.1.5) and (6.1.6), in the two-dimensional case we have

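\[
\begin{aligned}
\frac{d}{dt} f(\boldsymbol{\gamma}(t)) &= \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t), \quad \text{for all } t \text{ in } a \le t \le b, \\
\int_a^b \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt &= f(\boldsymbol{\gamma}(b)) - f(\boldsymbol{\gamma}(a)),
\end{aligned} \tag{6.2.24}
\]

which hold for any two-dimensional scalar field f and any path γγγ : [a, b] → R2. Exactly as at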

Definition 6.2.1, a vector field FFF : D → R2 with domain D ⊂ R2 is a conservative vector field when

(6.2.25) FFF (x, y) = ∇Ψ(x, y), for all (x, y) in D,

for some scalar field Ψ : D → R with the same domain D called a potential function of the vector

field. Exactly as at (6.2.17), if FFF is a conservative vector field with a potential function Ψ (that

is, (6.2.25) holds) and Γ is a curve in R2 from point A with coordinates (x0, y0) to a point B with

coordinates (x1, y1) then

\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \Psi(x_1, y_1) - \Psi(x_0, y_0). \tag{6.2.26}
\]

As a consequence of (6.2.26) we see the following: if Γ1 and Γ2 are two curves in R2 starting at a

common point (x0, y0) and ending at the common point (x1, y1) then

\[
\int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}, \tag{6.2.27}
\]

(c.f. (6.2.20)) that is, the line integral of a two-dimensional conservative vector field along a curve

in R2 depends only on the end points of the curve and not on the form of the curve between these

end points. In particular, for a conservative vector field in R2 we have

\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0 \quad \text{for every closed curve } \Gamma \text{ in } \mathbb{R}^2. \tag{6.2.28}
\]


Finally, in the two dimensional case we note that the first two conditions in Theorem 6.2.7(c) fall

away, since there is no dependence on z and F3 is identically zero in the case of vector fields in

R2, so that we are left only with the third condition. That is, in place of Theorem 6.2.7, for two

dimensional fields we have

Theorem 6.2.9. Suppose that FFF : D → R2 is a vector field with domain D = R2, and we put

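FFF (x, y) = (F1(x, y), F2(x, y)) for all (x, y) in R2 i.e. F1(x, y) and F2(x, y) are the real scalar

components of the vector FFF (x, y) in R2. Then the following are equivalent:

(a) FFF is a conservative vector field;

(b) $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0$ for every closed curve Γ in R2;

(c) for all (x, y) in R2 we have

\[
\frac{\partial F_2}{\partial x}(x, y) = \frac{\partial F_1}{\partial y}(x, y).
\]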

Example 6.2.10. FFF : D → R2 is a two dimensional vector field with domain D := R2, defined by

\[
\mathbf{F}(x, y) = (F_1(x, y), F_2(x, y)) \quad \text{for} \quad F_1(x, y) := 2xy, \quad F_2(x, y) := 1 + x^2. \tag{6.2.29}
\]

Establish that FFF is a conservative vector field and determine a potential function.

We use the test of Theorem 6.2.9(c): From (6.2.29)

\[
\frac{\partial F_1}{\partial y}(x, y) = 2x, \quad \frac{\partial F_2}{\partial x}(x, y) = 2x, \tag{6.2.30}
\]

so that the test of Theorem 6.2.9(c) is verified. From the equivalence of (a) and (c) of Theorem

6.2.9 we conclude that FFF is conservative. There is therefore a function Ψ : R2 → R such that

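\[
\frac{\partial \Psi}{\partial x}(x, y) = F_1(x, y), \quad \frac{\partial \Psi}{\partial y}(x, y) = F_2(x, y), \quad \text{for all } (x, y) \text{ in } \mathbb{R}^2, \tag{6.2.31}
\]

which we must determine. From (6.2.31) and (6.2.29)

\[
\frac{\partial \Psi}{\partial x}(x, y) = 2xy. \tag{6.2.32}
\]

Integrate each side of (6.2.32) with respect to x:

\[
\Psi(x, y) = x^2 y + h(y). \tag{6.2.33}
\]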

Notice that h(y) is the constant of integration; since the integration is with respect to x this constant

may depend on y, as we have indicated. Now take y-derivatives of each side of (6.2.33) to get

\[
\frac{\partial \Psi}{\partial y}(x, y) = x^2 + \frac{dh}{dy}(y). \tag{6.2.34}
\]

From (6.2.31) and (6.2.29) we have

\[
\frac{\partial \Psi}{\partial y}(x, y) = 1 + x^2, \tag{6.2.35}
\]

so that (6.2.35) and (6.2.34) give

\[
1 + x^2 = x^2 + \frac{dh}{dy}(y),
\]

so that

\[
\frac{dh}{dy}(y) = 1,
\]

and therefore

\[
h(y) = y + c, \tag{6.2.36}
\]

for some constant c. Combining (6.2.36) and (6.2.33) gives the potential function

\[
\Psi(x, y) = x^2 y + y + c, \quad \text{for all } (x, y) \text{ in } \mathbb{R}^2.
\]
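
The potential function just found gives a convenient check of path independence: by (6.2.26), the line integral of the field of Example 6.2.10 along any curve from (0, 0) to (1, 2) should equal Ψ(1, 2) − Ψ(0, 0) = 4. The sketch below is illustrative only; the two paths (a straight line and a parabola), the end points and the helper `line_integral` are assumptions chosen for the demonstration.

```python
def line_integral(F, gamma, dgamma, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of F(gamma(t)) . gamma'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        Fx, Fy = F(*gamma(t))
        dx, dy = dgamma(t)
        total += (Fx * dx + Fy * dy) * h
    return total

F = lambda x, y: (2 * x * y, 1 + x * x)   # field (6.2.29)
psi = lambda x, y: x * x * y + y          # potential with c = 0

# Two assumed paths from (0, 0) to (1, 2).
line, dline = (lambda t: (t, 2 * t)), (lambda t: (1.0, 2.0))
parab, dparab = (lambda t: (t, 2 * t * t)), (lambda t: (1.0, 4.0 * t))

I_line = line_integral(F, line, dline, 0.0, 1.0)
I_parab = line_integral(F, parab, dparab, 0.0, 1.0)
print(I_line, I_parab, psi(1, 2) - psi(0, 0))  # all approximate 4
```

The constant c cancels in the difference Ψ(1, 2) − Ψ(0, 0), which is why it may be set to zero in the check.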

Example 6.2.11. FFF : D → R3 is a vector field with domain D = R3 defined by

(6.2.37) FFF (x, y, z) = F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk,

with the scalar components

(6.2.38) F1(x, y, z) = y, F2(x, y, z) = z cos(yz) + x, F3(x, y, z) = y cos(yz).

Establish that FFF is a conservative vector field and determine a potential function.

We use the test of Theorem 6.2.7. From (6.2.38) we have

\[
\frac{\partial F_3(x, y, z)}{\partial y} = \cos(yz) - yz\sin(yz), \quad \frac{\partial F_2(x, y, z)}{\partial z} = \cos(yz) - yz\sin(yz), \tag{6.2.39}
\]

which checks the first condition of Theorem 6.2.7(c); the remaining two conditions are similarly

checked showing that FFF is conservative. We then have a function Ψ : R3 → R such that

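\[
\frac{\partial \Psi(x, y, z)}{\partial x} = F_1(x, y, z), \quad \frac{\partial \Psi(x, y, z)}{\partial y} = F_2(x, y, z), \quad \frac{\partial \Psi(x, y, z)}{\partial z} = F_3(x, y, z), \tag{6.2.40}
\]

for all (x, y, z) in R3. From (6.2.40) and (6.2.38)

\[
\frac{\partial \Psi(x, y, z)}{\partial x} = y, \quad \frac{\partial \Psi(x, y, z)}{\partial y} = z\cos(yz) + x, \quad \frac{\partial \Psi(x, y, z)}{\partial z} = y\cos(yz), \tag{6.2.41}
\]

for all (x, y, z) in R3. Integrating the first relation of (6.2.41) with respect to x gives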

(6.2.42) Ψ(x, y, z) = xy + h1(y, z).

Notice that h1(y, z) is the constant of integration; since the integration is with respect to x this

constant may depend on (y, z), as we have indicated. Now take derivatives of each side of (6.2.42)

in y to get

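\[
\frac{\partial \Psi(x, y, z)}{\partial y} = x + \frac{\partial h_1(y, z)}{\partial y}. \tag{6.2.43}
\]

Upon combining (6.2.43) and the second relation of (6.2.41) we find

\[
\frac{\partial h_1(y, z)}{\partial y} = z\cos(yz). \tag{6.2.44}
\]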

Integrating each side of (6.2.44) with respect to y then gives

(6.2.45) h1(y, z) = sin(yz) + h2(z).

Again, notice that h2(z) is the constant of integration; since the integration is with respect to y,

and the only variables appearing in (6.2.45) are y and z, this constant generally depends on z (but

not on x), as we have indicated. Now put (6.2.45) into (6.2.42) to get

(6.2.46) Ψ(x, y, z) = xy + sin(yz) + h2(z).

It remains to use the third relation of (6.2.41) (the only one so far not used). To this end take

z-derivatives of each side of (6.2.46) to get

\[
\frac{\partial \Psi(x, y, z)}{\partial z} = y\cos(yz) + \frac{dh_2(z)}{dz}, \tag{6.2.47}
\]

and then, from (6.2.47) and the third relation of (6.2.41) we obtain

\[
\frac{dh_2(z)}{dz} = 0,
\]

so that

(6.2.48) h2(z) = c,

for a constant c. Putting (6.2.48) in (6.2.46) gives the potential function

(6.2.49) Ψ(x, y, z) = xy + sin(yz) + c.
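
Only the first condition of Theorem 6.2.7(c) was written out in Example 6.2.11. All three conditions can be spot-checked numerically with finite differences, as in the following sketch. The helper names `partial` and `curl_defects`, the step size, and the random sample points are assumptions made for illustration; a small numerical defect is evidence for the conditions, not a proof of them.

```python
import math
import random

def F(x, y, z):
    """Vector field of Example 6.2.11, see (6.2.38)."""
    return (y, z * math.cos(y * z) + x, y * math.cos(y * z))

def partial(i, j, p, h=1e-5):
    """Central-difference estimate of dF_i/dx_j at the point p (0-based indices)."""
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    return (F(*plus)[i] - F(*minus)[i]) / (2 * h)

def curl_defects(p):
    """Differences of the two sides of the three conditions in Theorem 6.2.7(c)."""
    return (partial(2, 1, p) - partial(1, 2, p),   # dF3/dy - dF2/dz
            partial(0, 2, p) - partial(2, 0, p),   # dF1/dz - dF3/dx
            partial(1, 0, p) - partial(0, 1, p))   # dF2/dx - dF1/dy

random.seed(0)
worst = 0.0
for _ in range(100):
    p = [random.uniform(-2.0, 2.0) for _ in range(3)]
    worst = max(worst, max(abs(d) for d in curl_defects(p)))
print(worst)  # largest defect over the sample points
```

The largest defect is at the level of the finite-difference error, consistent with all three conditions holding on R3.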


6.3 Conservation of Energy

We end this chapter by demonstrating a very important physical property of conservative vector

fields. Suppose, as in Remark 4.3.2, that FFF : D → R3 is a vector field with domain D := R3, and

that, at each point (x, y, z) in R3, the vector FFF (x, y, z) is the force acting on a particle of mass m

located at (x, y, z). Now suppose that the force vector field FFF is conservative, with some potential

function Ψ, so that

(6.3.50) FFF (x, y, z) = ∇Ψ(x, y, z), for all (x, y, z) in R3.

If the particle moves through a point rrr = (x, y, z) with some velocity vvv then we define the total

mechanical energy E as

\[
E = \frac{1}{2}\,m\,\|\mathbf{v}\|^2 - \Psi(\mathbf{r}). \tag{6.3.51}
\]

The first term on the right side of (6.3.51) is the kinetic energy and the second term is the potential

energy of the particle. If the particle follows a path

γγγ : [a, b]→ R3

in response to this force, then, at instant t, the particle is at the point γγγ(t) and moving with velocity

vvv(t) = γγγ(1)(t) (see (4.3.29)), so that the total mechanical energy at instant t must be given by

\[
E(t) = \frac{1}{2}\,m\,\|\boldsymbol{\gamma}^{(1)}(t)\|^2 - \Psi(\boldsymbol{\gamma}(t)), \quad \text{for all } t \text{ in } a \le t \le b, \tag{6.3.52}
\]

(obtained by putting rrr = γγγ(t) and vvv = γγγ(1)(t) in (6.3.51)). We will show that E(t) is necessarily

constant over a ≤ t ≤ b, so that mechanical energy is conserved by motion in a conservative force

field. To this end observe that (6.3.52) can be written as

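\[
E(t) = \frac{1}{2}\,m\,\big(\boldsymbol{\gamma}^{(1)}(t) \cdot \boldsymbol{\gamma}^{(1)}(t)\big) - \Psi(\boldsymbol{\gamma}(t)), \quad \text{for all } t \text{ in } a \le t \le b. \tag{6.3.53}
\]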

But (from (4.3.26) - (4.3.27)) we have

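\[
\frac{d}{dt}\big(\boldsymbol{\gamma}^{(1)}(t) \cdot \boldsymbol{\gamma}^{(1)}(t)\big) = 2\,\boldsymbol{\gamma}^{(1)}(t) \cdot \boldsymbol{\gamma}^{(2)}(t), \quad \text{for all } t \text{ in } a \le t \le b, \tag{6.3.54}
\]

and

\[
\begin{aligned}
\frac{d}{dt}\Psi(\boldsymbol{\gamma}(t)) &= \nabla\Psi(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) && \text{(from (6.1.5) with } \Psi \text{ in place of } f\text{)} \\
&= \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) && \text{(see (6.3.50))}.
\end{aligned} \tag{6.3.55}
\]

Now combine (6.3.53), (6.3.54) and (6.3.55) to see that

\[
\frac{d}{dt}E(t) = \big[m\,\boldsymbol{\gamma}^{(2)}(t) - \mathbf{F}(\boldsymbol{\gamma}(t))\big] \cdot \boldsymbol{\gamma}^{(1)}(t), \quad \text{for all } t \text{ in } a \le t \le b. \tag{6.3.56}
\]

But, at Remark 4.3.2, we have seen from Newton’s second law of motion that the path

satisfies the following differential equation (see (4.3.32)):

\[
m\,\boldsymbol{\gamma}^{(2)}(t) = \mathbf{F}(\boldsymbol{\gamma}(t)), \quad \text{for all times } t \text{ in } a \le t \le b. \tag{6.3.57}
\]

From (6.3.57) and (6.3.56) we get

\[
\frac{d}{dt}E(t) = 0, \quad \text{for all } t \text{ in } a \le t \le b,
\]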

as required to establish conservation of mechanical energy.
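
Conservation of mechanical energy can be observed in a small simulation. The sketch below is built entirely on assumptions made for illustration: the potential function Ψ(r) = −½k‖r‖² (so that the force ∇Ψ = −kr is a linear restoring force), the values m = k = 1, the initial data, and the velocity Verlet time-stepping scheme, none of which come from the notes. The energy is computed as in (6.3.51), with −Ψ as the potential energy.

```python
def grad_psi(r, k=1.0):
    """Force F = grad Psi for the assumed potential function Psi(r) = -0.5*k*|r|^2."""
    return tuple(-k * c for c in r)

def energy(r, v, m=1.0, k=1.0):
    """Total mechanical energy (6.3.51): kinetic energy minus Psi(r)."""
    psi = -0.5 * k * sum(c * c for c in r)
    return 0.5 * m * sum(c * c for c in v) - psi

def simulate(r, v, m=1.0, dt=1e-3, steps=10_000):
    """Integrate m * gamma'' = F(gamma) with velocity Verlet and record the energy."""
    energies = [energy(r, v, m)]
    a = tuple(f / m for f in grad_psi(r))
    for _ in range(steps):
        v_half = tuple(vi + 0.5 * dt * ai for vi, ai in zip(v, a))
        r = tuple(ri + dt * vh for ri, vh in zip(r, v_half))
        a = tuple(f / m for f in grad_psi(r))
        v = tuple(vh + 0.5 * dt * ai for vh, ai in zip(v_half, a))
        energies.append(energy(r, v, m))
    return energies

energies = simulate(r=(1.0, 0.0, 0.0), v=(0.0, 0.5, 0.2))
drift = max(energies) - min(energies)
print(energies[0], drift)
```

The exact solution has constant energy. The small drift that the simulation reports is due to the time discretization and shrinks as the step `dt` is reduced.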


Chapter 7

Green’s Theorem in the Plane

We are going to learn an extremely powerful and useful theorem of multivariable calculus, called

Green’s theorem in the plane. This is essentially a two dimensional result so our focus will be

exclusively on fields in R2 rather than R3. As we shall see later, this two dimensional result is

nevertheless the essential tool for establishing the main results on three dimensional vector calculus

(such as Stokes’ theorem and Gauss’ theorem) which are indispensable for physics and engineering.

7.1 Green’s Theorem for Rectangles

As preparation for Green’s theorem we first look at line integrals of a vector field FFF : R2 → R2 (i.e.

for simplicity the domain of FFF is all of R2) over the very simple curves Γ1, Γ2, Γ3 and Γ4 around

the perimeter of a rectangle, as shown on Figure 7.1. The curve Γ1 is from (a, c) to (b, c), and an

obvious parametric representation of Γ1 is

(7.1.1) γγγ : [a, b]→ R2

defined by

(7.1.2) γγγ(t) := (t, c) for all a ≤ t ≤ b.

Taking the t-derivative of (7.1.2) then gives

(7.1.3) γγγ(1)(t) = (1, 0), for all a ≤ t ≤ b.


Figure 7.1: Curves Γ1,Γ2,Γ3 and Γ4 in the plane R2

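We now have the line integral along the curve Γ1:

\[
\begin{aligned}
\int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt && \text{(from (5.1.13))} \\
&= \int_a^b (F_1(t, c), F_2(t, c)) \cdot (1, 0)\,dt && \text{(from (7.1.2) and (7.1.3))} \\
&= \int_a^b F_1(t, c)\,dt \\
&= \int_a^b F_1(x, c)\,dx.
\end{aligned} \tag{7.1.4}
\]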

At the last equality of (7.1.4) we have just re-named the variable of integration x instead of t. This

trivial change will actually be useful later on! In much the same way as for (7.1.4) we have the line

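integrals over Γ2, Γ3 and Γ4 (simple exercise!):

\[
\int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_c^d F_2(b, y)\,dy, \tag{7.1.5}
\]

\[
\int_{\Gamma_3} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = -\int_a^b F_1(x, d)\,dx, \tag{7.1.6}
\]

\[
\int_{\Gamma_4} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = -\int_c^d F_2(a, y)\,dy. \tag{7.1.7}
\]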

Now suppose that Γ is the closed curve in R2 around the perimeter of the rectangle [a, b] × [c, d]

shown in Figure 7.2: Upon comparing Figure 7.2 and Figure 7.1 we see that the line integral of


Figure 7.2: Curve Γ counter-clockwise around the perimeter of a rectangle in R2

FFF along the closed curve Γ is just the sum of the line integrals of FFF along Γ1, Γ2, Γ3 and Γ4 in

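succession, that is

\[
\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} + \int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} + \int_{\Gamma_3} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} + \int_{\Gamma_4} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} \\
&= \int_a^b [F_1(x, c) - F_1(x, d)]\,dx + \int_c^d [F_2(b, y) - F_2(a, y)]\,dy,
\end{aligned} \tag{7.1.8}
\]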

(we have used (7.1.4) - (7.1.7) at the second equality of (7.1.8)). The formula (7.1.8) gives the line

integral of a vector field in R2 around the perimeter of a rectangle. This is a very useful result.

In fact, we are now going to use (7.1.8) to establish the following preliminary version of Green’s

theorem in the plane:

Theorem 7.1.1 (Preliminary Green’s theorem). Suppose that Γ is the closed path around the

perimeter of the rectangle

D := [a, b]× [c, d]

in the counter-clockwise direction, as shown in Figure 7.2, and FFF : R2 → R2 is a C1-vector field

(see Remark 3.2.2 in which we just formally put F3 = 0). Then

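\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_D \left[\frac{\partial F_2}{\partial x}(x, y) - \frac{\partial F_1}{\partial y}(x, y)\right] dx\,dy. \tag{7.1.9}
\]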


Remark 7.1.2. Notice that (7.1.9) relates a line integral around the perimeter of a rectangle to a

dx dy-integral over the rectangle. We shall see later that this relation of line integrals (circulations

actually) to dx dy-integrals in the plane is very useful.

It remains to verify (7.1.9). Using (7.1.8) this is easy. First observe that

\[
\int_c^d \frac{\partial F_1}{\partial y}(x, y)\,dy = F_1(x, d) - F_1(x, c). \tag{7.1.10}
\]

Notice that (7.1.10) just follows from the fundamental theorem of calculus: the integral of the

derivative of a function over an interval is the difference of the values of the function at the end

points of the interval. In this case the derivative is with respect to y (the variable x is held

fixed and plays no role in any of this). In exactly the same way

\[
\int_a^b \frac{\partial F_2}{\partial x}(x, y)\,dx = F_2(b, y) - F_2(a, y). \tag{7.1.11}
\]

Now from (7.1.8), (7.1.10) and (7.1.11) we obtain

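\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_c^d \left[\int_a^b \frac{\partial F_2}{\partial x}(x, y)\,dx\right] dy - \int_a^b \left[\int_c^d \frac{\partial F_1}{\partial y}(x, y)\,dy\right] dx. \tag{7.1.12}
\]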

But, from Fubini’s theorem (see especially (2.1.12)) we have

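\[
\int_D \frac{\partial F_2}{\partial x}(x, y)\,dx\,dy = \int_c^d \left[\int_a^b \frac{\partial F_2}{\partial x}(x, y)\,dx\right] dy, \tag{7.1.13}
\]

and

\[
\int_D \frac{\partial F_1}{\partial y}(x, y)\,dx\,dy = \int_a^b \left[\int_c^d \frac{\partial F_1}{\partial y}(x, y)\,dy\right] dx. \tag{7.1.14}
\]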

Now (7.1.9) follows at once from (7.1.12), (7.1.13) and (7.1.14).
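
The identity (7.1.9) can be checked numerically on a concrete rectangle. The sketch below makes several assumptions for the sake of illustration: the field F1(x, y) = x²y, F2(x, y) = x + y³, the rectangle [0, 2] × [1, 3], and the midpoint-rule helper `midpoint` are all hypothetical choices, and the partial derivatives are entered by hand. For this field both sides of (7.1.9) equal −4/3.

```python
def midpoint(g, lo, hi, n=2_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

# Assumed example field and its relevant partial derivatives.
F1 = lambda x, y: x * x * y
F2 = lambda x, y: x + y ** 3
dF2_dx = lambda x, y: 1.0
dF1_dy = lambda x, y: x * x

a, b, c, d = 0.0, 2.0, 1.0, 3.0   # rectangle [a, b] x [c, d]

# Left side of (7.1.9), computed from the perimeter formula (7.1.8).
circulation = (midpoint(lambda x: F1(x, c) - F1(x, d), a, b)
               + midpoint(lambda y: F2(b, y) - F2(a, y), c, d))

# Right side of (7.1.9), computed as an iterated integral.
area_integral = midpoint(
    lambda y: midpoint(lambda x: dF2_dx(x, y) - dF1_dy(x, y), a, b, 200),
    c, d, 200)

print(circulation, area_integral)  # both approximate -4/3
```

The iterated integral uses a coarser grid to keep the running time short, so it matches −4/3 to fewer digits than the perimeter computation.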

7.2 Green’s Theorem: General Case

The version of Green’s theorem given by Theorem 7.1.1 has one major limitation, namely it is

restricted to integration over rectangles in the plane. Green’s theorem acquires real power when

this restriction is removed and we are able to integrate over non-rectangular regions D in the plane

such as the one shown in Figure 7.3, in which the perimeter of D is a simple closed curve (recall

Section 4.4) with counter-clockwise direction: We state this more general version of Green’s theorem

in the plane without proof:


Figure 7.3: Curve Γ counter-clockwise around perimeter of non-rectangular region D in R2

Theorem 7.2.1 (Green’s theorem in the plane). Suppose that D is a region in the plane shown at

Figure 7.3, and the perimeter of D is a simple closed curve Γ in the counter-clockwise direction. If

FFF : R2 → R2 is a C1-vector field, then

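\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_D \left[\frac{\partial F_2}{\partial x}(x, y) - \frac{\partial F_1}{\partial y}(x, y)\right] dx\,dy. \tag{7.2.15}
\]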

Example 7.2.2. Suppose that D is the unit disc in the plane and FFF : R2 → R2 is a vector field

defined by

(7.2.16) FFF (x, y) := (F1(x, y), F2(x, y)),

for

\[
F_1(x, y) := x^2 e^x + y - \log(1 + x^2), \quad F_2(x, y) := 8x - \sin(y). \tag{7.2.17}
\]

Evaluate the line integral $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}$, in which Γ is the circle around the unit disc in the

counter-clockwise direction.

Direct evaluation of this line integral by choosing a parametric representation of Γ and using

(5.1.13) is difficult (try it and see!). Instead we shall use Theorem 7.2.1. From (7.2.17) we have

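\[
\frac{\partial F_1}{\partial y}(x, y) = 1, \quad \frac{\partial F_2}{\partial x}(x, y) = 8. \tag{7.2.18}
\]

From (7.2.18) and (7.2.15)

\[
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_D [8 - 1]\,dx\,dy = 7\int_D dx\,dy = 7\operatorname{area}(D) = 7\pi,
\]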

where we have used Remark 2.1.19 at the third equality.
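
Although the direct evaluation is awkward by hand, it is straightforward numerically, which gives an independent check of the value 7π. The following sketch assumes the standard parametrization γ(t) = (cos(t), sin(t)), 0 ≤ t ≤ 2π, of the unit circle and a midpoint rule with an arbitrarily chosen number of subintervals.

```python
import math

def F(x, y):
    """Vector field of Example 7.2.2, see (7.2.17)."""
    return (x * x * math.exp(x) + y - math.log(1 + x * x), 8 * x - math.sin(y))

n = 2_000                      # number of subintervals (illustrative choice)
h = 2 * math.pi / n
total = 0.0
for i in range(n):
    t = (i + 0.5) * h
    x, y = math.cos(t), math.sin(t)         # gamma(t) on the unit circle
    F1, F2 = F(x, y)
    # F(gamma(t)) . gamma'(t), with gamma'(t) = (-sin t, cos t)
    total += (F1 * (-math.sin(t)) + F2 * math.cos(t)) * h

print(total, 7 * math.pi)
```

The numerical circulation agrees with 7π, in line with the computation by Green’s theorem.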

Remark 7.2.3. Green’s theorem provides a useful formula for calculating the area of a region in

R2. Indeed, fix the region D in the plane shown at Figure 7.3, and let Γ be the closed curve around

the perimeter of this region in the counter-clockwise direction (exactly as in Theorem 7.2.1). Now

take the special vector field FFF : R2 → R2 defined by

(7.2.19) F1(x, y) := −y, F2(x, y) := x.

From (7.2.19) we have

\[
\int_D \left[\frac{\partial F_2}{\partial x}(x, y) - \frac{\partial F_1}{\partial y}(x, y)\right] dx\,dy = 2\int_D dx\,dy = 2\operatorname{area}(D), \tag{7.2.20}
\]

where we have used Remark 2.1.19 at the second equality. Now suppose that a parametric repre-

sentation of the curve Γ is

(7.2.21) γγγ : [a, b]→ R2 with γγγ(t) = (x(t), y(t)) for all a ≤ t ≤ b.

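Then

\[
\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt && \text{(from (5.1.13))} \\
&= \int_a^b \big(F_1(x(t), y(t)), F_2(x(t), y(t))\big) \cdot \left(\frac{dx(t)}{dt}, \frac{dy(t)}{dt}\right) dt && \text{(see (7.2.21))} \\
&= \int_a^b \left[x(t)\,\frac{dy(t)}{dt} - y(t)\,\frac{dx(t)}{dt}\right] dt && \text{(from (7.2.19))}.
\end{aligned} \tag{7.2.22}
\]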

Now combine (7.2.22), (7.2.20) and (7.2.15) to obtain the area formula

\[
\operatorname{area}(D) = \frac{1}{2}\int_a^b \left[x(t)\,\frac{dy(t)}{dt} - y(t)\,\frac{dx(t)}{dt}\right] dt. \tag{7.2.23}
\]

Example 7.2.4. A hypocycloid is a curve Γ in R2 comprising all (x, y) which satisfy the relation

\[
x^{2/3} + y^{2/3} = 1, \tag{7.2.24}
\]

that is

\[
\Gamma = \{(x, y) \in \mathbb{R}^2 \mid x^{2/3} + y^{2/3} = 1\} \tag{7.2.25}
\]

Figure 7.4: Hypocycloid in R2

(see Figure 7.4). Determine the area of the region enclosed by the hypocycloid.

Direct evaluation of this area is very complicated (you may want to try it!). We shall see that

formula (7.2.23) makes the computation of area quite easy. A parametrization of the curve Γ around

the perimeter of the hypocycloid in the counter-clockwise direction is

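\[
\boldsymbol{\gamma}(t) := x(t)\,\mathbf{i} + y(t)\,\mathbf{j} \quad \text{where} \quad x(t) = [\cos(t)]^3, \quad y(t) = [\sin(t)]^3, \quad \text{for all } 0 \le t \le 2\pi. \tag{7.2.26}
\]

In fact, from (7.2.26), we have

\[
[x(t)]^{2/3} + [y(t)]^{2/3} = [\cos(t)]^2 + [\sin(t)]^2 = 1, \quad \text{for all } 0 \le t \le 2\pi, \tag{7.2.27}
\]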

so that the path γγγ : [0, 2π]→ R2 traverses the curve Γ, and it is easily checked that the direction of

traverse is counter clockwise. We now use the path given by (7.2.26) in the area formula (7.2.23).

From (7.2.26)

\[
\frac{dx(t)}{dt} = -3\sin(t)[\cos(t)]^2, \quad \frac{dy(t)}{dt} = 3\cos(t)[\sin(t)]^2. \tag{7.2.28}
\]

We now evaluate the integrand of the area formula (7.2.23). From (7.2.28) and (7.2.26) we find

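\[
\begin{aligned}
x(t)\,\frac{dy(t)}{dt} - y(t)\,\frac{dx(t)}{dt} &= 3[\cos(t)]^4[\sin(t)]^2 + 3[\sin(t)]^4[\cos(t)]^2 \\
&= 3[\cos(t)\sin(t)]^2\,\{[\cos(t)]^2 + [\sin(t)]^2\} \\
&= 3[\cos(t)\sin(t)]^2 \\
&= \frac{3}{4}[\sin(2t)]^2 && \text{(recall identity } \sin(2t) = 2\sin(t)\cos(t)\text{)} \\
&= \frac{3}{8}[1 - \cos(4t)] && \text{(recall } 2\sin^2(\theta) = 1 - \cos(2\theta)\text{)}.
\end{aligned} \tag{7.2.29}
\]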

From (7.2.29) and (7.2.23) with a = 0 and b = 2π we obtain

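\[
\begin{aligned}
\operatorname{area} &= \frac{3}{16}\int_0^{2\pi} [1 - \cos(4t)]\,dt \\
&= \frac{3\pi}{8} - \frac{3}{16}\int_0^{2\pi} \cos(4t)\,dt = \frac{3\pi}{8}
\end{aligned} \tag{7.2.30}
\]

since

\[
\int_0^{2\pi} \cos(4t)\,dt = 0
\]

by inspection.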
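
The area 3π/8 can be confirmed numerically by sampling points on the hypocycloid with the parametrization (7.2.26) and applying the polygon (“shoelace”) area formula, which is what the area formula (7.2.23) becomes when Γ is replaced by a closed polygon. This is a sketch only: the helper name `shoelace_area` and the number of sample points are illustrative assumptions.

```python
import math

def shoelace_area(points):
    """Signed area of the closed polygon through `points` (counter-clockwise is positive)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return 0.5 * total

n = 20_000  # number of sample points on the curve (illustrative choice)
pts = [(math.cos(2 * math.pi * i / n) ** 3, math.sin(2 * math.pi * i / n) ** 3)
       for i in range(n)]

area = shoelace_area(pts)
print(area, 3 * math.pi / 8)
```

The positive sign of the computed area also confirms that the path (7.2.26) traverses Γ in the counter-clockwise direction.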


Chapter 8

Surfaces, Surface Area and Surface

Integrals

We are going to study surfaces in three dimensional space, our final objective being the construction

of surface integrals. The theorems of Gauss-Ostrogradskii and Stokes on vector calculus that we

shall address later rely in an essential way on surface integrals. Furthermore, surface integrals are

likewise essential for stating in completely general terms the basic laws of electromagnetism, such

as Faraday’s law on electromagnetic induction and Ampere’s circulation law for magnetic fields, as

we shall see later in this chapter.

8.1 Parametric Representation of Surfaces

Intuitively, a surface in three dimensional space is a “thin”, essentially “two dimensional” object,

such as a sheet of paper or the roof of a tent. Our first task is to make this somewhat vague notion

mathematically precise in a clear definition. We first motivate this with some examples:

Example 8.1.1. The surface of the “top half” of a sphere of radius r > 0 in R3 with center at the

origin is the set of all points (x, y, z) in R3 which satisfy the relations

(8.1.1) $x^2 + y^2 + z^2 = r^2, \quad z \ge 0$

(see Figure 8.1). The description of the hemispherical surface at (8.1.1) can be given an alternative

(and more useful) formulation as follows: Define the disc $D \subset \mathbb{R}^2_{xy}$ of radius r in the x−y plane, that is

(8.1.2) $D = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le r^2\},$

Figure 8.1: Top half of sphere of radius r

and define the function f : D → R by

(8.1.3) $f(x, y) := \sqrt{r^2 - x^2 - y^2}$, for all (x, y) in D,

(note that $r^2 - x^2 - y^2 \ge 0$ for all (x, y) in D so that the square-root is always a real number). It

is clear that the point (x, y, f(x, y)) traces out the surface S of the top half of the sphere in Figure

8.1 as (x, y) varies through the disc D.

Example 8.1.2. It is very easy to generalize Example 8.1.1 as follows: Suppose that f : D → R is a given continuous function defined on some given region $D \subset \mathbb{R}^2_{xy}$ in the x−y plane. As (x, y)

varies throughout D the point (x, y, f(x, y)) traces out a “surface” S in three dimensional space R3

as shown in Figure 8.2: We call this surface S the graph of the function f : D → R, and denote this

surface in set-theoretic terms as

(8.1.4) $S = \{(x, y, f(x, y)) \in \mathbb{R}^3 \mid (x, y) \in D\}.$

Remark 8.1.3. The graphs of functions f : D → R, as in Example 8.1.1, and more generally in

Example 8.1.2, represent an important class of surfaces, but it is an unfortunate fact that not every


Figure 8.2: Surface S; the graph of a function f : D → R

surface in R3 can be represented by the graph of a function. Shown in Figure 8.3 is the surface

of the whole sphere of radius r centered at the origin of R3, in contrast to just the “top half” of

the sphere in Figure 8.1. If D denotes the disc of radius r in the x− y plane (see (8.1.2)) then we

see from Figure 8.3 that the straight line parallel to the z-axis through the point A inside the disc

D (with coordinates (x, y)) passes through the surface at the distinct points B and C. One could

perhaps deal with this by defining the “multivalued” function

(8.1.5) $f(x, y) := \pm\sqrt{r^2 - x^2 - y^2}$, for all (x, y) in D,

(c.f. (8.1.3)), the positive and negative values in (8.1.5) corresponding to the points B and C

respectively, and then attempt to regard the surface of the whole sphere as the “graph” of the

multivalued function f given by (8.1.5). However, “multivalued functions” bring a host of intractable

complications, so much so that in mathematics we never deal with anything except single-valued


Figure 8.3: Whole surface of a sphere of radius r

functions. This being the case, it follows that we cannot represent the surface of the whole sphere

as the graph of a (single-valued!) function f . Similar problems occur for the surface in Figure 8.4,

which includes a “fold”. Again, this surface cannot be the graph of a function f : D → R, for some

region D in the x− y plane, since, for some points (x, y), a straight line parallel to the z-axis and

passing through (x, y) necessarily passes through the surface at three distinct points A, B and C

on account of the “fold” in the surface. Here we would need a “three-valued function” to represent

this surface, giving the z-coordinate of each of the three points A, B and C for fixed (x, y), an

absolutely hideous prospect. We repeat again: we deal only with functions that are single-valued,

so the folded surface in Figure 8.4 cannot be represented by the graph of a function. A final example

of a surface which clearly cannot be represented by the graph of any function is the outer surface of

the deformed “donut” in Figure 8.5. The surfaces in Figures 8.3, 8.4 and 8.5 suggest that we must

be quite careful in formulating exactly what we mean by a surface. We shall build on one essential

piece of intuition: One can think of a surface as a portion of a generally deformed “flat surface” or

plane. Since it takes two coordinates to specify a point in a plane it follows that one should likewise

require two “coordinates” to specify a point on a surface. The following definition builds on this


Figure 8.4: Surface S includes a “fold”

intuition:

Definition 8.1.4. A parametric function for a surface S is a given function or mapping

(8.1.6) $\boldsymbol{\Phi} : D \to \mathbb{R}^3,$

defined on some given region D in a u−v plane $\mathbb{R}^2_{uv}$, written in the scalar component form

(8.1.7)
$$\begin{aligned}
\boldsymbol{\Phi}(u, v) &= (x(u, v),\, y(u, v),\, z(u, v)) \\
&= x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k} \quad \text{for all } (u, v) \text{ in } D,
\end{aligned}$$

which maps each point (u, v) in the fixed region D into the vector $\boldsymbol{\Phi}(u, v)$ in $\mathbb{R}^3$. The surface S of the parametric function is the set of points in $\mathbb{R}^3$ traced by $\boldsymbol{\Phi}(u, v)$ as (u, v) traverses the region D. In the notation of sets we write this as

(8.1.8) $S := \{\boldsymbol{\Phi}(u, v) \in \mathbb{R}^3 \mid (u, v) \in D\}.$


Figure 8.5: Surface of a deformed donut

The region D on which the parametric function is defined is called the domain of definition or

parametric domain, the variable (u, v) is called the parametric variable, and the whole function

ΦΦΦ : D → R3 is called a parametric representation of the surface.

In Example 8.1.2 the region D on which the surface is defined is a region in the x− y plane R2xy.

In contrast, the parametric domain D in the Definition 8.1.4 is a region in a general u−v plane R2uv

which could be different from the x − y plane. It is because of this flexibility in not being limited

just to regions in the x− y plane that Definition 8.1.4 describes a very much larger class of surfaces

than just the surfaces corresponding to graphs of functions that were seen in Example 8.1.2.

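Remark 8.1.5. Suppose that the parametric function $\boldsymbol{\Phi}$ is a $C^1$-function, that is (cf. Remark 3.2.2) the first partial derivatives

$$\frac{\partial x}{\partial u}(u, v), \quad \frac{\partial y}{\partial u}(u, v), \quad \frac{\partial z}{\partial u}(u, v), \quad \frac{\partial x}{\partial v}(u, v), \quad \frac{\partial y}{\partial v}(u, v), \quad \frac{\partial z}{\partial v}(u, v),$$

of the scalar components of $\boldsymbol{\Phi}$ (see (8.1.7)) exist and are continuous functions of (u, v) in D. Then the parametric representation (8.1.6) is called a $C^1$-parametric representation and the corresponding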

surface S is called a C1-surface. Our focus will be almost exclusively on the case of C1-parametric

representations and C1-surfaces.

Remark 8.1.6. Observe that if a surface S is the graph of a function f : D → R as in the

Example 8.1.2, then it has a parametric representation of the form specified in Definition 8.1.4 and

is therefore a surface in the sense of Definition 8.1.4. Indeed, suppose we are given the continuous

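function f : D → R, in which $D \subset \mathbb{R}^2_{xy}$ is some region in the x−y plane, exactly as in Example 8.1.2. We now just take the u−v plane $\mathbb{R}^2_{uv}$ in Definition 8.1.4 to be identical to the x−y plane $\mathbb{R}^2_{xy}$. Define the parametric function $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ by

(8.1.9)
$$\begin{aligned}
\boldsymbol{\Phi}(u, v) &:= u\,\mathbf{i} + v\,\mathbf{j} + f(u, v)\,\mathbf{k} \\
&\equiv (u, v, f(u, v)), \quad \text{for all } (u, v) \text{ in } D,
\end{aligned}$$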

so that the scalar components of ΦΦΦ(u, v) are (c.f. (8.1.7))

(8.1.10) x(u, v) = u, y(u, v) = v, z(u, v) = f(u, v), for all (u, v) in D.

Then it is clear that the surface S given in Example 8.1.2 is exactly the set of points in R3 traced by

ΦΦΦ(u, v) as (u, v) traverses the region D, that is, the function at (8.1.9) is a parametric representation

of the surface S in Example 8.1.2. The advantage of Definition 8.1.4 is that it applies to a much

broader class of surfaces than the surfaces which are graphs of functions, and, as we have just noted,

also includes surfaces which are graphs of functions.

Example 8.1.7. We now return to Example 8.1.1 (the top half of a sphere of radius r centered

at the origin of R3 shown in Figure 8.1) which we already know is the surface of the graph of

the function f given by (8.1.3). We shall now give an alternative parametric representation of the

surface. To see how this works we reproduce part of Figure 8.1 in greater detail in Figure 8.6:

Fix some point A on the surface S of the sphere. Then the length of OA is the radius r of the

sphere. Introduce the angle φ from the z-axis to the ray OA, drop a perpendicular from A onto the

x− y plane to get the point B, and let θ be the angle in the x− y plane from the x-axis to the ray

OB. From the right-angle triangle OAE we find

(8.1.11) $OE = r\cos(\varphi), \qquad OB = AE = r\sin(\varphi),$

and from the right-angled triangle OBC with (8.1.11) we find

(8.1.12) $OC = OB\cos(\theta) = r\sin(\varphi)\cos(\theta), \qquad OD = OB\sin(\theta) = r\sin(\varphi)\sin(\theta).$

Figure 8.6: Spherical surface

But from Figure 8.6 we see that OC, OD and OE give the x, y and z coordinates of the point A

in terms of the angles θ and φ so that

(8.1.13) x(θ, φ) = r sin(φ) cos(θ), y(θ, φ) = r sin(φ) sin(θ), z(θ, φ) = r cos(φ).

This means that, if we fix the angles θ and φ then the point A in Figure 8.6 corresponding to these

angles has x− y − z coordinates given by (8.1.13). Put another way, if θ varies through the range

0 ≤ θ ≤ 2π and φ varies through the range 0 ≤ φ ≤ π/2 then the point with coordinates given by

(8.1.13) traces out the entire surface S, that is the top half of the surface of the sphere of radius

r shown in Figure 8.1. To write this in the formalism of Definition 8.1.4 we define the rectangle

D ⊂ R2 in the “θ − φ plane” by

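(8.1.14)
$$\begin{aligned}
D &= \{(\theta, \varphi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \varphi \le \pi/2\} \\
&= [0, 2\pi] \times [0, \pi/2],
\end{aligned}$$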

in which we have used the abbreviated “mathematical” notation of (2.1.3) for the rectangle D.

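Using (8.1.13) we define the function $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ by

(8.1.15)
$$\begin{aligned}
\boldsymbol{\Phi}(\theta, \varphi) &:= (x(\theta, \varphi),\, y(\theta, \varphi),\, z(\theta, \varphi)) \\
&\equiv x(\theta, \varphi)\,\mathbf{i} + y(\theta, \varphi)\,\mathbf{j} + z(\theta, \varphi)\,\mathbf{k} \\
&= r\sin(\varphi)\cos(\theta)\,\mathbf{i} + r\sin(\varphi)\sin(\theta)\,\mathbf{j} + r\cos(\varphi)\,\mathbf{k},
\end{aligned}$$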

for all (θ, φ) in D defined by (8.1.14). It is clear that, as (θ, φ) traverses the rectangle D given by

(8.1.14) then the vector ΦΦΦ(θ, φ) defined by (8.1.15) traces out the surface S, namely the top half of

the sphere of radius r. We therefore have a parametric representation of the surface S in the sense

of Definition 8.1.4.

Remark 8.1.8. It follows that we have two distinct parametric representations for the surface S

which is the top half of a sphere of radius r centered at the origin of R3, namely the representation

constructed in Example 8.1.7 and the representation following from the fact that S is the graph

of a function f (recall Example 8.1.1 and Remark 8.1.6). This makes it clear that a given surface

generally has several different parametric representations. In such a situation is there any “right”

choice of parametric representation? The answer really depends on the particular problem one has

in mind. Later in this chapter we shall study integration over surfaces (or surface integrals) and

we shall see that evaluation of these integrals is often greatly simplified by choosing the “right”

parametric representation of the surface over which we must integrate. In the next example we

identify a major advantage of the parametric representation of the top half of the sphere established

in Example 8.1.7. Recall from Remark 8.1.3 that the representation of this surface as the graph of

the function (8.1.3) does not extend in any easy way to a description of the surface of the whole

sphere of radius r. By contrast, in the next example we see that the parametric representation of

Example 8.1.7 extends trivially to give a full description of the whole sphere of radius r.

Example 8.1.9. We want a parametric representation of the surface S which is now the whole

sphere of radius r centered at the origin of R3 (see Figure 8.3). Referring to Figure 8.6 we see that,

if θ varies through the interval 0 ≤ θ ≤ 2π (as in Example 8.1.7) and φ varies through the interval

0 ≤ φ ≤ π (in contrast to the range 0 ≤ φ ≤ π/2 in Example 8.1.7) then the point A must trace out

the surface of the whole sphere. It is trivial to repeat the analysis of Example 8.1.7 to see that a

parametric representation of the whole surface of the sphere is given by the region D ⊂ R2 defined


by

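(8.1.16)
$$\begin{aligned}
D &= \{(\theta, \varphi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \varphi \le \pi\} \\
&= [0, 2\pi] \times [0, \pi],
\end{aligned}$$

(compare (8.1.14)) with $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ given again by (8.1.15), that is

(8.1.17) $\boldsymbol{\Phi}(\theta, \varphi) = r\sin(\varphi)\cos(\theta)\,\mathbf{i} + r\sin(\varphi)\sin(\theta)\,\mathbf{j} + r\cos(\varphi)\,\mathbf{k},$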

for all (θ, φ) in D defined by (8.1.16).

In the next example we give a parametric representation of another surface which is again not the

graph of a function.

Example 8.1.10. In this example we write (r, θ) instead of (u, v) for the parametric variable since

we want to regard the first variable r as “radius” and the second variable as “angle”. Define

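(8.1.18)
$$\begin{aligned}
\boldsymbol{\Phi}(r, \theta) &:= (r\cos(\theta),\, r\sin(\theta),\, \theta) \\
&= r\cos(\theta)\,\mathbf{i} + r\sin(\theta)\,\mathbf{j} + \theta\,\mathbf{k},
\end{aligned}$$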

for all (r, θ) such that

(8.1.19) 0 ≤ r ≤ 1 and 0 ≤ θ ≤ 2π.

We then have a parametric mapping ΦΦΦ : D → R3 in which the region D is the rectangle in the

“r − θ plane” given by (8.1.19), that is

(8.1.20)
$$\begin{aligned}
D &= \{(r, \theta) \in \mathbb{R}^2 \mid 0 \le r \le 1 \text{ and } 0 \le \theta \le 2\pi\} \\
&= [0, 1] \times [0, 2\pi].
\end{aligned}$$

Comparing (8.1.18) with (8.1.7) we see that the scalar components of Φ are given by

(8.1.21) x(r, θ) = r cos(θ), y(r, θ) = r sin(θ), z(r, θ) = θ.

The surface in R3 traced by ΦΦΦ(r, θ) as (r, θ) traverses the rectangle D is called a helicoid and shown

in Figure 8.7.

If r is held fixed in the range 0 ≤ r ≤ 1 then the point (x(r, θ), y(r, θ)) = (r cos(θ), r sin(θ)) traces a

circle of radius r in the plane as θ varies through the range 0 ≤ θ ≤ 2π. However, the third relation

of (8.1.21) “winds” this circle into a “spiral” of radius r around the z-axis in R3. The totality of

these “spirals” of radius r for all 0 ≤ r ≤ 1 makes up the helicoid. It is clear that the helicoid

cannot possibly be the graph of a function.
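To make the last claim concrete, here is a small sketch that evaluates the parametric function (8.1.18) at two parameter values and shows that two distinct points of the helicoid lie above the same (x, y). The function name `helicoid` and the sample values r = 0.5, θ = 0 and θ = 2π are our own choices for illustration.

```python
import math

def helicoid(r, theta):
    """Parametric function (8.1.18): returns the point (x, y, z) on the helicoid."""
    return (r * math.cos(theta), r * math.sin(theta), theta)

# Two parameter values in D = [0, 1] x [0, 2*pi] with the same r.
p = helicoid(0.5, 0.0)
q = helicoid(0.5, 2.0 * math.pi)

# The x and y coordinates agree (up to rounding), but the z coordinates differ by 2*pi,
# so no single-valued function z = f(x, y) can describe the surface.
print(p)
print(q)
```

The same happens along the whole segment r = 0, where every θ in [0, 2π] gives (x, y) = (0, 0) but a different z.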

Figure 8.7: Helicoid (taken from Wikimedia Commons)

Remark 8.1.11. It is worthwhile to compare Definition 4.2.2 with Definition 8.1.4, for they are very

similar. In each case we begin with a specified parametric function (also called a path in Definition

4.2.2) and define a corresponding curve or surface as the range of the parametric function when the

parametric variable (t in the case of a curve, (u, v) in the case of a surface) traverses through a basic

domain (an interval [a, b] in the case of a curve, a region D ⊂ R2 in the case of a surface). One

clear difference between these definitions is that the parametric variable in the case of a curve is a

single real number (usually denoted as t) whereas the parametric variable in the case of a surface is

a pair of real numbers (usually denoted as (u, v)). Of course this just reflects the fact that a curve

results from the deformation of a portion [a, b] of a straight line (i.e. a “one-dimensional” object)

whereas a surface results from the deformation of a portion D of a plane (i.e. a “two-dimensional”

object). The deformation is described by the parametric function γγγ in the case of a curve, and

by the parametric function ΦΦΦ in the case of a surface. Finally, note one (huge) difference between

Definitions 4.2.2 and 8.1.4, namely a curve has a natural direction corresponding to the increase

of t through a ≤ t ≤ b, whereas there seems to be no similarly natural “direction” for a surface

(since there is no obvious “direction” in which (u, v) traverses a region D ⊂ R2). In fact, there is a

sense of direction for a surface (called the orientation of the surface) but the formulation of this is

a highly technical business which properly belongs to the realm of differential geometry and which

we will not take up here.


8.2 Tangents to a Surface and Smooth Surfaces

Suppose that we are given a C1-parametric function ΦΦΦ : D → R3 (see Definition 8.1.4 and Remark

8.1.5) for which the region D ⊂ R2 is specifically a rectangle of the form

(8.2.22) D = [a, b]× [c, d],

(c.f. (2.1.3) and see Figure 2.1), and fix some (u0, v0) in D, that is u0 is in the range a ≤ u ≤ b and

v0 is in the range c ≤ v ≤ d. Then the mapping

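(8.2.23) $\boldsymbol{\gamma}_{v_0} : [a, b] \to \mathbb{R}^3$ defined by $\boldsymbol{\gamma}_{v_0}(u) := \boldsymbol{\Phi}(u, v_0)$ for all $a \le u \le b$

is the parametric representation of a curve $\Gamma_{v_0}$ traced in $\mathbb{R}^3$ by $\boldsymbol{\gamma}_{v_0}(u) = \boldsymbol{\Phi}(u, v_0)$ as u varies through the interval $a \le u \le b$ (see Figure 8.8).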

Figure 8.8: Curves Γv0 and Γu0 with tangent vectors

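Following (4.3.18) we take the derivative of the parametric function $\boldsymbol{\gamma}_{v_0}(u)$ at $u = u_0$, namely

(8.2.24)
$$\frac{d\boldsymbol{\gamma}_{v_0}}{du}(u_0) = \frac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0) = \left(\frac{\partial x}{\partial u}(u_0, v_0),\ \frac{\partial y}{\partial u}(u_0, v_0),\ \frac{\partial z}{\partial u}(u_0, v_0)\right).$$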

Here the first equality at (8.2.24) follows from (8.2.23) and the second equality follows from the

scalar component-wise representation of ΦΦΦ (see (8.1.7)). In view of Section 4.3 the vector at (8.2.24)

is tangent to the curve Γv0 at the point

(8.2.25) ΦΦΦ(u0, v0) = (x(u0, v0), y(u0, v0), z(u0, v0))

(see Figure 8.8). Similarly the mapping

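(8.2.26) $\boldsymbol{\gamma}_{u_0} : [c, d] \to \mathbb{R}^3$ defined by $\boldsymbol{\gamma}_{u_0}(v) := \boldsymbol{\Phi}(u_0, v)$ for all $c \le v \le d$

is the parametric representation of a curve $\Gamma_{u_0}$ traced in $\mathbb{R}^3$ by $\boldsymbol{\gamma}_{u_0}(v) = \boldsymbol{\Phi}(u_0, v)$ as v varies through the interval $c \le v \le d$, and the vector

(8.2.27)
$$\frac{d\boldsymbol{\gamma}_{u_0}}{dv}(v_0) = \frac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0) = \left(\frac{\partial x}{\partial v}(u_0, v_0),\ \frac{\partial y}{\partial v}(u_0, v_0),\ \frac{\partial z}{\partial v}(u_0, v_0)\right)$$

is also tangent to the curve $\Gamma_{u_0}$ at the point $\boldsymbol{\Phi}(u_0, v_0)$ (see Figure 8.8). We now define the vector cross product of the vector $\partial\boldsymbol{\Phi}(u_0, v_0)/\partial u$ at (8.2.24) and the vector $\partial\boldsymbol{\Phi}(u_0, v_0)/\partial v$ at (8.2.27), that is

(8.2.28) $\mathbf{N}(u_0, v_0) := \dfrac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \dfrac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0).$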

Using the familiar expression for calculating cross products of vectors, together with the formulas

for the vectors ∂ΦΦΦ(u0, v0)/∂u and ∂ΦΦΦ(u0, v0)/∂v at (8.2.24) and (8.2.27), we obtain from (8.2.28)

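(8.2.29)
$$\mathbf{N}(u_0, v_0) = \begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\frac{\partial x}{\partial u}(u_0, v_0) & \frac{\partial y}{\partial u}(u_0, v_0) & \frac{\partial z}{\partial u}(u_0, v_0) \\
\frac{\partial x}{\partial v}(u_0, v_0) & \frac{\partial y}{\partial v}(u_0, v_0) & \frac{\partial z}{\partial v}(u_0, v_0)
\end{vmatrix}.$$

Now expand the 3 × 3 determinant on the right of (8.2.29) by cofactors along the first row to get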

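(8.2.30)
$$\begin{aligned}
\mathbf{N}(u_0, v_0) ={}& \mathbf{i}\left[\frac{\partial y}{\partial u}(u_0, v_0)\frac{\partial z}{\partial v}(u_0, v_0) - \frac{\partial y}{\partial v}(u_0, v_0)\frac{\partial z}{\partial u}(u_0, v_0)\right] \\
&- \mathbf{j}\left[\frac{\partial x}{\partial u}(u_0, v_0)\frac{\partial z}{\partial v}(u_0, v_0) - \frac{\partial x}{\partial v}(u_0, v_0)\frac{\partial z}{\partial u}(u_0, v_0)\right] \\
&+ \mathbf{k}\left[\frac{\partial x}{\partial u}(u_0, v_0)\frac{\partial y}{\partial v}(u_0, v_0) - \frac{\partial x}{\partial v}(u_0, v_0)\frac{\partial y}{\partial u}(u_0, v_0)\right],
\end{aligned}$$

that is

(8.2.31) $\mathbf{N}(u_0, v_0) = \mathbf{i}\,\dfrac{\partial(y, z)}{\partial(u, v)}(u_0, v_0) + \mathbf{j}\,\dfrac{\partial(z, x)}{\partial(u, v)}(u_0, v_0) + \mathbf{k}\,\dfrac{\partial(x, y)}{\partial(u, v)}(u_0, v_0),$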

in which we have introduced the usual 2× 2 Jacobian determinants defined by

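(8.2.32)
$$\frac{\partial(y, z)}{\partial(u, v)}(u_0, v_0) := \begin{vmatrix}
\frac{\partial y}{\partial u}(u_0, v_0) & \frac{\partial y}{\partial v}(u_0, v_0) \\
\frac{\partial z}{\partial u}(u_0, v_0) & \frac{\partial z}{\partial v}(u_0, v_0)
\end{vmatrix} = \frac{\partial y}{\partial u}(u_0, v_0)\frac{\partial z}{\partial v}(u_0, v_0) - \frac{\partial y}{\partial v}(u_0, v_0)\frac{\partial z}{\partial u}(u_0, v_0),$$

(8.2.33)
$$\frac{\partial(z, x)}{\partial(u, v)}(u_0, v_0) := \begin{vmatrix}
\frac{\partial z}{\partial u}(u_0, v_0) & \frac{\partial z}{\partial v}(u_0, v_0) \\
\frac{\partial x}{\partial u}(u_0, v_0) & \frac{\partial x}{\partial v}(u_0, v_0)
\end{vmatrix} = \frac{\partial z}{\partial u}(u_0, v_0)\frac{\partial x}{\partial v}(u_0, v_0) - \frac{\partial z}{\partial v}(u_0, v_0)\frac{\partial x}{\partial u}(u_0, v_0),$$

(8.2.34)
$$\frac{\partial(x, y)}{\partial(u, v)}(u_0, v_0) := \begin{vmatrix}
\frac{\partial x}{\partial u}(u_0, v_0) & \frac{\partial x}{\partial v}(u_0, v_0) \\
\frac{\partial y}{\partial u}(u_0, v_0) & \frac{\partial y}{\partial v}(u_0, v_0)
\end{vmatrix} = \frac{\partial x}{\partial u}(u_0, v_0)\frac{\partial y}{\partial v}(u_0, v_0) - \frac{\partial y}{\partial u}(u_0, v_0)\frac{\partial x}{\partial v}(u_0, v_0).$$

These determinants replace the bracketed terms on the right side of (8.2.30), which gives (8.2.31); note that the middle term of (8.2.30) carries a minus sign, which is absorbed by interchanging the rows in (8.2.33). It is clear that $\boldsymbol{\Phi}(u, v)$ traces the surface S shown in Figure 8.8 as (u, v) traverses the rectangular region D.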

Definition 8.2.1. The surface S shown in Figure 8.8 is called smooth at the point $\boldsymbol{\Phi}(u_0, v_0)$ when $\mathbf{N}(u_0, v_0) \ne \mathbf{0}$. The surface S is called smooth when it is smooth at $\boldsymbol{\Phi}(u_0, v_0)$ for each and every $(u_0, v_0)$ in the region D, that is $\mathbf{N}(u_0, v_0) \ne \mathbf{0}$ for each and every $(u_0, v_0)$ in D. Throughout this

course we shall be interested only in smooth surfaces.

Remark 8.2.2. Assuming, as we shall always do, that the surface S is smooth in the sense of

Definition 8.2.1, we can define the unit vector

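(8.2.35) $\mathbf{n}(u_0, v_0) := \dfrac{\mathbf{N}(u_0, v_0)}{\|\mathbf{N}(u_0, v_0)\|}$ for every $(u_0, v_0)$ in D,

where, of course, from (8.2.31) and Pythagoras, for each $(u_0, v_0)$ in D we have

(8.2.36)
$$\|\mathbf{N}(u_0, v_0)\| = \sqrt{\left[\frac{\partial(x, y)}{\partial(u, v)}(u_0, v_0)\right]^2 + \left[\frac{\partial(y, z)}{\partial(u, v)}(u_0, v_0)\right]^2 + \left[\frac{\partial(z, x)}{\partial(u, v)}(u_0, v_0)\right]^2}.$$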

Since the vectors at (8.2.27) and (8.2.24) are tangent to the curves Γu0 and Γv0 respectively (as

already noted above), it follows that these vectors span the plane which is tangent to the surface

S at the point ΦΦΦ(u0, v0), and the unit vector nnn(u0, v0) is normal to this plane, and therefore also

normal to the surface S at the point ΦΦΦ(u0, v0).
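As an illustration of (8.2.28) and (8.2.35), the following sketch computes N and n for the spherical parametric representation (8.1.15), taking (u, v) = (θ, φ), and allows us to check that n is orthogonal to both tangent vectors. The helper names `cross`, `dot`, `sphere_tangents` and `unit_normal` are ours, and the partial derivatives are obtained by differentiating (8.1.15) by hand.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def sphere_tangents(r, theta, phi):
    """Partial derivatives of (8.1.15) with respect to theta and phi."""
    d_theta = (-r * math.sin(phi) * math.sin(theta),
               r * math.sin(phi) * math.cos(theta),
               0.0)
    d_phi = (r * math.cos(phi) * math.cos(theta),
             r * math.cos(phi) * math.sin(theta),
             -r * math.sin(phi))
    return d_theta, d_phi

def unit_normal(r, theta, phi):
    """Unit normal (8.2.35), with N from (8.2.28) taking (u, v) = (theta, phi)."""
    N = cross(*sphere_tangents(r, theta, phi))
    length = math.sqrt(dot(N, N))
    if length == 0.0:
        # N vanishes where sin(phi) = 0, so (8.2.35) is undefined there.
        raise ValueError("N = 0 at this parameter value")
    return tuple(c / length for c in N)

n = unit_normal(2.0, 0.7, 1.1)
d_theta, d_phi = sphere_tangents(2.0, 0.7, 1.1)
print(n, dot(n, d_theta), dot(n, d_phi))
```

With this ordering of the parametric variables the computed n is radial and points toward the origin, as one expects for a sphere.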


8.3 Area of a Surface

We can now obtain a useful formula for the area of a surface with the parametric representation

$\boldsymbol{\Phi} : D \to \mathbb{R}^3$, in which, exactly as in Section 8.2, we suppose for concreteness that the region D is

the rectangle at (8.2.22). Fix some u0, v0, and small ∆u > 0, ∆v > 0, such that

(8.3.37) a ≤ u0 < u0 + ∆u ≤ b, c ≤ v0 < v0 + ∆v ≤ d.

Then we get a small rectangle ∆D with the “corners” given by the points (u0, v0), (u0 + ∆u, v0),

(u0, v0 + ∆v) and (u0 + ∆u, v0 + ∆v), that is

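(8.3.38)
$$\begin{aligned}
\Delta D &:= \{(u, v) \mid u_0 \le u \le u_0 + \Delta u,\ v_0 \le v \le v_0 + \Delta v\} \\
&= [u_0, u_0 + \Delta u] \times [v_0, v_0 + \Delta v],
\end{aligned}$$

and $\boldsymbol{\Phi}$ maps ∆D onto the small “piece of surface”

(8.3.39) $\Delta S := \{\boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D\}.$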

The small surface ∆S is approximately a flat parallelogram with edges AB and AC (see Figure 8.9). Since ∆u and ∆v are small the edges AB and AC are approximately given respectively by the following vectors $\mathbf{v}_1$ and $\mathbf{v}_2$

(8.3.40) $\mathbf{v}_1 := \boldsymbol{\Phi}(u_0, v_0 + \Delta v) - \boldsymbol{\Phi}(u_0, v_0), \qquad \mathbf{v}_2 := \boldsymbol{\Phi}(u_0 + \Delta u, v_0) - \boldsymbol{\Phi}(u_0, v_0).$

But, again since ∆u and ∆v are small, we also have the relations

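(8.3.41) $\mathbf{v}_1 \approx \dfrac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0)\,\Delta v, \qquad \mathbf{v}_2 \approx \dfrac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0)\,\Delta u.$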

Now we know that the area of the parallelogram ∆S with edges given by the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ is given by the norm of the cross-product $\mathbf{v}_1 \times \mathbf{v}_2$, namely

(8.3.42) $\operatorname{area} \Delta S = \|\mathbf{v}_1 \times \mathbf{v}_2\|.$

But, from (8.3.41)

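(8.3.43)
$$\begin{aligned}
\mathbf{v}_1 \times \mathbf{v}_2 &\approx \left(\frac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0) \times \frac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0)\right)\Delta u\,\Delta v \\
&= -\mathbf{N}(u_0, v_0)\,\Delta u\,\Delta v
\end{aligned}$$

(from the definition of $\mathbf{N}(u_0, v_0)$ at (8.2.28) and the antisymmetry of the cross product; the sign has no effect on the norm),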

Figure 8.9: Small rectangle ∆D and approximate parallelogram ∆S

and combining (8.3.43) with (8.3.42) we obtain

(8.3.44) $\operatorname{area} \Delta S \approx \|\mathbf{N}(u_0, v_0)\|\,\Delta u\,\Delta v.$

As ∆u and ∆v shrink to the infinitesimals du and dv, the small rectangle ∆D with lower left

corner given by (u0, v0) shrinks to the infinitesimal rectangle dD (still with lower left corner given

by (u0, v0)), the piece of surface ∆S shrinks to the infinitesimal parallelogram dS (still with one

corner “anchored” at the point A given by ΦΦΦ(u0, v0) as in Figure 8.9), and the approximation at

(8.3.44) becomes exact, so that

(8.3.45) $\operatorname{area} dS = \|\mathbf{N}(u_0, v_0)\|\,du\,dv.$

Now the total area of the surface S is the “sum” or integral of the elemental areas at (8.3.45) as

the infinitesimal rectangles dD cover the whole rectangle D, that is the total area of the surface S


must be given by any of the following three equivalent expressions

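(8.3.46)
$$\begin{aligned}
\operatorname{area} S &= \int_D \|\mathbf{N}(u, v)\|\,du\,dv \\
&= \int_D \left\|\frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v)\right\|\,du\,dv \\
&= \int_D \sqrt{\left[\frac{\partial(x, y)}{\partial(u, v)}(u, v)\right]^2 + \left[\frac{\partial(y, z)}{\partial(u, v)}(u, v)\right]^2 + \left[\frac{\partial(z, x)}{\partial(u, v)}(u, v)\right]^2}\,du\,dv,
\end{aligned}$$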

in which the second expression follows from (8.2.28) and the third expression follows from (8.2.36)

(replacing (u0, v0) with (u, v)).
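The second expression in (8.3.46) translates almost directly into a numerical procedure. The sketch below is one possible way to do this for a rectangular domain D = [a, b] × [c, d]: it approximates the partial derivatives of Φ by central differences and sums ‖N(u, v)‖ ∆u ∆v over a grid of midpoints, exactly in the spirit of (8.3.44). The function name `surface_area`, the step size `h`, and the cylinder used as a test surface are our own assumptions and do not appear elsewhere in these notes.

```python
import math

def surface_area(phi, a, b, c, d, n=200, h=1e-6):
    """Approximate the area integral (8.3.46) over the rectangle [a, b] x [c, d].

    phi(u, v) must return a 3-tuple (x, y, z). Partial derivatives are estimated
    by central differences with step h; the integral uses an n-by-n midpoint rule.
    """
    du, dv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        for j in range(n):
            v = c + (j + 0.5) * dv
            # Finite-difference estimates of dPhi/du and dPhi/dv at (u, v).
            pu = [(p - m) / (2 * h) for p, m in zip(phi(u + h, v), phi(u - h, v))]
            pv = [(p - m) / (2 * h) for p, m in zip(phi(u, v + h), phi(u, v - h))]
            # Components of N = dPhi/du x dPhi/dv, see (8.2.28).
            nx = pu[1] * pv[2] - pu[2] * pv[1]
            ny = pu[2] * pv[0] - pu[0] * pv[2]
            nz = pu[0] * pv[1] - pu[1] * pv[0]
            total += math.sqrt(nx * nx + ny * ny + nz * nz) * du * dv
    return total

def cylinder(theta, z, R=1.5):
    """Hypothetical test surface: side of a cylinder of radius R about the z-axis."""
    return (R * math.cos(theta), R * math.sin(theta), z)

# Side of a cylinder of radius 1.5 and height 2: the area should be close to 2*pi*1.5*2.
print(surface_area(cylinder, 0.0, 2.0 * math.pi, 0.0, 2.0, n=50), 6.0 * math.pi)
```

Finite differences are used only so that the sketch works for any parametric function; when the partial derivatives are available in closed form it is of course better to use them directly.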

Remark 8.3.1. Notice that, in order to evaluate the area of the surface S we can use any of

the three integrals on the right hand side of (8.3.46). As a practical matter the second integral

is typically the easiest to do calculations with, as well as displaying very clearly the role of the

parametric representation in the area formula. Each of the integrals just involves integrating over

a region D of R2 and therefore can be evaluated using Fubini’s theorem when D is a rectangle (see

Theorem 2.1.5 and Remark 2.1.6), or more generally either a y-simple region or x-simple region in

R2 (see Remark 2.1.8).

Remark 8.3.2. The following important question arises in connection with the area formula given

by (8.3.46), namely the right side appears to depend on the particular parametric representation

that we have chosen for the surface S (recall (8.1.6) and (8.1.8)). This seeming dependence is shown

particularly by the second of the equivalent expressions on the right side of (8.3.46), in which the role

of the parametric representation ΦΦΦ : D → R3 is clearly indicated. However, the area of a surface is

intrinsic, and should not depend on the particular parametric representation we have chosen for the

surface. This means that if we choose to represent the same surface S by an alternative parametric

representation

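(8.3.47) $\tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3$ for some region $\tilde{D} \subset \mathbb{R}^2$,

so that in particular the surface S is traversed by $\tilde{\boldsymbol{\Phi}}(u, v)$ as (u, v) traverses the region $\tilde{D}$, that is (compare (8.1.8))

(8.3.48) $S := \{\tilde{\boldsymbol{\Phi}}(u, v) \in \mathbb{R}^3 \mid (u, v) \in \tilde{D}\},$

then we had better have

(8.3.49) $\operatorname{area} S = \displaystyle\int_{\tilde{D}} \left\|\frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial u}(u, v) \times \frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial v}(u, v)\right\|\,du\,dv,$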

if the area formula is to be “independent” of the parametric representation we have chosen for the

surface S! This involves showing that the quantities on the right sides of (8.3.49) and (8.3.46) are

equal. The following theorem, which we state without proof, guarantees that this is always the

case:

Theorem 8.3.3. Suppose that

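(8.3.50) $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ and $\tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3$

are alternative $C^1$-parametric representations of a surface S, so that

(8.3.51) $S := \{\boldsymbol{\Phi}(u, v) \in \mathbb{R}^3 \mid (u, v) \in D\}$ and $S := \{\tilde{\boldsymbol{\Phi}}(u, v) \in \mathbb{R}^3 \mid (u, v) \in \tilde{D}\},$

i.e. $\boldsymbol{\Phi}(u, v)$ traverses S as (u, v) traverses D, and likewise $\tilde{\boldsymbol{\Phi}}(u, v)$ also traverses S as (u, v) traverses $\tilde{D}$. Then

(8.3.52)
$$\int_D \left\|\frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v)\right\|\,du\,dv = \int_{\tilde{D}} \left\|\frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial u}(u, v) \times \frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial v}(u, v)\right\|\,du\,dv.$$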

It follows from Theorem 8.3.3 that the formula (8.3.46) for the area of the surface S does not

depend on which parametric representation we use for S. This means, in particular, that if we have

several parametric representations of a surface S then we should use that particular representation

which involves the least amount of work in the integrations for calculating the area. This will become

clear in later examples.
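A small numerical experiment illustrates the independence from the parametric representation. The sketch below uses a surface of our own choosing, not one from these notes: the flat patch S = {(x, y, x + y) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}, represented once by (u, v) ↦ (u, v, u + v) and once by (s, t) ↦ (g(s), t, g(s) + t) with g(s) = (s + s²)/2, both on the unit square. The partial derivatives are worked out by hand and the names `area`, `partials_first` and `partials_second` are hypothetical.

```python
import math

def norm_cross(a, b):
    """Euclidean norm of the cross product of two 3-vectors."""
    c = (a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0])
    return math.sqrt(sum(x * x for x in c))

def area(partials, n=100):
    """Midpoint rule on [0, 1] x [0, 1] for the second expression in (8.3.46).

    partials(u, v) returns the pair (dPhi/du, dPhi/dv) at the point (u, v).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            d_u, d_v = partials((i + 0.5) * h, (j + 0.5) * h)
            total += norm_cross(d_u, d_v) * h * h
    return total

def partials_first(u, v):
    # First representation: (u, v) -> (u, v, u + v).
    return (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)

def partials_second(s, t):
    # Second representation: (s, t) -> (g(s), t, g(s) + t), g(s) = (s + s^2)/2,
    # so g'(s) = (1 + 2s)/2, and g maps [0, 1] onto [0, 1].
    g1 = 0.5 * (1.0 + 2.0 * s)
    return (g1, 0.0, g1), (0.0, 1.0, 1.0)

# Both estimates should be close to sqrt(3), the area of the patch.
print(area(partials_first), area(partials_second), math.sqrt(3.0))
```

The integrands differ from point to point (the second one is not constant), yet the two integrals agree, as Theorem 8.3.3 asserts.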

Remark 8.3.4. Suppose a surface S is the graph of a function

(8.3.53) f : D → R,

as in Example 8.1.2 and Remark 8.1.6. In this case the area formula (8.3.46) simplifies, as we next

show. From Remark 8.1.6 we know that the surface S has the parametric representation ΦΦΦ : D → R3

in which

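(8.3.54) $\boldsymbol{\Phi}(u, v) := u\,\mathbf{i} + v\,\mathbf{j} + f(u, v)\,\mathbf{k}$ for all (u, v) in D,

(cf. (8.1.9)). We next calculate the u-derivative and v-derivative of $\boldsymbol{\Phi}$ at (8.3.54):

(8.3.55) $\dfrac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) = 1\,\mathbf{i} + 0\,\mathbf{j} + \dfrac{\partial f}{\partial u}(u, v)\,\mathbf{k},$

(8.3.56) $\dfrac{\partial\boldsymbol{\Phi}}{\partial v}(u, v) = 0\,\mathbf{i} + 1\,\mathbf{j} + \dfrac{\partial f}{\partial v}(u, v)\,\mathbf{k}.$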

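From (8.3.55) and (8.3.56) we get

(8.3.57)
$$\begin{aligned}
\frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v) &= \begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
1 & 0 & \frac{\partial f}{\partial u}(u, v) \\
0 & 1 & \frac{\partial f}{\partial v}(u, v)
\end{vmatrix} \\
&= -\frac{\partial f}{\partial u}(u, v)\,\mathbf{i} - \frac{\partial f}{\partial v}(u, v)\,\mathbf{j} + \mathbf{k},
\end{aligned}$$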

and calculating the Pythagorean length of the cross product vector at (8.3.57) we obtain

(8.3.58)

∥∥∥∥∂ΦΦΦ


∂v(u, v)

∥∥∥∥ =

√1 +

[∂f

∂u(u, v)

]2

+

[∂f

∂v(u, v)

]2

.

From (8.3.46), together with (8.3.58), we see that the area of the surface S given by the graph of

the function (8.3.53) is

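(8.3.59)
$$\operatorname{area} S = \int_D \sqrt{1 + \left[\frac{\partial f}{\partial u}(u, v)\right]^2 + \left[\frac{\partial f}{\partial v}(u, v)\right]^2}\,du\,dv.$$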

Example 8.3.5. In this example we shall determine the area of the surface S which is the top half

of the sphere of radius r centered at the origin of R3 (see Example 8.1.1). We know that S is the

graph of the function f given by (8.1.3) that is

(8.3.60) $f(u, v) := \sqrt{r^2 - u^2 - v^2}$, for all (u, v) in D,

defined on the disc of radius r in the u−v plane given by (8.1.2), that is

(8.3.61) $D = \{(u, v) \in \mathbb{R}^2 \mid u^2 + v^2 \le r^2\}.$

Evaluation of the area is therefore just a matter of substituting f and D given by (8.3.60) and

(8.3.61) into the formula given by (8.3.59) for the area of a surface which is the graph of a function

f , that is

(8.3.62)
$$\operatorname{area} S = \int_D \sqrt{1 + \left[\frac{\partial f}{\partial u}(u, v)\right]^2 + \left[\frac{\partial f}{\partial v}(u, v)\right]^2}\,du\,dv.$$

From (8.3.60) we get

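(8.3.63) $\dfrac{\partial f}{\partial u}(u, v) = \dfrac{-u}{\sqrt{r^2 - u^2 - v^2}}, \qquad \dfrac{\partial f}{\partial v}(u, v) = \dfrac{-v}{\sqrt{r^2 - u^2 - v^2}},$

and, from (8.3.63) we find

(8.3.64) $1 + \left[\dfrac{\partial f}{\partial u}(u, v)\right]^2 + \left[\dfrac{\partial f}{\partial v}(u, v)\right]^2 = \dfrac{r^2}{r^2 - u^2 - v^2}.$

Combining (8.3.64) and (8.3.62) we get

(8.3.65) $\operatorname{area} S = \displaystyle\int_D \frac{r}{\sqrt{r^2 - u^2 - v^2}}\,du\,dv.$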

Evaluation of this integral is quite laborious and complicated (although not impossible), because the

integrand involves the reciprocal of a square root (which is usually quite awkward to deal with) and

because the integration is over the disc D (see (8.3.61)) rather than over a nice simple region such

as a rectangle. For this reason let us see if there is not an easier way to determine the area. Recall

from Theorem 8.3.3 that we can use any parametric representation of S in the area formula (8.3.46),

and recall from Example 8.1.7 that we have an alternative parametric representation ΦΦΦ : D → R3

of the surface S in which (see (8.1.14) and (8.1.15))

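(8.3.66)
$$\begin{aligned}
D &= \{(\theta, \varphi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \varphi \le \pi/2\} \\
&= [0, 2\pi] \times [0, \pi/2],
\end{aligned}$$

and

(8.3.67) $\boldsymbol{\Phi}(\theta, \varphi) = r\sin(\varphi)\cos(\theta)\,\mathbf{i} + r\sin(\varphi)\sin(\theta)\,\mathbf{j} + r\cos(\varphi)\,\mathbf{k}$, for all (θ, φ) in D.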

Of the equivalent expressions given by (8.3.46) the second expression is typically the easiest to use,

that is

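(8.3.68) $\operatorname{area} S = \displaystyle\int_D \left\|\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \varphi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\varphi}(\theta, \varphi)\right\|\,d\theta\,d\varphi,$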

where we have just replaced the generic parametric variable (u, v) with the parametric variable (θ, φ)

specific to the surface S given by (8.3.66) - (8.3.67). We now calculate the θ-partial derivatives of

the scalar components of ΦΦΦ(θ, φ) at (8.3.67):

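(8.3.69) $\dfrac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \varphi) = -r\sin(\varphi)\sin(\theta)\,\mathbf{i} + r\sin(\varphi)\cos(\theta)\,\mathbf{j} + 0\,\mathbf{k}.$

Similarly for the φ-partial derivatives:

(8.3.70) $\dfrac{\partial\boldsymbol{\Phi}}{\partial\varphi}(\theta, \varphi) = r\cos(\varphi)\cos(\theta)\,\mathbf{i} + r\cos(\varphi)\sin(\theta)\,\mathbf{j} - r\sin(\varphi)\,\mathbf{k}.$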

Now calculate the vector cross product of the partial derivative vectors at (8.3.69) and (8.3.70) to

get

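(8.3.71)
$$\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \varphi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\varphi}(\theta, \varphi) = \begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
-r\sin(\varphi)\sin(\theta) & r\sin(\varphi)\cos(\theta) & 0 \\
r\cos(\varphi)\cos(\theta) & r\cos(\varphi)\sin(\theta) & -r\sin(\varphi)
\end{vmatrix}.$$

Expanding the right side of (8.3.71) gives

(8.3.72)
$$\begin{aligned}
\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \varphi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\varphi}(\theta, \varphi)
={}& \mathbf{i}\,[(r\sin(\varphi)\cos(\theta))(-r\sin(\varphi)) - 0] \\
&- \mathbf{j}\,[(-r\sin(\varphi)\sin(\theta))(-r\sin(\varphi)) - 0] \\
&+ \mathbf{k}\,[(-r\sin(\varphi)\sin(\theta))(r\cos(\varphi)\sin(\theta)) - (r\cos(\varphi)\cos(\theta))(r\sin(\varphi)\cos(\theta))] \\
={}& -r^2\left\{\sin^2(\varphi)\cos(\theta)\,\mathbf{i} + \sin^2(\varphi)\sin(\theta)\,\mathbf{j} + \sin(\varphi)\cos(\varphi)(\sin^2(\theta) + \cos^2(\theta))\,\mathbf{k}\right\} \\
={}& -r^2\left\{\sin^2(\varphi)\cos(\theta)\,\mathbf{i} + \sin^2(\varphi)\sin(\theta)\,\mathbf{j} + \sin(\varphi)\cos(\varphi)\,\mathbf{k}\right\}.
\end{aligned}$$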

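We now determine the Pythagorean length of the vector at (8.3.72):

(8.3.73)
$$\begin{aligned}
\left\|\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \varphi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\varphi}(\theta, \varphi)\right\|
&= r^2\sqrt{\sin^4(\varphi)\cos^2(\theta) + \sin^4(\varphi)\sin^2(\theta) + \sin^2(\varphi)\cos^2(\varphi)} \\
&= r^2\sqrt{\sin^4(\varphi) + \sin^2(\varphi)\cos^2(\varphi)} \\
&= r^2\sqrt{\sin^2(\varphi)[\sin^2(\varphi) + \cos^2(\varphi)]} \\
&= r^2|\sin(\varphi)|.
\end{aligned}$$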

Now put (8.3.73) into (8.3.68) to get

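(8.3.74)
$$\begin{aligned}
\operatorname{area} S &= \int_D \left\|\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \varphi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\varphi}(\theta, \varphi)\right\|\,d\theta\,d\varphi \\
&= r^2\int_0^{2\pi}\left\{\int_0^{\pi/2}|\sin(\varphi)|\,d\varphi\right\}d\theta \\
&= r^2\int_0^{2\pi} 1\,d\theta \\
&= 2\pi r^2.
\end{aligned}$$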

Note that we used (8.3.73), the fact that D is a rectangle (see (8.3.66)), and the Fubini Theorem

(recall (2.1.12)) at the second equality at (8.3.74).
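The iterated integral at (8.3.74) is simple enough to confirm numerically. The following sketch applies a midpoint rule to the inner integral of |sin(φ)| over [0, π/2] and multiplies by 2π r², which is all the outer integral contributes since the integrand does not depend on θ. The function name `hemisphere_area` and the sample radius are ours.

```python
import math

def hemisphere_area(r, n=1000):
    """Midpoint-rule evaluation of r^2 * int_0^{2 pi} { int_0^{pi/2} |sin(phi)| dphi } dtheta."""
    h = (math.pi / 2.0) / n
    # Inner integral over phi; the exact value is 1.
    inner = sum(abs(math.sin((k + 0.5) * h)) * h for k in range(n))
    # The integrand is independent of theta, so the outer integral contributes a factor 2*pi.
    return r * r * inner * 2.0 * math.pi

r = 3.0
print(hemisphere_area(r), 2.0 * math.pi * r * r)
```

The two printed values should agree closely, matching the familiar fact that a hemisphere of radius r has half the area 4πr² of the full sphere.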

Remark 8.3.6. Example 8.3.5 illustrates a very important aspect of the area formula (8.3.46),

namely the choice of parametric representation of the surface S can substantially influence the

amount of work involved in using this formula. Indeed, we saw that the parametric representation

of S as the graph of the function (8.3.60) over the region D given by (8.3.61) leads to the rather

complicated integral at (8.3.65), whereas for the parametric representation of S given by (8.3.66)

and (8.3.67) the area formula is quite easy to use.

Example 8.3.7. Determine the area of the helicoid given in Example 8.1.10. The parametric

representation

ΦΦΦ : D → R3

of the helicoid is given by (8.1.18) and the region D is given by (8.1.19) (equivalently by (8.1.20)),

that is

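(8.3.75)
$$\begin{aligned}
D &= \{(r, \theta) \in \mathbb{R}^2 \mid 0 \le r \le 1 \text{ and } 0 \le \theta \le 2\pi\} \\
&= [0, 1] \times [0, 2\pi],
\end{aligned}$$

and

(8.3.76) $\boldsymbol{\Phi}(r, \theta) = r\cos(\theta)\,\mathbf{i} + r\sin(\theta)\,\mathbf{j} + \theta\,\mathbf{k}$, for all (r, θ) in D.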

We see that (8.3.46) gives three equivalent expressions for the area. As we have already noted the

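second of these expressions is typically the easiest to use, so the area is given by

(8.3.77) $\text{area of helicoid} = \displaystyle\int_D \left\|\frac{\partial\boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(r, \theta)\right\|\,dr\,d\theta$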

in which we have replaced the generic parametric variables (u, v) in (8.3.46) with the parametric

variables (r, θ) that are specific to the helicoid. We now calculate the r and θ-partial derivatives of

the scalar components of ΦΦΦ(r, θ) given by (8.3.76):

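\[
\begin{aligned}
\frac{\partial \boldsymbol{\Phi}}{\partial r}(r,\theta) &= \cos(\theta)\,\mathbf{i} + \sin(\theta)\,\mathbf{j} + 0\,\mathbf{k}, \\
\frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r,\theta) &= -r\sin(\theta)\,\mathbf{i} + r\cos(\theta)\,\mathbf{j} + 1\,\mathbf{k}.
\end{aligned}
\tag{8.3.78}
\]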

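Now calculate the vector cross product of the partial derivative vectors at (8.3.78) to get

\[
\frac{\partial \boldsymbol{\Phi}}{\partial r}(r,\theta) \times \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r,\theta) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\cos(\theta) & \sin(\theta) & 0 \\
-r\sin(\theta) & r\cos(\theta) & 1
\end{vmatrix},
\tag{8.3.79}
\]

that is

\[
\begin{aligned}
\frac{\partial \boldsymbol{\Phi}}{\partial r}(r,\theta) \times \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r,\theta)
&= \sin(\theta)\,\mathbf{i} - \cos(\theta)\,\mathbf{j} + [r\cos^2(\theta) + r\sin^2(\theta)]\,\mathbf{k} \\
&= \sin(\theta)\,\mathbf{i} - \cos(\theta)\,\mathbf{j} + r\,\mathbf{k}.
\end{aligned}
\tag{8.3.80}
\]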

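Now calculate the Pythagorean length of the vector at (8.3.80):

\[
\left\| \frac{\partial \boldsymbol{\Phi}}{\partial r}(r,\theta) \times \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r,\theta) \right\| = \sqrt{\sin^2(\theta) + \cos^2(\theta) + r^2} = \sqrt{1 + r^2}.
\tag{8.3.81}
\]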

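From (8.3.81) and (8.3.77),

\[
\begin{aligned}
\text{area of helicoid}
&= \int_D \sqrt{r^2 + 1} \, dr \, d\theta \\
&= \int_0^{2\pi} \left\{ \int_0^1 \sqrt{r^2 + 1} \, dr \right\} d\theta \quad \text{(from Remark 2.1.6 and (8.3.75))}.
\end{aligned}
\tag{8.3.82}
\]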

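A rather lengthy and tedious integration by substitution (for example with $r = \sinh(t)$) gives

\[
\int_0^1 \sqrt{r^2 + 1} \, dr = \frac{1}{2}\left[\sqrt{2} + \log(1 + \sqrt{2})\right],
\tag{8.3.83}
\]

in which $\log$ denotes the natural logarithm, and then (8.3.82) with (8.3.83) give

\[
\text{area of helicoid} = \pi\left[\sqrt{2} + \log(1 + \sqrt{2})\right].
\]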
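The closed form for the helicoid area can be checked numerically. The sketch below, in which the function name `helicoid_area` and the number of subintervals `n` are illustrative assumptions, applies a midpoint rule to the inner integral of $\sqrt{r^2 + 1}$ and compares the result with the closed-form value.

```python
import math

def helicoid_area(n=2000):
    """Midpoint-rule estimate of the helicoid area, integrating sqrt(1 + r^2) over D."""
    dr = 1.0 / n
    inner = sum(math.sqrt(1.0 + ((k + 0.5) * dr) ** 2) for k in range(n)) * dr
    # The integrand does not depend on theta, so the theta-integral contributes 2*pi.
    return 2.0 * math.pi * inner

# Closed form pi * [sqrt(2) + log(1 + sqrt(2))], with log the natural logarithm.
closed_form = math.pi * (math.sqrt(2.0) + math.log(1.0 + math.sqrt(2.0)))

if __name__ == "__main__":
    # The two printed values should nearly agree.
    print(helicoid_area(), closed_form)
```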

Remark 8.3.8. For later use we repeat the area formulae (8.3.44) and (8.3.45). We are given the parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ of a surface $S$ for some region $D \subset \mathbb{R}^2_{uv}$. Fix a rectangle

\[
\Delta D := \{ (u, v) \mid u_0 \le u \le u_0 + \Delta u, \ v_0 \le v \le v_0 + \Delta v \},
\tag{8.3.84}
\]

contained within the region $D$, in which $\Delta u > 0$ and $\Delta v > 0$ are small, and let $\Delta S$ be the corresponding small surface which is the image of $\Delta D$, that is

\[
\Delta S := \{ \boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D \},
\tag{8.3.85}
\]

(see Figure 8.10). Since the edges $\Delta u$ and $\Delta v$ are small, the area of the surface $\Delta S$ is approximately given by

\[
\operatorname{area} \Delta S \approx \| \mathbf{N}(u_0, v_0) \| \, \Delta u \, \Delta v,
\tag{8.3.86}
\]

(see (8.3.44)), in which $\mathbf{N}(u_0, v_0)$ is the vector normal to the approximately flat surface $\Delta S$, given by

\[
\mathbf{N}(u_0, v_0) = \frac{\partial \boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u_0, v_0),
\tag{8.3.87}
\]

(see (8.2.28)). As $\Delta u$ and $\Delta v$ shrink to the infinitesimals $du$ and $dv$, the small rectangle $\Delta D$ with lower left corner given by $(u_0, v_0)$ shrinks to the infinitesimal rectangle $dD$ (still with lower left corner given by $(u_0, v_0)$), the piece of surface $\Delta S$ shrinks to the infinitesimal parallelogram $dS$ (still with one corner “anchored” at the point $A$ given by $\boldsymbol{\Phi}(u_0, v_0)$ as in Figure 8.10), and the approximation at (8.3.86) becomes exact, so that

\[
\operatorname{area} dS = \| \mathbf{N}(u_0, v_0) \| \, du \, dv.
\tag{8.3.88}
\]

Figure 8.10: Small rectangle $\Delta D$ and approximate parallelogram $\Delta S$

We shall use the area formulae (8.3.86) and (8.3.88) in Section 8.4 on surface integrals of scalar fields, and in Section 8.5 on surface integrals of vector fields.
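To see the approximation (8.3.86) at work, the sketch below compares the exact area of a small patch of the helicoid of Example 8.3.7 with the estimate $\|\mathbf{N}\|\,\Delta r\,\Delta\theta$, using $\|\mathbf{N}(r,\theta)\| = \sqrt{1 + r^2}$ from (8.3.81). The function names and the antiderivative helper `G` are illustrative assumptions; the ratio of the two areas should approach 1 as the patch shrinks.

```python
import math

def exact_patch_area(r0, dr, dtheta):
    """Exact area of the helicoid patch over [r0, r0 + dr] x [theta0, theta0 + dtheta]."""
    def G(r):
        # An antiderivative of sqrt(1 + r^2).
        return 0.5 * (r * math.sqrt(1.0 + r * r) + math.asinh(r))
    return dtheta * (G(r0 + dr) - G(r0))

def approx_patch_area(r0, dr, dtheta):
    """Approximation ||N(r0, theta0)|| * dr * dtheta, with ||N|| = sqrt(1 + r0^2)."""
    return math.sqrt(1.0 + r0 * r0) * dr * dtheta

def ratio(r0, step):
    """Ratio of approximate to exact patch area for a square patch of side `step`."""
    return approx_patch_area(r0, step, step) / exact_patch_area(r0, step, step)

if __name__ == "__main__":
    for step in (0.1, 0.01, 0.001):
        print(step, ratio(0.5, step))
```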


8.4 Surface Integral of a Scalar Field

In Section 8.3 we obtained a formula for the area of a surface $S$ with a parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ (see (8.3.46) and (8.2.36)). In this section our goal is to generalize this idea to construct the integral of a given scalar field over the surface $S$. For a concrete instance of how this type of integral could be useful, suppose that the surface $S$ describes an infinitesimally thin sheet of plastic, and for each point $(x, y, z)$ on $S$ the function value $f(x, y, z)$ gives the charge density (in units of coul./m$^2$) concentrated on the surface $S$ at the point $(x, y, z)$. The integral that we are going to define will enable us to determine the total charge on the surface.

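To fix ideas suppose that a surface $S$ has the parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$, in which we suppose for concreteness that the region $D \subset \mathbb{R}^2_{uv}$ is the rectangle at (8.2.22), that is

\[
D = \{ (u, v) \mid a \le u \le b, \ c \le v \le d \} = [a, b] \times [c, d],
\tag{8.4.89}
\]

and $f : \mathbb{R}^3 \to \mathbb{R}$ is a given continuous scalar field. As in Section 8.3 we will divide the region $D$ into small rectangles $\Delta D$ (exactly as at (8.3.38) and (8.3.84)). Then $\boldsymbol{\Phi}$ maps $\Delta D$ onto the piece of surface $\Delta S$ given by (8.3.85), that is (see Figure 8.10)

\[
\Delta S := \{ \boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D \},
\tag{8.4.90}
\]

which is an approximate parallelogram with area given by (8.3.86), that is

\[
\operatorname{area} \Delta S \approx \| \mathbf{N}(u_0, v_0) \| \, \Delta u \, \Delta v,
\tag{8.4.91}
\]

in which, from (8.3.87),

\[
\mathbf{N}(u_0, v_0) := \frac{\partial \boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u_0, v_0).
\tag{8.4.92}
\]

We now multiply the area at (8.4.91) by the value of the scalar field $f$ at the point $\boldsymbol{\Phi}(u_0, v_0)$ corresponding to the corner $A$ in Figure 8.10, that is by the value $f(\boldsymbol{\Phi}(u_0, v_0))$, to get

\[
f(\boldsymbol{\Phi}(u_0, v_0)) \operatorname{area} \Delta S \approx f(\boldsymbol{\Phi}(u_0, v_0)) \, \| \mathbf{N}(u_0, v_0) \| \, \Delta u \, \Delta v.
\tag{8.4.93}
\]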

What is the significance of the quantity at (8.4.93)? To get some idea of this suppose that at each point $(x, y, z)$ on the surface $S$ the value $f(x, y, z)$ gives the density of charge per unit area concentrated on the surface at $(x, y, z)$ (with units coul./m$^2$). Then, in particular, the charge density on the surface at the point $\boldsymbol{\Phi}(u_0, v_0)$ in Figure 8.10 (i.e. given by $(x, y, z) = \boldsymbol{\Phi}(u_0, v_0)$) is $f(\boldsymbol{\Phi}(u_0, v_0))$ coul./m$^2$, so that the quantity at (8.4.93) is approximately the total charge on the small piece of surface $\Delta S$. Exactly as at (8.3.45), as $\Delta u$ and $\Delta v$ shrink to the infinitesimals $du$ and $dv$ the small rectangle $\Delta D$ in Figure 8.10 with lower left corner given by $(u_0, v_0)$ shrinks to the infinitesimal rectangle $dD$ (still with lower left corner given by $(u_0, v_0)$), the piece of surface $\Delta S$ shrinks to the infinitesimal parallelogram $dS$ (still with one corner “anchored” at the point $A$ given by $\boldsymbol{\Phi}(u_0, v_0)$ as in Figure 8.10), and the approximation at (8.4.93) becomes exact, so that

\[
f(\boldsymbol{\Phi}(u_0, v_0)) \operatorname{area} dS = f(\boldsymbol{\Phi}(u_0, v_0)) \, \| \mathbf{N}(u_0, v_0) \| \, du \, dv.
\tag{8.4.94}
\]

We now “sum” or integrate the elemental quantities at (8.4.94) as the infinitesimal rectangles $dD$ cover the whole rectangle $D$ to get the quantity

\[
\int_D f(\boldsymbol{\Phi}(u, v)) \, \| \mathbf{N}(u, v) \| \, du \, dv.
\tag{8.4.95}
\]

This quantity is known as the surface integral of the scalar field $f$ over the surface $S$. If we recall the particular interpretation of $f(x, y, z)$ being the charge density at any point $(x, y, z)$ on the surface then it is clear that the surface integral at (8.4.95) gives the total charge on the surface $S$.

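In view of (8.4.92) and (8.2.36) (but replacing $(u_0, v_0)$ with $(u, v)$) we can equally well write the surface integral at (8.4.95) as

\[
\int_D f(\boldsymbol{\Phi}(u, v)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right\| \, du \, dv
\tag{8.4.96}
\]

as well as

\[
\int_D f(\boldsymbol{\Phi}(u, v)) \sqrt{ \left[ \frac{\partial(x, y)}{\partial(u, v)}(u, v) \right]^2 + \left[ \frac{\partial(y, z)}{\partial(u, v)}(u, v) \right]^2 + \left[ \frac{\partial(z, x)}{\partial(u, v)}(u, v) \right]^2 } \, du \, dv
\tag{8.4.97}
\]

in the sense that all three integrals at (8.4.95), (8.4.96) and (8.4.97) are equal. Among these various notations that given by (8.4.96) is typically the most convenient for actual calculations, and this is the notation we shall use from now on.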
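The equality of the integrands at (8.4.96) and (8.4.97) can be seen concretely: the three $2 \times 2$ Jacobian determinants are, up to ordering, the components of the cross product. The sketch below checks this at a few points of the helicoid of Example 8.3.7. It assumes the convention $\partial(x,y)/\partial(u,v) = x_u y_v - x_v y_u$ (the sign is immaterial here, since each determinant is squared), and the function names are illustrative.

```python
import math

def cross(a, b):
    """Vector cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def cross_norm(pu, pv):
    """Length of dPhi/du x dPhi/dv, the integrand factor in (8.4.96)."""
    return math.sqrt(sum(c * c for c in cross(pu, pv)))

def jacobian_norm(pu, pv):
    """Square root of the sum of squared 2x2 Jacobians, the factor in (8.4.97)."""
    xu, yu, zu = pu
    xv, yv, zv = pv
    j_xy = xu * yv - xv * yu
    j_yz = yu * zv - yv * zu
    j_zx = zu * xv - zv * xu
    return math.sqrt(j_xy ** 2 + j_yz ** 2 + j_zx ** 2)

def helicoid_partials(r, theta):
    """Partial derivatives (8.3.78) of Phi(r, theta) = (r cos(theta), r sin(theta), theta)."""
    p_r = (math.cos(theta), math.sin(theta), 0.0)
    p_theta = (-r * math.sin(theta), r * math.cos(theta), 1.0)
    return p_r, p_theta

if __name__ == "__main__":
    for r, theta in [(0.0, 0.0), (0.5, 1.0), (1.0, 4.0)]:
        p_r, p_theta = helicoid_partials(r, theta)
        print(cross_norm(p_r, p_theta), jacobian_norm(p_r, p_theta), math.sqrt(1.0 + r * r))
```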

A question very similar to that addressed in Remark 8.3.2 for the area formula arises in connection with surface integrals, namely are surface integrals of scalar fields independent of the parametrization that we choose for the surface $S$? Again, they had better be! In fact, going back to our motivating interpretation of $f$ as the charge per unit area on the surface, we have noted that the surface integrals (8.4.95), (8.4.96) and (8.4.97) all give the total charge on the surface $S$, and this of course should not depend on the particular parametric representation of the surface $S$. That surface integrals are indeed independent of whichever parametric representation we use for the surface $S$ is guaranteed by the following analog (in fact generalization) of Theorem 8.3.3, which we again state without proof:

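Theorem 8.4.1. Suppose that $f : \mathbb{R}^3 \to \mathbb{R}$ is a continuous scalar field, and that

\[
\boldsymbol{\Phi} : D \to \mathbb{R}^3 \quad \text{and} \quad \tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3
\tag{8.4.98}
\]

are parametric representations of the same surface $S$, that is

\[
S = \{ \boldsymbol{\Phi}(u, v) \mid (u, v) \in D \} = \{ \tilde{\boldsymbol{\Phi}}(u, v) \mid (u, v) \in \tilde{D} \},
\tag{8.4.99}
\]

i.e. $\boldsymbol{\Phi}(u, v)$ traverses $S$ as $(u, v)$ traverses the region $D$, and likewise $\tilde{\boldsymbol{\Phi}}(u, v)$ traverses the same surface $S$ as $(u, v)$ traverses the region $\tilde{D}$. Then

\[
\int_D f(\boldsymbol{\Phi}(u, v)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right\| \, du \, dv
= \int_{\tilde{D}} f(\tilde{\boldsymbol{\Phi}}(u, v)) \left\| \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial u}(u, v) \times \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial v}(u, v) \right\| \, du \, dv.
\tag{8.4.100}
\]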
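Theorem 8.4.1 can be illustrated numerically. The sketch below integrates the sample scalar field $f(x, y, z) = z$ over the helicoid of Example 8.3.7 using two different parametric representations: the one at (8.3.76), and a hypothetical alternative obtained by setting $r = s^2$ and $\theta = 2\pi t$ for $(s, t)$ in $[0, 1] \times [0, 1]$. The integrator `surface_integral`, which uses a midpoint rule and central finite differences in place of exact partial derivatives, and all other names are illustrative assumptions.

```python
import math

def surface_integral(f, Phi, u_range, v_range, n=200, h=1e-6):
    """Midpoint-rule estimate of the integral of f(Phi) * ||dPhi/du x dPhi/dv||."""
    (a, b), (c, d) = u_range, v_range
    du, dv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        for j in range(n):
            v = c + (j + 0.5) * dv
            # Central finite differences stand in for the exact partial derivatives.
            pu = [(p - q) / (2.0 * h) for p, q in zip(Phi(u + h, v), Phi(u - h, v))]
            pv = [(p - q) / (2.0 * h) for p, q in zip(Phi(u, v + h), Phi(u, v - h))]
            nx = pu[1] * pv[2] - pu[2] * pv[1]
            ny = pu[2] * pv[0] - pu[0] * pv[2]
            nz = pu[0] * pv[1] - pu[1] * pv[0]
            total += f(*Phi(u, v)) * math.sqrt(nx * nx + ny * ny + nz * nz) * du * dv
    return total

def helicoid(r, theta):
    """Parametric representation (8.3.76)."""
    return (r * math.cos(theta), r * math.sin(theta), theta)

def helicoid_alt(s, t):
    """Same surface, traversed with r = s^2 and theta = 2*pi*t (hypothetical choice)."""
    return helicoid(s * s, 2.0 * math.pi * t)

def height(x, y, z):
    """Sample scalar field f(x, y, z) = z."""
    return z

first = surface_integral(height, helicoid, (0.0, 1.0), (0.0, 2.0 * math.pi))
second = surface_integral(height, helicoid_alt, (0.0, 1.0), (0.0, 1.0))

if __name__ == "__main__":
    # Both estimates should nearly agree with each other.
    print(first, second)
```

For this particular field the integral can also be evaluated by hand from (8.3.81) and (8.3.83), giving $\pi^2[\sqrt{2} + \log(1 + \sqrt{2})]$, which offers a further check on both estimates.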

Remark 8.4.2. A large variety of notations are commonly encountered in the literature to serve as short-hand for the surface integrals at (8.4.95), (8.4.96) and (8.4.97), in particular

\[
\int_S f(x, y, z) \, dA, \qquad \int_S f(x, y, z) \, d\sigma, \qquad \text{and} \qquad \int_S f(x, y, z) \, dS,
\tag{8.4.101}
\]

as well as

\[
\int_S f \, dA, \qquad \int_S f \, d\sigma, \qquad \text{and} \qquad \int_S f \, dS.
\tag{8.4.102}
\]

In all these notations the essential elements are the subscript $S$ of the integral, indicating the surface over which one integrates, and of course the integrand $f$, which indicates the scalar field being integrated. The notations at (8.4.101) are quite explicit, and remind us that we are integrating over $S$ with respect to an underlying space variable in $\mathbb{R}^3$ generically denoted by $(x, y, z)$. It can often be rather tedious to keep carrying the space variable argument $(x, y, z)$, and this is the reason for introducing the notations at (8.4.102), in which the space variable is suppressed (but always understood to be present!). In this course we shall typically use the first of the notations at (8.4.102), so that we write

\[
\begin{aligned}
\int_S f \, dA
&= \int_D f(\boldsymbol{\Phi}(u, v)) \, \| \mathbf{N}(u, v) \| \, du \, dv \\
&= \int_D f(\boldsymbol{\Phi}(u, v)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right\| \, du \, dv \\
&= \int_D f(\boldsymbol{\Phi}(u, v)) \sqrt{ \left[ \frac{\partial(x, y)}{\partial(u, v)}(u, v) \right]^2 + \left[ \frac{\partial(y, z)}{\partial(u, v)}(u, v) \right]^2 + \left[ \frac{\partial(z, x)}{\partial(u, v)}(u, v) \right]^2 } \, du \, dv.
\end{aligned}
\tag{8.4.103}
\]

Observe that our notation on the left side of (8.4.103) for the surface integral of the scalar field $f$ over the surface $S$ completely suppresses all mention of the particular parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ of the surface $S$. This is exactly as things should be, for we know from Theorem 8.4.1 that the surface integral does not in fact depend on which particular parametric representation of the surface $S$ is used. Again, we recall that the second of the three expressions on the right of (8.4.103) is usually the easiest to use in actual calculations of the surface integral. We illustrate this in Example 8.4.3 which follows. Finally, note that if the function $f$ is such that

\[
f(x, y, z) = 1, \quad \text{for all } (x, y, z) \text{ in } S,
\]

then (8.4.103) just reduces to the area formula (8.3.46) as one would expect.

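Example 8.4.3. Suppose that $S$ is the helicoid of Example 8.1.10 and the scalar field is

\[
f(x, y, z) := \sqrt{x^2 + y^2 + 1}, \quad \text{for all } (x, y, z) \text{ in } \mathbb{R}^3.
\tag{8.4.104}
\]

Determine the surface integral

\[
\int_S f \, dA.
\]

In Example 8.1.10 we have seen that the helicoid has the parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which

\[
D = \{ (r, \theta) \in \mathbb{R}^2 \mid 0 \le r \le 1 \text{ and } 0 \le \theta \le 2\pi \} = [0, 1] \times [0, 2\pi],
\tag{8.4.105}
\]

and

\[
\boldsymbol{\Phi}(r, \theta) = r\cos(\theta)\,\mathbf{i} + r\sin(\theta)\,\mathbf{j} + \theta\,\mathbf{k}, \quad \text{for all } (r, \theta) \text{ in } D.
\tag{8.4.106}
\]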

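Of the three equivalent expressions given by (8.4.103) the easiest to use is usually the second expression, that is

\[
\int_S f \, dA = \int_D f(\boldsymbol{\Phi}(r, \theta)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r, \theta) \right\| \, dr \, d\theta,
\tag{8.4.107}
\]

in which the generic parametric variable $(u, v)$ of (8.4.103) is replaced with the parametric variable $(r, \theta)$ specific to the helicoid. In Example 8.3.7 (in which we determined the area of the helicoid) we have already calculated

\[
\left\| \frac{\partial \boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r, \theta) \right\| = \sqrt{1 + r^2}
\tag{8.4.108}
\]

(see (8.3.81)). Moreover, from (8.4.106) and (8.4.104), we have

\[
f(\boldsymbol{\Phi}(r, \theta)) = \sqrt{r^2\cos^2(\theta) + r^2\sin^2(\theta) + 1} = \sqrt{r^2 + 1}.
\tag{8.4.109}
\]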

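Now put (8.4.109) and (8.4.108) into (8.4.107) to get

\[
\begin{aligned}
\int_S f \, dA
&= \int_D [r^2 + 1] \, dr \, d\theta \\
&= \int_0^{2\pi} \left\{ \int_0^1 [r^2 + 1] \, dr \right\} d\theta \quad \text{(from Remark 2.1.6 and (8.4.105))} \\
&= \int_0^{2\pi} \frac{4}{3} \, d\theta = \frac{8\pi}{3}.
\end{aligned}
\tag{8.4.110}
\]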
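The value $8\pi/3$ can be confirmed with a short numerical sketch. It evaluates the scalar field (8.4.104) on the helicoid and multiplies by the length of the normal vector from (8.4.108) before summing with a midpoint rule. The function names and the grid size `n` are illustrative assumptions.

```python
import math

def integrand(r, theta):
    """f(Phi(r, theta)) * ||dPhi/dr x dPhi/dtheta|| for the helicoid of Example 8.4.3."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    f_value = math.sqrt(x * x + y * y + 1.0)    # scalar field (8.4.104)
    normal_length = math.sqrt(1.0 + r * r)      # length of the normal, from (8.4.108)
    return f_value * normal_length

def helicoid_surface_integral(n=200):
    """Midpoint-rule estimate of the surface integral over D = [0, 1] x [0, 2*pi]."""
    dr, dtheta = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            total += integrand(r, theta) * dr * dtheta
    return total

if __name__ == "__main__":
    # The two printed values should nearly agree.
    print(helicoid_surface_integral(), 8.0 * math.pi / 3.0)
```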

8.5 Surface Integral of a Vector Field

In Section 8.4 we obtained a formula for the surface integral of a scalar function $f$ over a surface $S$ with a parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ (see (8.4.103)). In this section our goal is to define an analogous integral, but this time for a vector field, leading to the surface integral of a vector field over a given surface. Our construction of this surface integral will very closely parallel the construction of the surface integral of a scalar field in Section 8.4.

In fact, proceeding exactly as in Section 8.4, suppose that the surface $S$ has the parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$, in which we again suppose for concreteness that the region $D$ is the rectangle

\[
D = \{ (u, v) \mid a \le u \le b, \ c \le v \le d \} = [a, b] \times [c, d],
\tag{8.5.111}
\]

but now $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is a given continuous vector field. As in Section 8.3 we will divide the region $D$ into small rectangles $\Delta D$ (exactly as at (8.3.38) and (8.3.84)). Then $\boldsymbol{\Phi}$ maps $\Delta D$ onto the piece of surface $\Delta S$ given by (8.3.85), that is (see Figure 8.10)

\[
\Delta S := \{ \boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D \},
\tag{8.5.112}
\]

which is an approximate parallelogram with area given by (8.3.86), that is

\[
\operatorname{area} \Delta S \approx \| \mathbf{N}(u_0, v_0) \| \, \Delta u \, \Delta v,
\tag{8.5.113}
\]

in which, from (8.3.87),

\[
\mathbf{N}(u_0, v_0) := \frac{\partial \boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u_0, v_0).
\tag{8.5.114}
\]

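However, instead of multiplying the area at (8.5.113) by $f(\boldsymbol{\Phi}(u_0, v_0))$, as we did at (8.4.93) in the construction of the surface integral of a scalar field $f$, we now multiply this area by the scalar quantity given by the inner product $\mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0)$, in which $\mathbf{n}(u_0, v_0)$ is the unit normal to the surface $\Delta S$ at the point $\boldsymbol{\Phi}(u_0, v_0)$ given by

\[
\mathbf{n}(u_0, v_0) := \frac{\mathbf{N}(u_0, v_0)}{\| \mathbf{N}(u_0, v_0) \|}, \quad \text{for every } (u_0, v_0) \text{ in } D,
\tag{8.5.115}
\]

(recall (8.2.35)). That is, we calculate

\[
\begin{aligned}
\mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0) \operatorname{area} \Delta S
&\approx \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0) \, \| \mathbf{N}(u_0, v_0) \| \, \Delta u \, \Delta v \\
&\approx \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{N}(u_0, v_0) \, \Delta u \, \Delta v,
\end{aligned}
\]

where the first $\approx$ follows from (8.5.113) and the second $\approx$ follows from (8.5.115). We therefore have

\[
\mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0) \operatorname{area} \Delta S \approx \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{N}(u_0, v_0) \, \Delta u \, \Delta v.
\tag{8.5.116}
\]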

What is the significance of the quantity on the left side of (8.5.116)? To get a sense of this suppose that we have electric charge moving through space, and that at each point $(x, y, z)$ in $\mathbb{R}^3$ the vector $\mathbf{F}(x, y, z)$ represents the current density at $(x, y, z)$. In the notation of Example 3.1.6, where the current density was introduced, we should really write $\mathbf{J}(x, y, z)$ instead of $\mathbf{F}(x, y, z)$ for the current density, but we will continue to use $\mathbf{F}(x, y, z)$. We know from Example 3.1.6 that the quantity on the left of (8.5.116) is the total current passing through the surface $\Delta S$. Of course, we would like to determine the total current passing through the whole surface $S$, and this means “adding up” or “integrating” the currents passing through the small surfaces $\Delta S$ given by (8.5.116). To this end we proceed just as we did in Section 8.4. Exactly as at (8.3.45), as $\Delta u$ and $\Delta v$ shrink to the infinitesimals $du$ and $dv$ the small rectangle $\Delta D$ in Figure 8.10 with lower left corner given by $(u_0, v_0)$ shrinks to the infinitesimal rectangle $dD$ (still with lower left corner given by $(u_0, v_0)$), the piece of surface $\Delta S$ shrinks to the infinitesimal parallelogram $dS$ (still with one corner “anchored” at the point $A$ given by $\boldsymbol{\Phi}(u_0, v_0)$ as in Figure 8.10), and the approximation at (8.5.116) becomes exact, so that

\[
\mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0) \operatorname{area} dS = \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{N}(u_0, v_0) \, du \, dv.
\tag{8.5.117}
\]

We now “sum” or integrate the elemental quantities at (8.5.117) as the infinitesimal rectangles $dD$ cover the whole rectangle $D$ to get the quantity

\[
\int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \mathbf{N}(u, v) \, du \, dv.
\tag{8.5.118}
\]

This quantity is known as the surface integral of the vector field $\mathbf{F}$ over the surface $S$. We note that the surface integral of a vector field, like the surface integral of a scalar field, is a scalar quantity (and definitely not a vector quantity!). If we recall the particular interpretation of $\mathbf{F}(x, y, z)$ being the current density at any point $(x, y, z)$ on the surface then it is clear that the surface integral at (8.5.118) gives the total current passing through the surface $S$.

In view of (8.2.28) (but replacing $(u_0, v_0)$ with $(u, v)$) we can equally well write the surface integral at (8.5.118) as

\[
\int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du \, dv
\tag{8.5.119}
\]

in the sense that the integrals at (8.5.118) and (8.5.119) are equal. The integral at (8.5.119) is typically the most convenient for actual calculations, and this is what we shall use from now on.
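The integral at (8.5.119) translates almost directly into a numerical recipe. The sketch below is one possible implementation under stated assumptions: the function `flux`, its midpoint rule, and its central finite differences (standing in for exact partial derivatives) are illustrative, as are the sample surface `plane` and the sample constant field. For this sample the normal is $\mathbf{i} + \mathbf{j} + \mathbf{k}$ everywhere, so the integrand equals $1 + 2 + 3 = 6$ on a domain of unit area.

```python
def flux(F, Phi, u_range, v_range, n=100, h=1e-6):
    """Midpoint-rule estimate of the integral of F(Phi) . (dPhi/du x dPhi/dv) over a rectangle."""
    (a, b), (c, d) = u_range, v_range
    du, dv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        for j in range(n):
            v = c + (j + 0.5) * dv
            # Central finite differences stand in for the exact partial derivatives.
            pu = [(p - q) / (2.0 * h) for p, q in zip(Phi(u + h, v), Phi(u - h, v))]
            pv = [(p - q) / (2.0 * h) for p, q in zip(Phi(u, v + h), Phi(u, v - h))]
            normal = (pu[1] * pv[2] - pu[2] * pv[1],
                      pu[2] * pv[0] - pu[0] * pv[2],
                      pu[0] * pv[1] - pu[1] * pv[0])
            field = F(*Phi(u, v))
            total += sum(fc * nc for fc, nc in zip(field, normal)) * du * dv
    return total

def plane(u, v):
    """Hypothetical sample surface: a piece of the plane x + y + z = 1."""
    return (u, v, 1.0 - u - v)

def constant_field(x, y, z):
    """Hypothetical sample vector field F = i + 2j + 3k."""
    return (1.0, 2.0, 3.0)

result = flux(constant_field, plane, (0.0, 1.0), (0.0, 1.0))

if __name__ == "__main__":
    print(result)  # should be close to 6
```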

A question very similar to that addressed in Remark 8.3.2 for the area formula, and addressed

by Theorem 8.4.1 for surface integrals of scalar fields, of course also arises in connection with surface

integrals of vector fields, namely is the surface integral of a vector field independent of the particular

parametrization that we choose for the surface S? That this is indeed the case is guaranteed by the

following analog of Theorem 8.4.1 which we again state without proof:

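Theorem 8.5.1. Suppose that $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is a continuous vector field, and that

\[
\boldsymbol{\Phi} : D \to \mathbb{R}^3 \quad \text{and} \quad \tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3
\tag{8.5.120}
\]

are parametric representations of the same surface $S$, that is

\[
S = \{ \boldsymbol{\Phi}(u, v) \mid (u, v) \in D \} = \{ \tilde{\boldsymbol{\Phi}}(u, v) \mid (u, v) \in \tilde{D} \},
\tag{8.5.121}
\]

i.e. $\boldsymbol{\Phi}(u, v)$ traverses $S$ as $(u, v)$ traverses the region $D$, and likewise $\tilde{\boldsymbol{\Phi}}(u, v)$ traverses the same surface $S$ as $(u, v)$ traverses the region $\tilde{D}$. Then

\[
\int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du \, dv
= \int_{\tilde{D}} \mathbf{F}(\tilde{\boldsymbol{\Phi}}(u, v)) \cdot \left[ \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial u}(u, v) \times \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial v}(u, v) \right] du \, dv.
\tag{8.5.122}
\]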

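Remark 8.5.2. Exactly as for the case of surface integrals of a scalar field (see Remark 8.4.2) a variety of notations are commonly encountered in the literature to denote the surface integral of a vector field $\mathbf{F}$ over some surface $S$, namely

\[
\int_S \mathbf{F}(x, y, z) \cdot d\mathbf{A}, \qquad \int_S \mathbf{F}(x, y, z) \cdot d\boldsymbol{\sigma}, \qquad \text{and} \qquad \int_S \mathbf{F}(x, y, z) \cdot d\mathbf{S},
\tag{8.5.123}
\]

as well as

\[
\int_S \mathbf{F} \cdot d\mathbf{A}, \qquad \int_S \mathbf{F} \cdot d\boldsymbol{\sigma}, \qquad \text{and} \qquad \int_S \mathbf{F} \cdot d\mathbf{S}.
\tag{8.5.124}
\]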

In all these notations the essential elements are the subscript $S$ of the integral, indicating the surface over which one integrates, and of course the integrand $\mathbf{F}$, which indicates the vector field being integrated. The notations at (8.5.123) are quite explicit, and remind us that we are integrating over $S$ with respect to an underlying space variable in $\mathbb{R}^3$ generically denoted by $(x, y, z)$. As is the case with surface integrals of scalar fields, it can often be rather tedious to keep carrying the space variable argument $(x, y, z)$, and this is the reason for introducing the notations at (8.5.124), in which the space variable is suppressed (but always understood to be present!). In this course we shall typically use the first of the notations at (8.5.124), so that we write

\[
\begin{aligned}
\int_S \mathbf{F} \cdot d\mathbf{A}
&:= \int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du \, dv \\
&= \int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \mathbf{N}(u, v) \, du \, dv,
\end{aligned}
\tag{8.5.125}
\]

for the integrals at (8.5.119) and (8.5.118). Observe that our notation on the left side of (8.5.125) for the surface integral of the vector field $\mathbf{F}$ over the surface $S$ completely suppresses all mention of the particular parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ of the surface $S$. This is exactly as things should be, for we know from Theorem 8.5.1 that the surface integral does not in fact depend on which particular parametric representation of the surface $S$ is used. Finally, we recall that the first of the two expressions on the right of (8.5.125) is usually the easiest to use in actual calculations of the surface integral.

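Remark 8.5.3. Just as in Remark 8.3.4 suppose that the surface $S$ has the special form of the graph of a function

\[
f : D \to \mathbb{R},
\tag{8.5.126}
\]

(as in Example 8.1.2 and Remark 8.1.6). In this case we can simplify the general formula for the integral of a vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ over the surface $S$ given by (8.5.125), that is

\[
\int_S \mathbf{F} \cdot d\mathbf{A} := \int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du \, dv.
\tag{8.5.127}
\]

From Remark 8.1.6 we know that the surface $S$ has the parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which

\[
\boldsymbol{\Phi}(u, v) = u\,\mathbf{i} + v\,\mathbf{j} + f(u, v)\,\mathbf{k}, \quad \text{for all } (u, v) \text{ in } D,
\tag{8.5.128}
\]

(c.f. (8.1.9)), and for $\boldsymbol{\Phi}$ given by (8.5.128) we have already seen that

\[
\frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
1 & 0 & \frac{\partial f}{\partial u}(u, v) \\
0 & 1 & \frac{\partial f}{\partial v}(u, v)
\end{vmatrix}
= -\frac{\partial f}{\partial u}(u, v)\,\mathbf{i} - \frac{\partial f}{\partial v}(u, v)\,\mathbf{j} + \mathbf{k},
\tag{8.5.129}
\]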

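(see (8.3.57)). Writing $\mathbf{F}$ in the scalar component form (3.2.17), that is

\[
\mathbf{F}(x, y, z) = F_1(x, y, z)\,\mathbf{i} + F_2(x, y, z)\,\mathbf{j} + F_3(x, y, z)\,\mathbf{k}, \quad \text{for all } (x, y, z) \text{ in } \mathbb{R}^3,
\tag{8.5.130}
\]

one sees from (8.5.130) and (8.5.128) that

\[
\mathbf{F}(\boldsymbol{\Phi}(u, v)) = F_1(u, v, f(u, v))\,\mathbf{i} + F_2(u, v, f(u, v))\,\mathbf{j} + F_3(u, v, f(u, v))\,\mathbf{k},
\tag{8.5.131}
\]

for all $(u, v)$ in $D$ (recall (8.5.126)). From (8.5.131) and (8.5.129) we get

\[
\begin{aligned}
\mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right]
= {} & \left[ F_1(u, v, f(u, v))\,\mathbf{i} + F_2(u, v, f(u, v))\,\mathbf{j} + F_3(u, v, f(u, v))\,\mathbf{k} \right] \\
& \cdot \left[ -\frac{\partial f}{\partial u}(u, v)\,\mathbf{i} - \frac{\partial f}{\partial v}(u, v)\,\mathbf{j} + \mathbf{k} \right].
\end{aligned}
\tag{8.5.132}
\]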

Multiplying out the right side of (8.5.132) and inserting into the integrand on the right of (8.5.127) we obtain

\[
\int_S \mathbf{F} \cdot d\mathbf{A} = \int_D \left[ F_3(u, v, f(u, v)) - F_1(u, v, f(u, v)) \frac{\partial f}{\partial u}(u, v) - F_2(u, v, f(u, v)) \frac{\partial f}{\partial v}(u, v) \right] du \, dv.
\tag{8.5.133}
\]


This gives the surface integral of $\mathbf{F}$ over $S$ directly in terms of the scalar components of $\mathbf{F}$ (see (8.5.130)) and the function $f$ whose graph defines the surface $S$. We emphasize that the formula (8.5.133) is applicable only when the surface $S$ is the graph of a function $f$, as at (8.5.126). When $S$ is not the graph of a function then we must resort to the more general expression (8.5.127) when we evaluate the surface integral of a vector field $\mathbf{F}$. In fact, I always prefer to use (8.5.127), even when $S$ is the graph of a function $f$, and completely avoid the use of (8.5.133).
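The agreement between the general formula (8.5.127) and the graph formula (8.5.133) can be illustrated on a small example. The sketch below uses the hypothetical sample function $f(u, v) = u^2 + v$ on $D = [0, 1] \times [0, 1]$ and the hypothetical sample field $\mathbf{F}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$. Neither comes from these notes. For this choice both integrands reduce to $-u^2$, so both estimates should be close to $-1/3$.

```python
def graph_height(u, v):
    """Hypothetical sample function f(u, v) = u^2 + v whose graph is the surface S."""
    return u * u + v

def field(x, y, z):
    """Hypothetical sample vector field F(x, y, z) = (x, y, z)."""
    return (x, y, z)

def flux_general(n=200):
    """General formula (8.5.127) with Phi(u, v) = (u, v, f(u, v)) over [0, 1] x [0, 1]."""
    du = dv = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            pu = (1.0, 0.0, 2.0 * u)   # dPhi/du = (1, 0, df/du)
            pv = (0.0, 1.0, 1.0)       # dPhi/dv = (0, 1, df/dv)
            normal = (pu[1] * pv[2] - pu[2] * pv[1],
                      pu[2] * pv[0] - pu[0] * pv[2],
                      pu[0] * pv[1] - pu[1] * pv[0])
            F = field(u, v, graph_height(u, v))
            total += sum(a * b for a, b in zip(F, normal)) * du * dv
    return total

def flux_graph(n=200):
    """Graph formula (8.5.133): integrand F3 - F1 * df/du - F2 * df/dv."""
    du = dv = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            F1, F2, F3 = field(u, v, graph_height(u, v))
            total += (F3 - F1 * 2.0 * u - F2 * 1.0) * du * dv
    return total

if __name__ == "__main__":
    # Both printed values should be close to -1/3.
    print(flux_general(), flux_graph())
```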

Remark 8.5.4. The surface integral

\[
\int_S \mathbf{F} \cdot d\mathbf{A}
\]

of a vector field $\mathbf{F}$ over a surface $S$ that we have constructed is called the flux of the vector field $\mathbf{F}$ through the surface $S$, and has the interpretation of the “aggregate flow” of the vector field $\mathbf{F}$ through the surface $S$. Is there a more precise “physical interpretation” of the flux? This depends on the physical meaning of the vector field $\mathbf{F}$. We have already noted that when $\mathbf{F}(x, y, z)$ is identified with the current density $\mathbf{J}(x, y, z)$ then the surface integral

\[
\int_S \mathbf{J} \cdot d\mathbf{A}
\]

(i.e. the flux of the current density $\mathbf{J}$ through the surface $S$) gives the total current passing through the surface $S$. Of course, this current should not depend in any way on the particular parametric representation we use for the surface $S$, and Theorem 8.5.1 guarantees this. Later, when we come to Maxwell’s equations, we shall need the surface integrals of the electric field $\mathbf{E}(x, y, z)$ and the magnetic field $\mathbf{B}(x, y, z)$ over a surface $S$, that is

\[
\int_S \mathbf{E} \cdot d\mathbf{A} \quad \text{and} \quad \int_S \mathbf{B} \cdot d\mathbf{A}.
\]

These quantities are known, respectively, as the electric flux of the electric field $\mathbf{E}$ and the magnetic flux of the magnetic field $\mathbf{B}$ through the surface $S$.

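Example 8.5.5. A surface $S$ in $\mathbb{R}^3$ has the parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ defined by

\[
D = \{ (u, v) \in \mathbb{R}^2 \mid 0 \le u \le 2 \text{ and } 0 \le v \le 3 \} = [0, 2] \times [0, 3],
\tag{8.5.134}
\]

and

\[
\boldsymbol{\Phi}(u, v) = u\,\mathbf{i} + u^2\,\mathbf{j} + v\,\mathbf{k}, \quad \text{for all } (u, v) \text{ in } D.
\tag{8.5.135}
\]

Current flows through the surface $S$, with a current density given by

\[
\mathbf{J}(x, y, z) = 3z^2\,\mathbf{i} + 6\,\mathbf{j} + 6xz\,\mathbf{k}, \quad \text{for all } (x, y, z) \text{ in } \mathbb{R}^3.
\tag{8.5.136}
\]

Determine the total current passing through the surface $S$.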

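From (8.5.125), but with $\mathbf{J}$ in place of $\mathbf{F}$, we get

\[
\int_S \mathbf{J} \cdot d\mathbf{A} := \int_D \mathbf{J}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du \, dv,
\tag{8.5.137}
\]

so evaluation of the current through $S$ is just a matter of evaluating the integral on the right side of (8.5.137). Note that we can write (8.5.135) in the form

\[
\boldsymbol{\Phi}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k},
\tag{8.5.138}
\]

for

\[
x(u, v) := u, \qquad y(u, v) := u^2, \qquad z(u, v) := v.
\tag{8.5.139}
\]

From (8.5.139), (8.5.138) and (8.5.136), we obtain

\[
\begin{aligned}
\mathbf{J}(\boldsymbol{\Phi}(u, v)) &= 3z^2(u, v)\,\mathbf{i} + 6\,\mathbf{j} + 6x(u, v)z(u, v)\,\mathbf{k} \\
&= 3v^2\,\mathbf{i} + 6\,\mathbf{j} + 6uv\,\mathbf{k}.
\end{aligned}
\tag{8.5.140}
\]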

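Taking $u$-partial derivatives and $v$-partial derivatives of the scalar components at (8.5.135) gives

\[
\begin{aligned}
\frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) &= 1\,\mathbf{i} + 2u\,\mathbf{j} + 0\,\mathbf{k}, \\
\frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) &= 0\,\mathbf{i} + 0\,\mathbf{j} + 1\,\mathbf{k},
\end{aligned}
\tag{8.5.141}
\]

so that from (8.5.141) we get

\[
\frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
1 & 2u & 0 \\
0 & 0 & 1
\end{vmatrix}
= 2u\,\mathbf{i} - \mathbf{j} + 0\,\mathbf{k}.
\tag{8.5.142}
\]

From (8.5.142) and (8.5.140),

\[
\begin{aligned}
\mathbf{J}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right]
&= (3v^2\,\mathbf{i} + 6\,\mathbf{j} + 6uv\,\mathbf{k}) \cdot (2u\,\mathbf{i} - \mathbf{j} + 0\,\mathbf{k}) \\
&= 6(uv^2 - 1).
\end{aligned}
\tag{8.5.143}
\]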


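From (8.5.143) and (8.5.137), the total current passing through surface $S$ is given by

\[
\begin{aligned}
\int_S \mathbf{J} \cdot d\mathbf{A}
&= 6 \int_D (uv^2 - 1) \, du \, dv \\
&= 6 \int_0^3 \left\{ \int_0^2 (uv^2 - 1) \, du \right\} dv \quad \text{(from (8.5.134) and Remark 2.1.6)} \\
&= 72.
\end{aligned}
\tag{8.5.144}
\]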
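The total current of 72 can be reproduced numerically from the raw ingredients of the example, namely the current density (8.5.136), the parametrization (8.5.135) and its partial derivatives (8.5.141). In the sketch below the function names and the grid size `n` are illustrative assumptions, and the midpoint rule gives only an approximation to the integral.

```python
def cross(a, b):
    """Vector cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Inner product of two 3-tuples."""
    return sum(p * q for p, q in zip(a, b))

def current_density(x, y, z):
    """Current density J from (8.5.136)."""
    return (3.0 * z * z, 6.0, 6.0 * x * z)

def total_current(n=300):
    """Midpoint-rule estimate of the flux of J through S over D = [0, 2] x [0, 3]."""
    du, dv = 2.0 / n, 3.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            x, y, z = u, u * u, v          # Phi(u, v) from (8.5.135)
            pu = (1.0, 2.0 * u, 0.0)       # dPhi/du from (8.5.141)
            pv = (0.0, 0.0, 1.0)           # dPhi/dv from (8.5.141)
            total += dot(current_density(x, y, z), cross(pu, pv)) * du * dv
    return total

if __name__ == "__main__":
    print(total_current())  # should be close to 72
```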

Example 8.5.6. In Example 6.2.3 we saw the following: If a single point charge $Q$ is located at the origin of $\mathbb{R}^3$ (recall Example 3.1.3), then the electric field at any point $(x, y, z)$ is given by

\[
\mathbf{E}(x, y, z) = \frac{Q}{4\pi\varepsilon_0 [x^2 + y^2 + z^2]^{3/2}} (x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}),
\tag{8.5.145}
\]

provided that $(x, y, z)$ is not at the origin of $\mathbb{R}^3$ (see (6.2.12)). Suppose that the surface $S$ is the top half of the sphere of radius $r$ centered at the origin of $\mathbb{R}^3$, exactly as in Example 8.3.5. Determine the electric flux through the surface $S$.

From Remark 8.5.4 we see that we must compute the surface integral

\[
\int_S \mathbf{E} \cdot d\mathbf{A}.
\tag{8.5.146}
\]

In Example 8.3.5 we saw that there are at least two parametric representations of the surface $S$, namely a “Cartesian representation”, in terms of the graph of the function $f(x, y)$ given by (8.3.60) defined for all $(x, y)$ in the disc $D$ given by (8.3.61), and a “polar representation”, given by $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which $D$ is the rectangle defined by (8.3.66) and $\boldsymbol{\Phi}$ is the function defined by (8.3.67). We know from Theorem 8.5.1 that we can use either of these parametric representations of $S$ to compute the surface integral (8.5.146), so the question arises which is the better (in the sense of involving least work) of the two representations to use? We saw in Example 8.3.5 that the amount of effort involved in computing the area depends dramatically on which parametric representation is used, so we can expect much the same thing when we compute (8.5.146). We know that the electric field from a point charge $Q$ at the origin of $\mathbb{R}^3$ is radially symmetric around the origin. Although this is not readily apparent from the Cartesian formulation of the electric field given by (8.5.145) it is immediate from the description of the field in Example 3.1.3. Since the parametric representation of the surface $S$ given by (8.3.66) and (8.3.67) is also in a sense radially symmetric let us try to calculate the surface integral using this representation of $S$, which we repeat for convenience as follows:

\[
D = \{ (\theta, \phi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi/2 \} = [0, 2\pi] \times [0, \pi/2],
\tag{8.5.147}
\]

and

\[
\boldsymbol{\Phi}(\theta, \phi) = x(\theta, \phi)\,\mathbf{i} + y(\theta, \phi)\,\mathbf{j} + z(\theta, \phi)\,\mathbf{k}, \quad \text{for all } (\theta, \phi) \text{ in } D,
\tag{8.5.148}
\]

for

\[
x(\theta, \phi) := r\sin(\phi)\cos(\theta), \qquad y(\theta, \phi) := r\sin(\phi)\sin(\theta), \qquad z(\theta, \phi) := r\cos(\phi),
\tag{8.5.149}
\]

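(c.f. (8.3.66) and (8.3.67)). From (8.5.125), but with $\mathbf{E}$ in place of $\mathbf{F}$ and replacing the generic parametric variables $(u, v)$ in (8.5.125) with the parametric variables $(\theta, \phi)$ of (8.5.147) - (8.5.149), we get

\[
\int_S \mathbf{E} \cdot d\mathbf{A} := \int_D \mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(\theta, \phi) \times \frac{\partial \boldsymbol{\Phi}}{\partial \phi}(\theta, \phi) \right] d\theta \, d\phi,
\tag{8.5.150}
\]

so that evaluation of the electric flux through $S$ reduces to evaluating the integral on the right side of (8.5.150). In Example 8.3.5 we have calculated

\[
\frac{\partial \boldsymbol{\Phi}}{\partial \theta}(\theta, \phi) \times \frac{\partial \boldsymbol{\Phi}}{\partial \phi}(\theta, \phi) = -r^2\sin(\phi) \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\},
\tag{8.5.151}
\]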

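(c.f. (8.3.72)). We next determine $\mathbf{E}(\boldsymbol{\Phi}(\theta, \phi))$. From (8.5.148) and (8.5.145) we have

\[
\begin{aligned}
\mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) &= \mathbf{E}(x(\theta, \phi), y(\theta, \phi), z(\theta, \phi)) \\
&= \frac{Q}{4\pi\varepsilon_0 [x^2(\theta, \phi) + y^2(\theta, \phi) + z^2(\theta, \phi)]^{3/2}} \left\{ x(\theta, \phi)\,\mathbf{i} + y(\theta, \phi)\,\mathbf{j} + z(\theta, \phi)\,\mathbf{k} \right\},
\end{aligned}
\tag{8.5.152}
\]

and from (8.5.149) we get

\[
\begin{aligned}
x^2(\theta, \phi) + y^2(\theta, \phi) + z^2(\theta, \phi)
&= r^2\sin^2(\phi)\cos^2(\theta) + r^2\sin^2(\phi)\sin^2(\theta) + r^2\cos^2(\phi) \\
&= r^2.
\end{aligned}
\tag{8.5.153}
\]

Now put (8.5.153) and (8.5.149) into (8.5.152) to get

\[
\mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) = \frac{Q}{4\pi\varepsilon_0 r^3} \left\{ r\sin(\phi)\cos(\theta)\,\mathbf{i} + r\sin(\phi)\sin(\theta)\,\mathbf{j} + r\cos(\phi)\,\mathbf{k} \right\},
\tag{8.5.154}
\]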


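and then, from (8.5.154) and (8.5.151) we find

\[
\begin{aligned}
\mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(\theta, \phi) \times \frac{\partial \boldsymbol{\Phi}}{\partial \phi}(\theta, \phi) \right]
= {} & \frac{Q}{4\pi\varepsilon_0 r^3}\,(r) \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\
& \cdot (-r^2\sin(\phi)) \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\
= {} & -\frac{Q\sin(\phi)}{4\pi\varepsilon_0} \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\
& \cdot \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\
= {} & -\frac{Q\sin(\phi)}{4\pi\varepsilon_0} \left\{ \sin^2(\phi)\cos^2(\theta) + \sin^2(\phi)\sin^2(\theta) + \cos^2(\phi) \right\} \\
= {} & -\frac{Q\sin(\phi)}{4\pi\varepsilon_0} \left\{ \sin^2(\phi) + \cos^2(\phi) \right\} \\
= {} & -\frac{Q\sin(\phi)}{4\pi\varepsilon_0}.
\end{aligned}
\tag{8.5.155}
\]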

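From (8.5.155) and (8.5.150),

\[
\int_S \mathbf{E} \cdot d\mathbf{A}
= -\frac{Q}{4\pi\varepsilon_0} \int_D \sin(\phi) \, d\theta \, d\phi
= -\frac{Q}{4\pi\varepsilon_0} \int_0^{2\pi} \left\{ \int_0^{\pi/2} \sin(\phi) \, d\phi \right\} d\theta
= -\frac{Q}{2\varepsilon_0},
\tag{8.5.156}
\]

where we have used the rectangular form of $D$ (see (8.5.147)) and the Fubini theorem (see Remark 2.1.6) at the second equality of (8.5.156).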
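The result at (8.5.156) can be checked numerically, including the fact that it does not depend on the radius $r$. The sketch below evaluates the field (8.5.145) on the hemisphere and forms the normal from the partial derivatives of the parametrization (8.5.149), taken in the order $\theta$ then $\phi$ as at (8.5.150). The negative sign of the result reflects that the normal at (8.5.151) points toward the origin. The numerical value of $\varepsilon_0$, the function names and the grid size `n` are assumptions introduced for this illustration.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m (rounded reference value)

def cross(a, b):
    """Vector cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def electric_field(x, y, z, Q):
    """Electric field (8.5.145) of a point charge Q at the origin."""
    c = Q / (4.0 * math.pi * EPS0 * (x * x + y * y + z * z) ** 1.5)
    return (c * x, c * y, c * z)

def hemisphere_flux(Q, r, n=200):
    """Midpoint-rule estimate of the integral (8.5.150) over D = [0, 2*pi] x [0, pi/2]."""
    d_theta = 2.0 * math.pi / n
    d_phi = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            sp, cp = math.sin(phi), math.cos(phi)
            st, ct = math.sin(theta), math.cos(theta)
            point = (r * sp * ct, r * sp * st, r * cp)     # Phi(theta, phi)
            p_theta = (-r * sp * st, r * sp * ct, 0.0)     # dPhi/dtheta
            p_phi = (r * cp * ct, r * cp * st, -r * sp)    # dPhi/dphi
            normal = cross(p_theta, p_phi)
            E = electric_field(*point, Q)
            total += sum(e * m for e, m in zip(E, normal)) * d_theta * d_phi
    return total

if __name__ == "__main__":
    Q = 1.0e-9
    for r in (1.0, 2.5):
        # Each pair of printed values should nearly agree.
        print(hemisphere_flux(Q, r), -Q / (2.0 * EPS0))
```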

Remark 8.5.7. In Chapter 5 we defined line integrals, and in the present chapter we have devoted considerable effort to the definition of surface integrals. Why all this effort on such seemingly strange integrals? In this remark we are going to see how these integrals are absolutely essential for stating one of the most fundamental laws of physics, namely Ampere’s circuital law. From elementary physics one is familiar with the qualitative phenomenon that a current $i$ passing through a conductor causes a magnetic field vector $\mathbf{B}(x, y, z)$ at all points $(x, y, z)$ in the space surrounding the conductor. The question arises: is there any quantitative (or mathematical) relationship between the current $i$ and the magnetic vector field $\mathbf{B}$ that it causes? Suppose that $i$ is a time-constant current flowing through a long and very thin metallic conductor, and fix any simple closed curve $\Gamma$ which “loops” just once around the conductor. Assign to $\Gamma$ a direction in accordance with the usual right hand rule, i.e. if the thumb of the right hand is aligned along the conductor in the direction of the current then $\Gamma$ is assigned the direction of the fingers, as shown in Figure 8.11.

Figure 8.11: Current $i$ and closed curve $\Gamma$ for Ampere’s law

It has been determined by experiment that under these conditions we always have

\[
\int_\Gamma \mathbf{B} \cdot d\mathbf{r} = \mu_0 i,
\tag{8.5.157}
\]

in which the quantity on the left of (8.5.157) is the usual line integral of the magnetic field $\mathbf{B}$ around the closed curve $\Gamma$, as defined in Section 5.1, and $\mu_0$ is a constant called the magnetic permeability of free space. This physical law is known as Ampere’s circuital law. Notice how this law is naturally stated in terms of line integrals, and notice also its great generality; the relation (8.5.157) holds for every imaginable closed curve $\Gamma$ looping around the current-carrying conductor. Despite its generality the circuital law in the form of (8.5.157) has the disadvantage that the cause of the magnetic field is assumed to be current through a conductor. From the point of view of electromagnetism it is actually much more useful to have a circuital law in which the cause of the magnetic field is not a current through a conductor but rather a current density arising from the diffuse or distributed movement of charge through space (recall Example 3.1.6). This is because, in contrast to electrical circuits where one always deals with currents passing through conductors, in electromagnetism it is much more natural to deal with charge which moves diffusely through space rather than in concentrated fashion along the narrow confines of a conductor. Suppose therefore that we have a diffuse movement of charge through space given by a time-constant current density vector field $\mathbf{J}$, as in Example 3.1.6, with domain $D = \mathbb{R}^3$ for simplicity. That is, at each $(x, y, z)$ in $\mathbb{R}^3$ the current density is $\mathbf{J}(x, y, z)$, and moreover the current density at each $(x, y, z)$ does not vary with time. It is a basic physical fact that this diffuse movement of charge causes a magnetic field $\mathbf{B}(x, y, z)$ at each point $(x, y, z)$ (much as a current flowing through a conductor causes a magnetic field) and we would like to quantify the relationship between the current density vector field $\mathbf{J}$ and the magnetic vector field $\mathbf{B}$ that it creates. To state this relationship fix some surface $S$ in $\mathbb{R}^3$ with boundary curve $\Gamma$, as shown in Figure 8.12.

Figure 8.12: Current density $\mathbf{J}$ and surface $S$ for Ampere’s law

We emphasize that $S$ is a purely theoretical surface that leaves the movement of charge completely unaffected, and is not in any sense a physical surface or barrier which impedes or disturbs the movement of charge described by the current density $\mathbf{J}$. It has been determined by experiment that under these conditions the vector fields $\mathbf{J}$ and $\mathbf{B}$ are always related by

\[
\int_\Gamma \mathbf{B} \cdot d\mathbf{r} = \mu_0 \int_S \mathbf{J} \cdot d\mathbf{A}.
\tag{8.5.158}
\]

Exactly as at (8.5.157) the quantity on the left of (8.5.158) is the line integral of the magnetic

field BBB around the closed curve Γ defined in Section 5.1. As for the quantity on the right side of

(8.5.158), this is of course the surface integral of the current density vector field over the surface

S that has been defined in this section. This physical law is also known as Ampere’s circuital law,

and it is an essential halfway-house in getting to Maxwell’s equations of electromagnetism, as we

shall see in later chapters. Notice how indispensable both line integrals and surface integrals are

in the statement of this basic physical law. Notice also the universality built into the statement at

(8.5.158), namely this relation between JJJ and BBB holds for every possible choice of the finite open

surface S with boundary Γ. Finally notice that, for a given surface S, the surface integral on the


right of (8.5.158) is nothing but the total current flowing through the surface S, as we have already

seen. In this sense there is consistency between the circuital laws at (8.5.158) and (8.5.157).

Remark 8.5.8. In this remark we are going to state another bedrock law of physics, namely

Faraday’s law of electromagnetic induction. Exactly as with Ampere’s circuital law of Remark

8.5.7 we shall see that surface integrals and line integrals are completely indispensable for the very

formulation and meaning of Faraday’s law. Suppose that BBB is a time varying magnetic field, that is

at each point (x, y, z) in R3 the magnetic field vector is BBB(t, x, y, z) for each instant t, and generally

changes as t changes. One could for example obtain such a time varying magnetic field by moving

a permanent magnet through space. We briefly noted in Remark 3.2.4 the possibility of vector

fields which can vary not just through space but also with time, but until now we have been concerned

with fields that vary only through space and do not depend on time. Here we absolutely must deal

with fields which change not just through space but also with time, for we are going to see that

the essential element in Faraday’s law is the time-changing magnetic field BBB. It turns out that

this poses no serious difficulties; the mathematical tools we have developed for fields which depend

only on space are easily extended and adapted to fields which depend on time as well as space.

Figure 8.13: Magnetic field BBB and electric field EEE for Faraday’s law

In essence Faraday’s law of electromagnetic induction states that a time varying magnetic field BBB

causes a time varying electric field EEE, that is at each point (x, y, z) in R3 we get an electric field

EEE(t, x, y, z) which also varies with time t. Naturally we would like a quantitative or mathematical

relation between the time varying fields BBB and EEE, and this we state next. Fix some surface S in


R3 with boundary curve Γ, as shown in Figure 8.13, and define the flux of the magnetic field BBB (or

magnetic flux) through the surface S at each instant t by

$$\Phi_{\mathrm{mag}}(t) := \int_S \mathbf{B} \cdot d\mathbf{A}, \tag{8.5.159}$$

(recall Remark 8.5.4). Before going any further we should note a seeming oddity of the notation

at (8.5.159), namely the left side indicates a clear dependence on the time t but there is no corresponding

mention of t on the right side of (8.5.159). This is happening because on the right side we

are suppressing all variables in the notation for the surface integral. In fact, for each t the quantity

on the right is understood to mean the surface integral over S of the vector field

(8.5.160) FFF (x, y, z) := BBB(t, x, y, z), for all (x, y, z) in R3,

obtained from BBB by keeping t fixed. The t-dependence implicit in the right side of (8.5.159) can be

made quite explicit if we fix some parametric representation ΦΦΦ : D → R3 of the surface S (recall

Definition 8.1.4). Then the surface integral of any vector field FFF : R3 → R3 over S is given in terms

of the parametric representation by

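$$\int_S \mathbf{F} \cdot d\mathbf{A} = \int_D \mathbf{F}(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv, \tag{8.5.161}$$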

(recall (8.5.125)). For each t the quantity on the right side of (8.5.159) is then given by (8.5.161)

with FFF defined by (8.5.160), that is

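$$\int_S \mathbf{B} \cdot d\mathbf{A} = \int_D \mathbf{B}(t, \boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv, \tag{8.5.162}$$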

for each t. The t-dependence of the quantity on the right side of (8.5.159) is now clearly apparent

from the right side of (8.5.162). Having cleared up the interpretation of (8.5.159) we can state

Faraday’s law of electromagnetic induction in full as follows: the electric field EEE caused by the time

varying magnetic field BBB always satisfies the relation

$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = -\frac{d\Phi_{\mathrm{mag}}(t)}{dt}, \tag{8.5.163}$$

for all t. As with Ampere’s circuital law (8.5.158) one should note the universality incorporated

in (8.5.163) (partnered with the definition (8.5.159)), namely this relation holds regardless of how

one chooses the surface S with boundary Γ. Another point to notice is that we have a notational

peculiarity at (8.5.163) not unlike that which we saw at (8.5.159), that is the right side of (8.5.163)


clearly depends on time t but no such dependence on t is explicitly indicated in the line integral

on the left side of (8.5.163). Again this t-dependence is certainly present but “hidden” because all

variables are suppressed in the notation for the line integral on the left of (8.5.163). To make this

t-dependence explicit fix some parametric representation

(8.5.164) γγγ : [a, b]→ R3

of the closed curve Γ (see Definition 4.2.2). From (5.1.13) we recall that the line integral of any

vector field FFF : R3 → R3 along Γ is always given by

$$\int_\Gamma \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\boldsymbol{\gamma}(u)) \cdot \boldsymbol{\gamma}^{(1)}(u)\,du. \tag{8.5.165}$$

For each t the quantity on the left of (8.5.163) is understood to mean the line integral along Γ of

the vector field

(8.5.166) FFF (x, y, z) := EEE(t, x, y, z), for all (x, y, z) in R3,

obtained from EEE by keeping t fixed (much as was the case at (8.5.160), which we used to unravel the

hidden t-dependence on the right side of (8.5.159)). From (8.5.165), with FFF defined by (8.5.166) at

each t, we see that the left side of (8.5.163) is really given by

$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = \int_a^b \mathbf{E}(t, \boldsymbol{\gamma}(u)) \cdot \boldsymbol{\gamma}^{(1)}(u)\,du, \tag{8.5.167}$$

for each and every t. The hidden t-dependence on the left side of (8.5.163) is now clearly displayed

on the right side of (8.5.167). Usually, (8.5.163) and (8.5.159) are combined to yield Faraday’s law

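of electromagnetic induction in the following form:

$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = -\frac{d}{dt} \int_S \mathbf{B} \cdot d\mathbf{A} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{A}, \tag{8.5.168}$$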

which displays the relation between the given time varying magnetic field BBB and the resulting time

varying electric field EEE.
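The hidden t-dependence in (8.5.162) and (8.5.167) can be made concrete numerically. The following is a minimal sketch and not part of the original notes: it assumes a spatially uniform field BBB = B0 sin(ωt) kkk, takes S to be a flat disc of radius R in the x-y plane, and assumes the circulating field EEE = -(B0 ω cos(ωt)/2)(-y iii + x jjj), which is one electric field consistent with Faraday's law for this BBB. The constants `B0`, `OMEGA`, `R` and all function names are illustrative choices.

```python
import math

B0, OMEGA, R = 0.5, 2.0, 1.5   # illustrative constants, not taken from the notes

def B(t, x, y, z):
    """Assumed uniform, time varying magnetic field along k."""
    return (0.0, 0.0, B0 * math.sin(OMEGA * t))

def E(t, x, y, z):
    """Assumed circulating electric field matching this B."""
    c = -0.5 * B0 * OMEGA * math.cos(OMEGA * t)
    return (-c * y, c * x, 0.0)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def flux(t, n=200):
    """Midpoint-rule version of (8.5.162) for Phi(r, th) = (r cos th, r sin th, 0)."""
    dr, dth = R / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            point = (r * math.cos(th), r * math.sin(th), 0.0)
            d_r = (math.cos(th), math.sin(th), 0.0)             # dPhi/dr
            d_th = (-r * math.sin(th), r * math.cos(th), 0.0)   # dPhi/dth
            total += dot(B(t, *point), cross(d_r, d_th)) * dr * dth
    return total

def circulation(t, n=2000):
    """Midpoint-rule version of (8.5.167) for gamma(u) = (R cos u, R sin u, 0)."""
    du = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * du
        point = (R * math.cos(u), R * math.sin(u), 0.0)
        velocity = (-R * math.sin(u), R * math.cos(u), 0.0)     # gamma^(1)(u)
        total += dot(E(t, *point), velocity) * du
    return total

t, h = 0.3, 1e-4
dphi_dt = (flux(t + h) - flux(t - h)) / (2 * h)   # central difference in t
lhs = circulation(t)                              # left side of (8.5.163)
print(lhs, -dphi_dt)                              # the two values should agree closely
```

Both `flux` and `circulation` take t as an explicit argument, which is exactly the dependence that the compact notation in (8.5.159) and (8.5.163) suppresses.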


Chapter 9

Vector Calculus

In this chapter our goal is to introduce several distinct ways of taking the “derivative” of given

vector and scalar fields, in the course of which we shall define the so-called divergence, curl and

Laplacian differential operators. We then establish two fundamental theorems of vector calculus,

namely the theorem of Stokes and the theorem of Gauss-Ostrogradskii, which are stated in terms of

these differential operators, as well as the surface integrals studied in Chapter 8, the line integrals

studied in Chapter 5, and the volume integrals studied in Chapter 2. As we shall see in Chapter

10 the theorems of Gauss-Ostrogradskii and Stokes are essential tools for understanding Maxwell’s

equations.

9.1 Differential Operators of Vector Calculus: Divergence, Curl, Laplacian

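In Section 6.1 we defined the gradient of a scalar field f to be the vector field grad f given by

$$(\operatorname{grad} f)(x,y,z) := \frac{\partial f}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x,y,z)\,\mathbf{k}, \tag{9.1.1}$$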

(see Definition 6.1.1). As we noted in Remark 6.1.3 the symbol grad (or ∇) denotes a differential

operator which “operates on” a given scalar field f to generate a vector field gradf (or ∇f) by

a process of partial differentiation. In this section our goal is to define two further differential

operators, namely the divergence and the curl which, rather like grad, operate on a given field

by partial differentiation to produce another field. In this case, however, the divergence and curl

operate on a vector field, in contrast to grad which always operates on a scalar field.


Definition 9.1.1. Suppose that FFF : D → R3 is a C1-vector field in R3 with domain D ⊂ R3 (recall

Definition 3.2.1 and Remark 3.2.2) given by

(9.1.2) FFF (x, y, z) := F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk, for all (x, y, z) in D.

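Then the divergence of the vector field FFF is the scalar field div FFF on the same domain D defined by

$$(\operatorname{div} \mathbf{F})(x,y,z) := \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z), \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.3}$$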

Remark 9.1.2. Notice that the right side of (9.1.3) is a scalar quantity for each (x, y, z) in the

domain D of the given vector field FFF , and is therefore indeed a scalar field with the same domain

as the vector field FFF . In short, div denotes an operator which “operates” on a given vector field

FFF to produce a scalar field div FFF having the same domain as FFF . Much like the operator grad the

operator div involves partial differentiation, and hence is a “differential operator”, but it is clearly

a very different differential operator from grad.

Remark 9.1.3. We can use the symbolic vector ∇ defined by (6.1.3), that is

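$$\nabla := \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k}, \tag{9.1.4}$$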

to give an alternative and frequently used notation for the divergence. If we form a “symbolic”

inner product of the vectors on the right sides of (9.1.4) and (9.1.2), pretending that the partial

derivatives in (9.1.4) are actual numbers which we can “multiply” into the scalar components F1,

F2 and F3 of FFF , then we get (at least formally)

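$$\begin{aligned}
\nabla \cdot \mathbf{F}(x,y,z) &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( F_1(x,y,z)\,\mathbf{i} + F_2(x,y,z)\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k} \right) \\
&= \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z).
\end{aligned} \tag{9.1.5}$$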

Comparing (9.1.5) and (9.1.3) we see that div FFF and ∇ · FFF are identical, that is ∇ · FFF is an alternative

notation for the divergence div FFF, in much the same way that ∇f is alternative notation for the

gradient grad f of a scalar field f (see Definition 6.1.1). Do note that the “dot” in ∇ · FFF is essential

since it indicates the formal inner product which gives (9.1.5); the notation ∇FFF (i.e. without the

“dot”) makes no sense at all. In general the notation div FFF is common in the older and more

classical books, whereas the notation ∇ ·FFF is preferred in more modern books. We shall typically

use the notation ∇ ·FFF in these notes.
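The definition (9.1.3) is easy to sanity-check on a computer. The following is a minimal sketch, not part of the original notes, which approximates each partial derivative in (9.1.3) by a central difference. The helper names `partial` and `divergence`, the step size `h` and the test field are all illustrative assumptions.

```python
def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)

def divergence(F, p, h=1e-5):
    """Approximate (9.1.3); F maps (x, y, z) to the tuple (F1, F2, F3)."""
    return sum(partial(lambda x, y, z, i=i: F(x, y, z)[i], p, i, h)
               for i in range(3))

def F(x, y, z):
    """Test field F = xy i + yz j + zx k, whose divergence is y + z + x."""
    return (x * y, y * z, z * x)

value = divergence(F, (1.0, 2.0, 3.0))
print(value)   # analytically y + z + x = 6 at the point (1, 2, 3)
```

Note that the result is a single number at each point, in line with the fact that div FFF is a scalar field.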


Remark 9.1.4. What is the physical significance of the divergence ∇ · FFF of a vector field FFF?

This is by no means immediately obvious from the defining formula (9.1.3). We shall see from the

Divergence Theorem of Gauss-Ostrogradskii, to be established later in this chapter, that the scalar

value ∇·FFF (x, y, z) at a point (x, y, z) in R3 effectively measures the local “divergence” (or “flowing

away”) of the vector field FFF from the point (x, y, z). This local divergence is an important property

for electric and magnetic fields. In fact, we shall see in the next chapter that two of Maxwell’s

equations (originating from the Gauss laws for electric fields and magnetic fields) exactly describe

the divergence div EEE (or ∇·EEE) of the electric field EEE and the divergence div BBB (or ∇·BBB) of the magnetic

field BBB, and that these divergence properties are essential for Maxwell’s theory of electromagnetic

waves.

Remark 9.1.5. A vector field FFF : D → R3 with domain D ⊂ R3 is called solenoidal (or incompressible)

when div FFF (x, y, z) = 0 for all (x, y, z) in D. We shall see later that the magnetic field BBB

is always solenoidal.

We next define another rather fancy differential operator, namely the curl of a vector field, which

is just as important as the divergence:

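Definition 9.1.6. Suppose that FFF : D → R3 is a C1-vector field in R3 with domain D ⊂ R3 (recall

Definition 3.2.1 and Remark 3.2.2) given by

$$\mathbf{F}(x,y,z) := F_1(x,y,z)\,\mathbf{i} + F_2(x,y,z)\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.6}$$

Then the curl of the vector field FFF is the vector field curl FFF on the same domain D defined by

$$\begin{aligned}
(\operatorname{curl} \mathbf{F})(x,y,z) :={}& \left[ \frac{\partial F_3}{\partial y}(x,y,z) - \frac{\partial F_2}{\partial z}(x,y,z) \right] \mathbf{i} + \left[ \frac{\partial F_1}{\partial z}(x,y,z) - \frac{\partial F_3}{\partial x}(x,y,z) \right] \mathbf{j} \\
&+ \left[ \frac{\partial F_2}{\partial x}(x,y,z) - \frac{\partial F_1}{\partial y}(x,y,z) \right] \mathbf{k},
\end{aligned} \tag{9.1.7}$$

for all (x, y, z) in D.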


Remark 9.1.7. Notice that the right side of (9.1.7) is a vector for each (x, y, z) in the domain D of

the given vector field FFF , and is therefore indeed a vector field with the same domain as the vector

field FFF . That is curl denotes an operator which “operates” on a given vector field FFF to produce

another vector field curl FFF having the same domain as FFF . Much like the operators grad and div,

the operator curl again involves partial differentiation, and hence is a “differential operator”, but

it is clearly a very different differential operator from both grad and div. In particular, we know


from Definition 6.1.1 that grad converts a given scalar field f into a vector field gradf (or ∇f),

while from Definition 9.1.1 we see that div converts a given vector field FFF into a scalar field div FFF

(or ∇ · FFF ), and from Definition 9.1.6 we see that curl converts a given vector field FFF into another

vector field curl FFF . In view of these comments one can reasonably ask if there are any useful

differential operators which convert a given scalar field into another scalar field. There is indeed

such an operator, called the Laplacian operator, which we define later.

Remark 9.1.8. The definition of curl at (9.1.7) looks like a confusing jumble of symbols which

seems not only to have no intuitive or physical significance but appears to be almost impossible to

remember. We shall see later that the curl of a vector field in fact has definite physical significance.

Here we use the symbolic vector at (9.1.4) to get an alternative notation for curl FFF , in much the

same way that we obtained ∇·FFF as alternative notation for div FFF in Remark 9.1.3. This alternative

notation also makes it easy to remember the seemingly strange definition at (9.1.7). Much as in

Remark 9.1.3 we again form a “symbolic” product of the vectors on the right sides of (9.1.4) and

(9.1.2), but now we formally calculate the vector cross product of the right sides of (9.1.4) and

(9.1.2), rather than the inner product as we did at (9.1.5):

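$$\begin{aligned}
\nabla \times \mathbf{F}(x,y,z) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] F_1(x,y,z) & F_2(x,y,z) & F_3(x,y,z) \end{vmatrix} \\
&= \left[ \frac{\partial F_3}{\partial y}(x,y,z) - \frac{\partial F_2}{\partial z}(x,y,z) \right] \mathbf{i} + \left[ \frac{\partial F_1}{\partial z}(x,y,z) - \frac{\partial F_3}{\partial x}(x,y,z) \right] \mathbf{j} \\
&\quad + \left[ \frac{\partial F_2}{\partial x}(x,y,z) - \frac{\partial F_1}{\partial y}(x,y,z) \right] \mathbf{k}.
\end{aligned} \tag{9.1.8}$$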

The second equality at (9.1.8) follows from formal calculation of the determinant, pretending as

usual that the partial derivatives occurring in the determinant at (9.1.8) are actual numbers which

we can “multiply” into the scalar components F1, F2 and F3 of FFF . From (9.1.8) and (9.1.7) we get

(9.1.9) ∇×FFF (x, y, z) = curl FFF (x, y, z) for all (x, y, z) in D.

It is clear that (9.1.8) gives a useful mnemonic for remembering the definition of curl FFF , and also

gives the alternative notation ∇ × FFF for curl FFF . In general the notation curl FFF is common in the

older and more classical books, whereas the notation ∇×FFF is preferred in more modern books. We

shall typically use the notation ∇×FFF in these notes.
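The component formula (9.1.7) translates directly into a short numerical routine. The following is a minimal sketch under stated assumptions, not part of the original notes: partial derivatives are replaced by central differences, and the names `partial` and `curl`, the step size `h` and the test field are illustrative.

```python
def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)

def curl(F, p, h=1e-5):
    """Approximate (9.1.7) at p; F maps (x, y, z) to the tuple (F1, F2, F3)."""
    def d(component, variable):
        return partial(lambda x, y, z: F(x, y, z)[component], p, variable, h)
    return (d(2, 1) - d(1, 2),   # dF3/dy - dF2/dz  (i component)
            d(0, 2) - d(2, 0),   # dF1/dz - dF3/dx  (j component)
            d(1, 0) - d(0, 1))   # dF2/dx - dF1/dy  (k component)

def F(x, y, z):
    """Test field F = yz i + x z^2 j + xy k."""
    return (y * z, x * z**2, x * y)

cx, cy, cz = curl(F, (1.0, 2.0, 3.0))
print(cx, cy, cz)   # analytically (x - 2xz, 0, z^2 - z) = (-5, 0, 6) at (1, 2, 3)
```

The index pattern in `curl` follows the cyclic structure of the determinant mnemonic: each component pairs the two remaining variables with the two remaining field components.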


We can now restate Theorem 6.2.7 in terms of the curl operator defined at (9.1.8):

Theorem 9.1.9. Suppose that FFF : D → R3 is a C1-vector field with domain D = R3 (recall

Definition 3.2.1 and Remark 3.2.2), and put

FFF (x, y, z) = F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk,

for all (x, y, z) in R3 i.e. F1(x, y, z), F2(x, y, z), and F3(x, y, z) are the real scalar components of

the vector FFF (x, y, z) in R3. Then the following are equivalent:


(b)∫


(c) (∇×FFF )(x, y, z) = 0 for all (x, y, z) in R3.

Remark 9.1.10. What “physical significance” does the curl operator have? Here we briefly indicate

one instance, from fluid mechanics, in which the curl has a very definite physical interpretation.

Suppose that the vector field VVV is the velocity flow field of a fluid moving through space (e.g. a

current of water), that is the vector VVV (x, y, z) gives the velocity of a moving fluid at each point

(x, y, z) in R3. Attached to an axis, which rotates in a bearing that you hold, are very small

paddles. The bearing is fixed at a point (x, y, z) in the moving fluid with the axis of spin aligned

along some fixed unit vector nnn (see Figure 9.1). For reasons which will become clear we call this

device a “curl meter”.

Figure 9.1: Curl meter

There is viscous friction in the bearing, so it can be shown from elementary

physics that the angular speed (not angular acceleration!) of the curl meter is directly proportional

to the torque around the axis of the curl meter arising from the impact of the moving fluid on the

paddles (the constant of proportionality depends on the geometry of the paddles, the coefficient of

viscous friction of the fluid and several other factors). Using the principles of fluid mechanics one

can establish that

(9.1.10) angular speed of curl meter = κ|curl VVV (x, y, z) · nnn|,

where κ is another constant of proportionality. From (9.1.10) we see that there is a very direct

relation between curl VVV and the angular speed of the curl meter. Observe from (9.1.10) that when

the axis of spin of the curl meter (determined by the unit vector nnn) is collinear with curl VVV (x, y, z)

then the curl meter rotates fastest, whereas when the axis of spin is orthogonal to curl VVV (x, y, z)

then the curl meter does not rotate at all. In this way we see that curl VVV tells us the “local spin”

in the fluid velocity field VVV at any point (x, y, z) around an axis aligned with any unit vector nnn.
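The dependence of (9.1.10) on the direction nnn can be illustrated with a rigidly rotating fluid. The following is a minimal sketch, not part of the original notes: it assumes the velocity field VVV = -Ωy iii + Ωx jjj (rotation about the z-axis at rate Ω), for which curl VVV = 2Ω kkk, and sets the constant κ to 1 purely as a placeholder. The names `OMEGA`, `KAPPA` and `meter_speed` are illustrative.

```python
import math

def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)

def curl(F, p, h=1e-5):
    """Approximate (9.1.7) at p; F maps (x, y, z) to the tuple (F1, F2, F3)."""
    def d(component, variable):
        return partial(lambda x, y, z: F(x, y, z)[component], p, variable, h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

OMEGA = 1.5   # assumed rotation rate of the fluid
KAPPA = 1.0   # placeholder for the constant of proportionality in (9.1.10)

def V(x, y, z):
    """Velocity field of a fluid rotating rigidly about the z-axis."""
    return (-OMEGA * y, OMEGA * x, 0.0)

def meter_speed(point, n):
    """Right side of (9.1.10) for a unit vector n."""
    c = curl(V, point)
    return KAPPA * abs(sum(ci * ni for ci, ni in zip(c, n)))

point = (0.3, -0.8, 2.0)
aligned = meter_speed(point, (0.0, 0.0, 1.0))      # axis collinear with curl V
orthogonal = meter_speed(point, (1.0, 0.0, 0.0))   # axis orthogonal to curl V
tilted = meter_speed(point, (0.0, math.sin(math.pi / 3), math.cos(math.pi / 3)))
print(aligned, orthogonal, tilted)
```

With these assumed values the aligned reading is 2Ω = 3, the orthogonal reading vanishes, and tilting the axis by 60 degrees halves the aligned reading, as (9.1.10) predicts.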

Remark 9.1.11. We shall see from the Theorem of Stokes, to be established later in this chapter,

that the intuitive picture of curl established in Remark 9.1.10 for the velocity field VVV of a moving

fluid extends to a more general setting, namely the vector curl FFF (x, y, z) (or ∇×FFF (x, y, z)) at a point

(x, y, z) in R3 effectively measures the local “curling” (or “turning” or “rotation” or “vorticity”)

of a general vector field FFF at the point (x, y, z) (this will be discussed in Remark 9.2.7). In fact, in

older textbooks curl FFF was sometimes denoted by “vort FFF” or “rot FFF” (“vort” for “vorticity”, “rot”

for “rotation”) but these semi-comical notations were soon discarded in favor of the currently used

curl FFF and ∇ × FFF . This local rotation, like the divergence discussed in Remark 9.1.4, is also an

important property of electric and magnetic fields. We shall see in the next chapter that Faraday’s

law of electromagnetic induction (which has been previewed in Remark 8.5.8) actually describes

the local rotation curl EEE (or ∇ × EEE) of the electric field, while Ampere’s magnetic circuital law

(previewed in Remark 8.5.7) describes the local rotation curl BBB (or ∇×BBB) of the magnetic field.

Remark 9.1.12. A vector field FFF : D → R3 with domain D ⊂ R3 is called irrotational when its

curl is identically zero on the domain D, that is ∇ × FFF (x, y, z) = 0 for all (x, y, z) in D. From

Theorem 9.1.9 one sees, in particular, that

(9.1.11) FFF : R3 → R3 is conservative if and only if FFF is irrotational.

The next result shows that the gradient of a C2-scalar field is always irrotational:

Theorem 9.1.13. Suppose that f : D → R is a C2-scalar field with domain D ⊂ R3. Then the

vector field ∇f (see Definition 6.1.1) is irrotational, that is

(9.1.12) (∇× (∇f))(x, y, z) = 0 for all (x, y, z) in D.


Proof: From Definition 6.1.1 we have

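$$\nabla f(x,y,z) = \frac{\partial f}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x,y,z)\,\mathbf{k} \tag{9.1.13}$$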

for all (x, y, z) in D. Now take FFF := ∇f so that, from (9.1.13), we have

$$F_1(x,y,z) = \frac{\partial f}{\partial x}(x,y,z), \quad F_2(x,y,z) = \frac{\partial f}{\partial y}(x,y,z), \quad F_3(x,y,z) = \frac{\partial f}{\partial z}(x,y,z), \tag{9.1.14}$$

for all (x, y, z) in D. From (9.1.8) and (9.1.14) we get

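$$\begin{aligned}
\nabla \times (\nabla f)(x,y,z) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] \dfrac{\partial f}{\partial x}(x,y,z) & \dfrac{\partial f}{\partial y}(x,y,z) & \dfrac{\partial f}{\partial z}(x,y,z) \end{vmatrix} \\
&= \left[ \frac{\partial^2 f}{\partial y \partial z}(x,y,z) - \frac{\partial^2 f}{\partial z \partial y}(x,y,z) \right] \mathbf{i} - \left[ \frac{\partial^2 f}{\partial x \partial z}(x,y,z) - \frac{\partial^2 f}{\partial z \partial x}(x,y,z) \right] \mathbf{j} \\
&\quad + \left[ \frac{\partial^2 f}{\partial x \partial y}(x,y,z) - \frac{\partial^2 f}{\partial y \partial x}(x,y,z) \right] \mathbf{k},
\end{aligned} \tag{9.1.15}$$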

for all (x, y, z) in D, where the second equality at (9.1.15) follows from formally expanding the

determinant. Since f is a C2-function we know that the mixed partial derivatives are equal, e.g. for

the mixed y − z partial derivatives we have

$$\frac{\partial^2 f}{\partial y \partial z}(x,y,z) = \frac{\partial^2 f}{\partial z \partial y}(x,y,z), \quad \text{for all } (x,y,z) \text{ in } D \tag{9.1.16}$$

and similarly for the mixed x− z and x− y partial derivatives. From (9.1.16) etc. and (9.1.15) we

obtain ∇× (∇f)(x, y, z) = 0 for all (x, y, z) in D as required.

The next result is quite similar to Theorem 9.1.13 but shows that the curl of a vector field is

always solenoidal. The proof is omitted since it is very similar to the proof of Theorem 9.1.13:

Theorem 9.1.14. Suppose that GGG : D → R3 is a C2-vector field with domain D ⊂ R3. Then the

vector field ∇×GGG (see Definition 9.1.6 and (9.1.9)) is solenoidal (see Remark 9.1.5), that is

(9.1.17) (∇ · (∇×GGG))(x, y, z) = 0 for all (x, y, z) in D.
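Both (9.1.12) and (9.1.17) can be checked numerically. The following is a minimal sketch, not part of the original notes, in which a vector field is represented as a tuple of three scalar functions so that the operators can be composed. The step size `H`, the helper names and the test fields `f` and `G` are illustrative assumptions. Central differences in different variables commute, just as the mixed partial derivatives in (9.1.16) do, so the computed values differ from zero only by rounding error.

```python
import math

H = 1e-3   # finite-difference step (illustrative choice)

def partial(f, i, h=H):
    """Return a function approximating the i-th partial derivative of f."""
    def df(x, y, z):
        hi, lo = [x, y, z], [x, y, z]
        hi[i] += h
        lo[i] -= h
        return (f(*hi) - f(*lo)) / (2 * h)
    return df

def grad(f):
    """Gradient of a scalar function, as a tuple of three scalar functions."""
    return tuple(partial(f, i) for i in range(3))

def curl(F):
    """Curl of F = (F1, F2, F3), following the component pattern of (9.1.7)."""
    F1, F2, F3 = F
    d = partial
    return (lambda x, y, z: d(F3, 1)(x, y, z) - d(F2, 2)(x, y, z),
            lambda x, y, z: d(F1, 2)(x, y, z) - d(F3, 0)(x, y, z),
            lambda x, y, z: d(F2, 0)(x, y, z) - d(F1, 1)(x, y, z))

def div(F):
    """Divergence of F = (F1, F2, F3), following (9.1.3)."""
    F1, F2, F3 = F
    return lambda x, y, z: (partial(F1, 0)(x, y, z) + partial(F2, 1)(x, y, z)
                            + partial(F3, 2)(x, y, z))

def f(x, y, z):
    return math.sin(x * y) + x * z**3

G = (lambda x, y, z: x**3 * y,
     lambda x, y, z: math.cos(y * z),
     lambda x, y, z: x * y * z)

p = (0.7, -1.2, 0.4)
curl_grad = [component(*p) for component in curl(grad(f))]   # check of (9.1.12)
div_curl = div(curl(G))(*p)                                   # check of (9.1.17)
print(curl_grad, div_curl)
```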


Example 9.1.15. Determine the divergence and curl of the vector field defined by

(9.1.18) FFF (x, y, z) := xiii+ yjjj + zkkk for all (x, y, z) in D := R3.

We compute the divergence first. From (9.1.18) and (9.1.6) we have

(9.1.19) F1(x, y, z) = x, F2(x, y, z) = y, F3(x, y, z) = z, for all (x, y, z) in R3.

From (9.1.19) and (9.1.5)

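$$\nabla \cdot \mathbf{F}(x,y,z) = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.1.20}$$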

For the curl, from (9.1.19) and (9.1.7) we get

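$$(\operatorname{curl} \mathbf{F})(x,y,z) = \left[ \frac{\partial z}{\partial y} - \frac{\partial y}{\partial z} \right] \mathbf{i} + \left[ \frac{\partial x}{\partial z} - \frac{\partial z}{\partial x} \right] \mathbf{j} + \left[ \frac{\partial y}{\partial x} - \frac{\partial x}{\partial y} \right] \mathbf{k} = \mathbf{0}, \tag{9.1.21}$$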

for all (x, y, z) in R3. It follows that FFF is irrotational (see Remark 9.1.12).

Example 9.1.16. Show that the vector field defined by

(9.1.22) FFF (x, y, z) := yiii− xjjj + 0kkk for all (x, y, z) in D := R3,

cannot be a conservative vector field.

In view of Remark 9.1.12 it is enough to prove that the vector field FFF is not irrotational. We

therefore calculate the curl of FFF . From (9.1.22) we have

(9.1.23) F1(x, y, z) = y, F2(x, y, z) = −x, F3(x, y, z) = 0, for all (x, y, z) in R3,

(compare (9.1.6)) and inserting (9.1.23) into (9.1.8) we obtain

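$$\nabla \times \mathbf{F}(x,y,z) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] y & -x & 0 \end{vmatrix} = -2\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.1.24}$$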

We therefore do not have ∇ × FFF (x, y, z) = 0 for all (x, y, z) ∈ R3. It follows that FFF cannot be

irrotational, and then, from Remark 9.1.12, it cannot be conservative.

Example 9.1.17. Show that the vector field GGG defined by

$$\mathbf{G}(x,y,z) := x^3 y\,\mathbf{i} + z\,\mathbf{j} + xz\,\mathbf{k} \quad \text{for all } (x,y,z) \text{ in } D := \mathbb{R}^3, \tag{9.1.25}$$


cannot be the curl of another vector field.

This is similar to Example 9.1.16 except that now we use Theorem 9.1.14. Suppose in fact that

GGG is the curl of another vector field FFF : D → R3, that is

(9.1.26) GGG(x, y, z) = ∇×FFF (x, y, z) for all (x, y, z) in D := R3.

Then

(9.1.27) ∇ ·GGG(x, y, z) = ∇ · (∇×FFF )(x, y, z) = 0 for all (x, y, z) in D := R3,

in which the first equality follows from (9.1.26) and the second from Theorem 9.1.14. However,

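from (9.1.25) and (9.1.5)

$$\begin{aligned}
\nabla \cdot \mathbf{G}(x,y,z) &= \frac{\partial G_1}{\partial x}(x,y,z) + \frac{\partial G_2}{\partial y}(x,y,z) + \frac{\partial G_3}{\partial z}(x,y,z) \\
&= \frac{\partial (x^3 y)}{\partial x} + \frac{\partial (z)}{\partial y} + \frac{\partial (xz)}{\partial z} = 3x^2 y + x,
\end{aligned} \tag{9.1.28}$$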

for all (x, y, z) in R3. Since the right side of (9.1.28) is not identically zero, (9.1.28) contradicts (9.1.27), and we see that GGG cannot be the curl of another

vector field.

We next introduce a differential operator which converts a given scalar field into another scalar

field, as promised at the end of Remark 9.1.7:

Definition 9.1.18. Suppose that f : D → R is a C2-scalar field with domain D ⊂ R3 (see Definition

3.2.1 and Remark 3.2.2). Then the Laplacian of the scalar field f is another scalar field denoted by

∇2f and defined on the same domain D by

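$$\nabla^2 f(x,y,z) := \frac{\partial^2 f}{\partial x^2}(x,y,z) + \frac{\partial^2 f}{\partial y^2}(x,y,z) + \frac{\partial^2 f}{\partial z^2}(x,y,z), \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.29}$$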


Finally, we extend the Laplacian operator, defined above for scalar fields, to vector fields:

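Definition 9.1.19. Suppose that FFF : D → R3 is a C2-vector field in R3 with domain D ⊂ R3 (see

Definition 3.2.1 and Remark 3.2.2) given by

$$\mathbf{F}(x,y,z) := F_1(x,y,z)\,\mathbf{i} + F_2(x,y,z)\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.30}$$

The Laplacian of the vector field FFF is another vector field ∇2FFF on the same domain D defined by

$$(\nabla^2 \mathbf{F})(x,y,z) := (\nabla^2 F_1)(x,y,z)\,\mathbf{i} + (\nabla^2 F_2)(x,y,z)\,\mathbf{j} + (\nabla^2 F_3)(x,y,z)\,\mathbf{k}, \tag{9.1.31}$$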

for all (x, y, z) in D. Here the functions ∇2Fi in (9.1.31) are of course defined by (9.1.29) with

f := Fi.


Remark 9.1.20. Laplacian operators occur all over physics and engineering, and in particular are

essential in the study of electromagnetism, as we shall see. The next theorem illustrates that the

Laplacian operator on a scalar field is really just the successive application of the gradient and

divergence to the scalar field:

Theorem 9.1.21. Suppose that f : D → R is a C2-scalar field with domain D ⊂ R3. Then the

Laplacian of f is the divergence of the gradient of f , that is

(9.1.32) ∇2f(x, y, z) = (∇ ·GGG)(x, y, z) where GGG(x, y, z) := (∇f)(x, y, z),

for all (x, y, z) ∈ D, more compactly

(9.1.33) ∇2f(x, y, z) = ∇ · (∇f)(x, y, z) for all (x, y, z) in D.

Proof: From Definition 6.1.1 the gradient of f is given by

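$$\nabla f(x,y,z) = \frac{\partial f}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x,y,z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.34}$$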

Now define the vector field FFF : D → R3 by

(9.1.35) FFF (x, y, z) := ∇f(x, y, z), for all (x, y, z) in D.

From (9.1.35) and (9.1.34) the scalar components of FFF are

$$F_1(x,y,z) = \frac{\partial f}{\partial x}(x,y,z), \quad F_2(x,y,z) = \frac{\partial f}{\partial y}(x,y,z), \quad F_3(x,y,z) = \frac{\partial f}{\partial z}(x,y,z), \tag{9.1.36}$$

for all (x, y, z) in D. Then the divergence of the gradient of f is

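$$\begin{aligned}
\nabla \cdot (\nabla f)(x,y,z) &= \nabla \cdot \mathbf{F}(x,y,z) && \text{(see (9.1.35))} \\
&= \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z) && \text{(see (9.1.5))} \\
&= \frac{\partial^2 f}{\partial x^2}(x,y,z) + \frac{\partial^2 f}{\partial y^2}(x,y,z) + \frac{\partial^2 f}{\partial z^2}(x,y,z) && \text{(see (9.1.36))},
\end{aligned} \tag{9.1.37}$$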

for all (x, y, z) in D. From (9.1.37) and (9.1.29) we obtain (9.1.33).

Remark 9.1.22. Suppose that a C2-scalar field f : D → R is such that its gradient is solenoidal

(Remark 9.1.5), that is

$$\nabla \cdot (\nabla f)(x,y,z) = 0, \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.38}$$


From (9.1.38) and Theorem 9.1.21 (see (9.1.33)) we get

(9.1.39) ∇2f(x, y, z) = 0, for all (x, y, z) in D,

that is (see (9.1.29))

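$$\frac{\partial^2 f}{\partial x^2}(x,y,z) + \frac{\partial^2 f}{\partial y^2}(x,y,z) + \frac{\partial^2 f}{\partial z^2}(x,y,z) = 0, \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.40}$$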

The relation (9.1.40) is a particular instance of a partial differential equation and is known as

Laplace’s equation. Any scalar function f which satisfies the relation (9.1.40) is called a solution of

Laplace’s equation. We see, therefore, that any scalar field whose gradient is solenoidal is necessarily

a solution of Laplace’s equation. Laplace’s equation is ubiquitous throughout mathematical physics

and engineering, and in particular is indispensable in the study of electromagnetism.
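Whether a given scalar field solves Laplace's equation (9.1.40) can be tested numerically. The following is a minimal sketch, not part of the original notes: it approximates each second partial derivative in (9.1.29) by a second central difference. The function name `laplacian`, the step size `h` and the three test fields are illustrative assumptions; the field 1/r is only evaluated away from the origin, where it is defined.

```python
import math

def laplacian(f, p, h=1e-3):
    """Second-difference estimate of the Laplacian (9.1.29) of f at p."""
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (f(*hi) - 2 * f(*p) + f(*lo)) / h**2
    return total

def saddle(x, y, z):
    """x^2 - y^2, a solution of Laplace's equation."""
    return x**2 - y**2

def inverse_distance(x, y, z):
    """1/r, a solution of Laplace's equation away from the origin."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def squared_distance(x, y, z):
    """x^2 + y^2 + z^2, whose Laplacian is 6, so not a solution."""
    return x**2 + y**2 + z**2

p = (1.0, 2.0, -0.5)
results = [laplacian(g, p) for g in (saddle, inverse_distance, squared_distance)]
print(results)   # the first two entries are close to 0, the third is close to 6
```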

Remark 9.1.23. We have defined the gradient operator for scalar fields (see Definition 6.1.1), the

divergence operator (see Definition 9.1.1 and (9.1.5)) and curl operator (see Definition 9.1.6 and

(9.1.8)) for vector fields when these fields do not depend on t, that is are functions of (x, y, z) only.

In this section, as well as in later chapters, we are going to apply these operators to scalar and

vector fields which are time varying in the sense of Remark 3.2.5, that is are functions of (t, x, y, z).

In each case we simply hold t fixed and apply the x, y and z-partial derivatives as usual.

In particular, if f(t, x, y, z) is a time varying scalar field then we define the gradient of f by

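$$(\nabla f)(t,x,y,z) = \frac{\partial f}{\partial x}(t,x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(t,x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(t,x,y,z)\,\mathbf{k}, \tag{9.1.41}$$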

for all (t, x, y, z) (c.f. (6.1.4)), and we define the Laplacian of f by

$$(\nabla^2 f)(t,x,y,z) := \frac{\partial^2 f}{\partial x^2}(t,x,y,z) + \frac{\partial^2 f}{\partial y^2}(t,x,y,z) + \frac{\partial^2 f}{\partial z^2}(t,x,y,z) \tag{9.1.42}$$

for all (t, x, y, z) (c.f. (9.1.29)). Similarly, if FFF (t, x, y, z) is a time varying vector field with the scalar

component representation

(9.1.43) FFF (t, x, y, z) = F1(t, x, y, z)iii+ F2(t, x, y, z)jjj + F3(t, x, y, z)kkk,

then we define the divergence of FFF by

$$(\nabla \cdot \mathbf{F})(t,x,y,z) = \frac{\partial F_1}{\partial x}(t,x,y,z) + \frac{\partial F_2}{\partial y}(t,x,y,z) + \frac{\partial F_3}{\partial z}(t,x,y,z), \tag{9.1.44}$$


for all (t, x, y, z) (c.f. (9.1.5)), the curl of FFF by

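$$\begin{aligned}
(\nabla \times \mathbf{F})(t,x,y,z) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] F_1(t,x,y,z) & F_2(t,x,y,z) & F_3(t,x,y,z) \end{vmatrix} \\
&= \left[ \frac{\partial F_3}{\partial y}(t,x,y,z) - \frac{\partial F_2}{\partial z}(t,x,y,z) \right] \mathbf{i} + \left[ \frac{\partial F_1}{\partial z}(t,x,y,z) - \frac{\partial F_3}{\partial x}(t,x,y,z) \right] \mathbf{j} \\
&\quad + \left[ \frac{\partial F_2}{\partial x}(t,x,y,z) - \frac{\partial F_1}{\partial y}(t,x,y,z) \right] \mathbf{k},
\end{aligned} \tag{9.1.45}$$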

for all (t, x, y, z) (c.f. (9.1.8)), and the Laplacian of FFF by

$$(\nabla^2 \mathbf{F})(t,x,y,z) := (\nabla^2 F_1)(t,x,y,z)\,\mathbf{i} + (\nabla^2 F_2)(t,x,y,z)\,\mathbf{j} + (\nabla^2 F_3)(t,x,y,z)\,\mathbf{k}, \tag{9.1.46}$$

for all (t, x, y, z) (c.f. (9.1.31)). Here the functions ∇2Fi(t, x, y, z) in (9.1.46) are of course defined

by (9.1.42) with f := Fi.

We now state for future reference a useful result on interchanging the divergence and curl operators

with partial t-derivatives for time varying vector fields FFF (t, x, y, z). This very simple result

will be useful in later applications, particularly when we look at Maxwell’s equations. Put

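$$\mathbf{G}_1(t,x,y,z) := (\nabla \cdot \mathbf{F})(t,x,y,z), \tag{9.1.47}$$

$$\mathbf{G}_2(t,x,y,z) := \frac{\partial \mathbf{F}(t,x,y,z)}{\partial t} = \frac{\partial F_1(t,x,y,z)}{\partial t}\,\mathbf{i} + \frac{\partial F_2(t,x,y,z)}{\partial t}\,\mathbf{j} + \frac{\partial F_3(t,x,y,z)}{\partial t}\,\mathbf{k}, \tag{9.1.48}$$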

for all (t, x, y, z). It is easy, although tedious, to verify that

$$\frac{\partial \mathbf{G}_1(t,x,y,z)}{\partial t} = (\nabla \cdot \mathbf{G}_2)(t,x,y,z), \tag{9.1.49}$$

for all (t, x, y, z). The relation (9.1.49) is usually written as

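$$\frac{\partial}{\partial t} \left[ (\nabla \cdot \mathbf{F})(t,x,y,z) \right] = \nabla \cdot \left[ \frac{\partial \mathbf{F}(t,x,y,z)}{\partial t} \right], \tag{9.1.50}$$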

that is we can interchange the divergence with the partial t-derivative. Similarly, upon defining

(9.1.51) GGG3(t, x, y, z) := (∇×FFF )(t, x, y, z),


one can again check by simple but tedious calculation that

$$\frac{\partial \mathbf{G}_3(t,x,y,z)}{\partial t} = (\nabla \times \mathbf{G}_2)(t,x,y,z), \tag{9.1.52}$$

for all (t, x, y, z). The relation (9.1.52) is usually written as

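$$\frac{\partial}{\partial t} \left[ (\nabla \times \mathbf{F})(t,x,y,z) \right] = \nabla \times \left[ \frac{\partial \mathbf{F}(t,x,y,z)}{\partial t} \right], \tag{9.1.53}$$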

that is we can interchange the curl with the partial t-derivative.

Remark 9.1.24. Here we collect some of the more useful identities for the gradient, divergence,

curl and Laplacian operators defined previously. Typically these identities are established by easy

(although sometimes tedious) calculations, or follow from theorems we have already established. In

the following f and g are time varying C1-scalar fields, FFF and GGG are time varying C1-vector fields

(recall Remark 3.2.2), and c is any real constant. For maximum generality we state the identities

in the time varying case, in terms of (t, x, y, z). Of course, the identities also hold for time constant

fields, in which case we just everywhere replace (t, x, y, z) with (x, y, z).

(9.1.54) (∇(cf))(t, x, y, z) = c(∇f)(t, x, y, z).

(9.1.55) (∇(f + g))(t, x, y, z) = (∇f)(t, x, y, z) + (∇g)(t, x, y, z).

(9.1.56) (∇(fg))(t, x, y, z) = g(t, x, y, z)(∇f)(t, x, y, z) + f(t, x, y, z)(∇g)(t, x, y, z).

$$(\nabla (f/g))(t,x,y,z) = \frac{g(t,x,y,z)(\nabla f)(t,x,y,z) - f(t,x,y,z)(\nabla g)(t,x,y,z)}{g^2(t,x,y,z)}, \tag{9.1.57}$$

for all (t, x, y, z) in D such that g(t, x, y, z) ≠ 0.

(9.1.58) (∇ · (cFFF ))(t, x, y, z) = c(∇ ·FFF )(t, x, y, z).

(9.1.59) (∇ · (FFF +GGG))(t, x, y, z) = (∇ ·FFF )(t, x, y, z) + (∇ ·GGG)(t, x, y, z).

(9.1.60) (∇× (cFFF ))(t, x, y, z) = c(∇×FFF )(t, x, y, z).

(9.1.61) (∇× (FFF +GGG))(t, x, y, z) = (∇×FFF )(t, x, y, z) + (∇×GGG)(t, x, y, z).

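$$(\nabla \times (\nabla \times \mathbf{F}))(t,x,y,z) = (\nabla (\nabla \cdot \mathbf{F}))(t,x,y,z) - (\nabla^2 \mathbf{F})(t,x,y,z), \tag{9.1.62}$$

(here FFF is a C2-vector field, recall Remark 3.2.2).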

$$\begin{aligned}
(\nabla^2 (fg))(t,x,y,z) ={}& g(t,x,y,z)(\nabla^2 f)(t,x,y,z) + f(t,x,y,z)(\nabla^2 g)(t,x,y,z) \\
&+ 2((\nabla f) \cdot (\nabla g))(t,x,y,z),
\end{aligned} \tag{9.1.63}$$

(here f and g are C2-scalar fields, recall Remark 3.2.2).

(9.1.64) ∇ · (FFF ×GGG)(t, x, y, z) = GGG(t, x, y, z) · (∇×FFF )(t, x, y, z)−FFF (t, x, y, z) · (∇×GGG)(t, x, y, z).
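Identities such as (9.1.64) can be spot-checked numerically before one attempts the (tedious) calculation by hand. The following is a minimal sketch, not part of the original notes: it uses time constant fields for simplicity, evaluates both sides of (9.1.64) with central differences at a single point, and compares them. The helper names, the step size `h` and the test fields `F` and `G` are illustrative assumptions.

```python
import math

def partial(f, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)

def divergence(F, p, h=1e-5):
    """Approximate (9.1.3) at p."""
    return sum(partial(lambda x, y, z, i=i: F(x, y, z)[i], p, i, h)
               for i in range(3))

def curl(F, p, h=1e-5):
    """Approximate (9.1.7) at p."""
    def d(component, variable):
        return partial(lambda x, y, z: F(x, y, z)[component], p, variable, h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def F(x, y, z):
    return (x * y, y * z, z * x)

def G(x, y, z):
    return (math.sin(y), x * z, x + y**2)

def F_cross_G(x, y, z):
    return cross(F(x, y, z), G(x, y, z))

p = (0.5, -1.0, 2.0)
lhs = divergence(F_cross_G, p)                           # left side of (9.1.64)
rhs = dot(G(*p), curl(F, p)) - dot(F(*p), curl(G, p))    # right side of (9.1.64)
print(lhs, rhs)   # the two values should agree up to finite-difference error
```

Agreement at one point does not prove the identity, of course, but a disagreement would immediately expose a sign or index error in a hand calculation.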


9.2 Theorem of Stokes

Stokes’ theorem is effectively a generalization of Green’s theorem (see Theorem 7.2.1) from two to

three dimensions.

Remark 9.2.1. In Definition 8.1.4 we defined a surface S as the image or range of a parametric

function

(9.2.65) ΦΦΦ : D → R3,

for some region D ⊂ R2, specifically

$$S := \{ \boldsymbol{\Phi}(u,v) \in \mathbb{R}^3 \mid (u,v) \in D \}, \tag{9.2.66}$$

(see (8.1.8)). The surface S is called closed when it is the boundary of some region in R3, or

equivalently completely contains some region in R3. For example, the surface of a sphere in R3 is a

closed surface. Surfaces which are not closed are called open. For example, a flat disc in R3 with

circular boundary of radius r is an open surface. Similarly, the x−y plane in R3 is an open surface.

We shall mainly be interested in finite open surfaces, that is open surfaces of finite size. The x−y

plane is an open surface but clearly of infinite extent, therefore not a finite open surface. A disc

with circular boundary of finite radius r is a finite open surface. We are going to assume that our

finite open surface S always has a boundary Γ which is a closed curve, that is a curve which begins

and ends at the same point (see Figure 9.2). Clearly one can traverse the closed curve Γ in two

possible directions, namely from A to B to C then back to A, or in the opposite direction from

A to C to B then back to A. For purposes of reference we need to standardize an unambiguous

direction of traverse. We always give Γ that particular direction which is such that the surface S

enclosed by Γ is on your left when you traverse Γ in this direction (see Figure 9.2). The boundary

curve Γ is then called positively oriented.

We are now able to state Stokes’ theorem:

Theorem 9.2.2 (Stokes’ theorem). Suppose that $S$ is a finite open surface in $\mathbb{R}^3$ with closed positively oriented boundary curve $\Gamma$ (see Remark 9.2.1 and Figure 9.2), and $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is a $C^1$-vector field (recall Remark 3.2.2). Then

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.67}$$


Figure 9.2: Open surface S with positively oriented boundary Γ

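Remark 9.2.3. In the alternative notation $\operatorname{curl}\mathbf{F}$ for $\nabla\times\mathbf{F}$ (see Remark 9.1.8) the theorem of Stokes is frequently written

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_S (\operatorname{curl}\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.68}$$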

Remark 9.2.4. Recall that the surface integral on the right sides of (9.2.68) and (9.2.67) is defined by (8.5.125) in terms of any parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ of the surface $S$, but of course with the vector field $\nabla\times\mathbf{F}$ (or $\operatorname{curl}\mathbf{F}$) in place of $\mathbf{F}$ in (8.5.125), that is

$$\int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} := \int_D (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v)\right] du\,dv. \tag{9.2.69}$$

As for the line integral on the left sides of (9.2.68) and (9.2.67), recall that this is defined by (5.1.13), that is

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt, \tag{9.2.70}$$

in which $\boldsymbol{\gamma} : [a,b] \to \mathbb{R}^3$ is any path (recall Definition 4.2.2) which traverses the boundary curve $\Gamma$ in the positively oriented direction shown in Figure 9.2 as the parametric variable $t$ increases through the interval $a \le t \le b$. Since $\Gamma$ is a closed curve the quantity at (9.2.70) is the circulation of the vector field $\mathbf{F}$ around $\Gamma$ (see Remark 5.1.2). We can therefore paraphrase Stokes’ theorem as follows: “The circulation of a $C^1$-vector field $\mathbf{F}$ around the positively oriented boundary $\Gamma$ of a finite open surface $S$ is equal to the surface integral of $\nabla\times\mathbf{F}$ over $S$”.


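Proof of Theorem 9.2.2: We shall prove Theorem 9.2.2 in the special case for which the finite open surface $S$ is the graph of a function

$$f : D \to \mathbb{R}, \tag{9.2.71}$$

with the parametric representation in Remark 8.3.4, that is

$$\boldsymbol{\Phi}(u,v) := u\,\mathbf{i} + v\,\mathbf{j} + f(u,v)\,\mathbf{k}, \quad \text{for all } (u,v) \text{ in } D \tag{9.2.72}$$

(see (8.3.54) and Figure 9.3).

Figure 9.3: Open surface S in proof of Stokes’ theorem

We have already calculated

$$\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) = -\frac{\partial f}{\partial u}(u,v)\,\mathbf{i} - \frac{\partial f}{\partial v}(u,v)\,\mathbf{j} + \mathbf{k}, \quad \text{for all } (u,v) \text{ in } D \tag{9.2.73}$$

(see (8.3.57)). From (9.1.8) and (9.2.72)

$$\begin{aligned}
(\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) ={}& \left[\frac{\partial F_3}{\partial y}(u,v,f(u,v)) - \frac{\partial F_2}{\partial z}(u,v,f(u,v))\right]\mathbf{i} \\
&+ \left[\frac{\partial F_1}{\partial z}(u,v,f(u,v)) - \frac{\partial F_3}{\partial x}(u,v,f(u,v))\right]\mathbf{j} \\
&+ \left[\frac{\partial F_2}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial y}(u,v,f(u,v))\right]\mathbf{k},
\end{aligned} \tag{9.2.74}$$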

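for all $(u,v)$ in $D$. From (9.2.74) and (9.2.73),

$$\begin{aligned}
(\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v)\right]
={}& \left[\frac{\partial F_2}{\partial z}(u,v,f(u,v)) - \frac{\partial F_3}{\partial y}(u,v,f(u,v))\right]\frac{\partial f}{\partial u}(u,v) \\
&+ \left[\frac{\partial F_3}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial z}(u,v,f(u,v))\right]\frac{\partial f}{\partial v}(u,v) \\
&+ \left[\frac{\partial F_2}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial y}(u,v,f(u,v))\right],
\end{aligned} \tag{9.2.75}$$

for all $(u,v)$ in $D$, and from (9.2.75) together with (9.2.69) we get

$$\begin{aligned}
\int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A}
={}& \int_D (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v)\right] du\,dv \\
={}& \int_D \left[\frac{\partial F_2}{\partial z}(u,v,f(u,v)) - \frac{\partial F_3}{\partial y}(u,v,f(u,v))\right]\frac{\partial f}{\partial u}(u,v)\,du\,dv \\
&+ \int_D \left[\frac{\partial F_3}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial z}(u,v,f(u,v))\right]\frac{\partial f}{\partial v}(u,v)\,du\,dv \\
&+ \int_D \left[\frac{\partial F_2}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial y}(u,v,f(u,v))\right] du\,dv.
\end{aligned} \tag{9.2.76}$$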

We next calculate the line integral on the left of (9.2.67). Exactly as at Remark 9.2.4 we fix a path

$$\boldsymbol{\gamma} : [a,b] \to \mathbb{R}^3 \tag{9.2.77}$$

which traverses the boundary curve $\Gamma$ in the direction of positive orientation (see Figure 9.3). We write $\boldsymbol{\gamma}$ in the scalar component form

$$\boldsymbol{\gamma}(t) = (x(t), y(t), z(t)) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}, \quad \text{for all } t \text{ in } a \le t \le b \tag{9.2.78}$$

(see (4.2.5)), so that the line integral then becomes

$$\int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} = \int_a^b \left[ F_1(x(t),y(t),z(t))\frac{dx}{dt}(t) + F_2(x(t),y(t),z(t))\frac{dy}{dt}(t) + F_3(x(t),y(t),z(t))\frac{dz}{dt}(t) \right] dt \tag{9.2.79}$$

(see (5.1.18)). Since the surface $S$ is the graph of the function $f$ (see (9.2.71)) it follows that

$$z(t) = f(x(t), y(t)), \quad \text{for all } t \text{ in } a \le t \le b \tag{9.2.80}$$

(see Figure 9.3), so that from (9.2.80) and (9.2.78) we obtain

$$\boldsymbol{\gamma}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + f(x(t),y(t))\,\mathbf{k}, \quad \text{for all } t \text{ in } a \le t \le b, \tag{9.2.81}$$

and, from (9.2.80) and the chain rule, we get

$$\frac{dz}{dt}(t) = \frac{\partial f}{\partial x}(x(t),y(t))\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t),y(t))\frac{dy}{dt}(t). \tag{9.2.82}$$

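We next insert (9.2.82) and (9.2.80) in (9.2.79) to get

$$\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r}
={}& \int_a^b \Big[ F_1(x(t),y(t),f(x(t),y(t)))\frac{dx}{dt}(t) + F_2(x(t),y(t),f(x(t),y(t)))\frac{dy}{dt}(t) \\
&\quad + F_3(x(t),y(t),f(x(t),y(t)))\Big(\frac{\partial f}{\partial x}(x(t),y(t))\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t),y(t))\frac{dy}{dt}(t)\Big)\Big]\,dt \\
={}& \int_a^b \left[ G_1(x(t),y(t))\frac{dx}{dt}(t) + G_2(x(t),y(t))\frac{dy}{dt}(t) \right] dt,
\end{aligned} \tag{9.2.83}$$

where we have defined

$$G_1(x,y) := F_1(x,y,f(x,y)) + F_3(x,y,f(x,y))\frac{\partial f}{\partial x}(x,y), \tag{9.2.84}$$

$$G_2(x,y) := F_2(x,y,f(x,y)) + F_3(x,y,f(x,y))\frac{\partial f}{\partial y}(x,y), \tag{9.2.85}$$

for all $(x,y)$ in $\mathbb{R}^2$. We next define the path in the plane by

$$\boldsymbol{\eta}(t) := (x(t), y(t)), \quad \text{for all } t \text{ in } a \le t \le b, \tag{9.2.86}$$

where (9.2.81) relates $x(t)$ and $y(t)$ to $\boldsymbol{\gamma}(t)$. We next let $\Delta$ denote the boundary of the region $D$ in $\mathbb{R}^2$. Since $\boldsymbol{\gamma}(t)$ traverses $\Gamma$ in the positively oriented direction as $t$ increases through $a \le t \le b$ it follows that $\boldsymbol{\eta}(t)$ traverses $\Delta$ in the counterclockwise direction (see Figure 9.3). From (9.2.86) we have

$$\int_a^b \left[ G_1(x(t),y(t))\frac{dx}{dt}(t) + G_2(x(t),y(t))\frac{dy}{dt}(t) \right] dt = \int_a^b \mathbf{G}(\boldsymbol{\eta}(t)) \cdot \boldsymbol{\eta}^{(1)}(t)\,dt = \int_\Delta \mathbf{G}(\mathbf{r})\cdot d\mathbf{r}, \tag{9.2.87}$$

in which the vector field $\mathbf{G} : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by

$$\mathbf{G}(x,y) := (G_1(x,y), G_2(x,y)), \quad \text{for all } (x,y) \text{ in } \mathbb{R}^2. \tag{9.2.88}$$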

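Since $\boldsymbol{\eta}(t)$, $a \le t \le b$, traverses $\Delta$ counterclockwise, from Green’s theorem (see Theorem 7.2.1) we find

$$\int_\Delta \mathbf{G}(\mathbf{r})\cdot d\mathbf{r} = \int_D \left[\frac{\partial G_2}{\partial x}(x,y) - \frac{\partial G_1}{\partial y}(x,y)\right] dx\,dy. \tag{9.2.89}$$

Combining (9.2.89), (9.2.87), and (9.2.83) gives

$$\int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} = \int_D \left[\frac{\partial G_2}{\partial x}(x,y) - \frac{\partial G_1}{\partial y}(x,y)\right] dx\,dy. \tag{9.2.90}$$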

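We next evaluate the integrand on the right side of (9.2.90). From (9.2.84) and (9.2.85), together with the chain rule for partial derivatives, we obtain

$$\begin{aligned}
\frac{\partial G_2}{\partial x}(x,y) - \frac{\partial G_1}{\partial y}(x,y)
={}& \Big[ \frac{\partial F_2}{\partial x}(x,y,f(x,y)) + \frac{\partial F_2}{\partial z}(x,y,f(x,y))\frac{\partial f}{\partial x}(x,y) + \frac{\partial F_3}{\partial x}(x,y,f(x,y))\frac{\partial f}{\partial y}(x,y) \\
&\quad + \frac{\partial F_3}{\partial z}(x,y,f(x,y))\frac{\partial f}{\partial x}(x,y)\frac{\partial f}{\partial y}(x,y) + F_3(x,y,f(x,y))\frac{\partial^2 f}{\partial x\,\partial y}(x,y) \Big] \\
&- \Big[ \frac{\partial F_1}{\partial y}(x,y,f(x,y)) + \frac{\partial F_1}{\partial z}(x,y,f(x,y))\frac{\partial f}{\partial y}(x,y) + \frac{\partial F_3}{\partial y}(x,y,f(x,y))\frac{\partial f}{\partial x}(x,y) \\
&\quad + \frac{\partial F_3}{\partial z}(x,y,f(x,y))\frac{\partial f}{\partial y}(x,y)\frac{\partial f}{\partial x}(x,y) + F_3(x,y,f(x,y))\frac{\partial^2 f}{\partial x\,\partial y}(x,y) \Big].
\end{aligned} \tag{9.2.91}$$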

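Now the last two terms in the first square brackets cancel the last two terms in the second square brackets. After this simplification, from (9.2.91) and (9.2.90) we get

$$\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r}
={}& \int_D \Big\{ \Big[ \frac{\partial F_2}{\partial x}(x,y,f(x,y)) + \frac{\partial F_2}{\partial z}(x,y,f(x,y))\frac{\partial f}{\partial x}(x,y) + \frac{\partial F_3}{\partial x}(x,y,f(x,y))\frac{\partial f}{\partial y}(x,y) \Big] \\
&\quad - \Big[ \frac{\partial F_1}{\partial y}(x,y,f(x,y)) + \frac{\partial F_1}{\partial z}(x,y,f(x,y))\frac{\partial f}{\partial y}(x,y) + \frac{\partial F_3}{\partial y}(x,y,f(x,y))\frac{\partial f}{\partial x}(x,y) \Big] \Big\}\,dx\,dy \\
={}& \int_D \left[\frac{\partial F_2}{\partial z}(x,y,f(x,y)) - \frac{\partial F_3}{\partial y}(x,y,f(x,y))\right]\frac{\partial f}{\partial x}(x,y)\,dx\,dy \\
&+ \int_D \left[\frac{\partial F_3}{\partial x}(x,y,f(x,y)) - \frac{\partial F_1}{\partial z}(x,y,f(x,y))\right]\frac{\partial f}{\partial y}(x,y)\,dx\,dy \\
&+ \int_D \left[\frac{\partial F_2}{\partial x}(x,y,f(x,y)) - \frac{\partial F_1}{\partial y}(x,y,f(x,y))\right] dx\,dy.
\end{aligned} \tag{9.2.92}$$

We find that the right sides of (9.2.92) and (9.2.76) are equal (they differ only in the names $(x,y)$ and $(u,v)$ of the integration variables), and therefore the left sides are also equal, giving

$$\int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r},$$

which establishes the theorem.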
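The special case treated in the proof can be checked numerically on a concrete graph surface. The sketch below is an illustration only and is not taken from the notes: it assumes the hypothetical choices $f(x,y) = xy$ on the unit disc $D$ and $\mathbf{F}(x,y,z) = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k}$ (for which $\nabla\times\mathbf{F} = \mathbf{i}+\mathbf{j}+\mathbf{k}$), evaluates the surface integral through (9.2.69) with the normal vector (9.2.73), and evaluates the line integral through (9.2.79) with (9.2.80) and (9.2.82).

```python
import math

# Hypothetical graph surface z = f(x, y) over the unit disc (illustrative choice).
def f(x, y):
    return x * y

def f_x(x, y):      # partial derivative of f with respect to x
    return y

def f_y(x, y):      # partial derivative of f with respect to y
    return x

# Hypothetical vector field F = (z, x, y); its curl is the constant (1, 1, 1).
def F(x, y, z):
    return (z, x, y)

CURL_F = (1.0, 1.0, 1.0)

def surface_integral(nr=200, nt=200):
    """Midpoint rule in polar coordinates for (9.2.69) with normal (9.2.73)."""
    dr, dt = 1.0 / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            th = (j + 0.5) * dt
            u, v = r * math.cos(th), r * math.sin(th)
            normal = (-f_x(u, v), -f_y(u, v), 1.0)
            integrand = sum(c * n for c, n in zip(CURL_F, normal))
            total += integrand * r * dr * dt      # du dv = r dr dtheta
    return total

def line_integral(n=2000):
    """Line integral (9.2.79) along the boundary, traversed counterclockwise."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)
        z = f(x, y)                               # (9.2.80)
        dz = f_x(x, y) * dx + f_y(x, y) * dy      # (9.2.82)
        F1, F2, F3 = F(x, y, z)
        total += (F1 * dx + F2 * dy + F3 * dz) * dt
    return total

surface_value = surface_integral()
line_value = line_integral()
print(surface_value, line_value, math.pi)
```

For these particular choices both integrals work out by hand to $\pi$, so the three printed values should coincide to within the quadrature error.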

Remark 9.2.5. Suppose that $S_1$ and $S_2$ are two finite open surfaces in $\mathbb{R}^3$ having a common closed positively oriented boundary curve $\Gamma$. Applying Stokes’ Theorem 9.2.2 to the surface $S_1$ we get

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A}, \tag{9.2.93}$$

and similarly, applying Stokes’ Theorem 9.2.2 to the surface $S_2$ we also get

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_{S_2} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.94}$$

Combining (9.2.93) and (9.2.94),

$$\int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \int_{S_2} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.95}$$

We can often use (9.2.95) to replace evaluation of a difficult surface integral with the evaluation of an easier surface integral, as the next example illustrates.

Example 9.2.6. A vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by

$$\mathbf{F}(x,y,z) := x\,\mathbf{i} + (x+y)\,\mathbf{j} + (x+y+z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3, \tag{9.2.96}$$

and the surface $S$ is the top half of the sphere of radius $r$ centered at the origin of $\mathbb{R}^3$ (see Example 8.1.1 and Figure 9.4). We must calculate the surface integral of the curl vector field $\nabla\times\mathbf{F}$ over $S$. From (9.2.96) we easily get

$$(\nabla\times\mathbf{F})(x,y,z) = \mathbf{i} - \mathbf{j} + \mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.2.97}$$

However, direct evaluation of the surface integral

$$\int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A}$$

is quite tedious despite the simplicity of (9.2.97) (the reader may want to try the direct evaluation to see this). We see from Figure 9.4 that if $\Gamma$ is the circle of radius $r$ in the $x$-$y$ plane with counterclockwise direction then $\Gamma$ is a closed positively oriented boundary curve of $S$, and therefore from Stokes’ Theorem 9.2.2 we get

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A}, \tag{9.2.98}$$

so we could try to evaluate the line integral on the left of (9.2.98) to get the required surface integral. Unfortunately this line integral is also quite tedious to compute because the vector field $\mathbf{F}$ does not have any obvious symmetry properties. However, observe from Figure 9.4 that the flat surface

$$S_1 = \{(x,y,z) \mid x^2 + y^2 \le r^2,\ z = 0\} \tag{9.2.99}$$

is also a finite open surface in $\mathbb{R}^3$ with $\Gamma$ as its closed positively oriented boundary curve, that is, $\Gamma$ is the common closed positively oriented boundary curve of the surfaces $S$ and $S_1$. From Remark 9.2.5 we then get

$$\int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.100}$$

Figure 9.4: Surfaces S and S1 and closed curve Γ for Example 9.2.6


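We shall evaluate the surface integral on the right side of (9.2.100) using (8.5.125) (with $\nabla\times\mathbf{F}$ instead of $\mathbf{F}$), which we expect to be easier to deal with than the surface integral on the left side of (9.2.100) since the surface $S_1$ is “flat” not “curved”. To use (8.5.125) we need a parametric representation of the surface $S_1$. Clearly such a representation is the mapping $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which $D \subset \mathbb{R}^2_{uv}$ is

$$D = \{(u,v) \mid u^2 + v^2 \le r^2\}, \tag{9.2.101}$$

and $\boldsymbol{\Phi}$ is defined by

$$\boldsymbol{\Phi}(u,v) := u\,\mathbf{i} + v\,\mathbf{j} + 0\,\mathbf{k}, \quad \text{for all } (u,v) \text{ in } D, \tag{9.2.102}$$

for it is immediately clear from (9.2.102) and (9.2.101) that

$$S_1 = \{\boldsymbol{\Phi}(u,v) \mid (u,v) \in D\}. \tag{9.2.103}$$

This very simple parametric representation results of course from the fact that the surface $S_1$ is “flat”. Note also that, in this very simple case, the surface $S_1$ being represented by $\boldsymbol{\Phi}$ is really just identical to the parametric domain $D$ of the mapping $\boldsymbol{\Phi}$ (recall Definition 8.1.4). From (9.2.102)

$$\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) = \mathbf{i} \quad\text{and}\quad \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) = \mathbf{j}, \tag{9.2.104}$$

for all $(u,v)$ in $D$, and from (9.2.104) we get

$$\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) = \mathbf{i}\times\mathbf{j} = \mathbf{k}, \quad \text{for all } (u,v) \text{ in } D. \tag{9.2.105}$$

From (9.2.97) and (9.2.102) we find

$$(\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) = \mathbf{i} - \mathbf{j} + \mathbf{k}, \quad \text{for all } (u,v) \text{ in } D, \tag{9.2.106}$$

and from (9.2.106) and (9.2.105) we obtain

$$(\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v)\right] = (\mathbf{i} - \mathbf{j} + \mathbf{k})\cdot\mathbf{k} = 1, \quad \text{for all } (u,v) \text{ in } D. \tag{9.2.107}$$

From (8.5.125), with $\nabla\times\mathbf{F}$ substituted in place of $\mathbf{F}$, we get

$$\begin{aligned}
\int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A}
&= \int_D (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v)\right] du\,dv \\
&= \int_D (1)\,du\,dv \quad \text{(from (9.2.107))} \\
&= \operatorname{area} D = \pi r^2.
\end{aligned} \tag{9.2.108}$$

From (9.2.108) and (9.2.100) we find

$$\int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \pi r^2. \tag{9.2.109}$$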
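The value $\pi r^2$ at (9.2.109) can be cross-checked numerically against the other two integrals mentioned in the example. The sketch below is an illustration under stated assumptions, not part of the original notes: the radius `r = 2.0` is an arbitrary sample value, the circle $\Gamma$ is parametrized as $(r\cos t, r\sin t, 0)$, and the hemisphere uses the standard spherical angles with the outward (upward) normal; the function names are hypothetical.

```python
import math

def F(x, y, z):
    """Vector field of (9.2.96)."""
    return (x, x + y, x + y + z)

CURL_F = (1.0, -1.0, 1.0)     # from (9.2.97)

def line_integral(r, n=2000):
    """Circulation of F around the circle of radius r in the x-y plane (counterclockwise)."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * dt
        x, y = r * math.cos(t), r * math.sin(t)
        dx, dy = -r * math.sin(t), r * math.cos(t)
        F1, F2, _ = F(x, y, 0.0)          # dz/dt = 0 on the circle
        total += (F1 * dx + F2 * dy) * dt
    return total

def hemisphere_flux(r, n_theta=200, n_phi=400):
    """Flux of curl F through the upper hemisphere, outward (upward) normal."""
    d_theta = 2.0 * math.pi / n_theta
    d_phi = (math.pi / 2.0) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # unit outward normal on the sphere
            n_vec = (math.sin(phi) * math.cos(theta),
                     math.sin(phi) * math.sin(theta),
                     math.cos(phi))
            d_area = r * r * math.sin(phi) * d_theta * d_phi
            total += sum(c * n for c, n in zip(CURL_F, n_vec)) * d_area
    return total

r = 2.0                                   # arbitrary sample radius
print(line_integral(r), hemisphere_flux(r), math.pi * r * r)
```

All three printed values should agree up to quadrature error, which illustrates both Stokes’ theorem (9.2.98) and the surface-replacement idea of Remark 9.2.5.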

Remark 9.2.7. In Remark 9.1.10 we saw that the curl of the velocity field $\mathbf{V}$ of a moving fluid has a very direct physical interpretation. For general vector fields $\mathbf{F}$ it is not possible to be quite so specific about the physical significance of the curl vector field $\nabla\times\mathbf{F}$. Nevertheless, with the aid of Stokes’ Theorem 9.2.2, we can at least get a partial intuitive sense of what the curl vector field means. To see this suppose we have a vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ (for simplicity we take the domain of $\mathbf{F}$ to be $D := \mathbb{R}^3$), and fix some point $(x,y,z)$ in $\mathbb{R}^3$ together with a unit vector $\mathbf{n}$ passing through $(x,y,z)$. Let the surface $S_\rho$ be a flat disc of radius $\rho$ with center at $(x,y,z)$ and lying in the plane which is orthogonal to the unit vector $\mathbf{n}$ (see Figure 9.5). Finally, let $\Gamma_\rho$ be the closed positively oriented boundary of $S_\rho$ (that is, $\Gamma_\rho$ is just a circle of radius $\rho$ centered at $(x,y,z)$ with the sense of direction indicated in Figure 9.5).

Figure 9.5: Flat disc Sρ with circular boundary curve Γρ

From Stokes’ Theorem 9.2.2 we get

$$\int_{\Gamma_\rho} \mathbf{F}\cdot d\mathbf{r} = \int_{S_\rho} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.110}$$

Now suppose that the radius $\rho$ of the disc $S_\rho$ is very small. Then, from the definition of a surface integral, one sees that

$$\int_{S_\rho} (\nabla\times\mathbf{F})\cdot d\mathbf{A} \approx (\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n}\,(\operatorname{area} S_\rho), \tag{9.2.111}$$

since the vector field $\nabla\times\mathbf{F}$ is approximately constant with value $(\nabla\times\mathbf{F})(x,y,z)$ at all points on the small surface $S_\rho$. From (9.2.111) and (9.2.110)

$$(\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n} \approx \frac{1}{\operatorname{area} S_\rho}\int_{\Gamma_\rho} \mathbf{F}\cdot d\mathbf{r}. \tag{9.2.112}$$

Now the circulation of $\mathbf{F}$ around $\Gamma_\rho$ given by the line integral on the right side of (9.2.112) is a measure of the aggregate “turning” or “rotation” of the vector field $\mathbf{F}$ around the very small curve $\Gamma_\rho$, so that the quantity on the right side of (9.2.112) is the aggregate “turning” of the vector field $\mathbf{F}$ around $\Gamma_\rho$ per unit area of the surface $S_\rho$ enclosed by $\Gamma_\rho$. Taking $\rho \to 0$ in (9.2.112) then gives

$$(\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n} = \lim_{\rho\to 0} \frac{1}{\operatorname{area} S_\rho}\int_{\Gamma_\rho} \mathbf{F}\cdot d\mathbf{r}. \tag{9.2.113}$$

To get a more detailed intuitive interpretation of $\nabla\times\mathbf{F}$ from (9.2.113) we need to assign a physical interpretation to the vector field $\mathbf{F}$ itself. For example, suppose that the vector field $\mathbf{F}$ is actually the velocity field $\mathbf{V}$ that was discussed in Remark 9.1.10. Then, applying the principles of fluid mechanics (we shall not undertake this here) one can actually establish the relation (9.1.10) on the basis of (9.2.113). For a second example, suppose that the vector field $\mathbf{F}$ is an electric field. In Remark 5.1.3, we have seen that the “turning” of $\mathbf{F}$ around $\Gamma_\rho$ represented by the circulation on the right side of (9.2.113) is actually the electromotive force (a very physical entity!) generated by the electric field around $\Gamma_\rho$. We then see from (9.2.113) that in this case the quantity $(\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n}$ is the limit of electromotive force per unit area enclosed by $\Gamma_\rho$ as $\rho \to 0$. This interpretation of $\nabla\times\mathbf{F}$ when $\mathbf{F}$ is an electric field is very useful in electromagnetism.

Example 9.2.8 (from nanotechnology). A time varying electric field is given by

$$\mathbf{E}(t,x,y,z) = t(x+y)\,\mathbf{i} - 2t^2x^2\,\mathbf{j} + txy\,\mathbf{k}, \quad (x,y,z) \in \mathbb{R}^3. \tag{9.2.114}$$

As in Remark 9.2.7 fix some point $(x,y,z)$ in $\mathbb{R}^3$ and some unit vector $\mathbf{n}$ passing through $(x,y,z)$, and let $S_\rho$ be the flat disc with radius $\rho$ and center at $(x,y,z)$ lying in the plane which is orthogonal to the unit vector $\mathbf{n}$ (see Figure 9.5). A circular metallic loop with total resistance $R$ is placed along the boundary curve $\Gamma_\rho$. Determine the current in the loop in terms of time $t$ when (all distances in meters)

$$\rho = 10^{-3}, \quad (x,y,z) = (1,1,5), \quad \mathbf{n} = \frac{\mathbf{i} + 2\mathbf{j} + \mathbf{k}}{\sqrt{6}}, \quad R = 10 \text{ ohm}. \tag{9.2.115}$$

From basic physics the voltage $v(t)$ generated in the circular loop is the electromotive force generated by the electric field around $\Gamma_\rho$, that is, the line integral

$$v(t) = \int_{\Gamma_\rho} \mathbf{E}\cdot d\mathbf{r}, \tag{9.2.116}$$

so that the current through the loop is, by Ohm’s law,

$$i(t) = \frac{v(t)}{R} = \frac{1}{R}\int_{\Gamma_\rho} \mathbf{E}\cdot d\mathbf{r}. \tag{9.2.117}$$

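Direct evaluation of the line integral on the right side of (9.2.117) for the electric field given by (9.2.114) is not at all easy! But, since $\rho$ is so small, we know from the approximation (9.2.112) that

$$\int_{\Gamma_\rho} \mathbf{E}\cdot d\mathbf{r} \approx (\operatorname{area} S_\rho)\,[(\nabla\times\mathbf{E})(t,x,y,z)\cdot\mathbf{n}]. \tag{9.2.118}$$

From (9.2.118) and (9.2.117)

$$i(t) \approx \frac{\pi\rho^2}{R}\,[(\nabla\times\mathbf{E})(t,x,y,z)\cdot\mathbf{n}], \tag{9.2.119}$$

and from (9.2.114)

$$\nabla\times\mathbf{E}(t,x,y,z) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
t(x+y) & -2t^2x^2 & txy
\end{vmatrix}
= tx\,\mathbf{i} - ty\,\mathbf{j} - (4t^2x + t)\,\mathbf{k}
= t\,\mathbf{i} - t\,\mathbf{j} - (4t^2 + t)\,\mathbf{k}, \tag{9.2.120}$$

where the last equality uses $(x,y,z) = (1,1,5)$ from (9.2.115). From (9.2.120), (9.2.119) and (9.2.115),

$$i(t) \approx \frac{10^{-6}\pi}{10\sqrt{6}}\,[t\,\mathbf{i} - t\,\mathbf{j} - (4t^2+t)\,\mathbf{k}]\cdot[\mathbf{i} + 2\mathbf{j} + \mathbf{k}]
= -\frac{10^{-7}(2\pi)}{\sqrt{6}}\,(t + 2t^2). \tag{9.2.121}$$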
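The approximation used in this example can be compared with a direct numerical evaluation of the loop integral (9.2.117). The following is a minimal sketch, not part of the original notes: the helper names, the construction of the in-plane basis vectors `e1` and `e2` (chosen so that `e1 × e2 = n`, which gives the positive orientation), the sample time `t = 1.0`, and the number of quadrature points are all assumptions made for illustration.

```python
import math

RHO = 1e-3                      # loop radius in meters, from (9.2.115)
CENTER = (1.0, 1.0, 5.0)        # loop center, from (9.2.115)
R_OHM = 10.0                    # loop resistance, from (9.2.115)
N_VEC = tuple(c / math.sqrt(6.0) for c in (1.0, 2.0, 1.0))

def E(t, x, y, z):
    """Electric field of (9.2.114)."""
    return (t * (x + y), -2.0 * t * t * x * x, t * x * y)

def curl_E(t, x, y, z):
    """Curl of E as computed in (9.2.120), before substituting the point."""
    return (t * x, -t * y, -(4.0 * t * t * x + t))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def loop_basis(n):
    """Orthonormal e1, e2 spanning the plane orthogonal to n, with e1 x e2 = n."""
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = cross(n, helper)
    norm = math.sqrt(dot(e1, e1))
    e1 = tuple(c / norm for c in e1)
    e2 = cross(n, e1)
    return e1, e2

def approx_current(t):
    """Right side of (9.2.119)."""
    return math.pi * RHO ** 2 / R_OHM * dot(curl_E(t, *CENTER), N_VEC)

def numeric_current(t, n=2000):
    """Direct quadrature of the loop integral in (9.2.117)."""
    e1, e2 = loop_basis(N_VEC)
    ds = 2.0 * math.pi / n
    emf = 0.0
    for k in range(n):
        s = k * ds
        c, sn = math.cos(s), math.sin(s)
        point = tuple(CENTER[i] + RHO * (c * e1[i] + sn * e2[i]) for i in range(3))
        tangent = tuple(RHO * (-sn * e1[i] + c * e2[i]) for i in range(3))
        emf += dot(E(t, *point), tangent) * ds
    return emf / R_OHM

t = 1.0                         # arbitrary sample time
print(approx_current(t), numeric_current(t))
```

For this particular field the components of $\nabla\times\mathbf{E}$ are linear in $x$ and $y$, so the average of the curl over the disc equals its value at the center, and the two printed currents should agree essentially to rounding error rather than only approximately.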

9.3 Divergence Theorem of Gauss-Ostrogradskii

We come to the second major theorem of vector calculus namely the theorem of Gauss-Ostrogradskii

or divergence theorem. In contrast to Stokes’ theorem, which involves the surface integral of a vector

field over an open surface (see Remark 9.2.1), the divergence theorem involves the surface integral

of a vector field over a closed surface. We therefore expand on the notion of a closed surface which

was briefly introduced in Remark 9.2.1:

Remark 9.3.1. A surface S in R3 is a closed surface when it is the boundary of some region in

R3. Examples of closed surfaces are the surface of a sphere in R3 and the surface of a rectangular

parallelepiped in R3. In the divergence theorem we shall be interested in finite closed surfaces, that

is closed surfaces which contain a region of finite volume or finite extent. Fix a point on some closed


finite surface S (e.g. think of a point on the surface of a sphere or a rectangular parallelepiped in

R3). At this point there are two possible choices of unit vector which are orthogonal to the surface,

namely a unit vector which points out of the enclosed region and alternatively a unit vector which

points into the enclosed region. From now on we shall say that a finite closed surface has outward

orientation when the unit vector normal to the surface points out of the enclosed region at every

point on the surface. Analogously, a finite closed surface is said to have inward orientation when

the unit vector normal to the surface points into the enclosed region at every point on the surface

(see Figure 9.6 in which points A, B and C are on the surface S and unit vectors normal to the

surface are indicated at these points).

Figure 9.6: Closed surfaces S with outward orientation (a) and inward orientation (b)

We then have

Theorem 9.3.2 (Divergence theorem of Gauss-Ostrogradskii). Suppose that $S$ is a closed surface with outward orientation (see Remark 9.3.1) which encloses a finite region $\Omega \subset \mathbb{R}^3$, and $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is a $C^1$-vector field (see Remark 3.2.2). Then

$$\int_\Omega (\operatorname{div}\mathbf{F})\,dV = \int_S \mathbf{F}\cdot d\mathbf{A} \tag{9.3.122}$$

(recall the definition of $\operatorname{div}\mathbf{F}$ at Definition 9.1.1).

Remark 9.3.3. In the alternative notation $\nabla\cdot\mathbf{F}$ for $\operatorname{div}\mathbf{F}$ the divergence theorem is written

$$\int_\Omega (\nabla\cdot\mathbf{F})\,dV = \int_S \mathbf{F}\cdot d\mathbf{A}. \tag{9.3.123}$$


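Remark 9.3.4. Recall that the surface integral on the right sides of (9.3.122) and (9.3.123) is defined by (8.5.125) in terms of any parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ of the surface $S$, that is

$$\int_S \mathbf{F}\cdot d\mathbf{A} := \int_D \mathbf{F}(\boldsymbol{\Phi}(u,v)) \cdot \left[\frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v)\right] du\,dv. \tag{9.3.124}$$

As for the three dimensional integral appearing on the left of (9.3.122) and (9.3.123), this is formulated in Definition 2.2.1 in the case where $\Omega$ is a rectangular parallelepiped, and at Remark 2.2.8 for general $\Omega \subset \mathbb{R}^3$, with $f$ taken to be the scalar field $\nabla\cdot\mathbf{F}$.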

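Proof of Theorem 9.3.2: We shall prove Theorem 9.3.2 in the special case where $\Omega$ is a regular region in the sense of Remark 2.2.12. Writing the vector field $\mathbf{F}$ in the usual componentwise form

$$\mathbf{F}(x,y,z) = F_1(x,y,z)\,\mathbf{i} + F_2(x,y,z)\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3 \tag{9.3.125}$$

(c.f. (3.2.17)), we have from (9.1.5)

$$\nabla\cdot\mathbf{F}(x,y,z) = \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z), \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3, \tag{9.3.126}$$

so that, from (9.3.126), the volume integral on the left of (9.3.123) is the sum

$$\int_\Omega (\nabla\cdot\mathbf{F})\,dV = \int_\Omega \frac{\partial F_1}{\partial x}\,dV + \int_\Omega \frac{\partial F_2}{\partial y}\,dV + \int_\Omega \frac{\partial F_3}{\partial z}\,dV. \tag{9.3.127}$$

As for the surface integral on the right of (9.3.123), we have

$$\begin{aligned}
\int_S \mathbf{F}\cdot d\mathbf{A} &= \int_S (F_1\,\mathbf{i} + F_2\,\mathbf{j} + F_3\,\mathbf{k})\cdot d\mathbf{A} \quad \text{(from (9.3.125))} \\
&= \int_S \mathbf{G}_1\cdot d\mathbf{A} + \int_S \mathbf{G}_2\cdot d\mathbf{A} + \int_S \mathbf{G}_3\cdot d\mathbf{A},
\end{aligned} \tag{9.3.128}$$

in which we have defined the simple vector fields

$$\mathbf{G}_1(x,y,z) := F_1(x,y,z)\,\mathbf{i}, \quad \mathbf{G}_2(x,y,z) := F_2(x,y,z)\,\mathbf{j}, \quad \mathbf{G}_3(x,y,z) := F_3(x,y,z)\,\mathbf{k}, \tag{9.3.129}$$

for all $(x,y,z)$ in $\mathbb{R}^3$. We shall establish the relations

$$\int_\Omega \frac{\partial F_1}{\partial x}\,dV = \int_S \mathbf{G}_1\cdot d\mathbf{A}, \qquad \int_\Omega \frac{\partial F_2}{\partial y}\,dV = \int_S \mathbf{G}_2\cdot d\mathbf{A}, \qquad \int_\Omega \frac{\partial F_3}{\partial z}\,dV = \int_S \mathbf{G}_3\cdot d\mathbf{A}, \tag{9.3.130}$$

since, with (9.3.130) established, we obtain (9.3.123) from (9.3.128) and (9.3.127). We establish the third of the relations at (9.3.130), the remaining relations being proved in an identical way. Since $\Omega$ is a regular region in the sense of Remark 2.2.12 it is, in particular, $z$-simple with lower function $\gamma_1(x,y)$, upper function $\gamma_2(x,y)$ and common domain of definition $D \subset \mathbb{R}^2_{xy}$ (see (2.2.101)), so that, in set theory notation, $\Omega$ is given by (2.2.101), that is

$$\Omega = \{(x,y,z) \in \mathbb{R}^3 \mid (x,y) \text{ in } D \text{ and } \gamma_1(x,y) \le z \le \gamma_2(x,y)\}. \tag{9.3.131}$$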

It follows from Remark 2.2.9 that the integral of any real valued function $f$ over $\Omega$ is given by (2.2.102). Identifying $f$ with $\partial F_3/\partial z$ we obtain

$$\int_\Omega \frac{\partial F_3}{\partial z}\,dV = \int_D \left\{ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} \frac{\partial F_3}{\partial z}(x,y,z)\,dz \right\} dx\,dy. \tag{9.3.132}$$

By the fundamental theorem of calculus

$$\int_{\gamma_1(x,y)}^{\gamma_2(x,y)} \frac{\partial F_3}{\partial z}(x,y,z)\,dz = F_3(x,y,\gamma_2(x,y)) - F_3(x,y,\gamma_1(x,y)), \tag{9.3.133}$$

so that, from (9.3.132) and (9.3.133),

$$\int_\Omega \frac{\partial F_3}{\partial z}\,dV = \int_D \left[ F_3(x,y,\gamma_2(x,y)) - F_3(x,y,\gamma_1(x,y)) \right] dx\,dy. \tag{9.3.134}$$

Next, evaluate the surface integral on the right side of the third relation at (9.3.130). To this end, since $\Omega$ is given by (9.3.131), it necessarily has the form shown in Figure 9.7, in which the upper surface $S_2$ is the graph of the function $\gamma_2(x,y)$, namely

$$S_2 = \{(x,y,\gamma_2(x,y)) \in \mathbb{R}^3 \mid (x,y) \text{ in } D\} \tag{9.3.135}$$

(c.f. (2.2.100)), while the lower surface $S_1$ is the graph of the function $\gamma_1(x,y)$, namely

$$S_1 = \{(x,y,\gamma_1(x,y)) \in \mathbb{R}^3 \mid (x,y) \text{ in } D\}, \tag{9.3.136}$$

and the “side surface” $S_3$ is such that

$$\text{each unit vector normal to } S_3 \text{ is also normal to the unit vector } \mathbf{k}. \tag{9.3.137}$$

Now the surfaces $S_1$, $S_2$ and $S_3$ are disjoint and cover the surface $S$ which encloses $\Omega$, that is

$$S = S_1 \cup S_2 \cup S_3, \qquad S_i \cap S_j = \emptyset \text{ when } i \neq j. \tag{9.3.138}$$

From (9.3.138) it follows that

$$\int_S \mathbf{G}_3\cdot d\mathbf{A} = \int_{S_1} \mathbf{G}_3\cdot d\mathbf{A} + \int_{S_2} \mathbf{G}_3\cdot d\mathbf{A} + \int_{S_3} \mathbf{G}_3\cdot d\mathbf{A}. \tag{9.3.139}$$


Figure 9.7: Region Ω and surfaces S1, S2 and S3 in the proof of Theorem 9.3.2

It remains to evaluate each of the integrals on the right of (9.3.139). One sees from the third relation of (9.3.129) that $\mathbf{G}_3$ is in the direction of the unit vector $\mathbf{k}$. In view of this fact, together with (9.3.137) and the definition of surface integral, it is immediate that

$$\int_{S_3} \mathbf{G}_3\cdot d\mathbf{A} = 0. \tag{9.3.140}$$

We now evaluate the second integral on the right side of (9.3.139), namely the integral over the upper surface $S_2$. To this end we recall Remark 8.5.3 in which we established the formula (8.5.133) for the surface integral of a general vector field $\mathbf{F}$ over a surface which is the graph of some function $f$. In fact we shall use (8.5.133), but taking $\gamma_2$ in place of $f$ (since $S_2$ is the graph of $\gamma_2$, as shown at (9.3.135)) and taking $\mathbf{G}_3$ in place of $\mathbf{F}$ (since $\mathbf{G}_3$ is the vector field we are integrating). From the third relation of (9.3.129) we see that

$$\mathbf{G}_3(x,y,z) = 0\,\mathbf{i} + 0\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k}, \tag{9.3.141}$$

so that in (8.5.133) we identify $F_1 = 0$, $F_2 = 0$ to get

$$\int_{S_2} \mathbf{G}_3\cdot d\mathbf{A} = \int_D F_3(u,v,\gamma_2(u,v))\,du\,dv. \tag{9.3.142}$$

In exactly the same way, for the surface integral of $\mathbf{G}_3$ over the lower surface $S_1$ which is the graph of the function $\gamma_1$ (see (9.3.136)), we obtain

$$\int_{S_1} \mathbf{G}_3\cdot d\mathbf{A} = -\int_D F_3(u,v,\gamma_1(u,v))\,du\,dv. \tag{9.3.143}$$

Why the negative sign on the right of (9.3.143), in contrast to the right side of (9.3.142)? This sign is to compensate for unit vectors $\mathbf{n}$ normal to $S_1$ pointing outwards (by the assumed outward orientation of $S$) and hence downwards (since $S_1$ is the lower surface), whereas the unit vectors normal to $S_2$ point upwards (again by the assumed outward orientation of $S$, see Figure 9.7). Combining (9.3.139), (9.3.140), (9.3.142) and (9.3.143),

$$\int_S \mathbf{G}_3\cdot d\mathbf{A} = \int_D \left[ F_3(u,v,\gamma_2(u,v)) - F_3(u,v,\gamma_1(u,v)) \right] du\,dv. \tag{9.3.144}$$

Comparison of (9.3.134) with (9.3.144) shows

$$\int_S \mathbf{G}_3\cdot d\mathbf{A} = \int_\Omega \frac{\partial F_3}{\partial z}\,dV, \tag{9.3.145}$$

which is the third relation of (9.3.130). The first and second relations of (9.3.130) are established in the same way, using the fact that $\Omega$ is also $x$-simple and $y$-simple.

The Gauss-Ostrogradskii Theorem 9.3.2 can often be used to simplify the calculation of a surface integral, reducing this to a volume integral which may often be easier to calculate, as the next example demonstrates.

Example 9.3.5. A vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by

$$\mathbf{F}(x,y,z) := 4x\,\mathbf{i} - 2y^2\,\mathbf{j} + z^2\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.146}$$

The surface $S$ encloses the region

$$\Omega = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 \le 4,\ 0 \le z \le 3\}, \tag{9.3.147}$$

that is, $\Omega$ is the cylinder-shaped region of radius 2 and height 3 shown in Figure 9.8. We must evaluate the surface integral

$$\int_S \mathbf{F}\cdot d\mathbf{A}.$$

Direct evaluation of this integral is not impossible, but is nevertheless quite lengthy and complicated, as we have to evaluate the surface integral separately over each of the three bounding surfaces $S_1$ (“lower” surface), $S_2$ (“top” surface), and $S_3$ (“side” surface) in Figure 9.8 (you may want to try the calculation!).

Figure 9.8: Cylindrical region Ω for the Example 9.3.5

On the other hand, from Theorem 9.3.2 we have that

$$\int_S \mathbf{F}\cdot d\mathbf{A} = \int_\Omega (\nabla\cdot\mathbf{F})\,dV, \tag{9.3.148}$$

so we will try to evaluate the volume integral on the right side of (9.3.148). First calculate the divergence. From (9.3.146) and Definition 9.1.1

$$(\nabla\cdot\mathbf{F})(x,y,z) = \frac{\partial(4x)}{\partial x} + \frac{\partial(-2y^2)}{\partial y} + \frac{\partial(z^2)}{\partial z} = 4 - 4y + 2z. \tag{9.3.149}$$

We must therefore evaluate the three dimensional integral

$$\int_\Omega f\,dV \tag{9.3.150}$$

for

$$f(x,y,z) := (\nabla\cdot\mathbf{F})(x,y,z) = 4 - 4y + 2z. \tag{9.3.151}$$

We see at once, from Remark 2.2.9 and Figure 9.8, that the region $\Omega$ given by (9.3.147) is $z$-simple with lower function $\gamma_1(x,y)$, upper function $\gamma_2(x,y)$, and common domain of definition $D \subset \mathbb{R}^2_{xy}$, defined by

$$D := \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 4\}, \qquad \gamma_1(x,y) := 0, \quad \gamma_2(x,y) := 3, \quad \text{for all } (x,y) \text{ in } D. \tag{9.3.152}$$

We can therefore use Fubini’s theorem in the form of (2.2.102), namely

$$\int_\Omega f\,dV = \int_D \left\{ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x,y,z)\,dz \right\} dx\,dy. \tag{9.3.153}$$

From (9.3.152) and (9.3.151) we get

$$\int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x,y,z)\,dz = \int_0^3 (4 - 4y + 2z)\,dz = 21 - 12y, \tag{9.3.154}$$

and, from (9.3.154) and (9.3.153), we get

$$\int_\Omega f\,dV = \int_D g(x,y)\,dx\,dy, \quad \text{where } g(x,y) := 21 - 12y \text{ for all } (x,y) \text{ in } D. \tag{9.3.155}$$

Now one sees from (9.3.152) that the region $D$ is a disc of radius $r = 2$ centered at the origin of $\mathbb{R}^2$, so it follows, exactly as at Example 2.1.16, that $D$ is a $y$-simple region with upper function $\varphi_2(x)$, lower function $\varphi_1(x)$, and common interval of definition $a \le x \le b$, with

$$a := -2, \quad b := 2, \quad \varphi_1(x) := -\sqrt{4 - x^2}, \quad \varphi_2(x) := \sqrt{4 - x^2}, \quad \text{for all } -2 \le x \le 2. \tag{9.3.156}$$

We can now use Fubini’s theorem in the form of (2.1.27) to evaluate the integral on the right side of (9.3.155), that is

$$\begin{aligned}
\int_D g(x,y)\,dx\,dy &= \int_a^b \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} g(x,y)\,dy \right\} dx \\
&= \int_{-2}^{2} \left\{ \int_{-\sqrt{4-x^2}}^{\sqrt{4-x^2}} (21 - 12y)\,dy \right\} dx \quad \text{(from (9.3.156) and (9.3.155))}.
\end{aligned} \tag{9.3.157}$$

For the “inner” $dy$-integral we have

$$\int_{-\sqrt{4-x^2}}^{\sqrt{4-x^2}} (21 - 12y)\,dy = \Big[ 21y - 6y^2 \Big]_{y=-\sqrt{4-x^2}}^{y=\sqrt{4-x^2}} = 42\sqrt{4 - x^2}. \tag{9.3.158}$$

From (9.3.158) and (9.3.157) we get

$$\int_D g(x,y)\,dx\,dy = 42 \int_{-2}^{2} \sqrt{4 - x^2}\,dx. \tag{9.3.159}$$

To evaluate the integral on the right side of (9.3.159) make the substitution

$$x = 2\sin(\theta) \quad \text{so that} \quad dx = 2\cos(\theta)\,d\theta. \tag{9.3.160}$$

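Moreover, from (9.3.160),

$$\theta = \arcsin(x/2) \quad \text{so that} \quad \theta = \frac{\pi}{2} \text{ when } x = 2 \quad \text{and} \quad \theta = -\frac{\pi}{2} \text{ when } x = -2. \tag{9.3.161}$$

Then, from (9.3.160) and (9.3.161),

$$\int_{-2}^{2} \sqrt{4 - x^2}\,dx = \int_{-\pi/2}^{\pi/2} \sqrt{4 - 4\sin^2(\theta)}\,(2\cos(\theta))\,d\theta = 4\int_{-\pi/2}^{\pi/2} \cos^2(\theta)\,d\theta. \tag{9.3.162}$$

Now observe that

$$\frac{d[\sin(\theta)\cos(\theta)]}{d\theta} = \cos^2(\theta) - \sin^2(\theta) = 2\cos^2(\theta) - 1$$

so that

$$\frac{d}{d\theta}\left[\frac{\theta + \sin(\theta)\cos(\theta)}{2}\right] = \cos^2(\theta). \tag{9.3.163}$$

From (9.3.163) and (9.3.162) we find

$$\int_{-2}^{2} \sqrt{4 - x^2}\,dx = 4\left[\frac{\theta + \sin(\theta)\cos(\theta)}{2}\right]_{\theta=-\pi/2}^{\theta=\pi/2} = 2\pi. \tag{9.3.164}$$

From (9.3.164) and (9.3.159)

$$\int_D g(x,y)\,dx\,dy = 42(2\pi) = 84\pi. \tag{9.3.165}$$

From (9.3.165), the first relation of (9.3.155), (9.3.151) and (9.3.148) we obtain

$$\int_S \mathbf{F}\cdot d\mathbf{A} = \int_\Omega (\nabla\cdot\mathbf{F})\,dV = \int_\Omega f\,dV = \int_D g(x,y)\,dx\,dy = 84\pi.$$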
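The result $84\pi$ can be verified numerically from both sides of (9.3.148). The sketch below is an illustration only, under stated assumptions: it uses cylindrical coordinates with a midpoint rule (a choice made here, not in the notes), the function names are hypothetical, and the surface flux is assembled from the top, bottom and side pieces with outward normals $+\mathbf{k}$, $-\mathbf{k}$ and $(\cos\theta, \sin\theta, 0)$ respectively.

```python
import math

RADIUS, HEIGHT = 2.0, 3.0       # cylinder of (9.3.147)

def F(x, y, z):
    """Vector field of (9.3.146)."""
    return (4.0 * x, -2.0 * y * y, z * z)

def div_F(x, y, z):
    """Divergence from (9.3.149)."""
    return 4.0 - 4.0 * y + 2.0 * z

def volume_integral(nr=40, nt=64, nz=30):
    """Midpoint rule for the volume integral of div F in cylindrical coordinates."""
    dr, dt, dz = RADIUS / nr, 2.0 * math.pi / nt, HEIGHT / nz
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            th = (j + 0.5) * dt
            x, y = r * math.cos(th), r * math.sin(th)
            for k in range(nz):
                z = (k + 0.5) * dz
                total += div_F(x, y, z) * r * dr * dt * dz
    return total

def surface_flux(nr=40, nt=64, nz=30):
    """Outward flux of F through the top, bottom and side of the cylinder."""
    dr, dt, dz = RADIUS / nr, 2.0 * math.pi / nt, HEIGHT / nz
    flux = 0.0
    # top (normal +k) and bottom (normal -k) discs
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            th = (j + 0.5) * dt
            x, y = r * math.cos(th), r * math.sin(th)
            flux += F(x, y, HEIGHT)[2] * r * dr * dt
            flux -= F(x, y, 0.0)[2] * r * dr * dt
    # side surface, outward normal (cos th, sin th, 0), area element RADIUS dth dz
    for j in range(nt):
        th = (j + 0.5) * dt
        nx, ny = math.cos(th), math.sin(th)
        x, y = RADIUS * nx, RADIUS * ny
        for k in range(nz):
            z = (k + 0.5) * dz
            Fx, Fy, _ = F(x, y, z)
            flux += (Fx * nx + Fy * ny) * RADIUS * dt * dz
    return flux

volume_value = volume_integral()
flux_value = surface_flux()
print(volume_value, flux_value, 84.0 * math.pi)
```

Both computed values should match $84\pi \approx 263.89$ closely, since the integrands here are low-degree polynomials in $r$, $z$ and trigonometric polynomials in $\theta$, for which the midpoint rule is very accurate.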

Remark 9.3.6. In Remark 9.2.7 we saw that Stokes’ Theorem 9.2.2 could be used to shed some light on the physical significance of the curl of a vector field. In much the same way we can use the Gauss-Ostrogradskii Theorem 9.3.2 to get some understanding of the physical significance of the divergence of a vector field. To see this suppose we have a vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ (for simplicity we take the domain of $\mathbf{F}$ to be $D := \mathbb{R}^3$), and take the region $\Omega_\rho \subset \mathbb{R}^3$ to be a sphere with radius $\rho$ centered at some point $(x,y,z)$ in $\mathbb{R}^3$, and let $S_\rho$ be the outward oriented surface of $\Omega_\rho$. From Theorem 9.3.2 we obtain

$$\int_{\Omega_\rho} (\nabla\cdot\mathbf{F})\,dV = \int_{S_\rho} \mathbf{F}\cdot d\mathbf{A}. \tag{9.3.166}$$

Now suppose that the radius $\rho$ of the sphere $\Omega_\rho$ is very small. Then, from the definition of the three dimensional integral, one sees that

$$\int_{\Omega_\rho} (\nabla\cdot\mathbf{F})\,dV \approx (\nabla\cdot\mathbf{F})(x,y,z)\,(\operatorname{vol}\Omega_\rho) \tag{9.3.167}$$

(where $\operatorname{vol}\Omega_\rho$ denotes the volume of the sphere $\Omega_\rho$) since the scalar field $\nabla\cdot\mathbf{F}$ is approximately constant with value $(\nabla\cdot\mathbf{F})(x,y,z)$ at all points in the small region $\Omega_\rho$ centered at $(x,y,z)$. From (9.3.167) and (9.3.166) we get

$$(\nabla\cdot\mathbf{F})(x,y,z) \approx \frac{1}{\operatorname{vol}\Omega_\rho}\int_{S_\rho} \mathbf{F}\cdot d\mathbf{A}, \tag{9.3.168}$$

and upon taking $\rho \to 0$ at (9.3.168) we obtain

$$(\nabla\cdot\mathbf{F})(x,y,z) = \lim_{\rho\to 0} \frac{1}{\operatorname{vol}\Omega_\rho}\int_{S_\rho} \mathbf{F}\cdot d\mathbf{A}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.169}$$

We recall from Remark 8.5.4 that the surface integrals over the spherical surface $S_\rho$ on the right sides of (9.3.169) and (9.3.168) are the flux of the vector field $\mathbf{F}$ through $S_\rho$. Then (9.3.168) effectively says that the divergence $(\nabla\cdot\mathbf{F})(x,y,z)$ is approximately the flux of $\mathbf{F}$ through the small spherical surface $S_\rho$ centered at $(x,y,z)$ per unit volume of the region $\Omega_\rho$ enclosed by $S_\rho$. Moreover, (9.3.169) says that this approximation becomes exact when the radius $\rho$ of the sphere $S_\rho$ becomes “infinitesimally small”. If, for example, we identify the vector field $\mathbf{F}$ with a current density vector field $\mathbf{J}$, then (9.3.169) becomes

$$(\nabla\cdot\mathbf{J})(x,y,z) = \lim_{\rho\to 0} \frac{1}{\operatorname{vol}\Omega_\rho}\int_{S_\rho} \mathbf{J}\cdot d\mathbf{A}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.170}$$

But we know from Remark 8.5.4 that the surface integral

$$\int_{S_\rho} \mathbf{J}\cdot d\mathbf{A}$$

on the right hand side of (9.3.170) is in fact the total current flowing out of $\Omega_\rho$ through the boundary surface $S_\rho$. It then follows from (9.3.170) that $(\nabla\cdot\mathbf{J})(x,y,z)$ must be this total out-flowing current per unit volume of the spherical region $\Omega_\rho$ enclosed by $S_\rho$ for infinitesimally small radius $\rho$. In particular, if

$$(\nabla\cdot\mathbf{J})(x,y,z) > 0, \tag{9.3.171}$$

then there must be a source of current located at the point $(x,y,z)$, and if

$$(\nabla\cdot\mathbf{J})(x,y,z) < 0, \tag{9.3.172}$$

then there must be a sink of current located at the point $(x,y,z)$.


Example 9.3.7. A current density vector field is given by

$$\mathbf{J}(x,y,z) = x^3\,\mathbf{i} - 2xy\,\mathbf{j} + yz\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.173}$$

As in Remark 9.3.6 let $\Omega_\rho$ be a spherical region with radius $\rho > 0$ and center at a point $(x,y,z)$, and let $S_\rho$ be the surface of $\Omega_\rho$. Determine the total current through the surface $S_\rho$ when

$$\rho = 10^{-6}, \quad (x,y,z) = (1,2,1) \tag{9.3.174}$$

(all distances in meters). We know that the current through $S_\rho$ is given by the surface integral

$$i = \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A} \tag{9.3.175}$$

(recall Section 8.5). Direct evaluation of the surface integral at (9.3.175) is not at all easy. We could modify the parametric representation for the whole sphere worked out in Example 8.1.9 to account for the fact that the center is at $(x,y,z) = (1,2,1)$ to get the representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which

$$D = \{(\theta,\phi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi\} = [0,2\pi]\times[0,\pi], \tag{9.3.176}$$

and

$$\boldsymbol{\Phi}(\theta,\phi) = [1 + \rho\sin(\phi)\cos(\theta)]\,\mathbf{i} + [2 + \rho\sin(\phi)\sin(\theta)]\,\mathbf{j} + [1 + \rho\cos(\phi)]\,\mathbf{k}, \tag{9.3.177}$$

for all $(\theta,\phi)$ in $D$ defined by (9.3.176) (compare with (8.1.17)). Using the general formula for surface integral given by (8.5.125) we then see that the current at (9.3.175) is given by

$$i = \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A} = \int_D \mathbf{J}(\boldsymbol{\Phi}(\theta,\phi)) \cdot \left[\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta,\phi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\phi}(\theta,\phi)\right] d\theta\,d\phi. \tag{9.3.178}$$

However, substitution of (9.3.177) and (9.3.173) into (9.3.178) leads to some rather laborious integrals! Since $\rho$ is so small (see (9.3.174)) we can instead use the approximation for the divergence established in Remark 9.3.6. From (9.3.168) we have

$$(\nabla\cdot\mathbf{J})(x,y,z) \approx \frac{1}{\operatorname{vol}\Omega_\rho}\int_{S_\rho} \mathbf{J}\cdot d\mathbf{A},$$

that is

$$\int_{S_\rho} \mathbf{J}\cdot d\mathbf{A} \approx (\operatorname{vol}\Omega_\rho)\,(\nabla\cdot\mathbf{J})(x,y,z). \tag{9.3.179}$$

From (9.3.173) and (9.3.174)

(9.3.180) $(\nabla\cdot\mathbf{J})(x,y,z) = \frac{\partial(x^3)}{\partial x} + \frac{\partial(-2xy)}{\partial y} + \frac{\partial(yz)}{\partial z} = 3x^2 - 2x + y = 3.$

Upon combining (9.3.180), (9.3.179), (9.3.175) and (9.3.174) we get

$i \approx \frac{4}{3}\pi\rho^3\,(\nabla\cdot\mathbf{J})(x,y,z) = \frac{4}{3}\pi(10^{-6})^3(3) = 4\pi\times 10^{-18}$ amps.
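The estimate of Example 9.3.7 can be checked numerically. The following Python sketch is not part of the original notes: the function names, the midpoint quadrature and the demonstration radius 0.1 are all assumptions made for illustration. It compares a direct quadrature of the flux through a small sphere with the approximation (9.3.179), and then evaluates the approximation at the radius given in (9.3.174).

```python
import math

def J(x, y, z):
    """Current density (9.3.173)."""
    return (x**3, -2.0 * x * y, y * z)

def div_J(x, y, z):
    """Divergence worked out by hand in (9.3.180)."""
    return 3.0 * x**2 - 2.0 * x + y

def ball_volume(rho):
    return 4.0 / 3.0 * math.pi * rho**3

def sphere_flux(center, rho, n=200):
    """Midpoint-rule estimate of the outward flux of J through a sphere."""
    cx, cy, cz = center
    dtheta, dphi = 2.0 * math.pi / n, math.pi / n
    total = 0.0
    for a in range(n):
        phi = (a + 0.5) * dphi
        for b in range(n):
            theta = (b + 0.5) * dtheta
            # outward unit normal at this point of the sphere
            nx = math.sin(phi) * math.cos(theta)
            ny = math.sin(phi) * math.sin(theta)
            nz = math.cos(phi)
            jx, jy, jz = J(cx + rho * nx, cy + rho * ny, cz + rho * nz)
            # area element of the sphere is rho^2 sin(phi) dtheta dphi
            total += (jx * nx + jy * ny + jz * nz) * rho**2 * math.sin(phi) * dtheta * dphi
    return total

center = (1.0, 2.0, 1.0)

# Direct quadrature versus (9.3.179) at an illustrative radius (assumption).
rho_demo = 0.1
ratio_demo = sphere_flux(center, rho_demo) / (ball_volume(rho_demo) * div_J(*center))

# The approximation (9.3.179) at the radius of (9.3.174).
i_estimate = ball_volume(1e-6) * div_J(*center)
```

Here `ratio_demo` should be close to 1, and it approaches 1 as the demonstration radius shrinks, which is the content of Remark 9.3.6.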

9.4 The Continuity Equation

Our goal in this section is to use the divergence Theorem 9.3.2 to establish the continuity equation,

a basic result which is central to fluid dynamics, aerodynamics, electromagnetism and several other

parts of physics and engineering. We shall obtain the continuity equation in the context of electric

charge moving diffusely through space, since it is this version of the continuity equation which is

of most relevance in the study of electromagnetism. In Example 3.1.4 we introduced the charge

density scalar field ρ, and in Example 3.1.5 we saw how this could be used to obtain the total charge

Q enclosed in a region Ω ⊂ R3, namely

(9.4.181) $Q = \int_\Omega \rho(x,y,z)\,dx\,dy\,dz$

(see (3.1.3)). Now suppose that charge is in motion through space. This means that the charge

density at a point (x, y, z) generally changes with time t, in other words is given by ρ(t, x, y, z),

which is a function of time t for each (x, y, z). The charge density ρ is therefore a time varying scalar

field in the sense of Remark 3.2.5. Fix some arbitrary region Ω ⊂ R3 having the closed surface S as

its boundary. We shall suppose that S is outwardly oriented, that is the unit vector normal to the

surface points out of the region Ω at every point on the surface (see Remark 9.3.1). We emphasize

that S is a purely theoretical surface that leaves the movement of charge completely unaffected,

and is not in any sense a physical surface or barrier which impedes or disturbs the movement of

charge. In accordance with (9.4.181) the total charge contained within Ω at each instant t is given

by

(9.4.182) $Q(t) = \int_\Omega \rho(t,x,y,z)\,dx\,dy\,dz.$

From now on we are going to resort increasingly to the custom (which is standard throughout the

literature) of suppressing variables. Accordingly, (9.4.182) will be written in the abbreviated form

(9.4.183) $Q(t) = \int_\Omega \rho(t)\,dV$,

so that the reader must “mentally insert” the missing variables (x, y, z) at ρ(t) (recall from Remark

2.2.2 that dV is shorthand for dx dy dz), or written in even more stripped down form as

(9.4.184) $Q(t) = \int_\Omega \rho\,dV$,

in which case the reader must “mentally insert” all the missing variables (t, x, y, z) at ρ. One soon

gets used to this! Basing ourselves on the totally stripped down notation at (9.4.184) it follows that

the rate of increase of the total charge contained within the region Ω is given by

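(9.4.185) $\frac{dQ(t)}{dt} = \frac{d}{dt}\int_\Omega \rho\,dV = \int_\Omega \frac{\partial\rho}{\partial t}\,dV.$

With all variables displayed (9.4.185) of course reads

(9.4.186) $\frac{dQ(t)}{dt} = \int_\Omega \frac{\partial\rho(t,x,y,z)}{\partial t}\,dx\,dy\,dz$,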

but from now on we are going to avoid such detailed notation unless it is really needed (which

occasionally it is). To repeat, from (9.4.185) we have

(9.4.187) rate of increase of the total charge contained within the region $\Omega$ $= \int_\Omega \frac{\partial\rho}{\partial t}\,dV.$

Now recall from Section 8.5 that the total current passing through S is given by the surface integral

of the current density vector field $\mathbf{J}$ of the moving charge over the surface $S$, that is

(9.4.188) total current passing through the boundary surface $S$ $= \int_S \mathbf{J}\cdot d\mathbf{A}$,

(recall Example 3.1.6 for the definition of the current density vector field). Since S is outwardly

oriented, the current at (9.4.188) is the rate at which charge leaves the region Ω by flowing through

the boundary surface S. It then follows, by a sign-change, that

(9.4.189) rate at which charge enters region $\Omega$ through the boundary surface $S$ $= -\int_S \mathbf{J}\cdot d\mathbf{A}.$

We now appeal to a basic physical law, called the law of conservation of charge which, in the present

setting, says

(9.4.190)

the rate of increase of the total charge contained within the region Ω is equal to

the rate at which charge enters region Ω through the boundary surface S.

The law of conservation of charge is completely justified by experiment. Like Ampere’s law and

Faraday’s law, this law is a bedrock principle of physics. The law of conservation of charge means


in particular that charge cannot be created or destroyed, that is within any region Ω there are never

any sources of charge (i.e. places at which charge just “appears”) or sinks of charge (i.e. places

at which charge just “vanishes”). In view of (9.4.190) the quantities at (9.4.189) and at (9.4.187)

must be equal, that is

(9.4.191) $\int_\Omega \frac{\partial\rho}{\partial t}\,dV = -\int_S \mathbf{J}\cdot d\mathbf{A}.$

The relation (9.4.191) therefore expresses in mathematical form the basic principle of conservation

of charge. Unfortunately this relation is not very easy to use since the combination of a volume

integral on the left side and a surface integral on the right side is very difficult to deal with. We

are now going to use the tremendous power of vector calculus (in this case the divergence Theorem

9.3.2) to refine or “process” the relation (9.4.191) into a form which is extremely useful. From

Theorem 9.3.2, interpreting the general vector field FFF at (9.3.123) as the current density vector

field JJJ , we have

(9.4.192) $\int_S \mathbf{J}\cdot d\mathbf{A} = \int_\Omega (\nabla\cdot\mathbf{J})\,dV$,

and combining (9.4.192) with (9.4.191) we get

(9.4.193) $\int_\Omega \frac{\partial\rho}{\partial t}\,dV = -\int_\Omega (\nabla\cdot\mathbf{J})\,dV.$

Thanks to the divergence theorem we now have the same kind of integral (a volume integral) on

each side of (9.4.193), in contrast to (9.4.191) with its awkward mix of a volume integral on one side

and a surface integral on the other side. As we shall see, the relation (9.4.193) in terms of volume

integrals alone is much easier to deal with than (9.4.191). From (9.4.193)

(9.4.194) $\int_\Omega \left[(\nabla\cdot\mathbf{J}) + \frac{\partial\rho}{\partial t}\right]dV = 0.$

At this point it is important to realize that (9.4.194) holds for each and every region Ω ⊂ R3.

Indeed, in our development of (9.4.194) we made absolutely no assumptions about Ω except that it

is just a region in R3. This is very useful for it allows us to use the following technical result which

we state without proof:

Theorem 9.4.1 (du Bois Reymond). Suppose the continuous scalar function $g : \mathbb{R}^3 \to \mathbb{R}$ is such that

(9.4.195) $\int_\Omega g\,dV = 0$

for each and every region Ω ⊂ R3. Then g(x, y, z) = 0 for all (x, y, z) in R3.


In order to use Theorem 9.4.1 at (9.4.194) it pays to write this relation displaying all the variables

which have been suppressed, that is we have

(9.4.196) $\int_\Omega \left[(\nabla\cdot\mathbf{J})(t,x,y,z) + \frac{\partial\rho(t,x,y,z)}{\partial t}\right]dV = 0$,

for each t and each region Ω ⊂ R3. Now fix some arbitrary value of t, and for this fixed t put

(9.4.197) $g(x,y,z) := (\nabla\cdot\mathbf{J})(t,x,y,z) + \frac{\partial\rho(t,x,y,z)}{\partial t}$, for all $(x,y,z)$ in $\mathbb{R}^3$.

It follows from (9.4.197) and (9.4.196) that, at this fixed instant t, we have

(9.4.198) $\int_\Omega g\,dV = 0$,

for each and every region Ω ⊂ R3. Now we can apply Theorem 9.4.1 to (9.4.198) to conclude that

(9.4.199) g(x, y, z) = 0 for all (x, y, z) in R3.

Now it follows from (9.4.199) and (9.4.197), together with the arbitrary choice of t that

(9.4.200) $(\nabla\cdot\mathbf{J})(t,x,y,z) + \frac{\partial\rho(t,x,y,z)}{\partial t} = 0$, for all $t$ and points $(x,y,z)$ in $\mathbb{R}^3$.

In (9.4.200), in accordance with (9.1.44), by (∇ · JJJ)(t, x, y, z) we just mean the quantity

(9.4.201) $(\nabla\cdot\mathbf{J})(t,x,y,z) = \frac{\partial J_1}{\partial x}(t,x,y,z) + \frac{\partial J_2}{\partial y}(t,x,y,z) + \frac{\partial J_3}{\partial z}(t,x,y,z).$

Needless to say, we will always write (9.4.200) in the stripped down form

(9.4.202) $(\nabla\cdot\mathbf{J}) + \frac{\partial\rho}{\partial t} = 0$,

in which all the variables (t, x, y, z) are omitted, but it should always be remembered that (9.4.202)

is just shorthand for the more detailed formulation at (9.4.200).

Remark 9.4.2. The relation (9.4.202) is called the continuity equation and expresses the conserva-

tion of charge in very convenient mathematical form. The continuity equation will be indispensable

in our study of Maxwell’s equations.
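As a small illustration of the continuity equation (9.4.202), the Python sketch below checks it by finite differences for one made-up pair of fields. The particular $\rho$ and $\mathbf{J}$ are hypothetical, chosen only so that the divergence of $\mathbf{J}$ equals $-\partial\rho/\partial t$; they do not come from the notes.

```python
import math

def rho(t, x, y, z):
    # hypothetical charge density, chosen only for illustration
    return math.exp(-t) * (x * x + y * y + z * z)

def J(t, x, y, z):
    # hypothetical current density with div J = exp(-t) (x^2 + y^2 + z^2)
    s = math.exp(-t) / 3.0
    return (s * x**3, s * y**3, s * z**3)

def continuity_residual(t, x, y, z, h=1e-4):
    """Central-difference estimate of (div J) + d(rho)/dt, the left side of (9.4.202)."""
    dJ1_dx = (J(t, x + h, y, z)[0] - J(t, x - h, y, z)[0]) / (2 * h)
    dJ2_dy = (J(t, x, y + h, z)[1] - J(t, x, y - h, z)[1]) / (2 * h)
    dJ3_dz = (J(t, x, y, z + h)[2] - J(t, x, y, z - h)[2]) / (2 * h)
    drho_dt = (rho(t + h, x, y, z) - rho(t - h, x, y, z)) / (2 * h)
    return dJ1_dx + dJ2_dy + dJ3_dz + drho_dt

residual = continuity_residual(0.3, 0.5, -1.2, 2.0)
```

The residual is zero up to finite-difference error. Replacing `J` by a field that does not balance the change in `rho` gives a residual that stays away from zero, which would signal a source or sink of charge.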


Chapter 10

The Basic Laws of Electricity and Magnetism

In this chapter we are going to formulate the basic, experimentally determined, laws of electricity

and magnetism. We have already seen two of these laws, namely Ampere’s circuital law and

Faraday’s law of electromagnetic induction, which were stated just to illustrate the essential role of

line integrals and surface integrals in formulating the basic physical laws of electricity and magnetism

(see Remark 8.5.7 and Remark 8.5.8). In this chapter we shall look at these and the other laws

of electricity and magnetism more thoroughly and use the tools of vector calculus to reformulate

the laws in mathematically very convenient form. All of this is preparation for the next chapter, in

which we study Maxwell’s equations of electromagnetism.

10.1 Static Electric Fields

In this section we focus on static electric fields, that is electric fields EEE(x, y, z) which vary from

one point (x, y, z) to another but are constant with respect to time. The basic experimental fact

concerning static electric fields is Coulomb’s force law which effectively provides the definition of

the electric field EEE. We have already seen Coulomb’s law at Example 3.1.3, and we state this again

as follows:

Law 10.1.1 (Coulomb’s law of electrostatics). Suppose that a point charge of Q coul. is located

at point (u, v, w) in R3, and (x, y, z) is some other point in R3 distinct from (u, v, w) as shown in


Figure 10.1. Denote by rrr(u, v, w;x, y, z) the unit vector from (u, v, w) to (x, y, z), that is

(10.1.1) $\mathbf{r}(u,v,w;x,y,z) := \frac{(x-u)\,\mathbf{i} + (y-v)\,\mathbf{j} + (z-w)\,\mathbf{k}}{\sqrt{(x-u)^2 + (y-v)^2 + (z-w)^2}}.$

Then the electric field EEE(x, y, z) at point (x, y, z) due to the charge Q is the force exerted on a

positive test charge of 1 coul. at (x, y, z), which according to Coulomb’s inverse square force law is

given by

(10.1.2) $\mathbf{E}(x,y,z) = \frac{Q}{4\pi\varepsilon_0[(x-u)^2 + (y-v)^2 + (z-w)^2]}\,\mathbf{r}(u,v,w;x,y,z).$

Here ε0 is a constant called the electrostatic permittivity of free space, which is given by

(10.1.3) $\varepsilon_0 = 8.854\times 10^{-12}$ coul$^2$/(Newton meter$^2$).

Usually (10.1.1) and (10.1.2) are combined into the following single expression for the electric field:

(10.1.4) $\mathbf{E}(x,y,z) = \frac{Q\,[(x-u)\,\mathbf{i} + (y-v)\,\mathbf{j} + (z-w)\,\mathbf{k}]}{4\pi\varepsilon_0[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}.$

Figure 10.1: Electric field EEE(x, y, z) at (x, y, z) due to point charge Q at (u, v, w)
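The combined formula (10.1.4) translates directly into code. The following is a minimal Python sketch; the function name `point_charge_field` and the sample charge and positions are assumptions made for illustration.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, from (10.1.3)

def point_charge_field(Q, source, point):
    """Electric field (10.1.4) at `point` due to a point charge Q at `source`."""
    dx, dy, dz = (p - s for p, s in zip(point, source))
    r2 = dx * dx + dy * dy + dz * dz
    if r2 == 0.0:
        # (10.1.4) requires (x, y, z) to be distinct from (u, v, w)
        raise ValueError("field is undefined at the location of the charge")
    k = Q / (4.0 * math.pi * EPS0 * r2**1.5)
    return (k * dx, k * dy, k * dz)

# Hypothetical example: 1 nanocoulomb at the origin, field at (3, 4, 0) meters.
E = point_charge_field(1e-9, (0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
E_mag = math.sqrt(sum(c * c for c in E))
```

The magnitude `E_mag` agrees with the inverse square form (10.1.2), here $Q/(4\pi\varepsilon_0\cdot 25)$, and the field points along the line from the charge to the field point.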

Point charges, such as the charge Q which is “concentrated” at the point (u, v, w) in the state-

ment of Coulomb’s law, are “singularities” and constitute rather unnatural objects in the theory of

electromagnetism. It is much more usual to deal, not with point charges, but rather with charge

diffusely spread through space and described by a charge density scalar field ρ of the kind introduced


in Example 3.1.4, and throughout this section we shall suppose that this is the case. To focus on

the main question and avoid distraction by secondary issues we shall also suppose that the domain

D of the charge density scalar field ρ is all of R3, that is D = R3 in the notation of Example

3.1.4. Finally, we shall suppose that the charge density is time constant or static in the sense that

the charge density ρ(x, y, z) at any specified point (x, y, z) is constant with time t (but generally

varies from one point (x, y, z) to another). We saw at Example 3.1.4 that the significance of ρ is

the following: if (u, v, w) is any point within an infinitesimal cube dV in R3, with infinitesimal

side-lengths du, dv and dw (with reference to the u, v and w-axes of R3), then the total charge

enclosed in the cube is the infinitesimal quantity

(10.1.5) dQ = ρ(u, v, w) du dv dw,

so that the total charge Q, diffusely spread throughout space according to the density ρ, is just the

“sum” of the elements dQ at (10.1.5) expressed in terms of a volume integral namely

(10.1.6) $Q = \int_{\mathbb{R}^3} dQ = \int_{\mathbb{R}^3} \rho(u,v,w)\,du\,dv\,dw.$

With this observation we can easily state Coulomb’s law in terms of charge diffusely spread through

space with charge density ρ. Fix some point (u, v, w) in R3 such that (u, v, w) lies in an infinitesimal

cube with side-lengths $du$, $dv$ and $dw$, and fix some other point $(x,y,z)$ (see Figure 10.2). As a

result of the charge $dQ$ given by (10.1.5) the electric field at $(x,y,z)$ is the infinitesimal vector

given by (10.1.4) with $dQ$ in place of $Q$ namely

(10.1.7) $d\mathbf{E}(x,y,z) = \frac{1}{4\pi\varepsilon_0}\,\frac{(x-u)\,\mathbf{i} + (y-v)\,\mathbf{j} + (z-w)\,\mathbf{k}}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,\rho(u,v,w)\,du\,dv\,dw.$

Figure 10.2: Electric field $d\mathbf{E}(x,y,z)$ at $(x,y,z)$ due to infinitesimal point charge $dQ$ at $(u,v,w)$

Our goal is to determine the total electric field at (x, y, z) as a result of all the charge contained

within R3. This means that we must “add up” the infinitesimal electric field vectors at (10.1.7)

for all infinitesimal cubes contained within R3. Using the calculus of three dimensional integrals

worked out in Section 2.2 this is easy. In fact

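(10.1.8) $\mathbf{E}(x,y,z) = \int_{\mathbb{R}^3} d\mathbf{E}(x,y,z) = \frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{(x-u)\,\mathbf{i} + (y-v)\,\mathbf{j} + (z-w)\,\mathbf{k}}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,\rho(u,v,w)\,du\,dv\,dw$,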

in which we substituted from (10.1.7) at the second equality of (10.1.8). The statement (10.1.8)

amounts to Coulomb’s law for the electric field caused by charge diffusely spread through space

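with charge density $\rho$. Expanding (10.1.8) we get

(10.1.9)

$\mathbf{E}(x,y,z) = \left[\frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{(x-u)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,du\,dv\,dw\right]\mathbf{i}$

$+ \left[\frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{(y-v)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,du\,dv\,dw\right]\mathbf{j}$

$+ \left[\frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{(z-w)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,du\,dv\,dw\right]\mathbf{k}.$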

Observe that each of the three integrals on the right side of (10.1.9) is a standard three dimensional

integral, in which we integrate with respect to the space variables u, v and w, while keeping x, y

and z fixed.

We shall now establish that the electric field EEE at (10.1.9) is conservative (recall Definition

6.2.1). To this end define

(10.1.10) $\Psi(x,y,z) := -\frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\,du\,dv\,dw$,

for all (x, y, z) in R3, and observe (easy calculus!) that

(10.1.11)

$\frac{\partial}{\partial x}\left[\frac{\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\right] = -\frac{(x-u)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}$,

$\frac{\partial}{\partial y}\left[\frac{\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\right] = -\frac{(y-v)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}$,

$\frac{\partial}{\partial z}\left[\frac{\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\right] = -\frac{(z-w)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}$,

for all (x, y, z), in which we have kept u, v and w constant when evaluating the partial derivatives.

Recalling the gradient operator (see Definition 6.1.1) we then have

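(10.1.12) $\nabla\Psi(x,y,z) = \frac{\partial\Psi}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial\Psi}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial\Psi}{\partial z}(x,y,z)\,\mathbf{k}.$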

Now from (10.1.10)

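(10.1.13)

$\frac{\partial\Psi}{\partial x}(x,y,z) = -\frac{1}{4\pi\varepsilon_0}\,\frac{\partial}{\partial x}\int_{\mathbb{R}^3} \frac{\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\,du\,dv\,dw$

$= -\frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{\partial}{\partial x}\left[\frac{\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\right]du\,dv\,dw$

$= \frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{(x-u)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,du\,dv\,dw$ (from (10.1.11)).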

In exactly the same way

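(10.1.14)

$\frac{\partial\Psi}{\partial y}(x,y,z) = \frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{(y-v)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,du\,dv\,dw$,

$\frac{\partial\Psi}{\partial z}(x,y,z) = \frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3} \frac{(z-w)\rho(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,du\,dv\,dw$,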

where we have used the second and third relations of (10.1.11). Upon combining (10.1.14), (10.1.13),

(10.1.12) and (10.1.9) we find

(10.1.15) EEE(x, y, z) = (∇Ψ)(x, y, z) for all (x, y, z) in R3,

which we write with variables suppressed as

(10.1.16) EEE = ∇Ψ.

Remark 10.1.2. In Example 6.2.3 we saw that the electric field due to a point charge Q is conser-

vative. The result (10.1.16) tells us that an electric field arising from charge diffusely distributed

through space according to a specified charge density field ρ is also conservative with a potential

function given by (10.1.10).
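The relation between (10.1.9) and (10.1.10) can be illustrated numerically. The Python sketch below replaces the integrals over $\mathbb{R}^3$ by Riemann sums over a uniformly charged cube and compares the summed field with a finite-difference gradient of the summed potential. The cube, its charge density, the grid size and all function names are hypothetical choices for this sketch, not part of the notes.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, from (10.1.3)
K = 1.0 / (4.0 * math.pi * EPS0)

def charge_cells(n=6, side=1.0, density=1e-9):
    """Midpoint cells (u, v, w, dQ) of a uniformly charged cube centred at the origin."""
    h = side / n
    mids = [-side / 2.0 + (i + 0.5) * h for i in range(n)]
    dq = density * h**3  # dQ = rho du dv dw, as in (10.1.5)
    return [(u, v, w, dq) for u in mids for v in mids for w in mids]

def potential(p, cells):
    """Riemann sum for (10.1.10), including the minus sign used in these notes."""
    x, y, z = p
    return -K * sum(dq / math.sqrt((x - u)**2 + (y - v)**2 + (z - w)**2)
                    for u, v, w, dq in cells)

def field(p, cells):
    """Riemann sum for the three integrals in (10.1.9)."""
    x, y, z = p
    ex = ey = ez = 0.0
    for u, v, w, dq in cells:
        r3 = ((x - u)**2 + (y - v)**2 + (z - w)**2) ** 1.5
        ex += (x - u) * dq / r3
        ey += (y - v) * dq / r3
        ez += (z - w) * dq / r3
    return (K * ex, K * ey, K * ez)

def grad_potential(p, cells, h=1e-4):
    """Central-difference gradient of the discretised potential."""
    x, y, z = p
    return (
        (potential((x + h, y, z), cells) - potential((x - h, y, z), cells)) / (2 * h),
        (potential((x, y + h, z), cells) - potential((x, y - h, z), cells)) / (2 * h),
        (potential((x, y, z + h), cells) - potential((x, y, z - h), cells)) / (2 * h),
    )

cells = charge_cells()
p = (2.0, 1.0, 0.5)  # a field point outside the cube
E = field(p, cells)
grad_Psi = grad_potential(p, cells)
```

The two tuples `E` and `grad_Psi` agree to within finite-difference error, in line with (10.1.16). The field point is taken outside the cube so that no cell midpoint coincides with it.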

Remark 10.1.3. From (10.1.16) and Theorem 9.1.13 we immediately obtain the important identity

(10.1.17) ∇×EEE = 0,

which tells us that the electric field EEE caused by a diffuse distribution of charge with charge density

ρ is irrotational (see Remark 9.1.12).


There is an alternative way of stating Coulomb’s law which is extremely important:

Law 10.1.4 (Gauss’ law for static electric fields). Suppose that EEE is the electric field arising from

the charge density ρ (see (10.1.8)). Then, for any region Ω ⊂ R3 with the closed outwardly oriented

surface S as boundary (recall Remark 9.3.1), we have

(10.1.18) $\int_S \mathbf{E}\cdot d\mathbf{A} = \frac{1}{\varepsilon_0}\int_\Omega \rho\,dV$,

in which ε0 is given by (10.1.3).

Remark 10.1.5. We recognize the integral on the right of (10.1.18) as giving the total charge

contained within the region Ω. There is of course plenty of charge in space outside the region Ω,

but according to Gauss’ Law 10.1.4 the total flux of the electric field EEE through S has nothing

whatever to do with this “outside” charge and is determined only by the charge inside Ω! The

basic experimental fact leading to Gauss’ Law 10.1.4 is Coulomb’s law in the form of the statement

(10.1.9), and Gauss’ law is really just Coulomb’s law but stated in more esoteric mathematical

language (involving surface integrals!). The reason we prefer the esoteric statement at (10.1.18) to

the more down-to-earth inverse square law of Coulomb given by (10.1.9) is that (10.1.18) is often

much easier to use than the inverse square law, for both theoretical investigation and practical

applications. The reason for this is that Gauss’ law in the form (10.1.18) is very well suited to application

of the tools of vector calculus, whereas Coulomb’s law in the form (10.1.9) is not.
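The claim that only the enclosed charge matters can be seen in a small numerical experiment. The Python sketch below uses a single point charge as a stand-in for the charge density (an assumption made to keep the example short) and estimates the flux of its field (10.1.4) through the unit sphere centred at the origin, once with the charge inside the sphere and once with it outside. The function name and the sample positions are hypothetical.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, from (10.1.3)

def flux_through_sphere(Q, source, radius=1.0, n=200):
    """Midpoint-rule flux of the point-charge field (10.1.4) through a sphere centred at the origin."""
    su, sv, sw = source
    dtheta, dphi = 2.0 * math.pi / n, math.pi / n
    total = 0.0
    for a in range(n):
        phi = (a + 0.5) * dphi
        for b in range(n):
            theta = (b + 0.5) * dtheta
            # outward unit normal and the corresponding surface point
            nx = math.sin(phi) * math.cos(theta)
            ny = math.sin(phi) * math.sin(theta)
            nz = math.cos(phi)
            dx, dy, dz = radius * nx - su, radius * ny - sv, radius * nz - sw
            r3 = (dx * dx + dy * dy + dz * dz) ** 1.5
            e_dot_n = Q * (dx * nx + dy * ny + dz * nz) / (4.0 * math.pi * EPS0 * r3)
            total += e_dot_n * radius**2 * math.sin(phi) * dtheta * dphi
    return total

Q = 1e-9
flux_inside = flux_through_sphere(Q, (0.3, -0.2, 0.4))   # charge inside the sphere
flux_outside = flux_through_sphere(Q, (2.0, 0.0, 0.0))   # charge outside the sphere
```

With the charge inside, `flux_inside` is close to $Q/\varepsilon_0$ even though the charge is off-center; with the charge outside, `flux_outside` is close to zero, as (10.1.18) predicts.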

Remark 10.1.6. Gauss’ law 10.1.4 is a global statement since it gives an aggregate or net property

of the electric field EEE in the form of the surface integral at (10.1.18). We are now going to use the

divergence Theorem 9.3.2 to rewrite Gauss’ law in a local or pointwise or differential form which

says something about EEE(x, y, z) at every individual point (x, y, z). To this end we use Theorem

9.3.2 (in the form of (9.3.123), with EEE in place of the generic vector field FFF ) to get

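(10.1.19) $\int_\Omega (\nabla\cdot\mathbf{E})\,dV = \int_S \mathbf{E}\cdot d\mathbf{A}$,

and upon combining (10.1.19) and (10.1.18) we find

$\int_\Omega (\nabla\cdot\mathbf{E})\,dV = \frac{1}{\varepsilon_0}\int_\Omega \rho\,dV$,

that is

(10.1.20) $\int_\Omega \left[(\nabla\cdot\mathbf{E}) - \frac{1}{\varepsilon_0}\rho\right]dV = 0.$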

Now (10.1.20) holds for each and every region Ω ⊂ R3, so we can apply Theorem 9.4.1 with

$g(x,y,z) = (\nabla\cdot\mathbf{E})(x,y,z) - \frac{1}{\varepsilon_0}\rho(x,y,z)$,

to conclude that

(10.1.21) $\nabla\cdot\mathbf{E} = \frac{1}{\varepsilon_0}\rho$,

(in which we have suppressed the underlying variable (x, y, z)!). The statement at (10.1.21) may

be regarded as the local or pointwise or differential version of the global version of Gauss’ law for

static electric fields at (10.1.18). This relation will be of immense value when we study Maxwell’s

equations (in fact, it is one of Maxwell’s equations!). The preceding derivation of (10.1.21) illustrates

just how useful the seemingly esoteric statement at (10.1.18) can be; it would have been effectively

impossible to derive (10.1.21) directly on the basis of Coulomb’s inverse square law, even though

this law is (physically) completely equivalent to Gauss’ law 10.1.4.

Remark 10.1.7. The following question arises: Given a charge density field ρ how does one deter-

mine the electric field EEE caused by the charge density field? Of course Coulomb’s law (10.1.9) in

principle gives EEE(x, y, z) at each point (x, y, z) by direct integration. However, the integrals on the

right side of (10.1.9) are usually difficult to evaluate, so this is not a very practical way to determine

the EEE-field. The way around this obstacle is to recall from Remark 10.1.2 that EEE is conservative,

that is (see (10.1.16))

(10.1.22) EEE = ∇Ψ for a scalar potential field Ψ : R3 → R.

We are now going to use the local form of Gauss’ law, given by (10.1.21), to show that the scalar

potential Ψ necessarily satisfies a partial differential equation known as Poisson’s equation. To this

end we first take the divergence of each side of (10.1.22), that is

(10.1.23) ∇ ·EEE = ∇ · (∇Ψ).

From Theorem 9.1.21 we have

(10.1.24) $\nabla\cdot(\nabla\Psi) = \nabla^2\Psi$,

so that (10.1.24) and (10.1.23) give

(10.1.25) $\nabla\cdot\mathbf{E} = \nabla^2\Psi.$


Upon combining (10.1.25) and the local form of Gauss’ law (10.1.21) we get

(10.1.26) $\nabla^2\Psi = \frac{1}{\varepsilon_0}\rho.$

Recalling the Laplacian operator (see Definition 9.1.18) we see that (10.1.26) can be written explic-

itly in terms of second partial derivatives as

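(10.1.27) $\frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} + \frac{\partial^2\Psi}{\partial z^2} = \frac{1}{\varepsilon_0}\rho$,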

in which, as usual, we have suppressed the basic variable (x, y, z). The nice thing about the partial

differential equation (10.1.27) (that is (10.1.26)) is that there are available extremely powerful

numerical methods for determining functions Ψ which satisfy this equation when one is given the

charge density field ρ. Having determined this Ψ we then easily obtain the electric field from

(10.1.22) by calculating the x, y and z-partial derivatives of Ψ. This is a much more competitive

and feasible approach for calculating EEE than by direct evaluation of the integrals in (10.1.9).
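To give a flavour of such numerical methods, here is a minimal Python sketch of one of the simplest: Gauss-Seidel iteration on the standard 7-point finite-difference form of the Poisson equation. It is a sketch under stated assumptions, not the method of these notes: the problem is posed on the unit cube with the unknown set to zero on the boundary (the notes work on all of $\mathbb{R}^3$), and the right-hand side is a manufactured one, standing in for $\rho/\varepsilon_0$, chosen so that the exact solution is known.

```python
import math

def solve_poisson_dirichlet(source, n, sweeps=300):
    """Gauss-Seidel sweeps for the 7-point discretisation of  laplacian(f) = source
    on the unit cube, with f = 0 on the boundary (an assumption of this sketch)."""
    h = 1.0 / n
    f = [[[0.0] * (n + 1) for _ in range(n + 1)] for _ in range(n + 1)]
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                for k in range(1, n):
                    neighbours = (f[i + 1][j][k] + f[i - 1][j][k]
                                  + f[i][j + 1][k] + f[i][j - 1][k]
                                  + f[i][j][k + 1] + f[i][j][k - 1])
                    # (neighbours - 6 f) / h^2 = source, solved for f at this node
                    f[i][j][k] = (neighbours - h * h * source(i * h, j * h, k * h)) / 6.0
    return f

def exact(x, y, z):
    # manufactured solution, vanishing on the boundary of the unit cube
    return math.sin(math.pi * x) * math.sin(math.pi * y) * math.sin(math.pi * z)

def source(x, y, z):
    # laplacian of `exact`
    return -3.0 * math.pi**2 * exact(x, y, z)

n = 8
f = solve_poisson_dirichlet(source, n)
max_err = max(abs(f[i][j][k] - exact(i / n, j / n, k / n))
              for i in range(n + 1) for j in range(n + 1) for k in range(n + 1))
```

Even on this coarse grid `max_err` is a small fraction of the peak value of the solution, and it decreases as `n` grows. Once the potential is available on the grid, the field follows from (10.1.22) by differencing neighbouring grid values.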

Remark 10.1.8. The relation (10.1.26) (equivalently (10.1.27)) is a particular instance of a partial

differential equation, called the Poisson equation, which can be formulated in general terms as

(10.1.28) $\nabla^2 f = \psi$,

or equivalently, in less codified terms, as

(10.1.29) $\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = \psi$,

in which ψ : R3 → R is a given or known scalar field, and our goal is to determine some function

f : R3 → R which satisfies (10.1.28) (equivalently (10.1.29)). Poisson equations are of central im-

portance and occur all over physics and engineering. We have just seen in Remark 10.1.7 how any

potential function Ψ of a time constant electric field EEE arising from a given time constant charge

density ρ satisfies a Poisson equation, and we shall see later that Poisson equations arise very natu-

rally in connection with magnetic fields as well. Furthermore, one also encounters Poisson equations

in fluid dynamics and aerodynamics, thermodynamics, elasticity theory, quantum mechanics and

gravitational physics, to mention just a few areas where this equation occurs. It is easy, although

tedious, to verify by direct substitution that a function f which satisfies (10.1.28) is given by

(10.1.30) $f(x,y,z) = -\frac{1}{4\pi}\int_{\mathbb{R}^3} \frac{\psi(u,v,w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\,du\,dv\,dw$,

for all (x, y, z) in R3.


10.2 Static Magnetic Fields

In this section we state the basic experimentally determined laws concerning static magnetic fields.

Just as the basic entity giving rise to a static electric field EEE in the previous Section 10.1 is a static

charge density scalar field ρ, so the basic entity giving rise to a static magnetic field BBB is a static

current density vector field JJJ of the kind introduced in Example 3.1.6, and throughout this section

we assume given a current density vector field JJJ , defined for simplicity on the domain D := R3 (in

the notation of Example 3.1.6), and static in the sense that at each (x, y, z) the current density

JJJ(x, y, z) is constant with respect to time t.

In the same way that the basic experimental fact concerning static electric fields is Coulomb’s

law, so the basic experimental fact concerning static magnetic fields is the Biot-Savart law. This

law states that a time constant or static current density vector field JJJ causes a time constant or

static magnetic field BBB, and that BBB is given in terms of JJJ by the integral

(10.2.31) $\mathbf{B}(x,y,z) = \frac{\mu_0}{4\pi}\int_{\mathbb{R}^3} \mathbf{J}(u,v,w)\times\frac{(x-u)\,\mathbf{i} + (y-v)\,\mathbf{j} + (z-w)\,\mathbf{k}}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\,du\,dv\,dw$,

for each (x, y, z) in R3, in which µ0 is a constant called the magnetic permeability of free space with

the value

(10.2.32) $\mu_0 = 4\pi\times 10^{-7}$ henry/meter.

Remark 10.2.1. There are definite similarities between the Biot-Savart law (10.2.31) and Coulomb’s

law in the form of (10.1.8). The denominators in both integrands are identical, indicating that

both of the laws are “inverse square” laws. In each law there is an “input” or “cause” in the inte-

grand on the right side; this is the charge density scalar field ρ in Coulomb’s law (10.1.8), and the

current density vector field JJJ in the Biot-Savart law (10.2.31), and each law involves integration

over R3. On the other hand there is also one huge difference between the two laws: in (10.1.8) at

each (u, v, w) in R3 the vector (x−u)iii+(y−v)jjj+(z−w)kkk is multiplied by the scalar charge density

ρ(u, v, w) causing the electric field, whereas in (10.2.31) the same vector (x−u)iii+(y−v)jjj+(z−w)kkk

is cross-multiplied by the vector current density JJJ(u, v, w) causing the magnetic field. We could

easily calculate the cross product in (10.2.31) and expand BBB(x, y, z) in vector form but shall not

do this. Indeed, the value of the Biot-Savart law (10.2.31) lies in the fact that it lends itself, after

a lengthy and rather complicated mathematical analysis that we shall certainly not give here, to

restatement in the form of two laws, namely Ampere’s circuital law (which we previewed at Remark

8.5.7) and Gauss’ law for magnetic fields (in much the same way that Coulomb’s law (10.1.8) lends


itself to restatement in the form of Gauss’ Law 10.1.4 for electrostatic fields). We state these laws

next.
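Before stating these laws, a small numerical illustration of the Biot-Savart integral (10.2.31) may help. The Python sketch below treats a thin straight wire on the $z$-axis carrying a current $I$, so that $\mathbf{J}\,du\,dv\,dw$ is replaced by $I\,dw\,\mathbf{k}$ along the wire. This thin-wire idealisation, the wire length, and the function name are assumptions of the sketch and are not taken from the notes.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space, from (10.2.32)

def wire_field(current, half_length, point, n=20000):
    """Riemann sum of (10.2.31) for a thin straight wire on the z-axis,
    with J du dv dw replaced by (current * dw) k (thin-wire assumption)."""
    x, y, z = point
    dw = 2.0 * half_length / n
    bx = by = 0.0
    for m in range(n):
        w = -half_length + (m + 0.5) * dw
        r3 = (x * x + y * y + (z - w)**2) ** 1.5
        # k x (x i + y j + (z - w) k) = -y i + x j
        bx += -y * dw / r3
        by += x * dw / r3
    c = MU0 * current / (4.0 * math.pi)
    return (c * bx, c * by, 0.0)

# Hypothetical example: 1 amp in a 200 m wire, field 1 m from its midpoint.
B = wire_field(1.0, 100.0, (1.0, 0.0, 0.0))
```

For a wire much longer than the distance $d$ to the field point, the result approaches the familiar long-wire value $\mu_0 I/(2\pi d)$, directed around the wire. Here that means `B[1]` is close to $2\times 10^{-7}$ tesla and the other components vanish.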

Law 10.2.2 (Ampere’s circuital law and Gauss’ law for static magnetic fields). Suppose that BBB is

the static magnetic field arising from the static current density JJJ (see (10.2.31)). Then, for each

finite open surface S with boundary curve Γ, we have

(10.2.33) $\int_\Gamma \mathbf{B}\cdot d\mathbf{r} = \mu_0\int_S \mathbf{J}\cdot d\mathbf{A}$,

and for each closed surface $S$ we have

(10.2.34) $\int_S \mathbf{B}\cdot d\mathbf{A} = 0.$

Remark 10.2.3. We have already seen the statement (10.2.33), known as Ampere’s circuital law

(recall Remark 8.5.7). Observe that (10.2.34) is a statement about the flux of the magnetic field

through a closed surface, much like Gauss’ Law (10.1.18) is a statement about the flux of an electric

field through a closed surface. For this reason the assertion (10.2.34) is called Gauss’ law for magnetic

fields.

Remark 10.2.4. The relations (10.2.33) and (10.2.34) are global statements about the magnetic

field BBB stated in terms of line integrals and surface integrals of BBB. Exactly as we wrote the global

form (10.1.18) of Gauss’ law of electrostatics in the local form (10.1.21), we are now going to write

the laws (10.2.33) and (10.2.34) in local form. Using the divergence theorem and just repeating the

steps (10.1.19) to (10.1.21) (with BBB in place of EEE and zero in place of ρ) we get the local form of

(10.2.34) namely

(10.2.35) ∇ ·BBB = 0.

We are now going to establish the local form of Ampere’s circuital law (10.2.33), and for this we

shall need Stokes Theorem 9.2.2. Indeed, using Stokes’ theorem in the form of (9.2.67), with BBB in

place of the generic vector field FFF , we have

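(10.2.36) $\int_\Gamma \mathbf{B}\cdot d\mathbf{r} = \int_S (\nabla\times\mathbf{B})\cdot d\mathbf{A}$,

and with (10.2.36) we can write Ampere’s circuital law (10.2.33) as

$\int_S (\nabla\times\mathbf{B})\cdot d\mathbf{A} = \mu_0\int_S \mathbf{J}\cdot d\mathbf{A}$,

or

(10.2.37) $\int_S [(\nabla\times\mathbf{B}) - \mu_0\mathbf{J}]\cdot d\mathbf{A} = 0.$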

From Law 10.2.2 we know that (10.2.33) holds for each and every finite open surface S, and

therefore (10.2.37) must also hold for each and every finite open surface S. This is hugely important,

for it allows us to use the following “surface integral” analog of Theorem 9.4.1:

Theorem 10.2.5 (du Bois Reymond mk.II). Suppose the continuous vector field $\mathbf{G} : \mathbb{R}^3 \to \mathbb{R}^3$ is such that

(10.2.38) $\int_S \mathbf{G}\cdot d\mathbf{A} = 0$

for each and every finite open surface S in R3. Then GGG(x, y, z) = 0 for all (x, y, z) in R3.

From (10.2.37) and Theorem 10.2.5 (with $\mathbf{G} := (\nabla\times\mathbf{B}) - \mu_0\mathbf{J}$) we obtain

(10.2.39) $\nabla\times\mathbf{B} = \mu_0\mathbf{J}.$

The relation (10.2.39) is the local form of Ampere’s circuital law (10.2.33), in much the same way

that (10.2.35) is the local form of Gauss’ law (10.2.34) for magnetic fields, and (10.1.21) is the local

form of Gauss’ law (10.1.18) for electric fields. In particular, (10.2.39) tells us that the curl (or

“rotation” or “turning” or “vorticity”) of the magnetic field BBB at each (x, y, z) in R3 is directly

proportional to the vector JJJ(x, y, z) (recall the intuitive significance of curl discussed at Remark

9.1.11 and Remark 9.2.7). We see from (10.2.39) that in the nontrivial case where JJJ is not identically

zero (so that $\mathbf{J}(x,y,z) \neq 0$ for some $(x,y,z)$ in $\mathbb{R}^3$), then $\nabla\times\mathbf{B}$ also cannot be identically zero; it

then follows from the equivalence of (a) and (c) in Theorem 9.1.9 that the static magnetic field BBB

arising from a static current density JJJ cannot possibly be a conservative vector field. This is clearly

very different from a static electric field EEE arising from a static charge density ρ, which is always

conservative, as we have seen at Remark 10.1.2. It follows that it is generally not possible to write

BBB as the gradient of some scalar potential function, that is we generally do not have

(10.2.40) BBB = ∇Ψ

for a scalar potential function Ψ : R3 → R (recall Definition 6.2.1). This lack of a scalar potential

function makes magnetic fields intrinsically more difficult to deal with than electric fields. It turns

out that we can, nevertheless, always write BBB as the curl of another vector field, and this provides

some partial compensation for not having a scalar potential function at our disposal. The essence

of the matter is discussed in the next few remarks.


Remark 10.2.6. For the moment forget about magnetic fields and consider a general C1-vector

field FFF : R3 → R3. If it is true that FFF = ∇×GGG for some C1-vector field GGG : R3 → R3 then we know

from (the very elementary) Theorem 9.1.14 that ∇ ·FFF = ∇ · (∇×GGG) = 0, that is

(10.2.41) FFF = ∇×GGG for some vector field GGG ⇒ ∇ ·FFF = 0.

Is the converse of (10.2.41) also true? That is, if we know that ∇ ·FFF = 0 then is it necessarily the

case that FFF = ∇ ×GGG for some vector field GGG? This converse is far from obvious, and is in fact

decidedly difficult to establish, but nevertheless is true, according to the following very profound

theorem which we make no attempt to establish here:

Theorem 10.2.7 (Poincare). Suppose that FFF : R3 → R3 is a C1-vector field. If (∇·FFF )(x, y, z) = 0

for all (x, y, z) in R3 then FFF is necessarily given by

(10.2.42) FFF (x, y, z) = (∇×GGG)(x, y, z) for all (x, y, z) in R3,

for some C1-vector field GGG : R3 → R3.

Remark 10.2.8. If a vector field FFF : R3 → R3 is given by the curl of a vector field GGG : R3 → R3,

that is

(10.2.43) FFF (x, y, z) = (∇×GGG)(x, y, z), for all (x, y, z) in R3,

or more briefly

(10.2.44) FFF = ∇×GGG,

(with the variables (x, y, z) suppressed), then the vector field GGG is called a vector potential of the

vector field FFF , and correspondingly we say that the vector field FFF has a vector potential GGG. If the

vector field FFF has a vector potential GGG, that is FFF and GGG are related by (10.2.44), then FFF in fact has

infinitely many vector potentials. To see this put

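(10.2.45) $\tilde{\mathbf{G}} := \mathbf{G} + \nabla g$

for any $C^1$-function $g : \mathbb{R}^3 \to \mathbb{R}$. From Theorem 9.1.13 we know

(10.2.46) $\nabla\times(\nabla g) = 0.$

Then

$\nabla\times\tilde{\mathbf{G}} = \nabla\times(\mathbf{G} + \nabla g)$ (from (10.2.45))

$= \nabla\times\mathbf{G} + \nabla\times(\nabla g)$ (from (9.1.61))

$= \nabla\times\mathbf{G} + 0$ (from (10.2.46))

$= \mathbf{F}$ (from (10.2.44)),

that is

(10.2.47) $\mathbf{F} = \nabla\times\tilde{\mathbf{G}}.$

In short, if $\mathbf{G}$ is a vector potential of $\mathbf{F}$ then $\tilde{\mathbf{G}}$ defined by (10.2.45) for any $C^1$-function $g : \mathbb{R}^3 \to \mathbb{R}$

is also a vector potential of $\mathbf{F}$, that is $\mathbf{F}$ has infinitely many vector potentials.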
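The non-uniqueness of the vector potential is easy to see numerically. In the Python sketch below the fields `G` and `g` are hypothetical examples chosen for illustration; the curl is estimated by central differences, and adding the gradient of `g` to `G` leaves the computed curl unchanged up to finite-difference error.

```python
import math

def G(x, y, z):
    # hypothetical vector potential; its curl is the constant field (0, 0, 1)
    return (-0.5 * y, 0.5 * x, 0.0)

def grad_g(x, y, z):
    # analytic gradient of the hypothetical scalar g = x y z + sin(x) cos(z)
    return (y * z + math.cos(x) * math.cos(z),
            x * z,
            x * y - math.sin(x) * math.sin(z))

def G_tilde(x, y, z):
    # the shifted potential G + grad g of (10.2.45)
    return tuple(a + b for a, b in zip(G(x, y, z), grad_g(x, y, z)))

def curl(F, x, y, z, h=1e-3):
    """Central-difference estimate of the curl of the vector field F."""
    dF_dx = [(p - m) / (2 * h) for p, m in zip(F(x + h, y, z), F(x - h, y, z))]
    dF_dy = [(p - m) / (2 * h) for p, m in zip(F(x, y + h, z), F(x, y - h, z))]
    dF_dz = [(p - m) / (2 * h) for p, m in zip(F(x, y, z + h), F(x, y, z - h))]
    return (dF_dy[2] - dF_dz[1], dF_dz[0] - dF_dx[2], dF_dx[1] - dF_dy[0])

point = (0.7, -0.4, 1.1)
curl_G = curl(G, *point)
curl_G_tilde = curl(G_tilde, *point)
```

Both `curl_G` and `curl_G_tilde` are close to (0, 0, 1), so `G` and `G_tilde` are vector potentials of the same field, as (10.2.47) asserts.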

Remark 10.2.9. From Theorem 10.2.7 we see that a solenoidal vector field FFF (i.e. ∇ · FFF = 0,

recall Remark 9.1.5) always has some vector potential GGG. In accordance with Remark 10.2.8 the

solenoidal vector field $\mathbf{F}$ then has infinitely many vector potentials $\tilde{\mathbf{G}}$ of the form

$\tilde{\mathbf{G}} = \mathbf{G} + \nabla g$,

corresponding to every C1-scalar field g : R3 → R.

Remark 10.2.10. One sees from (10.2.35) that the magnetic field BBB is solenoidal, so that we can

apply Theorem 10.2.7 to conclude that

(10.2.48) BBB = ∇×AAA for some vector field AAA : R3 → R3.

Any vector field $\mathbf{A}$ which satisfies (10.2.48) is a magnetic vector potential of the magnetic field $\mathbf{B}$. We know from Remark 10.2.8 that, if $\mathbf{A}$ is a magnetic vector potential of $\mathbf{B}$, then for every $C^1$-function $g : \mathbb{R}^3 \to \mathbb{R}$, the vector field

$$\tilde{\mathbf{A}} := \mathbf{A} + \nabla g \tag{10.2.49}$$

is also a magnetic vector potential of $\mathbf{B}$.

Remark 10.2.11. In Remark 10.1.7 we addressed the question of how to determine an electric

field EEE in terms of the (known) charge density field ρ causing the electric field. Here we consider an

analogous question for static magnetic fields: if we know the static current density field JJJ how can

we determine the static magnetic field BBB in terms of JJJ? In principle the Biot-Savart law (10.2.31)


gives the answer, but (much as with the direct application of Coulomb’s law noted in Remark

10.1.7) the integrals appearing in (10.2.31) are usually difficult to evaluate. In Remark 10.1.7 we

saw that the calculation of an electric field EEE in terms of the known charge density field ρ can be

reduced to the solution of a Poisson equation giving a potential function Ψ of the electric field. This

is a very satisfactory state of affairs because we understand clearly how to solve Poisson equations

(typically by numerical analysis or numerical methods). In the present remark we shall see that,

rather similarly, the calculation of the magnetic field BBB in terms of JJJ can be reduced to the solution

of three Poisson equations, each equation giving a scalar component of a magnetic vector potential

AAA of the magnetic field BBB. It will soon become clear that the path we must follow in attaining this

goal is a good deal longer and more complicated than the rather simple analysis of Remark 10.1.7.

This just reflects the fact that magnetic fields, not being conservative, are a lot more challenging to

deal with than electric fields. Gauss’ law and Ampere’s law for magnetic fields in local form

state that a time constant current density JJJ causes a time constant magnetic field BBB which satisfies

the relations

(10.2.50) ∇ ·BBB = 0, ∇×BBB = µ0JJJ.

(see (10.2.35) and (10.2.39)). Determining the magnetic field BBB therefore means that we must “solve

for” or “extract” BBB in terms of the known current density JJJ using the relations at (10.2.50). There

is a very powerful theorem of Helmholtz (or von Helmholtz) which says that one can indeed “solve”

for BBB in terms of JJJ from the relations at (10.2.50) and even provides a formula which determines

BBB. However, the Helmholtz theorem is an advanced result which is rather above the level of this

introductory course. Instead, we shall use the notion of magnetic vector potential in Remark 10.2.10

and a direct argument to see how to “solve” (10.2.50) for the vector field BBB. In the course of this

we shall introduce the very clever idea of a gauge transformation. In view of Remark 10.2.10 we

can write BBB in the form

(10.2.51) BBB = ∇×AAA for a magnetic vector potential AAA : R3 → R3,

and we know from Remark 10.2.10 that there are actually infinitely many such vector potentials $\mathbf{A}$. It turns out that we can use the fact that $\mathbf{B}$ has many magnetic vector potentials to find a particularly nice vector potential $\tilde{\mathbf{A}}$ of $\mathbf{B}$ with the further property that it is solenoidal, i.e. $\nabla \cdot \tilde{\mathbf{A}} = 0$; that is, we can actually show the following:

$$\mathbf{B} = \nabla \times \tilde{\mathbf{A}} \quad \text{for some vector potential } \tilde{\mathbf{A}} : \mathbb{R}^3 \to \mathbb{R}^3 \text{ such that } \nabla \cdot \tilde{\mathbf{A}} = 0. \tag{10.2.52}$$


For the moment we by-pass the question of how to establish (10.2.52) and instead concentrate on

how to “solve” the relations (10.2.50) for BBB in terms of JJJ assuming that (10.2.52) holds.

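We have

$$\begin{aligned}
\nabla \times \mathbf{B} &= \nabla \times (\nabla \times \tilde{\mathbf{A}}) && \text{(from (10.2.52))} \\
&= \nabla(\nabla \cdot \tilde{\mathbf{A}}) - \nabla^2 \tilde{\mathbf{A}} && \text{(from the identity (9.1.62))},
\end{aligned}$$

that is

$$\nabla \times \mathbf{B} = \nabla(\nabla \cdot \tilde{\mathbf{A}}) - \nabla^2 \tilde{\mathbf{A}}. \tag{10.2.53}$$

From (10.2.52) we also have $\nabla \cdot \tilde{\mathbf{A}} = 0$, and therefore of course

$$\nabla(\nabla \cdot \tilde{\mathbf{A}}) = 0. \tag{10.2.54}$$

In view of (10.2.54) and (10.2.53) we get

$$\nabla \times \mathbf{B} = -\nabla^2 \tilde{\mathbf{A}}, \tag{10.2.55}$$

and combining (10.2.55) with Ampere’s law (the second relation of (10.2.50)) we find

$$-\nabla^2 \tilde{\mathbf{A}} = \mu_0 \mathbf{J}. \tag{10.2.56}$$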

Now (10.2.56) is a very nice relation indeed. In fact, recalling the componentwise expansion of the vector fields $\mathbf{J}$ and $\tilde{\mathbf{A}}$, namely

$$\mathbf{J} = J_1\mathbf{i} + J_2\mathbf{j} + J_3\mathbf{k}, \qquad \tilde{\mathbf{A}} = \tilde{A}_1\mathbf{i} + \tilde{A}_2\mathbf{j} + \tilde{A}_3\mathbf{k}, \tag{10.2.57}$$

and recalling the componentwise expansion of $\nabla^2 \tilde{\mathbf{A}}$ from Definition 9.1.19, that is

$$\nabla^2 \tilde{\mathbf{A}} = (\nabla^2 \tilde{A}_1)\mathbf{i} + (\nabla^2 \tilde{A}_2)\mathbf{j} + (\nabla^2 \tilde{A}_3)\mathbf{k}, \tag{10.2.58}$$

(as follows from (9.1.31) with $\tilde{\mathbf{A}}$ in place of $\mathbf{F}$), we can equate the scalar components in the vector relation (10.2.56) to get the scalar relations

$$\nabla^2 \tilde{A}_i = -\mu_0 J_i, \tag{10.2.59}$$

for $i = 1, 2, 3$. Now each (10.2.59) is a scalar Poisson equation of the form (10.1.28), for which we already know the solution (see (10.1.30)). Indeed, just matching (10.2.59) with (10.1.28) we see from (10.1.30) that each $\tilde{A}_i$ is given by

$$\tilde{A}_i(x, y, z) = \frac{\mu_0}{4\pi} \int_{\mathbb{R}^3} \frac{J_i(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\, du\, dv\, dw, \tag{10.2.60}$$

for all $(x, y, z)$ in $\mathbb{R}^3$, which gives each $\tilde{A}_i$ in terms of the (known) current density components $J_i$. We have therefore obtained the vector field $\tilde{\mathbf{A}}$, and using this we can now immediately determine $\mathbf{B}$ by calculating the curl of $\tilde{\mathbf{A}}$ as at (10.2.52). In this way we have determined the magnetic field $\mathbf{B}$ which satisfies the relations (10.2.50).
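To give a feel for how the integral (10.2.60) might be evaluated in practice, here is a minimal Python sketch that approximates it by a midpoint Riemann sum. Everything specific in it is an assumption made for illustration: the function name `potential_component`, the SI value used for `MU0`, and the test source (a uniform current density component `J0` confined to a small cube, which is not a physically closed current distribution). Far from such a small source the integral should be close to the point-source estimate, namely $\mu_0/(4\pi)$ times `J0` times the cube volume divided by the distance.

```python
import math

MU0 = 4e-7 * math.pi  # assumed SI value of mu_0 in henries per metre

def potential_component(J_i, point, half_width, n):
    """Midpoint-rule approximation of (10.2.60) for a current density
    component J_i that vanishes outside the cube [-half_width, half_width]^3."""
    x, y, z = point
    h = 2 * half_width / n  # side length of each small integration cell
    total = 0.0
    for a in range(n):
        u = -half_width + (a + 0.5) * h
        for b in range(n):
            v = -half_width + (b + 0.5) * h
            for c in range(n):
                w = -half_width + (c + 0.5) * h
                dist = math.sqrt((x - u) ** 2 + (y - v) ** 2 + (z - w) ** 2)
                total += J_i(u, v, w) / dist * h ** 3
    return MU0 / (4 * math.pi) * total

J0 = 1.0e6      # hypothetical current density component, amperes per square metre
A_SIDE = 0.01   # hypothetical half-width of the source cube, metres

def J3(u, v, w):
    return J0   # uniform inside the integration cube

A3 = potential_component(J3, (1.0, 0.0, 0.0), A_SIDE, 10)
far_field = MU0 / (4 * math.pi) * J0 * (2 * A_SIDE) ** 3 / 1.0
rel_err = abs(A3 - far_field) / far_field
print(A3, far_field, rel_err)
```

The evaluation point must lie outside the source cube here, since the sketch makes no attempt to handle the singularity of the integrand at coincident points.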

It remains to establish that (10.2.52) actually holds. To this end we fix any arbitrary magnetic

vector potential AAA of the magnetic field BBB (recall (10.2.51)) so that

(10.2.61) BBB = ∇×AAA.

With this arbitrary choice of magnetic vector potential AAA let f be any scalar field which satisfies

the relation

(10.2.62) ∇2f = −∇ ·AAA,

in which, of course, AAA is the vector field that we have just fixed (the relation (10.2.62) is of course

just a Poisson equation of the form (10.1.28), although we will not actually need to solve this

equation for f , as we will soon see). Using the arbitrarily chosen magnetic vector potential AAA,

together with the f satisfying (10.2.62), define

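$$\tilde{\mathbf{A}} := \mathbf{A} + \nabla f. \tag{10.2.63}$$

From Remark 10.2.10 and (10.2.61) we know that $\tilde{\mathbf{A}}$ is also a magnetic vector potential of $\mathbf{B}$, that is

$$\mathbf{B} = \nabla \times \tilde{\mathbf{A}}. \tag{10.2.64}$$

The transformation of our arbitrarily chosen $\mathbf{A}$ into $\tilde{\mathbf{A}}$ at (10.2.63) is called a gauge transformation of $\mathbf{A}$. We are now going to see that $\tilde{\mathbf{A}}$ is actually the “nice” magnetic vector potential that we want in that $\nabla \cdot \tilde{\mathbf{A}} = 0$. In fact

$$\begin{aligned}
\nabla \cdot \tilde{\mathbf{A}} &= \nabla \cdot (\mathbf{A} + \nabla f) && \text{(from (10.2.63))} \\
&= \nabla \cdot \mathbf{A} + \nabla \cdot (\nabla f) && \text{(from (9.1.59))} \\
&= \nabla \cdot \mathbf{A} + \nabla^2 f && \text{(from Theorem 9.1.21)} \\
&= 0 && \text{(from (10.2.62))},
\end{aligned}$$

that is

$$\nabla \cdot \tilde{\mathbf{A}} = 0. \tag{10.2.65}$$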


Now (10.2.52) follows from (10.2.65) and (10.2.64). Notice that we do not have to determine the function $f$ which satisfies (10.2.62), even though $\tilde{\mathbf{A}}$ (which we need in order to obtain $\mathbf{B}$ from (10.2.64)) is actually defined in terms of this $f$ by (10.2.63). The reason of course is that $\tilde{\mathbf{A}}$ has been defined in such a way that it also satisfies the Poisson equations (10.2.59) and we can determine $\tilde{\mathbf{A}}$ from these equations without having to find $f$. In fact, the function $f$ occurring at (10.2.62) is just a very clever device which ensures that $\tilde{\mathbf{A}}$ defined by (10.2.63) actually satisfies the all-important Poisson equations (10.2.59) for which we know the solution. The definition of $\tilde{\mathbf{A}}$ at (10.2.63) is called a Coulomb gauge transformation.
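A small worked instance may help to make the gauge transformation concrete. The Python sketch below uses a hypothetical potential $\mathbf{A} = x\,\mathbf{i} + x\,\mathbf{j}$, for which $\nabla \cdot \mathbf{A} = 1$ and $\nabla \times \mathbf{A} = \mathbf{k}$; the choice $f = -x^2/2$ then satisfies (10.2.62), and the transformed potential is $x\,\mathbf{j}$. These fields, the sample point and the step size are assumptions made purely for illustration, and the divergence and curl are approximated by central differences.

```python
H = 1e-4  # step size for central differences (illustrative choice)

def partial(F, i, p, axis):
    """Central difference of component i of F with respect to coordinate `axis`."""
    hi = list(p); lo = list(p)
    hi[axis] += H; lo[axis] -= H
    return (F(*hi)[i] - F(*lo)[i]) / (2 * H)

def div(F, p):
    return sum(partial(F, k, p, k) for k in range(3))

def curl(F, p):
    return (partial(F, 2, p, 1) - partial(F, 1, p, 2),
            partial(F, 0, p, 2) - partial(F, 2, p, 0),
            partial(F, 1, p, 0) - partial(F, 0, p, 1))

def A(x, y, z):
    # Hypothetical vector potential with div A = 1 and curl A = (0, 0, 1).
    return (x, x, 0.0)

def grad_f(x, y, z):
    # Gradient of f = -x**2 / 2, whose Laplacian is -1 = -div A, as in (10.2.62).
    return (-x, 0.0, 0.0)

def A_tilde(x, y, z):
    # Gauge-transformed potential of (10.2.63).
    return tuple(a + g for a, g in zip(A(x, y, z), grad_f(x, y, z)))

p = (0.5, -1.2, 2.0)
div_A = div(A, p)
div_A_tilde = div(A_tilde, p)
curl_A = curl(A, p)
curl_A_tilde = curl(A_tilde, p)
print(div_A, div_A_tilde, curl_A, curl_A_tilde)
```

The curl is unchanged by the transformation while the divergence drops from 1 to 0, in line with (10.2.64) and (10.2.65).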

10.3 Time Varying Fields

Up until now we have concentrated on the laws of electricity and magnetism which pertain to time

constant or static electric and magnetic fields. These are Gauss’ Law 10.1.4 for static electric fields

(also expressed in local form at (10.1.21)), and the Ampere and Gauss Law 10.2.2 for static magnetic

fields (also expressed in local form at (10.2.39) and (10.2.35) respectively). In this section we are

going to focus on the basic experimentally determined laws of electricity and magnetism for time

varying electric and magnetic fields. Recall from Remark 3.2.5 that a time varying vector field FFF

is denoted more completely by FFF (t, x, y, z), which indicates that FFF generally varies with respect to

time t at each fixed (x, y, z) in R3, and generally varies with respect to space points (x, y, z) at each

fixed instant t. Similarly, a time varying scalar field f is denoted more completely by f(t, x, y, z)

with the same interpretation. We begin with the time varying version of Gauss’ law for electric

fields:

Law 10.3.1 (Gauss’ law for time varying electric fields). A time varying charge density ρ causes a

time varying electric field EEE, and the fields ρ and EEE are related by

$$(\nabla \cdot \mathbf{E})(t, x, y, z) = \frac{1}{\varepsilon_0}\,\rho(t, x, y, z), \tag{10.3.66}$$

for each instant t and each point (x, y, z) in R3.

Remark 10.3.2. Recall that the divergence of the time varying electric field on the left of (10.3.66)

is to be interpreted in accordance with Remark 9.1.23 (see in particular (9.1.44)). As usual we will

write (10.3.66) with the variables (t, x, y, z) suppressed, that is

$$(\nabla \cdot \mathbf{E}) = \frac{1}{\varepsilon_0}\,\rho, \tag{10.3.67}$$

so that (t, x, y, z) must be mentally substituted into (10.3.67). We have chosen to state Gauss’

law for time varying fields in local form, because this will be particularly useful when we come to

Maxwell’s equations, but we could just as easily have stated this law in global form in terms of

surface integrals. We emphasize that Law 10.3.1 is a consequence of experimental observation, in the

same way that Law 10.1.4 for static electric fields is also a consequence of experimental observation.

In exactly the same way we can state Gauss’ law for time varying magnetic fields. Again, we

choose to state the time varying law in local form since this is the form most useful for Maxwell’s

equations:

Law 10.3.3 (Gauss’ law for time varying magnetic fields). A time varying current density JJJ causes

a time varying magnetic field BBB which satisfies

(10.3.68) (∇ ·BBB)(t, x, y, z) = 0,

for each instant t and each point (x, y, z) in R3.

Remark 10.3.4. Needless to say we will write the relation (10.3.68) with the variables (t, x, y, z)

suppressed, that is

(10.3.69) ∇ ·BBB = 0.

Law 10.3.3 is, like Law 10.3.1, a consequence of experimental observation.

We next come to Faraday’s law of electromagnetic induction which we have already seen at

Remark 8.5.8. For completeness we state this law again:

Law 10.3.5 (Faraday’s law of electromagnetic induction). A time varying magnetic field BBB causes

a time varying electric field EEE. Moreover, for each finite open surface S with boundary curve Γ, the

fields BBB and EEE are related by

$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{A}. \tag{10.3.70}$$

We are now going to write Faraday’s law in local form, in much the same way that we extracted

the local form (10.2.39) of Ampere’s law from the global form (10.2.33). To this end we use Stokes’

theorem in the form of (9.2.67), with EEE in place of the generic vector field FFF , to get

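$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = \int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{A}, \tag{10.3.71}$$

so that (10.3.71) and (10.3.70) give

$$\int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{A} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{A},$$

that is

$$\int_S \left[ (\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} \right] \cdot d\mathbf{A} = 0. \tag{10.3.72}$$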

Notice that (10.3.72) holds for each and every finite open surface S. We are now going to apply

Theorem 10.2.5. To this end write (10.3.72) with all the variables displayed, that is

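$$\int_S \left[ (\nabla \times \mathbf{E})(t, x, y, z) + \frac{\partial \mathbf{B}(t, x, y, z)}{\partial t} \right] \cdot d\mathbf{A} = 0. \tag{10.3.73}$$

Recall that the curl $(\nabla \times \mathbf{E})(t, x, y, z)$ in the integrand of (10.3.73) is to be interpreted in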

accordance with Remark 9.1.23 (see in particular (9.1.45)). The relation (10.3.73) holds for each

instant t and each finite open surface S (notice that the space variables (x, y, z) have been integrated

out). Now fix some arbitrary value of t, and for this t define the vector field

$$\mathbf{G}(x, y, z) := (\nabla \times \mathbf{E})(t, x, y, z) + \frac{\partial \mathbf{B}(t, x, y, z)}{\partial t}, \quad \text{for all } (x, y, z) \text{ in } \mathbb{R}^3. \tag{10.3.74}$$

It follows from (10.3.74) and (10.3.73) that, at this fixed instant t, we have

$$\int_S \mathbf{G} \cdot d\mathbf{A} = 0, \tag{10.3.75}$$

for each and every finite open surface S, so that Theorem 10.2.5 gives

(10.3.76) GGG(x, y, z) = 0 for all (x, y, z) in R3.

In view of (10.3.76), (10.3.74) and the arbitrary choice of instant t, it follows that

$$(\nabla \times \mathbf{E})(t, x, y, z) + \frac{\partial \mathbf{B}(t, x, y, z)}{\partial t} = 0 \tag{10.3.77}$$

for each instant t and each (x, y, z) in R3. From now on we shall, of course, write (10.3.77) with

the variables (t, x, y, z) suppressed, that is

$$(\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} = 0. \tag{10.3.78}$$

The relation (10.3.78) is Faraday’s law of electromagnetic induction in local form.


Remark 10.3.6. We have formulated the basic laws of electricity and magnetism in global form, in

terms of line integrals and surface integrals, and then used the tools of vector calculus (in particular

Stokes’ theorem and the divergence theorem) to reformulate these laws in local form. Why have

we reduced the laws to local form? It is important to understand that experimental observation

always gives us the laws of electricity and magnetism in global form. However, it is invariably the

case that the local form of these laws is the most useful in applications. This holds whether one

has the really ambitious goal of using the laws of electricity and magnetism to advance the state of

fundamental physics (such as the quantum theory of electrodynamics), or on the other hand one

just wants to apply the laws to some engineering problem (such as the design of antennas for cell

phones); regardless of the particular application it is typically the local form of the laws which is

the most useful. In fact, from now on we shall concentrate almost exclusively on the local form of

the laws of electricity and magnetism.

Remark 10.3.7. In the preceding account of the laws of electricity and magnetism for time varying

fields we have so far not discussed a possible time varying version of Ampere’s circuital law. Given

how nicely and easily the Gauss laws for electric and magnetic fields extend to the time varying case

(see (10.3.66) and (10.3.68)) it is reasonable to expect, on the basis of the local version of Ampere’s

law for static fields (see (10.2.39)), that the time varying version of this law might look like

(10.3.79) (∇×BBB)(t, x, y, z) = µ0JJJ(t, x, y, z),

for all instants t and points (x, y, z) in R3 (for completeness we have displayed all variables at

(10.3.79)). In fact, experimental evidence in support of the “time varying law” (10.3.79) proved

very difficult to come by. It turns out that there is indeed an extension of Ampere’s law to time

varying fields but this extension is not given by (10.3.79)! It was Maxwell who discovered that

(10.3.79) is false and then used the tools of vector calculus to get a “corrected version” of Ampere’s

law for time varying fields which turns out to be in agreement with experimental evidence, as well

as consistent with Ampere’s law in the special case of time constant fields. This constitutes one

of the very greatest discoveries in all of physics. In the next chapter we shall follow Maxwell, and

use the tools of vector calculus to see that (10.3.79) is incorrect and get the correct extension of

Ampere’s law to time varying fields.


Chapter 11

Maxwell’s Equations

In this chapter we first address the problem of extending Ampere’s law to time varying fields, as

discussed in Remark 10.3.7. We then state Maxwell’s equations in full and develop some of the

simplest consequences of these equations which have completely revolutionized physics.

11.1 The Ampere-Maxwell Law for Time Varying Fields

In Remark 10.3.7 we noted that the relation (10.3.79), that is

(11.1.1) ∇×BBB = µ0JJJ

(with the variables (t, x, y, z) suppressed), looks like a plausible extension of Ampere’s law to time

varying fields. Following Maxwell, we are going to use the tools of vector calculus to see that (11.1.1)

cannot possibly be true for genuinely time varying fields. To this end we require the continuity

equation (9.4.202) already established in Section 9.4, that is

$$(\nabla \cdot \mathbf{J}) + \frac{\partial \rho}{\partial t} = 0. \tag{11.1.2}$$

Now assume that (11.1.1) holds for time varying fields. Take the divergence of each side of (11.1.1)

to get

(11.1.3) ∇ · (∇×BBB) = µ0(∇ · JJJ).

From Theorem 9.1.14 (with BBB in place of GGG) we have

(11.1.4) ∇ · (∇×BBB) = 0,


so that (11.1.4) and (11.1.3) give

(11.1.5) ∇ · JJJ = 0.

Upon combining (11.1.5) and (11.1.2) we find

$$\frac{\partial \rho(t, x, y, z)}{\partial t} = 0, \tag{11.1.6}$$

(displaying all variables). It follows from (11.1.6) that the charge density scalar field ρ is time

constant, that is if (11.1.1) holds then ρ must be time constant. However, in experiments one

can easily create time varying charge density fields; indeed, such time varying fields occur all over

physics and engineering. It follows that the supposition that (11.1.1) holds for time varying fields

is contradicted by physical evidence, and therefore (11.1.1) cannot possibly be true for time varying

fields. This being the case, can we somehow get a “corrected” or “extended” version of (11.1.1)

which is true for time varying fields? Maxwell’s idea was to “guess” that a corrected version of

(11.1.1) would look like

(11.1.7) ∇×BBB = µ0JJJ +GGG

for some time varying vector field GGG. That is, one tries to correct (11.1.1) by adding a correction

term GGG to the right side. It remains to determine exactly what GGG actually is. From (11.1.7) we

have

(11.1.8) GGG = ∇×BBB − µ0JJJ.

Now take the divergence of each side of (11.1.8) to see that

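$$\begin{aligned}
\nabla \cdot \mathbf{G} &= \nabla \cdot [\nabla \times \mathbf{B} - \mu_0 \mathbf{J}] \\
&= \nabla \cdot (\nabla \times \mathbf{B}) - \mu_0 (\nabla \cdot \mathbf{J}) \\
&= -\mu_0 (\nabla \cdot \mathbf{J}) \quad \text{(since Theorem 9.1.14 gives } \nabla \cdot (\nabla \times \mathbf{B}) = 0\text{)},
\end{aligned}$$

that is

$$(\nabla \cdot \mathbf{J}) = -\frac{1}{\mu_0} (\nabla \cdot \mathbf{G}). \tag{11.1.9}$$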

From Gauss’ law for electric fields at (10.3.67) we get

(11.1.10) ρ = ε0(∇ ·EEE).


Taking partial t-derivatives on each side of (11.1.10) gives

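$$\frac{\partial \rho}{\partial t} = \varepsilon_0 \frac{\partial (\nabla \cdot \mathbf{E})}{\partial t} = \varepsilon_0 \nabla \cdot \left[ \frac{\partial \mathbf{E}}{\partial t} \right], \tag{11.1.11}$$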

(where the interchange of partial t-derivative and divergence at the second equality of (11.1.11) is

justified by (9.1.50) with EEE in place of FFF ). Now substitute (11.1.11) and (11.1.9) in the continuity

equation (11.1.2) to obtain

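$$\varepsilon_0 \nabla \cdot \left[ \frac{\partial \mathbf{E}}{\partial t} \right] - \frac{1}{\mu_0} (\nabla \cdot \mathbf{G}) = 0,$$

that is (multiplying through by $\mu_0$)

$$\varepsilon_0 \mu_0 \nabla \cdot \left[ \frac{\partial \mathbf{E}}{\partial t} \right] - (\nabla \cdot \mathbf{G}) = 0,$$

that is

$$\nabla \cdot \left[ \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t} - \mathbf{G} \right] = 0. \tag{11.1.12}$$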

The simplest relation between the vector fields EEE and GGG which is consistent with (11.1.12) is of

course to make the argument in square brackets zero, that is

$$\mathbf{G} = \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}. \tag{11.1.13}$$

Finally, substitute (11.1.13) for GGG in (11.1.7) to get

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}. \tag{11.1.14}$$
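The role of the added term can be illustrated with a deliberately artificial example. In the Python sketch below the fields `E` and `J` are hypothetical and chosen only so that Gauss’ law (11.1.10) and the continuity equation (11.1.2) both hold; the units are normalized so that `EPS0 = MU0 = 1`, which is an assumption made for numerical convenience. The sketch compares the divergence of the right side of (11.1.1) with the divergence of the right side of (11.1.14). Since the divergence of a curl is zero, only a right side with zero divergence can be equal to a curl.

```python
import math

EPS0 = 1.0  # normalized units (assumption, not SI values)
MU0 = 1.0
H = 1e-4    # step size for central differences

def E(t, x, y, z):
    # Hypothetical electric field; Gauss' law then gives rho = sin(t).
    return (x * math.sin(t) / EPS0, 0.0, 0.0)

def J(t, x, y, z):
    # Hypothetical current density chosen so that div J = -d(rho)/dt.
    return (-x * math.cos(t), 0.0, 0.0)

def div(F, t, p):
    """Central-difference divergence of F(t, x, y, z) at time t and point p."""
    total = 0.0
    for k in range(3):
        hi = list(p); lo = list(p)
        hi[k] += H; lo[k] -= H
        total += (F(t, *hi)[k] - F(t, *lo)[k]) / (2 * H)
    return total

def rho(t, p):
    return EPS0 * div(E, t, p)  # Gauss' law (11.1.10)

def ampere_rhs(t, x, y, z):
    return tuple(MU0 * j for j in J(t, x, y, z))  # right side of (11.1.1)

def ampere_maxwell_rhs(t, x, y, z):
    # Right side of (11.1.14), with dE/dt taken by central difference in t.
    dE_dt = tuple((a - b) / (2 * H)
                  for a, b in zip(E(t + H, x, y, z), E(t - H, x, y, z)))
    return tuple(MU0 * j + EPS0 * MU0 * e for j, e in zip(J(t, x, y, z), dE_dt))

t0, p0 = 0.4, (0.8, -0.3, 1.5)
drho_dt = (rho(t0 + H, p0) - rho(t0 - H, p0)) / (2 * H)
continuity = div(J, t0, p0) + drho_dt
div_ampere = div(ampere_rhs, t0, p0)
div_ampere_maxwell = div(ampere_maxwell_rhs, t0, p0)
print(continuity, div_ampere, div_ampere_maxwell)
```

With these choices the continuity residual and the divergence of the Ampere-Maxwell right side are both zero up to numerical error, while the divergence of the uncorrected right side is about $-\cos(0.4)$, which is not zero.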

Remark 11.1.1. The relation (11.1.14) constitutes Maxwell’s extension of Ampere’s law to time

varying fields, and is usually called the Ampere-Maxwell law. Notice that, when the fields are

time constant, the partial t-derivative on the right of (11.1.14) is of course identically zero, so

that (11.1.14) is completely consistent with Ampere’s law (11.1.1) for time constant fields, which is

known to be true from experimental evidence based on the Biot-Savart law (see the discussion for

Section 10.2). However, the fundamental question remains: is the Ampere-Maxwell law (11.1.14)

actually true for time varying fields? The derivation of (11.1.14) given above certainly looks sound

enough, but we must remember that it began with a guess, namely that the corrected version of

Ampere’s law has to look like (11.1.7) for some vector field GGG. What if this guess is wrong, and the


actual correction of Ampere’s law involves some more complicated modification than just “adding

in” a term GGG? In this case the law (11.1.14), that was established based on this guess, would

certainly not be true! The answer to the question is therefore not so obvious. As is always the

case, the final determination rests on experimental evidence, and experimental evidence confirming

the correctness or otherwise of the Ampere-Maxwell law turned out to be frustratingly difficult to

obtain. However, during the period 1887 - 1891, in a brilliantly innovative series of experiments, the

physicist Heinrich Hertz showed convincingly that the Ampere-Maxwell law is in fact perfectly true.

In fact, numerous modern technological devices, such as cell-phones, radio, television, radar, microwave

ovens, the internet etc., rely crucially on the correctness of the Ampere-Maxwell

law; the fact that these devices work in the way they do really constitutes daily “experimental

verification” of this law.

Remark 11.1.2. A further point to notice about (11.1.14) is the following: Suppose that the

current density JJJ is identically zero, but there is still a time varying electric field EEE. According to

the Ampere-Maxwell law the time varying electric field EEE causes a magnetic field BBB, and the two

fields are related by (11.1.14) with JJJ = 0, that is

$$\nabla \times \mathbf{B} - \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t} = 0. \tag{11.1.15}$$

This is very symmetric with respect to Faraday’s Law 10.3.5, for this says that a time varying

magnetic field causes an electric field, and the two fields are related by (10.3.78), which indeed

looks very similar to (11.1.15) when EEE and BBB are interchanged (the quantity ε0µ0 is less important

than it looks, for it is really just a consequence of our choice of standard MKS-units; in fact, by going

over to a different system of units - called “Gaussian units” - we can actually make the coefficients

of the t-partial derivative terms in (11.1.15) and (10.3.78) equal in magnitude). To this extent

Maxwell’s correction of Ampere’s law supplies a new physical law which is a symmetric counterpart

of Faraday’s law. It was in fact the search for this “missing symmetry” which motivated Maxwell to

propose the correction to Ampere’s law in the form of (11.1.7). Notice that the symmetry between

(11.1.15) and (10.3.78) is not quite complete; it is more like “skew-symmetry” because of the minus

sign preceding the partial t-derivative in (11.1.15) compared with the plus sign preceding the partial

t-derivative in (10.3.78).


11.2 Maxwell’s Equations

We are now finally ready to state Maxwell’s equations in all their glory. These equations comprise

Gauss’s laws for electric and magnetic fields, Faraday’s law of electromagnetic induction and the

Ampere-Maxwell law. We collect these together in the following massively important statement:

Law 11.2.1 (Maxwell). A given time varying charge density ρ, together with a given time varying

current density JJJ , causes a time varying electric field EEE and a time varying magnetic field BBB.

Moreover, these fields are related to each other, as well as to the given “sources” ρ and JJJ , by the

equations:

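$$\begin{aligned}
&\text{(a)}\quad (\nabla \cdot \mathbf{E}) = \frac{1}{\varepsilon_0}\,\rho && \text{(Gauss for electric fields)} \\
&\text{(b)}\quad (\nabla \cdot \mathbf{B}) = 0 && \text{(Gauss for magnetic fields)} \\
&\text{(c)}\quad (\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} = 0 && \text{(Faraday)} \\
&\text{(d)}\quad (\nabla \times \mathbf{B}) = \mu_0 \mathbf{J} + \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t} && \text{(Ampere-Maxwell)}
\end{aligned} \tag{11.2.16}$$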

(see (10.3.67), (10.3.68), (10.3.78) and (11.1.14)).

Remark 11.2.2. Notice that we have stated the preceding laws in local form. We could just

as easily have stated the laws in global form, in terms of surface and line integrals, but these

are seldom needed and the local versions are much more useful. The equations (11.2.16)(a) - (d)

are Maxwell’s equations. One may reasonably question why these are called Maxwell’s equations,

since (11.2.16)(a) - (c) and the static version of (11.2.16)(d) were known well before the time of

Maxwell. The term Maxwell’s equations is nevertheless completely appropriate, for it was Maxwell

who first truly understood the enormous power of these equations when used together. Indeed,

Maxwell used the equations to discover the existence of self-sustaining electromagnetic waves or

electromagnetic radiation. This discovery completely changed the whole science of physics, and

is the reason why Maxwell, together with Galileo, Newton and Einstein, is counted among the

four greatest physicists of all time. One especially important consequence of the discovery of

electromagnetic radiation is that it was the prime motivator leading Einstein to the later formulation

of the theory of relativity. On a more pragmatic level, electromagnetic radiation is of course the

entire basis of our modern civilization. In all of this, the term ε0µ0∂EEE/(∂t) added by Maxwell to

Ampere’s law to get (11.2.16)(d), is of central importance. In fact, it is the presence of this very term

which is really at the root of the discovery of electromagnetic radiation, as we shall see in the next

few sections.


11.3 Electromagnetic Waves without Sources

In the present section we look at Maxwell’s equations in the source free case, in which the charge

density ρ and current density JJJ are identically zero, that is

(11.3.17) ρ = 0, JJJ = 0.

We are going to use the tools of vector calculus to see that the phenomenon of electromagnetic

waves is predicted by Maxwell’s equations. In view of (11.3.17), the Maxwell equations (11.2.16)

reduce to

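$$\begin{aligned}
&\text{(a)}\quad (\nabla \cdot \mathbf{E}) = 0, \\
&\text{(b)}\quad (\nabla \cdot \mathbf{B}) = 0, \\
&\text{(c)}\quad (\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} = 0, \\
&\text{(d)}\quad (\nabla \times \mathbf{B}) = \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}.
\end{aligned} \tag{11.3.18}$$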

Remark 11.3.1. One may reasonably ask how it is possible to even get non-zero electric and

magnetic fields in the source free case. Indeed, Maxwell’s equations at (11.3.18) are obviously

trivially satisfied by the zero fields EEE(t, x, y, z) = 0 and BBB(t, x, y, z) = 0 for all (t, x, y, z). We get

non-zero electric and magnetic fields satisfying (11.3.18) as a consequence of electromagnetic energy

being radiated into free space by some transmitting agent. A radiator of electromagnetic energy

comprises an antenna which absorbs energy from a voltage source, as shown in Figure 11.1. This

energy sets up a time varying charge density field ρ and a time varying current density field JJJ ,

both of which are localized to the antenna, that is the charge and current density fields are zero

everywhere outside the antenna. By a very sophisticated application of Maxwell’s equations in the

general form (11.2.16), which allows for the charge and current density fields ρ and JJJ localized to

the antenna, one can show that the energy absorbed from the voltage source by the antenna is

carried into space, away from the antenna, by a non-zero electric field EEE and a non-zero magnetic

field BBB. Outside the antenna, where the charge and current density fields are zero (as we have

noted), these fields are governed by the source free equations (11.3.18). It is the behavior of these

fields outside the antenna that is our main concern in this section.

Take the curl of each side of (11.3.18)(c) to get

$$\nabla \times (\nabla \times \mathbf{E}) + \nabla \times \left[ \frac{\partial \mathbf{B}}{\partial t} \right] = 0. \tag{11.3.19}$$


Figure 11.1: Radiation of electromagnetic energy from an antenna

From (9.1.53) (with BBB in place of FFF ),

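$$\frac{\partial (\nabla \times \mathbf{B})}{\partial t} = \nabla \times \left[ \frac{\partial \mathbf{B}}{\partial t} \right], \tag{11.3.20}$$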

and from (9.1.62) (with EEE in place of FFF ),

(11.3.21) (∇× (∇×EEE)) = (∇(∇ ·EEE))− (∇2EEE) = −(∇2EEE),

(suppressing the variables (t, x, y, z)), in which the second equality at (11.3.21) follows from (11.3.18)(a).

Now put (11.3.21) and (11.3.20) into (11.3.19) to get

$$-(\nabla^2 \mathbf{E}) + \frac{\partial (\nabla \times \mathbf{B})}{\partial t} = 0. \tag{11.3.22}$$

From (11.3.22) and (11.3.18)(d) we finally obtain

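$$c^2 (\nabla^2 \mathbf{E}) = \frac{\partial^2 \mathbf{E}}{\partial t^2}, \quad \text{in which } c := \frac{1}{\sqrt{\varepsilon_0 \mu_0}}. \tag{11.3.23}$$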

We note, from (11.3.23), (10.2.32) and (10.1.3), with some easy dimensional analysis, that

$$c = 2.998 \times 10^8 \text{ meters/second}, \tag{11.3.24}$$


which amazingly enough is the speed of light in a vacuum. The full significance will shortly become

clear! We next obtain an equation for BBB which is analogous to (11.3.23). To this end, take the curl

of each side of (11.3.18)(d), so that

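$$\nabla \times (\nabla \times \mathbf{B}) = \varepsilon_0 \mu_0 \nabla \times \left[ \frac{\partial \mathbf{E}}{\partial t} \right]. \tag{11.3.25}$$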

From (9.1.62) (with BBB in place of FFF ), together with (11.3.18)(b), we find

(11.3.26) (∇× (∇×BBB)) = −(∇2BBB),

(exactly as at (11.3.21)), and of course

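$$\frac{\partial (\nabla \times \mathbf{E})}{\partial t} = \nabla \times \left[ \frac{\partial \mathbf{E}}{\partial t} \right], \tag{11.3.27}$$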

from (9.1.53) with EEE in place of FFF . Now (11.3.27), (11.3.26) and (11.3.25) give

$$-(\nabla^2 \mathbf{B}) = \varepsilon_0 \mu_0 \frac{\partial (\nabla \times \mathbf{E})}{\partial t}, \tag{11.3.28}$$

and, from (11.3.28) along with (11.3.18)(c), we find

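$$c^2 (\nabla^2 \mathbf{B}) = \frac{\partial^2 \mathbf{B}}{\partial t^2}, \quad \text{in which } c := \frac{1}{\sqrt{\varepsilon_0 \mu_0}}. \tag{11.3.29}$$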
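The numerical value of $c$ quoted at (11.3.24) is easy to reproduce. The short Python sketch below assumes the commonly tabulated SI values of the two constants; the text takes them from (10.2.32) and (10.1.3), which are not reproduced here, so the specific digits used for `MU0` and `EPS0` are an assumption.

```python
import math

MU0 = 4e-7 * math.pi      # permeability of free space, H/m (assumed tabulated value)
EPS0 = 8.8541878128e-12   # permittivity of free space, F/m (assumed tabulated value)

# Wave speed as defined in (11.3.23) and (11.3.29).
c = 1.0 / math.sqrt(EPS0 * MU0)
print(f"c = {c:.4e} m/s")
```

The units work out because a henry per metre times a farad per metre is a second squared per metre squared, so the reciprocal square root is a speed.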

Expanding the vector fields EEE and BBB in the usual componentwise form, that is

(11.3.30) EEE = E1iii+ E2jjj + E3kkk, BBB = B1iii+B2jjj +B3kkk,

and recalling Definition 9.1.19, we can write the vector relations (11.3.29) and (11.3.23) in componentwise form as follows:

$$c^2 (\nabla^2 E_r) = \frac{\partial^2 E_r}{\partial t^2}, \qquad c^2 (\nabla^2 B_r) = \frac{\partial^2 B_r}{\partial t^2}, \qquad r = 1, 2, 3, \tag{11.3.31}$$

in which the Er and Br are time varying scalar fields. The relations at (11.3.31) constitute six

partial differential equations called wave equations, each of which can be solved individually for the

relevant Er and Br. In the next remark we summarize the main properties of wave equations.
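It can be reassuring to test the source free equations (11.3.18) on a concrete pair of fields. The Python sketch below does this numerically for a hypothetical sinusoidal plane wave travelling along the x-axis. The particular fields `E` and `B`, the amplitude `E0`, the wavenumber `K`, and the normalized units in which `C = 1` (so that $\varepsilon_0 \mu_0 = 1$) are all assumptions made for illustration; derivatives are approximated by central differences.

```python
import math

C = 1.0    # wave speed in normalized units, so eps0 * mu0 = 1 / C**2 (assumption)
K = 2.0    # hypothetical wavenumber
E0 = 1.5   # hypothetical amplitude
H = 1e-4   # step size for central differences

def E(t, x, y, z):
    return (0.0, E0 * math.cos(K * (x - C * t)), 0.0)

def B(t, x, y, z):
    return (0.0, 0.0, (E0 / C) * math.cos(K * (x - C * t)))

def d(F, i, q, axis):
    """Central difference of component i of F along coordinate `axis` of q = (t, x, y, z)."""
    hi = list(q); lo = list(q)
    hi[axis] += H; lo[axis] -= H
    return (F(*hi)[i] - F(*lo)[i]) / (2 * H)

def div(F, q):
    return d(F, 0, q, 1) + d(F, 1, q, 2) + d(F, 2, q, 3)

def curl(F, q):
    return (d(F, 2, q, 2) - d(F, 1, q, 3),
            d(F, 0, q, 3) - d(F, 2, q, 1),
            d(F, 1, q, 1) - d(F, 0, q, 2))

def dt(F, q):
    return tuple(d(F, i, q, 0) for i in range(3))

q = (0.3, 0.7, -0.2, 1.1)  # sample (t, x, y, z)
faraday = tuple(a + b for a, b in zip(curl(E, q), dt(B, q)))                 # (11.3.18)(c)
ampere_maxwell = tuple(a - b / C**2 for a, b in zip(curl(B, q), dt(E, q)))   # (11.3.18)(d)
residuals = [abs(div(E, q)), abs(div(B, q))] + [abs(r) for r in faraday + ampere_maxwell]
print(max(residuals))
```

All eight residuals should vanish up to numerical error, which shows that non-zero fields can satisfy (11.3.18).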

Remark 11.3.2. Each of the relations at (11.3.31) constitutes a special kind of partial differential

equation, called a wave equation, which is of the form

$$c^2 (\nabla^2 u) = \frac{\partial^2 u}{\partial t^2}, \tag{11.3.32}$$

in which c is a positive constant. The significance of the constant c will shortly become clear, but

let us note at this point that, regardless of the units attached to the quantity u (e.g. meters, volts,

amps, bars, joules, dimensionless etc.), for dimensional consistency in (11.3.32) to hold the units of

c must be meters per second. This suggests that the constant c has something to do with speed or

velocity. Our study of (11.3.32) will soon confirm that this is indeed the case. With the variables

(t, x, y, z) explicitly shown the equation (11.3.32) looks like

$$c^2 (\nabla^2 u(t, x, y, z)) = \frac{\partial^2 u(t, x, y, z)}{\partial t^2}. \tag{11.3.33}$$

Expanding the Laplacian ∇2 (recall Definition 9.1.18) we can write the wave equation even more

explicitly as follows:

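$$c^2 \left[ \frac{\partial^2 u}{\partial x^2}(t, x, y, z) + \frac{\partial^2 u}{\partial y^2}(t, x, y, z) + \frac{\partial^2 u}{\partial z^2}(t, x, y, z) \right] = \frac{\partial^2 u(t, x, y, z)}{\partial t^2}. \tag{11.3.34}$$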

Any function u(t, x, y, z) which satisfies the wave equation is best thought of as a time varying scalar

field in the sense of Remark 3.2.5. Solving the wave equation is a matter of finding a scalar field

u(t, x, y, z) such that the relation (11.3.32) (i.e. (11.3.33) and (11.3.34)) is satisfied, and any such

scalar field is called a solution of the wave equation. There is an extensive theory associated with

wave equations and their solutions, much of it at a very advanced level. Fortunately, in order to

study the electromagnetic phenomena arising from the wave equations at (11.3.31), we shall require

only the most basic aspects of wave equations, and we summarize these in the present remark.

Notice first that there is one very obvious solution of the wave equation, namely the identically zero

function

(11.3.35) u(t, x, y, z) = 0, for all t and for all (x, y, z),

since this function trivially satisfies (11.3.34). Fortunately, there are other, more interesting solutions

of the wave equation. To see this fix

(11.3.36) some unit vector nnn = n1iii+ n2jjj + n3kkk, and some C2-function ψ : R→ R.

With the nnn and ψ fixed at (11.3.36) define the time varying scalar field

$$\begin{aligned}
u(t, x, y, z) &:= \psi(\mathbf{n} \cdot (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) - ct) \\
&= \psi(n_1 x + n_2 y + n_3 z - ct),
\end{aligned} \tag{11.3.37}$$

for all (t, x, y, z). It is easy to show that u is a solution of the wave equation (11.3.32). In fact, from

(11.3.37), we get

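$$\begin{aligned}
\frac{\partial^2 u(t, x, y, z)}{\partial t^2} &= c^2\, \psi''(n_1 x + n_2 y + n_3 z - ct), \\
\frac{\partial^2 u(t, x, y, z)}{\partial x^2} &= (n_1)^2\, \psi''(n_1 x + n_2 y + n_3 z - ct), \\
\frac{\partial^2 u(t, x, y, z)}{\partial y^2} &= (n_2)^2\, \psi''(n_1 x + n_2 y + n_3 z - ct), \\
\frac{\partial^2 u(t, x, y, z)}{\partial z^2} &= (n_3)^2\, \psi''(n_1 x + n_2 y + n_3 z - ct).
\end{aligned} \tag{11.3.38}$$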

Then

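$$\begin{aligned}
c^2 \left[ \frac{\partial^2 u}{\partial x^2}(t, x, y, z) + \frac{\partial^2 u}{\partial y^2}(t, x, y, z) + \frac{\partial^2 u}{\partial z^2}(t, x, y, z) \right] &= c^2 [(n_1)^2 + (n_2)^2 + (n_3)^2]\, \psi''(n_1 x + n_2 y + n_3 z - ct) \\
&= c^2\, \psi''(n_1 x + n_2 y + n_3 z - ct) \\
&= \frac{\partial^2 u(t, x, y, z)}{\partial t^2}.
\end{aligned} \tag{11.3.39}$$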

Here we have used (11.3.38) at the first equality of (11.3.39), the fact that nnn is a unit vector at the

second equality, and (11.3.38) again at the third equality. We see from (11.3.39) that u defined by

(11.3.37) is a solution of the wave equation (11.3.32) for each and every choice of the unit vector

nnn and C2-function ψ (see (11.3.36)). Since there are many such choices of nnn and ψ it follows that

there are many possible scalar fields u which satisfy the wave equation (11.3.32), that is the wave

equation has not just one but many solutions! We next look at the structure of the scalar field u

given by (11.3.37); among other things this will reveal why (11.3.32) is called a wave equation. In

the first instance take the unit vector nnn along the x-axis,

(11.3.40) nnn = iii, i.e. n1 = 1, n2 = 0, n3 = 0.

In this case the function u at (11.3.37) simplifies to

(11.3.41) u(t, x, y, z) = ψ(x− ct), for all (t, x, y, z).

To get a feel for this solution fix a real constant α, take the point αnnn = αiii on the x-axis, and let Pα

be the plane in R3 parallel to the y− z plane and passing through point αnnn, that is perpendicularly

intersecting the x-axis at point x = α (see Figure 11.2). Mathematically we can express Pα as


Figure 11.2: Plane Pα through x = α and parallel to the y − z plane

\[
P_\alpha = \{(x,y,z) \mid x = \alpha\}.
\tag{11.3.42}
\]

From (11.3.41) and (11.3.42) it follows that for every fixed instant t we have

(11.3.43) u(t, x, y, z) = ψ(α− ct), for all (x, y, z) in Pα,

that is, at each fixed instant t the function u(t, x, y, z) has the constant value ψ(α − ct) for each

and every (x, y, z) in the plane Pα. To see how the solution (11.3.41) describes a “wave” we look at

the function on the right side of (11.3.41) at the fixed instants t = 0, t = t1 and t = t2, for

(11.3.44) 0 < t1 < t2.

For the sake of illustration we choose a function ψ with a “bell-shaped” profile having a maximum

at α0, but any C2-function will suffice (see Figure 11.3). Now plot ψ(x − ct) against x keeping t

fixed at the values t = 0, t = t1 and t = t2 (recall (11.3.44)), as shown in Figure 11.4: It is clear

that the graph of ψ(x− ct) (seen as a function of x for each fixed t) maintains the same “shape” or

“profile” but “propagates” or “undulates” to the right with increasing t, that is we have a “wave”

which moves to the right, and if we watch the “maximum” point A, located at α0 when t = 0, we

see that it shifts to α0 + ct1 at t = t1, and then shifts to α0 + ct2 at t = t2, which indicates that the

wave moves to the right at speed c. This confirms what we noted above, namely that the constant

c appearing in the wave equation (11.3.32) has something to do with speed or velocity. As we have

already noted, at each fixed instant t the “wave function” u(t, x, y, z) is constant for all (x, y, z)


Figure 11.3: Profile of ψ(α) versus α

in the planar surface Pα (the constant value being ψ(α − ct)). For this reason the wave given by

(11.3.41) is called a plane wave.
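The behaviour described above can be checked numerically. The following is only a minimal sketch under stated assumptions: the Gaussian profile `psi`, the values of `C` and `ALPHA0`, and the helper names `peak_location` and `wave_residual` are illustrative choices, not taken from these notes. It tracks the maximum of ψ(x − ct) at fixed instants and estimates the residual of the one-dimensional wave equation by central differences.

```python
import math

C = 2.0        # wave speed c (illustrative value, an assumption)
ALPHA0 = 1.0   # location of the maximum of psi at t = 0 (an assumption)


def psi(alpha):
    """Bell-shaped C^2 profile with its maximum at ALPHA0 (assumed example)."""
    return math.exp(-(alpha - ALPHA0) ** 2)


def u(t, x):
    """Solution (11.3.41): u depends on (t, x) only through x - c*t."""
    return psi(x - C * t)


def peak_location(t, lo=-10.0, hi=30.0, steps=40001):
    """Grid search for the x that maximizes u(t, x) at the fixed instant t."""
    best_x, best_val = lo, u(t, lo)
    for k in range(1, steps):
        x = lo + (hi - lo) * k / (steps - 1)
        val = u(t, x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x


def wave_residual(t, x, h=1e-3):
    """Central-difference estimate of u_tt - c^2 u_xx (should be close to 0)."""
    u_tt = (u(t + h, x) - 2.0 * u(t, x) + u(t - h, x)) / h ** 2
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / h ** 2
    return u_tt - C ** 2 * u_xx


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        print(t, peak_location(t), ALPHA0 + C * t)
    print(wave_residual(0.7, 2.0))
```

Up to the grid spacing, the located maximum should agree with α0 + ct, which is the statement that the profile moves to the right at speed c.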

Let us now return to the more general solution of the wave equation given by (11.3.37) in terms

of a general unit vector nnn (rather than the special unit vector at (11.3.40)). We shall associate with

a generic point (x, y, z) in R3 the vector rrr in the usual way, that is

(11.3.45) rrr = xiii+ yjjj + zkkk.

Let OA denote the straight line in R3 collinear with nnn and fix some real constant α. Then αnnn is a

point on OA. Let Pα be the plane in R3 which is perpendicular to the line OA and intersects line

OA at the point αnnn (see Figure 11.5). It is then evident that, for each rrr = xiii+ yjjj + zkkk in Pα, the

vector rrr−αnnn must be perpendicular to the line OA, and in particular rrr−αnnn must be perpendicular

to nnn (since OA is collinear with nnn) that is

(rrr − αnnn) · nnn = 0 for each point rrr = xiii+ yjjj + zkkk in Pα

or, since nnn · nnn = 1,

(11.3.46) rrr · nnn = (αnnn) · nnn = αnnn · nnn = α for each point rrr = xiii+ yjjj + zkkk in Pα.


Figure 11.4: ψ(x− ct) versus x at instants t = 0, t = t1 and t = t2 for 0 < t1 < t2

That is, for each point rrr = xiii+ yjjj + zkkk in Pα we have

n1x+ n2y + n3z = nnn · rrr (from (11.3.36))

= α (from (11.3.46)).(11.3.47)

In view of (11.3.47) and (11.3.37) we obtain the following: for each fixed instant t

(11.3.48) u(t, x, y, z) = ψ(α− ct) for each point (x, y, z) in Pα.

In view of (11.3.48) we see that it is enough to look at the dependence of ψ(α − ct) on α (which

indicates displacement along line OA) for each fixed t in order to understand the whole function u

given by (11.3.37). This dependence is obviously pretty similar to the dependence we saw at Figure

11.4, but for completeness we illustrate matters in Figure 11.6, in which we show the dependence of

ψ(α−ct) on α for the fixed instants t = 0, t = t1, and t = t2 (recall (11.3.44)). It is clear from Figure

11.6 and (11.3.48) that the wave given by (11.3.37) “propagates” or “undulates” in the direction of

the line OA, that is in the direction of the unit normal nnn, at a speed c. Moreover it follows from

(11.3.48) that, at each fixed instant t, the scalar field u(t, x, y, z) has the constant value ψ(α − ct)
for all (x, y, z) in the plane Pα, that is the wave fronts are the planes Pα, and therefore the wave

described by the solution u(t, x, y, z) at (11.3.37) is again a plane wave.
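The two facts just established, namely that u is constant on each plane Pα and that u satisfies the wave equation for a general unit vector, can be checked numerically. This is only a sketch under stated assumptions: the unit vector `N`, the profile `psi`, the speed `C` and the helper names are illustrative and do not come from these notes.

```python
import math

C = 3.0                            # wave speed c (illustrative value)
N = (2.0 / 7.0, 3.0 / 7.0, 6.0 / 7.0)   # unit vector: (4 + 9 + 36) / 49 = 1
# Two vectors perpendicular to N, used to move around inside the plane P_alpha.
IN_PLANE_A = (3.0, -2.0, 0.0)
IN_PLANE_B = (6.0, 0.0, -2.0)


def psi(alpha):
    """Any C^2 function will do; this one is an arbitrary smooth example."""
    return math.sin(alpha) + 0.5 * math.cos(2.0 * alpha)


def u(t, x, y, z):
    """Plane wave (11.3.37): psi evaluated at n . r - c t."""
    return psi(N[0] * x + N[1] * y + N[2] * z - C * t)


def point_on_plane(alpha, s, q):
    """The point alpha*n + s*a + q*b, which lies in the plane P_alpha."""
    return tuple(alpha * N[i] + s * IN_PLANE_A[i] + q * IN_PLANE_B[i]
                 for i in range(3))


def second_diff(g, h=1e-3):
    """Central second difference of a one-variable function g at 0."""
    return (g(h) - 2.0 * g(0.0) + g(-h)) / h ** 2


def wave_residual(t, x, y, z):
    """Estimate of u_tt - c^2 (u_xx + u_yy + u_zz), expected to be near 0."""
    u_tt = second_diff(lambda d: u(t + d, x, y, z))
    laplacian = (second_diff(lambda d: u(t, x + d, y, z))
                 + second_diff(lambda d: u(t, x, y + d, z))
                 + second_diff(lambda d: u(t, x, y, z + d)))
    return u_tt - C ** 2 * laplacian


if __name__ == "__main__":
    alpha, t = 1.3, 0.4
    for s, q in ((0.0, 0.0), (1.0, -2.0), (-3.0, 0.5)):
        print(u(t, *point_on_plane(alpha, s, q)), psi(alpha - C * t))
    print(wave_residual(0.4, 0.2, -0.7, 1.1))
```

Every point of Pα printed above should give the same value ψ(α − ct), in line with (11.3.48).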


Figure 11.5: The plane Pα is perpendicular to OA and intersects OA at αnnn

We have shown that any time varying scalar field u(t, x, y, z) given by (11.3.37), subject to

(11.3.36), qualifies as a solution of the wave equation (11.3.32), and furthermore the waves described

by u(t, x, y, z) are plane waves with wave fronts perpendicular to the unit vector nnn. Of particular

importance is the case where the external factors setting up the wave dictate that the solution of

the wave equation (11.3.32) be in the specific form

(11.3.49) u(t, x, y, z) = η(x, y, z) cos(ωt),

in which ω is a constant angular frequency and η : R3 → R is some C2-function. Thus u(t, x, y, z)

at (11.3.49) is periodic in time at each fixed point (x, y, z), and furthermore is “separated” in the

sense that the dependence of u on the space variables (x, y, z) is “locked” in the function η, which

does not depend on time t, whereas the dependence of u on time t is locked in the periodic function

cos(ωt), which does not depend on the space variables (x, y, z). Such separated functions with

sinusoidal dependence on time t occur very naturally in applications. We are going to see that, if

the “separated” solution u at (11.3.49) has planar wave fronts which are perpendicular to some unit

vector nnn, then u can be put in the general form of (11.3.37) in which the function ψ is also periodic,

so that sinusoidal dependence in the time variable t, as at (11.3.49), leads to sinusoidal dependence

in the space variables (x, y, z). First note that life is a lot easier if, at (11.3.49), we represent the


Figure 11.6: Dependence of ψ(α− ct) on α for t = 0, t1, t2 with 0 < t1 < t2

sinusoid cos(ωt) in exponential form using Euler’s formula, so that we write (11.3.49) as follows

\[
u(t,x,y,z) = \eta(x,y,z)\,e^{-j\omega t},
\tag{11.3.50}
\]

remembering that mentally we just take the real part of the right side of (11.3.50). From (11.3.50)

we have

\[
(\nabla^2 u)(t,x,y,z) = (\nabla^2\eta)(x,y,z)\,e^{-j\omega t},
\tag{11.3.51}
\]

(see (9.1.42)), and it is immediate that

\[
\frac{\partial^2 u(t,x,y,z)}{\partial t^2} = -\omega^2\,\eta(x,y,z)\,e^{-j\omega t}.
\tag{11.3.52}
\]

Putting (11.3.52) and (11.3.51) into (11.3.33) then gives

\[
\big[c^2(\nabla^2\eta)(x,y,z) + \omega^2\eta(x,y,z)\big]\,e^{-j\omega t} = 0,
\]

so that

\[
(\nabla^2\eta)(x,y,z) + \kappa^2\,\eta(x,y,z) = 0,
\tag{11.3.53}
\]


where we have defined

\[
\kappa := \frac{\omega}{c}.
\tag{11.3.54}
\]

We must now determine scalar fields η(x, y, z) which satisfy (11.3.53), since each such scalar field

substituted into the right side of (11.3.50) leads to a solution of the wave equation (11.3.32). To

this end fix some complex constant γ given by

\[
\gamma = A e^{j\theta},
\tag{11.3.55}
\]

fix any unit vector nnn in the form of (11.3.36), and define

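\[
\begin{aligned}
\eta(x,y,z) &:= \gamma\exp\{j\kappa\,\mathbf{n}\cdot(x\mathbf{i}+y\mathbf{j}+z\mathbf{k})\} \\
&= \gamma\exp\{j\kappa(n_1x+n_2y+n_3z)\}.
\end{aligned}
\tag{11.3.56}
\]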

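From (11.3.56)

\[
\frac{\partial\eta(x,y,z)}{\partial x} = \gamma(j\kappa n_1)\exp\{j\kappa(n_1x+n_2y+n_3z)\},
\]

and therefore

\[
\begin{aligned}
\frac{\partial^2\eta(x,y,z)}{\partial x^2} &= \gamma(j\kappa n_1)^2\exp\{j\kappa(n_1x+n_2y+n_3z)\} \\
&= -\gamma\kappa^2(n_1)^2\exp\{j\kappa(n_1x+n_2y+n_3z)\}.
\end{aligned}
\tag{11.3.57}
\]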

Similarly

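\[
\frac{\partial^2\eta(x,y,z)}{\partial y^2} = -\gamma\kappa^2(n_2)^2\exp\{j\kappa(n_1x+n_2y+n_3z)\},
\tag{11.3.58}
\]

and

\[
\frac{\partial^2\eta(x,y,z)}{\partial z^2} = -\gamma\kappa^2(n_3)^2\exp\{j\kappa(n_1x+n_2y+n_3z)\}.
\tag{11.3.59}
\]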

From Definition 9.1.18 together with (11.3.57) - (11.3.59) we obtain

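\[
\begin{aligned}
(\nabla^2\eta)(x,y,z) &= \frac{\partial^2\eta(x,y,z)}{\partial x^2} + \frac{\partial^2\eta(x,y,z)}{\partial y^2} + \frac{\partial^2\eta(x,y,z)}{\partial z^2} \\
&= -\gamma\kappa^2\big[(n_1)^2+(n_2)^2+(n_3)^2\big]\exp\{j\kappa(n_1x+n_2y+n_3z)\} \\
&= -\gamma\kappa^2\exp\{j\kappa(n_1x+n_2y+n_3z)\} && \text{(since $\mathbf{n}$ is a unit vector)} \\
&= -\kappa^2\,\eta(x,y,z) && \text{(from (11.3.56))}.
\end{aligned}
\tag{11.3.60}
\]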


We see from (11.3.60) that, if η is defined by (11.3.56) in terms of any complex number γ and

any unit vector nnn, then η satisfies (11.3.53). Upon combining (11.3.56) and (11.3.50) we find that

solutions u(t, x, y, z) of the wave equation (11.3.32) having the special form (11.3.50) are given by

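\[
\begin{aligned}
u(t,x,y,z) &= \gamma\exp\{j\kappa(n_1x+n_2y+n_3z)\}\,e^{-j\omega t} \\
&= \gamma\exp\{j[\kappa(n_1x+n_2y+n_3z)-\omega t]\} \\
&= \gamma\exp\{j[\kappa\,\mathbf{n}\cdot(x\mathbf{i}+y\mathbf{j}+z\mathbf{k})-\omega t]\} \\
&= \gamma\exp\{j\kappa[\mathbf{n}\cdot(x\mathbf{i}+y\mathbf{j}+z\mathbf{k})-ct]\}.
\end{aligned}
\tag{11.3.61}
\]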

Now remembering that we just want the real part on the right side of (11.3.61), we obtain the

solutions

\[
u(t,x,y,z) = A\cos\{\kappa[\mathbf{n}\cdot(x\mathbf{i}+y\mathbf{j}+z\mathbf{k})-ct]+\theta\}
\tag{11.3.62}
\]

in which the amplitude A and phase angle θ follow from (11.3.55). We see that u at (11.3.62) is in

exactly the form of (11.3.37) with ψ given by

(11.3.63) ψ(α) := A cos(κα + θ) for all α.

Notice that ψ(α) is a periodic function of α with period given by

\[
\frac{2\pi}{\kappa} = \frac{2\pi c}{\omega},
\tag{11.3.64}
\]

as follows from (11.3.54). This is the spatial periodicity of the solution u at (11.3.62) in the direction

of the unit vector nnn for every fixed t, usually called the wavelength of the wave. Finally, one sees

from (11.3.62) that, at each point (x, y, z) in R3, the time periodicity is given by

\[
\frac{2\pi}{\kappa c} = \frac{2\pi}{\omega},
\tag{11.3.65}
\]

which of course just confirms the time periodicity we began with at (11.3.49).
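As a numerical illustration of (11.3.61) to (11.3.65), the sketch below compares the real part of the complex exponential form with the cosine form, and checks the spatial period 2π/κ along the unit vector and the time period 2π/ω. All numerical values (`C`, `OMEGA`, `AMP`, `THETA`, `N`) and the function names are assumptions chosen for illustration only.

```python
import cmath
import math

C = 2.0                  # wave speed c (illustrative value)
OMEGA = 5.0              # angular frequency omega (illustrative value)
KAPPA = OMEGA / C        # wave number, as in (11.3.54)
AMP, THETA = 2.0, 0.3    # amplitude A and phase angle theta of gamma
GAMMA = AMP * cmath.exp(1j * THETA)   # complex constant, as in (11.3.55)
N = (0.0, 0.6, 0.8)      # unit vector: 0.36 + 0.64 = 1

WAVELENGTH = 2.0 * math.pi / KAPPA    # spatial period (11.3.64)
PERIOD = 2.0 * math.pi / OMEGA        # time period (11.3.65)


def n_dot_r(x, y, z):
    return N[0] * x + N[1] * y + N[2] * z


def u_complex(t, x, y, z):
    """Complex form (11.3.61): gamma * exp(j[kappa n.r - omega t])."""
    return GAMMA * cmath.exp(1j * (KAPPA * n_dot_r(x, y, z) - OMEGA * t))


def u_real(t, x, y, z):
    """Cosine form (11.3.62): A cos(kappa [n.r - c t] + theta)."""
    return AMP * math.cos(KAPPA * (n_dot_r(x, y, z) - C * t) + THETA)


if __name__ == "__main__":
    t, x, y, z = 0.3, 1.0, -2.0, 0.5
    print(u_complex(t, x, y, z).real, u_real(t, x, y, z))
    # Moving one wavelength along n, or waiting one period, repeats the value.
    shifted = (x + WAVELENGTH * N[0], y + WAVELENGTH * N[1], z + WAVELENGTH * N[2])
    print(u_real(t, *shifted), u_real(t + PERIOD, x, y, z))
```

Note that the wavelength equals c times the period, which is another way of reading (11.3.64) together with (11.3.65).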

In the present remark we have summarized the essential aspects of the wave equation (11.3.32)

that we shall need for addressing the wave equations for the electric and magnetic fields at (11.3.31)

that we obtained from Maxwell’s equations. In brief, we have learned that scalar fields u of the

form at (11.3.37), defined in terms of a given unit vector nnn and a C2-function ψ (recall (11.3.36)),

constitute solutions of the wave equation (11.3.32). Moreover, these solutions have planar wave

fronts comprising planes that are perpendicular to the vector nnn, and the wave represented by u

moves in the direction of nnn at a speed c (see Figure 11.5 and Figure 11.6). Finally, if “external


conditions” are such that the solution u not only has planar wave fronts perpendicular to a given

unit vector nnn, but also has the special separated form at (11.3.49), with time periodicity 2π/ω, then

ψ is also periodic with periodicity at (11.3.64), this being the spatial periodicity (or wavelength) of

the wave in the direction of nnn.

We return to the wave equations at (11.3.31) for the three scalar components Er and Br, r =

1, 2, 3. Each of these equations matches (11.3.32), u being identified in turn with each of the Er

and Br, and the constant c being the speed of light at (11.3.24). It follows that all properties of

wave equations established in Remark 11.3.2 immediately carry over to the equations (11.3.31). In

particular, each of the scalar components Er and Br propagates through space as a wave moving at

the speed of light! (Although we shall not pursue the matter here one can use Maxwell’s equations

to demonstrate that light is an electromagnetic wave in which the electric and magnetic fields are

governed by the source free equations (11.3.18)). Of particular interest in applications is the case

where the voltage source for the antenna in Figure 11.1 is sinusoidal, that is

(11.3.66) v(t) = V cos(ωt).

One can show by a rather deep analysis involving Maxwell’s equations that with this sinusoidal

voltage source the solutions Er and Br of the wave equations at (11.3.31) have the special separated

form of (11.3.49) that is

\[
E_r(t,x,y,z) = \eta(x,y,z)\,e^{-j\omega t},
\tag{11.3.67}
\]

(with a similar formula for the Br). In view of Remark 11.3.2 it then follows that we have plane

wave solutions of the form (compare (11.3.62))

\[
E_r(t,x,y,z) = A\cos\{\kappa[\mathbf{n}\cdot(x\mathbf{i}+y\mathbf{j}+z\mathbf{k})-ct]+\theta\},
\tag{11.3.68}
\]

(with a similar formula for the Br), in which the wave number κ is given by (11.3.54) with c being

the speed of light at (11.3.24). The amplitude A, unit vector nnn and phase angle θ in (11.3.68) are

functions of the amplitude V in (11.3.66) and the geometry of the antenna. These matters are

addressed in the theory of antennas, which is itself built on Maxwell’s equations.

11.4 Electromagnetic Waves with Sources

In Section 11.3 we addressed the source-free Maxwell equations (11.3.18). The identity (9.1.62) for

the “double curl” of a vector field was used to show that the electric and magnetic fields EEE and BBB


given by the source free equations (11.3.18) are easily reduced to the wave equations (11.3.31), each

of which can be separately “solved” for the components Er and Br according to Remark 11.3.2. In

the present section we return to Maxwell’s equations (11.2.16) with sources, which we repeat here:

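\[
\begin{aligned}
&\text{(a)}\quad (\nabla\cdot\mathbf{E}) = \frac{1}{\varepsilon_0}\rho, \\
&\text{(b)}\quad (\nabla\cdot\mathbf{B}) = 0, \\
&\text{(c)}\quad (\nabla\times\mathbf{E}) + \frac{\partial\mathbf{B}}{\partial t} = 0, \\
&\text{(d)}\quad (\nabla\times\mathbf{B}) = \mu_0\mathbf{J} + \varepsilon_0\mu_0\frac{\partial\mathbf{E}}{\partial t}.
\end{aligned}
\tag{11.4.69}
\]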

There are many reasons why we want to study Maxwell’s equations (11.4.69) with sources present.

For example, in Remark 11.3.1 we briefly noted that a voltage source connected to an antenna

sets up charge and current density fields ρ and JJJ within the antenna. In order to design efficient

and properly functioning antennas not only must we understand the EEE and BBB fields outside the

antenna (something we addressed in Section 11.3), but we must also thoroughly understand these

fields within the antenna and how they are related to the source fields ρ and JJJ created inside the

antenna by the voltage source. All of this is given by Maxwell’s equations (11.4.69).

The presence of the charge and current density sources ρ and JJJ in (11.4.69)(a)(d) means that

we cannot use the rather simple approach of Section 11.3 for the source free case to obtain wave

equations for Er and Br similar to (11.3.31). In fact, equations (11.4.69) with sources present, are

definitely more challenging than the source free equations (11.3.18). To get a possible clue on how to

proceed let us recall Remark 10.2.11, in which we addressed the problem of extracting the magnetic

field BBB from the relations (10.2.50) in terms of the source JJJ (for time constant fields), and for which

we used the very clever idea of a Coulomb gauge. Equations (11.4.69) present us with a problem

which is not unlike the problem of Remark 10.2.11, in that we want to extract fields (now both an

electric field EEE and a magnetic field BBB) from the relations (11.4.69) in terms of the given sources

ρ and JJJ , in the same way that we extracted the magnetic field BBB from the relations (10.2.50) in

terms of the source JJJ . Of course the set of equations (11.4.69) is clearly more complicated than the

set of equations (10.2.50), and also involves time varying fields in contrast to the time static case of

Remark 10.2.11, so the problem we are dealing with now is obviously significantly more challenging

than the problem addressed in Remark 10.2.11. It turns out, however, that the clever idea of a

gauge, which proved so effective in Remark 10.2.11, can actually be extended to the present problem

as well. Exactly as was the case in Remark 10.2.11 the magic key to all this is to be found in the

vector calculus we have learned.


From (11.4.69)(b) with Theorem 10.2.7 we can put

(11.4.70) BBB = ∇×AAA

for some vector field AAA called the magnetic vector potential of BBB (much as at (10.2.51) in the case

of static magnetic fields). Then

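\[
\begin{aligned}
\frac{\partial\mathbf{B}}{\partial t} &= \frac{\partial(\nabla\times\mathbf{A})}{\partial t} && \text{(from (11.4.70))} \\
&= \nabla\times\left[\frac{\partial\mathbf{A}}{\partial t}\right] && \text{(from (9.1.53))},
\end{aligned}
\tag{11.4.71}
\]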

and then

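\[
\begin{aligned}
\nabla\times\left[\mathbf{E}+\frac{\partial\mathbf{A}}{\partial t}\right]
&= \nabla\times\mathbf{E} + \nabla\times\left[\frac{\partial\mathbf{A}}{\partial t}\right] && \text{(from (9.1.61))} \\
&= \nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0 && \text{(from (11.4.71) and (11.4.69)(c))},
\end{aligned}
\]

that is

\[
\nabla\times\left[\mathbf{E}+\frac{\partial\mathbf{A}}{\partial t}\right] = 0.
\tag{11.4.72}
\]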

From Theorem 9.1.9 we see that the vector field in square brackets in (11.4.72) is conservative, and

so, from Definition 6.2.1, there is some scalar field Ψ such that

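\[
\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} = -\nabla\Psi,
\]

that is

\[
\mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t} - \nabla\Psi.
\tag{11.4.73}
\]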

We know in fact that there are actually many vector fields AAA such that (11.4.70) holds, and many

scalar fields Ψ such that (11.4.73) holds. It turns out that we can actually find a rather special

vector field AAA and a rather special scalar field Ψ such that

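\[
\mathbf{B} = \nabla\times\mathbf{A}, \qquad
\mathbf{E} = -\left[\frac{\partial\mathbf{A}}{\partial t}+\nabla\Psi\right], \qquad
(\nabla\cdot\mathbf{A}) + \varepsilon_0\mu_0\frac{\partial\Psi}{\partial t} = 0.
\tag{11.4.74}
\]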

While there are many vector fields AAA and scalar fields Ψ which satisfy the first two relations of

(11.4.74), it is still far from obvious that we can choose these fields AAA and Ψ in such a way that the


third relation of (11.4.74) is also satisfied. Later we will demonstrate that we can in fact do this,

but now let us just suppose we have fields AAA and Ψ satisfying all of (11.4.74) at our disposal and see

how we can use these fields to “solve” Maxwell’s equations (11.4.69) for the electric and magnetic

fields EEE and BBB. From the first two relations of (11.4.74) and (11.4.69)(d) we get

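\[
\nabla\times(\nabla\times\mathbf{A}) = \mu_0\mathbf{J} - \varepsilon_0\mu_0\frac{\partial}{\partial t}\left[\frac{\partial\mathbf{A}}{\partial t}+\nabla\Psi\right].
\tag{11.4.75}
\]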

Now expand the left side of (11.4.75) by the identity (9.1.62) to obtain

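\[
\nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} = \mu_0\mathbf{J} - \varepsilon_0\mu_0\frac{\partial}{\partial t}\left[\frac{\partial\mathbf{A}}{\partial t}+\nabla\Psi\right],
\]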

that is

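\[
\nabla^2\mathbf{A} - \varepsilon_0\mu_0\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla\left[(\nabla\cdot\mathbf{A}) + \varepsilon_0\mu_0\frac{\partial\Psi}{\partial t}\right] = -\mu_0\mathbf{J}.
\tag{11.4.76}
\]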

But the quantity in square brackets on the left side of (11.4.76) is identically zero, by the third

relation of (11.4.74), so that (11.4.76) simplifies to

\[
\nabla^2\mathbf{A} - \varepsilon_0\mu_0\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu_0\mathbf{J}.
\tag{11.4.77}
\]

Now (11.4.77) is a very nice relation indeed. In fact, recalling the componentwise expansion of the

vector fields JJJ and AAA, namely

(11.4.78) JJJ = J1iii+ J2jjj + J3kkk, AAA = A1iii+ A2jjj + A3kkk,

it is immediate that we have

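\[
\frac{\partial^2\mathbf{A}}{\partial t^2} = \frac{\partial^2 A_1}{\partial t^2}\mathbf{i} + \frac{\partial^2 A_2}{\partial t^2}\mathbf{j} + \frac{\partial^2 A_3}{\partial t^2}\mathbf{k}.
\tag{11.4.79}
\]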

Furthermore, recall the componentwise expansion of ∇2AAA from Definition 9.1.19, that is

(11.4.80) (∇2AAA) = (∇2A1)iii+ (∇2A2)jjj + (∇2A3)kkk,

(as follows from (9.1.31) with AAA in place of FFF ). In view of (11.4.78), (11.4.79) and (11.4.80), we can

equate the scalar components in the vector relation (11.4.77) to get the scalar relations

\[
\nabla^2 A_i - \varepsilon_0\mu_0\frac{\partial^2 A_i}{\partial t^2} = -\mu_0 J_i,
\tag{11.4.81}
\]


for i = 1, 2, 3. Observe that, in the case of time constant fields, the double t-derivative in (11.4.81)

is identically zero so that (11.4.81) reduces at once to the Poisson equations (10.2.59) that we

already obtained in Remark 10.2.11 in which we addressed the problem of static fields. In Remark

10.2.11 we saw that the Poisson equations (10.2.59) have nice solutions (see (10.2.60)). Can we

get comparably explicit solutions of the equations at (11.4.81) which are effectively “time varying”

Poisson equations featuring a double t-derivative of Ai in addition to the usual Laplacian of Ai?

There is indeed such an explicit formula for the solutions Ai of (11.4.81), which can be obtained

by the method of Green’s functions applied to the equations (11.4.81). The method of Green’s

functions is part of the theory of partial differential equations and is somewhat outside the scope of

this course. Accordingly, we shall not develop this method here but will just state the result that

one gets for the solutions of (11.4.81), namely

\[
A_i(t,x,y,z) = \frac{\mu_0}{4\pi}\int_{\mathbb{R}^3}
\frac{J_i\big(t-\sqrt{\varepsilon_0\mu_0[(x-u)^2+(y-v)^2+(z-w)^2]},\,u,v,w\big)}
{[(x-u)^2+(y-v)^2+(z-w)^2]^{1/2}}\,du\,dv\,dw,
\tag{11.4.82}
\]

for all (t, x, y, z), which gives each $A_i$ in terms of the (known) time varying current density
components $J_i(t,x,y,z)$. Notice that the numerator in the integrand of (11.4.82) is obtained by
substituting the quadruple of numbers $\big(t-\sqrt{\varepsilon_0\mu_0[(x-u)^2+(y-v)^2+(z-w)^2]},\,u,v,w\big)$ in place of
the generic variable (t, x, y, z) in $J_i(t,x,y,z)$. When $\mathbf{J}$ is time constant, so that we have $J_i(x,y,z)$
instead of $J_i(t,x,y,z)$, there is no “place” for the number $t-\sqrt{\varepsilon_0\mu_0[(x-u)^2+(y-v)^2+(z-w)^2]}$,
which is substituted into the “first argument” corresponding to the variable t, so the numerator
just reduces to $J_i(u,v,w)$, which is precisely what we have at (10.2.60).
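The time shift in the numerator of (11.4.82) has a simple physical reading: since the wave speed is 1/√(ε0µ0), the quantity √(ε0µ0) times the distance from the source point (u, v, w) to the field point (x, y, z) is the time a wave needs to cover that distance. The sketch below computes this delay; the numerical constants are standard published values for the vacuum permittivity and permeability, and the function names are hypothetical helpers introduced only for this illustration.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m (standard published value)
MU0 = 1.25663706212e-6    # vacuum permeability in H/m (standard published value)


def distance(field_point, source_point):
    """Euclidean distance between (x, y, z) and (u, v, w)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(field_point, source_point)))


def propagation_delay(field_point, source_point):
    """sqrt(eps0 * mu0 * [(x-u)^2 + (y-v)^2 + (z-w)^2]), i.e. distance / c."""
    return math.sqrt(EPS0 * MU0) * distance(field_point, source_point)


def shifted_time(t, field_point, source_point):
    """First argument substituted into J_i (or rho) in the integrand."""
    return t - propagation_delay(field_point, source_point)


def wave_speed():
    """The speed c = 1 / sqrt(eps0 * mu0)."""
    return 1.0 / math.sqrt(EPS0 * MU0)


if __name__ == "__main__":
    print(wave_speed())                                   # close to 2.998e8 m/s
    print(propagation_delay((3.0, 4.0, 0.0), (0.0, 0.0, 0.0)))
    print(shifted_time(1.0e-6, (3.0, 4.0, 0.0), (0.0, 0.0, 0.0)))
```

In words, the value of the source that matters at the field point at time t is the value it had at the slightly earlier time returned by `shifted_time`.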

In view of (11.4.82) we have now obtained the vector field AAA, and by calculating the curl of AAA

we get the magnetic field BBB from the first relation of (11.4.74). It remains to determine the electric

field EEE. For this we observe

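\[
\begin{aligned}
\frac{1}{\varepsilon_0}\rho &= \nabla\cdot\mathbf{E} && \text{(from (11.4.69)(a))} \\
&= \nabla\cdot\left[-\frac{\partial\mathbf{A}}{\partial t}-\nabla\Psi\right] && \text{(from the second relation of (11.4.74))} \\
&= -\nabla\cdot\left[\frac{\partial\mathbf{A}}{\partial t}\right] - \nabla^2\Psi \\
&= -\frac{\partial(\nabla\cdot\mathbf{A})}{\partial t} - \nabla^2\Psi,
\end{aligned}
\]

that is

\[
\frac{\partial(\nabla\cdot\mathbf{A})}{\partial t} + \nabla^2\Psi = -\frac{1}{\varepsilon_0}\rho.
\tag{11.4.83}
\]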


From the third relation of (11.4.74) we have

\[
(\nabla\cdot\mathbf{A}) = -\varepsilon_0\mu_0\frac{\partial\Psi}{\partial t}
\]

and putting this in (11.4.83) we find

\[
\nabla^2\Psi - \varepsilon_0\mu_0\frac{\partial^2\Psi}{\partial t^2} = -\frac{1}{\varepsilon_0}\rho.
\tag{11.4.84}
\]

This equation is of exactly the same form as (11.4.81), except that the “source” term on the right

hand side involves the given charge density ρ instead of the given component Ji of the current

density JJJ , and the “thing” we must extract from (11.4.84) is the scalar field Ψ. Applying the

method of Green’s functions to (11.4.84) we find that Ψ is given by

\[
\Psi(t,x,y,z) = \frac{1}{4\pi\varepsilon_0}\int_{\mathbb{R}^3}
\frac{\rho\big(t-\sqrt{\varepsilon_0\mu_0[(x-u)^2+(y-v)^2+(z-w)^2]},\,u,v,w\big)}
{[(x-u)^2+(y-v)^2+(z-w)^2]^{1/2}}\,du\,dv\,dw,
\tag{11.4.85}
\]

for all (t, x, y, z), which gives Ψ in terms of the (known) time varying charge density field ρ(t, x, y, z).

It remains to determine the electric field EEE, but now we have all the information required for this.

We just determine ∇Ψ from Ψ at (11.4.85), and determine (∂AAA)/(∂t) from AAA given by (11.4.82),

and then EEE is given by the second relation of (11.4.74).

It remains to address the all-important question of how to find some vector field AAA and some

scalar field Ψ which satisfies all three relations of (11.4.74). The situation is not unlike that which

we faced when we looked at static magnetic fields in Remark 10.2.11, only there we had the simpler

task of just choosing a vector field AAA such that the two relations at (10.2.52) hold, whereas here we

must choose both a vector field AAA and a scalar field Ψ such that all three relations in (11.4.74) hold,

obviously a more difficult task. Nevertheless, we will proceed by much the same sort of clever idea

we used in Remark 10.2.11, namely a gauge transformation. To define the gauge transformation we

fix any vector field AAA such that (11.4.70) holds and fix any scalar field Ψ such that (11.4.73) holds.

We know that there are actually many vector fields AAA such that (11.4.70) holds and many scalar

fields Ψ such that (11.4.73) holds; for now we just make an arbitrary choice from among the myriad

possibilities of some AAA satisfying (11.4.70) and some Ψ satisfying (11.4.73), and stick with these

choices from now on (this is similar to what we did in Remark 10.2.11, except that there things

were simpler in that we just had to arbitrarily fix a vector field AAA and not worry about a scalar

field Ψ). Now suppose that f is a time varying scalar field which satisfies the relation

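\[
\nabla^2 f - \varepsilon_0\mu_0\frac{\partial^2 f}{\partial t^2} = -\left[(\nabla\cdot\mathbf{A}) + \varepsilon_0\mu_0\frac{\partial\Psi}{\partial t}\right],
\tag{11.4.86}
\]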


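in which the $\mathbf{A}$ and $\Psi$ on the right-hand side are the vector and scalar fields we have just chosen,
and define the time varying vector field $\tilde{\mathbf{A}}$ and the time varying scalar field $\tilde{\Psi}$ in terms of $\mathbf{A}$, $\Psi$ and
$f$ as follows:

\[
\tilde{\mathbf{A}} := \mathbf{A} + \nabla f, \qquad \tilde{\Psi} := \Psi - \frac{\partial f}{\partial t}.
\tag{11.4.87}
\]

We shall now see that $\tilde{\mathbf{A}}$ and $\tilde{\Psi}$ defined at (11.4.87) satisfy all relations at (11.4.74), with $\tilde{\mathbf{A}}$ and $\tilde{\Psi}$
in the roles of the vector and scalar fields appearing there. This incredibly clever transformation of
$\mathbf{A}$ into $\tilde{\mathbf{A}}$ and $\Psi$ into $\tilde{\Psi}$ in terms of the function $f$ satisfying (11.4.86) is
called the Lorentz gauge transformation. We get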

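\[
\begin{aligned}
\nabla\times\tilde{\mathbf{A}} &= \nabla\times[\mathbf{A}+\nabla f] && \text{(from (11.4.87))} \\
&= \nabla\times\mathbf{A} + \nabla\times(\nabla f) && \text{(from (9.1.61))} \\
&= \nabla\times\mathbf{A} + 0 && \text{(from Theorem 9.1.13)} \\
&= \mathbf{B} && \text{(from (11.4.70))},
\end{aligned}
\]

that is

\[
\mathbf{B} = \nabla\times\tilde{\mathbf{A}},
\tag{11.4.88}
\]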

so that the first relation of (11.4.74) is established. Moreover

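\[
\begin{aligned}
\frac{\partial\tilde{\mathbf{A}}}{\partial t} + \nabla\tilde{\Psi}
&= \frac{\partial}{\partial t}[\mathbf{A}+\nabla f] + \nabla\left[\Psi - \frac{\partial f}{\partial t}\right] && \text{(from (11.4.87))} \\
&= \frac{\partial\mathbf{A}}{\partial t} + \frac{\partial(\nabla f)}{\partial t} + \nabla\Psi - \nabla\left[\frac{\partial f}{\partial t}\right] \\
&= \frac{\partial\mathbf{A}}{\partial t} + \frac{\partial(\nabla f)}{\partial t} + \nabla\Psi - \frac{\partial(\nabla f)}{\partial t} \\
&= \frac{\partial\mathbf{A}}{\partial t} + \nabla\Psi \\
&= -\mathbf{E} && \text{(from (11.4.73))},
\end{aligned}
\]

that is

\[
\mathbf{E} = -\left[\frac{\partial\tilde{\mathbf{A}}}{\partial t} + \nabla\tilde{\Psi}\right],
\tag{11.4.89}
\]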

which gives the second relation of (11.4.74). As for the third relation of (11.4.74), we have

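\[
\begin{aligned}
(\nabla\cdot\tilde{\mathbf{A}}) + \varepsilon_0\mu_0\frac{\partial\tilde{\Psi}}{\partial t}
&= \nabla\cdot[\mathbf{A}+\nabla f] + \varepsilon_0\mu_0\frac{\partial}{\partial t}\left[\Psi - \frac{\partial f}{\partial t}\right] && \text{(from (11.4.87))} \\
&= \nabla\cdot\mathbf{A} + \nabla^2 f + \varepsilon_0\mu_0\frac{\partial\Psi}{\partial t} - \varepsilon_0\mu_0\frac{\partial^2 f}{\partial t^2} \\
&= 0 && \text{(from (11.4.86))},
\end{aligned}
\tag{11.4.90}
\]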

as required.
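The first two relations just verified do not use (11.4.86) at all: they hold for any smooth function f, and this can be checked numerically. The sketch below is only an illustration under stated assumptions. The fields `A`, `Psi` and the function `f` are arbitrary smooth examples (in particular this `f` is not claimed to solve (11.4.86), so the third relation is not tested), all derivatives are central differences, and the names `A_tilde` and `Psi_tilde` stand for the transformed fields of (11.4.87).

```python
import math

H = 1e-4   # step for central differences; points are tuples p = (t, x, y, z)


def A(t, x, y, z):
    """An arbitrary smooth vector potential (assumed example)."""
    return (math.sin(y) * math.cos(t), x * z + t, math.exp(-x * x) * t * t)


def Psi(t, x, y, z):
    """An arbitrary smooth scalar potential (assumed example)."""
    return x * y * math.sin(t) + z * z


def f(t, x, y, z):
    """An arbitrary smooth gauge function (not a solution of (11.4.86))."""
    return math.sin(x + 2.0 * y) * math.cos(z - t)


def partial(g, p, axis):
    """Central difference of scalar g at p along axis (0 = t, 1 = x, 2 = y, 3 = z)."""
    hi, lo = list(p), list(p)
    hi[axis] += H
    lo[axis] -= H
    return (g(*hi) - g(*lo)) / (2.0 * H)


def grad(g, p):
    return tuple(partial(g, p, axis) for axis in (1, 2, 3))


def A_tilde(t, x, y, z):
    """Transformed vector potential: A + grad f."""
    a, gf = A(t, x, y, z), grad(f, (t, x, y, z))
    return tuple(a[i] + gf[i] for i in range(3))


def Psi_tilde(t, x, y, z):
    """Transformed scalar potential: Psi - df/dt."""
    return Psi(t, x, y, z) - partial(f, (t, x, y, z), 0)


def component(F, i):
    return lambda t, x, y, z: F(t, x, y, z)[i]


def curl(F, p):
    """Curl of the vector field F at p (this gives B when F is a vector potential)."""
    d = lambda i, axis: partial(component(F, i), p, axis)
    return (d(2, 2) - d(1, 3), d(0, 3) - d(2, 1), d(1, 1) - d(0, 2))


def electric(F, S, p):
    """-(dF/dt + grad S) at p, as in the second relation of (11.4.74)."""
    gs = grad(S, p)
    return tuple(-(partial(component(F, i), p, 0) + gs[i]) for i in range(3))


if __name__ == "__main__":
    p = (0.3, 0.5, -0.2, 0.8)
    print(curl(A, p), curl(A_tilde, p))                       # same B
    print(electric(A, Psi, p), electric(A_tilde, Psi_tilde, p))   # same E
```

Both pairs of printed vectors should agree up to rounding error, which is the numerical counterpart of (11.4.88) and (11.4.89).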


Chapter 12

Cylindrical and Spherical Coordinates

We are all familiar with the standard Cartesian coordinate system in three dimensional space R3,

in which a vector vvv is expressed in terms of Cartesian coordinates with reference to some fixed set

of mutually orthogonal unit vectors iii, jjj,kkk. Of all the various coordinate systems one can install

in R3 the Cartesian coordinate system is by far the simplest, the most universal and the most

important. There are, nevertheless, some situations for which the Cartesian coordinate system is

not entirely ideal. These typically involve scalar or vector fields which exhibit some kind of inherent

symmetry, such as cylindrical symmetry around a straight axis, or spherical symmetry (also called

radial symmetry) around a fixed point. Such symmetry is an extremely valuable property, which

can enormously simplify the solution of problems, and which therefore should be exploited as much

as possible. The cylindrical and spherical coordinate systems studied here are designed for just

this purpose. An important halfway-house to both of these coordinate systems is the familiar

system of polar coordinates for representing a point in the plane which we recall in the following

section.

12.1 Polar Coordinates

From Figure 12.1 one sees that a point A in the plane can be represented by Cartesian coordinates

comprising a pair of real numbers (x, y) giving the “x-coordinate” and “y-coordinate”. Alternatively,
one can represent the same point A by a distance r from the origin together with an angle
θ relative to the x-axis of the line from the origin to the point A and measured in the
counter-clockwise direction. The point A is again represented by a pair of real numbers, namely (r, θ), but

this pair of course has a very different interpretation from the Cartesian pair (x, y).

Figure 12.1: Polar coordinates

Notice in particular that the Cartesian coordinates (x, y) take values in the range

(12.1.1) −∞ < x <∞, −∞ < y <∞,

while the polar coordinates (r, θ) naturally take values in the range

(12.1.2) 0 ≤ r <∞, 0 ≤ θ < 2π.

In particular, we do not allow the value θ = 2π at (12.1.2) since this merely replicates the case

of θ = 0. The relation between polar and Cartesian coordinates is extremely simple. Indeed, if

point A has polar coordinates (r, θ) with r ≥ 0 and 0 ≤ θ < 2π then the corresponding Cartesian

coordinates (x, y) are of course given by

(12.1.3) x = r cos(θ), y = r sin(θ).

That is, if the point A in the plane is given by the polar coordinates (r, θ) then, in terms of the

Cartesian basis iii, jjj, one sees from (12.1.3) it must be given by the vector

(12.1.4) vvv(r, θ) = r cos(θ)iii+ r sin(θ)jjj.


Effectively, the relation (12.1.4) (i.e. the relations (12.1.3)) tells us how the Cartesian representation

of a point changes when we change its polar coordinates. Conversely, given the Cartesian coordinates

(x, y) of a point, one sees from Figure 12.1 that the corresponding polar coordinates (r, θ) are given

by

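\[
r = \sqrt{x^2+y^2}, \qquad
\theta =
\begin{cases}
\arctan(y/x), & \text{when } x > 0 \text{ and } y \ge 0, \\
\pi + \arctan(y/x), & \text{when } x < 0 \text{ and } -\infty < y < \infty, \\
2\pi + \arctan(y/x), & \text{when } x > 0 \text{ and } y < 0,
\end{cases}
\tag{12.1.5}
\]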

(recall that arctan(α) always takes values in the range −π/2 to +π/2 with arctan(α) < 0 when

α < 0).
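The conversions (12.1.3) and (12.1.5) can be sketched in a few lines of code. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and the branches for x = 0 (which (12.1.5) does not cover) and the convention θ = 0 at the origin are assumptions added here so that the function is defined everywhere. The library function `math.atan2` is used as an independent cross-check.

```python
import math


def to_cartesian(r, theta):
    """Relation (12.1.3): polar (r, theta) to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


def to_polar(x, y):
    """Relation (12.1.5), with extra branches for x == 0 (an added assumption)."""
    r = math.hypot(x, y)
    if x > 0 and y >= 0:
        theta = math.atan(y / x)
    elif x < 0:
        theta = math.pi + math.atan(y / x)
    elif x > 0:                      # here y < 0
        theta = 2.0 * math.pi + math.atan(y / x)
    elif y > 0:                      # x == 0, positive y-axis
        theta = math.pi / 2.0
    elif y < 0:                      # x == 0, negative y-axis
        theta = 3.0 * math.pi / 2.0
    else:                            # origin: the angle is not defined
        theta = 0.0
    return r, theta


def to_polar_atan2(x, y):
    """Cross-check using atan2, shifted into the range 0 <= theta < 2*pi."""
    return math.hypot(x, y), math.atan2(y, x) % (2.0 * math.pi)


if __name__ == "__main__":
    for x, y in ((1, 1), (-1, 1), (-1, -1), (1, -1), (0, 2), (0, -2), (3, 0), (-3, 0)):
        print((x, y), to_polar(x, y), to_polar_atan2(x, y))
```

A round trip through `to_polar` and then `to_cartesian` should return the original point up to rounding error.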

Remark 12.1.1. When are polar coordinates preferable to Cartesian coordinates? Suppose a

particle of mass M is located at the origin of an x−y Cartesian coordinate system in the plane, and

another particle of mass m moves in the plane under the influence of the gravitational force exerted

by the particle of mass M , and given by Newton’s law of universal gravitation. For instance, one

could think of M as the mass of the earth concentrated at the origin and the particle of mass m

could be a satellite orbiting the earth. Newton’s law of gravitation states that the force FFF is one of

attraction, along the radial line joining the particles, with magnitude given by

\[
\|\mathbf{F}\| = \frac{k}{r^2},
\tag{12.1.6}
\]

in which r is the distance between the two particles, and k is a constant determined by the masses

M and m. We see that the force is a vector field which is radially symmetric around the origin. In

particular, this force depends very simply on the radial distance r and has the wonderfully nice

property that its magnitude does not depend at all on the angle θ (see Figure 12.2). In this kind

of situation one will always use polar coordinates in preference to Cartesian coordinates, in order

to take advantage of the radial symmetry of the force field FFF . In particular, by working in polar

coordinates, and using the laws of classical mechanics, it is actually quite easy to show that the

particle of mass m follows an orbit in the plane in which the polar coordinates (r, θ) of the particle

are related by

\[
r = \frac{c}{1+\varepsilon\cos(\theta)};
\tag{12.1.7}
\]

here c and ε are constants depending on the masses m and M (the constant ε is called the eccentricity

of the orbit). Equation (12.1.7) is known as Kepler’s first law and is a basic result in orbital


mechanics which tells us that the particle of mass m moves along a curve which is either an ellipse,

a parabola or a hyperbola (corresponding to ε < 1, ε = 1 and ε > 1 respectively). Kepler’s first law

would be extremely difficult to derive - and even just to write down - in Cartesian coordinates.
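To get a feel for how much more awkward the Cartesian description is, the sketch below samples the polar relation (12.1.7) for an eccentricity less than one and checks that the resulting points satisfy the Cartesian equation of an ellipse with one focus at the origin. The formulas a = c/(1 − ε²) and b = a√(1 − ε²) for the semi-axes, and the centre at (−aε, 0), are standard facts about conic sections rather than results from these notes; the function names and numerical values are illustrative assumptions.

```python
import math


def orbit_radius(theta, c_orbit, ecc):
    """Relation (12.1.7): r = c / (1 + eps * cos(theta))."""
    return c_orbit / (1.0 + ecc * math.cos(theta))


def orbit_point(theta, c_orbit, ecc):
    """Cartesian coordinates of the orbit point at angle theta."""
    r = orbit_radius(theta, c_orbit, ecc)
    return r * math.cos(theta), r * math.sin(theta)


def ellipse_residual(theta, c_orbit, ecc):
    """((x + a*eps)/a)^2 + (y/b)^2 - 1, expected to vanish when 0 <= eps < 1."""
    a = c_orbit / (1.0 - ecc ** 2)          # semi-major axis
    b = a * math.sqrt(1.0 - ecc ** 2)       # semi-minor axis
    x, y = orbit_point(theta, c_orbit, ecc)
    return ((x + a * ecc) / a) ** 2 + (y / b) ** 2 - 1.0


if __name__ == "__main__":
    for k in range(8):
        theta = k * math.pi / 4.0
        print(theta, orbit_point(theta, 2.0, 0.5), ellipse_residual(theta, 2.0, 0.5))
```

The one-line polar relation is replaced in Cartesian form by a shifted quadratic equation whose parameters a and b must first be computed from c and ε.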

Figure 12.2: Radially symmetric gravitational force field in the plane

Remark 12.1.2. The standard Cartesian unit vectors iii and jjj enable one to express any vector

vvv = (x, y) in the plane in the Cartesian form

(12.1.8) vvv = xiii+ yjjj.

When we use polar coordinates then these Cartesian vectors are no longer very appropriate and we

must develop alternative “standard” vectors which are better suited to polar coordinates. To this

end fix some point A in the plane with polar coordinates (r0, θ0) (see Figure 12.3) and use (12.1.4)

to define the path

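\[
\begin{aligned}
\boldsymbol{\gamma}_1(r) &:= \mathbf{v}(r,\theta_0) \\
&= r\cos(\theta_0)\mathbf{i} + r\sin(\theta_0)\mathbf{j}, \quad \text{for all } 0 \le r < \infty,
\end{aligned}
\tag{12.1.9}
\]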

in which θ0 is held fixed and the parametric variable is r (c.f. Definition 4.2.1). Then the curve Γ1

of this path is the straight line from the origin passing through A shown in Figure 12.3. Similarly


to (12.1.9), define the path

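\[
\begin{aligned}
\boldsymbol{\gamma}_2(\theta) &:= \mathbf{v}(r_0,\theta) \\
&= r_0\cos(\theta)\mathbf{i} + r_0\sin(\theta)\mathbf{j}, \quad \text{for all } 0 \le \theta < 2\pi,
\end{aligned}
\tag{12.1.10}
\]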

in which r0 is held fixed and θ is the parametric variable. Clearly the curve Γ2 of this path is the

circle of radius r0 passing through A in a counter-clockwise direction shown in Figure 12.3. We now

Figure 12.3: Curves Γ1 and Γ2 and basis vectors eeer(r0, θ0) and eeeθ(r0, θ0)

define the tangent to the curve Γ1 at the point A, namely

\[
\boldsymbol{\gamma}_1^{(1)}(r_0) := \frac{d\boldsymbol{\gamma}_1}{dr}(r)\Big|_{r=r_0}.
\tag{12.1.11}
\]

From (12.1.11) and (12.1.9) we find

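\[
\boldsymbol{\gamma}_1^{(1)}(r_0) = \cos(\theta_0)\mathbf{i} + \sin(\theta_0)\mathbf{j},
\tag{12.1.12}
\]

in particular $\boldsymbol{\gamma}_1^{(1)}(r_0)$ is in the direction of the straight line joining 0 to A. Now let $\mathbf{e}_r(r_0,\theta_0)$ denote
the unit vector with the same direction as $\boldsymbol{\gamma}_1^{(1)}(r_0)$, namely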

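\[
\mathbf{e}_r(r_0,\theta_0) := \frac{\boldsymbol{\gamma}_1^{(1)}(r_0)}{\big\|\boldsymbol{\gamma}_1^{(1)}(r_0)\big\|} = \cos(\theta_0)\mathbf{i} + \sin(\theta_0)\mathbf{j},
\tag{12.1.13}
\]

in which the final equality follows from (12.1.12) since it is clear that $\big\|\boldsymbol{\gamma}_1^{(1)}(r_0)\big\| = 1$. Again,
$\mathbf{e}_r(r_0,\theta_0)$ is in the direction of the straight line joining 0 to A (see Figure 12.3).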

Similarly to (12.1.11) we can also define the tangent to the curve Γ2 at the point A, namely

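\[
\boldsymbol{\gamma}_2^{(1)}(\theta_0) := \frac{d\boldsymbol{\gamma}_2}{d\theta}(\theta)\Big|_{\theta=\theta_0} = -r_0\sin(\theta_0)\mathbf{i} + r_0\cos(\theta_0)\mathbf{j},
\tag{12.1.14}
\]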

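in which the last equality at (12.1.14) follows from (12.1.10). Now let $\mathbf{e}_\theta(r_0,\theta_0)$ denote the unit
vector with the same direction as $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$, namely

\[
\mathbf{e}_\theta(r_0,\theta_0) := \frac{\boldsymbol{\gamma}_2^{(1)}(\theta_0)}{\big\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\big\|} = -\sin(\theta_0)\mathbf{i} + \cos(\theta_0)\mathbf{j},
\tag{12.1.15}
\]

in which the final equality follows from (12.1.14) since it is clear that $\big\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\big\| = r_0$. Since $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$
is tangent to the curve Γ2 at point A so also is the vector $\mathbf{e}_\theta(r_0,\theta_0)$ (see Figure 12.3). We see from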

Figure 12.3 that “attached” to the point A with polar coordinates (r0, θ0) is a pair of orthogonal

basis vectors eeer(r0, θ0), eeeθ(r0, θ0) called the coordinate frame at the point A. Notice that these

basis vectors change direction (but of course not the unit length) as the point (r0, θ0) moves (see

Figure 12.4), so that we have a moving coordinate frame. Put another way

the basis vectors eeer(r, θ), eeeθ(r, θ) are functions of the point (r, θ) to which they are attached.

This is in direct contrast to the Cartesian unit basis vectors iii, jjj which of course have a constant

direction parallel to the x and y axes respectively.

For later reference we rewrite the relations (12.1.13) and (12.1.15) but replacing the generic

polar coordinates (r0, θ0) with (r, θ) (just to lighten the notation):

(12.1.16) $\mathbf{e}_r(r, \theta) = \cos(\theta)\,\mathbf{i} + \sin(\theta)\,\mathbf{j}$,

(12.1.17) $\mathbf{e}_\theta(r, \theta) = -\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j}$.
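As a quick numerical sanity check of (12.1.16) and (12.1.17), the following minimal Python sketch evaluates the polar frame at a given angle and confirms that the two vectors have unit length and are orthogonal. The function names `polar_frame` and `dot` are illustrative choices made here and are not notation from these notes.

```python
import math

def polar_frame(theta):
    """Return (e_r, e_theta) as Cartesian (i, j) component pairs,
    following (12.1.16) and (12.1.17)."""
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-math.sin(theta), math.cos(theta))
    return e_r, e_theta

def dot(a, b):
    """Ordinary dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))

# Evaluate the frame at an arbitrary angle and print the three inner products:
# the first two should be 1 (unit length), the last 0 (orthogonality),
# up to floating-point rounding.
e_r, e_theta = polar_frame(math.pi / 3)
print(dot(e_r, e_r), dot(e_theta, e_theta), dot(e_r, e_theta))
```

Note that the frame depends only on θ and not on r, which is why `polar_frame` takes a single argument.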

Remark 12.1.3. Another important question deals with how length changes when we change the

polar coordinates. To fix ideas we first look at this question in the simpler setting of Cartesian

coordinates. Suppose point A has Cartesian coordinates (x, y) and we make small perturbations

dx and dy in the coordinates to get the point B with Cartesian coordinates (x+ dx, y + dy) (see

Figure 12.5).


Figure 12.4: Coordinate frame $\mathbf{e}_r$, $\mathbf{e}_\theta$ at the points (r0, θ0) and (r1, θ1)

We must determine the small distance ds between A and B. Of course, the answer is immediate

from Pythagoras, namely

$ds = \sqrt{(dx)^2 + (dy)^2}$,

or, as we shall usually write in order to get rid of the awkward square-root sign,

(12.1.18) $(ds)^2 = (dx)^2 + (dy)^2$.

We now consider the same question, but in polar coordinates. That is, suppose point A has polar

coordinates (r, θ) and we make small perturbations dr and dθ in the polar coordinates to get the

point B with polar coordinates (r + dr, θ + dθ) (see Figure 12.6). Again, we must determine the

resulting small distance ds between A and B. Clearly AD is of length dr, and since dθ is small

the circular arc AC is effectively a straight line with length given by r dθ. Moreover, again since

dθ is small, we see that AC and AD are effectively orthogonal, so that ADBCA is effectively a

rectangle. We summarize the situation as follows:

(12.1.19) length of AD is dr, length of AC is r dθ, and ADBCA is a rectangle.


Figure 12.5: Perturbations in Cartesian coordinates

It is now immediate from (12.1.19) and Pythagoras that

(12.1.20) $(ds)^2 = (\text{length of } AB)^2 = (\text{length of } AD)^2 + (\text{length of } AC)^2 = (dr)^2 + r^2 (d\theta)^2$.

For later reference we customarily write (12.1.20) in the seemingly more complicated Riemannian

form

(12.1.21) $(ds)^2 = [h_r(r, \theta)\,dr]^2 + [h_\theta(r, \theta)\,d\theta]^2$

in which $h_r$ and $h_\theta$ are the so-called Riemannian scale functions defined (in this case) by

(12.1.22) $h_r(r, \theta) := 1, \quad h_\theta(r, \theta) := r$.
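The approximation behind (12.1.20) can be checked numerically. The sketch below compares the exact Cartesian distance between A and B with the value given by the scale functions (12.1.22), for progressively smaller perturbations. The helper names and the sample point (r, θ) = (2.0, 0.8) are assumptions chosen only for illustration.

```python
import math

def polar_to_cartesian(r, theta):
    """Cartesian coordinates (x, y) of the point with polar coordinates (r, theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def exact_distance(r, theta, dr, dtheta):
    """Exact straight-line distance between (r, theta) and (r + dr, theta + dtheta)."""
    x0, y0 = polar_to_cartesian(r, theta)
    x1, y1 = polar_to_cartesian(r + dr, theta + dtheta)
    return math.hypot(x1 - x0, y1 - y0)

def line_element(r, dr, dtheta):
    """ds from (12.1.21), using the scale functions h_r = 1 and h_theta = r."""
    h_r, h_theta = 1.0, r
    return math.sqrt((h_r * dr) ** 2 + (h_theta * dtheta) ** 2)

r, theta = 2.0, 0.8
for step in (1e-1, 1e-2, 1e-3):
    exact = exact_distance(r, theta, step, step)
    approx = line_element(r, step, step)
    # The relative error should shrink as the perturbation shrinks.
    print(step, exact, approx, abs(exact - approx) / exact)
```

The shrinking relative error reflects the sense in which ADBCA is only "effectively" a rectangle: the formula is exact in the limit of small dr and dθ.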

12.2 Cylindrical Coordinates

Having summarized the main aspects of plane polar coordinates in the preceding section we are

now ready to look at coordinate systems in three dimensional space R3. Of course we already have

the familiar Cartesian coordinate system for which the standard basis vectors are the triple $\mathbf{i}, \mathbf{j}, \mathbf{k}$ of orthogonal unit vectors parallel to the x, y and z-axes respectively, in which a generic vector $\mathbf{v}$ is represented in the form

(12.2.23) $\mathbf{v} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$,

or, equivalently, by the triplet of Cartesian coordinates (x, y, z), in which x, y and z are real scalars in the range

(12.2.24) $-\infty < x < \infty, \quad -\infty < y < \infty, \quad -\infty < z < \infty$.

Figure 12.6: Perturbations in polar coordinates

Although the Cartesian coordinate system is extraordinarily useful it nevertheless fails to take advantage of any symmetries that may be available as part of a problem. We would like to make the most of such symmetries (when present), for symmetry can hugely simplify the solution of the problem. For this reason we introduce two coordinate systems in R3, namely the cylindrical coordinate system and the spherical coordinate system. The cylindrical coordinate system takes advantage of any symmetry about an axis, nearly always chosen (just for convenience) to be the z-axis, while the inherently more complex spherical coordinate system takes advantage of any radial symmetry around the origin of R3.

We begin with the simpler case of a cylindrical coordinate system. Suppose that A is a point


in R3 with Cartesian coordinates (x, y, z). Then the pair (x, y) gives the Cartesian coordinates of

the point B in the x− y-plane (see Figure 12.7).

Figure 12.7: Cylindrical coordinates

Let (r, θ) be the polar coordinates of the point B; then of course r and θ are given in terms of the

Cartesian coordinates (x, y) of B by (12.1.5), repeated here for convenience as follows:

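(12.2.25) $r = \sqrt{x^2 + y^2}$, with $\theta$ given by the case-by-case formula of (12.1.5), the last case of which is $\theta = 2\pi + \arctan(y/x)$ when $x > 0$ and $y < 0$.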

It is clear from Figure 12.7 that the triplet of real numbers (r, θ, z) completely specifies the point

A; this triplet constitutes the cylindrical coordinates of the point A. Clearly these cylindrical

coordinates naturally take values in the range

(12.2.26) 0 ≤ r <∞, 0 ≤ θ < 2π, −∞ < z <∞.

We see that the parameters r and θ in the cylindrical coordinates (r, θ, z) are given in terms of the

Cartesian coordinates (x, y, z) of a point A by (12.2.25). Conversely, if one is given the cylindrical

coordinates (r, θ, z) of a point A then the parameters x and y in the corresponding Cartesian


coordinates (x, y, z) must be given by

(12.2.27) x = r cos(θ), y = r sin(θ).

Put another way, if the point A is given by the cylindrical coordinates (r, θ, z) then, in terms of the Cartesian basis $\mathbf{i}, \mathbf{j}, \mathbf{k}$, one sees from (12.2.27) it must be given by the vector

(12.2.28) $\mathbf{v}(r, \theta, z) = r\cos(\theta)\,\mathbf{i} + r\sin(\theta)\,\mathbf{j} + z\,\mathbf{k}$.

Effectively, the relation (12.2.28) (equivalently the relations (12.2.27)) tells us how the Cartesian

representation of a point changes when we change its cylindrical coordinates.
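The two directions of conversion, (12.2.25) and (12.2.27), can be sketched in Python as follows. One assumption should be flagged: instead of the case-by-case arctan formula, this sketch uses the library function `math.atan2` reduced modulo 2π, which yields an angle in the range 0 ≤ θ < 2π required by (12.2.26). The function names are illustrative, and at the origin (where θ is not determined) the code simply returns θ = 0.

```python
import math

def cartesian_to_cylindrical(x, y, z):
    """Return (r, theta, z) with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)
    # atan2 returns an angle in (-pi, pi]; reduce it into [0, 2*pi).
    theta = math.atan2(y, x) % (2.0 * math.pi)
    return r, theta, z

def cylindrical_to_cartesian(r, theta, z):
    """Return (x, y, z) from (12.2.27); z is unchanged."""
    return r * math.cos(theta), r * math.sin(theta), z

# A point in the fourth quadrant of the x-y plane (x > 0, y < 0).
r, theta, z = cartesian_to_cylindrical(1.0, -1.0, 5.0)
print(r, theta, z)
print(cylindrical_to_cartesian(r, theta, z))
```

For the sample point, x > 0 and y < 0, so the angle agrees with the case θ = 2π + arctan(y/x) quoted in (12.2.25).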

We now construct a triple of orthogonal basis vectors for cylindrical coordinates which are

an analog (indeed a simple extension) of the moving coordinate frame $\mathbf{e}_r(r, \theta)$, $\mathbf{e}_\theta(r, \theta)$ that we

constructed in Remark 12.1.2 for polar coordinates. We proceed exactly as we did in Remark 12.1.2,

that is fix some point A in R3 with the cylindrical coordinates (r0, θ0, z0). Then (by analogy with

(12.1.9)) define the path

(12.2.29) $\boldsymbol{\gamma}_1(r) := \mathbf{v}(r, \theta_0, z_0) = r\cos(\theta_0)\,\mathbf{i} + r\sin(\theta_0)\,\mathbf{j} + z_0\,\mathbf{k}$, for all $0 \le r < \infty$,

in which (θ, z) in (12.2.28) is held fixed at (θ0, z0), and r is the parametric variable. The curve Γ1

of this path is clearly the straight line passing through A and parallel to the line OB (see Figure

12.8 which is adapted from Mathematica).

Likewise (c.f. (12.1.10)), define the path

(12.2.30) $\boldsymbol{\gamma}_2(\theta) := \mathbf{v}(r_0, \theta, z_0) = r_0\cos(\theta)\,\mathbf{i} + r_0\sin(\theta)\,\mathbf{j} + z_0\,\mathbf{k}$, for all $0 \le \theta < 2\pi$,

in which (r, z) in (12.2.28) is held fixed at (r0, z0), and θ is the parametric variable. Clearly the

curve Γ2 of this path is the circle of radius r0, lying parallel to the x − y-plane at the “height” z0

(see Figure 12.8). Finally, define the path

(12.2.31) $\boldsymbol{\gamma}_3(z) := \mathbf{v}(r_0, \theta_0, z) = r_0\cos(\theta_0)\,\mathbf{i} + r_0\sin(\theta_0)\,\mathbf{j} + z\,\mathbf{k}$, for all $-\infty < z < \infty$,

in which (r, θ) in (12.2.28) is held fixed at (r0, θ0), and z is the parametric variable. Clearly the

curve Γ3 of this path is the straight line passing through the point B and parallel to the z-axis (see

Figure 12.8). Exactly as at (12.1.11), we define the tangent to the curve Γ1 at the point A, namely

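(12.2.32) $\boldsymbol{\gamma}_1^{(1)}(r_0) := \left.\dfrac{d\boldsymbol{\gamma}_1}{dr}(r)\right|_{r=r_0} = \cos(\theta_0)\,\mathbf{i} + \sin(\theta_0)\,\mathbf{j} + 0\,\mathbf{k}$,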


Figure 12.8: Curves Γ1, Γ2, Γ3, and basis vectors $\mathbf{e}_r(r_0, \theta_0, z_0)$, $\mathbf{e}_\theta(r_0, \theta_0, z_0)$, $\mathbf{e}_z(r_0, \theta_0, z_0)$

in which the last equality follows from (12.2.29). In the same way, from (12.2.30), we define the

tangent to the curve Γ2 at the point A, that is

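(12.2.33) $\boldsymbol{\gamma}_2^{(1)}(\theta_0) := \left.\dfrac{d\boldsymbol{\gamma}_2}{d\theta}(\theta)\right|_{\theta=\theta_0} = -r_0\sin(\theta_0)\,\mathbf{i} + r_0\cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k}$,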

and, from (12.2.31), we define the tangent to the curve Γ3 at the point A, that is

(12.2.34) $\boldsymbol{\gamma}_3^{(1)}(z_0) := \left.\dfrac{d\boldsymbol{\gamma}_3}{dz}(z)\right|_{z=z_0} = 0\,\mathbf{i} + 0\,\mathbf{j} + \mathbf{k}$.

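Exactly as at (12.1.13) and (12.1.15), we define unit vectors $\mathbf{e}_r(r_0, \theta_0, z_0)$, $\mathbf{e}_\theta(r_0, \theta_0, z_0)$ and $\mathbf{e}_z(r_0, \theta_0, z_0)$, having the same direction as $\boldsymbol{\gamma}_1^{(1)}(r_0)$, $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$ and $\boldsymbol{\gamma}_3^{(1)}(z_0)$ respectively, that is

(12.2.35) $\mathbf{e}_r(r_0, \theta_0, z_0) := \dfrac{\boldsymbol{\gamma}_1^{(1)}(r_0)}{\|\boldsymbol{\gamma}_1^{(1)}(r_0)\|} = \cos(\theta_0)\,\mathbf{i} + \sin(\theta_0)\,\mathbf{j} + 0\,\mathbf{k}$,

in which the final equality follows from (12.2.32) since $\|\boldsymbol{\gamma}_1^{(1)}(r_0)\| = 1$. Similarly, from (12.2.33), we have $\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\| = r_0$, and therefore

(12.2.36) $\mathbf{e}_\theta(r_0, \theta_0, z_0) := \dfrac{\boldsymbol{\gamma}_2^{(1)}(\theta_0)}{\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\|} = -\sin(\theta_0)\,\mathbf{i} + \cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k}$,

and, from (12.2.34),

(12.2.37) $\mathbf{e}_z(r_0, \theta_0, z_0) := \dfrac{\boldsymbol{\gamma}_3^{(1)}(z_0)}{\|\boldsymbol{\gamma}_3^{(1)}(z_0)\|} = 0\,\mathbf{i} + 0\,\mathbf{j} + \mathbf{k}$.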

Now we know from (12.2.32) that $\boldsymbol{\gamma}_1^{(1)}(r_0)$ is tangent to the curve Γ1 at point A, so it follows from (12.2.35) that

(12.2.38) the unit vector $\mathbf{e}_r(r_0, \theta_0, z_0)$ is tangent to curve Γ1 at A,

(see Figure 12.8). Similarly, from (12.2.33) and (12.2.36), and from (12.2.34) and (12.2.37), we have

(12.2.39) the unit vector $\mathbf{e}_\theta(r_0, \theta_0, z_0)$ is tangent to curve Γ2 at A,

(12.2.40) the unit vector $\mathbf{e}_z(r_0, \theta_0, z_0)$ is tangent to curve Γ3 at A,

(see Figure 12.8). Moreover, calculating the inner product (or “dot product”) of $\mathbf{e}_r(r_0, \theta_0, z_0)$ with $\mathbf{e}_\theta(r_0, \theta_0, z_0)$ using (12.2.35) and (12.2.36) we get

(12.2.41) $\mathbf{e}_r(r_0, \theta_0, z_0) \cdot \mathbf{e}_\theta(r_0, \theta_0, z_0) = -\sin(\theta_0)\cos(\theta_0) + \sin(\theta_0)\cos(\theta_0) = 0$.

Similarly, from (12.2.35), (12.2.36) and (12.2.37) we find

(12.2.42) $\mathbf{e}_r(r_0, \theta_0, z_0) \cdot \mathbf{e}_z(r_0, \theta_0, z_0) = \mathbf{e}_\theta(r_0, \theta_0, z_0) \cdot \mathbf{e}_z(r_0, \theta_0, z_0) = 0$.

From (12.2.41) and (12.2.42) it follows that, for each and every (r0, θ0, z0), we have

(12.2.43) $\mathbf{e}_r(r_0, \theta_0, z_0)$, $\mathbf{e}_\theta(r_0, \theta_0, z_0)$, $\mathbf{e}_z(r_0, \theta_0, z_0)$ is a triplet of mutually orthogonal unit vectors.

We see from Figure 12.8 that the triplet of orthogonal unit vectors $\mathbf{e}_r(r_0, \theta_0, z_0)$, $\mathbf{e}_\theta(r_0, \theta_0, z_0)$, $\mathbf{e}_z(r_0, \theta_0, z_0)$ is “attached” to the point A with cylindrical coordinates (r0, θ0, z0) and constitutes a coordinate frame at the point A. Notice that these basis vectors change direction (but not of course the unit length) as the point A moves, so that (exactly as for the case of polar coordinates) we have a moving coordinate frame. This is in direct contrast to the Cartesian unit basis vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ which of course have a constant direction parallel to the x, y and z axes respectively. For later reference we rewrite the relations (12.2.35), (12.2.36) and (12.2.37), but replacing the generic cylindrical coordinates (r0, θ0, z0) with (r, θ, z) (to lighten the notation):

(12.2.44) $\mathbf{e}_r(r, \theta, z) = \cos(\theta)\,\mathbf{i} + \sin(\theta)\,\mathbf{j} + 0\,\mathbf{k}$,

(12.2.45) $\mathbf{e}_\theta(r, \theta, z) = -\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j} + 0\,\mathbf{k}$,

(12.2.46) $\mathbf{e}_z(r, \theta, z) = 0\,\mathbf{i} + 0\,\mathbf{j} + 1\,\mathbf{k}$,

(c.f. (12.1.16) and (12.1.17) for similar relations in the case of polar coordinates).
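The construction of the cylindrical frame (differentiate each coordinate path, then normalize) can be mimicked numerically. The sketch below approximates each tangent by a central difference rather than an exact derivative, which is an assumption of this illustration, and it requires r0 > 0 so that the tangent to Γ2 is nonzero. The names `v`, `unit_tangent` and `cylindrical_frame` are hypothetical helpers.

```python
import math

def v(r, theta, z):
    """Cartesian components of the point with cylindrical coordinates (r, theta, z),
    as in (12.2.28)."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def unit_tangent(path, t0, h=1e-6):
    """Central-difference approximation to the derivative of `path` at t0,
    scaled to unit length."""
    p_plus, p_minus = path(t0 + h), path(t0 - h)
    d = [(a - b) / (2.0 * h) for a, b in zip(p_plus, p_minus)]
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)

def cylindrical_frame(r0, theta0, z0):
    """Approximate (e_r, e_theta, e_z) at the point (r0, theta0, z0), with r0 > 0."""
    e_r = unit_tangent(lambda r: v(r, theta0, z0), r0)              # path (12.2.29)
    e_theta = unit_tangent(lambda th: v(r0, th, z0), theta0)        # path (12.2.30)
    e_z = unit_tangent(lambda z: v(r0, theta0, z), z0)              # path (12.2.31)
    return e_r, e_theta, e_z

for vec in cylindrical_frame(2.0, 1.1, -0.5):
    print(vec)
```

The printed vectors should agree, up to discretization error, with the closed forms (12.2.44), (12.2.45) and (12.2.46) evaluated at θ = 1.1.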

In Remark 12.1.3 we established an expression in Riemannian form for the change ds in distance in

the plane resulting from a small change in the polar coordinates (see (12.1.21) and (12.1.22)). We

are now going to establish an analogous expression for the change ds in three dimensional space

resulting from a small change in cylindrical coordinates.

Suppose point A has cylindrical coordinates (r, θ, z) and we make small perturbations dr, dθ and

dz in the cylindrical coordinates to get point B with cylindrical coordinates (r+ dr, θ+ dθ, z+ dz)

(see Figure 12.9). We must determine the distance ds between A and B.

From Figure 12.9 we see that the straight line AD has length dr, and since dθ is small the circular

arc AC is effectively a straight line with length given by r dθ. Moreover, the straight line AE clearly

has length dz. Also, it is clear from Figure 12.9 that AD is collinear with the unit vector $\mathbf{e}_r(r, \theta, z)$, AC is collinear with the unit vector $\mathbf{e}_\theta(r, \theta, z)$, and AE is collinear with the unit vector $\mathbf{e}_z(r, \theta, z)$. Put another way

(12.2.47) $\overrightarrow{AD} = \mathbf{e}_r(r, \theta, z)\,(dr), \quad \overrightarrow{AC} = \mathbf{e}_\theta(r, \theta, z)\,(r\,d\theta), \quad \overrightarrow{AE} = \mathbf{e}_z(r, \theta, z)\,(dz)$,

and, again from Figure 12.9, we see that

(12.2.48) $\overrightarrow{AB} = \mathbf{e}_r(r, \theta, z)\,(dr) + \mathbf{e}_\theta(r, \theta, z)\,(r\,d\theta) + \mathbf{e}_z(r, \theta, z)\,(dz)$.

In view of (12.2.48), (12.2.43), and Pythagoras, we get

(12.2.49) $(ds)^2 := (\text{length of } AB)^2 = (dr)^2 + r^2 (d\theta)^2 + (dz)^2$.


Figure 12.9: Perturbations in cylindrical coordinates

Exactly as at (12.1.21) for polar coordinates we can put (12.2.49) into Riemannian form, that is

(12.2.50) $(ds)^2 = [h_r(r, \theta, z)\,dr]^2 + [h_\theta(r, \theta, z)\,d\theta]^2 + [h_z(r, \theta, z)\,dz]^2$,

in which $h_r$, $h_\theta$ and $h_z$ are the Riemannian scale functions defined by

(12.2.51) $h_r(r, \theta, z) := 1, \quad h_\theta(r, \theta, z) := r, \quad h_z(r, \theta, z) := 1$.
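One way to see the cylindrical line element (12.2.49) at work is to add up many small values of ds along a curve described in cylindrical coordinates. The sketch below does this for a helix of radius a and rise b per radian, an example chosen here for illustration; one turn of such a helix has length 2π√(a² + b²). The function `curve_length`, its midpoint choice for r, and the step count are all assumptions of this sketch.

```python
import math

def curve_length(r, theta, z, t0, t1, n=10_000):
    """Approximate the length of the curve t -> (r(t), theta(t), z(t)),
    given in cylindrical coordinates, by summing the line element
    (ds)^2 = (dr)^2 + r^2 (dtheta)^2 + (dz)^2 over n small steps."""
    total = 0.0
    dt = (t1 - t0) / n
    for k in range(n):
        ta, tb = t0 + k * dt, t0 + (k + 1) * dt
        dr = r(tb) - r(ta)
        dtheta = theta(tb) - theta(ta)
        dz = z(tb) - z(ta)
        r_mid = 0.5 * (r(ta) + r(tb))  # radius used for the r*dtheta term
        total += math.sqrt(dr ** 2 + (r_mid * dtheta) ** 2 + dz ** 2)
    return total

a, b = 2.0, 0.5
length = curve_length(lambda t: a, lambda t: t, lambda t: b * t, 0.0, 2.0 * math.pi)
print(length, 2.0 * math.pi * math.sqrt(a * a + b * b))
```

For the helix r is constant, so dr = 0 on every step and each step contributes √(a² + b²) dθ, which is why the sum reproduces the closed-form length essentially exactly.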

12.3 Spherical Coordinates

The preceding completes our introduction to the main aspects of cylindrical coordinates in three

dimensional space, and we now move on to the spherical coordinate system in three dimensional

space R3. Suppose that A is a point in R3 with Cartesian coordinates (x, y, z). Then the pair (x, y)

gives the Cartesian coordinates of the point B in the x− y plane (see Figure 12.10).

Let (r, θ) be the polar coordinates of the point B; then of course r and θ are given in terms of the


Figure 12.10: Spherical coordinates

Cartesian coordinates (x, y) of B by (12.1.5), repeated here as

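(12.3.52) $r = \sqrt{x^2 + y^2}$, with $\theta$ given by the case-by-case formula of (12.1.5), the last case of which is $\theta = 2\pi + \arctan(y/x)$ when $x > 0$ and $y < 0$.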

Moreover, from Pythagoras, the radial length of the vector from the origin O to point A is

(12.3.53) $\rho = \sqrt{x^2 + y^2 + z^2}$,

and it follows from the right-angle triangle OAC in Figure 12.10 that the angle φ between the positive z-axis and the radial vector OA is related to r and ρ by

(12.3.54) $\sin(\phi) = \dfrac{r}{\rho}$, that is $r = \rho\sin(\phi)$.

Moreover, again from the right-angle triangle OAC in Figure 12.10, we also have

(12.3.55) $\cos(\phi) = \dfrac{z}{\rho}$, that is $z = \rho\cos(\phi)$.

Combining the first relation of (12.3.55) and (12.3.53) then gives

(12.3.56) $\phi = \arccos\left(\dfrac{z}{\sqrt{x^2 + y^2 + z^2}}\right)$.


It is clear from Figure 12.10 that the triplet of real numbers (ρ, θ, φ) completely specifies the point A;

this triplet constitutes the spherical coordinates of the point A. Clearly these spherical coordinates

naturally take values in the range

(12.3.57) 0 ≤ ρ <∞, 0 ≤ θ < 2π, 0 ≤ φ ≤ π.

We see then that the spherical coordinates (ρ, θ, φ) are given in terms of the Cartesian coordinates

(x, y, z) of a point A by (12.3.53), the second relation of (12.3.52) and (12.3.56) respectively. Suppose, conversely, that one is given the spherical coordinates (ρ, θ, φ) of a point A; how does one

determine the corresponding Cartesian coordinates (x, y, z)? From the right-angle triangle ODB in

Figure 12.10 we see that

(12.3.58) x = r cos(θ), y = r sin(θ).

Upon combining (12.3.58) with the second relation of (12.3.54) we see that the Cartesian coordinates

(x, y, z) are given in terms of the spherical coordinates (ρ, θ, φ) by

(12.3.59) x = ρ sin(φ) cos(θ), y = ρ sin(φ) sin(θ), z = ρ cos(φ),

(the last relation of (12.3.59) follows from the last relation of (12.3.55)). Put another way, if the point A is given by the spherical coordinates (ρ, θ, φ) then, in terms of the Cartesian basis $\mathbf{i}, \mathbf{j}, \mathbf{k}$, one sees from (12.3.59) it must be given by the vector

(12.3.60) $\mathbf{v}(\rho, \theta, \phi) = \rho\sin(\phi)\cos(\theta)\,\mathbf{i} + \rho\sin(\phi)\sin(\theta)\,\mathbf{j} + \rho\cos(\phi)\,\mathbf{k}$.
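The conversions (12.3.53), (12.3.56) and (12.3.59) can be sketched in Python as below, using the convention of these notes that θ is the angle in the x-y plane and φ is measured from the positive z-axis. As an assumption of this sketch, θ is computed with `math.atan2` reduced modulo 2π in place of the case-by-case arctan formula, and at the origin (where θ and φ are not determined) both angles are returned as 0. The function names are illustrative.

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (rho, theta, phi) with rho >= 0, 0 <= theta < 2*pi, 0 <= phi <= pi."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x) % (2.0 * math.pi)
    if rho > 0.0:
        # Clamp guards against rounding pushing z/rho slightly outside [-1, 1].
        phi = math.acos(max(-1.0, min(1.0, z / rho)))
    else:
        phi = 0.0  # convention at the origin
    return rho, theta, phi

def spherical_to_cartesian(rho, theta, phi):
    """Return (x, y, z) from (12.3.59)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

# Round trip: Cartesian -> spherical -> Cartesian should recover the point.
point = (1.0, -2.0, 3.0)
print(cartesian_to_spherical(*point))
print(spherical_to_cartesian(*cartesian_to_spherical(*point)))
```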

Effectively, the relations (12.3.59) (equivalently the relation (12.3.60)) tell us how the Cartesian

representation of a point changes when we change its spherical coordinates. We now construct

a triple of orthogonal basis vectors for spherical coordinates which are an analog of the moving

coordinate frame $\mathbf{e}_r(r, \theta, z)$, $\mathbf{e}_\theta(r, \theta, z)$, $\mathbf{e}_z(r, \theta, z)$ that we constructed for cylindrical coordinates

(c.f. (12.2.44), (12.2.45) and (12.2.46)). We proceed exactly as we did in the case of cylindrical

coordinates, that is fix some point A in R3 with spherical coordinates (ρ0, θ0, φ0). By analogy with

(12.2.29) define the path

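(12.3.61) $\boldsymbol{\gamma}_1(\rho) := \mathbf{v}(\rho, \theta_0, \phi_0) = \rho\sin(\phi_0)\cos(\theta_0)\,\mathbf{i} + \rho\sin(\phi_0)\sin(\theta_0)\,\mathbf{j} + \rho\cos(\phi_0)\,\mathbf{k}$, for all $0 \le \rho < \infty$,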

in which (θ, φ) in (12.3.60) is held fixed at (θ0, φ0), and ρ is the parametric variable. Then the curve

Γ1 of this path is clearly the straight line collinear with the vector OA (see Figure 12.11 which is

adapted from Mathematica). Likewise (c.f. (12.2.30)) define the path


Figure 12.11: Curves Γ1, Γ2, Γ3, and basis vectors $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0)$, $\mathbf{e}_\theta(\rho_0, \theta_0, \phi_0)$, $\mathbf{e}_\phi(\rho_0, \theta_0, \phi_0)$

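(12.3.62) $\boldsymbol{\gamma}_2(\theta) := \mathbf{v}(\rho_0, \theta, \phi_0) = \rho_0\sin(\phi_0)\cos(\theta)\,\mathbf{i} + \rho_0\sin(\phi_0)\sin(\theta)\,\mathbf{j} + \rho_0\cos(\phi_0)\,\mathbf{k}$, for all $0 \le \theta < 2\pi$,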

in which (ρ, φ) in (12.3.60) is held fixed at (ρ0, φ0), and θ is the parametric variable. Clearly the

curve Γ2 of this path is the circle of radius

(12.3.63) r0 := ρ0 sin(φ0),

lying parallel to the x− y plane at the “height”

(12.3.64) z0 := ρ0 cos(φ0),

(see Figure 12.11). Finally (c.f. (12.2.31)) define the path

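(12.3.65) $\boldsymbol{\gamma}_3(\phi) := \mathbf{v}(\rho_0, \theta_0, \phi) = \rho_0\sin(\phi)\cos(\theta_0)\,\mathbf{i} + \rho_0\sin(\phi)\sin(\theta_0)\,\mathbf{j} + \rho_0\cos(\phi)\,\mathbf{k}$, for all $0 \le \phi \le \pi$,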

in which (ρ, θ) in (12.3.60) is held fixed at (ρ0, θ0), and φ is the parametric variable. Clearly the

curve Γ3 of this path is the semicircle of radius ρ0 (a half circle, since φ runs only over 0 ≤ φ ≤ π) “vertical” to the x − y plane and lying in the plane which contains the triangle OAB (see Figure 12.11). Exactly as at (12.2.32), we define the tangent

to the curve Γ1 at the point A, namely

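(12.3.66) $\boldsymbol{\gamma}_1^{(1)}(\rho_0) := \left.\dfrac{d\boldsymbol{\gamma}_1}{d\rho}(\rho)\right|_{\rho=\rho_0} = \sin(\phi_0)\cos(\theta_0)\,\mathbf{i} + \sin(\phi_0)\sin(\theta_0)\,\mathbf{j} + \cos(\phi_0)\,\mathbf{k}$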

in which the last equality follows from (12.3.61). In the same way, from (12.3.62), we define the

tangent to the curve Γ2 at the point A, that is

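(12.3.67) $\boldsymbol{\gamma}_2^{(1)}(\theta_0) := \left.\dfrac{d\boldsymbol{\gamma}_2}{d\theta}(\theta)\right|_{\theta=\theta_0} = -\rho_0\sin(\phi_0)\sin(\theta_0)\,\mathbf{i} + \rho_0\sin(\phi_0)\cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k}$,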

and, from (12.3.65), we define the tangent to the curve Γ3 at the point A, that is

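(12.3.68) $\boldsymbol{\gamma}_3^{(1)}(\phi_0) := \left.\dfrac{d\boldsymbol{\gamma}_3}{d\phi}(\phi)\right|_{\phi=\phi_0} = \rho_0\cos(\phi_0)\cos(\theta_0)\,\mathbf{i} + \rho_0\cos(\phi_0)\sin(\theta_0)\,\mathbf{j} - \rho_0\sin(\phi_0)\,\mathbf{k}$.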

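Exactly as at (12.2.35), (12.2.36) and (12.2.37) we define unit vectors $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0)$, $\mathbf{e}_\theta(\rho_0, \theta_0, \phi_0)$ and $\mathbf{e}_\phi(\rho_0, \theta_0, \phi_0)$, having the same direction as $\boldsymbol{\gamma}_1^{(1)}(\rho_0)$, $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$ and $\boldsymbol{\gamma}_3^{(1)}(\phi_0)$ respectively, that is

(12.3.69) $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0) := \dfrac{\boldsymbol{\gamma}_1^{(1)}(\rho_0)}{\|\boldsymbol{\gamma}_1^{(1)}(\rho_0)\|} = \sin(\phi_0)\cos(\theta_0)\,\mathbf{i} + \sin(\phi_0)\sin(\theta_0)\,\mathbf{j} + \cos(\phi_0)\,\mathbf{k}$,

in which the final equality follows from (12.3.66) since $\|\boldsymbol{\gamma}_1^{(1)}(\rho_0)\| = 1$. Similarly, from (12.3.67), we have $\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\| = \rho_0\sin(\phi_0)$ (which is nonzero provided A does not lie on the z-axis), and therefore

(12.3.70) $\mathbf{e}_\theta(\rho_0, \theta_0, \phi_0) := \dfrac{\boldsymbol{\gamma}_2^{(1)}(\theta_0)}{\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\|} = -\sin(\theta_0)\,\mathbf{i} + \cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k}$,

and, from (12.3.68), we have $\|\boldsymbol{\gamma}_3^{(1)}(\phi_0)\| = \rho_0$, and therefore

(12.3.71) $\mathbf{e}_\phi(\rho_0, \theta_0, \phi_0) := \dfrac{\boldsymbol{\gamma}_3^{(1)}(\phi_0)}{\|\boldsymbol{\gamma}_3^{(1)}(\phi_0)\|} = \cos(\phi_0)\cos(\theta_0)\,\mathbf{i} + \cos(\phi_0)\sin(\theta_0)\,\mathbf{j} - \sin(\phi_0)\,\mathbf{k}$.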

Now we know from (12.3.66) that $\boldsymbol{\gamma}_1^{(1)}(\rho_0)$ is tangent to the curve Γ1 at point A, so it follows from (12.3.69) that

(12.3.72) the unit vector $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0)$ is tangent to curve Γ1 at A,

(see Figure 12.11). Similarly, from (12.3.67) and (12.3.70), and from (12.3.68) and (12.3.71), we have

(12.3.73) the unit vector $\mathbf{e}_\theta(\rho_0, \theta_0, \phi_0)$ is tangent to curve Γ2 at A,

(12.3.74) the unit vector $\mathbf{e}_\phi(\rho_0, \theta_0, \phi_0)$ is tangent to curve Γ3 at A,

(see Figure 12.11). Moreover, calculating the inner product of $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0)$ with $\mathbf{e}_\theta(\rho_0, \theta_0, \phi_0)$ using (12.3.69) and (12.3.70) we get

(12.3.75) $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0) \cdot \mathbf{e}_\theta(\rho_0, \theta_0, \phi_0) = -\sin(\phi_0)[\sin(\theta_0)\cos(\theta_0) - \sin(\theta_0)\cos(\theta_0)] = 0$.

Similarly, from (12.3.69), (12.3.70) and (12.3.71) we find

(12.3.76) $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0) \cdot \mathbf{e}_\phi(\rho_0, \theta_0, \phi_0) = \mathbf{e}_\theta(\rho_0, \theta_0, \phi_0) \cdot \mathbf{e}_\phi(\rho_0, \theta_0, \phi_0) = 0$.

From (12.3.75) and (12.3.76) it follows that, for each and every (ρ0, θ0, φ0), we have

(12.3.77) $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0)$, $\mathbf{e}_\theta(\rho_0, \theta_0, \phi_0)$, $\mathbf{e}_\phi(\rho_0, \theta_0, \phi_0)$ is a triplet of mutually orthogonal unit vectors.

We see from Figure 12.11 that the orthogonal unit vectors $\mathbf{e}_\rho(\rho_0, \theta_0, \phi_0)$, $\mathbf{e}_\theta(\rho_0, \theta_0, \phi_0)$, $\mathbf{e}_\phi(\rho_0, \theta_0, \phi_0)$ are “attached” to the point A with spherical coordinates (ρ0, θ0, φ0) and constitute a coordinate frame at the point A. Notice that these basis vectors change direction (but not of course the unit length) as the point A moves, so that (exactly as for the cases of polar and cylindrical coordinates) we have a moving coordinate frame. Again, this is in direct contrast to the Cartesian unit basis vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ which of course have a constant direction parallel to the x, y and z axes respectively. For later reference we rewrite the relations (12.3.69), (12.3.70) and (12.3.71), but replacing the generic spherical coordinates (ρ0, θ0, φ0) with (ρ, θ, φ) (to lighten the notation):

(12.3.78) $\mathbf{e}_\rho(\rho, \theta, \phi) = \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k}$,

(12.3.79) $\mathbf{e}_\theta(\rho, \theta, \phi) = -\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j} + 0\,\mathbf{k}$,

(12.3.80) $\mathbf{e}_\phi(\rho, \theta, \phi) = \cos(\phi)\cos(\theta)\,\mathbf{i} + \cos(\phi)\sin(\theta)\,\mathbf{j} - \sin(\phi)\,\mathbf{k}$,

(c.f. (12.2.44), (12.2.45) and (12.2.46) for similar relations in the case of cylindrical coordinates).
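The orthonormality of the spherical frame, that is the content of (12.3.75), (12.3.76) and (12.3.77), can be confirmed numerically by forming the matrix of all pairwise dot products, which should be the 3 × 3 identity. This is a minimal sketch; `spherical_frame` and `gram` are names introduced here for illustration, and the sample angles are arbitrary.

```python
import math

def spherical_frame(theta, phi):
    """Return (e_rho, e_theta, e_phi) as Cartesian triples,
    following (12.3.78), (12.3.79) and (12.3.80)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_rho = (sp * ct, sp * st, cp)
    e_theta = (-st, ct, 0.0)
    e_phi = (cp * ct, cp * st, -sp)
    return e_rho, e_theta, e_phi

def gram(frame):
    """Matrix of pairwise dot products of the vectors in `frame`."""
    return [[sum(a * b for a, b in zip(u, w)) for w in frame] for u in frame]

# Each row should be a row of the identity matrix, up to rounding.
for row in gram(spherical_frame(0.9, 2.1)):
    print(row)
```

As with the polar and cylindrical frames, the vectors depend on the angles only, not on ρ, so `spherical_frame` takes just θ and φ.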

For cylindrical coordinates we established an expression in Riemannian form for the change ds in

distance resulting from a small change in the cylindrical coordinates (see (12.2.50) and (12.2.51)).

We now get a comparable expression for the case of spherical coordinates. Suppose point A has

spherical coordinates (ρ, θ, φ) so that

(12.3.81) point A is given by $\mathbf{v}(\rho, \theta, \phi)$ (see (12.3.60))

and we make small perturbations dρ, dθ and dφ in the spherical coordinates to get point B with the spherical coordinates (ρ + dρ, θ + dθ, φ + dφ) so that

(12.3.82) point B is given by $\mathbf{v}(\rho + d\rho, \theta + d\theta, \phi + d\phi)$ (again see (12.3.60))

(see Figure 12.12). We must determine the distance ds between A and B.

Figure 12.12: Perturbations in spherical coordinates

In the case of cylindrical coordinates it was easy to calculate this distance just by looking at Figure 12.9, for this led immediately to (12.2.48) which in turn gave us the desired relation (12.2.49) (which we then wrote in the Riemannian form (12.2.50) and (12.2.51)). In the case of spherical coordinates it is not quite so easy to see what is going on by looking at Figure 12.12, and in fact one can easily extract misleading information by incorrectly interpreting this figure. Accordingly, we shall instead proceed just by the use of ordinary calculus and not rely on any figures or pictures at all. From (12.3.81) and (12.3.82) we see

(12.3.83) $(ds)^2 = (\text{length of } AB)^2 = \|\mathbf{v}(\rho + d\rho, \theta + d\theta, \phi + d\phi) - \mathbf{v}(\rho, \theta, \phi)\|^2$,

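so we must first calculate the vector difference $[\mathbf{v}(\rho + d\rho, \theta + d\theta, \phi + d\phi) - \mathbf{v}(\rho, \theta, \phi)]$. But this is easy using the formulas we have already worked out. In fact, by ordinary calculus (keeping only the first-order terms in the small perturbations), we have

(12.3.84) $\mathbf{v}(\rho + d\rho, \theta + d\theta, \phi + d\phi) - \mathbf{v}(\rho, \theta, \phi) = \dfrac{\partial \mathbf{v}}{\partial \rho}(\rho, \theta, \phi)\,(d\rho) + \dfrac{\partial \mathbf{v}}{\partial \theta}(\rho, \theta, \phi)\,(d\theta) + \dfrac{\partial \mathbf{v}}{\partial \phi}(\rho, \theta, \phi)\,(d\phi)$.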

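Now substitute $\mathbf{v}$ given by (12.3.60) into the right side of (12.3.84). We get

(12.3.85)
$$
\begin{aligned}
&\mathbf{v}(\rho + d\rho, \theta + d\theta, \phi + d\phi) - \mathbf{v}(\rho, \theta, \phi) \\
&\quad= \frac{\partial}{\partial \rho}[\rho\sin(\phi)\cos(\theta)\,\mathbf{i} + \rho\sin(\phi)\sin(\theta)\,\mathbf{j} + \rho\cos(\phi)\,\mathbf{k}]\,(d\rho) \\
&\qquad+ \frac{\partial}{\partial \theta}[\rho\sin(\phi)\cos(\theta)\,\mathbf{i} + \rho\sin(\phi)\sin(\theta)\,\mathbf{j} + \rho\cos(\phi)\,\mathbf{k}]\,(d\theta) \\
&\qquad+ \frac{\partial}{\partial \phi}[\rho\sin(\phi)\cos(\theta)\,\mathbf{i} + \rho\sin(\phi)\sin(\theta)\,\mathbf{j} + \rho\cos(\phi)\,\mathbf{k}]\,(d\phi) \\
&\quad= [\sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k}]\,(d\rho) \\
&\qquad+ [-\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j}]\,(\rho\sin(\phi)\,d\theta) \\
&\qquad+ [\cos(\phi)\cos(\theta)\,\mathbf{i} + \cos(\phi)\sin(\theta)\,\mathbf{j} - \sin(\phi)\,\mathbf{k}]\,(\rho\,d\phi) \quad \text{(evaluating the partial derivatives)} \\
&\quad= (d\rho)\,\mathbf{e}_\rho(\rho, \theta, \phi) + (\rho\sin(\phi)\,d\theta)\,\mathbf{e}_\theta(\rho, \theta, \phi) + (\rho\,d\phi)\,\mathbf{e}_\phi(\rho, \theta, \phi),
\end{aligned}
$$

in which we have used (12.3.78), (12.3.79) and (12.3.80) at the last equality. To summarize, in (12.3.85) we have shown

(12.3.86) $\mathbf{v}(\rho + d\rho, \theta + d\theta, \phi + d\phi) - \mathbf{v}(\rho, \theta, \phi) = (d\rho)\,\mathbf{e}_\rho(\rho, \theta, \phi) + (\rho\sin(\phi)\,d\theta)\,\mathbf{e}_\theta(\rho, \theta, \phi) + (\rho\,d\phi)\,\mathbf{e}_\phi(\rho, \theta, \phi)$.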

In view of (12.3.86), (12.3.77) and Pythagoras we get

(12.3.87) $\|\mathbf{v}(\rho + d\rho, \theta + d\theta, \phi + d\phi) - \mathbf{v}(\rho, \theta, \phi)\|^2 = (d\rho)^2 + (\rho\sin(\phi))^2 (d\theta)^2 + \rho^2 (d\phi)^2$.
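The appeal to Pythagoras in the step leading to (12.3.87) can be spelled out as follows. The shorthand a := dρ, b := ρ sin(φ) dθ and c := ρ dφ is introduced here only for this illustration; the cross terms vanish by the orthogonality relations (12.3.75) and (12.3.76), and the squared terms survive because the frame vectors have unit length.

```latex
\begin{aligned}
\| a\,\mathbf{e}_\rho + b\,\mathbf{e}_\theta + c\,\mathbf{e}_\phi \|^2
 &= (a\,\mathbf{e}_\rho + b\,\mathbf{e}_\theta + c\,\mathbf{e}_\phi)
    \cdot (a\,\mathbf{e}_\rho + b\,\mathbf{e}_\theta + c\,\mathbf{e}_\phi) \\
 &= a^2\,(\mathbf{e}_\rho \cdot \mathbf{e}_\rho)
  + b^2\,(\mathbf{e}_\theta \cdot \mathbf{e}_\theta)
  + c^2\,(\mathbf{e}_\phi \cdot \mathbf{e}_\phi) \\
 &\quad + 2ab\,(\mathbf{e}_\rho \cdot \mathbf{e}_\theta)
  + 2ac\,(\mathbf{e}_\rho \cdot \mathbf{e}_\phi)
  + 2bc\,(\mathbf{e}_\theta \cdot \mathbf{e}_\phi) \\
 &= a^2 + b^2 + c^2 \\
 &= (d\rho)^2 + (\rho\sin(\phi))^2 (d\theta)^2 + \rho^2 (d\phi)^2 .
\end{aligned}
```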


Now combine (12.3.87) and (12.3.83). We get the distance between A and B (recall (12.3.81)

and (12.3.82)) in terms of small changes ( dρ, dθ, dφ) in the spherical coordinates (ρ, θ, φ) in the

Riemannian form

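(12.3.88) $(ds)^2 = [h_\rho(\rho, \theta, \phi)\,d\rho]^2 + [h_\theta(\rho, \theta, \phi)\,d\theta]^2 + [h_\phi(\rho, \theta, \phi)\,d\phi]^2$,

in which $h_\rho$, $h_\theta$ and $h_\phi$ are the Riemannian scale functions defined by

(12.3.89) $h_\rho(\rho, \theta, \phi) := 1, \quad h_\theta(\rho, \theta, \phi) := \rho\sin(\phi), \quad h_\phi(\rho, \theta, \phi) := \rho$.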
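Comparing (12.3.84) with (12.3.86) shows that each scale function is the length of the corresponding partial derivative of the position vector. The sketch below uses that observation to estimate the spherical scale functions by central differences and compares them with (12.3.89). The finite-difference approach, the helper names and the sample point are all assumptions of this illustration.

```python
import math

def v(rho, theta, phi):
    """Cartesian components of the point with spherical coordinates
    (rho, theta, phi), as in (12.3.60)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def scale_factors(point, h=1e-6):
    """Estimate (h_rho, h_theta, h_phi) at `point` = (rho, theta, phi) as the
    norms of central-difference approximations to the partial derivatives of v."""
    factors = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        d = [(a - b) / (2.0 * h) for a, b in zip(v(*plus), v(*minus))]
        factors.append(math.sqrt(sum(c * c for c in d)))
    return tuple(factors)

rho, theta, phi = 2.0, 0.4, 1.1
print(scale_factors((rho, theta, phi)))
print((1.0, rho * math.sin(phi), rho))  # closed forms from (12.3.89)
```

The same routine applied to the cylindrical position vector (12.2.28) would be expected to reproduce the scale functions (12.2.51) in the same way.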
